Published: 4th August 2022 DOI: 10.4204/EPTCS.364 ISSN: 2075-2180

# Proceedings 38th International Conference on Logic Programming, Haifa, Israel, 31st July 2022 - 6th August 2022

Edited by: Yuliya Lierler, Jose F. Morales, Carmine Dodaro, Veronica Dahl, Martin Gebser and Tuncay Tekle

# Preface

This volume contains the Technical Communications and the Doctoral Consortium papers of the 38th International Conference on Logic Programming (ICLP 2022), held in Haifa, Israel, from July 31 to August 6, 2022. In 2022, ICLP was a part of the Federated Logic Conference (FLoC) 2022 (https://floc2022.org/).

Since the first conference held in Marseille in 1982, ICLP has been the premier international event for presenting research in logic programming. The scope of the conference covers all areas of logic programming including:

#### Foundations:

semantics, formalisms, nonmonotonic reasoning, knowledge representation.

#### Languages issues:

concurrency, objects, coordination, mobility, higher order, types, modes, assertions, modules, meta-programming, logic-based domain-specific languages, programming techniques.

#### Programming support:

program analysis, transformation, validation, verification, debugging, profiling, testing, execution visualization.

#### Implementation:

compilation, virtual machines, memory management, parallel/distributed execution, constraint handling rules, tabling, foreign interfaces, user interfaces.

#### Related Paradigms and Synergies:

inductive and coinductive logic programming, constraint logic programming, answer set programming, interaction with SAT, SMT and CSP solvers, theorem proving, argumentation, probabilistic programming, machine learning.

#### Applications:

databases, big data, data integration and federation, software engineering, natural language processing, web and semantic web, agents, artificial intelligence, computational life sciences, cybersecurity, robotics, education.

Besides the main track, ICLP 2022 included the following additional tracks:

• Applications Track: This track invited submissions of papers on emerging and deployed applications of LP, describing all aspects of the development, deployment, and evaluation of logic programming systems to solve real-world problems, including interesting case studies and benchmarks, and discussing lessons learned.

• Recently Published Research Track: This track provided a forum to discuss important results related to logic programming that appeared recently (from January 2020 onwards) in selective journals and conferences, but have not been previously presented at ICLP.

ICLP 2022 also hosted:

• School on Logic and Constraint Programming: The school was designed for those interested in learning advanced topics in logic programming and constraint programming. It consisted of a series of half-day tutorials. The school hosted tutorials given by Martin Gebser (Alpen-Adria-Universität), Joohyung Lee (Arizona State University), Paul Tarau (University of North Texas), and Francesca Toni (Imperial College).

• Doctoral Consortium: The Doctoral Consortium (DC) on Logic Programming provided students with the opportunity to present and discuss their research directions, and to obtain feedback from both peers and experts in the field.

• Logic and Constraint Programming Contest: The contest combined some features of the Prolog programming contest, the Answer Set Programming contest, and the Constraint Programming contest. As in the previous edition, participants were not limited to a single system and could also combine declarative and imperative programming languages.

ICLP 2022 program also included several invited talks and a panel session. The main conference invited speakers were:

• Theresa Swift (Universidade Nova de Lisboa)

• Fabrizio Riguzzi (University of Ferrara)

The panel “Past, Present and Future of Prolog” was chaired by Jose F. Morales (Universidad Politécnica de Madrid and IMDEA Software Institute). The panelists were Gopal Gupta (University of Texas), Manuel Hermenegildo (Universidad Politécnica de Madrid and IMDEA Software Institute), Theresa Swift (Universidade Nova de Lisboa), Paul Tarau (University of North Texas), and Neng-Fa Zhou (CUNY Brooklyn College and Graduate Center).

The organizers of ICLP 2022 were:

General Chairs

Michael Codish, Ben-Gurion University of the Negev, Israel

Program Chairs

Yuliya Lierler, University of Nebraska Omaha, USA
Jose F. Morales, Universidad Politécnica de Madrid and IMDEA Software Institute, Spain

Publicity Chair

Recently Published Research Track Chairs

Tuncay Tekle, Stony Brook University, USA

Programming Contest Chairs

Mario Alviano, University of Calabria, Italy
Vitaly Lagoon, Cadence Design Systems, USA

10-year/20-year Test-of-Time Award Chairs

Esra Erdem, Sabanci University, Turkey
Paul Tarau, University of North Texas, USA

Doctoral Consortium Chairs

Veronica Dahl, Simon Fraser University, Canada
Carmine Dodaro University of Calabria, Italy

Workshops Coordinator

Daniela Inclezan, Miami University, USA

Three kinds of submissions were accepted:

• Technical papers for technically sound, innovative ideas that can advance the state of logic programming.

• Application papers that impact interesting application domains.

• System and tool papers which emphasize novelty, practicality, usability, and availability of the systems and tools described.

ICLP adopted the hybrid publication model used in all recent editions of the conference, with journal papers and Technical Communications (TCs), following a decision made in 2010 by the Association for Logic Programming. Papers of the highest quality were selected to be published as rapid publications in a special issue of TPLP. The TCs comprise papers which the Program Committee (PC) judged to be of good quality, but not yet of the standard required for acceptance and publication in TPLP, as well as extended abstracts from the different tracks and dissertation project descriptions stemming from the Doctoral Consortium (DC) Program held with ICLP.

We received 68 abstract submissions, of which 36 resulted in paper submissions and 24 in extended abstract submissions, distributed as follows: ICLP main track (27 papers), Applications track (9 full papers and 1 short paper), and Recently Published Research track (24 extended abstracts). The Program Chairs organized the refereeing process, which involved the program committee and several external reviewers. Each technical paper was reviewed by at least three referees, who provided detailed written evaluations. This yielded a short list of candidates for rapid communication. The authors of these papers revised their submissions in light of the reviewers' suggestions, and all of these papers underwent a second round of reviewing. Of these candidate papers, 16 were accepted for publication in Theory and Practice of Logic Programming as rapid communications. In addition, the Program Committee recommended 12 papers for acceptance as technical communications, to appear in Electronic Proceedings in Theoretical Computer Science (EPTCS) either as full papers or extended abstracts; 10 of these were also presented at the conference (two were withdrawn). The 24 extended abstracts from the Recently Published Research track were accepted to appear in EPTCS.

Furthermore, after a thorough examination of citation indices (e.g., Web of Science, Google Scholar), the 10-year/20-year Test-of-Time Award Chairs identified two test-of-time awards:

• The John Alan Robinson 20-year test-of-time award: François Bry and Sebastian Schaffert. Towards a declarative query and transformation language for XML and semistructured data: Simulation unification. LNCS 2401, pp. 255–270, Springer, 2002.

• The Alain Colmerauer 10-year test-of-time award: Max Ostrowski and Torsten Schaub. ASP modulo CSP: The Clingcon system. Theory and Practice of Logic Programming, 12: 485–503, 2012 (ICLP 2012 special issue).

We are deeply indebted to the Program Committee members and external reviewers, as the conference would not have been possible without their dedicated, enthusiastic and outstanding work. The Program Committee members of ICLP 2022 were:

The external reviewers were:

Martin Diller, Selin Eyupoglu, Giovanni Amendola, Michael Bernreiter, Carmine Dodaro, Linde Vanbesien, Arvid Becker, and Aysu Bogatarkan.

The 18th Doctoral Consortium (DC) on Logic Programming was held in conjunction with ICLP 2022. It attracted Ph.D. students in the area of logic programming languages from different backgrounds (e.g., theory, implementation, application) and encouraged constructive and fruitful advising. Topics included: theoretical foundations of logic and constraint logic programming, sequential and parallel implementation technologies, static and dynamic analysis, abstract interpretation, compilation technology, verification, logic-based paradigms (e.g., answer set programming, concurrent logic programming, inductive logic programming), and innovative applications of logic programming. This year the DC accepted 6 papers in the areas mentioned above (the author of one of them withdrew their contribution before publication). We warmly thank all student authors, supervisors, reviewers, co-chairs, and Program Committee members of the DC, as well as the organizing team, for making the DC a great success.

The Program Committee members of the Doctoral Consortium were:

Johannes K. Fichte, Fabio Fioravanti, Daniela Inclezan, Marco Maratea, Zeynep G. Saribatur, Frank Valencia, Yi Wang, and Jessica Zangari,

and Mario Alviano was an external reviewer.

We also express our gratitude to the full ICLP 2022 and FLoC 2022 organization committees. Further, we thank Thomas Eiter, the current President of the Association for Logic Programming (ALP), Marco Gavanelli, in his role of conference coordinator for ALP, and all the members of the ALP Executive Committee. We also thank Mirek Truszczynski, Editor-in-Chief of TPLP, and the staff at Cambridge University Press for their assistance. We would also like to thank Rob van Glabbeek, Editor-in-Chief of EPTCS, for his prompt support of the Program Chairs. Finally, we wish to thank each author of every submitted paper, since their efforts keep the conference alive, and the participants of ICLP for bringing and sharing their ideas and latest developments.

Yuliya Lierler, Jose F. Morales, Carmine Dodaro, Veronica Dahl, Martin Gebser, and Tuncay Tekle (Eds.)

# Two Languages, One System:Tightly Connecting XSB Prolog and Python

Python, ranked first on the May 2022 Tiobe index, is a hugely popular language, heavily used in machine learning and other applications. Prolog, ranked twenty-first on the May 2022 Tiobe index, while less popular, has important reasoning and knowledge representation capabilities, particularly since modern Prologs support constraint-based reasoning, tabling-based deduction, and probabilistic inference. Despite their differences, Prolog and Python share important commonalities. First, both Prolog and CPython (the standard Python implementation) are written in C with well-developed interfaces to other C programs. In addition, both languages are dynamically typed with data structures that are recursively generated in just a few ways. In fact, nearly all core data structures of the two languages can be efficiently bi-translated, leading to a tight connection of the two systems. This talk presents the design, experience, and implications of such a connection using XSB Prolog version 5.0. The connection for XSB to call Python has led to XSB orchestrating commercial projects using interfaces to Elasticsearch, dense vector storage, NLP systems, Google Maps, and a 14.6 billion triple Wikidata graph. The connection for Python to call XSB allows XSB to be imported like any other Python module, so that XSB can easily be invoked from Jupyter notebooks or other graphical interfaces. On a more speculative level, the talk mentions how this work might be leveraged for research in neuro-symbolic learning, natural language processing, and cross-language type inference.
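The bi-translation mentioned above can be illustrated with a small sketch. The encoding below (compound terms as Python tuples whose first element is the functor, Prolog lists as Python lists) is a hypothetical illustration of the idea, not XSB's actual interface or term encoding.

```python
# Hypothetical sketch of rendering recursively generated Python data as
# Prolog term syntax. Encoding assumed here (illustrative only):
#   - Python list        -> Prolog list
#   - Python tuple       -> compound term, first element is the functor
#   - atoms and numbers  -> printed as-is
def to_prolog_text(value):
    """Render a Python value in Prolog term syntax."""
    if isinstance(value, list):
        return "[" + ",".join(to_prolog_text(v) for v in value) + "]"
    if isinstance(value, tuple):  # compound term: ('f', arg1, arg2, ...)
        functor, *args = value
        return functor + "(" + ",".join(to_prolog_text(a) for a in args) + ")"
    return str(value)  # atom or number

# ('point', 1, [2, 3]) renders as point(1,[2,3])
term_text = to_prolog_text(('point', 1, [2, 3]))
```

Because both languages build their core data recursively from a handful of constructors, a translation in the opposite direction is equally mechanical, which is what makes the tight connection described in the talk feasible.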

# Probabilistic Logic Programming: Semantics, Inference and Learning

Probabilistic Logic Programming is now more than 30 years old and has become an active field of research at the intersection of Logic Programming and Uncertainty in AI. Among the various semantics, the Distribution Semantics (DS) has emerged as one of the most intuitive and versatile. The talk will introduce the DS and its various extensions to deal with function symbols, continuous random variables and multiple stable models. Computing the probability of queries is the task of inference, which can be solved either by exact or approximate algorithms. Exact inference is usually performed by means of knowledge compilation while approximate inference by means of Monte Carlo. Inference is at the basis of learning programs from data that has recently received much attention. The talk will provide an overview of algorithms for learning the parameters and for learning the structure of programs, discussing the various Inductive Logic Programming techniques that have been adopted and the various tradeoffs between quality of the model and speed of learning.
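As a toy illustration of approximate inference by Monte Carlo under the distribution semantics, the sketch below samples possible worlds for a hypothetical two-coin program and estimates the probability of a query; the program, predicate names, and numbers are all invented for illustration.

```python
import random

# Hypothetical probabilistic program:
#   0.6::heads(c1).  0.5::heads(c2).
#   win :- heads(c1).   win :- heads(c2).
prob_facts = {"heads(c1)": 0.6, "heads(c2)": 0.5}

def sample_world(rng):
    """Sample a possible world: include each probabilistic fact independently."""
    return {f for f, p in prob_facts.items() if rng.random() < p}

def query_win(world):
    """Evaluate the query 'win' in a world (the two rules above)."""
    return "heads(c1)" in world or "heads(c2)" in world

rng = random.Random(0)
n = 100_000
estimate = sum(query_win(sample_world(rng)) for _ in range(n)) / n
# Exact answer by inclusion-exclusion: 1 - 0.4 * 0.5 = 0.8
```

Exact inference would instead compile the program into a tractable circuit (knowledge compilation) and read the probability off the circuit, as the talk discusses.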

# 50th Anniversary of the Birth of Prolog: Some reflections on Prolog's Evolution, Status, and Future

This year we celebrate Prolog's 50th anniversary, and the continued relevance of Prolog and logic programming as a basis for both higher-level programming and symbolic, explainable AI. This talk will provide an overview of Prolog's evolution, status, and future. It will start with a quick tour of the major milestones in the advancement of the language and its implementations, from the original Marseille and Edinburgh versions, to the many current ones, with more appearing continuously. This will be followed by some reflections on issues such as what still makes Prolog different and relevant as a language and tool, the best approaches for teaching Prolog, some landmark applications, relation to other logic programming languages, or the use of Prolog and Prolog-related tools for other programming languages. The talk will also attempt to dispel along the way some common misconceptions about the language, while discussing some past weaknesses and others that may still need addressing. It will conclude with some reflections on challenges and opportunities for the future.

Part of the contents of this talk appear in the recent TPLP paper "50 years of Prolog and Beyond", by Philipp Körner, Michael Leuschel, João Barbosa, Vítor Santos Costa, Verónica Dahl, Manuel V. Hermenegildo, Jose F. Morales, Jan Wielemaker, Daniel Diaz, Salvador Abreu, and Giovanni Ciatto, written for Prolog's 50th anniversary and TPLP's 20th anniversary. See prologyear.logicprogramming.org for pointers to this paper and other initiatives related to the Year of Prolog. The Year of Prolog is organized by the Association for Logic Programming and the Prolog Heritage foundation.

# Solving Problems in PH with ASP(Q): Preliminary Results

Giovanni Amendola (University of Calabria, Italy)
Bernardo Cuteri (University of Calabria, Italy)
Francesco Ricca (University of Calabria, Italy)
Mirek Truszczynski (University of Kentucky, US)

Answer Set Programming (ASP) [3] is a logic programming formalism for modeling and solving search and optimization problems. It is especially effective in modeling and solving search and optimization variants of decision problems in the class NP (the first level of the polynomial hierarchy). The success of ASP is due to two factors. First, problems in NP (and their search and optimization variants) can be expressed as compact and well-structured programs by following a simple programming methodology known as generate-define-test. Second, solvers such as clasp [7] and wasp [1] were shown to be effective in processing programs for industrial-grade problems [6].

Modeling problems beyond the class NP with ASP is possible when one uses the full language of ASP that allows for disjunctions in the head of rules. In the full ASP language one can express problems whose decision variants are in the second level of the Polynomial Hierarchy [4]. However, modeling problems beyond NP with ASP is complicated. The generate-define-test approach is insufficient, and commonly used techniques such as saturation [5] are hardly intuitive as they often introduce constraints that have no direct relation to constraints of the problem being modeled.

These shortcomings of ASP motivated recent research on extending ASP with quantifiers over answer sets of programs. In particular, following the way in which Quantified Boolean Formulas (QBFs) add quantifiers over sets of interpretations to the language of propositional logic, we introduced the language ASP with Quantifiers, or ASP(Q) for short [2]. The language expands ASP with quantifiers over answer sets of ASP programs; its elements are called quantified programs. We showed [2] that ASP(Q) (just as QBFs) can express all decision problems forming the polynomial hierarchy. These results notwithstanding, the ultimate adoption of ASP(Q) as a knowledge representation formalism depends on the availability of effective solvers for quantified programs. In this paper, we address this issue and present some preliminary results on how to obtain an ASP(Q) solver based on solvers for Quantified Boolean Formulas. Our solver translates a quantified program together with a given data instance into a QBF, which can then be solved by any of the available QBF solvers.

The architecture of our solver is depicted in Figure 1. The input program P is taken by the QBF Encoder module, which produces a QBF formula F(P) such that F(P) is true iff P is coherent. The QBF F(P) is then processed by the QBF Solving back-end, which determines whether F(P) is true. Finally, the Output Interpreter presents the output to the user in a standardized format.

The QBF Encoder translates the input program P into the QBF F(P). This process is driven by a Director submodule that analyzes the program structure to identify the quantifiers and the subprograms P1, ..., Pn, and C. The subprograms Pi, i = 1, ..., n, are augmented with choice rules for ground atoms involving predicates occurring in earlier subprograms. In this way, ground atoms fixed by answer sets of earlier programs can be combined with the current program, implementing the use of the fix mapping described in the formal definition of the semantics. Each subprogram is then instantiated by the Grounder submodule, which computes its propositional counterpart and converts it into a propositional formula in Conjunctive Normal Form (CNF) via the LP2SAT module. The resulting CNF formulas are assembled into a QBF F(P) that is true precisely when P is coherent. During this process the Director collects information that allows it to combine the outputs of internal procedures. This information is also used by the Output Interpreter to print the results, in particular an answer set of P, when available.

The QBF formula produced by the QBF Encoder is processed by a QBF solver of choice. To facilitate a broader range of applicable solvers and to improve solver performance, the Pre-processor submodule uses tools that convert the formula F(P) to a format compatible with the solver (e.g., from QCIR to QDIMACS) or simplify the formula (e.g., by eliminating some variables and quantifiers). The output of the QBF back-end is parsed and printed by the Output Interpreter. When P starts with an existential quantifier, and the QBF back-end is designed to provide a satisfying assignment to the corresponding existentially quantified variables, there is an option to print out the corresponding quantified answer set.
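To make the target formalism concrete, here is a toy brute-force QBF evaluator; it is illustrative only (real back-ends consume QDIMACS or QCIR files and employ far more sophisticated algorithms), and all names in it are invented for this sketch.

```python
# Toy brute-force evaluator for a prenex QBF, the formalism the ASP(Q)
# solver compiles quantified programs into. Illustrative sketch only.
def eval_qbf(prefix, matrix, assignment=()):
    """prefix: list of ('exists'|'forall', var) pairs, outermost first.
    matrix: a function from a {var: bool} dict to a bool."""
    if not prefix:
        return matrix(dict(assignment))
    quantifier, var = prefix[0]
    branches = (eval_qbf(prefix[1:], matrix, assignment + ((var, b),))
                for b in (False, True))
    # Existential: some branch must succeed; universal: all branches must.
    return any(branches) if quantifier == 'exists' else all(branches)

# Exists x Forall y. (x or not y)  -- true: choose x = True
ok = eval_qbf([('exists', 'x'), ('forall', 'y')],
              lambda a: a['x'] or not a['y'])
```

In the actual pipeline, the role played here by the hand-written `matrix` function is taken by the CNF produced from the grounded subprograms, and the quantifier prefix mirrors the quantifiers of the input program.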

We studied the performance of the solver on the QBF evaluation and Minmax Clique problems [2]. This preliminary experimental study confirms that QASP is a viable solver for tackling hard problems in the PH: it performs well on QBF evaluation and solves instances of Minmax Clique in reasonable time.

As future work, we plan to further extend the application of ASP(Q) to model hard AI problems. On the implementation side, we plan to support several QBF back-ends and to enhance the efficiency of the QBF encoder.

### References

1. Mario Alviano, Carmine Dodaro, Nicola Leone & Francesco Ricca (2015): Advances in WASP. In: LPNMR, LNCS 9345, Springer, pp. 40-54, doi:10.1007/978-3-319-23264-5.
2. Giovanni Amendola, Francesco Ricca & Miroslaw Truszczynski (2019): Beyond NP: Quantifying over Answer Sets. In: Theory Pract. Log. Program. 19(5-6), pp. 705-721, doi:10.1017/S1471068419000140.
3. Gerhard Brewka, Thomas Eiter & Miroslaw Truszczynski (2011): Answer set programming at a glance. In: Commun. ACM 54(12), pp. 92-103, doi:10.1145/2043174.2043195.
4. Evgeny Dantsin, Thomas Eiter, Georg Gottlob & Andrei Voronkov (2001): Complexity and expressive power of logic programming. In: ACM Comput. Surv. 33(3), pp. 374-425, doi:10.1145/502807.502810.
5. Thomas Eiter & Georg Gottlob (1995): On the Computational Cost of Disjunctive Logic Programming: Propositional Case. In: Ann. Math. Artif. Intell. 15(3-4), pp. 289-323, doi:10.1007/BF01536399.
6. Esra Erdem, Michael Gelfond & Nicola Leone (2016): Applications of Answer Set Programming. In: AI Magazine 37(3), pp. 53-68, doi:10.1609/aimag.v37i3.2678.
7. Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Javier Romero & Torsten Schaub (2015): Progress in clasp Series 3. In: LPNMR, LNCS 9345, Springer, pp. 368-383, doi:10.1007/978-3-319-23264-5.

# Towards Stream Reasoning with Deadlines

Periklis Mantenoglou (NKUA & NCSR "Demokritos")
Manolis Pitsikalis (University of Liverpool)
Alexander Artikis (University of Piraeus & NCSR "Demokritos")

### Motivation

Intelligent processing systems are used in contemporary applications for deriving the values of various properties at different points in time, based on a set of actions/events. The specifications of such applications often include events with temporally-constrained effects. In multi-agent voting protocols, e.g., a proposed motion may be seconded at the latest by a specified time. In e-commerce, a contract may expire at some future time. Moreover, an agent may be suspended for performing a forbidden action, but only temporarily. Consider, also, composite event recognition for maritime situational awareness. A common type of fishing, i.e., trawling, includes a series of 'change in heading' events detected on vessel position streams. Trawling is said to end some minutes after a 'change in heading' event, provided that no other event of this type has taken place in the meantime.

Several formalisms have been proposed for handling such temporal phenomena. For instance, Miller and Shanahan ([4]) modelled events with temporally-constrained effects in the Event Calculus, which is a logic programming formalism for representing and reasoning about events and their effects ([3]). Their approach, however, does not scale to large, high-velocity data streams, which are omnipresent in contemporary applications. Such applications additionally require the continuous, online evaluation of temporal patterns/queries over streaming data. In this case, a computational framework must be efficient enough so that instances of pattern satisfaction can be reported with minimal latency. As an example, in the domain of maritime situational awareness, millions of signals are emitted by vessels each day, including information about, e.g., their position, heading and velocity. A stream reasoning system may process these data streams and compute the occurrences of composite activities, such as illegal fishing or a rendez-vous between two vessels ([5]).

The goal of this work is to extend the stream reasoning system RTEC ([1]) with an algorithm which computes efficiently the temporally-constrained effects of events.

### The Event Calculus for Run-Time Reasoning

RTEC (for 'Event Calculus for Run-Time Reasoning') is a logic programming implementation of the Event Calculus ([3]), optimised for stream reasoning. The language of RTEC is many-sorted and includes events, fluents, i.e., properties which may have different values at different points in time, and a linear time model which includes integer time-points. Event occurrences, fluent-value changes and fluent-value persistence are expressed through domain-independent predicates. happensAt(E, T) denotes that event E takes place at time-point T, while initiatedAt(F=V, T) (resp. terminatedAt(F=V, T)) expresses that a time period during which fluent F has the value V is initiated (terminated) at time-point T. holdsFor(F=V, I) states that fluent F has the value V continuously in the maximal intervals included in list I. Finally, holdsAt(F=V, T) states that fluent F has the value V at time-point T.

The key feature of the Event Calculus is the common-sense law of inertia, which expresses that fluent-value pairs (FVPs) persist, unless an event which explicitly affects the value of the fluent occurs in the meantime. The conditions under which event occurrences may affect the values of fluents are expressed through application-specific initiatedAt and terminatedAt rules. Given a set of initiatedAt and terminatedAt rules for a FVP F=V, and a stream of events, RTEC computes holdsFor(F=V, I), i.e., the maximal intervals for which F=V holds continuously.

RTEC supports stream reasoning applications by integrating Event Calculus-based representation and reasoning with caching, indexing, windowing and a 'forget' mechanism which aid computational performance. Moreover, RTEC is restricted to hierarchical knowledge bases, which allow bottom-up processing, thus avoiding re-computations. The complexity analysis of RTEC is available in ([1]), while the code is publicly available.

### Supporting Events with Temporally-Constrained Effects

RTEC→ is an extension of RTEC which supports events with temporally-constrained effects by means of domain-independent reasoning. In RTEC→, an event description may include declarations which specify that the values of certain FVPs may change at a specified time-point after their initiation. In that case, the FVP is subject to a deadline. The deadline of a FVP is extensible if a re-initiation of the FVP at an intermediate time-point postpones the scheduled FVP change. RTEC→ supports both deadline types. The addition of deadlines in RTEC is motivated by earlier work on the representation of events/actions with delayed effects ([2]), particularly in the context of the Event Calculus ([4]). In contrast to the literature, we focus on computing FVP deadlines efficiently, thus supporting streaming applications. A description and experimental evaluation of RTEC→ will be presented in a forthcoming publication.
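The extensible-deadline behaviour can be sketched in the same style: each initiation schedules an automatic change d time-points later, and a re-initiation before the deadline postpones it. This simplified model (initiations only, no explicit terminating events) paraphrases the description above and is not RTEC→'s implementation; all names are illustrative.

```python
# Sketch of an extensible deadline: an initiation at T opens a period from
# T+1 and schedules an automatic FVP change at T + d; a re-initiation
# before the deadline postpones the scheduled change. Illustration only.
def holds_for_with_deadline(initiations, d):
    intervals, start, deadline = [], None, None
    for t in sorted(initiations):
        if start is None or t >= deadline:
            # Previous period (if any) expired at its deadline; start anew.
            if start is not None:
                intervals.append((start, deadline))
            start = t + 1
        deadline = t + d        # (re-)initiation postpones the deadline
    if start is not None:
        intervals.append((start, deadline))
    return intervals

# d = 5: the re-initiation at 3 extends the first period to its new
# deadline 8; the initiation at 20 opens a fresh period.
periods = holds_for_with_deadline([1, 3, 20], 5)   # [(2, 8), (21, 25)]
```

A non-extensible deadline would simply ignore re-initiations of an already-open period instead of updating `deadline`.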

### References

1. Alexander Artikis, Marek Sergot & Georgios Paliouras (2015): An Event Calculus for Event Recognition. In: IEEE Transactions on Knowledge and Data Engineering 27(4) pp. 895–908, doi:10.1109/TKDE.2014.2356476.
2. Lars Karlsson, Joakim Gustafsson & Patrick Doherty (1998): Delayed Effects of Actions. In: Henri Prade, editor: ECAI, John Wiley and Sons, pp. 542–546.
3. Robert Kowalski & Marek Sergot (1986): A Logic-Based Calculus of Events. In: New Generation Computing 4(1) pp. 67–96, doi:10.1007/BF03037383.
4. Rob Miller & Murray Shanahan (2002): Some Alternative Formulations of the Event Calculus. In: Computational Logic: Logic Programming and Beyond pp. 452–490, doi:10.1007/3-540-45632-5_17.
5. Manolis Pitsikalis, Alexander Artikis, Richard Dreo, Cyril Ray, Elena Camossi & Anne-Laure Jousselme (2019): Composite Event Recognition for Maritime Monitoring. In: DEBS, ACM, pp. 163–174, doi:10.1145/3328905.3329762.

# Model Reconciliation in Logic Programs - Extended Abstract

Tran Cao Son (New Mexico State University)
Van Nguyen (New Mexico State University)
Stylianos Loukas Vasileiou (Washington University in St. Louis)
William Yeoh (Washington University in St. Louis)

Introduction. The model reconciliation problem (MRP) has been introduced and investigated in the context of planning [1,2,4,5,6], where one agent (e.g., a robot) needs to explain to another agent (e.g., a human) the question "why is a certain plan an optimal plan?" MRP defines the notion of an explanation from an agent to a human as a pair $(\epsilon^+, \epsilon^-)$ of sets, where $\epsilon^+$ and $\epsilon^-$ are sets of information, such as preconditions or post-conditions of actions, that the human should add to and remove from their problem definition, respectively. The focus has been on developing algorithms for computing explanations that are optimal with respect to some criteria (e.g., the minimality of $|\epsilon^+ \cup \epsilon^-|$).

This is an extended abstract of our JELIA paper [3]. It proposes a generalization of MRP in the context of answer set programming and defines the model reconciliation problem in logic programs (MRLP). Given $\pi_a$ and $\pi_h$, which represent the knowledge bases of an agent and a human, respectively, and a query $q$ such that $\pi_a$ entails $q$ and $\pi_h$ does not entail $q$ (or $\pi_a$ does not entail $q$ and $\pi_h$ entails $q$), MRLP focuses on determining a pair $(\epsilon^+,\epsilon^-)$ such that $\epsilon^+ \subseteq \pi_a$, $\epsilon^- \subseteq \pi_h$, and the program $\hat{\pi}_h = (\pi_h \setminus \epsilon^-) \cup \epsilon^+$ has an answer set containing $q$ (or has no answer set containing $q$). The pair $(\epsilon^+,\epsilon^-)$ is referred to as a solution for the model reconciliation problem $(\pi_a,\pi_h,q)$ (or $(\pi_a, \pi_h, \neg q)$). Here, we say a program $\pi$ entails (resp. does not entail) an atom $q$, denoted by $\pi \mathrel{\mid\!\sim} q$ (resp. $\pi \mathrel{\not\mid\!\sim} q$), if $q$ belongs to an answer set of $\pi$ (resp. does not belong to any answer set of $\pi$). The paper discusses different characterizations of solutions and algorithms for computing solutions for MRLP.

Model Reconciliation in Logic Programs (MRLP). A general MRLP is defined as a combination of two sub-problems: one aims at changing the human program so that it entails an atom (e-MRLP) and the other focuses on achieving that the updated program does not entail an atom (n-MRLP). Let $\pi_a$ and $\pi_h$ be two logic programs and $q$ be an atom in the language of $\pi_a$.

• The problem of model reconciliation for entailment in logic programs (e-MRLP) is defined by a triple $(\pi_a, \pi_h, q)$. A pair of programs $(\epsilon^+,\epsilon^-)$, such that $\epsilon^+ \subseteq \pi_a$ and $\epsilon^- \subseteq \pi_h$, is a solution of $(\pi_a, \pi_h, q)$ if $\hat{\pi}_h \mathrel{\mid\!\sim} q$, where $\hat{\pi}_h = (\pi_h \setminus \epsilon^-) \cup \epsilon^+$.
• The problem of model reconciliation for non-entailment in logic programs (n-MRLP) is defined by a triple $(\pi_a, \pi_h, \neg q)$. A pair of programs $(\epsilon^+,\epsilon^-)$, such that $\epsilon^+ \subseteq \pi_a$ and $\epsilon^- \subseteq \pi_h$, is a solution of $(\pi_a, \pi_h, \neg q)$ if $\hat{\pi}_h \mathrel{\not\mid\!\sim} q$, where $\hat{\pi}_h = (\pi_h \setminus \epsilon^-) \cup \epsilon^+$.
• The general problem of model reconciliation in logic programs (MRLP) is defined by a triple $(\pi_a, \pi_h, \omega)$, where $\omega = \omega^+ \wedge \neg \omega^-$ and $\omega^+$ (resp. $\omega^-$) is a conjunction of atoms in $\pi_a$. $(\epsilon^+,\epsilon^-)$ is a solution for the MRLP problem if it is a solution for $(\pi_a,\pi_h, q)$ for each conjunct $q$ in $\omega^+$ and a solution for $(\pi_a,\pi_h, \neg r)$ for each conjunct $r$ in $\omega^-$.
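For intuition, the sketch below brute-forces e-MRLP solutions for definite (negation-free) programs, where the unique answer set is the least model; full MRLP works over arbitrary ASP programs and their answer sets. The rule encoding and all names are illustrative, not from the paper.

```python
from itertools import chain, combinations

# A rule is (head, frozenset_of_body_atoms); a fact has an empty body.
# For definite programs, the unique answer set is the least model.
def least_model(program):
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in program:
            if body <= model and head not in model:
                model.add(head)
                changed = True
    return model

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def solve_emrlp(pi_a, pi_h, q):
    """Search a pair (eps_plus ⊆ pi_a, eps_minus ⊆ pi_h) making q entailed."""
    for eps_plus in subsets(pi_a):
        for eps_minus in subsets(pi_h):
            pi_hat = (set(pi_h) - set(eps_minus)) | set(eps_plus)
            if q in least_model(pi_hat):
                return set(eps_plus), set(eps_minus)
    return None

# Agent knows q :- a. and a.; the human only knows b.
pi_a = {("q", frozenset({"a"})), ("a", frozenset())}
pi_h = {("b", frozenset())}
solution = solve_emrlp(pi_a, pi_h, "q")
```

Because subsets are enumerated by increasing size, the first solution found is small with respect to $|\epsilon^+ \cup \epsilon^-|$; real algorithms avoid this exponential enumeration.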
Characterizing Solutions. An MRLP might have several solutions, and choosing a suitable solution is application dependent. Some characteristics of solutions that could influence the choice are defined next. Let $(\pi_a,\pi_h, \omega)$ be an MRLP problem and $(\epsilon^+,\epsilon^-)$ be a solution of $(\pi_a,\pi_h, \omega)$. We say:
• $(\epsilon^+,\epsilon^-)$ is optimal if there exists no solution $(\lambda^+,\lambda^-)$ such that $\lambda^+ \cup \lambda^- \subset \epsilon^+ \cup \epsilon^-$.
• $(\epsilon^+,\epsilon^-)$ is $\pi$-restrictive for $\pi \subseteq \pi_a$ if $\epsilon^+ \subseteq \pi$; it is minimally-restrictive if there exists no solution $(\lambda^+,\lambda^-)$ such that $\lambda^+ \subset \epsilon^+$.
• $(\epsilon^+,\epsilon^-)$ is $\pi$-preserving for $\pi \subseteq \pi_h$ if $\pi \cap \epsilon^- = \emptyset$; it is maximally-preserving if there exists no solution $(\lambda^+,\lambda^-)$ such that $\lambda^- \subset \epsilon^-$.
• $(\epsilon^+,\epsilon^-)$ is assertive if every answer set of $\pi_h \setminus \epsilon^- \cup \epsilon^+$ satisfies $\omega$.
• $(\epsilon^+,\epsilon^-)$ is a solution with justification (or j-solution) if $\epsilon^+$ contains a justification for $\omega^+$ w.r.t. some answer set $I$ of $\pi_a$.
• $(\epsilon^+,\epsilon^-)$ is fact-only if $\epsilon^+$ and $\epsilon^-$ are sets of facts in $\pi_a$ and $\pi_h$, respectively.
Each class of solutions has its own merits and could be useful in different situations. Optimal solutions could be useful when solutions are associated with some costs. By associating costs with rules or atoms (e.g., via a cost function), the cost of a solution $(\epsilon^+,\epsilon^-)$ can be defined and used as a criterion to evaluate solutions. Minimally-restrictive solutions focus on minimizing the amount of information that the agent needs to introduce to the human and could be useful when explaining a new rule is expensive. On the other hand, maximally-preserving solutions are appropriate when one seeks to minimize the amount of information that needs to be removed from the human knowledge base. Solutions with justifications are those that come with their own support. Assertive solutions do not leave the human any reason for questioning the formula under discussion. Fact-only solutions are special in that they inform the human of their missing or false facts. As a planning problem can be encoded by a logic program whose answer sets encode solutions of the original planning problem, it is easy to see that an MRP in planning can be encoded as an MRLP whose fact-only solutions encode the solutions of the original MRP.

Computing Solutions. Let $\pi_a$ and $\pi_h$ be two programs, $I$ a set of atoms of $\pi_a$, and $\epsilon^+ \subseteq \pi_a$. $\otimes(\pi_h,\epsilon^+,I)$ is the collection of rules from $\pi_h \setminus \epsilon^+$ such that for each rule $r \in \otimes(\pi_h,\epsilon^+,I)$: (i) $head(r) \in I$ and $neg(r) \cap I = \emptyset$; or (ii) $neg(r) \cap heads(\epsilon^+) \ne \emptyset$; or (iii) $pos(r) \setminus I \ne \emptyset$. Let $\epsilon^-[\epsilon^+,I,\pi_h]$ denote the set of rules $\pi_h \setminus \otimes(\pi_h,\epsilon^+,I)$. This can be used for computing solutions of general MRLP problems as follows. Without loss of generality, consider the problem $(\pi_a, \pi_h, q \wedge \neg r)$, where $q$ and $r$ are atoms of $\pi_a$, $\pi_a {\:{\mid \!\sim}\:} q$, and $\pi_a {{\:{\mid \!\sim \!\!\!\!\!\!\not} \: \: \: \: \:\: }} r$. A solution $(\epsilon^+,\epsilon^-)$ for $(\pi_a, \pi_h, q \wedge \neg r)$ can be computed by the following steps: (i) compute an answer set $I$ of $\pi_a$ that supports $q$ and identify a minimal justification $\epsilon^+$ of $q$ w.r.t. $I$; (ii) compute $\epsilon^-=\epsilon^-[\epsilon^+,I,\pi_h]$; and (iii) identify a set of rules $\lambda$ from $\pi' = (\pi_h \setminus \epsilon^-) \cup \epsilon^+$ such that $\pi' \setminus \lambda {{\:{\mid \!\sim \!\!\!\!\!\!\not} \: \: \: \: \:\: }} r$. The final solution for $(\pi_a, \pi_h, q \wedge \neg r)$ is then $(\epsilon^+, \epsilon^- \cup \lambda)$. This process can be implemented using answer set programming.
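The operator $\otimes$ and the derived removal set $\epsilon^-[\epsilon^+,I,\pi_h]$ follow directly from the three conditions above. A minimal Python sketch, assuming a rule is encoded as a triple (head, pos, neg) of an atom and two frozensets of atoms (our own convention, not the paper's implementation):

```python
# Sketch of the operator ⊗(pi_h, eps+, I) and the removal set
# eps-[eps+, I, pi_h]. A rule is a triple (head, pos, neg).

def otimes(pi_h, eps_plus, interp):
    heads_eps = {head for head, _pos, _neg in eps_plus}
    kept = set()
    for rule in pi_h - eps_plus:
        head, pos, neg = rule
        if ((head in interp and not (neg & interp))  # (i)  head in I, negation unblocked
                or (neg & heads_eps)                 # (ii) negation defeated by heads of eps+
                or (pos - interp)):                  # (iii) positive body unsupported by I
            kept.add(rule)
    return kept

def eps_minus(pi_h, eps_plus, interp):
    """Rules of pi_h to remove: pi_h \\ ⊗(pi_h, eps+, I)."""
    return pi_h - otimes(pi_h, eps_plus, interp)
```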

Conclusions and Future Work. The paper discusses the MRLP problem and its theoretical foundations, such as the definition of a solution, the classification of solutions, and methods for computing solutions. The present work assumes that the agent, who needs to compute solutions, knows both programs $\pi_a$ and $\pi_h$. In practice, this assumption is likely invalid, and the agent might also need to change its program through communication or dialogue with the human. Addressing this issue and developing a system for computing solutions of MRLPs are our immediate future work.

### References

1. Tathagata Chakraborti, Sarath Sreedharan, Yu Zhang & Subbarao Kambhampati (2017): Plan Explana- tions as Model Reconciliation: Moving Beyond Explanation as Soliloquy. In: IJCAI, pp. 156-163, doi:10.24963/ijcai.2017/23.
2. Van Nguyen, Stylianos Loukas Vasileiou, Tran Cao Son & William Yeoh (2020): Explainable Planning Using Answer Set Programming. In: Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning, KR 2020, Rhodes, Greece, September 12-18, 2020, pp. 662-666, doi:10.24963/kr.2020/66.
3. Tran Cao Son, Van Nguyen, Stylianos Loukas Vasileiou & William Yeoh (2021): Model Reconciliation in Logic Programs. In Wolfgang Faber, Gerhard Friedrich, Martin Gebser & Michael Morak, editors: JELIA, Lecture Notes in Computer Science 12678, Springer, pp. 393–406, doi:10.1007/978-3-030-75775-5_26.
4. Sarath Sreedharan, Tathagata Chakraborti & Subbarao Kambhampati (2021): Foundations of explanations as model reconciliation. Artif. Intell. 301, p. 103558, doi:10.1016/j.artint.2021.103558.
5. Sarath Sreedharan, Alberto Olmo Hernandez, Aditya Prasad Mishra & Subbarao Kambhampati (2019): Model-Free Model Reconciliation. In Sarit Kraus, editor: IJCAI, ijcai.org, pp. 587–594, doi:10.24963/ijcai.2019/83.
6. Stylianos Loukas Vasileiou, Alessandro Previti & William Yeoh (2021): On Exploiting Hitting Sets for Model Reconciliation. In: AAAI. Available at https://ojs.aaai.org/index.php/AAAI/article/view/16807.

This research is partially supported by NSF grants 1757207, 1812619, 1812628, and 1914635.

# DeepStochLog: Neural Stochastic Logic Programming (Extended Abstract)

Thomas Winters (KU Leuven, Belgium)
Giuseppe Marra (KU Leuven, Belgium)
Robin Manhaeve (KU Leuven, Belgium)
Luc De Raedt (KU Leuven Belgium; Örebro University, Sweden)

This is an extended abstract of a paper originally published at the 36th AAAI Conference on Artificial Intelligence (AAAI-22): https://www.aaai.org/AAAI22Papers/AAAI-7499.WintersT.pdf

# Introduction

The integration of neural and symbolic learning methods is high on the research agenda. It is popular to use (variants of) logic programs to represent the available symbolic knowledge and use Prolog-like mechanisms to generate differentiable structures. This can often be achieved by introducing probability into these neural logic programming models, cf. [1]. Most notably, one distinguishes probabilistic from stochastic logic programs (PLPs vs. SLPs). The more popular PLPs [2] are based on a possible-worlds semantics, which extends probabilistic graphical models, while SLPs [4] are based on stochastic grammars. The difference can also be described as a random graph vs. a random walk model. So far, the emphasis in neural-symbolic computation has been on the PLP approach [3]. To fill this gap, we introduce DeepStochLog, a neural stochastic logic programming approach. It incorporates ideas from DeepProbLog, such as the neural predicate. The neural predicate encapsulates neural networks to cope with sub-symbolic data such as images. Without loss of generality, we base DeepStochLog on stochastic definite clause grammars (SDCGs), as this notation is easier to introduce and use and results in a sequence-based model. SDCGs are a probabilistic unification-based grammar formalism that is translatable to SLPs. The experimental evaluation shows that DeepStochLog scales considerably better than neural probabilistic logic programs and achieves state-of-the-art results on challenging neural-symbolic learning tasks.

# Stochastic Definite Clause Grammars

Stochastic Definite Clause Grammars have their roots in Context-Free Grammars (CFGs). A context-free grammar G is a 4-tuple (V,Σ,S,R), with V the set of non-terminals, Σ the set of terminals, S ∈ V the starting symbol and R a set of rewrite rules of the form N → S1, ..., Sk, where N is a non-terminal and each Si is either a terminal or a non-terminal. Definite clause grammars (DCGs) extend CFGs with logic. Instead of using non-terminals, they use logical atoms, i.e. a(t1,...,tn). Stochastic definite clause grammars (SDCGs) extend DCGs by associating probabilities with the rules. This is identical to how Probabilistic Context-Free Grammars (PCFGs) extend CFGs. The rules take the form pi :: N → S1, ..., Sk, where pi is a probability. SDCGs require that the probabilities of the rules defining a single non-terminal predicate sum to 1.
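The sum-to-1 condition can be checked mechanically. A small Python sketch, using our own (probability, non-terminal, body) rule encoding with probabilities as exact fractions to avoid floating-point drift:

```python
# Sketch: checking the SDCG well-formedness condition that the rule
# probabilities for each non-terminal predicate sum to 1.
# The (probability, non-terminal, body) encoding is our own convention.
from collections import defaultdict
from fractions import Fraction

def well_formed(rules):
    totals = defaultdict(Fraction)
    for prob, nonterminal, _body in rules:
        totals[nonterminal] += Fraction(prob)
    return all(total == 1 for total in totals.values())
```

For the grammar of Example 1 below, both e (0.5 + 0.5) and n (ten rules of probability 0.1 each) pass the check.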

Example 1 (SDCG). The SDCG below produces addition expressions that sum to a certain number N. An expression is either directly the number N (first rule) or the addition of two expressions that sum to N1 and N2, respectively, with N = N1 + N2 (second rule). The remaining rules are the terminal rules for the single digits. Notice that the probabilities defining either the predicate e or the predicate n sum to 1.

0.5 :: e(N) → n(N)
0.5 :: e(N) → e(N1), [+], n(N2), {N is N1 + N2}
0.1 :: n(0) → [0]  ...  0.1 :: n(9) → [9]
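Under the stochastic-grammar (random walk) semantics, the probability that e(N) generates a token sequence is the sum, over all derivations, of the product of the probabilities of the rules used. A brute-force Python sketch for this particular example grammar (purely illustrative; this is not DeepStochLog's inference mechanism):

```python
# Brute-force derivation probability for the example SDCG above.
from fractions import Fraction

def p_n(n, toks):
    # 0.1 :: n(D) -> [D]  for each digit D in 0..9
    return Fraction(1, 10) if 0 <= n <= 9 and toks == [str(n)] else Fraction(0)

def p_e(n, toks):
    # 0.5 :: e(N) -> n(N)
    total = Fraction(1, 2) * p_n(n, toks)
    # 0.5 :: e(N) -> e(N1), [+], n(N2), {N is N1 + N2}
    for i, tok in enumerate(toks):
        if tok == "+":
            left, right = toks[:i], toks[i + 1:]
            for n2 in range(10):
                total += Fraction(1, 2) * p_e(n - n2, left) * p_n(n2, right)
    return total
```

For instance, the probability that e(1) generates the single token "1" is 0.5 · 0.1 = 1/20, and the probability that e(2) generates "1", "+", "1" is 0.5 · (0.5 · 0.1) · 0.1 = 1/400.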

# DeepStochLog

DeepStochLog integrates neural networks and SDCGs by introducing neural definite clause grammars (NDCGs). More formally, DeepStochLog allows for specifying an SDCG that additionally supports neural definite clause grammar rules. These are statements of the form nn(m,ℐ,𝒪,𝒟) :: nt → g1, ..., gn, where nt is an atom, g1, ..., gn is a goal (a conjunction of atoms), and ℐ = [I1,...,Im] and 𝒪 = [O1,...,OL] are lists of variables occurring in g1, ..., gn and nt. The nn declaration states that m is a neural network taking the variables I1, ..., Im as input and producing a probability distribution over the output variables O1, ..., OL. 𝒟 = [D1,...,DL] is a list of unary predicates, where Di defines the domain of output variable Oi.

Intuitively, a neural rule serves as a template for multiple SDCG rules sharing the same non-terminal predicate as head. They allow a neural network to compute the corresponding probabilities (that sum to 1). To do so, the neural network can take as inputs the arguments of the atoms in the rule. This allows for arguments of the rules to be subsymbolic information, like images.

DeepStochLog can be used to parse input sequences and compute the probability that the given NDCG parses a sequence. This is achieved by parsing the sequence using standard Prolog DCG techniques with SLG resolution to get an AND/OR tree (= parse tree). This tree is then translated into an arithmetic circuit by substituting AND nodes with multiplication, OR nodes with summation and leaves with corresponding probabilities. The probability that a given start symbol generates a given sequence of terminals is simply computed by evaluating the corresponding arithmetic circuit bottom-up.
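The tree-to-circuit translation described above is a straightforward recursion. A minimal Python sketch, using our own ("and"/"or", children) tuple encoding of the tree:

```python
# Sketch: evaluating an AND/OR parse tree as an arithmetic circuit:
# AND nodes become products, OR nodes become sums, and leaves carry
# rule probabilities. The tuple encoding of trees is our own convention.

def evaluate(node):
    if isinstance(node, (int, float)):
        return node                       # leaf: a rule probability
    op, children = node
    values = [evaluate(child) for child in children]
    if op == "and":
        product = 1.0
        for value in values:
            product *= value
        return product
    if op == "or":
        return sum(values)
    raise ValueError(f"unknown node type: {op!r}")
```

For instance, a goal derivable either directly (probability 0.5 · 0.1) or via an alternative derivation (0.5 · 0.2) evaluates to 0.5 · 0.1 + 0.5 · 0.2 = 0.15.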

Example 2 (DeepStochLog). We extend the SDCG in Example 1 to the neural setting by substituting the terminal rules with a neural rule, obtaining the following DeepStochLog program:

0.5 :: e(N) → n(N)
0.5 :: e(N) → e(N1), [+], n(N2), {N is N1 + N2}
nn(mnist,[Img],[N],[digit]) :: n(N) → [Img].

Here, the neural network called mnist takes as input an MNIST image Img and returns a probability for all N between 0 and 9, indicating its confidence for each digit. For a given input substitution Img=, the neural network outputs a distribution over the output substitutions {N = 0}; ...; {N = 9}, equivalent to adding rules: 0.87 :: n(0) → []  …  0.05 :: n(9) → []. We want to answer the following questions:

1. What is the probability that the goal e(1) generates the sequence [,+,]? (Figure 1)

2. What is the most probable N such that e(N) generates the sequence [,+,]?

# Results

DeepStochLog reaches state-of-the-art predictive performance on neural-symbolic tasks. When applied to parsing addition expressions or more general arithmetic expressions encoded as sequences of images of handwritten symbols, DeepStochLog outperforms state-of-the-art neural symbolic systems in parsing accuracy. In Table 1, we compare DeepStochLog with DeepProbLog [3] and NeurASP [5].

DeepStochLog outperforms PLP-based systems in terms of inference times thanks to the different semantics and tabling. DeepStochLog scales better than competitors when parsing longer sequences. While competitors either time out or degrade considerably in performance, DeepStochLog handles long sequences with no performance degradation. As a consequence, DeepStochLog can be applied to semi-supervised classification in real citation networks, a task out of scope for PLP-based frameworks.

DeepStochLog goes beyond grammars and can encode more general programs. DeepStochLog can exploit the equivalence between SDCGs and SLPs. Thus, it can encode more general programs that cannot be directly mapped to a grammar. As a consequence, DeepProbLog programs can be directly translated into DeepStochLog programs, by ensuring that the SDCG semantics is satisfied. We showed that DeepStochLog has the same performance as DeepProbLog on word algebra problems.

| Number of digits per number (N) | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| NeurASP | 97.3 ± 0.3 | 93.9 ± 0.7 | timeout | timeout |
| DeepProbLog | 97.2 ± 0.5 | 95.2 ± 1.7 | timeout | timeout |
| DeepStochLog | 97.9 ± 0.1 | 96.4 ± 0.1 | 94.5 ± 1.1 | 92.7 ± 0.6 |

Table 1: Accuracy on parsing addition expressions with N digits per number.
Figure 1: The AND/OR tree for the goal e(1) and the sentence [, +, ]. Failing branches are omitted.

#### Acknowledgements

This research was funded by the Research Foundation-Flanders (FWO-Vlaanderen), the KU Leuven Research Fund, the European Research Council (ERC), EU contracts #694980 and #952215, The Flemish Government and the Knut and Alice Wallenberg Foundation.

#### Bibliography

1. Luc De Raedt, Sebastijan Dumančić, Robin Manhaeve & Giuseppe Marra (2020): From Statistical Relational to Neuro-Symbolic Artificial Intelligence. In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI 2020), doi:10.24963/ijcai.2020/688.
2. Luc De Raedt & Angelika Kimmig (2015): Probabilistic (logic) programming concepts. Machine Learning 100(1), pp. 5–47, doi:10.1007/s10994-015-5494-z.
3. Robin Manhaeve, Sebastijan Dumančić, Angelika Kimmig, Thomas Demeester & Luc De Raedt (2018): DeepProbLog: Neural probabilistic logic programming. In: Advances in Neural Information Processing Systems 31.
4. Stephen Muggleton (1996): Stochastic logic programs. Advances in Inductive Logic Programming 32, pp. 254–264.
5. Zhun Yang, Adam Ishay & Joohyung Lee (2020): NeurASP: Embracing neural networks into answer set programming. In: 29th International Joint Conference on Artificial Intelligence (IJCAI 2020), doi:10.24963/ijcai.2020/243.

# Transforming Gringo Rules into Formulas in a Natural Way

Vladimir Lifschitz (University of Texas at Austin)

The semantics of some rules in the input language of the grounder gringo [2,3] can be characterized in terms of a translation into the language of first-order logic [5, Section 6]. The translation $\tau^*$, defined in that paper, transforms a mini-gringo rule into a first-order sentence that has the same meaning under the stable model semantics. It can be used to relate strong equivalence between gringo programs to a similar condition on first-order formulas [5, Proposition 4]. It is also used in the design of the proof assistant anthem [1].

The result of applying $\tau^*$ can be quite complicated, even for short rules. In this paper [4], we describe a set of mini-gringo rules—we call them regular—that can be transformed into formulas in a natural way. The new translation contributes to our understanding of the relationship between the language of gringo and first-order languages.

### Translation $\tau^*$

Formulas produced by $\tau^*$ contain variables of two sorts: integer variables and also general variables, which range over both integers and symbolic constants. The need to use a two-sorted language is related to the fact that, in first-order logic, function symbols represent total functions, and arithmetical operations are not applicable to symbolic constants.

For example, the result of applying $\tau^*$ to the rule $$q(X+1) \leftarrow p(X) \qquad (1)$$ is $$\forall X(\exists Z(Z=X \land p(Z))\to \forall Z_1 (\exists IJ(Z_1=I+J \land I=X \land J=1)\to q(Z_1))). \qquad (2)$$ (In formulas, we use $X$, $Y$, $Z$ as general variables, and $I$, $J$ as integer variables.)

Complicated formulas produced by $\tau^*$, such as (2), are often equivalent to much simpler formulas. We understand here equivalence of formulas as equivalence in two-sorted intuitionistic logic with equality; the use of intuitionistically acceptable simplifications in this context is essential because such simplifications do not affect the class of stable models [6]. For example, formula (2) is equivalent to $$\forall I(p(I)\to q(I+1)). \qquad (3)$$

### Translation $\nu$

Formula (3) is not only short but also natural as a representation of rule (1), in the sense that its syntactic form is similar to the syntactic form of the rule. The new translation $\nu$ produces, whenever it is defined, a sentence equivalent to the result of applying $\tau^*$. For instance, in application to rule (1), the new translation gives formula (3). A few other examples of formulas produced by $\nu$ are shown in Table 1.

|   | Rule $R$ | Formula $\nu(R)$ |
|---|----------|------------------|
| 1 | $q(X+1,Y) \leftarrow p(X,Y)$ | $\forall I(p(I,Y) \to q(I+1,Y))$ |
| 2 | $q(X,Y) \leftarrow p(X,Y),\ X=1\,..\,Y$ | $\forall IJ(p(I,J)\land 1\leq I\leq J\to q(I,J))$ |
| 3 | $q(1\,..\,X) \leftarrow p(X)$ | $\forall I(p(I)\to\forall J(1\leq J\leq I \to q(J)))$ |

Table 1: Examples of regular rules and their translations.

The expression $(1\,..\,3)*2$ (which denotes the set $\{2,4,6\}$) is an example of what is not allowed in regular rules.

### Relationship between rules and formulas

It was observed long ago that the head and body of a rule are similar to the consequent and antecedent of an implication, and that choice expressions are similar to excluded middle formulas. For instance, the rule $$\{q(X)\}\leftarrow p(X)$$ is similar to the formula $$p(X)\to q(X)\lor\neg q(X).$$ The translation $\nu$ allows us to extend this analogy to some rules in the input language of gringo that contain arithmetical operations. As discussed above, the distinction between general variables and integer variables is essential in this context.
• A variable in a regular rule is similar to an integer variable if it occurs at least once in the scope of an arithmetical operation or in a comparison that includes the interval symbol $..$ ; otherwise it is similar to a general variable. Consider, for instance, the rule in Line 1 of the table above. The variable $X$ in the rule is similar to an integer variable, because it is part of $X+1$; it is replaced by $I$ in the last column of the table. The variable $Y$ is similar to a general variable. In Line 2, both $X$ and $Y$ are similar to integer variables, because they occur within the comparison $X=1\,..\,Y$.
• In the body of a regular rule, the equal sign in a comparison of the form $t_1=t_2\,..\,t_3$ is similar to the set membership symbol $\in$ : it expresses that the integer denoted by the left-hand side is an element of the set denoted by the right-hand side. For instance, $\nu$ transforms $X=1\,..\,Y$ in Line 2 into the inequality expressing that $I$ belongs to the interval $[1,J]$.
• If the head of a regular rule contains the interval symbol $..$ then it is similar to a universally quantified formula (or to a conjunction, if it does not contain variables). For instance, $\nu$ transforms $q(1..X)$ in Line 3 into $\forall J(1\leq J\leq I \to q(J))$.
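The correspondence in the second bullet can be checked by brute force over a small domain. The following Python sketch is our own illustration, not part of the paper's machinery: it compares the q-facts derived by the rule in Line 2 of the table with those derived by its translation $\nu(R)$, grounded over a finite integer range.

```python
# Brute-force check: the rule in Line 2 and its translation derive the
# same q-facts from a given set of p-facts over a finite integer domain.

def q_from_rule(p_facts):
    # q(X,Y) <- p(X,Y), X = 1..Y : the comparison keeps X in [1, Y]
    return {(x, y) for (x, y) in p_facts if 1 <= x <= y}

def q_from_formula(p_facts, lo=-3, hi=3):
    # forall I,J (p(I,J) and 1 <= I <= J -> q(I,J)), grounded over [lo, hi]
    return {(i, j)
            for i in range(lo, hi + 1)
            for j in range(lo, hi + 1)
            if (i, j) in p_facts and 1 <= i <= j}
```

For example, from the p-facts {(1,2), (2,1), (0,3), (2,3)}, both versions derive exactly {(1,2), (2,3)}.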

### References

1. Jorge Fandinno, Vladimir Lifschitz, Patrick Lühne & Torsten Schaub (2020): Verifying Tight Programs with Anthem and Vampire. Theory and Practice of Logic Programming 20, pp. 735–750, doi:10.1017/S1471068420000344.
2. Martin Gebser, Amelia Harrison, Roland Kaminski, Vladimir Lifschitz & Torsten Schaub (2015): Abstract Gringo. Theory and Practice of Logic Programming 15, pp. 449–463, doi:10.1017/S1471068415000150.
3. Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Marius Lindauer, Max Ostrowski, Javier Romero, Torsten Schaub & Sven Thiele (2019): Potassco User Guide. Available at https://github.com/potassco/guide/releases/.
4. Vladimir Lifschitz (2021): Transforming Gringo Rules into Formulas in a Natural Way. In: Proceedings of the 17th European Conference on Logics in Artificial Intelligence, JELIA '21. Springer, pp. 421–434, doi:10.1007/978-3-030-75775-5_28.
5. Vladimir Lifschitz, Patrick Lühne & Torsten Schaub (2019): Verifying Strong Equivalence of Programs in the Input Language of gringo. In: Proceedings of the 15th International Conference on Logic Programming and Non-monotonic Reasoning, LPNMR '19. Springer, pp. 270–283, doi:10.1007/978-3-030-20528-7_20.
6. Vladimir Lifschitz, David Pearce & Agustin Valverde (2007): A characterization of strong equivalence for logic programs with variables. In: Proceedings of the 9th International Conference on Logic Programming and Non-monotonic Reasoning, LPNMR '07. Springer, pp. 188–200, doi:10.1007/978-3-540-72200-7_17.

# Solving a Multi-resource Partial-ordering Flexible Variant of the Job-shop Scheduling Problem with Hybrid ASP

Giulia Francescutto (Siemens AG Österreich, Vienna, Austria)
Konstantin Schekotihin (University of Klagenfurt, Klagenfurt, Austria)
Mohammed M. S. El-Kholany (University of Klagenfurt, Klagenfurt, Austria)

### Abstract

In many industries, scheduling is a key component of the efficient management of resources and, thus, of ensuring the competitiveness and success of companies by reducing the consumption of time and money. In this work, we propose a Multi-resource Partial-ordering Flexible Job-shop Scheduling formulation to capture scheduling scenarios where partially-ordered sequences of operations must be scheduled on multiple required resources, such as tools and specialists. The resources are flexible and can process one or more kinds of operations based on their characteristics. We build a model using Answer Set Programming in which the execution time of operations is determined using Difference Logic. Furthermore, we introduce two multi-shot strategies that identify time bounds and find a schedule while minimizing the total tardiness. Experiments on real-world instances from an Infineon Fault Analysis lab show that the devised model yields optimized schedules for 87 out of 91 instances.

### 1. Introduction

Job-shop Scheduling (JSS) ([7]) is one of the most well-known scheduling problems, in which a set of machines is used to execute given jobs, represented as sequences of operations, and the goal is to complete the jobs as soon as possible. Extensions like flexible ([1]) and multi-resource ([2]) JSS generalize the allocation of a machine and additional resources for an operation to suit practical scheduling applications. In this work, we consider a Multi-resource Partial-ordering Flexible JSS (MPF-JSS) problem, where multiple resources are needed for executing operations, and jobs consist of partially-ordered sequences of operations to be completed. More specifically, we distinguish machines and engineers as two separate sets of resources required to perform an operation. Flexibility lies in the choice among several resources with the required skills in each category. The goal is to determine the best possible execution order and allocation of operations to resources for minimizing the total tardiness of jobs w.r.t. their deadlines. We model the MPF-JSS problem by an encoding in Answer Set Programming (ASP) with Difference Logic (DL) ([4]) and take advantage of multi-shot solving ([5]). DL constraints allow for compactly expressing timing requirements, avoiding grounding issues that would otherwise be encountered when a feasible schedule necessitates a large number of time points. By means of multi-shot solving, we implement two strategies to identify upper time bounds on feasible schedules whose tardiness is further subject to minimization. We tested our proposed model on a dataset representing ten operational days of an Infineon Fault Analysis lab. Our experiments succeed in optimizing schedules for 87 out of 91 real-world instances, while only a very small number of jobs can be handled without multi-shot solving.

### 2. Modeling MPF-JSS with Hybrid ASP

We consider a scheduling problem in which different resources are interrelated to process the upcoming operations. In particular, MPF-JSS is an extension of the classical JSS problem where three additional aspects are considered: (a) Multi-resources – several categories of resources can be required to execute an operation; (b) Partially-ordered – some operations cannot be executed before completing their predecessors and others are unaffected by such constraints; and (c) Flexible – an operation can be performed by a variety of available resources. More specifically, an MPF-JSS instance consists of a set of jobs, characterized by a partially ordered set of operations to be performed, where each operation needs to be allocated to some machine and (possibly) an engineer with the required skills. The following constraints must be respected: (i) once an operation starts, it cannot be interrupted; (ii) each resource can perform only one operation at a time; (iii) the execution times of operations of the same job cannot overlap; and (iv) operations must be scheduled according to the partial order given by jobs.

We propose two multi-shot ASP modulo DL solving strategies. The first approach incrementally increases the upper bound on the tardiness of each job in order to identify a limit for which schedules are feasible. That is, search starts with the tardiness bound 0, and if this yields unsatisfiability (UNSAT), the bound is incremented by a constant, set by a parameter of our Python control script on top of clingo[DL] ([6]), and this process proceeds until some schedule is found. Figure 1 illustrates an execution of the incremental search strategy, where the bound on jobs' tardiness is increased in steps of size 2 until satisfiability (SAT) is obtained with the time bound 8. ASP optimization methods are then used to find a schedule minimizing the total tardiness, i.e., the sum of delays over jobs completed after their deadlines.

The second approach performs an exponential binary search for the exact tardiness bound at which schedules become feasible. As displayed in Figure 2, the increment on the bound is gradually increased up to step size 4 for finding the first schedule with the time bound 8. A binary search scheme then successively halves the step size to converge to the exact bound, 7 in this case, for which schedules are feasible, whose total tardiness is subject to ASP optimization in the last phase of the multi-shot solving process.
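The two strategies can be sketched as bound searches over a satisfiability oracle. In this Python sketch, `feasible(bound)` is a hypothetical stand-in for a clingo[DL] satisfiability check under the given tardiness bound; names and step sizes are illustrative.

```python
# Sketch of the two bound-finding strategies over a SAT oracle.

def incremental(feasible, step):
    """Raise the bound by a constant step until the problem is SAT."""
    bound = 0
    while not feasible(bound):
        bound += step
    return bound  # first bound tried that admits a schedule

def exponential(feasible):
    """Grow the increment geometrically, then binary-search the exact
    threshold at which schedules become feasible."""
    lo, step, bound = 0, 1, 0
    while not feasible(bound):          # phase 1: exponential growth
        lo = bound + 1
        bound += step
        step *= 2
    hi = bound
    while lo < hi:                      # phase 2: binary search in [lo, hi]
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo  # exact minimal feasible bound
```

With a feasibility threshold of 7, incremental search with step 2 tries 0, 2, 4, 6, 8 and stops at 8, while the exponential variant converges to the exact bound 7.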

### 3. Results

We conducted experiments on a set of real-world instances of the MPF-JSS problem representing the operations of ten days randomly selected from the work records of an Infineon Fault Analysis lab. The instances yield the following approximate numbers of components, based on the dimensions of the studied lab: (i) 50 operation kinds; (ii) 75 machines; and (iii) 45 engineers. For each of the selected days, the number of jobs ranges between 30 and 50. We further split each instance into up to ten sub-instances by picking 5, 10, 15, 20, etc. of the respective jobs, which resulted in a total of 91 instances of varying size.

For each obtained instance, we ran three solving strategies with clingo[DL] (version 1.1.0): (1) single-shot – a standard ASP modulo DL program with a tardiness bound precomputed using a heuristic; (2) inc – the incremental multi-shot variant with a constant step size of 20 time points; and (3) exp – the exponential multi-shot solving approach. We restricted each solver run to 2 hours wall-clock time on an Intel 3930K workstation with 64GB RAM under Ubuntu 18.04.

Figure 3 shows a cactus plot comparing the solving performance of the two multi-shot approaches and the single-shot version as baseline. The latter times out on all instances with more than 15 jobs and only succeeds in optimizing schedules on instances with 15 jobs for two of the ten days. The two multi-shot approaches significantly outperform the single-shot version. While the incremental strategy yields optimized schedules for 85 out of 91 instances, the exponential multi-shot variant scales up to 87 instances. This advantage is due to the tighter tardiness bounds identified by the exponential strategy, which reduce the effort for ASP optimization to minimize the total tardiness. Given that a lower tardiness bound is more restrictive, the incremental variant manages to find better schedules than the exponential strategy for three instances, where the total tardiness improvement amounts to 80 time points on average.

Figure 1: Incremental approach
Figure 2: Exponential approach
Figure 3: Cactus plot of solving times

### References

1. Peter Brucker & Rainer Schlie (1990): Job-shop scheduling with multi-purpose machines. Computing 45(4), pp. 369–375, doi:10.1007/BF02238804.
2. Stéphane Dauzère-Pérès, W. Roux & Jean Lasserre (1998): Multi-resource shop scheduling with resource flexibility.. EJOR 107(2), pp. 289–305, doi:10.1016/S0377-2217(97)00341-X.
3. Giulia Francescutto, Konstantin Schekotihin & Mohammed El-Kholany (2021): Solving a multi-resource partial-ordering flexible variant of the job-shop scheduling problem with hybrid ASP. In: Proceedings of the European Conference on Logics in Artificial Intelligence (JELIA 2021), Springer, pp. 313–328. doi:10.1007/978-3-030-75775-5_21
4. Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Max Ostrowski, Torsten Schaub & Philipp Wanko (2016): Theory solving made easy with clingo 5. In: Technical Communications of the International Conference on Logic Programming (ICLP 2016), Leibniz International Proceedings in Informatics, pp. 2:1–2:15. doi:10.4230/OASIcs.ICLP.2016.2.
5. Martin Gebser, Roland Kaminski, Benjamin Kaufmann & Torsten Schaub (2019): Multi-shot ASP solving with clingo. TPLP 19(1), pp. 27–82, doi:10.1017/S1471068418000054.
6. Tomi Janhunen, Roland Kaminski, Max Ostrowski, Sebastian Schellhorn, Philipp Wanko & Torsten Schaub (2017): Clingo goes linear constraints over reals and integers. TPLP 17(5-6), pp. 872 – 888, doi:10.1017/S1471068417000242.
7. Selmer Martin Johnson (1954): Optimal two-and three-stage production schedules with setup times included. Naval Research Logistics Quarterly 1(1), pp. 61–68, doi:10.1002/nav.3800010110

# Efficient Datalog Rewriting for Query Answering in TGD Ontologies

Zhe Wang (Griffith University)
Peng Xiao (Alibaba)
Kewen Wang (Griffith University)
Zhiqiang Zhuang (Tianjin University)
Hai Wan (Sun Yat-sen University)

### Background

Ontology-mediated query (OMQ) answering concerns answering a query w.r.t. an ontology acting as a virtual schema. In this paper, we consider ontologies consisting of existential rules (a.k.a. Datalog$\pm$ rules or tuple-generating dependencies (TGDs)) [1,2], which form a family of expressive ontology languages that cover Datalog and many Horn description logics, including the core dialects of DL-Lite and $\mathcal{EL}$, which underlie the OWL 2 Profiles. An OMQ consists of a finite set of existential rules $\Sigma$ and a conjunctive query (CQ) $q$, denoted $\Sigma_q$.

There are two major approaches for answering OMQs: the chase-based approach and the query rewriting approach. The chase-based approach relies on the termination of the chase procedure (a.k.a. forward chaining), and it has been shown that computing a full chase can be rather inefficient when query answers depend only on a small portion of it. On the other hand, existing query rewriting systems for existential rules are typically based on first-order rewritings [2,3]. A limitation of such an approach is that it can only handle ontologies and queries that are first-order rewritable. Yet many practical ontologies do not fall into these classes, such as some ontologies formulated in $\mathcal{EL}$. Even for ontologies and queries that are first-order rewritable, the results of rewriting can suffer from a significant blow-up and become difficult for a DBMS to handle.

It has been shown that taking Datalog as the target query language can lead to much more compact rewritings, yet existing research on Datalog rewriting of existential rules is mostly theoretical, and very few systems have been developed for Datalog rewriting over existential rules more general than description logics. In this paper, we fill the gap by presenting both a practical approach and a prototype system for Datalog rewriting and query answering over a wide range of ontologies expressed in existential rules. We also identify classes of ontologies for which the rewriting process terminates, introducing a new class that combines existing well-accepted classes. We implement a prototype system, Drewer, and experiments show that it is able to handle a wide range of benchmarks from the literature. Moreover, Drewer shows superior performance over state-of-the-art systems in both the compactness of rewriting and the efficiency of query answering.

### Our Approach

Our algorithm is based on the notions of piece unification [1] and unfolding [5]. To achieve compactness of rewriting, we separate the results of unfolding into short rules by introducing the so-called separating predicates and reusing such predicates when possible.

Consider a Boolean CQ $q$ expressed as a Datalog rule $\mathsf{Q} \leftarrow\mathsf{A}(U,V,W)\wedge \mathsf{A}(U,V,V) \wedge \mathsf{A}(U,U,W)$ and an existential rule $\sigma_1:\; \exists Y_1.\mathsf{A}(X_1,X_1,Y_1) \leftarrow\mathsf{B}(X_1,Z_1)$. The unfolding of $q$ by $\sigma_1$ gives $\rho_1:\; \mathsf{Q} \leftarrow\mathsf{A}(U,U,U)\wedge \mathsf{B}(U,Z_1)$. And we split the body of $\rho_1$ via a fresh predicate $\mathsf{P}$ to obtain two additional Datalog rules $\rho'_1:\; \mathsf{Q} \leftarrow\mathsf{A}(U,U,U) \wedge \mathsf{P}(U)$ and $\rho''_1:\; \mathsf{P}(U) \leftarrow\mathsf{B}(U,Z_1)$. Rules like $\rho_1$ are auxiliary and will be temporarily kept for further rewriting. For example, $\rho_1$ can be further rewritten by another existential rule $\sigma_2:\; \exists X_2.[\mathsf{A}(X_2,X_2,X_2)\wedge \mathsf{B}(X_2,Y_2)] \leftarrow\mathsf{C}(Y_2)$, whereas $\rho'_1$ and $\rho''_1$ cannot. After the rewriting process is completed, auxiliary rules like $\rho_1$ are removed.
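The splitting step above can be sketched in a few lines. The following is a minimal illustration (not Drewer's actual code) of separating a rewritten rule's body with a fresh predicate, as in the step from $\rho_1$ to $\rho'_1$ and $\rho''_1$; rules are (head, body) pairs and atoms are (predicate, args) tuples.

```python
# Sketch of body splitting via a fresh separating predicate.
from itertools import count

_fresh = count(1)

def vars_of(atoms):
    # Variables are upper-case symbols, following Datalog notation.
    return {t for _, args in atoms for t in args if t[0].isupper()}

def split_rule(head, body_kept, body_split):
    """Split `head <- body_kept, body_split` into `head <- body_kept, P(X)`
    and `P(X) <- body_split`, where X collects the variables of body_split
    shared with the head or the rest of the body."""
    shared = sorted((vars_of([head]) | vars_of(body_kept)) & vars_of(body_split))
    p = ("P%d" % next(_fresh), tuple(shared))   # fresh separating predicate
    return (head, body_kept + [p]), (p, body_split)

# rho_1:  Q <- A(U,U,U), B(U,Z1)   split at B(U,Z1)
r1, r2 = split_rule(("Q", ()), [("A", ("U", "U", "U"))], [("B", ("U", "Z1"))])
# r1:  Q <- A(U,U,U), P1(U)        r2:  P1(U) <- B(U,Z1)
```

Reusing separating predicates for identical body parts, as described below, is what keeps the rewriting compact.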

Definition 1. For a Datalog rule $\rho$, an existential rule $\sigma$, and a piece unifier $\mu=\langle H, B, \tau\rangle$ of $\rho$ and $\sigma$, the result of rewriting $\rho$ by $\sigma$ with $\mu$ consists of the following Datalog rules \begin{align} \mathsf{head}(\rho)\tau & \leftarrow\bigwedge(\mathsf{body}(\rho)\setminus B)\tau \wedge \bigwedge \mathsf{body}(\sigma)\tau, \label{eq:0}\\ \mathsf{head}(\rho)\tau & \leftarrow\bigwedge(\mathsf{body}(\rho)\setminus B)\tau \wedge \mathsf{P}(\mathbf{X}),\label{eq:1}\\ \mathsf{P}(\mathbf{X}) & \leftarrow\bigwedge \mathsf{body}(\sigma)\tau, \label{eq:2} \end{align} where $\mathbf{X} = (\mathbf{X}_{\rho} \cup \mathcal{V}_{\mathsf{body}(\rho)\setminus B})\tau \cap \mathbf{X}_{\sigma}\tau$, and $\mathsf{P}$ is a fresh predicate with arity $|\mathbf{X}|$, called a separating predicate.

To avoid repetitively introducing fresh separating predicates, we reuse them through a labelling function. The rewriting of an OMQ $\Sigma_q$, denoted as $\mathsf{rew}(\Sigma_q)$, consists of all the resulting Datalog rules in a sequence of rewriting as above except for the auxiliary rules. $\mathsf{rew}(\Sigma_q)$ is finite if a fixpoint is reached in finite steps. The correctness of our rewriting is shown below.

Proposition 1. For an OMQ $\Sigma_q$, $\mathsf{rew}(\Sigma_q)$ is a Datalog rewriting of $\Sigma_q$ whenever $\mathsf{rew}(\Sigma_q)$ is finite.

We have also introduced several new Datalog rewritable classes based on variants of our Datalog rewriting methods. An ontology $\Sigma$ is separable if for each atomic query $q$, the variant of $\mathsf{rew}(\Sigma_q)$ that does not generate auxiliary rules is a Datalog rewriting of $\Sigma_q$. Intuitively, the condition requires that the bodies of the Datalog rules generated during rewriting can be separated. The class of separable ontologies is denoted $\mathsf{Sep}$, and we show that the existing $\mathsf{Shy}$ class [4], which was originally introduced to guarantee the termination of a special form of chase, is a subclass of $\mathsf{Sep}$.

Theorem 1. $\mathsf{Shy} \subset \mathsf{Sep}$.

The separability condition can be too restrictive, and we relax it to allow blocks of body atoms, instead of individual atoms, to be separately rewritten. We also associate each block $B$, from the initial OMQ or generated during rewriting, with a fresh predicate $\mathsf{P}_B$, called a block predicate, which generalises the notion of separating predicates. Next, we define an ontology $\Sigma$ to be block expandable if for each block $B$ consisting of more than one atom, the collection of rules that can be applied in recursively rewriting atoms in $B$ can be syntactically verified to be a finite unification set [1]. The class of block expandable ontologies is denoted $\mathsf{BExp}$; it is a concrete class whose membership can be verified syntactically. $\mathsf{BExp}$ contains several well-known classes, such as linear ($\mathsf{Lin}$), sticky ($\mathsf{Stky}$) [2], and $\mathsf{aGRD}$ (that is, ontologies whose graphs of rule dependencies are acyclic) [1].

Theorem 2. $\mathsf{Lin}\cup \mathsf{Stky}\cup \mathsf{aGRD}\cup \mathsf{Shy}\subset \mathsf{BExp}$.

### Experiments

We have implemented a prototype system, Drewer (Datalog REWriting for Existential Rules), and evaluated the performance of our system on several commonly used ontologies for ontology-mediated query answering.

Table 1 shows the evaluation of query rewriting, comparing Drewer with state-of-the-art query rewriting systems on rewriting size and time efficiency. Sizes are measured in numbers of rules and times in milliseconds. "TO" denotes a timeout, whereas "-" means the system could not handle the OMQ (or reported errors). For Graal, rewriting sizes are given as $x+y$, where $x$ is the number of CQs in the rewriting and $y$ is the number of Datalog rules corresponding to its so-called compiled pre-orders.

Table 1: Comparison on query rewriting (Drewer, Rapid and Grind produce Datalog rewritings; Graal produces UCQ rewritings)

| Ontology | Query | Drewer size | Drewer time | Rapid size | Rapid time | Grind size | Grind time | Graal size(*) | Graal time |
|---|---|---|---|---|---|---|---|---|---|
| OBOprotein | q1 | 30 | 63 | 29 | 11 | 29 | 22971 | 20+7 | 156 |
| | q2 | 1360 | 637 | 1356 | 998 | 1358 | 23237 | 1264+92 | 7088 |
| | q3 | 34742 | 185 | 33919 | 75413 | - | - | 1+33918 | 334 |
| | q4 | 34805 | 1961 | 34879 | 84318 | - | - | 682+34085 | 2928 |
| | q5 | 1389 | 668 | 27907 | TO | - | - | TO | TO |
| RS | q1 | 16 | 127 | TO | TO | TO | TO | 14+4 | 561 |
| | q2 | 22 | 977 | TO | TO | TO | TO | 100+4 | 16367 |
| | q3 | 23 | 10756 | TO | TO | TO | TO | 143+4 | 46557 |
| Reactome | q1 | 3 | 3 | 3 | 6 | - | - | 1+2 | 34 |
| | q2 | 16 | 4 | 13 | 6 | - | - | TO | TO |
| | q3 | 16 | 4 | 13 | 6 | - | - | TO | TO |
| | q4 | 16 | 5 | 15 | 7 | - | - | TO | TO |
| | q5 | 16 | 4 | 15 | 6 | - | - | TO | TO |
| DEEP300 | q1 | 25 | 13 | - | - | - | - | 117+2 | 971 |
| | q2 | 30 | 15 | - | - | - | - | 528+3 | 13072 |
| | q3 | 30 | 14 | - | - | - | - | 624+2 | 17485 |
| | q4 | 234 | 55 | - | - | - | - | TO | TO |
| | q5 | 55 | 23 | - | - | - | - | TO | TO |

### References

1. Jean-François Baget, Michel Leclère, Marie-Laure Mugnier & Eric Salvat (2011): On rules with existential variables: Walking the decidability line. Artificial Intelligence Journal, 175(9-10):1620–1654, doi:10.1016/j.artint.2011.03.002.
2. Andrea Calì, Georg Gottlob & Andreas Pieris (2012): Towards more expressive ontology languages: The query answering problem. Artificial Intelligence Journal, 193:87–128, doi:10.1016/j.artint.2012.08.002.
3. Mélanie König, Michel Leclère, Marie-Laure Mugnier & Michaël Thomazo (2015): Sound, complete and minimal UCQ-rewriting for existential rules. Semantic Web, 6(5):451–475, doi:10.3233/SW-140153.
4. Nicola Leone, Marco Manna, Giorgio Terracina & Pierfrancesco Veltri (2019): Fast Query Answering over Existential Rules. ACM Transactions on Computational Logic, 20(2):12:1–12:48, doi:10.1145/3308448.
5. Zhe Wang, Kewen Wang & Xiaowang Zhang (2018): Forgetting and Unfolding for Existential Rules. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence, pp. 2013–2020.

# ApproxASP - A Scalable Approximate Answer Set Counter (Extended Abstract)

Mohimenul Kabir (National University of Singapore)
Flavio Everardo (Tec de Monterrey Campus Puebla)
Ankit Shukla (JKU Linz)
Johannes K. Fichte (TU Wien)
Markus Hecher (TU Wien)
Kuldeep S. Meel (National University of Singapore)

### Abstract

Answer Set Programming (ASP) is a framework in artificial intelligence and knowledge representation for declarative modeling and problem solving. Modern ASP solvers focus on the computation or enumeration of answer sets. However, a variety of probabilistic applications in reasoning or logic programming require counting answer sets. While counting can be done by enumeration, simple enumeration becomes immediately infeasible if the number of solutions is high; approaches to exact counting, in turn, are of high worst-case complexity and, as in propositional model counting, often impractical. In this work, we present a scalable approach to approximate counting for ASP. Our approach is based on systematically adding parity (XOR) constraints to ASP programs, which partition the search space. We prove that adding random XOR constraints partitions the answer sets of an ASP program. In practice, we use a Gaussian elimination-based approach by lifting ideas from SAT to ASP and integrate it into a state-of-the-art ASP solver, which we call $\mathsf{ApproxASP}$. Finally, our experimental evaluation shows the scalability of our approach over existing ASP systems.

### Introduction

Answer Set Programming (ASP) [1][2] is a form of declarative programming. Two decades of progress in the theory and practice of ASP solving offer a rich modeling and solving framework with well-known applications in knowledge representation and reasoning and in artificial intelligence. The problem of counting the number of solutions to a given ASP program, known as #ASP, is a computationally intriguing problem with applications in probabilistic reasoning [3], planning [4], and combinatorial design [5]. Approximate counting (ApproxMC) [6] has proved particularly successful in propositional model counting [7]. We investigate the design of scalable techniques for #ASP. In this context, one may wonder whether #ASP can be reduced to #SAT, since then invoking a counter such as ApproxMC would suffice. However, it is well known (and observed in practice) that such a reduction might result in exponential blow-up [8]. The primary contribution of this work is the design of the first scalable technique for #ASP that provides rigorous $(\varepsilon,\delta)$ guarantees. From the technical perspective, we lift the XOR-based hashing framework developed in the context of propositional model counting to ASP. As witnessed in the development of ApproxMC, designing a scalable counter requires engineering the underlying solver to handle XOR constraints. To this end, we present the first ASP solver that natively handles XOR constraints via Gauss-Jordan elimination (GJE).

### Approximate Counting

An answer-set program $P$ is a set of rules of the form $a_1\vee \dots \vee a_\ell \leftarrow a_{\ell+1},\dots,a_m,\mathord{\sim} a_{m+1}, \dots, \mathord{\sim} a_n$ where $0 \leq \ell \leq m \leq n$ and each $a_i$ is an atom. We follow standard definitions of propositional ASP and assume familiarity with the definition of an answer set [1]. We let $\mathsf{AS}(P)$ consist of all answer sets of $P$. Given a program $P$, the problem of counting answer sets, #ASP for short, asks to compute the number of answer sets of $P$. In general, #ASP is computationally very hard, more precisely #·coNP-complete [9]. If we restrict #ASP to normal programs, the complexity drops to #P-complete, which is easy to see from standard reductions [10].

Approximate counting computes an estimate of the number of solutions using a probabilistic algorithm. Assume that $\mathsf{Sol}(I)$ is the set of solutions for a problem instance $I$. The approximation algorithm takes as input the instance $I$, a real $\epsilon > 0$ called tolerance, and a real $\delta$ with $0 < \delta \leq 1$ called confidence. The output is a real number $\mathit{cnt}$, which estimates the cardinality of $\mathsf{Sol}(I)$ subject to the parameters $\epsilon$ and $\delta$ such that $\Pr[\frac{|\mathsf{Sol}(I)|}{1 + \epsilon} \leq \mathit{cnt} \leq (1 + \epsilon) \cdot |\mathsf{Sol}(I)|] \geq 1 - \delta$. The central idea of the approximate model counting approach is to use hash functions to partition the answer sets of a program $P$ into roughly equal, small cells, then pick a random cell and scale its count by the number of cells to obtain an $\epsilon$-approximate estimate of the model count. Note that a priori we do not know the distribution of answer sets, so we have to hash well without that knowledge, i.e., partition $\mathsf{AS}(P)$ into cells such that each cell contains a roughly equal number of answer sets. Interestingly, this can be achieved with a $k$-wise independent hash function. In the context of approximate sampling and counting techniques, the most exploited hash family is $\mathcal{H}_{xor}$ [6], whose members are also called parity constraints.
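The partitioning effect of $\mathcal{H}_{xor}$ can be seen in a toy experiment; the sketch below (an illustration only, with brute-force enumeration standing in for a solver) hashes every assignment over $n$ atoms to a cell determined by $m$ random parity constraints.

```python
# Toy illustration of how random XOR constraints partition a solution space.
import random

def random_xors(n, m, rng):
    """Draw m random parity constraints over n atoms: (subset of indices, parity)."""
    return [([i for i in range(n) if rng.random() < 0.5], rng.randint(0, 1))
            for _ in range(m)]

def cell_of(assignment, xors):
    """Hash an assignment (tuple of 0/1) to the cell determined by the XORs."""
    return tuple(sum(assignment[i] for i in subset) % 2 == parity
                 for subset, parity in xors)

rng = random.Random(0)
n, m = 10, 3
xors = random_xors(n, m, rng)
cells = {}
for x in range(2 ** n):                      # enumerate all 2^10 assignments
    a = tuple((x >> i) & 1 for i in range(n))
    cells.setdefault(cell_of(a, xors), []).append(a)
# With m linearly independent constraints, the 2^n assignments fall into
# 2^m cells of exactly 2^(n-m) assignments each.
```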

Intuitively, a parity constraint ensures that an even or odd number of its atoms occur in an answer set of the program. In fact, the ASP-Core-2 language already allows for representing parity constraints using aggregate rules. Unfortunately, such a construction is quite impractical for two reasons: first, it would require far too many auxiliary variables for each new parity constraint; second, it would suffer from the known inability of grounding-based approaches to scale. Hence, we follow an incremental, propagator-based implementation for solving parity constraints. We extend clingo with full XOR solving using Han and Jiang's Gauss-Jordan elimination (GJE) [11], which is implemented in the state-of-the-art SAT solver Cryptominisat [12].
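The core of XOR reasoning over GF(2) is compact; the following sketch shows forward elimination only (the full GJE machinery of [11] also back-substitutes and propagates implied literals), with each constraint encoded as a bitmask of variables plus a parity bit.

```python
def eliminate_gf2(constraints):
    """Incrementally reduce XOR constraints over GF(2).
    Each constraint is (mask, rhs): the XOR of the variables in bitmask `mask`
    must equal rhs. Returns None on an inconsistency (0 = 1), which in a
    solver would trigger a conflict; otherwise the reduced, independent rows."""
    basis = []                              # one pivot (lowest set bit) per row
    for mask, rhs in constraints:
        for bmask, brhs in basis:
            if mask & (bmask & -bmask):     # row contains this pivot: eliminate
                mask ^= bmask
                rhs ^= brhs
        if mask == 0:
            if rhs == 1:
                return None                 # inconsistent system
        else:
            basis.append((mask, rhs))
    return basis

# x0^x1 = 1 and x1^x2 = 0 imply x0^x2 = 1, so a third row asserting that
# is redundant...
sat = eliminate_gf2([(0b011, 1), (0b110, 0), (0b101, 1)])
# ...whereas asserting x0^x2 = 0 instead is inconsistent:
unsat = eliminate_gf2([(0b011, 1), (0b110, 0), (0b101, 0)])
```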

The actual approximate counting works as follows. We take an algorithm that is well established for propositional satisfiability and lift it to ASP. The $\mathsf{ApproxASP}$ algorithm takes as input an ASP program $P$, an independent support $I$ for $P$ (which can simply be all atoms of the program), a tolerance $\epsilon$ with $0 < \epsilon \leq 1$, and a confidence $\delta$ with $0 < \delta \leq 1$. First, sampled counts and the number of cells are initialized. Then, we compute a threshold $pivot$ that depends on $\epsilon$ and determines the chosen size of a small cell. Next, we check whether the input program $P$ has at least $pivot$ answer sets projected to $I$ (Enum-k-AS); otherwise, we are trivially done and return $\mathsf{AS}(P)$. Subsequently, the algorithm calculates a value that determines how often we need to sample for the input confidence $\delta$. Next, we divide the search space and sample a cell at most $r$ times using the function DivideNSampleCell. If the attempt to split into at least 2 cells succeeds, indicated by a non-zero number of cells, we store the count and estimate the total count as the number of cells times the answer-set count within the sampled cell. Finally, the estimate of the count is the median of the stored estimates.
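The loop just described can be sketched as follows. This is a simplified toy version, not the ApproxASP implementation: an explicit solution set plays the role of the solver's bounded enumeration within a cell, and the pivot threshold follows the ApproxMC-style formula.

```python
# Toy sketch of the hashing-based counting loop described above.
import random
import statistics

def approx_count(solutions, n, epsilon=0.8, rounds=9, rng=None):
    """Estimate |solutions| over n binary atoms via XOR hashing."""
    rng = rng or random.Random(0)
    # small-cell threshold, chosen as a function of the tolerance epsilon
    pivot = int(9.84 * (1 + epsilon / (1 + epsilon)) * (1 + 1 / epsilon) ** 2)
    if len(solutions) <= pivot:          # trivially done: return the exact count
        return len(solutions)
    estimates = []
    for _ in range(rounds):              # rounds depends on the confidence delta
        xors, cell = [], solutions
        # add XOR constraints one by one until the sampled cell is small enough
        while len(cell) > pivot and len(xors) < n:
            subset = [i for i in range(n) if rng.random() < 0.5]
            parity = rng.randint(0, 1)
            xors.append((subset, parity))
            cell = [s for s in cell if sum(s[i] for i in subset) % 2 == parity]
        # scale the sampled cell's count by the number of cells
        estimates.append(len(cell) * 2 ** len(xors))
    return statistics.median(estimates)

# sanity check on a fully known space: all 2^12 assignments are "solutions"
space = [tuple((x >> i) & 1 for i in range(12)) for x in range(2 ** 12)]
est = approx_count(space, 12)
```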

Table 1: The runtime performance comparison of (C) clingo, (D) DynASP, (G) Ganak, (AMC) ApproxMC, and (AAS) ApproxASP on all considered instances.

| | | C | D | G | AMC | AAS |
|---|---|---|---|---|---|---|
| Normal | #Instances | 1500 | 1500 | 1500 | 1500 | 1500 |
| | #Solved | 738 | 47 | 973 | 1325 | 1323 |
| | PAR-2 | 5172 | 9705 | 3606 | 1200 | 1218 |
| Disjunctive | #Instances | 200 | 200 | 200 | 200 | 200 |
| | #Solved | 177 | 0 | 0 | 0 | 185 |
| | PAR-2 | 1372 | 10000 | 10000 | 10000 | 795 |

#### Empirical Work.

We implemented the algorithm and conducted a preliminary experimental evaluation to assess the runtime performance and the quality of the estimates. We consider the following questions: (i) How does $\mathsf{ApproxASP}$ compare to existing systems? (ii) Does $\mathsf{ApproxASP}$ output approximate counts that are indeed between $(1 + \epsilon)^{-1}$ and $(1 + \epsilon)$ times the exact counts? Table 1 summarizes the outcome for Question (i). For disjunctive logic programs, $\mathsf{ApproxASP}$ performs well: it solved $185$ of $200$ instances, while the best ASP solver, clingo, solved $177$ instances. On normal logic programs, $\mathsf{ApproxASP}$ performs on par with the state-of-the-art approximate model counter ApproxMC. Moreover, for each instance, the estimate of $\mathsf{ApproxASP}$ lies between $(1 + \epsilon)^{-1}$ and $(1 + \epsilon)$ times the exact count.

### Conclusion

We present $\mathsf{ApproxASP}$, a scalable approximate counter for ASP programs that employs pairwise independent hash functions, represented as XOR constraints, to partition the solution space, and then invokes an ASP solver on a randomly chosen cell. To achieve practical efficiency, we augment the state-of-the-art ASP solver clingo with native support for XORs. Our empirical evaluation demonstrates that $\mathsf{ApproxASP}$ is able to handle problems that lie beyond the reach of existing counting techniques: it is competitive with ApproxMC on the subset of instances that can be translated to CNF without exponential blow-up, and it can additionally handle disjunctive programs, which cannot be solved via reduction to #SAT without exponential blow-up. The empirical analysis therefore positions $\mathsf{ApproxASP}$ as the tool of choice for counting over ASP programs.

### References

1. Gerhard Brewka, Thomas Eiter & Mirosław Truszczyński (2011): Answer set programming at a glance. In: Communications of the ACM 54, no. 12, pp. 92-103, doi:10.1145/2043174.2043195.
2. Martin Gebser, Roland Kaminski, Benjamin Kaufmann & Torsten Schaub (2012) Answer set solving in practice. In: Synthesis lectures on artificial intelligence and machine learning 6, no. 3, pp. 1-238. doi:10.2200/S00457ED1V01Y201211AIM019.
3. Joohyung Lee, Samidh Talsania & Yi Wang (2017) Computing LPMLN using ASP and MLN solvers. In: Theory and Practice of Logic Programming 17, no. 5-6, pp. 942-960. doi:10.1017/S1471068417000400.
4. Johannes Klaus Fichte, Sarah Alice Gaggl & Dominik Rusovac (2021) Rushing and Strolling among Answer Sets -- Navigation Made Easy. In: AAAI Conference on Artificial Intelligence, 36(5), pp. 5651-5659.
5. Monica Nogueira, Marcello Balduccini, Michael Gelfond, Richard Watson & Matthew Barry (2021) An A-Prolog decision support system for the Space Shuttle. In: International symposium on practical aspects of declarative languages, pp. 169-183. doi:10.1007/3-540-45241-9_12
6. Mate Soos, Stephan Gocht & Kuldeep S. Meel (2020) Tinted, detached, and lazy CNF-XOR solving and its applications to counting and sampling. In: International Conference on Computer Aided Verification, pp. 463-484. doi:10.1007/978-3-030-53288-8_22
7. Johannes Klaus Fichte, Markus Hecher & Florim Hamiti (2021) The model counting competition 2020. In: Journal of Experimental Algorithmics (JEA), pp. 1-26. doi:10.1145/3459080
8. Vladimir Lifschitz & Alexander Razborov (2006) Why are there so many loop formulas?. In: ACM Transactions on Computational Logic (TOCL) 7, no. 2, pp. 261-268. doi:10.1145/1131313.1131316
9. Johannes Klaus Fichte, Markus Hecher, Michael Morak & Stefan Woltran (2017) Answer set solving with bounded treewidth revisited. In: International Conference on Logic Programming and Nonmonotonic Reasoning, pp. 132-145. doi:10.1007/978-3-319-61660-5_13
10. Tomi Janhunen (2006) Some (in) translatability results for normal logic programs and propositional theories. In: Journal of Applied Non-Classical Logics 16, no. 1-2, pp. 35-86. doi:10.3166/jancl.16.35-86
11. Cheng-Shen Han & Jie-Hong Roland Jiang (2012) When boolean satisfiability meets gaussian elimination in a simplex way. In: International Conference on Computer Aided Verification, pp. 410-426. doi:10.1007/978-3-642-31424-7_31
12. Mate Soos, Karsten Nohl & Claude Castelluccia (2009) Extending SAT solvers to cryptographic problems. In: International Conference on Theory and Applications of Satisfiability Testing, pp. 244-257. doi:10.1007/978-3-642-02777-2_24

# Exploiting Parameters Learning for Hyper-parameters Optimization in Deep Neural Networks

Michele Fraccaroli (University of Ferrara)
Evelina Lamma (University of Ferrara)
Fabrizio Riguzzi (University of Ferrara)

In recent years, the Hyper-parameter Optimization (HPO) research field has gained more and more attention. Many works have focused on finding the best combination of a Deep Neural Network's (DNN's) hyper-parameters (HPs) or architecture. The state-of-the-art algorithm for HPO is Bayesian Optimization (BO): it keeps track of past results obtained during the optimization and uses this experience to build a probabilistic model mapping HPs to a probability density of the objective function. BO builds a surrogate probabilistic model of the objective function, finds the HP values that perform best on the surrogate model, and updates it with new results. In this work we developed a system, called Symbolic DNN-Tuner, which logically evaluates the results obtained from the training and validation phases and, by applying symbolic tuning rules, fixes the network architecture and its HPs, thereby improving performance. Symbolic DNN-Tuner improves BO applied to DNNs by adding an analysis of the results of the network on the training and validation sets. This analysis is performed by exploiting rule-based programming, in particular Probabilistic Logic Programming (PLP).

## Introduction

This work aims to create a system that drives the training of DNNs by automating the choice of HPs, analyzing the performance of each training run during the optimization process. This new system (Symbolic DNN-Tuner) extends BO [8, 6] with PLP [7] by mapping tricks usually used in manual HP optimization [5] into non-deterministic, probabilistic Symbolic Tuning Rules (STRs). These rules specify Tuning Actions (TAs), which edit the HP search space, add new HPs or update the network structure without human intervention. Each STR has a weight that determines the probability of applying its TA. At the beginning of the optimization process, the probabilistic weights of the STRs are set manually; they are then refined after each training run via Learning from Interpretations (LFI) [4] (an inference algorithm available in PLP), based on whether improvements were obtained for each TA applied in previous training runs.

The two main parts of Symbolic DNN-Tuner [2, 3] are the Neural Block, which manages all aspects regarding the network (training, validation, the HP search space and the application of the TAs), and the Symbolic Block, which, based on the network performance and the metrics computed after each training run, diagnoses problems and identifies the (most probable) TAs to be applied to the network architecture.

## Symbolic DNN-Tuner

As mentioned before, Symbolic DNN-Tuner is a system that drives the training of a DNN, analysing the performance of each training run through PLP and automating the choice of HPs to obtain a network with better performance. By analysing the metrics and the performance of the network, it is possible to identify many problems (e.g., overfitting, underfitting) that BO is not able to avoid, because BO works only with a single metric (validation loss or accuracy, training loss or accuracy). When Symbolic DNN-Tuner diagnoses these problems, it changes the search space of HP values or the structure of the network by applying TAs, driving the DNN to a better solution.

### Symbolic Block

The Symbolic Block of Symbolic DNN-Tuner is concerned with the analysis of metrics and performance of the trained network. This part is fully developed with ProbLog [1]. The two major contributions of this part are the symbolic analysis and LFI. This block analyses the network metrics like loss and accuracy during both training and validation and produces a diagnosis. From this diagnosis, exploiting the STRs, the block returns the TAs to be applied.

The Symbolic Block's ProbLog program is composed of three parts: Facts, Diagnosis and Tuning. The whole logic program is dynamically created as the union of these three parts at each iteration of the algorithm. The Facts part stores the metrics obtained from the Neural Block; the Diagnosis section contains the code for diagnosing problems in the DNN's behaviour; and the Tuning section is composed of the STRs. Through ProbLog inference, we can query this program to obtain the TAs. The extracted TAs are passed to the Neural Block and applied to the DNN structure or the HP search space.

Listing 1: STRs in the symbolic part of Symbolic DNN-Tuner
0.7::action(data_augment) :- problem(overfitting).
0.3::action(decr_lr) :- problem(underfitting).
0.8::action(inc_neurons) :- problem(underfitting).
0.4::action(new_conv_layer) :- problem(underfitting).

Listing 2: Learning From Interpretation part of Symbolic DNN-Tuner
% Program
t(0.5)::action(data_augment).
t(0.2)::action(decr_lr).
t(0.85)::action(inc_neurons).
t(0.3)::action(new_conv_layer).
- - - - - - - - - - - - - - - - - - - -
% Evidence
evidence(action(data_augment), True).
evidence(action(decr_lr), False).

Listing 1 shows an extract of the STRs in Symbolic DNN-Tuner. The values in the heads of the clauses are the probabilistic weights, and the action() predicate identifies the TAs. When a problem() is true, we take the TA with the highest probability, and the TAs (e.g., decr_lr if problem(underfitting) is true, i.e., decrementing the learning rate in case of underfitting) are applied to the network or the HP search space.

The probabilistic weights are learned from the experience gained in previous iterations. This experience becomes the set of training examples for the LFI program. This program (Listing 2) is composed of two parts: the program itself and the evidence obtained from the ImprovementChecker module. This module checks whether the last training run improved on the previously performed runs in terms of metrics. The LFI program is built dynamically at each iteration of the system. After each training run, Symbolic DNN-Tuner checks the improvement of the network; the improvement is a boolean value used to build the evidence. The aim is to reward with greater probability those TAs that have led to improvements. In this way, Symbolic DNN-Tuner learns which TAs worked better and consequently favours them over the others.
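In the fully observed setting used here, LFI for independent probabilistic facts reduces to relative frequencies (ProbLog's LFI uses an EM algorithm in general for partially observed interpretations). The following sketch, with a hypothetical `lfi_update` helper rather than the ProbLog implementation, illustrates the update applied to the tunable `t(_)::action(...)` facts of Listing 2.

```python
def lfi_update(examples):
    """Maximum-likelihood weights for tunable facts t(_)::action(A) when every
    training example fully observes each action's evidence: the learned weight
    is the fraction of examples in which the action's evidence was true."""
    counts, totals = {}, {}
    for evidence in examples:            # one evidence dict per past training run
        for action, was_true in evidence.items():
            counts[action] = counts.get(action, 0) + int(was_true)
            totals[action] = totals.get(action, 0) + 1
    return {a: counts[a] / totals[a] for a in counts}

# three past runs: data_augment helped twice, decr_lr helped once
weights = lfi_update([
    {"data_augment": True,  "decr_lr": False},
    {"data_augment": True,  "decr_lr": True},
    {"data_augment": False, "decr_lr": False},
])
```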

Now, with the ProbLog program updated with the newly computed probabilities for the STRs, we can query the program for the probabilities of a given atom. For clarity, we report an extract of the ProbLog code in Listing 3.

Listing 3: Extract of Logic section of Symbolic DNN-Tuner

% FACTS ----------------------------------------------------------
a([0.0529, 0.0710, 0.6266, 0.6298, ... ]).
va([0.0191, 0.0593, 0.1797, 0.2304, ... ]).
l([4.7763, 4.2189, 3.9604, 1.8257, ... ]).
vl([5.6702, 4.4710, 2.0003, 1.9812, ...]).
itacc(0.1062).
itloss(0.4125).

% DIAGNOSIS ------------------------------------------------------
abs2(X,Y) :- Y is abs(X).
isclose(X,Y,W) :- D is X - Y, abs2(D,D1), D1 =< W.
gap_tr_te_acc :- a(A), va(VA), last(A,LTA), last(VA,ScoreA),
    Res is LTA - ScoreA, abs2(Res,Res1), Res1 > 0.2.
...

% PROBLEMS -------------------------------------------------------
problem(overfitting) :- gap_tr_te_acc; gap_tr_te_loss.
problem(underfitting) :- high_loss; low_acc.

% TUNING ---------------------------------------------------------
action(reg_l2) :- problem(overfitting).
0.545454545454545::action(inc_dropout) :- problem(overfitting).
0.0::action(data_augment) :- problem(overfitting).
0.3::action(decr_lr) :- problem(underfitting).
...

% QUERY ----------------------------------------------------------
query(problem(_)).
query(action(_)).

In Listing 3 we can see the facts, the logic code for metrics analysis and problem identification, and the STRs, under the FACTS, DIAGNOSIS-PROBLEMS and TUNING sections respectively. At the end of the program we can see the queries.

The FACTS section contains the history of the accuracy and the loss during the training and validation phases (a([]), l([]), va([]) and vl([]) respectively). itacc() and itloss() are two thresholds used to diagnose underfitting.

The system was tested on three datasets: CIFAR10, CIFAR100 and the CIMA dataset. The first two are standard benchmarks, while the last, provided by the Italian company CIMA S.P.A., serves as an industrial, real-world case. It consists of 3200 training images and 640 validation images of size 256x128, divided into 16 classes; the images are facsimiles of Euro-like banknotes. The experiments showed that Symbolic DNN-Tuner performs better than standard Bayesian Optimization on these three datasets in terms of accuracy and loss, and it also provides an explanation of the possible reasons for network malfunctioning.

### References

1. Anton Dries, Angelika Kimmig, Wannes Meert, Joris Renkens, Guy Van den Broeck, Jonas Vlasselaer & Luc De Raedt (2015): Problog2: Probabilistic logic programming. In: Joint european conference on machine learning and knowledge discovery in databases. Springer, pp. 312–315, doi:10.1007/978-3-319-23461-8_37.
2. Michele Fraccaroli, Evelina Lamma & Fabrizio Riguzzi (2021): Symbolic DNN-Tuner. Machine Learning, pp. 1–26, doi:10.1007/s10994-021-06097-1.
3. Michele Fraccaroli, Evelina Lamma & Fabrizio Riguzzi (2022): Symbolic DNN-Tuner: A Python and ProbLog-based system for optimizing Deep Neural Networks hyperparameters. SoftwareX 17, pp. 100957, doi:10.1016/j.softx.2021.100957.
4. Bernd Gutmann, Ingo Thon & Luc De Raedt (2011): Learning the parameters of probabilistic logic programs from interpretations. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, pp. 581–596, doi:10.1007/978-3-642-23780-5_47.
5. Grégoire Montavon, Geneviève Orr & Klaus-Robert Müller (2012): Neural networks: tricks of the trade 7700. Springer, doi:10.1007/978-3-642-35289-8.
6. Martin Pelikan, David E Goldberg & Erick Cantú-Paz (1999): BOA: The Bayesian optimization algorithm. In: Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation-Volume 1. Morgan Kaufmann Publishers Inc., pp. 525–532. doi:10.1007/978-3-540-32373-0_
7. Fabrizio Riguzzi (2018): Foundations of Probabilistic Logic Programming. River Publishers. isbn:978-877022018-7
8. Jasper Snoek, Hugo Larochelle & Ryan P Adams (2012): Practical bayesian optimization of machine learning algorithms. In: Advances in neural information processing systems, pp. 2951–2959. doi:10.48550/arXiv.1206.2944.

# Treewidth-aware Reductions of Normal ASP to SAT — Is Normal ASP Harder than SAT after All? (Extended Abstract)

Markus Hecher (TU Wien)

Answer Set Programming (ASP) is a widely used problem modeling and solving paradigm for knowledge representation (KR). There are plenty of results dedicated to studying the hardness of (fragments of) ASP. So far, these studies have resulted in characterizations in terms of computational complexity as well as fine-grained insights in the form of dichotomy-style results, lower bounds for translations to other formalisms like Boolean satisfiability (SAT), and even detailed parameterized complexity landscapes. A quite generic and prominent parameter in parameterized complexity, originating from graph theory, is the so-called treewidth, which in a sense captures the structural density of a program. Recently, the number of treewidth-based solvers related to SAT has increased. While several translations from (normal) ASP to SAT exist, there is so far no reduction that preserves treewidth or is at least aware of the treewidth increase. This paper deals with a novel reduction from normal ASP to SAT that is aware of the treewidth and guarantees that a slight increase of treewidth is indeed sufficient. We then also establish that, when considering treewidth, already the fragment of normal ASP is slightly harder than SAT, which renders the treewidth increase of our reduction unavoidable.

### 1 Introduction

Answer Set Programming (ASP) is an active research area of knowledge representation and reasoning. ASP provides a declarative modeling language and problem solving framework for hard computational problems, which has been widely applied. In ASP, questions are encoded into rules and constraints that form a program (over atoms), whose solutions are called answer sets. In terms of computational complexity, the consistency problem of deciding the existence of an answer set is well-studied: the problem is ΣP2-complete, i.e., complete for the second level of the polynomial hierarchy. Some fragments of ASP have lower complexity, though. A prominent example is the class of normal programs, whose consistency problem is NP-complete.
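As a small illustration (our own sketch, not taken from the abstract), the stable-model semantics behind answer sets can be checked by brute force: a candidate atom set M is an answer set exactly when M equals the least model of the Gelfond-Lifschitz reduct of the program with respect to M.

```python
from itertools import chain, combinations

# A normal rule "head :- body_pos, not body_neg." is represented
# as a triple (head, frozenset(pos_atoms), frozenset(neg_atoms)).

def least_model(definite_rules):
    """Least model of a negation-free program via fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model

def answer_sets(atoms, rules):
    """Enumerate answer sets naively: M is an answer set iff M is the
    least model of the Gelfond-Lifschitz reduct of the program w.r.t. M."""
    result = []
    candidates = chain.from_iterable(
        combinations(sorted(atoms), r) for r in range(len(atoms) + 1))
    for candidate in candidates:
        m = set(candidate)
        # Reduct: drop rules blocked by M, strip negative literals.
        reduct = [(h, p) for (h, p, n) in rules if not (n & m)]
        if least_model(reduct) == m:
            result.append(m)
    return result

# Classic example:  a :- not b.   b :- not a.
rules = [("a", frozenset(), frozenset({"b"})),
         ("b", frozenset(), frozenset({"a"}))]
print(answer_sets({"a", "b"}, rules))  # [{'a'}, {'b'}]
```

Real systems, of course, avoid this exponential enumeration; the sketch only makes the semantics underlying the consistency problem concrete.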

There is also a wide range of more fine-grained studies for ASP, also in parameterized complexity, where certain (combinations of) parameters are taken into account. In parameterized complexity, the “hardness” of a problem is classified according to the impact of a parameter on solving the problem. There, one often distinguishes the runtime dependency on the parameter, e.g., the level of exponentiality [10] in the parameter required for solving the problem. Concretely, under the Exponential Time Hypothesis (ETH) [6], Boolean satisfiability (SAT) is single exponential in the structural parameter treewidth, whereas evaluating Quantified Boolean formulas (of quantifier depth two) is double exponential [4].

For ASP there is growing research on treewidth, e.g., [3]. The algorithms of these works exploit structural restrictions (in the form of treewidth) of a given program, and often run in polynomial time in the program size, while being exponential only in the treewidth. Intuitively, treewidth gives rise to a tree decomposition, which allows solving numerous NP-hard problems in parts, cf. divide-and-conquer, and indicates the maximum number of variables one has to investigate in such parts during evaluation. There have also been competitions and notable progress in SAT as well as on problems beyond SAT.
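To make the notion concrete (this small checker is our own illustration, not part of the paper): a tree decomposition assigns a bag of vertices to each node of a tree such that every vertex and every edge of the graph is covered by some bag, and the bags containing any fixed vertex form a connected subtree; the width is the largest bag size minus one.

```python
def is_tree_decomposition(graph_edges, bags, tree_edges):
    """Check the three tree-decomposition properties for a graph given as
    an edge list, bags as {tree_node: set_of_vertices}, and tree_edges
    connecting the bag nodes (assumed to form a tree)."""
    vertices = {v for e in graph_edges for v in e}
    # (1) Every vertex occurs in some bag.
    if vertices - set().union(*bags.values()):
        return False
    # (2) Every edge is covered by some bag.
    for u, v in graph_edges:
        if not any({u, v} <= bag for bag in bags.values()):
            return False
    # (3) The bags containing any vertex induce a connected subtree.
    for v in vertices:
        containing = {t for t, bag in bags.items() if v in bag}
        seen = {next(iter(containing))}
        frontier = list(seen)
        while frontier:  # BFS over tree edges restricted to 'containing'
            t = frontier.pop()
            for a, b in tree_edges:
                for nxt in ((b,) if a == t else (a,) if b == t else ()):
                    if nxt in containing and nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
        if seen != containing:
            return False
    return True

def width(bags):
    return max(len(b) for b in bags.values()) - 1

# The 4-cycle a-b-c-d-a has treewidth 2; two bags of size 3 suffice:
graph = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
bags = {1: {"a", "b", "c"}, 2: {"a", "c", "d"}}
print(is_tree_decomposition(graph, bags, [(1, 2)]), width(bags))  # True 2
```

Dynamic-programming algorithms such as those cited above traverse such a decomposition bottom-up, solving the problem locally per bag.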

Naturally, there are numerous reductions for ASP, e.g., [7, 9], and extensions thereof to SAT. These reductions have been investigated in terms of the resulting formula size and the number of auxiliary variables. Existing reductions cause only sub-quadratic blow-up in the number of variables, which is unavoidable [8] if the answer sets are to be preserved (bijectively). If one considers the structural dependency in the form of treewidth, however, existing reductions may cause quadratic or even unbounded overhead in the treewidth. On the contrary, we present a novel reduction for normal programs that increases the treewidth k at most sub-quadratically, to k·log(k). This is indeed interesting, as there is a close connection [1] between resolution width and treewidth, resulting in efficient SAT solver runs on instances of small treewidth. As a result, our reduction could improve solving approaches by means of SAT solvers, e.g., [9]. Then, we establish lower bounds under the ETH for exploiting treewidth for consistency of normal programs. This renders normal ASP “harder” than SAT. At the same time, we prove that one cannot significantly improve the reduction, i.e., avoid the sub-quadratic increase of treewidth.

### 2 Treewidth-Aware Reductions

First, we present the idea of treewidth-aware reductions, where we reduce normal ASP to SAT. This reduction takes an arbitrary normal program Π of treewidth k and constructs a Boolean formula whose treewidth is bounded by O(k·log(k)), thereby being aware of the treewidth increase. This is achieved by guiding the reduction along a structural representation of treewidth, namely a tree decomposition (TD) of Π. Intuitively, every TD of Π has a certain width, and the treewidth is the smallest width among all potential TDs of Π. Consequently, our treewidth-aware reduction takes a normal program Π and is guided along any TD T of Π of width k, which itself yields a TD, functionally depending on T, of the resulting Boolean formula of width O(k·log(k)). The reduction takes up the existing idea of level mappings [7, 9], which, however, needs to be applied to a local part of the program, as indicated by the TD T, in order to ensure treewidth-awareness. This application of level mappings [7, 9] that are not applied globally, i.e., for every atom at once, ensures that the treewidth of the resulting Boolean formula does not increase arbitrarily. However, the shift from global to local level mappings comes at the cost of losing a one-to-one correspondence between answer sets of Π and satisfying assignments of the resulting formula. Still, the construction yields the following novel result.

Theorem 1 (Treewidth-Awareness). There is a reduction from any normal program Π of treewidth k to a SAT formula F such that Π is consistent if and only if F is satisfiable. This reduction only slightly increases the treewidth, such that the treewidth of F is at most O(k·log(k)).

Observe that this seems to be in contrast to classical complexity, where a sub-quadratic overhead is only unavoidable when bijectively capturing the answer sets via satisfying assignments of a Boolean formula, cf. [8]. In an extended version [5] of this work, we provide an implementation of the treewidth-aware reduction and outline its usability, which we also verify empirically on applications.

Towards Novel Hardness Insights. The idea of treewidth-aware reductions can be even applied in a more general setting, which is then the key to finally establish the following main result of the paper.

Theorem 2 (Lower bound). Given an arbitrary normal program Π of treewidth k. Then, unless the ETH fails, the consistency of Π cannot be solved in time 2^(o(k·log(k))) · poly(|at(Π)|).

Proof (Idea). The idea is to define a treewidth-aware reduction that constructs a normal program by reducing from the so-called disjoint paths problem. This problem considers a directed graph and aims at finding disjoint paths between a list of source vertices and a list of destination vertices such that every source reaches its corresponding destination. Disjoint paths is known to be slightly superexponential [10], i.e., it fits the desired runtime. However, the challenge in making this reduction work is to tightly guide the construction along a TD of the graph such that the treewidth is guaranteed to be preserved linearly. Interestingly, already modeling, e.g., “modes” of reachability for every source/destination pair would cause a quadratic increase of treewidth. Our approach links source/destination pairs, guided by the TD, thereby solving reachability problems without causing a non-linear treewidth increase. □

Observe that Theorem 2 also shows that the reduction of Theorem 1 can probably not be significantly improved. Recently, this lower bound result and its insights led to the development of a family of program classes that are between normal programs and tight programs [2], reflecting the hardness of this lower bound. Each program class of this family is characterized by its “distance” to a tight program.

### 3 Conclusion

Studying and determining the hardness of ASP, and the underlying reasons for it, has attracted the attention of the community for a long time. This work discusses the question from a different angle, which will hopefully provide new insights into the hardness of ASP and foster follow-up work. The results indicate that, at least from a structural point of view, deciding the consistency of ASP is harder than SAT, since ASP programs may compactly represent structural dependencies within the formalism. We hope this work reopens the quest to study treewidth for ASP solving, similarly to SAT [1].

### References

[1]   Albert Atserias, Johannes K. Fichte & Marc Thurley (2011): Clause-Learning Algorithms with Many Restarts and Bounded-Width Resolution. J. Artif. Intell. Res. 40, pp. 353–373, doi:10.1613/jair.3152.

[2]   Jorge Fandinno & Markus Hecher (2021): Treewidth-Aware Complexity in ASP: Not all Positive Cycles are Equally Hard. In: AAAI’21, AAAI Press, pp. 6312–6320.

[3]   Johannes K. Fichte & Markus Hecher (2019): Treewidth and Counting Projected Answer Sets. In: LPNMR’19, LNCS 11481, Springer, pp. 105–119.

[4]   Johannes K. Fichte, Markus Hecher & Andreas Pfandler (2020): Lower Bounds for QBFs of Bounded Treewidth. In: LICS’20, Assoc. Comput. Mach., pp. 410–424, doi:10.1145/3373718.3394756.

[5]   Markus Hecher (2022): Treewidth-aware reductions of normal ASP to SAT - Is normal ASP harder than SAT after all? Artif. Intell. 304, p. 103651, doi:10.1016/j.artint.2021.103651.

[6]   Russell Impagliazzo, Ramamohan Paturi & Francis Zane (2001): Which Problems Have Strongly Exponential Complexity? J. of Computer and System Sciences 63(4), pp. 512–530, doi:10.1006/jcss.2001.1774.

[7]   Tomi Janhunen (2006): Some (in)translatability results for normal logic programs and propositional theories. J. of Applied Non-Classical Logics 16(1-2), pp. 35–86, doi:10.3166/jancl.16.35-86.

[8]   Vladimir Lifschitz & Alexander A. Razborov (2006): Why are there so many loop formulas? ACM Trans. Comput. Log. 7(2), pp. 261–268, doi:10.1145/1131313.1131316.

[9]   Fangzhen Lin & Jicheng Zhao (2003): On tight logic programs and yet another translation from normal logic programs to propositional logic. In: IJCAI’03, Morgan Kaufmann, pp. 853–858.

[10]   Daniel Lokshtanov, Daniel Marx & Saket Saurabh (2018): Slightly Superexponential Parameterized Problems. SIAM J. Comput. 47(3), pp. 675–702, doi:10.1137/16M1104834.

# Applications of Answer Set Programming to Smart Devices and Large Scale Reasoning (Extended Abstract)*

Kristian Reale (Department of Mathematics and Computer Science, University of Calabria, Italy - DLVSystem L.T.D., Via della Resistenza 19/C, Rende, Italy)
Francesco Calimeri (Department of Mathematics and Computer Science, University of Calabria, Italy - DLVSystem L.T.D., Via della Resistenza 19/C, Rende, Italy)
Nicola Leone (Department of Mathematics and Computer Science, University of Calabria, Italy)
Simona Perri (Department of Mathematics and Computer Science, University of Calabria, Italy)
Francesco Ricca (Department of Mathematics and Computer Science, University of Calabria, Italy)

### 1. Introduction

Over the last years, the development by the scientific community of robust and efficient systems supporting Answer Set Programming (ASP) [8, 2, 3], an expressive [6] logic programming paradigm proposed in the area of non-monotonic reasoning, has fostered the development of a significant number of applications in several contexts (e.g., Artificial Intelligence, Information Integration, Knowledge Management, Healthcare, Workforce Management, Diagnosis, Workflows, Optimization, and more; we refer the reader to the original article [12] for references). With the growth of the use of ASP in industry, effective development tools have also been introduced, capable of supporting programmers, knowledge engineers, and organizations in managing complex projects in real-world domains [7, 9, 5, 4, 11]. Nevertheless, practical application scenarios have been constantly evolving: we have observed, in fact, the availability of computing devices that are getting smaller and smaller (e.g., smartphones, smart devices, Raspberry Pi boards, etc.) along with the production and availability of heterogeneous data that are getting bigger and bigger.

In this paper, we present DLV Large Scale (DLV-LS), a novel advanced platform for the development of ASP-based applications performing declarative reasoning tasks over data-intensive applications, possibly on smart devices. The platform heavily relies on DLV [10, 1], one of the most widespread ASP systems in industry, which has been engineered by the DLVSystem company and extended to obtain DLV-LS. The framework encompasses DLV Mobile Edition (DLV-ME), an ASP-based solver for Android systems and Raspberry Pi devices, and DLV Enterprise Edition (DLV-EE), an ASP-based platform, accessible via REST interfaces, for large-scale reasoning over Big Data, classical relational database systems, and NoSQL databases. DLV-LS enables smart devices both to locally perform reasoning over data generated by their own sensors and to properly interact with DLV-EE when more computational power is needed for harder tasks, possibly over bigger centralized data. The resulting DLV-LS system was also equipped with an Integrated Development Environment relying on an extension of ASPIDE [7], the most comprehensive IDE for ASP, easing the development of ASP solutions for DLV-LS.