Published: 17th September 2021 DOI: 10.4204/EPTCS.345 ISSN: 2075-2180
This volume contains the Technical Communications and the Doctoral Consortium papers of the 37th International Conference on Logic Programming (ICLP 2021), held online from September 20 to 27, 2021. The Association for Logic Programming (ALP) Executive Committee decided to hold ICLP 2021 as a fully virtual event because of the continuing coronavirus (COVID-19) pandemic.
Contributions were sought from all areas of logic programming, including but not restricted to:
Besides the main track, ICLP 2021 hosted two additional tracks:
Applications Track: This track invited submissions of papers on emerging and deployed applications of LP, describing all aspects of the development, deployment, and evaluation of logic programming systems to solve real-world problems, including interesting case studies and benchmarks, and discussing lessons learned.
Recently Published Research Track: This track provided a forum to discuss important results related to logic programming that appeared (from January 2019 onwards) in selective journals and conferences but had not previously been presented at ICLP.
These tracks had their own track chairs, submission requirements, and evaluation criteria.
ICLP 2021 also hosted:
MentorLP - Mentoring Workshop on Logic Programming: The purpose of MentorLP was to support students and newcomers in pursuing careers in logic programming research. The workshop held technical sessions on cutting-edge research in logic programming and mentoring sessions on how to prepare for and succeed in a research career. Leaders in logic programming research from academia and industry gave talks on their research areas, and live discussions among participants addressed how to overcome challenges and make contributions to the research community. MentorLP was dedicated to fostering and supporting diversity, equity, and inclusion; members of underrepresented groups were especially encouraged to attend.
Autumn School on Logic and Constraint Programming: The school was designed for those interested in learning advanced topics in logic programming and constraint programming. It consisted of a series of half-day tutorials. The school hosted tutorials given by Daniela Inclezan (Miami University), Mirek Truszczyński (University of Kentucky), Agostino Dovier (University of Udine), and Alessandra Russo & Mark Law (Imperial College).
Doctoral Consortium: The Doctoral Consortium (DC) on Logic Programming provided students with the opportunity to present and discuss their research directions, and to obtain feedback from both peers and experts in the field.
Logic and Constraint Programming Contest: The contest combined features of the Prolog programming contest, the Answer Set Programming contest, and the Constraint Programming contest. As in the previous edition, participants were not limited to using a single system and could also combine declarative and imperative programming languages.
The ICLP 2021 program also included several invited talks and panel sessions.
The main conference invited speakers were:
The panel Datalog Perspectives was chaired by David S. Warren (Stony Brook University and XSB, Inc.). The panelists were David Maier (Portland State University), Moshe Vardi (Rice University), and Jeffrey Ullman (Stanford University). The panel was originally planned to take place immediately after Ullman's invited talk, but Ullman shared his invited-talk slot with Maier and Vardi before the panel discussion. Maier's talk was titled The Downs and Ups of Datalog, and Vardi's talk was titled Datalog++ and Datalog--.
The panel No Logic is an Island: Internal and External Integration of Logic Programming Paradigms was chaired by Joost Vennekens (KU Leuven). The panelists were Marc Denecker (KU Leuven), Manuel Hermenegildo (IMDEA and Universidad Politécnica de Madrid), Francesco Ricca (University of Calabria), Theresa Swift (Universidade Nova de Lisboa), and Jan Wielemaker (VU University of Amsterdam).
The organizers of ICLP 2021 were:
We are deeply indebted to the Program Committee members and external reviewers. The conference would not have been possible without their dedicated, enthusiastic, and outstanding work.
The Program Committee members of ICLP 2021 were:
Salvador Abreu | Giovambattista Ianni | Orkunt Sabuncu |
Mario Alviano | Katsumi Inoue | Chiaki Sakama |
Marcello Balduccini | Tomi Janhunen | Vitor Santos Costa |
Roman Barták | Michael Kifer | Torsten Schaub |
Pedro Cabalar | Ekaterina Komendantskaya | Konstantin Schekotihin |
Manuel Carro | Nicola Leone | Tom Schrijvers |
Stefania Costantini | Michael Leuschel | Tran Cao Son |
Marina De Vos | Ondřej Lhoták | Theresa Swift |
Agostino Dovier | Vladimir Lifschitz | Paul Tarau |
Inês Dutra | Fangzhen Lin | Tuncay Tekle |
Thomas Eiter | Francesca Alessandra Lisi | Michael Thielscher |
Esra Erdem | Jorge Lobo | Mirek Truszczyński |
Wolfgang Faber | Viviana Mascardi | Allen van Gelder |
Sarah Alice Gaggl | Thomas Meyer | German Vidal |
Marco Gavanelli | Jose F. Morales | Alicia Villanueva |
Martin Gebser | Manuel Ojeda-Aciego | Toby Walsh |
Michael Gelfond | Carlos Olarte | Antonius Weinzierl |
Laura Giordano | Magdalena Ortiz | Jan Wielemaker |
Gopal Gupta | Mauricio Osorio | Stefan Woltran |
Michael Hanus | Enrico Pontelli | Roland Yap |
Manuel Hermenegildo | Francesco Ricca | Jia-Huai You |
The Program Committee members of the Applications track were:
Mutsunori Banbara | Matti Järvisalo | Daniele Theseider Dupré |
François Bry | Nikos Katzouris | Paolo Torroni |
Francesco Calimeri | Zeynep Kiziltan | Kewen Wang |
Angelos Charalambidis | Marco Maratea | David Warren |
Ferdinando Fioretto | Yunsong Meng | Fangkai Yang |
Gerhard Friedrich | Alessandra Mileo | Zhizheng Zhang |
Jianmin Ji | Mohan Sridharan |
The external reviewers were:
Sotiris Batsakis | Spencer Killen | Etienne Tignon |
Michael Bernreiter | Bruno Pereira | Huaduo Wang |
Francois Bry-Hausser | Jacques Robin | Philipp Wanko |
Jose-Luis Carballido | Elmer Salazar | Jessica Zangari |
Carmine Dodaro | Zeynep G. Saribatur | Claudia Zepeda Cortes |
Biqing Fang | Mantas Simkus |
The 17th Doctoral Consortium (DC) on Logic Programming was held in conjunction with ICLP 2021. It attracted Ph.D. students in the area of Logic Programming Languages from different backgrounds (e.g., theory, implementation, application) and encouraged constructive and fruitful advising.
Topics included:
theoretical foundations of logic and constraint logic programming,
sequential and parallel implementation technologies,
static and dynamic analysis,
abstract interpretation,
compilation technology,
verification,
logic-based paradigms (e.g., answer set programming, concurrent logic
programming, inductive logic programming),
and innovative applications of logic programming.
This year the DC accepted 9 papers in the areas mentioned above (the author of one of them opted not to include an abstract in this volume). We warmly thank all student authors, supervisors, reviewers, co-chairs, and Program Committee members of the DC, as well as the organizing team, for making the DC a great success.
The Program Committee members of the Doctoral Consortium were:
Johannes K. Fichte | Daniela Inclezan | Yi Wang |
Fabio Fioravanti | Zeynep G. Saribatur | Jessica Zangari |
Gregory Gelfond | Frank Valencia | Matthias van der Hallen |
Mario Alviano | Marco Maratea | Elena Mastria |
We would also like to express our gratitude to everyone who helped with the organization of ICLP 2021, especially our General Chair, Ricardo Rocha, and our Organizing and Publicity Chair, Miguel Areias. Our gratitude extends to the President of ALP, Thomas Eiter, to the ALP Conference Coordinator, Marco Gavanelli, and to all members of the ALP Board. Special thanks go to Mirek Truszczyński, Editor-in-Chief of TPLP, and to the staff at Cambridge University Press for their assistance. We would also like to thank Rob van Glabbeek, Editor-in-Chief of EPTCS, for his prompt support of the Program Chairs. Finally, we thank each author of every submitted paper, because their efforts keep the conference alive, and we thank all participants of ICLP 2021 for bringing and sharing their ideas and latest developments.
Bart Bogaerts, Alex Brik, Veronica Dahl, Carmine Dodaro, Paul Fodor, Andrea Formisano, Yanhong Annie Liu, Gian Luca Pozzato, Joost Vennekens, and Neng-Fa Zhou (Eds.)
Neural language models, which can be pretrained on very large corpora, turn out to "know" a lot about the world, in the sense that they can be trained to answer questions surprisingly reliably. However, "language models as knowledge graphs" have many disadvantages: for example, they cannot be easily updated when information changes. I will describe recent work in my team and elsewhere on incorporating symbolic knowledge into language models and question-answering systems, and also comment on some of the remaining challenges associated with integrating symbolic reasoning and neural NLP.
Quantified modal logic provides a useful framework for precise formulation of ethical principles. This talk begins by briefly summarizing how generalization, utilitarian, and autonomy principles can be deontologically derived from the logical structure of action. It then shows how these principles can be expressed in a logical idiom suitable for use in a rule-based AI system. In particular, it indicates how quantified modal logic can enable value alignment in machine learning to integrate ethical reasoning with empirical observation.
Portions of the talk are based on joint work with Thomas Donaldson and Tae Wan Kim.
Since the early days of the relational data model, rule-based languages have been used to express semantic restrictions that the data of interest ought to obey. In the past two decades, rule-based languages have also been used as specification languages in data exchange, in data integration, and in ontology-mediated data access. Tuple-generating dependencies (tgds) form one of the most widely used and extensively studied such rule-based languages. Tgds are also known as existential rules; they include Datalog rules as a special case and possess numerous desirable structural properties. The aim of this talk is to present an overview of results, both old and new, that characterize finite sets of tgds in terms of their structural properties. These results include characterizations of finite sets of arbitrary tgds, as well as characterizations of restricted types of tgds, including full tgds, linear tgds, guarded tgds, and source-to-target tgds.
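For readers unfamiliar with the terminology, a tgd asserts that whenever a conjunction of atoms holds, another (possibly existentially quantified) conjunction must hold. The restricted classes mentioned above can be illustrated as follows (the relation names $E$ and $S$ are our own, not drawn from the talk):

```latex
% Generic tgd: a conjunctive body implies an existentially quantified head.
\forall \bar{x}\,\forall \bar{y}\,\bigl(\varphi(\bar{x},\bar{y}) \rightarrow \exists \bar{z}\;\psi(\bar{x},\bar{z})\bigr)

% Full tgd (no existential variables in the head):
\forall x\,\forall y\,\bigl(E(x,y) \wedge E(y,x) \rightarrow S(x,y)\bigr)

% Linear tgd (a single body atom):
\forall x\,\forall y\,\bigl(E(x,y) \rightarrow \exists z\; E(y,z)\bigr)

% Guarded tgd (one body atom, the guard, contains all universal variables):
\forall x\,\forall y\,\bigl(E(x,y) \wedge S(x,x) \rightarrow \exists z\; S(y,z)\bigr)
```

Datalog rules arise as the special case of full tgds with a single head atom.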
Probabilistic programming languages (PPLs) are expressive formal languages for writing probability models. One branch derives expressive power from traditional programming languages; the other from first-order logic. This talk covers Bayesian logic (BLOG), a language that facilitates specification of probability distributions over first-order structures. I will describe the language mainly through examples and cover its application to monitoring the Comprehensive Nuclear-Test-Ban Treaty. PPLs have many advantages over deep learning systems and may offer an alternative route to the construction of general-purpose AI systems.
Datalog was far from the first brand of logic to be thought of as a database query language. Other approaches were far more powerful. But this weakness was in fact its strength: a logic as limited as Datalog makes optimization feasible, and for any declarative query language, the ability to select an efficient implementation from among the many possible ways to execute a query is essential.
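The feasibility claim can be made concrete with a toy bottom-up evaluator (our illustration, not from the talk): the classic two-rule Datalog program for transitive closure is evaluated by naive fixpoint iteration, which terminates in time polynomial in the size of the database.

```python
# Naive bottom-up evaluation of a tiny Datalog program:
#   path(X, Y) :- edge(X, Y).
#   path(X, Z) :- edge(X, Y), path(Y, Z).
# Facts are tuples; rules are applied repeatedly until a fixpoint.

def transitive_closure(edges):
    """Return the set of path/2 facts derivable from the edge/2 facts."""
    path = set(edges)                      # first rule: every edge is a path
    while True:
        # second rule: join edge(X, Y) with path(Y, Z)
        new = {(x, z) for (x, y) in edges for (y2, z) in path if y == y2}
        if new <= path:                    # fixpoint reached: nothing new
            return path
        path |= new

edges = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(edges)))
# -> [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Real Datalog engines improve on this with semi-naive evaluation (joining only with facts derived in the previous round) and with query-driven optimizations such as magic sets, but the guaranteed termination of the naive loop is what makes such optimization feasible in the first place.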
Datalog was identified as a language and named about 35 years ago now. It has gone through phases of popularity and unpopularity. It has spawned (at least) two distinct, quite different, intuitive and formal semantics. It has stimulated much research. In this session we hear from several researchers, who did early and foundational research on Datalog, to get their perspectives on Datalog: what it is, where it's been, and where it's going.
Discussion questions:
After a burst of interest in the late 1980s and early 1990s, activity around Datalog and other logic languages declined. However, Datalog has garnered renewed attention in the last decade, often for uses beyond database querying. I will explore possible reasons for the drop-off in interest, cover a range of current application areas for Datalog, and speculate on the reasons for resurgence.
One argument in favor of Datalog is that it constitutes a sweet point in terms of expressiveness and feasibility of evaluation and optimization. I will argue that instead of thinking of Datalog as a single query language, we should consider it as a family of query languages. For some applications, such as universal relations and data exchange, we need to enhance Datalog's expressive power, hence Datalog++. For other applications, where we need query equivalence to be decidable, we need to limit Datalog's expressiveness, hence Datalog−−.
Over the course of its rich history, the area of Logic Programming has spawned many different logics, systems and paradigms. In this panel, we ask two questions:
The MentorLP Workshop is designed to broaden the exposure of students to research and career opportunities in logic programming. The workshop program included talks covering the history and current practice of logic programming, mentoring sessions covering effective habits for navigating the research landscape, and social sessions creating opportunities for students and early-career researchers to interact with established researchers in the field.
Topics:
A common representational technique in Answer Set Programming [3] is the use of auxiliary atoms. They may be introduced in a program for many different reasons: to obtain a simpler reading, to provide new constructions (choice rules, aggregates, transitive closure, etc.), or to reduce the corresponding ground program. When a program (or program fragment) $\Pi$ for signature $AT$ uses auxiliary atoms $A \subseteq AT$, these atoms have no relevant meaning outside $\Pi$. Accordingly, they are usually removed from the final stable models, so that the latter only use atoms in $V=AT \setminus A$, that is, the relevant or public vocabulary that encodes the solutions to the problem at hand. Thus, seen from outside, $\Pi$ becomes a black box that hides the internal atoms in $A$ and provides solutions in terms of the public atoms in $V$.
A reasonable question is whether we can turn these black boxes into white boxes, that is, whether we can reformulate a program $\Pi$ exclusively in terms of the public atoms $V$, forgetting the auxiliary ones in $A$. A forgetting operator $f(\Pi,A)=\Pi'$ transforms a logic program $\Pi$ into a new program $\Pi'$ that does not contain atoms in $A$ but behaves similarly on the public atoms $V$. When removing auxiliary atoms, the reasonable way to fix that similarity is to preserve the same knowledge for the public atoms in $V$, and this can be formalised as a very specific property: both $\Pi$ and $\Pi'=f(\Pi,A)$ should not only produce the same stable models (projected onto $V$) but keep doing so even if we add a new piece of program $\Delta$ without atoms in $A$. This property is known as strong persistence and was introduced in [7]. Later on, [6] provided a semantic condition on the models of $\Pi$ in the logic of Here-and-There (HT), called $\Omega$, such that the atoms $A$ are forgettable in $\Pi$ precisely when $\Omega$ does not hold. When this is the case, their approach can be used to construct $f(\Pi,A)$ from the HT models. Going one step further in this model-based orientation for forgetting, [1] overcame the limitation of unforgettable sets of atoms at the price of introducing a new type of disjunction, called fork.
Semantic-based forgetting is useful when we are interested in obtaining a compact representation. However, this comes at a high computational cost (similar to Boolean function minimisation techniques). When combined with the $\Omega$-condition or, similarly, with the use of HT-denotations, this method becomes practically infeasible without the use of a computer. This may become a problem, for instance, when we try to prove properties of some new use of auxiliary atoms in a given setting, since one would expect a human-readable proof rather than resorting to a computer-based exhaustive exploration of models. On the other hand, semantic forgetting may easily produce results that look substantially different from the original program, even when this is not necessary.
An alternative and complementary orientation for forgetting is the use of \emph{syntactic transformations}. Recently, [2] presented a general syntactic operator $f_{sp}$ for a single atom $A=\{a\}$ that can be applied to any arbitrary logic program and satisfies strong persistence when the atom is forgettable (i.e. the $\Omega$ condition does not hold). Moreover, Berthold et al. provided three syntactic sufficient conditions (what they call $a$-forgettable) under which $\Omega$ does not hold and, hence, under which $f_{sp}$ is strongly persistent. Perhaps the main difficulty of $f_{sp}$ comes from its complex definition: it involves 10 different types of rule-matching that further deal with multiple partitions of $\Pi$ (using a construction called as-dual). As a result, even though it offers full generality when the atom is forgettable, its application by hand does not seem too practical, requiring many steps and a careful reading of the transformations. In some particular syntactic fragments, simpler transformations can do the job and are much easier to use. This is the case not only of the operator $f_{as}$, but also of $f_{su}$ in [5], defined for stratified logic programs and used as a basis for strong persistence with respect to uniform equivalence, that is, when the context $\Delta$ is a set of facts for atoms in $V$.
In this paper, we introduce another syntactic operator for forgetting a single atom, $f_{es}$, based on the external support construction used for loop formulas in [4]. This new operator lies, in a sense, between $f_{su}$ and $f_{sp}$. On the one hand, as with $f_{su}$, it only guarantees strong persistence for a syntactic class of programs, and may fail to do so outside that class, even when the atom is forgettable. On the other hand, it is more general than $f_{su}$ and closer to $f_{sp}$ in the sense that it covers some disjunctive programs, under the condition that the forgotten atom $a$ is not included in its own external support (in other words, it is not used in choice rules). This is, in fact, one of the three conditions for $a$-forgettable atoms defined in [2]. As we discuss later, the new $f_{es}$ operator is less general than $f_{sp}$ even under that condition. However, our purpose is to provide one more tool for syntactic forgetting which, we claim, has a simple definition and can be useful when the preconditions for its application are fulfilled.
In what follows, we assume some familiarity with logic programming syntax and answer set semantics. We also apply basic strong equivalence results based on the logic of Here-and-There (HT); see [8] for more details. Let $\Pi$ be an extended logic program for signature $AT$ and $a \in AT$. We define the subprogram $\Pi^a \subseteq \Pi$ as the set of rules: $$\Pi^a = \{ r \in \Pi \mid a \in Head(r) \setminus (Body^+(r) \cup Body^-(r))\}$$ Intuitively, these are the rules that may provide support for $a$. Note that any rule with $a \in Head(r) \cap Body^+(r)$ is tautological and, for this reason, can be disregarded. On the other hand, if $a \in Head(r) \cap Body^-(r)$, we can actually remove $a$ from the head of $r$, since the rule does not provide any real support for $a$. Moreover, all head occurrences of $a$ in $\Pi \setminus \Pi^a$ can be removed under strong equivalence, as we explain next. Given any program $\Pi$, let us define the syntactic transformation $behead^a(\Pi)$ as the result of removing all rules with $a \in Head(r) \cap Body^+(r)$ and all head occurrences of $a$ from rules where $a \in Head(r) \cap Body^-(r)$.
Proposition 1. For any logic program $\Pi$: $\Pi \equiv_s behead^a(\Pi) = behead^a(\Pi \setminus \Pi^a) \cup \Pi^a$.
We define the external support of $a \in AT$ w.r.t. program $\Pi$ as the formula: $$ES_\Pi(a) := \bigvee_{r \in \Pi^a} \Bigl( Body(r) \wedge \neg \bigl(Head(r) \setminus\{a\}\bigr)^\vee \Bigr)$$
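As a toy illustration of this definition (our own example, not from the paper), consider $\Pi = \{\, a \vee p \leftarrow q;\;\; a \leftarrow not\ b \,\}$. Both rules belong to $\Pi^a$; for the second rule the residual head $Head(r)\setminus\{a\}$ is empty, so its disjunction $(\emptyset)^\vee$ amounts to $\bot$ and the conjunct $\neg\bot$ can be dropped:

```latex
ES_\Pi(a) \;=\; \bigl(q \wedge \neg p\bigr) \;\vee\; \bigl(\neg b \wedge \neg\bot\bigr)
         \;\equiv\; (q \wedge \neg p) \vee \neg b
```

Intuitively, $ES_\Pi(a)$ collects the conditions under which some rule supports $a$ without that support being absorbed by another head atom.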
Definition (Operator $f_{es}$) Let $\Pi$ be an extended logic program for alphabet $AT$ and let $a \in AT$. Then $f_{es}(\Pi,a)$ is defined as the result of:
Corollary 1. If $\Pi$ is a stratified program (possibly with double negation), $f_{es}$ is a strongly persistent forgetting operator.
Theorem 2. Let $\Pi$ be an extended, disjunctive logic program for alphabet $AT$ and $a\in AT$. If $ES_\Pi(a)$ does not contain occurrences of $a$, and we have no pair of edges $(a,x)^+$ and $(y,a)^+$ in the dependence graph $G_\Pi$, then $f_{es}(\Pi,a)$ is a forgetting operator, and satisfies strong persistence.
Over the last decades, the development of ASP has brought about an expressive modeling language powered by highly performant systems. At the same time, it has become more and more difficult to provide semantic underpinnings capturing the resulting constructs and inferences. This is even more severe for hybrid ASP languages and systems, which are often needed to handle real-world applications. We address this challenge and introduce the concept of abstract and structured theories that allows us to formally elaborate upon their integration with ASP. We then use this concept to make precise the semantic characterization of clingo’s theory-reasoning framework and establish its correspondence to the logic of Here-and-there with constraints. This provides us with a formal framework in which we can elaborate formal properties of existing hybridizations of clingo such as clingcon, clingo[dl], and clingo[lp].
Answer Set Programming (ASP) is about mapping a logic program onto its so-called stable models. Over the decades, stable models have been characterized in various ways [18], somehow underpinning their appropriateness as a semantics for logic programs. However, over the same period, the syntax of logic programs has been continuously enriched, equipping ASP with a highly attractive modeling language. This development also brought about much more intricate semantics, as nicely reflected by the mathematical apparatus used in the original definition [13] and the one for capturing the input language of the ASP system clingo [11].
With the growing range of ASP applications in academia and industry [10, 9] we also witness the emergence of more and more hybrid ASP systems [17, 16], similar to the rise of SMT solving from plain SAT solving [2]. From a general perspective, the resulting paradigm of ASP modulo theories (AMT) can be seen as a refinement of ASP, in which an external theory certifies some of a program’s stable models. A common approach, stemming from lazy theory solving [2], is to view a theory as an oracle that determines the truth of a designated set of (theory) atoms in the program. This idea has meanwhile been adapted to the setting of ASP in various ways, most prominently in Constraint ASP [17] extending ASP with linear equations, cf. [8, 12, 6, 19, 5, 1, 15]. Beyond this, ASP systems like clingo or dlvhex [7] leave the choice of the specific theory largely open. For instance, clingo merely requires fixing the interaction between theories and their theory atoms. As attractive as this generality may be from an implementation point of view, it complicates the development of generic semantics that are meaningful to existing systems. Not to mention that in ASP the integration of theories takes place in a non-monotonic context.
We address this issue and show how the Logic of Here-and-There with constraints (HTc; [4]) can be used as a semantics for clingo’s theory-reasoning framework. Thus, just like the plain Logic of Here-and-There (HT; [14, 20]) serves as the logical foundation of ASP, HTc extends this to AMT. In this way, we can not only draw on the formal framework of HT but can moreover study a heterogeneous approach such as AMT in the uniform setting of a single logic. To this end, we introduce the concept of abstract and structured theories that allows us to formally elaborate upon their integration with ASP. With them, we make precise the semantic characterization of clingo’s theory-reasoning framework and establish its correspondence to theories in HTc. This provides us with a formal framework in which we can elaborate formal properties of existing hybridizations of clingo such as clingcon, clingo[dl], and clingo[lp]. Full details can be found in [3].
Such a formal characterization provides several valuable insights. The most obvious benefit is that we now have a roadmap to follow, not only when deciding new aspects of existing AMT systems, but also when reconsidering parts of their implementation that were originally designed without clear guidance when facing design alternatives. This will have an immediate impact on the next generation of existing AMT systems like clingcon, where head atoms will start to be considered as founded, and clingo[lp], which will accordingly introduce an implicit separation between head atoms as founded and body atoms as external. A second important contribution is that, when proving our results, we have tried to maintain the highest possible degree of generality in the description of the underlying abstract theories. This paves the way for introducing new abstract theories: we may now start classifying them in terms of the general properties identified in the paper (consistency, structure and denotation, closed set of external atoms, absolute complement, etc.), so that we can characterize their behavior via HTc formulas.
[1] M. Banbara, B. Kaufmann, M. Ostrowski & T. Schaub (2017): Clingcon: The Next Generation. Theory and Practice of Logic Programming 17(4), pp. 408–461, doi:10.1017/S1471068417000138.
[2] C. Barrett, R. Sebastiani, S. Seshia & C. Tinelli (2009): Satisfiability Modulo Theories. In A. Biere, M. Heule, H. van Maaren & T. Walsh, editors: Handbook of Satisfiability, chapter 26, Frontiers in Artificial Intelligence and Applications 185, IOS Press, pp. 825–885, doi:10.3233/978-1-58603-929-5-825.
[3] P. Cabalar, J. Fandinno, T. Schaub & P. Wanko (2021): Towards a Semantics for Hybrid ASP Systems. CoRR abs/2108.03061. Available at https://arxiv.org/abs/2108.03061.
[4] P. Cabalar, R. Kaminski, M. Ostrowski & T. Schaub (2016): An ASP Semantics for Default Reasoning with Constraints. In R. Kambhampati, editor: Proceedings of the Twenty-fifth International Joint Conference on Artificial Intelligence (IJCAI’16), IJCAI/AAAI Press, pp. 1015–1021, doi:10.5555/3060621.3060762.
[5] A. De Rosis, T. Eiter, C. Redl & F. Ricca (2015): Constraint Answer Set Programming Based on HEX-Programs. In D. Inclezan & M. Maratea, editors: Proceedings of the Eighth Workshop on Answer Set Programming and Other Computing Paradigms (ASPOCP’15).
[6] C. Drescher & T. Walsh (2010): A Translational Approach to Constraint Answer Set Solving. Theory and Practice of Logic Programming 10(4-6), pp. 465–480, doi:10.1017/S1471068410000220.
[7] T. Eiter, S. Germano, G. Ianni, T. Kaminski, C. Redl, P. Schüller & A. Weinzierl (2018): The DLVHEX System. Künstliche Intelligenz 32(2-3), pp. 187–189, doi:10.1007/s13218-018-0535-y.
[8] I. Elkabani, E. Pontelli & T. Son (2004): Smodels with CLP and Its Applications: A Simple and Effective Approach to Aggregates in ASP. In B. Demoen & V. Lifschitz, editors: Proceedings of the Twentieth International Conference on Logic Programming (ICLP’04), Lecture Notes in Computer Science 3132, Springer-Verlag, pp. 73–89, doi:10.1007/978-3-540-27775-0_6.
[9] E. Erdem, M. Gelfond & N. Leone (2016): Applications of ASP. AI Magazine 37(3), pp. 53–68, doi:10.1609/aimag.v37i3.2678.
[10] A. Falkner, G. Friedrich, K. Schekotihin, R. Taupe & E. Teppan (2018): Industrial Applications of Answer Set Programming. Künstliche Intelligenz 32(2-3), pp. 165–176, doi:10.1007/s13218-018-0548-6.
[11] M. Gebser, A. Harrison, R. Kaminski, V. Lifschitz & T. Schaub (2015): Abstract Gringo. Theory and Practice of Logic Programming 15(4-5), pp. 449–463, doi:10.1017/S1471068415000150.
[12] M. Gebser, M. Ostrowski & T. Schaub (2009): Constraint Answer Set Solving. In P. Hill & D. Warren, editors: Proceedings of the Twenty-fifth International Conference on Logic Programming (ICLP’09), Lecture Notes in Computer Science 5649, Springer-Verlag, pp. 235–249, doi:10.1007/978-3-642-02846-5_22.
[13] M. Gelfond & V. Lifschitz (1988): The Stable Model Semantics for Logic Programming. In R. Kowalski & K. Bowen, editors: Proceedings of the Fifth International Conference and Symposium of Logic Programming (ICLP’88), MIT Press, pp. 1070–1080.
[14] A. Heyting (1930): Die formalen Regeln der intuitionistischen Logik. In: Sitzungsberichte der Preussischen Akademie der Wissenschaften, Deutsche Akademie der Wissenschaften zu Berlin, pp. 42–56. Reprinted in Logik-Texte: Kommentierte Auswahl zur Geschichte der Modernen Logik, Akademie-Verlag, 1986.
[15] T. Janhunen, R. Kaminski, M. Ostrowski, T. Schaub, S. Schellhorn & P. Wanko (2017): Clingo goes Linear Constraints over Reals and Integers. Theory and Practice of Logic Programming 17(5-6), pp. 872–888, doi:10.1017/S1471068417000242.
[16] R. Kaminski, T. Schaub & P. Wanko (2017): A Tutorial on Hybrid Answer Set Solving with clingo. In G. Ianni, D. Lembo, L. Bertossi, W. Faber, B. Glimm, G. Gottlob & S. Staab, editors: Proceedings of the Thirteenth International Summer School of the Reasoning Web, Lecture Notes in Computer Science 10370, Springer-Verlag, pp. 167–203, doi:10.1007/978-3-319-61033-7_6.
[17] Y. Lierler (2014): Relating constraint answer set programming languages and algorithms. Artificial Intelligence 207, pp. 1–22, doi:10.1016/j.artint.2013.10.004.
[18] V. Lifschitz (2010): Thirteen Definitions of a Stable Model. Lecture Notes in Computer Science 6300, Springer-Verlag, pp. 488–503, doi:10.1007/978-3-642-15025-8_24.
[19] M. Ostrowski & T. Schaub (2012): ASP modulo CSP: The clingcon system. Theory and Practice of Logic Programming 12(4-5), pp. 485–503, doi:10.1017/S1471068412000142.
[20] D. Pearce (1997): A New Logical Characterisation of Stable Models and Answer Sets. In J. Dix, L. Pereira & T. Przymusinski, editors: Proceedings of the Sixth International Workshop on Non-Monotonic Extensions of Logic Programming (NMELP’96), Lecture Notes in Computer Science 1216, Springer-Verlag, pp. 57–70, doi:10.1007/BFb0023801.
In the context of High-Utility Pattern Mining (HUPM), each item in a knowledge base is associated with one, static utility and a suitable utility function is defined over them. However, the usefulness of a pattern can be defined from very different viewpoints. To cope with this limitation, we propose the introduction of facets for items, and a more structured representation of input transactions. These extensions allow the definition of different, even complex, utility functions enabling the analysis of the underlying data at different abstraction levels. A real use case on paper reviews is also considered, in order to evaluate the potential of these ideas in the mining and extraction of new knowledge. The capabilities of Answer Set Programming and its extensions are fully exploited to cope with the wide variety of analytical scenarios that can be envisioned in this new setting.
Pattern mining is one of the data mining branches that attracted vast attention in the literature. The concept of pattern frequency is used to mine patterns from databases. Frequent patterns are useful in many domains; however, the fundamental assumption that pattern frequency is the most interesting factor may not hold in several contexts. As an example, consider a purchase transaction database; here, the pattern {flour, yeast} might be frequent but uninteresting, since it is fairly obvious that those who buy flour also buy yeast. In light of such considerations, the academic community started pointing out that a wide variety of patterns may be characterized by a low frequency but a high utility.
However, the basic assumption of all these approaches is that each item is associated with one static, external utility. In the economics domain, a utility function "represents a consumer's preference ordering over a choice set" and, consequently, it is a subjective measure [7]. It is then fair to assume that the utility of an item can be defined from very different points of view (we refer to them as facets in the following) depending on the preference ordering. In current approaches, switching the facet actually means completely changing the computation scenario or, at least, the dataset; moreover, different facets of item utilities cannot be combined to detect the utility of a pattern, unless a new definition of item utility is introduced as a derived measure. This also implies that the notion of utility is intended as local and computed for each pattern occurrence; with more facets at one's disposal, it becomes possible to compute transverse utilities across pattern occurrences, such as the correlation degree among (groups of) facets across pattern occurrences.
We propose the definition of a more general framework for HUPM, which extends the basic notions along the following directions: (i) transaction set representation, (ii) facets, and (iii) advanced utility functions. The versatility of these extensions is realized by taking full advantage of the expressiveness of declarative programming, focusing on Answer Set Programming (ASP) and its recent extensions [2, 3, 6]. A simple ASP encoding of the problem can be set up in a modular way such that, given a pool of alternative encodings for each module, even non-ASP practitioners can set up their own variant of the analysis, and the same dataset can be analyzed from different points of view by just switching the modules.
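The idea of switchable facets can be illustrated outside ASP as well. The sketch below is not the paper's encoding but a minimal Python analogue: item utilities are indexed by a facet, and the utility function is a pluggable parameter, mirroring the modular encodings described above. All item names, facets, and data are hypothetical.

```python
# Hypothetical item utilities under two facets: "price" and "sentiment".
facets = {
    "price":     {"flour": 2.0, "yeast": 1.0, "truffle": 40.0},
    "sentiment": {"flour": 0.1, "yeast": 0.2, "truffle": 0.9},
}

transactions = [
    {"flour", "yeast"},
    {"flour", "yeast", "truffle"},
    {"truffle"},
]

def pattern_utility(pattern, facet, aggregate=sum):
    """Utility of a pattern over the transactions containing it, under the
    chosen facet; `aggregate` is the swappable utility function."""
    item_u = facets[facet]
    occurrences = [t for t in transactions if pattern <= t]
    return aggregate(item_u[i] for t in occurrences for i in pattern)

print(pattern_utility({"flour", "yeast"}, "price"))   # -> 6.0
print(pattern_utility({"truffle"}, "sentiment"))      # -> 1.8
```

Switching the facet or the aggregate here plays the role of switching an encoding module: the same transaction data is analyzed from a different point of view without touching the rest of the pipeline.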
We build a real-case example of our framework upon the work presented in [1], which exploits aspect-based sentiment analysis of scientific reviews in order to extract useful information and correlate it with the accept/reject decision. An important feature we exploit in our example from the results in [1] is the automatic annotation of review sentences around eight different aspects:
In this example, we show that sentiment polarities may be exploited to find high-utility patterns in reviews, i.e., sets of words that either strongly characterize acceptance/rejection of a paper or point out disagreement between reviewers' sentiment and the final decision. In particular, in this context, the database is composed of a set of
To give an intuition of the potential of our framework, we describe the results of some experiments carried out to show if, and how much, the extensions proposed in our framework can help users in analyzing input data from different points of view and at different abstraction levels.
In a first series of experiments, we concentrate on the computation of coherence and disagreement degrees between the decision on a paper (Accept/Reject) and one of the eight aspects used to label review sentences [1]. In particular, the objective is to look for patterns (sets of words) characterizing with good accuracy the relationship of interest and to qualitatively check whether they are meaningful. Coherence and disagreement degrees are an example of the advanced utility functions which can be expressed in our framework; they measure the coherence (resp., disagreement) between facets and return the percentage of pattern occurrences having this measurement. We next report some interesting results obtained. In the setting
In a second series of experiments, we focused on mining patterns where the advanced utility function is the Pearson correlation between the sentiment on a sentence aspect $X$ and the final decision on the corresponding paper. In particular, given a certain aspect, say $X$=
The presented examples highlight the wide applicability of HUPM even in non-classical contexts. Also, the ASP-based computation method of the framework allows for a reasonable and versatile implementation, which provides a fast way to analyze the data from different perspectives and at different abstraction levels. The real-case example has been employed as a common thread to analyze the different aspects of the proposed framework. Again, thanks to the ASP implementation, it is possible to apply the framework to several different contexts in a straightforward way.
ANTHEM is a proof assistant that can be used for verifying the correctness of a program written in the input language of the answer set grounder GRINGO with respect to a specification expressed by first-order formulas. In addition to the rules of the program, the user of ANTHEM is expected to provide some information on the intended use of the rules: which symbols are meant to serve as placeholders, which predicates are meant to be defined in the input, and which predicates are considered part of the output. The predicates that do not belong to either category are "private" and they are not allowed in a specification. A set of rules along with this additional information is termed a "program with input and output," or an "io-program."
A specification given to ANTHEM consists of two groups of first-order sentences. One group, "assumptions," is the list of conditions on the values of placeholders and input predicates that are satisfied when the input is considered valid. The second group, "specs," is the list of conditions that characterize correct outputs.
To bridge the gap between the language of io-programs and the language of specifications, ANTHEM employs a transformation similar to program completion. The completion COMP[$\Omega$] of an io-program $\Omega$ faithfully represents the meaning of $\Omega$ whenever $\Omega$ satisfies the syntactic condition called "tightness." Tightness of io-programs is a generalization of a condition identified in early work on the relationship between stable models and completion. The operation of ANTHEM is based on the fact that the correctness of a tight io-program $\Omega$ with respect to a specification can be described by the sentence \begin{gather} A \to (\text{COMP}[\Omega] \leftrightarrow S), \label{main} \end{gather} where $A$ is the conjunction of the assumptions, and $S$ is the conjunction of the specs.
The tightness requirement eliminates practically all uses of recursion in the program. It turns out that the above-mentioned property of tight io-programs---the possibility of characterizing correctness by the above formula---holds, more generally, for "locally tight" io-programs, in which some forms of recursion are allowed. For example, the io-program with the rules \begin{flalign} &in(P,R,0) \ \texttt{:-}\ in0(P,R).&\\ &in(P,R,T+1) \ \texttt{:-}\ goto(P,R,T).&\\ &{in(P,R,T+1)} \ \texttt{:-}\ in(P,R,T), T < h.&\\ &\ \texttt{:-}\ in(P,R1,T), in(P,R2,T), R1 != R2.&\\ &in\_building(P,T) \ \texttt{:-}\ in(P,R,T).&\\ &\ \texttt{:-}\ not in\_building(P,T), person(P), T = 0..h.& \end{flalign} with the placeholder ${\tt h}$, the input predicates $\texttt{person/1}$, $\texttt{in0/2}$ and $\texttt{goto/3}$, and the output predicate $\texttt{in/3}$ is a program of this kind. We can read $\texttt{in(P,R,T)}$ as "person $P$ is in room $R$ at time $T$," and $\texttt{goto(P,R,T)}$ as "person $P$ goes to room $R$ between times $T$ and $T+1$." The symbol ${\tt h}$ represents the "horizon"---the length of the scenarios under consideration. The third rule of the program encodes the commonsense law of inertia for this domain, and this is the rule that makes the program nontight: $\texttt{in/3}$ occurs both in the head and in the body of this rule nonnegated.
In the definition of the positive dependency graph below, $[t]$ stands for the set of values of a ground term $t$; for instance, $[\tt{2+2}]=\{{\tt 4}\}$, $[{\tt 2..4}]=\{{\tt 2},{\tt 3},{\tt 4}\}$. The values of a ground term are "precomputed terms." Precise definitions, as well as a description of the syntax of io-programs, can be found in the previous publication on ANTHEM.
A valuation for an io-program $\Omega$ is a function defined on the set of placeholders of $\Omega$ such that its values are precomputed terms different from placeholders. For an io-program $\Omega$ and a valuation $v$, by $\Omega(v)$ we denote the set of rules obtained from the rules of $\Omega$ by replacing every occurrence of every placeholder $t$ by the term $v(t)$. An~input for $\Omega$ is a pair $(v,I)$, where $v$ is a valuation for $\Omega$, and $I$ is a set of precomputed atoms without placeholders, such that the predicate symbol of each atom in $I$ is an input symbol of $\Omega$.
The positive dependency graph of an io-program $\Omega$ for an input $(v,I)$ is the directed graph defined as follows. Its vertices are ground atoms $p(r_1,\dots,r_n)$ such that $p/n$ is an output symbol or a private symbol of $\Omega$, and each $r_i$ is a precomputed term different from the placeholders of $\Omega$. It has an edge from $p(r_1,\dots,r_n)$ to $p'(r'_1,\dots,r'_m)$ iff there exists a ground instance $R$ of one of the rules of $\Omega(v)$ such that
In the example above, the vertices of the positive dependency graph for an input $(v,I)$ are atoms $\texttt{in}(p,r,t)$ and $\texttt{in\_building}(p,t)$, where $p$, $r$, $t$ are precomputed terms different from ${\tt h}$. Its edges go from $\texttt{in\_building}(p,t)$ to $\texttt{in}(p,r,t)$ and, if $v({\tt h})$ is a positive integer, from $\texttt{in}(p,r,i+1)$ to $\texttt{in}(p,r,i)$ for $i=0,\dots,v({\tt h})-1$. This graph has no infinite paths, so that the program is locally tight on all inputs.
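The graph just described can be materialized for a concrete input. The sketch below (not part of ANTHEM; person and room names and the horizon value are hypothetical) builds a finite fragment of the positive dependency graph restricted to the relevant atoms for $v({\tt h})=3$, and checks that it is acyclic; on a finite fragment, the absence of infinite paths reduces to the absence of cycles.

```python
from itertools import product

persons, rooms, horizon = ["alice"], ["r1", "r2"], 3  # hypothetical input

# Edges follow the two positive dependencies of the example:
# in_building(p,t) -> in(p,r,t), and in(p,r,t+1) -> in(p,r,t) (inertia rule).
edges = []
for p, r, t in product(persons, rooms, range(horizon + 1)):
    edges.append((("in_building", p, t), ("in", p, r, t)))
    if t < horizon:
        edges.append((("in", p, r, t + 1), ("in", p, r, t)))

def has_cycle(edges):
    """Standard DFS cycle detection with white/gray/black coloring."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def dfs(v):
        color[v] = GRAY
        for w in graph.get(v, []):
            c = color.get(w, WHITE)
            if c == GRAY or (c == WHITE and dfs(w)):
                return True
        color[v] = BLACK
        return False
    return any(color.get(v, WHITE) == WHITE and dfs(v) for v in graph)

print(has_cycle(edges))  # -> False: locally tight on this input
```

Note that the inertia edges only ever point from time $i+1$ to time $i$, so every path strictly decreases the time component and must terminate; the check above confirms this on the finite fragment.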
Decision support tools face a variety of challenges, including multiplicity of information sources, heterogeneous
format, constant changes, and other characteristics of the domain they model. Handling such challenges requires the
ability to analyze and process inconsistent and incomplete information with varying degrees of associated uncertainty.
In addition to this, some domains require the system's outputs to be explainable and interpretable---an example of this
is cyber threat analysis (CTA) in cybersecurity domains. Due to the nature of these analysis processes, an automated
reasoning system with human-in-the-loop capabilities would be best suited for the task. Such a system must be able to
accomplish several goals, among which we distinguish the following main capabilities [1]:
(i) reason about evidence in a formal, principled manner;
(ii) consider evidence associated with probabilistic uncertainty;
(iii) consider logical rules that allow for the system to draw conclusions based on certain pieces of evidence and
iteratively apply such rules;
(iv) consider pieces of information that may not be compatible with each other, deciding which are most relevant; and
(v) show the actual status of the system based on the above-described features, and provide the analyst with the ability
to understand why an answer is correct, and how the system arrives at that conclusion
(i.e., explainability and interpretability).
Based on these capabilities, we work with the DeLP3E framework [1], which was designed based on the
requirements mentioned above. A DeLP3E knowledge base
consists of two parts: an analytical model (AM) and an environmental model (EM), which represent different
aspects of a domain. The former model contains all the background information and knowledge that is available for
analysis: here we can have rules, facts, or presumptions to represent the knowledge of the domain. From this model we
can obtain conclusions based on the creation and analysis of arguments using defeasible logic programming [2]. On the other hand, the environmental model is used to describe the background knowledge and is
probabilistic in nature. Unlike the AM, this model must be consistent, which simply means that there must exist a
probabilistic distribution over the possible states of the domain (worlds) that satisfies all of the constraints in the
model, as well as the axioms of probability theory. In general, the EM contains knowledge such as evidence, intelligence
reporting, or knowledge about actions, software, and systems. There are thus two kinds of uncertainty to be modeled:
probabilistic uncertainty and that arising from defeasible knowledge. DeLP3E allows both to coexist, and combinations of
the two since defeasible rules and presumptions (that is, defeasible facts) can also be annotated with probabilistic
events.
In this model, to answer a query for a literal we need to compute the probability interval with which it is warranted in
the DeLP3E KB---for this, we must sum the probability mass of the worlds where the queried literal is warranted
(warranting scenarios, for the lower bound) and the mass of the worlds that warrant the complement of the query
(for the upper bound). The lower and upper bounds obtained in this manner comprise the probability interval for the
query. This is one of the main sources of computational intractability, since we must answer the query either for all
the worlds in the EM model or all the programs in the AM model. This computational intractability is the main problem
that we address here.
This extended abstract provides an overview of a research project focused on approximate query answering in
probabilistic structured argumentation based on DeLP [2] and probabilistic models to deliver these
capabilities. The ultimate goal is to develop a suite of algorithms for tackling the intractability of this task and
selection criteria for choosing the best one based on the analysis of available information.
The characteristics that DeLP3E offers for modeling reasoning in a specific domain based on observations of probabilistic events make it a good choice for developing a system that provides answers accompanied by explanations in complex domains such as those mentioned above. Our goal is to design an effective and efficient tool that allows one to perform tasks such as: evaluating the state of the system under a particular set of observed events (evidence), deriving the most probable scenarios based on counterfactuals ("what if" scenarios), and approximating the probability interval associated with a query considering all possible eventualities (possible worlds). These tasks will be supplemented with the capability of understanding how the system reaches its conclusions through explanations based on the AM and EM components.
Our main goal is to propose a set of algorithms that take advantage of a variety of sampling-based techniques to approximate the exact probability interval with which a query is entailed. We can consider two types of sampling: world-based and subprogram-based. In world-based sampling, a subset of the possible EM worlds is chosen and, based on the annotation function (which annotates each component of the AM with logical formulas formed by components from the EM), different subprograms of the AM are obtained. Given a world, an AM program is induced, which is simply a classical DeLP program [2] in which we can query for the status of some literal. On the other hand, subprogram-based sampling chooses from the universe of all possible AM subprograms, and each of these programs can be generated by multiple worlds through their annotations. In both cases, we also have those worlds or programs in which the status of the query can be "undecided" or "unknown". By construction, these sampling techniques yield sound results, meaning that the probability intervals obtained are guaranteed to be supersets of the exact answer---given the incomplete nature of the process, some probability mass will in general remain unexplored.
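World-based sampling and its soundness guarantee can be sketched on a toy scale. The code below is an illustration under stated assumptions, not the DeLP3E implementation: the worlds, the uniform distribution, and the `warrant_status` oracle are all hypothetical stand-ins for the EM and the induced DeLP programs.

```python
import random

random.seed(0)

# 8 hypothetical EM worlds with a uniform toy distribution.
worlds = list(range(8))
prob = {w: 1 / 8 for w in worlds}

def warrant_status(world, query):
    """Hypothetical oracle: status of `query` in the DeLP program
    induced by `world` ('yes', 'no', or 'undecided')."""
    return "yes" if world % 2 == 0 else ("no" if world < 4 else "undecided")

def sample_interval(query, n_samples):
    lower, against, seen = 0.0, 0.0, set()
    for _ in range(n_samples):
        w = random.choice(worlds)
        if w in seen:            # repeated sample: wasted effort
            continue
        seen.add(w)
        status = warrant_status(w, query)
        if status == "yes":
            lower += prob[w]     # mass of warranting scenarios
        elif status == "no":
            against += prob[w]   # mass warranting the complement
    return lower, 1.0 - against  # sound: a superset of the exact interval

print(sample_interval("q", 50))
```

Unexplored worlds contribute neither to the lower bound nor to the mass against the query, so the returned interval can only be wider than the exact one, which is exactly the soundness property stated above.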
As summarized above, different sampling techniques can be proposed, and our hypothesis is that their effectiveness
will be directly affected by the different settings that arise in practice.
Having empirical results based on the application of different techniques in a variety of such settings would
therefore be of great use in designing a query answering algorithm based on choosing the best technique given
the current setting.
In order to base the selection of a technique on the best available data, we analyzed the information available in each model in order to define a series of metrics that characterize the current configuration. For the input KB, two main attributes were established: size and complexity, which apply to each component: AM, EM, and annotation function. The former defines a quantitative value based on the number of elements in the model, while the latter is focused on capturing the intricacy of generating and working with structures in the particular model.
Developing a comprehensive set of such metrics allows us to do two things:
(i) develop techniques for generating synthetic KBs that fall within specific settings (for instance,
simple AM and annotation function, but complex EM); and
(ii) evaluate how the possible alternatives for sampling-based approximation algorithms perform in different settings.
This is key in developing decision criteria for selecting the best algorithm for the job, contemplating the tradeoffs
between running time (including the cost of calculating any approximate metrics) and precision of the result obtained.
As one of our main results, we found that sampling larger sets of worlds leads to higher quality approximations;
even though this is of course expected, there are two interesting details here.
First, for the 20 EM variable case (1M worlds, the largest instances for which exact values were computable), the difference in quality between 5K and 10K world samples was not statistically significant, which means that 5K samples sufficed to obtain a good approximation.
Second, the proportion of repeated samples (which must be discarded, leading to wasted
effort) was quite high for both entropy levels; for higher entropy levels, on average 52% of samples were repeated,
while for lower entropy an average of 87% were non-unique. For the 20 EM variable case, the quality levels were
achieved with only 2,293 and 469 unique samples, respectively.
Our study also showed lower variation in quality when larger sample sizes were used.
Finally, our results showed that the entropy associated with the probability distribution over possible worlds has a
large impact on expected solution quality, but in many cases even a modest number of samples suffices to achieve an
approximation of relatively good quality.
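The connection between distribution entropy and wasted sampling effort can be reproduced in miniature. The experiment below uses synthetic distributions and numbers (not the paper's knowledge bases or reported percentages): sampling with replacement from a skewed (low-entropy) distribution over worlds draws the same worlds far more often than sampling from a uniform (high-entropy) one.

```python
import random

random.seed(42)

def unique_fraction(weights, n_samples):
    """Fraction of draws that are unique when sampling worlds
    with replacement according to `weights`."""
    draws = random.choices(range(len(weights)), weights=weights, k=n_samples)
    return len(set(draws)) / n_samples

n_worlds, n_samples = 2 ** 20, 5000
high_entropy = [1.0] * n_worlds                              # uniform
low_entropy = [1.0 / (i + 1) ** 2 for i in range(n_worlds)]  # heavily skewed

print(unique_fraction(high_entropy, n_samples))  # close to 1.0
print(unique_fraction(low_entropy, n_samples))   # far smaller
```

Under the uniform distribution, collisions among 5K draws from about a million worlds are rare; under the skewed one, most of the probability mass sits on a handful of worlds, so most draws are repeats and must be discarded.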
Search-optimization problems are plentiful in scientific and engineering domains. Various automated reasoning paradigms provide users with languages supporting optimization statements; consider such popular paradigms as integer linear programming (ILP) [4], MaxSAT [5], optimization satisfiability modulo theories (OMT) [6], and constraint answer set programming (CASP) [1]. These paradigms allow a user to express ``hard'' and ``soft'' constraints for a problem of interest. The hard part of an encoding states the requirements on what constitutes a solution to the problem, while the soft part states the optimization criteria based on which we compare the resulting solutions and find optimal ones. These paradigms vary significantly in their languages, for example, in the ways they express hard constraints and soft constraints, as well as in their vocabularies. Indeed, ILP's primary objects are variables over integers, whereas in MaxSAT we deal with binary variables. Furthermore, formalisms such as OMT and CASP allow both kinds of variables, together with an intricate interface between them.
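The hard/soft split over binary variables can be made concrete with a tiny brute-force example in the style of MaxSAT. The instance below is hypothetical and purely illustrative: hard constraints define which assignments are solutions at all, and weighted soft constraints rank the solutions.

```python
from itertools import product

VARS = ["a", "b", "c"]

def hard(m):
    # A solution must satisfy: (a or b) and not (a and c).
    return (m["a"] or m["b"]) and not (m["a"] and m["c"])

# Weighted soft constraints: satisfied weights are summed and maximized.
soft = [
    (3, lambda m: m["a"]),
    (2, lambda m: m["c"]),
    (1, lambda m: not m["b"]),
]

best = max(
    (dict(zip(VARS, bits)) for bits in product([False, True], repeat=3)),
    key=lambda m: sum(w for w, f in soft if f(m)) if hard(m) else -1,
)
print(best)  # -> {'a': True, 'b': False, 'c': False}
```

Exhaustive search is viable only for toy instances, of course; the point is the shared structure: every paradigm listed above separates the definition of solutions (hard) from the criteria ranking them (soft).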
The relations between many of the enumerated paradigms have been studied in the absence of ``soft constraints.'' Recently, Lierler provided a formal account comparing the MaxSAT family and answer set programs with weak constraints (weak constraints are syntactic objects for expressing soft constraints in logic programs) [3]. In that work, so-called abstract modular weight-systems (w-systems) served as the primary tool for drawing a precise parallel between these frameworks. These systems abstract away the syntactic differences of the paradigms, while retaining the essential semantic ingredients sufficient to introduce a concept of a model and optimization criteria for their comparison. An abstract notion of a logic introduced by Brewka and Eiter [2] is a crucial ingredient of w-systems. This abstract logic may encapsulate languages over binary variables; hence, w-systems may capture frameworks such as MaxSAT or logic programs with optimizations.
In this work, we extend the concepts of an abstract logic and w-systems to obtain a framework capable of capturing formalisms that utilize distinct kinds of variables in their languages. We then show how the resulting extended w-systems encapsulate ILP, OMT (in its two common variants), and CASP with optimization statements. We trust that such a bird's-eye view of these distinct paradigms will boost cross-fertilization between approaches used in the design of algorithms supporting optimization in distinct fields. Indeed, Lierler illustrated how MaxSAT solvers can be used to compute optimal answer sets for logic programs with weak constraints by providing a theoretical basis for cross-translations between formalisms [3]. This work provides theoretical grounds for devising translations for related optimization statements in CASP and OMT. Thus, we pave the way, for example, for an extension of the constraint answer set solver EZSMT [7], which relies on satisfiability modulo theories solvers for finding answer sets of a given CASP program. In the future, this system could utilize OMT solvers to find optimal answer sets of a CASP program.
The work was partially supported by NSF grant 1707371.
The specifications of stream reasoning applications often include events with temporally-constrained effects. In e-commerce, e.g., an agent may be suspended, but only temporarily, i.e. the suspension is lifted after a specified time. Moreover, in maritime situational awareness, certain types of fishing activities are predicted to have concluded at a specified time after multiple `change in heading' events from a vessel [6]. Therefore, stream reasoning often requires the representation and the efficient treatment of the temporally-constrained effects of events. The goal of this work is to extend the stream reasoning system RTEC [2] with a mechanism which effectively handles the temporally-constrained effects of events.
In order to support stream reasoning, RTEC extends the Event Calculus with features which aid computational performance, such as caching and indexing. Processing data streams in windows, in conjunction with a mechanism which discards events prior to the current window, increases efficiency, as reasoning time depends on the size of the window and not on the size of the entire stream. Moreover, RTEC employs a caching mechanism to avoid unnecessary re-computations, even in the presence of large knowledge bases. The complexity analysis of RTEC is available in [2], while the code is publicly available.
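The windowing mechanism described above can be sketched schematically. The class below is not RTEC's implementation (RTEC is written in Prolog); it is a minimal Python illustration of the principle that events prior to the current window are discarded, so per-query cost depends on the window size rather than the stream length.

```python
from collections import deque

class WindowedReasoner:
    """Toy window-based stream processor (illustrative, not RTEC)."""

    def __init__(self, window):
        self.window = window
        self.events = deque()          # (time, event) pairs, time-ordered

    def tell(self, time, event):
        self.events.append((time, event))

    def query(self, now):
        # Discard events prior to the current window [now - window, now].
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        # Reasoning would run here over the remaining events only.
        return list(self.events)

r = WindowedReasoner(window=10)
for t in range(25):
    r.tell(t, f"event_{t}")
print(len(r.query(now=25)))  # -> 9: only events at times 16..24 remain
```

However many events the stream has produced, each query touches only the events inside the window, which is the source of the efficiency gain noted above.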
Query abduction has been studied for ontology-mediated query answering with expressive ontology languages such as existential rules [1,3,4,2]. A query abduction problem (QAP) [1] is a tuple $\Lambda=(\mathcal{P}, \mathcal{D}, q, \Sigma)$, where $\mathcal{P}$ is a set of existential rules, $\mathcal{D}$ is a dataset, $q$ is a Boolean conjunctive query (BCQ) called the observation, and $\Sigma$ is a set of predicates called the abducibles. An explanation to $\Lambda$ is a finite set of facts $\mathcal{E}$ that may contain labelled nulls such that $\mathsf{sig}(\mathcal{E})\subseteq \Sigma$ and $\mathcal{P}\cup \mathcal{D}\cup \mathcal{E}\models q$.
Different from traditional abduction, query abduction allows place-holders (expressed as nulls) in explanations that can be substituted with constants, which often causes the total number of explanations to be significantly large or even infinite. Consider an example on contact tracing where an ontology $\mathcal{P}$ contains two rules:
\begin{align*}
& \mathsf{high\_risk}(x)\leftarrow \mathsf{close\_contact}(x, y), \mathsf{high\_risk}(y). \\
& \mathsf{high\_risk}(x)\leftarrow \mathsf{visit}(x, y), \mathsf{hotspot}(y).
\end{align*}
And the dataset $\mathcal{D} = \{\mathsf{close\_contact}(John, person_1), \ldots, \mathsf{close\_contact}(John, person_m), \mathsf{hotspot}(area_1), \ldots, \mathsf{hotspot}(area_n)\}$.
Now, suppose we have the observation that John is high-risk, that is, $q: \mathsf{high\_risk}(John)$. Take $\Sigma = \{\mathsf{high\_risk}, \mathsf{close\_contact}, \mathsf{visit}, \mathsf{hotspot}\}$; then we have the following three types of obvious explanations among others: $\mathcal{E}_{p_i} = \{\mathsf{high\_risk}(person_i) \}$ ($1\le i \le m$), $\mathcal{E}_{a_j} = \{\mathsf{visit}(John, area_j) \}$ ($1\le j\le n$), and $\mathcal{E}_{ap} = \{ \mathsf{visit}(John, u), \mathsf{hotspot}(u) \}$, where $u$ is a labelled null.
We note that $\mathcal{E}_{ap}$ is different from $\mathcal{E}_{p_i}$ and $\mathcal{E}_{a_j}$ in the sense that the former essentially represents an abstract pattern of explanations with substitutable values, while the latter are concrete ones referring to specific persons or locations. It is hard to tell in general which kind of explanation would be better. For some abducibles, a user may prefer concrete explanations to abstract ones; for instance, to trace high-risk individuals as in $\mathcal{E}_{p_i}$. For other abducibles, the user may have the opposite preference; for instance, to know John is high-risk due to visiting some known hotspots, expressed as $\mathcal{E}_{ap}$. Thus, in this paper, we propose a general framework of query abduction that allows such preferences over different types of explanations to be specified.
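That each of these candidate sets really is an explanation can be checked by naive forward chaining over the two contact-tracing rules. The sketch below uses the constants of the running example but is purely illustrative code, not the paper's system; the null $u$ is treated as just another term.

```python
def derive_high_risk(facts, max_rounds=100):
    """Naive forward chaining with the two rules of the example."""
    facts = set(facts)
    for _ in range(max_rounds):
        new = set()
        # high_risk(x) <- close_contact(x, y), high_risk(y).
        for (p, q) in {(a[1], a[2]) for a in facts if a[0] == "close_contact"}:
            if ("high_risk", q) in facts:
                new.add(("high_risk", p))
        # high_risk(x) <- visit(x, y), hotspot(y).
        for (p, a_) in {(a[1], a[2]) for a in facts if a[0] == "visit"}:
            if ("hotspot", a_) in facts:
                new.add(("high_risk", p))
        if new <= facts:          # fixpoint reached
            return facts
        facts |= new
    return facts

D = {("close_contact", "John", "person_1"), ("hotspot", "area_1")}

E_p1 = {("high_risk", "person_1")}                 # concrete explanation
E_ap = {("visit", "John", "u"), ("hotspot", "u")}  # abstract, u is a null

for E in (E_p1, E_ap):
    closure = derive_high_risk(D | E)
    print(("high_risk", "John") in closure)        # -> True for both
```

Both the concrete and the abstract set entail the observation together with $\mathcal{D}$ and $\mathcal{P}$, which is exactly the condition $\mathcal{P}\cup \mathcal{D}\cup \mathcal{E}\models q$ in the definition of an explanation.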
To define a selective notion of explanations, we need some preference relations over QAP explanations. For two sets of facts $\mathcal{E}$ and $\mathcal{E}'$, $\mathcal{E} \preceq_m \mathcal{E}'$ if there exists a renaming of nulls $\sigma$ such that $\mathcal{E} \sigma\subseteq \mathcal{E}'$. $\mathcal{E}\preceq_h\mathcal{E}'$ if there is a substitution $\sigma$ such that $\mathcal{E}\sigma \subseteq \mathcal{E}'$.
To allow fine-grained partitions of abducibles and preference on constants or nulls in each of the parts, we define a selection condition over the abducibles $\Sigma$ to be a tripartition $S=(\Sigma_C,\Sigma_A,\Sigma_M)$ of $\Sigma$, where $\Sigma_C$, $\Sigma_A$ and $\Sigma_M$ are called concrete, abstract and mixed abducibles, respectively. Intuitively, $S$ specifies that for (parts of) explanations on predicates in $\Sigma_C$, concrete constants are preferred over nulls in the explanations; whereas for those on predicates in $\Sigma_A$, it is the other way round, i.e., substitutable nulls are preferred over constants. To achieve this, we define a preference over explanations such that an explanation $\mathcal{E}$ is preferred if nulls in the atoms over $\Sigma_C$ are substituted with constants, whereas nulls are preserved in the atoms over $\Sigma_A$. And on $\Sigma_M$, there is no preference between nulls and constants, and the preference relation coincides with $\preceq_h$. Formally, given a QAP with a dataset $\mathcal{D}$, $\mathcal{E}\preceq_S \mathcal{E}'$ if there exists a substitution $\sigma$ such that (i) $\mathcal{E}|_{\Sigma_C}\subseteq \mathcal{E}'|_{\Sigma_C}\sigma$, (ii) $\mathcal{E}|_{\Sigma_A}\sigma\subseteq \mathcal{E}'|_{\Sigma_A}\cup \mathcal{D} $, and (iii) $\mathcal{E}|_{\Sigma_M}\sigma\subseteq \mathcal{E}'|_{\Sigma_M}$.
Definition 1 Given a QAP $\Lambda$ and a selection condition $S$, a selective explanation $\mathcal{E}$ to $\Lambda$ w.r.t.\ $S$ is an explanation to $\Lambda$ such that
The notion of preferred explanations in [3] only satisfies Conditions~1 and~2, which does not allow users to specify a preference between $\mathcal{E}_{a_j}$ and $\mathcal{E}_{ap}$ in the motivating example. Our definition coincides with that in [3] when $S = (\emptyset,\emptyset,\Sigma)$, and hence is more general. Let $S = (\{\mathsf{high\_risk}, \mathsf{close\_contact}\}, \{\mathsf{visit}, \mathsf{hotspot}\}, \emptyset)$; then $\mathcal{E}_{ap}\preceq_S \mathcal{E}_{a_j}$ based on our definition, whereas the converse does not hold.
We also propose a compact representation for selected explanations using rules. In the motivating example, explanations $\mathcal{E}_{p_i} = \{\mathsf{high\_risk}(person_i)\}$ ($1\le i\le m$) all follow a similar pattern. These explanations can be obtained from one pattern explanation $\mathcal{E}_{pp} = \{\mathsf{close\_contact}(John, u), \mathsf{high\_risk}(u)\}$. And if an instance of the form $\mathsf{close\_contact}(John, a)$ is found in $\mathcal{D}$ for some constant $a$ then $\{\mathsf{high\_risk}(a)\}$ is an explanation. Such a pattern can be captured by a rule $r: \mathsf{high\_risk}(x) \Leftarrow \mathsf{close\_contact}(John, x)$, and explanations $\mathcal{E}_{p_i}$ ($1\le i\le m$) can be derived by applying $r$ to $\mathcal{D}$. Hence, rule $r$ can be seen as a compact representation for them. Note that the compact representations must not be confused with the rules in the ontology. Unlike ontological rules, a compact representation $\mathsf{high\_risk}(x) \Leftarrow \mathsf{close\_contact}(John, x)$ may not hold in general, but is specific to a QAP. It should be understood as that if $\mathsf{close\_contact}(John, a)$ is found in the given dataset for some constant $a$ then $\{\mathsf{high\_risk}(a)\}$ is an explanation.
To compute selective explanations and their compact representations, we present a method based on query rewriting. Given an observation $q$ and an ontology $\mathcal{P}$, $q$ is first-order rewritable by $\mathcal{P}$ if there exists a finite set of BCQs, denoted as $\mathsf{rew}(q,\mathcal{P})$, such that for each dataset $\mathcal{D}$, $\mathcal{P}\cup\mathcal{D}\models q$ if and only if $\mathcal{D}\models q'$ for some BCQ $q'\in \mathsf{rew}(q,\mathcal{P})$. We assume $\mathsf{rew}(q,\mathcal{P})$ is non-redundant, that is, there do not exist two BCQs $q_1, q_2\in \mathsf{rew}(q,\mathcal{P})$ such that $q_1\preceq_h q_2$ and $q_2\not\preceq_h q_1$.
In particular, we consider facts as rules with empty bodies, and for a dataset $\mathcal{D}$, $\mathcal{R}_\mathcal{D}$ denotes the set of rules $\{P(\vec{a}) \leftarrow \;|\; P(\vec{a})\in \mathcal{D}\} $. Given a set $\mathcal{Q}$ of BCQs and a signature $\Sigma$, $\mathcal{Q}|_\Sigma = \{q\in \mathcal{Q} \;|\; \mathsf{sig}(q)\subseteq \Sigma\}$. For a set of solutions $\Xi$ to a QAP and a selection condition $S$, $\min_{\preceq_S}(\Xi)$ consists of all the solutions $\mathcal{E}\in \Xi$ such that for each solution $\mathcal{E}'\in \Xi$ satisfying $\mathcal{E}'\preceq_S \mathcal{E}$, it holds $\mathcal{E}\preceq_S \mathcal{E}'$. We assume a fixed bijective mapping $\rho$ from variables to nulls, which allows us to map a BCQ to a QAP explanation.
Proposition 1 For a QAP $\Lambda=(\mathcal{P},\mathcal{D},q,\Sigma)$, $\mathsf{exp}_S(\Lambda) \equiv \min_{\preceq_S} \big( \mathsf{rew}(q, \mathcal{P} \cup \mathcal{R}_\mathcal{D})|_\Sigma \rho \big).$
The result shows that for first-order rewritable observations and ontologies the computation of selective explanations can utilise an efficient query rewriting system, which can potentially scale over large datasets. Furthermore, we show that the compact representations of selective explanations (as rules) can be obtained from the BCQs in $\mathsf{rew}(q, \mathcal{P})$.
We have implemented a prototype system for computing selective explanations based on Drewer [5], a rewriting-based query answering system for existential rules, and evaluated the scalability of our system on complex QAPs with large datasets.
In Table 1, for each ontology, we used 5 BCQs as observations. For each QAP evaluated, we compare the numbers of explanations as in [3] (Explanation) and selective explanations, as well as the times (in milliseconds) used for generating them. For selective explanations, we consider two special cases where all abducibles are in $\Sigma_A$ (Abstract) and all of them are in $\Sigma_C$ (Concrete). For the numbers of explanations, we report both the numbers of fact sets (#FSets) and the numbers of rules (#Rules) as the corresponding compact representations. For Abstract, these two numbers coincide.
| Ontology | Query | Explanation #Rules | Explanation #FSet | Explanation Time (ms) | Abstract #FSet | Abstract Time (ms) | Concrete #Rules | Concrete #FSet | Concrete Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| LUBM | q1 | 4 | 557 | 323 | 2 | 9 | 2 | 555 | 368 |
| LUBM | q2 | 5 | 13597 | 699 | 1 | 51 | 2 | 2167 | 155 |
| LUBM | q3 | 12 | 18871 | 190 | 4 | 0 | 7 | 17093 | 240 |
| LUBM | q4 | 10 | 40158 | 664 | 2 | 963 | 2 | 555 | 712 |
| LUBM | q5 | 50 | 57546 | 6537 | 10 | 36 | 10 | 15086 | 322 |
| Vicodi | q1 | 5 | 1259 | 322 | 3 | 3 | 3 | 1257 | 91 |
| Vicodi | q2 | 3 | 1257 | 42 | 1 | 11 | 2 | 1256 | 10 |
| Vicodi | q3 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| Vicodi | q4 | 189 | 556596 | 4478 | 84 | 29 | 116 | 556060 | 5216 |
| Vicodi | q5 | 247 | 866896 | 7624 | 118 | 95 | 150 | 866336 | 5743 |
| STB-128 | q1 | 9 | 50306 | 794 | 4 | 3 | 6 | 39931 | 1001 |
| STB-128 | q2 | 3 | 9998 | 66 | 2 | 1 | 2 | 9997 | 127 |
| STB-128 | q3 | 3 | 10001 | 114 | 2 | 0 | 2 | 10000 | 131 |
| STB-128 | q4 | 15 | 80457 | 725 | 6 | 4 | 10 | 69963 | 1056 |
| STB-128 | q5 | 15 | 69773 | 644 | 6 | 1 | 10 | 69484 | 731 |
Automatic segmentation represents a major breakthrough in computer-aided diagnosis and medicine, thanks to its ability to extract and provide important information useful to clinicians for interventional and diagnostic tasks. Several approaches based on Deep Learning (DL), such as Convolutional Neural Networks, have been proposed to identify anatomical and pathological structures, and to extract hidden patterns from huge amounts of data; they have already proved suitable for learning without human supervision. However, driving the decisions of the network according to prior knowledge is still a challenging issue, as is interpreting the choices made by the models.
In this work we propose the use of deductive rule-based approaches, such as Answer Set Programming (ASP), to drive DL approaches in performing semantic segmentation of medical images. Specifically, we encoded some available prior medical knowledge via ASP, thus defining a rule-based model for deducing all admitted combinations of classes and their correct locations in medical images. We make use of the inferred knowledge to: (i) define a novel loss function which includes a penalty for misclassified elements identified by the network, and (ii) perform post-processing to remove small islands of noise and identified classes which do not comply with medical requirements; also, we reassign misclassified elements to the most frequent class in the neighborhood.
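As a rough illustration of the post-processing step (ii), the Python sketch below reassigns pixels flagged as misclassified to the most frequent class among their unflagged neighbours; the flagging itself, which the paper derives from the ASP model, is simply assumed as given here:

```python
from collections import Counter

def reassign(labels, flagged, radius=1):
    """Reassign each flagged pixel to the most frequent class among its
    unflagged neighbours within the given radius (majority vote)."""
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            if not flagged[y][x]:
                continue
            neigh = [labels[ny][nx]
                     for ny in range(max(0, y - radius), min(h, y + radius + 1))
                     for nx in range(max(0, x - radius), min(w, x + radius + 1))
                     if not flagged[ny][nx]]
            if neigh:
                out[y][x] = Counter(neigh).most_common(1)[0][0]
    return out

labels = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
flagged = [[c == 9 for c in row] for row in labels]  # pretend the rules flagged the 9
print(reassign(labels, flagged))  # the centre pixel is reassigned to class 1
```

A real pipeline would operate on image tensors; this list-based version only illustrates the majority-vote rule.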
We evaluated our approach using different artificial neural networks trained for automatic semantic segmentation based on Laryngeal Endoscopic Images; the resulting framework relies on ASP programs to include structural and medical properties in DL-based techniques, paving the way to an effective combination of deductive and inductive methods.
Semantic image segmentation, also defined as pixel-level classification, refers to the task of segmenting an image into regions corresponding to meaningful objects and then assigning them an object category label [5, 9]. Notably, in medical contexts, semantic segmentation of images can be extremely useful to support clinicians in providing proper diagnoses, identifying pathological conditions and highlighting image regions related to specific diseases. In recent decades, Deep Learning (DL)-based approaches have shown a great ability in extracting meaningful information from different types of images (e.g., computed tomography, magnetic resonance imaging, endoscopic imaging). These approaches have been shown to be suitable for performing segmentation [1, 7, 11] and semantic segmentation [4, 8, 12] of medical images and, in general, for supporting automated diagnosis, surgical scene understanding and computer-assisted interventions [10]. Among several applications, DL-based approaches are used to support clinicians in confirming the size of tumors, identifying lesion sites and quantitatively evaluating the effect before and after treatments [6].
However, DL-based approaches show some limitations, such as the difficulty of (i) providing clear interpretations and explanations of the decisions made by the network, and (ii) driving the decisions according to prior knowledge, thus affecting successful deployment in real-life settings.
We propose the use of Answer Set Programming (ASP) [2] to steer neural networks decisions and refine the predicted output with the final aim of overcoming such limitations. Specifically, we create a rule-based model by encoding prior medical knowledge to compute all the admitted combinations of classes and, for each class, identify the wrong pixel location in medical images.
The proposed approach requires facing three main challenges: (i) the design of a model based on knowledge representation for describing domain medical knowledge, (ii) the design of a standard methodology to convert network predictions into logical rules over the data model mentioned above, and (iii) a proper interpretation of the output of ASP and its conversion into values understandable by the network, to determine the loss function in real time. Figure 1 shows the workflow of the proposed approach.
We implemented our deductive approach on top of the ASP system DLV, in particular its grounding subsystem I-DLV [3]. We tested our approach using different artificial neural networks (i.e., DeepLab-v3, SegNet, U-Net) for performing semantic segmentation of Laryngeal Endoscopic Images [4]^{1}. First results are promising: not only did the approach show slight improvements (according to the Intersection-over-Union metric), but it also proved flexible, easing the incorporation of explicit additional domain knowledge into the model. As for future work, we aim to investigate misclassification errors and improve the generalization capability of the model, as well as the overall performance.
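For reference, the Intersection-over-Union metric mentioned above can be computed per class as follows (a standard definition, not code from the paper):

```python
def iou(pred, truth, cls):
    """Intersection-over-Union for one class over flat label sequences:
    |pred ∩ truth| / |pred ∪ truth| restricted to pixels labelled cls."""
    inter = sum(p == cls and t == cls for p, t in zip(pred, truth))
    union = sum(p == cls or t == cls for p, t in zip(pred, truth))
    return inter / union if union else 1.0

pred  = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 1]
print(iou(pred, truth, 2))  # 2 pixels agree out of 4 in the union -> 0.5
```

A per-image score is usually obtained by averaging `iou` over all classes (mean IoU).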
We study a variation of the Stable Marriage problem, where every man and every woman express their preferences as preference lists which may be incomplete and contain ties. This problem is called the Stable Marriage problem with Ties and Incomplete preferences (SMTI). We consider three optimization variants of SMTI, Max Cardinality, Sex-Equal and Egalitarian, and empirically compare the following methods to solve them: Answer Set Programming, Constraint Programming, Integer Linear Programming. For Max Cardinality, we compare these methods with Local Search methods as well. We also empirically compare Answer Set Programming with Propositional Satisfiability, for SMTI instances.
Matching problems have been studied in economics, starting with the seminal paper of Gale and Shapley [3], which led to a Nobel Prize in 2012, utilizing game-theoretic methods with the goal of mechanism design. Matching problems are about markets where individuals are matched with individuals, firms, or items, typically across two sides. In each problem, preferences of individuals, firms, or items are given, possibly along with other information [2,1].
One of the well-known matching problems is the Stable Marriage Problem (SM). In SM, for a set of $n$ men and $n$ women, we are given the preferences of individuals: for each man (resp. woman), a complete ranking of the women (resp. the men) is specified as preferred partners. The goal is to marry all men and women (i.e., to find $n$ couples) in such a way that marriages are stable: no man and woman in different couples prefer each other to their partners.
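For the complete-preferences case just described, a stable matching always exists and can be found by the classical Gale-Shapley deferred-acceptance algorithm; a minimal Python sketch (with made-up preference lists) is:

```python
def gale_shapley(men_prefs, women_prefs):
    """Men propose in preference order; each woman holds her best
    proposal so far (deferred acceptance). Returns man -> woman."""
    rank = {w: {m: i for i, m in enumerate(ps)} for w, ps in women_prefs.items()}
    free = list(men_prefs)                  # men not yet engaged
    next_prop = {m: 0 for m in men_prefs}   # index of next woman to propose to
    engaged = {}                            # woman -> man
    while free:
        m = free.pop()
        w = men_prefs[m][next_prop[m]]
        next_prop[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:  # w prefers m to her fiancé
            free.append(engaged[w])
            engaged[w] = m
        else:
            free.append(m)                      # w rejects m
    return {m: w for w, m in engaged.items()}

men   = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
print(gale_shapley(men, women))  # {'m2': 'w1', 'm1': 'w2'} is stable
```

With ties and incomplete lists (the SMTI setting below), this simple procedure no longer suffices, which is what makes the optimisation variants hard.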
We consider a variant of SM, called SMTI, where rankings may be incomplete (i.e., some partners are not acceptable) or may include ties (i.e., some partners are preferred equally). We investigate three hard variants of SMTI [10,14], that aim to compute optimal stable matchings with respect to different measures of fairness: sex-equality (preferences of men and women are considered to be equally important), egalitarian (preferences of every individual are considered to be equally important), maximum cardinality (minimizes the number of singles).
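The three measures can be stated concretely with rank functions (our illustrative formulation, using lower rank = more preferred): writing $r_m(w)$ for the rank man $m$ gives woman $w$ and $r_w(m)$ symmetrically, a matching $M$ has sex-equality cost $|\sum_{(m,w)\in M} r_m(w) - \sum_{(m,w)\in M} r_w(m)|$, egalitarian cost $\sum_{(m,w)\in M} (r_m(w) + r_w(m))$, and cardinality $|M|$:

```python
def costs(matching, men_rank, women_rank):
    """Sex-equality, egalitarian, and cardinality measures of a matching.
    matching: dict man -> woman; *_rank: rank tables (lower = preferred)."""
    men_sum = sum(men_rank[m][w] for m, w in matching.items())
    women_sum = sum(women_rank[w][m] for m, w in matching.items())
    return {"sex_equal": abs(men_sum - women_sum),
            "egalitarian": men_sum + women_sum,
            "cardinality": len(matching)}

# Hypothetical SMTI instance: m2 finds only w1 acceptable (incomplete list).
men_rank   = {"m1": {"w1": 0, "w2": 1}, "m2": {"w1": 0}}
women_rank = {"w1": {"m2": 0, "m1": 1}, "w2": {"m1": 0}}
matching = {"m1": "w2", "m2": "w1"}
print(costs(matching, men_rank, women_rank))
```

The three optimisation problems then minimise sex-equality cost, minimise egalitarian cost, or maximise cardinality over the stable matchings, respectively.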
We present a variety of methods to solve these problems, using Answer Set Programming (ASP) [15,16,12], Integer Linear Programming (ILP) [9], Constraint Programming (CP) [8], and Local Search (including Hill-Climbing [13] and Genetic Algorithms [7]).
The ASP formulations of SMTI and its hard variants (Sex-Equal SMTI, Egalitarian SMTI, Max Cardinality SMTI) are novel; they are implemented for the ASP solver CLINGO [4].
We consider the ILP formulation of Max Cardinality SMTI, by Kwanashie and Manlove [11] as a basis, and introduce the ILP formulations for Sex-Equal SMTI and Egalitarian SMTI; they are implemented for Gurobi and Google-OR Tools (MIP and CP).
We consider the local search algorithms introduced by Gelain et al. [5] and Haas et al. [6] for Max Cardinality SMTI, and implement them with slight variations.
We compare these methods empirically over a large set of randomly generated instances. We believe that comparing different (but closely related) methods to solve hard problems is valuable to better understand their strengths.
The aim of Probabilistic Logic Programming is to extend the expressiveness of Logic Programming by adding the possibility to manage uncertainty over the data through the introduction of probabilistic facts representing discrete random variables. Some years ago, hybrid programs were proposed to also manage continuous random variables, but their semantics was imprecise. In this paper, we present a formal definition of a semantics that assigns a probability value to all queries for programs with a two-valued Well-Founded model. This paper is a summary of ([1]).
Probabilistic Logic Programs (PLPs) usually support only discrete probabilistic facts (i.e., facts that can be true or false with a certain probability) representing Bernoulli random variables. However, many models of real world tasks also require managing continuous values. Hybrid Probabilistic Logic Programs ([1,6]) extend traditional PLPs with the introduction of continuous random variables (rvs). The resulting probability space, defined as the product of the probability spaces of discrete and continuous rvs, is usually infinite. Furthermore, even in programs with only discrete probabilistic facts, the number of variables may be infinite (for example, when there is at least one function symbol and one constant). When function symbols, continuous random variables, and constraints are combined, the definition of a well-defined semantics is crucial to assign a probability value to queries.
We consider the probability space induced by the program as the product of the spaces of discrete and continuous random variables. Then, for every sample taken from the sample space for the whole program, a ground normal program is defined, called a world. If each world has a two-valued Well-Founded model, we call the probabilistic program sound. We prove that every sound program is well-defined (each query has an assigned probability). If the program fails to have all worlds with a two-valued Well-Founded model, other semantics must be used ([2]).
Over the years, several semantics for programs with both discrete and continuous random variables have been proposed, none of them managing function symbols. Hybrid ProbLog ([3]) was one of the first languages that allowed the definition of a finite set of continuous probabilistic facts. Similarly, continuous distributions are supported by Distributional Clauses (DC) ([4]), but negation in the body of rules is not allowed; this limitation is no longer present in HAL-ProbLog ([8]), one of its extensions. The same holds for Extended PRISM ([5]). Finally, the semantics of Probabilistic Constraint Logic Programs ([7]) constrains the sample space of the whole program to be countable, a requirement that may be violated when there is at least one constant and a function symbol, as explained before.
For example, consider a game where a player needs to pick a card from a deck and spin a wheel.
The card can be either blue or red and it is reinserted in the deck after every round.
The game continues until the player picks a red card or the axis of the wheel stops between 0 and $\pi$.
In this simple example, there are both discrete (color of card) and continuous (position of the axis of the wheel) random variables.
We may be interested in computing the probability that the player draws a red card at least once, or that he/she never draws one: to compute the probability of these two queries, we need to consider an infinite number of trials.
The following program models the aforementioned scenario:
```prolog
1/2 :: blue(X).
angle(_,X) : uniform_dens(X,0,6.28).
pick(0,blue) :- blue(0), angle(0,V), V > 3.14.
pick(0,red) :- \+ blue(0), angle(0,V), V > 3.14.
pick(s(X),blue) :- \+ pick(X,red), angle(X,V), V > 3.14,
                   blue(s(X)), angle(s(X),V1), V1 > 3.14.
pick(s(X),red) :- \+ pick(X,red), angle(X,V), V > 3.14,
                  \+ blue(s(X)), angle(s(X),V1), V1 > 3.14.
at_least_once_red :- pick(_,red).
never_red :- \+ at_least_once_red.
```
Both queries at_least_once_red and never_red have an infinite number of groundings, since the anonymous variable _ is unified iteratively with 0, s(0), s(s(0)), .... To compute their probabilities we need to consider, for example, mutually disjoint covering sets of worlds. See ([1]) for further details.
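As a sanity check on the example (our own sketch, not from the paper): under the reading above, a round results in a pick only when the angle exceeds $\pi$ (probability $1/2$), the pick is red with probability $1/2$, and the game continues only after a blue pick. A red card is thus first picked at round $n$ with probability $(1/4)^{n+1}$, giving $P(\mathtt{at\_least\_once\_red}) = \sum_{n\ge 0}(1/4)^{n+1} = 1/3$, which a quick Monte Carlo estimate confirms:

```python
import random

def at_least_once_red(rng):
    """Simulate one play of the card-and-wheel game as modelled above."""
    while True:
        blue = rng.random() < 0.5        # 1/2 :: blue(X).
        angle = rng.uniform(0, 6.28)     # angle(_,X) : uniform_dens(X,0,6.28).
        if angle <= 3.14:
            return False                 # no pick takes place: game over
        if not blue:
            return True                  # red picked: game over
        # blue picked with a large angle: another round is played

rng = random.Random(42)
n = 100_000
est = sum(at_least_once_red(rng) for _ in range(n)) / n
print(est)  # close to 1/3; P(never_red) is the complement, close to 2/3
```

Of course, sampling only approximates the semantics; the paper's contribution is assigning these probabilities exactly via the product measure over worlds.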
Abstract. Keyhole neurosurgery is one of the most hazardous procedures, due to the complexity of the brain environment and the high density of vital structures. Complying with the kinematic constraints of the probe selected to perform the procedure and with the preferences dictated by the surgeon's experience is essential to find the safest path to the surgical target. This work presents an optimisation and classification strategy for neurosurgical interventions. The framework relies on Answer Set Programming to translate the requirements and the expert's knowledge into the objectives of the optimisation procedure. The semantics of Answer Set Programming grants extensive flexibility, as it allows to easily change the requirements to optimise and their priority based on the specific needs of the clinical case. (This work appeared at ICRA 2021.)
Introduction. Keyhole Neurosurgery (KN) has become the gold standard in brain procedures, as it allows access to the brain via a small hole in the skull, minimising trauma [6]. Paramount to its success is an accurate and precise planning of the trajectory the probe has to follow to reach the Target Structure (TS) deep inside the brain. In this context, it is important to identify and comply with precise requirements, some strictly mandatory and others defined as "soft" constraints. The former must be fulfilled, and are strictly related to the kinematic constraints of the selected probe: for the trajectory to be feasible, it must not exceed the maximum curvature the needle can perform; moreover, a path cannot be chosen if it approaches delicate structures at a distance smaller than the radius of the needle. Soft constraints are instead dictated by implicit and explicit desiderata that guide the decision-making process of the neurosurgeon. These are used to express preferences, which can vary from procedure to procedure, and even from patient to patient [2]. Generally, they include minimisation of the path length to reduce risks in the motion of the needle, maximisation of the distance from obstacles to mitigate the damage of possible mistakes in the motion of the needle, and minimisation of the curvature performed by the needle to obtain a smoother path that facilitates the control of the probe.
Given these premises, the attempt to fulfil these requirements can be seen as an optimisation problem, where the soft constraints are the objectives. Classical methods, like graph- and sampling-based techniques [5,7] or learning-based approaches [8], generate a raw path that is then the input to the optimisation phase; the desiderata are translated into numerical constraints via cost functions. This standard optimisation approach presents several limitations when applied to KN. Indeed, the transposition of the preferences dictated by the clinician into ad hoc cost functions requires specific mathematical and programming skills, together with several hours of coding to implement them. Therefore, it is not possible to immediately implement new requirements. This lack of flexibility, and the impossibility of changing and adapting them in real time during the pre-operative phase, hinders their validity as optimisation techniques for KN.
Proposal. Interestingly, deductive reasoning, and in particular Answer Set Programming (ASP), can provide a valid alternative to overcome the drawbacks of standard optimisation approaches in the scenarios discussed herein. ASP is a declarative logic formalism that encodes computational problems via logic programs [1,3]. Indeed, its semantics allows to explicitly represent domain knowledge that can be used to intuitively implement the constraints of the complex brain environment and the surgeon's preferences. We herein present an optimisation and classification strategy that exploits the flexible semantics of ASP to find the optimal curvilinear trajectory in the pre-operative phase. The strategy takes as input not only the raw trajectory, but also the priority associated with each optimisation objective, which can differ from case to case. Moreover, we show how the optimisation and classification techniques have been integrated into a surgical simulator developed in Unity [4] to intuitively guide the neurosurgeon through all the steps of the decision-making process and to easily visualise the result of the planning. Experiments show the effectiveness and robustness of our approach.
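The idea of case-specific priorities over objectives can be illustrated procedurally; the Python sketch below (hypothetical objective names and costs, not the paper's ASP encoding) ranks candidate trajectories lexicographically by a surgeon-chosen priority order:

```python
def best_trajectory(candidates, priority):
    """Pick the candidate minimising the prioritised objectives
    lexicographically. Each candidate maps objective name -> cost;
    clearance from obstacles is negated so that every objective is minimised."""
    return min(candidates, key=lambda c: tuple(c[obj] for obj in priority))

candidates = [
    {"id": "t1", "length": 80.0, "curvature": 0.10, "neg_clearance": -4.0},
    {"id": "t2", "length": 75.0, "curvature": 0.30, "neg_clearance": -2.5},
    {"id": "t3", "length": 75.0, "curvature": 0.20, "neg_clearance": -3.0},
]
# Priorities can be reordered per clinical case without touching the costs.
print(best_trajectory(candidates, ["length", "curvature"])["id"])      # t3
print(best_trajectory(candidates, ["neg_clearance", "length"])["id"])  # t1
```

In the ASP setting, reordering priorities amounts to editing the declarative encoding rather than rewriting cost functions, which is precisely the flexibility the proposal exploits.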
This abstract summarizes an architecture that enables a robot to provide on-demand explanations of its decisions and beliefs ([1]). These explanations are in the form of descriptions comprising relations between relevant domain objects, object attributes, robot attributes, and robot actions. Such ``explainability'' will improve the underlying algorithms, establish accountability, and support more effective human-robot collaboration. However, state-of-the-art robot architectures often include knowledge-based reasoning methods (e.g., for planning) and data-driven learning methods (e.g., for recognizing objects). Providing transparency is challenging in such \emph{integrated robot systems} because the robot has to sense and interact with the physical world, reason with different descriptions of knowledge and uncertainty (e.g., logic-based descriptions of commonsense domain knowledge, probabilistic uncertainty quantification), and incrementally revise its domain knowledge (e.g., axioms governing action effects).
Towards achieving transparency in integrated robot systems, our architecture builds on cognitive systems research that indicates the benefits of formally coupling different representations, reasoning methods, and learning methods. Specifically, it supports the following key capabilities:
The figure above provides an overview of the architecture. Answer Set Prolog (ASP) is used to encode incomplete commonsense domain knowledge that includes some object attributes, spatial relations between objects, domain attributes, features from images of scenes, and axioms governing domain dynamics. The robot performs non-monotonic logical reasoning with this knowledge to compute plans to achieve any given goal and/or to complete the estimation tasks, using probabilistic reasoning when necessary. If the robot is unsuccessful or achieves an incorrect outcome, this is considered to indicate missing or incorrect knowledge. The robot automatically identifies and uses regions of interest (ROIs) in the relevant images to train and use deep networks to perform these tasks. Information from the ROIs is also used to induce previously unknown axioms and to revise existing axioms. The robot also parses human input to identify the query type (e.g., descriptive, contrastive, counterfactual), and traces relevant beliefs and axioms to construct relational descriptions that respond to these queries. Experimental results indicate the ability to: (i) make decisions reliably and efficiently despite incomplete knowledge and noisy sensor inputs; (ii) incrementally learn previously unknown constraints, and preconditions and effects of actions; and (iii) accurately explain decisions and beliefs.
As an example, consider the example scene shown above. The following interaction takes place after the robot has executed a plan to successfully move the red cube on the orange cube.