Analyzing Robustness of Angluin's L* Algorithm in Presence of Noise

Igor Khmelnitsky
(Université Paris-Saclay, CNRS, ENS Paris-Saclay, INRIA, LMF, France)
Serge Haddad
(Université Paris-Saclay, CNRS, ENS Paris-Saclay, INRIA, LMF, France)
Lina Ye
(Université Paris-Saclay, CNRS, ENS Paris-Saclay, CentraleSupélec, LMF, France)
Benoît Barbot
(Université Paris-Est Créteil, France)
Benedikt Bollig
(Université Paris-Saclay, CNRS, ENS Paris-Saclay, LMF, France)
Martin Leucker
(Institute for Software Engineering and Programming Languages, Universität zu Lübeck, Germany)
Daniel Neider
(Carl von Ossietzky University of Oldenburg, Germany)
Rajarshi Roy
(Max Planck Institute for Software Systems, Germany)

Angluin's L* algorithm learns the minimal (complete) deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probabilistic approximatively correct (PAC) version substitutes an equivalence query by a large enough set of random membership queries to get a high level confidence to the answer. Thus it can be applied to any kind of (also non-regular) device and may be viewed as an algorithm for synthesizing an automaton abstracting the behavior of the device based on observations. Here we are interested on how Angluin's PAC learning algorithm behaves for devices which are obtained from a DFA by introducing some noise. More precisely we study whether Angluin's algorithm reduces the noise and produces a DFA closer to the original one than the noisy device. We propose several ways to introduce the noise: (1) the noisy device inverts the classification of words w.r.t. the DFA with a small probability, (2) the noisy device modifies with a small probability the letters of the word before asking its classification w.r.t. the DFA, and (3) the noisy device combines the classification of a word w.r.t. the DFA and its classification w.r.t. a counter automaton. Our experiments were performed on several hundred DFAs.

Our main contributions, bluntly stated, consist in showing that: (1) Angluin's algorithm behaves well whenever the noisy device is produced by a random process, (2) but poorly with a structured noise, and, that (3) almost surely randomness yields systems with non-recursively enumerable languages.

In Pierre Ganty and Dario Della Monica: Proceedings of the 13th International Symposium on Games, Automata, Logics and Formal Verification (GandALF 2022), Madrid, Spain, September 21-23, 2022, Electronic Proceedings in Theoretical Computer Science 370, pp. 81–96.
Published: 20th September 2022.

ArXived at: https://dx.doi.org/10.4204/EPTCS.370.6 bibtex PDF
References in reconstructed bibtex, XML and HTML format (approximated).
Comments and questions to: eptcs@eptcs.org
For website issues: webmaster@eptcs.org