Published: 21st August 2017
DOI: 10.4204/EPTCS.252
ISSN: 2075-2180

EPTCS 252

Proceedings 15th International Conference on
Automata and Formal Languages
Debrecen, Hungary, September 4-6, 2017

Edited by: Erzsébet Csuhaj-Varjú, Pál Dömösi and György Vaszil

Preface Erzsébet Csuhaj-Varjú, Pál Dömösi and György Vaszil
Invited Presentation: Automata and Formal Languages for Next Generation Sequencing Data Paola Bonizzoni and Gianluca Della Vedova	1
Invited Presentation: Representation of Information in Automata Henning Bordihn	3
Invited Presentation: On Regular Images of Regular Languages under Extended Word Operations Szilárd Zsolt Fazekas	5
Invited Presentation: Networks of Genetic Processors: From Universality to a Formal Language Characterization José M. Sempere	7
Invited Presentation: Rearrangement Problem of Multidimensional Arrays by Prefix Reversals Akihiro Yamamura	9
(Tissue) P Systems with Vesicles of Multisets Artiom Alhazov, Rudolf Freund, Sergiu Ivanov and Sergey Verlan	11
Unavoidable Sets of Partial Words of Uniform Length Joey Becker, F. Blanchet-Sadri, Laure Flapan and Stephen Watkins	26
On the Descriptional Complexity of Operations on Semilinear Sets Simon Beier, Markus Holzer and Martin Kutrib	41
Dyck Words, Lattice Paths, and Abelian Borders F. Blanchet-Sadri, Kun Chen and Kenneth Hawes	56
Constructing Words with High Distinct Square Densities F. Blanchet-Sadri and S. Osborne	71
Higher-Order Operator Precedence Languages Stefano Crespi Reghizzi and Matteo Pradella	86
The Triple-Pair Construction for Weighted ω-Pushdown Automata Manfred Droste, Zoltán Ésik and Werner Kuich	101
Reversible Languages Having Finitely Many Reduced Automata Kitti Gelle and Szabolcs Iván	114
Input-Driven Double-Head Pushdown Automata Markus Holzer, Martin Kutrib, Andreas Malcher and Matthias Wendlandt	128
Weakly and Strongly Irreversible Regular Languages Giovanna J. Lavado, Giovanni Pighizzini and Luca Prigioniero	143
Descriptional Complexity of Non-Unary Self-Verifying Symmetric Difference Automata Laurette Marais and Lynette van Zijl	157
CD Grammar Systems with Two Propagating Scattered Context Components Characterize the Family of Context Sensitive Languages Alexander Meduna and Jakub Martiško	170
Derivatives of Parsing Expression Grammars Aaron Moss	180
A New Sensing 5'–>3' Watson-Crick Automata Concept Benedek Nagy, Shaghayegh Parchami and Hamid Mir-Mohammad-Sadeghi	195
Exact Affine Counter Automata Masaki Nakanishi, Kamil Khadiev, Krišjānis Prūsis, Jevgēnijs Vihrovs and Abuzer Yakaryılmaz	205
On h-Lexicalized Restarting Automata Martin Plátek and Friedrich Otto	219
Generalized Results on Monoids as Memory Özlem Salehi, Flavio D'Alessandro and A. C. Cem Say	234

Preface

The 15th International Conference on Automata and Formal Languages (AFL 2017) was held in Debrecen, September 4-6, 2017. It was organized by the Faculty of Informatics, University of Debrecen and the Faculty of Informatics, Eötvös Loránd University, Budapest.

The AFL conference series was initiated by Prof. István Peák (1936-1989). He organized the AFL conferences in 1980, 1982, 1984, 1986 and 1988 and started the organization of AFL'90. These conferences were all held in the hills around Salgótarján. In 1986 and 1988, the title of the conference was Automata, Languages and Programming Systems. Since the untimely death of Prof. István Peák in 1990, the AFL conferences have been organized in every third year. Professors András Ádám and Pál Dömösi took the lead in the continuation of the series. In 1993 and 1996, two more "Salgótarján conferences" took place, the last three conferences of the series were held in Balatonfüred (AFL 2008), Debrecen (AFL'11), and Szeged (AFL 2014).

AFL 2017 intended to cover all important areas of automata and formal language theory and their applications such as grammars, acceptors and transducers for strings, trees, graphs, arrays, algebraic theories for automata and languages, combinatorial properties of words and languages, formal power series, decision problems, efficient algorithms for automata and languages, relations to complexity theory and logic, picture description and analysis, quantum computing, cryptography, concurrency, applications of automata and language theory in biology, natural language processing, and other fields.

The Programme Committee invited lectures from

Paola Bonizzoni (University of Milan-Bicocca, Italy),
Henning Bordihn (University of Potsdam, Germany),
Szilárd Zsolt Fazekas (Akita University, Japan),
José M. Sempere (Technical University of Valencia, Spain),
Akihiro Yamamura (Akita University, Japan).

A special memorial session in honor of late Zoltán Ésik, steering committee member and organizer of AFL conferences, consisted of talks presented by

Szabolcs Iván (University of Szeged, Hungary) and
Werner Kuich (Technical University of Vienna, Austria).

This volume contains the abstracts of invited lectures and the texts of the 17 papers accepted from the 23 submissions to AFL 2017.

The members of the International Program Committee were

Francine Blanchet-Sadri (Greensboro),
Victor Bovdi (Al Ain),
Erzsébet Csuhaj-Varjú (Budapest, co-chair),
Jürgen Dassow (Magdeburg),
Pál Dömösi (Debrecen, co-chair),
Henning Fernau (Trier),
Zoltán Fülöp (Szeged),
Viliam Geffert (Košice),
Marian Gheorghe (Bradford),
Oscar H. Ibarra (Santa Barbara, CA),
Szabolcs Iván (Szeged),
Galina Jirásková (Košice),
Juhani Karhumäki (Turku),
Alica Kelemenová (Opava),
Miklós Krész (Szeged),
Martin Kutrib (Giessen),
György Maróti (Veszprém),
Alexander Meduna (Brno),
Victor Mitrana (Bucharest, Madrid),
Andreas Maletti (Stuttgart),
Benedek Nagy (Famagusta),
Chrystopher Nehaniv (Hertfordshire),
Friedrich Otto (Kassel),
Gheorghe Păun (Bucharest),
Giovanni Pighizzini (Milano),
Agustín Riscos-Núñez (Sevilla),
Jeffrey Shallit (Waterloo, Canada),
Petr Sosík (Opava),
Bianca Truthe (Giessen),
György Vaszil (Debrecen, co-chair),
Laurent Vuillon (Chambéry),
Claudio Zandron (Milano).

We thank the members of the Programme Committee for the evaluation of the submissions and their subreferees for their excellent cooperation in this work. We are grateful to the contributors of the conference AFL 2017, in particular to the invited speakers for presenting interesting new developments. We are grateful to the Department of Computer Science and the Faculty of Informatics of the University of Debrecen for the local organization and for the financial support of AFL 2017.

AFL 2017 Organizing Committee

Géza Horváth (Debrecen, co-chair)
Ernőné Kása (Debrecen)
György Vaszil (Debrecen, co-chair)
Ildikó Cserhátiné Vecsei (Debrecen)

AFL Steering Committee

András Ádám (Budapest, honorary chair)
István Babcsányi (Budapest)
Erzsébet Csuhaj-Varjú (Budapest)
Pál Dömösi (Debrecen, chair)
Zoltán Fülöp (Szeged)
László Hunyadvári (Budapest)
László Kászonyi (Szombathely)
Attila Nagy (Budapest)
György Vaszil (Debrecen)

List of subreviewers

Zuzana Bednárová, Emilie Charlier, Lucie Ciencialová, James Currie, Stefan Dück, Zsolt Gazdag, Kitti Gelle, Markus Holzer, Géza Horváth, Sylvain Lombardy, Ian McQuillan, Robert Mercaş, Mark-Jan Nederhof, Pascal Ochem, Erik Paul, Luca Prigioniero, Michael Rao, Juraj Šebej, Shinnosuke Seki, Sándor Vágvölgyi, Matthias Wendlandt.

Automata and Formal Languages for Next Generation Sequencing Data

Paola Bonizzoni (Department of Informatics, Systems and Communications, University of Milan-Bicocca)
Gianluca Della Vedova (Department of Informatics, Systems and Communications, University of Milan-Bicocca)

Abstract The advent of Next Generation Sequencing (NGS) technologies is characterized by the production of massive data consisting of short sequences (reads) and it has enabled the completion of several sequencing projects on entire populations.

This talk focuses on the application of concepts from automata theory and formal languages to Bioinformatics, motivated by the repetitive nature that is intrinsic to the sequences that are produced with NGS technologies.

The analysis of such data raises the need of formal tools that are able to take advantage of the similarities among sequences to index and compare NGS data in a memory (and hence time) efficient manner. We will explore some recent applications of Burrows-Wheeler Transform, Lyndon factorization and automata theory to the analysis of genomic data.

A vast collection of computational problems in Bioinformatics are strongly dependant on the type of available biological data. Thus the formalization, the computational complexity, and the algorithmic solutions of many of these problems are affected by the intrinsic characteristics of the input data, both in terms of quality and quantity. A clear example is represented by the computational projects on sequencing data, where the algorithmic approaches and the data structures employed to attack the de novo assembly problem have changed, following the advancement of sequencing technologies. Today sequencing technologies have brought a revolution in the common practice of researchers working in molecular biology. Sequencing any organism is feasible, cheap, and can be completed quickly: thus biologists may use hundreds of millions of short sequences (reads) for their genomic studies. This requires new computational tools that are able to deal with this huge amount of sequences data, in turn encouraging the development of novel formal approaches to the comparison, analysis and managing of sequencing data, known as NGS (next generation sequencing) data. Computer Science research on NGS has focused initially on the problem of representing such data with different formal approaches, where the representation in many cases is the first step towards to the analysis and comparison of the data. Here we try to survey and summarize some theoretical frameworks that have been developed in the field, and how they have changed our perspectives towards the topic of data structures and algorithms for molecular biology. A main characteristic of NGS data has guided these frameworks: the repetitive nature of reads due to the fact that they are extracted by the same sequence (the reference genome): most importantly, some input sequences can be viewed as a shifted version of another.

We will discuss the following main notions that have recently gained a strong attention in Bioinformatics, but have their foundations in stringology and formal language theory.

The Burrows-Wheeler Transform
The investigation of the repetitive nature of a collection of strings has led to the use in Bioinformatics of two main notions developed in the context of compression and text indexing: the Burrows-Wheeler Transform [5] and the FM-index for a collection of strings. The BWT has extraordinarily improved the efficiency of performing fundamental operations on sequences in Bioinformatics, such as sequence alignment, thanks to its capability to provide a succinct representation of biological data.
We will discuss how to compute the overlap graph, which is a simple and compact representation of all prefix-suffix matches of a collection of strings. In [6] it has been shown a linear time algorithm O(nm) (being n the number of sequences, each of length m) to compute an overlap graph. The first lightweight approach for building an overlap graph has been given in [4], [3], leading to fast approach that directly works on the FM-index of strings [2] .
The Lyndon factorization The Lyndon word and the notion of Lyndon factorization have recently attracted the attention of the string processing community, thanks to its connections to the BWT, and to its various applications. The Lyndon factorization has been used to define some variants of the BWT. It has a linear time algorithm that makes it interesting for various applications. Here we discuss a notion of inverse Lyndon factorization in a formal language setting [1] with the main purpose of investigating applications of the Lyndon factorization to indexing collection of strings in Bioinformatics.
Automata Theory and NGS indexing Automata Theory has played a key role in string processing, mainly for developing algorithmic pattern matching. The use of automata for the NGS data has been proposed recently [7] . Finite automaton representation of multiple alignment of individual genomes is a key ingredient of generalized compressed suffix array (GCSA) [7]: a novel index data structures to represent the information of multiple genomes.

References

Paola Bonizzoni, Clelia De Felice, Rocco Zaccagnino & Rosalba Zizza (2017): Inverse Lyndon words and Inverse Lyndon factorizations of words. CoRR abs/1705.10277. Available at http://arxiv.org/abs/1705.10277.
Paola Bonizzoni, Gianluca Della Vedova, Yuri Pirola, Marco Previtali & Raffaella Rizzi (2016): FSG: Fast String Graph Construction for De Novo Assembly of Reads Data. In: Bioinformatics Research and Applications - 12th International Symposium, ISBRA 2016, Minsk, Belarus, June 5-8, 2016, Proceedings, pp. 27–39, doi:10.1007/978-3-319-38782-6_3.
Paola Bonizzoni, Gianluca Della Vedova, Yuri Pirola, Marco Previtali & Raffaella Rizzi (2016): LSG: An External-Memory Tool to Compute String Graphs for Next-Generation Sequencing Data Assembly. Journal of Computational Biology 23(3), pp. 137–149, doi:10.1089/cmb.2015.0172.
Paola Bonizzoni, Gianluca Della Vedova, Yuri Pirola, Marco Previtali & Raffaella Rizzi (2017): An External-Memory Algorithm for String Graph Construction. Algorithmica 78(2), pp. 394–424, doi:10.1007/s00453-016-0165-4.
M. Burrows & D. J. Wheeler (1994): A block-sorting lossless data compression algorithm. Technical Report. Digital Systems Research Center.
Jared Simpson & Richard Durbin (2012): Efficient de novo assembly of large genomes using compressed data structures. Genome Res. 22, pp. 549–556, doi:10.1101/gr.126953.111.
Jouni Sirén, Niko Välimäki & Veli Mäkinen (2014): Indexing Graphs for Path Queries with Applications in Genome Research. IEEE/ACM Trans. Comput. Biol. Bioinformatics 11(2), pp. 375–388, doi:10.1109/TCBB.2013.2297101.

Representation of Information in Automata

Henning Bordihn (Institut für Informatik, Universität Potsdam)

One of the challenging tasks in the theory of formal languages is the following one: given a language $L$ and a family of languages $\cal{L}$, prove that $L$ is not contained in $\cal{L}$. In particular cases, a series of necessary conditions for a language to be contained in a family is known such as, for example, necessary conditions expressed by pumping lemmas or by the interchange lemma for context-free languages[3]. Unfortunately, there is a considerable number of languages that meet some or all of the known necessary conditions to be contained in a family $\cal{L}$, but those languages are not contained in $\cal{L}$ or it is not known whether or not they belong to $\cal{L}$. A prominent example is the open problem of whether the language $Q$ of all primitive words is context-free [2].

In the current work, a different approach is developed. Many important families of languages are characterized by some class of automata. We aim to disprove that a given language $L$ is contained in family $\cal{L}$ by showing that $L$ cannot be accepted by an automaton from the class characterizing $\cal{L}$. The main issue in proofs like that will be to verify two facts:

On the one hand, if some appropriately chosen word in $L$ is accepted by an automaton, then a certain piece of information must be represented in the state that the automaton reaches after having processed some part of the input word.
On the other hand, this piece of information about the input word cannot be present in the state at that time or it cannot be accessed when needed in order to complete the computation successfully.

Here, we make use of the observation that one can think of computations as being guided by, or containing or using information. A state $p$, for instance, contains the information about which computations can originate at $p$ given suitable inputs. In order to keep the concepts as independent of particular types of automata as possible, we introduce them for a model that has an unstructured storage type. Moreover, it has the following properties: it is one-way, that is, it cannot re-read an input symbol; it is non-deterministic; it has a non-deterministic start state; it is real-time, that is, it reads exactly one input symbol in every step. It is easily shown that every language, not containing the empty word, is accepted by an automaton of this kind, even by a real-time automaton with a single start state and a single accepting state, regardless of computability issues. Clearly, the usual subset construction allows to turn any automaton of this kind into a deterministic one. In several cases, one needs to represent the states in a specific way as to model the access to information contained in a state. For this purpose, a state may, for instance, be a word over a state representation alphabet. One class of automata using such a representation of states is that of single-state push-down automata accepting all context-free languages.

Notions like those of equivalence, unavoidability and informational containment for states or sets of states are introduced. We show general properties of the computations of automata on the basis of these notions. Then the following concepts are derived: information contents of states, establishing information, preserving information, and using information contained in states.

Finally, the case of single-state pushdown automata (in some normal form) are treated in more detail. (A preliminary version of these considerations have been published in [1] which has been strongly revised.) The concepts developed are used in order to provide a new necessary condition for languages to be context-free. It is demonstrated how to apply this statement in order to prove that particular languages are not context-free. Advantages of the new proof technique over the pumping lemmas are discussed.

References

H. Bordihn & H. Jürgensen (2004): Pushdown Information. In: L. Ilie & D. Wotschke: Descriptional Complexity of Formal Systems (DCFS 2004), Proceedings, Report No. 619. Department of Computer Science, The University of Western Ontarion, London, Ontario, pp. 111–120.
P. Dömösi & M. Ito (2014): Context-Free Languages and Primitive Words. World Scientific, doi:10.1142/7265.
J. van Leeuwen (1990): Handbook of Theoretical Computer Science, Vol B: Formal Models and Semantics. MIT Press/Elsevier.

On Regular Images of Regular Languages under Extended Word Operations

Szilárd Zsolt Fazekas (Akita University)

Regular languages are perhaps the most thoroughly studied class of the Chomsky hierarchy. Because of the wealth of existing results and several alternative characterizations in terms of grammars, finite automata and logic, most decidability questions have been answered or are easy to deduce from previous results. Some very challenging problems for regular languages, especially concerning decidability, arose from applying word operations to all words in a regular language and asking under what conditions the result is still regular. We give an overview of the characterizations of images of regular languages under some, mostly biologically inspired such operations. Where appropriate, we discuss the related decidability problem, i.e., for an operation op is it decidable whether the image op(L) of a given regular language L is regular?

We start the survey with duplication, an operation inspired by the so called tandem repeats in genetic sequences, which takes a factor of a word and inserts a copy of it right next to the original occurrence. This operation has been extensively studied in theoretical computer science and bioinformatics. As a nice example of applying Ehrenfeucht's generalized version of the Myhill-Nerode theorem, Bovet and Varricchio [2] proved that closure under duplication, in their terminology copy languages, of arbitrary recursively enumerable languages over a binary alphabet is regular, by showing that iterated duplication induces a well-quasi order on binary words.

Next, we look at a similar operation, albeit one which had a different inspiration. Calbrix and Nivat defined the power of a language L in a paper about prefix and period languages of rational $\omega$-languages [4]: $pow(L) = \{w^i | w \in L, i \geq 1\}$. The image of a regular language is not always regular, for instance, the power of the regular language $ab^*$ is not even context-free. Calbrix and Nivat posed the problem of characterizing those regular languages whose powers are also regular, and to decide whether a given regular language has this property. After partial solutions given by Cachat [3] and Horváth et al. [9] the final - positive - answer was given through a characterization and decision algorithm [6] which exemplifies a theme common to the solution of another problem on our list, the regularity of pseudopalindromic closures.

Several operations on sequences were introduced which are either directly motivated by the biological phenomenon called stem-loop completion, or are very similar in nature to it. The mathematical hairpin concept refers to a word in which some suffix (or prefix) is the mirrored complement of a middle factor of the word, i.e., $uxv\bar{x}$, where $\bar{x}$ is the reverse of the letter-by-letter complement of x. The hairpin completion operation, which extends such a word into one with matching ends, was studied both as a one-step operation and as an iterated one. The aforementioned questions regarding regularity of the image have been settled for one-step completion [5]. In the case of iterated completion, the state-of-the-art is a recent promising result on the completion of so called (2,2)-crossing words [11], which may allow a generalization to more complicated crossing words and thus to the solution of the problem for singleton starting languages.

Motivated by the open problem of iterated hairpin completion, a restricted version of it, pseudopalindromic completion was introduced in [7]. Pseudopalindromes are words that are fixed points for an antimorphic involution. In the case of pseudopalindromic completion, symbols are added to either side of the word such that the newly obtained words are pseudopalindromes. This restriction proved strong enough to allow a characterization of regular languages that are closed under this operation and to show that the regularity of the closure under the operation is decidable.

Finally, we discuss results about the regularity aspect in the case of recently introduced related operations: hairpin lengthening, which is another restricted version of hairpin completion, and its inverse, hairpin shortening [10]; prefix-suffix duplication, which is a duplication restricted to prefixes and suffixes, and its inverse, prefix-suffix reduction, introduced in [8] and [1], respectively.

References

P. Bottoni, A. Labella & V. Mitrana (2017): Prefix-suffix square reduction. Theor. Comput. Sci. 682, pp. 49–56, doi:10.1016/j.tcs.2016.12.005.
D.P. Bovet & S. Varricchio (1992): On the regularity of languages on a binary alphabet generated by copying systems. Inf. Proc. Lett. 44(3), pp. 119 – 123, doi:10.1016/0020-0190(92)90050-6. Available at http://www.sciencedirect.com/science/article/pii/0020019092900506.
T. Cachat (2001): The Power of One-Letter Rational Languages. In: W. Kuich, G. Rozenberg & A. Salomaa: Developments in Language Theory, 5th International Conference, DLT 2001, Vienna, Austria, July 16-21, 2001, Revised Papers, Lecture Notes in Computer Science 2295. Springer, pp. 145–154, doi:10.1007/3-540-46011-X_11.
H. Calbrix & M. Nivat (1995): Prefix and Period Languages of Rational omega-Languages. In: J. Dassow, G. Rozenberg & A. Salomaa: Developments in Language Theory II, At the Crossroads of Mathematics, Computer Science and Biology, Magdeburg, Germany, 17-21 July 1995. World Scientific, Singapore, pp. 341–349.
V. Diekert, S. Kopecki & V. Mitrana (2009): On the Hairpin Completion of Regular Languages. In: M. Leucker & C. Morgan: Theoretical Aspects of Computing - ICTAC 2009, 6th International Colloquium, Kuala Lumpur, Malaysia, August 16-20, 2009. Proceedings, Lecture Notes in Computer Science 5684. Springer, pp. 170–184, doi:10.1007/978-3-642-03466-4_11.
S.Z. Fazekas (2011): Powers of Regular Languages. Int. J. Found. Comput. Sci. 22(2), pp. 323–330, doi:10.1142/S0129054111008064.
S.Z. Fazekas, F. Manea, R. Mercas & K. Shikishima-Tsuji (2014): The pseudopalindromic completion of regular languages. Inf. Comput. 239, pp. 222–236, doi:10.1016/j.ic.2014.09.001.
J. García-López, F. Manea & V. Mitrana (2014): Prefix-suffix duplication. J. Comput. Syst. Sci. 80(7), pp. 1254–1265, doi:10.1016/j.jcss.2014.02.011.
S. Horváth, P. Leupold & G. Lischke (2002): Roots and Powers of Regular Languages. In: M. Ito & M. Toyama: Developments in Language Theory, 6th International Conference, DLT 2002, Kyoto, Japan, September 18-21, 2002, Revised Papers, Lecture Notes in Computer Science 2450. Springer, pp. 220–230, doi:10.1007/3-540-45005-X_19.
F. Manea, R. Mercas & V. Mitrana (2012): Hairpin Lengthening and Shortening of Regular Languages. In: H. Bordihn, M. Kutrib & B. Truthe: Languages Alive - Essays Dedicated to Jürgen Dassow on the Occasion of His 65th Birthday, Lecture Notes in Computer Science 7300. Springer, pp. 145–159, doi:10.1007/978-3-642-31644-9_10.
K. Shikishima-Tsuji (2016): Regularity of Iterative Hairpin Completions of Crossing (2, 2)-Words. Int. J. Found. Comput. Sci. 27(3), pp. 375–390, doi:10.1142/S0129054116400153.

Networks of Genetic Processors: From Universality to a Formal Language Characterization

José M. Sempere (Department of Information Systems and Computation, Universitat Politècnica de València)

Networks of Bio-inspired Processors (NBP) [3] are a family of models of computation inspired by different biological processes over biomacromolecules (mainly, DNA and RNA molecules). These operations can be applied over strings to define formal languages. Any bio-inspired processor contains a multiset of strings, where every string can be found in an arbitrarily large number of copies. The processors apply the operations that they have defined over the strings in a massively parallel and non-deterministic manner, and they obtains a (probably) new multiset of strings. A network model arranges the connection of a finite number of bio-inspired processors. The processors can be equipped with input/output filters to receive/send the strings from/to other processors connected to it. The network computes by alternating two kind of computation steps: operation application and string communication. These models have been formulated as accepting and generating devices.

The first model of NBPs was formulated by V. Mitrana (the founder of this research topic) as Networks of Evolutionary Processors (NEP) [1]. There, the operations in the processors were inspired by point mutation over the nucleotides in DNA. Later, the splicing biological process during the DNA transcription and RNA translation inspired the Networks of Splicing Processors (NSP) [2].

The Networks of Genetic Processors (NGP) were inspired by NEPs and NSPs, by using some operations previously defined over them. At every processor, only symbol substitutions or full string crossover are applied. Hence, the model can be easily related to classical Genetic Algorithms. It has been proved that NGPs are computationally complete given that they can simulate any Turing machine, if an accepting version of the model is defined [4]. The generating version of this model can be used to introduce a descriptive complexity measure of formal languages defined according to the classical Chomsky's hierarchy: three, four, six or eight processors can be used to characterize the regular, context-free, context-sensitive and recursively enumerable languages [6].

In addition, the NGPs can be used to formally prove the computational completeness of Genetic Algorithms, if some restrictions are imposed [4], and they have been fruitfully used to solve hard optimization combinatorial problems efficiently [5].

References

J. Castellanos, C. Martín-Vide, V. Mitrana & J.M. Sempere (2003): Networks of evolutionary processors. Acta Informatica 39(6-7), pp. 517–529, doi:10.1007/s00236-003-0114-y.
F. Manea, C. Martín-Vide & V. Mitrana (2005): Accepting Networks of Splicing Processors. In: S.B. Cooper, B. Löwe & L. Torenvliet: New Computational Paradigms. CiE 2005, Lecture Notes in Computer Science Vol. 3526, pp. 300–309 Springer. doi:10.1007/11494645_38.
F. Arroyo, J. Castellanos, V. Mitrana, E. Santos & J.M. Sempere (2012): Networks of Bio-inspired processors. Triangle 7, pp. 3–22.
M. Campos & J.M. Sempere (2012): Accepting Networks of Genetic Processors are Computationally Complete. Theoretical Computer Science 456, pp. 18–29, doi:10.1016/j.tcs.2012.06.028.
M. Campos & J.M. Sempere (2013): Solving Combinatorial Problems with Networks of Genetic Processors. International Journal "Information Technologies and Knowledge" 7(1), pp. 65–71.
M. Campos & J.M. Sempere (2017): A Characterization of Formal Languages through Networks of Genetic Processors. (submitted).

Rearrangement Problem of Multidimensional Arrays by Prefix Reversals

Akihiro Yamamura (Akita University)

The problem of sorting lists by prefix reversals, as known as the pancake sorting problem, is to sort randomly piled pancakes of different size. A prefix reversal is an operation of reversing elements in the list including the beginning element. The minimum number of prefix reversals to sort a given list of integers into an ascending order was asked by Gates and Papadimitriou in [4] and an upper bound $\frac{5}{3}n$ for sorting a list of length $n$ was given in [4]. They conjectured that $\frac{19}{16}n$ is required, however, it is disproved in [5]. A better bound $\frac{18}{11}n$ was given in [2]. On the other hand, the pancake sorting problem is shown to be NP-hard in [1].

The pancake sorting problem can be extended to a rearrangement problem of two dimensional arrays by prefix reversals as follows. Suppose $A$ is an $n\times m$ array. Then $A$ comprises of $n \times m$ cells. We denote the entry in the $(i,j)$ position of $A$ by $a_{ij}$ (see Figure 1). The standard array $E_{n\times m}$ of size $n\times m$ is defined to be $(e_{ij})$ where $e_{ij}$ is the integer $(i-1)\times m + j$ (see Figure 2). We denote $A = \frac{\displaystyle A_1}{\displaystyle A_2}$ if $A$ comprises of an upper block $A_1$ and a lower block $A_2$, or $A = A_3 | A_4$ if $A$ comprises of a left block $A_3$ and a right block $A_4$. For an $n\times m$ array $A = (a_{ij})$, the {\it reversal} of $A$ is the $n\times m$ array $(b_{ij})$ such that $b_{ij} = a_{n-i+1, m-j+1}$ for every $(i,j)$. We denote it by $R(A)$ (see Figure 3). The transformation $\frac{\displaystyle A_1}{\displaystyle A_2} \Rightarrow \frac{\displaystyle R(A_1)}{\displaystyle A_2}$ is called a horizontal prefix reversal. The transformation $A_3 | A_4 \Rightarrow R(A_3) | A_4$ is called a vertical prefix reversal.

Red dot

Unlike the pancake sorting problem, it is not clear whether an array can be rearranged to a standard array. A necessary and sufficient condition that an array can be rearranged is obtained in [7]. A rearrangement is always possible if and only if either the number of rows or the number of columns is not divisible by 4. Moreover, only even permutations can be realized in the case that the numbers of both rows and columns are divisible by 4. We shall extend a rearrangement problem of two dimensional arrays toward two directions.

Another variant of the pancake sorting problem is called the burnt pancake sorting problem and studied many authors (e.g. [3], [6]). It is to sort a given list of signed integers into an ascending order of positive integers. Instead of signed integers, we use upper and lower case letters (or blue and red letters). In this problem, cases of letters are altered whenever we apply a prefix reversal.

\begin{equation*} \begin{pmatrix} e & d & A & c & b \end{pmatrix} \Rightarrow \begin{pmatrix} B & C & a & D & E \end{pmatrix} \Rightarrow \begin{pmatrix} c & b & a & D & E \end{pmatrix} \Rightarrow \begin{pmatrix} A & B & C & D & E \end{pmatrix} \end{equation*}

We shall generalize the burnt pancake problem to a rearrangement problem of two dimensional arrays filled with upper and lower case letters by prefix reversals. It is clear that the problem of linear strings ($1 \times n$ arrays) is equivalent to the original burnt pancake sorting problem. Although we have an algorithm to sort a linear string, a rearrangement problem for two dimensional arrays filled with upper and lower case letters is not trivial. The array below can be rearranged form the standard array, where the standard array of size $n \times m$ is the array in which every cell is filled with an upper case letter and is denoted by $E_{n\times m}$. We suppose $U$ is a set of upper case letters $\{ A,B,C,D,E,F\}$, $L$ is a set of lower case letters $\{ a,b,c,d,e,f\}$ and $c$ is a one-to-one correspondence between same letters with the opposite cases.

\begin{equation*} \begin{bmatrix} c & f & e \\ b & D & A \end{bmatrix} \Rightarrow \begin{bmatrix} E & F & C \\ b & D & A \end{bmatrix} \Rightarrow \begin{bmatrix} d & B & C \\ f & e & A \end{bmatrix}\Rightarrow \begin{bmatrix} a & E & F \\ c & b & D \end{bmatrix} \Rightarrow \end{equation*} \begin{equation*} \begin{bmatrix} f & e & A \\ c & b & D \end{bmatrix} \Rightarrow \begin{bmatrix} d & B & C \\ a & E & F \end{bmatrix} \Rightarrow \begin{bmatrix} A & B & C \\ D & E & F \end{bmatrix} \end{equation*}

Furthermore, we extend the rearrangement problem to a rearrangement problem of three dimensional arrays using the result for a rearrangement problem of two dimensional arrays filled with upper and lower case letters by prefix reversals. We discuss rearrangability of three dimensional arrays by prefix reversals.

This is a joint work with Riki Kase and Szilárd Zolt Fazekas.

References

L. Bulteau, G. Fertin & I. Rusu (2012): Pancake Flipping Is Hard. In MFCS 2012, LNCS 7464, Springer, pp. 247--258, doi: 0.1007/978-3-642-32589-2_24
B. Chitturi, W. Fahle, Z. Meng, L. Morales, C.O. Shields, I.H. Sudborough & W. Voit (2009): An (18/11)n upper bound for sorting by prefix reversals. Theoretical Computer Science 410 (36):3372-3390, doi:10.1016/j.tcs.2008.04.045
D.S. Cohen & M. Blum (1995): On the problem of sorting burnt pancakes. Discrete Applied Mathematics 61 (2):105--120, doi:10.1016/0166-218X(94)00009-3
W. Gates & C. Papadimitriou (1979): Bounds for Sorting by Prefix Reversal. Discrete Mathematics 79:47--57, doi:10.1016/0012-365X(79)90068-2
M.H. Heydari & I.H. Sudborough (1997): On the diameter of the pancake network. J. Algorithms 25 (1):67--94, doi: 10.1006/jagm.1997.0874
H. Kaplan, R. Shamir & R.E. Tarjan (1997): Faster and Simpler Algorithm for Sorting Signed Permutations by Reversals. In ACM-SIAM SODA 1997, pp. 344--351, doi: 10.1145/267521.267544
A. Yamamura (2015): Rearranging two dimensional arrays by prefix reversals. In RP 2015, LNCS 9328, Springer, pp. 153-165, doi: 10.1007/978-3-319-24537-9_14

EPTCS 252

Proceedings 15th International Conference on Automata and Formal Languages Debrecen, Hungary, September 4-6, 2017

Preface

AFL 2017 Organizing Committee

AFL Steering Committee

List of subreviewers

Automata and Formal Languages for Next Generation Sequencing Data

References

Representation of Information in Automata

References

On Regular Images of Regular Languages under Extended Word Operations

References

Networks of Genetic Processors: From Universality to a Formal Language Characterization

References

Rearrangement Problem of Multidimensional Arrays by Prefix Reversals

References

Proceedings 15th International Conference on
Automata and Formal Languages
Debrecen, Hungary, September 4-6, 2017