Runtime Verification of Learning Properties for Reinforcement Learning Algorithms

Tommaso Mannucci
(TNO – Netherlands Organisation for Applied Scientific Research)
Julio de Oliveira Filho
(TNO – Netherlands Organisation for Applied Scientific Research)

Reinforcement learning (RL) algorithms interact with their environment in a trial-and-error fashion. Such interactions can be expensive, inefficient, and time-consuming when learning on a physical system rather than in a simulation. This work develops new runtime verification techniques to predict when the learning phase has not met, or will not meet, expectations on quality and timeliness. This paper presents three verification properties concerning the quality and timeliness of learning in RL algorithms. For each property, we propose design steps for monitoring and assessing it during the system's operation.
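The monitoring idea can be illustrated with a minimal sketch: a runtime monitor that watches episode returns during training and flags when learning progress falls below an expectation. The class name, window size, and threshold below are illustrative assumptions, not the paper's actual properties or design.

```python
# Hypothetical runtime monitor for one learning-progress property:
# flag a violation when the moving-average return has not improved
# by at least `min_gain` over the last `window` episodes.
# (All names and thresholds are illustrative, not from the paper.)

from collections import deque


class LearningProgressMonitor:
    def __init__(self, window=50, min_gain=0.01):
        self.window = window
        self.min_gain = min_gain
        # Keep just enough history for two consecutive windows.
        self.returns = deque(maxlen=2 * window)

    def observe(self, episode_return):
        """Record one episode's return and emit a verdict."""
        self.returns.append(episode_return)
        if len(self.returns) < 2 * self.window:
            return "inconclusive"  # not enough evidence yet
        old = sum(list(self.returns)[: self.window]) / self.window
        new = sum(list(self.returns)[self.window:]) / self.window
        return "ok" if new - old >= self.min_gain else "violation"


# Usage: feed episode returns from a training loop.
monitor = LearningProgressMonitor(window=5, min_gain=0.1)
verdicts = [monitor.observe(0.1 * i) for i in range(10)]
```

A monitor like this runs alongside training and costs only a bounded buffer of returns, which is what makes it usable at runtime on a physical system rather than only in post-hoc analysis.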

In Marie Farrell, Matt Luckcuck, Mario Gleirscher and Maike Schwammberger: Proceedings Fifth International Workshop on Formal Methods for Autonomous Systems (FMAS 2023), Leiden, The Netherlands, 15th and 16th of November 2023, Electronic Proceedings in Theoretical Computer Science 395, pp. 205–219.
Published: 15th November 2023.

ArXived at: https://dx.doi.org/10.4204/EPTCS.395.15