Free-Energy Advantage Functions for Policy Transfer to Noisy Environments with Safety Constraints
Training agents to control complex live systems directly on the system itself is often infeasible, due either to the high cost or to the potential dangers involved. In this paper, we take a step towards identifying ways to evaluate the transferability of models for the class of constrained Reinforcement Learning problems. Furthermore, we present an approach based on free-energy advantage functions that improves adaptability, and in turn transferability, for constrained Reinforcement Learning problems, and we show that it increases the performance of a baseline algorithm, CPO, with respect to safety constraints in noisy environments.
- Published in: Lernen. Wissen. Daten. Analyse.
- Type: Inproceedings
- Authors: Haritz, Pierre; Liebig, Thomas
- Year: 2023
Citation information
Haritz, Pierre; Liebig, Thomas: Free-Energy Advantage Functions for Policy Transfer to Noisy Environments with Safety Constraints. In: Lernen. Wissen. Daten. Analyse., 2023. https://ceur-ws.org/Vol-3630/LWDA2023-paper36.pdf
@Inproceedings{Haritz.Liebig.2023a,
author={Haritz, Pierre and Liebig, Thomas},
title={Free-Energy Advantage Functions for Policy Transfer to Noisy Environments with Safety Constraints},
booktitle={Lernen. Wissen. Daten. Analyse.},
url={https://ceur-ws.org/Vol-3630/LWDA2023-paper36.pdf},
year={2023},
abstract={Training acting agents for the goal of controlling complex live systems on the system itself is often an unfeasible task, either due to the high cost or the potential dangers that might arise. In this paper, we take a step towards identifying ways to evaluate the transferability of models for the class of constrained Reinforcement Learning problems. Furthermore, we present an approach based on...}}