Free-Energy Advantage Functions for Policy Transfer to Noisy Environments with Safety Constraints

Training agents to control complex live systems directly on those systems is often infeasible because of high cost or potential danger. In this paper, we take a step towards identifying ways to evaluate the transferability of models for the class of constrained Reinforcement Learning problems. We also present an approach based on free-energy advantage functions that improves adaptability, and in turn transferability, for constrained Reinforcement Learning problems. With this approach, we improve the performance of a baseline algorithm, Constrained Policy Optimization (CPO), with regard to safety constraints in noisy environments.

  • Published in:
    Lernen. Wissen. Daten. Analyse.
  • Type:
    Inproceedings
  • Authors:
    Haritz, Pierre; Liebig, Thomas
  • Year:
    2023

Citation information

Haritz, Pierre; Liebig, Thomas: Free-Energy Advantage Functions for Policy Transfer to Noisy Environments with Safety Constraints, Lernen. Wissen. Daten. Analyse., 2023, https://ceur-ws.org/Vol-3630/LWDA2023-paper36.pdf

Associated Lamarr Researchers

Prof. Dr. Thomas Liebig

Principal Investigator, Trustworthy AI