An overview of maximum likelihood methods in reinforcement learning with continuous reward signals, highlighting how they connect probability modeling with policy optimization. #Mach ...