We present a novel extension to the family of Soft Actor-Critic (SAC) algorithms. We argue that, based on the Maximum Entropy Principle, discrete SAC can be further improved via additional statistical ...