Hi,. V(s) is calculated by a neural network… | by Artem Oppermann

Dec 22, 2018

Hi,

V(s) is calculated by a neural network with no hidden layers and a linear output function (at least in my implementation on GitHub). For some reason it works best with no hidden layers. So it's nothing more than a linear model: since V(s) is a real-valued estimate, strictly speaking it is a linear regressor rather than a classifier.
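To make "no hidden layers and a linear output" concrete, here is a minimal sketch in plain Python. It is not the code from my GitHub repository: the class name `LinearValue`, the method `td0_update`, the TD(0) training rule, and the defaults `lr` and `gamma` are all assumptions chosen for illustration.

```python
# Hypothetical sketch: a value function with no hidden layers,
# V(s) = w . s + b, trained with a TD(0) update (an assumed training rule).

class LinearValue:
    def __init__(self, n_features):
        # One weight per state feature plus a bias: the whole "network".
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, state):
        # Linear output function: no activation, no hidden layer.
        return sum(wi * si for wi, si in zip(self.w, state)) + self.b

    def td0_update(self, state, reward, next_state, done, lr=0.1, gamma=0.99):
        # Bootstrapped target; terminal transitions use the reward only.
        target = reward if done else reward + gamma * self.predict(next_state)
        td_error = target - self.predict(state)
        # For a linear model, dV/dw is the state itself and dV/db is 1.
        self.w = [wi + lr * td_error * si for wi, si in zip(self.w, state)]
        self.b += lr * td_error
        return td_error
```

In an actor-critic setup, the returned `td_error` (or the return minus `predict(state)`) would typically be what scales the policy gradient.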

I never bothered to fully work through the mathematical reason why the variance can be reduced this way, since I only needed the models to work. But the derivation can be found here:

Bhatnagar, S., Sutton, R. S., Ghavamzadeh, M., and Lee, M. (2007). Incremental natural actor-critic algorithms. In Neural Information Processing Systems 21.
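For rough intuition only, and not as a replacement for the derivation in the paper: the usual first step is to show that subtracting a state-dependent baseline such as V(s) does not change the expected policy gradient. The notation here (a policy \pi_\theta and a discrete action set) is my own assumption, not taken from the paper.

```latex
\begin{aligned}
\mathbb{E}_{a \sim \pi_\theta(\cdot \mid s)}
  \big[ \nabla_\theta \log \pi_\theta(a \mid s)\, V(s) \big]
&= V(s) \sum_a \pi_\theta(a \mid s)\, \nabla_\theta \log \pi_\theta(a \mid s) \\
&= V(s) \sum_a \nabla_\theta \pi_\theta(a \mid s) \\
&= V(s)\, \nabla_\theta \sum_a \pi_\theta(a \mid s)
 = V(s)\, \nabla_\theta 1 = 0 .
\end{aligned}
```

So weighting the log-probability gradient by (return − V(s)) instead of the raw return leaves the expected gradient unchanged. The variance typically goes down when V(s) is close to the expected return, because the weights are then centered around zero.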

Written by Artem Oppermann
