Does a transition probability R_{s,r} between a state s and a reward r exist?

No. If you are in a state s, you will receive the reward R_s with 100% certainty. Only when you move to the next state s' is there a transition probability P_{ss'} that you will actually end up in the state s' and receive the reward R_{s'} associated with s'.
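A minimal sketch of this setup in Python (a toy Markov reward process with made-up numbers, not from the original text): the reward R_s is a deterministic function of the state, while only the next state s' is random, sampled from P_{ss'}.

```python
import random

# Toy transition probabilities P_{ss'} and state rewards R_s
# (illustrative values only).
P = {
    "A": {"A": 0.1, "B": 0.9},
    "B": {"A": 0.5, "B": 0.5},
}
R = {"A": 0.0, "B": 1.0}  # reward received deterministically in each state

def step(s, rng=random):
    """Sample s' ~ P_{ss'} and return (s', R_{s'})."""
    states, probs = zip(*P[s].items())
    s_next = rng.choices(states, weights=probs)[0]
    # The reward carries no randomness of its own: it is fixed once
    # the sampled next state is known.
    return s_next, R[s_next]

s_next, r = step("A", random.Random(0))
```

Note that there is no separate "reward lottery" here: all the uncertainty lives in which state you land in.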

Actually, the definition of R_s contains an expectation value, too. Is there any reason why this expectation is not evaluated?

The fact that we consider E[R_s] only means that we do not actually know what the initial state s is, i.e. where the agent starts its progress. Because of that, we cannot say with 100% certainty what the very first reward R_s will be. But keep in mind that this is only mathematics: in practice, if you are programming AI agents in, for example, OpenAI Gym, you don't really care about expectations :)
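This can be made concrete with a short sketch (toy numbers, not real OpenAI Gym code): if the initial state is drawn from some distribution, E[R_s] is a weighted average over start states, and in practice you just sample rewards, whose running average converges to that expectation.

```python
import random

# Hypothetical distribution over initial states and their rewards
# (illustrative values only).
initial_dist = {"left": 0.25, "right": 0.75}
R = {"left": 0.0, "right": 4.0}

# The mathematical expectation of the very first reward:
# E[R_s] = 0.25 * 0.0 + 0.75 * 4.0 = 3.0
expected_first_reward = sum(p * R[s] for s, p in initial_dist.items())

# What a practitioner actually sees: sampled first rewards whose
# empirical mean approaches E[R_s] by the law of large numbers.
rng = random.Random(0)
states = list(initial_dist)
weights = list(initial_dist.values())
samples = [R[rng.choices(states, weights=weights)[0]] for _ in range(10_000)]
empirical_mean = sum(samples) / len(samples)
```

So the expectation is never "executed" symbolically when you run an agent; it simply emerges as the average of the rewards you observe.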

Deep Learning & AI Software Developer | MSc. Physics | Educator
