Okay, I get it now. Yes, I had previously made a typo in equation 13 and later corrected it. Since I don't have a statistics background, I was never interested in the “philosophical” explanation of Bayes' theorem. I come from physics, where we only used Bayes as a tool. That is what this equation is for me, and I wanted to present it as such.

That said, we never called the K-L divergence Bayes' rule. We always made a strict distinction between the two equations: the K-L divergence is used to compare the “similarity” between two probability distributions, while Bayes' rule is used to calculate a probability distribution.
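
To make the distinction concrete, here is a minimal Python sketch for discrete distributions. The function names and the two-hypothesis numbers are made up for illustration and are not taken from the post. Bayes' rule returns a new distribution, while the K-L divergence returns a single number comparing two distributions.

```python
import math

def bayes_posterior(prior, likelihood):
    """Bayes' rule: turn a prior and a likelihood into a posterior distribution."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)  # normalizing constant P(data)
    return [u / evidence for u in unnormalized]

def kl_divergence(p, q):
    """D_KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: two hypotheses with a flat prior.
prior = [0.5, 0.5]
likelihood = [0.8, 0.2]  # P(data | hypothesis), illustrative values

posterior = bayes_posterior(prior, likelihood)  # a distribution
divergence = kl_divergence(posterior, prior)    # a single non-negative number

print("posterior:", posterior)
print("D_KL(posterior || prior):", divergence)
```

In this sketch the two tools are used together but do different jobs: one computes the posterior, the other measures how far that posterior has moved from the prior.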