"Mori, S., Tangkaratt, V., Zhao, T., Morimoto, J., & Sugiyama, M.","Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation","IBISML2012-95","IEICE Technical Report",,,,"pp. 17-24",2013, "Zhao, T., Hachiya, H., Tangkaratt, V., Morimoto, J., & Sugiyama, M.","Efficient sample reuse in policy gradients with parameter-based exploration",,"Neural Computation",,"vol. 25","no. 6","pp. 1512-1547",2013,