My research focuses on efficient reward-based learning as a means of achieving (approximately) optimal control. I am particularly interested in high-dimensional environments, where classical methods such as dynamic programming become infeasible. To this end, my algorithms often use deep neural networks to process rich sensory data.