Optimizing Expectations: From Deep Reinforcement Learning To Stochastic Computation Graphs