The problem with reinforcement learning in generative AI is that it's difficult to turn the real world into a graph. Sydney Von Arx of METR talks about an approach to solving this.