Reinforcement learning algorithms for robotics are frequently benchmarked on pre-constructed simulation environments. Each environment embodies heuristic, task-specific design choices: the shaping of the observation space, action space, termination conditions, initial-state distribution, goal distribution, and reward function. Because these choices are made by hand rather than automated, they remain an under-acknowledged bottleneck in applying purportedly general-purpose techniques to new robots and tasks. We propose a terminology for this problem, review relevant progress to date, and introduce a new benchmark for evaluating automated environment shaping.