Reinforcement learning algorithms can break in surprising, counterintuitive ways. This post explores one failure mode, where the reward function is misspecified and resulted in the reinforcement learning agent learning how it could beat humans by 20% when it cheated and didn’t even finish the race.
Important Note
Content Editors rate, curate and regularly update what we believe are the top 11% of all AI resource and good practice examples and is why our content is rated from 90% to 100%. Content rated less than 90% is excluded from this site. All inclusions are vetted by experienced professionals with graduate level data science degrees.
In the broadest sense, any inclusion of content on this site is not an endorsement or recommendation of any service, product or content that may be discussed, recommended, endorsed or affiliated with the content, company or spokesperson. We are a 501(c)3 nonprofit and receive no website advertising monies or direct or indirect compensation for any content or other information on any of our websites. For more information, visit our TOS.