Transformer on MSN · Opinion
Against the METR graph
METR’s benchmark has become a bellwether of AI capability growth, but its design isn’t up to the task, argues Nathan Witkin ...