AI vs Human Attention Spans
AI models were initially given human tests (like the LSAT or MCAT) or tests written for AIs, like MMLU. However, since they’ve mastered so many of these tests, and the tests don’t always carry over to real-world abilities, new measures of progress are needed. One way to rank the difficulty of a task is by how long it would take a human to complete it. In March,