Sep 2, 2024 · We obtain a more robust estimate of human performance by evaluating 1729 humans on the full set of 400 training and 400 evaluation tasks from the original ARC ...
Sep 6, 2024 · In this work, we obtain a more robust estimate of human performance by evaluating 1729 humans on the full set of 400 training and 400 evaluation tasks.
Sep 3, 2024 · The Abstraction and Reasoning Corpus (ARC) is a challenging benchmark that tests the ability of AI systems to solve abstract reasoning tasks ...
Oct 3, 2024 · To obtain an estimate of human performance on the ARC benchmark, LeGris, Vong, and their co-authors recruited over 1,700 participants through ...
2 days ago · The Abstraction and Reasoning Corpus (ARC) is a visual program synthesis benchmark designed to test challenging out-of-distribution ...
Publications. H-ARC: A Robust Estimate of Human Performance on the Abstraction and Reasoning Corpus Benchmark (2024). See the project webpage here.
Co-authors ; H-ARC: A Robust Estimate of Human Performance on the Abstraction and Reasoning Corpus Benchmark. S LeGris, WK Vong, BM Lake, TM Gureckis. arXiv ...
19 hours ago · 2/3 ...tasks from the original ARC problem set. We estimate that average human performance lies between 73.3% and 77.2% correct with a reported ...
Sep 5, 2024 · @bimedotcom. H-ARC: A Robust Estimate of Human Performance on the Abstraction and Reasoning Corpus Benchmark https://arxiv.org/abs/2409.01374 ✍️.