This is an autopost blog, friends. We try to bring you the latest sports, news, and new updates.
Tuesday, March 31, 2026
Show HN: PhAIL – Real-robot benchmark for AI models https://ift.tt/RiBwNOM
I built this because I couldn't find honest numbers on how well VLA models [1] actually work on commercial tasks. I come from search ranking at Google, where you measure everything, and in robotics nobody seemed to know.

PhAIL runs four models (OpenPI/pi0.5, GR00T, ACT, SmolVLA) on bin-to-bin order picking, one of the most common warehouse operations. Same robot (Franka FR3), same objects, hundreds of blind runs. The operator doesn't know which model is running.

Best model: 64 UPH (units per hour). Human teleoperating the same robot: 330. Human by hand: 1,300+.

Everything is public: every run with synced video and telemetry, the fine-tuning dataset, training scripts. The leaderboard is open for submissions. Happy to answer questions about methodology, the models, or what we observed.

[1] Vision-Language-Action: https://ift.tt/YjLrA6W

https://phail.ai

March 31, 2026 at 09:55PM
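To put the throughput figures above in perspective, here is a small back-of-the-envelope sketch. It only uses the three UPH numbers quoted in the post; the helper name `relative_throughput` is made up for illustration, and because the by-hand figure is given as "1,300+", the second ratio is an upper bound rather than an exact value.

```python
# Throughput figures quoted in the post, in units per hour (UPH).
BEST_MODEL_UPH = 64
TELEOP_UPH = 330
BY_HAND_UPH = 1300  # the post says "1,300+", so this is a lower bound


def relative_throughput(model_uph: float, baseline_uph: float) -> float:
    """Return model throughput as a fraction of a baseline throughput.

    Hypothetical helper for illustration; not part of PhAIL.
    """
    if baseline_uph <= 0:
        raise ValueError("baseline_uph must be positive")
    return model_uph / baseline_uph


vs_teleop = relative_throughput(BEST_MODEL_UPH, TELEOP_UPH)
vs_hand = relative_throughput(BEST_MODEL_UPH, BY_HAND_UPH)

print(f"Best model vs. human teleoperation: {vs_teleop:.1%}")
print(f"Best model vs. human by hand (at most): {vs_hand:.1%}")
```

On these numbers, the best model reaches roughly a fifth of teleoperated throughput and about a twentieth, at most, of a person picking by hand.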