This is an autopost blog, friends. We are trying to provide all the latest sports, news, and new updates for you.
Sunday, February 25, 2024
Show HN: Continuous-eval – Granular evaluation of GenAI pipelines https://ift.tt/PtCLhxn
Hi HN, we are the creators of "continuous-eval", an open-source tool to test and evaluate generative AI apps. "Continuous-eval" came from our efforts to measure, validate, and improve the reliability of a finance AI copilot we were developing for banks. End-to-end evaluation was not enough for us: we wanted granular evaluations that pinpoint bottlenecks and identify what to improve and how. We've since developed more metrics and made the framework more flexible, so it can evaluate components such as agent tool use, code changes, and retrieval steps. Let us know what you think of our approach to GenAI app evaluation.
https://ift.tt/oPqb51D February 26, 2024 at 12:11AM
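To make the idea of granular evaluation concrete, here is a minimal sketch in plain Python. It does not use the continuous-eval API; the `Example` class, the metric functions, and the sample data are all hypothetical, and the pipeline is assumed to be a simple two-step "retrieve, then answer" flow. The point is only that scoring each step separately shows which step is the bottleneck, which a single end-to-end score cannot do.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One test case for a hypothetical retrieve-then-answer pipeline."""
    retrieved_ids: list[str]   # documents the retrieval step returned
    relevant_ids: list[str]    # ground-truth relevant documents
    answer: str                # text produced by the generation step
    expected_answer: str       # ground-truth answer


def retrieval_precision_recall(ex: Example) -> tuple[float, float]:
    """Score the retrieval step alone, ignoring the final answer."""
    retrieved, relevant = set(ex.retrieved_ids), set(ex.relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def answer_exact_match(ex: Example) -> float:
    """Score the generation step alone with a case-insensitive exact match."""
    return float(ex.answer.strip().lower() == ex.expected_answer.strip().lower())


def evaluate(examples: list[Example]) -> dict[str, float]:
    """Average each per-step metric over the dataset."""
    if not examples:
        raise ValueError("need at least one example")
    n = len(examples)
    scores = [retrieval_precision_recall(ex) for ex in examples]
    return {
        "retrieval_precision": sum(p for p, _ in scores) / n,
        "retrieval_recall": sum(r for _, r in scores) / n,
        "answer_exact_match": sum(answer_exact_match(ex) for ex in examples) / n,
    }


# Made-up data: the second case fails at retrieval, so the wrong answer
# is likely a retrieval problem rather than a generation problem.
examples = [
    Example(["d1", "d2"], ["d1"], "Paris", "paris"),
    Example(["d3", "d4"], ["d5"], "Berlin", "Rome"),
]
report = evaluate(examples)
print(report)
```

A low retrieval recall next to a low answer score suggests fixing retrieval first; a high retrieval recall next to a low answer score points at the generation step instead.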