Wednesday, June 7, 2023

Show HN: we open sourced an IDE for ML data https://ift.tt/tJmFpbA

Hi HN! I’m Farah, co-founder of Dioptra.ai. We just open sourced katiML ( https://ift.tt/MXGsyik ) this week and wanted to get your take.

katiML is a vector+data lake to debug, curate and version AI data. With katiML, teams avoid the “garbage in, garbage out” effect by taking control of the quality of their data. They can quickly and effectively curate high-quality data for training, fine-tuning, and fixing hallucinations and edge cases.

Features include:

- Data Curation: interactive embedding visualization and similarity search, mislabeling and hallucination scores, as well as SOTA Active Learning miners.
- GenAI 4 explainability: uncover drift and bias in your data with multi-modal foundational models that describe in human language what’s in your data.
- ML Data Lake: zero-copy, highly efficient ML data lake to store and query any ML data (vectors, metadata, etc.).
- Data Versioning: track changes in datasets, prompts, and model performance, and pinpoint the causes behind those changes.

Check out our GitHub repo ( https://ift.tt/MXGsyik ), documentation ( https://ift.tt/CleNhcx ) and new Slack community channel. Here is a video showing how we use GenAI 4 explainability ( https://ift.tt/TcUpwXZ ).

We'd love to hear about your experiences with iterating on your data to improve models, the challenges you faced, and how katiML could prove useful to your efforts. Let us know what you think we can do better or differently. Looking forward to your comments!

https://ift.tt/TcUpwXZ June 7, 2023 at 11:41PM


Show HN: Tablr – Supabase with AI Features https://ift.tt/ltABMro

https://www.tablr.dev/ June 30, 2025 at 04:35AM