This is an autopost blog, friends. We are trying to provide all the latest sports, news, and new updates for you.
Friday, July 21, 2023
Show HN: Datalake for Computer Vision Projects https://ift.tt/tYQan2W
Buddhika, Kelum, and Chong Han here. We are building a self-hosted data infrastructure platform for computer vision. Our community page is https://ift.tt/QA0tbHg

In the past, we worked in various capacities on a couple of high-scale computer vision projects in retail, farming, and hospitals. These projects involved 2D object detection, 3D object tracking, and more advanced 3D perception. Like other CV engineers, we observed a common factor across these projects: one needs a large volume of high-quality data to build a production-deployable CV system.

Our biggest challenge was not having a robust data infrastructure to handle large volumes of data. Our S3 buckets were like a data swamp; we had a huge amount of raw images and video sitting in storage buckets without any tracking. Instead of working on CV, we had to develop tools for data operations. We understand that many of us have our own custom scripts and stitch them together to make things happen in the CV pipeline. However, that approach is brittle and cumbersome to maintain.

We wanted to build a system on top of cloud buckets such as S3 that stores all file indexes, labels, metadata attributes, inference outputs, model training outcomes, and literally anything related to machine learning/computer vision. This makes it possible for us to search for anything and consume it efficiently. It behaves as a DataLake (by the way, "DataLake" is an overused term). All other downstream processes in the CV pipeline can access data more efficiently via the SDK and can also write data back to the lake (e.g., training/inference outcomes).

The reason we made it self-hosted is to address data security and privacy concerns. Since data is fundamental to AI, we believe that companies and organizations should have complete control over it. Currently, we support AWS, GCP, and Azure cloud buckets; soon, we will support local storage. We ship this as a Docker container so you can just install it on any VM or local server. The installation script will do all the configuration automatically. The Python SDK and documentation are available but not perfect yet.

We've launched this under MIT and Elastic licenses so any developer can use it. Our goal is not to charge individual developers. We make money by charging a license fee for things like multiple users, multiple buckets, scalability with Kubernetes, and providing support.

Give it a try: https://ift.tt/QA0tbHg Let us know what you think.

July 22, 2023 at 04:45AM
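
To make the indexing idea in the post concrete (a searchable catalog of file keys, labels, and metadata attributes layered over bucket storage), here is a minimal sketch using only Python's standard library. It is not the project's SDK: the `build_index` and `search` helpers, the table layout, and the sample object keys are all hypothetical and chosen purely for illustration.

```python
import json
import sqlite3

# Hypothetical sample records; in practice these would come from listing a bucket.
SAMPLE_OBJECTS = [
    {"key": "raw/store-a/0001.jpg", "label": "shelf",
     "metadata": {"camera": "cam-1", "split": "train"}},
    {"key": "raw/store-a/0002.jpg", "label": "shelf",
     "metadata": {"camera": "cam-2", "split": "train"}},
    {"key": "raw/farm-b/0001.jpg", "label": "crop",
     "metadata": {"camera": "cam-1", "split": "val"}},
]


def build_index(objects):
    """Create an in-memory catalog of object keys, labels, and metadata."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (key TEXT PRIMARY KEY, label TEXT, metadata TEXT)"
    )
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [(o["key"], o["label"], json.dumps(o["metadata"])) for o in objects],
    )
    return conn


def search(conn, label, **attrs):
    """Return keys with the given label whose metadata matches every attribute."""
    rows = conn.execute(
        "SELECT key, metadata FROM files WHERE label = ? ORDER BY key", (label,)
    )
    return [
        key
        for key, meta in rows
        if all(json.loads(meta).get(name) == value for name, value in attrs.items())
    ]


conn = build_index(SAMPLE_OBJECTS)
print(search(conn, "shelf", camera="cam-1"))
```

The point of the sketch is only that the catalog lives beside the bucket rather than inside it: the raw files stay where they are, and downstream steps query the index to decide which objects to fetch.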