This is an autopost blog, friends. We are trying to provide all the latest sports, news, and new updates for you.
Tuesday, March 31, 2026
Show HN: How This Graybeard Built the Fastest and Freest Postgres BM25 Search https://ift.tt/HtEFZM8
Last summer we faced a conundrum at my company, Tiger Data, a Postgres cloud vendor whose main business is in time-series data. We were trying to grow our business towards emerging AI-centric workloads and wanted to provide a state-of-the-art hybrid search stack in Postgres. We'd already built pgvectorscale in-house with the goal of scaling semantic search beyond pgvector's main memory limitations. We just needed a scalable ranked keyword search solution too.

The problem: core Postgres doesn't provide this; the leading Postgres BM25 extension, ParadeDB, is guarded behind AGPL; developing our own extension appeared daunting. We'd need a small team of sharp engineers and 6-12 months, I figured. And we'd probably still fall short of the performance of a mature system like Parade/Tantivy.

Or would we? I'd been experimenting long enough with AI-boosted development at that point to realize that with the latest tools (Claude Code + Opus) and an experienced hand (I've been working in database systems internals for 25 years now), the old time estimates pretty much go out the window. I told our CTO I thought I could solo the project in one quarter. This raised some eyebrows.

It did take a little more time than that (two quarters), and we got some real help from the community (amazing!) after open-sourcing the pre-release. But I'm thrilled/exhausted today to share that pg_textsearch v1.0 is freely available via open source (Postgres license), on Tiger Data cloud, and hopefully soon, a hyperscaler near you: https://ift.tt/1b5TGhO

In the blog post accompanying the release, I give an overview of the architecture and present benchmark results using MS-MARCO. To my surprise, we were able not only to match Parade/Tantivy's query performance but to exceed it substantially, measuring a 4.7x advantage on query throughput at scale: https://ift.tt/8wTo60m...

It's exciting (and, to be honest, a little unnerving) to see a field I've spent so much time toiling in change so quickly in ways that enable us to be more ambitious in our technical objectives. Technical moats are moats no longer.

The benchmark scripts and methodology are available in the GitHub repo. Happy to answer any questions in the thread.

Thanks,
TJ (tj@tigerdata.com)

https://ift.tt/1b5TGhO

March 31, 2026 at 09:59PM