This is an autopost blog, friends. We are trying to provide all the latest sports, news, and new updates for you.
Wednesday, April 8, 2026
Show HN: 500k+ events/sec transformations for ClickHouse ingestion https://ift.tt/1cDuIan
Hi HN! We are Ashish and Armend, founders of GlassFlow.

Over the last year, we worked with teams running high-throughput pipelines into self-hosted ClickHouse, mostly for observability and real-time analytics. A question that came up repeatedly was: what happens when throughput grows? Usually, things work fine at 10k events/sec, but we started seeing backpressure and errors at >100k.

Once throughput per pipeline stops scaling, adding more CPU/memory doesn’t help, because parts of the pipeline are often not parallelized or are bottlenecked by state handling. At this point, engineers usually scale by adding more pipeline instances. That works but comes with some trade-offs:

- You have to split the workload (e.g., multiple pipelines reading from the same source)
- Transformation logic gets duplicated across pipelines
- Stateful logic becomes harder to manage and keep consistent
- Debugging and changes get more difficult because the data flow is fragmented

Another challenge arises when you work with high-cardinality keys like user IDs, session IDs, or request IDs, and need to handle longer time windows (24h or more). The state grows quickly, and many systems rely on in-memory state, which makes it expensive and harder to recover from failures.

We wanted to solve this problem, so we rebuilt our approach at GlassFlow. Instead of scaling by adding more pipelines, we scale within a single pipeline by using replicas. Each replica consumes, processes, and writes independently, and the workload is distributed across them. In the benchmarks we’re sharing, this scales to 500k+ events/sec while still running stateful transformations and writing into ClickHouse.
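As a rough illustration of the replica idea, here is a minimal Go sketch of one common way to spread a keyed workload across independent workers. It assumes events are routed by hashing a key, which the post does not confirm as GlassFlow's mechanism. The `Event` type, `replicaFor`, and `runReplicas` are hypothetical names made up for this example, and the in-memory map stands in for whatever state store a real replica would use.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// Event is a hypothetical record type used only for this sketch.
type Event struct {
	Key   string // e.g. a user ID or session ID
	Value int
}

// replicaFor maps a key to a replica index by hashing it, so every event
// with the same key is always handled by the same replica.
// Hash routing is an assumption of this sketch, not a documented design.
func replicaFor(key string, replicas int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(replicas))
}

// runReplicas fans events out to n replicas. Each replica keeps its own
// per-key state (here a running sum) and never shares it with the others.
func runReplicas(events []Event, n int) []map[string]int {
	inputs := make([]chan Event, n)
	states := make([]map[string]int, n)
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		inputs[i] = make(chan Event, 64)
		states[i] = make(map[string]int)
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Only this goroutine touches states[id] while it runs.
			for ev := range inputs[id] {
				states[id][ev.Key] += ev.Value
			}
		}(i)
	}

	// Route each event to the replica that owns its key.
	for _, ev := range events {
		inputs[replicaFor(ev.Key, n)] <- ev
	}
	for _, ch := range inputs {
		close(ch)
	}
	wg.Wait()
	return states
}

func main() {
	events := []Event{{"user-1", 1}, {"user-2", 5}, {"user-1", 2}, {"user-3", 7}}
	for id, state := range runReplicas(events, 3) {
		fmt.Printf("replica %d state: %v\n", id, state)
	}
}
```

Because a given key always lands on the same replica, per-key state never has to be coordinated between replicas, which is one reason this style of partitioning can scale close to linearly.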
A few things we think are interesting:

- Scaling is close to linear as you add replicas
- Works with stateful transformations (not just stateless ingestion)
- State is backed by a file-based KV store instead of relying purely on memory
- The ClickHouse sink is optimized for batching to avoid small inserts
- The product is built with Go

Full write-up + benchmarks: https://ift.tt/ol5djf9...
Repo: https://ift.tt/BCG9pDw

Happy to answer questions about the design or trade-offs.

April 8, 2026 at 10:56PM