Real-time dashboards sound simple until you actually build one. "Just re-query on a timer" works fine for one user. For hundreds of dashboards each querying multiple data sources every 30 seconds, it becomes a very fast way to kill a database. Here's the architecture we landed on after two rewrites.
The first version was embarrassingly simple: each browser tab had a setInterval that fired an API call every 30 seconds per chart. Ten charts on a dashboard, 30-second interval, 50 concurrent users — that's 1,000 database queries per minute from refresh alone, before anyone asked a question. We hit the database limit within two weeks of the beta.
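The v1 loop looked roughly like this — a sketch, not our exact code; `fetchChartData` is a hypothetical API helper — and the load it generates falls out of a one-line formula:

```typescript
// v1 sketch: one timer per chart, each firing its own API call.
// fetchChartData and the chart IDs are hypothetical names for illustration.
const REFRESH_MS = 30_000;

function startPolling(
  chartIds: string[],
  fetchChartData: (id: string) => Promise<void>,
) {
  // One independent interval per chart: this is the design flaw.
  return chartIds.map((id) =>
    setInterval(() => void fetchChartData(id), REFRESH_MS),
  );
}

// The resulting load: queries/minute = users * charts * (60 / intervalSec).
function queriesPerMinute(
  users: number,
  charts: number,
  intervalSec: number,
): number {
  return users * charts * (60 / intervalSec);
}
```

Plugging in the beta numbers, `queriesPerMinute(50, 10, 30)` gives the 1,000 queries per minute above, and it scales linearly with users.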
We moved to SSE and added a query deduplication layer. If 50 users were watching the same dashboard, they'd all subscribe to the same SSE stream, and we'd run the underlying query once, fanning out the result. Deduplication was keyed on a hash of (dashboard_id, component_id, applied_filters).
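The dedup layer can be sketched like this, assuming a SHA-256 over a canonical serialization of the key tuple; the field names and the in-flight-promise approach are illustrative, not our exact implementation:

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a chart subscription; field names are assumptions.
interface Subscription {
  dashboardId: string;
  componentId: string;
  appliedFilters: Record<string, string>;
}

// Dedup key: hash of (dashboard_id, component_id, applied_filters).
// Filters are serialized with sorted keys so equivalent filter sets collide.
function dedupKey(sub: Subscription): string {
  const filters = Object.keys(sub.appliedFilters)
    .sort()
    .map((k) => `${k}=${sub.appliedFilters[k]}`)
    .join("&");
  return createHash("sha256")
    .update(`${sub.dashboardId}|${sub.componentId}|${filters}`)
    .digest("hex");
}

// Fan-out: one in-flight query per key, any number of subscribers.
class QueryDeduper {
  private inFlight = new Map<string, Promise<unknown>>();

  run(sub: Subscription, query: () => Promise<unknown>): Promise<unknown> {
    const key = dedupKey(sub);
    let p = this.inFlight.get(key);
    if (!p) {
      p = query().finally(() => this.inFlight.delete(key));
      this.inFlight.set(key, p);
    }
    return p; // every subscriber awaits the same promise
  }
}
```

The sorted-keys serialization matters: two users with the same filters applied in a different order must hash to the same key, or the dedup silently stops working.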
This cut query volume by ~60% in practice. The problem was SSE connection limits: over HTTP/1.1, browsers cap concurrent connections per origin at six, and each EventSource holds one open, so dashboards with more than six charts started queuing. Not great.
The current architecture uses a single WebSocket connection per browser tab (no per-chart connections) and a server-side fan-out model: each chart subscribes as a logical channel over the shared socket, the refresh scheduler runs each deduplicated query once, and the server pushes the result out to every subscribed tab.
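On the client, the multiplexing looks roughly like this. The message shapes and channel naming are assumptions for illustration, and the socket's send function is injected so the routing logic stands on its own:

```typescript
// One socket per tab; every chart subscribes over it as a logical channel.
// Message shapes ("subscribe" / data frames) are assumptions, not our protocol.
type ServerMessage = { channel: string; payload: unknown };

class DashboardSocket {
  private handlers = new Map<string, (payload: unknown) => void>();

  // send is ws.send in practice; injected here so the logic is self-contained.
  constructor(private send: (msg: string) => void) {}

  // Called once per chart on the dashboard; no new connection is opened.
  subscribe(channel: string, onData: (payload: unknown) => void): void {
    this.handlers.set(channel, onData);
    this.send(JSON.stringify({ type: "subscribe", channel }));
  }

  // Wire this to ws.onmessage: route each frame to the right chart.
  dispatch(raw: string): void {
    const msg = JSON.parse(raw) as ServerMessage;
    this.handlers.get(msg.channel)?.(msg.payload);
  }
}
```

A 20-chart dashboard is now 20 channels on one connection instead of 20 connections, so the browser's per-origin cap stops being a design constraint.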
The refresh scheduler is still a single-process bottleneck. It works at our current scale but we know it needs to become a distributed scheduler (probably using a Redis sorted set as a priority queue) before we get to 10× current load. We're planning that work for Q3.
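A minimal sketch of the scheduling logic that would move to Redis, using an in-memory map that mimics sorted-set semantics (ZADD to set a score, a range scan over scores <= now to pop due work); in the distributed version any scheduler instance would pop due entries atomically:

```typescript
// Priority queue keyed by next-run timestamp, standing in for a Redis
// sorted set: member = query key, score = next run time in ms.
class RefreshSchedule {
  private scores = new Map<string, number>(); // member -> nextRunMs

  schedule(queryKey: string, nextRunMs: number): void {
    this.scores.set(queryKey, nextRunMs); // ZADD: upsert the score
  }

  // Range scan over scores <= nowMs: everything due right now.
  // Due entries are immediately rescheduled one interval out.
  popDue(nowMs: number, intervalMs: number): string[] {
    const due: string[] = [];
    for (const [key, score] of this.scores) {
      if (score <= nowMs) due.push(key);
    }
    for (const key of due) this.scores.set(key, nowMs + intervalMs);
    return due.sort();
  }
}
```

The scheduler loop then just pops due keys every tick and hands each one to the dedup layer, which already guarantees the query runs once regardless of subscriber count.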
We also wish we'd invested in the query result cache earlier. In the first two versions, a user who reconnected mid-session either stared at empty charts until the next full refresh cycle or triggered redundant database reads to repopulate them. The cache makes reconnects invisible.
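A sketch of the reconnect path, assuming results are cached under the same dedup key with a freshness bound (the names and TTL scheme are illustrative):

```typescript
// Last-result cache keyed by dedup key: a reconnecting client is served the
// cached payload immediately instead of waiting out the refresh cycle.
class ResultCache {
  private entries = new Map<string, { payload: unknown; at: number }>();

  // Called by the fan-out layer every time a scheduled query completes.
  store(key: string, payload: unknown, nowMs: number): void {
    this.entries.set(key, { payload, at: nowMs });
  }

  // Called on subscribe: serve stale-but-recent data right away; the next
  // scheduled refresh pushes a fresh result over the socket anyway.
  getFresh(key: string, nowMs: number, maxAgeMs: number): unknown | null {
    const e = this.entries.get(key);
    if (!e || nowMs - e.at > maxAgeMs) return null;
    return e.payload;
  }
}
```

Serving a result up to one refresh interval old on reconnect is an easy trade: the user sees data instantly, and it is never staler than what they'd have been looking at had the connection stayed up.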
We're hiring engineers who want to work on problems like this. If distributed systems and real-time data infrastructure are your thing, check out our open roles.