Semantic Search: High Performance Rate Limiting at Databricks
High Performance Rate Limiting at Databricks
Score 0.817
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z earlier rejection rate mechanism became unnecessary, and the team eventually converted every rate limit in the system to a token bucket. Three Coupled Decisions The Databricks story resolves into three decisions that depend on each other: The first is the algorithm, which determines how the counter behaves at the boundaries of time intervals. Fixed window, sliding window, and token bucket each p
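The token bucket named in this snippet can be sketched as follows. This is a minimal illustration under stated assumptions, not the Databricks implementation: the class name, field names, and the caller-supplied clock (`now`) are all hypothetical and chosen only to make the behavior at time boundaries easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token bucket; every name here is an assumption."""

    capacity: float            # maximum burst size, in tokens
    refill_per_sec: float      # steady-state rate, in tokens per second
    tokens: float = 0.0        # current balance
    last_refill: float = 0.0   # timestamp of the last refill, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Credit tokens for the elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        # Spend tokens only if the balance covers the request.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Because the balance refills continuously instead of resetting at a window edge, a burst is bounded by `capacity` no matter where it falls in time.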
High Performance Rate Limiting at Databricks
Score 0.816
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z the accuracy tradeoff that shrinking usually requires. Disclaimer: This post is based on publicly shared details from the Databricks Engineering Team. Please comment if you notice any inaccuracies. A Counting Problem Strip away the framing, and rate limiting reduces to a counting problem. Each request arrives, the system locates the right counter, compares it against a threshold, and either allo
High Performance Rate Limiting at Databricks
Score 0.803
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z for a second, durable storage is more than the problem requires. The count can live in memory on the server that owns it, and losing that server during a restart costs almost nothing. The challenge is that a single server cannot hold counts for every rate limit key across the fleet. The service needs a way to partition keys across servers, and a way for any client to quickly find the server that
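One common way to partition keys so that any client can find the owning server is to hash the key and take it modulo the server count. The snippet does not say which mechanism Databricks chose, so the sketch below is an assumption throughout: `owner_for` and the server names are hypothetical.

```python
import hashlib


def owner_for(key: str, servers: list[str]) -> str:
    """Map a rate limit key to one server (hash-mod sharding, assumed here)."""
    # A stable hash is needed: Python's built-in hash() is randomized
    # per process, so two clients would disagree on the owner.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical fleet; each server would hold only the in-memory
# counters for the keys that map to it.
servers = ["limiter-0", "limiter-1", "limiter-2"]
owner = owner_for("tenant-42:/api/jobs", servers)
```

Plain modulo remaps most keys whenever the server count changes. Since the counts are cheap to lose, that may be tolerable; consistent hashing is the usual alternative when it is not.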
High Performance Rate Limiting at Databricks
Score 0.795
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z rollout are worth noting. The team built a localhost sidecar next to the Envoy ingress to host the batch-reporting logic, because Envoy is third-party code they could not change directly. Before in-memory counting was ready, a Lua script on Redis batched writes together to keep batch-reporting latency manageable during the migration. The rebuild reframes what rate limiting is as a system problem
High Performance Rate Limiting at Databricks
Score 0.792
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z of its own. Between the moment a client starts exceeding a limit and the moment the server tells it to reject, traffic can leak through. A hundred milliseconds of overshoot at high QPS amounts to a lot of requests. The team wanted guarantees that kept overshoot within roughly 5 percent of the policy, and reaching that target required three layered fixes. The first was a rejection rate returned b
High Performance Rate Limiting at Databricks
Score 0.791
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z View this post on the web at https://blog.bytebytego.com/p/high-performance-rate-limiting-at ScyllaDB Founders Share What Real-Time AI Requires from the Database (Sponsored) [ https://substack.com/redirect/cc973a24-c008-4be2-a06b-86d5eca83aba?j=eyJ1IjoiNGl3b2U2In0.sVDxRtmZ85v8kfdamY0krRXGMy3p768BWtuZifRB-Zs ] AI is pushing databases to their limits; learn what it takes to stay ahead AI workloads
High Performance Rate Limiting at Databricks
Score 0.774
· Account tjphuhs@gmail.com
· 5/13/2026, 11:30:35 AM
High Performance Rate Limiting at Databricks from bytebytego@substack.com on 2026-05-13T15:30:35.000Z team asked a harder question. Does every request truly need to wait for a rate limit decision before proceeding? They considered three alternatives: The first was prefetching tokens on the client, where the client pulls a block of capacity and answers rate limit checks locally. The second was batching requests on the client and waiting for a response before releasing them. The third was sampling
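The first alternative in this snippet, prefetching tokens on the client, can be sketched as below. The snippet only says the team considered it, so this is an illustration of the idea, not their design: `PrefetchingClient`, `fetch_block`, and `block_size` are hypothetical names, and `fetch_block` stands in for whatever remote call would grant capacity.

```python
from typing import Callable


class PrefetchingClient:
    """Answers rate limit checks locally from a prefetched block of tokens."""

    def __init__(self, fetch_block: Callable[[int], int], block_size: int) -> None:
        # fetch_block is a hypothetical remote call: it asks the server for
        # up to n tokens and returns how many were actually granted.
        self._fetch_block = fetch_block
        self._block_size = block_size
        self._local_tokens = 0

    def allow(self) -> bool:
        # Contact the server only when the local block is used up.
        if self._local_tokens == 0:
            self._local_tokens = self._fetch_block(self._block_size)
        if self._local_tokens > 0:
            self._local_tokens -= 1
            return True
        # The server granted nothing, so the limit is exhausted for now.
        return False
```

With a block size of 3, five allowed requests cost two server round trips instead of five. The tradeoff is that tokens held by an idle client are unavailable to others until they are used or returned.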
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time
Score 0.603
· Account tjphuhs@gmail.com
· 5/12/2026, 11:31:03 AM
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time from bytebytego@substack.com on 2026-05-12T15:31:03.000Z is based on publicly shared details from the Figma Engineering Team. Please comment if you notice any inaccuracies. When SELECT * Becomes Your Bottleneck Figma’s original data pipeline did what’s called a full sync. Every run copied the entire contents of a database table, regardless of how much had actually changed since the last run. If a table had ten million rows and
How Stripe Detects Fraudulent Transactions Within 100 ms
Score 0.592
· Account tjphuhs@gmail.com
· 4/27/2026, 11:03:53 PM
How Stripe Detects Fraudulent Transactions Within 100 ms from bytebytego@substack.com on 2026-04-28T03:03:53.000Z advanced, they moved to more complex architectures. Each jump produced an equivalent leap in model performance. The architecture preceding the current one was called Wide & Deep. It combined two models into an ensemble. The “wide” component was XGBoost, a gradient-boosted decision tree that works by combining many small decision trees into one powerful predictor. XGBoost excelled at
How Stripe Detects Fraudulent Transactions Within 100 ms
Score 0.574
· Account tjphuhs@gmail.com
· 4/27/2026, 11:03:53 PM
How Stripe Detects Fraudulent Transactions Within 100 ms from bytebytego@substack.com on 2026-04-28T03:03:53.000Z in block rate for smaller businesses, which would be disruptive for those merchants and their customers. Before releasing any model, Stripe measures the change it would cause to the false positive rate, block rate, and authorization rate on both an aggregate and per-merchant basis. If a model would cause undesirable shifts for certain users, they adjust it for those segments before r
How Stripe Detects Fraudulent Transactions Within 100 ms
Score 0.570
· Account tjphuhs@gmail.com
· 4/27/2026, 11:03:53 PM
How Stripe Detects Fraudulent Transactions Within 100 ms from bytebytego@substack.com on 2026-04-28T03:03:53.000Z overnight jobs could now run multiple times in a single working day. Stripe is now exploring techniques that this architectural shift made possible, including multi-task learning, where a single model is trained to handle several related objectives simultaneously. [Live on May 6] Stop babysitting your agents (Sponsored) [ https://substack.com/redirect/a95b6182-9924-4322-9ece-ce3923bf
How Stripe Detects Fraudulent Transactions Within 100 ms
Score 0.564
· Account tjphuhs@gmail.com
· 4/27/2026, 11:03:53 PM
How Stripe Detects Fraudulent Transactions Within 100 ms from bytebytego@substack.com on 2026-04-28T03:03:53.000Z merchants and regions behave similarly, then applies fraud knowledge across the entire network. Stripe also found that scaling up training data continued to yield significant gains. A 10x increase in training transaction data still produced meaningful model improvements, and the team was working on a 100x version. This kind of scaling was only feasible because the DNN-only architectu
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time
Score 0.558
· Account tjphuhs@gmail.com
· 5/12/2026, 11:31:03 AM
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time from bytebytego@substack.com on 2026-05-12T15:31:03.000Z is largely self-operating, and developers get involved only when alerts fire. The payoff came early. One week into testing in their staging environment, the automated re-bootstrap routine caught a severe failure mode. Had it reached production, it would have triggered a site-wide outage lasting at least twenty minutes. The old system, a daily cron job with no automated va
Database Selection in AI-Powered Software Engineering
Score 0.555
· Account tjphuhs@gmail.com
· 5/11/2026, 2:00:11 PM
Database Selection in AI-Powered Software Engineering from techscoop@substack.com on 2026-05-11T18:00:11.000Z
+----------------------+
           |
+----------------------+
|  Distributed SQL DB  |
+----------------------+
      /     |     \
     /      |      \
+---------+ +---------+ +---------+
| Node 1  | | Node 2  | | Node 3  |
+---------+ +---------+ +---------+
Use Cases
NewSQL databases are ideal for:
Financial platforms
Global SaaS applications
Real-time analytics
High-concurrency applications
For example, Google Spanner powers g
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time
Score 0.553
· Account tjphuhs@gmail.com
· 5/12/2026, 11:31:03 AM
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time from bytebytego@substack.com on 2026-05-12T15:31:03.000Z multiple ways that don’t produce errors. For example: a partial failure during a snapshot export, a misconfigured CDC connector, or an unexpected data format from the source. These issues don’t crash the pipeline. They just produce wrong data. And wrong data in an analytics warehouse leads to wrong KPIs, wrong business decisions, and a slow erosion of trust in the entire
How Stripe Detects Fraudulent Transactions Within 100 ms
Score 0.551
· Account tjphuhs@gmail.com
· 4/27/2026, 11:03:53 PM
How Stripe Detects Fraudulent Transactions Within 100 ms from bytebytego@substack.com on 2026-04-28T03:03:53.000Z and chargeback fees, a single fraudulent transaction can wipe out the profit from nearly 19 legitimate ones. For this business, aggressive blocking makes sense because the cost of missed fraud is devastating. On the other hand, a SaaS company with high margins faces the opposite calculation. The lifetime revenue lost by blocking a legitimate subscriber who would have paid $200 per mo
Acknowledge Review
Score 0.551
· Account tjphuhs@gmail.com
· 2/16/2026, 5:53:00 AM
Acknowledge Review from performance_notifications@adp.com on 2026-02-16T10:53:00.000Z Hi Timothy, Your performance review for 2025 Annual Review (Non-Exempt) is available to acknowledge. Please acknowledge your performance review before 02/27/2026. To acknowledge your performance review, click View and acknowledge next to 2025 Annual Review (Non-Exempt). Take me to Performance Dashboard Alternatively, you can copy and paste below URL in the browser address bar to access Performance Dashboard htt
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time
Score 0.548
· Account tjphuhs@gmail.com
· 5/12/2026, 11:31:03 AM
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time from bytebytego@substack.com on 2026-05-12T15:31:03.000Z Here’s why the timing matters. For example, say Figma kicks off a snapshot export at 2:00 AM, and the export takes two hours to complete. During those two hours, users are still active. Records are being created, updated, and deleted. The snapshot finishes at 4:00 AM, but it only reflects the state of the table as of 2:00 AM. If the change stream starts capturing events a
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time
Score 0.548
· Account tjphuhs@gmail.com
· 5/12/2026, 11:31:03 AM
How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time from bytebytego@substack.com on 2026-05-12T15:31:03.000Z View this post on the web at https://blog.bytebytego.com/p/how-figma-upgraded-data-pipeline Harness engineering for agentic code review (Sponsored) [ https://substack.com/redirect/ba4d0cbb-1659-42ea-835a-00e16b4b9ee2?j=eyJ1IjoiNGl3b2U2In0.sVDxRtmZ85v8kfdamY0krRXGMy3p768BWtuZifRB-Zs ] Underneath every review sits a purpose-built, independent context engine. It’s the layer
How Stripe Detects Fraudulent Transactions Within 100 ms
Score 0.543
· Account tjphuhs@gmail.com
· 4/27/2026, 11:03:53 PM
How Stripe Detects Fraudulent Transactions Within 100 ms from bytebytego@substack.com on 2026-04-28T03:03:53.000Z or an unusually high number of names linked to a card. It includes a location map showing distances between the billing address, shipping address, and IP address. It shows customer metadata like email, cardholder name, and the authorization rate for transactions associated with that email. See the diagram below: Stripe also uses Elasticsearch, a search engine optimized for fast looku
The You Report: NEW Eval harness, Research API controls, auto top-up
Score 0.541
· Account tjphuhs@gmail.com
· 5/6/2026, 2:47:31 PM
The You Report: NEW Eval harness, Research API controls, auto top-up from developer@mail.you.com on 2026-05-06T18:47:31.000Z Benchmark any search API, structure outputs, and avoid credit-related downtime Welcome to this month’s newsletter! Here’s what’s new this month: Research API controls (source_control, output_schema), an open-source eval harness, and auto top-up to prevent credit-related downtime. **************** Product Releases **************** -------------------------------------------
Your API usage limits have increased
Score 0.539
· Account tjphuhs@gmail.com
· 4/24/2026, 10:37:56 PM
Your API usage limits have increased from noreply@tm.openai.com on 2026-04-25T02:37:56.000Z Your API usage limits have increased Review your new rate limits in your account settings. Hi there, We're happy to share that we've automatically moved your organization from Usage Tier 3 to Usage Tier 4 based on your usage history on our platform. For most organizations, this comes with an increase in rate limits across several models. You can review your new usage tier and limits in your account settin
Database Selection in AI-Powered Software Engineering
Score 0.539
· Account tjphuhs@gmail.com
· 5/11/2026, 2:00:11 PM
Database Selection in AI-Powered Software Engineering from techscoop@substack.com on 2026-05-11T18:00:11.000Z databases Semi-structured data → Document databases Relationship-heavy data → Graph databases Time-stamped data → Time-series databases 5. Security and Compliance Modern applications must comply with regulations such as GDPR, HIPAA, and PCI-DSS. Database security features should include: Encryption Authentication Access control Backup and recovery Audit logging Industries like healthcare
Database Selection in AI-Powered Software Engineering
Score 0.537
· Account tjphuhs@gmail.com
· 5/11/2026, 2:00:11 PM
Database Selection in AI-Powered Software Engineering from techscoop@substack.com on 2026-05-11T18:00:11.000Z datasets Less suitable for unstructured data 2. NoSQL Databases NoSQL databases were developed to address the scalability and flexibility limitations of traditional relational systems. Unlike SQL databases, NoSQL systems support unstructured or semi-structured data and can scale horizontally across distributed environments. NoSQL databases are categorized into four main types: Document d
Release Review
Score 0.535
· Account tjphuhs@gmail.com
· 1/27/2026, 2:21:29 PM
Release Review from performance_notifications@adp.com on 2026-01-27T19:21:29.000Z Hi Timothy, The performance review form of Pamela Gray, IPQA III for 2025 Annual Review (Non-Exempt) is available to release to the employee. We recommend that you have a review discussion with Pamela Gray and release the performance review form before 02/16/2026. To release the performance review, select 2025 Annual Review (Non-Exempt) on the Performance Dashboard, and search for the employee name. Click Actions a