Executive Summary: On May 1, 2026, users on our US platform experienced login failures and slowness for approximately 2 hours 38 minutes. The issue was caused by an internal service generating an excessive number of simultaneous requests to a shared backend component, which became overloaded and caused a cascading failure in the login process. The issue was resolved the same day, and the system has been operating normally since.
Incident Overview: The issue originated from one of our internal data-processing services, which attempted to process a large volume of data for a single account all at once, rather than in manageable batches. This created an unexpectedly high load on a shared backend service that other parts of the platform depend on, including the component responsible for verifying user access during login. As the shared service became overwhelmed, the login-related service was unable to complete its checks and began failing repeatedly. This meant users trying to log in received errors or experienced significant delays.
To resolve the issue, the team identified the service generating the excessive load and temporarily disabled it, which was safe to do because it is not a real-time component. This immediately relieved the pressure on the shared backend, and the login process recovered shortly after. The service was subsequently restored with a fix that prevents the same pattern from recurring.
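For illustration, the difference between processing everything at once and processing in manageable batches can be sketched as follows. This is a minimal sketch, not the affected service's actual code: the function name in_batches and the batch size are hypothetical and chosen only for this example.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists holding at most batch_size items each.

    Hypothetical helper: the real service's batching logic is not
    described in this report.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            # Hand off a full batch, then start collecting the next one.
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


# Example: five records split into batches of two.
for batch in in_batches(range(5), batch_size=2):
    print(batch)
```

With batching of this kind, the load placed on a shared backend at any moment is bounded by the batch size rather than by the size of the account's data.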
Impact: Users on the US platform experienced login failures or significant slowness from approximately 2026-05-01T14:46Z to 2026-05-01T17:24Z (~2 hours 38 minutes). The platform itself remained generally available for users who were already logged in. A status page update was posted during the incident.
Detection: The incident was detected at 2026-05-01T14:46Z by our automated monitoring system, which alerted the operations team to login-related failures in the US environment.
Response: Our operations and engineering teams began investigating immediately. Initial steps focused on scaling up the affected backend components and stabilizing the login service. Once the source of the excessive load was identified, the responsible service was temporarily disabled at 2026-05-01T17:06Z, which resolved the issue. Login errors subsided within minutes, and the status page was updated to reflect recovery at 2026-05-01T17:24Z.
Root Cause: An internal data-processing service attempted to handle all data for an unusually large dataset simultaneously, without any limit on the number of concurrent operations. This generated a surge of requests that overwhelmed a shared backend service, which in turn caused the login verification process to fail. The service has since been updated to limit concurrency and prevent this pattern from recurring.
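The general technique of limiting concurrency can be sketched with a semaphore, as below. This is an illustrative example under stated assumptions, not the fix that was deployed: the names process_with_limit and fake_backend_call, and the limit of 4, are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def process_with_limit(
    items: Iterable[T],
    handler: Callable[[T], Awaitable[R]],
    max_concurrency: int,
) -> List[R]:
    """Run handler on every item, with at most max_concurrency calls in flight.

    Hypothetical helper; results are returned in input order.
    """
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item: T) -> R:
        # Each call waits here until one of the limited slots is free.
        async with semaphore:
            return await handler(item)

    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_backend_call(record_id: int) -> int:
    # Stand-in for a request to the shared backend (assumption).
    await asyncio.sleep(0.01)
    return record_id * 2


if __name__ == "__main__":
    results = asyncio.run(process_with_limit(range(20), fake_backend_call, 4))
    print(results)
```

Without the semaphore, all 20 calls in this example would reach the backend at the same moment, which is the unbounded pattern described above. Note that the sketch still creates one lightweight task per item; only the number of simultaneous backend calls is capped.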