Siteimprove Platform Login Errors

Incident Report for Siteimprove

Postmortem

Executive Summary: On May 1, 2026, users on our US platform experienced login failures and slowness for approximately 2 hours and 38 minutes. The issue was caused by an internal service generating an excessive number of simultaneous requests to a shared backend component, which became overloaded and caused a cascading failure in the login process. The issue was resolved the same day, and the system has been operating normally since.

Incident Overview: The issue originated from one of our internal data-processing services, which attempted to process a large volume of data for a single account all at once, rather than in manageable batches. This created an unexpectedly high load on a shared backend service that other parts of the platform depend on, including the component responsible for verifying user access during login. As the shared service became overwhelmed, the login-related service was unable to complete its checks and began failing repeatedly. This meant users trying to log in received errors or experienced significant delays.

To resolve the issue, the team identified the service generating the excessive load and temporarily disabled it, since it wasn’t a real-time component. This immediately relieved the pressure on the shared backend, and the login process recovered shortly after. The service was subsequently restored with a fix that prevents the same pattern from recurring.
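
As an illustration of the "manageable batches" idea mentioned above, here is a minimal Python sketch. It is not the code of the affected service; the names `batched`, `process_account`, and `handle_batch`, as well as the batch size, are hypothetical and chosen only for this example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_account(
    records: Iterable[T],
    batch_size: int,
    handle_batch: Callable[[List[T]], None],
) -> int:
    """Process one account's records batch by batch (hypothetical helper).

    Only one batch is handed to handle_batch at a time, so the load on any
    downstream service is bounded by batch_size rather than by the total
    size of the account's data.
    """
    processed = 0
    for batch in batched(records, batch_size):
        handle_batch(batch)
        processed += len(batch)
    return processed
```

With this shape, an unusually large account takes more iterations to finish, but each individual step stays the same size.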

Impact: Users on the US platform experienced login failures or significant slowness from approximately 2026-05-01T14:46Z to 2026-05-01T17:24Z (~2 hours 38 minutes). The platform itself remained generally available for users who were already logged in. A status page update was posted during the incident.

Detection: The incident was detected at 2026-05-01T14:46Z by our automatic monitoring system, which alerted the operations team to login-related failures in the US environment.

Response: Our operations and engineering teams began investigating immediately. Initial steps focused on scaling up the affected backend components and stabilizing the login service. Once the source of the excessive load was identified, the responsible service was temporarily disabled at 2026-05-01T17:06Z, which resolved the issue. Login errors subsided within minutes, and the status page was updated to reflect recovery at 2026-05-01T17:24Z.

Root Cause: An internal data-processing service attempted to handle all data for an unusually large dataset simultaneously, without any limit on the number of concurrent operations. This generated a surge of requests that overwhelmed a shared backend service, which in turn caused the login verification process to fail. The service has since been updated to limit concurrency and prevent this pattern from recurring.
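
The kind of concurrency limit described in the root cause can be sketched as follows. This is an illustrative Python example using `asyncio.Semaphore`, not the actual fix; `run_limited`, `demo`, `fake_backend_call`, and the limit of 5 are assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_limited(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrency: int,
) -> List[R]:
    """Run worker over items with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item: T) -> R:
        # Each call waits here until one of the max_concurrency slots is free.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the input items.
    return await asyncio.gather(*(guarded(item) for item in items))


async def demo(total: int = 50, max_concurrency: int = 5) -> Tuple[List[int], int]:
    """Simulate a shared backend and record the peak number of in-flight calls."""
    in_flight = 0
    peak = 0

    async def fake_backend_call(item: int) -> int:
        # Hypothetical stand-in for a request to the shared backend service.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.001)
        in_flight -= 1
        return item * 2

    results = await run_limited(range(total), fake_backend_call, max_concurrency)
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(f"processed {len(results)} items, peak in-flight calls: {peak}")
```

In this sketch all tasks are still created up front, but only `max_concurrency` of them call the backend at any moment, so the load on the shared service depends on the configured limit rather than on the size of the dataset.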

Posted May 14, 2026 - 17:55 UTC

Resolved

The issue affecting logins to the Siteimprove platform via siteimprove.com and my2.us.siteimprove.com has been resolved.

We’re sorry for holding you up! If you have any additional questions or feedback, please submit a new support ticket through our Help Center ("?" button) at the top right-hand side of the Siteimprove platform.
Posted May 01, 2026 - 20:03 UTC

Monitoring

A fix has been implemented and we are monitoring the results. Users should now be able to log in to the Siteimprove platform via siteimprove.com and my2.us.siteimprove.com without disruption.
Posted May 01, 2026 - 17:27 UTC

Identified

The issue has been identified and a fix is being implemented.
Posted May 01, 2026 - 16:24 UTC

Investigating

We are currently experiencing issues where some customers are unable to log in to the Siteimprove platform via siteimprove.com and my2.us.siteimprove.com.

Impact:

- Users who are already logged in may be experiencing disruptions (e.g., modules not loading, blank screens while navigating).
- New login attempts may fail.

Our Development Team is actively working to identify the root cause and resolve the issue as quickly as possible.

We apologize for this inconvenience. We'll continue to keep you updated.
Posted May 01, 2026 - 15:16 UTC
This incident affected: Platform.