Post-Mortem: Authentication Service Disruption
Date: January 29, 2026
Duration: 53 minutes (18:15 – 19:08 CET)
Executive Summary
On Thursday, Jan 29, Trengo experienced a service disruption that prevented users from logging into the platform. The issue was traced to a failure in our internal authentication token renewal process. A fix was deployed at 18:45, and full service was restored by 19:08.
What Happened?
The disruption began at 18:15 when our system-level bearer token, used to communicate with our authentication provider (Stytch), expired. While our automated cron job had successfully requested a new token, a caching error caused the system to store the new token in an incorrect location within our backend cache. As a result, even though the authentication provider was operational, Trengo's backend continued attempting to use the expired token, leading to failed login attempts for all users.
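To make the failure mode concrete, here is a minimal sketch of a write/read cache-key mismatch of the kind described above. All names (cache keys, function names) are illustrative assumptions, not Trengo's actual code: the refresh job writes the new token under one key while the auth client keeps reading from another, so reads continue to return the stale token.

```python
# Hypothetical illustration of the incident's failure mode.
# Key names and functions are assumptions for the sketch, not real internals.

cache = {}

READ_KEY = "auth:stytch:bearer_token"   # key the backend reads tokens from
WRITE_KEY = "auth:stytch:bearer-token"  # subtly different key the refresh job wrote to

def refresh_token(cache, new_token):
    # Bug: the fresh token lands under the wrong cache key.
    cache[WRITE_KEY] = new_token

def get_token(cache, stale_default):
    # Reads never see the fresh token, so the stale one keeps being used.
    return cache.get(READ_KEY, stale_default)

refresh_token(cache, "fresh-token")
print(get_token(cache, "expired-token"))  # → expired-token
```

Even though the refresh itself succeeded, every authentication call continued to present the expired token, which matches the observed symptoms.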
Timeline of Events (CET)
18:15: Initial reports of user logout and login failures.
18:21: Triage initiated. Verified that external providers and recent deployments were stable.
18:35: Root cause identified: the 60-day system bearer token was refreshed but misdirected in the cache.
18:45: Fix deployed to refresh the token using more robust pathing logic.
19:08: System-wide recovery confirmed; all users able to log in.
Corrective Actions
To prevent a recurrence, we are implementing the following:
Cache Validation: Automated verification has been added to ensure refreshed tokens are stored under, and retrievable from, the correct cache keys.
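The cache-validation step above can be sketched as a write-then-read-back check: after storing a refreshed token, immediately fetch it from the same key and fail loudly on any mismatch. The key name, function name, and error handling here are assumptions for illustration, not Trengo's implementation; a production version would alert on-call and retry rather than simply raising.

```python
# Illustrative sketch of post-refresh cache validation.
# TOKEN_KEY and store_and_verify are hypothetical names.

TOKEN_KEY = "auth:stytch:bearer_token"

def store_and_verify(cache, token):
    cache[TOKEN_KEY] = token
    stored = cache.get(TOKEN_KEY)
    if stored != token:
        # In production: page on-call and retry instead of raising.
        raise RuntimeError("token read-back mismatch after refresh")
    return stored

cache = {}
store_and_verify(cache, "fresh-token")
print(cache[TOKEN_KEY])  # → fresh-token
```

Because the verification reads through the same key the backend uses, a misdirected write like the one in this incident would be caught at refresh time instead of at the token's next use.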