Duration: 15 minutes (2:45 PM - 3:00 PM UTC)
Impact: Webhook events experienced delays of 30-60 seconds during peak traffic.
Root Cause: A database connection pool exhaustion in our webhook delivery service caused queue buildup.
Resolution: We increased the connection pool size and optimized query performance. No data was lost.
Status: All services returned to normal operation. Webhook delivery is now at baseline latency.
Duration: 2 hours (scheduled, 11:00 PM - 1:00 AM UTC)
Impact: Customers in Europe experienced increased latency as traffic was rerouted to US edge nodes. Average response time increased from 20ms to 150ms.
Scope: Scheduled maintenance to upgrade TLS certificates and apply security patches.
Resolution: EU nodes were brought back online at 1:00 AM UTC. All systems nominal.
Status: This was planned downtime communicated 7 days in advance.
Duration: 45 minutes (3:15 PM - 4:00 PM UTC)
Impact: Dashboard page load times increased from 0.8s to 4-5 seconds. API and SDK clients were not affected.
Root Cause: A runaway query on the flags listing endpoint was consuming excessive database resources.
Resolution: We killed the problematic query, added a database index, and deployed a fix within 45 minutes.
Status: Dashboard performance has been restored. Additional monitoring was added for this query.
Get notified when ConfigDate has scheduled maintenance or incidents.