Intermittent issue preventing some links from resolving
Incident Report for Geniuslink
Postmortem

On October 10, 2023, at approximately 7:18AM PDT, our systems experienced a large spike in traffic that caused intermittent issues in click processing as well as limited Dashboard access. Our engineers acted quickly to increase our infrastructure capacity to handle the spike, but intermittent issues persisted until 10:30AM PDT while they worked to resolve them.

While all servers remained online during this time and the majority of clicks were still being processed, some clicks failed to be processed and resulted in timeout errors, with most of the impact limited to the western United States. In total, there were approximately two hours of partial downtime and intermittent service impact before our engineers were able to get everything running smoothly again.

To prevent this from happening again, we have significantly tuned our infrastructure (increased timeouts, better request processing distribution, improved caching, better monitoring, etc.) and increased our overall infrastructure capacity to better handle large spikes in traffic.

Posted Oct 14, 2023 - 08:51 PDT

Resolved
Our team has monitored this issue for the past ~32 hours, since 10:30AM PDT yesterday, to ensure that everything remained operational and that the infrastructure improvements were able to handle the increased traffic during Prime Day. Everything remains operational and there were no further issues throughout this time.
Posted Oct 11, 2023 - 17:27 PDT
Monitoring
Engineers have re-tuned infrastructure significantly to handle the increased traffic. They’ve added server capacity and made other improvements to handle the increased requests. Link resolution as well as dashboard functionality has been fully operational since 10:30AM PDT (it was intermittent starting at 7:30AM PDT), and we are continuing to monitor.
Posted Oct 10, 2023 - 11:34 PDT
Investigating
Dashboard and Links are still intermittently timing out, and all of our engineers are working to get things resolved.
Posted Oct 10, 2023 - 09:59 PDT
Monitoring
The server slowdown has been stabilized as of 8:13AM PDT, and Links/Dashboard should be resolving/loading properly. Our engineers are still monitoring closely to ensure that everything remains operational.
Posted Oct 10, 2023 - 08:40 PDT
Investigating
Today at around 7:18AM PDT, Links and the Dashboard became slow or unresponsive. We are investigating the issue and hope to resolve it shortly.
Posted Oct 10, 2023 - 07:41 PDT
This incident affected: Dashboard and Links.