Incident Symptoms
Around 14:50 UTC on Friday, August 13th, 2021, our users found that pipelines were missing from their accounts.
Impact and Root Cause
The event was triggered by human error during a routine operation. This error corrupted pipeline data for a portion of our user base. Customers and internal teams detected the event immediately, and the root cause was identified within 15 minutes.
Investigation
The team started working on mitigation and restoration by 15:15 UTC. At this point, we observed performance issues with our hosted database vendor's infrastructure, which caused unexpected delays in our restoration process.
Around 19:20 UTC, after confirming that we were unable to resolve these delays with our vendor, we recommended running manual builds as a workaround so that any necessary changes could be deployed.
Resolution
Our team continued to work with our database vendor to improve restoration efficiency and was able to restore the pipelines completely around 00:30 UTC. New pipelines created during the outage were not replaced or removed, in order to preserve continuity of service.
Post Mortem
Please reach out to support@codefresh.io if you have any questions.