Pending builds on SaaS and Hybrid

Incident Report for Codefresh

Postmortem

Impact:

We had a partial outage (some requests could not reach the platform at all), and some builds were stuck in a pending state for 30 minutes.

Detection:

We detected this issue manually before our automated check, which runs every 10 minutes, alerted us.
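
As an illustration only, a check for builds stuck in pending might look like the sketch below. The function name, the build IDs, and the 5-minute threshold are hypothetical and are not taken from the Codefresh platform or its monitoring.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: flag builds that have been pending longer than this.
PENDING_THRESHOLD = timedelta(minutes=5)


def stuck_builds(pending_since, now, threshold=PENDING_THRESHOLD):
    """Return the sorted IDs of builds pending for longer than `threshold`.

    `pending_since` maps a build ID to the UTC time it entered the pending state.
    """
    return sorted(
        build_id
        for build_id, started in pending_since.items()
        if now - started > threshold
    )


# Made-up sample data for demonstration.
now = datetime(2023, 5, 22, 14, 7, tzinfo=timezone.utc)
sample = {
    "build-a": now - timedelta(minutes=12),  # pending too long
    "build-b": now - timedelta(minutes=2),   # still within the threshold
}
print(stuck_builds(sample, now))
```

Running a check like this more often than every 10 minutes would narrow the window in which an issue can go unnoticed, at the cost of more frequent polling.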

Root Cause:

We had a parallel issue with Firebase logging, and the resulting combination of several small issues caused some pods to become unresponsive.

Resolution:

We reverted our last push to production to test whether the issue was code-related. Once the revert triggered a restart of the services, the issue was resolved.

Posted Jun 20, 2023 - 04:09 UTC

This has now been resolved.

Posted May 22, 2023 - 18:19 UTC

A fix has been implemented and we are monitoring the results.

Posted May 22, 2023 - 14:28 UTC

The issue has been identified and a fix is being implemented.

Posted May 22, 2023 - 14:23 UTC

We are currently investigating this issue.

Posted May 22, 2023 - 14:07 UTC

This incident affected: Codefresh Systems (codefresh.io).