Classic Pipeline Logging Delays
Incident Report for Codefresh
Postmortem

Impact:
Some builds had significant delays in logs appearing in the UI. Build completion state and build time were not affected.

Detection:
Customers reported the impact to us.

Root Cause:

We saturated the incoming capacity of our Firebase instances, which prevented new data from being written to them. As a result, customer builds were unable to report all of their logs, and in many cases logs were delayed.

The load spiked because a separate issue, which caused builds to stay in a pending state, increased logging in production. This additional surge extended the logging delays.

Resolution:

We have doubled our Firebase instances to better handle spikes in demand. We will also be implementing targeted monitoring of our Firebase instances, and we have improved monitoring of our overall platform state.

Posted Jun 20, 2023 - 04:09 UTC

Resolved
This incident has been resolved.
Posted May 23, 2023 - 14:20 UTC
Monitoring
A fix has been implemented to address this issue and we are monitoring the results.
Posted May 22, 2023 - 23:15 UTC
Update
We are continuing to investigate this issue.
Posted May 22, 2023 - 18:18 UTC
Investigating
We are currently investigating occurrences where logs for Classic Pipeline steps are significantly delayed in appearing in the UI. Pipelines are continuing to operate as expected.
Posted May 22, 2023 - 18:17 UTC
This incident affected: Codefresh Systems (Codefresh Classic UI).