Impact:
Some builds had significant delays in logs appearing in the UI. Build completion state and build time were not affected.
Detection:
Customers reported the impact to us.
Root Cause:
We saturated the incoming capacity of our Firebase instances, which left us unable to write new data to them. As a result, customer builds could not report all of their logs, and in many cases logs were delayed.
The load spiked because of an increase in production logging from another issue that caused builds to stay in pending. This caused an additional surge and, in turn, longer delays in logging.
Resolution:
We have doubled the number of our Firebase instances to better handle spikes in demand. We will also be implementing targeted monitoring of our Firebase instances, and we have improved monitoring of our overall platform state.