Intermittent problems with a data lake database cluster on US-2

Uptime Impact: 2 hours, 20 minutes, and 39 seconds
Resolved

We've now resolved the incident. Every post-incident operation has been successfully completed. Thank you for your patience.

Recovering

95% of the affected teams have been migrated. We are continuing the post-incident restoration work. Continuous monitoring shows that the cleanup is successful and the restored teams are operating normally. Thank you for your continued patience.

Recovering

85% of the affected teams have been migrated. We are continuing the post-incident restoration work. Thank you for your continued patience.

Recovering

70% of the affected teams have been migrated. We are continuing the post-incident restoration work. Thank you for your continued patience.

Recovering

40% of the affected teams have been migrated. We are continuing with the post-incident restoration work.

Recovering

We have stabilized the database cluster and everything is operational. However, our team has identified that, now that the cluster is stabilized, there is a chance of performance degradation in the data jobs. To make sure the systems are operating at full capacity, we are going to proactively move the data for the affected teams to a new cluster. To minimize the impact, we are performing the migrations one by one, because data jobs will be impacted during the migration itself. The duration of each migration depends on the team size. Thank you for your continued patience. Our team is working hard to close the incident.

Identified

Our team has now stabilized the database cluster, but we are still actively monitoring it.

Investigating

We are still working on a mitigation plan. We apologize for the inconvenience.

Investigating

Our monitoring has identified a problem with one of our database clusters on the US-2 realm. Due to this problem, a small number of production teams may experience data job failures or longer execution times, as well as problems with the SQL Workbench functionality. We are currently investigating the problem and working on a mitigation plan.

Began at:

Affected components
  • US-2