Intermittent problems with a data lake database cluster on US-2

Resolved

We've now resolved the incident. Every post-incident operation has been successfully completed. Thank you for your patience.

Recovering

95% of the affected teams have been migrated. We are continuing the post-incident restoration work. Continuous monitoring shows that the cleanup is successful and the restored teams are operating normally. Thank you for your continued patience.

Recovering

85% of the affected teams have been migrated. We are continuing the post-incident restoration work. Thank you for your continued patience.

Recovering

70% of the affected teams have been migrated. We are continuing the post-incident restoration work. Thank you for your continued patience.

Recovering

40% of the affected teams have been migrated. We are continuing with the post-incident restoration work.

Recovering

We have stabilized the database cluster and everything is operational. However, our team has identified that, after stabilizing the cluster, there is a chance of performance degradation in the data jobs. To make sure the systems are operating at full capacity, we are going to proactively move the data for the affected teams to a new cluster. To minimize the impact, we are performing the migrations one by one, because data jobs will be impacted during the migration itself. The duration of each migration depends on the team size. Thank you for your continued patience. Our team is working hard to close the incident.

Identified

Our team has now stabilized the database cluster, but we are still actively monitoring it.

Investigating

We are still working on a mitigation plan. We apologize for the inconvenience.

Investigating

Our monitoring has identified a problem with one of our database clusters in the US-2 realm. Due to this problem, a small number of production teams may experience data job failures or longer execution times, as well as problems with the SQL Workbench functionality. We are currently investigating the problem and working on a mitigation plan.

Began at:

Affected components
  • US-2