When a GCP Cloud SQL instance restarts, the first question is usually simple:
Was this expected, or did something go wrong?
The answer determines the entire direction of your investigation. A planned maintenance event might require nothing more than internal documentation, while an unexpected crash could indicate an underlying infrastructure or memory issue that needs immediate attention.
If you come from self-managed MySQL environments, you are likely used to SSHing into a server and tailing a log file. GCP Cloud SQL takes that ability away. Instead, you rely entirely on what Cloud Logging exposes. The catch is that not all critical events end up in the log stream you would naturally check first.
The primary challenge is that most restart events look identical from the application side. Whether the restart was caused by routine maintenance, a manual user action, an out-of-memory (OOM) condition, or a MySQL internal failure, the immediate symptom is always the same: connections drop, the instance becomes temporarily unavailable, and eventually, it comes back online.
The evidence is usually there. You are just looking in the wrong place. Searching the wrong logs can easily lead to a "no results" response, even when the actual cause has been recorded elsewhere in the system.
In the sections below, we will walk through a structured approach to determining whether a Cloud SQL restart was planned or unplanned, exactly where to look for the evidence, and how to identify the underlying cause as quickly as possible.
The Incident That Inspired This Investigation
When a client's instance started restarting intermittently, the obvious first stop was mysql.err, which is where any database administrator expects a crash to show up.
Attempt 1 - MySQL error log, pinned to one stream
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT_ID:INSTANCE_NAME"
log_name="projects/PROJECT_ID/logs/cloudsql.googleapis.com%2Fmysql.err"
Result: No entries returned.
The log_name filter locks the query to a single stream. That stream only contains messages written directly by the MySQL engine itself. If MySQL was terminated by the host operating system before it could flush its logs, there is nothing here to find. The query is perfectly valid; the stream simply does not contain the event.
Rather than assuming nothing happened, the search was widened. The log_name filter was removed entirely and replaced with basic keyword terms. There was no stream pinning, just text matching across every log stream attached to that specific instance.
Attempt 2 - No log_name filter, keyword search across all streams
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT_ID:INSTANCE_NAME"
"OOM" OR "Out of memory" OR "OOM-killer" OR "Killed process"
OR "killed database process" OR "SIGKILL" OR "fatal signal"
Result: INFO 2026-06-02T06:29:25.253260Z Out of memory: killed database process: mysqld
This message was written by the host kernel's OOM killer, not by MySQL. When memory pressure forced the OS to free resources, it terminated mysqld and logged that action to a system-level stream. That stream only shows up when you search without a log_name filter. MySQL never got a chance to write anything because it was gone before it could.
The decisive difference between the two queries is the log_name filter. The same instance and the same timeframe produced a completely different outcome just by changing the scope of the search.
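As a rough sketch of that difference, the two attempts can be generated from one helper in which the stream pin is optional. The function name build_filter and its parameters are hypothetical, not part of any Google library; PROJECT_ID and INSTANCE_NAME remain placeholders.

```python
# Hypothetical helper: builds a Cloud Logging filter string for one instance.
OOM_TERMS = [
    "OOM", "Out of memory", "OOM-killer", "Killed process",
    "killed database process", "SIGKILL", "fatal signal",
]

def build_filter(project, instance, terms=(), log_id=None):
    """Return a filter string; log_id=None leaves the stream unpinned."""
    lines = [
        'resource.type="cloudsql_database"',
        f'resource.labels.database_id="{project}:{instance}"',
    ]
    if log_id is not None:
        # Pinning restricts the search to a single log stream.
        lines.append(f'log_name="projects/{project}/logs/{log_id}"')
    if terms:
        # Quoted terms are OR-ed together and grouped explicitly.
        lines.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
    return "\n".join(lines)

# Attempt 1: pinned to mysql.err, no keywords.
attempt_1 = build_filter("PROJECT_ID", "INSTANCE_NAME",
                         log_id="cloudsql.googleapis.com%2Fmysql.err")
# Attempt 2: no pin, keyword search across every stream of the instance.
attempt_2 = build_filter("PROJECT_ID", "INSTANCE_NAME", terms=OOM_TERMS)

print(attempt_1)
print(attempt_2)
```

Keeping the pin as an optional argument makes it harder to leave a log_name restriction in place by accident when widening a search.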
The Two Categories of Shutdown
Every Cloud SQL shutdown falls into one of two buckets. The investigation approach for each is fundamentally different.
Planned Shutdowns
A planned shutdown is any restart that results from a deliberate action: GCP maintenance, a manual restart, or a triggered HA failover. The goal here is confirmation and documentation, not a technical investigation.
Unplanned Shutdowns
Unplanned shutdowns occur when MySQL or the underlying infrastructure encounters a problem that forces the instance to stop unexpectedly. These events require deeper troubleshooting because the restart is a symptom of a larger issue.
From the application side, both categories look identical. The difference lies in what they leave behind in the logs.
Know Your Log Streams First
Cloud SQL records different events in different streams. Before you start querying Google Cloud Logging, it helps to know which stream captures what.
While mysql.err is often useful, it only captures events generated by the MySQL engine itself. Platform operations, user actions, and host-level events are recorded in other streams, and knowing which stream captures which event type significantly reduces investigation time.
The following log streams are the primary sources of evidence during Cloud SQL restart investigations:
Step-by-Step Investigation Workflow
The workflow below is ordered to eliminate planned restart scenarios first before investigating potential failures. In many cases, you will find the answer within the first few steps.
Step 1: Check for a GCP-managed event
Rules out planned platform operations.
Review the system_event logs for maintenance operations, failovers, and platform-initiated restarts.
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT:INSTANCE"
log_name="projects/PROJECT/logs/cloudaudit.googleapis.com%2Fsystem_event"
Look for: Maintenance started, Maintenance completed, DatabaseInstanceRestart, Failover.
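If you prefer the command line to the Logs Explorer, the same filter can be passed to gcloud logging read. The sketch below only assembles the command and does not run it; the helper name gcloud_read_command and the default freshness and limit values are assumptions chosen for illustration.

```python
import shlex

def gcloud_read_command(project, log_filter, freshness="7d", limit=50):
    """Return the argv list for a `gcloud logging read` call (not executed here)."""
    return [
        "gcloud", "logging", "read", log_filter,
        f"--project={project}",
        f"--freshness={freshness}",  # how far back to search (assumed default)
        f"--limit={limit}",          # cap on returned entries (assumed default)
        "--format=json",
    ]

# PROJECT and INSTANCE are placeholders, as in the Step 1 query.
step1_filter = " AND ".join([
    'resource.type="cloudsql_database"',
    'resource.labels.database_id="PROJECT:INSTANCE"',
    'log_name="projects/PROJECT/logs/cloudaudit.googleapis.com%2Fsystem_event"',
])

argv = gcloud_read_command("PROJECT", step1_filter)
# Shell-quoted form, suitable for pasting into a terminal.
print(shlex.join(argv))
```

Building the argument list first and quoting it with shlex.join avoids hand-escaping the double quotes inside the filter.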
Step 2: Check for a manual restart via Console or CLI
Identifies human-initiated planned actions.
If someone on your team restarted the instance manually, it shows up in the activity audit log, not the system event log.
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT:INSTANCE"
log_name="projects/PROJECT/logs/cloudaudit.googleapis.com%2Factivity"
Look for: cloudsql.instances.restart, cloudsql.instances.update, the principal email that triggered it, and the exact timestamp.
Step 3: Check the MySQL error log for a clean shutdown message
Confirms a graceful exit regardless of who triggered it.
Whether the restart came from GCP maintenance or a manual action, MySQL should have written a clean shutdown message to mysql.err if it exited properly.
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT:INSTANCE"
log_name="projects/PROJECT/logs/cloudsql.googleapis.com%2Fmysql.err"
"Shutdown complete" OR "Received SIGTERM" OR "Normal shutdown"
If this message is present, the database process was not killed hard.
Step 4: Check for InnoDB crash recovery on the next startup
Catches unclean shutdown indicators and MySQL-level InnoDB crash causes.
Crash recovery indicates that the previous shutdown was not clean. If no planned event is found in Steps 1–3, this confirms an unplanned restart.
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT:INSTANCE"
log_name="projects/PROJECT/logs/cloudsql.googleapis.com%2Fmysql.err"
"Starting crash recovery" OR "Log scan progressed past the checkpoint"
OR "InnoDB: Mutexes and rw_locks use" OR "crash recovery" OR "Assertion failure"
OR "mysqld got signal" OR "InnoDB: Corruption"
OR "Table is marked as crashed" OR "error writing"
If internal failure signatures are found here, they usually point directly to the root cause (e.g., storage, corruption, or engine failure).
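When mysql.err entries have been exported for offline review, the Step 3 and Step 4 keywords can be applied as a simple classifier. This is a minimal sketch: the function name classify_shutdown is hypothetical, the crash list is a subset of the Step 4 terms, and the sample messages are illustrative rather than taken from a real instance.

```python
# Step 3 terms: signs of a graceful exit.
CLEAN_MARKERS = ("Shutdown complete", "Received SIGTERM", "Normal shutdown")

# Subset of the Step 4 terms that point to an unclean exit or engine failure.
CRASH_MARKERS = (
    "Starting crash recovery",
    "Log scan progressed past the checkpoint",
    "crash recovery",
    "Assertion failure",
    "mysqld got signal",
    "InnoDB: Corruption",
    "Table is marked as crashed",
)

def classify_shutdown(messages):
    """Return 'unclean', 'clean', or 'unknown' for a list of mysql.err messages."""
    text = "\n".join(messages).lower()  # matching here is case-insensitive
    if any(marker.lower() in text for marker in CRASH_MARKERS):
        return "unclean"
    if any(marker.lower() in text for marker in CLEAN_MARKERS):
        return "clean"
    return "unknown"

# Illustrative messages only.
print(classify_shutdown(["mysqld: Shutdown complete"]))
print(classify_shutdown(["InnoDB: Starting crash recovery."]))
print(classify_shutdown([]))
```

Crash markers are checked first so that a stream containing both a shutdown message and a later recovery message is still flagged as unclean.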
Step 5: Broad text search - no log_name filter
Catches OOM kills, kernel events, and host-layer messages.
Use a broad search without a log_name filter to identify host-level events.
resource.type="cloudsql_database"
resource.labels.database_id="PROJECT:INSTANCE"
"OOM" OR "Out of memory" OR "OOM-killer" OR "Killed process"
OR "killed database process" OR "SIGKILL" OR "fatal signal"
OR "Aborted" OR "host failure" OR "disk full"
CRITICAL: Do not add log_name to this query. Pinning the log name will cause these events to be missed entirely, as they are generated by the host kernel and will not appear in mysql.err or standard audit logs.
Adjust the time window to the period surrounding the restart. If this surfaces a hit, that message tells you the cause category; then look at metrics to understand the underlying pressure.
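One way to bound the search is to append timestamp comparisons to the filter. The sketch below derives them from an approximate restart time; the helper name window_clause and the default window sizes of 30 minutes before and 10 minutes after are assumptions, not recommendations from Google.

```python
from datetime import datetime, timedelta, timezone

def window_clause(restart_time, minutes_before=30, minutes_after=10):
    """Return timestamp bounds (UTC, RFC 3339) around a timezone-aware restart time."""
    restart_utc = restart_time.astimezone(timezone.utc)
    start = restart_utc - timedelta(minutes=minutes_before)
    end = restart_utc + timedelta(minutes=minutes_after)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (f'timestamp>="{start.strftime(fmt)}"\n'
            f'timestamp<="{end.strftime(fmt)}"')

# Restart time taken from the OOM entry shown earlier, truncated to seconds.
restart = datetime(2026, 6, 2, 6, 29, 25, tzinfo=timezone.utc)
print(window_clause(restart))
```

The window is deliberately wider before the restart than after it, on the assumption that memory pressure or other warning signs tend to precede the event.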
What the Evidence Tells You
Log entries rarely tell the full story in isolation. The table below summarizes the most common evidence patterns and the conclusions you can draw from them.
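The way the five steps combine can also be sketched as a small decision function. This is an illustration of the workflow described above, not a definitive rule set: the name interpret_evidence, the flag names, and the order of the checks are assumptions.

```python
def interpret_evidence(system_event=False, manual_restart=False,
                       clean_shutdown=False, crash_recovery=False,
                       host_kill=False):
    """Map per-step hits (True = the step's query returned entries) to a conclusion."""
    if system_event:      # Step 1: maintenance, failover, platform restart
        return "planned: GCP-managed event"
    if manual_restart:    # Step 2: restart or update in the activity audit log
        return "planned: manual restart"
    if host_kill:         # Step 5: OOM kill or other host-layer message
        return "unplanned: host-level kill, review memory metrics"
    if crash_recovery:    # Step 4: unclean shutdown with no planned event
        return "unplanned: unclean shutdown, review mysql.err signatures"
    if clean_shutdown:    # Step 3: graceful exit, trigger not yet identified
        return "graceful exit: trigger not identified"
    return "inconclusive: check Cloud Monitoring metrics"

# The incident described earlier: nothing in mysql.err, OOM hit in the broad search.
print(interpret_evidence(host_kill=True))
```

Host-level kills are checked before crash recovery here because a killed mysqld would also be expected to run recovery on the next start, and the kill message is the more specific finding.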
Final Thoughts
Most database restarts leave enough evidence across audit logs, mysql.err, and broad Cloud Logging searches to give you a clear answer. When logs come up short, Cloud Monitoring metrics usually fill the gap, because memory and connection charts do not depend on log propagation.
While these steps won't cover every single edge case, following this structured workflow will help you find the right answers faster than starting from scratch each time. For more tips on managing database anomalies, explore our other database technical blogs.
Need Expert Help Managing Your Cloud SQL Environment?
Unplanned downtime hurts business. If your team is spending too much time investigating database crashes, optimizing slow queries, or managing infrastructure limits, Mydbops can help. We provide 24/7 proactive monitoring, root cause analysis, and tailored scaling strategies for your GCP Cloud SQL environments.

