RAID 5 shows degraded but the drives look healthy: what should you do?

A Warsaw office NAS reports RAID 5 degraded, but every disk LED is green and SMART looks normal. That does not mean the rebuild button is safe. The warning may come from metadata, member order, parity inconsistency, controller cache or a disk that fails only under real reads.

If this is a wider company RAID failure, start with the first-response guide for RAID failure. The narrow rule here is simple: preserve the array state before changing it.

This guide is about one specific scenario: RAID 5 reports degraded while the member drives still appear online. In that situation the alert is often caused by inconsistent metadata, a changed disk order, a damaged superblock or stale controller state, not by an obvious failure of a member disk.

If the array shows degraded even though the disks seem healthy, the first priority is to stop writes and avoid blind rebuilds. The array may still contain recoverable data, but every write can change parity, metadata or file-system state.

Why RAID 5 can show degraded even when drives look healthy

RAID status is not based only on LEDs. The controller or NAS relies on metadata describing member order, stripe size, parity rotation, offsets and previous state. If that metadata becomes inconsistent, the array can look degraded even before a drive clearly dies.
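To make the role of that metadata concrete, here is a minimal Python sketch of the two ideas RAID 5 depends on: parity as the XOR of the data chunks in a stripe, and parity rotating across members. The chunk contents and the four-member array are made up for illustration, and the rotation formula assumes the left-symmetric layout that Linux md uses by default; other controllers and NAS vendors may rotate parity differently, which is exactly why wrong metadata leads to a wrong reading of otherwise intact disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_member(stripe, n_members):
    """Index of the member holding parity for a given stripe.

    Assumption: left-symmetric rotation (the Linux md default).
    Other controllers may use a different rotation.
    """
    return (n_members - 1) - (stripe % n_members)


# One stripe on a hypothetical 4-member array: three data chunks plus parity.
data_chunks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_chunks)

# Any single missing chunk can be recomputed from the remaining three.
rebuilt = xor_blocks([data_chunks[0], data_chunks[2], parity])
print(rebuilt == data_chunks[1])

# Parity sits on a different member for each consecutive stripe.
print([parity_member(stripe, 4) for stripe in range(5)])
```

The reconstruction only gives the right bytes if the software knows which chunk belongs to which stripe and member. With the wrong order, stripe size or rotation, the same XOR produces plausible-looking garbage.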

Common triggers include power loss during writes, a cleared controller cache, battery backup unit (BBU) warnings, a damaged mdadm or vendor superblock, silent sector errors, firmware bugs and parity mismatch after an interrupted rebuild.

RAID 5 is popular because it balances capacity, redundancy and performance, but its striped parity only works when the controller interprets the whole set correctly. In many lab cases the issue is not a disk with a red light but a controller or NAS misreading the array state after an update, cache loss or forced restart.

Several of these causes can leave every LED green: a damaged superblock, silent sector damage, a write interrupted during a metadata update or a battery-backed cache fault. Identifying the cause matters because the same word, degraded, can point to a logical problem, a controller problem or a member disk that fails only under load.

When you notice the degraded state, stop file shares, virtual machines, backup jobs and databases if you can do it safely. Photograph the disk order before removing anything, and treat SMART as one diagnostic clue, not as permission to rebuild.

The safer technical path is usually to make member-disk images on a separate system and analyse the array from copies. If you contact a specialist, provide the controller model, disk count, disk order and any known stripe size or RAID parameters.
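As an illustration of what imaging a member means in practice, the sketch below copies a source to an image file in chunks and records a SHA-256 checksum so later work can be verified against the copy. The function name and paths are hypothetical. It opens the source read-only and refuses to overwrite an existing image, but it does not handle read errors, so it suits only members that read cleanly; for a disk that returns errors, a dedicated imaging tool such as GNU ddrescue or a lab imager is the more appropriate choice.

```python
import hashlib


def image_member(source_path, image_path, chunk_size=1024 * 1024):
    """Copy a source into a new image file and return the SHA-256 of the data.

    The source is opened read-only. Mode "xb" makes the call fail if the
    image file already exists, so an earlier image is never overwritten.
    """
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

A call might look like image_member("/dev/sdX", "/mnt/images/bay1.img"), where both paths are placeholders and the image lands on a separate system or volume, never on the array itself. Store the returned checksum next to the bay number and serial number of the disk.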

How to tell a logical problem from a real disk failure

SMART is useful, but not final. Compare controller logs, NAS event history, disk serial numbers, bay order, reported capacity, read-error counters and the timeline of recent updates or shutdowns.

If a drive disappears intermittently, produces read errors or changes behaviour under load, treat it as physical risk. If all members read consistently but metadata disagrees, the case may require controlled virtual reconstruction of RAID parameters.

A degraded message alone does not prove that a rebuild is safe. First check whether you are dealing with a failed member or with a logical array problem such as inconsistent metadata, lost controller cache, a changed disk order or an interrupted resync.

This is why SMART alone is not enough. Good SMART values do not exclude parity or ordering problems, and a wrong disk order after restart can make an otherwise readable array look partially damaged.
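One check that SMART cannot provide is a parity consistency test on the member images. In a consistent RAID 5 stripe, the XOR of the chunks at the same position on all members is zero. The sketch below counts positions where that does not hold. It is an illustration under stated assumptions: all members are present as image files, the data area starts at the same offset on every member, and the chunk size and offset values are placeholders, not parameters of any particular controller.

```python
from contextlib import ExitStack


def count_parity_mismatches(image_paths, chunk_size=64 * 1024, data_offset=0):
    """Return (chunks_checked, chunks_with_nonzero_xor) across member images."""
    checked = 0
    mismatches = 0
    with ExitStack() as stack:
        files = [stack.enter_context(open(p, "rb")) for p in image_paths]
        for f in files:
            f.seek(data_offset)
        while True:
            chunks = [f.read(chunk_size) for f in files]
            # Stop at the first short read so only full chunks are compared.
            if any(len(c) < chunk_size for c in chunks):
                break
            acc = 0
            for c in chunks:
                acc ^= int.from_bytes(c, "big")
            checked += 1
            if acc != 0:
                mismatches += 1
    return checked, mismatches
```

Because XOR does not depend on the order of its inputs, a clean result here says nothing about whether the disk order or parity rotation is right. It only separates parity inconsistency from ordering problems, which is why both need to be examined on copies before any rebuild.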

What to prepare before contacting the lab

Do not start Rebuild, Resync, Repair or Initialise.
Stop writes from file shares, VMs, backup jobs and databases where possible.
Photograph bay order and write down serial numbers.
Save screenshots of the alert, storage pool, event logs and NAS/controller model.
Write down what changed before the warning: update, disk swap, power loss, reboot or failed backup.

This information shortens diagnosis and reduces wrong assumptions at the start. For a company environment, also compare your symptoms with what to do after RAID failure in a company and the first 24 hours after server or NAS failure.

How to describe the array before further attempts

Describe the RAID level, number of disks, NAS or controller model, disk order, last successful access, warnings in the panel and whether a rebuild was scheduled, cancelled or already started.
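If it helps to keep the description consistent between colleagues, the same details can be captured as a small structured record. The Python sketch below is only one possible shape: the class name, field names and sample values are hypothetical placeholders, not a format required by any lab or vendor.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ArrayCaseRecord:
    raid_level: str
    disk_count: int
    device_model: str            # NAS or controller model as shown in the panel
    bay_order: list              # disk serial numbers, listed in bay order
    last_successful_access: str  # free text or a timestamp
    panel_warnings: list = field(default_factory=list)
    rebuild_state: str = "not started"  # or: scheduled, cancelled, started

    def is_complete(self):
        """True when one serial number is recorded for every disk."""
        return len(self.bay_order) == self.disk_count


# Placeholder values only; replace them with what the panel and labels show.
record = ArrayCaseRecord(
    raid_level="RAID 5",
    disk_count=4,
    device_model="EXAMPLE-NAS-MODEL",
    bay_order=["SERIAL-BAY-1", "SERIAL-BAY-2", "SERIAL-BAY-3", "SERIAL-BAY-4"],
    last_successful_access="unknown",
    panel_warnings=["Degraded"],
)

print(record.is_complete())
print(json.dumps(asdict(record), indent=2))
```

Writing the serial numbers in bay order at this stage also records the disk sequence before anything is removed from the enclosure.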

For a business system, also note running services: SMB shares, virtual machines, databases, accounting systems, iSCSI LUNs or backup repositories. That helps decide whether the system should be powered down or preserved in a controlled read-only state.

In practice, collect screenshots of the storage pool, disk sequence, last controller messages and the list of services that stopped working. This makes it easier to compare the case with RAID degraded/offline first aid, a broader business RAID failure or a virtualisation case such as VMware, Hyper-V or SAN data recovery.

When improvising becomes too risky

Do not keep experimenting if the array contains business data, backups are uncertain, drive order is unknown, a rebuild failed, or the NAS/controller has changed status several times.

At that point the better path is imaging member disks and reconstructing the RAID layout from copies. See also what not to do with degraded/offline RAID and the first 24 hours after server or NAS failure.

If new alerts appear after the degraded warning, the volume disappears or someone has already tried a rebuild, the situation can quickly stop being just a warning. If the array contains critical data, move straight to the symptom description form instead of testing the array until it changes state again.

How to move from alert to controlled diagnosis

A healthy-looking disk is not proof that the array is safe to rebuild. The goal is to preserve disk order, metadata and readable sectors before the next write operation changes the evidence.

If the array is still visible but the degraded status raises doubts, prepare a lab contact, check how data recovery pricing is assessed and compare your case with the main RAID/NAS data recovery service. That makes it easier to decide whether the environment can still be observed safely or should be stopped for diagnosis.

Safety rule: if RAID 5 is degraded and the data matters, do not rebuild from guesses. Secure the member disks and verify the array parameters first.
