RAID degraded or offline: what not to do before diagnosis
A company NAS in Warsaw drops from degraded to offline, or a controller prompts for a rebuild after a disk swap. At this point one wrong click can do more damage than the original fault: drives reinserted in the wrong order, a rebuild started blind, metadata overwritten by a newly created array, or a resync that overwrites data based on stale parity.
Why degraded/offline RAID is a high-risk state
Degraded means the array is running without full redundancy or with inconsistent copies. Offline means access may already be lost. In both states, the priority is to stop writes and preserve the current evidence: member order, metadata, logs and readable sectors.
For arrays and NAS systems, the controlled route is RAID/NAS data recovery. Avoid rebuilds until the array parameters are verified.
What absolutely not to do before diagnosis
- Do not start rebuild or resync just to see what happens.
- Do not initialise disks or create new volumes.
- Do not update controller or NAS firmware during the incident.
- Do not change disk order or swap drives randomly.
- Do not run file-system repair tools such as fsck or chkdsk on the RAID volume.
What to do instead: response checklist
- Stop services that write data: virtual machines, databases, shares and backups.
- Take photos or screenshots of array status, bay order and messages.
- Label drives by original slot and serial number.
- Perform a controlled shutdown if keeping the system running means more writes.
- Prepare disk images before reconstruction whenever possible.
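The imaging step in the checklist above can be sketched as follows. This is a minimal Python illustration, not a replacement for a dedicated imaging tool such as GNU ddrescue; the function name `image_readonly`, the block size and any paths are assumptions made for the example. It opens the source read-only, copies it block by block, and zero-fills anything it cannot read so that offsets in the image stay aligned with the original disk.

```python
import os


def image_readonly(source_path, image_path, block_size=64 * 1024):
    """Copy a source disk or file to an image without writing to the source.

    Returns a list of (offset, length) ranges that could not be read.
    Those ranges are zero-filled in the image so offsets stay aligned.
    """
    bad_ranges = []
    # "rb" opens the source read-only; buffering=0 avoids read-ahead
    # hiding which block actually failed.
    with open(source_path, "rb", buffering=0) as src, open(image_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(length)
            except OSError:
                data = b""
            if len(data) < length:
                # Simplification: a failed or short read is treated as unreadable.
                bad_ranges.append((offset + len(data), length - len(data)))
                data += b"\x00" * (length - len(data))
            dst.write(data)
            offset += length
    return bad_ranges
```

The image should be written to a separate, healthy target disk, never to another member of the array. The returned ranges show which parts of the image are padding rather than real data.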
When to report the case
Report the case when business data is involved, a rebuild failed, a second disk reports errors, the NAS marks drives inconsistently or the array was already reconfigured.
Common mistakes that complicate RAID reconstruction
The most damaging mistakes are changing disk order, accepting an initialisation prompt, replacing several drives at once, running repair tools on the volume and rebuilding after the wrong disk was selected.
What to prepare before handing RAID over for diagnosis
Prepare the NAS/controller model, RAID level if known, number of drives, original bay order, serial numbers, screenshots and a short timeline of actions already taken.
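One way to keep this information together is a small structured record. The sketch below is only an illustration in Python: the class names, field names and status values are hypothetical and do not correspond to any vendor format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Member:
    bay: int                  # original slot, as labelled on the drive
    serial: str               # serial number read from the drive label
    status: str = "unknown"   # status as displayed by the NAS or controller


@dataclass
class IncidentRecord:
    device_model: str                 # NAS or controller model
    raid_level: Optional[str] = None  # None when the level is not known
    members: List[Member] = field(default_factory=list)
    timeline: List[str] = field(default_factory=list)  # actions already taken

    def to_json(self):
        """Serialise the record so it can be handed over with the drives."""
        return json.dumps(asdict(self), indent=2)
```

Each entry in `timeline` should state what was done and in what order, for example that a drive was removed, replaced or reinserted, without interpreting the result.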
How to prepare the array for diagnosis
Do not connect member drives individually to a normal operating system, which may write to them when it mounts or initialises a disk. Keep them labelled, avoid new writes and describe whether any disk was removed, replaced, reinserted or rebuilt.
When not to delay escalation
Do not delay when the array holds production data, virtual machines, accounting databases, client files or the only copy of projects. Every hour that services keep writing can reduce the recovery options.
How to move from diagnosis to action
Diagnosis should identify member order, parity rotation, stripe size, metadata state and disk health before logical repair. Recovery should happen from images or controlled copies.
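As an illustration of checking metadata state on images rather than on the original disks, the sketch below tests parity consistency across member images. It assumes RAID 5, where the XOR of the blocks at the same offset on all members is zero regardless of member order or parity rotation, and it assumes all images start at the same data offset. The function name `parity_mismatches` and its parameters are hypothetical. Regions that were never synchronised can also show mismatches, so the output is a hint for further analysis, not a verdict.

```python
def xor_is_zero(chunks):
    """Return True if the XOR of all equal-length byte strings is all zeros."""
    acc = 0
    for chunk in chunks:
        acc ^= int.from_bytes(chunk, "big")
    return acc == 0


def parity_mismatches(image_paths, chunk_size=64 * 1024, max_chunks=1024):
    """Return offsets where the XOR across all member images is non-zero.

    Only reads from the images; chunk_size is just the scan granularity
    and does not need to match the array's real stripe size.
    """
    files = [open(path, "rb") for path in image_paths]
    mismatches = []
    try:
        for index in range(max_chunks):
            offset = index * chunk_size
            chunks = []
            for f in files:
                f.seek(offset)
                chunks.append(f.read(chunk_size))
            if any(len(c) < chunk_size for c in chunks):
                break  # reached the end of the shortest image
            if not xor_is_zero(chunks):
                mismatches.append(offset)
    finally:
        for f in files:
            f.close()
    return mismatches
```

A long run of mismatches that begins at one offset can suggest that a member is stale from that point on. This check does not recover member order or stripe size; those still have to be determined separately.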
Safety rule: preserve the array before recovery. A fast rebuild can be more dangerous than the original fault.