💾 Storage with RAID 5 - Protecting Your Data [Part 3 of 10]

Building redundant storage so one drive failure doesn’t ruin your day
:wrench::optical_disk::shield::bar_chart:

You’ve got your server running Ubuntu. Now let’s talk about storage.

If you’re running a HomeLab, you’re storing important data: photos, documents, configurations, databases. Losing that data because a single hard drive failed would be devastating.

Enter RAID 5 - a way to combine multiple drives into one array with built-in redundancy.


:brain: What is RAID 5?

RAID = Redundant Array of Independent Disks

RAID 5 combines 3 or more drives into a single storage pool where:

  • Data is striped across all drives (fast performance)
  • Parity information is distributed across drives (redundancy)
  • One drive can fail without losing any data
  • You get (N-1) × drive size of usable space

Example with 3 × 2TB drives:

  • Total raw capacity: 6TB
  • Usable capacity: ~4TB (one drive’s worth is used for parity)
  • Can survive: 1 drive failure
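The usable-capacity formula is easy to sanity-check with shell arithmetic (a throwaway sketch; adjust `drives` and `size_tb` for your hardware):

```shell
# RAID 5 usable capacity = (N - 1) × drive size.
# Values below match the 3 × 2TB example above.
drives=3
size_tb=2
echo "Usable: $(( (drives - 1) * size_tb )) TB"
# → Usable: 4 TB
```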

:light_bulb: Why RAID 5 for HomeLab?

:white_check_mark: Pros

  • Data protection - One drive can fail without data loss
  • Good performance - Reads are fast due to striping
  • Efficient - Only “wastes” one drive’s capacity for redundancy
  • Cost-effective - Better than mirroring (RAID 1) for capacity

:cross_mark: Cons

  • Slower writes - Parity calculations add overhead
  • Rebuild stress - Rebuilding after failure stresses remaining drives
  • Not a backup - RAID protects against drive failure, NOT accidental deletion, corruption, or disasters

:bullseye: Best For

  • Docker container data
  • Media libraries
  • File storage
  • Database storage
  • Anything you can’t afford to lose

:warning: RAID is NOT a Backup

Important: RAID protects against hardware failure, not:

  • Accidental file deletion
  • Ransomware/malware
  • Fire, flood, theft
  • Filesystem corruption
  • User error

You still need backups. RAID is one layer of protection, not the only one.


:hammer_and_wrench: What You’ll Need

Choosing Your Drives

You have three main options, each with different trade-offs:

Option 1: NVMe SSDs (What I Use)

  • Speed: Fastest (3000+ MB/s)
  • Cost: Most expensive ($$$$)
  • Capacity: Typically 1-2TB per drive
  • Best for: Maximum performance, VMs, databases
  • Requirements: M.2 slots on motherboard
  • Example: 3× 2TB NVMe = ~4TB usable for $300-600

Option 2: SATA SSDs (Best Value for Speed)

  • Speed: Fast (500-550 MB/s)
  • Cost: Moderate ($$$)
  • Capacity: 2-4TB per drive
  • Best for: Great balance of speed and capacity
  • Requirements: SATA ports on motherboard, 2.5" drive bays
  • Example: 3× 4TB SATA SSD = ~8TB usable for $400-700

Option 3: 3.5" HDDs (Maximum Capacity)

  • Speed: Slower (150-200 MB/s)
  • Cost: Cheapest ($$)
  • Capacity: 4TB+ per drive (up to 20TB+)
  • Best for: Media storage, backups, large file archives
  • Requirements: SATA ports on motherboard, 3.5" drive bays in case
  • Example: 3× 8TB HDD = ~16TB usable for $300-450

My recommendation: For most HomeLab tasks (Docker containers, file storage, media), all three options work great. Choose based on your budget and capacity needs.

Important Considerations

Before buying drives, verify:

  • :white_check_mark: Motherboard has enough connections
    • NVMe: Check how many M.2 slots
    • SATA: Check how many SATA ports (usually 4-8)
  • :white_check_mark: Case has drive bays
    • 2.5" bays for SATA SSDs
    • 3.5" bays for HDDs
    • M.2 slots don’t need bays
  • :white_check_mark: Power supply has enough connectors AND wattage
    • SATA drives need SATA power cables
    • NVMe draws power from motherboard
    • HDDs use more power: Each 3.5" HDD draws ~10W (idle) to 20W (active)
    • Example: 4× HDDs = 40-80W additional power draw
    • If using multiple large HDDs, consider a higher wattage PSU (650W+ recommended)

Drive compatibility:

  • Same size drives recommended (e.g., all 4TB)
  • Can mix brands (I’m using WD + Samsung)
  • Can’t mix types (don’t mix HDD + SSD in same array)

Hardware Summary

  • 3 or more drives (same size recommended)
  • Available connections on motherboard
  • Drive bays in case (if using SATA/HDD)
  • Sufficient power from PSU

Software

  • mdadm - Linux software RAID tool (free, built into Ubuntu)
  • ext4 - Filesystem (reliable, well-supported)

My setup (as example):

  • 3 × 2TB NVMe SSDs (2× WD_BLACK SN850X, 1× Samsung 980 PRO)
  • ~4TB usable capacity (df shows this as 3.6T because it reports binary TiB units)
  • Mounted at /mnt/storage

:clipboard: RAID 5 Setup Process

Step 1: Install mdadm

# Update package lists
sudo apt update

# Install mdadm
sudo apt install mdadm -y

During installation, you’ll be asked about email notifications. You can configure this later.
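If you skip the email prompt now, you can wire up notifications later. A minimal sketch, assuming a working mail transport agent is installed (the address is a placeholder):

```shell
# Add a notification address to mdadm.conf (placeholder address)
echo "MAILADDR you@example.com" | sudo tee -a /etc/mdadm/mdadm.conf

# Ask the monitor to send a test alert for each array to confirm delivery
sudo mdadm --monitor --scan --test --oneshot
```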


Step 2: Identify Your Drives

Find your drives:

lsblk

You’ll see output like:

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
nvme0n1     259:0    0 931.5G  0 disk 
├─nvme0n1p1 259:1    0   512M  0 part /boot/efi
└─nvme0n1p2 259:2    0   931G  0 part /
nvme1n1     259:3    0   1.8T  0 disk 
nvme2n1     259:4    0   1.8T  0 disk 
nvme3n1     259:5    0   1.8T  0 disk 

In this example:

  • nvme0n1 = OS drive (don’t touch this!)
  • nvme1n1, nvme2n1, nvme3n1 = Empty drives for RAID

Your drives might be named:

  • sda, sdb, sdc (SATA drives)
  • nvme0n1, nvme1n1, nvme2n1 (NVMe drives)

:warning: WARNING: Make absolutely sure you’re using the correct drives. Creating a RAID array will erase all data on those drives.
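Before creating the array, it's worth confirming each candidate drive really is empty. A quick, non-destructive check (adjust the device name to yours):

```shell
# List any filesystem/RAID signatures without erasing anything (-n = no-act)
sudo wipefs -n /dev/nvme1n1

# Confirm there are no partitions or active mountpoints on the drive
lsblk -o NAME,SIZE,TYPE,MOUNTPOINTS /dev/nvme1n1
```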


Step 3: Create the RAID 5 Array

Create the array (adjust drive names to match yours):

# Create RAID 5 array with 3 drives
# Replace nvme1n1, nvme2n1, nvme3n1 with YOUR drive names
sudo mdadm --create /dev/md0 \
  --level=5 \
  --raid-devices=3 \
  /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1

You’ll be asked to confirm. Type y and press Enter.

The array will start building:

# Check progress
cat /proc/mdstat

You’ll see something like:

md0 : active raid5 nvme3n1[3] nvme2n1[1] nvme1n1[0]
      3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      [>....................]  resync =  2.3% (45678912/1953382400) finish=180.5min speed=176234K/sec

This can take hours depending on drive size. The array is usable during this time, but performance will be reduced.
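If the initial sync is crawling, the kernel's md resync speed limits can be raised temporarily (a sketch; values are in KiB/s and reset on reboot):

```shell
# Raise the md resync speed floor and ceiling for the current boot
echo 100000 | sudo tee /proc/sys/dev/raid/speed_limit_min
echo 500000 | sudo tee /proc/sys/dev/raid/speed_limit_max
```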


Step 4: Create Filesystem

Once the array is created (you can do this while it’s still syncing):

# Create ext4 filesystem on the array
sudo mkfs.ext4 /dev/md0

This takes a few minutes.
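mkfs.ext4 usually detects md geometry on its own, but you can also pass the alignment explicitly. For the 512 KB chunk and 2 data disks in a 3-drive RAID 5: stride = 512 KiB chunk ÷ 4 KiB block = 128, and stripe-width = 128 × 2 data disks = 256. An alternative invocation (a sketch; only needed if auto-detection doesn't kick in):

```shell
# Create ext4 with explicit RAID alignment for a 3-drive, 512 KiB-chunk array
sudo mkfs.ext4 -E stride=128,stripe-width=256 /dev/md0
```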


Step 5: Create Mount Point and Mount Array

# Create mount point
sudo mkdir -p /mnt/storage

# Mount the array
sudo mount /dev/md0 /mnt/storage

# Verify it's mounted
df -h /mnt/storage

You should see:

Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        3.6T   28K  3.4T   1% /mnt/storage

Step 6: Make it Permanent (Auto-mount on Boot)

Save RAID configuration:

# Scan for RAID arrays and save configuration
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

# Update initramfs (so array is available at boot)
sudo update-initramfs -u

Add to fstab for auto-mounting:

# Edit fstab
sudo nano /etc/fstab

Add this line at the end:

/dev/md0  /mnt/storage  ext4  defaults  0  2

Save and exit: Ctrl+X, then Y, then Enter

Test the fstab entry:

# Unmount
sudo umount /mnt/storage

# Mount using fstab
sudo mount -a

# Verify
df -h /mnt/storage

If it mounts successfully, you’re good!
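Optionally, you can mount by filesystem UUID instead of /dev/md0, which keeps working even if the array is renumbered (e.g. to md127) on a later boot. The UUID below is illustrative:

```shell
# Find the filesystem UUID on the array
sudo blkid /dev/md0

# Then use a line like this in /etc/fstab instead of /dev/md0
# (replace the UUID with the one blkid printed):
# UUID=1234abcd-5678-90ef-abcd-1234567890ab  /mnt/storage  ext4  defaults  0  2
```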


Step 7: Set Permissions

# Create docker directory
sudo mkdir -p /mnt/storage/docker

# Set ownership (replace 'admin' with your username)
sudo chown -R admin:admin /mnt/storage

# Verify
ls -la /mnt/storage

:white_check_mark: Verify Everything Works

Check Array Status

# Quick status
cat /proc/mdstat

# Detailed information
sudo mdadm --detail /dev/md0

Look for:

  • State : clean (array is healthy)
  • [3/3] [UUU] (all 3 drives are Up)

Check Filesystem

df -Th /mnt/storage

Should show:

Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/md0       ext4  3.6T   28K  3.4T   1% /mnt/storage

:wrench: Monitoring and Maintenance

Check Array Health Regularly

# Quick check
cat /proc/mdstat

# Detailed status
sudo mdadm --detail /dev/md0
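The `[UUU]`/`[_UU]` status string makes a degraded array easy to spot from a script. A tiny check (a sketch; the `check_mdstat` helper name is my own):

```shell
# Report a warning if any drive slot in the mdstat status string shows "_"
check_mdstat() {
  if grep -qE '\[[U_]*_[U_]*\]' "$1"; then
    echo "WARNING: degraded array"
  else
    echo "healthy"
  fi
}

# Run it against the live kernel status, if present
if [ -r /proc/mdstat ]; then
  check_mdstat /proc/mdstat
fi
```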

Run Monthly Data Scrubs

A scrub checks for data corruption:

# Start a scrub
echo check | sudo tee /sys/block/md0/md/sync_action

# Monitor progress
watch cat /proc/mdstat

# Check the mismatch count after completion
cat /sys/block/md0/md/mismatch_cnt

A check pass only counts mismatches; it doesn't fix them. If the count is non-zero, write repair instead of check to the same sync_action file to rewrite the inconsistent parity.

Recommended: Set up a monthly cron job to run scrubs automatically.
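One way to automate this is a cron entry. Note that Ubuntu's mdadm package may already ship a monthly checkarray job in /etc/cron.d/mdadm, so check there first. A sketch of a manual alternative:

```shell
# Run a scrub at 02:00 on the 1st of every month (runs as root)
echo '0 2 1 * * root echo check > /sys/block/md0/md/sync_action' | \
  sudo tee /etc/cron.d/mdadm-scrub
```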


:police_car_light: What Happens When a Drive Fails?

Detecting Failure

The array will automatically detect a failed drive:

cat /proc/mdstat

You’ll see:

md0 : active raid5 nvme3n1[3] nvme2n1[1] nvme1n1[0](F)
      3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]

Notice:

  • [3/2] - 2 out of 3 drives working
  • [_UU] - First drive is down, others are Up
  • (F) - Failed drive marker

Your data is still safe! RAID 5 can run with one failed drive.


Replacing a Failed Drive

1. Remove failed drive from array:

# Mark as failed (if not auto-detected)
sudo mdadm --manage /dev/md0 --fail /dev/nvme1n1

# Remove from array
sudo mdadm --manage /dev/md0 --remove /dev/nvme1n1

2. Physically replace the drive

  • Shut down server
  • Replace failed drive with new one
  • Boot back up

3. Add new drive to array:

# Add new drive (adjust name if needed)
sudo mdadm --manage /dev/md0 --add /dev/nvme1n1

4. Monitor rebuild:

watch cat /proc/mdstat

Rebuild will take hours. The array is usable during this time, but performance is reduced.


:high_voltage: Performance Tips

Current Configuration

  • Chunk size: 512 KB (good for large files)
  • Filesystem: ext4 (reliable, well-supported)
  • Read performance: Excellent (striped across drives)
  • Write performance: Moderate (parity overhead)

Optimize for Your Workload

For large sequential files (media, backups):

  • 512 KB chunk size is ideal (default)

For small random files (databases):

  • Consider RAID 10 instead (better random I/O)

For maximum performance:

  • Use SSDs instead of HDDs (what I’m using)
  • NVMe > SATA SSD > HDD

:brain: TL;DR

  • RAID 5 combines 3+ drives with redundancy
  • One drive can fail without data loss
  • Usable capacity: (N-1) × drive size
  • Setup: Install mdadm, create array, format, mount, configure auto-mount
  • Maintenance: Monitor health, run monthly scrubs
  • RAID ≠ Backup - You still need separate backups
  • Next: We’ll install Docker & Portainer in Part 4

:speech_balloon: Your Turn

What RAID level are you using (or planning to use)?
HDDs or SSDs for your array?
Ever had a drive fail and RAID save the day?

Drop a comment below!


Navigation: ← Part 2 | Part 4 →