From: David T-G <davidtg-robot@justpickone.org>
To: Linux RAID list <linux-raid@vger.kernel.org>
Subject: failed disks, mapper, and "Invalid argument"
Date: Wed, 20 May 2020 16:05:14 -0400
Message-ID: <20200520200514.GE1415@justpickone.org>
Hi, all --
I have a four-partition RAID5 array of which one disk failed while I was
out of town and a second failed just today. Both failed smartctl tests
by not even starting, although I don't have that captured. Those two
were on a SATA daughtercard, so I moved them (formerly sde, sdf)
up to the motherboard SATA ports like the other two (still sda, sdb) and
now all are visible and happily pass smartctl checks and generally look
good ... except that my md0 doesn't :-(
I've been through the wiki and other found documentation and have scraped
the archives, but the whole mapper thing is new to me, and I don't know
enough to pin down the error. I've been attempting to fake-build my
array with overlay devices to see how it will do. Please forgive the
long post if it's a bit ridiculous; I wanted to make sure that you have
all information :-)
Here's the array after I swapped ports and booted up:
diskfarm:root:10:~> mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Mon Feb 6 00:56:35 2017
Raid Level : raid5
Used Dev Size : 4294967295
Raid Devices : 4
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Mon May 18 01:10:07 2020
State : active, FAILED, Not Started
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : diskfarm:0 (local to host diskfarm)
UUID : ca7008ef:90693dae:6c231ad7:08b3f92d
Events : 57840
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
- 0 0 1 removed
- 0 0 2 removed
4 8 1 3 active sync /dev/sda1
diskfarm:root:10:~> mdadm --examine /dev/sd[abcd]1 | egrep '/dev|vents'
/dev/sda1:
Events : 57840
/dev/sdb1:
Events : 57840
/dev/sdc1:
Events : 57836
/dev/sdd1:
Events : 48959
I'd say sdd is the former sde, which went away first, and sdc is the
former sdf, which only just fell over.
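The event counts above put sdc1 4 events behind the newest members and sdd1 8881 behind. A minimal sketch of that comparison follows; the function name `event_lag` is hypothetical, and it only parses `mdadm --examine` text from stdin, so it can be tried on saved output without touching any device.

```shell
# Sketch: list each member's event count and how far it lags the newest one.
# Reads `mdadm --examine` output on stdin; event_lag is a hypothetical helper name.
event_lag() {
    awk '
        /^\/dev\// { sub(/:$/, "", $1); dev = $1 }          # "/dev/sda1:" header line
        $1 == "Events" {                                    # "Events : 57840"
            ev[dev] = $3; order[++n] = dev
            if ($3 > max) max = $3
        }
        END {
            # one line per device: name, event count, lag behind the maximum
            for (i = 1; i <= n; i++)
                printf "%s %d %d\n", order[i], ev[order[i]], max - ev[order[i]]
        }
    '
}
```

Typical use would be `mdadm --examine /dev/sd[abcd]1 | event_lag` as root. A small lag is usually what `--assemble --force` is meant to paper over, while a large one suggests that member is badly stale.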
In my first round, I shut down md0
diskfarm:root:12:~> mdadm --stop /dev/md0
mdadm: stopped /dev/md0
diskfarm:root:12:~> cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdf2[0] sdg2[1] sdh2[3]
1464622080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
and of course it isn't in mdstat any more. Oops. But it's down, so we
won't see any more writes that could be messy.
I whipped up four loop devices and created overlay files
diskfarm:root:13:/mnt/scratch/disks> parallel truncate -s8G overlay-{/} ::: $DEVICES
...
To silence this citation notice: run 'parallel --citation'.
diskfarm:root:13:/mnt/scratch/disks> ls -goh
total 33M
-rw-r--r-- 1 8.0G May 20 14:00 overlay-sda1
-rw-r--r-- 1 8.0G May 20 14:00 overlay-sdb1
-rw-r--r-- 1 8.0G May 20 14:00 overlay-sdc1
-rw-r--r-- 1 8.0G May 20 14:00 overlay-sdd1
-rw-r--r-- 1 11K May 20 13:20 smartctl-a.sda.out
-rw-r--r-- 1 5.3K May 20 13:20 smartctl-a.sdb.out
-rw-r--r-- 1 5.3K May 20 13:20 smartctl-a.sdc.out
-rw-r--r-- 1 5.3K May 20 13:20 smartctl-a.sdd.out
diskfarm:root:13:/mnt/scratch/disks> du -skhc overlay-sd*
8.0M overlay-sda1
8.0M overlay-sdb1
8.0M overlay-sdc1
8.0M overlay-sdd1
32M total
diskfarm:root:13:/mnt/scratch/disks> ls -goh /dev/mapper/*
crw------- 1 10, 236 May 20 08:04 /dev/mapper/control
lrwxrwxrwx 1 7 May 20 14:02 /dev/mapper/sda1 -> ../dm-1
lrwxrwxrwx 1 7 May 20 14:02 /dev/mapper/sdb1 -> ../dm-0
lrwxrwxrwx 1 7 May 20 14:02 /dev/mapper/sdc1 -> ../dm-2
lrwxrwxrwx 1 7 May 20 14:02 /dev/mapper/sdd1 -> ../dm-3
and grabbed my overlays and checked the mapper
diskfarm:root:13:/mnt/scratch/disks> OVERLAYS=$(parallel echo /dev/mapper/{/} ::: $DEVICES)
diskfarm:root:13:/mnt/scratch/disks> echo $OVERLAYS
/dev/mapper/sda1 /dev/mapper/sdb1 /dev/mapper/sdc1 /dev/mapper/sdd1
diskfarm:root:13:/mnt/scratch/disks> dmsetup status
sdb1: 0 3518805647 snapshot 16/16777216 16
sdc1: 0 3518805647 snapshot 16/16777216 16
sda1: 0 3518805647 snapshot 16/16777216 16
sdd1: 0 3518805647 snapshot 16/16777216 16
and so far it looks good ... as far as I know :-)
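For readers new to the mapper, here is a rough sketch of what an overlay setup of this kind does per member, plus a reading of the `dmsetup status` fields shown above. It is not the exact command sequence that was run (part of that output is elided above). The helper names `snapshot_table`, `setup_overlay` and `overlay_usage` are hypothetical, and the `P 8` (persistent, 8-sector chunk) parameters are an assumption.

```shell
# Sketch only: setup_overlay needs root and real devices; nothing runs on load.

# Build a dm snapshot table line: start, length, target, origin device,
# copy-on-write device, P = persistent, 8 = chunk size in 512-byte sectors.
snapshot_table() {   # usage: snapshot_table SECTORS ORIGIN COWDEV
    printf '0 %s snapshot %s %s P 8\n' "$1" "$2" "$3"
}

# Create a sparse overlay file and a snapshot mapping for one member.
# Writes to /dev/mapper/NAME land in the overlay file, not on the real partition.
setup_overlay() (    # usage: setup_overlay /dev/sdX1
    dev=$1
    name=${dev##*/}
    sectors=$(blockdev --getsz "$dev") || exit 1           # size in 512-byte sectors
    truncate -s 8G "overlay-$name" || exit 1               # sparse file, grows on write
    loop=$(losetup -f --show -- "overlay-$name") || exit 1 # attach file to a free loop device
    snapshot_table "$sectors" "$dev" "$loop" | dmsetup create "$name"
)

# Interpret `dmsetup status` lines for snapshot targets:
# "<name>: 0 <len> snapshot <used>/<total> <metadata>", all in sectors.
overlay_usage() {
    awk '$4 == "snapshot" {
             sub(/:$/, "", $1); split($5, a, "/")
             printf "%s used=%d total=%d\n", $1, a[1], a[2]
         }'
}
```

Read this way, `16/16777216` means 16 of 16777216 overlay sectors are in use, and 16777216 sectors of 512 bytes is the 8G overlay file. The other number, 3518805647, is the length of the mapped device in sectors.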
I didn't know if I should try md0, the real array name, or create a new
md1, so I took the safe approach first
diskfarm:root:13:/mnt/scratch/disks> mdadm --assemble --force /dev/md1 $OVERLAYS
mdadm: forcing event count in /dev/mapper/sdc1(2) from 57836 upto 57840
mdadm: clearing FAULTY flag for device 2 in /dev/md1 for /dev/mapper/sdc1
mdadm: Marking array /dev/md1 as 'clean'
mdadm: failed to add /dev/mapper/sdd1 to /dev/md1: Invalid argument
mdadm: failed to add /dev/mapper/sdc1 to /dev/md1: Invalid argument
mdadm: failed to add /dev/mapper/sda1 to /dev/md1: Invalid argument
mdadm: failed to add /dev/mapper/sdb1 to /dev/md1: Invalid argument
mdadm: failed to RUN_ARRAY /dev/md1: Invalid argument
diskfarm:root:13:/mnt/scratch/disks> cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdf2[0] sdg2[1] sdh2[3]
1464622080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
diskfarm:root:13:/mnt/scratch/disks> mdadm --examine /dev/md1
mdadm: cannot open /dev/md1: No such file or directory
but didn't get to move on to the next wiki step. I crossed my fingers
and tried md0
diskfarm:root:13:/mnt/scratch/disks> mdadm --assemble --force /dev/md0 $OVERLAYS
mdadm: failed to add /dev/mapper/sdd1 to /dev/md0: Invalid argument
mdadm: failed to add /dev/mapper/sdc1 to /dev/md0: Invalid argument
mdadm: failed to add /dev/mapper/sda1 to /dev/md0: Invalid argument
mdadm: failed to add /dev/mapper/sdb1 to /dev/md0: Invalid argument
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
diskfarm:root:13:/mnt/scratch/disks> mdadm --assemble --force /dev/md0 --verbose $OVERLAYS
mdadm: looking for devices for /dev/md0
mdadm: /dev/mapper/sda1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/mapper/sdb1 is identified as a member of /dev/md0, slot 0.
mdadm: /dev/mapper/sdc1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/mapper/sdd1 is identified as a member of /dev/md0, slot 1.
mdadm: failed to add /dev/mapper/sdd1 to /dev/md0: Invalid argument
mdadm: failed to add /dev/mapper/sdc1 to /dev/md0: Invalid argument
mdadm: failed to add /dev/mapper/sda1 to /dev/md0: Invalid argument
mdadm: failed to add /dev/mapper/sdb1 to /dev/md0: Invalid argument
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
diskfarm:root:13:/mnt/scratch/disks> mdadm --detail /dev/md0
mdadm: cannot open /dev/md0: No such file or directory
and STILL got nowhere. It was at this point that I figured I need to
back away and call for help! I don't want to try rebuilding the actual
array in case it's out of sync and I lose data.
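One sanity check that may be worth running at this point is whether each overlay is as long as the partition underneath it. The mapped length reported by `dmsetup status` above is 3518805647 sectors, which is below 2^32 (4294967296). The real partition sizes are not shown in this post, so this is only a guess: if the partitions are larger than that, a size read through a 32-bit interface would wrap (`blockdev --getsize`, for example, is documented as 32-bit and deprecated in favor of `--getsz`). The sketch below compares the two values; `same_size` and `check_overlay_sizes` are hypothetical helper names.

```shell
# Sketch: compare each overlay's mapped length with its origin partition.
# check_overlay_sizes needs root; same_size is a plain comparison that can be tried alone.
same_size() {             # usage: same_size ORIGIN_SECTORS MAPPED_SECTORS
    [ "$1" -eq "$2" ]
}

check_overlay_sizes() {   # usage: check_overlay_sizes /dev/sda1 /dev/sdb1 ...
    for dev in "$@"; do
        name=${dev##*/}
        origin=$(blockdev --getsz "$dev") || return 1                 # 512-byte sectors
        mapped=$(dmsetup table "$name" | awk '{ print $2 }') || return 1  # table length field
        if same_size "$origin" "$mapped"; then
            echo "$name: ok ($origin sectors)"
        else
            echo "$name: MISMATCH origin=$origin mapped=$mapped"
        fi
    done
}
```

A mismatch would mean the mapped device is not the same length as the partition it stands in for, which could plausibly explain a refusal to add it. A match would rule this out and point elsewhere.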
Soooooo... There it is. Any suggestions to correct whatever oops I've
made or complete a step I overlooked? Any ideas why my assemble didn't?
TIA & HAND
:-D
--
David T-G
See http://justpickone.org/davidtg/email/
See http://justpickone.org/davidtg/tofu.txt