* How to un-degrade an array after a totally spurious failure?
From: Nix @ 2009-05-20 23:10 UTC (permalink / raw)
  To: linux-raid

So this just happened on one of my older machines:

sym0: SCSI parity error detected: SCR1=132 DBC=50000000 SBCL=0
sd 0:0:0:0: [sda] ABORT operation started
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:0:0: [sda] ABORT operation started
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:2:0: [sdb] ABORT operation started
sd 0:0:2:0: ABORT operation timed-out.
sd 0:0:0:0: [sda] DEVICE RESET operation started
sd 0:0:0:0: DEVICE RESET operation timed-out.
sd 0:0:2:0: [sdb] DEVICE RESET operation started
sd 0:0:2:0: DEVICE RESET operation timed-out.
sd 0:0:0:0: [sda] BUS RESET operation started
sym0: SCSI BUS reset detected.
sym0: SCSI BUS has been reset.
sd 0:0:0:0: BUS RESET operation complete.
end_request: I/O error, dev sdb, sector 128591
md: super_written gets error=-5, uptodate=0
raid5: Disk failure on sdb6, disabling device.
raid5: Operation continuing on 2 devices.
RAID5 conf printout:
 --- rd:3 wd:2
 disk 0, o:1, dev:sda6
 disk 1, o:0, dev:sdb6
 disk 2, o:1, dev:sdd5
RAID5 conf printout:
 --- rd:3 wd:2
 disk 0, o:1, dev:sda6
 disk 2, o:1, dev:sdd5

This failure is quasi-spurious: nothing is actually wrong with the disks
(just one cable, which throws an error like this about once every two
years and otherwise works perfectly well, though the error has never
overlapped with a RAID superblock write before), so I'd like the drive
to be pulled back into the array sharpish. But it's quite unclear how to
do that. I can't afford to take the array down, but will accept (because
I must) the background hit of an array reconstruction.

Normally I'd just try things until one works, but if I get a command
wrong now then several rather important and long-running (months)
processes trickling writes to that array will be interrupted and I'll be
in rather a lot of trouble.

So, anyone got a command that would help? I'm not even sure if this is
assembly or growth: it doesn't quite fit into either of those
categories. There must be a way to do this, surely?

