* Re: RAID 6 Reshape Woes
       [not found] <41BC47FD-C02B-4DDA-BF1C-75032831AA29@abitofthisabitofthat.com>
@ 2015-11-19  1:23 ` Phil Turmel
  2015-11-19  2:45   ` Francisco Parada
  0 siblings, 1 reply; 3+ messages in thread
From: Phil Turmel @ 2015-11-19  1:23 UTC (permalink / raw)
  To: Francisco Parada, linux-raid

On 11/18/2015 08:07 PM, Francisco Parada wrote:
> Resending, previous message got rejected due to “HTML”.  Damn Apple Mail ;-)

Heh, but let me fix that typo:  Damn Apple ;-)

> Hi all,
> 
> I thought I had corrected all the flaws in my setup, but I was mistaken.  I took care of the hard drive timeout mismatch I encountered in a thread a little over a week ago, subject “RAID 6 Not Mounting (Block device is empty)”, by adding “smartctl -l scterc,70,70 /dev/sdX” and “for x in /sys/block/*/device/timeout ; do echo 180 > $x ; done” to my boot scripts.  I took care of my PSU issue by replacing my enclosure’s defective PSU with a new one that tested OK with a multimeter.  Today, however, I have to report some bad news once again.

Ugly.
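
(For the archives: a boot-time script along those lines would look
roughly like the sketch below; the drive letters are placeholders to
adapt to your own setup.)

  #!/bin/sh
  # Ask each drive to give up on an unreadable sector after 7.0 seconds
  # (scterc takes tenths of a second), so md sees the read error promptly.
  for d in /dev/sd[e-k]; do
      smartctl -l scterc,70,70 "$d"
  done
  # Give the kernel a generous 180s command timeout so the SCSI layer
  # doesn't reset a drive that is still doing internal error recovery.
  for x in /sys/block/*/device/timeout; do
      echo 180 > "$x"
  done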

> After stressing my rebuilt array for a few days by adding large amounts of data and noting no further syslog errors, I decided that I could not live with only 18GB of disk space remaining.  Since my last post, I’ve accumulated an additional terabyte, so I ran out of space.  I had a spare drive at the ready, so I decided to run "mdadm --grow --raid-devices=7 --backup-file=/root/grow_md126.bak /dev/md126" to go from a 6-drive RAID 6 array to a 7-drive array.  All was good for about a minute, and then my nightmare began.  Luckily, I have a backup from before that extra terabyte, which I can live with losing, though I’d rather not.

Time to toss some enclosures and/or cables.

> mdstat output:
> ====================================================================================
> Every 1.0s: cat /proc/mdstat                                                                                        Wed Nov 18 19:25:02 2015
> 
> Personalities : [raid0] [linear] [multipath] [raid1] [raid6] [raid5] [raid4] [raid10]
> md126 : active raid6 sdh[0](F) sdk[6] sdg[5](F) sdf[4](F) sde[3](F) sdj[2] sdi[1]
>       11720540160 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/3] [_UU___U]
>       [>....................]  reshape =  0.0% (2726560/2930135040) finish=193325.8min speed=252K/sec
>       bitmap: 1/22 pages [4KB], 65536KB chunk

Hmmm.  Slow as molasses.

> The device is still mounted and I can access all the data in it.

Probably not.  You are just seeing kernel block cache effects, I suspect.

> At 18:55:24, I started my rebuild:
> =====================================================================================================
> Nov 18 18:55:24 DoctorBanner mdadm[1127]: RebuildStarted event detected on md device /dev/md126
> =====================================================================================================

Uhm, what?  What command or action did you take?  Or are you simply
doing a "flashback" to the start of this process?

> Then 3 seconds later (18:55:27), the first “reshape interrupted” message appeared, but I didn’t notice, because the array was chugging along at 9KB/s according to /proc/mdstat:
> =====================================================================================================
> Nov 18 18:55:27 DoctorBanner kernel: [77563.553030] md: md126: reshape interrupted.
> =====================================================================================================
> 
> At some point after starting the reshape, and before the following entries, I ran “echo 50000 > /proc/sys/dev/raid/speed_limit_min” to help speed it up, so I think this may be what triggered the issue.
> 
> It continued to reshape for about 5 minutes, and then things got really ugly:
> =====================================================================================================
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163377] ata7.00: failed to read SCR 1 (Emask=0x40)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163382] ata7.01: failed to read SCR 1 (Emask=0x40)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163384] ata7.02: failed to read SCR 1 (Emask=0x40)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163385] ata7.03: failed to read SCR 1 (Emask=0x40)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163386] ata7.04: failed to read SCR 1 (Emask=0x40)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163388] ata7.05: failed to read SCR 1 (Emask=0x40)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163392] ata7.15: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163394] ata7.15: irq_stat 0x08000000, interface fatal error
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163397] ata7.15: SError: { Handshk }
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163399] ata7.00: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163402] ata7.00: failed command: WRITE DMA EXT
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163406] ata7.00: cmd 35/00:40:40:fd:56/00:05:00:00:00/e0 tag 23 dma 688128 out
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163406]          res 50/00:00:7f:6b:6c/00:00:00:00:00/e0 Emask 0x100 (unknown error)
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163408] ata7.00: status: { DRDY }
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163410] ata7.01: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163412] ata7.02: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163414] ata7.03: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163416] ata7.04: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163418] ata7.05: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> Nov 18 19:00:31 DoctorBanner kernel: [77868.163422] ata7.15: hard resetting link
> Nov 18 19:00:41 DoctorBanner kernel: [77878.160885] ata7.15: softreset failed (1st FIS failed)
> Nov 18 19:00:41 DoctorBanner kernel: [77878.160893] ata7.15: hard resetting link
> Nov 18 19:00:51 DoctorBanner kernel: [77888.162415] ata7.15: softreset failed (1st FIS failed)
> Nov 18 19:00:51 DoctorBanner kernel: [77888.162423] ata7.15: hard resetting link
> Nov 18 19:01:26 DoctorBanner kernel: [77923.153671] ata7.15: softreset failed (1st FIS failed)
> Nov 18 19:01:26 DoctorBanner kernel: [77923.153679] ata7.15: limiting SATA link speed to 1.5 Gbps
> Nov 18 19:01:26 DoctorBanner kernel: [77923.153683] ata7.15: hard resetting link
> Nov 18 19:01:31 DoctorBanner kernel: [77928.160337] ata7.15: softreset failed (1st FIS failed)
> Nov 18 19:01:31 DoctorBanner kernel: [77928.160344] ata7.15: failed to reset PMP, giving up
> Nov 18 19:01:31 DoctorBanner kernel: [77928.160347] ata7.15: Port Multiplier detaching
> =====================================================================================================
> 
> 
> The kernel then proceeded to reject I/O and offline the devices (full syslog attached).
> 
> I’m kind of alright with losing this one, since now I have a decent backup.  But is it even possible to recover from a failure like this while it’s reshaping?

Stop the array completely.  Use --assemble --force with all of the
drives, including the new one.  Include the same --backup-file.
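
Something like this, with the member names taken from your mdstat output
above (double-check them first, since drive letters can move around
after a reboot):

  mdadm --stop /dev/md126
  mdadm --assemble --force --backup-file=/root/grow_md126.bak \
      /dev/md126 /dev/sd[e-k]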

> I’m going to start chalking it up to the PCIe Port Multiplier being the root of the problem.

Likely.  Are the port multipliers capable of the same speeds as the
drives and controllers?

> What do you guys think?

New enclosures & controllers so you can ditch the port multipliers?

Phil
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: RAID 6 Reshape Woes
  2015-11-19  1:23 ` RAID 6 Reshape Woes Phil Turmel
@ 2015-11-19  2:45   ` Francisco Parada
  2015-11-19  4:23     ` Adam Goryachev
  0 siblings, 1 reply; 3+ messages in thread
From: Francisco Parada @ 2015-11-19  2:45 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid


> Heh, but let me fix that typo:  Damn Apple ;-)
:-D I set myself up for that one!

> Time to toss some enclosures and/or cables.
Yeah, going to have to get some new toys for the holidays :-)

> Uhm, what?  What command or action did you take?  Or are you simply
> doing a "flashback" to the start of this process?
Yeah, I was just narrating my actions along with my syslog events.

> Stop the array completely.  Use --assemble --force with all of the
> drives, including the new one.  Include the same --backup-file.
Thanks. I unmounted, but my "mdadm --stop /dev/md126" command isn't exiting ... going to give it a few more minutes. According to "ps -ef", the "/lib/systemd/systemd-udev" process is keeping it busy. I tried killing that PID with the "-9" argument as the root user, but no such luck.

> Likely.  Are the port multipliers capable of the same speeds as the
> drives and controllers?
I believe the enclosure's backplanes are 1.5Gbps, and the mismatch with my 3Gbps multiplier might be the root of my issue.

>> What do you guys think?
> 
> New enclosures & controllers so you can ditch the port multipliers?
Yeah, I'm thinking of just building a 4U box with an LSI RAID controller I have lying around, running it as JBOD and letting mdadm do the lifting, and being done with these damn cases already. I was just trying to do the best with what I had, but going the port multiplier route is what got me into this mess. I have a couple of fully spec'ed SunFire x2270 1U systems that I wanted to take advantage of, and by using these external enclosures I was getting around their 4 drive bay limitation. I guess I'll build something cheap just to mount the drives, share them via Samba and NFS to the SunFire systems, and do the heavy CPU lifting from there instead.

Thanks,

Cisco


* Re: RAID 6 Reshape Woes
  2015-11-19  2:45   ` Francisco Parada
@ 2015-11-19  4:23     ` Adam Goryachev
  0 siblings, 0 replies; 3+ messages in thread
From: Adam Goryachev @ 2015-11-19  4:23 UTC (permalink / raw)
  To: Francisco Parada; +Cc: linux-raid



On 18/11/2015 21:45, Francisco Parada wrote:
>
>>> What do you guys think?
>> New enclosures & controllers so you can ditch the port multipliers?
> Yeah, I'm thinking of just building a 4U box with an LSI RAID controller I have lying around, running it as JBOD and letting mdadm do the lifting, and being done with these damn cases already. I was just trying to do the best with what I had, but going the port multiplier route is what got me into this mess. I have a couple of fully spec'ed SunFire x2270 1U systems that I wanted to take advantage of, and by using these external enclosures I was getting around their 4 drive bay limitation. I guess I'll build something cheap just to mount the drives, share them via Samba and NFS to the SunFire systems, and do the heavy CPU lifting from there instead.

I've used nbd before to export a drive from one machine for use as a 
RAID1 member, paired with a local block device under mdadm, and that 
worked really reliably for me (the remote nbd device was also set 
write-mostly).
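
Roughly the shape of that setup, from memory -- host and device names
are just illustrative, and newer nbd-server releases expect a config
file rather than these command-line arguments:

  # On the machine donating a disk: export the partition over the network.
  nbd-server 10809 /dev/sdb1

  # On the machine holding the array: attach the export as /dev/nbd0 ...
  nbd-client otherhost 10809 /dev/nbd0

  # ... and mirror it with the local partition, the remote half marked
  # write-mostly so normal reads are served from the local disk.
  mdadm --create /dev/md1 --level=1 --raid-devices=2 \
      /dev/sda3 --write-mostly /dev/nbd0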

Since then I've also used DRBD to manage both parts of that.

However, in your case, I think there is a more modern version of nbd 
which basically exports a block device as a SATA device over the 
network. Could you utilise your "couple" of 4-bay 1RU servers, have 
each of them use a single drive (or a partition from a couple of drives 
in RAID1) for itself, and export the remaining partition (95% of the 
disk space) to the one "server", which would then combine them all with 
mdadm RAID6 or whatever is needed?

One problem you have here is that you will lose too many "members" 
whenever one of your machines reboots.

Other options come to mind, but they certainly become more complex, and 
further away from where you are now.

Not sure if any of the above is useful, or if anyone else can comment on 
how well those options work. Especially the SATA export for use with 
mdadm would be interesting (better if each server exported only one 
disk, so you can lose a machine or disk without data loss).

Regards,
Adam

