* RAID6 reshape stalls immediately
@ 2015-11-04 23:39 Peter Chubb
  2015-11-06 13:48 ` Anugraha Sinha
From: Peter Chubb @ 2015-11-04 23:39 UTC (permalink / raw)
  To: linux-raid

Hi Folks,
   I added two disks to my RAID5 array and then attempted to reshape it
   to RAID 6.  It has been sitting at 0% complete, with no disk I/O,
   for 24 hours now.

   Is there any way to kick the reshape process?

/proc/mdstat is:

  Personalities : [raid6] [raid5] [raid4] 
  md0 : active raid6 sdi1[6] sdh1[5] sdd1[4] sde1[2] sdb1[1] sda1[0]
      5860122624 blocks super 1.2 level 6, 512k chunk, algorithm 18 [6/5] [UUUU_U]
      [>....................]  reshape =  0.0% (0/1953374208) finish=1308.7min speed=24099K/sec
      bitmap: 0/15 pages [0KB], 65536KB chunk

  unused devices: <none>


What I did:
  mdadm --add /dev/md0 /dev/sdh1
  mdadm --add /dev/md0 /dev/sdi1
  mdadm --grow /dev/md0 --level=6 --raid-devices=6 --backup-file=/root/raid5-backup

dmesg reported:
[691739.298345] md: bind<sdh1>
[691739.364534] RAID conf printout:
[691739.364537]  --- level:5 rd:4 wd:4
[691739.364539]  disk 0, o:1, dev:sda1
[691739.364540]  disk 1, o:1, dev:sdb1
[691739.364541]  disk 2, o:1, dev:sde1
[691739.364542]  disk 3, o:1, dev:sdd1
[691741.832242] md: bind<sdi1>
[691741.898470] RAID conf printout:
[691741.898474]  --- level:5 rd:4 wd:4
[691741.898476]  disk 0, o:1, dev:sda1
[691741.898478]  disk 1, o:1, dev:sdb1
[691741.898480]  disk 2, o:1, dev:sde1
[691741.898481]  disk 3, o:1, dev:sdd1
[691741.898482] RAID conf printout:
[691741.898482]  --- level:5 rd:4 wd:4
[691741.898484]  disk 0, o:1, dev:sda1
[691741.898485]  disk 1, o:1, dev:sdb1
[691741.898486]  disk 2, o:1, dev:sde1
[691741.898487]  disk 3, o:1, dev:sdd1
[691805.469105] md/raid:md0: device sdd1 operational as raid disk 3
[691805.469110] md/raid:md0: device sde1 operational as raid disk 2
[691805.469111] md/raid:md0: device sdb1 operational as raid disk 1
[691805.469112] md/raid:md0: device sda1 operational as raid disk 0
[691805.469551] md/raid:md0: allocated 5424kB
[691805.506035] md/raid:md0: raid level 6 active with 4 out of 5 devices, algorithm 18
[691805.506050] RAID conf printout:
[691805.506051]  --- level:6 rd:5 wd:4
[691805.506053]  disk 0, o:1, dev:sda1
[691805.506054]  disk 1, o:1, dev:sdb1
[691805.506055]  disk 2, o:1, dev:sde1
[691805.506056]  disk 3, o:1, dev:sdd1
[691805.847329] RAID conf printout:
[691805.847333]  --- level:6 rd:6 wd:5
[691805.847335]  disk 0, o:1, dev:sda1
[691805.847336]  disk 1, o:1, dev:sdb1
[691805.847337]  disk 2, o:1, dev:sde1
[691805.847338]  disk 3, o:1, dev:sdd1
[691805.847340]  disk 4, o:1, dev:sdi1
[691805.847350] RAID conf printout:
[691805.847350]  --- level:6 rd:6 wd:5
[691805.847351]  disk 0, o:1, dev:sda1
[691805.847352]  disk 1, o:1, dev:sdb1
[691805.847353]  disk 2, o:1, dev:sde1
[691805.847354]  disk 3, o:1, dev:sdd1
[691805.847354]  disk 4, o:1, dev:sdi1
[691805.847355]  disk 5, o:1, dev:sdh1
[691805.847424] md: reshape of RAID array md0
[691805.847426] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[691805.847428] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[691805.847439] md: using 128k window, over a total of 1953374208k.

And nothing since.
-- 
Dr Peter Chubb				         http://www.data61.csiro.au
http://www.ssrg.nicta.com.au   Software Systems Research Group/NICTA/Data61


* Re: RAID6 reshape stalls immediately
  2015-11-04 23:39 RAID6 reshape stalls immediately Peter Chubb
@ 2015-11-06 13:48 ` Anugraha Sinha
  2015-11-06 13:54   ` Anugraha Sinha
  2015-11-08  4:18   ` Peter Chubb
From: Anugraha Sinha @ 2015-11-06 13:48 UTC (permalink / raw)
  To: Peter Chubb, linux-raid

Dear Peter,

What is happening on the root partition where you are trying to take the backup?

Any I/O, file size, etc.?

Also, could you share the output of mdadm --examine for each of your 6
RAID members individually?
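
For example, something like this (device names taken from your mdstat;
adjust the list if they have moved):

  for d in /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdh1 /dev/sdi1; do
      echo "== $d =="
      mdadm --examine "$d"
  done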

On 11/5/2015 8:39 AM, Peter Chubb wrote:
\snip\


* Re: RAID6 reshape stalls immediately
  2015-11-06 13:48 ` Anugraha Sinha
@ 2015-11-06 13:54   ` Anugraha Sinha
  2015-11-08  4:18   ` Peter Chubb
From: Anugraha Sinha @ 2015-11-06 13:54 UTC (permalink / raw)
  To: Peter Chubb, linux-raid

Dear Peter,

Also, please look at one of the earlier mails on the linux-raid mailing
list.
Subject : RAID6 reshape stalls immediately

It discusses a similar problem when a backup-file is given.

Regards
Anugraha

On 11/6/2015 10:48 PM, Anugraha Sinha wrote:
> Dear Peter,
>
> What is happening on the root partition where you are trying to take the
> backup?
>
> Any I/O, file size, etc.?
>
> Also, could you share the output of mdadm --examine for each of your 6 RAID
> members individually?
>
\snip\


* Re: RAID6 reshape stalls immediately
  2015-11-06 13:48 ` Anugraha Sinha
  2015-11-06 13:54   ` Anugraha Sinha
@ 2015-11-08  4:18   ` Peter Chubb
  2015-11-09 13:27     ` Anugraha Sinha
From: Peter Chubb @ 2015-11-08  4:18 UTC (permalink / raw)
  To: Anugraha Sinha; +Cc: Peter Chubb, linux-raid

>>>>> "Anugraha" == Anugraha Sinha <asinha.mailinglist@gmail.com> writes:

Anugraha> Dear Peter, What is happening on the root partition where
Anugraha> you are trying to take the backup?

Anugraha> Any I/O, file size, etc.?

ls -l /root/raid5-backup
-rw------- 1 root root 6295552 Nov  4 12:04 raid5-backup

It hasn't changed since the first write when the reshape started.

Anugraha> Also, could you share the output of mdadm --examine for each
Anugraha> of your 6 RAID members individually?

Here:
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
           Name : lemma:0  (local to host lemma)
  Creation Time : Tue Oct 27 11:52:53 2015
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
     Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 4f5d0861:9a512ce8:d4233208:5bd1e094

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Sun Nov  8 15:12:08 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : d1f1a56f - correct
         Events : 48094

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
           Name : lemma:0  (local to host lemma)
  Creation Time : Tue Oct 27 11:52:53 2015
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
     Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : fbd91407:208c66cc:1020f996:d7b1cd5d

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Sun Nov  8 15:12:08 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : b3ec395a - correct
         Events : 48094

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
           Name : lemma:0  (local to host lemma)
  Creation Time : Tue Oct 27 11:52:53 2015
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
     Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 074e15ab:5508d37d:931e4e42:ecb6476e

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Sun Nov  8 15:12:08 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : c5282d36 - correct
         Events : 48094

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
           Name : lemma:0  (local to host lemma)
  Creation Time : Tue Oct 27 11:52:53 2015
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
     Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 785d7bee:4bd7939c:8303ba47:64f40faf

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Sun Nov  8 15:12:08 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 6d832e04 - correct
         Events : 48094

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdh1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
           Name : lemma:0  (local to host lemma)
  Creation Time : Tue Oct 27 11:52:53 2015
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
     Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : a473e815:4efb6be4:45db7fe4:c3f5dce2

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Sun Nov  8 15:12:08 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : ad5b4157 - correct
         Events : 48094

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdi1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x7
     Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
           Name : lemma:0  (local to host lemma)
  Creation Time : Tue Oct 27 11:52:53 2015
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
     Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
Recovery Offset : 0 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 46872307:b4a769e3:38ebf84c:a031d4dd

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Sun Nov  8 15:12:08 2015
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1044d32 - correct
         Events : 48094

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)


And /proc/mdstat is still:
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid6 sdi1[6] sdh1[5] sdd1[4] sde1[2] sdb1[1] sda1[0]
      5860122624 blocks super 1.2 level 6, 512k chunk, algorithm 18 [6/5] [UUUU_U]
      [>....................]  reshape =  0.0% (0/1953374208) finish=5766.8min speed=5469K/sec
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>


* Re: RAID6 reshape stalls immediately
  2015-11-08  4:18   ` Peter Chubb
@ 2015-11-09 13:27     ` Anugraha Sinha
  2015-11-09 13:38       ` Phil Turmel
From: Anugraha Sinha @ 2015-11-09 13:27 UTC (permalink / raw)
  To: Peter Chubb; +Cc: linux-raid, Phil Turmel

Dear Peter,

Apologies for the late reply.

On 11/8/2015 1:18 PM, Peter Chubb wrote:
>
> ls -l /root/raid5-backup
> -rw------- 1 root root 6295552 Nov  4 12:04 raid5-backup
>
> It hasn't changed since the first write when the reshape started.

That means no writes are happening to the backup file on the root
partition.
Just as a sanity check: I hope the root partition is not full.
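
A quick way to rule that out (checks both free space and free inodes):

  df -h /root
  df -i /root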

>
> /dev/sdb1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x5
>       Array UUID : 6ff23c3d:01042464:77338dc6:710dfaee
>             Name : lemma:0  (local to host lemma)
>    Creation Time : Tue Oct 27 11:52:53 2015
>       Raid Level : raid6
>     Raid Devices : 6
>
>   Avail Dev Size : 3906748416 (1862.88 GiB 2000.26 GB)
>       Array Size : 7813496832 (7451.53 GiB 8001.02 GB)
>      Data Offset : 262144 sectors
>     Super Offset : 8 sectors
>     Unused Space : before=262056 sectors, after=0 sectors
>            State : clean
>      Device UUID : fbd91407:208c66cc:1020f996:d7b1cd5d
>
> Internal Bitmap : 8 sectors from superblock
>    Reshape pos'n : 0
>    Delta Devices : 1 (5->6)
>       New Layout : left-symmetric
>
>      Update Time : Sun Nov  8 15:12:08 2015
>    Bad Block Log : 512 entries available at offset 72 sectors
>         Checksum : b3ec395a - correct
>           Events : 48094
>
>           Layout : left-symmetric-6
>       Chunk Size : 512K
>
>     Device Role : Active device 1
>     Array State : AAAAAA ('A' == active, '.' == missing, 'R' == replacing)

\snip\

> And /proc/mdstat is still:
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sdi1[6] sdh1[5] sdd1[4] sde1[2] sdb1[1] sda1[0]
>        5860122624 blocks super 1.2 level 6, 512k chunk, algorithm 18 [6/5] [UUUU_U]
>        [>....................]  reshape =  0.0% (0/1953374208) finish=5766.8min speed=5469K/sec
>        bitmap: 0/15 pages [0KB], 65536KB chunk
>
> unused devices: <none>
>

\snip\
1. Your mdstat shows one member down ([UUUU_U]), yet --examine reports
the array state as all active (AAAAAA). I am not sure why.

Could you check which daemon (process) is running the reshape for md0
and strace it to see where it is waiting?
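
For example (the exact process names are a guess; if the only match is
a kernel thread such as md0_raid6 or md0_reshape, strace cannot attach
to it, and the user-space mdadm process is the one to look at):

  ps -ef | grep -E 'mdadm|md0' | grep -v grep
  strace -f -p <pid from above>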

Also, I would like to check the last few dmesg entries on your system:
has anything new appeared since the resyncing/reshaping started, beyond
what you shared in the last mail?
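
For instance:

  dmesg | tail -n 50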

I am looping in Phil explicitly for some help here.

@Phil,
Need some help here!

Regards
Anugraha Sinha


* Re: RAID6 reshape stalls immediately
  2015-11-09 13:27     ` Anugraha Sinha
@ 2015-11-09 13:38       ` Phil Turmel
  2015-11-09 22:02         ` Peter Chubb
From: Phil Turmel @ 2015-11-09 13:38 UTC (permalink / raw)
  To: Anugraha Sinha, Peter Chubb; +Cc: linux-raid

On 11/09/2015 08:27 AM, Anugraha Sinha wrote:

> I am looping in to Phil explicitly for some help here.
> 
> @Phil,
> Need some help here!

I don't know why it's stuck.  This has never happened to me :-).

Neil helped someone recently with a similar problem, and suggested
omitting the backup-file.  It isn't needed for --grow operations that
increase capacity.  I also recall some discussions about systemd vs.
mdmon issues -- if that applies to you.

If omitting the --backup-file fails, I would reboot into a rescue CD
with a recent kernel and mdadm version.  I recommend www.sysresccd.org
for such things.  Then use mdadm --grow --continue within the rescue
environment.
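
That is, roughly (a sketch; I have not had to do this myself):

  mdadm --grow --continue /dev/md0
  cat /proc/mdstat    # the reshape counter should start moving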

Phil


* Re: RAID6 reshape stalls immediately
  2015-11-09 13:38       ` Phil Turmel
@ 2015-11-09 22:02         ` Peter Chubb
  2015-11-09 22:26           ` Phil Turmel
From: Peter Chubb @ 2015-11-09 22:02 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Anugraha Sinha, Peter Chubb, linux-raid

>>>>> "Phil" == Phil Turmel <philip@turmel.org> writes:

Phil> On 11/09/2015 08:27 AM, Anugraha Sinha wrote:
>> I am looping in to Phil explicitly for some help here.
>> 
>> @Phil, Need some help here!

Phil> I don't know why it's stuck.  This has never happened to me :-).
Phil> If omitting the --backup-file fails, I would reboot into a
Phil> rescue CD with a recent kernel and mdadm version.  I recommend
Phil> www.sysresccd.org for such things.  Then use mdadm --grow
Phil> --continue within the rescue environment.


I just did
  mdadm --grow --continue /dev/md0

and the reshape started up again.  I didn't realise that mdadm had to
keep going to do the reshape -- it must have died when I logged off
last time.  So many other operations are handled entirely in the
kernel...

Peter C
-- 
Dr Peter Chubb				         http://www.data61.csiro.au
http://www.ssrg.nicta.com.au   Software Systems Research Group/NICTA/Data61


* Re: RAID6 reshape stalls immediately
  2015-11-09 22:02         ` Peter Chubb
@ 2015-11-09 22:26           ` Phil Turmel
From: Phil Turmel @ 2015-11-09 22:26 UTC (permalink / raw)
  To: Peter Chubb; +Cc: Anugraha Sinha, linux-raid

On 11/09/2015 05:02 PM, Peter Chubb wrote:

> I just did
>   mdadm --grow --continue /dev/md0
> 
> and the reshape started up again.  I didn't realise that mdadm had to
> keep going to do the reshape -- it must have died when I logged off
> last time.  So many other operations are handled entirely in the
> kernel...

mdadm spawns a copy of mdmon to perform these tasks.  If the process
hierarchy is strictly enforced, and spawned processes are killed off
when the highest parent dies, that's a problem for mdadm.  I recall
some discussions about this with systemd and/or cgroups.  You might
want to make sure all of your utilities have the latest fixes.
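
If systemd's session cleanup is the culprit, this is the setting I am
thinking of (when enabled, logind kills leftover user processes at
logout):

  grep -i KillUserProcesses /etc/systemd/logind.conf

Running the grow under nohup, screen or tmux should also keep the
spawned process alive across logout, e.g.:

  nohup mdadm --grow --continue /dev/md0 &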

Phil


