* Array died during grow; now resync stopped
@ 2015-02-02 9:41 Jörg Habenicht
2015-02-02 19:40 ` NeilBrown
0 siblings, 1 reply; 6+ messages in thread
From: Jörg Habenicht @ 2015-02-02 9:41 UTC (permalink / raw)
To: linux-raid
Hi all,
I had a server crash during an array grow.
Commandline was "mdadm --grow /dev/md0 --raid-devices=6 --chunk=1M"
Now the sync is stuck at 27% and won't continue.
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1[0] sdg1[9] sdc1[6] sdb1[7] sdd1[8] sdf1[5]
5860548608 blocks super 1.0 level 5, 256k chunk, algorithm 2 [6/6]
[UUUUUU]
[=====>...............] reshape = 27.9% (410229760/1465137152)
finish=8670020128.0min speed=0K/sec
unused devices: <none>
$ mdadm -D /dev/md0
/dev/md0:
Version : 1.0
Creation Time : Thu Oct 7 09:28:04 2010
Raid Level : raid5
Array Size : 5860548608 (5589.05 GiB 6001.20 GB)
Used Dev Size : 1465137152 (1397.26 GiB 1500.30 GB)
Raid Devices : 6
Total Devices : 6
Persistence : Superblock is persistent
Update Time : Sun Feb 1 13:30:05 2015
State : clean, reshaping
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 256K
Reshape Status : 27% complete
Delta Devices : 1, (5->6)
New Chunksize : 1024K
Name : stelli:3 (local to host stelli)
UUID : 52857d77:3806e446:477d4865:d711451e
Events : 2254869
Number Major Minor RaidDevice State
0 8 65 0 active sync /dev/sde1
5 8 81 1 active sync /dev/sdf1
8 8 49 2 active sync /dev/sdd1
7 8 17 3 active sync /dev/sdb1
6 8 33 4 active sync /dev/sdc1
9 8 97 5 active sync /dev/sdg1
smartctl reports the disks are OK. No remapped sectors, no pending writes, etc.
The system load stays at 2.0:
$ cat /proc/loadavg
2.00 2.00 1.95 1/140 2937
which may be caused by udevd and md0_reshape
$ ps fax
PID TTY STAT TIME COMMAND
2 ? S 0:00 [kthreadd]
...
1671 ? D 0:00 \_ [md0_reshape]
...
1289 ? Ss 0:01 /sbin/udevd --daemon
1672 ? D 0:00 \_ /sbin/udevd --daemon
Could this be caused by a software lock?
The system has 2G RAM and 2G swap. Is this sufficient to complete?
$ free
total used free shared buffers cached
Mem: 1799124 351808 1447316 540 14620 286216
-/+ buffers/cache: 50972 1748152
Swap: 2104508 0 2104508
And in dmesg I found this:
$ dmesg | less
[ 5.456941] md: bind<sdg1>
[ 11.015014] xor: measuring software checksum speed
[ 11.051384] prefetch64-sse: 3291.000 MB/sec
[ 11.091375] generic_sse: 3129.000 MB/sec
[ 11.091378] xor: using function: prefetch64-sse (3291.000 MB/sec)
[ 11.159365] raid6: sse2x1 1246 MB/s
[ 11.227343] raid6: sse2x2 2044 MB/s
[ 11.295327] raid6: sse2x4 2487 MB/s
[ 11.295331] raid6: using algorithm sse2x4 (2487 MB/s)
[ 11.295334] raid6: using intx1 recovery algorithm
[ 11.328771] md: raid6 personality registered for level 6
[ 11.328776] md: raid5 personality registered for level 5
[ 11.328779] md: raid4 personality registered for level 4
[ 19.840890] bio: create slab <bio-1> at 1
[ 159.701406] md: md0 stopped.
[ 159.701413] md: unbind<sdg1>
[ 159.709902] md: export_rdev(sdg1)
[ 159.709980] md: unbind<sdd1>
[ 159.721856] md: export_rdev(sdd1)
[ 159.721955] md: unbind<sdb1>
[ 159.733883] md: export_rdev(sdb1)
[ 159.733991] md: unbind<sdc1>
[ 159.749856] md: export_rdev(sdc1)
[ 159.749954] md: unbind<sdf1>
[ 159.769885] md: export_rdev(sdf1)
[ 159.769985] md: unbind<sde1>
[ 159.781873] md: export_rdev(sde1)
[ 160.471460] md: md0 stopped.
[ 160.490329] md: bind<sdf1>
[ 160.490478] md: bind<sdd1>
[ 160.490689] md: bind<sdb1>
[ 160.490911] md: bind<sdc1>
[ 160.491164] md: bind<sdg1>
[ 160.491408] md: bind<sde1>
[ 160.492616] md/raid:md0: reshape will continue
[ 160.492638] md/raid:md0: device sde1 operational as raid disk 0
[ 160.492640] md/raid:md0: device sdg1 operational as raid disk 5
[ 160.492641] md/raid:md0: device sdc1 operational as raid disk 4
[ 160.492642] md/raid:md0: device sdb1 operational as raid disk 3
[ 160.492644] md/raid:md0: device sdd1 operational as raid disk 2
[ 160.492645] md/raid:md0: device sdf1 operational as raid disk 1
[ 160.493187] md/raid:md0: allocated 0kB
[ 160.493253] md/raid:md0: raid level 5 active with 6 out of 6 devices,
algorithm 2
[ 160.493256] RAID conf printout:
[ 160.493257] --- level:5 rd:6 wd:6
[ 160.493259] disk 0, o:1, dev:sde1
[ 160.493261] disk 1, o:1, dev:sdf1
[ 160.493262] disk 2, o:1, dev:sdd1
[ 160.493263] disk 3, o:1, dev:sdb1
[ 160.493264] disk 4, o:1, dev:sdc1
[ 160.493266] disk 5, o:1, dev:sdg1
[ 160.493336] md0: detected capacity change from 0 to 6001201774592
[ 160.493340] md: reshape of RAID array md0
[ 160.493342] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 160.493343] md: using maximum available idle IO bandwidth (but not more
than 200000 KB/sec) for reshape.
[ 160.493351] md: using 128k window, over a total of 1465137152k.
[ 160.951404] md0: unknown partition table
[ 190.984871] udevd[1289]: worker [1672] /devices/virtual/block/md0
timeout; kill it
[ 190.984901] udevd[1289]: seq 2259 '/devices/virtual/block/md0' killed
$ mdadm --version
mdadm - v3.3.1 - 5th June 2014
$ uname -a
Linux XXXXX 3.14.14-gentoo #3 SMP Sat Jan 31 18:45:04 CET 2015 x86_64 AMD
Athlon(tm) II X2 240e Processor AuthenticAMD GNU/Linux
Currently I can't access the array to read the remaining data, nor can I
continue the array grow.
Can you help me get it running?
best regards
Jörg
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Array died during grow; now resync stopped
2015-02-02 9:41 Array died during grow; now resync stopped Jörg Habenicht
@ 2015-02-02 19:40 ` NeilBrown
2015-02-03 10:35 ` Aw: " "Jörg Habenicht"
0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2015-02-02 19:40 UTC (permalink / raw)
To: Jörg Habenicht; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 6762 bytes --]
On Mon, 2 Feb 2015 09:41:02 +0000 (UTC) Jörg Habenicht <j.habenicht@gmx.de>
wrote:
> Hi all,
>
> I had a server crash during an array grow.
> Commandline was "mdadm --grow /dev/md0 --raid-devices=6 --chunk=1M"
>
> Now the sync is stuck at 27% and won't continue.
> $ cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid5 sde1[0] sdg1[9] sdc1[6] sdb1[7] sdd1[8] sdf1[5]
> 5860548608 blocks super 1.0 level 5, 256k chunk, algorithm 2 [6/6]
> [UUUUUU]
> [=====>...............] reshape = 27.9% (410229760/1465137152)
> finish=8670020128.0min speed=0K/sec
>
> unused devices: <none>
>
>
> $ mdadm -D /dev/md0
> /dev/md0:
> Version : 1.0
> Creation Time : Thu Oct 7 09:28:04 2010
> Raid Level : raid5
> Array Size : 5860548608 (5589.05 GiB 6001.20 GB)
> Used Dev Size : 1465137152 (1397.26 GiB 1500.30 GB)
> Raid Devices : 6
> Total Devices : 6
> Persistence : Superblock is persistent
>
> Update Time : Sun Feb 1 13:30:05 2015
> State : clean, reshaping
> Active Devices : 6
> Working Devices : 6
> Failed Devices : 0
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 256K
>
> Reshape Status : 27% complete
> Delta Devices : 1, (5->6)
> New Chunksize : 1024K
>
> Name : stelli:3 (local to host stelli)
> UUID : 52857d77:3806e446:477d4865:d711451e
> Events : 2254869
>
> Number Major Minor RaidDevice State
> 0 8 65 0 active sync /dev/sde1
> 5 8 81 1 active sync /dev/sdf1
> 8 8 49 2 active sync /dev/sdd1
> 7 8 17 3 active sync /dev/sdb1
> 6 8 33 4 active sync /dev/sdc1
> 9 8 97 5 active sync /dev/sdg1
>
>
> smartctl reports the disks are OK. No remapped sectors, no pending writes, etc.
>
> The system load stays at 2.0:
> $ cat /proc/loadavg
> 2.00 2.00 1.95 1/140 2937
> which may be caused by udevd and md0_reshape
> $ ps fax
> PID TTY STAT TIME COMMAND
> 2 ? S 0:00 [kthreadd]
> ...
> 1671 ? D 0:00 \_ [md0_reshape]
> ...
> 1289 ? Ss 0:01 /sbin/udevd --daemon
> 1672 ? D 0:00 \_ /sbin/udevd --daemon
>
>
> Could this be caused by a software lock?
Some sort of software problem I suspect.
What does
cat /proc/1671/stack
cat /proc/1672/stack
show?
Alternatively,
echo w > /proc/sysrq-trigger
and see what appears in 'dmesg'.
>
> The system has 2G RAM and 2G swap. Is this sufficient to complete?
Memory shouldn't be a problem.
However it wouldn't hurt to see what value is in
/sys/block/md0/md/stripe_cache_size
and double it.
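That check-and-double sequence can be sketched in shell. This is a hypothetical example: the `md0` path and the starting value of 256 are assumptions; on a real system, read the actual value from sysfs first.

```shell
# Sketch only: on a live system, read the current value from
#   /sys/block/md0/md/stripe_cache_size
# (assumed here to be 256), then write back twice that value as root:
#   echo "$doubled" > /sys/block/md0/md/stripe_cache_size
current=256
doubled=$((current * 2))
echo "$doubled"   # prints 512
```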
If all else fails a reboot should be safe and will probably start the reshape
properly. md is very careful about surviving reboots.
NeilBrown
> $ free
> total used free shared buffers cached
> Mem: 1799124 351808 1447316 540 14620 286216
> -/+ buffers/cache: 50972 1748152
> Swap: 2104508 0 2104508
>
>
> And in dmesg I found this:
> $ dmesg | less
> [ 5.456941] md: bind<sdg1>
> [ 11.015014] xor: measuring software checksum speed
> [ 11.051384] prefetch64-sse: 3291.000 MB/sec
> [ 11.091375] generic_sse: 3129.000 MB/sec
> [ 11.091378] xor: using function: prefetch64-sse (3291.000 MB/sec)
> [ 11.159365] raid6: sse2x1 1246 MB/s
> [ 11.227343] raid6: sse2x2 2044 MB/s
> [ 11.295327] raid6: sse2x4 2487 MB/s
> [ 11.295331] raid6: using algorithm sse2x4 (2487 MB/s)
> [ 11.295334] raid6: using intx1 recovery algorithm
> [ 11.328771] md: raid6 personality registered for level 6
> [ 11.328776] md: raid5 personality registered for level 5
> [ 11.328779] md: raid4 personality registered for level 4
> [ 19.840890] bio: create slab <bio-1> at 1
> [ 159.701406] md: md0 stopped.
> [ 159.701413] md: unbind<sdg1>
> [ 159.709902] md: export_rdev(sdg1)
> [ 159.709980] md: unbind<sdd1>
> [ 159.721856] md: export_rdev(sdd1)
> [ 159.721955] md: unbind<sdb1>
> [ 159.733883] md: export_rdev(sdb1)
> [ 159.733991] md: unbind<sdc1>
> [ 159.749856] md: export_rdev(sdc1)
> [ 159.749954] md: unbind<sdf1>
> [ 159.769885] md: export_rdev(sdf1)
> [ 159.769985] md: unbind<sde1>
> [ 159.781873] md: export_rdev(sde1)
> [ 160.471460] md: md0 stopped.
> [ 160.490329] md: bind<sdf1>
> [ 160.490478] md: bind<sdd1>
> [ 160.490689] md: bind<sdb1>
> [ 160.490911] md: bind<sdc1>
> [ 160.491164] md: bind<sdg1>
> [ 160.491408] md: bind<sde1>
> [ 160.492616] md/raid:md0: reshape will continue
> [ 160.492638] md/raid:md0: device sde1 operational as raid disk 0
> [ 160.492640] md/raid:md0: device sdg1 operational as raid disk 5
> [ 160.492641] md/raid:md0: device sdc1 operational as raid disk 4
> [ 160.492642] md/raid:md0: device sdb1 operational as raid disk 3
> [ 160.492644] md/raid:md0: device sdd1 operational as raid disk 2
> [ 160.492645] md/raid:md0: device sdf1 operational as raid disk 1
> [ 160.493187] md/raid:md0: allocated 0kB
> [ 160.493253] md/raid:md0: raid level 5 active with 6 out of 6 devices,
> algorithm 2
> [ 160.493256] RAID conf printout:
> [ 160.493257] --- level:5 rd:6 wd:6
> [ 160.493259] disk 0, o:1, dev:sde1
> [ 160.493261] disk 1, o:1, dev:sdf1
> [ 160.493262] disk 2, o:1, dev:sdd1
> [ 160.493263] disk 3, o:1, dev:sdb1
> [ 160.493264] disk 4, o:1, dev:sdc1
> [ 160.493266] disk 5, o:1, dev:sdg1
> [ 160.493336] md0: detected capacity change from 0 to 6001201774592
> [ 160.493340] md: reshape of RAID array md0
> [ 160.493342] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> [ 160.493343] md: using maximum available idle IO bandwidth (but not more
> than 200000 KB/sec) for reshape.
> [ 160.493351] md: using 128k window, over a total of 1465137152k.
> [ 160.951404] md0: unknown partition table
> [ 190.984871] udevd[1289]: worker [1672] /devices/virtual/block/md0
> timeout; kill it
> [ 190.984901] udevd[1289]: seq 2259 '/devices/virtual/block/md0' killed
>
>
>
> $ mdadm --version
> mdadm - v3.3.1 - 5th June 2014
>
> uname -a
> Linux XXXXX 3.14.14-gentoo #3 SMP Sat Jan 31 18:45:04 CET 2015 x86_64 AMD
> Athlon(tm) II X2 240e Processor AuthenticAMD GNU/Linux
>
>
> Currently I can't access the array to read the remaining data, nor can I
> continue the array grow.
> Can you help me get it running?
>
>
> best regards
> Jörg
>
>
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
* Aw: Re: Array died during grow; now resync stopped
2015-02-02 19:40 ` NeilBrown
@ 2015-02-03 10:35 ` "Jörg Habenicht"
2015-02-04 6:45 ` NeilBrown
0 siblings, 1 reply; 6+ messages in thread
From: "Jörg Habenicht" @ 2015-02-03 10:35 UTC (permalink / raw)
To: linux-raid
Hello Neil,
thank you for looking into this.
(And sorry for the malformed structure, I have to use webmail.. )
On Mon, 2 Feb 2015 09:41:02 +0000 (UTC) Jörg Habenicht <j.habenicht@gmx.de>
wrote:
> Hi all,
>
> I had a server crash during an array grow.
> Commandline was "mdadm --grow /dev/md0 --raid-devices=6 --chunk=1M"
>
>
>
> Could this be caused by a software lock?
>Some sort of software problem I suspect.
>What does
>cat /proc/1671/stack
>cat /proc/1672/stack
>show?
$ cat /proc/1671/stack
cat: /proc/1671/stack: No such file or directory
Huh?
$ ls /proc/1671
ls: cannot read symbolic link /proc/1671/exe: No such file or directory
attr comm fdinfo mounts oom_score stat
autogroup coredump_filter io mountstats oom_score_adj statm
auxv cwd limits net pagemap status
cgroup environ maps ns personality syscall
clear_refs exe mem numa_maps root task
cmdline fd mountinfo oom_adj smaps wchan
$ id
uid=0(root) gid=0(root) groups=0(root), ...
$ cat /proc/1672/stack
cat: /proc/1672/stack: No such file or directory
>Alternatively,
>echo w > /proc/sysrq-trigger
>and see what appears in 'dmesg'.
No good:
[99166.625796] SysRq : Show Blocked State
[99166.625829] task PC stack pid father
[99166.625845] md0_reshape D ffff88006cb81e08 0 1671 2 0x00000000
[99166.625854] ffff88006a17fb30 0000000000000046 000000000000a000 ffff88006cc9b7e0
[99166.625861] ffff88006a17ffd8 ffff88006cc9b7e0 ffff88006fc11830 ffff88006fc11830
[99166.625866] 0000000000000001 ffffffff81068670 ffff88006ca56848 ffff88006fc11830
[99166.625871] Call Trace:
[99166.625884] [<ffffffff81068670>] ? __dequeue_entity+0x40/0x50
[99166.625891] [<ffffffff8106b966>] ? pick_next_task_fair+0x56/0x1b0
[99166.625898] [<ffffffff813f4a50>] ? __schedule+0x2a0/0x820
[99166.625905] [<ffffffff8106273d>] ? ttwu_do_wakeup+0xd/0x80
[99166.625914] [<ffffffffa027b4c5>] ? get_active_stripe+0x185/0x5c0 [raid456]
[99166.625922] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
[99166.625929] [<ffffffffa027e83a>] ? reshape_request+0x21a/0x860 [raid456]
[99166.625935] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
[99166.625942] [<ffffffffa02744f6>] ? sync_request+0x236/0x380 [raid456]
[99166.625955] [<ffffffffa01557ad>] ? md_do_sync+0x82d/0xd00 [md_mod]
[99166.625961] [<ffffffff810684b4>] ? update_curr+0x64/0xe0
[99166.625971] [<ffffffffa0152197>] ? md_thread+0xf7/0x110 [md_mod]
[99166.625977] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
[99166.625985] [<ffffffffa01520a0>] ? md_register_thread+0xf0/0xf0 [md_mod]
[99166.625991] [<ffffffff81059de8>] ? kthread+0xb8/0xd0
[99166.625997] [<ffffffff81059d30>] ? kthread_create_on_node+0x180/0x180
[99166.626003] [<ffffffff813f837c>] ? ret_from_fork+0x7c/0xb0
[99166.626008] [<ffffffff81059d30>] ? kthread_create_on_node+0x180/0x180
[99166.626012] udevd D ffff88006cb81e08 0 1672 1289 0x00000004
[99166.626017] ffff88006a1819e8 0000000000000086 000000000000a000 ffff88006c4967a0
[99166.626022] ffff88006a181fd8 ffff88006c4967a0 0000000000000000 0000000000000000
[99166.626027] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[99166.626032] Call Trace:
[99166.626039] [<ffffffff810c24ed>] ? zone_statistics+0x9d/0xa0
[99166.626044] [<ffffffff810c24ed>] ? zone_statistics+0x9d/0xa0
[99166.626050] [<ffffffff810b13e7>] ? get_page_from_freelist+0x507/0x850
[99166.626057] [<ffffffffa027b4c5>] ? get_active_stripe+0x185/0x5c0 [raid456]
[99166.626063] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
[99166.626069] [<ffffffffa027f627>] ? make_request+0x7a7/0xa00 [raid456]
[99166.626075] [<ffffffff81080afd>] ? ktime_get_ts+0x3d/0xd0
[99166.626080] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
[99166.626089] [<ffffffffa014ea12>] ? md_make_request+0xd2/0x210 [md_mod]
[99166.626096] [<ffffffff811e649d>] ? generic_make_request_checks+0x23d/0x270
[99166.626100] [<ffffffff810acc68>] ? mempool_alloc+0x58/0x140
[99166.626106] [<ffffffff811e7238>] ? generic_make_request+0xa8/0xf0
[99166.626111] [<ffffffff811e72e7>] ? submit_bio+0x67/0x130
[99166.626117] [<ffffffff8112a638>] ? bio_alloc_bioset+0x1b8/0x2a0
[99166.626123] [<ffffffff81126a57>] ? _submit_bh+0x127/0x200
[99166.626129] [<ffffffff8112815d>] ? block_read_full_page+0x1fd/0x290
[99166.626133] [<ffffffff8112b680>] ? I_BDEV+0x10/0x10
[99166.626140] [<ffffffff810aad2b>] ? add_to_page_cache_locked+0x6b/0xc0
[99166.626146] [<ffffffff810b5520>] ? __do_page_cache_readahead+0x1b0/0x220
[99166.626152] [<ffffffff810b5812>] ? force_page_cache_readahead+0x62/0xa0
[99166.626159] [<ffffffff810ac936>] ? generic_file_aio_read+0x4b6/0x6c0
[99166.626166] [<ffffffff810f9f87>] ? do_sync_read+0x57/0x90
[99166.626172] [<ffffffff810fa571>] ? vfs_read+0xa1/0x180
[99166.626178] [<ffffffff810fb0ab>] ? SyS_read+0x4b/0xc0
[99166.626183] [<ffffffff813f7f72>] ? page_fault+0x22/0x30
[99166.626190] [<ffffffff813f8422>] ? system_call_fastpath+0x16/0x1b
>
> The system has 2G RAM and 2G swap. Is this sufficient to complete?
>Memory shouldn't be a problem.
>However it wouldn't hurt to see what value is in
>/sys/block/md0/md/stripe_cache_size
>and double it.
$ cat /sys/block/md0/md/stripe_cache_size
256
I did not change it due to the crash in md_reshape
>If all else fails a reboot should be safe and will probably start the reshape
>properly. md is very careful about surviving reboots.
I already did reboot twice before I wrote to the list. Same result.
>NeilBrown
cu,
Joerg
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Array died during grow; now resync stopped
2015-02-03 10:35 ` Aw: " "Jörg Habenicht"
@ 2015-02-04 6:45 ` NeilBrown
2015-02-04 11:58 ` Aw: " "Jörg Habenicht"
0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2015-02-04 6:45 UTC (permalink / raw)
To: Jörg Habenicht; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 7305 bytes --]
On Tue, 3 Feb 2015 11:35:53 +0100 "Jörg Habenicht" <j.habenicht@gmx.de> wrote:
> Hello Neil,
>
> thank you for looking into this.
>
> (And sorry for the malformed structure, I have to use webmail.. )
>
>
> On Mon, 2 Feb 2015 09:41:02 +0000 (UTC) Jörg Habenicht <j.habenicht@gmx.de>
> wrote:
>
> > Hi all,
> >
> > I had a server crash during an array grow.
> > Commandline was "mdadm --grow /dev/md0 --raid-devices=6 --chunk=1M"
> >
> >
> >
> > Could this be caused by a software lock?
>
> >Some sort of software problem I suspect.
> >What does
> >cat /proc/1671/stack
> >cat /proc/1672/stack
> >show?
>
> $ cat /proc/1671/stack
> cat: /proc/1671/stack: No such file or directory
I guess you don't have that feature compiled into your kernel.
>
> Huh?
> $ ls /proc/1671
> ls: cannot read symbolic link /proc/1671/exe: No such file or directory
> attr comm fdinfo mounts oom_score stat
> autogroup coredump_filter io mountstats oom_score_adj statm
> auxv cwd limits net pagemap status
> cgroup environ maps ns personality syscall
> clear_refs exe mem numa_maps root task
> cmdline fd mountinfo oom_adj smaps wchan
> $ id
> uid=0(root) gid=0(root) groups=0(root), ...
>
> $ cat /proc/1672/stack
> cat: /proc/1672/stack: No such file or directory
>
>
> >Alternatively,
> >echo w > /proc/sysrq-trigger
> >and see what appears in 'dmesg'.
>
> No good:
Quite the reverse, this is exactly what I wanted. It shows the stack trace
of pid 1671 and 1672..
>
> [99166.625796] SysRq : Show Blocked State
> [99166.625829] task PC stack pid father
> [99166.625845] md0_reshape D ffff88006cb81e08 0 1671 2 0x00000000
> [99166.625854] ffff88006a17fb30 0000000000000046 000000000000a000 ffff88006cc9b7e0
> [99166.625861] ffff88006a17ffd8 ffff88006cc9b7e0 ffff88006fc11830 ffff88006fc11830
> [99166.625866] 0000000000000001 ffffffff81068670 ffff88006ca56848 ffff88006fc11830
> [99166.625871] Call Trace:
> [99166.625884] [<ffffffff81068670>] ? __dequeue_entity+0x40/0x50
> [99166.625891] [<ffffffff8106b966>] ? pick_next_task_fair+0x56/0x1b0
> [99166.625898] [<ffffffff813f4a50>] ? __schedule+0x2a0/0x820
> [99166.625905] [<ffffffff8106273d>] ? ttwu_do_wakeup+0xd/0x80
> [99166.625914] [<ffffffffa027b4c5>] ? get_active_stripe+0x185/0x5c0 [raid456]
> [99166.625922] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> [99166.625929] [<ffffffffa027e83a>] ? reshape_request+0x21a/0x860 [raid456]
> [99166.625935] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> [99166.625942] [<ffffffffa02744f6>] ? sync_request+0x236/0x380 [raid456]
> [99166.625955] [<ffffffffa01557ad>] ? md_do_sync+0x82d/0xd00 [md_mod]
> [99166.625961] [<ffffffff810684b4>] ? update_curr+0x64/0xe0
> [99166.625971] [<ffffffffa0152197>] ? md_thread+0xf7/0x110 [md_mod]
> [99166.625977] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> [99166.625985] [<ffffffffa01520a0>] ? md_register_thread+0xf0/0xf0 [md_mod]
> [99166.625991] [<ffffffff81059de8>] ? kthread+0xb8/0xd0
> [99166.625997] [<ffffffff81059d30>] ? kthread_create_on_node+0x180/0x180
> [99166.626003] [<ffffffff813f837c>] ? ret_from_fork+0x7c/0xb0
> [99166.626008] [<ffffffff81059d30>] ? kthread_create_on_node+0x180/0x180
That's no surprise. Whenever anything goes wrong in raid5, something gets
stuck in get_active_stripe()...
> [99166.626012] udevd D ffff88006cb81e08 0 1672 1289 0x00000004
> [99166.626017] ffff88006a1819e8 0000000000000086 000000000000a000 ffff88006c4967a0
> [99166.626022] ffff88006a181fd8 ffff88006c4967a0 0000000000000000 0000000000000000
> [99166.626027] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [99166.626032] Call Trace:
> [99166.626039] [<ffffffff810c24ed>] ? zone_statistics+0x9d/0xa0
> [99166.626044] [<ffffffff810c24ed>] ? zone_statistics+0x9d/0xa0
> [99166.626050] [<ffffffff810b13e7>] ? get_page_from_freelist+0x507/0x850
> [99166.626057] [<ffffffffa027b4c5>] ? get_active_stripe+0x185/0x5c0 [raid456]
> [99166.626063] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> [99166.626069] [<ffffffffa027f627>] ? make_request+0x7a7/0xa00 [raid456]
> [99166.626075] [<ffffffff81080afd>] ? ktime_get_ts+0x3d/0xd0
> [99166.626080] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> [99166.626089] [<ffffffffa014ea12>] ? md_make_request+0xd2/0x210 [md_mod]
> [99166.626096] [<ffffffff811e649d>] ? generic_make_request_checks+0x23d/0x270
> [99166.626100] [<ffffffff810acc68>] ? mempool_alloc+0x58/0x140
> [99166.626106] [<ffffffff811e7238>] ? generic_make_request+0xa8/0xf0
> [99166.626111] [<ffffffff811e72e7>] ? submit_bio+0x67/0x130
> [99166.626117] [<ffffffff8112a638>] ? bio_alloc_bioset+0x1b8/0x2a0
> [99166.626123] [<ffffffff81126a57>] ? _submit_bh+0x127/0x200
> [99166.626129] [<ffffffff8112815d>] ? block_read_full_page+0x1fd/0x290
> [99166.626133] [<ffffffff8112b680>] ? I_BDEV+0x10/0x10
> [99166.626140] [<ffffffff810aad2b>] ? add_to_page_cache_locked+0x6b/0xc0
> [99166.626146] [<ffffffff810b5520>] ? __do_page_cache_readahead+0x1b0/0x220
> [99166.626152] [<ffffffff810b5812>] ? force_page_cache_readahead+0x62/0xa0
> [99166.626159] [<ffffffff810ac936>] ? generic_file_aio_read+0x4b6/0x6c0
> [99166.626166] [<ffffffff810f9f87>] ? do_sync_read+0x57/0x90
> [99166.626172] [<ffffffff810fa571>] ? vfs_read+0xa1/0x180
> [99166.626178] [<ffffffff810fb0ab>] ? SyS_read+0x4b/0xc0
> [99166.626183] [<ffffffff813f7f72>] ? page_fault+0x22/0x30
> [99166.626190] [<ffffffff813f8422>] ? system_call_fastpath+0x16/0x1b
And this is stuck in the same place.... what is consuming all the
stripes, I wonder....
>
>
> >
> > The system has 2G RAM and 2G swap. Is this sufficient to complete?
>
> >Memory shouldn't be a problem.
> >However it wouldn't hurt to see what value is in
> >/sys/block/md0/md/stripe_cache_size
> >and double it.
>
> $ cat /sys/block/md0/md/stripe_cache_size
> 256
You are setting the chunk size to 1M, which is 256 4K pages.
So this stripe_cache only just has enough space to store one full stripe at
the new chunk size. That isn't enough.
If you double it, the problem should go away.
mdadm should do that for you .... I wonder why it didn't.
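Neil's arithmetic is easy to verify (assuming the usual 4K page size):

```shell
# A 1M chunk expressed in 4K pages: 1048576 / 4096 = 256,
# i.e. the old stripe_cache_size of 256 holds exactly one chunk
# at the new chunk size.
chunk_bytes=$((1024 * 1024))   # new chunk size: 1M
page_bytes=4096
pages_per_chunk=$((chunk_bytes / page_bytes))
echo "$pages_per_chunk"        # prints 256
```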
>
> I did not change it due to the crash in md_reshape
What crash is that? The above stack traces that you said "No good" about?
That isn't a crash. That is the kernel showing you stack traces because you
asked for them.
echo 1024 > /sys/block/md0/md/stripe_cache_size
should make it work.
NeilBrown
>
>
> >If all else fails a reboot should be safe and will probably start the reshape
> >properly. md is very careful about surviving reboots.
>
> I already did reboot twice before I wrote to the list. Same result.
>
>
> >NeilBrown
>
>
> cu,
> Joerg
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Aw: Re: Array died during grow; now resync stopped
2015-02-04 6:45 ` NeilBrown
@ 2015-02-04 11:58 ` "Jörg Habenicht"
2015-02-05 4:40 ` NeilBrown
0 siblings, 1 reply; 6+ messages in thread
From: "Jörg Habenicht" @ 2015-02-04 11:58 UTC (permalink / raw)
To: linux-raid
It works! :-)) The array syncs!
> > >What does
> > >cat /proc/1671/stack
> > >cat /proc/1672/stack
> > >show?
> >
> > $ cat /proc/1671/stack
> > cat: /proc/1671/stack: No such file or directory
>
> I guess you don't have that feature compiled into your kernel.
Guess so. I'm going to look for the missing CONFIG_ line.
> > >Alternatively,
> > >echo w > /proc/sysrq-trigger
> > >and see what appears in 'dmesg'.
> >
> > No good:
>
> Quite the reverse, this is exactly what I wanted. It shows the stack trace
> of pid 1671 and 1672..
Ahh, OK. I thought it was a crash dump.
> >
> > [99166.625796] SysRq : Show Blocked State
> > [99166.625829] task PC stack pid father
> > [99166.625845] md0_reshape D ffff88006cb81e08 0 1671 2 0x00000000
> > [99166.625854] ffff88006a17fb30 0000000000000046 000000000000a000 ffff88006cc9b7e0
> > [99166.625861] ffff88006a17ffd8 ffff88006cc9b7e0 ffff88006fc11830 ffff88006fc11830
> > [99166.625866] 0000000000000001 ffffffff81068670 ffff88006ca56848 ffff88006fc11830
> > [99166.625871] Call Trace:
> > [99166.625884] [<ffffffff81068670>] ? __dequeue_entity+0x40/0x50
> > [99166.625891] [<ffffffff8106b966>] ? pick_next_task_fair+0x56/0x1b0
> > [99166.625898] [<ffffffff813f4a50>] ? __schedule+0x2a0/0x820
> > [99166.625905] [<ffffffff8106273d>] ? ttwu_do_wakeup+0xd/0x80
> > [99166.625914] [<ffffffffa027b4c5>] ? get_active_stripe+0x185/0x5c0 [raid456]
> > [99166.625922] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> > [99166.625929] [<ffffffffa027e83a>] ? reshape_request+0x21a/0x860 [raid456]
> > [99166.625935] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> > [99166.625942] [<ffffffffa02744f6>] ? sync_request+0x236/0x380 [raid456]
> > [99166.625955] [<ffffffffa01557ad>] ? md_do_sync+0x82d/0xd00 [md_mod]
> > [99166.625961] [<ffffffff810684b4>] ? update_curr+0x64/0xe0
> > [99166.625971] [<ffffffffa0152197>] ? md_thread+0xf7/0x110 [md_mod]
> > [99166.625977] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> > [99166.625985] [<ffffffffa01520a0>] ? md_register_thread+0xf0/0xf0 [md_mod]
> > [99166.625991] [<ffffffff81059de8>] ? kthread+0xb8/0xd0
> > [99166.625997] [<ffffffff81059d30>] ? kthread_create_on_node+0x180/0x180
> > [99166.626003] [<ffffffff813f837c>] ? ret_from_fork+0x7c/0xb0
> > [99166.626008] [<ffffffff81059d30>] ? kthread_create_on_node+0x180/0x180
>
> That's no surprise. Whenever anything goes wrong in raid5, something gets
> stuck in get_active_stripe()...
>
>
> > [99166.626012] udevd D ffff88006cb81e08 0 1672 1289 0x00000004
> > [99166.626017] ffff88006a1819e8 0000000000000086 000000000000a000 ffff88006c4967a0
> > [99166.626022] ffff88006a181fd8 ffff88006c4967a0 0000000000000000 0000000000000000
> > [99166.626027] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [99166.626032] Call Trace:
> > [99166.626039] [<ffffffff810c24ed>] ? zone_statistics+0x9d/0xa0
> > [99166.626044] [<ffffffff810c24ed>] ? zone_statistics+0x9d/0xa0
> > [99166.626050] [<ffffffff810b13e7>] ? get_page_from_freelist+0x507/0x850
> > [99166.626057] [<ffffffffa027b4c5>] ? get_active_stripe+0x185/0x5c0 [raid456]
> > [99166.626063] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> > [99166.626069] [<ffffffffa027f627>] ? make_request+0x7a7/0xa00 [raid456]
> > [99166.626075] [<ffffffff81080afd>] ? ktime_get_ts+0x3d/0xd0
> > [99166.626080] [<ffffffff81072110>] ? __wake_up_sync+0x10/0x10
> > [99166.626089] [<ffffffffa014ea12>] ? md_make_request+0xd2/0x210 [md_mod]
> > [99166.626096] [<ffffffff811e649d>] ? generic_make_request_checks+0x23d/0x270
> > [99166.626100] [<ffffffff810acc68>] ? mempool_alloc+0x58/0x140
> > [99166.626106] [<ffffffff811e7238>] ? generic_make_request+0xa8/0xf0
> > [99166.626111] [<ffffffff811e72e7>] ? submit_bio+0x67/0x130
> > [99166.626117] [<ffffffff8112a638>] ? bio_alloc_bioset+0x1b8/0x2a0
> > [99166.626123] [<ffffffff81126a57>] ? _submit_bh+0x127/0x200
> > [99166.626129] [<ffffffff8112815d>] ? block_read_full_page+0x1fd/0x290
> > [99166.626133] [<ffffffff8112b680>] ? I_BDEV+0x10/0x10
> > [99166.626140] [<ffffffff810aad2b>] ? add_to_page_cache_locked+0x6b/0xc0
> > [99166.626146] [<ffffffff810b5520>] ? __do_page_cache_readahead+0x1b0/0x220
> > [99166.626152] [<ffffffff810b5812>] ? force_page_cache_readahead+0x62/0xa0
> > [99166.626159] [<ffffffff810ac936>] ? generic_file_aio_read+0x4b6/0x6c0
> > [99166.626166] [<ffffffff810f9f87>] ? do_sync_read+0x57/0x90
> > [99166.626172] [<ffffffff810fa571>] ? vfs_read+0xa1/0x180
> > [99166.626178] [<ffffffff810fb0ab>] ? SyS_read+0x4b/0xc0
> > [99166.626183] [<ffffffff813f7f72>] ? page_fault+0x22/0x30
> > [99166.626190] [<ffffffff813f8422>] ? system_call_fastpath+0x16/0x1b
>
> And this is stuck in the same place.... what is consuming all the
> stripes, I wonder....
Would you like me to collect more information?
> > >
> > > The system has 2G RAM and 2G swap. Is this sufficient to complete?
> >
> > >Memory shouldn't be a problem.
> > >However it wouldn't hurt to see what value is in
> > >/sys/block/md0/md/stripe_cache_size
> > >and double it.
> >
> > $ cat /sys/block/md0/md/stripe_cache_size
> > 256
>
>
> You are setting the chunk size to 1M, which is 256 4K pages.
> So this stripe_cache only just has enough space to store one full stripe at
> the new chunk size. That isn't enough.
>
> If you double it, the problem should go away.
>
> mdadm should do that for you .... I wonder why it didn't.
>
Would you like more test results?
> >
> > I did not change it due to the crash in md_reshape
>
> What crash is that? The above stack traces that you said "No good" about?
> That isn't a crash. That is the kernel showing you stack traces because you
> asked for them.
OK, thanks, I learned something new :-)
>
> echo 1024 > /sys/block/md0/md/stripe_cache_size
>
> should make it work.
Yes it did!
$ echo 1024 > /sys/block/md0/md/stripe_cache_size
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1[0] sdg1[9] sdc1[6] sdb1[7] sdd1[8] sdf1[5]
5860548608 blocks super 1.0 level 5, 256k chunk, algorithm 2 [6/6] [UUUUUU]
[=====>...............] reshape = 28.0% (410341376/1465137152) finish=29602912.5min speed=0K/sec
unused devices: <none>
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1[0] sdg1[9] sdc1[6] sdb1[7] sdd1[8] sdf1[5]
5860548608 blocks super 1.0 level 5, 256k chunk, algorithm 2 [6/6] [UUUUUU]
[=====>...............] reshape = 28.0% (410656252/1465137152) finish=7746625.6min speed=2K/sec
unused devices: <none>
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1[0] sdg1[9] sdc1[6] sdb1[7] sdd1[8] sdf1[5]
5860548608 blocks super 1.0 level 5, 256k chunk, algorithm 2 [6/6] [UUUUUU]
[=====>...............] reshape = 28.0% (410851328/1465137152) finish=5314609.8min speed=3K/sec
unused devices: <none>
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1[0] sdg1[9] sdc1[6] sdb1[7] sdd1[8] sdf1[5]
5860548608 blocks super 1.0 level 5, 256k chunk, algorithm 2 [6/6] [UUUUUU]
[=====>...............] reshape = 28.4% (416577276/1465137152) finish=870.3min speed=20079K/sec
unused devices: <none>
It immediately starts to sync.
I wonder why it got stuck at 27% and not at 0%. Shouldn't it get stuck at the beginning if the cache size is insufficient?
Or is it because of the reboot, which happened at 27% sync?
Anyways,
Thank you for your help.
>
> NeilBrown
>
>
cu,
Joerg
* Re: Array died during grow; now resync stopped
2015-02-04 11:58 ` Aw: " "Jörg Habenicht"
@ 2015-02-05 4:40 ` NeilBrown
0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2015-02-05 4:40 UTC (permalink / raw)
To: Jörg Habenicht; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1253 bytes --]
On Wed, 4 Feb 2015 12:58:06 +0100 "Jörg Habenicht" <j.habenicht@gmx.de> wrote:
> It works! :-)) The array syncs!
Excellent!
>
>
> > > >What does
> > > >cat /proc/1671/stack
> > > >cat /proc/1672/stack
> > > >show?
> > >
> > > $ cat /proc/1671/stack
> > > cat: /proc/1671/stack: No such file or directory
> >
> > I guess you don't have that feature compiled into your kernel.
>
> Guess so. I'm going to look for the missing CONFIG_ line.
CONFIG_STACKTRACE ... in case you haven't found it already.
> >
> > And this is stuck in the same place.... what is consuming all the
> > stripes, I wonder....
>
> Would you like me to collect more information?
No thanks, the required information was further down. They were all consumed
by the reshape (which needs at least twice the largest chunk size).
> > If you double it, the problem should go away.
> >
> > mdadm should do that for you .... I wonder why it didn't.
> >
>
> Would you like more test results?
I had a look at the code and it was fairly obvious. mdadm adjusts the stripe
cache based on the 'old' chunk size. It never includes the 'new' chunk size
in those calculations. I've made a note to fix that.
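The fix Neil describes can be sketched as follows. This is illustrative shell arithmetic only, not mdadm's actual code; the factor of two comes from his earlier remark that a reshape needs at least twice the largest chunk size.

```shell
# Size the stripe cache from the larger of the old and new chunk sizes,
# since the reshape needs room for at least twice the largest chunk,
# counted in 4K pages.
old_chunk_kb=256
new_chunk_kb=1024
max_kb=$(( old_chunk_kb > new_chunk_kb ? old_chunk_kb : new_chunk_kb ))
min_cache=$(( 2 * max_kb * 1024 / 4096 ))   # KB -> 4K pages, times two
echo "$min_cache"                           # prints 512
```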
Thanks,
NeilBrown