* Pausing md check hangs
@ 2020-01-27 13:42 Georgi Nikolov
  2020-01-27 17:11 ` Song Liu
  2020-02-25  2:10 ` Guoqing Jiang
  0 siblings, 2 replies; 12+ messages in thread
From: Georgi Nikolov @ 2020-01-27 13:42 UTC
  To: song; +Cc: linux-raid

Hi,

I posted a kernel bug about this a month ago but it did not receive any 
attention: https://bugzilla.kernel.org/show_bug.cgi?id=205929
Here is a copy of the bug report and I hope that this is the correct 
place to discuss this:

I have a Supermicro server with 10 md raid6 arrays, each consisting of 8 
SATA drives. The drives are Hitachi/HGST Ultrastar 7K4000 8T.
When I try to pause an array check with
"echo idle > /sys/block/<md_dev>/md/sync_action", the write randomly hangs,
each time on a different md device.
The "mdX_raid6" process is then at 100% CPU usage, and
"cat /sys/block/mdX/md/journal_mode" hangs forever.
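
For anyone trying to reproduce this, the pause can be issued with a deadline so that the calling shell does not get stuck together with the write. This is only a sketch: the name pause_check, the 10-second default and the 124 return code (borrowed from timeout(1)) are illustrative choices, not something md provides. If the writer is blocked inside the kernel it may not be killable, so the helper only reports the situation and leaves the writer behind.

```shell
# Hypothetical helper: write "idle" to a sync_action file without letting the
# caller block forever. Returns the writer's status if the write finished, or
# 124 if it was still running when the deadline expired.
pause_check() {
    target=$1          # e.g. /sys/block/md4/md/sync_action
    deadline=${2:-10}  # seconds to wait before giving up (assumed default)

    # Do the write in a background subshell so a stuck write cannot stall us.
    ( echo idle > "$target" ) &
    writer=$!

    waited=0
    # Poll once per second while the writer process still exists.
    while kill -0 "$writer" 2>/dev/null; do
        if [ "$waited" -ge "$deadline" ]; then
            echo "write to $target still blocked after ${deadline}s" >&2
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # The writer is gone; collect and return its exit status.
    wait "$writer"
}
```

Usage would be something like `pause_check /sys/block/md4/md/sync_action 30`, run as root. A return code of 124 means the write had not completed within 30 seconds.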

Here is the state at the moment of the hang for one of the md devices:

root@supermicro:/sys/block/mdX/md# find -mindepth 1 -maxdepth 1 -type f|sort|grep -v journal_mode|xargs -r egrep .
./array_size:default
./array_state:write-pending
grep: ./bitmap_set_bits: Permission denied
./chunk_size:524288
./component_size:7813895168
./consistency_policy:resync
./degraded:0
./group_thread_cnt:4
./last_sync_action:check
./layout:2
./level:raid6
./max_read_errors:20
./metadata_version:1.2
./mismatch_cnt:0
grep: ./new_dev: Permission denied
./preread_bypass_threshold:1
./raid_disks:8
./reshape_direction:forwards
./reshape_position:none
./resync_start:none
./rmw_level:1
./safe_mode_delay:0.204
./skip_copy:0
./stripe_cache_active:13173
./stripe_cache_size:8192
./suspend_hi:0
./suspend_lo:0
./sync_action:check
./sync_completed:3566405120 / 15627790336
./sync_force_parallel:0
./sync_max:max
./sync_min:1821385984
./sync_speed:126
./sync_speed_max:1000 (local)
./sync_speed_min:1000 (system)
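
The listing above can be collected with a small wrapper that leaves out the attribute that blocks. This is a sketch of the same pipeline, not a tool that ships with md or mdadm: the function name md_snapshot and its "skip" argument are made up for illustration, and it relies on the GNU find and xargs options already used in the command above.

```shell
# Hypothetical helper: print every readable attribute in an md sysfs directory
# as "./name:value", skipping one attribute that is known to block.
md_snapshot() {
    dir=$1                    # e.g. /sys/block/md4/md
    skip=${2:-journal_mode}   # attribute to skip; journal_mode hung in this report

    ( cd "$dir" || exit 1
      # /dev/null is passed as an extra file so that grep prefixes each line
      # with the file name even when only one attribute is left to read.
      find . -mindepth 1 -maxdepth 1 -type f ! -name "$skip" |
          LC_ALL=C sort |
          xargs -r grep -E . /dev/null 2>&1 )
}
```

Calling `md_snapshot /sys/block/md4/md` should give output in the same shape as the dump above, with "Permission denied" lines for write-only attributes when grep cannot read them.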

root@supermicro:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md4 : active raid6 sdaa[2] sdab[3] sdy[0] sdae[6] sdac[4] sdad[5] sdaf[7] sdz[1]
       46883371008 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
       [====>................]  check = 22.8% (1784112640/7813895168) finish=20571.7min speed=4884K/sec
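
As a sanity check, the progress figures can be recomputed from the raw counters. The sketch below assumes that /proc/mdstat counts 1 KiB blocks, that the speed is in KiB per second, and that sync_completed counts 512-byte sectors (so its values are halved before use). The function name check_progress is made up for illustration.

```shell
# Hypothetical helper: recompute the percentage done and a naive time-to-finish
# from "blocks done", "blocks total" (both KiB) and the current speed (KiB/s).
check_progress() {
    done_kib=$1
    total_kib=$2
    speed_kib_s=$3
    awk -v d="$done_kib" -v t="$total_kib" -v s="$speed_kib_s" 'BEGIN {
        # Remaining work divided by speed gives seconds; convert to minutes.
        printf "percent=%.1f eta_min=%.0f\n", 100 * d / t, (t - d) / s / 60
    }'
}

# Figures from /proc/mdstat in this report.
check_progress 1784112640 7813895168 4884

# Figures from sync_completed, converted from sectors to KiB.
check_progress $((3566405120 / 2)) $((15627790336 / 2)) 4884
```

Both calls give 22.8%, and the first gives about 20577 minutes to finish, which is close to the finish=20571.7min that mdstat reports. Exact agreement is not expected, since the kernel presumably bases its estimate on a recent rate window rather than on the single rounded speed value used here.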


Regards,
Georgi Nikolov

Thread overview: 12+ messages
2020-01-27 13:42 Pausing md check hangs Georgi Nikolov
2020-01-27 17:11 ` Song Liu
2020-01-28  8:11   ` Georgi Nikolov
2020-01-28 18:04     ` Song Liu
2020-02-01 14:26       ` Georgi Nikolov
2020-02-03 23:31         ` Song Liu
2020-02-17 12:14           ` Georgi Nikolov
2020-02-17 13:15           ` Georgi Nikolov
2020-02-25  2:10 ` Guoqing Jiang
2020-03-10 15:30   ` Georgi Nikolov
2020-03-10 20:27     ` Guoqing Jiang
  -- strict thread matches above, loose matches on Subject: below --
2020-01-27  9:52 Georgi Nikolov