linux-raid.vger.kernel.org archive mirror
* mdxxx_raid6 kernel thread frozen
From: Michael D. O'Brien @ 2021-02-15 21:31 UTC
  To: linux-raid

Hi, I have a single mdadm raid6 in a 56-drive raid60 (7x8) with a
kernel thread stuck at 100% cpu. The thread typically gets stuck
during array checks, but it is not the resync thread: md122_raid6 is at
100% cpu, whereas md122_resync is at ~0%. When this happens, the
reported sync speed drops until it reaches 4K/sec. Setting sync_action
to idle also hangs.
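
For anyone trying to reproduce this, the sync state described above can
also be captured from sysfs rather than read off /proc/mdstat. What
follows is a minimal Python sketch, not something from the original
report: it assumes the standard md attributes sync_action, sync_speed
and sync_completed under /sys/block/<md>/md/, and the helper names are
made up for illustration. It has not been tried against a wedged array,
where these reads might block as well.

```python
from pathlib import Path


def parse_sync_completed(text):
    """Parse md's sync_completed attribute.

    The kernel reports "<done> / <total>" in sectors, or "none" when no
    sync action is running.
    """
    text = text.strip()
    if text == "none":
        return None
    done, total = (int(part) for part in text.split("/"))
    return done, total


def read_sync_state(md_name, sysfs_root="/sys/block"):
    """Return the raw sync attributes for one array, e.g. md_name="md122"."""
    md_dir = Path(sysfs_root) / md_name / "md"
    return {
        attr: (md_dir / attr).read_text().strip()
        for attr in ("sync_action", "sync_speed", "sync_completed")
    }


if __name__ == "__main__":
    # Illustrative value only: the mdstat progress from this report,
    # converted from 1K blocks to 512-byte sectors.
    done, total = parse_sync_completed("15926560792 / 19532609536")
    print(f"{100 * done / total:.1f}% complete")
```

Logging read_sync_state("md122") every few seconds would give a record
of how the speed decays before the thread is reported as hung.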

iostat shows the backing devices aren't doing any I/O, SMART is
clean for all member drives, and dmesg doesn't say anything useful
(until the thread has been hung for a long time, at which point it
reports the hang; I'll post that message when the current issue times
out). A reboot typically clears the issue, but takes quite a long time,
as the raid60 is the backing device for a bcache device (with an Optane
cache) that has a large mounted xfs file system in place.

I figured I could strace the process, but I learned that's impossible
with kernel threads :)

Output of various things - please let me know what else I can run to
help track this down:

/proc/mdstat:
md118 : active raid0 md120[4] md119[5] md123[6] md125[3] md121[0] md124[1] md122[2]
      410183875584 blocks super 1.2 3072k chunks
md119 : active raid6 sdbh[1] sdbi[2] sdan[4] sdbc[0] sdar[7] sdaq[6] sdbe[8] sdao[5]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
md120 : active raid6 sdbd[7] sdat[1] sdaz[4] sday[3] sdau[2] sdba[5] sdbb[6] sdas[0]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
md121 : active raid6 sdaj[5] sdag[2] sdal[7] sdai[4] sdae[0] sdak[6] sdaf[1] sdah[3]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
md122 : active raid6 sdu[7] sdq[3] sdr[4] sdp[2] sdn[0] sdt[6] sds[5] sdo[1]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
      [================>....]  check = 81.5% (7963280396/9766304768) finish=147106.8min speed=204K/sec
md123 : active raid6 sdax[7] sdaw[6] sdav[5] sdap[4] sdy[3] sdc[0] sdd[1] sdh[2]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
md124 : active raid6 sdab[5] sdaa[4] sdad[7] sdz[3] sdv[0] sdx[2] sdac[6] sdw[1]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
md125 : active raid6 sde[0] sdam[7] sdg[2] sdbg[8] sdf[1] sdi[3] sdk[5] sdj[4]
      58597828608 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
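
As a sanity check on the md122 progress line, the finish estimate can
be recomputed from the block counts and the reported speed. This is a
rough Python sketch under two assumptions: that the (done/total) pair
is in 1K blocks and that speed is in KiB per second. The regex and the
function name are illustrative only.

```python
import re

# Matches the progress line of /proc/mdstat; \s* also tolerates the line
# wrapping that mail clients introduce.
PROGRESS_RE = re.compile(
    r"(?P<action>check|resync|recovery|reshape)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)

SAMPLE = (
    "[================>....]  check = 81.5% (7963280396/9766304768) "
    "finish=147106.8min speed=204K/sec"
)


def parse_progress(text):
    """Return the parsed progress fields plus a recomputed ETA in minutes."""
    match = PROGRESS_RE.search(text)
    if match is None:
        return None
    done = int(match["done"])
    total = int(match["total"])
    speed = int(match["speed"])
    # Remaining KiB divided by KiB/s gives seconds; guard against speed 0.
    eta_min = (total - done) / speed / 60 if speed else float("inf")
    return {
        "action": match["action"],
        "done": done,
        "total": total,
        "speed_kib_s": speed,
        "reported_finish_min": float(match["finish"]),
        "eta_min": eta_min,
    }


if __name__ == "__main__":
    info = parse_progress(SAMPLE)
    print(f"recomputed ETA: {info['eta_min']:.1f} min")
    print(f"reported ETA:   {info['reported_finish_min']:.1f} min")
```

The recomputed figure lands within a fraction of a percent of the
reported one, which is what one would expect if the kernel works from
an unrounded speed. Either way it is on the order of 100 days for the
remaining ~18% of the check.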

/proc/{PID of md122_raid6}/stack alternates between nothing and:
[<0>] ops_run_io+0x3e/0xdb0 [raid456]
[<0>] handle_stripe+0x144/0x1260 [raid456]
[<0>] handle_active_stripes.isra.0+0x3c5/0x5a0 [raid456]
[<0>] raid5d+0x35c/0x550 [raid456]
[<0>] md_thread+0x97/0x160
[<0>] kthread+0x114/0x150
[<0>] ret_from_fork+0x22/0x30
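
Since strace is not an option for kernel threads, repeatedly sampling
/proc/<pid>/stack is one way to see where the thread spends its time.
The Python sketch below is an assumption-laden illustration, not a tool
used in this thread: the function names are invented, reading the stack
file normally requires root, and a thread that is running on a CPU may
return an empty stack, which would match the "alternates between
nothing and" observation above.

```python
import re
import time
from collections import Counter

# One frame looks like: [<0>] ops_run_io+0x3e/0xdb0 [raid456]
FRAME_RE = re.compile(r"\[<[0-9a-fx]+>\]\s+([\w.]+)\+0x")

SAMPLE = """\
[<0>] ops_run_io+0x3e/0xdb0 [raid456]
[<0>] handle_stripe+0x144/0x1260 [raid456]
[<0>] handle_active_stripes.isra.0+0x3c5/0x5a0 [raid456]
[<0>] raid5d+0x35c/0x550 [raid456]
[<0>] md_thread+0x97/0x160
[<0>] kthread+0x114/0x150
[<0>] ret_from_fork+0x22/0x30
"""


def frames(stack_text):
    """Return the function names in a /proc/<pid>/stack dump, top first."""
    return FRAME_RE.findall(stack_text)


def top_frame(stack_text):
    """Return the innermost function name, or None for an empty stack."""
    found = frames(stack_text)
    return found[0] if found else None


def sample_stack(pid, samples=100, interval=0.05):
    """Count how often each top frame shows up across repeated reads."""
    counts = Counter()
    for _ in range(samples):
        with open(f"/proc/{pid}/stack") as handle:
            counts[top_frame(handle.read())] += 1
        time.sleep(interval)
    return counts


if __name__ == "__main__":
    print(top_frame(SAMPLE))
```

Running sample_stack(2167) as root and printing the most common entries
would show whether the thread is always caught in the same function or
is cycling through the raid5d loop.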

/proc/{PID of md122_raid6}/status:
Name:   md122_raid6
Umask:  0000
State:  R (running)
Tgid:   2167
Ngid:   0
Pid:    2167
PPid:   2
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 64
Groups:
NStgid: 2167
NSpid:  2167
NSpgid: 0
NSsid:  0
Threads:        1
SigQ:   0/1031010
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: fffffffffffffeff
SigCgt: 0000000000000100
CapInh: 0000000000000000
CapPrm: 000000ffffffffff
CapEff: 000000ffffffffff
CapBnd: 000000ffffffffff
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Speculation_Store_Bypass:       thread vulnerable
Cpus_allowed:   ffffff
Cpus_allowed_list:      0-23
Mems_allowed:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        73369830
nonvoluntary_ctxt_switches:     29419786

/proc/{PID of md122_raid6}/stat:
2167 (md122_raid6) R 2 0 0 0 -1 2129984 0 0 0 0 0 5079064 0 0 20 0 1 0 1724 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483391 256 0 0 0 17 21 0 0 390998 0 0 0 0 0 0 0 0 0 0
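
The stat line is easier to read once decoded. The Python sketch below
follows the field order documented in proc(5) (field 9 flags, 14 utime,
15 stime, 39 processor) and assumes a clock tick rate of 100 per
second; on a live system os.sysconf("SC_CLK_TCK") gives the real value.
The function name and the returned keys are illustrative.

```python
PF_KTHREAD = 0x00200000  # task flag the kernel sets on kernel threads

SAMPLE = (
    "2167 (md122_raid6) R 2 0 0 0 -1 2129984 0 0 0 0 0 5079064 0 0 20 0 1 0 "
    "1724 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483391 256 0 0 0 17 21 "
    "0 0 390998 0 0 0 0 0 0 0 0 0 0"
)


def parse_stat(line, clk_tck=100):
    """Decode a few fields of /proc/<pid>/stat.

    comm may contain spaces or parentheses, so split on the last ')'.
    """
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()  # fields[0] is proc(5) field 3 (state)
    flags = int(fields[6])  # field 9
    utime = int(fields[11])  # field 14, user-mode ticks
    stime = int(fields[12])  # field 15, kernel-mode ticks
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],
        "is_kthread": bool(flags & PF_KTHREAD),
        "utime_ticks": utime,
        "stime_ticks": stime,
        "stime_seconds": stime / clk_tck,
        "processor": int(fields[36]),  # field 39, CPU last run on
    }


if __name__ == "__main__":
    info = parse_stat(SAMPLE)
    hours = info["stime_seconds"] / 3600
    print(f"{info['comm']}: {hours:.1f} h of kernel CPU time, "
          f"last ran on CPU {info['processor']}")
```

Under the 100 ticks/s assumption, the stime value in this report works
out to roughly 14 hours of kernel-mode CPU time with zero user time,
consistent with a kernel thread spinning.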

mdadm -D {raid_60_device}:
/dev/md118:
           Version : 1.2
     Creation Time : Sun Apr  5 13:43:11 2020
        Raid Level : raid0
        Array Size : 410183875584 (391181.83 GiB 420028.29 GB)
      Raid Devices : 7
     Total Devices : 7
       Persistence : Superblock is persistent

       Update Time : Sun Apr  5 13:43:11 2020
             State : clean
    Active Devices : 7
   Working Devices : 7
    Failed Devices : 0
     Spare Devices : 0

            Layout : -unknown-
        Chunk Size : 3072K

Consistency Policy : none

              Name : host:all_spinners
              UUID : 74727e9d:8d3cd62a:48369430:dea1e4eb
            Events : 0

    Number   Major   Minor   RaidDevice State
       0       9      121        0      active sync   /dev/md/host:spinners_1
       1       9      124        1      active sync   /dev/md/host:spinners_2
       2       9      122        2      active sync   /dev/md/host:spinners_3
       3       9      125        3      active sync   /dev/md/host:spinners_4
       4       9      120        4      active sync   /dev/md/host:spinners_5
       5       9      119        5      active sync   /dev/md/host:spinners_6
       6       9      123        6      active sync   /dev/md/host:spinners_7

mdadm -D {md122, frozen device}:
/dev/md122:
           Version : 1.2
     Creation Time : Sat Apr  4 10:12:53 2020
        Raid Level : raid6
        Array Size : 58597828608 (55883.24 GiB 60004.18 GB)
     Used Dev Size : 9766304768 (9313.87 GiB 10000.70 GB)
      Raid Devices : 8
     Total Devices : 8
       Persistence : Superblock is persistent

       Update Time : Mon Feb 15 12:02:41 2021
             State : active, checking
    Active Devices : 8
   Working Devices : 8
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

      Check Status : 81% complete

              Name : host:spinners_3
              UUID : 331bc2af:3207b40c:983b923f:14fe1762
            Events : 5869

    Number   Major   Minor   RaidDevice State
       0       8      208        0      active sync   /dev/sdn
       1       8      224        1      active sync   /dev/sdo
       2       8      240        2      active sync   /dev/sdp
       3      65        0        3      active sync   /dev/sdq
       4      65       16        4      active sync   /dev/sdr
       5      65       32        5      active sync   /dev/sds
       6      65       48        6      active sync   /dev/sdt
       7      65       64        7      active sync   /dev/sdu
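
The md122 sizes above are self-consistent: a raid6 array gives up two
devices' worth of space to parity, so the usable size should be
(devices - 2) times the used device size. A small Python check, with a
hypothetical helper name and sizes in KiB as mdadm prints them:

```python
def raid6_usable_kib(n_devices, used_dev_kib):
    """Usable capacity of a raid6 array: two devices' worth goes to P and Q."""
    if n_devices < 4:
        raise ValueError("raid6 needs at least 4 devices")
    return (n_devices - 2) * used_dev_kib


if __name__ == "__main__":
    # Values taken from the mdadm -D output for /dev/md122 above.
    usable = raid6_usable_kib(8, 9766304768)
    print(usable, usable == 58597828608)
```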


* Re: mdxxx_raid6 kernel thread frozen
From: Thomas Kreitler @ 2021-02-16 14:30 UTC
  To: Michael D. O'Brien, linux-raid


On 2021-02-15 22:31, Michael D. O'Brien wrote:
> Hi, I have a single mdadm raid6 in a 56-drive raid60 (7x8) with a
> kernel thread stuck at 100% cpu. The thread typically gets stuck
> during array checks, but it is not the resync thread: md122_raid6 is at
> 100% cpu, whereas md122_resync is at ~0%. When this happens, the
> reported sync speed drops until it reaches 4K/sec. Setting sync_action
> to idle also hangs.
> 
> iostat shows the backing devices aren't doing any I/O, SMART is
> clean for all member drives, and dmesg doesn't say anything useful
> (until the thread has been hung for a long time, at which point it
> reports the hang; I'll post that message when the current issue times
> out). A reboot typically clears the issue, but takes quite a long time,
> as the raid60 is the backing device for a bcache device (with an Optane
> cache) that has a large mounted xfs file system in place.
> 
> I figured I could strace the process, but I learned that's impossible
> with kernel threads :)
> 
[...]

Hello Michael,

This sounds very much like what we have experienced while checking
raid6 arrays.

The issue is actively being worked on at the moment, cf. the
"[PATCH V2] md: don't unregister sync_thread with reconfig_mutex held"
thread.

More details are available at:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t


Kind regards,

	Thomas


-- 
Thomas Kreitler - Information Retrieval
kreitler@molgen.mpg.de
49/30/8413 1702


* Re: mdxxx_raid6 kernel thread frozen
From: Michael D. O'Brien @ 2021-02-16 17:20 UTC
  To: Thomas Kreitler; +Cc: linux-raid

On Tue, Feb 16, 2021 at 6:30 AM Thomas Kreitler <kreitler@molgen.mpg.de> wrote:
>
> This sounds very much like what we have experienced while checking
> raid6 arrays.
>
> The issue is actively being worked on at the moment, cf. the
> "[PATCH V2] md: don't unregister sync_thread with reconfig_mutex held"
> thread.
>
> More details are available at:
> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

Thank you for the pointer, Thomas - I was reading through that thread
last night (I suppose I should have realized it was similar before
sending my e-mail :)), and the progress is quite encouraging.

