From: Xiao Ni <xni@redhat.com>
To: Coly Li <colyli@suse.de>, linux-raid@vger.kernel.org
Cc: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>,
	Song Liu <songliubraving@fb.com>
Subject: Re: unexpected 'mdadm -S' hang with I/O pressure testing
Date: Sat, 12 Sep 2020 22:59:49 +0800	[thread overview]
Message-ID: <dce9b69e-1d22-226a-9a0d-87d8a43c4827@redhat.com> (raw)
In-Reply-To: <0be1a9cf-3a8a-4ed9-91b8-d15787528acf@suse.de>

Hi all

I ran the same test on a single disk (just replacing /dev/md0 with
/dev/sde), and it shows the same problem: Ctrl+C can't stop the fio
process. There is still write I/O on /dev/sde after Ctrl+C, and fio only
exits after all of that I/O has finished.
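
For reference, the single-disk run used the same job file as in the
original report, with only the target device swapped; something like
the following (the name of the copied job file is arbitrary):

   # sed 's|^filename=/dev/md0|filename=/dev/sde|' raid5.fio > raid5-sde.fio
   # fio ./raid5-sde.fio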

Regards
Xiao

On 09/12/2020 10:21 PM, Coly Li wrote:
> One thing to correct: the hang is not forever - after I posted the
> previous email, all commands returned and the array stopped. It took
> around 40 minutes -- still quite unexpected and suspicious.
>
> Thanks.
>
> Coly Li
>
> On 2020/9/12 22:06, Coly Li wrote:
>> Unexpected Behavior:
>> - With the Linux v5.9-rc4 mainline kernel and the latest mdadm upstream code
>> - After running fio with 10 jobs, an iodepth of 16 and a 64K block size
>> for a while, trying to stop the fio process with 'Ctrl + c' leaves the
>> main fio process hanging.
>> - Then trying to stop the md raid5 array with 'mdadm -S /dev/md0' leaves
>> the mdadm process hanging.
>> - After rebooting the system with 'echo b > /proc/sysrq-trigger', the md
>> raid5 array is assembled but inactive. /proc/mdstat shows:
>> 	Personalities : [raid6] [raid5] [raid4]
>> 	md127 : inactive sdc[0] sde[3] sdd[1]
>> 	      35156259840 blocks super 1.2
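>>
>> (Not part of the original report; just a sketch of what one could try
>> at this point to bring the inactive array back, with the device names
>> from this setup:)
>>    # cat /sys/block/md127/md/array_state
>>    # mdadm --run /dev/md127
>> If starting it in place does not work, stop it and re-assemble:
>>    # mdadm --stop /dev/md127
>>    # mdadm --assemble /dev/md0 /dev/sd{c,d,e}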
>>
>> Expectation:
>> - The fio process can be stopped with 'Ctrl + c'
>> - The raid5 array can be stopped by 'mdadm -S /dev/md0'
>> - This md raid5 array should continue to work (resync and become active)
>> after reboot
>>
>>
>> How to reproduce:
>> 1) Create an md raid5 array with 3 hard drives (each a 12TB SATA spinning disk)
>>    # mdadm -C /dev/md0 -l 5 -n 3 /dev/sd{c,d,e}
>>    # cat /proc/mdstat
>> Personalities : [raid6] [raid5] [raid4]
>> md0 : active raid5 sde[3] sdd[1] sdc[0]
>>        23437506560 blocks super 1.2 level 5, 512k chunk, algorithm 2
>> [3/2] [UU_]
>>        [>....................]  recovery =  0.0% (2556792/11718753280)
>> finish=5765844.7min speed=33K/sec
>>        bitmap: 2/88 pages [8KB], 65536KB chunk
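>>
>> (Side note, not in the original report: the initial recovery on these
>> 12TB disks runs for days; its progress can also be watched via sysfs,
>> e.g.:)
>>    # watch -n 5 cat /proc/mdstat
>>    # cat /sys/block/md0/md/sync_action
>>    # cat /sys/block/md0/md/sync_completed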
>>
>> 2) Run fio for random write on the raid5 array
>>    fio job file content:
>> [global]
>> thread=1
>> ioengine=libaio
>> random_generator=tausworthe64
>>
>> [job]
>> filename=/dev/md0
>> readwrite=randwrite
>> blocksize=64K
>> numjobs=10
>> iodepth=16
>> runtime=1m
>>    # fio ./raid5.fio
>>
>> 3) Wait for 10 seconds after the above fio runs, then type 'Ctrl + c' to
>> stop the fio process:
>> x:/home/colyli/fio_test/raid5 # fio ./raid5.fio
>> job: (g=0): rw=randwrite, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB,
>> (T) 64.0KiB-64.0KiB, ioengine=libaio, iodepth=16
>> ...
>> fio-3.23-10-ge007
>> Starting 12 threads
>> ^Cbs: 12 (f=12): [w(12)][3.3%][w=6080KiB/s][w=95 IOPS][eta 14m:30s]
>> fio: terminating on signal 2
>> ^C
>> fio: terminating on signal 2
>> ^C
>> fio: terminating on signal 2
>> Jobs: 11 (f=11): [w(5),_(1),w(4),f(1),w(1)][7.5%][eta 14m:20s]
>> ^C
>> fio: terminating on signal 2
>> Jobs: 11 (f=11): [w(5),_(1),w(4),f(1),w(1)][70.5%][eta 15m:00s]
>>
>> Now the fio process hangs forever.
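>>
>> (Not part of the original steps; a sketch of what could be collected at
>> this point to see where things are stuck. The kernel thread name
>> md0_raid5 below assumes the array device is md0.)
>> Dump all blocked (D state) tasks to the kernel log:
>>    # echo w > /proc/sysrq-trigger
>>    # dmesg | tail -n 200
>> Then look at the kernel stacks of the hanging fio process and of the
>> raid5 worker thread:
>>    # cat /proc/$(pgrep -xo fio)/stack
>>    # cat /proc/$(pgrep -xo md0_raid5)/stack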
>>
>> 4) Try to stop the md raid5 array with mdadm:
>>    # mdadm -S /dev/md0
>>    Now the mdadm process hangs forever.
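>>
>> (Again not part of the original steps; while 'mdadm -S' is stuck, the
>> md sysfs state may show what the stop is waiting for, e.g.:)
>>    # cat /proc/mdstat
>>    # cat /sys/block/md0/md/array_state
>>    # cat /sys/block/md0/md/sync_action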
>>
>>
>> Kernel versions to reproduce:
>> - Use the latest upstream mdadm source code
>> - I tried Linux v5.9-rc4 and Linux v4.12; both of them reproduce the
>> above unexpected behavior reliably.
>>    Therefore I assume that at least v4.12 through v5.9 may have this issue.
>>
>> Just for your information; I hope you can take a look at it. Thanks in
>> advance.
>>
>> Coly Li
>>


Thread overview: 4+ messages
2020-09-12 14:06 unexpected 'mdadm -S' hang with I/O pressure testing Coly Li
2020-09-12 14:21 ` Coly Li
2020-09-12 14:59   ` Xiao Ni [this message]
2020-09-14  0:03 ` Roger Heflin
