Issue: md_raid5 uses 100% CPU and hangs with resync=PENDING status if a drive is removed during initialization

Description:
In a RAID5 array, if a drive is removed during initialization while I/O is in progress on that md device, the I/O gets stuck and the md_raid5 thread uses 100% CPU. The md state also shows resync=PENDING.

Kernel:
The issue was found in the following kernels:
> RHEL 6.5 (2.6.32-431.el6.x86_64)
> CentOS 7 (kernel-3.10.0-123.13.1.el7.x86_64)

Steps to reproduce the issue:
1. Create a RAID5 md with 4 drives using the mdadm command below.
   mdadm -C /dev/md0 -c 64 -l 5 -f -n 4 -e 1.2 /dev/sdb6 /dev/sdc6 /dev/sdd6 /dev/sde6
2. Make the md writable.
   mdadm --readwrite /dev/md0
3. The md now starts initialization.
4. Run the fio tool with the configuration below.
   /usr/bin/fio --name=md0 --filename=/dev/md0 --thread --numjobs=10 --direct=1 --group_reporting --unlink=0 --loops=1 --offset=0 --randrepeat=1 --norandommap --scramble_buffers=1 --stonewall --ioengine=libaio --rw=randwrite --bs=8704 --iodepth=4000 --runtime=3000 --blockalign=512
5. While the md is initializing, remove a drive (either with mdadm set-faulty/remove or by pulling it manually).
6. The I/O now gets stuck, and cat /proc/mdstat shows the state resync=PENDING.

Steps performed one by one to reproduce the issue, with the observations at each step:

1. System information:
[root@root ~]# uname -a
Linux root 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@root ~]# mdadm -V
mdadm - v3.2.6 - 25th October 2012
[root@root ~]# fio --version
fio-2.1.10
[root@root ~]# lsscsi
[0:0:0:0]  disk     SEAGATE   ST31000640SS  0003  /dev/sda
[0:0:1:0]  disk     SEAGATE   ST2000NM0001  0002  /dev/sdb
[0:0:2:0]  enclosu  LSI CORP  SAS2X36       0424  -
[0:0:3:0]  disk     SEAGATE   ST200FM0002   0003  /dev/sdc
[0:0:4:0]  disk     SEAGATE   ST200FM0002   0003  /dev/sdd
[0:0:5:0]  disk     SEAGATE   ST200FM0002   0003  /dev/sde
[0:0:6:0]  disk     SEAGATE   ST200FM0002   0003  /dev/sdf

2. Creating the raid5 md
[root@root ~]# mdadm -C /dev/md0 -c 64 -l 5 -f -n 4 -e 1.2 /dev/sd[cdef]6
mdadm: /dev/sdc6 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Dec 15 18:23:17 2014
mdadm: /dev/sdd6 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Dec 15 18:23:17 2014
mdadm: /dev/sde6 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Dec 15 18:23:17 2014
mdadm: /dev/sdf6 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Mon Dec 15 18:23:17 2014
Continue creating array? y
mdadm: array /dev/md0 started.

dmesg
md: unbind
md: export_rdev(sdf6)
md: unbind
md: export_rdev(sde6)
md: unbind
md: export_rdev(sdd6)
md: unbind
md: export_rdev(sdc6)
md: bind
md: bind
md: bind
md: bind
async_tx: api initialized (async)
xor: automatically using best checksumming function: generic_sse
   generic_sse: 9976.000 MB/sec
xor: using function: generic_sse (9976.000 MB/sec)
raid6: sse2x1 6386 MB/s
raid6: sse2x2 7464 MB/s
raid6: sse2x4 8199 MB/s
raid6: using algorithm sse2x4 (8199 MB/s)
raid6: using ssse3x2 recovery algorithm
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
bio: create slab at 1
md/raid:md0: not clean -- starting background reconstruction
md/raid:md0: device sdf6 operational as raid disk 3
md/raid:md0: device sde6 operational as raid disk 2
md/raid:md0: device sdd6 operational as raid disk 1
md/raid:md0: device sdc6 operational as raid disk 0
md/raid:md0: allocated 4314kB
md/raid:md0: raid level 5 active with 4 out of 4 devices, algorithm 2
RAID conf printout:
 --- level:5 rd:4 wd:4
 disk 0, o:1, dev:sdc6
 disk 1, o:1, dev:sdd6
 disk 2, o:1, dev:sde6
 disk 3, o:1, dev:sdf6
md0: detected capacity change from 0 to 576636125184
md: resync of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 187707072k.
md0: unknown partition table

[root@root ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf6[3] sde6[2] sdd6[1] sdc6[0]
      563121216 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [=>...................]  resync =  8.2% (15404908/187707072) finish=16.6min speed=172462K/sec

unused devices:
[root@root ~]# echo 10000 > /sys/block/md0/md/sync_speed_min
[root@root ~]# echo 30000 > /sys/block/md0/md/sync_speed_max
[root@root ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf6[3] sde6[2] sdd6[1] sdc6[0]
      563121216 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [===>.................]  resync = 16.1% (30226432/187707072) finish=47.3min speed=55459K/sec

unused devices:

3. Start FIO
[root@root ~]# /usr/bin/fio --name=md0 --filename=/dev/md0 --thread --numjobs=10 --direct=1 --group_reporting --unlink=0 --loops=1 --offset=0 --randrepeat=1 --norandommap --scramble_buffers=1 --stonewall --ioengine=libaio --rw=randwrite --bs=8704 --iodepth=4000 --runtime=3000 --blockalign=512
md0: (g=0): rw=randwrite, bs=8704-8704/8704-8704/8704-8704, ioengine=libaio, iodepth=4000
...
fio-2.1.10
Starting 10 threads
Jobs: 10 (f=10): [wwwwwwwwww] [12.8% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 43m:37s]

4. Remove a drive from the md array using the mdadm command
[root@root ~]# mdadm /dev/md0 --set-faulty /dev/sdc6

dmesg
md/raid:md0: Disk failure on sdc6, disabling device.
md/raid:md0: Operation continuing on 3 devices.
md: md0: resync done.
md: checkpointing resync of md0.

5. System state after the drive is removed
[root@root ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf6[3] sde6[2] sdd6[1] sdc6[0](F)
      563121216 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
        resync=PENDING

unused devices:

top
top - 17:55:06 up 1:09, 3 users, load average: 11.98, 8.53, 3.99
Tasks: 313 total, 2 running, 311 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 6.3%sy, 0.0%ni, 93.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16455916k total, 780184k used, 15675732k free, 29628k buffers
Swap: 6127608k total, 0k used, 6127608k free, 116212k cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 2690 root      20   0     0    0    0 R 100.0  0.0  6:44.41 md0_raid5
  235 root      39  19     0    0    0 S   0.3  0.0  0:12.95 kipmi0
 2650 root      20   0 98.1m 4456 3348 S   0.3  0.0  0:00.21 sshd
    1 root      20   0 19364 1536 1232 S   0.0  0.0  0:01.42 init

dmesg
INFO: task fio:2715 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 000000000000000a 0 2715 2654 0x00000080
 ffff88043b623598 0000000000000082 0000000000000000 ffffffff81058d53
 ffff88043b623548 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b40b098 ffff88043b623fd8 000000000000fbc8 ffff88043b40b098
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? md_wakeup_thread+0x39/0x70
 [] md_make_request+0xe1/0x230
 [] ? make_request+0x306/0xc6c [raid456]
 [] generic_make_request+0x240/0x5a0
 [] ? mempool_alloc_slab+0x15/0x20
 [] ? mempool_alloc+0x63/0x140
 [] submit_bio+0x70/0x120
 [] do_direct_IO+0x7ca/0xfa0
 [] __blockdev_direct_IO_newtrunc+0x346/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2717 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000004 0 2717 2654 0x00000080
 ffff880439e97698 0000000000000082 ffff880439e97628 ffffffff81058d53
 ffff880439e97648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b0adab8 ffff880439e97fd8 000000000000fbc8 ffff88043b0adab8
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0x1000/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2718 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000005 0 2718 2654 0x00000080
 ffff88043bc13698 0000000000000082 ffff88043bc13628 ffffffff81058d53
 ffff88043bc13648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b0ad058 ffff88043bc13fd8 000000000000fbc8 ffff88043b0ad058
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0x1000/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2719 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000001 0 2719 2654 0x00000080
 ffff880439ebb698 0000000000000082 ffff880439ebb628 ffffffff81058d53
 ffff880439ebb648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043b0ac5f8 ffff880439ebbfd8 000000000000fbc8 ffff88043b0ac5f8
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2720 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000008 0 2720 2654 0x00000080
 ffff88043b8cf698 0000000000000082 ffff88043b8cf628 ffffffff81058d53
 ffff88043b8cf648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff880439e89af8 ffff88043b8cffd8 000000000000fbc8 ffff880439e89af8
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2721 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000000 0 2721 2654 0x00000080
 ffff88043b047698 0000000000000082 ffff88043b047628 ffffffff81058d53
 ffff88043b047648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff880439e89098 ffff88043b047fd8 000000000000fbc8 ffff880439e89098
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2722 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000000 0 2722 2654 0x00000080
 ffff880439ea3698 0000000000000082 ffff880439ea3628 ffffffff81058d53
 ffff880439ea3648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff880439e88638 ffff880439ea3fd8 000000000000fbc8 ffff880439e88638
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2723 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000006 0 2723 2654 0x00000080
 ffff88043bf5f698 0000000000000082 ffff88043bf5f628 ffffffff81058d53
 ffff88043bf5f648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043a183ab8 ffff88043bf5ffd8 000000000000fbc8 ffff88043a183ab8
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2724 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 000000000000000b 0 2724 2654 0x00000080
 ffff88043be05698 0000000000000082 ffff88043be05628 ffffffff81058d53
 ffff88043be05648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043a183058 ffff88043be05fd8 000000000000fbc8 ffff88043a183058
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b
INFO: task fio:2725 blocked for more than 120 seconds.
Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000003 0 2725 2654 0x00000080
 ffff88043be07698 0000000000000082 ffff88043be07628 ffffffff81058d53
 ffff88043be07648 ffff880230e49cc0 ffff8802389aa228 ffff88043b2ad1b8
 ffff88043a1825f8 ffff88043be07fd8 000000000000fbc8 ffff88043a1825f8
Call Trace:
 [] ? __wake_up+0x53/0x70
 [] ? md_raid5_unplug_device+0x7b/0x100 [raid456]
 [] get_active_stripe+0x236/0x830 [raid456]
 [] ? default_wake_function+0x0/0x20
 [] ? prepare_to_wait+0x4e/0x80
 [] make_request+0x1b5/0xc6c [raid456]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? mempool_alloc_slab+0x15/0x20
 [] md_make_request+0xe1/0x230
 [] ? bvec_alloc_bs+0x62/0x110
 [] ? __bio_add_page+0x110/0x230
 [] generic_make_request+0x240/0x5a0
 [] ? do_direct_IO+0x57c/0xfa0
 [] submit_bio+0x70/0x120
 [] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
 [] ? blkdev_get_block+0x0/0x20
 [] __blockdev_direct_IO+0x77/0xe0
 [] ? blkdev_get_block+0x0/0x20
 [] blkdev_direct_IO+0x57/0x60
 [] ? blkdev_get_block+0x0/0x20
 [] generic_file_direct_write+0xc2/0x190
 [] __generic_file_aio_write+0x3a1/0x490
 [] ? aio_read_evt+0xa0/0x170
 [] blkdev_aio_write+0x3c/0xa0
 [] ? blkdev_aio_write+0x0/0xa0
 [] aio_rw_vect_retry+0x84/0x200
 [] aio_run_iocb+0x64/0x170
 [] do_io_submit+0x291/0x920
 [] sys_io_submit+0x10/0x20
 [] system_call_fastpath+0x16/0x1b

[root@root ~]# cat /proc/2690/stack
[] __cond_resched+0x2a/0x40
[] ops_run_io+0x2c/0x920 [raid456]
[] handle_stripe+0x9cc/0x2980 [raid456]
[] raid5d+0x624/0x850 [raid456]
[] md_thread+0x115/0x150
[] kthread+0x96/0xa0
[] child_rip+0xa/0x20
[] 0xffffffffffffffff
[root@root ~]# cat /proc/2690/stat
2690 (md0_raid5) R 2 0 0 0 -1 2149613632 0 0 0 0 0 68495 0 0 20 0 1 0 350990 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483391 256 0 0 0 17 2 0 0 6855 0 0
[root@root ~]# cat /proc/2690/statm
0 0 0 0 0 0 0
[root@root ~]# cat /proc/2690/stat
stat    statm   status
[root@root ~]# cat /proc/2690/status
Name: md0_raid5
State: R (running)
Tgid: 2690
Pid: 2690
PPid: 2
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
Utrace: 0
FDSize: 64
Groups:
Threads: 1
SigQ: 2/128402
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: fffffffffffffeff
SigCgt: 0000000000000100
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: fffffffffffffeff
CapBnd: ffffffffffffffff
Cpus_allowed: ffffff
Cpus_allowed_list: 0-23
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 5411612
nonvoluntary_ctxt_switches: 257032
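
For anyone trying to confirm the same symptom on another machine, the check against /proc/mdstat can be automated. The following is only a rough sketch and not part of the original test run: the function name md_hang_signature and the sample text are illustrative assumptions, and it assumes /proc/mdstat has its usual multi-line layout with arrays named mdN. It flags arrays that are degraded (fewer working devices than raid devices in the [n/m] field) and also show resync=PENDING, which is the state reported above.

```python
import re

# Sample text modelled on the /proc/mdstat output in this report
# (layout with line breaks is assumed to match the usual kernel format).
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf6[3] sde6[2] sdd6[1] sdc6[0](F)
      563121216 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
        resync=PENDING

unused devices:
"""


def md_hang_signature(mdstat_text):
    """Return names of md arrays that are degraded and show resync=PENDING.

    Hypothetical helper: it only reports the state seen in this report,
    it does not prove that the md thread is actually spinning.
    """
    suspects = []
    current = None      # name of the array whose block we are inside
    degraded = False    # True once [raid_disks/working_disks] shows a gap
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+)\s*:", line)
        if header:
            current, degraded = header.group(1), False
            continue
        if current is None:
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if counts and int(counts.group(2)) < int(counts.group(1)):
            degraded = True
        if "resync=PENDING" in line and degraded:
            suspects.append(current)
    return suspects


if __name__ == "__main__":
    # On a live system, read the real file instead of the sample:
    #   with open("/proc/mdstat") as f: text = f.read()
    print(md_hang_signature(SAMPLE_MDSTAT))
```

A match here only narrows things down. The CPU usage of the mdX_raid5 thread and the hung-task messages in dmesg still need to be checked by hand, as done in step 5.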