* WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
  [not found] <1382969797.9407100.1469619450040.JavaMail.zimbra@redhat.com>
@ 2016-07-27 11:38 ` Yi Zhang
  2016-07-28  1:13 ` Shaohua Li
  ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Yi Zhang @ 2016-07-27 11:38 UTC (permalink / raw)
To: linux-raid

Hello everyone

I'm testing raid5-cache recently and found below issue on 4.7.0-rc7.

[  902.701162] md: bind<sdb1>
[  902.701248] md: bind<sdd1>
[  902.702566] md: bind<sde1>
[  902.702625] md: bind<sdf1>
[  902.703306] md: bind<sdc1>
[  902.705899] md/raid:md0: not clean -- starting background reconstruction
[  902.705972] md/raid:md0: device sdf1 operational as raid disk 3
[  902.705974] md/raid:md0: device sde1 operational as raid disk 2
[  902.705975] md/raid:md0: device sdd1 operational as raid disk 1
[  902.705976] md/raid:md0: device sdb1 operational as raid disk 0
[  902.706312] md/raid:md0: allocated 4374kB
[  902.706371] md/raid:md0: raid level 6 active with 4 out of 4 devices, algorithm 2
[  902.706372] RAID conf printout:
[  902.706380]  --- level:6 rd:4 wd:4
[  902.706381]  disk 0, o:1, dev:sdb1
[  902.706382]  disk 1, o:1, dev:sdd1
[  902.706383]  disk 2, o:1, dev:sde1
[  902.706384]  disk 3, o:1, dev:sdf1
[  902.706393] md/raid456: discard support disabled due to uncertainty.
[  902.706394] Set raid456.devices_handle_discard_safely=Y to override.
[  902.706396] md/raid:md0: using device sdc1 as journal
[  902.707354] created bitmap (1 pages) for device md0
[  902.707401] md0: bitmap initialized from disk: read 1 pages, set 8 of 8 bits
[  902.781331] md0: detected capacity change from 0 to 1071644672
[  902.781465] md: resync of RAID array md0
[  902.781468] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[  902.781469] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[  902.781475] md: using 128k window, over a total of 523264k.
[  902.829618] md: couldn't update array info. -22
[  902.876797] md: couldn't update array info. -22
[  902.896926] md: couldn't update array info. -22
[  909.185914] md: md0: resync done.
[  909.262779] RAID conf printout:
[  909.262782]  --- level:6 rd:4 wd:4
[  909.262783]  disk 0, o:1, dev:sdb1
[  909.262784]  disk 1, o:1, dev:sdd1
[  909.262785]  disk 2, o:1, dev:sde1
[  909.262786]  disk 3, o:1, dev:sdf1
[  912.795104] md/raid:md0: Disk failure on sdb1, disabling device.
               md/raid:md0: Operation continuing on 3 devices.
[  912.851893] RAID conf printout:
[  912.851895]  --- level:6 rd:4 wd:3
[  912.851897]  disk 0, o:0, dev:sdb1
[  912.851898]  disk 1, o:1, dev:sdd1
[  912.851899]  disk 2, o:1, dev:sde1
[  912.851900]  disk 3, o:1, dev:sdf1
[  912.855130] RAID conf printout:
[  912.855133]  --- level:6 rd:4 wd:3
[  912.855135]  disk 1, o:1, dev:sdd1
[  912.855136]  disk 2, o:1, dev:sde1
[  912.855137]  disk 3, o:1, dev:sdf1
[  912.885778] md: unbind<sdb1>
[  912.894205] md: export_rdev(sdb1)
[ 1548.098231] md: bind<sdb1>
[ 1548.452714] RAID conf printout:
[ 1548.452717]  --- level:6 rd:4 wd:3
[ 1548.452718]  disk 0, o:1, dev:sdb1
[ 1548.452719]  disk 1, o:1, dev:sdd1
[ 1548.452720]  disk 2, o:1, dev:sde1
[ 1548.452721]  disk 3, o:1, dev:sdf1
[ 1548.486067] ------------[ cut here ]------------
[ 1548.486076] WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
[ 1548.486077] Modules linked in: fuse btrfs vfat msdos fat ext4 jbd2 mbcache binfmt_misc raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ipt_REJECT nf_reject_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sb_edac edac_core snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic x86_pkg_temp_thermal snd_hda_intel intel_powerclamp coretemp snd_hda_codec
[ 1548.486117] kvm_intel kvm snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel snd_seq aesni_intel lrw snd_seq_device gf128mul glue_helper snd_pcm ablk_helper hp_wmi iTCO_wdt snd_timer sparse_keymap rfkill cryptd iTCO_vendor_support mei_me ioatdma snd mei shpchp pcspkr sg soundcore i2c_i801 tpm_infineon lpc_ich mfd_core dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm isci e1000e ata_generic libsas pata_acpi ahci libahci scsi_transport_sas firewire_ohci libata firewire_core ptp crc32c_intel serio_raw pps_core i2c_core crc_itu_t wmi fjes dm_mirror dm_region_hash dm_log dm_mod
[ 1548.486152] CPU: 4 PID: 10512 Comm: md0_resync Tainted: G        W       4.7.0-rc7 #1
[ 1548.486153] Hardware name: Hewlett-Packard HP Z620 Workstation/158A, BIOS J61 v03.69 03/25/2014
[ 1548.486155]  0000000000000286 000000002369dfe1 ffff88079d713b40 ffffffff8134caec
[ 1548.486157]  0000000000000000 0000000000000000 ffff88079d713b80 ffffffff8108c351
[ 1548.486159]  000002d8a6ae4d80 000000008f0c904f ffff8807f64b0000 ffff8807c4ab6888
[ 1548.486161] Call Trace:
[ 1548.486166]  [<ffffffff8134caec>] dump_stack+0x63/0x87
[ 1548.486169]  [<ffffffff8108c351>] __warn+0xd1/0xf0
[ 1548.486172]  [<ffffffff8108c48d>] warn_slowpath_null+0x1d/0x20
[ 1548.486175]  [<ffffffffa097d805>] r5l_do_reclaim+0x415/0x430 [raid456]
[ 1548.486178]  [<ffffffff811f9c50>] ? kfree+0x120/0x170
[ 1548.486181]  [<ffffffffa097e33b>] r5l_quiesce+0x7b/0xa0 [raid456]
[ 1548.486183]  [<ffffffffa0970da0>] raid5_quiesce+0x50/0x2a0 [raid456]
[ 1548.486186]  [<ffffffff810d0250>] ? prepare_to_wait_event+0xf0/0xf0
[ 1548.486189]  [<ffffffff8156d00e>] md_do_sync+0xe7e/0xf60
[ 1548.486192]  [<ffffffff810c4370>] ? enqueue_entity+0x2a0/0xcc0
[ 1548.486194]  [<ffffffff810c201d>] ? update_curr+0xed/0x180
[ 1548.486196]  [<ffffffff810c026e>] ? account_entity_dequeue+0xae/0xd0
[ 1548.486197]  [<ffffffff810c27f6>] ? dequeue_entity+0x266/0x980
[ 1548.486200]  [<ffffffff810b5ef5>] ? check_preempt_curr+0x75/0x90
[ 1548.486202]  [<ffffffff810b5f29>] ? ttwu_do_wakeup+0x19/0xe0
[ 1548.486204]  [<ffffffff810c2fce>] ? dequeue_task_fair+0xbe/0x830
[ 1548.486206]  [<ffffffff810c3995>] ? put_prev_entity+0x35/0x730
[ 1548.486209]  [<ffffffff8102c6d9>] ? __switch_to+0x219/0x5c0
[ 1548.486211]  [<ffffffff81098d43>] ? kernel_sigaction+0x43/0xe0
[ 1548.486214]  [<ffffffff81566aa6>] md_thread+0x136/0x150
[ 1548.486216]  [<ffffffff81566970>] ? find_pers+0x70/0x70
[ 1548.486219]  [<ffffffff810ab698>] kthread+0xd8/0xf0
[ 1548.486221]  [<ffffffff816e873f>] ret_from_fork+0x1f/0x40
[ 1548.486223]  [<ffffffff810ab5c0>] ? kthread_park+0x60/0x60
[ 1548.486224] ---[ end trace 242e9516e36acb1e ]---
[ 1548.543855] md: recovery of RAID array md0
[ 1548.543857] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 1548.543859] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[ 1548.543864] md: using 128k window, over a total of 523264k.
[ 1548.543868] md: md0: recovery done.
[ 1548.574961] RAID conf printout:
[ 1548.574964]  --- level:6 rd:4 wd:4
[ 1548.574966]  disk 0, o:1, dev:sdb1
[ 1548.574967]  disk 1, o:1, dev:sdd1
[ 1548.574968]  disk 2, o:1, dev:sde1
[ 1548.574969]  disk 3, o:1, dev:sdf1

Steps I used:

mdadm --create --run /dev/md0 --level 4 --metadata 1.2 --raid-devices 4
/dev/sdb1 /dev/sd[d-f]1 --write-journal /dev/sdc1 --bitmap=internal
--bitmap-chunk=64M --chunk 512
mdadm --wait /dev/md0
mkfs.ext4 /dev/md0
mdadm /dev/md0 -f /dev/sdb1
mdadm /dev/md0 -r /dev/sdb1
mdadm /dev/md0 -a /dev/sdb1

Best Regards,
Yi Zhang

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
  2016-07-27 11:38 ` WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] Yi Zhang
@ 2016-07-28  1:13 ` Shaohua Li
  2016-08-01  9:01 ` Guoqing Jiang
  [not found] ` <579F0E52.8090704@suse.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Shaohua Li @ 2016-07-28 1:13 UTC (permalink / raw)
To: Yi Zhang; +Cc: linux-raid

On Wed, Jul 27, 2016 at 07:38:49AM -0400, Yi Zhang wrote:
> Hello everyone
>
> I'm testing raid5-cache recently and found below issue on 4.7.0-rc7.
>
> [  902.701162] md: bind<sdb1>
[...]
> [ 1548.486076] WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
[...]
> [ 1548.486175]  [<ffffffffa097d805>] r5l_do_reclaim+0x415/0x430 [raid456]
> [ 1548.486181]  [<ffffffffa097e33b>] r5l_quiesce+0x7b/0xa0 [raid456]
> [ 1548.486183]  [<ffffffffa0970da0>] raid5_quiesce+0x50/0x2a0 [raid456]
> [ 1548.486189]  [<ffffffff8156d00e>] md_do_sync+0xe7e/0xf60
[...]
> Steps I used:
>
> mdadm --create --run /dev/md0 --level 4 --metadata 1.2 --raid-devices 4
> /dev/sdb1 /dev/sd[d-f]1 --write-journal /dev/sdc1 --bitmap=internal
> --bitmap-chunk=64M --chunk 512
> mdadm --wait /dev/md0
> mkfs.ext4 /dev/md0
> mdadm /dev/md0 -f /dev/sdb1
> mdadm /dev/md0 -r /dev/sdb1
> mdadm /dev/md0 -a /dev/sdb1

Thanks for the report. It looks like the assumption that .quiesce is
called with reconfig_mutex locked is incorrect. I'll work out a fix.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 7+ messages in thread
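Shaohua's diagnosis is about a violated locking precondition: the journal reclaim code was written assuming that the `.quiesce` path (and thus `r5l_do_reclaim`) is only reached with `reconfig_mutex` held, but the `md_do_sync` path in the trace reaches it without that lock, which is what trips the WARN. The shape of that bug can be sketched in stand-alone Python, with `threading.Lock` standing in for `reconfig_mutex`; all names here are illustrative, not the kernel's actual API:

```python
import threading

# Illustrative stand-in for mddev's reconfig_mutex.
reconfig_mutex = threading.Lock()

def reclaim_sketch():
    """Analogue of the warning at raid5-cache.c:728: the reclaim path
    assumed its caller already holds reconfig_mutex."""
    if not reconfig_mutex.locked():
        print("WARNING: reclaim entered without reconfig_mutex held")
        return False
    return True

# ioctl-style caller: takes the mutex first, so the assumption holds.
reconfig_mutex.acquire()
assert reclaim_sketch() is True
reconfig_mutex.release()

# md_do_sync-style caller: reaches quiesce/reclaim without the mutex;
# this is the path on which the warning fires.
assert reclaim_sketch() is False
```

The sketch only shows why the warning fires on one call path and not the other; the actual fix could go either way (drop the assumption in reclaim, or make every quiesce caller take the lock).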
* Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
  2016-07-27 11:38 ` WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] Yi Zhang
  2016-07-28  1:13 ` Shaohua Li
@ 2016-08-01  9:01 ` Guoqing Jiang
  [not found] ` <579F0E52.8090704@suse.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Guoqing Jiang @ 2016-08-01 9:01 UTC (permalink / raw)
To: Yi Zhang, linux-raid

On 07/27/2016 07:38 AM, Yi Zhang wrote:
> Hello everyone
>
> I'm testing raid5-cache recently and found below issue on 4.7.0-rc7.
>
[...]
> [  902.706371] md/raid:md0: raid level 6 active with 4 out of 4 devices, algorithm 2

This is level 6, not 4 as you said at the end.

[...]
> [  902.829618] md: couldn't update array info. -22
> [  902.876797] md: couldn't update array info. -22
> [  902.896926] md: couldn't update array info. -22

Errors happened while updating the array info; I'm not sure whether they
are related to the WARNING.

[...]
> Steps I used:
>
> mdadm --create --run /dev/md0 --level 4 --metadata 1.2 --raid-devices 4
> /dev/sdb1 /dev/sd[d-f]1 --write-journal /dev/sdc1 --bitmap=internal
> --bitmap-chunk=64M --chunk 512
> mdadm --wait /dev/md0
> mkfs.ext4 /dev/md0
> mdadm /dev/md0 -f /dev/sdb1
> mdadm /dev/md0 -r /dev/sdb1
> mdadm /dev/md0 -a /dev/sdb1

I tried the above commands in my VM (with the level set to 6), but can't
see the call trace with the versions below. Maybe it is a regression; if
so, I would suggest bisecting to narrow down the issue.

linux135:~ # uname -r
4.4.13-46-default
linux135:~ # mdadm --version
mdadm - v3.4-47-g8718fc3 - 11th May 2016

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 7+ messages in thread
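The bisect Guoqing suggests is cheap in practice, since the number of kernels to build and test grows only logarithmically with the number of candidate commits. A quick back-of-the-envelope estimate (the commit count below is a round assumption for illustration; the real figure comes from `git rev-list --count v4.4..v4.7`):

```python
import math

good, bad = "v4.4", "v4.7"
commits = 65_000  # assumed rough magnitude of commits in good..bad
# Each bisection step halves the candidate range, so the worst case
# is ceil(log2(commits)) test kernels.
steps = math.ceil(math.log2(commits))
print(f"bisecting {good}..{bad}: at most {steps} test kernels")
```

With this assumed commit count the worst case is 16 builds, which is why bisecting a whole 4.4-to-4.7 window is feasible.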
[parent not found: <579F0E52.8090704@suse.com>]
* Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] [not found] ` <579F0E52.8090704@suse.com> @ 2016-08-01 12:32 ` Yi Zhang [not found] ` <7d0acb42-1209-0722-7740-0fa03d44106b@redhat.com> 1 sibling, 0 replies; 7+ messages in thread From: Yi Zhang @ 2016-08-01 12:32 UTC (permalink / raw) To: Guoqing Jiang; +Cc: linux-raid Best Regards, Yi Zhang ----- Original Message ----- From: "Guoqing Jiang" <gqjiang@suse.com> To: "Yi Zhang" <yizhan@redhat.com>, linux-raid@vger.kernel.org Sent: Monday, August 1, 2016 4:54:42 PM Subject: Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] On 07/27/2016 07:38 PM, Yi Zhang wrote: > Hello everyone > > I'm testing raid5-cache recently and found below issue on 4.7.0-rc7. > > [ 902.701162] md: bind<sdb1> > [ 902.701248] md: bind<sdd1> > [ 902.702566] md: bind<sde1> > [ 902.702625] md: bind<sdf1> > [ 902.703306] md: bind<sdc1> > [ 902.705899] md/raid:md0: not clean -- starting background reconstruction > [ 902.705972] md/raid:md0: device sdf1 operational as raid disk 3 > [ 902.705974] md/raid:md0: device sde1 operational as raid disk 2 > [ 902.705975] md/raid:md0: device sdd1 operational as raid disk 1 > [ 902.705976] md/raid:md0: device sdb1 operational as raid disk 0 > [ 902.706312] md/raid:md0: allocated 4374kB > [ 902.706371] md/raid:md0: raid level*6* active with 4 out of 4 devices, algorithm 2 This is level 6 not 4. Yi: Sorry for that, but both level 6 and leve4 can be reproduced on my environment. > [ 902.706372] RAID conf printout: > [ 902.706380] --- level:6 rd:4 wd:4 > [ 902.706381] disk 0, o:1, dev:sdb1 > [ 902.706382] disk 1, o:1, dev:sdd1 > [ 902.706383] disk 2, o:1, dev:sde1 > [ 902.706384] disk 3, o:1, dev:sdf1 > [ 902.706393] md/raid456: discard support disabled due to uncertainty. > [ 902.706394] Set raid456.devices_handle_discard_safely=Y to override. 
> [ 902.706396] md/raid:md0: using device sdc1 as journal > [ 902.707354] created bitmap (1 pages) for device md0 > [ 902.707401] md0: bitmap initialized from disk: read 1 pages, set 8 of 8 bits > [ 902.781331] md0: detected capacity change from 0 to 1071644672 > [ 902.781465] md: resync of RAID array md0 > [ 902.781468] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. > [ 902.781469] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync. > [ 902.781475] md: using 128k window, over a total of 523264k. > [ 902.829618] md: couldn't update array info. -22 > [ 902.876797] md: couldn't update array info. -22 > [ 902.896926] md: couldn't update array info. -22 Errors happened when update array info, not sure it have relationship with the WARNING or not. > [ 909.185914] md: md0: resync done. > [ 909.262779] RAID conf printout: > [ 909.262782] --- level:6 rd:4 wd:4 > [ 909.262783] disk 0, o:1, dev:sdb1 > [ 909.262784] disk 1, o:1, dev:sdd1 > [ 909.262785] disk 2, o:1, dev:sde1 > [ 909.262786] disk 3, o:1, dev:sdf1 > [ 912.795104] md/raid:md0: Disk failure on sdb1, disabling device. > md/raid:md0: Operation continuing on 3 devices. 
> [ 912.851893] RAID conf printout: > [ 912.851895] --- level:6 rd:4 wd:3 > [ 912.851897] disk 0, o:0, dev:sdb1 > [ 912.851898] disk 1, o:1, dev:sdd1 > [ 912.851899] disk 2, o:1, dev:sde1 > [ 912.851900] disk 3, o:1, dev:sdf1 > [ 912.855130] RAID conf printout: > [ 912.855133] --- level:6 rd:4 wd:3 > [ 912.855135] disk 1, o:1, dev:sdd1 > [ 912.855136] disk 2, o:1, dev:sde1 > [ 912.855137] disk 3, o:1, dev:sdf1 > [ 912.885778] md: unbind<sdb1> > [ 912.894205] md: export_rdev(sdb1) > [ 1548.098231] md: bind<sdb1> > [ 1548.452714] RAID conf printout: > [ 1548.452717] --- level:6 rd:4 wd:3 > [ 1548.452718] disk 0, o:1, dev:sdb1 > [ 1548.452719] disk 1, o:1, dev:sdd1 > [ 1548.452720] disk 2, o:1, dev:sde1 > [ 1548.452721] disk 3, o:1, dev:sdf1 > [ 1548.486067] ------------[ cut here ]------------ > [ 1548.486076] WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] > [ 1548.486077] Modules linked in: fuse btrfs vfat msdos fat ext4 jbd2 mbcache binfmt_misc raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ipt_REJECT nf_reject_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sb_edac edac_core snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic x86_pkg_temp_thermal snd_hda_intel intel_powerclamp coretemp s nd_hda_codec > [ 1548.486117] kvm_intel kvm snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel snd_seq aesni_intel lrw snd_seq_device gf128mul glue_helper snd_pcm ablk_helper hp_wmi iTCO_wdt snd_timer sparse_keymap rfkill cryptd iTCO_vendor_support 
mei_me ioatdma snd mei shpchp pcspkr sg soundcore i2c_i801 tpm_infineon lpc_ich mfd_core dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm isci e1000e ata_generic libsas pata_acpi ahci libahci scsi_transport_sas firewire_ohci libata firewire_core ptp crc32c_intel serio_raw pps_core i2c_core crc_itu_t wmi fjes dm_mirror dm_region_hash dm_log dm_mod > [ 1548.486152] CPU: 4 PID: 10512 Comm: md0_resync Tainted: G W 4.7.0-rc7 #1 > [ 1548.486153] Hardware name: Hewlett-Packard HP Z620 Workstation/158A, BIOS J61 v03.69 03/25/2014 > [ 1548.486155] 0000000000000286 000000002369dfe1 ffff88079d713b40 ffffffff8134caec > [ 1548.486157] 0000000000000000 0000000000000000 ffff88079d713b80 ffffffff8108c351 > [ 1548.486159] 000002d8a6ae4d80 000000008f0c904f ffff8807f64b0000 ffff8807c4ab6888 > [ 1548.486161] Call Trace: > [ 1548.486166] [<ffffffff8134caec>] dump_stack+0x63/0x87 > [ 1548.486169] [<ffffffff8108c351>] __warn+0xd1/0xf0 > [ 1548.486172] [<ffffffff8108c48d>] warn_slowpath_null+0x1d/0x20 > [ 1548.486175] [<ffffffffa097d805>] r5l_do_reclaim+0x415/0x430 [raid456] > [ 1548.486178] [<ffffffff811f9c50>] ? kfree+0x120/0x170 > [ 1548.486181] [<ffffffffa097e33b>] r5l_quiesce+0x7b/0xa0 [raid456] > [ 1548.486183] [<ffffffffa0970da0>] raid5_quiesce+0x50/0x2a0 [raid456] > [ 1548.486186] [<ffffffff810d0250>] ? prepare_to_wait_event+0xf0/0xf0 > [ 1548.486189] [<ffffffff8156d00e>] md_do_sync+0xe7e/0xf60 > [ 1548.486192] [<ffffffff810c4370>] ? enqueue_entity+0x2a0/0xcc0 > [ 1548.486194] [<ffffffff810c201d>] ? update_curr+0xed/0x180 > [ 1548.486196] [<ffffffff810c026e>] ? account_entity_dequeue+0xae/0xd0 > [ 1548.486197] [<ffffffff810c27f6>] ? dequeue_entity+0x266/0x980 > [ 1548.486200] [<ffffffff810b5ef5>] ? check_preempt_curr+0x75/0x90 > [ 1548.486202] [<ffffffff810b5f29>] ? ttwu_do_wakeup+0x19/0xe0 > [ 1548.486204] [<ffffffff810c2fce>] ? 
dequeue_task_fair+0xbe/0x830 > [ 1548.486206] [<ffffffff810c3995>] ? put_prev_entity+0x35/0x730 > [ 1548.486209] [<ffffffff8102c6d9>] ? __switch_to+0x219/0x5c0 > [ 1548.486211] [<ffffffff81098d43>] ? kernel_sigaction+0x43/0xe0 > [ 1548.486214] [<ffffffff81566aa6>] md_thread+0x136/0x150 > [ 1548.486216] [<ffffffff81566970>] ? find_pers+0x70/0x70 > [ 1548.486219] [<ffffffff810ab698>] kthread+0xd8/0xf0 > [ 1548.486221] [<ffffffff816e873f>] ret_from_fork+0x1f/0x40 > [ 1548.486223] [<ffffffff810ab5c0>] ? kthread_park+0x60/0x60 > [ 1548.486224] ---[ end trace 242e9516e36acb1e ]--- > [ 1548.543855] md: recovery of RAID array md0 > [ 1548.543857] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. > [ 1548.543859] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. > [ 1548.543864] md: using 128k window, over a total of 523264k. > [ 1548.543868] md: md0: recovery done. > [ 1548.574961] RAID conf printout: > [ 1548.574964] --- level:6 rd:4 wd:4 > [ 1548.574966] disk 0, o:1, dev:sdb1 > [ 1548.574967] disk 1, o:1, dev:sdd1 > [ 1548.574968] disk 2, o:1, dev:sde1 > [ 1548.574969] disk 3, o:1, dev:sdf1 > > Steps I used: > > mdadm --create --run /dev/md0 --level 4 --metadata 1.2 --raid-devices 4 > /dev/sdb1 /dev/sd[d-f]1 --write-journal /dev/sdc1 --bitmap=internal > --bitmap-chunk=64M --chunk 512 > mdadm --wait /dev/md0 > mkfs.ext4 /dev/md0 > mdadm /dev/md0 -f /dev/sdb1 > mdadm /dev/md0 -r /dev/sdb1 > mdadm /dev/md0 -a /dev/sdb1 I tried above cmds in my VM machine (with set level as 6), but can't see the calltrace with below versions. 
linux135:~ # uname -r
4.4.13-46-default
linux135:~ # mdadm --version
mdadm - v3.4-47-g8718fc3 - 11th May 2016

Yi: Could you try with 4.7.0? It's very easy to reproduce on my side. Below is my env:

#lsblk /dev/sd[b-f]
NAME     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdb        8:16   0 931.5G  0 disk
└─sdb1     8:17   0    10G  0 part
  └─md0    9:0    0    30G  0 raid4
sdc        8:32   0 186.3G  0 disk
└─sdc1     8:33   0    10G  0 part
  └─md0    9:0    0    30G  0 raid4
sdd        8:48   0 111.8G  0 disk
└─sdd1     8:49   0    10G  0 part
  └─md0    9:0    0    30G  0 raid4
sde        8:64   0 931.5G  0 disk
└─sde1     8:65   0    10G  0 part
  └─md0    9:0    0    30G  0 raid4
sdf        8:80   0 931.5G  0 disk
└─sdf1     8:81   0    10G  0 part
  └─md0    9:0    0    30G  0 raid4
# uname -r
4.7.0
# mdadm --version
mdadm - v3.4-63-g52209d6 - 21st July 2016
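The reproduction steps quoted above can be collected into one small script. This is a hedged sketch, not part of the original report: the device names (/dev/sdb1 etc.) are the reporter's and must be adjusted, and the `run`/`DRY_RUN` wrapper is an addition so the sequence can be previewed without root or spare partitions. The real run is destructive and needs root plus five unused partitions.

```shell
#!/bin/sh
# Sketch of the reproduction sequence from this thread.
# By default (DRY_RUN=1) the commands are only printed;
# set DRY_RUN=0 to actually execute them (destructive, needs root).

MD=/dev/md0
DATA="/dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1"
JOURNAL=/dev/sdc1

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"            # preview mode: print the command
    else
        "$@"                 # real mode: execute it
    fi
}

# Create a RAID4 array with a write-journal device and an internal bitmap.
run mdadm --create --run "$MD" --level 4 --metadata 1.2 \
    --raid-devices 4 $DATA --write-journal "$JOURNAL" \
    --bitmap=internal --bitmap-chunk=64M --chunk 512
run mdadm --wait "$MD"         # let the initial resync finish
run mkfs.ext4 "$MD"
run mdadm "$MD" -f /dev/sdb1   # fail one data disk...
run mdadm "$MD" -r /dev/sdb1   # ...remove it...
run mdadm "$MD" -a /dev/sdb1   # ...re-add it; the WARN in
                               # r5l_do_reclaim fires during recovery
```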
* Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] [not found] ` <7d0acb42-1209-0722-7740-0fa03d44106b@redhat.com> @ 2016-08-02 8:36 ` Guoqing Jiang 2016-08-02 13:31 ` yizhan 0 siblings, 1 reply; 7+ messages in thread From: Guoqing Jiang @ 2016-08-02 8:36 UTC (permalink / raw) To: yizhan, linux-raid On 08/01/2016 08:28 AM, yizhan wrote: > > >>> [ 902.706372] RAID conf printout: >>> [ 902.706380] --- level:6 rd:4 wd:4 >>> [ 902.706381] disk 0, o:1, dev:sdb1 >>> [ 902.706382] disk 1, o:1, dev:sdd1 >>> [ 902.706383] disk 2, o:1, dev:sde1 >>> [ 902.706384] disk 3, o:1, dev:sdf1 >>> [ 902.706393] md/raid456: discard support disabled due to uncertainty. >>> [ 902.706394] Set raid456.devices_handle_discard_safely=Y to override. >>> [ 902.706396] md/raid:md0: using device sdc1 as journal >>> [ 902.707354] created bitmap (1 pages) for device md0 >>> [ 902.707401] md0: bitmap initialized from disk: read 1 pages, set 8 of 8 bits >>> [ 902.781331] md0: detected capacity change from 0 to 1071644672 >>> [ 902.781465] md: resync of RAID array md0 >>> [ 902.781468] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. >>> [ 902.781469] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync. >>> [ 902.781475] md: using 128k window, over a total of 523264k. >>> [ 902.829618] md: couldn't update array info. -22 >>> [ 902.876797] md: couldn't update array info. -22 >>> [ 902.896926] md: couldn't update array info. -22 >> >> Errors happened when update array info, not sure it have relationship >> with >> the WARNING or not. 
>> [snip] >>> [ 1548.486067] ------------[ cut here ]------------ >>> [ 1548.486076] WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] >>> [ 1548.486077] Modules linked in: fuse btrfs vfat msdos fat ext4 jbd2 mbcache binfmt_misc raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ipt_REJECT nf_reject_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sb_edac edac_core snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic x86_pkg_temp_thermal snd_hda_intel intel_powerclamp coretemp snd_hda_codec >>> [ 1548.486117] kvm_intel kvm snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel snd_seq aesni_intel lrw snd_seq_device gf128mul glue_helper snd_pcm ablk_helper hp_wmi iTCO_wdt snd_timer sparse_keymap rfkill cryptd iTCO_vendor_support mei_me ioatdma snd mei shpchp pcspkr sg soundcore i2c_i801 tpm_infineon lpc_ich mfd_core dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm isci e1000e ata_generic libsas pata_acpi ahci libahci scsi_transport_sas firewire_ohci libata firewire_core ptp crc32c_intel serio_raw pps_core i2c_core crc_itu_t wmi fjes dm_mirror dm_region_hash dm_log dm_mod >>> [ 1548.486152] CPU: 4 PID: 10512 Comm: md0_resync Tainted: G W 4.7.0-rc7 #1 >>> [ 1548.486153] Hardware name: Hewlett-Packard HP Z620 Workstation/158A, BIOS J61 v03.69 03/25/2014 >>> [ 1548.486155] 0000000000000286 000000002369dfe1 
ffff88079d713b40 ffffffff8134caec >>> [ 1548.486157] 0000000000000000 0000000000000000 ffff88079d713b80 ffffffff8108c351 >>> [ 1548.486159] 000002d8a6ae4d80 000000008f0c904f ffff8807f64b0000 ffff8807c4ab6888 >>> [ 1548.486161] Call Trace: >>> [ 1548.486166] [<ffffffff8134caec>] dump_stack+0x63/0x87 >>> [ 1548.486169] [<ffffffff8108c351>] __warn+0xd1/0xf0 >>> [ 1548.486172] [<ffffffff8108c48d>] warn_slowpath_null+0x1d/0x20 >>> [ 1548.486175] [<ffffffffa097d805>] r5l_do_reclaim+0x415/0x430 [raid456] >>> [ 1548.486178] [<ffffffff811f9c50>] ? kfree+0x120/0x170 >>> [ 1548.486181] [<ffffffffa097e33b>] r5l_quiesce+0x7b/0xa0 [raid456] >>> [ 1548.486183] [<ffffffffa0970da0>] raid5_quiesce+0x50/0x2a0 [raid456] >>> [ 1548.486186] [<ffffffff810d0250>] ? prepare_to_wait_event+0xf0/0xf0 >>> [ 1548.486189] [<ffffffff8156d00e>] md_do_sync+0xe7e/0xf60 >>> [ 1548.486192] [<ffffffff810c4370>] ? enqueue_entity+0x2a0/0xcc0 >>> [ 1548.486194] [<ffffffff810c201d>] ? update_curr+0xed/0x180 >>> [ 1548.486196] [<ffffffff810c026e>] ? account_entity_dequeue+0xae/0xd0 >>> [ 1548.486197] [<ffffffff810c27f6>] ? dequeue_entity+0x266/0x980 >>> [ 1548.486200] [<ffffffff810b5ef5>] ? check_preempt_curr+0x75/0x90 >>> [ 1548.486202] [<ffffffff810b5f29>] ? ttwu_do_wakeup+0x19/0xe0 >>> [ 1548.486204] [<ffffffff810c2fce>] ? dequeue_task_fair+0xbe/0x830 >>> [ 1548.486206] [<ffffffff810c3995>] ? put_prev_entity+0x35/0x730 >>> [ 1548.486209] [<ffffffff8102c6d9>] ? __switch_to+0x219/0x5c0 >>> [ 1548.486211] [<ffffffff81098d43>] ? kernel_sigaction+0x43/0xe0 >>> [ 1548.486214] [<ffffffff81566aa6>] md_thread+0x136/0x150 >>> [ 1548.486216] [<ffffffff81566970>] ? find_pers+0x70/0x70 >>> [ 1548.486219] [<ffffffff810ab698>] kthread+0xd8/0xf0 >>> [ 1548.486221] [<ffffffff816e873f>] ret_from_fork+0x1f/0x40 >>> [ 1548.486223] [<ffffffff810ab5c0>] ? 
kthread_park+0x60/0x60 >>> [ 1548.486224] ---[ end trace 242e9516e36acb1e ]--- >>> [ 1548.543855] md: recovery of RAID array md0 >>> [ 1548.543857] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. >>> [ 1548.543859] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. >>> [ 1548.543864] md: using 128k window, over a total of 523264k. >>> [ 1548.543868] md: md0: recovery done. >>> [ 1548.574961] RAID conf printout: >>> [ 1548.574964] --- level:6 rd:4 wd:4 >>> [ 1548.574966] disk 0, o:1, dev:sdb1 >>> [ 1548.574967] disk 1, o:1, dev:sdd1 >>> [ 1548.574968] disk 2, o:1, dev:sde1 >>> [ 1548.574969] disk 3, o:1, dev:sdf1 >>> >>> Steps I used: >>> >>> mdadm --create --run /dev/md0 --level 4 --metadata 1.2 --raid-devices 4 >>> /dev/sdb1 /dev/sd[d-f]1 --write-journal /dev/sdc1 --bitmap=internal >>> --bitmap-chunk=64M --chunk 512 >>> mdadm --wait /dev/md0 >>> mkfs.ext4 /dev/md0 >>> mdadm /dev/md0 -f /dev/sdb1 >>> mdadm /dev/md0 -r /dev/sdb1 >>> mdadm /dev/md0 -a /dev/sdb1 >> >> I tried above cmds in my VM machine (with set level as 6), but can't >> see the calltrace with below versions. 
>>
>> linux135:~ # uname -r
>> 4.4.13-46-default
>> linux135:~ # mdadm --version
>> mdadm - v3.4-47-g8718fc3 - 11th May 2016
>>
>> Thanks,
>> Guoqing
>>
> Could you try with 4.7.0, it's very easy to reproduce on my side
> below is my env:
> #lsblk /dev/sd[b-f]
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sdb 8:16 0 931.5G 0 disk
> └─sdb1 8:17 0 10G 0 part
>   └─md0 9:0 0 30G 0 raid4
> sdc 8:32 0 186.3G 0 disk
> └─sdc1 8:33 0 10G 0 part
>   └─md0 9:0 0 30G 0 raid4
> sdd 8:48 0 111.8G 0 disk
> └─sdd1 8:49 0 10G 0 part
>   └─md0 9:0 0 30G 0 raid4
> sde 8:64 0 931.5G 0 disk
> └─sde1 8:65 0 10G 0 part
>   └─md0 9:0 0 30G 0 raid4
> sdf 8:80 0 931.5G 0 disk
> └─sdf1 8:81 0 10G 0 part
>   └─md0 9:0 0 30G 0 raid4
> # uname -r
> 4.7.0
> # mdadm --version
> mdadm - v3.4-63-g52209d6 - 21st July 2016

I tried with the latest Linus tree (731c7d3a205ba89b475b2aa71b5f13dd6ae3de56), but still can't reproduce the warning.

linux241:~ # uname -r
4.7.0-11-default+
linux241:~ # mdadm --version
mdadm - v3.4-41-gf96b130 - 30th March 2016

The biggest difference is that I am testing in a VM while you were testing on real hardware, though I am not sure whether the hardware is the cause. Do you always see the "couldn't update array info" message during the test? If so, please try to make it disappear first.

Thanks,
Guoqing
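As an aside on the "md: couldn't update array info. -22" message discussed here: the trailing number is a negated errno value returned by the array-update ioctl path (likely SET_ARRAY_INFO, though that is an inference, not stated in the thread). A generic errno lookup, added here for illustration, shows what -22 means:

```shell
# Decode errno 22 (the absolute value of the -22 in the md log line).
# Uses Python's errno/os modules as a portable errno table.
python3 -c 'import errno, os; print(errno.errorcode[22], "-", os.strerror(22))'
# On Linux this prints: EINVAL - Invalid argument
```

So md rejected the array-info update as an invalid argument; this is why Guoqing suggests eliminating the message before comparing runs.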
* Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
  2016-08-02 8:36 ` Guoqing Jiang
@ 2016-08-02 13:31 ` yizhan
  2016-08-03 2:46 ` Guoqing Jiang
  0 siblings, 1 reply; 7+ messages in thread
From: yizhan @ 2016-08-02 13:31 UTC (permalink / raw)
To: Guoqing Jiang, linux-raid

> I tried with latest Linus's tree
> (731c7d3a205ba89b475b2aa71b5f13dd6ae3de56), but still can't find the
> warning.
>
> linux241:~ # uname -r
> 4.7.0-11-default+
> linux241:~ # mdadm --version
> mdadm - v3.4-41-gf96b130 - 30th March 2016
>
> The biggest difference is that I am using VM machine while you were
> testing with real hardware, though I am not
> sure it was caused by hardware or not. Do you always see the "couldn't
> update array info" info during test? If so,
> pls try to make it disappear.
>
> Thanks,
> Guoqing

Maybe it's environment-related; I have tried another PC with the same kernel/mdadm versions and cannot reproduce it there. Below is a log of one reproduction that did not show "couldn't update array info":

[ 504.650453] md: bind<sdb1> [ 504.650539] md: bind<sdd1> [ 504.650611] md: bind<sde1> [ 504.651137] md: bind<sdc1> [ 504.651211] md: bind<sdf1> [ 504.655535] md/raid:md0: device sde1 operational as raid disk 2 [ 504.655537] md/raid:md0: device sdd1 operational as raid disk 1 [ 504.655538] md/raid:md0: device sdb1 operational as raid disk 0 [ 504.656040] md/raid:md0: allocated 4374kB [ 504.656069] md/raid:md0: raid level 4 active with 3 out of 4 devices, algorithm 0 [ 504.656071] RAID conf printout: [ 504.656085] --- level:4 rd:4 wd:3 [ 504.656086] disk 0, o:1, dev:sdb1 [ 504.656087] disk 1, o:1, dev:sdd1 [ 504.656088] disk 2, o:1, dev:sde1 [ 504.656107] md/raid:md0: using device sdc1 as journal [ 504.658052] created bitmap (1 pages) for device md0 [ 504.658124] md0: bitmap initialized from disk: read 1 pages, set 16 of 16 bits [ 504.667007] md0: detected capacity change from 0 to 3221225472 [ 504.667032] RAID conf printout: [ 504.667035]
--- level:4 rd:4 wd:3 [ 504.667037] disk 0, o:1, dev:sdb1 [ 504.667039] disk 1, o:1, dev:sdd1 [ 504.667041] disk 2, o:1, dev:sde1 [ 504.667042] disk 3, o:1, dev:sdf1 [ 504.667156] md: recovery of RAID array md0 [ 504.667159] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. [ 504.667160] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. [ 504.667165] md: using 128k window, over a total of 1048576k. [ 516.581223] md: md0: recovery done. [ 516.678940] RAID conf printout: [ 516.678943] --- level:4 rd:4 wd:4 [ 516.678945] disk 0, o:1, dev:sdb1 [ 516.678946] disk 1, o:1, dev:sdd1 [ 516.678947] disk 2, o:1, dev:sde1 [ 516.678948] disk 3, o:1, dev:sdf1 [ 517.694241] md/raid:md0: Disk failure on sdb1, disabling device. md/raid:md0: Operation continuing on 3 devices. [ 517.734547] RAID conf printout: [ 517.734549] --- level:4 rd:4 wd:3 [ 517.734551] disk 0, o:0, dev:sdb1 [ 517.734552] disk 1, o:1, dev:sdd1 [ 517.734553] disk 2, o:1, dev:sde1 [ 517.734554] disk 3, o:1, dev:sdf1 [ 517.737295] RAID conf printout: [ 517.737298] --- level:4 rd:4 wd:3 [ 517.737299] disk 1, o:1, dev:sdd1 [ 517.737300] disk 2, o:1, dev:sde1 [ 517.737301] disk 3, o:1, dev:sdf1 [ 522.695241] md: unbind<sdb1> [ 522.698393] md: export_rdev(sdb1) [ 523.277566] md: bind<sdb1> [ 523.333573] RAID conf printout: [ 523.333576] --- level:4 rd:4 wd:3 [ 523.333577] disk 0, o:1, dev:sdb1 [ 523.333578] disk 1, o:1, dev:sdd1 [ 523.333579] disk 2, o:1, dev:sde1 [ 523.333580] disk 3, o:1, dev:sdf1 [ 523.379453] ------------[ cut here ]------------ [ 523.379462] WARNING: CPU: 0 PID: 5186 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] [ 523.379463] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ipt_REJECT nf_reject_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sb_edac edac_core snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek intel_powerclamp coretemp snd_hda_codec_generic snd_hda_intel kvm_intel snd_hda_codec kvm snd_hda_core snd_hwdep snd_seq snd_seq_device [ 523.379488] irqbypass snd_pcm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd snd_timer iTCO_wdt hp_wmi snd sparse_keymap ioatdma rfkill iTCO_vendor_support sg pcspkr shpchp soundcore mei_me i2c_i801 dca mei lpc_ich mfd_core tpm_infineon nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm isci drm libsas e1000e ata_generic ahci pata_acpi scsi_transport_sas libahci libata firewire_ohci firewire_core crc32c_intel serio_raw ptp pps_core crc_itu_t i2c_core wmi fjes dm_mirror dm_region_hash dm_log dm_mod [ 523.379520] CPU: 0 PID: 5186 Comm: md0_resync Tainted: G W 4.7.0 #5 [ 523.379521] Hardware name: Hewlett-Packard HP Z620 Workstation/158A, BIOS J61 v03.69 03/25/2014 [ 523.379522] 0000000000000286 00000000277f1813 ffff880801c5fb40 ffffffff8134cd0c [ 523.379524] 0000000000000000 0000000000000000 ffff880801c5fb80 ffffffff8108c351 [ 523.379526] 000002d8c7a96d80 0000000023f0d6c6 ffff8800c6dc8000 ffff8807f334be88 [ 523.379528] Call Trace: [ 523.379533] [<ffffffff8134cd0c>] dump_stack+0x63/0x87 [ 523.379537] [<ffffffff8108c351>] __warn+0xd1/0xf0 [ 523.379539] [<ffffffff8108c48d>] warn_slowpath_null+0x1d/0x20 [ 523.379542] [<ffffffffa0995805>] r5l_do_reclaim+0x415/0x430 [raid456] [ 523.379546] [<ffffffff811f9d00>] ? 
kfree+0x120/0x170 [ 523.379548] [<ffffffffa099633b>] r5l_quiesce+0x7b/0xa0 [raid456] [ 523.379551] [<ffffffffa0988da0>] raid5_quiesce+0x50/0x2a0 [raid456] [ 523.379553] [<ffffffff810d0240>] ? prepare_to_wait_event+0xf0/0xf0 [ 523.379556] [<ffffffff8156d32e>] md_do_sync+0xe7e/0xf60 [ 523.379560] [<ffffffff81036bf9>] ? sched_clock+0x9/0x10 [ 523.379562] [<ffffffff810bc4b2>] ? sched_clock_cpu+0x72/0xa0 [ 523.379564] [<ffffffff810c3985>] ? put_prev_entity+0x35/0x730 [ 523.379567] [<ffffffff8102c6d9>] ? __switch_to+0x219/0x5c0 [ 523.379570] [<ffffffff81098d43>] ? kernel_sigaction+0x43/0xe0 [ 523.379572] [<ffffffff81566dc6>] md_thread+0x136/0x150 [ 523.379575] [<ffffffff81566c90>] ? find_pers+0x70/0x70 [ 523.379577] [<ffffffff810ab678>] kthread+0xd8/0xf0 [ 523.379580] [<ffffffff816e88bf>] ret_from_fork+0x1f/0x40 [ 523.379582] [<ffffffff810ab5a0>] ? kthread_park+0x60/0x60 [ 523.379583] ---[ end trace 76fe4588f24ee7b5 ]--- [ 523.429860] md: recovery of RAID array md0 [ 523.429862] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. [ 523.429864] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. [ 523.429876] md: using 128k window, over a total of 1048576k. [ 523.429882] md: md0: recovery interrupted. [ 523.501621] md: md0 still in use. [ 523.501629] RAID conf printout: [ 523.501632] --- level:4 rd:4 wd:3 [ 523.501634] disk 0, o:1, dev:sdb1 [ 523.501635] disk 1, o:1, dev:sdd1 [ 523.501636] disk 2, o:1, dev:sde1 [ 523.501637] disk 3, o:1, dev:sdf1 [ 523.501638] RAID conf printout: [ 523.501638] --- level:4 rd:4 wd:3 [ 523.501639] disk 0, o:1, dev:sdb1 [ 523.501640] disk 1, o:1, dev:sdd1 [ 523.501641] disk 2, o:1, dev:sde1 [ 523.501642] disk 3, o:1, dev:sdf1 [ 523.501737] md: recovery of RAID array md0 [ 523.501740] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. [ 523.501741] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. [ 523.501745] md: using 128k window, over a total of 1048576k. 
[ 523.501746] md: resuming recovery of md0 from checkpoint. [ 523.501768] md: md0: recovery done. [ 523.534961] RAID conf printout: [ 523.534964] --- level:4 rd:4 wd:4 [ 523.534966] disk 0, o:1, dev:sdb1 [ 523.534967] disk 1, o:1, dev:sdd1 [ 523.534968] disk 2, o:1, dev:sde1 [ 523.534969] disk 3, o:1, dev:sdf1 [ 523.725191] md0: detected capacity change from 3221225472 to 0 [ 523.725282] md: md0 stopped. [ 523.725286] md: unbind<sdb1> [ 523.740217] md: export_rdev(sdb1) [ 523.740226] md: unbind<sdf1> [ 523.746268] md: export_rdev(sdf1) [ 523.746284] md: unbind<sdc1> [ 523.753263] md: export_rdev(sdc1) [ 523.753278] md: unbind<sde1> [ 523.759267] md: export_rdev(sde1) [ 523.759282] md: unbind<sdd1> [ 523.765299] md: export_rdev(sdd1) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456]
  2016-08-02 13:31 ` yizhan
@ 2016-08-03 2:46 ` Guoqing Jiang
  0 siblings, 0 replies; 7+ messages in thread
From: Guoqing Jiang @ 2016-08-03 2:46 UTC (permalink / raw)
To: yizhan, linux-raid

On 08/02/2016 09:31 PM, yizhan wrote:
>
>> I tried with latest Linus's tree
>> (731c7d3a205ba89b475b2aa71b5f13dd6ae3de56), but still can't find the
>> warning.
>>
>> linux241:~ # uname -r
>> 4.7.0-11-default+
>> linux241:~ # mdadm --version
>> mdadm - v3.4-41-gf96b130 - 30th March 2016
>>
>> The biggest difference is that I am using VM machine while you were
>> testing with real hardware, though I am not
>> sure it was caused by hardware or not. Do you always see the
>> "couldn't update array info" info during test? If so,
>> pls try to make it disappear.
>>
>> Thanks,
>> Guoqing
> May be it's env related, I have tried on another PC which with the
> same kernel/mdadm version and cannot reproduce it.

I guess you can compare the hardware of the two machines to find out why it can't be reproduced on the other PC.

Regards,
Guoqing
Thread overview: 7+ messages (newest: 2016-08-03 2:46 UTC)
[not found] <1382969797.9407100.1469619450040.JavaMail.zimbra@redhat.com>
2016-07-27 11:38 ` WARNING: CPU: 4 PID: 10512 at drivers/md/raid5-cache.c:728 r5l_do_reclaim+0x415/0x430 [raid456] Yi Zhang
2016-07-28 1:13   ` Shaohua Li
2016-08-01 9:01   ` Guoqing Jiang
[not found]       ` <579F0E52.8090704@suse.com>
2016-08-01 12:32    ` Yi Zhang
[not found] ` <7d0acb42-1209-0722-7740-0fa03d44106b@redhat.com>
2016-08-02 8:36   ` Guoqing Jiang
2016-08-02 13:31    ` yizhan
2016-08-03 2:46      ` Guoqing Jiang