From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\))
Subject: Re: [PATCH BUGFIX 1/1] block, bfq: add requeue-request hook
From: Paolo Valente
In-Reply-To: <1517918234.25841.31.camel@gmx.de>
Date: Tue, 6 Feb 2018 13:26:11 +0100
Cc: Oleksandr Natalenko, Jens Axboe, linux-block, Linux Kernel Mailing List, Ulf Hansson, Mark Brown, Linus Walleij, 'Paolo Valente' via bfq-iosched, Alban Browaeys, ming.lei@redhat.com, ivan@ludios.org, 169364@studenti.unimore.it, Serena Ziviani
Message-Id:
References: <20180205190510.5499-1-paolo.valente@linaro.org> <20180205190510.5499-2-paolo.valente@linaro.org> <1517903761.9843.12.camel@gmx.de> <899B68CC-5955-4418-8BFF-DC55A743A61B@linaro.org> <1517918234.25841.31.camel@gmx.de>
To: Mike Galbraith
List-ID:

> On 6 Feb 2018, at 12:57, Mike Galbraith wrote:
>
> On Tue, 2018-02-06 at 10:38 +0100, Paolo Valente wrote:
>>
>> Hi Mike,
>> as you can imagine, I didn't get any failure in my pre-submission
>> tests on this patch. In addition, it is not that easy to link this
>> patch, which just adds some internal bfq housekeeping in case of a
>> requeue, with a corruption of external lists for general I/O
>> management.
>>
>> In this respect, as Oleksandr's comments point out, by switching from
>> cfq to bfq, you switch between much more than two schedulers. Anyway,
>> who knows ...
>
> Not me. Box seems to be fairly sure that it is bfq.

Yeah, sorry for the overly terse comment: what I meant is that cfq (and
deadline) are in legacy blk, while bfq is in blk-mq. So, to use bfq,
you must also switch from the legacy-blk I/O stack to the blk-mq I/O
stack.

> Twice again box
> went belly up on me in fairly short order with bfq, but seemed fine
> with deadline. I'm currently running deadline again, and box again
> seems solid, though I won't say _is_ solid until it's been happily
> trundling along with deadline for quite a bit longer.
>

As Oleksandr asked too, is it deadline or mq-deadline?

> I was ssh'd in during the last episode, got this out. I should be
> getting crash dumps, but seems kdump is only working intermittently
> atm. I did get one earlier, but 3 of 4 times not. Hohum.
>
> [ 484.179292] BUG: unable to handle kernel paging request at ffffffffa0817000
> [ 484.179436] IP: __trace_note_message+0x1f/0xd0
> [ 484.179576] PGD 1e0c067 P4D 1e0c067 PUD 1e0d063 PMD 3faff2067 PTE 0
> [ 484.179719] Oops: 0000 [#1] SMP PTI
> [ 484.179861] Dumping ftrace buffer:
> [ 484.180011] (ftrace buffer empty)
> [ 484.180138] Modules linked in: fuse(E) ebtable_filter(E) ebtables(E) af_packet(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) ipt_REJECT(E) xt_tcpudp(E) iptable_filter(E) ip6table_mangle(E) nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) ip_tables(E) xt_conntrack(E) nf_conntrack(E) ip6table_filter(E) ip6_tables(E) x_tables(E) nls_iso8859_1(E) nls_cp437(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) snd_hda_codec_hdmi(E) coretemp(E) kvm_intel(E) snd_hda_codec_realtek(E) kvm(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) sr_mod(E) snd_hwdep(E) cdrom(E) joydev(E) snd_hda_core(E) snd_pcm(E) snd_timer(E) irqbypass(E) snd(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) r8169(E)
> [ 484.180740] iTCO_wdt(E) ghash_clmulni_intel(E) mii(E) iTCO_vendor_support(E) pcbc(E) aesni_intel(E) soundcore(E) aes_x86_64(E) shpchp(E) crypto_simd(E) lpc_ich(E) glue_helper(E) i2c_i801(E) mei_me(E) mfd_core(E) mei(E) cryptd(E) intel_smartconnect(E) pcspkr(E) fan(E) thermal(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) hid_logitech_hidpp(E) hid_logitech_dj(E) uas(E) usb_storage(E) hid_generic(E) usbhid(E) nouveau(E) wmi(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) ahci(E) xhci_pci(E) ehci_pci(E) libahci(E) ttm(E) ehci_hcd(E) xhci_hcd(E) libata(E) drm(E) usbcore(E) video(E) button(E) sd_mod(E) vfat(E) fat(E) virtio_blk(E) virtio_mmio(E) virtio_pci(E) virtio_ring(E) virtio(E) ext4(E) crc16(E) mbcache(E) jbd2(E) loop(E) sg(E) dm_multipath(E)
> [ 484.181421] dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) autofs4(E)
> [ 484.181583] CPU: 3 PID: 500 Comm: kworker/3:1H Tainted: G E 4.15.0.ge237f98-master #609
> [ 484.181746] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [ 484.181910] Workqueue: kblockd blk_mq_requeue_work
> [ 484.182076] RIP: 0010:__trace_note_message+0x1f/0xd0
> [ 484.182250] RSP: 0018:ffff8803f45bfc20 EFLAGS: 00010282
> [ 484.182436] RAX: 0000000000000000 RBX: ffffffffa0817000 RCX: 00000000ffff8803
> [ 484.182622] RDX: ffffffff81bf514d RSI: 0000000000000000 RDI: ffffffffa0817000
> [ 484.182810] RBP: ffff8803f45bfc80 R08: 0000000000000041 R09: ffff8803f69cc5d0
> [ 484.182998] R10: ffff8803f80b47d0 R11: 0000000000001000 R12: ffff8803f45e8000
> [ 484.183185] R13: 000000000000000d R14: 0000000000000000 R15: ffff8803fba112c0
> [ 484.183372] FS: 0000000000000000(0000) GS:ffff88041ecc0000(0000) knlGS:0000000000000000
> [ 484.183561] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 484.183747] CR2: ffffffffa0817000 CR3: 0000000001e0a006 CR4: 00000000001606e0
> [ 484.183934] Call Trace:
> [ 484.184122] bfq_put_queue+0xd3/0xe0
> [ 484.184305] bfq_finish_requeue_request+0x72/0x350
> [ 484.184493] __blk_mq_requeue_request+0x8f/0x120
> [ 484.184678] blk_mq_dispatch_rq_list+0x342/0x550
> [ 484.184866] ? kyber_dispatch_request+0xd0/0xd0
> [ 484.185053] blk_mq_sched_dispatch_requests+0xf7/0x180
> [ 484.185238] __blk_mq_run_hw_queue+0x58/0xd0
> [ 484.185429] __blk_mq_delay_run_hw_queue+0x99/0xa0
> [ 484.185614] blk_mq_run_hw_queue+0x54/0xf0
> [ 484.185805] blk_mq_run_hw_queues+0x4b/0x60
> [ 484.185994] blk_mq_requeue_work+0x13a/0x150
> [ 484.186192] process_one_work+0x147/0x350
> [ 484.186383] worker_thread+0x47/0x3e0
> [ 484.186572] kthread+0xf8/0x130
> [ 484.186760] ? rescuer_thread+0x360/0x360
> [ 484.186948] ? kthread_stop+0x120/0x120
> [ 484.187137] ret_from_fork+0x35/0x40
> [ 484.187321] Code: ff 48 89 44 24 10 e9 58 fd ff ff 90 55 48 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 48 48 89 4c 24 30 4c 89 44 24 38 4c 89 4c 24 40 <83> 3f 02 0f 85 87 00 00 00 f6 43 21 04 75 0b 48 83 c4 48 5b 41
> [ 484.187525] RIP: __trace_note_message+0x1f/0xd0 RSP: ffff8803f45bfc20
> [ 484.187727] CR2: ffffffffa0817000

OK, right in the middle of bfq this time ...

Was this the first OOPS in your kernel log?
Thanks,
Paolo
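
A note on the deadline versus mq-deadline question above: the scheduler in use for a block device can be read from sysfs at /sys/block/<dev>/queue/scheduler, where the entry in square brackets is the active one. The following is only a minimal Python sketch of that check, not something taken from this thread. The helper names active_scheduler and list_schedulers are hypothetical, and the sample strings in the comments are illustrative, not output from the box in this report.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def active_scheduler(listing: str) -> Optional[str]:
    """Return the bracketed entry of a queue/scheduler listing, if any.

    Illustrative inputs (assumed, not taken from the report):
      "noop [deadline] cfq"           -> legacy blk, deadline active
      "mq-deadline kyber [bfq] none"  -> blk-mq, bfq active
    """
    match = re.search(r"\[([^\]]+)\]", listing)
    return match.group(1) if match else None


def list_schedulers(sysfs_root: str = "/sys/block") -> Dict[str, Optional[str]]:
    """Map each block device name to its active scheduler (None if unknown)."""
    result: Dict[str, Optional[str]] = {}
    for path in sorted(Path(sysfs_root).glob("*/queue/scheduler")):
        device = path.parent.parent.name
        try:
            result[device] = active_scheduler(path.read_text())
        except OSError:
            # The device may have vanished or the file may be unreadable.
            result[device] = None
    return result


if __name__ == "__main__":
    for device, scheduler in list_schedulers().items():
        print(f"{device}: {scheduler}")
```

If the active entry reads mq-deadline, the device is on the blk-mq stack; if it reads deadline, it is on the legacy one, which is the distinction the question is after.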