From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-5.3 required=3.0 tests=BAYES_00,
    HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,
    SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no
    version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id E6916C2D0E4
    for ; Fri, 18 Dec 2020 22:14:17 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 9F61B23BA8
    for ; Fri, 18 Dec 2020 22:14:17 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1725988AbgLRWOR (ORCPT ); Fri, 18 Dec 2020 17:14:17 -0500
Received: from outpost1.zedat.fu-berlin.de ([130.133.4.66]:40867 "EHLO
    outpost1.zedat.fu-berlin.de" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1725945AbgLRWOR (ORCPT );
    Fri, 18 Dec 2020 17:14:17 -0500
Received: from inpost2.zedat.fu-berlin.de ([130.133.4.69])
    by outpost.zedat.fu-berlin.de (Exim 4.94)
    with esmtps (TLS1.2)
    tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    (envelope-from )
    id 1kqO0A-000nUy-NP; Fri, 18 Dec 2020 23:13:30 +0100
Received: from p5b13a238.dip0.t-ipconnect.de ([91.19.162.56] helo=[192.168.178.139])
    by inpost2.zedat.fu-berlin.de (Exim 4.94)
    with esmtpsa (TLS1.2)
    tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    (envelope-from )
    id 1kqO0A-003yzK-Ex; Fri, 18 Dec 2020 23:13:30 +0100
Subject: Re: [PATCH v2 05/15] ia64: convert to legacy_timer_tick
To: Arnd Bergmann
Cc: "linux-kernel@vger.kernel.org", Thomas Gleixner, Arnd Bergmann,
    Russell King, Tony Luck, Fenghua Yu, Greg Ungerer, Finn Thain,
    Philip Blundell, Joshua Thompson, Sam Creasey, "James E.J. Bottomley",
    Helge Deller, Daniel Lezcano, John Stultz, Stephen Boyd, Linus Walleij,
    linux-ia64@vger.kernel.org, Parisc List, linux-m68k, Linux ARM,
    Mike Rapoport, Anatoly Pugachev
References: <20201030151758.1241164-1-arnd@kernel.org>
    <20201030151758.1241164-6-arnd@kernel.org>
    <59efce0e-a28d-9424-82ca-fb7f3a1b9c29@physik.fu-berlin.de>
From: John Paul Adrian Glaubitz
Message-ID:
Date: Fri, 18 Dec 2020 23:13:29 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
    Thunderbird/78.5.1
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Original-Sender: glaubitz@physik.fu-berlin.de
X-Originating-IP: 91.19.162.56
Precedence: bulk
List-ID:
X-Mailing-List: linux-parisc@vger.kernel.org

Hi Arnd!

On 12/18/20 11:07 PM, Arnd Bergmann wrote:
> Sorry for causing this bug, and thank you for bisecting it
> down to my patch.
>
> Do you see any other strange behavior with that patch, or is
> this the only symptom you are aware of?

This seems to be the only issue I'm seeing so far. However, as I'm not
able to fully boot the system, I can't rule out that there is other
fallout once the system is running.

>> I'm seeing this backtrace now:
>>
>> [ 905.883273] usb 1-2: SerialNumber: A60020000001
>> [ 905.918170] sda: sda1 sda2 sda3
>> [ 905.920107] sd 0:1:0:0: [sda] Attached SCSI disk
>> [ 905.944102] usb-storage 1-2:1.0: USB Mass Storage device detected
>> [ 905.944102] scsi host1: usb-storage 1-2:1.0
>> [ 905.944102] usbcore: registered new interface driver usb-storage
>> [ 905.944117] usbcore: registered new interface driver uas
>
> The timestamps show that time is moving forward, which is at least
> something. Do you have a feeling for whether the timestamps are
> counting in (roughly) the correct speed, or is it going much faster
> or slower than it should?
>
> To clarify: the [905.944117] numbers are seconds/microseconds
> since boot, so message would be 906 seconds after the kernel
> started.

No, that would definitely be off. The machine hadn't been up and running
for 15 minutes; the issue showed up right after boot.

>> Begin: Loading essential drivers ... done.
> Begin: Running /scripts/init-premount ... done.
> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
>
> Ok, so it gets into user space. Is this initramfs or the actual read-only root?

This is using an initramfs.

>> [ 906.666923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1
>> [ 906.670923] hpsa 0000:05:00.0: device is ready.
>> [ 906.670923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1
>> done.
>> [ 906.722166] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [ 906.722166] rcu: 2-....: (3 ticks this GP) idle=fe6/1/0x4000000000000000 softirq=693/698 fqs=4
>> [ 906.722166] (detected by 0, t=6115 jiffies, g=465, q=80)
> This appears to be an 'rcu stall' warning, from print_cpu_stall_info(),
> indicating that timer ticks are missing.

OK.

>> [ 909.360108] INFO: task systemd-sysv-ge:200 blocked for more than 127 seconds.
>> [ 909.360108] Not tainted 5.10.0+ #130
>> [ 909.360108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 909.360108] task:systemd-sysv-ge state:D stack: 0 pid: 200 ppid: 189 flags:0x00000000
>> [ 909.364108]
>> [ 909.364108] Call Trace:
>> [ 909.364423] [] __schedule+0x890/0x21e0
>> [ 909.364423] sp=e0000100487d7b70 bsp=e0000100487d1748
>> [ 909.368423] [] schedule+0xa0/0x240
>> [ 909.368423] sp=e0000100487d7b90 bsp=e0000100487d16e0
>> [ 909.368558] [] io_schedule+0x70/0xa0
>> [ 909.368558] sp=e0000100487d7b90 bsp=e0000100487d16c0
>> [ 909.372290] [] bit_wait_io+0x20/0xe0
>> [ 909.372290] sp=e0000100487d7b90 bsp=e0000100487d1698
>> [ 909.374168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [ 909.376290] [] __wait_on_bit+0xc0/0x1c0
>> [ 909.376290] sp=e0000100487d7b90 bsp=e0000100487d1648
>> [ 909.374168] rcu: 3-....: (2 ticks this GP) idle=19e/1/0x4000000000000002 softirq=1581/1581 fqs=2
>> [ 909.374168] (detected by 0, t=5661 jiffies, g=1089, q=3)
>> [ 909.376290] [] out_of_line_wait_on_bit+0x120/0x140
>> [ 909.376290] sp=e0000100487d7b90 bsp=e0000100487d1610
>> [ 909.374168] Task dump for CPU 3:
>> [ 909.374168] task:khungtaskd state:R running task
>
> and this seems to be another instance of the same. I would assume that this
> is completely unrelated to the block driver and just happened to trigger during
> the same time the driver was doing something.
>
> Can you see in your full logs if the "Oops: timer tick before it's due" warning
> triggered at any point?

It's difficult to tell, to be honest. The messages above spam the kernel
log to the point that the buffer of the built-in serial console fills up,
so I'm not sure whether I've seen this message.

> I've attached a patch for a partial revert of my original change, this
> should still work with the final cleanup on top, but restore the loop
> plus the local_irq_enable()/local_irq_disable() that I dropped from
> the original code. Does this make a difference?

I'll give it a try and report back.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913