From: Dmitry Vyukov
Date: Wed, 17 Mar 2021 09:50:20 +0100
Subject: Re: [syzbot] kernel panic: corrupted stack end in openat
To: Arnd Bergmann
Cc: Russell King - ARM Linux admin, syzbot, Linus Walleij, Linux ARM, Andrew Morton, LKML, Linux-MM, syzkaller-bugs, Uwe Kleine-König

On Wed, Mar 17, 2021 at 9:32 AM Arnd Bergmann wrote:
> > > > > wrote:
> > > > > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote:
> > > > > > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov wrote:
> > > > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6)
> > > > > > >
> > > > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's
> > > > > > > the closest I have installed, and I think the Debian and Ubuntu versions
> > > > > > > are generally quite close in case of gcc since they are maintained by
> > > > > > > the same packagers.
> > > > > >
> > > > > > ... which shouldn't be a problem - that's just over 1/4 of the stack
> > > > > > space. Could it be the syzbot's gcc is doing something weird and
> > > > > > inflating the stack frames?
> > > > >
> > > > > It's possible, I think that's really unlikely given that it's just Debian's
> > > > > gcc, which is as close to mainline as the version I was using.
> > > > >
> > > > > Uwe's DEBUG_STACKOVERFLOW patch from a while ago might
> > > > > help if this was the problem though:
> > > > > https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koenig@pengutronix.de/
> > > > >
> > > > > My best guess is something going wrong in the interrupt
> > > > > that triggered the preempt_schedule() which ended up calling
> > > > > task_stack_end_corrupted() in schedule_debug(), as you suggested
> > > > > earlier.
> > > >
> > > > FWIW I see slightly larger frames with the config:
> > > >
> > > > 073ab64 :
> > > > 8073ab64: e1a0c00d mov ip, sp
> > > > 8073ab68: e92ddff0 push {r4, r5, r6, r7, r8, r9, sl,
> > > > fp, ip, lr, pc}
> > > > 8073ab6c: e24cb004 sub fp, ip, #4
> > > > 8073ab70: e24ddfa7 sub sp, sp, #668 ; 0x29c
> > >
> > > Yes, this is the one that the compiler complained about when warning
> > > for stack over 600 bytes. It's not called in this call chain though.
> > >
> > > > page_alloc can also do reclaim, I had the impression that reclaim can
> > > > be quite heavy-weight in all respects.
> > >
> > > Yes, that is another possibility. What writable file systems or swap
> > > do you normally have mounted that it could be writing to, and on
> > > what storage device?
> >
> > The root fs is ext4 on virtio-blk.
> >
> > There are also several dozens of shrinkers that can be called during reclaim:
> > https://elixir.bootlin.com/linux/latest/C/ident/unregister_shrinker
>
> Right, unfortunately I don't see a smoking gun there either, unless you are
> also using NFS or devicemapper.
>
> Implementing VMAP_STACK as you suggested earlier is probably the
> best way to figure out if there is an actual overrun of the stack.
> Alternatively, adding support for GCC_PLUGIN_STACKLEAK might
> also help find out if we ever get close to the limit. This is probably
> less work, but it might not actually help in this case.

VMAP_STACK is quite intrusive as far as I understand. For KASAN I
considered a simpler option: have a debug config that allocates an
extra page after the stack and mprotect's it. It wastes a physical
page per task (fine for a debug config), but I would assume it should
be radically simpler to implement. In the end somebody implemented
proper VMAP_STACK support for KASAN, but I still think it may be a
reasonable compromise between time investment and value.
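
To make the guard-page idea concrete, here is a minimal userspace sketch in C. It is not kernel code and not taken from any patch in this thread: the names guarded_stack, guarded_stack_alloc(), guarded_stack_free() and write_faults() are hypothetical, and it uses the POSIX mmap()/mprotect() calls where a kernel debug config would have to use the kernel's own page allocation and page-table interfaces. It also assumes the guard page belongs at the low end of the region, since the stack grows downward on ARM. The sketch allocates a stack-sized region plus one extra page, makes that page inaccessible, and shows that a write running off the end of the stack faults immediately instead of silently corrupting a neighbour.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical type: a stack region with one inaccessible page below it. */
struct guarded_stack {
    unsigned char *base; /* start of the mapping, i.e. of the guard page */
    size_t page;         /* page size in bytes */
    size_t size;         /* usable stack bytes above the guard page */
};

/* Map size + one page, then revoke all access to the lowest page. */
static int guarded_stack_alloc(struct guarded_stack *gs, size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return -1;
    p = mmap(NULL, size + (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    /* Assumption: the stack grows down, so an overrun hits this page first. */
    if (mprotect(p, (size_t)page, PROT_NONE) != 0) {
        munmap(p, size + (size_t)page);
        return -1;
    }
    gs->base = p;
    gs->page = (size_t)page;
    gs->size = size;
    return 0;
}

static void guarded_stack_free(struct guarded_stack *gs)
{
    int rc = munmap(gs->base, gs->size + gs->page);

    assert(rc == 0);
    (void)rc;
}

static sigjmp_buf fault_env;

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1); /* unwind out of the faulting store */
}

/* Returns 1 if storing one byte at addr faults, 0 if the store succeeds. */
static int write_faults(volatile unsigned char *addr)
{
    struct sigaction sa, old_segv, old_bus;
    int faulted;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(fault_env, 1) == 0) {
        *addr = 0xAA;
        faulted = 0;
    } else {
        faulted = 1;
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}
```

A caller would allocate with guarded_stack_alloc(&gs, 8192) and treat gs.base + gs.page as the lowest valid stack address; the 8192 is only an illustrative size. The cost is the one described above: a full extra page per stack that is never usable, which is why this fits a debug configuration better than a production one.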