Date: Thu, 22 Oct 2020 22:02:14 -0700
From: Sean Christopherson
To: Linus Torvalds
Cc: Daniel Díaz, Naresh Kamboju, Stephen Rothwell, "Matthew Wilcox (Oracle)", zenglg.jy@cn.fujitsu.com, "Peter Zijlstra (Intel)", Viresh Kumar, X86 ML, open list, lkft-triage@lists.linaro.org, "Eric W. Biederman", linux-mm, linux-m68k, Linux-Next Mailing List, Thomas Gleixner, kasan-dev, Dmitry Vyukov, Geert Uytterhoeven, Christian Brauner, Ingo Molnar, LTP List, Al Viro
Subject: Re: [LTP] mmstress[1309]: segfault at 7f3d71a36ee8 ip 00007f3d77132bdf sp 00007f3d71a36ee8 error 4 in libc-2.27.so[7f3d77058000+1aa000]
Message-ID: <20201023050214.GG23681@linux.intel.com>

On Thu, Oct 22, 2020 at 08:05:05PM -0700, Linus Torvalds wrote:
> On Thu, Oct 22, 2020 at 6:36 PM Daniel Díaz wrote:
> >
> > The kernel Naresh originally referred to is here:
> > https://builds.tuxbuild.com/SCI7Xyjb7V2NbfQ2lbKBZw/
>
> Thanks.
>
> And when I started looking at it, I realized that my original idea
> ("just look for __put_user_nocheck_X calls, there aren't so many of
> those") was garbage, and that I was just being stupid.
>
> Yes, the commit that broke was about __put_user(), but in order to not
> duplicate all the code, it re-used the regular put_user()
> infrastructure, and so all the normal put_user() calls are potential
> problem spots too if this is about the compiler interaction with KASAN
> and the asm changes.
>
> So it's not just a couple of special cases to look at, it's all the
> normal cases too.
>
> Ok, back to the drawing board, but I think reverting it is probably
> the right thing to do if I can't think of something smart.
>
> That said, since you see this on x86-64, where the whole ugly trick with that
>
>    register asm("%"_ASM_AX)
>
> is unnecessary (because the 8-byte case is still just a single
> register, no %eax:%edx games needed), it would be interesting to hear
> if the attached patch fixes it. That would confirm that the problem
> really is due to some register allocation issue interaction (or,
> alternatively, it would tell me that there's something else going on).

I haven't reproduced the crash, but I did find a smoking gun that confirms the
"register shenanigans are evil shenanigans" theory.  I ran into a similar thing
recently where a seemingly innocuous line of code after loading a value into a
register variable wreaked havoc because it clobbered the input register.

This put_user() in schedule_tail():

        if (current->set_child_tid)
                put_user(task_pid_vnr(current), current->set_child_tid);

generates the following assembly with KASAN out-of-line:

   0xffffffff810dccc9 <+73>:    xor    %edx,%edx
   0xffffffff810dcccb <+75>:    xor    %esi,%esi
   0xffffffff810dcccd <+77>:    mov    %rbp,%rdi
   0xffffffff810dccd0 <+80>:    callq  0xffffffff810bf5e0 <__task_pid_nr_ns>
   0xffffffff810dccd5 <+85>:    mov    %r12,%rdi
   0xffffffff810dccd8 <+88>:    callq  0xffffffff81388c60 <__asan_load8>
   0xffffffff810dccdd <+93>:    mov    0x590(%rbp),%rcx
   0xffffffff810dcce4 <+100>:   callq  0xffffffff817708a0 <__put_user_4>
   0xffffffff810dcce9 <+105>:   pop    %rbx
   0xffffffff810dccea <+106>:   pop    %rbp
   0xffffffff810dcceb <+107>:   pop    %r12

__task_pid_nr_ns() returns the pid in %rax, which gets clobbered by
__asan_load8()'s check on current for the current->set_child_tid dereference.
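
To make the hazard concrete, here is a minimal userspace sketch of one way
to sidestep this class of problem: evaluate both macro arguments into
ordinary temporaries first, and only then pin the value into the register
variable, immediately before the asm.  This is not the kernel's put_user()
and not necessarily how it should be fixed there; PUT_VIA_AX(), fake_pid(),
checked_slot() and demo_put() are hypothetical names made up for the
illustration, and the order in which the two arguments are evaluated is
just a choice made for this sketch.  The asm path assumes GCC or Clang on
x86-64 with the default AT&T syntax; elsewhere it falls back to a plain store.

```c
#include <assert.h>

/* Hypothetical stand-in for task_pid_vnr(current): returns a value in %eax. */
static int fake_pid(void)
{
    return 42;
}

static int slot;

/*
 * Hypothetical stand-in for a pointer expression that involves a call
 * (for example an instrumented load).  Its return value travels in %rax,
 * so it would overwrite anything already pinned to that register.
 */
static int *checked_slot(void)
{
    return &slot;
}

#if defined(__GNUC__) && defined(__x86_64__)
/*
 * Evaluate (ptr) and (x) into plain temporaries BEFORE the register
 * variable is assigned, so no call can sit between the pin and the asm.
 */
#define PUT_VIA_AX(x, ptr)                                        \
    do {                                                          \
        int *ptr_tmp_ = (ptr);  /* may call functions */          \
        int val_tmp_ = (x);     /* may call functions */          \
        register int val_ax_ __asm__("eax") = val_tmp_;           \
        __asm__ volatile("movl %1, %0"                            \
                         : "=m"(*ptr_tmp_)                        \
                         : "r"(val_ax_));                         \
    } while (0)
#else
/* Portable fallback so the sketch still builds on other targets. */
#define PUT_VIA_AX(x, ptr)                                        \
    do {                                                          \
        int *ptr_tmp_ = (ptr);                                    \
        *ptr_tmp_ = (x);                                          \
    } while (0)
#endif

/* Store fake_pid() through checked_slot() and report what landed there. */
static int demo_put(void)
{
    slot = 0;
    PUT_VIA_AX(fake_pid(), checked_slot());
    return slot;
}
```

The point of the temporaries is that nothing between the assignment to
val_ax_ and the asm statement can emit a call, so the compiler has no
opportunity to schedule something like __asan_load8() after the value is
already sitting in %eax.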