From: "Zi Yan" <zi.yan@cs.rutgers.edu>
To: "Mel Gorman" <mgorman@techsingularity.net>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Andrea Arcangeli" <aarcange@redhat.com>,
"Rik van Riel" <riel@redhat.com>,
"Michal Hocko" <mhocko@kernel.org>,
"Vlastimil Babka" <vbabka@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm, numa: Fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa
Date: Mon, 10 Apr 2017 11:45:08 -0500
Message-ID: <84B5E286-4E2A-4DE0-8351-806D2102C399@cs.rutgers.edu>
In-Reply-To: <20170410094825.2yfo5zehn7pchg6a@techsingularity.net>
On 10 Apr 2017, at 4:48, Mel Gorman wrote:
> A user reported a bug against a distribution kernel while running
> a proprietary workload described as "memory intensive that is not
> swapping" that is expected to apply to mainline kernels. The workload
> is read/write/modifying ranges of memory and checking the contents. They
> reported that within a few hours a bad PMD would be reported, followed
> by memory corruption where the expected data was all zeros. A partial
> report of the bad PMD looked like
>
> [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
> [ 5195.341184] ------------[ cut here ]------------
> [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
> ....
> [ 5195.410033] Call Trace:
> [ 5195.410471] [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
> [ 5195.410716] [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
> [ 5195.410918] [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
> [ 5195.411200] [<ffffffff81098322>] task_work_run+0x72/0x90
> [ 5195.411246] [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
> [ 5195.411494] [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
> [ 5195.411739] [<ffffffff815e56af>] retint_user+0x8/0x10
>
> Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
> was a false detection. The bug does not trigger if automatic NUMA balancing
> or transparent huge pages is disabled.
>
> The bug is due to a race in change_pmd_range between a pmd_trans_huge and
> pmd_none_or_clear_bad check without any locks held. During the pmd_trans_huge
> check, a parallel protection update under lock can have cleared the PMD
> and filled it with a prot_numa entry between the transhuge check and the
> pmd_none_or_clear_bad check.
>
> While this could be fixed with heavy locking, it's only necessary to
> make a copy of the PMD on the stack during change_pmd_range and avoid
> races. A new helper is created for this as the check is quite subtle and the
> existing similar helper is not suitable. This passed 154 hours of testing
> (usually triggers between 20 minutes and 24 hours) without detecting bad
> PMDs or corruption. A basic test of an autonuma-intensive workload showed
> no significant change in behaviour.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> Cc: stable@vger.kernel.org
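For readers following along, the race described in the changelog above can
be modelled in userspace. What follows is only a toy sketch in C11, not the
patch itself: the toy_* names, the flag bits and the "between" callback
(which stands in for the parallel protection update) are all assumptions
made for illustration, and they do not match the real x86 PMD layout or the
kernel helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy "PMD"; NOT the real page table layout. */
#define TOY_PRESENT  (1ull << 0)
#define TOY_HUGE     (1ull << 7)
#define TOY_PROTNONE (1ull << 8)

typedef _Atomic(uint64_t) toy_pmd_t;

enum toy_verdict { TOY_SKIP, TOY_HUGE_PATH, TOY_WALK_PTES, TOY_FALSE_BAD };

static inline int toy_is_none(uint64_t v) { return v == 0; }
static inline int toy_is_huge(uint64_t v) { return (v & TOY_HUGE) != 0; }

/* "Bad" here means: non-empty and not a plain page table pointer. */
static inline int toy_is_bad(uint64_t v)
{
	uint64_t mask = TOY_PRESENT | TOY_HUGE | TOY_PROTNONE;

	return !toy_is_none(v) && (v & mask) != TOY_PRESENT;
}

/* Stand-in for a parallel updater installing a huge prot_numa-like entry. */
static inline void toy_parallel_update(toy_pmd_t *pmd)
{
	assert(pmd != NULL);
	atomic_store(pmd, TOY_HUGE | TOY_PROTNONE);
}

/* Racy variant: the huge check and the none/bad check load the entry twice. */
static inline enum toy_verdict
toy_racy_check(toy_pmd_t *pmd, void (*between)(toy_pmd_t *))
{
	uint64_t first = atomic_load(pmd);

	if (toy_is_huge(first))
		return TOY_HUGE_PATH;
	if (between)
		between(pmd);		/* entry changes between the two loads */

	uint64_t second = atomic_load(pmd);

	if (toy_is_none(second))
		return TOY_SKIP;
	if (toy_is_bad(second))
		return TOY_FALSE_BAD;	/* valid huge entry misread as bad */
	return TOY_WALK_PTES;
}

/* Snapshot variant: load once onto the stack, run every check on the copy. */
static inline enum toy_verdict
toy_snapshot_check(toy_pmd_t *pmd, void (*between)(toy_pmd_t *))
{
	uint64_t snap = atomic_load(pmd);

	if (between)
		between(pmd);		/* later changes cannot affect snap */
	if (toy_is_none(snap))
		return TOY_SKIP;
	if (toy_is_huge(snap))
		return TOY_HUGE_PATH;
	if (toy_is_bad(snap))
		return TOY_FALSE_BAD;
	return TOY_WALK_PTES;
}
```

Starting from a cleared entry with toy_parallel_update as the callback,
toy_racy_check() returns TOY_FALSE_BAD while toy_snapshot_check() returns
TOY_SKIP, because every test in the second variant sees the same value.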
Does this patch fix the same problem fixed by Kirill's patch here?
https://lkml.org/lkml/2017/3/2/347
--
Best Regards
Yan Zi