From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: torvalds@linux-foundation.org, kirill.shutemov@linux.intel.com,
	akpm@linux-foundation.org, hannes@cmpxchg.org,
	iamjoonsoo.kim@lge.com, mgorman@techsingularity.net,
	tony.luck@intel.com, vbabka@suse.cz, mhocko@kernel.org,
	aarcange@redhat.com, hillf.zj@alibaba-inc.com, hughd@google.com,
	oleg@redhat.com, peterz@infradead.org, riel@redhat.com,
	srikar@linux.vnet.ibm.com, vdavydov.dev@gmail.com,
	dave.hansen@linux.intel.com, mingo@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
Date: Thu, 18 Jan 2018 17:34:10 +0300
Message-ID: <20180118143410.sozfsbmb3liumn3x@node.shutemov.name>
In-Reply-To: <20180118131210.456oyh6fw4scwv53@node.shutemov.name>

On Thu, Jan 18, 2018 at 04:12:10PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jan 18, 2018 at 03:25:50PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Jan 18, 2018 at 05:12:45PM +0900, Tetsuo Handa wrote:
> > > Tetsuo Handa wrote:
> > > > OK. I missed the mark. I overlooked that 4.11 already has this problem.
> > > > 
> > > > I needed to bisect between 4.10 and 4.11, and I got plausible culprit.
> > > > 
> > > > I haven't completed bisecting between b4fb8f66f1ae2e16 and c470abd4fde40ea6, but
> > > > b4fb8f66f1ae2e16 ("mm, page_alloc: Add missing check for memory holes") and
> > > > 13ad59df67f19788 ("mm, page_alloc: avoid page_to_pfn() when merging buddies")
> > > > are talking about memory holes, which matches the situation that I'm trivially
> > > > hitting the bug if CONFIG_SPARSEMEM=y .
> > > > 
> > > > Thus, I call for an attention by speculative execution. ;-)
> > > 
> > > Speculative execution failed. I was confused by jiffies precision bug.
> > > The final culprit is c7ab0d2fdc840266 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()").
> > 
> > I think I've tracked it down. check_pte() in mm/page_vma_mapped.c doesn't
> > work as intended.
> > 
> > I've added instrumentation below to prove it.
> > 
> > The BUG() triggers with the following output:
> > 
> > [   10.084024] diff: -858690919
> > [   10.084258] hpage_nr_pages: 1
> > [   10.084386] check1: 0
> > [   10.084478] check2: 0
> > 
> > Basically, pte_page(*pvmw->pte) is below pvmw->page, but
> > (pte_page(*pvmw->pte) < pvmw->page) doesn't catch it.
> > 
> > Well, I can see how a C lawyer can argue that you can only compare pointers
> > of the same memory object which is not the case here. But this is kinda
> > insane.
> > 
> > Any suggestions how to rewrite it in a way that compiler would
> > understand?
> 
> The patch below makes the crash go away for me.
> 
> But this situation is scary. So we cannot compare arbitrary pointers in
> kernel?
> 
> Don't we rely on this for lock ordering in some cases? Like in
> mutex_lock_double()?
> 
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index d22b84310f6d..1f0f512fd127 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -51,6 +51,8 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  		WARN_ON_ONCE(1);
>  #endif
>  	} else {
> +		unsigned long ptr1, ptr2;
> +
>  		if (is_swap_pte(*pvmw->pte)) {
>  			swp_entry_t entry;
>  
> @@ -63,12 +65,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  		if (!pte_present(*pvmw->pte))
>  			return false;
>  
> -		/* THP can be referenced by any subpage */
> -		if (pte_page(*pvmw->pte) - pvmw->page >=
> -				hpage_nr_pages(pvmw->page)) {
> +		ptr1 = (unsigned long)pte_page(*pvmw->pte);
> +		ptr2 = (unsigned long)pvmw->page;
> +
> +		if (ptr1 < ptr2)
>  			return false;
> -		}
> -		if (pte_page(*pvmw->pte) < pvmw->page)
> +
> +		/* THP can be referenced by any subpage */
> +		if (ptr1 - ptr2 >= hpage_nr_pages(pvmw->page))

Arghhh.. It has to be

		if (ptr1 - ptr2 >= hpage_nr_pages(pvmw->page) * sizeof(*pvmw->page))

-- 
 Kirill A. Shutemov


