All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
Date: Mon, 15 Jan 2018 18:14:49 -0800	[thread overview]
Message-ID: <CA+55aFxOn5n4O2JNaivi8rhDmeFhTQxEHD4xE33J9xOrFu=7kQ@mail.gmail.com> (raw)
In-Reply-To: <201801160115.w0G1FOIG057203@www262.sakura.ne.jp>

On Mon, Jan 15, 2018 at 5:15 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> I can't reproduce this with CONFIG_FLATMEM=y . But I'm not sure whether
> we are hitting a bug in CONFIG_SPARSEMEM=y code, for the bug is highly
> timing dependent.

Hmm. Maybe. But sparsemem really also generates *much* more complex
code particularly for the pfn_to_page() case.

It also has much less testing. For example, on x86-64 we do use
sparsemem, but we use the VMEMMAP version of sparsemem: the version
that does *not* play really odd and complex games with that whole
pfn_to_page().

I've always felt like sparsemem was really damn complicated.  The
whole "section_mem_map" encoding is really subtle and odd.

And considering that we're getting what appears to be a invalid page,
in one of the more complicated sequences that very much does that
whole pfn_to_page(), I really wonder.

I wonder if somebody could add some VM_BUG_ON() checks to the
non-vmemmap case of sparsemem in include/asm-generic/memory_model.h.

Because this:

  #define __pfn_to_page(pfn)                              \
  ({      unsigned long __pfn = (pfn);                    \
          struct mem_section *__sec = __pfn_to_section(__pfn);    \
          __section_mem_map_addr(__sec) + __pfn;          \
  })

is really subtle, and if we have some case where we pass in an
out-of-range pfn, or some case where we get the section wrong (because
the pfn is between sections or whatever due to some subtle setup bug),
things will really go sideways.

The reason I was hoping you could do this for FLATMEM is that it's
much easier to verify the pfn range in that case.  The sparsemem cases
really makes it much nastier.

That said, all of that code is really old. Most of it goes back to
-05/06 or so. But since you seem to be able to reproduce at least back
to 4.8, I guess this bug does back years too.

But I'm adding Dave Hansen explicitly to the cc, in case he has any
ideas. Not because I blame him, but he's touched the sparsemem code
fairly recently, so maybe he'd have some idea on adding sanity
checking to the sparsemem version of pfn_to_page().

> I dont know why but selecting CONFIG_FLATMEM=y seems to avoid a different bug
> where bootup of qemu randomly fails at

Hmm. That looks very different indeed. But if CONFIG_SPARSEMEM
(presumably together with HIGHMEM) has some odd off-by-one corner case
or similar, who knows *what* issues it could trigger.

                 Linus

WARNING: multiple messages have this Message-ID (diff)
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
Date: Mon, 15 Jan 2018 18:14:49 -0800	[thread overview]
Message-ID: <CA+55aFxOn5n4O2JNaivi8rhDmeFhTQxEHD4xE33J9xOrFu=7kQ@mail.gmail.com> (raw)
In-Reply-To: <201801160115.w0G1FOIG057203@www262.sakura.ne.jp>

On Mon, Jan 15, 2018 at 5:15 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> I can't reproduce this with CONFIG_FLATMEM=y . But I'm not sure whether
> we are hitting a bug in CONFIG_SPARSEMEM=y code, for the bug is highly
> timing dependent.

Hmm. Maybe. But sparsemem really also generates *much* more complex
code particularly for the pfn_to_page() case.

It also has much less testing. For example, on x86-64 we do use
sparsemem, but we use the VMEMMAP version of sparsemem: the version
that does *not* play really odd and complex games with that whole
pfn_to_page().

I've always felt like sparsemem was really damn complicated.  The
whole "section_mem_map" encoding is really subtle and odd.

And considering that we're getting what appears to be a invalid page,
in one of the more complicated sequences that very much does that
whole pfn_to_page(), I really wonder.

I wonder if somebody could add some VM_BUG_ON() checks to the
non-vmemmap case of sparsemem in include/asm-generic/memory_model.h.

Because this:

  #define __pfn_to_page(pfn)                              \
  ({      unsigned long __pfn = (pfn);                    \
          struct mem_section *__sec = __pfn_to_section(__pfn);    \
          __section_mem_map_addr(__sec) + __pfn;          \
  })

is really subtle, and if we have some case where we pass in an
out-of-range pfn, or some case where we get the section wrong (because
the pfn is between sections or whatever due to some subtle setup bug),
things will really go sideways.

The reason I was hoping you could do this for FLATMEM is that it's
much easier to verify the pfn range in that case.  The sparsemem cases
really makes it much nastier.

That said, all of that code is really old. Most of it goes back to
-05/06 or so. But since you seem to be able to reproduce at least back
to 4.8, I guess this bug does back years too.

But I'm adding Dave Hansen explicitly to the cc, in case he has any
ideas. Not because I blame him, but he's touched the sparsemem code
fairly recently, so maybe he'd have some idea on adding sanity
checking to the sparsemem version of pfn_to_page().

> I dont know why but selecting CONFIG_FLATMEM=y seems to avoid a different bug
> where bootup of qemu randomly fails at

Hmm. That looks very different indeed. But if CONFIG_SPARSEMEM
(presumably together with HIGHMEM) has some odd off-by-one corner case
or similar, who knows *what* issues it could trigger.

                 Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2018-01-16  2:14 UTC|newest]

Thread overview: 108+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-05 14:45 [x86? mm? fs? 4.15-rc6] Random oopses by simple write under memory pressure Tetsuo Handa
2018-01-09 10:39 ` [mm? 4.15-rc7] " Tetsuo Handa
2018-01-10 11:49   ` [mm? 4.15-rc7] Random oopses " Tetsuo Handa
2018-01-10 12:45     ` Michal Hocko
2018-01-10 13:37       ` Tetsuo Handa
2018-01-11 13:57         ` Michal Hocko
2018-01-11 14:11           ` Tetsuo Handa
2018-01-11 14:21             ` Michal Hocko
2018-01-11 14:37               ` Tetsuo Handa
2018-01-12  1:31               ` [mm " Tetsuo Handa
2018-01-12  1:42                 ` Linus Torvalds
2018-01-12 11:22                   ` Tetsuo Handa
2018-01-14 11:54                     ` Tetsuo Handa
2018-01-14 11:54                       ` Tetsuo Handa
2018-01-15 23:05                       ` Linus Torvalds
2018-01-15 23:05                         ` Linus Torvalds
2018-01-16  1:15                         ` [mm 4.15-rc8] " Tetsuo Handa
2018-01-16  1:15                           ` Tetsuo Handa
2018-01-16  2:14                           ` Linus Torvalds [this message]
2018-01-16  2:14                             ` Linus Torvalds
2018-01-16  8:06                             ` Dave Hansen
2018-01-16  8:06                               ` Dave Hansen
2018-01-16  8:37                               ` Ingo Molnar
2018-01-16  8:37                                 ` Ingo Molnar
2018-01-16 19:30                               ` Linus Torvalds
2018-01-16 19:30                                 ` Linus Torvalds
2018-01-16 17:33                             ` Tetsuo Handa
2018-01-16 17:33                               ` Tetsuo Handa
2018-01-16 19:34                               ` Linus Torvalds
2018-01-16 19:34                                 ` Linus Torvalds
2018-01-17 11:08                                 ` Tetsuo Handa
2018-01-17 11:08                                   ` Tetsuo Handa
2018-01-17 21:39                                   ` Linus Torvalds
2018-01-17 21:39                                     ` Linus Torvalds
2018-01-17 21:51                                     ` Linus Torvalds
2018-01-17 21:51                                       ` Linus Torvalds
2018-01-17 22:04                                       ` Dave Hansen
2018-01-17 22:04                                         ` Dave Hansen
2018-01-17 22:00                                     ` Dave Hansen
2018-01-17 22:00                                       ` Dave Hansen
2018-01-17 22:15                                       ` Linus Torvalds
2018-01-17 22:15                                         ` Linus Torvalds
2018-01-18  8:12                                   ` Tetsuo Handa
2018-01-18  8:12                                     ` Tetsuo Handa
2018-01-18 12:25                                     ` Kirill A. Shutemov
2018-01-18 12:25                                       ` Kirill A. Shutemov
2018-01-18 13:12                                       ` Kirill A. Shutemov
2018-01-18 13:12                                         ` Kirill A. Shutemov
2018-01-18 14:34                                         ` Kirill A. Shutemov
2018-01-18 14:34                                           ` Kirill A. Shutemov
2018-01-18 14:38                                         ` Dave Hansen
2018-01-18 14:38                                           ` Dave Hansen
2018-01-18 14:45                                           ` Kirill A. Shutemov
2018-01-18 14:45                                             ` Kirill A. Shutemov
2018-01-18 14:51                                             ` Dave Hansen
2018-01-18 14:51                                               ` Dave Hansen
2018-01-18 16:58                                           ` Linus Torvalds
2018-01-18 16:58                                             ` Linus Torvalds
2018-01-18 14:45                                       ` Dave Hansen
2018-01-18 14:45                                         ` Dave Hansen
2018-01-18 14:58                                         ` Andrea Arcangeli
2018-01-18 14:58                                           ` Andrea Arcangeli
2018-01-18 16:56                                           ` Kirill A. Shutemov
2018-01-18 16:56                                             ` Kirill A. Shutemov
2018-01-18 17:26                                             ` Luck, Tony
2018-01-18 17:26                                               ` Luck, Tony
2018-01-18 17:28                                               ` Linus Torvalds
2018-01-18 17:28                                                 ` Linus Torvalds
2018-01-18 17:26                                             ` Linus Torvalds
2018-01-18 17:26                                               ` Linus Torvalds
2018-01-18 23:49                                               ` Kirill A. Shutemov
2018-01-18 23:49                                                 ` Kirill A. Shutemov
2018-01-19 12:55                                                 ` Matthew Wilcox
2018-01-19 12:55                                                   ` Matthew Wilcox
2018-01-19 18:42                                                   ` Linus Torvalds
2018-01-19 18:42                                                     ` Linus Torvalds
2018-01-19 22:12                                                     ` Al Viro
2018-01-19 22:12                                                       ` Al Viro
2018-01-19 22:53                                                       ` Linus Torvalds
2018-01-19 22:53                                                         ` Linus Torvalds
2018-01-20  2:02                                                         ` Al Viro
2018-01-20  2:02                                                           ` Al Viro
2018-01-20  5:24                                                           ` Al Viro
2018-01-20  5:24                                                             ` Al Viro
2018-01-20  9:38                                                             ` Luc Van Oostenryck
2018-01-20  9:38                                                               ` Luc Van Oostenryck
2018-01-20  9:38                                                               ` Luc Van Oostenryck
2018-01-20 14:45                                                               ` Luc Van Oostenryck
2018-01-22 13:26                                                     ` Rasmus Villemoes
2018-01-22 19:58                                                       ` Linus Torvalds
2018-01-18 15:40                                         ` Kirill A. Shutemov
2018-01-18 15:40                                           ` Kirill A. Shutemov
2018-01-18 17:22                                           ` Michal Hocko
2018-01-18 17:22                                             ` Michal Hocko
2018-01-19 10:02                                             ` Kirill A. Shutemov
2018-01-19 10:02                                               ` Kirill A. Shutemov
2018-01-19 10:33                                               ` Michal Hocko
2018-01-19 10:33                                                 ` Michal Hocko
2018-01-19 11:49                                                 ` Kirill A. Shutemov
2018-01-19 11:49                                                   ` Kirill A. Shutemov
2018-01-19 12:07                                                   ` Michal Hocko
2018-01-19 12:07                                                     ` Michal Hocko
2018-01-19 12:30                                                     ` Kirill A. Shutemov
2018-01-19 12:30                                                       ` Kirill A. Shutemov
2018-01-19  2:01                                           ` Tetsuo Handa
2018-01-19  2:01                                             ` Tetsuo Handa
2018-01-11 18:11             ` [mm? 4.15-rc7] " Linus Torvalds
2018-01-11 20:59               ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFxOn5n4O2JNaivi8rhDmeFhTQxEHD4xE33J9xOrFu=7kQ@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.