From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>, Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, linux-mm <linux-mm@kvack.org>, "the arch/x86 maintainers" <x86@kernel.org>, linux-fsdevel <linux-fsdevel@vger.kernel.org>, Michal Hocko <mhocko@kernel.org>
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
Date: Mon, 15 Jan 2018 18:14:49 -0800
Message-ID: <CA+55aFxOn5n4O2JNaivi8rhDmeFhTQxEHD4xE33J9xOrFu=7kQ@mail.gmail.com>
In-Reply-To: <201801160115.w0G1FOIG057203@www262.sakura.ne.jp>

On Mon, Jan 15, 2018 at 5:15 PM, Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> I can't reproduce this with CONFIG_FLATMEM=y . But I'm not sure whether
> we are hitting a bug in CONFIG_SPARSEMEM=y code, for the bug is highly
> timing dependent.

Hmm. Maybe. But sparsemem really also generates *much* more complex code, particularly for the pfn_to_page() case. It also has much less testing. For example, on x86-64 we do use sparsemem, but we use the VMEMMAP version of sparsemem: the version that does *not* play really odd and complex games with that whole pfn_to_page().

I've always felt like sparsemem was really damn complicated. The whole "section_mem_map" encoding is really subtle and odd.

And considering that we're getting what appears to be an invalid page, in one of the more complicated sequences that very much does that whole pfn_to_page(), I really wonder.

I wonder if somebody could add some VM_BUG_ON() checks to the non-vmemmap case of sparsemem in include/asm-generic/memory_model.h.

Because this:

    #define __pfn_to_page(pfn)                                      \
    ({      unsigned long __pfn = (pfn);                            \
            struct mem_section *__sec = __pfn_to_section(__pfn);    \
            __section_mem_map_addr(__sec) + __pfn;                  \
    })

is really subtle, and if we have some case where we pass in an out-of-range pfn, or some case where we get the section wrong (because the pfn is between sections or whatever due to some subtle setup bug), things will really go sideways.

The reason I was hoping you could do this for FLATMEM is that it's much easier to verify the pfn range in that case. The sparsemem cases really make it much nastier.

That said, all of that code is really old. Most of it goes back to -05/06 or so. But since you seem to be able to reproduce at least back to 4.8, I guess this bug goes back years too.

But I'm adding Dave Hansen explicitly to the cc, in case he has any ideas. Not because I blame him, but he's touched the sparsemem code fairly recently, so maybe he'd have some idea on adding sanity checking to the sparsemem version of pfn_to_page().

> I don't know why but selecting CONFIG_FLATMEM=y seems to avoid a different bug
> where bootup of qemu randomly fails at

Hmm. That looks very different indeed. But if CONFIG_SPARSEMEM (presumably together with HIGHMEM) has some odd off-by-one corner case or similar, who knows *what* issues it could trigger.

Linus
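
For readers following the thread, here is a rough user-space sketch of the kind of sanity checking the message asks for. It is not kernel code and not a proposed patch. The geometry constants, the `present` flag, `section_init()` and `pfn_to_page_checked()` are all hypothetical names made up for illustration; only the idea of the "section_mem_map" encoding (store the map pointer biased by the section's first pfn, so that adding the full pfn lands on the right page) is taken from the macro quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, tiny geometry for illustration; not the kernel's values. */
#define SECTION_SHIFT      4u                        /* 16 pages per section */
#define PAGES_PER_SECTION  (1ul << SECTION_SHIFT)
#define NR_SECTIONS        4u
#define MAX_PFN            (NR_SECTIONS * PAGES_PER_SECTION)

struct page { unsigned long flags; };

/* Toy stand-in for struct mem_section (assumption: one flag, one coded map). */
struct mem_section {
    int       present;        /* nonzero once section_init() has run */
    uintptr_t coded_mem_map;  /* map address minus start_pfn * sizeof(struct page) */
};

static struct mem_section sections[NR_SECTIONS];

/*
 * Bias the stored address by the section's first pfn, so that a lookup can
 * add the full pfn rather than the offset within the section.
 */
static void section_init(unsigned long nr, struct page *mem_map)
{
    unsigned long start_pfn = nr << SECTION_SHIFT;

    sections[nr].coded_mem_map =
        (uintptr_t)mem_map - start_pfn * sizeof(struct page);
    sections[nr].present = 1;
}

/*
 * Checked lookup: returns NULL in the two cases the message worries about
 * (pfn out of range, or pfn falling in a section that was never set up),
 * where an unchecked lookup would silently compute a bogus pointer.
 */
static struct page *pfn_to_page_checked(unsigned long pfn)
{
    struct mem_section *sec;

    if (pfn >= MAX_PFN)
        return NULL;

    sec = &sections[pfn >> SECTION_SHIFT];
    if (!sec->present)
        return NULL;

    return (struct page *)(sec->coded_mem_map + pfn * sizeof(struct page));
}
```

In the kernel the equivalent checks would presumably be VM_BUG_ON() calls rather than NULL returns; the sketch returns NULL only so the failure cases can be observed from an ordinary program.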