From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753462AbeAQVvz (ORCPT ); Wed, 17 Jan 2018 16:51:55 -0500
Received: from mail-it0-f68.google.com ([209.85.214.68]:44289 "EHLO
	mail-it0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752250AbeAQVvy (ORCPT ); Wed, 17 Jan 2018 16:51:54 -0500
X-Google-Smtp-Source: ACJfBouytHX0Jc0OiiWcwJZ24Gk9QaLnhear+JacvJigkLzixgKiYUQk+fesPETXBpXnz3OYCUaDwF9+qfb7PRbxZRw=
MIME-Version: 1.0
In-Reply-To:
References: <201801160115.w0G1FOIG057203@www262.sakura.ne.jp>
	<201801170233.JDG21842.OFOJMQSHtOFFLV@I-love.SAKURA.ne.jp>
	<201801172008.CHH39543.FFtMHOOVSQJLFO@I-love.SAKURA.ne.jp>
From: Linus Torvalds
Date: Wed, 17 Jan 2018 13:51:53 -0800
X-Google-Sender-Auth: rLz9KCBIcqN2WEYHCn04Nr3Nmww
Message-ID:
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
To: Tetsuo Handa
Cc: "Kirill A. Shutemov", Andrew Morton, Johannes Weiner, Joonsoo Kim,
	Mel Gorman, Tony Luck, Vlastimil Babka, Michal Hocko, Dave Hansen,
	Ingo Molnar, Linux Kernel Mailing List, linux-mm,
	"the arch/x86 maintainers"
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 17, 2018 at 1:39 PM, Linus Torvalds wrote:
>
> In fact, the whole
>
>     pfn_valid_within(buddy_pfn)
>
> test looks very odd. Maybe the pfn of the buddy is valid, but it's not
> in the same zone? Then we'd combine the two pages in two different
> zones into one combined page.

It might also be the same allocation zone, but if the pfn's are in
different sparsemem sections that would also be problematic.

But I hope/assume that all sparsemem sections are always aligned to
(PAGE_SIZE << MAXORDER).

In contrast, the ZONE_HIGHMEM limit really does seem to be
potentially not aligned to anything, ie

arch/x86/include/asm/pgtable_32_types.h:
    #define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)

which I have no idea what the alignment is, but VMALLOC_END at least
does not seem to have any MAXORDER alignment.

So it really does look like the zone for two page orders that would
otherwise be buddies might actually be different.

Interesting if this really is the case. Because afaik, if that
WARN_ON_ONCE actually triggers, it does seem like this bug could go
back pretty much forever.

In fact, it seems to be such a fundamental bug that I suspect I'm
entirely wrong, and full of shit.

So it's an interesting and not _obviously_ incorrect theory, but I
suspect I must be missing something.

              Linus
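
The scenario in this message (two blocks that would be buddies but sit
on opposite sides of a zone boundary) can be sketched in user-space C.
This is only an illustration under stated assumptions, not kernel code:
the XOR is meant to mirror how the buddy of a block is usually computed
from its pfn, while zone_of(), buddies_cross_zone(), the single
boundary pfn and MAX_ORDER_SKETCH are hypothetical names and values
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: orders 0..10 are valid, as with a MAX_ORDER of 11. */
#define MAX_ORDER_SKETCH 11

/* Buddy of the block starting at pfn: flip bit 'order' of the pfn. */
static inline unsigned long find_buddy_pfn(unsigned long pfn,
					   unsigned int order)
{
	assert(order < MAX_ORDER_SKETCH);
	return pfn ^ (1UL << order);
}

/*
 * Hypothetical two-zone layout: pfns below boundary_pfn belong to
 * zone 0, everything at or above it to zone 1.
 */
static inline int zone_of(unsigned long pfn, unsigned long boundary_pfn)
{
	return pfn >= boundary_pfn ? 1 : 0;
}

/* True if a block and its buddy start in different zones. */
static inline bool buddies_cross_zone(unsigned long pfn, unsigned int order,
				      unsigned long boundary_pfn)
{
	unsigned long buddy_pfn = find_buddy_pfn(pfn, order);

	return zone_of(pfn, boundary_pfn) != zone_of(buddy_pfn, boundary_pfn);
}
```

With an order-10 block at pfn 0x37800, the buddy is at 0x37c00. A
boundary at 0x38000 (aligned to twice the block size) keeps both on the
same side, while a boundary at 0x37c00 or 0x37bfe splits them, so a
"buddy pfn is valid" test alone would not catch the mismatch.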