From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <2b84aba9-7435-0073-59f0-410fddb6df7d@suse.cz>
Date: Wed, 30 Mar 2022 23:48:07 +0200
Subject: Re: [BUG] Crash on x86_32 for: mm: page_alloc: avoid merging
 non-fallbackable pageblocks with others
To: Zi Yan, Steven Rostedt
Cc: Linus Torvalds, LKML, Mel Gorman, David Hildenbrand, Mike Rapoport,
 Oscar Salvador, Andrew Morton, Linux-MM
References: <20220330154208.71aca532@gandalf.local.home>
 <20220330165337.7138810e@gandalf.local.home>
 <733F211D-9717-46A7-A0A2-40353E12F65A@nvidia.com>
From: Vlastimil Babka
In-Reply-To: <733F211D-9717-46A7-A0A2-40353E12F65A@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/30/22 23:43, Zi Yan wrote:
> On 30 Mar 2022, at 17:25, Zi Yan wrote:
>
>> On 30 Mar 2022, at 16:53, Steven Rostedt wrote:
>>
>>> On Wed, 30 Mar 2022 16:29:28 -0400
>>> Zi Yan wrote:
>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index bdc8f60ae462..83a90e2973b7 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -1108,6 +1108,8 @@ static inline void __free_one_page(struct page *page,
>>>>
>>>>  		buddy_pfn = __find_buddy_pfn(pfn, order);
>>>>  		buddy = page + (buddy_pfn - pfn);
>>>> +		if (!page_is_buddy(page, buddy, order))
>>>> +			goto done_merging;
>>>>  		buddy_mt = get_pageblock_migratetype(buddy);
>>>>
>>>>  		if (migratetype != buddy_mt
>>>>
>>>
>>> The above did not apply to Linus's tree, nor even the problem commit
>>> (before or after), but I found where the code is, and added it manually.
>>>
>>> It does appear to allow the machine to boot.
>>>
>> I just pulled Linus’s tree and grabbed the diff. Anyway, thanks.
>>
>> I would like to get more understanding of the issue before blindly sending
>> this as a fix.
>>
>> Merge the other thread:
>>>
>>> Not sure if this matters or not, but my kernel command line has:
>>>
>>>   crashkernel=256M
>>>
>>> Could that have caused this to break?
>>
>> Unlikely, 256MB is MAX_ORDER_NR_PAGES aligned (MAX_ORDER is 11 here).
>> __find_buddy_pfn() will not get any buddy_pfn from crashkernel memory
>> region, since that would cross MAX_ORDER_NR_PAGES boundary.
>>
>> page_is_buddy() checks page_is_guard(buddy), PageBuddy(buddy),
>> buddy_order(buddy), and page_zone_id(buddy), where page_is_guard(buddy)
>> is always false since CONFIG_DEBUG_PAGEALLOC is not set in your config.
>> So either PageBuddy(buddy) is false, buddy_order(buddy) != order,
>> or page_zone_id(buddy) is not the same as page_zone_id(page).
>>
>> Do you mind adding the following code right before my fix code above
>> and provide a complete boot log? I would like to understand what
>> went wrong. Thanks.
>>
>> pr_info("buddy_pfn: %lx, PageBuddy: %d, buddy_order: %d (vs %d), page_zone_id: %d (vs %d)\n",
>> 	buddy_pfn, PageBuddy(buddy), buddy_order(buddy), order, page_zone_id(buddy),
>> 	page_zone_id(page));
>>
>>
>
> This seems to be a bug in the original code too.
> But "if (unlikely(has_isolate_pageblock(zone)))" is too rare to trigger it.
> I do not see how having isolated pageblocks in a zone could get us away
> from checking page_is_buddy().

IIRC the assumption was that pageblock bitmaps would always exist within
MAX_ORDER blocks. But here we are still under mem_init() where
has_isolate_pageblock() couldn't happen. And the assumption could have
been silently broken by subsequent memory init changes.

> --
> Best Regards,
> Yan, Zi
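
The alignment argument quoted above (256MB being MAX_ORDER_NR_PAGES
aligned, and a buddy never crossing a MAX_ORDER_NR_PAGES boundary) can be
checked with a small userspace sketch. This is not kernel code: it assumes
4 KiB pages and MAX_ORDER of 11 as stated in the thread, it mirrors the XOR
form that __find_buddy_pfn() uses, and the helper names
(crashkernel_is_block_aligned, buddies_stay_in_block, run_checks) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT          12                        /* assumption: 4 KiB pages */
#define MAX_ORDER           11                        /* value quoted in the thread */
#define MAX_ORDER_NR_PAGES  (1UL << (MAX_ORDER - 1))  /* 1024 pages */

/* Same arithmetic as __find_buddy_pfn(): flip the bit for this order. */
unsigned long find_buddy_pfn(unsigned long pfn, unsigned int order)
{
	return pfn ^ (1UL << order);
}

/* True if a reservation of this many bytes is a whole number of
 * MAX_ORDER_NR_PAGES blocks. */
bool crashkernel_is_block_aligned(unsigned long bytes)
{
	unsigned long nr_pages = bytes >> PAGE_SHIFT;

	return (nr_pages % MAX_ORDER_NR_PAGES) == 0;
}

/* Walk every order that the merge loop can reach (0 .. MAX_ORDER - 2) and
 * every order-aligned pfn inside the block starting at base_pfn; report
 * whether each computed buddy lies in the same MAX_ORDER_NR_PAGES block. */
bool buddies_stay_in_block(unsigned long base_pfn)
{
	for (unsigned int order = 0; order < MAX_ORDER - 1; order++) {
		for (unsigned long pfn = base_pfn;
		     pfn < base_pfn + MAX_ORDER_NR_PAGES;
		     pfn += 1UL << order) {
			unsigned long buddy = find_buddy_pfn(pfn, order);

			if (pfn / MAX_ORDER_NR_PAGES != buddy / MAX_ORDER_NR_PAGES)
				return false;
		}
	}
	return true;
}

/* Convenience wrapper: aborts if either property fails. */
void run_checks(void)
{
	assert(crashkernel_is_block_aligned(256UL << 20));
	assert(buddies_stay_in_block(3 * MAX_ORDER_NR_PAGES));
}
```

Calling run_checks() from a main() exercises both properties. Note that
this only covers the pfn arithmetic; it says nothing about whether the
struct page at the buddy pfn is actually a free buddy, which is what the
proposed page_is_buddy() check tests.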