From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755588AbdKBNkN (ORCPT ); Thu, 2 Nov 2017 09:40:13 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:25390 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754894AbdKBNkL (ORCPT ); Thu, 2 Nov 2017 09:40:11 -0400 Subject: Re: [PATCH v1 1/1] mm: buddy page accessed before initialized To: Michal Hocko Cc: steven.sistare@oracle.com, daniel.m.jordan@oracle.com, akpm@linux-foundation.org, mgorman@techsingularity.net, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20171031155002.21691-1-pasha.tatashin@oracle.com> <20171031155002.21691-2-pasha.tatashin@oracle.com> <20171102133235.2vfmmut6w4of2y3j@dhcp22.suse.cz> From: Pavel Tatashin Message-ID: Date: Thu, 2 Nov 2017 09:39:58 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <20171102133235.2vfmmut6w4of2y3j@dhcp22.suse.cz> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Source-IP: aserv0021.oracle.com [141.146.126.233] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/02/2017 09:32 AM, Michal Hocko wrote: > On Tue 31-10-17 11:50:02, Pavel Tatashin wrote: > [...] >> The problem happens in this path: >> >> page_alloc_init_late >> deferred_init_memmap >> deferred_init_range >> __def_free >> deferred_free_range >> __free_pages_boot_core(page, order) >> __free_pages() >> __free_pages_ok() >> free_one_page() >> __free_one_page(page, pfn, zone, order, migratetype); >> >> deferred_init_range() initializes one page at a time by calling >> __init_single_page(), once it initializes pageblock_nr_pages pages, it >> calls deferred_free_range() to free the initialized pages to the buddy >> allocator. Eventually, we reach __free_one_page(), where we compute buddy >> page: >> buddy_pfn = __find_buddy_pfn(pfn, order); >> buddy = page + (buddy_pfn - pfn); >> >> buddy_pfn is computed as pfn ^ (1 << order), or pfn + pageblock_nr_pages. >> Thefore, buddy page becomes a page one after the range that currently was >> initialized, and we access this page in this function. Also, later when we >> return back to deferred_init_range(), the buddy page is initialized again. >> >> So, in order to avoid this issue, we must initialize the buddy page prior >> to calling deferred_free_range(). > > How come we didn't have this problem previously? I am really confused. > Hi Michal, Previously as before my project? That is because memory for all struct pages was always zeroed in memblock, and in __free_one_page() page_is_buddy() was always returning false, thus we never tried to incorrectly remove it from the list: 837 list_del(&buddy->lru); Now, that memory is not zeroed, page_is_buddy() can return true after kexec when memory is dirty (unfortunately memset(1) with CONFIG_VM_DEBUG does not catch this case). And proceed further to incorrectly remove buddy from the list. This is why we must initialize the computed buddy page beforehand. Pasha