From: Michal Hocko <mhocko@kernel.org>
To: Pavel Tatashin
Cc: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	x86@kernel.org, kasan-dev@googlegroups.com, borntraeger@de.ibm.com,
	heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org,
	ard.biesheuvel@linaro.org, will.deacon@arm.com,
	catalin.marinas@arm.com, sam@ravnborg.org
Date: Fri, 11 Aug 2017 11:37:47 +0200
Subject: Re: [v6 05/15] mm: don't access uninitialized struct pages
Message-ID: <20170811093746.GF30811@dhcp22.suse.cz>
In-Reply-To: <1502138329-123460-6-git-send-email-pasha.tatashin@oracle.com>
References: <1502138329-123460-1-git-send-email-pasha.tatashin@oracle.com>
 <1502138329-123460-6-git-send-email-pasha.tatashin@oracle.com>

On Mon 07-08-17 16:38:39, Pavel Tatashin wrote:
> In deferred_init_memmap() where all deferred struct pages are initialized
> we have a check like this:
>
>         if (page->flags) {
>                 VM_BUG_ON(page_zone(page) != zone);
>                 goto free_range;
>         }
>
> This way we are checking whether the current deferred page has already
> been initialized. It works because memory for struct pages has been
> zeroed, and the only way flags can be non-zero is if the page went
> through __init_single_page() before. But once we change the current
> behavior and no longer zero the memory in the memblock allocator, we
> cannot trust anything inside "struct page" until it is initialized.
> This patch fixes this.
>
> This patch defines a new accessor, memblock_get_reserved_pfn_range(),
> which returns successive ranges of reserved PFNs. deferred_init_memmap()
> calls it to determine whether a PFN and its struct page have already
> been initialized.

Why don't we simply check the pfn against pgdat->first_deferred_pfn?
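Something like the following is what I have in mind - an untested sketch
only, but pgdat is already available in deferred_init_memmap() as the
quoted hunk below shows:

	/*
	 * Everything below pgdat->first_deferred_pfn was initialized
	 * eagerly during early boot, so (unless I am missing something)
	 * its struct page can be trusted; everything at or above it is
	 * ours to initialize here.
	 */
	if (pfn < pgdat->first_deferred_pfn) {
		VM_BUG_ON(page_zone(page) != zone);
		goto free_range;
	}

That would not need any new memblock interface at all.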
> Signed-off-by: Pavel Tatashin
> Reviewed-by: Steven Sistare
> Reviewed-by: Daniel Jordan
> Reviewed-by: Bob Picco
> ---
>  include/linux/memblock.h |  3 +++
>  mm/memblock.c            | 54 ++++++++++++++++++++++++++++++++++++++++++------
>  mm/page_alloc.c          | 11 +++++++++-
>  3 files changed, 61 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index bae11c7e7bf3..b6a2a610f5e1 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -320,6 +320,9 @@ int memblock_is_map_memory(phys_addr_t addr);
>  int memblock_is_region_memory(phys_addr_t base, phys_addr_t size);
>  bool memblock_is_reserved(phys_addr_t addr);
>  bool memblock_is_region_reserved(phys_addr_t base, phys_addr_t size);
> +void memblock_get_reserved_pfn_range(unsigned long pfn,
> +				     unsigned long *pfn_start,
> +				     unsigned long *pfn_end);
>  
>  extern void __memblock_dump_all(void);
>  
> diff --git a/mm/memblock.c b/mm/memblock.c
> index bf14aea6ab70..08f449acfdd1 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1580,7 +1580,13 @@ void __init memblock_mem_limit_remove_map(phys_addr_t limit)
>  	memblock_cap_memory_range(0, max_addr);
>  }
>  
> -static int __init_memblock memblock_search(struct memblock_type *type, phys_addr_t addr)
> +/**
> + * Return index in regions array if addr is within the region. Otherwise
> + * return -1. If -1 is returned and *next_idx is not %NULL, sets it to the
> + * next region index or -1 if there is none.
> + */
> +static int __init_memblock memblock_search(struct memblock_type *type,
> +					   phys_addr_t addr, int *next_idx)
>  {
>  	unsigned int left = 0, right = type->cnt;
>  
> @@ -1595,22 +1601,26 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
>  		else
>  			return mid;
>  	} while (left < right);
> +
> +	if (next_idx)
> +		*next_idx = (right == type->cnt) ? -1 : right;
> +
>  	return -1;
>  }
>  
>  bool __init memblock_is_reserved(phys_addr_t addr)
>  {
> -	return memblock_search(&memblock.reserved, addr) != -1;
> +	return memblock_search(&memblock.reserved, addr, NULL) != -1;
>  }
>  
>  bool __init_memblock memblock_is_memory(phys_addr_t addr)
>  {
> -	return memblock_search(&memblock.memory, addr) != -1;
> +	return memblock_search(&memblock.memory, addr, NULL) != -1;
>  }
>  
>  int __init_memblock memblock_is_map_memory(phys_addr_t addr)
>  {
> -	int i = memblock_search(&memblock.memory, addr);
> +	int i = memblock_search(&memblock.memory, addr, NULL);
>  
>  	if (i == -1)
>  		return false;
> @@ -1622,7 +1632,7 @@ int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
>  			 unsigned long *start_pfn, unsigned long *end_pfn)
>  {
>  	struct memblock_type *type = &memblock.memory;
> -	int mid = memblock_search(type, PFN_PHYS(pfn));
> +	int mid = memblock_search(type, PFN_PHYS(pfn), NULL);
>  
>  	if (mid == -1)
>  		return -1;
> @@ -1646,7 +1656,7 @@ int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
>   */
>  int __init_memblock memblock_is_region_memory(phys_addr_t base, phys_addr_t size)
>  {
> -	int idx = memblock_search(&memblock.memory, base);
> +	int idx = memblock_search(&memblock.memory, base, NULL);
>  	phys_addr_t end = base + memblock_cap_size(base, &size);
>  
>  	if (idx == -1)
> @@ -1655,6 +1665,38 @@ int __init_memblock memblock_is_region_memory(phys_addr_t base, phys_addr_t size
>  		 memblock.memory.regions[idx].size) >= end;
>  }
>  
> +/**
> + * memblock_get_reserved_pfn_range - search for the next reserved region
> + *
> + * @pfn: start searching from this pfn.
> + *
> + * RETURNS:
> + * [start_pfn, end_pfn), where start_pfn >= pfn. If none is found
> + * start_pfn and end_pfn are both set to ULONG_MAX.
> + */
> +void __init_memblock memblock_get_reserved_pfn_range(unsigned long pfn,
> +						     unsigned long *start_pfn,
> +						     unsigned long *end_pfn)
> +{
> +	struct memblock_type *type = &memblock.reserved;
> +	int next_idx, idx;
> +
> +	idx = memblock_search(type, PFN_PHYS(pfn), &next_idx);
> +	if (idx == -1 && next_idx == -1) {
> +		*start_pfn = ULONG_MAX;
> +		*end_pfn = ULONG_MAX;
> +		return;
> +	}
> +
> +	if (idx == -1) {
> +		idx = next_idx;
> +		*start_pfn = PFN_DOWN(type->regions[idx].base);
> +	} else {
> +		*start_pfn = pfn;
> +	}
> +	*end_pfn = PFN_DOWN(type->regions[idx].base + type->regions[idx].size);
> +}
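Just so that I am sure I am reading the new interface correctly - the
intended usage pattern for a caller seems to be something along these
lines (an illustrative sketch only, this loop is not part of the patch,
and it assumes the reserved regions are page aligned):

	unsigned long start_pfn, end_pfn;

	/* walk all reserved PFN ranges at or above pfn */
	memblock_get_reserved_pfn_range(pfn, &start_pfn, &end_pfn);
	while (start_pfn != ULONG_MAX) {
		/* [start_pfn, end_pfn) lies within a reserved region */
		memblock_get_reserved_pfn_range(end_pfn,
						&start_pfn, &end_pfn);
	}

In other words, a caller scanning pfns in ascending order only has to
re-query once it crosses end_pfn, which is what deferred_init_memmap()
below relies on.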
> +
>  /**
>   * memblock_is_region_reserved - check if a region intersects reserved memory
>   * @base: base of region to check
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 63d16c185736..983de0a8047b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1447,6 +1447,7 @@ static int __init deferred_init_memmap(void *data)
>  	pg_data_t *pgdat = data;
>  	int nid = pgdat->node_id;
>  	struct mminit_pfnnid_cache nid_init_state = { };
> +	unsigned long resv_start_pfn = 0, resv_end_pfn = 0;
>  	unsigned long start = jiffies;
>  	unsigned long nr_pages = 0;
>  	unsigned long walk_start, walk_end;
> @@ -1491,6 +1492,10 @@ static int __init deferred_init_memmap(void *data)
>  			pfn = zone->zone_start_pfn;
>  
>  		for (; pfn < end_pfn; pfn++) {
> +			if (pfn >= resv_end_pfn)
> +				memblock_get_reserved_pfn_range(pfn,
> +								&resv_start_pfn,
> +								&resv_end_pfn);
>  			if (!pfn_valid_within(pfn))
>  				goto free_range;
>  
> @@ -1524,7 +1529,11 @@ static int __init deferred_init_memmap(void *data)
>  				cond_resched();
>  			}
>  
> -			if (page->flags) {
> +			/*
> +			 * Check if this page has already been initialized due
> +			 * to being reserved during boot in memblock.
> +			 */
> +			if (pfn >= resv_start_pfn) {
>  				VM_BUG_ON(page_zone(page) != zone);
>  				goto free_range;
>  			}
> -- 
> 2.14.0

-- 
Michal Hocko
SUSE Labs