* [Question] pfn_valid_within(page_to_pfn(buddy)) vs pfn_valid_within(buddy_pfn) in __free_one_page()
@ 2019-10-08 8:35 Kefeng Wang
2019-10-08 9:12 ` Vlastimil Babka
0 siblings, 1 reply; 3+ messages in thread
From: Kefeng Wang @ 2019-10-08 8:35 UTC (permalink / raw)
To: linux-mm, Vlastimil Babka
Cc: Michal Hocko, Andrew Morton, Mel Gorman, qiuguorui1, wangkefeng wang
Hi Vlastimil and all,
We met a Null pointer when do page_to_pfn() in __free_one_page() in older kernel, __nr_to_section(__sec) return NULL,
#define __page_to_pfn(pg) \
({ const struct page *__pg = (pg); \
int __sec = page_to_section(__pg); \
(unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
})
Before v4.11, __free_one_page() use pfn_valid_within(page_to_pfn(buddy)) to check pfn, after the
following patches, it use buddy_pfn directly.
b4fb8f66f1ae mm, page_alloc: Add missing check for memory holes
13ad59df67f1 mm, page_alloc: avoid page_to_pfn() when merging buddies // NOTE: directly use buddy_pfn
76741e776a37 mm, page_alloc: don't convert pfn to idx when merging
If use page_to_pfn(buddy) back in mainline 5.4-rc2, the same issue will occur,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c0b2e0306720..fbbfe8e8b4ca 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -934,7 +934,7 @@ static inline void __free_one_page(struct page *page,
buddy_pfn = __find_buddy_pfn(pfn, order);
buddy = page + (buddy_pfn - pfn);
- if (!pfn_valid_within(buddy_pfn))
+ if (!pfn_valid_within(page_to_pfn(buddy)))
goto done_merging;
if (!page_is_buddy(page, buddy, order))
goto done_merging;
It shows the buddy->flags is wrong, that is, buddy is bad, we find the buddy by page + (buddy_pfn - pfn),
so there is some issue in __find_buddy_pfn(pfn, order)?
The following is the debug print and CallTrace, any comment?
Thanks,
Kefeng
1) MEMBLOCK configuration:
memory size = 0x0000000036800000 reserved size = 0x0000000004ca7fbc
memory.cnt = 0x4
memory[0x0] [0x0000000000000000-0x0000000013ffffff], 0x0000000014000000 bytes flags: 0x0
memory[0x1] [0x000000002d600000-0x0000000033ffffff], 0x0000000006a00000 bytes flags: 0x0
memory[0x2] [0x0000000034800000-0x00000000445fffff], 0x000000000fe00000 bytes flags: 0x0
memory[0x3] [0x0000000044a00000-0x00000000509fffff], 0x000000000c000000 bytes flags: 0x0
reserved.cnt = 0x6
reserved[0x0] [0x000000002d680000-0x000000002e7c8fff], 0x0000000001149000 bytes flags: 0x0
reserved[0x1] [0x0000000030b00000-0x000000003239cfff], 0x000000000189d000 bytes flags: 0x0
reserved[0x2] [0x0000000032400000-0x00000000324fffff], 0x0000000000100000 bytes flags: 0x0
reserved[0x3] [0x000000004e000000-0x000000004fffffff], 0x0000000002000000 bytes flags: 0x0
reserved[0x4] [0x000000005083e040-0x0000000050849ffb], 0x000000000000bfbc bytes flags: 0x0
reserved[0x5] [0x000000005084a000-0x00000000509fffff], 0x00000000001b6000 bytes flags: 0x0
2) CONFIG
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
# CONFIG_SPARSEMEM_VMEMMAP is not set
3) debug print and CallTrace
__free_one_page , order = 9, max_order = 11, page = ffffff804e128000, buddy = ffffff804e120000, sec = 42623, mem_section = 0000000000000000
buddy = ffffff804e120000, __sec = 42623, buddy->flags = 299fcc27aebc552f, SECTIONS_PGSHIFT = 46, SECTIONS_MASK = 3ffff
__section_mem_map_addr, section = 0000000000000000
------------[ cut here ]------------
Ignoring spurious kernel translation fault at virtual address 0000000000000000
WARNING: CPU: 1 PID: 29 at ../arch/arm64/mm/fault.c:302 __do_kernel_fault+0xb8/0x130
Modules linked in:
CPU: 1 PID: 29 Comm: kworker/1:1 Not tainted 5.4.0-rc2+ #16
Hardware name: linux,dummy-virt (DT)
Workqueue: events delayed_fput
pstate: 60000085 (nZCv daIf -PAN -UAO)
pc : __do_kernel_fault+0xb8/0x130
lr : __do_kernel_fault+0xb8/0x130
sp : ffffffc011273560
x29: ffffffc011273560 x28: ffffff805020ce00
x27: ffffff804e128000 x26: ffffff804e120000
x25: 0000000000000000 x24: 0000000000000025
x23: 0000000096000005 x22: 0000000000000025
x21: 0000000096000005 x20: ffffffc011273650
x19: 0000000000000000 x18: 0000000000000000
x17: 00000000f06e9c14 x16: 0000000000000014
x15: 0000000000000000 x14: 7320303030303030
x13: 6c20616464726573 x12: 7420766972747561
x11: 206661756c742061 x10: 6e736c6174696f6e
x9 : 726e656c20747261 x8 : 72696f7573206b65
x7 : 72696e6720737075 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000000
x3 : 0000000000000000 x2 : 0000000000000000
x1 : 0095a39d527dc9a0 x0 : 0000000000000000
Call trace:
__do_kernel_fault+0xb8/0x130
do_page_fault+0x60/0x3ec
do_translation_fault+0x40/0x78
do_mem_abort+0x50/0xa8
el1_da+0x20/0x94
__free_one_page+0x1d8/0x31c
free_pcppages_bulk+0x1dc/0x258
free_unref_page_commit.isra.114+0xb0/0xc0
free_unref_page_list+0x144/0x198
release_pages+0x8c/0x2bc
__pagevec_release+0x38/0x48
pagevec_release+0x14/0x20
shmem_undo_range+0x23c/0x49c
shmem_truncate_range+0x38/0x58
shmem_evict_inode+0xd4/0x1e8
evict+0xb8/0x174
iput+0x178/0x1c0
dentry_unlink_inode+0x120/0x124
__dentry_kill+0x98/0x170
dput+0x88/0x140
__fput+0x184/0x1e8
delayed_fput+0x40/0x54
process_one_work+0x1a4/0x294
worker_thread+0x1ec/0x284
kthread+0xf0/0x100
ret_from_fork+0x10/0x18
---[ end trace ebdfde5c0fdc7511 ]---
------------[ cut here ]------------
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [Question] pfn_valid_within(page_to_pfn(buddy)) vs pfn_valid_within(buddy_pfn) in __free_one_page()
2019-10-08 8:35 [Question] pfn_valid_within(page_to_pfn(buddy)) vs pfn_valid_within(buddy_pfn) in __free_one_page() Kefeng Wang
@ 2019-10-08 9:12 ` Vlastimil Babka
2019-10-09 1:07 ` Kefeng Wang
0 siblings, 1 reply; 3+ messages in thread
From: Vlastimil Babka @ 2019-10-08 9:12 UTC (permalink / raw)
To: Kefeng Wang, linux-mm; +Cc: Michal Hocko, Andrew Morton, Mel Gorman, qiuguorui1
On 10/8/19 10:35 AM, Kefeng Wang wrote:
> Hi Vlastimil and all,
>
> We met a Null pointer when do page_to_pfn() in __free_one_page() in older kernel, __nr_to_section(__sec) return NULL,
>
> #define __page_to_pfn(pg) \
> ({ const struct page *__pg = (pg); \
> int __sec = page_to_section(__pg); \
> (unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
> })
>
> Before v4.11, __free_one_page() use pfn_valid_within(page_to_pfn(buddy)) to check pfn, after the
Hmm, looks like the code before v4.11 was wrong. pfn_valid_(within)
should be checked first, before obtaining and working with the struct
page. Here we already have a struct page obtained by pointer arithmetics,
and are calling page_to_pfn() on it, which means accessing its flags.
The pfn_valid_within() then comes too late.
> following patches, it use buddy_pfn directly.
>
> b4fb8f66f1ae mm, page_alloc: Add missing check for memory holes
> 13ad59df67f1 mm, page_alloc: avoid page_to_pfn() when merging buddies // NOTE: directly use buddy_pfn
> 76741e776a37 mm, page_alloc: don't convert pfn to idx when merging
Looks like my patches fixed that code without realizing there was
the bug. Commit b4fb8f66f1ae shows I've also introduced it elsewhere.
> If use page_to_pfn(buddy) back in mainline 5.4-rc2, the same issue will occur,
No surprise, we must validate pfn first before touching the page.
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c0b2e0306720..fbbfe8e8b4ca 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -934,7 +934,7 @@ static inline void __free_one_page(struct page *page,
> buddy_pfn = __find_buddy_pfn(pfn, order);
> buddy = page + (buddy_pfn - pfn);
>
> - if (!pfn_valid_within(buddy_pfn))
> + if (!pfn_valid_within(page_to_pfn(buddy)))
> goto done_merging;
> if (!page_is_buddy(page, buddy, order))
> goto done_merging;
>
> It shows the buddy->flags is wrong, that is, buddy is bad, we find the buddy by page + (buddy_pfn - pfn),
> so there is some issue in __find_buddy_pfn(pfn, order)?
No, result of __find_buddy_pfn has to be validated first.
> The following is the debug print and CallTrace, any comment?
>
> Thanks,
> Kefeng
>
> 1) MEMBLOCK configuration:
> memory size = 0x0000000036800000 reserved size = 0x0000000004ca7fbc
> memory.cnt = 0x4
> memory[0x0] [0x0000000000000000-0x0000000013ffffff], 0x0000000014000000 bytes flags: 0x0
> memory[0x1] [0x000000002d600000-0x0000000033ffffff], 0x0000000006a00000 bytes flags: 0x0
> memory[0x2] [0x0000000034800000-0x00000000445fffff], 0x000000000fe00000 bytes flags: 0x0
> memory[0x3] [0x0000000044a00000-0x00000000509fffff], 0x000000000c000000 bytes flags: 0x0
> reserved.cnt = 0x6
> reserved[0x0] [0x000000002d680000-0x000000002e7c8fff], 0x0000000001149000 bytes flags: 0x0
> reserved[0x1] [0x0000000030b00000-0x000000003239cfff], 0x000000000189d000 bytes flags: 0x0
> reserved[0x2] [0x0000000032400000-0x00000000324fffff], 0x0000000000100000 bytes flags: 0x0
> reserved[0x3] [0x000000004e000000-0x000000004fffffff], 0x0000000002000000 bytes flags: 0x0
> reserved[0x4] [0x000000005083e040-0x0000000050849ffb], 0x000000000000bfbc bytes flags: 0x0
> reserved[0x5] [0x000000005084a000-0x00000000509fffff], 0x00000000001b6000 bytes flags: 0x0
These might be holes in the zones, right.
> 2) CONFIG
> CONFIG_SPARSEMEM_MANUAL=y
> CONFIG_SPARSEMEM=y
> CONFIG_HAVE_MEMORY_PRESENT=y
> CONFIG_SPARSEMEM_EXTREME=y
> CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
> # CONFIG_SPARSEMEM_VMEMMAP is not set
Is CONFIG_HOLES_IN_ZONE enabled? Probably yes as that's arm64.
> 3) debug print and CallTrace
> __free_one_page , order = 9, max_order = 11, page = ffffff804e128000, buddy = ffffff804e120000, sec = 42623, mem_section = 0000000000000000
I would assume buddy is in one of the holes, but you'd have to print the pfn's to be sure.
> buddy = ffffff804e120000, __sec = 42623, buddy->flags = 299fcc27aebc552f, SECTIONS_PGSHIFT = 46, SECTIONS_MASK = 3ffff
> __section_mem_map_addr, section = 0000000000000000
> ------------[ cut here ]------------
> Ignoring spurious kernel translation fault at virtual address 0000000000000000
> WARNING: CPU: 1 PID: 29 at ../arch/arm64/mm/fault.c:302 __do_kernel_fault+0xb8/0x130
> Modules linked in:
> CPU: 1 PID: 29 Comm: kworker/1:1 Not tainted 5.4.0-rc2+ #16
> Hardware name: linux,dummy-virt (DT)
> Workqueue: events delayed_fput
> pstate: 60000085 (nZCv daIf -PAN -UAO)
> pc : __do_kernel_fault+0xb8/0x130
> lr : __do_kernel_fault+0xb8/0x130
> sp : ffffffc011273560
> x29: ffffffc011273560 x28: ffffff805020ce00
> x27: ffffff804e128000 x26: ffffff804e120000
> x25: 0000000000000000 x24: 0000000000000025
> x23: 0000000096000005 x22: 0000000000000025
> x21: 0000000096000005 x20: ffffffc011273650
> x19: 0000000000000000 x18: 0000000000000000
> x17: 00000000f06e9c14 x16: 0000000000000014
> x15: 0000000000000000 x14: 7320303030303030
> x13: 6c20616464726573 x12: 7420766972747561
> x11: 206661756c742061 x10: 6e736c6174696f6e
> x9 : 726e656c20747261 x8 : 72696f7573206b65
> x7 : 72696e6720737075 x6 : 0000000000000000
> x5 : 0000000000000001 x4 : 0000000000000000
> x3 : 0000000000000000 x2 : 0000000000000000
> x1 : 0095a39d527dc9a0 x0 : 0000000000000000
> Call trace:
> __do_kernel_fault+0xb8/0x130
> do_page_fault+0x60/0x3ec
> do_translation_fault+0x40/0x78
> do_mem_abort+0x50/0xa8
> el1_da+0x20/0x94
> __free_one_page+0x1d8/0x31c
> free_pcppages_bulk+0x1dc/0x258
> free_unref_page_commit.isra.114+0xb0/0xc0
> free_unref_page_list+0x144/0x198
> release_pages+0x8c/0x2bc
> __pagevec_release+0x38/0x48
> pagevec_release+0x14/0x20
> shmem_undo_range+0x23c/0x49c
> shmem_truncate_range+0x38/0x58
> shmem_evict_inode+0xd4/0x1e8
> evict+0xb8/0x174
> iput+0x178/0x1c0
> dentry_unlink_inode+0x120/0x124
> __dentry_kill+0x98/0x170
> dput+0x88/0x140
> __fput+0x184/0x1e8
> delayed_fput+0x40/0x54
> process_one_work+0x1a4/0x294
> worker_thread+0x1ec/0x284
> kthread+0xf0/0x100
> ret_from_fork+0x10/0x18
> ---[ end trace ebdfde5c0fdc7511 ]---
> ------------[ cut here ]------------
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Question] pfn_valid_within(page_to_pfn(buddy)) vs pfn_valid_within(buddy_pfn) in __free_one_page()
2019-10-08 9:12 ` Vlastimil Babka
@ 2019-10-09 1:07 ` Kefeng Wang
0 siblings, 0 replies; 3+ messages in thread
From: Kefeng Wang @ 2019-10-09 1:07 UTC (permalink / raw)
To: Vlastimil Babka, linux-mm
Cc: Michal Hocko, Andrew Morton, Mel Gorman, qiuguorui1
On 2019/10/8 17:12, Vlastimil Babka wrote:
> On 10/8/19 10:35 AM, Kefeng Wang wrote:
>> Hi Vlastimil and all,
>>
>> We met a Null pointer when do page_to_pfn() in __free_one_page() in older kernel, __nr_to_section(__sec) return NULL,
>>
>> #define __page_to_pfn(pg) \
>> ({ const struct page *__pg = (pg); \
>> int __sec = page_to_section(__pg); \
>> (unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
>> })
>>
>> Before v4.11, __free_one_page() use pfn_valid_within(page_to_pfn(buddy)) to check pfn, after the
>
> Hmm, looks like the code before v4.11 was wrong. pfn_valid_(within)
> should be checked first, before obtaining and working with the struct
> page. Here we already have a struct page obtained by pointer arithmetics,
> and are calling page_to_pfn() on it, which means accessing its flags.
> The pfn_valid_within() then comes too late.
>
>> following patches, it use buddy_pfn directly.
>>
>> b4fb8f66f1ae mm, page_alloc: Add missing check for memory holes
>> 13ad59df67f1 mm, page_alloc: avoid page_to_pfn() when merging buddies // NOTE: directly use buddy_pfn
>> 76741e776a37 mm, page_alloc: don't convert pfn to idx when merging
>
> Looks like my patches fixed that code without realizing there was
> the bug. Commit b4fb8f66f1ae shows I've also introduced it elsewhere.
>
>> If use page_to_pfn(buddy) back in mainline 5.4-rc2, the same issue will occur,
>
> No surprise, we must validate pfn first before touching the page.
>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index c0b2e0306720..fbbfe8e8b4ca 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -934,7 +934,7 @@ static inline void __free_one_page(struct page *page,
>> buddy_pfn = __find_buddy_pfn(pfn, order);
>> buddy = page + (buddy_pfn - pfn);
>>
>> - if (!pfn_valid_within(buddy_pfn))
>> + if (!pfn_valid_within(page_to_pfn(buddy)))
>> goto done_merging;
>> if (!page_is_buddy(page, buddy, order))
>> goto done_merging;
>>
>> It shows the buddy->flags is wrong, that is, buddy is bad, we find the buddy by page + (buddy_pfn - pfn),
>> so there is some issue in __find_buddy_pfn(pfn, order)?
>
> No, result of __find_buddy_pfn has to be validated first.
Hi Vlastimil, Thank you for your explanation, got it.
>
>> The following is the debug print and CallTrace, any comment?
>>
>> Thanks,
>> Kefeng
>>
>> 1) MEMBLOCK configuration:
>> memory size = 0x0000000036800000 reserved size = 0x0000000004ca7fbc
>> memory.cnt = 0x4
>> memory[0x0] [0x0000000000000000-0x0000000013ffffff], 0x0000000014000000 bytes flags: 0x0
>> memory[0x1] [0x000000002d600000-0x0000000033ffffff], 0x0000000006a00000 bytes flags: 0x0
>> memory[0x2] [0x0000000034800000-0x00000000445fffff], 0x000000000fe00000 bytes flags: 0x0
>> memory[0x3] [0x0000000044a00000-0x00000000509fffff], 0x000000000c000000 bytes flags: 0x0
>> reserved.cnt = 0x6
>> reserved[0x0] [0x000000002d680000-0x000000002e7c8fff], 0x0000000001149000 bytes flags: 0x0
>> reserved[0x1] [0x0000000030b00000-0x000000003239cfff], 0x000000000189d000 bytes flags: 0x0
>> reserved[0x2] [0x0000000032400000-0x00000000324fffff], 0x0000000000100000 bytes flags: 0x0
>> reserved[0x3] [0x000000004e000000-0x000000004fffffff], 0x0000000002000000 bytes flags: 0x0
>> reserved[0x4] [0x000000005083e040-0x0000000050849ffb], 0x000000000000bfbc bytes flags: 0x0
>> reserved[0x5] [0x000000005084a000-0x00000000509fffff], 0x00000000001b6000 bytes flags: 0x0
>
> These might be holes in the zones, right.
>
>> 2) CONFIG
>> CONFIG_SPARSEMEM_MANUAL=y
>> CONFIG_SPARSEMEM=y
>> CONFIG_HAVE_MEMORY_PRESENT=y
>> CONFIG_SPARSEMEM_EXTREME=y
>> CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
>> # CONFIG_SPARSEMEM_VMEMMAP is not set
>
> Is CONFIG_HOLES_IN_ZONE enabled? Probably yes as that's arm64.
Yes, arm64 force enable HOLES_IN_ZONE.
>
>> 3) debug print and CallTrace
>> __free_one_page , order = 9, max_order = 11, page = ffffff804e128000, buddy = ffffff804e120000, sec = 42623, mem_section = 0000000000000000
>
> I would assume buddy is in one of the holes, but you'd have to print the pfn's to be sure.
buddy_pfn = 280576, buddy_addr = 44800000, and it is in the hole between memory[2] and memory[3].
>
>> buddy = ffffff804e120000, __sec = 42623, buddy->flags = 299fcc27aebc552f, SECTIONS_PGSHIFT = 46, SECTIONS_MASK = 3ffff
>> __section_mem_map_addr, section = 0000000000000000
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-10-09 1:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-08 8:35 [Question] pfn_valid_within(page_to_pfn(buddy)) vs pfn_valid_within(buddy_pfn) in __free_one_page() Kefeng Wang
2019-10-08 9:12 ` Vlastimil Babka
2019-10-09 1:07 ` Kefeng Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).