linux-kernel.vger.kernel.org archive mirror
* [PATCH v1 0/1] buddy page accessed before initialized
@ 2017-10-31 15:50 Pavel Tatashin
  2017-10-31 15:50 ` [PATCH v1 1/1] mm: " Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2017-10-31 15:50 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, akpm, mgorman, mhocko, linux-mm,
	linux-kernel

This problem is introduced in linux-next:

a4d28b2d6e64 mm: deferred_init_memmap improvements

If it is more appropriate to create a new patch that folds this fix into
the original patch, please let me know.

Pavel Tatashin (1):
  mm: buddy page accessed before initialized

 mm/page_alloc.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.14.3


* [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-10-31 15:50 [PATCH v1 0/1] buddy page accessed before initialized Pavel Tatashin
@ 2017-10-31 15:50 ` Pavel Tatashin
  2017-11-02 13:32   ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2017-10-31 15:50 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, akpm, mgorman, mhocko, linux-mm,
	linux-kernel

This problem is seen when a machine is rebooted after kexec; a message
like the following is printed:
==========================================================================
WARNING: CPU: 21 PID: 249 at linux/lib/list_debug.c:53 __list_del_entry_valid+0x83/0xa0
Modules linked in:
CPU: 21 PID: 249 Comm: pgdatinit0 Not tainted 4.14.0-rc6_pt_deferred #90
Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U,
BIOS 3016
node 1 initialised, 32444607 pages in 1679ms
task: ffff880180e75a00 task.stack: ffffc9000cdb0000
RIP: 0010:__list_del_entry_valid+0x83/0xa0
RSP: 0000:ffffc9000cdb3d18 EFLAGS: 00010046
RAX: 0000000000000054 RBX: 0000000000000009 RCX: ffffffff81c5f3e8
RDX: 0000000000000000 RSI: 0000000000000086 RDI: 0000000000000046
RBP: ffffc9000cdb3d18 R08: 00000000fffffffe R09: 0000000000000154
R10: 0000000000000005 R11: 0000000000000153 R12: 0000000001fcdc00
R13: 0000000001fcde00 R14: ffff88207ffded00 R15: ffffea007f370000
FS:  0000000000000000(0000) GS:ffff881fffac0000(0000) knlGS:0
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000407ec09001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 free_one_page+0x103/0x390
 __free_pages_ok+0x1cf/0x2d0
 __free_pages+0x19/0x30
 __free_pages_boot_core+0xae/0xba
 deferred_free_range+0x60/0x94
 deferred_init_memmap+0x324/0x372
 kthread+0x109/0x140
 ? __free_pages_bootmem+0x2e/0x2e
 ? kthread_park+0x60/0x60
 ret_from_fork+0x25/0x30

list_del corruption. next->prev should be ffffea007f428020, but was
ffffea007f1d8020
==========================================================================

The problem happens in this path:

page_alloc_init_late
  deferred_init_memmap
    deferred_init_range
      __def_free
        deferred_free_range
          __free_pages_boot_core(page, order)
            __free_pages()
              __free_pages_ok()
                free_one_page()
                  __free_one_page(page, pfn, zone, order, migratetype);

deferred_init_range() initializes one page at a time by calling
__init_single_page(). Once it has initialized pageblock_nr_pages pages, it
calls deferred_free_range() to free the initialized pages to the buddy
allocator. Eventually, we reach __free_one_page(), where the buddy page is
computed:
	buddy_pfn = __find_buddy_pfn(pfn, order);
	buddy = page + (buddy_pfn - pfn);

buddy_pfn is computed as pfn ^ (1 << order), which here equals
pfn + pageblock_nr_pages. Therefore, the buddy page is the page immediately
after the range that was just initialized, and we access that page in this
function. Also, when we later return to deferred_init_range(), the buddy
page is initialized again.

So, to avoid this issue, we must initialize the buddy page before calling
deferred_free_range().

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 mm/page_alloc.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 97687b38da05..f3ea06db3eed 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1500,9 +1500,17 @@ static unsigned long deferred_init_range(int nid, int zid, unsigned long pfn,
 			__init_single_page(page, pfn, zid, nid);
 			nr_free++;
 		} else {
-			nr_pages += __def_free(&nr_free, &free_base_pfn, &page);
 			page = pfn_to_page(pfn);
 			__init_single_page(page, pfn, zid, nid);
+			/*
+			 * We must free previous range after initializing the
+			 * first page of the next range. This is because first
+			 * page may be accessed in __free_one_page(), when buddy
+			 * page is computed:
+			 *   buddy_pfn = pfn + pageblock_nr_pages
+			 */
+			deferred_free_range(free_base_pfn, nr_free);
+			nr_pages += nr_free;
 			free_base_pfn = pfn;
 			nr_free = 1;
 			cond_resched();
-- 
2.14.3


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-10-31 15:50 ` [PATCH v1 1/1] mm: " Pavel Tatashin
@ 2017-11-02 13:32   ` Michal Hocko
  2017-11-02 13:39     ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2017-11-02 13:32 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

On Tue 31-10-17 11:50:02, Pavel Tatashin wrote:
[...]
> The problem happens in this path:
> 
> page_alloc_init_late
>   deferred_init_memmap
>     deferred_init_range
>       __def_free
>         deferred_free_range
>           __free_pages_boot_core(page, order)
>             __free_pages()
>               __free_pages_ok()
>                 free_one_page()
>                   __free_one_page(page, pfn, zone, order, migratetype);
> 
> deferred_init_range() initializes one page at a time by calling
> __init_single_page(), once it initializes pageblock_nr_pages pages, it
> calls deferred_free_range() to free the initialized pages to the buddy
> allocator. Eventually, we reach __free_one_page(), where we compute buddy
> page:
> 	buddy_pfn = __find_buddy_pfn(pfn, order);
> 	buddy = page + (buddy_pfn - pfn);
> 
> buddy_pfn is computed as pfn ^ (1 << order), or pfn + pageblock_nr_pages.
> Therefore, buddy page becomes a page one after the range that currently was
> initialized, and we access this page in this function. Also, later when we
> return back to deferred_init_range(), the buddy page is initialized again.
> 
> So, in order to avoid this issue, we must initialize the buddy page prior
> to calling deferred_free_range().

How come we didn't have this problem previously? I am really confused.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  mm/page_alloc.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 97687b38da05..f3ea06db3eed 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1500,9 +1500,17 @@ static unsigned long deferred_init_range(int nid, int zid, unsigned long pfn,
>  			__init_single_page(page, pfn, zid, nid);
>  			nr_free++;
>  		} else {
> -			nr_pages += __def_free(&nr_free, &free_base_pfn, &page);
>  			page = pfn_to_page(pfn);
>  			__init_single_page(page, pfn, zid, nid);
> +			/*
> +			 * We must free previous range after initializing the
> +			 * first page of the next range. This is because first
> +			 * page may be accessed in __free_one_page(), when buddy
> +			 * page is computed:
> +			 *   buddy_pfn = pfn + pageblock_nr_pages
> +			 */
> +			deferred_free_range(free_base_pfn, nr_free);
> +			nr_pages += nr_free;
>  			free_base_pfn = pfn;
>  			nr_free = 1;
>  			cond_resched();
> -- 
> 2.14.3
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 13:32   ` Michal Hocko
@ 2017-11-02 13:39     ` Pavel Tatashin
  2017-11-02 13:54       ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2017-11-02 13:39 UTC (permalink / raw)
  To: Michal Hocko
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

On 11/02/2017 09:32 AM, Michal Hocko wrote:
> On Tue 31-10-17 11:50:02, Pavel Tatashin wrote:
> [...]
>> The problem happens in this path:
>>
>> page_alloc_init_late
>>    deferred_init_memmap
>>      deferred_init_range
>>        __def_free
>>          deferred_free_range
>>            __free_pages_boot_core(page, order)
>>              __free_pages()
>>                __free_pages_ok()
>>                  free_one_page()
>>                    __free_one_page(page, pfn, zone, order, migratetype);
>>
>> deferred_init_range() initializes one page at a time by calling
>> __init_single_page(), once it initializes pageblock_nr_pages pages, it
>> calls deferred_free_range() to free the initialized pages to the buddy
>> allocator. Eventually, we reach __free_one_page(), where we compute buddy
>> page:
>> 	buddy_pfn = __find_buddy_pfn(pfn, order);
>> 	buddy = page + (buddy_pfn - pfn);
>>
>> buddy_pfn is computed as pfn ^ (1 << order), or pfn + pageblock_nr_pages.
>> Therefore, buddy page becomes a page one after the range that currently was
>> initialized, and we access this page in this function. Also, later when we
>> return back to deferred_init_range(), the buddy page is initialized again.
>>
>> So, in order to avoid this issue, we must initialize the buddy page prior
>> to calling deferred_free_range().
> 
> How come we didn't have this problem previously? I am really confused.
> 

Hi Michal,

Previously, as in before my project? That is because memory for all struct 
pages was always zeroed by memblock, and in __free_one_page() 
page_is_buddy() always returned false, so we never tried to 
incorrectly remove it from the list:

837			list_del(&buddy->lru);

Now that memory is not zeroed, page_is_buddy() can return true after 
kexec when memory is dirty (unfortunately, memset(1) with CONFIG_VM_DEBUG 
does not catch this case), and we proceed further to incorrectly remove 
the buddy from the list.

This is why we must initialize the computed buddy page beforehand.

Pasha


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 13:39     ` Pavel Tatashin
@ 2017-11-02 13:54       ` Michal Hocko
  2017-11-02 14:00         ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2017-11-02 13:54 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

On Thu 02-11-17 09:39:58, Pavel Tatashin wrote:
[...]
> Hi Michal,
> 
> Previously as before my project? That is because memory for all struct pages
> was always zeroed in memblock, and in __free_one_page() page_is_buddy() was
> always returning false, thus we never tried to incorrectly remove it from
> the list:
> 
> 837			list_del(&buddy->lru);
> 
> Now, that memory is not zeroed, page_is_buddy() can return true after kexec
> when memory is dirty (unfortunately memset(1) with CONFIG_VM_DEBUG does not
> catch this case). And proceed further to incorrectly remove buddy from the
> list.

OK, I thought this was a regression from one of the recent patches. So
the problem is not new. Why don't we see the same problem during the
standard boot?

> This is why we must initialize the computed buddy page beforehand.

Ble, this is really ugly. I will think about it more.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 13:54       ` Michal Hocko
@ 2017-11-02 14:00         ` Pavel Tatashin
  2017-11-02 14:08           ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2017-11-02 14:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel



On 11/02/2017 09:54 AM, Michal Hocko wrote:
> On Thu 02-11-17 09:39:58, Pavel Tatashin wrote:
> [...]
>> Hi Michal,
>>
>> Previously as before my project? That is because memory for all struct pages
>> was always zeroed in memblock, and in __free_one_page() page_is_buddy() was
>> always returning false, thus we never tried to incorrectly remove it from
>> the list:
>>
>> 837			list_del(&buddy->lru);
>>
>> Now, that memory is not zeroed, page_is_buddy() can return true after kexec
>> when memory is dirty (unfortunately memset(1) with CONFIG_VM_DEBUG does not
>> catch this case). And proceed further to incorrectly remove buddy from the
>> list.
> 
> OK, I thought this was a regression from one of the recent patches. So
> the problem is not new. Why don't we see the same problem during the
> standard boot?

Because, I believe, the BIOS is zeroing all the memory for us.

> 
>> This is why we must initialize the computed buddy page beforehand.
> 
> Ble, this is really ugly. I will think about it more.
> 

Another approach that I considered is to split the loop inside 
deferred_init_range() into two loops: one where we initialize pages by 
calling __init_single_page(), and another where we free them to the buddy 
allocator by calling deferred_free_range().

Pasha


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 14:00         ` Pavel Tatashin
@ 2017-11-02 14:08           ` Michal Hocko
  2017-11-02 14:16             ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2017-11-02 14:08 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

On Thu 02-11-17 10:00:59, Pavel Tatashin wrote:
> 
> 
> On 11/02/2017 09:54 AM, Michal Hocko wrote:
> > On Thu 02-11-17 09:39:58, Pavel Tatashin wrote:
> > [...]
> > > Hi Michal,
> > > 
> > > Previously as before my project? That is because memory for all struct pages
> > > was always zeroed in memblock, and in __free_one_page() page_is_buddy() was
> > > always returning false, thus we never tried to incorrectly remove it from
> > > the list:
> > > 
> > > 837			list_del(&buddy->lru);
> > > 
> > > Now, that memory is not zeroed, page_is_buddy() can return true after kexec
> > > when memory is dirty (unfortunately memset(1) with CONFIG_VM_DEBUG does not
> > > catch this case). And proceed further to incorrectly remove buddy from the
> > > list.
> > 
> > OK, I thought this was a regression from one of the recent patches. So
> > the problem is not new. Why don't we see the same problem during the
> > standard boot?
> 
> Because, I believe, BIOS is zeroing all the memory for us.

I thought you were running with the debugging which poisons all the
allocated memory...
 
> > > This is why we must initialize the computed buddy page beforehand.
> > 
> > Ble, this is really ugly. I will think about it more.
> > 
> 
> Another approach that I considered is to split loop inside
> deferred_init_range() into two loops: one where we initialize pages by
> calling __init_single_page(), another where we free them to buddy allocator
> by calling deferred_free_range().

Yes, that would make much more sense to me.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 14:08           ` Michal Hocko
@ 2017-11-02 14:16             ` Pavel Tatashin
  2017-11-02 14:27               ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2017-11-02 14:16 UTC (permalink / raw)
  To: Michal Hocko
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

>>>> Now, that memory is not zeroed, page_is_buddy() can return true after kexec
>>>> when memory is dirty (unfortunately memset(1) with CONFIG_VM_DEBUG does not
>>>> catch this case). And proceed further to incorrectly remove buddy from the
>>>> list.
>>>
>>> OK, I thought this was a regression from one of the recent patches. So
>>> the problem is not new. Why don't we see the same problem during the
>>> standard boot?
>>
>> Because, I believe, BIOS is zeroing all the memory for us.
> 
> I thought you were running with the debugging which poisons all the
> allocated memory...

Yes, but as I said, unfortunately memset(1) with CONFIG_VM_DEBUG does 
not catch this case. So, when CONFIG_VM_DEBUG is enabled kexec reboots 
without issues.

>   
>>>> This is why we must initialize the computed buddy page beforehand.
>>>
>>> Ble, this is really ugly. I will think about it more.
>>>
>>
>> Another approach that I considered is to split loop inside
>> deferred_init_range() into two loops: one where we initialize pages by
>> calling __init_single_page(), another where we free them to buddy allocator
>> by calling deferred_free_range().
> 
> Yes, that would make much more sense to me.
> 

Ok, so should I submit a new patch with two loops? The logic within the 
loops is going to be the same:

if (!pfn_valid_within(pfn)) {
} else if (!(pfn & nr_pgmask) && !pfn_valid(pfn)) {
} else if (!meminit_pfn_in_nid(pfn, nid, &nid_init_state)) {
} else if (page && (pfn & nr_pgmask)) {

This fix was already added into mm-tree as
mm-deferred_init_memmap-improvements-fix-2.patch

Thank you,
Pasha


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 14:16             ` Pavel Tatashin
@ 2017-11-02 14:27               ` Michal Hocko
  2017-11-02 16:10                 ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2017-11-02 14:27 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

On Thu 02-11-17 10:16:49, Pavel Tatashin wrote:
> > > > > Now, that memory is not zeroed, page_is_buddy() can return true after kexec
> > > > > when memory is dirty (unfortunately memset(1) with CONFIG_VM_DEBUG does not
> > > > > catch this case). And proceed further to incorrectly remove buddy from the
> > > > > list.
> > > > 
> > > > OK, I thought this was a regression from one of the recent patches. So
> > > > the problem is not new. Why don't we see the same problem during the
> > > > standard boot?
> > > 
> > > Because, I believe, BIOS is zeroing all the memory for us.
> > 
> > I thought you were running with the debugging which poisons all the
> > allocated memory...
> 
> Yes, but as I said, unfortunately memset(1) with CONFIG_VM_DEBUG does not
> catch this case. So, when CONFIG_VM_DEBUG is enabled kexec reboots without
> issues.

Can we make the init pattern catch this?

> > > > > This is why we must initialize the computed buddy page beforehand.
> > > > 
> > > > Ble, this is really ugly. I will think about it more.
> > > > 
> > > 
> > > Another approach that I considered is to split loop inside
> > > deferred_init_range() into two loops: one where we initialize pages by
> > > calling __init_single_page(), another where we free them to buddy allocator
> > > by calling deferred_free_range().
> > 
> > Yes, that would make much more sense to me.
> > 
> 
> Ok, so should I submit a new patch with two loops? (The logic within loops
> is going to be the same:

Could you post it please?
 
> if (!pfn_valid_within(pfn)) {
> } else if (!(pfn & nr_pgmask) && !pfn_valid(pfn)) {
> } else if (!meminit_pfn_in_nid(pfn, nid, &nid_init_state)) {
> } else if (page && (pfn & nr_pgmask)) {
> 
> This fix was already added into mm-tree as
> mm-deferred_init_memmap-improvements-fix-2.patch

I think Andrew can drop it and replace by a different patch.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 14:27               ` Michal Hocko
@ 2017-11-02 16:10                 ` Pavel Tatashin
  2017-11-03  8:59                   ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2017-11-02 16:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

>>
>> Yes, but as I said, unfortunately memset(1) with CONFIG_VM_DEBUG does not
>> catch this case. So, when CONFIG_VM_DEBUG is enabled kexec reboots without
>> issues.
> 
> Can we make the init pattern catch this?

Unfortunately, that is not easy: memset() gives us only one byte to play 
with, and using anything else would make CONFIG_VM_DEBUG 
unacceptably slow.

One byte is not enough to produce a pattern that satisfies the 
page_is_buddy() logic; I have tried it. With kexec, however, it is more 
predictable: we use the same memory during boot to allocate the vmemmap, 
and therefore the struct pages look more like "valid" struct pages from 
the previous boot.

> 
>>>>>> This is why we must initialize the computed buddy page beforehand.
>>>>>
>>>>> Ble, this is really ugly. I will think about it more.
>>>>>
>>>>
>>>> Another approach that I considered is to split loop inside
>>>> deferred_init_range() into two loops: one where we initialize pages by
>>>> calling __init_single_page(), another where we free them to buddy allocator
>>>> by calling deferred_free_range().
>>>
>>> Yes, that would make much more sense to me.
>>>
>>
>> Ok, so should I submit a new patch with two loops? (The logic within loops
>> is going to be the same:
> 
> Could you post it please?
>   
>> if (!pfn_valid_within(pfn)) {
>> } else if (!(pfn & nr_pgmask) && !pfn_valid(pfn)) {
>> } else if (!meminit_pfn_in_nid(pfn, nid, &nid_init_state)) {
>> } else if (page && (pfn & nr_pgmask)) {
>>
>> This fix was already added into mm-tree as
>> mm-deferred_init_memmap-improvements-fix-2.patch
> 
> I think Andrew can drop it and replace by a different patch.
> 

The new patch is coming, I will test it on two machines where I observed 
the problem.

Thank you,
Pasha


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-02 16:10                 ` Pavel Tatashin
@ 2017-11-03  8:59                   ` Michal Hocko
  2017-11-03 14:17                     ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2017-11-03  8:59 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, akpm, mgorman, linux-mm, linux-kernel

On Thu 02-11-17 12:10:39, Pavel Tatashin wrote:
> > > 
> > > Yes, but as I said, unfortunately memset(1) with CONFIG_VM_DEBUG does not
> > > catch this case. So, when CONFIG_VM_DEBUG is enabled kexec reboots without
> > > issues.
> > 
> > Can we make the init pattern to catch this?
> 
> Unfortunately, that is not easy: memset() gives us only one byte to play
> with, and if we use something else that will make CONFIG_VM_DEBUG
> unacceptably slow.

Why can't we do something similar to the optimized struct page
initialization and write 8B at a time, filling the size-unaligned
tail 1B at a time?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1 1/1] mm: buddy page accessed before initialized
  2017-11-03  8:59                   ` Michal Hocko
@ 2017-11-03 14:17                     ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2017-11-03 14:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Steve Sistare, Daniel Jordan, Andrew Morton, mgorman,
	Linux Memory Management List, linux-kernel

> Why cannot we do something similar to the optimized struct page
> initialization and write 8B at the time and fill up the size unaligned
> chunk in 1B?

I do not think this is a good idea: memset() on SPARC is slow for
small sizes, which is why we ended up using stores, but that is not the
case on x86, where memset() is well optimized for small sizes. So, I
believe we would see regressions. But even without the regressions,
there are several reasons why I think this is not a good idea:
1. struct page size varies depending on configs. So, in order to create
a pattern that looks like a valid struct page, we would need to figure
out what our struct page size is.
2. The memblock allocator is totally independent of struct pages; it
would be strange to add this dependency. The allocated memory is
also used for page tables and kasan, so we do not really know where
the pattern should start from the allocator's point of view.
3. It would be too easy to break the pattern if something
changes or shifts: struct page changes, vmemmap allocations change, or
anything else.

Overall, I think we have good coverage now: the CONFIG_DEBUG_VM
option tests for totally invalid struct pages, and kexec tests for
struct pages that look like valid ones but are invalid because they
are from the previous instance of the kernel.

Pasha


end of thread, other threads:[~2017-11-03 14:18 UTC | newest]

Thread overview: 12+ messages
2017-10-31 15:50 [PATCH v1 0/1] buddy page accessed before initialized Pavel Tatashin
2017-10-31 15:50 ` [PATCH v1 1/1] mm: " Pavel Tatashin
2017-11-02 13:32   ` Michal Hocko
2017-11-02 13:39     ` Pavel Tatashin
2017-11-02 13:54       ` Michal Hocko
2017-11-02 14:00         ` Pavel Tatashin
2017-11-02 14:08           ` Michal Hocko
2017-11-02 14:16             ` Pavel Tatashin
2017-11-02 14:27               ` Michal Hocko
2017-11-02 16:10                 ` Pavel Tatashin
2017-11-03  8:59                   ` Michal Hocko
2017-11-03 14:17                     ` Pavel Tatashin
