* [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
@ 2018-10-09 15:21 Sergey Dyasli
  2018-10-12 13:40 ` Jan Beulich
  2018-11-07 18:20 ` Andrew Cooper
  0 siblings, 2 replies; 10+ messages in thread
From: Sergey Dyasli @ 2018-10-09 15:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, George Dunlap, Andrew Cooper,
	Julien Grall, Jan Beulich, Boris Ostrovsky

Scrubbing RAM during boot may take a long time on machines with lots
of RAM. Add 'idle' option to bootscrub which marks all pages dirty
initially so they will eventually be scrubbed in idle-loop on every
online CPU.

It's guaranteed that the allocator will return scrubbed pages by doing
eager scrubbing during allocation (unless MEMF_no_scrub was provided).

Use the new 'idle' option as the default one.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
---
v1 --> v2:
- dropped comment about performance
- changed default to 'idle'
- changed type of opt_bootscrub to enum
- restored __initdata for opt_bootscrub
- call parse_bool() first during parsing
- using switch() in heap_init_late()

Note: The message "Scrubbing Free RAM on %d nodes" corresponds to
the similar one in scrub_heap_pages()
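
As a rough illustration of the behaviour described in the commit message,
the toy model below marks every page dirty at init, scrubs from an idle
hook, and scrubs eagerly on allocation unless the caller opts out. All
names (toy_*) and sizes are hypothetical; this is not the Xen allocator,
only a sketch of the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names and sizes here are hypothetical; this is not Xen code. */
#define TOY_PAGE_SIZE 64
#define TOY_NR_PAGES  4
#define TOY_NO_SCRUB  0x1u   /* stands in for MEMF_no_scrub */
#define TOY_STALE     0xaa   /* pretend leftover data from a previous boot */

struct toy_page {
    unsigned char data[TOY_PAGE_SIZE];
    bool need_scrub;   /* page is "dirty": not scrubbed yet */
    bool in_use;
};

static struct toy_page toy_heap[TOY_NR_PAGES];

/* Boot: add pages to the heap marked dirty instead of scrubbing them now. */
void toy_heap_init(void)
{
    for ( size_t i = 0; i < TOY_NR_PAGES; i++ )
    {
        memset(toy_heap[i].data, TOY_STALE, TOY_PAGE_SIZE);
        toy_heap[i].need_scrub = true;
        toy_heap[i].in_use = false;
    }
}

static void toy_scrub(struct toy_page *pg)
{
    memset(pg->data, 0, TOY_PAGE_SIZE);
    pg->need_scrub = false;
}

/* Idle-loop hook: scrub one dirty free page, return false when none is left. */
bool toy_idle_scrub_one(void)
{
    for ( size_t i = 0; i < TOY_NR_PAGES; i++ )
        if ( !toy_heap[i].in_use && toy_heap[i].need_scrub )
        {
            toy_scrub(&toy_heap[i]);
            return true;
        }

    return false;
}

/* Allocation: scrub eagerly if still dirty, unless the caller opted out. */
struct toy_page *toy_alloc(unsigned int flags)
{
    for ( size_t i = 0; i < TOY_NR_PAGES; i++ )
    {
        struct toy_page *pg = &toy_heap[i];

        if ( pg->in_use )
            continue;
        if ( pg->need_scrub && !(flags & TOY_NO_SCRUB) )
            toy_scrub(pg);
        pg->in_use = true;
        return pg;
    }

    return NULL;
}

/* Walk through the three paths once. */
void toy_demo(void)
{
    toy_heap_init();
    assert(toy_alloc(0)->data[0] == 0);                    /* eager scrub */
    assert(toy_alloc(TOY_NO_SCRUB)->data[0] == TOY_STALE); /* caller opted out */
    assert(toy_idle_scrub_one());                          /* background scrub */
}
```

The opt-out path is the one that matters for the rest of this thread: a
caller that skips scrubbing sees whatever was in the page before.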

CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 docs/misc/xen-command-line.markdown |  9 ++++-
 xen/common/page_alloc.c             | 62 +++++++++++++++++++++++++++--
 2 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 1ffd586224..9c15d52bf7 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -227,14 +227,19 @@ that byte `0x12345678` is bad, you would place `badpage=0x12345` on
 Xen's command line.
 
 ### bootscrub
-> `= <boolean>`
+> `= idle | <boolean>`
 
-> Default: `true`
+> Default: `idle`
 
 Scrub free RAM during boot.  This is a safety feature to prevent
 accidentally leaking sensitive VM data into other VMs if Xen crashes
 and reboots.
 
+In `idle` mode, RAM is scrubbed in background on all CPUs during idle-loop
+with a guarantee that memory allocations always provide scrubbed pages.
+This option reduces boot time on machines with a large amount of RAM while
+still providing security benefits.
+
 ### bootscrub\_chunk
 > `= <size>`
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 16e1b0c357..7df5d5c545 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -161,8 +161,42 @@ string_param("badpage", opt_badpage);
 /*
  * no-bootscrub -> Free pages are not zeroed during boot.
  */
-static bool_t opt_bootscrub __initdata = 1;
-boolean_param("bootscrub", opt_bootscrub);
+enum bootscrub_mode {
+    BOOTSCRUB_OFF = 0,
+    BOOTSCRUB_ON,
+    BOOTSCRUB_IDLE,
+};
+static enum bootscrub_mode __initdata opt_bootscrub = BOOTSCRUB_IDLE;
+static int __init parse_bootscrub_param(const char *s)
+{
+    /* Interpret 'bootscrub' alone in its positive boolean form */
+    if ( *s == '\0' )
+    {
+        opt_bootscrub = BOOTSCRUB_ON;
+        return 0;
+    }
+
+    switch ( parse_bool(s, NULL) )
+    {
+    case 0:
+        opt_bootscrub = BOOTSCRUB_OFF;
+        break;
+
+    case 1:
+        opt_bootscrub = BOOTSCRUB_ON;
+        break;
+
+    default:
+        if ( !strcmp(s, "idle") )
+            opt_bootscrub = BOOTSCRUB_IDLE;
+        else
+            return -EINVAL;
+        break;
+    }
+
+    return 0;
+}
+custom_param("bootscrub", parse_bootscrub_param);
 
 /*
  * bootscrub_chunk -> Amount of bytes to scrub lockstep on non-SMT CPUs
@@ -1726,6 +1760,7 @@ static void init_heap_pages(
     struct page_info *pg, unsigned long nr_pages)
 {
     unsigned long i;
+    bool idle_scrub = false;
 
     /*
      * Some pages may not go through the boot allocator (e.g reserved
@@ -1737,6 +1772,9 @@ static void init_heap_pages(
     first_valid_mfn = mfn_min(page_to_mfn(pg), first_valid_mfn);
     spin_unlock(&heap_lock);
 
+    if ( system_state < SYS_STATE_active && opt_bootscrub == BOOTSCRUB_IDLE )
+        idle_scrub = true;
+
     for ( i = 0; i < nr_pages; i++ )
     {
         unsigned int nid = phys_to_nid(page_to_maddr(pg+i));
@@ -1763,7 +1801,7 @@ static void init_heap_pages(
             nr_pages -= n;
         }
 
-        free_heap_pages(pg + i, 0, scrub_debug);
+        free_heap_pages(pg + i, 0, scrub_debug || idle_scrub);
     }
 }
 
@@ -2039,8 +2077,24 @@ void __init heap_init_late(void)
      */
     setup_low_mem_virq();
 
-    if ( opt_bootscrub )
+    switch ( opt_bootscrub )
+    {
+    default:
+        ASSERT_UNREACHABLE();
+        /* Fall through */
+
+    case BOOTSCRUB_IDLE:
+        printk("Scrubbing Free RAM on %d nodes in background\n",
+               num_online_nodes());
+        break;
+
+    case BOOTSCRUB_ON:
         scrub_heap_pages();
+        break;
+
+    case BOOTSCRUB_OFF:
+        break;
+    }
 }
 
 
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-10-09 15:21 [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop Sergey Dyasli
@ 2018-10-12 13:40 ` Jan Beulich
  2018-10-15  7:53   ` Sergey Dyasli
  2018-11-07 18:20 ` Andrew Cooper
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2018-10-12 13:40 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Wei Liu, George Dunlap, Andrew Cooper, xen-devel, Julien Grall,
	Boris Ostrovsky

>>> On 09.10.18 at 17:21, <sergey.dyasli@citrix.com> wrote:
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -161,8 +161,42 @@ string_param("badpage", opt_badpage);
>  /*
>   * no-bootscrub -> Free pages are not zeroed during boot.
>   */
> -static bool_t opt_bootscrub __initdata = 1;
> -boolean_param("bootscrub", opt_bootscrub);
> +enum bootscrub_mode {
> +    BOOTSCRUB_OFF = 0,

The "= 0" is pointless.
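
For context on the remark above: in C, the first enumerator has the value
0 when no initializer is given, and each later one is its predecessor
plus one, so the explicit initializer adds nothing. A minimal check, with
the enum renamed (the sketch_ prefix is hypothetical) to make clear this
is not the patch itself:

```c
#include <assert.h>

/* Hypothetical copy of the patch's enum with the "= 0" dropped. */
enum sketch_bootscrub_mode {
    SKETCH_BOOTSCRUB_OFF,
    SKETCH_BOOTSCRUB_ON,
    SKETCH_BOOTSCRUB_IDLE,
};

/* C11: enumerators without initializers count up from 0. */
static_assert(SKETCH_BOOTSCRUB_OFF == 0, "first enumerator defaults to 0");
static_assert(SKETCH_BOOTSCRUB_ON == 1, "next enumerator is previous + 1");
static_assert(SKETCH_BOOTSCRUB_IDLE == 2, "and so on");
```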

> @@ -2039,8 +2077,24 @@ void __init heap_init_late(void)
>       */
>      setup_low_mem_virq();
>  
> -    if ( opt_bootscrub )
> +    switch ( opt_bootscrub )
> +    {
> +    default:
> +        ASSERT_UNREACHABLE();
> +        /* Fall through */
> +
> +    case BOOTSCRUB_IDLE:
> +        printk("Scrubbing Free RAM on %d nodes in background\n",
> +               num_online_nodes());

Still the question whether this printk(), and in particular the inclusion
of the node count, is meaningful in any way. Other than this
Reviewed-by: Jan Beulich <jbeulich@suse.com>
and one or both changes would be easy enough to make while
committing, provided we can reach agreement.

Jan




* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-10-12 13:40 ` Jan Beulich
@ 2018-10-15  7:53   ` Sergey Dyasli
  0 siblings, 0 replies; 10+ messages in thread
From: Sergey Dyasli @ 2018-10-15  7:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Wei Liu,
	George Dunlap, Andrew Cooper, xen-devel, Julien Grall,
	Boris Ostrovsky

On 12/10/18 14:40, Jan Beulich wrote:
>>>> On 09.10.18 at 17:21, <sergey.dyasli@citrix.com> wrote:
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -161,8 +161,42 @@ string_param("badpage", opt_badpage);
>>  /*
>>   * no-bootscrub -> Free pages are not zeroed during boot.
>>   */
>> -static bool_t opt_bootscrub __initdata = 1;
>> -boolean_param("bootscrub", opt_bootscrub);
>> +enum bootscrub_mode {
>> +    BOOTSCRUB_OFF = 0,
> 
> The "= 0" is pointless.

I don't mind this change.

>> @@ -2039,8 +2077,24 @@ void __init heap_init_late(void)
>>       */
>>      setup_low_mem_virq();
>>  
>> -    if ( opt_bootscrub )
>> +    switch ( opt_bootscrub )
>> +    {
>> +    default:
>> +        ASSERT_UNREACHABLE();
>> +        /* Fall through */
>> +
>> +    case BOOTSCRUB_IDLE:
>> +        printk("Scrubbing Free RAM on %d nodes in background\n",
>> +               num_online_nodes());
> 
> Still the question whether this printk(), and in particular the inclusion
> of the node count, is meaningful in any way. Other than this

I don't have a strong opinion about what this printk() statement should
look like. It can be changed to whatever the maintainers find more
appropriate.

> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> and one or both changes would be easy enough to make while
> committing, provided we can reach agreement.

Thanks,
Sergey


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-10-09 15:21 [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop Sergey Dyasli
  2018-10-12 13:40 ` Jan Beulich
@ 2018-11-07 18:20 ` Andrew Cooper
  2018-11-08  9:05   ` Sergey Dyasli
  2018-11-08 10:31   ` Jan Beulich
  1 sibling, 2 replies; 10+ messages in thread
From: Andrew Cooper @ 2018-11-07 18:20 UTC (permalink / raw)
  To: Sergey Dyasli, xen-devel
  Cc: George Dunlap, Boris Ostrovsky, Wei Liu, Julien Grall, Jan Beulich

On 09/10/18 16:21, Sergey Dyasli wrote:
> Scrubbing RAM during boot may take a long time on machines with lots
> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
> initially so they will eventually be scrubbed in idle-loop on every
> online CPU.
>
> It's guaranteed that the allocator will return scrubbed pages by doing
> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>
> Use the new 'idle' option as the default one.
>
> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>

This patch reliably breaks boot, although it's not immediately obvious how:

(d9) (XEN) mcheck_poll: Machine check polling timer started.
(d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
(d9) (XEN) Dom0 has maximum 400 PIRQs
(d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
(d9) (XEN) CPU:    0
(d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
(d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
(d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
(d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
(d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
(d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
(d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
(d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
(d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
(d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
(d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
(d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
(d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
(d9) (XEN)    0000000000000000 ffff8300c2c2c2c2 ffff82d0804b7ee8 ffff82d080443b7f
(d9) (XEN)    0000000000000000 00000000003f3480 0000000000000002 0000000000000002
(d9) (XEN)    0000000000000002 0000000000000002 0000000000000002 0000000000000001
(d9) (XEN)    0000000000000001 0000000000000003 00000000000feffc 0000000000000000
(d9) (XEN)    00000000000feffd 0000000000000000 0000000000800163 00000000feffd000
(d9) (XEN)    ffff83000045c24b ffffffff00000002 0000000000000001 0000000000000001
(d9) (XEN)    ffff83000048da80 ffff82d08048db00 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000200000004 00000040ffffffff 0000000000000400
(d9) (XEN)    0000000800000000 000000010000006e 0000000000000003 00000000000002f8
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 ffff82d0802000f3
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(d9) (XEN)    0000000000000000 0000000000000000 ffff83003f0ce000 0000000000000000
(d9) (XEN)    00000000001526e0 0000000000000000 0000000000000000 0000060000000000
(d9) (XEN)    0000001800000000
(d9) (XEN) Xen call trace:
(d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
(d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
(d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
(d9) (XEN) 
(d9) (XEN) Pagetable walk from ffff8300c2c2c2c2:
(d9) (XEN)  L4[0x106] = 800000003fea5063 ffffffffffffffff
(d9) (XEN)  L3[0x003] = 000000003fea2063 ffffffffffffffff
(d9) (XEN)  L2[0x016] = 0000000000000000 ffffffffffffffff
(d9) (XEN) 
(d9) (XEN) ****************************************
(d9) (XEN) Panic on CPU 0:
(d9) (XEN) FATAL PAGE FAULT
(d9) (XEN) [error_code=0000]
(d9) (XEN) Faulting linear address: ffff8300c2c2c2c2
(d9) (XEN) ****************************************
(d9) (XEN) 
(d9) (XEN) Reboot in five seconds...

The low part of 0xffff8300c2c2c2c2 looks to be poisoned, so
__va(mod[0].string) is obviously turning out to be junk.

~Andrew


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-11-07 18:20 ` Andrew Cooper
@ 2018-11-08  9:05   ` Sergey Dyasli
  2018-11-08 10:31   ` Jan Beulich
  1 sibling, 0 replies; 10+ messages in thread
From: Sergey Dyasli @ 2018-11-08  9:05 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Wei Liu,
	George Dunlap, Julien Grall, Jan Beulich, Boris Ostrovsky

On 07/11/2018 18:20, Andrew Cooper wrote:
> On 09/10/18 16:21, Sergey Dyasli wrote:
>> Scrubbing RAM during boot may take a long time on machines with lots
>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>> initially so they will eventually be scrubbed in idle-loop on every
>> online CPU.
>>
>> It's guaranteed that the allocator will return scrubbed pages by doing
>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>
>> Use the new 'idle' option as the default one.
>>
>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
> 
> This patch reliably breaks boot, although it's not immediately obvious how:
> 
> (d9) (XEN) mcheck_poll: Machine check polling timer started.
> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
> (d9) (XEN) Dom0 has maximum 400 PIRQs
> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
> (d9) (XEN) CPU:    0
> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
> (d9) (XEN)    0000000000000000 ffff8300c2c2c2c2 ffff82d0804b7ee8 ffff82d080443b7f
> (d9) (XEN)    0000000000000000 00000000003f3480 0000000000000002 0000000000000002
> (d9) (XEN)    0000000000000002 0000000000000002 0000000000000002 0000000000000001
> (d9) (XEN)    0000000000000001 0000000000000003 00000000000feffc 0000000000000000
> (d9) (XEN)    00000000000feffd 0000000000000000 0000000000800163 00000000feffd000
> (d9) (XEN)    ffff83000045c24b ffffffff00000002 0000000000000001 0000000000000001
> (d9) (XEN)    ffff83000048da80 ffff82d08048db00 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000200000004 00000040ffffffff 0000000000000400
> (d9) (XEN)    0000000800000000 000000010000006e 0000000000000003 00000000000002f8
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 ffff82d0802000f3
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (d9) (XEN)    0000000000000000 0000000000000000 ffff83003f0ce000 0000000000000000
> (d9) (XEN)    00000000001526e0 0000000000000000 0000000000000000 0000060000000000
> (d9) (XEN)    0000001800000000
> (d9) (XEN) Xen call trace:
> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
> (d9) (XEN) 
> (d9) (XEN) Pagetable walk from ffff8300c2c2c2c2:
> (d9) (XEN)  L4[0x106] = 800000003fea5063 ffffffffffffffff
> (d9) (XEN)  L3[0x003] = 000000003fea2063 ffffffffffffffff
> (d9) (XEN)  L2[0x016] = 0000000000000000 ffffffffffffffff
> (d9) (XEN) 
> (d9) (XEN) ****************************************
> (d9) (XEN) Panic on CPU 0:
> (d9) (XEN) FATAL PAGE FAULT
> (d9) (XEN) [error_code=0000]
> (d9) (XEN) Faulting linear address: ffff8300c2c2c2c2
> (d9) (XEN) ****************************************
> (d9) (XEN) 
> (d9) (XEN) Reboot in five seconds...
> 
> The low part of 0xffff8300c2c2c2c2 looks to be poisoned, so
> __va(mod[0].string) is obviously turning out to be junk.

0xc2 is a SCRUB_PATTERN, so my patch might have uncovered a real issue.
There are 2 implications of idle scrub:

    1. alloc_xenheap_pages() might return scrubbed memory (despite
       passing MEMF_no_scrub, and after secondary CPUs enter idle-loop)

    2. alloc_domheap_pages() will return scrubbed memory by default
       during Xen boot

What is the exact place of this crash? Maybe zeroing of allocated pages
is needed there? Can you reproduce the issue with a release build, where
the scrub pattern is 0?
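
To illustrate why the faulting address points at scrubbed memory: if a
32-bit module string field is read from a page filled with 0xc2 bytes and
then passed through a direct-map translation, the result matches the
faulting address in the trace. This is a sketch under assumptions: the
direct-map base 0xffff830000000000 is inferred from the addresses in the
crash dump, the field is taken to be 32 bits wide, and sketch_va() is a
hypothetical stand-in for __va().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SCRUB_BYTE     0xc2                  /* debug-build pattern from the thread */
#define SKETCH_DIRECTMAP_BASE 0xffff830000000000ULL /* assumed, inferred from the dump */

/* Hypothetical stand-in for __va(): physical address -> direct-map address. */
uint64_t sketch_va(uint64_t pa)
{
    return SKETCH_DIRECTMAP_BASE + pa;
}

/* A 32-bit field as it would read back from a scrubbed page. */
uint32_t sketch_scrubbed_u32(void)
{
    uint32_t v;

    memset(&v, SKETCH_SCRUB_BYTE, sizeof(v));
    return v;
}

/* The translated value equals the faulting linear address in the trace. */
void sketch_check(void)
{
    assert(sketch_scrubbed_u32() == 0xc2c2c2c2u);
    assert(sketch_va(sketch_scrubbed_u32()) == 0xffff8300c2c2c2c2ULL);
}
```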

--
Thanks,
Sergey


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-11-07 18:20 ` Andrew Cooper
  2018-11-08  9:05   ` Sergey Dyasli
@ 2018-11-08 10:31   ` Jan Beulich
  2018-11-08 11:07     ` Andrew Cooper
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2018-11-08 10:31 UTC (permalink / raw)
  To: Andrew Cooper, Sergey Dyasli
  Cc: George Dunlap, Julien Grall, Wei Liu, Boris Ostrovsky, xen-devel

>>> On 07.11.18 at 19:20, <andrew.cooper3@citrix.com> wrote:
> On 09/10/18 16:21, Sergey Dyasli wrote:
>> Scrubbing RAM during boot may take a long time on machines with lots
>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>> initially so they will eventually be scrubbed in idle-loop on every
>> online CPU.
>>
>> It's guaranteed that the allocator will return scrubbed pages by doing
>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>
>> Use the new 'idle' option as the default one.
>>
>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
> 
> This patch reliably breaks boot, although it's not immediately obvious how:
> 
> (d9) (XEN) mcheck_poll: Machine check polling timer started.
> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
> (d9) (XEN) Dom0 has maximum 400 PIRQs
> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
> (d9) (XEN) CPU:    0
> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>[...]
> (d9) (XEN) Xen call trace:
> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55

That's apparently the 2nd cmdline_cook() invocation, when producing
the Dom0 command line. I would suppose what "loader" points to has
been scrubbed by the time we get there (with synchronous scrubbing,
APs wouldn't be able to get going with this before reaching
heap_init_late()).

Jan




* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-11-08 10:31   ` Jan Beulich
@ 2018-11-08 11:07     ` Andrew Cooper
  2018-11-08 14:48       ` Sergey Dyasli
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2018-11-08 11:07 UTC (permalink / raw)
  To: Jan Beulich, Sergey Dyasli
  Cc: George Dunlap, Julien Grall, Wei Liu, Boris Ostrovsky, xen-devel

On 08/11/18 10:31, Jan Beulich wrote:
>>>> On 07.11.18 at 19:20, <andrew.cooper3@citrix.com> wrote:
>> On 09/10/18 16:21, Sergey Dyasli wrote:
>>> Scrubbing RAM during boot may take a long time on machines with lots
>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>>> initially so they will eventually be scrubbed in idle-loop on every
>>> online CPU.
>>>
>>> It's guaranteed that the allocator will return scrubbed pages by doing
>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>>
>>> Use the new 'idle' option as the default one.
>>>
>>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
>> This patch reliably breaks boot, although it's not immediately obvious how:
>>
>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
>> (d9) (XEN) Dom0 has maximum 400 PIRQs
>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>> (d9) (XEN) CPU:    0
>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>> [...]
>> (d9) (XEN) Xen call trace:
>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
> That's apparently the 2nd cmdline_cook() invocation, when producing
> the Dom0 command line. I would suppose what "loader" points to has
> been scrubbed by the time we get there (with synchronous scrubbing
> APs wouldn't be able to get going with this before reaching
> heap_init_late()).

This is via a PVH boot (like a lot of my development work), and does
look to be a latent use-after-free.  Dropping the VM down to a single
vcpu causes the problem to go away.

Sergey is kindly investigating.

~Andrew


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-11-08 11:07     ` Andrew Cooper
@ 2018-11-08 14:48       ` Sergey Dyasli
  2018-11-08 15:18         ` Roger Pau Monné
  0 siblings, 1 reply; 10+ messages in thread
From: Sergey Dyasli @ 2018-11-08 14:48 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Wei Liu, George Dunlap, xen-devel, Julien Grall, Boris Ostrovsky,
	Roger Pau Monne

(CCing Roger)

On 08/11/2018 11:07, Andrew Cooper wrote:
> On 08/11/18 10:31, Jan Beulich wrote:
>>>>> On 07.11.18 at 19:20, <andrew.cooper3@citrix.com> wrote:
>>> On 09/10/18 16:21, Sergey Dyasli wrote:
>>>> Scrubbing RAM during boot may take a long time on machines with lots
>>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>>>> initially so they will eventually be scrubbed in idle-loop on every
>>>> online CPU.
>>>>
>>>> It's guaranteed that the allocator will return scrubbed pages by doing
>>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>>>
>>>> Use the new 'idle' option as the default one.
>>>>
>>>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
>>> This patch reliably breaks boot, although it's not immediately obvious how:
>>>
>>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
>>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
>>> (d9) (XEN) Dom0 has maximum 400 PIRQs
>>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>>> (d9) (XEN) CPU:    0
>>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
>>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
>>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
>>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
>>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
>>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
>>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
>>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
>>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
>>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
>>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>>> [...]
>>> (d9) (XEN) Xen call trace:
>>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
>>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
>> That's apparently the 2nd cmdline_cook() invocation, when producing
>> the Dom0 command line. I would suppose what "loader" points to has
>> been scrubbed by the time we get there (with synchronous scrubbing
>> APs wouldn't be able to get going with this before reaching
>> heap_init_late()).
> 
> This is via a PVH boot (like a lot of my development work), and does
> look to be a latent use-after-free.  Dropping the VM down to a single
> vcpu causes the problem to go away.
> 
> Sergey is kindly investigating.

Yes, this seems to be a bug in the Xen PVH boot path. From the serial log:

(XEN) == mbi->mods_addr 0x46dce0

which is marked as usable in e820:

(XEN) PVH-e820 RAM map:
(XEN)  0000000000000000 - 00000000000a0000 (usable)
(XEN)  0000000000100000 - 0000000040000400 (usable)
(XEN)  00000000fc000000 - 00000000fc009040 (ACPI data)
(XEN)  00000000feff8000 - 00000000feffc000 (reserved)
(XEN)  00000000feffc000 - 00000000feffd000 (usable)
(XEN)  00000000feffd000 - 00000000ff000000 (reserved)

This memory is then given to the allocator and scrubbed by secondary
CPUs, which leads to a use-after-free. Even with the cmdline issue fixed,
another FATAL PAGE FAULT occurs further down the boot path:

(d16) [183465.829440] (XEN) Xen call trace:
(d16) [183465.829467] (XEN)    [<ffff82d08023d6c5>] memcmp+0x9/0x3a
(d16) [183465.829494] (XEN)    [<ffff82d080436702>] bzimage.c#bzimage_check+0x32/0x71
(d16) [183465.829511] (XEN)    [<ffff82d080436806>] bzimage_parse+0x22/0xba
(d16) [183465.829528] (XEN)    [<ffff82d080431086>] dom0_build.c#pvh_load_kernel+0x82/0x3c0
(d16) [183465.829612] (XEN)    [<ffff82d0804316e0>] dom0_construct_pvh+0x1c9/0x11bf
(d16) [183465.829638] (XEN)    [<ffff82d0804387a6>] construct_dom0+0xd4/0xb0e
(d16) [183465.829655] (XEN)    [<ffff82d0804280cc>] __start_xen+0x2631/0x28b6
(d16) [183465.829682] (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
...
(XEN) Faulting linear address: ffff8f2c2d301202

Looking at mod[0].pa in PVH start info, I suspect that it also gets
overwritten:

(XEN) PVH start info: (pa 0000ffc0)
(XEN)   version:    1
(XEN)   flags:      0
(XEN)   nr_modules: 1
(XEN)   modlist_pa: 000000000000ff70
(XEN)   cmdline_pa: 000000000000ff90
(XEN)   cmdline:    'console=xen,pv dom0=pvh xsm=flask'
(XEN)   rsdp_pa:    00000000fc009000
(XEN)     mod[0].pa:         00000000005b1000
(XEN)     mod[0].size:       0000000004784128
(XEN)     mod[0].cmdline_pa: 0000000000000000

The issue is easily reproduced by running Xen as a PVH guest with the
following config:

type="pvh"

vcpus=2
memory=1024
nestedhvm=1

kernel="/root/xen-syms"
ramdisk="/boot/vmlinuz-4.4.0+10"
cmdline="console=xen,pv dom0=pvh xsm=flask"

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-11-08 14:48       ` Sergey Dyasli
@ 2018-11-08 15:18         ` Roger Pau Monné
  2018-11-09  8:49           ` Sergey Dyasli
  0 siblings, 1 reply; 10+ messages in thread
From: Roger Pau Monné @ 2018-11-08 15:18 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Wei Liu, George Dunlap, Andrew Cooper, xen-devel, Julien Grall,
	Jan Beulich, Boris Ostrovsky

On Thu, Nov 08, 2018 at 02:48:40PM +0000, Sergey Dyasli wrote:
> (CCing Roger)
> 
> On 08/11/2018 11:07, Andrew Cooper wrote:
> > On 08/11/18 10:31, Jan Beulich wrote:
> >>>>> On 07.11.18 at 19:20, <andrew.cooper3@citrix.com> wrote:
> >>> On 09/10/18 16:21, Sergey Dyasli wrote:
> >>>> Scrubbing RAM during boot may take a long time on machines with lots
> >>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
> >>>> initially so they will eventually be scrubbed in idle-loop on every
> >>>> online CPU.
> >>>>
> >>>> It's guaranteed that the allocator will return scrubbed pages by doing
> >>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
> >>>>
> >>>> Use the new 'idle' option as the default one.
> >>>>
> >>>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
> >>> This patch reliably breaks boot, although it's not immediately obvious how:
> >>>
> >>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
> >>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
> >>> (d9) (XEN) Dom0 has maximum 400 PIRQs
> >>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
> >>> (d9) (XEN) CPU:    0
> >>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
> >>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
> >>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
> >>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
> >>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
> >>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
> >>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
> >>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
> >>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
> >>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
> >>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> >>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
> >>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
> >>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
> >>> [...]
> >>> (d9) (XEN) Xen call trace:
> >>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
> >>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
> >>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
> >> That's apparently the 2nd cmdline_cook() invocation, when producing
> >> the Dom0 command line. I would suppose what "loader" points to has
> >> been scrubbed by the time we get there (with synchronous scrubbing
> >> APs wouldn't be able to get going with this before reaching
> >> heap_init_late()).
> > 
> > This is via a PVH boot (like a lot of my development work), and does
> > look to be a latent use-after-free.  Dropping the VM down to a single
> > vcpu causes the problem to go away.
> > 
> > Sergey is kindly investigating.
> 
> Yes, this seems to be a bug in Xen PVH boot path. From the serial:
> 
> (XEN) == mbi->mods_addr 0x46dce0
> 
> which is marked as usable in e820:
> 
> (XEN) PVH-e820 RAM map:
> (XEN)  0000000000000000 - 00000000000a0000 (usable)
> (XEN)  0000000000100000 - 0000000040000400 (usable)
> (XEN)  00000000fc000000 - 00000000fc009040 (ACPI data)
> (XEN)  00000000feff8000 - 00000000feffc000 (reserved)
> (XEN)  00000000feffc000 - 00000000feffd000 (usable)
> (XEN)  00000000feffd000 - 00000000ff000000 (reserved)
> 
> This memory is then given to the allocator and scrubbed by secondary
> CPUs which leads to use-after-free. Even with fixing the cmdline issue,
> another FATAL PAGE FAULT occurs further down the boot path:

Right, shouldn't the scrub be started after Dom0 has been constructed?
I would say the scrubbing should be started at the same time as
before, which is just before jumping into Dom0 entry point IIRC?

Roger.


* Re: [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop
  2018-11-08 15:18         ` Roger Pau Monné
@ 2018-11-09  8:49           ` Sergey Dyasli
  0 siblings, 0 replies; 10+ messages in thread
From: Sergey Dyasli @ 2018-11-09  8:49 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Sergey Dyasli, Wei Liu, George Dunlap, Andrew Cooper, xen-devel,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 08/11/2018 15:18, Roger Pau Monné wrote:
> On Thu, Nov 08, 2018 at 02:48:40PM +0000, Sergey Dyasli wrote:
>> (CCing Roger)
>>
>> On 08/11/2018 11:07, Andrew Cooper wrote:
>>> On 08/11/18 10:31, Jan Beulich wrote:
>>>>>>> On 07.11.18 at 19:20, <andrew.cooper3@citrix.com> wrote:
>>>>> On 09/10/18 16:21, Sergey Dyasli wrote:
>>>>>> Scrubbing RAM during boot may take a long time on machines with lots
>>>>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>>>>>> initially so they will eventually be scrubbed in idle-loop on every
>>>>>> online CPU.
>>>>>>
>>>>>> It's guaranteed that the allocator will return scrubbed pages by doing
>>>>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>>>>>
>>>>>> Use the new 'idle' option as the default one.
>>>>>>
>>>>>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
>>>>> This patch reliably breaks boot, although it's not immediately obvious how:
>>>>>
>>>>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
>>>>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
>>>>> (d9) (XEN) Dom0 has maximum 400 PIRQs
>>>>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>>>>> (d9) (XEN) CPU:    0
>>>>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>>>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
>>>>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
>>>>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
>>>>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
>>>>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
>>>>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
>>>>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
>>>>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
>>>>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>>>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>>>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
>>>>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
>>>>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>>>>> [...]
>>>>> (d9) (XEN) Xen call trace:
>>>>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>>>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
>>>>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
>>>> That's apparently the 2nd cmdline_cook() invocation, when producing
>>>> the Dom0 command line. I would suppose what "loader" points to has
>>>> been scrubbed by the time we get there (with synchronous scrubbing
>>>> APs wouldn't be able to get going with this before reaching
>>>> heap_init_late()).
>>>
>>> This is via a PVH boot (like a lot of my development work), and does
>>> look to be a latent use-after-free.  Dropping the VM down to a single
>>> vcpu causes the problem to go away.
>>>
>>> Sergey is kindly investigating.
>>
>> Yes, this seems to be a bug in Xen PVH boot path. From the serial:
>>
>> (XEN) == mbi->mods_addr 0x46dce0
>>
>> which is marked as usable in e820:
>>
>> (XEN) PVH-e820 RAM map:
>> (XEN)  0000000000000000 - 00000000000a0000 (usable)
>> (XEN)  0000000000100000 - 0000000040000400 (usable)
>> (XEN)  00000000fc000000 - 00000000fc009040 (ACPI data)
>> (XEN)  00000000feff8000 - 00000000feffc000 (reserved)
>> (XEN)  00000000feffc000 - 00000000feffd000 (usable)
>> (XEN)  00000000feffd000 - 00000000ff000000 (reserved)
>>
>> This memory is then given to the allocator and scrubbed by secondary
>> CPUs which leads to use-after-free. Even with fixing the cmdline issue,
>> another FATAL PAGE FAULT occurs further down the boot path:
> 
> Right, shouldn't the scrub be started after Dom0 has been constructed?
> I would say the scrubbing should be started at the same time as
> before, which is just before jumping into Dom0 entry point IIRC?

No, this would only mask the issue again. Although it is unlikely, the
memory holding the modules might be handed out by the allocator to
another user while the modules are still needed, which can lead to
silent memory corruption. Modules are supposed to be freed by
discard_initial_images(), which is already called by pvh_load_kernel().
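
As an illustration of the ordering argued for above, here is a toy
page allocator (all names hypothetical, nothing here is Xen code) in
which the pages holding boot modules are withheld from the free pool
at init time and only released by an explicit discard step. It is a
sketch of the idea under those assumptions, not of how page_alloc.c
is structured.

```c
#include <stddef.h>

#define TOY_NR_PAGES 16

enum toy_page_state { TOY_PAGE_FREE, TOY_PAGE_IN_USE };

struct toy_heap {
    enum toy_page_state state[TOY_NR_PAGES];
};

/*
 * Give every page to the allocator except [mod_start, mod_end),
 * which still holds boot modules and must not be handed out.
 */
static void toy_heap_init(struct toy_heap *h, size_t mod_start, size_t mod_end)
{
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        h->state[i] = (i >= mod_start && i < mod_end) ? TOY_PAGE_IN_USE
                                                      : TOY_PAGE_FREE;
}

/* Return the index of the first free page, or -1 if none is left. */
static int toy_alloc_page(struct toy_heap *h)
{
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        if (h->state[i] == TOY_PAGE_FREE) {
            h->state[i] = TOY_PAGE_IN_USE;
            return (int)i;
        }

    return -1;
}

/*
 * Stand-in for the discard step: only once the modules have been
 * consumed do their pages become available to the allocator.
 */
static void toy_discard_modules(struct toy_heap *h, size_t mod_start,
                                size_t mod_end)
{
    for (size_t i = mod_start; i < mod_end && i < TOY_NR_PAGES; i++)
        h->state[i] = TOY_PAGE_FREE;
}
```

With modules in pages 0-3, toy_alloc_page() cannot return any of
those pages until toy_discard_modules() has run, regardless of when
scrubbing starts. Delaying the scrub alone would not give that
guarantee.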

--
Sergey


end of thread, other threads:[~2018-11-09  8:49 UTC | newest]

Thread overview: 10+ messages
2018-10-09 15:21 [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop Sergey Dyasli
2018-10-12 13:40 ` Jan Beulich
2018-10-15  7:53   ` Sergey Dyasli
2018-11-07 18:20 ` Andrew Cooper
2018-11-08  9:05   ` Sergey Dyasli
2018-11-08 10:31   ` Jan Beulich
2018-11-08 11:07     ` Andrew Cooper
2018-11-08 14:48       ` Sergey Dyasli
2018-11-08 15:18         ` Roger Pau Monné
2018-11-09  8:49           ` Sergey Dyasli
