* [PATCH v3] xen: avoid crash in disable_hotplug_cpu
@ 2018-09-07  6:30 Olaf Hering
  2018-09-07  7:48 ` Juergen Gross
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Olaf Hering @ 2018-09-07  6:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Boris Ostrovsky, Juergen Gross, open list

The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
RIP: e030:device_offline+0x9/0xb0
Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
Call Trace:
 handle_vcpu_hotplug_event+0xb5/0xc0
 xenwatch_thread+0x80/0x140
 ? wait_woken+0x80/0x80
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x40/0x40
 ret_from_fork+0x3a/0x50

This happens because handle_vcpu_hotplug_event is called twice. In the
first iteration cpu_present is still true, in the second iteration
cpu_present is false which causes get_cpu_device to return NULL.
In case of cpu#0, cpu_online is apparently always true.

Fix this crash by checking if the cpu can be hotplugged, which is false
for a cpu that was just removed.

Also check if the cpu was actually offlined by device_remove, otherwise
leave the cpu_present state as it is.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 drivers/xen/cpu_hotplug.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index d4265c8ebb22..68f9f663da08 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -19,11 +19,15 @@ static void enable_hotplug_cpu(int cpu)
 
 static void disable_hotplug_cpu(int cpu)
 {
+	if (!cpu_is_hotpluggable(cpu))
+		return;
 	if (cpu_online(cpu)) {
 		lock_device_hotplug();
 		device_offline(get_cpu_device(cpu));
 		unlock_device_hotplug();
 	}
+	if (cpu_online(cpu))
+		return;
 	if (cpu_present(cpu))
 		xen_arch_unregister_cpu(cpu);
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread
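
The failure sequence described in the patch above can be replayed in a
small user-space model. This is only a sketch under stated assumptions,
not the kernel code: struct sim_cpu, sim_disable() and sim_two_events()
are hypothetical names, the has_device flag stands in for both
get_cpu_device() returning non-NULL and cpu_is_hotpluggable(), cpu#0 is
modelled as a CPU that never goes offline, and the clearing of the
present bit at the end of the function is assumed, since the diff
context does not show it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one CPU; not the kernel's data. */
struct sim_cpu {
	bool online;       /* models cpu_online(cpu) */
	bool present;      /* models cpu_present(cpu) */
	bool has_device;   /* models get_cpu_device(cpu) != NULL and,
	                    * as a simplification, cpu_is_hotpluggable(cpu) */
	bool can_offline;  /* false for the modelled cpu#0 */
};

enum sim_result { SIM_DONE, SIM_SKIPPED, SIM_NULL_DEREF };

/* Models disable_hotplug_cpu(), without (patched == false) or with
 * (patched == true) the two checks added by the patch. */
static enum sim_result sim_disable(struct sim_cpu *c, bool patched)
{
	if (patched && !c->has_device)
		return SIM_SKIPPED;
	if (c->online) {
		if (!c->has_device)
			return SIM_NULL_DEREF;  /* device_offline(NULL) */
		if (c->can_offline)
			c->online = false;
	}
	if (patched && c->online)
		return SIM_SKIPPED;
	if (c->present)
		c->has_device = false;  /* models xen_arch_unregister_cpu() */
	c->present = false;             /* assumed present-bit clearing */
	return SIM_DONE;
}

/* Replays two hotplug events against the modelled cpu#0 and returns
 * the outcome of the second one. */
static enum sim_result sim_two_events(bool patched)
{
	struct sim_cpu cpu0 = {
		.online = true, .present = true,
		.has_device = true, .can_offline = false,
	};
	enum sim_result first = sim_disable(&cpu0, patched);

	assert(first != SIM_NULL_DEREF);
	return sim_disable(&cpu0, patched);
}
```

In this model sim_two_events(false) ends in SIM_NULL_DEREF on the second
event, while sim_two_events(true) returns SIM_SKIPPED because the CPU is
still online and its state is left untouched.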

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-07  6:30 [PATCH v3] xen: avoid crash in disable_hotplug_cpu Olaf Hering
@ 2018-09-07  7:48 ` Juergen Gross
  2018-09-07  9:24   ` Olaf Hering
  2018-09-07  9:24   ` Olaf Hering
  2018-09-07  7:48 ` Juergen Gross
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Juergen Gross @ 2018-09-07  7:48 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Boris Ostrovsky, open list

On 07/09/18 08:30, Olaf Hering wrote:
> The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
> Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
> RIP: e030:device_offline+0x9/0xb0
> Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
> RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
> R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
> FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
> Call Trace:
>  handle_vcpu_hotplug_event+0xb5/0xc0
>  xenwatch_thread+0x80/0x140
>  ? wait_woken+0x80/0x80
>  kthread+0x112/0x130
>  ? kthread_create_worker_on_cpu+0x40/0x40
>  ret_from_fork+0x3a/0x50
> 
> This happens because handle_vcpu_hotplug_event is called twice. In the
> first iteration cpu_present is still true, in the second iteration
> cpu_present is false which causes get_cpu_device to return NULL.
> In case of cpu#0, cpu_online is apparently always true.
> 
> Fix this crash by checking if the cpu can be hotplugged, which is false
> for a cpu that was just removed.
> 
> Also check if the cpu was actually offlined by device_remove, otherwise
> leave the cpu_present state as it is.
> 
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>  drivers/xen/cpu_hotplug.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
> index d4265c8ebb22..68f9f663da08 100644
> --- a/drivers/xen/cpu_hotplug.c
> +++ b/drivers/xen/cpu_hotplug.c
> @@ -19,11 +19,15 @@ static void enable_hotplug_cpu(int cpu)
>  
>  static void disable_hotplug_cpu(int cpu)
>  {
> +	if (!cpu_is_hotpluggable(cpu))
> +		return;
>  	if (cpu_online(cpu)) {
>  		lock_device_hotplug();
>  		device_offline(get_cpu_device(cpu));
>  		unlock_device_hotplug();
>  	}
> +	if (cpu_online(cpu))
> +		return;
>  	if (cpu_present(cpu))
>  		xen_arch_unregister_cpu(cpu);

Could you merge the two if conditions?

if (!cpu_online(cpu) && cpu_present(cpu))
	xen_arch_unregister_cpu(cpu);

And while not really important, as we are called in the xenstore watch
thread only, it might be a good idea to move the first cpu_online()
test into the lock, i.e.:

lock_device_hotplug();
if (cpu_online(cpu))
	device_offline(get_cpu_device(cpu));
unlock_device_hotplug();

This will make the code robust against reentry.


Juergen

^ permalink raw reply	[flat|nested] 18+ messages in thread
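
Moving the online test inside the lock, as suggested in the reply above,
can be illustrated with a user-space sketch. The assumptions are: a
pthread mutex stands in for the device hotplug lock, and struct
model_cpu, model_device_offline() and model_offline_cpu() are
hypothetical names. The model treats an offline request for an already
offline CPU as a bug, which is stricter than the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model of one CPU's state. */
struct model_cpu {
	bool online;
	int offline_calls;  /* how often the modelled offline step ran */
};

/* Stands in for the kernel's device hotplug lock. */
static pthread_mutex_t model_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models device_offline(); in this model it may only run on a CPU
 * that is still online. */
static void model_device_offline(struct model_cpu *c)
{
	assert(c->online);
	c->online = false;
	c->offline_calls++;
}

/* The online test is made while the lock is held, so a caller that
 * had to wait for the lock sees the updated state and does nothing. */
static void model_offline_cpu(struct model_cpu *c)
{
	pthread_mutex_lock(&model_hotplug_lock);
	if (c->online)
		model_device_offline(c);
	pthread_mutex_unlock(&model_hotplug_lock);
}
```

Calling model_offline_cpu() twice on the same online CPU runs the
offline step once; the second call finds the CPU offline and only takes
and releases the lock.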

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-07  7:48 ` Juergen Gross
  2018-09-07  9:24   ` Olaf Hering
@ 2018-09-07  9:24   ` Olaf Hering
  2018-09-07  9:50     ` Juergen Gross
  2018-09-07  9:50     ` Juergen Gross
  1 sibling, 2 replies; 18+ messages in thread
From: Olaf Hering @ 2018-09-07  9:24 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, Boris Ostrovsky, open list

On Fri, 7 Sep 2018 09:48:28 +0200,
Juergen Gross <jgross@suse.com> wrote:

> On 07/09/18 08:30, Olaf Hering wrote:
> > +	if (cpu_online(cpu))
> > +		return;
> >  	if (cpu_present(cpu))
> >  		xen_arch_unregister_cpu(cpu);  

> Could you merge the two if conditions?
> if (!cpu_online(cpu) && cpu_present(cpu))
> 	xen_arch_unregister_cpu(cpu);

Is that any different, beside being wrong, from what the patch actually does?
It would still clear the present bit later on.

Olaf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-07  9:24   ` Olaf Hering
@ 2018-09-07  9:50     ` Juergen Gross
  2018-09-07  9:50     ` Juergen Gross
  1 sibling, 0 replies; 18+ messages in thread
From: Juergen Gross @ 2018-09-07  9:50 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Boris Ostrovsky, open list

On 07/09/18 11:24, Olaf Hering wrote:
> On Fri, 7 Sep 2018 09:48:28 +0200,
> Juergen Gross <jgross@suse.com> wrote:
> 
>> On 07/09/18 08:30, Olaf Hering wrote:
>>> +	if (cpu_online(cpu))
>>> +		return;
>>>  	if (cpu_present(cpu))
>>>  		xen_arch_unregister_cpu(cpu);  
> 
>> Could you merge the two if conditions?
>> if (!cpu_online(cpu) && cpu_present(cpu))
>> 	xen_arch_unregister_cpu(cpu);
> 
> Is that any different, beside being wrong, from what the patch actually does?

No. Just a matter of style.

> It would still clear the present bit later on.

This should be part of the if clause, of course.


Juergen

^ permalink raw reply	[flat|nested] 18+ messages in thread
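
The point settled in this exchange can be checked by enumerating every
combination of the online and present bits in a user-space model. This
is a sketch under stated assumptions: struct cpu_bits and the three
*_form() functions are hypothetical names, the registered flag models
the effect of xen_arch_unregister_cpu(), and the unconditional clearing
of the present bit after the if statement is taken from the discussion
rather than from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by disable_hotplug_cpu(). */
struct cpu_bits {
	bool online;
	bool present;
	bool registered;
};

/* Early return as in the patch, then the present bit is cleared. */
static struct cpu_bits early_return_form(struct cpu_bits s)
{
	if (s.online)
		return s;
	if (s.present)
		s.registered = false;  /* models xen_arch_unregister_cpu() */
	s.present = false;
	return s;
}

/* Merged condition only: the clearing still runs for an online CPU. */
static struct cpu_bits merged_only_form(struct cpu_bits s)
{
	if (!s.online && s.present)
		s.registered = false;
	s.present = false;
	return s;
}

/* Merged condition with the clearing moved into the if clause. */
static struct cpu_bits merged_with_clear_form(struct cpu_bits s)
{
	if (!s.online && s.present) {
		s.registered = false;
		s.present = false;
	}
	return s;
}

static bool same_bits(struct cpu_bits a, struct cpu_bits b)
{
	return a.online == b.online && a.present == b.present &&
	       a.registered == b.registered;
}

/* Counts the (online, present) start states on which f and g differ. */
static int count_mismatches(struct cpu_bits (*f)(struct cpu_bits),
			    struct cpu_bits (*g)(struct cpu_bits))
{
	int mismatches = 0;

	for (int online = 0; online <= 1; online++) {
		for (int present = 0; present <= 1; present++) {
			struct cpu_bits s = { online != 0, present != 0, true };

			if (!same_bits(f(s), g(s)))
				mismatches++;
		}
	}
	return mismatches;
}

/* The merged form matches the patch only once the clearing is part of
 * the if clause. */
static void check_forms(void)
{
	assert(count_mismatches(early_return_form, merged_only_form) == 1);
	assert(count_mismatches(early_return_form, merged_with_clear_form) == 0);
}
```

The single mismatch in the first comparison is the state where the CPU
is both online and present: the merged condition alone would still
clear the present bit there.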

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-07  6:30 [PATCH v3] xen: avoid crash in disable_hotplug_cpu Olaf Hering
                   ` (2 preceding siblings ...)
  2018-09-07 13:33 ` Boris Ostrovsky
@ 2018-09-07 13:33 ` Boris Ostrovsky
  2018-09-07 13:39   ` Olaf Hering
  2018-09-07 13:39   ` Olaf Hering
  2018-09-11 11:25 ` Juergen Gross
  2018-09-11 11:25 ` Juergen Gross
  5 siblings, 2 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2018-09-07 13:33 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Juergen Gross, open list

On 09/07/2018 02:30 AM, Olaf Hering wrote:
> The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
> Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
> RIP: e030:device_offline+0x9/0xb0
> Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
> RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
> R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
> FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
> Call Trace:
>  handle_vcpu_hotplug_event+0xb5/0xc0
>  xenwatch_thread+0x80/0x140
>  ? wait_woken+0x80/0x80
>  kthread+0x112/0x130
>  ? kthread_create_worker_on_cpu+0x40/0x40
>  ret_from_fork+0x3a/0x50
>
> This happens because handle_vcpu_hotplug_event is called twice. In the
> first iteration cpu_present is still true, in the second iteration
> cpu_present is false which causes get_cpu_device to return NULL.
> In case of cpu#0, cpu_online is apparently always true.
>
> Fix this crash by checking if the cpu can be hotplugged, which is false
> for a cpu that was just removed.
>
> Also check if the cpu was actually offlined by device_remove, otherwise
> leave the cpu_present state as it is.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>  drivers/xen/cpu_hotplug.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
> index d4265c8ebb22..68f9f663da08 100644
> --- a/drivers/xen/cpu_hotplug.c
> +++ b/drivers/xen/cpu_hotplug.c
> @@ -19,11 +19,15 @@ static void enable_hotplug_cpu(int cpu)
>  
>  static void disable_hotplug_cpu(int cpu)
>  {
> +	if (!cpu_is_hotpluggable(cpu))
> +		return;
>  	if (cpu_online(cpu)) {
>  		lock_device_hotplug();
>  		device_offline(get_cpu_device(cpu));
>  		unlock_device_hotplug();
>  	}
> +	if (cpu_online(cpu))
> +		return;
>  	if (cpu_present(cpu))
>  		xen_arch_unregister_cpu(cpu);
>  


You don't get a warning in xen_init_lock_cpu() for CPU0?

Also, as a side question --- what is the purpose of 'xl vcpu-set <domid> 0'?

-boris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-07 13:33 ` Boris Ostrovsky
  2018-09-07 13:39   ` Olaf Hering
@ 2018-09-07 13:39   ` Olaf Hering
  1 sibling, 0 replies; 18+ messages in thread
From: Olaf Hering @ 2018-09-07 13:39 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: xen-devel, Juergen Gross, open list

On Fri, 7 Sep 2018 09:33:20 -0400,
Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:

> what is the purpose of 'xl vcpu-set <domid> 0'?

Likely just a script that went wrong. But that command should not break the dom0.

Olaf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-07  6:30 [PATCH v3] xen: avoid crash in disable_hotplug_cpu Olaf Hering
                   ` (3 preceding siblings ...)
  2018-09-07 13:33 ` Boris Ostrovsky
@ 2018-09-11 11:25 ` Juergen Gross
  2018-09-11 14:29   ` Boris Ostrovsky
  2018-09-11 14:29   ` Boris Ostrovsky
  2018-09-11 11:25 ` Juergen Gross
  5 siblings, 2 replies; 18+ messages in thread
From: Juergen Gross @ 2018-09-11 11:25 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Boris Ostrovsky, open list

On 07/09/18 08:30, Olaf Hering wrote:
> The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
> Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
> RIP: e030:device_offline+0x9/0xb0
> Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
> RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
> R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
> FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
> Call Trace:
>  handle_vcpu_hotplug_event+0xb5/0xc0
>  xenwatch_thread+0x80/0x140
>  ? wait_woken+0x80/0x80
>  kthread+0x112/0x130
>  ? kthread_create_worker_on_cpu+0x40/0x40
>  ret_from_fork+0x3a/0x50
> 
> This happens because handle_vcpu_hotplug_event is called twice. In the
> first iteration cpu_present is still true, in the second iteration
> cpu_present is false which causes get_cpu_device to return NULL.
> In case of cpu#0, cpu_online is apparently always true.
> 
> Fix this crash by checking if the cpu can be hotplugged, which is false
> for a cpu that was just removed.
> 
> Also check if the cpu was actually offlined by device_remove, otherwise
> leave the cpu_present state as it is.
> 
> Signed-off-by: Olaf Hering <olaf@aepfle.de>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-11 11:25 ` Juergen Gross
  2018-09-11 14:29   ` Boris Ostrovsky
@ 2018-09-11 14:29   ` Boris Ostrovsky
  2018-09-11 14:29     ` Juergen Gross
  2018-09-11 14:29     ` Juergen Gross
  1 sibling, 2 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2018-09-11 14:29 UTC (permalink / raw)
  To: Juergen Gross, Olaf Hering, xen-devel; +Cc: open list

On 9/11/18 7:25 AM, Juergen Gross wrote:
>
> Reviewed-by: Juergen Gross <jgross@suse.com>

So is it v3 or v4, you gave R-b for both. (I slightly prefer v3, but
either is fine).

-boris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] xen: avoid crash in disable_hotplug_cpu
  2018-09-11 14:29   ` Boris Ostrovsky
  2018-09-11 14:29     ` Juergen Gross
@ 2018-09-11 14:29     ` Juergen Gross
  1 sibling, 0 replies; 18+ messages in thread
From: Juergen Gross @ 2018-09-11 14:29 UTC (permalink / raw)
  To: Boris Ostrovsky, Olaf Hering, xen-devel; +Cc: open list

On 11/09/18 16:29, Boris Ostrovsky wrote:
> On 9/11/18 7:25 AM, Juergen Gross wrote:
>>
>> Reviewed-by: Juergen Gross <jgross@suse.com>
> 
> So is it v3 or v4, you gave R-b for both. (I slightly prefer v3, but
> either is fine).

V4 is better as it does the online test only after acquiring the
hotplug lock.


Juergen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3] xen: avoid crash in disable_hotplug_cpu
@ 2018-09-07  6:30 Olaf Hering
  0 siblings, 0 replies; 18+ messages in thread
From: Olaf Hering @ 2018-09-07  6:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Olaf Hering, Boris Ostrovsky, open list

The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
RIP: e030:device_offline+0x9/0xb0
Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
Call Trace:
 handle_vcpu_hotplug_event+0xb5/0xc0
 xenwatch_thread+0x80/0x140
 ? wait_woken+0x80/0x80
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x40/0x40
 ret_from_fork+0x3a/0x50

This happens because handle_vcpu_hotplug_event is called twice. In the
first call cpu_present is still true; in the second call cpu_present is
false, which causes get_cpu_device to return NULL.
In case of cpu#0, cpu_online is apparently always true.

Fix this crash by checking if the cpu can be hotplugged, which is false
for a cpu that was just removed.

Also check if the cpu was actually offlined by device_offline, otherwise
leave the cpu_present state as it is.
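
The failure sequence and the effect of the two added checks can be
reproduced with a small user-space model. This is a sketch under stated
assumptions, not kernel code: all names are hypothetical stand-ins, the
model assumes that unregistering a cpu is what makes the device lookup
return NULL, and it assumes the unchanged tail of the function (not shown
in the hunk) clears the present bit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu state; not the kernel's data structures. */
struct cpu_model {
    bool online;
    bool present;
    bool registered;    /* a device exists for this cpu */
    bool hotpluggable;  /* the device allows offlining */
};

static struct cpu_model cpu0;
static int null_offline_calls;  /* each one would be an oops in the kernel */

static struct cpu_model *model_get_cpu_device(void)
{
    return cpu0.registered ? &cpu0 : NULL;
}

static bool model_cpu_is_hotpluggable(void)
{
    struct cpu_model *dev = model_get_cpu_device();
    return dev && dev->hotpluggable;
}

static void model_device_offline(struct cpu_model *dev)
{
    if (!dev) {                 /* the NULL dereference from the trace */
        null_offline_calls++;
        return;
    }
    if (dev->hotpluggable)      /* a non-hotpluggable cpu stays online */
        dev->online = false;
}

static void model_disable_hotplug_cpu(bool with_fix)
{
    if (with_fix && !model_cpu_is_hotpluggable())
        return;                           /* first added check */
    if (cpu0.online)
        model_device_offline(model_get_cpu_device());
    if (with_fix && cpu0.online)
        return;                           /* second added check */
    if (cpu0.present)
        cpu0.registered = false;          /* models unregistering the cpu */
    cpu0.present = false;                 /* assumed tail of the function */
}

/* Deliver the hotplug event twice; return how many NULL offlines occurred. */
static int run_twice(bool hotpluggable, bool with_fix)
{
    cpu0 = (struct cpu_model){ true, true, true, hotpluggable };
    null_offline_calls = 0;
    model_disable_hotplug_cpu(with_fix);
    model_disable_hotplug_cpu(with_fix);
    return null_offline_calls;
}
```

In this model run_twice(false, false) reports one NULL offline, matching
the second-call crash described above, while run_twice(false, true)
reports none and leaves the cpu present.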

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 drivers/xen/cpu_hotplug.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index d4265c8ebb22..68f9f663da08 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -19,11 +19,15 @@ static void enable_hotplug_cpu(int cpu)
 
 static void disable_hotplug_cpu(int cpu)
 {
+	if (!cpu_is_hotpluggable(cpu))
+		return;
 	if (cpu_online(cpu)) {
 		lock_device_hotplug();
 		device_offline(get_cpu_device(cpu));
 		unlock_device_hotplug();
 	}
+	if (cpu_online(cpu))
+		return;
 	if (cpu_present(cpu))
 		xen_arch_unregister_cpu(cpu);
 

