* [PATCH v1] xen: avoid crash in disable_hotplug_cpu
@ 2018-09-05 10:40 ` Olaf Hering
  0 siblings, 0 replies; 14+ messages in thread
From: Olaf Hering @ 2018-09-05 10:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Olaf Hering, Boris Ostrovsky, Juergen Gross, open list

The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
RIP: e030:device_offline+0x9/0xb0
Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
Call Trace:
 handle_vcpu_hotplug_event+0xb5/0xc0
 xenwatch_thread+0x80/0x140
 ? wait_woken+0x80/0x80
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x40/0x40
 ret_from_fork+0x3a/0x50

Fix this crash by checking the return value of get_cpu_device().

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 drivers/xen/cpu_hotplug.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index d4265c8ebb22..182be49f5a0b 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -19,9 +19,13 @@ static void enable_hotplug_cpu(int cpu)
 
 static void disable_hotplug_cpu(int cpu)
 {
+	struct device *dev;
+
 	if (cpu_online(cpu)) {
 		lock_device_hotplug();
-		device_offline(get_cpu_device(cpu));
+		dev = get_cpu_device(cpu);
+		if (dev)
+			device_offline(dev);
 		unlock_device_hotplug();
 	}
 	if (cpu_present(cpu))

^ permalink raw reply related	[flat|nested] 14+ messages in thread
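
The patch above can be illustrated with a small userspace sketch. The faulting instruction bytes in the oops (`f6 87 d8 02 00 00 01`) decode to a byte test at offset 0x2d8 from RDI, and RDI is 0, which is consistent with device_offline() reading a field through a NULL `struct device *`. Everything below is a mock under stated assumptions: the `mock_*` names, the four-CPU table, and the choice of CPU 0 as the one without a registered device are hypothetical and are not kernel code.

```c
#include <stddef.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of this is the kernel's implementation. */
struct device {
	bool offline;
};

#define NR_MOCK_CPUS 4

static struct device mock_devs[NR_MOCK_CPUS];
/* registered[i] == false models get_cpu_device() returning NULL (assumption). */
static bool registered[NR_MOCK_CPUS] = { false, true, true, true };
static bool online[NR_MOCK_CPUS] = { true, true, true, true };

static struct device *mock_get_cpu_device(int cpu)
{
	if (cpu < 0 || cpu >= NR_MOCK_CPUS || !registered[cpu])
		return NULL;
	return &mock_devs[cpu];
}

/* Dereferences dev without checking it, as the oops suggests the real one does. */
static int mock_device_offline(struct device *dev)
{
	dev->offline = true;
	return 0;
}

/*
 * Mirrors the patched hunk: look up the device first and skip the
 * offline call when the lookup yields NULL.
 * Returns true only if a device was actually taken offline.
 */
static bool mock_disable_hotplug_cpu(int cpu)
{
	struct device *dev;

	if (cpu < 0 || cpu >= NR_MOCK_CPUS || !online[cpu])
		return false;
	dev = mock_get_cpu_device(cpu);
	if (!dev)
		return false;
	mock_device_offline(dev);
	online[cpu] = false;
	return true;
}
```

In this mock, a request for CPU 0 is silently ignored and the CPU stays online; the check avoids the NULL dereference but does not decide whether the request should have been accepted in the first place.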

* Re: [PATCH v1] xen: avoid crash in disable_hotplug_cpu
  2018-09-05 10:40 ` Olaf Hering
  (?)
@ 2018-09-05 10:55 ` Juergen Gross
  2018-09-05 11:50   ` Olaf Hering
                     ` (3 more replies)
  -1 siblings, 4 replies; 14+ messages in thread
From: Juergen Gross @ 2018-09-05 10:55 UTC (permalink / raw)
  To: Olaf Hering, xen-devel; +Cc: Boris Ostrovsky, open list

On 05/09/18 12:40, Olaf Hering wrote:
> The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
> Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
> RIP: e030:device_offline+0x9/0xb0
> Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
> RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
> R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
> R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
> FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
> Call Trace:
>  handle_vcpu_hotplug_event+0xb5/0xc0
>  xenwatch_thread+0x80/0x140
>  ? wait_woken+0x80/0x80
>  kthread+0x112/0x130
>  ? kthread_create_worker_on_cpu+0x40/0x40
>  ret_from_fork+0x3a/0x50
> 
> Fix this crash by checking the return value of get_cpu_device().

Instead of trying to fight the symptoms, I think it would make more
sense to avoid offlining the last cpu.

I don't think calling xen_arch_unregister_cpu() just a few lines further
down in disable_hotplug_cpu() and then marking the cpu as not present
will result in a good user experience.


Juergen

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1] xen: avoid crash in disable_hotplug_cpu
  2018-09-05 10:55 ` Juergen Gross
@ 2018-09-05 11:50   ` Olaf Hering
  2018-09-05 12:04     ` Juergen Gross
  2018-09-05 12:04     ` Juergen Gross
  2018-09-05 11:50   ` Olaf Hering
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Olaf Hering @ 2018-09-05 11:50 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, Boris Ostrovsky, open list

On Wed, 5 Sep 2018 12:55:58 +0200,
Juergen Gross <jgross@suse.com> wrote:

> the last cpu

Which one is the "last" one? I mean, if cpu#0 can never be offlined,
then perhaps the code should check for just that and return early. But
if cpu#0 could be disabled while some other cpu is the remaining one,
some other check is needed, I think.

Olaf

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1] xen: avoid crash in disable_hotplug_cpu
  2018-09-05 11:50   ` Olaf Hering
@ 2018-09-05 12:04     ` Juergen Gross
  2018-09-05 12:04     ` Juergen Gross
  1 sibling, 0 replies; 14+ messages in thread
From: Juergen Gross @ 2018-09-05 12:04 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Boris Ostrovsky, open list

On 05/09/18 13:50, Olaf Hering wrote:
> On Wed, 5 Sep 2018 12:55:58 +0200,
> Juergen Gross <jgross@suse.com> wrote:
> 
>> the last cpu
> 
> Which one is the "last" one? I mean, if cpu#0 can never be offlined,
> then perhaps the code should check for just that and return early. But
> if cpu#0 could be disabled while some other cpu is the remaining one,
> some other check is needed, I think.

num_online_cpus() == 1


Juergen

^ permalink raw reply	[flat|nested] 14+ messages in thread
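
The `num_online_cpus() == 1` test suggested above does not name a particular CPU: whichever CPU remains online is the "last" one. A minimal userspace sketch of that idea follows. The bitmask, the eight-CPU limit, and all `mock_*` names are assumptions made for illustration; only num_online_cpus() itself is taken from the thread.

```c
#include <stdbool.h>

#define NR_MOCK_CPUS 8

/* One bit per CPU; bit n set means CPU n is online (mock state). */
static unsigned int mock_online_mask = 0xffu;

static bool mock_cpu_online(int cpu)
{
	return cpu >= 0 && cpu < NR_MOCK_CPUS &&
	       (mock_online_mask & (1u << cpu)) != 0;
}

/* Stand-in for num_online_cpus(): count the CPUs whose bit is set. */
static unsigned int mock_num_online_cpus(void)
{
	unsigned int n = 0;
	int cpu;

	for (cpu = 0; cpu < NR_MOCK_CPUS; cpu++)
		if (mock_cpu_online(cpu))
			n++;
	return n;
}

/* Refuse to offline a CPU when it is the only one still online. */
static bool mock_try_offline(int cpu)
{
	if (!mock_cpu_online(cpu))
		return false;
	if (mock_num_online_cpus() == 1)
		return false;
	mock_online_mask &= ~(1u << cpu);
	return true;
}
```

With this check alone, CPU 0 can be offlined like any other as long as another CPU is still online; the guard only rejects the request that would leave no CPU running.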

* Re: [PATCH v1] xen: avoid crash in disable_hotplug_cpu
  2018-09-05 10:55 ` Juergen Gross
  2018-09-05 11:50   ` Olaf Hering
  2018-09-05 11:50   ` Olaf Hering
@ 2018-09-05 14:47   ` Olaf Hering
  2018-09-05 15:14     ` Juergen Gross
  2018-09-05 15:14     ` Juergen Gross
  2018-09-05 14:47   ` Olaf Hering
  3 siblings, 2 replies; 14+ messages in thread
From: Olaf Hering @ 2018-09-05 14:47 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, Boris Ostrovsky, open list

On Wed, 5 Sep 2018 12:55:58 +0200,
Juergen Gross <jgross@suse.com> wrote:

> Instead of trying to fight the symptoms, I think it would make more
> sense to avoid offlining the last cpu.

Well, apparently the fix is to leave cpu#0 online, because offlining it
triggers a backtrace like this:

WARNING: CPU: 0 PID: 83 at kernel/sched/cpudeadline.c:159 cpudl_clear+0xa5/0xb0
Workqueue: events cpuset_hotplug_workfn
RIP: e030:cpudl_clear+0xa5/0xb0
Code: 8b 43 48 c7 44 28 0c ff ff ff ff e8 d5 fd ff ff 48 8d 43 08 f0 4c 0f ab 20 4c 89 ee 48 89 df 5b 5d 41 5c 41 5d e9 0b 3b 79 00 <0f> 0b e9 76 ff ff ff 0f 1f 40 00 66 66 66 66 90 41 56 49 89 d6 41
RSP: e02b:ffffc900411cbc40 EFLAGS: 00010086
RAX: ffffffff810d09a0 RBX: ffff880106f1a100 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880106f1a100
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8801068989b0
R10: ffff8801068989d0 R11: 0000000000000008 R12: 0000000000000000
R13: ffff8801f3800200 R14: 0000000000000001 R15: ffff8801f3823240
FS:  00007fd40d7f08c0(0000) GS:ffff8801f3800000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055eff60fe098 CR3: 00000001edf24000 CR4: 0000000000002660
Call Trace:
 rq_offline_dl+0x36/0x80
 set_rq_offline+0x31/0x60
 rq_attach_root+0x98/0xc0
 cpu_attach_domain+0x107/0x320
 partition_sched_domains+0x117/0x347
 ? cpus_read_lock+0x2d/0x50
 rebuild_sched_domains_locked+0xe4/0x4e0
 ? __switch_to_asm+0x40/0x70
 ? xen_mc_flush+0x102/0x210
 rebuild_sched_domains+0x16/0x30
 cpuset_hotplug_workfn+0x45e/0xef0
 ? _raw_spin_unlock_irq+0x22/0x40
 ? finish_task_switch+0x75/0x250
 process_one_work+0x1fd/0x3e0
 worker_thread+0x2d/0x3d0
 ? rescuer_thread+0x340/0x340
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x40/0x40
 ret_from_fork+0x3a/0x50

Initially I did not spot it because the kernel was booted with 'quiet'.

Olaf

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1] xen: avoid crash in disable_hotplug_cpu
  2018-09-05 14:47   ` Olaf Hering
@ 2018-09-05 15:14     ` Juergen Gross
  2018-09-05 15:27       ` Olaf Hering
  2018-09-05 15:27       ` Olaf Hering
  2018-09-05 15:14     ` Juergen Gross
  1 sibling, 2 replies; 14+ messages in thread
From: Juergen Gross @ 2018-09-05 15:14 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Boris Ostrovsky, open list

On 05/09/18 16:47, Olaf Hering wrote:
> On Wed, 5 Sep 2018 12:55:58 +0200,
> Juergen Gross <jgross@suse.com> wrote:
> 
>> Instead of trying to fight the symptoms, I think it would make more
>> sense to avoid offlining the last cpu.
> 
> Well, apparently the fix is to leave cpu#0 online, because offlining it
> triggers a backtrace like this:

I'd go with testing whether cpu_is_hotpluggable(cpu) returns true.
By default this will return false for cpu 0.

Additionally, I'd really like a test for num_online_cpus() > 1.

BTW: I'm not sure this WARN triggers because it is cpu#0. Are you sure
the tested cpu in that WARN was 0? After all the test is just running
on cpu#0 and I don't think it can be offline already.


Juergen

> 
> WARNING: CPU: 0 PID: 83 at kernel/sched/cpudeadline.c:159 cpudl_clear+0xa5/0xb0
> Workqueue: events cpuset_hotplug_workfn
> RIP: e030:cpudl_clear+0xa5/0xb0
> Code: 8b 43 48 c7 44 28 0c ff ff ff ff e8 d5 fd ff ff 48 8d 43 08 f0 4c 0f ab 20 4c 89 ee 48 89 df 5b 5d 41 5c 41 5d e9 0b 3b 79 00 <0f> 0b e9 76 ff ff ff 0f 1f 40 00 66 66 66 66 90 41 56 49 89 d6 41
> RSP: e02b:ffffc900411cbc40 EFLAGS: 00010086
> RAX: ffffffff810d09a0 RBX: ffff880106f1a100 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880106f1a100
> RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8801068989b0
> R10: ffff8801068989d0 R11: 0000000000000008 R12: 0000000000000000
> R13: ffff8801f3800200 R14: 0000000000000001 R15: ffff8801f3823240
> FS:  00007fd40d7f08c0(0000) GS:ffff8801f3800000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055eff60fe098 CR3: 00000001edf24000 CR4: 0000000000002660
> Call Trace:
>  rq_offline_dl+0x36/0x80
>  set_rq_offline+0x31/0x60
>  rq_attach_root+0x98/0xc0
>  cpu_attach_domain+0x107/0x320
>  partition_sched_domains+0x117/0x347
>  ? cpus_read_lock+0x2d/0x50
>  rebuild_sched_domains_locked+0xe4/0x4e0
>  ? __switch_to_asm+0x40/0x70
>  ? xen_mc_flush+0x102/0x210
>  rebuild_sched_domains+0x16/0x30
>  cpuset_hotplug_workfn+0x45e/0xef0
>  ? _raw_spin_unlock_irq+0x22/0x40
>  ? finish_task_switch+0x75/0x250
>  process_one_work+0x1fd/0x3e0
>  worker_thread+0x2d/0x3d0
>  ? rescuer_thread+0x340/0x340
>  kthread+0x112/0x130
>  ? kthread_create_worker_on_cpu+0x40/0x40
>  ret_from_fork+0x3a/0x50
> 
> Initially I did not spot it because the kernel was booted with 'quiet'.
> 
> Olaf
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread
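
The two checks proposed in this reply can be combined into a single early-return guard. The sketch below is a userspace mock, not the patch that was later resent: the `struct mock_cpu` table, its `hotpluggable` flag (standing in for cpu_is_hotpluggable()), and every `mock_*` function are hypothetical. CPU 0 is marked non-hotpluggable to match the default described in the reply.

```c
#include <stdbool.h>

#define NR_MOCK_CPUS 4

struct mock_cpu {
	bool online;
	bool hotpluggable;	/* stand-in for cpu_is_hotpluggable() */
};

/* Mock state: CPU 0 is not hotpluggable, the others are. */
static struct mock_cpu mock_cpus[NR_MOCK_CPUS] = {
	{ true, false },
	{ true, true },
	{ true, true },
	{ true, true },
};

/* Stand-in for num_online_cpus(). */
static unsigned int mock_num_online_cpus(void)
{
	unsigned int n = 0;
	int cpu;

	for (cpu = 0; cpu < NR_MOCK_CPUS; cpu++)
		if (mock_cpus[cpu].online)
			n++;
	return n;
}

/*
 * Guard first, act second: reject CPUs that are not hotpluggable and
 * never take the last online CPU down.
 * Returns true only if the CPU was taken offline.
 */
static bool mock_disable_hotplug_cpu(int cpu)
{
	if (cpu < 0 || cpu >= NR_MOCK_CPUS)
		return false;
	if (!mock_cpus[cpu].hotpluggable)
		return false;
	if (!mock_cpus[cpu].online || mock_num_online_cpus() <= 1)
		return false;
	mock_cpus[cpu].online = false;
	return true;
}
```

Putting both tests ahead of any state change means a rejected request leaves the CPU untouched, so nothing later in the function needs to cope with a half-completed offline.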

* Re: [PATCH v1] xen: avoid crash in disable_hotplug_cpu
  2018-09-05 15:14     ` Juergen Gross
@ 2018-09-05 15:27       ` Olaf Hering
  2018-09-05 15:27       ` Olaf Hering
  1 sibling, 0 replies; 14+ messages in thread
From: Olaf Hering @ 2018-09-05 15:27 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, Boris Ostrovsky, open list

On Wed, 5 Sep 2018 17:14:48 +0200,
Juergen Gross <jgross@suse.com> wrote:

> I'm not sure this WARN triggers because it is cpu#0. Are you sure
> the tested cpu in that WARN was 0? After all the test is just running
> on cpu#0 and I don't think it can be offline already.

If I leave cpu#0 alone, no WARN is triggered.
I will resend with a cpu_is_hotpluggable() test.

Olaf

^ permalink raw reply	[flat|nested] 14+ messages in thread
