linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] ioapic hot-removal bugs
@ 2016-06-07  7:21 Rui Wang
  2016-06-07  7:21 ` [PATCH 1/2] x86/ioapic: Support hot-removal of IOAPICs present during boot Rui Wang
  2016-06-07  7:21 ` [PATCH 2/2] x86/ioapic: Fix wrong pointers in ioapic_setup_resources() Rui Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Rui Wang @ 2016-06-07  7:21 UTC
  To: tglx, rjw, tony.luck, bhelgaas
  Cc: linux-acpi, linux-pci, linux-kernel, rui.y.wang

Hi All,

While testing ioapic hotplug, I found two bugs.

1) acpi_ioapic_add() is only called during hotadd of ioapics. Those
already present during system boot are not added, and thus cannot be
hot-removed.

2) ioapics[i].iomem_res was assigned the wrong pointer for every ioapic
except the first, causing a panic while hot-removing ioapics.

On a 4-socket brickland, hot-removal of the 3 sockets can be done
only after applying these two patches.

Regards,
Rui

Rui Wang (2):
  x86/ioapic: Support hot-removal of IOAPICs present during boot
  x86/ioapic: Fix wrong pointers in ioapic_setup_resources()

 arch/x86/kernel/apic/io_apic.c | 18 +++++++-----------
 drivers/acpi/internal.h        |  2 --
 drivers/acpi/ioapic.c          |  7 ++++---
 drivers/acpi/pci_root.c        |  2 +-
 drivers/pci/setup-bus.c        |  5 ++++-
 include/linux/acpi.h           |  3 +++
 6 files changed, 19 insertions(+), 18 deletions(-)

-- 
1.8.3.1


* [PATCH 1/2] x86/ioapic: Support hot-removal of IOAPICs present during boot
  2016-06-07  7:21 [PATCH 0/2] ioapic hot-removal bugs Rui Wang
@ 2016-06-07  7:21 ` Rui Wang
  2016-06-07  7:21 ` [PATCH 2/2] x86/ioapic: Fix wrong pointers in ioapic_setup_resources() Rui Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Rui Wang @ 2016-06-07  7:21 UTC
  To: tglx, rjw, tony.luck, bhelgaas
  Cc: linux-acpi, linux-pci, linux-kernel, rui.y.wang

IOAPICs present during system boot are never added to ioapic_list,
so they cannot be hot-removed. Fix it by calling acpi_ioapic_add()
during root bus enumeration as well.

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
---
 drivers/acpi/internal.h | 2 --
 drivers/acpi/ioapic.c   | 7 ++++---
 drivers/acpi/pci_root.c | 2 +-
 drivers/pci/setup-bus.c | 5 ++++-
 include/linux/acpi.h    | 3 +++
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 1e6833a..898f314 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -33,10 +33,8 @@ int acpi_sysfs_init(void);
 void acpi_container_init(void);
 void acpi_memory_hotplug_init(void);
 #ifdef	CONFIG_ACPI_HOTPLUG_IOAPIC
-int acpi_ioapic_add(struct acpi_pci_root *root);
 int acpi_ioapic_remove(struct acpi_pci_root *root);
 #else
-static inline int acpi_ioapic_add(struct acpi_pci_root *root) { return 0; }
 static inline int acpi_ioapic_remove(struct acpi_pci_root *root) { return 0; }
 #endif
 #ifdef CONFIG_ACPI_DOCK
diff --git a/drivers/acpi/ioapic.c b/drivers/acpi/ioapic.c
index ccdc8db..0f272e2 100644
--- a/drivers/acpi/ioapic.c
+++ b/drivers/acpi/ioapic.c
@@ -189,16 +189,17 @@ exit:
 	return AE_OK;
 }
 
-int acpi_ioapic_add(struct acpi_pci_root *root)
+int acpi_ioapic_add(acpi_handle root_handle)
 {
 	acpi_status status, retval = AE_OK;
 
-	status = acpi_walk_namespace(ACPI_TYPE_DEVICE, root->device->handle,
+	status = acpi_walk_namespace(ACPI_TYPE_DEVICE, root_handle,
 				     UINT_MAX, handle_ioapic_add, NULL,
-				     root->device->handle, (void **)&retval);
+				     root_handle, (void **)&retval);
 
 	return ACPI_SUCCESS(status) && ACPI_SUCCESS(retval) ? 0 : -ENODEV;
 }
+EXPORT_SYMBOL_GPL(acpi_ioapic_add);
 
 int acpi_ioapic_remove(struct acpi_pci_root *root)
 {
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index ae3fe4e..53f5965 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -614,7 +614,7 @@ static int acpi_pci_root_add(struct acpi_device *device,
 	if (hotadd) {
 		pcibios_resource_survey_bus(root->bus);
 		pci_assign_unassigned_root_bus_resources(root->bus);
-		acpi_ioapic_add(root);
+		acpi_ioapic_add(root->device->handle);
 	}
 
 	pci_lock_rescan_remove();
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 7796d0a..2f71167 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -26,6 +26,7 @@
 #include <linux/cache.h>
 #include <linux/slab.h>
 #include <asm-generic/pci-bridge.h>
+#include <linux/acpi.h>
 #include "pci.h"
 
 unsigned int pci_flags;
@@ -1780,8 +1781,10 @@ void __init pci_assign_unassigned_resources(void)
 {
 	struct pci_bus *root_bus;
 
-	list_for_each_entry(root_bus, &pci_root_buses, node)
+	list_for_each_entry(root_bus, &pci_root_buses, node) {
 		pci_assign_unassigned_root_bus_resources(root_bus);
+		acpi_ioapic_add(ACPI_HANDLE(root_bus->bridge));
+	}
 }
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 06ed7e5..ca4d22f 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -269,6 +269,9 @@ int acpi_unmap_cpu(int cpu);
 
 #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
 int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
+int acpi_ioapic_add(acpi_handle root);
+#else
+static inline int acpi_ioapic_add(acpi_handle root) { return 0; }
 #endif
 
 int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base);
-- 
1.8.3.1


* [PATCH 2/2] x86/ioapic: Fix wrong pointers in ioapic_setup_resources()
  2016-06-07  7:21 [PATCH 0/2] ioapic hot-removal bugs Rui Wang
  2016-06-07  7:21 ` [PATCH 1/2] x86/ioapic: Support hot-removal of IOAPICs present during boot Rui Wang
@ 2016-06-07  7:21 ` Rui Wang
  2016-06-07  9:17   ` Thomas Gleixner
  1 sibling, 1 reply; 5+ messages in thread
From: Rui Wang @ 2016-06-07  7:21 UTC
  To: tglx, rjw, tony.luck, bhelgaas
  Cc: linux-acpi, linux-pci, linux-kernel, rui.y.wang

On a 4-socket brickland, hot-removing one ioapic is fine. Hot-removing
the 2nd one causes panic:

[  453.422259] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  453.431059] IP: [<ffffffff8109a8c2>] release_resource+0x22/0x80
[  453.437713] PGD 0
[  453.439976] Oops: 0000 [#1] SMP
[  453.443610] Modules linked in: fuse btrfs xor raid6_pq msdos ext4 mbcache jbd2 binfmt_misc xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter vfat fat x86_pkg_temp_thermal intel_powerclamp coretemp kvm sb_edac irqbypass edac_core aesni_intel ipmi_ssif iTCO_wdt iTCO_vendor_support lpc_ich glue_helper ipmi_si ablk_helper sg shpchp pcspkr mfd_core i2c_i801 ipmi_msghandler wmi acpi_pad nfsd
[  453.523040]  auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ixgbe igb ahci libahci libata mdio i2c_algo_bit ptp i2c_core megaraid_sas pps_core dca dm_mirror dm_region_hash dm_log dm_mod
[  453.551438] CPU: 34 PID: 1146 Comm: kworker/u288:1 Not tainted 4.5.0-rc1+ #69
[  453.559418] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0063.R00.1503261059 03/26/2015
[  453.570994] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  453.577041] task: ffff880463325800 ti: ffff88046267c000 task.ti: ffff88046267c000
[  453.585415] RIP: 0010:[<ffffffff8109a8c2>]  [<ffffffff8109a8c2>] release_resource+0x22/0x80
[  453.594768] RSP: 0018:ffff88046267fcc8  EFLAGS: 00010246
[  453.600706] RAX: 00000000ffffffea RBX: ffff88087fffde00 RCX: 0000000000000000
[  453.608684] RDX: 00000000000000ff RSI: ffffea0011b72180 RDI: ffffffff81e3c0f8
[  453.616663] RBP: ffff88046267fcd0 R08: ffff88046dc86fc0 R09: 00000001802a0028
[  453.624641] R10: 000000006dc86f01 R11: ffffea0011b72180 R12: 0000000000000003
[  453.632619] R13: ffffffff81e1d450 R14: 00000000000000d8 R15: 0000000000000003
[  453.640598] FS:  0000000000000000(0000) GS:ffff88086f000000(0000) knlGS:0000000000000000
[  453.649645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  453.656069] CR2: 0000000000000030 CR3: 0000000001a6e000 CR4: 00000000001406e0
[  453.664047] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  453.672027] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  453.680005] Stack:
[  453.682251]  00000000000000d8 ffff88046267fd08 ffffffff81057965 0000000000000048
[  453.690567]  ffffffff81b43bd8 ffff88086b125358 ffff88086b783ea0 ffff88086b125300
[  453.698876]  ffff88046267fd20 ffffffff8104e3ff 0000000000000000 ffff88046267fd58
[  453.707195] Call Trace:
[  453.709935]  [<ffffffff81057965>] mp_unregister_ioapic+0x125/0x180
[  453.716846]  [<ffffffff8104e3ff>] acpi_unregister_ioapic+0x1f/0x40
[  453.723759]  [<ffffffff8140cfe3>] acpi_ioapic_remove+0x5f/0xf0
[  453.730283]  [<ffffffff813e0645>] acpi_pci_root_remove+0x2c/0x80
[  453.737002]  [<ffffffff813da86b>] acpi_bus_trim+0x5a/0x8d
[  453.743039]  [<ffffffff813dc31d>] acpi_device_hotplug+0x1b7/0x418
[  453.749851]  [<ffffffff813d4f8a>] acpi_hotplug_work_fn+0x1e/0x29
[  453.756570]  [<ffffffff810ad67f>] process_one_work+0x14f/0x3d0
[  453.763092]  [<ffffffff810adf35>] worker_thread+0x125/0x4b0
[  453.769325]  [<ffffffff816fd5c1>] ? __schedule+0x2b1/0x700
[  453.775459]  [<ffffffff810ade10>] ? rescuer_thread+0x370/0x370
[  453.781981]  [<ffffffff810b3a58>] kthread+0xd8/0xf0
[  453.787435]  [<ffffffff810b3980>] ? kthread_park+0x60/0x60
[  453.793570]  [<ffffffff8170190f>] ret_from_fork+0x3f/0x70
[  453.800203]  [<ffffffff810b3980>] ? kthread_park+0x60/0x60
[  453.806914] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 c7 c7 f8 c0 e3 81 e8 87 69 66 00 48 8b 4b 20 b8 ea ff ff ff <48> 8b 51 30 48 85 d2 74 1d 48 39 d3 75 0a eb 3f 48 39 c3 74 1b
[  453.829861] RIP  [<ffffffff8109a8c2>] release_resource+0x22/0x80
[  453.837188]  RSP <ffff88046267fcc8>
[  453.841673] CR2: 0000000000000030

Fix it by assigning the correct pointers to ioapics[i].iomem_res in
ioapic_setup_resources(). Also simplify the function by removing
the redundant 'num' variable.

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
---
 arch/x86/kernel/apic/io_apic.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)
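
To illustrate why the old assignment only breaks on the second removal,
here is a minimal user-space sketch. It is not kernel code: sim_resource,
sim_ioapic, sim_release(), sim_failed_removals() and NR_IOAPICS are
hypothetical stand-ins invented for this illustration. It assumes, going
by the faulting address 0x30 with RCX = 0 in the oops above, that
release_resource() trips over a parent pointer which an earlier release
of the same resource had already cleared.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IOAPICS 4	/* hypothetical count, e.g. one IOAPIC per socket */

/* Simplified stand-in for the kernel's struct resource. */
struct sim_resource {
	const char *name;
	struct sim_resource *parent;	/* NULL once the resource is released */
};

/* Simplified stand-in for one entry of the ioapics[] array. */
struct sim_ioapic {
	struct sim_resource *iomem_res;
};

static struct sim_resource sim_root = { "iomem", NULL };

/*
 * Models release_resource(): returns -1 where the real function is
 * assumed to dereference a NULL parent (the oops), 0 on success.
 */
static int sim_release(struct sim_resource *res)
{
	assert(res != NULL);
	if (!res->parent)
		return -1;	/* already released: the kernel would crash here */
	res->parent = NULL;
	return 0;
}

/*
 * Set up resources the way ioapic_setup_resources() does, then remove
 * every IOAPIC in turn. Returns the number of removals that would oops.
 * use_fix == 0: old code, every entry points at res[0].
 * use_fix != 0: patched code, entry i points at res[i].
 */
int sim_failed_removals(int use_fix)
{
	struct sim_resource res[NR_IOAPICS];
	struct sim_ioapic ioapics[NR_IOAPICS];
	int i, failed = 0;

	for (i = 0; i < NR_IOAPICS; i++) {
		res[i].name = "IOAPIC";
		res[i].parent = &sim_root;
		ioapics[i].iomem_res = use_fix ? &res[i] : res;
	}

	for (i = 0; i < NR_IOAPICS; i++)
		if (sim_release(ioapics[i].iomem_res))
			failed++;

	return failed;
}
```

With the old assignment, the first removal releases res[0] and succeeds;
every later removal is handed res[0] again and fails. With &res[i] each
IOAPIC releases its own resource, so no removal fails in this model.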

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index f253218..a90b131 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2563,29 +2563,25 @@ static struct resource * __init ioapic_setup_resources(void)
 	unsigned long n;
 	struct resource *res;
 	char *mem;
-	int i, num = 0;
+	int i;
 
-	for_each_ioapic(i)
-		num++;
-	if (num == 0)
+	if (nr_ioapics == 0)
 		return NULL;
 
 	n = IOAPIC_RESOURCE_NAME_SIZE + sizeof(struct resource);
-	n *= num;
+	n *= nr_ioapics;
 
 	mem = alloc_bootmem(n);
 	res = (void *)mem;
 
-	mem += sizeof(struct resource) * num;
+	mem += sizeof(struct resource) * nr_ioapics;
 
-	num = 0;
 	for_each_ioapic(i) {
-		res[num].name = mem;
-		res[num].flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+		res[i].name = mem;
+		res[i].flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 		snprintf(mem, IOAPIC_RESOURCE_NAME_SIZE, "IOAPIC %u", i);
 		mem += IOAPIC_RESOURCE_NAME_SIZE;
-		num++;
-		ioapics[i].iomem_res = res;
+		ioapics[i].iomem_res = &res[i];
 	}
 
 	ioapic_resources = res;
-- 
1.8.3.1


* Re: [PATCH 2/2] x86/ioapic: Fix wrong pointers in ioapic_setup_resources()
  2016-06-07  7:21 ` [PATCH 2/2] x86/ioapic: Fix wrong pointers in ioapic_setup_resources() Rui Wang
@ 2016-06-07  9:17   ` Thomas Gleixner
  2016-06-08  0:07     ` Rui Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2016-06-07  9:17 UTC
  To: Rui Wang; +Cc: rjw, Tony Luck, bhelgaas, linux-acpi, linux-pci, LKML

On Tue, 7 Jun 2016, Rui Wang wrote:
> On a 4-socket brickland, hot-removing one ioapic is fine. Hot-removing
> the 2nd one causes panic:
> 
> [  453.422259] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000030
> [  453.431059] IP: [<ffffffff8109a8c2>] release_resource+0x22/0x80

<Useless information>
> [  453.437713] PGD 0
> [  453.439976] Oops: 0000 [#1] SMP
> [  453.698876]  ffff88046267fd20 ffffffff8104e3ff 0000000000000000
> ffff88046267fd58
</Useless information>

> [  453.707195] Call Trace:
> [  453.709935]  [<ffffffff81057965>] mp_unregister_ioapic+0x125/0x180
> [  453.716846]  [<ffffffff8104e3ff>] acpi_unregister_ioapic+0x1f/0x40
> [  453.723759]  [<ffffffff8140cfe3>] acpi_ioapic_remove+0x5f/0xf0
> [  453.730283]  [<ffffffff813e0645>] acpi_pci_root_remove+0x2c/0x80
> [  453.737002]  [<ffffffff813da86b>] acpi_bus_trim+0x5a/0x8d
> [  453.743039]  [<ffffffff813dc31d>] acpi_device_hotplug+0x1b7/0x418
> [  453.749851]  [<ffffffff813d4f8a>] acpi_hotplug_work_fn+0x1e/0x29

<Useless information>
> [  453.756570]  [<ffffffff810ad67f>] process_one_work+0x14f/0x3d0
> [  453.763092]  [<ffffffff810adf35>] worker_thread+0x125/0x4b0
> [  453.769325]  [<ffffffff816fd5c1>] ? __schedule+0x2b1/0x700
> [  453.775459]  [<ffffffff810ade10>] ? rescuer_thread+0x370/0x370
> [  453.781981]  [<ffffffff810b3a58>] kthread+0xd8/0xf0
> [  453.787435]  [<ffffffff810b3980>] ? kthread_park+0x60/0x60
> [  453.793570]  [<ffffffff8170190f>] ret_from_fork+0x3f/0x70
> [  453.800203]  [<ffffffff810b3980>] ? kthread_park+0x60/0x60
> [  453.806914] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89
> e5 53 48 89 fb 48 c7 c7 f8 c0 e3 81 e8 87 69 66 00 48 8b 4b 20 b8 ea ff
> ff ff <48> 8b 51 30 48 85 d2 74 1d 48 39 d3 75 0a eb 3f 48 39 c3 74 1b
> [  453.829861] RIP  [<ffffffff8109a8c2>] release_resource+0x22/0x80
> [  453.837188]  RSP <ffff88046267fcc8>
> [  453.841673] CR2: 0000000000000030
</Useless information>

Please trim the dumps to the relevant information

> Fix it by assigning the correct pointers to ioapics[i].iomem_res in
> ioapic_setup_resources().

This does not explain the splat above. Please explain which pointer is
wrong and what effects that has.

> Also simplify the function by removing the redundant 'num' variable.

Please don't do that. This makes the patch hard to read. Split this into a
minimal bugfix, which can be backported and a cleanup patch which gets rid of
the extra variable.
 
> -		ioapics[i].iomem_res = res;
> +		ioapics[i].iomem_res = &res[i];

If I read the patch correctly, then this is the fix. Right? So please make it
a one liner and send a cleanup patch separately.

Thanks,

	tglx


* Re: [PATCH 2/2] x86/ioapic: Fix wrong pointers in ioapic_setup_resources()
  2016-06-07  9:17   ` Thomas Gleixner
@ 2016-06-08  0:07     ` Rui Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Rui Wang @ 2016-06-08  0:07 UTC
  To: tglx
  Cc: rjw, tony.luck, bhelgaas, linux-acpi, linux-pci, linux-kernel,
	rui.y.wang

On Tuesday, June 7, 2016 5:17 PM, Thomas Gleixner wrote:
> On Tue, 7 Jun 2016, Rui Wang wrote:
> > Also simplify the function by removing the redundant 'num' variable.

> Please don't do that. This makes the patch hard to read. Split this into a
> minimal bugfix, which can be backported and a cleanup patch which gets rid of
> the extra variable.
 
> > -		ioapics[i].iomem_res = res;
> > +		ioapics[i].iomem_res = &res[i];
>
> If I read the patch correctly, then this is the fix. Right? So please make it
> a one liner and send a cleanup patch separately.

Hi Thomas,

Yes exactly. I'll send a v2.

Thanks
Rui


