* [PATCH 0/8] Drivers: hv: vmbus: Enable unloading of vmbus driver
@ 2015-01-27 23:46 K. Y. Srinivasan
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
  0 siblings, 1 reply; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx; +Cc: K. Y. Srinivasan

Windows hosts starting with WS2012 R2 permit re-establishing the vmbus
connection from the guest. This patch set includes patches from Vitaly
to clean up the VMBUS unload path so we can potentially reload the driver.

This set also includes a patch from Jake to correctly extract MMIO
information on both Gen1 and Gen2 firmware.


Jake Oshins (1):
  drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range
    for children

Vitaly Kuznetsov (7):
  Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
  Drivers: hv: vmbus: rename channel work queues
  Drivers: hv: vmbus: avoid double kfree for device_obj
  Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
    vmbus_connection pages on shutdown
  drivers: hv: vmbus: Teardown synthetic interrupt controllers on
    module unload
  clockevents: export clockevents_unbind_device instead of
    clockevents_unbind
  Drivers: hv: vmbus: Teardown clockevent devices on module unload

 drivers/hv/channel_mgmt.c       |    6 +-
 drivers/hv/connection.c         |   17 +++--
 drivers/hv/hv.c                 |   34 ++++++++-
 drivers/hv/hyperv_vmbus.h       |    3 +
 drivers/hv/vmbus_drv.c          |  150 ++++++++++++++++++++++++++++++++++-----
 drivers/video/fbdev/hyperv_fb.c |    2 +-
 include/linux/hyperv.h          |    5 +-
 kernel/time/clockevents.c       |    2 +-
 8 files changed, 188 insertions(+), 31 deletions(-)

-- 
1.7.4.1



* [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
  2015-01-27 23:46 [PATCH 0/8] Drivers: hv: vmbus: Enable unloading of vmbus driver K. Y. Srinivasan
@ 2015-01-27 23:46 ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
                     ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx; +Cc: K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

When an SMP Hyper-V guest runs on top of WS2012R2 and secondary CPUs are
taken offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online), a
system freeze is observed. This happens because on newer hypervisors (Win8,
WS2012R2, ...) vmbus channel handlers are distributed across all CPUs (see
the init_vp_index() function in drivers/hv/channel_mgmt.c) and nothing
reassigns them to CPU0 when a CPU goes offline. Prevent CPU offlining while
vmbus is loaded until the issue is fixed host-side.
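
The save-and-restore of the cpu_disable hook can be tried outside the
kernel. What follows is only a userspace sketch of that pattern, not the
driver code: fake_smp_ops, native_cpu_disable(), refuse_cpu_disable() and
cpu_hotplug_quirk() are hypothetical stand-ins, and the protocol version
check from the patch is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's smp_ops table. */
struct fake_smp_ops {
	int (*cpu_disable)(void);
};

static int native_cpu_disable(void)
{
	return 0;		/* pretend the CPU went offline */
}

static struct fake_smp_ops fake_ops = { .cpu_disable = native_cpu_disable };

static int refuse_cpu_disable(void)
{
	return -ENOSYS;		/* offlining is refused while the quirk is active */
}

/* Install the refusing hook on load, put the saved one back on unload. */
static void cpu_hotplug_quirk(bool loaded)
{
	static int (*previous)(void);

	assert(fake_ops.cpu_disable != NULL);

	if (loaded) {
		previous = fake_ops.cpu_disable;
		fake_ops.cpu_disable = refuse_cpu_disable;
	} else if (previous) {
		fake_ops.cpu_disable = previous;
		previous = NULL;
	}
}
```

Calling cpu_hotplug_quirk(true) makes fake_ops.cpu_disable() return
-ENOSYS; cpu_hotplug_quirk(false) restores the original hook.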

This patch also disables hibernation, which is acceptable because
hibernation is already broken (an MCE error is hit on resume). Suspend
still works.

Tested with WS2008R2 and WS2012R2.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/vmbus_drv.c |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index f518b8d7..3b18a66 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -33,6 +33,7 @@
 #include <linux/hyperv.h>
 #include <linux/kernel_stat.h>
 #include <linux/clockchips.h>
+#include <linux/cpu.h>
 #include <asm/hyperv.h>
 #include <asm/hypervisor.h>
 #include <asm/mshyperv.h>
@@ -704,6 +705,39 @@ static void vmbus_isr(void)
 	}
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static int hyperv_cpu_disable(void)
+{
+	return -ENOSYS;
+}
+
+static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
+{
+	static void *previous_cpu_disable;
+
+	/*
+	 * Offlining a CPU when running on newer hypervisors (WS2012R2, Win8,
+	 * ...) is not supported at this moment as channel interrupts are
+	 * distributed across all of them.
+	 */
+
+	if ((vmbus_proto_version == VERSION_WS2008) ||
+	    (vmbus_proto_version == VERSION_WIN7))
+		return;
+
+	if (vmbus_loaded) {
+		previous_cpu_disable = smp_ops.cpu_disable;
+		smp_ops.cpu_disable = hyperv_cpu_disable;
+		pr_notice("CPU offlining is not supported by hypervisor\n");
+	} else if (previous_cpu_disable)
+		smp_ops.cpu_disable = previous_cpu_disable;
+}
+#else
+static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
+{
+}
+#endif
+
 /*
  * vmbus_bus_init -Main vmbus driver initialization routine.
  *
@@ -744,6 +778,7 @@ static int vmbus_bus_init(int irq)
 	if (ret)
 		goto err_alloc;
 
+	hv_cpu_hotplug_quirk(true);
 	vmbus_request_offers();
 
 	return 0;
@@ -997,6 +1032,7 @@ static void __exit vmbus_exit(void)
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
+	hv_cpu_hotplug_quirk(false);
 }
 
 
-- 
1.7.4.1



* [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children K. Y. Srinivasan
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

All channel work queues are named 'hv_vmbus_ctl', which makes them
indistinguishable in ps output and hard to link to the corresponding vmbus
device. Rename them to hv_vmbus_ctl/N and make vmbus device names match,
e.g. the vmbus_1 device is now served by the hv_vmbus_ctl/1 work queue.
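
The naming scheme boils down to drawing one id per channel from an atomic
counter and using it in both names. A userspace sketch of that idea, using
C11 atomics in place of the kernel's atomic_t; next_channel_id(),
format_names() and names_match() are hypothetical helpers, not functions
from the driver:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

static atomic_int chan_num;	/* zero-initialized, shared by all callers */

/* Returns the next id, starting from 1 (like atomic_inc_return()). */
static int next_channel_id(void)
{
	return atomic_fetch_add(&chan_num, 1) + 1;
}

/* Format both names from the same id so they can be correlated. */
static void format_names(int id, char *wq, size_t wq_len,
			 char *dev, size_t dev_len)
{
	assert(id > 0);
	snprintf(wq, wq_len, "hv_vmbus_ctl/%d", id);
	snprintf(dev, dev_len, "vmbus_%d", id);
}

/* True if a work queue name and a device name carry the same id. */
static int names_match(const char *wq, const char *dev)
{
	const char *a = strchr(wq, '/');
	const char *b = strrchr(dev, '_');

	return a && b && strcmp(a + 1, b + 1) == 0;
}
```

With this, the first channel yields "hv_vmbus_ctl/1" and "vmbus_1", so a
work queue seen in ps can be matched to its device by the trailing number.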

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel_mgmt.c |    5 ++++-
 drivers/hv/vmbus_drv.c    |    6 ++----
 include/linux/hyperv.h    |    3 +++
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 3736f71..ba4b25f 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -139,19 +139,22 @@ EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp);
  */
 static struct vmbus_channel *alloc_channel(void)
 {
+	static atomic_t chan_num = ATOMIC_INIT(0);
 	struct vmbus_channel *channel;
 
 	channel = kzalloc(sizeof(*channel), GFP_ATOMIC);
 	if (!channel)
 		return NULL;
 
+	channel->id = atomic_inc_return(&chan_num);
 	spin_lock_init(&channel->inbound_lock);
 	spin_lock_init(&channel->lock);
 
 	INIT_LIST_HEAD(&channel->sc_list);
 	INIT_LIST_HEAD(&channel->percpu_list);
 
-	channel->controlwq = create_workqueue("hv_vmbus_ctl");
+	channel->controlwq = alloc_workqueue("hv_vmbus_ctl/%d", WQ_MEM_RECLAIM,
+					     1, channel->id);
 	if (!channel->controlwq) {
 		kfree(channel);
 		return NULL;
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 3b18a66..e334ccc 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -875,10 +875,8 @@ int vmbus_device_register(struct hv_device *child_device_obj)
 {
 	int ret = 0;
 
-	static atomic_t device_num = ATOMIC_INIT(0);
-
-	dev_set_name(&child_device_obj->device, "vmbus_0_%d",
-		     atomic_inc_return(&device_num));
+	dev_set_name(&child_device_obj->device, "vmbus_%d",
+		     child_device_obj->channel->id);
 
 	child_device_obj->device.bus = &hv_bus;
 	child_device_obj->device.parent = &hv_acpi_dev->dev;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 5a2ba67..26a32b7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -646,6 +646,9 @@ struct hv_input_signal_event_buffer {
 };
 
 struct vmbus_channel {
+	/* Unique channel id */
+	int id;
+
 	struct list_head listentry;
 
 	struct hv_device *device_obj;
-- 
1.7.4.1



* [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj K. Y. Srinivasan
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Jake Oshins, Jake Oshins, K. Y. Srinivasan

From: Jake Oshins <jakeo@microsoft.com>

This set of changes finds the _CRS object in the ACPI namespace
that contains memory address space descriptors, intended to convey
to VMBus which ranges of memory-mapped I/O space are available for
child devices, and then builds a resource list that contains all
those ranges.  Without this change, only some of the memory-mapped
I/O space will be available for child devices, and only in some
virtual BIOS configurations (Generation 2 VMs).
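
The resource list is kept in ascending address order by walking the
sibling links through a pointer-to-pointer. A userspace sketch of that
insertion, assuming a simplified struct mmio_range in place of struct
resource; mmio_add() and mmio_free_all() are hypothetical names, and
calloc()/free() stand in for kzalloc()/kfree():

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct resource. */
struct mmio_range {
	uint64_t start;
	uint64_t end;
	struct mmio_range *sibling;
};

/*
 * Insert [start, end] into a list kept in ascending address order.
 * Ranges that end below 1MB are skipped. Returns 0 on success or skip,
 * -1 if the allocation fails.
 */
static int mmio_add(struct mmio_range **head, uint64_t start, uint64_t end)
{
	struct mmio_range **pos = head;
	struct mmio_range *new_res;

	assert(start <= end);
	if (end < 0x100000)
		return 0;

	new_res = calloc(1, sizeof(*new_res));
	if (!new_res)
		return -1;
	new_res->start = start;
	new_res->end = end;

	/* Follow the links until the next node lies above the new range. */
	while (*pos && (*pos)->start <= new_res->end)
		pos = &(*pos)->sibling;

	new_res->sibling = *pos;
	*pos = new_res;
	return 0;
}

/* Free every node and reset the head so a later reload starts clean. */
static void mmio_free_all(struct mmio_range **head)
{
	struct mmio_range *cur = *head, *next;

	for (; cur; cur = next) {
		next = cur->sibling;
		free(cur);
	}
	*head = NULL;
}
```

Working on the address of the link rather than on the node means the
empty-list, insert-at-head and insert-in-the-middle cases need no separate
branches.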

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/vmbus_drv.c          |   97 +++++++++++++++++++++++++++++++++------
 drivers/video/fbdev/hyperv_fb.c |    2 +-
 include/linux/hyperv.h          |    2 +-
 3 files changed, 85 insertions(+), 16 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index e334ccc..aebc8fe 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -45,10 +45,7 @@ static struct tasklet_struct msg_dpc;
 static struct completion probe_event;
 static int irq;
 
-struct resource hyperv_mmio = {
-	.name  = "hyperv mmio",
-	.flags = IORESOURCE_MEM,
-};
+struct resource *hyperv_mmio;
 EXPORT_SYMBOL_GPL(hyperv_mmio);
 
 static int vmbus_exists(void)
@@ -915,30 +912,98 @@ void vmbus_device_unregister(struct hv_device *device_obj)
 
 
 /*
- * VMBUS is an acpi enumerated device. Get the the information we
+ * VMBUS is an acpi enumerated device. Get the information we
  * need from DSDT.
  */
 
 static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx)
 {
+	resource_size_t start = 0;
+	resource_size_t end = 0;
+	struct resource *new_res;
+	struct resource **old_res = &hyperv_mmio;
+
 	switch (res->type) {
 	case ACPI_RESOURCE_TYPE_IRQ:
 		irq = res->data.irq.interrupts[0];
+		return AE_OK;
+
+	/*
+	 * "Address" descriptors are for bus windows. Ignore
+	 * "memory" descriptors, which are for registers on
+	 * devices.
+	 */
+	case ACPI_RESOURCE_TYPE_ADDRESS32:
+		start = res->data.address32.address.minimum;
+		end = res->data.address32.address.maximum;
 		break;
 
 	case ACPI_RESOURCE_TYPE_ADDRESS64:
-		hyperv_mmio.start = res->data.address64.address.minimum;
-		hyperv_mmio.end = res->data.address64.address.maximum;
+		start = res->data.address64.address.minimum;
+		end = res->data.address64.address.maximum;
 		break;
+
+	default:
+		/* Unused resource type */
+		return AE_OK;
+
 	}
+	/*
+	 * Ignore ranges that are below 1MB, as they're not
+	 * necessary or useful here.
+	 */
+	if (end < 0x100000)
+		return AE_OK;
+
+	new_res = kzalloc(sizeof(*new_res), GFP_ATOMIC);
+	if (!new_res)
+		return AE_NO_MEMORY;
+
+	new_res->name = "hyperv mmio";
+	new_res->flags = IORESOURCE_MEM;
+	new_res->start = start;
+	new_res->end = end;
+
+	do {
+		if (!*old_res) {
+			*old_res = new_res;
+			break;
+		}
+
+		if ((*old_res)->start > new_res->end) {
+			new_res->sibling = *old_res;
+			*old_res = new_res;
+			break;
+		}
+
+		old_res = &(*old_res)->sibling;
+
+	} while (1);
 
 	return AE_OK;
 }
 
+static int vmbus_acpi_remove(struct acpi_device *device)
+{
+	struct resource *cur_res;
+	struct resource *next_res;
+
+	if (hyperv_mmio) {
+		release_resource(hyperv_mmio);
+		for (cur_res = hyperv_mmio; cur_res; cur_res = next_res) {
+			next_res = cur_res->sibling;
+			kfree(cur_res);
+		}
+	}
+
+	return 0;
+}
+
 static int vmbus_acpi_add(struct acpi_device *device)
 {
 	acpi_status result;
 	int ret_val = -ENODEV;
+	struct acpi_device *ancestor;
 
 	hv_acpi_dev = device;
 
@@ -948,23 +1013,26 @@ static int vmbus_acpi_add(struct acpi_device *device)
 	if (ACPI_FAILURE(result))
 		goto acpi_walk_err;
 	/*
-	 * The parent of the vmbus acpi device (Gen2 firmware) is the VMOD that
-	 * has the mmio ranges. Get that.
+	 * Some ancestor of the vmbus acpi device (Gen1 or Gen2
+	 * firmware) is the VMOD that has the mmio ranges. Get that.
 	 */
-	if (device->parent) {
-		result = acpi_walk_resources(device->parent->handle,
+	for (ancestor = device->parent; ancestor; ancestor = ancestor->parent) {
+		result = acpi_walk_resources(ancestor->handle,
 					METHOD_NAME__CRS,
 					vmbus_walk_resources, NULL);
 
 		if (ACPI_FAILURE(result))
-			goto acpi_walk_err;
-		if (hyperv_mmio.start && hyperv_mmio.end)
-			request_resource(&iomem_resource, &hyperv_mmio);
+			continue;
+		if (hyperv_mmio) {
+			request_resource(&iomem_resource, hyperv_mmio);
+			break;
+		}
 	}
 	ret_val = 0;
 
 acpi_walk_err:
 	complete(&probe_event);
+	vmbus_acpi_remove(device);
 	return ret_val;
 }
 
@@ -980,6 +1048,7 @@ static struct acpi_driver vmbus_acpi_driver = {
 	.ids = vmbus_acpi_device_ids,
 	.ops = {
 		.add = vmbus_acpi_add,
+		.remove = vmbus_acpi_remove,
 	},
 };
 
diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
index 4254336..003c8f0 100644
--- a/drivers/video/fbdev/hyperv_fb.c
+++ b/drivers/video/fbdev/hyperv_fb.c
@@ -686,7 +686,7 @@ static int hvfb_getmem(struct fb_info *info)
 	par->mem.name = KBUILD_MODNAME;
 	par->mem.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 	if (gen2vm) {
-		ret = allocate_resource(&hyperv_mmio, &par->mem,
+		ret = allocate_resource(hyperv_mmio, &par->mem,
 					screen_fb_size,
 					0, -1,
 					screen_fb_size,
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 26a32b7..e73cfeb 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1217,7 +1217,7 @@ int hv_vss_init(struct hv_util_service *);
 void hv_vss_deinit(void);
 void hv_vss_onchannelcallback(void *);
 
-extern struct resource hyperv_mmio;
+extern struct resource *hyperv_mmio;
 
 /*
  * Negotiated version with the Host.
-- 
1.7.4.1



* [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown K. Y. Srinivasan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

On driver shutdown device_obj is freed twice:
1) in vmbus_free_channels();
2) in vmbus_device_release() (which is triggered by device_unregister() in
   vmbus_device_unregister()).
This double kfree leads to the following sporadic crash on driver unload:

[   23.469876] general protection fault: 0000 [#1] SMP
[   23.470036] Modules linked in: hv_vmbus(-)
[   23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[   23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[   23.470036] RIP: 0010:[<ffffffff811d2e1b>]  [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[   23.470036] RSP: 0018:ffff880036cebcc8  EFLAGS: 00010246
...

When this crash does not happen on driver unload, a similar one is expected
if we try to load hv_vmbus again.

Remove the kfree from vmbus_free_channels(), as freeing device_obj from
vmbus_device_release() seems right.
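
The ownership rule behind this fix is that an object with a release
callback is freed by that callback alone. A minimal userspace model of the
rule, with a plain integer reference count; fake_device and its helpers are
hypothetical and only mimic the shape of the driver-model calls:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of a refcounted device with a release callback. */
struct fake_device {
	int refcount;
	void (*release)(struct fake_device *dev);
};

static int release_calls;	/* counts how often the object was freed */

static void fake_device_release(struct fake_device *dev)
{
	release_calls++;
	free(dev);		/* the one and only free */
}

static struct fake_device *fake_device_register(void)
{
	struct fake_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;
	dev->release = fake_device_release;
	return dev;
}

/* Drop the reference; the release callback frees the object at zero. */
static void fake_device_unregister(struct fake_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

After fake_device_unregister() returns, the caller must treat the pointer
as gone; a second free() at that point would be the double free described
above.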

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel_mgmt.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index ba4b25f..36bacc7 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -262,7 +262,6 @@ void vmbus_free_channels(void)
 
 	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
 		vmbus_device_unregister(channel->device_obj);
-		kfree(channel->device_obj);
 		free_channel(channel);
 	}
 }
-- 
1.7.4.1



* [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (2 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload K. Y. Srinivasan
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:

[   76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[   82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[   82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[   82.599287] Oops: 0010 [#1] SMP
[   82.599287] Modules linked in: [last unloaded: hv_vmbus]
[   82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[   82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[   82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[   82.599287] RIP: 0010:[<ffffffffa0003480>]  [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] RSP: 0018:ffff88007f8fbe00  EFLAGS: 00010202
...

To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement a vmbus_disconnect() function by separating the
cleanup path from vmbus_connect().

As we use hv_vmbus_con to release channels (see free_channel() in
channel_mgmt.c), we need to make sure the work is done before we remove the
queue; do that with drain_workqueue(). We also need to avoid handling
messages which can (potentially) create new channels, so set
vmbus_connection.conn_state = DISCONNECTED at the very beginning of
vmbus_exit() and check for that in vmbus_onmessage_work().
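
The ordering described here (mark the connection as disconnected, drain
pending work, only then destroy the queue) can be modelled in a few lines.
This is a single-threaded userspace sketch with a hypothetical fake_wq
type; unlike the patch, it sets the state inside fake_disconnect() rather
than in a separate exit function:

```c
#include <assert.h>
#include <stddef.h>

enum conn_state { DISCONNECTED, CONNECTED };

#define MAX_WORK 8

/* Hypothetical single-threaded stand-in for a kernel workqueue. */
struct fake_wq {
	void (*fn[MAX_WORK])(void);
	size_t head, tail;
	int alive;
};

static enum conn_state conn_state = CONNECTED;
static int handled;		/* messages actually processed */

static void on_message_work(void)
{
	/* Late messages are dropped once teardown has begun. */
	if (conn_state == DISCONNECTED)
		return;
	handled++;
}

static int fake_queue_work(struct fake_wq *wq, void (*fn)(void))
{
	if (!wq->alive || wq->tail == MAX_WORK)
		return -1;
	wq->fn[wq->tail++] = fn;
	return 0;
}

/* Run everything still pending, which drain_workqueue() would wait for. */
static void fake_drain(struct fake_wq *wq)
{
	while (wq->head < wq->tail)
		wq->fn[wq->head++]();
}

static void fake_disconnect(struct fake_wq *wq)
{
	assert(wq->alive);
	conn_state = DISCONNECTED;
	fake_drain(wq);		/* no work item may outlive the queue */
	wq->alive = 0;		/* "destroy" only after the drain */
}
```

A work item queued just before fake_disconnect() still runs during the
drain, but it sees DISCONNECTED and returns without doing anything.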

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/connection.c   |   17 ++++++++++++-----
 drivers/hv/hyperv_vmbus.h |    1 +
 drivers/hv/vmbus_drv.c    |    6 ++++++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index a63a795..c4acd1c 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -216,10 +216,21 @@ int vmbus_connect(void)
 
 cleanup:
 	pr_err("Unable to connect to host\n");
+
 	vmbus_connection.conn_state = DISCONNECTED;
+	vmbus_disconnect();
+
+	kfree(msginfo);
+
+	return ret;
+}
 
-	if (vmbus_connection.work_queue)
+void vmbus_disconnect(void)
+{
+	if (vmbus_connection.work_queue) {
+		drain_workqueue(vmbus_connection.work_queue);
 		destroy_workqueue(vmbus_connection.work_queue);
+	}
 
 	if (vmbus_connection.int_page) {
 		free_pages((unsigned long)vmbus_connection.int_page, 0);
@@ -230,10 +241,6 @@ cleanup:
 	free_pages((unsigned long)vmbus_connection.monitor_pages[1], 0);
 	vmbus_connection.monitor_pages[0] = NULL;
 	vmbus_connection.monitor_pages[1] = NULL;
-
-	kfree(msginfo);
-
-	return ret;
 }
 
 /*
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 44b1c94..6cf2de9 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -692,6 +692,7 @@ void vmbus_free_channels(void);
 /* Connection interface */
 
 int vmbus_connect(void);
+void vmbus_disconnect(void);
 
 int vmbus_post_msg(void *buffer, size_t buflen);
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index aebc8fe..3b304b9 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -571,6 +571,10 @@ static void vmbus_onmessage_work(struct work_struct *work)
 {
 	struct onmessage_work_context *ctx;
 
+	/* Do not process messages if we're in DISCONNECTED state */
+	if (vmbus_connection.conn_state == DISCONNECTED)
+		return;
+
 	ctx = container_of(work, struct onmessage_work_context,
 			   work);
 	vmbus_onmessage(&ctx->msg);
@@ -1094,12 +1098,14 @@ cleanup:
 
 static void __exit vmbus_exit(void)
 {
+	vmbus_connection.conn_state = DISCONNECTED;
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
 	hv_cpu_hotplug_quirk(false);
+	vmbus_disconnect();
 }
 
 
-- 
1.7.4.1



* [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (3 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind K. Y. Srinivasan
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

SynIC has to be switched off when we unload the module, otherwise registered
memory pages can get corrupted afterwards (as the Hyper-V host still writes
there) and we see crashes like the following in random processes:

[   89.116774] BUG: Bad page map in process sh  pte:4989c716 pmd:36f81067
[   89.159454] addr:0000000000437000 vm_flags:00000875 anon_vma:          (null) mapping:ffff88007bba55a0 index:37
[   89.226146] vma->vm_ops->fault: filemap_fault+0x0/0x410
[   89.257776] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[   89.297570] CPU: 0 PID: 215 Comm: sh Tainted: G    B          3.19.0-rc5_bug923184+ #488
[   89.353738] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   89.409138]  0000000000000000 000000004e083d7b ffff880036e9fa18 ffffffff81a68d31
[   89.468724]  0000000000000000 0000000000437000 ffff880036e9fa68 ffffffff811a1e3a
[   89.519233]  000000004989c716 0000000000000037 ffffea0001edc340 0000000000437000
[   89.575751] Call Trace:
[   89.591060]  [<ffffffff81a68d31>] dump_stack+0x45/0x57
[   89.625164]  [<ffffffff811a1e3a>] print_bad_pte+0x1aa/0x250
[   89.667234]  [<ffffffff811a2c95>] vm_normal_page+0x55/0xa0
[   89.703818]  [<ffffffff811a3105>] unmap_page_range+0x425/0x8a0
[   89.737982]  [<ffffffff811a3601>] unmap_single_vma+0x81/0xf0
[   89.780385]  [<ffffffff81184320>] ? lru_deactivate_fn+0x190/0x190
[   89.820130]  [<ffffffff811a4131>] unmap_vmas+0x51/0xa0
[   89.860168]  [<ffffffff811ad12c>] exit_mmap+0xac/0x1a0
[   89.890588]  [<ffffffff810763c3>] mmput+0x63/0x100
[   89.919205]  [<ffffffff811eba48>] flush_old_exec+0x3f8/0x8b0
[   89.962135]  [<ffffffff8123b5bb>] load_elf_binary+0x32b/0x1260
[   89.998581]  [<ffffffff811a14f2>] ? get_user_pages+0x52/0x60

The hv_synic_cleanup() function exists but no one calls it now. Do the
following:
- call hv_synic_cleanup() on each cpu from vmbus_exit();
- write global disable bit through MSR;
- use hv_synic_free_cpu() to avoid memory leaks and code duplication.
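
Writing the global disable bit is a read-modify-write of the control
register: read it, clear the enable bit, write it back. A userspace sketch
of that step, assuming the enable flag is bit 0 (the first one-bit field of
the control union in the patch); fake_scontrol, fake_rdmsr() and
fake_wrmsr() are hypothetical replacements for the real MSR accessors:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 0 is the global SynIC enable flag. */
#define FAKE_SCONTROL_ENABLE (UINT64_C(1) << 0)

/* Hypothetical in-memory stand-in for the SCONTROL MSR. */
static uint64_t fake_scontrol = UINT64_C(0xABCD0001);

static uint64_t fake_rdmsr(void)
{
	return fake_scontrol;
}

static void fake_wrmsr(uint64_t val)
{
	fake_scontrol = val;
}

/* Clear only the enable bit; every other bit is written back unchanged. */
static void synic_global_disable(void)
{
	uint64_t val = fake_rdmsr();

	val &= ~FAKE_SCONTROL_ENABLE;
	fake_wrmsr(val);
	assert(!(fake_rdmsr() & FAKE_SCONTROL_ENABLE));
}
```

Reading first matters because the remaining bits of the register are
preserved instead of being overwritten with zeros.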

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/hv.c        |    9 +++++++--
 drivers/hv/vmbus_drv.c |    4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 50e51a5..39531dc 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -477,6 +477,7 @@ void hv_synic_cleanup(void *arg)
 	union hv_synic_sint shared_sint;
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
+	union hv_synic_scontrol sctrl;
 	int cpu = smp_processor_id();
 
 	if (!hv_context.synic_initialized)
@@ -502,6 +503,10 @@ void hv_synic_cleanup(void *arg)
 
 	wrmsrl(HV_X64_MSR_SIEFP, siefp.as_uint64);
 
-	free_page((unsigned long)hv_context.synic_message_page[cpu]);
-	free_page((unsigned long)hv_context.synic_event_page[cpu]);
+	/* Disable the global synic bit */
+	rdmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
+	sctrl.enable = 0;
+	wrmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
+
+	hv_synic_free_cpu(cpu);
 }
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 3b304b9..8ff6f69 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1098,11 +1098,15 @@ cleanup:
 
 static void __exit vmbus_exit(void)
 {
+	int cpu;
+
 	vmbus_connection.conn_state = DISCONNECTED;
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
 	hv_cleanup();
+	for_each_online_cpu(cpu)
+		smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
 	hv_cpu_hotplug_quirk(false);
 	vmbus_disconnect();
-- 
1.7.4.1



* [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (4 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload K. Y. Srinivasan
  2015-01-28 22:16   ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors KY Srinivasan
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

It looks like clockevents_unbind is being exported by mistake as:
- it is static;
- it is not listed in include/linux/clockchips.h;
- EXPORT_SYMBOL_GPL(clockevents_unbind) follows clockevents_unbind_device()
  implementation.

I think clockevents_unbind_device should be exported instead. This is going to
be used to tear down Hyper-V clockevent devices on module unload.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 kernel/time/clockevents.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 5544990..888ecc1 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -371,7 +371,7 @@ int clockevents_unbind_device(struct clock_event_device *ced, int cpu)
 	mutex_unlock(&clockevents_mutex);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(clockevents_unbind);
+EXPORT_SYMBOL_GPL(clockevents_unbind_device);
 
 /**
  * clockevents_register_device - register a clock event device
-- 
1.7.4.1



* [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (5 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-28 22:16   ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors KY Srinivasan
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Newly introduced clockevent devices made it impossible to unload the
hv_vmbus module as clockevents_config_and_register() takes an additional
reference to the module. To make it possible again we do the following:
- avoid setting dev->owner for clockevent devices;
- implement hv_synic_clockevents_cleanup() doing clockevents_unbind_device();
- call it from vmbus_exit().

In theory hv_synic_clockevents_cleanup() can be merged with hv_synic_cleanup(),
however, we call hv_synic_cleanup() from smp_call_function_single() and this
doesn't work for clockevents_unbind_device() as it makes such a call on its
own. I opted for a separate function.
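
Why leaving dev->owner unset helps can be shown with a toy model of module
reference counting. This is a deliberately simplified userspace sketch, not
the clockevents code: fake_module, fake_clockevent and the three helpers
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a module with a reference count. */
struct fake_module {
	int refcount;
};

struct fake_clockevent {
	struct fake_module *owner;	/* NULL: registration pins nothing */
	int bound;
};

/* Registration pins the owner, if one is set. */
static void fake_register(struct fake_clockevent *dev)
{
	if (dev->owner)
		dev->owner->refcount++;
	dev->bound = 1;
}

/* Explicit teardown; drops the pin again if there was one. */
static void fake_unbind(struct fake_clockevent *dev)
{
	assert(dev->bound);
	if (dev->owner)
		dev->owner->refcount--;
	dev->bound = 0;
}

/* A module can only be unloaded while nothing holds a reference. */
static int fake_can_unload(const struct fake_module *mod)
{
	return mod->refcount == 0;
}
```

With an owner set, the module stays pinned until the device is unbound,
but the unbind would live in the module's exit path, which cannot run while
the module is pinned. Without an owner the module can be unloaded, and the
exit path then has to unbind the devices itself.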

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/hv.c           |   25 ++++++++++++++++++++++++-
 drivers/hv/hyperv_vmbus.h |    2 ++
 drivers/hv/vmbus_drv.c    |    1 +
 3 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 39531dc..d3943bc 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -312,7 +312,11 @@ static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu)
 	dev->features = CLOCK_EVT_FEAT_ONESHOT;
 	dev->cpumask = cpumask_of(cpu);
 	dev->rating = 1000;
-	dev->owner = THIS_MODULE;
+	/*
+	 * Avoid setting dev->owner = THIS_MODULE deliberately as doing so will
+	 * result in clockevents_config_and_register() taking additional
+	 * references to the hv_vmbus module making it impossible to unload.
+	 */
 
 	dev->set_mode = hv_ce_setmode;
 	dev->set_next_event = hv_ce_set_next_event;
@@ -470,6 +474,20 @@ void hv_synic_init(void *arg)
 }
 
 /*
+ * hv_synic_clockevents_cleanup - Cleanup clockevent devices
+ */
+void hv_synic_clockevents_cleanup(void)
+{
+	int cpu;
+
+	if (!(ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE))
+		return;
+
+	for_each_online_cpu(cpu)
+		clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
+}
+
+/*
  * hv_synic_cleanup - Cleanup routine for hv_synic_init().
  */
 void hv_synic_cleanup(void *arg)
@@ -483,6 +501,11 @@ void hv_synic_cleanup(void *arg)
 	if (!hv_context.synic_initialized)
 		return;
 
+	/* Turn off clockevent device */
+	if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE)
+		hv_ce_setmode(CLOCK_EVT_MODE_SHUTDOWN,
+			      hv_context.clk_evt[cpu]);
+
 	rdmsrl(HV_X64_MSR_SINT0 + VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
 
 	shared_sint.masked = 1;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 6cf2de9..b055e53 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -572,6 +572,8 @@ extern void hv_synic_init(void *irqarg);
 
 extern void hv_synic_cleanup(void *arg);
 
+extern void hv_synic_clockevents_cleanup(void);
+
 /*
  * Host version information.
  */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 8ff6f69..2de170a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1101,6 +1101,7 @@ static void __exit vmbus_exit(void)
 	int cpu;
 
 	vmbus_connection.conn_state = DISCONNECTED;
+	hv_synic_clockevents_cleanup();
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
-- 
1.7.4.1



* RE: [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (6 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload K. Y. Srinivasan
@ 2015-01-28 22:16   ` KY Srinivasan
  7 siblings, 0 replies; 10+ messages in thread
From: KY Srinivasan @ 2015-01-28 22:16 UTC (permalink / raw)
  To: KY Srinivasan, gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx



> -----Original Message-----
> From: K. Y. Srinivasan [mailto:kys@microsoft.com]
> Sent: Tuesday, January 27, 2015 3:47 PM
> To: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; apw@canonical.com;
> vkuznets@redhat.com; tglx@linutronix.de
> Cc: KY Srinivasan
> Subject: [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on
> newer hypervisors
> 
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> When an SMP Hyper-V guest is running on top of 2012R2 Server and
> secondary cpus are sent offline (with echo 0 >
> /sys/devices/system/cpu/cpu$cpu/online)
> the system freeze is observed. This happens due to the fact that on newer
> hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
> across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
> and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
> when vmbus is loaded until the issue is fixed host-side.
> 
> This patch also disables hibernation but it is OK as it is also broken (MCE error
> is hit on resume). Suspend still works.
> 
> Tested with WS2008R2 and WS2012R2.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>

Greg,

Please drop this entire series; the patch-set was based on the wrong tree. I will resend the set shortly
after rebasing it on your char-misc.git.

Thank you,

K. Y
