All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path
@ 2015-01-21 19:02 Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

It is possible (since 93e5bd06a953: "Drivers: hv: Make the vmbus driver
unloadable") to unload hv_vmbus driver if no other devices are connected.
1aec169673d7: "x86: Hyperv: Cleanup the irq mess" fixed doulble interrupt
gate setup. However, if we try to unload hv_vmbus and then load it back
crashes in different places of vmbus driver occur on both unload and second
load paths. Address those I saw in my testing.

Not everything is fixed though. MCE was hit once on Generation2 instance and
I neither understand what caused it nor do I know the way to reproduce it.
Anyway, here is the log:

[  204.846255] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0020001
[  204.846675] mce: [Hardware Error]: TSC 6b5cd64bc8 
[  204.846675] mce: [Hardware Error]: PROCESSOR 0:306e4 TIME 1421944123 SOCKET 0 APIC 0 microcode ffffffff
[  204.846675] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[  204.846675] mce: [Hardware Error]: Machine check: Processor context corrupt
[  204.846675] Kernel panic - not syncing: Fatal Machine check
[  204.846675] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[  204.846675] Rebooting in 30 seconds..
[  204.846675] ACPI MEMORY or I/O RESET_REG.

Vitaly Kuznetsov (3):
  Drivers: hv: vmbus: avoid double kfree for device_obj
  Drivers: hv: vmbus: introduce vmbus_acpi_remove
  Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
    vmbus_connection pages on shutdown

 drivers/hv/channel_mgmt.c |  1 -
 drivers/hv/connection.c   | 17 ++++++++++++-----
 drivers/hv/hyperv_vmbus.h |  1 +
 drivers/hv/vmbus_drv.c    | 16 ++++++++++++++++
 4 files changed, 29 insertions(+), 6 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
@ 2015-01-21 19:02 ` Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

On driver shutdown device_obj is being freed twice:
1) In vmbus_free_channels()
2) vmbus_device_release() (which is being triggered by device_unregister() in
   vmbus_device_unregister().
This double kfree leads to the following sporadic crash on driver unload:

[   23.469876] general protection fault: 0000 [#1] SMP
[   23.470036] Modules linked in: hv_vmbus(-)
[   23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[   23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[   23.470036] RIP: 0010:[<ffffffff811d2e1b>]  [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[   23.470036] RSP: 0018:ffff880036cebcc8  EFLAGS: 00010246
...

When this crash does not happen on driver unload the similar one is expected if
we try to load hv_vmbus again.

Remove kfree from vmbus_free_channels() as freeing it from
vmbus_device_release() seems right.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/channel_mgmt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 877a944..0141a3d 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -259,7 +259,6 @@ void vmbus_free_channels(void)
 
 	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
 		vmbus_device_unregister(channel->device_obj);
-		kfree(channel->device_obj);
 		free_channel(channel);
 	}
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
@ 2015-01-21 19:02 ` Vitaly Kuznetsov
  2015-01-22 19:08   ` KY Srinivasan
  2015-01-21 19:02 ` [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown Vitaly Kuznetsov
  2015-01-26 13:41 ` [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  3 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

In case we do request_resource() in vmbus_acpi_add() we need to tear it down
to be able to load the driver again. Otherwise the following crash in oberved
when hv_vmbus unload/load sequence is performed on Generation2 instance:

[   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
[   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
[   38.166315] Oops: 0000 [#1] SMP
[   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
[   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486
[   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti: ffff88003f60c000
[   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
...

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/vmbus_drv.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 4d6b269..b06cb87 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -902,6 +902,15 @@ acpi_walk_err:
 	return ret_val;
 }
 
+static int vmbus_acpi_remove(struct acpi_device *device)
+{
+	int ret = 0;
+
+	if (hyperv_mmio.start && hyperv_mmio.end)
+		ret = release_resource(&hyperv_mmio);
+	return ret;
+}
+
 static const struct acpi_device_id vmbus_acpi_device_ids[] = {
 	{"VMBUS", 0},
 	{"VMBus", 0},
@@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
 	.ids = vmbus_acpi_device_ids,
 	.ops = {
 		.add = vmbus_acpi_add,
+		.remove = vmbus_acpi_remove,
 	},
 };
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
@ 2015-01-21 19:02 ` Vitaly Kuznetsov
  2015-01-26 13:41 ` [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  3 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:

[   76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[   82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[   82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[   82.599287] Oops: 0010 [#1] SMP
[   82.599287] Modules linked in: [last unloaded: hv_vmbus]
[   82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[   82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[   82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[   82.599287] RIP: 0010:[<ffffffffa0003480>]  [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] RSP: 0018:ffff88007f8fbe00  EFLAGS: 00010202
...

To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement vmbus_disconnect() function by separating cleanup
path from vmbus_connect().

As we use hv_vmbus_con to release channels (see free_channel() in channel_mgmt.c)
we need to make sure the work was done before we remove the queue, do that with
drain_workqueue(). We also need to avoid handling messages  which can (potentially)
create new channels, so set vmbus_connection.conn_state = DISCONNECTED at the very
beginning of vmbus_exit() and check for that in vmbus_onmessage_work().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/connection.c   | 17 ++++++++++++-----
 drivers/hv/hyperv_vmbus.h |  1 +
 drivers/hv/vmbus_drv.c    |  6 ++++++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index e206619..2fce0c7 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -214,10 +214,21 @@ int vmbus_connect(void)
 
 cleanup:
 	pr_err("Unable to connect to host\n");
+
 	vmbus_connection.conn_state = DISCONNECTED;
+	vmbus_disconnect();
+
+	kfree(msginfo);
+
+	return ret;
+}
 
-	if (vmbus_connection.work_queue)
+void vmbus_disconnect(void)
+{
+	if (vmbus_connection.work_queue) {
+		drain_workqueue(vmbus_connection.work_queue);
 		destroy_workqueue(vmbus_connection.work_queue);
+	}
 
 	if (vmbus_connection.int_page) {
 		free_pages((unsigned long)vmbus_connection.int_page, 0);
@@ -228,10 +239,6 @@ cleanup:
 	free_pages((unsigned long)vmbus_connection.monitor_pages[1], 0);
 	vmbus_connection.monitor_pages[0] = NULL;
 	vmbus_connection.monitor_pages[1] = NULL;
-
-	kfree(msginfo);
-
-	return ret;
 }
 
 /*
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index c386d8d..e01c13e 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -671,6 +671,7 @@ void vmbus_free_channels(void);
 /* Connection interface */
 
 int vmbus_connect(void);
+void vmbus_disconnect(void);
 
 int vmbus_post_msg(void *buffer, size_t buflen);
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b06cb87..e647505 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -572,6 +572,10 @@ static void vmbus_onmessage_work(struct work_struct *work)
 {
 	struct onmessage_work_context *ctx;
 
+	/* Do not process messages if we're in DISCONNECTED state */
+	if (vmbus_connection.conn_state == DISCONNECTED)
+		return;
+
 	ctx = container_of(work, struct onmessage_work_context,
 			   work);
 	vmbus_onmessage(&ctx->msg);
@@ -969,11 +973,13 @@ cleanup:
 
 static void __exit vmbus_exit(void)
 {
+	vmbus_connection.conn_state = DISCONNECTED;
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
+	vmbus_disconnect();
 }
 
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
@ 2015-01-22 19:08   ` KY Srinivasan
  2015-01-23 10:01     ` Vitaly Kuznetsov
  2015-04-10 12:47     ` Vitaly Kuznetsov
  0 siblings, 2 replies; 9+ messages in thread
From: KY Srinivasan @ 2015-01-22 19:08 UTC (permalink / raw)
  To: Vitaly Kuznetsov, devel
  Cc: Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Wednesday, January 21, 2015 11:02 AM
> To: KY Srinivasan; devel@linuxdriverproject.org
> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
> 
> In case we do request_resource() in vmbus_acpi_add() we need to tear it
> down to be able to load the driver again. Otherwise the following crash in
> oberved when hv_vmbus unload/load sequence is performed on
> Generation2 instance:
> 
> [   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
> [   38.166315] Oops: 0000 [#1] SMP
> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
> rc5_bug923184+ #486
> [   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual
> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
> ffff88003f60c000
> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
> __request_resource+0x2f/0x50
> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
> ...
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  drivers/hv/vmbus_drv.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
> 4d6b269..b06cb87 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -902,6 +902,15 @@ acpi_walk_err:
>  	return ret_val;
>  }
> 
> +static int vmbus_acpi_remove(struct acpi_device *device) {
> +	int ret = 0;
> +
> +	if (hyperv_mmio.start && hyperv_mmio.end)
> +		ret = release_resource(&hyperv_mmio);
> +	return ret;
> +}
> +
>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
>  	{"VMBUS", 0},
>  	{"VMBus", 0},
> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
>  	.ids = vmbus_acpi_device_ids,
>  	.ops = {
>  		.add = vmbus_acpi_add,
> +		.remove = vmbus_acpi_remove,
>  	},
>  };

Vitaly,

Jake has sent the following patch that has fixed retrieving of the mmio resources:
https://lkml.org/lkml/2015/1/20/876

This patch also deals with the resource cleanup that you have in this patch.

Regards,

K. Y
> 
> --
> 1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-22 19:08   ` KY Srinivasan
@ 2015-01-23 10:01     ` Vitaly Kuznetsov
  2015-04-10 12:47     ` Vitaly Kuznetsov
  1 sibling, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-23 10:01 UTC (permalink / raw)
  To: KY Srinivasan; +Cc: devel, Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins

KY Srinivasan <kys@microsoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Wednesday, January 21, 2015 11:02 AM
>> To: KY Srinivasan; devel@linuxdriverproject.org
>> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
>> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
>> 
>> In case we do request_resource() in vmbus_acpi_add() we need to tear it
>> down to be able to load the driver again. Otherwise the following crash in
>> oberved when hv_vmbus unload/load sequence is performed on
>> Generation2 instance:
>> 
>> [   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
>> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
>> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
>> [   38.166315] Oops: 0000 [#1] SMP
>> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
>> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
>> rc5_bug923184+ #486
>> [   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual
>> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
>> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
>> ffff88003f60c000
>> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
>> __request_resource+0x2f/0x50
>> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
>> ...
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  drivers/hv/vmbus_drv.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
>> 4d6b269..b06cb87 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -902,6 +902,15 @@ acpi_walk_err:
>>  	return ret_val;
>>  }
>> 
>> +static int vmbus_acpi_remove(struct acpi_device *device) {
>> +	int ret = 0;
>> +
>> +	if (hyperv_mmio.start && hyperv_mmio.end)
>> +		ret = release_resource(&hyperv_mmio);
>> +	return ret;
>> +}
>> +
>>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
>>  	{"VMBUS", 0},
>>  	{"VMBus", 0},
>> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
>>  	.ids = vmbus_acpi_device_ids,
>>  	.ops = {
>>  		.add = vmbus_acpi_add,
>> +		.remove = vmbus_acpi_remove,
>>  	},
>>  };
>
> Vitaly,
>
> Jake has sent the following patch that has fixed retrieving of the mmio resources:
> https://lkml.org/lkml/2015/1/20/876
>
> This patch also deals with the resource cleanup that you have in this
> patch.

Ah, I see, I'm stepping on his toes here :) We can just drop this patch
from this series if Jake's get accepted (and I think it should).

>
> Regards,
>
> K. Y
>> 
>> --
>> 1.9.3

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2015-01-21 19:02 ` [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown Vitaly Kuznetsov
@ 2015-01-26 13:41 ` Vitaly Kuznetsov
  3 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-26 13:41 UTC (permalink / raw)
  To: K. Y. Srinivasan; +Cc: devel, Haiyang Zhang, linux-kernel

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> It is possible (since 93e5bd06a953: "Drivers: hv: Make the vmbus driver
> unloadable") to unload hv_vmbus driver if no other devices are connected.
> 1aec169673d7: "x86: Hyperv: Cleanup the irq mess" fixed doulble interrupt
> gate setup. However, if we try to unload hv_vmbus and then load it back
> crashes in different places of vmbus driver occur on both unload and second
> load paths. Address those I saw in my testing.

It seems that newly introduced clockevent device (Drivers: hv: vmbus:
Implement a clockevent device) makes it impossible to unload hv_vmbus
module:

# rmmod hv_vmbus
rmmod hv_vmbus
rmmod: ERROR: Module hv_vmbus is in use

I'll try investigating before sending v2 without PATCH 2/3.

>
> Not everything is fixed though. MCE was hit once on Generation2 instance and
> I neither understand what caused it nor do I know the way to reproduce it.
> Anyway, here is the log:
>
> [  204.846255] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0020001
> [  204.846675] mce: [Hardware Error]: TSC 6b5cd64bc8 
> [  204.846675] mce: [Hardware Error]: PROCESSOR 0:306e4 TIME 1421944123 SOCKET 0 APIC 0 microcode ffffffff
> [  204.846675] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> [  204.846675] mce: [Hardware Error]: Machine check: Processor context corrupt
> [  204.846675] Kernel panic - not syncing: Fatal Machine check
> [  204.846675] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
> [  204.846675] Rebooting in 30 seconds..
> [  204.846675] ACPI MEMORY or I/O RESET_REG.
>
> Vitaly Kuznetsov (3):
>   Drivers: hv: vmbus: avoid double kfree for device_obj
>   Drivers: hv: vmbus: introduce vmbus_acpi_remove
>   Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
>     vmbus_connection pages on shutdown
>
>  drivers/hv/channel_mgmt.c |  1 -
>  drivers/hv/connection.c   | 17 ++++++++++++-----
>  drivers/hv/hyperv_vmbus.h |  1 +
>  drivers/hv/vmbus_drv.c    | 16 ++++++++++++++++
>  4 files changed, 29 insertions(+), 6 deletions(-)

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-22 19:08   ` KY Srinivasan
  2015-01-23 10:01     ` Vitaly Kuznetsov
@ 2015-04-10 12:47     ` Vitaly Kuznetsov
  2015-04-10 14:50       ` KY Srinivasan
  1 sibling, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-04-10 12:47 UTC (permalink / raw)
  To: KY Srinivasan; +Cc: devel, Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins

KY Srinivasan <kys@microsoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Wednesday, January 21, 2015 11:02 AM
>> To: KY Srinivasan; devel@linuxdriverproject.org
>> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
>> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
>> 
>> In case we do request_resource() in vmbus_acpi_add() we need to tear it
>> down to be able to load the driver again. Otherwise the following crash in
>> oberved when hv_vmbus unload/load sequence is performed on
>> Generation2 instance:
>> 
>> [   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
>> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
>> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
>> [   38.166315] Oops: 0000 [#1] SMP
>> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
>> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
>> rc5_bug923184+ #486
>> [   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual
>> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
>> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
>> ffff88003f60c000
>> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
>> __request_resource+0x2f/0x50
>> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
>> ...
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  drivers/hv/vmbus_drv.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
>> 4d6b269..b06cb87 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -902,6 +902,15 @@ acpi_walk_err:
>>  	return ret_val;
>>  }
>> 
>> +static int vmbus_acpi_remove(struct acpi_device *device) {
>> +	int ret = 0;
>> +
>> +	if (hyperv_mmio.start && hyperv_mmio.end)
>> +		ret = release_resource(&hyperv_mmio);
>> +	return ret;
>> +}
>> +
>>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
>>  	{"VMBUS", 0},
>>  	{"VMBus", 0},
>> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
>>  	.ids = vmbus_acpi_device_ids,
>>  	.ops = {
>>  		.add = vmbus_acpi_add,
>> +		.remove = vmbus_acpi_remove,
>>  	},
>>  };
>
> Vitaly,
>
> Jake has sent the following patch that has fixed retrieving of the mmio resources:
> https://lkml.org/lkml/2015/1/20/876
>
> This patch also deals with the resource cleanup that you have in this patch.
>

Hi K.Y.,

it looks like Jake was forced to redo his work, do you think it would
make sense to have this simple fix for mainline in the meantime? I can
rebase/resend in case you do.

Thanks,

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-04-10 12:47     ` Vitaly Kuznetsov
@ 2015-04-10 14:50       ` KY Srinivasan
  0 siblings, 0 replies; 9+ messages in thread
From: KY Srinivasan @ 2015-04-10 14:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: devel, Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Friday, April 10, 2015 5:48 AM
> To: KY Srinivasan
> Cc: devel@linuxdriverproject.org; Haiyang Zhang; linux-
> kernel@vger.kernel.org; Dexuan Cui; Jake Oshins
> Subject: Re: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
> 
> KY Srinivasan <kys@microsoft.com> writes:
> 
> >> -----Original Message-----
> >> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> >> Sent: Wednesday, January 21, 2015 11:02 AM
> >> To: KY Srinivasan; devel@linuxdriverproject.org
> >> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
> >> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
> >>
> >> In case we do request_resource() in vmbus_acpi_add() we need to tear it
> >> down to be able to load the driver again. Otherwise the following crash in
> >> oberved when hv_vmbus unload/load sequence is performed on
> >> Generation2 instance:
> >>
> >> [   38.165701] BUG: unable to handle kernel paging request at
> ffffffffa00075a0
> >> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
> >> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
> >> [   38.166315] Oops: 0000 [#1] SMP
> >> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
> >> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
> >> rc5_bug923184+ #486
> >> [   38.166315] Hardware name: Microsoft Corporation Virtual
> Machine/Virtual
> >> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
> >> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
> >> ffff88003f60c000
> >> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
> >> __request_resource+0x2f/0x50
> >> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
> >> ...
> >>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  drivers/hv/vmbus_drv.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
> >> 4d6b269..b06cb87 100644
> >> --- a/drivers/hv/vmbus_drv.c
> >> +++ b/drivers/hv/vmbus_drv.c
> >> @@ -902,6 +902,15 @@ acpi_walk_err:
> >>  	return ret_val;
> >>  }
> >>
> >> +static int vmbus_acpi_remove(struct acpi_device *device) {
> >> +	int ret = 0;
> >> +
> >> +	if (hyperv_mmio.start && hyperv_mmio.end)
> >> +		ret = release_resource(&hyperv_mmio);
> >> +	return ret;
> >> +}
> >> +
> >>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
> >>  	{"VMBUS", 0},
> >>  	{"VMBus", 0},
> >> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
> >>  	.ids = vmbus_acpi_device_ids,
> >>  	.ops = {
> >>  		.add = vmbus_acpi_add,
> >> +		.remove = vmbus_acpi_remove,
> >>  	},
> >>  };
> >
> > Vitaly,
> >
> > Jake has sent the following patch that has fixed retrieving of the mmio
> resources:
> > https://lkml.org/lkml/2015/1/20/876
> >
> > This patch also deals with the resource cleanup that you have in this patch.
> >
> 
> Hi K.Y.,
> 
> it looks like Jake was forced to redo his work, do you think it would
> make sense to have this simple fix for mainline in the meantime? I can
> rebase/resend in case you do.

Please do. Thanks.

K. Y
> 
> Thanks,
> 
> --
>   Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-04-10 14:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
2015-01-22 19:08   ` KY Srinivasan
2015-01-23 10:01     ` Vitaly Kuznetsov
2015-04-10 12:47     ` Vitaly Kuznetsov
2015-04-10 14:50       ` KY Srinivasan
2015-01-21 19:02 ` [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown Vitaly Kuznetsov
2015-01-26 13:41 ` [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.