All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
@ 2016-11-25 12:48 Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels() Vitaly Kuznetsov
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
if you boot with all CPUs onlined. Turns out, people may want to limit the
number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
crash in Hyper-V code in this case. After some thinking, I think we may not
only fix the crash but also make the offlining prevention fine-grained: we
need to prevent from offlining CPUs which have VMBus channels attached
only. All offlined CPUs may always be onlined.

PATCH1 fixes a bug which is not directly related to the series, I hit it
while testing hv_vmbus module unload with this series.

Vitaly Kuznetsov (7):
  hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
  hv: allocate synic pages for all present CPUs
  hv: init percpu_list in hv_synic_alloc()
  hv: change clockevents unbind tactics
  hv: check all present cpus in vmbus_wait_for_unload()
  hv: switch to cpuhp state machine for synic init/cleanup
  hv: make CPU offlining prevention fine-grained

 drivers/hv/channel_mgmt.c |  6 +++--
 drivers/hv/hv.c           | 60 ++++++++++++++++++++++++++++++++++++-----------
 drivers/hv/hyperv_vmbus.h |  4 ++--
 drivers/hv/vmbus_drv.c    | 28 ++++++++++++----------
 4 files changed, 67 insertions(+), 31 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-28  6:20   ` Dexuan Cui
  2016-11-25 12:48 ` [PATCH 2/7] hv: allocate synic pages for all present CPUs Vitaly Kuznetsov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

"kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
module is unloaded. BUG_ON() was introduced in commit 85d9aa705184
("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
vmbus_free_channels() codepath was apparently forgotten.

Fixes: 85d9aa705184 ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/channel_mgmt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 96a85cd..1bc1d479 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -389,6 +389,7 @@ void vmbus_free_channels(void)
 {
 	struct vmbus_channel *channel, *tmp;
 
+	mutex_lock(&vmbus_connection.channel_mutex);
 	list_for_each_entry_safe(channel, tmp, &vmbus_connection.chn_list,
 		listentry) {
 		/* hv_process_channel_removal() needs this */
@@ -396,6 +397,7 @@ void vmbus_free_channels(void)
 
 		vmbus_device_unregister(channel->device_obj);
 	}
+	mutex_unlock(&vmbus_connection.channel_mutex);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/7] hv: allocate synic pages for all present CPUs
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels() Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 3/7] hv: init percpu_list in hv_synic_alloc() Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

It may happen that not all CPUs are online when we do hv_synic_alloc() and
in case more CPUs come online later we may try accessing these allocated
structures.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/hv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 60dbd6c..e4bb498 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -411,7 +411,7 @@ int hv_synic_alloc(void)
 		goto err;
 	}
 
-	for_each_online_cpu(cpu) {
+	for_each_present_cpu(cpu) {
 		hv_context.event_dpc[cpu] = kmalloc(size, GFP_ATOMIC);
 		if (hv_context.event_dpc[cpu] == NULL) {
 			pr_err("Unable to allocate event dpc\n");
@@ -482,7 +482,7 @@ void hv_synic_free(void)
 	int cpu;
 
 	kfree(hv_context.hv_numa_map);
-	for_each_online_cpu(cpu)
+	for_each_present_cpu(cpu)
 		hv_synic_free_cpu(cpu);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/7] hv: init percpu_list in hv_synic_alloc()
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels() Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 2/7] hv: allocate synic pages for all present CPUs Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 4/7] hv: change clockevents unbind tactics Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

Initializing hv_context.percpu_list in hv_synic_alloc() helps to prevent a
crash in percpu_channel_enq() when not all CPUs were online during
initialization and it naturally belongs there.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/hv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index e4bb498..a2567a4 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -457,6 +457,8 @@ int hv_synic_alloc(void)
 			pr_err("Unable to allocate post msg page\n");
 			goto err;
 		}
+
+		INIT_LIST_HEAD(&hv_context.percpu_list[cpu]);
 	}
 
 	return 0;
@@ -552,8 +554,6 @@ void hv_synic_init(void *arg)
 	rdmsrl(HV_X64_MSR_VP_INDEX, vp_index);
 	hv_context.vp_index[cpu] = (u32)vp_index;
 
-	INIT_LIST_HEAD(&hv_context.percpu_list[cpu]);
-
 	/*
 	 * Register the per-cpu clockevent source.
 	 */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/7] hv: change clockevents unbind tactics
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2016-11-25 12:48 ` [PATCH 3/7] hv: init percpu_list in hv_synic_alloc() Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 5/7] hv: check all present cpus in vmbus_wait_for_unload() Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

To get prepared to CPU offlining support we need co change the way how we
unbind clockevent devices. As one CPU may go online/offline multiple times
we need to bind it in hv_synic_init() and unbind it in hv_synic_cleanup().
The is an additional corner case: when we unload the module completely we
need to switch to some other clockevent mechanism before stopping VMBus or
we will hang. We can't call hv_synic_cleanup() before unloading VMBus as
we won't be able to send UNLOAD request and get a response so
hv_synic_clockevents_cleanup() has to live. Luckily, we can always call
clockevents_unbind_device(), even if it wasn't bound before and there is
no issue if we call it twice.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/hv.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index a2567a4..c11393c 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -575,7 +575,7 @@ void hv_synic_clockevents_cleanup(void)
 	if (!(ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE))
 		return;
 
-	for_each_online_cpu(cpu)
+	for_each_present_cpu(cpu)
 		clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
 }
 
@@ -594,8 +594,10 @@ void hv_synic_cleanup(void *arg)
 		return;
 
 	/* Turn off clockevent device */
-	if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE)
+	if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) {
+		clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
 		hv_ce_shutdown(hv_context.clk_evt[cpu]);
+	}
 
 	rdmsrl(HV_X64_MSR_SINT0 + VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/7] hv: check all present cpus in vmbus_wait_for_unload()
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2016-11-25 12:48 ` [PATCH 4/7] hv: change clockevents unbind tactics Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 6/7] hv: switch to cpuhp state machine for synic init/cleanup Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

It should never happen, but let's get prepared to receiving a confirmation
for VMBus unload on an offlined CPU. As we allocate all structures for all
present CPUs now it's safe.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/channel_mgmt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 1bc1d479..5011c95 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -674,7 +674,7 @@ static void vmbus_wait_for_unload(void)
 		if (completion_done(&vmbus_connection.unload_event))
 			break;
 
-		for_each_online_cpu(cpu) {
+		for_each_present_cpu(cpu) {
 			page_addr = hv_context.synic_message_page[cpu];
 			msg = (struct hv_message *)page_addr +
 				VMBUS_MESSAGE_SINT;
@@ -700,7 +700,7 @@ static void vmbus_wait_for_unload(void)
 	 * maybe-pending messages on all CPUs to be able to receive new
 	 * messages after we reconnect.
 	 */
-	for_each_online_cpu(cpu) {
+	for_each_present_cpu(cpu) {
 		page_addr = hv_context.synic_message_page[cpu];
 		msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
 		msg->header.message_type = HVMSG_NONE;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/7] hv: switch to cpuhp state machine for synic init/cleanup
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2016-11-25 12:48 ` [PATCH 5/7] hv: check all present cpus in vmbus_wait_for_unload() Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-25 12:48 ` [PATCH 7/7] hv: make CPU offlining prevention fine-grained Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

To make it possible to online/offline CPUs switch to cpuhp infrastructure
for doing hv_synic_init()/hv_synic_cleanup().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/hv.c           | 15 +++++++--------
 drivers/hv/hyperv_vmbus.h |  4 ++--
 drivers/hv/vmbus_drv.c    | 19 +++++++++++--------
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index c11393c..b3f6a1b 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -495,7 +495,7 @@ void hv_synic_free(void)
  * retrieve the initialized message and event pages.  Otherwise, we create and
  * initialize the message and event pages.
  */
-void hv_synic_init(void *arg)
+int hv_synic_init(unsigned int cpu)
 {
 	u64 version;
 	union hv_synic_simp simp;
@@ -504,10 +504,8 @@ void hv_synic_init(void *arg)
 	union hv_synic_scontrol sctrl;
 	u64 vp_index;
 
-	int cpu = smp_processor_id();
-
 	if (!hv_context.hypercall_page)
-		return;
+		return -EFAULT;
 
 	/* Check the version */
 	rdmsrl(HV_X64_MSR_SVERSION, version);
@@ -562,7 +560,7 @@ void hv_synic_init(void *arg)
 						HV_TIMER_FREQUENCY,
 						HV_MIN_DELTA_TICKS,
 						HV_MAX_MAX_DELTA_TICKS);
-	return;
+	return 0;
 }
 
 /*
@@ -582,16 +580,15 @@ void hv_synic_clockevents_cleanup(void)
 /*
  * hv_synic_cleanup - Cleanup routine for hv_synic_init().
  */
-void hv_synic_cleanup(void *arg)
+int hv_synic_cleanup(unsigned int cpu)
 {
 	union hv_synic_sint shared_sint;
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
 	union hv_synic_scontrol sctrl;
-	int cpu = smp_processor_id();
 
 	if (!hv_context.synic_initialized)
-		return;
+		return -EFAULT;
 
 	/* Turn off clockevent device */
 	if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) {
@@ -623,4 +620,6 @@ void hv_synic_cleanup(void *arg)
 	rdmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
 	sctrl.enable = 0;
 	wrmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
+
+	return 0;
 }
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index a5b4442..d50d56c 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -505,9 +505,9 @@ extern int hv_synic_alloc(void);
 
 extern void hv_synic_free(void);
 
-extern void hv_synic_init(void *irqarg);
+extern int hv_synic_init(unsigned int cpu);
 
-extern void hv_synic_cleanup(void *arg);
+extern int hv_synic_cleanup(unsigned int cpu);
 
 extern void hv_synic_clockevents_cleanup(void);
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 0276d2e..a115c90 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -49,6 +49,7 @@ static struct acpi_device  *hv_acpi_dev;
 
 static struct completion probe_event;
 
+static int hyperv_cpuhp_online;
 
 static void hyperv_report_panic(struct pt_regs *regs)
 {
@@ -844,7 +845,12 @@ static int vmbus_bus_init(void)
 	 * Initialize the per-cpu interrupt state and
 	 * connect to the host.
 	 */
-	on_each_cpu(hv_synic_init, NULL, 1);
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv:online",
+				hv_synic_init, hv_synic_cleanup);
+	if (ret < 0)
+		goto err_alloc;
+	hyperv_cpuhp_online = ret;
+
 	ret = vmbus_connect();
 	if (ret)
 		goto err_connect;
@@ -866,7 +872,7 @@ static int vmbus_bus_init(void)
 	return 0;
 
 err_connect:
-	on_each_cpu(hv_synic_cleanup, NULL, 1);
+	cpuhp_remove_state(hyperv_cpuhp_online);
 err_alloc:
 	hv_synic_free();
 	hv_remove_vmbus_irq();
@@ -1320,12 +1326,9 @@ static struct acpi_driver vmbus_acpi_driver = {
 
 static void hv_kexec_handler(void)
 {
-	int cpu;
-
 	hv_synic_clockevents_cleanup();
 	vmbus_initiate_unload(false);
-	for_each_online_cpu(cpu)
-		smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
+	cpuhp_remove_state(hyperv_cpuhp_online);
 	hv_cleanup(false);
 };
 
@@ -1337,7 +1340,7 @@ static void hv_crash_handler(struct pt_regs *regs)
 	 * doing the cleanup for current CPU only. This should be sufficient
 	 * for kdump.
 	 */
-	hv_synic_cleanup(NULL);
+	hv_synic_cleanup(smp_processor_id());
 	hv_cleanup(true);
 };
 
@@ -1401,8 +1404,8 @@ static void __exit vmbus_exit(void)
 	hv_cleanup(false);
 	for_each_online_cpu(cpu) {
 		tasklet_kill(hv_context.event_dpc[cpu]);
-		smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
 	}
+	cpuhp_remove_state(hyperv_cpuhp_online);
 	hv_synic_free();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
 	if (vmbus_proto_version > VERSION_WIN7)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/7] hv: make CPU offlining prevention fine-grained
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2016-11-25 12:48 ` [PATCH 6/7] hv: switch to cpuhp state machine for synic init/cleanup Vitaly Kuznetsov
@ 2016-11-25 12:48 ` Vitaly Kuznetsov
  2016-11-26 17:05 ` [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Stephen Hemminger
  2017-01-02 19:49 ` Vitaly Kuznetsov
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-25 12:48 UTC (permalink / raw)
  To: devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui,
	Stephen Hemminger

Since commit e513229b4c38 ("Drivers: hv: vmbus: prevent cpu offlining on
newer hypervisors") cpu offlining was disabled. It is still true that we
can't offline CPUs which have VMBus channels bound to them but we may have
'free' CPUs (e.v. we booted with maxcpus= parameter and onlined CPUs after
VMBus was initialized), these CPUs may be disabled without issues.

In future, we may even allow closing CPUs which have only sub-channels
assinged to them by closing these sub-channels. All devices will continue
to work.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/hv.c        | 31 +++++++++++++++++++++++++++++++
 drivers/hv/vmbus_drv.c |  9 ++++-----
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index b3f6a1b..4ece040 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -586,10 +586,41 @@ int hv_synic_cleanup(unsigned int cpu)
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
 	union hv_synic_scontrol sctrl;
+	struct vmbus_channel *channel, *sc;
+	bool channel_found = false;
+	unsigned long flags;
 
 	if (!hv_context.synic_initialized)
 		return -EFAULT;
 
+	/*
+	 * Search for channels which are bound to the CPU we're about to
+	 * cleanup. In case we find one and vmbus is still connected we need to
+	 * fail, this will effectively prevent CPU offlining. There is no way
+	 * we can re-bind channels to different CPUs for now.
+	 */
+	mutex_lock(&vmbus_connection.channel_mutex);
+	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
+		if (channel->target_cpu == cpu) {
+			channel_found = true;
+			break;
+		}
+		spin_lock_irqsave(&channel->lock, flags);
+		list_for_each_entry(sc, &channel->sc_list, sc_list) {
+			if (sc->target_cpu == cpu) {
+				channel_found = true;
+				break;
+			}
+		}
+		spin_unlock_irqrestore(&channel->lock, flags);
+		if (channel_found)
+			break;
+	}
+	mutex_unlock(&vmbus_connection.channel_mutex);
+
+	if (channel_found && vmbus_connection.conn_state == CONNECTED)
+		return -EBUSY;
+
 	/* Turn off clockevent device */
 	if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) {
 		clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index a115c90..61d879f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -855,9 +855,6 @@ static int vmbus_bus_init(void)
 	if (ret)
 		goto err_connect;
 
-	if (vmbus_proto_version > VERSION_WIN7)
-		cpu_hotplug_disable();
-
 	/*
 	 * Only register if the crash MSRs are available
 	 */
@@ -1328,6 +1325,9 @@ static void hv_kexec_handler(void)
 {
 	hv_synic_clockevents_cleanup();
 	vmbus_initiate_unload(false);
+	vmbus_connection.conn_state = DISCONNECTED;
+	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
+	mb();
 	cpuhp_remove_state(hyperv_cpuhp_online);
 	hv_cleanup(false);
 };
@@ -1340,6 +1340,7 @@ static void hv_crash_handler(struct pt_regs *regs)
 	 * doing the cleanup for current CPU only. This should be sufficient
 	 * for kdump.
 	 */
+	vmbus_connection.conn_state = DISCONNECTED;
 	hv_synic_cleanup(smp_processor_id());
 	hv_cleanup(true);
 };
@@ -1408,8 +1409,6 @@ static void __exit vmbus_exit(void)
 	cpuhp_remove_state(hyperv_cpuhp_online);
 	hv_synic_free();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
-	if (vmbus_proto_version > VERSION_WIN7)
-		cpu_hotplug_enable();
 }
 
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2016-11-25 12:48 ` [PATCH 7/7] hv: make CPU offlining prevention fine-grained Vitaly Kuznetsov
@ 2016-11-26 17:05 ` Stephen Hemminger
  2016-11-28  6:56   ` Dexuan Cui
  2017-01-02 19:49 ` Vitaly Kuznetsov
  8 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2016-11-26 17:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: devel, linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Dexuan Cui

On Fri, 25 Nov 2016 13:48:36 +0100
Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
> if you boot with all CPUs onlined. Turns out, people may want to limit the
> number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
> crash in Hyper-V code in this case. After some thinking, I think we may not
> only fix the crash but also make the offlining prevention fine-grained: we
> need to prevent from offlining CPUs which have VMBus channels attached
> only. All offlined CPUs may always be onlined.
> 
> PATCH1 fixes a bug which is not directly related to the series, I hit it
> while testing hv_vmbus module unload with this series.
> 
> Vitaly Kuznetsov (7):
>   hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
>   hv: allocate synic pages for all present CPUs
>   hv: init percpu_list in hv_synic_alloc()
>   hv: change clockevents unbind tactics
>   hv: check all present cpus in vmbus_wait_for_unload()
>   hv: switch to cpuhp state machine for synic init/cleanup
>   hv: make CPU offlining prevention fine-grained
> 
>  drivers/hv/channel_mgmt.c |  6 +++--
>  drivers/hv/hv.c           | 60 ++++++++++++++++++++++++++++++++++++-----------
>  drivers/hv/hyperv_vmbus.h |  4 ++--
>  drivers/hv/vmbus_drv.c    | 28 ++++++++++++----------
>  4 files changed, 67 insertions(+), 31 deletions(-)
> 

As a temporary solution this is ok, but long term we need to support
dynamic CPU online/offline.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
  2016-11-25 12:48 ` [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels() Vitaly Kuznetsov
@ 2016-11-28  6:20   ` Dexuan Cui
  0 siblings, 0 replies; 13+ messages in thread
From: Dexuan Cui @ 2016-11-28  6:20 UTC (permalink / raw)
  To: Vitaly Kuznetsov, devel
  Cc: linux-kernel, KY Srinivasan, Haiyang Zhang, Stephen Hemminger

> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Friday, November 25, 2016 20:49
> To: devel@linuxdriverproject.org
> Cc: linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang
> Zhang <haiyangz@microsoft.com>; Dexuan Cui <decui@microsoft.com>;
> Stephen Hemminger <stephen@networkplumber.org>
> Subject: [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in
> vmbus_free_channels()
> 
> "kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
> module is unloaded. BUG_ON() was introduced in commit 85d9aa705184
> ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
> vmbus_free_channels() codepath was apparently forgotten.
> 
> Fixes: 85d9aa705184 ("Drivers: hv: vmbus: add an API
> vmbus_hvsock_device_unregister()")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Reviewed-by: Dexuan Cui <decui@microsoft.com>

Thanks for the fix!

-- Dexuan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
  2016-11-26 17:05 ` [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Stephen Hemminger
@ 2016-11-28  6:56   ` Dexuan Cui
  2016-11-28  9:12     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 13+ messages in thread
From: Dexuan Cui @ 2016-11-28  6:56 UTC (permalink / raw)
  To: Stephen Hemminger, Vitaly Kuznetsov
  Cc: devel, linux-kernel, KY Srinivasan, Haiyang Zhang

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Sunday, November 27, 2016 01:06
> To: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Dexuan Cui
> <decui@microsoft.com>
> Subject: Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
> 
> On Fri, 25 Nov 2016 13:48:36 +0100
> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> 
> > Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
> > if you boot with all CPUs onlined. Turns out, people may want to limit the
> > number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
> > crash in Hyper-V code in this case. After some thinking, I think we may not
> > only fix the crash but also make the offlining prevention fine-grained: we
> > need to prevent from offlining CPUs which have VMBus channels attached
> > only. All offlined CPUs may always be onlined.
> >
> 
> As a temporary solution this is ok, but long term we need to support
> dynamic CPU online/offline.

If a CPU is bound to some channels, it seems impossible to make it offline,
unless Hyper-V supplies a mechanism to dynamically rebind  the channels (i.e.
without closing & opening the channels) to another CPU, e.g. CPU0.
Currently it looks there is no such mechanism.

For CPU online, my understanding is: this patchset of Vitaly actually makes it
work (see PATCH 6/7).

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
  2016-11-28  6:56   ` Dexuan Cui
@ 2016-11-28  9:12     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-28  9:12 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Stephen Hemminger, devel, linux-kernel, KY Srinivasan, Haiyang Zhang

Dexuan Cui <decui@microsoft.com> writes:

>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>> Sent: Sunday, November 27, 2016 01:06
>> To: Vitaly Kuznetsov <vkuznets@redhat.com>
>> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; KY Srinivasan
>> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Dexuan Cui
>> <decui@microsoft.com>
>> Subject: Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
>> 
>> On Fri, 25 Nov 2016 13:48:36 +0100
>> Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> 
>> > Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
>> > if you boot with all CPUs onlined. Turns out, people may want to limit the
>> > number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
>> > crash in Hyper-V code in this case. After some thinking, I think we may not
>> > only fix the crash but also make the offlining prevention fine-grained: we
>> > need to prevent from offlining CPUs which have VMBus channels attached
>> > only. All offlined CPUs may always be onlined.
>> >
>> 
>> As a temporary solution this is ok, but long term we need to support
>> dynamic CPU online/offline.
>
> If a CPU is bound to some channels, it seems impossible to make it offline,
> unless Hyper-V supplies a mechanism to dynamically rebind  the channels (i.e.
> without closing & opening the channels) to another CPU, e.g. CPU0.
> Currently it looks there is no such mechanism.
>

Exactly. Offlining a CPU with an attached active VMBus channel means
killing the device and unless we're offereed an API to rebind it to some
other CPU there is no point in allowing that. We may, however, allow
offlining CPUs with sub-channels attached by closing these sub-channels,
not sure if we want to.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
  2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
                   ` (7 preceding siblings ...)
  2016-11-26 17:05 ` [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Stephen Hemminger
@ 2017-01-02 19:49 ` Vitaly Kuznetsov
  8 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2017-01-02 19:49 UTC (permalink / raw)
  To: K. Y. Srinivasan
  Cc: linux-kernel, devel, Haiyang Zhang, Dexuan Cui, Stephen Hemminger

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
> if you boot with all CPUs onlined. Turns out, people may want to limit the
> number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
> crash in Hyper-V code in this case. After some thinking, I think we may not
> only fix the crash but also make the offlining prevention fine-grained: we
> need to prevent from offlining CPUs which have VMBus channels attached
> only. All offlined CPUs may always be onlined.
>
> PATCH1 fixes a bug which is not directly related to the series, I hit it
> while testing hv_vmbus module unload with this series.
>
> Vitaly Kuznetsov (7):
>   hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
>   hv: allocate synic pages for all present CPUs
>   hv: init percpu_list in hv_synic_alloc()
>   hv: change clockevents unbind tactics
>   hv: check all present cpus in vmbus_wait_for_unload()
>   hv: switch to cpuhp state machine for synic init/cleanup
>   hv: make CPU offlining prevention fine-grained

K. Y., 

it seems that for some reason only patches 1 and 4 from this series made
it upstream (and to char-misc tree). Could you please resend the rest to
Greg? Please let me know if you want me to rebase/retest. Thanks!

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-01-02 19:50 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-25 12:48 [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Vitaly Kuznetsov
2016-11-25 12:48 ` [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels() Vitaly Kuznetsov
2016-11-28  6:20   ` Dexuan Cui
2016-11-25 12:48 ` [PATCH 2/7] hv: allocate synic pages for all present CPUs Vitaly Kuznetsov
2016-11-25 12:48 ` [PATCH 3/7] hv: init percpu_list in hv_synic_alloc() Vitaly Kuznetsov
2016-11-25 12:48 ` [PATCH 4/7] hv: change clockevents unbind tactics Vitaly Kuznetsov
2016-11-25 12:48 ` [PATCH 5/7] hv: check all present cpus in vmbus_wait_for_unload() Vitaly Kuznetsov
2016-11-25 12:48 ` [PATCH 6/7] hv: switch to cpuhp state machine for synic init/cleanup Vitaly Kuznetsov
2016-11-25 12:48 ` [PATCH 7/7] hv: make CPU offlining prevention fine-grained Vitaly Kuznetsov
2016-11-26 17:05 ` [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements Stephen Hemminger
2016-11-28  6:56   ` Dexuan Cui
2016-11-28  9:12     ` Vitaly Kuznetsov
2017-01-02 19:49 ` Vitaly Kuznetsov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.