linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
@ 2018-11-29  4:36 Dexuan Cui
  2018-11-29  7:44 ` gregkh
  0 siblings, 1 reply; 11+ messages in thread
From: Dexuan Cui @ 2018-11-29  4:36 UTC (permalink / raw)
  To: gregkh, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	linux-kernel, devel
  Cc: Michael Kelley, vkuznets, jasowang, olaf, apw


vmbus_process_offer() mustn't call channel->sc_creation_callback()
directly for sub-channels, because sc_creation_callback() ->
vmbus_open() may never get the host's response to the
OPEN_CHANNEL message (the host may rescind a channel at any time,
e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
may not wake up the vmbus_open() as it's blocked due to a non-zero
vmbus_connection.offer_in_progress, and finally we have a deadlock.

The above is also true for primary channels, if the related device
drivers use sync probing mode by default.

And, usually the handling of primary channels and sub-channels can
depend on each other, so we should offload them to different
workqueues to avoid possible deadlock, e.g. in sync-probing mode,
NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
and waits for all the sub-channels to appear, but the latter
can't get the rtnl_lock and this blocks the handling of sub-channels.

The patch can fix the multiple-NIC deadlock described above for
v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
but don't enable async-probing for Hyper-V drivers (yet).

The patch can also fix the hang issue in sub-channel's handling described
above for all versions of kernels, including v4.19 and v4.20-rc4.

So, actually the patch should be applied to all the existing kernels,
not only the kernels that have 8195b1396ec8.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---

There is no change in this repost. I just rebased this patch to today's
char-misc's char-misc-next branch. Previously KY posted the patch with his
Signed-off-by (which is kept in this repost), but there was a conflict issue.

Note: the patch can't be cleanly applied to char-misc's char-misc-linus branch --
to do that, we need to cherry-pick the supporting patch first:
4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API vmbus_get_outgoing_channel()")

 drivers/hv/channel_mgmt.c | 188 +++++++++++++++++++++++++++++++---------------
 drivers/hv/connection.c   |  24 +++++-
 drivers/hv/hyperv_vmbus.h |   7 ++
 include/linux/hyperv.h    |   7 ++
 4 files changed, 161 insertions(+), 65 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 82e6736..d016890 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -434,60 +434,16 @@ void vmbus_free_channels(void)
 	}
 }
 
-/*
- * vmbus_process_offer - Process the offer by creating a channel/device
- * associated with this offer
- */
-static void vmbus_process_offer(struct vmbus_channel *newchannel)
+/* Note: the function can run concurrently for primary/sub channels. */
+static void vmbus_add_channel_work(struct work_struct *work)
 {
-	struct vmbus_channel *channel;
-	bool fnew = true;
+	struct vmbus_channel *newchannel =
+		container_of(work, struct vmbus_channel, add_channel_work);
+	struct vmbus_channel *primary_channel = newchannel->primary_channel;
 	unsigned long flags;
 	u16 dev_type;
 	int ret;
 
-	/* Make sure this is a new offer */
-	mutex_lock(&vmbus_connection.channel_mutex);
-
-	/*
-	 * Now that we have acquired the channel_mutex,
-	 * we can release the potentially racing rescind thread.
-	 */
-	atomic_dec(&vmbus_connection.offer_in_progress);
-
-	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
-		if (!uuid_le_cmp(channel->offermsg.offer.if_type,
-			newchannel->offermsg.offer.if_type) &&
-			!uuid_le_cmp(channel->offermsg.offer.if_instance,
-				newchannel->offermsg.offer.if_instance)) {
-			fnew = false;
-			break;
-		}
-	}
-
-	if (fnew)
-		list_add_tail(&newchannel->listentry,
-			      &vmbus_connection.chn_list);
-
-	mutex_unlock(&vmbus_connection.channel_mutex);
-
-	if (!fnew) {
-		/*
-		 * Check to see if this is a sub-channel.
-		 */
-		if (newchannel->offermsg.offer.sub_channel_index != 0) {
-			/*
-			 * Process the sub-channel.
-			 */
-			newchannel->primary_channel = channel;
-			spin_lock_irqsave(&channel->lock, flags);
-			list_add_tail(&newchannel->sc_list, &channel->sc_list);
-			spin_unlock_irqrestore(&channel->lock, flags);
-		} else {
-			goto err_free_chan;
-		}
-	}
-
 	dev_type = hv_get_dev_type(newchannel);
 
 	init_vp_index(newchannel, dev_type);
@@ -505,27 +461,26 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 	/*
 	 * This state is used to indicate a successful open
 	 * so that when we do close the channel normally, we
-	 * can cleanup properly
+	 * can cleanup properly.
 	 */
 	newchannel->state = CHANNEL_OPEN_STATE;
 
-	if (!fnew) {
-		struct hv_device *dev
-			= newchannel->primary_channel->device_obj;
+	if (primary_channel != NULL) {
+		/* newchannel is a sub-channel. */
+		struct hv_device *dev = primary_channel->device_obj;
 
 		if (vmbus_add_channel_kobj(dev, newchannel))
-			goto err_free_chan;
+			goto err_deq_chan;
+
+		if (primary_channel->sc_creation_callback != NULL)
+			primary_channel->sc_creation_callback(newchannel);
 
-		if (channel->sc_creation_callback != NULL)
-			channel->sc_creation_callback(newchannel);
 		newchannel->probe_done = true;
 		return;
 	}
 
 	/*
-	 * Start the process of binding this offer to the driver
-	 * We need to set the DeviceObject field before calling
-	 * vmbus_child_dev_add()
+	 * Start the process of binding the primary channel to the driver
 	 */
 	newchannel->device_obj = vmbus_device_create(
 		&newchannel->offermsg.offer.if_type,
@@ -554,13 +509,28 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 
 err_deq_chan:
 	mutex_lock(&vmbus_connection.channel_mutex);
-	list_del(&newchannel->listentry);
+
+	/*
+	 * We need to set the flag, otherwise
+	 * vmbus_onoffer_rescind() can be blocked.
+	 */
+	newchannel->probe_done = true;
+
+	if (primary_channel == NULL) {
+		list_del(&newchannel->listentry);
+	} else {
+		spin_lock_irqsave(&primary_channel->lock, flags);
+		list_del(&newchannel->sc_list);
+		spin_unlock_irqrestore(&primary_channel->lock, flags);
+	}
+
 	mutex_unlock(&vmbus_connection.channel_mutex);
 
 	if (newchannel->target_cpu != get_cpu()) {
 		put_cpu();
 		smp_call_function_single(newchannel->target_cpu,
-					 percpu_channel_deq, newchannel, true);
+					 percpu_channel_deq,
+					 newchannel, true);
 	} else {
 		percpu_channel_deq(newchannel);
 		put_cpu();
@@ -568,14 +538,104 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 
 	vmbus_release_relid(newchannel->offermsg.child_relid);
 
-err_free_chan:
 	free_channel(newchannel);
 }
 
 /*
+ * vmbus_process_offer - Process the offer by creating a channel/device
+ * associated with this offer
+ */
+static void vmbus_process_offer(struct vmbus_channel *newchannel)
+{
+	struct vmbus_channel *channel;
+	struct workqueue_struct *wq;
+	unsigned long flags;
+	bool fnew = true;
+
+	mutex_lock(&vmbus_connection.channel_mutex);
+
+	/*
+	 * Now that we have acquired the channel_mutex,
+	 * we can release the potentially racing rescind thread.
+	 */
+	atomic_dec(&vmbus_connection.offer_in_progress);
+
+	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
+		if (!uuid_le_cmp(channel->offermsg.offer.if_type,
+				 newchannel->offermsg.offer.if_type) &&
+		    !uuid_le_cmp(channel->offermsg.offer.if_instance,
+				 newchannel->offermsg.offer.if_instance)) {
+			fnew = false;
+			break;
+		}
+	}
+
+	if (fnew)
+		list_add_tail(&newchannel->listentry,
+			      &vmbus_connection.chn_list);
+	else {
+		/*
+		 * Check to see if this is a valid sub-channel.
+		 */
+		if (newchannel->offermsg.offer.sub_channel_index == 0) {
+			mutex_unlock(&vmbus_connection.channel_mutex);
+			/*
+			 * Don't call free_channel(), because newchannel->kobj
+			 * is not initialized yet.
+			 */
+			kfree(newchannel);
+			WARN_ON_ONCE(1);
+			return;
+		}
+		/*
+		 * Process the sub-channel.
+		 */
+		newchannel->primary_channel = channel;
+		spin_lock_irqsave(&channel->lock, flags);
+		list_add_tail(&newchannel->sc_list, &channel->sc_list);
+		spin_unlock_irqrestore(&channel->lock, flags);
+	}
+
+	mutex_unlock(&vmbus_connection.channel_mutex);
+
+	/*
+	 * vmbus_process_offer() mustn't call channel->sc_creation_callback()
+	 * directly for sub-channels, because sc_creation_callback() ->
+	 * vmbus_open() may never get the host's response to the
+	 * OPEN_CHANNEL message (the host may rescind a channel at any time,
+	 * e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
+	 * may not wake up the vmbus_open() as it's blocked due to a non-zero
+	 * vmbus_connection.offer_in_progress, and finally we have a deadlock.
+	 *
+	 * The above is also true for primary channels, if the related device
+	 * drivers use sync probing mode by default.
+	 *
+	 * And, usually the handling of primary channels and sub-channels can
+	 * depend on each other, so we should offload them to different
+	 * workqueues to avoid possible deadlock, e.g. in sync-probing mode,
+	 * NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
+	 * rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
+	 * and waits for all the sub-channels to appear, but the latter
+	 * can't get the rtnl_lock and this blocks the handling of
+	 * sub-channels.
+	 */
+	INIT_WORK(&newchannel->add_channel_work, vmbus_add_channel_work);
+	wq = fnew ? vmbus_connection.handle_primary_chan_wq :
+		    vmbus_connection.handle_sub_chan_wq;
+	queue_work(wq, &newchannel->add_channel_work);
+}
+
+/*
  * We use this state to statically distribute the channel interrupt load.
  */
 static int next_numa_node_id;
+/*
+ * init_vp_index() accesses global variables like next_numa_node_id, and
+ * it can run concurrently for primary channels and sub-channels: see
+ * vmbus_process_offer(), so we need the lock to protect the global
+ * variables.
+ */
+static DEFINE_SPINLOCK(bind_channel_to_cpu_lock);
 
 /*
  * Starting with Win8, we can statically distribute the incoming
@@ -611,6 +671,8 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
 		return;
 	}
 
+	spin_lock(&bind_channel_to_cpu_lock);
+
 	/*
 	 * Based on the channel affinity policy, we will assign the NUMA
 	 * nodes.
@@ -693,6 +755,8 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
 	channel->target_cpu = cur_cpu;
 	channel->target_vp = hv_cpu_number_to_vp_number(cur_cpu);
 
+	spin_unlock(&bind_channel_to_cpu_lock);
+
 	free_cpumask_var(available_mask);
 }
 
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index f4d08c8..4fe117b7 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -190,6 +190,20 @@ int vmbus_connect(void)
 		goto cleanup;
 	}
 
+	vmbus_connection.handle_primary_chan_wq =
+		create_workqueue("hv_pri_chan");
+	if (!vmbus_connection.handle_primary_chan_wq) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	vmbus_connection.handle_sub_chan_wq =
+		create_workqueue("hv_sub_chan");
+	if (!vmbus_connection.handle_sub_chan_wq) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
 	INIT_LIST_HEAD(&vmbus_connection.chn_msg_list);
 	spin_lock_init(&vmbus_connection.channelmsg_lock);
 
@@ -280,10 +294,14 @@ void vmbus_disconnect(void)
 	 */
 	vmbus_initiate_unload(false);
 
-	if (vmbus_connection.work_queue) {
-		drain_workqueue(vmbus_connection.work_queue);
+	if (vmbus_connection.handle_sub_chan_wq)
+		destroy_workqueue(vmbus_connection.handle_sub_chan_wq);
+
+	if (vmbus_connection.handle_primary_chan_wq)
+		destroy_workqueue(vmbus_connection.handle_primary_chan_wq);
+
+	if (vmbus_connection.work_queue)
 		destroy_workqueue(vmbus_connection.work_queue);
-	}
 
 	if (vmbus_connection.int_page) {
 		free_pages((unsigned long)vmbus_connection.int_page, 0);
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index f17c06a5..313a07f 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -333,7 +333,14 @@ struct vmbus_connection {
 	struct list_head chn_list;
 	struct mutex channel_mutex;
 
+	/*
+	 * An offer message is handled first on the work_queue, and then
+	 * is further handled on handle_primary_chan_wq or
+	 * handle_sub_chan_wq.
+	 */
 	struct workqueue_struct *work_queue;
+	struct workqueue_struct *handle_primary_chan_wq;
+	struct workqueue_struct *handle_sub_chan_wq;
 };
 
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 07a367f..f0885cc 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -896,6 +896,13 @@ struct vmbus_channel {
 
 	bool probe_done;
 
+	/*
+	 * We must offload the handling of the primary/sub channels
+	 * from the single-threaded vmbus_connection.work_queue to
+	 * two different workqueue, otherwise we can block
+	 * vmbus_connection.work_queue and hang: see vmbus_process_offer().
+	 */
+	struct work_struct add_channel_work;
 };
 
 static inline bool is_hvsock_channel(const struct vmbus_channel *c)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-11-29  4:36 [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues Dexuan Cui
@ 2018-11-29  7:44 ` gregkh
  2018-11-29  8:17   ` Dexuan Cui
  2018-12-03 11:50   ` Dan Carpenter
  0 siblings, 2 replies; 11+ messages in thread
From: gregkh @ 2018-11-29  7:44 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-kernel,
	devel, apw, vkuznets, olaf, jasowang, Michael Kelley

On Thu, Nov 29, 2018 at 04:36:43AM +0000, Dexuan Cui wrote:
> 
> vmbus_process_offer() mustn't call channel->sc_creation_callback()
> directly for sub-channels, because sc_creation_callback() ->
> vmbus_open() may never get the host's response to the
> OPEN_CHANNEL message (the host may rescind a channel at any time,
> e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
> may not wake up the vmbus_open() as it's blocked due to a non-zero
> vmbus_connection.offer_in_progress, and finally we have a deadlock.
> 
> The above is also true for primary channels, if the related device
> drivers use sync probing mode by default.
> 
> And, usually the handling of primary channels and sub-channels can
> depend on each other, so we should offload them to different
> workqueues to avoid possible deadlock, e.g. in sync-probing mode,
> NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
> rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
> and waits for all the sub-channels to appear, but the latter
> can't get the rtnl_lock and this blocks the handling of sub-channels.
> 
> The patch can fix the multiple-NIC deadlock described above for
> v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
> of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
> but don't enable async-probing for Hyper-V drivers (yet).
> 
> The patch can also fix the hang issue in sub-channel's handling described
> above for all versions of kernels, including v4.19 and v4.20-rc4.
> 
> So, actually the patch should be applied to all the existing kernels,
> not only the kernels that have 8195b1396ec8.
> 
> Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
> Cc: stable@vger.kernel.org
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: K. Y. Srinivasan <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> ---
> 
> There is no change in this repost. I just rebased this patch to today's
> char-misc's char-misc-next branch. Previously KY posted the patch with his
> Signed-off-by (which is kept in this repost), but there was a conflict issue.
> 
> Note: the patch can't be cleanly applied to char-misc's char-misc-linus branch --
> to do that, we need to cherry-pick the supporting patch first:
> 4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API vmbus_get_outgoing_channel()")

That is not going to work for the obvious reason that this dependant
patch is not going to be merged into 4.20-final.

So, what do you expect us to do here?  The only way this can be accepted
is to have it go into my -next branch, which means it will show up in
4.21-rc1, is that ok?

But then, if that happens, it will fail to apply to any stable tree for
4.20 and older, like you are asking it to be done for.

So what do you expect me to do here with this?

totally confused,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-11-29  7:44 ` gregkh
@ 2018-11-29  8:17   ` Dexuan Cui
  2018-11-30 17:31     ` KY Srinivasan
  2018-12-03 11:50   ` Dan Carpenter
  1 sibling, 1 reply; 11+ messages in thread
From: Dexuan Cui @ 2018-11-29  8:17 UTC (permalink / raw)
  To: gregkh
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-kernel,
	devel, apw, vkuznets, olaf, jasowang, Michael Kelley

> From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> Sent: Wednesday, November 28, 2018 11:45 PM
> >
> > There is no change in this repost. I just rebased this patch to today's
> > char-misc's char-misc-next branch. Previously KY posted the patch with his
> > Signed-off-by (which is kept in this repost), but there was a conflict issue.
> >
> > Note: the patch can't be cleanly applied to char-misc's char-misc-linus branch
> --
> > to do that, we need to cherry-pick the supporting patch first:
> > 4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API
> vmbus_get_outgoing_channel()")
> 
> That is not going to work for the obvious reason that this dependant
> patch is not going to be merged into 4.20-final.

It looks the dependent patch (4d3c5c69191f) is going to miss the v4.20 release. 
This is not a big issue, as the dependent patch isn't really important.
 
> So, what do you expect us to do here?  The only way this can be accepted
> is to have it go into my -next branch, which means it will show up in
> 4.21-rc1, is that ok?

Is there any chance for this patch ("Drivers: hv: vmbus: Offload the handling ...") to
go into v4.20? 

If yes, I can quickly do a rebase to char-misc's char-misc-linus branch,
because actually the conflict can be very easily fixed. And I can help to fix any 
conflict when the dependent patch is backported to v4.20.1.

If no, I think this patch and the dependent patch can both go into v4.21, and 
they can be both backported to v4.20.1 in future.

> But then, if that happens, it will fail to apply to any stable tree for
> 4.20 and older, like you are asking it to be done for.
> 
> So what do you expect me to do here with this?
> 
> totally confused,
> 
> greg k-h

I hope my above reply made me clear. Sorry, I'm not really know how exactly
the releasing procedure works...

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-11-29  8:17   ` Dexuan Cui
@ 2018-11-30 17:31     ` KY Srinivasan
  2018-11-30 18:09       ` Dexuan Cui
  0 siblings, 1 reply; 11+ messages in thread
From: KY Srinivasan @ 2018-11-30 17:31 UTC (permalink / raw)
  To: Dexuan Cui, gregkh
  Cc: Haiyang Zhang, Stephen Hemminger, linux-kernel, devel, apw,
	vkuznets, olaf, jasowang, Michael Kelley



> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Thursday, November 29, 2018 12:17 AM
> To: gregkh@linuxfoundation.org
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; apw@canonical.com; vkuznets
> <vkuznets@redhat.com>; olaf@aepfle.de; jasowang@redhat.com; Michael
> Kelley <mikelley@microsoft.com>
> Subject: RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of
> channels to two workqueues
> 
> > From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> > Sent: Wednesday, November 28, 2018 11:45 PM
> > >
> > > There is no change in this repost. I just rebased this patch to today's
> > > char-misc's char-misc-next branch. Previously KY posted the patch with
> his
> > > Signed-off-by (which is kept in this repost), but there was a conflict issue.
> > >
> > > Note: the patch can't be cleanly applied to char-misc's char-misc-linus
> branch
> > --
> > > to do that, we need to cherry-pick the supporting patch first:
> > > 4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API
> > vmbus_get_outgoing_channel()")
> >
> > That is not going to work for the obvious reason that this dependant
> > patch is not going to be merged into 4.20-final.
> 
> It looks the dependent patch (4d3c5c69191f) is going to miss the v4.20
> release.
> This is not a big issue, as the dependent patch isn't really important.
> 
> > So, what do you expect us to do here?  The only way this can be accepted
> > is to have it go into my -next branch, which means it will show up in
> > 4.21-rc1, is that ok?
> 
> Is there any chance for this patch ("Drivers: hv: vmbus: Offload the handling
> ...") to
> go into v4.20?
> 
> If yes, I can quickly do a rebase to char-misc's char-misc-linus branch,
> because actually the conflict can be very easily fixed. And I can help to fix any
> conflict when the dependent patch is backported to v4.20.1.

This patch fixes an important bug while the patch this depends on is not critical.
I suggest we revert the patch that this patch depends on
and we can submit a new version of this patch that can go in now - into 4.20 release.

K. Y 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-11-30 17:31     ` KY Srinivasan
@ 2018-11-30 18:09       ` Dexuan Cui
  2018-12-02  7:33         ` gregkh
  0 siblings, 1 reply; 11+ messages in thread
From: Dexuan Cui @ 2018-11-30 18:09 UTC (permalink / raw)
  To: KY Srinivasan, gregkh
  Cc: Haiyang Zhang, Stephen Hemminger, linux-kernel, devel, apw,
	vkuznets, olaf, jasowang, Michael Kelley

> From: KY Srinivasan <kys@microsoft.com>
> Sent: Friday, November 30, 2018 9:31 AM
> > From: Dexuan Cui <decui@microsoft.com>
> > Sent: Thursday, November 29, 2018 12:17 AM
> > To: gregkh@linuxfoundation.org
> > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger
> > <sthemmin@microsoft.com>; linux-kernel@vger.kernel.org;
> > devel@linuxdriverproject.org; apw@canonical.com; vkuznets
> > <vkuznets@redhat.com>; olaf@aepfle.de; jasowang@redhat.com; Michael
> > Kelley <mikelley@microsoft.com>
> > Subject: RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of
> > channels to two workqueues
> >
> > > From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> > > Sent: Wednesday, November 28, 2018 11:45 PM
> > > >
> > > > There is no change in this repost. I just rebased this patch to today's
> > > > char-misc's char-misc-next branch. Previously KY posted the patch with
> > his
> > > > Signed-off-by (which is kept in this repost), but there was a conflict issue.
> > > >
> > > > Note: the patch can't be cleanly applied to char-misc's char-misc-linus
> > branch
> > > --
> > > > to do that, we need to cherry-pick the supporting patch first:
> > > > 4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API
> > > vmbus_get_outgoing_channel()")
> > >
> > > That is not going to work for the obvious reason that this dependant
> > > patch is not going to be merged into 4.20-final.
> >
> > It looks the dependent patch (4d3c5c69191f) is going to miss the v4.20
> > release.
> > This is not a big issue, as the dependent patch isn't really important.
> >
> > > So, what do you expect us to do here?  The only way this can be accepted
> > > is to have it go into my -next branch, which means it will show up in
> > > 4.21-rc1, is that ok?
> >
> > Is there any chance for this patch ("Drivers: hv: vmbus: Offload the handling
> > ...") to
> > go into v4.20?
> >
> > If yes, I can quickly do a rebase to char-misc's char-misc-linus branch,
> > because actually the conflict can be very easily fixed. And I can help to fix any
> > conflict when the dependent patch is backported to v4.20.1.
> 
> This patch fixes an important bug while the patch this depends on is not
> critical.
> I suggest we revert the patch that this patch depends on
> and we can submit a new version of this patch that can go in now - into 4.20
> release.
> 
> K. Y

I agree.

Hi Greg,
Please let us know what we can do to try to push this important fix into v4.20.

Actually it's straightforward, though it looks big. And, we ave done a full testing
with the patch.

Thanks,
--Dexuan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-11-30 18:09       ` Dexuan Cui
@ 2018-12-02  7:33         ` gregkh
  2018-12-02  8:47           ` Dexuan Cui
  0 siblings, 1 reply; 11+ messages in thread
From: gregkh @ 2018-12-02  7:33 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-kernel,
	devel, apw, vkuznets, olaf, jasowang, Michael Kelley

On Fri, Nov 30, 2018 at 06:09:54PM +0000, Dexuan Cui wrote:
> > From: KY Srinivasan <kys@microsoft.com>
> > Sent: Friday, November 30, 2018 9:31 AM
> > > From: Dexuan Cui <decui@microsoft.com>
> > > Sent: Thursday, November 29, 2018 12:17 AM
> > > To: gregkh@linuxfoundation.org
> > > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > > <haiyangz@microsoft.com>; Stephen Hemminger
> > > <sthemmin@microsoft.com>; linux-kernel@vger.kernel.org;
> > > devel@linuxdriverproject.org; apw@canonical.com; vkuznets
> > > <vkuznets@redhat.com>; olaf@aepfle.de; jasowang@redhat.com; Michael
> > > Kelley <mikelley@microsoft.com>
> > > Subject: RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of
> > > channels to two workqueues
> > >
> > > > From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> > > > Sent: Wednesday, November 28, 2018 11:45 PM
> > > > >
> > > > > There is no change in this repost. I just rebased this patch to today's
> > > > > char-misc's char-misc-next branch. Previously KY posted the patch with
> > > his
> > > > > Signed-off-by (which is kept in this repost), but there was a conflict issue.
> > > > >
> > > > > Note: the patch can't be cleanly applied to char-misc's char-misc-linus
> > > branch
> > > > --
> > > > > to do that, we need to cherry-pick the supporting patch first:
> > > > > 4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API
> > > > vmbus_get_outgoing_channel()")
> > > >
> > > > That is not going to work for the obvious reason that this dependant
> > > > patch is not going to be merged into 4.20-final.
> > >
> > > It looks the dependent patch (4d3c5c69191f) is going to miss the v4.20
> > > release.
> > > This is not a big issue, as the dependent patch isn't really important.
> > >
> > > > So, what do you expect us to do here?  The only way this can be accepted
> > > > is to have it go into my -next branch, which means it will show up in
> > > > 4.21-rc1, is that ok?
> > >
> > > Is there any chance for this patch ("Drivers: hv: vmbus: Offload the handling
> > > ...") to
> > > go into v4.20?
> > >
> > > If yes, I can quickly do a rebase to char-misc's char-misc-linus branch,
> > > because actually the conflict can be very easily fixed. And I can help to fix any
> > > conflict when the dependent patch is backported to v4.20.1.
> > 
> > This patch fixes an important bug while the patch this depends on is not
> > critical.
> > I suggest we revert the patch that this patch depends on
> > and we can submit a new version of this patch that can go in now - into 4.20
> > release.
> > 
> > K. Y
> 
> I agree.
> 
> Hi Greg,
> Please let us know what we can do to try to push this important fix into v4.20.
> 
> Actually it's straightforward, though it looks big. And, we ave done a full testing
> with the patch.

Ok, you all need to figure this out on your own, sorry.  Please give me
a patch that I can actually apply to the tree if you need it merged into
4.20-final.  This shouldn't be tough, you all have been doing this long
enough by now...

I have no bandwidth to do this myself for you,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-12-02  7:33         ` gregkh
@ 2018-12-02  8:47           ` Dexuan Cui
  2018-12-02 15:33             ` gregkh
  0 siblings, 1 reply; 11+ messages in thread
From: Dexuan Cui @ 2018-12-02  8:47 UTC (permalink / raw)
  To: gregkh
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-kernel,
	devel, apw, vkuznets, olaf, jasowang, Michael Kelley

[-- Attachment #1: Type: text/plain, Size: 4289 bytes --]

> From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> Sent: Saturday, December 1, 2018 11:34 PM
> To: Dexuan Cui <decui@microsoft.com>
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; apw@canonical.com; vkuznets
> <vkuznets@redhat.com>; olaf@aepfle.de; jasowang@redhat.com; Michael
> Kelley <mikelley@microsoft.com>
> Subject: Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of
> channels to two workqueues
> 
> On Fri, Nov 30, 2018 at 06:09:54PM +0000, Dexuan Cui wrote:
> > > From: KY Srinivasan <kys@microsoft.com>
> > > Sent: Friday, November 30, 2018 9:31 AM
> > > > From: Dexuan Cui <decui@microsoft.com>
> > > > Sent: Thursday, November 29, 2018 12:17 AM
> > > > To: gregkh@linuxfoundation.org
> > > > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > > > <haiyangz@microsoft.com>; Stephen Hemminger
> > > > <sthemmin@microsoft.com>; linux-kernel@vger.kernel.org;
> > > > devel@linuxdriverproject.org; apw@canonical.com; vkuznets
> > > > <vkuznets@redhat.com>; olaf@aepfle.de; jasowang@redhat.com;
> Michael
> > > > Kelley <mikelley@microsoft.com>
> > > > Subject: RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of
> > > > channels to two workqueues
> > > >
> > > > > From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> > > > > Sent: Wednesday, November 28, 2018 11:45 PM
> > > > > >
> > > > > > There is no change in this repost. I just rebased this patch to today's
> > > > > > char-misc's char-misc-next branch. Previously KY posted the patch
> with
> > > > his
> > > > > > Signed-off-by (which is kept in this repost), but there was a conflict
> issue.
> > > > > >
> > > > > > Note: the patch can't be cleanly applied to char-misc's
> char-misc-linus
> > > > branch
> > > > > --
> > > > > > to do that, we need to cherry-pick the supporting patch first:
> > > > > > 4d3c5c69191f ("Drivers: hv: vmbus: Remove the useless API
> > > > > vmbus_get_outgoing_channel()")
> > > > >
> > > > > That is not going to work for the obvious reason that this dependant
> > > > > patch is not going to be merged into 4.20-final.
> > > >
> > > > It looks the dependent patch (4d3c5c69191f) is going to miss the v4.20
> > > > release.
> > > > This is not a big issue, as the dependent patch isn't really important.
> > > >
> > > > > So, what do you expect us to do here?  The only way this can be
> accepted
> > > > > is to have it go into my -next branch, which means it will show up in
> > > > > 4.21-rc1, is that ok?
> > > >
> > > > Is there any chance for this patch ("Drivers: hv: vmbus: Offload the
> handling
> > > > ...") to
> > > > go into v4.20?
> > > >
> > > > If yes, I can quickly do a rebase to char-misc's char-misc-linus branch,
> > > > because actually the conflict can be very easily fixed. And I can help to
> fix any
> > > > conflict when the dependent patch is backported to v4.20.1.
> > >
> > > This patch fixes an important bug while the patch this depends on is not
> > > critical.
> > > I suggest we revert the patch that this patch depends on
> > > and we can submit a new version of this patch that can go in now - into
> 4.20
> > > release.
> > >
> > > K. Y
> >
> > I agree.
> >
> > Hi Greg,
> > Please let us know what we can do to try to push this important fix into
> v4.20.
> >
> > Actually it's straightforward, though it looks big. And, we ave done a full
> testing
> > with the patch.
> 
> Ok, you all need to figure this out on your own, sorry.  Please give me
> a patch that I can actually apply to the tree if you need it merged into
> 4.20-final.  This shouldn't be tough, you all have been doing this long
> enough by now...
> 
> I have no bandwidth to do this myself for you,
> 
> greg k-h

Hi Greg,
Please use the attached patch: I rebased the patch to today's char-misc's
char-misc-linus branch. It can also cleanly apply to Linus's master branch
today.

Please let me know in case you need me to re-post the patch in a new mail
(rather than the attachment here) or you need me to rebase the patch to
a different tree/branch.

Thanks,
Dexuan


[-- Attachment #2: 0001-Drivers-hv-vmbus-Offload-the-handling-of-channels-to.patch --]
[-- Type: application/octet-stream, Size: 13367 bytes --]

From f34297d7fe61e3aa08a4b8a34f889850c5f1567f Mon Sep 17 00:00:00 2001
From: Dexuan Cui <decui@microsoft.com>
Date: Sun, 2 Dec 2018 00:35:16 -0800
Subject: [PATCH] Drivers: hv: vmbus: Offload the handling of channels to two
 workqueues

vmbus_process_offer() mustn't call channel->sc_creation_callback()
directly for sub-channels, because sc_creation_callback() ->
vmbus_open() may never get the host's response to the
OPEN_CHANNEL message (the host may rescind a channel at any time,
e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
may not wake up the vmbus_open() as it's blocked due to a non-zero
vmbus_connection.offer_in_progress, and finally we have a deadlock.

The above is also true for primary channels, if the related device
drivers use sync probing mode by default.

And, usually the handling of primary channels and sub-channels can
depend on each other, so we should offload them to different
workqueues to avoid possible deadlock, e.g. in sync-probing mode,
NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
and waits for all the sub-channels to appear, but the latter
can't get the rtnl_lock and this blocks the handling of sub-channels.

The patch can fix the multiple-NIC deadlock described above for
v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
but don't enable async-probing for Hyper-V drivers (yet).

The patch can also fix the hang issue in sub-channel's handling described
above for all versions of kernels, including v4.19 and v4.20-rc4.

So actually the patch should be applied to all the existing kernels,
not only the kernels that have 8195b1396ec8.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---

[Dec. 2, 2018] Dexuan: I rebased this patch to today's char-misc's
char-misc-linus branch, whose top commit is:
d8f190ee836a ("Merge branch 'akpm' (patches from Andrew)")

Previously KY posted the patch with his Signed-off-by (which is kept in this
repost), but there was a conflict issue.

We have done a full testing with the patch.
We hope the important patch can be in the final v4.20. Thanks!

 drivers/hv/channel_mgmt.c | 189 ++++++++++++++++++++++++++++++----------------
 drivers/hv/connection.c   |  24 +++++-
 drivers/hv/hyperv_vmbus.h |   7 ++
 include/linux/hyperv.h    |   7 ++
 4 files changed, 161 insertions(+), 66 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 6277597d..edd34c1 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -435,61 +435,16 @@ void vmbus_free_channels(void)
 	}
 }
 
-/*
- * vmbus_process_offer - Process the offer by creating a channel/device
- * associated with this offer
- */
-static void vmbus_process_offer(struct vmbus_channel *newchannel)
+/* Note: the function can run concurrently for primary/sub channels. */
+static void vmbus_add_channel_work(struct work_struct *work)
 {
-	struct vmbus_channel *channel;
-	bool fnew = true;
+	struct vmbus_channel *newchannel =
+		container_of(work, struct vmbus_channel, add_channel_work);
+	struct vmbus_channel *primary_channel = newchannel->primary_channel;
 	unsigned long flags;
 	u16 dev_type;
 	int ret;
 
-	/* Make sure this is a new offer */
-	mutex_lock(&vmbus_connection.channel_mutex);
-
-	/*
-	 * Now that we have acquired the channel_mutex,
-	 * we can release the potentially racing rescind thread.
-	 */
-	atomic_dec(&vmbus_connection.offer_in_progress);
-
-	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
-		if (!uuid_le_cmp(channel->offermsg.offer.if_type,
-			newchannel->offermsg.offer.if_type) &&
-			!uuid_le_cmp(channel->offermsg.offer.if_instance,
-				newchannel->offermsg.offer.if_instance)) {
-			fnew = false;
-			break;
-		}
-	}
-
-	if (fnew)
-		list_add_tail(&newchannel->listentry,
-			      &vmbus_connection.chn_list);
-
-	mutex_unlock(&vmbus_connection.channel_mutex);
-
-	if (!fnew) {
-		/*
-		 * Check to see if this is a sub-channel.
-		 */
-		if (newchannel->offermsg.offer.sub_channel_index != 0) {
-			/*
-			 * Process the sub-channel.
-			 */
-			newchannel->primary_channel = channel;
-			spin_lock_irqsave(&channel->lock, flags);
-			list_add_tail(&newchannel->sc_list, &channel->sc_list);
-			channel->num_sc++;
-			spin_unlock_irqrestore(&channel->lock, flags);
-		} else {
-			goto err_free_chan;
-		}
-	}
-
 	dev_type = hv_get_dev_type(newchannel);
 
 	init_vp_index(newchannel, dev_type);
@@ -507,27 +462,26 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 	/*
 	 * This state is used to indicate a successful open
 	 * so that when we do close the channel normally, we
-	 * can cleanup properly
+	 * can cleanup properly.
 	 */
 	newchannel->state = CHANNEL_OPEN_STATE;
 
-	if (!fnew) {
-		struct hv_device *dev
-			= newchannel->primary_channel->device_obj;
+	if (primary_channel != NULL) {
+		/* newchannel is a sub-channel. */
+		struct hv_device *dev = primary_channel->device_obj;
 
 		if (vmbus_add_channel_kobj(dev, newchannel))
-			goto err_free_chan;
+			goto err_deq_chan;
+
+		if (primary_channel->sc_creation_callback != NULL)
+			primary_channel->sc_creation_callback(newchannel);
 
-		if (channel->sc_creation_callback != NULL)
-			channel->sc_creation_callback(newchannel);
 		newchannel->probe_done = true;
 		return;
 	}
 
 	/*
-	 * Start the process of binding this offer to the driver
-	 * We need to set the DeviceObject field before calling
-	 * vmbus_child_dev_add()
+	 * Start the process of binding the primary channel to the driver
 	 */
 	newchannel->device_obj = vmbus_device_create(
 		&newchannel->offermsg.offer.if_type,
@@ -556,13 +510,28 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 
 err_deq_chan:
 	mutex_lock(&vmbus_connection.channel_mutex);
-	list_del(&newchannel->listentry);
+
+	/*
+	 * We need to set the flag, otherwise
+	 * vmbus_onoffer_rescind() can be blocked.
+	 */
+	newchannel->probe_done = true;
+
+	if (primary_channel == NULL) {
+		list_del(&newchannel->listentry);
+	} else {
+		spin_lock_irqsave(&primary_channel->lock, flags);
+		list_del(&newchannel->sc_list);
+		spin_unlock_irqrestore(&primary_channel->lock, flags);
+	}
+
 	mutex_unlock(&vmbus_connection.channel_mutex);
 
 	if (newchannel->target_cpu != get_cpu()) {
 		put_cpu();
 		smp_call_function_single(newchannel->target_cpu,
-					 percpu_channel_deq, newchannel, true);
+					 percpu_channel_deq,
+					 newchannel, true);
 	} else {
 		percpu_channel_deq(newchannel);
 		put_cpu();
@@ -570,14 +539,104 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 
 	vmbus_release_relid(newchannel->offermsg.child_relid);
 
-err_free_chan:
 	free_channel(newchannel);
 }
 
 /*
+ * vmbus_process_offer - Process the offer by creating a channel/device
+ * associated with this offer
+ */
+static void vmbus_process_offer(struct vmbus_channel *newchannel)
+{
+	struct vmbus_channel *channel;
+	struct workqueue_struct *wq;
+	unsigned long flags;
+	bool fnew = true;
+
+	mutex_lock(&vmbus_connection.channel_mutex);
+
+	/*
+	 * Now that we have acquired the channel_mutex,
+	 * we can release the potentially racing rescind thread.
+	 */
+	atomic_dec(&vmbus_connection.offer_in_progress);
+
+	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
+		if (!uuid_le_cmp(channel->offermsg.offer.if_type,
+				 newchannel->offermsg.offer.if_type) &&
+		    !uuid_le_cmp(channel->offermsg.offer.if_instance,
+				 newchannel->offermsg.offer.if_instance)) {
+			fnew = false;
+			break;
+		}
+	}
+
+	if (fnew)
+		list_add_tail(&newchannel->listentry,
+			      &vmbus_connection.chn_list);
+	else {
+		/*
+		 * Check to see if this is a valid sub-channel.
+		 */
+		if (newchannel->offermsg.offer.sub_channel_index == 0) {
+			mutex_unlock(&vmbus_connection.channel_mutex);
+			/*
+			 * Don't call free_channel(), because newchannel->kobj
+			 * is not initialized yet.
+			 */
+			kfree(newchannel);
+			WARN_ON_ONCE(1);
+			return;
+		}
+		/*
+		 * Process the sub-channel.
+		 */
+		newchannel->primary_channel = channel;
+		spin_lock_irqsave(&channel->lock, flags);
+		list_add_tail(&newchannel->sc_list, &channel->sc_list);
+		spin_unlock_irqrestore(&channel->lock, flags);
+	}
+
+	mutex_unlock(&vmbus_connection.channel_mutex);
+
+	/*
+	 * vmbus_process_offer() mustn't call channel->sc_creation_callback()
+	 * directly for sub-channels, because sc_creation_callback() ->
+	 * vmbus_open() may never get the host's response to the
+	 * OPEN_CHANNEL message (the host may rescind a channel at any time,
+	 * e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
+	 * may not wake up the vmbus_open() as it's blocked due to a non-zero
+	 * vmbus_connection.offer_in_progress, and finally we have a deadlock.
+	 *
+	 * The above is also true for primary channels, if the related device
+	 * drivers use sync probing mode by default.
+	 *
+	 * And, usually the handling of primary channels and sub-channels can
+	 * depend on each other, so we should offload them to different
+	 * workqueues to avoid possible deadlock, e.g. in sync-probing mode,
+	 * NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
+	 * rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
+	 * and waits for all the sub-channels to appear, but the latter
+	 * can't get the rtnl_lock and this blocks the handling of
+	 * sub-channels.
+	 */
+	INIT_WORK(&newchannel->add_channel_work, vmbus_add_channel_work);
+	wq = fnew ? vmbus_connection.handle_primary_chan_wq :
+		    vmbus_connection.handle_sub_chan_wq;
+	queue_work(wq, &newchannel->add_channel_work);
+}
+
+/*
  * We use this state to statically distribute the channel interrupt load.
  */
 static int next_numa_node_id;
+/*
+ * init_vp_index() accesses global variables like next_numa_node_id, and
+ * it can run concurrently for primary channels and sub-channels: see
+ * vmbus_process_offer(), so we need the lock to protect the global
+ * variables.
+ */
+static DEFINE_SPINLOCK(bind_channel_to_cpu_lock);
 
 /*
  * Starting with Win8, we can statically distribute the incoming
@@ -613,6 +672,8 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
 		return;
 	}
 
+	spin_lock(&bind_channel_to_cpu_lock);
+
 	/*
 	 * Based on the channel affinity policy, we will assign the NUMA
 	 * nodes.
@@ -695,6 +756,8 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
 	channel->target_cpu = cur_cpu;
 	channel->target_vp = hv_cpu_number_to_vp_number(cur_cpu);
 
+	spin_unlock(&bind_channel_to_cpu_lock);
+
 	free_cpumask_var(available_mask);
 }
 
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index f4d08c8..4fe117b7 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -190,6 +190,20 @@ int vmbus_connect(void)
 		goto cleanup;
 	}
 
+	vmbus_connection.handle_primary_chan_wq =
+		create_workqueue("hv_pri_chan");
+	if (!vmbus_connection.handle_primary_chan_wq) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	vmbus_connection.handle_sub_chan_wq =
+		create_workqueue("hv_sub_chan");
+	if (!vmbus_connection.handle_sub_chan_wq) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
 	INIT_LIST_HEAD(&vmbus_connection.chn_msg_list);
 	spin_lock_init(&vmbus_connection.channelmsg_lock);
 
@@ -280,10 +294,14 @@ void vmbus_disconnect(void)
 	 */
 	vmbus_initiate_unload(false);
 
-	if (vmbus_connection.work_queue) {
-		drain_workqueue(vmbus_connection.work_queue);
+	if (vmbus_connection.handle_sub_chan_wq)
+		destroy_workqueue(vmbus_connection.handle_sub_chan_wq);
+
+	if (vmbus_connection.handle_primary_chan_wq)
+		destroy_workqueue(vmbus_connection.handle_primary_chan_wq);
+
+	if (vmbus_connection.work_queue)
 		destroy_workqueue(vmbus_connection.work_queue);
-	}
 
 	if (vmbus_connection.int_page) {
 		free_pages((unsigned long)vmbus_connection.int_page, 0);
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 72eaba3..87d3d7d 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -335,7 +335,14 @@ struct vmbus_connection {
 	struct list_head chn_list;
 	struct mutex channel_mutex;
 
+	/*
+	 * An offer message is handled first on the work_queue, and then
+	 * is further handled on handle_primary_chan_wq or
+	 * handle_sub_chan_wq.
+	 */
 	struct workqueue_struct *work_queue;
+	struct workqueue_struct *handle_primary_chan_wq;
+	struct workqueue_struct *handle_sub_chan_wq;
 };
 
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index b3e2436..14131b6 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -905,6 +905,13 @@ struct vmbus_channel {
 
 	bool probe_done;
 
+	/*
+	 * We must offload the handling of the primary/sub channels
+	 * from the single-threaded vmbus_connection.work_queue to
+	 * two different workqueue, otherwise we can block
+	 * vmbus_connection.work_queue and hang: see vmbus_process_offer().
+	 */
+	struct work_struct add_channel_work;
 };
 
 static inline bool is_hvsock_channel(const struct vmbus_channel *c)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-12-02  8:47           ` Dexuan Cui
@ 2018-12-02 15:33             ` gregkh
  2018-12-03  0:50               ` Dexuan Cui
  0 siblings, 1 reply; 11+ messages in thread
From: gregkh @ 2018-12-02 15:33 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-kernel,
	devel, apw, vkuznets, olaf, jasowang, Michael Kelley

On Sun, Dec 02, 2018 at 08:47:03AM +0000, Dexuan Cui wrote:
> Hi Greg,
> Please use the attached patch: I rebased the patch to today's char-misc's
> char-misc-linus branch. It can also cleanly apply to Linus's master branch
> today.

I can't use an attached patch, you know better.  Please fix up and
resend properly.

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-12-02 15:33             ` gregkh
@ 2018-12-03  0:50               ` Dexuan Cui
  0 siblings, 0 replies; 11+ messages in thread
From: Dexuan Cui @ 2018-12-03  0:50 UTC (permalink / raw)
  To: gregkh
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-kernel,
	devel, apw, vkuznets, olaf, jasowang, Michael Kelley

> From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> Sent: Sunday, December 2, 2018 7:33 AM
> To: Dexuan Cui <decui@microsoft.com>
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; apw@canonical.com; vkuznets
> <vkuznets@redhat.com>; olaf@aepfle.de; jasowang@redhat.com; Michael
> Kelley <mikelley@microsoft.com>
> Subject: Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of
> channels to two workqueues
> 
> On Sun, Dec 02, 2018 at 08:47:03AM +0000, Dexuan Cui wrote:
> > Hi Greg,
> > Please use the attached patch: I rebased the patch to today's char-misc's
> > char-misc-linus branch. It can also cleanly apply to Linus's master branch
> > today.
> 
> I can't use an attached patch, you know better.  Please fix up and
> resend properly.
> 
> greg k-h

Ok, let me re-send it right now.

BTW, Linus's tree was updated to v4.20-rc5 98 minutes ago. This patch can
still cleanly apply there, and on your char-misc-linus branch.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-11-29  7:44 ` gregkh
  2018-11-29  8:17   ` Dexuan Cui
@ 2018-12-03 11:50   ` Dan Carpenter
  2018-12-03 18:41     ` Dan Carpenter
  1 sibling, 1 reply; 11+ messages in thread
From: Dan Carpenter @ 2018-12-03 11:50 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: olaf, Stephen Hemminger, Haiyang Zhang, linux-kernel,
	Michael Kelley, apw, devel, vkuznets, jasowang, gregkh

The list is still rejecting @microsoft.com patches...  :(

I mentioned this last time when you guys were complaining that no one
reads your patches and someone sent me a link to marc.info.  I want to
help but obviously no one has time to look at patches on marc.info...

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues
  2018-12-03 11:50   ` Dan Carpenter
@ 2018-12-03 18:41     ` Dan Carpenter
  0 siblings, 0 replies; 11+ messages in thread
From: Dan Carpenter @ 2018-12-03 18:41 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: olaf, Stephen Hemminger, Haiyang Zhang, linux-kernel,
	Michael Kelley, apw, devel, vkuznets, jasowang, gregkh

On Mon, Dec 03, 2018 at 02:50:19PM +0300, Dan Carpenter wrote:
> The list is still rejecting @microsoft.com patches...  :(
> 
> I mentioned this last time when you guys were complaining that no one
> reads your patches and someone sent me a link to marc.info.  I want to
> help but obviously no one has time to look at patches on marc.info...
> 

Oh... Hm.  Sorry, my bad.  They're making it to the list but somehow
gmail is eating them.  I search my spam folder and they're not there
either.  :(

I don't know what's going on.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-12-03 18:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-29  4:36 [PATCH] [repost] Drivers: hv: vmbus: Offload the handling of channels to two workqueues Dexuan Cui
2018-11-29  7:44 ` gregkh
2018-11-29  8:17   ` Dexuan Cui
2018-11-30 17:31     ` KY Srinivasan
2018-11-30 18:09       ` Dexuan Cui
2018-12-02  7:33         ` gregkh
2018-12-02  8:47           ` Dexuan Cui
2018-12-02 15:33             ` gregkh
2018-12-03  0:50               ` Dexuan Cui
2018-12-03 11:50   ` Dan Carpenter
2018-12-03 18:41     ` Dan Carpenter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).