All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Parri <parri.andrea@gmail.com>
To: Long Li <longli@microsoft.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	Michael Kelley <mikelley@microsoft.com>,
	Dexuan Cui <decui@microsoft.com>,
	Boqun Feng <boqun.feng@gmail.com>, vkuznets <vkuznets@redhat.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 11/11] scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned
Date: Tue, 7 Apr 2020 02:35:52 +0200	[thread overview]
Message-ID: <20200407003459.GA7776@andrea> (raw)
In-Reply-To: <BN8PR21MB1155C78AAC02D02EA7A7841DCEC20@BN8PR21MB1155.namprd21.prod.outlook.com>

> >@@ -621,6 +621,63 @@ static inline struct storvsc_device
> >*get_in_stor_device(
> >
> > }
> >
> >+void storvsc_change_target_cpu(struct vmbus_channel *channel, u32 old,
> >+u32 new) {
> >+	struct storvsc_device *stor_device;
> >+	struct vmbus_channel *cur_chn;
> >+	bool old_is_alloced = false;
> >+	struct hv_device *device;
> >+	unsigned long flags;
> >+	int cpu;
> >+
> >+	device = channel->primary_channel ?
> >+			channel->primary_channel->device_obj
> >+				: channel->device_obj;
> >+	stor_device = get_out_stor_device(device);
> >+	if (!stor_device)
> >+		return;
> >+
> >+	/* See storvsc_do_io() -> get_og_chn(). */
> >+	spin_lock_irqsave(&device->channel->lock, flags);
> >+
> >+	/*
> >+	 * Determines if the storvsc device has other channels assigned to
> >+	 * the "old" CPU to update the alloced_cpus mask and the stor_chns
> >+	 * array.
> >+	 */
> >+	if (device->channel != channel && device->channel->target_cpu ==
> >old) {
> >+		cur_chn = device->channel;
> >+		old_is_alloced = true;
> >+		goto old_is_alloced;
> >+	}
> >+	list_for_each_entry(cur_chn, &device->channel->sc_list, sc_list) {
> >+		if (cur_chn == channel)
> >+			continue;
> >+		if (cur_chn->target_cpu == old) {
> >+			old_is_alloced = true;
> >+			goto old_is_alloced;
> >+		}
> >+	}
> >+
> >+old_is_alloced:
> >+	if (old_is_alloced)
> >+		WRITE_ONCE(stor_device->stor_chns[old], cur_chn);
> >+	else
> >+		cpumask_clear_cpu(old, &stor_device->alloced_cpus);
> 
> If the old cpu is not allocated, is it still necessary to do a cpumask_clear_cpu?

AFAICT, this really depends on how much we "believe" in the current
heuristic (as implemented by get_og_chn()):  ;-)

The cpumask_clear_cpu() (and the below, dependent "flush" as well)
are intended to re-initialize alloced_cpus and stor_chns in order
for get_og_chn() to re-process/update them.

Also, notice that (both in the current code and after this series)
alloced_cpus can't be offlined and get_og_chn() does rely on this
property (cf., e.g., the loop/check over alloced_cpus/node_mask).

I suspect that giving up on this invariant/property would require
a certain amount of re-design in the heuristic/code in question...


> >@@ -1360,7 +1432,14 @@ static int storvsc_do_io(struct hv_device *device,
> > 			}
> > 		}
> > 	} else {
> >+		spin_lock_irqsave(&device->channel->lock, flags);
> >+		outgoing_channel = stor_device->stor_chns[q_num];
> >+		if (outgoing_channel != NULL) {
> >+			spin_unlock_irqrestore(&device->channel->lock,
> >flags);
> 
> Checking outgoing_channel again seems unnecessary. Why not just call get_og_chn()?

target_cpu_store() might have changed stor_chns (and alloced_cpus)
in the meantime (but before we've acquired the device's lock): the
double check is to make sure we have a "consistent"/an up-to-date
view of stor_chns and alloced_cpus.


> 
> >+			goto found_channel;
> >+		}
> > 		outgoing_channel = get_og_chn(stor_device, q_num);
> >+		spin_unlock_irqrestore(&device->channel->lock, flags);
> > 	}
> 
> With device->channel->lock, now we have one more lock on the I/O issuing path. It doesn't seem optimal as you are trying to protect the code in storvsc_change_target_cpu(), this doesn't need to block concurrent I/O issuers. Maybe moving to RCU is a better approach?

I don't see this as a problem (*and I've validated such conclusion
in experiments, where the "patched kernel" was sometimes performing
slighlty better than the "unpatched kernel" and sometimes slightly
worse...):

On the one hand, the stor_chns array "stabilizes" quite early after
system initialization in "normal" (i.e., common) situations (i.e.,
no channel reassignments, no device hotplugs...); IOW, get_og_chn()
really represents the "rare and slow" path here (but not that slow!
after all...).  Furthermore, notice that even in those "rare cases"
the number of "contending" channels is limited to at most 1 per 4
CPUs IIRC (alloced_cpus is "sparsely populated"...).

The latencies of the RCU grace period (in the order of milliseconds)
would be a major concern for the adoption of RCU here (at least, if
we continue to consider get_og_chn() as an "updater").  I'm afraid
that this could be "too slow" even for our slow path...  ;-/

What am I missing?  ;-)

Thanks,
  Andrea

  reply	other threads:[~2020-04-07  0:36 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-06  0:15 [PATCH 00/11] VMBus channel interrupt reassignment Andrea Parri (Microsoft)
2020-04-06  0:15 ` [PATCH 01/11] Drivers: hv: vmbus: Always handle the VMBus messages on CPU0 Andrea Parri (Microsoft)
2020-04-10 17:18   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 02/11] Drivers: hv: vmbus: Don't bind the offer&rescind works to a specific CPU Andrea Parri (Microsoft)
2020-04-10 17:23   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels Andrea Parri (Microsoft)
2020-04-10 17:29   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 04/11] hv_netvsc: Disable NAPI before closing the VMBus channel Andrea Parri (Microsoft)
2020-04-10 19:20   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 05/11] hv_utils: Always execute the fcopy and vss callbacks in a tasklet Andrea Parri (Microsoft)
2020-04-10 19:26   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 06/11] Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal Andrea Parri (Microsoft)
2020-04-10 19:28   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 07/11] PCI: hv: Prepare hv_compose_msi_msg() for the VMBus-channel-interrupt-to-vCPU reassignment functionality Andrea Parri (Microsoft)
2020-04-10 19:30   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 08/11] Drivers: hv: vmbus: Remove the unused HV_LOCALIZED channel affinity logic Andrea Parri (Microsoft)
2020-04-10 19:32   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 09/11] Drivers: hv: vmbus: Synchronize init_vp_index() vs. CPU hotplug Andrea Parri (Microsoft)
2020-04-10 19:33   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 10/11] Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type Andrea Parri (Microsoft)
2020-04-10 19:34   ` Michael Kelley
2020-04-06  0:15 ` [PATCH 11/11] scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned Andrea Parri (Microsoft)
2020-04-06 19:54   ` Long Li
2020-04-07  0:35     ` Andrea Parri [this message]
2020-04-08  2:25       ` Long Li
2020-04-08 14:54         ` Andrea Parri
2020-04-10 19:35   ` Michael Kelley
2020-04-13 15:48 ` [PATCH 00/11] VMBus channel interrupt reassignment Wei Liu
2020-04-13 17:07   ` Andrea Parri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200407003459.GA7776@andrea \
    --to=parri.andrea@gmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=jejb@linux.ibm.com \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=martin.petersen@oracle.com \
    --cc=mikelley@microsoft.com \
    --cc=sthemmin@microsoft.com \
    --cc=vkuznets@redhat.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.