From: Andrea Parri <parri.andrea@gmail.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	"K . Y . Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,
	linux-hyperv@vger.kernel.org,
	Michael Kelley <mikelley@microsoft.com>,
	Dexuan Cui <decui@microsoft.com>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [RFC PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels
Date: Fri, 3 Apr 2020 15:38:26 +0200
Message-ID: <20200403133826.GA25401@andrea>
In-Reply-To: <87imim2epp.fsf@vitty.brq.redhat.com>

On Mon, Mar 30, 2020 at 02:45:54PM +0200, Vitaly Kuznetsov wrote:
> Andrea Parri <parri.andrea@gmail.com> writes:
> 
> >> Correct me if I'm wrong, but currently vmbus_chan_sched() accesses
> >> the per-CPU list of channels on the same CPU, so we don't need a
> >> spinlock to guarantee that during an interrupt we'll be able to see
> >> an update that happened before the interrupt (in chronological
> >> order). With a global list of relids, who guarantees that an
> >> interrupt handler on another CPU will actually see the modified list?
> >
> > Thanks for pointing this out!
> >
> > The offer/resume path provides implicit full memory barriers, in
> > program order after the array store, which should guarantee the
> > visibility of the store to *all* CPUs before the offer/resume can
> > complete (cf.
> >
> >   tools/memory-model/Documentation/explanation.txt, Sect. #13
> >
> > and assuming that the offer/resume for a channel must complete before
> > the corresponding handler runs, which seems to be the case considering
> > that some essential channel fields are initialized only later...)
> >
> > IIUC, the spinlock approach you suggested will work and be "simpler";
> > an obvious side effect would be, well, a global synchronization point
> > in vmbus_chan_sched()...
> >
> > Thoughts?
> 
> This is, of course, very theoretical: if we're seeing an interrupt for
> a channel at the same time we're writing its relid, we're already in
> trouble. I can, however, try to suggest one tiny improvement:

Indeed.  I think the idea (still quite informal) is that:

  1) the mapping of the channel relid is propagated to (visible from)
     all CPUs before add_channel_work is queued (full barrier in
     queue_work()),

  2) add_channel_work is queued before the channel is opened (i.e.,
     before the channel ring buffer is allocated/initialized and the
     OPENCHANNEL msg is sent and acked by Hyper-V, cf. OPEN_STATE),

  3) the channel is opened before Hyper-V can start sending interrupts
     for the channel, and hence before vmbus_chan_sched() can find the
     channel's relid set in recv_int_page,

  4) vmbus_chan_sched() finds the channel's relid set in recv_int_page
     before it searches/loads from the channel array (full barrier in
     sync_test_and_clear_bit()).

This is for the "normal" (not resuming from hibernation) case; for the
latter, notice that:

  a) vmbus_isr() (and vmbus_chan_sched()) cannot run until
     vmbus_bus_resume() has finished (the @resume_noirq callback),

  b) vmbus_bus_resume() cannot complete before nr_chan_fixup_on_resume
     reaches 0 in check_ready_for_resume_event().

(and check_ready_for_resume_event() also provides a full barrier).
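
For reference, (b) hinges on something like the following (a sketch
from memory, not necessarily the literal code); the full barrier comes
from atomic_dec_and_test(), and vmbus_bus_resume() blocks on the
corresponding completion:

  static void check_ready_for_resume_event(void)
  {
          /* Full barrier implied by the atomic decrement-and-test. */
          if (atomic_dec_and_test(&vmbus_connection.nr_chan_fixup_on_resume))
                  complete(&vmbus_connection.ready_for_resume_event);
  }

  /* ...while, in vmbus_bus_resume(): */
  wait_for_completion(&vmbus_connection.ready_for_resume_event);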

If this makes sense to you, I'll try to add some of the above as
comments.

Thanks,
  Andrea


> 
> vmbus_chan_sched() now clears the bit in the event page and then
> searches for a channel with this relid; in case we allow the search to
> (temporarily) fail, we can reverse the logic: search for the channel
> and clear the bit only if we succeed. In case we fail, next time (next
> IRQ) we'll try again and likely succeed. The only purpose is to make
> sure no interrupts are ever lost. This may be overkill; we may want to
> count how many times (if ever) this happens.
> 
> Just a thought though.
> 
> -- 
> Vitaly
> 
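
For completeness, the reversal suggested above might look as follows
(a hypothetical, untested sketch):

  channel = READ_ONCE(vmbus_connection.channels[relid]);
  if (!channel)
          continue;       /* leave the bit set; retry on the next IRQ */

  /*
   * Clear the bit in the event page only once the lookup has
   * succeeded, so that a temporarily failed lookup cannot lose
   * the interrupt.
   */
  sync_clear_bit(relid, recv_int_page);
  tasklet_schedule(&channel->callback_event);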

