All of lore.kernel.org
 help / color / mirror / Atom feed
* [virtio-dev] Device-to-driver notification management for hardware implementations
@ 2018-09-12 16:41 David Riddoch
  2018-11-05  8:53 ` Stefan Hajnoczi
  0 siblings, 1 reply; 6+ messages in thread
From: David Riddoch @ 2018-09-12 16:41 UTC (permalink / raw)
  To: virtio-dev

All,

The VIRTIO_F_NOTIFICATION_DATA feature helps hardware implementations by 
telling them directly how many descriptors are available in the ring, 
saving speculative reads.  There is a related issue with interrupts: As 
things stand, after writing 'used' descriptors into the ring (packed or 
split) the device has to read the event idx to determine whether a 
device-to-driver notification (interrupt) is needed.

It would be very helpful to a hardware implementation if the event idx 
could be written directly to the device.  Has anyone proposed an 
extension to address this already?  (I have some thoughts on how to do 
this if we don't have a proposal yet).

Best,
David



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [virtio-dev] Device-to-driver notification management for hardware implementations
  2018-09-12 16:41 [virtio-dev] Device-to-driver notification management for hardware implementations David Riddoch
@ 2018-11-05  8:53 ` Stefan Hajnoczi
  2018-11-08 12:16   ` David Riddoch
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2018-11-05  8:53 UTC (permalink / raw)
  To: David Riddoch; +Cc: virtio-dev

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

On Wed, Sep 12, 2018 at 05:41:07PM +0100, David Riddoch wrote:
> The VIRTIO_F_NOTIFICATION_DATA feature helps hardware implementations by
> telling them directly how many descriptors are available in the ring, saving
> speculative reads.  There is a related issue with interrupts: As things
> stand, after writing 'used' descriptors into the ring (packed or split) the
> device has to read the event idx to determine whether a device-to-driver
> notification (interrupt) is needed.
> 
> It would be very helpful to a hardware implementation if the event idx could
> be written directly to the device.  Has anyone proposed an extension to
> address this already?  (I have some thoughts on how to do this if we don't
> have a proposal yet).

Hi David,
Has there been any further discussion on this topic?  It seems like a
useful feature to have in VIRTIO.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [virtio-dev] Device-to-driver notification management for hardware implementations
  2018-11-05  8:53 ` Stefan Hajnoczi
@ 2018-11-08 12:16   ` David Riddoch
  2018-11-12 14:37     ` Stefan Hajnoczi
       [not found]     ` <e28d3203-4566-ae6e-61c7-24a9a3e6ca3b@solarflare.com>
  0 siblings, 2 replies; 6+ messages in thread
From: David Riddoch @ 2018-11-08 12:16 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Virtio-Dev, Steve Pope

On 05/11/18 08:53, Stefan Hajnoczi wrote:
> On Wed, Sep 12, 2018 at 05:41:07PM +0100, David Riddoch wrote:
>> The VIRTIO_F_NOTIFICATION_DATA feature helps hardware implementations by
>> telling them directly how many descriptors are available in the ring, saving
>> speculative reads.  There is a related issue with interrupts: As things
>> stand, after writing 'used' descriptors into the ring (packed or split) the
>> device has to read the event idx to determine whether a device-to-driver
>> notification (interrupt) is needed.
>>
>> It would be very helpful to a hardware implementation if the event idx could
>> be written directly to the device.  Has anyone proposed an extension to
>> address this already?  (I have some thoughts on how to do this if we don't
>> have a proposal yet).
> Hi David,
> Has there been any further discussion on this topic?  It seems like a
> useful feature to have in VIRTIO.
Hi Stefan,

Thanks for the nudge, no further discussion.  So here are my thoughts:

To provide context here is an outline of how drivers typically process 
completions from a device.  This applies to both para-virtualised and 
real hardware.  (I think we only need to worry about packed-ring).

    disable_cb();
  again:
    while( budget_left && desc_is_used() )
      handle_used();
    if( budget_left &&enable_cb() )
      goto again;

The budget stuff relates to napi polling, and isn't always there. But if 
it is, and budget is exhausted, then an outer loop will invoke this code 
again and so we don't want an interrupt.

In our current packed-ring design enable_cb() does roughly this:

    enable_cb_packed() {
      guest_mem->event_suppress = last_used_idx;
      virtio_mb();
      return desc_is_used();
    }

This tells the device to raise an interrupt when moving used_idx over 
the given idx.  (And you can set that idx to a position further in the 
ring if you're not interested in getting woken until a bunch of 
descriptors have been consumed).

This design is unpleasant for hardware devices, because it forces them 
to issue a read over the PCIe bus after writing used descriptors, in 
order to determine whether an interrupt is needed. Such a read adds 
complexity, adds overhead to the PCIe bus and memory system, and adds 
latency.

In hardware devices interrupts are typically enabled like this:

    enable_cb_hw() {
      writel(last_used_idx, &dev_bar->int_en_doorbell);
      return 0;
    }

This tells the device to raise an interrupt once the descriptor at the 
given index has been used.  If it has already been used, then an 
interrupt is raised immediately, else the interrupt is enabled and 
raised after the used descriptor has been written.  The race condition 
is handled without having to check the ring again.  Note that interrupts 
are fire-once: For each write to int_en_doorbell you get a single 
interrupt, and disable_cb() is therefore a no-op.

When you emulate a device using this model, you get a vmexit every time 
int_en_doorbell is written, which is expensive.  Therefore this approach 
is not desirable when emulated.

So perhaps we should introduce an option to use this hardware-oriented 
model.  The problem is that with vDPA the implementation of a device can 
shift under the feet of a driver between hardware and emulation.  We'd 
like to get the best performance we can in both cases.

My proposal is that we add an option to use both models at the same 
time.  enable_cb() would become:

30   enable_cb_new() {
31     if( int_en_doorbell_enabled )
32 writel(last_used_idx, p_int_en_doorbell);
33 guest_mem->event_suppress = last_used_idx;
34     virtio_mb();
35     return desc_is_used();
36   }

p_int_en_doorbell it notionally a pointer into a memory mapping onto a 
device BAR, and when backed by hardware that is exactly what it is (and 
the information in the event suppression structure is ignored).  But 
when backed by an emulated device the p_int_en_doorbell mapping should 
just point at normal host memory (which is otherwise unused).  This 
means that we avoid vmexits in the emulated case.

Comments?

David

-- 
David Riddoch  <driddoch@solarflare.com> -- Chief Architect, Solarflare


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [virtio-dev] Device-to-driver notification management for hardware implementations
  2018-11-08 12:16   ` David Riddoch
@ 2018-11-12 14:37     ` Stefan Hajnoczi
  2019-02-01 14:23       ` David Riddoch
       [not found]     ` <e28d3203-4566-ae6e-61c7-24a9a3e6ca3b@solarflare.com>
  1 sibling, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2018-11-12 14:37 UTC (permalink / raw)
  To: David Riddoch; +Cc: Virtio-Dev, Steve Pope

[-- Attachment #1: Type: text/plain, Size: 4527 bytes --]

On Thu, Nov 08, 2018 at 12:16:28PM +0000, David Riddoch wrote:
> On 05/11/18 08:53, Stefan Hajnoczi wrote:
> > On Wed, Sep 12, 2018 at 05:41:07PM +0100, David Riddoch wrote:
> > > The VIRTIO_F_NOTIFICATION_DATA feature helps hardware implementations by
> > > telling them directly how many descriptors are available in the ring, saving
> > > speculative reads.  There is a related issue with interrupts: As things
> > > stand, after writing 'used' descriptors into the ring (packed or split) the
> > > device has to read the event idx to determine whether a device-to-driver
> > > notification (interrupt) is needed.
> > > 
> > > It would be very helpful to a hardware implementation if the event idx could
> > > be written directly to the device.  Has anyone proposed an extension to
> > > address this already?  (I have some thoughts on how to do this if we don't
> > > have a proposal yet).
> > Hi David,
> > Has there been any further discussion on this topic?  It seems like a
> > useful feature to have in VIRTIO.
> Hi Stefan,
> 
> Thanks for the nudge, no further discussion.  So here are my thoughts:
> 
> To provide context here is an outline of how drivers typically process
> completions from a device.  This applies to both para-virtualised and real
> hardware.  (I think we only need to worry about packed-ring).
> 
>    disable_cb();
>  again:
>    while( budget_left && desc_is_used() )
>      handle_used();
>    if( budget_left &&enable_cb() )
>      goto again;
> 
> The budget stuff relates to napi polling, and isn't always there. But if it
> is, and budget is exhausted, then an outer loop will invoke this code again
> and so we don't want an interrupt.
> 
> In our current packed-ring design enable_cb() does roughly this:
> 
>    enable_cb_packed() {
>      guest_mem->event_suppress = last_used_idx;
>      virtio_mb();
>      return desc_is_used();
>    }
> 
> This tells the device to raise an interrupt when moving used_idx over the
> given idx.  (And you can set that idx to a position further in the ring if
> you're not interested in getting woken until a bunch of descriptors have
> been consumed).
> 
> This design is unpleasant for hardware devices, because it forces them to
> issue a read over the PCIe bus after writing used descriptors, in order to
> determine whether an interrupt is needed. Such a read adds complexity, adds
> overhead to the PCIe bus and memory system, and adds latency.
> 
> In hardware devices interrupts are typically enabled like this:
> 
>    enable_cb_hw() {
>      writel(last_used_idx, &dev_bar->int_en_doorbell);
>      return 0;
>    }
> 
> This tells the device to raise an interrupt once the descriptor at the given
> index has been used.  If it has already been used, then an interrupt is
> raised immediately, else the interrupt is enabled and raised after the used
> descriptor has been written.  The race condition is handled without having
> to check the ring again.  Note that interrupts are fire-once: For each write
> to int_en_doorbell you get a single interrupt, and disable_cb() is therefore
> a no-op.
> 
> When you emulate a device using this model, you get a vmexit every time
> int_en_doorbell is written, which is expensive.  Therefore this approach is
> not desirable when emulated.
> 
> So perhaps we should introduce an option to use this hardware-oriented
> model.  The problem is that with vDPA the implementation of a device can
> shift under the feet of a driver between hardware and emulation.  We'd like
> to get the best performance we can in both cases.
> 
> My proposal is that we add an option to use both models at the same time. 
> enable_cb() would become:
> 
> 30   enable_cb_new() {
> 31     if( int_en_doorbell_enabled )
> 32 writel(last_used_idx, p_int_en_doorbell);
> 33 guest_mem->event_suppress = last_used_idx;
> 34     virtio_mb();
> 35     return desc_is_used();
> 36   }
> 
> p_int_en_doorbell it notionally a pointer into a memory mapping onto a
> device BAR, and when backed by hardware that is exactly what it is (and the
> information in the event suppression structure is ignored).  But when backed
> by an emulated device the p_int_en_doorbell mapping should just point at
> normal host memory (which is otherwise unused).  This means that we avoid
> vmexits in the emulated case.
> 
> Comments?

Sounds fine to me.

Could you send a VIRTIO spec patch so this can be discussed in detail?

Thanks,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [virtio-dev] Device-to-driver notification management for hardware implementations
  2018-11-12 14:37     ` Stefan Hajnoczi
@ 2019-02-01 14:23       ` David Riddoch
  0 siblings, 0 replies; 6+ messages in thread
From: David Riddoch @ 2019-02-01 14:23 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Virtio-Dev, Steve Pope


> Sounds fine to me.
>
> Could you send a VIRTIO spec patch so this can be discussed in detail?

Hi Stefan,

Since writing that proposal I've come to the conclusion that this 
proposed extension would only really be useful if it were compulsory 
rather than optional.  It isn't essential for performance, and it 
doesn't reduce complexity in the design if optional, as we still have to 
operate without it.

I have however, found another problem that is much more important to 
solve...see next email.

Best,
David

-- 
David Riddoch  <driddoch@solarflare.com> -- Chief Architect, Solarflare


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [virtio-dev] Re: [virtio] RE: Re: [virtio-dev] Device-to-driver notification management for hardware implementations
       [not found]         ` <20190211020351-mutt-send-email-mst@kernel.org>
@ 2019-02-11 10:16           ` David Riddoch
  0 siblings, 0 replies; 6+ messages in thread
From: David Riddoch @ 2019-02-11 10:16 UTC (permalink / raw)
  To: Michael S. Tsirkin, Chandra Thyamagondlu; +Cc: Virtio-Dev, Kumar Sanghvi

(Apologies if you got this a second time: The first copy went to the 
wrong list).

>> I think your proposal of write to p_int_en_doorbell will work for us. 
>> The write
>> event to this new address in BAR itself implicitly tell the device that
>> interrupt is enabled and also it conveys the last used entry that it 
>> processed.
>> Then device can provide nice interrupt moderation by making sure at 
>> least one
>> new used entry was written and some time has elapsed from previous 
>> interrupt to
>> driver.
>>
>> It would also be helpful if you can also outline the new structure 
>> and your
>> proposal for offset in the BAR.
>>
>>
>> Chandra
> So this boils down to adding ability to notify devices about enabling
> interrupts. Question would be, this slows down the driver. Is this
> still worth doing? Right now we are trying to offload as
> much as we can to the device.
I don't think it slows down the driver significantly, because the 
interrupt rate is usually moderated so that it is not too high. (So 
these notify MMIOs should not be issued at a high enough rate to get 
close to causing a bottleneck).

It is a win in terms of PCIe overheads, as the device never needs to 
read the suppression structure.  But again the win is limited because 
interrupt rate will usually be limited by moderation.

It can be a win for latency.  In the case of a net rx virt-queue the 
device can issue an interrupt without first reading the suppression 
structure.

However, despite having made the proposal, I no longer consider this 
high priority: Because it is an optional feature it would not reduce 
complexity since a device has to be able to operate without it.  I'm not 
sure the benefits are worth the complexity added.

David

-- 
David Riddoch  <driddoch@solarflare.com> -- Chief Architect, Solarflare


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-02-11 10:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12 16:41 [virtio-dev] Device-to-driver notification management for hardware implementations David Riddoch
2018-11-05  8:53 ` Stefan Hajnoczi
2018-11-08 12:16   ` David Riddoch
2018-11-12 14:37     ` Stefan Hajnoczi
2019-02-01 14:23       ` David Riddoch
     [not found]     ` <e28d3203-4566-ae6e-61c7-24a9a3e6ca3b@solarflare.com>
     [not found]       ` <BYAPR02MB471166783BC223C9AB8A12BBBB990@BYAPR02MB4711.namprd02.prod.outlook.com>
     [not found]         ` <20190211020351-mutt-send-email-mst@kernel.org>
2019-02-11 10:16           ` [virtio-dev] Re: [virtio] RE: " David Riddoch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.