From: "Alex Bennée" <alex.bennee@linaro.org>
To: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Stratos Mailing List <stratos-dev@op-lists.linaro.org>,
	virtio-dev@lists.oasis-open.org,
	Arnd Bergmann <arnd.bergmann@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Stefano Stabellini <stefano.stabellini@xilinx.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Carl van Schaik <cvanscha@qti.qualcomm.com>,
	pratikp@quicinc.com, Srivatsa Vaddagiri <vatsa@codeaurora.org>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Wei.Chen@arm.com, olekstysh@gmail.com,
	Oleksandr_Tyshchenko@epam.com, Bertrand.Marquis@arm.com,
	Artem_Mygaiev@epam.com, julien@xen.org, jgross@suse.com,
	paul@xen.org, xen-devel@lists.xen.org,
	Elena Afanasova <eafanasova@gmail.com>
Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
Date: Fri, 03 Sep 2021 10:28:06 +0100
Message-ID: <87czpqq9qu.fsf@linaro.org>
In-Reply-To: <20210903080609.GD47953@laputa>


AKASHI Takahiro <takahiro.akashi@linaro.org> writes:

> Alex,
>
> On Wed, Sep 01, 2021 at 01:53:34PM +0100, Alex Bennée wrote:
>> 
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> 
>> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
>> >> > Could we consider the kernel internally converting IOREQ messages from
>> >> > the Xen hypervisor to eventfd events? Would this scale with other kernel
>> >> > hypercall interfaces?
>> >> > 
>> >> > So any thoughts on what directions are worth experimenting with?
>> >>  
>> >> One option we should consider is for each backend to connect to Xen via
>> >> the IOREQ interface. We could generalize the IOREQ interface and make it
>> >> hypervisor agnostic. The interface is really trivial and easy to add.
>> >> The only Xen-specific part is the notification mechanism, which is an
>> >> event channel. If we replaced the event channel with something else the
>> >> interface would be generic. See:
>> >> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
>> >
>> > There have been experiments with something kind of similar in KVM
>> > recently (see struct ioregionfd_cmd):
>> > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanasova@gmail.com/
>> 
>> Reading the cover letter was very useful in showing how this provides a
>> separate channel for signalling IO events to userspace instead of using
>> the normal type-2 vmexit type event. I wonder how deeply tied the
>> userspace facing side of this is to KVM? Could it provide a common FD
>> type interface to IOREQ?
>
> Why do you stick to a "FD" type interface?

I mean most user space interfaces on POSIX start with a file descriptor
and the usual read/write semantics or a series of ioctls.
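
To illustrate the file-descriptor pattern, here is a minimal sketch in
Python. It is not ioregionfd or any real kernel interface: a plain pipe
stands in for the notification fd, the 8-byte counter only imitates
eventfd-style semantics, and the names notify(), wait_for_events() and
demo() are made up for the example.

```python
import os
import select
import struct


def notify(fd: int, count: int = 1) -> None:
    """Signal `count` events by writing an 8-byte counter (eventfd-like)."""
    os.write(fd, struct.pack("=Q", count))


def wait_for_events(fd: int, timeout: float = 1.0) -> int:
    """Wait until fd is readable and return the counter; 0 on timeout."""
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return 0
    (count,) = struct.unpack("=Q", os.read(fd, 8))
    return count


def demo() -> int:
    """Send one notification through a pipe and read it back."""
    r, w = os.pipe()  # assumption: a pipe stands in for the real event fd
    try:
        notify(w, 3)
        return wait_for_events(r)
    finally:
        os.close(r)
        os.close(w)
```

The point is only that the backend sees nothing but an fd it can
select/poll on and read from, whatever produces the events underneath.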

>> As I understand IOREQ this is currently a direct communication between
>> userspace and the hypervisor using the existing Xen message bus. My
>
> With IOREQ server, IO event occurrences are notified to BE via Xen's event
> channel, while the actual contexts of IO events (see struct ioreq in ioreq.h)
> are put in a queue on a single shared memory page which is to be assigned
> beforehand with xenforeignmemory_map_resource hypervisor call.

If we abstracted the IOREQ via the kernel interface you would probably
just want to put the ioreq structure on a queue rather than expose the
shared page to userspace. 
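
As a rough sketch of what "ioreq on a queue" could look like from
userspace, the Python below reads fixed-size request records from an fd.
Everything here is assumed for illustration: the record layout is a
simplified, hypothetical one and not Xen's struct ioreq, a pipe stands
in for whatever fd the kernel would provide, and IoReq, enqueue() and
dequeue() are invented names.

```python
import os
import struct
from typing import NamedTuple, Optional

# Hypothetical record: addr (u64), data (u64), size (u32),
# is_write (u8), then 3 padding bytes. NOT the layout of Xen's struct ioreq.
REQ_FMT = "=QQIB3x"
REQ_SIZE = struct.calcsize(REQ_FMT)


class IoReq(NamedTuple):
    addr: int
    data: int
    size: int
    is_write: bool


def enqueue(fd: int, req: IoReq) -> None:
    """Producer side (the kernel in the real design): push one record."""
    os.write(fd, struct.pack(REQ_FMT, req.addr, req.data, req.size,
                             int(req.is_write)))


def dequeue(fd: int) -> Optional[IoReq]:
    """Consumer side (the backend): read one record, None at end of stream."""
    buf = os.read(fd, REQ_SIZE)
    if len(buf) < REQ_SIZE:
        return None
    addr, data, size, is_write = struct.unpack(REQ_FMT, buf)
    return IoReq(addr, data, size, bool(is_write))
```

With this shape the backend never maps the shared page itself; it just
reads one request at a time, in order, from the descriptor.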

>> worry would be that by adding knowledge of what the underlying
>> hypervisor is we'd end up with excess complexity in the kernel. For one
>> thing we certainly wouldn't want an API version dependency on the kernel
>> to understand which version of the Xen hypervisor it was running on.
>
> That's exactly what virtio-proxy in my proposal[1] does; All the hypervisor-
> specific details of IO event handlings are contained in virtio-proxy
> and virtio BE will communicate with virtio-proxy through a virtqueue
> (yes, virtio-proxy is seen as yet another virtio device on BE) and will
> get IO event-related *RPC* callbacks, either MMIO read or write, from
> virtio-proxy.
>
> See page 8 (protocol flow) and 10 (interfaces) in [1].

There are two areas of concern with the proxy approach at the moment.
The first is how the bootstrap of the virtio-proxy channel happens and
the second is how many context switches are involved in a transaction.
Of course, as with all things, there is a trade-off. Use cases needing the
very tightest latency would probably opt for a bare-metal backend, which
I think would imply hypervisor knowledge in the backend binary.

>
> If kvm's ioregionfd can fit into this protocol, virtio-proxy for kvm
> will hopefully be implemented using ioregionfd.
>
> -Takahiro Akashi
>
> [1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000548.html

-- 
Alex Bennée

