From: Frank Yang <lfy@google.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Roman Kiryanov <rkir@google.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	virtio-dev@lists.oasis-open.org,
	Greg Hartman <ghartman@google.com>
Subject: Re: [virtio-dev] Memory sharing device
Date: Tue, 12 Feb 2019 12:15:45 -0800	[thread overview]
Message-ID: <CAEkmjvVW03jYgSH_VDD3KErT2L24mhnb25LaR+ZPKr+N=iVeUw@mail.gmail.com> (raw)
In-Reply-To: <20190212140706-mutt-send-email-mst@kernel.org>


On Tue, Feb 12, 2019 at 11:15 AM Michael S. Tsirkin <mst@redhat.com> wrote:

> On Tue, Feb 12, 2019 at 11:01:21AM -0800, Frank Yang wrote:
> >
> > On Tue, Feb 12, 2019 at 10:22 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> >     On Tue, Feb 12, 2019 at 07:56:58AM -0800, Frank Yang wrote:
> >     > Stepping back to standardization and portability concerns, it is
> >     > also not necessarily desirable to use general pipes to do what
> >     > we want, because even though that device exists and is part of
> >     > the spec already, that results in _de-facto_ non-portability.
> >
> >     That's not different from e.g. TCP.
> >
> >     > If we had some kind of spec to enumerate such 'user-defined'
> >     > devices, at least we can have _de-jure_ non-portability; an
> >     > enumerated device doesn't work as advertised.
> >
> >     I am not sure distinguishing between different types of
> >     non-portability will be in scope for virtio. Actually having
> >     devices that are portable would be.
> >
> >
> > The device itself is portable; the user-defined drivers that run on
> > it will work or not depending on device ID negotiation.
> >
> >     ...
> >
> >     > Note that virtio-serial/virtio-vsock are not considered because
> >     > they do not standardize the set of devices that operate on top
> >     > of them, but in practice are often used for fully general
> >     > devices. Spec-wise, this is not a great situation because we
> >     > would still have potentially non-portable device implementations
> >     > with no standard mechanism to determine whether or not things
> >     > are portable.
> >
> >     Well, it's easy to add an enumeration on top of sockets, and
> >     several well-known solutions exist. There's an advantage to just
> >     reusing these.
> >
> >
> > Sure, but the virtio meta-device has many unique features and
> > desirable properties, because (as explained in the spec) there are
> > limitations to network/socket-based communication.
> >
> >
> >     > virtio-user provides a device enumeration mechanism
> >     > to better control this.
> >
> >     We'll have to see what it all looks like. For the virtio PCI
> >     transport it's important that you can reason about the device at
> >     a basic level based on its PCI ID, and that is quite fundamental.
> >
> >
> > The spec contains more details; basically, the device itself is
> > always portable, and there is a configuration protocol to negotiate
> > whether a particular use of the device is available. This is similar
> > to PCI, but with more defined ways to operate the device in terms of
> > callbacks in shared libraries on the host.
> >
> >
> >     Maybe what you are looking for is a new virtio transport then?
> >
> >
> >
> > Perhaps something like a virtio host-memory transport? But at the
> > same time, it needs to interact with shared memory, which is best set
> > up as a PCI device. Can we mix transport types? In any case, the
> > analog of "PCI IDs" here (the vendor/device/version numbers) is
> > meaningful, with the contract being that the user of the device needs
> > to match on vendor/device ID and negotiate on the version number.
>
> Virtio fundamentally uses feature bits, not versions. It's been pretty
> successful in maintaining compatibility across a wide range of
> hypervisor/guest revisions.
>

The proposed device should be compatible with all hypervisor/guest
revisions as-is, without feature bits.
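For illustration, the match-exactly-on-ID, negotiate-on-version contract
described earlier in the thread could be sketched like this. All struct
and function names here are hypothetical, not taken from the proposed
spec:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identity block a virtio-user device would expose: the
 * driver matches vendor and device ID exactly, and accepts any device
 * whose version is at least the one the driver was written against. */
struct virtio_user_device_id {
    uint32_t vendor_id;
    uint32_t device_id;
    uint32_t version;
};

static bool virtio_user_device_matches(const struct virtio_user_device_id *dev,
                                       const struct virtio_user_device_id *want)
{
    return dev->vendor_id == want->vendor_id &&
           dev->device_id == want->device_id &&
           dev->version >= want->version;   /* negotiate on version */
}
```

The point is that, unlike virtio feature bits, the ID space here is
user-defined: the device itself stays portable, and only the per-ID
match can fail.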


> > What are the advantages of defining a new virtio transport type? It
> > would be something that has the IDs and is able to handle resolving
> > offsets to physical addresses, and physical addresses to host memory
> > addresses, in addition to dispatching to callbacks on the host. But
> > it would be effectively equivalent to having a new virtio device type
> > with device ID enumeration, right?
>
> Under virtio PCI, device IDs are all defined in the virtio spec.
> If you want your own ID scheme, you want an alternative transport.
> But now what you describe looks kind of like vhost-pci to me.

Yet, vhost is still not a good fit, due to its reliance on
sockets/network functionality. It looks like we want to build something
that has aspects in common with other virtio devices, but there is no
combination of existing virtio devices that works:

   - the locality of virtio-fs, but not specialized to fuse
   - the device id enumeration from virtio pci, but not in the pci id space
   - the general communication of virtio-vsock, but not specialized to
   sockets and allowing host memory-based transport

So, I still think it should be a new device (or transport if that works
better).


> >
> >
> >     > In addition, for performance considerations in applications
> >     > such as graphics and media, virtio-serial/virtio-vsock have the
> >     > overhead of sending actual traffic through the virtqueue, while
> >     > an approach based on shared memory can result in fewer copies
> >     > and virtqueue messages. virtio-serial is also limited in being
> >     > specialized for console forwarding and in having a cap on the
> >     > number of clients. virtio-vsock is also not optimal in its
> >     > choice of a sockets API for transport; shared memory cannot be
> >     > used, arbitrary strings can be passed without a designation of
> >     > the device/driver being run de facto, and the guest must have
> >     > additional machinery to handle socket APIs. In addition, on the
> >     > host, sockets are only dependable on Linux, with less
> >     > predictable behavior from Windows/macOS regarding Unix sockets.
> >     > Waiting for socket traffic on the host also requires a poll()
> >     > loop, which is suboptimal for latency. With virtio-user, only
> >     > the bare set of standard driver calls
> >     > (open/close/ioctl/mmap/read) is needed, and RAM is a more
> >     > universal transport abstraction. We also explicitly spec out
> >     > callbacks on the host that are triggered by virtqueue messages,
> >     > which results in lower latency and makes it easy to dispatch to
> >     > a particular device implementation without polling.
> >
> >     open/close/mmap/read seem to make sense. ioctl gives one pause.
> >
> >
> > ioctl() would be used to send ping messages, but I'm not fixated on
> > that choice; write() is also a possibility. I preferred ioctl()
> > because it should be clear that it's a control message, not a data
> > message.
>
> Yes, if the supported ioctls are whitelisted and not blindly passed
> through (e.g. send a ping message), then it does not matter.
>
>
There would be one whitelisted ioctl, IOCTL_PING, with a struct close to
what is specified in the proposed spec, and with the instance handle
field populated by the kernel.
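A minimal sketch of what that single whitelisted ioctl could look like
on Linux. The field names, layout, and ioctl number are assumptions for
illustration, not the actual definitions from the proposed spec:

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical ping message: userspace fills in everything except
 * instance_handle, which the guest kernel driver populates before
 * forwarding the message to the host over the virtqueue. */
struct virtio_user_ping {
    uint64_t instance_handle;  /* populated by the kernel */
    uint64_t metadata;         /* user-defined control word */
    uint64_t offset;           /* offset into the shared memory region */
    uint64_t size;             /* size of the region the ping refers to */
};

/* _IOWR: the caller writes the struct in and reads the handle back,
 * making it explicit that this is a control path, not a data path. */
#define VIRTIO_USER_IOCTL_PING _IOWR('v', 0x00, struct virtio_user_ping)
```

Whitelisting a single, fixed-layout control message like this avoids the
concern about blindly passing arbitrary ioctls through to the host.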


>
> >
> >     Given open/close this begins to look a bit like virtio-fs.
> >     Have you looked at that?
> >
> >
> >
> > That's an interesting possibility, since virtio-fs maps host
> > pointers as well, which fits our use cases. Another alternative is to
> > add the features unique to virtio-user to virtio-fs: device
> > enumeration, memory-sharing operations, and operation in terms of
> > callbacks on the host. However, it doesn't seem like a good fit, due
> > to being specialized to filesystem operations.
>
> Well everything is a file :)
>
Not according to virtio-fs; it specializes in FUSE support in the guest
kernel. However, it would work for me if the unique features of
virtio-user were added to it, dropping the requirement to use FUSE.

> >
> >
> >     --
> >     MST
> >
>


