All of lore.kernel.org
 help / color / mirror / Atom feed
From: Will Deacon <will@kernel.org>
To: Stefan Hajnoczi <shajnocz@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>,
	mst@redhat.com, Stefan Hajnoczi <stefanha@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	torvalds@linux-foundation.org, ascull@google.com, maz@kernel.org,
	keirf@google.com, jiyong@google.com, kernel-team@android.com,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org
Subject: Re: IOTLB support for vhost/vsock breaks crosvm on Android
Date: Sat, 6 Aug 2022 15:34:44 +0100	[thread overview]
Message-ID: <20220806143443.GA30658@willie-the-truck> (raw)
In-Reply-To: <CAD60JZMbbkwFHqCm_iCrOrKgRLBUMkDQfuJ=Q1T-sZt59eTBrw@mail.gmail.com>

On Sat, Aug 06, 2022 at 06:52:15AM -0400, Stefan Hajnoczi wrote:
> On Sat, Aug 6, 2022 at 5:50 AM Will Deacon <will@kernel.org> wrote:
> > On Sat, Aug 06, 2022 at 09:48:28AM +0200, Stefano Garzarella wrote:
> > > On Fri, Aug 05, 2022 at 07:11:06PM +0100, Will Deacon wrote:
> > > If the VMM implements the translation feature, it is right in my opinion
> > > that it does not enable the feature for the vhost device. Otherwise, if it
> > > wants the vhost device to do the translation, enable the feature and send
> > > the IOTLB messages to set the translation.
> > >
> > > QEMU for example masks features when not required or supported.
> > > crosvm should negotiate only the features it supports.
> > >
> > > @Michael and @Jason can correct me, but if a vhost device negotiates
> > > VIRTIO_F_ACCESS_PLATFORM, then it expects the VMM to send IOTLB messages to
> > > set the translation.
> >
> > As above, the issue is that vhost now unconditionally advertises this in
> > VHOST_GET_FEATURES and so a VMM with no knowledge of IOTLB can end up
> > enabling it by accident.
> 
> Unconditionally exposing all vhost feature bits to the guest is
> incorrect. The emulator must filter out only the feature bits that it
> supports.

I've evidently done a bad job of explaining this, sorry.

crosvm _does_ filter the feature bits which it passes to vhost. It takes the
feature set which it has negotiated with the guest and then takes the
intersection of this set with the set of features which vhost advertises.
The result is what is passed to VHOST_SET_FEATURES. I included the rust
for this in my initial mail, but in C it might look something like:

	u64 features = negotiate_features_with_guest(dev);

	ioctl(vhost_fd, VHOST_GET_FEATURES, &vhost_features);
	vhost_features &= features;	/* Mask out unsupported features */
	ioctl(vhost_fd, VHOST_SET_FEATURES, &vhost_features);

The problem is that crosvm has negotiated VIRTIO_F_ACCESS_PLATFORM with
the guest so that restricted DMA is used for the virtio devices. With
e13a6915a03f, VIRTIO_F_ACCESS_PLATFORM is now advertised by
VHOST_GET_FEATURES and so IOTLB is enabled by the sequence above.

> For example, see QEMU's vhost-net device's vhost feature bit allowlist:
> https://gitlab.com/qemu-project/qemu/-/blob/master/hw/net/vhost_net.c#L40

I agree that changing crosvm to use an allowlist would fix the problem,
I'm just questioning whether we should be changing userspace at all to
resolve this issue.

> The reason why the emulator (crosvm/QEMU/etc) must filter out feature
> bits is that vhost devices are full VIRTIO devices. They are a subset
> of a VIRTIO device and the emulator is responsible for the rest of the
> device. Some features will require both vhost and emulator support.
> Therefore it is incorrect to expect the device to work correctly if
> the vhost feature bits are passed through to the guest.

I think crosvm is trying to cater for this by masking out the features
it doesn't know about.

Will

WARNING: multiple messages have this Message-ID (diff)
From: Will Deacon <will@kernel.org>
To: Stefan Hajnoczi <shajnocz@redhat.com>
Cc: jiyong@google.com, kvm@vger.kernel.org, mst@redhat.com,
	maz@kernel.org, keirf@google.com, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, ascull@google.com,
	Stefan Hajnoczi <stefanha@redhat.com>,
	kernel-team@android.com, torvalds@linux-foundation.org
Subject: Re: IOTLB support for vhost/vsock breaks crosvm on Android
Date: Sat, 6 Aug 2022 15:34:44 +0100	[thread overview]
Message-ID: <20220806143443.GA30658@willie-the-truck> (raw)
In-Reply-To: <CAD60JZMbbkwFHqCm_iCrOrKgRLBUMkDQfuJ=Q1T-sZt59eTBrw@mail.gmail.com>

On Sat, Aug 06, 2022 at 06:52:15AM -0400, Stefan Hajnoczi wrote:
> On Sat, Aug 6, 2022 at 5:50 AM Will Deacon <will@kernel.org> wrote:
> > On Sat, Aug 06, 2022 at 09:48:28AM +0200, Stefano Garzarella wrote:
> > > On Fri, Aug 05, 2022 at 07:11:06PM +0100, Will Deacon wrote:
> > > If the VMM implements the translation feature, it is right in my opinion
> > > that it does not enable the feature for the vhost device. Otherwise, if it
> > > wants the vhost device to do the translation, enable the feature and send
> > > the IOTLB messages to set the translation.
> > >
> > > QEMU for example masks features when not required or supported.
> > > crosvm should negotiate only the features it supports.
> > >
> > > @Michael and @Jason can correct me, but if a vhost device negotiates
> > > VIRTIO_F_ACCESS_PLATFORM, then it expects the VMM to send IOTLB messages to
> > > set the translation.
> >
> > As above, the issue is that vhost now unconditionally advertises this in
> > VHOST_GET_FEATURES and so a VMM with no knowledge of IOTLB can end up
> > enabling it by accident.
> 
> Unconditionally exposing all vhost feature bits to the guest is
> incorrect. The emulator must filter out only the feature bits that it
> supports.

I've evidently done a bad job of explaining this, sorry.

crosvm _does_ filter the feature bits which it passes to vhost. It takes the
feature set which it has negotiated with the guest and then takes the
intersection of this set with the set of features which vhost advertises.
The result is what is passed to VHOST_SET_FEATURES. I included the rust
for this in my initial mail, but in C it might look something like:

	u64 features = negotiate_features_with_guest(dev);

	ioctl(vhost_fd, VHOST_GET_FEATURES, &vhost_features);
	vhost_features &= features;	/* Mask out unsupported features */
	ioctl(vhost_fd, VHOST_SET_FEATURES, &vhost_features);

The problem is that crosvm has negotiated VIRTIO_F_ACCESS_PLATFORM with
the guest so that restricted DMA is used for the virtio devices. With
e13a6915a03f, VIRTIO_F_ACCESS_PLATFORM is now advertised by
VHOST_GET_FEATURES and so IOTLB is enabled by the sequence above.

> For example, see QEMU's vhost-net device's vhost feature bit allowlist:
> https://gitlab.com/qemu-project/qemu/-/blob/master/hw/net/vhost_net.c#L40

I agree that changing crosvm to use an allowlist would fix the problem,
I'm just questioning whether we should be changing userspace at all to
resolve this issue.

> The reason why the emulator (crosvm/QEMU/etc) must filter out feature
> bits is that vhost devices are full VIRTIO devices. They are a subset
> of a VIRTIO device and the emulator is responsible for the rest of the
> device. Some features will require both vhost and emulator support.
> Therefore it is incorrect to expect the device to work correctly if
> the vhost feature bits are passed through to the guest.

I think crosvm is trying to cater for this by masking out the features
it doesn't know about.

Will
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

  parent reply	other threads:[~2022-08-06 14:34 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-05 18:11 IOTLB support for vhost/vsock breaks crosvm on Android Will Deacon
2022-08-05 18:11 ` Will Deacon
2022-08-05 22:57 ` Linus Torvalds
2022-08-05 22:57   ` Linus Torvalds
2022-08-06  8:17   ` Stefano Garzarella
2022-08-06  8:17     ` Stefano Garzarella
2022-08-06  9:42   ` Will Deacon
2022-08-06  9:42     ` Will Deacon
2022-08-07  6:52   ` Christoph Hellwig
2022-08-07  6:52     ` Christoph Hellwig
2022-08-07 13:35     ` Michael S. Tsirkin
2022-08-07 13:35       ` Michael S. Tsirkin
2022-08-07 13:27   ` Michael S. Tsirkin
2022-08-07 13:27     ` Michael S. Tsirkin
2022-08-06  7:48 ` Stefano Garzarella
2022-08-06  7:48   ` Stefano Garzarella
2022-08-06  9:42   ` Will Deacon
2022-08-06  9:42     ` Will Deacon
2022-08-06 10:52     ` Stefan Hajnoczi
2022-08-06 10:52       ` Stefan Hajnoczi
2022-08-06 14:34       ` Will Deacon [this message]
2022-08-06 14:34         ` Will Deacon
2022-08-06 21:47         ` Stefan Hajnoczi
2022-08-06 10:52     ` Stefano Garzarella
2022-08-06 10:52       ` Stefano Garzarella
2022-08-07 13:14 ` Michael S. Tsirkin
2022-08-07 13:14   ` Michael S. Tsirkin
2022-08-08 10:18   ` Will Deacon
2022-08-08 10:18     ` Will Deacon
2022-08-08 12:45     ` Michael S. Tsirkin
2022-08-08 12:45       ` Michael S. Tsirkin
2022-08-08 14:33       ` Stefan Hajnoczi
2022-08-09  3:16       ` Jason Wang
2022-08-09  3:16         ` Jason Wang
2022-08-17 13:48       ` Will Deacon
2022-08-17 13:48         ` Will Deacon
2022-08-17 17:05         ` Michael S. Tsirkin
2022-08-17 17:05           ` Michael S. Tsirkin
2022-08-09  3:12   ` Jason Wang
2022-08-09  3:12     ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220806143443.GA30658@willie-the-truck \
    --to=will@kernel.org \
    --cc=ascull@google.com \
    --cc=jasowang@redhat.com \
    --cc=jiyong@google.com \
    --cc=keirf@google.com \
    --cc=kernel-team@android.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mst@redhat.com \
    --cc=sgarzare@redhat.com \
    --cc=shajnocz@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.