qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
@ 2021-02-19 16:04 Alex Bennée
  2021-02-22 13:06 ` Dr. David Alan Gilbert
  2021-02-23 11:44 ` Michael S. Tsirkin
  0 siblings, 2 replies; 11+ messages in thread
From: Alex Bennée @ 2021-02-19 16:04 UTC (permalink / raw)
  To: qemu-devel, rust-vmm
  Cc: Marc-André Lureau, Dr. David Alan Gilbert, Sergio Lopez,
	Stefan Hajnoczi, Michael S. Tsirkin

Hi,

I finally got a chance to get down into the guts of vhost-user while
attempting to port my original C RPMB daemon to Rust using the
vhost-user-backend and related crates. I ended up with this hang during
negotiation:

  startup

  vhost_user_write req:1 flags:0x1
  vhost_user_read_start
  vhost_user_read req:1 flags:0x5
  vhost_user_backend_init: we got 170000000
  vhost_user_write req:15 flags:0x1
  vhost_user_read_start
  vhost_user_read req:15 flags:0x5
  vhost_user_set_protocol_features: 2008
  vhost_user_write req:16 flags:0x1
  vhost_user_write req:3 flags:0x1
  vhost_user_write req:1 flags:0x1
  vhost_user_read_start
  vhost_user_read req:1 flags:0x5
  vhost_user_write req:13 flags:0x1

  kernel initialises device

  virtio_rpmb virtio1: init done!
  vhost_user_write req:13 flags:0x1
  vhost_dev_set_features: 130000000
  vhost_user_set_features: 130000000
  vhost_user_write req:2 flags:0x1
  vhost_user_write req:5 flags:0x9
  vhost_user_read_start

The proximate cause is the vhost crate handling:

  MasterReq::SET_MEM_TABLE => {
      let res = self.set_mem_table(&hdr, size, &buf, rfds);
      self.send_ack_message(&hdr, res)?;
  }

which gates on the replay_ack_enabled flag:

    fn send_ack_message(
        &mut self,
        req: &VhostUserMsgHeader<MasterReq>,
        res: Result<()>,
    ) -> Result<()> {
        if dbg!(self.reply_ack_enabled) {
            let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
            let val = match res {
                Ok(_) => 0,
                Err(_) => 1,
            };
            let msg = VhostUserU64::new(val);
            self.main_sock.send_message(&hdr, &msg, None)?;
        }
        Ok(())
    }

which is only set when we have all the appropriate acknowledged flags:

    fn update_reply_ack_flag(&mut self) {
        let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
        let pflag = VhostUserProtocolFeatures::REPLY_ACK;
        if (self.virtio_features & vflag) != 0
            && (self.acked_virtio_features & vflag) != 0
            && self.protocol_features.contains(pflag)
            && (self.acked_protocol_features & pflag.bits()) != 0
        {
            self.reply_ack_enabled = true;
        } else {
            self.reply_ack_enabled = false;
        }
    }

which from above you can see QEMU helpfully dropped those bits in the
reply. It does however work in the C/libvhost version:

  virtio_rpmb virtio1: init done!
  vhost_user_write req:13 flags:0x1
  vhost_dev_set_features: 130000000
  vhost_user_set_features: 130000000
  vhost_user_write req:2 flags:0x1
  vhost_user_write req:37 flags:0x9
  vhost_user_read_start
  vhost_user_read req:37 flags:0x5
  vhost_user_write req:8 flags:0x1
  vhost_user_write req:10 flags:0x1
  vhost_user_write req:9 flags:0x1
  vhost_user_write req:12 flags:0x1
  vhost_user_write req:13 flags:0x1

albeit with a slightly different message sequence
(VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
the C code you can see why:

    need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;

    reply_requested = vu_process_message(dev, &vmsg);
    if (!reply_requested && need_reply) {
        vmsg_set_reply_u64(&vmsg, 0);
        reply_requested = 1;
    }

So regardless of what may have been negotiated it will always reply with
something if the master requested it do so. This points us at the
specification which reads:

  - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
    details.

which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
be honoured when the feature has been negotiated. Which brings us to a
series of questions:

 - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
   when doing the eventual VHOST_USER_SET_FEATURES reply?

 - Is vhost.rs being to strict or libvhost-user too lax in interpreting
   the negotiated features before processing the ``need_reply`` [Bit 3]
   field of the messages?

 - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
   in the "list of the ones that do" require replies or do they only
   reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
   box out seems to imply?

Currently I have some hacks in:

  https://github.com/stsquad/vhost/tree/my-hacks

which gets my daemon booting up to the point we actually need to do a
transaction. However I won't submit a PR until I've worked out exactly
where the problems are.

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-19 16:04 vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs) Alex Bennée
@ 2021-02-22 13:06 ` Dr. David Alan Gilbert
  2021-02-22 13:21   ` Alex Bennée
  2021-02-23 11:44 ` Michael S. Tsirkin
  1 sibling, 1 reply; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-22 13:06 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
	rust-vmm, Marc-André Lureau

* Alex Bennée (alex.bennee@linaro.org) wrote:
> Hi,
> 
> I finally got a chance to get down into the guts of vhost-user while
> attempting to port my original C RPMB daemon to Rust using the
> vhost-user-backend and related crates. I ended up with this hang during
> negotiation:
> 
>   startup
> 
>   vhost_user_write req:1 flags:0x1
>   vhost_user_read_start
>   vhost_user_read req:1 flags:0x5
>   vhost_user_backend_init: we got 170000000
>   vhost_user_write req:15 flags:0x1
>   vhost_user_read_start
>   vhost_user_read req:15 flags:0x5
>   vhost_user_set_protocol_features: 2008
>   vhost_user_write req:16 flags:0x1
>   vhost_user_write req:3 flags:0x1
>   vhost_user_write req:1 flags:0x1
>   vhost_user_read_start
>   vhost_user_read req:1 flags:0x5
>   vhost_user_write req:13 flags:0x1
> 
>   kernel initialises device
> 
>   virtio_rpmb virtio1: init done!
>   vhost_user_write req:13 flags:0x1
>   vhost_dev_set_features: 130000000
>   vhost_user_set_features: 130000000
>   vhost_user_write req:2 flags:0x1
>   vhost_user_write req:5 flags:0x9
>   vhost_user_read_start
> 
> The proximate cause is the vhost crate handling:
> 
>   MasterReq::SET_MEM_TABLE => {
>       let res = self.set_mem_table(&hdr, size, &buf, rfds);
>       self.send_ack_message(&hdr, res)?;
>   }
> 
> which gates on the replay_ack_enabled flag:
> 
>     fn send_ack_message(
>         &mut self,
>         req: &VhostUserMsgHeader<MasterReq>,
>         res: Result<()>,
>     ) -> Result<()> {
>         if dbg!(self.reply_ack_enabled) {
>             let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
>             let val = match res {
>                 Ok(_) => 0,
>                 Err(_) => 1,
>             };
>             let msg = VhostUserU64::new(val);
>             self.main_sock.send_message(&hdr, &msg, None)?;
>         }
>         Ok(())
>     }
> 
> which is only set when we have all the appropriate acknowledged flags:
> 
>     fn update_reply_ack_flag(&mut self) {
>         let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
>         let pflag = VhostUserProtocolFeatures::REPLY_ACK;
>         if (self.virtio_features & vflag) != 0
>             && (self.acked_virtio_features & vflag) != 0
>             && self.protocol_features.contains(pflag)
>             && (self.acked_protocol_features & pflag.bits()) != 0
>         {
>             self.reply_ack_enabled = true;
>         } else {
>             self.reply_ack_enabled = false;
>         }
>     }
> 
> which from above you can see QEMU helpfully dropped those bits in the
> reply. It does however work in the C/libvhost version:
> 
>   virtio_rpmb virtio1: init done!
>   vhost_user_write req:13 flags:0x1
>   vhost_dev_set_features: 130000000
>   vhost_user_set_features: 130000000
>   vhost_user_write req:2 flags:0x1
>   vhost_user_write req:37 flags:0x9
>   vhost_user_read_start
>   vhost_user_read req:37 flags:0x5
>   vhost_user_write req:8 flags:0x1
>   vhost_user_write req:10 flags:0x1
>   vhost_user_write req:9 flags:0x1
>   vhost_user_write req:12 flags:0x1
>   vhost_user_write req:13 flags:0x1
> 
> albeit with a slightly different message sequence
> (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> the C code you can see why:
> 
>     need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
> 
>     reply_requested = vu_process_message(dev, &vmsg);
>     if (!reply_requested && need_reply) {
>         vmsg_set_reply_u64(&vmsg, 0);
>         reply_requested = 1;
>     }
> 
> So regardless of what may have been negotiated it will always reply with
> something if the master requested it do so. This points us at the
> specification which reads:
> 
>   - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
>     details.
> 
> which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> be honoured when the feature has been negotiated. Which brings us to a
> series of questions:
> 
>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>    when doing the eventual VHOST_USER_SET_FEATURES reply?
> 
>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>    the negotiated features before processing the ``need_reply`` [Bit 3]
>    field of the messages?

I think vhost.rs is being correctly strict - but there would be no harm
in it flagging that you'd hit an inconsistency if it finds a need_reply
without the feature.

>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>    in the "list of the ones that do" require replies or do they only
>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>    box out seems to imply?

set_mem_table gives a reply when postcopy is enabled (and then qemu
replies to the reply!) but otherwise doesn't.
(Note there's an issue opened for .rs to support ADD_MEM_REGION
since it's a lot better than SET_MEM_TABLE which has a fixed size table
that's small).

Dave

> Currently I have some hacks in:
> 
>   https://github.com/stsquad/vhost/tree/my-hacks
> 
> which gets my daemon booting up to the point we actually need to do a
> transaction. However I won't submit a PR until I've worked out exactly
> where the problems are.
> 
> -- 
> Alex Bennée
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-22 13:06 ` Dr. David Alan Gilbert
@ 2021-02-22 13:21   ` Alex Bennée
  2021-02-22 13:27     ` Dr. David Alan Gilbert
  2021-02-26 19:58     ` Raphael Norwitz
  0 siblings, 2 replies; 11+ messages in thread
From: Alex Bennée @ 2021-02-22 13:21 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
	rust-vmm, Marc-André Lureau


Dr. David Alan Gilbert <dgilbert@redhat.com> writes:

> * Alex Bennée (alex.bennee@linaro.org) wrote:
>> Hi,
>> 
>> I finally got a chance to get down into the guts of vhost-user while
>> attempting to port my original C RPMB daemon to Rust using the
>> vhost-user-backend and related crates. I ended up with this hang during
>> negotiation:
>> 
>>   startup
>> 
>>   vhost_user_write req:1 flags:0x1
>>   vhost_user_read_start
>>   vhost_user_read req:1 flags:0x5
>>   vhost_user_backend_init: we got 170000000

GET_FEATURES

>>   vhost_user_write req:15 flags:0x1
>>   vhost_user_read_start
>>   vhost_user_read req:15 flags:0x5
>>   vhost_user_set_protocol_features: 2008
>>   vhost_user_write req:16 flags:0x1
>>   vhost_user_write req:3 flags:0x1
>>   vhost_user_write req:1 flags:0x1
>>   vhost_user_read_start
>>   vhost_user_read req:1 flags:0x5
>>   vhost_user_write req:13 flags:0x1
>> 
>>   kernel initialises device
>> 
>>   virtio_rpmb virtio1: init done!
>>   vhost_user_write req:13 flags:0x1
>>   vhost_dev_set_features: 130000000
>>   vhost_user_set_features: 130000000

SET_FEATURES

>>   vhost_user_write req:2 flags:0x1
>>   vhost_user_write req:5 flags:0x9
>>   vhost_user_read_start
>> 
<snip>
>> 
>>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>>    when doing the eventual VHOST_USER_SET_FEATURES reply?
>> 
>>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>>    the negotiated features before processing the ``need_reply`` [Bit 3]
>>    field of the messages?
>
> I think vhost.rs is being correctly strict - but there would be no harm
> in it flagging that you'd hit an inconsistency if it finds a need_reply
> without the feature.

But the feature should have been negotiated. So unless the slave can
assume it is enabled because it asked I think QEMU is in the wrong by
not preserving the feature bits in it's SET_FEATURES reply. We just gets
away with it with libvhostuser being willing to reply anyway.

>
>>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>>    in the "list of the ones that do" require replies or do they only
>>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>>    box out seems to imply?
>
> set_mem_table gives a reply when postcopy is enabled (and then qemu
> replies to the reply!) but otherwise doesn't.
> (Note there's an issue opened for .rs to support ADD_MEM_REGION
> since it's a lot better than SET_MEM_TABLE which has a fixed size table
> that's small).

Thanks for the heads up.

>
> Dave
>
>> Currently I have some hacks in:
>> 
>>   https://github.com/stsquad/vhost/tree/my-hacks
>> 
>> which gets my daemon booting up to the point we actually need to do a
>> transaction. However I won't submit a PR until I've worked out exactly
>> where the problems are.
>> 
>> -- 
>> Alex Bennée
>> 


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-22 13:21   ` Alex Bennée
@ 2021-02-22 13:27     ` Dr. David Alan Gilbert
  2021-02-23 10:23       ` [Rust-VMM] " Jiang Liu
  2021-02-26 19:58     ` Raphael Norwitz
  1 sibling, 1 reply; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-22 13:27 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
	rust-vmm, Marc-André Lureau

* Alex Bennée (alex.bennee@linaro.org) wrote:
> 
> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
> 
> > * Alex Bennée (alex.bennee@linaro.org) wrote:
> >> Hi,
> >> 
> >> I finally got a chance to get down into the guts of vhost-user while
> >> attempting to port my original C RPMB daemon to Rust using the
> >> vhost-user-backend and related crates. I ended up with this hang during
> >> negotiation:
> >> 
> >>   startup
> >> 
> >>   vhost_user_write req:1 flags:0x1
> >>   vhost_user_read_start
> >>   vhost_user_read req:1 flags:0x5
> >>   vhost_user_backend_init: we got 170000000
> 
> GET_FEATURES
> 
> >>   vhost_user_write req:15 flags:0x1
> >>   vhost_user_read_start
> >>   vhost_user_read req:15 flags:0x5
> >>   vhost_user_set_protocol_features: 2008
> >>   vhost_user_write req:16 flags:0x1
> >>   vhost_user_write req:3 flags:0x1
> >>   vhost_user_write req:1 flags:0x1
> >>   vhost_user_read_start
> >>   vhost_user_read req:1 flags:0x5
> >>   vhost_user_write req:13 flags:0x1
> >> 
> >>   kernel initialises device
> >> 
> >>   virtio_rpmb virtio1: init done!
> >>   vhost_user_write req:13 flags:0x1
> >>   vhost_dev_set_features: 130000000
> >>   vhost_user_set_features: 130000000
> 
> SET_FEATURES
> 
> >>   vhost_user_write req:2 flags:0x1
> >>   vhost_user_write req:5 flags:0x9
> >>   vhost_user_read_start
> >> 
> <snip>
> >> 
> >>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> >>    when doing the eventual VHOST_USER_SET_FEATURES reply?
> >> 
> >>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> >>    the negotiated features before processing the ``need_reply`` [Bit 3]
> >>    field of the messages?
> >
> > I think vhost.rs is being correctly strict - but there would be no harm
> > in it flagging that you'd hit an inconsistency if it finds a need_reply
> > without the feature.
> 
> But the feature should have been negotiated. So unless the slave can
> assume it is enabled because it asked I think QEMU is in the wrong by
> not preserving the feature bits in it's SET_FEATURES reply. We just gets
> away with it with libvhostuser being willing to reply anyway.

Oh I wasn't trying to reply to that bit; I can never remember how the
vhost/virtio feature bit negotiation works...

Dave

> >
> >>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> >>    in the "list of the ones that do" require replies or do they only
> >>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> >>    box out seems to imply?
> >
> > set_mem_table gives a reply when postcopy is enabled (and then qemu
> > replies to the reply!) but otherwise doesn't.
> > (Note there's an issue opened for .rs to support ADD_MEM_REGION
> > since it's a lot better than SET_MEM_TABLE which has a fixed size table
> > that's small).
> 
> Thanks for the heads up.
> 
> >
> > Dave
> >
> >> Currently I have some hacks in:
> >> 
> >>   https://github.com/stsquad/vhost/tree/my-hacks
> >> 
> >> which gets my daemon booting up to the point we actually need to do a
> >> transaction. However I won't submit a PR until I've worked out exactly
> >> where the problems are.
> >> 
> >> -- 
> >> Alex Bennée
> >> 
> 
> 
> -- 
> Alex Bennée
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Rust-VMM] vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-22 13:27     ` Dr. David Alan Gilbert
@ 2021-02-23 10:23       ` Jiang Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Jiang Liu @ 2021-02-23 10:23 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
	rust-vmm, Marc-André Lureau, Alex Bennée

I just found this thread in my email junk box:(
I do have found some bugs in the vhost_rs crate, related to handle the need_reply flag.
But that bug only affects virtio-fs fs_map operations.
Please refer to:
https://github.com/rust-vmm/vhost/pull/19
https://github.com/rust-vmm/vhost/pull/19/commits/a2c5a4f50e45ae1ab78622dda9a3f861bd43a17e

Thanks,
Gerry

> On Feb 22, 2021, at 9:27 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> 
> * Alex Bennée (alex.bennee@linaro.org) wrote:
>> 
>> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>> 
>>> * Alex Bennée (alex.bennee@linaro.org) wrote:
>>>> Hi,
>>>> 
>>>> I finally got a chance to get down into the guts of vhost-user while
>>>> attempting to port my original C RPMB daemon to Rust using the
>>>> vhost-user-backend and related crates. I ended up with this hang during
>>>> negotiation:
>>>> 
>>>> startup
>>>> 
>>>> vhost_user_write req:1 flags:0x1
>>>> vhost_user_read_start
>>>> vhost_user_read req:1 flags:0x5
>>>> vhost_user_backend_init: we got 170000000
>> 
>> GET_FEATURES
>> 
>>>> vhost_user_write req:15 flags:0x1
>>>> vhost_user_read_start
>>>> vhost_user_read req:15 flags:0x5
>>>> vhost_user_set_protocol_features: 2008
>>>> vhost_user_write req:16 flags:0x1
>>>> vhost_user_write req:3 flags:0x1
>>>> vhost_user_write req:1 flags:0x1
>>>> vhost_user_read_start
>>>> vhost_user_read req:1 flags:0x5
>>>> vhost_user_write req:13 flags:0x1
>>>> 
>>>> kernel initialises device
>>>> 
>>>> virtio_rpmb virtio1: init done!
>>>> vhost_user_write req:13 flags:0x1
>>>> vhost_dev_set_features: 130000000
>>>> vhost_user_set_features: 130000000
>> 
>> SET_FEATURES
>> 
>>>> vhost_user_write req:2 flags:0x1
>>>> vhost_user_write req:5 flags:0x9
>>>> vhost_user_read_start
>>>> 
>> <snip>
>>>> 
>>>> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>>>>  when doing the eventual VHOST_USER_SET_FEATURES reply?
>>>> 
>>>> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>>>>  the negotiated features before processing the ``need_reply`` [Bit 3]
>>>>  field of the messages?
>>> 
>>> I think vhost.rs is being correctly strict - but there would be no harm
>>> in it flagging that you'd hit an inconsistency if it finds a need_reply
>>> without the feature.
>> 
>> But the feature should have been negotiated. So unless the slave can
>> assume it is enabled because it asked I think QEMU is in the wrong by
>> not preserving the feature bits in it's SET_FEATURES reply. We just gets
>> away with it with libvhostuser being willing to reply anyway.
> 
> Oh I wasn't trying to reply to that bit; I can never remember how the
> vhost/virtio feature bit negotiation works...
> 
> Dave
> 
>>> 
>>>> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>>>>  in the "list of the ones that do" require replies or do they only
>>>>  reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>>>>  box out seems to imply?
>>> 
>>> set_mem_table gives a reply when postcopy is enabled (and then qemu
>>> replies to the reply!) but otherwise doesn't.
>>> (Note there's an issue opened for .rs to support ADD_MEM_REGION
>>> since it's a lot better than SET_MEM_TABLE which has a fixed size table
>>> that's small).
>> 
>> Thanks for the heads up.
>> 
>>> 
>>> Dave
>>> 
>>>> Currently I have some hacks in:
>>>> 
>>>> https://github.com/stsquad/vhost/tree/my-hacks
>>>> 
>>>> which gets my daemon booting up to the point we actually need to do a
>>>> transaction. However I won't submit a PR until I've worked out exactly
>>>> where the problems are.
>>>> 
>>>> -- 
>>>> Alex Bennée
>>>> 
>> 
>> 
>> -- 
>> Alex Bennée
>> 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 
> _______________________________________________
> Rust-vmm mailing list
> Rust-vmm@lists.opendev.org
> http://lists.opendev.org/cgi-bin/mailman/listinfo/rust-vmm



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-19 16:04 vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs) Alex Bennée
  2021-02-22 13:06 ` Dr. David Alan Gilbert
@ 2021-02-23 11:44 ` Michael S. Tsirkin
  2021-02-25  4:19   ` [Rust-VMM] " Dylan Reid
  1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2021-02-23 11:44 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergio Lopez, Dr. David Alan Gilbert, qemu-devel,
	Stefan Hajnoczi, rust-vmm, Marc-André Lureau,
	raphael.norwitz

Cc: Raphael

On Fri, Feb 19, 2021 at 04:04:34PM +0000, Alex Bennée wrote:
> Hi,
> 
> I finally got a chance to get down into the guts of vhost-user while
> attempting to port my original C RPMB daemon to Rust using the
> vhost-user-backend and related crates. I ended up with this hang during
> negotiation:
> 
>   startup
> 
>   vhost_user_write req:1 flags:0x1
>   vhost_user_read_start
>   vhost_user_read req:1 flags:0x5
>   vhost_user_backend_init: we got 170000000
>   vhost_user_write req:15 flags:0x1
>   vhost_user_read_start
>   vhost_user_read req:15 flags:0x5
>   vhost_user_set_protocol_features: 2008
>   vhost_user_write req:16 flags:0x1
>   vhost_user_write req:3 flags:0x1
>   vhost_user_write req:1 flags:0x1
>   vhost_user_read_start
>   vhost_user_read req:1 flags:0x5
>   vhost_user_write req:13 flags:0x1
> 
>   kernel initialises device
> 
>   virtio_rpmb virtio1: init done!
>   vhost_user_write req:13 flags:0x1
>   vhost_dev_set_features: 130000000
>   vhost_user_set_features: 130000000
>   vhost_user_write req:2 flags:0x1
>   vhost_user_write req:5 flags:0x9
>   vhost_user_read_start
> 
> The proximate cause is the vhost crate handling:
> 
>   MasterReq::SET_MEM_TABLE => {
>       let res = self.set_mem_table(&hdr, size, &buf, rfds);
>       self.send_ack_message(&hdr, res)?;
>   }
> 
> which gates on the replay_ack_enabled flag:
> 
>     fn send_ack_message(
>         &mut self,
>         req: &VhostUserMsgHeader<MasterReq>,
>         res: Result<()>,
>     ) -> Result<()> {
>         if dbg!(self.reply_ack_enabled) {
>             let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
>             let val = match res {
>                 Ok(_) => 0,
>                 Err(_) => 1,
>             };
>             let msg = VhostUserU64::new(val);
>             self.main_sock.send_message(&hdr, &msg, None)?;
>         }
>         Ok(())
>     }
> 
> which is only set when we have all the appropriate acknowledged flags:
> 
>     fn update_reply_ack_flag(&mut self) {
>         let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
>         let pflag = VhostUserProtocolFeatures::REPLY_ACK;
>         if (self.virtio_features & vflag) != 0
>             && (self.acked_virtio_features & vflag) != 0
>             && self.protocol_features.contains(pflag)
>             && (self.acked_protocol_features & pflag.bits()) != 0
>         {
>             self.reply_ack_enabled = true;
>         } else {
>             self.reply_ack_enabled = false;
>         }
>     }
> 
> which from above you can see QEMU helpfully dropped those bits in the
> reply. It does however work in the C/libvhost version:
> 
>   virtio_rpmb virtio1: init done!
>   vhost_user_write req:13 flags:0x1
>   vhost_dev_set_features: 130000000
>   vhost_user_set_features: 130000000
>   vhost_user_write req:2 flags:0x1
>   vhost_user_write req:37 flags:0x9
>   vhost_user_read_start
>   vhost_user_read req:37 flags:0x5
>   vhost_user_write req:8 flags:0x1
>   vhost_user_write req:10 flags:0x1
>   vhost_user_write req:9 flags:0x1
>   vhost_user_write req:12 flags:0x1
>   vhost_user_write req:13 flags:0x1
> 
> albeit with a slightly different message sequence
> (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> the C code you can see why:
> 
>     need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
> 
>     reply_requested = vu_process_message(dev, &vmsg);
>     if (!reply_requested && need_reply) {
>         vmsg_set_reply_u64(&vmsg, 0);
>         reply_requested = 1;
>     }
> 
> So regardless of what may have been negotiated it will always reply with
> something if the master requested it do so. This points us at the
> specification which reads:
> 
>   - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
>     details.
> 
> which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> be honoured when the feature has been negotiated. Which brings us to a
> series of questions:
> 
>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>    when doing the eventual VHOST_USER_SET_FEATURES reply?

Hmm looks like a bug indeed ... Anyone wants to look
into fixing that? Marc-André?


>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>    the negotiated features before processing the ``need_reply`` [Bit 3]
>    field of the messages?
> 
>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>    in the "list of the ones that do" require replies or do they only
>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>    box out seems to imply?
> 
> Currently I have some hacks in:
> 
>   https://github.com/stsquad/vhost/tree/my-hacks
> 
> which gets my daemon booting up to the point we actually need to do a
> transaction. However I won't submit a PR until I've worked out exactly
> where the problems are.
> 
> -- 
> Alex Bennée



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Rust-VMM] vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-23 11:44 ` Michael S. Tsirkin
@ 2021-02-25  4:19   ` Dylan Reid
  2021-02-25  6:36     ` Keiichi Watanabe
  0 siblings, 1 reply; 11+ messages in thread
From: Dylan Reid @ 2021-02-25  4:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sergio Lopez, Chirantan Ekbote, qemu-devel,
	Dr. David Alan Gilbert, Keiichi Watanabe, Stefan Hajnoczi,
	rust-vmm, Marc-André Lureau, raphael.norwitz,
	Alex Bennée

On Tue, Feb 23, 2021 at 8:20 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> Cc: Raphael
>
> On Fri, Feb 19, 2021 at 04:04:34PM +0000, Alex Bennée wrote:
> > Hi,
> >
> > I finally got a chance to get down into the guts of vhost-user while
> > attempting to port my original C RPMB daemon to Rust using the
> > vhost-user-backend and related crates. I ended up with this hang during
> > negotiation:
> >
> >   startup
> >
> >   vhost_user_write req:1 flags:0x1
> >   vhost_user_read_start
> >   vhost_user_read req:1 flags:0x5
> >   vhost_user_backend_init: we got 170000000
> >   vhost_user_write req:15 flags:0x1
> >   vhost_user_read_start
> >   vhost_user_read req:15 flags:0x5
> >   vhost_user_set_protocol_features: 2008
> >   vhost_user_write req:16 flags:0x1
> >   vhost_user_write req:3 flags:0x1
> >   vhost_user_write req:1 flags:0x1
> >   vhost_user_read_start
> >   vhost_user_read req:1 flags:0x5
> >   vhost_user_write req:13 flags:0x1
> >
> >   kernel initialises device
> >
> >   virtio_rpmb virtio1: init done!
> >   vhost_user_write req:13 flags:0x1
> >   vhost_dev_set_features: 130000000
> >   vhost_user_set_features: 130000000
> >   vhost_user_write req:2 flags:0x1
> >   vhost_user_write req:5 flags:0x9
> >   vhost_user_read_start
> >
> > The proximate cause is the vhost crate handling:
> >
> >   MasterReq::SET_MEM_TABLE => {
> >       let res = self.set_mem_table(&hdr, size, &buf, rfds);
> >       self.send_ack_message(&hdr, res)?;
> >   }
> >
> > which gates on the replay_ack_enabled flag:
> >
> >     fn send_ack_message(
> >         &mut self,
> >         req: &VhostUserMsgHeader<MasterReq>,
> >         res: Result<()>,
> >     ) -> Result<()> {
> >         if dbg!(self.reply_ack_enabled) {
> >             let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
> >             let val = match res {
> >                 Ok(_) => 0,
> >                 Err(_) => 1,
> >             };
> >             let msg = VhostUserU64::new(val);
> >             self.main_sock.send_message(&hdr, &msg, None)?;
> >         }
> >         Ok(())
> >     }
> >
> > which is only set when we have all the appropriate acknowledged flags:
> >
> >     fn update_reply_ack_flag(&mut self) {
> >         let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
> >         let pflag = VhostUserProtocolFeatures::REPLY_ACK;
> >         if (self.virtio_features & vflag) != 0
> >             && (self.acked_virtio_features & vflag) != 0
> >             && self.protocol_features.contains(pflag)
> >             && (self.acked_protocol_features & pflag.bits()) != 0
> >         {
> >             self.reply_ack_enabled = true;
> >         } else {
> >             self.reply_ack_enabled = false;
> >         }
> >     }
> >
> > which from above you can see QEMU helpfully dropped those bits in the
> > reply. It does however work in the C/libvhost version:
> >
> >   virtio_rpmb virtio1: init done!
> >   vhost_user_write req:13 flags:0x1
> >   vhost_dev_set_features: 130000000
> >   vhost_user_set_features: 130000000
> >   vhost_user_write req:2 flags:0x1
> >   vhost_user_write req:37 flags:0x9
> >   vhost_user_read_start
> >   vhost_user_read req:37 flags:0x5
> >   vhost_user_write req:8 flags:0x1
> >   vhost_user_write req:10 flags:0x1
> >   vhost_user_write req:9 flags:0x1
> >   vhost_user_write req:12 flags:0x1
> >   vhost_user_write req:13 flags:0x1
> >
> > albeit with a slightly different message sequence
> > (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> > the C code you can see why:
> >
> >     need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
> >
> >     reply_requested = vu_process_message(dev, &vmsg);
> >     if (!reply_requested && need_reply) {
> >         vmsg_set_reply_u64(&vmsg, 0);
> >         reply_requested = 1;
> >     }
> >
> > So regardless of what may have been negotiated it will always reply with
> > something if the master requested it do so. This points us at the
> > specification which reads:
> >
> >   - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
> >     details.
> >
> > which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> > be honoured when the feature has been negotiated. Which brings us to a
> > series of questions:
> >
> >  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> >    when doing the eventual VHOST_USER_SET_FEATURES reply?
>
> Hmm looks like a bug indeed ... Anyone wants to look
> into fixing that? Marc-André?

chirantan and keiichi will be implementing vhost-user-vitio-fs on
Chrome OS, maybe one of you two can take a look?


>
>
>
> >  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> >    the negotiated features before processing the ``need_reply`` [Bit 3]
> >    field of the messages?
> >
> >  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> >    in the "list of the ones that do" require replies or do they only
> >    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> >    box out seems to imply?
> >
> > Currently I have some hacks in:
> >
> >   https://github.com/stsquad/vhost/tree/my-hacks
> >
> > which gets my daemon booting up to the point we actually need to do a
> > transaction. However I won't submit a PR until I've worked out exactly
> > where the problems are.
> >
> > --
> > Alex Bennée
>
>
> _______________________________________________
> Rust-vmm mailing list
> Rust-vmm@lists.opendev.org
> http://lists.opendev.org/cgi-bin/mailman/listinfo/rust-vmm


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Rust-VMM] vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-25  4:19   ` [Rust-VMM] " Dylan Reid
@ 2021-02-25  6:36     ` Keiichi Watanabe
  0 siblings, 0 replies; 11+ messages in thread
From: Keiichi Watanabe @ 2021-02-25  6:36 UTC (permalink / raw)
  To: Dylan Reid, woodychow
  Cc: Sergio Lopez, Michael S. Tsirkin, Chirantan Ekbote, qemu-devel,
	Dr. David Alan Gilbert, Stefan Hajnoczi, rust-vmm,
	Marc-André Lureau, raphael.norwitz, Alex Bennée

[-- Attachment #1: Type: text/plain, Size: 6257 bytes --]

On Thu, Feb 25, 2021 at 1:20 PM Dylan Reid <dgreid@chromium.org> wrote:

> On Tue, Feb 23, 2021 at 8:20 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > Cc: Raphael
> >
> > On Fri, Feb 19, 2021 at 04:04:34PM +0000, Alex Bennée wrote:
> > > Hi,
> > >
> > > I finally got a chance to get down into the guts of vhost-user while
> > > attempting to port my original C RPMB daemon to Rust using the
> > > vhost-user-backend and related crates. I ended up with this hang during
> > > negotiation:
> > >
> > >   startup
> > >
> > >   vhost_user_write req:1 flags:0x1
> > >   vhost_user_read_start
> > >   vhost_user_read req:1 flags:0x5
> > >   vhost_user_backend_init: we got 170000000
> > >   vhost_user_write req:15 flags:0x1
> > >   vhost_user_read_start
> > >   vhost_user_read req:15 flags:0x5
> > >   vhost_user_set_protocol_features: 2008
> > >   vhost_user_write req:16 flags:0x1
> > >   vhost_user_write req:3 flags:0x1
> > >   vhost_user_write req:1 flags:0x1
> > >   vhost_user_read_start
> > >   vhost_user_read req:1 flags:0x5
> > >   vhost_user_write req:13 flags:0x1
> > >
> > >   kernel initialises device
> > >
> > >   virtio_rpmb virtio1: init done!
> > >   vhost_user_write req:13 flags:0x1
> > >   vhost_dev_set_features: 130000000
> > >   vhost_user_set_features: 130000000
> > >   vhost_user_write req:2 flags:0x1
> > >   vhost_user_write req:5 flags:0x9
> > >   vhost_user_read_start
> > >
> > > The proximate cause is the vhost crate handling:
> > >
> > >   MasterReq::SET_MEM_TABLE => {
> > >       let res = self.set_mem_table(&hdr, size, &buf, rfds);
> > >       self.send_ack_message(&hdr, res)?;
> > >   }
> > >
> > > which gates on the replay_ack_enabled flag:
> > >
> > >     fn send_ack_message(
> > >         &mut self,
> > >         req: &VhostUserMsgHeader<MasterReq>,
> > >         res: Result<()>,
> > >     ) -> Result<()> {
> > >         if dbg!(self.reply_ack_enabled) {
> > >             let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
> > >             let val = match res {
> > >                 Ok(_) => 0,
> > >                 Err(_) => 1,
> > >             };
> > >             let msg = VhostUserU64::new(val);
> > >             self.main_sock.send_message(&hdr, &msg, None)?;
> > >         }
> > >         Ok(())
> > >     }
> > >
> > > which is only set when we have all the appropriate acknowledged flags:
> > >
> > >     fn update_reply_ack_flag(&mut self) {
> > >         let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
> > >         let pflag = VhostUserProtocolFeatures::REPLY_ACK;
> > >         if (self.virtio_features & vflag) != 0
> > >             && (self.acked_virtio_features & vflag) != 0
> > >             && self.protocol_features.contains(pflag)
> > >             && (self.acked_protocol_features & pflag.bits()) != 0
> > >         {
> > >             self.reply_ack_enabled = true;
> > >         } else {
> > >             self.reply_ack_enabled = false;
> > >         }
> > >     }
> > >
> > > which from above you can see QEMU helpfully dropped those bits in the
> > > reply. It does however work in the C/libvhost version:
> > >
> > >   virtio_rpmb virtio1: init done!
> > >   vhost_user_write req:13 flags:0x1
> > >   vhost_dev_set_features: 130000000
> > >   vhost_user_set_features: 130000000
> > >   vhost_user_write req:2 flags:0x1
> > >   vhost_user_write req:37 flags:0x9
> > >   vhost_user_read_start
> > >   vhost_user_read req:37 flags:0x5
> > >   vhost_user_write req:8 flags:0x1
> > >   vhost_user_write req:10 flags:0x1
> > >   vhost_user_write req:9 flags:0x1
> > >   vhost_user_write req:12 flags:0x1
> > >   vhost_user_write req:13 flags:0x1
> > >
> > > albeit with a slightly different message sequence
> > > (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> > > the C code you can see why:
> > >
> > >     need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
> > >
> > >     reply_requested = vu_process_message(dev, &vmsg);
> > >     if (!reply_requested && need_reply) {
> > >         vmsg_set_reply_u64(&vmsg, 0);
> > >         reply_requested = 1;
> > >     }
> > >
> > > So regardless of what may have been negotiated it will always reply
> with
> > > something if the master requested it do so. This points us at the
> > > specification which reads:
> > >
> > >   - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
> > >     details.
> > >
> > > which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> > > be honoured when the feature has been negotiated. Which brings us to a
> > > series of questions:
> > >
> > >  - Should QEMU have preserved
> VhostUserVirtioFeatures::PROTOCOL_FEATURES
> > >    when doing the eventual VHOST_USER_SET_FEATURES reply?
> >
> > Hmm looks like a bug indeed ... Anyone wants to look
> > into fixing that? Marc-André?
>
> chirantan and keiichi will be implementing vhost-user-vitio-fs on
> Chrome OS, maybe one of you two can take a look?
>
>
Yeah, our team is working on vhost-user virtiofs. I think +Woody Chow
<woodychow@chromium.org> will probably be able to look into this issue.


>
> >
> >
> >
> > >  - Is vhost.rs being to strict or libvhost-user too lax in
> interpreting
> > >    the negotiated features before processing the ``need_reply`` [Bit 3]
> > >    field of the messages?
> > >
> > >  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> > >    in the "list of the ones that do" require replies or do they only
> > >    reply when REPLY_ACK has been negotiated as the ambiguous
> "seealso::"
> > >    box out seems to imply?
> > >
> > > Currently I have some hacks in:
> > >
> > >   https://github.com/stsquad/vhost/tree/my-hacks
> > >
> > > which gets my daemon booting up to the point we actually need to do a
> > > transaction. However I won't submit a PR until I've worked out exactly
> > > where the problems are.
> > >
> > > --
> > > Alex Bennée
> >
> >
> > _______________________________________________
> > Rust-vmm mailing list
> > Rust-vmm@lists.opendev.org
> > http://lists.opendev.org/cgi-bin/mailman/listinfo/rust-vmm
>

[-- Attachment #2: Type: text/html, Size: 8814 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-22 13:21   ` Alex Bennée
  2021-02-22 13:27     ` Dr. David Alan Gilbert
@ 2021-02-26 19:58     ` Raphael Norwitz
  2021-02-26 21:25       ` Raphael Norwitz
  1 sibling, 1 reply; 11+ messages in thread
From: Raphael Norwitz @ 2021-02-26 19:58 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergio Lopez, Michael S. Tsirkin, chirantan, QEMU,
	Dr. David Alan Gilbert, keiichiw, Stefan Hajnoczi, rust-vmm,
	Marc-André Lureau, dgreid

There are two sets of features being negotiated - virtio and
vhost-user.  Based on what you've posted here, I suspect the
VHOST_USER_F_PROTOCOL_FEATURES virtio feature may not be negotiated by
the backend, preventing the vhost-user protocol feature negotiation
from happening at all. I'm not 100% sure why this would cause QEMU to
assume that REPLY_ACK was negotiated though.

some questions:

On Mon, Feb 22, 2021 at 3:26 AM Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>
> > * Alex Bennée (alex.bennee@linaro.org) wrote:
> >> Hi,
> >>
> >> I finally got a chance to get down into the guts of vhost-user while
> >> attempting to port my original C RPMB daemon to Rust using the
> >> vhost-user-backend and related crates. I ended up with this hang during
> >> negotiation:
> >>
> >>   startup
> >>
> >>   vhost_user_write req:1 flags:0x1
> >>   vhost_user_read_start
> >>   vhost_user_read req:1 flags:0x5
> >>   vhost_user_backend_init: we got 170000000
>
> GET_FEATURES

Do we also see a GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES
message here? If so can you confirm what flags they contained?

vhost-user feature negotiation works as follows (see vhost_user_backend_init()):

err = vhost_user_get_features(dev, &features);
if (err < 0) {
    return err;
}

if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
    dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;

    err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
                                              &protocol_features);
    if (err < 0) {
        return err;
    }

    dev->protocol_features =
        protocol_features & VHOST_USER_PROTOCOL_FEATURE_MASK;
...

    err = vhost_user_set_protocol_features(dev, dev->protocol_features);
    if (err < 0) {
         return err;
     }
}

So we first get the virtio features and check if the backend
advertises VHOST_USER_F_PROTOCOL_FEATURES. If it does, we proceed to
negotiate vhost-user features, in which case we should see
GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES. Otherwise it looks
like the function just returns, and we leave the vhost-user features
uninitialized (presumably zeroed out?), and the backend will never
even receive a GET/SET_PROTOCOL_FEATURES.

dev->protocol_features is not touched anywhere else, and, if
VHOST_USER_F_PROTOCOL_FEATURES is negotiated, comes directly to the
backend from the protocol_features the backend &ed with
VHOST_USER_PROTOCOL_FEATURE_MASK. Therefore if
VHOST_USER_F_PROTOCOL_FEATURES is indeed negotiated here I'm not sure
what could cause QEMU to think REPLY_ACK was negotiated while the
backend does not, spare something obvious like the backend mishandling
the GET/SET_PROTOCOL_FEATURES messages. I briefly checked the rustvmm
code for that and didn't see anything obvious.

mst - are backend devices meant to function if
VHOST_USER_F_PROTOCOL_FEATURES is not advertised? Do we know of any
functioning backend which does not advertise this virtio feature? If
not, maybe we consider failing out here?

alex - Are you sure QEMU gets stuck waiting on a reply_ack message,
and not somewhere else in the setup path? I trust a SET_MEM_TABLE
message was actually received by the backend. Did you confirm that
QEMU was indeed stuck waiting for a reply and not somewhere else later
on?

>
> >>   vhost_user_write req:15 flags:0x1
> >>   vhost_user_read_start
> >>   vhost_user_read req:15 flags:0x5
> >>   vhost_user_set_protocol_features: 2008
> >>   vhost_user_write req:16 flags:0x1
> >>   vhost_user_write req:3 flags:0x1
> >>   vhost_user_write req:1 flags:0x1
> >>   vhost_user_read_start
> >>   vhost_user_read req:1 flags:0x5
> >>   vhost_user_write req:13 flags:0x1
> >>
> >>   kernel initialises device
> >>
> >>   virtio_rpmb virtio1: init done!
> >>   vhost_user_write req:13 flags:0x1
> >>   vhost_dev_set_features: 130000000
> >>   vhost_user_set_features: 130000000
>
> SET_FEATURES

This is setting virtio features - should have nothing to do with REPLY_ACK.

>
> >>   vhost_user_write req:2 flags:0x1
> >>   vhost_user_write req:5 flags:0x9
> >>   vhost_user_read_start
> >>
> <snip>
> >>
> >>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> >>    when doing the eventual VHOST_USER_SET_FEATURES reply?
> >>
> >>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> >>    the negotiated features before processing the ``need_reply`` [Bit 3]
> >>    field of the messages?
> >
> > I think vhost.rs is being correctly strict - but there would be no harm
> > in it flagging that you'd hit an inconsistency if it finds a need_reply
> > without the feature.
>
> But the feature should have been negotiated. So unless the slave can
> assume it is enabled because it asked I think QEMU is in the wrong by
> not preserving the feature bits in it's SET_FEATURES reply. We just gets
> away with it with libvhostuser being willing to reply anyway.
>
> >
> >>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> >>    in the "list of the ones that do" require replies or do they only
> >>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> >>    box out seems to imply?
> >
> > set_mem_table gives a reply when postcopy is enabled (and then qemu
> > replies to the reply!) but otherwise doesn't.
> > (Note there's an issue opened for .rs to support ADD_MEM_REGION
> > since it's a lot better than SET_MEM_TABLE which has a fixed size table
> > that's small).
>
> Thanks for the heads up.
>
> >
> > Dave
> >
> >> Currently I have some hacks in:
> >>
> >>   https://github.com/stsquad/vhost/tree/my-hacks
> >>
> >> which gets my daemon booting up to the point we actually need to do a
> >> transaction. However I won't submit a PR until I've worked out exactly
> >> where the problems are.
> >>
> >> --
> >> Alex Bennée
> >>
>
>
> --
> Alex Bennée
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-26 19:58     ` Raphael Norwitz
@ 2021-02-26 21:25       ` Raphael Norwitz
  2021-02-27 12:23         ` Alex Bennée
  0 siblings, 1 reply; 11+ messages in thread
From: Raphael Norwitz @ 2021-02-26 21:25 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Sergio Lopez, Michael S. Tsirkin, chirantan, QEMU,
	Dr. David Alan Gilbert, keiichiw, Stefan Hajnoczi, rust-vmm,
	Marc-André Lureau, dgreid

As an afterthought - if VHOST_USER_F_PROTOCOL_FEATURES is indeed
unset, the issue may well be caused by QEMU reading an uninitialized
value for dev->protocol_features. Some device types like cryptodev
explicitly zero it out. As I said, it isn't set anywhere else in the
source and If dev->protocol_features had REPLY_ACK set when the
vhost_dev device is initialized, it would exactly explain the behavior
you are seeing.

On Fri, Feb 26, 2021 at 9:58 AM Raphael Norwitz
<raphael.s.norwitz@gmail.com> wrote:
>
> There are two sets of features being negotiated - virtio and
> vhost-user.  Based on what you've posted here, I suspect the
> VHOST_USER_F_PROTOCOL_FEATURES virtio feature may not be negotiated by
> the backend, preventing the vhost-user protocol feature negotiation
> from happening at all. I'm not 100% sure why this would cause QEMU to
> assume that REPLY_ACK was negotiated though.
>
> some questions:
>
> On Mon, Feb 22, 2021 at 3:26 AM Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> >
> > Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
> >
> > > * Alex Bennée (alex.bennee@linaro.org) wrote:
> > >> Hi,
> > >>
> > >> I finally got a chance to get down into the guts of vhost-user while
> > >> attempting to port my original C RPMB daemon to Rust using the
> > >> vhost-user-backend and related crates. I ended up with this hang during
> > >> negotiation:
> > >>
> > >>   startup
> > >>
> > >>   vhost_user_write req:1 flags:0x1
> > >>   vhost_user_read_start
> > >>   vhost_user_read req:1 flags:0x5
> > >>   vhost_user_backend_init: we got 170000000
> >
> > GET_FEATURES
>
> Do we also see a GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES
> message here? If so can you confirm what flags they contained?
>
> vhost-user feature negotiation works as follows (see vhost_user_backend_init()):
>
> err = vhost_user_get_features(dev, &features);
> if (err < 0) {
>     return err;
> }
>
> if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
>     dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
>
>     err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
>                                               &protocol_features);
>     if (err < 0) {
>         return err;
>     }
>
>     dev->protocol_features =
>         protocol_features & VHOST_USER_PROTOCOL_FEATURE_MASK;
> ...
>
>     err = vhost_user_set_protocol_features(dev, dev->protocol_features);
>     if (err < 0) {
>          return err;
>      }
> }
>
> So we first get the virtio features and check if the backend
> advertises VHOST_USER_F_PROTOCOL_FEATURES. If it does, we proceed to
> negotiate vhost-user features, in which case we should see
> GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES. Otherwise it looks
> like the function just returns, and we leave the vhost-user features
> uninitialized (presumably zeroed out?), and the backend will never
> even receive a GET/SET_PROTOCOL_FEATURES.
>
> dev->protocol_features is not touched anywhere else, and, if
> VHOST_USER_F_PROTOCOL_FEATURES is negotiated, comes directly to the
> backend from the protocol_features the backend &ed with
> VHOST_USER_PROTOCOL_FEATURE_MASK. Therefore if
> VHOST_USER_F_PROTOCOL_FEATURES is indeed negotiated here I'm not sure
> what could cause QEMU to think REPLY_ACK was negotiated while the
> backend does not, spare something obvious like the backend mishandling
> the GET/SET_PROTOCOL_FEATURES messages. I briefly checked the rustvmm
> code for that and didn't see anything obvious.
>
> mst - are backend devices meant to function if
> VHOST_USER_F_PROTOCOL_FEATURES is not advertised? Do we know of any
> functioning backend which does not advertise this virtio feature? If
> not, maybe we consider failing out here?
>
> alex - Are you sure QEMU gets stuck waiting on a reply_ack message,
> and not somewhere else in the setup path? I trust a SET_MEM_TABLE
> message was actually received by the backend. Did you confirm that
> QEMU was indeed stuck waiting for a reply and not somewhere else later
> on?
>
> >
> > >>   vhost_user_write req:15 flags:0x1
> > >>   vhost_user_read_start
> > >>   vhost_user_read req:15 flags:0x5
> > >>   vhost_user_set_protocol_features: 2008
> > >>   vhost_user_write req:16 flags:0x1
> > >>   vhost_user_write req:3 flags:0x1
> > >>   vhost_user_write req:1 flags:0x1
> > >>   vhost_user_read_start
> > >>   vhost_user_read req:1 flags:0x5
> > >>   vhost_user_write req:13 flags:0x1
> > >>
> > >>   kernel initialises device
> > >>
> > >>   virtio_rpmb virtio1: init done!
> > >>   vhost_user_write req:13 flags:0x1
> > >>   vhost_dev_set_features: 130000000
> > >>   vhost_user_set_features: 130000000
> >
> > SET_FEATURES
>
> This is setting virtio features - should have nothing to do with REPLY_ACK.
>
> >
> > >>   vhost_user_write req:2 flags:0x1
> > >>   vhost_user_write req:5 flags:0x9
> > >>   vhost_user_read_start
> > >>
> > <snip>
> > >>
> > >>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> > >>    when doing the eventual VHOST_USER_SET_FEATURES reply?
> > >>
> > >>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> > >>    the negotiated features before processing the ``need_reply`` [Bit 3]
> > >>    field of the messages?
> > >
> > > I think vhost.rs is being correctly strict - but there would be no harm
> > > in it flagging that you'd hit an inconsistency if it finds a need_reply
> > > without the feature.
> >
> > But the feature should have been negotiated. So unless the slave can
> > assume it is enabled because it asked I think QEMU is in the wrong by
> > not preserving the feature bits in it's SET_FEATURES reply. We just gets
> > away with it with libvhostuser being willing to reply anyway.
> >
> > >
> > >>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> > >>    in the "list of the ones that do" require replies or do they only
> > >>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> > >>    box out seems to imply?
> > >
> > > set_mem_table gives a reply when postcopy is enabled (and then qemu
> > > replies to the reply!) but otherwise doesn't.
> > > (Note there's an issue opened for .rs to support ADD_MEM_REGION
> > > since it's a lot better than SET_MEM_TABLE which has a fixed size table
> > > that's small).
> >
> > Thanks for the heads up.
> >
> > >
> > > Dave
> > >
> > >> Currently I have some hacks in:
> > >>
> > >>   https://github.com/stsquad/vhost/tree/my-hacks
> > >>
> > >> which gets my daemon booting up to the point we actually need to do a
> > >> transaction. However I won't submit a PR until I've worked out exactly
> > >> where the problems are.
> > >>
> > >> --
> > >> Alex Bennée
> > >>
> >
> >
> > --
> > Alex Bennée
> >


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
  2021-02-26 21:25       ` Raphael Norwitz
@ 2021-02-27 12:23         ` Alex Bennée
  0 siblings, 0 replies; 11+ messages in thread
From: Alex Bennée @ 2021-02-27 12:23 UTC (permalink / raw)
  To: Raphael Norwitz
  Cc: Sergio Lopez, Michael S. Tsirkin, chirantan, QEMU,
	Dr. David Alan Gilbert, keiichiw, Stefan Hajnoczi, rust-vmm,
	Marc-André Lureau, dgreid


Raphael Norwitz <raphael.s.norwitz@gmail.com> writes:

> As an afterthought - if VHOST_USER_F_PROTOCOL_FEATURES is indeed
> unset, the issue may well be caused by QEMU reading an uninitialized
> value for dev->protocol_features. Some device types like cryptodev
> explicitly zero it out. As I said, it isn't set anywhere else in the
> source and If dev->protocol_features had REPLY_ACK set when the
> vhost_dev device is initialized, it would exactly explain the behavior
> you are seeing.

Can I point you at the proposed clarification of the vhost-user
specification:

  Subject: [VHOST USER SPEC PATCH] vhost-user.rst: add clarifying language about protocol negotiation
  Date: Fri, 26 Feb 2021 11:16:19 +0000
  Message-Id: <20210226111619.21178-1-alex.bennee@linaro.org>


>
> On Fri, Feb 26, 2021 at 9:58 AM Raphael Norwitz
> <raphael.s.norwitz@gmail.com> wrote:
>>
>> There are two sets of features being negotiated - virtio and
>> vhost-user.  Based on what you've posted here, I suspect the
>> VHOST_USER_F_PROTOCOL_FEATURES virtio feature may not be negotiated by
>> the backend, preventing the vhost-user protocol feature negotiation
>> from happening at all. I'm not 100% sure why this would cause QEMU to
>> assume that REPLY_ACK was negotiated though.
>>
>> some questions:
>>
>> On Mon, Feb 22, 2021 at 3:26 AM Alex Bennée <alex.bennee@linaro.org> wrote:
>> >
>> >
>> > Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>> >
>> > > * Alex Bennée (alex.bennee@linaro.org) wrote:
>> > >> Hi,
>> > >>
>> > >> I finally got a chance to get down into the guts of vhost-user while
>> > >> attempting to port my original C RPMB daemon to Rust using the
>> > >> vhost-user-backend and related crates. I ended up with this hang during
>> > >> negotiation:
>> > >>
>> > >>   startup
>> > >>
>> > >>   vhost_user_write req:1 flags:0x1
>> > >>   vhost_user_read_start
>> > >>   vhost_user_read req:1 flags:0x5
>> > >>   vhost_user_backend_init: we got 170000000
>> >
>> > GET_FEATURES
>>
>> Do we also see a GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES
>> message here? If so can you confirm what flags they contained?
>>
>> vhost-user feature negotiation works as follows (see vhost_user_backend_init()):
>>
>> err = vhost_user_get_features(dev, &features);
>> if (err < 0) {
>>     return err;
>> }
>>
>> if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
>>     dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
>>
>>     err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
>>                                               &protocol_features);
>>     if (err < 0) {
>>         return err;
>>     }
>>
>>     dev->protocol_features =
>>         protocol_features & VHOST_USER_PROTOCOL_FEATURE_MASK;
>> ...
>>
>>     err = vhost_user_set_protocol_features(dev, dev->protocol_features);
>>     if (err < 0) {
>>          return err;
>>      }
>> }
>>
>> So we first get the virtio features and check if the backend
>> advertises VHOST_USER_F_PROTOCOL_FEATURES. If it does, we proceed to
>> negotiate vhost-user features, in which case we should see
>> GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES. Otherwise it looks
>> like the function just returns, and we leave the vhost-user features
>> uninitialized (presumably zeroed out?), and the backend will never
>> even receive a GET/SET_PROTOCOL_FEATURES.
>>
>> dev->protocol_features is not touched anywhere else, and, if
>> VHOST_USER_F_PROTOCOL_FEATURES is negotiated, comes directly to the
>> backend from the protocol_features the backend &ed with
>> VHOST_USER_PROTOCOL_FEATURE_MASK. Therefore if
>> VHOST_USER_F_PROTOCOL_FEATURES is indeed negotiated here I'm not sure
>> what could cause QEMU to think REPLY_ACK was negotiated while the
>> backend does not, spare something obvious like the backend mishandling
>> the GET/SET_PROTOCOL_FEATURES messages. I briefly checked the rustvmm
>> code for that and didn't see anything obvious.
>>
>> mst - are backend devices meant to function if
>> VHOST_USER_F_PROTOCOL_FEATURES is not advertised? Do we know of any
>> functioning backend which does not advertise this virtio feature? If
>> not, maybe we consider failing out here?
>>
>> alex - Are you sure QEMU gets stuck waiting on a reply_ack message,
>> and not somewhere else in the setup path? I trust a SET_MEM_TABLE
>> message was actually received by the backend. Did you confirm that
>> QEMU was indeed stuck waiting for a reply and not somewhere else later
>> on?
>>
>> >
>> > >>   vhost_user_write req:15 flags:0x1
>> > >>   vhost_user_read_start
>> > >>   vhost_user_read req:15 flags:0x5
>> > >>   vhost_user_set_protocol_features: 2008
>> > >>   vhost_user_write req:16 flags:0x1
>> > >>   vhost_user_write req:3 flags:0x1
>> > >>   vhost_user_write req:1 flags:0x1
>> > >>   vhost_user_read_start
>> > >>   vhost_user_read req:1 flags:0x5
>> > >>   vhost_user_write req:13 flags:0x1
>> > >>
>> > >>   kernel initialises device
>> > >>
>> > >>   virtio_rpmb virtio1: init done!
>> > >>   vhost_user_write req:13 flags:0x1
>> > >>   vhost_dev_set_features: 130000000
>> > >>   vhost_user_set_features: 130000000
>> >
>> > SET_FEATURES
>>
>> This is setting virtio features - should have nothing to do with REPLY_ACK.
>>
>> >
>> > >>   vhost_user_write req:2 flags:0x1
>> > >>   vhost_user_write req:5 flags:0x9
>> > >>   vhost_user_read_start
>> > >>
>> > <snip>
>> > >>
>> > >>  - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>> > >>    when doing the eventual VHOST_USER_SET_FEATURES reply?
>> > >>
>> > >>  - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>> > >>    the negotiated features before processing the ``need_reply`` [Bit 3]
>> > >>    field of the messages?
>> > >
>> > > I think vhost.rs is being correctly strict - but there would be no harm
>> > > in it flagging that you'd hit an inconsistency if it finds a need_reply
>> > > without the feature.
>> >
>> > But the feature should have been negotiated. So unless the slave can
>> > assume it is enabled because it asked I think QEMU is in the wrong by
>> > not preserving the feature bits in it's SET_FEATURES reply. We just gets
>> > away with it with libvhostuser being willing to reply anyway.
>> >
>> > >
>> > >>  - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>> > >>    in the "list of the ones that do" require replies or do they only
>> > >>    reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>> > >>    box out seems to imply?
>> > >
>> > > set_mem_table gives a reply when postcopy is enabled (and then qemu
>> > > replies to the reply!) but otherwise doesn't.
>> > > (Note there's an issue opened for .rs to support ADD_MEM_REGION
>> > > since it's a lot better than SET_MEM_TABLE which has a fixed size table
>> > > that's small).
>> >
>> > Thanks for the heads up.
>> >
>> > >
>> > > Dave
>> > >
>> > >> Currently I have some hacks in:
>> > >>
>> > >>   https://github.com/stsquad/vhost/tree/my-hacks
>> > >>
>> > >> which gets my daemon booting up to the point we actually need to do a
>> > >> transaction. However I won't submit a PR until I've worked out exactly
>> > >> where the problems are.
>> > >>
>> > >> --
>> > >> Alex Bennée
>> > >>
>> >
>> >
>> > --
>> > Alex Bennée
>> >


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-02-27 12:25 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-19 16:04 vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs) Alex Bennée
2021-02-22 13:06 ` Dr. David Alan Gilbert
2021-02-22 13:21   ` Alex Bennée
2021-02-22 13:27     ` Dr. David Alan Gilbert
2021-02-23 10:23       ` [Rust-VMM] " Jiang Liu
2021-02-26 19:58     ` Raphael Norwitz
2021-02-26 21:25       ` Raphael Norwitz
2021-02-27 12:23         ` Alex Bennée
2021-02-23 11:44 ` Michael S. Tsirkin
2021-02-25  4:19   ` [Rust-VMM] " Dylan Reid
2021-02-25  6:36     ` Keiichi Watanabe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).