* vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
@ 2021-02-19 16:04 Alex Bennée
2021-02-22 13:06 ` Dr. David Alan Gilbert
2021-02-23 11:44 ` Michael S. Tsirkin
0 siblings, 2 replies; 11+ messages in thread
From: Alex Bennée @ 2021-02-19 16:04 UTC (permalink / raw)
To: qemu-devel, rust-vmm
Cc: Marc-André Lureau, Dr. David Alan Gilbert, Sergio Lopez,
Stefan Hajnoczi, Michael S. Tsirkin
Hi,
I finally got a chance to get down into the guts of vhost-user while
attempting to port my original C RPMB daemon to Rust using the
vhost-user-backend and related crates. I ended up with this hang during
negotiation:
startup
vhost_user_write req:1 flags:0x1
vhost_user_read_start
vhost_user_read req:1 flags:0x5
vhost_user_backend_init: we got 170000000
vhost_user_write req:15 flags:0x1
vhost_user_read_start
vhost_user_read req:15 flags:0x5
vhost_user_set_protocol_features: 2008
vhost_user_write req:16 flags:0x1
vhost_user_write req:3 flags:0x1
vhost_user_write req:1 flags:0x1
vhost_user_read_start
vhost_user_read req:1 flags:0x5
vhost_user_write req:13 flags:0x1
kernel initialises device
virtio_rpmb virtio1: init done!
vhost_user_write req:13 flags:0x1
vhost_dev_set_features: 130000000
vhost_user_set_features: 130000000
vhost_user_write req:2 flags:0x1
vhost_user_write req:5 flags:0x9
vhost_user_read_start
The proximate cause is the vhost crate handling:
MasterReq::SET_MEM_TABLE => {
let res = self.set_mem_table(&hdr, size, &buf, rfds);
self.send_ack_message(&hdr, res)?;
}
which gates on the replay_ack_enabled flag:
fn send_ack_message(
&mut self,
req: &VhostUserMsgHeader<MasterReq>,
res: Result<()>,
) -> Result<()> {
if dbg!(self.reply_ack_enabled) {
let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
let val = match res {
Ok(_) => 0,
Err(_) => 1,
};
let msg = VhostUserU64::new(val);
self.main_sock.send_message(&hdr, &msg, None)?;
}
Ok(())
}
which is only set when we have all the appropriate acknowledged flags:
fn update_reply_ack_flag(&mut self) {
let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
let pflag = VhostUserProtocolFeatures::REPLY_ACK;
if (self.virtio_features & vflag) != 0
&& (self.acked_virtio_features & vflag) != 0
&& self.protocol_features.contains(pflag)
&& (self.acked_protocol_features & pflag.bits()) != 0
{
self.reply_ack_enabled = true;
} else {
self.reply_ack_enabled = false;
}
}
which from above you can see QEMU helpfully dropped those bits in the
reply. It does however work in the C/libvhost version:
virtio_rpmb virtio1: init done!
vhost_user_write req:13 flags:0x1
vhost_dev_set_features: 130000000
vhost_user_set_features: 130000000
vhost_user_write req:2 flags:0x1
vhost_user_write req:37 flags:0x9
vhost_user_read_start
vhost_user_read req:37 flags:0x5
vhost_user_write req:8 flags:0x1
vhost_user_write req:10 flags:0x1
vhost_user_write req:9 flags:0x1
vhost_user_write req:12 flags:0x1
vhost_user_write req:13 flags:0x1
albeit with a slightly different message sequence
(VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
the C code you can see why:
need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
reply_requested = vu_process_message(dev, &vmsg);
if (!reply_requested && need_reply) {
vmsg_set_reply_u64(&vmsg, 0);
reply_requested = 1;
}
So regardless of what may have been negotiated it will always reply with
something if the master requested it do so. This points us at the
specification which reads:
- Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
details.
which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
be honoured when the feature has been negotiated. Which brings us to a
series of questions:
- Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
when doing the eventual VHOST_USER_SET_FEATURES reply?
- Is vhost.rs being to strict or libvhost-user too lax in interpreting
the negotiated features before processing the ``need_reply`` [Bit 3]
field of the messages?
- are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
in the "list of the ones that do" require replies or do they only
reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
box out seems to imply?
Currently I have some hacks in:
https://github.com/stsquad/vhost/tree/my-hacks
which gets my daemon booting up to the point we actually need to do a
transaction. However I won't submit a PR until I've worked out exactly
where the problems are.
--
Alex Bennée
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-19 16:04 vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs) Alex Bennée
@ 2021-02-22 13:06 ` Dr. David Alan Gilbert
2021-02-22 13:21 ` Alex Bennée
2021-02-23 11:44 ` Michael S. Tsirkin
1 sibling, 1 reply; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-22 13:06 UTC (permalink / raw)
To: Alex Bennée
Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
rust-vmm, Marc-André Lureau
* Alex Bennée (alex.bennee@linaro.org) wrote:
> Hi,
>
> I finally got a chance to get down into the guts of vhost-user while
> attempting to port my original C RPMB daemon to Rust using the
> vhost-user-backend and related crates. I ended up with this hang during
> negotiation:
>
> startup
>
> vhost_user_write req:1 flags:0x1
> vhost_user_read_start
> vhost_user_read req:1 flags:0x5
> vhost_user_backend_init: we got 170000000
> vhost_user_write req:15 flags:0x1
> vhost_user_read_start
> vhost_user_read req:15 flags:0x5
> vhost_user_set_protocol_features: 2008
> vhost_user_write req:16 flags:0x1
> vhost_user_write req:3 flags:0x1
> vhost_user_write req:1 flags:0x1
> vhost_user_read_start
> vhost_user_read req:1 flags:0x5
> vhost_user_write req:13 flags:0x1
>
> kernel initialises device
>
> virtio_rpmb virtio1: init done!
> vhost_user_write req:13 flags:0x1
> vhost_dev_set_features: 130000000
> vhost_user_set_features: 130000000
> vhost_user_write req:2 flags:0x1
> vhost_user_write req:5 flags:0x9
> vhost_user_read_start
>
> The proximate cause is the vhost crate handling:
>
> MasterReq::SET_MEM_TABLE => {
> let res = self.set_mem_table(&hdr, size, &buf, rfds);
> self.send_ack_message(&hdr, res)?;
> }
>
> which gates on the replay_ack_enabled flag:
>
> fn send_ack_message(
> &mut self,
> req: &VhostUserMsgHeader<MasterReq>,
> res: Result<()>,
> ) -> Result<()> {
> if dbg!(self.reply_ack_enabled) {
> let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
> let val = match res {
> Ok(_) => 0,
> Err(_) => 1,
> };
> let msg = VhostUserU64::new(val);
> self.main_sock.send_message(&hdr, &msg, None)?;
> }
> Ok(())
> }
>
> which is only set when we have all the appropriate acknowledged flags:
>
> fn update_reply_ack_flag(&mut self) {
> let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
> let pflag = VhostUserProtocolFeatures::REPLY_ACK;
> if (self.virtio_features & vflag) != 0
> && (self.acked_virtio_features & vflag) != 0
> && self.protocol_features.contains(pflag)
> && (self.acked_protocol_features & pflag.bits()) != 0
> {
> self.reply_ack_enabled = true;
> } else {
> self.reply_ack_enabled = false;
> }
> }
>
> which from above you can see QEMU helpfully dropped those bits in the
> reply. It does however work in the C/libvhost version:
>
> virtio_rpmb virtio1: init done!
> vhost_user_write req:13 flags:0x1
> vhost_dev_set_features: 130000000
> vhost_user_set_features: 130000000
> vhost_user_write req:2 flags:0x1
> vhost_user_write req:37 flags:0x9
> vhost_user_read_start
> vhost_user_read req:37 flags:0x5
> vhost_user_write req:8 flags:0x1
> vhost_user_write req:10 flags:0x1
> vhost_user_write req:9 flags:0x1
> vhost_user_write req:12 flags:0x1
> vhost_user_write req:13 flags:0x1
>
> albeit with a slightly different message sequence
> (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> the C code you can see why:
>
> need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
>
> reply_requested = vu_process_message(dev, &vmsg);
> if (!reply_requested && need_reply) {
> vmsg_set_reply_u64(&vmsg, 0);
> reply_requested = 1;
> }
>
> So regardless of what may have been negotiated it will always reply with
> something if the master requested it do so. This points us at the
> specification which reads:
>
> - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
> details.
>
> which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> be honoured when the feature has been negotiated. Which brings us to a
> series of questions:
>
> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> when doing the eventual VHOST_USER_SET_FEATURES reply?
>
> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> the negotiated features before processing the ``need_reply`` [Bit 3]
> field of the messages?
I think vhost.rs is being correctly strict - but there would be no harm
in it flagging that you'd hit an inconsistency if it finds a need_reply
without the feature.
> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> in the "list of the ones that do" require replies or do they only
> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> box out seems to imply?
set_mem_table gives a reply when postcopy is enabled (and then qemu
replies to the reply!) but otherwise doesn't.
(Note there's an issue opened for .rs to support ADD_MEM_REGION
since it's a lot better than SET_MEM_TABLE which has a fixed size table
that's small).
Dave
> Currently I have some hacks in:
>
> https://github.com/stsquad/vhost/tree/my-hacks
>
> which gets my daemon booting up to the point we actually need to do a
> transaction. However I won't submit a PR until I've worked out exactly
> where the problems are.
>
> --
> Alex Bennée
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-22 13:06 ` Dr. David Alan Gilbert
@ 2021-02-22 13:21 ` Alex Bennée
2021-02-22 13:27 ` Dr. David Alan Gilbert
2021-02-26 19:58 ` Raphael Norwitz
0 siblings, 2 replies; 11+ messages in thread
From: Alex Bennée @ 2021-02-22 13:21 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
rust-vmm, Marc-André Lureau
Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
> * Alex Bennée (alex.bennee@linaro.org) wrote:
>> Hi,
>>
>> I finally got a chance to get down into the guts of vhost-user while
>> attempting to port my original C RPMB daemon to Rust using the
>> vhost-user-backend and related crates. I ended up with this hang during
>> negotiation:
>>
>> startup
>>
>> vhost_user_write req:1 flags:0x1
>> vhost_user_read_start
>> vhost_user_read req:1 flags:0x5
>> vhost_user_backend_init: we got 170000000
GET_FEATURES
>> vhost_user_write req:15 flags:0x1
>> vhost_user_read_start
>> vhost_user_read req:15 flags:0x5
>> vhost_user_set_protocol_features: 2008
>> vhost_user_write req:16 flags:0x1
>> vhost_user_write req:3 flags:0x1
>> vhost_user_write req:1 flags:0x1
>> vhost_user_read_start
>> vhost_user_read req:1 flags:0x5
>> vhost_user_write req:13 flags:0x1
>>
>> kernel initialises device
>>
>> virtio_rpmb virtio1: init done!
>> vhost_user_write req:13 flags:0x1
>> vhost_dev_set_features: 130000000
>> vhost_user_set_features: 130000000
SET_FEATURES
>> vhost_user_write req:2 flags:0x1
>> vhost_user_write req:5 flags:0x9
>> vhost_user_read_start
>>
<snip>
>>
>> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>> when doing the eventual VHOST_USER_SET_FEATURES reply?
>>
>> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>> the negotiated features before processing the ``need_reply`` [Bit 3]
>> field of the messages?
>
> I think vhost.rs is being correctly strict - but there would be no harm
> in it flagging that you'd hit an inconsistency if it finds a need_reply
> without the feature.
But the feature should have been negotiated. So unless the slave can
assume it is enabled because it asked I think QEMU is in the wrong by
not preserving the feature bits in it's SET_FEATURES reply. We just gets
away with it with libvhostuser being willing to reply anyway.
>
>> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>> in the "list of the ones that do" require replies or do they only
>> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>> box out seems to imply?
>
> set_mem_table gives a reply when postcopy is enabled (and then qemu
> replies to the reply!) but otherwise doesn't.
> (Note there's an issue opened for .rs to support ADD_MEM_REGION
> since it's a lot better than SET_MEM_TABLE which has a fixed size table
> that's small).
Thanks for the heads up.
>
> Dave
>
>> Currently I have some hacks in:
>>
>> https://github.com/stsquad/vhost/tree/my-hacks
>>
>> which gets my daemon booting up to the point we actually need to do a
>> transaction. However I won't submit a PR until I've worked out exactly
>> where the problems are.
>>
>> --
>> Alex Bennée
>>
--
Alex Bennée
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-22 13:21 ` Alex Bennée
@ 2021-02-22 13:27 ` Dr. David Alan Gilbert
2021-02-23 10:23 ` [Rust-VMM] " Jiang Liu
2021-02-26 19:58 ` Raphael Norwitz
1 sibling, 1 reply; 11+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-22 13:27 UTC (permalink / raw)
To: Alex Bennée
Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
rust-vmm, Marc-André Lureau
* Alex Bennée (alex.bennee@linaro.org) wrote:
>
> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>
> > * Alex Bennée (alex.bennee@linaro.org) wrote:
> >> Hi,
> >>
> >> I finally got a chance to get down into the guts of vhost-user while
> >> attempting to port my original C RPMB daemon to Rust using the
> >> vhost-user-backend and related crates. I ended up with this hang during
> >> negotiation:
> >>
> >> startup
> >>
> >> vhost_user_write req:1 flags:0x1
> >> vhost_user_read_start
> >> vhost_user_read req:1 flags:0x5
> >> vhost_user_backend_init: we got 170000000
>
> GET_FEATURES
>
> >> vhost_user_write req:15 flags:0x1
> >> vhost_user_read_start
> >> vhost_user_read req:15 flags:0x5
> >> vhost_user_set_protocol_features: 2008
> >> vhost_user_write req:16 flags:0x1
> >> vhost_user_write req:3 flags:0x1
> >> vhost_user_write req:1 flags:0x1
> >> vhost_user_read_start
> >> vhost_user_read req:1 flags:0x5
> >> vhost_user_write req:13 flags:0x1
> >>
> >> kernel initialises device
> >>
> >> virtio_rpmb virtio1: init done!
> >> vhost_user_write req:13 flags:0x1
> >> vhost_dev_set_features: 130000000
> >> vhost_user_set_features: 130000000
>
> SET_FEATURES
>
> >> vhost_user_write req:2 flags:0x1
> >> vhost_user_write req:5 flags:0x9
> >> vhost_user_read_start
> >>
> <snip>
> >>
> >> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> >> when doing the eventual VHOST_USER_SET_FEATURES reply?
> >>
> >> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> >> the negotiated features before processing the ``need_reply`` [Bit 3]
> >> field of the messages?
> >
> > I think vhost.rs is being correctly strict - but there would be no harm
> > in it flagging that you'd hit an inconsistency if it finds a need_reply
> > without the feature.
>
> But the feature should have been negotiated. So unless the slave can
> assume it is enabled because it asked I think QEMU is in the wrong by
> not preserving the feature bits in it's SET_FEATURES reply. We just gets
> away with it with libvhostuser being willing to reply anyway.
Oh I wasn't trying to reply to that bit; I can never remember how the
vhost/virtio feature bit negotiation works...
Dave
> >
> >> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> >> in the "list of the ones that do" require replies or do they only
> >> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> >> box out seems to imply?
> >
> > set_mem_table gives a reply when postcopy is enabled (and then qemu
> > replies to the reply!) but otherwise doesn't.
> > (Note there's an issue opened for .rs to support ADD_MEM_REGION
> > since it's a lot better than SET_MEM_TABLE which has a fixed size table
> > that's small).
>
> Thanks for the heads up.
>
> >
> > Dave
> >
> >> Currently I have some hacks in:
> >>
> >> https://github.com/stsquad/vhost/tree/my-hacks
> >>
> >> which gets my daemon booting up to the point we actually need to do a
> >> transaction. However I won't submit a PR until I've worked out exactly
> >> where the problems are.
> >>
> >> --
> >> Alex Bennée
> >>
>
>
> --
> Alex Bennée
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Rust-VMM] vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-22 13:27 ` Dr. David Alan Gilbert
@ 2021-02-23 10:23 ` Jiang Liu
0 siblings, 0 replies; 11+ messages in thread
From: Jiang Liu @ 2021-02-23 10:23 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Sergio Lopez, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi,
rust-vmm, Marc-André Lureau, Alex Bennée
I just found this thread in my email junk box:(
I do have found some bugs in the vhost_rs crate, related to handle the need_reply flag.
But that bug only affects virtio-fs fs_map operations.
Please refer to:
https://github.com/rust-vmm/vhost/pull/19
https://github.com/rust-vmm/vhost/pull/19/commits/a2c5a4f50e45ae1ab78622dda9a3f861bd43a17e
Thanks,
Gerry
> On Feb 22, 2021, at 9:27 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>
> * Alex Bennée (alex.bennee@linaro.org) wrote:
>>
>> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>>
>>> * Alex Bennée (alex.bennee@linaro.org) wrote:
>>>> Hi,
>>>>
>>>> I finally got a chance to get down into the guts of vhost-user while
>>>> attempting to port my original C RPMB daemon to Rust using the
>>>> vhost-user-backend and related crates. I ended up with this hang during
>>>> negotiation:
>>>>
>>>> startup
>>>>
>>>> vhost_user_write req:1 flags:0x1
>>>> vhost_user_read_start
>>>> vhost_user_read req:1 flags:0x5
>>>> vhost_user_backend_init: we got 170000000
>>
>> GET_FEATURES
>>
>>>> vhost_user_write req:15 flags:0x1
>>>> vhost_user_read_start
>>>> vhost_user_read req:15 flags:0x5
>>>> vhost_user_set_protocol_features: 2008
>>>> vhost_user_write req:16 flags:0x1
>>>> vhost_user_write req:3 flags:0x1
>>>> vhost_user_write req:1 flags:0x1
>>>> vhost_user_read_start
>>>> vhost_user_read req:1 flags:0x5
>>>> vhost_user_write req:13 flags:0x1
>>>>
>>>> kernel initialises device
>>>>
>>>> virtio_rpmb virtio1: init done!
>>>> vhost_user_write req:13 flags:0x1
>>>> vhost_dev_set_features: 130000000
>>>> vhost_user_set_features: 130000000
>>
>> SET_FEATURES
>>
>>>> vhost_user_write req:2 flags:0x1
>>>> vhost_user_write req:5 flags:0x9
>>>> vhost_user_read_start
>>>>
>> <snip>
>>>>
>>>> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>>>> when doing the eventual VHOST_USER_SET_FEATURES reply?
>>>>
>>>> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>>>> the negotiated features before processing the ``need_reply`` [Bit 3]
>>>> field of the messages?
>>>
>>> I think vhost.rs is being correctly strict - but there would be no harm
>>> in it flagging that you'd hit an inconsistency if it finds a need_reply
>>> without the feature.
>>
>> But the feature should have been negotiated. So unless the slave can
>> assume it is enabled because it asked I think QEMU is in the wrong by
>> not preserving the feature bits in it's SET_FEATURES reply. We just gets
>> away with it with libvhostuser being willing to reply anyway.
>
> Oh I wasn't trying to reply to that bit; I can never remember how the
> vhost/virtio feature bit negotiation works...
>
> Dave
>
>>>
>>>> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>>>> in the "list of the ones that do" require replies or do they only
>>>> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>>>> box out seems to imply?
>>>
>>> set_mem_table gives a reply when postcopy is enabled (and then qemu
>>> replies to the reply!) but otherwise doesn't.
>>> (Note there's an issue opened for .rs to support ADD_MEM_REGION
>>> since it's a lot better than SET_MEM_TABLE which has a fixed size table
>>> that's small).
>>
>> Thanks for the heads up.
>>
>>>
>>> Dave
>>>
>>>> Currently I have some hacks in:
>>>>
>>>> https://github.com/stsquad/vhost/tree/my-hacks
>>>>
>>>> which gets my daemon booting up to the point we actually need to do a
>>>> transaction. However I won't submit a PR until I've worked out exactly
>>>> where the problems are.
>>>>
>>>> --
>>>> Alex Bennée
>>>>
>>
>>
>> --
>> Alex Bennée
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> _______________________________________________
> Rust-vmm mailing list
> Rust-vmm@lists.opendev.org
> http://lists.opendev.org/cgi-bin/mailman/listinfo/rust-vmm
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-19 16:04 vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs) Alex Bennée
2021-02-22 13:06 ` Dr. David Alan Gilbert
@ 2021-02-23 11:44 ` Michael S. Tsirkin
2021-02-25 4:19 ` [Rust-VMM] " Dylan Reid
1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2021-02-23 11:44 UTC (permalink / raw)
To: Alex Bennée
Cc: Sergio Lopez, Dr. David Alan Gilbert, qemu-devel,
Stefan Hajnoczi, rust-vmm, Marc-André Lureau,
raphael.norwitz
Cc: Raphael
On Fri, Feb 19, 2021 at 04:04:34PM +0000, Alex Bennée wrote:
> Hi,
>
> I finally got a chance to get down into the guts of vhost-user while
> attempting to port my original C RPMB daemon to Rust using the
> vhost-user-backend and related crates. I ended up with this hang during
> negotiation:
>
> startup
>
> vhost_user_write req:1 flags:0x1
> vhost_user_read_start
> vhost_user_read req:1 flags:0x5
> vhost_user_backend_init: we got 170000000
> vhost_user_write req:15 flags:0x1
> vhost_user_read_start
> vhost_user_read req:15 flags:0x5
> vhost_user_set_protocol_features: 2008
> vhost_user_write req:16 flags:0x1
> vhost_user_write req:3 flags:0x1
> vhost_user_write req:1 flags:0x1
> vhost_user_read_start
> vhost_user_read req:1 flags:0x5
> vhost_user_write req:13 flags:0x1
>
> kernel initialises device
>
> virtio_rpmb virtio1: init done!
> vhost_user_write req:13 flags:0x1
> vhost_dev_set_features: 130000000
> vhost_user_set_features: 130000000
> vhost_user_write req:2 flags:0x1
> vhost_user_write req:5 flags:0x9
> vhost_user_read_start
>
> The proximate cause is the vhost crate handling:
>
> MasterReq::SET_MEM_TABLE => {
> let res = self.set_mem_table(&hdr, size, &buf, rfds);
> self.send_ack_message(&hdr, res)?;
> }
>
> which gates on the replay_ack_enabled flag:
>
> fn send_ack_message(
> &mut self,
> req: &VhostUserMsgHeader<MasterReq>,
> res: Result<()>,
> ) -> Result<()> {
> if dbg!(self.reply_ack_enabled) {
> let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
> let val = match res {
> Ok(_) => 0,
> Err(_) => 1,
> };
> let msg = VhostUserU64::new(val);
> self.main_sock.send_message(&hdr, &msg, None)?;
> }
> Ok(())
> }
>
> which is only set when we have all the appropriate acknowledged flags:
>
> fn update_reply_ack_flag(&mut self) {
> let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
> let pflag = VhostUserProtocolFeatures::REPLY_ACK;
> if (self.virtio_features & vflag) != 0
> && (self.acked_virtio_features & vflag) != 0
> && self.protocol_features.contains(pflag)
> && (self.acked_protocol_features & pflag.bits()) != 0
> {
> self.reply_ack_enabled = true;
> } else {
> self.reply_ack_enabled = false;
> }
> }
>
> which from above you can see QEMU helpfully dropped those bits in the
> reply. It does however work in the C/libvhost version:
>
> virtio_rpmb virtio1: init done!
> vhost_user_write req:13 flags:0x1
> vhost_dev_set_features: 130000000
> vhost_user_set_features: 130000000
> vhost_user_write req:2 flags:0x1
> vhost_user_write req:37 flags:0x9
> vhost_user_read_start
> vhost_user_read req:37 flags:0x5
> vhost_user_write req:8 flags:0x1
> vhost_user_write req:10 flags:0x1
> vhost_user_write req:9 flags:0x1
> vhost_user_write req:12 flags:0x1
> vhost_user_write req:13 flags:0x1
>
> albeit with a slightly different message sequence
> (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> the C code you can see why:
>
> need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
>
> reply_requested = vu_process_message(dev, &vmsg);
> if (!reply_requested && need_reply) {
> vmsg_set_reply_u64(&vmsg, 0);
> reply_requested = 1;
> }
>
> So regardless of what may have been negotiated it will always reply with
> something if the master requested it do so. This points us at the
> specification which reads:
>
> - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
> details.
>
> which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> be honoured when the feature has been negotiated. Which brings us to a
> series of questions:
>
> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> when doing the eventual VHOST_USER_SET_FEATURES reply?
Hmm looks like a bug indeed ... Anyone wants to look
into fixing that? Marc-André?
> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> the negotiated features before processing the ``need_reply`` [Bit 3]
> field of the messages?
>
> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> in the "list of the ones that do" require replies or do they only
> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> box out seems to imply?
>
> Currently I have some hacks in:
>
> https://github.com/stsquad/vhost/tree/my-hacks
>
> which gets my daemon booting up to the point we actually need to do a
> transaction. However I won't submit a PR until I've worked out exactly
> where the problems are.
>
> --
> Alex Bennée
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Rust-VMM] vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-23 11:44 ` Michael S. Tsirkin
@ 2021-02-25 4:19 ` Dylan Reid
2021-02-25 6:36 ` Keiichi Watanabe
0 siblings, 1 reply; 11+ messages in thread
From: Dylan Reid @ 2021-02-25 4:19 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Sergio Lopez, Chirantan Ekbote, qemu-devel,
Dr. David Alan Gilbert, Keiichi Watanabe, Stefan Hajnoczi,
rust-vmm, Marc-André Lureau, raphael.norwitz,
Alex Bennée
On Tue, Feb 23, 2021 at 8:20 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> Cc: Raphael
>
> On Fri, Feb 19, 2021 at 04:04:34PM +0000, Alex Bennée wrote:
> > Hi,
> >
> > I finally got a chance to get down into the guts of vhost-user while
> > attempting to port my original C RPMB daemon to Rust using the
> > vhost-user-backend and related crates. I ended up with this hang during
> > negotiation:
> >
> > startup
> >
> > vhost_user_write req:1 flags:0x1
> > vhost_user_read_start
> > vhost_user_read req:1 flags:0x5
> > vhost_user_backend_init: we got 170000000
> > vhost_user_write req:15 flags:0x1
> > vhost_user_read_start
> > vhost_user_read req:15 flags:0x5
> > vhost_user_set_protocol_features: 2008
> > vhost_user_write req:16 flags:0x1
> > vhost_user_write req:3 flags:0x1
> > vhost_user_write req:1 flags:0x1
> > vhost_user_read_start
> > vhost_user_read req:1 flags:0x5
> > vhost_user_write req:13 flags:0x1
> >
> > kernel initialises device
> >
> > virtio_rpmb virtio1: init done!
> > vhost_user_write req:13 flags:0x1
> > vhost_dev_set_features: 130000000
> > vhost_user_set_features: 130000000
> > vhost_user_write req:2 flags:0x1
> > vhost_user_write req:5 flags:0x9
> > vhost_user_read_start
> >
> > The proximate cause is the vhost crate handling:
> >
> > MasterReq::SET_MEM_TABLE => {
> > let res = self.set_mem_table(&hdr, size, &buf, rfds);
> > self.send_ack_message(&hdr, res)?;
> > }
> >
> > which gates on the replay_ack_enabled flag:
> >
> > fn send_ack_message(
> > &mut self,
> > req: &VhostUserMsgHeader<MasterReq>,
> > res: Result<()>,
> > ) -> Result<()> {
> > if dbg!(self.reply_ack_enabled) {
> > let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
> > let val = match res {
> > Ok(_) => 0,
> > Err(_) => 1,
> > };
> > let msg = VhostUserU64::new(val);
> > self.main_sock.send_message(&hdr, &msg, None)?;
> > }
> > Ok(())
> > }
> >
> > which is only set when we have all the appropriate acknowledged flags:
> >
> > fn update_reply_ack_flag(&mut self) {
> > let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
> > let pflag = VhostUserProtocolFeatures::REPLY_ACK;
> > if (self.virtio_features & vflag) != 0
> > && (self.acked_virtio_features & vflag) != 0
> > && self.protocol_features.contains(pflag)
> > && (self.acked_protocol_features & pflag.bits()) != 0
> > {
> > self.reply_ack_enabled = true;
> > } else {
> > self.reply_ack_enabled = false;
> > }
> > }
> >
> > which from above you can see QEMU helpfully dropped those bits in the
> > reply. It does however work in the C/libvhost version:
> >
> > virtio_rpmb virtio1: init done!
> > vhost_user_write req:13 flags:0x1
> > vhost_dev_set_features: 130000000
> > vhost_user_set_features: 130000000
> > vhost_user_write req:2 flags:0x1
> > vhost_user_write req:37 flags:0x9
> > vhost_user_read_start
> > vhost_user_read req:37 flags:0x5
> > vhost_user_write req:8 flags:0x1
> > vhost_user_write req:10 flags:0x1
> > vhost_user_write req:9 flags:0x1
> > vhost_user_write req:12 flags:0x1
> > vhost_user_write req:13 flags:0x1
> >
> > albeit with a slightly different message sequence
> > (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> > the C code you can see why:
> >
> > need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
> >
> > reply_requested = vu_process_message(dev, &vmsg);
> > if (!reply_requested && need_reply) {
> > vmsg_set_reply_u64(&vmsg, 0);
> > reply_requested = 1;
> > }
> >
> > So regardless of what may have been negotiated it will always reply with
> > something if the master requested it do so. This points us at the
> > specification which reads:
> >
> > - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
> > details.
> >
> > which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> > be honoured when the feature has been negotiated. Which brings us to a
> > series of questions:
> >
> > - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> > when doing the eventual VHOST_USER_SET_FEATURES reply?
>
> Hmm looks like a bug indeed ... Anyone wants to look
> into fixing that? Marc-André?
chirantan and keiichi will be implementing vhost-user-vitio-fs on
Chrome OS, maybe one of you two can take a look?
>
>
>
> > - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> > the negotiated features before processing the ``need_reply`` [Bit 3]
> > field of the messages?
> >
> > - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> > in the "list of the ones that do" require replies or do they only
> > reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> > box out seems to imply?
> >
> > Currently I have some hacks in:
> >
> > https://github.com/stsquad/vhost/tree/my-hacks
> >
> > which gets my daemon booting up to the point we actually need to do a
> > transaction. However I won't submit a PR until I've worked out exactly
> > where the problems are.
> >
> > --
> > Alex Bennée
>
>
> _______________________________________________
> Rust-vmm mailing list
> Rust-vmm@lists.opendev.org
> http://lists.opendev.org/cgi-bin/mailman/listinfo/rust-vmm
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Rust-VMM] vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-25 4:19 ` [Rust-VMM] " Dylan Reid
@ 2021-02-25 6:36 ` Keiichi Watanabe
0 siblings, 0 replies; 11+ messages in thread
From: Keiichi Watanabe @ 2021-02-25 6:36 UTC (permalink / raw)
To: Dylan Reid, woodychow
Cc: Sergio Lopez, Michael S. Tsirkin, Chirantan Ekbote, qemu-devel,
Dr. David Alan Gilbert, Stefan Hajnoczi, rust-vmm,
Marc-André Lureau, raphael.norwitz, Alex Bennée
[-- Attachment #1: Type: text/plain, Size: 6257 bytes --]
On Thu, Feb 25, 2021 at 1:20 PM Dylan Reid <dgreid@chromium.org> wrote:
> On Tue, Feb 23, 2021 at 8:20 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > Cc: Raphael
> >
> > On Fri, Feb 19, 2021 at 04:04:34PM +0000, Alex Bennée wrote:
> > > Hi,
> > >
> > > I finally got a chance to get down into the guts of vhost-user while
> > > attempting to port my original C RPMB daemon to Rust using the
> > > vhost-user-backend and related crates. I ended up with this hang during
> > > negotiation:
> > >
> > > startup
> > >
> > > vhost_user_write req:1 flags:0x1
> > > vhost_user_read_start
> > > vhost_user_read req:1 flags:0x5
> > > vhost_user_backend_init: we got 170000000
> > > vhost_user_write req:15 flags:0x1
> > > vhost_user_read_start
> > > vhost_user_read req:15 flags:0x5
> > > vhost_user_set_protocol_features: 2008
> > > vhost_user_write req:16 flags:0x1
> > > vhost_user_write req:3 flags:0x1
> > > vhost_user_write req:1 flags:0x1
> > > vhost_user_read_start
> > > vhost_user_read req:1 flags:0x5
> > > vhost_user_write req:13 flags:0x1
> > >
> > > kernel initialises device
> > >
> > > virtio_rpmb virtio1: init done!
> > > vhost_user_write req:13 flags:0x1
> > > vhost_dev_set_features: 130000000
> > > vhost_user_set_features: 130000000
> > > vhost_user_write req:2 flags:0x1
> > > vhost_user_write req:5 flags:0x9
> > > vhost_user_read_start
> > >
> > > The proximate cause is the vhost crate handling:
> > >
> > > MasterReq::SET_MEM_TABLE => {
> > > let res = self.set_mem_table(&hdr, size, &buf, rfds);
> > > self.send_ack_message(&hdr, res)?;
> > > }
> > >
> > > which gates on the replay_ack_enabled flag:
> > >
> > > fn send_ack_message(
> > > &mut self,
> > > req: &VhostUserMsgHeader<MasterReq>,
> > > res: Result<()>,
> > > ) -> Result<()> {
> > > if dbg!(self.reply_ack_enabled) {
> > > let hdr = self.new_reply_header::<VhostUserU64>(req, 0)?;
> > > let val = match res {
> > > Ok(_) => 0,
> > > Err(_) => 1,
> > > };
> > > let msg = VhostUserU64::new(val);
> > > self.main_sock.send_message(&hdr, &msg, None)?;
> > > }
> > > Ok(())
> > > }
> > >
> > > which is only set when we have all the appropriate acknowledged flags:
> > >
> > > fn update_reply_ack_flag(&mut self) {
> > > let vflag = VhostUserVirtioFeatures::PROTOCOL_FEATURES.bits();
> > > let pflag = VhostUserProtocolFeatures::REPLY_ACK;
> > > if (self.virtio_features & vflag) != 0
> > > && (self.acked_virtio_features & vflag) != 0
> > > && self.protocol_features.contains(pflag)
> > > && (self.acked_protocol_features & pflag.bits()) != 0
> > > {
> > > self.reply_ack_enabled = true;
> > > } else {
> > > self.reply_ack_enabled = false;
> > > }
> > > }
> > >
> > > which from above you can see QEMU helpfully dropped those bits in the
> > > reply. It does however work in the C/libvhost version:
> > >
> > > virtio_rpmb virtio1: init done!
> > > vhost_user_write req:13 flags:0x1
> > > vhost_dev_set_features: 130000000
> > > vhost_user_set_features: 130000000
> > > vhost_user_write req:2 flags:0x1
> > > vhost_user_write req:37 flags:0x9
> > > vhost_user_read_start
> > > vhost_user_read req:37 flags:0x5
> > > vhost_user_write req:8 flags:0x1
> > > vhost_user_write req:10 flags:0x1
> > > vhost_user_write req:9 flags:0x1
> > > vhost_user_write req:12 flags:0x1
> > > vhost_user_write req:13 flags:0x1
> > >
> > > albeit with a slightly different message sequence
> > > (VHOST_USER_ADD_MEM_REG instead of VHOST_USER_SET_MEM_TABLE). Reading
> > > the C code you can see why:
> > >
> > > need_reply = vmsg.flags & VHOST_USER_NEED_REPLY_MASK;
> > >
> > > reply_requested = vu_process_message(dev, &vmsg);
> > > if (!reply_requested && need_reply) {
> > > vmsg_set_reply_u64(&vmsg, 0);
> > > reply_requested = 1;
> > > }
> > >
> > > So regardless of what may have been negotiated it will always reply
> with
> > > something if the master requested it do so. This points us at the
> > > specification which reads:
> > >
> > > - Bit 3 is the need_reply flag - see :ref:`REPLY_ACK <reply_ack>` for
> > > details.
> > >
> > > which says in VHOST_USER_PROTOCOL_F_REPLY_ACK that this bit should only
> > > be honoured when the feature has been negotiated. Which brings us to a
> > > series of questions:
> > >
> > > - Should QEMU have preserved
> VhostUserVirtioFeatures::PROTOCOL_FEATURES
> > > when doing the eventual VHOST_USER_SET_FEATURES reply?
> >
> > Hmm looks like a bug indeed ... Anyone wants to look
> > into fixing that? Marc-André?
>
> chirantan and keiichi will be implementing vhost-user-vitio-fs on
> Chrome OS, maybe one of you two can take a look?
>
>
Yeah, our team is working on vhost-user virtiofs. I think +Woody Chow
<woodychow@chromium.org> will probably be able to look into this issue.
>
> >
> >
> >
> > > - Is vhost.rs being to strict or libvhost-user too lax in
> interpreting
> > > the negotiated features before processing the ``need_reply`` [Bit 3]
> > > field of the messages?
> > >
> > > - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> > > in the "list of the ones that do" require replies or do they only
> > > reply when REPLY_ACK has been negotiated as the ambiguous
> "seealso::"
> > > box out seems to imply?
> > >
> > > Currently I have some hacks in:
> > >
> > > https://github.com/stsquad/vhost/tree/my-hacks
> > >
> > > which gets my daemon booting up to the point we actually need to do a
> > > transaction. However I won't submit a PR until I've worked out exactly
> > > where the problems are.
> > >
> > > --
> > > Alex Bennée
> >
> >
> > _______________________________________________
> > Rust-vmm mailing list
> > Rust-vmm@lists.opendev.org
> > http://lists.opendev.org/cgi-bin/mailman/listinfo/rust-vmm
>
[-- Attachment #2: Type: text/html, Size: 8814 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-22 13:21 ` Alex Bennée
2021-02-22 13:27 ` Dr. David Alan Gilbert
@ 2021-02-26 19:58 ` Raphael Norwitz
2021-02-26 21:25 ` Raphael Norwitz
1 sibling, 1 reply; 11+ messages in thread
From: Raphael Norwitz @ 2021-02-26 19:58 UTC (permalink / raw)
To: Alex Bennée
Cc: Sergio Lopez, Michael S. Tsirkin, chirantan, QEMU,
Dr. David Alan Gilbert, keiichiw, Stefan Hajnoczi, rust-vmm,
Marc-André Lureau, dgreid
There are two sets of features being negotiated - virtio and
vhost-user. Based on what you've posted here, I suspect the
VHOST_USER_F_PROTOCOL_FEATURES virtio feature may not be negotiated by
the backend, preventing the vhost-user protocol feature negotiation
from happening at all. I'm not 100% sure why this would cause QEMU to
assume that REPLY_ACK was negotiated though.
some questions:
On Mon, Feb 22, 2021 at 3:26 AM Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>
> > * Alex Bennée (alex.bennee@linaro.org) wrote:
> >> Hi,
> >>
> >> I finally got a chance to get down into the guts of vhost-user while
> >> attempting to port my original C RPMB daemon to Rust using the
> >> vhost-user-backend and related crates. I ended up with this hang during
> >> negotiation:
> >>
> >> startup
> >>
> >> vhost_user_write req:1 flags:0x1
> >> vhost_user_read_start
> >> vhost_user_read req:1 flags:0x5
> >> vhost_user_backend_init: we got 170000000
>
> GET_FEATURES
Do we also see a GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES
message here? If so can you confirm what flags they contained?
vhost-user feature negotiation works as follows (see vhost_user_backend_init()):
err = vhost_user_get_features(dev, &features);
if (err < 0) {
return err;
}
if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
&protocol_features);
if (err < 0) {
return err;
}
dev->protocol_features =
protocol_features & VHOST_USER_PROTOCOL_FEATURE_MASK;
...
err = vhost_user_set_protocol_features(dev, dev->protocol_features);
if (err < 0) {
return err;
}
}
So we first get the virtio features and check if the backend
advertises VHOST_USER_F_PROTOCOL_FEATURES. If it does, we proceed to
negotiate vhost-user features, in which case we should see
GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES. Otherwise it looks
like the function just returns, and we leave the vhost-user features
uninitialized (presumably zeroed out?), and the backend will never
even receive a GET/SET_PROTOCOL_FEATURES.
dev->protocol_features is not touched anywhere else, and, if
VHOST_USER_F_PROTOCOL_FEATURES is negotiated, comes directly to the
backend from the protocol_features the backend &ed with
VHOST_USER_PROTOCOL_FEATURE_MASK. Therefore if
VHOST_USER_F_PROTOCOL_FEATURES is indeed negotiated here I'm not sure
what could cause QEMU to think REPLY_ACK was negotiated while the
backend does not, spare something obvious like the backend mishandling
the GET/SET_PROTOCOL_FEATURES messages. I briefly checked the rustvmm
code for that and didn't see anything obvious.
mst - are backend devices meant to function if
VHOST_USER_F_PROTOCOL_FEATURES is not advertised? Do we know of any
functioning backend which does not advertise this virtio feature? If
not, maybe we consider failing out here?
alex - Are you sure QEMU gets stuck waiting on a reply_ack message,
and not somewhere else in the setup path? I trust a SET_MEM_TABLE
message was actually received by the backend. Did you confirm that
QEMU was indeed stuck waiting for a reply and not somewhere else later
on?
>
> >> vhost_user_write req:15 flags:0x1
> >> vhost_user_read_start
> >> vhost_user_read req:15 flags:0x5
> >> vhost_user_set_protocol_features: 2008
> >> vhost_user_write req:16 flags:0x1
> >> vhost_user_write req:3 flags:0x1
> >> vhost_user_write req:1 flags:0x1
> >> vhost_user_read_start
> >> vhost_user_read req:1 flags:0x5
> >> vhost_user_write req:13 flags:0x1
> >>
> >> kernel initialises device
> >>
> >> virtio_rpmb virtio1: init done!
> >> vhost_user_write req:13 flags:0x1
> >> vhost_dev_set_features: 130000000
> >> vhost_user_set_features: 130000000
>
> SET_FEATURES
This is setting virtio features - should have nothing to do with REPLY_ACK.
>
> >> vhost_user_write req:2 flags:0x1
> >> vhost_user_write req:5 flags:0x9
> >> vhost_user_read_start
> >>
> <snip>
> >>
> >> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> >> when doing the eventual VHOST_USER_SET_FEATURES reply?
> >>
> >> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> >> the negotiated features before processing the ``need_reply`` [Bit 3]
> >> field of the messages?
> >
> > I think vhost.rs is being correctly strict - but there would be no harm
> > in it flagging that you'd hit an inconsistency if it finds a need_reply
> > without the feature.
>
> But the feature should have been negotiated. So unless the slave can
> assume it is enabled because it asked I think QEMU is in the wrong by
> not preserving the feature bits in it's SET_FEATURES reply. We just gets
> away with it with libvhostuser being willing to reply anyway.
>
> >
> >> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> >> in the "list of the ones that do" require replies or do they only
> >> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> >> box out seems to imply?
> >
> > set_mem_table gives a reply when postcopy is enabled (and then qemu
> > replies to the reply!) but otherwise doesn't.
> > (Note there's an issue opened for .rs to support ADD_MEM_REGION
> > since it's a lot better than SET_MEM_TABLE which has a fixed size table
> > that's small).
>
> Thanks for the heads up.
>
> >
> > Dave
> >
> >> Currently I have some hacks in:
> >>
> >> https://github.com/stsquad/vhost/tree/my-hacks
> >>
> >> which gets my daemon booting up to the point we actually need to do a
> >> transaction. However I won't submit a PR until I've worked out exactly
> >> where the problems are.
> >>
> >> --
> >> Alex Bennée
> >>
>
>
> --
> Alex Bennée
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-26 19:58 ` Raphael Norwitz
@ 2021-02-26 21:25 ` Raphael Norwitz
2021-02-27 12:23 ` Alex Bennée
0 siblings, 1 reply; 11+ messages in thread
From: Raphael Norwitz @ 2021-02-26 21:25 UTC (permalink / raw)
To: Alex Bennée
Cc: Sergio Lopez, Michael S. Tsirkin, chirantan, QEMU,
Dr. David Alan Gilbert, keiichiw, Stefan Hajnoczi, rust-vmm,
Marc-André Lureau, dgreid
As an afterthought - if VHOST_USER_F_PROTOCOL_FEATURES is indeed
unset, the issue may well be caused by QEMU reading an uninitialized
value for dev->protocol_features. Some device types like cryptodev
explicitly zero it out. As I said, it isn't set anywhere else in the
source and If dev->protocol_features had REPLY_ACK set when the
vhost_dev device is initialized, it would exactly explain the behavior
you are seeing.
On Fri, Feb 26, 2021 at 9:58 AM Raphael Norwitz
<raphael.s.norwitz@gmail.com> wrote:
>
> There are two sets of features being negotiated - virtio and
> vhost-user. Based on what you've posted here, I suspect the
> VHOST_USER_F_PROTOCOL_FEATURES virtio feature may not be negotiated by
> the backend, preventing the vhost-user protocol feature negotiation
> from happening at all. I'm not 100% sure why this would cause QEMU to
> assume that REPLY_ACK was negotiated though.
>
> some questions:
>
> On Mon, Feb 22, 2021 at 3:26 AM Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> >
> > Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
> >
> > > * Alex Bennée (alex.bennee@linaro.org) wrote:
> > >> Hi,
> > >>
> > >> I finally got a chance to get down into the guts of vhost-user while
> > >> attempting to port my original C RPMB daemon to Rust using the
> > >> vhost-user-backend and related crates. I ended up with this hang during
> > >> negotiation:
> > >>
> > >> startup
> > >>
> > >> vhost_user_write req:1 flags:0x1
> > >> vhost_user_read_start
> > >> vhost_user_read req:1 flags:0x5
> > >> vhost_user_backend_init: we got 170000000
> >
> > GET_FEATURES
>
> Do we also see a GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES
> message here? If so can you confirm what flags they contained?
>
> vhost-user feature negotiation works as follows (see vhost_user_backend_init()):
>
> err = vhost_user_get_features(dev, &features);
> if (err < 0) {
> return err;
> }
>
> if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
> dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
>
> err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
> &protocol_features);
> if (err < 0) {
> return err;
> }
>
> dev->protocol_features =
> protocol_features & VHOST_USER_PROTOCOL_FEATURE_MASK;
> ...
>
> err = vhost_user_set_protocol_features(dev, dev->protocol_features);
> if (err < 0) {
> return err;
> }
> }
>
> So we first get the virtio features and check if the backend
> advertises VHOST_USER_F_PROTOCOL_FEATURES. If it does, we proceed to
> negotiate vhost-user features, in which case we should see
> GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES. Otherwise it looks
> like the function just returns, and we leave the vhost-user features
> uninitialized (presumably zeroed out?), and the backend will never
> even receive a GET/SET_PROTOCOL_FEATURES.
>
> dev->protocol_features is not touched anywhere else, and, if
> VHOST_USER_F_PROTOCOL_FEATURES is negotiated, comes directly to the
> backend from the protocol_features the backend &ed with
> VHOST_USER_PROTOCOL_FEATURE_MASK. Therefore if
> VHOST_USER_F_PROTOCOL_FEATURES is indeed negotiated here I'm not sure
> what could cause QEMU to think REPLY_ACK was negotiated while the
> backend does not, spare something obvious like the backend mishandling
> the GET/SET_PROTOCOL_FEATURES messages. I briefly checked the rustvmm
> code for that and didn't see anything obvious.
>
> mst - are backend devices meant to function if
> VHOST_USER_F_PROTOCOL_FEATURES is not advertised? Do we know of any
> functioning backend which does not advertise this virtio feature? If
> not, maybe we consider failing out here?
>
> alex - Are you sure QEMU gets stuck waiting on a reply_ack message,
> and not somewhere else in the setup path? I trust a SET_MEM_TABLE
> message was actually received by the backend. Did you confirm that
> QEMU was indeed stuck waiting for a reply and not somewhere else later
> on?
>
> >
> > >> vhost_user_write req:15 flags:0x1
> > >> vhost_user_read_start
> > >> vhost_user_read req:15 flags:0x5
> > >> vhost_user_set_protocol_features: 2008
> > >> vhost_user_write req:16 flags:0x1
> > >> vhost_user_write req:3 flags:0x1
> > >> vhost_user_write req:1 flags:0x1
> > >> vhost_user_read_start
> > >> vhost_user_read req:1 flags:0x5
> > >> vhost_user_write req:13 flags:0x1
> > >>
> > >> kernel initialises device
> > >>
> > >> virtio_rpmb virtio1: init done!
> > >> vhost_user_write req:13 flags:0x1
> > >> vhost_dev_set_features: 130000000
> > >> vhost_user_set_features: 130000000
> >
> > SET_FEATURES
>
> This is setting virtio features - should have nothing to do with REPLY_ACK.
>
> >
> > >> vhost_user_write req:2 flags:0x1
> > >> vhost_user_write req:5 flags:0x9
> > >> vhost_user_read_start
> > >>
> > <snip>
> > >>
> > >> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
> > >> when doing the eventual VHOST_USER_SET_FEATURES reply?
> > >>
> > >> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
> > >> the negotiated features before processing the ``need_reply`` [Bit 3]
> > >> field of the messages?
> > >
> > > I think vhost.rs is being correctly strict - but there would be no harm
> > > in it flagging that you'd hit an inconsistency if it finds a need_reply
> > > without the feature.
> >
> > But the feature should have been negotiated. So unless the slave can
> > assume it is enabled because it asked I think QEMU is in the wrong by
> > not preserving the feature bits in it's SET_FEATURES reply. We just gets
> > away with it with libvhostuser being willing to reply anyway.
> >
> > >
> > >> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
> > >> in the "list of the ones that do" require replies or do they only
> > >> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
> > >> box out seems to imply?
> > >
> > > set_mem_table gives a reply when postcopy is enabled (and then qemu
> > > replies to the reply!) but otherwise doesn't.
> > > (Note there's an issue opened for .rs to support ADD_MEM_REGION
> > > since it's a lot better than SET_MEM_TABLE which has a fixed size table
> > > that's small).
> >
> > Thanks for the heads up.
> >
> > >
> > > Dave
> > >
> > >> Currently I have some hacks in:
> > >>
> > >> https://github.com/stsquad/vhost/tree/my-hacks
> > >>
> > >> which gets my daemon booting up to the point we actually need to do a
> > >> transaction. However I won't submit a PR until I've worked out exactly
> > >> where the problems are.
> > >>
> > >> --
> > >> Alex Bennée
> > >>
> >
> >
> > --
> > Alex Bennée
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs)
2021-02-26 21:25 ` Raphael Norwitz
@ 2021-02-27 12:23 ` Alex Bennée
0 siblings, 0 replies; 11+ messages in thread
From: Alex Bennée @ 2021-02-27 12:23 UTC (permalink / raw)
To: Raphael Norwitz
Cc: Sergio Lopez, Michael S. Tsirkin, chirantan, QEMU,
Dr. David Alan Gilbert, keiichiw, Stefan Hajnoczi, rust-vmm,
Marc-André Lureau, dgreid
Raphael Norwitz <raphael.s.norwitz@gmail.com> writes:
> As an afterthought - if VHOST_USER_F_PROTOCOL_FEATURES is indeed
> unset, the issue may well be caused by QEMU reading an uninitialized
> value for dev->protocol_features. Some device types like cryptodev
> explicitly zero it out. As I said, it isn't set anywhere else in the
> source and If dev->protocol_features had REPLY_ACK set when the
> vhost_dev device is initialized, it would exactly explain the behavior
> you are seeing.
Can I point you at the proposed clarification of the vhost-user
specification:
Subject: [VHOST USER SPEC PATCH] vhost-user.rst: add clarifying language about protocol negotiation
Date: Fri, 26 Feb 2021 11:16:19 +0000
Message-Id: <20210226111619.21178-1-alex.bennee@linaro.org>
>
> On Fri, Feb 26, 2021 at 9:58 AM Raphael Norwitz
> <raphael.s.norwitz@gmail.com> wrote:
>>
>> There are two sets of features being negotiated - virtio and
>> vhost-user. Based on what you've posted here, I suspect the
>> VHOST_USER_F_PROTOCOL_FEATURES virtio feature may not be negotiated by
>> the backend, preventing the vhost-user protocol feature negotiation
>> from happening at all. I'm not 100% sure why this would cause QEMU to
>> assume that REPLY_ACK was negotiated though.
>>
>> some questions:
>>
>> On Mon, Feb 22, 2021 at 3:26 AM Alex Bennée <alex.bennee@linaro.org> wrote:
>> >
>> >
>> > Dr. David Alan Gilbert <dgilbert@redhat.com> writes:
>> >
>> > > * Alex Bennée (alex.bennee@linaro.org) wrote:
>> > >> Hi,
>> > >>
>> > >> I finally got a chance to get down into the guts of vhost-user while
>> > >> attempting to port my original C RPMB daemon to Rust using the
>> > >> vhost-user-backend and related crates. I ended up with this hang during
>> > >> negotiation:
>> > >>
>> > >> startup
>> > >>
>> > >> vhost_user_write req:1 flags:0x1
>> > >> vhost_user_read_start
>> > >> vhost_user_read req:1 flags:0x5
>> > >> vhost_user_backend_init: we got 170000000
>> >
>> > GET_FEATURES
>>
>> Do we also see a GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES
>> message here? If so can you confirm what flags they contained?
>>
>> vhost-user feature negotiation works as follows (see vhost_user_backend_init()):
>>
>> err = vhost_user_get_features(dev, &features);
>> if (err < 0) {
>> return err;
>> }
>>
>> if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
>> dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
>>
>> err = vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
>> &protocol_features);
>> if (err < 0) {
>> return err;
>> }
>>
>> dev->protocol_features =
>> protocol_features & VHOST_USER_PROTOCOL_FEATURE_MASK;
>> ...
>>
>> err = vhost_user_set_protocol_features(dev, dev->protocol_features);
>> if (err < 0) {
>> return err;
>> }
>> }
>>
>> So we first get the virtio features and check if the backend
>> advertises VHOST_USER_F_PROTOCOL_FEATURES. If it does, we proceed to
>> negotiate vhost-user features, in which case we should see
>> GET_PROTOCOL_FEATURES and a SET_PROTOCOL_FEATURES. Otherwise it looks
>> like the function just returns, and we leave the vhost-user features
>> uninitialized (presumably zeroed out?), and the backend will never
>> even receive a GET/SET_PROTOCOL_FEATURES.
>>
>> dev->protocol_features is not touched anywhere else, and, if
>> VHOST_USER_F_PROTOCOL_FEATURES is negotiated, comes directly to the
>> backend from the protocol_features the backend &ed with
>> VHOST_USER_PROTOCOL_FEATURE_MASK. Therefore if
>> VHOST_USER_F_PROTOCOL_FEATURES is indeed negotiated here I'm not sure
>> what could cause QEMU to think REPLY_ACK was negotiated while the
>> backend does not, spare something obvious like the backend mishandling
>> the GET/SET_PROTOCOL_FEATURES messages. I briefly checked the rustvmm
>> code for that and didn't see anything obvious.
>>
>> mst - are backend devices meant to function if
>> VHOST_USER_F_PROTOCOL_FEATURES is not advertised? Do we know of any
>> functioning backend which does not advertise this virtio feature? If
>> not, maybe we consider failing out here?
>>
>> alex - Are you sure QEMU gets stuck waiting on a reply_ack message,
>> and not somewhere else in the setup path? I trust a SET_MEM_TABLE
>> message was actually received by the backend. Did you confirm that
>> QEMU was indeed stuck waiting for a reply and not somewhere else later
>> on?
>>
>> >
>> > >> vhost_user_write req:15 flags:0x1
>> > >> vhost_user_read_start
>> > >> vhost_user_read req:15 flags:0x5
>> > >> vhost_user_set_protocol_features: 2008
>> > >> vhost_user_write req:16 flags:0x1
>> > >> vhost_user_write req:3 flags:0x1
>> > >> vhost_user_write req:1 flags:0x1
>> > >> vhost_user_read_start
>> > >> vhost_user_read req:1 flags:0x5
>> > >> vhost_user_write req:13 flags:0x1
>> > >>
>> > >> kernel initialises device
>> > >>
>> > >> virtio_rpmb virtio1: init done!
>> > >> vhost_user_write req:13 flags:0x1
>> > >> vhost_dev_set_features: 130000000
>> > >> vhost_user_set_features: 130000000
>> >
>> > SET_FEATURES
>>
>> This is setting virtio features - should have nothing to do with REPLY_ACK.
>>
>> >
>> > >> vhost_user_write req:2 flags:0x1
>> > >> vhost_user_write req:5 flags:0x9
>> > >> vhost_user_read_start
>> > >>
>> > <snip>
>> > >>
>> > >> - Should QEMU have preserved VhostUserVirtioFeatures::PROTOCOL_FEATURES
>> > >> when doing the eventual VHOST_USER_SET_FEATURES reply?
>> > >>
>> > >> - Is vhost.rs being to strict or libvhost-user too lax in interpreting
>> > >> the negotiated features before processing the ``need_reply`` [Bit 3]
>> > >> field of the messages?
>> > >
>> > > I think vhost.rs is being correctly strict - but there would be no harm
>> > > in it flagging that you'd hit an inconsistency if it finds a need_reply
>> > > without the feature.
>> >
>> > But the feature should have been negotiated. So unless the slave can
>> > assume it is enabled because it asked I think QEMU is in the wrong by
>> > not preserving the feature bits in it's SET_FEATURES reply. We just gets
>> > away with it with libvhostuser being willing to reply anyway.
>> >
>> > >
>> > >> - are VHOST_USER_SET_MEM_TABLE to VHOST_USER_SET_INFLIGHT_FD included
>> > >> in the "list of the ones that do" require replies or do they only
>> > >> reply when REPLY_ACK has been negotiated as the ambiguous "seealso::"
>> > >> box out seems to imply?
>> > >
>> > > set_mem_table gives a reply when postcopy is enabled (and then qemu
>> > > replies to the reply!) but otherwise doesn't.
>> > > (Note there's an issue opened for .rs to support ADD_MEM_REGION
>> > > since it's a lot better than SET_MEM_TABLE which has a fixed size table
>> > > that's small).
>> >
>> > Thanks for the heads up.
>> >
>> > >
>> > > Dave
>> > >
>> > >> Currently I have some hacks in:
>> > >>
>> > >> https://github.com/stsquad/vhost/tree/my-hacks
>> > >>
>> > >> which gets my daemon booting up to the point we actually need to do a
>> > >> transaction. However I won't submit a PR until I've worked out exactly
>> > >> where the problems are.
>> > >>
>> > >> --
>> > >> Alex Bennée
>> > >>
>> >
>> >
>> > --
>> > Alex Bennée
>> >
--
Alex Bennée
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-02-27 12:25 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-19 16:04 vhost reply_ack negotiation (a.k.a differences in vhost-user behaviour with libvhost-user and vhost-user-backend.rs) Alex Bennée
2021-02-22 13:06 ` Dr. David Alan Gilbert
2021-02-22 13:21 ` Alex Bennée
2021-02-22 13:27 ` Dr. David Alan Gilbert
2021-02-23 10:23 ` [Rust-VMM] " Jiang Liu
2021-02-26 19:58 ` Raphael Norwitz
2021-02-26 21:25 ` Raphael Norwitz
2021-02-27 12:23 ` Alex Bennée
2021-02-23 11:44 ` Michael S. Tsirkin
2021-02-25 4:19 ` [Rust-VMM] " Dylan Reid
2021-02-25 6:36 ` Keiichi Watanabe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).