* [PATCH] vhost: fix MQ fails to startup
@ 2017-04-27  6:34 Zhiyong Yang
  2017-04-27  7:41 ` Loftus, Ciara
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Zhiyong Yang @ 2017-04-27  6:34 UTC (permalink / raw)
  To: dev; +Cc: yuanhan.liu, maxime.coquelin, ciara.loftus, Zhiyong Yang

Since DPDK 17.02, vhost used with QEMU 2.7 and above fails to set up
new connections when multiqueue (MQ) is negotiated. (A single queue
pair works well.) The QEMU code has bugs that came in when
VHOST_USER_PROTOCOL_F_REPLY_ACK was introduced. When handling the
vhost message VHOST_USER_SET_MEM_TABLE for the second time, QEMU
does not actually send the message (it needs to be sent only once)
but still waits for DPDK's reply ack, so QEMU hangs forever. The
DPDK code behaves correctly. However,
VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default on the
DPDK side so that the feature is never enabled on both the DPDK and
QEMU sides at once. With that change, MQ works well. Once the QEMU
bugs have been fixed and upstreamed, we can enable it again.

Fixes: 73c8f9f69c6c ("vhost: introduce reply ack feature")

Reported-by: Loftus, Ciara <ciara.loftus@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
---
 lib/librte_vhost/vhost_user.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 2ba22db..a3d2900 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -52,7 +52,7 @@
 #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
 					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
+					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
 
 typedef enum VhostUserRequest {
-- 
2.7.4
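
The effect of the one-line change can be checked in isolation. The following is a minimal sketch, not part of the patch: the bit positions of REPLY_ACK (3) and NET_MTU (4) come from vhost_user.h, while those of MQ, LOG_SHMFD and RARP (0, 1, 2) are assumed from the vhost-user spec, and the three helper functions are hypothetical. Since `0ULL << n` evaluates to 0, the REPLY_ACK bit drops out of the mask while the macro keeps its shape.

```c
#include <assert.h>
#include <stdint.h>

/* REPLY_ACK and NET_MTU positions are taken from vhost_user.h;
 * MQ, LOG_SHMFD and RARP are assumed from the vhost-user spec. */
#define VHOST_USER_PROTOCOL_F_MQ        0
#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
#define VHOST_USER_PROTOCOL_F_RARP      2
#define VHOST_USER_PROTOCOL_F_REPLY_ACK 3
#define VHOST_USER_PROTOCOL_F_NET_MTU   4

/* Hypothetical helper: the mask as it was before the patch. */
static inline uint64_t features_before(void)
{
	return (1ULL << VHOST_USER_PROTOCOL_F_MQ) |
	       (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |
	       (1ULL << VHOST_USER_PROTOCOL_F_RARP) |
	       (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) |
	       (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU);
}

/* Hypothetical helper: the mask after the patch. 0ULL shifted by any
 * amount is still 0, so the REPLY_ACK term contributes nothing. */
static inline uint64_t features_after(void)
{
	return (1ULL << VHOST_USER_PROTOCOL_F_MQ) |
	       (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |
	       (1ULL << VHOST_USER_PROTOCOL_F_RARP) |
	       (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) |
	       (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU);
}

/* Hypothetical helper: returns 1 if the feature bit is set in mask. */
static inline int has_feature(uint64_t mask, unsigned int bit)
{
	assert(bit < 64);
	return (int)((mask >> bit) & 1);
}
```

Under these assumptions, the only difference between the two masks is bit 3; every other advertised feature is unchanged.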


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  6:34 [PATCH] vhost: fix MQ fails to startup Zhiyong Yang
@ 2017-04-27  7:41 ` Loftus, Ciara
  2017-04-27  7:56 ` Maxime Coquelin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Loftus, Ciara @ 2017-04-27  7:41 UTC (permalink / raw)
  To: Yang, Zhiyong, dev; +Cc: yuanhan.liu, maxime.coquelin

> 
> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> new connection when negotiating to set MQ. (one queue pair works
> well).Because there exist some bugs in qemu code when introducing
> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> doesn't send the messge (The message needs to be sent only once)but
> still will be waiting for dpdk's reply ack, then, qemu is always
> freezing. DPDK code works in the right way. But the feature
> VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> dpdk side in order to avoid the feature support of DPDK + qemu at
> the same time. if doing like that, MQ can works well. Once Qemu bugs
> have been fixed and upstreamed, we can enable it.
> 
> Fixes: 73c8f9f69c6c("vhost: introduce reply ack feature")
> 
> Reported-by: Loftus, Ciara <ciara.loftus@intel.com>
> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>

Thanks for the fix, Zhiyong. I tested the patch in my environment and it resolves the issue I was seeing.

Tested-by: Ciara Loftus <ciara.loftus@intel.com>

Thanks,
Ciara

> ---
>  lib/librte_vhost/vhost_user.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> index 2ba22db..a3d2900 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -52,7 +52,7 @@
>  #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
>  					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
>  					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
> -					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
> +					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
>  					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
> 
>  typedef enum VhostUserRequest {
> --
> 2.7.4


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  6:34 [PATCH] vhost: fix MQ fails to startup Zhiyong Yang
  2017-04-27  7:41 ` Loftus, Ciara
@ 2017-04-27  7:56 ` Maxime Coquelin
  2017-04-27  8:05   ` Maxime Coquelin
  2017-04-27  8:20   ` Yuanhan Liu
  2017-04-27  8:12 ` Yuanhan Liu
  2017-04-27  9:41 ` [PATCH v2] vhost: workaround " Zhiyong Yang
  3 siblings, 2 replies; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-27  7:56 UTC (permalink / raw)
  To: Zhiyong Yang, dev; +Cc: yuanhan.liu, ciara.loftus, Marc-André Lureau

Hi Zhiyong,

+Marc-André

On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> new connection when negotiating to set MQ. (one queue pair works
> well).Because there exist some bugs in qemu code when introducing
> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> doesn't send the messge (The message needs to be sent only once)but
> still will be waiting for dpdk's reply ack, then, qemu is always
> freezing. DPDK code works in the right way.

I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
see how it could wait for the reply-ack if it didn't send the
VHOST_USER_SET_MEM_TABLE request before.

> But the feature
> VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> dpdk side in order to avoid the feature support of DPDK + qemu at
> the same time. if doing like that, MQ can works well. Once Qemu bugs
> have been fixed and upstreamed, we can enable it.

The problem is how DPDK can detect whether the bug is fixed in QEMU.
Maybe the only way would be to have a new protocol feature flag, which
is not really its role.
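
To illustrate the detection problem, here is a minimal sketch that assumes protocol features are negotiated as the intersection of what each side advertises. The helper name and the example masks are hypothetical. A buggy and a fixed QEMU would both advertise the same REPLY_ACK bit, so the negotiated result gives the backend nothing to tell them apart.

```c
#include <assert.h>
#include <stdint.h>

#define F_MQ        0	/* assumed from the vhost-user spec */
#define F_REPLY_ACK 3	/* bit position from vhost_user.h */

/* Hypothetical helper: the features in effect are those advertised
 * by both the frontend (QEMU) and the backend (DPDK vhost). */
static inline uint64_t negotiate(uint64_t frontend, uint64_t backend)
{
	return frontend & backend;
}

/* Hypothetical helper: 1 if REPLY_ACK ends up negotiated. */
static inline int reply_ack_enabled(uint64_t frontend, uint64_t backend)
{
	uint64_t negotiated = negotiate(frontend, backend);

	assert(F_REPLY_ACK < 64);
	return (int)((negotiated >> F_REPLY_ACK) & 1);
}
```

With a backend mask that omits REPLY_ACK, as in this patch, the result is 0 for any frontend, which is why disabling the bit on the DPDK side works regardless of the QEMU version.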

> Fixes: 73c8f9f69c6c("vhost: introduce reply ack feature")
> 
> Reported-by: Loftus, Ciara <ciara.loftus@intel.com>
> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> ---
>   lib/librte_vhost/vhost_user.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> index 2ba22db..a3d2900 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -52,7 +52,7 @@
>   #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
>   					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
>   					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
> -					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
> +					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
>   					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
>   
>   typedef enum VhostUserRequest {
> 

Cheers,
Maxime


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  7:56 ` Maxime Coquelin
@ 2017-04-27  8:05   ` Maxime Coquelin
  2017-04-27  8:24     ` Yang, Zhiyong
  2017-04-27  8:20   ` Yuanhan Liu
  1 sibling, 1 reply; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-27  8:05 UTC (permalink / raw)
  To: Zhiyong Yang, dev; +Cc: yuanhan.liu, ciara.loftus, Marc-André Lureau



On 04/27/2017 09:56 AM, Maxime Coquelin wrote:
> Hi Zhiyong,
> 
> +Marc-André
> 
> On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
>> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
>> new connection when negotiating to set MQ. (one queue pair works
>> well).Because there exist some bugs in qemu code when introducing
>> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
>> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
>> doesn't send the messge (The message needs to be sent only once)but
>> still will be waiting for dpdk's reply ack, then, qemu is always
>> freezing. DPDK code works in the right way.
> 
> I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
> see how it could wait for the reply-ack if it didn't send the
> VHOST_USER_SET_MEM_TABLE request before.

Oh, sorry, I get it now.
Are you working on a fix in QEMU, or have you already reported the
problem?

Thanks,
Maxime


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  6:34 [PATCH] vhost: fix MQ fails to startup Zhiyong Yang
  2017-04-27  7:41 ` Loftus, Ciara
  2017-04-27  7:56 ` Maxime Coquelin
@ 2017-04-27  8:12 ` Yuanhan Liu
  2017-04-27  8:32   ` Yang, Zhiyong
  2017-04-27  9:41 ` [PATCH v2] vhost: workaround " Zhiyong Yang
  3 siblings, 1 reply; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-27  8:12 UTC (permalink / raw)
  To: Zhiyong Yang; +Cc: dev, maxime.coquelin, ciara.loftus

On Thu, Apr 27, 2017 at 02:34:53PM +0800, Zhiyong Yang wrote:
> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> new connection when negotiating to set MQ. (one queue pair works
> well).Because there exist some bugs in qemu code when introducing
> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> doesn't send the messge (The message needs to be sent only once)but
> still will be waiting for dpdk's reply ack, then, qemu is always
> freezing. DPDK code works in the right way. But the feature 
> VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> dpdk side in order to avoid the feature support of DPDK + qemu at
> the same time. if doing like that, MQ can works well.
...
> Once Qemu bugs
> have been fixed and upstreamed, we can enable it.

As I have said, we should not enable it again, because there are already
a few buggy QEMU releases out. We should make sure DPDK also works well
with them.

> Fixes: 73c8f9f69c6c("vhost: introduce reply ack feature")

That commit does nothing wrong. It's QEMU being buggy. That said, I would
not add such a Fixes line. I would also use "workaround" instead of "fix"
in the title.

Also, this patch should be backported to the stable release. So, you
should add:
	Cc: stable@dpdk.org

Besides, please reformat your commit log a bit. For example, add a space
after punctuation, use paragraphs where possible, etc.

> 
> Reported-by: Loftus, Ciara <ciara.loftus@intel.com>

No "," is allowed.

> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> ---
>  lib/librte_vhost/vhost_user.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> index 2ba22db..a3d2900 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -52,7 +52,7 @@
>  #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
>  					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
>  					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
> -					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
> +					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
>  					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))

I think you might want to add a simple comment here, something like

    /*
     * disable REPLY_ACK feature to workaround the buggy QEMU implementation.
     * Proved buggy QEMU includes v2.7 - v2.9.
     */

So, please send v2?

	--yliu


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  7:56 ` Maxime Coquelin
  2017-04-27  8:05   ` Maxime Coquelin
@ 2017-04-27  8:20   ` Yuanhan Liu
  2017-04-27  8:52     ` Maxime Coquelin
  1 sibling, 1 reply; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-27  8:20 UTC (permalink / raw)
  To: Maxime Coquelin; +Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau

On Thu, Apr 27, 2017 at 09:56:47AM +0200, Maxime Coquelin wrote:
> Hi Zhiyong,
> 
> +Marc-André
> 
> On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
> >vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> >new connection when negotiating to set MQ. (one queue pair works
> >well).Because there exist some bugs in qemu code when introducing
> >VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> >message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> >doesn't send the messge (The message needs to be sent only once)but
> >still will be waiting for dpdk's reply ack, then, qemu is always
> >freezing. DPDK code works in the right way.
> 
> I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
> see how it could wait for the reply-ack if it didn't send the
> VHOST_USER_SET_MEM_TABLE request before.
> 
> >But the feature
> >VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> >dpdk side in order to avoid the feature support of DPDK + qemu at
> >the same time. if doing like that, MQ can works well. Once Qemu bugs
> >have been fixed and upstreamed, we can enable it.
> 
> The problem is for DPDK to detect whether bug is fixed in Qemu.
> Maybe only way would be to have a new protocol feature flag, which is
> not really its role.

Wouldn't that be overkill, given that REPLY_ACK is not a must-have
feature?

	--yliu


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  8:05   ` Maxime Coquelin
@ 2017-04-27  8:24     ` Yang, Zhiyong
  2017-04-27  8:32       ` Maxime Coquelin
  0 siblings, 1 reply; 20+ messages in thread
From: Yang, Zhiyong @ 2017-04-27  8:24 UTC (permalink / raw)
  To: Maxime Coquelin, dev; +Cc: yuanhan.liu, Loftus, Ciara, Marc-André Lureau

Hi, Maxime:

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Thursday, April 27, 2017 4:05 PM
> To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
> Cc: yuanhan.liu@linux.intel.com; Loftus, Ciara <ciara.loftus@intel.com>; Marc-
> André Lureau <mlureau@redhat.com>
> Subject: Re: [PATCH] vhost: fix MQ fails to startup
> 
> 
> 
> On 04/27/2017 09:56 AM, Maxime Coquelin wrote:
> > Hi Zhiyong,
> >
> > +Marc-André
> >
> > On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
> >> vhost since dpdk17.02 + qemu2.7 and above will cause failures of new
> >> connection when negotiating to set MQ. (one queue pair works
> >> well).Because there exist some bugs in qemu code when introducing
> >> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> >> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> >> doesn't send the messge (The message needs to be sent only once)but
> >> still will be waiting for dpdk's reply ack, then, qemu is always
> >> freezing. DPDK code works in the right way.
> >
> > I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
> > see how it could wait for the reply-ack if it didn't send the
> > VHOST_USER_SET_MEM_TABLE request before.
> 
> Oh, sorry, I get it now.
> Are you working for a fix in Qemu, or have you already reported the problem?

I will send a bug fix patch to QEMU.
The same wrong code was also used when QEMU 2.9 introduced the MTU negotiation with DPDK.
I will fix both at the same time in QEMU.

Thanks
Zhiyong

> 
> Thanks,
> Maxime


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  8:12 ` Yuanhan Liu
@ 2017-04-27  8:32   ` Yang, Zhiyong
  0 siblings, 0 replies; 20+ messages in thread
From: Yang, Zhiyong @ 2017-04-27  8:32 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, maxime.coquelin, Loftus, Ciara

Hi, yuanhan:

> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Thursday, April 27, 2017 4:12 PM
> To: Yang, Zhiyong <zhiyong.yang@intel.com>
> Cc: dev@dpdk.org; maxime.coquelin@redhat.com; Loftus, Ciara
> <ciara.loftus@intel.com>
> Subject: Re: [PATCH] vhost: fix MQ fails to startup
> 
> On Thu, Apr 27, 2017 at 02:34:53PM +0800, Zhiyong Yang wrote:
> > vhost since dpdk17.02 + qemu2.7 and above will cause failures of new
> > connection when negotiating to set MQ. (one queue pair works
> > well).Because there exist some bugs in qemu code when introducing
> > VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> > message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> > doesn't send the messge (The message needs to be sent only once)but
> > still will be waiting for dpdk's reply ack, then, qemu is always
> > freezing. DPDK code works in the right way. But the feature
> > VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> > dpdk side in order to avoid the feature support of DPDK + qemu at the
> > same time. if doing like that, MQ can works well.
> ...
> > Once Qemu bugs
> > have been fixed and upstreamed, we can enable it.
> 
> As I have said, we should not enable it again, because there are already few
> buggy QEMU releases out. We should make sure DPDK also works well with
> them.

Ok.

> 
> > Fixes: 73c8f9f69c6c("vhost: introduce reply ack feature")
> 
> That commit does nothing wrong. It's QEMU being buggy. That said, I will not
> add such fixline. I will also use "workaround" instead of "fix" in the title.
>

Ok. I also think dpdk works in the right way. But I'm using "fix" in the title here.
Your suggestion is good.
 
> Also, this patch should be backported to stable release. So, you should
> add:
> 	Cc: stable@dpdk.org
> 

Ok

> Besides, please reformat you commit log a bit. For example, add space after
> punctuation, use paragraph as possible, etc.

Ok.

> 
> >
> > Reported-by: Loftus, Ciara <ciara.loftus@intel.com>
> 
> No "," is allowed.
> 

Ok

> > Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> > ---
> >  lib/librte_vhost/vhost_user.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> > index 2ba22db..a3d2900 100644
> > --- a/lib/librte_vhost/vhost_user.h
> > +++ b/lib/librte_vhost/vhost_user.h
> > @@ -52,7 +52,7 @@
> >  #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
> >  					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
> >  					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
> > -					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
> > +					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
> >  					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
> 
> I think you might want to add a simple comment here, something like
> 
>     /*
>      * disable REPLY_ACK feature to workaround the buggy QEMU implementation.
>      * Proved buggy QEMU includes v2.7 - v2.9.
>      */
> 
> So, please send v2?
>

Ok. 
I will send it later.

Zhiyong
 
> 	--yliu


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  8:24     ` Yang, Zhiyong
@ 2017-04-27  8:32       ` Maxime Coquelin
  0 siblings, 0 replies; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-27  8:32 UTC (permalink / raw)
  To: Yang, Zhiyong, dev; +Cc: yuanhan.liu, Loftus, Ciara, Marc-André Lureau



On 04/27/2017 10:24 AM, Yang, Zhiyong wrote:
> Hi, Maxime:
> 
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Thursday, April 27, 2017 4:05 PM
>> To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
>> Cc: yuanhan.liu@linux.intel.com; Loftus, Ciara <ciara.loftus@intel.com>; Marc-
>> André Lureau <mlureau@redhat.com>
>> Subject: Re: [PATCH] vhost: fix MQ fails to startup
>>
>>
>>
>> On 04/27/2017 09:56 AM, Maxime Coquelin wrote:
>>> Hi Zhiyong,
>>>
>>> +Marc-André
>>>
>>> On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
>>>> vhost since dpdk17.02 + qemu2.7 and above will cause failures of new
>>>> connection when negotiating to set MQ. (one queue pair works
>>>> well).Because there exist some bugs in qemu code when introducing
>>>> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
>>>> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
>>>> doesn't send the messge (The message needs to be sent only once)but
>>>> still will be waiting for dpdk's reply ack, then, qemu is always
>>>> freezing. DPDK code works in the right way.
>>>
>>> I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
>>> see how it could wait for the reply-ack if it didn't send the
>>> VHOST_USER_SET_MEM_TABLE request before.
>>
>> Oh, sorry, I get it now.
>> Are you working for a fix in Qemu, or have you already reported the problem?
> 
> I will send bug fix patch to Qemu.
> The same wrong code has also been used when Qemu2.9 introduce the MTU negotiation with DPDK.
> I will fix them at the same time in qemu.

I think the problem must be fixed generically and not per request.
Maybe in vhost_user_write(), if it is a one-time request, just clear the
VHOST_USER_NEED_REPLY flag. Then, in process_message_reply(), return
early if this flag isn't set.

But that is not enough because, as said, even if it is fixed, the backend
has no way to know about it.
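
A rough sketch of the idea above, and not QEMU's actual code: the two structs, the `one_time_sent` bookkeeping and both function names are hypothetical stand-ins for vhost_user_write() and process_message_reply(). Only the position of the need-reply flag (bit 3 of the header flags) is assumed from the vhost-user spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 3 of the header flags asks the backend for a reply
 * (assumed from the vhost-user spec). */
#define NEED_REPLY_MASK (UINT32_C(1) << 3)

/* Hypothetical, stripped-down message header. */
struct msg_hdr {
	uint32_t request;
	uint32_t flags;
};

/* Hypothetical sender state: has the one-time request gone out yet? */
struct sender {
	bool one_time_sent;
};

/*
 * Stand-in for the write path. Returns true if the message would
 * really be put on the wire. When a one-time request is skipped,
 * NEED_REPLY is cleared so nobody waits for an ack that cannot come.
 */
static inline bool sketch_write(struct sender *s, struct msg_hdr *m,
				bool one_time)
{
	assert(s != NULL && m != NULL);
	if (one_time && s->one_time_sent) {
		m->flags &= ~NEED_REPLY_MASK;
		return false;
	}
	if (one_time)
		s->one_time_sent = true;
	return true;
}

/* Stand-in for the reply path: wait only if NEED_REPLY is still set. */
static inline bool sketch_must_wait(const struct msg_hdr *m)
{
	assert(m != NULL);
	return (m->flags & NEED_REPLY_MASK) != 0;
}
```

Handling this in the common write path rather than per request means any message skipped as "one-time" loses its need-reply flag in one place, which covers both the SET_MEM_TABLE case and the MTU case mentioned earlier in the thread.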

> Thanks
> Zhiyong
> 
>>
>> Thanks,
>> Maxime


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  8:20   ` Yuanhan Liu
@ 2017-04-27  8:52     ` Maxime Coquelin
  2017-04-28  2:25       ` Yuanhan Liu
  0 siblings, 1 reply; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-27  8:52 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau



On 04/27/2017 10:20 AM, Yuanhan Liu wrote:
> On Thu, Apr 27, 2017 at 09:56:47AM +0200, Maxime Coquelin wrote:
>> Hi Zhiyong,
>>
>> +Marc-André
>>
>> On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
>>> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
>>> new connection when negotiating to set MQ. (one queue pair works
>>> well).Because there exist some bugs in qemu code when introducing
>>> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
>>> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
>>> doesn't send the messge (The message needs to be sent only once)but
>>> still will be waiting for dpdk's reply ack, then, qemu is always
>>> freezing. DPDK code works in the right way.
>>
>> I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
>> see how it could wait for the reply-ack if it didn't send the
>> VHOST_USER_SET_MEM_TABLE request before.
>>
>>> But the feature
>>> VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
>>> dpdk side in order to avoid the feature support of DPDK + qemu at
>>> the same time. if doing like that, MQ can works well. Once Qemu bugs
>>> have been fixed and upstreamed, we can enable it.
>>
>> The problem is for DPDK to detect whether bug is fixed in Qemu.
>> Maybe only way would be to have a new protocol feature flag, which is
>> not really its role.
> 
> Wouldn't that be an overkill, judging that REPLY_ACK is not a must
> feature?

Yes, maybe. But it was introduced to fix (possible) race conditions:
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html

Note that I planned to use this feature for the device IOTLB
implementation to let the backend decide whether it wants the IOTLB
misses synchronous or asynchronous. But I can still change the protocol
spec to make this behavior specific to this request.

Maxime
> 	--yliu
> 


* [PATCH v2] vhost: workaround MQ fails to startup
  2017-04-27  6:34 [PATCH] vhost: fix MQ fails to startup Zhiyong Yang
                   ` (2 preceding siblings ...)
  2017-04-27  8:12 ` Yuanhan Liu
@ 2017-04-27  9:41 ` Zhiyong Yang
  2017-04-27 10:00   ` Maxime Coquelin
  3 siblings, 1 reply; 20+ messages in thread
From: Zhiyong Yang @ 2017-04-27  9:41 UTC (permalink / raw)
  To: dev; +Cc: yuanhan.liu, ciara.loftus, maxime.coquelin, stable, Zhiyong Yang

Since DPDK 17.02, vhost used with QEMU 2.7 and above fails to set up
new connections when multiqueue (MQ) is negotiated. (A single queue
pair works well.)

The QEMU code has bugs that came in when
VHOST_USER_PROTOCOL_F_REPLY_ACK was introduced. When handling the
vhost message VHOST_USER_SET_MEM_TABLE for the second time, QEMU
does not actually send the message (it needs to be sent only once)
but still waits for DPDK's reply ack, so QEMU hangs forever. The
DPDK code behaves correctly.

VHOST_USER_PROTOCOL_F_REPLY_ACK therefore has to be disabled by
default on the DPDK side so that the feature is never enabled on both
the DPDK and QEMU sides at once. With that change, MQ works well.

Cc: stable@dpdk.org

Reported-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
---

changes in V2
1. use "workaround" instead of "fix" in the title.
2. add a simple comment suggested by yuanhan

 lib/librte_vhost/vhost_user.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 2ba22db..35ebd71 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -49,10 +49,14 @@
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK	3
 #define VHOST_USER_PROTOCOL_F_NET_MTU 4
 
+/*
+ * disable REPLY_ACK feature to workaround the buggy QEMU implementation.
+ * Proved buggy QEMU includes v2.7 - v2.9.
+ */
 #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
 					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
+					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
 
 typedef enum VhostUserRequest {
-- 
2.7.4


* Re: [PATCH v2] vhost: workaround MQ fails to startup
  2017-04-27  9:41 ` [PATCH v2] vhost: workaround " Zhiyong Yang
@ 2017-04-27 10:00   ` Maxime Coquelin
  2017-04-28  4:29     ` Yuanhan Liu
  0 siblings, 1 reply; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-27 10:00 UTC (permalink / raw)
  To: Zhiyong Yang, dev; +Cc: yuanhan.liu, ciara.loftus, stable



On 04/27/2017 11:41 AM, Zhiyong Yang wrote:
>    vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> new connection when negotiating to set MQ. (one queue pair works
> well).
>     Because there exist some bugs in qemu code when introducing
> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> doesn't send the messge (The message needs to be sent only once)but
> still will be waiting for dpdk's reply ack, then, qemu is always
> freezing. DPDK code indeed works in the right way.
>     The feature VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled
> by default at the dpdk side in order to avoid the feature support of
> DPDK + qemu at the same time. if doing like that, MQ can works well.
> 
> Cc: stable@dpdk.org
> 
> Reported-by: Ciara Loftus <ciara.loftus@intel.com>
> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> Tested-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
> 
> changes in V2
> 1. modify "workaround" instead of "fix" in the title.
> 2. add a simple comment suggested by yuanhan
> 
>   lib/librte_vhost/vhost_user.h | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-27  8:52     ` Maxime Coquelin
@ 2017-04-28  2:25       ` Yuanhan Liu
  2017-04-28  7:23         ` Maxime Coquelin
  0 siblings, 1 reply; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-28  2:25 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau,
	Michael S. Tsirkin

On Thu, Apr 27, 2017 at 10:52:20AM +0200, Maxime Coquelin wrote:
> 
> 
> On 04/27/2017 10:20 AM, Yuanhan Liu wrote:
> >On Thu, Apr 27, 2017 at 09:56:47AM +0200, Maxime Coquelin wrote:
> >>Hi Zhiyong,
> >>
> >>+Marc-André
> >>
> >>On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
> >>>vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> >>>new connection when negotiating to set MQ. (one queue pair works
> >>>well).Because there exist some bugs in qemu code when introducing
> >>>VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> >>>message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> >>>doesn't send the messge (The message needs to be sent only once)but
> >>>still will be waiting for dpdk's reply ack, then, qemu is always
> >>>freezing. DPDK code works in the right way.
> >>
> >>I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
> >>see how it could wait for the reply-ack if it didn't send the
> >>VHOST_USER_SET_MEM_TABLE request before.
> >>
> >>>But the feature
> >>>VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> >>>dpdk side in order to avoid the feature support of DPDK + qemu at
> >>>the same time. if doing like that, MQ can works well. Once Qemu bugs
> >>>have been fixed and upstreamed, we can enable it.
> >>
> >>The problem is for DPDK to detect whether bug is fixed in Qemu.
> >>Maybe only way would be to have a new protocol feature flag, which is
> >>not really its role.
> >
> >Wouldn't that be an overkill, judging that REPLY_ACK is not a must
> >feature?
> 
> Yes, maybe. But it was introduced to fix (possible) race conditions:
> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html

But AFAIK, that commit has been reverted:

    commit 94c9cb31c04737f86be29afefbff401cd23bc24d
    Author: Michael S. Tsirkin <mst@redhat.com>
    Date:   Mon Aug 15 16:35:24 2016 +0300
    
        Revert "vhost-user: Attempt to fix a race with set_mem_table."
    
        This reverts commit 28ed5ef16384f12500abd3647973ee21b03cbe23.
    
        I still think it's the right thing to do, but
        tests have been failing sporadically.
    
        Revert for now, and hope to fix it before the release.

> 
> Note that I planned to use this feature for the device IOTLB
> implementation to let the backend decide whether it wants the IOTLB
> misses synchronous or asynchronous. But I can still change the protocol
> spec to make this behavior specific to this request.

Maybe we could introduce a version message? With that, we could tell
whether the frontend has fixed the known bug or not.

Note that we already have the "version" info in the current vhost-user
spec. It's just 2 bits in the message "flags" field though, which is not
quite enough.
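
For reference, the header flags layout under discussion can be sketched as follows. The masks are assumed from the vhost-user spec (version in bits 0-1, reply in bit 2, need-reply in bit 3); the macro and helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Layout assumed from the vhost-user spec. */
#define FLAGS_VERSION_MASK    UINT32_C(0x3)		/* bits 0-1 */
#define FLAGS_REPLY_MASK      (UINT32_C(1) << 2)	/* bit 2 */
#define FLAGS_NEED_REPLY_MASK (UINT32_C(1) << 3)	/* bit 3 */

/* Hypothetical helper: extract the version from the flags field. */
static inline uint32_t flags_version(uint32_t flags)
{
	return flags & FLAGS_VERSION_MASK;
}

/* Hypothetical helper: build a request's flags. The version has to
 * fit in 2 bits, so only four distinct values are representable. */
static inline uint32_t make_flags(uint32_t version, int need_reply)
{
	assert(version <= FLAGS_VERSION_MASK);
	return version | (need_reply ? FLAGS_NEED_REPLY_MASK : 0);
}
```

With only values 0 to 3 available, the field has little room to tell individual frontend releases apart, which is the limitation noted above.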

	--yliu


* Re: [PATCH v2] vhost: workaround MQ fails to startup
  2017-04-27 10:00   ` Maxime Coquelin
@ 2017-04-28  4:29     ` Yuanhan Liu
  2017-05-10  2:07       ` Yang, Zhiyong
  0 siblings, 1 reply; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-28  4:29 UTC (permalink / raw)
  To: Maxime Coquelin; +Cc: Zhiyong Yang, dev, ciara.loftus, stable

On Thu, Apr 27, 2017 at 12:00:52PM +0200, Maxime Coquelin wrote:
> 
> 
> On 04/27/2017 11:41 AM, Zhiyong Yang wrote:
> >   vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> >new connection when negotiating to set MQ. (one queue pair works
> >well).
> >    Because there exist some bugs in qemu code when introducing
> >VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> >message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> >doesn't send the messge (The message needs to be sent only once)but
> >still will be waiting for dpdk's reply ack, then, qemu is always
> >freezing. DPDK code indeed works in the right way.
> >    The feature VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled
> >by default at the dpdk side in order to avoid the feature support of
> >DPDK + qemu at the same time. if doing like that, MQ can works well.
> >
> >Cc: stable@dpdk.org
> >
> >Reported-by: Ciara Loftus <ciara.loftus@intel.com>
> >Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> >Tested-by: Ciara Loftus <ciara.loftus@intel.com>
> >---
> >
> >changes in V2
> >1. modify "workaround" instead of "fix" in the title.
> >2. add a simple comment suggested by yuanhan
> >
> >  lib/librte_vhost/vhost_user.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> 
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Applied to dpdk-next-virtio.

Thanks.
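
For reference, a rough sketch of what disabling the feature amounts to on
the vhost side. It assumes the protocol feature bit numbers below match the
vhost-user spec; ADVERTISED_PROTOCOL_FEATURES and feature_advertised() are
hypothetical names used only for this illustration, not the code that was
applied:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Protocol feature bit numbers (assumed from the vhost-user spec). */
#define VHOST_USER_PROTOCOL_F_MQ        0
#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
#define VHOST_USER_PROTOCOL_F_RARP      2
#define VHOST_USER_PROTOCOL_F_REPLY_ACK 3
#define VHOST_USER_PROTOCOL_F_NET_MTU   4

/* Hypothetical name: the feature set advertised with REPLY_ACK left out. */
#define ADVERTISED_PROTOCOL_FEATURES \
	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
	 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
	 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))

/* Hypothetical helper: check whether a feature bit is advertised. */
static inline bool feature_advertised(uint64_t features, unsigned int bit)
{
	assert(bit < 64); /* shifting by 64 or more would be undefined */
	return (features >> bit) & 1ULL;
}
```

Since the frontend can only negotiate bits the backend advertises, it would
never wait for a reply ack from a backend built this way.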

	--yliu


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-28  2:25       ` Yuanhan Liu
@ 2017-04-28  7:23         ` Maxime Coquelin
  2017-04-28  7:35           ` Yuanhan Liu
  0 siblings, 1 reply; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-28  7:23 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau,
	Michael S. Tsirkin



On 04/28/2017 04:25 AM, Yuanhan Liu wrote:
> On Thu, Apr 27, 2017 at 10:52:20AM +0200, Maxime Coquelin wrote:
>>
>>
>> On 04/27/2017 10:20 AM, Yuanhan Liu wrote:
>>> On Thu, Apr 27, 2017 at 09:56:47AM +0200, Maxime Coquelin wrote:
>>>> Hi Zhiyong,
>>>>
>>>> +Marc-André
>>>>
>>>> On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
>>>>> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
>>>>> new connection when negotiating to set MQ. (one queue pair works
>>>>> well).Because there exist some bugs in qemu code when introducing
>>>>> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
>>>>> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
>>>>> doesn't send the messge (The message needs to be sent only once)but
>>>>> still will be waiting for dpdk's reply ack, then, qemu is always
>>>>> freezing. DPDK code works in the right way.
>>>>
>>>> I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
>>>> see how it could wait for the reply-ack if it didn't send the
>>>> VHOST_USER_SET_MEM_TABLE request before.
>>>>
>>>>> But the feature
>>>>> VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
>>>>> dpdk side in order to avoid the feature support of DPDK + qemu at
>>>>> the same time. if doing like that, MQ can works well. Once Qemu bugs
>>>>> have been fixed and upstreamed, we can enable it.
>>>>
>>>> The problem is for DPDK to detect whether bug is fixed in Qemu.
>>>> Maybe only way would be to have a new protocol feature flag, which is
>>>> not really its role.
>>>
>>> Wouldn't that be an overkill, judging that REPLY_ACK is not a must
>>> feature?
>>
>> Yes, maybe. But it was introduced to fix (possible) race conditions:
>> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html
> 
> But AFAIK, that commit has been reverted:
> 
>      commit 94c9cb31c04737f86be29afefbff401cd23bc24d
>      Author: Michael S. Tsirkin <mst@redhat.com>
>      Date:   Mon Aug 15 16:35:24 2016 +0300
>      
>          Revert "vhost-user: Attempt to fix a race with set_mem_table."
>      
>          This reverts commit 28ed5ef16384f12500abd3647973ee21b03cbe23.
>      
>          I still think it's the right thing to do, but
>          tests have been failing sporadically.
>      
>          Revert for now, and hope to fix it before the release.

No, what has been reverted is a workaround used when the REPLY_ACK
protocol feature has not been negotiated.

Instead of waiting for the backend to send the ack, the workaround
consisted of sending a GET_FEATURES request after having sent the
SET_MEM_TABLE request, in order to ensure that the SET_MEM_TABLE request
had been handled first.

The problem is that it sometimes created a deadlock when running
QEMU's vhost-user-test in TCG mode.
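
A simplified model of that workaround, assuming the backend handles
requests strictly in order. The types and function names are made up for
this sketch and do not come from QEMU or DPDK:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum req { REQ_SET_MEM_TABLE, REQ_GET_FEATURES };

/* Hypothetical backend state for the model. */
struct backend {
	bool mem_table_applied;
	uint64_t features;
};

/* Handle one request; return true only if the request produces a reply. */
static bool backend_handle(struct backend *b, enum req r, uint64_t *reply)
{
	switch (r) {
	case REQ_SET_MEM_TABLE:
		b->mem_table_applied = true; /* no reply without REPLY_ACK */
		return false;
	case REQ_GET_FEATURES:
		*reply = b->features;        /* always answered */
		return true;
	}
	return false;
}

/*
 * Frontend side: send SET_MEM_TABLE, then GET_FEATURES. Because requests
 * are handled in order, the GET_FEATURES reply acts as a barrier showing
 * that SET_MEM_TABLE was processed.
 */
static bool frontend_set_mem_table_sync(struct backend *b)
{
	const enum req queue[] = { REQ_SET_MEM_TABLE, REQ_GET_FEATURES };
	uint64_t reply = 0;
	bool got_reply = false;

	assert(b != NULL);
	for (size_t i = 0; i < sizeof(queue) / sizeof(queue[0]); i++)
		got_reply = backend_handle(b, queue[i], &reply);

	return got_reply && b->mem_table_applied;
}
```

The model is synchronous, so it cannot reproduce the deadlock itself; it
only shows why a reply to the second request implies the first was handled.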

>>
>> Note that I planned to use this feature for the device IOTLB
>> implementation to let the backend decide whether it wants the IOTLB
>> misses synchronous or asynchronous. But I can still change the protocol
>> spec to make this behavior specific to this request.
> 
> Maybe we could introduce a version message? With that, we could tell
> whether the frontend has fixed the known bug or not.

That's a possibility, but this is not really the role of a protocol
version: in this case, the protocol does not change, only an
implementation does.

> Note that we already has the "version" info in current vhost-user spec.
> It's just 2 bits in the message "flag" field though, which is not quite
> enough.

Indeed, it does not leave room for lots of bugs :)

Thanks,
Maxime
> 	--yliu
> 


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-28  7:23         ` Maxime Coquelin
@ 2017-04-28  7:35           ` Yuanhan Liu
  2017-04-28  7:39             ` Yuanhan Liu
  2017-04-28  7:57             ` Maxime Coquelin
  0 siblings, 2 replies; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-28  7:35 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau,
	Michael S. Tsirkin

On Fri, Apr 28, 2017 at 09:23:54AM +0200, Maxime Coquelin wrote:
> 
> 
> On 04/28/2017 04:25 AM, Yuanhan Liu wrote:
> >On Thu, Apr 27, 2017 at 10:52:20AM +0200, Maxime Coquelin wrote:
> >>
> >>
> >>On 04/27/2017 10:20 AM, Yuanhan Liu wrote:
> >>>On Thu, Apr 27, 2017 at 09:56:47AM +0200, Maxime Coquelin wrote:
> >>>>Hi Zhiyong,
> >>>>
> >>>>+Marc-André
> >>>>
> >>>>On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
> >>>>>vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> >>>>>new connection when negotiating to set MQ. (one queue pair works
> >>>>>well).Because there exist some bugs in qemu code when introducing
> >>>>>VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
> >>>>>message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> >>>>>doesn't send the messge (The message needs to be sent only once)but
> >>>>>still will be waiting for dpdk's reply ack, then, qemu is always
> >>>>>freezing. DPDK code works in the right way.
> >>>>
> >>>>I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
> >>>>see how it could wait for the reply-ack if it didn't send the
> >>>>VHOST_USER_SET_MEM_TABLE request before.
> >>>>
> >>>>>But the feature
> >>>>>VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
> >>>>>dpdk side in order to avoid the feature support of DPDK + qemu at
> >>>>>the same time. if doing like that, MQ can works well. Once Qemu bugs
> >>>>>have been fixed and upstreamed, we can enable it.
> >>>>
> >>>>The problem is for DPDK to detect whether bug is fixed in Qemu.
> >>>>Maybe only way would be to have a new protocol feature flag, which is
> >>>>not really its role.
> >>>
> >>>Wouldn't that be an overkill, judging that REPLY_ACK is not a must
> >>>feature?
> >>
> >>Yes, maybe. But it was introduced to fix (possible) race conditions:
> >>https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html
> >
> >But AFAIK, that commit has been reverted:
> >
> >     commit 94c9cb31c04737f86be29afefbff401cd23bc24d
> >     Author: Michael S. Tsirkin <mst@redhat.com>
> >     Date:   Mon Aug 15 16:35:24 2016 +0300
> >         Revert "vhost-user: Attempt to fix a race with set_mem_table."
> >         This reverts commit 28ed5ef16384f12500abd3647973ee21b03cbe23.
> >         I still think it's the right thing to do, but
> >         tests have been failing sporadically.
> >         Revert for now, and hope to fix it before the release.
> 
> No, what has been reverted is a workaround when REPLY_ACK protocol
> feature has not been negotiated.

Good to know.

> 
> Instead of waiting for the backend to send the ack, the workaround
> consisted in sending a GET_FEATURES request after having sent the
> SET_MEM_TABLE request, in order to ensure SET_MEM_TABLE request handling
> was done before.
> 
> The problem is that it sometimes created a deadlock when when running
> QEMU's vhost-user-test in TCG mode.
> 
> >>
> >>Note that I planned to use this feature for the device IOTLB
> >>implementation to let the backend decide whether it wants the IOTLB
> >>misses synchronous or asynchronous. But I can still change the protocol
> >>spec to make this behavior specific to this request.
> >
> >Maybe we could introduce a version message? With that, we could tell
> >whether the frontend has fixed the known bug or not.
> 
> That's a possibility, but this is not really the role of a protocol
> version. As in this case, the protocol does not change, just an
> implementation.

Maybe. Well, you could think of it this way: we do increase the version
when we make a new release (with bugs being fixed).

Or, we could also make the version two parts: major and minor. We increase
major for major updates (say, new features, etc). We increase minor for
bug fixes.

The only thing that doesn't make much sense is that the bug actually
comes from the QEMU implementation, not from the vhost-user spec. Given
that, it may make more sense to introduce a new message to carry
the frontend version, something like a string "QEMU v2.8".

	--yliu
> 
> >Note that we already has the "version" info in current vhost-user spec.
> >It's just 2 bits in the message "flag" field though, which is not quite
> >enough.
> 
> Indeed, it does not let room for lots of bugs :)
> 
> Thanks,
> Maxime
> >	--yliu
> >


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-28  7:35           ` Yuanhan Liu
@ 2017-04-28  7:39             ` Yuanhan Liu
  2017-04-28  7:57             ` Maxime Coquelin
  1 sibling, 0 replies; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-28  7:39 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau,
	Michael S. Tsirkin

On Fri, Apr 28, 2017 at 03:35:53PM +0800, Yuanhan Liu wrote:
> > >Maybe we could introduce a version message? With that, we could tell
> > >whether the frontend has fixed the known bug or not.
> > 
> > That's a possibility, but this is not really the role of a protocol
> > version. As in this case, the protocol does not change, just an
> > implementation.
> 
> Maybe. Well, you might could think this way: we do increase the version
> when we make a new release (with bugs being fixed).
> 
> Or, we could also make the version two parts: major and minor. We increase
> major for major updates (say, new features, etc). We increase minor for
> bug fixes.

Nah, just forget the above two paragraphs; I overlooked it. You only need
to care about the one below.

	--yliu

> The only thing that doesn't make too much sense is the bug is actually
> from the QEMU implementation but not from the vhost-user spec. Talking
> about that, it may make more sense to introduce a new message to carry
> the frontend version, something like a string "QEMU v2.8".


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-28  7:35           ` Yuanhan Liu
  2017-04-28  7:39             ` Yuanhan Liu
@ 2017-04-28  7:57             ` Maxime Coquelin
  2017-04-28  8:00               ` Yuanhan Liu
  1 sibling, 1 reply; 20+ messages in thread
From: Maxime Coquelin @ 2017-04-28  7:57 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau,
	Michael S. Tsirkin



On 04/28/2017 09:35 AM, Yuanhan Liu wrote:
> On Fri, Apr 28, 2017 at 09:23:54AM +0200, Maxime Coquelin wrote:
>>
>>
>> On 04/28/2017 04:25 AM, Yuanhan Liu wrote:
>>> On Thu, Apr 27, 2017 at 10:52:20AM +0200, Maxime Coquelin wrote:
>>>>
>>>>
>>>> On 04/27/2017 10:20 AM, Yuanhan Liu wrote:
>>>>> On Thu, Apr 27, 2017 at 09:56:47AM +0200, Maxime Coquelin wrote:
>>>>>> Hi Zhiyong,
>>>>>>
>>>>>> +Marc-André
>>>>>>
>>>>>> On 04/27/2017 08:34 AM, Zhiyong Yang wrote:
>>>>>>> vhost since dpdk17.02 + qemu2.7 and above will cause failures of
>>>>>>> new connection when negotiating to set MQ. (one queue pair works
>>>>>>> well).Because there exist some bugs in qemu code when introducing
>>>>>>> VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the vhost
>>>>>>> message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
>>>>>>> doesn't send the messge (The message needs to be sent only once)but
>>>>>>> still will be waiting for dpdk's reply ack, then, qemu is always
>>>>>>> freezing. DPDK code works in the right way.
>>>>>>
>>>>>> I'm looking at Qemu's vhost_user_set_mem_table() function, but fail to
>>>>>> see how it could wait for the reply-ack if it didn't send the
>>>>>> VHOST_USER_SET_MEM_TABLE request before.
>>>>>>
>>>>>>> But the feature
>>>>>>> VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by default at the
>>>>>>> dpdk side in order to avoid the feature support of DPDK + qemu at
>>>>>>> the same time. if doing like that, MQ can works well. Once Qemu bugs
>>>>>>> have been fixed and upstreamed, we can enable it.
>>>>>>
>>>>>> The problem is for DPDK to detect whether bug is fixed in Qemu.
>>>>>> Maybe only way would be to have a new protocol feature flag, which is
>>>>>> not really its role.
>>>>>
>>>>> Wouldn't that be an overkill, judging that REPLY_ACK is not a must
>>>>> feature?
>>>>
>>>> Yes, maybe. But it was introduced to fix (possible) race conditions:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html
>>>
>>> But AFAIK, that commit has been reverted:
>>>
>>>      commit 94c9cb31c04737f86be29afefbff401cd23bc24d
>>>      Author: Michael S. Tsirkin <mst@redhat.com>
>>>      Date:   Mon Aug 15 16:35:24 2016 +0300
>>>          Revert "vhost-user: Attempt to fix a race with set_mem_table."
>>>          This reverts commit 28ed5ef16384f12500abd3647973ee21b03cbe23.
>>>          I still think it's the right thing to do, but
>>>          tests have been failing sporadically.
>>>          Revert for now, and hope to fix it before the release.
>>
>> No, what has been reverted is a workaround when REPLY_ACK protocol
>> feature has not been negotiated.
> 
> Good to know.
> 
>>
>> Instead of waiting for the backend to send the ack, the workaround
>> consisted in sending a GET_FEATURES request after having sent the
>> SET_MEM_TABLE request, in order to ensure SET_MEM_TABLE request handling
>> was done before.
>>
>> The problem is that it sometimes created a deadlock when when running
>> QEMU's vhost-user-test in TCG mode.
>>
>>>>
>>>> Note that I planned to use this feature for the device IOTLB
>>>> implementation to let the backend decide whether it wants the IOTLB
>>>> misses synchronous or asynchronous. But I can still change the protocol
>>>> spec to make this behavior specific to this request.
>>>
>>> Maybe we could introduce a version message? With that, we could tell
>>> whether the frontend has fixed the known bug or not.
>>
>> That's a possibility, but this is not really the role of a protocol
>> version. As in this case, the protocol does not change, just an
>> implementation.
> 
> Maybe. Well, you might could think this way: we do increase the version
> when we make a new release (with bugs being fixed).
> 
> Or, we could also make the version two parts: major and minor. We increase
> major for major updates (say, new features, etc). We increase minor for
> bug fixes.
> 
> The only thing that doesn't make too much sense is the bug is actually
> from the QEMU implementation but not from the vhost-user spec. 

Yes, I was maybe not clear, but that's what I meant when saying that was
not the role of the protocol version.

> Talking
> about that, it may make more sense to introduce a new message to carry
> the frontend version, something like a string "QEMU v2.8".

I don't think this is a good idea, as it would create more problems than
it would solve. Indeed, you would also need the distro version since, for
example, Red Hat could backport the fix to its QEMU v2.6 package, Ubuntu
to its v2.7, etc.
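
To illustrate with a small sketch: a check based only on the upstream
version number misjudges a distro build that carries a backport. The struct,
the function, and the "fixed in" version passed by the caller are all
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a frontend build. */
struct frontend_build {
	int major;
	int minor;
	bool has_reply_ack_fix; /* ground truth, e.g. after a distro backport */
};

/*
 * Hypothetical policy: trust REPLY_ACK only if the reported upstream
 * version is at least fixed_major.fixed_minor.
 */
static bool version_says_fixed(const struct frontend_build *f,
			       int fixed_major, int fixed_minor)
{
	assert(f != NULL);
	if (f->major != fixed_major)
		return f->major > fixed_major;
	return f->minor >= fixed_minor;
}
```

A build reporting v2.6 with the fix backported would be rejected by such a
check even though it behaves correctly.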

> 
> 	--yliu
>>
>>> Note that we already has the "version" info in current vhost-user spec.
>>> It's just 2 bits in the message "flag" field though, which is not quite
>>> enough.
>>
>> Indeed, it does not let room for lots of bugs :)
>>
>> Thanks,
>> Maxime
>>> 	--yliu
>>>


* Re: [PATCH] vhost: fix MQ fails to startup
  2017-04-28  7:57             ` Maxime Coquelin
@ 2017-04-28  8:00               ` Yuanhan Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Yuanhan Liu @ 2017-04-28  8:00 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Zhiyong Yang, dev, ciara.loftus, Marc-André Lureau,
	Michael S. Tsirkin

On Fri, Apr 28, 2017 at 09:57:20AM +0200, Maxime Coquelin wrote:
> >>>Maybe we could introduce a version message? With that, we could tell
> >>>whether the frontend has fixed the known bug or not.
> >>
> >>That's a possibility, but this is not really the role of a protocol
> >>version. As in this case, the protocol does not change, just an
> >>implementation.
> >
> >Maybe. Well, you might could think this way: we do increase the version
> >when we make a new release (with bugs being fixed).
> >
> >Or, we could also make the version two parts: major and minor. We increase
> >major for major updates (say, new features, etc). We increase minor for
> >bug fixes.
> >
> >The only thing that doesn't make too much sense is the bug is actually
> >from the QEMU implementation but not from the vhost-user spec.
> 
> Yes, I was maybe not clear, but that's what I meant when saying that was
> not the role of the protocol version.

Yes, I realized it later: I overlooked it. Sorry.

> >Talking
> >about that, it may make more sense to introduce a new message to carry
> >the frontend version, something like a string "QEMU v2.8".
> 
> I don't think this is a good idea as it would create more problems that it
> would solve. Indeed, you would need also the distro version, as for
> example, Red Hat could backport the fix in its QEMU v2.6 package, Ubuntu
> in its v2.7, etc...

I had thought of stable releases, say "QEMU v2.8.1". But you are right,
it gets way more complex when distro backports are considered :(

	--yliu


* Re: [PATCH v2] vhost: workaround MQ fails to startup
  2017-04-28  4:29     ` Yuanhan Liu
@ 2017-05-10  2:07       ` Yang, Zhiyong
  0 siblings, 0 replies; 20+ messages in thread
From: Yang, Zhiyong @ 2017-05-10  2:07 UTC (permalink / raw)
  To: Yuanhan Liu, Maxime Coquelin; +Cc: dev, Loftus, Ciara, stable

Hi, all:

The patch that fixes the QEMU bug has been accepted by the QEMU community:
http://patchwork.ozlabs.org/patch/760118/

Thanks
Zhiyong

> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Friday, April 28, 2017 12:29 PM
> To: Maxime Coquelin <maxime.coquelin@redhat.com>
> Cc: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org; Loftus, Ciara
> <ciara.loftus@intel.com>; stable@dpdk.org
> Subject: Re: [PATCH v2] vhost: workaround MQ fails to startup
> 
> On Thu, Apr 27, 2017 at 12:00:52PM +0200, Maxime Coquelin wrote:
> >
> >
> > On 04/27/2017 11:41 AM, Zhiyong Yang wrote:
> > >   vhost since dpdk17.02 + qemu2.7 and above will cause failures of
> > >new connection when negotiating to set MQ. (one queue pair works
> > >well).
> > >    Because there exist some bugs in qemu code when introducing
> > >VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. when dealing with the
> vhost
> > >message VHOST_USER_SET_MEM_TABLE for the second time, qemu indeed
> > >doesn't send the messge (The message needs to be sent only once)but
> > >still will be waiting for dpdk's reply ack, then, qemu is always
> > >freezing. DPDK code indeed works in the right way.
> > >    The feature VHOST_USER_PROTOCOL_F_REPLY_ACK has to be disabled by
> > >default at the dpdk side in order to avoid the feature support of
> > >DPDK + qemu at the same time. if doing like that, MQ can works well.
> > >
> > >Cc: stable@dpdk.org
> > >
> > >Reported-by: Ciara Loftus <ciara.loftus@intel.com>
> > >Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> > >Tested-by: Ciara Loftus <ciara.loftus@intel.com>
> > >---
> > >
> > >changes in V2
> > >1. modify "workaround" instead of "fix" in the title.
> > >2. add a simple comment suggested by yuanhan
> > >
> > >  lib/librte_vhost/vhost_user.h | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> >
> > Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> Applied to dpdk-next-virtio.
> 
> Thanks.
> 
> 	--yliu


end of thread

Thread overview: 20+ messages
2017-04-27  6:34 [PATCH] vhost: fix MQ fails to startup Zhiyong Yang
2017-04-27  7:41 ` Loftus, Ciara
2017-04-27  7:56 ` Maxime Coquelin
2017-04-27  8:05   ` Maxime Coquelin
2017-04-27  8:24     ` Yang, Zhiyong
2017-04-27  8:32       ` Maxime Coquelin
2017-04-27  8:20   ` Yuanhan Liu
2017-04-27  8:52     ` Maxime Coquelin
2017-04-28  2:25       ` Yuanhan Liu
2017-04-28  7:23         ` Maxime Coquelin
2017-04-28  7:35           ` Yuanhan Liu
2017-04-28  7:39             ` Yuanhan Liu
2017-04-28  7:57             ` Maxime Coquelin
2017-04-28  8:00               ` Yuanhan Liu
2017-04-27  8:12 ` Yuanhan Liu
2017-04-27  8:32   ` Yang, Zhiyong
2017-04-27  9:41 ` [PATCH v2] vhost: workaround " Zhiyong Yang
2017-04-27 10:00   ` Maxime Coquelin
2017-04-28  4:29     ` Yuanhan Liu
2017-05-10  2:07       ` Yang, Zhiyong
