All of lore.kernel.org
 help / color / mirror / Atom feed
* RING_HAS_UNCONSUMED_REQUESTS oddness
@ 2014-03-06 15:47 Zoltan Kiss
  2014-03-06 15:53 ` Ian Campbell
  2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell
  0 siblings, 2 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-06 15:47 UTC (permalink / raw)
  To: xen-devel, Wei Liu, Ian Campbell

Hi,

I was wondering for a while why this macro looks like this:

#define RING_HAS_UNCONSUMED_REQUESTS(_r)				\
    ({									\
	unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;	\
	unsigned int rsp = RING_SIZE(_r) -				\
			   ((_r)->req_cons - (_r)->rsp_prod_pvt);	\
	req < rsp ? req : rsp;						\
    })

I would expect to check prod - cons, like RING_HAS_UNCONSUMED_RESPONSES does:

#define RING_HAS_UNCONSUMED_RESPONSES(_r)				\
    ((_r)->sring->rsp_prod - (_r)->rsp_cons)

By my understanding, there is no way rsp could be smaller than req, so
there is no point having this. Am I missing something?

Regards,

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss
@ 2014-03-06 15:53 ` Ian Campbell
  2014-03-06 16:31   ` Zoltan Kiss
  2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell
  1 sibling, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-06 15:53 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu

On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> By my understanding, there is no way rsp could be smaller than req, so
> there is no point having this. Am I missing something?

It happens during wraparound, i.e. after req has wrapped but rsp hasn't
yet.

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 15:53 ` Ian Campbell
@ 2014-03-06 16:31   ` Zoltan Kiss
  2014-03-06 17:30     ` Tim Deegan
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-06 16:31 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu

On 06/03/14 15:53, Ian Campbell wrote:
> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>> By my understanding, there is no way rsp could be smaller than req, so
>> there is no point having this. Am I missing something?
>
> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> yet.

The name of the macro suggest we are interested whether the ring has 
unconsumed requests, and netback uses it that way. The answer to that 
question is req_prod - req_cons. And it works if prod wrapped but cons 
didn't.
rsp calculates the number of "consumed but not responded" requests (it 
also works well if req_cons wrapped but rsp_prod_pvt didn't), then 
subtract it from the ring size. So it gives the number of unconsumed 
responses + unconsumed requests + unused slots. Why do we care about it? 
And as it includes the number of unconsumed requests, it couldn't be 
smaller than that, could it?

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 16:31   ` Zoltan Kiss
@ 2014-03-06 17:30     ` Tim Deegan
  2014-03-06 21:39       ` Zoltan Kiss
  2014-03-13 16:38       ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan
  0 siblings, 2 replies; 53+ messages in thread
From: Tim Deegan @ 2014-03-06 17:30 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Ian Campbell

At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
> On 06/03/14 15:53, Ian Campbell wrote:
> > On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> >> By my understanding, there is no way rsp could be smaller than req, so
> >> there is no point having this. Am I missing something?
> >
> > It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> > yet.
> 
> The name of the macro suggest we are interested whether the ring has 
> unconsumed requests, and netback uses it that way. The answer to that 
> question is req_prod - req_cons. And it works if prod wrapped but cons 
> didn't.

Yes.

> rsp calculates the number of "consumed but not responded" requests (it 
> also works well if req_cons wrapped but rsp_prod_pvt didn't), then 
> subtract it from the ring size.

That is indeed an odd thing to check, since it seems like it could only
be relevant if the request producer overran the response producer.
It's been there in one form or another since the original ring.h,
and RING_REQUEST_CONS_OVERFLOW does something similar.

I can't remember the original reasoning, and so I'm reluctant to
suggest removing it without some more eyes on the code...

Tim.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 17:30     ` Tim Deegan
@ 2014-03-06 21:39       ` Zoltan Kiss
  2014-03-07  9:23         ` Paul Durrant
                           ` (2 more replies)
  2014-03-13 16:38       ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan
  1 sibling, 3 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-06 21:39 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Wei Liu, Ian Campbell

On 06/03/14 17:30, Tim Deegan wrote:
> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
>> On 06/03/14 15:53, Ian Campbell wrote:
>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>>>> By my understanding, there is no way rsp could be smaller than req, so
>>>> there is no point having this. Am I missing something?
>>>
>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
>>> yet.
>>
>> The name of the macro suggest we are interested whether the ring has
>> unconsumed requests, and netback uses it that way. The answer to that
>> question is req_prod - req_cons. And it works if prod wrapped but cons
>> didn't.
>
> Yes.
>
>> rsp calculates the number of "consumed but not responded" requests (it
>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
>> subtract it from the ring size.
>
> That is indeed an odd thing to check, since it seems like it could only
> be relevant if the request producer overran the response producer.
> It's been there in one form or another since the original ring.h,
> and RING_REQUEST_CONS_OVERFLOW does something similar.
>
> I can't remember the original reasoning, and so I'm reluctant to
> suggest removing it without some more eyes on the code...

I've added the following printk before the "req < rsp" part:

	if (rsp < req)							\
		pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", req, 
rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \

And it gave me such results:

xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod 
1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755

So it can happen that req_prod is behind req_cons, sometimes even with 
17! But it always happen in this callback of my new grant mapping 
series, which runs outside the NAPI instance. My theory why this can happen:
- callback reads req_prod
- frontend writes it
- backend picks it up, and consumes those slots
- callback reads req_cons

So req can be near UINT_MAX if you call this macro outside the backend. 
The only place where the actual return value of this macro matters is 
xenvif_tx_build_gops, and it should be correct there. At other places we 
are only looking for the fact whether the ring has unconsumed requests 
or not. If prod is smaller than cons, we clearly read a wrong value. I 
think what we can do:
1. try again until its correct
2. just return a non-zero value, it shouldn't cause too much trouble if 
we say yes here
3. we can't see rsp_cons, so try to figure out if the ring is full of 
consumed but not responded requests, and return zero then, otherwise a 
positive value. That's what we do know.

Does this make sense? Should we rather go option 1? Should I post a 
comment patch to document this, and spare a few hours for future 
generations? :)

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 21:39       ` Zoltan Kiss
@ 2014-03-07  9:23         ` Paul Durrant
  2014-03-07 17:43           ` Zoltan Kiss
  2014-03-07 12:02         ` Wei Liu
  2014-03-11 15:55         ` Ian Campbell
  2 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-07  9:23 UTC (permalink / raw)
  To: Zoltan Kiss, Tim (Xen.org); +Cc: xen-devel, Wei Liu, Ian Campbell

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Zoltan Kiss
> Sent: 06 March 2014 21:40
> To: Tim (Xen.org)
> Cc: xen-devel@lists.xenproject.org; Wei Liu; Ian Campbell
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On 06/03/14 17:30, Tim Deegan wrote:
> > At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
> >> On 06/03/14 15:53, Ian Campbell wrote:
> >>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> >>>> By my understanding, there is no way rsp could be smaller than req, so
> >>>> there is no point having this. Am I missing something?
> >>>
> >>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> >>> yet.
> >>
> >> The name of the macro suggest we are interested whether the ring has
> >> unconsumed requests, and netback uses it that way. The answer to that
> >> question is req_prod - req_cons. And it works if prod wrapped but cons
> >> didn't.
> >
> > Yes.
> >
> >> rsp calculates the number of "consumed but not responded" requests (it
> >> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
> >> subtract it from the ring size.
> >
> > That is indeed an odd thing to check, since it seems like it could only
> > be relevant if the request producer overran the response producer.
> > It's been there in one form or another since the original ring.h,
> > and RING_REQUEST_CONS_OVERFLOW does something similar.
> >
> > I can't remember the original reasoning, and so I'm reluctant to
> > suggest removing it without some more eyes on the code...
> 
> I've added the following printk before the "req < rsp" part:
> 
> 	if (rsp < req)							\
> 		pr_err("req %u rsp %u req_prod %u req_cons %u
> rsp_prod_pvt %u\n", req,
> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \
> 
> And it gave me such results:
> 
> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
> 
> So it can happen that req_prod is behind req_cons, sometimes even with
> 17! But it always happen in this callback of my new grant mapping
> series, which runs outside the NAPI instance. My theory why this can
> happen:
> - callback reads req_prod
> - frontend writes it
> - backend picks it up, and consumes those slots
> - callback reads req_cons
> 
> So req can be near UINT_MAX if you call this macro outside the backend.
> The only place where the actual return value of this macro matters is
> xenvif_tx_build_gops, and it should be correct there. At other places we
> are only looking for the fact whether the ring has unconsumed requests
> or not. If prod is smaller than cons, we clearly read a wrong value. I
> think what we can do:
> 1. try again until its correct
> 2. just return a non-zero value, it shouldn't cause too much trouble if
> we say yes here
> 3. we can't see rsp_cons, so try to figure out if the ring is full of
> consumed but not responded requests, and return zero then, otherwise a
> positive value. That's what we do know.
> 
> Does this make sense? Should we rather go option 1? Should I post a
> comment patch to document this, and spare a few hours for future
> generations? :)
> 

The number of responses on the ring is clearly immaterial if you're only looking for unconsumed requests. So I think it should something analogous to RING_HAS_UNCONSUMED_RESPONSES:

#define RING_HAS_UNCONSUMED_REQUESTS(_r) \
    ((_r)->sring->req_prod - (_r)->req_cons)

Any racing between these two values is the responsibility of the individual frontend/backend and not something that this macro needs to care about.

  Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 21:39       ` Zoltan Kiss
  2014-03-07  9:23         ` Paul Durrant
@ 2014-03-07 12:02         ` Wei Liu
  2014-03-07 18:58           ` Zoltan Kiss
  2014-03-11 15:55         ` Ian Campbell
  2 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2014-03-07 12:02 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Tim Deegan, Ian Campbell, Wei Liu

On Thu, Mar 06, 2014 at 09:39:40PM +0000, Zoltan Kiss wrote:
> On 06/03/14 17:30, Tim Deegan wrote:
> >At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
> >>On 06/03/14 15:53, Ian Campbell wrote:
> >>>On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> >>>>By my understanding, there is no way rsp could be smaller than req, so
> >>>>there is no point having this. Am I missing something?
> >>>
> >>>It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> >>>yet.
> >>
> >>The name of the macro suggest we are interested whether the ring has
> >>unconsumed requests, and netback uses it that way. The answer to that
> >>question is req_prod - req_cons. And it works if prod wrapped but cons
> >>didn't.
> >
> >Yes.
> >
> >>rsp calculates the number of "consumed but not responded" requests (it
> >>also works well if req_cons wrapped but rsp_prod_pvt didn't), then
> >>subtract it from the ring size.
> >
> >That is indeed an odd thing to check, since it seems like it could only
> >be relevant if the request producer overran the response producer.
> >It's been there in one form or another since the original ring.h,
> >and RING_REQUEST_CONS_OVERFLOW does something similar.
> >
> >I can't remember the original reasoning, and so I'm reluctant to
> >suggest removing it without some more eyes on the code...
> 
> I've added the following printk before the "req < rsp" part:
> 
> 	if (rsp < req)							\
> 		pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n",
> req, rsp, (_r)->sring->req_prod, (_r)->req_cons,
> (_r)->rsp_prod_pvt); \
> 
> And it gave me such results:
> 
> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
> 
> So it can happen that req_prod is behind req_cons, sometimes even
> with 17! But it always happen in this callback of my new grant
> mapping series, which runs outside the NAPI instance. My theory why
> this can happen:
> - callback reads req_prod
> - frontend writes it
> - backend picks it up, and consumes those slots
> - callback reads req_cons
> 

So there's three parties accessing the ring? I *think* the ring is only
designed to be lockfree for two parties so I don't know whether this
model is legit.

Wei.

> So req can be near UINT_MAX if you call this macro outside the
> backend. The only place where the actual return value of this macro
> matters is xenvif_tx_build_gops, and it should be correct there. At
> other places we are only looking for the fact whether the ring has
> unconsumed requests or not. If prod is smaller than cons, we clearly
> read a wrong value. I think what we can do:
> 1. try again until its correct
> 2. just return a non-zero value, it shouldn't cause too much trouble
> if we say yes here
> 3. we can't see rsp_cons, so try to figure out if the ring is full
> of consumed but not responded requests, and return zero then,
> otherwise a positive value. That's what we do know.
> 
> Does this make sense? Should we rather go option 1? Should I post a
> comment patch to document this, and spare a few hours for future
> generations? :)
> 
> Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-07  9:23         ` Paul Durrant
@ 2014-03-07 17:43           ` Zoltan Kiss
  0 siblings, 0 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-07 17:43 UTC (permalink / raw)
  To: Paul Durrant, Tim (Xen.org); +Cc: xen-devel, Wei Liu, Ian Campbell

On 07/03/14 09:23, Paul Durrant wrote:
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> bounces@lists.xen.org] On Behalf Of Zoltan Kiss
>> Sent: 06 March 2014 21:40
>> To: Tim (Xen.org)
>> Cc: xen-devel@lists.xenproject.org; Wei Liu; Ian Campbell
>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
>>
>> On 06/03/14 17:30, Tim Deegan wrote:
>>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
>>>> On 06/03/14 15:53, Ian Campbell wrote:
>>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>>>>>> By my understanding, there is no way rsp could be smaller than req, so
>>>>>> there is no point having this. Am I missing something?
>>>>>
>>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
>>>>> yet.
>>>>
>>>> The name of the macro suggest we are interested whether the ring has
>>>> unconsumed requests, and netback uses it that way. The answer to that
>>>> question is req_prod - req_cons. And it works if prod wrapped but cons
>>>> didn't.
>>>
>>> Yes.
>>>
>>>> rsp calculates the number of "consumed but not responded" requests (it
>>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
>>>> subtract it from the ring size.
>>>
>>> That is indeed an odd thing to check, since it seems like it could only
>>> be relevant if the request producer overran the response producer.
>>> It's been there in one form or another since the original ring.h,
>>> and RING_REQUEST_CONS_OVERFLOW does something similar.
>>>
>>> I can't remember the original reasoning, and so I'm reluctant to
>>> suggest removing it without some more eyes on the code...
>>
>> I've added the following printk before the "req < rsp" part:
>>
>> 	if (rsp < req)							\
>> 		pr_err("req %u rsp %u req_prod %u req_cons %u
>> rsp_prod_pvt %u\n", req,
>> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \
>>
>> And it gave me such results:
>>
>> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
>> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
>>
>> So it can happen that req_prod is behind req_cons, sometimes even with
>> 17! But it always happen in this callback of my new grant mapping
>> series, which runs outside the NAPI instance. My theory why this can
>> happen:
>> - callback reads req_prod
>> - frontend writes it
>> - backend picks it up, and consumes those slots
>> - callback reads req_cons
>>
>> So req can be near UINT_MAX if you call this macro outside the backend.
>> The only place where the actual return value of this macro matters is
>> xenvif_tx_build_gops, and it should be correct there. At other places we
>> are only looking for the fact whether the ring has unconsumed requests
>> or not. If prod is smaller than cons, we clearly read a wrong value. I
>> think what we can do:
>> 1. try again until its correct
>> 2. just return a non-zero value, it shouldn't cause too much trouble if
>> we say yes here
>> 3. we can't see rsp_cons, so try to figure out if the ring is full of
>> consumed but not responded requests, and return zero then, otherwise a
>> positive value. That's what we do know.
>>
>> Does this make sense? Should we rather go option 1? Should I post a
>> comment patch to document this, and spare a few hours for future
>> generations? :)
>>
>
> The number of responses on the ring is clearly immaterial if you're only looking for unconsumed requests. So I think it should something analogous to RING_HAS_UNCONSUMED_RESPONSES:
>
> #define RING_HAS_UNCONSUMED_REQUESTS(_r) \
>      ((_r)->sring->req_prod - (_r)->req_cons)
>
> Any racing between these two values is the responsibility of the individual frontend/backend and not something that this macro needs to care about.

I think we should introduce a new macro with a name like 
RING_NR_UNCONSUMED_REQUESTS, which does what you wrote above: 
((_r)->sring->req_prod - (_r)->req_cons)
It should be used where we actually care about the number of the 
unconsumed request, practically from xenvif_tx_build_gops. And we should 
note in a comment that this macro works only if you use it from the 
context of the writer of req_cons (NAPI instance in our case) We can 
actually use it from tx_work_todo as well, and comment on the function 
that it should be used only from the NAPI instance.

The other users, who are not interested in the actual number:
xenvif_tx_interrupt
xenvif_zerocopy_callback (new stuff from my series)
and through RING_FINAL_CHECK_FOR_REQUESTS:
xenvif_check_rx_xenvif
xenvif_poll
do_block_io_op

Apart from xenvif_poll they can run into the problem when (req_prod < 
req_cons). To handle that situation I've listed three options in my 
previous email. I think the current one is a quite good best effort to 
optimize the situation, and we can go with that, if we document it 
properly in comments why is it necessary.
To clarify what does this do, let me show an example:
req_prod = 253
req_cons = 256
rsp_prod_pvt = 0

req will be UINT_MAX-2, as the values changed in the meantime, and rsp 
is 0. It's reasonable to return 0 here, as the backend hasn't replied 
anything yet, so we clearly shouldn't have any unconsumed request in the 
ring.

Regards,

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-07 12:02         ` Wei Liu
@ 2014-03-07 18:58           ` Zoltan Kiss
  0 siblings, 0 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-07 18:58 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Tim Deegan, Ian Campbell

On 07/03/14 12:02, Wei Liu wrote:
> On Thu, Mar 06, 2014 at 09:39:40PM +0000, Zoltan Kiss wrote:
>> On 06/03/14 17:30, Tim Deegan wrote:
>>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
>>>> On 06/03/14 15:53, Ian Campbell wrote:
>>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>>>>>> By my understanding, there is no way rsp could be smaller than req, so
>>>>>> there is no point having this. Am I missing something?
>>>>>
>>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
>>>>> yet.
>>>>
>>>> The name of the macro suggest we are interested whether the ring has
>>>> unconsumed requests, and netback uses it that way. The answer to that
>>>> question is req_prod - req_cons. And it works if prod wrapped but cons
>>>> didn't.
>>>
>>> Yes.
>>>
>>>> rsp calculates the number of "consumed but not responded" requests (it
>>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
>>>> subtract it from the ring size.
>>>
>>> That is indeed an odd thing to check, since it seems like it could only
>>> be relevant if the request producer overran the response producer.
>>> It's been there in one form or another since the original ring.h,
>>> and RING_REQUEST_CONS_OVERFLOW does something similar.
>>>
>>> I can't remember the original reasoning, and so I'm reluctant to
>>> suggest removing it without some more eyes on the code...
>>
>> I've added the following printk before the "req < rsp" part:
>>
>> 	if (rsp < req)							\
>> 		pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n",
>> req, rsp, (_r)->sring->req_prod, (_r)->req_cons,
>> (_r)->rsp_prod_pvt); \
>>
>> And it gave me such results:
>>
>> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
>> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
>>
>> So it can happen that req_prod is behind req_cons, sometimes even
>> with 17! But it always happen in this callback of my new grant
>> mapping series, which runs outside the NAPI instance. My theory why
>> this can happen:
>> - callback reads req_prod
>> - frontend writes it
>> - backend picks it up, and consumes those slots
>> - callback reads req_cons
>>
>
> So there's three parties accessing the ring? I *think* the ring is only
> designed to be lockfree for two parties so I don't know whether this
> model is legit.
Indeed, however if we want to reintroduce grant mapping in netback, the 
releasing of pages will happen asynchronously. And there, you have to 
check whether you should kick the NAPI instance, to avoid this kind of 
stall:
- xenvif_poll call napi_complete, because more_to_do=true, but 
xenvif_tx_pending_slots_available=false
- we won't receive interrupt from frontend, as we already got one
- the reasonable place to kick in NAPI instance is when we released 
packets. Fortunately it doesn't happen very often

Actually at the moment I do it in the callback, but it's better in the 
dealloc thread, when we really released the pending slots, otherwise we 
might risk that NAPI instance quickly deschedule itself again. I'll fix 
that in a next series

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss
  2014-03-06 15:53 ` Ian Campbell
@ 2014-03-11 15:44 ` Ian Campbell
  2014-03-11 23:24   ` Zoltan Kiss
  1 sibling, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-11 15:44 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> Hi,
> 
> I was wondering for a while why this macro looks like this:
> 
> #define RING_HAS_UNCONSUMED_REQUESTS(_r)				\
>     ({									\
> 	unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;	\
> 	unsigned int rsp = RING_SIZE(_r) -				\
> 			   ((_r)->req_cons - (_r)->rsp_prod_pvt);	\
> 	req < rsp ? req : rsp;						\
>     })
> 
> I would expect to check prod - cons, like RING_HAS_UNCONSUMED_RESPONSES does:
> 
> #define RING_HAS_UNCONSUMED_RESPONSES(_r)				\
>     ((_r)->sring->rsp_prod - (_r)->rsp_cons)
> 
> By my understanding, there is no way rsp could be smaller than req, so
> there is no point having this. Am I missing something?

Just looking at this again. All that stuff I said about wraparound was
misleading/irrelevant since req and rsp are not the pointers, but
actually the number of things. Sorry.

Is it the case that this macro considers a request to be unconsumed if
the *response* to a request is outstanding as well as if the request
itself is still on the ring?

I wonder if this apparently weird construction is due to pathological
cases when one or the other end is not picking up requests/responses?
i.e. trying to avoid deadlocking the ring or generating an interrupt
storm when the ring it is full of one or the other or something along
those lines?

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-06 21:39       ` Zoltan Kiss
  2014-03-07  9:23         ` Paul Durrant
  2014-03-07 12:02         ` Wei Liu
@ 2014-03-11 15:55         ` Ian Campbell
  2014-03-11 23:34           ` Zoltan Kiss
  2 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-11 15:55 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Tim Deegan, Wei Liu

On Thu, 2014-03-06 at 21:39 +0000, Zoltan Kiss wrote:
> On 06/03/14 17:30, Tim Deegan wrote:
> > At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
> >> On 06/03/14 15:53, Ian Campbell wrote:
> >>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> >>>> By my understanding, there is no way rsp could be smaller than req, so
> >>>> there is no point having this. Am I missing something?
> >>>
> >>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> >>> yet.
> >>
> >> The name of the macro suggest we are interested whether the ring has
> >> unconsumed requests, and netback uses it that way. The answer to that
> >> question is req_prod - req_cons. And it works if prod wrapped but cons
> >> didn't.
> >
> > Yes.
> >
> >> rsp calculates the number of "consumed but not responded" requests (it
> >> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
> >> subtract it from the ring size.
> >
> > That is indeed an odd thing to check, since it seems like it could only
> > be relevant if the request producer overran the response producer.
> > It's been there in one form or another since the original ring.h,
> > and RING_REQUEST_CONS_OVERFLOW does something similar.
> >
> > I can't remember the original reasoning, and so I'm reluctant to
> > suggest removing it without some more eyes on the code...
> 
> I've added the following printk before the "req < rsp" part:
> 
> 	if (rsp < req)							\
> 		pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", req, 
> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \
> 
> And it gave me such results:
> 
> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod 
> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
> 
> So it can happen that req_prod is behind req_cons, sometimes even with 
> 17! But it always happen in this callback of my new grant mapping 
> series, which runs outside the NAPI instance. My theory why this can happen:
> - callback reads req_prod
> - frontend writes it
> - backend picks it up, and consumes those slots
> - callback reads req_cons

I'm a bit confused by what you mean by "it" in your theory, so perhaps I
misunderstand. Can you use the actual variable names for clarity.

Are you sure this is a problem with the actual code and not just with
your debug print? I would expect the real code to be snapshotting things
as appropriate etc, and also for the public/private state to not
necessarily be totally in sync when RING_HAS_UNCONSUMED_REQUESTS is
being called.

> So req can be near UINT_MAX if you call this macro outside the backend. 
> The only place where the actual return value of this macro matters is 
> xenvif_tx_build_gops, and it should be correct there. At other places we 
> are only looking for the fact whether the ring has unconsumed requests 
> or not. If prod is smaller than cons, we clearly read a wrong value. I 
> think what we can do:
> 1. try again until its correct
> 2. just return a non-zero value, it shouldn't cause too much trouble if 
> we say yes here
> 3. we can't see rsp_cons, so try to figure out if the ring is full of 
> consumed but not responded requests, and return zero then, otherwise a 
> positive value. That's what we do know.

s/know/now/? Is #3 here the status quo? 

> Does this make sense? Should we rather go option 1? Should I post a 
> comment patch to document this, and spare a few hours for future 
> generations? :)

Docs are always good IMHO. 

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell
@ 2014-03-11 23:24   ` Zoltan Kiss
  2014-03-12 10:28     ` Ian Campbell
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-11 23:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan

On 11/03/14 15:44, Ian Campbell wrote:
> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>> Hi,
>>
>> I was wondering for a while why this macro looks like this:
>>
>> #define RING_HAS_UNCONSUMED_REQUESTS(_r)				\
>>      ({									\
>> 	unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;	\
>> 	unsigned int rsp = RING_SIZE(_r) -				\
>> 			   ((_r)->req_cons - (_r)->rsp_prod_pvt);	\
>> 	req < rsp ? req : rsp;						\
>>      })
>>
>> I would expect to check prod - cons, like RING_HAS_UNCONSUMED_RESPONSES does:
>>
>> #define RING_HAS_UNCONSUMED_RESPONSES(_r)				\
>>      ((_r)->sring->rsp_prod - (_r)->rsp_cons)
>>
>> By my understanding, there is no way rsp could be smaller than req, so
>> there is no point having this. Am I missing something?
>
> Just looking at this again. All that stuff I said about wraparound was
> misleading/irrelevant since req and rsp are not the pointers, but
> actually the number of things. Sorry.
>
> Is it the case that this macro considers a request to be unconsumed if
> the *response* to a request is outstanding as well as if the request
> itself is still on the ring?
I don't think that would make sense. I think everywhere where this macro 
is called the caller is not interested in pending request (pending means 
consumed but not responded)

>
> I wonder if this apparently weird construction is due to pathological
> cases when one or the other end is not picking up requests/responses?
> i.e. trying to avoid deadlocking the ring or generating an interrupt
> storm when the ring it is full of one or the other or something along
> those lines?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-11 15:55         ` Ian Campbell
@ 2014-03-11 23:34           ` Zoltan Kiss
  0 siblings, 0 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-11 23:34 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Tim Deegan, Wei Liu

On 11/03/14 15:55, Ian Campbell wrote:
> On Thu, 2014-03-06 at 21:39 +0000, Zoltan Kiss wrote:
>> On 06/03/14 17:30, Tim Deegan wrote:
>>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
>>>> On 06/03/14 15:53, Ian Campbell wrote:
>>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
>>>>>> By my understanding, there is no way rsp could be smaller than req, so
>>>>>> there is no point having this. Am I missing something?
>>>>>
>>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
>>>>> yet.
>>>>
>>>> The name of the macro suggest we are interested whether the ring has
>>>> unconsumed requests, and netback uses it that way. The answer to that
>>>> question is req_prod - req_cons. And it works if prod wrapped but cons
>>>> didn't.
>>>
>>> Yes.
>>>
>>>> rsp calculates the number of "consumed but not responded" requests (it
>>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
>>>> subtract it from the ring size.
>>>
>>> That is indeed an odd thing to check, since it seems like it could only
>>> be relevant if the request producer overran the response producer.
>>> It's been there in one form or another since the original ring.h,
>>> and RING_REQUEST_CONS_OVERFLOW does something similar.
>>>
>>> I can't remember the original reasoning, and so I'm reluctant to
>>> suggest removing it without some more eyes on the code...
>>
>> I've added the following printk before the "req < rsp" part:
>>
>> 	if (rsp < req)							\
>> 		pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", req,
>> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \
>>
>> And it gave me such results:
>>
>> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
>> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
>>
>> So it can happen that req_prod is behind req_cons, sometimes even with
>> 17! But it always happen in this callback of my new grant mapping
>> series, which runs outside the NAPI instance. My theory why this can happen:
>> - callback reads req_prod
>> - frontend writes it
>> - backend picks it up, and consumes those slots
>> - callback reads req_cons
>
> I'm a bit confused by what you mean by "it" in your theory, so perhaps I
> misunderstand. Can you use the actual variable names for clarity.
I meant req_prod all the time.

> Are you sure this is a problem with the actual code and not just with
> your debug print? I would expect the real code to be snapshotting things
> as appropriate etc, and also for the public/private state to not
> necessarily be totally in sync when RING_HAS_UNCONSUMED_REQUESTS is
> being called.
I think my above code does the right thing. For double check I printed 
out req, rsp, and the values they are calculated from. I guess req_prod 
is cached in the printk and that's why it is still the same value as we 
read when calculating req. I'll test that with a memory barrier.
>
>> So req can be near UINT_MAX if you call this macro outside the backend.
>> The only place where the actual return value of this macro matters is
>> xenvif_tx_build_gops, and it should be correct there. At other places we
>> are only looking for the fact whether the ring has unconsumed requests
>> or not. If prod is smaller than cons, we clearly read a wrong value. I
>> think what we can do:
>> 1. try again until its correct
>> 2. just return a non-zero value, it shouldn't cause too much trouble if
>> we say yes here
>> 3. we can't see rsp_cons, so try to figure out if the ring is full of
>> consumed but not responded requests, and return zero then, otherwise a
>> positive value. That's what we do know.
>
> s/know/now/? Is #3 here the status quo?
Yes, that's the status quo. It's a best effort calculation, rsp takes 
only effect if req ~ UINT_MAX.
>
>> Does this make sense? Should we rather go option 1? Should I post a
>> comment patch to document this, and spare a few hours for future
>> generations? :)
>
> Docs are always good IMHO.
>
> Ian.
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-11 23:24   ` Zoltan Kiss
@ 2014-03-12 10:28     ` Ian Campbell
  2014-03-12 10:48       ` Roger Pau Monné
                         ` (4 more replies)
  0 siblings, 5 replies; 53+ messages in thread
From: Ian Campbell @ 2014-03-12 10:28 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> On 11/03/14 15:44, Ian Campbell wrote:

> > Is it the case that this macro considers a request to be unconsumed if
> > the *response* to a request is outstanding as well as if the request
> > itself is still on the ring?
> I don't think that would make sense. I think everywhere where this macro 
> is called the caller is not interested in pending request (pending means 
> consumed but not responded)

It might be interested in such pending requests in some of the
pathological cases I allude to in the next paragraph though?

For example if the ring has unconsumed requests but there are no slots
free for a response, it would be better to treat it as no unconsumed
requests until space opens up for a response, otherwise something else
just has to abort the processing of the request when it notices the lack
of space.

(I'm totally speculating here BTW, I don't have any concrete idea why
things are done this way...)


> > I wonder if this apparently weird construction is due to pathological
> > cases when one or the other end is not picking up requests/responses?
> > i.e. trying to avoid deadlocking the ring or generating an interrupt
> > storm when the ring it is full of one or the other or something along
> > those lines?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 10:28     ` Ian Campbell
@ 2014-03-12 10:48       ` Roger Pau Monné
  2014-03-12 11:25       ` Paul Durrant
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2014-03-12 10:48 UTC (permalink / raw)
  To: Ian Campbell, Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On 12/03/14 11:28, Ian Campbell wrote:
> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>> On 11/03/14 15:44, Ian Campbell wrote:
> 
>>> Is it the case that this macro considers a request to be unconsumed if
>>> the *response* to a request is outstanding as well as if the request
>>> itself is still on the ring?
>> I don't think that would make sense. I think everywhere where this macro 
>> is called the caller is not interested in pending request (pending means 
>> consumed but not responded)
> 
> It might be interested in such pending requests in some of the
> pathological cases I allude to in the next paragraph though?
> 
> For example if the ring has unconsumed requests but there are no slots
> free for a response, it would be better to treat it as no unconsumed
> requests until space opens up for a response, otherwise something else
> just has to abort the processing of the request when it notices the lack
> of space.

Yes, I think this is true at least for blkback, when blkback starts
processing a request it assumes that a response slot will be available
in order to write the response when the request has finished processing.

Roger.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 10:28     ` Ian Campbell
  2014-03-12 10:48       ` Roger Pau Monné
@ 2014-03-12 11:25       ` Paul Durrant
  2014-03-12 11:38       ` Paul Durrant
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2014-03-12 11:25 UTC (permalink / raw)
  To: Ian Campbell, Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim (Xen.org)

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: 12 March 2014 10:28
> To: Zoltan Kiss
> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> > On 11/03/14 15:44, Ian Campbell wrote:
> 
> > > Is it the case that this macro considers a request to be unconsumed if
> > > the *response* to a request is outstanding as well as if the request
> > > itself is still on the ring?
> > I don't think that would make sense. I think everywhere where this macro
> > is called the caller is not interested in pending request (pending means
> > consumed but not responded)
> 
> It might be interested in such pending requests in some of the
> pathological cases I allude to in the next paragraph though?
> 
> For example if the ring has unconsumed requests but there are no slots
> free for a response, it would be better to treat it as no unconsumed
> requests until space opens up for a response, otherwise something else
> just has to abort the processing of the request when it notices the lack
> of space.
> 

Huh? The act of consuming the request will create the space for the response. If responses and requests are not balanced then I'd argue that the protocol is broken.

  Paul

> (I'm totally speculating here BTW, I don't have any concrete idea why
> things are done this way...)
> 
> 
> > > I wonder if this apparently weird construction is due to pathological
> > > cases when one or the other end is not picking up requests/responses?
> > > i.e. trying to avoid deadlocking the ring or generating an interrupt
> > > storm when the ring it is full of one or the other or something along
> > > those lines?
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 10:28     ` Ian Campbell
  2014-03-12 10:48       ` Roger Pau Monné
  2014-03-12 11:25       ` Paul Durrant
@ 2014-03-12 11:38       ` Paul Durrant
  2014-03-12 14:41         ` Zoltan Kiss
  2014-03-12 14:25       ` Zoltan Kiss
  2014-03-12 14:27       ` Zoltan Kiss
  4 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-12 11:38 UTC (permalink / raw)
  To: Ian Campbell, Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim (Xen.org)

> -----Original Message-----
> From: Paul Durrant
> Sent: 12 March 2014 11:26
> To: Ian Campbell; Zoltan Kiss
> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
> Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > Sent: 12 March 2014 10:28
> > To: Zoltan Kiss
> > Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
> > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> >
> > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> > > On 11/03/14 15:44, Ian Campbell wrote:
> >
> > > > Is it the case that this macro considers a request to be unconsumed if
> > > > the *response* to a request is outstanding as well as if the request
> > > > itself is still on the ring?
> > > I don't think that would make sense. I think everywhere where this
> macro
> > > is called the caller is not interested in pending request (pending means
> > > consumed but not responded)
> >
> > It might be interested in such pending requests in some of the
> > pathological cases I allude to in the next paragraph though?
> >
> > For example if the ring has unconsumed requests but there are no slots
> > free for a response, it would be better to treat it as no unconsumed
> > requests until space opens up for a response, otherwise something else
> > just has to abort the processing of the request when it notices the lack
> > of space.
> >
> 
> Huh? The act of consuming the request will create the space for the
> response. If responses and requests are not balanced then I'd argue that the
> protocol is broken.
> 

Actually ancient memory tells me that, unfortunately, netback's backend->frontend GSO protocol is broken in this way... it requires one more response slot than the number of requests it consumes (for the extra metadata), which means that if the frontend keeps the ring full you can get overflow. It's a bit of a tangent though, because that code doesn't use this macro (or in fact check the ring has space in any way IIRC). The prefix variant of the protocol is ok though.

  Paul

>   Paul
> 
> > (I'm totally speculating here BTW, I don't have any concrete idea why
> > things are done this way...)
> >
> >
> > > > I wonder if this apparently weird construction is due to pathological
> > > > cases when one or the other end is not picking up requests/responses?
> > > > i.e. trying to avoid deadlocking the ring or generating an interrupt
> > > > storm when the ring it is full of one or the other or something along
> > > > those lines?
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 10:28     ` Ian Campbell
                         ` (2 preceding siblings ...)
  2014-03-12 11:38       ` Paul Durrant
@ 2014-03-12 14:25       ` Zoltan Kiss
  2014-03-12 14:27       ` Zoltan Kiss
  4 siblings, 0 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 14:25 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan

On 12/03/14 10:28, Ian Campbell wrote:
> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>> On 11/03/14 15:44, Ian Campbell wrote:
>
>>> Is it the case that this macro considers a request to be unconsumed if
>>> the *response* to a request is outstanding as well as if the request
>>> itself is still on the ring?
>> I don't think that would make sense. I think everywhere where this macro
>> is called the caller is not interested in pending request (pending means
>> consumed but not responded)
>
> It might be interested in such pending requests in some of the
> pathological cases I allude to in the next paragraph though?
>
> For example if the ring has unconsumed requests but there are no slots
> free for a response, it would be better to treat it as no unconsumed
> requests until space opens up for a response, otherwise something else
> just has to abort the processing of the request when it notices the lack
> of space.

I agree with Paul on this: if you consume a request, it becomes pending 
until you finish your work on it. If you finish, that slot ceases to be 
a pending one, and you free up space for a response. So unless a request 
needs more than one response, we are good.

>
> (I'm totally speculating here BTW, I don't have any concrete idea why
> things are done this way...)
>
>
>>> I wonder if this apparently weird construction is due to pathological
>>> cases when one or the other end is not picking up requests/responses?
>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
>>> storm when the ring it is full of one or the other or something along
>>> those lines?
>
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 10:28     ` Ian Campbell
                         ` (3 preceding siblings ...)
  2014-03-12 14:25       ` Zoltan Kiss
@ 2014-03-12 14:27       ` Zoltan Kiss
  2014-03-12 14:30         ` Ian Campbell
  4 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 14:27 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan

On 12/03/14 10:28, Ian Campbell wrote:
> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>> On 11/03/14 15:44, Ian Campbell wrote:
>
>>> Is it the case that this macro considers a request to be unconsumed if
>>> the *response* to a request is outstanding as well as if the request
>>> itself is still on the ring?
>> I don't think that would make sense. I think everywhere where this macro
>> is called the caller is not interested in pending request (pending means
>> consumed but not responded)
>
> It might be interested in such pending requests in some of the
> pathological cases I allude to in the next paragraph though?
>
> For example if the ring has unconsumed requests but there are no slots
> free for a response, it would be better to treat it as no unconsumed
> requests until space opens up for a response, otherwise something else
> just has to abort the processing of the request when it notices the lack
> of space.
>
> (I'm totally speculating here BTW, I don't have any concrete idea why
> things are done this way...)
>
>
>>> I wonder if this apparently weird construction is due to pathological
>>> cases when one or the other end is not picking up requests/responses?
>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
>>> storm when the ring it is full of one or the other or something along
>>> those lines?
>
>

Also, let me quote again my example about when rsp makes sense:

"To clarify what does this do, let me show an example:
req_prod = 253
req_cons = 256
rsp_prod_pvt = 0

req will be UINT_MAX-2, as the values changed in the meantime, and rsp 
is 0. It's reasonable to return 0 here, as the backend hasn't replied 
anything yet, so we clearly shouldn't have any unconsumed request in the 
ring."

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 14:27       ` Zoltan Kiss
@ 2014-03-12 14:30         ` Ian Campbell
  2014-03-12 15:14           ` Zoltan Kiss
  0 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-12 14:30 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
> On 12/03/14 10:28, Ian Campbell wrote:
> > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> >> On 11/03/14 15:44, Ian Campbell wrote:
> >
> >>> Is it the case that this macro considers a request to be unconsumed if
> >>> the *response* to a request is outstanding as well as if the request
> >>> itself is still on the ring?
> >> I don't think that would make sense. I think everywhere where this macro
> >> is called the caller is not interested in pending request (pending means
> >> consumed but not responded)
> >
> > It might be interested in such pending requests in some of the
> > pathological cases I allude to in the next paragraph though?
> >
> > For example if the ring has unconsumed requests but there are no slots
> > free for a response, it would be better to treat it as no unconsumed
> > requests until space opens up for a response, otherwise something else
> > just has to abort the processing of the request when it notices the lack
> > of space.
> >
> > (I'm totally speculating here BTW, I don't have any concrete idea why
> > things are done this way...)
> >
> >
> >>> I wonder if this apparently weird construction is due to pathological
> >>> cases when one or the other end is not picking up requests/responses?
> >>> i.e. trying to avoid deadlocking the ring or generating an interrupt
> >>> storm when the ring it is full of one or the other or something along
> >>> those lines?
> >
> >
> 
> Also, let me quote again my example about when rsp makes sense:
> 
> "To clarify what does this do, let me show an example:
> req_prod = 253
> req_cons = 256
> rsp_prod_pvt = 0

I think to make sense of this I need to see the sequence of reads/writes
from both parties in a sensible ordering which would result in reads
showing the above. i.e. a demonstration of the race not just an
assertion that if the values are read as is things makes sense.

> 
> req will be UINT_MAX-2, as the values changed in the meantime, and rsp 
> is 0. It's reasonable to return 0 here, as the backend hasn't replied 
> anything yet, so we clearly shouldn't have any unconsumed request in the 
> ring."
> 
> Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 11:38       ` Paul Durrant
@ 2014-03-12 14:41         ` Zoltan Kiss
  2014-03-12 15:23           ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 14:41 UTC (permalink / raw)
  To: Paul Durrant, Ian Campbell; +Cc: xen-devel, Wei Liu, Tim (Xen.org)

On 12/03/14 11:38, Paul Durrant wrote:
>> -----Original Message-----
>> From: Paul Durrant
>> Sent: 12 March 2014 11:26
>> To: Ian Campbell; Zoltan Kiss
>> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
>> Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
>>
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>>> bounces@lists.xen.org] On Behalf Of Ian Campbell
>>> Sent: 12 March 2014 10:28
>>> To: Zoltan Kiss
>>> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
>>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
>>>
>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>>>> On 11/03/14 15:44, Ian Campbell wrote:
>>>
>>>>> Is it the case that this macro considers a request to be unconsumed if
>>>>> the *response* to a request is outstanding as well as if the request
>>>>> itself is still on the ring?
>>>> I don't think that would make sense. I think everywhere where this
>> macro
>>>> is called the caller is not interested in pending request (pending means
>>>> consumed but not responded)
>>>
>>> It might be interested in such pending requests in some of the
>>> pathological cases I allude to in the next paragraph though?
>>>
>>> For example if the ring has unconsumed requests but there are no slots
>>> free for a response, it would be better to treat it as no unconsumed
>>> requests until space opens up for a response, otherwise something else
>>> just has to abort the processing of the request when it notices the lack
>>> of space.
>>>
>>
>> Huh? The act of consuming the request will create the space for the
>> response. If responses and requests are not balanced then I'd argue that the
>> protocol is broken.
>>
>
> Actually ancient memory tells me that, unfortunately, netback's backend->frontend GSO protocol is broken in this way... it requires one more response slot than the number of requests it consumes (for the extra metadata), which means that if the frontend keeps the ring full you can get overflow. It's a bit of a tangent though, because that code doesn't use this macro (or in fact check the ring has space in any way IIRC). The prefix variant of the protocol is ok though.

I think it's not: it consumes a request for the metadata, and when the 
packet is grant copied to the guest, it creates a response for that slot 
as well.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 14:30         ` Ian Campbell
@ 2014-03-12 15:14           ` Zoltan Kiss
  2014-03-12 15:37             ` Ian Campbell
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 15:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan

On 12/03/14 14:30, Ian Campbell wrote:
> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
>> On 12/03/14 10:28, Ian Campbell wrote:
>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>>>> On 11/03/14 15:44, Ian Campbell wrote:
>>>
>>>>> Is it the case that this macro considers a request to be unconsumed if
>>>>> the *response* to a request is outstanding as well as if the request
>>>>> itself is still on the ring?
>>>> I don't think that would make sense. I think everywhere where this macro
>>>> is called the caller is not interested in pending request (pending means
>>>> consumed but not responded)
>>>
>>> It might be interested in such pending requests in some of the
>>> pathological cases I allude to in the next paragraph though?
>>>
>>> For example if the ring has unconsumed requests but there are no slots
>>> free for a response, it would be better to treat it as no unconsumed
>>> requests until space opens up for a response, otherwise something else
>>> just has to abort the processing of the request when it notices the lack
>>> of space.
>>>
>>> (I'm totally speculating here BTW, I don't have any concrete idea why
>>> things are done this way...)
>>>
>>>
>>>>> I wonder if this apparently weird construction is due to pathological
>>>>> cases when one or the other end is not picking up requests/responses?
>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
>>>>> storm when the ring it is full of one or the other or something along
>>>>> those lines?
>>>
>>>
>>
>> Also, let me quote again my example about when rsp makes sense:
>>
>> "To clarify what does this do, let me show an example:
>> req_prod = 253
>> req_cons = 256
>> rsp_prod_pvt = 0
>
> I think to make sense of this I need to see the sequence of reads/writes
> from both parties in a sensible ordering which would result in reads
> showing the above. i.e. a demonstration of the race not just an
> assertion that if the values are read as is things makes sense.

Let me extend it:

- callback reads req_prod = 253
- frontend writes req_prod, now its 256
- backend picks it up, and consumes those slots, req_cons become 256
- callback reads req_cons = 256
- req is UINT_MAX-3 therefore, but actually there isn't any request to 
consume, it should be 0
- callback reads rsp_prod_pvt = 0, because backend haven't responded to 
any requests
- rsp is therefore 256 - (256 -0) = 0
- the macro returns rsp, as it is smaller. And that's good, because 
despite the macro failed to determine the number of unconsumed requests, 
at least it detected that the ring is full with consumed but not replied 
requests, so there shouldn't be any unconsumed req

And I call this best effort because if rsp_prod_pvt is e.g. 10, rsp will 
be then 10 as well, we return it, and the caller thinks there are 
unconsumed requests, despite there isn't any.

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 14:41         ` Zoltan Kiss
@ 2014-03-12 15:23           ` Paul Durrant
  2014-03-12 15:42             ` Wei Liu
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-12 15:23 UTC (permalink / raw)
  To: Zoltan Kiss, Ian Campbell; +Cc: xen-devel, Wei Liu, Tim (Xen.org)

> -----Original Message-----
> From: Zoltan Kiss
> Sent: 12 March 2014 14:42
> To: Paul Durrant; Ian Campbell
> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On 12/03/14 11:38, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Paul Durrant
> >> Sent: 12 March 2014 11:26
> >> To: Ian Campbell; Zoltan Kiss
> >> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
> >> Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> >>
> >>> -----Original Message-----
> >>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> >>> bounces@lists.xen.org] On Behalf Of Ian Campbell
> >>> Sent: 12 March 2014 10:28
> >>> To: Zoltan Kiss
> >>> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org)
> >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS
> oddness
> >>>
> >>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> >>>> On 11/03/14 15:44, Ian Campbell wrote:
> >>>
> >>>>> Is it the case that this macro considers a request to be unconsumed if
> >>>>> the *response* to a request is outstanding as well as if the request
> >>>>> itself is still on the ring?
> >>>> I don't think that would make sense. I think everywhere where this
> >> macro
> >>>> is called the caller is not interested in pending request (pending means
> >>>> consumed but not responded)
> >>>
> >>> It might be interested in such pending requests in some of the
> >>> pathological cases I allude to in the next paragraph though?
> >>>
> >>> For example if the ring has unconsumed requests but there are no slots
> >>> free for a response, it would be better to treat it as no unconsumed
> >>> requests until space opens up for a response, otherwise something else
> >>> just has to abort the processing of the request when it notices the lack
> >>> of space.
> >>>
> >>
> >> Huh? The act of consuming the request will create the space for the
> >> response. If responses and requests are not balanced then I'd argue that
> the
> >> protocol is broken.
> >>
> >
> > Actually ancient memory tells me that, unfortunately, netback's backend-
> >frontend GSO protocol is broken in this way... it requires one more
> response slot than the number of requests it consumes (for the extra
> metadata), which means that if the frontend keeps the ring full you can get
> overflow. It's a bit of a tangent though, because that code doesn't use this
> macro (or in fact check the ring has space in any way IIRC). The prefix variant
> of the protocol is ok though.
> 
> I think it's not: it consumes a request for the metadata, and when the
> packet is grant copied to the guest, it creates a response for that slot
> as well.

As explained verbally, it doesn't consume a request for the 'extra' info. Let me elaborate here for the benefit of the list...

In xenvif_gop_skb(), in the non-prefix GSO case, a single request is consumed for the header along with a meta slot which is used to hold the GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() for the header, but then *before* moving onto the next meta slot it makes an 'extra' response for the GSO metadata. So - one meta slot - one request consumed, but two responses produced.
So this mechanism totally relies on the netfront driver not completely filling the shared ring. If it ever does, you'll get overflow.

  Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 15:14           ` Zoltan Kiss
@ 2014-03-12 15:37             ` Ian Campbell
  2014-03-12 17:14               ` Zoltan Kiss
  0 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-12 15:37 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote:
> On 12/03/14 14:30, Ian Campbell wrote:
> > On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
> >> On 12/03/14 10:28, Ian Campbell wrote:
> >>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> >>>> On 11/03/14 15:44, Ian Campbell wrote:
> >>>
> >>>>> Is it the case that this macro considers a request to be unconsumed if
> >>>>> the *response* to a request is outstanding as well as if the request
> >>>>> itself is still on the ring?
> >>>> I don't think that would make sense. I think everywhere where this macro
> >>>> is called the caller is not interested in pending request (pending means
> >>>> consumed but not responded)
> >>>
> >>> It might be interested in such pending requests in some of the
> >>> pathological cases I allude to in the next paragraph though?
> >>>
> >>> For example if the ring has unconsumed requests but there are no slots
> >>> free for a response, it would be better to treat it as no unconsumed
> >>> requests until space opens up for a response, otherwise something else
> >>> just has to abort the processing of the request when it notices the lack
> >>> of space.
> >>>
> >>> (I'm totally speculating here BTW, I don't have any concrete idea why
> >>> things are done this way...)
> >>>
> >>>
> >>>>> I wonder if this apparently weird construction is due to pathological
> >>>>> cases when one or the other end is not picking up requests/responses?
> >>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
> >>>>> storm when the ring it is full of one or the other or something along
> >>>>> those lines?
> >>>
> >>>
> >>
> >> Also, let me quote again my example about when rsp makes sense:
> >>
> >> "To clarify what does this do, let me show an example:
> >> req_prod = 253
> >> req_cons = 256
> >> rsp_prod_pvt = 0
> >
> > I think to make sense of this I need to see the sequence of reads/writes
> > from both parties in a sensible ordering which would result in reads
> > showing the above. i.e. a demonstration of the race not just an
> > assertion that if the values are read as is things makes sense.
> 
> Let me extend it:
> 
> - callback reads req_prod = 253

callback == backend? Which context is this code running in? Which part
of the system is the callback logically part of?

> - frontend writes req_prod, now its 256
> - backend picks it up, and consumes those slots, req_cons become 256

"it"? Do you mean req_prod? Please be precise.

> - callback reads req_cons = 256

But the backend has also seen req_prod at 256 at this point, hasn't it?
You said so above but said "it" so I'm not sure. If the callback is part
of the backend then why hasn't it also seen this?

> - req is UINT_MAX-3 therefore, but actually there isn't any request to 
> consume, it should be 0

Only if something is ignoring the fact that it has seen req_prod == 256.

If callback is some separate entity to backend within dom0 then what you
have here is an internal inconsistency in dom0 AFAICT. IOW it seems like
you are missing some synchronisation and/or have two different entities
acting as backend.

> - callback reads rsp_prod_pvt = 0, because backend haven't responded to 
> any requests
> - rsp is therefore 256 - (256 -0) = 0
> - the macro returns rsp, as it is smaller. And that's good, because 
> despite the macro failed to determine the number of unconsumed requests, 
> at least it detected that the ring is full with consumed but not replied 
> requests, so there shouldn't be any unconsumed req
> 
> And I call this best effort because if rsp_prod_pvt is e.g. 10, rsp will 
> be then 10 as well, we return it, and the caller thinks there are 
> unconsumed requests, despite there isn't any.
> 
> Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 15:23           ` Paul Durrant
@ 2014-03-12 15:42             ` Wei Liu
  2014-03-12 15:56               ` Paul Durrant
                                 ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Wei Liu @ 2014-03-12 15:42 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Wei Liu, Ian Campbell, Zoltan Kiss, Tim (Xen.org)

On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
[...]
> > > Actually ancient memory tells me that, unfortunately, netback's backend-
> > >frontend GSO protocol is broken in this way... it requires one more
> > response slot than the number of requests it consumes (for the extra
> > metadata), which means that if the frontend keeps the ring full you can get
> > overflow. It's a bit of a tangent though, because that code doesn't use this
> > macro (or in fact check the ring has space in any way IIRC). The prefix variant
> > of the protocol is ok though.
> > 
> > I think it's not: it consumes a request for the metadata, and when the
> > packet is grant copied to the guest, it creates a response for that slot
> > as well.
> 
> As explained verbally, it doesn't consume a request for the 'extra' info. Let me elaborate here for the benefit of the list...
> 
> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is consumed for the header along with a meta slot which is used to hold the GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() for the header, but then *before* moving onto the next meta slot it makes an 'extra' response for the GSO metadata. So - one meta slot - one request consumed, but two responses produced.
> So this mechanism totally relies on the netfront driver not completely filling the shared ring. If it ever does, you'll get overflow.
> 

(... which reminds me of the heisenbug Sander is seeing.)

But do we not check for there's enough space in the ring before
procceeding?

Wei.

>   Paul
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 15:42             ` Wei Liu
@ 2014-03-12 15:56               ` Paul Durrant
  2014-03-12 16:02               ` Paul Durrant
  2014-03-12 16:13               ` Zoltan Kiss
  2 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2014-03-12 15:56 UTC (permalink / raw)
  Cc: xen-devel, Wei Liu, Ian Campbell, Zoltan Kiss, Tim (Xen.org)

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 12 March 2014 15:43
> To: Paul Durrant
> Cc: Zoltan Kiss; Ian Campbell; xen-devel@lists.xenproject.org; Wei Liu; Tim
> (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
> [...]
> > > > Actually ancient memory tells me that, unfortunately, netback's
> backend-
> > > >frontend GSO protocol is broken in this way... it requires one more
> > > response slot than the number of requests it consumes (for the extra
> > > metadata), which means that if the frontend keeps the ring full you can
> get
> > > overflow. It's a bit of a tangent though, because that code doesn't use
> this
> > > macro (or in fact check the ring has space in any way IIRC). The prefix
> variant
> > > of the protocol is ok though.
> > >
> > > I think it's not: it consumes a request for the metadata, and when the
> > > packet is grant copied to the guest, it creates a response for that slot
> > > as well.
> >
> > As explained verbally, it doesn't consume a request for the 'extra' info. Let
> me elaborate here for the benefit of the list...
> >
> > In xenvif_gop_skb(), in the non-prefix GSO case, a single request is
> consumed for the header along with a meta slot which is used to hold the
> GSO data. Later on in xenvif_rx_action() the code calls make_rx_response()
> for the header, but then *before* moving onto the next meta slot it makes
> an 'extra' response for the GSO metadata. So - one meta slot - one request
> consumed, but two responses produced.
> > So this mechanism totally relies on the netfront driver not completely filling
> the shared ring. If it ever does, you'll get overflow.
> >
> 
> (... which reminds me of the heisenbug Sander is seeing.)
> 
> But do we not check for there's enough space in the ring before
> procceeding?
> 

Apparently not, but TBH I cannot figure out how on earth this ever worked at all. If netback is genuinely issuing 2 reponses for 1 consumed request then it must trash an unconsumed request at some point - but from my reading of the code that really does seem to be what it's doing.

  Paul

> Wei.
> 
> >   Paul
> >
> >

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 15:42             ` Wei Liu
  2014-03-12 15:56               ` Paul Durrant
@ 2014-03-12 16:02               ` Paul Durrant
  2014-03-12 16:13               ` Zoltan Kiss
  2 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2014-03-12 16:02 UTC (permalink / raw)
  Cc: xen-devel, Wei Liu, Ian Campbell, Zoltan Kiss, Tim (Xen.org)

> -----Original Message-----
> From: Paul Durrant
> Sent: 12 March 2014 15:57
> To: Wei Liu
> Cc: Zoltan Kiss; Ian Campbell; xen-devel@lists.xenproject.org; Wei Liu; Tim
> (Xen.org)
> Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> > -----Original Message-----
> > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > Sent: 12 March 2014 15:43
> > To: Paul Durrant
> > Cc: Zoltan Kiss; Ian Campbell; xen-devel@lists.xenproject.org; Wei Liu; Tim
> > (Xen.org)
> > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> >
> > On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
> > [...]
> > > > > Actually ancient memory tells me that, unfortunately, netback's
> > backend-
> > > > >frontend GSO protocol is broken in this way... it requires one more
> > > > response slot than the number of requests it consumes (for the extra
> > > > metadata), which means that if the frontend keeps the ring full you can
> > get
> > > > overflow. It's a bit of a tangent though, because that code doesn't use
> > this
> > > > macro (or in fact check the ring has space in any way IIRC). The prefix
> > variant
> > > > of the protocol is ok though.
> > > >
> > > > I think it's not: it consumes a request for the metadata, and when the
> > > > packet is grant copied to the guest, it creates a response for that slot
> > > > as well.
> > >
> > > As explained verbally, it doesn't consume a request for the 'extra' info.
> Let
> > me elaborate here for the benefit of the list...
> > >
> > > In xenvif_gop_skb(), in the non-prefix GSO case, a single request is
> > consumed for the header along with a meta slot which is used to hold the
> > GSO data. Later on in xenvif_rx_action() the code calls
> make_rx_response()
> > for the header, but then *before* moving onto the next meta slot it makes
> > an 'extra' response for the GSO metadata. So - one meta slot - one request
> > consumed, but two responses produced.
> > > So this mechanism totally relies on the netfront driver not completely
> filling
> > the shared ring. If it ever does, you'll get overflow.
> > >
> >
> > (... which reminds me of the heisenbug Sander is seeing.)
> >
> > But do we not check for there's enough space in the ring before
> > procceeding?
> >
> 
> Apparently not, but TBH I cannot figure out how on earth this ever worked at
> all. If netback is genuinely issuing 2 reponses for 1 consumed request then it
> must trash an unconsumed request at some point - but from my reading of
> the code that really does seem to be what it's doing.
> 

Aha, no - I've found the sneaky bit of code that restores balance to the force... It's at the bottom of xenvif_gop_frag_copy(). In the non-prefix GSO case req_cons is bumped if it's processing the head frag. Phew!

  Paul


>   Paul
> 
> > Wei.
> >
> > >   Paul
> > >
> > >

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 15:42             ` Wei Liu
  2014-03-12 15:56               ` Paul Durrant
  2014-03-12 16:02               ` Paul Durrant
@ 2014-03-12 16:13               ` Zoltan Kiss
  2014-03-12 16:42                 ` Paul Durrant
  2 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 16:13 UTC (permalink / raw)
  To: Wei Liu, Paul Durrant; +Cc: xen-devel, Tim (Xen.org), Ian Campbell

On 12/03/14 15:42, Wei Liu wrote:
> On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
> [...]
>>>> Actually ancient memory tells me that, unfortunately, netback's backend-
>>>> frontend GSO protocol is broken in this way... it requires one more
>>> response slot than the number of requests it consumes (for the extra
>>> metadata), which means that if the frontend keeps the ring full you can get
>>> overflow. It's a bit of a tangent though, because that code doesn't use this
>>> macro (or in fact check the ring has space in any way IIRC). The prefix variant
>>> of the protocol is ok though.
>>>
>>> I think it's not: it consumes a request for the metadata, and when the
>>> packet is grant copied to the guest, it creates a response for that slot
>>> as well.
>>
>> As explained verbally, it doesn't consume a request for the 'extra' info. Let me elaborate here for the benefit of the list...
>>
>> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is consumed for the header along with a meta slot which is used to hold the GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() for the header, but then *before* moving onto the next meta slot it makes an 'extra' response for the GSO metadata. So - one meta slot - one request consumed, but two responses produced.
>> So this mechanism totally relies on the netfront driver not completely filling the shared ring. If it ever does, you'll get overflow.
>>
>
> (... which reminds me of the heisenbug Sander is seeing.)
>
> But do we not check for there's enough space in the ring before
> procceeding?

Thinking further what Paul said, we might be actually OK. At the end of 
xenvif_gop_frag_copy there is this:

if (*head && ((1 << gso_type) & vif->gso_mask))
	vif->rx.req_cons++;

So although netback doesn't call RING_GET_REQUEST to consume the 
request, it does so by increasing req_cons. And netfront automagically 
detect that in xennet_get_extras, and calls xennet_move_rx_slot to move 
that buffer to req_prod_pvt. Same happens when the backend send a 
response that it couldn't use the slot and it doesn't contain any data.

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 16:13               ` Zoltan Kiss
@ 2014-03-12 16:42                 ` Paul Durrant
  2014-03-12 19:06                   ` Zoltan Kiss
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-12 16:42 UTC (permalink / raw)
  To: Zoltan Kiss, Wei Liu; +Cc: xen-devel, Tim (Xen.org), Ian Campbell

> -----Original Message-----
> From: Zoltan Kiss
> Sent: 12 March 2014 16:14
> To: Wei Liu; Paul Durrant
> Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On 12/03/14 15:42, Wei Liu wrote:
> > On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
> > [...]
> >>>> Actually ancient memory tells me that, unfortunately, netback's
> backend-
> >>>> frontend GSO protocol is broken in this way... it requires one more
> >>> response slot than the number of requests it consumes (for the extra
> >>> metadata), which means that if the frontend keeps the ring full you can
> get
> >>> overflow. It's a bit of a tangent though, because that code doesn't use
> this
> >>> macro (or in fact check the ring has space in any way IIRC). The prefix
> variant
> >>> of the protocol is ok though.
> >>>
> >>> I think it's not: it consumes a request for the metadata, and when the
> >>> packet is grant copied to the guest, it creates a response for that slot
> >>> as well.
> >>
> >> As explained verbally, it doesn't consume a request for the 'extra' info.
> Let me elaborate here for the benefit of the list...
> >>
> >> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is
> consumed for the header along with a meta slot which is used to hold the
> GSO data. Later on in xenvif_rx_action() the code calls make_rx_response()
> for the header, but then *before* moving onto the next meta slot it makes
> an 'extra' response for the GSO metadata. So - one meta slot - one request
> consumed, but two responses produced.
> >> So this mechanism totally relies on the netfront driver not completely
> filling the shared ring. If it ever does, you'll get overflow.
> >>
> >
> > (... which reminds me of the heisenbug Sander is seeing.)
> >
> > But do we not check for there's enough space in the ring before
> > procceeding?
> 
> Thinking further what Paul said, we might be actually OK. At the end of
> xenvif_gop_frag_copy there is this:
> 
> if (*head && ((1 << gso_type) & vif->gso_mask))
> 	vif->rx.req_cons++;
> 
> So although netback doesn't call RING_GET_REQUEST to consume the
> request, it does so by increasing req_cons. And netfront automagically
> detect that in xennet_get_extras, and calls xennet_move_rx_slot to move
> that buffer to req_prod_pvt. Same happens when the backend send a
> response that it couldn't use the slot and it doesn't contain any data.
> 

Still leaves the issue that the frontend has no idea which request id was consumed to create the 'hole' for the extra info, unless it has its own mapping of ring idx to id - i.e. the id in the xen_netif_rx_response struct is basically useless.

  Paul

> Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 15:37             ` Ian Campbell
@ 2014-03-12 17:14               ` Zoltan Kiss
  2014-03-12 17:43                 ` Ian Campbell
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 17:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan

On 12/03/14 15:37, Ian Campbell wrote:
> On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote:
>> On 12/03/14 14:30, Ian Campbell wrote:
>>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
>>>> On 12/03/14 10:28, Ian Campbell wrote:
>>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>>>>>> On 11/03/14 15:44, Ian Campbell wrote:
>>>>>
>>>>>>> Is it the case that this macro considers a request to be unconsumed if
>>>>>>> the *response* to a request is outstanding as well as if the request
>>>>>>> itself is still on the ring?
>>>>>> I don't think that would make sense. I think everywhere where this macro
>>>>>> is called the caller is not interested in pending request (pending means
>>>>>> consumed but not responded)
>>>>>
>>>>> It might be interested in such pending requests in some of the
>>>>> pathological cases I allude to in the next paragraph though?
>>>>>
>>>>> For example if the ring has unconsumed requests but there are no slots
>>>>> free for a response, it would be better to treat it as no unconsumed
>>>>> requests until space opens up for a response, otherwise something else
>>>>> just has to abort the processing of the request when it notices the lack
>>>>> of space.
>>>>>
>>>>> (I'm totally speculating here BTW, I don't have any concrete idea why
>>>>> things are done this way...)
>>>>>
>>>>>
>>>>>>> I wonder if this apparently weird construction is due to pathological
>>>>>>> cases when one or the other end is not picking up requests/responses?
>>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
>>>>>>> storm when the ring it is full of one or the other or something along
>>>>>>> those lines?
>>>>>
>>>>>
>>>>
>>>> Also, let me quote again my example about when rsp makes sense:
>>>>
>>>> "To clarify what does this do, let me show an example:
>>>> req_prod = 253
>>>> req_cons = 256
>>>> rsp_prod_pvt = 0
>>>
>>> I think to make sense of this I need to see the sequence of reads/writes
>>> from both parties in a sensible ordering which would result in reads
>>> showing the above. i.e. a demonstration of the race not just an
>>> assertion that if the values are read as is things makes sense.
>>
>> Let me extend it:
>>
>> - callback reads req_prod = 253
>
> callback == backend? Which context is this code running in? Which part
> of the system is the callback logically part of?
Yes, it is part of the backend, the function which handles when we can 
release a slot back. With grant copy we don't have such thing, but with 
mapping xenvif_zerocopy_callback does this (or in classic kernel, it had 
a different name, but we called it page destructor). It can run from any 
context, it depends on who calls kfree_skb.

>
>> - frontend writes req_prod, now its 256
>> - backend picks it up, and consumes those slots, req_cons become 256
>
> "it"? Do you mean req_prod? Please be precise.
Yes, I meant req_prod. And backend means NAPI instance here.

>
>> - callback reads req_cons = 256
>
> But the backend has also seen req_prod at 256 at this point, hasn't it?
> You said so above but said "it" so I'm not sure. If the callback is part
> of the backend then why hasn't it also seen this?
Yes, the NAPI instance have seen it, but the callback has not. It were 
called from another context.

>
>> - req is UINT_MAX-3 therefore, but actually there isn't any request to
>> consume, it should be 0
>
> Only if something is ignoring the fact that it has seen req_prod == 256.
>
> If callback is some separate entity to backend within dom0 then what you
> have here is an internal inconsistency in dom0 AFAICT. IOW it seems like
> you are missing some synchronisation and/or have two different entities
> acting as backend.
The callback only needs to know whether it should poke the NAPI instance 
or not. There is this special case, if there are still a few unconsumed 
request, but the ring is nearly full of pending requests and 
xenvif_tx_pending_slots_available says NAPI should bail out, we have to 
schedule it back once we have enough free pending slots again.
As I said in an another mail of this thread, this poking happens in the 
callback, but actually it should be moved to the dealloc thread. However 
thinking further, this whole xenvif_tx_pending_slots_available stuff 
seems to be unnecessary to me:
It supposed to check if we have enough slot in the pending ring for the 
maximum number of possible slots, otherwise the backend bails out. It 
does so because if the backend start to consume the requests from the 
shared ring but runs out free slots in the pending ring, we are in 
trouble. But the pending ring supposed to have the same amount of slots 
as the shared one. And a consumed but not responded slot from the shared 
ring means a used slot in the pending ring. Therefore the frontend won't 
be able to push more than (MAX_PENDING_REQS - nr_pending_reqs(vif)) 
requests to the ring anyway. At least in practice, as MAX_PENDING_REQS = 
RING_SIZE(...). If we could bind the two to each other directly, we can 
get rid of this unnecessary checking, and whoever release the used 
pending slots should not poke the NAPI instance, because the frontend 
will call an interrupt if it sends a new packet anyway.

>
>> - callback reads rsp_prod_pvt = 0, because backend haven't responded to
>> any requests
>> - rsp is therefore 256 - (256 -0) = 0
>> - the macro returns rsp, as it is smaller. And that's good, because
>> despite the macro failed to determine the number of unconsumed requests,
>> at least it detected that the ring is full with consumed but not replied
>> requests, so there shouldn't be any unconsumed req
>>
>> And I call this best effort because if rsp_prod_pvt is e.g. 10, rsp will
>> be then 10 as well, we return it, and the caller thinks there are
>> unconsumed requests, despite there isn't any.
>>
>> Zoli
>
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 17:14               ` Zoltan Kiss
@ 2014-03-12 17:43                 ` Ian Campbell
  2014-03-12 21:10                   ` Zoltan Kiss
  0 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-12 17:43 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote:
> On 12/03/14 15:37, Ian Campbell wrote:
> > On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote:
> >> On 12/03/14 14:30, Ian Campbell wrote:
> >>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
> >>>> On 12/03/14 10:28, Ian Campbell wrote:
> >>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> >>>>>> On 11/03/14 15:44, Ian Campbell wrote:
> >>>>>
> >>>>>>> Is it the case that this macro considers a request to be unconsumed if
> >>>>>>> the *response* to a request is outstanding as well as if the request
> >>>>>>> itself is still on the ring?
> >>>>>> I don't think that would make sense. I think everywhere where this macro
> >>>>>> is called the caller is not interested in pending request (pending means
> >>>>>> consumed but not responded)
> >>>>>
> >>>>> It might be interested in such pending requests in some of the
> >>>>> pathological cases I allude to in the next paragraph though?
> >>>>>
> >>>>> For example if the ring has unconsumed requests but there are no slots
> >>>>> free for a response, it would be better to treat it as no unconsumed
> >>>>> requests until space opens up for a response, otherwise something else
> >>>>> just has to abort the processing of the request when it notices the lack
> >>>>> of space.
> >>>>>
> >>>>> (I'm totally speculating here BTW, I don't have any concrete idea why
> >>>>> things are done this way...)
> >>>>>
> >>>>>
> >>>>>>> I wonder if this apparently weird construction is due to pathological
> >>>>>>> cases when one or the other end is not picking up requests/responses?
> >>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
> >>>>>>> storm when the ring it is full of one or the other or something along
> >>>>>>> those lines?
> >>>>>
> >>>>>
> >>>>
> >>>> Also, let me quote again my example about when rsp makes sense:
> >>>>
> >>>> "To clarify what does this do, let me show an example:
> >>>> req_prod = 253
> >>>> req_cons = 256
> >>>> rsp_prod_pvt = 0
> >>>
> >>> I think to make sense of this I need to see the sequence of reads/writes
> >>> from both parties in a sensible ordering which would result in reads
> >>> showing the above. i.e. a demonstration of the race not just an
> >>> assertion that if the values are read as is things makes sense.
> >>
> >> Let me extend it:
> >>
> >> - callback reads req_prod = 253
> >
> > callback == backend? Which context is this code running in? Which part
> > of the system is the callback logically part of?
> Yes, it is part of the backend, the function which handles when we can 
> release a slot back. With grant copy we don't have such thing, but with 
> mapping xenvif_zerocopy_callback does this (or in classic kernel, it had 
> a different name, but we called it page destructor). It can run from any 
> context, it depends on who calls kfree_skb.

I think this is the root of the problem. The pv protocols really assume
one entity on either end is moving/updating the ring pointers. If you
have two entities on one end both doing this then you need to make sure
you have appropriate locking in place.

In the classic kernel wasn't the dealloc ring actually processed from
the tx/rx action -- i.e. it was forced back into a single context at the
point where the ring was actually touched.

Aren't you batching things up in a similar way? Perhaps you just need to
fix where you are draining those batches to be properly locked against
other updaters of the ring?

> >> - req is UINT_MAX-3 therefore, but actually there isn't any request to
> >> consume, it should be 0
> >
> > Only if something is ignoring the fact that it has seen req_prod == 256.
> >
> > If callback is some separate entity to backend within dom0 then what you
> > have here is an internal inconsistency in dom0 AFAICT. IOW it seems like
> > you are missing some synchronisation and/or have two different entities
> > acting as backend.
> The callback only needs to know whether it should poke the NAPI instance 
> or not. There is this special case, if there are still a few unconsumed 
> request, but the ring is nearly full of pending requests and 
> xenvif_tx_pending_slots_available says NAPI should bail out, we have to 
> schedule it back once we have enough free pending slots again.
> As I said in an another mail of this thread, this poking happens in the 
> callback, but actually it should be moved to the dealloc thread.

Right, I think that's what I was trying to say above. I missed that
other mail I'm afraid (or didn't grok it).

>  However 
> thinking further, this whole xenvif_tx_pending_slots_available stuff 
> seems to be unnecessary to me:
> It supposed to check if we have enough slot in the pending ring for the 
> maximum number of possible slots, otherwise the backend bails out. It 
> does so because if the backend start to consume the requests from the 
> shared ring but runs out free slots in the pending ring, we are in 
> trouble. But the pending ring supposed to have the same amount of slots 
> as the shared one. And a consumed but not responded slot from the shared 
> ring means a used slot in the pending ring. Therefore the frontend won't 
> be able to push more than (MAX_PENDING_REQS - nr_pending_reqs(vif)) 
> requests to the ring anyway. At least in practice, as MAX_PENDING_REQS = 
> RING_SIZE(...). If we could bind the two to each other directly, we can 
> get rid of this unnecessary checking, and whoever release the used 
> pending slots should not poke the NAPI instance, because the frontend 
> will call an interrupt if it sends a new packet anyway.

The frontend tries to do a certain amount of event elision, using the
event pointer etc, so you'd need to be careful since getting that wrong
will either stall or result in more interrupt than necessary, which has
a back impact on batching. But perhaps it could work.

In any case, it seems like doing the poke from the callback is wrong and
we should revert the patches which DaveM already applied and revisit
this aspect of things, do you agree?

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 16:42                 ` Paul Durrant
@ 2014-03-12 19:06                   ` Zoltan Kiss
  2014-03-13  9:26                     ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 19:06 UTC (permalink / raw)
  To: Paul Durrant, Wei Liu; +Cc: xen-devel, Tim (Xen.org), Ian Campbell

On 12/03/14 16:42, Paul Durrant wrote:
>> -----Original Message-----
>> From: Zoltan Kiss
>> Sent: 12 March 2014 16:14
>> To: Wei Liu; Paul Durrant
>> Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org)
>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
>>
>> On 12/03/14 15:42, Wei Liu wrote:
>>> On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
>>> [...]
>>>>>> Actually ancient memory tells me that, unfortunately, netback's
>> backend-
>>>>>> frontend GSO protocol is broken in this way... it requires one more
>>>>> response slot than the number of requests it consumes (for the extra
>>>>> metadata), which means that if the frontend keeps the ring full you can
>> get
>>>>> overflow. It's a bit of a tangent though, because that code doesn't use
>> this
>>>>> macro (or in fact check the ring has space in any way IIRC). The prefix
>> variant
>>>>> of the protocol is ok though.
>>>>>
>>>>> I think it's not: it consumes a request for the metadata, and when the
>>>>> packet is grant copied to the guest, it creates a response for that slot
>>>>> as well.
>>>>
>>>> As explained verbally, it doesn't consume a request for the 'extra' info.
>> Let me elaborate here for the benefit of the list...
>>>>
>>>> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is
>> consumed for the header along with a meta slot which is used to hold the
>> GSO data. Later on in xenvif_rx_action() the code calls make_rx_response()
>> for the header, but then *before* moving onto the next meta slot it makes
>> an 'extra' response for the GSO metadata. So - one meta slot - one request
>> consumed, but two responses produced.
>>>> So this mechanism totally relies on the netfront driver not completely
>> filling the shared ring. If it ever does, you'll get overflow.
>>>>
>>>
>>> (... which reminds me of the heisenbug Sander is seeing.)
>>>
>>> But do we not check for there's enough space in the ring before
>>> procceeding?
>>
>> Thinking further what Paul said, we might be actually OK. At the end of
>> xenvif_gop_frag_copy there is this:
>>
>> if (*head && ((1 << gso_type) & vif->gso_mask))
>> 	vif->rx.req_cons++;
>>
>> So although netback doesn't call RING_GET_REQUEST to consume the
>> request, it does so by increasing req_cons. And netfront automagically
>> detect that in xennet_get_extras, and calls xennet_move_rx_slot to move
>> that buffer to req_prod_pvt. Same happens when the backend send a
>> response that it couldn't use the slot and it doesn't contain any data.
>>
>
> Still leaves the issue that the frontend has no idea which request id was consumed to create the 'hole' for the extra info, unless it has its own mapping of ring idx to id - i.e. the id in the xen_netif_rx_response struct is basically useless.

It only works because netback puts the responses here to the place of 
the original requests. If you check, frontend doesn't even check what's 
the id in the response, it only use it in an error printout. Seems like 
it is a bit fragile ...

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 17:43                 ` Ian Campbell
@ 2014-03-12 21:10                   ` Zoltan Kiss
  2014-03-13 10:04                     ` Ian Campbell
  0 siblings, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-12 21:10 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan

On 12/03/14 17:43, Ian Campbell wrote:
> On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote:
>> On 12/03/14 15:37, Ian Campbell wrote:
>>> On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote:
>>>> On 12/03/14 14:30, Ian Campbell wrote:
>>>>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
>>>>>> On 12/03/14 10:28, Ian Campbell wrote:
>>>>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
>>>>>>>> On 11/03/14 15:44, Ian Campbell wrote:
>>>>>>>
>>>>>>>>> Is it the case that this macro considers a request to be unconsumed if
>>>>>>>>> the *response* to a request is outstanding as well as if the request
>>>>>>>>> itself is still on the ring?
>>>>>>>> I don't think that would make sense. I think everywhere where this macro
>>>>>>>> is called the caller is not interested in pending request (pending means
>>>>>>>> consumed but not responded)
>>>>>>>
>>>>>>> It might be interested in such pending requests in some of the
>>>>>>> pathological cases I allude to in the next paragraph though?
>>>>>>>
>>>>>>> For example if the ring has unconsumed requests but there are no slots
>>>>>>> free for a response, it would be better to treat it as no unconsumed
>>>>>>> requests until space opens up for a response, otherwise something else
>>>>>>> just has to abort the processing of the request when it notices the lack
>>>>>>> of space.
>>>>>>>
>>>>>>> (I'm totally speculating here BTW, I don't have any concrete idea why
>>>>>>> things are done this way...)
>>>>>>>
>>>>>>>
>>>>>>>>> I wonder if this apparently weird construction is due to pathological
>>>>>>>>> cases when one or the other end is not picking up requests/responses?
>>>>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
>>>>>>>>> storm when the ring it is full of one or the other or something along
>>>>>>>>> those lines?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Also, let me quote again my example about when rsp makes sense:
>>>>>>
>>>>>> "To clarify what does this do, let me show an example:
>>>>>> req_prod = 253
>>>>>> req_cons = 256
>>>>>> rsp_prod_pvt = 0
>>>>>
>>>>> I think to make sense of this I need to see the sequence of reads/writes
>>>>> from both parties in a sensible ordering which would result in reads
>>>>> showing the above. i.e. a demonstration of the race not just an
>>>>> assertion that if the values are read as is things makes sense.
>>>>
>>>> Let me extend it:
>>>>
>>>> - callback reads req_prod = 253
>>>
>>> callback == backend? Which context is this code running in? Which part
>>> of the system is the callback logically part of?
>> Yes, it is part of the backend, the function which handles when we can
>> release a slot back. With grant copy we don't have such thing, but with
>> mapping xenvif_zerocopy_callback does this (or in classic kernel, it had
>> a different name, but we called it page destructor). It can run from any
>> context, it depends on who calls kfree_skb.
>
> I think this is the root of the problem. The pv protocols really assume
> one entity on either end is moving/updating the ring pointers. If you
There is only _one_ entity moving/updating the ring pointers. Everyone 
else is just reading it. The callback, xenvif_tx_interrupt, 
xenvif_check_rx_xenvif.

> have two entities on one end both doing this then you need to make sure
> you have appropriate locking in place.
>
> In the classic kernel wasn't the dealloc ring actually processed from
> the tx/rx action -- i.e. it was forced back into a single context at the
> point where the ring was actually touched.
>
> Aren't you batching things up in a similar way? Perhaps you just need to
> fix where you are draining those batches to be properly locked against
> other updaters of the ring?
>
>>>> - req is UINT_MAX-3 therefore, but actually there isn't any request to
>>>> consume, it should be 0
>>>
>>> Only if something is ignoring the fact that it has seen req_prod == 256.
>>>
>>> If callback is some separate entity to backend within dom0 then what you
>>> have here is an internal inconsistency in dom0 AFAICT. IOW it seems like
>>> you are missing some synchronisation and/or have two different entities
>>> acting as backend.
>> The callback only needs to know whether it should poke the NAPI instance
>> or not. There is this special case, if there are still a few unconsumed
>> request, but the ring is nearly full of pending requests and
>> xenvif_tx_pending_slots_available says NAPI should bail out, we have to
>> schedule it back once we have enough free pending slots again.
>> As I said in an another mail of this thread, this poking happens in the
>> callback, but actually it should be moved to the dealloc thread.
>
> Right, I think that's what I was trying to say above. I missed that
> other mail I'm afraid (or didn't grok it).
>
>>   However
>> thinking further, this whole xenvif_tx_pending_slots_available stuff
>> seems to be unnecessary to me:
>> It supposed to check if we have enough slot in the pending ring for the
>> maximum number of possible slots, otherwise the backend bails out. It
>> does so because if the backend start to consume the requests from the
>> shared ring but runs out free slots in the pending ring, we are in
>> trouble. But the pending ring supposed to have the same amount of slots
>> as the shared one. And a consumed but not responded slot from the shared
>> ring means a used slot in the pending ring. Therefore the frontend won't
>> be able to push more than (MAX_PENDING_REQS - nr_pending_reqs(vif))
>> requests to the ring anyway. At least in practice, as MAX_PENDING_REQS =
>> RING_SIZE(...). If we could bind the two to each other directly, we can
>> get rid of this unnecessary checking, and whoever release the used
>> pending slots should not poke the NAPI instance, because the frontend
>> will call an interrupt if it sends a new packet anyway.
>
> The frontend tries to do a certain amount of event elision, using the
> event pointer etc, so you'd need to be careful since getting that wrong
> will either stall or result in more interrupt than necessary, which has
> a back impact on batching. But perhaps it could work.
I've actually tried that with a simple iperf throughput test, and it 
works. I'll run it through XenRT later to see if it goes wrong somewhere.

> In any case, it seems like doing the poke from the callback is wrong and
> we should revert the patches which DaveM already applied and revisit
> this aspect of things, do you agree?
I've just sent in a patch to fix that. I think the reason why I haven't 
seen any issue is that the in this situation there are plenty of 
outstanding packets, and all of their callback will schedule NAPI again. 
Chances are quite small that the dealloc thread couldn't release enough 
slots in the meantime.

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 19:06                   ` Zoltan Kiss
@ 2014-03-13  9:26                     ` Paul Durrant
  2014-03-13 10:02                       ` Ian Campbell
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-13  9:26 UTC (permalink / raw)
  To: Zoltan Kiss, Wei Liu; +Cc: xen-devel, Tim (Xen.org), Ian Campbell

> -----Original Message-----
> From: Zoltan Kiss
> Sent: 12 March 2014 19:06
> To: Paul Durrant; Wei Liu
> Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On 12/03/14 16:42, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Zoltan Kiss
> >> Sent: 12 March 2014 16:14
> >> To: Wei Liu; Paul Durrant
> >> Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org)
> >> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> >>
> >> On 12/03/14 15:42, Wei Liu wrote:
> >>> On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote:
> >>> [...]
> >>>>>> Actually ancient memory tells me that, unfortunately, netback's
> >> backend-
> >>>>>> frontend GSO protocol is broken in this way... it requires one more
> >>>>> response slot than the number of requests it consumes (for the extra
> >>>>> metadata), which means that if the frontend keeps the ring full you
> can
> >> get
> >>>>> overflow. It's a bit of a tangent though, because that code doesn't use
> >> this
> >>>>> macro (or in fact check the ring has space in any way IIRC). The prefix
> >> variant
> >>>>> of the protocol is ok though.
> >>>>>
> >>>>> I think it's not: it consumes a request for the metadata, and when the
> >>>>> packet is grant copied to the guest, it creates a response for that slot
> >>>>> as well.
> >>>>
> >>>> As explained verbally, it doesn't consume a request for the 'extra' info.
> >> Let me elaborate here for the benefit of the list...
> >>>>
> >>>> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is
> >> consumed for the header along with a meta slot which is used to hold the
> >> GSO data. Later on in xenvif_rx_action() the code calls
> make_rx_response()
> >> for the header, but then *before* moving onto the next meta slot it
> makes
> >> an 'extra' response for the GSO metadata. So - one meta slot - one
> request
> >> consumed, but two responses produced.
> >>>> So this mechanism totally relies on the netfront driver not completely
> >> filling the shared ring. If it ever does, you'll get overflow.
> >>>>
> >>>
> >>> (... which reminds me of the heisenbug Sander is seeing.)
> >>>
> >>> But do we not check for there's enough space in the ring before
> >>> procceeding?
> >>
> >> Thinking further what Paul said, we might be actually OK. At the end of
> >> xenvif_gop_frag_copy there is this:
> >>
> >> if (*head && ((1 << gso_type) & vif->gso_mask))
> >> 	vif->rx.req_cons++;
> >>
> >> So although netback doesn't call RING_GET_REQUEST to consume the
> >> request, it does so by increasing req_cons. And netfront automagically
> >> detect that in xennet_get_extras, and calls xennet_move_rx_slot to
> move
> >> that buffer to req_prod_pvt. Same happens when the backend send a
> >> response that it couldn't use the slot and it doesn't contain any data.
> >>
> >
> > Still leaves the issue that the frontend has no idea which request id was
> consumed to create the 'hole' for the extra info, unless it has its own
> mapping of ring idx to id - i.e. the id in the xen_netif_rx_response struct is
> basically useless.
> 
> It only works because netback puts the responses here to the place of
> the original requests. If you check, frontend doesn't even check what's
> the id in the response, it only use it in an error printout. Seems like
> it is a bit fragile ...
> 

... and completely non-obvious. Perhaps we should just bin the id field from the response struct in the canonical headers to avoid new frontends believing that it is useful, and document that they should use their own mapping of idx to id, and guarantee that netback will always respond in order on the rx ring.

  Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-13  9:26                     ` Paul Durrant
@ 2014-03-13 10:02                       ` Ian Campbell
  2014-03-13 10:58                         ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-13 10:02 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Wei Liu, Zoltan Kiss, Tim (Xen.org)

On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote:
> guarantee that netback will always respond in order on the rx ring.

I don't believe this is guaranteed byt the protocol, even if it happens
to work out that way for the Linux netback implementation.

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-12 21:10                   ` Zoltan Kiss
@ 2014-03-13 10:04                     ` Ian Campbell
  0 siblings, 0 replies; 53+ messages in thread
From: Ian Campbell @ 2014-03-13 10:04 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan

On Wed, 2014-03-12 at 21:10 +0000, Zoltan Kiss wrote:
> On 12/03/14 17:43, Ian Campbell wrote:
> > On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote:
> >> On 12/03/14 15:37, Ian Campbell wrote:
> >>> On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote:
> >>>> On 12/03/14 14:30, Ian Campbell wrote:
> >>>>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
> >>>>>> On 12/03/14 10:28, Ian Campbell wrote:
> >>>>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> >>>>>>>> On 11/03/14 15:44, Ian Campbell wrote:
> >>>>>>>
> >>>>>>>>> Is it the case that this macro considers a request to be unconsumed if
> >>>>>>>>> the *response* to a request is outstanding as well as if the request
> >>>>>>>>> itself is still on the ring?
> >>>>>>>> I don't think that would make sense. I think everywhere where this macro
> >>>>>>>> is called the caller is not interested in pending request (pending means
> >>>>>>>> consumed but not responded)
> >>>>>>>
> >>>>>>> It might be interested in such pending requests in some of the
> >>>>>>> pathological cases I allude to in the next paragraph though?
> >>>>>>>
> >>>>>>> For example if the ring has unconsumed requests but there are no slots
> >>>>>>> free for a response, it would be better to treat it as no unconsumed
> >>>>>>> requests until space opens up for a response, otherwise something else
> >>>>>>> just has to abort the processing of the request when it notices the lack
> >>>>>>> of space.
> >>>>>>>
> >>>>>>> (I'm totally speculating here BTW, I don't have any concrete idea why
> >>>>>>> things are done this way...)
> >>>>>>>
> >>>>>>>
> >>>>>>>>> I wonder if this apparently weird construction is due to pathological
> >>>>>>>>> cases when one or the other end is not picking up requests/responses?
> >>>>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt
> >>>>>>>>> storm when the ring it is full of one or the other or something along
> >>>>>>>>> those lines?
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> Also, let me quote again my example about when rsp makes sense:
> >>>>>>
> >>>>>> "To clarify what does this do, let me show an example:
> >>>>>> req_prod = 253
> >>>>>> req_cons = 256
> >>>>>> rsp_prod_pvt = 0
> >>>>>
> >>>>> I think to make sense of this I need to see the sequence of reads/writes
> >>>>> from both parties in a sensible ordering which would result in reads
> >>>>> showing the above. i.e. a demonstration of the race not just an
> >>>>> assertion that if the values are read as is things makes sense.
> >>>>
> >>>> Let me extend it:
> >>>>
> >>>> - callback reads req_prod = 253
> >>>
> >>> callback == backend? Which context is this code running in? Which part
> >>> of the system is the callback logically part of?
> >> Yes, it is part of the backend, the function which handles when we can
> >> release a slot back. With grant copy we don't have such thing, but with
> >> mapping xenvif_zerocopy_callback does this (or in classic kernel, it had
> >> a different name, but we called it page destructor). It can run from any
> >> context, it depends on who calls kfree_skb.
> >
> > I think this is the root of the problem. The pv protocols really assume
> > one entity on either end is moving/updating the ring pointers. If you
> There is only _one_ entity moving/updating the ring pointers. Everyone 
> else is just reading it. The callback, xenvif_tx_interrupt, 
> xenvif_check_rx_xenvif.

Perhaps I should have said "accessing" rather than "moving/updating".
Even if only one entity is reading the ring if two entities are reading
at least one of them is going to see inconsistencies.

> > In any case, it seems like doing the poke from the callback is wrong and
> > we should revert the patches which DaveM already applied and revisit
> > this aspect of things, do you agree?
> I've just sent in a patch to fix that.

Thanks, I'll take a look.

>  I think the reason why I haven't 
> seen any issue is that the in this situation there are plenty of 
> outstanding packets, and all of their callback will schedule NAPI again. 
> Chances are quite small that the dealloc thread couldn't release enough 
> slots in the meantime.

Seems plausible enough. That means that the issue would be seen rarely
on quiet systems, which would be a nightmare to diagnose I think.

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-13 10:02                       ` Ian Campbell
@ 2014-03-13 10:58                         ` Paul Durrant
  2014-03-13 12:19                           ` Ian Campbell
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-13 10:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Zoltan Kiss, Tim (Xen.org)

> -----Original Message-----
> From: Ian Campbell
> Sent: 13 March 2014 10:03
> To: Paul Durrant
> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote:
> > guarantee that netback will always respond in order on the rx ring.
> 
> I don't believe this is guaranteed byt the protocol, even if it happens
> to work out that way for the Linux netback implementation.
> 

Indeed it's not guaranteed by the protocol but it's the only way that (non-prefix) GSO can possibly work, which is why I think something really needs to be documented.

  Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-13 10:58                         ` Paul Durrant
@ 2014-03-13 12:19                           ` Ian Campbell
  2014-03-13 12:28                             ` Zoltan Kiss
  0 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2014-03-13 12:19 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Wei Liu, Zoltan Kiss, Tim (Xen.org)

On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 13 March 2014 10:03
> > To: Paul Durrant
> > Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org)
> > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> > 
> > On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote:
> > > guarantee that netback will always respond in order on the rx ring.
> > 
> > I don't believe this is guaranteed byt the protocol, even if it happens
> > to work out that way for the Linux netback implementation.
> > 
> 
> Indeed it's not guaranteed by the protocol but it's the only way that
> (non-prefix) GSO can possibly work, which is why I think something
> really needs to be documented.

There are possibly ordering constraints on the completion of the gso
info slot and the rest but I don't think there are any such restrictions
on the fragment slots themselves, are there?

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-13 12:19                           ` Ian Campbell
@ 2014-03-13 12:28                             ` Zoltan Kiss
  2014-03-13 12:29                               ` Paul Durrant
  2014-03-13 12:44                               ` Ian Campbell
  0 siblings, 2 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-13 12:28 UTC (permalink / raw)
  To: Ian Campbell, Paul Durrant; +Cc: xen-devel, Wei Liu, Tim (Xen.org)

On 13/03/14 12:19, Ian Campbell wrote:
> On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Ian Campbell
>>> Sent: 13 March 2014 10:03
>>> To: Paul Durrant
>>> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org)
>>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
>>>
>>> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote:
>>>> guarantee that netback will always respond in order on the rx ring.
>>>
>>> I don't believe this is guaranteed byt the protocol, even if it happens
>>> to work out that way for the Linux netback implementation.
>>>
>>
>> Indeed it's not guaranteed by the protocol but it's the only way that
>> (non-prefix) GSO can possibly work, which is why I think something
>> really needs to be documented.
>
> There are possibly ordering constraints on the completion of the gso
> info slot and the rest but I don't think there are any such restrictions
> on the fragment slots themselves, are there?

Well, the Linux frontend doesn't take the id from the response, it just 
assume that the response is in the same slot as the request.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-13 12:28                             ` Zoltan Kiss
@ 2014-03-13 12:29                               ` Paul Durrant
  2014-03-13 12:44                               ` Ian Campbell
  1 sibling, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2014-03-13 12:29 UTC (permalink / raw)
  To: Zoltan Kiss, Ian Campbell; +Cc: xen-devel, Wei Liu, Tim (Xen.org)

> -----Original Message-----
> From: Zoltan Kiss
> Sent: 13 March 2014 12:29
> To: Ian Campbell; Paul Durrant
> Cc: Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org)
> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> 
> On 13/03/14 12:19, Ian Campbell wrote:
> > On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote:
> >>> -----Original Message-----
> >>> From: Ian Campbell
> >>> Sent: 13 March 2014 10:03
> >>> To: Paul Durrant
> >>> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org)
> >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS
> oddness
> >>>
> >>> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote:
> >>>> guarantee that netback will always respond in order on the rx ring.
> >>>
> >>> I don't believe this is guaranteed byt the protocol, even if it happens
> >>> to work out that way for the Linux netback implementation.
> >>>
> >>
> >> Indeed it's not guaranteed by the protocol but it's the only way that
> >> (non-prefix) GSO can possibly work, which is why I think something
> >> really needs to be documented.
> >
> > There are possibly ordering constraints on the completion of the gso
> > info slot and the rest but I don't think there are any such restrictions
> > on the fragment slots themselves, are there?
> 
> Well, the Linux frontend doesn't take the id from the response, it just
> assume that the response is in the same slot as the request.

... so for compatibilities sake, the ring had better complete in order!

  Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RING_HAS_UNCONSUMED_REQUESTS oddness
  2014-03-13 12:28                             ` Zoltan Kiss
  2014-03-13 12:29                               ` Paul Durrant
@ 2014-03-13 12:44                               ` Ian Campbell
  1 sibling, 0 replies; 53+ messages in thread
From: Ian Campbell @ 2014-03-13 12:44 UTC (permalink / raw)
  To: Zoltan Kiss; +Cc: xen-devel, Paul Durrant, Wei Liu, Tim (Xen.org)

On Thu, 2014-03-13 at 12:28 +0000, Zoltan Kiss wrote:
> On 13/03/14 12:19, Ian Campbell wrote:
> > On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote:
> >>> -----Original Message-----
> >>> From: Ian Campbell
> >>> Sent: 13 March 2014 10:03
> >>> To: Paul Durrant
> >>> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org)
> >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
> >>>
> >>> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote:
> >>>> guarantee that netback will always respond in order on the rx ring.
> >>>
> >>> I don't believe this is guaranteed byt the protocol, even if it happens
> >>> to work out that way for the Linux netback implementation.
> >>>
> >>
> >> Indeed it's not guaranteed by the protocol but it's the only way that
> >> (non-prefix) GSO can possibly work, which is why I think something
> >> really needs to be documented.
> >
> > There are possibly ordering constraints on the completion of the gso
> > info slot and the rest but I don't think there are any such restrictions
> > on the fragment slots themselves, are there?
> 
> Well, the Linux frontend doesn't take the id from the response, it just 
> assume that the response is in the same slot as the request.

Oops, I was thinking of guest tx not rx, sorry. Tx can complete in any
order (txrsp->id is the index into tx_skbs[]) but not rx apparently.

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-06 17:30     ` Tim Deegan
  2014-03-06 21:39       ` Zoltan Kiss
@ 2014-03-13 16:38       ` Tim Deegan
  2014-03-22 14:18         ` Zoltan Kiss
  2014-04-03  9:38         ` Tim Deegan
  1 sibling, 2 replies; 53+ messages in thread
From: Tim Deegan @ 2014-03-13 16:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Zoltan Kiss, Stefano Stabellini,
	freebsd-xen, Alan Somers, Manuel Bouyer, David Vrabel,
	Jan Beulich, Keir Fraser, Boris Ostrovsky, Roger Pau Monné,
	Paul Durrant, John Suykerbuyk

[RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()

Remove a useless (though harmless) extra check.

Code inspection
---------------

Ian Campbell and I looked at this today and can't find any case where
the existing 'rsp' test would be useful.  In today's ring.h,

#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
    unsigned int rsp = RING_SIZE(_r) -                                  \
        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
    req < rsp ? req : rsp;                                              \
})

'req' is the number of requests that the F/E has published and we have
not yet consumed.  'rsp' is an odd fish, everything _except_ what we
might call requests locally in flight, that is requests we've consumed
but not produced a response for.  We could only think of two things we
might be trying to test for here: 

(a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the 
    way these rings normally work.  In that case, as Zoltan pointed
    out, rsp must be >= req, for a well-behaved frontend: otherwise
    we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that
    req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun
    the ring.  I don't think this even usefully protects against a
    malicious/buggy frontend.

(b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm
    told might happen in linux netback?  In that case, we might plausibly
    want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of 
    (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's
    not what this does.  Instead, rsp will underflow to 
    RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. 

So in both of those cases, 'rsp' is always >= 'req' and is useless.


Code archaeology
----------------

In the original ring.h, the test was a boolean, as the name implies.
Cset 99812f40 ([NET] back: Add SG support) extended it to a count in
the obvious manner.  Looking at the original (0b788798):

#define RING_HAS_UNCONSUMED_REQUESTS(_p, _r)                            \
   ( ((_r)->req_cons != (_r)->sring->req_prod ) &&                      \
     (((_r)->req_cons - (_r)->rsp_prod_pvt) !=                          \
      SRING_SIZE((_p), (_r)->sring)) )

it seems to be testing for 'there are unconsumed requests _and_ we
have not got a full ring of consumed-but-not-responded requests'.
This also seems to be effectively harmless: if there are unconsumed
requests, we can't possibly have a ring full of c-b-n-r requests
unless the frontend's request producer has overrun its response
consumer.

That code was introduced with no callers, but seems to have been
copied from the existing netback code, which switched to use it in
b242b208.  All later users of it in the xenolinux trees are either
brand new code or replacing a simple (req_cons - req_prod) test.  The
netback code goes back to Keir's original implementation, where it
looked like this, in inverted form (1de448f4):

        /* Work to do? */
        i = netif->tx_req_cons;
        if ( (i == netif->tx->req_prod) ||
             ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) )
        {
            netif_put(netif);
            continue;
        }

Again, I don't think this test is useful: it's only interesting if
netfront has overrun the ring, but it doesn't reliably detect that.

So let's remove it. 

Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: Tim Deegan <tim@xen.org>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Alan Somers <alans@spectralogic.com>
Cc: John Suykerbuyk <johns@spectralogic.com>
Cc: Manuel Bouyer <bouyer@NetBSD.org>
Cc: Paul Durrant <Paul.Durrant@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

---
RFC because
- I might well be missing something here; and
- this is a change to a public header that could be in use in any
  number of implementations; since the extra test is very cheap, and
  seems to be harmess, we might consider leaving it in place.
- I haven't tested it at all yet.

CC'ing a whole bunch of people whose code might be using this macro.

diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h
index 73e13d7..d6e23f1 100644
--- a/xen/include/public/io/ring.h
+++ b/xen/include/public/io/ring.h
@@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t
 #define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
     ((_r)->sring->rsp_prod - (_r)->rsp_cons)
 
-#ifdef __GNUC__
-#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
-    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
-    unsigned int rsp = RING_SIZE(_r) -                                  \
-        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
-    req < rsp ? req : rsp;                                              \
-})
-#else
-/* Same as above, but without the nice GCC ({ ... }) syntax. */
 #define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
-    ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
-      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
-     ((_r)->sring->req_prod - (_r)->req_cons) :                         \
-     (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
-#endif
+    ((_r)->sring->req_prod - (_r)->req_cons)
 
 /* Direct access to individual ring elements, by index. */
 #define RING_GET_REQUEST(_r, _idx)                                      \

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-13 16:38       ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan
@ 2014-03-22 14:18         ` Zoltan Kiss
  2014-03-22 17:14           ` Tim Deegan
  2014-04-03  9:38         ` Tim Deegan
  1 sibling, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-22 14:18 UTC (permalink / raw)
  To: Tim Deegan, xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, freebsd-xen,
	Alan Somers, Manuel Bouyer, David Vrabel, Jan Beulich,
	Keir Fraser, Boris Ostrovsky, Roger Pau Monné,
	Paul Durrant, John Suykerbuyk

Hi,

I think I might have an explanation why do we need this, see this mailing:

https://lkml.org/lkml/2014/3/20/710
https://lkml.org/lkml/2014/3/21/111
https://lkml.org/lkml/2014/3/21/390

Zoli


On 13/03/14 16:38, Tim Deegan wrote:
> [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
>
> Remove a useless (though harmless) extra check.
>
> Code inspection
> ---------------
>
> Ian Campbell and I looked at this today and can't find any case where
> the existing 'rsp' test would be useful.  In today's ring.h,
>
> #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
>      unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
>      unsigned int rsp = RING_SIZE(_r) -                                  \
>          ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
>      req < rsp ? req : rsp;                                              \
> })
>
> 'req' is the number of requests that the F/E has published and we have
> not yet consumed.  'rsp' is an odd fish, everything _except_ what we
> might call requests locally in flight, that is requests we've consumed
> but not produced a response for.  We could only think of two things we
> might be trying to test for here:
>
> (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the
>      way these rings normally work.  In that case, as Zoltan pointed
>      out, rsp must be >= req, for a well-behaved frontend: otherwise
>      we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that
>      req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun
>      the ring.  I don't think this even usefully protects against a
>      malicious/buggy frontend.
>
> (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm
>      told might happen in linux netback?  In that case, we might plausibly
>      want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of
>      (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's
>      not what this does.  Instead, rsp will underflow to
>      RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req.
>
> So in both of those cases, 'rsp' is always >= 'req' and is useless.
>
>
> Code archaeology
> ----------------
>
> In the original ring.h, the test was a boolean, as the name implies.
> Cset 99812f40 ([NET] back: Add SG support) extended it to a count in
> the obvious manner.  Looking at the original (0b788798):
>
> #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r)                            \
>     ( ((_r)->req_cons != (_r)->sring->req_prod ) &&                      \
>       (((_r)->req_cons - (_r)->rsp_prod_pvt) !=                          \
>        SRING_SIZE((_p), (_r)->sring)) )
>
> it seems to be testing for 'there are unconsumed requests _and_ we
> have not got a full ring of consumed-but-not-responded requests'.
> This also seems to be effectively harmless: if there are unconsumed
> requests, we can't possibly have a ring full of c-b-n-r requests
> unless the frontend's request producer has overrun its response
> consumer.
>
> That code was introduced with no callers, but seems to have been
> copied from the existing netback code, which switched to use it in
> b242b208.  All later users of it in the xenolinux trees are either
> brand new code or replacing a simple (req_cons - req_prod) test.  The
> netback code goes back to Keir's original implementation, where it
> looked like this, in inverted form (1de448f4):
>
>          /* Work to do? */
>          i = netif->tx_req_cons;
>          if ( (i == netif->tx->req_prod) ||
>               ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) )
>          {
>              netif_put(netif);
>              continue;
>          }
>
> Again, I don't think this test is useful: it's only interesting if
> netfront has overrun the ring, but it doesn't reliably detect that.
>
> So let's remove it.
>
> Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Signed-off-by: Tim Deegan <tim@xen.org>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Ian Campbell <Ian.Campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Alan Somers <alans@spectralogic.com>
> Cc: John Suykerbuyk <johns@spectralogic.com>
> Cc: Manuel Bouyer <bouyer@NetBSD.org>
> Cc: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> ---
> RFC because
> - I might well be missing something here; and
> - this is a change to a public header that could be in use in any
>    number of implementations; since the extra test is very cheap, and
>    seems to be harmess, we might consider leaving it in place.
> - I haven't tested it at all yet.
>
> CC'ing a whole bunch of people whose code might be using this macro.
>
> diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h
> index 73e13d7..d6e23f1 100644
> --- a/xen/include/public/io/ring.h
> +++ b/xen/include/public/io/ring.h
> @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t
>   #define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
>       ((_r)->sring->rsp_prod - (_r)->rsp_cons)
>
> -#ifdef __GNUC__
> -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
> -    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
> -    unsigned int rsp = RING_SIZE(_r) -                                  \
> -        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
> -    req < rsp ? req : rsp;                                              \
> -})
> -#else
> -/* Same as above, but without the nice GCC ({ ... }) syntax. */
>   #define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
> -    ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
> -      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
> -     ((_r)->sring->req_prod - (_r)->req_cons) :                         \
> -     (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
> -#endif
> +    ((_r)->sring->req_prod - (_r)->req_cons)
>
>   /* Direct access to individual ring elements, by index. */
>   #define RING_GET_REQUEST(_r, _idx)                                      \
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-22 14:18         ` Zoltan Kiss
@ 2014-03-22 17:14           ` Tim Deegan
  2014-03-24  7:38             ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Tim Deegan @ 2014-03-22 17:14 UTC (permalink / raw)
  To: Zoltan Kiss
  Cc: Wei Liu, Keir Fraser, Ian Campbell, Stefano Stabellini,
	freebsd-xen, Alan Somers, Manuel Bouyer, David Vrabel,
	Jan Beulich, xen-devel, Boris Ostrovsky, Roger Pau Monné,
	Paul Durrant, John Suykerbuyk

At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
> Hi,
> 
> I think I might have an explanation why do we need this, see this mailing:
> 
> https://lkml.org/lkml/2014/3/20/710
> https://lkml.org/lkml/2014/3/21/111
> https://lkml.org/lkml/2014/3/21/390

Quoting from the third of these:

| But consuming overrunning requests after rsp_prod_pvt is a problem:
| - NAPI instance races with dealloc thread over the slots. The first 
| reads them as requests, the second writes them as responses
| - the NAPI instance overwrites used pending slots as well, so skb frag 
| release go wrong etc.

OK, so the backend needs to be careful not to follow the frontend into
overrun, not because of the ring itself being corrupted but because it
will mess up the backend's internal bookkeeping.

Fair enough -- I think this at least constitutes a reason why it might
be dangerous to remove the check from the public header.  I withdraw
the patch.  I'll make another patch (with a smaller distribution
list!) to add an appropriate comment.

Thanks!

Tim.

> On 13/03/14 16:38, Tim Deegan wrote:
> > [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
> >
> > Remove a useless (though harmless) extra check.
> >
> > Code inspection
> > ---------------
> >
> > Ian Campbell and I looked at this today and can't find any case where
> > the existing 'rsp' test would be useful.  In today's ring.h,
> >
> > #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
> >      unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
> >      unsigned int rsp = RING_SIZE(_r) -                                  \
> >          ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
> >      req < rsp ? req : rsp;                                              \
> > })
> >
> > 'req' is the number of requests that the F/E has published and we have
> > not yet consumed.  'rsp' is an odd fish, everything _except_ what we
> > might call requests locally in flight, that is requests we've consumed
> > but not produced a response for.  We could only think of two things we
> > might be trying to test for here:
> >
> > (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the
> >      way these rings normally work.  In that case, as Zoltan pointed
> >      out, rsp must be >= req, for a well-behaved frontend: otherwise
> >      we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that
> >      req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun
> >      the ring.  I don't think this even usefully protects against a
> >      malicious/buggy frontend.
> >
> > (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm
> >      told might happen in linux netback?  In that case, we might plausibly
> >      want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of
> >      (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's
> >      not what this does.  Instead, rsp will underflow to
> >      RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req.
> >
> > So in both of those cases, 'rsp' is always >= 'req' and is useless.
> >
> >
> > Code archaeology
> > ----------------
> >
> > In the original ring.h, the test was a boolean, as the name implies.
> > Cset 99812f40 ([NET] back: Add SG support) extended it to a count in
> > the obvious manner.  Looking at the original (0b788798):
> >
> > #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r)                            \
> >     ( ((_r)->req_cons != (_r)->sring->req_prod ) &&                      \
> >       (((_r)->req_cons - (_r)->rsp_prod_pvt) !=                          \
> >        SRING_SIZE((_p), (_r)->sring)) )
> >
> > it seems to be testing for 'there are unconsumed requests _and_ we
> > have not got a full ring of consumed-but-not-responded requests'.
> > This also seems to be effectively harmless: if there are unconsumed
> > requests, we can't possibly have a ring full of c-b-n-r requests
> > unless the frontend's request producer has overrun its response
> > consumer.
> >
> > That code was introduced with no callers, but seems to have been
> > copied from the existing netback code, which switched to use it in
> > b242b208.  All later users of it in the xenolinux trees are either
> > brand new code or replacing a simple (req_cons - req_prod) test.  The
> > netback code goes back to Keir's original implementation, where it
> > looked like this, in inverted form (1de448f4):
> >
> >          /* Work to do? */
> >          i = netif->tx_req_cons;
> >          if ( (i == netif->tx->req_prod) ||
> >               ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) )
> >          {
> >              netif_put(netif);
> >              continue;
> >          }
> >
> > Again, I don't think this test is useful: it's only interesting if
> > netfront has overrun the ring, but it doesn't reliably detect that.
> >
> > So let's remove it.
> >
> > Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> > Signed-off-by: Tim Deegan <tim@xen.org>
> > Cc: Keir Fraser <keir@xen.org>
> > Cc: Jan Beulich <JBeulich@suse.com>
> > Cc: Ian Campbell <Ian.Campbell@citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Cc: David Vrabel <david.vrabel@citrix.com>
> > Cc: Alan Somers <alans@spectralogic.com>
> > Cc: John Suykerbuyk <johns@spectralogic.com>
> > Cc: Manuel Bouyer <bouyer@NetBSD.org>
> > Cc: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >
> > ---
> > RFC because
> > - I might well be missing something here; and
> > - this is a change to a public header that could be in use in any
> >    number of implementations; since the extra test is very cheap, and
> >    seems to be harmess, we might consider leaving it in place.
> > - I haven't tested it at all yet.
> >
> > CC'ing a whole bunch of people whose code might be using this macro.
> >
> > diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h
> > index 73e13d7..d6e23f1 100644
> > --- a/xen/include/public/io/ring.h
> > +++ b/xen/include/public/io/ring.h
> > @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t
> >   #define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
> >       ((_r)->sring->rsp_prod - (_r)->rsp_cons)
> >
> > -#ifdef __GNUC__
> > -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
> > -    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
> > -    unsigned int rsp = RING_SIZE(_r) -                                  \
> > -        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
> > -    req < rsp ? req : rsp;                                              \
> > -})
> > -#else
> > -/* Same as above, but without the nice GCC ({ ... }) syntax. */
> >   #define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
> > -    ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
> > -      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
> > -     ((_r)->sring->req_prod - (_r)->req_cons) :                         \
> > -     (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
> > -#endif
> > +    ((_r)->sring->req_prod - (_r)->req_cons)
> >
> >   /* Direct access to individual ring elements, by index. */
> >   #define RING_GET_REQUEST(_r, _idx)                                      \
> >
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-22 17:14           ` Tim Deegan
@ 2014-03-24  7:38             ` Jan Beulich
  2014-03-24  9:39               ` Paul Durrant
  2014-03-24 12:23               ` Zoltan Kiss
  0 siblings, 2 replies; 53+ messages in thread
From: Jan Beulich @ 2014-03-24  7:38 UTC (permalink / raw)
  To: Zoltan Kiss, Tim Deegan
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Keir Fraser,
	freebsd-xen, Alan Somers, Paul Durrant, David Vrabel, xen-devel,
	Boris Ostrovsky, John Suykerbuyk, Manuel Bouyer, roger.pau

>>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
>> I think I might have an explanation why do we need this, see this mailing:
>> 
>> https://lkml.org/lkml/2014/3/20/710 
>> https://lkml.org/lkml/2014/3/21/111 
>> https://lkml.org/lkml/2014/3/21/390 
> 
> Quoting from the third of these:
> 
> | But consuming overrunning requests after rsp_prod_pvt is a problem:
> | - NAPI instance races with dealloc thread over the slots. The first 
> | reads them as requests, the second writes them as responses
> | - the NAPI instance overwrites used pending slots as well, so skb frag 
> | release go wrong etc.
> 
> OK, so the backend needs to be careful not to follow the frontend into
> overrun, not because of the ring itself being corrupted but because it
> will mess up the backend's internal bookkeeping.

With s/will/may/ I'm not sure that's a reason to withdraw the patch:
The generic macros in ring.h imo shouldn't dictate any particular
protection policy beyond protecting the ring itself. I.e. I'd think if
netback need protection beyond the one provided by ring.h macros,
it should take care to implement them itself.

Yet of course I can see that weakening the protection we have had
in place for so many years may result in very undesirable fallout.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-24  7:38             ` Jan Beulich
@ 2014-03-24  9:39               ` Paul Durrant
  2014-03-24  9:59                 ` Jan Beulich
  2014-03-24 12:23               ` Zoltan Kiss
  1 sibling, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-24  9:39 UTC (permalink / raw)
  To: Jan Beulich, Zoltan Kiss, Tim (Xen.org)
  Cc: Wei Liu, Ian Campbell, Keir (Xen.org),
	freebsd-xen, Alan Somers, Manuel Bouyer, Stefano Stabellini,
	David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk,
	Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 24 March 2014 07:39
> To: Zoltan Kiss; Tim (Xen.org)
> Cc: David Vrabel; Ian Campbell; Paul Durrant; Roger Pau Monne; Wei Liu;
> Stefano Stabellini; freebsd-xen@freebsd.org; xen-
> devel@lists.xenproject.org; Manuel Bouyer; Boris Ostrovsky; Konrad
> Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir (Xen.org)
> Subject: Re: [PATCH RFC] xen/public/ring.h: simplify
> RING_HAS_UNCONSUMED_REQUESTS()
> 
> >>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
> > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
> >> I think I might have an explanation why do we need this, see this mailing:
> >>
> >> https://lkml.org/lkml/2014/3/20/710
> >> https://lkml.org/lkml/2014/3/21/111
> >> https://lkml.org/lkml/2014/3/21/390
> >
> > Quoting from the third of these:
> >
> > | But consuming overrunning requests after rsp_prod_pvt is a problem:
> > | - NAPI instance races with dealloc thread over the slots. The first
> > | reads them as requests, the second writes them as responses
> > | - the NAPI instance overwrites used pending slots as well, so skb frag
> > | release go wrong etc.
> >
> > OK, so the backend needs to be careful not to follow the frontend into
> > overrun, not because of the ring itself being corrupted but because it
> > will mess up the backend's internal bookkeeping.
> 
> With s/will/may/ I'm not sure that's a reason to withdraw the patch:
> The generic macros in ring.h imo shouldn't dictate any particular
> protection policy beyond protecting the ring itself. I.e. I'd think if
> netback need protection beyond the one provided by ring.h macros,
> it should take care to implement them itself.
> 
> Yet of course I can see that weakening the protection we have had
> in place for so many years may result in very undesirable fallout.
> 

But these are, of course, macros and so the protection is baked into any old code. I'm still in favour of changing the macro in the canonical header and adding a comment to point out that older versions of the macro had the extra check.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-24  9:39               ` Paul Durrant
@ 2014-03-24  9:59                 ` Jan Beulich
  2014-03-24 11:03                   ` Paul Durrant
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2014-03-24  9:59 UTC (permalink / raw)
  To: Paul Durrant, Zoltan Kiss, Tim (Xen.org)
  Cc: Wei Liu, Ian Campbell, Keir(Xen.org),
	freebsd-xen, Alan Somers, ManuelBouyer, Stefano Stabellini,
	David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk,
	Roger Pau Monne

>>> On 24.03.14 at 10:39, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 24 March 2014 07:39
>> To: Zoltan Kiss; Tim (Xen.org)
>> Cc: David Vrabel; Ian Campbell; Paul Durrant; Roger Pau Monne; Wei Liu;
>> Stefano Stabellini; freebsd-xen@freebsd.org; xen-
>> devel@lists.xenproject.org; Manuel Bouyer; Boris Ostrovsky; Konrad
>> Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir (Xen.org)
>> Subject: Re: [PATCH RFC] xen/public/ring.h: simplify
>> RING_HAS_UNCONSUMED_REQUESTS()
>> 
>> >>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
>> > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
>> >> I think I might have an explanation why do we need this, see this mailing:
>> >>
>> >> https://lkml.org/lkml/2014/3/20/710 
>> >> https://lkml.org/lkml/2014/3/21/111 
>> >> https://lkml.org/lkml/2014/3/21/390 
>> >
>> > Quoting from the third of these:
>> >
>> > | But consuming overrunning requests after rsp_prod_pvt is a problem:
>> > | - NAPI instance races with dealloc thread over the slots. The first
>> > | reads them as requests, the second writes them as responses
>> > | - the NAPI instance overwrites used pending slots as well, so skb frag
>> > | release go wrong etc.
>> >
>> > OK, so the backend needs to be careful not to follow the frontend into
>> > overrun, not because of the ring itself being corrupted but because it
>> > will mess up the backend's internal bookkeeping.
>> 
>> With s/will/may/ I'm not sure that's a reason to withdraw the patch:
>> The generic macros in ring.h imo shouldn't dictate any particular
>> protection policy beyond protecting the ring itself. I.e. I'd think if
>> netback need protection beyond the one provided by ring.h macros,
>> it should take care to implement them itself.
>> 
>> Yet of course I can see that weakening the protection we have had
>> in place for so many years may result in very undesirable fallout.
>> 
> 
> But these are, of course, macros and so the protection is baked into any old 
> code.

Right, but the reduced protection won't be noticeable on a re-build,
and syncing up header files may be a purely mechanical (perhaps even
automated) operation.

> I'm still in favour of changing the macro in the canonical header and 
> adding a comment to point out that older versions of the macro had the extra 
> check.

I too am in favor of changing the canonical header; I only wanted to
point out that this isn't without risk of regressions.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-24  9:59                 ` Jan Beulich
@ 2014-03-24 11:03                   ` Paul Durrant
  0 siblings, 0 replies; 53+ messages in thread
From: Paul Durrant @ 2014-03-24 11:03 UTC (permalink / raw)
  To: Jan Beulich, Zoltan Kiss, Tim (Xen.org)
  Cc: Wei Liu, Ian Campbell, Keir (Xen.org),
	freebsd-xen, Alan Somers, ManuelBouyer, Stefano Stabellini,
	David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk,
	Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 24 March 2014 10:00
> To: Paul Durrant; Zoltan Kiss; Tim (Xen.org)
> Cc: David Vrabel; Ian Campbell; Roger Pau Monne; Stefano Stabellini; Wei Liu;
> freebsd-xen@freebsd.org; xen-devel@lists.xenproject.org; ManuelBouyer;
> Boris Ostrovsky; Konrad Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir
> (Xen.org)
> Subject: RE: [PATCH RFC] xen/public/ring.h: simplify
> RING_HAS_UNCONSUMED_REQUESTS()
> 
> >>> On 24.03.14 at 10:39, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 24 March 2014 07:39
> >> To: Zoltan Kiss; Tim (Xen.org)
> >> Cc: David Vrabel; Ian Campbell; Paul Durrant; Roger Pau Monne; Wei Liu;
> >> Stefano Stabellini; freebsd-xen@freebsd.org; xen-
> >> devel@lists.xenproject.org; Manuel Bouyer; Boris Ostrovsky; Konrad
> >> Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir (Xen.org)
> >> Subject: Re: [PATCH RFC] xen/public/ring.h: simplify
> >> RING_HAS_UNCONSUMED_REQUESTS()
> >>
> >> >>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
> >> > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
> >> >> I think I might have an explanation why do we need this, see this
> mailing:
> >> >>
> >> >> https://lkml.org/lkml/2014/3/20/710
> >> >> https://lkml.org/lkml/2014/3/21/111
> >> >> https://lkml.org/lkml/2014/3/21/390
> >> >
> >> > Quoting from the third of these:
> >> >
> >> > | But consuming overrunning requests after rsp_prod_pvt is a problem:
> >> > | - NAPI instance races with dealloc thread over the slots. The first
> >> > | reads them as requests, the second writes them as responses
> >> > | - the NAPI instance overwrites used pending slots as well, so skb frag
> >> > | release go wrong etc.
> >> >
> >> > OK, so the backend needs to be careful not to follow the frontend into
> >> > overrun, not because of the ring itself being corrupted but because it
> >> > will mess up the backend's internal bookkeeping.
> >>
> >> With s/will/may/ I'm not sure that's a reason to withdraw the patch:
> >> The generic macros in ring.h imo shouldn't dictate any particular
> >> protection policy beyond protecting the ring itself. I.e. I'd think if
> >> netback need protection beyond the one provided by ring.h macros,
> >> it should take care to implement them itself.
> >>
> >> Yet of course I can see that weakening the protection we have had
> >> in place for so many years may result in very undesirable fallout.
> >>
> >
> > But these are, of course, macros and so the protection is baked into any old
> > code.
> 
> Right, but the reduced protection won't be noticeable on a re-build,
> and syncing up header files may be a purely mechanical (perhaps even
> automated) operation.
> 

True. I have an auto-sync script for the Windows PV drivers but every time I update the tag to pull from I do check the diffs in the headers (not least because I have to post-process them to change any use of 'long' or 'unsigned long' to something that's actually 64-bits wide in 64-bit Windows).

> > I'm still in favour of changing the macro in the canonical header and
> > adding a comment to point out that older versions of the macro had the
> extra
> > check.
> 
> I too am in favor of changing the canonical header; I only wanted to
> point out that this isn't without risk of regressions.
> 

Yes. Fair enough.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-24  7:38             ` Jan Beulich
  2014-03-24  9:39               ` Paul Durrant
@ 2014-03-24 12:23               ` Zoltan Kiss
  2014-03-24 13:52                 ` Paul Durrant
  1 sibling, 1 reply; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-24 12:23 UTC (permalink / raw)
  To: Jan Beulich, Zoltan Kiss, Tim Deegan
  Cc: Keir Fraser, Ian Campbell, Stefano Stabellini, freebsd-xen,
	Alan Somers, Paul Durrant, David Vrabel, xen-devel, Wei Liu,
	Boris Ostrovsky, roger.pau, Manuel Bouyer, John Suykerbuyk

On 24/03/14 07:38, Jan Beulich wrote:
>>>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
>> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
>>> I think I might have an explanation why do we need this, see this mailing:
>>>
>>> https://lkml.org/lkml/2014/3/20/710
>>> https://lkml.org/lkml/2014/3/21/111
>>> https://lkml.org/lkml/2014/3/21/390
>>
>> Quoting from the third of these:
>>
>> | But consuming overrunning requests after rsp_prod_pvt is a problem:
>> | - NAPI instance races with dealloc thread over the slots. The first
>> | reads them as requests, the second writes them as responses
>> | - the NAPI instance overwrites used pending slots as well, so skb frag
>> | release go wrong etc.
>>
>> OK, so the backend needs to be careful not to follow the frontend into
>> overrun, not because of the ring itself being corrupted but because it
>> will mess up the backend's internal bookkeeping.
>
> With s/will/may/ I'm not sure that's a reason to withdraw the patch:
> The generic macros in ring.h imo shouldn't dictate any particular
> protection policy beyond protecting the ring itself. I.e. I'd think if
> netback need protection beyond the one provided by ring.h macros,
> it should take care to implement them itself.

It's not "may", it is a "will". In case of Linux netback for sure, but I 
think it's reasonable for any backend to rely on the fact that the ring 
macros protect them from abusive frontends. Protecting a backend to read 
requests from the [rsp_prod_pvt, req_cons] range is a sensible thing to 
do in the ring macros.
Also, RING_FINAL_CHECK_FOR_REQUESTS relies on this macro, removing this 
protection may cause other issues, e.g. netback keeps the NAPI instance 
spinning while it's not consuming any requests.

Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-24 12:23               ` Zoltan Kiss
@ 2014-03-24 13:52                 ` Paul Durrant
  2014-03-24 23:55                   ` Zoltan Kiss
  0 siblings, 1 reply; 53+ messages in thread
From: Paul Durrant @ 2014-03-24 13:52 UTC (permalink / raw)
  To: Zoltan Kiss, Jan Beulich, Zoltan Kiss, Tim (Xen.org)
  Cc: Keir (Xen.org),
	Ian Campbell, freebsd-xen, Alan Somers, Manuel Bouyer,
	Stefano Stabellini, David Vrabel, xen-devel, Wei Liu,
	Boris Ostrovsky, Roger Pau Monne, John Suykerbuyk

> -----Original Message-----
> From: Zoltan Kiss
> Sent: 24 March 2014 12:24
> To: Jan Beulich; Zoltan Kiss; Tim (Xen.org)
> Cc: Wei Liu; Ian Campbell; Stefano Stabellini; Keir (Xen.org); freebsd-
> xen@freebsd.org; Alan Somers; Paul Durrant; David Vrabel; xen-
> devel@lists.xenproject.org; Boris Ostrovsky; John Suykerbuyk; Manuel
> Bouyer; Roger Pau Monne
> Subject: Re: [Xen-devel] [PATCH RFC] xen/public/ring.h: simplify
> RING_HAS_UNCONSUMED_REQUESTS()
> 
> On 24/03/14 07:38, Jan Beulich wrote:
> >>>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
> >> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
> >>> I think I might have an explanation why do we need this, see this mailing:
> >>>
> >>> https://lkml.org/lkml/2014/3/20/710
> >>> https://lkml.org/lkml/2014/3/21/111
> >>> https://lkml.org/lkml/2014/3/21/390
> >>
> >> Quoting from the third of these:
> >>
> >> | But consuming overrunning requests after rsp_prod_pvt is a problem:
> >> | - NAPI instance races with dealloc thread over the slots. The first
> >> | reads them as requests, the second writes them as responses
> >> | - the NAPI instance overwrites used pending slots as well, so skb frag
> >> | release go wrong etc.
> >>
> >> OK, so the backend needs to be careful not to follow the frontend into
> >> overrun, not because of the ring itself being corrupted but because it
> >> will mess up the backend's internal bookkeeping.
> >
> > With s/will/may/ I'm not sure that's a reason to withdraw the patch:
> > The generic macros in ring.h imo shouldn't dictate any particular
> > protection policy beyond protecting the ring itself. I.e. I'd think if
> > netback need protection beyond the one provided by ring.h macros,
> > it should take care to implement them itself.
> 
> It's not "may", it is a "will". In case of Linux netback for sure, but I
> think it's reasonable for any backend to rely on the fact that the ring
> macros protect them from abusive frontends. Protecting a backend to read
> requests from the [rsp_prod_pvt, req_cons] range is a sensible thing to
> do in the ring macros.

I disagree. That's not what the name of the macro implies; it implies a range for req_prod - req_cons. The backend is responsible for its own rsp_prod_pvt value and should make sure it's safe. If, to that end, it wants to have its own variant of the macro then that's reasonable, but adding such a clause to the macro in the generic header is (as has been proven) confusing.

  Paul

> Also, RING_FINAL_CHECK_FOR_REQUESTS relies on this macro, removing
> this
> protection may cause other issues, e.g. netback keeps the NAPI instance
> spinning while it's not consuming any requests.
> 
> Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-24 13:52                 ` Paul Durrant
@ 2014-03-24 23:55                   ` Zoltan Kiss
  0 siblings, 0 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-03-24 23:55 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich, Zoltan Kiss, Tim (Xen.org)
  Cc: Keir (Xen.org),
	Ian Campbell, freebsd-xen, Alan Somers, Manuel Bouyer,
	Stefano Stabellini, David Vrabel, xen-devel, Wei Liu,
	Boris Ostrovsky, Roger Pau Monne, John Suykerbuyk

On 24/03/14 13:52, Paul Durrant wrote:
>> -----Original Message-----
>> From: Zoltan Kiss
>> Sent: 24 March 2014 12:24
>> To: Jan Beulich; Zoltan Kiss; Tim (Xen.org)
>> Cc: Wei Liu; Ian Campbell; Stefano Stabellini; Keir (Xen.org); freebsd-
>> xen@freebsd.org; Alan Somers; Paul Durrant; David Vrabel; xen-
>> devel@lists.xenproject.org; Boris Ostrovsky; John Suykerbuyk; Manuel
>> Bouyer; Roger Pau Monne
>> Subject: Re: [Xen-devel] [PATCH RFC] xen/public/ring.h: simplify
>> RING_HAS_UNCONSUMED_REQUESTS()
>>
>> On 24/03/14 07:38, Jan Beulich wrote:
>>>>>> On 22.03.14 at 18:14, <tim@xen.org> wrote:
>>>> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote:
>>>>> I think I might have an explanation why do we need this, see this mailing:
>>>>>
>>>>> https://lkml.org/lkml/2014/3/20/710
>>>>> https://lkml.org/lkml/2014/3/21/111
>>>>> https://lkml.org/lkml/2014/3/21/390
>>>> Quoting from the third of these:
>>>>
>>>> | But consuming overrunning requests after rsp_prod_pvt is a problem:
>>>> | - NAPI instance races with dealloc thread over the slots. The first
>>>> | reads them as requests, the second writes them as responses
>>>> | - the NAPI instance overwrites used pending slots as well, so skb frag
>>>> | release go wrong etc.
>>>>
>>>> OK, so the backend needs to be careful not to follow the frontend into
>>>> overrun, not because of the ring itself being corrupted but because it
>>>> will mess up the backend's internal bookkeeping.
>>> With s/will/may/ I'm not sure that's a reason to withdraw the patch:
>>> The generic macros in ring.h imo shouldn't dictate any particular
>>> protection policy beyond protecting the ring itself. I.e. I'd think if
>>> netback need protection beyond the one provided by ring.h macros,
>>> it should take care to implement them itself.
>> It's not "may", it is a "will". In case of Linux netback for sure, but I
>> think it's reasonable for any backend to rely on the fact that the ring
>> macros protect them from abusive frontends. Protecting a backend to read
>> requests from the [rsp_prod_pvt, req_cons] range is a sensible thing to
>> do in the ring macros.
> I disagree. That's not what the name of the macro implies; it implies a range for req_prod - req_cons. The backend is responsible for its own rsp_prod_pvt value and should make sure it's safe. If, to that end, it wants to have its own variant of the macro then that's reasonable, but adding such a clause to the macro in the generic header is (as has been proven) confusing.
My opinion is that the name of the macro implies the caller wants to 
know if there are unconsumed requests on the ring (and it doesn't imply 
it, but the return value is actually how many unconsumed requests you 
have). Returning req_prod - req_cons seems to be logical at first (and 
second and ...) thought, but - and that's not implied in the name 
indeed, but I think it's sensible, and should be documented in a comment 
- if you want to protect the backend from abusive frontends, you should 
check if req_prod is not overrunning rsp_prod_pvt. You could check 
rsp_prod, but rsp_prod_pvt is better, because the latter should be 
always equal to or ahead of the former, and from the backend point of 
view the responses between the two (if any) doesn't matter.
It's not about the backend making sure it's rsp_prod_pvt is valid or 
not, it's about protection from a frontend. And I think it should be a 
general thing for the ring users to have this safety check, rather than 
something backend-specific, as all of them could be affected.

Zoli
>
>    Paul
>
>> Also, RING_FINAL_CHECK_FOR_REQUESTS relies on this macro, removing
>> this
>> protection may cause other issues, e.g. netback keeps the NAPI instance
>> spinning while it's not consuming any requests.
>>
>> Zoli

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-03-13 16:38       ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan
  2014-03-22 14:18         ` Zoltan Kiss
@ 2014-04-03  9:38         ` Tim Deegan
  2014-04-03 15:34           ` Zoltan Kiss
  1 sibling, 1 reply; 53+ messages in thread
From: Tim Deegan @ 2014-04-03  9:38 UTC (permalink / raw)
  To: keir, xen-devel

[CC's clipped]

At 17:38 +0100 on 13 Mar (1394728686), Tim Deegan wrote:
> [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
> 
> Remove a useless (though harmless) extra check.

What shall we do about this?  AFAICT, Jan and Paul are in favour of the
change; Zoltan is against it.  I am wavering.  Keir, any opinion?

Tim.

> Code inspection
> ---------------
> 
> Ian Campbell and I looked at this today and can't find any case where
> the existing 'rsp' test would be useful.  In today's ring.h,
> 
> #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
>     unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
>     unsigned int rsp = RING_SIZE(_r) -                                  \
>         ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
>     req < rsp ? req : rsp;                                              \
> })
> 
> 'req' is the number of requests that the F/E has published and we have
> not yet consumed.  'rsp' is an odd fish, everything _except_ what we
> might call requests locally in flight, that is requests we've consumed
> but not produced a response for.  We could only think of two things we
> might be trying to test for here: 
> 
> (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the 
>     way these rings normally work.  In that case, as Zoltan pointed
>     out, rsp must be >= req, for a well-behaved frontend: otherwise
>     we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that
>     req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun
>     the ring.  I don't think this even usefully protects against a
>     malicious/buggy frontend.
> 
> (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm
>     told might happen in linux netback?  In that case, we might plausibly
>     want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of 
>     (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's
>     not what this does.  Instead, rsp will underflow to 
>     RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. 
> 
> So in both of those cases, 'rsp' is always >= 'req' and is useless.
> 
> 
> Code archaeology
> ----------------
> 
> In the original ring.h, the test was a boolean, as the name implies.
> Cset 99812f40 ([NET] back: Add SG support) extended it to a count in
> the obvious manner.  Looking at the original (0b788798):
> 
> #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r)                            \
>    ( ((_r)->req_cons != (_r)->sring->req_prod ) &&                      \
>      (((_r)->req_cons - (_r)->rsp_prod_pvt) !=                          \
>       SRING_SIZE((_p), (_r)->sring)) )
> 
> it seems to be testing for 'there are unconsumed requests _and_ we
> have not got a full ring of consumed-but-not-responded requests'.
> This also seems to be effectively harmless: if there are unconsumed
> requests, we can't possibly have a ring full of c-b-n-r requests
> unless the frontend's request producer has overrun its response
> consumer.
> 
> That code was introduced with no callers, but seems to have been
> copied from the existing netback code, which switched to use it in
> b242b208.  All later users of it in the xenolinux trees are either
> brand new code or replacing a simple (req_cons - req_prod) test.  The
> netback code goes back to Keir's original implementation, where it
> looked like this, in inverted form (1de448f4):
> 
>         /* Work to do? */
>         i = netif->tx_req_cons;
>         if ( (i == netif->tx->req_prod) ||
>              ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) )
>         {
>             netif_put(netif);
>             continue;
>         }
> 
> Again, I don't think this test is useful: it's only interesting if
> netfront has overrun the ring, but it doesn't reliably detect that.
> 
> So let's remove it. 
> 
> Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Signed-off-by: Tim Deegan <tim@xen.org>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Ian Campbell <Ian.Campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Alan Somers <alans@spectralogic.com>
> Cc: John Suykerbuyk <johns@spectralogic.com>
> Cc: Manuel Bouyer <bouyer@NetBSD.org>
> Cc: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> ---
> RFC because
> - I might well be missing something here; and
> - this is a change to a public header that could be in use in any
>   number of implementations; since the extra test is very cheap, and
>   seems to be harmess, we might consider leaving it in place.
> - I haven't tested it at all yet.
> 
> CC'ing a whole bunch of people whose code might be using this macro.
> 
> diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h
> index 73e13d7..d6e23f1 100644
> --- a/xen/include/public/io/ring.h
> +++ b/xen/include/public/io/ring.h
> @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t
>  #define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
>      ((_r)->sring->rsp_prod - (_r)->rsp_cons)
>  
> -#ifdef __GNUC__
> -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
> -    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
> -    unsigned int rsp = RING_SIZE(_r) -                                  \
> -        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
> -    req < rsp ? req : rsp;                                              \
> -})
> -#else
> -/* Same as above, but without the nice GCC ({ ... }) syntax. */
>  #define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
> -    ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
> -      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
> -     ((_r)->sring->req_prod - (_r)->req_cons) :                         \
> -     (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
> -#endif
> +    ((_r)->sring->req_prod - (_r)->req_cons)
>  
>  /* Direct access to individual ring elements, by index. */
>  #define RING_GET_REQUEST(_r, _idx)                                      \
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
  2014-04-03  9:38         ` Tim Deegan
@ 2014-04-03 15:34           ` Zoltan Kiss
  0 siblings, 0 replies; 53+ messages in thread
From: Zoltan Kiss @ 2014-04-03 15:34 UTC (permalink / raw)
  To: Tim Deegan, keir, xen-devel

Paul mentioned the lkml links doesn't work at the moment, but a short 
summary of why rsp is important: it gives us the maximum possible amount 
of requests should be on the ring (number of slots which are not 
consumed-but-not-responded), so even if the frontend lies to us through 
increasing req_prod to crazy values, backend can avoid consuming slots 
which are still consumed-but-not-responded.
Here is my explanation from that another thread:

"The backend doesn't see what the guest does with the responses, and
that's OK, it's the guest's problem, after netback increased
rsp_prod_pvt it doesn't really care. But as soon as the guest start
placing new requests after rsp_prod_pvt, or just increasing req_prod so
req_prod-rsp_prod_pvt > XEN_NETIF_TX_RING_SIZE, it becomes an issue.
So far this xenvif_tx_pending_slots_available indirectly saved us from
consuming requests overwriting still pending requests, but the guest
could still overwrote our responses. But again, that's still the guests
problem, we had the original request saved in the pending ring data. If
the guest went too far, build_gops killed the vif when req_prod-req_cons 
 > XEN_NETIF_TX_RING_SIZE
Indirect above means it only happened because the pending ring had the
same size, and the used slots has a 1-to-1 mapping for slots between
rsp_prod_pvt and req_cons. So xenvif_tx_pending_slots_available also
means (req_cons - rsp_prod_pvt) + XEN_NETBK_LEGACY_SLOTS_MAX <
XEN_NETIF_TX_RING_SIZE (does this look familiar?
But consuming overrunning requests after rsp_prod_pvt is a problem:
- NAPI instance races with dealloc thread over the slots. The first
reads them as requests, the second writes them as responses
- the NAPI instance overwrites used pending slots as well, so skb frag
release go wrong etc.
But the current RING_HAS_UNCONSUMED_REQUESTS fortunately saves us here,
let me explain it through an example:
rsp_prod_pvt = 0
req_cons = 253
req_prod = 258

Therefore:
req = 5
rsp = 3

So in xenvif_tx_build_gops work_to_do will be 3, and in
xenvif_count_requests we bail out when we see that the packet actually
exceeds that.
So in the end, we are safe here, but we shouldn't change that macro I
suggested to refactor"

On 03/04/14 10:38, Tim Deegan wrote:
> [CC's clipped]
>
> At 17:38 +0100 on 13 Mar (1394728686), Tim Deegan wrote:
>> [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
>>
>> Remove a useless (though harmless) extra check.
>
> What shall we do about this?  AFAICT, Jan and Paul are in favour of the
> change; Zoltan is against it.  I am wavering.  Keir, any opinion?
>
> Tim.
>
>> Code inspection
>> ---------------
>>
>> Ian Campbell and I looked at this today and can't find any case where
>> the existing 'rsp' test would be useful.  In today's ring.h,
>>
>> #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
>>      unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
>>      unsigned int rsp = RING_SIZE(_r) -                                  \
>>          ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
>>      req < rsp ? req : rsp;                                              \
>> })
>>
>> 'req' is the number of requests that the F/E has published and we have
>> not yet consumed.  'rsp' is an odd fish, everything _except_ what we
>> might call requests locally in flight, that is requests we've consumed
>> but not produced a response for.  We could only think of two things we
>> might be trying to test for here:
>>
>> (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the
>>      way these rings normally work.  In that case, as Zoltan pointed
>>      out, rsp must be >= req, for a well-behaved frontend: otherwise
>>      we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that
>>      req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun
>>      the ring.  I don't think this even usefully protects against a
>>      malicious/buggy frontend.
>>
>> (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm
>>      told might happen in linux netback?  In that case, we might plausibly
>>      want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of
>>      (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's
>>      not what this does.  Instead, rsp will underflow to
>>      RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req.
>>
>> So in both of those cases, 'rsp' is always >= 'req' and is useless.
>>
>>
>> Code archaeology
>> ----------------
>>
>> In the original ring.h, the test was a boolean, as the name implies.
>> Cset 99812f40 ([NET] back: Add SG support) extended it to a count in
>> the obvious manner.  Looking at the original (0b788798):
>>
>> #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r)                            \
>>     ( ((_r)->req_cons != (_r)->sring->req_prod ) &&                      \
>>       (((_r)->req_cons - (_r)->rsp_prod_pvt) !=                          \
>>        SRING_SIZE((_p), (_r)->sring)) )
>>
>> it seems to be testing for 'there are unconsumed requests _and_ we
>> have not got a full ring of consumed-but-not-responded requests'.
>> This also seems to be effectively harmless: if there are unconsumed
>> requests, we can't possibly have a ring full of c-b-n-r requests
>> unless the frontend's request producer has overrun its response
>> consumer.
>>
>> That code was introduced with no callers, but seems to have been
>> copied from the existing netback code, which switched to use it in
>> b242b208.  All later users of it in the xenolinux trees are either
>> brand new code or replacing a simple (req_cons - req_prod) test.  The
>> netback code goes back to Keir's original implementation, where it
>> looked like this, in inverted form (1de448f4):
>>
>>          /* Work to do? */
>>          i = netif->tx_req_cons;
>>          if ( (i == netif->tx->req_prod) ||
>>               ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) )
>>          {
>>              netif_put(netif);
>>              continue;
>>          }
>>
>> Again, I don't think this test is useful: it's only interesting if
>> netfront has overrun the ring, but it doesn't reliably detect that.
>>
>> So let's remove it.
>>
>> Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>> Signed-off-by: Tim Deegan <tim@xen.org>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Jan Beulich <JBeulich@suse.com>
>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: David Vrabel <david.vrabel@citrix.com>
>> Cc: Alan Somers <alans@spectralogic.com>
>> Cc: John Suykerbuyk <johns@spectralogic.com>
>> Cc: Manuel Bouyer <bouyer@NetBSD.org>
>> Cc: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>
>> ---
>> RFC because
>> - I might well be missing something here; and
>> - this is a change to a public header that could be in use in any
>>    number of implementations; since the extra test is very cheap, and
>>    seems to be harmess, we might consider leaving it in place.
>> - I haven't tested it at all yet.
>>
>> CC'ing a whole bunch of people whose code might be using this macro.
>>
>> diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h
>> index 73e13d7..d6e23f1 100644
>> --- a/xen/include/public/io/ring.h
>> +++ b/xen/include/public/io/ring.h
>> @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t
>>   #define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
>>       ((_r)->sring->rsp_prod - (_r)->rsp_cons)
>>
>> -#ifdef __GNUC__
>> -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
>> -    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
>> -    unsigned int rsp = RING_SIZE(_r) -                                  \
>> -        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
>> -    req < rsp ? req : rsp;                                              \
>> -})
>> -#else
>> -/* Same as above, but without the nice GCC ({ ... }) syntax. */
>>   #define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
>> -    ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
>> -      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
>> -     ((_r)->sring->req_prod - (_r)->req_cons) :                         \
>> -     (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
>> -#endif
>> +    ((_r)->sring->req_prod - (_r)->req_cons)
>>
>>   /* Direct access to individual ring elements, by index. */
>>   #define RING_GET_REQUEST(_r, _idx)                                      \
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2014-04-03 15:35 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss
2014-03-06 15:53 ` Ian Campbell
2014-03-06 16:31   ` Zoltan Kiss
2014-03-06 17:30     ` Tim Deegan
2014-03-06 21:39       ` Zoltan Kiss
2014-03-07  9:23         ` Paul Durrant
2014-03-07 17:43           ` Zoltan Kiss
2014-03-07 12:02         ` Wei Liu
2014-03-07 18:58           ` Zoltan Kiss
2014-03-11 15:55         ` Ian Campbell
2014-03-11 23:34           ` Zoltan Kiss
2014-03-13 16:38       ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan
2014-03-22 14:18         ` Zoltan Kiss
2014-03-22 17:14           ` Tim Deegan
2014-03-24  7:38             ` Jan Beulich
2014-03-24  9:39               ` Paul Durrant
2014-03-24  9:59                 ` Jan Beulich
2014-03-24 11:03                   ` Paul Durrant
2014-03-24 12:23               ` Zoltan Kiss
2014-03-24 13:52                 ` Paul Durrant
2014-03-24 23:55                   ` Zoltan Kiss
2014-04-03  9:38         ` Tim Deegan
2014-04-03 15:34           ` Zoltan Kiss
2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell
2014-03-11 23:24   ` Zoltan Kiss
2014-03-12 10:28     ` Ian Campbell
2014-03-12 10:48       ` Roger Pau Monné
2014-03-12 11:25       ` Paul Durrant
2014-03-12 11:38       ` Paul Durrant
2014-03-12 14:41         ` Zoltan Kiss
2014-03-12 15:23           ` Paul Durrant
2014-03-12 15:42             ` Wei Liu
2014-03-12 15:56               ` Paul Durrant
2014-03-12 16:02               ` Paul Durrant
2014-03-12 16:13               ` Zoltan Kiss
2014-03-12 16:42                 ` Paul Durrant
2014-03-12 19:06                   ` Zoltan Kiss
2014-03-13  9:26                     ` Paul Durrant
2014-03-13 10:02                       ` Ian Campbell
2014-03-13 10:58                         ` Paul Durrant
2014-03-13 12:19                           ` Ian Campbell
2014-03-13 12:28                             ` Zoltan Kiss
2014-03-13 12:29                               ` Paul Durrant
2014-03-13 12:44                               ` Ian Campbell
2014-03-12 14:25       ` Zoltan Kiss
2014-03-12 14:27       ` Zoltan Kiss
2014-03-12 14:30         ` Ian Campbell
2014-03-12 15:14           ` Zoltan Kiss
2014-03-12 15:37             ` Ian Campbell
2014-03-12 17:14               ` Zoltan Kiss
2014-03-12 17:43                 ` Ian Campbell
2014-03-12 21:10                   ` Zoltan Kiss
2014-03-13 10:04                     ` Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.