* RING_HAS_UNCONSUMED_REQUESTS oddness @ 2014-03-06 15:47 Zoltan Kiss 2014-03-06 15:53 ` Ian Campbell 2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell 0 siblings, 2 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-06 15:47 UTC (permalink / raw) To: xen-devel, Wei Liu, Ian Campbell Hi, I was wondering for a while why this macro looks like this: #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ ({ \ unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ unsigned int rsp = RING_SIZE(_r) - \ ((_r)->req_cons - (_r)->rsp_prod_pvt); \ req < rsp ? req : rsp; \ }) I would expect to check prod - cons, like RING_HAS_UNCONSUMED_RESPONSES does: #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ ((_r)->sring->rsp_prod - (_r)->rsp_cons) By my understanding, there is no way rsp could be smaller than req, so there is no point having this. Am I missing something? Regards, Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss @ 2014-03-06 15:53 ` Ian Campbell 2014-03-06 16:31 ` Zoltan Kiss 2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell 1 sibling, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-06 15:53 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: > By my understanding, there is no way rsp could be smaller than req, so > there is no point having this. Am I missing something? It happens during wraparound, i.e. after req has wrapped but rsp hasn't yet. Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 15:53 ` Ian Campbell @ 2014-03-06 16:31 ` Zoltan Kiss 2014-03-06 17:30 ` Tim Deegan 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-06 16:31 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu On 06/03/14 15:53, Ian Campbell wrote: > On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: >> By my understanding, there is no way rsp could be smaller than req, so >> there is no point having this. Am I missing something? > > It happens during wraparound, i.e. after req has wrapped but rsp hasn't > yet. The name of the macro suggest we are interested whether the ring has unconsumed requests, and netback uses it that way. The answer to that question is req_prod - req_cons. And it works if prod wrapped but cons didn't. rsp calculates the number of "consumed but not responded" requests (it also works well if req_cons wrapped but rsp_prod_pvt didn't), then subtract it from the ring size. So it gives the number of unconsumed responses + unconsumed requests + unused slots. Why do we care about it? And as it includes the number of unconsumed requests, it couldn't be smaller than that, could it? Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 16:31 ` Zoltan Kiss @ 2014-03-06 17:30 ` Tim Deegan 2014-03-06 21:39 ` Zoltan Kiss 2014-03-13 16:38 ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan 0 siblings, 2 replies; 53+ messages in thread From: Tim Deegan @ 2014-03-06 17:30 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Ian Campbell At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: > On 06/03/14 15:53, Ian Campbell wrote: > > On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: > >> By my understanding, there is no way rsp could be smaller than req, so > >> there is no point having this. Am I missing something? > > > > It happens during wraparound, i.e. after req has wrapped but rsp hasn't > > yet. > > The name of the macro suggest we are interested whether the ring has > unconsumed requests, and netback uses it that way. The answer to that > question is req_prod - req_cons. And it works if prod wrapped but cons > didn't. Yes. > rsp calculates the number of "consumed but not responded" requests (it > also works well if req_cons wrapped but rsp_prod_pvt didn't), then > subtract it from the ring size. That is indeed an odd thing to check, since it seems like it could only be relevant if the request producer overran the response producer. It's been there in one form or another since the original ring.h, and RING_REQUEST_CONS_OVERFLOW does something similar. I can't remember the original reasoning, and so I'm reluctant to suggest removing it without some more eyes on the code... Tim. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 17:30 ` Tim Deegan @ 2014-03-06 21:39 ` Zoltan Kiss 2014-03-07 9:23 ` Paul Durrant ` (2 more replies) 2014-03-13 16:38 ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan 1 sibling, 3 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-06 21:39 UTC (permalink / raw) To: Tim Deegan; +Cc: xen-devel, Wei Liu, Ian Campbell On 06/03/14 17:30, Tim Deegan wrote: > At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: >> On 06/03/14 15:53, Ian Campbell wrote: >>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: >>>> By my understanding, there is no way rsp could be smaller than req, so >>>> there is no point having this. Am I missing something? >>> >>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't >>> yet. >> >> The name of the macro suggest we are interested whether the ring has >> unconsumed requests, and netback uses it that way. The answer to that >> question is req_prod - req_cons. And it works if prod wrapped but cons >> didn't. > > Yes. > >> rsp calculates the number of "consumed but not responded" requests (it >> also works well if req_cons wrapped but rsp_prod_pvt didn't), then >> subtract it from the ring size. > > That is indeed an odd thing to check, since it seems like it could only > be relevant if the request producer overran the response producer. > It's been there in one form or another since the original ring.h, > and RING_REQUEST_CONS_OVERFLOW does something similar. > > I can't remember the original reasoning, and so I'm reluctant to > suggest removing it without some more eyes on the code... I've added the following printk before the "req < rsp" part: if (rsp < req) \ pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", req, rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \ And it gave me such results: xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 So it can happen that req_prod is behind req_cons, sometimes even with 17! But it always happen in this callback of my new grant mapping series, which runs outside the NAPI instance. My theory why this can happen: - callback reads req_prod - frontend writes it - backend picks it up, and consumes those slots - callback reads req_cons So req can be near UINT_MAX if you call this macro outside the backend. The only place where the actual return value of this macro matters is xenvif_tx_build_gops, and it should be correct there. At other places we are only looking for the fact whether the ring has unconsumed requests or not. If prod is smaller than cons, we clearly read a wrong value. I think what we can do: 1. try again until its correct 2. just return a non-zero value, it shouldn't cause too much trouble if we say yes here 3. we can't see rsp_cons, so try to figure out if the ring is full of consumed but not responded requests, and return zero then, otherwise a positive value. That's what we do know. Does this make sense? Should we rather go option 1? Should I post a comment patch to document this, and spare a few hours for future generations? :) Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 21:39 ` Zoltan Kiss @ 2014-03-07 9:23 ` Paul Durrant 2014-03-07 17:43 ` Zoltan Kiss 2014-03-07 12:02 ` Wei Liu 2014-03-11 15:55 ` Ian Campbell 2 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-07 9:23 UTC (permalink / raw) To: Zoltan Kiss, Tim (Xen.org); +Cc: xen-devel, Wei Liu, Ian Campbell > -----Original Message----- > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- > bounces@lists.xen.org] On Behalf Of Zoltan Kiss > Sent: 06 March 2014 21:40 > To: Tim (Xen.org) > Cc: xen-devel@lists.xenproject.org; Wei Liu; Ian Campbell > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On 06/03/14 17:30, Tim Deegan wrote: > > At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: > >> On 06/03/14 15:53, Ian Campbell wrote: > >>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: > >>>> By my understanding, there is no way rsp could be smaller than req, so > >>>> there is no point having this. Am I missing something? > >>> > >>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't > >>> yet. > >> > >> The name of the macro suggest we are interested whether the ring has > >> unconsumed requests, and netback uses it that way. The answer to that > >> question is req_prod - req_cons. And it works if prod wrapped but cons > >> didn't. > > > > Yes. > > > >> rsp calculates the number of "consumed but not responded" requests (it > >> also works well if req_cons wrapped but rsp_prod_pvt didn't), then > >> subtract it from the ring size. > > > > That is indeed an odd thing to check, since it seems like it could only > > be relevant if the request producer overran the response producer. > > It's been there in one form or another since the original ring.h, > > and RING_REQUEST_CONS_OVERFLOW does something similar. > > > > I can't remember the original reasoning, and so I'm reluctant to > > suggest removing it without some more eyes on the code... > > I've added the following printk before the "req < rsp" part: > > if (rsp < req) \ > pr_err("req %u rsp %u req_prod %u req_cons %u > rsp_prod_pvt %u\n", req, > rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \ > > And it gave me such results: > > xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod > 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 > > So it can happen that req_prod is behind req_cons, sometimes even with > 17! But it always happen in this callback of my new grant mapping > series, which runs outside the NAPI instance. My theory why this can > happen: > - callback reads req_prod > - frontend writes it > - backend picks it up, and consumes those slots > - callback reads req_cons > > So req can be near UINT_MAX if you call this macro outside the backend. > The only place where the actual return value of this macro matters is > xenvif_tx_build_gops, and it should be correct there. At other places we > are only looking for the fact whether the ring has unconsumed requests > or not. If prod is smaller than cons, we clearly read a wrong value. I > think what we can do: > 1. try again until its correct > 2. just return a non-zero value, it shouldn't cause too much trouble if > we say yes here > 3. we can't see rsp_cons, so try to figure out if the ring is full of > consumed but not responded requests, and return zero then, otherwise a > positive value. That's what we do know. > > Does this make sense? Should we rather go option 1? Should I post a > comment patch to document this, and spare a few hours for future > generations? :) > The number of responses on the ring is clearly immaterial if you're only looking for unconsumed requests. So I think it should something analogous to RING_HAS_UNCONSUMED_RESPONSES: #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ ((_r)->sring->req_prod - (_r)->req_cons) Any racing between these two values is the responsibility of the individual frontend/backend and not something that this macro needs to care about. Paul ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-07 9:23 ` Paul Durrant @ 2014-03-07 17:43 ` Zoltan Kiss 0 siblings, 0 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-07 17:43 UTC (permalink / raw) To: Paul Durrant, Tim (Xen.org); +Cc: xen-devel, Wei Liu, Ian Campbell On 07/03/14 09:23, Paul Durrant wrote: >> -----Original Message----- >> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- >> bounces@lists.xen.org] On Behalf Of Zoltan Kiss >> Sent: 06 March 2014 21:40 >> To: Tim (Xen.org) >> Cc: xen-devel@lists.xenproject.org; Wei Liu; Ian Campbell >> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness >> >> On 06/03/14 17:30, Tim Deegan wrote: >>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: >>>> On 06/03/14 15:53, Ian Campbell wrote: >>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: >>>>>> By my understanding, there is no way rsp could be smaller than req, so >>>>>> there is no point having this. Am I missing something? >>>>> >>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't >>>>> yet. >>>> >>>> The name of the macro suggest we are interested whether the ring has >>>> unconsumed requests, and netback uses it that way. The answer to that >>>> question is req_prod - req_cons. And it works if prod wrapped but cons >>>> didn't. >>> >>> Yes. >>> >>>> rsp calculates the number of "consumed but not responded" requests (it >>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then >>>> subtract it from the ring size. >>> >>> That is indeed an odd thing to check, since it seems like it could only >>> be relevant if the request producer overran the response producer. >>> It's been there in one form or another since the original ring.h, >>> and RING_REQUEST_CONS_OVERFLOW does something similar. >>> >>> I can't remember the original reasoning, and so I'm reluctant to >>> suggest removing it without some more eyes on the code... >> >> I've added the following printk before the "req < rsp" part: >> >> if (rsp < req) \ >> pr_err("req %u rsp %u req_prod %u req_cons %u >> rsp_prod_pvt %u\n", req, >> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \ >> >> And it gave me such results: >> >> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod >> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 >> >> So it can happen that req_prod is behind req_cons, sometimes even with >> 17! But it always happen in this callback of my new grant mapping >> series, which runs outside the NAPI instance. My theory why this can >> happen: >> - callback reads req_prod >> - frontend writes it >> - backend picks it up, and consumes those slots >> - callback reads req_cons >> >> So req can be near UINT_MAX if you call this macro outside the backend. >> The only place where the actual return value of this macro matters is >> xenvif_tx_build_gops, and it should be correct there. At other places we >> are only looking for the fact whether the ring has unconsumed requests >> or not. If prod is smaller than cons, we clearly read a wrong value. I >> think what we can do: >> 1. try again until its correct >> 2. just return a non-zero value, it shouldn't cause too much trouble if >> we say yes here >> 3. we can't see rsp_cons, so try to figure out if the ring is full of >> consumed but not responded requests, and return zero then, otherwise a >> positive value. That's what we do know. >> >> Does this make sense? Should we rather go option 1? Should I post a >> comment patch to document this, and spare a few hours for future >> generations? :) >> > > The number of responses on the ring is clearly immaterial if you're only looking for unconsumed requests. So I think it should something analogous to RING_HAS_UNCONSUMED_RESPONSES: > > #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ > ((_r)->sring->req_prod - (_r)->req_cons) > > Any racing between these two values is the responsibility of the individual frontend/backend and not something that this macro needs to care about. I think we should introduce a new macro with a name like RING_NR_UNCONSUMED_REQUESTS, which does what you wrote above: ((_r)->sring->req_prod - (_r)->req_cons) It should be used where we actually care about the number of the unconsumed request, practically from xenvif_tx_build_gops. And we should note in a comment that this macro works only if you use it from the context of the writer of req_cons (NAPI instance in our case) We can actually use it from tx_work_todo as well, and comment on the function that it should be used only from the NAPI instance. The other users, who are not interested in the actual number: xenvif_tx_interrupt xenvif_zerocopy_callback (new stuff from my series) and through RING_FINAL_CHECK_FOR_REQUESTS: xenvif_check_rx_xenvif xenvif_poll do_block_io_op Apart from xenvif_poll they can run into the problem when (req_prod < req_cons). To handle that situation I've listed three options in my previous email. I think the current one is a quite good best effort to optimize the situation, and we can go with that, if we document it properly in comments why is it necessary. To clarify what does this do, let me show an example: req_prod = 253 req_cons = 256 rsp_prod_pvt = 0 req will be UINT_MAX-2, as the values changed in the meantime, and rsp is 0. It's reasonable to return 0 here, as the backend hasn't replied anything yet, so we clearly shouldn't have any unconsumed request in the ring. Regards, Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 21:39 ` Zoltan Kiss 2014-03-07 9:23 ` Paul Durrant @ 2014-03-07 12:02 ` Wei Liu 2014-03-07 18:58 ` Zoltan Kiss 2014-03-11 15:55 ` Ian Campbell 2 siblings, 1 reply; 53+ messages in thread From: Wei Liu @ 2014-03-07 12:02 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Tim Deegan, Ian Campbell, Wei Liu On Thu, Mar 06, 2014 at 09:39:40PM +0000, Zoltan Kiss wrote: > On 06/03/14 17:30, Tim Deegan wrote: > >At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: > >>On 06/03/14 15:53, Ian Campbell wrote: > >>>On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: > >>>>By my understanding, there is no way rsp could be smaller than req, so > >>>>there is no point having this. Am I missing something? > >>> > >>>It happens during wraparound, i.e. after req has wrapped but rsp hasn't > >>>yet. > >> > >>The name of the macro suggest we are interested whether the ring has > >>unconsumed requests, and netback uses it that way. The answer to that > >>question is req_prod - req_cons. And it works if prod wrapped but cons > >>didn't. > > > >Yes. > > > >>rsp calculates the number of "consumed but not responded" requests (it > >>also works well if req_cons wrapped but rsp_prod_pvt didn't), then > >>subtract it from the ring size. > > > >That is indeed an odd thing to check, since it seems like it could only > >be relevant if the request producer overran the response producer. > >It's been there in one form or another since the original ring.h, > >and RING_REQUEST_CONS_OVERFLOW does something similar. > > > >I can't remember the original reasoning, and so I'm reluctant to > >suggest removing it without some more eyes on the code... > > I've added the following printk before the "req < rsp" part: > > if (rsp < req) \ > pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", > req, rsp, (_r)->sring->req_prod, (_r)->req_cons, > (_r)->rsp_prod_pvt); \ > > And it gave me such results: > > xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod > 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 > > So it can happen that req_prod is behind req_cons, sometimes even > with 17! But it always happen in this callback of my new grant > mapping series, which runs outside the NAPI instance. My theory why > this can happen: > - callback reads req_prod > - frontend writes it > - backend picks it up, and consumes those slots > - callback reads req_cons > So there's three parties accessing the ring? I *think* the ring is only designed to be lockfree for two parties so I don't know whether this model is legit. Wei. > So req can be near UINT_MAX if you call this macro outside the > backend. The only place where the actual return value of this macro > matters is xenvif_tx_build_gops, and it should be correct there. At > other places we are only looking for the fact whether the ring has > unconsumed requests or not. If prod is smaller than cons, we clearly > read a wrong value. I think what we can do: > 1. try again until its correct > 2. just return a non-zero value, it shouldn't cause too much trouble > if we say yes here > 3. we can't see rsp_cons, so try to figure out if the ring is full > of consumed but not responded requests, and return zero then, > otherwise a positive value. That's what we do know. > > Does this make sense? Should we rather go option 1? Should I post a > comment patch to document this, and spare a few hours for future > generations? :) > > Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-07 12:02 ` Wei Liu @ 2014-03-07 18:58 ` Zoltan Kiss 0 siblings, 0 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-07 18:58 UTC (permalink / raw) To: Wei Liu; +Cc: xen-devel, Tim Deegan, Ian Campbell On 07/03/14 12:02, Wei Liu wrote: > On Thu, Mar 06, 2014 at 09:39:40PM +0000, Zoltan Kiss wrote: >> On 06/03/14 17:30, Tim Deegan wrote: >>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: >>>> On 06/03/14 15:53, Ian Campbell wrote: >>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: >>>>>> By my understanding, there is no way rsp could be smaller than req, so >>>>>> there is no point having this. Am I missing something? >>>>> >>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't >>>>> yet. >>>> >>>> The name of the macro suggest we are interested whether the ring has >>>> unconsumed requests, and netback uses it that way. The answer to that >>>> question is req_prod - req_cons. And it works if prod wrapped but cons >>>> didn't. >>> >>> Yes. >>> >>>> rsp calculates the number of "consumed but not responded" requests (it >>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then >>>> subtract it from the ring size. >>> >>> That is indeed an odd thing to check, since it seems like it could only >>> be relevant if the request producer overran the response producer. >>> It's been there in one form or another since the original ring.h, >>> and RING_REQUEST_CONS_OVERFLOW does something similar. >>> >>> I can't remember the original reasoning, and so I'm reluctant to >>> suggest removing it without some more eyes on the code... >> >> I've added the following printk before the "req < rsp" part: >> >> if (rsp < req) \ >> pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", >> req, rsp, (_r)->sring->req_prod, (_r)->req_cons, >> (_r)->rsp_prod_pvt); \ >> >> And it gave me such results: >> >> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod >> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 >> >> So it can happen that req_prod is behind req_cons, sometimes even >> with 17! But it always happen in this callback of my new grant >> mapping series, which runs outside the NAPI instance. My theory why >> this can happen: >> - callback reads req_prod >> - frontend writes it >> - backend picks it up, and consumes those slots >> - callback reads req_cons >> > > So there's three parties accessing the ring? I *think* the ring is only > designed to be lockfree for two parties so I don't know whether this > model is legit. Indeed, however if we want to reintroduce grant mapping in netback, the releasing of pages will happen asynchronously. And there, you have to check whether you should kick the NAPI instance, to avoid this kind of stall: - xenvif_poll call napi_complete, because more_to_do=true, but xenvif_tx_pending_slots_available=false - we won't receive interrupt from frontend, as we already got one - the reasonable place to kick in NAPI instance is when we released packets. Fortunately it doesn't happen very often Actually at the moment I do it in the callback, but it's better in the dealloc thread, when we really released the pending slots, otherwise we might risk that NAPI instance quickly deschedule itself again. I'll fix that in a next series ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 21:39 ` Zoltan Kiss 2014-03-07 9:23 ` Paul Durrant 2014-03-07 12:02 ` Wei Liu @ 2014-03-11 15:55 ` Ian Campbell 2014-03-11 23:34 ` Zoltan Kiss 2 siblings, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-11 15:55 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Tim Deegan, Wei Liu On Thu, 2014-03-06 at 21:39 +0000, Zoltan Kiss wrote: > On 06/03/14 17:30, Tim Deegan wrote: > > At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: > >> On 06/03/14 15:53, Ian Campbell wrote: > >>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: > >>>> By my understanding, there is no way rsp could be smaller than req, so > >>>> there is no point having this. Am I missing something? > >>> > >>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't > >>> yet. > >> > >> The name of the macro suggest we are interested whether the ring has > >> unconsumed requests, and netback uses it that way. The answer to that > >> question is req_prod - req_cons. And it works if prod wrapped but cons > >> didn't. > > > > Yes. > > > >> rsp calculates the number of "consumed but not responded" requests (it > >> also works well if req_cons wrapped but rsp_prod_pvt didn't), then > >> subtract it from the ring size. > > > > That is indeed an odd thing to check, since it seems like it could only > > be relevant if the request producer overran the response producer. > > It's been there in one form or another since the original ring.h, > > and RING_REQUEST_CONS_OVERFLOW does something similar. > > > > I can't remember the original reasoning, and so I'm reluctant to > > suggest removing it without some more eyes on the code... > > I've added the following printk before the "req < rsp" part: > > if (rsp < req) \ > pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", req, > rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \ > > And it gave me such results: > > xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod > 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 > > So it can happen that req_prod is behind req_cons, sometimes even with > 17! But it always happen in this callback of my new grant mapping > series, which runs outside the NAPI instance. My theory why this can happen: > - callback reads req_prod > - frontend writes it > - backend picks it up, and consumes those slots > - callback reads req_cons I'm a bit confused by what you mean by "it" in your theory, so perhaps I misunderstand. Can you use the actual variable names for clarity. Are you sure this is a problem with the actual code and not just with your debug print? I would expect the real code to be snapshotting things as appropriate etc, and also for the public/private state to not necessarily be totally in sync when RING_HAS_UNCONSUMED_REQUESTS is being called. > So req can be near UINT_MAX if you call this macro outside the backend. > The only place where the actual return value of this macro matters is > xenvif_tx_build_gops, and it should be correct there. At other places we > are only looking for the fact whether the ring has unconsumed requests > or not. If prod is smaller than cons, we clearly read a wrong value. I > think what we can do: > 1. try again until its correct > 2. just return a non-zero value, it shouldn't cause too much trouble if > we say yes here > 3. we can't see rsp_cons, so try to figure out if the ring is full of > consumed but not responded requests, and return zero then, otherwise a > positive value. That's what we do know. s/know/now/? Is #3 here the status quo? > Does this make sense? Should we rather go option 1? Should I post a > comment patch to document this, and spare a few hours for future > generations? :) Docs are always good IMHO. Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-11 15:55 ` Ian Campbell @ 2014-03-11 23:34 ` Zoltan Kiss 0 siblings, 0 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-11 23:34 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Tim Deegan, Wei Liu On 11/03/14 15:55, Ian Campbell wrote: > On Thu, 2014-03-06 at 21:39 +0000, Zoltan Kiss wrote: >> On 06/03/14 17:30, Tim Deegan wrote: >>> At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote: >>>> On 06/03/14 15:53, Ian Campbell wrote: >>>>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: >>>>>> By my understanding, there is no way rsp could be smaller than req, so >>>>>> there is no point having this. Am I missing something? >>>>> >>>>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't >>>>> yet. >>>> >>>> The name of the macro suggest we are interested whether the ring has >>>> unconsumed requests, and netback uses it that way. The answer to that >>>> question is req_prod - req_cons. And it works if prod wrapped but cons >>>> didn't. >>> >>> Yes. >>> >>>> rsp calculates the number of "consumed but not responded" requests (it >>>> also works well if req_cons wrapped but rsp_prod_pvt didn't), then >>>> subtract it from the ring size. >>> >>> That is indeed an odd thing to check, since it seems like it could only >>> be relevant if the request producer overran the response producer. >>> It's been there in one form or another since the original ring.h, >>> and RING_REQUEST_CONS_OVERFLOW does something similar. >>> >>> I can't remember the original reasoning, and so I'm reluctant to >>> suggest removing it without some more eyes on the code... >> >> I've added the following printk before the "req < rsp" part: >> >> if (rsp < req) \ >> pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt %u\n", req, >> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \ >> >> And it gave me such results: >> >> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod >> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755 >> >> So it can happen that req_prod is behind req_cons, sometimes even with >> 17! But it always happen in this callback of my new grant mapping >> series, which runs outside the NAPI instance. My theory why this can happen: >> - callback reads req_prod >> - frontend writes it >> - backend picks it up, and consumes those slots >> - callback reads req_cons > > I'm a bit confused by what you mean by "it" in your theory, so perhaps I > misunderstand. Can you use the actual variable names for clarity. I meant req_prod all the time. > Are you sure this is a problem with the actual code and not just with > your debug print? I would expect the real code to be snapshotting things > as appropriate etc, and also for the public/private state to not > necessarily be totally in sync when RING_HAS_UNCONSUMED_REQUESTS is > being called. I think my above code does the right thing. For double check I printed out req, rsp, and the values they are calculated from. I guess req_prod is cached in the printk and that's why it is still the same value as we read when calculating req. I'll test that with a memory barrier. > >> So req can be near UINT_MAX if you call this macro outside the backend. >> The only place where the actual return value of this macro matters is >> xenvif_tx_build_gops, and it should be correct there. At other places we >> are only looking for the fact whether the ring has unconsumed requests >> or not. If prod is smaller than cons, we clearly read a wrong value. I >> think what we can do: >> 1. try again until its correct >> 2. just return a non-zero value, it shouldn't cause too much trouble if >> we say yes here >> 3. we can't see rsp_cons, so try to figure out if the ring is full of >> consumed but not responded requests, and return zero then, otherwise a >> positive value. That's what we do know. > > s/know/now/? Is #3 here the status quo? Yes, that's the status quo. It's a best effort calculation, rsp takes only effect if req ~ UINT_MAX. > >> Does this make sense? Should we rather go option 1? Should I post a >> comment patch to document this, and spare a few hours for future >> generations? :) > > Docs are always good IMHO. > > Ian. > ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-06 17:30 ` Tim Deegan 2014-03-06 21:39 ` Zoltan Kiss @ 2014-03-13 16:38 ` Tim Deegan 2014-03-22 14:18 ` Zoltan Kiss 2014-04-03 9:38 ` Tim Deegan 1 sibling, 2 replies; 53+ messages in thread From: Tim Deegan @ 2014-03-13 16:38 UTC (permalink / raw) To: xen-devel Cc: Wei Liu, Ian Campbell, Zoltan Kiss, Stefano Stabellini, freebsd-xen, Alan Somers, Manuel Bouyer, David Vrabel, Jan Beulich, Keir Fraser, Boris Ostrovsky, Roger Pau Monné, Paul Durrant, John Suykerbuyk [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Remove a useless (though harmless) extra check. Code inspection --------------- Ian Campbell and I looked at this today and can't find any case where the existing 'rsp' test would be useful. In today's ring.h, #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ unsigned int rsp = RING_SIZE(_r) - \ ((_r)->req_cons - (_r)->rsp_prod_pvt); \ req < rsp ? req : rsp; \ }) 'req' is the number of requests that the F/E has published and we have not yet consumed. 'rsp' is an odd fish, everything _except_ what we might call requests locally in flight, that is requests we've consumed but not produced a response for. We could only think of two things we might be trying to test for here: (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the way these rings normally work. In that case, as Zoltan pointed out, rsp must be >= req, for a well-behaved frontend: otherwise we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun the ring. I don't think this even usefully protects against a malicious/buggy frontend. (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm told might happen in linux netback? In that case, we might plausibly want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's not what this does. Instead, rsp will underflow to RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. So in both of those cases, 'rsp' is always >= 'req' and is useless. Code archaeology ---------------- In the original ring.h, the test was a boolean, as the name implies. Cset 99812f40 ([NET] back: Add SG support) extended it to a count in the obvious manner. Looking at the original (0b788798): #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r) \ ( ((_r)->req_cons != (_r)->sring->req_prod ) && \ (((_r)->req_cons - (_r)->rsp_prod_pvt) != \ SRING_SIZE((_p), (_r)->sring)) ) it seems to be testing for 'there are unconsumed requests _and_ we have not got a full ring of consumed-but-not-responded requests'. This also seems to be effectively harmless: if there are unconsumed requests, we can't possibly have a ring full of c-b-n-r requests unless the frontend's request producer has overrun its response consumer. That code was introduced with no callers, but seems to have been copied from the existing netback code, which switched to use it in b242b208. All later users of it in the xenolinux trees are either brand new code or replacing a simple (req_cons - req_prod) test. The netback code goes back to Keir's original implementation, where it looked like this, in inverted form (1de448f4): /* Work to do? */ i = netif->tx_req_cons; if ( (i == netif->tx->req_prod) || ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) ) { netif_put(netif); continue; } Again, I don't think this test is useful: it's only interesting if netfront has overrun the ring, but it doesn't reliably detect that. So let's remove it. Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: Tim Deegan <tim@xen.org> Cc: Keir Fraser <keir@xen.org> Cc: Jan Beulich <JBeulich@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Alan Somers <alans@spectralogic.com> Cc: John Suykerbuyk <johns@spectralogic.com> Cc: Manuel Bouyer <bouyer@NetBSD.org> Cc: Paul Durrant <Paul.Durrant@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- RFC because - I might well be missing something here; and - this is a change to a public header that could be in use in any number of implementations; since the extra test is very cheap, and seems to be harmess, we might consider leaving it in place. - I haven't tested it at all yet. CC'ing a whole bunch of people whose code might be using this macro. diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h index 73e13d7..d6e23f1 100644 --- a/xen/include/public/io/ring.h +++ b/xen/include/public/io/ring.h @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ ((_r)->sring->rsp_prod - (_r)->rsp_cons) -#ifdef __GNUC__ -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ - unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ - unsigned int rsp = RING_SIZE(_r) - \ - ((_r)->req_cons - (_r)->rsp_prod_pvt); \ - req < rsp ? req : rsp; \ -}) -#else -/* Same as above, but without the nice GCC ({ ... }) syntax. */ #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ - ((((_r)->sring->req_prod - (_r)->req_cons) < \ - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \ - ((_r)->sring->req_prod - (_r)->req_cons) : \ - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) -#endif + ((_r)->sring->req_prod - (_r)->req_cons) /* Direct access to individual ring elements, by index. */ #define RING_GET_REQUEST(_r, _idx) \ ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-13 16:38 ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan @ 2014-03-22 14:18 ` Zoltan Kiss 2014-03-22 17:14 ` Tim Deegan 2014-04-03 9:38 ` Tim Deegan 1 sibling, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-22 14:18 UTC (permalink / raw) To: Tim Deegan, xen-devel Cc: Wei Liu, Ian Campbell, Stefano Stabellini, freebsd-xen, Alan Somers, Manuel Bouyer, David Vrabel, Jan Beulich, Keir Fraser, Boris Ostrovsky, Roger Pau Monné, Paul Durrant, John Suykerbuyk Hi, I think I might have an explanation why do we need this, see this mailing: https://lkml.org/lkml/2014/3/20/710 https://lkml.org/lkml/2014/3/21/111 https://lkml.org/lkml/2014/3/21/390 Zoli On 13/03/14 16:38, Tim Deegan wrote: > [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() > > Remove a useless (though harmless) extra check. > > Code inspection > --------------- > > Ian Campbell and I looked at this today and can't find any case where > the existing 'rsp' test would be useful. In today's ring.h, > > #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ > unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > unsigned int rsp = RING_SIZE(_r) - \ > ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > req < rsp ? req : rsp; \ > }) > > 'req' is the number of requests that the F/E has published and we have > not yet consumed. 'rsp' is an odd fish, everything _except_ what we > might call requests locally in flight, that is requests we've consumed > but not produced a response for. We could only think of two things we > might be trying to test for here: > > (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the > way these rings normally work. In that case, as Zoltan pointed > out, rsp must be >= req, for a well-behaved frontend: otherwise > we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that > req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun > the ring. I don't think this even usefully protects against a > malicious/buggy frontend. > > (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm > told might happen in linux netback? In that case, we might plausibly > want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of > (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's > not what this does. Instead, rsp will underflow to > RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. > > So in both of those cases, 'rsp' is always >= 'req' and is useless. > > > Code archaeology > ---------------- > > In the original ring.h, the test was a boolean, as the name implies. > Cset 99812f40 ([NET] back: Add SG support) extended it to a count in > the obvious manner. Looking at the original (0b788798): > > #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r) \ > ( ((_r)->req_cons != (_r)->sring->req_prod ) && \ > (((_r)->req_cons - (_r)->rsp_prod_pvt) != \ > SRING_SIZE((_p), (_r)->sring)) ) > > it seems to be testing for 'there are unconsumed requests _and_ we > have not got a full ring of consumed-but-not-responded requests'. > This also seems to be effectively harmless: if there are unconsumed > requests, we can't possibly have a ring full of c-b-n-r requests > unless the frontend's request producer has overrun its response > consumer. > > That code was introduced with no callers, but seems to have been > copied from the existing netback code, which switched to use it in > b242b208. All later users of it in the xenolinux trees are either > brand new code or replacing a simple (req_cons - req_prod) test. The > netback code goes back to Keir's original implementation, where it > looked like this, in inverted form (1de448f4): > > /* Work to do? */ > i = netif->tx_req_cons; > if ( (i == netif->tx->req_prod) || > ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) ) > { > netif_put(netif); > continue; > } > > Again, I don't think this test is useful: it's only interesting if > netfront has overrun the ring, but it doesn't reliably detect that. > > So let's remove it. > > Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com> > Signed-off-by: Tim Deegan <tim@xen.org> > Cc: Keir Fraser <keir@xen.org> > Cc: Jan Beulich <JBeulich@suse.com> > Cc: Ian Campbell <Ian.Campbell@citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Cc: David Vrabel <david.vrabel@citrix.com> > Cc: Alan Somers <alans@spectralogic.com> > Cc: John Suykerbuyk <johns@spectralogic.com> > Cc: Manuel Bouyer <bouyer@NetBSD.org> > Cc: Paul Durrant <Paul.Durrant@citrix.com> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > RFC because > - I might well be missing something here; and > - this is a change to a public header that could be in use in any > number of implementations; since the extra test is very cheap, and > seems to be harmess, we might consider leaving it in place. > - I haven't tested it at all yet. > > CC'ing a whole bunch of people whose code might be using this macro. > > diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h > index 73e13d7..d6e23f1 100644 > --- a/xen/include/public/io/ring.h > +++ b/xen/include/public/io/ring.h > @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t > #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ > ((_r)->sring->rsp_prod - (_r)->rsp_cons) > > -#ifdef __GNUC__ > -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ > - unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > - unsigned int rsp = RING_SIZE(_r) - \ > - ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > - req < rsp ? req : rsp; \ > -}) > -#else > -/* Same as above, but without the nice GCC ({ ... }) syntax. */ > #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ > - ((((_r)->sring->req_prod - (_r)->req_cons) < \ > - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \ > - ((_r)->sring->req_prod - (_r)->req_cons) : \ > - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) > -#endif > + ((_r)->sring->req_prod - (_r)->req_cons) > > /* Direct access to individual ring elements, by index. */ > #define RING_GET_REQUEST(_r, _idx) \ > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-22 14:18 ` Zoltan Kiss @ 2014-03-22 17:14 ` Tim Deegan 2014-03-24 7:38 ` Jan Beulich 0 siblings, 1 reply; 53+ messages in thread From: Tim Deegan @ 2014-03-22 17:14 UTC (permalink / raw) To: Zoltan Kiss Cc: Wei Liu, Keir Fraser, Ian Campbell, Stefano Stabellini, freebsd-xen, Alan Somers, Manuel Bouyer, David Vrabel, Jan Beulich, xen-devel, Boris Ostrovsky, Roger Pau Monné, Paul Durrant, John Suykerbuyk At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: > Hi, > > I think I might have an explanation why do we need this, see this mailing: > > https://lkml.org/lkml/2014/3/20/710 > https://lkml.org/lkml/2014/3/21/111 > https://lkml.org/lkml/2014/3/21/390 Quoting from the third of these: | But consuming overrunning requests after rsp_prod_pvt is a problem: | - NAPI instance races with dealloc thread over the slots. The first | reads them as requests, the second writes them as responses | - the NAPI instance overwrites used pending slots as well, so skb frag | release go wrong etc. OK, so the backend needs to be careful not to follow the frontend into overrun, not because of the ring itself being corrupted but because it will mess up the backend's internal bookkeeping. Fair enough -- I think this at least constitutes a reason why it might be dangerous to remove the check from the public header. I withdraw the patch. I'll make another patch (with a smaller distribution list!) to add an appropriate comment. Thanks! Tim. > On 13/03/14 16:38, Tim Deegan wrote: > > [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() > > > > Remove a useless (though harmless) extra check. > > > > Code inspection > > --------------- > > > > Ian Campbell and I looked at this today and can't find any case where > > the existing 'rsp' test would be useful. In today's ring.h, > > > > #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ > > unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > > unsigned int rsp = RING_SIZE(_r) - \ > > ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > > req < rsp ? req : rsp; \ > > }) > > > > 'req' is the number of requests that the F/E has published and we have > > not yet consumed. 'rsp' is an odd fish, everything _except_ what we > > might call requests locally in flight, that is requests we've consumed > > but not produced a response for. We could only think of two things we > > might be trying to test for here: > > > > (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the > > way these rings normally work. In that case, as Zoltan pointed > > out, rsp must be >= req, for a well-behaved frontend: otherwise > > we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that > > req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun > > the ring. I don't think this even usefully protects against a > > malicious/buggy frontend. > > > > (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm > > told might happen in linux netback? In that case, we might plausibly > > want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of > > (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's > > not what this does. Instead, rsp will underflow to > > RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. > > > > So in both of those cases, 'rsp' is always >= 'req' and is useless. > > > > > > Code archaeology > > ---------------- > > > > In the original ring.h, the test was a boolean, as the name implies. > > Cset 99812f40 ([NET] back: Add SG support) extended it to a count in > > the obvious manner. Looking at the original (0b788798): > > > > #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r) \ > > ( ((_r)->req_cons != (_r)->sring->req_prod ) && \ > > (((_r)->req_cons - (_r)->rsp_prod_pvt) != \ > > SRING_SIZE((_p), (_r)->sring)) ) > > > > it seems to be testing for 'there are unconsumed requests _and_ we > > have not got a full ring of consumed-but-not-responded requests'. > > This also seems to be effectively harmless: if there are unconsumed > > requests, we can't possibly have a ring full of c-b-n-r requests > > unless the frontend's request producer has overrun its response > > consumer. > > > > That code was introduced with no callers, but seems to have been > > copied from the existing netback code, which switched to use it in > > b242b208. All later users of it in the xenolinux trees are either > > brand new code or replacing a simple (req_cons - req_prod) test. The > > netback code goes back to Keir's original implementation, where it > > looked like this, in inverted form (1de448f4): > > > > /* Work to do? */ > > i = netif->tx_req_cons; > > if ( (i == netif->tx->req_prod) || > > ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) ) > > { > > netif_put(netif); > > continue; > > } > > > > Again, I don't think this test is useful: it's only interesting if > > netfront has overrun the ring, but it doesn't reliably detect that. > > > > So let's remove it. > > > > Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com> > > Signed-off-by: Tim Deegan <tim@xen.org> > > Cc: Keir Fraser <keir@xen.org> > > Cc: Jan Beulich <JBeulich@suse.com> > > Cc: Ian Campbell <Ian.Campbell@citrix.com> > > Cc: Wei Liu <wei.liu2@citrix.com> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > > Cc: David Vrabel <david.vrabel@citrix.com> > > Cc: Alan Somers <alans@spectralogic.com> > > Cc: John Suykerbuyk <johns@spectralogic.com> > > Cc: Manuel Bouyer <bouyer@NetBSD.org> > > Cc: Paul Durrant <Paul.Durrant@citrix.com> > > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > --- > > RFC because > > - I might well be missing something here; and > > - this is a change to a public header that could be in use in any > > number of implementations; since the extra test is very cheap, and > > seems to be harmess, we might consider leaving it in place. > > - I haven't tested it at all yet. > > > > CC'ing a whole bunch of people whose code might be using this macro. > > > > diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h > > index 73e13d7..d6e23f1 100644 > > --- a/xen/include/public/io/ring.h > > +++ b/xen/include/public/io/ring.h > > @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t > > #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ > > ((_r)->sring->rsp_prod - (_r)->rsp_cons) > > > > -#ifdef __GNUC__ > > -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ > > - unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > > - unsigned int rsp = RING_SIZE(_r) - \ > > - ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > > - req < rsp ? req : rsp; \ > > -}) > > -#else > > -/* Same as above, but without the nice GCC ({ ... }) syntax. */ > > #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ > > - ((((_r)->sring->req_prod - (_r)->req_cons) < \ > > - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \ > > - ((_r)->sring->req_prod - (_r)->req_cons) : \ > > - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) > > -#endif > > + ((_r)->sring->req_prod - (_r)->req_cons) > > > > /* Direct access to individual ring elements, by index. */ > > #define RING_GET_REQUEST(_r, _idx) \ > > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-22 17:14 ` Tim Deegan @ 2014-03-24 7:38 ` Jan Beulich 2014-03-24 9:39 ` Paul Durrant 2014-03-24 12:23 ` Zoltan Kiss 0 siblings, 2 replies; 53+ messages in thread From: Jan Beulich @ 2014-03-24 7:38 UTC (permalink / raw) To: Zoltan Kiss, Tim Deegan Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Keir Fraser, freebsd-xen, Alan Somers, Paul Durrant, David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk, Manuel Bouyer, roger.pau >>> On 22.03.14 at 18:14, <tim@xen.org> wrote: > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: >> I think I might have an explanation why do we need this, see this mailing: >> >> https://lkml.org/lkml/2014/3/20/710 >> https://lkml.org/lkml/2014/3/21/111 >> https://lkml.org/lkml/2014/3/21/390 > > Quoting from the third of these: > > | But consuming overrunning requests after rsp_prod_pvt is a problem: > | - NAPI instance races with dealloc thread over the slots. The first > | reads them as requests, the second writes them as responses > | - the NAPI instance overwrites used pending slots as well, so skb frag > | release go wrong etc. > > OK, so the backend needs to be careful not to follow the frontend into > overrun, not because of the ring itself being corrupted but because it > will mess up the backend's internal bookkeeping. With s/will/may/ I'm not sure that's a reason to withdraw the patch: The generic macros in ring.h imo shouldn't dictate any particular protection policy beyond protecting the ring itself. I.e. I'd think if netback need protection beyond the one provided by ring.h macros, it should take care to implement them itself. Yet of course I can see that weakening the protection we have had in place for so many years may result in very undesirable fallout. Jan ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-24 7:38 ` Jan Beulich @ 2014-03-24 9:39 ` Paul Durrant 2014-03-24 9:59 ` Jan Beulich 2014-03-24 12:23 ` Zoltan Kiss 1 sibling, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-24 9:39 UTC (permalink / raw) To: Jan Beulich, Zoltan Kiss, Tim (Xen.org) Cc: Wei Liu, Ian Campbell, Keir (Xen.org), freebsd-xen, Alan Somers, Manuel Bouyer, Stefano Stabellini, David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk, Roger Pau Monne > -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 24 March 2014 07:39 > To: Zoltan Kiss; Tim (Xen.org) > Cc: David Vrabel; Ian Campbell; Paul Durrant; Roger Pau Monne; Wei Liu; > Stefano Stabellini; freebsd-xen@freebsd.org; xen- > devel@lists.xenproject.org; Manuel Bouyer; Boris Ostrovsky; Konrad > Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir (Xen.org) > Subject: Re: [PATCH RFC] xen/public/ring.h: simplify > RING_HAS_UNCONSUMED_REQUESTS() > > >>> On 22.03.14 at 18:14, <tim@xen.org> wrote: > > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: > >> I think I might have an explanation why do we need this, see this mailing: > >> > >> https://lkml.org/lkml/2014/3/20/710 > >> https://lkml.org/lkml/2014/3/21/111 > >> https://lkml.org/lkml/2014/3/21/390 > > > > Quoting from the third of these: > > > > | But consuming overrunning requests after rsp_prod_pvt is a problem: > > | - NAPI instance races with dealloc thread over the slots. The first > > | reads them as requests, the second writes them as responses > > | - the NAPI instance overwrites used pending slots as well, so skb frag > > | release go wrong etc. > > > > OK, so the backend needs to be careful not to follow the frontend into > > overrun, not because of the ring itself being corrupted but because it > > will mess up the backend's internal bookkeeping. > > With s/will/may/ I'm not sure that's a reason to withdraw the patch: > The generic macros in ring.h imo shouldn't dictate any particular > protection policy beyond protecting the ring itself. I.e. I'd think if > netback need protection beyond the one provided by ring.h macros, > it should take care to implement them itself. > > Yet of course I can see that weakening the protection we have had > in place for so many years may result in very undesirable fallout. > But these are, of course, macros and so the protection is baked into any old code. I'm still in favour of changing the macro in the canonical header and adding a comment to point out that older versions of the macro had the extra check. Paul > Jan ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-24 9:39 ` Paul Durrant @ 2014-03-24 9:59 ` Jan Beulich 2014-03-24 11:03 ` Paul Durrant 0 siblings, 1 reply; 53+ messages in thread From: Jan Beulich @ 2014-03-24 9:59 UTC (permalink / raw) To: Paul Durrant, Zoltan Kiss, Tim (Xen.org) Cc: Wei Liu, Ian Campbell, Keir(Xen.org), freebsd-xen, Alan Somers, ManuelBouyer, Stefano Stabellini, David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk, Roger Pau Monne >>> On 24.03.14 at 10:39, <Paul.Durrant@citrix.com> wrote: >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: 24 March 2014 07:39 >> To: Zoltan Kiss; Tim (Xen.org) >> Cc: David Vrabel; Ian Campbell; Paul Durrant; Roger Pau Monne; Wei Liu; >> Stefano Stabellini; freebsd-xen@freebsd.org; xen- >> devel@lists.xenproject.org; Manuel Bouyer; Boris Ostrovsky; Konrad >> Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir (Xen.org) >> Subject: Re: [PATCH RFC] xen/public/ring.h: simplify >> RING_HAS_UNCONSUMED_REQUESTS() >> >> >>> On 22.03.14 at 18:14, <tim@xen.org> wrote: >> > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: >> >> I think I might have an explanation why do we need this, see this mailing: >> >> >> >> https://lkml.org/lkml/2014/3/20/710 >> >> https://lkml.org/lkml/2014/3/21/111 >> >> https://lkml.org/lkml/2014/3/21/390 >> > >> > Quoting from the third of these: >> > >> > | But consuming overrunning requests after rsp_prod_pvt is a problem: >> > | - NAPI instance races with dealloc thread over the slots. The first >> > | reads them as requests, the second writes them as responses >> > | - the NAPI instance overwrites used pending slots as well, so skb frag >> > | release go wrong etc. >> > >> > OK, so the backend needs to be careful not to follow the frontend into >> > overrun, not because of the ring itself being corrupted but because it >> > will mess up the backend's internal bookkeeping. >> >> With s/will/may/ I'm not sure that's a reason to withdraw the patch: >> The generic macros in ring.h imo shouldn't dictate any particular >> protection policy beyond protecting the ring itself. I.e. I'd think if >> netback need protection beyond the one provided by ring.h macros, >> it should take care to implement them itself. >> >> Yet of course I can see that weakening the protection we have had >> in place for so many years may result in very undesirable fallout. >> > > But these are, of course, macros and so the protection is baked into any old > code. Right, but the reduced protection won't be noticeable on a re-build, and syncing up header files may be a purely mechanical (perhaps even automated) operation. > I'm still in favour of changing the macro in the canonical header and > adding a comment to point out that older versions of the macro had the extra > check. I too am in favor of changing the canonical header; I only wanted to point out that this isn't without risk of regressions. Jan ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-24 9:59 ` Jan Beulich @ 2014-03-24 11:03 ` Paul Durrant 0 siblings, 0 replies; 53+ messages in thread From: Paul Durrant @ 2014-03-24 11:03 UTC (permalink / raw) To: Jan Beulich, Zoltan Kiss, Tim (Xen.org) Cc: Wei Liu, Ian Campbell, Keir (Xen.org), freebsd-xen, Alan Somers, ManuelBouyer, Stefano Stabellini, David Vrabel, xen-devel, Boris Ostrovsky, John Suykerbuyk, Roger Pau Monne > -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 24 March 2014 10:00 > To: Paul Durrant; Zoltan Kiss; Tim (Xen.org) > Cc: David Vrabel; Ian Campbell; Roger Pau Monne; Stefano Stabellini; Wei Liu; > freebsd-xen@freebsd.org; xen-devel@lists.xenproject.org; ManuelBouyer; > Boris Ostrovsky; Konrad Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir > (Xen.org) > Subject: RE: [PATCH RFC] xen/public/ring.h: simplify > RING_HAS_UNCONSUMED_REQUESTS() > > >>> On 24.03.14 at 10:39, <Paul.Durrant@citrix.com> wrote: > >> -----Original Message----- > >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> Sent: 24 March 2014 07:39 > >> To: Zoltan Kiss; Tim (Xen.org) > >> Cc: David Vrabel; Ian Campbell; Paul Durrant; Roger Pau Monne; Wei Liu; > >> Stefano Stabellini; freebsd-xen@freebsd.org; xen- > >> devel@lists.xenproject.org; Manuel Bouyer; Boris Ostrovsky; Konrad > >> Rzeszutek Wilk; Alan Somers; John Suykerbuyk; Keir (Xen.org) > >> Subject: Re: [PATCH RFC] xen/public/ring.h: simplify > >> RING_HAS_UNCONSUMED_REQUESTS() > >> > >> >>> On 22.03.14 at 18:14, <tim@xen.org> wrote: > >> > At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: > >> >> I think I might have an explanation why do we need this, see this > mailing: > >> >> > >> >> https://lkml.org/lkml/2014/3/20/710 > >> >> https://lkml.org/lkml/2014/3/21/111 > >> >> https://lkml.org/lkml/2014/3/21/390 > >> > > >> > Quoting from the third of these: > >> > > >> > | But consuming overrunning requests after rsp_prod_pvt is a problem: > >> > | - NAPI instance races with dealloc thread over the slots. The first > >> > | reads them as requests, the second writes them as responses > >> > | - the NAPI instance overwrites used pending slots as well, so skb frag > >> > | release go wrong etc. > >> > > >> > OK, so the backend needs to be careful not to follow the frontend into > >> > overrun, not because of the ring itself being corrupted but because it > >> > will mess up the backend's internal bookkeeping. > >> > >> With s/will/may/ I'm not sure that's a reason to withdraw the patch: > >> The generic macros in ring.h imo shouldn't dictate any particular > >> protection policy beyond protecting the ring itself. I.e. I'd think if > >> netback need protection beyond the one provided by ring.h macros, > >> it should take care to implement them itself. > >> > >> Yet of course I can see that weakening the protection we have had > >> in place for so many years may result in very undesirable fallout. > >> > > > > But these are, of course, macros and so the protection is baked into any old > > code. > > Right, but the reduced protection won't be noticeable on a re-build, > and syncing up header files may be a purely mechanical (perhaps even > automated) operation. > True. I have an auto-sync script for the Windows PV drivers but every time I update the tag to pull from I do check the diffs in the headers (not least because I have to post-process them to change any use of 'long' or 'unsigned long' to something that's actually 64-bits wide in 64-bit Windows). > > I'm still in favour of changing the macro in the canonical header and > > adding a comment to point out that older versions of the macro had the > extra > > check. > > I too am in favor of changing the canonical header; I only wanted to > point out that this isn't without risk of regressions. > Yes. Fair enough. Paul > Jan ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-24 7:38 ` Jan Beulich 2014-03-24 9:39 ` Paul Durrant @ 2014-03-24 12:23 ` Zoltan Kiss 2014-03-24 13:52 ` Paul Durrant 1 sibling, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-24 12:23 UTC (permalink / raw) To: Jan Beulich, Zoltan Kiss, Tim Deegan Cc: Keir Fraser, Ian Campbell, Stefano Stabellini, freebsd-xen, Alan Somers, Paul Durrant, David Vrabel, xen-devel, Wei Liu, Boris Ostrovsky, roger.pau, Manuel Bouyer, John Suykerbuyk On 24/03/14 07:38, Jan Beulich wrote: >>>> On 22.03.14 at 18:14, <tim@xen.org> wrote: >> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: >>> I think I might have an explanation why do we need this, see this mailing: >>> >>> https://lkml.org/lkml/2014/3/20/710 >>> https://lkml.org/lkml/2014/3/21/111 >>> https://lkml.org/lkml/2014/3/21/390 >> >> Quoting from the third of these: >> >> | But consuming overrunning requests after rsp_prod_pvt is a problem: >> | - NAPI instance races with dealloc thread over the slots. The first >> | reads them as requests, the second writes them as responses >> | - the NAPI instance overwrites used pending slots as well, so skb frag >> | release go wrong etc. >> >> OK, so the backend needs to be careful not to follow the frontend into >> overrun, not because of the ring itself being corrupted but because it >> will mess up the backend's internal bookkeeping. > > With s/will/may/ I'm not sure that's a reason to withdraw the patch: > The generic macros in ring.h imo shouldn't dictate any particular > protection policy beyond protecting the ring itself. I.e. I'd think if > netback need protection beyond the one provided by ring.h macros, > it should take care to implement them itself. It's not "may", it is a "will". In case of Linux netback for sure, but I think it's reasonable for any backend to rely on the fact that the ring macros protect them from abusive frontends. Protecting a backend to read requests from the [rsp_prod_pvt, req_cons] range is a sensible thing to do in the ring macros. Also, RING_FINAL_CHECK_FOR_REQUESTS relies on this macro, removing this protection may cause other issues, e.g. netback keeps the NAPI instance spinning while it's not consuming any requests. Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-24 12:23 ` Zoltan Kiss @ 2014-03-24 13:52 ` Paul Durrant 2014-03-24 23:55 ` Zoltan Kiss 0 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-24 13:52 UTC (permalink / raw) To: Zoltan Kiss, Jan Beulich, Zoltan Kiss, Tim (Xen.org) Cc: Keir (Xen.org), Ian Campbell, freebsd-xen, Alan Somers, Manuel Bouyer, Stefano Stabellini, David Vrabel, xen-devel, Wei Liu, Boris Ostrovsky, Roger Pau Monne, John Suykerbuyk > -----Original Message----- > From: Zoltan Kiss > Sent: 24 March 2014 12:24 > To: Jan Beulich; Zoltan Kiss; Tim (Xen.org) > Cc: Wei Liu; Ian Campbell; Stefano Stabellini; Keir (Xen.org); freebsd- > xen@freebsd.org; Alan Somers; Paul Durrant; David Vrabel; xen- > devel@lists.xenproject.org; Boris Ostrovsky; John Suykerbuyk; Manuel > Bouyer; Roger Pau Monne > Subject: Re: [Xen-devel] [PATCH RFC] xen/public/ring.h: simplify > RING_HAS_UNCONSUMED_REQUESTS() > > On 24/03/14 07:38, Jan Beulich wrote: > >>>> On 22.03.14 at 18:14, <tim@xen.org> wrote: > >> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: > >>> I think I might have an explanation why do we need this, see this mailing: > >>> > >>> https://lkml.org/lkml/2014/3/20/710 > >>> https://lkml.org/lkml/2014/3/21/111 > >>> https://lkml.org/lkml/2014/3/21/390 > >> > >> Quoting from the third of these: > >> > >> | But consuming overrunning requests after rsp_prod_pvt is a problem: > >> | - NAPI instance races with dealloc thread over the slots. The first > >> | reads them as requests, the second writes them as responses > >> | - the NAPI instance overwrites used pending slots as well, so skb frag > >> | release go wrong etc. > >> > >> OK, so the backend needs to be careful not to follow the frontend into > >> overrun, not because of the ring itself being corrupted but because it > >> will mess up the backend's internal bookkeeping. > > > > With s/will/may/ I'm not sure that's a reason to withdraw the patch: > > The generic macros in ring.h imo shouldn't dictate any particular > > protection policy beyond protecting the ring itself. I.e. I'd think if > > netback need protection beyond the one provided by ring.h macros, > > it should take care to implement them itself. > > It's not "may", it is a "will". In case of Linux netback for sure, but I > think it's reasonable for any backend to rely on the fact that the ring > macros protect them from abusive frontends. Protecting a backend to read > requests from the [rsp_prod_pvt, req_cons] range is a sensible thing to > do in the ring macros. I disagree. That's not what the name of the macro implies; it implies a range for req_prod - req_cons. The backend is responsible for its own rsp_prod_pvt value and should make sure it's safe. If, to that end, it wants to have its own variant of the macro then that's reasonable, but adding such a clause to the macro in the generic header is (as has been proven) confusing. Paul > Also, RING_FINAL_CHECK_FOR_REQUESTS relies on this macro, removing > this > protection may cause other issues, e.g. netback keeps the NAPI instance > spinning while it's not consuming any requests. > > Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-24 13:52 ` Paul Durrant @ 2014-03-24 23:55 ` Zoltan Kiss 0 siblings, 0 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-24 23:55 UTC (permalink / raw) To: Paul Durrant, Jan Beulich, Zoltan Kiss, Tim (Xen.org) Cc: Keir (Xen.org), Ian Campbell, freebsd-xen, Alan Somers, Manuel Bouyer, Stefano Stabellini, David Vrabel, xen-devel, Wei Liu, Boris Ostrovsky, Roger Pau Monne, John Suykerbuyk On 24/03/14 13:52, Paul Durrant wrote: >> -----Original Message----- >> From: Zoltan Kiss >> Sent: 24 March 2014 12:24 >> To: Jan Beulich; Zoltan Kiss; Tim (Xen.org) >> Cc: Wei Liu; Ian Campbell; Stefano Stabellini; Keir (Xen.org); freebsd- >> xen@freebsd.org; Alan Somers; Paul Durrant; David Vrabel; xen- >> devel@lists.xenproject.org; Boris Ostrovsky; John Suykerbuyk; Manuel >> Bouyer; Roger Pau Monne >> Subject: Re: [Xen-devel] [PATCH RFC] xen/public/ring.h: simplify >> RING_HAS_UNCONSUMED_REQUESTS() >> >> On 24/03/14 07:38, Jan Beulich wrote: >>>>>> On 22.03.14 at 18:14, <tim@xen.org> wrote: >>>> At 14:18 +0000 on 22 Mar (1395494283), Zoltan Kiss wrote: >>>>> I think I might have an explanation why do we need this, see this mailing: >>>>> >>>>> https://lkml.org/lkml/2014/3/20/710 >>>>> https://lkml.org/lkml/2014/3/21/111 >>>>> https://lkml.org/lkml/2014/3/21/390 >>>> Quoting from the third of these: >>>> >>>> | But consuming overrunning requests after rsp_prod_pvt is a problem: >>>> | - NAPI instance races with dealloc thread over the slots. The first >>>> | reads them as requests, the second writes them as responses >>>> | - the NAPI instance overwrites used pending slots as well, so skb frag >>>> | release go wrong etc. >>>> >>>> OK, so the backend needs to be careful not to follow the frontend into >>>> overrun, not because of the ring itself being corrupted but because it >>>> will mess up the backend's internal bookkeeping. >>> With s/will/may/ I'm not sure that's a reason to withdraw the patch: >>> The generic macros in ring.h imo shouldn't dictate any particular >>> protection policy beyond protecting the ring itself. I.e. I'd think if >>> netback need protection beyond the one provided by ring.h macros, >>> it should take care to implement them itself. >> It's not "may", it is a "will". In case of Linux netback for sure, but I >> think it's reasonable for any backend to rely on the fact that the ring >> macros protect them from abusive frontends. Protecting a backend to read >> requests from the [rsp_prod_pvt, req_cons] range is a sensible thing to >> do in the ring macros. > I disagree. That's not what the name of the macro implies; it implies a range for req_prod - req_cons. The backend is responsible for its own rsp_prod_pvt value and should make sure it's safe. If, to that end, it wants to have its own variant of the macro then that's reasonable, but adding such a clause to the macro in the generic header is (as has been proven) confusing. My opinion is that the name of the macro implies the caller wants to know if there are unconsumed requests on the ring (and it doesn't imply it, but the return value is actually how many unconsumed requests you have). Returning req_prod - req_cons seems to be logical at first (and second and ...) thought, but - and that's not implied in the name indeed, but I think it's sensible, and should be documented in a comment - if you want to protect the backend from abusive frontends, you should check if req_prod is not overrunning rsp_prod_pvt. You could check rsp_prod, but rsp_prod_pvt is better, because the latter should be always equal to or ahead of the former, and from the backend point of view the responses between the two (if any) doesn't matter. It's not about the backend making sure it's rsp_prod_pvt is valid or not, it's about protection from a frontend. And I think it should be a general thing for the ring users to have this safety check, rather than something backend-specific, as all of them could be affected. Zoli > > Paul > >> Also, RING_FINAL_CHECK_FOR_REQUESTS relies on this macro, removing >> this >> protection may cause other issues, e.g. netback keeps the NAPI instance >> spinning while it's not consuming any requests. >> >> Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-03-13 16:38 ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan 2014-03-22 14:18 ` Zoltan Kiss @ 2014-04-03 9:38 ` Tim Deegan 2014-04-03 15:34 ` Zoltan Kiss 1 sibling, 1 reply; 53+ messages in thread From: Tim Deegan @ 2014-04-03 9:38 UTC (permalink / raw) To: keir, xen-devel [CC's clipped] At 17:38 +0100 on 13 Mar (1394728686), Tim Deegan wrote: > [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() > > Remove a useless (though harmless) extra check. What shall we do about this? AFAICT, Jan and Paul are in favour of the change; Zoltan is against it. I am wavering. Keir, any opinion? Tim. > Code inspection > --------------- > > Ian Campbell and I looked at this today and can't find any case where > the existing 'rsp' test would be useful. In today's ring.h, > > #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ > unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > unsigned int rsp = RING_SIZE(_r) - \ > ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > req < rsp ? req : rsp; \ > }) > > 'req' is the number of requests that the F/E has published and we have > not yet consumed. 'rsp' is an odd fish, everything _except_ what we > might call requests locally in flight, that is requests we've consumed > but not produced a response for. We could only think of two things we > might be trying to test for here: > > (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the > way these rings normally work. In that case, as Zoltan pointed > out, rsp must be >= req, for a well-behaved frontend: otherwise > we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that > req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun > the ring. I don't think this even usefully protects against a > malicious/buggy frontend. > > (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm > told might happen in linux netback? In that case, we might plausibly > want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of > (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's > not what this does. Instead, rsp will underflow to > RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. > > So in both of those cases, 'rsp' is always >= 'req' and is useless. > > > Code archaeology > ---------------- > > In the original ring.h, the test was a boolean, as the name implies. > Cset 99812f40 ([NET] back: Add SG support) extended it to a count in > the obvious manner. Looking at the original (0b788798): > > #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r) \ > ( ((_r)->req_cons != (_r)->sring->req_prod ) && \ > (((_r)->req_cons - (_r)->rsp_prod_pvt) != \ > SRING_SIZE((_p), (_r)->sring)) ) > > it seems to be testing for 'there are unconsumed requests _and_ we > have not got a full ring of consumed-but-not-responded requests'. > This also seems to be effectively harmless: if there are unconsumed > requests, we can't possibly have a ring full of c-b-n-r requests > unless the frontend's request producer has overrun its response > consumer. > > That code was introduced with no callers, but seems to have been > copied from the existing netback code, which switched to use it in > b242b208. All later users of it in the xenolinux trees are either > brand new code or replacing a simple (req_cons - req_prod) test. The > netback code goes back to Keir's original implementation, where it > looked like this, in inverted form (1de448f4): > > /* Work to do? */ > i = netif->tx_req_cons; > if ( (i == netif->tx->req_prod) || > ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) ) > { > netif_put(netif); > continue; > } > > Again, I don't think this test is useful: it's only interesting if > netfront has overrun the ring, but it doesn't reliably detect that. > > So let's remove it. > > Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com> > Signed-off-by: Tim Deegan <tim@xen.org> > Cc: Keir Fraser <keir@xen.org> > Cc: Jan Beulich <JBeulich@suse.com> > Cc: Ian Campbell <Ian.Campbell@citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Cc: David Vrabel <david.vrabel@citrix.com> > Cc: Alan Somers <alans@spectralogic.com> > Cc: John Suykerbuyk <johns@spectralogic.com> > Cc: Manuel Bouyer <bouyer@NetBSD.org> > Cc: Paul Durrant <Paul.Durrant@citrix.com> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > RFC because > - I might well be missing something here; and > - this is a change to a public header that could be in use in any > number of implementations; since the extra test is very cheap, and > seems to be harmess, we might consider leaving it in place. > - I haven't tested it at all yet. > > CC'ing a whole bunch of people whose code might be using this macro. > > diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h > index 73e13d7..d6e23f1 100644 > --- a/xen/include/public/io/ring.h > +++ b/xen/include/public/io/ring.h > @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t > #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ > ((_r)->sring->rsp_prod - (_r)->rsp_cons) > > -#ifdef __GNUC__ > -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ > - unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > - unsigned int rsp = RING_SIZE(_r) - \ > - ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > - req < rsp ? req : rsp; \ > -}) > -#else > -/* Same as above, but without the nice GCC ({ ... }) syntax. */ > #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ > - ((((_r)->sring->req_prod - (_r)->req_cons) < \ > - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \ > - ((_r)->sring->req_prod - (_r)->req_cons) : \ > - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) > -#endif > + ((_r)->sring->req_prod - (_r)->req_cons) > > /* Direct access to individual ring elements, by index. */ > #define RING_GET_REQUEST(_r, _idx) \ > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() 2014-04-03 9:38 ` Tim Deegan @ 2014-04-03 15:34 ` Zoltan Kiss 0 siblings, 0 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-04-03 15:34 UTC (permalink / raw) To: Tim Deegan, keir, xen-devel Paul mentioned the lkml links doesn't work at the moment, but a short summary of why rsp is important: it gives us the maximum possible amount of requests should be on the ring (number of slots which are not consumed-but-not-responded), so even if the frontend lies to us through increasing req_prod to crazy values, backend can avoid consuming slots which are still consumed-but-not-responded. Here is my explanation from that another thread: "The backend doesn't see what the guest does with the responses, and that's OK, it's the guest's problem, after netback increased rsp_prod_pvt it doesn't really care. But as soon as the guest start placing new requests after rsp_prod_pvt, or just increasing req_prod so req_prod-rsp_prod_pvt > XEN_NETIF_TX_RING_SIZE, it becomes an issue. So far this xenvif_tx_pending_slots_available indirectly saved us from consuming requests overwriting still pending requests, but the guest could still overwrote our responses. But again, that's still the guests problem, we had the original request saved in the pending ring data. If the guest went too far, build_gops killed the vif when req_prod-req_cons > XEN_NETIF_TX_RING_SIZE Indirect above means it only happened because the pending ring had the same size, and the used slots has a 1-to-1 mapping for slots between rsp_prod_pvt and req_cons. So xenvif_tx_pending_slots_available also means (req_cons - rsp_prod_pvt) + XEN_NETBK_LEGACY_SLOTS_MAX < XEN_NETIF_TX_RING_SIZE (does this look familiar? But consuming overrunning requests after rsp_prod_pvt is a problem: - NAPI instance races with dealloc thread over the slots. The first reads them as requests, the second writes them as responses - the NAPI instance overwrites used pending slots as well, so skb frag release go wrong etc. But the current RING_HAS_UNCONSUMED_REQUESTS fortunately saves us here, let me explain it through an example: rsp_prod_pvt = 0 req_cons = 253 req_prod = 258 Therefore: req = 5 rsp = 3 So in xenvif_tx_build_gops work_to_do will be 3, and in xenvif_count_requests we bail out when we see that the packet actually exceeds that. So in the end, we are safe here, but we shouldn't change that macro I suggested to refactor" On 03/04/14 10:38, Tim Deegan wrote: > [CC's clipped] > > At 17:38 +0100 on 13 Mar (1394728686), Tim Deegan wrote: >> [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() >> >> Remove a useless (though harmless) extra check. > > What shall we do about this? AFAICT, Jan and Paul are in favour of the > change; Zoltan is against it. I am wavering. Keir, any opinion? > > Tim. > >> Code inspection >> --------------- >> >> Ian Campbell and I looked at this today and can't find any case where >> the existing 'rsp' test would be useful. In today's ring.h, >> >> #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ >> unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ >> unsigned int rsp = RING_SIZE(_r) - \ >> ((_r)->req_cons - (_r)->rsp_prod_pvt); \ >> req < rsp ? req : rsp; \ >> }) >> >> 'req' is the number of requests that the F/E has published and we have >> not yet consumed. 'rsp' is an odd fish, everything _except_ what we >> might call requests locally in flight, that is requests we've consumed >> but not produced a response for. We could only think of two things we >> might be trying to test for here: >> >> (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the >> way these rings normally work. In that case, as Zoltan pointed >> out, rsp must be >= req, for a well-behaved frontend: otherwise >> we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that >> req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun >> the ring. I don't think this even usefully protects against a >> malicious/buggy frontend. >> >> (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm >> told might happen in linux netback? In that case, we might plausibly >> want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of >> (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's >> not what this does. Instead, rsp will underflow to >> RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req. >> >> So in both of those cases, 'rsp' is always >= 'req' and is useless. >> >> >> Code archaeology >> ---------------- >> >> In the original ring.h, the test was a boolean, as the name implies. >> Cset 99812f40 ([NET] back: Add SG support) extended it to a count in >> the obvious manner. Looking at the original (0b788798): >> >> #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r) \ >> ( ((_r)->req_cons != (_r)->sring->req_prod ) && \ >> (((_r)->req_cons - (_r)->rsp_prod_pvt) != \ >> SRING_SIZE((_p), (_r)->sring)) ) >> >> it seems to be testing for 'there are unconsumed requests _and_ we >> have not got a full ring of consumed-but-not-responded requests'. >> This also seems to be effectively harmless: if there are unconsumed >> requests, we can't possibly have a ring full of c-b-n-r requests >> unless the frontend's request producer has overrun its response >> consumer. >> >> That code was introduced with no callers, but seems to have been >> copied from the existing netback code, which switched to use it in >> b242b208. All later users of it in the xenolinux trees are either >> brand new code or replacing a simple (req_cons - req_prod) test. The >> netback code goes back to Keir's original implementation, where it >> looked like this, in inverted form (1de448f4): >> >> /* Work to do? */ >> i = netif->tx_req_cons; >> if ( (i == netif->tx->req_prod) || >> ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) ) >> { >> netif_put(netif); >> continue; >> } >> >> Again, I don't think this test is useful: it's only interesting if >> netfront has overrun the ring, but it doesn't reliably detect that. >> >> So let's remove it. >> >> Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com> >> Signed-off-by: Tim Deegan <tim@xen.org> >> Cc: Keir Fraser <keir@xen.org> >> Cc: Jan Beulich <JBeulich@suse.com> >> Cc: Ian Campbell <Ian.Campbell@citrix.com> >> Cc: Wei Liu <wei.liu2@citrix.com> >> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> Cc: David Vrabel <david.vrabel@citrix.com> >> Cc: Alan Somers <alans@spectralogic.com> >> Cc: John Suykerbuyk <johns@spectralogic.com> >> Cc: Manuel Bouyer <bouyer@NetBSD.org> >> Cc: Paul Durrant <Paul.Durrant@citrix.com> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> >> --- >> RFC because >> - I might well be missing something here; and >> - this is a change to a public header that could be in use in any >> number of implementations; since the extra test is very cheap, and >> seems to be harmess, we might consider leaving it in place. >> - I haven't tested it at all yet. >> >> CC'ing a whole bunch of people whose code might be using this macro. >> >> diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h >> index 73e13d7..d6e23f1 100644 >> --- a/xen/include/public/io/ring.h >> +++ b/xen/include/public/io/ring.h >> @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t >> #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ >> ((_r)->sring->rsp_prod - (_r)->rsp_cons) >> >> -#ifdef __GNUC__ >> -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({ \ >> - unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ >> - unsigned int rsp = RING_SIZE(_r) - \ >> - ((_r)->req_cons - (_r)->rsp_prod_pvt); \ >> - req < rsp ? req : rsp; \ >> -}) >> -#else >> -/* Same as above, but without the nice GCC ({ ... }) syntax. */ >> #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ >> - ((((_r)->sring->req_prod - (_r)->req_cons) < \ >> - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ? \ >> - ((_r)->sring->req_prod - (_r)->req_cons) : \ >> - (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) >> -#endif >> + ((_r)->sring->req_prod - (_r)->req_cons) >> >> /* Direct access to individual ring elements, by index. */ >> #define RING_GET_REQUEST(_r, _idx) \ >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss 2014-03-06 15:53 ` Ian Campbell @ 2014-03-11 15:44 ` Ian Campbell 2014-03-11 23:24 ` Zoltan Kiss 1 sibling, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-11 15:44 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: > Hi, > > I was wondering for a while why this macro looks like this: > > #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ > ({ \ > unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ > unsigned int rsp = RING_SIZE(_r) - \ > ((_r)->req_cons - (_r)->rsp_prod_pvt); \ > req < rsp ? req : rsp; \ > }) > > I would expect to check prod - cons, like RING_HAS_UNCONSUMED_RESPONSES does: > > #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ > ((_r)->sring->rsp_prod - (_r)->rsp_cons) > > By my understanding, there is no way rsp could be smaller than req, so > there is no point having this. Am I missing something? Just looking at this again. All that stuff I said about wraparound was misleading/irrelevant since req and rsp are not the pointers, but actually the number of things. Sorry. Is it the case that this macro considers a request to be unconsumed if the *response* to a request is outstanding as well as if the request itself is still on the ring? I wonder if this apparently weird construction is due to pathological cases when one or the other end is not picking up requests/responses? i.e. trying to avoid deadlocking the ring or generating an interrupt storm when the ring it is full of one or the other or something along those lines? Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell @ 2014-03-11 23:24 ` Zoltan Kiss 2014-03-12 10:28 ` Ian Campbell 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-11 23:24 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan On 11/03/14 15:44, Ian Campbell wrote: > On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote: >> Hi, >> >> I was wondering for a while why this macro looks like this: >> >> #define RING_HAS_UNCONSUMED_REQUESTS(_r) \ >> ({ \ >> unsigned int req = (_r)->sring->req_prod - (_r)->req_cons; \ >> unsigned int rsp = RING_SIZE(_r) - \ >> ((_r)->req_cons - (_r)->rsp_prod_pvt); \ >> req < rsp ? req : rsp; \ >> }) >> >> I would expect to check prod - cons, like RING_HAS_UNCONSUMED_RESPONSES does: >> >> #define RING_HAS_UNCONSUMED_RESPONSES(_r) \ >> ((_r)->sring->rsp_prod - (_r)->rsp_cons) >> >> By my understanding, there is no way rsp could be smaller than req, so >> there is no point having this. Am I missing something? > > Just looking at this again. All that stuff I said about wraparound was > misleading/irrelevant since req and rsp are not the pointers, but > actually the number of things. Sorry. > > Is it the case that this macro considers a request to be unconsumed if > the *response* to a request is outstanding as well as if the request > itself is still on the ring? I don't think that would make sense. I think everywhere where this macro is called the caller is not interested in pending request (pending means consumed but not responded) > > I wonder if this apparently weird construction is due to pathological > cases when one or the other end is not picking up requests/responses? > i.e. trying to avoid deadlocking the ring or generating an interrupt > storm when the ring it is full of one or the other or something along > those lines? ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-11 23:24 ` Zoltan Kiss @ 2014-03-12 10:28 ` Ian Campbell 2014-03-12 10:48 ` Roger Pau Monné ` (4 more replies) 0 siblings, 5 replies; 53+ messages in thread From: Ian Campbell @ 2014-03-12 10:28 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > On 11/03/14 15:44, Ian Campbell wrote: > > Is it the case that this macro considers a request to be unconsumed if > > the *response* to a request is outstanding as well as if the request > > itself is still on the ring? > I don't think that would make sense. I think everywhere where this macro > is called the caller is not interested in pending request (pending means > consumed but not responded) It might be interested in such pending requests in some of the pathological cases I allude to in the next paragraph though? For example if the ring has unconsumed requests but there are no slots free for a response, it would be better to treat it as no unconsumed requests until space opens up for a response, otherwise something else just has to abort the processing of the request when it notices the lack of space. (I'm totally speculating here BTW, I don't have any concrete idea why things are done this way...) > > I wonder if this apparently weird construction is due to pathological > > cases when one or the other end is not picking up requests/responses? > > i.e. trying to avoid deadlocking the ring or generating an interrupt > > storm when the ring it is full of one or the other or something along > > those lines? ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 10:28 ` Ian Campbell @ 2014-03-12 10:48 ` Roger Pau Monné 2014-03-12 11:25 ` Paul Durrant ` (3 subsequent siblings) 4 siblings, 0 replies; 53+ messages in thread From: Roger Pau Monné @ 2014-03-12 10:48 UTC (permalink / raw) To: Ian Campbell, Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On 12/03/14 11:28, Ian Campbell wrote: > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >> On 11/03/14 15:44, Ian Campbell wrote: > >>> Is it the case that this macro considers a request to be unconsumed if >>> the *response* to a request is outstanding as well as if the request >>> itself is still on the ring? >> I don't think that would make sense. I think everywhere where this macro >> is called the caller is not interested in pending request (pending means >> consumed but not responded) > > It might be interested in such pending requests in some of the > pathological cases I allude to in the next paragraph though? > > For example if the ring has unconsumed requests but there are no slots > free for a response, it would be better to treat it as no unconsumed > requests until space opens up for a response, otherwise something else > just has to abort the processing of the request when it notices the lack > of space. Yes, I think this is true at least for blkback, when blkback starts processing a request it assumes that a response slot will be available in order to write the response when the request has finished processing. Roger. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 10:28 ` Ian Campbell 2014-03-12 10:48 ` Roger Pau Monné @ 2014-03-12 11:25 ` Paul Durrant 2014-03-12 11:38 ` Paul Durrant ` (2 subsequent siblings) 4 siblings, 0 replies; 53+ messages in thread From: Paul Durrant @ 2014-03-12 11:25 UTC (permalink / raw) To: Ian Campbell, Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim (Xen.org) > -----Original Message----- > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- > bounces@lists.xen.org] On Behalf Of Ian Campbell > Sent: 12 March 2014 10:28 > To: Zoltan Kiss > Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > > On 11/03/14 15:44, Ian Campbell wrote: > > > > Is it the case that this macro considers a request to be unconsumed if > > > the *response* to a request is outstanding as well as if the request > > > itself is still on the ring? > > I don't think that would make sense. I think everywhere where this macro > > is called the caller is not interested in pending request (pending means > > consumed but not responded) > > It might be interested in such pending requests in some of the > pathological cases I allude to in the next paragraph though? > > For example if the ring has unconsumed requests but there are no slots > free for a response, it would be better to treat it as no unconsumed > requests until space opens up for a response, otherwise something else > just has to abort the processing of the request when it notices the lack > of space. > Huh? The act of consuming the request will create the space for the response. If responses and requests are not balanced then I'd argue that the protocol is broken. Paul > (I'm totally speculating here BTW, I don't have any concrete idea why > things are done this way...) > > > > > I wonder if this apparently weird construction is due to pathological > > > cases when one or the other end is not picking up requests/responses? > > > i.e. trying to avoid deadlocking the ring or generating an interrupt > > > storm when the ring it is full of one or the other or something along > > > those lines? > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 10:28 ` Ian Campbell 2014-03-12 10:48 ` Roger Pau Monné 2014-03-12 11:25 ` Paul Durrant @ 2014-03-12 11:38 ` Paul Durrant 2014-03-12 14:41 ` Zoltan Kiss 2014-03-12 14:25 ` Zoltan Kiss 2014-03-12 14:27 ` Zoltan Kiss 4 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-12 11:38 UTC (permalink / raw) To: Ian Campbell, Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim (Xen.org) > -----Original Message----- > From: Paul Durrant > Sent: 12 March 2014 11:26 > To: Ian Campbell; Zoltan Kiss > Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) > Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > > -----Original Message----- > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- > > bounces@lists.xen.org] On Behalf Of Ian Campbell > > Sent: 12 March 2014 10:28 > > To: Zoltan Kiss > > Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) > > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > > > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > > > On 11/03/14 15:44, Ian Campbell wrote: > > > > > > Is it the case that this macro considers a request to be unconsumed if > > > > the *response* to a request is outstanding as well as if the request > > > > itself is still on the ring? > > > I don't think that would make sense. I think everywhere where this > macro > > > is called the caller is not interested in pending request (pending means > > > consumed but not responded) > > > > It might be interested in such pending requests in some of the > > pathological cases I allude to in the next paragraph though? > > > > For example if the ring has unconsumed requests but there are no slots > > free for a response, it would be better to treat it as no unconsumed > > requests until space opens up for a response, otherwise something else > > just has to abort the processing of the request when it notices the lack > > of space. > > > > Huh? The act of consuming the request will create the space for the > response. If responses and requests are not balanced then I'd argue that the > protocol is broken. > Actually ancient memory tells me that, unfortunately, netback's backend->frontend GSO protocol is broken in this way... it requires one more response slot than the number of requests it consumes (for the extra metadata), which means that if the frontend keeps the ring full you can get overflow. It's a bit of a tangent though, because that code doesn't use this macro (or in fact check the ring has space in any way IIRC). The prefix variant of the protocol is ok though. Paul > Paul > > > (I'm totally speculating here BTW, I don't have any concrete idea why > > things are done this way...) > > > > > > > > I wonder if this apparently weird construction is due to pathological > > > > cases when one or the other end is not picking up requests/responses? > > > > i.e. trying to avoid deadlocking the ring or generating an interrupt > > > > storm when the ring it is full of one or the other or something along > > > > those lines? > > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 11:38 ` Paul Durrant @ 2014-03-12 14:41 ` Zoltan Kiss 2014-03-12 15:23 ` Paul Durrant 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 14:41 UTC (permalink / raw) To: Paul Durrant, Ian Campbell; +Cc: xen-devel, Wei Liu, Tim (Xen.org) On 12/03/14 11:38, Paul Durrant wrote: >> -----Original Message----- >> From: Paul Durrant >> Sent: 12 March 2014 11:26 >> To: Ian Campbell; Zoltan Kiss >> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) >> Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness >> >>> -----Original Message----- >>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- >>> bounces@lists.xen.org] On Behalf Of Ian Campbell >>> Sent: 12 March 2014 10:28 >>> To: Zoltan Kiss >>> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness >>> >>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >>>> On 11/03/14 15:44, Ian Campbell wrote: >>> >>>>> Is it the case that this macro considers a request to be unconsumed if >>>>> the *response* to a request is outstanding as well as if the request >>>>> itself is still on the ring? >>>> I don't think that would make sense. I think everywhere where this >> macro >>>> is called the caller is not interested in pending request (pending means >>>> consumed but not responded) >>> >>> It might be interested in such pending requests in some of the >>> pathological cases I allude to in the next paragraph though? >>> >>> For example if the ring has unconsumed requests but there are no slots >>> free for a response, it would be better to treat it as no unconsumed >>> requests until space opens up for a response, otherwise something else >>> just has to abort the processing of the request when it notices the lack >>> of space. >>> >> >> Huh? The act of consuming the request will create the space for the >> response. If responses and requests are not balanced then I'd argue that the >> protocol is broken. >> > > Actually ancient memory tells me that, unfortunately, netback's backend->frontend GSO protocol is broken in this way... it requires one more response slot than the number of requests it consumes (for the extra metadata), which means that if the frontend keeps the ring full you can get overflow. It's a bit of a tangent though, because that code doesn't use this macro (or in fact check the ring has space in any way IIRC). The prefix variant of the protocol is ok though. I think it's not: it consumes a request for the metadata, and when the packet is grant copied to the guest, it creates a response for that slot as well. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 14:41 ` Zoltan Kiss @ 2014-03-12 15:23 ` Paul Durrant 2014-03-12 15:42 ` Wei Liu 0 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-12 15:23 UTC (permalink / raw) To: Zoltan Kiss, Ian Campbell; +Cc: xen-devel, Wei Liu, Tim (Xen.org) > -----Original Message----- > From: Zoltan Kiss > Sent: 12 March 2014 14:42 > To: Paul Durrant; Ian Campbell > Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On 12/03/14 11:38, Paul Durrant wrote: > >> -----Original Message----- > >> From: Paul Durrant > >> Sent: 12 March 2014 11:26 > >> To: Ian Campbell; Zoltan Kiss > >> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) > >> Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > >> > >>> -----Original Message----- > >>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- > >>> bounces@lists.xen.org] On Behalf Of Ian Campbell > >>> Sent: 12 March 2014 10:28 > >>> To: Zoltan Kiss > >>> Cc: xen-devel@lists.xenproject.org; Wei Liu; Tim (Xen.org) > >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS > oddness > >>> > >>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > >>>> On 11/03/14 15:44, Ian Campbell wrote: > >>> > >>>>> Is it the case that this macro considers a request to be unconsumed if > >>>>> the *response* to a request is outstanding as well as if the request > >>>>> itself is still on the ring? > >>>> I don't think that would make sense. I think everywhere where this > >> macro > >>>> is called the caller is not interested in pending request (pending means > >>>> consumed but not responded) > >>> > >>> It might be interested in such pending requests in some of the > >>> pathological cases I allude to in the next paragraph though? > >>> > >>> For example if the ring has unconsumed requests but there are no slots > >>> free for a response, it would be better to treat it as no unconsumed > >>> requests until space opens up for a response, otherwise something else > >>> just has to abort the processing of the request when it notices the lack > >>> of space. > >>> > >> > >> Huh? The act of consuming the request will create the space for the > >> response. If responses and requests are not balanced then I'd argue that > the > >> protocol is broken. > >> > > > > Actually ancient memory tells me that, unfortunately, netback's backend- > >frontend GSO protocol is broken in this way... it requires one more > response slot than the number of requests it consumes (for the extra > metadata), which means that if the frontend keeps the ring full you can get > overflow. It's a bit of a tangent though, because that code doesn't use this > macro (or in fact check the ring has space in any way IIRC). The prefix variant > of the protocol is ok though. > > I think it's not: it consumes a request for the metadata, and when the > packet is grant copied to the guest, it creates a response for that slot > as well. As explained verbally, it doesn't consume a request for the 'extra' info. Let me elaborate here for the benefit of the list... In xenvif_gop_skb(), in the non-prefix GSO case, a single request is consumed for the header along with a meta slot which is used to hold the GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() for the header, but then *before* moving onto the next meta slot it makes an 'extra' response for the GSO metadata. So - one meta slot - one request consumed, but two responses produced. So this mechanism totally relies on the netfront driver not completely filling the shared ring. If it ever does, you'll get overflow. Paul ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 15:23 ` Paul Durrant @ 2014-03-12 15:42 ` Wei Liu 2014-03-12 15:56 ` Paul Durrant ` (2 more replies) 0 siblings, 3 replies; 53+ messages in thread From: Wei Liu @ 2014-03-12 15:42 UTC (permalink / raw) To: Paul Durrant; +Cc: xen-devel, Wei Liu, Ian Campbell, Zoltan Kiss, Tim (Xen.org) On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: [...] > > > Actually ancient memory tells me that, unfortunately, netback's backend- > > >frontend GSO protocol is broken in this way... it requires one more > > response slot than the number of requests it consumes (for the extra > > metadata), which means that if the frontend keeps the ring full you can get > > overflow. It's a bit of a tangent though, because that code doesn't use this > > macro (or in fact check the ring has space in any way IIRC). The prefix variant > > of the protocol is ok though. > > > > I think it's not: it consumes a request for the metadata, and when the > > packet is grant copied to the guest, it creates a response for that slot > > as well. > > As explained verbally, it doesn't consume a request for the 'extra' info. Let me elaborate here for the benefit of the list... > > In xenvif_gop_skb(), in the non-prefix GSO case, a single request is consumed for the header along with a meta slot which is used to hold the GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() for the header, but then *before* moving onto the next meta slot it makes an 'extra' response for the GSO metadata. So - one meta slot - one request consumed, but two responses produced. > So this mechanism totally relies on the netfront driver not completely filling the shared ring. If it ever does, you'll get overflow. > (... which reminds me of the heisenbug Sander is seeing.) But do we not check for there's enough space in the ring before procceeding? Wei. > Paul > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 15:42 ` Wei Liu @ 2014-03-12 15:56 ` Paul Durrant 2014-03-12 16:02 ` Paul Durrant 2014-03-12 16:13 ` Zoltan Kiss 2 siblings, 0 replies; 53+ messages in thread From: Paul Durrant @ 2014-03-12 15:56 UTC (permalink / raw) Cc: xen-devel, Wei Liu, Ian Campbell, Zoltan Kiss, Tim (Xen.org) > -----Original Message----- > From: Wei Liu [mailto:wei.liu2@citrix.com] > Sent: 12 March 2014 15:43 > To: Paul Durrant > Cc: Zoltan Kiss; Ian Campbell; xen-devel@lists.xenproject.org; Wei Liu; Tim > (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: > [...] > > > > Actually ancient memory tells me that, unfortunately, netback's > backend- > > > >frontend GSO protocol is broken in this way... it requires one more > > > response slot than the number of requests it consumes (for the extra > > > metadata), which means that if the frontend keeps the ring full you can > get > > > overflow. It's a bit of a tangent though, because that code doesn't use > this > > > macro (or in fact check the ring has space in any way IIRC). The prefix > variant > > > of the protocol is ok though. > > > > > > I think it's not: it consumes a request for the metadata, and when the > > > packet is grant copied to the guest, it creates a response for that slot > > > as well. > > > > As explained verbally, it doesn't consume a request for the 'extra' info. Let > me elaborate here for the benefit of the list... > > > > In xenvif_gop_skb(), in the non-prefix GSO case, a single request is > consumed for the header along with a meta slot which is used to hold the > GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() > for the header, but then *before* moving onto the next meta slot it makes > an 'extra' response for the GSO metadata. So - one meta slot - one request > consumed, but two responses produced. > > So this mechanism totally relies on the netfront driver not completely filling > the shared ring. If it ever does, you'll get overflow. > > > > (... which reminds me of the heisenbug Sander is seeing.) > > But do we not check for there's enough space in the ring before > procceeding? > Apparently not, but TBH I cannot figure out how on earth this ever worked at all. If netback is genuinely issuing 2 reponses for 1 consumed request then it must trash an unconsumed request at some point - but from my reading of the code that really does seem to be what it's doing. Paul > Wei. > > > Paul > > > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 15:42 ` Wei Liu 2014-03-12 15:56 ` Paul Durrant @ 2014-03-12 16:02 ` Paul Durrant 2014-03-12 16:13 ` Zoltan Kiss 2 siblings, 0 replies; 53+ messages in thread From: Paul Durrant @ 2014-03-12 16:02 UTC (permalink / raw) Cc: xen-devel, Wei Liu, Ian Campbell, Zoltan Kiss, Tim (Xen.org) > -----Original Message----- > From: Paul Durrant > Sent: 12 March 2014 15:57 > To: Wei Liu > Cc: Zoltan Kiss; Ian Campbell; xen-devel@lists.xenproject.org; Wei Liu; Tim > (Xen.org) > Subject: RE: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > > -----Original Message----- > > From: Wei Liu [mailto:wei.liu2@citrix.com] > > Sent: 12 March 2014 15:43 > > To: Paul Durrant > > Cc: Zoltan Kiss; Ian Campbell; xen-devel@lists.xenproject.org; Wei Liu; Tim > > (Xen.org) > > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > > > On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: > > [...] > > > > > Actually ancient memory tells me that, unfortunately, netback's > > backend- > > > > >frontend GSO protocol is broken in this way... it requires one more > > > > response slot than the number of requests it consumes (for the extra > > > > metadata), which means that if the frontend keeps the ring full you can > > get > > > > overflow. It's a bit of a tangent though, because that code doesn't use > > this > > > > macro (or in fact check the ring has space in any way IIRC). The prefix > > variant > > > > of the protocol is ok though. > > > > > > > > I think it's not: it consumes a request for the metadata, and when the > > > > packet is grant copied to the guest, it creates a response for that slot > > > > as well. > > > > > > As explained verbally, it doesn't consume a request for the 'extra' info. > Let > > me elaborate here for the benefit of the list... > > > > > > In xenvif_gop_skb(), in the non-prefix GSO case, a single request is > > consumed for the header along with a meta slot which is used to hold the > > GSO data. Later on in xenvif_rx_action() the code calls > make_rx_response() > > for the header, but then *before* moving onto the next meta slot it makes > > an 'extra' response for the GSO metadata. So - one meta slot - one request > > consumed, but two responses produced. > > > So this mechanism totally relies on the netfront driver not completely > filling > > the shared ring. If it ever does, you'll get overflow. > > > > > > > (... which reminds me of the heisenbug Sander is seeing.) > > > > But do we not check for there's enough space in the ring before > > procceeding? > > > > Apparently not, but TBH I cannot figure out how on earth this ever worked at > all. If netback is genuinely issuing 2 reponses for 1 consumed request then it > must trash an unconsumed request at some point - but from my reading of > the code that really does seem to be what it's doing. > Aha, no - I've found the sneaky bit of code that restores balance to the force... It's at the bottom of xenvif_gop_frag_copy(). In the non-prefix GSO case req_cons is bumped if it's processing the head frag. Phew! Paul > Paul > > > Wei. > > > > > Paul > > > > > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 15:42 ` Wei Liu 2014-03-12 15:56 ` Paul Durrant 2014-03-12 16:02 ` Paul Durrant @ 2014-03-12 16:13 ` Zoltan Kiss 2014-03-12 16:42 ` Paul Durrant 2 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 16:13 UTC (permalink / raw) To: Wei Liu, Paul Durrant; +Cc: xen-devel, Tim (Xen.org), Ian Campbell On 12/03/14 15:42, Wei Liu wrote: > On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: > [...] >>>> Actually ancient memory tells me that, unfortunately, netback's backend- >>>> frontend GSO protocol is broken in this way... it requires one more >>> response slot than the number of requests it consumes (for the extra >>> metadata), which means that if the frontend keeps the ring full you can get >>> overflow. It's a bit of a tangent though, because that code doesn't use this >>> macro (or in fact check the ring has space in any way IIRC). The prefix variant >>> of the protocol is ok though. >>> >>> I think it's not: it consumes a request for the metadata, and when the >>> packet is grant copied to the guest, it creates a response for that slot >>> as well. >> >> As explained verbally, it doesn't consume a request for the 'extra' info. Let me elaborate here for the benefit of the list... >> >> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is consumed for the header along with a meta slot which is used to hold the GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() for the header, but then *before* moving onto the next meta slot it makes an 'extra' response for the GSO metadata. So - one meta slot - one request consumed, but two responses produced. >> So this mechanism totally relies on the netfront driver not completely filling the shared ring. If it ever does, you'll get overflow. >> > > (... which reminds me of the heisenbug Sander is seeing.) > > But do we not check for there's enough space in the ring before > procceeding? Thinking further what Paul said, we might be actually OK. At the end of xenvif_gop_frag_copy there is this: if (*head && ((1 << gso_type) & vif->gso_mask)) vif->rx.req_cons++; So although netback doesn't call RING_GET_REQUEST to consume the request, it does so by increasing req_cons. And netfront automagically detect that in xennet_get_extras, and calls xennet_move_rx_slot to move that buffer to req_prod_pvt. Same happens when the backend send a response that it couldn't use the slot and it doesn't contain any data. Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 16:13 ` Zoltan Kiss @ 2014-03-12 16:42 ` Paul Durrant 2014-03-12 19:06 ` Zoltan Kiss 0 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-12 16:42 UTC (permalink / raw) To: Zoltan Kiss, Wei Liu; +Cc: xen-devel, Tim (Xen.org), Ian Campbell > -----Original Message----- > From: Zoltan Kiss > Sent: 12 March 2014 16:14 > To: Wei Liu; Paul Durrant > Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On 12/03/14 15:42, Wei Liu wrote: > > On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: > > [...] > >>>> Actually ancient memory tells me that, unfortunately, netback's > backend- > >>>> frontend GSO protocol is broken in this way... it requires one more > >>> response slot than the number of requests it consumes (for the extra > >>> metadata), which means that if the frontend keeps the ring full you can > get > >>> overflow. It's a bit of a tangent though, because that code doesn't use > this > >>> macro (or in fact check the ring has space in any way IIRC). The prefix > variant > >>> of the protocol is ok though. > >>> > >>> I think it's not: it consumes a request for the metadata, and when the > >>> packet is grant copied to the guest, it creates a response for that slot > >>> as well. > >> > >> As explained verbally, it doesn't consume a request for the 'extra' info. > Let me elaborate here for the benefit of the list... > >> > >> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is > consumed for the header along with a meta slot which is used to hold the > GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() > for the header, but then *before* moving onto the next meta slot it makes > an 'extra' response for the GSO metadata. So - one meta slot - one request > consumed, but two responses produced. > >> So this mechanism totally relies on the netfront driver not completely > filling the shared ring. If it ever does, you'll get overflow. > >> > > > > (... which reminds me of the heisenbug Sander is seeing.) > > > > But do we not check for there's enough space in the ring before > > procceeding? > > Thinking further what Paul said, we might be actually OK. At the end of > xenvif_gop_frag_copy there is this: > > if (*head && ((1 << gso_type) & vif->gso_mask)) > vif->rx.req_cons++; > > So although netback doesn't call RING_GET_REQUEST to consume the > request, it does so by increasing req_cons. And netfront automagically > detect that in xennet_get_extras, and calls xennet_move_rx_slot to move > that buffer to req_prod_pvt. Same happens when the backend send a > response that it couldn't use the slot and it doesn't contain any data. > Still leaves the issue that the frontend has no idea which request id was consumed to create the 'hole' for the extra info, unless it has its own mapping of ring idx to id - i.e. the id in the xen_netif_rx_response struct is basically useless. Paul > Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 16:42 ` Paul Durrant @ 2014-03-12 19:06 ` Zoltan Kiss 2014-03-13 9:26 ` Paul Durrant 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 19:06 UTC (permalink / raw) To: Paul Durrant, Wei Liu; +Cc: xen-devel, Tim (Xen.org), Ian Campbell On 12/03/14 16:42, Paul Durrant wrote: >> -----Original Message----- >> From: Zoltan Kiss >> Sent: 12 March 2014 16:14 >> To: Wei Liu; Paul Durrant >> Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org) >> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness >> >> On 12/03/14 15:42, Wei Liu wrote: >>> On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: >>> [...] >>>>>> Actually ancient memory tells me that, unfortunately, netback's >> backend- >>>>>> frontend GSO protocol is broken in this way... it requires one more >>>>> response slot than the number of requests it consumes (for the extra >>>>> metadata), which means that if the frontend keeps the ring full you can >> get >>>>> overflow. It's a bit of a tangent though, because that code doesn't use >> this >>>>> macro (or in fact check the ring has space in any way IIRC). The prefix >> variant >>>>> of the protocol is ok though. >>>>> >>>>> I think it's not: it consumes a request for the metadata, and when the >>>>> packet is grant copied to the guest, it creates a response for that slot >>>>> as well. >>>> >>>> As explained verbally, it doesn't consume a request for the 'extra' info. >> Let me elaborate here for the benefit of the list... >>>> >>>> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is >> consumed for the header along with a meta slot which is used to hold the >> GSO data. Later on in xenvif_rx_action() the code calls make_rx_response() >> for the header, but then *before* moving onto the next meta slot it makes >> an 'extra' response for the GSO metadata. So - one meta slot - one request >> consumed, but two responses produced. >>>> So this mechanism totally relies on the netfront driver not completely >> filling the shared ring. If it ever does, you'll get overflow. >>>> >>> >>> (... which reminds me of the heisenbug Sander is seeing.) >>> >>> But do we not check for there's enough space in the ring before >>> procceeding? >> >> Thinking further what Paul said, we might be actually OK. At the end of >> xenvif_gop_frag_copy there is this: >> >> if (*head && ((1 << gso_type) & vif->gso_mask)) >> vif->rx.req_cons++; >> >> So although netback doesn't call RING_GET_REQUEST to consume the >> request, it does so by increasing req_cons. And netfront automagically >> detect that in xennet_get_extras, and calls xennet_move_rx_slot to move >> that buffer to req_prod_pvt. Same happens when the backend send a >> response that it couldn't use the slot and it doesn't contain any data. >> > > Still leaves the issue that the frontend has no idea which request id was consumed to create the 'hole' for the extra info, unless it has its own mapping of ring idx to id - i.e. the id in the xen_netif_rx_response struct is basically useless. It only works because netback puts the responses here to the place of the original requests. If you check, frontend doesn't even check what's the id in the response, it only use it in an error printout. Seems like it is a bit fragile ... Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 19:06 ` Zoltan Kiss @ 2014-03-13 9:26 ` Paul Durrant 2014-03-13 10:02 ` Ian Campbell 0 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-13 9:26 UTC (permalink / raw) To: Zoltan Kiss, Wei Liu; +Cc: xen-devel, Tim (Xen.org), Ian Campbell > -----Original Message----- > From: Zoltan Kiss > Sent: 12 March 2014 19:06 > To: Paul Durrant; Wei Liu > Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On 12/03/14 16:42, Paul Durrant wrote: > >> -----Original Message----- > >> From: Zoltan Kiss > >> Sent: 12 March 2014 16:14 > >> To: Wei Liu; Paul Durrant > >> Cc: Ian Campbell; xen-devel@lists.xenproject.org; Tim (Xen.org) > >> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > >> > >> On 12/03/14 15:42, Wei Liu wrote: > >>> On Wed, Mar 12, 2014 at 03:23:09PM +0000, Paul Durrant wrote: > >>> [...] > >>>>>> Actually ancient memory tells me that, unfortunately, netback's > >> backend- > >>>>>> frontend GSO protocol is broken in this way... it requires one more > >>>>> response slot than the number of requests it consumes (for the extra > >>>>> metadata), which means that if the frontend keeps the ring full you > can > >> get > >>>>> overflow. It's a bit of a tangent though, because that code doesn't use > >> this > >>>>> macro (or in fact check the ring has space in any way IIRC). The prefix > >> variant > >>>>> of the protocol is ok though. > >>>>> > >>>>> I think it's not: it consumes a request for the metadata, and when the > >>>>> packet is grant copied to the guest, it creates a response for that slot > >>>>> as well. > >>>> > >>>> As explained verbally, it doesn't consume a request for the 'extra' info. > >> Let me elaborate here for the benefit of the list... > >>>> > >>>> In xenvif_gop_skb(), in the non-prefix GSO case, a single request is > >> consumed for the header along with a meta slot which is used to hold the > >> GSO data. Later on in xenvif_rx_action() the code calls > make_rx_response() > >> for the header, but then *before* moving onto the next meta slot it > makes > >> an 'extra' response for the GSO metadata. So - one meta slot - one > request > >> consumed, but two responses produced. > >>>> So this mechanism totally relies on the netfront driver not completely > >> filling the shared ring. If it ever does, you'll get overflow. > >>>> > >>> > >>> (... which reminds me of the heisenbug Sander is seeing.) > >>> > >>> But do we not check for there's enough space in the ring before > >>> procceeding? > >> > >> Thinking further what Paul said, we might be actually OK. At the end of > >> xenvif_gop_frag_copy there is this: > >> > >> if (*head && ((1 << gso_type) & vif->gso_mask)) > >> vif->rx.req_cons++; > >> > >> So although netback doesn't call RING_GET_REQUEST to consume the > >> request, it does so by increasing req_cons. And netfront automagically > >> detect that in xennet_get_extras, and calls xennet_move_rx_slot to > move > >> that buffer to req_prod_pvt. Same happens when the backend send a > >> response that it couldn't use the slot and it doesn't contain any data. > >> > > > > Still leaves the issue that the frontend has no idea which request id was > consumed to create the 'hole' for the extra info, unless it has its own > mapping of ring idx to id - i.e. the id in the xen_netif_rx_response struct is > basically useless. > > It only works because netback puts the responses here to the place of > the original requests. If you check, frontend doesn't even check what's > the id in the response, it only use it in an error printout. Seems like > it is a bit fragile ... > ... and completely non-obvious. Perhaps we should just bin the id field from the response struct in the canonical headers to avoid new frontends believing that it is useful, and document that they should use their own mapping of idx to id, and guarantee that netback will always respond in order on the rx ring. Paul ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-13 9:26 ` Paul Durrant @ 2014-03-13 10:02 ` Ian Campbell 2014-03-13 10:58 ` Paul Durrant 0 siblings, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-13 10:02 UTC (permalink / raw) To: Paul Durrant; +Cc: xen-devel, Wei Liu, Zoltan Kiss, Tim (Xen.org) On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote: > guarantee that netback will always respond in order on the rx ring. I don't believe this is guaranteed byt the protocol, even if it happens to work out that way for the Linux netback implementation. Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-13 10:02 ` Ian Campbell @ 2014-03-13 10:58 ` Paul Durrant 2014-03-13 12:19 ` Ian Campbell 0 siblings, 1 reply; 53+ messages in thread From: Paul Durrant @ 2014-03-13 10:58 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Zoltan Kiss, Tim (Xen.org) > -----Original Message----- > From: Ian Campbell > Sent: 13 March 2014 10:03 > To: Paul Durrant > Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote: > > guarantee that netback will always respond in order on the rx ring. > > I don't believe this is guaranteed byt the protocol, even if it happens > to work out that way for the Linux netback implementation. > Indeed it's not guaranteed by the protocol but it's the only way that (non-prefix) GSO can possibly work, which is why I think something really needs to be documented. Paul ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-13 10:58 ` Paul Durrant @ 2014-03-13 12:19 ` Ian Campbell 2014-03-13 12:28 ` Zoltan Kiss 0 siblings, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-13 12:19 UTC (permalink / raw) To: Paul Durrant; +Cc: xen-devel, Wei Liu, Zoltan Kiss, Tim (Xen.org) On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote: > > -----Original Message----- > > From: Ian Campbell > > Sent: 13 March 2014 10:03 > > To: Paul Durrant > > Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org) > > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > > > On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote: > > > guarantee that netback will always respond in order on the rx ring. > > > > I don't believe this is guaranteed byt the protocol, even if it happens > > to work out that way for the Linux netback implementation. > > > > Indeed it's not guaranteed by the protocol but it's the only way that > (non-prefix) GSO can possibly work, which is why I think something > really needs to be documented. There are possibly ordering constraints on the completion of the gso info slot and the rest but I don't think there are any such restrictions on the fragment slots themselves, are there? Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-13 12:19 ` Ian Campbell @ 2014-03-13 12:28 ` Zoltan Kiss 2014-03-13 12:29 ` Paul Durrant 2014-03-13 12:44 ` Ian Campbell 0 siblings, 2 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-13 12:28 UTC (permalink / raw) To: Ian Campbell, Paul Durrant; +Cc: xen-devel, Wei Liu, Tim (Xen.org) On 13/03/14 12:19, Ian Campbell wrote: > On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote: >>> -----Original Message----- >>> From: Ian Campbell >>> Sent: 13 March 2014 10:03 >>> To: Paul Durrant >>> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org) >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness >>> >>> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote: >>>> guarantee that netback will always respond in order on the rx ring. >>> >>> I don't believe this is guaranteed byt the protocol, even if it happens >>> to work out that way for the Linux netback implementation. >>> >> >> Indeed it's not guaranteed by the protocol but it's the only way that >> (non-prefix) GSO can possibly work, which is why I think something >> really needs to be documented. > > There are possibly ordering constraints on the completion of the gso > info slot and the rest but I don't think there are any such restrictions > on the fragment slots themselves, are there? Well, the Linux frontend doesn't take the id from the response, it just assume that the response is in the same slot as the request. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-13 12:28 ` Zoltan Kiss @ 2014-03-13 12:29 ` Paul Durrant 2014-03-13 12:44 ` Ian Campbell 1 sibling, 0 replies; 53+ messages in thread From: Paul Durrant @ 2014-03-13 12:29 UTC (permalink / raw) To: Zoltan Kiss, Ian Campbell; +Cc: xen-devel, Wei Liu, Tim (Xen.org) > -----Original Message----- > From: Zoltan Kiss > Sent: 13 March 2014 12:29 > To: Ian Campbell; Paul Durrant > Cc: Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org) > Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > > On 13/03/14 12:19, Ian Campbell wrote: > > On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote: > >>> -----Original Message----- > >>> From: Ian Campbell > >>> Sent: 13 March 2014 10:03 > >>> To: Paul Durrant > >>> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org) > >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS > oddness > >>> > >>> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote: > >>>> guarantee that netback will always respond in order on the rx ring. > >>> > >>> I don't believe this is guaranteed byt the protocol, even if it happens > >>> to work out that way for the Linux netback implementation. > >>> > >> > >> Indeed it's not guaranteed by the protocol but it's the only way that > >> (non-prefix) GSO can possibly work, which is why I think something > >> really needs to be documented. > > > > There are possibly ordering constraints on the completion of the gso > > info slot and the rest but I don't think there are any such restrictions > > on the fragment slots themselves, are there? > > Well, the Linux frontend doesn't take the id from the response, it just > assume that the response is in the same slot as the request. ... so for compatibilities sake, the ring had better complete in order! Paul ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-13 12:28 ` Zoltan Kiss 2014-03-13 12:29 ` Paul Durrant @ 2014-03-13 12:44 ` Ian Campbell 1 sibling, 0 replies; 53+ messages in thread From: Ian Campbell @ 2014-03-13 12:44 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Paul Durrant, Wei Liu, Tim (Xen.org) On Thu, 2014-03-13 at 12:28 +0000, Zoltan Kiss wrote: > On 13/03/14 12:19, Ian Campbell wrote: > > On Thu, 2014-03-13 at 10:58 +0000, Paul Durrant wrote: > >>> -----Original Message----- > >>> From: Ian Campbell > >>> Sent: 13 March 2014 10:03 > >>> To: Paul Durrant > >>> Cc: Zoltan Kiss; Wei Liu; xen-devel@lists.xenproject.org; Tim (Xen.org) > >>> Subject: Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness > >>> > >>> On Thu, 2014-03-13 at 09:26 +0000, Paul Durrant wrote: > >>>> guarantee that netback will always respond in order on the rx ring. > >>> > >>> I don't believe this is guaranteed byt the protocol, even if it happens > >>> to work out that way for the Linux netback implementation. > >>> > >> > >> Indeed it's not guaranteed by the protocol but it's the only way that > >> (non-prefix) GSO can possibly work, which is why I think something > >> really needs to be documented. > > > > There are possibly ordering constraints on the completion of the gso > > info slot and the rest but I don't think there are any such restrictions > > on the fragment slots themselves, are there? > > Well, the Linux frontend doesn't take the id from the response, it just > assume that the response is in the same slot as the request. Oops, I was thinking of guest tx not rx, sorry. Tx can complete in any order (txrsp->id is the index into tx_skbs[]) but not rx apparently. Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 10:28 ` Ian Campbell ` (2 preceding siblings ...) 2014-03-12 11:38 ` Paul Durrant @ 2014-03-12 14:25 ` Zoltan Kiss 2014-03-12 14:27 ` Zoltan Kiss 4 siblings, 0 replies; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 14:25 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan On 12/03/14 10:28, Ian Campbell wrote: > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >> On 11/03/14 15:44, Ian Campbell wrote: > >>> Is it the case that this macro considers a request to be unconsumed if >>> the *response* to a request is outstanding as well as if the request >>> itself is still on the ring? >> I don't think that would make sense. I think everywhere where this macro >> is called the caller is not interested in pending request (pending means >> consumed but not responded) > > It might be interested in such pending requests in some of the > pathological cases I allude to in the next paragraph though? > > For example if the ring has unconsumed requests but there are no slots > free for a response, it would be better to treat it as no unconsumed > requests until space opens up for a response, otherwise something else > just has to abort the processing of the request when it notices the lack > of space. I agree with Paul on this: if you consume a request, it becomes pending until you finish your work on it. If you finish, that slot ceases to be a pending one, and you free up space for a response. So unless a request needs more than one response, we are good. > > (I'm totally speculating here BTW, I don't have any concrete idea why > things are done this way...) > > >>> I wonder if this apparently weird construction is due to pathological >>> cases when one or the other end is not picking up requests/responses? >>> i.e. trying to avoid deadlocking the ring or generating an interrupt >>> storm when the ring it is full of one or the other or something along >>> those lines? > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 10:28 ` Ian Campbell ` (3 preceding siblings ...) 2014-03-12 14:25 ` Zoltan Kiss @ 2014-03-12 14:27 ` Zoltan Kiss 2014-03-12 14:30 ` Ian Campbell 4 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 14:27 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan On 12/03/14 10:28, Ian Campbell wrote: > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >> On 11/03/14 15:44, Ian Campbell wrote: > >>> Is it the case that this macro considers a request to be unconsumed if >>> the *response* to a request is outstanding as well as if the request >>> itself is still on the ring? >> I don't think that would make sense. I think everywhere where this macro >> is called the caller is not interested in pending request (pending means >> consumed but not responded) > > It might be interested in such pending requests in some of the > pathological cases I allude to in the next paragraph though? > > For example if the ring has unconsumed requests but there are no slots > free for a response, it would be better to treat it as no unconsumed > requests until space opens up for a response, otherwise something else > just has to abort the processing of the request when it notices the lack > of space. > > (I'm totally speculating here BTW, I don't have any concrete idea why > things are done this way...) > > >>> I wonder if this apparently weird construction is due to pathological >>> cases when one or the other end is not picking up requests/responses? >>> i.e. trying to avoid deadlocking the ring or generating an interrupt >>> storm when the ring it is full of one or the other or something along >>> those lines? > > Also, let me quote again my example about when rsp makes sense: "To clarify what does this do, let me show an example: req_prod = 253 req_cons = 256 rsp_prod_pvt = 0 req will be UINT_MAX-2, as the values changed in the meantime, and rsp is 0. It's reasonable to return 0 here, as the backend hasn't replied anything yet, so we clearly shouldn't have any unconsumed request in the ring." Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 14:27 ` Zoltan Kiss @ 2014-03-12 14:30 ` Ian Campbell 2014-03-12 15:14 ` Zoltan Kiss 0 siblings, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-12 14:30 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: > On 12/03/14 10:28, Ian Campbell wrote: > > On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > >> On 11/03/14 15:44, Ian Campbell wrote: > > > >>> Is it the case that this macro considers a request to be unconsumed if > >>> the *response* to a request is outstanding as well as if the request > >>> itself is still on the ring? > >> I don't think that would make sense. I think everywhere where this macro > >> is called the caller is not interested in pending request (pending means > >> consumed but not responded) > > > > It might be interested in such pending requests in some of the > > pathological cases I allude to in the next paragraph though? > > > > For example if the ring has unconsumed requests but there are no slots > > free for a response, it would be better to treat it as no unconsumed > > requests until space opens up for a response, otherwise something else > > just has to abort the processing of the request when it notices the lack > > of space. > > > > (I'm totally speculating here BTW, I don't have any concrete idea why > > things are done this way...) > > > > > >>> I wonder if this apparently weird construction is due to pathological > >>> cases when one or the other end is not picking up requests/responses? > >>> i.e. trying to avoid deadlocking the ring or generating an interrupt > >>> storm when the ring it is full of one or the other or something along > >>> those lines? > > > > > > Also, let me quote again my example about when rsp makes sense: > > "To clarify what does this do, let me show an example: > req_prod = 253 > req_cons = 256 > rsp_prod_pvt = 0 I think to make sense of this I need to see the sequence of reads/writes from both parties in a sensible ordering which would result in reads showing the above. i.e. a demonstration of the race not just an assertion that if the values are read as is things makes sense. > > req will be UINT_MAX-2, as the values changed in the meantime, and rsp > is 0. It's reasonable to return 0 here, as the backend hasn't replied > anything yet, so we clearly shouldn't have any unconsumed request in the > ring." > > Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 14:30 ` Ian Campbell @ 2014-03-12 15:14 ` Zoltan Kiss 2014-03-12 15:37 ` Ian Campbell 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 15:14 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan On 12/03/14 14:30, Ian Campbell wrote: > On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: >> On 12/03/14 10:28, Ian Campbell wrote: >>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >>>> On 11/03/14 15:44, Ian Campbell wrote: >>> >>>>> Is it the case that this macro considers a request to be unconsumed if >>>>> the *response* to a request is outstanding as well as if the request >>>>> itself is still on the ring? >>>> I don't think that would make sense. I think everywhere where this macro >>>> is called the caller is not interested in pending request (pending means >>>> consumed but not responded) >>> >>> It might be interested in such pending requests in some of the >>> pathological cases I allude to in the next paragraph though? >>> >>> For example if the ring has unconsumed requests but there are no slots >>> free for a response, it would be better to treat it as no unconsumed >>> requests until space opens up for a response, otherwise something else >>> just has to abort the processing of the request when it notices the lack >>> of space. >>> >>> (I'm totally speculating here BTW, I don't have any concrete idea why >>> things are done this way...) >>> >>> >>>>> I wonder if this apparently weird construction is due to pathological >>>>> cases when one or the other end is not picking up requests/responses? >>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt >>>>> storm when the ring it is full of one or the other or something along >>>>> those lines? >>> >>> >> >> Also, let me quote again my example about when rsp makes sense: >> >> "To clarify what does this do, let me show an example: >> req_prod = 253 >> req_cons = 256 >> rsp_prod_pvt = 0 > > I think to make sense of this I need to see the sequence of reads/writes > from both parties in a sensible ordering which would result in reads > showing the above. i.e. a demonstration of the race not just an > assertion that if the values are read as is things makes sense. Let me extend it: - callback reads req_prod = 253 - frontend writes req_prod, now its 256 - backend picks it up, and consumes those slots, req_cons become 256 - callback reads req_cons = 256 - req is UINT_MAX-3 therefore, but actually there isn't any request to consume, it should be 0 - callback reads rsp_prod_pvt = 0, because backend haven't responded to any requests - rsp is therefore 256 - (256 -0) = 0 - the macro returns rsp, as it is smaller. And that's good, because despite the macro failed to determine the number of unconsumed requests, at least it detected that the ring is full with consumed but not replied requests, so there shouldn't be any unconsumed req And I call this best effort because if rsp_prod_pvt is e.g. 10, rsp will be then 10 as well, we return it, and the caller thinks there are unconsumed requests, despite there isn't any. Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 15:14 ` Zoltan Kiss @ 2014-03-12 15:37 ` Ian Campbell 2014-03-12 17:14 ` Zoltan Kiss 0 siblings, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-12 15:37 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote: > On 12/03/14 14:30, Ian Campbell wrote: > > On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: > >> On 12/03/14 10:28, Ian Campbell wrote: > >>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > >>>> On 11/03/14 15:44, Ian Campbell wrote: > >>> > >>>>> Is it the case that this macro considers a request to be unconsumed if > >>>>> the *response* to a request is outstanding as well as if the request > >>>>> itself is still on the ring? > >>>> I don't think that would make sense. I think everywhere where this macro > >>>> is called the caller is not interested in pending request (pending means > >>>> consumed but not responded) > >>> > >>> It might be interested in such pending requests in some of the > >>> pathological cases I allude to in the next paragraph though? > >>> > >>> For example if the ring has unconsumed requests but there are no slots > >>> free for a response, it would be better to treat it as no unconsumed > >>> requests until space opens up for a response, otherwise something else > >>> just has to abort the processing of the request when it notices the lack > >>> of space. > >>> > >>> (I'm totally speculating here BTW, I don't have any concrete idea why > >>> things are done this way...) > >>> > >>> > >>>>> I wonder if this apparently weird construction is due to pathological > >>>>> cases when one or the other end is not picking up requests/responses? > >>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt > >>>>> storm when the ring it is full of one or the other or something along > >>>>> those lines? > >>> > >>> > >> > >> Also, let me quote again my example about when rsp makes sense: > >> > >> "To clarify what does this do, let me show an example: > >> req_prod = 253 > >> req_cons = 256 > >> rsp_prod_pvt = 0 > > > > I think to make sense of this I need to see the sequence of reads/writes > > from both parties in a sensible ordering which would result in reads > > showing the above. i.e. a demonstration of the race not just an > > assertion that if the values are read as is things makes sense. > > Let me extend it: > > - callback reads req_prod = 253 callback == backend? Which context is this code running in? Which part of the system is the callback logically part of? > - frontend writes req_prod, now its 256 > - backend picks it up, and consumes those slots, req_cons become 256 "it"? Do you mean req_prod? Please be precise. > - callback reads req_cons = 256 But the backend has also seen req_prod at 256 at this point, hasn't it? You said so above but said "it" so I'm not sure. If the callback is part of the backend then why hasn't it also seen this? > - req is UINT_MAX-3 therefore, but actually there isn't any request to > consume, it should be 0 Only if something is ignoring the fact that it has seen req_prod == 256. If callback is some separate entity to backend within dom0 then what you have here is an internal inconsistency in dom0 AFAICT. IOW it seems like you are missing some synchronisation and/or have two different entities acting as backend. > - callback reads rsp_prod_pvt = 0, because backend haven't responded to > any requests > - rsp is therefore 256 - (256 -0) = 0 > - the macro returns rsp, as it is smaller. And that's good, because > despite the macro failed to determine the number of unconsumed requests, > at least it detected that the ring is full with consumed but not replied > requests, so there shouldn't be any unconsumed req > > And I call this best effort because if rsp_prod_pvt is e.g. 10, rsp will > be then 10 as well, we return it, and the caller thinks there are > unconsumed requests, despite there isn't any. > > Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 15:37 ` Ian Campbell @ 2014-03-12 17:14 ` Zoltan Kiss 2014-03-12 17:43 ` Ian Campbell 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 17:14 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan On 12/03/14 15:37, Ian Campbell wrote: > On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote: >> On 12/03/14 14:30, Ian Campbell wrote: >>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: >>>> On 12/03/14 10:28, Ian Campbell wrote: >>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >>>>>> On 11/03/14 15:44, Ian Campbell wrote: >>>>> >>>>>>> Is it the case that this macro considers a request to be unconsumed if >>>>>>> the *response* to a request is outstanding as well as if the request >>>>>>> itself is still on the ring? >>>>>> I don't think that would make sense. I think everywhere where this macro >>>>>> is called the caller is not interested in pending request (pending means >>>>>> consumed but not responded) >>>>> >>>>> It might be interested in such pending requests in some of the >>>>> pathological cases I allude to in the next paragraph though? >>>>> >>>>> For example if the ring has unconsumed requests but there are no slots >>>>> free for a response, it would be better to treat it as no unconsumed >>>>> requests until space opens up for a response, otherwise something else >>>>> just has to abort the processing of the request when it notices the lack >>>>> of space. >>>>> >>>>> (I'm totally speculating here BTW, I don't have any concrete idea why >>>>> things are done this way...) >>>>> >>>>> >>>>>>> I wonder if this apparently weird construction is due to pathological >>>>>>> cases when one or the other end is not picking up requests/responses? >>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt >>>>>>> storm when the ring it is full of one or the other or something along >>>>>>> those lines? >>>>> >>>>> >>>> >>>> Also, let me quote again my example about when rsp makes sense: >>>> >>>> "To clarify what does this do, let me show an example: >>>> req_prod = 253 >>>> req_cons = 256 >>>> rsp_prod_pvt = 0 >>> >>> I think to make sense of this I need to see the sequence of reads/writes >>> from both parties in a sensible ordering which would result in reads >>> showing the above. i.e. a demonstration of the race not just an >>> assertion that if the values are read as is things makes sense. >> >> Let me extend it: >> >> - callback reads req_prod = 253 > > callback == backend? Which context is this code running in? Which part > of the system is the callback logically part of? Yes, it is part of the backend, the function which handles when we can release a slot back. With grant copy we don't have such thing, but with mapping xenvif_zerocopy_callback does this (or in classic kernel, it had a different name, but we called it page destructor). It can run from any context, it depends on who calls kfree_skb. > >> - frontend writes req_prod, now its 256 >> - backend picks it up, and consumes those slots, req_cons become 256 > > "it"? Do you mean req_prod? Please be precise. Yes, I meant req_prod. And backend means NAPI instance here. > >> - callback reads req_cons = 256 > > But the backend has also seen req_prod at 256 at this point, hasn't it? > You said so above but said "it" so I'm not sure. If the callback is part > of the backend then why hasn't it also seen this? Yes, the NAPI instance have seen it, but the callback has not. It were called from another context. > >> - req is UINT_MAX-3 therefore, but actually there isn't any request to >> consume, it should be 0 > > Only if something is ignoring the fact that it has seen req_prod == 256. > > If callback is some separate entity to backend within dom0 then what you > have here is an internal inconsistency in dom0 AFAICT. IOW it seems like > you are missing some synchronisation and/or have two different entities > acting as backend. The callback only needs to know whether it should poke the NAPI instance or not. There is this special case, if there are still a few unconsumed request, but the ring is nearly full of pending requests and xenvif_tx_pending_slots_available says NAPI should bail out, we have to schedule it back once we have enough free pending slots again. As I said in an another mail of this thread, this poking happens in the callback, but actually it should be moved to the dealloc thread. However thinking further, this whole xenvif_tx_pending_slots_available stuff seems to be unnecessary to me: It supposed to check if we have enough slot in the pending ring for the maximum number of possible slots, otherwise the backend bails out. It does so because if the backend start to consume the requests from the shared ring but runs out free slots in the pending ring, we are in trouble. But the pending ring supposed to have the same amount of slots as the shared one. And a consumed but not responded slot from the shared ring means a used slot in the pending ring. Therefore the frontend won't be able to push more than (MAX_PENDING_REQS - nr_pending_reqs(vif)) requests to the ring anyway. At least in practice, as MAX_PENDING_REQS = RING_SIZE(...). If we could bind the two to each other directly, we can get rid of this unnecessary checking, and whoever release the used pending slots should not poke the NAPI instance, because the frontend will call an interrupt if it sends a new packet anyway. > >> - callback reads rsp_prod_pvt = 0, because backend haven't responded to >> any requests >> - rsp is therefore 256 - (256 -0) = 0 >> - the macro returns rsp, as it is smaller. And that's good, because >> despite the macro failed to determine the number of unconsumed requests, >> at least it detected that the ring is full with consumed but not replied >> requests, so there shouldn't be any unconsumed req >> >> And I call this best effort because if rsp_prod_pvt is e.g. 10, rsp will >> be then 10 as well, we return it, and the caller thinks there are >> unconsumed requests, despite there isn't any. >> >> Zoli > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 17:14 ` Zoltan Kiss @ 2014-03-12 17:43 ` Ian Campbell 2014-03-12 21:10 ` Zoltan Kiss 0 siblings, 1 reply; 53+ messages in thread From: Ian Campbell @ 2014-03-12 17:43 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote: > On 12/03/14 15:37, Ian Campbell wrote: > > On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote: > >> On 12/03/14 14:30, Ian Campbell wrote: > >>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: > >>>> On 12/03/14 10:28, Ian Campbell wrote: > >>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > >>>>>> On 11/03/14 15:44, Ian Campbell wrote: > >>>>> > >>>>>>> Is it the case that this macro considers a request to be unconsumed if > >>>>>>> the *response* to a request is outstanding as well as if the request > >>>>>>> itself is still on the ring? > >>>>>> I don't think that would make sense. I think everywhere where this macro > >>>>>> is called the caller is not interested in pending request (pending means > >>>>>> consumed but not responded) > >>>>> > >>>>> It might be interested in such pending requests in some of the > >>>>> pathological cases I allude to in the next paragraph though? > >>>>> > >>>>> For example if the ring has unconsumed requests but there are no slots > >>>>> free for a response, it would be better to treat it as no unconsumed > >>>>> requests until space opens up for a response, otherwise something else > >>>>> just has to abort the processing of the request when it notices the lack > >>>>> of space. > >>>>> > >>>>> (I'm totally speculating here BTW, I don't have any concrete idea why > >>>>> things are done this way...) > >>>>> > >>>>> > >>>>>>> I wonder if this apparently weird construction is due to pathological > >>>>>>> cases when one or the other end is not picking up requests/responses? > >>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt > >>>>>>> storm when the ring it is full of one or the other or something along > >>>>>>> those lines? > >>>>> > >>>>> > >>>> > >>>> Also, let me quote again my example about when rsp makes sense: > >>>> > >>>> "To clarify what does this do, let me show an example: > >>>> req_prod = 253 > >>>> req_cons = 256 > >>>> rsp_prod_pvt = 0 > >>> > >>> I think to make sense of this I need to see the sequence of reads/writes > >>> from both parties in a sensible ordering which would result in reads > >>> showing the above. i.e. a demonstration of the race not just an > >>> assertion that if the values are read as is things makes sense. > >> > >> Let me extend it: > >> > >> - callback reads req_prod = 253 > > > > callback == backend? Which context is this code running in? Which part > > of the system is the callback logically part of? > Yes, it is part of the backend, the function which handles when we can > release a slot back. With grant copy we don't have such thing, but with > mapping xenvif_zerocopy_callback does this (or in classic kernel, it had > a different name, but we called it page destructor). It can run from any > context, it depends on who calls kfree_skb. I think this is the root of the problem. The pv protocols really assume one entity on either end is moving/updating the ring pointers. If you have two entities on one end both doing this then you need to make sure you have appropriate locking in place. In the classic kernel wasn't the dealloc ring actually processed from the tx/rx action -- i.e. it was forced back into a single context at the point where the ring was actually touched. Aren't you batching things up in a similar way? Perhaps you just need to fix where you are draining those batches to be properly locked against other updaters of the ring? > >> - req is UINT_MAX-3 therefore, but actually there isn't any request to > >> consume, it should be 0 > > > > Only if something is ignoring the fact that it has seen req_prod == 256. > > > > If callback is some separate entity to backend within dom0 then what you > > have here is an internal inconsistency in dom0 AFAICT. IOW it seems like > > you are missing some synchronisation and/or have two different entities > > acting as backend. > The callback only needs to know whether it should poke the NAPI instance > or not. There is this special case, if there are still a few unconsumed > request, but the ring is nearly full of pending requests and > xenvif_tx_pending_slots_available says NAPI should bail out, we have to > schedule it back once we have enough free pending slots again. > As I said in an another mail of this thread, this poking happens in the > callback, but actually it should be moved to the dealloc thread. Right, I think that's what I was trying to say above. I missed that other mail I'm afraid (or didn't grok it). > However > thinking further, this whole xenvif_tx_pending_slots_available stuff > seems to be unnecessary to me: > It supposed to check if we have enough slot in the pending ring for the > maximum number of possible slots, otherwise the backend bails out. It > does so because if the backend start to consume the requests from the > shared ring but runs out free slots in the pending ring, we are in > trouble. But the pending ring supposed to have the same amount of slots > as the shared one. And a consumed but not responded slot from the shared > ring means a used slot in the pending ring. Therefore the frontend won't > be able to push more than (MAX_PENDING_REQS - nr_pending_reqs(vif)) > requests to the ring anyway. At least in practice, as MAX_PENDING_REQS = > RING_SIZE(...). If we could bind the two to each other directly, we can > get rid of this unnecessary checking, and whoever release the used > pending slots should not poke the NAPI instance, because the frontend > will call an interrupt if it sends a new packet anyway. The frontend tries to do a certain amount of event elision, using the event pointer etc, so you'd need to be careful since getting that wrong will either stall or result in more interrupt than necessary, which has a back impact on batching. But perhaps it could work. In any case, it seems like doing the poke from the callback is wrong and we should revert the patches which DaveM already applied and revisit this aspect of things, do you agree? Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 17:43 ` Ian Campbell @ 2014-03-12 21:10 ` Zoltan Kiss 2014-03-13 10:04 ` Ian Campbell 0 siblings, 1 reply; 53+ messages in thread From: Zoltan Kiss @ 2014-03-12 21:10 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel, Wei Liu, Tim Deegan On 12/03/14 17:43, Ian Campbell wrote: > On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote: >> On 12/03/14 15:37, Ian Campbell wrote: >>> On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote: >>>> On 12/03/14 14:30, Ian Campbell wrote: >>>>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: >>>>>> On 12/03/14 10:28, Ian Campbell wrote: >>>>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: >>>>>>>> On 11/03/14 15:44, Ian Campbell wrote: >>>>>>> >>>>>>>>> Is it the case that this macro considers a request to be unconsumed if >>>>>>>>> the *response* to a request is outstanding as well as if the request >>>>>>>>> itself is still on the ring? >>>>>>>> I don't think that would make sense. I think everywhere where this macro >>>>>>>> is called the caller is not interested in pending request (pending means >>>>>>>> consumed but not responded) >>>>>>> >>>>>>> It might be interested in such pending requests in some of the >>>>>>> pathological cases I allude to in the next paragraph though? >>>>>>> >>>>>>> For example if the ring has unconsumed requests but there are no slots >>>>>>> free for a response, it would be better to treat it as no unconsumed >>>>>>> requests until space opens up for a response, otherwise something else >>>>>>> just has to abort the processing of the request when it notices the lack >>>>>>> of space. >>>>>>> >>>>>>> (I'm totally speculating here BTW, I don't have any concrete idea why >>>>>>> things are done this way...) >>>>>>> >>>>>>> >>>>>>>>> I wonder if this apparently weird construction is due to pathological >>>>>>>>> cases when one or the other end is not picking up requests/responses? >>>>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt >>>>>>>>> storm when the ring it is full of one or the other or something along >>>>>>>>> those lines? >>>>>>> >>>>>>> >>>>>> >>>>>> Also, let me quote again my example about when rsp makes sense: >>>>>> >>>>>> "To clarify what does this do, let me show an example: >>>>>> req_prod = 253 >>>>>> req_cons = 256 >>>>>> rsp_prod_pvt = 0 >>>>> >>>>> I think to make sense of this I need to see the sequence of reads/writes >>>>> from both parties in a sensible ordering which would result in reads >>>>> showing the above. i.e. a demonstration of the race not just an >>>>> assertion that if the values are read as is things makes sense. >>>> >>>> Let me extend it: >>>> >>>> - callback reads req_prod = 253 >>> >>> callback == backend? Which context is this code running in? Which part >>> of the system is the callback logically part of? >> Yes, it is part of the backend, the function which handles when we can >> release a slot back. With grant copy we don't have such thing, but with >> mapping xenvif_zerocopy_callback does this (or in classic kernel, it had >> a different name, but we called it page destructor). It can run from any >> context, it depends on who calls kfree_skb. > > I think this is the root of the problem. The pv protocols really assume > one entity on either end is moving/updating the ring pointers. If you There is only _one_ entity moving/updating the ring pointers. Everyone else is just reading it. The callback, xenvif_tx_interrupt, xenvif_check_rx_xenvif. > have two entities on one end both doing this then you need to make sure > you have appropriate locking in place. > > In the classic kernel wasn't the dealloc ring actually processed from > the tx/rx action -- i.e. it was forced back into a single context at the > point where the ring was actually touched. > > Aren't you batching things up in a similar way? Perhaps you just need to > fix where you are draining those batches to be properly locked against > other updaters of the ring? > >>>> - req is UINT_MAX-3 therefore, but actually there isn't any request to >>>> consume, it should be 0 >>> >>> Only if something is ignoring the fact that it has seen req_prod == 256. >>> >>> If callback is some separate entity to backend within dom0 then what you >>> have here is an internal inconsistency in dom0 AFAICT. IOW it seems like >>> you are missing some synchronisation and/or have two different entities >>> acting as backend. >> The callback only needs to know whether it should poke the NAPI instance >> or not. There is this special case, if there are still a few unconsumed >> request, but the ring is nearly full of pending requests and >> xenvif_tx_pending_slots_available says NAPI should bail out, we have to >> schedule it back once we have enough free pending slots again. >> As I said in an another mail of this thread, this poking happens in the >> callback, but actually it should be moved to the dealloc thread. > > Right, I think that's what I was trying to say above. I missed that > other mail I'm afraid (or didn't grok it). > >> However >> thinking further, this whole xenvif_tx_pending_slots_available stuff >> seems to be unnecessary to me: >> It supposed to check if we have enough slot in the pending ring for the >> maximum number of possible slots, otherwise the backend bails out. It >> does so because if the backend start to consume the requests from the >> shared ring but runs out free slots in the pending ring, we are in >> trouble. But the pending ring supposed to have the same amount of slots >> as the shared one. And a consumed but not responded slot from the shared >> ring means a used slot in the pending ring. Therefore the frontend won't >> be able to push more than (MAX_PENDING_REQS - nr_pending_reqs(vif)) >> requests to the ring anyway. At least in practice, as MAX_PENDING_REQS = >> RING_SIZE(...). If we could bind the two to each other directly, we can >> get rid of this unnecessary checking, and whoever release the used >> pending slots should not poke the NAPI instance, because the frontend >> will call an interrupt if it sends a new packet anyway. > > The frontend tries to do a certain amount of event elision, using the > event pointer etc, so you'd need to be careful since getting that wrong > will either stall or result in more interrupt than necessary, which has > a back impact on batching. But perhaps it could work. I've actually tried that with a simple iperf throughput test, and it works. I'll run it through XenRT later to see if it goes wrong somewhere. > In any case, it seems like doing the poke from the callback is wrong and > we should revert the patches which DaveM already applied and revisit > this aspect of things, do you agree? I've just sent in a patch to fix that. I think the reason why I haven't seen any issue is that the in this situation there are plenty of outstanding packets, and all of their callback will schedule NAPI again. Chances are quite small that the dealloc thread couldn't release enough slots in the meantime. Zoli ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RING_HAS_UNCONSUMED_REQUESTS oddness 2014-03-12 21:10 ` Zoltan Kiss @ 2014-03-13 10:04 ` Ian Campbell 0 siblings, 0 replies; 53+ messages in thread From: Ian Campbell @ 2014-03-13 10:04 UTC (permalink / raw) To: Zoltan Kiss; +Cc: xen-devel, Wei Liu, Tim Deegan On Wed, 2014-03-12 at 21:10 +0000, Zoltan Kiss wrote: > On 12/03/14 17:43, Ian Campbell wrote: > > On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote: > >> On 12/03/14 15:37, Ian Campbell wrote: > >>> On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote: > >>>> On 12/03/14 14:30, Ian Campbell wrote: > >>>>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote: > >>>>>> On 12/03/14 10:28, Ian Campbell wrote: > >>>>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote: > >>>>>>>> On 11/03/14 15:44, Ian Campbell wrote: > >>>>>>> > >>>>>>>>> Is it the case that this macro considers a request to be unconsumed if > >>>>>>>>> the *response* to a request is outstanding as well as if the request > >>>>>>>>> itself is still on the ring? > >>>>>>>> I don't think that would make sense. I think everywhere where this macro > >>>>>>>> is called the caller is not interested in pending request (pending means > >>>>>>>> consumed but not responded) > >>>>>>> > >>>>>>> It might be interested in such pending requests in some of the > >>>>>>> pathological cases I allude to in the next paragraph though? > >>>>>>> > >>>>>>> For example if the ring has unconsumed requests but there are no slots > >>>>>>> free for a response, it would be better to treat it as no unconsumed > >>>>>>> requests until space opens up for a response, otherwise something else > >>>>>>> just has to abort the processing of the request when it notices the lack > >>>>>>> of space. > >>>>>>> > >>>>>>> (I'm totally speculating here BTW, I don't have any concrete idea why > >>>>>>> things are done this way...) > >>>>>>> > >>>>>>> > >>>>>>>>> I wonder if this apparently weird construction is due to pathological > >>>>>>>>> cases when one or the other end is not picking up requests/responses? > >>>>>>>>> i.e. trying to avoid deadlocking the ring or generating an interrupt > >>>>>>>>> storm when the ring it is full of one or the other or something along > >>>>>>>>> those lines? > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> Also, let me quote again my example about when rsp makes sense: > >>>>>> > >>>>>> "To clarify what does this do, let me show an example: > >>>>>> req_prod = 253 > >>>>>> req_cons = 256 > >>>>>> rsp_prod_pvt = 0 > >>>>> > >>>>> I think to make sense of this I need to see the sequence of reads/writes > >>>>> from both parties in a sensible ordering which would result in reads > >>>>> showing the above. i.e. a demonstration of the race not just an > >>>>> assertion that if the values are read as is things makes sense. > >>>> > >>>> Let me extend it: > >>>> > >>>> - callback reads req_prod = 253 > >>> > >>> callback == backend? Which context is this code running in? Which part > >>> of the system is the callback logically part of? > >> Yes, it is part of the backend, the function which handles when we can > >> release a slot back. With grant copy we don't have such thing, but with > >> mapping xenvif_zerocopy_callback does this (or in classic kernel, it had > >> a different name, but we called it page destructor). It can run from any > >> context, it depends on who calls kfree_skb. > > > > I think this is the root of the problem. The pv protocols really assume > > one entity on either end is moving/updating the ring pointers. If you > There is only _one_ entity moving/updating the ring pointers. Everyone > else is just reading it. The callback, xenvif_tx_interrupt, > xenvif_check_rx_xenvif. Perhaps I should have said "accessing" rather than "moving/updating". Even if only one entity is reading the ring if two entities are reading at least one of them is going to see inconsistencies. > > In any case, it seems like doing the poke from the callback is wrong and > > we should revert the patches which DaveM already applied and revisit > > this aspect of things, do you agree? > I've just sent in a patch to fix that. Thanks, I'll take a look. > I think the reason why I haven't > seen any issue is that the in this situation there are plenty of > outstanding packets, and all of their callback will schedule NAPI again. > Chances are quite small that the dealloc thread couldn't release enough > slots in the meantime. Seems plausible enough. That means that the issue would be seen rarely on quiet systems, which would be a nightmare to diagnose I think. Ian. ^ permalink raw reply [flat|nested] 53+ messages in thread
end of thread, other threads:[~2014-04-03 15:35 UTC | newest] Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss 2014-03-06 15:53 ` Ian Campbell 2014-03-06 16:31 ` Zoltan Kiss 2014-03-06 17:30 ` Tim Deegan 2014-03-06 21:39 ` Zoltan Kiss 2014-03-07 9:23 ` Paul Durrant 2014-03-07 17:43 ` Zoltan Kiss 2014-03-07 12:02 ` Wei Liu 2014-03-07 18:58 ` Zoltan Kiss 2014-03-11 15:55 ` Ian Campbell 2014-03-11 23:34 ` Zoltan Kiss 2014-03-13 16:38 ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan 2014-03-22 14:18 ` Zoltan Kiss 2014-03-22 17:14 ` Tim Deegan 2014-03-24 7:38 ` Jan Beulich 2014-03-24 9:39 ` Paul Durrant 2014-03-24 9:59 ` Jan Beulich 2014-03-24 11:03 ` Paul Durrant 2014-03-24 12:23 ` Zoltan Kiss 2014-03-24 13:52 ` Paul Durrant 2014-03-24 23:55 ` Zoltan Kiss 2014-04-03 9:38 ` Tim Deegan 2014-04-03 15:34 ` Zoltan Kiss 2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell 2014-03-11 23:24 ` Zoltan Kiss 2014-03-12 10:28 ` Ian Campbell 2014-03-12 10:48 ` Roger Pau Monné 2014-03-12 11:25 ` Paul Durrant 2014-03-12 11:38 ` Paul Durrant 2014-03-12 14:41 ` Zoltan Kiss 2014-03-12 15:23 ` Paul Durrant 2014-03-12 15:42 ` Wei Liu 2014-03-12 15:56 ` Paul Durrant 2014-03-12 16:02 ` Paul Durrant 2014-03-12 16:13 ` Zoltan Kiss 2014-03-12 16:42 ` Paul Durrant 2014-03-12 19:06 ` Zoltan Kiss 2014-03-13 9:26 ` Paul Durrant 2014-03-13 10:02 ` Ian Campbell 2014-03-13 10:58 ` Paul Durrant 2014-03-13 12:19 ` Ian Campbell 2014-03-13 12:28 ` Zoltan Kiss 2014-03-13 12:29 ` Paul Durrant 2014-03-13 12:44 ` Ian Campbell 2014-03-12 14:25 ` Zoltan Kiss 2014-03-12 14:27 ` Zoltan Kiss 2014-03-12 14:30 ` Ian Campbell 2014-03-12 15:14 ` Zoltan Kiss 2014-03-12 15:37 ` Ian Campbell 2014-03-12 17:14 ` Zoltan Kiss 2014-03-12 17:43 ` Ian Campbell 2014-03-12 21:10 ` Zoltan Kiss 2014-03-13 10:04 ` Ian Campbell
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.