All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zoltan Kiss <zoltan.kiss@citrix.com>
To: Tim Deegan <tim@xen.org>, keir@xen.org, xen-devel@lists.xenproject.org
Subject: Re: [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
Date: Thu, 3 Apr 2014 16:34:58 +0100	[thread overview]
Message-ID: <533D7FA2.4010209@citrix.com> (raw)
In-Reply-To: <20140403093821.GA41313@deinos.phlegethon.org>

Paul mentioned the lkml links doesn't work at the moment, but a short 
summary of why rsp is important: it gives us the maximum possible amount 
of requests should be on the ring (number of slots which are not 
consumed-but-not-responded), so even if the frontend lies to us through 
increasing req_prod to crazy values, backend can avoid consuming slots 
which are still consumed-but-not-responded.
Here is my explanation from that another thread:

"The backend doesn't see what the guest does with the responses, and
that's OK, it's the guest's problem, after netback increased
rsp_prod_pvt it doesn't really care. But as soon as the guest start
placing new requests after rsp_prod_pvt, or just increasing req_prod so
req_prod-rsp_prod_pvt > XEN_NETIF_TX_RING_SIZE, it becomes an issue.
So far this xenvif_tx_pending_slots_available indirectly saved us from
consuming requests overwriting still pending requests, but the guest
could still overwrote our responses. But again, that's still the guests
problem, we had the original request saved in the pending ring data. If
the guest went too far, build_gops killed the vif when req_prod-req_cons 
 > XEN_NETIF_TX_RING_SIZE
Indirect above means it only happened because the pending ring had the
same size, and the used slots has a 1-to-1 mapping for slots between
rsp_prod_pvt and req_cons. So xenvif_tx_pending_slots_available also
means (req_cons - rsp_prod_pvt) + XEN_NETBK_LEGACY_SLOTS_MAX <
XEN_NETIF_TX_RING_SIZE (does this look familiar?
But consuming overrunning requests after rsp_prod_pvt is a problem:
- NAPI instance races with dealloc thread over the slots. The first
reads them as requests, the second writes them as responses
- the NAPI instance overwrites used pending slots as well, so skb frag
release go wrong etc.
But the current RING_HAS_UNCONSUMED_REQUESTS fortunately saves us here,
let me explain it through an example:
rsp_prod_pvt = 0
req_cons = 253
req_prod = 258

Therefore:
req = 5
rsp = 3

So in xenvif_tx_build_gops work_to_do will be 3, and in
xenvif_count_requests we bail out when we see that the packet actually
exceeds that.
So in the end, we are safe here, but we shouldn't change that macro I
suggested to refactor"

On 03/04/14 10:38, Tim Deegan wrote:
> [CC's clipped]
>
> At 17:38 +0100 on 13 Mar (1394728686), Tim Deegan wrote:
>> [RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS()
>>
>> Remove a useless (though harmless) extra check.
>
> What shall we do about this?  AFAICT, Jan and Paul are in favour of the
> change; Zoltan is against it.  I am wavering.  Keir, any opinion?
>
> Tim.
>
>> Code inspection
>> ---------------
>>
>> Ian Campbell and I looked at this today and can't find any case where
>> the existing 'rsp' test would be useful.  In today's ring.h,
>>
>> #define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
>>      unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
>>      unsigned int rsp = RING_SIZE(_r) -                                  \
>>          ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
>>      req < rsp ? req : rsp;                                              \
>> })
>>
>> 'req' is the number of requests that the F/E has published and we have
>> not yet consumed.  'rsp' is an odd fish, everything _except_ what we
>> might call requests locally in flight, that is requests we've consumed
>> but not produced a response for.  We could only think of two things we
>> might be trying to test for here:
>>
>> (a) req_cons runs ahead of rsp_prod_pvt, as would be expected from the
>>      way these rings normally work.  In that case, as Zoltan pointed
>>      out, rsp must be >= req, for a well-behaved frontend: otherwise
>>      we'd have req_prod > rsp_prod_pvt + RING_SIZE, implying that
>>      req_prod > rsp_cons + RING_SIZE, i.e. the frontend has overrun
>>      the ring.  I don't think this even usefully protects against a
>>      malicious/buggy frontend.
>>
>> (b) rsp_prod_pvt runs ahead of req_cons, which seems wrong but I'm
>>      told might happen in linux netback?  In that case, we might plausibly
>>      want to try to limit RING_HAS_UNCONSUMED_REQUESTS to max of
>>      (req_prod - req_cons) and (req_prod - rsp_prod_pvt), but that's
>>      not what this does.  Instead, rsp will underflow to
>>      RING_SIZE + (rsp_prod_pvt - req_cons), which will always be >= req.
>>
>> So in both of those cases, 'rsp' is always >= 'req' and is useless.
>>
>>
>> Code archaeology
>> ----------------
>>
>> In the original ring.h, the test was a boolean, as the name implies.
>> Cset 99812f40 ([NET] back: Add SG support) extended it to a count in
>> the obvious manner.  Looking at the original (0b788798):
>>
>> #define RING_HAS_UNCONSUMED_REQUESTS(_p, _r)                            \
>>     ( ((_r)->req_cons != (_r)->sring->req_prod ) &&                      \
>>       (((_r)->req_cons - (_r)->rsp_prod_pvt) !=                          \
>>        SRING_SIZE((_p), (_r)->sring)) )
>>
>> it seems to be testing for 'there are unconsumed requests _and_ we
>> have not got a full ring of consumed-but-not-responded requests'.
>> This also seems to be effectively harmless: if there are unconsumed
>> requests, we can't possibly have a ring full of c-b-n-r requests
>> unless the frontend's request producer has overrun its response
>> consumer.
>>
>> That code was introduced with no callers, but seems to have been
>> copied from the existing netback code, which switched to use it in
>> b242b208.  All later users of it in the xenolinux trees are either
>> brand new code or replacing a simple (req_cons - req_prod) test.  The
>> netback code goes back to Keir's original implementation, where it
>> looked like this, in inverted form (1de448f4):
>>
>>          /* Work to do? */
>>          i = netif->tx_req_cons;
>>          if ( (i == netif->tx->req_prod) ||
>>               ((i-netif->tx_resp_prod) == NETIF_TX_RING_SIZE) )
>>          {
>>              netif_put(netif);
>>              continue;
>>          }
>>
>> Again, I don't think this test is useful: it's only interesting if
>> netfront has overrun the ring, but it doesn't reliably detect that.
>>
>> So let's remove it.
>>
>> Suggested-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>> Signed-off-by: Tim Deegan <tim@xen.org>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Jan Beulich <JBeulich@suse.com>
>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: David Vrabel <david.vrabel@citrix.com>
>> Cc: Alan Somers <alans@spectralogic.com>
>> Cc: John Suykerbuyk <johns@spectralogic.com>
>> Cc: Manuel Bouyer <bouyer@NetBSD.org>
>> Cc: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>
>> ---
>> RFC because
>> - I might well be missing something here; and
>> - this is a change to a public header that could be in use in any
>>    number of implementations; since the extra test is very cheap, and
>>    seems to be harmess, we might consider leaving it in place.
>> - I haven't tested it at all yet.
>>
>> CC'ing a whole bunch of people whose code might be using this macro.
>>
>> diff --git a/xen/include/public/io/ring.h b/xen/include/public/io/ring.h
>> index 73e13d7..d6e23f1 100644
>> --- a/xen/include/public/io/ring.h
>> +++ b/xen/include/public/io/ring.h
>> @@ -192,21 +192,8 @@ typedef struct __name##_back_ring __name##_back_ring_t
>>   #define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
>>       ((_r)->sring->rsp_prod - (_r)->rsp_cons)
>>
>> -#ifdef __GNUC__
>> -#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
>> -    unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
>> -    unsigned int rsp = RING_SIZE(_r) -                                  \
>> -        ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
>> -    req < rsp ? req : rsp;                                              \
>> -})
>> -#else
>> -/* Same as above, but without the nice GCC ({ ... }) syntax. */
>>   #define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
>> -    ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
>> -      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
>> -     ((_r)->sring->req_prod - (_r)->req_cons) :                         \
>> -     (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
>> -#endif
>> +    ((_r)->sring->req_prod - (_r)->req_cons)
>>
>>   /* Direct access to individual ring elements, by index. */
>>   #define RING_GET_REQUEST(_r, _idx)                                      \
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

  reply	other threads:[~2014-04-03 15:35 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-06 15:47 RING_HAS_UNCONSUMED_REQUESTS oddness Zoltan Kiss
2014-03-06 15:53 ` Ian Campbell
2014-03-06 16:31   ` Zoltan Kiss
2014-03-06 17:30     ` Tim Deegan
2014-03-06 21:39       ` Zoltan Kiss
2014-03-07  9:23         ` Paul Durrant
2014-03-07 17:43           ` Zoltan Kiss
2014-03-07 12:02         ` Wei Liu
2014-03-07 18:58           ` Zoltan Kiss
2014-03-11 15:55         ` Ian Campbell
2014-03-11 23:34           ` Zoltan Kiss
2014-03-13 16:38       ` [PATCH RFC] xen/public/ring.h: simplify RING_HAS_UNCONSUMED_REQUESTS() Tim Deegan
2014-03-22 14:18         ` Zoltan Kiss
2014-03-22 17:14           ` Tim Deegan
2014-03-24  7:38             ` Jan Beulich
2014-03-24  9:39               ` Paul Durrant
2014-03-24  9:59                 ` Jan Beulich
2014-03-24 11:03                   ` Paul Durrant
2014-03-24 12:23               ` Zoltan Kiss
2014-03-24 13:52                 ` Paul Durrant
2014-03-24 23:55                   ` Zoltan Kiss
2014-04-03  9:38         ` Tim Deegan
2014-04-03 15:34           ` Zoltan Kiss [this message]
2014-03-11 15:44 ` RING_HAS_UNCONSUMED_REQUESTS oddness Ian Campbell
2014-03-11 23:24   ` Zoltan Kiss
2014-03-12 10:28     ` Ian Campbell
2014-03-12 10:48       ` Roger Pau Monné
2014-03-12 11:25       ` Paul Durrant
2014-03-12 11:38       ` Paul Durrant
2014-03-12 14:41         ` Zoltan Kiss
2014-03-12 15:23           ` Paul Durrant
2014-03-12 15:42             ` Wei Liu
2014-03-12 15:56               ` Paul Durrant
2014-03-12 16:02               ` Paul Durrant
2014-03-12 16:13               ` Zoltan Kiss
2014-03-12 16:42                 ` Paul Durrant
2014-03-12 19:06                   ` Zoltan Kiss
2014-03-13  9:26                     ` Paul Durrant
2014-03-13 10:02                       ` Ian Campbell
2014-03-13 10:58                         ` Paul Durrant
2014-03-13 12:19                           ` Ian Campbell
2014-03-13 12:28                             ` Zoltan Kiss
2014-03-13 12:29                               ` Paul Durrant
2014-03-13 12:44                               ` Ian Campbell
2014-03-12 14:25       ` Zoltan Kiss
2014-03-12 14:27       ` Zoltan Kiss
2014-03-12 14:30         ` Ian Campbell
2014-03-12 15:14           ` Zoltan Kiss
2014-03-12 15:37             ` Ian Campbell
2014-03-12 17:14               ` Zoltan Kiss
2014-03-12 17:43                 ` Ian Campbell
2014-03-12 21:10                   ` Zoltan Kiss
2014-03-13 10:04                     ` Ian Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=533D7FA2.4010209@citrix.com \
    --to=zoltan.kiss@citrix.com \
    --cc=keir@xen.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.