* [PATCH V2] xen/netfront: handle compound page fragments on transmit
@ 2012-11-20 16:00 Ian Campbell
  2012-11-20 16:27 ` Eric Dumazet
From: Ian Campbell @ 2012-11-20 16:00 UTC
  To: netdev
  Cc: Ian Campbell, xen-devel, Eric Dumazet, Konrad Rzeszutek Wilk,
	ANNIE LI, Sander Eikelenboom, Stefan Bader

An SKB paged fragment can consist of a compound page with order > 0.
However the netchannel protocol deals only in PAGE_SIZE frames.

Handle this in xennet_make_frags by iterating over the frames which
make up the page.

This is the netfront equivalent to 6a8ed462f16b for netback.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: netdev@vger.kernel.org
Cc: xen-devel@lists.xen.org
Cc: Eric Dumazet <edumazet@google.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: ANNIE LI <annie.li@oracle.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Stefan Bader <stefan.bader@canonical.com>
---
v2: check we have enough room in the ring and that the other end can
    cope with the number of slots in a single frame
---
 drivers/net/xen-netfront.c |   95 ++++++++++++++++++++++++++++++++++----------
 1 files changed, 74 insertions(+), 21 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index caa0110..d9b16f4 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -452,29 +452,82 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 	/* Grant backend access to each skb fragment page. */
 	for (i = 0; i < frags; i++) {
 		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
+		struct page *page = skb_frag_page(frag);
+		unsigned long size = skb_frag_size(frag);
+		unsigned long offset = frag->page_offset;
 
-		tx->flags |= XEN_NETTXF_more_data;
+		/* Data must not cross a page boundary. */
+		BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
 
-		id = get_id_from_freelist(&np->tx_skb_freelist, np->tx_skbs);
-		np->tx_skbs[id].skb = skb_get(skb);
-		tx = RING_GET_REQUEST(&np->tx, prod++);
-		tx->id = id;
-		ref = gnttab_claim_grant_reference(&np->gref_tx_head);
-		BUG_ON((signed short)ref < 0);
+		/* Skip unused frames from start of page */
+		page += offset >> PAGE_SHIFT;
+		offset &= ~PAGE_MASK;
 
-		mfn = pfn_to_mfn(page_to_pfn(skb_frag_page(frag)));
-		gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
-						mfn, GNTMAP_readonly);
+		while (size > 0) {
+			unsigned long bytes;
 
-		tx->gref = np->grant_tx_ref[id] = ref;
-		tx->offset = frag->page_offset;
-		tx->size = skb_frag_size(frag);
-		tx->flags = 0;
+			BUG_ON(offset >= PAGE_SIZE);
+
+			bytes = PAGE_SIZE - offset;
+			if (bytes > size)
+				bytes = size;
+
+			tx->flags |= XEN_NETTXF_more_data;
+
+			id = get_id_from_freelist(&np->tx_skb_freelist, np->tx_skbs);
+			np->tx_skbs[id].skb = skb_get(skb);
+			tx = RING_GET_REQUEST(&np->tx, prod++);
+			tx->id = id;
+			ref = gnttab_claim_grant_reference(&np->gref_tx_head);
+			BUG_ON((signed short)ref < 0);
+
+			mfn = pfn_to_mfn(page_to_pfn(page));
+			gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
+							mfn, GNTMAP_readonly);
+
+			tx->gref = np->grant_tx_ref[id] = ref;
+			tx->offset = offset;
+			tx->size = bytes;
+			tx->flags = 0;
+
+			offset += bytes;
+			size -= bytes;
+
+			/* Next frame */
+			if (offset == PAGE_SIZE && size) {
+				BUG_ON(!PageCompound(page));
+				page++;
+				offset = 0;
+			}
+		}
 	}
 
 	np->tx.req_prod_pvt = prod;
 }
 
+/*
+ * Count how many ring slots are required to send the frags of this
+ * skb. Each frag might be a compound page.
+ */
+static int xennet_count_skb_frag_slots(struct sk_buff *skb)
+{
+	int i, frags = skb_shinfo(skb)->nr_frags;
+	int pages = 0;
+
+	for (i = 0; i < frags; i++) {
+		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
+		unsigned long size = skb_frag_size(frag);
+		unsigned long offset = frag->page_offset;
+
+		/* Skip unused frames from start of page */
+		offset &= ~PAGE_MASK;
+
+		pages += PFN_UP(offset + size);
+	}
+
+	return pages;
+}
+
 static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	unsigned short id;
@@ -487,23 +540,23 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	grant_ref_t ref;
 	unsigned long mfn;
 	int notify;
-	int frags = skb_shinfo(skb)->nr_frags;
+	int slots;
 	unsigned int offset = offset_in_page(data);
 	unsigned int len = skb_headlen(skb);
 	unsigned long flags;
 
-	frags += DIV_ROUND_UP(offset + len, PAGE_SIZE);
-	if (unlikely(frags > MAX_SKB_FRAGS + 1)) {
-		printk(KERN_ALERT "xennet: skb rides the rocket: %d frags\n",
-		       frags);
-		dump_stack();
+	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
+		xennet_count_skb_frag_slots(skb);
+	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
+		printk(KERN_ALERT "xennet: skb rides the rocket: %d slots\n",
+		       slots);
 		goto drop;
 	}
 
 	spin_lock_irqsave(&np->tx_lock, flags);
 
 	if (unlikely(!netif_carrier_ok(dev) ||
-		     (frags > 1 && !xennet_can_sg(dev)) ||
+		     (slots > 1 && !xennet_can_sg(dev)) ||
 		     netif_needs_gso(skb, netif_skb_features(skb)))) {
 		spin_unlock_irqrestore(&np->tx_lock, flags);
 		goto drop;
-- 
1.7.2.5


* Re: [PATCH V2] xen/netfront: handle compound page fragments on transmit
  2012-11-20 16:00 [PATCH V2] xen/netfront: handle compound page fragments on transmit Ian Campbell
@ 2012-11-20 16:27 ` Eric Dumazet
  2012-11-21 12:08   ` Ian Campbell
From: Eric Dumazet @ 2012-11-20 16:27 UTC
  To: Ian Campbell
  Cc: netdev, xen-devel, Eric Dumazet, Konrad Rzeszutek Wilk, ANNIE LI,
	Sander Eikelenboom, Stefan Bader

On Tue, 2012-11-20 at 16:00 +0000, Ian Campbell wrote:
> An SKB paged fragment can consist of a compound page with order > 0.
> However the netchannel protocol deals only in PAGE_SIZE frames.
> 
> Handle this in xennet_make_frags by iterating over the frames which
> make up the page.
> 
> This is the netfront equivalent to 6a8ed462f16b for netback.
...

> -	frags += DIV_ROUND_UP(offset + len, PAGE_SIZE);
> -	if (unlikely(frags > MAX_SKB_FRAGS + 1)) {
> -		printk(KERN_ALERT "xennet: skb rides the rocket: %d frags\n",
> -		       frags);
> -		dump_stack();
> +	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
> +		xennet_count_skb_frag_slots(skb);
> +	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
> +		printk(KERN_ALERT "xennet: skb rides the rocket: %d slots\n",
> +		       slots);

I think this is wrong.

You should change netfront_tx_slot_available() to stop the queue before
this can happen.

Yes, you don't hit this in your tests, but a driver should not drop a
good packet.

>  		goto drop;
>  	}
>  
>  	spin_lock_irqsave(&np->tx_lock, flags);
>  
>  	if (unlikely(!netif_carrier_ok(dev) ||
> -		     (frags > 1 && !xennet_can_sg(dev)) ||
> +		     (slots > 1 && !xennet_can_sg(dev)) ||
>  		     netif_needs_gso(skb, netif_skb_features(skb)))) {
>  		spin_unlock_irqrestore(&np->tx_lock, flags);
>  		goto drop;


diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index caa0110..cb1e605 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -215,10 +215,13 @@ static void rx_refill_timeout(unsigned long data)
        napi_schedule(&np->napi);
 }
 
+/* Considering a 64KB packet of 16 frags, each frag can be mapped
+ * to 3 order-0 parts in pathological cases.
+ */
 static int netfront_tx_slot_available(struct netfront_info *np)
 {
        return (np->tx.req_prod_pvt - np->tx.rsp_cons) <
-               (TX_MAX_TARGET - MAX_SKB_FRAGS - 2);
+               (TX_MAX_TARGET - 3*MAX_SKB_FRAGS - 2);
 }
 
 static void xennet_maybe_wake_tx(struct net_device *dev)


* Re: [PATCH V2] xen/netfront: handle compound page fragments on transmit
  2012-11-20 16:27 ` Eric Dumazet
@ 2012-11-21 12:08   ` Ian Campbell
  2012-11-21 15:13     ` Eric Dumazet
From: Ian Campbell @ 2012-11-21 12:08 UTC
  To: Eric Dumazet
  Cc: netdev, xen-devel, Eric Dumazet, Konrad Rzeszutek Wilk, ANNIE LI,
	Sander Eikelenboom, Stefan Bader

On Tue, 2012-11-20 at 16:27 +0000, Eric Dumazet wrote:
> On Tue, 2012-11-20 at 16:00 +0000, Ian Campbell wrote:
> > An SKB paged fragment can consist of a compound page with order > 0.
> > However the netchannel protocol deals only in PAGE_SIZE frames.
> > 
> > Handle this in xennet_make_frags by iterating over the frames which
> > make up the page.
> > 
> > This is the netfront equivalent to 6a8ed462f16b for netback.
> ...
> 
> > -	frags += DIV_ROUND_UP(offset + len, PAGE_SIZE);
> > -	if (unlikely(frags > MAX_SKB_FRAGS + 1)) {
> > -		printk(KERN_ALERT "xennet: skb rides the rocket: %d frags\n",
> > -		       frags);
> > -		dump_stack();
> > +	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
> > +		xennet_count_skb_frag_slots(skb);
> > +	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
> > +		printk(KERN_ALERT "xennet: skb rides the rocket: %d slots\n",
> > +		       slots);
> 
> I think this is wrong.
> 
> You should change netfront_tx_slot_available() to stop the queue before
> this can happen.
>
> Yes, you don't hit this in your tests, but a driver should not drop a
> good packet.

The max-frag related limitation comes from the "wire" protocol used
between front and back. As it stands, either the frontend or the
backend is more than likely going to drop the sort of pathological
skbs you are worried about.

I agree that this absolutely needs to be fixed in the protocol (and I've
posted a call to arms on this topic on xen-devel) but I'd like to do it
in a coordinated manner as part of a protocol extension (where the front
and backend negotiate the maximum number of order-0 pages per Ethernet
frame they are willing to handle) rather than as a side effect of this
patch.

So right now I don't want to introduce frontends which default to
sending increased numbers of pages into the wild, since that makes
things more complex when we come to extend the protocol.

Perhaps in the short term doing an skb_linearize when we hit this case
would help; that would turn the pathological skb into a much more
normal one. It'll be expensive, but it should be rare. That assumes you
can linearize such a large skb, which depends on the ability to
allocate large-order pages, which isn't a given. Hmm, maybe that
doesn't work then.
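
For concreteness, the short-term fallback would look roughly like this
(untested sketch only; it inherits the large-allocation problem above,
and data/offset/len would need recomputing after the linearize):

	/* Untested sketch: try to linearize an oversized skb instead of
	 * dropping it.  skb_linearize() needs a high-order allocation
	 * for a big skb, so it can easily fail here.
	 */
	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
		if (skb_linearize(skb))
			goto drop;
		/* All data is now in the linear area, no frags left. */
		slots = DIV_ROUND_UP(offset_in_page(skb->data) +
				     skb_headlen(skb), PAGE_SIZE);
		if (unlikely(slots > MAX_SKB_FRAGS + 1))
			goto drop;
	}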

AFAIK we don't have an existing skb_foo operation which copies an skb,
including (or only) the frags, with the side effect of aligning and
coalescing them. Do we?
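
If not, the rough shape of what I'm imagining is below (completely
untested sketch, xennet_coalesce_frags is a made-up name; it assumes
the skb isn't cloned/shared and ignores truesize accounting):

	/* Hypothetical helper: copy an skb's paged data into freshly
	 * allocated, page-aligned order-0 pages so that no fragment
	 * crosses a page boundary.
	 */
	static int xennet_coalesce_frags(struct sk_buff *skb, gfp_t gfp)
	{
		unsigned int len = skb->data_len;	/* bytes held in frags */
		unsigned int copied = 0;
		struct page *pages[MAX_SKB_FRAGS];
		int nr = 0, i;

		if (DIV_ROUND_UP(len, PAGE_SIZE) > MAX_SKB_FRAGS)
			return -EMSGSIZE;

		while (copied < len) {
			unsigned int chunk = min_t(unsigned int,
						   len - copied, PAGE_SIZE);
			struct page *page = alloc_page(gfp);

			if (!page)
				goto err;
			/* Gather from the old (possibly compound) frags. */
			if (skb_copy_bits(skb, skb_headlen(skb) + copied,
					  page_address(page), chunk)) {
				put_page(page);
				goto err;
			}
			pages[nr++] = page;
			copied += chunk;
		}

		/* Drop the old fragments and install the new ones;
		 * skb->len and skb->data_len are unchanged.
		 */
		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
			skb_frag_unref(skb, i);
		skb_shinfo(skb)->nr_frags = 0;

		for (i = 0; i < nr; i++)
			skb_fill_page_desc(skb, i, pages[i], 0,
					   min_t(unsigned int,
						 len - i * PAGE_SIZE,
						 PAGE_SIZE));
		return 0;

	err:
		while (nr--)
			put_page(pages[nr]);
		return -ENOMEM;
	}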

Ian.


* Re: [PATCH V2] xen/netfront: handle compound page fragments on transmit
  2012-11-21 12:08   ` Ian Campbell
@ 2012-11-21 15:13     ` Eric Dumazet
From: Eric Dumazet @ 2012-11-21 15:13 UTC
  To: Ian Campbell
  Cc: netdev, xen-devel, Eric Dumazet, Konrad Rzeszutek Wilk, ANNIE LI,
	Sander Eikelenboom, Stefan Bader

On Wed, 2012-11-21 at 12:08 +0000, Ian Campbell wrote:
> The max-frag related limitation comes from the "wire" protocol used
> between front and back. As it stands, either the frontend or the
> backend is more than likely going to drop the sort of pathological
> skbs you are worried about.
> 
> I agree that this absolutely needs to be fixed in the protocol (and I've
> posted a call to arms on this topic on xen-devel) but I'd like to do it
> in a coordinated manner as part of a protocol extension (where the front
> and backend negotiate the maximum number of order-0 pages per Ethernet
> frame they are willing to handle) rather than as a side effect of this
> patch.
> 
> So right now I don't want to introduce frontends which default to
> sending increased numbers of pages into the wild, since that makes
> things more complex when we come to extend the protocol.
> 
> Perhaps in the short term doing an skb_linearize when we hit this case
> would help; that would turn the pathological skb into a much more
> normal one. It'll be expensive, but it should be rare. That assumes you
> can linearize such a large skb, which depends on the ability to
> allocate large-order pages, which isn't a given. Hmm, maybe that
> doesn't work then.
> 

First of all, thanks a lot for all this detailed information.

This now makes sense!


> AFAIK we don't have an existing skb_foo operation which copies an skb,
> including (or only) the frags, with the side effect of aligning and
> coalescing them. Do we?
> 

No, we only have the full linearize helper and the skb_try_coalesce()
helper.

The TCP stack uses an internal function to collapse several skbs into
skbs using a single page; I guess we could generalize this and make it
available to other uses.

Thanks!

