netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 3/3] ipv6: use skb coalescing in reassembly
@ 2012-05-19 13:02 Eric Dumazet
  2012-05-19 22:35 ` David Miller
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2012-05-19 13:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Alexander Duyck

From: Eric Dumazet <edumazet@google.com>

ip6_frag_reasm() can use skb_try_coalesce() to build optimized skb,
reducing memory used by them (truesize), and reducing number of cache
line misses and overhead for the consumer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/ipv6/reassembly.c |   26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 5d32dfa..4ff9af6 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -415,6 +415,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 	struct sk_buff *fp, *head = fq->q.fragments;
 	int    payload_len;
 	unsigned int nhoff;
+	int sum_truesize;
 
 	fq_kill(fq);
 
@@ -484,20 +485,33 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 	head->mac_header += sizeof(struct frag_hdr);
 	head->network_header += sizeof(struct frag_hdr);
 
-	skb_shinfo(head)->frag_list = head->next;
 	skb_reset_transport_header(head);
 	skb_push(head, head->data - skb_network_header(head));
 
-	for (fp=head->next; fp; fp = fp->next) {
-		head->data_len += fp->len;
-		head->len += fp->len;
+	sum_truesize = head->truesize;
+	for (fp = head->next; fp;) {
+		bool headstolen;
+		int delta;
+		struct sk_buff *next = fp->next;
+
+		sum_truesize += fp->truesize;
 		if (head->ip_summed != fp->ip_summed)
 			head->ip_summed = CHECKSUM_NONE;
 		else if (head->ip_summed == CHECKSUM_COMPLETE)
 			head->csum = csum_add(head->csum, fp->csum);
-		head->truesize += fp->truesize;
+
+		if (skb_try_coalesce(head, fp, &headstolen, &delta)) {
+			kfree_skb_partial(fp, headstolen);
+		} else {
+			if (!skb_shinfo(head)->frag_list)
+				skb_shinfo(head)->frag_list = fp;
+			head->data_len += fp->len;
+			head->len += fp->len;
+			head->truesize += fp->truesize;
+		}
+		fp = next;
 	}
-	atomic_sub(head->truesize, &fq->q.net->mem);
+	atomic_sub(sum_truesize, &fq->q.net->mem);
 
 	head->next = NULL;
 	head->dev = dev;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next 3/3] ipv6: use skb coalescing in reassembly
  2012-05-19 13:02 [PATCH net-next 3/3] ipv6: use skb coalescing in reassembly Eric Dumazet
@ 2012-05-19 22:35 ` David Miller
  2012-05-21  5:37   ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2012-05-19 22:35 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, alexander.h.duyck

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 19 May 2012 15:02:35 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> ip6_frag_reasm() can use skb_try_coalesce() to build optimized skb,
> reducing memory used by them (truesize), and reducing number of cache
> line misses and overhead for the consumer.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>

Applied.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next 3/3] ipv6: use skb coalescing in reassembly
  2012-05-19 22:35 ` David Miller
@ 2012-05-21  5:37   ` Eric Dumazet
  2012-05-22  5:53     ` [RFC] net: skb_head_is_locked() should use skb_header_cloned() Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2012-05-21  5:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, alexander.h.duyck

On Sat, 2012-05-19 at 18:35 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Sat, 19 May 2012 15:02:35 +0200
> 
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > ip6_frag_reasm() can use skb_try_coalesce() to build optimized skb,
> > reducing memory used by them (truesize), and reducing number of cache
> > line misses and overhead for the consumer.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> Applied.

I'll do the same for net/ipv6/netfilter/nf_conntrack_reasm.c today.

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC] net: skb_head_is_locked() should use skb_header_cloned()
  2012-05-21  5:37   ` Eric Dumazet
@ 2012-05-22  5:53     ` Eric Dumazet
  2012-05-22 17:23       ` Alexander Duyck
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2012-05-22  5:53 UTC (permalink / raw)
  To: David Miller, alexander.h.duyck; +Cc: netdev

Hi David and Alexander

There is no hurry since net-next is closed, but I hit the following
problem :

When IPv6 conntracking is enabled, code from
net/ipv6/netfilter/nf_conntrack_reasm.c does a cloning of all skbs to
build a shadow.

Then we run : (skb here is the head of the 'shadow skb' )

void nf_ct_frag6_output(unsigned int hooknum, struct sk_buff *skb,
                        struct net_device *in, struct net_device *out,
                        int (*okfn)(struct sk_buff *))
{
        struct sk_buff *s, *s2;

        for (s = NFCT_FRAG6_CB(skb)->orig; s;) {
                nf_conntrack_put_reasm(s->nfct_reasm);
                nf_conntrack_get_reasm(skb);
                s->nfct_reasm = skb;

                s2 = s->next;
                s->next = NULL;

                NF_HOOK_THRESH(NFPROTO_IPV6, hooknum, s, in, out, okfn,
                               NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
                s = s2;
        }
        nf_conntrack_put_reasm(skb);
}

So when all original skbs are fed to real IPv6 reassembly code, their
clones are still alive and we hit the condition in skb_try_coalesce() :

if (skb_head_is_locked(from))
	return false;

I was wondering if skb_head_is_locked() should be changed to :

if (!skb->head_frag || skb_header_cloned(skb))
	return false;

Then we could add skb_header_release() calls on the clones of course in
net/ipv6/netfilter/nf_conntrack_reasm.c 

Not-Yet-Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/skbuff.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0e50171..6509ee1 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2587,7 +2587,7 @@ static inline bool skb_is_recycleable(const struct sk_buff *skb, int skb_size)
  */
 static inline bool skb_head_is_locked(const struct sk_buff *skb)
 {
-	return !skb->head_frag || skb_cloned(skb);
+	return !skb->head_frag || skb_header_cloned(skb);
 }
 #endif	/* __KERNEL__ */
 #endif	/* _LINUX_SKBUFF_H */

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC] net: skb_head_is_locked() should use skb_header_cloned()
  2012-05-22  5:53     ` [RFC] net: skb_head_is_locked() should use skb_header_cloned() Eric Dumazet
@ 2012-05-22 17:23       ` Alexander Duyck
  0 siblings, 0 replies; 5+ messages in thread
From: Alexander Duyck @ 2012-05-22 17:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On 05/21/2012 10:53 PM, Eric Dumazet wrote:
> Hi David and Alexander
>
> There is no hurry since net-next is closed, but I hit the following
> problem :
>
> When IPv6 conntracking is enabled, code from
> net/ipv6/netfilter/nf_conntrack_reasm.c does a cloning of all skbs to
> build a shadow.
>
> Then we run : (skb here is the head of the 'shadow skb' )
>
> void nf_ct_frag6_output(unsigned int hooknum, struct sk_buff *skb,
>                         struct net_device *in, struct net_device *out,
>                         int (*okfn)(struct sk_buff *))
> {
>         struct sk_buff *s, *s2;
>
>         for (s = NFCT_FRAG6_CB(skb)->orig; s;) {
>                 nf_conntrack_put_reasm(s->nfct_reasm);
>                 nf_conntrack_get_reasm(skb);
>                 s->nfct_reasm = skb;
>
>                 s2 = s->next;
>                 s->next = NULL;
>
>                 NF_HOOK_THRESH(NFPROTO_IPV6, hooknum, s, in, out, okfn,
>                                NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
>                 s = s2;
>         }
>         nf_conntrack_put_reasm(skb);
> }
>
> So when all original skbs are fed to real IPv6 reassembly code, their
> clones are still alive and we hit the condition in skb_try_coalesce() :
>
> if (skb_head_is_locked(from))
> 	return false;
>
> I was wondering if skb_head_is_locked() should be changed to :
>
> if (!skb->head_frag || skb_header_cloned(skb))
> 	return false;
>
> Then we could add skb_header_release() calls on the clones of course in
> net/ipv6/netfilter/nf_conntrack_reasm.c 
>
> Not-Yet-Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/skbuff.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 0e50171..6509ee1 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2587,7 +2587,7 @@ static inline bool skb_is_recycleable(const struct sk_buff *skb, int skb_size)
>   */
>  static inline bool skb_head_is_locked(const struct sk_buff *skb)
>  {
> -	return !skb->head_frag || skb_cloned(skb);
> +	return !skb->head_frag || skb_header_cloned(skb);
>  }
>  #endif	/* __KERNEL__ */
>  #endif	/* _LINUX_SKBUFF_H */
>
>
The problem is that the whole reason for checking skb_cloned was to
avoid reference count issues between the skb and the page.  We should
only be using the reference count in one or the other and not both. 
Otherwise we open up the possibility of a data corruption if someone
misinterprets a skb_shinfo()->dataref == 1, or skb_header_cloned
returning false when we have the buffer shared between both the sk_buff
and a page.

The skb_header_cloned check only verifies that the portion between
skb->head and skb->data is currently being unused by the other clones. 
It doesn't guarantee that skb->head is not being used by any other
sk_buff.  As such we run the same risk of messing up the dataref
counting if we were to use it.

The way I see it there are 2 solutions.  The first would be to just
split the reference counts and make it so that calls like skb_cloned
have to check both dataref and page count if skb->head_frag is set.  The
second option would be to look at something like pskb_expand_head where
we could generate a new head fragment and then memcpy the data over to
that frag in order to "unlock" the head.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-05-22 17:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-19 13:02 [PATCH net-next 3/3] ipv6: use skb coalescing in reassembly Eric Dumazet
2012-05-19 22:35 ` David Miller
2012-05-21  5:37   ` Eric Dumazet
2012-05-22  5:53     ` [RFC] net: skb_head_is_locked() should use skb_header_cloned() Eric Dumazet
2012-05-22 17:23       ` Alexander Duyck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).