* [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals
@ 2012-07-12  0:18 Alexander Duyck
  2012-07-12  0:18 ` [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages Alexander Duyck
  2012-07-12  0:32 ` [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Alexander Duyck @ 2012-07-12  0:18 UTC
  To: netdev; +Cc: davem, jeffrey.t.kirsher, alexander.duyck, Alexander Duyck

The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
an out of bounds access due to what appears to be a typo.  I believe this
change should resolve the issue by replacing the access to RTAX_CWND with
TCP_METRIC_CWND.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 net/ipv4/tcp_metrics.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 1fd83d3..5a38a2d 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -412,7 +412,7 @@ void tcp_update_metrics(struct sock *sk)
 				       max(tp->snd_cwnd >> 1, tp->snd_ssthresh));
 		if (!tcp_metric_locked(tm, TCP_METRIC_CWND)) {
 			val = tcp_metric_get(tm, TCP_METRIC_CWND);
-			tcp_metric_set(tm, RTAX_CWND, (val + tp->snd_cwnd) >> 1);
+			tcp_metric_set(tm, TCP_METRIC_CWND, (val + tp->snd_cwnd) >> 1);
 		}
 	} else {
 		/* Else slow start did not finish, cwnd is non-sense,

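For context, a minimal sketch of why the old index walks off the end of
the array (a paraphrase of the era's include/net/tcp.h and the RTAX_*
values from include/linux/rtnetlink.h, simplified; not the exact kernel
definitions):

typedef unsigned int u32;	/* stand-in for the kernel type */

enum tcp_metric_index {
	TCP_METRIC_RTT,			/* 0 */
	TCP_METRIC_RTTVAR,		/* 1 */
	TCP_METRIC_SSTHRESH,		/* 2 */
	TCP_METRIC_CWND,		/* 3 */
	TCP_METRIC_REORDERING,		/* 4 */
	__TCP_METRIC_MAX,
};
#define TCP_METRIC_MAX	(__TCP_METRIC_MAX - 1)

struct tcp_metrics_block_sketch {
	/* ... */
	u32	tcpm_vals[TCP_METRIC_MAX + 1];	/* 5 slots, indices 0..4 */
};

RTAX_CWND lives in the rtnetlink RTAX_* namespace, where it is 7, so
tcp_metric_set(tm, RTAX_CWND, ...) indexes well past the end of
tcpm_vals; TCP_METRIC_CWND (3) is the in-bounds index that was meant.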

* [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages
  2012-07-12  0:18 [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals Alexander Duyck
@ 2012-07-12  0:18 ` Alexander Duyck
  2012-07-12  0:29   ` Eric Dumazet
  2012-07-12  0:32 ` [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Duyck @ 2012-07-12  0:18 UTC
  To: netdev
  Cc: davem, jeffrey.t.kirsher, alexander.duyck, Eric Dumazet, Alexander Duyck

This patch does several things.

First it reorders the netdev_alloc_frag code so that only one conditional
check is needed in most cases instead of 2.

Second it incorporates the atomic_set and atomic_sub_return logic from an
earlier proposed patch by Eric Dumazet allowing for a reduction in the
get_page/put_page overhead when dealing with frags.

Finally it also incorporates the page reuse code so that if the page count
is dropped to 0 we can just reinitialize the page and reuse it.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 net/core/skbuff.c |   37 +++++++++++++++++++++++++------------
 1 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 506f678..69f4add 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -296,9 +296,12 @@ EXPORT_SYMBOL(build_skb);
 struct netdev_alloc_cache {
 	struct page *page;
 	unsigned int offset;
+	unsigned int pagecnt_bias;
 };
 static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache);
 
+#define NETDEV_PAGECNT_BIAS (PAGE_SIZE / SMP_CACHE_BYTES)
+
 /**
  * netdev_alloc_frag - allocate a page fragment
  * @fragsz: fragment size
@@ -311,23 +314,33 @@ void *netdev_alloc_frag(unsigned int fragsz)
 	struct netdev_alloc_cache *nc;
 	void *data = NULL;
 	unsigned long flags;
+	unsigned int offset;
 
 	local_irq_save(flags);
 	nc = &__get_cpu_var(netdev_alloc_cache);
-	if (unlikely(!nc->page)) {
-refill:
+	offset = nc->offset;
+	if (unlikely(offset < fragsz)) {
+		BUG_ON(PAGE_SIZE < fragsz);
+
+		if (likely(nc->page) &&
+		    atomic_sub_and_test(nc->pagecnt_bias, &nc->page->_count))
+			goto recycle;
+
 		nc->page = alloc_page(GFP_ATOMIC | __GFP_COLD);
-		nc->offset = 0;
-	}
-	if (likely(nc->page)) {
-		if (nc->offset + fragsz > PAGE_SIZE) {
-			put_page(nc->page);
-			goto refill;
+		if (unlikely(!nc->page)) {
+			offset = 0;
+			goto end;
 		}
-		data = page_address(nc->page) + nc->offset;
-		nc->offset += fragsz;
-		get_page(nc->page);
-	}
+recycle:
+		atomic_set(&nc->page->_count, NETDEV_PAGECNT_BIAS);
+		nc->pagecnt_bias = NETDEV_PAGECNT_BIAS;
+		offset = PAGE_SIZE;
+	}
+	offset -= fragsz;
+	nc->pagecnt_bias--;
+	data = page_address(nc->page) + offset;
+end:
+	nc->offset = offset;
 	local_irq_restore(flags);
 	return data;
 }

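To see the refcounting trick in isolation, here is a small userspace
model of the bias scheme (an illustrative sketch under stated
assumptions, not kernel code: page->_count becomes a C11 atomic,
virt_to_page()/put_page() become an explicit page pointer, and the page
size and cache line size are assumed to be 4096 and 64):

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SZ       4096u
#define PAGECNT_BIAS  (PAGE_SZ / 64u)	/* mirrors PAGE_SIZE / SMP_CACHE_BYTES */

struct page {
	atomic_uint count;		/* stands in for page->_count */
	unsigned char data[PAGE_SZ];
};

static struct page *nc_page;	/* models the per-cpu cache (single-threaded here) */
static unsigned int nc_offset;
static unsigned int nc_bias;

static void *alloc_frag(unsigned int fragsz)
{
	if (nc_offset < fragsz) {
		/* Refill path: nc_bias units of the refcount were never
		 * handed out, so if subtracting them leaves zero, every
		 * fragment consumer has already dropped its reference and
		 * the page can be reused without a trip to the allocator. */
		if (nc_page &&
		    atomic_fetch_sub(&nc_page->count, nc_bias) == nc_bias)
			goto recycle;

		nc_page = malloc(sizeof(*nc_page));
		if (!nc_page)
			return NULL;
recycle:
		atomic_store(&nc_page->count, PAGECNT_BIAS);
		nc_bias = PAGECNT_BIAS;
		nc_offset = PAGE_SZ;
	}
	nc_offset -= fragsz;
	nc_bias--;	/* one unit of the bias becomes this fragment's reference */
	return nc_page->data + nc_offset;
}

/* The kernel reaches the page via virt_to_page(data); the model passes it. */
static void free_frag(struct page *pg)
{
	if (atomic_fetch_sub(&pg->count, 1) == 1)
		free(pg);
}

int main(void)
{
	void *a = alloc_frag(256);
	void *b = alloc_frag(256);
	struct page *first = nc_page;

	(void)a; (void)b;
	free_frag(first);		/* both fragment references dropped, */
	free_frag(first);		/* so the next refill can recycle */
	alloc_frag(PAGE_SZ);
	printf("recycled: %s\n", nc_page == first ? "yes" : "no");
	return 0;
}

The point of the bias is that the hot path touches only the plain
per-cpu counters; the one atomic on the page refcount happens at refill
time, and hitting zero there means the page is recycled in place.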

* Re: [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages
  2012-07-12  0:18 ` [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages Alexander Duyck
@ 2012-07-12  0:29   ` Eric Dumazet
  2012-07-12  1:11     ` David Miller
  2012-07-12  2:02     ` Alexander Duyck
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-07-12  0:29 UTC
  To: Alexander Duyck
  Cc: netdev, davem, jeffrey.t.kirsher, alexander.duyck, Eric Dumazet

On Wed, 2012-07-11 at 17:18 -0700, Alexander Duyck wrote:
> This patch does several things.
> 
> First it reorders the netdev_alloc_frag code so that only one conditional
> check is needed in most cases instead of 2.
> 
> Second it incorporates the atomic_set and atomic_sub_return logic from an
> earlier proposed patch by Eric Dumazet allowing for a reduction in the
> get_page/put_page overhead when dealing with frags.
> 
> Finally it also incorporates the page reuse code so that if the page count
> is dropped to 0 we can just reinitialize the page and reuse it.
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---


Hmm, I was working on a version using order-3 pages if available.

(or more exactly 32768 bytes chunks)

I am not sure how your version can help with typical 1500 allocations
(2 skbs per page)

* Re: [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals
  2012-07-12  0:18 [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals Alexander Duyck
  2012-07-12  0:18 ` [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages Alexander Duyck
@ 2012-07-12  0:32 ` David Miller
  2012-07-12  1:46   ` Alexander Duyck
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2012-07-12  0:32 UTC
  To: alexander.h.duyck; +Cc: netdev, jeffrey.t.kirsher, alexander.duyck

From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Wed, 11 Jul 2012 17:18:04 -0700

> The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
> an out of bounds access due to what appears to be a typo.  I believe this
> change should resolve the issue by replacing the access to RTAX_CWND with
> TCP_METRIC_CWND.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

Applied, thanks a lot.

How did you spot this?  Did you get a compiler warning?

I ask because while working on this, I at one point put the
tcp timestamp members after the metrics array in the
tcp_metrics_bucket struct.  And I got a warning from gcc about
an array bounds violation that I could not figure out.

I am pretty certain this bug here is what it was warning about.  And
the problem is that if you put the array at the end, gcc doesn't warn,
in order to accommodate code that uses trailing arrays the way people
use zero-length arrays.

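To illustrate the gcc behavior described above with hypothetical
structs (not the real tcp_metrics layout; the exact diagnostics vary by
gcc version and optimization level):

/* gcc -O2 -Warray-bounds will usually flag the mid-struct case but
 * stay quiet on the trailing-array case, because trailing arrays are
 * historically used as variable-length tails (the zero-length-array
 * idiom). */
struct metrics_mid {
	unsigned int	vals[5];
	unsigned long	stamp;		/* members follow the array */
};

struct metrics_tail {
	unsigned long	stamp;
	unsigned int	vals[5];	/* array is the last member */
};

unsigned int read_mid(struct metrics_mid *m)
{
	return m->vals[7];		/* typically warns */
}

unsigned int read_tail(struct metrics_tail *m)
{
	return m->vals[7];		/* typically not flagged */
}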

* Re: [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages
  2012-07-12  0:29   ` Eric Dumazet
@ 2012-07-12  1:11     ` David Miller
  2012-07-12  2:02     ` Alexander Duyck
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2012-07-12  1:11 UTC
  To: eric.dumazet
  Cc: alexander.h.duyck, netdev, jeffrey.t.kirsher, alexander.duyck, edumazet

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 12 Jul 2012 02:29:27 +0200

> On Wed, 2012-07-11 at 17:18 -0700, Alexander Duyck wrote:
>> This patch does several things.
>> 
>> First it reorders the netdev_alloc_frag code so that only one conditional
>> check is needed in most cases instead of 2.
>> 
>> Second it incorporates the atomic_set and atomic_sub_return logic from an
>> earlier proposed patch by Eric Dumazet allowing for a reduction in the
>> get_page/put_page overhead when dealing with frags.
>> 
>> Finally it also incorporates the page reuse code so that if the page count
>> is dropped to 0 we can just reinitialize the page and reuse it.
>> 
>> Cc: Eric Dumazet <edumazet@google.com>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
> 
> 
> Hmm, I was working on a version using order-3 pages if available.
> 
> (or more exactly 32768 bytes chunks)
> 
> I am not sure how your version can help with typical 1500 allocations
> (2 skbs per page)

I'd like you two to sort things out before I apply anything, thanks :)

* Re: [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals
  2012-07-12  0:32 ` [PATCH 1/2] tcp: Fix out of bounds access to tcpm_vals David Miller
@ 2012-07-12  1:46   ` Alexander Duyck
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Duyck @ 2012-07-12  1:46 UTC
  To: David Miller; +Cc: alexander.h.duyck, netdev, jeffrey.t.kirsher

On 7/11/2012 5:32 PM, David Miller wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> Date: Wed, 11 Jul 2012 17:18:04 -0700
>
>> The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
>> an out of bounds access due to what appears to be a typo.  I believe this
>> change should resolve the issue by replacing the access to RTAX_CWND with
>> TCP_METRIC_CWND.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Applied, thanks a lot.
>
> How did you spot this, did you get a compiler warning?
>
> I ask because while working on this, I at one point put the
> tcp timestamp members after the metrics array in the
> tcp_metrics_bucket struct.  And I got a warning from gcc about
> an array bounds violation that I could not figure out.
>
> I am pretty certain this bug here is what it was warning about.  And
> the problem is that if you put the array at the end gcc doesn't warn
> in order to handle things similar to what people use zero length
> arrays for.
It came up as a compiler warning.  I suspect it may have something to
do with the optimizations I had turned on, since it complained that the
issue was in tcp_update_metrics but then pointed at the offending line
in tcp_metric_set.

Thanks,

Alex

* Re: [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages
  2012-07-12  0:29   ` Eric Dumazet
  2012-07-12  1:11     ` David Miller
@ 2012-07-12  2:02     ` Alexander Duyck
  2012-07-12  5:06       ` Eric Dumazet
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Duyck @ 2012-07-12  2:02 UTC
  To: Eric Dumazet
  Cc: Alexander Duyck, netdev, davem, jeffrey.t.kirsher, Eric Dumazet

On 7/11/2012 5:29 PM, Eric Dumazet wrote:
> On Wed, 2012-07-11 at 17:18 -0700, Alexander Duyck wrote:
>> This patch does several things.
>>
>> First it reorders the netdev_alloc_frag code so that only one conditional
>> check is needed in most cases instead of 2.
>>
>> Second it incorporates the atomic_set and atomic_sub_return logic from an
>> earlier proposed patch by Eric Dumazet allowing for a reduction in the
>> get_page/put_page overhead when dealing with frags.
>>
>> Finally it also incorporates the page reuse code so that if the page count
>> is dropped to 0 we can just reinitialize the page and reuse it.
>>
>> Cc: Eric Dumazet <edumazet@google.com>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
>
> Hmm, I was working on a version using order-3 pages if available.
>
> (or more exactly 32768 bytes chunks)
>
> I am not sure how your version can help with typical 1500 allocations
> (2 skbs per page)
>
>
The gain will be minimal, if any, with the 1500-byte allocations;
however, there shouldn't be a performance degradation.

I was thinking more of the ixgbe case where we are working with only
256-byte allocations and can recycle pages in the case of GRO or TCP.
For ixgbe the advantages are significant since we drop a number of the
get_page calls and get the advantage of the page recycling.  So for
example, with GRO enabled we should only have to allocate 1 page for
headers every 16 buffers, and the 6 slots we use in that page have a
good likelihood of being warm in the cache since we just keep looping
on the same page.

Thanks,

Alex

* Re: [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages
  2012-07-12  2:02     ` Alexander Duyck
@ 2012-07-12  5:06       ` Eric Dumazet
  2012-07-12 15:33         ` Alexander Duyck
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2012-07-12  5:06 UTC
  To: Alexander Duyck
  Cc: Alexander Duyck, netdev, davem, jeffrey.t.kirsher, Eric Dumazet

On Wed, 2012-07-11 at 19:02 -0700, Alexander Duyck wrote:

> The gain will be minimal, if any, with the 1500-byte allocations;
> however, there shouldn't be a performance degradation.
> 
> I was thinking more of the ixgbe case where we are working with only
> 256-byte allocations and can recycle pages in the case of GRO or TCP.
> For ixgbe the advantages are significant since we drop a number of the
> get_page calls and get the advantage of the page recycling.  So for
> example, with GRO enabled we should only have to allocate 1 page for
> headers every 16 buffers, and the 6 slots we use in that page have a
> good likelihood of being warm in the cache since we just keep looping
> on the same page.
> 

It's not possible to get 16 buffers per 4096-byte page.

sizeof(struct skb_shared_info) = 0x140 (320 decimal)

Add 192 bytes (NET_SKB_PAD + 128)

That's a minimum of 512 bytes (but ixgbe uses more) per skb.

In practice for ixgbe, it's:

#define IXGBE_RXBUFFER_512   512    /* Used for packet split */
#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_512

skb = netdev_alloc_skb_ip_align(rx_ring->netdev, IXGBE_RX_HDR_SIZE)

So 4 buffers per PAGE

Maybe you plan to use IXGBE_RXBUFFER_256 or IXGBE_RXBUFFER_128?

* Re: [PATCH 2/2] net: Update alloc frag to reduce get/put page usage and recycle pages
  2012-07-12  5:06       ` Eric Dumazet
@ 2012-07-12 15:33         ` Alexander Duyck
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Duyck @ 2012-07-12 15:33 UTC
  To: Eric Dumazet
  Cc: Alexander Duyck, netdev, davem, jeffrey.t.kirsher, Eric Dumazet

On 07/11/2012 10:06 PM, Eric Dumazet wrote:
> On Wed, 2012-07-11 at 19:02 -0700, Alexander Duyck wrote:
>
>> The gain will be minimal, if any, with the 1500-byte allocations;
>> however, there shouldn't be a performance degradation.
>>
>> I was thinking more of the ixgbe case where we are working with only
>> 256-byte allocations and can recycle pages in the case of GRO or TCP.
>> For ixgbe the advantages are significant since we drop a number of the
>> get_page calls and get the advantage of the page recycling.  So for
>> example, with GRO enabled we should only have to allocate 1 page for
>> headers every 16 buffers, and the 6 slots we use in that page have a
>> good likelihood of being warm in the cache since we just keep looping
>> on the same page.
>>
> It's not possible to get 16 buffers per 4096-byte page.
Actually I was talking about buffers from the device, not buffers from
the page.  However, it is possible to get 16 head_frag buffers from the
same 4K page if we consider recycling.  In the case of GRO we will end
up with the first buffer keeping its head_frag, and all of the
remaining head_frags will be freed before we call netdev_alloc_frag
again.  So each GRO-assembled frame from ixgbe starts with a recycled
page used for the previously freed head_frags; the page is dropped from
netdev_alloc_frag once we run out of space, a new page is allocated for
use as head_frags, and finally those head_frags are freed and recycled
until we hit the end of the GRO frame and start over.  So if you count
them all, we end up using the page up to 16 times, maybe even more
depending on how the page offset reset aligns with the start of the GRO
frame.

> sizeof(struct skb_shared_info) = 0x140 (320 decimal)
>
> Add 192 bytes (NET_SKB_PAD + 128)
>
> That's a minimum of 512 bytes (but ixgbe uses more) per skb.
>
> In practice for ixgbe, it's:
>
> #define IXGBE_RXBUFFER_512   512    /* Used for packet split */
> #define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_512
>
> skb = netdev_alloc_skb_ip_align(rx_ring->netdev, IXGBE_RX_HDR_SIZE)
>
> So 4 buffers per PAGE
>
> Maybe you plan to use IXGBE_RXBUFFER_256 or IXGBE_RXBUFFER_128?
I have a patch that is in testing in Jeff Kirsher's tree that uses
IXGBE_RXBUFFER_256.  With your recent changes it didn't make sense to
use 512 when we would only copy 256 bytes into the head.  With the size
set to 256 we will get 6 buffers per page without any recycling.

Thanks,

Alex

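As a back-of-envelope check of the per-page figures in this subthread,
the following sketch reproduces both counts.  The padding model here is
an assumption (64-byte SKB_DATA_ALIGN and a NET_SKB_PAD of 64, with the
320-byte skb_shared_info quoted earlier); Eric's accounting above pads
slightly differently, but both land on the same buffers-per-page
numbers:

#include <stdio.h>

#define PAGE_SZ		4096u
#define CACHELINE	64u
#define NET_SKB_PAD	64u	/* assumed, typical x86-64 value of the era */
#define SHINFO		320u	/* sizeof(struct skb_shared_info), per the thread */

static unsigned int align_up(unsigned int x)	/* models SKB_DATA_ALIGN */
{
	return (x + CACHELINE - 1) & ~(CACHELINE - 1);
}

static unsigned int bufs_per_page(unsigned int hdr)
{
	unsigned int fragsz = align_up(NET_SKB_PAD + hdr) + align_up(SHINFO);

	return PAGE_SZ / fragsz;
}

int main(void)
{
	printf("512-byte heads: %u per page\n", bufs_per_page(512));	/* 4 */
	printf("256-byte heads: %u per page\n", bufs_per_page(256));	/* 6 */
	return 0;
}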
