* 2.6.24-rc2 slab vs slob tbench numbers
@ 2007-11-09 12:36 Nick Piggin
2007-11-09 15:15 ` Christoph Lameter
2007-11-12 20:13 ` 2.6.24-rc2 slab vs slob tbench numbers Matt Mackall
0 siblings, 2 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-09 12:36 UTC (permalink / raw)
To: Christoph Lameter, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 839 bytes --]
Hi,
Just ran some tbench numbers (from dbench-3.04), on a 2 socket, 8
core x86 system, with 1 NUMA node per socket. With kernel 2.6.24-rc2,
comparing slab vs slub allocators.
I ran from 1 to 16 client threads, 5 times each, restarting
the tbench server between every run. I'm just taking the highest
of each of the 5 tests (because the scheduler placement can
sometimes be poor). It's not completely scientific, but from the
graph you can see it is relatively stable and the difference seems significant.
Summary: slub is consistently slower. When all CPUs are saturated,
it is around 20% slower. Attached is a graph (x axis is nrclients, y
axis is throughput in MB/s).
If I can help with reproducing it or testing anything, let me know.
I'll be trying out a few other benchmarks too; if there's anything you
want me to test specifically, I can give it a try.
Thanks,
Nick
[-- Attachment #2: slab.png --]
[-- Type: image/png, Size: 4614 bytes --]
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2 slab vs slob tbench numbers
2007-11-09 12:36 2.6.24-rc2 slab vs slob tbench numbers Nick Piggin
@ 2007-11-09 15:15 ` Christoph Lameter
2007-11-09 17:49 ` Christoph Lameter
2007-11-12 20:13 ` 2.6.24-rc2 slab vs slob tbench numbers Matt Mackall
1 sibling, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2007-11-09 15:15 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel
On Fri, 9 Nov 2007, Nick Piggin wrote:
> Just ran some tbench numbers (from dbench-3.04), on a 2 socket, 8
> core x86 system, with 1 NUMA node per socket. With kernel 2.6.24-rc2,
> comparing slab vs slub allocators.
You saw the discussion at
http://marc.info/?l=linux-kernel&m=119354245426072&w=2
and the patches / configurations that were posted to address the issues?
Could you try these?
* Re: 2.6.24-rc2 slab vs slob tbench numbers
2007-11-09 15:15 ` Christoph Lameter
@ 2007-11-09 17:49 ` Christoph Lameter
2007-11-09 23:46 ` 2.6.24-rc2: Network commit causes SLUB performance regression with tbench Christoph Lameter
0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2007-11-09 17:49 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel
On an 8p system running 2.6.24-rc2 I see as much as a 50% regression on
tbench, SLAB vs. SLUB, when specifying 8 threads.
Interestingly, nothing changes the performance numbers, regardless of
debugging on or off etc. Usually debugging should reduce performance,
but nada. May have something to do with the localhost interface? Something
is effectively throttling SLUB here.
2.6.23 SLUB 2159.62 MB/sec
2.6.24-rc2-slab head SLUB 1260.80 MB/sec
2.6.24 SLUB should be faster than 2.6.23 SLUB. Still trying to figure out
what is going on....
* 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-09 17:49 ` Christoph Lameter
@ 2007-11-09 23:46 ` Christoph Lameter
2007-11-10 1:29 ` Nick Piggin
0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2007-11-09 23:46 UTC (permalink / raw)
To: David S. Miller, Herbert Xu; +Cc: Nick Piggin, linux-kernel
commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0 seems to cause a drop
in SLUB tbench performance:
8p x86_64 system:
2.6.24-rc2:
1260.80 MB/sec
After reverting the patch:
2350.04 MB/sec
SLAB performance (which is at 2435.58 MB/sec, ~3% better than SLUB) is not
affected by the patch.
Since this is an alignment change, it seems that tbench performance is
sensitive to the data layout? SLUB packs data more tightly than SLAB, so
8-byte allocations could result in cacheline contention if adjacent
objects are allocated from different cpus. SLAB's minimum size is 32
bytes, so the cacheline contention is likely more limited.
Maybe we need to allocate a minimum of one cacheline to the skb head? Or
pad it out to a full cacheline?
commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun Oct 21 16:27:46 2007 -0700
[NET]: Fix SKB_WITH_OVERHEAD calculation
The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause
an overflow across a page boundary which is what it's meant to prevent.
In particular, the header length (X) should not be lumped together with
skb_shared_info. The latter needs to be aligned properly while the header
has no choice but to sit in front of wherever the payload is.
Therefore the correct calculation is to take away the aligned size of
skb_shared_info, and then subtract the header length. The resulting
quantity L satisfies the following inequality:
SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE
This is the quantity used by alloc_skb to do the actual allocation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f93f22b..369f60a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -41,8 +41,7 @@
#define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \
~(SMP_CACHE_BYTES - 1))
#define SKB_WITH_OVERHEAD(X) \
- (((X) - sizeof(struct skb_shared_info)) & \
- ~(SMP_CACHE_BYTES - 1))
+ ((X) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
#define SKB_MAX_ORDER(X, ORDER) \
SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
#define SKB_MAX_HEAD(X) (SKB_MAX_ORDER((X), 0))
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-09 23:46 ` 2.6.24-rc2: Network commit causes SLUB performance regression with tbench Christoph Lameter
@ 2007-11-10 1:29 ` Nick Piggin
2007-11-10 3:28 ` Nick Piggin
2007-11-12 19:44 ` Christoph Lameter
0 siblings, 2 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-10 1:29 UTC (permalink / raw)
To: Christoph Lameter, linux-netdev; +Cc: David S. Miller, Herbert Xu, linux-kernel
cc'ed linux-netdev
On Saturday 10 November 2007 10:46, Christoph Lameter wrote:
> commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0 seems to cause a drop
> in SLUB tbench performance:
>
> 8p x86_64 system:
>
> 2.6.24-rc2:
> 1260.80 MB/sec
>
> After reverting the patch:
> 2350.04 MB/sec
>
> SLAB performance (which is at 2435.58 MB/sec, ~3% better than SLUB) is not
> affected by the patch.
Ah, I didn't realise this was a regression. Thanks for bisecting it.
> Since this is an alignment change, it seems that tbench performance is
> sensitive to the data layout? SLUB packs data more tightly than SLAB, so
> 8-byte allocations could result in cacheline contention if adjacent
> objects are allocated from different cpus. SLAB's minimum size is 32
> bytes, so the cacheline contention is likely more limited.
> Maybe we need to allocate a minimum of one cacheline to the skb head? Or
> pad it out to a full cacheline?
The data should already be cacheline aligned. It is kmalloced, and
with a minimum size of somewhere around 200 bytes on a 64-bit machine.
So it will hit a cacheline aligned kmalloc slab AFAIKS -- cacheline
interference is probably not the problem. (To verify, I built slub with
minimum kmalloc size set to 32 like slab and it's no real difference)
But I can't see why restricting the allocation to PAGE_SIZE would help
either. Maybe the macros are used in some other areas.
BTW. your size-2048 kmalloc cache is order-1 in the default setup,
whereas kmalloc(1024) or kmalloc(4096) will be order-0 allocations. And
SLAB also uses order-0 for size-2048. It would be nice if SLUB did the
same...
> commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Sun Oct 21 16:27:46 2007 -0700
>
> [NET]: Fix SKB_WITH_OVERHEAD calculation
>
> The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause
> an overflow across a page boundary which is what it's meant to prevent.
> In particular, the header length (X) should not be lumped together with
> skb_shared_info. The latter needs to be aligned properly while the
> header has no choice but to sit in front of wherever the payload is.
>
> Therefore the correct calculation is to take away the aligned size of
> skb_shared_info, and then subtract the header length. The resulting
> quantity L satisfies the following inequality:
>
> SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE
>
> This is the quantity used by alloc_skb to do the actual allocation.
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f93f22b..369f60a 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -41,8 +41,7 @@
> #define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \
> ~(SMP_CACHE_BYTES - 1))
> #define SKB_WITH_OVERHEAD(X) \
> - (((X) - sizeof(struct skb_shared_info)) & \
> - ~(SMP_CACHE_BYTES - 1))
> + ((X) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
> #define SKB_MAX_ORDER(X, ORDER) \
> SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
> #define SKB_MAX_HEAD(X) (SKB_MAX_ORDER((X), 0))
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-10 1:29 ` Nick Piggin
@ 2007-11-10 3:28 ` Nick Piggin
2007-11-12 19:44 ` Christoph Lameter
1 sibling, 0 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-10 3:28 UTC (permalink / raw)
To: Christoph Lameter, netdev; +Cc: David S. Miller, Herbert Xu, linux-kernel
On Saturday 10 November 2007 12:29, Nick Piggin wrote:
> cc'ed linux-netdev
Err, make that 'netdev' :P
> On Saturday 10 November 2007 10:46, Christoph Lameter wrote:
> > commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0 seems to cause a drop
> > in SLUB tbench performance:
> >
> > 8p x86_64 system:
> >
> > 2.6.24-rc2:
> > 1260.80 MB/sec
> >
> > After reverting the patch:
> > 2350.04 MB/sec
> >
> > SLAB performance (which is at 2435.58 MB/sec, ~3% better than SLUB) is
> > not affected by the patch.
>
> Ah, I didn't realise this was a regression. Thanks for bisecting it.
>
> > Since this is an alignment change, it seems that tbench performance is
> > sensitive to the data layout? SLUB packs data more tightly than SLAB, so
> > 8-byte allocations could result in cacheline contention if adjacent
> > objects are allocated from different cpus. SLAB's minimum size is 32
> > bytes, so the cacheline contention is likely more limited.
> >
> > Maybe we need to allocate a minimum of one cacheline to the skb head? Or
> > pad it out to a full cacheline?
>
> The data should already be cacheline aligned. It is kmalloced, and
> with a minimum size of somewhere around 200 bytes on a 64-bit machine.
> So it will hit a cacheline aligned kmalloc slab AFAIKS -- cacheline
> interference is probably not the problem. (To verify, I built slub with
> minimum kmalloc size set to 32 like slab and it's no real difference)
>
> But I can't see why restricting the allocation to PAGE_SIZE would help
> either. Maybe the macros are used in some other areas.
>
> BTW. your size-2048 kmalloc cache is order-1 in the default setup,
> whereas kmalloc(1024) or kmalloc(4096) will be order-0 allocations. And
> SLAB also uses order-0 for size-2048. It would be nice if SLUB did the
> same...
>
> > commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0
> > Author: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Sun Oct 21 16:27:46 2007 -0700
> >
> > [NET]: Fix SKB_WITH_OVERHEAD calculation
> >
> > The calculation in SKB_WITH_OVERHEAD is incorrect in that it can
> > cause an overflow across a page boundary which is what it's meant to
> > prevent. In particular, the header length (X) should not be lumped
> > together with skb_shared_info. The latter needs to be aligned properly
> > while the header has no choice but to sit in front of wherever the
> > payload is.
> >
> > Therefore the correct calculation is to take away the aligned size of
> > skb_shared_info, and then subtract the header length. The resulting
> > quantity L satisfies the following inequality:
> >
> > SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <=
> > PAGE_SIZE
> >
> > This is the quantity used by alloc_skb to do the actual allocation.
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> >
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index f93f22b..369f60a 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -41,8 +41,7 @@
> > #define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \
> > ~(SMP_CACHE_BYTES - 1))
> > #define SKB_WITH_OVERHEAD(X) \
> > - (((X) - sizeof(struct skb_shared_info)) & \
> > - ~(SMP_CACHE_BYTES - 1))
> > + ((X) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
> > #define SKB_MAX_ORDER(X, ORDER) \
> > SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
> > #define SKB_MAX_HEAD(X) (SKB_MAX_ORDER((X), 0))
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-10 1:29 ` Nick Piggin
2007-11-10 3:28 ` Nick Piggin
@ 2007-11-12 19:44 ` Christoph Lameter
2007-11-13 11:41 ` Nick Piggin
1 sibling, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2007-11-12 19:44 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-netdev, David S. Miller, Herbert Xu, linux-kernel
On Sat, 10 Nov 2007, Nick Piggin wrote:
> BTW. your size-2048 kmalloc cache is order-1 in the default setup,
> whereas kmalloc(1024) or kmalloc(4096) will be order-0 allocations. And
> SLAB also uses order-0 for size-2048. It would be nice if SLUB did the
> same...
You can try to see the effect that order 0 would have by booting with
slub_max_order=0
* Re: 2.6.24-rc2 slab vs slob tbench numbers
2007-11-09 12:36 2.6.24-rc2 slab vs slob tbench numbers Nick Piggin
2007-11-09 15:15 ` Christoph Lameter
@ 2007-11-12 20:13 ` Matt Mackall
2007-11-13 11:44 ` Nick Piggin
1 sibling, 1 reply; 29+ messages in thread
From: Matt Mackall @ 2007-11-12 20:13 UTC (permalink / raw)
To: Nick Piggin; +Cc: Christoph Lameter, linux-kernel
On Fri, Nov 09, 2007 at 11:36:56PM +1100, Nick Piggin wrote:
> Hi,
>
> Just ran some tbench numbers (from dbench-3.04), on a 2 socket, 8
> core x86 system, with 1 NUMA node per socket. With kernel 2.6.24-rc2,
> comparing slab vs slub allocators.
Damn your misleading subject! I thought this was going to be about
something interesting.
--
Mathematics is the supreme nostalgia of our time.
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-12 19:44 ` Christoph Lameter
@ 2007-11-13 11:41 ` Nick Piggin
2007-11-14 1:58 ` David Miller
0 siblings, 1 reply; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 11:41 UTC (permalink / raw)
To: Christoph Lameter; +Cc: netdev, David S. Miller, Herbert Xu, linux-kernel
On Tuesday 13 November 2007 06:44, Christoph Lameter wrote:
> On Sat, 10 Nov 2007, Nick Piggin wrote:
> > BTW. your size-2048 kmalloc cache is order-1 in the default setup,
> > whereas kmalloc(1024) or kmalloc(4096) will be order-0 allocations. And
> > SLAB also uses order-0 for size-2048. It would be nice if SLUB did the
> > same...
>
> You can try to see the effect that order 0 would have by booting with
>
> slub_max_order=0
Yeah, that didn't help much, but in general I think it would give
more consistent and reliable behaviour from slub.
* Re: 2.6.24-rc2 slab vs slob tbench numbers
2007-11-12 20:13 ` 2.6.24-rc2 slab vs slob tbench numbers Matt Mackall
@ 2007-11-13 11:44 ` Nick Piggin
0 siblings, 0 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 11:44 UTC (permalink / raw)
To: Matt Mackall; +Cc: Christoph Lameter, linux-kernel
On Tuesday 13 November 2007 07:13, Matt Mackall wrote:
> On Fri, Nov 09, 2007 at 11:36:56PM +1100, Nick Piggin wrote:
> > Hi,
> >
> > Just ran some tbench numbers (from dbench-3.04), on a 2 socket, 8
> > core x86 system, with 1 NUMA node per socket. With kernel 2.6.24-rc2,
> > comparing slab vs slub allocators.
>
> Damn your misleading subject! I thought this was going to be about
> something interesting.
Actually I did test slob as well -- it's competitive with slab and
slub up to about 4 cores, which is nice.
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 1:58 ` David Miller
@ 2007-11-13 17:36 ` Nick Piggin
2007-11-14 6:12 ` David Miller
0 siblings, 1 reply; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 17:36 UTC (permalink / raw)
To: David Miller; +Cc: clameter, netdev, herbert, linux-kernel
On Wednesday 14 November 2007 12:58, David Miller wrote:
> From: Nick Piggin <nickpiggin@yahoo.com.au>
> Date: Tue, 13 Nov 2007 22:41:58 +1100
>
> > On Tuesday 13 November 2007 06:44, Christoph Lameter wrote:
> > > On Sat, 10 Nov 2007, Nick Piggin wrote:
> > > > BTW. your size-2048 kmalloc cache is order-1 in the default setup,
> > > > whereas kmalloc(1024) or kmalloc(4096) will be order-0 allocations.
> > > > And SLAB also uses order-0 for size-2048. It would be nice if SLUB
> > > > did the same...
> > >
> > > You can try to see the effect that order 0 would have by booting with
> > >
> > > slub_max_order=0
> >
> > Yeah, that didn't help much, but in general I think it would give
> > more consistent and reliable behaviour from slub.
>
> Just a note that I'm not ignoring this issue, I just don't have time
> to get to it yet.
No problem. I would like to have helped more, but it's slow going given
my lack of network stack knowledge. If I get any more interesting data,
I'll send it.
> I suspect the issue is about having a huge skb->data linear area for
> TCP sends over loopback. We're likely getting a much smaller
> skb->data linear data area after the patch in question, the rest using
> the sk_buff scatterlist pages which are a little bit more expensive to
> process.
It didn't seem to be noticeable at 1 client. Unless scatterlist
processing is going to cause cacheline bouncing, I don't see why this
hurts more as you add CPUs?
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 6:12 ` David Miller
@ 2007-11-13 18:14 ` Nick Piggin
2007-11-14 6:37 ` David Miller
0 siblings, 1 reply; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 18:14 UTC (permalink / raw)
To: David Miller; +Cc: clameter, netdev, herbert, linux-kernel
On Wednesday 14 November 2007 17:12, David Miller wrote:
> From: Nick Piggin <nickpiggin@yahoo.com.au>
> Date: Wed, 14 Nov 2007 04:36:24 +1100
>
> > On Wednesday 14 November 2007 12:58, David Miller wrote:
> > > I suspect the issue is about having a huge skb->data linear area for
> > > TCP sends over loopback. We're likely getting a much smaller
> > > skb->data linear data area after the patch in question, the rest using
> > > the sk_buff scatterlist pages which are a little bit more expensive to
> > > process.
> >
> > It didn't seem to be noticeable at 1 client. Unless scatterlist
> > processing is going to cause cacheline bouncing, I don't see why this
> > hurts more as you add CPUs?
>
> Is your test system using HIGHMEM?
>
> That's one thing the page vector in the sk_buff can generate a lot of:
> kmaps.
No, it's an x86-64, so no highmem.
What's also interesting is that SLAB apparently doesn't have this
condition. The first thing that sprung to mind is that SLAB caches
order > 0 allocations, while SLUB does not. However if anything,
that should actually favour the SLUB numbers if network is avoiding
order > 0 allocations.
I'm doing some oprofile runs now to see if I can get any more info.
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 6:37 ` David Miller
@ 2007-11-13 22:27 ` Nick Piggin
2007-11-13 22:55 ` Nick Piggin
2007-11-14 11:10 ` David Miller
0 siblings, 2 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 22:27 UTC (permalink / raw)
To: David Miller; +Cc: clameter, netdev, herbert, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 4941 bytes --]
On Wednesday 14 November 2007 17:37, David Miller wrote:
> From: Nick Piggin <nickpiggin@yahoo.com.au>
> > I'm doing some oprofile runs now to see if I can get any more info.
OK, in vanilla kernels, the page allocator definitely shows higher
in the results (than with Herbert's patch reverted).
27516 2.7217 get_page_from_freelist
21677 2.1442 __rmqueue_smallest
20513 2.0290 __free_pages_ok
18725 1.8522 get_pageblock_flags_group
Just these account for nearly 10% of cycles. __alloc_skb shows up
higher too. free_hot_cold_page() shows up a lot lower though, which
might indicate that there is actually more higher-order allocation
activity (I'll check that next).
****
SLUB, avg throughput 1548
CPU: AMD64 family10, speed 1900 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 100000
samples % symbol name
94636 9.3609 copy_user_generic_string
38932 3.8509 ipt_do_table
34746 3.4369 tcp_v4_rcv
29539 2.9218 skb_release_data
27516 2.7217 get_page_from_freelist
26046 2.5763 tcp_sendmsg
24482 2.4216 local_bh_enable
22910 2.2661 ip_queue_xmit
22113 2.1873 ktime_get
21677 2.1442 __rmqueue_smallest
20513 2.0290 __free_pages_ok
18725 1.8522 get_pageblock_flags_group
18580 1.8378 tcp_recvmsg
18108 1.7911 __napi_schedule
17593 1.7402 schedule
16998 1.6813 tcp_ack
16102 1.5927 dev_hard_start_xmit
15751 1.5580 system_call
15707 1.5536 net_rx_action
15150 1.4986 __switch_to
14988 1.4825 tcp_transmit_skb
13921 1.3770 kmem_cache_free
13398 1.3253 __mod_timer
13243 1.3099 tcp_rcv_established
13109 1.2967 __tcp_select_window
11022 1.0902 __tcp_push_pending_frames
10732 1.0615 set_normalized_timespec
10561 1.0446 netif_rx
8840 0.8744 netif_receive_skb
7816 0.7731 nf_iterate
7300 0.7221 __update_rq_clock
6683 0.6610 _read_lock_bh
6504 0.6433 sys_recvfrom
6283 0.6215 nf_hook_slow
6188 0.6121 release_sock
6172 0.6105 loopback_xmit
5927 0.5863 __alloc_skb
5707 0.5645 tcp_cleanup_rbuf
5538 0.5478 tcp_event_data_recv
5517 0.5457 tcp_v4_do_rcv
5516 0.5456 process_backlog
****
SLUB, net patch reverted. Avg throughput 1933MB/s
Counted CPU_CLK_UNHALTED , count 100000
samples % symbol name
95895 9.5094 copy_user_generic_string
50259 4.9839 ipt_do_table
39408 3.9079 skb_release_data
37296 3.6984 tcp_v4_rcv
31309 3.1047 ip_queue_xmit
31308 3.1046 local_bh_enable
24052 2.3851 net_rx_action
23786 2.3587 __napi_schedule
21426 2.1247 tcp_recvmsg
21075 2.0899 schedule
20938 2.0763 dev_hard_start_xmit
20222 2.0053 tcp_sendmsg
19775 1.9610 tcp_ack
19717 1.9552 system_call
19495 1.9332 set_normalized_timespec
18584 1.8429 __switch_to
17022 1.6880 tcp_rcv_established
14655 1.4533 tcp_transmit_skb
14466 1.4345 __mod_timer
13820 1.3705 loopback_xmit
13776 1.3661 get_page_from_freelist
13288 1.3177 netif_receive_skb
9718 0.9637 _read_lock_bh
9625 0.9545 nf_iterate
9440 0.9361 netif_rx
9148 0.9072 free_hot_cold_page
8633 0.8561 __update_rq_clock
7668 0.7604 sys_recvfrom
7578 0.7515 __tcp_push_pending_frames
7311 0.7250 find_pid_ns
7178 0.7118 nf_hook_slow
6655 0.6599 sysret_check
6313 0.6260 release_sock
6290 0.6237 tcp_cleanup_rbuf
6263 0.6211 __tcp_select_window
6235 0.6183 process_backlog
5920 0.5871 ip_local_deliver_finish
5651 0.5604 ip_rcv
5239 0.5195 ip_finish_output
5058 0.5016 kmem_cache_free
5016 0.4974 thread_return
> Here are some other things you can play around with:
>
> 1) Monitor the values of skb->len and skb->data_len for packets
> going over loopback.
OK, I've taken a log2 histogram of ->len and ->data_len sizes.
I'll attach the plots (x axis is the byte value of the most significant
bit set, y axis is the number of occurrences, logscale on X so 0 isn't
there :( ). If you want to see the patch or raw data, let me know.
len looks very similar for both kernels, although the reverted
kernel has significantly higher frequency at most points. tbench
unfortunately can only be run time-based, so this is roughly
expected.
data_len has a spike that shifts up to 512-1023, from 256-511,
when reverting Herbert's patch. Again, I believe the magnitudes
of the spikes should actually be pretty close if we were doing
an equal amount of work.
I can't see that these numbers show much useful, unfortunately.
> 2) Try removing NETIF_F_SG in drivers/net/loopback.c's dev->features
> setting.
Will try that now.
[-- Attachment #2: data.png --]
[-- Type: image/png, Size: 3622 bytes --]
[-- Attachment #3: len.png --]
[-- Type: image/png, Size: 4828 bytes --]
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-13 22:27 ` Nick Piggin
@ 2007-11-13 22:55 ` Nick Piggin
2007-11-14 11:10 ` David Miller
1 sibling, 0 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 22:55 UTC (permalink / raw)
To: David Miller; +Cc: clameter, netdev, herbert, linux-kernel
On Wednesday 14 November 2007 09:27, Nick Piggin wrote:
> > 2) Try removing NETIF_F_SG in drivers/net/loopback.c's dev->features
> > setting.
>
> Will try that now.
Doesn't help (with vanilla kernel -- Herbert's patch applied).
data_len histogram drops to 0 and goes to len (I guess that's not
surprising).
Performance is pretty similar (ie. not good).
I'll look at allocator patterns next.
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 11:10 ` David Miller
@ 2007-11-13 23:39 ` Nick Piggin
2007-11-14 11:48 ` Herbert Xu
1 sibling, 0 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-13 23:39 UTC (permalink / raw)
To: David Miller; +Cc: clameter, netdev, herbert, linux-kernel
On Wednesday 14 November 2007 22:10, David Miller wrote:
> From: Nick Piggin <nickpiggin@yahoo.com.au>
> Date: Wed, 14 Nov 2007 09:27:39 +1100
>
> > OK, in vanilla kernels, the page allocator definitely shows higher
> > in the results (than with Herbert's patch reverted).
>
> ...
>
> > I can't see that these numbers show much useful, unfortunately.
>
> Thanks for all of this data Nick.
>
> So the thing that's being affected here in TCP is
> net/ipv4/tcp.c:select_size(), specifically the else branch:
>
> int tmp = tp->mss_cache;
> ...
> else {
> int pgbreak = SKB_MAX_HEAD(MAX_TCP_HEADER);
>
> if (tmp >= pgbreak &&
> tmp <= pgbreak + (MAX_SKB_FRAGS - 1) * PAGE_SIZE)
> tmp = pgbreak;
> }
>
> This is deciding, in 'tmp', how much linear sk_buff space to
> allocate. 'tmp' is initially set to the path MSS, which
> for loopback is 16K - the space necessary for packet headers.
>
> The SKB_MAX_HEAD() value has changed as a result of Herbert's
> bug fix. I suspect this 'if' test is passing both with and
> without the patch.
>
> But pgbreak is now smaller, and thus the skb->data linear
> data area size we choose to use is smaller as well.
OK, that makes sense. BTW, are you taking advantage of kmalloc's
"quantization" into slabs WRT the linear data area? I wonder if
it would be at all useful...
> You can test if this is precisely what is causing the performance
> regression by using the old calculation just here in select_size().
>
> Add something like this local to net/ipv4/tcp.c:
>
> #define OLD_SKB_WITH_OVERHEAD(X) \
> (((X) - sizeof(struct skb_shared_info)) & \
> ~(SMP_CACHE_BYTES - 1))
> #define OLD_SKB_MAX_ORDER(X, ORDER) \
> OLD_SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
> #define OLD_SKB_MAX_HEAD(X) (OLD_SKB_MAX_ORDER((X), 0))
>
> And then use OLD_SKB_MAX_HEAD() in select_size().
That brings performance back up!
I wonder why it isn't causing a problem for SLAB...
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 11:48 ` Herbert Xu
@ 2007-11-14 0:02 ` Nick Piggin
2007-11-14 12:10 ` David Miller
2007-11-14 23:46 ` David Miller
1 sibling, 1 reply; 29+ messages in thread
From: Nick Piggin @ 2007-11-14 0:02 UTC (permalink / raw)
To: Herbert Xu; +Cc: David Miller, clameter, netdev, linux-kernel
On Wednesday 14 November 2007 22:48, Herbert Xu wrote:
> On Wed, Nov 14, 2007 at 03:10:22AM -0800, David Miller wrote:
> > So the thing that's being affected here in TCP is
> > net/ipv4/tcp.c:select_size(), specifically the else branch:
>
> Thanks for the pointer. Indeed there is a bug in that area.
> I'm not sure whether it's causing the problem at hand but it's
> certainly suboptimal.
>
> [TCP]: Fix size calculation in sk_stream_alloc_pskb
This looks like it fixes the problem!
Still interested to know why SLAB didn't see the same thing...
> We round up the header size in sk_stream_alloc_pskb so that
> TSO packets get zero tail room. Unfortunately this rounding
> up is not coordinated with the select_size() function used by
> TCP to calculate the second parameter of sk_stream_alloc_pskb.
>
> As a result, we may allocate more than a page of data in the
> non-TSO case when exactly one page is desired.
>
> In fact, rounding up the head room is detrimental in the non-TSO
> case because it makes memory that would otherwise be available to
> the payload head room. TSO doesn't need this either, all it wants
> is the guarantee that there is no tail room.
>
> So this patch fixes this by adjusting the skb_reserve call so that
> exactly the requested amount (which all callers have calculated in
> a precise way) is made available as tail room.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Cheers,
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-13 11:41 ` Nick Piggin
@ 2007-11-14 1:58 ` David Miller
2007-11-13 17:36 ` Nick Piggin
0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2007-11-14 1:58 UTC (permalink / raw)
To: nickpiggin; +Cc: clameter, netdev, herbert, linux-kernel
From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Tue, 13 Nov 2007 22:41:58 +1100
> On Tuesday 13 November 2007 06:44, Christoph Lameter wrote:
> > On Sat, 10 Nov 2007, Nick Piggin wrote:
> > > BTW. your size-2048 kmalloc cache is order-1 in the default setup,
> > > whereas kmalloc(1024) or kmalloc(4096) will be order-0 allocations. And
> > > SLAB also uses order-0 for size-2048. It would be nice if SLUB did the
> > > same...
> >
> > You can try to see the effect that order 0 would have by booting with
> >
> > slub_max_order=0
>
> Yeah, that didn't help much, but in general I think it would give
> more consistent and reliable behaviour from slub.
Just a note that I'm not ignoring this issue, I just don't have time
to get to it yet.
I suspect the issue is about having a huge skb->data linear area for
TCP sends over loopback. We're likely getting a much smaller
skb->data linear data area after the patch in question, the rest using
the sk_buff scatterlist pages which are a little bit more expensive to
process.
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-13 17:36 ` Nick Piggin
@ 2007-11-14 6:12 ` David Miller
2007-11-13 18:14 ` Nick Piggin
0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2007-11-14 6:12 UTC (permalink / raw)
To: nickpiggin; +Cc: clameter, netdev, herbert, linux-kernel
From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Wed, 14 Nov 2007 04:36:24 +1100
> On Wednesday 14 November 2007 12:58, David Miller wrote:
> > I suspect the issue is about having a huge skb->data linear area for
> > TCP sends over loopback. We're likely getting a much smaller
> > skb->data linear data area after the patch in question, the rest using
> > the sk_buff scatterlist pages which are a little bit more expensive to
> > process.
>
> It didn't seem to be noticeable at 1 client. Unless scatterlist
> processing is going to cause cacheline bouncing, I don't see why this
> hurts more as you add CPUs?
Is your test system using HIGHMEM?
That's one thing the page vector in the sk_buff can cause a lot of:
kmaps.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-13 18:14 ` Nick Piggin
@ 2007-11-14 6:37 ` David Miller
2007-11-13 22:27 ` Nick Piggin
0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2007-11-14 6:37 UTC (permalink / raw)
To: nickpiggin; +Cc: clameter, netdev, herbert, linux-kernel
From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Wed, 14 Nov 2007 05:14:27 +1100
> On Wednesday 14 November 2007 17:12, David Miller wrote:
> > Is your test system using HIGHMEM?
> >
> > That's one thing the page vector in the sk_buff can do a lot,
> > kmaps.
>
> No, it's an x86-64, so no highmem.
Ok.
> What's also interesting is that SLAB apparently doesn't have this
> condition. The first thing that sprang to mind is that SLAB caches
> order > 0 allocations, while SLUB does not. However if anything,
> that should actually favour the SLUB numbers if network is avoiding
> order > 0 allocations.
>
> I'm doing some oprofile runs now to see if I can get any more info.
Here are some other things you can play around with:
1) Monitor the values of skb->len and skb->data_len for packets
going over loopback.
2) Try removing NETIF_F_SG in drivers/net/loopback.c's dev->features
setting.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-13 22:27 ` Nick Piggin
2007-11-13 22:55 ` Nick Piggin
@ 2007-11-14 11:10 ` David Miller
2007-11-13 23:39 ` Nick Piggin
2007-11-14 11:48 ` Herbert Xu
1 sibling, 2 replies; 29+ messages in thread
From: David Miller @ 2007-11-14 11:10 UTC (permalink / raw)
To: nickpiggin; +Cc: clameter, netdev, herbert, linux-kernel
From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Wed, 14 Nov 2007 09:27:39 +1100
> OK, in vanilla kernels, the page allocator definitely shows higher
> in the results (than with Herbert's patch reverted).
...
> I can't see that these numbers show much useful, unfortunately.
Thanks for all of this data Nick.
So the thing that's being affected here in TCP is
net/ipv4/tcp.c:select_size(), specifically the else branch:
int tmp = tp->mss_cache;
...
else {
int pgbreak = SKB_MAX_HEAD(MAX_TCP_HEADER);
if (tmp >= pgbreak &&
tmp <= pgbreak + (MAX_SKB_FRAGS - 1) * PAGE_SIZE)
tmp = pgbreak;
}
This is deciding, in 'tmp', how much linear sk_buff space to
allocate. 'tmp' is initially set to the path MSS, which
for loopback is 16K - the space necessary for packet headers.
The SKB_MAX_HEAD() value has changed as a result of Herbert's
bug fix. I suspect this 'if' test is passing both with and
without the patch.
But pgbreak is now smaller, and thus the skb->data linear
data area size we choose to use is smaller as well.
You can test if this is precisely what is causing the performance
regression by using the old calculation just here in select_size().
Add something like this local to net/ipv4/tcp.c:
#define OLD_SKB_WITH_OVERHEAD(X) \
(((X) - sizeof(struct skb_shared_info)) & \
~(SMP_CACHE_BYTES - 1))
#define OLD_SKB_MAX_ORDER(X, ORDER) \
OLD_SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
#define OLD_SKB_MAX_HEAD(X) (OLD_SKB_MAX_ORDER((X), 0))
And then use OLD_SKB_MAX_HEAD() in select_size().
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 11:10 ` David Miller
2007-11-13 23:39 ` Nick Piggin
@ 2007-11-14 11:48 ` Herbert Xu
2007-11-14 0:02 ` Nick Piggin
2007-11-14 23:46 ` David Miller
1 sibling, 2 replies; 29+ messages in thread
From: Herbert Xu @ 2007-11-14 11:48 UTC (permalink / raw)
To: David Miller; +Cc: nickpiggin, clameter, netdev, linux-kernel
On Wed, Nov 14, 2007 at 03:10:22AM -0800, David Miller wrote:
>
> So the thing that's being affected here in TCP is
> net/ipv4/tcp.c:select_size(), specifically the else branch:
Thanks for the pointer. Indeed there is a bug in that area.
I'm not sure whether it's causing the problem at hand but it's
certainly suboptimal.
[TCP]: Fix size calculation in sk_stream_alloc_pskb
We round up the header size in sk_stream_alloc_pskb so that
TSO packets get zero tail room. Unfortunately this rounding
up is not coordinated with the select_size() function used by
TCP to calculate the second parameter of sk_stream_alloc_pskb.
As a result, we may allocate more than a page of data in the
non-TSO case when exactly one page is desired.
In fact, rounding up the head room is detrimental in the non-TSO
case because it turns memory that would otherwise be available to
the payload into head room. TSO doesn't need this either; all it wants
is the guarantee that there is no tail room.
So this patch fixes this by adjusting the skb_reserve call so that
exactly the requested amount (which all callers have calculated in
a precise way) is made available as tail room.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/net/sock.h b/include/net/sock.h
index 5504fb9..567e468 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1235,14 +1235,16 @@ static inline struct sk_buff *sk_stream_alloc_pskb(struct sock *sk,
gfp_t gfp)
{
struct sk_buff *skb;
- int hdr_len;
- hdr_len = SKB_DATA_ALIGN(sk->sk_prot->max_header);
- skb = alloc_skb_fclone(size + hdr_len, gfp);
+ skb = alloc_skb_fclone(size + sk->sk_prot->max_header, gfp);
if (skb) {
skb->truesize += mem;
if (sk_stream_wmem_schedule(sk, skb->truesize)) {
- skb_reserve(skb, hdr_len);
+ /*
+ * Make sure that we have exactly size bytes
+ * available to the caller, no more, no less.
+ */
+ skb_reserve(skb, skb_tailroom(skb) - size);
return skb;
}
__kfree_skb(skb);
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 0:02 ` Nick Piggin
@ 2007-11-14 12:10 ` David Miller
2007-11-14 18:33 ` Christoph Lameter
0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2007-11-14 12:10 UTC (permalink / raw)
To: nickpiggin; +Cc: herbert, clameter, netdev, linux-kernel
From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Wed, 14 Nov 2007 11:02:11 +1100
> On Wednesday 14 November 2007 22:48, Herbert Xu wrote:
> > On Wed, Nov 14, 2007 at 03:10:22AM -0800, David Miller wrote:
> > > So the thing that's being affected here in TCP is
> > > net/ipv4/tcp.c:select_size(), specifically the else branch:
> >
> > Thanks for the pointer. Indeed there is a bug in that area.
> > I'm not sure whether it's causing the problem at hand but it's
> > certainly suboptimal.
> >
> > [TCP]: Fix size calculation in sk_stream_alloc_pskb
>
> This looks like it fixes the problem!
Great, thanks for testing. I'll apply Herbert's patch tomorrow
as I've already sent Linus a bug fix pull request tonight.
> Still interested to know why SLAB didn't see the same thing...
Yes, I wonder why too. I bet objects just got packed differently.
There is this fugly "LOOPBACK_OVERHEAD" macro define in
drivers/net/loopback.c that is trying to figure out the
various overheads that we should subtract from the loopback
MTU we use by default.
It's almost guaranteed to be wrong for the way the allocators
work now.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 12:10 ` David Miller
@ 2007-11-14 18:33 ` Christoph Lameter
0 siblings, 0 replies; 29+ messages in thread
From: Christoph Lameter @ 2007-11-14 18:33 UTC (permalink / raw)
To: David Miller; +Cc: nickpiggin, herbert, netdev, linux-kernel
On Wed, 14 Nov 2007, David Miller wrote:
> > Still interested to know why SLAB didn't see the same thing...
>
> Yes, I wonder why too. I bet objects just got packed differently.
The objects are packed tightly in SLUB, and SLUB can allocate smaller
objects (the SLUB minimum is 8 bytes; the SLAB minimum is 32).
On free, a SLUB object goes directly back to the slab it came from.
We have no queues in SLUB, so we use the first word of the object as a
free pointer. In SLAB the objects first go onto queues and are drained
into the slab later. On free, SLAB usually has no need to touch the
object itself: the object pointer is simply moved onto the queue (this
works well on SMP; on NUMA we have overhead identifying the right queue
and overhead due to the number of queues needed).
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 11:48 ` Herbert Xu
2007-11-14 0:02 ` Nick Piggin
@ 2007-11-14 23:46 ` David Miller
2007-11-15 0:21 ` Nick Piggin
2007-11-15 1:03 ` Christoph Lameter
1 sibling, 2 replies; 29+ messages in thread
From: David Miller @ 2007-11-14 23:46 UTC (permalink / raw)
To: herbert; +Cc: nickpiggin, clameter, netdev, linux-kernel
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 14 Nov 2007 19:48:44 +0800
> [TCP]: Fix size calculation in sk_stream_alloc_pskb
>
> We round up the header size in sk_stream_alloc_pskb so that
> TSO packets get zero tail room. Unfortunately this rounding
> up is not coordinated with the select_size() function used by
> TCP to calculate the second parameter of sk_stream_alloc_pskb.
>
> As a result, we may allocate more than a page of data in the
> non-TSO case when exactly one page is desired.
>
> In fact, rounding up the head room is detrimental in the non-TSO
> case because it turns memory that would otherwise be available to
> the payload into head room. TSO doesn't need this either; all it wants
> is the guarantee that there is no tail room.
>
> So this patch fixes this by adjusting the skb_reserve call so that
> exactly the requested amount (which all callers have calculated in
> a precise way) is made available as tail room.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Applied and I'll queue it up for -stable too.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 23:46 ` David Miller
@ 2007-11-15 0:21 ` Nick Piggin
2007-11-15 0:27 ` David Miller
2007-11-15 1:03 ` Christoph Lameter
1 sibling, 1 reply; 29+ messages in thread
From: Nick Piggin @ 2007-11-15 0:21 UTC (permalink / raw)
To: David Miller; +Cc: herbert, clameter, netdev, linux-kernel
On Thursday 15 November 2007 10:46, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Wed, 14 Nov 2007 19:48:44 +0800
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Applied and I'll queue it up for -stable too.
Good result. Thanks, everyone!
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-15 0:21 ` Nick Piggin
@ 2007-11-15 0:27 ` David Miller
0 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2007-11-15 0:27 UTC (permalink / raw)
To: nickpiggin; +Cc: herbert, clameter, netdev, linux-kernel
From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Thu, 15 Nov 2007 11:21:36 +1100
> On Thursday 15 November 2007 10:46, David Miller wrote:
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Wed, 14 Nov 2007 19:48:44 +0800
>
> > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> >
> > Applied and I'll queue it up for -stable too.
>
> Good result. Thanks, everyone!
This case is a good example to use the next time a stupid thread
starts up about bug reports not being looked into. To me it seems
clearly more a matter of the quality of the bug report.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-14 23:46 ` David Miller
2007-11-15 0:21 ` Nick Piggin
@ 2007-11-15 1:03 ` Christoph Lameter
2007-11-15 1:11 ` Herbert Xu
1 sibling, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2007-11-15 1:03 UTC (permalink / raw)
To: David Miller; +Cc: herbert, nickpiggin, netdev, linux-kernel
On Wed, 14 Nov 2007, David Miller wrote:
> > As a result, we may allocate more than a page of data in the
> > non-TSO case when exactly one page is desired.
Well this is likely the result of the SLUB regression. If you allocate an
order-1 page then the zone locks need to be taken. SLAB queues up a
couple of higher-order pages and so can serve a couple of requests without
going into the page allocator, whereas SLUB has to go directly to the page
allocator for both allocation and free. I guess that needs fixing in the
page allocator. Or do I need to add a mechanism to buffer higher-order
page allocations in SLUB?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-15 1:03 ` Christoph Lameter
@ 2007-11-15 1:11 ` Herbert Xu
2007-11-15 1:47 ` Nick Piggin
0 siblings, 1 reply; 29+ messages in thread
From: Herbert Xu @ 2007-11-15 1:11 UTC (permalink / raw)
To: Christoph Lameter; +Cc: David Miller, nickpiggin, netdev, linux-kernel
On Wed, Nov 14, 2007 at 05:03:25PM -0800, Christoph Lameter wrote:
>
> Well this is likely the result of the SLUB regression. If you allocate an
> order-1 page then the zone locks need to be taken. SLAB queues up a
> couple of higher-order pages and so can serve a couple of requests without
> going into the page allocator, whereas SLUB has to go directly to the page
> allocator for both allocation and free. I guess that needs fixing in the
> page allocator. Or do I need to add a mechanism to buffer higher-order
> page allocations in SLUB?
Actually this serves to discourage people from using high-order
allocations which IMHO is a good thing :)
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench
2007-11-15 1:11 ` Herbert Xu
@ 2007-11-15 1:47 ` Nick Piggin
0 siblings, 0 replies; 29+ messages in thread
From: Nick Piggin @ 2007-11-15 1:47 UTC (permalink / raw)
To: Herbert Xu; +Cc: Christoph Lameter, David Miller, netdev, linux-kernel
On Thursday 15 November 2007 12:11, Herbert Xu wrote:
> On Wed, Nov 14, 2007 at 05:03:25PM -0800, Christoph Lameter wrote:
> > Well this is likely the result of the SLUB regression. If you allocate an
> > order-1 page then the zone locks need to be taken. SLAB queues up a
Yeah, it appears this is what happened. The lockless page allocator
fastpath appears on the list and the slowpaths disappear after
Herbert's patches. SLAB is doing its own thing, so it avoids that
pitfall.
> > couple of higher-order pages and so can serve a couple of requests
> > without going into the page allocator, whereas SLUB has to go directly
> > to the page allocator for both allocation and free. I guess that needs
> > fixing in the page allocator. Or do I need to add a mechanism to buffer
> > higher-order page allocations in SLUB?
>
> Actually this serves to discourage people from using high-order
> allocations which IMHO is a good thing :)
Yeah I completely agree. The right fix is in the caller...
The bug / suboptimal allocation would not have been found in tcp
if not for this ;)
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2007-11-15 1:47 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-09 12:36 2.6.24-rc2 slab vs slob tbench numbers Nick Piggin
2007-11-09 15:15 ` Christoph Lameter
2007-11-09 17:49 ` Christoph Lameter
2007-11-09 23:46 ` 2.6.24-rc2: Network commit causes SLUB performance regression with tbench Christoph Lameter
2007-11-10 1:29 ` Nick Piggin
2007-11-10 3:28 ` Nick Piggin
2007-11-12 19:44 ` Christoph Lameter
2007-11-13 11:41 ` Nick Piggin
2007-11-14 1:58 ` David Miller
2007-11-13 17:36 ` Nick Piggin
2007-11-14 6:12 ` David Miller
2007-11-13 18:14 ` Nick Piggin
2007-11-14 6:37 ` David Miller
2007-11-13 22:27 ` Nick Piggin
2007-11-13 22:55 ` Nick Piggin
2007-11-14 11:10 ` David Miller
2007-11-13 23:39 ` Nick Piggin
2007-11-14 11:48 ` Herbert Xu
2007-11-14 0:02 ` Nick Piggin
2007-11-14 12:10 ` David Miller
2007-11-14 18:33 ` Christoph Lameter
2007-11-14 23:46 ` David Miller
2007-11-15 0:21 ` Nick Piggin
2007-11-15 0:27 ` David Miller
2007-11-15 1:03 ` Christoph Lameter
2007-11-15 1:11 ` Herbert Xu
2007-11-15 1:47 ` Nick Piggin
2007-11-12 20:13 ` 2.6.24-rc2 slab vs slob tbench numbers Matt Mackall
2007-11-13 11:44 ` Nick Piggin