From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Christoph Lameter <cl@linux.com>
Cc: Dave Chinner <dchinner@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Snitzer <snitzer@redhat.com>,
	Pekka Enberg <penberg@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	Alasdair G Kergon <agk@redhat.com>, Joe Thornber <ejt@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Heinz Mauelshagen <heinzm@redhat.com>,
	linux-mm <linux-mm@kvack.org>,
	brouer@redhat.com
Subject: Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
Date: Fri, 4 Sep 2015 11:10:38 +0200
Message-ID: <20150904111038.4a428b03@redhat.com>
In-Reply-To: <alpine.DEB.2.11.1509031113450.24411@east.gentwo.org>

On Thu, 3 Sep 2015 11:19:53 -0500 (CDT) Christoph Lameter <cl@linux.com> wrote:

> On Thu, 3 Sep 2015, Jesper Dangaard Brouer wrote:
> 
> > I'm buying into the problem of variable object lifetime sharing the
> > same slub.
> 
[...]
>
> > With the SLAB bulk free API I'm introducing, we can speedup slub
> > slowpath, by free several objects with a single cmpxchg_double, BUT
> > these objects need to belong to the same page.
> >  Thus, as Dave describe with merging, other users of the same size
> > objects might end up holding onto objects scattered across several
> > pages, which gives the bulk free less opportunities.
> 
> This happens regardless as far as I can tell. On boot up you may end up
> for a time in special situations where that is true.

That is true, which is also why the measurements below should be taken
with a grain of salt, as the benchmarking was done within 10 min of boot up.


> > That would be a technical argument for introducing a SLAB_NO_MERGE flag
> > per slab.  But I want to do some measurement before making any
> > decision. And it might be hard to show for my use-case of SKB free,
> > because SKB allocs will likely be dominating 256 bytes slab anyhow.

I'll give you some preliminary measurements on my patchset, which uses
the new SLAB bulk free API to free SKBs in the TX completion path of the
ixgbe NIC driver (function ixgbe_clean_tx_irq() will bulk free at most
32 SKBs).
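
To make the same-page constraint from the quoted text concrete, here is a
toy model in Python.  It is not kernel code: it simply assumes that one
slow-path free costs one page-level atomic update, that a bulk free needs
one such update per distinct page in the batch, and that 16 objects fit in
a page (an illustrative number, not a measured one).

```python
from collections import Counter


def free_ops(object_pages):
    """Count page-level updates needed to free a batch of objects.

    object_pages holds, for each object, an identifier of the page it
    lives on.  Toy model: freeing one object at a time costs one update
    per object, a bulk free costs one update per distinct page.
    """
    per_page = Counter(object_pages)
    return {"single": len(object_pages), "bulk": len(per_page)}


# 32 objects packed onto 2 pages (assumes 16 objects per page).
clustered = [i // 16 for i in range(32)]
# 32 objects, each on a different page, as can happen when another user
# of a merged cache holds on to the neighbouring objects.
scattered = list(range(32))

print(free_ops(clustered))  # {'single': 32, 'bulk': 2}
print(free_ops(scattered))  # {'single': 32, 'bulk': 32}
```

In this model the bulk free only pays off when the batch is concentrated
on few pages, which is the effect that merging might dilute.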

The basic test type is IPv4 forwarding on a single CPU (i7-4790K CPU @
4.00GHz), with the pktgen generator sending 14Mpps (using script
samples/pktgen/pktgen_sample03_burst_single_flow.sh).

Test setup notes
 * Kernel: 4.1.0-mmotm-2015-08-24-16-12+ #261 SMP
  - with patches "detached freelist" and Christoph's irqon/off fix.

Config /etc/sysctl.conf ::
 net/ipv4/conf/default/rp_filter = 0
 net/ipv4/conf/all/rp_filter = 0
 # Forwarding performance is affected by early demux
 net/ipv4/ip_early_demux = 0
 net.ipv4.ip_forward = 1

Setup::
 $ base_device_setup.sh ixgbe3
 $ base_device_setup.sh ixgbe4
 $ netfilter_unload_modules.sh ; netfilter_unload_modules.sh; rmmod nf_reject_ipv4
 $ ip neigh add 172.16.0.66 dev ixgbe4 lladdr 00:aa:aa:aa:aa:aa
 # GRO negatively affects forwarding performance (at least for the UDP test)
 $ ethtool -K ixgbe4 gro off tso off gso off
 $ ethtool -K ixgbe3 gro off tso off gso off

First I tested a non-patched kernel with/without "slab_nomerge".
 (Single CPU IP-forwarding of UDP packets)
 * Normal      : 2049166 pps
 * slab_nomerge: 2053440 pps
 * Diff: +4274pps and -1.02ns
 * Nanosec diff shows we are below the accuracy of the system

Thus, results are the same.
Using bulking changes the picture:

Bulk free of max 32 SKBs in ixgbe TX-DMA-completion:
 * Bulk-free32: 2091218 pps
 * Diff to "Normal" case above: +42052 pps and -9.81ns
 * Nanosec diff is significant (well above the accuracy level of the system)
 * Summary: Pretty nice improvement!

Same test with "slab_nomerge":
 * slab_nomerge: 2121703 pps
 * Diff to above: +30485 pps and -6.87 ns
 * Nanosec variation was up to 3ns within a testrun, so this 6ns diff is still valid
 * Summary: slab_nomerge did make a difference!

Total improvement is quite significant: +72537 pps and -16.68ns (+3.5%)
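
For reference, the nanosecond figures come from converting each pps
number into time per packet (10^9 / pps) and subtracting.  A small Python
sketch of that arithmetic, using the pps numbers reported above (the
function and variable names are only illustrative):

```python
def ns_per_packet(pps):
    """Time spent per packet in nanoseconds at a given packets/sec rate."""
    return 1e9 / pps


def compare(base_pps, new_pps):
    """Return (pps difference, ns-per-packet difference) of new vs base."""
    return new_pps - base_pps, ns_per_packet(new_pps) - ns_per_packet(base_pps)


normal = 2049166        # unpatched kernel
bulk_free32 = 2091218   # bulk free of max 32 SKBs
bulk_nomerge = 2121703  # bulk free plus slab_nomerge

for label, base, new in [
    ("bulk-free32 vs normal", normal, bulk_free32),
    ("slab_nomerge vs bulk-free32", bulk_free32, bulk_nomerge),
    ("total vs normal", normal, bulk_nomerge),
]:
    d_pps, d_ns = compare(base, new)
    print(f"{label}: {d_pps:+d} pps, {d_ns:+.2f} ns")
```

A negative ns value means less time is spent per packet.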

It is important to be critical about your own measurements.  What is
the real cause of this change?  Let's see what happens if we tune SLUB
per CPU structures to have more "room", instead of using "slab_nomerge".

Tuning::
  echo 256 > /sys/kernel/slab/skbuff_head_cache/cpu_partial
  echo 9   > /sys/kernel/slab/skbuff_head_cache/min_partial
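
To confirm that the values took effect, the same sysfs files can be read
back.  Below is a minimal Python sketch; the helper name is made up for
illustration, and it assumes a SLUB kernel that exposes these files under
/sys/kernel/slab.  It also reports the directory the cache name resolves
to, since as far as I know a merged cache shows up there as a symlink to
a shared directory, which would mean the tuning also applies to the other
caches merged with it.

```python
import os


def read_slub_tunables(cache, names=("cpu_partial", "min_partial"),
                       root="/sys/kernel/slab"):
    """Read SLUB tunables for one cache from sysfs.

    Returns the raw file contents as strings, plus the basename of the
    directory that the cache name resolves to after following symlinks.
    """
    cache_dir = os.path.join(root, cache)
    values = {}
    for name in names:
        with open(os.path.join(cache_dir, name)) as f:
            values[name] = f.read().strip()
    backing_dir = os.path.basename(os.path.realpath(cache_dir))
    return {"backing_dir": backing_dir, "values": values}
```

Calling read_slub_tunables("skbuff_head_cache") on the test machine should
then show the cpu_partial and min_partial values written above.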

Test with bulk-free32 and SLUB-tuning:
 * slub-tuned: 2110842 pps
 * Note this gets very close to "slab_nomerge"
  - 2121703 - 2110842 = 10861 pps
  - (1/2121703*10^9)-(1/2110842*10^9) = -2.42 ns
 * Nanosec diff around 2.5ns is not significant enough, call results the same

Thus, I could achieve the same performance results by tuning SLUB as I
could with "slab_nomerge".  Maybe the advantage from "slab_nomerge" was
just that I got my "own" per CPU structures, and thus implicitly more
per CPU memory for myself?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

