From: Tom Herbert <therbert@google.com>
To: hadi@cyberus.ca
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	netdev@vger.kernel.org, robert@herjulf.net,
	David Miller <davem@davemloft.net>,
	Changli Gao <xiaosuo@gmail.com>, Andi Kleen <andi@firstfloor.org>
Subject: Re: rps perfomance WAS(Re: rps: question
Date: Wed, 14 Apr 2010 10:31:34 -0700
Message-ID: <t2p65634d661004141031xf80f62e7sb64362ea1ce10a1f@mail.gmail.com>
In-Reply-To: <1271245986.3943.55.camel@bigi>

The point of RPS is to increase parallelism, but the cost of that is
more overhead per packet.  If you are running a single flow, then
you'll see latency increase for that flow.  With more concurrent flows
the benefits of parallelism kick in and latency gets better; we've
seen the break-even point at around ten connections in our tests.  Also,
I don't think we've made the claim that RPS should generally perform
better than multi-queue; the primary motivation for RPS is to make
single-queue NICs give reasonable performance.
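
For reference, RPS is configured per receive queue through sysfs by
writing a CPU bitmask to rps_cpus. A minimal sketch of computing that
mask; the device and queue names here are only examples, not a specific
recommendation:

```python
# Sketch: compute the hex bitmask that
# /sys/class/net/<dev>/queues/rx-<n>/rps_cpus expects, given a set of
# CPU ids. Device and queue names below are illustrative.

def rps_cpu_mask(cpus):
    """Return the hex mask string for a collection of CPU ids."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

# Steer queue rx-0 of eth0 to CPUs 1-3 (mask 0xe):
print("echo %s > /sys/class/net/eth0/queues/rx-0/rps_cpus"
      % rps_cpu_mask([1, 2, 3]))
```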


On Wed, Apr 14, 2010 at 4:53 AM, jamal <hadi@cyberus.ca> wrote:
> Following up like promised:
>
> On Mon, 2010-02-08 at 10:09 -0500, jamal wrote:
>> On Sun, 2010-02-07 at 21:58 -0800, Tom Herbert wrote:
>>
>> > I don't have specific numbers, although we are using this on
>> > application doing forwarding and numbers seem in line with what we see
>> > for an end host.
>> >
>>
>> When I get the chance I will give it a run. I have access to an i7
>> somewhere. It seems like I need some specific NICs?
>
> I did step #0 last night on an i7 (single Nehalem). I think more than
> anything I was impressed by the Nehalem's excellent caching system.
> Robert, I am almost tempted to say skb recycling performance will be
> excellent on this machine, given that the cost of a cache miss is much
> lower than on previous-generation hardware.
>
> My test was simple: irq affinity on cpu0 (core 0) and rps redirection to
> cpu1 (core 1); I also tried redirecting to different SMT threads (aka CPUs)
> on different cores, with similar results. I baselined against no rps
> being used and against a kernel without any RPS config at all.
> [BTW, I had to hand-edit the .config since I couldn't do it from
> menuconfig (is there any reason for it to be so?)]
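
For reference, the setup described above amounts to two bitmask writes:
one pinning the NIC's IRQ, one steering rps. A minimal sketch; the IRQ
number and device name are placeholder assumptions (real values come
from /proc/interrupts and the NIC in use):

```python
# Sketch of the setup above: pin the NIC's IRQ to cpu0 and steer rps to
# cpu1, each via a CPU bitmask. The IRQ number (42) and device name
# (eth0) are hypothetical placeholders.

irq = 42                 # hypothetical IRQ of the sky2 NIC
cpu_irq, cpu_rps = 0, 1  # core taking interrupts vs. core doing rps work

irq_cmd = "echo %x > /proc/irq/%d/smp_affinity" % (1 << cpu_irq, irq)
rps_cmd = ("echo %x > /sys/class/net/eth0/queues/rx-0/rps_cpus"
           % (1 << cpu_rps))

print(irq_cmd)  # echo 1 > /proc/irq/42/smp_affinity
print(rps_cmd)  # echo 2 > /sys/class/net/eth0/queues/rx-0/rps_cpus
```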
>
> Traffic was sent from another machine into the i7 via an el-cheapo sky2
> (I don't know how poor this NIC is, but it seems to know how to do MSI,
> so it is probably capable of multiqueuing); the test was several sets of
> a plain ping first and then a ping -f (I will get more sophisticated in
> my next test, likely this weekend).
>
> Results:
> CPU utilization was about 20-30% higher in the case of rps. On cpu0, the
> cpu was being chewed heavily by sky2_poll, and on the redirected-to core
> it was always smp_call_function_single.
> Latency was consistently about 5 microseconds higher on average with rps:
> if I sent 1M ping -f packets, the run took on average 176 seconds
> without RPS and 181 seconds with RPS.
> Throughput didn't change, but this could be attributed to the low amount
> of data I was sending.
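
A quick check of the arithmetic behind the 5 microsecond figure, using
the totals quoted above:

```python
# Per-round-trip latency delta implied by the totals above:
# 1M flood pings, 176 s total without RPS vs 181 s with RPS.

packets = 1_000_000
secs_no_rps, secs_rps = 176.0, 181.0

# microseconds of added round-trip latency per packet
delta_us = (secs_rps - secs_no_rps) * 1e6 / packets
print(delta_us)  # 5.0
```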
> I observed that we were generating, on average, an IPI per packet, even
> with ping -f (I added an extra stat to record when we sent an IPI and
> counted it against the number of packets sent).
> In my opinion it is these IPIs that contribute the most to the latency,
> and I think it happens that the Nehalem is just highly improved in this
> area. I wish I had a more commonly used machine to test rps on.
> I expect that rps will perform worse on cheaper/older current hardware
> for the traffic characteristics I tested.
>
> On IPIs:
> Is anyone familiar with what is going on with Nehalem? Why is it this
> good? I expect things will get a lot nastier with other hardware like
> xeon based or even Nehalem with rps going across QPI.
> Here's why i think IPIs are bad, please correct me if i am wrong:
> - they are synchronous. i.e an IPI issuer has to wait for an ACK (which
> is in the form of an IPI).
> - data cache has to be synced to main memory
> - the instruction pipeline is flushed
> - what else did i miss? Andi?
>
> So my question to Tom, Eric and Changli, or anyone else who has been
> running RPS:
> What hardware did you use? Is there anyone using hardware older than,
> say, an AMD Opteron or Intel Nehalem?
>
> My impressions of rps so far:
> I think I may end up being impressed when I generate a lot more traffic,
> since the cost of the IPIs will be amortized.
> At this point multiqueue seems a much more attractive alternative, and
> multiqueue hardware seems to be a lot more commodity (price-point)
> than a Nehalem.
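
The amortization point can be made concrete with a toy model; the IPI
cost below is purely illustrative, not a measurement from these tests:

```python
# Toy model of IPI amortization (illustrative numbers, not measurements):
# if each IPI costs a fixed amount and covers a batch of packets, the
# per-packet share of that cost falls as batching improves.

IPI_COST_US = 2.0  # hypothetical cost of one IPI round, in microseconds

def per_packet_ipi_overhead(packets_per_ipi):
    return IPI_COST_US / packets_per_ipi

for batch in (1, 10, 100):
    print(batch, per_packet_ipi_overhead(batch))
# At one IPI per packet (the ping -f case above) the full cost is paid on
# every packet; at 100 packets per IPI it is a hundredth of that.
```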
>
> Plan:
> I still plan to attack the app space (write a basic UDP app that binds
> to one or more rps cpus and try blasting a lot of UDP traffic to see
> what happens); my step after that is to move to forwarding tests.
>
> cheers,
> jamal
>
>

