From: John Fastabend <john.fastabend@gmail.com>
To: Alexander Duyck <alexander.duyck@gmail.com>,
	Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev <netdev@vger.kernel.org>,
	Alexander Duyck <aduyck@mirantis.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	John Fastabend <john.r.fastabend@intel.com>
Subject: Re: [RFC] net: remove busylock
Date: Thu, 19 May 2016 21:49:42 -0700
Message-ID: <573E9766.7080105@gmail.com>
In-Reply-To: <CAKgT0Udd+8PtOfZzR3VqEGyUPLifEjDzK0x3cezxcN9Ltq_KzQ@mail.gmail.com>

On 16-05-19 01:39 PM, Alexander Duyck wrote:
> On Thu, May 19, 2016 at 12:35 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Thu, 2016-05-19 at 11:56 -0700, Eric Dumazet wrote:
>>
>>> Removing busylock helped in all cases I tested (at least on x86,
>>> as David pointed out).
>>>
>>> As I said, we need to revisit busylock now that spinlocks are different.
>>>
>>> In one case (20 concurrent UDP netperf), I even got a 500% increase.
>>>
>>> With busylock :
>>>
>>> lpaa5:~# sar -n DEV 4 4|grep eth0
>>> Average:         eth0     12.19 112797.12      1.95  37672.28      0.00      0.00      0.69
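
(For readers without the sar man page handy: the grep drops sar's
header line; the columns after IFACE are rxpck/s, txpck/s, rxkB/s,
txkB/s, rxcmp/s, txcmp/s and rxmcst/s, so this shows roughly
112.8 Kpps transmitted.)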
>>>
>>
>>
>> Hmpf, my sysctl logic was inverted.  No wonder these results made
>> little sense.
>>
>> Sorry for the noise.  At least we have a confirmed 8% gain with
>> this stuff ;)
> 
> Unfortunately I see a 21% regression when you place the qdisc under
> load from multiple CPUs/nodes.  In my testing I dropped from around
> 1.05 Mpps to around 827 Kpps when I removed the busylock.
> 
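For anyone reading along without the source handy: the busylock in
question is the secondary spinlock taken in __dev_xmit_skb() before
the qdisc root lock.  Simplified from net/core/dev.c of roughly that
era (the bypass and error paths are omitted), the pattern looks like
this:

static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
				 struct net_device *dev,
				 struct netdev_queue *txq)
{
	spinlock_t *root_lock = qdisc_lock(q);
	bool contended;
	int rc;

	/* Heuristic: make contended enqueuers serialize on a separate
	 * lock first, so the CPU running __qdisc_run() can re-take
	 * root_lock more often and keep draining the queue.
	 */
	contended = qdisc_is_running(q);
	if (unlikely(contended))
		spin_lock(&q->busylock);

	spin_lock(root_lock);
	rc = q->enqueue(skb, q) & NET_XMIT_MASK;
	if (qdisc_run_begin(q)) {
		if (unlikely(contended)) {
			spin_unlock(&q->busylock);
			contended = false;
		}
		__qdisc_run(q);
	}
	spin_unlock(root_lock);
	if (unlikely(contended))
		spin_unlock(&q->busylock);
	return rc;
}

The open question in this thread is whether that extra serialization
still pays for itself now that queued (MCS-style) spinlocks already
hand the lock to waiters in arrival order.
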
> My test setup is pretty straightforward and the results I have seen
> are consistent between runs.  I have a system with 32 threads / 16
> cores spread over 2 NUMA nodes.  I reduce the number of queues on a
> 10Gb/s NIC to 1.  I kill irqbalance and disable CPU power states.  I
> then start a quick "for" loop that will schedule one UDP_STREAM
> netperf session on each CPU using a small payload size like 32.
> 
> On a side note, if I move everything onto one node I can push about
> 2.4 Mpps and the busylock doesn't seem to really impact things, but if
> I go to both nodes then I see the performance regression as described
> above.  I was thinking about it and I don't think the MCS-type locks
> would have that much of an impact.  If anything, I think xmit_more
> probably has a bigger effect, given that it allows us to grab
> multiple frames with each fetch and thereby reduces the lock
> contention on the dequeue side.
> 
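The bulk dequeue Alex refers to lives in net/sched/sch_generic.c; a
simplified version of try_bulk_dequeue_skb() from around that time
(GSO segment accounting trimmed) shows the idea:

/* Once root_lock is held, pull as many extra packets as the BQL
 * budget allows and chain them into one list, so a single lock
 * acquisition is amortized over a whole xmit_more batch handed to
 * the driver.
 */
static void try_bulk_dequeue_skb(struct Qdisc *q, struct sk_buff *skb,
				 const struct netdev_queue *txq,
				 int *packets)
{
	int bytelimit = qdisc_avail_bulklimit(txq) - skb->len;

	while (bytelimit > 0) {
		struct sk_buff *nskb = q->dequeue(q);

		if (!nskb)
			break;

		bytelimit -= nskb->len;
		skb->next = nskb;
		skb = nskb;
		(*packets)++;
	}
	skb->next = NULL;
}
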
>>> Presumably it would help tremendously if the actual kfree_skb()
>>> was done after the qdisc lock is released, i.e. not from the
>>> qdisc->enqueue() method.
>>>
>>
>> This part is still valid.
>>
>> We could have per-cpu storage of one skb pointer, so that we do not
>> have to change all ->enqueue() prototypes.
> 
> I fully agree with that.
> 
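A minimal sketch of that per-cpu idea might look like the following;
the helper names and the single-slot policy are illustrative only,
not taken from any posted patch:

/* Hypothetical sketch: ->enqueue() parks a dropped skb in per-cpu
 * storage instead of freeing it under the qdisc root lock, and the
 * caller frees it once the lock has been released.  ->enqueue() runs
 * with BHs disabled and drops at most one skb, so one slot per CPU
 * suffices.
 */
static DEFINE_PER_CPU(struct sk_buff *, qdisc_deferred_drop);

static inline void qdisc_defer_drop(struct sk_buff *skb)
{
	/* Called from ->enqueue() with root_lock held. */
	__this_cpu_write(qdisc_deferred_drop, skb);
}

static inline void qdisc_flush_deferred_drop(void)
{
	/* Called from __dev_xmit_skb() after spin_unlock(root_lock). */
	struct sk_buff *skb = __this_cpu_read(qdisc_deferred_drop);

	if (unlikely(skb)) {
		__this_cpu_write(qdisc_deferred_drop, NULL);
		kfree_skb(skb);
	}
}
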
> One thought I had is that if we could move to a lockless dequeue path
> for the qdisc, then we could also get rid of the busylock.  I know
> there has been talk in the past about doing away with qdisc locks or
> changing the inner mechanics of the qdisc itself; I'm CCing Jesper
> and John for that reason.  Maybe it is time for us to start looking
> into that so we can clean out some of the old cruft we have in this
> path.
> 
> - Alex
> 

I plan to start looking at this again in June when I have some more
time, FWIW.  The last set of RFCs I sent out bypassed both the qdisc
lock and the busy poll lock.  I remember thinking this was a net win
at the time, but I only did very basic testing, e.g. firing up n
sessions of pktgen.

And since we are talking about cruft, I have always thought the
gso_skb requeue logic could be done away with as well.  As far as I
can tell it must be left over from some historic code that was
refactored or deleted in the pre-git days.  It would be much better,
I think (no data), to use byte queue limits or some other mechanism
to ensure the driver can consume the packet, rather than popping and
pushing skbs around.
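
For reference, the requeue path I mean sits at the front of
dequeue_skb(); roughly, simplified from net/sched/sch_generic.c of
that era:

/* A packet the driver could not accept is parked in q->gso_skb by
 * dev_requeue_skb() and retried here ahead of the normal ->dequeue().
 */
static struct sk_buff *dequeue_skb(struct Qdisc *q)
{
	struct sk_buff *skb = q->gso_skb;

	if (unlikely(skb)) {
		struct netdev_queue *txq;

		txq = skb_get_tx_queue(qdisc_dev(q), skb);
		/* Retry only if the tx queue has since been woken. */
		if (!netif_xmit_frozen_or_stopped(txq)) {
			q->gso_skb = NULL;
			q->q.qlen--;
		} else {
			skb = NULL;
		}
		return skb;
	}
	return q->dequeue(q);
}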

.John


Thread overview: 22+ messages
2016-05-19 17:08 [RFC] net: remove busylock Eric Dumazet
2016-05-19 18:03 ` Alexander Duyck
2016-05-19 18:41   ` Rick Jones
2016-05-19 18:56   ` Eric Dumazet
2016-05-19 19:35     ` Eric Dumazet
2016-05-19 20:39       ` Alexander Duyck
2016-05-20  4:49         ` John Fastabend [this message]
2016-05-20  4:56           ` Eric Dumazet
2016-05-20  7:29   ` Jesper Dangaard Brouer
2016-05-20 13:11     ` Eric Dumazet
2016-05-20 13:47       ` Eric Dumazet
2016-05-20 14:16         ` Eric Dumazet
2016-05-20 17:49           ` Jesper Dangaard Brouer
2016-05-20 21:32             ` Eric Dumazet
2016-05-23  9:50               ` Jesper Dangaard Brouer
2016-05-23 21:24                 ` [PATCH net] net_sched: avoid too many hrtimer_start() calls Eric Dumazet
2016-05-24 21:49                   ` David Miller
2016-05-24 13:50             ` [RFC] net: remove busylock David Laight
2016-05-24 14:37               ` Eric Dumazet
2016-05-20 16:01       ` John Fastabend
2016-05-19 18:12 ` David Miller
2016-05-19 18:44   ` Eric Dumazet
