linux-kernel.vger.kernel.org archive mirror
From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"David S. Miller" <davem@davemloft.net>,
	Steven Rostedt <rostedt@goodmis.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH 4/5] netdev: implement infrastructure for threadable napi irq
Date: Thu, 16 Jun 2016 12:39:53 +0200	[thread overview]
Message-ID: <1466073593.4691.47.camel@redhat.com> (raw)
In-Reply-To: <CANn89iJKxctv5Yzn2ecBLAjh2RtGpjNC6M3dqKXieqaORZaGsw@mail.gmail.com>

On Wed, 2016-06-15 at 10:04 -0700, Eric Dumazet wrote:
> On Wed, Jun 15, 2016 at 9:42 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > On Wed, 2016-06-15 at 07:17 -0700, Eric Dumazet wrote:
> 
> >>
> >> I really appreciate the effort, but as I already said this is not going to work.
> >>
> >> Many NIC have 2 NAPI contexts per queue, one for TX, one for RX.
> >>
> >> Relying on CFS to switch from the two 'threads' you need in the one
> >> vCPU case will add latencies that your 'pure throughput UDP flood' is
> >> not able to detect.
> >
> > We have done TCP_RR tests with similar results: when the throughput is
> > (guest) cpu bounded and multiple flows are used, there is measurable
> > gain.
> 
> TCP_RR hardly triggers the problem I am mentioning.
> 
> You need a combination of different competing works. Both bulk and rpc like.
> 
> The important factor for RPC is P99 latency.
> 
> Look, the simple fact that mlx4 driver can dequeue 256 skb per TX napi poll
> and only 64 skbs in RX poll is problematic in some workloads, since
> this allows a queue to build up on RX rings.
> 
> >
> >> I was waiting a fix from Andy Lutomirski to be merged before sending
> >> my ksoftirqd fix, which will work and wont bring kernel bloat.
> >
> > We experimented that patch in this scenario, but it don't give
> > measurable gain, since the ksoftirqd threads still prevent the qemu
> > process from using 100% of any hypervisor's cores.
> 
> Not sure what you measured, but in my experiment, the user thread
> could finally get a fair share of the core, instead of 0%
> 
> Improvement was 100000 % or so.

We used a different setup to explicitly avoid the (guest) userspace
starvation issue: a guest with 2 vCPUs (or more) and a single queue. In
that configuration the scheduler moves the user-space processes to a
different vCPU from the one running the ksoftirqd thread, so they are
not starved.

In the hypervisor, with a vanilla kernel, the qemu process receives a
fair share of the CPU time, but considerably less than 100%, and its
throughput is bounded well below the theoretical maximum.

We tested your patch in the guest, in the hypervisor, and in both, with
the above scenario, and it doesn't change the throughput numbers much.
It does nicely fix the starvation issue on a single-core host, though,
so we are definitely in favor of it and are waiting for it to be
included.

> How are you making sure your thread uses say 1% of the core, and let
> 99% to the 'qemu' process exactly ?

We allow the irq thread to be migrated: the scheduler can move it to a
different (hypervisor) core according to the workload, so qemu can
avoid competing with other processes for a CPU entirely.

We are not using threaded irqs in the guest, only in the hypervisor.

> How the typical user will enable all this stuff exactly ?

A desktop host or a bare-metal server probably doesn't need/want it. A
hypervisor or a (small) router would probably enable irq threading on
all supported NICs; that could be managed by the tuned daemon or the
like with an appropriate profile.
Advanced users, including real-time-sensitive ones, can simply use the
procfs interface directly.
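
For illustration, assuming the runtime switch is exposed as a per-IRQ
"threaded" knob under /proc/irq/<N>/ (the exact path and name come from
the patch and are an assumption here), an administrator or a tuned
profile could do something like:

```shell
#!/bin/sh
# Hypothetical sketch: the per-IRQ "threaded" knob is an assumption;
# check the actual patch for the real procfs interface. Requires root.
# Find the IRQ number of the NIC's rx queue from /proc/interrupts.
irq=$(awk -F: '/eth0-rx-0/ { gsub(/ /, "", $1); print $1; exit }' /proc/interrupts)
# Switch that IRQ to a kernel thread at runtime, if the queue exists.
if [ -n "$irq" ]; then
    echo 1 > "/proc/irq/${irq}/threaded"
fi
```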

Kernels without IRQ_FORCED_THREADING are unaffected; kernels with it
can already change packet reception (and more) in a significant way via
the threadirqs boot parameter.
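
As a sketch of that existing mainline mechanism: with
CONFIG_IRQ_FORCED_THREADING=y, adding "threadirqs" to the kernel
command line (here via a grub-style config fragment; the surrounding
options are placeholders) force-threads the hardirq handlers:

```shell
# Append "threadirqs" to the kernel command line; with
# CONFIG_IRQ_FORCED_THREADING=y this runs each hardirq handler not
# marked IRQF_NO_THREAD in its own kernel thread.
GRUB_CMDLINE_LINUX="quiet threadirqs"
# Then regenerate the boot config, e.g.:
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```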

Paolo


Thread overview: 18+ messages
2016-06-15 13:42 [PATCH 0/5] genirq: threadable IRQ support Paolo Abeni
2016-06-15 13:42 ` [PATCH 1/5] genirq: implement support for runtime switch to threaded irqs Paolo Abeni
2016-06-15 14:50   ` kbuild test robot
2016-06-15 13:42 ` [PATCH 2/5] genirq: add flags for controlling the default threaded irq behavior Paolo Abeni
2016-06-15 13:42 ` [PATCH 3/5] sched/preempt: cond_resched_softirq() must check for softirq Paolo Abeni
2016-06-15 13:48   ` Peter Zijlstra
2016-06-15 14:00     ` Paolo Abeni
2016-06-15 13:42 ` [PATCH 4/5] netdev: implement infrastructure for threadable napi irq Paolo Abeni
2016-06-15 14:12   ` kbuild test robot
2016-06-15 14:17   ` Eric Dumazet
2016-06-15 14:21     ` Eric Dumazet
2016-06-15 16:42     ` Paolo Abeni
2016-06-15 17:04       ` Eric Dumazet
2016-06-16 10:39         ` Paolo Abeni [this message]
2016-06-16 11:19           ` Eric Dumazet
2016-06-16 12:03             ` Paolo Abeni
2016-06-16 16:55               ` Eric Dumazet
2016-06-15 13:42 ` [PATCH 5/5] ixgbe: add support for threadable rx irq Paolo Abeni
