From: David Laight <David.Laight@ACULAB.COM>
To: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	"'greearb@candelatech.com'" <greearb@candelatech.com>,
	"'tglx@linutronix.de'" <tglx@linutronix.de>
Cc: "'tj@kernel.org'" <tj@kernel.org>,
	"'priikone@iki.fi'" <priikone@iki.fi>,
	"'peterz@infradead.org'" <peterz@infradead.org>
Subject: Softirq latencies causing lost ethernet packets
Date: Wed, 25 May 2022 09:01:54 +0000
Message-ID: <50c8042451454d8e907dd026ed5a3d53@AcuMS.aculab.com>

I've finally discovered why I'm getting a lot of lost ethernet
packets in one of my high packet rate tests (400k/sec short UDP).

The underlying problem is that the napi callbacks need to loop
in the softirq code.
For my test I need the cpu to be running at well over 50% 'softint'.
(And that is just for the ethernet receive, RPS is moving the IP/UDP
processing elsewhere.)

The problems are caused by this bit of code in __do_softirq():

        pending = local_softirq_pending();
        if (pending) {
                if (time_before(jiffies, end) && !need_resched() &&
                    --max_restart)
                        goto restart;

                wakeup_softirqd();
        }

Eric's c10d73671 changed it from:
        if (pending) {
                if (--max_restart)
                        goto restart;

                wakeup_softirqd();
        }

to
        if (pending) {
                if (time_before(jiffies, end) && !need_resched())
                        goto restart;

                wakeup_softirqd();
        }

Because just running the loop 10 times caused excessive latencies.

The good work was then undone by 34376a50f, which added the
'max_restart' check back (with its limit of 10) to avoid
an issue with stop_machine getting stuck (jiffies doesn't
increment).

This can (probably) be fixed by setting the limit to 1000.
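
To see how the three conditions interact, here is a small user-space
sketch (not kernel code). The struct softirq_sim and both function names
are made up for illustration; only the condition itself mirrors the one
quoted above. It assumes work is always pending and that each pass is far
shorter than a jiffy, so only the max_restart counter can end the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the restart check consults. */
struct softirq_sim {
	unsigned long now;	/* simulated jiffies */
	unsigned long end;	/* simulated deadline, in jiffies */
	bool need_resched;	/* simulated need_resched() result */
	int max_restart;	/* remaining restart budget */
};

/* Restart only while all three checks from the quoted condition pass. */
static bool should_restart(struct softirq_sim *s)
{
	/* wrap-safe "now < end", the same idea as time_before() */
	bool in_time = (long)(s->now - s->end) < 0;

	return in_time && !s->need_resched && --s->max_restart;
}

/*
 * Count the passes made when softirqs are always pending and the
 * simulated clock never advances (short passes, or stuck jiffies).
 */
static int count_passes(int max_restart)
{
	struct softirq_sim s = { .now = 0, .end = 2, .need_resched = false,
				 .max_restart = max_restart };
	int passes = 1;		/* the first pass always runs */

	assert(max_restart > 0);
	while (should_restart(&s))
		passes++;
	return passes;
}
```

With a limit of 10 the sketch stops after 10 passes even though the
deadline has not been reached; with 1000 it makes 1000 passes, which
still terminates if jiffies is stuck.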

However there is a separate issue with the need_resched() check.
In my tests this is stopping the softint/napi callbacks for
anything up to 9 milliseconds - more than enough to drop packets.
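
For a rough feel of the numbers, the sketch below works out how many
packets arrive during a stall and how many of those cannot fit in the
receive ring. The function names are hypothetical, and the ring size used
in the example afterwards is an assumed figure, not one taken from the
test.

```c
/* Packets arriving during a stall of 'stall_us' microseconds. */
static unsigned long packets_during_stall(unsigned long pkts_per_sec,
					  unsigned long stall_us)
{
	/* widen before multiplying so 32-bit longs cannot overflow */
	return (unsigned long)(((unsigned long long)pkts_per_sec * stall_us)
			       / 1000000ULL);
}

/* Packets that do not fit if the ring has 'ring_entries' free slots. */
static unsigned long packets_dropped(unsigned long arrived,
				     unsigned long ring_entries)
{
	return arrived > ring_entries ? arrived - ring_entries : 0;
}
```

At 400k packets/sec, packets_during_stall(400000, 9000) is 3600. With an
assumed 1024-entry ring, packets_dropped(3600, 1024) is 2576.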

The problem here is that the softirqd are low priority processes.
The application processes that receive the UDP all run under the
realtime scheduler (priority -51).
If the softint interrupts my RT process it is fine.
But the following sequence isn't:
 - softint runs on idle process.
 - RT process scheduled on the same cpu
 - __do_softirq() detects need_resched() and calls wakeup_softirqd()
 - scheduler switches from the idle to my RT process.
 - RT process runs for several milliseconds.
 - finally softirqd is scheduled

The softint is usually higher priority than any RT thread
(because it just steals the context).
But in the more unusual case of an RT process being scheduled
while the softint is active it suddenly becomes lower priority
than the RT process.

I'm not sure what the intended purpose of the need_resched() check is.
I think it was Eric's first thought for a limit, but he had to
add the jiffies test as well to avoid RCU stalls.

The jiffies test itself might be problematic.
It is fixed at 2 jiffies - 1ms to 2ms at 1000Hz.
I'm expecting the softint code to be running at (maybe) 80% cpu.
So that limit would need increasing.
There is a similar limit in the napi code - but that is configurable
(and, I think, just causes the softint code to loop).
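
The '1ms to 2ms' range depends on how far into the current tick the loop
starts. A sketch of that arithmetic, assuming the deadline is computed as
end = jiffies + limit and the loop stops once jiffies reaches end; the
function name and its parameters are hypothetical.

```c
#include <assert.h>

/*
 * Time in microseconds until jiffies reaches 'end', when the loop
 * starts 'offset_us' into the current tick and end = jiffies + limit.
 */
static unsigned long softirq_budget_us(unsigned long hz,
				       unsigned long limit_jiffies,
				       unsigned long offset_us)
{
	unsigned long tick_us;

	assert(hz > 0 && limit_jiffies >= 1);
	tick_us = 1000000UL / hz;
	assert(offset_us < tick_us);

	/* the partly used current tick counts as one of the 'limit' ticks */
	return limit_jiffies * tick_us - offset_us;
}
```

At 1000Hz with a limit of 2 jiffies this gives 2000us when the loop starts
right on a tick boundary and 1001us when it starts 999us into the tick.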

But if RCU stalls are a problem maybe the rcu read lock ought to
disable softints, so that the softint is run when the rcu lock
is released?

I did try setting the softirqd processes to a much higher priority
but that didn't seem to help - I didn't look exactly why.

While I could use processor affinities to stop the application's
RT threads running on the softint-heavy cpu, that is
difficult to arrange.
In any case the application can make use of the non-softint time
on those cpus.
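
For reference, the affinity approach mentioned above would look roughly
like this for a single thread. It is only a sketch: the helper names and
the idea of one known 'softint_cpu' are assumptions, and a real
application would have to do this for every RT thread and keep it in step
with the interrupt affinities.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sched.h>

/*
 * Fill 'set' with cpus 0..ncpus-1 except 'softint_cpu'.
 * Returns the number of cpus left in the set.
 */
static int build_mask_excluding(cpu_set_t *set, int ncpus, int softint_cpu)
{
	assert(set != NULL);
	CPU_ZERO(set);
	for (int cpu = 0; cpu < ncpus && cpu < CPU_SETSIZE; cpu++)
		if (cpu != softint_cpu)
			CPU_SET(cpu, set);
	return CPU_COUNT(set);
}

/* Restrict the calling thread; returns 0 on success, -1 on error. */
static int avoid_cpu(int ncpus, int softint_cpu)
{
	cpu_set_t set;

	if (build_mask_excluding(&set, ncpus, softint_cpu) == 0)
		return -1;	/* never apply an empty mask */
	return sched_setaffinity(0, sizeof(set), &set);
}
```

The cost is the one noted above: a thread restricted this way can no
longer use the non-softint time on the excluded cpu.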

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)



Thread overview: 3+ messages
2022-05-25  9:01 David Laight [this message]
2022-05-25 11:01 ` Softirq latencies causing lost ethernet packets Paolo Abeni
2022-05-25 12:00   ` Eric Dumazet
