From: Dmitry Vyukov <dvyukov@google.com>
To: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
luca abeni <luca.abeni@santannapisa.it>,
Thomas Gleixner <tglx@linutronix.de>,
Juri Lelli <juri.lelli@gmail.com>,
syzbot <syzbot+385468161961cee80c31@syzkaller.appspotmail.com>,
Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
LKML <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
nstange@suse.de, syzkaller-bugs@googlegroups.com,
henrik@austad.us,
Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>,
Claudio Scordino <claudio@evidence.eu.com>,
Daniel Bristot de Oliveira <bristot@redhat.com>
Subject: Re: INFO: rcu detected stall in do_idle
Date: Sat, 27 Oct 2018 12:16:52 +0100
Message-ID: <CACT4Y+Y3uvX+Zb9nG_sYfbRXoxD1XR-1PLyitwm-UQyjGv5r1A@mail.gmail.com>
In-Reply-To: <20181024120335.GE29272@localhost.localdomain>

On Wed, Oct 24, 2018 at 1:03 PM, Juri Lelli <juri.lelli@redhat.com> wrote:
>
> On 19/10/18 22:50, luca abeni wrote:
> > On Fri, 19 Oct 2018 13:39:42 +0200
> > Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > On Thu, Oct 18, 2018 at 01:08:11PM +0200, luca abeni wrote:
> > > > Ok, I see the issue now: the problem is that the "while
> > > > (dl_se->runtime <= 0)" loop is executed at replenishment time, but
> > > > the deadline should be postponed at enforcement time.
> > > >
> > > > > I mean: in update_curr_dl() we do:
> > > > >   dl_se->runtime -= scaled_delta_exec;
> > > > >   if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
> > > > >     ...
> > > > >     enqueue replenishment timer at dl_next_period(dl_se)
> > > > > But dl_next_period() is based on a "wrong" deadline!
> > > >
> > > >
> > > > I think that inserting a
> > > > >   while (dl_se->runtime <= -pi_se->dl_runtime) {
> > > > >     dl_se->deadline += pi_se->dl_period;
> > > > >     dl_se->runtime += pi_se->dl_runtime;
> > > > >   }
> > > > immediately after "dl_se->runtime -= scaled_delta_exec;" would fix
> > > > the problem, no?
> > >
> > > That certainly makes sense to me.
> >
> > Good; I'll try to work on this idea in the weekend.
>
> So, we (me and Luca) managed to spend some more time on this and found a
> few more things worth sharing. I'll try to summarize what we have got so
> far (including what was already discussed) because my impression is that
> each point might deserve a fix or at least consideration (just amazing
> how a simple random fuzzer thing can highlight all that :).

1. Fuzzing finds bugs in any code. Always.
   If code hasn't been fuzzed, it has bugs.
2. This fuzzer is not so simple ;)

> Apologies for the
> long email.
>
> Reproducer runs on a CONFIG_HZ=100, CONFIG_IRQ_TIME_ACCOUNTING kernel
> and does something like this (only the bits that seem to matter here)
>
> int main(void)
> {
>   [...]
>   [setup stuff at 0x2001d000]
>   syscall(__NR_perf_event_open, 0x2001d000, 0, -1, -1, 0);
>   *(uint32_t*)0x20000000 = 0;
>   *(uint32_t*)0x20000004 = 6;
>   *(uint64_t*)0x20000008 = 0;
>   *(uint32_t*)0x20000010 = 0;
>   *(uint32_t*)0x20000014 = 0;
>   *(uint64_t*)0x20000018 = 0x9917; <-- ~40us
>   *(uint64_t*)0x20000020 = 0xffff; <-- ~65us (~60% bandwidth)
>   *(uint64_t*)0x20000028 = 0;
>   syscall(__NR_sched_setattr, 0, 0x20000000, 0);
>   [busy loop]
>   return 0;
> }
>
> And this causes problems because the task is actually never throttled.
>
> Pain points:
>
> 1. Granularity of enforcement (at each tick) is huge compared with
> the task runtime. This makes starting the replenishment timer,
> when runtime is depleted, always fail (because the old deadline
> is way in the past). So, the task is fully replenished and put
> back to run.
>
> - Luca's proposal should help here, since the deadline is postponed
> at throttling time, and replenishment timer set to that (and it
> should be in the future)
>
> 1.1 Even if we fix 1. in a configuration like this, the task would
> still be able to run for ~10ms (worst case) and potentially starve
> other tasks. That may not seem too big an interval, but there
> might be other very short activities that might miss an occasion
> to run "quickly".
>
> - Might be fixed by imposing (via sysctl) reasonable defaults for
> minimum runtime (w.r.t. HZ, like HZ/2) and maximum for period
> (as also a very small bandwidth task can have a big runtime if
> period is big as well)
>
> (1.2) When runtime becomes very negative (because delta_exec was big)
> we seem to spend a lot of time inside the replenishment loop.
>
> - Not sure it's such a big problem, might need more profiling.
> Feeling is that once the other points will be addressed this
> won't matter anymore
>
> 2. This is related to the perf_event_open syscall the reproducer does
> before becoming DEADLINE and entering the busy loop. Enabling perf
> swevents generates a lot of hrtimer load that happens in the
> reproducer task context. Now, DEADLINE uses rq_clock() for setting
> deadlines, but rq_clock_task() for doing runtime enforcement.
> In a situation like this it seems that the amount of irq pressure
> becomes pretty big (I'm seeing this on kvm, real hw should maybe do
> better, pain point remains I guess), so rq_clock() and
> rq_clock_task() might become more and more skewed w.r.t. each other.
> Since rq_clock() is only used when setting absolute deadlines for
> the first time (or when resetting them in certain cases), after a
> bit the replenishment code will start to see postponed deadlines
> always in the past w.r.t. rq_clock(). And this brings us back to the
> fact that the task is never stopped, since it can't keep up with
> rq_clock().
>
> - Not sure yet how we want to address this [1]. We could use
> rq_clock() everywhere, but tasks might be penalized by irq
> pressure (theoretically this would mandate that irqs are
> explicitly accounted for I guess). I tried to use the skew between
> the two clocks to "fix" deadlines, but that puts us at risk of
> de-synchronizing userspace and kernel views of deadlines.
>
> 3. HRTICK is not started for new entities.
>
> - Already got a patch for it.
>
> This should be it, I hope. Luca (thanks a lot for your help), please
> add to this or correct me if I got something wrong.
>
> Thoughts?
>
> Best,
>
> - Juri
>
> 1 - https://elixir.bootlin.com/linux/latest/source/kernel/sched/deadline.c#L1162
>