* Loadavg accounting error on arm64
From: Mel Gorman @ 2020-11-16  9:10 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon
  Cc: Davidlohr Bueso, linux-arm-kernel, linux-kernel

Hi,

I got cc'd on an internal bug report filed against 5.8 and 5.9 kernels
saying that loadavg was "exploding" on arm64 on machines acting as build
servers. It happened on at least two different arm64 variants. That setup
is complex to replicate but fortunately the problem can be reproduced by
running hackbench-process-pipes while heavily overcommitting a machine with
96 logical CPUs and then checking if loadavg drops afterwards. With an
MMTests clone, I reproduced it as follows:

./run-mmtests.sh --config configs/config-workload-hackbench-process-pipes --no-monitor testrun; \
    for i in `seq 1 60`; do cat /proc/loadavg; sleep 60; done

Load should drop to 10 after about 10 minutes and it does on x86-64 but
remained at around 200+ on arm64.

The reproduction case simply hammers the case where a task can be
descheduling while also being woken by another task at the same time. It
takes a long time to run but it makes the problem very obvious. The
expectation is that, after hackbench has been running and saturating the
machine for a long time, loadavg drops back down once it finishes.

Commit dbfb089d360b ("sched: Fix loadavg accounting race") fixed a loadavg
accounting race in the generic case. Later it was documented why the
ordering of when p->sched_contributes_to_load is read/updated relative
to p->on_cpu matters. This is critical when a task is descheduling at the
same time it is being activated on another CPU. While the loads/stores
happen under the RQ lock, the RQ lock on its own does not give any
guarantees on the task state.
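
For illustration only, the release/acquire hand-off that this ordering
relies on can be sketched in user space with C11 atomics and pthreads.
This is not kernel code: struct fake_task, deschedule(), wake() and
run_handoff() are hypothetical names, and only the two field names are
borrowed from the text above. One thread plays the CPU that is
descheduling the task, the other plays the CPU that is waking it.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two task_struct fields discussed above. */
struct fake_task {
	bool sched_contributes_to_load;	/* plain field, published via on_cpu */
	atomic_int on_cpu;		/* 1 while the task is still running */
};

/* "Descheduling CPU": record the load contribution, then release on_cpu. */
static void *deschedule(void *arg)
{
	struct fake_task *p = arg;

	p->sched_contributes_to_load = true;
	/* Release: the store above is visible before on_cpu reads as 0. */
	atomic_store_explicit(&p->on_cpu, 0, memory_order_release);
	return NULL;
}

/* "Waking CPU": wait for on_cpu to clear, then read the published field. */
static bool wake(struct fake_task *p)
{
	/* Acquire: pairs with the release store in deschedule(). */
	while (atomic_load_explicit(&p->on_cpu, memory_order_acquire))
		;
	return p->sched_contributes_to_load;
}

/* Run one hand-off and return what the waker observed. */
static bool run_handoff(void)
{
	struct fake_task t = { .sched_contributes_to_load = false };
	pthread_t thr;
	bool seen;

	atomic_init(&t.on_cpu, 1);
	if (pthread_create(&thr, NULL, deschedule, &t) != 0)
		return false;
	seen = wake(&t);
	pthread_join(thr, NULL);
	return seen;
}
```

With the release/acquire pair in place the waker can only observe the
updated field. If both accesses to on_cpu were relaxed, a weakly ordered
machine would be allowed to let the waker see on_cpu == 0 and still read
the stale value.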

Over the weekend I convinced myself that it must be because the
implementations of smp_load_acquire and smp_store_release do not appear
to implement acquire/release semantics, because I didn't find anything
arm64-specific that was playing with p->state behind the scheduler's back
(I could have missed it if it was in an assembly portion as I can't
reliably read arm assembler). Similarly, it's not clear why the arm64
implementation does not call smp_acquire__after_ctrl_dep in the
smp_load_acquire implementation. Even when it was introduced, the arm64
implementation differed significantly, for non-obvious reasons, from the
arm implementation in terms of what barriers it used.

Unfortunately, making that work similarly to the arch-independent version
did not help, and it doesn't help that I know nothing about the arm64
memory model.
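
For illustration only, the two styles being compared can be sketched in
user space with C11 atomics. This assumes the arch-independent fallback
is built from a full barrier next to a plain access, and that arm64
instead attaches the ordering to the access itself (its load-acquire and
store-release instructions). All function names below are hypothetical
and this is not the kernel's code.

```c
#include <stdatomic.h>

/* Fence-based flavour: plain load followed by a full barrier. */
static inline int load_acquire_fence(atomic_int *p)
{
	int v = atomic_load_explicit(p, memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);
	return v;
}

/* Fence-based flavour: full barrier followed by a plain store. */
static inline void store_release_fence(atomic_int *p, int v)
{
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(p, v, memory_order_relaxed);
}

/* Instruction-based flavour: the load itself carries acquire ordering. */
static inline int load_acquire_insn(atomic_int *p)
{
	return atomic_load_explicit(p, memory_order_acquire);
}

/* Instruction-based flavour: the store itself carries release ordering. */
static inline void store_release_insn(atomic_int *p, int v)
{
	atomic_store_explicit(p, v, memory_order_release);
}
```

Both flavours provide at least acquire/release ordering, and the
fence-based one is stronger than required, so swapping one for the other
would not be expected to introduce or remove a correct acquire/release
pairing.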

I'll be looking again today to see if I can find a mistake in the ordering
for how sched_contributes_to_load is handled but again, the lack of
knowledge on the arm64 memory model means I'm a bit stuck and a second set
of eyes would be nice :(

-- 
Mel Gorman
SUSE Labs


