linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/8] Implement call_rcu_lazy() and miscellaneous fixes
@ 2022-06-22 22:50 Joel Fernandes (Google)
  2022-06-22 22:50 ` [PATCH v2 1/1] context_tracking: Use arch_atomic_read() in __ct_state for KASAN Joel Fernandes (Google)
                   ` (9 more replies)
  0 siblings, 10 replies; 60+ messages in thread
From: Joel Fernandes (Google) @ 2022-06-22 22:50 UTC
  To: rcu
  Cc: linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10, frederic,
	paulmck, rostedt, vineeth, Joel Fernandes (Google)


Hello!
Please find the next, improved version of call_rcu_lazy() attached. The main
difference from the previous version is that it now uses bypass lists, and
thus handles rcu_barrier() and hotplug situations, with some small changes
to those parts.

I also don't see the TREE07 RCU stall from v1 anymore.

The numbers below are from v1 (testing on v2 is in progress). Rushikesh,
feel free to pull these patches into your tree. Just to note, you will also
need to pull the call_rcu_lazy() user patches from v1. I have dropped them
from this series so that it focuses on the feature code first.

Following are the power savings we see on top of RCU_NOCB_CPU on an Intel
platform. The observation is that, due to a 'trickle down' effect of RCU
callbacks, the system is very lightly loaded but constantly runs a few RCU
callbacks very often. This misleads the power management hardware into
treating the system as active when it is in fact idle.

For example, when the ChromeOS screen is off and the user is not doing
anything on the system, we can see big power savings.
Before:
Pk%pc10 = 72.13
PkgWatt = 0.58
CorWatt = 0.04

After:
Pk%pc10 = 81.28
PkgWatt = 0.41
CorWatt = 0.03
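
To put the numbers above in relative terms, here is a small sketch of the
arithmetic. The helper name pct_reduction() is made up for illustration, and
the reading of PkgWatt and CorWatt as package and core power in watts is an
assumption based on the column names (they look like turbostat columns).

```c
#include <assert.h>

/*
 * Relative reduction, in percent, when a measurement drops from
 * `before` to `after`. Hypothetical helper, not part of the series.
 */
static double pct_reduction(double before, double after)
{
	assert(before > 0.0);	/* avoid dividing by zero */
	return (before - after) / before * 100.0;
}
```

With the values quoted above, pct_reduction(0.58, 0.41) comes to roughly 29%
for PkgWatt and pct_reduction(0.04, 0.03) to roughly 25% for CorWatt, while
Pk%pc10 rises by 81.28 - 72.13 = 9.15 percentage points.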

Further, when the ChromeOS screen is ON but the system is idle or lightly
loaded, we can see that the display pipeline constantly queues RCU callbacks
due to the opening and closing of file descriptors associated with graphics
buffers. This is attributed to the file_free_rcu() path, which this patch
series also touches.

This patch series adds a simple but effective, lockless implementation of
RCU callback batching. On memory pressure, on timeout, or when the queue
grows too big, we initiate a flush of one or more per-CPU lists.
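
The batching policy described above can be sketched in userspace C roughly as
follows. This is only an illustration under stated assumptions, not the code
in this series: every name here (struct lazy_cb, struct lazy_queue,
lazy_enqueue() and so on) is hypothetical, the sketch is single-threaded and
so says nothing about the lockless part, and "flush" simply invokes the
callbacks, whereas in RCU a flushed callback would still have to wait for a
grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel's data structure. */
struct lazy_cb {
	struct lazy_cb *next;
	void (*func)(struct lazy_cb *cb);
};

struct lazy_queue {
	struct lazy_cb *head;
	struct lazy_cb **tail;		/* last ->next, or &head when empty */
	size_t len;
	unsigned long first_enqueue;	/* tick when the oldest entry was queued */
	size_t max_len;			/* flush once this many are pending */
	unsigned long max_age;		/* flush once the oldest is this old */
};

static void lazy_queue_init(struct lazy_queue *q, size_t max_len,
			    unsigned long max_age)
{
	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
	q->first_enqueue = 0;
	q->max_len = max_len;
	q->max_age = max_age;
}

/* Invoke all pending callbacks in FIFO order; return how many ran. */
static size_t lazy_queue_flush(struct lazy_queue *q)
{
	struct lazy_cb *cb = q->head;
	size_t n = 0;

	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
	while (cb) {
		struct lazy_cb *next = cb->next; /* func may free cb */

		cb->func(cb);
		cb = next;
		n++;
	}
	return n;
}

/* The three triggers: memory pressure, queue too big, or timeout. */
static bool lazy_queue_should_flush(const struct lazy_queue *q,
				    unsigned long now, bool mem_pressure)
{
	if (q->len == 0)
		return false;
	return mem_pressure || q->len >= q->max_len ||
	       now - q->first_enqueue >= q->max_age;
}

/* Queue cb; if a trigger fires, flush and return the number run, else 0. */
static size_t lazy_enqueue(struct lazy_queue *q, struct lazy_cb *cb,
			   unsigned long now, bool mem_pressure)
{
	assert(cb && cb->func);
	cb->next = NULL;
	if (q->len == 0)
		q->first_enqueue = now;
	*q->tail = cb;
	q->tail = &cb->next;
	q->len++;
	if (lazy_queue_should_flush(q, now, mem_pressure))
		return lazy_queue_flush(q);
	return 0;
}

/* Demo callback for illustration: just counts invocations. */
static int demo_ran;
static void demo_cb(struct lazy_cb *cb)
{
	(void)cb;
	demo_ran++;
}
```

The point of the sketch is that callbacks queued this way cost nothing until
one of the triggers fires, so an otherwise idle system is not woken for each
individual callback. A real implementation would also need a timer to catch
the timeout when no further callbacks arrive, which this sketch leaves out.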

Similar results can be achieved by increasing jiffies_till_first_fqs; however,
that also has the effect of slowing down RCU. In particular, I saw a huge
slowdown of the function graph tracer when increasing it.

One drawback of this series is that if another frequent RCU callback that is
not lazy creeps up in the future, it will again hurt power. However, I
believe identifying and fixing those is a more reasonable approach than slowing
RCU down for the whole system.

Disclaimer: I have intentionally not CC'd other subsystem maintainers (like
net, fs) to keep noise low, and will CC them in the future after 1 or 2 rounds
of review and agreement.

Joel Fernandes (Google) (7):
  rcu: Introduce call_rcu_lazy() API implementation
  fs: Move call_rcu() to call_rcu_lazy() in some paths
  rcu/nocb: Add option to force all call_rcu() to lazy
  rcu/nocb: Wake up gp thread when flushing
  rcuscale: Add test for using call_rcu_lazy() to emulate kfree_rcu()
  rcu/nocb: Rewrite deferred wake up logic to be more clean
  rcu/kfree: Fix kfree_rcu_shrink_count() return value

Vineeth Pillai (1):
  rcu: shrinker for lazy rcu

 fs/dcache.c                   |   4 +-
 fs/eventpoll.c                |   2 +-
 fs/file_table.c               |   2 +-
 fs/inode.c                    |   2 +-
 include/linux/rcu_segcblist.h |   1 +
 include/linux/rcupdate.h      |   6 +
 kernel/rcu/Kconfig            |   8 ++
 kernel/rcu/rcu.h              |   8 ++
 kernel/rcu/rcu_segcblist.c    |  19 +++
 kernel/rcu/rcu_segcblist.h    |  24 ++++
 kernel/rcu/rcuscale.c         |  64 +++++++++-
 kernel/rcu/tree.c             |  35 +++++-
 kernel/rcu/tree.h             |  10 +-
 kernel/rcu/tree_nocb.h        | 217 +++++++++++++++++++++++++++-------
 14 files changed, 345 insertions(+), 57 deletions(-)

-- 
2.37.0.rc0.104.g0611611a94-goog



Thread overview: 60+ messages
2022-06-22 22:50 [PATCH v2 0/8] Implement call_rcu_lazy() and miscellaneous fixes Joel Fernandes (Google)
2022-06-22 22:50 ` [PATCH v2 1/1] context_tracking: Use arch_atomic_read() in __ct_state for KASAN Joel Fernandes (Google)
2022-06-22 22:58   ` Joel Fernandes
2022-06-22 22:50 ` [PATCH v2 1/8] rcu: Introduce call_rcu_lazy() API implementation Joel Fernandes (Google)
2022-06-22 23:18   ` Joel Fernandes
2022-06-26  4:00     ` Paul E. McKenney
2022-06-23  1:38   ` kernel test robot
2022-06-26  4:00   ` Paul E. McKenney
2022-07-08 18:43     ` Joel Fernandes
2022-07-08 23:10       ` Paul E. McKenney
2022-07-10  2:26     ` Joel Fernandes
2022-07-10 16:03       ` Paul E. McKenney
2022-07-12 20:53         ` Joel Fernandes
2022-07-12 21:04           ` Paul E. McKenney
2022-07-12 21:10             ` Joel Fernandes
2022-07-12 22:41               ` Paul E. McKenney
2022-06-29 11:53   ` Frederic Weisbecker
2022-06-29 17:05     ` Paul E. McKenney
2022-06-29 20:29     ` Joel Fernandes
2022-06-29 22:01       ` Frederic Weisbecker
2022-06-30 14:08         ` Joel Fernandes
2022-06-22 22:50 ` [PATCH v2 2/8] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-06-22 22:50 ` [PATCH v2 3/8] fs: Move call_rcu() to call_rcu_lazy() in some paths Joel Fernandes (Google)
2022-06-22 22:50 ` [PATCH v2 4/8] rcu/nocb: Add option to force all call_rcu() to lazy Joel Fernandes (Google)
2022-06-22 22:50 ` [PATCH v2 5/8] rcu/nocb: Wake up gp thread when flushing Joel Fernandes (Google)
2022-06-26  4:06   ` Paul E. McKenney
2022-06-26 13:45     ` Joel Fernandes
2022-06-26 13:52       ` Paul E. McKenney
2022-06-26 14:37         ` Joel Fernandes
2022-06-22 22:51 ` [PATCH v2 6/8] rcuscale: Add test for using call_rcu_lazy() to emulate kfree_rcu() Joel Fernandes (Google)
2022-06-23  2:09   ` kernel test robot
2022-06-23  3:00   ` kernel test robot
2022-06-23  8:10   ` kernel test robot
2022-06-26  4:13   ` Paul E. McKenney
2022-07-08  4:25     ` Joel Fernandes
2022-07-08 23:06       ` Paul E. McKenney
2022-07-12 20:27         ` Joel Fernandes
2022-07-12 20:58           ` Paul E. McKenney
2022-07-12 21:15             ` Joel Fernandes
2022-07-12 22:41               ` Paul E. McKenney
2022-06-22 22:51 ` [PATCH v2 7/8] rcu/nocb: Rewrite deferred wake up logic to be more clean Joel Fernandes (Google)
2022-06-22 22:51 ` [PATCH v2 8/8] rcu/kfree: Fix kfree_rcu_shrink_count() return value Joel Fernandes (Google)
2022-06-26  4:17   ` Paul E. McKenney
2022-06-27 18:56   ` Uladzislau Rezki
2022-06-27 20:59     ` Paul E. McKenney
2022-06-27 21:18       ` Joel Fernandes
2022-06-27 21:43         ` Paul E. McKenney
2022-06-28 16:56           ` Joel Fernandes
2022-06-28 21:13             ` Joel Fernandes
2022-06-29 16:56               ` Paul E. McKenney
2022-06-29 19:47                 ` Joel Fernandes
2022-06-29 21:07                   ` Paul E. McKenney
2022-06-30 14:25                     ` Joel Fernandes
2022-06-30 15:29                       ` Paul E. McKenney
2022-06-29 16:52             ` Paul E. McKenney
2022-06-26  3:12 ` [PATCH v2 0/8] Implement call_rcu_lazy() and miscellaneous fixes Paul E. McKenney
2022-07-08  4:17   ` Joel Fernandes
2022-07-08 22:45     ` Paul E. McKenney
2022-07-10  1:38       ` Joel Fernandes
2022-07-10 15:47         ` Paul E. McKenney
