All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	rushikesh.s.kadam@intel.com, neeraj.iitr10@gmail.com,
	frederic@kernel.org, paulmck@kernel.org, rostedt@goodmis.org
Subject: Re: [PATCH v6 1/4] rcu: Make call_rcu() lazy to save power
Date: Mon, 26 Sep 2022 21:39:23 +0200	[thread overview]
Message-ID: <YzH/646RHxhHBy6+@pc636> (raw)
In-Reply-To: <YzCUDXn3htWbqM4f@google.com>

On Sun, Sep 25, 2022 at 05:46:53PM +0000, Joel Fernandes wrote:
> Hi Vlad,
> 
> On Sun, Sep 25, 2022 at 10:57:10AM +0200, Uladzislau Rezki wrote:
> > > Implement timer-based RCU lazy callback batching. The batch is flushed
> > > whenever a certain amount of time has passed, or the batch on a
> > > particular CPU grows too big. Also memory pressure will flush it in a
> > > future patch.
> > > 
> > > To handle several corner cases automagically (such as rcu_barrier() and
> > > hotplug), we re-use bypass lists to handle lazy CBs. The bypass list
> > > length has the lazy CB length included in it. A separate lazy CB length
> > > counter is also introduced to keep track of the number of lazy CBs.
> > > 
> > > v5->v6:
> > > 
> > > [ Frederic Weisbec: Program the lazy timer only if WAKE_NOT, since other
> > >   deferral levels wake much earlier so for those it is not needed. ]
> > > 
> > > [ Frederic Weisbec: Use flush flags to keep bypass API code clean. ]
> > > 
> > > [ Frederic Weisbec: Make rcu_barrier() wake up only if main list empty. ]
> > > 
> > > [ Frederic Weisbec: Remove extra 'else if' branch in rcu_nocb_try_bypass(). ]
> > > 
> > > [ Joel: Fix issue where I was not resetting lazy_len after moving it to rdp ]
> > > 
> > > [ Paul/Thomas/Joel: Make call_rcu() default lazy so users don't mess up. ]
> > > 
> > I think it make sense to add some data to the commit message
> > illustrating what this patch does.
> 
> Sure, will do!
> 
> > From my side i gave a try of this patch on my setup. Some data:
> > 
> > <snip>
> > root@pc638:/home/urezki/rcu_v6# ./perf_script_parser ./perf_v6.script | sort -nk 6 | grep rcu
> > name:                       rcuop/23 pid:        184 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/26 pid:        206 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/29 pid:        227 woken-up     1     interval: min     0     max     0       avg     0
> > name:                        rcuop/2 pid:         35 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/33 pid:        256 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/34 pid:        263 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/35 pid:        270 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/36 pid:        277 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/37 pid:        284 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/38 pid:        291 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/49 pid:        370 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/59 pid:        441 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/63 pid:        469 woken-up     1     interval: min     0     max     0       avg     0
> > name:                        rcuog/0 pid:         16 woken-up     2     interval: min  8034     max  8034       avg  4017
> > name:                       rcuog/24 pid:        191 woken-up     2     interval: min  7941     max  7941       avg  3970
> > name:                       rcuog/32 pid:        248 woken-up     2     interval: min  7542     max  7542       avg  3771
> > name:                       rcuog/48 pid:        362 woken-up     2     interval: min  8065     max  8065       avg  4032
> > name:                       rcuog/56 pid:        419 woken-up     2     interval: min  8076     max  8076       avg  4038
> > name:                       rcuop/21 pid:        170 woken-up     2     interval: min 13311438  max 13311438    avg 6655719
> > name:                       rcuog/16 pid:        134 woken-up     4     interval: min  8029     max 13303387    avg 3329863
> > name:                        rcuop/9 pid:         85 woken-up     4     interval: min 10007570  max 10007586    avg 7505684
> > name:                        rcuog/8 pid:         77 woken-up     8     interval: min  6240     max 10001242    avg 3753622
> > name:                    rcu_preempt pid:         15 woken-up    18     interval: min  6058     max 9999713     avg 2140788
> > name:                     test_rcu/0 pid:       1411 woken-up 10003     interval: min   165     max 19072       avg  4275
> > root@pc638:/home/urezki/rcu_v6#
> > 
> > root@pc638:/home/urezki/rcu_v6# ./perf_script_parser ./perf_default.script | sort -nk 6 | grep rcu
> > name:                       rcuop/33 pid:        256 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuop/51 pid:        384 woken-up     1     interval: min     0     max     0       avg     0
> > name:                       rcuog/32 pid:        248 woken-up     2     interval: min 11927     max 11927       avg  5963
> > name:                       rcuop/63 pid:        469 woken-up     2     interval: min 23963     max 23963       avg 11981
> > name:                       rcuog/56 pid:        419 woken-up     3     interval: min 11132     max 23967       avg 11699
> > name:                       rcuop/50 pid:        377 woken-up     3     interval: min  8057     max 4944344     avg 1650800
> > name:                       rcuog/48 pid:        362 woken-up     8     interval: min  2712     max 37430015    avg 5298801
> > name:                       rcuop/16 pid:        135 woken-up  4790     interval: min  7340     max 16649       avg  8843
> > name:                       rcuog/16 pid:        134 woken-up  4792     interval: min  7368     max 16644       avg  8844
> > name:                    rcu_preempt pid:         15 woken-up  5302     interval: min    26     max 12179       avg  7994
> > name:                     test_rcu/0 pid:       1353 woken-up 10003     interval: min   169     max 18508       avg  4236
> > root@pc638:/home/urezki/rcu_v6#
> > <snip>
> > 
> > so it is obvious that the patch does the job.
> 
> Thanks a lot for testing!
> 
> > On my KVM machine the boot time is affected:
> > 
> > <snip>
> > [    2.273406] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
> > [   11.945283] e1000 0000:00:03.0 ens3: renamed from eth0
> > [   22.165198] sr 1:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
> > [   22.165206] cdrom: Uniform CD-ROM driver Revision: 3.20
> > [   32.406981] sr 1:0:0:0: Attached scsi CD-ROM sr0
> > [  104.115418] process '/usr/bin/fstype' started with executable stack
> > [  104.170142] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
> > [  104.340125] systemd[1]: systemd 241 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
> > [  104.340193] systemd[1]: Detected virtualization kvm.
> > [  104.340196] systemd[1]: Detected architecture x86-64.
> > [  104.359032] systemd[1]: Set hostname to <pc638>.
> > [  105.740109] random: crng init done
> > [  105.741267] systemd[1]: Reached target Remote File Systems.
> > <snip>
> > 
> > 2 - 11 and second delay is between 32 - 104. So there are still users which must
> > be waiting for "RCU" in a sync way.
> 
> I was wondering if you can compare boot logs and see which timestamp does the
> slow down start from. That way, we can narrow down the callback. Also another
> idea is, add "trace_event=rcu:rcu_callback,rcu:rcu_invoke_callback
> ftrace_dump_on_oops" to the boot params, and then manually call
> "tracing_off(); panic();" from the code at the first printk that seems off in
> your comparison of good vs bad. For example, if "crng init done" timestamp is
> off, put the "tracing_off(); panic();" there. Then grab the serial console
> output to see what were the last callbacks that was queued/invoked.
> 
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 08605ce7379d..40ae36904825 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -108,6 +108,13 @@ static inline int rcu_preempt_depth(void)
> > >  
> > >  #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
> > >  
> > > +#ifdef CONFIG_RCU_LAZY
> > > +void call_rcu_flush(struct rcu_head *head, rcu_callback_t func);
> > > +#else
> > > +static inline void call_rcu_flush(struct rcu_head *head,
> > > +		rcu_callback_t func) {  call_rcu(head, func); }
> > > +#endif
> > > +
> > >  /* Internal to kernel */
> > >  void rcu_init(void);
> > >  extern int rcu_scheduler_active;
> > > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > > index f53ad63b2bc6..edd632e68497 100644
> > > --- a/kernel/rcu/Kconfig
> > > +++ b/kernel/rcu/Kconfig
> > > @@ -314,4 +314,12 @@ config TASKS_TRACE_RCU_READ_MB
> > >  	  Say N here if you hate read-side memory barriers.
> > >  	  Take the default if you are unsure.
> > >  
> > > +config RCU_LAZY
> > > +	bool "RCU callback lazy invocation functionality"
> > > +	depends on RCU_NOCB_CPU
> > > +	default n
> > > +	help
> > > +	  To save power, batch RCU callbacks and flush after delay, memory
> > > +	  pressure or callback list growing too big.
> > > +
> > >
> > Do you think you need this kernel option? Can we just consider and make
> > it a run-time configurable? For example much more users will give it a try,
> > so it will increase a coverage. By default it can be off.
> > 
> > Also you do not need to do:
> > 
> > #ifdef LAZY
> 
> How does the "LAZY" macro end up being runtime-configurable? That's static /
> compile time. Did I miss something?
> 
I am talking about removing if:

config RCU_LAZY

we might run into issues related to run-time switching though.

> > ...
> > #else
> > ...
> > #endif
> > 
> > >  
> > > +/*
> > > + * LAZY_FLUSH_JIFFIES decides the maximum amount of time that
> > > + * can elapse before lazy callbacks are flushed. Lazy callbacks
> > > + * could be flushed much earlier for a number of other reasons
> > > + * however, LAZY_FLUSH_JIFFIES will ensure no lazy callbacks are
> > > + * left unsubmitted to RCU after those many jiffies.
> > > + */
> > > +#define LAZY_FLUSH_JIFFIES (10 * HZ)
> > > +static unsigned long jiffies_till_flush = LAZY_FLUSH_JIFFIES;
> > Make it configurable? I do not think you actually need 10 seconds here.
> > Reducing it will reduce a possibility to hit a low memory condition. 1
> > second would be far enough i think.
> 
> Hmm, I can make the delay configurable but for now I'll keep this as default
> as all of our power testing has been done with that and I don't want risk
> losing the optimization.
>
Fine to me. Later on is OK.

> 
> Honestly, I am not worried too about memory pressure as we have a shrinker
> which triggers flushes on the slightest hint of memory pressure. If it is not
> handling it properly, then we need to fix the shrinker.
> 
Will not speculate here since i have not tested this patch enough with
different time out and a low mem. condition.

--
Uladzislau Rezki

  parent reply	other threads:[~2022-09-26 19:39 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-22 22:01 [PATCH v6 0/4] rcu: call_rcu() power improvements Joel Fernandes (Google)
2022-09-22 22:01 ` [PATCH v6 1/4] rcu: Make call_rcu() lazy to save power Joel Fernandes (Google)
2022-09-23 21:44   ` Paul E. McKenney
2022-09-24 16:20     ` Joel Fernandes
2022-09-24 21:11       ` Paul E. McKenney
2022-09-24 22:56         ` Joel Fernandes
2022-09-25 17:31         ` Joel Fernandes
2022-09-26 17:42           ` Paul E. McKenney
2022-09-26 21:07             ` Joel Fernandes
2022-09-26 22:37               ` Paul E. McKenney
2022-09-26 23:33                 ` Joel Fernandes
2022-09-26 23:53                   ` Paul E. McKenney
2022-10-03 19:33                     ` Joel Fernandes
2022-10-03 19:49                       ` Paul E. McKenney
2022-09-24 22:46   ` Frederic Weisbecker
2022-09-24 23:28     ` Joel Fernandes
2022-09-25  1:00       ` Joel Fernandes
2022-09-25 22:00         ` Frederic Weisbecker
2022-09-26 15:04           ` Joel Fernandes
2022-09-26 17:33             ` Paul E. McKenney
2022-09-26 23:37               ` Joel Fernandes
2022-09-25 22:09       ` Frederic Weisbecker
2022-09-26 17:45         ` Paul E. McKenney
2022-09-25  8:57   ` Uladzislau Rezki
2022-09-25 17:46     ` Joel Fernandes
2022-09-26 17:48       ` Paul E. McKenney
2022-09-26 19:32         ` Uladzislau Rezki
2022-09-26 21:02           ` Joel Fernandes
2022-09-26 22:32             ` Paul E. McKenney
2022-09-26 23:47               ` Joel Fernandes
2022-09-26 23:59                 ` Paul E. McKenney
2022-09-27  1:49                   ` Joel Fernandes
2022-09-27  3:22                     ` Paul E. McKenney
2022-09-27 13:05                       ` Joel Fernandes
2022-09-27 14:14                         ` Paul E. McKenney
2022-09-27 14:22                           ` Joel Fernandes
2022-09-27 14:30                             ` Paul E. McKenney
2022-09-27 15:25                               ` Joel Fernandes
2022-09-27 15:59                                 ` Paul E. McKenney
     [not found]                   ` <CAEXW_YRpAjvmBPzRA-hRQpuaDuZUzfndLb3q+e3BUyWprg5wkQ@mail.gmail.com>
2022-09-27  3:21                     ` Paul E. McKenney
2022-09-26 22:27           ` Paul E. McKenney
2022-09-26 19:39       ` Uladzislau Rezki [this message]
2022-09-26 20:54         ` Joel Fernandes
2022-09-26 22:35           ` Paul E. McKenney
2022-09-26 23:44             ` Joel Fernandes
2022-09-26 23:57               ` Paul E. McKenney
2022-09-27  1:16                 ` Joel Fernandes
2022-09-27  3:20                   ` Paul E. McKenney
2022-09-27 14:08           ` Uladzislau Rezki
2022-09-27 14:30             ` Joel Fernandes
2022-09-27 14:59               ` Uladzislau Rezki
2022-09-27 15:13                 ` Uladzislau Rezki
2022-09-27 21:31                   ` Uladzislau Rezki
2022-09-27 22:05                     ` Joel Fernandes
2022-09-27 22:29                       ` Joel Fernandes
2022-09-30 16:11                         ` Uladzislau Rezki
2022-10-04 11:35                           ` Uladzislau Rezki
2022-10-04 18:06                             ` Joel Fernandes
2022-09-27 15:14                 ` Joel Fernandes
2022-09-22 22:01 ` [PATCH v6 2/4] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-09-22 22:01 ` [PATCH v6 3/4] rcuscale: Add laziness and kfree tests Joel Fernandes (Google)
2022-09-22 22:01 ` [PATCH v6 4/4] percpu-refcount: Use call_rcu_flush() for atomic switch Joel Fernandes (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YzH/646RHxhHBy6+@pc636 \
    --to=urezki@gmail.com \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rushikesh.s.kadam@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.