All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	rushikesh.s.kadam@intel.com, neeraj.iitr10@gmail.com,
	frederic@kernel.org, rostedt@goodmis.org
Subject: Re: [PATCH v6 1/4] rcu: Make call_rcu() lazy to save power
Date: Mon, 26 Sep 2022 20:22:46 -0700	[thread overview]
Message-ID: <20220927032246.GH4196@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <YzJWoRui7mUEDtox@google.com>

On Tue, Sep 27, 2022 at 01:49:21AM +0000, Joel Fernandes wrote:
> On Mon, Sep 26, 2022 at 04:59:44PM -0700, Paul E. McKenney wrote:
> > On Mon, Sep 26, 2022 at 07:47:50PM -0400, Joel Fernandes wrote:
> > > 
> > > 
> > > > On Sep 26, 2022, at 6:32 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > 
> > > > On Mon, Sep 26, 2022 at 09:02:21PM +0000, Joel Fernandes wrote:
> > > >> On Mon, Sep 26, 2022 at 09:32:44PM +0200, Uladzislau Rezki wrote:
> > > >> [...]
> > > >>>>>> On my KVM machine the boot time is affected:
> > > >>>>>> 
> > > >>>>>> <snip>
> > > >>>>>> [    2.273406] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
> > > >>>>>> [   11.945283] e1000 0000:00:03.0 ens3: renamed from eth0
> > > >>>>>> [   22.165198] sr 1:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
> > > >>>>>> [   22.165206] cdrom: Uniform CD-ROM driver Revision: 3.20
> > > >>>>>> [   32.406981] sr 1:0:0:0: Attached scsi CD-ROM sr0
> > > >>>>>> [  104.115418] process '/usr/bin/fstype' started with executable stack
> > > >>>>>> [  104.170142] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
> > > >>>>>> [  104.340125] systemd[1]: systemd 241 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
> > > >>>>>> [  104.340193] systemd[1]: Detected virtualization kvm.
> > > >>>>>> [  104.340196] systemd[1]: Detected architecture x86-64.
> > > >>>>>> [  104.359032] systemd[1]: Set hostname to <pc638>.
> > > >>>>>> [  105.740109] random: crng init done
> > > >>>>>> [  105.741267] systemd[1]: Reached target Remote File Systems.
> > > >>>>>> <snip>
> > > >>>>>> 
> > > >>>>>> 2 - 11 and second delay is between 32 - 104. So there are still users which must
> > > >>>>>> be waiting for "RCU" in a sync way.
> > > >>>>> 
> > > >>>>> I was wondering if you can compare boot logs and see which timestamp does the
> > > >>>>> slow down start from. That way, we can narrow down the callback. Also another
> > > >>>>> idea is, add "trace_event=rcu:rcu_callback,rcu:rcu_invoke_callback
> > > >>>>> ftrace_dump_on_oops" to the boot params, and then manually call
> > > >>>>> "tracing_off(); panic();" from the code at the first printk that seems off in
> > > >>>>> your comparison of good vs bad. For example, if "crng init done" timestamp is
> > > >>>>> off, put the "tracing_off(); panic();" there. Then grab the serial console
> > > >>>>> output to see what were the last callbacks that was queued/invoked.
> > > >>>> 
> > > >>>> We do seem to be in need of some way to quickly and easily locate the
> > > >>>> callback that needed to be _flush() due to a wakeup.
> > > >>>> 
> > > >>> <snip>
> > > >>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > > >>> index aeea9731ef80..fe1146d97f1a 100644
> > > >>> --- a/kernel/workqueue.c
> > > >>> +++ b/kernel/workqueue.c
> > > >>> @@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
> > > >>> 
> > > >>>        if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
> > > >>>                rwork->wq = wq;
> > > >>> -               call_rcu(&rwork->rcu, rcu_work_rcufn);
> > > >>> +               call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
> > > >>>                return true;
> > > >>>        }
> > > >>> 
> > > >>> <snip>
> > > >>> 
> > > >>> ?
> > > >>> 
> > > >>> But it does not fully solve my boot-up issue. Will debug tomorrow further.
> > > >> 
> > > >> Ah, but at least its progress, thanks. Could you send me a patch to include
> > > >> in the next revision with details of this?
> > > >> 
> > > >>>> Might one more proactive approach be to use Coccinelle to locate such
> > > >>>> callback functions?  We might not want -all- callbacks that do wakeups
> > > >>>> to use call_rcu_flush(), but knowing which are which should speed up
> > > >>>> slow-boot debugging by quite a bit.
> > > >>>> 
> > > >>>> Or is there a better way to do this?
> > > >>>> 
> > > >>> I am not sure what Coccinelle is. If we had something automated that measures
> > > >>> a boot time and if needed does some profiling it would be good. Otherwise it
> > > >>> is a manual debugging mainly, IMHO.
> > > >> 
> > > >> Paul, What about using a default-off kernel CONFIG that splats on all lazy
> > > >> call_rcu() callbacks that do a wake up. We could use the trace hooks to do it
> > > >> in kernel I think. I can talk to Steve to get ideas on how to do that but I
> > > >> think it can be done purely from trace events (we might need a new
> > > >> trace_end_invoke_callback to fire after the callback is invoked). Thoughts?
> > > > 
> > > > Could you look for wakeups invoked between trace_rcu_batch_start() and
> > > > trace_rcu_batch_end() that are not from interrupt context?  This would
> > > > of course need to be associated with a task rather than a CPU.
> > > 
> > > Yes this sounds good, but we also need to know if the callbacks are lazy or not since wake-up is ok from a non lazy one. I think I’ll need a table to track that at queuing time.
> > 
> > Agreed.
> > 
> > > > Note that you would need to check for wakeups from interrupt handlers
> > > > even with the extra trace_end_invoke_callback().  The window where an
> > > > interrupt handler could do a wakeup would be reduced, but not eliminated.
> > > 
> > > True! Since this is a  debugging option, can we not just disable interrupts across callback invocation?
> > 
> > Not without terminally annoying lockdep, at least for any RCU callbacks
> > doing things like spin_lock_bh().
> > 
> 
> Sorry if my last email bounced. Looks like my iPhone betrayed me this once ;)
> 
> I was thinking something like this:
> 1. Put a flag in rcu_head to mark CBs as lazy.
> 2. Add a trace_rcu_invoke_callback_end() trace point.
> 
> Both #1 and #2 can be a debug CONFIG option. #2 can be a tracepoint and not
> exposed if needed.
> 
> 3. Put an in-kernel probe on both trace_rcu_invoke_callback_start() and
> trace_rcu_invoke_callback_end(). In the start probe, set a per-task flag if
> the current CB is lazy. In the end probe, clear it.
> 
> 4. Put an in-kernel probe on trace_rcu_sched_wakeup().
> 
> Splat in the wake up probe if:
> 1. Hard IRQs are on.
> 2. The per-cpu flag is set.
> 
> #3 actually does not even need probes if we can directly call the functions
> from the rcu_do_batch() function.

This is fine for an experiment or a debugging session, but a solution
based totally on instrumentation would be better for production use.

> I'll work on it in the morning and also look into Vlad's config.

Sounds good!

							Thanx, Paul

  reply	other threads:[~2022-09-27  3:24 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-22 22:01 [PATCH v6 0/4] rcu: call_rcu() power improvements Joel Fernandes (Google)
2022-09-22 22:01 ` [PATCH v6 1/4] rcu: Make call_rcu() lazy to save power Joel Fernandes (Google)
2022-09-23 21:44   ` Paul E. McKenney
2022-09-24 16:20     ` Joel Fernandes
2022-09-24 21:11       ` Paul E. McKenney
2022-09-24 22:56         ` Joel Fernandes
2022-09-25 17:31         ` Joel Fernandes
2022-09-26 17:42           ` Paul E. McKenney
2022-09-26 21:07             ` Joel Fernandes
2022-09-26 22:37               ` Paul E. McKenney
2022-09-26 23:33                 ` Joel Fernandes
2022-09-26 23:53                   ` Paul E. McKenney
2022-10-03 19:33                     ` Joel Fernandes
2022-10-03 19:49                       ` Paul E. McKenney
2022-09-24 22:46   ` Frederic Weisbecker
2022-09-24 23:28     ` Joel Fernandes
2022-09-25  1:00       ` Joel Fernandes
2022-09-25 22:00         ` Frederic Weisbecker
2022-09-26 15:04           ` Joel Fernandes
2022-09-26 17:33             ` Paul E. McKenney
2022-09-26 23:37               ` Joel Fernandes
2022-09-25 22:09       ` Frederic Weisbecker
2022-09-26 17:45         ` Paul E. McKenney
2022-09-25  8:57   ` Uladzislau Rezki
2022-09-25 17:46     ` Joel Fernandes
2022-09-26 17:48       ` Paul E. McKenney
2022-09-26 19:32         ` Uladzislau Rezki
2022-09-26 21:02           ` Joel Fernandes
2022-09-26 22:32             ` Paul E. McKenney
2022-09-26 23:47               ` Joel Fernandes
2022-09-26 23:59                 ` Paul E. McKenney
2022-09-27  1:49                   ` Joel Fernandes
2022-09-27  3:22                     ` Paul E. McKenney [this message]
2022-09-27 13:05                       ` Joel Fernandes
2022-09-27 14:14                         ` Paul E. McKenney
2022-09-27 14:22                           ` Joel Fernandes
2022-09-27 14:30                             ` Paul E. McKenney
2022-09-27 15:25                               ` Joel Fernandes
2022-09-27 15:59                                 ` Paul E. McKenney
     [not found]                   ` <CAEXW_YRpAjvmBPzRA-hRQpuaDuZUzfndLb3q+e3BUyWprg5wkQ@mail.gmail.com>
2022-09-27  3:21                     ` Paul E. McKenney
2022-09-26 22:27           ` Paul E. McKenney
2022-09-26 19:39       ` Uladzislau Rezki
2022-09-26 20:54         ` Joel Fernandes
2022-09-26 22:35           ` Paul E. McKenney
2022-09-26 23:44             ` Joel Fernandes
2022-09-26 23:57               ` Paul E. McKenney
2022-09-27  1:16                 ` Joel Fernandes
2022-09-27  3:20                   ` Paul E. McKenney
2022-09-27 14:08           ` Uladzislau Rezki
2022-09-27 14:30             ` Joel Fernandes
2022-09-27 14:59               ` Uladzislau Rezki
2022-09-27 15:13                 ` Uladzislau Rezki
2022-09-27 21:31                   ` Uladzislau Rezki
2022-09-27 22:05                     ` Joel Fernandes
2022-09-27 22:29                       ` Joel Fernandes
2022-09-30 16:11                         ` Uladzislau Rezki
2022-10-04 11:35                           ` Uladzislau Rezki
2022-10-04 18:06                             ` Joel Fernandes
2022-09-27 15:14                 ` Joel Fernandes
2022-09-22 22:01 ` [PATCH v6 2/4] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-09-22 22:01 ` [PATCH v6 3/4] rcuscale: Add laziness and kfree tests Joel Fernandes (Google)
2022-09-22 22:01 ` [PATCH v6 4/4] percpu-refcount: Use call_rcu_flush() for atomic switch Joel Fernandes (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220927032246.GH4196@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rushikesh.s.kadam@intel.com \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.