From: Esben Nielsen <simlo@phys.au.dk>
To: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>,
dipankar@in.ibm.com, shemminger@osdl.org, akpm@osdl.org,
torvalds@osdl.org, rusty@au1.ibm.com, tgall@us.ibm.com,
jim.houston@comcast.net, manfred@colorfullife.com, gh@us.ibm.com,
linux-kernel@vger.kernel.org
Subject: Re: Real-Time Preemption and RCU
Date: Mon, 21 Mar 2005 00:23:22 +0100 (MET)
Message-ID: <Pine.OSF.4.05.10503210004340.5287-100000@da410.phys.au.dk>
In-Reply-To: <20050320223832.GA1298@us.ibm.com>
On Sun, 20 Mar 2005, Paul E. McKenney wrote:
> On Sun, Mar 20, 2005 at 02:29:17PM +0100, Esben Nielsen wrote:
> > On Fri, 18 Mar 2005, Ingo Molnar wrote:
> >
> > > [...]
> >
> > I think it can be deterministic (on the long timescale of memory management)
> > anyway: Boost any non-RT task entering an RCU region to the lowest RT priority.
> > This way only all the RT tasks + one non-RT task can be within those
> > regions. The RT-tasks are supposed to have some kind of upper bound to
> > their CPU-usage. The non-RT task will also finish "soon" as it is
> > boosted. If the RCU batches are also at the lowest RT-priority they can be
> > run immediately after the non-RT task is done.
>
> Hmmm... Sort of a preemptive-first-strike priority boost. Cute! ;-)
>
Well, I was actually thinking of an API like
preempt_by_nonrt_disable()
preempt_by_nonrt_enable()
working like the old preempt_disable()/preempt_enable() but still
allowing RT tasks (as well as priority inheriting non-RT tasks) to be
scheduled. That would kind of help "split" the kernel into two halves: the
RT part and the non-RT part. The non-RT part would in many ways work as it
has always done.
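
To make the intended semantics concrete, here is a rough user-space model of the idea, not kernel code. Only the names preempt_by_nonrt_disable() and preempt_by_nonrt_enable() come from the proposal above; struct model_task, its fields, and nonrt_section_allows_preemption() are made up for illustration. The sketch only answers whether the section itself blocks a candidate task; the normal priority comparison is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a task; these fields are illustrative only. */
struct model_task {
	bool rt;                  /* task is in a real-time scheduling class */
	bool prio_inherited;      /* non-RT task currently boosted by priority inheritance */
	int  nonrt_disable_depth; /* nesting depth of preempt_by_nonrt_disable() */
};

/* Enter a section that non-RT tasks may not preempt. Nests like preempt_disable(). */
static inline void preempt_by_nonrt_disable(struct model_task *t)
{
	t->nonrt_disable_depth++;
}

/* Leave the section; must pair with an earlier disable. */
static inline void preempt_by_nonrt_enable(struct model_task *t)
{
	assert(t->nonrt_disable_depth > 0);
	t->nonrt_disable_depth--;
}

/*
 * Does the section state of `curr` permit `next` to preempt it?
 * Outside a section nothing is blocked here. Inside a section only
 * RT tasks and priority-inheriting non-RT tasks get through.
 */
static inline bool nonrt_section_allows_preemption(const struct model_task *curr,
						   const struct model_task *next)
{
	if (curr->nonrt_disable_depth == 0)
		return true;
	return next->rt || next->prio_inherited;
}
```

In this model a plain non-RT task is held back while the current task is inside the section, whereas an RT task or a priority-inheriting task is still let through.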
> > > clearly the simplest solution is to go with the single-reader locks for
> > > now - a separate experiment could be done with a new type of rwlock that
> > > can only be used by the RCU code. (I'm not quite sure whether we could
> > > guarantee a minimum rate of RCU callback processing under such a scheme
> > > though. It's an eventual memory DoS otherwise.)
> > >
> >
> > Why is a lock needed at all? If it is doable without locking for a
> > non-preemptable SMP kernel it must be doable for a preemptable kernel as
> > well. I am convinced some kind of per-CPU rcu_read_count as I specified in
> > my previous mail can work some way or the other. call_rcu() might need to
> > do more complicated stuff and thus use CPU but call_rcu() is supposed to
> > be a relatively rare event not worth optimizing for. Such an
> > implementation will work for any preemptable kernel, not only PREEMPT_RT.
> > If performance is considered, it is important not to acquire any locks in
> > the rcu-read regions.
>
> You definitely don't need a lock -- you can just suppress preemption
> on the read side instead. But that potentially makes for long scheduling
> latencies.
Well, in my patch I do not leave preemption off - only while doing the
simple ++/--. In effect, I let rcu_qsctr_inc know that some RCU reader
might be active, i.e. preempted, on the current CPU, so that this isn't
a quiescent point after all.
(To others: Paul nicely unfolded my attachment below - I left it in
the mail so that you can read it.)
The problem with this approach is of course that user space programs might
preempt an RCU reader for a very long time such that RCU batches are never
really run. The boosting of non-RT tasks mentioned above would help a
lot.
A plus(?) in it: You can actually sleep while having the rcu_read_lock !!
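
The counter idea can be played through in a small single-threaded user-space model. This is only a sketch under assumptions: the names model_rcu_data, model_rcu_read_lock() and friends are made up here, MODEL_NR_CPUS is arbitrary, and the "cpu" argument stands in for the per-CPU lookup. The preempt_disable()/preempt_enable() pair around the ++/-- is left out because nothing can preempt the model, and the model assumes a reader leaves on the same CPU it entered on.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2 /* arbitrary size for the sketch */

/* Hypothetical stand-in for the per-CPU rcu_data fields touched by the patch. */
struct model_rcu_data {
	long active_readers; /* readers that entered on this CPU and have not left */
	bool passed_quiesc;  /* this CPU has reported a quiescent state */
};

static struct model_rcu_data model_rcu[MODEL_NR_CPUS];

/* Reader entry: just count the reader on this CPU. */
static inline void model_rcu_read_lock(int cpu)
{
	model_rcu[cpu].active_readers++;
}

/* Reader exit: must pair with an earlier lock on the same CPU. */
static inline void model_rcu_read_unlock(int cpu)
{
	assert(model_rcu[cpu].active_readers > 0);
	model_rcu[cpu].active_readers--;
}

/*
 * Called at a would-be quiescent point (e.g. a context switch).
 * It only counts as quiescent if no reader is still active, which
 * includes readers that were preempted inside their read section.
 */
static inline void model_rcu_qsctr_inc(int cpu)
{
	if (model_rcu[cpu].active_readers == 0)
		model_rcu[cpu].passed_quiesc = true;
}

/* A batch may only be processed once every CPU has reported in. */
static inline bool model_all_cpus_quiescent(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (!model_rcu[cpu].passed_quiesc)
			return false;
	return true;
}
```

As long as one reader sits preempted on CPU 0, model_rcu_qsctr_inc(0) has no effect and model_all_cpus_quiescent() stays false, which is exactly the starvation problem described above.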
>
> The counter approach might work, and is also what the implementation #5
> does -- check out rcu_read_lock() in Ingo's most recent patch.
>
Do you refer to your original mail with implementing it in 5 steps?
In #5 in that one (-V0.7.41-00, right?) you use a lock and as you say that
forces synchronization between the CPUs - bad for scaling. It does make the
RCU batches somewhat deterministic, as the RCU task can boost the readers
to the rcu-task's priority.
The problem with this approach is that everybody calling into RCU code
has a worst case behaviour of the systemwide worst case RCU reader
section - which can be pretty large (in principle infinite if somebody.)
So if somebody uses a call to a function in the code containing an RCU read
area the worst case behaviour would be the same as the worst case latency
in the simple world where preempt_disable()/preempt_enable() was used.
> Thanx, Paul
>
> > I tried this approach. My UP laptop did boot on it, but I haven't tested
> > it further. I have included the very small patch as an attachment.
> >
> > > Ingo
> >
> > I have not yet looked at -V0.7.41-00...
> >
> > Esben
> >
>
> > diff -Naur --exclude-from diff_exclude linux-2.6.11-final-V0.7.40-00/include/linux/rcupdate.h linux-2.6.11-final-V0.7.40-00-RCU/include/linux/rcupdate.h
> > --- linux-2.6.11-final-V0.7.40-00/include/linux/rcupdate.h 2005-03-11 23:40:13.000000000 +0100
> > +++ linux-2.6.11-final-V0.7.40-00-RCU/include/linux/rcupdate.h 2005-03-19 12:47:09.000000000 +0100
> > @@ -85,6 +85,7 @@
> > * curlist - current batch for which quiescent cycle started if any
> > */
> > struct rcu_data {
> > + long active_readers;
> > /* 1) quiescent state handling : */
> > long quiescbatch; /* Batch # for grace period */
> > int passed_quiesc; /* User-mode/idle loop etc. */
> > @@ -115,12 +116,14 @@
> > static inline void rcu_qsctr_inc(int cpu)
> > {
> > struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
> > - rdp->passed_quiesc = 1;
> > + if(rdp->active_readers==0)
> > + rdp->passed_quiesc = 1;
> > }
> > static inline void rcu_bh_qsctr_inc(int cpu)
> > {
> > struct rcu_data *rdp = &per_cpu(rcu_bh_data, cpu);
> > - rdp->passed_quiesc = 1;
> > + if(rdp->active_readers==0)
> > + rdp->passed_quiesc = 1;
> > }
> >
> > static inline int __rcu_pending(struct rcu_ctrlblk *rcp,
> > @@ -183,29 +186,27 @@
> > *
> > * It is illegal to block while in an RCU read-side critical section.
> > */
> > -#define rcu_read_lock() preempt_disable()
> > +static inline void rcu_read_lock(void)
> > +{
> > + preempt_disable();
> > + __get_cpu_var(rcu_data).active_readers++;
> > + preempt_enable();
> > +}
> >
> > /**
> > * rcu_read_unlock - marks the end of an RCU read-side critical section.
> > *
> > * See rcu_read_lock() for more information.
> > */
> > -#define rcu_read_unlock() preempt_enable()
> > +static inline void rcu_read_unlock(void)
> > +{
> > + preempt_disable();
> > + __get_cpu_var(rcu_data).active_readers--;
> > + preempt_enable();
> > +}
> >
> > #define IGNORE_LOCK(op, lock) do { (void)(lock); op(); } while (0)
> >
> > -#ifdef CONFIG_PREEMPT_RT
> > -# define rcu_read_lock_spin(lock) spin_lock(lock)
> > -# define rcu_read_unlock_spin(lock) spin_unlock(lock)
> > -# define rcu_read_lock_read(lock) read_lock(lock)
> > -# define rcu_read_unlock_read(lock) read_unlock(lock)
> > -# define rcu_read_lock_bh_read(lock) read_lock_bh(lock)
> > -# define rcu_read_unlock_bh_read(lock) read_unlock_bh(lock)
> > -# define rcu_read_lock_down_read(rwsem) down_read(rwsem)
> > -# define rcu_read_unlock_up_read(rwsem) up_read(rwsem)
> > -# define rcu_read_lock_nort() do { } while (0)
> > -# define rcu_read_unlock_nort() do { } while (0)
> > -#else
> > # define rcu_read_lock_spin(lock) IGNORE_LOCK(rcu_read_lock, lock)
> > # define rcu_read_unlock_spin(lock) IGNORE_LOCK(rcu_read_unlock, lock)
> > # define rcu_read_lock_read(lock) IGNORE_LOCK(rcu_read_lock, lock)
> > @@ -216,15 +217,10 @@
> > # define rcu_read_unlock_nort() rcu_read_unlock()
> > # define rcu_read_lock_bh_read(lock) IGNORE_LOCK(rcu_read_lock_bh, lock)
> > # define rcu_read_unlock_bh_read(lock) IGNORE_LOCK(rcu_read_unlock_bh, lock)
> > -#endif
> >
> > -#ifdef CONFIG_PREEMPT_RT
> > -# define rcu_read_lock_sem(lock) down(lock)
> > -# define rcu_read_unlock_sem(lock) up(lock)
> > -#else
> > # define rcu_read_lock_sem(lock) IGNORE_LOCK(rcu_read_lock, lock)
> > # define rcu_read_unlock_sem(lock) IGNORE_LOCK(rcu_read_unlock, lock)
> > -#endif
> > +
> > /*
> > * So where is rcu_write_lock()? It does not exist, as there is no
> > * way for writers to lock out RCU readers. This is a feature, not
> > diff -Naur --exclude-from diff_exclude linux-2.6.11-final-V0.7.40-00/Makefile linux-2.6.11-final-V0.7.40-00-RCU/Makefile
> > --- linux-2.6.11-final-V0.7.40-00/Makefile 2005-03-11 23:40:13.000000000 +0100
> > +++ linux-2.6.11-final-V0.7.40-00-RCU/Makefile 2005-03-19 12:41:09.000000000 +0100
> > @@ -1,7 +1,7 @@
> > VERSION = 2
> > PATCHLEVEL = 6
> > SUBLEVEL = 11
> > -EXTRAVERSION = -RT-V0.7.40-00
> > +EXTRAVERSION = -RT-V0.7.40-00-RCU
> > NAME=Woozy Numbat
> >
> > # *DOCUMENTATION*
>