linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: linas@austin.ibm.com
Cc: Ingo Molnar <mingo@elte.hu>, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, olh@suse.de, olof@austin.bim.com
Subject: Re: PATCH: Race in 2.6.0-test2 timer code
Date: Thu, 31 Jul 2003 01:56:07 +0200	[thread overview]
Message-ID: <20030730235607.GC322@dualathlon.random> (raw)
In-Reply-To: <20030730184317.B23750@forte.austin.ibm.com>

On Wed, Jul 30, 2003 at 06:43:18PM -0500, linas@austin.ibm.com wrote:
> Hi,
> 
> OK, I finally spent some time studying the timer code and am starting 
> to grasp it.  I think I finally understand the bug.  Andrea's timer->lock 
> patch will fix it for 2.4 and everybody says it can't happen in 2.6 so
> ok.
> 
> 
> On Thu, Jul 31, 2003 at 12:17:17AM +0200, Andrea Arcangeli wrote:
> > So the best fix would be to nuke the run_all_timers thing from 2.4 too.
> 
> Yes.
> 
> >and to keep the timer->lock everywhere to make run_all_timers safe.
> 
> Or do that instead, yes.
> 
> > In short the stack traces I described today were all right but only for
> > 2.4, and not for 2.6.
> 
> I see the bug in 2.4.  
> 
> On Wed, Jul 30, 2003 at 12:31:23PM +0200, Ingo Molnar wrote:
> > 
> > On Wed, 30 Jul 2003, Andrea Arcangeli wrote:
> > 
> > > 
> > > 	cpu0			cpu1
> > > 	------------		--------------------
> > > 
> > > 	do_setitimer
> > > 				it_real_fn
> > > 	del_timer_sync		add_timer	-> crash
> > 
> > would you mind to elaborate the precise race? I cannot see how the above
> > sequence could crash on current 2.6:
>  
> I don't know enough about how 2.6 works to say, 
> but here's more detail on what happened in 2.4:
> 
> 
> cpu 0                                              cpu 3
> -------                                          ---------
>                                       previously,  timer->base = cpu 3 base
>                                           (its task-struct->real_timer)
> bh_action() {
> __run_timers() {                          sys_setitimer() {
>   it_real_fn() // called as fn()              del_timer_sync() {
>     add_timer() {                               spinlock (cpu3 base)
>       spinlock (cpu0 base)
>         timer->base = cpu0 base                    detach_timer (timer)
>           list_add(&timer->list)
>                                                    timer->list.next = NULL;
>   and now timer is vectored in but can't be unlinked.
> 
> And so either of Andrea's fixes should fix this race.

exactly. If you want to nuke the run_all_timers from the 2.4 backport
feel free, then we could drop the additional locking from
add_timer/del_timer* and leave it only in mod_timer like 2.6 does, that
will avoid cacheline bouncing in the small window when the local timer
irq run on top of an irq handler. In the meantime the kernel is already
solid (again ;).

2.6 will kernel crash like we did in 2.4 only if it calls
add_timer_on from timer context (of course with a cpuid != than the
smp_processor_id()), that would be fixed by the timer->lock everywhere
that we've in 2.4 right now.  (but there's no add_timer_on in 2.4
anyways)

BTW, it's reassuring it wasn't lack of 2.6 stress testing, I apologize
for claiming that.

Andrea

  reply	other threads:[~2003-07-30 23:56 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-07-29 15:41 PATCH: Race in 2.6.0-test2 timer code linas
2003-07-29 20:56 ` Andrew Morton
2003-07-30  5:57   ` Ingo Molnar
2003-07-30  6:36     ` Andrew Morton
2003-07-30  7:07       ` Ingo Molnar
2003-07-30  7:34         ` Andrea Arcangeli
2003-07-30  7:34           ` Ingo Molnar
2003-07-30  8:28             ` Andrea Arcangeli
2003-07-30 10:31               ` Ingo Molnar
2003-07-30 11:16                 ` Andrea Arcangeli
2003-07-30 11:49                   ` Ingo Molnar
2003-07-30 12:34                     ` Andrea Arcangeli
2003-07-30 21:18                       ` linas
2003-07-30 22:06                         ` Andrea Arcangeli
2003-07-30 22:17                           ` Andrea Arcangeli
2003-07-31  7:04                             ` Ingo Molnar
2003-07-30 21:19                       ` Andrea Arcangeli
2003-07-30 23:43                 ` linas
2003-07-30 23:56                   ` Andrea Arcangeli [this message]
2003-07-30 23:54                     ` Andrew Morton
2003-07-31  0:16                       ` Andrea Arcangeli
2003-07-31 17:23                     ` linas
2003-08-01  6:27                       ` Ingo Molnar
2003-07-30  7:40           ` Ingo Molnar
2003-07-30  8:37             ` Andrea Arcangeli
2003-07-30 10:34               ` Ingo Molnar
2003-07-30 10:51                 ` Andrew Morton
2003-07-30 11:28                   ` Andrea Arcangeli
2003-07-30 11:22                 ` Andrea Arcangeli
2003-07-30 20:05     ` linas
2003-07-31  6:50       ` Ingo Molnar
2003-07-31 22:56     ` linas
2003-08-01  6:23       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20030730235607.GC322@dualathlon.random \
    --to=andrea@suse.de \
    --cc=akpm@osdl.org \
    --cc=linas@austin.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=olh@suse.de \
    --cc=olof@austin.bim.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).