LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Simon Kirby <sim@hostway.ca>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Dave Jones <davej@redhat.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: Linux 3.1-rc9
Date: Tue, 18 Oct 2011 11:05:13 +0200
Message-ID: <1318928713.21167.4.camel@twins> (raw)
In-Reply-To: <alpine.LFD.2.02.1110181037120.3240@ionos>

On Tue, 2011-10-18 at 10:39 +0200, Thomas Gleixner wrote:
> On Mon, 17 Oct 2011, Thomas Gleixner wrote:
> > That said, I really need some sleep before I can make a final
> > judgement on that horror. The call paths are such an intermingled mess
> > that it's not funny anymore. I do that tomorrow morning first thing.
> 
> The patch is safe and the exit race just existed in my confused tired
> brain. Peter, can you please provide a changelog. That wants a cc
> stable as well, because that deadlock causing commit hit 3.0.7 :( 

---
Subject: cputimer: Cure lock inversion
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Mon Oct 17 11:50:30 CEST 2011

There's a lock inversion between the cputimer->lock and rq->lock; notably
the two callchains involved are:

 update_rlimit_cpu()
   sighand->siglock
   set_process_cpu_timer()
     cpu_timer_sample_group()
       thread_group_cputimer()
         cputimer->lock
         thread_group_cputime()
           task_sched_runtime()
             ->pi_lock
             rq->lock

 scheduler_tick()
   rq->lock
   task_tick_fair()
     update_curr()
       account_group_exec()
         cputimer->lock

Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping up-to-date.

This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer->lock and rq->lock nesting,
this leaves concurrent enablers doing duplicate work, but the time
wasted should be on the same order otherwise wasted spinning on the
lock and the greater-than assignment filter should ensure we preserve
monotonicity.

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Cc: stable@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/posix-cpu-timers.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux-2.6/kernel/posix-cpu-timers.c
===================================================================
--- linux-2.6.orig/kernel/posix-cpu-timers.c
+++ linux-2.6/kernel/posix-cpu-timers.c
@@ -274,9 +274,7 @@ void thread_group_cputimer(struct task_s
 	struct task_cputime sum;
 	unsigned long flags;
 
-	spin_lock_irqsave(&cputimer->lock, flags);
 	if (!cputimer->running) {
-		cputimer->running = 1;
 		/*
 		 * The POSIX timer interface allows for absolute time expiry
 		 * values through the TIMER_ABSTIME flag, therefore we have
@@ -284,8 +282,11 @@ void thread_group_cputimer(struct task_s
 		 * it.
 		 */
 		thread_group_cputime(tsk, &sum);
+		spin_lock_irqsave(&cputimer->lock, flags);
+		cputimer->running = 1;
 		update_gt_cputime(&cputimer->cputime, &sum);
-	}
+	} else
+		spin_lock_irqsave(&cputimer->lock, flags);
 	*times = cputimer->cputime;
 	spin_unlock_irqrestore(&cputimer->lock, flags);
 }


  reply index

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-05  1:40 Linus Torvalds
2011-10-07  7:08 ` Simon Kirby
2011-10-07 17:48   ` Simon Kirby
2011-10-07 18:01     ` Peter Zijlstra
2011-10-08  0:33       ` Simon Kirby
2011-10-08  0:50       ` Simon Kirby
2011-10-08  7:55         ` Peter Zijlstra
2011-10-12 21:35           ` Simon Kirby
2011-10-13 23:25             ` Simon Kirby
2011-10-17  1:39               ` Linus Torvalds
2011-10-17  4:58                 ` Ingo Molnar
2011-10-17  9:03                   ` Thomas Gleixner
2011-10-17 10:40                     ` Peter Zijlstra
2011-10-17 11:40                       ` Alan Cox
2011-10-17 18:49                     ` Ingo Molnar
2011-10-17 20:35                       ` H. Peter Anvin
2011-10-17 21:19                         ` Ingo Molnar
2011-10-17 21:22                           ` H. Peter Anvin
2011-10-17 21:39                             ` Ingo Molnar
2011-10-17 22:03                               ` Ingo Molnar
2011-10-17 22:04                                 ` Ingo Molnar
2011-10-17 22:08                               ` H. Peter Anvin
2011-10-18  6:01                                 ` Ingo Molnar
2011-10-18  7:12                                 ` Geert Uytterhoeven
2011-10-18 18:50                                   ` H. Peter Anvin
2011-10-17 21:31                           ` Ingo Molnar
2011-10-17  7:55                 ` Martin Schwidefsky
2011-10-17  9:12                   ` Peter Zijlstra
2011-10-17  9:18                     ` Martin Schwidefsky
2011-10-17 20:48                   ` H. Peter Anvin
2011-10-18  7:20                     ` Martin Schwidefsky
2011-10-17 10:34                 ` Peter Zijlstra
2011-10-17 14:07                   ` Martin Schwidefsky
2011-10-17 14:57                   ` Linus Torvalds
2011-10-17 17:54                     ` Peter Zijlstra
2011-10-17 18:31                       ` Linus Torvalds
2011-10-17 19:23                         ` Peter Zijlstra
2011-10-17 21:00                           ` Thomas Gleixner
2011-10-18  8:39                             ` Thomas Gleixner
2011-10-18  9:05                               ` Peter Zijlstra [this message]
2011-10-18 14:59                                 ` Linus Torvalds
2011-10-18 15:26                                   ` Thomas Gleixner
2011-10-18 18:07                                   ` Ingo Molnar
2011-10-18 18:14                                   ` [GIT PULL] timer fix Ingo Molnar
2011-10-18 16:13                                 ` Linux 3.1-rc9 Dave Jones
2011-10-18 18:20                                 ` Simon Kirby
2011-10-18 19:48                                   ` Thomas Gleixner
2011-10-18 20:12                                     ` Linus Torvalds
2011-10-25 15:26                                       ` Simon Kirby
2011-10-26  1:47                                         ` Yong Zhang
2011-10-24 19:02                                     ` Simon Kirby
2011-10-25  7:13                                       ` Linus Torvalds
2011-10-25  9:01                                         ` David Miller
2011-10-25 12:30                                           ` Thomas Gleixner
2011-10-25 23:18                                             ` David Miller
2011-10-25 20:20                                       ` Simon Kirby
2011-10-31 17:32                                         ` Simon Kirby
2011-11-02 16:40                                           ` Thomas Gleixner
2011-11-02 17:27                                             ` Eric Dumazet
2011-11-02 17:46                                               ` Linus Torvalds
2011-11-02 17:53                                                 ` Eric Dumazet
2011-11-02 18:00                                                   ` Linus Torvalds
2011-11-02 18:05                                                     ` Eric Dumazet
2011-11-02 18:10                                                       ` Linus Torvalds
2011-11-02 17:49                                               ` Eric Dumazet
2011-11-02 17:58                                                 ` Eric Dumazet
2011-11-02 19:16                                                   ` Simon Kirby
2011-11-02 22:42                                                     ` Eric Dumazet
2011-11-03  0:24                                                       ` Thomas Gleixner
2011-11-03  0:52                                                       ` Simon Kirby
2011-11-03 22:07                                                         ` David Miller
2011-11-03  6:06                                                       ` Jörg-Volker Peetz
2011-11-03  6:26                                                         ` Eric Dumazet
2011-11-03  6:43                                                           ` David Miller
2011-11-02 17:54                                               ` Thomas Gleixner
2011-11-02 18:04                                                 ` Eric Dumazet
2011-11-02 18:28                                             ` Simon Kirby
2011-11-02 18:30                                               ` Thomas Gleixner
2011-11-02 22:10                                           ` Steven Rostedt
2011-11-02 23:00                                             ` Steven Rostedt
2011-11-03  0:09                                               ` Simon Kirby
2011-11-03  0:15                                                 ` Steven Rostedt
2011-11-03  0:17                                                   ` Simon Kirby
2011-11-18 23:11                                         ` [tip:perf/core] lockdep: Show subclass in pretty print of lockdep output tip-bot for Steven Rostedt
2011-10-20 14:36                 ` Linux 3.1-rc9 Martin Schwidefsky
2011-10-23 11:34                   ` Ingo Molnar
2011-10-24  7:48                     ` Martin Schwidefsky
2011-10-24  7:51                       ` Linus Torvalds
2011-10-24  8:08                         ` Martin Schwidefsky
2011-10-18  5:40             ` Simon Kirby
2011-10-09 20:51 ` Arkadiusz Miśkiewicz
2011-10-10  2:29   ` [tpmdd-devel] " Stefan Berger
2011-10-10 16:23     ` Rajiv Andrade
2011-10-10 17:05       ` Arkadiusz Miśkiewicz
2011-10-10 17:22         ` Stefan Berger
2011-10-10 17:57           ` Arkadiusz Miśkiewicz
2011-10-10 21:08             ` Arkadiusz Miśkiewicz
2011-10-11  7:09             ` [tpmdd-devel] " Peter.Huewe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1318928713.21167.4.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=davej@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=schwidefsky@de.ibm.com \
    --cc=sim@hostway.ca \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git