linux-kernel.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: ben@iagu.net, Thomas Gleixner <tglx@linutronix.de>
Cc: Avi Kivity <avi@redhat.com>, KVM list <kvm@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	John Stultz <johnstul@us.ibm.com>,
	Richard Cochran <richard.cochran@omicron.at>
Subject: [PATCH] posix-timers: RCU conversion
Date: Tue, 22 Mar 2011 08:09:20 +0100
Message-ID: <1300777760.2837.38.camel@edumazet-laptop>
In-Reply-To: <1300746429.2837.20.camel@edumazet-laptop>

Ben Nagy reported a scalability problem with KVM/QEMU: on his 48-core
machine, a single spinlock (idr_lock) in the posix-timers code is very
heavily contended.

Even on a 16-cpu machine (2x4x2), a single test can show 98% of cpu time
spent in ticket_spin_lock, called from lock_timer.

Ref: http://www.spinics.net/lists/kvm/msg51526.html

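Switching to RCU is quite easy, since IDR is already RCU ready.

With this change, idr_lock needs to be taken only for an insert/delete,
not for a lookup. Lookups run under rcu_read_lock(), and k_itimer
structures are freed through call_rcu() so that a concurrent lookup never
touches freed memory.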

Benchmark on a 2x4x2 machine, 16 processes calling timer_gettime().

Before:

real	1m18.669s
user	0m1.346s
sys	1m17.180s

After:

real	0m3.296s
user	0m1.366s
sys	0m1.926s
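
The benchmark program itself is not part of this message. As a rough
userspace sketch of such a test, the code below forks a number of
processes that each create one POSIX timer and call timer_gettime() in a
loop, so that every call goes through the timer lookup path changed by
this patch. The names timer_gettime_loop() and run_benchmark(), the
choice of CLOCK_MONOTONIC with SIGEV_NONE, and the iteration count are
assumptions made for illustration, not details taken from the original
test. On older glibc versions it may need to be linked with -lrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Child body (hypothetical helper): create one timer, query it repeatedly. */
static int timer_gettime_loop(long iterations)
{
	struct sigevent sev;
	struct itimerspec cur;
	timer_t id;
	long i;

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_NONE;	/* the timer is never armed */
	if (timer_create(CLOCK_MONOTONIC, &sev, &id) != 0)
		return -1;

	for (i = 0; i < iterations; i++) {
		/* Each call looks the timer up by id inside the kernel. */
		if (timer_gettime(id, &cur) != 0) {
			timer_delete(id);
			return -1;
		}
	}
	return timer_delete(id);
}

/* Fork nprocs children; return 0 only if every child succeeds. */
int run_benchmark(int nprocs, long iterations)
{
	int i, status, failed = 0;

	for (i = 0; i < nprocs; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failed = 1;
			break;
		}
		if (pid == 0)
			_exit(timer_gettime_loop(iterations) ? 1 : 0);
	}

	/* Reap only the children that were actually started. */
	while (i-- > 0) {
		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed = 1;
	}
	return failed ? -1 : 0;
}
```

A main() that calls run_benchmark(16, n) and is run under time(1) would
give figures comparable in shape to the ones above; the value of n used
for the original measurements is not stated here.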


Reported-by: Ben Nagy <ben@iagu.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Richard Cochran <richard.cochran@omicron.at>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/posix-timers.h |    1 +
 kernel/posix-timers.c        |   25 ++++++++++++++-----------
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index d51243a..5dc27ca 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -81,6 +81,7 @@ struct k_itimer {
 			unsigned long expires;
 		} mmtimer;
 	} it;
+	struct rcu_head rcu;
 };
 
 struct k_clock {
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 4c01249..acb9be9 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -491,6 +491,13 @@ static struct k_itimer * alloc_posix_timer(void)
 	return tmr;
 }
 
+static void k_itimer_rcu_free(struct rcu_head *head)
+{
+	struct k_itimer *tmr = container_of(head, struct k_itimer, rcu);
+
+	kmem_cache_free(posix_timers_cache, tmr);
+}
+
 #define IT_ID_SET	1
 #define IT_ID_NOT_SET	0
 static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
@@ -503,7 +510,7 @@ static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
 	}
 	put_pid(tmr->it_pid);
 	sigqueue_free(tmr->sigq);
-	kmem_cache_free(posix_timers_cache, tmr);
+	call_rcu(&tmr->rcu, k_itimer_rcu_free);
 }
 
 static struct k_clock *clockid_to_kclock(const clockid_t id)
@@ -631,22 +638,18 @@ out:
 static struct k_itimer *__lock_timer(timer_t timer_id, unsigned long *flags)
 {
 	struct k_itimer *timr;
-	/*
-	 * Watch out here.  We do a irqsave on the idr_lock and pass the
-	 * flags part over to the timer lock.  Must not let interrupts in
-	 * while we are moving the lock.
-	 */
-	spin_lock_irqsave(&idr_lock, *flags);
+
+	rcu_read_lock();
 	timr = idr_find(&posix_timers_id, (int)timer_id);
 	if (timr) {
-		spin_lock(&timr->it_lock);
+		spin_lock_irqsave(&timr->it_lock, *flags);
 		if (timr->it_signal == current->signal) {
-			spin_unlock(&idr_lock);
+			rcu_read_unlock();
 			return timr;
 		}
-		spin_unlock(&timr->it_lock);
+		spin_unlock_irqrestore(&timr->it_lock, *flags);
 	}
-	spin_unlock_irqrestore(&idr_lock, *flags);
+	rcu_read_unlock();
 
 	return NULL;
 }