* slow nanosleep?
@ 2010-09-08  7:45 Joakim Tjernlund
  2010-09-08  7:56 ` Eric Dumazet
  0 siblings, 1 reply; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08  7:45 UTC (permalink / raw)
  To: Thomas Gleixner, linux-kernel


Hi Thomas

While playing with nanosleep I noticed that it is slow
compared to select. This little test program shows
the effect:
#include <time.h>
#include <sys/time.h>
#include <sys/select.h>
#include <stdio.h>
/* Remove this define to build the select() variant (./selectsleep). */
#define NANO_SLEEP 1
int main(void)
{
	struct timespec req, rem;
	struct timeval tv1, tv2, tv_res;
	int res;

	rem.tv_sec = 0;
	rem.tv_nsec = 0;

	/* Requested sleep time: zero. */
	req.tv_sec = 0;
	req.tv_nsec = 0;

	/* The same timeout as a timeval, used by select(). */
	tv2.tv_sec = req.tv_sec;
	tv2.tv_usec = req.tv_nsec/1000;

	gettimeofday(&tv1, NULL);
#ifdef NANO_SLEEP
	res = nanosleep(&req, &rem);
#else
	res = select(0, NULL,NULL,NULL, &tv2);
#endif
	gettimeofday(&tv2, NULL);
	/* tv_res = tv2 - tv1, the wall-clock time spent in the call. */
	timersub(&tv2, &tv1, &tv_res);
#ifdef NANO_SLEEP
	printf("nanosleep\n");
#else
	printf("selectsleep\n");
#endif
	printf("req:%d :%d\n", (int)req.tv_sec, (int)req.tv_nsec/1000);
	printf("tv_res:%d :%d\n", (int)tv_res.tv_sec, (int)tv_res.tv_usec);
	(void)res;	/* return value is not checked in this test */
	return 0;
}
root@localhost ~ # ./nanosleep
nanosleep
req:0 :0
tv_res:0 :119
root@localhost ~ # ./selectsleep
selectsleep
req:0 :0
tv_res:0 :36


Isn't nanosleep too slow here? The min time is about 120 us compared
to select which is 36 us. I would expect nanosleep to be better than
select.

Kernel 2.6.35 with HIGH_RES timers on PowerPC (MPC8321, 266 MHz);
x86 shows the same effect.

    Jocke


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: slow nanosleep?
  2010-09-08  7:45 slow nanosleep? Joakim Tjernlund
@ 2010-09-08  7:56 ` Eric Dumazet
  2010-09-08  8:04   ` Joakim Tjernlund
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2010-09-08  7:56 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Thomas Gleixner, linux-kernel

On Wednesday, 08 September 2010 at 09:45 +0200, Joakim Tjernlund wrote:
> Hi Thomas
> 
> while playing with nanosleep I noticed that it is slow
> compared to select. This little test program shows
> the effect:
> #include <time.h>
> #include <sys/time.h>
> #include <stdio.h>
> #define NANO_SLEEP 1
> main()
> {
> 	struct timespec req, rem;
> 	struct timeval tv1, tv2, tv_res;
> 	int res;
> 
> 	rem.tv_sec = 0;
> 	rem.tv_nsec = 0;
> 
> 	req.tv_sec = 0;
> 	req.tv_nsec = 0;
> 
> 	tv2.tv_sec = req.tv_sec;
> 	tv2.tv_usec = req.tv_nsec/1000;
> 
> 	gettimeofday(&tv1, NULL);
> #ifdef NANO_SLEEP
> 	res = nanosleep(&req, &rem);
> #else
> 	res = select(0, NULL,NULL,NULL, &tv2);
> #endif
> 	gettimeofday(&tv2, NULL);
> 	timersub(&tv2, &tv1, &tv_res);
> #ifdef NANO_SLEEP
> 	printf("nanosleep\n");
> #else
> 	printf("selectsleep\n");
> #endif
> 	printf("req:%d :%d\n", (int)req.tv_sec, (int)req.tv_nsec/1000);
> 	printf("tv_res:%d :%d\n", (int)tv_res.tv_sec, (int)tv_res.tv_usec);
> }
> root@localhost ~ # ./nanosleep
> nanosleep
> req:0 :0
> tv_res:0 :119
> root@localhost ~ # ./selectsleep
> selectsleep
> req:0 :0
> tv_res:0 :36
> 
> 
> Isn't nanosleep too slow here? The min time is about 120 us compared
> to select which is 36 us. I would expect nanosleep to be better than
> select.
> 
> Kernel 2.6.35 with HIGH_RES timers on Powerpc(MPC8321, 266 MHz)
> x86 shows the same effect.
> 

You need :

#define PR_SET_TIMERSLACK 29

prctl(PR_SET_TIMERSLACK, 1); /* 1 nsec resolution, please */





* Re: slow nanosleep?
  2010-09-08  7:56 ` Eric Dumazet
@ 2010-09-08  8:04   ` Joakim Tjernlund
  2010-09-08  8:24     ` Eric Dumazet
  0 siblings, 1 reply; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08  8:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Thomas Gleixner

Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/09/08 09:56:40:
>
> On Wednesday, 08 September 2010 at 09:45 +0200, Joakim Tjernlund wrote:
> > Hi Thomas
> >
> > while playing with nanosleep I noticed that it is slow
> > compared to select. This little test program shows
> > the effect:
> > #include <time.h>
> > #include <sys/time.h>
> > #include <stdio.h>
> > #define NANO_SLEEP 1
> > main()
> > {
> >    struct timespec req, rem;
> >    struct timeval tv1, tv2, tv_res;
> >    int res;
> >
> >    rem.tv_sec = 0;
> >    rem.tv_nsec = 0;
> >
> >    req.tv_sec = 0;
> >    req.tv_nsec = 0;
> >
> >    tv2.tv_sec = req.tv_sec;
> >    tv2.tv_usec = req.tv_nsec/1000;
> >
> >    gettimeofday(&tv1, NULL);
> > #ifdef NANO_SLEEP
> >    res = nanosleep(&req, &rem);
> > #else
> >    res = select(0, NULL,NULL,NULL, &tv2);
> > #endif
> >    gettimeofday(&tv2, NULL);
> >    timersub(&tv2, &tv1, &tv_res);
> > #ifdef NANO_SLEEP
> >    printf("nanosleep\n");
> > #else
> >    printf("selectsleep\n");
> > #endif
> >    printf("req:%d :%d\n", (int)req.tv_sec, (int)req.tv_nsec/1000);
> >    printf("tv_res:%d :%d\n", (int)tv_res.tv_sec, (int)tv_res.tv_usec);
> > }
> > root@localhost ~ # ./nanosleep
> > nanosleep
> > req:0 :0
> > tv_res:0 :119
> > root@localhost ~ # ./selectsleep
> > selectsleep
> > req:0 :0
> > tv_res:0 :36
> >
> >
> > Isn't nanosleep too slow here? The min time is about 120 us compared
> > to select which is 36 us. I would expect nanosleep to be better than
> > select.
> >
> > Kernel 2.6.35 with HIGH_RES timers on Powerpc(MPC8321, 266 MHz)
> > x86 shows the same effect.
> >
>
> You need :
>
> #define PR_SET_TIMERSLACK 29
>
> prctl(PR_SET_TIMERSLACK, 1); /* 1 nsec resolution, please */

That makes little difference:
root@localhost ~ # ./nanosleep
nanosleep
req:0 :0
tv_res:0 :112



* Re: slow nanosleep?
  2010-09-08  8:04   ` Joakim Tjernlund
@ 2010-09-08  8:24     ` Eric Dumazet
  2010-09-08  9:12       ` Joakim Tjernlund
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2010-09-08  8:24 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linux-kernel, Thomas Gleixner

On Wednesday, 08 September 2010 at 10:04 +0200, Joakim Tjernlund wrote:

> That makes little difference:
> root@localhost ~ # ./nanosleep
> nanosleep
> req:0 :0
> tv_res:0 :112
> 

Here, the result is 30 (with prctl())
instead of 95 (without).






* Re: slow nanosleep?
  2010-09-08  8:24     ` Eric Dumazet
@ 2010-09-08  9:12       ` Joakim Tjernlund
  2010-09-08  9:51         ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08  9:12 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, Thomas Gleixner

Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/09/08 10:24:49:
>
> On Wednesday, 08 September 2010 at 10:04 +0200, Joakim Tjernlund wrote:
>
> > That makes little difference:
> > root@localhost ~ # ./nanosleep
> > nanosleep
> > req:0 :0
> > tv_res:0 :112
> >
>
> Here, result is 30 (with prctl())
> instead of 95 (without)

On x86 I notice a difference:
  7 vs 57.
However, select is still faster: 2.
The system call overhead seems to be bigger for nanosleep than
for select, and on my ppc it is about 112 us. Can anything
be done about that?

  Jocke



* Re: slow nanosleep?
  2010-09-08  9:12       ` Joakim Tjernlund
@ 2010-09-08  9:51         ` Thomas Gleixner
  2010-09-08 10:14           ` Eric Dumazet
  2010-09-08 12:11           ` Joakim Tjernlund
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Gleixner @ 2010-09-08  9:51 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Eric Dumazet, linux-kernel

On Wed, 8 Sep 2010, Joakim Tjernlund wrote:

> Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/09/08 10:24:49:
> >
> > On Wednesday, 08 September 2010 at 10:04 +0200, Joakim Tjernlund wrote:
> >
> > > That makes little difference:
> > > root@localhost ~ # ./nanosleep
> > > nanosleep
> > > req:0 :0
> > > tv_res:0 :112
> > >
> >
> > Here, result is 30 (with prctl())
> > instead of 95 (without)
> 
> On x86 I notice a difference:
>   7 vs 57.
> however select is still faster: 2
> The system call OH seems to be bigger for nanosleep than
> for select and on my ppc it is about 112 us. Can anything
> be done about that?

Hmm, the only reason I can see is that select eventually calls
schedule_hrtimeout_range_clock(), which optimizes the expiry = 0 case,
while nanosleep does not. So the difference is only visible when the
relative timeout is 0. For timeouts > 0 nanosleep and select should
behave the same way.

Thanks,

	tglx
---
 kernel/hrtimer.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6/kernel/hrtimer.c
===================================================================
--- linux-2.6.orig/kernel/hrtimer.c
+++ linux-2.6/kernel/hrtimer.c
@@ -1609,6 +1609,9 @@ SYSCALL_DEFINE2(nanosleep, struct timesp
 	if (!timespec_valid(&tu))
 		return -EINVAL;
 
+	if (!tu.tv_sec && !tu.tv_usec)
+		return 0;
+
 	return hrtimer_nanosleep(&tu, rmtp, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
 


* Re: slow nanosleep?
  2010-09-08  9:51         ` Thomas Gleixner
@ 2010-09-08 10:14           ` Eric Dumazet
  2010-09-08 10:17             ` Thomas Gleixner
  2010-09-08 12:11           ` Joakim Tjernlund
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2010-09-08 10:14 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Joakim Tjernlund, linux-kernel

On Wednesday, 08 September 2010 at 11:51 +0200, Thomas Gleixner wrote:
> +       if (!tu.tv_sec && !tu.tv_usec)
> +               return 0;
> + 

You need tu.tv_nsec here ;)




* Re: slow nanosleep?
  2010-09-08 10:14           ` Eric Dumazet
@ 2010-09-08 10:17             ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2010-09-08 10:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Joakim Tjernlund, linux-kernel

On Wed, 8 Sep 2010, Eric Dumazet wrote:

> On Wednesday, 08 September 2010 at 11:51 +0200, Thomas Gleixner wrote:
> > +       if (!tu.tv_sec && !tu.tv_usec)
> > +               return 0;
> > + 
> 
> You need tu.tv_nsec here ;)

Details :)


* Re: slow nanosleep?
  2010-09-08  9:51         ` Thomas Gleixner
  2010-09-08 10:14           ` Eric Dumazet
@ 2010-09-08 12:11           ` Joakim Tjernlund
  2010-09-08 12:43             ` Thomas Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08 12:11 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Eric Dumazet, linux-kernel

Thomas Gleixner <tglx@linutronix.de> wrote on 2010/09/08 11:51:13:
>
> On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
>
> > Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/09/08 10:24:49:
> > >
> > > On Wednesday, 08 September 2010 at 10:04 +0200, Joakim Tjernlund wrote:
> > >
> > > > That makes little difference:
> > > > root@localhost ~ # ./nanosleep
> > > > nanosleep
> > > > req:0 :0
> > > > tv_res:0 :112
> > > >
> > >
> > > Here, result is 30 (with prctl())
> > > instead of 95 (without)
> >
> > On x86 I notice a difference:
> >   7 vs 57.
> > however select is still faster: 2
> > The system call OH seems to be bigger for nanosleep than
> > for select and on my ppc it is about 112 us. Can anything
> > be done about that?
>
> Hmm, the only reason I can see is that select calls finally
> schedule_hrtimeout_range_clock() which optimizes the expiry = 0 case
> while nanosleep does not. So the difference is only visible when the
> relative timeout is 0. For timeouts > 0 nanosleep and select should
> behave the same way.

Yes, that is it. Thanks

However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
and 20 us on my slower ppc board. Is that system call overhead
or possibly some error?

      Jocke



* Re: slow nanosleep?
  2010-09-08 12:11           ` Joakim Tjernlund
@ 2010-09-08 12:43             ` Thomas Gleixner
  2010-09-08 13:00               ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2010-09-08 12:43 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Eric Dumazet, linux-kernel

On Wed, 8 Sep 2010, Joakim Tjernlund wrote:

> Thomas Gleixner <tglx@linutronix.de> wrote on 2010/09/08 11:51:13:
> >
> > On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
> >
> > > Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/09/08 10:24:49:
> > > >
> > > > On Wednesday, 08 September 2010 at 10:04 +0200, Joakim Tjernlund wrote:
> > > >
> > > > > That makes little difference:
> > > > > root@localhost ~ # ./nanosleep
> > > > > nanosleep
> > > > > req:0 :0
> > > > > tv_res:0 :112
> > > > >
> > > >
> > > > Here, result is 30 (with prctl())
> > > > instead of 95 (without)
> > >
> > > On x86 I notice a difference:
> > >   7 vs 57.
> > > however select is still faster: 2
> > > The system call OH seems to be bigger for nanosleep than
> > > for select and on my ppc it is about 112 us. Can anything
> > > be done about that?
> >
> > Hmm, the only reason I can see is that select calls finally
> > schedule_hrtimeout_range_clock() which optimizes the expiry = 0 case
> > while nanosleep does not. So the difference is only visible when the
> > relative timeout is 0. For timeouts > 0 nanosleep and select should
> > behave the same way.
> 
> Yes, that is it. Thanks
> 
> However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
> about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
> and 20 us on my slower ppc board. Is that system call overhead
> or possibly some error?

That's overhead, I fear. We go all the way to enqueueing/arming the
timer before we figure out that the timeout has already expired.

Thanks,

	tglx


* Re: slow nanosleep?
  2010-09-08 12:43             ` Thomas Gleixner
@ 2010-09-08 13:00               ` Peter Zijlstra
  2010-09-08 13:44                 ` Joakim Tjernlund
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2010-09-08 13:00 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Joakim Tjernlund, Eric Dumazet, linux-kernel

On Wed, 2010-09-08 at 14:43 +0200, Thomas Gleixner wrote:
> > However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
> > about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
> > and 20 us on my slower ppc board. Is that system call overhead
> > or possibly some error?
> 
> That's overhead I fear. We go way up to enqueue/arm the timer until we
> figure out that the timeout already happened. 

Well, there's also the fact that his ppc board is simply dead slow:
using the freq ratio 3166/266 you'd expect (at a similar ins/clock
ratio) the ppc to take 95us.

So in fact the ppc taking 20us is actually quite good.




* Re: slow nanosleep?
  2010-09-08 13:00               ` Peter Zijlstra
@ 2010-09-08 13:44                 ` Joakim Tjernlund
  2010-09-08 13:51                   ` Peter Zijlstra
  2010-09-08 13:52                   ` Thomas Gleixner
  0 siblings, 2 replies; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08 13:44 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Eric Dumazet, linux-kernel, Thomas Gleixner

Peter Zijlstra <peterz@infradead.org> wrote on 2010/09/08 15:00:18:
>
> On Wed, 2010-09-08 at 14:43 +0200, Thomas Gleixner wrote:
> > > However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
> > > about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
> > > and 20 us on my slower ppc board. Is that system call overhead
> > > or possibly some error?
> >
> > That's overhead I fear. We go way up to enqueue/arm the timer until we
> > figure out that the timeout already happened.
>
> Well, there's also the fact that his ppc board is simply dead slow,
> using the freq ratio: 3166/266 you'd expect (at a similar ins/clock
> ratio) the ppc to take 95us.
>
> So in fact the ppc taking 20us is actually quite good.

Actually, it takes 120 us. The 20 us was when I had Thomas's
timeout == 0 fast path patch applied (forgot to remove it).
Without that patch it takes about 115 us. So it seems it takes
115-20=95 us to turn the timer wheel on my ppc.

 Jocke



* Re: slow nanosleep?
  2010-09-08 13:44                 ` Joakim Tjernlund
@ 2010-09-08 13:51                   ` Peter Zijlstra
  2010-09-08 13:52                   ` Thomas Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2010-09-08 13:51 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Eric Dumazet, linux-kernel, Thomas Gleixner

On Wed, 2010-09-08 at 15:44 +0200, Joakim Tjernlund wrote:
> Peter Zijlstra <peterz@infradead.org> wrote on 2010/09/08 15:00:18:
> >
> > On Wed, 2010-09-08 at 14:43 +0200, Thomas Gleixner wrote:
> > > > However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
> > > > about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
> > > > and 20 us on my slower ppc board. Is that system call overhead
> > > > or possibly some error?
> > >
> > > That's overhead I fear. We go way up to enqueue/arm the timer until we
> > > figure out that the timeout already happened.
> >
> > Well, there's also the fact that his ppc board is simply dead slow,
> > using the freq ratio: 3166/266 you'd expect (at a similar ins/clock
> > ratio) the ppc to take 95us.
> >
> > So in fact the ppc taking 20us is actually quite good.
> 
> Actually, it takes 120 us. The 20 us was when I had Thomas's
> timeout == 0 fast path patch applied (forgot to remove it).
> Without that patch it takes about 115 us. So it seems it takes
> 115-20=95 us to turn the timer wheel on my ppc.

hrtimers don't have a timer wheel, but they do poke at the hardware;
it could be that programming timers on that ppc is terribly slow.



* Re: slow nanosleep?
  2010-09-08 13:44                 ` Joakim Tjernlund
  2010-09-08 13:51                   ` Peter Zijlstra
@ 2010-09-08 13:52                   ` Thomas Gleixner
  2010-09-08 14:19                     ` Joakim Tjernlund
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2010-09-08 13:52 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Peter Zijlstra, Eric Dumazet, linux-kernel

On Wed, 8 Sep 2010, Joakim Tjernlund wrote:

> Peter Zijlstra <peterz@infradead.org> wrote on 2010/09/08 15:00:18:
> >
> > On Wed, 2010-09-08 at 14:43 +0200, Thomas Gleixner wrote:
> > > > However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
> > > > about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
> > > > and 20 us on my slower ppc board. Is that system call overhead
> > > > or possibly some error?
> > >
> > > That's overhead I fear. We go way up to enqueue/arm the timer until we
> > > figure out that the timeout already happened.
> >
> > Well, there's also the fact that his ppc board is simply dead slow,
> > using the freq ratio: 3166/266 you'd expect (at a similar ins/clock
> > ratio) the ppc to take 95us.
> >
> > So in fact the ppc taking 20us is actually quite good.
> 
> Actually, it takes 120 us. The 20 us was when I had Thomas's
> timeout == 0 fast path patch applied (forgot to remove it).
> Without that patch it takes about 115 us. So it seems it takes
> 115-20=95 us to turn the timer wheel on my ppc.

You might fire up the tracer to look where it spends that time.

Thanks,

	tglx


* Re: slow nanosleep?
  2010-09-08 13:52                   ` Thomas Gleixner
@ 2010-09-08 14:19                     ` Joakim Tjernlund
  2010-09-08 14:30                       ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08 14:19 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Eric Dumazet, linux-kernel, Peter Zijlstra



Thomas Gleixner <tglx@linutronix.de> wrote on 2010/09/08 15:52:23:

> From: Thomas Gleixner <tglx@linutronix.de>
> To: Joakim Tjernlund <joakim.tjernlund@transmode.se>
> Cc: Peter Zijlstra <peterz@infradead.org>, Eric Dumazet
> <eric.dumazet@gmail.com>, linux-kernel@vger.kernel.org
> Date: 2010/09/08 15:52
> Subject: Re: slow nanosleep?
>
> On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
>
> > Peter Zijlstra <peterz@infradead.org> wrote on 2010/09/08 15:00:18:
> > >
> > > On Wed, 2010-09-08 at 14:43 +0200, Thomas Gleixner wrote:
> > > > > However nanosleep with 1 ns and prctl(PR_SET_TIMERSLACK, 1) takes
> > > > > about 8 us on x86(Intel(R) Core(TM)2 Duo CPU E8500  @ 3.16GHz)
> > > > > and 20 us on my slower ppc board. Is that system call overhead
> > > > > or possibly some error?
> > > >
> > > > That's overhead I fear. We go way up to enqueue/arm the timer until we
> > > > figure out that the timeout already happened.
> > >
> > > Well, there's also the fact that his ppc board is simply dead slow,
> > > using the freq ratio: 3166/266 you'd expect (at a similar ins/clock
> > > ratio) the ppc to take 95us.
> > >
> > > So in fact the ppc taking 20us is actually quite good.
> >
> > Actually, it takes 120 us. The 20 us was when I had Thomas's
> > timeout == 0 fast path patch applied (forgot to remove it).
> > Without that patch it takes about 115 us. So it seems it takes
> > 115-20=95 us to turn the timer wheel on my ppc.
>
> You might fire up the tracer to look where it spends that time.

This helps for short (1 ns) nanosleeps; they now take 25 us. No idea
if this is any good, just tossing it out for you to tear apart :)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 5c69e99..e612016 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1545,6 +1545,9 @@ long __sched hrtimer_nanosleep_restart(struct restart_block *restart)
 				HRTIMER_MODE_ABS);
 	hrtimer_set_expires_tv64(&t.timer, restart->nanosleep.expires);

+	if (!hrtimer_active(&t.timer))
+		goto out;
+
 	if (do_nanosleep(&t, HRTIMER_MODE_ABS))
 		goto out;

@@ -1576,6 +1579,9 @@ long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,

 	hrtimer_init_on_stack(&t.timer, clockid, mode);
 	hrtimer_set_expires_range_ns(&t.timer, timespec_to_ktime(*rqtp), slack);
+	if (!hrtimer_active(&t.timer))
+		goto out;
+
 	if (do_nanosleep(&t, mode))
 		goto out;





* Re: slow nanosleep?
  2010-09-08 14:19                     ` Joakim Tjernlund
@ 2010-09-08 14:30                       ` Thomas Gleixner
  2010-09-08 14:33                         ` Joakim Tjernlund
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2010-09-08 14:30 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Eric Dumazet, linux-kernel, Peter Zijlstra

On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
> Thomas Gleixner <tglx@linutronix.de> wrote on 2010/09/08 15:52:23:
> > On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
> > > 
> > > Actually, it takes 120 us. The 20 us was when I had Thomas's
> > > timeout == 0 fast path patch applied (forgot to remove it).
> > > Without that patch it takes about 115 us. So it seems it takes
> > > 115-20=95 us to turn the timer wheel on my ppc.
> >
> > You might fire up the tracer to look where it spends that time.
> 
> This helps for short(1 ns) nanosleeps, sleeps for 25 us. No idea
> if this is any good, just tossing it out for you to tear apart :)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 5c69e99..e612016 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1545,6 +1545,9 @@ long __sched hrtimer_nanosleep_restart(struct restart_block *restart)
>  				HRTIMER_MODE_ABS);
>  	hrtimer_set_expires_tv64(&t.timer, restart->nanosleep.expires);
> 
> +	if (!hrtimer_active(&t.timer))
> +		goto out;

That will actually return for any expiry time. The timer is armed in
do_nanosleep(), not in hrtimer_set_expires_tv64() /
hrtimer_set_expires_range_ns().

Thanks,

	tglx


* Re: slow nanosleep?
  2010-09-08 14:30                       ` Thomas Gleixner
@ 2010-09-08 14:33                         ` Joakim Tjernlund
  0 siblings, 0 replies; 17+ messages in thread
From: Joakim Tjernlund @ 2010-09-08 14:33 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Eric Dumazet, linux-kernel, Peter Zijlstra

Thomas Gleixner <tglx@linutronix.de> wrote on 2010/09/08 16:30:07:
>
> On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
> > Thomas Gleixner <tglx@linutronix.de> wrote on 2010/09/08 15:52:23:
> > > On Wed, 8 Sep 2010, Joakim Tjernlund wrote:
> > > >
> > > > Actually, it takes 120 us. The 20 us was when I had Thomas's
> > > > timeout == 0 fast path patch applied (forgot to remove it).
> > > > Without that patch it takes about 115 us. So it seems it takes
> > > > 115-20=95 us to turn the timer wheel on my ppc.
> > >
> > > You might fire up the tracer to look where it spends that time.
> >
> > This helps for short(1 ns) nanosleeps, sleeps for 25 us. No idea
> > if this is any good, just tossing it out for you to tear apart :)
> >
> > diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> > index 5c69e99..e612016 100644
> > --- a/kernel/hrtimer.c
> > +++ b/kernel/hrtimer.c
> > @@ -1545,6 +1545,9 @@ long __sched hrtimer_nanosleep_restart(struct
> restart_block *restart)
> >              HRTIMER_MODE_ABS);
> >     hrtimer_set_expires_tv64(&t.timer, restart->nanosleep.expires);
> >
> > +   if (!hrtimer_active(&t.timer))
> > +      goto out;
>
> That actually will return for any expiry time. The timer is armed in
> do_nanosleep() not in hrtimer_set_expires_tv64() /
> hrtimer_set_expires_range_ns()

eehh, I should have tested with bigger nanosleeps as well :(



Thread overview: 17+ messages
2010-09-08  7:45 slow nanosleep? Joakim Tjernlund
2010-09-08  7:56 ` Eric Dumazet
2010-09-08  8:04   ` Joakim Tjernlund
2010-09-08  8:24     ` Eric Dumazet
2010-09-08  9:12       ` Joakim Tjernlund
2010-09-08  9:51         ` Thomas Gleixner
2010-09-08 10:14           ` Eric Dumazet
2010-09-08 10:17             ` Thomas Gleixner
2010-09-08 12:11           ` Joakim Tjernlund
2010-09-08 12:43             ` Thomas Gleixner
2010-09-08 13:00               ` Peter Zijlstra
2010-09-08 13:44                 ` Joakim Tjernlund
2010-09-08 13:51                   ` Peter Zijlstra
2010-09-08 13:52                   ` Thomas Gleixner
2010-09-08 14:19                     ` Joakim Tjernlund
2010-09-08 14:30                       ` Thomas Gleixner
2010-09-08 14:33                         ` Joakim Tjernlund
