linux-kernel.vger.kernel.org archive mirror
* Running an Ivy Bridge cpu at fixed frequency
@ 2019-12-04 17:01 David Laight
  2019-12-04 17:57 ` Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: David Laight @ 2019-12-04 17:01 UTC (permalink / raw)
  To: x86, linux-kernel

Is there any way to persuade the intel_pstate driver to make an Ivy bridge (i7-3770)
cpu run at a fixed frequency?
It is really difficult to compare code execution times when the cpu clock speed
keeps changing.
I thought I'd managed by setting the 'scaling_max_freq' to 1.7GHz, but even that
doesn't seem to be working now.
It would also be nice to run a little faster than that - but without it 'randomly'
going to 'turbo' frequencies (which it is doing even after I've set no_turbo to 1).

An alternative would be a variable frequency TSC - might give more consistent values.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)



* Re: Running an Ivy Bridge cpu at fixed frequency
  2019-12-04 17:01 Running an Ivy Bridge cpu at fixed frequency David Laight
@ 2019-12-04 17:57 ` Andy Lutomirski
  2019-12-05  9:45 ` Peter Zijlstra
  2019-12-06 14:47 ` Alexey Klimov
  2 siblings, 0 replies; 8+ messages in thread
From: Andy Lutomirski @ 2019-12-04 17:57 UTC (permalink / raw)
  To: David Laight; +Cc: x86, linux-kernel

On Wed, Dec 4, 2019 at 9:01 AM David Laight <David.Laight@aculab.com> wrote:
>
> Is there any way to persuade the intel_pstate driver to make an Ivy bridge (i7-3770)
> cpu run at a fixed frequency?
> It is really difficult to compare code execution times when the cpu clock speed
> keeps changing.
> I thought I'd managed by setting the 'scaling_max_freq' to 1.7GHz, but even that
> doesn't seem to be working now.
> It would also be nice to run a little faster than that - but without it 'randomly'
> going to 'turbo' frequencies (which it is doing even after I've set no_turbo to 1).
>

I don't remember.  I'm sure I could figure out what MSR to write, but
that's not the answer you're looking for.  Someone else will know :)

> An alternative would be a variable frequency TSC - might give more consistent values.

You can quite easily use perf to count cycles.  I never really
finished it, but this is a tiny little library that should do exactly
what you need.  It's a bit messy.

https://git.kernel.org/pub/scm/linux/kernel/git/luto/misc-tests.git/tree/tight_loop/perf_self_monitor.c
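
As a rough illustration of counting cycles with perf instead of timing with the TSC, the sketch below opens a CPU-cycles counter for the calling thread with perf_event_open(2) and reads it back with read(2). It is not taken from perf_self_monitor.c: the helper names cycles_attr_init, cycles_open and cycles_read are invented for this example, and the open may fail where perf_event_paranoid restricts access.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: describe a "CPU cycles" hardware event. */
void cycles_attr_init(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_HARDWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->exclude_hv = 1;	/* kernel cycles stay included, e.g. for syscalls */
}

/* Open the counter for the calling thread on any CPU; returns an fd or -1. */
int cycles_open(void)
{
	struct perf_event_attr attr;

	cycles_attr_init(&attr);
	/* glibc provides no wrapper for perf_event_open, so go via syscall(2). */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Read the accumulated count; with read_format == 0 this is a single u64. */
int cycles_read(int fd, uint64_t *count)
{
	return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

The cost of a code section is then the difference between two cycles_read() results taken before and after it. Because the unit is cycles, the result does not depend on the clock frequency in the way a wall-clock time does.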


* Re: Running an Ivy Bridge cpu at fixed frequency
  2019-12-04 17:01 Running an Ivy Bridge cpu at fixed frequency David Laight
  2019-12-04 17:57 ` Andy Lutomirski
@ 2019-12-05  9:45 ` Peter Zijlstra
  2019-12-05 15:53   ` David Laight
  2019-12-06 14:47 ` Alexey Klimov
  2 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2019-12-05  9:45 UTC (permalink / raw)
  To: David Laight; +Cc: x86, linux-kernel


On Wed, Dec 04, 2019 at 05:01:32PM +0000, David Laight wrote:
> Is there any way to persuade the intel_pstate driver to make an Ivy bridge (i7-3770)
> cpu run at a fixed frequency?

You can: use the performance governor and set scaling_{min,max}_freq to the
base_frequency (I _think_; I never quite remember whether that is the max
non-turbo P-state).

If that doesn't work, simply put it at cpuinfo_min_freq. It's slow, but
it's guaranteed stable.
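
A minimal sketch of that recipe through the cpufreq sysfs files, for illustration only. The helper names cpufreq_write and pin_cpu_khz are invented here; the root directory is passed in so the code can be pointed at /sys/devices/system/cpu on a real machine (root privileges needed) or at a scratch directory. Frequencies in these files are in kHz, so 1.7GHz is 1700000. Writing the limits twice is an assumption made to sidestep ordering problems while min and max temporarily cross.

```c
#include <stdio.h>

/* Write one string to <root>/cpu<N>/cpufreq/<attr>; 0 on success, -1 on error. */
int cpufreq_write(const char *root, int cpu, const char *attr, const char *value)
{
	char path[256];
	FILE *f;
	int n, rc;

	n = snprintf(path, sizeof(path), "%s/cpu%d/cpufreq/%s", root, cpu, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	rc = fputs(value, f) == EOF ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}

/* Select the performance governor and set min == max == khz for one CPU. */
int pin_cpu_khz(const char *root, int cpu, unsigned int khz)
{
	char val[16];

	snprintf(val, sizeof(val), "%u", khz);
	if (cpufreq_write(root, cpu, "scaling_governor", "performance"))
		return -1;
	/*
	 * First pass: errors are ignored, on the assumption that one of the
	 * two writes may be rejected or clamped while min and max cross.
	 */
	(void)cpufreq_write(root, cpu, "scaling_max_freq", val);
	(void)cpufreq_write(root, cpu, "scaling_min_freq", val);
	/* Second pass: both limits should now be accepted. */
	if (cpufreq_write(root, cpu, "scaling_max_freq", val))
		return -1;
	return cpufreq_write(root, cpu, "scaling_min_freq", val);
}
```

Calling pin_cpu_khz("/sys/devices/system/cpu", 0, 1700000) would attempt this for CPU 0; the same call is needed for every CPU the benchmark may run on. Reading cpuinfo_min_freq first and passing that value gives the slow-but-stable fallback.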

> It is really difficult to compare code execution times when the cpu clock speed
> keeps changing.

As Andy already wrote, perf is really good for this.

Find attached, it probably is less shiny than what Andy handed you, but
contains all the bits required to frob something.

> I thought I'd managed by setting the 'scaling_max_freq' to 1.7GHz, but even that
> doesn't seem to be working now.

You also have to set the min I think, and select the performance
governor, otherwise it's too tempted to be 'smart' about stuff.

> It would also be nice to run a little faster than that - but without it 'randomly'
> going to 'turbo' frequencies (which it is doing even after I've set no_turbo to 1).
> 
> An alternative would be a variable frequency TSC - might give more consistent values.

perf :-)

[-- Attachment #2: spinlocks.tar.bz2 --]
[-- Type: application/octet-stream, Size: 29012 bytes --]


* RE: Running an Ivy Bridge cpu at fixed frequency
  2019-12-05  9:45 ` Peter Zijlstra
@ 2019-12-05 15:53   ` David Laight
  2019-12-05 17:51     ` Andy Lutomirski
  2019-12-06 10:15     ` Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: David Laight @ 2019-12-05 15:53 UTC (permalink / raw)
  To: 'Peter Zijlstra'; +Cc: x86, linux-kernel

From: Peter Zijlstra
> Sent: 05 December 2019 09:46
> As Andy already wrote, perf is really good for this.
> 
> Find attached, it probably is less shiny than what Andy handed you, but
> contains all the bits required to frob something.

You are in a maze of incomplete documentation all disjoint.

The x86 instruction set doc (eg 325462.pdf) defines the rdpmc instruction, tells you
how many counters each cpu type has, but doesn't even contain a reference
to how they are incremented.
I guess there are some processor-specific MSR for that.

perf_event_open(2) tells you a few things, but doesn't actually say what anything is.
It contains all but the last 'if' clause of this function, without really saying
what any of it does - or why you might do it this way.

static inline u64 mmap_read_self(void *addr)
{
        struct perf_event_mmap_page *pc = addr;
        u32 seq, idx, time_mult = 0, time_shift = 0, width = 0;
        u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
        s64 pmc = 0;

        /* Seqcount read loop: retry if the kernel updated the page meanwhile. */
        do {
                seq = pc->lock;
                barrier();

                enabled = pc->time_enabled;
                running = pc->time_running;

                /* Only needed when the event was not counting the whole time. */
                if (pc->cap_user_time && enabled != running) {
                        cyc = rdtsc();
                        time_mult = pc->time_mult;
                        time_shift = pc->time_shift;
                        time_offset = pc->time_offset;
                }

                /* index == 0 means no hardware counter is currently assigned. */
                idx = pc->index;
                count = pc->offset;
                if (pc->cap_user_rdpmc && idx) {
                        width = pc->pmc_width;
                        pmc = rdpmc(idx - 1);
                }

                barrier();
        } while (pc->lock != seq);

        /* Sign-extend the pmc_width-bit raw value and add it to the offset. */
        if (idx) {
                pmc <<= 64 - width;
                pmc >>= 64 - width; /* shift right signed */
                count += pmc;
        }

        /* Scale the count up for time the event was enabled but not running. */
        if (enabled != running) {
                u64 quot, rem;

                /* Convert the TSC value to a time delta without overflowing. */
                quot = (cyc >> time_shift);
                rem = cyc & (((u64)1 << time_shift) - 1);
                delta = time_offset + quot * time_mult +
                        ((rem * time_mult) >> time_shift);

                enabled += delta;
                if (idx)
                        running += delta;

                /* count * enabled / running, split to avoid overflow. */
                quot = count / running;
                rem = count % running;
                count = quot * enabled + (rem * enabled) / running;
        }

        return count;
}
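
To make the TSC-to-time step of that function concrete, here is the same arithmetic as a standalone sketch. The function name tsc_to_time_delta is invented for this example; the parameters correspond to the time_offset, time_mult and time_shift fields read from the mmap page, and time_shift is assumed to be less than 64. Splitting cyc into quotient and remainder keeps the multiplication from overflowing 64 bits.

```c
#include <stdint.h>

/*
 * Convert a TSC reading to a time delta:
 *   delta = time_offset + (cyc * time_mult) >> time_shift
 * computed in two parts so that cyc * time_mult cannot overflow.
 */
uint64_t tsc_to_time_delta(uint64_t cyc, uint64_t time_offset,
			   uint32_t time_mult, uint16_t time_shift)
{
	uint64_t quot = cyc >> time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << time_shift) - 1);

	return time_offset + quot * time_mult +
	       ((rem * time_mult) >> time_shift);
}
```

With time_mult = 512 and time_shift = 10 the scale factor is one half, so 3000 cycles convert to 1500 time units.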

AFAICT:
1) The last clause is scaling the count up to allow for time when the hardware counter
   couldn't be allocated.
   I'm not convinced that is useful, better to ignore the entire measurement.
   Half this got deleted from the man page, leaving strange 'set but unused' variables.

2) The hardware counters are disabled while the process is asleep.
   On wake a different pmc counter might be used (maybe on a different cpu).
   The new cpu might not even have a counter available.

3) If you don't want to scale up for missing periods it is probably enough to do:
	do {
		seq = pc->lock;
		barrier();
		idx = pc->index;
		if (!idx)
			return -1;
		count = pc->offset + rdpmc(idx - 1);
		barrier();
	} while (seq != pc->lock);
	return (unsigned int)count;
  
Not tried it yet :-)
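
For reference, the scaling described in point 1 amounts to count * enabled / running. A small sketch with the same quotient/remainder split as the function above follows; the name scale_count and the choice to return 0 when running is 0 are assumptions for this example.

```c
#include <stdint.h>

/*
 * Estimate what the count would have been had the event been on a
 * hardware counter for the whole time it was enabled.
 */
uint64_t scale_count(uint64_t count, uint64_t enabled, uint64_t running)
{
	uint64_t quot, rem;

	if (running == 0)
		return 0;	/* never scheduled: nothing to extrapolate from */
	quot = count / running;
	rem = count % running;
	/* Same value as count * enabled / running, with less overflow risk. */
	return quot * enabled + (rem * enabled) / running;
}
```

For example, 150 events seen while running for 100 time units out of 300 enabled scale to 450. When enabled equals running the count is returned unchanged, which is the case a measurement wants if extrapolated samples are to be thrown away.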

	David




* Re: Running an Ivy Bridge cpu at fixed frequency
  2019-12-05 15:53   ` David Laight
@ 2019-12-05 17:51     ` Andy Lutomirski
  2019-12-06 10:15     ` Peter Zijlstra
  1 sibling, 0 replies; 8+ messages in thread
From: Andy Lutomirski @ 2019-12-05 17:51 UTC (permalink / raw)
  To: David Laight; +Cc: Peter Zijlstra, x86, linux-kernel



> On Dec 5, 2019, at 7:54 AM, David Laight <David.Laight@aculab.com> wrote:
> 
> From: Peter Zijlstra
>> Sent: 05 December 2019 09:46
>> As Andy already wrote, perf is really good for this.
>> Find attached, it probably is less shiny than what Andy handed you, but
>> contains all the bits required to frob something.
> 
> You are in a maze of incomplete documentation all disjoint.

I don’t see any documentation.  Maybe you shouldn’t have turned your flashlight on.

> 
> The x86 instruction set doc (eg 325462.pdf) defines the rdpmc instruction, tells you
> how many counters each cpu type has, but doesn't even contain a reference
> to how they are incremented.
> I guess there are some processor-specific MSR for that.
> 
> perf_event_open(2) tells you a few things, but doesn't actually say what anything is.
> It contains all but the last 'if' clause of this function, without really saying
> what any of it does - or why you might do it this way.
> 
> static inline u64 mmap_read_self(void *addr)
> {
>       struct perf_event_mmap_page *pc = addr;
>       u32 seq, idx, time_mult = 0, time_shift = 0, width = 0;
>       u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
>       s64 pmc = 0;
> 
>       do {
>               seq = pc->lock;
>               barrier();
> 
>               enabled = pc->time_enabled;
>               running = pc->time_running;
> 
>               if (pc->cap_user_time && enabled != running) {
>                       cyc = rdtsc();
>                       time_mult = pc->time_mult;
>                       time_shift = pc->time_shift;
>                       time_offset = pc->time_offset;
>               }
> 
>               idx = pc->index;
>               count = pc->offset;
>               if (pc->cap_user_rdpmc && idx) {
>                       width = pc->pmc_width;
>                       pmc = rdpmc(idx - 1);
>               }
> 
>               barrier();
>       } while (pc->lock != seq);
> 
>       if (idx) {
>               pmc <<= 64 - width;
>               pmc >>= 64 - width; /* shift right signed */
>               count += pmc;
>       }
> 
>       if (enabled != running) {
>               u64 quot, rem;
> 
>               quot = (cyc >> time_shift);
>               rem = cyc & ((1 << time_shift) - 1);
>               delta = time_offset + quot * time_mult +
>                       ((rem * time_mult) >> time_shift);
> 
>               enabled += delta;
>               if (idx)
>                       running += delta;
> 
>               quot = count / running;
>               rem = count % running;
>               count = quot * enabled + (rem * enabled) / running;
>       }
> 
>       return count;
> }
> 
> AFAICT:
> 1) The last clause is scaling the count up to allow for time when the hardware counter
>  couldn't be allocated.
>  I'm not convinced that is useful, better to ignore the entire measurement.
>  Half this got deleted from the man page, leaving strange 'set but unused' variables.
> 
> 2) The hardware counters are disabled while the process is asleep.
>  On wake a different pmc counter might be used (maybe on a different cpu).
>  The new cpu might not even have a counter available.
> 
> 3) If you don't want to scale up for missing periods it is probably enough to do:
>   do {
>       seq = pc->lock;
>       barrier();
>       idx = pc->index;
>       if (!idx)
>           return -1;
>       count = pc->offset + rdpmc(idx - 1);
>       barrier();
>   } while (seq != pc->lock);
>   return (unsigned int)count;
> 
> Not tried it yet :-)

Use my version :).  I just throw out the sample if we were preempted or if it was otherwise suspicious.

—Andy

> 
>   David
> 


* Re: Running an Ivy Bridge cpu at fixed frequency
  2019-12-05 15:53   ` David Laight
  2019-12-05 17:51     ` Andy Lutomirski
@ 2019-12-06 10:15     ` Peter Zijlstra
  2019-12-06 13:06       ` David Laight
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2019-12-06 10:15 UTC (permalink / raw)
  To: David Laight; +Cc: x86, linux-kernel

On Thu, Dec 05, 2019 at 03:53:55PM +0000, David Laight wrote:
> From: Peter Zijlstra
> > Sent: 05 December 2019 09:46
> > As Andy already wrote, perf is really good for this.
> > 
> > Find attached, it probably is less shiny than what Andy handed you, but
> > contains all the bits required to frob something.
> 
> You are in a maze of incomplete documentation all disjoint.

I'm sure..

> The x86 instruction set doc (eg 325462.pdf) defines the rdpmc instruction, tells you
> how many counters each cpu type has, but doesn't even contain a reference
> to how they are incremented.

There's book 3, chapter 18, performance monitoring overview, that should
explain how the counters work, and chapter 19 that lists many of the
available events.

TL;DR, they're (48bit) signed counters that increment and raise an
interrupt when the sign flips. This means we set them to '-period' and
then upon read (either early or on interrupt) compute the delta and
accumulate elsewhere.
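
The arithmetic in that summary can be sketched as follows. This is an illustration, not kernel code: the function names are invented, the width is passed in rather than assumed to be 48, and the right shift of a negative value relies on the arithmetic-shift behaviour of GCC and Clang.

```c
#include <stdint.h>

/* Sign-extend a raw counter value that is only 'width' bits wide (1..64). */
int64_t pmc_sign_extend(uint64_t raw, unsigned int width)
{
	unsigned int shift = 64 - width;

	return (int64_t)(raw << shift) >> shift;
}

/* Raw value to program so the sign flips after 'period' events. */
uint64_t pmc_arm_value(uint64_t period, unsigned int width)
{
	uint64_t mask = width == 64 ? ~(uint64_t)0 : (((uint64_t)1 << width) - 1);

	return (0 - period) & mask;
}

/* Events counted since the counter was armed with '-period'. */
uint64_t pmc_events_since_arm(uint64_t raw, unsigned int width, uint64_t period)
{
	return (uint64_t)(pmc_sign_extend(raw, width) + (int64_t)period);
}
```

A 48-bit counter armed with a period of 1000 starts at 2^48 - 1000; after 300 events it reads 2^48 - 700, which sign-extends to -700, and adding the period back gives 300.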

> perf_event_open(2) tells you a few things, but doesn't actually say what anything is.
> It contains all but the last 'if' clause of this function, without really saying
> what any of it does - or why you might do it this way.

I don't actually know what's in that manpage. But it really shouldn't be
too hard to understand.

It's a seqcount protected set of values, there's the RDPMC counter index,
and the counter offset. If the idx!=0 it means the counter is actually
programmed and we must RDPMC, the result of which we must add to the
offset.
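
A sketch of that read protocol, under stated assumptions: it uses the lock, index, offset and pmc_width fields of struct perf_event_mmap_page from <linux/perf_event.h>, but takes the counter reader as a callback so it can be exercised without RDPMC access. The names read_self_count, pmc_read_fn and fake_pmc_minus_10 are invented for this example; on real hardware the callback would wrap the rdpmc instruction. This is not the mmap_read_pinned() from the attachment.

```c
#include <linux/perf_event.h>
#include <stdint.h>

#define barrier() __asm__ __volatile__("" ::: "memory")

/* Reader callback; takes the hardware counter number (index - 1). */
typedef uint64_t (*pmc_read_fn)(uint32_t counter);

/* Stand-in reader: pretends a 48-bit counter currently holds -10. */
uint64_t fake_pmc_minus_10(uint32_t counter)
{
	(void)counter;
	return ((uint64_t)1 << 48) - 10;
}

/* Returns 0 and stores the count, or -1 if no hardware counter is assigned. */
int read_self_count(const volatile struct perf_event_mmap_page *pc,
		    pmc_read_fn read_pmc, uint64_t *count)
{
	uint32_t seq, idx, width;
	int64_t offset, pmc;

	do {
		seq = pc->lock;		/* sequence count before reading */
		barrier();
		idx = pc->index;
		offset = pc->offset;
		width = pc->pmc_width;
		pmc = 0;
		if (idx)
			pmc = (int64_t)read_pmc(idx - 1);
		barrier();
	} while (pc->lock != seq);	/* retry if the page changed under us */

	if (!idx || width == 0 || width > 64)
		return -1;
	/* Sign-extend the raw value before adding it to the offset. */
	pmc = (int64_t)((uint64_t)pmc << (64 - width)) >> (64 - width);
	*count = (uint64_t)(offset + pmc);
	return 0;
}
```

With an offset of 100 and the stand-in reader, the result is 100 + (-10) = 90. An index of 0 is reported as an error instead of being scaled, matching the "ignore the measurement" approach discussed in this thread.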

The whole counter scaling crud is just that, crud you can mostly forget
about if you want to quickly hack something together. See
mmap_read_pinned() for the simplified (and much faster version) that
ignores all that.


> AFAICT:
> 1) The last clause is scaling the count up to allow for time when the hardware counter
>    couldn't be allocated.
>    I'm not convinced that is useful, better to ignore the entire measurement.
>    Half this got deleted from the man page, leaving strange 'set but unused' variables.

Depending on the usecase, sure. I don't have a use for it either. I know
other people find it useful.

> 2) The hardware counters are disabled while the process is asleep.
>    On wake a different pmc counter might be used (maybe on a different cpu).
>    The new cpu might not even have a counter available.

Right, but if this is all you're running that is unlikely to happen.

> 3) If you don't want to scale up for missing periods it is probably enough to do:
> 	do {
> 		seq = pc->lock;
> 		barrier();
> 		idx = pc->index;
> 		if (!idx)
> 			return -1;
> 		count = pc->offset + rdpmc(idx - 1);
> 		barrier();
> 	} while (seq != pc->lock);
> 	return (unsigned int)count;

You still need to do the rdpmc sign extension crud, but see
mmap_read_pinned() that does just about that.

As the name suggests it relies on using perf_event_attr::pinned = 1.
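
For illustration, setting up such a pinned event and mapping its metadata page might look like the sketch below. The helper names pinned_cycles_attr and pinned_cycles_map are invented here, the choice of the CPU-cycles event is an assumption, and mapping only the first page (no ring buffer) is assumed to be enough for self-monitoring reads.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Describe a CPU-cycles event that must stay on a hardware counter. */
void pinned_cycles_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_HARDWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->pinned = 1;	/* the perf_event_attr::pinned bit named above */
}

/* Open the event for this thread and map its first page; NULL on failure. */
struct perf_event_mmap_page *pinned_cycles_map(int *fd_out)
{
	struct perf_event_attr attr;
	void *page;
	int fd;

	pinned_cycles_attr(&attr);
	fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd < 0)
		return NULL;
	page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
		    MAP_SHARED, fd, 0);
	if (page == MAP_FAILED) {
		close(fd);
		return NULL;
	}
	*fd_out = fd;
	return page;
}
```

Before issuing RDPMC from user space, the caller should still check cap_user_rdpmc and a non-zero index in the returned page.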



* RE: Running an Ivy Bridge cpu at fixed frequency
  2019-12-06 10:15     ` Peter Zijlstra
@ 2019-12-06 13:06       ` David Laight
  0 siblings, 0 replies; 8+ messages in thread
From: David Laight @ 2019-12-06 13:06 UTC (permalink / raw)
  To: 'Peter Zijlstra'; +Cc: x86, linux-kernel

From: Peter Zijlstra
> Sent: 06 December 2019 10:16
> To: David Laight <David.Laight@ACULAB.COM>
...
> The whole counter scaling crud is just that, crud you can mostly forget
> about if you want to quickly hack something together. See
> mmap_read_pinned() for the simplified (and much faster version) that
> ignores all that.

I noticed that version later :-(
The 'seqcount' is interesting, since it only protects against updates
that happen while the process itself is in kernel space.
It doesn't allow arbitrary kernel updates of the memory area.

...
> You still need to do the rdpmc sign extension crud, but see
> mmap_read_pinned() that does just about that.

Actually for what I'm doing I can truncate the counter to 32 bits
and not worry about when it wraps.

Anyway I've now got some histograms of the elapsed cycle counts
for recvfrom() and recvmsg() with, and without, some of the
HARDENED_USERCOPY costs.
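
A sketch of why 32-bit truncation is safe for short intervals, plus one possible way to bin the results. Unsigned subtraction wraps modulo 2^32, so the difference is correct as long as fewer than 2^32 cycles elapse between the two reads. The power-of-two bucket layout and all names here are assumptions for illustration; the thread does not say how the histograms were built.

```c
#include <stdint.h>

#define NBUCKETS 32

/* Elapsed cycles from two truncated counter reads; wrap-around is harmless. */
uint32_t elapsed32(uint32_t start, uint32_t end)
{
	return end - start;
}

/* Bucket index = floor(log2(cycles)), with 0 and 1 both in bucket 0. */
unsigned int bucket_of(uint32_t cycles)
{
	unsigned int b = 0;

	while (cycles >>= 1)
		b++;
	return b;
}

/* Record one measurement in a histogram of NBUCKETS counters. */
void histogram_add(uint64_t hist[NBUCKETS], uint32_t start, uint32_t end)
{
	hist[bucket_of(elapsed32(start, end))]++;
}
```

For instance, a start of 0xFFFFFFF0 and an end of 0x10 give an elapsed count of 0x20 even though the truncated counter wrapped in between.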

	David




* Re: Running an Ivy Bridge cpu at fixed frequency
  2019-12-04 17:01 Running an Ivy Bridge cpu at fixed frequency David Laight
  2019-12-04 17:57 ` Andy Lutomirski
  2019-12-05  9:45 ` Peter Zijlstra
@ 2019-12-06 14:47 ` Alexey Klimov
  2 siblings, 0 replies; 8+ messages in thread
From: Alexey Klimov @ 2019-12-06 14:47 UTC (permalink / raw)
  To: David Laight; +Cc: x86, linux-kernel

On Wed, Dec 4, 2019 at 5:32 PM David Laight <David.Laight@aculab.com> wrote:
>
> Is there any way to persuade the intel_pstate driver to make an Ivy bridge (i7-3770)
> cpu run at a fixed frequency?
> It is really difficult to compare code execution times when the cpu clock speed
> keeps changing.
> I thought I'd managed by setting the 'scaling_max_freq' to 1.7GHz, but even that
> doesn't seem to be working now.
> It would also be nice to run a little faster than that - but without it 'randomly'
> going to 'turbo' frequencies (which it is doing even after I've set no_turbo to 1).
>
> An alternative would be a variable frequency TSC - might give more consistent values.

Have you tried the intel_pstate=passive parameter on the kernel command line?
You'll be able to fix the frequency using governors or sysfs.
I'm not sure that this is what you're looking for. I also don't
know whether 'passive' mode works on Ivy Bridge.
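
One way to confirm which mode the driver ended up in after booting with that parameter is to read its status file. The sketch below is an assumption-laden illustration: the function name is invented, and the path is passed in by the caller, with /sys/devices/system/cpu/intel_pstate/status being the expected location on kernels that expose it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the status file says "passive", 0 for any other mode,
 * and -1 if the file cannot be read (e.g. intel_pstate is not in use).
 */
int intel_pstate_is_passive(const char *status_path)
{
	char buf[32];
	FILE *f = fopen(status_path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return strcmp(buf, "passive") == 0;
}
```

A return value of 1 means the generic cpufreq governors are in charge, so the scaling_governor and scaling_{min,max}_freq files are the place to fix the frequency.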

Best regards,
Alexey

