linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Unknown HZ value! (2000) Assume 1024.
@ 2001-05-02  5:13 Tom Holroyd
  2001-05-02  6:42 ` Albert D. Cahalan
  0 siblings, 1 reply; 5+ messages in thread
From: Tom Holroyd @ 2001-05-02  5:13 UTC (permalink / raw)
  To: kernel mailing list

Alpha.  2.4.1.  Hz = 1024.  Uptime > 48.54518 days (low idle).
(Subject message from ps and friends.)

/proc/uptime:
4400586.27 150439.36

/proc/stat:
cpu  371049158 3972370867 8752820 4448994822
     (user,    nice,      system, idle)

In .../fs/proc/proc_misc.c:kstat_read_proc(), the cpu line is being
computed by:

        len = sprintf(page, "cpu  %u %u %u %lu\n", user, nice, system,
                      jif * smp_num_cpus - (user + nice + system));

The user, nice, and system values add up to 4352172845 > 2^32, and jif is
4400586.27 * 1024 = 4506200340, leading to the incorrect idle time (1
cpu).  It should be calculated this way:

        len = sprintf(page, "cpu  %u %u %u %lu\n", user, nice, system,
                      jif * smp_num_cpus - ((unsigned long)user + nice + system));

or just declare those as unsigned longs instead of ints.  I notice also
that since kstat.per_cpu_nice is an int, it's going to overflow in another
3.6 days anyhow.  I'll let you know what blows up then.  Any chance of
making those guys longs?

The ps program, of course, is trying to calculate HZ by inverting the
above calculation, and it gets a bogus value.  I suppose it could use
(uptime[0] - uptime[1]) / (user + nice + system) instead of trying to
calculate jif first, but it'll still fail when the ints roll over.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unknown HZ value! (2000) Assume 1024.
  2001-05-02  5:13 Unknown HZ value! (2000) Assume 1024 Tom Holroyd
@ 2001-05-02  6:42 ` Albert D. Cahalan
  2001-05-02  7:47   ` Tom Holroyd
  2001-05-02  9:49   ` Ingo Oeser
  0 siblings, 2 replies; 5+ messages in thread
From: Albert D. Cahalan @ 2001-05-02  6:42 UTC (permalink / raw)
  To: Tom Holroyd; +Cc: kernel mailing list

> /proc/uptime:
> 4400586.27 150439.36
> 
> /proc/stat:
> cpu  371049158 3972370867 8752820 4448994822
>      (user,    nice,      system, idle)
> 
> In .../fs/proc/proc_misc.c:kstat_read_proc(), the cpu line is being
> computed by:
> 
>         len = sprintf(page, "cpu  %u %u %u %lu\n", user, nice, system,
>                       jif * smp_num_cpus - (user + nice + system));

This is pretty bogus. The idle time can run _backwards_ on an SMP
system. What is "top" supposed to do with that, print a negative
number for %idle time? (some versions do, while others truncate
at zero or wrap around to 4 billion -- pick your poison)

> The user, nice, and system values add up to 4352172845 > 2^32, and jif is
> 4400586.27 * 1024 = 4506200340, leading to the incorrect idle time (1
> cpu).  It should be calculated this way:
>
>    len = sprintf(page, "cpu  %u %u %u %lu\n", user, nice, system,
>             jif * smp_num_cpus - ((unsigned long)user + nice + system));
>
> or just declare those as unsigned longs instead of ints.  I notice also
> that since kstat.per_cpu_nice is an int, it's going to overflow in another
> 3.6 days anyhow.  I'll let you know what blows up then.  Any chance of
> making those guys longs?

That is good for the Alpha.

For 32-bit systems, we use 32-bit values to reduce overhead.
This causes problems at 495/smp_num_cpus days of uptime.

Proposed hack: set a very-log-duration timer (several days)
to check for the high bit changing. Count bit flips.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unknown HZ value! (2000) Assume 1024.
  2001-05-02  6:42 ` Albert D. Cahalan
@ 2001-05-02  7:47   ` Tom Holroyd
  2001-05-03  6:15     ` Albert D. Cahalan
  2001-05-02  9:49   ` Ingo Oeser
  1 sibling, 1 reply; 5+ messages in thread
From: Tom Holroyd @ 2001-05-02  7:47 UTC (permalink / raw)
  To: Albert D. Cahalan; +Cc: kernel mailing list

On Wed, 2 May 2001, Albert D. Cahalan wrote:

> This is pretty bogus. The idle time can run _backwards_ on an SMP
> system.

True, but it's failing for single CPU systems (like mine), too.

>> I notice also that since kstat.per_cpu_nice is an unsigned int, it's
>> going to overflow in another 3.6 days anyhow. ... Any chance of making
>> those guys longs?

> For 32-bit systems, we use 32-bit values to reduce overhead.
> This causes problems at 495/smp_num_cpus days of uptime.

You mean for HZ == 100.  And I guess the overhead in question is the cost
of a 64 bit add vs. a 32 bit add HZ times per second?  On a 64 bit
machine, that overhead is likely to be exactly zero.  It is zero on my
machine.  For integer math on an Alpha, changing the ints to longs can
even make a program run faster.

> Proposed hack: set a very-long-duration timer (several days)
> to check for the high bit changing. Count bit flips.

What about the interval between when it flips and when you notice it?

No, change the kstat variables to unsigned longs on 64 bit systems.
In fact, make them unsigned longs on any system, as opposed to something
like u_int64 or whatever.  (And fix the code in proc_misc.c that uses
them.)

For 32 bit systems with HZ == 1024, decide if an overhead of about 1 usec
per second is too much to justify using 64 bits.  That's 4 seconds lost
over the 49 days it takes to fail (if your hardware isn't that fast what
are you doing with HZ == 1024 in the first place).


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unknown HZ value! (2000) Assume 1024.
  2001-05-02  6:42 ` Albert D. Cahalan
  2001-05-02  7:47   ` Tom Holroyd
@ 2001-05-02  9:49   ` Ingo Oeser
  1 sibling, 0 replies; 5+ messages in thread
From: Ingo Oeser @ 2001-05-02  9:49 UTC (permalink / raw)
  To: Albert D. Cahalan; +Cc: Tom Holroyd, kernel mailing list

On Wed, May 02, 2001 at 02:42:58AM -0400, Albert D. Cahalan wrote:
> > In .../fs/proc/proc_misc.c:kstat_read_proc(), the cpu line is being
> > computed by:
> > 
> >         len = sprintf(page, "cpu  %u %u %u %lu\n", user, nice, system,
> >                       jif * smp_num_cpus - (user + nice + system));
> 
> This is pretty bogus. The idle time can run _backwards_ on an SMP
> system. What is "top" supposed to do with that, print a negative
> number for %idle time? (some versions do, while others truncate
> at zero or wrap around to 4 billion -- pick your poison)

Just a "me too". I've seen this with one or two days uptime
already. An idle of more than 40.000%. May be this means, that
the machine was _very_ bored and needs my attention ;-)

cat /proc/cpuinfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 8
model name	: Pentium III (Coppermine)
stepping	: 6
cpu MHz		: 851.987
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1697.38

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 8
model name	: Pentium III (Coppermine)
stepping	: 6
cpu MHz		: 851.987
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1697.38

Just FYI.

Regards

Ingo Oeser
-- 
10.+11.03.2001 - 3. Chemnitzer LinuxTag <http://www.tu-chemnitz.de/linux/tag>
         <<<<<<<<<<<<     been there and had much fun   >>>>>>>>>>>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Unknown HZ value! (2000) Assume 1024.
  2001-05-02  7:47   ` Tom Holroyd
@ 2001-05-03  6:15     ` Albert D. Cahalan
  0 siblings, 0 replies; 5+ messages in thread
From: Albert D. Cahalan @ 2001-05-03  6:15 UTC (permalink / raw)
  To: Tom Holroyd; +Cc: Albert D. Cahalan, kernel mailing list

Tom Holroyd writes:
> On Wed, 2 May 2001, Albert D. Cahalan wrote:

>> For 32-bit systems, we use 32-bit values to reduce overhead.
>> This causes problems at 495/smp_num_cpus days of uptime.
>
> You mean for HZ == 100.

Well, OK. No unmodified 32-bit system runs HZ == 1024.

> And I guess the overhead in question is the cost
> of a 64 bit add vs. a 32 bit add HZ times per second?  On a 64 bit
> machine, that overhead is likely to be exactly zero.  It is zero on my
> machine.  For integer math on an Alpha, changing the ints to longs can
> even make a program run faster.

Yes.

>> Proposed hack: set a very-long-duration timer (several days)
>> to check for the high bit changing. Count bit flips.
>
> What about the interval between when it flips and when you notice it?

Not a problem. Note that I count bit flips, not roll overs.
Here are the two variables, with "flips" lagging a bit:

flips  jiffies
0      0x7fffff26
0      0x80000003   (not noticed yet)
1      0x8000b01a
1      0xffffffe7
1      0x00000666   (not noticed yet)
2      0x0000ee15

Calculate 64-bit (well, 63-bit) jiffies as:

long long total;
unsigned f = flips;
unsigned j = jiffies;
f += (f ^ (j>>31)) & 1;
total = ((long long)f<<31) | j;

Now print the total.

Well, there it is. Like it? The /proc reader does 64-bit operations
and a timer goes off every few days, saving the clock tick from
doing any 64-bit operations. The fast path stays fast, while procps
can get useful data even after years of uptime.



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-05-03 10:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-05-02  5:13 Unknown HZ value! (2000) Assume 1024 Tom Holroyd
2001-05-02  6:42 ` Albert D. Cahalan
2001-05-02  7:47   ` Tom Holroyd
2001-05-03  6:15     ` Albert D. Cahalan
2001-05-02  9:49   ` Ingo Oeser

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).