linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stanislaw Gruszka <sgruszka@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:sched/core] sched: Lower chances of cputime scaling overflow
Date: Tue, 30 Apr 2013 16:03:30 +0200	[thread overview]
Message-ID: <20130430140330.GB10465@redhat.com> (raw)
In-Reply-To: <CA+55aFwMeAe6Mb9EPeT4Ke0W36z-XUK3BZvbOt+dMzdbJ3F1Ew@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1495 bytes --]

On Sat, Apr 13, 2013 at 11:44:54AM -0700, Linus Torvalds wrote:
> PS. This is the "Make sure 'total' fits in 32 bits first" version. Not
> really tested, but it's just changing the order of operations a bit.
> 
>     /* We know one of the values has a bit set in the high 32 bits */
>     for (;;) {
>         /* Make sure "rtime" is the bigger of stime/rtime */
>         if (stime > rtime) {
>             u64 tmp = rtime; rtime = stime; stime = tmp;
>         }
> 
>         /* Make sure 'total' fits in 32 bits */
>         if (total >> 32)
>                 goto drop_precision;
> 
>         /* Does rtime (and thus stime) fit in 32 bits? */
>         if (!(rtime >> 32))
>                 break;
> 
>         /* Can we just balance rtime/stime rather than dropping bits? */
>         if (stime >> 31)
>             goto drop_precision;
> 
>         /* We can grow stime and shrink rtime and try to make them both fit */
>         stime <<= 1;
>         rtime >>= 1;
>         continue;
> 
> drop_precision:
>         /* We drop from rtime, it has more bits than stime */
>         rtime >>= 1;
>         total >>= 1;
>     }

For reference I'm attaching test program and script I used to validate
algorithm.

It generates lot of (possibly real) rtime, total, stime values for 4096
threads running for 10 years. Then compare scaled stime values caluclated
by above algorithm with precise python values. 

For all values generated, scaled stime relative error was less than 0.05%

Stanislaw

[-- Attachment #2: scale_stime.c --]
[-- Type: text/plain, Size: 1625 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <strings.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint32_t u32;

static u64 div_u64_u32(u64 a, u32 b)
{
	return a / b;
}

static u64 scale_stime(u64 stime, u64 rtime, u64 total)
{
    /* We know one of the values has a bit set in the high 32 bits */
    for (;;) {
        /* Make sure "rtime" is the bigger of stime/rtime */
        if (stime > rtime) {
            u64 tmp = rtime; rtime = stime; stime = tmp;
        }

        /* Make sure 'total' fits in 32 bits */
        if (total >> 32)
                goto drop_precision;

        /* Does rtime (and thus stime) fit in 32 bits? */
        if (!(rtime >> 32))
                break;

        /* Can we just balance rtime/stime rather than dropping bits? */
        if (stime >> 31)
            goto drop_precision;

        /* We can grow stime and shrink rtime and try to make them both fit */
        stime <<= 1;
        rtime >>= 1;
        continue;

drop_precision:
        /* We drop from rtime, it has more bits than stime */
        rtime >>= 1;
        total >>= 1;
    }

/* Make sure gcc understands that this is a 32x32->64 multiply,
 * followed by a 64/32->64 divide */
    return div_u64_u32((u64) (u32) stime * (u64) (u32) rtime, (u32)total);
}
 
int main(int argc, char *argv[])
{
	u64 rtime, total, stime, scaled;

	if (argc != 4)
		return;

	rtime = strtoll(argv[1], NULL, 10);
	total = strtoll(argv[2], NULL, 10);
	stime = strtoll(argv[3], NULL, 10);

	assert (total >= stime);

	scaled = scale_stime(stime, rtime, total);
	printf("%llu\n", scaled);

	return 0;
}

[-- Attachment #3: scale_stime_test.py --]
[-- Type: text/plain, Size: 998 bytes --]

#!/usr/bin/python

import subprocess
import random
import math

def kernel_scale (rtime, total, stime):
	p = subprocess.Popen("./scale_stime " + str(rtime) + " " + str(total) + " " + str(stime) , shell=True, stdout=subprocess.PIPE)
	return int(p.stdout.read())

def python_scale (rtime, total, stime):
	return (stime * rtime) / total

max_rtime = 10*4096*364*24*60*60*1000;  # 10 years for 4096 threads

fail=False
K=1000
for i in range(0, K):
	rtime = random.randrange(max_rtime)
	total = int(random.uniform(0.1, 1.9) * rtime)

	for n in range(1, 100):
		stime = (n * total / 100)
		r1 = kernel_scale(rtime, total, stime)
		r2 = python_scale(rtime, total, stime)
		if (float(abs(r1 - r2)) / float(r2)) > 0.0005:
			print "FAIL!"
			print "rtime: " + str(rtime)
			print "total: " + str(total)
			print "stime: " + str(stime)
			print "kernel: " + str(r1)
			print "python: " + str(r2)

			fail=True
			break
	if fail:
		break;
	if (i % 100) == 99:
		print str(i/100) + "/" + str(K/100) + " OK"
		

  parent reply	other threads:[~2013-04-30 14:02 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <tip-d9a3c9823a2e6a543eb7807fb3d15d8233817ec5@git.kernel.org>
2013-03-26 14:01 ` [tip:sched/core] sched: Lower chances of cputime scaling overflow Stanislaw Gruszka
2013-03-26 14:19   ` Frederic Weisbecker
2013-03-26 16:54     ` Stanislaw Gruszka
2013-04-10 12:51     ` Ingo Molnar
2013-04-10 15:28       ` Frederic Weisbecker
2013-04-10 17:32         ` Ingo Molnar
2013-04-11  8:04           ` Stanislaw Gruszka
2013-04-11 13:45   ` Peter Zijlstra
2013-04-11 14:50     ` Stanislaw Gruszka
2013-04-11 17:31       ` Peter Zijlstra
2013-04-11 15:38     ` Linus Torvalds
2013-04-11 18:07       ` Peter Zijlstra
2013-04-11 18:22         ` Frederic Weisbecker
2013-04-11 18:26           ` Frederic Weisbecker
2013-04-11 18:22         ` Linus Torvalds
2013-04-12  7:55       ` Peter Zijlstra
2013-04-13 14:49         ` Stanislaw Gruszka
2013-04-13 18:44           ` Linus Torvalds
2013-04-16 10:40             ` Stanislaw Gruszka
2013-04-30 14:03             ` Stanislaw Gruszka [this message]
2013-04-13 14:55       ` Stanislaw Gruszka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130430140330.GB10465@redhat.com \
    --to=sgruszka@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).