All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Engelmayer <cengelma@gmx.at>
To: Stanislaw Gruszka <sgruszka@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	fweisbec@gmail.com, Paul Turner <pjt@google.com>
Subject: Re: [PROBLEM] possible divide by 0 in kernel/sched/cputime.c scale_stime()
Date: Mon, 25 Nov 2013 01:43:02 +0100	[thread overview]
Message-ID: <20131125014302.6a4394ce@spike> (raw)
In-Reply-To: <20131120120815.GA12860@redhat.com>

On Mon, 18 Nov 2013 18:27:06 +0100, Peter Zijlstra <peterz@infradead.org> wrote:
> That is not actually correct in the case time wraps.
> 
> There's a further problem with this code though -- ever since Frederic
> added NO_HZ_FULL a CPU can in fact aggregate a runtime delta larger than
> 4 seconds, due to running without a tick.
> 
> Therefore we need to be able to deal with u64 deltas.
> 
> The below is a compile tested only attempt to deal with both these
> problems. Comments?

I had this patch applied during daily use. No systematic testing, but no user
perceived regressions either. The originally reported divide by 0 scenario
could no longer be reproduced with this change.

> +/* 
> + * delta_exec * weight / lw.weight
> + *   OR
> + * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> + *
> + * Either weight := NICE_0_LOAD and lw \e prio_to_wmult[], in which case
> + * we're guaranteed shift stays positive because inv_weight is guaranteed to
> + * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> + *
> + * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> + * XXX mind got twisted, but I'm fairly sure shift will stay positive.
> + *
> + */
> +static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)

The patch itself seems comprehensible to me, although I have to admit that I
would have to read into the code more deeply in order to understand why the
changed __calc_delta() will always prove correct.

On Mon, 18 Nov 2013 15:19:56 +0100, Peter Zijlstra <peterz@infradead.org> wrote:
> I'm not sure what tool you used to generate that, but its broken, that's
> model 0x25 (37), it somehow truncates the upper model bits.

Correct, that was the fairly outdated cpuid (http://www.ka9q.net/code/cpuid)
currently shipped with Ubuntu 13.10. Debian already switched to packaging a
maintained version (http://www.etallen.com/cpuid.html).

> That said, its a westmere core and I've seen wsm-ep (dual socket)
> machines loose their TSC sync quite regularly, but this would be the
> first case a single socket wsm would loose its TSC sync.
>
> That leads me to believe your BIOS is screwing you over with SMIs or the
> like.

Having rechecked the running microcode as hinted by Henrique de Moraes Holschuh
off-list and running the Intel BIOS Implementation Test Suite (http://biosbits.org)
that seems to be an educated guess.

Regards,
Christian

  reply	other threads:[~2013-11-25  0:43 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-16 21:37 [PROBLEM] possible divide by 0 in kernel/sched/cputime.c scale_stime() Christian Engelmayer
2013-11-18 14:02 ` Stanislaw Gruszka
2013-11-18 14:19   ` Peter Zijlstra
2013-11-18 15:39     ` Ingo Molnar
2013-11-18 17:27   ` Peter Zijlstra
2013-11-20 12:08     ` Stanislaw Gruszka
2013-11-25  0:43       ` Christian Engelmayer [this message]
2013-11-26 14:44         ` [PATCH] sched, fair: Rework sched_fair time accounting Peter Zijlstra
2013-12-12  9:52     ` [tip:sched/urgent] sched/fair: " tip-bot for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131125014302.6a4394ce@spike \
    --to=cengelma@gmx.at \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=sgruszka@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.