From: Christian Engelmayer <cengelma@gmx.at>
To: Stanislaw Gruszka <sgruszka@redhat.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
fweisbec@gmail.com, Paul Turner <pjt@google.com>
Subject: Re: [PROBLEM] possible divide by 0 in kernel/sched/cputime.c scale_stime()
Date: Mon, 25 Nov 2013 01:43:02 +0100 [thread overview]
Message-ID: <20131125014302.6a4394ce@spike> (raw)
In-Reply-To: <20131120120815.GA12860@redhat.com>
On Mon, 18 Nov 2013 18:27:06 +0100, Peter Zijlstra <peterz@infradead.org> wrote:
> That is not actually correct in the case time wraps.
>
> There's a further problem with this code though -- ever since Frederic
> added NO_HZ_FULL a CPU can in fact aggregate a runtime delta larger than
> 4 seconds, due to running without a tick.
>
> Therefore we need to be able to deal with u64 deltas.
>
> The below is a compile tested only attempt to deal with both these
> problems. Comments?
I had this patch applied during daily use. No systematic testing, but no user
perceived regressions either. The originally reported divide by 0 scenario
could no longer be reproduced with this change.
> +/*
> + * delta_exec * weight / lw.weight
> + * OR
> + * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> + *
> + * Either weight := NICE_0_LOAD and lw \e prio_to_wmult[], in which case
> + * we're guaranteed shift stays positive because inv_weight is guaranteed to
> + * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> + *
> + * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> + * XXX mind got twisted, but I'm fairly sure shift will stay positive.
> + *
> + */
> +static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
The patch itself seems comprehensible to me, although I have to admit that I
would have to read into the code more deeply in order to understand why the
changed __calc_delta() will always prove correct.
On Mon, 18 Nov 2013 15:19:56 +0100, Peter Zijlstra <peterz@infradead.org> wrote:
> I'm not sure what tool you used to generate that, but its broken, that's
> model 0x25 (37), it somehow truncates the upper model bits.
Correct, that was the fairly outdated cpuid (http://www.ka9q.net/code/cpuid)
currently shipped with Ubuntu 13.10. Debian already switched to packaging a
maintained version (http://www.etallen.com/cpuid.html).
> That said, its a westmere core and I've seen wsm-ep (dual socket)
> machines loose their TSC sync quite regularly, but this would be the
> first case a single socket wsm would loose its TSC sync.
>
> That leads me to believe your BIOS is screwing you over with SMIs or the
> like.
Having rechecked the running microcode as hinted by Henrique de Moraes Holschuh
off-list and running the Intel BIOS Implementation Test Suite (http://biosbits.org)
that seems to be an educated guess.
Regards,
Christian
next prev parent reply other threads:[~2013-11-25 0:43 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-16 21:37 [PROBLEM] possible divide by 0 in kernel/sched/cputime.c scale_stime() Christian Engelmayer
2013-11-18 14:02 ` Stanislaw Gruszka
2013-11-18 14:19 ` Peter Zijlstra
2013-11-18 15:39 ` Ingo Molnar
2013-11-18 17:27 ` Peter Zijlstra
2013-11-20 12:08 ` Stanislaw Gruszka
2013-11-25 0:43 ` Christian Engelmayer [this message]
2013-11-26 14:44 ` [PATCH] sched, fair: Rework sched_fair time accounting Peter Zijlstra
2013-12-12 9:52 ` [tip:sched/urgent] sched/fair: " tip-bot for Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131125014302.6a4394ce@spike \
--to=cengelma@gmx.at \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=sgruszka@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.