From: Rik van Riel
To: Thomas Gleixner, LKML
Cc: Peter Zijlstra, Ingo Molnar, Glauber Costa, Frederic Weisbecker
Subject: Re: [PATCH] sched/cputime: Fix steal time accounting vs. cpu hotplug
Date: Fri, 04 Mar 2016 10:15:01 -0500
Message-ID: <1457104501.6288.6.camel@redhat.com>

On Fri, 2016-03-04 at 15:59 +0100, Thomas Gleixner wrote:
> On cpu hotplug the steal time accounting can keep a stale rq->prev_steal_time
> value over cpu down and up. So after the cpu comes up again the delta
> calculation in steal_account_process_tick() wreckages itself due to the
> unsigned math:
>
>         u64 steal = paravirt_steal_clock(smp_processor_id());
>
>         steal -= this_rq()->prev_steal_time;
>
> So if steal is smaller than rq->prev_steal_time we end up with an insane large
> value which then gets added to rq->prev_steal_time, resulting in a permanent
> wreckage of the accounting. As a consequence the per cpu stats in /proc/stat
> become stale.
>
> Nice trick to tell the world how idle the system is (100%) while the cpu is
> 100% busy running tasks.
> Though we prefer realistic numbers.
>
> None of the accounting values which use a previous value to account for
> fractions is reset at cpu hotplug time. update_rq_clock_task() has a sanity
> check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
> deals with clock warps and limits the /proc/stat visible wreckage. The
> prev_time values are still wrong.
>
> Solution is simple: Reset rq->prev_*_time when the cpu is plugged in again.
>
> Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
> Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
> Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
> Signed-off-by: Thomas Gleixner
> Cc: stable@vger.kernel.org

Acked-by: Rik van Riel

-- 
All Rights Reversed.
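
For readers following along, the unsigned wraparound described in the quoted
changelog can be reproduced in plain userspace C. This is only a rough sketch,
not the kernel code: struct rq_sketch, steal_delta() and reset_prev_times() are
hypothetical names made up for illustration, and the assumption that the steal
clock reads lower than the stale value after the cpu comes back is taken from
the changelog's "if steal is smaller than rq->prev_steal_time" case.

```c
#include <stdint.h>

/* Hypothetical stand-in for the runqueue field named in the changelog. */
struct rq_sketch {
	uint64_t prev_steal_time;
};

/*
 * Same shape as the quoted delta computation: subtract the previously
 * accounted value from the current steal clock reading. With unsigned
 * 64-bit math the result wraps modulo 2^64 when steal_clock is smaller
 * than prev_steal_time.
 */
static uint64_t steal_delta(const struct rq_sketch *rq, uint64_t steal_clock)
{
	return steal_clock - rq->prev_steal_time;
}

/*
 * Hypothetical illustration of the proposed remedy: clear the stale
 * previous value when the cpu is plugged in again.
 */
static void reset_prev_times(struct rq_sketch *rq)
{
	rq->prev_steal_time = 0;
}
```

With a stale prev_steal_time of 1000 and a fresh steal clock reading of 10,
steal_delta() returns a value just below 2^64 rather than a small delta. After
reset_prev_times() the same reading yields 10.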