From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1ecdvO-0002rC-Fd for mharc-grub-devel@gnu.org; Fri, 19 Jan 2018 16:10:10 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50691) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ecdvL-0002m2-RQ for grub-devel@gnu.org; Fri, 19 Jan 2018 16:10:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ecdvK-0004br-2Y for grub-devel@gnu.org; Fri, 19 Jan 2018 16:10:07 -0500 Received: from mx1.redhat.com ([209.132.183.28]:56494) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ecdvJ-0004bQ-Op for grub-devel@gnu.org; Fri, 19 Jan 2018 16:10:05 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 41346272C0; Fri, 19 Jan 2018 21:10:04 +0000 (UTC) Received: from redhat.com (dhcp-10-20-1-221.bss.redhat.com [10.20.1.221]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 8CC446D007; Fri, 19 Jan 2018 21:10:03 +0000 (UTC) Date: Fri, 19 Jan 2018 16:10:01 -0500 From: Peter Jones To: Daniel Kiper Cc: grub-devel@gnu.org, "David E . Box" Subject: Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail. Message-ID: <20180119211000.ubntekuxuzhnxlac@redhat.com> References: <20180116181617.2748-1-pjones@redhat.com> <20180118225253.GB21705@olila.local.net-space.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20180118225253.GB21705@olila.local.net-space.pl> User-Agent: NeoMutt/20171215 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 19 Jan 2018 21:10:04 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 X-BeenThere: grub-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: The development of GNU GRUB List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Jan 2018 21:10:09 -0000 On Thu, Jan 18, 2018 at 11:52:53PM +0100, Daniel Kiper wrote: > On Tue, Jan 16, 2018 at 01:16:17PM -0500, Peter Jones wrote: > > On my laptop running at 2.4GHz, if I run a VM where tsc calibration > > using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds > > to do so (as measured with the stopwatch on my phone), with a tsc delta > > of 0x1cd1c85300, or around 125 billion cycles. > > > > If instead of trying to wait for 5-200ms to show up on the pmtimer, we try > > to wait for 5-200us, it decides it's broken in ~0x7998f9e TSCs, aka ~2 > > million cycles, or more or less instantly. > > > > Additionally, this reading the pmtimer was returning 0xffffffff anyway, > > and that's obviously an invalid return. I've added a check for that and > > 0 so we don't bother waiting for the test if what we're seeing is dead > > pins with no response at all. > > > > Signed-off-by: Peter Jones > > --- > > grub-core/kern/i386/tsc_pmtimer.c | 43 ++++++++++++++++++++++++++++++--------- > > 1 file changed, 33 insertions(+), 10 deletions(-) > > > > diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c > > index c9c36169978..609402b8376 100644 > > --- a/grub-core/kern/i386/tsc_pmtimer.c > > +++ b/grub-core/kern/i386/tsc_pmtimer.c > > @@ -38,30 +38,53 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer, > > grub_uint64_t start_tsc; > > grub_uint64_t end_tsc; > > int num_iter = 0; > > + int bad_reads = 0; > > > > - start = grub_inl (pmtimer) & 0xffffff; > > + start = grub_inl (pmtimer) & 0x3fff; > > I am not sure why you are changing this to 0x3fff... Upon re-reading, I am not either. (As you see I wrote this patch 2+ months ago while working on something else, so the memory has already faded.) Obviously I need to make a version 2 of this patch to fix some issues. > > last = start; > > end = start + num_pm_ticks; > > start_tsc = grub_get_tsc (); > > while (1) > > { > > - cur = grub_inl (pmtimer) & 0xffffff; > > What about 24-bit timers? I would leave this here... I was blisfully unaware of 24-bit timers. I'll put that back and add a comment for future readers of the code, and check for 0xffffff below. > > + cur = grub_inl (pmtimer); > > + > > + /* If we get 10 reads in a row that are obviously dead pins, there's no > > + reason to do this thousands of times. > > + */ > > + if (cur == 0xffffffff || cur == 0) > > ...and here I would check for 0xffffff and 0. Sure. > > > + { > > + bad_reads++; > > + grub_dprintf ("pmtimer", "cur: 0x%08x bad_reads: %d\n", cur, bad_reads); > > + > > + if (bad_reads == 10) > > + return 0; > > + } > > + else if (bad_reads) > > + bad_reads = 0; > > Do we really need to reset this? I mean, it's technically correct, but I don't think you're going to see 0 and 0xffffff as valid values often enough to actually care. > > + cur &= 0x3fff; > > + > > if (cur < last) > > - cur |= 0x1000000; > > + cur |= 0x4000; > > num_iter++; You didn't spot this, but it's wrong for exactly the same reason the initialization of start is. I'm pretty sure when I did this part, I was letting the comment below tell me what was going on, and completely missed that the comment doesn't match the conditional at all. > > if (cur >= end) > > { > > end_tsc = grub_get_tsc (); > > + grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n", > > + end_tsc - start_tsc); > > return end_tsc - start_tsc; > > } > > - /* Check for broken PM timer. > > - 50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz) > > - if after this time we still don't have 1 ms on pmtimer, then > > - pmtimer is broken. > > + /* Check for broken PM timer. 5000 TSCs is between 5us (10GHz) and > ^^^ 500ns? > > > + 200us (250 MHz). If after this time we still don't have 1us on > ^^^^^ 20us? You know, I stared at this for a surprising amount of time when writing it trying to figure out why these numbers seemed weird. While I have the diff in front of me rather than the code as it is now, I see that it's because I was starting with the comment that was previously there, which had 50M instead of 5M for the TSC count. So I wound up dividing these by 10k instead of 1k. Woops. Will fix in v2. > > > + pmtimer, then pmtimer is broken. > > */ > > - if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) { > > - return 0; > > - } > > + end_tsc = grub_get_tsc(); > > + if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000) > > Why 0x3fff here which means, AIUI, 4000 iterations? Er... 16384, surely? But basically I just wanted to scale down the limit by approximately the same order as I scaled down the TSC delta, and I wanted to leave the code basically working the same way (not for any good reason), so I still wanted a number with only low-order bits set. TSC is scaled down by 1000, so I scaled the PM tick iteration down by 1024. I could make it twice as wide of a window instead, if you're worried about that particular race, but I don't think it's going to matter. If our loop takes 1 TSC tick per iteration, which is quite impluasibly fast, and we're on a 10GHz machine, then we should normally see pmtimer change by 3580 in ~2800 iterations. If it takes 16k iterations (again, with perfectly executing code that takes no time), the pmtimer is running at 625KHz instead of 3.58MHz. If we're on a 250MHz machine, that number is ~70 iterations. If we get to 16000 iterations without seeing pmtimer change by 3580 then pmtimer is either broken or we're on a 60GHz+ machine. That said the logic here did not match the comment and it still doesn't; it should be comparing cur and start, not the number of iterations. On the next version of the patch I'm going to do something like: if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) || (cur == start && end_tsc - start_tsc > 5000)) So it'll have a fairly widely ranged total limit but also actually check if pmtimer results have changed at all. Tentative patch below, but I haven't had a chance to test it yet: >From 97e59bbe000961db5f823c4df074835d1165bd9d Mon Sep 17 00:00:00 2001 From: Peter Jones Date: Tue, 7 Nov 2017 17:12:17 -0500 Subject: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail. On my laptop running at 2.4GHz, if I run a VM where tsc calibration using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds to do so (as measured with the stopwatch on my phone), with a tsc delta of 0x1cd1c85300, or around 125 billion cycles. If instead of trying to wait for 5-200ms to show up on the pmtimer, we try to wait for 500ns-20us, it decides it's broken in ~0x7998f9e TSCs, aka ~2 million cycles, or more or less instantly. Additionally, this reading the pmtimer was returning 0xffffffff anyway, and that's obviously an invalid return. I've added a check for that and 0 so we don't bother waiting for the test if what we're seeing is dead pins with no response at all. --- grub-core/kern/i386/tsc_pmtimer.c | 56 ++++++++++++++++++++++++++++++++------- 1 file changed, 46 insertions(+), 10 deletions(-) diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c index c9c36169978..b09c00316df 100644 --- a/grub-core/kern/i386/tsc_pmtimer.c +++ b/grub-core/kern/i386/tsc_pmtimer.c @@ -37,8 +37,14 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer, grub_uint32_t cur, end; grub_uint64_t start_tsc; grub_uint64_t end_tsc; - int num_iter = 0; + unsigned int num_iter = 0; + int bad_reads = 0; + /* Some timers are 24-bit and some are 32-bit, but it doesn't make much + * difference to us. Caring which one we have isn't really worth it since + * the low-order digits will give us enough data to calibrate TSC. So just + * mask the top-order byte off. + */ start = grub_inl (pmtimer) & 0xffffff; last = start; end = start + num_pm_ticks; @@ -46,22 +52,52 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer, while (1) { cur = grub_inl (pmtimer) & 0xffffff; + + /* If we get 10 reads in a row that are obviously dead pins, there's no + reason to do this thousands of times. + */ + if (cur == 0xffffff || cur == 0) + { + bad_reads++; + grub_dprintf ("pmtimer", "pmtimer value: 0x%08x bad_reads: %d\n", + cur, bad_reads); + + if (bad_reads == 10) + return 0; + } + else if (bad_reads) + bad_reads = 0; + + end_tsc = grub_get_tsc(); + if (cur < last) cur |= 0x1000000; - num_iter++; if (cur >= end) { - end_tsc = grub_get_tsc (); + grub_dprintf ("pmtimer", "tsc delta is 0x%016llx\n", + end_tsc - start_tsc); return end_tsc - start_tsc; } - /* Check for broken PM timer. - 50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz) - if after this time we still don't have 1 ms on pmtimer, then - pmtimer is broken. + + /* Check for broken PM timer. 5000 TSCs is between 500ns (10GHz) and + 20us (250 MHz). If after this time we still don't have 1us on + pmtimer, then pmtimer is broken. + + Likewise, if our code is perfectly efficient and introduces no delays + whatsoever, on a 10GHz system we should see a TSC delta of 3580 in + ~2800 iterations. On a 250MHz machine that should be ~70 iterations. + So there's a hard limit here at 8x our desired pm timer delta, because + if we get that far, we're either on a 60GHz+ machine or we're never + hitting our threshold. */ - if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) { - return 0; - } + if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) || + (cur == start && end_tsc - start_tsc > 5000)) + { + grub_dprintf ("pmtimer", + "tsc delta is 0x%016llx, which is implausible\n", + end_tsc - start_tsc); + return 0; + } } } -- 2.15.0 -- Peter