From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1egCWA-0005Vs-E4 for mharc-grub-devel@gnu.org; Mon, 29 Jan 2018 11:42:50 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52087) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1egCW7-0005Tv-Mq for grub-devel@gnu.org; Mon, 29 Jan 2018 11:42:49 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1egCW4-0007wv-GI for grub-devel@gnu.org; Mon, 29 Jan 2018 11:42:47 -0500 Received: from aserp2130.oracle.com ([141.146.126.79]:42840) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1egCW4-0007wH-6w for grub-devel@gnu.org; Mon, 29 Jan 2018 11:42:44 -0500 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w0TGfwLl090341; Mon, 29 Jan 2018 16:42:32 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=date : from : to : cc : subject : message-id : references : mime-version : content-type : in-reply-to; s=corp-2017-10-26; bh=IG6AYAS6PaivR0/dh1QLfgPUKvxTrTs5tDkmNwXt9U8=; b=qi/owsCmCv89h0WldHbe+2Cn6YhK2tLWkqVHYxReu3Gl7PX3/79Saly5WAs61n7G8dki 51nJhYDs2VxaHqlgITyO7fhPU1yr8ZVR4YqtVdGqSqcB2ABlBvi0/54qpZNpNpqGwyqo b3TlxBxaTxhhhiQKzs5H7Ch23SPvNy1jC+Lwo5GD+Ldn+sB+Qg5YvdiyTddMUoXiCaM+ z94BiOS3WrocVPfqeTLXThJZXC8NCwh0P8ASFvU6MHQ9QZ3K34c2dp/pZM/D/OJsYi/0 Q0jgeXZWN4Ko22wpcrRE588gJznstV+ZZ/m3KkCKJXh2VOhaBH/+LfamNtkthKtMJuZa XQ== Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by aserp2130.oracle.com with ESMTP id 2ft59g8rk2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 29 Jan 2018 16:42:32 +0000 Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id w0TGgVp3003038 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Mon, 29 Jan 2018 16:42:32 GMT Received: from abhmp0018.oracle.com (abhmp0018.oracle.com [141.146.116.24]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id w0TGgUIh025544; Mon, 29 Jan 2018 16:42:30 GMT Received: from olila.local.net-space.pl (/10.175.182.146) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 29 Jan 2018 08:42:29 -0800 Date: Mon, 29 Jan 2018 17:42:26 +0100 From: Daniel Kiper To: Peter Jones Cc: grub-devel@gnu.org, "David E . Box" Subject: Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail. Message-ID: <20180129164226.GH16863@olila.local.net-space.pl> References: <20180116181617.2748-1-pjones@redhat.com> <20180118225253.GB21705@olila.local.net-space.pl> <20180119211000.ubntekuxuzhnxlac@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180119211000.ubntekuxuzhnxlac@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=8789 signatures=668655 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1711220000 definitions=main-1801290218 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] [fuzzy] X-Received-From: 141.146.126.79 X-BeenThere: grub-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: The development of GNU GRUB List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Jan 2018 16:42:49 -0000 On Fri, Jan 19, 2018 at 04:10:01PM -0500, Peter Jones wrote: > On Thu, Jan 18, 2018 at 11:52:53PM +0100, Daniel Kiper wrote: > > On Tue, Jan 16, 2018 at 01:16:17PM -0500, Peter Jones wrote: > > > On my laptop running at 2.4GHz, if I run a VM where tsc calibration > > > using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds > > > to do so (as measured with the stopwatch on my phone), with a tsc delta > > > of 0x1cd1c85300, or around 125 billion cycles. > > > > > > If instead of trying to wait for 5-200ms to show up on the pmtimer, we try > > > to wait for 5-200us, it decides it's broken in ~0x7998f9e TSCs, aka ~2 > > > million cycles, or more or less instantly. > > > > > > Additionally, this reading the pmtimer was returning 0xffffffff anyway, > > > and that's obviously an invalid return. I've added a check for that and > > > 0 so we don't bother waiting for the test if what we're seeing is dead > > > pins with no response at all. > > > > > > Signed-off-by: Peter Jones > > > --- > > > grub-core/kern/i386/tsc_pmtimer.c | 43 ++++++++++++++++++++++++++++++--------- > > > 1 file changed, 33 insertions(+), 10 deletions(-) > > > > > > diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c > > > index c9c36169978..609402b8376 100644 > > > --- a/grub-core/kern/i386/tsc_pmtimer.c > > > +++ b/grub-core/kern/i386/tsc_pmtimer.c > > > @@ -38,30 +38,53 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer, > > > grub_uint64_t start_tsc; > > > grub_uint64_t end_tsc; > > > int num_iter = 0; > > > + int bad_reads = 0; > > > > > > - start = grub_inl (pmtimer) & 0xffffff; > > > + start = grub_inl (pmtimer) & 0x3fff; > > > > I am not sure why you are changing this to 0x3fff... > > Upon re-reading, I am not either. (As you see I wrote this patch 2+ > months ago while working on something else, so the memory has already > faded.) Yeah, I know how it works... > Obviously I need to make a version 2 of this patch to fix some issues. Great! > > > last = start; > > > end = start + num_pm_ticks; > > > start_tsc = grub_get_tsc (); > > > while (1) > > > { > > > - cur = grub_inl (pmtimer) & 0xffffff; > > > > What about 24-bit timers? I would leave this here... > > I was blisfully unaware of 24-bit timers. I'll put that back and add a > comment for future readers of the code, and check for 0xffffff below. Thanks! > > > + cur = grub_inl (pmtimer); > > > + > > > + /* If we get 10 reads in a row that are obviously dead pins, there's no > > > + reason to do this thousands of times. > > > + */ > > > + if (cur == 0xffffffff || cur == 0) > > > > ...and here I would check for 0xffffff and 0. > > Sure. > > > > > > + { > > > + bad_reads++; > > > + grub_dprintf ("pmtimer", "cur: 0x%08x bad_reads: %d\n", cur, bad_reads); > > > + > > > + if (bad_reads == 10) > > > + return 0; > > > + } > > > + else if (bad_reads) > > > + bad_reads = 0; > > > > Do we really need to reset this? > > I mean, it's technically correct, but I don't think you're going to see > 0 and 0xffffff as valid values often enough to actually care. > > > > + cur &= 0x3fff; > > > + > > > if (cur < last) > > > - cur |= 0x1000000; > > > + cur |= 0x4000; > > > num_iter++; > > You didn't spot this, but it's wrong for exactly the same reason the > initialization of start is. I'm pretty sure when I did this part, I was > letting the comment below tell me what was going on, and completely > missed that the comment doesn't match the conditional at all. > > > > if (cur >= end) > > > { > > > end_tsc = grub_get_tsc (); > > > + grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n", > > > + end_tsc - start_tsc); > > > return end_tsc - start_tsc; > > > } > > > - /* Check for broken PM timer. > > > - 50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz) > > > - if after this time we still don't have 1 ms on pmtimer, then > > > - pmtimer is broken. > > > + /* Check for broken PM timer. 5000 TSCs is between 5us (10GHz) and > > ^^^ 500ns? > > > > > + 200us (250 MHz). If after this time we still don't have 1us on > > ^^^^^ 20us? > > You know, I stared at this for a surprising amount of time when writing > it trying to figure out why these numbers seemed weird. While I have > the diff in front of me rather than the code as it is now, I see that > it's because I was starting with the comment that was previously there, > which had 50M instead of 5M for the TSC count. So I wound up dividing > these by 10k instead of 1k. Woops. Will fix in v2. > > > > > > + pmtimer, then pmtimer is broken. > > > */ > > > - if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) { > > > - return 0; > > > - } > > > + end_tsc = grub_get_tsc(); > > > + if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000) > > > > Why 0x3fff here which means, AIUI, 4000 iterations? > > Er... 16384, surely? But basically I just wanted to scale down the Yep, or I simply forgot 0x before 4000... > limit by approximately the same order as I scaled down the TSC delta, > and I wanted to leave the code basically working the same way (not for > any good reason), so I still wanted a number with only low-order bits > set. TSC is scaled down by 1000, so I scaled the PM tick iteration down > by 1024. I could make it twice as wide of a window instead, if you're > worried about that particular race, but I don't think it's going to > matter. > > If our loop takes 1 TSC tick per iteration, which is quite impluasibly > fast, and we're on a 10GHz machine, then we should normally see pmtimer > change by 3580 in ~2800 iterations. If it takes 16k iterations (again, I think that number of iterations is wrong here. Let's see... f_pm = n_pm / t_pm, n_pm = 3580, f_pm = 3.58MHz => t_pm = 3580 / 3.58MHz t_pm = 0,001s = 1ms => num_iter = 1ms * 10GHz = 10.000.000 Then below calculations are wrong too... > with perfectly executing code that takes no time), the pmtimer is > running at 625KHz instead of 3.58MHz. If we're on a 250MHz machine, > that number is ~70 iterations. If we get to 16000 iterations without > seeing pmtimer change by 3580 then pmtimer is either broken or we're on > a 60GHz+ machine. > > That said the logic here did not match the comment and it still doesn't; > it should be comparing cur and start, not the number of iterations. On > the next version of the patch I'm going to do something like: > > if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) || Taking into account above I am not sure why you multiply by 8 here... > (cur == start && end_tsc - start_tsc > 5000)) > > So it'll have a fairly widely ranged total limit but also actually check if > pmtimer results have changed at all. > > Tentative patch below, but I haven't had a chance to test it yet: In general I am not against the idea itself but I think that some calculations should be fixed here and there. > From 97e59bbe000961db5f823c4df074835d1165bd9d Mon Sep 17 00:00:00 2001 > From: Peter Jones > Date: Tue, 7 Nov 2017 17:12:17 -0500 > Subject: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail. > > On my laptop running at 2.4GHz, if I run a VM where tsc calibration > using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds > to do so (as measured with the stopwatch on my phone), with a tsc delta > of 0x1cd1c85300, or around 125 billion cycles. > > If instead of trying to wait for 5-200ms to show up on the pmtimer, we try > to wait for 500ns-20us, it decides it's broken in ~0x7998f9e TSCs, aka ~2 > million cycles, or more or less instantly. > > Additionally, this reading the pmtimer was returning 0xffffffff anyway, > and that's obviously an invalid return. I've added a check for that and > 0 so we don't bother waiting for the test if what we're seeing is dead > pins with no response at all. > --- > grub-core/kern/i386/tsc_pmtimer.c | 56 ++++++++++++++++++++++++++++++++------- > 1 file changed, 46 insertions(+), 10 deletions(-) > > diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c > index c9c36169978..b09c00316df 100644 > --- a/grub-core/kern/i386/tsc_pmtimer.c > +++ b/grub-core/kern/i386/tsc_pmtimer.c > @@ -37,8 +37,14 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer, > grub_uint32_t cur, end; > grub_uint64_t start_tsc; > grub_uint64_t end_tsc; > - int num_iter = 0; > + unsigned int num_iter = 0; > + int bad_reads = 0; > > + /* Some timers are 24-bit and some are 32-bit, but it doesn't make much Please leave empty line after "/*". > + * difference to us. Caring which one we have isn't really worth it since > + * the low-order digits will give us enough data to calibrate TSC. So just > + * mask the top-order byte off. > + */ > start = grub_inl (pmtimer) & 0xffffff; > last = start; > end = start + num_pm_ticks; > @@ -46,22 +52,52 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer, > while (1) > { > cur = grub_inl (pmtimer) & 0xffffff; > + > + /* If we get 10 reads in a row that are obviously dead pins, there's no > + reason to do this thousands of times. Please be consistent with the comments format. I prefer one given above. > + */ > + if (cur == 0xffffff || cur == 0) > + { > + bad_reads++; > + grub_dprintf ("pmtimer", "pmtimer value: 0x%08x bad_reads: %d\n", > + cur, bad_reads); > + > + if (bad_reads == 10) > + return 0; > + } > + else if (bad_reads) > + bad_reads = 0; > + > + end_tsc = grub_get_tsc(); > + > if (cur < last) > cur |= 0x1000000; > - num_iter++; > if (cur >= end) > { > - end_tsc = grub_get_tsc (); > + grub_dprintf ("pmtimer", "tsc delta is 0x%016llx\n", > + end_tsc - start_tsc); > return end_tsc - start_tsc; > } > - /* Check for broken PM timer. > - 50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz) > - if after this time we still don't have 1 ms on pmtimer, then > - pmtimer is broken. > + > + /* Check for broken PM timer. 5000 TSCs is between 500ns (10GHz) and > + 20us (250 MHz). If after this time we still don't have 1us on > + pmtimer, then pmtimer is broken. > + > + Likewise, if our code is perfectly efficient and introduces no delays > + whatsoever, on a 10GHz system we should see a TSC delta of 3580 in > + ~2800 iterations. On a 250MHz machine that should be ~70 iterations. Incorrect calculations? > + So there's a hard limit here at 8x our desired pm timer delta, because > + if we get that far, we're either on a 60GHz+ machine or we're never > + hitting our threshold. And please use same comment format as in first one. Thanks, Daniel