From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932075AbdK1TKK (ORCPT ); Tue, 28 Nov 2017 14:10:10 -0500 Received: from mail-it0-f65.google.com ([209.85.214.65]:46125 "EHLO mail-it0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753049AbdK1TKG (ORCPT ); Tue, 28 Nov 2017 14:10:06 -0500 X-Google-Smtp-Source: AGs4zMbvikhBeFbK94B2Cv2+oYEanvNNgR5CWJXXeX7cIic6//P6eGg/K5S9h1w7yKyOoZ6iyJHshw== Subject: Re: [RFC patch 7/7] timekeeping: Hack to use fine grained timestamps during boot To: Prarit Bhargava , Petr Mladek , Thomas Gleixner Cc: LKML , Linus Torvalds , Ingo Molnar , "H. Peter Anvin" , Peter Zijlstra , Andrew Morton , Sergey Senozhatsky , Steven Rostedt , Joe Perches References: <20171115181531.322572387@linutronix.de> <20171115182657.703928462@linutronix.de> <20171123125823.gnhwtnx6bxd3tb4q@pathway.suse.cz> <205229e2-fab4-31cb-60cd-f36bc2228804@redhat.com> From: Mark Salyzyn Message-ID: <26f1d1fd-ed65-73d1-8278-8058b6b83a3e@android.com> Date: Tue, 28 Nov 2017 11:10:02 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <205229e2-fab4-31cb-60cd-f36bc2228804@redhat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/28/2017 10:43 AM, Prarit Bhargava wrote: > > On 11/23/2017 07:58 AM, Petr Mladek wrote: >> On Wed 2017-11-15 19:15:38, Thomas Gleixner wrote: >>> For demonstration purposes only. >>> >>> Add a disgusting hack to work around the fact that high resolution clock >>> MONOTONIC accessors are not available during early boot and return stale >>> time stamps accross suspend/resume when the current clocksource is not >>> flagged with CLOCK_SOURCE_SUSPEND_ACCESS_OK. >>> >>> Use local_clock() to provide timestamps in early boot and when the >>> clocksource is not accessible after timekeeping_suspend(). In the >>> suspend/resume case this might cause non monotonic timestamps. >> I get the non-monotonic times even during boot: >> >> [ 0.026709] smp: Bringing up secondary CPUs ... >> [ 0.027973] x86: Booting SMP configuration: >> [ 0.028006] .... node #0, CPUs: #1 >> [ 0.004000] kvm-clock: cpu 1, msr 1:3ff51041, secondary cpu clock >> ^^^^^^^^ >> [ 0.032097] KVM setup async PF for cpu 1 >> [ 0.032766] kvm-stealtime: cpu 1, msr 13b00dc40 >> [ 0.036502] #2 >> [ 0.004000] kvm-clock: cpu 2, msr 1:3ff51081, secondary cpu clock >> ^^^^^^^^ >> [ 0.040062] KVM setup async PF for cpu 2 >> [ 0.040576] kvm-stealtime: cpu 2, msr 13b20dc40 >> [ 0.041304] #3 >> [ 0.004000] kvm-clock: cpu 3, msr 1:3ff510c1, secondary cpu clock >> ^^^^^^^^ >> [ 0.048051] KVM setup async PF for cpu 3 >> [ 0.048554] kvm-stealtime: cpu 3, msr 13b40dc40 >> >> >> To be honest, I do not feel experienced enough to decide which >> solution is acceptable. I would say that only few people care >> about timestamps during boot. On the other hand, some tools > It is extremely important to know what happened and how long it took. I agree > with Petr, we should figure out a way to guarantee that the timestamp is monotonic. > > P. There is no guarantee of sequential order of delivery for kernel prints. [TL;DR] The timestamp may very well be monotonic in the general case, and yet it may take longer to _compose_ or _deliver_ the message (multiprocessor, via vsprint, etc). To get sequential order the timestamp _has_ to be reliably monotonic and _collected_ at incident _before_ anything else, before message is composed, and then (land in-place, post-sorted, what evz?) be presented in time sorted order. User space can also inject into the kernel print buffer, it certainly does not support at-incident timestamp collection as it is delayed already by a syscall or two. We would need to allow user space (via fast vdso) collect timestamp(s), then make it part of the payload of the report to even get close. It is out-of-scope for this RFC to consider rewriting the timestamp acquisition and delivery infrastructure for kernel prints IMHO. For now we all have accepted they will be _somewhat_ sequential within the confines of races and composition. Sincerely -- Mark Salyzyn