From: Thomas Gleixner <tglx@linutronix.de>
To: Waiman Long <longman@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	"H. Peter Anvin" <hpa@zytor.com>, Feng Tang <feng.tang@intel.com>,
	Bill Gray <bgray@redhat.com>, Jirka Hladky <jhladky@redhat.com>
Subject: Re: [PATCH 2/2] x86/tsc_sync: Add synchronization overhead to tsc adjustment
Date: Mon, 25 Apr 2022 21:24:16 +0200	[thread overview]
Message-ID: <87czh50xwf.ffs@tglx> (raw)
In-Reply-To: <4f02fe46-b253-2809-0af7-f2e9da091fe9@redhat.com>

On Mon, Apr 25 2022 at 09:20, Waiman Long wrote:
> On 4/22/22 06:41, Thomas Gleixner wrote:
>> I did some experiments and noticed that the boot-time overhead is
>> different from the overhead when doing the sync check after boot
>> (offlining a socket and then onlining/offlining its first CPU several
>> times).
>>
>> During boot the overhead is lower on this machine (SKL-X); during
>> runtime it's way higher and noisier.
>>
>> The noise can be pretty much eliminated by running the sync_overhead
>> measurement multiple times and averaging the results.
>>
>> The reason it is higher is that after offlining the socket the CPU
>> comes back up at a frequency of 700MHz, while during boot it runs at
>> 2100MHz.
>>
>> Sync overhead: 118
>> Sync overhead:  51 A: 22466 M: 22448 F: 2101683
> One explanation of the sync overhead difference (118 vs 51) here is
> whether the lock cacheline is local or remote. My analysis of the
> interaction between check_tsc_sync_source() and check_tsc_sync_target()
> is that the real overhead comes from locking a remote cacheline (local
> to the source, remote to the target). When you run a loop of 256 lock
> operations, the cacheline stays local, which is why the overhead is
> lower. It also depends on whether the remote cacheline is in the same
> socket or a different socket.

Yes. It's clear that the initial sync overhead is due to the cache line
being remote, but I'd rather underestimate the compensation. Aside from
that, it's not guaranteed that the cache line is actually remote on the
first access. That's a matter of chance, not of design.
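
For illustration, a minimal sketch of the "measure multiple times and
average" idea quoted above. The helper measure_sync_overhead_once() and
the sample count are hypothetical names made up for this example, not the
actual tsc_sync.c interface:

    /*
     * Hypothetical sketch: average several overhead samples to smooth
     * out the noise seen when a CPU comes back online.
     * measure_sync_overhead_once() stands in for one round of the
     * lock + RDTSC measurement discussed above.
     */
    #define SYNC_OVERHEAD_SAMPLES 8

    extern u64 measure_sync_overhead_once(void);    /* hypothetical */

    static u64 measure_sync_overhead_avg(void)
    {
            u64 sum = 0;
            int i;

            for (i = 0; i < SYNC_OVERHEAD_SAMPLES; i++)
                    sum += measure_sync_overhead_once();

            /* Integer average, in TSC cycles */
            return sum / SYNC_OVERHEAD_SAMPLES;
    }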

>> Sync overhead: 178
>> Sync overhead: 152 A: 22477 M: 67380 F:  700529
>>
>> Sync overhead: 212
>> Sync overhead: 152 A: 22475 M: 67380 F:  700467
>>
>> Sync overhead: 153
>> Sync overhead: 152 A: 22497 M: 67452 F:  700404
>>
>> Can you try the patch below and check whether the overhead stabilizes
>> across several attempts on that Cooper Lake machine and whether the
>> frequency is always the same or varies?
> Yes, I will try that experiment and report back the results.
>>
>> Independent of the outcome of that, I think we have to take the actual
>> CPU frequency into account when calculating the overhead.
>
> Assuming that the clock frequency remains the same during the
> check_tsc_warp() loop and the sync overhead computation, I don't think
> the actual clock frequency matters much. However, it is a different
> matter if the frequency does change. In that case it is more likely
> that the frequency goes up than down, right? IOW, we may underestimate
> the sync overhead, which I think is better than overestimating it.

The question is not whether the clock frequency changes during the loop.
The point is:

    start = rdtsc();
    do_stuff();
    end = rdtsc();
    compensation = end - start;
    
do_stuff() executes a constant number of instructions which take a
constant number of CPU clock cycles, let's say 100 for simplicity. The
TSC runs at 2000MHz.

With a CPU frequency of 1000 MHz the real computation time is:

   100/1000MHz = 100 nsec = 200 TSC cycles

while with a CPU frequency of 2000MHz it is obviously:

   100/2000MHz =  50 nsec = 100 TSC cycles

IOW, the TSC runs at a constant frequency independent of the actual CPU
frequency, ergo the CPU-frequency-dependent execution time influences
the resulting compensation value, no?

On the machine I tested on, it's a factor of 3 between the minimal and
the maximal CPU frequency, which makes quite a difference, right?
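
Just to make that concrete, here is a purely illustrative sketch (not the
actual tsc_sync.c code) of normalizing the measured value by the frequency
ratio, assuming the current and maximum CPU frequencies in kHz are known;
div64_u64() is the kernel's 64-bit division helper from linux/math64.h:

    /*
     * Illustrative only: scale an overhead measured in TSC cycles at
     * cur_khz to the value it would have at max_khz. With the numbers
     * above, 152 cycles measured at ~700MHz scales to roughly 51 cycles
     * at ~2100MHz, matching the boot-time measurement.
     */
    static u64 scale_sync_overhead(u64 overhead, u64 cur_khz, u64 max_khz)
    {
            if (!cur_khz || !max_khz)
                    return overhead;

            return div64_u64(overhead * cur_khz, max_khz);
    }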

Thanks,

        tglx




Thread overview: 11+ messages
2022-03-14 19:46 [PATCH 0/2] x86/tsc: Avoid TSC sync failure Waiman Long
2022-03-14 19:46 ` [PATCH 1/2] x86/tsc: Reduce external interference on max_warp detection Waiman Long
2022-03-14 19:46 ` [PATCH 2/2] x86/tsc_sync: Add synchronization overhead to tsc adjustment Waiman Long
2022-04-03 10:03   ` Thomas Gleixner
     [not found]     ` <d1a04785-4822-3a3f-5c37-81329a562364@redhat.com>
2022-04-22 10:41       ` Thomas Gleixner
2022-04-25 13:20         ` Waiman Long
2022-04-25 19:24           ` Thomas Gleixner [this message]
2022-04-26 15:36             ` Waiman Long
2022-04-27 22:38               ` Thomas Gleixner
2022-04-29 17:41                 ` Waiman Long
2022-05-02  7:56                   ` Thomas Gleixner
