linux-kernel.vger.kernel.org archive mirror
* [PATCH v4.16-rc5 (3)] x86/vdso: on Intel, VDSO should handle CLOCK_MONOTONIC_RAW
@ 2018-03-14  4:20 jason.vas.dias
  2018-03-14  4:20 ` [PATCH v4.16-rc5 1/3] " jason.vas.dias
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: jason.vas.dias @ 2018-03-14  4:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, tglx, mingo, peterz, andi



  Currently the VDSO does not handle
     clock_gettime( CLOCK_MONOTONIC_RAW, &ts )
  on Intel / AMD - it calls
     vdso_fallback_gettime()
  for this clock, which issues a syscall with an unacceptably high
  latency (minimum measurable time, i.e. the time between consecutive
  measurements) of 300-700ns on two 2.8-3.9GHz Haswell x86_64
  (Family_Model 06_3C) machines under various versions of Linux.
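
  The "minimum measurable time" quoted above can be estimated from user
  space roughly as follows. This is only a minimal sketch, not the test
  program used for the figures in this mail; the helper names
  ts_to_ns() and min_clock_delta_ns() are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds as a signed 64-bit value. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Hypothetical helper: return the smallest non-zero difference (in ns)
 * seen between two back-to-back clock_gettime() calls on clock `clk`,
 * or -1 if the clock cannot be read or never advanced.
 */
static int64_t min_clock_delta_ns(clockid_t clk, int iterations)
{
    struct timespec a, b;
    int64_t best = -1;

    assert(iterations > 0);
    for (int i = 0; i < iterations; i++) {
        if (clock_gettime(clk, &a) != 0 || clock_gettime(clk, &b) != 0)
            return -1;
        int64_t d = ts_to_ns(&b) - ts_to_ns(&a);
        if (d > 0 && (best < 0 || d < best))
            best = d;
    }
    return best;
}
```

  Calling min_clock_delta_ns(CLOCK_MONOTONIC_RAW, 100000) and
  min_clock_delta_ns(CLOCK_MONOTONIC, 100000) on the same machine gives a
  rough comparison of the syscall path against the vDSO path; the result
  depends heavily on the CPU, kernel version and clocksource in use.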

  Sometimes, particularly when correlating elapsed time to performance
  counter values, user-space  code needs to know elapsed time from the
  perspective of the CPU no matter how "hot" / fast or "cold" / slow it
  might be running wrt NTP / PTP "real" time; when code needs this,
  the latencies associated with a syscall are often unacceptably high.

  I reported this as Bug #198161 :
    'https://bugzilla.kernel.org/show_bug.cgi?id=198961'
  and in previous posts with subjects matching 'CLOCK_MONOTONIC_RAW' .
     
  This patch handles CLOCK_MONOTONIC_RAW clock_gettime() in the VDSO ,
  by exporting the raw clock calibration, last cycles, last xtime_nsec,
  and last raw_sec value in the vsyscall_gtod_data during vsyscall_update() .

  Now the new do_monotonic_raw() function in the vDSO has a latency of about
  24ns on average, and the test program:
   tools/testing/selftests/timers/inconsistency-check.c
  succeeds with arguments: '-c 4 -t 120' or any arbitrary -t value.
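
  The general idea of turning the exported values (calibration, last
  cycles, last nanoseconds, last seconds) into a timestamp can be sketched
  in plain C as below. This is an illustration only, not the code of the
  patch: struct raw_snapshot, struct raw_time and raw_time_from_cycles()
  are hypothetical names, the nanosecond field is assumed to be stored
  left-shifted by `shift`, and the locking against concurrent updates that
  a real vDSO reader needs is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical snapshot of the values exported at the last update. */
struct raw_snapshot {
    uint64_t cycle_last;        /* counter value at the last update */
    uint32_t mult;              /* raw clock multiplier */
    uint32_t shift;             /* raw clock shift */
    uint64_t raw_sec;           /* seconds at the last update */
    uint64_t raw_nsec_shifted;  /* nanoseconds at the last update << shift */
};

struct raw_time {
    uint64_t sec;
    uint64_t nsec;
};

/*
 * Scale the cycles elapsed since the snapshot by mult/2^shift, add the
 * stored nanoseconds, and normalise into seconds + nanoseconds.
 * Note: delta * mult may overflow 64 bits for very large deltas; this
 * sketch assumes snapshots are refreshed often enough to avoid that.
 */
static struct raw_time raw_time_from_cycles(const struct raw_snapshot *s,
                                            uint64_t cycles)
{
    /* Treat a counter value older than the snapshot as "no time elapsed". */
    uint64_t delta = cycles > s->cycle_last ? cycles - s->cycle_last : 0;
    uint64_t ns = (s->raw_nsec_shifted + delta * s->mult) >> s->shift;
    struct raw_time t = { s->raw_sec + ns / NSEC_PER_SEC, ns % NSEC_PER_SEC };

    return t;
}
```

  For example, with mult = 1 << 23 and shift = 24 (0.5ns per cycle, i.e. a
  2GHz counter), 4,000,000,000 elapsed cycles add exactly 2 seconds to the
  snapshot time.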

  The patch is against Linus' latest 4.16-rc5 tree,
  current HEAD of :
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  .

  This patch affects only files:
  
   arch/x86/include/asm/vgtod.h
   arch/x86/entry/vdso/vclock_gettime.c
   arch/x86/entry/vdso/vdso.lds.S
   arch/x86/entry/vdso/vdsox32.lds.S
   arch/x86/entry/vdso/vdso32/vdso32.lds.S      
   arch/x86/entry/vsyscall/vsyscall_gtod.c

  There are 3 patches in the series :

   Patch #1 makes the VDSO handle clock_gettime(CLOCK_MONOTONIC_RAW) with rdtsc_ordered()

   Patch #2 makes the VDSO handle clock_gettime(CLOCK_MONOTONIC_RAW) with a new rdtscp() function in msr.h

   Patch #3 makes the VDSO export TSC calibration data via a new function in the vDSO: 
               unsigned int __vdso_linux_tsc_calibration ( struct linux_tsc_calibration *tsc_cal )
            that user code can optionally call.

   Patches #2 & #3 should be considered "optional" .

   Patch #2 makes clock_gettime(CLOCK_MONOTONIC_RAW) calls have about half the latency
   of clock_gettime(CLOCK_MONOTONIC) calls.

   I think something like Patch #3 is necessary to export TSC calibration data to user-space TSC readers.
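
   As a rough illustration of why a user-space TSC reader wants such
   calibration data, the sketch below converts a TSC delta to nanoseconds
   with a multiplier/shift pair. The fields of struct
   linux_tsc_calibration are not shown in this mail, so the struct here
   (tsc_calibration_sketch) and both helper functions are hypothetical
   stand-ins, and the TSC frequency in kHz is assumed to be known.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; the real struct's fields are not shown here. */
struct tsc_calibration_sketch {
    uint32_t mult;
    uint32_t shift;
};

/*
 * One cycle lasts 1e6 / tsc_khz nanoseconds, so for a given shift the
 * multiplier is (1e6 << shift) / tsc_khz, rounded to nearest.
 */
static struct tsc_calibration_sketch calibration_from_khz(uint32_t tsc_khz,
                                                          uint32_t shift)
{
    assert(tsc_khz > 0 && shift < 32);
    uint64_t mult = (((uint64_t)1000000 << shift) + tsc_khz / 2) / tsc_khz;
    assert(mult <= UINT32_MAX);

    struct tsc_calibration_sketch c = { (uint32_t)mult, shift };
    return c;
}

/* Convert a TSC delta to ns; may overflow 64 bits for very large deltas. */
static uint64_t tsc_delta_to_ns(const struct tsc_calibration_sketch *c,
                                uint64_t tsc_delta)
{
    return (tsc_delta * c->mult) >> c->shift;
}
```

   With a 2GHz TSC (tsc_khz = 2000000) and shift = 24, the multiplier comes
   out as 1 << 23, so a delta of 2000 cycles converts to 1000ns. Without
   the kernel's own calibration values, user code has to guess or measure
   the TSC frequency itself.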


Best Regards,
Jason Vas Dias

* [PATCH v4.16-rc5 (3)] x86/vdso: on Intel, VDSO should handle CLOCK_MONOTONIC_RAW
@ 2018-03-15 16:00 jason.vas.dias
  2018-03-15 16:00 ` [PATCH v4.16-rc5 2/3] " jason.vas.dias
  0 siblings, 1 reply; 8+ messages in thread
From: jason.vas.dias @ 2018-03-15 16:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, tglx, mingo, peterz, andi


  Resent to address reviewer comments.
   
  Currently, the VDSO does not handle
     clock_gettime( CLOCK_MONOTONIC_RAW, &ts )
  on Intel / AMD - it calls
     vdso_fallback_gettime()
  for this clock, which issues a syscall with an unacceptably high
  latency (minimum measurable time, i.e. the time between consecutive
  measurements) of 300-700ns on two 2.8-3.9GHz Haswell x86_64
  (Family_Model 06_3C) machines under various versions of Linux.

  Sometimes, particularly when correlating elapsed time to performance
  counter values, user-space  code needs to know elapsed time from the
  perspective of the CPU no matter how "hot" / fast or "cold" / slow it
  might be running wrt NTP / PTP "real" time; when code needs this,
  the latencies associated with a syscall are often unacceptably high.

  I reported this as Bug #198161 :
    'https://bugzilla.kernel.org/show_bug.cgi?id=198961'
  and in previous posts with subjects matching 'CLOCK_MONOTONIC_RAW' .

  This patch handles CLOCK_MONOTONIC_RAW clock_gettime() in the VDSO ,
  by exporting the raw clock calibration, last cycles, last xtime_nsec,
  and last raw_sec value in the vsyscall_gtod_data during vsyscall_update() .

  Now the new do_monotonic_raw() function in the vDSO has a latency of about
  24ns on average, and the test program:
   tools/testing/selftests/timers/inconsistency-check.c
  succeeds with arguments: '-c 4 -t 120' or any arbitrary -t value.

  The patch is against Linus' latest 4.16-rc5 tree,
  current HEAD of :
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  .

  This patch affects only files:

   arch/x86/include/asm/vgtod.h
   arch/x86/entry/vdso/vclock_gettime.c
   arch/x86/entry/vdso/vdso.lds.S
   arch/x86/entry/vdso/vdsox32.lds.S
   arch/x86/entry/vdso/vdso32/vdso32.lds.S
   arch/x86/entry/vsyscall/vsyscall_gtod.c

  There are 3 patches in the series :

   Patch #1 makes the VDSO handle clock_gettime(CLOCK_MONOTONIC_RAW) with rdtsc_ordered()

  Patches #2 & #3 should be considered "optional" :

   Patch #2 makes the VDSO handle clock_gettime(CLOCK_MONOTONIC_RAW) with a new rdtscp() function in msr.h

   Patch #3 makes the VDSO export TSC calibration data via a new function in the vDSO:
               unsigned int __vdso_linux_tsc_calibration ( struct linux_tsc_calibration *tsc_cal )
            that user code can optionally call.


   Patch #2 makes clock_gettime(CLOCK_MONOTONIC_RAW) calls somewhat faster
   than clock_gettime(CLOCK_MONOTONIC) calls.

   I think something like Patch #3 is necessary to export TSC calibration data to user-space TSC readers.

   It is entirely up to the kernel developers whether they want to include patches
   #2 and #3, but I think something like Patch #1 really needs to get into a future
   Linux release, as an unnecessary latency of 200-1000ns for a timer that can tick
   3 times per nanosecond is unacceptable.

   Patches for kernels 3.10.0-21 and 4.9.65-rt23 (ARM) are attached to bug #198161. 


Thanks & Best Regards,
Jason Vas Dias


end of thread, other threads:[~2018-03-16  6:12 UTC | newest]

Thread overview: 8+ messages
2018-03-14  4:20 [PATCH v4.16-rc5 (3)] x86/vdso: on Intel, VDSO should handle CLOCK_MONOTONIC_RAW jason.vas.dias
2018-03-14  4:20 ` [PATCH v4.16-rc5 1/3] " jason.vas.dias
2018-03-14 14:27   ` Thomas Gleixner
2018-03-16  6:11   ` kbuild test robot
2018-03-14  4:20 ` [PATCH v4.16-rc5 2/3] " jason.vas.dias
2018-03-14 14:48   ` Thomas Gleixner
2018-03-14  4:20 ` [PATCH v4.16-rc5 3/3] " jason.vas.dias
2018-03-15 16:00 [PATCH v4.16-rc5 (3)] " jason.vas.dias
2018-03-15 16:00 ` [PATCH v4.16-rc5 2/3] " jason.vas.dias
