All of lore.kernel.org
 help / color / mirror / Atom feed
* [Bug 217516] New: FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test
@ 2023-06-01 12:04 bugzilla-daemon
  2023-06-01 12:10 ` [Bug 217516] " bugzilla-daemon
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: bugzilla-daemon @ 2023-06-01 12:04 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=217516

            Bug ID: 217516
           Summary: FAIL: TSC reference precision test when do
                    hyperv_clock test of kvm unit test
           Product: Virtualization
           Version: unspecified
          Hardware: Intel
                OS: Linux
            Status: NEW
          Severity: high
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: zjuysxie@hotmail.com
        Regression: No

when i do kvm unit test with -cpu host,hv_time it failed sometime(e.g. not
everytime) with error info "FAIL: TSC reference precision"

detail test failed log:
# timeout -k 1s --foreground 90s /usr/libexec/qemu-kvm --no-reboot -nodefaults
-device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none
-serial stdio -device pci-testdev -machine accel=kvm -kernel
x86/hyperv_clock.flat -cpu host,hv_time # -initrd /tmp/tmp.kFgWZZHuFw
enabling apic
smp: waiting for 0 APs
paging enabled
cr0 = 80010011
cr3 = 1007000
cr4 = 20
PASS: MSR value after enabling
scale: 10624e40e147ae1 offset: -96899
refcnt 1523884, TSC 1826d03c, TSC reference 1523889
refcnt 21523885 (delta 20000001), TSC 1422cb9da, TSC reference 21523888 (delta
19999999)
delta on CPU 0 was 3...86
PASS: TSC reference precision test
iterations/sec:  47979415
PASS: MSR value after disabling
[root@iZbp1dwjy9g3o14k8rot75Z kvm-unit-tests]# timeout -k 1s --foreground 90s
/usr/libexec/qemu-kvm --no-reboot -nodefaults -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device
pci-testdev -machine accel=kvm -kernel x86/hyperv_clock.flat -cpu host,hv_time
# -initrd /tmp/tmp.kFgWZZHuFw
enabling apic
smp: waiting for 0 APs
paging enabled
cr0 = 80010011
cr3 = 1007000
cr4 = 20
PASS: MSR value after enabling
scale: 10624e40e147ae1 offset: -94820
refcnt 1523422, TSC 181d1f72, TSC reference 1523427
refcnt 21523428 (delta 20000006), TSC 142230e40, TSC reference 21523432 (delta
20000005)
suspecting drift on CPU 0? delta = 75, acceptable [0, 45)
FAIL: TSC reference precision test
iterations/sec:  47972169
PASS: MSR value after disabling


os info:
# cat /etc/os-release
NAME="CentOS Linux"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"

# cat /etc/redhat-release
CentOS Linux release 8.5.2111

# uname -r
4.18.0-348.7.1.el8_5.x86_64

I use kvm unit test code with branch master at commit:
commit 02d8befe99f8205d4caea402d8b0800354255681 (HEAD -> master, origin/master,
origin/HEAD)
Author: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Date:   Tue Apr 4 20:50:47 2023 +0200

    pretty_print_stacks: modify relative path calculation


cpu info:
# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
stepping        : 7
microcode       : 0x5003102
cpu MHz         : 3800.000
cache size      : 36608 KB
physical id     : 0
siblings        : 52
core id         : 0
cpu cores       : 26
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb
stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq
rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt
xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm
ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs taa
itlb_multihit
bogomips        : 5000.00
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

[...]

processor       : 103
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
stepping        : 7
microcode       : 0x5003102
cpu MHz         : 3800.000
cache size      : 36608 KB
physical id     : 1
siblings        : 52
core id         : 29
cpu cores       : 26
apicid          : 123
initial apicid  : 123
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb
stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq
rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt
xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm
ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs taa
itlb_multihit
bogomips        : 5006.84
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug 217516] FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test
  2023-06-01 12:04 [Bug 217516] New: FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test bugzilla-daemon
@ 2023-06-01 12:10 ` bugzilla-daemon
  2023-06-02 12:13 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2023-06-01 12:10 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=217516

--- Comment #1 from Ethan Xie (zjuysxie@hotmail.com) ---
qemu version:
# rpm -qa | grep qemu-kvm
qemu-kvm-common-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-block-gluster-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-block-curl-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-block-iscsi-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-block-rbd-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-block-ssh-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64
qemu-kvm-core-4.2.0-59.module_el8.5.0+1063+c9b9feff.1.x86_64

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug 217516] FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test
  2023-06-01 12:04 [Bug 217516] New: FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test bugzilla-daemon
  2023-06-01 12:10 ` [Bug 217516] " bugzilla-daemon
@ 2023-06-02 12:13 ` bugzilla-daemon
  2023-06-05  3:17 ` bugzilla-daemon
  2023-06-12  9:20 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2023-06-02 12:13 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=217516

Bagas Sanjaya (bagasdotme@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bagasdotme@gmail.com

--- Comment #2 from Bagas Sanjaya (bagasdotme@gmail.com) ---
(In reply to Ethan Xie from comment #0)
> # cat /etc/redhat-release
> CentOS Linux release 8.5.2111
> 
> # uname -r
> 4.18.0-348.7.1.el8_5.x86_64
> 

CentOS 8 has been EOLed in 2021. Can you test with Rocky Linux 9 instead
(and preferably latest mainline kernel)?

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug 217516] FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test
  2023-06-01 12:04 [Bug 217516] New: FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test bugzilla-daemon
  2023-06-01 12:10 ` [Bug 217516] " bugzilla-daemon
  2023-06-02 12:13 ` bugzilla-daemon
@ 2023-06-05  3:17 ` bugzilla-daemon
  2023-06-12  9:20 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2023-06-05  3:17 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=217516

--- Comment #3 from Ethan Xie (zjuysxie@hotmail.com) ---
(In reply to Bagas Sanjaya from comment #2)
> (In reply to Ethan Xie from comment #0)
> > # cat /etc/redhat-release
> > CentOS Linux release 8.5.2111
> > 
> > # uname -r
> > 4.18.0-348.7.1.el8_5.x86_64
> > 
> 
> CentOS 8 has been EOLed in 2021. Can you test with Rocky Linux 9 instead
> (and preferably latest mainline kernel)?

I change the OS to Rocky Linux 9 with mainline linux kernel, it also can
reproduce this problem:

# timeout -k 1s --foreground 90s /usr/libexec/qemu-kvm --no-reboot -nodefaults
-device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none
-serial stdio -device pci-testdev -machine accel=kvm -kernel
x86/hyperv_clock.flat -cpu host,hv_time # -initrd /tmp/tmp.kFgWZZHuFw
qemu-kvm: warning: Machine type 'pc-i440fx-rhel7.6.0' is deprecated: machine
types for previous major releases are deprecated
enabling apic
smp: waiting for 0 APs
paging enabled
cr0 = 80010011
cr3 = 1007000
cr4 = 20
PASS: MSR value after enabling
scale: 10624dd2e147ae1 offset: -18011
refcnt 257287, TSC 41a33e4, TSC reference 257293
refcnt 20257287 (delta 20000000), TSC 12e2025fc, TSC reference 20257293 (delta
20000000)
suspecting drift on CPU 0? delta = 103, acceptable [0, 52)
FAIL: TSC reference precision test
iterations/sec:  47953642
PASS: MSR value after disabling

# uname -r
6.4.0-rc5

# cat /etc/os-release
NAME="Rocky Linux"
VERSION="9.0 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.0"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.0 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.0"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.0"

# cat /etc/redhat-release
Rocky Linux release 9.0 (Blue Onyx)

# rpm -qa | grep qemu-kvm
qemu-kvm-docs-7.2.0-14.el9_2.x86_64
qemu-kvm-common-7.2.0-14.el9_2.x86_64
qemu-kvm-device-display-virtio-gpu-7.2.0-14.el9_2.x86_64
qemu-kvm-device-display-virtio-gpu-pci-7.2.0-14.el9_2.x86_64
qemu-kvm-device-usb-host-7.2.0-14.el9_2.x86_64
qemu-kvm-device-display-virtio-vga-7.2.0-14.el9_2.x86_64
qemu-kvm-block-rbd-7.2.0-14.el9_2.x86_64
qemu-kvm-tools-7.2.0-14.el9_2.x86_64
qemu-kvm-device-usb-redirect-7.2.0-14.el9_2.x86_64
qemu-kvm-audio-pa-7.2.0-14.el9_2.x86_64
qemu-kvm-core-7.2.0-14.el9_2.x86_64
qemu-kvm-ui-opengl-7.2.0-14.el9_2.x86_64
qemu-kvm-ui-egl-headless-7.2.0-14.el9_2.x86_64
qemu-kvm-7.2.0-14.el9_2.x86_64


linux from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
master branch with commit:
commit 9561de3a55bed6bdd44a12820ba81ec416e705a7 (HEAD -> master, tag: v6.4-rc5,
origin/master, origin/HEAD)
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 4 14:04:27 2023 -0400

    Linux 6.4-rc5

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug 217516] FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test
  2023-06-01 12:04 [Bug 217516] New: FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test bugzilla-daemon
                   ` (2 preceding siblings ...)
  2023-06-05  3:17 ` bugzilla-daemon
@ 2023-06-12  9:20 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2023-06-12  9:20 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=217516

vkuznets@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini@gnu.org,
                   |                            |vkuznets@redhat.com

--- Comment #4 from vkuznets@redhat.com ---
It seems this is just an unstable test. It merges the divergence between MSR
based clock and TSC page over one second and then expects delta to stay within
the measured range over another two seconds. This works well for a completely
idle system but if tasks get scheduled out, rescheduled to a different CPU,...
the test fails. Widening the range help, e.g.:

diff --git a/x86/hyperv_clock.c b/x86/hyperv_clock.c
index f1e7204a8ea9..57d25770a2d0 100644
--- a/x86/hyperv_clock.c
+++ b/x86/hyperv_clock.c
@@ -79,7 +79,7 @@ static void hv_clock_test(void *data)
                min_delta = delta < min_delta ? delta : min_delta;
                if (t < msr_sample) {
                        max_delta = delta > max_delta ? delta: max_delta;
-               } else if (delta < 0 || delta > max_delta * 3 / 2) {
+               } else if (delta < 0 || delta > max_delta * 1024) {
                        printf("suspecting drift on CPU %d? delta = %d,
acceptable [0, %d)\n", smp_id(),
                               delta, max_delta);
                        ok[i] = false;

but I wouldn't be surprised if on a busy system even '1024 * max_delta' is not
going to be sufficient. Maybe we should make this a warning and not fail the
whole test as I don't see
how we can make it reliable.

Paolo (as you're the author), wdyt?

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-06-12  9:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-01 12:04 [Bug 217516] New: FAIL: TSC reference precision test when do hyperv_clock test of kvm unit test bugzilla-daemon
2023-06-01 12:10 ` [Bug 217516] " bugzilla-daemon
2023-06-02 12:13 ` bugzilla-daemon
2023-06-05  3:17 ` bugzilla-daemon
2023-06-12  9:20 ` bugzilla-daemon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.