All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] kvm-unit-tests: add check parameter to run_tests configuration
@ 2014-08-26 18:29 Chris J Arges
  2014-08-26 18:29 ` [PATCH 1/2] " Chris J Arges
  2014-08-26 18:29 ` [PATCH 2/2] x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is disabled Chris J Arges
  0 siblings, 2 replies; 25+ messages in thread
From: Chris J Arges @ 2014-08-26 18:29 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, Chris J Arges

These two patches add the ability to check for host machine conditions
before running a particular test. This allows the configuration file to have
a mechanism to skip false negatives.

Chris J Arges (2):
  add check parameter to run_tests configuration
  x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is
    disabled

 run_tests.sh      | 21 +++++++++++++++++++--
 x86/unittests.cfg |  1 +
 2 files changed, 20 insertions(+), 2 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] add check parameter to run_tests configuration
  2014-08-26 18:29 [PATCH 0/2] kvm-unit-tests: add check parameter to run_tests configuration Chris J Arges
@ 2014-08-26 18:29 ` Chris J Arges
  2014-08-27 13:29   ` Paolo Bonzini
  2014-08-26 18:29 ` [PATCH 2/2] x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is disabled Chris J Arges
  1 sibling, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-08-26 18:29 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, Chris J Arges

In unittests.cfg one can add a line like the following:
check = /proc/sys/kernel/nmi_watchdog=0 /proc/sys/kernel/ostype=Linux

run_tests.sh will now check for those values (if defined) and only run
the test if all conditions are true.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 run_tests.sh | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 4758573..d37e0ec 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -18,6 +18,7 @@ function run()
     local kernel="$4"
     local opts="$5"
     local arch="$6"
+    local check="$7"
 
     if [ -z "$testname" ]; then
         return
@@ -32,6 +33,18 @@ function run()
         return
     fi
 
+    # check a file for a particular value before running a test
+    # the check line can contain multiple files to check separated by a space
+    # but each check parameter needs to be of the form <path>=<value>
+    for check_param in ${check[@]}; do
+        path=${check_param%%=*}
+        value=${check_param#*=}
+        if [ $path ] && [[ $(cat $path) != $value ]]; then
+            echo "skip $1 ($path not equal to $value)"
+            return
+        fi
+    done
+
     cmdline="./$TEST_DIR-run $kernel -smp $smp $opts"
     if [ $verbose != 0 ]; then
         echo $cmdline
@@ -57,18 +70,20 @@ function run_all()
     local opts
     local groups
     local arch
+    local check
 
     exec {config_fd}<$config
 
     while read -u $config_fd line; do
         if [[ "$line" =~ ^\[(.*)\]$ ]]; then
-            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
+            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
             testname=${BASH_REMATCH[1]}
             smp=1
             kernel=""
             opts=""
             groups=""
             arch=""
+            check=""
         elif [[ $line =~ ^file\ *=\ *(.*)$ ]]; then
             kernel=$TEST_DIR/${BASH_REMATCH[1]}
         elif [[ $line =~ ^smp\ *=\ *(.*)$ ]]; then
@@ -79,10 +94,12 @@ function run_all()
             groups=${BASH_REMATCH[1]}
         elif [[ $line =~ ^arch\ *=\ *(.*)$ ]]; then
             arch=${BASH_REMATCH[1]}
+        elif [[ $line =~ ^check\ *=\ *(.*)$ ]]; then
+            check=${BASH_REMATCH[1]}
         fi
     done
 
-    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
+    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
 
     exec {config_fd}<&-
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is disabled
  2014-08-26 18:29 [PATCH 0/2] kvm-unit-tests: add check parameter to run_tests configuration Chris J Arges
  2014-08-26 18:29 ` [PATCH 1/2] " Chris J Arges
@ 2014-08-26 18:29 ` Chris J Arges
  2014-08-27 13:29   ` Paolo Bonzini
  1 sibling, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-08-26 18:29 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, Chris J Arges

If nmi_watchdog is enabled, it will take up a PMU counter causing the
all_counters testcase to fail. This additional check will error out if
nmi_watchdog is enabled and provide feedback for the user to configure the
host machine correctly.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 x86/unittests.cfg | 1 +
 1 file changed, 1 insertion(+)

diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 0123944..badb08a 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -92,6 +92,7 @@ file = msr.flat
 [pmu]
 file = pmu.flat
 extra_params = -cpu host
+check = /proc/sys/kernel/nmi_watchdog=0
 
 [port80]
 file = port80.flat
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] add check parameter to run_tests configuration
  2014-08-26 18:29 ` [PATCH 1/2] " Chris J Arges
@ 2014-08-27 13:29   ` Paolo Bonzini
  2014-08-27 14:42     ` [PATCH 1/2 v2] " Chris J Arges
  2014-08-27 14:45     ` [PATCH 1/2 v3] " Chris J Arges
  0 siblings, 2 replies; 25+ messages in thread
From: Paolo Bonzini @ 2014-08-27 13:29 UTC (permalink / raw)
  To: Chris J Arges, kvm

Il 26/08/2014 20:29, Chris J Arges ha scritto:
> +        path=${check_param%%=*}
> +        value=${check_param#*=}
> +        if [ $path ] && [[ $(cat $path) != $value ]]; then

[[ ]] is a bashism, please use [ ].

Also, please include all operands of [ ] within double quotes.

Paolo

> +            echo "skip $1 ($path not equal to $value)"
> +            return
> +        fi


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is disabled
  2014-08-26 18:29 ` [PATCH 2/2] x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is disabled Chris J Arges
@ 2014-08-27 13:29   ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2014-08-27 13:29 UTC (permalink / raw)
  To: Chris J Arges, kvm

Il 26/08/2014 20:29, Chris J Arges ha scritto:
> If nmi_watchdog is enabled, it will take up a PMU counter causing the
> all_counters testcase to fail. This additional check will error out if
> nmi_watchdog is enabled and provide feedback for the user to configure the
> host machine correctly.
> 
> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
> ---
>  x86/unittests.cfg | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/x86/unittests.cfg b/x86/unittests.cfg
> index 0123944..badb08a 100644
> --- a/x86/unittests.cfg
> +++ b/x86/unittests.cfg
> @@ -92,6 +92,7 @@ file = msr.flat
>  [pmu]
>  file = pmu.flat
>  extra_params = -cpu host
> +check = /proc/sys/kernel/nmi_watchdog=0
>  
>  [port80]
>  file = port80.flat
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2 v2] add check parameter to run_tests configuration
  2014-08-27 13:29   ` Paolo Bonzini
@ 2014-08-27 14:42     ` Chris J Arges
  2014-08-27 14:45     ` [PATCH 1/2 v3] " Chris J Arges
  1 sibling, 0 replies; 25+ messages in thread
From: Chris J Arges @ 2014-08-27 14:42 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini

In unittests.cfg one can add a line like the following:
check = /proc/sys/kernel/nmi_watchdog=0 /proc/sys/kernel/ostype=Linux

run_tests.sh will now check for those values (if defined) and only run
the test if all conditions are true.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 run_tests.sh | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 4758573..d37e0ec 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -18,6 +18,7 @@ function run()
     local kernel="$4"
     local opts="$5"
     local arch="$6"
+    local check="$7"
 
     if [ -z "$testname" ]; then
         return
@@ -32,6 +33,18 @@ function run()
         return
     fi
 
+    # check a file for a particular value before running a test
+    # the check line can contain multiple files to check separated by a space
+    # but each check parameter needs to be of the form <path>=<value>
+    for check_param in ${check[@]}; do
+        path=${check_param%%=*}
+        value=${check_param#*=}
+        if [ $path ] && [[ $(cat $path) != $value ]]; then
+            echo "skip $1 ($path not equal to $value)"
+            return
+        fi
+    done
+
     cmdline="./$TEST_DIR-run $kernel -smp $smp $opts"
     if [ $verbose != 0 ]; then
         echo $cmdline
@@ -57,18 +70,20 @@ function run_all()
     local opts
     local groups
     local arch
+    local check
 
     exec {config_fd}<$config
 
     while read -u $config_fd line; do
         if [[ "$line" =~ ^\[(.*)\]$ ]]; then
-            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
+            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
             testname=${BASH_REMATCH[1]}
             smp=1
             kernel=""
             opts=""
             groups=""
             arch=""
+            check=""
         elif [[ $line =~ ^file\ *=\ *(.*)$ ]]; then
             kernel=$TEST_DIR/${BASH_REMATCH[1]}
         elif [[ $line =~ ^smp\ *=\ *(.*)$ ]]; then
@@ -79,10 +94,12 @@ function run_all()
             groups=${BASH_REMATCH[1]}
         elif [[ $line =~ ^arch\ *=\ *(.*)$ ]]; then
             arch=${BASH_REMATCH[1]}
+        elif [[ $line =~ ^check\ *=\ *(.*)$ ]]; then
+            check=${BASH_REMATCH[1]}
         fi
     done
 
-    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
+    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
 
     exec {config_fd}<&-
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 1/2 v3] add check parameter to run_tests configuration
  2014-08-27 13:29   ` Paolo Bonzini
  2014-08-27 14:42     ` [PATCH 1/2 v2] " Chris J Arges
@ 2014-08-27 14:45     ` Chris J Arges
  2014-08-27 14:52       ` Paolo Bonzini
  1 sibling, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-08-27 14:45 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, Chris J Arges

In unittests.cfg one can add a line like the following:
check = /proc/sys/kernel/nmi_watchdog=0 /proc/sys/kernel/ostype=Linux

run_tests.sh will now check for those values (if defined) and only run
the test if all conditions are true.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
---
 run_tests.sh | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 4758573..e48f1db 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -18,6 +18,7 @@ function run()
     local kernel="$4"
     local opts="$5"
     local arch="$6"
+    local check="$7"
 
     if [ -z "$testname" ]; then
         return
@@ -32,6 +33,18 @@ function run()
         return
     fi
 
+    # check a file for a particular value before running a test
+    # the check line can contain multiple files to check separated by a space
+    # but each check parameter needs to be of the form <path>=<value>
+    for check_param in ${check[@]}; do
+        path=${check_param%%=*}
+        value=${check_param#*=}
+        if [ "$path" ] && [ "$(cat $path)" != "$value" ]; then
+            echo "skip $1 ($path not equal to $value)"
+            return
+        fi
+    done
+
     cmdline="./$TEST_DIR-run $kernel -smp $smp $opts"
     if [ $verbose != 0 ]; then
         echo $cmdline
@@ -57,18 +70,20 @@ function run_all()
     local opts
     local groups
     local arch
+    local check
 
     exec {config_fd}<$config
 
     while read -u $config_fd line; do
         if [[ "$line" =~ ^\[(.*)\]$ ]]; then
-            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
+            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
             testname=${BASH_REMATCH[1]}
             smp=1
             kernel=""
             opts=""
             groups=""
             arch=""
+            check=""
         elif [[ $line =~ ^file\ *=\ *(.*)$ ]]; then
             kernel=$TEST_DIR/${BASH_REMATCH[1]}
         elif [[ $line =~ ^smp\ *=\ *(.*)$ ]]; then
@@ -79,10 +94,12 @@ function run_all()
             groups=${BASH_REMATCH[1]}
         elif [[ $line =~ ^arch\ *=\ *(.*)$ ]]; then
             arch=${BASH_REMATCH[1]}
+        elif [[ $line =~ ^check\ *=\ *(.*)$ ]]; then
+            check=${BASH_REMATCH[1]}
         fi
     done
 
-    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
+    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
 
     exec {config_fd}<&-
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2 v3] add check parameter to run_tests configuration
  2014-08-27 14:45     ` [PATCH 1/2 v3] " Chris J Arges
@ 2014-08-27 14:52       ` Paolo Bonzini
  2014-08-27 21:24         ` kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration) Chris J Arges
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-08-27 14:52 UTC (permalink / raw)
  To: Chris J Arges, kvm

Il 27/08/2014 16:45, Chris J Arges ha scritto:
> In unittests.cfg one can add a line like the following:
> check = /proc/sys/kernel/nmi_watchdog=0 /proc/sys/kernel/ostype=Linux
> 
> run_tests.sh will now check for those values (if defined) and only run
> the test if all conditions are true.
> 
> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
> ---
>  run_tests.sh | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/run_tests.sh b/run_tests.sh
> index 4758573..e48f1db 100755
> --- a/run_tests.sh
> +++ b/run_tests.sh
> @@ -18,6 +18,7 @@ function run()
>      local kernel="$4"
>      local opts="$5"
>      local arch="$6"
> +    local check="$7"
>  
>      if [ -z "$testname" ]; then
>          return
> @@ -32,6 +33,18 @@ function run()
>          return
>      fi
>  
> +    # check a file for a particular value before running a test
> +    # the check line can contain multiple files to check separated by a space
> +    # but each check parameter needs to be of the form <path>=<value>
> +    for check_param in ${check[@]}; do
> +        path=${check_param%%=*}
> +        value=${check_param#*=}
> +        if [ "$path" ] && [ "$(cat $path)" != "$value" ]; then
> +            echo "skip $1 ($path not equal to $value)"
> +            return
> +        fi
> +    done
> +
>      cmdline="./$TEST_DIR-run $kernel -smp $smp $opts"
>      if [ $verbose != 0 ]; then
>          echo $cmdline
> @@ -57,18 +70,20 @@ function run_all()
>      local opts
>      local groups
>      local arch
> +    local check
>  
>      exec {config_fd}<$config
>  
>      while read -u $config_fd line; do
>          if [[ "$line" =~ ^\[(.*)\]$ ]]; then
> -            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
> +            run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
>              testname=${BASH_REMATCH[1]}
>              smp=1
>              kernel=""
>              opts=""
>              groups=""
>              arch=""
> +            check=""
>          elif [[ $line =~ ^file\ *=\ *(.*)$ ]]; then
>              kernel=$TEST_DIR/${BASH_REMATCH[1]}
>          elif [[ $line =~ ^smp\ *=\ *(.*)$ ]]; then
> @@ -79,10 +94,12 @@ function run_all()
>              groups=${BASH_REMATCH[1]}
>          elif [[ $line =~ ^arch\ *=\ *(.*)$ ]]; then
>              arch=${BASH_REMATCH[1]}
> +        elif [[ $line =~ ^check\ *=\ *(.*)$ ]]; then
> +            check=${BASH_REMATCH[1]}
>          fi
>      done
>  
> -    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch"
> +    run "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check"
>  
>      exec {config_fd}<&-
>  }
> 

Thanks, looks good.  Are there more failures?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration)
  2014-08-27 14:52       ` Paolo Bonzini
@ 2014-08-27 21:24         ` Chris J Arges
  2014-08-27 22:05           ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-08-27 21:24 UTC (permalink / raw)
  To: Paolo Bonzini, kvm

<snip>
> Thanks, looks good.  Are there more failures?
> 
> Paolo
> 

Paolo,
Thanks for applying those patches!

I now only see the two failures on my machine:
model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz

I'm running with the tip of kvm master:
0ac625df43ce9d085d4ff54c1f739611f4308b13 (Merge tag 'kvm-s390-20140825')

sudo ./x86-run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline |
grep -v PASS
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/apic.flat -smp 2 -cpu
qemu64,+x2apic,+tsc-deadline
enabling apic
enabling apic
paging enabled
cr0 = 80010011
cr3 = 7fff000
cr4 = 20
apic version: 1050014
x2apic enabled
FAIL: tsc deadline timer clearing
tsc deadline timer enabled

SUMMARY: 16 tests, 1 unexpected failures
Return value from qemu: 3

sudo ./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`"
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
10000000 1409174399
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44d4c0
kvm-clock: cpu 0, msr 0x:44d4c0
Wallclock test, threshold 5
Seconds get from host:     1409174399
Seconds get from kvmclock: 1409173176
Offset:                    -1223
offset too large!
Check the stability of raw cycle ...
Worst warp -1222831419357
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -1222831419357
Raw cycle is not stable
Monotonic cycle test:
Worst warp -1219118621614
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -1219118621614
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1065145046
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1126981511
Return value from qemu: 3

Let me know if anything comes to mind. I can also look more deeply into
these failures.

Thanks,
--chris j arges

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration)
  2014-08-27 21:24         ` kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration) Chris J Arges
@ 2014-08-27 22:05           ` Paolo Bonzini
  2014-08-29 17:36             ` kvm-unit-test failures Chris J Arges
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-08-27 22:05 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm



----- Messaggio originale -----
> Da: "Chris J Arges" <chris.j.arges@canonical.com>
> A: "Paolo Bonzini" <pbonzini@redhat.com>, kvm@vger.kernel.org
> Inviato: Mercoledì, 27 agosto 2014 23:24:14
> Oggetto: kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration)
> 
> <snip>
> > Thanks, looks good.  Are there more failures?
> > 
> > Paolo
> > 
> 
> Paolo,
> Thanks for applying those patches!
> 
> I now only see the two failures on my machine:
> model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
> 
> I'm running with the tip of kvm master:
> 0ac625df43ce9d085d4ff54c1f739611f4308b13 (Merge tag 'kvm-s390-20140825')
> 
> sudo ./x86-run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline |
> grep -v PASS
> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
> -device pci-testdev -kernel x86/apic.flat -smp 2 -cpu
> qemu64,+x2apic,+tsc-deadline
> enabling apic
> enabling apic
> paging enabled
> cr0 = 80010011
> cr3 = 7fff000
> cr4 = 20
> apic version: 1050014
> x2apic enabled
> FAIL: tsc deadline timer clearing
> tsc deadline timer enabled

This is fixed in kvm/next (3.18).

> SUMMARY: 16 tests, 1 unexpected failures
> Return value from qemu: 3
> 
> sudo ./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`"
> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
> -device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
> 10000000 1409174399
> enabling apic
> enabling apic
> kvm-clock: cpu 0, msr 0x:44d4c0
> kvm-clock: cpu 0, msr 0x:44d4c0
> Wallclock test, threshold 5
> Seconds get from host:     1409174399
> Seconds get from kvmclock: 1409173176
> Offset:                    -1223

Weird, your clock is 20 minutes behind in the VM than it
is in the host.  Is the offset always around -1200?  What
happens if you reboot?

(I get 0, 1 or sometimes 2).

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-08-27 22:05           ` Paolo Bonzini
@ 2014-08-29 17:36             ` Chris J Arges
  2014-08-29 21:05               ` Chris J Arges
  2014-08-29 21:08               ` Paolo Bonzini
  0 siblings, 2 replies; 25+ messages in thread
From: Chris J Arges @ 2014-08-29 17:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

On 08/27/2014 05:05 PM, Paolo Bonzini wrote:
> 
> 
> ----- Messaggio originale -----
>> Da: "Chris J Arges" <chris.j.arges@canonical.com>
>> A: "Paolo Bonzini" <pbonzini@redhat.com>, kvm@vger.kernel.org
>> Inviato: Mercoledì, 27 agosto 2014 23:24:14
>> Oggetto: kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration)
>>
>> <snip>
>>> Thanks, looks good.  Are there more failures?
>>>
>>> Paolo
>>>
>>
>> Paolo,
>> Thanks for applying those patches!
>>
>> I now only see the two failures on my machine:
>> model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
>>
>> I'm running with the tip of kvm master:
>> 0ac625df43ce9d085d4ff54c1f739611f4308b13 (Merge tag 'kvm-s390-20140825')
>>
>> sudo ./x86-run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline |
>> grep -v PASS
>> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
>> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
>> -device pci-testdev -kernel x86/apic.flat -smp 2 -cpu
>> qemu64,+x2apic,+tsc-deadline
>> enabling apic
>> enabling apic
>> paging enabled
>> cr0 = 80010011
>> cr3 = 7fff000
>> cr4 = 20
>> apic version: 1050014
>> x2apic enabled
>> FAIL: tsc deadline timer clearing
>> tsc deadline timer enabled
> 
> This is fixed in kvm/next (3.18).
> 
>> SUMMARY: 16 tests, 1 unexpected failures
>> Return value from qemu: 3
>>
>> sudo ./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`"
>> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
>> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
>> -device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
>> 10000000 1409174399
>> enabling apic
>> enabling apic
>> kvm-clock: cpu 0, msr 0x:44d4c0
>> kvm-clock: cpu 0, msr 0x:44d4c0
>> Wallclock test, threshold 5
>> Seconds get from host:     1409174399
>> Seconds get from kvmclock: 1409173176
>> Offset:                    -1223
> 
> Weird, your clock is 20 minutes behind in the VM than it
> is in the host.  Is the offset always around -1200?  What
> happens if you reboot?
> 
> (I get 0, 1 or sometimes 2).
> 
> Paolo
> 

Hi Paolo,

Results building with kvm queue tree
(fd2752352bbc98850d83b5448a288d8991590317):
CPU:
model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz

I still get failures with the following test, I actually tested on
multiple machines with identical hardware and the same failure occurred.
In v3.13/v3.16 series kernels this passes. I'll look into which commit
changed this result for me. I suspect it was fairly recent.

./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`" |
grep -v PASS
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
10000000 1409160326
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44d520
kvm-clock: cpu 0, msr 0x:44d520
Wallclock test, threshold 5
Seconds get from host:     1409160326
Seconds get from kvmclock: 1409153484
Offset:                    -6842
offset too large!
Check the stability of raw cycle ...
Worst warp -6841795339348
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -6841795339348
Raw cycle is not stable
Monotonic cycle test:
Worst warp -6836691572679
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -6836691572679
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1098400654
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1106302952
Return value from qemu: 3

This is another test that fails or hangs, this passes in 3.13, but fails
on 3.16 with my testing. I'll dig into this more perhaps to find out
which commit changes things.

./x86-run x86/vmx.flat -smp 1 -cpu host,+vmx | grep -v PASS
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/vmx.flat -smp 1 -cpu host,+vmx
enabling apic
paging enabled
cr0 = 80010011
cr3 = 7fff000
cr4 = 20

Test suite: VMX capability reporting

Test suite: vmenter

Test suite: preemption timer

Test suite: control field PAT

Test suite: control field EFER

Test suite: CR shadowing

Test suite: I/O bitmap

Test suite: instruction intercept

Test suite: EPT framework
FAIL: EPT violation - paging structure

Test suite: interrupt
`ASS: running a guest with interrupt acknowledgement set

Test suite: debug controls

SUMMARY: 109 tests, 1 unexpected failures
Return value from qemu: 3

Here is the test case hanging:

./x86-run x86/vmx.flat -smp 1 -cpu host,+vmx | grep -v PASS
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/vmx.flat -smp 1 -cpu host,+vmx
enabling apic
paging enabled
cr0 = 80010011
cr3 = 7fff000
cr4 = 20

Test suite: VMX capability reporting

Test suite: vmenter

Test suite: preemption timer

Test suite: control field PAT

Test suite: control field EFER

Test suite: CR shadowing

Test suite: I/O bitmap

Test suite: instruction intercept

Test suite: EPT framework
FAIL: EPT violation - paging structure
FAIL: EPT - MMIO access

Test suite: interrupt
FAIL: intercepted interrupt + hlt

Thanks,
--chris j arges

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-08-29 17:36             ` kvm-unit-test failures Chris J Arges
@ 2014-08-29 21:05               ` Chris J Arges
  2014-08-31 16:05                 ` Paolo Bonzini
  2014-08-29 21:08               ` Paolo Bonzini
  1 sibling, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-08-29 21:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm



On 08/29/2014 12:36 PM, Chris J Arges wrote:
> On 08/27/2014 05:05 PM, Paolo Bonzini wrote:
>>
>>
>> ----- Messaggio originale -----
>>> Da: "Chris J Arges" <chris.j.arges@canonical.com>
>>> A: "Paolo Bonzini" <pbonzini@redhat.com>, kvm@vger.kernel.org
>>> Inviato: Mercoledì, 27 agosto 2014 23:24:14
>>> Oggetto: kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration)
>>>
>>> <snip>
>>>> Thanks, looks good.  Are there more failures?
>>>>
>>>> Paolo
>>>>
>>>
>>> Paolo,
>>> Thanks for applying those patches!
>>>
>>> I now only see the two failures on my machine:
>>> model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
>>>
>>> I'm running with the tip of kvm master:
>>> 0ac625df43ce9d085d4ff54c1f739611f4308b13 (Merge tag 'kvm-s390-20140825')
>>>
>>> sudo ./x86-run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline |
>>> grep -v PASS
>>> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
>>> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
>>> -device pci-testdev -kernel x86/apic.flat -smp 2 -cpu
>>> qemu64,+x2apic,+tsc-deadline
>>> enabling apic
>>> enabling apic
>>> paging enabled
>>> cr0 = 80010011
>>> cr3 = 7fff000
>>> cr4 = 20
>>> apic version: 1050014
>>> x2apic enabled
>>> FAIL: tsc deadline timer clearing
>>> tsc deadline timer enabled
>>
>> This is fixed in kvm/next (3.18).
>>
>>> SUMMARY: 16 tests, 1 unexpected failures
>>> Return value from qemu: 3
>>>
>>> sudo ./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`"
>>> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
>>> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
>>> -device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
>>> 10000000 1409174399
>>> enabling apic
>>> enabling apic
>>> kvm-clock: cpu 0, msr 0x:44d4c0
>>> kvm-clock: cpu 0, msr 0x:44d4c0
>>> Wallclock test, threshold 5
>>> Seconds get from host:     1409174399
>>> Seconds get from kvmclock: 1409173176
>>> Offset:                    -1223
>>
>> Weird, your clock is 20 minutes behind in the VM than it
>> is in the host.  Is the offset always around -1200?  What
>> happens if you reboot?
>>
>> (I get 0, 1 or sometimes 2).
>>
>> Paolo
>>
> 
> Hi Paolo,
> 
> Results building with kvm queue tree
> (fd2752352bbc98850d83b5448a288d8991590317):
> CPU:
> model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
> 
> I still get failures with the following test, I actually tested on
> multiple machines with identical hardware and the same failure occurred.
> In v3.13/v3.16 series kernels this passes. I'll look into which commit
> changed this result for me. I suspect it was fairly recent.
> 

I did a quick grep through the patches between v3.16 and the latest and
I found: (0d3da0d26e3c3515997c99451ce3b0ad1a69a36c KVM: x86: fix TSC
matching).

I reverted this patch and the test case passed for me. Looking at the
patch, I added an extra else statement as so:

	if (!matched) {
		kvm->arch.nr_vcpus_matched_tsc = 0;
	} else if (!already_matched) {
		kvm->arch.nr_vcpus_matched_tsc++;
	} else {
		printk("kvm: do we get here?\n");
	}

And indeed there is a condition where matched && already_matched are
both true. In this case we don't zero or increment nr_vcpus_matched_tsc.
Incrementing nr_vcpus_matched_tsc in that last else clause allows the
test to pass; however this is identical to the  logic before the patch.

Any suggestions for fixing this case?

Thanks,
--chris j arges

> ./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`" |
> grep -v PASS
> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
> -device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
> 10000000 1409160326
> enabling apic
> enabling apic
> kvm-clock: cpu 0, msr 0x:44d520
> kvm-clock: cpu 0, msr 0x:44d520
> Wallclock test, threshold 5
> Seconds get from host:     1409160326
> Seconds get from kvmclock: 1409153484
> Offset:                    -6842
> offset too large!
> Check the stability of raw cycle ...
> Worst warp -6841795339348
> Total vcpus: 2
> Test  loops: 10000000
> Total warps:  1
> Total stalls: 0
> Worst warp:   -6841795339348
> Raw cycle is not stable
> Monotonic cycle test:
> Worst warp -6836691572679
> Total vcpus: 2
> Test  loops: 10000000
> Total warps:  1
> Total stalls: 0
> Worst warp:   -6836691572679
> Measure the performance of raw cycle ...
> Total vcpus: 2
> Test  loops: 10000000
> TSC cycles:  1098400654
> Measure the performance of adjusted cycle ...
> Total vcpus: 2
> Test  loops: 10000000
> TSC cycles:  1106302952
> Return value from qemu: 3
> 
> This is another test that fails or hangs, this passes in 3.13, but fails
> on 3.16 with my testing. I'll dig into this more perhaps to find out
> which commit changes things.
> 
> ./x86-run x86/vmx.flat -smp 1 -cpu host,+vmx | grep -v PASS
> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
> -device pci-testdev -kernel x86/vmx.flat -smp 1 -cpu host,+vmx
> enabling apic
> paging enabled
> cr0 = 80010011
> cr3 = 7fff000
> cr4 = 20
> 
> Test suite: VMX capability reporting
> 
> Test suite: vmenter
> 
> Test suite: preemption timer
> 
> Test suite: control field PAT
> 
> Test suite: control field EFER
> 
> Test suite: CR shadowing
> 
> Test suite: I/O bitmap
> 
> Test suite: instruction intercept
> 
> Test suite: EPT framework
> FAIL: EPT violation - paging structure
> 
> Test suite: interrupt
> `ASS: running a guest with interrupt acknowledgement set
> 
> Test suite: debug controls
> 
> SUMMARY: 109 tests, 1 unexpected failures
> Return value from qemu: 3
> 
> Here is the test case hanging:
> 
> ./x86-run x86/vmx.flat -smp 1 -cpu host,+vmx | grep -v PASS
> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
> -device pci-testdev -kernel x86/vmx.flat -smp 1 -cpu host,+vmx
> enabling apic
> paging enabled
> cr0 = 80010011
> cr3 = 7fff000
> cr4 = 20
> 
> Test suite: VMX capability reporting
> 
> Test suite: vmenter
> 
> Test suite: preemption timer
> 
> Test suite: control field PAT
> 
> Test suite: control field EFER
> 
> Test suite: CR shadowing
> 
> Test suite: I/O bitmap
> 
> Test suite: instruction intercept
> 
> Test suite: EPT framework
> FAIL: EPT violation - paging structure
> FAIL: EPT - MMIO access
> 
> Test suite: interrupt
> FAIL: intercepted interrupt + hlt
> 
> Thanks,
> --chris j arges
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-08-29 17:36             ` kvm-unit-test failures Chris J Arges
  2014-08-29 21:05               ` Chris J Arges
@ 2014-08-29 21:08               ` Paolo Bonzini
  2014-08-29 21:18                 ` Chris J Arges
  1 sibling, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-08-29 21:08 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm

Il 29/08/2014 19:36, Chris J Arges ha scritto:
> I still get failures with the following test, I actually tested on
> multiple machines with identical hardware and the same failure occurred.
> In v3.13/v3.16 series kernels this passes. I'll look into which commit
> changed this result for me. I suspect it was fairly recent.

I would try bisecting between 0e5ac3a8b100469ea154f87dd57b685fbdd356f6
(might be bad) and 41fa4215f8e8150bdc5d2a5f8704915f1b059fa8 (might be good).

Is this a laptop or desktop?  Any relationship between the "Offset" and
how long the machine has been up?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-08-29 21:08               ` Paolo Bonzini
@ 2014-08-29 21:18                 ` Chris J Arges
  0 siblings, 0 replies; 25+ messages in thread
From: Chris J Arges @ 2014-08-29 21:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm



On 08/29/2014 04:08 PM, Paolo Bonzini wrote:
> Il 29/08/2014 19:36, Chris J Arges ha scritto:
>> I still get failures with the following test, I actually tested on
>> multiple machines with identical hardware and the same failure occurred.
>> In v3.13/v3.16 series kernels this passes. I'll look into which commit
>> changed this result for me. I suspect it was fairly recent.
> 
> I would try bisecting between 0e5ac3a8b100469ea154f87dd57b685fbdd356f6
> (might be bad) and 41fa4215f8e8150bdc5d2a5f8704915f1b059fa8 (might be good).
> 
Just sent the other email with my bisect results, the 'bad' commit for
me is:
0d3da0d26e3c3515997c99451ce3b0ad1a69a36c
A revert on this commit allows the tests to pass again.

> Is this a laptop or desktop?  Any relationship between the "Offset" and
> how long the machine has been up?
>
> Paolo
> 
This is a server; and no, I've reboot the machine and re-run the test.
I've also run it after days of uptime.

--chris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-08-29 21:05               ` Chris J Arges
@ 2014-08-31 16:05                 ` Paolo Bonzini
  2014-09-02 19:57                   ` Chris J Arges
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-08-31 16:05 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm

Il 29/08/2014 23:05, Chris J Arges ha scritto:
> And indeed there is a condition where matched && already_matched are
> both true. In this case we don't zero or increment nr_vcpus_matched_tsc.
> Incrementing nr_vcpus_matched_tsc in that last else clause allows the
> test to pass; however this is identical to the  logic before the patch.

Can you please trace the test using trace-cmd
(http://www.linux-kvm.org/page/Tracing) and send the output?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-08-31 16:05                 ` Paolo Bonzini
@ 2014-09-02 19:57                   ` Chris J Arges
  2014-09-03 14:47                     ` Paolo Bonzini
  2014-09-03 14:59                     ` Paolo Bonzini
  0 siblings, 2 replies; 25+ messages in thread
From: Chris J Arges @ 2014-09-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm



On 08/31/2014 11:05 AM, Paolo Bonzini wrote:
> Il 29/08/2014 23:05, Chris J Arges ha scritto:
>> And indeed there is a condition where matched && already_matched are
>> both true. In this case we don't zero or increment nr_vcpus_matched_tsc.
>> Incrementing nr_vcpus_matched_tsc in that last else clause allows the
>> test to pass; however this is identical to the  logic before the patch.
> 
> Can you please trace the test using trace-cmd
> (http://www.linux-kvm.org/page/Tracing) and send the output?
> 
> Paolo
> 

Paolo,

I have posted the trace data here:
http://people.canonical.com/~arges/kvm/trace.dat.xz

Here is the output from the actual test case:
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44d4c0
kvm-clock: cpu 0, msr 0x:44d4c0
Wallclock test, threshold 5
Seconds get from host:     1409687073
Seconds get from kvmclock: 1409333034
Offset:                    -354039
offset too large!
Check the stability of raw cycle ...
Worst warp -354462672821748
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -354462672821748
Raw cycle is not stable
Monotonic cycle test:
Worst warp -354455286691490
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -354455286691490
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1234719818
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1234750103

I also enabled dynamic debugging on arch/x86/kvm/x86.c (on a separate run):
[354558.460867] kvm_get_time_scale: base_khz 1000000 => 2593993, shift
2, mul 2785278775
[354558.461034] kvm: new tsc generation 1, clock 0
[354558.461812] kvm_get_time_scale: base_khz 1000000 => 2593993, shift
2, mul 2785278775
[354558.461846] kvm: matched tsc offset for 0
[354558.477542] kvm: matched tsc offset for 0
[354558.477614] kvm: matched tsc offset for 0
[354558.513729] kvm: matched tsc offset for 0
[354558.541724] kvm: matched tsc offset for 0
[354558.741532] kvm_get_time_scale: base_khz 2593993 => 1000000, shift
-1, mul 3311471770
[354558.742551] kvm_get_time_scale: base_khz 2593993 => 1000000, shift
-1, mul 3311471770

Also I ensured that this was reproducible with the latest qemu, and
these results were gathered with the latest version as of today.

Any other places to look into?

--chris j arges


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-02 19:57                   ` Chris J Arges
@ 2014-09-03 14:47                     ` Paolo Bonzini
  2014-09-03 15:21                       ` Chris J Arges
  2014-09-03 14:59                     ` Paolo Bonzini
  1 sibling, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-09-03 14:47 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm

Il 02/09/2014 21:57, Chris J Arges ha scritto:
>> > Can you please trace the test using trace-cmd
>> > (http://www.linux-kvm.org/page/Tracing) and send the output?
>> > 
>> > Paolo
>> > 
> Paolo,
> 
> I have posted the trace data here:
> http://people.canonical.com/~arges/kvm/trace.dat.xz

Can you try running the test again (no need to get a new trace) with
clocksource=hpet on the kernel command line?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-02 19:57                   ` Chris J Arges
  2014-09-03 14:47                     ` Paolo Bonzini
@ 2014-09-03 14:59                     ` Paolo Bonzini
  2014-09-03 16:23                       ` Chris J Arges
  1 sibling, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-09-03 14:59 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm

Il 02/09/2014 21:57, Chris J Arges ha scritto:
> Seconds get from host:     1409687073
> Seconds get from kvmclock: 1409333034
> Offset:                    -354039
> offset too large!
> Check the stability of raw cycle ...
> Worst warp -354462672821748
> Total vcpus: 2
> Test  loops: 10000000
> Total warps:  1
> Total stalls: 0
> Worst warp:   -354462672821748
> Raw cycle is not stable
> Monotonic cycle test:
> Worst warp -354455286691490

Looks like one CPU is not being initialized correctly:

- The "next" correction in the trace is 18445647546048704244,
  and (next-2^64) / -354039 is about 3.1*10^9.  This is a pretty
  plausible value of the TSC frequency.  As a comparison, on my machine
  I have next=18446366988261784997 and an uptime of 29:12 hours, and
  the two match nicely with the CPU clock:

  -(18446366988261784997-2^64) / (29.2 * 3600 * 10^9) = 3.587

  $ grep -m1 model.name /proc/cpuinfo
  model name	: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz

- The offset in seconds * 10^9 is pretty close to the warp in nanoseconds.

Can you: 1) try this patch 2) gather a new trace 3) include uptime and
cpuinfo in your report?  All this without clocksource=hpet of course.

Thanks,

Paolo

diff --git a/x86/kvmclock_test.c b/x86/kvmclock_test.c
index 52a43fb..f68881c 100644
--- a/x86/kvmclock_test.c
+++ b/x86/kvmclock_test.c
@@ -7,6 +7,9 @@
 #define DEFAULT_TEST_LOOPS 100000000L
 #define DEFAULT_THRESHOLD  5L
 
+long threshold = DEFAULT_THRESHOLD;
+int nerr = 0;
+
 struct test_info {
         struct spinlock lock;
         long loops;               /* test loops */
@@ -20,8 +23,9 @@ struct test_info {
 
 struct test_info ti[4];
 
-static int wallclock_test(long sec, long threshold)
+static void wallclock_test(void *p_sec)
 {
+	long sec = *(long *)p_sec;
         long ksec, offset;
         struct timespec ts;
 
@@ -36,10 +40,8 @@ static int wallclock_test(long sec, long threshold)
 
         if (offset > threshold || offset < -threshold) {
                 printf("offset too large!\n");
-                return 1;
+                nerr++;
         }
-
-        return 0;
 }
 
 static void kvm_clock_test(void *data)
@@ -116,10 +118,9 @@ static int cycle_test(int ncpus, long loops, int check, struct test_info *ti)
 int main(int ac, char **av)
 {
         int ncpus;
-        int nerr = 0, i;
+        int i;
         long loops = DEFAULT_TEST_LOOPS;
         long sec = 0;
-        long threshold = DEFAULT_THRESHOLD;
 
         if (ac > 1)
                 loops = atol(av[1]);
@@ -137,7 +138,8 @@ int main(int ac, char **av)
                 on_cpu(i, kvm_clock_init, (void *)0);
 
         if (ac > 2)
-                nerr += wallclock_test(sec, threshold);
+	        for (i = 0; i < ncpus; ++i)
+	                on_cpu(i, wallclock_test, &sec);
 
         printf("Check the stability of raw cycle ...\n");
         pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-03 14:47                     ` Paolo Bonzini
@ 2014-09-03 15:21                       ` Chris J Arges
  0 siblings, 0 replies; 25+ messages in thread
From: Chris J Arges @ 2014-09-03 15:21 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm



On 09/03/2014 09:47 AM, Paolo Bonzini wrote:
> Il 02/09/2014 21:57, Chris J Arges ha scritto:
>>>> Can you please trace the test using trace-cmd
>>>> (http://www.linux-kvm.org/page/Tracing) and send the output?
>>>>
>>>> Paolo
>>>>
>> Paolo,
>>
>> I have posted the trace data here:
>> http://people.canonical.com/~arges/kvm/trace.dat.xz
> 
> Can you try running the test again (no need to get a new trace) with
> clocksource=hpet on the kernel command line?
> 
> Paolo
> 
 ./x86-run x86/kvmclock_test.flat -smp 2 --append "10000000 `date +%s`"
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
10000000 1409757645
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44d4c0
kvm-clock: cpu 0, msr 0x:44d4c0
Wallclock test, threshold 5
Seconds get from host:     1409757645
Seconds get from kvmclock: 1409757645
Offset:                    0
Check the stability of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
Total warps:  0
Total stalls: 0
Worst warp:   0
Raw cycle is stable
Monotonic cycle test:
Total vcpus: 2
Test  loops: 10000000
Total warps:  0
Total stalls: 0
Worst warp:   0
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1106490299
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  3463433372
Return value from qemu: 1

Ok this passes, I'll now try the patch without the cmdline option.
--chris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-03 14:59                     ` Paolo Bonzini
@ 2014-09-03 16:23                       ` Chris J Arges
  2014-09-03 17:52                         ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-09-03 16:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm



On 09/03/2014 09:59 AM, Paolo Bonzini wrote:
> Il 02/09/2014 21:57, Chris J Arges ha scritto:
>> Seconds get from host:     1409687073
>> Seconds get from kvmclock: 1409333034
>> Offset:                    -354039
>> offset too large!
>> Check the stability of raw cycle ...
>> Worst warp -354462672821748
>> Total vcpus: 2
>> Test  loops: 10000000
>> Total warps:  1
>> Total stalls: 0
>> Worst warp:   -354462672821748
>> Raw cycle is not stable
>> Monotonic cycle test:
>> Worst warp -354455286691490
> 
> Looks like one CPU is not being initialized correctly:
> 
> - The "next" correction in the trace is 18445647546048704244,
>   and (next-2^64) / -354039 is about 3.1*10^9.  This is a pretty
>   plausible value of the TSC frequency.  As a comparison, on my machine
>   I have next=18446366988261784997 and an uptime of 29:12 hours, and
>   the two match nicely with the CPU clock:
> 
>   -(18446366988261784997-2^64) / (29.2 * 3600 * 10^9) = 3.587
> 
>   $ grep -m1 model.name /proc/cpuinfo
>   model name	: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz
> 
> - The offset in seconds * 10^9 is pretty close to the warp in nanoseconds.
> 
> Can you: 1) try this patch 2) gather a new trace 3) include uptime and
> cpuinfo in your report?  All this without clocksource=hpet of course.
> 
> Thanks,
> 
> Paolo
> 
> diff --git a/x86/kvmclock_test.c b/x86/kvmclock_test.c
> index 52a43fb..f68881c 100644
> --- a/x86/kvmclock_test.c
> +++ b/x86/kvmclock_test.c
> @@ -7,6 +7,9 @@
>  #define DEFAULT_TEST_LOOPS 100000000L
>  #define DEFAULT_THRESHOLD  5L
>  
> +long threshold = DEFAULT_THRESHOLD;
> +int nerr = 0;
> +
>  struct test_info {
>          struct spinlock lock;
>          long loops;               /* test loops */
> @@ -20,8 +23,9 @@ struct test_info {
>  
>  struct test_info ti[4];
>  
> -static int wallclock_test(long sec, long threshold)
> +static void wallclock_test(void *p_sec)
>  {
> +	long sec = *(long *)p_sec;
>          long ksec, offset;
>          struct timespec ts;
>  
> @@ -36,10 +40,8 @@ static int wallclock_test(long sec, long threshold)
>  
>          if (offset > threshold || offset < -threshold) {
>                  printf("offset too large!\n");
> -                return 1;
> +                nerr++;
>          }
> -
> -        return 0;
>  }
>  
>  static void kvm_clock_test(void *data)
> @@ -116,10 +118,9 @@ static int cycle_test(int ncpus, long loops, int check, struct test_info *ti)
>  int main(int ac, char **av)
>  {
>          int ncpus;
> -        int nerr = 0, i;
> +        int i;
>          long loops = DEFAULT_TEST_LOOPS;
>          long sec = 0;
> -        long threshold = DEFAULT_THRESHOLD;
>  
>          if (ac > 1)
>                  loops = atol(av[1]);
> @@ -137,7 +138,8 @@ int main(int ac, char **av)
>                  on_cpu(i, kvm_clock_init, (void *)0);
>  
>          if (ac > 2)
> -                nerr += wallclock_test(sec, threshold);
> +	        for (i = 0; i < ncpus; ++i)
> +	                on_cpu(i, wallclock_test, &sec);
>  
>          printf("Check the stability of raw cycle ...\n");
>          pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT
> 

Here are the results of the trace as you requested:
http://people.canonical.com/~arges/kvm/trace-2.dat.xz

$ uptime
 16:18:31 up 53 min,  1 user,  load average: 1.16, 0.39, 0.17

$ grep -m1 model.name /proc/cpuinfo
model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz

Here is the output of the command:
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
10000000 1409761075
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44e520
kvm-clock: cpu 0, msr 0x:44e520
Wallclock test, threshold 5
Seconds get from host:     1409761075
Seconds get from kvmclock: 1409757927
Offset:                    -3148
offset too large!
Wallclock test, threshold 5
Seconds get from host:     1409761075
Seconds get from kvmclock: 1409757927
Offset:                    -3148
offset too large!
Check the stability of raw cycle ...
Worst warp -3147762665310
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -3147762665310
Raw cycle is not stable
Monotonic cycle test:
Worst warp -3142929472775
Total vcpus: 2
Test  loops: 10000000
Total warps:  1
Total stalls: 0
Worst warp:   -3142929472775
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1242044050
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1242665486
Return value from qemu: 3

--chris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-03 16:23                       ` Chris J Arges
@ 2014-09-03 17:52                         ` Paolo Bonzini
  2014-09-03 18:25                           ` Chris J Arges
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-09-03 17:52 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm

Il 03/09/2014 18:23, Chris J Arges ha scritto:
> $ uptime
>  16:18:31 up 53 min,  1 user,  load average: 1.16, 0.39, 0.17
> 
> $ grep -m1 model.name /proc/cpuinfo
> model name      : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
> 
> Here is the output of the command:
> qemu-system-x86_64 -enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
> -device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
> 10000000 1409761075
> enabling apic
> enabling apic
> kvm-clock: cpu 0, msr 0x:44e520
> kvm-clock: cpu 0, msr 0x:44e520
> Wallclock test, threshold 5
> Seconds get from host:     1409761075
> Seconds get from kvmclock: 1409757927
> Offset:                    -3148
> offset too large!
> Wallclock test, threshold 5
> Seconds get from host:     1409761075
> Seconds get from kvmclock: 1409757927
> Offset:                    -3148
> offset too large!
> Check the stability of raw cycle ...
> Worst warp -3147762665310

I'm not sure about the reason for the warp, but indeed the offset and
uptime match (I'll check them against the trace tomorrow) so it's "just"
that the VM's TSC base is not taken into account correctly.

Can you gather another trace with the problematic patch reverted?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-03 17:52                         ` Paolo Bonzini
@ 2014-09-03 18:25                           ` Chris J Arges
       [not found]                             ` <54083688.6000201@redhat.com>
  0 siblings, 1 reply; 25+ messages in thread
From: Chris J Arges @ 2014-09-03 18:25 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

<snip>
> I'm not sure about the reason for the warp, but indeed the offset and
> uptime match (I'll check them against the trace tomorrow) so it's "just"
> that the VM's TSC base is not taken into account correctly.
> 
> Can you gather another trace with the problematic patch reverted?
> 
> Paolo
> 

Here is the third trace running with 0d3da0d2 reverted from the latest
kvm queue branch 11cc9ea3:

http://people.canonical.com/~arges/kvm/trace-3.dat.xz

$ uptime
 18:25:13 up 5 min,  1 user,  load average: 0.21, 0.74, 0.44

qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
10000000 1409768537
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44e520
kvm-clock: cpu 0, msr 0x:44e520
Wallclock test, threshold 5
Seconds get from host:     1409768537
Seconds get from kvmclock: 1409768538
Offset:                    1
Wallclock test, threshold 5
Seconds get from host:     1409768537
Seconds get from kvmclock: 1409768538
Offset:                    1
Check the stability of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
Total warps:  0
Total stalls: 0
Worst warp:   0
Raw cycle is stable
Monotonic cycle test:
Total vcpus: 2
Test  loops: 10000000
Total warps:  0
Total stalls: 0
Worst warp:   0
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  1241970306
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 10000000
TSC cycles:  3266701026
Return value from qemu: 1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
       [not found]                             ` <54083688.6000201@redhat.com>
@ 2014-09-04 11:33                               ` Paolo Bonzini
  2014-09-04 12:24                                 ` Wanpeng Li
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2014-09-04 11:33 UTC (permalink / raw)
  To: Chris J Arges; +Cc: kvm

Il 04/09/2014 11:53, Paolo Bonzini ha scritto:
> Il 03/09/2014 20:25, Chris J Arges ha scritto:
>> <snip>
>>> I'm not sure about the reason for the warp, but indeed the offset and
>>> uptime match (I'll check them against the trace tomorrow) so it's "just"
>>> that the VM's TSC base is not taken into account correctly.
>>>
>>> Can you gather another trace with the problematic patch reverted?
>>>
>>> Paolo
>>>
>>
>> Here is the third trace running with 0d3da0d2 reverted from the latest
>> kvm queue branch 11cc9ea3:
>>
>> http://people.canonical.com/~arges/kvm/trace-3.dat.xz
> 
> Thanks!  And---yay!---I reproduced it on another machine.

And my bisection landed on the merge of the timer branch (commit
e7fda6c4c3c1a7d6996dd75fd84670fa0b5d448f).  Here is the log:

$ git bisect bad origin/master
$ git bisect good v3.16
$ git bisect good kvm-3.17-1 # 42cbc04fd3b5e3f9b011bf9fa3ce0b3d1e10b58b

good    [ 6929.863545] loaded kvm module (v3.17-rc1-158-g451fd72219dd)
bad     [ 6971.932790] loaded kvm module (for-linus)
bad     [ 7216.073128] loaded kvm module (v3.16-6426-gae045e245542)
good    [ 7286.198948] loaded kvm module (v3.16-3283-g53ee983378ff)
good    [ 7350.534060] loaded kvm module (v3.16-rc7-1668-gaef4f5b6db65)
good    [ 7439.037038] loaded kvm module (v3.16-4006-g91c2ff7708d4)
good    [ 7481.188637] loaded kvm module (v3.16-rc6-450-g7ba3c21c17d0)
bad     [ 7535.292730] loaded kvm module (v3.16-4635-ge7fda6c4c3c1)
good    [ 7589.722691] loaded kvm module (v3.16-rc5-110-g9b0fd802e8c0)
good    [ 7630.286418] loaded kvm module (v3.16-4467-ged5c41d30ef2)
good    [ 7712.470986] loaded kvm module (v3.16-rc1-35-g885d078bfe92)
good    [ 7763.443626] loaded kvm module (v3.16-rc1-381-g1b0733837a9b)
good    [ 7825.497414] loaded kvm module (v3.16-rc5-116-g7806f60e1d20)
good    [ 7893.174056] loaded kvm module (v3.16-rc1-384-gc6f1224573c3)

This means that:

- Tomasz's patch (commit 0d3da0d26e3c3515997c99451ce3b0ad1a69a36c) is
fine, it just enables the (wrong) master clock more often

- the failure is within that branch.

I then cherry-picked Tomasz's patch during a new bisection, and landed
on one of my original suspects:

commit cbcf2dd3b3d4d990610259e8d878fc8dc1f17d80
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Jul 16 21:04:54 2014 +0000

    x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based

    Convert the relevant base data right away to nanoseconds instead of
    doing the conversion on every readout. Reduces text size by 160
    bytes.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Gleb Natapov <gleb@kernel.org>
    Cc: kvm@vger.kernel.org
    Acked-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: John Stultz <john.stultz@linaro.org>

Again, here is the log:

$ git bisect bad 953dec21aed4038464fec02f96a2f1b8701a5bce
$ git bisect good 1af447bd8cbfb808a320885d214555fb2d32e6e6

good    [ 8384.334892] loaded kvm module (v3.16-rc5-81-g68f6783d2831)
bad     [ 8525.975170] loaded kvm module (v3.16-rc5-99-gf519b1a2e08c)
good    [ 8562.204988] loaded kvm module (v3.16-rc5-90-g41fa4215f8e8)
bad     [ 8629.133287] loaded kvm module (v3.16-rc5-94-g48f18fd6addc)
bad     [ 8772.846612] loaded kvm module (v3.16-rc5-92-gcbcf2dd3b3d4)
good    [ 8836.509602] loaded kvm module (v3.16-rc5-91-gbb0b58127c5a)

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-04 11:33                               ` Paolo Bonzini
@ 2014-09-04 12:24                                 ` Wanpeng Li
  2014-09-04 13:11                                   ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Wanpeng Li @ 2014-09-04 12:24 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Chris J Arges, kvm

On Thu, Sep 04, 2014 at 01:33:10PM +0200, Paolo Bonzini wrote:
>Il 04/09/2014 11:53, Paolo Bonzini ha scritto:
>> Il 03/09/2014 20:25, Chris J Arges ha scritto:
>>> <snip>
>>>> I'm not sure about the reason for the warp, but indeed the offset and
>>>> uptime match (I'll check them against the trace tomorrow) so it's "just"
>>>> that the VM's TSC base is not taken into account correctly.
>>>>
>>>> Can you gather another trace with the problematic patch reverted?
>>>>
>>>> Paolo
>>>>
>>>
>>> Here is the third trace running with 0d3da0d2 reverted from the latest
>>> kvm queue branch 11cc9ea3:
>>>
>>> http://people.canonical.com/~arges/kvm/trace-3.dat.xz
>> 
>> Thanks!  And---yay!---I reproduced it on another machine.
>
>And my bisection landed on the merge of the timer branch (commit
>e7fda6c4c3c1a7d6996dd75fd84670fa0b5d448f).  Here is the log:
>
>$ git bisect bad origin/master
>$ git bisect good v3.16
>$ git bisect good kvm-3.17-1 # 42cbc04fd3b5e3f9b011bf9fa3ce0b3d1e10b58b
>
>good    [ 6929.863545] loaded kvm module (v3.17-rc1-158-g451fd72219dd)
>bad     [ 6971.932790] loaded kvm module (for-linus)
>bad     [ 7216.073128] loaded kvm module (v3.16-6426-gae045e245542)
>good    [ 7286.198948] loaded kvm module (v3.16-3283-g53ee983378ff)
>good    [ 7350.534060] loaded kvm module (v3.16-rc7-1668-gaef4f5b6db65)
>good    [ 7439.037038] loaded kvm module (v3.16-4006-g91c2ff7708d4)
>good    [ 7481.188637] loaded kvm module (v3.16-rc6-450-g7ba3c21c17d0)
>bad     [ 7535.292730] loaded kvm module (v3.16-4635-ge7fda6c4c3c1)
>good    [ 7589.722691] loaded kvm module (v3.16-rc5-110-g9b0fd802e8c0)
>good    [ 7630.286418] loaded kvm module (v3.16-4467-ged5c41d30ef2)
>good    [ 7712.470986] loaded kvm module (v3.16-rc1-35-g885d078bfe92)
>good    [ 7763.443626] loaded kvm module (v3.16-rc1-381-g1b0733837a9b)
>good    [ 7825.497414] loaded kvm module (v3.16-rc5-116-g7806f60e1d20)
>good    [ 7893.174056] loaded kvm module (v3.16-rc1-384-gc6f1224573c3)
>
>This means that:
>
>- Tomasz's patch (commit 0d3da0d26e3c3515997c99451ce3b0ad1a69a36c) is
>fine, it just enables the (wrong) master clock more often
>
>- the failure is within that branch.
>
>I then cherry-picked Tomasz's patch during a new bisection, and landed
>on one of my original suspects:
>
>commit cbcf2dd3b3d4d990610259e8d878fc8dc1f17d80
>Author: Thomas Gleixner <tglx@linutronix.de>
>Date:   Wed Jul 16 21:04:54 2014 +0000
>
>    x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based
>
>    Convert the relevant base data right away to nanoseconds instead of
>    doing the conversion on every readout. Reduces text size by 160
>    bytes.
>
>    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>    Cc: Gleb Natapov <gleb@kernel.org>
>    Cc: kvm@vger.kernel.org
>    Acked-by: Paolo Bonzini <pbonzini@redhat.com>
>    Signed-off-by: John Stultz <john.stultz@linaro.org>
>

Yes, I also look into this bad commit recently which lead to guest hang
after live migration or after local save/restore. 

Regards,
Wanpeng Li 

>Again, here is the log:
>
>$ git bisect bad 953dec21aed4038464fec02f96a2f1b8701a5bce
>$ git bisect good 1af447bd8cbfb808a320885d214555fb2d32e6e6
>
>good    [ 8384.334892] loaded kvm module (v3.16-rc5-81-g68f6783d2831)
>bad     [ 8525.975170] loaded kvm module (v3.16-rc5-99-gf519b1a2e08c)
>good    [ 8562.204988] loaded kvm module (v3.16-rc5-90-g41fa4215f8e8)
>bad     [ 8629.133287] loaded kvm module (v3.16-rc5-94-g48f18fd6addc)
>bad     [ 8772.846612] loaded kvm module (v3.16-rc5-92-gcbcf2dd3b3d4)
>good    [ 8836.509602] loaded kvm module (v3.16-rc5-91-gbb0b58127c5a)
>
>Paolo
>--
>To unsubscribe from this list: send the line "unsubscribe kvm" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kvm-unit-test failures
  2014-09-04 12:24                                 ` Wanpeng Li
@ 2014-09-04 13:11                                   ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2014-09-04 13:11 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: Chris J Arges, kvm

Il 04/09/2014 14:24, Wanpeng Li ha scritto:
> On Thu, Sep 04, 2014 at 01:33:10PM +0200, Paolo Bonzini wrote:
>> Il 04/09/2014 11:53, Paolo Bonzini ha scritto:
>>> Il 03/09/2014 20:25, Chris J Arges ha scritto:
>>>> <snip>
>>>>> I'm not sure about the reason for the warp, but indeed the offset and
>>>>> uptime match (I'll check them against the trace tomorrow) so it's "just"
>>>>> that the VM's TSC base is not taken into account correctly.
>>>>>
>>>>> Can you gather another trace with the problematic patch reverted?
>>>>>
>>>>> Paolo
>>>>>
>>>>
>>>> Here is the third trace running with 0d3da0d2 reverted from the latest
>>>> kvm queue branch 11cc9ea3:
>>>>
>>>> http://people.canonical.com/~arges/kvm/trace-3.dat.xz
>>>
>>> Thanks!  And---yay!---I reproduced it on another machine.
>>
>> And my bisection landed on the merge of the timer branch (commit
>> e7fda6c4c3c1a7d6996dd75fd84670fa0b5d448f).  Here is the log:
>>
>> $ git bisect bad origin/master
>> $ git bisect good v3.16
>> $ git bisect good kvm-3.17-1 # 42cbc04fd3b5e3f9b011bf9fa3ce0b3d1e10b58b
>>
>> good    [ 6929.863545] loaded kvm module (v3.17-rc1-158-g451fd72219dd)
>> bad     [ 6971.932790] loaded kvm module (for-linus)
>> bad     [ 7216.073128] loaded kvm module (v3.16-6426-gae045e245542)
>> good    [ 7286.198948] loaded kvm module (v3.16-3283-g53ee983378ff)
>> good    [ 7350.534060] loaded kvm module (v3.16-rc7-1668-gaef4f5b6db65)
>> good    [ 7439.037038] loaded kvm module (v3.16-4006-g91c2ff7708d4)
>> good    [ 7481.188637] loaded kvm module (v3.16-rc6-450-g7ba3c21c17d0)
>> bad     [ 7535.292730] loaded kvm module (v3.16-4635-ge7fda6c4c3c1)
>> good    [ 7589.722691] loaded kvm module (v3.16-rc5-110-g9b0fd802e8c0)
>> good    [ 7630.286418] loaded kvm module (v3.16-4467-ged5c41d30ef2)
>> good    [ 7712.470986] loaded kvm module (v3.16-rc1-35-g885d078bfe92)
>> good    [ 7763.443626] loaded kvm module (v3.16-rc1-381-g1b0733837a9b)
>> good    [ 7825.497414] loaded kvm module (v3.16-rc5-116-g7806f60e1d20)
>> good    [ 7893.174056] loaded kvm module (v3.16-rc1-384-gc6f1224573c3)
>>
>> This means that:
>>
>> - Tomasz's patch (commit 0d3da0d26e3c3515997c99451ce3b0ad1a69a36c) is
>> fine, it just enables the (wrong) master clock more often
>>
>> - the failure is within that branch.
>>
>> I then cherry-picked Tomasz's patch during a new bisection, and landed
>> on one of my original suspects:
>>
>> commit cbcf2dd3b3d4d990610259e8d878fc8dc1f17d80
>> Author: Thomas Gleixner <tglx@linutronix.de>
>> Date:   Wed Jul 16 21:04:54 2014 +0000
>>
>>    x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based
>>
>>    Convert the relevant base data right away to nanoseconds instead of
>>    doing the conversion on every readout. Reduces text size by 160
>>    bytes.
>>
>>    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>    Cc: Gleb Natapov <gleb@kernel.org>
>>    Cc: kvm@vger.kernel.org
>>    Acked-by: Paolo Bonzini <pbonzini@redhat.com>
>>    Signed-off-by: John Stultz <john.stultz@linaro.org>
>>
> 
> Yes, I also look into this bad commit recently which lead to guest hang
> after live migration or after local save/restore. 

Thanks for the report!

Wanpeng, can you test and/or review the patch I just posted ("[PATCH]
KVM: x86: fix kvmclock breakage from timers branch merge")?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2014-09-04 13:11 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-26 18:29 [PATCH 0/2] kvm-unit-tests: add check parameter to run_tests configuration Chris J Arges
2014-08-26 18:29 ` [PATCH 1/2] " Chris J Arges
2014-08-27 13:29   ` Paolo Bonzini
2014-08-27 14:42     ` [PATCH 1/2 v2] " Chris J Arges
2014-08-27 14:45     ` [PATCH 1/2 v3] " Chris J Arges
2014-08-27 14:52       ` Paolo Bonzini
2014-08-27 21:24         ` kvm-unit-test failures (was: [PATCH 1/2 v3] add check parameter to run_tests configuration) Chris J Arges
2014-08-27 22:05           ` Paolo Bonzini
2014-08-29 17:36             ` kvm-unit-test failures Chris J Arges
2014-08-29 21:05               ` Chris J Arges
2014-08-31 16:05                 ` Paolo Bonzini
2014-09-02 19:57                   ` Chris J Arges
2014-09-03 14:47                     ` Paolo Bonzini
2014-09-03 15:21                       ` Chris J Arges
2014-09-03 14:59                     ` Paolo Bonzini
2014-09-03 16:23                       ` Chris J Arges
2014-09-03 17:52                         ` Paolo Bonzini
2014-09-03 18:25                           ` Chris J Arges
     [not found]                             ` <54083688.6000201@redhat.com>
2014-09-04 11:33                               ` Paolo Bonzini
2014-09-04 12:24                                 ` Wanpeng Li
2014-09-04 13:11                                   ` Paolo Bonzini
2014-08-29 21:08               ` Paolo Bonzini
2014-08-29 21:18                 ` Chris J Arges
2014-08-26 18:29 ` [PATCH 2/2] x86/unittests.cfg: the pmu testcase requires that nmi_watchdog is disabled Chris J Arges
2014-08-27 13:29   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.