* [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently
@ 2020-10-08 20:57 Cleber Rosa
  2020-10-08 21:14 ` [Bug 1899082] " Cleber Rosa
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Cleber Rosa @ 2020-10-08 20:57 UTC (permalink / raw)
  To: qemu-devel

Public bug reported:

Even though this acceptance test is already skipped on GitLab CI, the
intermittent failures can be seen on other environments too.

The record phase works fine, but the replay phase fails to finish
booting the kernel (up to the expected point):

16:34:47 DEBUG| [    0.034498] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
16:34:47 DEBUG| [    0.034790] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
16:34:47 DEBUG| [    0.035093] Spectre V2 : Mitigation: Full generic retpoline
16:34:47 DEBUG| [    0.035347] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
16:34:47 DEBUG| [    0.035667]
16:36:02 ERROR| 
16:36:02 ERROR| Reproduced traceback from: /home/cleber/src/avocado/avocado/avocado/core/test.py:767
16:36:02 ERROR| Traceback (most recent call last):
16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 92, in test_x86_64_pc
16:36:02 ERROR|     self.run_rr(kernel_path, kernel_command_line, console_pattern, shift=5)
16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 73, in run_rr
16:36:02 ERROR|     False, shift, args, replay_path)
16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 55, in run_vm
16:36:02 ERROR|     self.wait_for_console_pattern(console_pattern, vm)
16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/boot_linux_console.py", line 53, in wait_for_console_pattern
16:36:02 ERROR|     vm=vm)
16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/avocado_qemu/__init__.py", line 130, in wait_for_console_pattern
16:36:02 ERROR|     _console_interaction(test, success_message, failure_message, None, vm=vm)
16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/avocado_qemu/__init__.py", line 82, in _console_interaction
16:36:02 ERROR|     msg = console.readline().strip()
16:36:02 ERROR|   File "/usr/lib64/python3.7/socket.py", line 575, in readinto
16:36:02 ERROR|     def readinto(self, b):
16:36:02 ERROR|   File "/home/cleber/src/avocado/avocado/avocado/plugins/runner.py", line 77, in sigterm_handler
16:36:02 ERROR|     raise RuntimeError("Test interrupted by SIGTERM")
16:36:02 ERROR| RuntimeError: Test interrupted by SIGTERM
16:36:02 ERROR| 

On my workstation, I can replicate the failure roughly once every 50
runs.
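
The traceback above shows the test blocked in console.readline() until the
runner delivered SIGTERM. As a minimal sketch (this is not the avocado_qemu
implementation; the name wait_for_pattern and its parameters are hypothetical),
a console wait with an explicit deadline could look roughly like this:

```python
import select
import socket
import time


def wait_for_pattern(sock, pattern, timeout):
    """Return True if `pattern` shows up on `sock` within `timeout` seconds.

    Hypothetical helper, not part of avocado_qemu: it polls the socket with
    select() so that a stalled guest yields a clean False instead of a read
    that blocks until the test runner kills the process.
    """
    deadline = time.monotonic() + timeout
    needle = pattern.encode()
    buf = b""
    while True:
        if needle in buf:
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            return False      # deadline reached with no new console data
        chunk = sock.recv(4096)
        if not chunk:
            return False      # peer closed the connection before the pattern
        buf += chunk
```

With something like this, a replay that stops producing console output would
surface as an ordinary test failure after `timeout` seconds rather than as an
interruption by SIGTERM.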

** Affects: qemu
     Importance: Undecided
         Status: New


** Tags: acceptance pc replay test x86

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1899082

Title:
  ReplayKernel.test_x86_64_pc fails intermittently

Status in QEMU:
  New

Bug description:
  Even though this acceptance test is already skipped on GitLab CI, the
  intermittent failures can be seen on other environments too.

  The record phase works fine, but the replay phase fails to
  finish booting the kernel (up to the expected point):

  16:34:47 DEBUG| [    0.034498] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
  16:34:47 DEBUG| [    0.034790] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
  16:34:47 DEBUG| [    0.035093] Spectre V2 : Mitigation: Full generic retpoline
  16:34:47 DEBUG| [    0.035347] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
  16:34:47 DEBUG| [    0.035667]
  16:36:02 ERROR| 
  16:36:02 ERROR| Reproduced traceback from: /home/cleber/src/avocado/avocado/avocado/core/test.py:767
  16:36:02 ERROR| Traceback (most recent call last):
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 92, in test_x86_64_pc
  16:36:02 ERROR|     self.run_rr(kernel_path, kernel_command_line, console_pattern, shift=5)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 73, in run_rr
  16:36:02 ERROR|     False, shift, args, replay_path)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 55, in run_vm
  16:36:02 ERROR|     self.wait_for_console_pattern(console_pattern, vm)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/boot_linux_console.py", line 53, in wait_for_console_pattern
  16:36:02 ERROR|     vm=vm)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/avocado_qemu/__init__.py", line 130, in wait_for_console_pattern
  16:36:02 ERROR|     _console_interaction(test, success_message, failure_message, None, vm=vm)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/avocado_qemu/__init__.py", line 82, in _console_interaction
  16:36:02 ERROR|     msg = console.readline().strip()
  16:36:02 ERROR|   File "/usr/lib64/python3.7/socket.py", line 575, in readinto
  16:36:02 ERROR|     def readinto(self, b):
  16:36:02 ERROR|   File "/home/cleber/src/avocado/avocado/avocado/plugins/runner.py", line 77, in sigterm_handler
  16:36:02 ERROR|     raise RuntimeError("Test interrupted by SIGTERM")
  16:36:02 ERROR| RuntimeError: Test interrupted by SIGTERM
  16:36:02 ERROR| 

  On my workstation, I can replicate the failure roughly once every 50
  runs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1899082/+subscriptions



* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
@ 2020-10-08 21:14 ` Cleber Rosa
  2020-10-09  6:01 ` Pavel Dovgalyuk
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Cleber Rosa @ 2020-10-08 21:14 UTC (permalink / raw)
  To: qemu-devel

I'm actually able to increase the reproducibility to ~90% when running
8 of those tests simultaneously (on an 8-core system).
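
To put a number on such an observation, one could run the same command several
times in parallel and count the non-zero exit codes. This is only a sketch:
failure_rate is a hypothetical helper, and the command passed in is whatever
invocation of the test is being measured.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def failure_rate(cmd, runs, jobs):
    """Run `cmd` `runs` times, `jobs` at a time; return the fraction that failed."""
    def one_run(_):
        # A non-zero exit status is counted as a failure.
        return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(one_run, range(runs)))
    return sum(1 for code in codes if code != 0) / runs
```

Calling it with jobs equal to the number of cores mirrors the "8 tests on 8
cores" setup described above; jobs=1 gives the serial baseline.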


* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
  2020-10-08 21:14 ` [Bug 1899082] " Cleber Rosa
@ 2020-10-09  6:01 ` Pavel Dovgalyuk
  2020-10-09  7:21 ` Pavel Dovgalyuk
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pavel Dovgalyuk @ 2020-10-09  6:01 UTC (permalink / raw)
  To: qemu-devel

I can reproduce it 100% of the time with the following command line:
taskset 1 tests/venv/bin/avocado --show=app,console,replay run -t arch:x86_64 ../tests/acceptance/replay_kernel.py
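
For readers unfamiliar with taskset: the argument is a hexadecimal CPU bitmask,
so "taskset 1" confines the whole run to CPU 0. A small sketch of that decoding
follows; mask_to_cpus and pin_current_process are hypothetical names, and the
second function is Linux-only.

```python
import os


def mask_to_cpus(mask):
    """Decode a taskset-style hexadecimal affinity mask into a set of CPU numbers."""
    value = int(mask, 16)   # taskset reads the mask as hex, with or without "0x"
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


def pin_current_process(mask):
    """Linux-only: restrict the calling process to the CPUs named by `mask`."""
    os.sched_setaffinity(0, mask_to_cpus(mask))
```

Here mask_to_cpus("1") yields {0}, i.e. everything started under that command
line shares a single CPU.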


* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
  2020-10-08 21:14 ` [Bug 1899082] " Cleber Rosa
  2020-10-09  6:01 ` Pavel Dovgalyuk
@ 2020-10-09  7:21 ` Pavel Dovgalyuk
  2020-10-14  8:14 ` Pavel Dovgalyuk
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pavel Dovgalyuk @ 2020-10-09  7:21 UTC (permalink / raw)
  To: qemu-devel

But I can't reproduce it outside the Avocado toolchain, i.e. by running
QEMU directly.


* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
                   ` (2 preceding siblings ...)
  2020-10-09  7:21 ` Pavel Dovgalyuk
@ 2020-10-14  8:14 ` Pavel Dovgalyuk
  2021-02-10 22:48 ` Beraldo Leal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pavel Dovgalyuk @ 2020-10-14  8:14 UTC (permalink / raw)
  To: qemu-devel

I traced this bug to serial_ioport_read() in hw/char/serial.c.

The bug disappears when I add qemu_log("serial_ioport_read %x %x\n",
(int)addr, ret); at the end of this function.

I suspect an Avocado (or socket) I/O synchronization problem,
because running the same test without Avocado works normally.


* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
                   ` (3 preceding siblings ...)
  2020-10-14  8:14 ` Pavel Dovgalyuk
@ 2021-02-10 22:48 ` Beraldo Leal
  2021-02-11  4:46 ` Cleber Rosa
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Beraldo Leal @ 2021-02-10 22:48 UTC (permalink / raw)
  To: qemu-devel

I could reproduce this without Avocado:

--
#!/bin/bash

SOCKET="/tmp/qemu.sock"
VMLINUZ_PATH="/tmp/vmlinuz"
REPLAY_FILE="/tmp/replay.bin"

# Start QEMU in the background in the given rr mode ("record" or "replay")
# and attach to its serial console through the UNIX socket.
function run_and_wait() {
        /usr/bin/qemu-system-x86_64 -display none \
                                    -vga none  \
                                    -machine pc \
                                    -chardev "socket,id=console,path=${SOCKET},server=on,wait=off" \
                                    -serial chardev:console \
                                    -icount "shift=5,rr=$1,rrfile=${REPLAY_FILE}" \
                                    -kernel "${VMLINUZ_PATH}" \
                                    -append "printk.time=1 panic=-1 console=ttyS0" -net none -no-reboot &
        # Wait a little for the socket creation
        sleep 1
        socat - "UNIX-CONNECT:${SOCKET}"
        # Print socat's exit status
        echo $?
}


run_and_wait "record"
echo "Was this (record) finished?"

run_and_wait "replay"
echo "Was this (replay) finished?"
--

The second echo is never displayed and my console stops here:

---
[    0.036667] Speculative Store Bypass: Vulnerable
[    0.256061] random: fast init done
[    0.308652] Freeing SMP alternatives memory: 36K
---
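
The fixed "sleep 1" in the script above is only a guess at how long QEMU needs
to create the socket. A sketch of polling for the listener instead is shown
below; connect_when_ready and its parameters are hypothetical and not taken
from the script or from Avocado.

```python
import socket
import time


def connect_when_ready(path, timeout=10.0, interval=0.05):
    """Connect to the UNIX socket at `path`, retrying until `timeout` expires.

    Hypothetical stand-in for the script's fixed `sleep 1`.
    """
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except (FileNotFoundError, ConnectionRefusedError):
            # Socket file not created yet, or nobody is listening on it yet.
            sock.close()
            if time.monotonic() >= deadline:
                raise TimeoutError(f"no listener on {path} after {timeout}s")
            time.sleep(interval)
```

This removes one source of timing variation from the reproducer, but it is not
claimed to affect the replay hang itself.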


* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
                   ` (4 preceding siblings ...)
  2021-02-10 22:48 ` Beraldo Leal
@ 2021-02-11  4:46 ` Cleber Rosa
  2021-02-11  8:41 ` Pavel Dovgalyuk
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Cleber Rosa @ 2021-02-11  4:46 UTC (permalink / raw)
  To: qemu-devel

I was able to run the reproducer from Beraldo Leal, and achieved the
same results.

Additionally, I got the following output from QEMU:

   qemu-system-x86_64: Missing character write event in the replay log

Which seems to come from replay/replay-char.c:158.

I then tested the record and replay stages separately and found that, while
the above message is printed and QEMU exits during the replay stage, it is
the number of CPUs given to the *record* stage that actually makes the
difference. When the recording is done with a single CPU, the replay log seems
to be written in a way that leads to the "missing character write event" error.
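
To test the record and replay stages separately with different CPU
restrictions, the two command lines can be built independently. The sketch
below copies the QEMU options from the shell reproducer in this thread;
qemu_rr_command and its pin_mask parameter are hypothetical, and pin_mask
simply prefixes the command with "taskset <mask>".

```python
def qemu_rr_command(mode, pin_mask=None,
                    socket_path="/tmp/qemu.sock",
                    kernel="/tmp/vmlinuz",
                    rrfile="/tmp/replay.bin"):
    """Build the argv for one record or replay run of the reproducer.

    The QEMU options mirror the shell reproducer in this thread;
    `pin_mask` is a hypothetical knob that prefixes `taskset <mask>`.
    """
    if mode not in ("record", "replay"):
        raise ValueError(f"unknown rr mode: {mode!r}")
    argv = [
        "/usr/bin/qemu-system-x86_64", "-display", "none", "-vga", "none",
        "-machine", "pc",
        "-chardev", f"socket,id=console,path={socket_path},server=on,wait=off",
        "-serial", "chardev:console",
        "-icount", f"shift=5,rr={mode},rrfile={rrfile}",
        "-kernel", kernel,
        "-append", "printk.time=1 panic=-1 console=ttyS0",
        "-net", "none", "-no-reboot",
    ]
    return (["taskset", pin_mask] + argv) if pin_mask else argv
```

For example, qemu_rr_command("record", pin_mask="1") followed by
qemu_rr_command("replay") records on a single CPU and replays without any
restriction, which is the combination described above as producing the error.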


* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
                   ` (5 preceding siblings ...)
  2021-02-11  4:46 ` Cleber Rosa
@ 2021-02-11  8:41 ` Pavel Dovgalyuk
  2021-05-09 14:29 ` Thomas Huth
  2021-07-09  4:17 ` Launchpad Bug Tracker
  8 siblings, 0 replies; 10+ messages in thread
From: Pavel Dovgalyuk @ 2021-02-11  8:41 UTC (permalink / raw)
  To: qemu-devel

Beraldo, thanks for the script.
However, I can't reproduce the bug using it. I've got the newest QEMU from the repository, and it never hangs in this scenario.

But there are some problems in other runs with more complex tasks.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1899082

Title:
  ReplayKernel.test_x86_64_pc fails intermittently

Status in QEMU:
  New

Bug description:
  Even though this acceptance test is already skipped on GitLab CI, the
  intermittent failures can be seen on other environments too.

  The record phase works fine, but during the replay phase fail to
  finish booting the kernel (until the expected place):

  16:34:47 DEBUG| [    0.034498] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
  16:34:47 DEBUG| [    0.034790] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
  16:34:47 DEBUG| [    0.035093] Spectre V2 : Mitigation: Full generic retpoline
  16:34:47 DEBUG| [    0.035347] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
  16:34:47 DEBUG| [    0.035667]
  16:36:02 ERROR| 
  16:36:02 ERROR| Reproduced traceback from: /home/cleber/src/avocado/avocado/avocado/core/test.py:767
  16:36:02 ERROR| Traceback (most recent call last):
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 92, in test_x86_64_pc
  16:36:02 ERROR|     self.run_rr(kernel_path, kernel_command_line, console_pattern, shift=5)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 73, in run_rr
  16:36:02 ERROR|     False, shift, args, replay_path)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/replay_kernel.py", line 55, in run_vm
  16:36:02 ERROR|     self.wait_for_console_pattern(console_pattern, vm)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/boot_linux_console.py", line 53, in wait_for_console_pattern
  16:36:02 ERROR|     vm=vm)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/avocado_qemu/__init__.py", line 130, in wait_for_console_pattern
  16:36:02 ERROR|     _console_interaction(test, success_message, failure_message, None, vm=vm)
  16:36:02 ERROR|   File "/var/lib/users/cleber/build/qemu/tests/acceptance/avocado_qemu/__init__.py", line 82, in _console_interaction
  16:36:02 ERROR|     msg = console.readline().strip()
  16:36:02 ERROR|   File "/usr/lib64/python3.7/socket.py", line 575, in readinto
  16:36:02 ERROR|     def readinto(self, b):
  16:36:02 ERROR|   File "/home/cleber/src/avocado/avocado/avocado/plugins/runner.py", line 77, in sigterm_handler
  16:36:02 ERROR|     raise RuntimeError("Test interrupted by SIGTERM")
  16:36:02 ERROR| RuntimeError: Test interrupted by SIGTERM
  16:36:02 ERROR| 

  On my workstation, I can replicate the failure roughly once every 50
  runs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1899082/+subscriptions



* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
                   ` (6 preceding siblings ...)
  2021-02-11  8:41 ` Pavel Dovgalyuk
@ 2021-05-09 14:29 ` Thomas Huth
  2021-07-09  4:17 ` Launchpad Bug Tracker
  8 siblings, 0 replies; 10+ messages in thread
From: Thomas Huth @ 2021-05-09 14:29 UTC (permalink / raw)
  To: qemu-devel

The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

    https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket on
Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but would still like to keep this ticket open, then please switch
the state back to "New" or "Confirmed" within the next 60 days
(otherwise it will get closed as "Expired"). We will then eventually migrate
the ticket automatically to the new system (but you won't be the reporter
of the bug in the new system and thus you won't get notified of changes
anymore).

Thank you and sorry for the inconvenience.


** Changed in: qemu
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1899082

Title:
  ReplayKernel.test_x86_64_pc fails intermittently

Status in QEMU:
  Incomplete

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1899082/+subscriptions



* [Bug 1899082] Re: ReplayKernel.test_x86_64_pc fails intermittently
  2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
                   ` (7 preceding siblings ...)
  2021-05-09 14:29 ` Thomas Huth
@ 2021-07-09  4:17 ` Launchpad Bug Tracker
  8 siblings, 0 replies; 10+ messages in thread
From: Launchpad Bug Tracker @ 2021-07-09  4:17 UTC (permalink / raw)
  To: qemu-devel

[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1899082

Title:
  ReplayKernel.test_x86_64_pc fails intermittently

Status in QEMU:
  Expired

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1899082/+subscriptions



Thread overview: 10+ messages
2020-10-08 20:57 [Bug 1899082] [NEW] ReplayKernel.test_x86_64_pc fails intermittently Cleber Rosa
2020-10-08 21:14 ` [Bug 1899082] " Cleber Rosa
2020-10-09  6:01 ` Pavel Dovgalyuk
2020-10-09  7:21 ` Pavel Dovgalyuk
2020-10-14  8:14 ` Pavel Dovgalyuk
2021-02-10 22:48 ` Beraldo Leal
2021-02-11  4:46 ` Cleber Rosa
2021-02-11  8:41 ` Pavel Dovgalyuk
2021-05-09 14:29 ` Thomas Huth
2021-07-09  4:17 ` Launchpad Bug Tracker
