qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20
@ 2019-08-20  6:59 Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 01/36] kvm: i386: halt poll control MSR support Paolo Bonzini
                   ` (38 more replies)
  0 siblings, 39 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit 864ab314f1d924129d06ac7b571f105a2b76a4b2:

  Update version for v4.1.0-rc4 release (2019-08-06 17:05:21 +0100)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 6f93977075ae9971d73879ca872f75e2737f66c5:

  x86: Intel AVX512_BF16 feature enabling (2019-08-20 08:59:18 +0200)

----------------------------------------------------------------
* New KVM PV features (Marcelo, Wanpeng)
* valgrind fixes (Andrey)
* Remove clock reset notifiers (David)
* KConfig and Makefile cleanups (Paolo)
* Replay and icount improvements (Pavel)
* x86 FP fixes (Peter M.)
* TCG locking assertions (Roman)
* x86 support for mmap-ed -kernel/-initrd (Stefano)
* Other cleanups (Wei Yang, Yan Zhao, Tony)
* LSI fix for infinite loop (Prasad)
* ARM migration fix (Catherine)
* AVX512_BF16 feature (Jing)

----------------------------------------------------------------
Andrey Shinkevich (3):
      test-throttle: Fix uninitialized use of burst_length
      tests: Fix uninitialized byte in test_visitor_in_fuzz
      i386/kvm: initialize struct at full before ioctl call

Catherine Ho (1):
      migration: do not rom_reset() during incoming migration

Dr. David Alan Gilbert (4):
      mc146818rtc: Remove reset notifiers
      timer: Remove reset notifiers
      replay: Remove host_clock_last
      timer: last, remove last bits of last

Eduardo Habkost (1):
      HACKING: Document 'struct' keyword usage

Jan Kiszka (1):
      kvm: vmxcap: Enhance with latest features

Jing Liu (1):
      x86: Intel AVX512_BF16 feature enabling

Li Qiang (1):
      target-i386: kvm: 'kvm_get_supported_msrs' cleanup

Marcelo Tosatti (1):
      kvm: i386: halt poll control MSR support

Paolo Bonzini (5):
      block: fix NetBSD qemu-iotests failure
      9p: simplify source file selection
      memory: fix race between TCG and accesses to dirty bitmap
      kconfig: do not select VMMOUSE
      scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)

Pavel Dovgalyuk (8):
      replay: add missing fix for internal function
      replay: document development rules
      util/qemu-timer: refactor deadline calculation for external timers
      replay: fix replay shutdown
      replay: refine replay-time module
      replay: rename step-related variables and functions
      icount: clean up cpu_can_io at the entry to the block
      icount: remove unnecessary gen_io_end calls

Peter Maydell (1):
      target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions

Roman Kagan (2):
      cpus-common: nuke finish_safe_work
      cpus-common: assert BQL nesting within cpu-exclusive sections

Stefano Garzarella (3):
      loader: Handle memory-mapped ELFs
      elf-ops.h: Map into memory the ELF to load
      hw/i386/pc: Map into memory the initrd

Wanpeng Li (1):
      target-i386: adds PV_SCHED_YIELD CPUID feature bit

Wei Yang (1):
      test-bitmap: test set 1 bit case for bitmap_set

Yan Zhao (1):
      memory: assert on out of scope notification

tony.nguyen@bt.com (1):
      configure: Define target access alignment in configure

 HACKING                                     |  14 +-
 Kconfig.host                                |   1 +
 accel/tcg/cpu-exec.c                        |   1 -
 accel/tcg/translator.c                      |   1 -
 block/file-posix.c                          |   4 +-
 configure                                   |  12 +-
 cpus-common.c                               |  12 +-
 cpus.c                                      |  17 ++-
 docs/devel/replay.txt                       |  46 +++++++
 exec.c                                      |  31 +++++
 fsdev/Makefile.objs                         |   2 +-
 hw/9pfs/Kconfig                             |   5 +
 hw/core/loader.c                            |  47 +++++--
 hw/i386/Kconfig                             |   1 +
 hw/i386/pc.c                                |  17 ++-
 hw/scsi/lsi53c895a.c                        |  41 ++++--
 hw/timer/mc146818rtc.c                      |  19 ---
 include/exec/gen-icount.h                   |  44 +++---
 include/exec/memory.h                       |  12 ++
 include/exec/poison.h                       |   1 +
 include/hw/elf_ops.h                        |  71 ++++++----
 include/hw/i386/pc.h                        |   1 +
 include/hw/loader.h                         |   5 +-
 include/qemu/timer.h                        |  43 +-----
 include/qom/cpu.h                           |   2 +-
 include/standard-headers/asm-x86/kvm_para.h |   2 +
 include/sysemu/replay.h                     |   2 +-
 memory.c                                    |  16 ++-
 migration/ram.c                             |   1 +
 qtest.c                                     |   3 +-
 replay/replay-events.c                      |   2 +-
 replay/replay-internal.c                    |  10 +-
 replay/replay-internal.h                    |  10 +-
 replay/replay-snapshot.c                    |  13 +-
 replay/replay-time.c                        |  36 +++--
 replay/replay.c                             |  30 ++--
 scripts/kvm/vmxcap                          |   8 ++
 target/alpha/cpu.h                          |   2 -
 target/alpha/translate.c                    |   2 -
 target/arm/translate-a64.c                  |   4 -
 target/arm/translate.c                      |   7 -
 target/cris/translate.c                     |   2 -
 target/hppa/cpu.h                           |   1 -
 target/hppa/translate.c                     |   1 -
 target/i386/cpu.c                           |  43 +++++-
 target/i386/cpu.h                           |   8 ++
 target/i386/kvm.c                           | 205 +++++++++++++++-------------
 target/i386/machine.c                       |  20 +++
 target/i386/ops_sse.h                       |  88 ++++++++----
 target/i386/translate.c                     |  10 --
 target/lm32/translate.c                     |   9 --
 target/microblaze/translate.c               |   2 -
 target/mips/cpu.h                           |   2 -
 target/mips/translate.c                     |  11 --
 target/nios2/translate.c                    |   4 -
 target/ppc/translate.c                      |  13 --
 target/ppc/translate_init.inc.c             |   2 -
 target/riscv/insn_trans/trans_rvi.inc.c     |   1 -
 target/sh4/cpu.h                            |   2 -
 target/sparc/cpu.h                          |   2 -
 target/sparc/translate.c                    |  16 ---
 target/unicore32/translate.c                |   1 -
 target/xtensa/cpu.h                         |   2 -
 target/xtensa/translate.c                   |  15 --
 tcg/tcg.c                                   |   2 +-
 tcg/tcg.h                                   |   8 +-
 tests/ptimer-test-stubs.c                   |   4 +-
 tests/ptimer-test.c                         |   6 +-
 tests/test-bitmap.c                         |  12 ++
 tests/test-string-input-visitor.c           |   8 +-
 tests/test-throttle.c                       |   2 +
 util/qemu-timer.c                           |  71 ++++------
 72 files changed, 667 insertions(+), 504 deletions(-)
 create mode 100644 docs/devel/replay.txt
-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 01/36] kvm: i386: halt poll control MSR support
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 02/36] target-i386: adds PV_SCHED_YIELD CPUID feature bit Paolo Bonzini
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Marcelo Tosatti

From: Marcelo Tosatti <mtosatti@redhat.com>

Add support for halt poll control MSR: save/restore, migration
and new feature name.

The purpose of this MSR is to allow the guest to disable
host halt poll.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/standard-headers/asm-x86/kvm_para.h |  2 ++
 target/i386/cpu.c                           |  4 +++-
 target/i386/cpu.h                           |  1 +
 target/i386/kvm.c                           | 14 ++++++++++++++
 target/i386/machine.c                       | 20 ++++++++++++++++++++
 5 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/include/standard-headers/asm-x86/kvm_para.h b/include/standard-headers/asm-x86/kvm_para.h
index 35cd8d6..e171514 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -29,6 +29,7 @@
 #define KVM_FEATURE_PV_TLB_FLUSH	9
 #define KVM_FEATURE_ASYNC_PF_VMEXIT	10
 #define KVM_FEATURE_PV_SEND_IPI	11
+#define KVM_FEATURE_POLL_CONTROL	12
 
 #define KVM_HINTS_REALTIME      0
 
@@ -47,6 +48,7 @@
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
+#define MSR_KVM_POLL_CONTROL	0x4b564d05
 
 struct kvm_steal_time {
 	uint64_t steal;
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 19751e3..9a8f244 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -906,7 +906,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock",
             "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
             NULL, "kvm-pv-tlb-flush", NULL, "kvm-pv-ipi",
-            NULL, NULL, NULL, NULL,
+            "kvm-poll-control", NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             "kvmclock-stable-bit", NULL, NULL, NULL,
@@ -5660,6 +5660,8 @@ static void x86_cpu_initfn(Object *obj)
     object_property_add_alias(obj, "kvm_steal_time", obj, "kvm-steal-time", &error_abort);
     object_property_add_alias(obj, "kvm_pv_eoi", obj, "kvm-pv-eoi", &error_abort);
     object_property_add_alias(obj, "kvm_pv_unhalt", obj, "kvm-pv-unhalt", &error_abort);
+    object_property_add_alias(obj, "kvm_poll_control", obj, "kvm-poll-control",
+                              &error_abort);
     object_property_add_alias(obj, "svm_lock", obj, "svm-lock", &error_abort);
     object_property_add_alias(obj, "nrip_save", obj, "nrip-save", &error_abort);
     object_property_add_alias(obj, "tsc_scale", obj, "tsc-scale", &error_abort);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8b3dc55..44e42f5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1260,6 +1260,7 @@ typedef struct CPUX86State {
     uint64_t steal_time_msr;
     uint64_t async_pf_en_msr;
     uint64_t pv_eoi_en_msr;
+    uint64_t poll_control_msr;
 
     /* Partition-wide HV MSRs, will be updated only on the first vcpu */
     uint64_t msr_hv_hypercall;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index dbbb137..327c95a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1782,6 +1782,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
 
         hyperv_x86_synic_reset(cpu);
     }
+    /* enabled by default */
+    env->poll_control_msr = 1;
 }
 
 void kvm_arch_do_init_vcpu(X86CPU *cpu)
@@ -2490,6 +2492,11 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_STEAL_TIME)) {
             kvm_msr_entry_add(cpu, MSR_KVM_STEAL_TIME, env->steal_time_msr);
         }
+
+        if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_POLL_CONTROL)) {
+            kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
+        }
+
         if (has_architectural_pmu_version > 0) {
             if (has_architectural_pmu_version > 1) {
                 /* Stop the counter.  */
@@ -2875,6 +2882,9 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_STEAL_TIME)) {
         kvm_msr_entry_add(cpu, MSR_KVM_STEAL_TIME, 0);
     }
+    if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_POLL_CONTROL)) {
+        kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
+    }
     if (has_architectural_pmu_version > 0) {
         if (has_architectural_pmu_version > 1) {
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
@@ -3109,6 +3119,10 @@ static int kvm_get_msrs(X86CPU *cpu)
         case MSR_KVM_STEAL_TIME:
             env->steal_time_msr = msrs[i].data;
             break;
+        case MSR_KVM_POLL_CONTROL: {
+            env->poll_control_msr = msrs[i].data;
+            break;
+        }
         case MSR_CORE_PERF_FIXED_CTR_CTRL:
             env->msr_fixed_ctr_ctrl = msrs[i].data;
             break;
diff --git a/target/i386/machine.c b/target/i386/machine.c
index b114609..2ddd295 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -439,6 +439,14 @@ static const VMStateDescription vmstate_exception_info = {
     }
 };
 
+/* Poll control MSR enabled by default */
+static bool poll_control_msr_needed(void *opaque)
+{
+    X86CPU *cpu = opaque;
+
+    return cpu->env.poll_control_msr != 1;
+}
+
 static const VMStateDescription vmstate_steal_time_msr = {
     .name = "cpu/steal_time_msr",
     .version_id = 1,
@@ -472,6 +480,17 @@ static const VMStateDescription vmstate_pv_eoi_msr = {
     }
 };
 
+static const VMStateDescription vmstate_poll_control_msr = {
+    .name = "cpu/poll_control_msr",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = poll_control_msr_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT64(env.poll_control_msr, X86CPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static bool fpop_ip_dp_needed(void *opaque)
 {
     X86CPU *cpu = opaque;
@@ -1356,6 +1375,7 @@ VMStateDescription vmstate_x86_cpu = {
         &vmstate_async_pf_msr,
         &vmstate_pv_eoi_msr,
         &vmstate_steal_time_msr,
+        &vmstate_poll_control_msr,
         &vmstate_fpop_ip_dp,
         &vmstate_msr_tsc_adjust,
         &vmstate_msr_tscdeadline,
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 02/36] target-i386: adds PV_SCHED_YIELD CPUID feature bit
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 01/36] kvm: i386: halt poll control MSR support Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 03/36] loader: Handle memory-mapped ELFs Paolo Bonzini
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Wanpeng Li, Eduardo Habkost, Radim Krčmář

From: Wanpeng Li <wanpengli@tencent.com>

Adds PV_SCHED_YIELD CPUID feature bit.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1562745771-8414-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9a8f244..b07bc06 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -906,7 +906,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock",
             "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
             NULL, "kvm-pv-tlb-flush", NULL, "kvm-pv-ipi",
-            "kvm-poll-control", NULL, NULL, NULL,
+            "kvm-poll-control", "kvm-pv-sched-yield", NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             "kvmclock-stable-bit", NULL, NULL, NULL,
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 03/36] loader: Handle memory-mapped ELFs
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 01/36] kvm: i386: halt poll control MSR support Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 02/36] target-i386: adds PV_SCHED_YIELD CPUID feature bit Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 04/36] elf-ops.h: Map into memory the ELF to load Paolo Bonzini
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefano Garzarella

From: Stefano Garzarella <sgarzare@redhat.com>

This patch allows handling an ELF memory-mapped, taking care
the reference count of the GMappedFile* passed through
rom_add_elf_program().
In this case, the 'data' pointer is not heap-allocated, so
we cannot free it.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20190724143105.307042-2-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/loader.c     | 38 ++++++++++++++++++++++++++++++--------
 include/hw/elf_ops.h |  2 +-
 include/hw/loader.h  |  5 +++--
 3 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 425bf69..9fb93a6 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -836,6 +836,7 @@ struct Rom {
     int isrom;
     char *fw_dir;
     char *fw_file;
+    GMappedFile *mapped_file;
 
     bool committed;
 
@@ -846,10 +847,25 @@ struct Rom {
 static FWCfgState *fw_cfg;
 static QTAILQ_HEAD(, Rom) roms = QTAILQ_HEAD_INITIALIZER(roms);
 
-/* rom->data must be heap-allocated (do not use with rom_add_elf_program()) */
+/*
+ * rom->data can be heap-allocated or memory-mapped (e.g. when added with
+ * rom_add_elf_program())
+ */
+static void rom_free_data(Rom *rom)
+{
+    if (rom->mapped_file) {
+        g_mapped_file_unref(rom->mapped_file);
+        rom->mapped_file = NULL;
+    } else {
+        g_free(rom->data);
+    }
+
+    rom->data = NULL;
+}
+
 static void rom_free(Rom *rom)
 {
-    g_free(rom->data);
+    rom_free_data(rom);
     g_free(rom->path);
     g_free(rom->name);
     g_free(rom->fw_dir);
@@ -1056,11 +1072,12 @@ MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
 
 /* This function is specific for elf program because we don't need to allocate
  * all the rom. We just allocate the first part and the rest is just zeros. This
- * is why romsize and datasize are different. Also, this function seize the
- * memory ownership of "data", so we don't have to allocate and copy the buffer.
+ * is why romsize and datasize are different. Also, this function takes its own
+ * reference to "mapped_file", so we don't have to allocate and copy the buffer.
  */
-int rom_add_elf_program(const char *name, void *data, size_t datasize,
-                        size_t romsize, hwaddr addr, AddressSpace *as)
+int rom_add_elf_program(const char *name, GMappedFile *mapped_file, void *data,
+                        size_t datasize, size_t romsize, hwaddr addr,
+                        AddressSpace *as)
 {
     Rom *rom;
 
@@ -1071,6 +1088,12 @@ int rom_add_elf_program(const char *name, void *data, size_t datasize,
     rom->romsize  = romsize;
     rom->data     = data;
     rom->as       = as;
+
+    if (mapped_file && data) {
+        g_mapped_file_ref(mapped_file);
+        rom->mapped_file = mapped_file;
+    }
+
     rom_insert(rom);
     return 0;
 }
@@ -1105,8 +1128,7 @@ static void rom_reset(void *unused)
         }
         if (rom->isrom) {
             /* rom needs to be written only once */
-            g_free(rom->data);
-            rom->data = NULL;
+            rom_free_data(rom);
         }
         /*
          * The rom loader is really on the same level as firmware in the guest
diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 690f923..fede37e 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -525,7 +525,7 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                     snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
 
                     /* rom_add_elf_program() seize the ownership of 'data' */
-                    rom_add_elf_program(label, data, file_size, mem_size,
+                    rom_add_elf_program(label, NULL, data, file_size, mem_size,
                                         addr, as);
                 } else {
                     address_space_write(as ? as : &address_space_memory,
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 3e1b3a4..07fd928 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -258,8 +258,9 @@ MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
                            FWCfgCallback fw_callback,
                            void *callback_opaque, AddressSpace *as,
                            bool read_only);
-int rom_add_elf_program(const char *name, void *data, size_t datasize,
-                        size_t romsize, hwaddr addr, AddressSpace *as);
+int rom_add_elf_program(const char *name, GMappedFile *mapped_file, void *data,
+                        size_t datasize, size_t romsize, hwaddr addr,
+                        AddressSpace *as);
 int rom_check_and_register_reset(void);
 void rom_set_fw(FWCfgState *f);
 void rom_set_order_override(int order);
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 04/36] elf-ops.h: Map into memory the ELF to load
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (2 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 03/36] loader: Handle memory-mapped ELFs Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 05/36] hw/i386/pc: Map into memory the initrd Paolo Bonzini
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefano Garzarella

From: Stefano Garzarella <sgarzare@redhat.com>

In order to reduce the memory footprint we map into memory
the ELF to load using g_mapped_file_new_from_fd() instead of
reading each sections. In this way we can share the ELF pages
between multiple instances of QEMU.

Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20190724143105.307042-3-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/hw/elf_ops.h | 71 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 42 insertions(+), 29 deletions(-)

diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index fede37e..1496d7e 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -323,8 +323,9 @@ static int glue(load_elf, SZ)(const char *name, int fd,
     struct elfhdr ehdr;
     struct elf_phdr *phdr = NULL, *ph;
     int size, i, total_size;
-    elf_word mem_size, file_size;
+    elf_word mem_size, file_size, data_offset;
     uint64_t addr, low = (uint64_t)-1, high = 0;
+    GMappedFile *mapped_file = NULL;
     uint8_t *data = NULL;
     char label[128];
     int ret = ELF_LOAD_FAILED;
@@ -409,20 +410,32 @@ static int glue(load_elf, SZ)(const char *name, int fd,
         }
     }
 
+    /*
+     * Since we want to be able to modify the mapped buffer, we set the
+     * 'writeble' parameter to 'true'. Modifications to the buffer are not
+     * written back to the file.
+     */
+    mapped_file = g_mapped_file_new_from_fd(fd, true, NULL);
+    if (!mapped_file) {
+        goto fail;
+    }
+
     total_size = 0;
     for(i = 0; i < ehdr.e_phnum; i++) {
         ph = &phdr[i];
         if (ph->p_type == PT_LOAD) {
             mem_size = ph->p_memsz; /* Size of the ROM */
             file_size = ph->p_filesz; /* Size of the allocated data */
-            data = g_malloc0(file_size);
-            if (ph->p_filesz > 0) {
-                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
-                    goto fail;
-                }
-                if (read(fd, data, file_size) != file_size) {
+            data_offset = ph->p_offset; /* Offset where the data is located */
+
+            if (file_size > 0) {
+                if (g_mapped_file_get_length(mapped_file) <
+                    file_size + data_offset) {
                     goto fail;
                 }
+
+                data = (uint8_t *)g_mapped_file_get_contents(mapped_file);
+                data += data_offset;
             }
 
             /* The ELF spec is somewhat vague about the purpose of the
@@ -513,25 +526,25 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                 *pentry = ehdr.e_entry - ph->p_vaddr + ph->p_paddr;
             }
 
-            if (mem_size == 0) {
-                /* Some ELF files really do have segments of zero size;
-                 * just ignore them rather than trying to create empty
-                 * ROM blobs, because the zero-length blob can falsely
-                 * trigger the overlapping-ROM-blobs check.
-                 */
-                g_free(data);
-            } else {
+            /* Some ELF files really do have segments of zero size;
+             * just ignore them rather than trying to create empty
+             * ROM blobs, because the zero-length blob can falsely
+             * trigger the overlapping-ROM-blobs check.
+             */
+            if (mem_size != 0) {
                 if (load_rom) {
                     snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
 
-                    /* rom_add_elf_program() seize the ownership of 'data' */
-                    rom_add_elf_program(label, NULL, data, file_size, mem_size,
-                                        addr, as);
+                    /*
+                     * rom_add_elf_program() takes its own reference to
+                     * 'mapped_file'.
+                     */
+                    rom_add_elf_program(label, mapped_file, data, file_size,
+                                        mem_size, addr, as);
                 } else {
                     address_space_write(as ? as : &address_space_memory,
                                         addr, MEMTXATTRS_UNSPECIFIED,
                                         data, file_size);
-                    g_free(data);
                 }
             }
 
@@ -547,14 +560,16 @@ static int glue(load_elf, SZ)(const char *name, int fd,
             struct elf_note *nhdr = NULL;
 
             file_size = ph->p_filesz; /* Size of the range of ELF notes */
-            data = g_malloc0(file_size);
-            if (ph->p_filesz > 0) {
-                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
-                    goto fail;
-                }
-                if (read(fd, data, file_size) != file_size) {
+            data_offset = ph->p_offset; /* Offset where the notes are located */
+
+            if (file_size > 0) {
+                if (g_mapped_file_get_length(mapped_file) <
+                    file_size + data_offset) {
                     goto fail;
                 }
+
+                data = (uint8_t *)g_mapped_file_get_contents(mapped_file);
+                data += data_offset;
             }
 
             /*
@@ -570,19 +585,17 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                     sizeof(struct elf_note) == sizeof(struct elf64_note);
                 elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);
             }
-            g_free(data);
             data = NULL;
         }
     }
 
-    g_free(phdr);
     if (lowaddr)
         *lowaddr = (uint64_t)(elf_sword)low;
     if (highaddr)
         *highaddr = (uint64_t)(elf_sword)high;
-    return total_size;
+    ret = total_size;
  fail:
-    g_free(data);
+    g_mapped_file_unref(mapped_file);
     g_free(phdr);
     return ret;
 }
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 05/36] hw/i386/pc: Map into memory the initrd
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (3 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 04/36] elf-ops.h: Map into memory the ELF to load Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 06/36] memory: assert on out of scope notification Paolo Bonzini
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefano Garzarella

From: Stefano Garzarella <sgarzare@redhat.com>

In order to reduce the memory footprint we map into memory
the initrd using g_mapped_file_new() instead of reading it.
In this way we can share the initrd pages between multiple
instances of QEMU.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20190724143105.307042-4-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/pc.c         | 17 +++++++++++++----
 include/hw/i386/pc.h |  1 +
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437..96f6b89 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1241,17 +1241,21 @@ static void load_linux(PCMachineState *pcms,
 
             /* load initrd */
             if (initrd_filename) {
+                GMappedFile *mapped_file;
                 gsize initrd_size;
                 gchar *initrd_data;
                 GError *gerr = NULL;
 
-                if (!g_file_get_contents(initrd_filename, &initrd_data,
-                            &initrd_size, &gerr)) {
+                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
+                if (!mapped_file) {
                     fprintf(stderr, "qemu: error reading initrd %s: %s\n",
                             initrd_filename, gerr->message);
                     exit(1);
                 }
+                pcms->initrd_mapped_file = mapped_file;
 
+                initrd_data = g_mapped_file_get_contents(mapped_file);
+                initrd_size = g_mapped_file_get_length(mapped_file);
                 initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
                 if (initrd_size >= initrd_max) {
                     fprintf(stderr, "qemu: initrd is too large, cannot support."
@@ -1378,6 +1382,7 @@ static void load_linux(PCMachineState *pcms,
 
     /* load initrd */
     if (initrd_filename) {
+        GMappedFile *mapped_file;
         gsize initrd_size;
         gchar *initrd_data;
         GError *gerr = NULL;
@@ -1387,12 +1392,16 @@ static void load_linux(PCMachineState *pcms,
             exit(1);
         }
 
-        if (!g_file_get_contents(initrd_filename, &initrd_data,
-                                 &initrd_size, &gerr)) {
+        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
+        if (!mapped_file) {
             fprintf(stderr, "qemu: error reading initrd %s: %s\n",
                     initrd_filename, gerr->message);
             exit(1);
         }
+        pcms->initrd_mapped_file = mapped_file;
+
+        initrd_data = g_mapped_file_get_contents(mapped_file);
+        initrd_size = g_mapped_file_get_length(mapped_file);
         if (initrd_size >= initrd_max) {
             fprintf(stderr, "qemu: initrd is too large, cannot support."
                     "(max: %"PRIu32", need %"PRId64")\n",
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 859b64c..44edc69 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -42,6 +42,7 @@ struct PCMachineState {
     FWCfgState *fw_cfg;
     qemu_irq *gsi;
     PFlashCFI01 *flash[2];
+    GMappedFile *initrd_mapped_file;
 
     /* Configuration options: */
     uint64_t max_ram_below_4g;
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 06/36] memory: assert on out of scope notification
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (4 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 05/36] hw/i386/pc: Map into memory the initrd Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 07/36] configure: Define target access alignment in configure Paolo Bonzini
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Auger, Yan Zhao

From: Yan Zhao <yan.y.zhao@intel.com>

It is wrong for an entry to have parts out of scope of notifier's range.
assert this condition.

Out of scope mapping/unmapping would cause problem, as in below case:

1. initially there are two notifiers with ranges
0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.

2. in vfio, memory_region_register_iommu_notifier() is followed by
memory_region_iommu_replay(), which will first call address space
unmap,
and walk and add back all entries in vtd shadow page table. e.g.
(1) for notifier 0-0xfedfffff,
    IOVAs from 0 - 0xffffffff get unmapped,
    and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
(2) for notifier 0xfef00000-0xffffffffffffffff
    IOVAs from 0 - 0x7fffffffff get unmapped,
    but IOVAs from 0x3c000000 - 0x3c1fffff cannot get mapped back.

Cc: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>

Message-Id: <1561432878-13754-1-git-send-email-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 memory.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/memory.c b/memory.c
index 5d8c9a9..e42d63a 100644
--- a/memory.c
+++ b/memory.c
@@ -1942,16 +1942,18 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
                               IOMMUTLBEntry *entry)
 {
     IOMMUNotifierFlag request_flags;
+    hwaddr entry_end = entry->iova + entry->addr_mask;
 
     /*
      * Skip the notification if the notification does not overlap
      * with registered range.
      */
-    if (notifier->start > entry->iova + entry->addr_mask ||
-        notifier->end < entry->iova) {
+    if (notifier->start > entry_end || notifier->end < entry->iova) {
         return;
     }
 
+    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+
     if (entry->perm & IOMMU_RW) {
         request_flags = IOMMU_NOTIFIER_MAP;
     } else {
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 07/36] configure: Define target access alignment in configure
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (5 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 06/36] memory: assert on out of scope notification Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 08/36] block: fix NetBSD qemu-iotests failure Paolo Bonzini
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: tony.nguyen

From: "tony.nguyen@bt.com" <tony.nguyen@bt.com>

This patch moves the define of target access alignment earlier from
target/foo/cpu.h to configure.

Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp is now
accelerator independent MemOp"

Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Message-Id: <11e818d38ebc40e986cfa62dd7d0afdc@tpw09926dag18e.domain1.systemhost.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: tony.nguyen@bt.com <tony.nguyen@bt.com>
---
 configure             | 12 ++++++++++--
 include/exec/poison.h |  1 +
 include/qom/cpu.h     |  2 +-
 target/alpha/cpu.h    |  2 --
 target/hppa/cpu.h     |  1 -
 target/mips/cpu.h     |  2 --
 target/sh4/cpu.h      |  2 --
 target/sparc/cpu.h    |  2 --
 target/xtensa/cpu.h   |  2 --
 tcg/tcg.c             |  2 +-
 tcg/tcg.h             |  8 +++++---
 11 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/configure b/configure
index 714e7fb..942a73b 100755
--- a/configure
+++ b/configure
@@ -7431,11 +7431,16 @@ for target in $target_list; do
 target_dir="$target"
 config_target_mak=$target_dir/config-target.mak
 target_name=$(echo $target | cut -d '-' -f 1)
+target_aligned_only="no"
+case "$target_name" in
+  alpha|hppa|mips64el|mips64|mipsel|mips|mipsn32|mipsn32el|sh4|sh4eb|sparc|sparc64|sparc32plus|xtensa|xtensaeb)
+  target_aligned_only="yes"
+  ;;
+esac
 target_bigendian="no"
-
 case "$target_name" in
   armeb|aarch64_be|hppa|lm32|m68k|microblaze|mips|mipsn32|mips64|moxie|or1k|ppc|ppc64|ppc64abi32|s390x|sh4eb|sparc|sparc64|sparc32plus|xtensaeb)
-  target_bigendian=yes
+  target_bigendian="yes"
   ;;
 esac
 target_softmmu="no"
@@ -7717,6 +7722,9 @@ fi
 if supported_whpx_target $target; then
     echo "CONFIG_WHPX=y" >> $config_target_mak
 fi
+if test "$target_aligned_only" = "yes" ; then
+  echo "TARGET_ALIGNED_ONLY=y" >> $config_target_mak
+fi
 if test "$target_bigendian" = "yes" ; then
   echo "TARGET_WORDS_BIGENDIAN=y" >> $config_target_mak
 fi
diff --git a/include/exec/poison.h b/include/exec/poison.h
index b862320..955eb86 100644
--- a/include/exec/poison.h
+++ b/include/exec/poison.h
@@ -35,6 +35,7 @@
 #pragma GCC poison TARGET_UNICORE32
 #pragma GCC poison TARGET_XTENSA
 
+#pragma GCC poison TARGET_ALIGNED_ONLY
 #pragma GCC poison TARGET_HAS_BFLT
 #pragma GCC poison TARGET_NAME
 #pragma GCC poison TARGET_SUPPORTS_MTTCG
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 5ee0046..9b50b73 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -89,7 +89,7 @@ struct TranslationBlock;
  * @do_unassigned_access: Callback for unassigned access handling.
  * (this is deprecated: new targets should use do_transaction_failed instead)
  * @do_unaligned_access: Callback for unaligned access handling, if
- * the target defines #ALIGNED_ONLY.
+ * the target defines #TARGET_ALIGNED_ONLY.
  * @do_transaction_failed: Callback for handling failed memory transactions
  * (ie bus faults or external aborts; not MMU faults)
  * @virtio_is_big_endian: Callback to return %true if a CPU which supports
diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
index b3e8a82..16eb804 100644
--- a/target/alpha/cpu.h
+++ b/target/alpha/cpu.h
@@ -23,8 +23,6 @@
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 
-#define ALIGNED_ONLY
-
 /* Alpha processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO      (0)
 
diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index aab251b..2be67c2 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -30,7 +30,6 @@
    basis.  It's probably easier to fall back to a strong memory model.  */
 #define TCG_GUEST_DEFAULT_MO        TCG_MO_ALL
 
-#define ALIGNED_ONLY
 #define MMU_KERNEL_IDX   0
 #define MMU_USER_IDX     3
 #define MMU_PHYS_IDX     4
diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index 21c0615..c13cd4e 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -1,8 +1,6 @@
 #ifndef MIPS_CPU_H
 #define MIPS_CPU_H
 
-#define ALIGNED_ONLY
-
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 #include "fpu/softfloat.h"
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index aee733e..ecaa7a1 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -23,8 +23,6 @@
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 
-#define ALIGNED_ONLY
-
 /* CPU Subtypes */
 #define SH_CPU_SH7750  (1 << 0)
 #define SH_CPU_SH7750S (1 << 1)
diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
index 8ed2250..1406f0b 100644
--- a/target/sparc/cpu.h
+++ b/target/sparc/cpu.h
@@ -5,8 +5,6 @@
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
 
-#define ALIGNED_ONLY
-
 #if !defined(TARGET_SPARC64)
 #define TARGET_DPREGS 16
 #else
diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index 2c27713..0459243 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -32,8 +32,6 @@
 #include "exec/cpu-defs.h"
 #include "xtensa-isa.h"
 
-#define ALIGNED_ONLY
-
 /* Xtensa processors have a weak memory model */
 #define TCG_GUEST_DEFAULT_MO      (0)
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index be2c33c..8d23fb0 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1926,7 +1926,7 @@ static const char * const ldst_name[] =
 };
 
 static const char * const alignment_name[(MO_AMASK >> MO_ASHIFT) + 1] = {
-#ifdef ALIGNED_ONLY
+#ifdef TARGET_ALIGNED_ONLY
     [MO_UNALN >> MO_ASHIFT]    = "un+",
     [MO_ALIGN >> MO_ASHIFT]    = "",
 #else
diff --git a/tcg/tcg.h b/tcg/tcg.h
index b411e17..529acb2 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -333,10 +333,12 @@ typedef enum TCGMemOp {
     MO_TE    = MO_LE,
 #endif
 
-    /* MO_UNALN accesses are never checked for alignment.
+    /*
+     * MO_UNALN accesses are never checked for alignment.
      * MO_ALIGN accesses will result in a call to the CPU's
      * do_unaligned_access hook if the guest address is not aligned.
-     * The default depends on whether the target CPU defines ALIGNED_ONLY.
+     * The default depends on whether the target CPU defines
+     * TARGET_ALIGNED_ONLY.
      *
      * Some architectures (e.g. ARMv8) need the address which is aligned
      * to a size more than the size of the memory access.
@@ -353,7 +355,7 @@ typedef enum TCGMemOp {
      */
     MO_ASHIFT = 4,
     MO_AMASK = 7 << MO_ASHIFT,
-#ifdef ALIGNED_ONLY
+#ifdef TARGET_ALIGNED_ONLY
     MO_ALIGN = 0,
     MO_UNALN = MO_AMASK,
 #else
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 08/36] block: fix NetBSD qemu-iotests failure
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (6 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 07/36] configure: Define target access alignment in configure Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 09/36] 9p: simplify source file selection Paolo Bonzini
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel

Opening a block device on NetBSD has an additional step compared to other OSes,
corresponding to raw_normalize_devicepath.  The error message in that function
is slightly different from that in raw_open_common and this was causing spurious
failures in qemu-iotests.  However, in general it is not important to know what
exact step was failing, for example in the qemu-iotests case the error message
contains the fairly unequivocal "No such file or directory" text from strerror.
We can thus fix the failures by standardizing on a single error message for
both raw_open_common and raw_normalize_devicepath; in fact, we can even
use error_setg_file_open to make sure the error message is the same as in
the rest of QEMU.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/file-posix.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 4479cc7..da1fde1 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -217,7 +217,7 @@ static int raw_normalize_devicepath(const char **filename, Error **errp)
     fname = *filename;
     dp = strrchr(fname, '/');
     if (lstat(fname, &sb) < 0) {
-        error_setg_errno(errp, errno, "%s: stat failed", fname);
+        error_setg_file_open(errp, errno, fname);
         return -errno;
     }
 
@@ -547,7 +547,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     ret = fd < 0 ? -errno : 0;
 
     if (ret < 0) {
-        error_setg_errno(errp, -ret, "Could not open '%s'", filename);
+        error_setg_file_open(errp, -ret, filename);
         if (ret == -EROFS) {
             ret = -EACCES;
         }
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 09/36] 9p: simplify source file selection
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (7 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 08/36] block: fix NetBSD qemu-iotests failure Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2020-11-03 20:31   ` Philippe Mathieu-Daudé
  2019-08-20  6:59 ` [Qemu-devel] [PULL 10/36] target-i386: kvm: 'kvm_get_supported_msrs' cleanup Paolo Bonzini
                   ` (29 subsequent siblings)
  38 siblings, 1 reply; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Marc-André Lureau

Express the complex conditions in Kconfig rather than Makefiles, since Kconfig
is better suited at expressing dependencies and detecting contradictions.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Kconfig.host        | 1 +
 fsdev/Makefile.objs | 2 +-
 hw/9pfs/Kconfig     | 5 +++++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Kconfig.host b/Kconfig.host
index aec9536..bb6e116 100644
--- a/Kconfig.host
+++ b/Kconfig.host
@@ -28,6 +28,7 @@ config VHOST_USER
 
 config XEN
     bool
+    select FSDEV_9P if VIRTFS
 
 config VIRTFS
     bool
diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
index 24bbb3e..42cd70c 100644
--- a/fsdev/Makefile.objs
+++ b/fsdev/Makefile.objs
@@ -1,6 +1,6 @@
 # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
 # only pull in the actual 9p backend if we also enabled virtio or xen.
-ifeq ($(call land,$(CONFIG_VIRTFS),$(call lor,$(CONFIG_VIRTIO_9P),$(CONFIG_XEN))),y)
+ifeq ($(CONFIG_FSDEV_9P),y)
 common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
 else
 common-obj-y = qemu-fsdev-dummy.o
diff --git a/hw/9pfs/Kconfig b/hw/9pfs/Kconfig
index 8c5032c..3ae5749 100644
--- a/hw/9pfs/Kconfig
+++ b/hw/9pfs/Kconfig
@@ -1,4 +1,9 @@
+config FSDEV_9P
+    bool
+    depends on VIRTFS
+
 config VIRTIO_9P
     bool
     default y
     depends on VIRTFS && VIRTIO
+    select FSDEV_9P
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 10/36] target-i386: kvm: 'kvm_get_supported_msrs' cleanup
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (8 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 09/36] 9p: simplify source file selection Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 11/36] test-throttle: Fix uninitialized use of burst_length Paolo Bonzini
                   ` (28 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Li Qiang

From: Li Qiang <liq3ea@163.com>

Function 'kvm_get_supported_msrs' is only called once
now, get rid of the static variable 'kvm_supported_msrs'.

Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190725151639.21693-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm.c | 185 +++++++++++++++++++++++++++---------------------------
 1 file changed, 91 insertions(+), 94 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 327c95a..a66b956 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1839,108 +1839,105 @@ static int kvm_get_supported_feature_msrs(KVMState *s)
 
 static int kvm_get_supported_msrs(KVMState *s)
 {
-    static int kvm_supported_msrs;
     int ret = 0;
+    struct kvm_msr_list msr_list, *kvm_msr_list;
 
-    /* first time */
-    if (kvm_supported_msrs == 0) {
-        struct kvm_msr_list msr_list, *kvm_msr_list;
+    /*
+     *  Obtain MSR list from KVM.  These are the MSRs that we must
+     *  save/restore.
+     */
+    msr_list.nmsrs = 0;
+    ret = kvm_ioctl(s, KVM_GET_MSR_INDEX_LIST, &msr_list);
+    if (ret < 0 && ret != -E2BIG) {
+        return ret;
+    }
+    /*
+     * Old kernel modules had a bug and could write beyond the provided
+     * memory. Allocate at least a safe amount of 1K.
+     */
+    kvm_msr_list = g_malloc0(MAX(1024, sizeof(msr_list) +
+                                          msr_list.nmsrs *
+                                          sizeof(msr_list.indices[0])));
 
-        kvm_supported_msrs = -1;
+    kvm_msr_list->nmsrs = msr_list.nmsrs;
+    ret = kvm_ioctl(s, KVM_GET_MSR_INDEX_LIST, kvm_msr_list);
+    if (ret >= 0) {
+        int i;
 
-        /* Obtain MSR list from KVM.  These are the MSRs that we must
-         * save/restore */
-        msr_list.nmsrs = 0;
-        ret = kvm_ioctl(s, KVM_GET_MSR_INDEX_LIST, &msr_list);
-        if (ret < 0 && ret != -E2BIG) {
-            return ret;
-        }
-        /* Old kernel modules had a bug and could write beyond the provided
-           memory. Allocate at least a safe amount of 1K. */
-        kvm_msr_list = g_malloc0(MAX(1024, sizeof(msr_list) +
-                                              msr_list.nmsrs *
-                                              sizeof(msr_list.indices[0])));
-
-        kvm_msr_list->nmsrs = msr_list.nmsrs;
-        ret = kvm_ioctl(s, KVM_GET_MSR_INDEX_LIST, kvm_msr_list);
-        if (ret >= 0) {
-            int i;
-
-            for (i = 0; i < kvm_msr_list->nmsrs; i++) {
-                switch (kvm_msr_list->indices[i]) {
-                case MSR_STAR:
-                    has_msr_star = true;
-                    break;
-                case MSR_VM_HSAVE_PA:
-                    has_msr_hsave_pa = true;
-                    break;
-                case MSR_TSC_AUX:
-                    has_msr_tsc_aux = true;
-                    break;
-                case MSR_TSC_ADJUST:
-                    has_msr_tsc_adjust = true;
-                    break;
-                case MSR_IA32_TSCDEADLINE:
-                    has_msr_tsc_deadline = true;
-                    break;
-                case MSR_IA32_SMBASE:
-                    has_msr_smbase = true;
-                    break;
-                case MSR_SMI_COUNT:
-                    has_msr_smi_count = true;
-                    break;
-                case MSR_IA32_MISC_ENABLE:
-                    has_msr_misc_enable = true;
-                    break;
-                case MSR_IA32_BNDCFGS:
-                    has_msr_bndcfgs = true;
-                    break;
-                case MSR_IA32_XSS:
-                    has_msr_xss = true;
-                    break;
-                case HV_X64_MSR_CRASH_CTL:
-                    has_msr_hv_crash = true;
-                    break;
-                case HV_X64_MSR_RESET:
-                    has_msr_hv_reset = true;
-                    break;
-                case HV_X64_MSR_VP_INDEX:
-                    has_msr_hv_vpindex = true;
-                    break;
-                case HV_X64_MSR_VP_RUNTIME:
-                    has_msr_hv_runtime = true;
-                    break;
-                case HV_X64_MSR_SCONTROL:
-                    has_msr_hv_synic = true;
-                    break;
-                case HV_X64_MSR_STIMER0_CONFIG:
-                    has_msr_hv_stimer = true;
-                    break;
-                case HV_X64_MSR_TSC_FREQUENCY:
-                    has_msr_hv_frequencies = true;
-                    break;
-                case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
-                    has_msr_hv_reenlightenment = true;
-                    break;
-                case MSR_IA32_SPEC_CTRL:
-                    has_msr_spec_ctrl = true;
-                    break;
-                case MSR_VIRT_SSBD:
-                    has_msr_virt_ssbd = true;
-                    break;
-                case MSR_IA32_ARCH_CAPABILITIES:
-                    has_msr_arch_capabs = true;
-                    break;
-                case MSR_IA32_CORE_CAPABILITY:
-                    has_msr_core_capabs = true;
-                    break;
-                }
+        for (i = 0; i < kvm_msr_list->nmsrs; i++) {
+            switch (kvm_msr_list->indices[i]) {
+            case MSR_STAR:
+                has_msr_star = true;
+                break;
+            case MSR_VM_HSAVE_PA:
+                has_msr_hsave_pa = true;
+                break;
+            case MSR_TSC_AUX:
+                has_msr_tsc_aux = true;
+                break;
+            case MSR_TSC_ADJUST:
+                has_msr_tsc_adjust = true;
+                break;
+            case MSR_IA32_TSCDEADLINE:
+                has_msr_tsc_deadline = true;
+                break;
+            case MSR_IA32_SMBASE:
+                has_msr_smbase = true;
+                break;
+            case MSR_SMI_COUNT:
+                has_msr_smi_count = true;
+                break;
+            case MSR_IA32_MISC_ENABLE:
+                has_msr_misc_enable = true;
+                break;
+            case MSR_IA32_BNDCFGS:
+                has_msr_bndcfgs = true;
+                break;
+            case MSR_IA32_XSS:
+                has_msr_xss = true;
+                break;
+            case HV_X64_MSR_CRASH_CTL:
+                has_msr_hv_crash = true;
+                break;
+            case HV_X64_MSR_RESET:
+                has_msr_hv_reset = true;
+                break;
+            case HV_X64_MSR_VP_INDEX:
+                has_msr_hv_vpindex = true;
+                break;
+            case HV_X64_MSR_VP_RUNTIME:
+                has_msr_hv_runtime = true;
+                break;
+            case HV_X64_MSR_SCONTROL:
+                has_msr_hv_synic = true;
+                break;
+            case HV_X64_MSR_STIMER0_CONFIG:
+                has_msr_hv_stimer = true;
+                break;
+            case HV_X64_MSR_TSC_FREQUENCY:
+                has_msr_hv_frequencies = true;
+                break;
+            case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
+                has_msr_hv_reenlightenment = true;
+                break;
+            case MSR_IA32_SPEC_CTRL:
+                has_msr_spec_ctrl = true;
+                break;
+            case MSR_VIRT_SSBD:
+                has_msr_virt_ssbd = true;
+                break;
+            case MSR_IA32_ARCH_CAPABILITIES:
+                has_msr_arch_capabs = true;
+                break;
+            case MSR_IA32_CORE_CAPABILITY:
+                has_msr_core_capabs = true;
+                break;
             }
         }
-
-        g_free(kvm_msr_list);
     }
 
+    g_free(kvm_msr_list);
+
     return ret;
 }
 
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 11/36] test-throttle: Fix uninitialized use of burst_length
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (9 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 10/36] target-i386: kvm: 'kvm_get_supported_msrs' cleanup Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 12/36] tests: Fix uninitialized byte in test_visitor_in_fuzz Paolo Bonzini
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Andrey Shinkevich

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

ThrottleState::cfg of the static variable 'ts' is reassigned with the
local one in the do_test_accounting() and then is passed to the
throttle_account() with uninitialized member LeakyBucket::burst_length.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <1564502498-805893-2-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/test-throttle.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/test-throttle.c b/tests/test-throttle.c
index a288122..ebf3aeb 100644
--- a/tests/test-throttle.c
+++ b/tests/test-throttle.c
@@ -557,6 +557,8 @@ static bool do_test_accounting(bool is_ops, /* are we testing bps or ops */
     BucketType index;
     int i;
 
+    throttle_config_init(&cfg);
+
     for (i = 0; i < 3; i++) {
         BucketType index = to_test[is_ops][i];
         cfg.buckets[index].avg = avg;
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 12/36] tests: Fix uninitialized byte in test_visitor_in_fuzz
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (10 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 11/36] test-throttle: Fix uninitialized use of burst_length Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 13/36] i386/kvm: initialize struct at full before ioctl call Paolo Bonzini
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Andrey Shinkevich

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

One byte in the local buffer stays uninitialized, at least with the
first iteration, because of the double decrement in the
test_visitor_in_fuzz(). This is what Valgrind does not like and not
critical for the test itself. So, reduce the number of the memory
issues reports.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <1564502498-805893-3-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/test-string-input-visitor.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tests/test-string-input-visitor.c b/tests/test-string-input-visitor.c
index 34b54df..5418e08 100644
--- a/tests/test-string-input-visitor.c
+++ b/tests/test-string-input-visitor.c
@@ -444,16 +444,14 @@ static void test_visitor_in_fuzz(TestInputVisitorData *data,
     char buf[10000];
 
     for (i = 0; i < 100; i++) {
-        unsigned int j;
+        unsigned int j, k;
 
         j = g_test_rand_int_range(0, sizeof(buf) - 1);
 
         buf[j] = '\0';
 
-        if (j != 0) {
-            for (j--; j != 0; j--) {
-                buf[j - 1] = (char)g_test_rand_int_range(0, 256);
-            }
+        for (k = 0; k != j; k++) {
+            buf[k] = (char)g_test_rand_int_range(0, 256);
         }
 
         v = visitor_input_test_init(data, buf);
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 13/36] i386/kvm: initialize struct at full before ioctl call
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (11 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 12/36] tests: Fix uninitialized byte in test_visitor_in_fuzz Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 14/36] target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions Paolo Bonzini
                   ` (25 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Andrey Shinkevich

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Not the whole structure is initialized before passing it to the KVM.
Reduce the number of Valgrind reports.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <1564502498-805893-4-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/kvm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index a66b956..ce3f1c3 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -190,6 +190,7 @@ static int kvm_get_tsc(CPUState *cs)
         return 0;
     }
 
+    memset(&msr_data, 0, sizeof(msr_data));
     msr_data.info.nmsrs = 1;
     msr_data.entries[0].index = MSR_IA32_TSC;
     env->tsc_valid = !runstate_is_running();
@@ -1706,6 +1707,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
     if (has_xsave) {
         env->xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
+        memset(env->xsave_buf, 0, sizeof(struct kvm_xsave));
     }
 
     max_nested_state_len = kvm_max_nested_state_length();
@@ -3488,6 +3490,7 @@ static int kvm_put_debugregs(X86CPU *cpu)
         return 0;
     }
 
+    memset(&dbgregs, 0, sizeof(dbgregs));
     for (i = 0; i < 4; i++) {
         dbgregs.db[i] = env->dr[i];
     }
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 14/36] target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (12 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 13/36] i386/kvm: initialize struct at full before ioctl call Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap Paolo Bonzini
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell

From: Peter Maydell <peter.maydell@linaro.org>

The x86 architecture requires that all conversions from floating
point to integer which raise the 'invalid' exception (infinities of
both signs, NaN, and all values which don't fit in the destination
integer) return what the x86 spec calls the "indefinite integer
value", which is 0x8000_0000 for 32-bits or 0x8000_0000_0000_0000 for
64-bits.  The softfloat functions return the more usual behaviour of
positive overflows returning the maximum value that fits in the
destination integer format and negative overflows returning the
minimum value that fits.

Wrap the softfloat functions in x86-specific versions which
detect the 'invalid' condition and return the indefinite integer.

Note that we don't use these wrappers for the 3DNow! pf2id and pf2iw
instructions, which do return the minimum value that fits in
an int32 if the input float is a large negative number.

Fixes: https://bugs.launchpad.net/qemu/+bug/1815423
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20190805180332.10185-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/ops_sse.h | 88 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 60 insertions(+), 28 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index ed05989..ec1ec74 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -710,102 +710,134 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint64_t val)
 #endif
 
 /* float to integer */
+
+/*
+ * x86 mandates that we return the indefinite integer value for the result
+ * of any float-to-integer conversion that raises the 'invalid' exception.
+ * Wrap the softfloat functions to get this behaviour.
+ */
+#define WRAP_FLOATCONV(RETTYPE, FN, FLOATTYPE, INDEFVALUE)              \
+    static inline RETTYPE x86_##FN(FLOATTYPE a, float_status *s)        \
+    {                                                                   \
+        int oldflags, newflags;                                         \
+        RETTYPE r;                                                      \
+                                                                        \
+        oldflags = get_float_exception_flags(s);                        \
+        set_float_exception_flags(0, s);                                \
+        r = FN(a, s);                                                   \
+        newflags = get_float_exception_flags(s);                        \
+        if (newflags & float_flag_invalid) {                            \
+            r = INDEFVALUE;                                             \
+        }                                                               \
+        set_float_exception_flags(newflags | oldflags, s);              \
+        return r;                                                       \
+    }
+
+WRAP_FLOATCONV(int32_t, float32_to_int32, float32, INT32_MIN)
+WRAP_FLOATCONV(int32_t, float32_to_int32_round_to_zero, float32, INT32_MIN)
+WRAP_FLOATCONV(int32_t, float64_to_int32, float64, INT32_MIN)
+WRAP_FLOATCONV(int32_t, float64_to_int32_round_to_zero, float64, INT32_MIN)
+WRAP_FLOATCONV(int64_t, float32_to_int64, float32, INT64_MIN)
+WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN)
+WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN)
+WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN)
+
 void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = float32_to_int32(s->ZMM_S(0), &env->sse_status);
-    d->ZMM_L(1) = float32_to_int32(s->ZMM_S(1), &env->sse_status);
-    d->ZMM_L(2) = float32_to_int32(s->ZMM_S(2), &env->sse_status);
-    d->ZMM_L(3) = float32_to_int32(s->ZMM_S(3), &env->sse_status);
+    d->ZMM_L(0) = x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
+    d->ZMM_L(1) = x86_float32_to_int32(s->ZMM_S(1), &env->sse_status);
+    d->ZMM_L(2) = x86_float32_to_int32(s->ZMM_S(2), &env->sse_status);
+    d->ZMM_L(3) = x86_float32_to_int32(s->ZMM_S(3), &env->sse_status);
 }
 
 void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = float64_to_int32(s->ZMM_D(0), &env->sse_status);
-    d->ZMM_L(1) = float64_to_int32(s->ZMM_D(1), &env->sse_status);
+    d->ZMM_L(0) = x86_float64_to_int32(s->ZMM_D(0), &env->sse_status);
+    d->ZMM_L(1) = x86_float64_to_int32(s->ZMM_D(1), &env->sse_status);
     d->ZMM_Q(1) = 0;
 }
 
 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = float32_to_int32(s->ZMM_S(0), &env->sse_status);
-    d->MMX_L(1) = float32_to_int32(s->ZMM_S(1), &env->sse_status);
+    d->MMX_L(0) = x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
+    d->MMX_L(1) = x86_float32_to_int32(s->ZMM_S(1), &env->sse_status);
 }
 
 void helper_cvtpd2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = float64_to_int32(s->ZMM_D(0), &env->sse_status);
-    d->MMX_L(1) = float64_to_int32(s->ZMM_D(1), &env->sse_status);
+    d->MMX_L(0) = x86_float64_to_int32(s->ZMM_D(0), &env->sse_status);
+    d->MMX_L(1) = x86_float64_to_int32(s->ZMM_D(1), &env->sse_status);
 }
 
 int32_t helper_cvtss2si(CPUX86State *env, ZMMReg *s)
 {
-    return float32_to_int32(s->ZMM_S(0), &env->sse_status);
+    return x86_float32_to_int32(s->ZMM_S(0), &env->sse_status);
 }
 
 int32_t helper_cvtsd2si(CPUX86State *env, ZMMReg *s)
 {
-    return float64_to_int32(s->ZMM_D(0), &env->sse_status);
+    return x86_float64_to_int32(s->ZMM_D(0), &env->sse_status);
 }
 
 #ifdef TARGET_X86_64
 int64_t helper_cvtss2sq(CPUX86State *env, ZMMReg *s)
 {
-    return float32_to_int64(s->ZMM_S(0), &env->sse_status);
+    return x86_float32_to_int64(s->ZMM_S(0), &env->sse_status);
 }
 
 int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s)
 {
-    return float64_to_int64(s->ZMM_D(0), &env->sse_status);
+    return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status);
 }
 #endif
 
 /* float to integer truncated */
 void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
-    d->ZMM_L(1) = float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
-    d->ZMM_L(2) = float32_to_int32_round_to_zero(s->ZMM_S(2), &env->sse_status);
-    d->ZMM_L(3) = float32_to_int32_round_to_zero(s->ZMM_S(3), &env->sse_status);
+    d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
+    d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
+    d->ZMM_L(2) = x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->sse_status);
+    d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->sse_status);
 }
 
 void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s)
 {
-    d->ZMM_L(0) = float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
-    d->ZMM_L(1) = float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
+    d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
+    d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
     d->ZMM_Q(1) = 0;
 }
 
 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
-    d->MMX_L(1) = float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
+    d->MMX_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
+    d->MMX_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);
 }
 
 void helper_cvttpd2pi(CPUX86State *env, MMXReg *d, ZMMReg *s)
 {
-    d->MMX_L(0) = float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
-    d->MMX_L(1) = float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
+    d->MMX_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
+    d->MMX_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);
 }
 
 int32_t helper_cvttss2si(CPUX86State *env, ZMMReg *s)
 {
-    return float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
+    return x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);
 }
 
 int32_t helper_cvttsd2si(CPUX86State *env, ZMMReg *s)
 {
-    return float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
+    return x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);
 }
 
 #ifdef TARGET_X86_64
 int64_t helper_cvttss2sq(CPUX86State *env, ZMMReg *s)
 {
-    return float32_to_int64_round_to_zero(s->ZMM_S(0), &env->sse_status);
+    return x86_float32_to_int64_round_to_zero(s->ZMM_S(0), &env->sse_status);
 }
 
 int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s)
 {
-    return float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_status);
+    return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_status);
 }
 #endif
 
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (13 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 14/36] target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-26 12:19   ` dovgaluk
  2022-08-02 16:17   ` Peter Maydell
  2019-08-20  6:59 ` [Qemu-devel] [PULL 16/36] mc146818rtc: Remove reset notifiers Paolo Bonzini
                   ` (23 subsequent siblings)
  38 siblings, 2 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel

There is a race between TCG and accesses to the dirty log:

      vCPU thread                  reader thread
      -----------------------      -----------------------
      TLB check -> slow path
        notdirty_mem_write
          write to RAM
          set dirty flag
                                   clear dirty flag
      TLB check -> fast path
                                   read memory
        write to RAM

Fortunately, in order to fix it, no change is required to the
vCPU thread.  However, the reader thread must delay the read after
the vCPU thread has finished the write.  This can be approximated
conservatively by run_on_cpu, which waits for the end of the current
translation block.

A similar technique is used by KVM, which has to do a synchronous TLB
flush after doing a test-and-clear of the dirty-page flags.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 31 +++++++++++++++++++++++++++++++
 include/exec/memory.h | 12 ++++++++++++
 memory.c              | 10 +++++++++-
 migration/ram.c       |  1 +
 4 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 3e78de3..ae68f72 100644
--- a/exec.c
+++ b/exec.c
@@ -198,6 +198,7 @@ typedef struct subpage_t {
 
 static void io_mem_init(void);
 static void memory_map_init(void);
+static void tcg_log_global_after_sync(MemoryListener *listener);
 static void tcg_commit(MemoryListener *listener);
 
 static MemoryRegion io_mem_watch;
@@ -906,6 +907,7 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
     newas->cpu = cpu;
     newas->as = as;
     if (tcg_enabled()) {
+        newas->tcg_as_listener.log_global_after_sync = tcg_log_global_after_sync;
         newas->tcg_as_listener.commit = tcg_commit;
         memory_listener_register(&newas->tcg_as_listener, as);
     }
@@ -3143,6 +3145,35 @@ void address_space_dispatch_free(AddressSpaceDispatch *d)
     g_free(d);
 }
 
+static void do_nothing(CPUState *cpu, run_on_cpu_data d)
+{
+}
+
+static void tcg_log_global_after_sync(MemoryListener *listener)
+{
+    CPUAddressSpace *cpuas;
+
+    /* Wait for the CPU to end the current TB.  This avoids the following
+     * incorrect race:
+     *
+     *      vCPU                         migration
+     *      ----------------------       -------------------------
+     *      TLB check -> slow path
+     *        notdirty_mem_write
+     *          write to RAM
+     *          mark dirty
+     *                                   clear dirty flag
+     *      TLB check -> fast path
+     *                                   read memory
+     *        write to RAM
+     *
+     * by pushing the migration thread's memory read after the vCPU thread has
+     * written the memory.
+     */
+    cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
+    run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
+}
+
 static void tcg_commit(MemoryListener *listener)
 {
     CPUAddressSpace *cpuas;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index bb0961d..b6bcf31 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -419,6 +419,7 @@ struct MemoryListener {
     void (*log_clear)(MemoryListener *listener, MemoryRegionSection *section);
     void (*log_global_start)(MemoryListener *listener);
     void (*log_global_stop)(MemoryListener *listener);
+    void (*log_global_after_sync)(MemoryListener *listener);
     void (*eventfd_add)(MemoryListener *listener, MemoryRegionSection *section,
                         bool match_data, uint64_t data, EventNotifier *e);
     void (*eventfd_del)(MemoryListener *listener, MemoryRegionSection *section,
@@ -1682,6 +1683,17 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 void memory_global_dirty_log_sync(void);
 
 /**
+ * memory_global_dirty_log_sync: synchronize the dirty log for all memory
+ *
+ * Synchronizes the vCPUs with a thread that is reading the dirty bitmap.
+ * This function must be called after the dirty log bitmap is cleared, and
+ * before dirty guest memory pages are read.  If you are using
+ * #DirtyBitmapSnapshot, memory_region_snapshot_and_clear_dirty() takes
+ * care of doing this.
+ */
+void memory_global_after_dirty_log_sync(void);
+
+/**
  * memory_region_transaction_begin: Start a transaction.
  *
  * During a transaction, changes will be accumulated and made visible
diff --git a/memory.c b/memory.c
index e42d63a..edd0c13 100644
--- a/memory.c
+++ b/memory.c
@@ -2127,9 +2127,12 @@ DirtyBitmapSnapshot *memory_region_snapshot_and_clear_dirty(MemoryRegion *mr,
                                                             hwaddr size,
                                                             unsigned client)
 {
+    DirtyBitmapSnapshot *snapshot;
     assert(mr->ram_block);
     memory_region_sync_dirty_bitmap(mr);
-    return cpu_physical_memory_snapshot_and_clear_dirty(mr, addr, size, client);
+    snapshot = cpu_physical_memory_snapshot_and_clear_dirty(mr, addr, size, client);
+    memory_global_after_dirty_log_sync();
+    return snapshot;
 }
 
 bool memory_region_snapshot_get_dirty(MemoryRegion *mr, DirtyBitmapSnapshot *snap,
@@ -2620,6 +2623,11 @@ void memory_global_dirty_log_sync(void)
     memory_region_sync_dirty_bitmap(NULL);
 }
 
+void memory_global_after_dirty_log_sync(void)
+{
+    MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
+}
+
 static VMChangeStateEntry *vmstate_change;
 
 void memory_global_dirty_log_start(void)
diff --git a/migration/ram.c b/migration/ram.c
index 889148d..4e3a6ae 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1847,6 +1847,7 @@ static void migration_bitmap_sync(RAMState *rs)
     rcu_read_unlock();
     qemu_mutex_unlock(&rs->bitmap_mutex);
 
+    memory_global_after_dirty_log_sync();
     trace_migration_bitmap_sync_end(rs->num_dirty_pages_period);
 
     end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 16/36] mc146818rtc: Remove reset notifiers
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (14 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 17/36] timer: " Paolo Bonzini
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr. David Alan Gilbert

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The reset notifiers are unreliable and recalculating the offsets
after boot causes problems with migration in cases where explicit
base times are set on the destination.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190724115823.4199-2-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/timer/mc146818rtc.c | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index ce4550b..352ff9d 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -92,7 +92,6 @@ typedef struct RTCState {
     uint32_t irq_coalesced;
     uint32_t period;
     QEMUTimer *coalesced_timer;
-    Notifier clock_reset_notifier;
     LostTickPolicy lost_tick_policy;
     Notifier suspend_notifier;
     QLIST_ENTRY(RTCState) link;
@@ -885,20 +884,6 @@ static const VMStateDescription vmstate_rtc = {
     }
 };
 
-static void rtc_notify_clock_reset(Notifier *notifier, void *data)
-{
-    RTCState *s = container_of(notifier, RTCState, clock_reset_notifier);
-    int64_t now = *(int64_t *)data;
-
-    rtc_set_date_from_host(ISA_DEVICE(s));
-    periodic_timer_update(s, now, 0);
-    check_update_timer(s);
-
-    if (s->lost_tick_policy == LOST_TICK_POLICY_SLEW) {
-        rtc_coalesced_timer_update(s);
-    }
-}
-
 /* set CMOS shutdown status register (index 0xF) as S3_resume(0xFE)
    BIOS will read it and start S3 resume at POST Entry */
 static void rtc_notify_suspend(Notifier *notifier, void *data)
@@ -984,10 +969,6 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
     s->update_timer = timer_new_ns(rtc_clock, rtc_update_timer, s);
     check_update_timer(s);
 
-    s->clock_reset_notifier.notify = rtc_notify_clock_reset;
-    qemu_clock_register_reset_notifier(rtc_clock,
-                                       &s->clock_reset_notifier);
-
     s->suspend_notifier.notify = rtc_notify_suspend;
     qemu_register_suspend_notifier(&s->suspend_notifier);
 
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 17/36] timer: Remove reset notifiers
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (15 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 16/36] mc146818rtc: Remove reset notifiers Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 18/36] replay: Remove host_clock_last Paolo Bonzini
                   ` (21 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr. David Alan Gilbert

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Remove the reset notifer from the core qemu-timer code.
The only user was mc146818 and we've just remove it's use.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190724115823.4199-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/timer.h | 22 ----------------------
 util/qemu-timer.c    | 21 +--------------------
 2 files changed, 1 insertion(+), 42 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 5d978e1..6817c78 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -228,28 +228,6 @@ void qemu_clock_enable(QEMUClockType type, bool enabled);
 void qemu_start_warp_timer(void);
 
 /**
- * qemu_clock_register_reset_notifier:
- * @type: the clock type
- * @notifier: the notifier function
- *
- * Register a notifier function to call when the clock
- * concerned is reset.
- */
-void qemu_clock_register_reset_notifier(QEMUClockType type,
-                                        Notifier *notifier);
-
-/**
- * qemu_clock_unregister_reset_notifier:
- * @type: the clock type
- * @notifier: the notifier function
- *
- * Unregister a notifier function to call when the clock
- * concerned is reset.
- */
-void qemu_clock_unregister_reset_notifier(QEMUClockType type,
-                                          Notifier *notifier);
-
-/**
  * qemu_clock_run_timers:
  * @type: clock on which to operate
  *
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 1cc1b2f..2faaaf9 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -48,7 +48,6 @@ typedef struct QEMUClock {
     /* We rely on BQL to protect the timerlists */
     QLIST_HEAD(, QEMUTimerList) timerlists;
 
-    NotifierList reset_notifiers;
     int64_t last;
 
     QEMUClockType type;
@@ -133,7 +132,6 @@ static void qemu_clock_init(QEMUClockType type, QEMUTimerListNotifyCB *notify_cb
     clock->enabled = (type == QEMU_CLOCK_VIRTUAL ? false : true);
     clock->last = INT64_MIN;
     QLIST_INIT(&clock->timerlists);
-    notifier_list_init(&clock->reset_notifiers);
     main_loop_tlg.tl[type] = timerlist_new(type, notify_cb, NULL);
 }
 
@@ -630,7 +628,7 @@ int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup *tlg)
 
 int64_t qemu_clock_get_ns(QEMUClockType type)
 {
-    int64_t now, last;
+    int64_t now;
     QEMUClock *clock = qemu_clock_ptr(type);
 
     switch (type) {
@@ -645,11 +643,7 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
         }
     case QEMU_CLOCK_HOST:
         now = REPLAY_CLOCK(REPLAY_CLOCK_HOST, get_clock_realtime());
-        last = clock->last;
         clock->last = now;
-        if (now < last || now > (last + get_max_clock_jump())) {
-            notifier_list_notify(&clock->reset_notifiers, &now);
-        }
         return now;
     case QEMU_CLOCK_VIRTUAL_RT:
         return REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT, cpu_get_clock());
@@ -668,19 +662,6 @@ void qemu_clock_set_last(QEMUClockType type, uint64_t last)
     clock->last = last;
 }
 
-void qemu_clock_register_reset_notifier(QEMUClockType type,
-                                        Notifier *notifier)
-{
-    QEMUClock *clock = qemu_clock_ptr(type);
-    notifier_list_add(&clock->reset_notifiers, notifier);
-}
-
-void qemu_clock_unregister_reset_notifier(QEMUClockType type,
-                                          Notifier *notifier)
-{
-    notifier_remove(notifier);
-}
-
 void init_clocks(QEMUTimerListNotifyCB *notify_cb)
 {
     QEMUClockType type;
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 18/36] replay: Remove host_clock_last
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (16 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 17/36] timer: " Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 19/36] timer: last, remove last bits of last Paolo Bonzini
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr. David Alan Gilbert

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Now we're not using the 'last' field in the timer, remove it from
replay.

Bump the version number of the replay structure since we've
removed the field.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190724115823.4199-4-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 replay/replay-snapshot.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index 756f48b..3a58734 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -24,7 +24,6 @@ static int replay_pre_save(void *opaque)
 {
     ReplayState *state = opaque;
     state->file_offset = ftell(replay_file);
-    state->host_clock_last = qemu_clock_get_last(QEMU_CLOCK_HOST);
 
     return 0;
 }
@@ -34,7 +33,6 @@ static int replay_post_load(void *opaque, int version_id)
     ReplayState *state = opaque;
     if (replay_mode == REPLAY_MODE_PLAY) {
         fseek(replay_file, state->file_offset, SEEK_SET);
-        qemu_clock_set_last(QEMU_CLOCK_HOST, state->host_clock_last);
         /* If this was a vmstate, saved in recording mode,
            we need to initialize replay data fields. */
         replay_fetch_data_kind();
@@ -50,8 +48,8 @@ static int replay_post_load(void *opaque, int version_id)
 
 static const VMStateDescription vmstate_replay = {
     .name = "replay",
-    .version_id = 1,
-    .minimum_version_id = 1,
+    .version_id = 2,
+    .minimum_version_id = 2,
     .pre_save = replay_pre_save,
     .post_load = replay_post_load,
     .fields = (VMStateField[]) {
@@ -62,7 +60,6 @@ static const VMStateDescription vmstate_replay = {
         VMSTATE_UINT32(has_unread_data, ReplayState),
         VMSTATE_UINT64(file_offset, ReplayState),
         VMSTATE_UINT64(block_request_id, ReplayState),
-        VMSTATE_UINT64(host_clock_last, ReplayState),
         VMSTATE_INT32(read_event_kind, ReplayState),
         VMSTATE_UINT64(read_event_id, ReplayState),
         VMSTATE_INT32(read_event_checkpoint, ReplayState),
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 19/36] timer: last, remove last bits of last
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (17 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 18/36] replay: Remove host_clock_last Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 20/36] kconfig: do not select VMMOUSE Paolo Bonzini
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr. David Alan Gilbert

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The reset notifiers kept a 'last' counter to notice jumps;
now that we've remove the notifier we don't need to keep 'last'.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190724115823.4199-5-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/timer.h | 13 -------------
 util/qemu-timer.c    | 22 +---------------------
 2 files changed, 1 insertion(+), 34 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 6817c78..5bcab93 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -248,19 +248,6 @@ bool qemu_clock_run_timers(QEMUClockType type);
  */
 bool qemu_clock_run_all_timers(void);
 
-/**
- * qemu_clock_get_last:
- *
- * Returns last clock query time.
- */
-uint64_t qemu_clock_get_last(QEMUClockType type);
-/**
- * qemu_clock_set_last:
- *
- * Sets last clock query time.
- */
-void qemu_clock_set_last(QEMUClockType type, uint64_t last);
-
 
 /*
  * QEMUTimerList
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 2faaaf9..b73041d 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -48,8 +48,6 @@ typedef struct QEMUClock {
     /* We rely on BQL to protect the timerlists */
     QLIST_HEAD(, QEMUTimerList) timerlists;
 
-    int64_t last;
-
     QEMUClockType type;
     bool enabled;
 } QEMUClock;
@@ -130,7 +128,6 @@ static void qemu_clock_init(QEMUClockType type, QEMUTimerListNotifyCB *notify_cb
 
     clock->type = type;
     clock->enabled = (type == QEMU_CLOCK_VIRTUAL ? false : true);
-    clock->last = INT64_MIN;
     QLIST_INIT(&clock->timerlists);
     main_loop_tlg.tl[type] = timerlist_new(type, notify_cb, NULL);
 }
@@ -628,9 +625,6 @@ int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup *tlg)
 
 int64_t qemu_clock_get_ns(QEMUClockType type)
 {
-    int64_t now;
-    QEMUClock *clock = qemu_clock_ptr(type);
-
     switch (type) {
     case QEMU_CLOCK_REALTIME:
         return get_clock();
@@ -642,26 +636,12 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
             return cpu_get_clock();
         }
     case QEMU_CLOCK_HOST:
-        now = REPLAY_CLOCK(REPLAY_CLOCK_HOST, get_clock_realtime());
-        clock->last = now;
-        return now;
+        return REPLAY_CLOCK(REPLAY_CLOCK_HOST, get_clock_realtime());
     case QEMU_CLOCK_VIRTUAL_RT:
         return REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT, cpu_get_clock());
     }
 }
 
-uint64_t qemu_clock_get_last(QEMUClockType type)
-{
-    QEMUClock *clock = qemu_clock_ptr(type);
-    return clock->last;
-}
-
-void qemu_clock_set_last(QEMUClockType type, uint64_t last)
-{
-    QEMUClock *clock = qemu_clock_ptr(type);
-    clock->last = last;
-}
-
 void init_clocks(QEMUTimerListNotifyCB *notify_cb)
 {
     QEMUClockType type;
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 20/36] kconfig: do not select VMMOUSE
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (18 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 19/36] timer: last, remove last bits of last Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 21/36] replay: add missing fix for internal function Paolo Bonzini
                   ` (18 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel

Just make it a default device, so that it is automatically removed if VMPORT
is not included in the binary (as is the case with --with-default-devices).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/i386/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 6350438..619569e 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -99,4 +99,5 @@ config VMPORT
 
 config VMMOUSE
     bool
+    default y
     depends on VMPORT
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 21/36] replay: add missing fix for internal function
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (19 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 20/36] kconfig: do not select VMMOUSE Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 22/36] replay: document development rules Paolo Bonzini
                   ` (17 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

This is a fix which was missed by patch
74c0b816adfc6aa1b01b4426fdf385e32e35cbac, which added current_step
parameter to the replay_advance_current_step function.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404425561.18669.13015037579222450241.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 replay/replay-internal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/replay/replay-internal.c b/replay/replay-internal.c
index 9e41ed1..979f3a0 100644
--- a/replay/replay-internal.c
+++ b/replay/replay-internal.c
@@ -228,7 +228,7 @@ void replay_mutex_unlock(void)
 
 void replay_advance_current_step(uint64_t current_step)
 {
-    int diff = (int)(replay_get_current_step() - replay_state.current_step);
+    int diff = (int)(current_step - replay_state.current_step);
 
     /* Time can only go forward */
     assert(diff >= 0);
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 22/36] replay: document development rules
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (20 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 21/36] replay: add missing fix for internal function Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 23/36] util/qemu-timer: refactor deadline calculation for external timers Paolo Bonzini
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk, Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

This patch introduces docs/devel/replay.txt which describes the rules
that should be followed to make virtual devices usable in record/replay mode.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgauk@ispras.ru>

--

v9: fixed external virtual clock description (reported by Artem Pisarenko)
Message-Id: <156404426119.18669.6707258931552832854.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
---
 docs/devel/replay.txt | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 docs/devel/replay.txt

diff --git a/docs/devel/replay.txt b/docs/devel/replay.txt
new file mode 100644
index 0000000..e641c35
--- /dev/null
+++ b/docs/devel/replay.txt
@@ -0,0 +1,46 @@
+Record/replay mechanism, that could be enabled through icount mode, expects
+the virtual devices to satisfy the following requirements.
+
+The main idea behind this document is that everything that affects
+the guest state during execution in icount mode should be deterministic.
+
+Timers
+======
+
+All virtual devices should use virtual clock for timers that change the guest
+state. Virtual clock is deterministic, therefore such timers are deterministic
+too.
+
+Virtual devices can also use realtime clock for the events that do not change
+the guest state directly. When the clock ticking should depend on VM execution
+speed, use virtual clock with EXTERNAL attribute. It is not deterministic,
+but its speed depends on the guest execution. This clock is used by
+the virtual devices (e.g., slirp routing device) that lie outside the
+replayed guest.
+
+Bottom halves
+=============
+
+Bottom half callbacks, that affect the guest state, should be invoked through
+replay_bh_schedule_event or replay_bh_schedule_oneshot_event functions.
+Their invocations are saved in record mode and synchronized with the existing
+log in replay mode.
+
+Saving/restoring the VM state
+=============================
+
+All fields in the device state structure (including virtual timers)
+should be restored by loadvm to the same values they had before savevm.
+
+Avoid accessing other devices' state, because the order of saving/restoring
+is not defined. It means that you should not call functions like
+'update_irq' in post_load callback. Save everything explicitly to avoid
+the dependencies that may make restoring the VM state non-deterministic.
+
+Stopping the VM
+===============
+
+Stopping the guest should not interfere with its state (with the exception
+of the network connections, that could be broken by the remote timeouts).
+VM can be stopped at any moment of replay by the user. Restarting the VM
+after that stop should not break the replay by the unneeded guest state change.
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 23/36] util/qemu-timer: refactor deadline calculation for external timers
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (21 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 22/36] replay: document development rules Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 24/36] replay: fix replay shutdown Paolo Bonzini
                   ` (15 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

icount-based record/replay uses qemu_clock_deadline_ns_all to measure
the period until vCPU may be interrupted.
This function takes in account the virtual timers, because they belong
to the virtual devices that may generate interrupt request or affect
the virtual machine state.
However, there are a subset of virtual timers, that are marked with
'external' flag. These do not change the virtual machine state and
only based on virtual clock. Calculating the deadling using the external
timers breaks the determinism, because they do not belong to the replayed
part of the virtual machine.
This patch fixes the deadline calculation for this case by adding
new parameter for skipping the external timers when it is needed.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

--

v2 changes:
 - added new parameter for timer attribute mask
Message-Id: <156404426682.18669.17014100602930969222.stgit@pasha-Precision-3630-Tower>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c                    | 17 ++++++++++++-----
 include/qemu/timer.h      |  8 ++++++--
 qtest.c                   |  3 ++-
 tests/ptimer-test-stubs.c |  4 ++--
 tests/ptimer-test.c       |  6 ++++--
 util/qemu-timer.c         | 30 +++++++++++++++++++++++++++---
 6 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/cpus.c b/cpus.c
index 927a00a..a7120ff 100644
--- a/cpus.c
+++ b/cpus.c
@@ -553,7 +553,8 @@ void qtest_clock_warp(int64_t dest)
     assert(qtest_enabled());
     aio_context = qemu_get_aio_context();
     while (clock < dest) {
-        int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+        int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                                      QEMU_TIMER_ATTR_ALL);
         int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
 
         seqlock_write_lock(&timers_state.vm_clock_seqlock,
@@ -613,7 +614,8 @@ void qemu_start_warp_timer(void)
 
     /* We want to use the earliest deadline from ALL vm_clocks */
     clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT);
-    deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+    deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                          ~QEMU_TIMER_ATTR_EXTERNAL);
     if (deadline < 0) {
         static bool notified;
         if (!icount_sleep && !notified) {
@@ -1349,7 +1351,12 @@ static int64_t tcg_get_icount_limit(void)
     int64_t deadline;
 
     if (replay_mode != REPLAY_MODE_PLAY) {
-        deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+        /*
+         * Include all the timers, because they may need an attention.
+         * Too long CPU execution may create unnecessary delay in UI.
+         */
+        deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                              QEMU_TIMER_ATTR_ALL);
 
         /* Maintain prior (possibly buggy) behaviour where if no deadline
          * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
@@ -1370,8 +1377,8 @@ static void handle_icount_deadline(void)
 {
     assert(qemu_in_vcpu_thread());
     if (use_icount) {
-        int64_t deadline =
-            qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+        int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                                      QEMU_TIMER_ATTR_ALL);
 
         if (deadline == 0) {
             /* Wake up other AioContexts.  */
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 5bcab93..85bc6eb 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -62,13 +62,15 @@ typedef enum {
  * The following attributes are available:
  *
  * QEMU_TIMER_ATTR_EXTERNAL: drives external subsystem
+ * QEMU_TIMER_ATTR_ALL: mask for all existing attributes
  *
  * Timers with this attribute do not recorded in rr mode, therefore it could be
  * used for the subsystems that operate outside the guest core. Applicable only
  * with virtual clock type.
  */
 
-#define QEMU_TIMER_ATTR_EXTERNAL BIT(0)
+#define QEMU_TIMER_ATTR_EXTERNAL ((int)BIT(0))
+#define QEMU_TIMER_ATTR_ALL      0xffffffff
 
 typedef struct QEMUTimerList QEMUTimerList;
 
@@ -177,6 +179,8 @@ bool qemu_clock_use_for_deadline(QEMUClockType type);
 /**
  * qemu_clock_deadline_ns_all:
  * @type: the clock type
+ * @attr_mask: mask for the timer attributes that are included
+ *             in deadline calculation
  *
  * Calculate the deadline across all timer lists associated
  * with a clock (as opposed to just the default one)
@@ -184,7 +188,7 @@ bool qemu_clock_use_for_deadline(QEMUClockType type);
  *
  * Returns: time until expiry in nanoseconds or -1
  */
-int64_t qemu_clock_deadline_ns_all(QEMUClockType type);
+int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask);
 
 /**
  * qemu_clock_get_main_loop_timerlist:
diff --git a/qtest.c b/qtest.c
index 15e27e9..4e20856 100644
--- a/qtest.c
+++ b/qtest.c
@@ -655,7 +655,8 @@ static void qtest_process_command(CharBackend *chr, gchar **words)
             int ret = qemu_strtoi64(words[1], NULL, 0, &ns);
             g_assert(ret == 0);
         } else {
-            ns = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+            ns = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                            QEMU_TIMER_ATTR_ALL);
         }
         qtest_clock_warp(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + ns);
         qtest_send_prefix(chr);
diff --git a/tests/ptimer-test-stubs.c b/tests/ptimer-test-stubs.c
index 54b3fd2..ed393d9 100644
--- a/tests/ptimer-test-stubs.c
+++ b/tests/ptimer-test-stubs.c
@@ -88,9 +88,9 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
     return ptimer_test_time_ns;
 }
 
-int64_t qemu_clock_deadline_ns_all(QEMUClockType type)
+int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask)
 {
-    QEMUTimerList *timer_list = main_loop_tlg.tl[type];
+    QEMUTimerList *timer_list = main_loop_tlg.tl[QEMU_CLOCK_VIRTUAL];
     QEMUTimer *t = timer_list->active_timers.next;
     int64_t deadline = -1;
 
diff --git a/tests/ptimer-test.c b/tests/ptimer-test.c
index b30aad0..5b20e91 100644
--- a/tests/ptimer-test.c
+++ b/tests/ptimer-test.c
@@ -50,13 +50,15 @@ static void ptimer_test_set_qemu_time_ns(int64_t ns)
 
 static void qemu_clock_step(uint64_t ns)
 {
-    int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+    int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                                  QEMU_TIMER_ATTR_ALL);
     int64_t advanced_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + ns;
 
     while (deadline != -1 && deadline <= advanced_time) {
         ptimer_test_set_qemu_time_ns(deadline);
         ptimer_test_expire_qemu_timers(deadline, QEMU_CLOCK_VIRTUAL);
-        deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+        deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL,
+                                              QEMU_TIMER_ATTR_ALL);
     }
 
     ptimer_test_set_qemu_time_ns(advanced_time);
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index b73041d..dcaaac7 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -248,14 +248,38 @@ int64_t timerlist_deadline_ns(QEMUTimerList *timer_list)
  * ignore whether or not the clock should be used in deadline
  * calculations.
  */
-int64_t qemu_clock_deadline_ns_all(QEMUClockType type)
+int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask)
 {
     int64_t deadline = -1;
+    int64_t delta;
+    int64_t expire_time;
+    QEMUTimer *ts;
     QEMUTimerList *timer_list;
     QEMUClock *clock = qemu_clock_ptr(type);
+
+    if (!clock->enabled) {
+        return -1;
+    }
+
     QLIST_FOREACH(timer_list, &clock->timerlists, list) {
-        deadline = qemu_soonest_timeout(deadline,
-                                        timerlist_deadline_ns(timer_list));
+        qemu_mutex_lock(&timer_list->active_timers_lock);
+        ts = timer_list->active_timers;
+        /* Skip all external timers */
+        while (ts && (ts->attributes & ~attr_mask)) {
+            ts = ts->next;
+        }
+        if (!ts) {
+            qemu_mutex_unlock(&timer_list->active_timers_lock);
+            continue;
+        }
+        expire_time = ts->expire_time;
+        qemu_mutex_unlock(&timer_list->active_timers_lock);
+
+        delta = expire_time - qemu_clock_get_ns(type);
+        if (delta <= 0) {
+            delta = 0;
+        }
+        deadline = qemu_soonest_timeout(deadline, delta);
     }
     return deadline;
 }
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 24/36] replay: fix replay shutdown
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (22 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 23/36] util/qemu-timer: refactor deadline calculation for external timers Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 25/36] replay: refine replay-time module Paolo Bonzini
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

This patch fixes shutdown of the replay process, which is terminated with
the assert when shutdown event is read from the log.
replay_finish_event reads new data_kind and therefore the value of data_kind
should be preserved to be valid at qemu_system_shutdown_request call.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404427238.18669.12378772823692338069.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 replay/replay.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/replay/replay.c b/replay/replay.c
index 8b172b2..8d77a4c 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -45,14 +45,14 @@ bool replay_next_event_is(int event)
     }
 
     while (true) {
-        if (event == replay_state.data_kind) {
+        unsigned int data_kind = replay_state.data_kind;
+        if (event == data_kind) {
             res = true;
         }
-        switch (replay_state.data_kind) {
+        switch (data_kind) {
         case EVENT_SHUTDOWN ... EVENT_SHUTDOWN_LAST:
             replay_finish_event();
-            qemu_system_shutdown_request(replay_state.data_kind -
-                                         EVENT_SHUTDOWN);
+            qemu_system_shutdown_request(data_kind - EVENT_SHUTDOWN);
             break;
         default:
             /* clock, time_t, checkpoint and other events */
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 25/36] replay: refine replay-time module
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (23 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 24/36] replay: fix replay shutdown Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 26/36] replay: rename step-related variables and functions Paolo Bonzini
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

This patch removes refactoring artifacts from the replay/replay-time.c

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404427799.18669.8072341590511911277.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 replay/replay-time.c | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/replay/replay-time.c b/replay/replay-time.c
index 5154cb0..49c9e5d 100644
--- a/replay/replay-time.c
+++ b/replay/replay-time.c
@@ -14,18 +14,19 @@
 #include "replay-internal.h"
 #include "qemu/error-report.h"
 
-int64_t replay_save_clock(ReplayClockKind kind, int64_t clock, int64_t raw_icount)
+int64_t replay_save_clock(ReplayClockKind kind, int64_t clock,
+                          int64_t raw_icount)
 {
-    if (replay_file) {
-        g_assert(replay_mutex_locked());
+    g_assert(replay_file);
+    g_assert(replay_mutex_locked());
 
-        /* Due to the caller's locking requirements we get the icount from it
-         * instead of using replay_save_instructions().
-         */
-        replay_advance_current_step(raw_icount);
-        replay_put_event(EVENT_CLOCK + kind);
-        replay_put_qword(clock);
-    }
+    /*
+     * Due to the caller's locking requirements we get the icount from it
+     * instead of using replay_save_instructions().
+     */
+    replay_advance_current_step(raw_icount);
+    replay_put_event(EVENT_CLOCK + kind);
+    replay_put_qword(clock);
 
     return clock;
 }
@@ -47,20 +48,15 @@ void replay_read_next_clock(ReplayClockKind kind)
 /*! Reads next clock event from the input. */
 int64_t replay_read_clock(ReplayClockKind kind)
 {
+    int64_t ret;
     g_assert(replay_file && replay_mutex_locked());
 
     replay_account_executed_instructions();
 
-    if (replay_file) {
-        int64_t ret;
-        if (replay_next_event_is(EVENT_CLOCK + kind)) {
-            replay_read_next_clock(kind);
-        }
-        ret = replay_state.cached_clock[kind];
-
-        return ret;
+    if (replay_next_event_is(EVENT_CLOCK + kind)) {
+        replay_read_next_clock(kind);
     }
+    ret = replay_state.cached_clock[kind];
 
-    error_report("REPLAY INTERNAL ERROR %d", __LINE__);
-    exit(1);
+    return ret;
 }
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 26/36] replay: rename step-related variables and functions
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (24 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 25/36] replay: refine replay-time module Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 27/36] icount: clean up cpu_can_io at the entry to the block Paolo Bonzini
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

This patch renames replay_get_current_step() and related variables
to make these names consistent with existing 'icount' command line
option and future record/replay hmp/qmp commands.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404428377.18669.15476429889039912070.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/sysemu/replay.h  |  2 +-
 replay/replay-events.c   |  2 +-
 replay/replay-internal.c | 10 +++++-----
 replay/replay-internal.h | 10 +++++-----
 replay/replay-snapshot.c |  6 +++---
 replay/replay-time.c     |  2 +-
 replay/replay.c          | 22 +++++++++++-----------
 7 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 3a7c58e..28ce19e 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -75,7 +75,7 @@ void replay_add_blocker(Error *reason);
 /* Processing the instructions */
 
 /*! Returns number of executed instructions. */
-uint64_t replay_get_current_step(void);
+uint64_t replay_get_current_icount(void);
 /*! Returns number of instructions to execute in replay mode. */
 int replay_get_instructions(void);
 /*! Updates instructions counter in replay mode. */
diff --git a/replay/replay-events.c b/replay/replay-events.c
index 60d17f6..008e80f 100644
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -124,7 +124,7 @@ void replay_add_event(ReplayAsyncEventKind event_kind,
 void replay_bh_schedule_event(QEMUBH *bh)
 {
     if (events_enabled) {
-        uint64_t id = replay_get_current_step();
+        uint64_t id = replay_get_current_icount();
         replay_add_event(REPLAY_ASYNC_EVENT_BH, bh, NULL, id);
     } else {
         qemu_bh_schedule(bh);
diff --git a/replay/replay-internal.c b/replay/replay-internal.c
index 979f3a0..ac6c901 100644
--- a/replay/replay-internal.c
+++ b/replay/replay-internal.c
@@ -172,7 +172,7 @@ void replay_fetch_data_kind(void)
         if (!replay_state.has_unread_data) {
             replay_state.data_kind = replay_get_byte();
             if (replay_state.data_kind == EVENT_INSTRUCTION) {
-                replay_state.instructions_count = replay_get_dword();
+                replay_state.instruction_count = replay_get_dword();
             }
             replay_check_error();
             replay_state.has_unread_data = 1;
@@ -226,9 +226,9 @@ void replay_mutex_unlock(void)
     }
 }
 
-void replay_advance_current_step(uint64_t current_step)
+void replay_advance_current_icount(uint64_t current_icount)
 {
-    int diff = (int)(current_step - replay_state.current_step);
+    int diff = (int)(current_icount - replay_state.current_icount);
 
     /* Time can only go forward */
     assert(diff >= 0);
@@ -236,7 +236,7 @@ void replay_advance_current_step(uint64_t current_step)
     if (diff > 0) {
         replay_put_event(EVENT_INSTRUCTION);
         replay_put_dword(diff);
-        replay_state.current_step += diff;
+        replay_state.current_icount += diff;
     }
 }
 
@@ -245,6 +245,6 @@ void replay_save_instructions(void)
 {
     if (replay_file && replay_mode == REPLAY_MODE_RECORD) {
         g_assert(replay_mutex_locked());
-        replay_advance_current_step(replay_get_current_step());
+        replay_advance_current_icount(replay_get_current_icount());
     }
 }
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index af6f4d5..afba9a3 100644
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -64,10 +64,10 @@ typedef enum ReplayAsyncEventKind ReplayAsyncEventKind;
 typedef struct ReplayState {
     /*! Cached clock values. */
     int64_t cached_clock[REPLAY_CLOCK_COUNT];
-    /*! Current step - number of processed instructions and timer events. */
-    uint64_t current_step;
+    /*! Current icount - number of processed instructions. */
+    uint64_t current_icount;
     /*! Number of instructions to be executed before other events happen. */
-    int instructions_count;
+    int instruction_count;
     /*! Type of the currently executed event. */
     unsigned int data_kind;
     /*! Flag which indicates that event is not processed yet. */
@@ -122,8 +122,8 @@ void replay_finish_event(void);
     data_kind variable. */
 void replay_fetch_data_kind(void);
 
-/*! Advance replay_state.current_step to the specified value. */
-void replay_advance_current_step(uint64_t current_step);
+/*! Advance replay_state.current_icount to the specified value. */
+void replay_advance_current_icount(uint64_t current_icount);
 /*! Saves queued events (like instructions and sound). */
 void replay_save_instructions(void);
 
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index 3a58734..041f5af 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -39,7 +39,7 @@ static int replay_post_load(void *opaque, int version_id)
     } else if (replay_mode == REPLAY_MODE_RECORD) {
         /* This is only useful for loading the initial state.
            Therefore reset all the counters. */
-        state->instructions_count = 0;
+        state->instruction_count = 0;
         state->block_request_id = 0;
     }
 
@@ -54,8 +54,8 @@ static const VMStateDescription vmstate_replay = {
     .post_load = replay_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_INT64_ARRAY(cached_clock, ReplayState, REPLAY_CLOCK_COUNT),
-        VMSTATE_UINT64(current_step, ReplayState),
-        VMSTATE_INT32(instructions_count, ReplayState),
+        VMSTATE_UINT64(current_icount, ReplayState),
+        VMSTATE_INT32(instruction_count, ReplayState),
         VMSTATE_UINT32(data_kind, ReplayState),
         VMSTATE_UINT32(has_unread_data, ReplayState),
         VMSTATE_UINT64(file_offset, ReplayState),
diff --git a/replay/replay-time.c b/replay/replay-time.c
index 49c9e5d..43357c9 100644
--- a/replay/replay-time.c
+++ b/replay/replay-time.c
@@ -24,7 +24,7 @@ int64_t replay_save_clock(ReplayClockKind kind, int64_t clock,
      * Due to the caller's locking requirements we get the icount from it
      * instead of using replay_save_instructions().
      */
-    replay_advance_current_step(raw_icount);
+    replay_advance_current_icount(raw_icount);
     replay_put_event(EVENT_CLOCK + kind);
     replay_put_qword(clock);
 
diff --git a/replay/replay.c b/replay/replay.c
index 8d77a4c..6a62ec3 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -39,7 +39,7 @@ bool replay_next_event_is(int event)
     bool res = false;
 
     /* nothing to skip - not all instructions used */
-    if (replay_state.instructions_count != 0) {
+    if (replay_state.instruction_count != 0) {
         assert(replay_state.data_kind == EVENT_INSTRUCTION);
         return event == EVENT_INSTRUCTION;
     }
@@ -62,7 +62,7 @@ bool replay_next_event_is(int event)
     return res;
 }
 
-uint64_t replay_get_current_step(void)
+uint64_t replay_get_current_icount(void)
 {
     return cpu_get_icount_raw();
 }
@@ -72,7 +72,7 @@ int replay_get_instructions(void)
     int res = 0;
     replay_mutex_lock();
     if (replay_next_event_is(EVENT_INSTRUCTION)) {
-        res = replay_state.instructions_count;
+        res = replay_state.instruction_count;
     }
     replay_mutex_unlock();
     return res;
@@ -82,16 +82,16 @@ void replay_account_executed_instructions(void)
 {
     if (replay_mode == REPLAY_MODE_PLAY) {
         g_assert(replay_mutex_locked());
-        if (replay_state.instructions_count > 0) {
-            int count = (int)(replay_get_current_step()
-                              - replay_state.current_step);
+        if (replay_state.instruction_count > 0) {
+            int count = (int)(replay_get_current_icount()
+                              - replay_state.current_icount);
 
             /* Time can only go forward */
             assert(count >= 0);
 
-            replay_state.instructions_count -= count;
-            replay_state.current_step += count;
-            if (replay_state.instructions_count == 0) {
+            replay_state.instruction_count -= count;
+            replay_state.current_icount += count;
+            if (replay_state.instruction_count == 0) {
                 assert(replay_state.data_kind == EVENT_INSTRUCTION);
                 replay_finish_event();
                 /* Wake up iothread. This is required because
@@ -273,8 +273,8 @@ static void replay_enable(const char *fname, int mode)
     replay_mutex_init();
 
     replay_state.data_kind = -1;
-    replay_state.instructions_count = 0;
-    replay_state.current_step = 0;
+    replay_state.instruction_count = 0;
+    replay_state.current_icount = 0;
     replay_state.has_unread_data = 0;
 
     /* skip file header for RECORD and check it for PLAY */
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 27/36] icount: clean up cpu_can_io at the entry to the block
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (25 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 26/36] replay: rename step-related variables and functions Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 28/36] icount: remove unnecessary gen_io_end calls Paolo Bonzini
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk

From: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

Most of IO instructions can be executed only at the end of the block in
icount mode. Therefore translator can set cpu_can_io flag when translating
the last instruction.
But when the blocks are chained, then this flag is not reset and may
remain set at the beginning of the next block.
This patch resets the flag at the entry of any translation block,
making I/O operations impossible by default.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

--

v2 changes:
 - reset can_do_io at the start of every TB (suggested by Paolo Bonzini)
Message-Id: <156404428943.18669.15747009371169578935.stgit@pasha-Precision-3630-Tower>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 accel/tcg/cpu-exec.c      |  1 -
 include/exec/gen-icount.h | 38 ++++++++++++++++++++------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 6c85c3e..48272c7 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -169,7 +169,6 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
     }
 #endif /* DEBUG_DISAS */
 
-    cpu->can_do_io = !use_icount;
     ret = tcg_qemu_tb_exec(env, tb_ptr);
     cpu->can_do_io = 1;
     last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index f7669b6..4004e6c 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -7,6 +7,24 @@
 
 static TCGOp *icount_start_insn;
 
+static inline void gen_io_start(void)
+{
+    TCGv_i32 tmp = tcg_const_i32(1);
+    tcg_gen_st_i32(tmp, cpu_env,
+                   offsetof(ArchCPU, parent_obj.can_do_io) -
+                   offsetof(ArchCPU, env));
+    tcg_temp_free_i32(tmp);
+}
+
+static inline void gen_io_end(void)
+{
+    TCGv_i32 tmp = tcg_const_i32(0);
+    tcg_gen_st_i32(tmp, cpu_env,
+                   offsetof(ArchCPU, parent_obj.can_do_io) -
+                   offsetof(ArchCPU, env));
+    tcg_temp_free_i32(tmp);
+}
+
 static inline void gen_tb_start(TranslationBlock *tb)
 {
     TCGv_i32 count, imm;
@@ -40,6 +58,8 @@ static inline void gen_tb_start(TranslationBlock *tb)
         tcg_gen_st16_i32(count, cpu_env,
                          offsetof(ArchCPU, neg.icount_decr.u16.low) -
                          offsetof(ArchCPU, env));
+        /* Disable I/O by default */
+        gen_io_end();
     }
 
     tcg_temp_free_i32(count);
@@ -57,22 +77,4 @@ static inline void gen_tb_end(TranslationBlock *tb, int num_insns)
     tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
 }
 
-static inline void gen_io_start(void)
-{
-    TCGv_i32 tmp = tcg_const_i32(1);
-    tcg_gen_st_i32(tmp, cpu_env,
-                   offsetof(ArchCPU, parent_obj.can_do_io) -
-                   offsetof(ArchCPU, env));
-    tcg_temp_free_i32(tmp);
-}
-
-static inline void gen_io_end(void)
-{
-    TCGv_i32 tmp = tcg_const_i32(0);
-    tcg_gen_st_i32(tmp, cpu_env,
-                   offsetof(ArchCPU, parent_obj.can_do_io) -
-                   offsetof(ArchCPU, env));
-    tcg_temp_free_i32(tmp);
-}
-
 #endif
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 28/36] icount: remove unnecessary gen_io_end calls
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (26 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 27/36] icount: clean up cpu_can_io at the entry to the block Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 29/36] cpus-common: nuke finish_safe_work Paolo Bonzini
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pavel Dovgalyuk, Pavel Dovgalyuk

From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>

Prior patch resets can_do_io flag at the TB entry. Therefore there is no
need in resetting this flag at the end of the block.
This patch removes redundant gen_io_end calls.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404429499.18669.13404064982854123855.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
---
 accel/tcg/translator.c                  |  1 -
 include/exec/gen-icount.h               |  8 +++++++-
 target/alpha/translate.c                |  2 --
 target/arm/translate-a64.c              |  4 ----
 target/arm/translate.c                  |  7 -------
 target/cris/translate.c                 |  2 --
 target/hppa/translate.c                 |  1 -
 target/i386/translate.c                 | 10 ----------
 target/lm32/translate.c                 |  9 ---------
 target/microblaze/translate.c           |  2 --
 target/mips/translate.c                 | 11 -----------
 target/nios2/translate.c                |  4 ----
 target/ppc/translate.c                  | 13 -------------
 target/ppc/translate_init.inc.c         |  2 --
 target/riscv/insn_trans/trans_rvi.inc.c |  1 -
 target/sparc/translate.c                | 16 ----------------
 target/unicore32/translate.c            |  1 -
 target/xtensa/translate.c               | 15 ---------------
 18 files changed, 7 insertions(+), 102 deletions(-)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 9226a34..70c66c5 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -90,7 +90,6 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
             /* Accept I/O on the last instruction.  */
             gen_io_start();
             ops->translate_insn(db, cpu);
-            gen_io_end();
         } else {
             ops->translate_insn(db, cpu);
         }
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index 4004e6c..822c43c 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -16,6 +16,13 @@ static inline void gen_io_start(void)
     tcg_temp_free_i32(tmp);
 }
 
+/*
+ * cpu->can_do_io is cleared automatically at the beginning of
+ * each translation block.  The cost is minimal and only paid
+ * for -icount, plus it would be very easy to forget doing it
+ * in the translator.  Therefore, backends only need to call
+ * gen_io_start.
+ */
 static inline void gen_io_end(void)
 {
     TCGv_i32 tmp = tcg_const_i32(0);
@@ -58,7 +65,6 @@ static inline void gen_tb_start(TranslationBlock *tb)
         tcg_gen_st16_i32(count, cpu_env,
                          offsetof(ArchCPU, neg.icount_decr.u16.low) -
                          offsetof(ArchCPU, env));
-        /* Disable I/O by default */
         gen_io_end();
     }
 
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 2c9cccf..1e29653 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -1332,7 +1332,6 @@ static DisasJumpType gen_mfpr(DisasContext *ctx, TCGv va, int regno)
         if (use_icount) {
             gen_io_start();
             helper(va);
-            gen_io_end();
             return DISAS_PC_STALE;
         } else {
             helper(va);
@@ -2398,7 +2397,6 @@ static DisasJumpType translate_one(DisasContext *ctx, uint32_t insn)
             if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
                 gen_io_start();
                 gen_helper_load_pcc(va, cpu_env);
-                gen_io_end();
                 ret = DISAS_PC_STALE;
             } else {
                 gen_helper_load_pcc(va, cpu_env);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d323147..079b5f4 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1797,7 +1797,6 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
 
     if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
         /* I/O operations must end the TB here (whether read or write) */
-        gen_io_end();
         s->base.is_jmp = DISAS_UPDATE;
     } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
         /* We default to ending the TB on a coprocessor register write,
@@ -2104,9 +2103,6 @@ static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 
         gen_helper_exception_return(cpu_env, dst);
         tcg_temp_free_i64(dst);
-        if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
-        }
         /* Must exit loop to check un-masked IRQs */
         s->base.is_jmp = DISAS_EXIT;
         return;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7853462..d9dbe03 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3245,9 +3245,6 @@ static void gen_rfe(DisasContext *s, TCGv_i32 pc, TCGv_i32 cpsr)
         gen_io_start();
     }
     gen_helper_cpsr_write_eret(cpu_env, cpsr);
-    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
     tcg_temp_free_i32(cpsr);
     /* Must exit loop to check un-masked IRQs */
     s->base.is_jmp = DISAS_EXIT;
@@ -7338,7 +7335,6 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
 
         if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
             /* I/O operations must end the TB here (whether read or write) */
-            gen_io_end();
             gen_lookup_tb(s);
         } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
             /* We default to ending the TB on a coprocessor register write,
@@ -9207,9 +9203,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
                         gen_io_start();
                     }
                     gen_helper_cpsr_write_eret(cpu_env, tmp);
-                    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-                        gen_io_end();
-                    }
                     tcg_temp_free_i32(tmp);
                     /* Must exit loop to check un-masked IRQs */
                     s->base.is_jmp = DISAS_EXIT;
diff --git a/target/cris/translate.c b/target/cris/translate.c
index 3429a3b..e752bd0 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -3225,8 +3225,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
 
     npc = dc->pc;
 
-        if (tb_cflags(tb) & CF_LAST_IO)
-            gen_io_end();
     /* Force an update if the per-tb cpu state has changed.  */
     if (dc->is_jmp == DISAS_NEXT
         && (dc->cpustate_changed || !dc->flagx_known
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 188fe68..8c61895 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -2161,7 +2161,6 @@ static bool trans_mfctl(DisasContext *ctx, arg_mfctl *a)
         if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
             gen_io_start();
             gen_helper_read_interval_timer(tmp);
-            gen_io_end();
             ctx->base.is_jmp = DISAS_IAQ_N_STALE;
         } else {
             gen_helper_read_interval_timer(tmp);
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 03150a8..5cd74ad 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5381,7 +5381,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_op_mov_reg_v(s, dflag, rm, s->T0);
             set_cc_op(s, CC_OP_EFLAGS);
             if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-                gen_io_end();
                 gen_jmp(s, s->pc - s->cs_base);
             }
             break;
@@ -6443,7 +6442,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_op_mov_reg_v(s, ot, R_EAX, s->T1);
         gen_bpt_io(s, s->tmp2_i32, ot);
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
             gen_jmp(s, s->pc - s->cs_base);
         }
         break;
@@ -6464,7 +6462,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_helper_out_func(ot, s->tmp2_i32, s->tmp3_i32);
         gen_bpt_io(s, s->tmp2_i32, ot);
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
             gen_jmp(s, s->pc - s->cs_base);
         }
         break;
@@ -6482,7 +6479,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_op_mov_reg_v(s, ot, R_EAX, s->T1);
         gen_bpt_io(s, s->tmp2_i32, ot);
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
             gen_jmp(s, s->pc - s->cs_base);
         }
         break;
@@ -6502,7 +6498,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_helper_out_func(ot, s->tmp2_i32, s->tmp3_i32);
         gen_bpt_io(s, s->tmp2_i32, ot);
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
             gen_jmp(s, s->pc - s->cs_base);
         }
         break;
@@ -7206,7 +7201,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         gen_helper_rdtsc(cpu_env);
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
             gen_jmp(s, s->pc - s->cs_base);
         }
         break;
@@ -7666,7 +7660,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             }
             gen_helper_rdtscp(cpu_env);
             if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-                gen_io_end();
                 gen_jmp(s, s->pc - s->cs_base);
             }
             break;
@@ -8036,9 +8029,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                     gen_op_mov_v_reg(s, ot, s->T0, rm);
                     gen_helper_write_crN(cpu_env, tcg_const_i32(reg),
                                          s->T0);
-                    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-                        gen_io_end();
-                    }
                     gen_jmp_im(s, s->pc - s->cs_base);
                     gen_eob(s);
                 } else {
diff --git a/target/lm32/translate.c b/target/lm32/translate.c
index b9f2f2c..778cae1 100644
--- a/target/lm32/translate.c
+++ b/target/lm32/translate.c
@@ -885,9 +885,6 @@ static void dec_wcsr(DisasContext *dc)
         }
         gen_helper_wcsr_im(cpu_env, cpu_R[dc->r1]);
         tcg_gen_movi_tl(cpu_pc, dc->pc + 4);
-        if (tb_cflags(dc->tb) & CF_USE_ICOUNT) {
-            gen_io_end();
-        }
         dc->is_jmp = DISAS_UPDATE;
         break;
     case CSR_IP:
@@ -897,9 +894,6 @@ static void dec_wcsr(DisasContext *dc)
         }
         gen_helper_wcsr_ip(cpu_env, cpu_R[dc->r1]);
         tcg_gen_movi_tl(cpu_pc, dc->pc + 4);
-        if (tb_cflags(dc->tb) & CF_USE_ICOUNT) {
-            gen_io_end();
-        }
         dc->is_jmp = DISAS_UPDATE;
         break;
     case CSR_ICC:
@@ -1111,9 +1105,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
          && (dc->pc - page_start < TARGET_PAGE_SIZE)
          && num_insns < max_insns);
 
-    if (tb_cflags(tb) & CF_LAST_IO) {
-        gen_io_end();
-    }
 
     if (unlikely(cs->singlestep_enabled)) {
         if (dc->is_jmp == DISAS_NEXT) {
diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index 9ce65f3..95ff663 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -1724,8 +1724,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
             npc = dc->jmp_pc;
     }
 
-    if (tb_cflags(tb) & CF_LAST_IO)
-        gen_io_end();
     /* Force an update if the per-tb cpu state has changed.  */
     if (dc->is_jmp == DISAS_NEXT
         && (dc->cpustate_changed || org_flags != dc->tb_flags)) {
diff --git a/target/mips/translate.c b/target/mips/translate.c
index ca62800..8aba9dd 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -7126,9 +7126,6 @@ static void gen_mfc0(DisasContext *ctx, TCGv arg, int reg, int sel)
                 gen_io_start();
             }
             gen_helper_mfc0_count(arg, cpu_env);
-            if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-                gen_io_end();
-            }
             /*
              * Break the TB to be able to take timer interrupts immediately
              * after reading count. DISAS_STOP isn't sufficient, we need to
@@ -8293,7 +8290,6 @@ static void gen_mtc0(DisasContext *ctx, TCGv arg, int reg, int sel)
 
     /* For simplicity assume that all writes can cause interrupts.  */
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
         /*
          * DISAS_STOP isn't sufficient, we need to ensure we break out of
          * translated code to check for pending interrupts.
@@ -8604,9 +8600,6 @@ static void gen_dmfc0(DisasContext *ctx, TCGv arg, int reg, int sel)
                 gen_io_start();
             }
             gen_helper_mfc0_count(arg, cpu_env);
-            if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-                gen_io_end();
-            }
             /*
              * Break the TB to be able to take timer interrupts immediately
              * after reading count. DISAS_STOP isn't sufficient, we need to
@@ -9745,7 +9738,6 @@ static void gen_dmtc0(DisasContext *ctx, TCGv arg, int reg, int sel)
 
     /* For simplicity assume that all writes can cause interrupts.  */
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
         /*
          * DISAS_STOP isn't sufficient, we need to ensure we break out of
          * translated code to check for pending interrupts.
@@ -12810,9 +12802,6 @@ static void gen_rdhwr(DisasContext *ctx, int rt, int rd, int sel)
             gen_io_start();
         }
         gen_helper_rdhwr_cc(t0, cpu_env);
-        if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
-        }
         gen_store_gpr(t0, rt);
         /*
          * Break the TB to be able to take timer interrupts immediately
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 17d8f18..e17656e 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -862,10 +862,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
              !tcg_op_buf_full() &&
              num_insns < max_insns);
 
-    if (tb_cflags(tb) & CF_LAST_IO) {
-        gen_io_end();
-    }
-
     /* Indicate where the next block should start */
     switch (dc->is_jmp) {
     case DISAS_NEXT:
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 4a5de28..ca5a916 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -1860,7 +1860,6 @@ static void gen_darn(DisasContext *ctx)
             gen_helper_darn64(cpu_gpr[rD(ctx->opcode)]);
         }
         if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
             gen_stop_exception(ctx);
         }
     }
@@ -3990,9 +3989,6 @@ static void gen_rfi(DisasContext *ctx)
     gen_update_cfar(ctx, ctx->base.pc_next - 4);
     gen_helper_rfi(cpu_env);
     gen_sync_exception(ctx);
-    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
 #endif
 }
 
@@ -4010,9 +4006,6 @@ static void gen_rfid(DisasContext *ctx)
     gen_update_cfar(ctx, ctx->base.pc_next - 4);
     gen_helper_rfid(cpu_env);
     gen_sync_exception(ctx);
-    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
 #endif
 }
 
@@ -4388,9 +4381,6 @@ static void gen_mtmsrd(DisasContext *ctx)
         /* Must stop the translation as machine state (may have) changed */
         /* Note that mtmsr is not always defined as context-synchronizing */
         gen_stop_exception(ctx);
-        if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
-        }
     }
 #endif /* !defined(CONFIG_USER_ONLY) */
 }
@@ -4428,9 +4418,6 @@ static void gen_mtmsr(DisasContext *ctx)
         tcg_gen_mov_tl(msr, cpu_gpr[rS(ctx->opcode)]);
 #endif
         gen_helper_store_msr(cpu_env, msr);
-        if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-            gen_io_end();
-        }
         tcg_temp_free(msr);
         /* Must stop the translation as machine state (may have) changed */
         /* Note that mtmsr is not always defined as context-synchronizing */
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 86fc8f2..66d9a73 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -189,7 +189,6 @@ static void spr_read_decr(DisasContext *ctx, int gprn, int sprn)
     }
     gen_helper_load_decr(cpu_gpr[gprn], cpu_env);
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
         gen_stop_exception(ctx);
     }
 }
@@ -201,7 +200,6 @@ static void spr_write_decr(DisasContext *ctx, int sprn, int gprn)
     }
     gen_helper_store_decr(cpu_env, cpu_gpr[gprn]);
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
         gen_stop_exception(ctx);
     }
 }
diff --git a/target/riscv/insn_trans/trans_rvi.inc.c b/target/riscv/insn_trans/trans_rvi.inc.c
index ea64731..1af795e 100644
--- a/target/riscv/insn_trans/trans_rvi.inc.c
+++ b/target/riscv/insn_trans/trans_rvi.inc.c
@@ -511,7 +511,6 @@ static bool trans_fence_i(DisasContext *ctx, arg_fence_i *a)
 } while (0)
 
 #define RISCV_OP_CSR_POST do {\
-    gen_io_end(); \
     gen_set_gpr(a->rd, dest); \
     tcg_gen_movi_tl(cpu_pc, ctx->pc_succ_insn); \
     exit_tb(ctx); \
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 091bab5..02c1612 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4412,10 +4412,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                     gen_helper_tick_set_limit(r_tickptr,
                                                               cpu_tick_cmpr);
                                     tcg_temp_free_ptr(r_tickptr);
-                                    if (tb_cflags(dc->base.tb) &
-                                           CF_USE_ICOUNT) {
-                                        gen_io_end();
-                                    }
                                     /* End TB to handle timer interrupt */
                                     dc->base.is_jmp = DISAS_EXIT;
                                 }
@@ -4440,10 +4436,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                     gen_helper_tick_set_count(r_tickptr,
                                                               cpu_tmp0);
                                     tcg_temp_free_ptr(r_tickptr);
-                                    if (tb_cflags(dc->base.tb) &
-                                           CF_USE_ICOUNT) {
-                                        gen_io_end();
-                                    }
                                     /* End TB to handle timer interrupt */
                                     dc->base.is_jmp = DISAS_EXIT;
                                 }
@@ -4468,10 +4460,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                     gen_helper_tick_set_limit(r_tickptr,
                                                               cpu_stick_cmpr);
                                     tcg_temp_free_ptr(r_tickptr);
-                                    if (tb_cflags(dc->base.tb) &
-                                           CF_USE_ICOUNT) {
-                                        gen_io_end();
-                                    }
                                     /* End TB to handle timer interrupt */
                                     dc->base.is_jmp = DISAS_EXIT;
                                 }
@@ -4588,10 +4576,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                     gen_helper_tick_set_count(r_tickptr,
                                                               cpu_tmp0);
                                     tcg_temp_free_ptr(r_tickptr);
-                                    if (tb_cflags(dc->base.tb) &
-                                           CF_USE_ICOUNT) {
-                                        gen_io_end();
-                                    }
                                     /* End TB to handle timer interrupt */
                                     dc->base.is_jmp = DISAS_EXIT;
                                 }
diff --git a/target/unicore32/translate.c b/target/unicore32/translate.c
index d27451e..0e01f35 100644
--- a/target/unicore32/translate.c
+++ b/target/unicore32/translate.c
@@ -1931,7 +1931,6 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
                code.  */
             cpu_abort(cs, "IO on conditional branch instruction");
         }
-        gen_io_end();
     }
 
     /* At this stage dc->condjmp will only be set when the skipped
diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 6f1da87..54b770b 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -540,9 +540,6 @@ static void gen_waiti(DisasContext *dc, uint32_t imm4)
         gen_io_start();
     }
     gen_helper_waiti(cpu_env, pc, intlevel);
-    if (tb_cflags(dc->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
     tcg_temp_free(pc);
     tcg_temp_free(intlevel);
 }
@@ -2216,9 +2213,6 @@ static void translate_rsr_ccount(DisasContext *dc, const OpcodeArg arg[],
     }
     gen_helper_update_ccount(cpu_env);
     tcg_gen_mov_i32(arg[0].out, cpu_SR[par[0]]);
-    if (tb_cflags(dc->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
 #endif
 }
 
@@ -2608,9 +2602,6 @@ static void translate_wsr_ccompare(DisasContext *dc, const OpcodeArg arg[],
     tcg_gen_mov_i32(cpu_SR[par[0]], arg[0].in);
     gen_helper_update_ccompare(cpu_env, tmp);
     tcg_temp_free(tmp);
-    if (tb_cflags(dc->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
 #endif
 }
 
@@ -2622,9 +2613,6 @@ static void translate_wsr_ccount(DisasContext *dc, const OpcodeArg arg[],
         gen_io_start();
     }
     gen_helper_wsr_ccount(cpu_env, arg[0].in);
-    if (tb_cflags(dc->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
 #endif
 }
 
@@ -2831,9 +2819,6 @@ static void translate_xsr_ccount(DisasContext *dc, const OpcodeArg arg[],
     tcg_gen_mov_i32(arg[0].out, tmp);
     tcg_temp_free(tmp);
 
-    if (tb_cflags(dc->base.tb) & CF_USE_ICOUNT) {
-        gen_io_end();
-    }
 #endif
 }
 
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 29/36] cpus-common: nuke finish_safe_work
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (27 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 28/36] icount: remove unnecessary gen_io_end calls Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 30/36] cpus-common: assert BQL nesting within cpu-exclusive sections Paolo Bonzini
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Roman Kagan

From: Roman Kagan <rkagan@virtuozzo.com>

It was introduced in commit ab129972c8b41e15b0521895a46fd9c752b68a5e,
with the following motivation:

  Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
  qemu_cpu_list_lock: together with a call to exclusive_idle (via
  cpu_exec_start/end) in cpu_list_add, this protects exclusive work
  against concurrent CPU addition and removal.

However, it seems to be redundant, because the cpu-exclusive
infrastructure provides suffificent protection against the newly added
CPU starting execution while the cpu-exclusive work is running, and the
aforementioned traversing of the cpu list is protected by
qemu_cpu_list_lock.

Besides, this appears to be the only place where the cpu-exclusive
section is entered with the BQL taken, which has been found to trigger
AB-BA deadlock as follows:

    vCPU thread                             main thread
    -----------                             -----------
async_safe_run_on_cpu(self,
                      async_synic_update)
...                                         [cpu hot-add]
process_queued_cpu_work()
  qemu_mutex_unlock_iothread()
                                            [grab BQL]
  start_exclusive()                         cpu_list_add()
  async_synic_update()                        finish_safe_work()
    qemu_mutex_lock_iothread()                  cpu_exec_start()

So remove it.  This paves the way to establishing a strict nesting rule
of never entering the exclusive section with the BQL taken.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20190523105440.27045-2-rkagan@virtuozzo.com>
---
 cpus-common.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 3ca58c6..023cfeb 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
     return cpu_index;
 }
 
-static void finish_safe_work(CPUState *cpu)
-{
-    cpu_exec_start(cpu);
-    cpu_exec_end(cpu);
-}
-
 void cpu_list_add(CPUState *cpu)
 {
     qemu_mutex_lock(&qemu_cpu_list_lock);
@@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
     }
     QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
     qemu_mutex_unlock(&qemu_cpu_list_lock);
-
-    finish_safe_work(cpu);
 }
 
 void cpu_list_remove(CPUState *cpu)
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 30/36] cpus-common: assert BQL nesting within cpu-exclusive sections
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (28 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 29/36] cpus-common: nuke finish_safe_work Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 31/36] kvm: vmxcap: Enhance with latest features Paolo Bonzini
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Roman Kagan

From: Roman Kagan <rkagan@virtuozzo.com>

Assert that the cpu-exclusive sections are never entered/left with the
BQL taken.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20190523105440.27045-3-rkagan@virtuozzo.com>
---
 cpus-common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/cpus-common.c b/cpus-common.c
index 023cfeb..9aa75fe 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -174,6 +174,7 @@ void start_exclusive(void)
     CPUState *other_cpu;
     int running_cpus;
 
+    assert(!qemu_mutex_iothread_locked());
     qemu_mutex_lock(&qemu_cpu_list_lock);
     exclusive_idle();
 
@@ -205,6 +206,7 @@ void start_exclusive(void)
 /* Finish an exclusive operation.  */
 void end_exclusive(void)
 {
+    assert(!qemu_mutex_iothread_locked());
     qemu_mutex_lock(&qemu_cpu_list_lock);
     atomic_set(&pending_cpus, 0);
     qemu_cond_broadcast(&exclusive_resume);
@@ -214,6 +216,7 @@ void end_exclusive(void)
 /* Wait for exclusive ops to finish, and begin cpu execution.  */
 void cpu_exec_start(CPUState *cpu)
 {
+    assert(!qemu_mutex_iothread_locked());
     atomic_set(&cpu->running, true);
 
     /* Write cpu->running before reading pending_cpus.  */
@@ -255,6 +258,7 @@ void cpu_exec_start(CPUState *cpu)
 /* Mark cpu as not executing, and release pending exclusive ops.  */
 void cpu_exec_end(CPUState *cpu)
 {
+    assert(!qemu_mutex_iothread_locked());
     atomic_set(&cpu->running, false);
 
     /* Write cpu->running before reading pending_cpus.  */
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 31/36] kvm: vmxcap: Enhance with latest features
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (29 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 30/36] cpus-common: assert BQL nesting within cpu-exclusive sections Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 32/36] HACKING: Document 'struct' keyword usage Paolo Bonzini
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

Based on SDM from May 2019.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 scripts/kvm/vmxcap | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/scripts/kvm/vmxcap b/scripts/kvm/vmxcap
index 99a8146..d8c7d6d 100755
--- a/scripts/kvm/vmxcap
+++ b/scripts/kvm/vmxcap
@@ -178,7 +178,11 @@ controls = [
             19: 'Conceal non-root operation from PT',
             20: 'Enable XSAVES/XRSTORS',
             22: 'Mode-based execute control (XS/XU)',
+            23: 'Sub-page write permissions',
+            24: 'GPA translation for PT',
             25: 'TSC scaling',
+            26: 'User wait and pause',
+            28: 'ENCLV exiting',
             },
         cap_msr = MSR_IA32_VMX_PROCBASED_CTLS2,
         ),
@@ -197,6 +201,7 @@ controls = [
             22: 'Save VMX-preemption timer value',
             23: 'Clear IA32_BNDCFGS',
             24: 'Conceal VM exits from PT',
+            25: 'Clear IA32_RTIT_CTL',
             },
         cap_msr = MSR_IA32_VMX_EXIT_CTLS,
         true_cap_msr = MSR_IA32_VMX_TRUE_EXIT_CTLS,
@@ -214,6 +219,7 @@ controls = [
             15: 'Load IA32_EFER',
             16: 'Load IA32_BNDCFGS',
             17: 'Conceal VM entries from PT',
+            18: 'Load IA32_RTIT_CTL',
             },
         cap_msr = MSR_IA32_VMX_ENTRY_CTLS,
         true_cap_msr = MSR_IA32_VMX_TRUE_ENTRY_CTLS,
@@ -227,6 +233,7 @@ controls = [
             6: 'HLT activity state',
             7: 'Shutdown activity state',
             8: 'Wait-for-SIPI activity state',
+            14: 'PT in VMX operation',
             15: 'IA32_SMBASE support',
             (16,24): 'Number of CR3-target values',
             (25,27): 'MSR-load/store count recommendation',
@@ -249,6 +256,7 @@ controls = [
             17: '1GB EPT pages',
             20: 'INVEPT supported',
             21: 'EPT accessed and dirty flags',
+            22: 'Advanced VM-exit information for EPT violations',
             25: 'Single-context INVEPT',
             26: 'All-context INVEPT',
             32: 'INVVPID supported',
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 32/36] HACKING: Document 'struct' keyword usage
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (30 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 31/36] kvm: vmxcap: Enhance with latest features Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 33/36] migration: do not rom_reset() during incoming migration Paolo Bonzini
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eduardo Habkost

From: Eduardo Habkost <ehabkost@redhat.com>

Sometimes we use the 'struct' keyword in headers to help us
reduce dependencies between header files.  Document that
practice.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 HACKING | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/HACKING b/HACKING
index 0fc3e0f..097d482 100644
--- a/HACKING
+++ b/HACKING
@@ -100,7 +100,19 @@ pointer, you're guaranteed that it is used to modify the storage
 it points to, or it is aliased to another pointer that is.
 
 2.3. Typedefs
-Typedefs are used to eliminate the redundant 'struct' keyword.
+
+Typedefs are used to eliminate the redundant 'struct' keyword, since type
+names have a different style than other identifiers ("CamelCase" versus
+"snake_case").  Each named struct type should have a CamelCase name and a
+corresponding typedef.
+
+Since certain C compilers choke on duplicated typedefs, you should avoid
+them and declare a typedef only in one header file.  For common types,
+you can use "include/qemu/typedefs.h" for example.  However, as a matter
+of convenience it is also perfectly fine to use forward struct
+definitions instead of typedefs in headers and function prototypes; this
+avoids problems with duplicated typedefs and reduces the need to include
+headers from other headers.
 
 2.4. Reserved namespaces in C and POSIX
 Underscore capital, double underscore, and underscore 't' suffixes should be
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 33/36] migration: do not rom_reset() during incoming migration
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (31 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 32/36] HACKING: Document 'struct' keyword usage Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 34/36] test-bitmap: test set 1 bit case for bitmap_set Paolo Bonzini
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Catherine Ho

From: Catherine Ho <catherine.hecx@gmail.com>

Commit 18269069c310 ("migration: Introduce ignore-shared capability")
addes ignore-shared capability to bypass the shared ramblock (e,g,
membackend + numa node). It does good to live migration.

As told by Yury,this commit expectes that QEMU doesn't write to guest RAM
until VM starts, but it does on aarch64 qemu:
Backtrace:
1  0x000055f4a296dd84 in address_space_write_rom_internal () at
exec.c:3458
2  0x000055f4a296de3a in address_space_write_rom () at exec.c:3479
3  0x000055f4a2d519ff in rom_reset () at hw/core/loader.c:1101
4  0x000055f4a2d475ec in qemu_devices_reset () at hw/core/reset.c:69
5  0x000055f4a2c90a28 in qemu_system_reset () at vl.c:1675
6  0x000055f4a2c9851d in main () at vl.c:4552

Actually, on arm64 virt marchine, ramblock "dtb" will be filled into ram
druing rom_reset. In ignore-shared incoming case, this rom filling
is not required since all the data has been stored in memory backend
file.

Further more, as suggested by Peter Xu, if we do rom_reset() now with
these ROMs then the RAM data should be re-filled again too with the
migration stream coming in.

Fixes: commit 18269069c310 ("migration: Introduce ignore-shared
capability")
Suggested-by: Yury Kotov <yury-kotov@yandex-team.ru>
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Catherine Ho <catherine.hecx@gmail.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/loader.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 9fb93a6..baa4448 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -1112,6 +1112,15 @@ static void rom_reset(void *unused)
 {
     Rom *rom;
 
+    /*
+     * We don't need to fill in the RAM with ROM data because we'll fill
+     * the data in during the next incoming migration in all cases.  Note
+     * that some of those RAMs can actually be modified by the guest on ARM
+     * so this is probably the only right thing to do here.
+     */
+    if (runstate_check(RUN_STATE_INMIGRATE))
+        return;
+
     QTAILQ_FOREACH(rom, &roms, next) {
         if (rom->fw_file) {
             continue;
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 34/36] test-bitmap: test set 1 bit case for bitmap_set
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (32 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 33/36] migration: do not rom_reset() during incoming migration Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 35/36] scsi: lsi: exit infinite loop while executing script (CVE-2019-12068) Paolo Bonzini
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Wei Yang

From: Wei Yang <richardw.yang@linux.intel.com>

All current bitmap_set test cases set range across word, while the
handle of a range within one word is different from that.

Add case to set 1 bit as a represent for set range within one word.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/test-bitmap.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tests/test-bitmap.c b/tests/test-bitmap.c
index 18aa584..087e02a 100644
--- a/tests/test-bitmap.c
+++ b/tests/test-bitmap.c
@@ -67,6 +67,18 @@ static void bitmap_set_case(bmap_set_func set_func)
 
     bmap = bitmap_new(BMAP_SIZE);
 
+    /* Set one bit at offset in second word */
+    for (offset = 0; offset <= BITS_PER_LONG; offset++) {
+        bitmap_clear(bmap, 0, BMAP_SIZE);
+        set_func(bmap, BITS_PER_LONG + offset, 1);
+        g_assert_cmpint(find_first_bit(bmap, 2 * BITS_PER_LONG),
+                        ==, BITS_PER_LONG + offset);
+        g_assert_cmpint(find_next_zero_bit(bmap,
+                                           3 * BITS_PER_LONG,
+                                           BITS_PER_LONG + offset),
+                        ==, BITS_PER_LONG + offset + 1);
+    }
+
     /* Both Aligned, set bits [BITS_PER_LONG, 3*BITS_PER_LONG] */
     set_func(bmap, BITS_PER_LONG, 2 * BITS_PER_LONG);
     g_assert_cmpuint(bmap[1], ==, -1ul);
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 35/36] scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (33 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 34/36] test-bitmap: test set 1 bit case for bitmap_set Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  6:59 ` [Qemu-devel] [PULL 36/36] x86: Intel AVX512_BF16 feature enabling Paolo Bonzini
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Prasad J Pandit

When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/scsi/lsi53c895a.c | 41 +++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 10468c1..72f7b59 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -185,6 +185,9 @@ static const char *names[] = {
 /* Flag set if this is a tagged command.  */
 #define LSI_TAG_VALID     (1 << 16)
 
+/* Maximum instructions to process. */
+#define LSI_MAX_INSN    10000
+
 typedef struct lsi_request {
     SCSIRequest *req;
     uint32_t tag;
@@ -1132,7 +1135,21 @@ static void lsi_execute_script(LSIState *s)
 
     s->istat1 |= LSI_ISTAT1_SRUN;
 again:
-    insn_processed++;
+    if (++insn_processed > LSI_MAX_INSN) {
+        /* Some windows drivers make the device spin waiting for a memory
+           location to change.  If we have been executed a lot of code then
+           assume this is the case and force an unexpected device disconnect.
+           This is apparently sufficient to beat the drivers into submission.
+         */
+        if (!(s->sien0 & LSI_SIST0_UDC)) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "lsi_scsi: inf. loop with UDC masked");
+        }
+        lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
+        lsi_disconnect(s);
+        trace_lsi_execute_script_stop();
+        return;
+    }
     insn = read_dword(s, s->dsp);
     if (!insn) {
         /* If we receive an empty opcode increment the DSP by 4 bytes
@@ -1569,19 +1586,7 @@ again:
             }
         }
     }
-    if (insn_processed > 10000 && s->waiting == LSI_NOWAIT) {
-        /* Some windows drivers make the device spin waiting for a memory
-           location to change.  If we have been executed a lot of code then
-           assume this is the case and force an unexpected device disconnect.
-           This is apparently sufficient to beat the drivers into submission.
-         */
-        if (!(s->sien0 & LSI_SIST0_UDC)) {
-            qemu_log_mask(LOG_GUEST_ERROR,
-                          "lsi_scsi: inf. loop with UDC masked");
-        }
-        lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-        lsi_disconnect(s);
-    } else if (s->istat1 & LSI_ISTAT1_SRUN && s->waiting == LSI_NOWAIT) {
+    if (s->istat1 & LSI_ISTAT1_SRUN && s->waiting == LSI_NOWAIT) {
         if (s->dcntl & LSI_DCNTL_SSM) {
             lsi_script_dma_interrupt(s, LSI_DSTAT_SSI);
         } else {
@@ -1969,6 +1974,10 @@ static void lsi_reg_writeb(LSIState *s, int offset, uint8_t val)
     case 0x2f: /* DSP[24:31] */
         s->dsp &= 0x00ffffff;
         s->dsp |= val << 24;
+        /*
+         * FIXME: if s->waiting != LSI_NOWAIT, this will only execute one
+         * instruction.  Is this correct?
+         */
         if ((s->dmode & LSI_DMODE_MAN) == 0
             && (s->istat1 & LSI_ISTAT1_SRUN) == 0)
             lsi_execute_script(s);
@@ -1987,6 +1996,10 @@ static void lsi_reg_writeb(LSIState *s, int offset, uint8_t val)
         break;
     case 0x3b: /* DCNTL */
         s->dcntl = val & ~(LSI_DCNTL_PFF | LSI_DCNTL_STD);
+        /*
+         * FIXME: if s->waiting != LSI_NOWAIT, this will only execute one
+         * instruction.  Is this correct?
+         */
         if ((val & LSI_DCNTL_STD) && (s->istat1 & LSI_ISTAT1_SRUN) == 0)
             lsi_execute_script(s);
         break;
-- 
1.8.3.1




^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PULL 36/36] x86: Intel AVX512_BF16 feature enabling
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (34 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 35/36] scsi: lsi: exit infinite loop while executing script (CVE-2019-12068) Paolo Bonzini
@ 2019-08-20  6:59 ` Paolo Bonzini
  2019-08-20  7:42 ` [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 no-reply
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-08-20  6:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jing Liu

From: Jing Liu <jing2.liu@linux.intel.com>

Intel CooperLake cpu adds AVX512_BF16 instruction, defining as
CPUID.(EAX=7,ECX=1):EAX[bit 05].

The patch adds a property for setting the subleaf of CPUID leaf 7 in
case that people would like to specify it.

The release spec link as follows,
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 39 ++++++++++++++++++++++++++++++++++++++-
 target/i386/cpu.h |  7 +++++++
 target/i386/kvm.c |  3 ++-
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b07bc06..0011831 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -770,6 +770,7 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           /* CPUID_7_0_ECX_OSPKE is dynamic */ \
           CPUID_7_0_ECX_LA57)
 #define TCG_7_0_EDX_FEATURES 0
+#define TCG_7_1_EAX_FEATURES 0
 #define TCG_APM_FEATURES 0
 #define TCG_6_EAX_FEATURES CPUID_6_EAX_ARAT
 #define TCG_XSAVE_FEATURES (CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XGETBV1)
@@ -1095,6 +1096,25 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         },
         .tcg_features = TCG_7_0_EDX_FEATURES,
     },
+    [FEAT_7_1_EAX] = {
+        .type = CPUID_FEATURE_WORD,
+        .feat_names = {
+            NULL, NULL, NULL, NULL,
+            NULL, "avx512-bf16", NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+        },
+        .cpuid = {
+            .eax = 7,
+            .needs_ecx = true, .ecx = 1,
+            .reg = R_EAX,
+        },
+        .tcg_features = TCG_7_1_EAX_FEATURES,
+    },
     [FEAT_8000_0007_EDX] = {
         .type = CPUID_FEATURE_WORD,
         .feat_names = {
@@ -4292,13 +4312,19 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     case 7:
         /* Structured Extended Feature Flags Enumeration Leaf */
         if (count == 0) {
-            *eax = 0; /* Maximum ECX value for sub-leaves */
+            /* Maximum ECX value for sub-leaves */
+            *eax = env->cpuid_level_func7;
             *ebx = env->features[FEAT_7_0_EBX]; /* Feature flags */
             *ecx = env->features[FEAT_7_0_ECX]; /* Feature flags */
             if ((*ecx & CPUID_7_0_ECX_PKU) && env->cr[4] & CR4_PKE_MASK) {
                 *ecx |= CPUID_7_0_ECX_OSPKE;
             }
             *edx = env->features[FEAT_7_0_EDX]; /* Feature flags */
+        } else if (count == 1) {
+            *eax = env->features[FEAT_7_1_EAX];
+            *ebx = 0;
+            *ecx = 0;
+            *edx = 0;
         } else {
             *eax = 0;
             *ebx = 0;
@@ -4948,6 +4974,11 @@ static void x86_cpu_adjust_feat_level(X86CPU *cpu, FeatureWord w)
         x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel2, eax);
     break;
     }
+
+    if (eax == 7) {
+        x86_cpu_adjust_level(cpu, &env->cpuid_min_level_func7,
+                             fi->cpuid.ecx);
+    }
 }
 
 /* Calculate XSAVE components based on the configured CPU feature flags */
@@ -5066,6 +5097,7 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
         x86_cpu_adjust_feat_level(cpu, FEAT_1_ECX);
         x86_cpu_adjust_feat_level(cpu, FEAT_6_EAX);
         x86_cpu_adjust_feat_level(cpu, FEAT_7_0_ECX);
+        x86_cpu_adjust_feat_level(cpu, FEAT_7_1_EAX);
         x86_cpu_adjust_feat_level(cpu, FEAT_8000_0001_EDX);
         x86_cpu_adjust_feat_level(cpu, FEAT_8000_0001_ECX);
         x86_cpu_adjust_feat_level(cpu, FEAT_8000_0007_EDX);
@@ -5097,6 +5129,9 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
     }
 
     /* Set cpuid_*level* based on cpuid_min_*level, if not explicitly set */
+    if (env->cpuid_level_func7 == UINT32_MAX) {
+        env->cpuid_level_func7 = env->cpuid_min_level_func7;
+    }
     if (env->cpuid_level == UINT32_MAX) {
         env->cpuid_level = env->cpuid_min_level;
     }
@@ -5870,6 +5905,8 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_BOOL("host-phys-bits", X86CPU, host_phys_bits, false),
     DEFINE_PROP_UINT8("host-phys-bits-limit", X86CPU, host_phys_bits_limit, 0),
     DEFINE_PROP_BOOL("fill-mtrr-mask", X86CPU, fill_mtrr_mask, true),
+    DEFINE_PROP_UINT32("level-func7", X86CPU, env.cpuid_level_func7,
+                       UINT32_MAX),
     DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, UINT32_MAX),
     DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, UINT32_MAX),
     DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, UINT32_MAX),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 44e42f5..e4472e7 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -479,6 +479,7 @@ typedef enum FeatureWord {
     FEAT_7_0_EBX,       /* CPUID[EAX=7,ECX=0].EBX */
     FEAT_7_0_ECX,       /* CPUID[EAX=7,ECX=0].ECX */
     FEAT_7_0_EDX,       /* CPUID[EAX=7,ECX=0].EDX */
+    FEAT_7_1_EAX,       /* CPUID[EAX=7,ECX=1].EAX */
     FEAT_8000_0001_EDX, /* CPUID[8000_0001].EDX */
     FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */
     FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */
@@ -692,6 +693,8 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EDX_CORE_CAPABILITY   (1U << 30)  /*Core Capability*/
 #define CPUID_7_0_EDX_SPEC_CTRL_SSBD  (1U << 31) /* Speculative Store Bypass Disable */
 
+#define CPUID_7_1_EAX_AVX512_BF16 (1U << 5) /* AVX512 BFloat16 Instruction */
+
 #define CPUID_8000_0008_EBX_WBNOINVD  (1U << 9)  /* Write back and
                                                                              do not invalidate cache */
 #define CPUID_8000_0008_EBX_IBPB    (1U << 12) /* Indirect Branch Prediction Barrier */
@@ -1323,6 +1326,10 @@ typedef struct CPUX86State {
     /* Fields after this point are preserved across CPU reset. */
 
     /* processor features (e.g. for CPUID insn) */
+    /* Minimum cpuid leaf 7 value */
+    uint32_t cpuid_level_func7;
+    /* Actual cpuid leaf 7 value */
+    uint32_t cpuid_min_level_func7;
     /* Minimum level/xlevel/xlevel2, based on CPU model + features */
     uint32_t cpuid_min_level, cpuid_min_xlevel, cpuid_min_xlevel2;
     /* Maximum level/xlevel/xlevel2 value for auto-assignment: */
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index ce3f1c3..6ecdcb7 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1498,6 +1498,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                 c = &cpuid_data.entries[cpuid_i++];
             }
             break;
+        case 0x7:
         case 0x14: {
             uint32_t times;
 
@@ -1510,7 +1511,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             for (j = 1; j <= times; ++j) {
                 if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
                     fprintf(stderr, "cpuid_data is full, no space for "
-                                "cpuid(eax:0x14,ecx:0x%x)\n", j);
+                                "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
                     abort();
                 }
                 c = &cpuid_data.entries[cpuid_i++];
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (35 preceding siblings ...)
  2019-08-20  6:59 ` [Qemu-devel] [PULL 36/36] x86: Intel AVX512_BF16 feature enabling Paolo Bonzini
@ 2019-08-20  7:42 ` no-reply
  2019-08-20  9:26 ` Peter Maydell
  2019-08-20 23:42 ` no-reply
  38 siblings, 0 replies; 47+ messages in thread
From: no-reply @ 2019-08-20  7:42 UTC (permalink / raw)
  To: pbonzini; +Cc: qemu-devel

Patchew URL: https://patchew.org/QEMU/1566284395-30287-1-git-send-email-pbonzini@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20
Message-id: 1566284395-30287-1-git-send-email-pbonzini@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/1566284395-30287-1-git-send-email-pbonzini@redhat.com -> patchew/1566284395-30287-1-git-send-email-pbonzini@redhat.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' (https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' (https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out '22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out '90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out '20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' (https://github.com/openssl/openssl) registered for path 'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out '50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out '09403100de2f6f1cdd0d484dcb8e620f1c335c8f'
Cloning into 'roms/ipxe'...
Submodule path 'roms/ipxe': checked out 'de4565cbe76ea9f7913a01f331be3ee901bb6e17'
Cloning into 'roms/openbios'...
Submodule path 'roms/openbios': checked out 'c79e0ecb84f4f1ee3f73f521622e264edd1bf174'
Cloning into 'roms/openhackware'...
Submodule path 'roms/openhackware': checked out 'c559da7c8eec5e45ef1f67978827af6f0b9546f5'
Cloning into 'roms/opensbi'...
Submodule path 'roms/opensbi': checked out 'ce228ee0919deb9957192d723eecc8aaae2697c6'
Cloning into 'roms/qemu-palcode'...
Submodule path 'roms/qemu-palcode': checked out 'bf0e13698872450164fa7040da36a95d2d4b326f'
Cloning into 'roms/seabios'...
Submodule path 'roms/seabios': checked out 'a5cab58e9a3fb6e168aba919c5669bea406573b4'
Cloning into 'roms/seabios-hppa'...
Submodule path 'roms/seabios-hppa': checked out '0f4fe84658165e96ce35870fd19fc634e182e77b'
Cloning into 'roms/sgabios'...
Submodule path 'roms/sgabios': checked out 'cbaee52287e5f32373181cff50a00b6c4ac9015a'
Cloning into 'roms/skiboot'...
Submodule path 'roms/skiboot': checked out '261ca8e779e5138869a45f174caa49be6a274501'
Cloning into 'roms/u-boot'...
Submodule path 'roms/u-boot': checked out 'd3689267f92c5956e09cc7d1baa4700141662bff'
Cloning into 'roms/u-boot-sam460ex'...
Submodule path 'roms/u-boot-sam460ex': checked out '60b3916f33e617a815973c5a6df77055b2e3a588'
Cloning into 'slirp'...
Submodule path 'slirp': checked out '126c04acbabd7ad32c2b018fe10dfac2a3bc1210'
Cloning into 'tests/fp/berkeley-softfloat-3'...
Submodule path 'tests/fp/berkeley-softfloat-3': checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'tests/fp/berkeley-testfloat-3'...
Submodule path 'tests/fp/berkeley-testfloat-3': checked out '5a59dcec19327396a011a17fd924aed4fec416b3'
Cloning into 'ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out '6b3d716e2b6472eb7189d3220552280ef3d832ce'
Switched to a new branch 'test'
4b3b98e x86: Intel AVX512_BF16 feature enabling
49cf949 scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)
28f466a test-bitmap: test set 1 bit case for bitmap_set
b5ae146 migration: do not rom_reset() during incoming migration
810eb3b HACKING: Document 'struct' keyword usage
f0b1664 kvm: vmxcap: Enhance with latest features
5ec49f7 cpus-common: assert BQL nesting within cpu-exclusive sections
5887e64 cpus-common: nuke finish_safe_work
ba02024 icount: remove unnecessary gen_io_end calls
7090243 icount: clean up cpu_can_io at the entry to the block
455f9c1 replay: rename step-related variables and functions
f279de7 replay: refine replay-time module
c051b41 replay: fix replay shutdown
bf310b1 util/qemu-timer: refactor deadline calculation for external timers
6bffd44 replay: document development rules
5a4417d replay: add missing fix for internal function
ab5d0a9 kconfig: do not select VMMOUSE
7f8a5ee timer: last, remove last bits of last
89d81e5 replay: Remove host_clock_last
9ede085 timer: Remove reset notifiers
9a6a12f mc146818rtc: Remove reset notifiers
3a47d60 memory: fix race between TCG and accesses to dirty bitmap
aefebb9 target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions
2f10f92 i386/kvm: initialize struct at full before ioctl call
1e45f05 tests: Fix uninitialized byte in test_visitor_in_fuzz
6af6152 test-throttle: Fix uninitialized use of burst_length
68c0617 target-i386: kvm: 'kvm_get_supported_msrs' cleanup
5acd668 9p: simplify source file selection
ad1d30f configure: Define target access alignment in configure
35467e3 memory: assert on out of scope notification
5913693 hw/i386/pc: Map into memory the initrd
ff22b21 elf-ops.h: Map into memory the ELF to load
b6a6457 loader: Handle memory-mapped ELFs
ff1bf69 target-i386: adds PV_SCHED_YIELD CPUID feature bit
866b9b0 kvm: i386: halt poll control MSR support

=== OUTPUT BEGIN ===
1/35 Checking commit 866b9b0b2c9d (kvm: i386: halt poll control MSR support)
2/35 Checking commit ff1bf69bc8b9 (target-i386: adds PV_SCHED_YIELD CPUID feature bit)
3/35 Checking commit b6a64576d120 (loader: Handle memory-mapped ELFs)
4/35 Checking commit ff22b21d41c9 (elf-ops.h: Map into memory the ELF to load)
WARNING: Block comments use a leading /* on a separate line
#85: FILE: include/hw/elf_ops.h:529:
+            /* Some ELF files really do have segments of zero size;

total: 0 errors, 1 warnings, 128 lines checked

Patch 4/35 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/35 Checking commit 5913693ff255 (hw/i386/pc: Map into memory the initrd)
6/35 Checking commit 35467e34932a (memory: assert on out of scope notification)
7/35 Checking commit ad1d30fa710d (configure: Define target access alignment in configure)
8/35 Checking commit 5acd6682980b (9p: simplify source file selection)
9/35 Checking commit 68c0617afd28 (target-i386: kvm: 'kvm_get_supported_msrs' cleanup)
10/35 Checking commit 6af61523d934 (test-throttle: Fix uninitialized use of burst_length)
11/35 Checking commit 1e45f05d0703 (tests: Fix uninitialized byte in test_visitor_in_fuzz)
12/35 Checking commit 2f10f92d6a60 (i386/kvm: initialize struct at full before ioctl call)
13/35 Checking commit aefebb923eea (target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions)
ERROR: spaces required around that '*' (ctx:WxV)
#46: FILE: target/i386/ops_sse.h:720:
+    static inline RETTYPE x86_##FN(FLOATTYPE a, float_status *s)        \
                                                              ^

WARNING: line over 80 characters
#141: FILE: target/i386/ops_sse.h:797:
+    d->ZMM_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);

WARNING: line over 80 characters
#142: FILE: target/i386/ops_sse.h:798:
+    d->ZMM_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);

WARNING: line over 80 characters
#143: FILE: target/i386/ops_sse.h:799:
+    d->ZMM_L(2) = x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->sse_status);

WARNING: line over 80 characters
#144: FILE: target/i386/ops_sse.h:800:
+    d->ZMM_L(3) = x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->sse_status);

WARNING: line over 80 characters
#151: FILE: target/i386/ops_sse.h:805:
+    d->ZMM_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);

WARNING: line over 80 characters
#152: FILE: target/i386/ops_sse.h:806:
+    d->ZMM_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);

WARNING: line over 80 characters
#160: FILE: target/i386/ops_sse.h:812:
+    d->MMX_L(0) = x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->sse_status);

WARNING: line over 80 characters
#161: FILE: target/i386/ops_sse.h:813:
+    d->MMX_L(1) = x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->sse_status);

WARNING: line over 80 characters
#168: FILE: target/i386/ops_sse.h:818:
+    d->MMX_L(0) = x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->sse_status);

WARNING: line over 80 characters
#169: FILE: target/i386/ops_sse.h:819:
+    d->MMX_L(1) = x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->sse_status);

total: 1 errors, 10 warnings, 162 lines checked

Patch 13/35 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

14/35 Checking commit 3a47d606e090 (memory: fix race between TCG and accesses to dirty bitmap)
WARNING: line over 80 characters
#49: FILE: exec.c:909:
+        newas->tcg_as_listener.log_global_after_sync = tcg_log_global_after_sync;

WARNING: Block comments use a leading /* on a separate line
#65: FILE: exec.c:3155:
+    /* Wait for the CPU to end the current TB.  This avoids the following

WARNING: line over 80 characters
#131: FILE: memory.c:2133:
+    snapshot = cpu_physical_memory_snapshot_and_clear_dirty(mr, addr, size, client);

total: 0 errors, 3 warnings, 104 lines checked

Patch 14/35 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
15/35 Checking commit 9a6a12f4f07f (mc146818rtc: Remove reset notifiers)
16/35 Checking commit 9ede0850dcee (timer: Remove reset notifiers)
17/35 Checking commit 89d81e5838d1 (replay: Remove host_clock_last)
18/35 Checking commit 7f8a5ee5bb59 (timer: last, remove last bits of last)
19/35 Checking commit ab5d0a99df0e (kconfig: do not select VMMOUSE)
20/35 Checking commit 5a4417d17998 (replay: add missing fix for internal function)
21/35 Checking commit 6bffd44ac48e (replay: document development rules)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#21: 
new file mode 100644

total: 0 errors, 1 warnings, 46 lines checked

Patch 21/35 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
22/35 Checking commit bf310b1e3286 (util/qemu-timer: refactor deadline calculation for external timers)
23/35 Checking commit c051b4186e05 (replay: fix replay shutdown)
24/35 Checking commit f279de73e091 (replay: refine replay-time module)
25/35 Checking commit 455f9c1bb327 (replay: rename step-related variables and functions)
26/35 Checking commit 70902434ca0b (icount: clean up cpu_can_io at the entry to the block)
27/35 Checking commit ba02024681a4 (icount: remove unnecessary gen_io_end calls)
28/35 Checking commit 5887e64f22b3 (cpus-common: nuke finish_safe_work)
29/35 Checking commit 5ec49f74b713 (cpus-common: assert BQL nesting within cpu-exclusive sections)
30/35 Checking commit f0b166449809 (kvm: vmxcap: Enhance with latest features)
31/35 Checking commit 810eb3b2376b (HACKING: Document 'struct' keyword usage)
32/35 Checking commit b5ae1460501f (migration: do not rom_reset() during incoming migration)
33/35 Checking commit 28f466a77a16 (test-bitmap: test set 1 bit case for bitmap_set)
34/35 Checking commit 49cf949845b9 (scsi: lsi: exit infinite loop while executing script (CVE-2019-12068))
WARNING: Block comments use a leading /* on a separate line
#39: FILE: hw/scsi/lsi53c895a.c:1140:
+        /* Some windows drivers make the device spin waiting for a memory

WARNING: Block comments use * on subsequent lines
#40: FILE: hw/scsi/lsi53c895a.c:1141:
+        /* Some windows drivers make the device spin waiting for a memory
+           location to change.  If we have been executed a lot of code then

total: 0 errors, 2 warnings, 71 lines checked

Patch 34/35 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
35/35 Checking commit 4b3b98eade19 (x86: Intel AVX512_BF16 feature enabling)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1566284395-30287-1-git-send-email-pbonzini@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (36 preceding siblings ...)
  2019-08-20  7:42 ` [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 no-reply
@ 2019-08-20  9:26 ` Peter Maydell
  2019-08-20 23:42 ` no-reply
  38 siblings, 0 replies; 47+ messages in thread
From: Peter Maydell @ 2019-08-20  9:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: QEMU Developers

On Tue, 20 Aug 2019 at 08:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> The following changes since commit 864ab314f1d924129d06ac7b571f105a2b76a4b2:
>
>   Update version for v4.1.0-rc4 release (2019-08-06 17:05:21 +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 6f93977075ae9971d73879ca872f75e2737f66c5:
>
>   x86: Intel AVX512_BF16 feature enabling (2019-08-20 08:59:18 +0200)
>
> ----------------------------------------------------------------
> * New KVM PV features (Marcelo, Wanpeng)
> * valgrind fixes (Andrey)
> * Remove clock reset notifiers (David)
> * KConfig and Makefile cleanups (Paolo)
> * Replay and icount improvements (Pavel)
> * x86 FP fixes (Peter M.)
> * TCG locking assertions (Roman)
> * x86 support for mmap-ed -kernel/-initrd (Stefano)
> * Other cleanups (Wei Yang, Yan Zhao, Tony)
> * LSI fix for infinite loop (Prasad)
> * ARM migration fix (Catherine)
> * AVX512_BF16 feature (Jing)
>
> ----------------------------------------------------------------

Hi; this fails to compile (all platforms):

/home/petmay01/linaro/qemu-for-merges/hw/core/loader.c: In function ‘rom_reset’:
/home/petmay01/linaro/qemu-for-merges/hw/core/loader.c:1123:9: error:
implicit declaration of function ‘runstate_check’; did you mean
‘type_check’? [-Werror=implicit-function-declaration]
     if (runstate_check(RUN_STATE_INMIGRATE))
         ^~~~~~~~~~~~~~
         type_check
/home/petmay01/linaro/qemu-for-merges/hw/core/loader.c:1123:9: error:
nested extern declaration of ‘runstate_check’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
/home/petmay01/linaro/qemu-for-merges/rules.mak:69: recipe for target
'hw/core/loader.o' failed

I suspect the patchset has a semantic conflict with Marcus's
include-cleanup and needs an extra #include somewhere.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20
  2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
                   ` (37 preceding siblings ...)
  2019-08-20  9:26 ` Peter Maydell
@ 2019-08-20 23:42 ` no-reply
  38 siblings, 0 replies; 47+ messages in thread
From: no-reply @ 2019-08-20 23:42 UTC (permalink / raw)
  To: pbonzini; +Cc: qemu-devel

Patchew URL: https://patchew.org/QEMU/1566284395-30287-1-git-send-email-pbonzini@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      hw/display/xenfb.o
  CC      hw/display/vga-pci.o
  CC      hw/display/vga-isa.o
/tmp/qemu-test/src/hw/core/loader.c:1123:9: error: implicit declaration of function 'runstate_check' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    if (runstate_check(RUN_STATE_INMIGRATE))
        ^
/tmp/qemu-test/src/hw/core/loader.c:1123:9: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
2 errors generated.
make: *** [/tmp/qemu-test/src/rules.mak:69: hw/core/loader.o] Error 1
make: *** Waiting for unfinished jobs....


The full log is available at
http://patchew.org/logs/1566284395-30287-1-git-send-email-pbonzini@redhat.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-08-20  6:59 ` [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap Paolo Bonzini
@ 2019-08-26 12:19   ` dovgaluk
  2019-09-12  6:54     ` Pavel Dovgalyuk
  2019-09-12 12:45     ` Paolo Bonzini
  2022-08-02 16:17   ` Peter Maydell
  1 sibling, 2 replies; 47+ messages in thread
From: dovgaluk @ 2019-08-26 12:19 UTC (permalink / raw)
  To: Paolo Bonzini, pavel.dovgaluk; +Cc: Qemu-devel, qemu-devel

This patch breaks the execution recording.
While vCPU tries to lock replay mutex in main while loop,
vga causes dirty memory sync and do_run_on_cpu call.
This call waits for vCPU to process the work queue.

Pavel Dovgalyuk

Paolo Bonzini писал 2019-08-20 09:59:
> There is a race between TCG and accesses to the dirty log:
> 
>       vCPU thread                  reader thread
>       -----------------------      -----------------------
>       TLB check -> slow path
>         notdirty_mem_write
>           write to RAM
>           set dirty flag
>                                    clear dirty flag
>       TLB check -> fast path
>                                    read memory
>         write to RAM
> 
> Fortunately, in order to fix it, no change is required to the
> vCPU thread.  However, the reader thread must delay the read after
> the vCPU thread has finished the write.  This can be approximated
> conservatively by run_on_cpu, which waits for the end of the current
> translation block.
> 
> A similar technique is used by KVM, which has to do a synchronous TLB
> flush after doing a test-and-clear of the dirty-page flags.
> 
> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                | 31 +++++++++++++++++++++++++++++++
>  include/exec/memory.h | 12 ++++++++++++
>  memory.c              | 10 +++++++++-
>  migration/ram.c       |  1 +
>  4 files changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/exec.c b/exec.c
> index 3e78de3..ae68f72 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -198,6 +198,7 @@ typedef struct subpage_t {
> 
>  static void io_mem_init(void);
>  static void memory_map_init(void);
> +static void tcg_log_global_after_sync(MemoryListener *listener);
>  static void tcg_commit(MemoryListener *listener);
> 
>  static MemoryRegion io_mem_watch;
> @@ -906,6 +907,7 @@ void cpu_address_space_init(CPUState *cpu, int 
> asidx,
>      newas->cpu = cpu;
>      newas->as = as;
>      if (tcg_enabled()) {
> +        newas->tcg_as_listener.log_global_after_sync =
> tcg_log_global_after_sync;
>          newas->tcg_as_listener.commit = tcg_commit;
>          memory_listener_register(&newas->tcg_as_listener, as);
>      }
> @@ -3143,6 +3145,35 @@ void 
> address_space_dispatch_free(AddressSpaceDispatch *d)
>      g_free(d);
>  }
> 
> +static void do_nothing(CPUState *cpu, run_on_cpu_data d)
> +{
> +}
> +
> +static void tcg_log_global_after_sync(MemoryListener *listener)
> +{while (1) {
         qemu_mutex_unlock_iothread();
         replay_mutex_lock();
         qemu_mutex_lock_i
> +    CPUAddressSpace *cpuas;
> +
> +    /* Wait for the CPU to end the current TB.  This avoids the 
> following
> +     * incorrect race:
> +     *
> +     *      vCPU                         migration
> +     *      ----------------------       -------------------------
> +     *      TLB check -> slow path
> +     *        notdirty_mem_write
> +     *          write to RAM
> +     *          mark dirty
> +     *                                   clear dirty flag
> +     *      TLB check -> fast path
> +     *                                   read memory
> +     *        write to RAM
> +     *
> +     * by pushing the migration thread's memory read after the vCPU 
> thread has
> +     * written the memory.
> +     */
> +    cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
> +    run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
> +}
> +
>  static void tcg_commit(MemoryListener *listener)
>  {
>      CPUAddressSpace *cpuas;
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index bb0961d..b6bcf31 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -419,6 +419,7 @@ struct MemoryListener {
>      void (*log_clear)(MemoryListener *listener, MemoryRegionSection 
> *section);
>      void (*log_global_start)(MemoryListener *listener);
>      void (*log_global_stop)(MemoryListener *listener);
> +    void (*log_global_after_sync)(MemoryListener *listener);
>      void (*eventfd_add)(MemoryListener *listener, MemoryRegionSection 
> *section,
>                          bool match_data, uint64_t data, EventNotifier 
> *e);
>      void (*eventfd_del)(MemoryListener *listener, MemoryRegionSection 
> *section,
> @@ -1682,6 +1683,17 @@ MemoryRegionSection 
> memory_region_find(MemoryRegion *mr,
>  void memory_global_dirty_log_sync(void);
> 
>  /**
> + * memory_global_dirty_log_sync: synchronize the dirty log for all 
> memory
> + *
> + * Synchronizes the vCPUs with a thread that is reading the dirty 
> bitmap.
> + * This function must be called after the dirty log bitmap is cleared, 
> and
> + * before dirty guest memory pages are read.  If you are using
> + * #DirtyBitmapSnapshot, memory_region_snapshot_and_clear_dirty() 
> takes
> + * care of doing this.
> + */
> +void memory_global_after_dirty_log_sync(void);
> +
> +/**
>   * memory_region_transaction_begin: Start a transaction.
>   *
>   * During a transaction, changes will be accumulated and made visible
> diff --git a/memory.c b/memory.c
> index e42d63a..edd0c13 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -2127,9 +2127,12 @@ DirtyBitmapSnapshot
> *memory_region_snapshot_and_clear_dirty(MemoryRegion *mr,
>                                                              hwaddr 
> size,
>                                                              unsigned 
> client)
>  {
> +    DirtyBitmapSnapshot *snapshot;
>      assert(mr->ram_block);
>      memory_region_sync_dirty_bitmap(mr);
> -    return cpu_physical_memory_snapshot_and_clear_dirty(mr, addr,
> size, client);
> +    snapshot = cpu_physical_memory_snapshot_and_clear_dirty(mr, addr,
> size, client);
> +    memory_global_after_dirty_log_sync();
> +    return snapshot;
>  }
> 
>  bool memory_region_snapshot_get_dirty(MemoryRegion *mr,
> DirtyBitmapSnapshot *snap,
> @@ -2620,6 +2623,11 @@ void memory_global_dirty_log_sync(void)
>      memory_region_sync_dirty_bitmap(NULL);
>  }
> 
> +void memory_global_after_dirty_log_sync(void)
> +{
> +    MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
> +}
> +
>  static VMChangeStateEntry *vmstate_change;
> 
>  void memory_global_dirty_log_start(void)
> diff --git a/migration/ram.c b/migration/ram.c
> index 889148d..4e3a6ae 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1847,6 +1847,7 @@ static void migration_bitmap_sync(RAMState *rs)
>      rcu_read_unlock();
>      qemu_mutex_unlock(&rs->bitmap_mutex);
> 
> +    memory_global_after_dirty_log_sync();
>      trace_migration_bitmap_sync_end(rs->num_dirty_pages_period);
> 
>      end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-08-26 12:19   ` dovgaluk
@ 2019-09-12  6:54     ` Pavel Dovgalyuk
  2019-09-12 17:43       ` Richard Henderson
  2019-09-12 12:45     ` Paolo Bonzini
  1 sibling, 1 reply; 47+ messages in thread
From: Pavel Dovgalyuk @ 2019-09-12  6:54 UTC (permalink / raw)
  To: 'dovgaluk', 'Paolo Bonzini', peter.maydell; +Cc: qemu-devel

Ping.


Pavel Dovgalyuk

> -----Original Message-----
> From: dovgaluk [mailto:dovgaluk@ispras.ru]
> Sent: Monday, August 26, 2019 3:19 PM
> To: Paolo Bonzini; pavel.dovgaluk@ispras.ru
> Cc: qemu-devel@nongnu.org; Qemu-devel
> Subject: Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty
> bitmap
> 
> This patch breaks the execution recording.
> While vCPU tries to lock replay mutex in main while loop,
> vga causes dirty memory sync and do_run_on_cpu call.
> This call waits for vCPU to process the work queue.
> 
> Pavel Dovgalyuk
> 
> Paolo Bonzini писал 2019-08-20 09:59:
> > There is a race between TCG and accesses to the dirty log:
> >
> >       vCPU thread                  reader thread
> >       -----------------------      -----------------------
> >       TLB check -> slow path
> >         notdirty_mem_write
> >           write to RAM
> >           set dirty flag
> >                                    clear dirty flag
> >       TLB check -> fast path
> >                                    read memory
> >         write to RAM
> >
> > Fortunately, in order to fix it, no change is required to the
> > vCPU thread.  However, the reader thread must delay the read after
> > the vCPU thread has finished the write.  This can be approximated
> > conservatively by run_on_cpu, which waits for the end of the current
> > translation block.
> >
> > A similar technique is used by KVM, which has to do a synchronous TLB
> > flush after doing a test-and-clear of the dirty-page flags.
> >
> > Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  exec.c                | 31 +++++++++++++++++++++++++++++++
> >  include/exec/memory.h | 12 ++++++++++++
> >  memory.c              | 10 +++++++++-
> >  migration/ram.c       |  1 +
> >  4 files changed, 53 insertions(+), 1 deletion(-)
> >
> > diff --git a/exec.c b/exec.c
> > index 3e78de3..ae68f72 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -198,6 +198,7 @@ typedef struct subpage_t {
> >
> >  static void io_mem_init(void);
> >  static void memory_map_init(void);
> > +static void tcg_log_global_after_sync(MemoryListener *listener);
> >  static void tcg_commit(MemoryListener *listener);
> >
> >  static MemoryRegion io_mem_watch;
> > @@ -906,6 +907,7 @@ void cpu_address_space_init(CPUState *cpu, int
> > asidx,
> >      newas->cpu = cpu;
> >      newas->as = as;
> >      if (tcg_enabled()) {
> > +        newas->tcg_as_listener.log_global_after_sync =
> > tcg_log_global_after_sync;
> >          newas->tcg_as_listener.commit = tcg_commit;
> >          memory_listener_register(&newas->tcg_as_listener, as);
> >      }
> > @@ -3143,6 +3145,35 @@ void
> > address_space_dispatch_free(AddressSpaceDispatch *d)
> >      g_free(d);
> >  }
> >
> > +static void do_nothing(CPUState *cpu, run_on_cpu_data d)
> > +{
> > +}
> > +
> > +static void tcg_log_global_after_sync(MemoryListener *listener)
> > +{while (1) {
>          qemu_mutex_unlock_iothread();
>          replay_mutex_lock();
>          qemu_mutex_lock_i
> > +    CPUAddressSpace *cpuas;
> > +
> > +    /* Wait for the CPU to end the current TB.  This avoids the
> > following
> > +     * incorrect race:
> > +     *
> > +     *      vCPU                         migration
> > +     *      ----------------------       -------------------------
> > +     *      TLB check -> slow path
> > +     *        notdirty_mem_write
> > +     *          write to RAM
> > +     *          mark dirty
> > +     *                                   clear dirty flag
> > +     *      TLB check -> fast path
> > +     *                                   read memory
> > +     *        write to RAM
> > +     *
> > +     * by pushing the migration thread's memory read after the vCPU
> > thread has
> > +     * written the memory.
> > +     */
> > +    cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
> > +    run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
> > +}
> > +
> >  static void tcg_commit(MemoryListener *listener)
> >  {
> >      CPUAddressSpace *cpuas;
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index bb0961d..b6bcf31 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -419,6 +419,7 @@ struct MemoryListener {
> >      void (*log_clear)(MemoryListener *listener, MemoryRegionSection
> > *section);
> >      void (*log_global_start)(MemoryListener *listener);
> >      void (*log_global_stop)(MemoryListener *listener);
> > +    void (*log_global_after_sync)(MemoryListener *listener);
> >      void (*eventfd_add)(MemoryListener *listener, MemoryRegionSection
> > *section,
> >                          bool match_data, uint64_t data, EventNotifier
> > *e);
> >      void (*eventfd_del)(MemoryListener *listener, MemoryRegionSection
> > *section,
> > @@ -1682,6 +1683,17 @@ MemoryRegionSection
> > memory_region_find(MemoryRegion *mr,
> >  void memory_global_dirty_log_sync(void);
> >
> >  /**
> > + * memory_global_dirty_log_sync: synchronize the dirty log for all
> > memory
> > + *
> > + * Synchronizes the vCPUs with a thread that is reading the dirty
> > bitmap.
> > + * This function must be called after the dirty log bitmap is cleared,
> > and
> > + * before dirty guest memory pages are read.  If you are using
> > + * #DirtyBitmapSnapshot, memory_region_snapshot_and_clear_dirty()
> > takes
> > + * care of doing this.
> > + */
> > +void memory_global_after_dirty_log_sync(void);
> > +
> > +/**
> >   * memory_region_transaction_begin: Start a transaction.
> >   *
> >   * During a transaction, changes will be accumulated and made visible
> > diff --git a/memory.c b/memory.c
> > index e42d63a..edd0c13 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -2127,9 +2127,12 @@ DirtyBitmapSnapshot
> > *memory_region_snapshot_and_clear_dirty(MemoryRegion *mr,
> >                                                              hwaddr
> > size,
> >                                                              unsigned
> > client)
> >  {
> > +    DirtyBitmapSnapshot *snapshot;
> >      assert(mr->ram_block);
> >      memory_region_sync_dirty_bitmap(mr);
> > -    return cpu_physical_memory_snapshot_and_clear_dirty(mr, addr,
> > size, client);
> > +    snapshot = cpu_physical_memory_snapshot_and_clear_dirty(mr, addr,
> > size, client);
> > +    memory_global_after_dirty_log_sync();
> > +    return snapshot;
> >  }
> >
> >  bool memory_region_snapshot_get_dirty(MemoryRegion *mr,
> > DirtyBitmapSnapshot *snap,
> > @@ -2620,6 +2623,11 @@ void memory_global_dirty_log_sync(void)
> >      memory_region_sync_dirty_bitmap(NULL);
> >  }
> >
> > +void memory_global_after_dirty_log_sync(void)
> > +{
> > +    MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
> > +}
> > +
> >  static VMChangeStateEntry *vmstate_change;
> >
> >  void memory_global_dirty_log_start(void)
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 889148d..4e3a6ae 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1847,6 +1847,7 @@ static void migration_bitmap_sync(RAMState *rs)
> >      rcu_read_unlock();
> >      qemu_mutex_unlock(&rs->bitmap_mutex);
> >
> > +    memory_global_after_dirty_log_sync();
> >      trace_migration_bitmap_sync_end(rs->num_dirty_pages_period);
> >
> >      end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-08-26 12:19   ` dovgaluk
  2019-09-12  6:54     ` Pavel Dovgalyuk
@ 2019-09-12 12:45     ` Paolo Bonzini
  1 sibling, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-09-12 12:45 UTC (permalink / raw)
  To: dovgaluk, pavel.dovgaluk; +Cc: Qemu-devel, qemu-devel

On 26/08/19 14:19, dovgaluk wrote:
> This patch breaks the execution recording.
> While vCPU tries to lock replay mutex in main while loop,
> vga causes dirty memory sync and do_run_on_cpu call.
> This call waits for vCPU to process the work queue.

IIUC there is a deadlock because VGA (from the I/O thread) is holding
the replay mutex.  Next time, please include a quick description of who
waits for whom. :)

I think it should be enough to wrap the do_run_on_cpu on replay_mode ==
REPLAY_MODE_NONE, since the I/O and vCPU thread are anyway running in
lockstep when rr is active.  (This reasoning should also be in a comment).

Paolo


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-09-12  6:54     ` Pavel Dovgalyuk
@ 2019-09-12 17:43       ` Richard Henderson
  2019-09-12 22:16         ` Paolo Bonzini
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Henderson @ 2019-09-12 17:43 UTC (permalink / raw)
  To: Pavel Dovgalyuk, 'Paolo Bonzini', peter.maydell; +Cc: qemu-devel

On 9/12/19 2:54 AM, Pavel Dovgalyuk wrote:
> Ping.
> 
> 
> Pavel Dovgalyuk
> 
>> -----Original Message-----
>> From: dovgaluk [mailto:dovgaluk@ispras.ru]
>> Sent: Monday, August 26, 2019 3:19 PM
>> To: Paolo Bonzini; pavel.dovgaluk@ispras.ru
>> Cc: qemu-devel@nongnu.org; Qemu-devel
>> Subject: Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty
>> bitmap
>>
>> This patch breaks the execution recording.
>> While vCPU tries to lock replay mutex in main while loop,
>> vga causes dirty memory sync and do_run_on_cpu call.
>> This call waits for vCPU to process the work queue.
>>
>> Pavel Dovgalyuk
>>
>> Paolo Bonzini писал 2019-08-20 09:59:
>>> There is a race between TCG and accesses to the dirty log:
>>>
>>>       vCPU thread                  reader thread
>>>       -----------------------      -----------------------
>>>       TLB check -> slow path
>>>         notdirty_mem_write
>>>           write to RAM
>>>           set dirty flag
>>>                                    clear dirty flag
>>>       TLB check -> fast path
>>>                                    read memory
>>>         write to RAM
>>>
>>> Fortunately, in order to fix it, no change is required to the
>>> vCPU thread.  However, the reader thread must delay the read after
>>> the vCPU thread has finished the write.  This can be approximated
>>> conservatively by run_on_cpu, which waits for the end of the current
>>> translation block.

If we are going to delay any read of the dirty flags until vCPU has completed
any active TranslationBlock, then we can simplify the TCG operation so that we
do not (ab)use the mmio path, and can promote this into the tlb slow path as we
have recently done with watchpoints.  C.f.

commit 50b107c5d617eaf93301cef20221312e7a986701
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Sat Aug 24 09:51:09 2019 -0700

    cputlb: Handle watchpoints via TLB_WATCHPOINT

That would greatly simplify things from my perspective, for vector and
block-type operations such as we have recently been discussing for S390.  It
would mean that the *only* time we go through TLB_MMIO is for true mmio.

Have I understood your proposal here properly?


r~


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-09-12 17:43       ` Richard Henderson
@ 2019-09-12 22:16         ` Paolo Bonzini
  0 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2019-09-12 22:16 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Maydell, Peter, Pavel Dovgalyuk, qemu-devel

Il gio 12 set 2019, 19:43 Richard Henderson <richard.henderson@linaro.org>
ha scritto:

> >>> Fortunately, in order to fix it, no change is required to the
> >>> vCPU thread.  However, the reader thread must delay the read after
> >>> the vCPU thread has finished the write.  This can be approximated
> >>> conservatively by run_on_cpu, which waits for the end of the current
> >>> translation block.
>
> If we are going to delay any read of the dirty flags until vCPU has
> completed
> any active TranslationBlock, then we can simplify the TCG operation so
> that we
> do not (ab)use the mmio path, and can promote this into the tlb slow path
> as we
> have recently done with watchpoints.


Uh, that's true and I agree it would be great. Die notdirty_mem_ops die...

Paolo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 09/36] 9p: simplify source file selection
  2019-08-20  6:59 ` [Qemu-devel] [PULL 09/36] 9p: simplify source file selection Paolo Bonzini
@ 2020-11-03 20:31   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 47+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-11-03 20:31 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel, xen-devel
  Cc: Marc-André Lureau, Christian Schoenebeck, Cornelia Huck,
	Daniel P . Berrange, Greg Kurz

Hi Paolo,

On 8/20/19 8:59 AM, Paolo Bonzini wrote:
> Express the complex conditions in Kconfig rather than Makefiles, since Kconfig
> is better suited at expressing dependencies and detecting contradictions.
> 
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Kconfig.host        | 1 +
>  fsdev/Makefile.objs | 2 +-
>  hw/9pfs/Kconfig     | 5 +++++
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/Kconfig.host b/Kconfig.host
> index aec9536..bb6e116 100644
> --- a/Kconfig.host
> +++ b/Kconfig.host
> @@ -28,6 +28,7 @@ config VHOST_USER
>  
>  config XEN
>      bool
> +    select FSDEV_9P if VIRTFS

There is something odd with CONFIG_XEN, as it is used
to select accelerator and hardware.

>  
>  config VIRTFS
>      bool
> diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
> index 24bbb3e..42cd70c 100644
> --- a/fsdev/Makefile.objs
> +++ b/fsdev/Makefile.objs
> @@ -1,6 +1,6 @@
>  # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
>  # only pull in the actual 9p backend if we also enabled virtio or xen.
> -ifeq ($(call land,$(CONFIG_VIRTFS),$(call lor,$(CONFIG_VIRTIO_9P),$(CONFIG_XEN))),y)
> +ifeq ($(CONFIG_FSDEV_9P),y)
>  common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
>  else
>  common-obj-y = qemu-fsdev-dummy.o
> diff --git a/hw/9pfs/Kconfig b/hw/9pfs/Kconfig
> index 8c5032c..3ae5749 100644
> --- a/hw/9pfs/Kconfig
> +++ b/hw/9pfs/Kconfig
> @@ -1,4 +1,9 @@
> +config FSDEV_9P
> +    bool
> +    depends on VIRTFS

Using "depends on VIRTFS && 9PFS" instead helps to
reduce the link failure using --without-default-devices.

> +
>  config VIRTIO_9P
>      bool
>      default y
>      depends on VIRTFS && VIRTIO
> +    select FSDEV_9P

Here I used "depends on FSDEV_9P && VIRTIO" but this
doesn't look right...

Is it possible to include "config-devices.h" in
hw/xen/xen-legacy-backend.c to use CONFIG_9PFS?

xen_be_register_common() unconditionally calls:

  xen_be_register("9pfs", &xen_9pfs_ops);

As I have no much idea about Xen & 9pfs, I'm a bit
lost here (regarding the dependencies order).

Thanks,

Phil.



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
  2019-08-20  6:59 ` [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap Paolo Bonzini
  2019-08-26 12:19   ` dovgaluk
@ 2022-08-02 16:17   ` Peter Maydell
  1 sibling, 0 replies; 47+ messages in thread
From: Peter Maydell @ 2022-08-02 16:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Richard Henderson, Dr. David Alan Gilbert

On Tue, 20 Aug 2019 at 08:12, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> There is a race between TCG and accesses to the dirty log:
>
>       vCPU thread                  reader thread
>       -----------------------      -----------------------
>       TLB check -> slow path
>         notdirty_mem_write
>           write to RAM
>           set dirty flag
>                                    clear dirty flag
>       TLB check -> fast path
>                                    read memory
>         write to RAM
>
> Fortunately, in order to fix it, no change is required to the
> vCPU thread.  However, the reader thread must delay the read after
> the vCPU thread has finished the write.  This can be approximated
> conservatively by run_on_cpu, which waits for the end of the current
> translation block.
>
> A similar technique is used by KVM, which has to do a synchronous TLB
> flush after doing a test-and-clear of the dirty-page flags.
>
> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


> +static void tcg_log_global_after_sync(MemoryListener *listener)
> +{
> +    CPUAddressSpace *cpuas;
> +
> +    /* Wait for the CPU to end the current TB.  This avoids the following
> +     * incorrect race:
> +     *
> +     *      vCPU                         migration
> +     *      ----------------------       -------------------------
> +     *      TLB check -> slow path
> +     *        notdirty_mem_write
> +     *          write to RAM
> +     *          mark dirty
> +     *                                   clear dirty flag
> +     *      TLB check -> fast path
> +     *                                   read memory
> +     *        write to RAM
> +     *
> +     * by pushing the migration thread's memory read after the vCPU thread has
> +     * written the memory.
> +     */
> +    cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
> +    run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
> +}

So I think this implementation has two problems that we've only just
noticed:

(1) if memory_global_after_dirty_log_sync() is called in a context
where the caller holds the BQL, then the implementation here will
temporarily drop the BQL inside run_on_cpu(). That breaks
assumptions in pretty much all the graphics device models which
call memory_region_snapshot_and_clear_dirty() and assume that
things they'd determined before the call stay valid (pointers
into DisplaySurfaces, information about size and location of
the framebuffer in guest memory, etc).

(2) if memory_global_after_dirty_log_sync() is called in a context
where the caller does *not* hold the BQL, this is undefined
behaviour, because run_on_cpu() calls qemu_cond_wait() passing
it the BKL, and at least for the posix implementation that
ends up being posix_cond_wait() and calling that without the
mutex held is undefined behaviour. This happens for the migration
code for everything except the final iteration.

Does anybody have any bright ideas for fixing these? (2) I guess
we could handle just by having the migration code (or this
code right here) grab the BQL if it doesn't already have it.
(1) is much harder: you kind of conceptually want to do a "go back
and restart the update-display operation", but there's no clear
way to tell if you need to do that. You can postpone some things
(like "get the current UI display surface") until after the
call to memory_region_snapshot_and_clear_dirty(), and you could
have a "check a flag to see if the guest re-entered the display
device while we were inside memory_region_snapshot_and_clear_dirty()"
test, but some things the guest might do could potentially invalidate
things in a non-obvious action-at-a-distance way that doesn't
require an access to the display device. (e.g. a display device that
lets the guest program the FB to be anywhere in the address map,
plus the ability for the guest to cause remapping of RAM banks
or similar -- the display device can't tell if the RAM banks
got remapped or not, that's handled by a different device.)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2022-08-02 16:22 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-20  6:59 [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 01/36] kvm: i386: halt poll control MSR support Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 02/36] target-i386: adds PV_SCHED_YIELD CPUID feature bit Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 03/36] loader: Handle memory-mapped ELFs Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 04/36] elf-ops.h: Map into memory the ELF to load Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 05/36] hw/i386/pc: Map into memory the initrd Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 06/36] memory: assert on out of scope notification Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 07/36] configure: Define target access alignment in configure Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 08/36] block: fix NetBSD qemu-iotests failure Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 09/36] 9p: simplify source file selection Paolo Bonzini
2020-11-03 20:31   ` Philippe Mathieu-Daudé
2019-08-20  6:59 ` [Qemu-devel] [PULL 10/36] target-i386: kvm: 'kvm_get_supported_msrs' cleanup Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 11/36] test-throttle: Fix uninitialized use of burst_length Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 12/36] tests: Fix uninitialized byte in test_visitor_in_fuzz Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 13/36] i386/kvm: initialize struct at full before ioctl call Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 14/36] target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap Paolo Bonzini
2019-08-26 12:19   ` dovgaluk
2019-09-12  6:54     ` Pavel Dovgalyuk
2019-09-12 17:43       ` Richard Henderson
2019-09-12 22:16         ` Paolo Bonzini
2019-09-12 12:45     ` Paolo Bonzini
2022-08-02 16:17   ` Peter Maydell
2019-08-20  6:59 ` [Qemu-devel] [PULL 16/36] mc146818rtc: Remove reset notifiers Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 17/36] timer: " Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 18/36] replay: Remove host_clock_last Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 19/36] timer: last, remove last bits of last Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 20/36] kconfig: do not select VMMOUSE Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 21/36] replay: add missing fix for internal function Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 22/36] replay: document development rules Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 23/36] util/qemu-timer: refactor deadline calculation for external timers Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 24/36] replay: fix replay shutdown Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 25/36] replay: refine replay-time module Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 26/36] replay: rename step-related variables and functions Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 27/36] icount: clean up cpu_can_io at the entry to the block Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 28/36] icount: remove unnecessary gen_io_end calls Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 29/36] cpus-common: nuke finish_safe_work Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 30/36] cpus-common: assert BQL nesting within cpu-exclusive sections Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 31/36] kvm: vmxcap: Enhance with latest features Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 32/36] HACKING: Document 'struct' keyword usage Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 33/36] migration: do not rom_reset() during incoming migration Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 34/36] test-bitmap: test set 1 bit case for bitmap_set Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 35/36] scsi: lsi: exit infinite loop while executing script (CVE-2019-12068) Paolo Bonzini
2019-08-20  6:59 ` [Qemu-devel] [PULL 36/36] x86: Intel AVX512_BF16 feature enabling Paolo Bonzini
2019-08-20  7:42 ` [Qemu-devel] [PULL 00/36] QEMU patches for 2018-08-20 no-reply
2019-08-20  9:26 ` Peter Maydell
2019-08-20 23:42 ` no-reply

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).