qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11
@ 2020-06-12 16:07 Paolo Bonzini
  2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit 31d321c2b3574dcc74e9f6411af06bca6b5d10f4:

  Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sparc-next-20200609' into staging (2020-06-09 17:29:47 +0100)

are available in the Git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 3575b0aea983ad57804c9af739ed8ff7bc168393:

  target/i386: Remove obsolete TODO file (2020-06-12 11:20:15 -0400)

----------------------------------------------------------------
* Miscellaneous fixes and feature enablement (many)
* SEV refactoring (David)
* Hyper-V initial support (Jon)
* i386 TCG fixes (x87 and SSE, Joseph)
* vmport cleanup and improvements (Philippe, Liran)
* Use-after-free with vCPU hot-unplug (Nengyuan)
* run-coverity-scan improvements (myself)
* Record/replay fixes (Pavel)
* -machine kernel_irqchip=split improvements for INTx (Peter)
* Code cleanups (Philippe)
* Crash and security fixes (PJP)
* HVF cleanups (Roman)

----------------------------------------------------------------
Anthony PERARD (1):
      xen: fix build without pci passthrough

Babu Moger (1):
      target/i386: Fix the CPUID leaf CPUID_Fn80000008

Cathy Zhang (1):
      x86/cpu: Enable AVX512_VP2INTERSECT cpu feature

Cédric Le Goater (1):
      qom/object: Fix object_child_foreach_recursive() return value

David Carlier (1):
      util/oslib: Returns the real thread identifier on FreeBSD and NetBSD

David Gibson (9):
      target/i386: sev: Remove unused QSevGuestInfoClass
      target/i386: sev: Move local structure definitions into .c file
      target/i386: sev: Rename QSevGuestInfo
      target/i386: sev: Embed SEVState in SevGuestState
      target/i386: sev: Partial cleanup to sev_state global
      target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields
      target/i386: sev: Remove redundant policy field
      target/i386: sev: Remove redundant handle field
      target/i386: sev: Unify SEVState and SevGuestState

Edgar E. Iglesias (1):
      tests: machine-none-test: Enable MicroBlaze testing

Igor Mammedov (2):
      vl.c: run preconfig loop before creating default RAM backend
      numa: prevent usage of -M memory-backend and -numa memdev at the same time

Janne Grunau (1):
      target/i386: fix phadd* with identical destination and source register

Jon Doron (6):
      hyperv: expose API to determine if synic is enabled
      vmbus: add vmbus protocol definitions
      vmbus: vmbus implementation
      i386:pc: whitelist dynamic vmbus-bridge
      i386: Hyper-V VMBus ACPI DSDT entry
      vmbus: add infrastructure to save/load vmbus requests

Joseph Myers (12):
      target/i386: implement special cases for fxtract
      target/i386: fix fscale handling of signaling NaN
      target/i386: fix fscale handling of invalid exponent encodings
      target/i386: fix fscale handling of infinite exponents
      target/i386: fix fscale handling of rounding precision
      target/i386: fix floating-point load-constant rounding
      target/i386: fix fxam handling of invalid encodings
      target/i386: fix fbstp handling of negative zero
      target/i386: fix fbstp handling of out-of-range values
      target/i386: fix fisttpl, fisttpll handling of out-of-range values
      target/i386: fix IEEE x87 floating-point exception raising
      target/i386: correct fix for pcmpxstrx substring search

Julio Faracco (1):
      i386: Remove unused define's from hax and hvf

Leonid Bloch (1):
      configure: Do not ignore malloc value

Like Xu (1):
      target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES

Liran Alon (14):
      hw/i386/vmport: Add reference to VMware open-vm-tools
      hw/i386/vmport: Add device properties
      hw/i386/vmport: Propagate IOPort read to vCPU EAX register
      hw/i386/vmport: Set EAX to -1 on failed and unsupported commands
      hw/i386/vmport: Introduce vmware-vmx-version property
      hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION
      hw/i386/vmport: Introduce vmport.h
      hw/i386/vmport: Define enum for all commands
      hw/i386/vmport: Add support for CMD_GETBIOSUUID
      hw/i386/vmport: Add support for CMD_GET_VCPU_INFO
      hw/i386/vmport: Allow x2apic without IR
      i386/cpu: Store LAPIC bus frequency in CPU structure
      hw/i386/vmport: Add support for CMD_GETHZ
      hw/i386/vmport: Assert vmport initialized before registering commands

Markus Armbruster (1):
      cpus: Fix botched configure_icount() error API violation fix

Masahiro Yamada (5):
      qom: remove index from object_resolve_abs_path()
      qom/object: factor out the initialization of hash table of properties
      qom/object: simplify type_initialize_interface()
      qom/object: pass (Object *) to object_initialize_with_type()
      qom/container: remove .instance_size initializer from container_info

Michael S. Tsirkin (1):
      checkpatch: reversed logic with acpi test checks

Pan Nengyuan (1):
      i386/kvm: fix a use-after-free when vcpu plug/unplug

Paolo Bonzini (9):
      docker.py/build: support -t and -f arguments
      docker.py/build: support binary files in --extra-files
      run-coverity-scan: get Coverity token and email from special git config section
      run-coverity-scan: use docker.py
      run-coverity-scan: add --no-update-tools option
      run-coverity-scan: use --no-update-tools in docker run
      run-coverity-scan: download tools outside the container
      run-coverity-scan: support --update-tools-only --docker
      stubs: move Xen stubs to accel/

Pavel Dovgaluk (3):
      icount: fix shift=auto for record/replay
      replay: implement fair mutex
      replay: fix replay shutdown for console mode

Peter Xu (3):
      vfio/pci: Use kvm_irqchip_add_irqfd_notifier_gsi() for irqfds
      KVM: Pass EventNotifier into kvm_irqchip_assign_irqfd
      KVM: Kick resamplefd for split kernel irqchip

Philippe Mathieu-Daudé (19):
      target/i386: Fix OUTL debug output
      qom/object: Move Object typedef to 'qemu/typedefs.h'
      io/task: Move 'qom/object.h' header to source
      Makefile: Let the 'help' target list the helper targets
      accel: Move Xen accelerator code under accel/xen/
      exec: Let address_space_read/write_cached() propagate MemTxResult
      exec: Propagate cpu_memory_rw_debug() error
      disas: Let disas::read_memory() handler return EIO on error
      hw/elf_ops: Do not ignore write failures when loading ELF
      hw/i386/vmport: Allow QTest use without crashing
      memory: Make 'info mtree' not display disabled regions by default
      qemu/thread: Mark qemu_thread_exit() with 'noreturn' attribute
      sysemu/accel: Restrict machine methods to system-mode
      sysemu/tcg: Only declare tcg_allowed when TCG is available
      sysemu/hvf: Only declare hvf_allowed when HVF is available
      target/ppc: Restrict PPCVirtualHypervisorClass to system-mode
      exec/memory: Remove unused MemoryRegionMmio type
      hw/usb: Move device-specific declarations to new 'hcd-musb.h' header
      exec/cpu-common: Move MUSB specific typedefs to 'hw/usb/hcd-musb.h'

Prasad J Pandit (4):
      megasas: use unsigned type for reply_queue_head and check index
      megasas: avoid NULL pointer dereference
      megasas: use unsigned type for positive numeric fields
      exec: set map length to zero when returning NULL

Roman Bolshakov (13):
      i386: hvf: Move HVFState definition into hvf
      i386: hvf: Drop useless declarations in sysemu
      i386: hvf: Clean stray includes in sysemu
      i386: hvf: Drop unused variable
      i386: hvf: Use ins_len to advance IP
      i386: hvf: Use IP from CPUX86State
      i386: hvf: Drop fetch_rip from HVFX86EmulatorState
      i386: hvf: Drop rflags from HVFX86EmulatorState
      i386: hvf: Drop copy of RFLAGS defines
      i386: hvf: Drop regs in HVFX86EmulatorState
      i386: hvf: Move lazy_flags into CPUX86State
      i386: hvf: Move mmio_buf into CPUX86State
      i386: hvf: Drop HVFX86EmulatorState

Sai Pavan Boddu (1):
      chardev/char-socket: Properly make qio connections non blocking

Thomas Huth (1):
      target/i386: Remove obsolete TODO file

WangBowen (1):
      hax: Dynamic allocate vcpu state structure

Wei Huang (1):
      hw/i386/amd_iommu: Fix the reserved bits definition of IOMMU commands

 MAINTAINERS                                |    2 +
 Makefile                                   |    9 +-
 Makefile.objs                              |    1 +
 accel/Makefile.objs                        |    1 +
 accel/kvm/kvm-all.c                        |   95 +-
 accel/kvm/trace-events                     |    1 +
 accel/stubs/Makefile.objs                  |    1 +
 stubs/xen-hvm.c => accel/stubs/xen-stub.c  |   23 +-
 accel/xen/Makefile.objs                    |    1 +
 hw/xen/xen-common.c => accel/xen/xen-all.c |   12 +-
 chardev/char-socket.c                      |    4 +-
 configure                                  |   25 +-
 cpus.c                                     |   26 +-
 disas.c                                    |   13 +-
 exec.c                                     |   29 +-
 hmp-commands-info.hx                       |    7 +-
 hw/Makefile.objs                           |    2 +-
 hw/acpi/piix4.c                            |    2 +-
 hw/block/vhost-user-blk.c                  |    1 -
 hw/core/machine.c                          |    4 +
 hw/core/numa.c                             |    5 +
 hw/hyperv/Kconfig                          |    5 +
 hw/hyperv/Makefile.objs                    |    1 +
 hw/hyperv/hyperv.c                         |    8 +
 hw/hyperv/trace-events                     |   18 +
 hw/hyperv/vmbus.c                          | 2778 ++++++++++++++++++++++++++++
 hw/i386/acpi-build.c                       |   43 +
 hw/i386/amd_iommu.c                        |   19 +-
 hw/i386/pc.c                               |    3 +-
 hw/i386/pc_piix.c                          |    5 +-
 hw/i386/pc_q35.c                           |    3 +
 hw/i386/vmmouse.c                          |   20 +-
 hw/i386/vmport.c                           |  183 +-
 hw/i386/vmport.h                           |   34 -
 hw/i386/xen/xen-hvm.c                      |    1 +
 hw/i386/xen/xen_platform.c                 |    1 +
 hw/intc/ioapic.c                           |   19 +
 hw/isa/piix3.c                             |    1 +
 hw/pci-host/pam.c                          |    1 -
 hw/pci/msix.c                              |    1 +
 hw/scsi/megasas.c                          |   44 +-
 hw/scsi/vhost-user-scsi.c                  |    1 -
 hw/usb/hcd-musb.c                          |    5 +-
 hw/usb/tusb6010.c                          |    1 +
 hw/vfio/pci.c                              |   37 +-
 hw/xen/Makefile.objs                       |    3 +-
 hw/xen/xen_pt.c                            |   12 +-
 hw/xen/xen_pt.h                            |    6 +-
 hw/xen/xen_pt_stub.c                       |   22 +
 include/exec/cpu-all.h                     |    1 +
 include/exec/cpu-common.h                  |    3 -
 include/exec/memory.h                      |   30 +-
 include/exec/ram_addr.h                    |    2 +-
 include/hw/display/edid.h                  |    3 -
 include/hw/elf_ops.h                       |   11 +-
 include/hw/hyperv/hyperv.h                 |    1 +
 include/hw/hyperv/vmbus-bridge.h           |   35 +
 include/hw/hyperv/vmbus-proto.h            |  222 +++
 include/hw/hyperv/vmbus.h                  |  230 +++
 include/hw/i386/vmport.h                   |   28 +
 include/hw/usb.h                           |   30 -
 include/hw/usb/hcd-musb.h                  |   47 +
 include/hw/xen/xen.h                       |   11 -
 include/io/task.h                          |    2 -
 include/qemu/thread.h                      |    2 +-
 include/qemu/typedefs.h                    |    2 +-
 include/qom/object.h                       |    2 -
 include/qom/qom-qobject.h                  |    2 -
 include/sysemu/accel.h                     |    2 +
 include/sysemu/hvf.h                       |   72 +-
 include/sysemu/kvm.h                       |    4 +
 include/sysemu/sysemu.h                    |    1 -
 include/sysemu/tcg.h                       |    2 +-
 include/sysemu/xen.h                       |   38 +
 io/task.c                                  |    1 +
 memory.c                                   |   75 +-
 migration/savevm.c                         |    2 +-
 monitor/misc.c                             |    3 +-
 qom/container.c                            |    1 -
 qom/object.c                               |   39 +-
 replay/replay-internal.c                   |   15 +-
 replay/replay.c                            |    5 +
 scripts/checkpatch.pl                      |    2 +-
 scripts/coverity-scan/coverity-scan.docker |    3 +-
 scripts/coverity-scan/run-coverity-scan    |  139 +-
 softmmu/vl.c                               |    7 +-
 stubs/Makefile.objs                        |    2 -
 stubs/qmp_memory_device.c                  |    1 -
 stubs/xen-common.c                         |   13 -
 target/i386/TODO                           |   31 -
 target/i386/cpu.c                          |   42 +-
 target/i386/cpu.h                          |   15 +-
 target/i386/fpu_helper.c                   |  258 ++-
 target/i386/hax-all.c                      |   25 +-
 target/i386/hax-i386.h                     |    7 +-
 target/i386/hvf/hvf-i386.h                 |   37 +-
 target/i386/hvf/hvf.c                      |   30 +-
 target/i386/hvf/x86.c                      |    2 +-
 target/i386/hvf/x86.h                      |   89 +-
 target/i386/hvf/x86_decode.c               |   25 +-
 target/i386/hvf/x86_emu.c                  |  122 +-
 target/i386/hvf/x86_flags.c                |   81 +-
 target/i386/hvf/x86_task.c                 |   10 +-
 target/i386/hvf/x86hvf.c                   |    6 +-
 target/i386/kvm.c                          |   34 +-
 target/i386/misc_helper.c                  |    2 +-
 target/i386/ops_sse.h                      |   57 +-
 target/i386/sev.c                          |  257 +--
 target/i386/sev_i386.h                     |   49 -
 target/ppc/cpu.h                           |    4 +-
 target/ppc/kvm_ppc.h                       |   22 +-
 target/ppc/translate_init.inc.c            |    4 +
 tests/docker/Makefile.include              |    2 +-
 tests/docker/docker.py                     |   14 +-
 tests/qtest/machine-none-test.c            |   10 +-
 tests/tcg/i386/Makefile.target             |    3 +
 tests/tcg/i386/test-i386-fbstp.c           |  140 ++
 tests/tcg/i386/test-i386-fisttp.c          |  100 +
 tests/tcg/i386/test-i386-fldcst.c          |  199 ++
 tests/tcg/i386/test-i386-fp-exceptions.c   |  831 +++++++++
 tests/tcg/i386/test-i386-fscale.c          |  108 ++
 tests/tcg/i386/test-i386-fxam.c            |  143 ++
 tests/tcg/i386/test-i386-fxtract.c         |  120 ++
 tests/tcg/i386/test-i386-pcmpistri.c       |   33 +
 tests/test-io-task.c                       |    1 +
 util/oslib-posix.c                         |    9 +
 126 files changed, 6572 insertions(+), 980 deletions(-)
 rename stubs/xen-hvm.c => accel/stubs/xen-stub.c (63%)
 create mode 100644 accel/xen/Makefile.objs
 rename hw/xen/xen-common.c => accel/xen/xen-all.c (96%)
 create mode 100644 hw/hyperv/trace-events
 create mode 100644 hw/hyperv/vmbus.c
 delete mode 100644 hw/i386/vmport.h
 create mode 100644 hw/xen/xen_pt_stub.c
 create mode 100644 include/hw/hyperv/vmbus-bridge.h
 create mode 100644 include/hw/hyperv/vmbus-proto.h
 create mode 100644 include/hw/hyperv/vmbus.h
 create mode 100644 include/hw/i386/vmport.h
 create mode 100644 include/hw/usb/hcd-musb.h
 create mode 100644 include/sysemu/xen.h
 delete mode 100644 stubs/xen-common.c
 delete mode 100644 target/i386/TODO
 create mode 100644 tests/tcg/i386/test-i386-fbstp.c
 create mode 100644 tests/tcg/i386/test-i386-fisttp.c
 create mode 100644 tests/tcg/i386/test-i386-fldcst.c
 create mode 100644 tests/tcg/i386/test-i386-fp-exceptions.c
 create mode 100644 tests/tcg/i386/test-i386-fscale.c
 create mode 100644 tests/tcg/i386/test-i386-fxam.c
 create mode 100644 tests/tcg/i386/test-i386-fxtract.c
 create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
-- 
2.26.2



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
  2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
  2020-06-15 10:18   ` Philippe Mathieu-Daudé
  2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Joseph Myers

From: Joseph Myers <joseph@codesourcery.com>

This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
/ pcmpestrm / pcmpistri / pcmpistrm substring search, commit
ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.

That commit fixed a bug that showed up in four GCC tests with one libc
implementation.  The tests in question generate random inputs to the
intrinsics and compare results to a C implementation, but they only
test 1024 possible random inputs, and when the tests use the cases of
those instructions that work with word rather than byte inputs, it's
easy to have problematic cases that show up much less frequently than
that.  Thus, testing with a different libc implementation, and so a
different random number generator, showed up a problem with the
previous patch.

When investigating the previous test failures, I found the description
of these instructions in the Intel manuals (starting from computing a
16x16 or 8x8 set of comparison results) confusing and hard to match up
with the more optimized implementation in QEMU, and referred to AMD
manuals which described the instructions in a different way.  Those
AMD descriptions are very explicit that the whole of the string being
searched for must be found in the other operand, not running off the
end of that operand; they say "If the prototype and the SUT are equal
in length, the two strings must be identical for the comparison to be
TRUE.".  However, that statement is incorrect.

In my previous commit message, I noted:

  The operation in this case is a search for a string (argument d to
  the helper) in another string (argument s to the helper); if a copy
  of d at a particular position would run off the end of s, the
  resulting output bit should be 0 whether or not the strings match in
  the region where they overlap, but the QEMU implementation was
  wrongly comparing only up to the point where s ends and counting it
  as a match if an initial segment of d matched a terminal segment of
  s.  Here, "run off the end of s" means that some byte of d would
  overlap some byte outside of s; thus, if d has zero length, it is
  considered to match everywhere, including after the end of s.

The description "some byte of d would overlap some byte outside of s"
is accurate only when understood to refer to overlapping some byte
*within the 16-byte operand* but at or after the zero terminator; it
is valid to run over the end of s if the end of s is the end of the
16-byte operand.  So the fix in the previous patch for the case of d
being empty was correct, but the other part of that patch was not
correct (as it never allowed partial matches even at the end of the
16-byte operand).  Nor was the code before the previous patch correct
for the case of d nonempty, as it would always have allowed partial
matches at the end of s.

Fix with a partial revert of my previous change, combined with
inserting a check for the special case of s having maximum length to
determine where it is necessary to check for matches.

In the added test, test 1 is for the case of empty strings, which
failed before my 2017 patch, test 2 is for the bug introduced by my
2017 patch and test 3 deals with the case where a match of an initial
segment at the end of the string is not valid when the string ends
before the end of the 16-byte operand (that is, the case that would be
broken by a simple revert of the non-empty-string part of my 2017
patch).

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/ops_sse.h                |  4 ++--
 tests/tcg/i386/Makefile.target       |  3 +++
 tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 2 deletions(-)
 create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 01d6017412..14f2b16abd 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
             res = (2 << upper) - 1;
             break;
         }
-        for (j = valids - validd; j >= 0; j--) {
+        for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
             res <<= 1;
             v = 1;
-            for (i = validd; i >= 0; i--) {
+            for (i = MIN(valids - j, validd); i >= 0; i--) {
                 v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
             }
             res |= v;
diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index 43ee2e181e..53efec0668 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
 SKIP_I386_TESTS=test-i386-ssse3
 X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
 
+test-i386-pcmpistri: CFLAGS += -msse4.2
+run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
+
 #
 # hello-i386 is a barebones app
 #
diff --git a/tests/tcg/i386/test-i386-pcmpistri.c b/tests/tcg/i386/test-i386-pcmpistri.c
new file mode 100644
index 0000000000..1e81ae611a
--- /dev/null
+++ b/tests/tcg/i386/test-i386-pcmpistri.c
@@ -0,0 +1,33 @@
+/* Test pcmpistri instruction.  */
+
+#include <nmmintrin.h>
+#include <stdio.h>
+
+union u {
+    __m128i x;
+    unsigned char uc[16];
+};
+
+union u s0 = { .uc = { 0 } };
+union u s1 = { .uc = "abcdefghijklmnop" };
+union u s2 = { .uc = "bcdefghijklmnopa" };
+union u s3 = { .uc = "bcdefghijklmnab" };
+
+int
+main(void)
+{
+    int ret = 0;
+    if (_mm_cmpistri(s0.x, s0.x, 0x4c) != 15) {
+        printf("FAIL: pcmpistri test 1\n");
+        ret = 1;
+    }
+    if (_mm_cmpistri(s1.x, s2.x, 0x4c) != 15) {
+        printf("FAIL: pcmpistri test 2\n");
+        ret = 1;
+    }
+    if (_mm_cmpistri(s1.x, s3.x, 0x4c) != 16) {
+        printf("FAIL: pcmpistri test 3\n");
+        ret = 1;
+    }
+    return ret;
+}
-- 
2.26.2




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PULL 088/116] i386: hvf: Drop useless declarations in sysemu
  2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
  2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
  2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Roman Bolshakov, Philippe Mathieu-Daudé

From: Roman Bolshakov <r.bolshakov@yadro.com>

They're either declared elsewhere or have no use.

While at it, rename _hvf_cpu_synchronize_post_init() to
do_hvf_cpu_synchronize_post_init().

Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-3-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 include/sysemu/hvf.h  | 22 ----------------------
 target/i386/hvf/hvf.c |  7 ++++---
 2 files changed, 4 insertions(+), 25 deletions(-)

diff --git a/include/sysemu/hvf.h b/include/sysemu/hvf.h
index 644bdfc722..2af32e505e 100644
--- a/include/sysemu/hvf.h
+++ b/include/sysemu/hvf.h
@@ -30,35 +30,13 @@ extern bool hvf_allowed;
 #define hvf_get_supported_cpuid(func, idx, reg) 0
 #endif /* !CONFIG_HVF */
 
-/* Disable HVF if |disable| is 1, otherwise, enable it iff it is supported by
- * the host CPU. Use hvf_enabled() after this to get the result. */
-void hvf_disable(int disable);
-
-/* Returns non-0 if the host CPU supports the VMX "unrestricted guest" feature
- * which allows the virtual CPU to directly run in "real mode". If true, this
- * allows QEMU to run several vCPU threads in parallel (see cpus.c). Otherwise,
- * only a a single TCG thread can run, and it will call HVF to run the current
- * instructions, except in case of "real mode" (paging disabled, typically at
- * boot time), or MMIO operations. */
-
-int hvf_sync_vcpus(void);
-
 int hvf_init_vcpu(CPUState *);
 int hvf_vcpu_exec(CPUState *);
-int hvf_smp_cpu_exec(CPUState *);
 void hvf_cpu_synchronize_state(CPUState *);
 void hvf_cpu_synchronize_post_reset(CPUState *);
 void hvf_cpu_synchronize_post_init(CPUState *);
-void _hvf_cpu_synchronize_post_init(CPUState *, run_on_cpu_data);
-
 void hvf_vcpu_destroy(CPUState *);
-void hvf_raise_event(CPUState *);
-/* void hvf_reset_vcpu_state(void *opaque); */
 void hvf_reset_vcpu(CPUState *);
-void vmx_update_tpr(CPUState *);
-void update_apic_tpr(CPUState *);
-int hvf_put_registers(CPUState *);
-void vmx_clear_int_window_exiting(CPUState *cpu);
 
 #define TYPE_HVF_ACCEL ACCEL_CLASS_NAME("hvf")
 
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index d72543dc31..9ccdb7e7c7 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -251,7 +251,7 @@ void vmx_update_tpr(CPUState *cpu)
     }
 }
 
-void update_apic_tpr(CPUState *cpu)
+static void update_apic_tpr(CPUState *cpu)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
     int tpr = rreg(cpu->hvf_fd, HV_X86_TPR) >> 4;
@@ -312,7 +312,8 @@ void hvf_cpu_synchronize_post_reset(CPUState *cpu_state)
     run_on_cpu(cpu_state, do_hvf_cpu_synchronize_post_reset, RUN_ON_CPU_NULL);
 }
 
-void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
+static void do_hvf_cpu_synchronize_post_init(CPUState *cpu,
+                                             run_on_cpu_data arg)
 {
     CPUState *cpu_state = cpu;
     hvf_put_registers(cpu_state);
@@ -321,7 +322,7 @@ void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
 
 void hvf_cpu_synchronize_post_init(CPUState *cpu_state)
 {
-    run_on_cpu(cpu_state, _hvf_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
+    run_on_cpu(cpu_state, do_hvf_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
 }
 
 static bool ept_emulation_fault(hvf_slot *slot, uint64_t gpa, uint64_t ept_qual)
-- 
2.26.2




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State
  2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
  2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
  2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
  2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
  2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell
  4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Roman Bolshakov

From: Roman Bolshakov <r.bolshakov@yadro.com>

The lazy flags are still needed for instruction decoder.

Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-12-r.bolshakov@yadro.com>
[Move struct to target/i386/cpu.h - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h           |  6 ++++
 target/i386/hvf/x86.h       |  6 ----
 target/i386/hvf/x86_flags.c | 57 ++++++++++++++++++-------------------
 3 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c2b8bdcbde..f742ba933f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1366,6 +1366,11 @@ typedef struct CPUCaches {
         CPUCacheInfo *l3_cache;
 } CPUCaches;
 
+typedef struct HVFX86LazyFlags {
+    target_ulong result;
+    target_ulong auxbits;
+} HVFX86LazyFlags;
+
 typedef struct CPUX86State {
     /* standard registers */
     target_ulong regs[CPU_NB_REGS];
@@ -1597,6 +1602,7 @@ typedef struct CPUX86State {
     struct kvm_nested_state *nested_state;
 #endif
 #if defined(CONFIG_HVF)
+    HVFX86LazyFlags hvf_lflags;
     HVFX86EmulatorState *hvf_emul;
 #endif
 
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 6048b5cc74..2363616c07 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -228,14 +228,8 @@ typedef struct x68_segment_selector {
     };
 } __attribute__ ((__packed__)) x68_segment_selector;
 
-typedef struct lazy_flags {
-    target_ulong result;
-    target_ulong auxbits;
-} lazy_flags;
-
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-    struct lazy_flags   lflags;
     uint8_t mmio_buf[4096];
 };
 
diff --git a/target/i386/hvf/x86_flags.c b/target/i386/hvf/x86_flags.c
index 1152cd7234..5ca4f41f5c 100644
--- a/target/i386/hvf/x86_flags.c
+++ b/target/i386/hvf/x86_flags.c
@@ -63,7 +63,7 @@
 #define SET_FLAGS_OSZAPC_SIZE(size, lf_carries, lf_result) { \
     target_ulong temp = ((lf_carries) & (LF_MASK_AF)) | \
     (((lf_carries) >> (size - 2)) << LF_BIT_PO); \
-    env->hvf_emul->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+    env->hvf_lflags.result = (target_ulong)(int##size##_t)(lf_result); \
     if ((size) == 32) { \
         temp = ((lf_carries) & ~(LF_MASK_PDB | LF_MASK_SD)); \
     } else if ((size) == 16) { \
@@ -73,7 +73,7 @@
     } else { \
         VM_PANIC("unimplemented");  \
     } \
-    env->hvf_emul->lflags.auxbits = (target_ulong)(uint32_t)temp; \
+    env->hvf_lflags.auxbits = (target_ulong)(uint32_t)temp; \
 }
 
 /* carries, result */
@@ -100,10 +100,10 @@
     } else { \
         VM_PANIC("unimplemented");      \
     } \
-    env->hvf_emul->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
-    target_ulong delta_c = (env->hvf_emul->lflags.auxbits ^ temp) & LF_MASK_CF; \
+    env->hvf_lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+    target_ulong delta_c = (env->hvf_lflags.auxbits ^ temp) & LF_MASK_CF; \
     delta_c ^= (delta_c >> 1); \
-    env->hvf_emul->lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
+    env->hvf_lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
 }
 
 /* carries, result */
@@ -117,9 +117,8 @@
 void SET_FLAGS_OxxxxC(CPUX86State *env, uint32_t new_of, uint32_t new_cf)
 {
     uint32_t temp_po = new_of ^ new_cf;
-    env->hvf_emul->lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
-    env->hvf_emul->lflags.auxbits |= (temp_po << LF_BIT_PO) |
-                                     (new_cf << LF_BIT_CF);
+    env->hvf_lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
+    env->hvf_lflags.auxbits |= (temp_po << LF_BIT_PO) | (new_cf << LF_BIT_CF);
 }
 
 void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
@@ -215,27 +214,27 @@ void SET_FLAGS_OSZAPC_LOGIC8(CPUX86State *env, uint8_t v1, uint8_t v2,
 
 bool get_PF(CPUX86State *env)
 {
-    uint32_t temp = (255 & env->hvf_emul->lflags.result);
-    temp = temp ^ (255 & (env->hvf_emul->lflags.auxbits >> LF_BIT_PDB));
+    uint32_t temp = (255 & env->hvf_lflags.result);
+    temp = temp ^ (255 & (env->hvf_lflags.auxbits >> LF_BIT_PDB));
     temp = (temp ^ (temp >> 4)) & 0x0F;
     return (0x9669U >> temp) & 1;
 }
 
 void set_PF(CPUX86State *env, bool val)
 {
-    uint32_t temp = (255 & env->hvf_emul->lflags.result) ^ (!val);
-    env->hvf_emul->lflags.auxbits &= ~(LF_MASK_PDB);
-    env->hvf_emul->lflags.auxbits |= (temp << LF_BIT_PDB);
+    uint32_t temp = (255 & env->hvf_lflags.result) ^ (!val);
+    env->hvf_lflags.auxbits &= ~(LF_MASK_PDB);
+    env->hvf_lflags.auxbits |= (temp << LF_BIT_PDB);
 }
 
 bool get_OF(CPUX86State *env)
 {
-    return ((env->hvf_emul->lflags.auxbits + (1U << LF_BIT_PO)) >> LF_BIT_CF) & 1;
+    return ((env->hvf_lflags.auxbits + (1U << LF_BIT_PO)) >> LF_BIT_CF) & 1;
 }
 
 bool get_CF(CPUX86State *env)
 {
-    return (env->hvf_emul->lflags.auxbits >> LF_BIT_CF) & 1;
+    return (env->hvf_lflags.auxbits >> LF_BIT_CF) & 1;
 }
 
 void set_OF(CPUX86State *env, bool val)
@@ -252,45 +251,45 @@ void set_CF(CPUX86State *env, bool val)
 
 bool get_AF(CPUX86State *env)
 {
-    return (env->hvf_emul->lflags.auxbits >> LF_BIT_AF) & 1;
+    return (env->hvf_lflags.auxbits >> LF_BIT_AF) & 1;
 }
 
 void set_AF(CPUX86State *env, bool val)
 {
-    env->hvf_emul->lflags.auxbits &= ~(LF_MASK_AF);
-    env->hvf_emul->lflags.auxbits |= val << LF_BIT_AF;
+    env->hvf_lflags.auxbits &= ~(LF_MASK_AF);
+    env->hvf_lflags.auxbits |= val << LF_BIT_AF;
 }
 
 bool get_ZF(CPUX86State *env)
 {
-    return !env->hvf_emul->lflags.result;
+    return !env->hvf_lflags.result;
 }
 
 void set_ZF(CPUX86State *env, bool val)
 {
     if (val) {
-        env->hvf_emul->lflags.auxbits ^=
-         (((env->hvf_emul->lflags.result >> LF_SIGN_BIT) & 1) << LF_BIT_SD);
+        env->hvf_lflags.auxbits ^=
+         (((env->hvf_lflags.result >> LF_SIGN_BIT) & 1) << LF_BIT_SD);
         /* merge the parity bits into the Parity Delta Byte */
-        uint32_t temp_pdb = (255 & env->hvf_emul->lflags.result);
-        env->hvf_emul->lflags.auxbits ^= (temp_pdb << LF_BIT_PDB);
+        uint32_t temp_pdb = (255 & env->hvf_lflags.result);
+        env->hvf_lflags.auxbits ^= (temp_pdb << LF_BIT_PDB);
         /* now zero the .result value */
-        env->hvf_emul->lflags.result = 0;
+        env->hvf_lflags.result = 0;
     } else {
-        env->hvf_emul->lflags.result |= (1 << 8);
+        env->hvf_lflags.result |= (1 << 8);
     }
 }
 
 bool get_SF(CPUX86State *env)
 {
-    return ((env->hvf_emul->lflags.result >> LF_SIGN_BIT) ^
-            (env->hvf_emul->lflags.auxbits >> LF_BIT_SD)) & 1;
+    return ((env->hvf_lflags.result >> LF_SIGN_BIT) ^
+            (env->hvf_lflags.auxbits >> LF_BIT_SD)) & 1;
 }
 
 void set_SF(CPUX86State *env, bool val)
 {
     bool temp_sf = get_SF(env);
-    env->hvf_emul->lflags.auxbits ^= (temp_sf ^ val) << LF_BIT_SD;
+    env->hvf_lflags.auxbits ^= (temp_sf ^ val) << LF_BIT_SD;
 }
 
 void lflags_to_rflags(CPUX86State *env)
@@ -305,7 +304,7 @@ void lflags_to_rflags(CPUX86State *env)
 
 void rflags_to_lflags(CPUX86State *env)
 {
-    env->hvf_emul->lflags.auxbits = env->hvf_emul->lflags.result = 0;
+    env->hvf_lflags.auxbits = env->hvf_lflags.result = 0;
     set_OF(env, env->eflags & CC_O);
     set_SF(env, env->eflags & CC_S);
     set_ZF(env, env->eflags & CC_Z);
-- 
2.26.2




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PULL 098/116] i386: hvf: Move mmio_buf into CPUX86State
  2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
                   ` (2 preceding siblings ...)
  2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
  2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell
  4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Roman Bolshakov

From: Roman Bolshakov <r.bolshakov@yadro.com>

There's no similar field in CPUX86State, but it's needed for MMIO traps.

Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-13-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h         |  1 +
 target/i386/hvf/hvf.c     |  5 +++++
 target/i386/hvf/x86.h     |  1 -
 target/i386/hvf/x86_emu.c | 12 ++++++------
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f742ba933f..25a2f4c0c3 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1603,6 +1603,7 @@ typedef struct CPUX86State {
 #endif
 #if defined(CONFIG_HVF)
     HVFX86LazyFlags hvf_lflags;
+    void *hvf_mmio_buf;
     HVFX86EmulatorState *hvf_emul;
 #endif
 
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 4cee496d71..57696c46c7 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -533,7 +533,11 @@ void hvf_reset_vcpu(CPUState *cpu) {
 
 void hvf_vcpu_destroy(CPUState *cpu)
 {
+    X86CPU *x86_cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86_cpu->env;
+
     hv_return_t ret = hv_vcpu_destroy((hv_vcpuid_t)cpu->hvf_fd);
+    g_free(env->hvf_mmio_buf);
     assert_hvf_ok(ret);
 }
 
@@ -563,6 +567,7 @@ int hvf_init_vcpu(CPUState *cpu)
     init_decoder();
 
     hvf_state->hvf_caps = g_new0(struct hvf_vcpu_caps, 1);
+    env->hvf_mmio_buf = g_new(char, 4096);
     env->hvf_emul = g_new0(HVFX86EmulatorState, 1);
 
     r = hv_vcpu_create((hv_vcpuid_t *)&cpu->hvf_fd, HV_VCPU_DEFAULT);
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 2363616c07..483fcea762 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -230,7 +230,6 @@ typedef struct x68_segment_selector {
 
 /* Definition of hvf_x86_state is here */
 struct HVFX86EmulatorState {
-    uint8_t mmio_buf[4096];
 };
 
 /* useful register access  macros */
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 1ad2c30e16..d3e289ed87 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -187,8 +187,8 @@ void write_val_ext(struct CPUX86State *env, target_ulong ptr, target_ulong val,
 
 uint8_t *read_mmio(struct CPUX86State *env, target_ulong ptr, int bytes)
 {
-    vmx_read_mem(env_cpu(env), env->hvf_emul->mmio_buf, ptr, bytes);
-    return env->hvf_emul->mmio_buf;
+    vmx_read_mem(env_cpu(env), env->hvf_mmio_buf, ptr, bytes);
+    return env->hvf_mmio_buf;
 }
 
 
@@ -489,9 +489,9 @@ static void exec_ins_single(struct CPUX86State *env, struct x86_decode *decode)
     target_ulong addr = linear_addr_size(env_cpu(env), RDI(env),
                                          decode->addressing_size, R_ES);
 
-    hvf_handle_io(env_cpu(env), DX(env), env->hvf_emul->mmio_buf, 0,
+    hvf_handle_io(env_cpu(env), DX(env), env->hvf_mmio_buf, 0,
                   decode->operand_size, 1);
-    vmx_write_mem(env_cpu(env), addr, env->hvf_emul->mmio_buf,
+    vmx_write_mem(env_cpu(env), addr, env->hvf_mmio_buf,
                   decode->operand_size);
 
     string_increment_reg(env, R_EDI, decode);
@@ -512,9 +512,9 @@ static void exec_outs_single(struct CPUX86State *env, struct x86_decode *decode)
 {
     target_ulong addr = decode_linear_addr(env, decode, RSI(env), R_DS);
 
-    vmx_read_mem(env_cpu(env), env->hvf_emul->mmio_buf, addr,
+    vmx_read_mem(env_cpu(env), env->hvf_mmio_buf, addr,
                  decode->operand_size);
-    hvf_handle_io(env_cpu(env), DX(env), env->hvf_emul->mmio_buf, 1,
+    hvf_handle_io(env_cpu(env), DX(env), env->hvf_mmio_buf, 1,
                   decode->operand_size, 1);
 
     string_increment_reg(env, R_ESI, decode);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11
  2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
                   ` (3 preceding siblings ...)
  2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
@ 2020-06-13 16:17 ` Peter Maydell
  4 siblings, 0 replies; 9+ messages in thread
From: Peter Maydell @ 2020-06-13 16:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: QEMU Developers

On Fri, 12 Jun 2020 at 17:11, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> The following changes since commit 31d321c2b3574dcc74e9f6411af06bca6b5d10f4:
>
>   Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sparc-next-20200609' into staging (2020-06-09 17:29:47 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 3575b0aea983ad57804c9af739ed8ff7bc168393:
>
>   target/i386: Remove obsolete TODO file (2020-06-12 11:20:15 -0400)
>
> ----------------------------------------------------------------
> * Miscellaneous fixes and feature enablement (many)
> * SEV refactoring (David)
> * Hyper-V initial support (Jon)
> * i386 TCG fixes (x87 and SSE, Joseph)
> * vmport cleanup and improvements (Philippe, Liran)
> * Use-after-free with vCPU hot-unplug (Nengyuan)
> * run-coverity-scan improvements (myself)
> * Record/replay fixes (Pavel)
> * -machine kernel_irqchip=split improvements for INTx (Peter)
> * Code cleanups (Philippe)
> * Crash and security fixes (PJP)
> * HVF cleanups (Roman)

Applied, thanks. (I had to fix up a conflict in hw/i386/acpi-build.c;
might be worth checking that I got it right.)

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
  2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
@ 2020-06-15 10:18   ` Philippe Mathieu-Daudé
  2020-06-15 10:58     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-15 10:18 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: Alex Bennée, Joseph Myers

On 6/12/20 6:07 PM, Paolo Bonzini wrote:
> From: Joseph Myers <joseph@codesourcery.com>
> 
> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
> 
> That commit fixed a bug that showed up in four GCC tests with one libc
> implementation.  The tests in question generate random inputs to the
> intrinsics and compare results to a C implementation, but they only
> test 1024 possible random inputs, and when the tests use the cases of
> those instructions that work with word rather than byte inputs, it's
> easy to have problematic cases that show up much less frequently than
> that.  Thus, testing with a different libc implementation, and so a
> different random number generator, showed up a problem with the
> previous patch.
> 
> When investigating the previous test failures, I found the description
> of these instructions in the Intel manuals (starting from computing a
> 16x16 or 8x8 set of comparison results) confusing and hard to match up
> with the more optimized implementation in QEMU, and referred to AMD
> manuals which described the instructions in a different way.  Those
> AMD descriptions are very explicit that the whole of the string being
> searched for must be found in the other operand, not running off the
> end of that operand; they say "If the prototype and the SUT are equal
> in length, the two strings must be identical for the comparison to be
> TRUE.".  However, that statement is incorrect.
> 
> In my previous commit message, I noted:
> 
>   The operation in this case is a search for a string (argument d to
>   the helper) in another string (argument s to the helper); if a copy
>   of d at a particular position would run off the end of s, the
>   resulting output bit should be 0 whether or not the strings match in
>   the region where they overlap, but the QEMU implementation was
>   wrongly comparing only up to the point where s ends and counting it
>   as a match if an initial segment of d matched a terminal segment of
>   s.  Here, "run off the end of s" means that some byte of d would
>   overlap some byte outside of s; thus, if d has zero length, it is
>   considered to match everywhere, including after the end of s.
> 
> The description "some byte of d would overlap some byte outside of s"
> is accurate only when understood to refer to overlapping some byte
> *within the 16-byte operand* but at or after the zero terminator; it
> is valid to run over the end of s if the end of s is the end of the
> 16-byte operand.  So the fix in the previous patch for the case of d
> being empty was correct, but the other part of that patch was not
> correct (as it never allowed partial matches even at the end of the
> 16-byte operand).  Nor was the code before the previous patch correct
> for the case of d nonempty, as it would always have allowed partial
> matches at the end of s.
> 
> Fix with a partial revert of my previous change, combined with
> inserting a check for the special case of s having maximum length to
> determine where it is necessary to check for matches.
> 
> In the added test, test 1 is for the case of empty strings, which
> failed before my 2017 patch, test 2 is for the bug introduced by my
> 2017 patch and test 3 deals with the case where a match of an initial
> segment at the end of the string is not valid when the string ends
> before the end of the 16-byte operand (that is, the case that would be
> broken by a simple revert of the non-empty-string part of my 2017
> patch).
> 
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/ops_sse.h                |  4 ++--
>  tests/tcg/i386/Makefile.target       |  3 +++
>  tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
>  3 files changed, 38 insertions(+), 2 deletions(-)
>  create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 01d6017412..14f2b16abd 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>              res = (2 << upper) - 1;
>              break;
>          }
> -        for (j = valids - validd; j >= 0; j--) {
> +        for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
>              res <<= 1;
>              v = 1;
> -            for (i = validd; i >= 0; i--) {
> +            for (i = MIN(valids - j, validd); i >= 0; i--) {
>                  v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>              }
>              res |= v;
> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
> index 43ee2e181e..53efec0668 100644
> --- a/tests/tcg/i386/Makefile.target
> +++ b/tests/tcg/i386/Makefile.target
> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
>  SKIP_I386_TESTS=test-i386-ssse3
>  X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>  
> +test-i386-pcmpistri: CFLAGS += -msse4.2
> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max

This test fails on our CI:
https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246

> +
>  #
>  # hello-i386 is a barebones app
>  #
> diff --git a/tests/tcg/i386/test-i386-pcmpistri.c b/tests/tcg/i386/test-i386-pcmpistri.c
> new file mode 100644
> index 0000000000..1e81ae611a
> --- /dev/null
> +++ b/tests/tcg/i386/test-i386-pcmpistri.c
> @@ -0,0 +1,33 @@
> +/* Test pcmpistri instruction.  */
> +
> +#include <nmmintrin.h>
> +#include <stdio.h>
> +
> +union u {
> +    __m128i x;
> +    unsigned char uc[16];
> +};
> +
> +union u s0 = { .uc = { 0 } };
> +union u s1 = { .uc = "abcdefghijklmnop" };
> +union u s2 = { .uc = "bcdefghijklmnopa" };
> +union u s3 = { .uc = "bcdefghijklmnab" };
> +
> +int
> +main(void)
> +{
> +    int ret = 0;
> +    if (_mm_cmpistri(s0.x, s0.x, 0x4c) != 15) {
> +        printf("FAIL: pcmpistri test 1\n");
> +        ret = 1;
> +    }
> +    if (_mm_cmpistri(s1.x, s2.x, 0x4c) != 15) {
> +        printf("FAIL: pcmpistri test 2\n");
> +        ret = 1;
> +    }
> +    if (_mm_cmpistri(s1.x, s3.x, 0x4c) != 16) {
> +        printf("FAIL: pcmpistri test 3\n");
> +        ret = 1;
> +    }
> +    return ret;
> +}
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
  2020-06-15 10:18   ` Philippe Mathieu-Daudé
@ 2020-06-15 10:58     ` Philippe Mathieu-Daudé
  2020-06-15 13:43       ` Alex Bennée
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-15 10:58 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: Alex Bennée, Joseph Myers

On 6/15/20 12:18 PM, Philippe Mathieu-Daudé wrote:
> On 6/12/20 6:07 PM, Paolo Bonzini wrote:
>> From: Joseph Myers <joseph@codesourcery.com>
>>
>> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
>> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
>> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
>>
>> That commit fixed a bug that showed up in four GCC tests with one libc
>> implementation.  The tests in question generate random inputs to the
>> intrinsics and compare results to a C implementation, but they only
>> test 1024 possible random inputs, and when the tests use the cases of
>> those instructions that work with word rather than byte inputs, it's
>> easy to have problematic cases that show up much less frequently than
>> that.  Thus, testing with a different libc implementation, and so a
>> different random number generator, showed up a problem with the
>> previous patch.
>>
>> When investigating the previous test failures, I found the description
>> of these instructions in the Intel manuals (starting from computing a
>> 16x16 or 8x8 set of comparison results) confusing and hard to match up
>> with the more optimized implementation in QEMU, and referred to AMD
>> manuals which described the instructions in a different way.  Those
>> AMD descriptions are very explicit that the whole of the string being
>> searched for must be found in the other operand, not running off the
>> end of that operand; they say "If the prototype and the SUT are equal
>> in length, the two strings must be identical for the comparison to be
>> TRUE.".  However, that statement is incorrect.
>>
>> In my previous commit message, I noted:
>>
>>   The operation in this case is a search for a string (argument d to
>>   the helper) in another string (argument s to the helper); if a copy
>>   of d at a particular position would run off the end of s, the
>>   resulting output bit should be 0 whether or not the strings match in
>>   the region where they overlap, but the QEMU implementation was
>>   wrongly comparing only up to the point where s ends and counting it
>>   as a match if an initial segment of d matched a terminal segment of
>>   s.  Here, "run off the end of s" means that some byte of d would
>>   overlap some byte outside of s; thus, if d has zero length, it is
>>   considered to match everywhere, including after the end of s.
>>
>> The description "some byte of d would overlap some byte outside of s"
>> is accurate only when understood to refer to overlapping some byte
>> *within the 16-byte operand* but at or after the zero terminator; it
>> is valid to run over the end of s if the end of s is the end of the
>> 16-byte operand.  So the fix in the previous patch for the case of d
>> being empty was correct, but the other part of that patch was not
>> correct (as it never allowed partial matches even at the end of the
>> 16-byte operand).  Nor was the code before the previous patch correct
>> for the case of d nonempty, as it would always have allowed partial
>> matches at the end of s.
>>
>> Fix with a partial revert of my previous change, combined with
>> inserting a check for the special case of s having maximum length to
>> determine where it is necessary to check for matches.
>>
>> In the added test, test 1 is for the case of empty strings, which
>> failed before my 2017 patch, test 2 is for the bug introduced by my
>> 2017 patch and test 3 deals with the case where a match of an initial
>> segment at the end of the string is not valid when the string ends
>> before the end of the 16-byte operand (that is, the case that would be
>> broken by a simple revert of the non-empty-string part of my 2017
>> patch).
>>
>> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
>> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  target/i386/ops_sse.h                |  4 ++--
>>  tests/tcg/i386/Makefile.target       |  3 +++
>>  tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
>>  3 files changed, 38 insertions(+), 2 deletions(-)
>>  create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
>>
>> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
>> index 01d6017412..14f2b16abd 100644
>> --- a/target/i386/ops_sse.h
>> +++ b/target/i386/ops_sse.h
>> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>>              res = (2 << upper) - 1;
>>              break;
>>          }
>> -        for (j = valids - validd; j >= 0; j--) {
>> +        for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
>>              res <<= 1;
>>              v = 1;
>> -            for (i = validd; i >= 0; i--) {
>> +            for (i = MIN(valids - j, validd); i >= 0; i--) {
>>                  v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>>              }
>>              res |= v;
>> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
>> index 43ee2e181e..53efec0668 100644
>> --- a/tests/tcg/i386/Makefile.target
>> +++ b/tests/tcg/i386/Makefile.target
>> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
>>  SKIP_I386_TESTS=test-i386-ssse3
>>  X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>>  
>> +test-i386-pcmpistri: CFLAGS += -msse4.2
>> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
> 
> This test fails on our CI:
> https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246

FYI Paolo's analysis from 'make V=1' output
https://api.travis-ci.org/v3/job/698459904/log.txt:

timeout 60
/home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -cpu max
test-i386-pcmpistri >  test-i386-pcmpistri.out

timeout 60
/home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386  -plugin
../../plugin/libbb.so -d plugin -D
test-i386-pcmpistri-with-libbb.so.pout test-i386-pcmpistri >
run-plugin-test-i386-pcmpistri-with-libbb.so.out

"incorrect qemu invocation, missing -cpu max in the second".

> 
>> +
>>  #
>>  # hello-i386 is a barebones app
>>  #
>> diff --git a/tests/tcg/i386/test-i386-pcmpistri.c b/tests/tcg/i386/test-i386-pcmpistri.c
>> new file mode 100644
>> index 0000000000..1e81ae611a
>> --- /dev/null
>> +++ b/tests/tcg/i386/test-i386-pcmpistri.c
>> @@ -0,0 +1,33 @@
>> +/* Test pcmpistri instruction.  */
>> +
>> +#include <nmmintrin.h>
>> +#include <stdio.h>
>> +
>> +union u {
>> +    __m128i x;
>> +    unsigned char uc[16];
>> +};
>> +
>> +union u s0 = { .uc = { 0 } };
>> +union u s1 = { .uc = "abcdefghijklmnop" };
>> +union u s2 = { .uc = "bcdefghijklmnopa" };
>> +union u s3 = { .uc = "bcdefghijklmnab" };
>> +
>> +int
>> +main(void)
>> +{
>> +    int ret = 0;
>> +    if (_mm_cmpistri(s0.x, s0.x, 0x4c) != 15) {
>> +        printf("FAIL: pcmpistri test 1\n");
>> +        ret = 1;
>> +    }
>> +    if (_mm_cmpistri(s1.x, s2.x, 0x4c) != 15) {
>> +        printf("FAIL: pcmpistri test 2\n");
>> +        ret = 1;
>> +    }
>> +    if (_mm_cmpistri(s1.x, s3.x, 0x4c) != 16) {
>> +        printf("FAIL: pcmpistri test 3\n");
>> +        ret = 1;
>> +    }
>> +    return ret;
>> +}
>>
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
  2020-06-15 10:58     ` Philippe Mathieu-Daudé
@ 2020-06-15 13:43       ` Alex Bennée
  0 siblings, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2020-06-15 13:43 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé; +Cc: Paolo Bonzini, qemu-devel, Joseph Myers


Philippe Mathieu-Daudé <philmd@redhat.com> writes:

> On 6/15/20 12:18 PM, Philippe Mathieu-Daudé wrote:
>> On 6/12/20 6:07 PM, Paolo Bonzini wrote:
>>> From: Joseph Myers <joseph@codesourcery.com>
>>>
>>> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
>>> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
>>> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
>>>
>>> That commit fixed a bug that showed up in four GCC tests with one libc
>>> implementation.  The tests in question generate random inputs to the
>>> intrinsics and compare results to a C implementation, but they only
>>> test 1024 possible random inputs, and when the tests use the cases of
>>> those instructions that work with word rather than byte inputs, it's
>>> easy to have problematic cases that show up much less frequently than
>>> that.  Thus, testing with a different libc implementation, and so a
>>> different random number generator, showed up a problem with the
>>> previous patch.
>>>
>>> When investigating the previous test failures, I found the description
>>> of these instructions in the Intel manuals (starting from computing a
>>> 16x16 or 8x8 set of comparison results) confusing and hard to match up
>>> with the more optimized implementation in QEMU, and referred to AMD
>>> manuals which described the instructions in a different way.  Those
>>> AMD descriptions are very explicit that the whole of the string being
>>> searched for must be found in the other operand, not running off the
>>> end of that operand; they say "If the prototype and the SUT are equal
>>> in length, the two strings must be identical for the comparison to be
>>> TRUE.".  However, that statement is incorrect.
>>>
>>> In my previous commit message, I noted:
>>>
>>>   The operation in this case is a search for a string (argument d to
>>>   the helper) in another string (argument s to the helper); if a copy
>>>   of d at a particular position would run off the end of s, the
>>>   resulting output bit should be 0 whether or not the strings match in
>>>   the region where they overlap, but the QEMU implementation was
>>>   wrongly comparing only up to the point where s ends and counting it
>>>   as a match if an initial segment of d matched a terminal segment of
>>>   s.  Here, "run off the end of s" means that some byte of d would
>>>   overlap some byte outside of s; thus, if d has zero length, it is
>>>   considered to match everywhere, including after the end of s.
>>>
>>> The description "some byte of d would overlap some byte outside of s"
>>> is accurate only when understood to refer to overlapping some byte
>>> *within the 16-byte operand* but at or after the zero terminator; it
>>> is valid to run over the end of s if the end of s is the end of the
>>> 16-byte operand.  So the fix in the previous patch for the case of d
>>> being empty was correct, but the other part of that patch was not
>>> correct (as it never allowed partial matches even at the end of the
>>> 16-byte operand).  Nor was the code before the previous patch correct
>>> for the case of d nonempty, as it would always have allowed partial
>>> matches at the end of s.
>>>
>>> Fix with a partial revert of my previous change, combined with
>>> inserting a check for the special case of s having maximum length to
>>> determine where it is necessary to check for matches.
>>>
>>> In the added test, test 1 is for the case of empty strings, which
>>> failed before my 2017 patch, test 2 is for the bug introduced by my
>>> 2017 patch and test 3 deals with the case where a match of an initial
>>> segment at the end of the string is not valid when the string ends
>>> before the end of the 16-byte operand (that is, the case that would be
>>> broken by a simple revert of the non-empty-string part of my 2017
>>> patch).
>>>
>>> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
>>> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>  target/i386/ops_sse.h                |  4 ++--
>>>  tests/tcg/i386/Makefile.target       |  3 +++
>>>  tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
>>>  3 files changed, 38 insertions(+), 2 deletions(-)
>>>  create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
>>>
>>> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
>>> index 01d6017412..14f2b16abd 100644
>>> --- a/target/i386/ops_sse.h
>>> +++ b/target/i386/ops_sse.h
>>> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>>>              res = (2 << upper) - 1;
>>>              break;
>>>          }
>>> -        for (j = valids - validd; j >= 0; j--) {
>>> +        for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
>>>              res <<= 1;
>>>              v = 1;
>>> -            for (i = validd; i >= 0; i--) {
>>> +            for (i = MIN(valids - j, validd); i >= 0; i--) {
>>>                  v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>>>              }
>>>              res |= v;
>>> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
>>> index 43ee2e181e..53efec0668 100644
>>> --- a/tests/tcg/i386/Makefile.target
>>> +++ b/tests/tcg/i386/Makefile.target
>>> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
>>>  SKIP_I386_TESTS=test-i386-ssse3
>>>  X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>>>  
>>> +test-i386-pcmpistri: CFLAGS += -msse4.2
>>> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
>> 
>> This test fails on our CI:
>> https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246
>
> FYI Paolo's analysis from 'make V=1' output
> https://api.travis-ci.org/v3/job/698459904/log.txt:
>
> timeout 60
> /home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -cpu max
> test-i386-pcmpistri >  test-i386-pcmpistri.out
>
> timeout 60
> /home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386  -plugin
> ../../plugin/libbb.so -d plugin -D
> test-i386-pcmpistri-with-libbb.so.pout test-i386-pcmpistri >
> run-plugin-test-i386-pcmpistri-with-libbb.so.out
>
> "incorrect qemu invocation, missing -cpu max in the second".

Just testing some patches now.


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-06-15 13:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
2020-06-15 10:18   ` Philippe Mathieu-Daudé
2020-06-15 10:58     ` Philippe Mathieu-Daudé
2020-06-15 13:43       ` Alex Bennée
2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).