* [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11
@ 2020-06-12 16:07 Paolo Bonzini
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
To: qemu-devel
The following changes since commit 31d321c2b3574dcc74e9f6411af06bca6b5d10f4:
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sparc-next-20200609' into staging (2020-06-09 17:29:47 +0100)
are available in the Git repository at:
git://github.com/bonzini/qemu.git tags/for-upstream
for you to fetch changes up to 3575b0aea983ad57804c9af739ed8ff7bc168393:
target/i386: Remove obsolete TODO file (2020-06-12 11:20:15 -0400)
----------------------------------------------------------------
* Miscellaneous fixes and feature enablement (many)
* SEV refactoring (David)
* Hyper-V initial support (Jon)
* i386 TCG fixes (x87 and SSE, Joseph)
* vmport cleanup and improvements (Philippe, Liran)
* Use-after-free with vCPU hot-unplug (Nengyuan)
* run-coverity-scan improvements (myself)
* Record/replay fixes (Pavel)
* -machine kernel_irqchip=split improvements for INTx (Peter)
* Code cleanups (Philippe)
* Crash and security fixes (PJP)
* HVF cleanups (Roman)
----------------------------------------------------------------
Anthony PERARD (1):
xen: fix build without pci passthrough
Babu Moger (1):
target/i386: Fix the CPUID leaf CPUID_Fn80000008
Cathy Zhang (1):
x86/cpu: Enable AVX512_VP2INTERSECT cpu feature
Cédric Le Goater (1):
qom/object: Fix object_child_foreach_recursive() return value
David Carlier (1):
util/oslib: Returns the real thread identifier on FreeBSD and NetBSD
David Gibson (9):
target/i386: sev: Remove unused QSevGuestInfoClass
target/i386: sev: Move local structure definitions into .c file
target/i386: sev: Rename QSevGuestInfo
target/i386: sev: Embed SEVState in SevGuestState
target/i386: sev: Partial cleanup to sev_state global
target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields
target/i386: sev: Remove redundant policy field
target/i386: sev: Remove redundant handle field
target/i386: sev: Unify SEVState and SevGuestState
Edgar E. Iglesias (1):
tests: machine-none-test: Enable MicroBlaze testing
Igor Mammedov (2):
vl.c: run preconfig loop before creating default RAM backend
numa: prevent usage of -M memory-backend and -numa memdev at the same time
Janne Grunau (1):
target/i386: fix phadd* with identical destination and source register
Jon Doron (6):
hyperv: expose API to determine if synic is enabled
vmbus: add vmbus protocol definitions
vmbus: vmbus implementation
i386:pc: whitelist dynamic vmbus-bridge
i386: Hyper-V VMBus ACPI DSDT entry
vmbus: add infrastructure to save/load vmbus requests
Joseph Myers (12):
target/i386: implement special cases for fxtract
target/i386: fix fscale handling of signaling NaN
target/i386: fix fscale handling of invalid exponent encodings
target/i386: fix fscale handling of infinite exponents
target/i386: fix fscale handling of rounding precision
target/i386: fix floating-point load-constant rounding
target/i386: fix fxam handling of invalid encodings
target/i386: fix fbstp handling of negative zero
target/i386: fix fbstp handling of out-of-range values
target/i386: fix fisttpl, fisttpll handling of out-of-range values
target/i386: fix IEEE x87 floating-point exception raising
target/i386: correct fix for pcmpxstrx substring search
Julio Faracco (1):
i386: Remove unused define's from hax and hvf
Leonid Bloch (1):
configure: Do not ignore malloc value
Like Xu (1):
target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES
Liran Alon (14):
hw/i386/vmport: Add reference to VMware open-vm-tools
hw/i386/vmport: Add device properties
hw/i386/vmport: Propagate IOPort read to vCPU EAX register
hw/i386/vmport: Set EAX to -1 on failed and unsupported commands
hw/i386/vmport: Introduce vmware-vmx-version property
hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION
hw/i386/vmport: Introduce vmport.h
hw/i386/vmport: Define enum for all commands
hw/i386/vmport: Add support for CMD_GETBIOSUUID
hw/i386/vmport: Add support for CMD_GET_VCPU_INFO
hw/i386/vmport: Allow x2apic without IR
i386/cpu: Store LAPIC bus frequency in CPU structure
hw/i386/vmport: Add support for CMD_GETHZ
hw/i386/vmport: Assert vmport initialized before registering commands
Markus Armbruster (1):
cpus: Fix botched configure_icount() error API violation fix
Masahiro Yamada (5):
qom: remove index from object_resolve_abs_path()
qom/object: factor out the initialization of hash table of properties
qom/object: simplify type_initialize_interface()
qom/object: pass (Object *) to object_initialize_with_type()
qom/container: remove .instance_size initializer from container_info
Michael S. Tsirkin (1):
checkpatch: reversed logic with acpi test checks
Pan Nengyuan (1):
i386/kvm: fix a use-after-free when vcpu plug/unplug
Paolo Bonzini (9):
docker.py/build: support -t and -f arguments
docker.py/build: support binary files in --extra-files
run-coverity-scan: get Coverity token and email from special git config section
run-coverity-scan: use docker.py
run-coverity-scan: add --no-update-tools option
run-coverity-scan: use --no-update-tools in docker run
run-coverity-scan: download tools outside the container
run-coverity-scan: support --update-tools-only --docker
stubs: move Xen stubs to accel/
Pavel Dovgaluk (3):
icount: fix shift=auto for record/replay
replay: implement fair mutex
replay: fix replay shutdown for console mode
Peter Xu (3):
vfio/pci: Use kvm_irqchip_add_irqfd_notifier_gsi() for irqfds
KVM: Pass EventNotifier into kvm_irqchip_assign_irqfd
KVM: Kick resamplefd for split kernel irqchip
Philippe Mathieu-Daudé (19):
target/i386: Fix OUTL debug output
qom/object: Move Object typedef to 'qemu/typedefs.h'
io/task: Move 'qom/object.h' header to source
Makefile: Let the 'help' target list the helper targets
accel: Move Xen accelerator code under accel/xen/
exec: Let address_space_read/write_cached() propagate MemTxResult
exec: Propagate cpu_memory_rw_debug() error
disas: Let disas::read_memory() handler return EIO on error
hw/elf_ops: Do not ignore write failures when loading ELF
hw/i386/vmport: Allow QTest use without crashing
memory: Make 'info mtree' not display disabled regions by default
qemu/thread: Mark qemu_thread_exit() with 'noreturn' attribute
sysemu/accel: Restrict machine methods to system-mode
sysemu/tcg: Only declare tcg_allowed when TCG is available
sysemu/hvf: Only declare hvf_allowed when HVF is available
target/ppc: Restrict PPCVirtualHypervisorClass to system-mode
exec/memory: Remove unused MemoryRegionMmio type
hw/usb: Move device-specific declarations to new 'hcd-musb.h' header
exec/cpu-common: Move MUSB specific typedefs to 'hw/usb/hcd-musb.h'
Prasad J Pandit (4):
megasas: use unsigned type for reply_queue_head and check index
megasas: avoid NULL pointer dereference
megasas: use unsigned type for positive numeric fields
exec: set map length to zero when returning NULL
Roman Bolshakov (13):
i386: hvf: Move HVFState definition into hvf
i386: hvf: Drop useless declarations in sysemu
i386: hvf: Clean stray includes in sysemu
i386: hvf: Drop unused variable
i386: hvf: Use ins_len to advance IP
i386: hvf: Use IP from CPUX86State
i386: hvf: Drop fetch_rip from HVFX86EmulatorState
i386: hvf: Drop rflags from HVFX86EmulatorState
i386: hvf: Drop copy of RFLAGS defines
i386: hvf: Drop regs in HVFX86EmulatorState
i386: hvf: Move lazy_flags into CPUX86State
i386: hvf: Move mmio_buf into CPUX86State
i386: hvf: Drop HVFX86EmulatorState
Sai Pavan Boddu (1):
chardev/char-socket: Properly make qio connections non blocking
Thomas Huth (1):
target/i386: Remove obsolete TODO file
WangBowen (1):
hax: Dynamic allocate vcpu state structure
Wei Huang (1):
hw/i386/amd_iommu: Fix the reserved bits definition of IOMMU commands
MAINTAINERS | 2 +
Makefile | 9 +-
Makefile.objs | 1 +
accel/Makefile.objs | 1 +
accel/kvm/kvm-all.c | 95 +-
accel/kvm/trace-events | 1 +
accel/stubs/Makefile.objs | 1 +
stubs/xen-hvm.c => accel/stubs/xen-stub.c | 23 +-
accel/xen/Makefile.objs | 1 +
hw/xen/xen-common.c => accel/xen/xen-all.c | 12 +-
chardev/char-socket.c | 4 +-
configure | 25 +-
cpus.c | 26 +-
disas.c | 13 +-
exec.c | 29 +-
hmp-commands-info.hx | 7 +-
hw/Makefile.objs | 2 +-
hw/acpi/piix4.c | 2 +-
hw/block/vhost-user-blk.c | 1 -
hw/core/machine.c | 4 +
hw/core/numa.c | 5 +
hw/hyperv/Kconfig | 5 +
hw/hyperv/Makefile.objs | 1 +
hw/hyperv/hyperv.c | 8 +
hw/hyperv/trace-events | 18 +
hw/hyperv/vmbus.c | 2778 ++++++++++++++++++++++++++++
hw/i386/acpi-build.c | 43 +
hw/i386/amd_iommu.c | 19 +-
hw/i386/pc.c | 3 +-
hw/i386/pc_piix.c | 5 +-
hw/i386/pc_q35.c | 3 +
hw/i386/vmmouse.c | 20 +-
hw/i386/vmport.c | 183 +-
hw/i386/vmport.h | 34 -
hw/i386/xen/xen-hvm.c | 1 +
hw/i386/xen/xen_platform.c | 1 +
hw/intc/ioapic.c | 19 +
hw/isa/piix3.c | 1 +
hw/pci-host/pam.c | 1 -
hw/pci/msix.c | 1 +
hw/scsi/megasas.c | 44 +-
hw/scsi/vhost-user-scsi.c | 1 -
hw/usb/hcd-musb.c | 5 +-
hw/usb/tusb6010.c | 1 +
hw/vfio/pci.c | 37 +-
hw/xen/Makefile.objs | 3 +-
hw/xen/xen_pt.c | 12 +-
hw/xen/xen_pt.h | 6 +-
hw/xen/xen_pt_stub.c | 22 +
include/exec/cpu-all.h | 1 +
include/exec/cpu-common.h | 3 -
include/exec/memory.h | 30 +-
include/exec/ram_addr.h | 2 +-
include/hw/display/edid.h | 3 -
include/hw/elf_ops.h | 11 +-
include/hw/hyperv/hyperv.h | 1 +
include/hw/hyperv/vmbus-bridge.h | 35 +
include/hw/hyperv/vmbus-proto.h | 222 +++
include/hw/hyperv/vmbus.h | 230 +++
include/hw/i386/vmport.h | 28 +
include/hw/usb.h | 30 -
include/hw/usb/hcd-musb.h | 47 +
include/hw/xen/xen.h | 11 -
include/io/task.h | 2 -
include/qemu/thread.h | 2 +-
include/qemu/typedefs.h | 2 +-
include/qom/object.h | 2 -
include/qom/qom-qobject.h | 2 -
include/sysemu/accel.h | 2 +
include/sysemu/hvf.h | 72 +-
include/sysemu/kvm.h | 4 +
include/sysemu/sysemu.h | 1 -
include/sysemu/tcg.h | 2 +-
include/sysemu/xen.h | 38 +
io/task.c | 1 +
memory.c | 75 +-
migration/savevm.c | 2 +-
monitor/misc.c | 3 +-
qom/container.c | 1 -
qom/object.c | 39 +-
replay/replay-internal.c | 15 +-
replay/replay.c | 5 +
scripts/checkpatch.pl | 2 +-
scripts/coverity-scan/coverity-scan.docker | 3 +-
scripts/coverity-scan/run-coverity-scan | 139 +-
softmmu/vl.c | 7 +-
stubs/Makefile.objs | 2 -
stubs/qmp_memory_device.c | 1 -
stubs/xen-common.c | 13 -
target/i386/TODO | 31 -
target/i386/cpu.c | 42 +-
target/i386/cpu.h | 15 +-
target/i386/fpu_helper.c | 258 ++-
target/i386/hax-all.c | 25 +-
target/i386/hax-i386.h | 7 +-
target/i386/hvf/hvf-i386.h | 37 +-
target/i386/hvf/hvf.c | 30 +-
target/i386/hvf/x86.c | 2 +-
target/i386/hvf/x86.h | 89 +-
target/i386/hvf/x86_decode.c | 25 +-
target/i386/hvf/x86_emu.c | 122 +-
target/i386/hvf/x86_flags.c | 81 +-
target/i386/hvf/x86_task.c | 10 +-
target/i386/hvf/x86hvf.c | 6 +-
target/i386/kvm.c | 34 +-
target/i386/misc_helper.c | 2 +-
target/i386/ops_sse.h | 57 +-
target/i386/sev.c | 257 +--
target/i386/sev_i386.h | 49 -
target/ppc/cpu.h | 4 +-
target/ppc/kvm_ppc.h | 22 +-
target/ppc/translate_init.inc.c | 4 +
tests/docker/Makefile.include | 2 +-
tests/docker/docker.py | 14 +-
tests/qtest/machine-none-test.c | 10 +-
tests/tcg/i386/Makefile.target | 3 +
tests/tcg/i386/test-i386-fbstp.c | 140 ++
tests/tcg/i386/test-i386-fisttp.c | 100 +
tests/tcg/i386/test-i386-fldcst.c | 199 ++
tests/tcg/i386/test-i386-fp-exceptions.c | 831 +++++++++
tests/tcg/i386/test-i386-fscale.c | 108 ++
tests/tcg/i386/test-i386-fxam.c | 143 ++
tests/tcg/i386/test-i386-fxtract.c | 120 ++
tests/tcg/i386/test-i386-pcmpistri.c | 33 +
tests/test-io-task.c | 1 +
util/oslib-posix.c | 9 +
126 files changed, 6572 insertions(+), 980 deletions(-)
rename stubs/xen-hvm.c => accel/stubs/xen-stub.c (63%)
create mode 100644 accel/xen/Makefile.objs
rename hw/xen/xen-common.c => accel/xen/xen-all.c (96%)
create mode 100644 hw/hyperv/trace-events
create mode 100644 hw/hyperv/vmbus.c
delete mode 100644 hw/i386/vmport.h
create mode 100644 hw/xen/xen_pt_stub.c
create mode 100644 include/hw/hyperv/vmbus-bridge.h
create mode 100644 include/hw/hyperv/vmbus-proto.h
create mode 100644 include/hw/hyperv/vmbus.h
create mode 100644 include/hw/i386/vmport.h
create mode 100644 include/hw/usb/hcd-musb.h
create mode 100644 include/sysemu/xen.h
delete mode 100644 stubs/xen-common.c
delete mode 100644 target/i386/TODO
create mode 100644 tests/tcg/i386/test-i386-fbstp.c
create mode 100644 tests/tcg/i386/test-i386-fisttp.c
create mode 100644 tests/tcg/i386/test-i386-fldcst.c
create mode 100644 tests/tcg/i386/test-i386-fp-exceptions.c
create mode 100644 tests/tcg/i386/test-i386-fscale.c
create mode 100644 tests/tcg/i386/test-i386-fxam.c
create mode 100644 tests/tcg/i386/test-i386-fxtract.c
create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
--
2.26.2
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
2020-06-15 10:18 ` Philippe Mathieu-Daudé
2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
To: qemu-devel; +Cc: Joseph Myers
From: Joseph Myers <joseph@codesourcery.com>
This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
/ pcmpestrm / pcmpistri / pcmpistrm substring search, commit
ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
That commit fixed a bug that showed up in four GCC tests with one libc
implementation. The tests in question generate random inputs to the
intrinsics and compare results to a C implementation, but they only
test 1024 possible random inputs, and when the tests use the cases of
those instructions that work with word rather than byte inputs, it's
easy to have problematic cases that show up much less frequently than
that. Thus, testing with a different libc implementation, and so a
different random number generator, showed up a problem with the
previous patch.
When investigating the previous test failures, I found the description
of these instructions in the Intel manuals (starting from computing a
16x16 or 8x8 set of comparison results) confusing and hard to match up
with the more optimized implementation in QEMU, and referred to AMD
manuals which described the instructions in a different way. Those
AMD descriptions are very explicit that the whole of the string being
searched for must be found in the other operand, not running off the
end of that operand; they say "If the prototype and the SUT are equal
in length, the two strings must be identical for the comparison to be
TRUE.". However, that statement is incorrect.
In my previous commit message, I noted:
The operation in this case is a search for a string (argument d to
the helper) in another string (argument s to the helper); if a copy
of d at a particular position would run off the end of s, the
resulting output bit should be 0 whether or not the strings match in
the region where they overlap, but the QEMU implementation was
wrongly comparing only up to the point where s ends and counting it
as a match if an initial segment of d matched a terminal segment of
s. Here, "run off the end of s" means that some byte of d would
overlap some byte outside of s; thus, if d has zero length, it is
considered to match everywhere, including after the end of s.
The description "some byte of d would overlap some byte outside of s"
is accurate only when understood to refer to overlapping some byte
*within the 16-byte operand* but at or after the zero terminator; it
is valid to run over the end of s if the end of s is the end of the
16-byte operand. So the fix in the previous patch for the case of d
being empty was correct, but the other part of that patch was not
correct (as it never allowed partial matches even at the end of the
16-byte operand). Nor was the code before the previous patch correct
for the case of d nonempty, as it would always have allowed partial
matches at the end of s.
Fix with a partial revert of my previous change, combined with
inserting a check for the special case of s having maximum length to
determine where it is necessary to check for matches.
In the added test, test 1 is for the case of empty strings, which
failed before my 2017 patch, test 2 is for the bug introduced by my
2017 patch and test 3 deals with the case where a match of an initial
segment at the end of the string is not valid when the string ends
before the end of the 16-byte operand (that is, the case that would be
broken by a simple revert of the non-empty-string part of my 2017
patch).
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/ops_sse.h | 4 ++--
tests/tcg/i386/Makefile.target | 3 +++
tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
3 files changed, 38 insertions(+), 2 deletions(-)
create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 01d6017412..14f2b16abd 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
res = (2 << upper) - 1;
break;
}
- for (j = valids - validd; j >= 0; j--) {
+ for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
res <<= 1;
v = 1;
- for (i = validd; i >= 0; i--) {
+ for (i = MIN(valids - j, validd); i >= 0; i--) {
v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
}
res |= v;
diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index 43ee2e181e..53efec0668 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
SKIP_I386_TESTS=test-i386-ssse3
X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
+test-i386-pcmpistri: CFLAGS += -msse4.2
+run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
+
#
# hello-i386 is a barebones app
#
diff --git a/tests/tcg/i386/test-i386-pcmpistri.c b/tests/tcg/i386/test-i386-pcmpistri.c
new file mode 100644
index 0000000000..1e81ae611a
--- /dev/null
+++ b/tests/tcg/i386/test-i386-pcmpistri.c
@@ -0,0 +1,33 @@
+/* Test pcmpistri instruction. */
+
+#include <nmmintrin.h>
+#include <stdio.h>
+
+union u {
+ __m128i x;
+ unsigned char uc[16];
+};
+
+union u s0 = { .uc = { 0 } };
+union u s1 = { .uc = "abcdefghijklmnop" };
+union u s2 = { .uc = "bcdefghijklmnopa" };
+union u s3 = { .uc = "bcdefghijklmnab" };
+
+int
+main(void)
+{
+ int ret = 0;
+ if (_mm_cmpistri(s0.x, s0.x, 0x4c) != 15) {
+ printf("FAIL: pcmpistri test 1\n");
+ ret = 1;
+ }
+ if (_mm_cmpistri(s1.x, s2.x, 0x4c) != 15) {
+ printf("FAIL: pcmpistri test 2\n");
+ ret = 1;
+ }
+ if (_mm_cmpistri(s1.x, s3.x, 0x4c) != 16) {
+ printf("FAIL: pcmpistri test 3\n");
+ ret = 1;
+ }
+ return ret;
+}
--
2.26.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PULL 088/116] i386: hvf: Drop useless declarations in sysemu
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
To: qemu-devel; +Cc: Roman Bolshakov, Philippe Mathieu-Daudé
From: Roman Bolshakov <r.bolshakov@yadro.com>
They're either declared elsewhere or have no use.
While at it, rename _hvf_cpu_synchronize_post_init() to
do_hvf_cpu_synchronize_post_init().
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-3-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
include/sysemu/hvf.h | 22 ----------------------
target/i386/hvf/hvf.c | 7 ++++---
2 files changed, 4 insertions(+), 25 deletions(-)
diff --git a/include/sysemu/hvf.h b/include/sysemu/hvf.h
index 644bdfc722..2af32e505e 100644
--- a/include/sysemu/hvf.h
+++ b/include/sysemu/hvf.h
@@ -30,35 +30,13 @@ extern bool hvf_allowed;
#define hvf_get_supported_cpuid(func, idx, reg) 0
#endif /* !CONFIG_HVF */
-/* Disable HVF if |disable| is 1, otherwise, enable it iff it is supported by
- * the host CPU. Use hvf_enabled() after this to get the result. */
-void hvf_disable(int disable);
-
-/* Returns non-0 if the host CPU supports the VMX "unrestricted guest" feature
- * which allows the virtual CPU to directly run in "real mode". If true, this
- * allows QEMU to run several vCPU threads in parallel (see cpus.c). Otherwise,
- * only a a single TCG thread can run, and it will call HVF to run the current
- * instructions, except in case of "real mode" (paging disabled, typically at
- * boot time), or MMIO operations. */
-
-int hvf_sync_vcpus(void);
-
int hvf_init_vcpu(CPUState *);
int hvf_vcpu_exec(CPUState *);
-int hvf_smp_cpu_exec(CPUState *);
void hvf_cpu_synchronize_state(CPUState *);
void hvf_cpu_synchronize_post_reset(CPUState *);
void hvf_cpu_synchronize_post_init(CPUState *);
-void _hvf_cpu_synchronize_post_init(CPUState *, run_on_cpu_data);
-
void hvf_vcpu_destroy(CPUState *);
-void hvf_raise_event(CPUState *);
-/* void hvf_reset_vcpu_state(void *opaque); */
void hvf_reset_vcpu(CPUState *);
-void vmx_update_tpr(CPUState *);
-void update_apic_tpr(CPUState *);
-int hvf_put_registers(CPUState *);
-void vmx_clear_int_window_exiting(CPUState *cpu);
#define TYPE_HVF_ACCEL ACCEL_CLASS_NAME("hvf")
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index d72543dc31..9ccdb7e7c7 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -251,7 +251,7 @@ void vmx_update_tpr(CPUState *cpu)
}
}
-void update_apic_tpr(CPUState *cpu)
+static void update_apic_tpr(CPUState *cpu)
{
X86CPU *x86_cpu = X86_CPU(cpu);
int tpr = rreg(cpu->hvf_fd, HV_X86_TPR) >> 4;
@@ -312,7 +312,8 @@ void hvf_cpu_synchronize_post_reset(CPUState *cpu_state)
run_on_cpu(cpu_state, do_hvf_cpu_synchronize_post_reset, RUN_ON_CPU_NULL);
}
-void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
+static void do_hvf_cpu_synchronize_post_init(CPUState *cpu,
+ run_on_cpu_data arg)
{
CPUState *cpu_state = cpu;
hvf_put_registers(cpu_state);
@@ -321,7 +322,7 @@ void _hvf_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
void hvf_cpu_synchronize_post_init(CPUState *cpu_state)
{
- run_on_cpu(cpu_state, _hvf_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
+ run_on_cpu(cpu_state, do_hvf_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
}
static bool ept_emulation_fault(hvf_slot *slot, uint64_t gpa, uint64_t ept_qual)
--
2.26.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell
4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
To: qemu-devel; +Cc: Roman Bolshakov
From: Roman Bolshakov <r.bolshakov@yadro.com>
The lazy flags are still needed for instruction decoder.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-12-r.bolshakov@yadro.com>
[Move struct to target/i386/cpu.h - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/cpu.h | 6 ++++
target/i386/hvf/x86.h | 6 ----
target/i386/hvf/x86_flags.c | 57 ++++++++++++++++++-------------------
3 files changed, 34 insertions(+), 35 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c2b8bdcbde..f742ba933f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1366,6 +1366,11 @@ typedef struct CPUCaches {
CPUCacheInfo *l3_cache;
} CPUCaches;
+typedef struct HVFX86LazyFlags {
+ target_ulong result;
+ target_ulong auxbits;
+} HVFX86LazyFlags;
+
typedef struct CPUX86State {
/* standard registers */
target_ulong regs[CPU_NB_REGS];
@@ -1597,6 +1602,7 @@ typedef struct CPUX86State {
struct kvm_nested_state *nested_state;
#endif
#if defined(CONFIG_HVF)
+ HVFX86LazyFlags hvf_lflags;
HVFX86EmulatorState *hvf_emul;
#endif
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 6048b5cc74..2363616c07 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -228,14 +228,8 @@ typedef struct x68_segment_selector {
};
} __attribute__ ((__packed__)) x68_segment_selector;
-typedef struct lazy_flags {
- target_ulong result;
- target_ulong auxbits;
-} lazy_flags;
-
/* Definition of hvf_x86_state is here */
struct HVFX86EmulatorState {
- struct lazy_flags lflags;
uint8_t mmio_buf[4096];
};
diff --git a/target/i386/hvf/x86_flags.c b/target/i386/hvf/x86_flags.c
index 1152cd7234..5ca4f41f5c 100644
--- a/target/i386/hvf/x86_flags.c
+++ b/target/i386/hvf/x86_flags.c
@@ -63,7 +63,7 @@
#define SET_FLAGS_OSZAPC_SIZE(size, lf_carries, lf_result) { \
target_ulong temp = ((lf_carries) & (LF_MASK_AF)) | \
(((lf_carries) >> (size - 2)) << LF_BIT_PO); \
- env->hvf_emul->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+ env->hvf_lflags.result = (target_ulong)(int##size##_t)(lf_result); \
if ((size) == 32) { \
temp = ((lf_carries) & ~(LF_MASK_PDB | LF_MASK_SD)); \
} else if ((size) == 16) { \
@@ -73,7 +73,7 @@
} else { \
VM_PANIC("unimplemented"); \
} \
- env->hvf_emul->lflags.auxbits = (target_ulong)(uint32_t)temp; \
+ env->hvf_lflags.auxbits = (target_ulong)(uint32_t)temp; \
}
/* carries, result */
@@ -100,10 +100,10 @@
} else { \
VM_PANIC("unimplemented"); \
} \
- env->hvf_emul->lflags.result = (target_ulong)(int##size##_t)(lf_result); \
- target_ulong delta_c = (env->hvf_emul->lflags.auxbits ^ temp) & LF_MASK_CF; \
+ env->hvf_lflags.result = (target_ulong)(int##size##_t)(lf_result); \
+ target_ulong delta_c = (env->hvf_lflags.auxbits ^ temp) & LF_MASK_CF; \
delta_c ^= (delta_c >> 1); \
- env->hvf_emul->lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
+ env->hvf_lflags.auxbits = (target_ulong)(uint32_t)(temp ^ delta_c); \
}
/* carries, result */
@@ -117,9 +117,8 @@
void SET_FLAGS_OxxxxC(CPUX86State *env, uint32_t new_of, uint32_t new_cf)
{
uint32_t temp_po = new_of ^ new_cf;
- env->hvf_emul->lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
- env->hvf_emul->lflags.auxbits |= (temp_po << LF_BIT_PO) |
- (new_cf << LF_BIT_CF);
+ env->hvf_lflags.auxbits &= ~(LF_MASK_PO | LF_MASK_CF);
+ env->hvf_lflags.auxbits |= (temp_po << LF_BIT_PO) | (new_cf << LF_BIT_CF);
}
void SET_FLAGS_OSZAPC_SUB32(CPUX86State *env, uint32_t v1, uint32_t v2,
@@ -215,27 +214,27 @@ void SET_FLAGS_OSZAPC_LOGIC8(CPUX86State *env, uint8_t v1, uint8_t v2,
bool get_PF(CPUX86State *env)
{
- uint32_t temp = (255 & env->hvf_emul->lflags.result);
- temp = temp ^ (255 & (env->hvf_emul->lflags.auxbits >> LF_BIT_PDB));
+ uint32_t temp = (255 & env->hvf_lflags.result);
+ temp = temp ^ (255 & (env->hvf_lflags.auxbits >> LF_BIT_PDB));
temp = (temp ^ (temp >> 4)) & 0x0F;
return (0x9669U >> temp) & 1;
}
void set_PF(CPUX86State *env, bool val)
{
- uint32_t temp = (255 & env->hvf_emul->lflags.result) ^ (!val);
- env->hvf_emul->lflags.auxbits &= ~(LF_MASK_PDB);
- env->hvf_emul->lflags.auxbits |= (temp << LF_BIT_PDB);
+ uint32_t temp = (255 & env->hvf_lflags.result) ^ (!val);
+ env->hvf_lflags.auxbits &= ~(LF_MASK_PDB);
+ env->hvf_lflags.auxbits |= (temp << LF_BIT_PDB);
}
bool get_OF(CPUX86State *env)
{
- return ((env->hvf_emul->lflags.auxbits + (1U << LF_BIT_PO)) >> LF_BIT_CF) & 1;
+ return ((env->hvf_lflags.auxbits + (1U << LF_BIT_PO)) >> LF_BIT_CF) & 1;
}
bool get_CF(CPUX86State *env)
{
- return (env->hvf_emul->lflags.auxbits >> LF_BIT_CF) & 1;
+ return (env->hvf_lflags.auxbits >> LF_BIT_CF) & 1;
}
void set_OF(CPUX86State *env, bool val)
@@ -252,45 +251,45 @@ void set_CF(CPUX86State *env, bool val)
bool get_AF(CPUX86State *env)
{
- return (env->hvf_emul->lflags.auxbits >> LF_BIT_AF) & 1;
+ return (env->hvf_lflags.auxbits >> LF_BIT_AF) & 1;
}
void set_AF(CPUX86State *env, bool val)
{
- env->hvf_emul->lflags.auxbits &= ~(LF_MASK_AF);
- env->hvf_emul->lflags.auxbits |= val << LF_BIT_AF;
+ env->hvf_lflags.auxbits &= ~(LF_MASK_AF);
+ env->hvf_lflags.auxbits |= val << LF_BIT_AF;
}
bool get_ZF(CPUX86State *env)
{
- return !env->hvf_emul->lflags.result;
+ return !env->hvf_lflags.result;
}
void set_ZF(CPUX86State *env, bool val)
{
if (val) {
- env->hvf_emul->lflags.auxbits ^=
- (((env->hvf_emul->lflags.result >> LF_SIGN_BIT) & 1) << LF_BIT_SD);
+ env->hvf_lflags.auxbits ^=
+ (((env->hvf_lflags.result >> LF_SIGN_BIT) & 1) << LF_BIT_SD);
/* merge the parity bits into the Parity Delta Byte */
- uint32_t temp_pdb = (255 & env->hvf_emul->lflags.result);
- env->hvf_emul->lflags.auxbits ^= (temp_pdb << LF_BIT_PDB);
+ uint32_t temp_pdb = (255 & env->hvf_lflags.result);
+ env->hvf_lflags.auxbits ^= (temp_pdb << LF_BIT_PDB);
/* now zero the .result value */
- env->hvf_emul->lflags.result = 0;
+ env->hvf_lflags.result = 0;
} else {
- env->hvf_emul->lflags.result |= (1 << 8);
+ env->hvf_lflags.result |= (1 << 8);
}
}
bool get_SF(CPUX86State *env)
{
- return ((env->hvf_emul->lflags.result >> LF_SIGN_BIT) ^
- (env->hvf_emul->lflags.auxbits >> LF_BIT_SD)) & 1;
+ return ((env->hvf_lflags.result >> LF_SIGN_BIT) ^
+ (env->hvf_lflags.auxbits >> LF_BIT_SD)) & 1;
}
void set_SF(CPUX86State *env, bool val)
{
bool temp_sf = get_SF(env);
- env->hvf_emul->lflags.auxbits ^= (temp_sf ^ val) << LF_BIT_SD;
+ env->hvf_lflags.auxbits ^= (temp_sf ^ val) << LF_BIT_SD;
}
void lflags_to_rflags(CPUX86State *env)
@@ -305,7 +304,7 @@ void lflags_to_rflags(CPUX86State *env)
void rflags_to_lflags(CPUX86State *env)
{
- env->hvf_emul->lflags.auxbits = env->hvf_emul->lflags.result = 0;
+ env->hvf_lflags.auxbits = env->hvf_lflags.result = 0;
set_OF(env, env->eflags & CC_O);
set_SF(env, env->eflags & CC_S);
set_ZF(env, env->eflags & CC_Z);
--
2.26.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PULL 098/116] i386: hvf: Move mmio_buf into CPUX86State
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
` (2 preceding siblings ...)
2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
@ 2020-06-12 16:07 ` Paolo Bonzini
2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell
4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-12 16:07 UTC (permalink / raw)
To: qemu-devel; +Cc: Roman Bolshakov
From: Roman Bolshakov <r.bolshakov@yadro.com>
There's no similar field in CPUX86State, but it's needed for MMIO traps.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-13-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/cpu.h | 1 +
target/i386/hvf/hvf.c | 5 +++++
target/i386/hvf/x86.h | 1 -
target/i386/hvf/x86_emu.c | 12 ++++++------
4 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f742ba933f..25a2f4c0c3 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1603,6 +1603,7 @@ typedef struct CPUX86State {
#endif
#if defined(CONFIG_HVF)
HVFX86LazyFlags hvf_lflags;
+ void *hvf_mmio_buf;
HVFX86EmulatorState *hvf_emul;
#endif
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 4cee496d71..57696c46c7 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -533,7 +533,11 @@ void hvf_reset_vcpu(CPUState *cpu) {
void hvf_vcpu_destroy(CPUState *cpu)
{
+ X86CPU *x86_cpu = X86_CPU(cpu);
+ CPUX86State *env = &x86_cpu->env;
+
hv_return_t ret = hv_vcpu_destroy((hv_vcpuid_t)cpu->hvf_fd);
+ g_free(env->hvf_mmio_buf);
assert_hvf_ok(ret);
}
@@ -563,6 +567,7 @@ int hvf_init_vcpu(CPUState *cpu)
init_decoder();
hvf_state->hvf_caps = g_new0(struct hvf_vcpu_caps, 1);
+ env->hvf_mmio_buf = g_new(char, 4096);
env->hvf_emul = g_new0(HVFX86EmulatorState, 1);
r = hv_vcpu_create((hv_vcpuid_t *)&cpu->hvf_fd, HV_VCPU_DEFAULT);
diff --git a/target/i386/hvf/x86.h b/target/i386/hvf/x86.h
index 2363616c07..483fcea762 100644
--- a/target/i386/hvf/x86.h
+++ b/target/i386/hvf/x86.h
@@ -230,7 +230,6 @@ typedef struct x68_segment_selector {
/* Definition of hvf_x86_state is here */
struct HVFX86EmulatorState {
- uint8_t mmio_buf[4096];
};
/* useful register access macros */
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 1ad2c30e16..d3e289ed87 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -187,8 +187,8 @@ void write_val_ext(struct CPUX86State *env, target_ulong ptr, target_ulong val,
uint8_t *read_mmio(struct CPUX86State *env, target_ulong ptr, int bytes)
{
- vmx_read_mem(env_cpu(env), env->hvf_emul->mmio_buf, ptr, bytes);
- return env->hvf_emul->mmio_buf;
+ vmx_read_mem(env_cpu(env), env->hvf_mmio_buf, ptr, bytes);
+ return env->hvf_mmio_buf;
}
@@ -489,9 +489,9 @@ static void exec_ins_single(struct CPUX86State *env, struct x86_decode *decode)
target_ulong addr = linear_addr_size(env_cpu(env), RDI(env),
decode->addressing_size, R_ES);
- hvf_handle_io(env_cpu(env), DX(env), env->hvf_emul->mmio_buf, 0,
+ hvf_handle_io(env_cpu(env), DX(env), env->hvf_mmio_buf, 0,
decode->operand_size, 1);
- vmx_write_mem(env_cpu(env), addr, env->hvf_emul->mmio_buf,
+ vmx_write_mem(env_cpu(env), addr, env->hvf_mmio_buf,
decode->operand_size);
string_increment_reg(env, R_EDI, decode);
@@ -512,9 +512,9 @@ static void exec_outs_single(struct CPUX86State *env, struct x86_decode *decode)
{
target_ulong addr = decode_linear_addr(env, decode, RSI(env), R_DS);
- vmx_read_mem(env_cpu(env), env->hvf_emul->mmio_buf, addr,
+ vmx_read_mem(env_cpu(env), env->hvf_mmio_buf, addr,
decode->operand_size);
- hvf_handle_io(env_cpu(env), DX(env), env->hvf_emul->mmio_buf, 1,
+ hvf_handle_io(env_cpu(env), DX(env), env->hvf_mmio_buf, 1,
decode->operand_size, 1);
string_increment_reg(env, R_ESI, decode);
--
2.26.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
` (3 preceding siblings ...)
2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
@ 2020-06-13 16:17 ` Peter Maydell
4 siblings, 0 replies; 9+ messages in thread
From: Peter Maydell @ 2020-06-13 16:17 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: QEMU Developers
On Fri, 12 Jun 2020 at 17:11, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> The following changes since commit 31d321c2b3574dcc74e9f6411af06bca6b5d10f4:
>
> Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sparc-next-20200609' into staging (2020-06-09 17:29:47 +0100)
>
> are available in the Git repository at:
>
> git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 3575b0aea983ad57804c9af739ed8ff7bc168393:
>
> target/i386: Remove obsolete TODO file (2020-06-12 11:20:15 -0400)
>
> ----------------------------------------------------------------
> * Miscellaneous fixes and feature enablement (many)
> * SEV refactoring (David)
> * Hyper-V initial support (Jon)
> * i386 TCG fixes (x87 and SSE, Joseph)
> * vmport cleanup and improvements (Philippe, Liran)
> * Use-after-free with vCPU hot-unplug (Nengyuan)
> * run-coverity-scan improvements (myself)
> * Record/replay fixes (Pavel)
> * -machine kernel_irqchip=split improvements for INTx (Peter)
> * Code cleanups (Philippe)
> * Crash and security fixes (PJP)
> * HVF cleanups (Roman)
Applied, thanks. (I had to fix up a conflict in hw/i386/acpi-build.c;
might be worth checking that I got it right.)
Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.
-- PMM
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
@ 2020-06-15 10:18 ` Philippe Mathieu-Daudé
2020-06-15 10:58 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-15 10:18 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: Alex Bennée, Joseph Myers
On 6/12/20 6:07 PM, Paolo Bonzini wrote:
> From: Joseph Myers <joseph@codesourcery.com>
>
> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
>
> That commit fixed a bug that showed up in four GCC tests with one libc
> implementation. The tests in question generate random inputs to the
> intrinsics and compare results to a C implementation, but they only
> test 1024 possible random inputs, and when the tests use the cases of
> those instructions that work with word rather than byte inputs, it's
> easy to have problematic cases that show up much less frequently than
> that. Thus, testing with a different libc implementation, and so a
> different random number generator, showed up a problem with the
> previous patch.
>
> When investigating the previous test failures, I found the description
> of these instructions in the Intel manuals (starting from computing a
> 16x16 or 8x8 set of comparison results) confusing and hard to match up
> with the more optimized implementation in QEMU, and referred to AMD
> manuals which described the instructions in a different way. Those
> AMD descriptions are very explicit that the whole of the string being
> searched for must be found in the other operand, not running off the
> end of that operand; they say "If the prototype and the SUT are equal
> in length, the two strings must be identical for the comparison to be
> TRUE.". However, that statement is incorrect.
>
> In my previous commit message, I noted:
>
> The operation in this case is a search for a string (argument d to
> the helper) in another string (argument s to the helper); if a copy
> of d at a particular position would run off the end of s, the
> resulting output bit should be 0 whether or not the strings match in
> the region where they overlap, but the QEMU implementation was
> wrongly comparing only up to the point where s ends and counting it
> as a match if an initial segment of d matched a terminal segment of
> s. Here, "run off the end of s" means that some byte of d would
> overlap some byte outside of s; thus, if d has zero length, it is
> considered to match everywhere, including after the end of s.
>
> The description "some byte of d would overlap some byte outside of s"
> is accurate only when understood to refer to overlapping some byte
> *within the 16-byte operand* but at or after the zero terminator; it
> is valid to run over the end of s if the end of s is the end of the
> 16-byte operand. So the fix in the previous patch for the case of d
> being empty was correct, but the other part of that patch was not
> correct (as it never allowed partial matches even at the end of the
> 16-byte operand). Nor was the code before the previous patch correct
> for the case of d nonempty, as it would always have allowed partial
> matches at the end of s.
>
> Fix with a partial revert of my previous change, combined with
> inserting a check for the special case of s having maximum length to
> determine where it is necessary to check for matches.
>
> In the added test, test 1 is for the case of empty strings, which
> failed before my 2017 patch, test 2 is for the bug introduced by my
> 2017 patch and test 3 deals with the case where a match of an initial
> segment at the end of the string is not valid when the string ends
> before the end of the 16-byte operand (that is, the case that would be
> broken by a simple revert of the non-empty-string part of my 2017
> patch).
>
> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/i386/ops_sse.h | 4 ++--
> tests/tcg/i386/Makefile.target | 3 +++
> tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
> 3 files changed, 38 insertions(+), 2 deletions(-)
> create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
>
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 01d6017412..14f2b16abd 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
> res = (2 << upper) - 1;
> break;
> }
> - for (j = valids - validd; j >= 0; j--) {
> + for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
> res <<= 1;
> v = 1;
> - for (i = validd; i >= 0; i--) {
> + for (i = MIN(valids - j, validd); i >= 0; i--) {
> v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
> }
> res |= v;
> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
> index 43ee2e181e..53efec0668 100644
> --- a/tests/tcg/i386/Makefile.target
> +++ b/tests/tcg/i386/Makefile.target
> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
> SKIP_I386_TESTS=test-i386-ssse3
> X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>
> +test-i386-pcmpistri: CFLAGS += -msse4.2
> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
This test fails on our CI:
https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246
> +
> #
> # hello-i386 is a barebones app
> #
> diff --git a/tests/tcg/i386/test-i386-pcmpistri.c b/tests/tcg/i386/test-i386-pcmpistri.c
> new file mode 100644
> index 0000000000..1e81ae611a
> --- /dev/null
> +++ b/tests/tcg/i386/test-i386-pcmpistri.c
> @@ -0,0 +1,33 @@
> +/* Test pcmpistri instruction. */
> +
> +#include <nmmintrin.h>
> +#include <stdio.h>
> +
> +union u {
> + __m128i x;
> + unsigned char uc[16];
> +};
> +
> +union u s0 = { .uc = { 0 } };
> +union u s1 = { .uc = "abcdefghijklmnop" };
> +union u s2 = { .uc = "bcdefghijklmnopa" };
> +union u s3 = { .uc = "bcdefghijklmnab" };
> +
> +int
> +main(void)
> +{
> + int ret = 0;
> + if (_mm_cmpistri(s0.x, s0.x, 0x4c) != 15) {
> + printf("FAIL: pcmpistri test 1\n");
> + ret = 1;
> + }
> + if (_mm_cmpistri(s1.x, s2.x, 0x4c) != 15) {
> + printf("FAIL: pcmpistri test 2\n");
> + ret = 1;
> + }
> + if (_mm_cmpistri(s1.x, s3.x, 0x4c) != 16) {
> + printf("FAIL: pcmpistri test 3\n");
> + ret = 1;
> + }
> + return ret;
> +}
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
2020-06-15 10:18 ` Philippe Mathieu-Daudé
@ 2020-06-15 10:58 ` Philippe Mathieu-Daudé
2020-06-15 13:43 ` Alex Bennée
0 siblings, 1 reply; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-15 10:58 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: Alex Bennée, Joseph Myers
On 6/15/20 12:18 PM, Philippe Mathieu-Daudé wrote:
> On 6/12/20 6:07 PM, Paolo Bonzini wrote:
>> From: Joseph Myers <joseph@codesourcery.com>
>>
>> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
>> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
>> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
>>
>> That commit fixed a bug that showed up in four GCC tests with one libc
>> implementation. The tests in question generate random inputs to the
>> intrinsics and compare results to a C implementation, but they only
>> test 1024 possible random inputs, and when the tests use the cases of
>> those instructions that work with word rather than byte inputs, it's
>> easy to have problematic cases that show up much less frequently than
>> that. Thus, testing with a different libc implementation, and so a
>> different random number generator, showed up a problem with the
>> previous patch.
>>
>> When investigating the previous test failures, I found the description
>> of these instructions in the Intel manuals (starting from computing a
>> 16x16 or 8x8 set of comparison results) confusing and hard to match up
>> with the more optimized implementation in QEMU, and referred to AMD
>> manuals which described the instructions in a different way. Those
>> AMD descriptions are very explicit that the whole of the string being
>> searched for must be found in the other operand, not running off the
>> end of that operand; they say "If the prototype and the SUT are equal
>> in length, the two strings must be identical for the comparison to be
>> TRUE.". However, that statement is incorrect.
>>
>> In my previous commit message, I noted:
>>
>> The operation in this case is a search for a string (argument d to
>> the helper) in another string (argument s to the helper); if a copy
>> of d at a particular position would run off the end of s, the
>> resulting output bit should be 0 whether or not the strings match in
>> the region where they overlap, but the QEMU implementation was
>> wrongly comparing only up to the point where s ends and counting it
>> as a match if an initial segment of d matched a terminal segment of
>> s. Here, "run off the end of s" means that some byte of d would
>> overlap some byte outside of s; thus, if d has zero length, it is
>> considered to match everywhere, including after the end of s.
>>
>> The description "some byte of d would overlap some byte outside of s"
>> is accurate only when understood to refer to overlapping some byte
>> *within the 16-byte operand* but at or after the zero terminator; it
>> is valid to run over the end of s if the end of s is the end of the
>> 16-byte operand. So the fix in the previous patch for the case of d
>> being empty was correct, but the other part of that patch was not
>> correct (as it never allowed partial matches even at the end of the
>> 16-byte operand). Nor was the code before the previous patch correct
>> for the case of d nonempty, as it would always have allowed partial
>> matches at the end of s.
>>
>> Fix with a partial revert of my previous change, combined with
>> inserting a check for the special case of s having maximum length to
>> determine where it is necessary to check for matches.
>>
>> In the added test, test 1 is for the case of empty strings, which
>> failed before my 2017 patch, test 2 is for the bug introduced by my
>> 2017 patch and test 3 deals with the case where a match of an initial
>> segment at the end of the string is not valid when the string ends
>> before the end of the 16-byte operand (that is, the case that would be
>> broken by a simple revert of the non-empty-string part of my 2017
>> patch).
>>
>> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
>> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> target/i386/ops_sse.h | 4 ++--
>> tests/tcg/i386/Makefile.target | 3 +++
>> tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
>> 3 files changed, 38 insertions(+), 2 deletions(-)
>> create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
>>
>> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
>> index 01d6017412..14f2b16abd 100644
>> --- a/target/i386/ops_sse.h
>> +++ b/target/i386/ops_sse.h
>> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>> res = (2 << upper) - 1;
>> break;
>> }
>> - for (j = valids - validd; j >= 0; j--) {
>> + for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
>> res <<= 1;
>> v = 1;
>> - for (i = validd; i >= 0; i--) {
>> + for (i = MIN(valids - j, validd); i >= 0; i--) {
>> v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>> }
>> res |= v;
>> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
>> index 43ee2e181e..53efec0668 100644
>> --- a/tests/tcg/i386/Makefile.target
>> +++ b/tests/tcg/i386/Makefile.target
>> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
>> SKIP_I386_TESTS=test-i386-ssse3
>> X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>>
>> +test-i386-pcmpistri: CFLAGS += -msse4.2
>> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
>
> This test fails on our CI:
> https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246
FYI Paolo's analysis from 'make V=1' output
https://api.travis-ci.org/v3/job/698459904/log.txt:
timeout 60
/home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -cpu max
test-i386-pcmpistri > test-i386-pcmpistri.out
timeout 60
/home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -plugin
../../plugin/libbb.so -d plugin -D
test-i386-pcmpistri-with-libbb.so.pout test-i386-pcmpistri >
run-plugin-test-i386-pcmpistri-with-libbb.so.out
"incorrect qemu invocation, missing -cpu max in the second".
>
>> +
>> #
>> # hello-i386 is a barebones app
>> #
>> diff --git a/tests/tcg/i386/test-i386-pcmpistri.c b/tests/tcg/i386/test-i386-pcmpistri.c
>> new file mode 100644
>> index 0000000000..1e81ae611a
>> --- /dev/null
>> +++ b/tests/tcg/i386/test-i386-pcmpistri.c
>> @@ -0,0 +1,33 @@
>> +/* Test pcmpistri instruction. */
>> +
>> +#include <nmmintrin.h>
>> +#include <stdio.h>
>> +
>> +union u {
>> + __m128i x;
>> + unsigned char uc[16];
>> +};
>> +
>> +union u s0 = { .uc = { 0 } };
>> +union u s1 = { .uc = "abcdefghijklmnop" };
>> +union u s2 = { .uc = "bcdefghijklmnopa" };
>> +union u s3 = { .uc = "bcdefghijklmnab" };
>> +
>> +int
>> +main(void)
>> +{
>> + int ret = 0;
>> + if (_mm_cmpistri(s0.x, s0.x, 0x4c) != 15) {
>> + printf("FAIL: pcmpistri test 1\n");
>> + ret = 1;
>> + }
>> + if (_mm_cmpistri(s1.x, s2.x, 0x4c) != 15) {
>> + printf("FAIL: pcmpistri test 2\n");
>> + ret = 1;
>> + }
>> + if (_mm_cmpistri(s1.x, s3.x, 0x4c) != 16) {
>> + printf("FAIL: pcmpistri test 3\n");
>> + ret = 1;
>> + }
>> + return ret;
>> +}
>>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search
2020-06-15 10:58 ` Philippe Mathieu-Daudé
@ 2020-06-15 13:43 ` Alex Bennée
0 siblings, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2020-06-15 13:43 UTC (permalink / raw)
To: Philippe Mathieu-Daudé; +Cc: Paolo Bonzini, qemu-devel, Joseph Myers
Philippe Mathieu-Daudé <philmd@redhat.com> writes:
> On 6/15/20 12:18 PM, Philippe Mathieu-Daudé wrote:
>> On 6/12/20 6:07 PM, Paolo Bonzini wrote:
>>> From: Joseph Myers <joseph@codesourcery.com>
>>>
>>> This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
>>> / pcmpestrm / pcmpistri / pcmpistrm substring search, commit
>>> ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
>>>
>>> That commit fixed a bug that showed up in four GCC tests with one libc
>>> implementation. The tests in question generate random inputs to the
>>> intrinsics and compare results to a C implementation, but they only
>>> test 1024 possible random inputs, and when the tests use the cases of
>>> those instructions that work with word rather than byte inputs, it's
>>> easy to have problematic cases that show up much less frequently than
>>> that. Thus, testing with a different libc implementation, and so a
>>> different random number generator, showed up a problem with the
>>> previous patch.
>>>
>>> When investigating the previous test failures, I found the description
>>> of these instructions in the Intel manuals (starting from computing a
>>> 16x16 or 8x8 set of comparison results) confusing and hard to match up
>>> with the more optimized implementation in QEMU, and referred to AMD
>>> manuals which described the instructions in a different way. Those
>>> AMD descriptions are very explicit that the whole of the string being
>>> searched for must be found in the other operand, not running off the
>>> end of that operand; they say "If the prototype and the SUT are equal
>>> in length, the two strings must be identical for the comparison to be
>>> TRUE.". However, that statement is incorrect.
>>>
>>> In my previous commit message, I noted:
>>>
>>> The operation in this case is a search for a string (argument d to
>>> the helper) in another string (argument s to the helper); if a copy
>>> of d at a particular position would run off the end of s, the
>>> resulting output bit should be 0 whether or not the strings match in
>>> the region where they overlap, but the QEMU implementation was
>>> wrongly comparing only up to the point where s ends and counting it
>>> as a match if an initial segment of d matched a terminal segment of
>>> s. Here, "run off the end of s" means that some byte of d would
>>> overlap some byte outside of s; thus, if d has zero length, it is
>>> considered to match everywhere, including after the end of s.
>>>
>>> The description "some byte of d would overlap some byte outside of s"
>>> is accurate only when understood to refer to overlapping some byte
>>> *within the 16-byte operand* but at or after the zero terminator; it
>>> is valid to run over the end of s if the end of s is the end of the
>>> 16-byte operand. So the fix in the previous patch for the case of d
>>> being empty was correct, but the other part of that patch was not
>>> correct (as it never allowed partial matches even at the end of the
>>> 16-byte operand). Nor was the code before the previous patch correct
>>> for the case of d nonempty, as it would always have allowed partial
>>> matches at the end of s.
>>>
>>> Fix with a partial revert of my previous change, combined with
>>> inserting a check for the special case of s having maximum length to
>>> determine where it is necessary to check for matches.
>>>
>>> In the added test, test 1 is for the case of empty strings, which
>>> failed before my 2017 patch, test 2 is for the bug introduced by my
>>> 2017 patch and test 3 deals with the case where a match of an initial
>>> segment at the end of the string is not valid when the string ends
>>> before the end of the 16-byte operand (that is, the case that would be
>>> broken by a simple revert of the non-empty-string part of my 2017
>>> patch).
>>>
>>> Signed-off-by: Joseph Myers <joseph@codesourcery.com>
>>> Message-Id: <alpine.DEB.2.21.2006121344290.9881@digraph.polyomino.org.uk>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>> target/i386/ops_sse.h | 4 ++--
>>> tests/tcg/i386/Makefile.target | 3 +++
>>> tests/tcg/i386/test-i386-pcmpistri.c | 33 ++++++++++++++++++++++++++++
>>> 3 files changed, 38 insertions(+), 2 deletions(-)
>>> create mode 100644 tests/tcg/i386/test-i386-pcmpistri.c
>>>
>>> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
>>> index 01d6017412..14f2b16abd 100644
>>> --- a/target/i386/ops_sse.h
>>> +++ b/target/i386/ops_sse.h
>>> @@ -2089,10 +2089,10 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg *d, Reg *s,
>>> res = (2 << upper) - 1;
>>> break;
>>> }
>>> - for (j = valids - validd; j >= 0; j--) {
>>> + for (j = valids == upper ? valids : valids - validd; j >= 0; j--) {
>>> res <<= 1;
>>> v = 1;
>>> - for (i = validd; i >= 0; i--) {
>>> + for (i = MIN(valids - j, validd); i >= 0; i--) {
>>> v &= (pcmp_val(s, ctrl, i + j) == pcmp_val(d, ctrl, i));
>>> }
>>> res |= v;
>>> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
>>> index 43ee2e181e..53efec0668 100644
>>> --- a/tests/tcg/i386/Makefile.target
>>> +++ b/tests/tcg/i386/Makefile.target
>>> @@ -10,6 +10,9 @@ ALL_X86_TESTS=$(I386_SRCS:.c=)
>>> SKIP_I386_TESTS=test-i386-ssse3
>>> X86_64_TESTS:=$(filter test-i386-ssse3, $(ALL_X86_TESTS))
>>>
>>> +test-i386-pcmpistri: CFLAGS += -msse4.2
>>> +run-test-i386-pcmpistri: QEMU_OPTS += -cpu max
>>
>> This test fails on our CI:
>> https://travis-ci.org/github/qemu/qemu/jobs/698006621#L4246
>
> FYI Paolo's analysis from 'make V=1' output
> https://api.travis-ci.org/v3/job/698459904/log.txt:
>
> timeout 60
> /home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -cpu max
> test-i386-pcmpistri > test-i386-pcmpistri.out
>
> timeout 60
> /home/travis/build/philmd/qemu/build/i386-linux-user/qemu-i386 -plugin
> ../../plugin/libbb.so -d plugin -D
> test-i386-pcmpistri-with-libbb.so.pout test-i386-pcmpistri >
> run-plugin-test-i386-pcmpistri-with-libbb.so.out
>
> "incorrect qemu invocation, missing -cpu max in the second".
Just testing some patches now.
--
Alex Bennée
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-06-15 13:44 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-12 16:07 [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Paolo Bonzini
2020-06-12 16:07 ` [PULL 082/116] target/i386: correct fix for pcmpxstrx substring search Paolo Bonzini
2020-06-15 10:18 ` Philippe Mathieu-Daudé
2020-06-15 10:58 ` Philippe Mathieu-Daudé
2020-06-15 13:43 ` Alex Bennée
2020-06-12 16:07 ` [PULL 088/116] i386: hvf: Drop useless declarations in sysemu Paolo Bonzini
2020-06-12 16:07 ` [PULL 097/116] i386: hvf: Move lazy_flags into CPUX86State Paolo Bonzini
2020-06-12 16:07 ` [PULL 098/116] i386: hvf: Move mmio_buf " Paolo Bonzini
2020-06-13 16:17 ` [PULL v2 000/116] Huge miscellaneous pull request for 2020-06-11 Peter Maydell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).