From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([209.51.188.92]:35996) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gxBSO-0008Uq-I0 for qemu-devel@nongnu.org; Fri, 22 Feb 2019 09:05:41 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gxBSN-0005ne-EC for qemu-devel@nongnu.org; Fri, 22 Feb 2019 09:05:40 -0500 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:34814) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gxBSN-0003w2-2p for qemu-devel@nongnu.org; Fri, 22 Feb 2019 09:05:39 -0500 Received: by mail-wr1-x429.google.com with SMTP id f14so2530295wrg.1 for ; Fri, 22 Feb 2019 06:04:33 -0800 (PST) MIME-Version: 1.0 References: <20190221111147.GB3282@stefanha-x1.localdomain> <28e6b4ed-9afd-3a79-6267-86c7385c23ce@redhat.com> In-Reply-To: From: Stefan Hajnoczi Date: Fri, 22 Feb 2019 14:04:20 +0000 Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?UTF-8?Q?Fernando_Casas_Sch=C3=B6ssow?= Cc: Paolo Bonzini , Natanael Copa , Richard Henderson , qemu-devel On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Sch=C3=B6ssow wrote: I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. Fernando: Can you confirm that the bug occurs with an unmodified Alpine Linux qemu binary? Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add host endian unaligned access functions") introduced the unaligned memory access functions in question here. Please see below for details on the bug - basically QEMU code assumes they are atomic, but that is not guaranteed :(. Any ideas for how to fix this? Natanael: It seems likely that the qemu package in Alpine Linux suffers from a compilation issue resulting in a broken QEMU. It may be necessary to leave the compiler optimization flag alone in APKBUILD to work around this problem. Here are the details. QEMU relies on the compiler turning memcpy(&dst, &src, 2) turning into a load instruction in include/qemu/bswap.h:lduw_he_p() (explanation below): /* Any compiler worth its salt will turn these memcpy into native unaligned operations. Thus we don't need to play games with packed attributes, or inline byte-by-byte stores. */ static inline int lduw_he_p(const void *ptr) { uint16_t r; memcpy(&r, ptr, sizeof(r)); return r; } Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that shows the load instruction: 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax 39816a: 66 89 43 32 mov %ax,0x32(%rbx) Here is the instruction sequence in the Alpine Linux binary: 455562: ba 02 00 00 00 mov $0x2,%edx 455567: e8 74 24 f3 ff callq 3879e0 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax 455571: 66 41 89 47 32 mov %ax,0x32(%r15) It's calling memcpy instead of using a load instruction. Fernando found that QEMU's virtqueue_pop() function sees bogus values when loading a 16-bit guest RAM location. Paolo figured out that the bogus value can be produced by memcpy() when another thread is updating the 16-bit memory location simultaneously. This is a race condition between one thread loading the 16-bit value and another thread storing it (in this case a guest vcpu thread). Sometimes memcpy() may load one old byte and one new byte, resulting in a bogus value. The symptom that Fernando experienced is a "Virtqueue size exceeded" error message from QEMU and then the virtio-blk or virtio-scsi device stops working. This issue potentially affects other device emulation code in QEMU as well, not just virtio devices. For the time being, I suggest tweaking the APKBUILD so the memcpy() is not generated. Hopefully QEMU can make the code more portable in the future so the compiler always does the expected thing, but this may not be easily possible. Stefan