From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: [Qemu-devel] [PULL 1/2] queue: fix QSLIST_INSERT_HEAD_ATOMIC race
Date: Thu, 12 Mar 2015 19:10:32 +0000 [thread overview]
Message-ID: <1426187433-29136-2-git-send-email-stefanha@redhat.com> (raw)
In-Reply-To: <1426187433-29136-1-git-send-email-stefanha@redhat.com>
From: Paolo Bonzini <pbonzini@redhat.com>
There is a not-so-subtle race in QSLIST_INSERT_HEAD_ATOMIC.
Because atomic_cmpxchg returns the old value instead of a success flag,
QSLIST_INSERT_HEAD_ATOMIC was checking for success by comparing against
the second argument to atomic_cmpxchg. Unfortunately, this only works
if the second argument is a local or thread-local variable.
If it is in memory, it can be subject to common subexpression elimination
(and then everything's fine) or reloaded after the atomic_cmpxchg,
depending on the compiler's whims. If the latter happens, the race can
happen. A thread can sneak in, doing something on elm->field.sle_next
after the atomic_cmpxchg and before the comparison. This causes a wrong
failure, and then two threads are using "elm" at the same time. In the
case discovered by Christian, the sequence was likely something like this:
thread 1 | thread 2
QSLIST_INSERT_HEAD_ATOMIC |
atomic_cmpxchg succeeds |
elm added to list |
| steal release_pool
| QSLIST_REMOVE_HEAD
| elm removed from list
| ...
| QSLIST_INSERT_HEAD_ATOMIC
| (overwrites sle_next)
spurious failure |
atomic_cmpxchg succeeds |
elm added to list again |
|
steal release_pool |
QSLIST_REMOVE_HEAD |
elm removed again |
The last three steps could be done by a third thread as well.
A reproducer that failed in a matter of seconds is as follows:
- the guest has 32 VCPUs on a 28 core host (hyperthreading was enabled),
memory was 16G just to err on the safe side (the host has 64G, but hey
at least you need no s390)
- the guest has 24 null-aio virtio-blk devices using dataplane
(-object iothread,id=ioN -drive if=none,id=blkN,driver=null-aio,size=500G
-device virtio-blk-pci,iothread=ioN,drive=blkN)
- the guest also has a single network interface. It's only doing loopback
tests so slirp vs. tap and the model doesn't matter.
- the guest is running fio with the following script:
[global]
rw=randread
blocksize=16k
ioengine=libaio
runtime=10m
buffered=0
fallocate=none
time_based
iodepth=32
[virtio1a]
filename=/dev/block/252\:16
[virtio1b]
filename=/dev/block/252\:16
...
[virtio24a]
filename=/dev/block/252\:384
[virtio24b]
filename=/dev/block/252\:384
[listen1]
protocol=tcp
ioengine=net
port=12345
listen
rw=read
bs=4k
size=1000g
[connect1]
protocol=tcp
hostname=localhost
ioengine=net
port=12345
protocol=tcp
rw=write
startdelay=1
size=1000g
...
[listen8]
protocol=tcp
ioengine=net
port=12352
listen
rw=read
bs=4k
size=1000g
[connect8]
protocol=tcp
hostname=localhost
ioengine=net
port=12352
rw=write
startdelay=1
size=1000g
Moral of the story: I should refrain from writing more clever stuff.
At least it looks like it is not too clever to be undebuggable.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1426002357-6889-1-git-send-email-pbonzini@redhat.com
Fixes: c740ad92d0d958fa785e5d7aa1b67ecaf30a6a54
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/qemu/queue.h | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 8094150..f781aa2 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -197,11 +197,12 @@ struct { \
(head)->slh_first = (elm); \
} while (/*CONSTCOND*/0)
-#define QSLIST_INSERT_HEAD_ATOMIC(head, elm, field) do { \
- do { \
- (elm)->field.sle_next = (head)->slh_first; \
- } while (atomic_cmpxchg(&(head)->slh_first, (elm)->field.sle_next, \
- (elm)) != (elm)->field.sle_next); \
+#define QSLIST_INSERT_HEAD_ATOMIC(head, elm, field) do { \
+ typeof(elm) save_sle_next; \
+ do { \
+ save_sle_next = (elm)->field.sle_next = (head)->slh_first; \
+ } while (atomic_cmpxchg(&(head)->slh_first, save_sle_next, (elm)) != \
+ save_sle_next); \
} while (/*CONSTCOND*/0)
#define QSLIST_MOVE_ATOMIC(dest, src) do { \
--
2.1.0
next prev parent reply other threads:[~2015-03-12 19:10 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-12 19:10 [Qemu-devel] [PULL 0/2] Block patches Stefan Hajnoczi
2015-03-12 19:10 ` Stefan Hajnoczi [this message]
2015-03-12 19:10 ` [Qemu-devel] [PULL 2/2] qcow2: fix the macro QCOW_MAX_L1_SIZE's use Stefan Hajnoczi
2015-03-13 11:00 ` [Qemu-devel] [PULL 0/2] Block patches Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1426187433-29136-2-git-send-email-stefanha@redhat.com \
--to=stefanha@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.