From: Max Reitz <mreitz@redhat.com>
To: "Lukáš Doktor" <ldoktor@redhat.com>,
	"QEMU Developers" <qemu-devel@nongnu.org>,
	qemu-block@nongnu.org,
	"Anton Nefedov" <anton.nefedov@virtuozzo.com>,
	"Andrew Jones" <drjones@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>
Subject: Re: [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas [v2]
Date: Thu, 22 Aug 2019 18:32:59 +0200
Message-ID: <1ac19336-e9f4-753c-ebd2-41156152eb9a@redhat.com>
In-Reply-To: <9b09731a-1769-2cbe-4b6f-3d7787f74ebc@redhat.com>

On 22.08.19 17:40, Max Reitz wrote:
> On 22.08.19 17:25, Max Reitz wrote:
>> On 22.08.19 14:09, Max Reitz wrote:
>>> (CC-ing Paolo because of the XFS connection, and Stefan because why not.)
>>>
>>> On 22.08.19 13:27, Lukáš Doktor wrote:
>>>> On 21. 08. 19 at 19:51, Max Reitz wrote:
>>>>> On 21.08.19 16:14, Lukáš Doktor wrote:
>>>>>> Hello guys,
>>>>>>
>>>>>> First attempt was rejected due to zip attachment, let's try it again with just Avocado-vt debug.log and serial console log files attached.
>>>>>>
>>>>>> I bisected a regression on aarch64 all the way to this commit: "qcow2: skip writing zero buffers to empty COW areas" c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a. Would you please have a look at it?
>>>>>
>>>>> I think I can see the issue on my x64 system (I don’t see the XFS
>>>>> corruption, but the installation fails because of some segfaults).
>>>>>
>>>>> I haven’t found a simpler way to reproduce the problem yet, though,
>>>>> which is a pain... :-/
>>>>>
>>>>> It looks like the problem disappears when I configure qemu with
>>>>> “--disable-xfsctl”.  Can you try that?
>>>>>
>>>>> Max
>>>>>
>>>>
>>>> Hello Max,
>>>>
>>>> yes, I'm getting the same behavior. With "--disable-xfsctl" it works well. Looking at the option, I also understand why it only failed on aarch64 for me: I don't have the libs installed on the other machines, therefore it was disabled by "./configure" there. Anyway, I guess disabling it in my builds won't really fix the issue, right? :-)
>>>
>>> Thanks!
>>>
>>> No, it won’t, but it means the actual root of the problem is probably
>>> rather in some XFS-related code (be it because qemu uses it the wrong
>>> way or because of XFS kernel code) than in the pure qcow2 commit that
>>> made the problem surface by exercising it heavily.  (Or in an
>>> interaction between the two.)
>>
>> OK, I got a simpler reproducer now:
>>
>> $ ./qemu-img create -f qcow2 test.qcow2 1M
>> $ (for i in $(seq 15 -1 0); do \
>>        echo "aio_write -P 42 $((i * 64 + 1))k 62k"; \
>>    done) \
>>   | ./qemu-io test.qcow2
>> $ for i in $(seq 0 15); do \
>>       echo $i; \
>>       ofs=$((i * 64)); \
>>       ./qemu-io -c "read -P 0 ${ofs}k 1k" \
>>                 -c "read -P 42 $((ofs + 1))k 62k" \
>>                 -c "read -P 0 $((ofs + 63))k 1k" \
>>                 test.qcow2 \
>>           | grep 'verification'; \
>>   done
>>
>> On XFS with --enable-xfsctl, this basically always gives me some
>> verification failure somewhere.  (On tmpfs or with --disable-xfsctl, it
>> never fails.)
>>
>> So it seems to be related to I/O from back to front.
>>
>> (You can also reproduce it with a plain “qemu-img bench” invocation,
>> like “./qemu-img bench -w --pattern=42 -o 1k -S 64k -s 62k test.qcow2”
>> (on, say, a 4 GB image), but then the failure appears much later in the
>> image, because you have to wait for some requests to come in reverse
>> (by chance) first.)
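
To make the back-to-front order of the reproducer concrete, here is a small sketch that prints the ranges its aio_write loop submits, in submission order. It assumes qcow2's default 64k cluster size (the reproducer does not set one explicitly), so each write leaves the first and last 1k of its cluster untouched; those are the areas the read loop expects to be zero. The function name print_layout is made up for illustration.

```shell
# Print the ranges written by the reproducer's aio_write loop,
# in the order they are submitted (highest offset first).
# Assumption: 64k clusters (the qcow2 default); names are illustrative.
print_layout() {
    for i in $(seq 15 -1 0); do
        start=$((i * 64 + 1))   # the write begins 1k into cluster i
        end=$((start + 62))     # and is 62k long
        echo "cluster $i: data at ${start}k..${end}k"
    done
}
print_layout
```

The first line printed is for cluster 15 and the last for cluster 0, i.e. every request lands below the one submitted before it.
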
> 
> The problem is the ftruncate() in xfs_write_zeroes().  It is possible
> for it to yield, then other requests come in, and the data they write
> may get discarded once the ftruncate() settles.
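
The effect can be illustrated outside of qemu with a deliberately simplified, sequential sketch. It is not the actual code path: it assumes the data is lost because the file length is set from a value that was chosen before the other write landed, which is only one possible reading of the description above. The function name and the offsets are made up, and it needs mktemp, dd, wc and truncate from GNU coreutils.

```shell
# Simplified, sequential illustration; NOT the actual qemu code path.
# Assumption: the length change uses a target chosen before a later
# write extended the file. All names and offsets are hypothetical.
demo_stale_truncate() {
    f=$(mktemp)
    target=$((64 * 1024))   # length decided on before the "yield"
    # Meanwhile, another request writes 4 bytes at offset 128k.
    printf 'data' | dd of="$f" bs=1 seek=$((128 * 1024)) conv=notrunc 2>/dev/null
    before=$(($(wc -c < "$f")))
    truncate -s "$target" "$f"   # the delayed length change cuts the file
    after=$(($(wc -c < "$f")))
    rm -f "$f"
    echo "before=$before after=$after"
}
demo_stale_truncate
```

The file shrinks from 131076 to 65536 bytes, so the four bytes written at 128k are gone.
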

I’ve just sent a patch: “block/file-posix: Fix xfs_write_zeroes()”,
Message-ID <20190822162618.27670-1-mreitz@redhat.com>:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg01148.html

Max