On 10.10.19 18:15, Anton Nefedov wrote:
> On 10/10/2019 6:17 PM, Max Reitz wrote:
>> Hi everyone,
>>
>> (CCs just based on tags in the commit in question)
>>
>> I have two bug reports which claim problems of qcow2 on XFS on ppc64le
>> machines since qemu 4.1.0. One of those is about bad performance
>> (sorry, it isn’t public :-/), the other about data corruption
>> (https://bugzilla.redhat.com/show_bug.cgi?id=1751934).
>>
>> It looks like in both cases reverting c8bb23cbdbe3 solves the problem
>> (which optimized COW of unallocated areas).
>>
>> I think I’ve looked at every angle but can’t find what could be wrong
>> with it. Do any of you have any idea? :-/
>>
>
> hi,
>
> oh, that patch strikes again..
>
> I don't quite follow, was this bug confirmed to happen on x86? Comment 8
> (https://bugzilla.redhat.com/show_bug.cgi?id=1751934#c8) mentioned that
> (or was that mixed up with the old xfsctl bug?)

I think that was mixed up with the xfsctl bug, yes.

> Regardless of the platform, does it reproduce? That's comforting
> already; worst case we can trace each and every request then (unless it
> will stop to reproduce this way).

I haven’t been able to reproduce it yet (wrestling with the test system
and getting ppc64 machines provisioned), but as far as I know it
reproduces reliably on ppc64, but only there.

> Also, perhaps it's worth to try to replace fallocate with write(0)?
> Either in qcow2 (in the patch, bdrv_co_pwrite_zeroes -> bdrv_co_pwritev)
> or in the file driver. It might hint whether it's misbehaving fallocate
> (in qemu or in kernel) or something else.

Good idea, that should at least tell us something about the corruption.

Max
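
For illustration only, the write(0) idea can be tried outside qemu with a
small standalone sketch. This is not qemu code and not the proposed patch:
the helper names (zero_range_by_write, range_is_zero, demo) and the 64 KiB
buffer size are assumptions made up for this example. It zeroes a byte
range with explicit writes rather than fallocate, then reads the range
back to check it.

```python
import os
import tempfile

CHUNK = 64 * 1024  # size of the reusable zero buffer (arbitrary choice)


def zero_range_by_write(fd, offset, length):
    """Zero [offset, offset + length) with explicit writes, no fallocate."""
    zeroes = bytes(CHUNK)
    done = 0
    while done < length:
        n = min(CHUNK, length - done)
        # os.pwrite may write fewer bytes than requested, so advance by
        # the count it reports rather than by n.
        done += os.pwrite(fd, zeroes[:n], offset + done)


def range_is_zero(fd, offset, length):
    """Read the range back and report whether every byte is zero."""
    data = os.pread(fd, length, offset)
    return len(data) == length and not any(data)


def demo():
    """Fill 4 KiB with 0xaa, zero the middle 2 KiB, check the result."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.pwrite(fd, b"\xaa" * 4096, 0)
        zero_range_by_write(fd, 1024, 2048)
        return (
            range_is_zero(fd, 1024, 2048),
            os.pread(fd, 1, 1023),  # last byte before the zeroed range
            os.pread(fd, 1, 3072),  # first byte after the zeroed range
        )
```

Running the same check on an XFS file on the affected ppc64le host, once
with explicit writes and once with the fallocate path, might help separate
a misbehaving fallocate from a problem elsewhere.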