* migration: broken snapshot saves appear on s390 when small fields in migration stream removed
@ 2020-07-12 10:00 Claudio Fontana
  2020-07-12 16:11 ` Paolo Bonzini
  2020-07-13 11:03 ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 25+ messages in thread
From: Claudio Fontana @ 2020-07-12 10:00 UTC (permalink / raw)
  To: Juan Quintela, Dr. David Alan Gilbert
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi, Paolo Bonzini

Hello all,

during unrelated work on splitting QTest from the TCG instruction counting module,

I encountered what seems to be a migration stream issue, which is apparent only on s390, and only shows up in block test 267.

./check -qcow2 267

when it comes to snapshot save and load using a backing file.

Here is a minimal reproducer patch that causes the issue on s390 only.

--------------------------------------------cut-------------------------------------------
diff --git a/cpus.c b/cpus.c
index 41d1c5099f..443b88697a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
 
 static bool icount_state_needed(void *opaque)
 {
-    return use_icount;
+    return 0;
 }
 
 static bool warp_timer_state_needed(void *opaque)
--------------------------------------------cut-------------------------------------------

config.status configure line:
exec '/home/cfontana/qemu-build/../qemu/configure' '--enable-tcg' '--disable-kvm' '--disable-hax' '--target-list=s390x-softmmu' '--enable-debug'

$ make check-block

TEST    iotest-qcow2: 267 [fail]
QEMU          -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x" -nodefaults -display none -accel qtest
QEMU_IMG      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-img" 
QEMU_IO       -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
QEMU_NBD      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-nbd" 
IMGFMT        -- qcow2 (compat=1.1)
IMGPROTO      -- file
PLATFORM      -- Linux/s390x s390zp13 5.3.18-21-default
TEST_DIR      -- /home/cfontana/qemu-build/tests/qemu-iotests/scratch
SOCK_DIR      -- /tmp/tmp.bLJcJVtzk5
SOCKET_SCM_HELPER -- /home/cfontana/qemu-build/tests/qemu-iotests/socket_scm_helper

--- /home/cfontana/qemu/tests/qemu-iotests/267.out	2020-07-12 05:10:07.948262675 -0400
+++ /home/cfontana/qemu-build/tests/qemu-iotests/267.out.bad	2020-07-12 05:27:03.358362781 -0400
@@ -137,6 +137,9 @@
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 --        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
 (qemu) loadvm snap0
+Unexpected storage key flag data: 0
+error while loading state for instance 0x0 of device 's390-skeys'
+Error: Error -22 while loading VM state
 (qemu) quit
 
 
-----------


Not run: 172 186 192 259 287
Failures: 267
Failed 1 of 115 iotests
make: *** [/home/cfontana/qemu/tests/Makefile.include:880: check-tests/check-block.sh] Error 1

-----------

Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.


Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
which includes a buffer of 32768 bytes of keydata in the stream.

The code (hw/s390x/s390-skeys.c)
is modeled similarly to the RAM transfer code (migration/ram.c), with an EOS (end of stream) marker.

Contrary to the RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).

----------
Observation: migration/qemu-file.c defines an IO_BUF_SIZE of 32768.
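
To see why that is a suspicious coincidence, here is a minimal, simplified model of this kind of buffering (an illustration only, not the actual migration/qemu-file.c code): a payload of exactly IO_BUF_SIZE bytes arriving in a partially filled buffer forces a flush mid-payload and leaves the buffer partially filled again, so a marker written right after the payload stays buffered until some later flush.

----------------------cut--------------------------
/* Simplified model of IO_BUF_SIZE-style buffered writes; illustration
 * only, not the actual QEMUFile implementation. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define IO_BUF_SIZE 32768

typedef struct {
    uint8_t buf[IO_BUF_SIZE];
    size_t buf_index;
} SketchFile;

static void sketch_fflush(SketchFile *f)
{
    if (f->buf_index) {
        printf("flushing %zu bytes\n", f->buf_index);
        f->buf_index = 0;
    }
}

static void sketch_put_buffer(SketchFile *f, const uint8_t *p, size_t size)
{
    while (size > 0) {
        size_t n = IO_BUF_SIZE - f->buf_index;

        if (n > size) {
            n = size;
        }
        memcpy(f->buf + f->buf_index, p, n);
        f->buf_index += n;
        p += n;
        size -= n;
        if (f->buf_index == IO_BUF_SIZE) {
            sketch_fflush(f); /* a full buffer triggers a flush mid-payload */
        }
    }
}

int main(void)
{
    static uint8_t keydata[IO_BUF_SIZE]; /* 32768 bytes, like the skeys payload */
    SketchFile f = { .buf_index = 331 }; /* pretend earlier sections left 331 bytes */

    sketch_put_buffer(&f, keydata, sizeof(keydata));
    /* exactly 331 bytes of keydata remain buffered here; an EOS marker
     * written now stays in the buffer until the next flush */
    return 0;
}
----------------------cut--------------------------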

--

The following workarounds hide the problem (make the test pass):

1) always including the icount field in the (unrelated) timers state that is sent earlier in the migration stream (i.e. not applying the reproducer patch).

2) increasing the IO_BUF_SIZE also hides the problem:

----------------------cut--------------------------
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index be21518c57..f81d1272eb 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -30,7 +30,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 
-#define IO_BUF_SIZE 32768
+#define IO_BUF_SIZE 65536
 #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
 
 struct QEMUFile {
----------------------cut--------------------------

3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:

----------------------cut--------------------------
diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 1e036cc602..47c9a015af 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
     .class_size    = sizeof(S390SKeysClass),
 };
 
+extern void qemu_fflush(QEMUFile *f);
+
 static void s390_storage_keys_save(QEMUFile *f, void *opaque)
 {
     S390SKeysState *ss = S390_SKEYS(opaque);
@@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
     g_free(buf);
 end_stream:
     qemu_put_be64(f, eos);
+    qemu_fflush(f);
 }
 
 static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
----------------------cut--------------------------

Do any of you with a better understanding of migration/, block and s390 have a suggestion on what the issue could be here,
and what the next step in the investigation could be?

Does the fact that migration/ram.c always does an fflush after writing the EOS have any relevance here? Why does it do it,
and should the s390 code follow the same pattern?

Thanks,

Claudio


-- 
Claudio Fontana
Engineering Manager Virtualization, SUSE Labs Core

SUSE Software Solutions Italy Srl



* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-12 10:00 migration: broken snapshot saves appear on s390 when small fields in migration stream removed Claudio Fontana
@ 2020-07-12 16:11 ` Paolo Bonzini
  2020-07-13  9:11   ` Claudio Fontana
  2020-07-13 11:03 ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2020-07-12 16:11 UTC (permalink / raw)
  To: Claudio Fontana, Juan Quintela, Dr. David Alan Gilbert
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

On 12/07/20 12:00, Claudio Fontana wrote:
> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
> 
> 
> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
> which includes a buffer of 32768 bytes of keydata in the stream.
> 
> The code (hw/s390x/s390-skeys.c)
> is modeled similarly to the RAM transfer code (migration/ram.c), with an EOS (end of stream) marker.
> 
> Contrary to the RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).

1) Are there unexpected differences in the migration stream?  That is,
you could modify qcow2.c to fopen/fwrite/fclose the bytes as they're
written and read, and see if something does not match.

2) If it matches, are there unexpected differences other than the lack
of icount section when you apply the reproducer patch?

The fflush part makes me put more hope in the first, but both could help
you debug it.

Thanks,

Paolo




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-12 16:11 ` Paolo Bonzini
@ 2020-07-13  9:11   ` Claudio Fontana
  2020-07-14 14:29     ` Claudio Fontana
  0 siblings, 1 reply; 25+ messages in thread
From: Claudio Fontana @ 2020-07-13  9:11 UTC (permalink / raw)
  To: Paolo Bonzini, Juan Quintela, Dr. David Alan Gilbert
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

Hi Paolo,

On 7/12/20 6:11 PM, Paolo Bonzini wrote:
> On 12/07/20 12:00, Claudio Fontana wrote:
>> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
>>
>>
>> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
>> which includes a buffer of 32768 bytes of keydata in the stream.
>>
>> The code (hw/s390x/s390-skeys.c)
>> is modeled similarly to the RAM transfer code (migration/ram.c), with an EOS (end of stream) marker.
>>
>> Contrary to the RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).
> 
> 1) Are there unexpected differences in the migration stream?  That is,
> you could modify qcow2.c to fopen/fwrite/fclose the bytes as they're
> written and read, and see if something does not match.

I hooked qcow2_co_pwritev_task and qcow2_co_preadv_task to also write to an external file using your suggestion.
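
For reference, a minimal sketch of such a hook (the helper name and file path are hypothetical, and linearizing the qiov is glossed over; this illustrates the idea rather than the actual patch used):

----------------------cut--------------------------
/* Hypothetical debug helper: append the bytes of every qcow2 write
 * (or read) to a side file, so the two migration streams can be
 * compared offline. Name and path are illustrative. */
#include <stdio.h>
#include <stddef.h>

static void dump_stream(const char *path, const void *buf, size_t len)
{
    FILE *fp = fopen(path, "ab");

    if (fp) {
        fwrite(buf, 1, len, fp);
        fclose(fp);
    }
}
----------------------cut--------------------------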

I am seeing very interesting differences with and without the reproducer patch (i.e., forcing icount_state_needed to 0 or not):

* without the reproducer patch, everything past the s390-skeys data field is in order: there is the EOS, and then the next idstr follows ("cpu_common").

* with the reproducer patch, every single byte past the s390-skeys data field is ZERO. There is no EOS, there is no next idstr "cpu_common", there is absolutely nothing else than zeroes until the end of the file.


> 
> 2) If it matches, are there unexpected differences other than the lack
> of icount section when you apply the reproducer patch?

they do not match at all.


> 
> The fflush part makes me put more hope in the first, but both could help
> you debug it.
> 
> Thanks,
> 
> Paolo
> 

Thanks,

Claudio



* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-12 10:00 migration: broken snapshot saves appear on s390 when small fields in migration stream removed Claudio Fontana
  2020-07-12 16:11 ` Paolo Bonzini
@ 2020-07-13 11:03 ` Dr. David Alan Gilbert
  2020-07-13 11:39   ` Cornelia Huck
  2020-07-13 11:39   ` Claudio Fontana
  1 sibling, 2 replies; 25+ messages in thread
From: Dr. David Alan Gilbert @ 2020-07-13 11:03 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Juan Quintela,
	Peter Maydell, Cornelia Huck, qemu-devel, Stefan Hajnoczi,
	Paolo Bonzini

* Claudio Fontana (cfontana@suse.de) wrote:
> Hello all,
> 
> during unrelated work on splitting QTest from the TCG instruction counting module,
> 
> I encountered what seems to be a migration stream issue, which is apparent only on s390, and only shows up in block test 267.
> 
> ./check -qcow2 267
> 
> when it comes to snapshot save and load using a backing file.
> 
> Here is a minimal reproducer patch that causes the issue on s390 only.
> 
> --------------------------------------------cut-------------------------------------------
> diff --git a/cpus.c b/cpus.c
> index 41d1c5099f..443b88697a 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>  
>  static bool icount_state_needed(void *opaque)
>  {
> -    return use_icount;
> +    return 0;
>  }

That's weird; I mean that's just turning a subsection on and off;
so you'd hope that if this is a test that generates its own snapshot and
then uses it, it should be consistent.

>  static bool warp_timer_state_needed(void *opaque)
> --------------------------------------------cut-------------------------------------------
> 
> config.status configure line:
> exec '/home/cfontana/qemu-build/../qemu/configure' '--enable-tcg' '--disable-kvm' '--disable-hax' '--target-list=s390x-softmmu' '--enable-debug'
> 
> $ make check-block
> 
> TEST    iotest-qcow2: 267 [fail]
> QEMU          -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x" -nodefaults -display none -accel qtest
> QEMU_IMG      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-img" 
> QEMU_IO       -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
> QEMU_NBD      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-nbd" 
> IMGFMT        -- qcow2 (compat=1.1)
> IMGPROTO      -- file
> PLATFORM      -- Linux/s390x s390zp13 5.3.18-21-default
> TEST_DIR      -- /home/cfontana/qemu-build/tests/qemu-iotests/scratch
> SOCK_DIR      -- /tmp/tmp.bLJcJVtzk5
> SOCKET_SCM_HELPER -- /home/cfontana/qemu-build/tests/qemu-iotests/socket_scm_helper
> 
> --- /home/cfontana/qemu/tests/qemu-iotests/267.out	2020-07-12 05:10:07.948262675 -0400
> +++ /home/cfontana/qemu-build/tests/qemu-iotests/267.out.bad	2020-07-12 05:27:03.358362781 -0400
> @@ -137,6 +137,9 @@
>  ID        TAG                 VM SIZE                DATE       VM CLOCK
>  --        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
>  (qemu) loadvm snap0
> +Unexpected storage key flag data: 0
> +error while loading state for instance 0x0 of device 's390-skeys'
> +Error: Error -22 while loading VM state
>  (qemu) quit
>  
>  
> -----------
> 
> 
> Not run: 172 186 192 259 287
> Failures: 267
> Failed 1 of 115 iotests
> make: *** [/home/cfontana/qemu/tests/Makefile.include:880: check-tests/check-block.sh] Error 1
> 
> -----------
> 
> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
> 
> 
> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
> which includes a buffer of 32768 bytes of keydata in the stream.
> 
> The code (hw/s390x/s390-skeys.c)
> is modeled similarly to the RAM transfer code (migration/ram.c), with an EOS (end of stream) marker.
> 
> Contrary to the RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).
> 
> ----------
> Observation: migration/qemu-file.c defines an IO_BUF_SIZE of 32768.
> 
> --
> 
> The following workarounds hide the problem (make the test pass):
> 
> 1) always including the icount field in the (unrelated) timers state that is sent earlier in the migration stream (i.e. not applying the reproducer patch).
> 
> 2) increasing the IO_BUF_SIZE also hides the problem:
> 
> ----------------------cut--------------------------
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index be21518c57..f81d1272eb 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -30,7 +30,7 @@
>  #include "trace.h"
>  #include "qapi/error.h"
>  
> -#define IO_BUF_SIZE 32768
> +#define IO_BUF_SIZE 65536
>  #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
>  
>  struct QEMUFile {
> ----------------------cut--------------------------
> 
> 3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:
> 
> ----------------------cut--------------------------
> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> index 1e036cc602..47c9a015af 100644
> --- a/hw/s390x/s390-skeys.c
> +++ b/hw/s390x/s390-skeys.c
> @@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
>      .class_size    = sizeof(S390SKeysClass),
>  };
>  
> +extern void qemu_fflush(QEMUFile *f);
> +
>  static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>  {
>      S390SKeysState *ss = S390_SKEYS(opaque);
> @@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>      g_free(buf);
>  end_stream:
>      qemu_put_be64(f, eos);
> +    qemu_fflush(f);
>  }
>  
>  static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
> ----------------------cut--------------------------
> 
> Do any of you with a better understanding of migration/, block and s390 have a suggestion on what the issue could be here,
> and what the next step in the investigation could be?
> 
> Does the fact that migration/ram.c always does an fflush after writing the EOS have any relevance here? Why does it do it,
> and should the s390 code follow the same pattern?

I didn't think it was required.
And qemu_put_buffer loops if needed and calls qemu_fflush internally.
It's possible here that the storage key code is just the canary - the
first thing that detects that the stream is invalid after it all goes
wrong.

Dave


> Thanks,
> 
> Claudio
> 
> 
> -- 
> Claudio Fontana
> Engineering Manager Virtualization, SUSE Labs Core
> 
> SUSE Software Solutions Italy Srl
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-13 11:03 ` Dr. David Alan Gilbert
@ 2020-07-13 11:39   ` Cornelia Huck
  2020-07-13 11:39   ` Claudio Fontana
  1 sibling, 0 replies; 25+ messages in thread
From: Cornelia Huck @ 2020-07-13 11:39 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Juan Quintela,
	Peter Maydell, qemu-devel, Claudio Fontana, Stefan Hajnoczi,
	Paolo Bonzini

On Mon, 13 Jul 2020 12:03:33 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Claudio Fontana (cfontana@suse.de) wrote:

> > The following workarounds hide the problem (make the test pass):
> > 
> > 1) always including the icount field in the (unrelated) timers state that is sent earlier in the migration stream (i.e. not applying the reproducer patch).
> > 
> > 2) increasing the IO_BUF_SIZE also hides the problem:
> > 
> > ----------------------cut--------------------------
> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > index be21518c57..f81d1272eb 100644
> > --- a/migration/qemu-file.c
> > +++ b/migration/qemu-file.c
> > @@ -30,7 +30,7 @@
> >  #include "trace.h"
> >  #include "qapi/error.h"
> >  
> > -#define IO_BUF_SIZE 32768
> > +#define IO_BUF_SIZE 65536
> >  #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
> >  
> >  struct QEMUFile {
> > ----------------------cut--------------------------
> > 
> > 3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:
> > 
> > ----------------------cut--------------------------
> > diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> > index 1e036cc602..47c9a015af 100644
> > --- a/hw/s390x/s390-skeys.c
> > +++ b/hw/s390x/s390-skeys.c
> > @@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
> >      .class_size    = sizeof(S390SKeysClass),
> >  };
> >  
> > +extern void qemu_fflush(QEMUFile *f);
> > +
> >  static void s390_storage_keys_save(QEMUFile *f, void *opaque)
> >  {
> >      S390SKeysState *ss = S390_SKEYS(opaque);
> > @@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
> >      g_free(buf);
> >  end_stream:
> >      qemu_put_be64(f, eos);
> > +    qemu_fflush(f);
> >  }
> >  
> >  static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
> > ----------------------cut--------------------------
> > 
> > Do any of you with a better understanding of migration/, block and s390 have a suggestion on what the issue could be here,
> > and what the next step in the investigation could be?
> > 
> > Does the fact that migration/ram.c always does an fflush after writing the EOS have any relevance here? Why does it do it,
> > and should the s390 code follow the same pattern?
> 
> I didn't think it was required.
> And qemu_put_buffer loops if needed and calls qemu_fflush internally.
> It's possible here that the storage key code is just the canary - the
> first thing that detects that the stream is invalid after it all goes
> wrong.

Yes, that seems possible. Especially as we end up with all zeroes after
the skeys section in the bad case -- it seems like a weird problem to
have to be cured by an individual device. No good idea *what* actually
goes wrong, though.




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-13 11:03 ` Dr. David Alan Gilbert
  2020-07-13 11:39   ` Cornelia Huck
@ 2020-07-13 11:39   ` Claudio Fontana
  2020-07-13 11:45     ` Claudio Fontana
  1 sibling, 1 reply; 25+ messages in thread
From: Claudio Fontana @ 2020-07-13 11:39 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Peter Maydell,
	Cornelia Huck, qemu-devel, Juan Quintela, Stefan Hajnoczi,
	Paolo Bonzini

Hello,

On 7/13/20 1:03 PM, Dr. David Alan Gilbert wrote:
> * Claudio Fontana (cfontana@suse.de) wrote:
>> Hello all,
>>
>> during unrelated work on splitting QTest from the TCG instruction counting module,
>>
>> I encountered what seems to be a migration stream issue, which is apparent only on s390, and only shows up in block test 267.
>>
>> ./check -qcow2 267
>>
>> when it comes to snapshot save and load using a backing file.
>>
>> Here is a minimal reproducer patch that causes the issue on s390 only.
>>
>> --------------------------------------------cut-------------------------------------------
>> diff --git a/cpus.c b/cpus.c
>> index 41d1c5099f..443b88697a 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>  
>>  static bool icount_state_needed(void *opaque)
>>  {
>> -    return use_icount;
>> +    return 0;
>>  }
> 
>> That's weird; I mean that's just turning a subsection on and off;
>> so you'd hope that if this is a test that generates its own snapshot and
>> then uses it, it should be consistent.

Indeed.

> 
>>  static bool warp_timer_state_needed(void *opaque)
>> --------------------------------------------cut-------------------------------------------
>>
>> config.status configure line:
>> exec '/home/cfontana/qemu-build/../qemu/configure' '--enable-tcg' '--disable-kvm' '--disable-hax' '--target-list=s390x-softmmu' '--enable-debug'
>>
>> $ make check-block
>>
>> TEST    iotest-qcow2: 267 [fail]
>> QEMU          -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x" -nodefaults -display none -accel qtest
>> QEMU_IMG      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-img" 
>> QEMU_IO       -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
>> QEMU_NBD      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-nbd" 
>> IMGFMT        -- qcow2 (compat=1.1)
>> IMGPROTO      -- file
>> PLATFORM      -- Linux/s390x s390zp13 5.3.18-21-default
>> TEST_DIR      -- /home/cfontana/qemu-build/tests/qemu-iotests/scratch
>> SOCK_DIR      -- /tmp/tmp.bLJcJVtzk5
>> SOCKET_SCM_HELPER -- /home/cfontana/qemu-build/tests/qemu-iotests/socket_scm_helper
>>
>> --- /home/cfontana/qemu/tests/qemu-iotests/267.out	2020-07-12 05:10:07.948262675 -0400
>> +++ /home/cfontana/qemu-build/tests/qemu-iotests/267.out.bad	2020-07-12 05:27:03.358362781 -0400
>> @@ -137,6 +137,9 @@
>>  ID        TAG                 VM SIZE                DATE       VM CLOCK
>>  --        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
>>  (qemu) loadvm snap0
>> +Unexpected storage key flag data: 0
>> +error while loading state for instance 0x0 of device 's390-skeys'
>> +Error: Error -22 while loading VM state
>>  (qemu) quit
>>  
>>  
>> -----------
>>
>>
>> Not run: 172 186 192 259 287
>> Failures: 267
>> Failed 1 of 115 iotests
>> make: *** [/home/cfontana/qemu/tests/Makefile.include:880: check-tests/check-block.sh] Error 1
>>
>> -----------
>>
>> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
>>
>>
>> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
>> which includes a buffer of 32768 bytes of keydata in the stream.
>>
>> The code (hw/s390x/s390-skeys.c)
>> is modeled similarly to the RAM transfer code (migration/ram.c), with an EOS (end of stream) marker.
>>
>> Contrary to the RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).
>>
>> ----------
>> Observation: migration/qemu-file.c defines an IO_BUF_SIZE of 32768.
>>
>> --
>>
>> The following workarounds hide the problem (make the test pass):
>>
>> 1) always including the icount field in the (unrelated) timers state that is sent earlier in the migration stream (i.e. not applying the reproducer patch).
>>
>> 2) increasing the IO_BUF_SIZE also hides the problem:
>>
>> ----------------------cut--------------------------
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index be21518c57..f81d1272eb 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -30,7 +30,7 @@
>>  #include "trace.h"
>>  #include "qapi/error.h"
>>  
>> -#define IO_BUF_SIZE 32768
>> +#define IO_BUF_SIZE 65536
>>  #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
>>  
>>  struct QEMUFile {
>> ----------------------cut--------------------------
>>
>> 3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:
>>
>> ----------------------cut--------------------------
>> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
>> index 1e036cc602..47c9a015af 100644
>> --- a/hw/s390x/s390-skeys.c
>> +++ b/hw/s390x/s390-skeys.c
>> @@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
>>      .class_size    = sizeof(S390SKeysClass),
>>  };
>>  
>> +extern void qemu_fflush(QEMUFile *f);
>> +
>>  static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>>  {
>>      S390SKeysState *ss = S390_SKEYS(opaque);
>> @@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>>      g_free(buf);
>>  end_stream:
>>      qemu_put_be64(f, eos);
>> +    qemu_fflush(f);
>>  }
>>  
>>  static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
>> ----------------------cut--------------------------
>>
>> Do any of you with a better understanding of migration/, block and s390 have a suggestion on what the issue could be here,
>> and what the next step in the investigation could be?
>>
>> Does the fact that migration/ram.c always does an fflush after writing the EOS have any relevance here? Why does it do it,
>> and should the s390 code follow the same pattern?
> 
> I didn't think it was required.
> And qemu_put_buffer loops if needed and calls qemu_fflush internally.

This detail probably does not help in this case, because the buffer itself is fine (32768 zero bytes is actually correct).

The End Of Stream marker written just after with qemu_put_be64 is not ok,
and anything past that point is just a whole bunch of zeroes in the stream when we hook up the calls to

qcow2_co_pwritev_task and qcow2_co_preadv_task

(see my response to Paolo in this thread).


> It's possible here that the storage key code is just the canary - the
> first thing that detects that the stream is invalid after it all goes
> wrong.

Seems the exact opposite to me, i.e., the stream seems just fine up to the point where the EOS marker after the keydata in "s390-skeys" should be.

Thanks!

Claudio

> 
> Dave
> 
> 
>> Thanks,
>>
>> Claudio
>>
>>
>> -- 
>> Claudio Fontana
>> Engineering Manager Virtualization, SUSE Labs Core
>>
>> SUSE Software Solutions Italy Srl
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-13 11:39   ` Claudio Fontana
@ 2020-07-13 11:45     ` Claudio Fontana
  0 siblings, 0 replies; 25+ messages in thread
From: Claudio Fontana @ 2020-07-13 11:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Peter Maydell,
	Cornelia Huck, qemu-devel, Juan Quintela, Stefan Hajnoczi,
	Paolo Bonzini

On 7/13/20 1:39 PM, Claudio Fontana wrote:
> Hello,
> 
> On 7/13/20 1:03 PM, Dr. David Alan Gilbert wrote:
>> * Claudio Fontana (cfontana@suse.de) wrote:
>>> Hello all,
>>>
>>> during unrelated work on splitting QTest from the TCG instruction counting module,
>>>
>>> I encountered what seems to be a migration stream issue, which is apparent only on s390, and only shows up in block test 267.
>>>
>>> ./check -qcow2 267
>>>
>>> when it comes to snapshot save and load using a backing file.
>>>
>>> Here is a minimal reproducer patch that causes the issue on s390 only.
>>>
>>> --------------------------------------------cut-------------------------------------------
>>> diff --git a/cpus.c b/cpus.c
>>> index 41d1c5099f..443b88697a 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>>  
>>>  static bool icount_state_needed(void *opaque)
>>>  {
>>> -    return use_icount;
>>> +    return 0;
>>>  }
>>
>> That's weird; I mean that's just turning a subsection on and off;
>> so you'd hope that if this is a test that generates its own snapshot and
>> then uses it, it should be consistent.
> 
> Indeed.
> 
>>
>>>  static bool warp_timer_state_needed(void *opaque)
>>> --------------------------------------------cut-------------------------------------------
>>>
>>> config.status configure line:
>>> exec '/home/cfontana/qemu-build/../qemu/configure' '--enable-tcg' '--disable-kvm' '--disable-hax' '--target-list=s390x-softmmu' '--enable-debug'
>>>
>>> $ make check-block
>>>
>>> TEST    iotest-qcow2: 267 [fail]
>>> QEMU          -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x" -nodefaults -display none -accel qtest
>>> QEMU_IMG      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-img" 
>>> QEMU_IO       -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
>>> QEMU_NBD      -- "/home/cfontana/qemu-build/tests/qemu-iotests/../../qemu-nbd" 
>>> IMGFMT        -- qcow2 (compat=1.1)
>>> IMGPROTO      -- file
>>> PLATFORM      -- Linux/s390x s390zp13 5.3.18-21-default
>>> TEST_DIR      -- /home/cfontana/qemu-build/tests/qemu-iotests/scratch
>>> SOCK_DIR      -- /tmp/tmp.bLJcJVtzk5
>>> SOCKET_SCM_HELPER -- /home/cfontana/qemu-build/tests/qemu-iotests/socket_scm_helper
>>>
>>> --- /home/cfontana/qemu/tests/qemu-iotests/267.out	2020-07-12 05:10:07.948262675 -0400
>>> +++ /home/cfontana/qemu-build/tests/qemu-iotests/267.out.bad	2020-07-12 05:27:03.358362781 -0400
>>> @@ -137,6 +137,9 @@
>>>  ID        TAG                 VM SIZE                DATE       VM CLOCK
>>>  --        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
>>>  (qemu) loadvm snap0
>>> +Unexpected storage key flag data: 0
>>> +error while loading state for instance 0x0 of device 's390-skeys'
>>> +Error: Error -22 while loading VM state
>>>  (qemu) quit
>>>  
>>>  
>>> -----------
>>>
>>>
>>> Not run: 172 186 192 259 287
>>> Failures: 267
>>> Failed 1 of 115 iotests
>>> make: *** [/home/cfontana/qemu/tests/Makefile.include:880: check-tests/check-block.sh] Error 1
>>>
>>> -----------
>>>
>>> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
>>>
>>>
>>> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
>>> which includes a buffer of 32768 bytes of keydata in the stream.
>>>
>>> The code (hw/s390x/s390-skeys.c)
>>> is modeled similarly to the RAM transfer code (migration/ram.c), with an EOS (end of stream) marker.
>>>
>>> Contrary to the RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).
>>>
>>> ----------
>>> Observation: migration/qemu-file.c defines an IO_BUF_SIZE of 32768.
>>>
>>> --
>>>
>>> The following workarounds hide the problem (make the test pass):
>>>
>>> 1) always including the icount field in the (unrelated) timers state that is sent earlier in the migration stream (i.e. not applying the reproducer patch).
>>>
>>> 2) increasing the IO_BUF_SIZE also hides the problem:
>>>
>>> ----------------------cut--------------------------
>>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>>> index be21518c57..f81d1272eb 100644
>>> --- a/migration/qemu-file.c
>>> +++ b/migration/qemu-file.c
>>> @@ -30,7 +30,7 @@
>>>  #include "trace.h"
>>>  #include "qapi/error.h"
>>>  
>>> -#define IO_BUF_SIZE 32768
>>> +#define IO_BUF_SIZE 65536
>>>  #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
>>>  
>>>  struct QEMUFile {
>>> ----------------------cut--------------------------
>>>
>>> 3) adding a qemu_fflush in hw/s390x/s390-skeys.c after EOS also "fixes" the problem:
>>>
>>> ----------------------cut--------------------------
>>> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
>>> index 1e036cc602..47c9a015af 100644
>>> --- a/hw/s390x/s390-skeys.c
>>> +++ b/hw/s390x/s390-skeys.c
>>> @@ -252,6 +252,8 @@ static const TypeInfo qemu_s390_skeys_info = {
>>>      .class_size    = sizeof(S390SKeysClass),
>>>  };
>>>  
>>> +extern void qemu_fflush(QEMUFile *f);
>>> +
>>>  static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>>>  {
>>>      S390SKeysState *ss = S390_SKEYS(opaque);
>>> @@ -302,6 +304,7 @@ static void s390_storage_keys_save(QEMUFile *f, void *opaque)
>>>      g_free(buf);
>>>  end_stream:
>>>      qemu_put_be64(f, eos);
>>> +    qemu_fflush(f);
>>>  }
>>>  
>>>  static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
>>> ----------------------cut--------------------------
>>>
>>> Do any of you with a better understanding of migration/, block and s390 have a suggestion on what the issue could be here,
>>> and what the next step in the investigation could be?
>>>
>>> Does the fact that migration/ram.c always does an fflush after writing the EOS have any relevance here? Why does it do it,
>>> and should the s390 code follow the same pattern?
>>
>> I didn't think it was required.
>> And qemu_put_buffer loops if needed and calls qemu_fflush internally.
> 
> This detail probably does not help in this case, because the buffer itself is fine (32768 zero bytes is actually correct).
> 
> The End Of Stream marker written just after with qemu_put_be64 is not ok,
> and anything past that point is just a whole bunch of zeroes in the stream when we hook up the calls to
> 
> qcow2_co_pwritev_task and qcow2_co_preadv_task
> 
> (see my response to Paolo in this thread).
> 
> 
>> It's possible here that the storage key code is just the canary - the
>> first thing that detects that the stream is invalid after it all goes
>> wrong.
> 
> Seems the exact opposite to me, i.e., the stream seems just fine up to the point where the EOS marker after the keydata in "s390-skeys" should be.

Ah, I might have misunderstood what you wrote there: you mean that the stream became invalid, and this just happened to surface in the context of s390-skeys, right?

To me it also looks like an underlying issue, not really related to skeys in particular; they just exercise the underlying machinery in a very peculiar way,
in terms of field positions, sizes, etc.

Ciao,

Claudio

> 
> Thanks!
> 
> Claudio
> 
>>
>> Dave
>>
>>
>>> Thanks,
>>>
>>> Claudio
>>>
>>>
>>> -- 
>>> Claudio Fontana
>>> Engineering Manager Virtualization, SUSE Labs Core
>>>
>>> SUSE Software Solutions Italy Srl
>>>
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>
>>
> 
> 




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-13  9:11   ` Claudio Fontana
@ 2020-07-14 14:29     ` Claudio Fontana
  2020-07-14 14:35       ` Thomas Huth
  0 siblings, 1 reply; 25+ messages in thread
From: Claudio Fontana @ 2020-07-14 14:29 UTC (permalink / raw)
  To: Paolo Bonzini, Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf,
	Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Thomas Huth, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

Hello,

I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
so I am involving Kevin Wolf and Max Reitz.


The reproducer again:

> --------------------------------------------cut-------------------------------------------
> diff --git a/cpus.c b/cpus.c
> index 41d1c5099f..443b88697a 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>  
>  static bool icount_state_needed(void *opaque)
>  {
> -    return use_icount;
> +    return 0;
>  }
>  
>  static bool warp_timer_state_needed(void *opaque)
> --------------------------------------------cut-------------------------------------------



This issue for now appears on s390 only:

On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).

What is new?
It is possible to work around the problem simply by adding bytes back to the stream, for example by extending the existing UNUSED field in the timers vmstate.

So this "fixes it", applied on top of the reproducer:

--------------------------------------------cut-------------------------------------------
@@ -728,7 +728,7 @@ static const VMStateDescription vmstate_timers = {
     .minimum_version_id = 1,
     .fields = (VMStateField[]) {
         VMSTATE_INT64(cpu_ticks_offset, TimersState),
-        VMSTATE_UNUSED(8),
+        VMSTATE_UNUSED(119),
         VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
         VMSTATE_END_OF_LIST()
     },
--------------------------------------------cut-------------------------------------------


NB: Exactly 111 bytes need to be added, not one byte less.

Using VMSTATE_UNUSED(118) again breaks.
Using VMSTATE_UNUSED(120) or more is _fine_. It seems that a certain threshold needs to be reached in the stream before the transfer of old-style vmsave fields in s390-skeys is successful.

So here is the output, with some logging, of the VMSTATE_UNUSED(118) run (FAILURE, marked by error "-22"), followed by the VMSTATE_UNUSED(119) run (SUCCESS).
qemu_fflush logs when it is called, and the s390 skeys save code logs just before and just after writing the skey fields.

VMSTATE_UNUSED(118):

QA output created by 267

=== -blockdev with a backing file ===

Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=134217728
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/t.IMGFMT.base
Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT.base,node-name=backing-file -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file -blockdev driver=IMGFMT,file=file,backing=backing-file,node-name=fmt
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) savevm snap0
qemu_fflush: written: 82, expected: 82 - done.
qemu_fflush: written: 78, expected: 78 - done.
qemu_fflush: - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 81907, expected: 81907 - done.
qemu_fflush: written: 36, expected: 36 - done.
saving keys to stream...
bytes_xfer=344418 xfer_limit=0 pos=344247 buf_index=331 buf_size=0
iov cnt=1
[0] base=0x2aa38733778 len=331 [~]

qemu_fflush: written: 32768, expected: 32768 - done.
saved keys to stream.
bytes_xfer=377210 xfer_limit=0 pos=377015 buf_index=355 buf_size=0
iov cnt=1
[0] base=0x2aa38733778 len=355 []

qemu_fflush: written: 16201, expected: 16201 - done.
qemu_fflush: - done.
qemu_fflush: - done.
(qemu) info snapshots
List of snapshots present on all disks:
ID        TAG                     VM SIZE                DATE       VM CLOCK
--        snap0                      SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
(qemu) loadvm snap0
Unexpected storage key flag data: 0
error while loading state for instance 0x0 of device 's390-skeys'
qemu_fflush: not writable!
Error: Error -22 while loading VM state
(qemu) quit

*** done



===================================================================================================================================



VMSTATE_UNUSED(119):


QA output created by 267

=== -blockdev with a backing file ===

Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=134217728
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/t.IMGFMT.base
Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT.base,node-name=backing-file -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file -blockdev driver=IMGFMT,file=file,backing=backing-file,node-name=fmt
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) savevm snap0
qemu_fflush: written: 82, expected: 82 - done.
qemu_fflush: written: 78, expected: 78 - done.
qemu_fflush: - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 32768, expected: 32768 - done.
qemu_fflush: written: 81907, expected: 81907 - done.
qemu_fflush: written: 36, expected: 36 - done.
saving keys to stream...
bytes_xfer=344419 xfer_limit=0 pos=344247 buf_index=332 buf_size=0
iov cnt=1
[0] base=0x2aa1bc41778 len=332 [~]

qemu_fflush: written: 32768, expected: 32768 - done.
saved keys to stream.
bytes_xfer=377211 xfer_limit=0 pos=377015 buf_index=356 buf_size=0
iov cnt=1
[0] base=0x2aa1bc41778 len=356 []

qemu_fflush: written: 16202, expected: 16202 - done.
qemu_fflush: - done.
qemu_fflush: - done.
(qemu) info snapshots
List of snapshots present on all disks:


=================================================

Here is the diff between failed and successful runs:

--- bad_extra_118.txt   2020-07-14 09:52:48.054023509 -0400
+++ good_extra_119.txt  2020-07-14 09:48:30.823977732 -0400
@@ -21,17 +21,17 @@
 qemu_fflush: written: 81907, expected: 81907 - done.
 qemu_fflush: written: 36, expected: 36 - done.
 saving keys to stream...
-bytes_xfer=344418 xfer_limit=0 pos=344247 buf_index=331 buf_size=0
+bytes_xfer=344419 xfer_limit=0 pos=344247 buf_index=332 buf_size=0
 iov cnt=1
-[0] base=0x2aa38733778 len=331 [~]
+[0] base=0x2aa1bc41778 len=332 [~]
 
 qemu_fflush: written: 32768, expected: 32768 - done.
 saved keys to stream.
-bytes_xfer=377210 xfer_limit=0 pos=377015 buf_index=355 buf_size=0
+bytes_xfer=377211 xfer_limit=0 pos=377015 buf_index=356 buf_size=0
 iov cnt=1
-[0] base=0x2aa38733778 len=355 []
+[0] base=0x2aa1bc41778 len=356 []
 
-qemu_fflush: written: 16201, expected: 16201 - done.
+qemu_fflush: written: 16202, expected: 16202 - done.
 qemu_fflush: - done.
 qemu_fflush: - done.
 (qemu) info snapshots
@@ -39,10 +39,7 @@
 ID        TAG                     VM SIZE                DATE       VM CLOCK
 --        snap0                      SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
 (qemu) loadvm snap0
-Unexpected storage key flag data: 0
-error while loading state for instance 0x0 of device 's390-skeys'
 qemu_fflush: not writable!
-Error: Error -22 while loading VM state
 (qemu) quit
 
 *** done

=================================================


If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.

Thanks,

Claudio



On 7/13/20 11:11 AM, Claudio Fontana wrote:
> Hi Paolo,
> 
> On 7/12/20 6:11 PM, Paolo Bonzini wrote:
>> On 12/07/20 12:00, Claudio Fontana wrote:
>>> Note: only the === -blockdev with a backing file === part of test 267 fails. -blockdev with NBD is ok, like all the rest.
>>>
>>>
>>> Interesting facts about s390 in particular: its save/load code includes the transfer of "storage keys",
>>> which include a buffer of 32768 bytes of keydata in the stream.
>>>
>>> The code (hw/s390x/s390-skeys.c),
>>> is modeled similarly to RAM transfer (like in migration/ram.c), with an EOS (end of stream) marker.
>>>
>>> Countrary to RAM transfer code though, after qemu_put_be64(f, EOS), the s390 code does not qemu_fflush(f).
>>
>> 1) Are there unexpected differences in the migration stream?  That is,
>> you could modify qcow2.c to fopen/fwrite/fclose the bytes as they're
>> written and read, and see if something does not match.
> 
> I hooked qcow2_co_pwritev_task and qcow2_co_preadv_task to also write to an external file using your suggestion.
> 
> I am seeing very interesting differences with and without the reproducer patch (i.e., forcing icount_state_needed to 0 or not):
> 
> * without the reproducer patch, everything past the s390-skeys data field is in order: there is the EOS, and then the next idstr follows ("cpu_common").
> 
> * with the reproducer patch, every single byte past the s390-skeys data field is ZERO. There is no EOS, there is no next idstr "cpu_common", there is absolutely nothing else than zeroes until the end of the file.
> 
> 
>>
>> 2) If it matches, are there unexpected differences other than the lack
>> of icount section when you apply the reproducer patch?
> 
> they do not match at all.
> 
> 
>>
>> The fflush part makes me put more hope in the first, but both could help
>> you debug it.
>>
>> Thanks,
>>
>> Paolo
>>
> 
> Thanks,
> 
> Claudio
> 




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-14 14:29     ` Claudio Fontana
@ 2020-07-14 14:35       ` Thomas Huth
  2020-07-15 11:10         ` Claudio Fontana
  0 siblings, 1 reply; 25+ messages in thread
From: Thomas Huth @ 2020-07-14 14:35 UTC (permalink / raw)
  To: Claudio Fontana, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

On 14/07/2020 16.29, Claudio Fontana wrote:
> Hello,
> 
> I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
> so I am involving Kevin Wolf and Max Reitz.
> 
> 
> The reproducer again:
> 
>> --------------------------------------------cut-------------------------------------------
>> diff --git a/cpus.c b/cpus.c
>> index 41d1c5099f..443b88697a 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>  
>>  static bool icount_state_needed(void *opaque)
>>  {
>> -    return use_icount;
>> +    return 0;
>>  }
>>  
>>  static bool warp_timer_state_needed(void *opaque)
>> --------------------------------------------cut-------------------------------------------
> 
> This issue for now appears on s390 only:
> 
> On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).
[...]
> If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.

Two questions:

1) Can you also reproduce the issue manually, without running iotest
267? ... I tried, but so far I failed.

2) Since all the information so far sounds like the problem could be
elsewhere in the code, and the skeys just catch it by accident ... have
you tried running with valgrind? Maybe it catches something useful?

 Thomas




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-14 14:35       ` Thomas Huth
@ 2020-07-15 11:10         ` Claudio Fontana
  2020-07-15 12:25           ` Claudio Fontana
  2020-07-16 12:58           ` Claudio Fontana
  0 siblings, 2 replies; 25+ messages in thread
From: Claudio Fontana @ 2020-07-15 11:10 UTC (permalink / raw)
  To: Thomas Huth, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

Hi Thomas,

On 7/14/20 4:35 PM, Thomas Huth wrote:
> On 14/07/2020 16.29, Claudio Fontana wrote:
>> Hello,
>>
>> I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
>> so I am involving Kevin Wolf and Max Reitz.
>>
>>
>> The reproducer again:
>>
>>> --------------------------------------------cut-------------------------------------------
>>> diff --git a/cpus.c b/cpus.c
>>> index 41d1c5099f..443b88697a 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>>  
>>>  static bool icount_state_needed(void *opaque)
>>>  {
>>> -    return use_icount;
>>> +    return 0;
>>>  }
>>>  
>>>  static bool warp_timer_state_needed(void *opaque)
>>> --------------------------------------------cut-------------------------------------------
>>
>> This issue for now appears on s390 only:
>>
>> On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).
> [...]
>> If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.
> 

small update: in the GOOD case (enough padding added) a qcow_merge() is triggered for the last write of 16202 bytes.
In the BAD case (not enough padding added) a qcow_merge() is not triggered for the last write of 16201 bytes.

Note: manually flushing with qemu_fflush in s390-skeys vmsave also works (maybe got lost in the noise).


> Two questions:
> 
> 1) Can you also reproduce the issue manually, without running iotest
> 267? ... I tried, but so far I failed.

Thanks for the suggestion, will try.

> 
> 2) Since all the information so far sounds like the problem could be
> elsewhere in the code, and the skeys just catch it by accident ... have
> you tried running with valgrind? Maybe it catches something useful?

Nothing yet, but will fiddle with the options a bit more.
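
For example, an invocation along these lines (illustrative only; --track-origins=yes makes uninitialised-value reports more precise, at the cost of speed):

----------------------cut--------------------------
valgrind --track-origins=yes --leak-check=full \
    ./s390x-softmmu/qemu-system-s390x -nodefaults -display none ...
----------------------cut--------------------------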

> 
>  Thomas
> 

Ciao,

Claudio




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-15 11:10         ` Claudio Fontana
@ 2020-07-15 12:25           ` Claudio Fontana
  2020-07-16 12:58           ` Claudio Fontana
  1 sibling, 0 replies; 25+ messages in thread
From: Claudio Fontana @ 2020-07-15 12:25 UTC (permalink / raw)
  To: Thomas Huth, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

On 7/15/20 1:10 PM, Claudio Fontana wrote:
> Hi Thomas,
> 
> On 7/14/20 4:35 PM, Thomas Huth wrote:
>> On 14/07/2020 16.29, Claudio Fontana wrote:
>>> Hello,
>>>
>>> I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
>>> so I am involving Kevin Wolf and Max Reitz.
>>>
>>>
>>> The reproducer again:
>>>
>>>> --------------------------------------------cut-------------------------------------------
>>>> diff --git a/cpus.c b/cpus.c
>>>> index 41d1c5099f..443b88697a 100644
>>>> --- a/cpus.c
>>>> +++ b/cpus.c
>>>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>>>  
>>>>  static bool icount_state_needed(void *opaque)
>>>>  {
>>>> -    return use_icount;
>>>> +    return 0;
>>>>  }
>>>>  
>>>>  static bool warp_timer_state_needed(void *opaque)
>>>> --------------------------------------------cut-------------------------------------------
>>>
>>> This issue for now appears on s390 only:
>>>
>>> On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).
>> [...]
>>> If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.
>>
> 
> small update: in the GOOD case (enough padding added) a qcow_merge() is triggered for the last write of 16202 bytes.
> In the BAD case (not enough padding added) a qcow_merge() is not triggered for the last write of 16201 bytes.

I am actually talking about merge_cow() here (not qcow_merge()), and it is more complicated. I will post a better explanation later on.

> 
> Note: manually flushing with qemu_fflush in s390-skeys vmsave also works (maybe got lost in the noise).



* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-15 11:10         ` Claudio Fontana
  2020-07-15 12:25           ` Claudio Fontana
@ 2020-07-16 12:58           ` Claudio Fontana
  2020-07-20 18:24             ` Claudio Fontana
  1 sibling, 1 reply; 25+ messages in thread
From: Claudio Fontana @ 2020-07-16 12:58 UTC (permalink / raw)
  To: Thomas Huth, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

Small update on this,

On 7/15/20 1:10 PM, Claudio Fontana wrote:
> Hi Thomas,
> 
> On 7/14/20 4:35 PM, Thomas Huth wrote:
>> On 14/07/2020 16.29, Claudio Fontana wrote:
>>> Hello,
>>>
>>> I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
>>> so I am involving Kevin Wolf and Max Reitz.
>>>
>>>
>>> The reproducer again:
>>>
>>>> --------------------------------------------cut-------------------------------------------
>>>> diff --git a/cpus.c b/cpus.c
>>>> index 41d1c5099f..443b88697a 100644
>>>> --- a/cpus.c
>>>> +++ b/cpus.c
>>>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>>>  
>>>>  static bool icount_state_needed(void *opaque)
>>>>  {
>>>> -    return use_icount;
>>>> +    return 0;
>>>>  }
>>>>  
>>>>  static bool warp_timer_state_needed(void *opaque)
>>>> --------------------------------------------cut-------------------------------------------
>>>
>>> This issue for now appears on s390 only:
>>>
>>> On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).
>> [...]
>>> If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.
>>
> 
> small update: in the GOOD case (enough padding added) a qcow_merge() is triggered for the last write of 16202 bytes.
> In the BAD case (not enough padding added) a qcow_merge() is not triggered for the last write of 16201 bytes.
> 
> Note: manually flushing with qemu_fflush in s390-skeys vmsave also works (maybe got lost in the noise).
> 
> 
>> Two questions:
>>
>> 1) Can you also reproduce the issue manually, without running iotest
>> 267? ... I tried, but so far I failed.
> 
> Thanks for the suggestion, will try.

I am currently trying to manually reproduce an environment similar to that of the test;
at the moment I am not able to reproduce the issue manually.

Not very familiar with s390,
I've been running with 

export QEMU=/home/cfontana/qemu-build/s390x-softmmu/qemu-system-s390x

$QEMU -nographic -monitor stdio -nodefaults -no-shutdown FILENAME

where FILENAME is the qcow2 produced by the test.

let me know if you have a suggestion on how to set up something simple properly.
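
For the record, paraphrasing the "Testing:" line from the 267 output earlier in this thread, a manual equivalent would be roughly the following (paths and sizes are illustrative):

----------------------cut--------------------------
# recreate the backing-file chain the test uses
qemu-img create -f qcow2 t.qcow2.base 128M
qemu-img create -f qcow2 -b t.qcow2.base -F qcow2 t.qcow2 128M

$QEMU -nographic -monitor stdio -nodefaults -no-shutdown \
    -blockdev driver=file,filename=t.qcow2.base,node-name=backing-file \
    -blockdev driver=file,filename=t.qcow2,node-name=file \
    -blockdev driver=qcow2,file=file,backing=backing-file,node-name=fmt

# then at the monitor prompt:
#   (qemu) savevm snap0
#   (qemu) info snapshots
#   (qemu) loadvm snap0
----------------------cut--------------------------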


> 
>>
>> 2) Since all the information so far sounds like the problem could be
>> elsewhere in the code, and the skeys just catch it by accident ... have
>> you tried running with valgrind? Maybe it catches something useful?
> 
> Nothing yet, but will fiddle with the options a bit more.

The only thing I have seen so far:


+==33321== 
+==33321== Warning: client switching stacks?  SP change: 0x1ffeffe5e8 --> 0x5d9cf60
+==33321==          to suppress, use: --max-stackframe=137324009096 or greater
+==33321== Warning: client switching stacks?  SP change: 0x5d9cd18 --> 0x1ffeffe5e8
+==33321==          to suppress, use: --max-stackframe=137324009680 or greater
+==33321== Warning: client switching stacks?  SP change: 0x1ffeffe8b8 --> 0x5d9ce58
+==33321==          to suppress, use: --max-stackframe=137324010080 or greater
+==33321==          further instances of this message will not be shown.
+==33321== Thread 4:
+==33321== Conditional jump or move depends on uninitialised value(s)
+==33321==    at 0x3AEC70: process_queued_cpu_work (cpus-common.c:331)
+==33321==    by 0x2753E1: qemu_wait_io_event_common (cpus.c:1213)
+==33321==    by 0x2755CD: qemu_wait_io_event (cpus.c:1253)
+==33321==    by 0x27596D: qemu_dummy_cpu_thread_fn (cpus.c:1337)
+==33321==    by 0x725C87: qemu_thread_start (qemu-thread-posix.c:521)
+==33321==    by 0x4D504E9: start_thread (in /lib64/libpthread-2.26.so)
+==33321==    by 0x4E72BBD: ??? (in /lib64/libc-2.26.so)
+==33321== 
+==33321== Conditional jump or move depends on uninitialised value(s)
+==33321==    at 0x3AEC74: process_queued_cpu_work (cpus-common.c:331)
+==33321==    by 0x2753E1: qemu_wait_io_event_common (cpus.c:1213)
+==33321==    by 0x2755CD: qemu_wait_io_event (cpus.c:1253)
+==33321==    by 0x27596D: qemu_dummy_cpu_thread_fn (cpus.c:1337)
+==33321==    by 0x725C87: qemu_thread_start (qemu-thread-posix.c:521)
+==33321==    by 0x4D504E9: start_thread (in /lib64/libpthread-2.26.so)
+==33321==    by 0x4E72BBD: ??? (in /lib64/libc-2.26.so)
+==33321== 
+==33321== 
+==33321== HEAP SUMMARY:
+==33321==     in use at exit: 2,138,442 bytes in 13,935 blocks
+==33321==   total heap usage: 19,089 allocs, 5,154 frees, 5,187,670 bytes allocated
+==33321== 
+==33321== LEAK SUMMARY:
+==33321==    definitely lost: 0 bytes in 0 blocks
+==33321==    indirectly lost: 0 bytes in 0 blocks
+==33321==      possibly lost: 7,150 bytes in 111 blocks
+==33321==    still reachable: 2,131,292 bytes in 13,824 blocks
+==33321==         suppressed: 0 bytes in 0 blocks
+==33321== Rerun with --leak-check=full to see details of leaked memory

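Something along these lines is enough to get QEMU under valgrind outside the test harness (a sketch, not the exact command line; the --max-stackframe value is the one valgrind itself suggests in the warnings above):

valgrind --trace-children=yes --max-stackframe=137324010080 \
    $QEMU -nographic -monitor stdio -nodefaults -no-shutdown FILENAME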

> 
>>
>>  Thomas
>>
> 
> Ciao,
> 
> Claudio
> 
> 

A more interesting update is what follows, I think.

I was able to "fix" the problem shown by the reproducer:

@@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
 
 static bool icount_state_needed(void *opaque)
 {
-    return use_icount;
+    return 0;
 }
 
by just slowing down qcow2_co_pwritev_task_entry with some tight loops,
without changing any fields between runs (other than the reproducer icount field removal).

I tried to insert the same slowdown just in savevm.c at the end of save_snapshot, but that does not work; it needs to be in the coroutine.
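
The slowdown itself is nothing clever; something along these lines at the top of qcow2_co_pwritev_task_entry() does it (a sketch, debugging only; the loop bound is arbitrary, and volatile keeps the compiler from dropping the loop):

static void qcow2_debug_spin(void)
{
    volatile unsigned long i;

    for (i = 0; i < 100000000UL; i++) {
        /* burn cycles to delay the write task */
    }
}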

Thanks,

Claudio




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-16 12:58           ` Claudio Fontana
@ 2020-07-20 18:24             ` Claudio Fontana
  2020-07-21  8:22               ` Claudio Fontana
  0 siblings, 1 reply; 25+ messages in thread
From: Claudio Fontana @ 2020-07-20 18:24 UTC (permalink / raw)
  To: Thomas Huth, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Bruce Rogers, Stefan Hajnoczi

I have now been able to reproduce this on X86 as well.

It happens much more rarely, about once every 10 times.

I will sort out the data and try to make it even more reproducible, then post my findings in detail.

Overall I proceeded as follows:

1) hooked the savevm code to skip all fields with the exception of "s390-skeys", so only s390-skeys are actually saved (sketched below).

2) reimplemented "s390-skeys" in a common implementation in cpus.c, used on both x86 and s390, modeling the behaviour of save/load from hw/s390

3) ran ./check -qcow2 267 on both x86 and s390.

In the case of s390, failure seems to be reproducible 100% of the time.
On X86, it is as mentioned failing about 10% of the time.
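
Concretely, the hook is the same guard at the top of each QTAILQ_FOREACH(se, &savevm_state.handlers, entry) loop in migration/savevm.c (full patch to follow with my findings):

    if (strcmp(se->idstr, "s390-skeys") != 0) {
        /* only save keys */
        continue;
    }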

Ciao,

Claudio


On 7/16/20 2:58 PM, Claudio Fontana wrote:
> Small update on this,
> 
> On 7/15/20 1:10 PM, Claudio Fontana wrote:
>> Hi Thomas,
>>
>> On 7/14/20 4:35 PM, Thomas Huth wrote:
>>> On 14/07/2020 16.29, Claudio Fontana wrote:
>>>> Hello,
>>>>
>>>> I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
>>>> but involving Kevin Wolf and Max Reitz.
>>>>
>>>>
>>>> The reproducer again:
>>>>
>>>>> --------------------------------------------cut-------------------------------------------
>>>>> diff --git a/cpus.c b/cpus.c
>>>>> index 41d1c5099f..443b88697a 100644
>>>>> --- a/cpus.c
>>>>> +++ b/cpus.c
>>>>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>>>>  
>>>>>  static bool icount_state_needed(void *opaque)
>>>>>  {
>>>>> -    return use_icount;
>>>>> +    return 0;
>>>>>  }
>>>>>  
>>>>>  static bool warp_timer_state_needed(void *opaque)
>>>>> --------------------------------------------cut-------------------------------------------
>>>>
>>>> This issue for now appears on s390 only:
>>>>
>>>> On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).
>>> [...]
>>>> If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.
>>>
>>
>> small update: in the GOOD case (enough padding added) a qcow_merge() is triggered for the last write of 16202 bytes.
>> In the BAD case (not enough padding added) a qcow_merge() is not triggered for the last write of 16201 bytes.
>>
>> Note: manually flushing with qemu_fflush in s390-skeys vmsave also works (maybe got lost in the noise).
>>
>>
>>> Two questions:
>>>
>>> 1) Can you also reproduce the issue manually, without running iotest
>>> 267? ... I tried, but so far I failed.
>>
>> Thanks for the suggestion, will try.
> 
> Currently trying to reproduce the issue manually, in an environment similar to that of the test;
> at the moment I have not been able to.
> 
> Not very familiar with s390,
> I've been running with 
> 
> export QEMU=/home/cfontana/qemu-build/s390x-softmmu/qemu-system-s390x
> 
> $QEMU -nographic -monitor stdio -nodefaults -no-shutdown FILENAME
> 
> where FILENAME is the qcow2 produced by the test.
> 
> Let me know if you have a suggestion on how to set up something simple properly.
> 
> 
>>
>>>
>>> 2) Since all the information so far sounds like the problem could be
>>> elsewhere in the code, and the skeys just catch it by accident ... have
>>> you tried running with valgrind? Maybe it catches something useful?
>>
>> Nothing yet, but will fiddle with the options a bit more.
> 
> Only thing I have seen so far:
> 
> 
> +==33321== 
> +==33321== Warning: client switching stacks?  SP change: 0x1ffeffe5e8 --> 0x5d9cf60
> +==33321==          to suppress, use: --max-stackframe=137324009096 or greater
> +==33321== Warning: client switching stacks?  SP change: 0x5d9cd18 --> 0x1ffeffe5e8
> +==33321==          to suppress, use: --max-stackframe=137324009680 or greater
> +==33321== Warning: client switching stacks?  SP change: 0x1ffeffe8b8 --> 0x5d9ce58
> +==33321==          to suppress, use: --max-stackframe=137324010080 or greater
> +==33321==          further instances of this message will not be shown.
> +==33321== Thread 4:
> +==33321== Conditional jump or move depends on uninitialised value(s)
> +==33321==    at 0x3AEC70: process_queued_cpu_work (cpus-common.c:331)
> +==33321==    by 0x2753E1: qemu_wait_io_event_common (cpus.c:1213)
> +==33321==    by 0x2755CD: qemu_wait_io_event (cpus.c:1253)
> +==33321==    by 0x27596D: qemu_dummy_cpu_thread_fn (cpus.c:1337)
> +==33321==    by 0x725C87: qemu_thread_start (qemu-thread-posix.c:521)
> +==33321==    by 0x4D504E9: start_thread (in /lib64/libpthread-2.26.so)
> +==33321==    by 0x4E72BBD: ??? (in /lib64/libc-2.26.so)
> +==33321== 
> +==33321== Conditional jump or move depends on uninitialised value(s)
> +==33321==    at 0x3AEC74: process_queued_cpu_work (cpus-common.c:331)
> +==33321==    by 0x2753E1: qemu_wait_io_event_common (cpus.c:1213)
> +==33321==    by 0x2755CD: qemu_wait_io_event (cpus.c:1253)
> +==33321==    by 0x27596D: qemu_dummy_cpu_thread_fn (cpus.c:1337)
> +==33321==    by 0x725C87: qemu_thread_start (qemu-thread-posix.c:521)
> +==33321==    by 0x4D504E9: start_thread (in /lib64/libpthread-2.26.so)
> +==33321==    by 0x4E72BBD: ??? (in /lib64/libc-2.26.so)
> +==33321== 
> +==33321== 
> +==33321== HEAP SUMMARY:
> +==33321==     in use at exit: 2,138,442 bytes in 13,935 blocks
> +==33321==   total heap usage: 19,089 allocs, 5,154 frees, 5,187,670 bytes allocated
> +==33321== 
> +==33321== LEAK SUMMARY:
> +==33321==    definitely lost: 0 bytes in 0 blocks
> +==33321==    indirectly lost: 0 bytes in 0 blocks
> +==33321==      possibly lost: 7,150 bytes in 111 blocks
> +==33321==    still reachable: 2,131,292 bytes in 13,824 blocks
> +==33321==         suppressed: 0 bytes in 0 blocks
> +==33321== Rerun with --leak-check=full to see details of leaked memory
> 
> 
>>
>>>
>>>  Thomas
>>>
>>
>> Ciao,
>>
>> Claudio
>>
>>
> 
> A more interesting update is what follows, I think.
> 
> I was able to "fix" the problem shown by the reproducer:
> 
> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>  
>  static bool icount_state_needed(void *opaque)
>  {
> -    return use_icount;
> +    return 0;
>  }
>  
> by just slowing down qcow2_co_pwritev_task_entry with some tight loops,
> without changing any fields between runs (other than the reproducer icount field removal).
> 
> I tried to insert the same slowdown just in savevm.c at the end of save_snapshot, but that does not work; it needs to be in the coroutine.
> 
> Thanks,
> 
> Claudio
> 
> 




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-20 18:24             ` Claudio Fontana
@ 2020-07-21  8:22               ` Claudio Fontana
  2020-07-27 23:09                 ` Bruce Rogers
  0 siblings, 1 reply; 25+ messages in thread
From: Claudio Fontana @ 2020-07-21  8:22 UTC (permalink / raw)
  To: Thomas Huth, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Bruce Rogers, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 9405 bytes --]

On 7/20/20 8:24 PM, Claudio Fontana wrote:
> I have now been able to reproduce this on X86 as well.
> 
> It happens much more rarely, about once every 10 times.
> 
> I will sort out the data and try to make it even more reproducible, then post my findings in detail.
> 
> Overall I proceeded as follows:
> 
> 1) hooked the savevm code to skip all fields with the exception of "s390-skeys". So only s390-skeys are actually saved.
> 
> 2) reimplemented "s390-skeys" in a common implementation in cpus.c, used on both x86 and s390, modeling the behaviour of save/load from hw/s390
> 
> 3) ran ./check -qcow2 267 on both x86 and s390.
> 
> In the case of s390, failure seems to be reproducible 100% of the time.
> On X86, it is as mentioned failing about 10% of the time.
> 
> Ciao,
> 
> Claudio


And here is a small series of two patches that can be used to reproduce the problem.

Clearly, this is not directly related to s390 or to skeys or to icount in particular; it is just an issue that happened to be more visible there.

If you could help with this, please apply the attached patches.

Patch 1 just adds a new "300" iotest. It is way easier to extract the relevant part out of test 267, which does a bit too much in the same file.
Also this allows easier use of valgrind, since it does not "require" anything.

Patch 2 hooks the savevm code to skip all fields during the snapshot with the exception of "s390-skeys", a new artificial field implemented to
model what the real s390-skeys is doing.

After applying patch 1 and patch 2, you can test (also on X86), with:

./check -qcow2 300
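
Since only a fraction of runs fail, looping helps to catch it:

for i in $(seq 1 20); do ./check -qcow2 300 || break; done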

On X86 many runs will be successful, but a certain % of them will instead fail like this:


claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests> ./check -qcow2 300
QEMU          -- "/home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" -nodefaults -display none -accel qtest
QEMU_IMG      -- "/home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/../../qemu-img" 
QEMU_IO       -- "/home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
QEMU_NBD      -- "/home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/../../qemu-nbd" 
IMGFMT        -- qcow2 (compat=1.1)
IMGPROTO      -- file
PLATFORM      -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
TEST_DIR      -- /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/scratch
SOCK_DIR      -- /tmp/tmp.gdcUu3l0SM
SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/socket_scm_helper

300      fail       [10:14:05] [10:14:06]      (last: 0s)    output mismatch (see 300.out.bad)
--- /home/claudio/git/qemu-pristine/qemu/tests/qemu-iotests/300.out     2020-07-21 10:03:54.468104764 +0200
+++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
@@ -12,6 +12,9 @@
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 --        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
 (qemu) loadvm snap0
+Unexpected storage key data: 0
+error while loading state for instance 0x0 of device 's390-skeys'
+Error: Error -22 while loading VM state
 (qemu) quit
 
 *** done
Failures: 300
Failed 1 of 1 iotests


At this point, could somebody more knowledgeable about QCOW2, coroutines and backing files chime in?

Thanks,

Claudio

> 
> 
> On 7/16/20 2:58 PM, Claudio Fontana wrote:
>> Small update on this,
>>
>> On 7/15/20 1:10 PM, Claudio Fontana wrote:
>>> Hi Thomas,
>>>
>>> On 7/14/20 4:35 PM, Thomas Huth wrote:
>>>> On 14/07/2020 16.29, Claudio Fontana wrote:
>>>>> Hello,
>>>>>
>>>>> I have some tiny progress in narrowing down this issue, possibly a qcow2 issue, still unclear,
>>>>> but involving Kevin Wolf and Max Reitz.
>>>>>
>>>>>
>>>>> The reproducer again:
>>>>>
>>>>>> --------------------------------------------cut-------------------------------------------
>>>>>> diff --git a/cpus.c b/cpus.c
>>>>>> index 41d1c5099f..443b88697a 100644
>>>>>> --- a/cpus.c
>>>>>> +++ b/cpus.c
>>>>>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>>>>>  
>>>>>>  static bool icount_state_needed(void *opaque)
>>>>>>  {
>>>>>> -    return use_icount;
>>>>>> +    return 0;
>>>>>>  }
>>>>>>  
>>>>>>  static bool warp_timer_state_needed(void *opaque)
>>>>>> --------------------------------------------cut-------------------------------------------
>>>>>
>>>>> This issue for now appears on s390 only:
>>>>>
>>>>> On s390 hardware, test 267 fails (both kvm and tcg) in the qcow2 backing file part, with broken migration stream data in the s390-skeys vmsave (old style).
>>>> [...]
>>>>> If someone has a good idea let me know - first attempts to reproduce on x86 failed, but maybe more work could lead to it.
>>>>
>>>
>>> small update: in the GOOD case (enough padding added) a qcow_merge() is triggered for the last write of 16202 bytes.
>>> In the BAD case (not enough padding added) a qcow_merge() is not triggered for the last write of 16201 bytes.
>>>
>>> Note: manually flushing with qemu_fflush in s390-skeys vmsave also works (maybe got lost in the noise).
>>>
>>>
>>>> Two questions:
>>>>
>>>> 1) Can you also reproduce the issue manually, without running iotest
>>>> 267? ... I tried, but so far I failed.
>>>
>>> Thanks for the suggestion, will try.
>>
>> Currently trying to reproduce the issue manually, in an environment similar to that of the test;
>> at the moment I have not been able to.
>>
>> Not very familiar with s390,
>> I've been running with 
>>
>> export QEMU=/home/cfontana/qemu-build/s390x-softmmu/qemu-system-s390x
>>
>> $QEMU -nographic -monitor stdio -nodefaults -no-shutdown FILENAME
>>
>> where FILENAME is the qcow2 produced by the test.
>>
>> Let me know if you have a suggestion on how to set up something simple properly.
>>
>>
>>>
>>>>
>>>> 2) Since all the information so far sounds like the problem could be
>>>> elsewhere in the code, and the skeys just catch it by accident ... have
>>>> you tried running with valgrind? Maybe it catches something useful?
>>>
>>> Nothing yet, but will fiddle with the options a bit more.
>>
>> Only thing I have seen so far:
>>
>>
>> +==33321== 
>> +==33321== Warning: client switching stacks?  SP change: 0x1ffeffe5e8 --> 0x5d9cf60
>> +==33321==          to suppress, use: --max-stackframe=137324009096 or greater
>> +==33321== Warning: client switching stacks?  SP change: 0x5d9cd18 --> 0x1ffeffe5e8
>> +==33321==          to suppress, use: --max-stackframe=137324009680 or greater
>> +==33321== Warning: client switching stacks?  SP change: 0x1ffeffe8b8 --> 0x5d9ce58
>> +==33321==          to suppress, use: --max-stackframe=137324010080 or greater
>> +==33321==          further instances of this message will not be shown.
>> +==33321== Thread 4:
>> +==33321== Conditional jump or move depends on uninitialised value(s)
>> +==33321==    at 0x3AEC70: process_queued_cpu_work (cpus-common.c:331)
>> +==33321==    by 0x2753E1: qemu_wait_io_event_common (cpus.c:1213)
>> +==33321==    by 0x2755CD: qemu_wait_io_event (cpus.c:1253)
>> +==33321==    by 0x27596D: qemu_dummy_cpu_thread_fn (cpus.c:1337)
>> +==33321==    by 0x725C87: qemu_thread_start (qemu-thread-posix.c:521)
>> +==33321==    by 0x4D504E9: start_thread (in /lib64/libpthread-2.26.so)
>> +==33321==    by 0x4E72BBD: ??? (in /lib64/libc-2.26.so)
>> +==33321== 
>> +==33321== Conditional jump or move depends on uninitialised value(s)
>> +==33321==    at 0x3AEC74: process_queued_cpu_work (cpus-common.c:331)
>> +==33321==    by 0x2753E1: qemu_wait_io_event_common (cpus.c:1213)
>> +==33321==    by 0x2755CD: qemu_wait_io_event (cpus.c:1253)
>> +==33321==    by 0x27596D: qemu_dummy_cpu_thread_fn (cpus.c:1337)
>> +==33321==    by 0x725C87: qemu_thread_start (qemu-thread-posix.c:521)
>> +==33321==    by 0x4D504E9: start_thread (in /lib64/libpthread-2.26.so)
>> +==33321==    by 0x4E72BBD: ??? (in /lib64/libc-2.26.so)
>> +==33321== 
>> +==33321== 
>> +==33321== HEAP SUMMARY:
>> +==33321==     in use at exit: 2,138,442 bytes in 13,935 blocks
>> +==33321==   total heap usage: 19,089 allocs, 5,154 frees, 5,187,670 bytes allocated
>> +==33321== 
>> +==33321== LEAK SUMMARY:
>> +==33321==    definitely lost: 0 bytes in 0 blocks
>> +==33321==    indirectly lost: 0 bytes in 0 blocks
>> +==33321==      possibly lost: 7,150 bytes in 111 blocks
>> +==33321==    still reachable: 2,131,292 bytes in 13,824 blocks
>> +==33321==         suppressed: 0 bytes in 0 blocks
>> +==33321== Rerun with --leak-check=full to see details of leaked memory
>>
>>
>>>
>>>>
>>>>  Thomas
>>>>
>>>
>>> Ciao,
>>>
>>> Claudio
>>>
>>>
>>
>> A more interesting update is what follows, I think.
>>
>> I was able to "fix" the problem shown by the reproducer:
>>
>> @@ -643,7 +643,7 @@ static void qemu_account_warp_timer(void)
>>  
>>  static bool icount_state_needed(void *opaque)
>>  {
>> -    return use_icount;
>> +    return 0;
>>  }
>>  
>> by just slowing down qcow2_co_pwritev_task_entry with some tight loops,
>> without changing any fields between runs (other than the reproducer icount field removal).
>>
>> I tried to insert the same slowdown just in savevm.c at the end of save_snapshot, but that does not work; it needs to be in the coroutine.
>>
>> Thanks,
>>
>> Claudio
>>
>>
> 
> 


[-- Attachment #2: 0002-XXX-reproducer-for-backing-file-migration-test-issue.patch --]
[-- Type: text/x-patch, Size: 6033 bytes --]

From 36538b4dfde5ccaa663d3d7c738fdef595027cb6 Mon Sep 17 00:00:00 2001
From: Claudio Fontana <cfontana@suse.de>
Date: Tue, 21 Jul 2020 10:10:29 +0200
Subject: [PATCH 2/2] XXX reproducer for backing file migration test issue

do not commit. During a snapshot save, this changeset throws away
all fields from the stream with the exception of "s390-skeys",
which is added as a field for all architectures (renaming the
real s390-skeys to something else).

This allows comparing the reproducibility of the snapshot
issue across all architectures.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 hw/s390x/s390-skeys.c |  2 +-
 migration/savevm.c    | 31 +++++++++++++++++++++++
 softmmu/cpus.c        | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index db2f49cb27..41116bdcc1 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -389,7 +389,7 @@ static inline void s390_skeys_set_migration_enabled(Object *obj, bool value,
     ss->migration_enabled = value;
 
     if (ss->migration_enabled) {
-        register_savevm_live(TYPE_S390_SKEYS, 0, 1,
+        register_savevm_live("s390-oldkeys", 0, 1,
                              &savevm_s390_storage_keys, ss);
     } else {
         unregister_savevm(VMSTATE_IF(ss), TYPE_S390_SKEYS, ss);
diff --git a/migration/savevm.c b/migration/savevm.c
index 45c9dd9d8a..b63af07ccd 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1163,6 +1163,11 @@ void qemu_savevm_state_setup(QEMUFile *f)
 
     trace_savevm_state_setup();
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+
+        if (strcmp(se->idstr, "s390-skeys") != 0) {
+            /* only save keys */
+            continue;
+        }
         if (!se->ops || !se->ops->save_setup) {
             continue;
         }
@@ -1194,6 +1199,11 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
     trace_savevm_state_resume_prepare();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+
+        if (strcmp(se->idstr, "s390-skeys") != 0) {
+            /* only save keys */
+            continue;
+        }
         if (!se->ops || !se->ops->resume_prepare) {
             continue;
         }
@@ -1224,6 +1234,11 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
 
     trace_savevm_state_iterate();
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+
+        if (strcmp(se->idstr, "s390-skeys") != 0) {
+            /* only save keys */
+            continue;
+        }
         if (!se->ops || !se->ops->save_live_iterate) {
             continue;
         }
@@ -1325,6 +1340,12 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
     int ret;
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+
+        if (strcmp(se->idstr, "s390-skeys") != 0) {
+            /* only save keys */
+            continue;
+        }
+
         if (!se->ops ||
             (in_postcopy && se->ops->has_postcopy &&
              se->ops->has_postcopy(se->opaque)) ||
@@ -1368,6 +1389,11 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
     json_start_array(vmdesc, "devices");
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
 
+        if (strcmp(se->idstr, "s390-skeys") != 0) {
+            /* only save keys */
+            continue;
+        }
+
         if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
             continue;
         }
@@ -1583,6 +1609,11 @@ int qemu_save_device_state(QEMUFile *f)
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
         int ret;
 
+        if (strcmp(se->idstr, "s390-skeys") != 0) {
+            /* only save keys */
+            continue;
+        }
+
         if (se->is_ram) {
             continue;
         }
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index a802e899ab..ae3ad0878c 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -63,6 +63,9 @@
 
 #include "sysemu/cpu-throttle.h"
 
+#include "migration/qemu-file-types.h"
+#include "migration/register.h"
+
 #ifdef CONFIG_LINUX
 
 #include <sys/prctl.h>
@@ -732,11 +735,76 @@ static const VMStateDescription vmstate_timers = {
     }
 };
 
+static void s390_storage_keys_save(QEMUFile *f, void *opaque)
+{
+    uint64_t eos = 1, data = 2;
+    uint8_t *buf;
+
+    buf = g_try_malloc(32768);
+    if (!buf) {
+        error_report("storage key save could not allocate memory");
+        goto end_stream;
+    }
+
+    qemu_put_be64(f, data);
+    qemu_put_be64(f, 32768);
+
+    memset(buf, 0, 32768);
+    qemu_put_buffer(f, buf, 32768);
+    g_free(buf);
+
+ end_stream:
+    qemu_put_be64(f, eos);
+}
+
+static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
+{
+    int ret = 0;
+    uint64_t data;
+    uint8_t *buf;
+
+    while (ret == 0) {
+        data = qemu_get_be64(f);
+
+        switch (data) {
+        case 2: {
+            const uint64_t total_count = qemu_get_be64(f);
+            buf = g_try_malloc(32768);
+            if (!buf) {
+                error_report("storage key load could not allocate memory");
+                ret = -ENOMEM;
+                break;
+            }
+            qemu_get_buffer(f, buf, total_count);
+            g_free(buf);
+            break;
+        }
+        case 1: {
+            /* normal exit */
+            return 0;
+        }
+        default:
+            error_report("Unexpected storage key data: %#lx", data);
+            ret = -EINVAL;
+        }
+    }
+    return ret;
+}
+
+static SaveVMHandlers savevm_s390_storage_keys = {
+    .save_state = s390_storage_keys_save,
+    .load_state = s390_storage_keys_load,
+};
+
 void cpu_ticks_init(void)
 {
     seqlock_init(&timers_state.vm_clock_seqlock);
     qemu_spin_init(&timers_state.vm_clock_lock);
     vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+
+    register_savevm_live("s390-skeys", 0, 1,
+                         &savevm_s390_storage_keys, NULL);
+
     cpu_throttle_init();
 }
 
-- 
2.16.4


[-- Attachment #3: 0001-iotests-300-new-test-for-catching-backing-file-migra.patch --]
[-- Type: text/x-patch, Size: 4982 bytes --]

From 807eefab80e06244c973098a9a012fe3871d19b8 Mon Sep 17 00:00:00 2001
From: Claudio Fontana <cfontana@suse.de>
Date: Tue, 21 Jul 2020 10:06:37 +0200
Subject: [PATCH 1/2] iotests/300: new test for catching backing file migration
 issue

this test has been split off from 267, checking only
the backing file, and not "requiring" any components,
so as to make it easier to debug and run under valgrind.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 tests/qemu-iotests/300     | 100 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/300.out |  17 ++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 118 insertions(+)
 create mode 100755 tests/qemu-iotests/300
 create mode 100644 tests/qemu-iotests/300.out

diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
new file mode 100755
index 0000000000..aec01c6691
--- /dev/null
+++ b/tests/qemu-iotests/300
@@ -0,0 +1,100 @@
+#!/usr/bin/env bash
+#
+# Test which nodes are involved in internal snapshots
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=kwolf@redhat.com
+
+# adapted by Claudio Fontana <cfontana@suse.de>
+#
+# this is a standalone extract from 267 focused on the first backing file
+# testcase, which failed consistently on s390 and less consistently
+# on X86 under particular conditions, depending on the number of fields
+# in the migration stream.
+#
+# Also, in contrast to 267, we do not "require" anything,
+# which allows running under valgrind (it otherwise breaks).
+#
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# Internal snapshots are (currently) impossible with refcount_bits=1,
+# and generally impossible with external data files
+_unsupported_imgopts 'refcount_bits=1[^0-9]' data_file
+
+do_run_qemu()
+{
+    echo Testing: "$@"
+    (
+        if ! test -t 0; then
+            while read cmd; do
+                echo $cmd
+            done
+        fi
+        echo quit
+    ) | $QEMU -nographic -monitor stdio -nodefaults "$@"
+    echo
+}
+
+run_qemu()
+{
+    do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qemu | _filter_hmp |
+        _filter_generated_node_ids | _filter_imgfmt | _filter_vmstate_size
+}
+
+size=128M
+
+run_test()
+{
+    _make_test_img -b "$BACKING_FILE" -F $IMGFMT $size
+    printf "savevm snap0\ninfo snapshots\nloadvm snap0\n" | run_qemu "$@" | _filter_date
+}
+
+echo
+echo "=== -blockdev with a backing file ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img $size
+
+BACKING_FILE="$TEST_IMG.base" \
+run_test -blockdev driver=file,filename="$TEST_IMG.base",node-name=backing-file \
+         -blockdev driver=file,filename="$TEST_IMG",node-name=file \
+         -blockdev driver=$IMGFMT,file=file,backing=backing-file,node-name=fmt
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/300.out b/tests/qemu-iotests/300.out
new file mode 100644
index 0000000000..9904043638
--- /dev/null
+++ b/tests/qemu-iotests/300.out
@@ -0,0 +1,17 @@
+QA output created by 300
+
+=== -blockdev with a backing file ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=134217728
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
+Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT.base,node-name=backing-file -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file -blockdev driver=IMGFMT,file=file,backing=backing-file,node-name=fmt
+QEMU X.Y.Z monitor - type 'help' for more information
+(qemu) savevm snap0
+(qemu) info snapshots
+List of snapshots present on all disks:
+ID        TAG                 VM SIZE                DATE       VM CLOCK
+--        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
+(qemu) loadvm snap0
+(qemu) quit
+
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 1d0252e1f0..cb66b5a58c 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -307,4 +307,5 @@
 296 rw
 297 meta
 299 auto quick
+300 rw auto quick snapshot backing
 301 backing quick
-- 
2.16.4



* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-21  8:22               ` Claudio Fontana
@ 2020-07-27 23:09                 ` Bruce Rogers
  2020-07-28  8:15                   ` Vladimir Sementsov-Ogievskiy
                                     ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Bruce Rogers @ 2020-07-27 23:09 UTC (permalink / raw)
  To: Claudio Fontana, Thomas Huth, Paolo Bonzini, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, vsementsov, Stefan Hajnoczi

On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
> On 7/20/20 8:24 PM, Claudio Fontana wrote:
> > I have now been able to reproduce this on X86 as well.
> > 
> > It happens much more rarely, about once every 10 times.
> > 
> > I will sort out the data and try to make it even more reproducible,
> > then post my findings in detail.
> > 
> > Overall I proceeded as follows:
> > 
> > 1) hooked the savevm code to skip all fields with the exception of
> > "s390-skeys". So only s390-skeys are actually saved.
> > 
> > 2) reimplemented "s390-skeys" in a common implementation in cpus.c,
> > used on both x86 and s390, modeling the behaviour of save/load from
> > hw/s390
> > 
> > 3) ran ./check -qcow2 267 on both x86 and s390.
> > 
> > In the case of s390, failure seems to be reproducible 100% of the
> > time.
> > On X86, it is as mentioned failing about 10% of the time.
> > 
> > Ciao,
> > 
> > Claudio
> 
> And here is a small series of two patches that can be used to
> reproduce the problem.
> 
> Clearly, this is not directly related to s390 or to skeys or to
> icount in particular; it is just an issue that happened to be more
> visible there.
> 
> If you could help with this, please apply the attached patches.
> 
> Patch 1 just adds a new "300" iotest. It is way easier to extract the
> relevant part out of test 267, which does a bit too much in the same
> file.
> Also this allows easier use of valgrind, since it does not "require"
> anything.
> 
> Patch 2 hooks the savevm code to skip all fields during the snapshot
> with the exception of "s390-skeys", a new artificial field
> implemented to
> model what the real s390-skeys is doing.
> 
> After applying patch 1 and patch 2, you can test (also on X86), with:
> 
> ./check -qcow2 300
> 
> On X86 many runs will be successful, but a certain % of them will
> instead fail like this:
> 
> 
> claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests> 
> ./check -qcow2 300
> QEMU          -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
> -nodefaults -display none -accel qtest
> QEMU_IMG      -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../qemu-img" 
> QEMU_IO       -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio
> threads -f qcow2
> QEMU_NBD      -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../qemu-nbd" 
> IMGFMT        -- qcow2 (compat=1.1)
> IMGPROTO      -- file
> PLATFORM      -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
> TEST_DIR      -- /home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/scratch
> SOCK_DIR      -- /tmp/tmp.gdcUu3l0SM
> SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/socket_scm_helper
> 
> 300      fail       [10:14:05] [10:14:06]      (last: 0s)    output
> mismatch (see 300.out.bad)
> --- /home/claudio/git/qemu-pristine/qemu/tests/qemu-
> iotests/300.out     2020-07-21 10:03:54.468104764 +0200
> +++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-
> iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
> @@ -12,6 +12,9 @@
>  ID        TAG                 VM SIZE                DATE       VM
> CLOCK
>  --        snap0                  SIZE yyyy-mm-dd
> hh:mm:ss   00:00:00.000
>  (qemu) loadvm snap0
> +Unexpected storage key data: 0
> +error while loading state for instance 0x0 of device 's390-skeys'
> +Error: Error -22 while loading VM state
>  (qemu) quit
>  
>  *** done
> Failures: 300
> Failed 1 of 1 iotests
> 
> 
> At this point, could somebody more knowledgeable about QCOW2, coroutines
> and backing files chime in?
> 
<trim>

I used the reproducer you provided here to do a git bisect, as I assume
whatever is now broken wasn't always broken, and it pointed to the
following commit:

commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date:   Tue Jun 4 19:15:13 2019 +0300

    block/qcow2: implement .bdrv_co_preadv_part

Indeed, I am currently able to reliably reproduce the issue with this
commit applied, and not reproduce it without it.
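
For reference, the bisect loop is simple (a sketch; the good starting point, v4.0.0 here, is just an assumed-good older tag):

git bisect start master v4.0.0
# at each step, with the two reproducer patches applied on top:
(cd tests/qemu-iotests && ./check -qcow2 300)
# then mark the step "git bisect good" or "git bisect bad"; since the
# x86 failure is intermittent, repeat a step several times before
# declaring it good.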

That said, I've not been able to identify exactly what is going wrong.
I'm fairly confident the savevm data is correctly written out, but on
the loadvm side, somehow the last part of the s390 data is not
correctly read in (it is in the second pass through the while
loop in qcow2_co_preadv_part() that this happens).

If anyone familiar with this code can have a look or provide some
pointers, it would be much appreciated.

Adding commit author to CC.

Thanks,

Bruce




* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-27 23:09                 ` Bruce Rogers
@ 2020-07-28  8:15                   ` Vladimir Sementsov-Ogievskiy
  2020-07-28  8:43                   ` Vladimir Sementsov-Ogievskiy
  2020-07-28 11:10                   ` Max Reitz
  2 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-28  8:15 UTC (permalink / raw)
  To: Bruce Rogers, Claudio Fontana, Thomas Huth, Paolo Bonzini,
	Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

28.07.2020 02:09, Bruce Rogers wrote:
> On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
>> On 7/20/20 8:24 PM, Claudio Fontana wrote:
>>> I have now been able to reproduce this on X86 as well.
>>>
>>> It happens much more rarely, about once every 10 times.
>>>
>>> I will sort out the data and try to make it even more reproducible,
>>> then post my findings in detail.
>>>
>>> Overall I proceeded as follows:
>>>
>>> 1) hooked the savevm code to skip all fields with the exception of
>>> "s390-skeys". So only s390-skeys are actually saved.
>>>
>>> 2) reimplemented "s390-skeys" in a common implementation in cpus.c,
>>> used on both x86 and s390, modeling the behaviour of save/load from
>>> hw/s390
>>>
>>> 3) ran ./check -qcow2 267 on both x86 and s390.
>>>
>>> In the case of s390, failure seems to be reproducible 100% of the
>>> time.
>>> On X86, it is as mentioned failing about 10% of the time.
>>>
>>> Ciao,
>>>
>>> Claudio
>>
>> And here is a small series of two patches that can be used to
>> reproduce the problem.
>>
>> Clearly, this is not directly related to s390 or to skeys or to
>> icount in particular; it is just an issue that happened to be more
>> visible there.
>>
>> If you could help with this, please apply the attached patches.
>>
>> Patch 1 just adds a new "300" iotest. It is way easier to extract the
>> relevant part out of test 267, which does a bit too much in the same
>> file.
>> Also this allows easier use of valgrind, since it does not "require"
>> anything.
>>
>> Patch 2 hooks the savevm code to skip all fields during the snapshot
>> with the exception of "s390-skeys", a new artificial field
>> implemented to
>> model what the real s390-skeys is doing.
>>
>> After applying patch 1 and patch 2, you can test (also on X86), with:
>>
>> ./check -qcow2 300
>>
>> On X86 many runs will be successful, but a certain % of them will
>> instead fail like this:
>>
>>
>> claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests>
>> ./check -qcow2 300
>> QEMU          -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
>> -nodefaults -display none -accel qtest
>> QEMU_IMG      -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-img"
>> QEMU_IO       -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio
>> threads -f qcow2
>> QEMU_NBD      -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-nbd"
>> IMGFMT        -- qcow2 (compat=1.1)
>> IMGPROTO      -- file
>> PLATFORM      -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
>> TEST_DIR      -- /home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/scratch
>> SOCK_DIR      -- /tmp/tmp.gdcUu3l0SM
>> SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/socket_scm_helper
>>
>> 300      fail       [10:14:05] [10:14:06]      (last: 0s)    output
>> mismatch (see 300.out.bad)
>> --- /home/claudio/git/qemu-pristine/qemu/tests/qemu-
>> iotests/300.out     2020-07-21 10:03:54.468104764 +0200
>> +++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-
>> iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
>> @@ -12,6 +12,9 @@
>>   ID        TAG                 VM SIZE                DATE       VM
>> CLOCK
>>   --        snap0                  SIZE yyyy-mm-dd
>> hh:mm:ss   00:00:00.000
>>   (qemu) loadvm snap0
>> +Unexpected storage key data: 0
>> +error while loading state for instance 0x0 of device 's390-skeys'
>> +Error: Error -22 while loading VM state
>>   (qemu) quit
>>   
>>   *** done
>> Failures: 300
>> Failed 1 of 1 iotests
>>
>>
>> At this point, could somebody more knowledgeable about QCOW2, coroutines
>> and backing files chime in?
>>
> <trim>
> 
> I used the reproducer you provided here to do a git bisect, as I assume
> whatever is now broken wasn't always broken, and it pointed to the
> following commit:
> 
> commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Date:   Tue Jun 4 19:15:13 2019 +0300
> 
>      block/qcow2: implement .bdrv_co_preadv_part
> 
> Indeed, I am currently able to reliably reproduce the issue with this
> commit applied, and not reproduce it without it.
> 
> That said, I've not been able to identify exactly what is going wrong.
> I'm fairly confident the savevm data is correctly written out, but on
> the loadvm side, somehow the last part of the s390 data is not
> correctly read in (it is in the second pass through the while
> loop in qcow2_co_preadv_part() that this happens).
> 
> If anyone familiar with this code can have a look or provide some
> pointers, it would be much appreciated.
> 
> Adding commit author to CC.

Ooops :) I'll look at it of course.


-- 
Best regards,
Vladimir



* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-27 23:09                 ` Bruce Rogers
  2020-07-28  8:15                   ` Vladimir Sementsov-Ogievskiy
@ 2020-07-28  8:43                   ` Vladimir Sementsov-Ogievskiy
  2020-07-28 13:23                     ` Bruce Rogers
  2020-07-28 11:10                   ` Max Reitz
  2 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-28  8:43 UTC (permalink / raw)
  To: Bruce Rogers, Claudio Fontana, Thomas Huth, Paolo Bonzini,
	Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf, Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

28.07.2020 02:09, Bruce Rogers wrote:
> On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
>> On 7/20/20 8:24 PM, Claudio Fontana wrote:
>>> I have now been able to reproduce this on X86 as well.
>>>
>>> It happens much more rarely, about once every 10 times.
>>>
>>> I will sort out the data and try to make it even more reproducible,
>>> then post my findings in detail.
>>>
>>> Overall I proceeded as follows:
>>>
>>> 1) hooked the savevm code to skip all fields with the exception of
>>> "s390-skeys". So only s390-skeys are actually saved.
>>>
>>> 2) reimplemented "s390-skeys" in a common implementation in cpus.c,
>>> used on both x86 and s390, modeling the behaviour of save/load from
>>> hw/s390
>>>
>>> 3) ran ./check -qcow2 267 on both x86 and s390.
>>>
>>> In the case of s390, failure seems to be reproducible 100% of the
>>> time.
>>> On X86, it is as mentioned failing about 10% of the time.
>>>
>>> Ciao,
>>>
>>> Claudio
>>
>> And here is a small series of two patches that can be used to
>> reproduce the problem.
>>
>> Clearly, this is not directly related to s390 or to skeys or to
>> icount in particular; it is just an issue that happened to be more
>> visible there.
>>
>> If you could help with this, please apply the attached patches.
>>
>> Patch 1 just adds a new "300" iotest. It is way easier to extract the
>> relevant part out of test 267, which does a bit too much in the same
>> file.
>> Also this allows easier use of valgrind, since it does not "require"
>> anything.
>>
>> Patch 2 hooks the savevm code to skip all fields during the snapshot
>> with the exception of "s390-skeys", a new artificial field
>> implemented to
>> model what the real s390-skeys is doing.
>>
>> After applying patch 1 and patch 2, you can test (also on X86), with:
>>
>> ./check -qcow2 300
>>
>> On X86 many runs will be successful, but a certain % of them will
>> instead fail like this:
>>
>>
>> claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests>
>> ./check -qcow2 300
>> QEMU          -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
>> -nodefaults -display none -accel qtest
>> QEMU_IMG      -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-img"
>> QEMU_IO       -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio
>> threads -f qcow2
>> QEMU_NBD      -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-nbd"
>> IMGFMT        -- qcow2 (compat=1.1)
>> IMGPROTO      -- file
>> PLATFORM      -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
>> TEST_DIR      -- /home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/scratch
>> SOCK_DIR      -- /tmp/tmp.gdcUu3l0SM
>> SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/socket_scm_helper
>>
>> 300      fail       [10:14:05] [10:14:06]      (last: 0s)    output
>> mismatch (see 300.out.bad)
>> --- /home/claudio/git/qemu-pristine/qemu/tests/qemu-
>> iotests/300.out     2020-07-21 10:03:54.468104764 +0200
>> +++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-
>> iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
>> @@ -12,6 +12,9 @@
>>   ID        TAG                 VM SIZE                DATE       VM
>> CLOCK
>>   --        snap0                  SIZE yyyy-mm-dd
>> hh:mm:ss   00:00:00.000
>>   (qemu) loadvm snap0
>> +Unexpected storage key data: 0
>> +error while loading state for instance 0x0 of device 's390-skeys'
>> +Error: Error -22 while loading VM state
>>   (qemu) quit
>>   
>>   *** done
>> Failures: 300
>> Failed 1 of 1 iotests
>>
>>
>> At this point, could somebody more knowledgeable about QCOW2, coroutines
>> and backing files chime in?
>>
> <trim>
> 
> I used the reproducer you provided here to do a git bisect, as I assume
> whatever is now broken wasn't always broken, and it pointed to the
> following commit:
> 
> commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Date:   Tue Jun 4 19:15:13 2019 +0300
> 
>      block/qcow2: implement .bdrv_co_preadv_part
> 
> Indeed, I am currently able to reliably reproduce the issue with this
> commit applied, and not reproduce it without it.

Do you have a reproducer on top of df893d25ceea? I tried to apply the patches on it; there are a lot of conflicts, and after resolving them the test does not work for me.

What do you mean by "not reproduce" without df893d25ceea3c0d? I also tried "git rebase df893d25ceea3c0d --onto df893d25ceea3c0d^" after applying the reproducer patches on master, but again there are conflicts.

How did you apply

commit 88f468e54696cd9ffc28ecf96f18f04125402541
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date:   Mon Sep 16 20:53:22 2019 +0300

     block/qcow2: refactor qcow2_co_preadv_part

and, more importantly, further

commit d710cf575ad5fb3ab329204620de45bfe50caa53
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date:   Mon Sep 16 20:53:24 2019 +0300

     block/qcow2: introduce parallel subrequest handling in read and write

without df893d25ceea3c0d?


I can reproduce the bug on master with Claudio's two patches, but how do I reproduce the relation with df893d25ceea3c0d?

> 
> That said, I've not been able to identify exactly what is going wrong.
> I'm fairly confident the savevm data is correctly written out, but on
> the loadvm side, somehow the last part of the s390 data is not
> correctly read in (it is in the second pass through the while
> loop in qcow2_co_preadv_part() that this happens).
> 
> If anyone familiar with this code can have a look or provide some
> pointers, it would be much appreciated.
> 





-- 
Best regards,
Vladimir



* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-27 23:09                 ` Bruce Rogers
  2020-07-28  8:15                   ` Vladimir Sementsov-Ogievskiy
  2020-07-28  8:43                   ` Vladimir Sementsov-Ogievskiy
@ 2020-07-28 11:10                   ` Max Reitz
  2020-07-28 11:27                     ` Vladimir Sementsov-Ogievskiy
                                       ` (2 more replies)
  2 siblings, 3 replies; 25+ messages in thread
From: Max Reitz @ 2020-07-28 11:10 UTC (permalink / raw)
  To: Bruce Rogers, Claudio Fontana, Thomas Huth, Paolo Bonzini,
	Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, vsementsov, Stefan Hajnoczi


[-- Attachment #1.1.1: Type: text/plain, Size: 4986 bytes --]

On 28.07.20 01:09, Bruce Rogers wrote:
> On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
>> On 7/20/20 8:24 PM, Claudio Fontana wrote:
>>> I have now been able to reproduce this on X86 as well.
>>>
>>> It happens much more rarely, about once every 10 times.
>>>
>>> I will sort out the data and try to make it even more reproducible,
>>> then post my findings in detail.
>>>
>>> Overall I proceeded as follows:
>>>
>>> 1) hooked the savevm code to skip all fields with the exception of
>>> "s390-skeys". So only s390-skeys are actually saved.
>>>
>>> 2) reimplemented "s390-skeys" in a common implementation in cpus.c,
>>> used on both x86 and s390, modeling the behaviour of save/load from
>>> hw/s390
>>>
>>> 3) ran ./check -qcow2 267 on both x86 and s390.
>>>
>>> In the case of s390, failure seems to be reproducible 100% of the
>>> time.
>>> On X86, it is as mentioned failing about 10% of the time.
>>>
>>> Ciao,
>>>
>>> Claudio
>>
>> And here is a small series of two patches that can be used to
>> reproduce the problem.
>>
>> Clearly, this is not directly related to s390 or to skeys or to
>> icount in particular; it is just an issue that happened to be more
>> visible there.
>>
>> If you could help with this, please apply the attached patches.
>>
>> Patch 1 just adds a new "300" iotest. It is way easier to extract the
>> relevant part out of test 267, which does a bit too much in the same
>> file.
>> Also this allows easier use of valgrind, since it does not "require"
>> anything.
>>
>> Patch 2 hooks the savevm code to skip all fields during the snapshot
>> with the exception of "s390-skeys", a new artificial field
>> implemented to
>> model what the real s390-skeys is doing.
>>
>> After applying patch 1 and patch 2, you can test (also on X86), with:
>>
>> ./check -qcow2 300
>>
>> On X86 many runs will be successful, but a certain % of them will
>> instead fail like this:
>>
>>
>> claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests> 
>> ./check -qcow2 300
>> QEMU          -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
>> -nodefaults -display none -accel qtest
>> QEMU_IMG      -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-img" 
>> QEMU_IO       -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio
>> threads -f qcow2
>> QEMU_NBD      -- "/home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/../../qemu-nbd" 
>> IMGFMT        -- qcow2 (compat=1.1)
>> IMGPROTO      -- file
>> PLATFORM      -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
>> TEST_DIR      -- /home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/scratch
>> SOCK_DIR      -- /tmp/tmp.gdcUu3l0SM
>> SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-
>> build/tests/qemu-iotests/socket_scm_helper
>>
>> 300      fail       [10:14:05] [10:14:06]      (last: 0s)    output
>> mismatch (see 300.out.bad)
>> --- /home/claudio/git/qemu-pristine/qemu/tests/qemu-
>> iotests/300.out     2020-07-21 10:03:54.468104764 +0200
>> +++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-
>> iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
>> @@ -12,6 +12,9 @@
>>  ID        TAG                 VM SIZE                DATE       VM
>> CLOCK
>>  --        snap0                  SIZE yyyy-mm-dd
>> hh:mm:ss   00:00:00.000
>>  (qemu) loadvm snap0
>> +Unexpected storage key data: 0
>> +error while loading state for instance 0x0 of device 's390-skeys'
>> +Error: Error -22 while loading VM state
>>  (qemu) quit
>>  
>>  *** done
>> Failures: 300
>> Failed 1 of 1 iotests
>>
>>
>> At this point, could somebody more knowledgeable about QCOW2, coroutines
>> and backing files chime in?
>>
> <trim>
> 
> I used the reproducer you provided here to do a git bisect, as I assume
> whatever is now broken wasn't always broken, and it pointed to the
> following commit:
> 
> commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Date:   Tue Jun 4 19:15:13 2019 +0300
> 
>     block/qcow2: implement .bdrv_co_preadv_part
> 
> Indeed, I am currently able to reliably reproduce the issue with this
> commit applied, and not reproduce it without it.
> 
> That said, I've not been able to identify exactly what is going wrong.
> I'm fairly confident the savevm data is correctly written out, but on
> the loadvm side, somehow the last part of the s390 data is not
> correctly read in (it is in the second pass through the while
> loop in qcow2_co_preadv_part() that this happens).
> 
> If anyone familiar with this code can have a look or provide some
> pointers, it would be much appreciated.

Thanks to both of you for the investigation.  Does the attached patch help?
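
In short: the bytes_remaining slow path in bdrv_aligned_preadv() indexed the qiov with just "bytes - bytes_remaining", dropping the qiov_offset at which this request's data starts, so with a nonzero qiov_offset the tail of a read lands too early in the destination buffer. A toy illustration (not QEMU code):

#include <string.h>

/* reading the tail of a request into dst */
static void tail_buggy(char *dst, size_t qiov_offset,
                       size_t bytes, size_t remaining, const char *src)
{
    /* BUG: qiov_offset is ignored; the tail overwrites earlier data */
    memcpy(dst + (bytes - remaining), src, remaining);
}

static void tail_fixed(char *dst, size_t qiov_offset,
                       size_t bytes, size_t remaining, const char *src)
{
    /* FIX: index relative to where this request's data starts */
    memcpy(dst + qiov_offset + (bytes - remaining), src, remaining);
}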

Max

[-- Attachment #1.1.2: 0001-block-Fix-bdrv_aligned_preadv-with-qiov_offset.patch --]
[-- Type: text/x-patch, Size: 1250 bytes --]

From 3df2df667d561dfe423e4ee5784e83943205974a Mon Sep 17 00:00:00 2001
From: Max Reitz <mreitz@redhat.com>
Date: Tue, 28 Jul 2020 13:02:31 +0200
Subject: [PATCH] block: Fix bdrv_aligned_preadv() with qiov_offset

---
 block/io.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index b6564e34c5..bd45d739c7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1524,12 +1524,13 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
             assert(num);
 
             ret = bdrv_driver_preadv(bs, offset + bytes - bytes_remaining,
-                                     num, qiov, bytes - bytes_remaining, 0);
+                                     num, qiov,
+                                     qiov_offset + bytes - bytes_remaining, 0);
             max_bytes -= num;
         } else {
             num = bytes_remaining;
-            ret = qemu_iovec_memset(qiov, bytes - bytes_remaining, 0,
-                                    bytes_remaining);
+            ret = qemu_iovec_memset(qiov, qiov_offset + bytes - bytes_remaining,
+                                    0, bytes_remaining);
         }
         if (ret < 0) {
             goto out;
-- 
2.26.2


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28 11:10                   ` Max Reitz
@ 2020-07-28 11:27                     ` Vladimir Sementsov-Ogievskiy
  2020-07-28 11:33                     ` Vladimir Sementsov-Ogievskiy
  2020-07-28 12:47                     ` Claudio Fontana
  2 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-28 11:27 UTC (permalink / raw)
  To: Max Reitz, Bruce Rogers, Claudio Fontana, Thomas Huth,
	Paolo Bonzini, Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

28.07.2020 14:10, Max Reitz wrote:
> On 28.07.20 01:09, Bruce Rogers wrote:
>> On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
>>> On 7/20/20 8:24 PM, Claudio Fontana wrote:
>>>> I have now been able to reproduce this on X86 as well.
>>>>
>>>> It happens much more rarely, about once every 10 times.
>>>>
>>>> I will sort out the data and try to make it even more reproducible,
>>>> then post my findings in detail.
>>>>
>>>> Overall I proceeded as follows:
>>>>
>>>> 1) hooked the savevm code to skip all fields with the exception of
>>>> "s390-skeys". So only s390-skeys are actually saved.
>>>>
>>>> 2) reimplemented "s390-skeys" in a common implementation in cpus.c,
>>>> used on both x86 and s390, modeling the behaviour of save/load from
>>>> hw/s390
>>>>
>>>> 3) ran ./check -qcow2 267 on both x86 and s390.
>>>>
>>>> In the case of s390, failure seems to be reproducible 100% of the
>>>> time.
>>>> On X86, it is as mentioned failing about 10% of the time.
>>>>
>>>> Ciao,
>>>>
>>>> Claudio
>>>
>>> And here is a small series of two patches that can be used to
>>> reproduce the problem.
>>>
>>> Clearly, this is not directly related to s390 or to skeys or to
>>> icount in particular, it is just an issue that happened to be more
>>> visible there.
>>>
>>> If you could help with this, please apply the attached patches.
>>>
>>> Patch 1 just adds a new "300" iotest. It is way easier to extract the
>>> relevant part out of test 267, which does a bit too much in the same
>>> file.
>>> Also this allows easier use of valgrind, since it does not "require"
>>> anything.
>>>
>>> Patch 2 hooks the savevm code to skip all fields during the snapshot
>>> with the exception of "s390-skeys", a new artificial field
>>> implemented to
>>> model what the real s390-skeys is doing.
>>>
>>> After applying patch 1 and patch 2, you can test (also on X86), with:
>>>
>>> ./check -qcow2 300
>>>
>>> On X86 many runs will be successful, but a certain % of them will
>>> instead fail like this:
>>>
>>>
>>> claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests>
>>> ./check -qcow2 300
>>> QEMU          -- "/home/claudio/git/qemu-pristine/qemu-
>>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
>>> -nodefaults -display none -accel qtest
>>> QEMU_IMG      -- "/home/claudio/git/qemu-pristine/qemu-
>>> build/tests/qemu-iotests/../../qemu-img"
>>> QEMU_IO       -- "/home/claudio/git/qemu-pristine/qemu-
>>> build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio
>>> threads -f qcow2
>>> QEMU_NBD      -- "/home/claudio/git/qemu-pristine/qemu-
>>> build/tests/qemu-iotests/../../qemu-nbd"
>>> IMGFMT        -- qcow2 (compat=1.1)
>>> IMGPROTO      -- file
>>> PLATFORM      -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
>>> TEST_DIR      -- /home/claudio/git/qemu-pristine/qemu-
>>> build/tests/qemu-iotests/scratch
>>> SOCK_DIR      -- /tmp/tmp.gdcUu3l0SM
>>> SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-
>>> build/tests/qemu-iotests/socket_scm_helper
>>>
>>> 300      fail       [10:14:05] [10:14:06]      (last: 0s)    output
>>> mismatch (see 300.out.bad)
>>> --- /home/claudio/git/qemu-pristine/qemu/tests/qemu-
>>> iotests/300.out     2020-07-21 10:03:54.468104764 +0200
>>> +++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-
>>> iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
>>> @@ -12,6 +12,9 @@
>>>   ID        TAG                 VM SIZE                DATE       VM
>>> CLOCK
>>>   --        snap0                  SIZE yyyy-mm-dd
>>> hh:mm:ss   00:00:00.000
>>>   (qemu) loadvm snap0
>>> +Unexpected storage key data: 0
>>> +error while loading state for instance 0x0 of device 's390-skeys'
>>> +Error: Error -22 while loading VM state
>>>   (qemu) quit
>>>   
>>>   *** done
>>> Failures: 300
>>> Failed 1 of 1 iotests
>>>
>>>
>>> At this point somebody more knowledgeable about QCOW2, coroutines and
>>> backing files could chime in?
>>>
>> <trim>
>>
>> I used the reproducer you provide here to do a git bisect as I assume
>> whatever is now broken wasn't always broken, and it pointed to the
>> following commit:
>>
>> commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
>> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Date:   Tue Jun 4 19:15:13 2019 +0300
>>
>>      block/qcow2: implement .bdrv_co_preadv_part
>>
>> Indeed, I am currently able to reliable reproduce the issue with this
>> commit applied, and not reproduce it without it.
>>
>> That said, I've not been able to identify exactly what is going wrong.
>> I'm fairly confident the savevm data is correctly written out, but on
>> the loadvm side, somehow the last part of the s390 data is not
>> correctly read in the data (it's in the second pass through the while
>> loop in qcow2_co_preadv_part() where that happens.)
>>
>> If anyone familiar with this code can have a look or provide some
>> pointers, it would be much appreciated.
> 
> Thanks to both of you for your investigation.  Does the attached patch help?
> 
> Max
> 

Oh, I'm stupid. Great bug!

So (at least for what you fix here; I'm not sure it is the whole of the
reported bug, but it seems very close) it's a regression introduced in

commit 65cd4424b9df03bb5195351c33e04cbbecc0705c
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date:   Tue Jun 4 19:15:09 2019 +0300

     block/io: bdrv_aligned_preadv: use and support qiov_offset

And broken since 4.2...


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28 11:10                   ` Max Reitz
  2020-07-28 11:27                     ` Vladimir Sementsov-Ogievskiy
@ 2020-07-28 11:33                     ` Vladimir Sementsov-Ogievskiy
  2020-07-28 11:35                       ` Paolo Bonzini
  2020-07-28 12:47                     ` Claudio Fontana
  2 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-28 11:33 UTC (permalink / raw)
  To: Max Reitz, Bruce Rogers, Claudio Fontana, Thomas Huth,
	Paolo Bonzini, Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

28.07.2020 14:10, Max Reitz wrote:
> On 28.07.20 01:09, Bruce Rogers wrote:
>> <trim>
> 
> Thanks to both of you for your investigation.  Does the attached patch help?
> 

For me, the reproducer is fixed with your patch.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28 11:33                     ` Vladimir Sementsov-Ogievskiy
@ 2020-07-28 11:35                       ` Paolo Bonzini
  2020-07-28 11:45                         ` Max Reitz
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2020-07-28 11:35 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Max Reitz, Bruce Rogers,
	Claudio Fontana, Thomas Huth, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

On 28/07/20 13:33, Vladimir Sementsov-Ogievskiy wrote:
>> Thanks to both of you for your investigation.  Does the attached patch help?
> 
> For me, the reproducer is fixed with your patch.

Is it possible to make a quick testcase using qemu-io?

Paolo



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28 11:35                       ` Paolo Bonzini
@ 2020-07-28 11:45                         ` Max Reitz
  2020-07-28 12:09                           ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Max Reitz @ 2020-07-28 11:45 UTC (permalink / raw)
  To: Paolo Bonzini, Vladimir Sementsov-Ogievskiy, Bruce Rogers,
	Claudio Fontana, Thomas Huth, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi


[-- Attachment #1.1: Type: text/plain, Size: 758 bytes --]

On 28.07.20 13:35, Paolo Bonzini wrote:
> On 28/07/20 13:33, Vladimir Sementsov-Ogievskiy wrote:
>>> Thanks to both of you for your investigation.  Does the attached patch help?
>>
>> For me, the reproducer is fixed with your patch.
> 
> Is it possible to make a quick testcase using qemu-io?

This seems to work:

$ qemu-img create -f qcow2 -o cluster_size=512 base.qcow2 512
$ qemu-img create -f qcow2 -o cluster_size=512 -F qcow2 -b base.qcow2 \
  top.qcow2 1024
$ qemu-io -c 'write -P 42 0 512' top.qcow2
$ qemu-io -c 'read -v 0 1024' top.qcow2

That should print 512 times “2a” and then 512 times “00”.  Without the
patch, however, the bytes beyond 512 stay uninitialized and the first
512 bytes are zeroed instead: the zero-fill for the part of the read
that lies past the backing file's end lands at offset 0 of the buffer
rather than at offset 512.  (With the patch, it works as expected.)
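
On a plain buffer the effect looks like this (a stand-alone sketch; it
assumes the 1024-byte read is split at the cluster boundary, so the
zero-fill for the backing-file part should start at offset 512):

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char buf[1024];

    memset(buf, 0xee, sizeof(buf));  /* 0xee marks "uninitialized" */
    memset(buf, 0x2a, 512);          /* data read from top.qcow2 */

    memset(buf, 0x00, 512);          /* buggy zero-fill at offset 0 */
    /* memset(buf + 512, 0x00, 512);    fixed: zero-fill at offset 512 */

    printf("%02x %02x\n", buf[0], buf[512]);  /* buggy prints "00 ee" */
    return 0;
}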


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28 11:45                         ` Max Reitz
@ 2020-07-28 12:09                           ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2020-07-28 12:09 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, Bruce Rogers,
	Claudio Fontana, Thomas Huth, Juan Quintela,
	Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

On 28/07/20 13:45, Max Reitz wrote:
>> Is it possible to make a quick testcase using qemu-io?
> This seems to work:
> 
> $ qemu-img create -f qcow2 -o cluster_size=512 base.qcow2 512
> $ qemu-img create -f qcow2 -o cluster_size=512 -F qcow2 -b base.qcow2 \
>   top.qcow2 1024
> $ qemu-io -c 'write -P 42 0 512' top.qcow2
> $ qemu-io -c 'read -v 0 1024' top.qcow2
> 
> That should print 512 times “2a” and then 512 times “00”, but beyond
> 512, the buffer stays uninitialized; instead, the first 512 bytes are
> zeroed.  (With the patch, it works as expected.)
> 

Wow, that's great (and a good recipe for qemu-iotests too).  Thanks.

Paolo



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28 11:10                   ` Max Reitz
  2020-07-28 11:27                     ` Vladimir Sementsov-Ogievskiy
  2020-07-28 11:33                     ` Vladimir Sementsov-Ogievskiy
@ 2020-07-28 12:47                     ` Claudio Fontana
  2 siblings, 0 replies; 25+ messages in thread
From: Claudio Fontana @ 2020-07-28 12:47 UTC (permalink / raw)
  To: Max Reitz, Bruce Rogers, Thomas Huth, Paolo Bonzini,
	Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, vsementsov, Stefan Hajnoczi

On 7/28/20 1:10 PM, Max Reitz wrote:
> On 28.07.20 01:09, Bruce Rogers wrote:
>> <trim>
> 
> Thanks to both of you for your investigation.  Does the attached patch help?
> 
> Max
> 

Hello Max,

yes, this solves the problem everywhere (on s390, on X86, based on all the reproducers so far).

Thanks a lot for the help!

Claudio


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed
  2020-07-28  8:43                   ` Vladimir Sementsov-Ogievskiy
@ 2020-07-28 13:23                     ` Bruce Rogers
  0 siblings, 0 replies; 25+ messages in thread
From: Bruce Rogers @ 2020-07-28 13:23 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Claudio Fontana, Thomas Huth,
	Paolo Bonzini, Juan Quintela, Dr. David Alan Gilbert, Kevin Wolf,
	Max Reitz
  Cc: Jason J. Herne, Fam Zheng, Liang Yan, Peter Maydell,
	Cornelia Huck, qemu-devel, Stefan Hajnoczi

On Tue, 2020-07-28 at 11:43 +0300, Vladimir Sementsov-Ogievskiy wrote:
> 28.07.2020 02:09, Bruce Rogers wrote:
> > On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
> > <trim>
> > 
> > I used the reproducer you provide here to do a git bisect as I
> > assume whatever is now broken wasn't always broken, and it pointed
> > to the following commit:
> > 
> > commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
> > Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > Date:   Tue Jun 4 19:15:13 2019 +0300
> > 
> >      block/qcow2: implement .bdrv_co_preadv_part
> > 
> > Indeed, I am currently able to reliably reproduce the issue with
> > this commit applied, and not reproduce it without it.
> 
> Do you have a reproducer on top of df893d25ceea? I tried to apply the
> patches on top of it, but there are a lot of conflicts, and after
> solving them the test does not work for me.
> 
> What do you mean by "not reproduce it without it"? I also tried
> "git rebase df893d25ceea3c0d --onto df893d25ceea3c0d^" after applying
> the reproducer patches on master, but again there are conflicts.
> 
> How did you apply
> 
> commit 88f468e54696cd9ffc28ecf96f18f04125402541
> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Date:   Mon Sep 16 20:53:22 2019 +0300
> 
>      block/qcow2: refactor qcow2_co_preadv_part
> 
> and, more importantly, further
> 
> commit d710cf575ad5fb3ab329204620de45bfe50caa53
> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Date:   Mon Sep 16 20:53:24 2019 +0300
> 
>      block/qcow2: introduce parallel subrequest handling in read and write
> 
> without df893d25ceea3c0d?
> 
> 
> I can reproduce the bug on master with Claudio's two patches, but how
> do I reproduce its relation to df893d25ceea3c0d?
> 
> 

Yeah, I didn't provide full reproducer info for the point where the
patch I mentioned was applied. No matter, it looks like a fix has been
identified in the meantime.

Thanks for the quick response, though.

Bruce



^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-07-28 13:24 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-12 10:00 migration: broken snapshot saves appear on s390 when small fields in migration stream removed Claudio Fontana
2020-07-12 16:11 ` Paolo Bonzini
2020-07-13  9:11   ` Claudio Fontana
2020-07-14 14:29     ` Claudio Fontana
2020-07-14 14:35       ` Thomas Huth
2020-07-15 11:10         ` Claudio Fontana
2020-07-15 12:25           ` Claudio Fontana
2020-07-16 12:58           ` Claudio Fontana
2020-07-20 18:24             ` Claudio Fontana
2020-07-21  8:22               ` Claudio Fontana
2020-07-27 23:09                 ` Bruce Rogers
2020-07-28  8:15                   ` Vladimir Sementsov-Ogievskiy
2020-07-28  8:43                   ` Vladimir Sementsov-Ogievskiy
2020-07-28 13:23                     ` Bruce Rogers
2020-07-28 11:10                   ` Max Reitz
2020-07-28 11:27                     ` Vladimir Sementsov-Ogievskiy
2020-07-28 11:33                     ` Vladimir Sementsov-Ogievskiy
2020-07-28 11:35                       ` Paolo Bonzini
2020-07-28 11:45                         ` Max Reitz
2020-07-28 12:09                           ` Paolo Bonzini
2020-07-28 12:47                     ` Claudio Fontana
2020-07-13 11:03 ` Dr. David Alan Gilbert
2020-07-13 11:39   ` Cornelia Huck
2020-07-13 11:39   ` Claudio Fontana
2020-07-13 11:45     ` Claudio Fontana
