* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
       [not found] <CANzW0mvSX5nWuinDU68W2yJzgoQSGAHPqpz0G36A6NKwRsz_4A@mail.gmail.com>
@ 2018-10-01 14:14 ` dovgaluk
  2018-10-01 18:22   ` Artem Pisarenko
  0 siblings, 1 reply; 13+ messages in thread
From: dovgaluk @ 2018-10-01 14:14 UTC (permalink / raw)
  To: Artem Pisarenko; +Cc: qemu-devel, Pavel.Dovgaluk

Artem Pisarenko wrote on 2018-09-30 14:01:
> Feature still broken :(

Thanks for testing.

> 
> Brief description of my tests.
> 
> Guest image is Linux, which just powers off after kernel boots
> (instead of proceeding to user-space /init or /sbin/init).
> Base cmdline:
> qemu-system-x86_64 -nodefaults -machine pc,accel=tcg -m 2048 -cpu
> qemu64 -rtc clock=vm,base=2000-01-01T00:00:00 -kernel bzImage -initrd
> rootfs -append 'nokaslr console=ttyS0 rdinit=/init_poweroff'
> -nographic -serial SERIAL_VALUE -icount
> 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin

I've never tried it with sleep=off. Can you remove it and try again?

We have also seen a problem with '-nographic'. When we remove this option
and QEMU runs with an SDL window, everything is OK. There seems to be a
problem with the main loop, which may sleep when there is no GUI to update,
or something like that. We haven't been able to fix it yet.

> 
> Test 1. When SERIAL_VALUE=none
> Running with RR_VALUE=record completes successfully.
> Running with RR_VALUE=replay doesn't complete. The qemu process just
> eats ~100% CPU, and its memory usage stops growing after some point. I
> can't see what happens because of problem no. 2 (see below).

Try the 'info replay' monitor command. Does the instruction counter increase?

> 
> Test 2. When SERIAL_VALUE=stdio
> Running with RR_VALUE=record completes successfully.
> 
> Running with RR_VALUE=replay causes an exit with the error:
> 
> "qemu-system-x86_64: Missing character write event in the replay log"
> 
> These problems are the same with qemu 2.12 (both vanilla and with previous
> versions of these patches applied). Furthermore, I consider the whole
> icount mode broken and determinism unachievable.
> The irony is that I don't actually need the record/replay feature. I
> tried to use it only as an instrument to debug failing determinism in
> qemu code. But since the record/replay feature itself relies on
> determinism, which is broken, it's no wonder that it fails too (I just
> hoped to bypass that).
> 
> Contact me if you need more details. I'm just worn out from trying to get
> all these things working... I'm losing hope...

Can you share the kernel in case icount is still broken?


Pavel Dovgalyuk


* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-01 14:14 ` [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging dovgaluk
@ 2018-10-01 18:22   ` Artem Pisarenko
  2018-10-02  7:02     ` Artem Pisarenko
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Pisarenko @ 2018-10-01 18:22 UTC (permalink / raw)
  To: dovgaluk; +Cc: Pavel.Dovgaluk, qemu-devel

I've posted a bug report with extended tests (incl. the case without sleep=off).
You can find the guest image (kernel) in the bug description.
https://bugs.launchpad.net/qemu/+bug/1795369

The most annoying thing is that some issues are almost impossible to
reproduce. There are definitely race conditions somewhere in the qemu code.
Running the 'stress-ng' utility with CPU and I/O stressors in parallel with
qemu greatly reduces the number of attempts needed to trigger some of the
issues I encounter.

I'll try the 'info replay' command tomorrow, but there is no guarantee that
I'll be able to reproduce the issue again.

Speaking of '-nographic' and SDL... I've noticed that having a UI greatly
reduces the likelihood of hanging (but doesn't avoid it completely) when
using icount in general, so this effect isn't rr-specific. I've already
reported this bug too.



Best regards,
  Artem Pisarenko


* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-01 18:22   ` Artem Pisarenko
@ 2018-10-02  7:02     ` Artem Pisarenko
  2018-10-03  6:47       ` dovgaluk
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Pisarenko @ 2018-10-02  7:02 UTC (permalink / raw)
  To: dovgaluk; +Cc: Pavel.Dovgaluk, qemu-devel

I've added the "-monitor stdio" option to the command line of Test 1 and
entered the command repeatedly during execution:

  QEMU 3.0.50 monitor - type 'help' for more information
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 311736195
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 318198367
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 324737211
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 329890795
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 607069789
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 607069789
  (qemu) info replay
  Replaying execution 'icount_rr_capture.bin': current step = 607069789
  ...

Some notes on the step value it gets stuck on:
- mostly it's the same (even across different record-replay pairs);
- stressing the host during replay may cause it to change even for the same
record-replay pair (i.e. different replay executions of the same recorded
file).

This specific case seems to reproduce reliably.
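
The check done by hand in the transcript above (does the step counter still advance between two 'info replay' calls?) can be sketched in C. This is only an illustration: parse_replay_step() and find_stalled_step() are hypothetical helper names, not part of QEMU, and the line format is assumed to be exactly the one shown in the monitor output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract N from a line containing "current step = N".
 * Returns 1 on success, 0 if the line carries no step value. */
static int parse_replay_step(const char *line, unsigned long long *step)
{
    static const char marker[] = "current step = ";
    const char *p = strstr(line, marker);

    if (p == NULL) {
        return 0;
    }
    /* sizeof(marker) - 1 skips the marker text without its trailing NUL. */
    return sscanf(p + sizeof(marker) - 1, "%llu", step) == 1;
}

/* Hypothetical helper: return 1 and store the value in *stalled if two
 * consecutive samples report the same step; return 0 otherwise. */
static int find_stalled_step(const char *const *lines, size_t n,
                             unsigned long long *stalled)
{
    unsigned long long prev = 0, cur;
    int have_prev = 0;

    for (size_t i = 0; i < n; i++) {
        if (!parse_replay_step(lines[i], &cur)) {
            continue;   /* e.g. prompt lines such as "(qemu) info replay" */
        }
        if (have_prev && cur == prev) {
            *stalled = cur;
            return 1;
        }
        prev = cur;
        have_prev = 1;
    }
    return 0;
}
```

Two identical samples are only a hint: if the samples are taken very close together, the counter may simply not have been updated yet, so repeating the query a few times (as done above) is still advisable.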

-- 

Best regards,
  Artem Pisarenko


* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-02  7:02     ` Artem Pisarenko
@ 2018-10-03  6:47       ` dovgaluk
  2018-10-04 13:15         ` Artem Pisarenko
  0 siblings, 1 reply; 13+ messages in thread
From: dovgaluk @ 2018-10-03  6:47 UTC (permalink / raw)
  To: Artem Pisarenko; +Cc: Pavel.Dovgaluk, qemu-devel

Can you try applying this patch?
https://www.mail-archive.com/qemu-devel@nongnu.org/msg563798.html

I also encountered problems with x86_64 replay and found a typo in the
code, which was fixed only after the series had been sent to the mailing
list.

Pavel Dovgalyuk




* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-03  6:47       ` dovgaluk
@ 2018-10-04 13:15         ` Artem Pisarenko
  2018-10-09  9:04           ` Pavel Dovgalyuk
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Pisarenko @ 2018-10-04 13:15 UTC (permalink / raw)
  To: dovgaluk; +Cc: Pavel.Dovgaluk, qemu-devel

No, it didn't change the test results, at least for
https://github.com/ispras/qemu/tree/rr-180911 . Even the step values it gets
stuck on are the same for most runs.
Playing with master and my own branch gives different results for tests
without sleep=off and -rtc base. It seems that the patch you mentioned didn't
change them very much.
The only thing that can be said for sure is that this patch does not fix the
issues completely. But it MAY fix them partially or in some other specific
cases...


Best regards,
  Artem Pisarenko


* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-04 13:15         ` Artem Pisarenko
@ 2018-10-09  9:04           ` Pavel Dovgalyuk
  2018-10-09 11:23             ` Artem Pisarenko
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Dovgalyuk @ 2018-10-09  9:04 UTC (permalink / raw)
  To: 'Artem Pisarenko'; +Cc: Pavel.Dovgaluk, qemu-devel

Please try the following patch.

There was a problem with the rtc option in record/replay mode.

 

diff --git a/vl.c b/vl.c
index 40d5d0f..afe1c20 100644
--- a/vl.c
+++ b/vl.c
@@ -2885,6 +2885,7 @@ int main(int argc, char **argv, char **envp)
     DisplayState *ds;
     QemuOpts *opts, *machine_opts;
     QemuOpts *icount_opts = NULL, *accel_opts = NULL;
+    QemuOpts *rtc_opts = NULL;
     QemuOptsList *olist;
     int optind;
     const char *optarg;
@@ -3691,12 +3692,11 @@ int main(int argc, char **argv, char **envp)
                 warn_report("This option is ignored and will be removed soon");
                 break;
             case QEMU_OPTION_rtc:
-                opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"), optarg,
-                                               false);
-                if (!opts) {
+                rtc_opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
+                                                   optarg, false);
+                if (!rtc_opts) {
                     exit(1);
                 }
-                configure_rtc(opts);
                 break;
             case QEMU_OPTION_tb_size:
 #ifndef CONFIG_TCG
@@ -3907,6 +3907,9 @@ int main(int argc, char **argv, char **envp)
     loc_set_none();
     replay_configure(icount_opts);
+    if (rtc_opts) {
+        configure_rtc(rtc_opts);
+    }
     if (incoming && !preconfig_exit_requested) {
         error_report("'preconfig' and 'incoming' options are "

 

Pavel Dovgalyuk
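
The patch above only changes ordering: the -rtc option is remembered while the command line is parsed and applied after replay_configure() has run. The toy model below illustrates that pattern in isolation. It is not QEMU code; struct toy_vm and every toy_* function are hypothetical names, and the message does not say which part of the RTC setup depends on the replay state, so the model just records whether replay was already configured when the RTC was set up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model, not QEMU code: every name below is hypothetical. */
struct toy_vm {
    int replay_configured;  /* set by toy_replay_configure() */
    int rtc_configured;     /* set by toy_configure_rtc() */
    int rtc_saw_replay;     /* replay state observed during RTC setup */
};

static void toy_replay_configure(struct toy_vm *vm)
{
    vm->replay_configured = 1;
}

static void toy_configure_rtc(struct toy_vm *vm, const char *rtc_arg)
{
    (void)rtc_arg;          /* the option string is not interpreted here */
    vm->rtc_configured = 1;
    vm->rtc_saw_replay = vm->replay_configured;
}

/* While parsing, only remember the -rtc argument; apply it after the
 * replay setup, mirroring the role of rtc_opts in the patch. */
static void toy_startup(struct toy_vm *vm, int argc, const char *const *argv)
{
    const char *rtc_arg = NULL;

    for (int i = 0; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-rtc") == 0) {
            rtc_arg = argv[++i];    /* deferred, not applied yet */
        }
    }

    toy_replay_configure(vm);
    if (rtc_arg != NULL) {
        toy_configure_rtc(vm, rtc_arg);
    }
}
```

Calling toy_configure_rtc() directly inside the parsing loop would correspond to the pre-patch behaviour, where the RTC is set up before the replay state is known.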

 



* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-09  9:04           ` Pavel Dovgalyuk
@ 2018-10-09 11:23             ` Artem Pisarenko
  2018-10-09 11:26               ` Pavel Dovgalyuk
  0 siblings, 1 reply; 13+ messages in thread
From: Artem Pisarenko @ 2018-10-09 11:23 UTC (permalink / raw)
  To: Pavel Dovgalyuk; +Cc: Pavel.Dovgaluk, qemu-devel

(Since all previous patches are already merged to master, I'm running the
tests against the (almost) latest version of the master branch. The following
results are based on master commit dafd95053611aa14dda40266857608d12ddce658 .)

Applying this patch made Tests 1 and 2 succeed (at least I wasn't able to
provoke failures in several attempts).
I've also tried a few tests without the sleep=off and/or -rtc base options.
All of them succeed too, except for one case: removing sleep=off (regardless
of the -rtc option values, or whether -rtc is present at all) causes qemu to
hang hard in recording mode right at startup. The process has to be killed.

Some info from debugger:
    qemu-system-x86_64 [13231] [cores: 2,4,5,7]
Thread #1 [qemu-system-x86] 13231 [core: 2] (Suspended : Container)
__lll_lock_wait() at lowlevellock.S:135 0x7f00b116626d
__GI___pthread_mutex_lock() at pthread_mutex_lock.c:80 0x7f00b115fdbd
qemu_mutex_lock_impl() at qemu-thread-posix.c:66 0x947ac4
replay_mutex_lock() at replay-internal.c:206 0x7f3dea
os_host_main_loop_wait() at main-loop.c:235 0x94335e
main_loop_wait() at main-loop.c:497 0x943429
main_loop() at vl.c:1,853 0x5be70f
main() at vl.c:4,575 0x5c56e0
Thread #2 [qemu-system-x86] 13282 [core: 4] (Suspended : Container)
Thread #3 [qemu-system-x86] 13283 [core: 5] (Suspended : Container)
Thread #4 [qemu-system-x86] 13284 [core: 7] (Suspended : Step)
cpu_get_icount_raw() at cpus.c:301 0x45a0a0
replay_get_current_step() at replay.c:67 0x7f2f14
replay_save_instructions() at replay-internal.c:225 0x7f3ea0
replay_save_clock() at replay-time.c:24 0x7f483d
icount_warp_rt() at cpus.c:512 0x45a745
qemu_account_warp_timer() at cpus.c:690 0x45ad55
qemu_tcg_rr_cpu_thread_fn() at cpus.c:1,498 0x45c554
qemu_thread_start() at qemu-thread-posix.c:504 0x9485cf
start_thread() at pthread_create.c:333 0x7f00b115d6ba
clone() at clone.S:109 0x7f00b0e9341d
    gdb (7.11.1)

Threads #2 and #3 are just waiting in poll or similar. Nothing unusual.

Thread #4 spins inside the do {} while () loop of cpu_get_icount_raw():
    do {
        start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
        icount = cpu_get_icount_raw_locked();
    } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));

Value of timers_state.vm_clock_seqlock.sequence is always 3.

On Tue, Oct 9, 2018 at 15:04, Pavel Dovgalyuk <dovgaluk@ispras.ru> wrote:

> Please try the following patch.
>
> There was a problem with rtc option in record/replay mode.
>
>
>
> diff --git a/vl.c b/vl.c
> index 40d5d0f..afe1c20 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2885,6 +2885,7 @@ int main(int argc, char **argv, char **envp)
>      DisplayState *ds;
>      QemuOpts *opts, *machine_opts;
>      QemuOpts *icount_opts = NULL, *accel_opts = NULL;
> +    QemuOpts *rtc_opts = NULL;
>      QemuOptsList *olist;
>      int optind;
>      const char *optarg;
> @@ -3691,12 +3692,11 @@ int main(int argc, char **argv, char **envp)
>                  warn_report("This option is ignored and will be removed soon");
>                  break;
>              case QEMU_OPTION_rtc:
> -                opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"), optarg,
> -                                               false);
> -                if (!opts) {
> +                rtc_opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
> +                                                   optarg, false);
> +                if (!rtc_opts) {
>                      exit(1);
>                  }
> -                configure_rtc(opts);
>                  break;
>              case QEMU_OPTION_tb_size:
> #ifndef CONFIG_TCG
> @@ -3907,6 +3907,9 @@ int main(int argc, char **argv, char **envp)
>      loc_set_none();
>      replay_configure(icount_opts);
> +    if (rtc_opts) {
> +        configure_rtc(rtc_opts);
> +    }
>      if (incoming && !preconfig_exit_requested) {
>          error_report("'preconfig' and 'incoming' options are "
>
> Pavel Dovgalyuk
>
>
>
> *From:* Artem Pisarenko [mailto:artem.k.pisarenko@gmail.com]
> *Sent:* Thursday, October 04, 2018 4:16 PM
> *To:* dovgaluk
> *Cc:* Pavel.Dovgaluk@ispras.ru; qemu-devel@nongnu.org
> *Subject:* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and
> adding reverse debugging
>
>
>
> No, it didn't change the test results, at least for
> https://github.com/ispras/qemu/tree/rr-180911 . Even the step values it
> gets stuck on are the same for most runs.
>
> Playing with master and my own branch gives different results for tests
> without sleep=off and -rtc base. It seems that the patch you mentioned
> didn't change them much.
>
> The only thing that can be said for sure is that this patch does not fix
> the issues completely. But it MAY fix them partially or in some other
> specific cases...
>
>
>
> On Wed, Oct 3, 2018 at 12:47, dovgaluk <dovgaluk@ispras.ru> wrote:
>
> Can you try applying this patch?
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg563798.html
>
> I also encountered problems with x86_64 replay and found a typo in the
> code, which was fixed after the series had been sent to the mailing
> list.
>
> Pavel Dovgalyuk
>
>
> Artem Pisarenko wrote on 2018-10-02 10:02:
> > I've added the "-monitor stdio" option to the command line of Test 1
> > and entered the command repeatedly during execution:
> >
> >   QEMU 3.0.50 monitor - type 'help' for more information
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 311736195
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 318198367
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 324737211
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 329890795
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 607069789
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 607069789
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step = 607069789
> >   ...
> >
> > Some notes on the step value it gets stuck on:
> > - mostly it's the same (even across different record-replay pairs);
> > - stressing the host during replay may cause it to change even for the
> > same record-replay pair (i.e. different replay executions of the same
> > recorded file).
> >
> > This specific case seems to reproduce reliably.
> >
> > On Tue, Oct 2, 2018 at 0:22, Artem Pisarenko
> > <artem.k.pisarenko@gmail.com> wrote:
> >
> >> I've posted a bug report with extended tests (incl. the case without
> >> sleep=off). You can find the guest image (kernel) in the bug
> >> description.
> >> https://bugs.launchpad.net/qemu/+bug/1795369 [1]
> >>
> >> The most annoying thing is that some issues are almost impossible to
> >> reproduce. There are definitely race conditions somewhere in the qemu
> >> code. Running the 'stress-ng' utility with CPU and I/O stressors in
> >> parallel with qemu greatly reduces the number of attempts I need to
> >> trigger some of the issues I encounter.
> >>
> >> I'll try the 'info replay' command tomorrow, but there's no guarantee
> >> that I'll be able to reproduce the issue again.
> >>
> >> Speaking of '-nographic' and SDL... I've noticed that having a UI
> >> greatly reduces the likelihood of hanging (but doesn't avoid it
> >> completely) when using icount in general, so this effect isn't
> >> rr-specific. I've already reported this bug too.
-- 

Best regards,
  Artem Pisarenko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-09 11:23             ` Artem Pisarenko
@ 2018-10-09 11:26               ` Pavel Dovgalyuk
  2018-10-09 12:59                 ` Artem Pisarenko
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Dovgalyuk @ 2018-10-09 11:26 UTC (permalink / raw)
  To: 'Artem Pisarenko'; +Cc: Pavel.Dovgaluk, qemu-devel

Maybe this will help?

https://www.mail-archive.com/qemu-devel@nongnu.org/msg560780.html

Pavel Dovgalyuk


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-10-09 11:26               ` Pavel Dovgalyuk
@ 2018-10-09 12:59                 ` Artem Pisarenko
  0 siblings, 0 replies; 13+ messages in thread
From: Artem Pisarenko @ 2018-10-09 12:59 UTC (permalink / raw)
  To: Pavel Dovgalyuk; +Cc: Pavel.Dovgaluk, qemu-devel

It wasn't easy to apply this patch, both because the version you pointed to
has compilation problems and because the mail archive mangled the patch
content, but I finally got it working :)

Applying this patch finally made all my tests succeed... almost :)

Now qemu may hang at a random moment of emulation, but not hard. The
symptoms look like what I've already reported here:
https://bugs.launchpad.net/qemu/+bug/1790460 . So this isn't
record/replay-specific. Without the rr= option I wasn't able to trigger
the issue, but that doesn't mean much, given how unstable the issue is and
how hard it is to reproduce.

Commit I tested against (with the patches applied):
53a19a9a5f9811a911e9b69ef36afb0d66b5d85c.


вт, 9 окт. 2018 г. в 17:26, Pavel Dovgalyuk <dovgaluk@ispras.ru>:

> Maybe this will help?
>
>
>
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg560780.html
>
>
>
> Pavel Dovgalyuk
>
>
>
> *From:* Artem Pisarenko [mailto:artem.k.pisarenko@gmail.com]
> *Sent:* Tuesday, October 09, 2018 2:24 PM
> *To:* Pavel Dovgalyuk
>
>
> *Cc:* Pavel.Dovgaluk@ispras.ru; qemu-devel@nongnu.org
> *Subject:* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and
> adding reverse debugging
>
>
>
> (Since all previous patches are already merged to master, I'm running
> tests against latest (almost) version from master branch. Following results
> are based on master commit dafd95053611aa14dda40266857608d12ddce658 .)
>
>
>
> Applying this patch made Tests 1 and 2 succeed (at least I wasn't able to
> acheive failures with several attempts).
>
> Also I've tried few tests without sleep=off and/or rtc base options. All
> of them succeed too, except one case - removing sleep=off (regardless of
> -rtc option values or its presence at all) causes qemu to hang hard in
> recording mode at very startup. Process needs to be killed.
>
>
>
> Some info from debugger:
>
>     qemu-system-x86_64 [13231] [cores: 2,4,5,7]
>
>           Thread #1 [qemu-system-x86] 13231 [core: 2] (Suspended :
> Container)
>
>                       __lll_lock_wait() at lowlevellock.S:135
> 0x7f00b116626d
>
>                       __GI___pthread_mutex_lock() at
> pthread_mutex_lock.c:80 0x7f00b115fdbd
>
>                       qemu_mutex_lock_impl() at qemu-thread-posix.c:66
> 0x947ac4
>
>                       replay_mutex_lock() at replay-internal.c:206
> 0x7f3dea
>
>                       os_host_main_loop_wait() at main-loop.c:235
> 0x94335e
>
>                       main_loop_wait() at main-loop.c:497 0x943429
>
>                       main_loop() at vl.c:1,853 0x5be70f
>
>                       main() at vl.c:4,575 0x5c56e0
>
>           Thread #2 [qemu-system-x86] 13282 [core: 4] (Suspended :
> Container)
>
>           Thread #3 [qemu-system-x86] 13283 [core: 5] (Suspended :
> Container)
>
>           Thread #4 [qemu-system-x86] 13284 [core: 7] (Suspended : Step)
>
>                       cpu_get_icount_raw() at cpus.c:301 0x45a0a0
>
>                       replay_get_current_step() at replay.c:67 0x7f2f14
>
>                       replay_save_instructions() at replay-internal.c:225
> 0x7f3ea0
>
>                       replay_save_clock() at replay-time.c:24 0x7f483d
>
>                       icount_warp_rt() at cpus.c:512 0x45a745
>
>                       qemu_account_warp_timer() at cpus.c:690
> 0x45ad55
>
>                       qemu_tcg_rr_cpu_thread_fn() at cpus.c:1,498
> 0x45c554
>
>                       qemu_thread_start() at qemu-thread-posix.c:504
> 0x9485cf
>
>                       start_thread() at pthread_create.c:333
> 0x7f00b115d6ba
>
>                       clone() at clone.S:109 0x7f00b0e9341d
>
>     gdb (7.11.1)
>
>
>
> Threads #2,3 are just waiting in poll or similar. Nothing extraordinary.
>
>
>
> Thread #4 cycles inside do {} while() loop of cpu_get_icount_raw()
> function:
>
>     do {
>
>         start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
>
>         icount = cpu_get_icount_raw_locked();
>
>     } while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
>
>
>
> Value of timers_state.vm_clock_seqlock.sequence is always 3.
>
>
>
> вт, 9 окт. 2018 г. в 15:04, Pavel Dovgalyuk <dovgaluk@ispras.ru>:
>
> Please try the following patch.
>
> There was a problem with rtc option in record/replay mode.
>
>
>
> diff --git a/vl.c b/vl.c
>
> index 40d5d0f..afe1c20 100644
>
> --- a/vl.c
>
> +++ b/vl.c
>
> @@ -2885,6 +2885,7 @@ int main(int argc, char **argv, char **envp)
>
>      DisplayState *ds;
>
>      QemuOpts *opts, *machine_opts;
>
>      QemuOpts *icount_opts = NULL, *accel_opts = NULL;
>
> +    QemuOpts *rtc_opts = NULL;
>
>      QemuOptsList *olist;
>
>      int optind;
>
>      const char *optarg;
>
> @@ -3691,12 +3692,11 @@ int main(int argc, char **argv, char **envp)
>
>                  warn_report("This option is ignored and will be removed
> soon");
>
>                  break;
>
>              case QEMU_OPTION_rtc:
>
> -                opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
> optarg,
>
> -                                               false);
>
> -                if (!opts) {
>
> +                rtc_opts = qemu_opts_parse_noisily(qemu_find_opts("rtc"),
>
> +                                                   optarg, false);
>
> +                if (!rtc_opts) {
>
>                      exit(1);
>
>                  }
>
> -                configure_rtc(opts);
>
>                  break;
>
>              case QEMU_OPTION_tb_size:
>
> #ifndef CONFIG_TCG
>
> @@ -3907,6 +3907,9 @@ int main(int argc, char **argv, char **envp)
>
>      loc_set_none();
>
>      replay_configure(icount_opts);
>
> +    if (rtc_opts) {
>
> +        configure_rtc(rtc_opts);
>
> +    }
>
>      if (incoming && !preconfig_exit_requested) {
>
>          error_report("'preconfig' and 'incoming' options are "
>
>
>
> Pavel Dovgalyuk
>
>
>
> *From:* Artem Pisarenko [mailto:artem.k.pisarenko@gmail.com]
> *Sent:* Thursday, October 04, 2018 4:16 PM
> *To:* dovgaluk
> *Cc:* Pavel.Dovgaluk@ispras.ru; qemu-devel@nongnu.org
> *Subject:* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and
> adding reverse debugging
>
>
>
> No, it didn't changed test results, at least for
> https://github.com/ispras/qemu/tree/rr-180911 . Even step values it
> stucks on are same for most runs.
>
> Playing with master and my own branch gives different results for tests
> without sleep=off and -rtc base. It seems that patch you mentioned didn't
> changed them very much.
>
> The only thing can be said for sure, is that this patch does not fix
> issues completely. But MAY fix them partially or in some other specific
> cases...
>
>
>
> ср, 3 окт. 2018 г. в 12:47, dovgaluk <dovgaluk@ispras.ru>:
>
> Can you try applying this patch?
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg563798.html
>
> I also encountered the problems with x86_64 replaying and found the
> misprint in
> the code which was fixed later, than sending the series to the mailing
> list.
>
> Pavel Dovgalyuk
>
>
> Artem Pisarenko писал 2018-10-02 10:02:
> > I've added "-monitor stdio" option to command line of Test 1 and
> > repeated entering command during execution:
> >
> >   QEMU 3.0.50 monitor - type 'help' for more information
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 311736195
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 318198367
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 324737211
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 329890795
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 607069789
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 607069789
> >   (qemu) info replay
> >   Replaying execution 'icount_rr_capture.bin': current step =
> > 607069789
> >   ...
> >
> > Some notes on value of step it stucks on:
> > - mostly it's same (even across different record-replay pairs);
> > - stressing host during replay may cause it to change even for same
> > record-replay pair (i.e. different replay executions for same file
> > recorded).
> >
> > This specific case seems to be stable to reproduce.
> >
> > вт, 2 окт. 2018 г. в 0:22, Artem Pisarenko
> > <artem.k.pisarenko@gmail.com>:
> >
> >> I've posted bug report with extended tests (incl. case without
> >> sleep=off). You may find guest image (kernel) in bug description.
> >> https://bugs.launchpad.net/qemu/+bug/1795369 [1]
> >>
> >> The most annoying thing is that some issues are almost not
> >> reproducible. There are definitely race conditions somewhere in qemu
> >> code. Running 'stress-ng' utility with CPU and I/O stressors in
> >> parallel with qemu execution greatly minimizes amount of attempts
> >> when I'm trying to trigger some of issues I encounter.
> >>
> >> I'll try 'info monitor' command tomorrow, but no guarantees that
> >> I'll be able to reproduce issue again.
> >>
> >> Speaking about '-nographic' and SDL... I've noted that UI greatly
> >> minimizes possibility of hanging (but not avoids it completely) when
> >> using icount in general, so this effect isn't rr-specific. I've
> >> already reported this bug too.
> >>
> >> пн, 1 окт. 2018 г., 20:14 dovgaluk <dovgaluk@ispras.ru>:
> >>
> >>> Artem Pisarenko wrote on 2018-09-30 14:01:
> >>>> Feature still broken :(
> >>>
> >>> Thanks for testing.
> >>>
> >>>>
> >>>> Brief description of my tests.
> >>>>
> >>>> Guest image is Linux, which just powers off after the kernel boots
> >>>> (instead of proceeding to user-space /init or /sbin/init).
> >>>> Base cmdline:
> >>>> qemu-system-x86_64 -nodefaults -machine pc,accel=tcg -m 2048 -cpu
> >>>> qemu64 -rtc clock=vm,base=2000-01-01T00:00:00 -kernel bzImage -initrd
> >>>> rootfs -append 'nokaslr console=ttyS0 rdinit=/init_poweroff'
> >>>> -nographic -serial SERIAL_VALUE -icount
> >>>> 1,sleep=off,rr=RR_VALUE,rrfile=icount_rr_capture.bin
> >>>
> >>> I've never tried it with sleep=off. Can you remove it and try
> >>> again?
> >>>
> >>> We have also seen a problem with '-nographic'. When we remove this
> >>> option and QEMU runs with an SDL window, everything is OK. There is
> >>> some problem with the main loop, which may sleep when there is no GUI
> >>> to update, or something like that. We haven't been able to fix it yet.
> >>>
> >>>>
> >>>> Test 1. When SERIAL_VALUE=none
> >>>> Running with RR_VALUE=record completes successfully.
> >>>> Running with RR_VALUE=replay doesn't complete. The qemu process just
> >>>> eats ~100% CPU, and memory usage stops growing after some point. I
> >>>> can't see what happens because of problem no. 2 (see below).
> >>>
> >>> Try the 'info replay' monitor command. Does the instruction counter
> >>> increase?
> >>>
> >>>>
> >>>> Test 2. When SERIAL_VALUE=stdio
> >>>> Running with RR_VALUE=record completes successfully.
> >>>>
> >>>> Running with RR_VALUE=replay causes an exit with the error:
> >>>>
> >>>> "qemu-system-x86_64: Missing character write event in the replay log"
> >>>>
> >>>> These problems are the same with qemu 2.12 (both vanilla and with
> >>>> previous versions of these patches applied). Furthermore, I consider
> >>>> the whole icount mode broken and determinism unachievable.
> >>>> The irony is that I don't actually need the record/replay feature.
> >>>> I only tried to use it as an instrument to debug failing determinism
> >>>> in qemu code. But since the record/replay feature itself relies on
> >>>> determinism, which is broken, it's no wonder that it fails too (I
> >>>> just hoped to bypass it).
> >>>>
> >>>> Contact me if you need more details. I'm just very tired of trying
> >>>> to get all these things working... Hope is leaving me...
> >>>
> >>> Can you share the kernel in case icount is still broken?
> >>>
> >>> Pavel Dovgalyuk
> >> --
> >>
> >> Best regards,
> >> Artem Pisarenko
> >  --
> >
> > Best regards,
> >   Artem Pisarenko
> >
> > Links:
> > ------
> > [1] https://bugs.launchpad.net/qemu/+bug/1795369
>
> --
>
> Best regards,
>   Artem Pisarenko
>
> --
>
> Best regards,
>   Artem Pisarenko
>
-- 

Best regards,
  Artem Pisarenko

* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-09-13 13:40   ` Pavel Dovgalyuk
@ 2018-09-13 13:46     ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2018-09-13 13:46 UTC (permalink / raw)
  To: Pavel Dovgalyuk, 'Pavel Dovgalyuk', qemu-devel
  Cc: kwolf, peter.maydell, war2jordan, crosthwaite.peter, boost.lists,
	quintela, ciro.santilli, jasowang, mst, zuban32s, armbru,
	maria.klimushenkova, kraxel, thomas.dullien, mreitz, alex.bennee,
	dgilbert, rth, John Snow

On 13/09/2018 15:40, Pavel Dovgalyuk wrote:
>> For now I'm queuing 12, 14, 19, 20 (pending question to you) and 23-25.
> What about patch 21?

I'd want an ACK from the IDE maintainer.  Let's add him to Cc.

Paolo

* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-09-13 10:27 ` Paolo Bonzini
@ 2018-09-13 13:40   ` Pavel Dovgalyuk
  2018-09-13 13:46     ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Dovgalyuk @ 2018-09-13 13:40 UTC (permalink / raw)
  To: 'Paolo Bonzini', 'Pavel Dovgalyuk', qemu-devel
  Cc: kwolf, peter.maydell, war2jordan, crosthwaite.peter, boost.lists,
	quintela, ciro.santilli, jasowang, mst, zuban32s, armbru,
	maria.klimushenkova, kraxel, thomas.dullien, mreitz, alex.bennee,
	dgilbert, rth

> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> On 12/09/2018 10:17, Pavel Dovgalyuk wrote:
> > GDB remote protocol supports reverse debugging of the targets.
> > It includes 'reverse step' and 'reverse continue' operations.
> > The first one finds the previous step of the execution,
> > and the second one is intended to stop at the last breakpoint that
> > would happen when the program is executed normally.
> >
> > Reverse debugging is possible in the replay mode, when at least
> > one snapshot was created at the record or replay phase.
> > QEMU can use these snapshots for travelling back in time with GDB.
> >
> > Running the execution in replay mode allows using GDB reverse debugging
> > commands:
> >  - reverse-stepi (or rsi): Steps one instruction to the past.
> >    QEMU loads one of the prior snapshots and proceeds forward to the
> >    desired instruction. When that step is reached, execution stops.
> >  - reverse-continue (or rc): Runs execution "backwards".
> >    QEMU tries to find a breakpoint or watchpoint by loading a prior
> >    snapshot and replaying the execution. Then QEMU loads the snapshot
> >    again and replays to the latest breakpoint. When there are no
> >    breakpoints in the examined section of the execution, QEMU finds one
> >    more snapshot and tries again. After the first snapshot is processed,
> >    execution stops at this snapshot.
> >
> > The set of patches includes the following modifications:
> >  - fixes for record/replay breakage caused by QEMU core changes
> >  - gdbstub update for reverse debugging support
> >  - functions that automatically perform reverse step and reverse
> >    continue operations
> >  - hmp/qmp commands for manipulating the replay process
> >  - improvement of the snapshotting for saving the execution step
> >    in the snapshot parameters
> >  - adding new clock for correct timer events from vnc and slirp
> >  - other record/replay fixes
> >
> > The patches are available in the repository:
> > https://github.com/ispras/qemu/tree/rr-180911
> >
> > v6 changes:
> >  - rebased to the new version of master
> >  - fixed build of linux-user configurations
> >  - added new clock for slirp and vnc timers
> >
> > v5 changes:
> >  - multiple fixes for record/replay bugs that appeared after the QEMU core update
> >  - changed reverse debugging to 'since 3.1'
> >
> > v4 changes:
> >  - changed 'since 2.13' to 'since 3.0' in json (as suggested by Eric Blake)
> >
> > v3 changes:
> >  - Fixed PS/2 bug with save/load vm, which caused failures of the replay.
> >  - Rebased to the new code base.
> >  - Minor fixes.
> >
> > v2 changes:
> >  - documented reverse debugging
> >  - fixed start vmstate loading in record mode
> >  - documented qcow2 changes (as suggested by Eric Blake)
> >  - made icount SnapshotInfo field optional (as suggested by Eric Blake)
> >  - renamed qmp commands (as suggested by Eric Blake)
> >  - minor changes
> >
> > ---
> >
> > Pavel Dovgalyuk (25):
> >       block: implement bdrv_snapshot_goto for blkreplay
> >       replay: disable default snapshot for record/replay
> >       replay: update docs for record/replay with block devices
> >       replay: don't drain/flush bdrv queue while RR is working
> >       replay: finish record/replay before closing the disks
> >       qcow2: introduce icount field for snapshots
> >       migration: introduce icount field for snapshots
> >       replay: provide and accessor for rr filename
> >       replay: introduce info hmp/qmp command
> >       replay: introduce breakpoint at the specified step
> >       replay: implement replay-seek command to proceed to the desired step
> >       replay: flush events when exiting
> >       replay: refine replay-time module
> >       translator: fix breakpoint processing
> >       replay: flush rr queue before loading the vmstate
> >       gdbstub: add reverse step support in replay mode
> >       gdbstub: add reverse continue support in replay mode
> >       replay: describe reverse debugging in docs/replay.txt
> >       replay: allow loading any snapshots before recording
> >       replay: wake up vCPU when replaying
> >       replay: replay BH for IDE trim operation
> >       replay: add BH oneshot event for block layer
> >       timer: introduce new virtual clock
> >       slirp: fix ipv6 timers
> >       ui: fix virtual timers
> >
> >
> >  accel/tcg/translator.c    |    9 +
> >  block/blkreplay.c         |    8 +
> >  block/block-backend.c     |    3
> >  block/io.c                |   22 +++
> >  block/qapi.c              |   17 ++-
> >  block/qcow2-snapshot.c    |    9 +
> >  block/qcow2.h             |    2
> >  blockdev.c                |   10 ++
> >  cpus.c                    |   50 +++++---
> >  docs/interop/qcow2.txt    |    4 +
> >  docs/replay.txt           |   45 +++++++
> >  exec.c                    |    6 +
> >  gdbstub.c                 |   50 +++++++-
> >  hmp-commands-info.hx      |   14 ++
> >  hmp-commands.hx           |   30 +++++
> >  hmp.h                     |    3
> >  hw/ide/core.c             |    3
> >  include/block/snapshot.h  |    1
> >  include/qemu/timer.h      |    9 +
> >  include/sysemu/replay.h   |   26 ++++
> >  migration/savevm.c        |   15 +-
> >  qapi/block-core.json      |    5 +
> >  qapi/block.json           |    3
> >  qapi/misc.json            |   68 +++++++++++
> >  replay/Makefile.objs      |    3
> >  replay/replay-debugging.c |  287 +++++++++++++++++++++++++++++++++++++++++++++
> >  replay/replay-events.c    |   30 +++--
> >  replay/replay-internal.h  |    9 +
> >  replay/replay-snapshot.c  |   17 ++-
> >  replay/replay-time.c      |   32 ++---
> >  replay/replay.c           |   38 ++++++
> >  slirp/ip6_icmp.c          |    7 +
> >  stubs/Makefile.objs       |    1
> >  stubs/replay-user.c       |    9 +
> >  stubs/replay.c            |   10 ++
> >  ui/input.c                |    8 +
> >  util/qemu-timer.c         |    2
> >  vl.c                      |   18 ++-
> >  38 files changed, 791 insertions(+), 92 deletions(-)
> >  create mode 100644 replay/replay-debugging.c
> >  create mode 100644 stubs/replay-user.c
> >
> 
> For now I'm queuing 12, 14, 19, 20 (pending question to you) and 23-25.

What about patch 21?

Pavel Dovgalyuk

* Re: [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
  2018-09-12  8:17 Pavel Dovgalyuk
@ 2018-09-13 10:27 ` Paolo Bonzini
  2018-09-13 13:40   ` Pavel Dovgalyuk
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2018-09-13 10:27 UTC (permalink / raw)
  To: Pavel Dovgalyuk, qemu-devel
  Cc: kwolf, peter.maydell, war2jordan, crosthwaite.peter, boost.lists,
	quintela, ciro.santilli, jasowang, mst, zuban32s, armbru,
	maria.klimushenkova, dovgaluk, kraxel, thomas.dullien, mreitz,
	alex.bennee, dgilbert, rth

On 12/09/2018 10:17, Pavel Dovgalyuk wrote:
> GDB remote protocol supports reverse debugging of the targets.
> It includes 'reverse step' and 'reverse continue' operations.
> The first one finds the previous step of the execution,
> and the second one is intended to stop at the last breakpoint that
> would happen when the program is executed normally.
> 
> Reverse debugging is possible in the replay mode, when at least
> one snapshot was created at the record or replay phase.
> QEMU can use these snapshots for travelling back in time with GDB.
> 
> Running the execution in replay mode allows using GDB reverse debugging
> commands:
>  - reverse-stepi (or rsi): Steps one instruction to the past.
>    QEMU loads one of the prior snapshots and proceeds forward to the
>    desired instruction. When that step is reached, execution stops.
>  - reverse-continue (or rc): Runs execution "backwards".
>    QEMU tries to find a breakpoint or watchpoint by loading a prior
>    snapshot and replaying the execution. Then QEMU loads the snapshot
>    again and replays to the latest breakpoint. When there are no
>    breakpoints in the examined section of the execution, QEMU finds one
>    more snapshot and tries again. After the first snapshot is processed,
>    execution stops at this snapshot.
> 
> The set of patches includes the following modifications:
>  - fixes for record/replay breakage caused by QEMU core changes
>  - gdbstub update for reverse debugging support
>  - functions that automatically perform reverse step and reverse
>    continue operations
>  - hmp/qmp commands for manipulating the replay process
>  - improvement of the snapshotting for saving the execution step
>    in the snapshot parameters
>  - adding new clock for correct timer events from vnc and slirp
>  - other record/replay fixes
> 
> The patches are available in the repository:
> https://github.com/ispras/qemu/tree/rr-180911
> 
> v6 changes:
>  - rebased to the new version of master
>  - fixed build of linux-user configurations
>  - added new clock for slirp and vnc timers
> 
> v5 changes:
>  - multiple fixes for record/replay bugs that appeared after the QEMU core update
>  - changed reverse debugging to 'since 3.1'
> 
> v4 changes:
>  - changed 'since 2.13' to 'since 3.0' in json (as suggested by Eric Blake)
> 
> v3 changes:
>  - Fixed PS/2 bug with save/load vm, which caused failures of the replay.
>  - Rebased to the new code base.
>  - Minor fixes.
> 
> v2 changes:
>  - documented reverse debugging
>  - fixed start vmstate loading in record mode
>  - documented qcow2 changes (as suggested by Eric Blake)
>  - made icount SnapshotInfo field optional (as suggested by Eric Blake)
>  - renamed qmp commands (as suggested by Eric Blake)
>  - minor changes
> 
> ---
> 
> Pavel Dovgalyuk (25):
>       block: implement bdrv_snapshot_goto for blkreplay
>       replay: disable default snapshot for record/replay
>       replay: update docs for record/replay with block devices
>       replay: don't drain/flush bdrv queue while RR is working
>       replay: finish record/replay before closing the disks
>       qcow2: introduce icount field for snapshots
>       migration: introduce icount field for snapshots
>       replay: provide and accessor for rr filename
>       replay: introduce info hmp/qmp command
>       replay: introduce breakpoint at the specified step
>       replay: implement replay-seek command to proceed to the desired step
>       replay: flush events when exiting
>       replay: refine replay-time module
>       translator: fix breakpoint processing
>       replay: flush rr queue before loading the vmstate
>       gdbstub: add reverse step support in replay mode
>       gdbstub: add reverse continue support in replay mode
>       replay: describe reverse debugging in docs/replay.txt
>       replay: allow loading any snapshots before recording
>       replay: wake up vCPU when replaying
>       replay: replay BH for IDE trim operation
>       replay: add BH oneshot event for block layer
>       timer: introduce new virtual clock
>       slirp: fix ipv6 timers
>       ui: fix virtual timers
> 
> 
>  accel/tcg/translator.c    |    9 +
>  block/blkreplay.c         |    8 +
>  block/block-backend.c     |    3 
>  block/io.c                |   22 +++
>  block/qapi.c              |   17 ++-
>  block/qcow2-snapshot.c    |    9 +
>  block/qcow2.h             |    2 
>  blockdev.c                |   10 ++
>  cpus.c                    |   50 +++++---
>  docs/interop/qcow2.txt    |    4 +
>  docs/replay.txt           |   45 +++++++
>  exec.c                    |    6 +
>  gdbstub.c                 |   50 +++++++-
>  hmp-commands-info.hx      |   14 ++
>  hmp-commands.hx           |   30 +++++
>  hmp.h                     |    3 
>  hw/ide/core.c             |    3 
>  include/block/snapshot.h  |    1 
>  include/qemu/timer.h      |    9 +
>  include/sysemu/replay.h   |   26 ++++
>  migration/savevm.c        |   15 +-
>  qapi/block-core.json      |    5 +
>  qapi/block.json           |    3 
>  qapi/misc.json            |   68 +++++++++++
>  replay/Makefile.objs      |    3 
>  replay/replay-debugging.c |  287 +++++++++++++++++++++++++++++++++++++++++++++
>  replay/replay-events.c    |   30 +++--
>  replay/replay-internal.h  |    9 +
>  replay/replay-snapshot.c  |   17 ++-
>  replay/replay-time.c      |   32 ++---
>  replay/replay.c           |   38 ++++++
>  slirp/ip6_icmp.c          |    7 +
>  stubs/Makefile.objs       |    1 
>  stubs/replay-user.c       |    9 +
>  stubs/replay.c            |   10 ++
>  ui/input.c                |    8 +
>  util/qemu-timer.c         |    2 
>  vl.c                      |   18 ++-
>  38 files changed, 791 insertions(+), 92 deletions(-)
>  create mode 100644 replay/replay-debugging.c
>  create mode 100644 stubs/replay-user.c
> 

For now I'm queuing 12, 14, 19, 20 (pending question to you) and 23-25.

Kevin, can you take a look at patches 1-5?  I cannot quite evaluate if 4
has any scary ramifications.

Paolo

* [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging
@ 2018-09-12  8:17 Pavel Dovgalyuk
  2018-09-13 10:27 ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Dovgalyuk @ 2018-09-12  8:17 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, peter.maydell, war2jordan, crosthwaite.peter, boost.lists,
	quintela, ciro.santilli, jasowang, mst, zuban32s, armbru,
	maria.klimushenkova, dovgaluk, kraxel, pavel.dovgaluk,
	thomas.dullien, pbonzini, mreitz, alex.bennee, dgilbert, rth

GDB remote protocol supports reverse debugging of the targets.
It includes 'reverse step' and 'reverse continue' operations.
The first one finds the previous step of the execution,
and the second one is intended to stop at the last breakpoint that
would happen when the program is executed normally.

Reverse debugging is possible in the replay mode, when at least
one snapshot was created at the record or replay phase.
QEMU can use these snapshots for travelling back in time with GDB.

Running the execution in replay mode allows using GDB reverse debugging
commands:
 - reverse-stepi (or rsi): Steps one instruction to the past.
   QEMU loads one of the prior snapshots and proceeds forward to the
   desired instruction. When that step is reached, execution stops.
 - reverse-continue (or rc): Runs execution "backwards".
   QEMU tries to find a breakpoint or watchpoint by loading a prior
   snapshot and replaying the execution. Then QEMU loads the snapshot
   again and replays to the latest breakpoint. When there are no
   breakpoints in the examined section of the execution, QEMU finds one
   more snapshot and tries again. After the first snapshot is processed,
   execution stops at this snapshot.
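
The snapshot search described in the two items above can be illustrated
with a small toy model. This is only a sketch under stated assumptions,
not the code from these patches: the execution is modelled as a list of
program counters indexed by step number, snapshots are a sorted list of
step numbers, and all names (latest_snapshot_at_or_before, reverse_step,
reverse_continue) are hypothetical.

```python
from bisect import bisect_right


def latest_snapshot_at_or_before(snapshots, step):
    """Return the largest snapshot step <= step, or None if there is none.

    `snapshots` is assumed to be a sorted list of step numbers.
    """
    i = bisect_right(snapshots, step)
    return snapshots[i - 1] if i else None


def reverse_step(current, snapshots):
    """Toy reverse-stepi: go back exactly one step, if a snapshot allows it."""
    target = current - 1
    base = latest_snapshot_at_or_before(snapshots, target)
    if base is None:
        return current  # no earlier snapshot, so we cannot move back
    step = base  # "load" the snapshot
    while step < target:  # "replay" forward to the desired step
        step += 1
    return step


def reverse_continue(current, snapshots, trace, breakpoints):
    """Toy reverse-continue: stop at the latest breakpoint hit before `current`.

    `trace[step]` is the program counter at `step` (hypothetical model);
    `breakpoints` is a set of program counters.
    """
    upper = current  # exclusive upper bound of the examined section
    base = latest_snapshot_at_or_before(snapshots, upper - 1)
    if base is None:
        return current
    while True:
        last_hit = None
        # Replay the section [base, upper) and remember the latest hit.
        for step in range(base, upper):
            if trace[step] in breakpoints:
                last_hit = step
        if last_hit is not None:
            # The real flow would reload the snapshot and replay to this step.
            return last_hit
        prev = latest_snapshot_at_or_before(snapshots, base - 1)
        if prev is None:
            return base  # first snapshot processed: stop at it
        upper, base = base, prev


if __name__ == "__main__":
    trace = [0x10, 0x14, 0x18, 0x1C, 0x14, 0x20, 0x24, 0x28]
    snapshots = [0, 3, 6]
    print(reverse_step(5, snapshots))
    print(reverse_continue(8, snapshots, trace, {0x14}))
```

In this model a reverse-continue from step 8 first replays the section
starting at snapshot 6, finds nothing, and then falls back to snapshot 3.
That mirrors the "finds one more snapshot and tries again" behaviour
described above.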

The set of patches includes the following modifications:
 - fixes for record/replay breakage caused by QEMU core changes
 - gdbstub update for reverse debugging support
 - functions that automatically perform reverse step and reverse
   continue operations
 - hmp/qmp commands for manipulating the replay process
 - improvement of the snapshotting for saving the execution step
   in the snapshot parameters
 - adding new clock for correct timer events from vnc and slirp
 - other record/replay fixes

The patches are available in the repository:
https://github.com/ispras/qemu/tree/rr-180911

v6 changes:
 - rebased to the new version of master
 - fixed build of linux-user configurations
 - added new clock for slirp and vnc timers

v5 changes:
 - multiple fixes for record/replay bugs that appeared after the QEMU core update
 - changed reverse debugging to 'since 3.1'

v4 changes:
 - changed 'since 2.13' to 'since 3.0' in json (as suggested by Eric Blake)

v3 changes:
 - Fixed PS/2 bug with save/load vm, which caused failures of the replay.
 - Rebased to the new code base.
 - Minor fixes.

v2 changes:
 - documented reverse debugging
 - fixed start vmstate loading in record mode
 - documented qcow2 changes (as suggested by Eric Blake)
 - made icount SnapshotInfo field optional (as suggested by Eric Blake)
 - renamed qmp commands (as suggested by Eric Blake)
 - minor changes

---

Pavel Dovgalyuk (25):
      block: implement bdrv_snapshot_goto for blkreplay
      replay: disable default snapshot for record/replay
      replay: update docs for record/replay with block devices
      replay: don't drain/flush bdrv queue while RR is working
      replay: finish record/replay before closing the disks
      qcow2: introduce icount field for snapshots
      migration: introduce icount field for snapshots
      replay: provide and accessor for rr filename
      replay: introduce info hmp/qmp command
      replay: introduce breakpoint at the specified step
      replay: implement replay-seek command to proceed to the desired step
      replay: flush events when exiting
      replay: refine replay-time module
      translator: fix breakpoint processing
      replay: flush rr queue before loading the vmstate
      gdbstub: add reverse step support in replay mode
      gdbstub: add reverse continue support in replay mode
      replay: describe reverse debugging in docs/replay.txt
      replay: allow loading any snapshots before recording
      replay: wake up vCPU when replaying
      replay: replay BH for IDE trim operation
      replay: add BH oneshot event for block layer
      timer: introduce new virtual clock
      slirp: fix ipv6 timers
      ui: fix virtual timers


 accel/tcg/translator.c    |    9 +
 block/blkreplay.c         |    8 +
 block/block-backend.c     |    3 
 block/io.c                |   22 +++
 block/qapi.c              |   17 ++-
 block/qcow2-snapshot.c    |    9 +
 block/qcow2.h             |    2 
 blockdev.c                |   10 ++
 cpus.c                    |   50 +++++---
 docs/interop/qcow2.txt    |    4 +
 docs/replay.txt           |   45 +++++++
 exec.c                    |    6 +
 gdbstub.c                 |   50 +++++++-
 hmp-commands-info.hx      |   14 ++
 hmp-commands.hx           |   30 +++++
 hmp.h                     |    3 
 hw/ide/core.c             |    3 
 include/block/snapshot.h  |    1 
 include/qemu/timer.h      |    9 +
 include/sysemu/replay.h   |   26 ++++
 migration/savevm.c        |   15 +-
 qapi/block-core.json      |    5 +
 qapi/block.json           |    3 
 qapi/misc.json            |   68 +++++++++++
 replay/Makefile.objs      |    3 
 replay/replay-debugging.c |  287 +++++++++++++++++++++++++++++++++++++++++++++
 replay/replay-events.c    |   30 +++--
 replay/replay-internal.h  |    9 +
 replay/replay-snapshot.c  |   17 ++-
 replay/replay-time.c      |   32 ++---
 replay/replay.c           |   38 ++++++
 slirp/ip6_icmp.c          |    7 +
 stubs/Makefile.objs       |    1 
 stubs/replay-user.c       |    9 +
 stubs/replay.c            |   10 ++
 ui/input.c                |    8 +
 util/qemu-timer.c         |    2 
 vl.c                      |   18 ++-
 38 files changed, 791 insertions(+), 92 deletions(-)
 create mode 100644 replay/replay-debugging.c
 create mode 100644 stubs/replay-user.c

-- 
Pavel Dovgalyuk

Thread overview: 13+ messages
     [not found] <CANzW0mvSX5nWuinDU68W2yJzgoQSGAHPqpz0G36A6NKwRsz_4A@mail.gmail.com>
2018-10-01 14:14 ` [Qemu-devel] [PATCH v6 00/25] Fixing record/replay and adding reverse debugging dovgaluk
2018-10-01 18:22   ` Artem Pisarenko
2018-10-02  7:02     ` Artem Pisarenko
2018-10-03  6:47       ` dovgaluk
2018-10-04 13:15         ` Artem Pisarenko
2018-10-09  9:04           ` Pavel Dovgalyuk
2018-10-09 11:23             ` Artem Pisarenko
2018-10-09 11:26               ` Pavel Dovgalyuk
2018-10-09 12:59                 ` Artem Pisarenko
2018-09-12  8:17 Pavel Dovgalyuk
2018-09-13 10:27 ` Paolo Bonzini
2018-09-13 13:40   ` Pavel Dovgalyuk
2018-09-13 13:46     ` Paolo Bonzini
