* [PATCH v2] icount: make dma reads deterministic
@ 2020-03-03 12:26 Pavel Dovgalyuk
  2020-03-11 12:42 ` Pavel Dovgalyuk
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-03-03 12:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, dovgaluk, jsnow, pavel.dovgaluk, mreitz

Windows guests sometimes make DMA requests with overlapping
target addresses. This leads to the following structure of the iov for
the block driver:

addr size1
addr size2
addr size3

It means that three adjacent disk blocks should be read into the same
memory buffer. Windows does not expect anything specific in these bytes
(whether they hold data from the first block, the last one, or some mix),
but uses them somehow. This leads to non-deterministic guest execution,
because the block driver does not guarantee any particular read order.

This situation was discussed on the mailing list at least twice:
https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html

This patch makes such disk reads deterministic in icount mode.
It splits the whole request into several parts. The parts may overlap,
but the SGs within one part do not.
Parts that are processed later overwrite earlier ones wherever they
overlap.

Examples for different SG part sequences:

1)
A1 1000
A2 1000
A1 1000
A3 1000
->
One request is split into two.
A1 1000
A2 1000
--
A1 1000
A3 1000

2)
A1 800
A2 1000
A1 1000
->
A1 800
A2 1000
--
A1 1000

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

--

v2:
 - Rewritten the loop to split the request instead of skipping the parts
   (suggested by Kevin Wolf)
---
 dma-helpers.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/dma-helpers.c b/dma-helpers.c
index e8a26e81e1..959e114595 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -13,6 +13,8 @@
 #include "trace-root.h"
 #include "qemu/thread.h"
 #include "qemu/main-loop.h"
+#include "sysemu/cpus.h"
+#include "qemu/range.h"
 
 /* #define DEBUG_IOMMU */
 
@@ -142,6 +144,23 @@ static void dma_blk_cb(void *opaque, int ret)
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
         mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
+        /*
+         * Make reads deterministic in icount mode. Windows sometimes issues
+         * disk read requests with overlapping SGs. It leads
+         * to non-determinism, because resulting buffer contents may be mixed
+         * from several sectors. This code splits all SGs into several
+         * groups. SGs in every group do not overlap.
+         */
+        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
+            int i;
+            for (i = 0 ; i < dbs->iov.niov ; ++i) {
+                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base, dbs->iov.iov[i].iov_len,
+                                   (intptr_t)mem, cur_len)) {
+                    mem = NULL;
+                    break;
+                }
+            }
+        }
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);




* RE: [PATCH v2] icount: make dma reads deterministic
  2020-03-03 12:26 [PATCH v2] icount: make dma reads deterministic Pavel Dovgalyuk
@ 2020-03-11 12:42 ` Pavel Dovgalyuk
  0 siblings, 0 replies; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-03-11 12:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, jsnow, pavel.dovgaluk, mreitz

Ping.


Pavel Dovgalyuk






* Re: [PATCH v2] icount: make dma reads deterministic
  2020-06-03  8:15     ` Kevin Wolf
  2020-06-03  8:56       ` Pavel Dovgalyuk
@ 2020-06-03  9:37       ` Pavel Dovgalyuk
  1 sibling, 0 replies; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-06-03  9:37 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: vsementsov, qemu-devel, mreitz, Pavel Dovgalyuk, pavel.dovgaluk, jsnow


On 03.06.2020 11:15, Kevin Wolf wrote:
> Am 03.06.2020 um 07:57 hat Pavel Dovgalyuk geschrieben:
>> On 02.06.2020 18:54, Kevin Wolf wrote:
>>> Am 30.04.2020 um 11:02 hat Pavel Dovgalyuk geschrieben:
>>>> From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
>>>>
>>>> Windows guests sometimes make DMA requests with overlapping
>>>> target addresses. This leads to the following structure of the iov for
>>>> the block driver:
>>>>
>>>> addr size1
>>>> addr size2
>>>> addr size3
>>>>
>>>> It means that three adjacent disk blocks should be read into the same
>>>> memory buffer. Windows does not expect anything specific in these bytes
>>>> (whether they hold data from the first block, the last one, or some mix),
>>>> but uses them somehow. This leads to non-deterministic guest execution,
>>>> because the block driver does not guarantee any particular read order.
>>>>
>>>> This situation was discussed on the mailing list at least twice:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
>>>> https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
>>>>
>>>> This patch makes such disk reads deterministic in icount mode.
>>>> It splits the whole request into several parts. Parts may overlap,
>>>> but SGs inside one part do not overlap.
>>>> Parts that are processed later overwrite the prior ones in case
>>>> of overlapping.
>>>>
>>>> Examples for different SG part sequences:
>>>>
>>>> 1)
>>>> A1 1000
>>>> A2 1000
>>>> A1 1000
>>>> A3 1000
>>>> ->
>>>> One request is split into two.
>>>> A1 1000
>>>> A2 1000
>>>> --
>>>> A1 1000
>>>> A3 1000
>>>>
>>>> 2)
>>>> A1 800
>>>> A2 1000
>>>> A1 1000
>>>> ->
>>>> A1 800
>>>> A2 1000
>>>> --
>>>> A1 1000
>>>>
>>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>>>
>>>> --
>>>>
>>>> v2:
>>>>    - Rewritten the loop to split the request instead of skipping the parts
>>>>      (suggested by Kevin Wolf)
>>>> ---
>>>>    dma-helpers.c |   20 ++++++++++++++++++++
>>>>    1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/dma-helpers.c b/dma-helpers.c
>>>> index e8a26e81e1..a49f9a0e34 100644
>>>> --- a/dma-helpers.c
>>>> +++ b/dma-helpers.c
>>>> @@ -13,6 +13,8 @@
>>>>    #include "trace-root.h"
>>>>    #include "qemu/thread.h"
>>>>    #include "qemu/main-loop.h"
>>>> +#include "sysemu/cpus.h"
>>>> +#include "qemu/range.h"
>>>>    /* #define DEBUG_IOMMU */
>>>> @@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
>>>>            cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>>>>            cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
>>>>            mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
>>>> +        /*
>>>> +         * Make reads deterministic in icount mode. Windows sometimes issues
>>>> +         * disk read requests with overlapping SGs. It leads
>>>> +         * to non-determinism, because resulting buffer contents may be mixed
>>>> +         * from several sectors. This code splits all SGs into several
>>>> +         * groups. SGs in every group do not overlap.
>>>> +         */
>>>> +        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
>>>> +            int i;
>>>> +            for (i = 0 ; i < dbs->iov.niov ; ++i) {
>>>> +                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
>>>> +                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
>>>> +                                   cur_len)) {
>>>> +                    mem = NULL;
>>> Doesn't this leak mem, i.e. should we call dma_memory_unmap()?
>> Thanks, I missed that the memory is unmapped when the request finishes.
>>
>>> Did you verify that it is guaranteed that mapping the same guest memory
>>> twice results in the same host address? v1 compared the SG list (which
>>> has guest addresses) rather than the resulting QEMUIOVector (which has
>>> host addresses).
>> We don't need the host addresses to be equivalent in different runs.
>>
>> The order of the SG list entries is deterministic. This is enough for
>> successful exclusion of the duplicate entries.
> I'm not talking about different runs, but just a single one. You only
> correctly detect an overlap if mapping the same guest address returns
> the same host address for both entries.
>
> Let me see...
>
> address_space_map() has a path where it allocates a bounce buffer rather
> than directly mapping guest memory. We're lucky there because it has
> only one global bounce buffer and if it's already in use, it returns
> NULL instead, which means that we split the request. This is good, but
> if it ever changes to allow more than one bounce buffer, the code would
> break. If we rely on it, maybe better add a comment to the bounce.in_use
> code in address_space_map().
>
> The other code path just translates according to the FlatView of the
> AddressSpace. Do we need some kind of locking to make sure that the
> FlatView doesn't change between both dma_memory_map() calls? On the
> other hand, use_icount is probably not something that you would use
> together with iothreads, so maybe running in the main thread is enough
> to protect it.
>
>>>> +                    break;
>>>> +                }
>>>> +            }
>>>> +        }
>>>>            if (!mem)
>>>>                break;
> Now that I talked about how a NULL return saves us, I wonder, shouldn't
> this check actually come before the overlap checking?

I missed this one.

With unmap in place, we should run the overlap check only for the non-NULL
case. I'll add that check.





* Re: [PATCH v2] icount: make dma reads deterministic
  2020-06-03  8:15     ` Kevin Wolf
@ 2020-06-03  8:56       ` Pavel Dovgalyuk
  2020-06-03  9:37       ` Pavel Dovgalyuk
  1 sibling, 0 replies; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-06-03  8:56 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: vsementsov, qemu-devel, mreitz, Pavel Dovgalyuk, pavel.dovgaluk, jsnow


On 03.06.2020 11:15, Kevin Wolf wrote:
> Am 03.06.2020 um 07:57 hat Pavel Dovgalyuk geschrieben:
>> On 02.06.2020 18:54, Kevin Wolf wrote:
>>> Am 30.04.2020 um 11:02 hat Pavel Dovgalyuk geschrieben:
>>>> From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
>>>>
>>>> Windows guests sometimes make DMA requests with overlapping
>>>> target addresses. This leads to the following structure of the iov for
>>>> the block driver:
>>>>
>>>> addr size1
>>>> addr size2
>>>> addr size3
>>>>
>>>> It means that three adjacent disk blocks should be read into the same
>>>> memory buffer. Windows does not expect anything specific in these bytes
>>>> (whether they hold data from the first block, the last one, or some mix),
>>>> but uses them somehow. This leads to non-deterministic guest execution,
>>>> because the block driver does not guarantee any particular read order.
>>>>
>>>> This situation was discussed on the mailing list at least twice:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
>>>> https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
>>>>
>>>> This patch makes such disk reads deterministic in icount mode.
>>>> It splits the whole request into several parts. Parts may overlap,
>>>> but SGs inside one part do not overlap.
>>>> Parts that are processed later overwrite the prior ones in case
>>>> of overlapping.
>>>>
>>>> Examples for different SG part sequences:
>>>>
>>>> 1)
>>>> A1 1000
>>>> A2 1000
>>>> A1 1000
>>>> A3 1000
>>>> ->
>>>> One request is split into two.
>>>> A1 1000
>>>> A2 1000
>>>> --
>>>> A1 1000
>>>> A3 1000
>>>>
>>>> 2)
>>>> A1 800
>>>> A2 1000
>>>> A1 1000
>>>> ->
>>>> A1 800
>>>> A2 1000
>>>> --
>>>> A1 1000
>>>>
>>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>>>
>>>> --
>>>>
>>>> v2:
>>>>    - Rewritten the loop to split the request instead of skipping the parts
>>>>      (suggested by Kevin Wolf)
>>>> ---
>>>>    dma-helpers.c |   20 ++++++++++++++++++++
>>>>    1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/dma-helpers.c b/dma-helpers.c
>>>> index e8a26e81e1..a49f9a0e34 100644
>>>> --- a/dma-helpers.c
>>>> +++ b/dma-helpers.c
>>>> @@ -13,6 +13,8 @@
>>>>    #include "trace-root.h"
>>>>    #include "qemu/thread.h"
>>>>    #include "qemu/main-loop.h"
>>>> +#include "sysemu/cpus.h"
>>>> +#include "qemu/range.h"
>>>>    /* #define DEBUG_IOMMU */
>>>> @@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
>>>>            cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>>>>            cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
>>>>            mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
>>>> +        /*
>>>> +         * Make reads deterministic in icount mode. Windows sometimes issues
>>>> +         * disk read requests with overlapping SGs. It leads
>>>> +         * to non-determinism, because resulting buffer contents may be mixed
>>>> +         * from several sectors. This code splits all SGs into several
>>>> +         * groups. SGs in every group do not overlap.
>>>> +         */
>>>> +        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
>>>> +            int i;
>>>> +            for (i = 0 ; i < dbs->iov.niov ; ++i) {
>>>> +                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
>>>> +                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
>>>> +                                   cur_len)) {
>>>> +                    mem = NULL;
>>> Doesn't this leak mem, i.e. should we call dma_memory_unmap()?
>> Thanks, I missed that the memory is unmapped when the request finishes.
>>
>>> Did you verify that it is guaranteed that mapping the same guest memory
>>> twice results in the same host address? v1 compared the SG list (which
>>> has guest addresses) rather than the resulting QEMUIOVector (which has
>>> host addresses).
>> We don't need the host addresses to be equivalent in different runs.
>>
>> The order of the SG list entries is deterministic. This is enough for
>> successful exclusion of the duplicate entries.
> I'm not talking about different runs, but just a single one. You only
> correctly detect an overlap if mapping the same guest address returns
> the same host address for both entries.
>
> Let me see...
>
> address_space_map() has a path where it allocates a bounce buffer rather
> than directly mapping guest memory. We're lucky there because it has
> only one global bounce buffer and if it's already in use, it returns
> NULL instead, which means that we split the request. This is good, but
> if it ever changes to allow more than one bounce buffer, the code would
> break. If we rely on it, maybe better add a comment to the bounce.in_use
> code in address_space_map().
> The other code path just translates according to the FlatView of the
> AddressSpace. Do we need some kind of locking to make sure that the
> FlatView doesn't change between both dma_memory_map() calls? On the
> other hand, use_icount is probably not something that you would use
> together with iothreads, so maybe running in the main thread is enough
> to protect it.

Right, in icount mode no thread can modify the mapping while other
threads are working with it.

I'm using host addresses because the iov does not retain information
about the guest addresses. Checking guest addresses instead would make
this code harder to understand and maintain.






* Re: [PATCH v2] icount: make dma reads deterministic
  2020-06-03  5:57   ` Pavel Dovgalyuk
@ 2020-06-03  8:15     ` Kevin Wolf
  2020-06-03  8:56       ` Pavel Dovgalyuk
  2020-06-03  9:37       ` Pavel Dovgalyuk
  0 siblings, 2 replies; 9+ messages in thread
From: Kevin Wolf @ 2020-06-03  8:15 UTC (permalink / raw)
  To: Pavel Dovgalyuk
  Cc: vsementsov, qemu-devel, mreitz, Pavel Dovgalyuk, pavel.dovgaluk, jsnow

Am 03.06.2020 um 07:57 hat Pavel Dovgalyuk geschrieben:
> 
> On 02.06.2020 18:54, Kevin Wolf wrote:
> > Am 30.04.2020 um 11:02 hat Pavel Dovgalyuk geschrieben:
> > > From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
> > > 
> > > Windows guests sometimes make DMA requests with overlapping
> > > target addresses. This leads to the following structure of the iov for
> > > the block driver:
> > > 
> > > addr size1
> > > addr size2
> > > addr size3
> > > 
> > > It means that three adjacent disk blocks should be read into the same
> > > memory buffer. Windows does not expect anything specific in these bytes
> > > (whether they hold data from the first block, the last one, or some mix),
> > > but uses them somehow. This leads to non-deterministic guest execution,
> > > because the block driver does not guarantee any particular read order.
> > > 
> > > This situation was discussed on the mailing list at least twice:
> > > https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
> > > https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
> > > 
> > > This patch makes such disk reads deterministic in icount mode.
> > > It splits the whole request into several parts. Parts may overlap,
> > > but SGs inside one part do not overlap.
> > > Parts that are processed later overwrite the prior ones in case
> > > of overlapping.
> > > 
> > > Examples for different SG part sequences:
> > > 
> > > 1)
> > > A1 1000
> > > A2 1000
> > > A1 1000
> > > A3 1000
> > > ->
> > > One request is split into two.
> > > A1 1000
> > > A2 1000
> > > --
> > > A1 1000
> > > A3 1000
> > > 
> > > 2)
> > > A1 800
> > > A2 1000
> > > A1 1000
> > > ->
> > > A1 800
> > > A2 1000
> > > --
> > > A1 1000
> > > 
> > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> > > 
> > > --
> > > 
> > > v2:
> > >   - Rewritten the loop to split the request instead of skipping the parts
> > >     (suggested by Kevin Wolf)
> > > ---
> > >   dma-helpers.c |   20 ++++++++++++++++++++
> > >   1 file changed, 20 insertions(+)
> > > 
> > > diff --git a/dma-helpers.c b/dma-helpers.c
> > > index e8a26e81e1..a49f9a0e34 100644
> > > --- a/dma-helpers.c
> > > +++ b/dma-helpers.c
> > > @@ -13,6 +13,8 @@
> > >   #include "trace-root.h"
> > >   #include "qemu/thread.h"
> > >   #include "qemu/main-loop.h"
> > > +#include "sysemu/cpus.h"
> > > +#include "qemu/range.h"
> > >   /* #define DEBUG_IOMMU */
> > > @@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
> > >           cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
> > >           cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
> > >           mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
> > > +        /*
> > > +         * Make reads deterministic in icount mode. Windows sometimes issues
> > > +         * disk read requests with overlapping SGs. It leads
> > > +         * to non-determinism, because resulting buffer contents may be mixed
> > > +         * from several sectors. This code splits all SGs into several
> > > +         * groups. SGs in every group do not overlap.
> > > +         */
> > > +        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
> > > +            int i;
> > > +            for (i = 0 ; i < dbs->iov.niov ; ++i) {
> > > +                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
> > > +                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
> > > +                                   cur_len)) {
> > > +                    mem = NULL;
> > Doesn't this leak mem, i.e. should we call dma_memory_unmap()?
> 
> Thanks, I missed that the memory is unmapped when the request finishes.
> 
> > Did you verify that it is guaranteed that mapping the same guest memory
> > twice results in the same host address? v1 compared the SG list (which
> > has guest addresses) rather than the resulting QEMUIOVector (which has
> > host addresses).
> 
> We don't need the host addresses to be equivalent in different runs.
> 
> The order of the SG list entries is deterministic. This is enough for
> successful exclusion of the duplicate entries.

I'm not talking about different runs, but just a single one. You only
correctly detect an overlap if mapping the same guest address returns
the same host address for both entries.

Let me see...

address_space_map() has a path where it allocates a bounce buffer rather
than directly mapping guest memory. We're lucky there because it has
only one global bounce buffer and if it's already in use, it returns
NULL instead, which means that we split the request. This is good, but
if it ever changes to allow more than one bounce buffer, the code would
break. If we rely on it, maybe better add a comment to the bounce.in_use
code in address_space_map().

The other code path just translates according to the FlatView of the
AddressSpace. Do we need some kind of locking to make sure that the
FlatView doesn't change between both dma_memory_map() calls? On the
other hand, use_icount is probably not something that you would use
together with iothreads, so maybe running in the main thread is enough
to protect it.

> > 
> > > +                    break;
> > > +                }
> > > +            }
> > > +        }
> > >           if (!mem)
> > >               break;

Now that I've talked about how a NULL return saves us, I wonder:
shouldn't this check actually come before the overlap check?

Kevin




* Re: [PATCH v2] icount: make dma reads deterministic
  2020-06-02 15:54 ` Kevin Wolf
@ 2020-06-03  5:57   ` Pavel Dovgalyuk
  2020-06-03  8:15     ` Kevin Wolf
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-06-03  5:57 UTC (permalink / raw)
  To: Kevin Wolf, Pavel Dovgalyuk
  Cc: vsementsov, jsnow, qemu-devel, pavel.dovgaluk, mreitz


On 02.06.2020 18:54, Kevin Wolf wrote:
> Am 30.04.2020 um 11:02 hat Pavel Dovgalyuk geschrieben:
>> From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
>>
>> Windows guests sometimes make DMA requests with overlapping
>> target addresses. This leads to the following structure of the iov for
>> the block driver:
>>
>> addr size1
>> addr size2
>> addr size3
>>
>> It means that three adjacent disk blocks should be read into the same
>> memory buffer. Windows does not expect anything specific in these bytes
>> (whether they hold data from the first block, the last one, or some mix),
>> but uses them somehow. This leads to non-deterministic guest execution,
>> because the block driver does not guarantee any particular read order.
>>
>> This situation was discussed on the mailing list at least twice:
>> https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
>> https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
>>
>> This patch makes such disk reads deterministic in icount mode.
>> It splits the whole request into several parts. Parts may overlap,
>> but SGs inside one part do not overlap.
>> Parts that are processed later overwrite the prior ones in case
>> of overlapping.
>>
>> Examples for different SG part sequences:
>>
>> 1)
>> A1 1000
>> A2 1000
>> A1 1000
>> A3 1000
>> ->
>> One request is split into two.
>> A1 1000
>> A2 1000
>> --
>> A1 1000
>> A3 1000
>>
>> 2)
>> A1 800
>> A2 1000
>> A1 1000
>> ->
>> A1 800
>> A2 1000
>> --
>> A1 1000
>>
>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>
>> --
>>
>> v2:
>>   - Rewritten the loop to split the request instead of skipping the parts
>>     (suggested by Kevin Wolf)
>> ---
>>   dma-helpers.c |   20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/dma-helpers.c b/dma-helpers.c
>> index e8a26e81e1..a49f9a0e34 100644
>> --- a/dma-helpers.c
>> +++ b/dma-helpers.c
>> @@ -13,6 +13,8 @@
>>   #include "trace-root.h"
>>   #include "qemu/thread.h"
>>   #include "qemu/main-loop.h"
>> +#include "sysemu/cpus.h"
>> +#include "qemu/range.h"
>>   
>>   /* #define DEBUG_IOMMU */
>>   
>> @@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
>>           cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>>           cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
>>           mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
>> +        /*
>> +         * Make reads deterministic in icount mode. Windows sometimes issues
>> +         * disk read requests with overlapping SGs. It leads
>> +         * to non-determinism, because resulting buffer contents may be mixed
>> +         * from several sectors. This code splits all SGs into several
>> +         * groups. SGs in every group do not overlap.
>> +         */
>> +        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
>> +            int i;
>> +            for (i = 0 ; i < dbs->iov.niov ; ++i) {
>> +                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
>> +                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
>> +                                   cur_len)) {
>> +                    mem = NULL;
> Doesn't this leak mem, i.e. should we call dma_memory_unmap()?

Thanks, I missed that the memory is unmapped when the request finishes.

> Did you verify that it is guaranteed that mapping the same guest memory
> twice results in the same host address? v1 compared the SG list (which
> has guest addresses) rather than the resulting QEMUIOVector (which has
> host addresses).

We don't need the host addresses to be equivalent in different runs.

The order of the SG list entries is deterministic. This is enough to
reliably exclude the duplicate entries.

>
>> +                    break;
>> +                }
>> +            }
>> +        }
>>           if (!mem)
>>               break;
>>           qemu_iovec_add(&dbs->iov, mem, cur_len);
> Kevin
>



* Re: [PATCH v2] icount: make dma reads deterministic
  2020-04-30  9:02 Pavel Dovgalyuk
  2020-06-01 10:36 ` Pavel Dovgalyuk
@ 2020-06-02 15:54 ` Kevin Wolf
  2020-06-03  5:57   ` Pavel Dovgalyuk
  1 sibling, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2020-06-02 15:54 UTC (permalink / raw)
  To: Pavel Dovgalyuk
  Cc: vsementsov, qemu-devel, mreitz, dovgaluk, pavel.dovgaluk, jsnow

Am 30.04.2020 um 11:02 hat Pavel Dovgalyuk geschrieben:
> From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
> 
> Windows guests sometimes make DMA requests with overlapping
> target addresses. This leads to the following structure of the iov for
> the block driver:
> 
> addr size1
> addr size2
> addr size3
> 
> It means that three adjacent disk blocks should be read into the same
> memory buffer. Windows does not expect anything specific in these bytes
> (whether they hold data from the first block, the last one, or some mix),
> but uses them somehow. This leads to non-deterministic guest execution,
> because the block driver does not guarantee any particular read order.
> 
> This situation was discussed on the mailing list at least twice:
> https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
> https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
> 
> This patch makes such disk reads deterministic in icount mode.
> It splits the whole request into several parts. Parts may overlap,
> but SGs inside one part do not overlap.
> Parts that are processed later overwrite the prior ones in case
> of overlapping.
> 
> Examples for different SG part sequences:
> 
> 1)
> A1 1000
> A2 1000
> A1 1000
> A3 1000
> ->
> One request is split into two.
> A1 1000
> A2 1000
> --
> A1 1000
> A3 1000
> 
> 2)
> A1 800
> A2 1000
> A1 1000
> ->
> A1 800
> A2 1000
> --
> A1 1000
> 
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> 
> --
> 
> v2:
>  - Rewritten the loop to split the request instead of skipping the parts
>    (suggested by Kevin Wolf)
> ---
>  dma-helpers.c |   20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/dma-helpers.c b/dma-helpers.c
> index e8a26e81e1..a49f9a0e34 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -13,6 +13,8 @@
>  #include "trace-root.h"
>  #include "qemu/thread.h"
>  #include "qemu/main-loop.h"
> +#include "sysemu/cpus.h"
> +#include "qemu/range.h"
>  
>  /* #define DEBUG_IOMMU */
>  
> @@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
>          cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>          cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
>          mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
> +        /*
> +         * Make reads deterministic in icount mode. Windows sometimes issues
> +         * disk read requests with overlapping SGs. It leads
> +         * to non-determinism, because resulting buffer contents may be mixed
> +         * from several sectors. This code splits all SGs into several
> +         * groups. SGs in every group do not overlap.
> +         */
> +        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
> +            int i;
> +            for (i = 0 ; i < dbs->iov.niov ; ++i) {
> +                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
> +                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
> +                                   cur_len)) {
> +                    mem = NULL;

Doesn't this leak mem, i.e. should we call dma_memory_unmap()?

Did you verify that it is guaranteed that mapping the same guest memory
twice results in the same host address? v1 compared the SG list (which
has guest addresses) rather than the resulting QEMUIOVector (which has
host addresses).

> +                    break;
> +                }
> +            }
> +        }
>          if (!mem)
>              break;
>          qemu_iovec_add(&dbs->iov, mem, cur_len);
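For reference, the ranges_overlap() helper used in the patch comes from "qemu/range.h" (which the patch includes); a minimal stand-in with the same signature behaves like this sketch:

```c
/* Sketch of ranges_overlap(): two [first, first+len-1] intervals
 * overlap unless one ends strictly before the other begins. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool ranges_overlap(uint64_t first1, uint64_t len1,
                           uint64_t first2, uint64_t len2)
{
    uint64_t last1 = first1 + len1 - 1;
    uint64_t last2 = first2 + len2 - 1;
    return !(last2 < first1 || last1 < first2);
}
```

Note that adjacent ranges (one ending exactly where the next begins) do not count as overlapping, so back-to-back SGs stay in one group.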

Kevin




* Re: [PATCH v2] icount: make dma reads deterministic
  2020-04-30  9:02 Pavel Dovgalyuk
@ 2020-06-01 10:36 ` Pavel Dovgalyuk
  2020-06-02 15:54 ` Kevin Wolf
  1 sibling, 0 replies; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-06-01 10:36 UTC (permalink / raw)
  To: Pavel Dovgalyuk, qemu-devel
  Cc: kwolf, vsementsov, jsnow, pavel.dovgaluk, mreitz

ping

On 30.04.2020 12:02, Pavel Dovgalyuk wrote:
> From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
>
> A Windows guest sometimes makes DMA requests with overlapping
> target addresses. This leads to the following iov structure for
> the block driver:
>
> addr size1
> addr size2
> addr size3
>
> It means that three adjacent disk blocks should be read into the same
> memory buffer. Windows does not expect anything specific in these bytes
> (whether they hold data from the first block, the last one, or some mix),
> but uses them somehow. This leads to non-determinism in the guest execution,
> because the block driver does not preserve any ordering of the reads.
>
> This situation was discussed on the mailing list at least twice:
> https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
> https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html
>
> This patch makes such disk reads deterministic in icount mode.
> It splits the whole request into several parts. The parts may
> overlap each other, but the SGs within one part do not.
> Parts that are processed later overwrite the earlier ones where
> they overlap.
>
> Examples of splitting different SG sequences:
>
> 1)
> A1 1000
> A2 1000
> A1 1000
> A3 1000
> ->
> One request is split into two.
> A1 1000
> A2 1000
> --
> A1 1000
> A3 1000
>
> 2)
> A1 800
> A2 1000
> A1 1000
> ->
> A1 800
> A2 1000
> --
> A1 1000
>
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>
> --
>
> v2:
>   - Rewrote the loop to split the request instead of skipping parts
>     (suggested by Kevin Wolf)
> ---
>   dma-helpers.c |   20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
>
> diff --git a/dma-helpers.c b/dma-helpers.c
> index e8a26e81e1..a49f9a0e34 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -13,6 +13,8 @@
>   #include "trace-root.h"
>   #include "qemu/thread.h"
>   #include "qemu/main-loop.h"
> +#include "sysemu/cpus.h"
> +#include "qemu/range.h"
>   
>   /* #define DEBUG_IOMMU */
>   
> @@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
>           cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>           cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
>           mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
> +        /*
> +         * Make reads deterministic in icount mode. Windows sometimes issues
> +         * disk read requests with overlapping SGs. It leads
> +         * to non-determinism, because resulting buffer contents may be mixed
> +         * from several sectors. This code splits all SGs into several
> +         * groups. SGs in every group do not overlap.
> +         */
> +        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
> +            int i;
> +            for (i = 0 ; i < dbs->iov.niov ; ++i) {
> +                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
> +                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
> +                                   cur_len)) {
> +                    mem = NULL;
> +                    break;
> +                }
> +            }
> +        }
>           if (!mem)
>               break;
>           qemu_iovec_add(&dbs->iov, mem, cur_len);
>
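To make the non-determinism concrete: when two reads land in the same buffer, the final contents depend only on which completion runs last. A toy illustration (plain C, unrelated to the QEMU code):

```c
/* Simulate two DMA completions targeting the same 4-byte buffer;
 * whichever "completes" second determines the final contents. */
#include <assert.h>
#include <string.h>

static char final_byte(char first, char second)
{
    char buf[4];
    memset(buf, first, sizeof(buf));   /* first completion lands */
    memset(buf, second, sizeof(buf));  /* second completion overwrites */
    return buf[0];
}
```

Splitting the request into ordered, internally non-overlapping parts pins down which write happens last, which is exactly what record/replay needs.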



* [PATCH v2] icount: make dma reads deterministic
@ 2020-04-30  9:02 Pavel Dovgalyuk
  2020-06-01 10:36 ` Pavel Dovgalyuk
  2020-06-02 15:54 ` Kevin Wolf
  0 siblings, 2 replies; 9+ messages in thread
From: Pavel Dovgalyuk @ 2020-04-30  9:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, mreitz, dovgaluk, pavel.dovgaluk, jsnow

From: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>

A Windows guest sometimes makes DMA requests with overlapping
target addresses. This leads to the following iov structure for
the block driver:

addr size1
addr size2
addr size3

It means that three adjacent disk blocks should be read into the same
memory buffer. Windows does not expect anything specific in these bytes
(whether they hold data from the first block, the last one, or some mix),
but uses them somehow. This leads to non-determinism in the guest execution,
because the block driver does not preserve any ordering of the reads.

This situation was discussed on the mailing list at least twice:
https://lists.gnu.org/archive/html/qemu-devel/2010-09/msg01996.html
https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg05185.html

This patch makes such disk reads deterministic in icount mode.
It splits the whole request into several parts. The parts may
overlap each other, but the SGs within one part do not.
Parts that are processed later overwrite the earlier ones where
they overlap.

Examples of splitting different SG sequences:

1)
A1 1000
A2 1000
A1 1000
A3 1000
->
One request is split into two.
A1 1000
A2 1000
--
A1 1000
A3 1000

2)
A1 800
A2 1000
A1 1000
->
A1 800
A2 1000
--
A1 1000

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>

--

v2:
 - Rewrote the loop to split the request instead of skipping parts
   (suggested by Kevin Wolf)
---
 dma-helpers.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/dma-helpers.c b/dma-helpers.c
index e8a26e81e1..a49f9a0e34 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -13,6 +13,8 @@
 #include "trace-root.h"
 #include "qemu/thread.h"
 #include "qemu/main-loop.h"
+#include "sysemu/cpus.h"
+#include "qemu/range.h"
 
 /* #define DEBUG_IOMMU */
 
@@ -142,6 +144,24 @@ static void dma_blk_cb(void *opaque, int ret)
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
         mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
+        /*
+         * Make reads deterministic in icount mode. Windows sometimes issues
+         * disk read requests with overlapping SGs. It leads
+         * to non-determinism, because resulting buffer contents may be mixed
+         * from several sectors. This code splits all SGs into several
+         * groups. SGs in every group do not overlap.
+         */
+        if (use_icount && dbs->dir == DMA_DIRECTION_FROM_DEVICE) {
+            int i;
+            for (i = 0 ; i < dbs->iov.niov ; ++i) {
+                if (ranges_overlap((intptr_t)dbs->iov.iov[i].iov_base,
+                                   dbs->iov.iov[i].iov_len, (intptr_t)mem,
+                                   cur_len)) {
+                    mem = NULL;
+                    break;
+                }
+            }
+        }
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);




end of thread, other threads:[~2020-06-03  9:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-03 12:26 [PATCH v2] icount: make dma reads deterministic Pavel Dovgalyuk
2020-03-11 12:42 ` Pavel Dovgalyuk
2020-04-30  9:02 Pavel Dovgalyuk
2020-06-01 10:36 ` Pavel Dovgalyuk
2020-06-02 15:54 ` Kevin Wolf
2020-06-03  5:57   ` Pavel Dovgalyuk
2020-06-03  8:15     ` Kevin Wolf
2020-06-03  8:56       ` Pavel Dovgalyuk
2020-06-03  9:37       ` Pavel Dovgalyuk
