QEMU-Devel Archive on lore.kernel.org
* [PATCH 0/2] two atomic_cmpxchg() related fixes
@ 2020-06-16  4:50 Halil Pasic
  2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Halil Pasic @ 2020-06-16  4:50 UTC (permalink / raw)
  To: Matthew Rosato, Cornelia Huck, Christian Borntraeger,
	Michael S. Tsirkin, qemu-s390x, qemu-devel
  Cc: Halil Pasic, Thomas Huth, David Hildenbrand, Richard Henderson

Long story short: the compiler can generate code that does two
distinct fetches of *ind_addr for old and _old. If that happens, we
cannot tell whether the desired xchg took place or not.

Halil Pasic (2):
  virtio-ccw: fix virtio_set_ind_atomic
  s390x/pci: fix set_ind_atomic

 hw/s390x/s390-pci-bus.c | 16 +++++++++-------
 hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
 2 files changed, 19 insertions(+), 15 deletions(-)


base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb
-- 
2.17.1



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  4:50 [PATCH 0/2] two atomic_cmpxchg() related fixes Halil Pasic
@ 2020-06-16  4:50 ` Halil Pasic
  2020-06-16  5:58   ` Christian Borntraeger
                     ` (2 more replies)
  2020-06-16  4:50 ` [PATCH 2/2] s390x/pci: fix set_ind_atomic Halil Pasic
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 22+ messages in thread
From: Halil Pasic @ 2020-06-16  4:50 UTC (permalink / raw)
  To: Matthew Rosato, Cornelia Huck, Christian Borntraeger,
	Michael S. Tsirkin, qemu-s390x, qemu-devel
  Cc: Halil Pasic, Thomas Huth, David Hildenbrand, Richard Henderson

The atomic_cmpxchg() loop is broken because we occasionally end up with
old and _old having different values (a conforming compiler can generate
code that accesses *ind_addr again to pick up a value for _old instead of
using the value of old that was already fetched, as the rules of the
abstract machine allow). This means the underlying CS instruction
may use a different old (_old) than the one we intended to use when
atomic_cmpxchg() performed the xchg part.

Let us use volatile to force the rules of the abstract machine for
accesses to *ind_addr. Let us also rewrite the loop so that the
newly observed old value is used to compute the new desired value when
the xchg part is not performed.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Andre Wild <Andre.Wild1@ibm.com>
Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
---
 hw/s390x/virtio-ccw.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index c1f4bb1d33..3c988a000b 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
 static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
                                      uint8_t to_be_set)
 {
-    uint8_t ind_old, ind_new;
+    uint8_t expected, actual;
     hwaddr len = 1;
-    uint8_t *ind_addr;
+    /* avoid  multiple fetches */
+    uint8_t volatile *ind_addr;
 
     ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
     if (!ind_addr) {
@@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
                      __func__, sch->cssid, sch->ssid, sch->schid);
         return -1;
     }
+    actual = *ind_addr;
     do {
-        ind_old = *ind_addr;
-        ind_new = ind_old | to_be_set;
-    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
-    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
-    cpu_physical_memory_unmap(ind_addr, len, 1, len);
+        expected = actual;
+        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
+    } while (actual != expected);
+    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
+    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
 
-    return ind_old;
+    return actual;
 }
 
 static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
-- 
2.17.1




* [PATCH 2/2] s390x/pci: fix set_ind_atomic
  2020-06-16  4:50 [PATCH 0/2] two atomic_cmpxchg() related fixes Halil Pasic
  2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
@ 2020-06-16  4:50 ` Halil Pasic
  2020-07-01 13:14   ` Christian Borntraeger
  2020-07-01 12:01 ` [PATCH 0/2] two atomic_cmpxchg() related fixes Cornelia Huck
  2020-07-02 11:18 ` Cornelia Huck
  3 siblings, 1 reply; 22+ messages in thread
From: Halil Pasic @ 2020-06-16  4:50 UTC (permalink / raw)
  To: Matthew Rosato, Cornelia Huck, Christian Borntraeger,
	Michael S. Tsirkin, qemu-s390x, qemu-devel
  Cc: Halil Pasic, Thomas Huth, David Hildenbrand, Richard Henderson

The atomic_cmpxchg() loop is broken because we occasionally end up with
old and _old having different values (a conforming compiler can generate
code that accesses *ind_addr again to pick up a value for _old instead of
using the value of old that was already fetched, as the rules of the
abstract machine allow). This means the underlying CS instruction
may use a different old (_old) than the one we intended to use when
atomic_cmpxchg() performed the xchg part.

Let us use volatile to force the rules of the abstract machine for
accesses to *ind_addr. Let us also rewrite the loop so that the
newly observed old value is used to compute the new desired value when
the xchg part is not performed.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: 8cba80c3a0 ("s390: Add PCI bus support")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 hw/s390x/s390-pci-bus.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index c4a4259f0c..80eb957a91 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -637,22 +637,24 @@ static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
 
 static uint8_t set_ind_atomic(uint64_t ind_loc, uint8_t to_be_set)
 {
-    uint8_t ind_old, ind_new;
+    uint8_t expected, actual;
     hwaddr len = 1;
-    uint8_t *ind_addr;
+    /* avoid  multiple fetches */
+    uint8_t volatile *ind_addr;
 
     ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
     if (!ind_addr) {
         s390_pci_generate_error_event(ERR_EVENT_AIRERR, 0, 0, 0, 0);
         return -1;
     }
+    actual = *ind_addr;
     do {
-        ind_old = *ind_addr;
-        ind_new = ind_old | to_be_set;
-    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
-    cpu_physical_memory_unmap(ind_addr, len, 1, len);
+        expected = actual;
+        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
+    } while (actual != expected);
+    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
 
-    return ind_old;
+    return actual;
 }
 
 static void s390_msi_ctrl_write(void *opaque, hwaddr addr, uint64_t data,
-- 
2.17.1




* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
@ 2020-06-16  5:58   ` Christian Borntraeger
  2020-06-16  6:33     ` Cornelia Huck
  2020-06-16  9:31     ` Halil Pasic
  2020-07-01 13:13   ` Christian Borntraeger
  2020-07-04 18:34   ` Michael S. Tsirkin
  2 siblings, 2 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-06-16  5:58 UTC (permalink / raw)
  To: Halil Pasic, Matthew Rosato, Cornelia Huck, Michael S. Tsirkin,
	qemu-s390x, qemu-devel, Paolo Bonzini
  Cc: Thomas Huth, Richard Henderson, Andreas Krebbel, David Hildenbrand



On 16.06.20 06:50, Halil Pasic wrote:
> The atomic_cmpxchg() loop is broken because we occasionally end up with
> old and _old having different values (a legit compiler can generate code
> that accessed *ind_addr again to pick up a value for _old instead of
> using the value of old that was already fetched according to the
> rules of the abstract machine). This means the underlying CS instruction
> may use a different old (_old) than the one we intended to use if
> atomic_cmpxchg() performed the xchg part.
> 
> Let us use volatile to force the rules of the abstract machine for
> accesses to *ind_addr. Let us also rewrite the loop so that the
> new old is used to compute the new desired value if the xchg part
> is not performed.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
> ---
>  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> index c1f4bb1d33..3c988a000b 100644
> --- a/hw/s390x/virtio-ccw.c
> +++ b/hw/s390x/virtio-ccw.c
> @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
>  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>                                       uint8_t to_be_set)
>  {
> -    uint8_t ind_old, ind_new;
> +    uint8_t expected, actual;
>      hwaddr len = 1;
> -    uint8_t *ind_addr;
> +    /* avoid  multiple fetches */
> +    uint8_t volatile *ind_addr;
>  
>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
>      if (!ind_addr) {
> @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>                       __func__, sch->cssid, sch->ssid, sch->schid);
>          return -1;
>      }
> +    actual = *ind_addr;
>      do {
> -        ind_old = *ind_addr;

To make things easier to understand: adding a barrier in here also fixes
the issue. Reasoning follows below:

> -        ind_new = ind_old | to_be_set;

with an analysis from Andreas (cc)

 #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
     typeof_strip_qual(*ptr) _old = (old);                               \
     (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
     _old;                                                               \
 })
 
ind_old is copied into _old in the macro. Instead of doing the copy from the
register the compiler reloads the value from memory. The result is that _old
and ind_old end up having different values. _old in r1 with the bits set
already and ind_old in r10 with the bits cleared. _old gets updated by CS
and matches ind_old afterwards - both with the bits being 0. So the !=
compare is false and the loop is left without having set any bits.


Paolo (added to To:),
I am wondering whether it would be safer to add a barrier or something
like this in the macros in include/qemu/atomic.h.




> -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
> -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> +        expected = actual;
> +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> +    } while (actual != expected);
> +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
> +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
>  
> -    return ind_old;
> +    return actual;
>  }
>  
>  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
> 





* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  5:58   ` Christian Borntraeger
@ 2020-06-16  6:33     ` Cornelia Huck
  2020-06-16  6:45       ` Christian Borntraeger
  2020-06-17 23:56       ` Halil Pasic
  2020-06-16  9:31     ` Halil Pasic
  1 sibling, 2 replies; 22+ messages in thread
From: Cornelia Huck @ 2020-06-16  6:33 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Halil Pasic, qemu-s390x,
	Paolo Bonzini, Andreas Krebbel, Richard Henderson

On Tue, 16 Jun 2020 07:58:53 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 16.06.20 06:50, Halil Pasic wrote:
> > The atomic_cmpxchg() loop is broken because we occasionally end up with
> > old and _old having different values (a legit compiler can generate code
> > that accessed *ind_addr again to pick up a value for _old instead of
> > using the value of old that was already fetched according to the
> > rules of the abstract machine). This means the underlying CS instruction
> > may use a different old (_old) than the one we intended to use if
> > atomic_cmpxchg() performed the xchg part.
> > 
> > Let us use volatile to force the rules of the abstract machine for
> > accesses to *ind_addr. Let us also rewrite the loop so that the
> > new old is used to compute the new desired value if the xchg part
> > is not performed.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> > Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
> > ---
> >  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> > index c1f4bb1d33..3c988a000b 100644
> > --- a/hw/s390x/virtio-ccw.c
> > +++ b/hw/s390x/virtio-ccw.c
> > @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
> >  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >                                       uint8_t to_be_set)
> >  {
> > -    uint8_t ind_old, ind_new;
> > +    uint8_t expected, actual;
> >      hwaddr len = 1;
> > -    uint8_t *ind_addr;
> > +    /* avoid  multiple fetches */
> > +    uint8_t volatile *ind_addr;
> >  
> >      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
> >      if (!ind_addr) {
> > @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >                       __func__, sch->cssid, sch->ssid, sch->schid);
> >          return -1;
> >      }
> > +    actual = *ind_addr;
> >      do {
> > -        ind_old = *ind_addr;  
> 
> to make things easier to understand. Adding a barrier in here also fixes the issue.
> Reasoning follows below:
> 
> > -        ind_new = ind_old | to_be_set;  
> 
> with an analysis from Andreas (cc)
> 
>  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
>      typeof_strip_qual(*ptr) _old = (old);                               \
>      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
>                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
>      _old;                                                               \
>  })
>  
> ind_old is copied into _old in the macro. Instead of doing the copy from the
> register the compiler reloads the value from memory. The result is that _old
> and ind_old end up having different values. _old in r1 with the bits set
> already and ind_old in r10 with the bits cleared. _old gets updated by CS
> and matches ind_old afterwards - both with the bits being 0. So the !=
> compare is false and the loop is left without having set any bits.
> 
> 
> Paolo (to),
> I am asking myself if it would be safer to add a barrier or something like
> this in the macros in include/qemu/atomic.h. 

I'm also wondering whether this has been seen on other architectures as
well? There are also some callers in non-s390x code, and dealing with
this in common code would catch them as well.

> 
> 
> 
> 
> > -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> > -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
> > -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> > +        expected = actual;
> > +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> > +    } while (actual != expected);
> > +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
> > +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
> >  
> > -    return ind_old;
> > +    return actual;
> >  }
> >  
> >  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
> >   
> 
> 




* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  6:33     ` Cornelia Huck
@ 2020-06-16  6:45       ` Christian Borntraeger
  2020-06-19  7:14         ` Cornelia Huck
  2020-06-17 23:56       ` Halil Pasic
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Borntraeger @ 2020-06-16  6:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Halil Pasic, qemu-s390x,
	Paolo Bonzini, Andreas Krebbel, Richard Henderson



On 16.06.20 08:33, Cornelia Huck wrote:
> On Tue, 16 Jun 2020 07:58:53 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> On 16.06.20 06:50, Halil Pasic wrote:
>>> The atomic_cmpxchg() loop is broken because we occasionally end up with
>>> old and _old having different values (a legit compiler can generate code
>>> that accessed *ind_addr again to pick up a value for _old instead of
>>> using the value of old that was already fetched according to the
>>> rules of the abstract machine). This means the underlying CS instruction
>>> may use a different old (_old) than the one we intended to use if
>>> atomic_cmpxchg() performed the xchg part.
>>>
>>> Let us use volatile to force the rules of the abstract machine for
>>> accesses to *ind_addr. Let us also rewrite the loop so that the
>>> new old is used to compute the new desired value if the xchg part
>>> is not performed.
>>>
>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>> Reported-by: Andre Wild <Andre.Wild1@ibm.com>
>>> Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
>>> ---
>>>  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
>>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
>>> index c1f4bb1d33..3c988a000b 100644
>>> --- a/hw/s390x/virtio-ccw.c
>>> +++ b/hw/s390x/virtio-ccw.c
>>> @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
>>>  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>>>                                       uint8_t to_be_set)
>>>  {
>>> -    uint8_t ind_old, ind_new;
>>> +    uint8_t expected, actual;
>>>      hwaddr len = 1;
>>> -    uint8_t *ind_addr;
>>> +    /* avoid  multiple fetches */
>>> +    uint8_t volatile *ind_addr;
>>>  
>>>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
>>>      if (!ind_addr) {
>>> @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>>>                       __func__, sch->cssid, sch->ssid, sch->schid);
>>>          return -1;
>>>      }
>>> +    actual = *ind_addr;
>>>      do {
>>> -        ind_old = *ind_addr;  
>>
>> to make things easier to understand. Adding a barrier in here also fixes the issue.
>> Reasoning follows below:
>>
>>> -        ind_new = ind_old | to_be_set;  
>>
>> with an analysis from Andreas (cc)
>>
>>  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
>>      typeof_strip_qual(*ptr) _old = (old);                               \
>>      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
>>                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
>>      _old;                                                               \
>>  })
>>  
>> ind_old is copied into _old in the macro. Instead of doing the copy from the
>> register the compiler reloads the value from memory. The result is that _old
>> and ind_old end up having different values. _old in r1 with the bits set
>> already and ind_old in r10 with the bits cleared. _old gets updated by CS
>> and matches ind_old afterwards - both with the bits being 0. So the !=
>> compare is false and the loop is left without having set any bits.
>>
>>
>> Paolo (to),
>> I am asking myself if it would be safer to add a barrier or something like
>> this in the macros in include/qemu/atomic.h. 

Having said this, I think that the refactoring from Halil (to re-use actual) 
also makes sense independent of the fix. 
> 
> I'm also wondering whether this has been seen on other architectures as
> well? There are also some callers in non-s390x code, and dealing with
> this in common code would catch them as well.
> 
>>
>>
>>
>>
>>> -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
>>> -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
>>> -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
>>> +        expected = actual;
>>> +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
>>> +    } while (actual != expected);
>>> +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
>>> +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
>>>  
>>> -    return ind_old;
>>> +    return actual;
>>>  }
>>>  
>>>  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
>>>   
>>
>>
> 



* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  5:58   ` Christian Borntraeger
  2020-06-16  6:33     ` Cornelia Huck
@ 2020-06-16  9:31     ` Halil Pasic
  1 sibling, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2020-06-16  9:31 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin, Cornelia Huck,
	David Hildenbrand, qemu-devel, qemu-s390x, Paolo Bonzini,
	Andreas Krebbel, Richard Henderson

On Tue, 16 Jun 2020 07:58:53 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> 
> 
> On 16.06.20 06:50, Halil Pasic wrote:
> > The atomic_cmpxchg() loop is broken because we occasionally end up with
> > old and _old having different values (a legit compiler can generate code
> > that accessed *ind_addr again to pick up a value for _old instead of
> > using the value of old that was already fetched according to the
> > rules of the abstract machine). This means the underlying CS instruction
> > may use a different old (_old) than the one we intended to use if
> > atomic_cmpxchg() performed the xchg part.
> > 
> > Let us use volatile to force the rules of the abstract machine for
> > accesses to *ind_addr. Let us also rewrite the loop so that the
> > new old is used to compute the new desired value if the xchg part
> > is not performed.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> > Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
> > ---
> >  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> > index c1f4bb1d33..3c988a000b 100644
> > --- a/hw/s390x/virtio-ccw.c
> > +++ b/hw/s390x/virtio-ccw.c
> > @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
> >  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >                                       uint8_t to_be_set)
> >  {
> > -    uint8_t ind_old, ind_new;
> > +    uint8_t expected, actual;
> >      hwaddr len = 1;
> > -    uint8_t *ind_addr;
> > +    /* avoid  multiple fetches */
> > +    uint8_t volatile *ind_addr;
> >  
> >      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
> >      if (!ind_addr) {
> > @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >                       __func__, sch->cssid, sch->ssid, sch->schid);
> >          return -1;
> >      }
> > +    actual = *ind_addr;
> >      do {
> > -        ind_old = *ind_addr;
> 
> to make things easier to understand. Adding a barrier in here also fixes the issue.
> Reasoning follows below:
> 
> > -        ind_new = ind_old | to_be_set;
> 
> with an analysis from Andreas (cc)
> 
>  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
>      typeof_strip_qual(*ptr) _old = (old);                               \
>      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
>                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
>      _old;                                                               \
>  })
>  

There is also the 

#define atomic_cmpxchg(ptr, old, new) __sync_val_compare_and_swap(ptr, old, new)

variant, I guess, when the C11 stuff is not available. I don't know if
that variant is guaranteed to not have problems with multiple loads.
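For reference, a sketch of the loop with that legacy builtin
(hypothetical helper; __sync_val_compare_and_swap returns the value it
found, so the caller does the comparison itself). Since the builtin
takes old as a plain value argument rather than expanding a macro-local
`_old = (old)` copy, the specific reload hazard analyzed above does not
obviously apply, but that is an assumption, not something the thread
settles:

```c
#include <stdint.h>

/* Sketch of the __sync-based shape: feed the returned observed value
 * back in as the next expected value, with one initial volatile fetch. */
static uint8_t set_ind_sync(uint8_t *ind_addr, uint8_t to_be_set)
{
    uint8_t expected, actual;

    actual = *(volatile uint8_t *)ind_addr;  /* single explicit fetch */
    do {
        expected = actual;
        /* returns the value at *ind_addr before the operation */
        actual = __sync_val_compare_and_swap(ind_addr, expected,
                                             (uint8_t)(expected | to_be_set));
    } while (actual != expected);
    return actual;
}
```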



> ind_old is copied into _old in the macro. Instead of doing the copy from the
> register the compiler reloads the value from memory. The result is that _old
> and ind_old end up having different values. _old in r1 with the bits set
> already and ind_old in r10 with the bits cleared. _old gets updated by CS
> and matches ind_old afterwards - both with the bits being 0. So the !=
> compare is false and the loop is left without having set any bits.
> 
> 
> Paolo (to),
> I am asking myself if it would be safer to add a barrier or something like
> this in the macros in include/qemu/atomic.h. 
>

I think fetching the initial value via a volatile pointer and using the
value loaded by cmpxchg for subsequent iterations is cleaner.

Regards,
Halil

> 
> > -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> > -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
> > -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> > +        expected = actual;
> > +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> > +    } while (actual != expected);
> > +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
> > +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
> >  
> > -    return ind_old;
> > +    return actual;
> >  }
> >  
> >  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
> > 
> 
> 
> 




* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  6:33     ` Cornelia Huck
  2020-06-16  6:45       ` Christian Borntraeger
@ 2020-06-17 23:56       ` Halil Pasic
  2020-06-19  7:33         ` David Hildenbrand
  1 sibling, 1 reply; 22+ messages in thread
From: Halil Pasic @ 2020-06-17 23:56 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Christian Borntraeger, qemu-s390x,
	Paolo Bonzini, Andreas Krebbel, Richard Henderson

On Tue, 16 Jun 2020 08:33:33 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> >  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
> >      typeof_strip_qual(*ptr) _old = (old);                               \
> >      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
> >                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
> >      _old;                                                               \
> >  })
> >  
> > ind_old is copied into _old in the macro. Instead of doing the copy from the
> > register the compiler reloads the value from memory. The result is that _old
> > and ind_old end up having different values. _old in r1 with the bits set
> > already and ind_old in r10 with the bits cleared. _old gets updated by CS
> > and matches ind_old afterwards - both with the bits being 0. So the !=
> > compare is false and the loop is left without having set any bits.
> > 
> > 
> > Paolo (to),
> > I am asking myself if it would be safer to add a barrier or something like
> > this in the macros in include/qemu/atomic.h.   
> 
> I'm also wondering whether this has been seen on other architectures as
> well? There are also some callers in non-s390x code, and dealing with
> this in common code would catch them as well.

Quite a bunch of users use something like old = atomic_read(..), where
atomic_read is documented as in docs/devel/atomics.rst:
- ``atomic_read()`` and ``atomic_set()``; these prevent the compiler from
  optimizing accesses out of existence and creating unsolicited
  accesses, but do not otherwise impose any ordering on loads and
  stores: both the compiler and the processor are free to reorder
  them.

Maybe I should have used that instead of volatile, but my problem was
that I didn't fully understand what atomic_read() does, and if it does
more than we need. I found the documentation just now.

Another bunch uses constants as old, which is also fine. And there is
a third bunch where I don't know what's up, partly because I did not
dig deep enough.

Regards,
Halil





* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  6:45       ` Christian Borntraeger
@ 2020-06-19  7:14         ` Cornelia Huck
  0 siblings, 0 replies; 22+ messages in thread
From: Cornelia Huck @ 2020-06-19  7:14 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Halil Pasic, qemu-s390x,
	Paolo Bonzini, Andreas Krebbel, Richard Henderson

On Tue, 16 Jun 2020 08:45:14 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 16.06.20 08:33, Cornelia Huck wrote:
> > On Tue, 16 Jun 2020 07:58:53 +0200
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >   
> >> On 16.06.20 06:50, Halil Pasic wrote:  
> >>> The atomic_cmpxchg() loop is broken because we occasionally end up with
> >>> old and _old having different values (a legit compiler can generate code
> >>> that accessed *ind_addr again to pick up a value for _old instead of
> >>> using the value of old that was already fetched according to the
> >>> rules of the abstract machine). This means the underlying CS instruction
> >>> may use a different old (_old) than the one we intended to use if
> >>> atomic_cmpxchg() performed the xchg part.
> >>>
> >>> Let us use volatile to force the rules of the abstract machine for
> >>> accesses to *ind_addr. Let us also rewrite the loop so that the
> >>> new old is used to compute the new desired value if the xchg part
> >>> is not performed.
> >>>
> >>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>> Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> >>> Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
> >>> ---
> >>>  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
> >>>  1 file changed, 10 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> >>> index c1f4bb1d33..3c988a000b 100644
> >>> --- a/hw/s390x/virtio-ccw.c
> >>> +++ b/hw/s390x/virtio-ccw.c
> >>> @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
> >>>  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >>>                                       uint8_t to_be_set)
> >>>  {
> >>> -    uint8_t ind_old, ind_new;
> >>> +    uint8_t expected, actual;
> >>>      hwaddr len = 1;
> >>> -    uint8_t *ind_addr;
> >>> +    /* avoid  multiple fetches */
> >>> +    uint8_t volatile *ind_addr;
> >>>  
> >>>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
> >>>      if (!ind_addr) {
> >>> @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >>>                       __func__, sch->cssid, sch->ssid, sch->schid);
> >>>          return -1;
> >>>      }
> >>> +    actual = *ind_addr;
> >>>      do {
> >>> -        ind_old = *ind_addr;    
> >>
> >> to make things easier to understand. Adding a barrier in here also fixes the issue.
> >> Reasoning follows below:
> >>  
> >>> -        ind_new = ind_old | to_be_set;    
> >>
> >> with an analysis from Andreas (cc)
> >>
> >>  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
> >>      typeof_strip_qual(*ptr) _old = (old);                               \
> >>      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
> >>                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
> >>      _old;                                                               \
> >>  })
> >>  
> >> ind_old is copied into _old in the macro. Instead of doing the copy from the
> >> register the compiler reloads the value from memory. The result is that _old
> >> and ind_old end up having different values. _old in r1 with the bits set
> >> already and ind_old in r10 with the bits cleared. _old gets updated by CS
> >> and matches ind_old afterwards - both with the bits being 0. So the !=
> >> compare is false and the loop is left without having set any bits.
> >>
> >>
> >> Paolo (to),
> >> I am asking myself if it would be safer to add a barrier or something like
> >> this in the macros in include/qemu/atomic.h.   
> 
> Having said this, I think that the refactoring from Halil (to re-use actual) 
> also makes sense independent of the fix. 

What about adding a barrier instead, as you suggested?

(Still wondering about other users of atomic_cmpxchg(), though.)
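[Editorial note: the expected/actual loop from the patch can be sketched outside QEMU with the same GCC/Clang __atomic builtins that atomic_cmpxchg() wraps. The names cmpxchg_u8 and set_bits_atomic below are illustrative, not QEMU's; this is a sketch, not the patch itself.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch of QEMU's atomic_cmpxchg() for a uint8_t: returns the value
 * the memory location held before the operation. */
static uint8_t cmpxchg_u8(volatile uint8_t *ptr, uint8_t old, uint8_t new)
{
    /* On failure the builtin stores the observed memory value into
     * 'old'; on success 'old' stays equal to the expected value. */
    __atomic_compare_exchange_n(ptr, &old, new, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return old;
}

/* The rewritten loop: one initial fetch, then reuse the value the
 * failed cmpxchg reports instead of reloading *ind_addr. */
static uint8_t set_bits_atomic(volatile uint8_t *ind_addr, uint8_t to_be_set)
{
    uint8_t expected, actual = *ind_addr;

    do {
        expected = actual;
        actual = cmpxchg_u8(ind_addr, expected, expected | to_be_set);
    } while (actual != expected);

    return actual; /* value before our bits were set */
}
```

The key point is that a failed compare-exchange already reports the freshly observed memory value, which is exactly what lets the loop avoid the second, compiler-generated fetch that caused the bug.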


* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-17 23:56       ` Halil Pasic
@ 2020-06-19  7:33         ` David Hildenbrand
  2020-06-19  8:17           ` Halil Pasic
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2020-06-19  7:33 UTC (permalink / raw)
  To: Halil Pasic, Cornelia Huck
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin, qemu-devel,
	Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Andreas Krebbel, Richard Henderson

On 18.06.20 01:56, Halil Pasic wrote:
> On Tue, 16 Jun 2020 08:33:33 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
>>>  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
>>>      typeof_strip_qual(*ptr) _old = (old);                               \
>>>      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
>>>                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
>>>      _old;                                                               \
>>>  })
>>>  
>>> ind_old is copied into _old in the macro. Instead of doing the copy from the
>>> register the compiler reloads the value from memory. The result is that _old
>>> and ind_old end up having different values. _old in r1 with the bits set
>>> already and ind_old in r10 with the bits cleared. _old gets updated by CS
>>> and matches ind_old afterwards - both with the bits being 0. So the !=
>>> compare is false and the loop is left without having set any bits.
>>>
>>>
>>> Paolo (to),
>>> I am asking myself if it would be safer to add a barrier or something like
>>> this in the macros in include/qemu/atomic.h.   
>>
>> I'm also wondering whether this has been seen on other architectures as
>> well? There are also some callers in non-s390x code, and dealing with
>> this in common code would catch them as well.
> 
> Quite a bunch of users use something like old = atomic_read(..), where
> atomic_read is documented as in docs/devel/atomics.rst:
> - ``atomic_read()`` and ``atomic_set()``; these prevent the compiler from
>   optimizing accesses out of existence and creating unsolicited
>   accesses, but do not otherwise impose any ordering on loads and
>   stores: both the compiler and the processor are free to reorder
>   them.
> 
> Maybe I should have used that instead of volatile, but my problem was
> that I didn't fully understand what atomic_read() does, and if it does
> more than we need. I found the documentation just now.

IIRC, atomic_read() is the right way of doing it, at least in the
kernel. I use such a loop in QEMU in

https://lkml.kernel.org/r/20200610115419.51688-2-david@redhat.com

But reading docs/devel/atomics.rst:"Comparison with Linux kernel
primitives" I do wonder if that is sufficient.

Any experts around?

-- 
Thanks,

David / dhildenb


* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-19  7:33         ` David Hildenbrand
@ 2020-06-19  8:17           ` Halil Pasic
  0 siblings, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2020-06-19  8:17 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin, Cornelia Huck,
	qemu-devel, Christian Borntraeger, qemu-s390x, Paolo Bonzini,
	Andreas Krebbel, Richard Henderson

On Fri, 19 Jun 2020 09:33:44 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 18.06.20 01:56, Halil Pasic wrote:
> > On Tue, 16 Jun 2020 08:33:33 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> >>>  #define atomic_cmpxchg__nocheck(ptr, old, new)    ({                    \
> >>>      typeof_strip_qual(*ptr) _old = (old);                               \
> >>>      (void)__atomic_compare_exchange_n(ptr, &_old, new, false,           \
> >>>                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
> >>>      _old;                                                               \
> >>>  })
> >>>  
> >>> ind_old is copied into _old in the macro. Instead of doing the copy from the
> >>> register the compiler reloads the value from memory. The result is that _old
> >>> and ind_old end up having different values. _old in r1 with the bits set
> >>> already and ind_old in r10 with the bits cleared. _old gets updated by CS
> >>> and matches ind_old afterwards - both with the bits being 0. So the !=
> >>> compare is false and the loop is left without having set any bits.
> >>>
> >>>
> >>> Paolo (to),
> >>> I am asking myself if it would be safer to add a barrier or something like
> >>> this in the macros in include/qemu/atomic.h.   
> >>
> >> I'm also wondering whether this has been seen on other architectures as
> >> well? There are also some callers in non-s390x code, and dealing with
> >> this in common code would catch them as well.
> > 
> > Quite a bunch of users use something like old = atomic_read(..), where
> > atomic_read is documented as in docs/devel/atomics.rst:
> > - ``atomic_read()`` and ``atomic_set()``; these prevent the compiler from
> >   optimizing accesses out of existence and creating unsolicited
> >   accesses, but do not otherwise impose any ordering on loads and
> >   stores: both the compiler and the processor are free to reorder
> >   them.
> > 
> > Maybe I should have used that instead of volatile, but my problem was
> > that I didn't fully understand what atomic_read() does, and if it does
> > more than we need. I found the documentation just now.
> 
> IIRC, atomic_read() is the right way of doing it, at least in the
> kernel. 

In the kernel I would use READ_ONCE.

https://elixir.bootlin.com/linux/v5.8-rc1/source/include/asm-generic/atomic.h#L171

In this case we are not manipulating an atomic variable. For uint8_t
that boils down to an access through a volatile pointer. And that is
what I did :).

> I use such a loop in QEMU in
> 
> https://lkml.kernel.org/r/20200610115419.51688-2-david@redhat.com
> 
> But reading docs/devel/atomics.rst:"Comparison with Linux kernel
> primitives" I do wonder if that is sufficient.
> 
> Any experts around?
> 

IMHO what we want here is READ_ONCE, i.e. volatile access, and not
necessarily atomic access. But I suppose atomic access implies volatile
access (the C11 standard refers to atomic_load as 
C atomic_load(volatile A *object)). Because QEMU seems to use atomic_read()
in such situations, and does not have READ_ONCE, for me atomic_read would
also do.

Regards,
Halil
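[Editorial note: the READ_ONCE idiom Halil refers to boils down, for a scalar like uint8_t, to an access through a volatile-qualified lvalue. The macro below is a simplified illustrative model, not the kernel's actual definition.]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the Linux READ_ONCE idiom for a uint8_t: the
 * volatile cast forces exactly one fetch at this point in the
 * abstract machine, so the compiler can neither fold nor repeat it. */
#define READ_ONCE_U8(x) (*(volatile const uint8_t *)&(x))

static uint8_t fetch_indicator(const uint8_t *ind_addr)
{
    /* One coherent snapshot of the indicator byte. */
    return READ_ONCE_U8(*ind_addr);
}
```

Qualifying the whole mapping pointer as volatile, as the patch does, achieves the same effect for every access made through that pointer.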


* Re: [PATCH 0/2] two atomic_cmpxchg() related fixes
  2020-06-16  4:50 [PATCH 0/2] two atomic_cmpxchg() related fixes Halil Pasic
  2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
  2020-06-16  4:50 ` [PATCH 2/2] s390x/pci: fix set_ind_atomic Halil Pasic
@ 2020-07-01 12:01 ` Cornelia Huck
  2020-07-01 12:06   ` Christian Borntraeger
  2020-07-02 11:18 ` Cornelia Huck
  3 siblings, 1 reply; 22+ messages in thread
From: Cornelia Huck @ 2020-07-01 12:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Christian Borntraeger, qemu-s390x,
	Richard Henderson

On Tue, 16 Jun 2020 06:50:33 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The story short: compiler can generate code that does two
> distinct fetches of *ind_addr for old and _old. If that happens we can
> not figure out if we had the desired xchg or not.
> 
> Halil Pasic (2):
>   virtio-ccw: fix virtio_set_ind_atomic
>   s390x/pci: fix set_ind_atomic
> 
>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
>  hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
>  2 files changed, 19 insertions(+), 15 deletions(-)
> 
> 
> base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb

Have we managed to reach any kind of agreement on this? (A v2?)

We can still get in fixes post-softfreeze, of course.


* Re: [PATCH 0/2] two atomic_cmpxchg() related fixes
  2020-07-01 12:01 ` [PATCH 0/2] two atomic_cmpxchg() related fixes Cornelia Huck
@ 2020-07-01 12:06   ` Christian Borntraeger
  2020-07-01 13:10     ` Cornelia Huck
  2020-07-03 13:37     ` Halil Pasic
  0 siblings, 2 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-07-01 12:06 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, qemu-s390x, Richard Henderson



On 01.07.20 14:01, Cornelia Huck wrote:
> On Tue, 16 Jun 2020 06:50:33 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> The story short: compiler can generate code that does two
>> distinct fetches of *ind_addr for old and _old. If that happens we can
>> not figure out if we had the desired xchg or not.
>>
>> Halil Pasic (2):
>>   virtio-ccw: fix virtio_set_ind_atomic
>>   s390x/pci: fix set_ind_atomic
>>
>>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
>>  hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
>>  2 files changed, 19 insertions(+), 15 deletions(-)
>>
>>
>> base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb
> 
> Have we managed to reach any kind of agreement on this? (A v2?)
> 
> We can still get in fixes post-softfreeze, of course.

Unless Halil has a v2 ready, 
I think the current patch is fine as is as a fix. I would suggest
to go with that and we can then do more beautification later when
necessary.


* Re: [PATCH 0/2] two atomic_cmpxchg() related fixes
  2020-07-01 12:06   ` Christian Borntraeger
@ 2020-07-01 13:10     ` Cornelia Huck
  2020-07-03 13:37     ` Halil Pasic
  1 sibling, 0 replies; 22+ messages in thread
From: Cornelia Huck @ 2020-07-01 13:10 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Halil Pasic, qemu-s390x,
	Richard Henderson

On Wed, 1 Jul 2020 14:06:11 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 01.07.20 14:01, Cornelia Huck wrote:
> > On Tue, 16 Jun 2020 06:50:33 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> >> The story short: compiler can generate code that does two
> >> distinct fetches of *ind_addr for old and _old. If that happens we can
> >> not figure out if we had the desired xchg or not.
> >>
> >> Halil Pasic (2):
> >>   virtio-ccw: fix virtio_set_ind_atomic
> >>   s390x/pci: fix set_ind_atomic
> >>
> >>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
> >>  hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
> >>  2 files changed, 19 insertions(+), 15 deletions(-)
> >>
> >>
> >> base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb  
> > 
> > Have we managed to reach any kind of agreement on this? (A v2?)
> > 
> > We can still get in fixes post-softfreeze, of course.  
> 
> Unless Halil has a v2 ready, 
> I think the current patch is fine as is as a fix. I would suggest
> to go with that and we can then do more beautification later when
> necessary.

Sure, no objection to the patches as they are now.

I would like to see some R-bs/A-bs, though :)


* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
  2020-06-16  5:58   ` Christian Borntraeger
@ 2020-07-01 13:13   ` Christian Borntraeger
  2020-07-04 18:34   ` Michael S. Tsirkin
  2 siblings, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-07-01 13:13 UTC (permalink / raw)
  To: Halil Pasic, Matthew Rosato, Cornelia Huck, Michael S. Tsirkin,
	qemu-s390x, qemu-devel
  Cc: Thomas Huth, David Hildenbrand, Richard Henderson



On 16.06.20 06:50, Halil Pasic wrote:
> The atomic_cmpxchg() loop is broken because we occasionally end up with
> old and _old having different values (a legit compiler can generate code
> that accessed *ind_addr again to pick up a value for _old instead of
> using the value of old that was already fetched according to the
> rules of the abstract machine). This means the underlying CS instruction
> may use a different old (_old) than the one we intended to use if
> atomic_cmpxchg() performed the xchg part.
> 
> Let us use volatile to force the rules of the abstract machine for
> accesses to *ind_addr. Let us also rewrite the loop so, we that the
> new old is used to compute the new desired value if the xchg part
> is not performed.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> index c1f4bb1d33..3c988a000b 100644
> --- a/hw/s390x/virtio-ccw.c
> +++ b/hw/s390x/virtio-ccw.c
> @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
>  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>                                       uint8_t to_be_set)
>  {
> -    uint8_t ind_old, ind_new;
> +    uint8_t expected, actual;
>      hwaddr len = 1;
> -    uint8_t *ind_addr;
> +    /* avoid  multiple fetches */
> +    uint8_t volatile *ind_addr;
>  
>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
>      if (!ind_addr) {
> @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>                       __func__, sch->cssid, sch->ssid, sch->schid);
>          return -1;
>      }
> +    actual = *ind_addr;
>      do {
> -        ind_old = *ind_addr;
> -        ind_new = ind_old | to_be_set;
> -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
> -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> +        expected = actual;
> +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> +    } while (actual != expected);
> +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
> +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
>  
> -    return ind_old;
> +    return actual;
>  }
>  
>  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
> 

* Re: [PATCH 2/2] s390x/pci: fix set_ind_atomic
  2020-06-16  4:50 ` [PATCH 2/2] s390x/pci: fix set_ind_atomic Halil Pasic
@ 2020-07-01 13:14   ` Christian Borntraeger
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-07-01 13:14 UTC (permalink / raw)
  To: Halil Pasic, Matthew Rosato, Cornelia Huck, Michael S. Tsirkin,
	qemu-s390x, qemu-devel
  Cc: Thomas Huth, David Hildenbrand, Richard Henderson



On 16.06.20 06:50, Halil Pasic wrote:
> The atomic_cmpxchg() loop is broken because we occasionally end up with
> old and _old having different values (a legit compiler can generate code
> that accessed *ind_addr again to pick up a value for _old instead of
> using the value of old that was already fetched according to the
> rules of the abstract machine). This means the underlying CS instruction
> may use a different old (_old) than the one we intended to use if
> atomic_cmpxchg() performed the xchg part.
> 
> Let us use volatile to force the rules of the abstract machine for
> accesses to *ind_addr. Let us also rewrite the loop so, we that the
> new old is used to compute the new desired value if the xchg part
> is not performed.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Fixes: 8cba80c3a0 ("s390: Add PCI bus support")
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>

> ---
>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index c4a4259f0c..80eb957a91 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -637,22 +637,24 @@ static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>  
>  static uint8_t set_ind_atomic(uint64_t ind_loc, uint8_t to_be_set)
>  {
> -    uint8_t ind_old, ind_new;
> +    uint8_t expected, actual;
>      hwaddr len = 1;
> -    uint8_t *ind_addr;
> +    /* avoid  multiple fetches */
> +    uint8_t volatile *ind_addr;
>  
>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
>      if (!ind_addr) {
>          s390_pci_generate_error_event(ERR_EVENT_AIRERR, 0, 0, 0, 0);
>          return -1;
>      }
> +    actual = *ind_addr;
>      do {
> -        ind_old = *ind_addr;
> -        ind_new = ind_old | to_be_set;
> -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> +        expected = actual;
> +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> +    } while (actual != expected);
> +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
>  
> -    return ind_old;
> +    return actual;
>  }
>  
>  static void s390_msi_ctrl_write(void *opaque, hwaddr addr, uint64_t data,
> 

* Re: [PATCH 0/2] two atomic_cmpxchg() related fixes
  2020-06-16  4:50 [PATCH 0/2] two atomic_cmpxchg() related fixes Halil Pasic
                   ` (2 preceding siblings ...)
  2020-07-01 12:01 ` [PATCH 0/2] two atomic_cmpxchg() related fixes Cornelia Huck
@ 2020-07-02 11:18 ` Cornelia Huck
  3 siblings, 0 replies; 22+ messages in thread
From: Cornelia Huck @ 2020-07-02 11:18 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Thomas Huth, Matthew Rosato, Michael S. Tsirkin,
	David Hildenbrand, qemu-devel, Christian Borntraeger, qemu-s390x,
	Richard Henderson

On Tue, 16 Jun 2020 06:50:33 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The story short: compiler can generate code that does two
> distinct fetches of *ind_addr for old and _old. If that happens we can
> not figure out if we had the desired xchg or not.
> 
> Halil Pasic (2):
>   virtio-ccw: fix virtio_set_ind_atomic
>   s390x/pci: fix set_ind_atomic
> 
>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
>  hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
>  2 files changed, 19 insertions(+), 15 deletions(-)
> 
> 
> base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb

Thanks, applied.


* Re: [PATCH 0/2] two atomic_cmpxchg() related fixes
  2020-07-01 12:06   ` Christian Borntraeger
  2020-07-01 13:10     ` Cornelia Huck
@ 2020-07-03 13:37     ` Halil Pasic
  2020-07-03 14:03       ` Cornelia Huck
  1 sibling, 1 reply; 22+ messages in thread
From: Halil Pasic @ 2020-07-03 13:37 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Thomas Huth, Matthew Rosato, David Hildenbrand, Cornelia Huck,
	qemu-devel, qemu-s390x, Michael S. Tsirkin, Richard Henderson

On Wed, 1 Jul 2020 14:06:11 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> 
> 
> On 01.07.20 14:01, Cornelia Huck wrote:
> > On Tue, 16 Jun 2020 06:50:33 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> > 
> >> The story short: compiler can generate code that does two
> >> distinct fetches of *ind_addr for old and _old. If that happens we can
> >> not figure out if we had the desired xchg or not.
> >>
> >> Halil Pasic (2):
> >>   virtio-ccw: fix virtio_set_ind_atomic
> >>   s390x/pci: fix set_ind_atomic
> >>
> >>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
> >>  hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
> >>  2 files changed, 19 insertions(+), 15 deletions(-)
> >>
> >>
> >> base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb
> > 
> > Have we managed to reach any kind of agreement on this? (A v2?)
> > 
> > We can still get in fixes post-softfreeze, of course.
> 
> Unless Halil has a v2 ready, 
> I think the current patch is fine as is as a fix. I would suggest
> to go with that and we can then do more beautification later when
> necessary.
> 
> 

I think we were waiting for Paolo to chime in regarding the barrier().
The thing I could beautify is using atomic_read(), but frankly I'm not
sure about it, especially considering multi-process. My understanding of
volatile is better than my understanding of atomic_read(). On the other
hand, same multi-process question can be asked about atomic_cmpxchg()
and __atomic_compare_exchange_n()...

Regards,
Halil
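[Editorial note: the barrier-based alternative discussed earlier in the thread could, hypothetically, look like the sketch below: the original loop shape is kept, and a compiler barrier pins the fetched value so the cmpxchg's expected operand cannot be rematerialized from memory. This is an illustration using GCC/Clang builtins, not a proposed QEMU patch.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compiler-only barrier, in the spirit of QEMU's/Linux's barrier(). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static uint8_t set_bits_with_barrier(uint8_t *ind_addr, uint8_t to_be_set)
{
    uint8_t ind_old, ind_new;

    do {
        ind_old = *ind_addr;
        /* The memory clobber means the compiler may no longer assume
         * *ind_addr still equals ind_old, so it cannot substitute a
         * reload of *ind_addr for the register copy below. */
        compiler_barrier();
        ind_new = ind_old | to_be_set;
    } while (!__atomic_compare_exchange_n(ind_addr, &ind_old, ind_new,
                                          false, __ATOMIC_SEQ_CST,
                                          __ATOMIC_SEQ_CST));

    return ind_old; /* value before our bits were set */
}
```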


* Re: [PATCH 0/2] two atomic_cmpxchg() related fixes
  2020-07-03 13:37     ` Halil Pasic
@ 2020-07-03 14:03       ` Cornelia Huck
  0 siblings, 0 replies; 22+ messages in thread
From: Cornelia Huck @ 2020-07-03 14:03 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Thomas Huth, Matthew Rosato, David Hildenbrand,
	Michael S. Tsirkin, qemu-devel, Christian Borntraeger,
	qemu-s390x, Richard Henderson

On Fri, 3 Jul 2020 15:37:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 1 Jul 2020 14:06:11 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
> > 
> > 
> > On 01.07.20 14:01, Cornelia Huck wrote:  
> > > On Tue, 16 Jun 2020 06:50:33 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > >> The story short: compiler can generate code that does two
> > >> distinct fetches of *ind_addr for old and _old. If that happens we can
> > >> not figure out if we had the desired xchg or not.
> > >>
> > >> Halil Pasic (2):
> > >>   virtio-ccw: fix virtio_set_ind_atomic
> > >>   s390x/pci: fix set_ind_atomic
> > >>
> > >>  hw/s390x/s390-pci-bus.c | 16 +++++++++-------
> > >>  hw/s390x/virtio-ccw.c   | 18 ++++++++++--------
> > >>  2 files changed, 19 insertions(+), 15 deletions(-)
> > >>
> > >>
> > >> base-commit: 7d3660e79830a069f1848bb4fa1cdf8f666424fb  
> > > 
> > > Have we managed to reach any kind of agreement on this? (A v2?)
> > > 
> > > We can still get in fixes post-softfreeze, of course.  
> > 
> > Unless Halil has a v2 ready, 
> > I think the current patch is fine as is as a fix. I would suggest
> > to go with that and we can then do more beautification later when
> > necessary.
> > 
> >   
> 
> I think we were waiting for Paolo to chime in regarding the barrier().
> The thing I could beautify is using atomic_read(), but frankly I'm not
> sure about it, especially considering multi-proccess. My understanding of
> volatile is better than my understanding of atomic_read(). On the other
> hand, same multi-process question can be asked about atomic_cmpxchg()
> and __atomic_compare_exchange_n()...

We can just improve the code on top later. Having something that
works now beats something not yet developed that might be nicer :)

(But yes, it also might be good to look at non-s390 examples of this
pattern and check if something can be done in a more central place.)


* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
  2020-06-16  5:58   ` Christian Borntraeger
  2020-07-01 13:13   ` Christian Borntraeger
@ 2020-07-04 18:34   ` Michael S. Tsirkin
  2020-07-06  5:44     ` Christian Borntraeger
  2020-07-06 11:19     ` Halil Pasic
  2 siblings, 2 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2020-07-04 18:34 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Thomas Huth, Matthew Rosato, David Hildenbrand, Cornelia Huck,
	qemu-devel, Christian Borntraeger, qemu-s390x, Richard Henderson

On Tue, Jun 16, 2020 at 06:50:34AM +0200, Halil Pasic wrote:
> The atomic_cmpxchg() loop is broken because we occasionally end up with
> old and _old having different values (a legit compiler can generate code
> that accessed *ind_addr again to pick up a value for _old instead of
> using the value of old that was already fetched according to the
> rules of the abstract machine). This means the underlying CS instruction
> may use a different old (_old) than the one we intended to use if
> atomic_cmpxchg() performed the xchg part.

And was this ever observed in the field? Or is this a theoretical issue?
commit log should probably say ...

> 
> Let us use volatile to force the rules of the abstract machine for
> accesses to *ind_addr. Let us also rewrite the loop so, we that the

we that -> we know that?

> new old is used to compute the new desired value if the xchg part
> is not performed.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
> ---
>  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> index c1f4bb1d33..3c988a000b 100644
> --- a/hw/s390x/virtio-ccw.c
> +++ b/hw/s390x/virtio-ccw.c
> @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
>  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>                                       uint8_t to_be_set)
>  {
> -    uint8_t ind_old, ind_new;
> +    uint8_t expected, actual;
>      hwaddr len = 1;
> -    uint8_t *ind_addr;
> +    /* avoid  multiple fetches */
> +    uint8_t volatile *ind_addr;
>  
>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
>      if (!ind_addr) {
> @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
>                       __func__, sch->cssid, sch->ssid, sch->schid);
>          return -1;
>      }
> +    actual = *ind_addr;
>      do {
> -        ind_old = *ind_addr;
> -        ind_new = ind_old | to_be_set;
> -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
> -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> +        expected = actual;
> +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> +    } while (actual != expected);
> +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
> +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
>  
> -    return ind_old;
> +    return actual;
>  }

I wonder whether cpuXX APIs should accept volatile pointers, too:
casting away volatile is always suspicious.
But that is a separate issue ...

>  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
> -- 
> 2.17.1


* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-07-04 18:34   ` Michael S. Tsirkin
@ 2020-07-06  5:44     ` Christian Borntraeger
  2020-07-06 11:19     ` Halil Pasic
  1 sibling, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2020-07-06  5:44 UTC (permalink / raw)
  To: Michael S. Tsirkin, Halil Pasic
  Cc: Thomas Huth, Matthew Rosato, David Hildenbrand, Cornelia Huck,
	qemu-devel, qemu-s390x, Richard Henderson



On 04.07.20 20:34, Michael S. Tsirkin wrote:
> On Tue, Jun 16, 2020 at 06:50:34AM +0200, Halil Pasic wrote:
>> The atomic_cmpxchg() loop is broken because we occasionally end up with
>> old and _old having different values (a legit compiler can generate code
>> that accessed *ind_addr again to pick up a value for _old instead of
>> using the value of old that was already fetched according to the
>> rules of the abstract machine). This means the underlying CS instruction
>> may use a different old (_old) than the one we intended to use if
>> atomic_cmpxchg() performed the xchg part.
> 
> And was this ever observed in the field? Or is this a theoretical issue?
> commit log should probably say ...

It was observed in the field when the XML specified qemu instead of vhost.


* Re: [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic
  2020-07-04 18:34   ` Michael S. Tsirkin
  2020-07-06  5:44     ` Christian Borntraeger
@ 2020-07-06 11:19     ` Halil Pasic
  1 sibling, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2020-07-06 11:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Thomas Huth, Matthew Rosato, David Hildenbrand, Cornelia Huck,
	qemu-devel, Christian Borntraeger, qemu-s390x, Richard Henderson

On Sat, 4 Jul 2020 14:34:04 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Jun 16, 2020 at 06:50:34AM +0200, Halil Pasic wrote:
> > The atomic_cmpxchg() loop is broken because we occasionally end up with
> > old and _old having different values (a legit compiler can generate code
> > that accessed *ind_addr again to pick up a value for _old instead of
> > using the value of old that was already fetched according to the
> > rules of the abstract machine). This means the underlying CS instruction
> > may use a different old (_old) than the one we intended to use if
> > atomic_cmpxchg() performed the xchg part.
> 
> And was this ever observed in the field? Or is this a theoretical issue?
> commit log should probably say ...
> 

It was observed in the field (Christian already answered). I think the
message already implies this, because the only conditional phrasing is about
the compiler behavior.

> > 
> > Let us use volatile to force the rules of the abstract machine for
> > accesses to *ind_addr. Let us also rewrite the loop so, we that the
> 
> we that -> we know that?

s/we//

It would be nice to fix this before the patch gets merged.

> 
> > new old is used to compute the new desired value if the xchg part
> > is not performed.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Reported-by: Andre Wild <Andre.Wild1@ibm.com>
> > Fixes: 7e7494627f ("s390x/virtio-ccw: Adapter interrupt support.")
> > ---
> >  hw/s390x/virtio-ccw.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> > index c1f4bb1d33..3c988a000b 100644
> > --- a/hw/s390x/virtio-ccw.c
> > +++ b/hw/s390x/virtio-ccw.c
> > @@ -786,9 +786,10 @@ static inline VirtioCcwDevice *to_virtio_ccw_dev_fast(DeviceState *d)
> >  static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >                                       uint8_t to_be_set)
> >  {
> > -    uint8_t ind_old, ind_new;
> > +    uint8_t expected, actual;
> >      hwaddr len = 1;
> > -    uint8_t *ind_addr;
> > +    /* avoid  multiple fetches */
> > +    uint8_t volatile *ind_addr;
> >  
> >      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
> >      if (!ind_addr) {
> > @@ -796,14 +797,15 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, uint64_t ind_loc,
> >                       __func__, sch->cssid, sch->ssid, sch->schid);
> >          return -1;
> >      }
> > +    actual = *ind_addr;
> >      do {
> > -        ind_old = *ind_addr;
> > -        ind_new = ind_old | to_be_set;
> > -    } while (atomic_cmpxchg(ind_addr, ind_old, ind_new) != ind_old);
> > -    trace_virtio_ccw_set_ind(ind_loc, ind_old, ind_new);
> > -    cpu_physical_memory_unmap(ind_addr, len, 1, len);
> > +        expected = actual;
> > +        actual = atomic_cmpxchg(ind_addr, expected, expected | to_be_set);
> > +    } while (actual != expected);
> > +    trace_virtio_ccw_set_ind(ind_loc, actual, actual | to_be_set);
> > +    cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
> >  
> > -    return ind_old;
> > +    return actual;
> >  }
> 
> I wonder whether cpuXX APIs should accept volatile pointers, too:
> casting away volatile is always suspicious.
> But that is a separate issue ...
> 

Nod.

Thanks for having a look!

> >  static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
> > -- 
> > 2.17.1
> 




Thread overview: 22+ messages
2020-06-16  4:50 [PATCH 0/2] two atomic_cmpxchg() related fixes Halil Pasic
2020-06-16  4:50 ` [PATCH 1/2] virtio-ccw: fix virtio_set_ind_atomic Halil Pasic
2020-06-16  5:58   ` Christian Borntraeger
2020-06-16  6:33     ` Cornelia Huck
2020-06-16  6:45       ` Christian Borntraeger
2020-06-19  7:14         ` Cornelia Huck
2020-06-17 23:56       ` Halil Pasic
2020-06-19  7:33         ` David Hildenbrand
2020-06-19  8:17           ` Halil Pasic
2020-06-16  9:31     ` Halil Pasic
2020-07-01 13:13   ` Christian Borntraeger
2020-07-04 18:34   ` Michael S. Tsirkin
2020-07-06  5:44     ` Christian Borntraeger
2020-07-06 11:19     ` Halil Pasic
2020-06-16  4:50 ` [PATCH 2/2] s390x/pci: fix set_ind_atomic Halil Pasic
2020-07-01 13:14   ` Christian Borntraeger
2020-07-01 12:01 ` [PATCH 0/2] two atomic_cmpxchg() related fixes Cornelia Huck
2020-07-01 12:06   ` Christian Borntraeger
2020-07-01 13:10     ` Cornelia Huck
2020-07-03 13:37     ` Halil Pasic
2020-07-03 14:03       ` Cornelia Huck
2020-07-02 11:18 ` Cornelia Huck
