* [PATCH 1/1] virtio/s390: fix race on airq_areas[]
@ 2019-07-23 22:58 Halil Pasic
2019-07-24 6:44 ` Christian Borntraeger
0 siblings, 1 reply; 6+ messages in thread
From: Halil Pasic @ 2019-07-23 22:58 UTC (permalink / raw)
To: Cornelia Huck, kvm, linux-s390
Cc: Halil Pasic, Christian Borntraeger, Janosch Frank,
Marc Hartmayer, virtualization
The access to airq_areas was racy ever since the adapter interrupts got
introduced to virtio-ccw, but since commit 39c7dcb15892 ("virtio/s390:
make airq summary indicators DMA") this became an issue in practice as
well. Namely, before that commit the airq_info that got overwritten was
still functional. After that commit, however, the two infos share a
summary_indicator, which aggravates the situation. As a result, the
auto-online mechanism occasionally hangs the boot with virtio_blk.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
---
* We definitely need this fixed for 5.3. For older stable kernels it is
to be discussed. @Connie what do you think: do we need a cc stable?
* I have a variant that does not need the extra mutex but uses cmpxchg().
Decided to post this one because that one is more complex. But if there
is interest we can have a look at it as well.
---
drivers/s390/virtio/virtio_ccw.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 1a55e5942d36..d97742662755 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -145,6 +145,8 @@ struct airq_info {
 	struct airq_iv *aiv;
 };
 static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
+DEFINE_MUTEX(airq_areas_lock);
+
 static u8 *summary_indicators;
 
 static inline u8 *get_summary_indicator(struct airq_info *info)
@@ -265,9 +267,11 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
 	unsigned long bit, flags;
 
 	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
+		mutex_lock(&airq_areas_lock);
 		if (!airq_areas[i])
 			airq_areas[i] = new_airq_info(i);
 		info = airq_areas[i];
+		mutex_unlock(&airq_areas_lock);
 		if (!info)
 			return 0;
 		write_lock_irqsave(&info->lock, flags);
--
2.17.1
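
For readers following along outside the kernel tree, the locking pattern
in the hunk above can be sketched in userspace. This is only an
illustration, assuming POSIX threads in place of the kernel mutex API;
areas[], get_area(), run_probes() and the two constants are hypothetical
names and are not part of virtio_ccw.c. The check for an empty slot and
the store into it happen under one lock, so two concurrent callers can
no longer both see NULL and overwrite each other's allocation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_AREAS 4	/* hypothetical stand-in for MAX_AIRQ_AREAS */
#define NTHREADS  8	/* number of concurrent "probes" in this sketch */

struct area { int idx; };	/* hypothetical stand-in for struct airq_info */

static struct area *areas[MAX_AREAS];
static pthread_mutex_t areas_lock = PTHREAD_MUTEX_INITIALIZER;
static int allocations;	/* only modified with areas_lock held */

/* Return the area for slot i, allocating it at most once. */
static struct area *get_area(int i)
{
	struct area *a;

	assert(i >= 0 && i < MAX_AREAS);
	pthread_mutex_lock(&areas_lock);
	if (!areas[i]) {
		/* Check and store are atomic with respect to other callers. */
		areas[i] = malloc(sizeof(*areas[i]));
		if (areas[i]) {
			areas[i]->idx = i;
			allocations++;
		}
	}
	a = areas[i];
	pthread_mutex_unlock(&areas_lock);
	return a;
}

/* Each worker touches every slot, like several devices probing at once. */
static void *worker(void *unused)
{
	(void)unused;
	for (int i = 0; i < MAX_AREAS; i++)
		get_area(i);
	return NULL;
}

/* Start the workers, wait for them, and report how many slots were allocated. */
static int run_probes(void)
{
	pthread_t t[NTHREADS];
	int started = 0;

	for (int n = 0; n < NTHREADS; n++) {
		if (pthread_create(&t[started], NULL, worker, NULL) == 0)
			started++;
	}
	for (int n = 0; n < started; n++)
		pthread_join(t[n], NULL);
	return allocations;
}
```

Calling run_probes() from a small main() should report one allocation
per slot no matter how the threads interleave. Without the lock, two
threads could both observe an empty slot and the later store would
replace the earlier one, which is the kind of overwrite the commit
message describes.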
* Re: [PATCH 1/1] virtio/s390: fix race on airq_areas[]
2019-07-23 22:58 [PATCH 1/1] virtio/s390: fix race on airq_areas[] Halil Pasic
@ 2019-07-24 6:44 ` Christian Borntraeger
2019-07-24 8:34 ` Cornelia Huck
2019-07-24 11:01 ` Halil Pasic
0 siblings, 2 replies; 6+ messages in thread
From: Christian Borntraeger @ 2019-07-24 6:44 UTC (permalink / raw)
To: Halil Pasic, Cornelia Huck, kvm, linux-s390
Cc: Janosch Frank, Marc Hartmayer, virtualization
On 24.07.19 00:58, Halil Pasic wrote:
> The access to airq_areas was racy ever since the adapter interrupts got
> introduced to virtio-ccw, but since commit 39c7dcb15892 ("virtio/s390:
> make airq summary indicators DMA") this became an issue in practice as
> well. Namely before that commit the airq_info that got overwritten was
> still functional. After that commit however the two infos share a
> summary_indicator, which aggravates the situation. Which means
> auto-online mechanism occasionally hangs the boot with virtio_blk.
>
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
> ---
> * We need definitely this fixed for 5.3. For older stable kernels it is
> to be discussed. @Connie what do you think: do we need a cc stable?
Unless you can prove that the problem could never happen on old versions,
we absolutely do need cc stable.
>
> * I have a variant that does not need the extra mutex but uses cmpxchg().
> Decided to post this one because that one is more complex. But if there
> is interest we can have a look at it as well.
This is a slow path (startup) and never called in the hot path, correct? A mutex
should be fine.
> ---
> drivers/s390/virtio/virtio_ccw.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 1a55e5942d36..d97742662755 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -145,6 +145,8 @@ struct airq_info {
> struct airq_iv *aiv;
> };
> static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
> +DEFINE_MUTEX(airq_areas_lock);
> +
> static u8 *summary_indicators;
>
> static inline u8 *get_summary_indicator(struct airq_info *info)
> @@ -265,9 +267,11 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
> unsigned long bit, flags;
>
> for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> + mutex_lock(&airq_areas_lock);
> if (!airq_areas[i])
> airq_areas[i] = new_airq_info(i);
> info = airq_areas[i];
> + mutex_unlock(&airq_areas_lock);
> if (!info)
> return 0;
> write_lock_irqsave(&info->lock, flags);
>
* Re: [PATCH 1/1] virtio/s390: fix race on airq_areas[]
2019-07-24 6:44 ` Christian Borntraeger
@ 2019-07-24 8:34 ` Cornelia Huck
2019-07-24 8:39 ` Christian Borntraeger
2019-07-24 11:01 ` Halil Pasic
1 sibling, 1 reply; 6+ messages in thread
From: Cornelia Huck @ 2019-07-24 8:34 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Halil Pasic, kvm, linux-s390, Janosch Frank, Marc Hartmayer,
virtualization
On Wed, 24 Jul 2019 08:44:19 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> On 24.07.19 00:58, Halil Pasic wrote:
> > The access to airq_areas was racy ever since the adapter interrupts got
> > introduced to virtio-ccw, but since commit 39c7dcb15892 ("virtio/s390:
> > make airq summary indicators DMA") this became an issue in practice as
> > well. Namely before that commit the airq_info that got overwritten was
> > still functional. After that commit however the two infos share a
> > summary_indicator, which aggravates the situation. Which means
> > auto-online mechanism occasionally hangs the boot with virtio_blk.
> >
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> > Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
> > ---
> > * We need definitely this fixed for 5.3. For older stable kernels it is
> > to be discussed. @Connie what do you think: do we need a cc stable?
>
> Unless you can prove that the problem could never happen on old version
> we absolutely do need cc stable.
Yes, this needs to be cc:stable.
>
> >
> > * I have a variant that does not need the extra mutex but uses cmpxchg().
> > Decided to post this one because that one is more complex. But if there
> > is interest we can have a look at it as well.
>
> This is slow path (startup) and never called in hot path. Correct? Mutex should be
> fine.
Yes, this is ultimately called through the ->probe functions of virtio
drivers.
> > ---
> > drivers/s390/virtio/virtio_ccw.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 1a55e5942d36..d97742662755 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -145,6 +145,8 @@ struct airq_info {
> > struct airq_iv *aiv;
> > };
> > static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
> > +DEFINE_MUTEX(airq_areas_lock);
> > +
> > static u8 *summary_indicators;
> >
> > static inline u8 *get_summary_indicator(struct airq_info *info)
> > @@ -265,9 +267,11 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
> > unsigned long bit, flags;
> >
> > for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> > + mutex_lock(&airq_areas_lock);
> > if (!airq_areas[i])
> > airq_areas[i] = new_airq_info(i);
> > info = airq_areas[i];
> > + mutex_unlock(&airq_areas_lock);
> > if (!info)
> > return 0;
> > write_lock_irqsave(&info->lock, flags);
> >
>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Should I pick this and send a pull request, or is it quicker to just
take this directly?
* Re: [PATCH 1/1] virtio/s390: fix race on airq_areas[]
2019-07-24 8:34 ` Cornelia Huck
@ 2019-07-24 8:39 ` Christian Borntraeger
2019-07-24 11:17 ` Halil Pasic
0 siblings, 1 reply; 6+ messages in thread
From: Christian Borntraeger @ 2019-07-24 8:39 UTC (permalink / raw)
To: Cornelia Huck
Cc: Halil Pasic, kvm, linux-s390, Janosch Frank, Marc Hartmayer,
virtualization
On 24.07.19 10:34, Cornelia Huck wrote:
> On Wed, 24 Jul 2019 08:44:19 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>> On 24.07.19 00:58, Halil Pasic wrote:
>>> The access to airq_areas was racy ever since the adapter interrupts got
>>> introduced to virtio-ccw, but since commit 39c7dcb15892 ("virtio/s390:
>>> make airq summary indicators DMA") this became an issue in practice as
>>> well. Namely before that commit the airq_info that got overwritten was
>>> still functional. After that commit however the two infos share a
>>> summary_indicator, which aggravates the situation. Which means
>>> auto-online mechanism occasionally hangs the boot with virtio_blk.
>>>
>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>>> Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
>>> ---
>>> * We need definitely this fixed for 5.3. For older stable kernels it is
>>> to be discussed. @Connie what do you think: do we need a cc stable?
>>
>> Unless you can prove that the problem could never happen on old version
>> we absolutely do need cc stable.
>
> Yes, this needs to be cc:stable.
>
>>
>>>
>>> * I have a variant that does not need the extra mutex but uses cmpxchg().
>>> Decided to post this one because that one is more complex. But if there
>>> is interest we can have a look at it as well.
>>
>> This is slow path (startup) and never called in hot path. Correct? Mutex should be
>> fine.
>
> Yes, this is ultimately called through the ->probe functions of virtio
> drivers.
>
>>> ---
>>> drivers/s390/virtio/virtio_ccw.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
>>> index 1a55e5942d36..d97742662755 100644
>>> --- a/drivers/s390/virtio/virtio_ccw.c
>>> +++ b/drivers/s390/virtio/virtio_ccw.c
>>> @@ -145,6 +145,8 @@ struct airq_info {
>>> struct airq_iv *aiv;
>>> };
>>> static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
>>> +DEFINE_MUTEX(airq_areas_lock);
>>> +
>>> static u8 *summary_indicators;
>>>
>>> static inline u8 *get_summary_indicator(struct airq_info *info)
>>> @@ -265,9 +267,11 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
>>> unsigned long bit, flags;
>>>
>>> for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
>>> + mutex_lock(&airq_areas_lock);
>>> if (!airq_areas[i])
>>> airq_areas[i] = new_airq_info(i);
>>> info = airq_areas[i];
>>> + mutex_unlock(&airq_areas_lock);
>>> if (!info)
>>> return 0;
>>> write_lock_irqsave(&info->lock, flags);
>>>
>>
>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>
> Should I pick this and send a pull request, or is it quicker to just
> take this directly?
I think we can do this via a fast path. Halil, can you push to the s390 tree?
* Re: [PATCH 1/1] virtio/s390: fix race on airq_areas[]
2019-07-24 6:44 ` Christian Borntraeger
2019-07-24 8:34 ` Cornelia Huck
@ 2019-07-24 11:01 ` Halil Pasic
1 sibling, 0 replies; 6+ messages in thread
From: Halil Pasic @ 2019-07-24 11:01 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Cornelia Huck, kvm, linux-s390, Janosch Frank, Marc Hartmayer,
virtualization, stable
On Wed, 24 Jul 2019 08:44:19 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>
> On 24.07.19 00:58, Halil Pasic wrote:
> > The access to airq_areas was racy ever since the adapter interrupts got
> > introduced to virtio-ccw, but since commit 39c7dcb15892 ("virtio/s390:
> > make airq summary indicators DMA") this became an issue in practice as
> > well. Namely before that commit the airq_info that got overwritten was
> > still functional. After that commit however the two infos share a
> > summary_indicator, which aggravates the situation. Which means
> > auto-online mechanism occasionally hangs the boot with virtio_blk.
> >
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> > Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
> > ---
> > * We need definitely this fixed for 5.3. For older stable kernels it is
> > to be discussed. @Connie what do you think: do we need a cc stable?
>
> Unless you can prove that the problem could never happen on old version
> we absolutely do need cc stable.
No, I would not like to attempt proving that. I prefer race-free code
anyway. CC-ing stable.
>
> >
> > * I have a variant that does not need the extra mutex but uses cmpxchg().
> > Decided to post this one because that one is more complex. But if there
> > is interest we can have a look at it as well.
>
> This is slow path (startup) and never called in hot path. Correct? Mutex should be
> fine.
Right, this is only relevant during device initialization, which is an
infrequent operation.
Thanks,
Halil
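
The cmpxchg() variant mentioned in the patch notes was not posted in
this thread. As a rough userspace sketch of the general idea only,
assuming C11 atomics in place of the kernel's cmpxchg(), a lockless
version could allocate first and then try to publish the pointer,
dropping its own allocation if another caller got there first. The
names areas[] and get_area_lockless() are hypothetical and this is not
the variant referred to above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_AREAS 4	/* hypothetical stand-in for MAX_AIRQ_AREAS */

struct area { int idx; };	/* hypothetical stand-in for struct airq_info */

/* Static storage, so every slot starts out as a null pointer. */
static _Atomic(struct area *) areas[MAX_AREAS];

/* Return the area for slot i, publishing a new one only if the slot is empty. */
static struct area *get_area_lockless(int i)
{
	struct area *cur, *fresh, *expected = NULL;

	assert(i >= 0 && i < MAX_AREAS);

	/* Fast path: somebody already published an area for this slot. */
	cur = atomic_load(&areas[i]);
	if (cur)
		return cur;

	fresh = malloc(sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->idx = i;

	/* Publish only if the slot is still NULL. */
	if (atomic_compare_exchange_strong(&areas[i], &expected, fresh))
		return fresh;

	/* Lost the race: expected now holds the winner's pointer. */
	free(fresh);
	return expected;
}
```

The extra allocate-then-maybe-free step is where the additional
complexity comes from. For a path that only runs during device
initialization, the mutex version avoids that at no meaningful cost.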
> > ---
> > drivers/s390/virtio/virtio_ccw.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 1a55e5942d36..d97742662755 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -145,6 +145,8 @@ struct airq_info {
> > struct airq_iv *aiv;
> > };
> > static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
> > +DEFINE_MUTEX(airq_areas_lock);
> > +
> > static u8 *summary_indicators;
> >
> > static inline u8 *get_summary_indicator(struct airq_info *info)
> > @@ -265,9 +267,11 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
> > unsigned long bit, flags;
> >
> > for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> > + mutex_lock(&airq_areas_lock);
> > if (!airq_areas[i])
> > airq_areas[i] = new_airq_info(i);
> > info = airq_areas[i];
> > + mutex_unlock(&airq_areas_lock);
> > if (!info)
> > return 0;
> > write_lock_irqsave(&info->lock, flags);
> >
>
* Re: [PATCH 1/1] virtio/s390: fix race on airq_areas[]
2019-07-24 8:39 ` Christian Borntraeger
@ 2019-07-24 11:17 ` Halil Pasic
0 siblings, 0 replies; 6+ messages in thread
From: Halil Pasic @ 2019-07-24 11:17 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Cornelia Huck, kvm, linux-s390, Janosch Frank, Marc Hartmayer,
virtualization
On Wed, 24 Jul 2019 10:39:13 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>
> On 24.07.19 10:34, Cornelia Huck wrote:
> > On Wed, 24 Jul 2019 08:44:19 +0200
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >
> >> On 24.07.19 00:58, Halil Pasic wrote:
> >>> The access to airq_areas was racy ever since the adapter interrupts got
> >>> introduced to virtio-ccw, but since commit 39c7dcb15892 ("virtio/s390:
> >>> make airq summary indicators DMA") this became an issue in practice as
> >>> well. Namely before that commit the airq_info that got overwritten was
> >>> still functional. After that commit however the two infos share a
> >>> summary_indicator, which aggravates the situation. Which means
> >>> auto-online mechanism occasionally hangs the boot with virtio_blk.
> >>>
> >>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> >>> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> >>> Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
> >>> ---
> >>> * We need definitely this fixed for 5.3. For older stable kernels it is
> >>> to be discussed. @Connie what do you think: do we need a cc stable?
> >>
> >> Unless you can prove that the problem could never happen on old version
> >> we absolutely do need cc stable.
> >
> > Yes, this needs to be cc:stable.
> >
> >>
> >>>
> >>> * I have a variant that does not need the extra mutex but uses cmpxchg().
> >>> Decided to post this one because that one is more complex. But if there
> >>> is interest we can have a look at it as well.
> >>
> >> This is slow path (startup) and never called in hot path. Correct? Mutex should be
> >> fine.
> >
> > Yes, this is ultimately called through the ->probe functions of virtio
> > drivers.
> >
> >>> ---
> >>> drivers/s390/virtio/virtio_ccw.c | 4 ++++
> >>> 1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> >>> index 1a55e5942d36..d97742662755 100644
> >>> --- a/drivers/s390/virtio/virtio_ccw.c
> >>> +++ b/drivers/s390/virtio/virtio_ccw.c
> >>> @@ -145,6 +145,8 @@ struct airq_info {
> >>> struct airq_iv *aiv;
> >>> };
> >>> static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
> >>> +DEFINE_MUTEX(airq_areas_lock);
> >>> +
> >>> static u8 *summary_indicators;
> >>>
> >>> static inline u8 *get_summary_indicator(struct airq_info *info)
> >>> @@ -265,9 +267,11 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
> >>> unsigned long bit, flags;
> >>>
> >>> for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
> >>> + mutex_lock(&airq_areas_lock);
> >>> if (!airq_areas[i])
> >>> airq_areas[i] = new_airq_info(i);
> >>> info = airq_areas[i];
> >>> + mutex_unlock(&airq_areas_lock);
> >>> if (!info)
> >>> return 0;
> >>> write_lock_irqsave(&info->lock, flags);
> >>>
> >>
> >
> > Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> >
> > Should I pick this and send a pull request, or is it quicker to just
> > take this directly?
>
> I think we can do this via a fast path. Halil, can you push to the s390 tree?
Sure!
Regards,
Halil
>