linux-kernel.vger.kernel.org archive mirror
* [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport
@ 2018-07-17 11:03 jiangyiwen
  2018-07-17 11:35 ` piaojun
  2018-07-17 11:42 ` Dominique Martinet
  0 siblings, 2 replies; 6+ messages in thread
From: jiangyiwen @ 2018-07-17 11:03 UTC (permalink / raw)
  To: Andrew Morton, Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov
  Cc: Linux Kernel Mailing List, v9fs-developer, Dominique Martinet

When the client has multiple threads issuing I/O requests
all the time, and the server performs very well, the CPU
may stay in IRQ context for a long time because the
*while* loop keeps finding bufs in the virtqueue.

So we should hold chan->lock across the whole loop.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
---
 net/9p/trans_virtio.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 05006cb..e5fea8b 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -148,20 +148,15 @@ static void req_done(struct virtqueue *vq)

 	p9_debug(P9_DEBUG_TRANS, ": request done\n");

-	while (1) {
-		spin_lock_irqsave(&chan->lock, flags);
-		req = virtqueue_get_buf(chan->vq, &len);
-		if (req == NULL) {
-			spin_unlock_irqrestore(&chan->lock, flags);
-			break;
-		}
-		chan->ring_bufs_avail = 1;
-		spin_unlock_irqrestore(&chan->lock, flags);
-		/* Wakeup if anyone waiting for VirtIO ring space. */
-		wake_up(chan->vc_wq);
+	spin_lock_irqsave(&chan->lock, flags);
+	while ((req = virtqueue_get_buf(chan->vq, &len)) != NULL) {
 		if (len)
 			p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
 	}
+	chan->ring_bufs_avail = 1;
+	spin_unlock_irqrestore(&chan->lock, flags);
+	/* Wakeup if anyone waiting for VirtIO ring space. */
+	wake_up(chan->vc_wq);
 }

 /**
-- 
1.8.3.1



* Re: [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport
  2018-07-17 11:03 [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport jiangyiwen
@ 2018-07-17 11:35 ` piaojun
  2018-07-17 11:42 ` Dominique Martinet
  1 sibling, 0 replies; 6+ messages in thread
From: piaojun @ 2018-07-17 11:35 UTC (permalink / raw)
  To: jiangyiwen, Andrew Morton, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov
  Cc: v9fs-developer, Linux Kernel Mailing List

LGTM

On 2018/7/17 19:03, jiangyiwen wrote:
> When the client has multiple threads issuing I/O requests
> all the time, and the server performs very well, the CPU
> may stay in IRQ context for a long time because the
> *while* loop keeps finding bufs in the virtqueue.
> 
> So we should hold chan->lock across the whole loop.
> 
> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
> ---
>  net/9p/trans_virtio.c | 17 ++++++-----------
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> index 05006cb..e5fea8b 100644
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -148,20 +148,15 @@ static void req_done(struct virtqueue *vq)
> 
>  	p9_debug(P9_DEBUG_TRANS, ": request done\n");
> 
> -	while (1) {
> -		spin_lock_irqsave(&chan->lock, flags);
> -		req = virtqueue_get_buf(chan->vq, &len);
> -		if (req == NULL) {
> -			spin_unlock_irqrestore(&chan->lock, flags);
> -			break;
> -		}
> -		chan->ring_bufs_avail = 1;
> -		spin_unlock_irqrestore(&chan->lock, flags);
> -		/* Wakeup if anyone waiting for VirtIO ring space. */
> -		wake_up(chan->vc_wq);
> +	spin_lock_irqsave(&chan->lock, flags);
> +	while ((req = virtqueue_get_buf(chan->vq, &len)) != NULL) {
>  		if (len)
>  			p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
>  	}
> +	chan->ring_bufs_avail = 1;
> +	spin_unlock_irqrestore(&chan->lock, flags);
> +	/* Wakeup if anyone waiting for VirtIO ring space. */
> +	wake_up(chan->vc_wq);
>  }
> 
>  /**
> 


* Re: [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport
  2018-07-17 11:03 [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport jiangyiwen
  2018-07-17 11:35 ` piaojun
@ 2018-07-17 11:42 ` Dominique Martinet
  2018-07-17 12:27   ` jiangyiwen
  1 sibling, 1 reply; 6+ messages in thread
From: Dominique Martinet @ 2018-07-17 11:42 UTC (permalink / raw)
  To: jiangyiwen
  Cc: Andrew Morton, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, Linux Kernel Mailing List, v9fs-developer


> Subject: net/9p: Fix a deadlock case in the virtio transport

I hadn't noticed in the v1, but how is that a deadlock fix?
The previous code doesn't look like it deadlocks to me; the commit
message is more accurate.

jiangyiwen wrote on Tue, Jul 17, 2018:
> When the client has multiple threads issuing I/O requests
> all the time, and the server performs very well, the CPU
> may stay in IRQ context for a long time because the
> *while* loop keeps finding bufs in the virtqueue.
> 
> So we should hold chan->lock across the whole loop.
> 
> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
> ---
>  net/9p/trans_virtio.c | 17 ++++++-----------
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> index 05006cb..e5fea8b 100644
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -148,20 +148,15 @@ static void req_done(struct virtqueue *vq)
> 
>  	p9_debug(P9_DEBUG_TRANS, ": request done\n");
> 
> -	while (1) {
> -		spin_lock_irqsave(&chan->lock, flags);
> -		req = virtqueue_get_buf(chan->vq, &len);
> -		if (req == NULL) {
> -			spin_unlock_irqrestore(&chan->lock, flags);
> -			break;
> -		}
> -		chan->ring_bufs_avail = 1;
> -		spin_unlock_irqrestore(&chan->lock, flags);
> -		/* Wakeup if anyone waiting for VirtIO ring space. */
> -		wake_up(chan->vc_wq);
> +	spin_lock_irqsave(&chan->lock, flags);
> +	while ((req = virtqueue_get_buf(chan->vq, &len)) != NULL) {
>  		if (len)
>  			p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
>  	}
> +	chan->ring_bufs_avail = 1;

Do we have a guarantee that req_done is only called if there is at least
one buf to read?
For example, that there aren't two threads queueing the same callback
where the first one reads everything and the second has nothing to read?

If virtblk_done takes care of setting up a "req_done" bool to only
notify waiters if something has been done, I'd rather have a reason to
do differently, even if you can argue that nothing bad will happen in
case of a gratuitous wake_up.
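
Something like this on top of your patch is what I have in mind (an
untested sketch; "need_wakeup" is just an illustrative name):

	/* Untested sketch; need_wakeup is an illustrative name. */
	bool need_wakeup = false;

	spin_lock_irqsave(&chan->lock, flags);
	while ((req = virtqueue_get_buf(chan->vq, &len)) != NULL) {
		if (len)
			p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
		/* We consumed at least one buf, so ring space was freed. */
		need_wakeup = true;
	}
	if (need_wakeup)
		chan->ring_bufs_avail = 1;
	spin_unlock_irqrestore(&chan->lock, flags);
	/* Only wake waiters if the loop actually completed something. */
	if (need_wakeup)
		wake_up(chan->vc_wq);

That keeps your single lock acquisition but avoids waking every waiter
when the callback found nothing to read.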

> +	spin_unlock_irqrestore(&chan->lock, flags);
> +	/* Wakeup if anyone waiting for VirtIO ring space. */
> +	wake_up(chan->vc_wq);
>  }

Thanks,
-- 
Dominique


* Re: [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport
  2018-07-17 11:42 ` Dominique Martinet
@ 2018-07-17 12:27   ` jiangyiwen
  2018-07-17 13:07     ` Dominique Martinet
  0 siblings, 1 reply; 6+ messages in thread
From: jiangyiwen @ 2018-07-17 12:27 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Andrew Morton, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, Linux Kernel Mailing List, v9fs-developer

On 2018/7/17 19:42, Dominique Martinet wrote:
> 
>> Subject: net/9p: Fix a deadlock case in the virtio transport
> 
> I hadn't noticed in the v1, but how is that a deadlock fix?
> The previous code doesn't look like it deadlocks to me; the commit
> message is more accurate.
> 

Hi Dominique,

If the CPU stays in IRQ context for a long time, the
NMI watchdog will detect a hard lockup on that CPU and
trigger a kernel panic. So I used this subject to
underline the scenario.

> jiangyiwen wrote on Tue, Jul 17, 2018:
>> When the client has multiple threads issuing I/O requests
>> all the time, and the server performs very well, the CPU
>> may stay in IRQ context for a long time because the
>> *while* loop keeps finding bufs in the virtqueue.
>>
>> So we should hold chan->lock across the whole loop.
>>
>> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
>> ---
>>  net/9p/trans_virtio.c | 17 ++++++-----------
>>  1 file changed, 6 insertions(+), 11 deletions(-)
>>
>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>> index 05006cb..e5fea8b 100644
>> --- a/net/9p/trans_virtio.c
>> +++ b/net/9p/trans_virtio.c
>> @@ -148,20 +148,15 @@ static void req_done(struct virtqueue *vq)
>>
>>  	p9_debug(P9_DEBUG_TRANS, ": request done\n");
>>
>> -	while (1) {
>> -		spin_lock_irqsave(&chan->lock, flags);
>> -		req = virtqueue_get_buf(chan->vq, &len);
>> -		if (req == NULL) {
>> -			spin_unlock_irqrestore(&chan->lock, flags);
>> -			break;
>> -		}
>> -		chan->ring_bufs_avail = 1;
>> -		spin_unlock_irqrestore(&chan->lock, flags);
>> -		/* Wakeup if anyone waiting for VirtIO ring space. */
>> -		wake_up(chan->vc_wq);
>> +	spin_lock_irqsave(&chan->lock, flags);
>> +	while ((req = virtqueue_get_buf(chan->vq, &len)) != NULL) {
>>  		if (len)
>>  			p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
>>  	}
>> +	chan->ring_bufs_avail = 1;
> 
> Do we have a guarantee that req_done is only called if there is at least
> one buf to read?
> For example, that there aren't two threads queueing the same callback
> where the first one reads everything and the second has nothing to read?
> 
> If virtblk_done takes care of setting up a "req_done" bool to only
> notify waiters if something has been done, I'd rather have a reason to
> do differently, even if you can argue that nothing bad will happen in
> case of a gratuitous wake_up.
> 

Sorry, I don't fully understand what you mean.
I think even if the ring buffer has no data, the wakeup operation
will not cause any problem, and the performance loss is negligible.

Thanks.

>> +	spin_unlock_irqrestore(&chan->lock, flags);
>> +	/* Wakeup if anyone waiting for VirtIO ring space. */
>> +	wake_up(chan->vc_wq);
>>  }
> 
> Thanks,
> 




* Re: [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport
  2018-07-17 12:27   ` jiangyiwen
@ 2018-07-17 13:07     ` Dominique Martinet
  2018-07-18  0:58       ` jiangyiwen
  0 siblings, 1 reply; 6+ messages in thread
From: Dominique Martinet @ 2018-07-17 13:07 UTC (permalink / raw)
  To: jiangyiwen
  Cc: Andrew Morton, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, Linux Kernel Mailing List, v9fs-developer

jiangyiwen wrote on Tue, Jul 17, 2018:
> On 2018/7/17 19:42, Dominique Martinet wrote:
> > 
> >> Subject: net/9p: Fix a deadlock case in the virtio transport
> > 
> > I hadn't noticed in the v1, but how is that a deadlock fix?
> > The previous code doesn't look like it deadlocks to me; the commit
> > message is more accurate.
> > 
> 
> If the CPU stays in IRQ context for a long time, the
> NMI watchdog will detect a hard lockup on that CPU and
> trigger a kernel panic. So I used this subject to
> underline the scenario.

That's still not a deadlock - fix lockup would be more appropriate?


> > Do we have a guarantee that req_done is only called if there is at least
> > one buf to read?
> > For example, that there aren't two threads queueing the same callback
> > where the first one reads everything and the second has nothing to read?
> > 
> > If virtblk_done takes care of setting up a "req_done" bool to only
> > notify waiters if something has been done, I'd rather have a reason to
> > do differently, even if you can argue that nothing bad will happen in
> > case of a gratuitous wake_up.
> > 
> 
> Sorry, I don't fully understand what you mean.
> I think even if the ring buffer has no data, the wakeup operation
> will not cause any problem, and the performance loss is negligible.

I just mean "others do check, why not us?". It's almost free to check if
we had something to read, but if there are many pending read/writes
waiting for a buffer they will all wake up and spin uselessly.

I've checked other callers of virtqueue_get_buf() and out of 9 that loop
around in a callback then wake another thread up, 6 do check before
waking up, two check that something happened just to print a debug
statement if not (virtio_test and virtgpu) and one doesn't check
(virtio_input); so I guess we wouldn't be the first ones, just not
following the trend.

But yes, nothing bad will happen, so let's agree to disagree and I'll
defer to others' opinion on this.


Thanks,
-- 
Dominique Martinet


* Re: [V9fs-developer] [PATCH v2] net/9p: Fix a deadlock case in the virtio transport
  2018-07-17 13:07     ` Dominique Martinet
@ 2018-07-18  0:58       ` jiangyiwen
  0 siblings, 0 replies; 6+ messages in thread
From: jiangyiwen @ 2018-07-18  0:58 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Andrew Morton, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, Linux Kernel Mailing List, v9fs-developer

On 2018/7/17 21:07, Dominique Martinet wrote:
> jiangyiwen wrote on Tue, Jul 17, 2018:
>> On 2018/7/17 19:42, Dominique Martinet wrote:
>>>
>>>> Subject: net/9p: Fix a deadlock case in the virtio transport
>>>
>>> I hadn't noticed in the v1, but how is that a deadlock fix?
>>> The previous code doesn't look like it deadlocks to me; the commit
>>> message is more accurate.
>>>
>>
>> If the CPU stays in IRQ context for a long time, the
>> NMI watchdog will detect a hard lockup on that CPU and
>> trigger a kernel panic. So I used this subject to
>> underline the scenario.
> 
> That's still not a deadlock - fix lockup would be more appropriate?
> 
> 

Okay.

>>> Do we have a guarantee that req_done is only called if there is at least
>>> one buf to read?
>>> For example, that there aren't two threads queueing the same callback
>>> where the first one reads everything and the second has nothing to read?
>>>
>>> If virtblk_done takes care of setting up a "req_done" bool to only
>>> notify waiters if something has been done, I'd rather have a reason to
>>> do differently, even if you can argue that nothing bad will happen in
>>> case of a gratuitous wake_up.
>>>
>>
>> Sorry, I don't fully understand what you mean.
>> I think even if the ring buffer has no data, the wakeup operation
>> will not cause any problem, and the performance loss is negligible.
> 
> I just mean "others do check, why not us?". It's almost free to check if
> we had something to read, but if there are many pending read/writes
> waiting for a buffer they will all wake up and spin uselessly.
> 
> I've checked other callers of virtqueue_get_buf() and out of 9 that loop
> around in a callback then wake another thread up, 6 do check before
> waking up, two check that something happened just to print a debug
> statement if not (virtio_test and virtgpu) and one doesn't check
> (virtio_input); so I guess we wouldn't be the first ones, just not
> following the trend.
> 
> But yes, nothing bad will happen, so let's agree to disagree and I'll
> defer to others' opinion on this.
> 
> 
> Thanks,
> 

Thanks for your reply. You're right, other callers check whether the
VirtIO ring has data before doing the wakeup, so we should follow the
trend as well.

Okay, I will resend the patch later.
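
Roughly, I am thinking of something like this for v3 (an untested
sketch, the resent patch may differ in details):

	/* Untested sketch of the planned rework, for illustration only. */
	static void req_done(struct virtqueue *vq)
	{
		struct virtio_chan *chan = vq->vdev->priv;
		unsigned int len;
		struct p9_req_t *req;
		bool need_wakeup = false;
		unsigned long flags;

		p9_debug(P9_DEBUG_TRANS, ": request done\n");

		spin_lock_irqsave(&chan->lock, flags);
		while ((req = virtqueue_get_buf(chan->vq, &len)) != NULL) {
			/* Flag a wakeup only on the 0 -> 1 transition of
			 * ring_bufs_avail, i.e. when writers may be waiting
			 * for ring space.
			 */
			if (!chan->ring_bufs_avail) {
				chan->ring_bufs_avail = 1;
				need_wakeup = true;
			}
			if (len)
				p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
		}
		spin_unlock_irqrestore(&chan->lock, flags);

		/* Wakeup if anyone waiting for VirtIO ring space. */
		if (need_wakeup)
			wake_up(chan->vc_wq);
	}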

Thanks,
Yiwen.


