* [PATCH] 9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work
@ 2020-06-11 1:48 Wang Hai
From: Wang Hai @ 2020-06-11 1:48 UTC
To: ericvh, lucho, asmadeus, davem
Cc: v9fs-developer, netdev, linux-kernel, wanghai38
p9_read_work and p9_fd_cancelled may run concurrently, so each of them
can call list_del() on a req_list entry that the other has already
removed.

If p9_fd_cancelled deletes req->req_list first, the later
list_del(&m->rreq->req_list) in p9_read_work hits an entry that is
already gone. Fix this by setting req->status to REQ_STATUS_FLSHD after
list_del(&req->req_list) in p9_fd_cancelled, so that p9_read_work no
longer treats the request as still being on the list.

If p9_read_work deletes req->req_list first, the later
list_del(&req->req_list) in p9_fd_cancelled hits the same problem.
Return early when req->status == REQ_STATUS_RCVD: this means a response
for oldreq has just been received, so p9_fd_cancelled has nothing to do.
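The ordering described above can be illustrated with a minimal userspace
sketch. It is a simplified model under stated assumptions, not the kernel
code: model_read_work, model_fd_cancelled, struct req and the on_list flag
are hypothetical stand-ins for p9_read_work, p9_fd_cancelled,
struct p9_req_t and req_list membership, and a pthread mutex stands in for
client->lock. The model also assumes the reader only unlinks requests that
are still in the "sent" state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the 9p request states (not the kernel enum). */
enum req_status { REQ_SENT, REQ_RCVD, REQ_FLSHD };

struct req {
	enum req_status status;
	bool on_list;	/* models membership in the connection's req_list */
};

/* Stands in for client->lock. */
static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

/* Unlink exactly once; a second call would be the double list_del(). */
static void unlink_req(struct req *r)
{
	assert(r->on_list);
	r->on_list = false;
}

/* Reader side (assumed behaviour): only unlink a request still pending. */
static void model_read_work(struct req *r)
{
	pthread_mutex_lock(&client_lock);
	if (r->status == REQ_SENT) {
		unlink_req(r);
		r->status = REQ_RCVD;
	}
	pthread_mutex_unlock(&client_lock);
}

/* Cancel side, following the patch: do nothing if the reply already won. */
static int model_fd_cancelled(struct req *r)
{
	pthread_mutex_lock(&client_lock);
	if (r->status == REQ_RCVD) {
		pthread_mutex_unlock(&client_lock);
		return 0;
	}
	unlink_req(r);
	r->status = REQ_FLSHD;	/* tells the reader the entry is gone */
	pthread_mutex_unlock(&client_lock);
	return 0;
}
```

Whichever side takes the lock first unlinks the request and changes its
status; the other side sees the new status and leaves the list alone, so
unlink_req() never runs twice for the same request.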
Fixes: 60ff779c4abb ("9p: client: remove unused code and any reference to "cancelled" function")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
net/9p/trans_fd.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index f868cf6fba79..a563699629cb 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -718,11 +718,18 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
 {
 	p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);
 
-	/* we haven't received a response for oldreq,
-	 * remove it from the list.
+	/* If req->status == REQ_STATUS_RCVD, it means we just received a
+	 * response for oldreq, we need do nothing here. Else, remove it from
+	 * the list.
 	 */
 	spin_lock(&client->lock);
+	if (req->status == REQ_STATUS_RCVD) {
+		spin_unlock(&client->lock);
+		return 0;
+	}
+
 	list_del(&req->req_list);
+	req->status = REQ_STATUS_FLSHD;
 	spin_unlock(&client->lock);
 	p9_req_put(req);
 
--
2.17.1
* Re: [PATCH] 9p/trans_fd: Fix concurrency del of req_list in p9_fd_cancelled/p9_read_work
@ 2020-06-11 14:50 ` Dominique Martinet
From: Dominique Martinet @ 2020-06-11 14:50 UTC
To: Wang Hai; +Cc: ericvh, lucho, davem, v9fs-developer, netdev, linux-kernel
Wang Hai wrote on Thu, Jun 11, 2020:
> p9_read_work and p9_fd_cancelled may be called concurrently.
Good catch. I'm sure this fixes some of the old syzbot bugs...
I'll check that the other transports handle this properly as well.
> Before list_del(&m->rreq->req_list) in p9_read_work is called,
> the req->req_list may have been deleted in p9_fd_cancelled.
> We can fix it by setting req->status to REQ_STATUS_FLSHD after
> list_del(&req->req_list) in p9_fd_cancelled.
Hm, if you do that, read_work will fail with EIO and no further 9p
messages will be read?
p9_read_work should probably handle REQ_STATUS_FLSHD as a special case
that just throws the message away without an error as well.
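A rough userspace sketch of that suggestion follows. It is only an
illustration under assumptions, not the actual p9_read_work code:
model_handle_reply, struct pending_req and enum reply_state are
hypothetical names, and the model assumes that any state other than
"sent" or "flushed" is what makes the reader give up with -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request states for this model only. */
enum reply_state { ST_SENT, ST_RCVD, ST_FLSHD, ST_ERROR };

struct pending_req {
	enum reply_state status;
	bool on_list;	/* models membership in req_list */
};

/*
 * Called when a complete reply has been read for r.
 * Returns 0 if the reader can keep going, -EIO if it must stop.
 */
static int model_handle_reply(struct pending_req *r)
{
	switch (r->status) {
	case ST_SENT:
		/* Normal case: unlink the request and complete it. */
		r->on_list = false;
		r->status = ST_RCVD;
		return 0;
	case ST_FLSHD:
		/* Cancelled and already unlinked: drop the reply quietly. */
		return 0;
	default:
		/* Unexpected state: treat it as an I/O error. */
		return -EIO;
	}
}
```

With the extra case, a reply that races with a cancellation is discarded
and the reader carries on with the next message instead of tearing the
connection down.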
> Before list_del(&req->req_list) in p9_fd_cancelled is called,
> the req->req_list may have been deleted in p9_read_work.
> We should return when req->status = REQ_STATUS_RCVD which means
> we just received a response for oldreq, so we need do nothing
> in p9_fd_cancelled.
I'll need some time to convince myself the refcounting is correct in
this case.
Before refcounting was introduced this was definitely wrong, but now it
might just work by chance... I'll double-check.
> Fixes: 60ff779c4abb ("9p: client: remove unused code and any reference
> to "cancelled" function")
I don't understand how this commit is related?
At least make it afd8d65411 ("9P: Add cancelled() to the transport
functions.") which adds the op, not something that removed a previous
version of cancelled even earlier.
> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index f868cf6fba79..a563699629cb 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -718,11 +718,18 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
>  {
>  	p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);
>  
> -	/* we haven't received a response for oldreq,
> -	 * remove it from the list.
> +	/* If req->status == REQ_STATUS_RCVD, it means we just received a
> +	 * response for oldreq, we need do nothing here. Else, remove it from
> +	 * the list.
(nitpick) this feels a bit hard to read, and does not give any
information: you're just paraphrasing the C code.
I would suggest moving the comment after the spin_lock() and saying what
we really do; something as simple as "ignore cancelled request if message
has been received before lock" is enough.
>  	 */
>  	spin_lock(&client->lock);
> +	if (req->status == REQ_STATUS_RCVD) {
> +		spin_unlock(&client->lock);
> +		return 0;
> +	}
> +
>  	list_del(&req->req_list);
> +	req->status = REQ_STATUS_FLSHD;
>  	spin_unlock(&client->lock);
>  	p9_req_put(req);
>  
--
Dominique