linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace
@ 2018-03-09 20:41 Greg Kurz
  2018-03-09 22:12 ` Andrew Morton
  2018-03-12  1:33 ` jiangyiwen
  0 siblings, 2 replies; 5+ messages in thread
From: Greg Kurz @ 2018-03-09 20:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, v9fs-developer, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, David S. Miller, Andrew Morton, Yiwen Jiang

If it was interrupted by a signal, the 9p client may need to send some
more requests to the server for cleanup before returning to userspace.

To avoid such a last minute request to be interrupted right away, the
client memorizes if a signal is pending, clears TIF_SIGPENDING, handles
the request and calls recalc_sigpending() before returning.

Unfortunately, if the transmission of this cleanup request fails for any
reason, the transport returns an error and the client propagates it right
away, without calling recalc_sigpending().

This ends up with -ERESTARTSYS from the initially interrupted request
crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
request. The specific signal handling code, which is responsible for
converting -ERESTARTSYS to -EINTR is not called, and userspace receives
the confusing errno value:

open: Unknown error 512 (512)

This is really hard to hit in real life. I discovered the issue while
working on hot-unplug of a virtio-9p-pci device with an instrumented
QEMU allowing to control request completion.

Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
error path actually. Their code flow is a bit obscure and the best
thing to do would probably be a full rewrite: to really ensure this
situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
never happen.

But given the general lack of interest for the 9p code, I won't risk
breaking more things. So this patch simply fixes the buggy paths in
both functions with a trivial label+goto.

Thanks to Laurent Dufour for his help and suggestions on how to find
the root cause and how to fix it.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 net/9p/client.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index b433aff5ff13..e6cae8332e2e 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -769,7 +769,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 	if (err < 0) {
 		if (err != -ERESTARTSYS && err != -EFAULT)
 			c->status = Disconnected;
-		goto reterr;
+		goto recalc_sigpending;
 	}
 again:
 	/* Wait for the response */
@@ -804,6 +804,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 		if (req->status == REQ_STATUS_RCVD)
 			err = 0;
 	}
+recalc_sigpending:
 	if (sigpending) {
 		spin_lock_irqsave(&current->sighand->siglock, flags);
 		recalc_sigpending();
@@ -867,7 +868,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 		if (err == -EIO)
 			c->status = Disconnected;
 		if (err != -ERESTARTSYS)
-			goto reterr;
+			goto recalc_sigpending;
 	}
 	if (req->status == REQ_STATUS_ERROR) {
 		p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
@@ -885,6 +886,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 		if (req->status == REQ_STATUS_RCVD)
 			err = 0;
 	}
+recalc_sigpending:
 	if (sigpending) {
 		spin_lock_irqsave(&current->sighand->siglock, flags);
 		recalc_sigpending();

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace
  2018-03-09 20:41 [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace Greg Kurz
@ 2018-03-09 22:12 ` Andrew Morton
  2018-03-10 12:32   ` Greg Kurz
  2018-03-12  1:33 ` jiangyiwen
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2018-03-09 22:12 UTC (permalink / raw)
  To: Greg Kurz
  Cc: linux-kernel, netdev, v9fs-developer, Eric Van Hensbergen,
	Ron Minnich, Latchesar Ionkov, David S. Miller, Yiwen Jiang

On Fri, 09 Mar 2018 21:41:38 +0100 Greg Kurz <groug@kaod.org> wrote:

> If it was interrupted by a signal, the 9p client may need to send some
> more requests to the server for cleanup before returning to userspace.
> 
> To avoid such a last minute request to be interrupted right away, the
> client memorizes if a signal is pending, clears TIF_SIGPENDING, handles
> the request and calls recalc_sigpending() before returning.
> 
> Unfortunately, if the transmission of this cleanup request fails for any
> reason, the transport returns an error and the client propagates it right
> away, without calling recalc_sigpending().
> 
> This ends up with -ERESTARTSYS from the initially interrupted request
> crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
> request. The specific signal handling code, which is responsible for
> converting -ERESTARTSYS to -EINTR is not called, and userspace receives
> the confusing errno value:
> 
> open: Unknown error 512 (512)
> 
> This is really hard to hit in real life. I discovered the issue while
> working on hot-unplug of a virtio-9p-pci device with an instrumented
> QEMU allowing to control request completion.
> 
> Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
> error path actually. Their code flow is a bit obscure and the best
> thing to do would probably be a full rewrite: to really ensure this
> situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
> never happen.
> 
> But given the general lack of interest for the 9p code, I won't risk
> breaking more things. So this patch simply fixes the buggy paths in
> both functions with a trivial label+goto.
> 
> Thanks to Laurent Dufour for his help and suggestions on how to find
> the root cause and how to fix it.

That's a fairly straightforward-looking bug.  However the code still
looks a bit racy:


: 	if (signal_pending(current)) {
: 		sigpending = 1;
: 		clear_thread_flag(TIF_SIGPENDING);
: 	} else
: 		sigpending = 0;
: 

what happens if signal_pending(current) becomes true right here?

: 	err = c->trans_mod->request(c, req);


I'm surprised that the 9p client is mucking with signals at all. 
Signals are a userspace IPC scheme and kernel code should instead be
using the more powerful messaging mechanisms which we've developed. 
Ones which don't disrupt userspace state.

Why is this happening?  Is there some userspace/kernel interoperation
involved?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace
  2018-03-09 22:12 ` Andrew Morton
@ 2018-03-10 12:32   ` Greg Kurz
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kurz @ 2018-03-10 12:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, netdev, v9fs-developer, Eric Van Hensbergen,
	Ron Minnich, Latchesar Ionkov, David S. Miller, Yiwen Jiang

Hi Andrew,

Thank you very much for taking care of this.

Please find my answers to your remarks below.

On Fri, 9 Mar 2018 14:12:52 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 09 Mar 2018 21:41:38 +0100 Greg Kurz <groug@kaod.org> wrote:
> 
> > If it was interrupted by a signal, the 9p client may need to send some
> > more requests to the server for cleanup before returning to userspace.
> > 
> > To avoid such a last minute request to be interrupted right away, the
> > client memorizes if a signal is pending, clears TIF_SIGPENDING, handles
> > the request and calls recalc_sigpending() before returning.
> > 
> > Unfortunately, if the transmission of this cleanup request fails for any
> > reason, the transport returns an error and the client propagates it right
> > away, without calling recalc_sigpending().
> > 
> > This ends up with -ERESTARTSYS from the initially interrupted request
> > crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
> > request. The specific signal handling code, which is responsible for
> > converting -ERESTARTSYS to -EINTR is not called, and userspace receives
> > the confusing errno value:
> > 
> > open: Unknown error 512 (512)
> > 
> > This is really hard to hit in real life. I discovered the issue while
> > working on hot-unplug of a virtio-9p-pci device with an instrumented
> > QEMU allowing to control request completion.
> > 
> > Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
> > error path actually. Their code flow is a bit obscure and the best
> > thing to do would probably be a full rewrite: to really ensure this
> > situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
> > never happen.
> > 
> > But given the general lack of interest for the 9p code, I won't risk
> > breaking more things. So this patch simply fixes the buggy paths in
> > both functions with a trivial label+goto.
> > 
> > Thanks to Laurent Dufour for his help and suggestions on how to find
> > the root cause and how to fix it.  
> 
> That's a fairly straightforward-looking bug.  However the code still
> looks a bit racy:
> 
> 
> : 	if (signal_pending(current)) {
> : 		sigpending = 1;
> : 		clear_thread_flag(TIF_SIGPENDING);
> : 	} else
> : 		sigpending = 0;
> : 
> 
> what happens if signal_pending(current) becomes true right here?
>

Depending on the transport c->trans_mod->request() could possibly
return -ERESTARTSYS, and we would then recalc_sigpending() and
propagate -ERESTARTSYS to the caller.

> : 	err = c->trans_mod->request(c, req);
> 
> 
> I'm surprised that the 9p client is mucking with signals at all. 
> Signals are a userspace IPC scheme and kernel code should instead be
> using the more powerful messaging mechanisms which we've developed. 
> Ones which don't disrupt userspace state.
> 
> Why is this happening?  Is there some userspace/kernel interoperation
> involved?
> 

Before commit 9523feac272ccad2ad8186ba4fcc89103754de52, we used to rely
on wait_event_interruptible() instead of wait_event_killable(). IIUC,
the purpose of all this sigpending tweaking was just to allow subsequent
cleanup related requests to be issued, without them being interrupted right
away because of the initial signal.

BTW the issue tentatively fixed by commit 9523feac272ccad2ad8186ba4fcc89103754de52
was more deeply investigated since then. It was caused by a bug in QEMU that got 
fixed, and will be available in the next release (2.12). And to cope with existing
buggy QEMUs, we can assume that a -EINTR response from the server necessarily means
the corresponding request has been successfully canceled.

Cheers,

--
Greg

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace
  2018-03-09 20:41 [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace Greg Kurz
  2018-03-09 22:12 ` Andrew Morton
@ 2018-03-12  1:33 ` jiangyiwen
  1 sibling, 0 replies; 5+ messages in thread
From: jiangyiwen @ 2018-03-12  1:33 UTC (permalink / raw)
  To: Greg Kurz, linux-kernel
  Cc: netdev, v9fs-developer, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, David S. Miller, Andrew Morton

On 2018/3/10 4:41, Greg Kurz wrote:
> If it was interrupted by a signal, the 9p client may need to send some
> more requests to the server for cleanup before returning to userspace.
> 
> To avoid such a last minute request to be interrupted right away, the
> client memorizes if a signal is pending, clears TIF_SIGPENDING, handles
> the request and calls recalc_sigpending() before returning.
> 
> Unfortunately, if the transmission of this cleanup request fails for any
> reason, the transport returns an error and the client propagates it right
> away, without calling recalc_sigpending().
> 
> This ends up with -ERESTARTSYS from the initially interrupted request
> crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
> request. The specific signal handling code, which is responsible for
> converting -ERESTARTSYS to -EINTR is not called, and userspace receives
> the confusing errno value:
> 
> open: Unknown error 512 (512)
> 
> This is really hard to hit in real life. I discovered the issue while
> working on hot-unplug of a virtio-9p-pci device with an instrumented
> QEMU allowing to control request completion.
> 
> Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
> error path actually. Their code flow is a bit obscure and the best
> thing to do would probably be a full rewrite: to really ensure this
> situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
> never happen.
> 
> But given the general lack of interest for the 9p code, I won't risk
> breaking more things. So this patch simply fixes the buggy paths in
> both functions with a trivial label+goto.
> 
> Thanks to Laurent Dufour for his help and suggestions on how to find
> the root cause and how to fix it.
> 

Looks good.

Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>

> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  net/9p/client.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/9p/client.c b/net/9p/client.c
> index b433aff5ff13..e6cae8332e2e 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -769,7 +769,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
>  	if (err < 0) {
>  		if (err != -ERESTARTSYS && err != -EFAULT)
>  			c->status = Disconnected;
> -		goto reterr;
> +		goto recalc_sigpending;
>  	}
>  again:
>  	/* Wait for the response */
> @@ -804,6 +804,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
>  		if (req->status == REQ_STATUS_RCVD)
>  			err = 0;
>  	}
> +recalc_sigpending:
>  	if (sigpending) {
>  		spin_lock_irqsave(&current->sighand->siglock, flags);
>  		recalc_sigpending();
> @@ -867,7 +868,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
>  		if (err == -EIO)
>  			c->status = Disconnected;
>  		if (err != -ERESTARTSYS)
> -			goto reterr;
> +			goto recalc_sigpending;
>  	}
>  	if (req->status == REQ_STATUS_ERROR) {
>  		p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
> @@ -885,6 +886,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
>  		if (req->status == REQ_STATUS_RCVD)
>  			err = 0;
>  	}
> +recalc_sigpending:
>  	if (sigpending) {
>  		spin_lock_irqsave(&current->sighand->siglock, flags);
>  		recalc_sigpending();
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace
@ 2018-02-08 17:38 Greg Kurz
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kurz @ 2018-02-08 17:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, v9fs-developer, Eric Van Hensbergen, Ron Minnich,
	Latchesar Ionkov, David S. Miller, Al Viro

If it was interrupted by a signal, the 9p client may need to send some
more requests to the server for cleanup before returning to userspace.

To avoid such a last minute request to be interrupted right away, the
client memorizes if a signal is pending, clear TIF_SIGPENDING, handle
the request and call recalc_sigpending() before returning.

Unfortunately, if the transmission of this cleanup request fails for any
reason, the transport returns an error and the client propagates it right
away, without calling recalc_sigpending().

This ends up with -ERESTARTSYS from the initially interrupted request
crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
request. The specific signal handling code, which is responsible for
converting -ERESTARTSYS to -EINTR is not called, and userspace receives
the confusing errno value:

open: Unknown error 512 (512)

This is really hard to hit in real life. I discovered the issue while
working on hot-unplug of a virtio-9p-pci device with an instrumented
QEMU allowing to control request completion.

Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
error path actually. Their code flow is a bit obscure and the best
thing to do would probably be a full rewrite: to really ensure this
situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
never happen.

But given the general lack of interest for the 9p code, I won't risk
breaking more things. So this patch simply fix the buggy paths in both
functions with a trivial label+goto.

Thanks to Laurent Dufour for his help and suggestions on how to find
the root cause and how to fix it.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 net/9p/client.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index 4c8cf9c1631a..5154eaf19fff 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -769,7 +769,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 	if (err < 0) {
 		if (err != -ERESTARTSYS && err != -EFAULT)
 			c->status = Disconnected;
-		goto reterr;
+		goto recalc_sigpending;
 	}
 again:
 	/* Wait for the response */
@@ -804,6 +804,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 		if (req->status == REQ_STATUS_RCVD)
 			err = 0;
 	}
+recalc_sigpending:
 	if (sigpending) {
 		spin_lock_irqsave(&current->sighand->siglock, flags);
 		recalc_sigpending();
@@ -867,7 +868,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 		if (err == -EIO)
 			c->status = Disconnected;
 		if (err != -ERESTARTSYS)
-			goto reterr;
+			goto recalc_sigpending;
 	}
 	if (req->status == REQ_STATUS_ERROR) {
 		p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
@@ -885,6 +886,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 		if (req->status == REQ_STATUS_RCVD)
 			err = 0;
 	}
+recalc_sigpending:
 	if (sigpending) {
 		spin_lock_irqsave(&current->sighand->siglock, flags);
 		recalc_sigpending();

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-03-12  1:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-09 20:41 [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace Greg Kurz
2018-03-09 22:12 ` Andrew Morton
2018-03-10 12:32   ` Greg Kurz
2018-03-12  1:33 ` jiangyiwen
  -- strict thread matches above, loose matches on Subject: below --
2018-02-08 17:38 Greg Kurz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).