All of lore.kernel.org
 help / color / mirror / Atom feed
* [tipc-discussion][net 0/1] tipc: fix race condition between sleep/wakeup
@ 2019-02-19  7:03 Tung Nguyen
  2019-02-19  7:03 ` [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto Tung Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Tung Nguyen @ 2019-02-19  7:03 UTC (permalink / raw)
  To: davem, netdev; +Cc: tipc-discussion

Fix issue of hung sendto() in user space.

Tung Nguyen (1):
  tipc: fix race condition causing hung sendto

 net/tipc/socket.c | 4 ++++
 1 file changed, 4 insertions(+)
 mode change 100644 => 100755 net/tipc/socket.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto
  2019-02-19  7:03 [tipc-discussion][net 0/1] tipc: fix race condition between sleep/wakeup Tung Nguyen
@ 2019-02-19  7:03 ` Tung Nguyen
  2019-02-19 21:21   ` David Miller
  2019-02-21  3:31   ` [tipc-discussion][net v2 " Tung Nguyen
  0 siblings, 2 replies; 9+ messages in thread
From: Tung Nguyen @ 2019-02-19  7:03 UTC (permalink / raw)
  To: davem, netdev; +Cc: tipc-discussion

When sending multicast messages via blocking socket,
if sending link is congested (tsk->cong_link_cnt is set to 1),
the sending thread will be put into sleeping state. However,
tipc_sk_filter_rcv() is called under socket spin lock but
tipc_wait_for_cond() is not. So, there is no guarantee that
the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
CPU-1 will be perceived by CPU-0. If that is the case, the sending
thread in CPU-0 after being waken up, will continue to see
tsk->cong_link_cnt as 1 and put the sending thread into sleeping
state again. The sending thread will sleep forever.

CPU-0                                | CPU-1
tipc_wait_for_cond()                 |
{                                    |
 // condition_ = !tsk->cong_link_cnt |
 while ((rc_ = !(condition_))) {     |
  ...                                |
  release_sock(sk_);                 |
  wait_woken();                      |
                                     | if (!sock_owned_by_user(sk))
                                     |  tipc_sk_filter_rcv()
                                     |  {
                                     |   ...
                                     |   tipc_sk_proto_rcv()
                                     |   {
                                     |    ...
                                     |    tsk->cong_link_cnt--;
                                     |    ...
                                     |    sk->sk_write_space(sk);
                                     |    ...
                                     |   }
                                     |   ...
                                     |  }
  sched_annotate_sleep();            |
  lock_sock(sk_);                    |
  remove_wait_queue();               |
 }                                   |
}                                    |

This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
and tipc_wait_for_cond().

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
---
 net/tipc/socket.c | 4 ++++
 1 file changed, 4 insertions(+)
 mode change 100644 => 100755 net/tipc/socket.c

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
old mode 100644
new mode 100755
index 1217c90a363b..d8f054d45941
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -383,6 +383,8 @@ static int tipc_sk_sock_err(struct socket *sock, long *timeout)
 	int rc_;							       \
 									       \
 	while ((rc_ = !(condition_))) {					       \
+		/* coupled with smp_wmb() in tipc_sk_proto_rcv() */            \
+		smp_rmb();                                                     \
 		DEFINE_WAIT_FUNC(wait_, woken_wake_function);	               \
 		sk_ = (sock_)->sk;					       \
 		rc_ = tipc_sk_sock_err((sock_), timeo_);		       \
@@ -1982,6 +1984,8 @@ static void tipc_sk_proto_rcv(struct sock *sk,
 		return;
 	case SOCK_WAKEUP:
 		tipc_dest_del(&tsk->cong_links, msg_orignode(hdr), 0);
+		/* coupled with smp_rmb() in tipc_wait_for_cond() */
+		smp_wmb();
 		tsk->cong_link_cnt--;
 		wakeup = true;
 		break;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto
  2019-02-19  7:03 ` [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto Tung Nguyen
@ 2019-02-19 21:21   ` David Miller
  2019-02-21  3:31   ` [tipc-discussion][net v2 " Tung Nguyen
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2019-02-19 21:21 UTC (permalink / raw)
  To: tung.q.nguyen; +Cc: netdev, tipc-discussion

From: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Date: Tue, 19 Feb 2019 14:03:10 +0700

> When sending multicast messages via blocking socket,
> if sending link is congested (tsk->cong_link_cnt is set to 1),
> the sending thread will be put into sleeping state. However,
> tipc_sk_filter_rcv() is called under socket spin lock but
> tipc_wait_for_cond() is not. So, there is no guarantee that
> the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
> CPU-1 will be perceived by CPU-0. If that is the case, the sending
> thread in CPU-0 after being waken up, will continue to see
> tsk->cong_link_cnt as 1 and put the sending thread into sleeping
> state again. The sending thread will sleep forever.
 ...
> This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
> and tipc_wait_for_cond().
> 
> Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>

You really need to build test this stuff properly:

net/tipc/socket.c: In function ‘__tipc_shutdown’:
./include/linux/wait.h:1119:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
  struct wait_queue_entry name = {     \
  ^~~~~~

You can't put the smp_rmb(); before the DEFINE_WAIT_FUNC() in that
basic block.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tipc-discussion][net v2 1/1] tipc: fix race condition causing hung sendto
  2019-02-19  7:03 ` [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto Tung Nguyen
  2019-02-19 21:21   ` David Miller
@ 2019-02-21  3:31   ` Tung Nguyen
  2019-02-21  3:51     ` David Miller
                       ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Tung Nguyen @ 2019-02-21  3:31 UTC (permalink / raw)
  To: davem, netdev; +Cc: tipc-discussion

When sending multicast messages via blocking socket,
if sending link is congested (tsk->cong_link_cnt is set to 1),
the sending thread will be put into sleeping state. However,
tipc_sk_filter_rcv() is called under socket spin lock but
tipc_wait_for_cond() is not. So, there is no guarantee that
the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
CPU-1 will be perceived by CPU-0. If that is the case, the sending
thread in CPU-0 after being waken up, will continue to see
tsk->cong_link_cnt as 1 and put the sending thread into sleeping
state again. The sending thread will sleep forever.

CPU-0                                | CPU-1
tipc_wait_for_cond()                 |
{                                    |
 // condition_ = !tsk->cong_link_cnt |
 while ((rc_ = !(condition_))) {     |
  ...                                |
  release_sock(sk_);                 |
  wait_woken();                      |
                                     | if (!sock_owned_by_user(sk))
                                     |  tipc_sk_filter_rcv()
                                     |  {
                                     |   ...
                                     |   tipc_sk_proto_rcv()
                                     |   {
                                     |    ...
                                     |    tsk->cong_link_cnt--;
                                     |    ...
                                     |    sk->sk_write_space(sk);
                                     |    ...
                                     |   }
                                     |   ...
                                     |  }
  sched_annotate_sleep();            |
  lock_sock(sk_);                    |
  remove_wait_queue();               |
 }                                   |
}                                    |

This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
and tipc_wait_for_cond().

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
---
 net/tipc/socket.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 1217c90a363b..7cfa0c8ccc2a 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -379,16 +379,18 @@ static int tipc_sk_sock_err(struct socket *sock, long *timeout)
 
 #define tipc_wait_for_cond(sock_, timeo_, condition_)			       \
 ({                                                                             \
+	DEFINE_WAIT_FUNC(wait_, woken_wake_function);                          \
 	struct sock *sk_;						       \
 	int rc_;							       \
 									       \
 	while ((rc_ = !(condition_))) {					       \
-		DEFINE_WAIT_FUNC(wait_, woken_wake_function);	               \
+		/* coupled with smp_wmb() in tipc_sk_proto_rcv() */            \
+		smp_rmb();                                                     \
 		sk_ = (sock_)->sk;					       \
 		rc_ = tipc_sk_sock_err((sock_), timeo_);		       \
 		if (rc_)						       \
 			break;						       \
-		prepare_to_wait(sk_sleep(sk_), &wait_, TASK_INTERRUPTIBLE);    \
+		add_wait_queue(sk_sleep(sk_), &wait_);                         \
 		release_sock(sk_);					       \
 		*(timeo_) = wait_woken(&wait_, TASK_INTERRUPTIBLE, *(timeo_)); \
 		sched_annotate_sleep();				               \
@@ -1982,6 +1984,8 @@ static void tipc_sk_proto_rcv(struct sock *sk,
 		return;
 	case SOCK_WAKEUP:
 		tipc_dest_del(&tsk->cong_links, msg_orignode(hdr), 0);
+		/* coupled with smp_rmb() in tipc_wait_for_cond() */
+		smp_wmb();
 		tsk->cong_link_cnt--;
 		wakeup = true;
 		break;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [tipc-discussion][net v2 1/1] tipc: fix race condition causing hung sendto
  2019-02-21  3:31   ` [tipc-discussion][net v2 " Tung Nguyen
@ 2019-02-21  3:51     ` David Miller
  2019-02-21  4:04       ` tung quang nguyen
  2019-02-22 23:24     ` David Miller
  2019-02-25  3:57     ` [net v3 " Tung Nguyen
  2 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2019-02-21  3:51 UTC (permalink / raw)
  To: tung.q.nguyen; +Cc: netdev, tipc-discussion


So if the previous version didn't even compile, I have to ask.

How did you test this?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [tipc-discussion][net v2 1/1] tipc: fix race condition causing hung sendto
  2019-02-21  3:51     ` David Miller
@ 2019-02-21  4:04       ` tung quang nguyen
  0 siblings, 0 replies; 9+ messages in thread
From: tung quang nguyen @ 2019-02-21  4:04 UTC (permalink / raw)
  To: 'David Miller'; +Cc: netdev, tipc-discussion

Hi David,

I compiled previous version and tested it. But I forgot to observe kernel
warning.

For the way to reproduce the issue: calling sendto() and printf() right
after that to print out a log to see if sendto() was successful. On some
occasions, the log was never printed and stack trace showed sendto() was not
returned.
By applying the patch, the issue disappeared.

Thanks.

Best regards,
Tung Nguyen

-----Original Message-----
From: David Miller <davem@davemloft.net> 
Sent: Thursday, February 21, 2019 10:51 AM
To: tung.q.nguyen@dektech.com.au
Cc: netdev@vger.kernel.org; tipc-discussion@lists.sourceforge.net
Subject: Re: [tipc-discussion][net v2 1/1] tipc: fix race condition causing
hung sendto


So if the previous version didn't even compile, I have to ask.

How did you test this?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [tipc-discussion][net v2 1/1] tipc: fix race condition causing hung sendto
  2019-02-21  3:31   ` [tipc-discussion][net v2 " Tung Nguyen
  2019-02-21  3:51     ` David Miller
@ 2019-02-22 23:24     ` David Miller
  2019-02-25  3:57     ` [net v3 " Tung Nguyen
  2 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2019-02-22 23:24 UTC (permalink / raw)
  To: tung.q.nguyen; +Cc: netdev, tipc-discussion

From: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Date: Thu, 21 Feb 2019 10:31:21 +0700

> When sending multicast messages via blocking socket,

This patch does not apply cleanly to net, please respin:

[davem@localhost net]$ git am --signoff tipc-discussion-net-v2-1-1-tipc-fix-race-condition-causing-hung-sendto.patch 
Applying: tipc: fix race condition causing hung sendto
error: patch failed: net/tipc/socket.c:379
error: net/tipc/socket.c: patch does not apply
Patch failed at 0001 tipc: fix race condition causing hung sendto
hint: Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [net v3 1/1] tipc: fix race condition causing hung sendto
  2019-02-21  3:31   ` [tipc-discussion][net v2 " Tung Nguyen
  2019-02-21  3:51     ` David Miller
  2019-02-22 23:24     ` David Miller
@ 2019-02-25  3:57     ` Tung Nguyen
  2019-02-26 22:52       ` David Miller
  2 siblings, 1 reply; 9+ messages in thread
From: Tung Nguyen @ 2019-02-25  3:57 UTC (permalink / raw)
  To: davem, netdev; +Cc: tipc-discussion

When sending multicast messages via blocking socket,
if sending link is congested (tsk->cong_link_cnt is set to 1),
the sending thread will be put into sleeping state. However,
tipc_sk_filter_rcv() is called under socket spin lock but
tipc_wait_for_cond() is not. So, there is no guarantee that
the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
CPU-1 will be perceived by CPU-0. If that is the case, the sending
thread in CPU-0 after being waken up, will continue to see
tsk->cong_link_cnt as 1 and put the sending thread into sleeping
state again. The sending thread will sleep forever.

CPU-0                                | CPU-1
tipc_wait_for_cond()                 |
{                                    |
 // condition_ = !tsk->cong_link_cnt |
 while ((rc_ = !(condition_))) {     |
  ...                                |
  release_sock(sk_);                 |
  wait_woken();                      |
                                     | if (!sock_owned_by_user(sk))
                                     |  tipc_sk_filter_rcv()
                                     |  {
                                     |   ...
                                     |   tipc_sk_proto_rcv()
                                     |   {
                                     |    ...
                                     |    tsk->cong_link_cnt--;
                                     |    ...
                                     |    sk->sk_write_space(sk);
                                     |    ...
                                     |   }
                                     |   ...
                                     |  }
  sched_annotate_sleep();            |
  lock_sock(sk_);                    |
  remove_wait_queue();               |
 }                                   |
}                                    |

This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
and tipc_wait_for_cond().

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
---
 net/tipc/socket.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 684f2125fc6b..70343ac448b1 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -379,11 +379,13 @@ static int tipc_sk_sock_err(struct socket *sock, long *timeout)
 
 #define tipc_wait_for_cond(sock_, timeo_, condition_)			       \
 ({                                                                             \
+	DEFINE_WAIT_FUNC(wait_, woken_wake_function);                          \
 	struct sock *sk_;						       \
 	int rc_;							       \
 									       \
 	while ((rc_ = !(condition_))) {					       \
-		DEFINE_WAIT_FUNC(wait_, woken_wake_function);	               \
+		/* coupled with smp_wmb() in tipc_sk_proto_rcv() */            \
+		smp_rmb();                                                     \
 		sk_ = (sock_)->sk;					       \
 		rc_ = tipc_sk_sock_err((sock_), timeo_);		       \
 		if (rc_)						       \
@@ -1983,6 +1985,8 @@ static void tipc_sk_proto_rcv(struct sock *sk,
 		return;
 	case SOCK_WAKEUP:
 		tipc_dest_del(&tsk->cong_links, msg_orignode(hdr), 0);
+		/* coupled with smp_rmb() in tipc_wait_for_cond() */
+		smp_wmb();
 		tsk->cong_link_cnt--;
 		wakeup = true;
 		break;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [net v3 1/1] tipc: fix race condition causing hung sendto
  2019-02-25  3:57     ` [net v3 " Tung Nguyen
@ 2019-02-26 22:52       ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2019-02-26 22:52 UTC (permalink / raw)
  To: tung.q.nguyen; +Cc: netdev, tipc-discussion

From: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Date: Mon, 25 Feb 2019 10:57:20 +0700

> When sending multicast messages via blocking socket,
> if sending link is congested (tsk->cong_link_cnt is set to 1),
> the sending thread will be put into sleeping state. However,
> tipc_sk_filter_rcv() is called under socket spin lock but
> tipc_wait_for_cond() is not. So, there is no guarantee that
> the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
> CPU-1 will be perceived by CPU-0. If that is the case, the sending
> thread in CPU-0 after being waken up, will continue to see
> tsk->cong_link_cnt as 1 and put the sending thread into sleeping
> state again. The sending thread will sleep forever.
...
> This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
> and tipc_wait_for_cond().
> 
> Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>

Applied and queued up for -stable.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-02-26 22:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-19  7:03 [tipc-discussion][net 0/1] tipc: fix race condition between sleep/wakeup Tung Nguyen
2019-02-19  7:03 ` [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto Tung Nguyen
2019-02-19 21:21   ` David Miller
2019-02-21  3:31   ` [tipc-discussion][net v2 " Tung Nguyen
2019-02-21  3:51     ` David Miller
2019-02-21  4:04       ` tung quang nguyen
2019-02-22 23:24     ` David Miller
2019-02-25  3:57     ` [net v3 " Tung Nguyen
2019-02-26 22:52       ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.