* [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
@ 2010-02-03 5:04 Wei Yongjun
From: Wei Yongjun @ 2010-02-03 5:04 UTC (permalink / raw)
To: linux-sctp
sk->sk_data_ready() of an SCTP socket can be called from both BH and non-BH
contexts, so the default sk->sk_data_ready(), sock_def_readable(), cannot
be used in this case: it takes sk->sk_callback_lock without disabling BH.
Therefore, add a new function, sctp_data_ready(), which takes
sk->sk_callback_lock with BH disabled.
============================
[ INFO: possible irq lock inversion dependency detected ]
2.6.33-rc6 #129
---------------------------------------------------------
sctp_darn/1517 just changed the state of lock:
(clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
but this lock took another, SOFTIRQ-unsafe lock in the past:
(slock-AF_INET){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
1 lock held by sctp_darn/1517:
#0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
---
include/net/sctp/sctp.h | 1 +
net/sctp/endpointola.c | 1 +
net/sctp/socket.c | 10 ++++++++++
3 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 78740ec..fa6cde5 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -128,6 +128,7 @@ extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
 int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
 int sctp_inet_listen(struct socket *sock, int backlog);
 void sctp_write_space(struct sock *sk);
+void sctp_data_ready(struct sock *sk, int len);
 unsigned int sctp_poll(struct file *file, struct socket *sock,
 		poll_table *wait);
 void sctp_sock_rfree(struct sk_buff *skb);
diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index 905fda5..7ec09ba 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -144,6 +144,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
 	/* Use SCTP specific send buffer space queues. */
 	ep->sndbuf_policy = sctp_sndbuf_policy;

+	sk->sk_data_ready = sctp_data_ready;
 	sk->sk_write_space = sctp_write_space;
 	sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 67fdac9..b437e2a 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -6185,6 +6185,16 @@ do_nonblock:
 	goto out;
 }

+void sctp_data_ready(struct sock *sk, int len)
+{
+	read_lock_bh(&sk->sk_callback_lock);
+	if (sk_has_sleeper(sk))
+		wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
+						POLLRDNORM | POLLRDBAND);
+	sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
+	read_unlock_bh(&sk->sk_callback_lock);
+}
+
 /* If socket sndbuf has changed, wake up all per association waiters. */
 void sctp_write_space(struct sock *sk)
 {
--
1.6.5.2
* Re: [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
From: Wei Yongjun @ 2010-02-03 9:51 UTC (permalink / raw)
To: linux-sctp
>
> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
> not be used in this case. Therefore, we have to make a new function
> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>
> ============================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.33-rc6 #129
> ---------------------------------------------------------
> sctp_darn/1517 just changed the state of lock:
> (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
> but this lock took another, SOFTIRQ-unsafe lock in the past:
> (slock-AF_INET){+.-...}
>
> and interrupts could create inverse lock ordering between them.
>
> other info that might help us debug this:
> 1 lock held by sctp_darn/1517:
> #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
>
The full lockdep output message is:
============================
[ INFO: possible irq lock inversion dependency detected ]
2.6.33-rc6 #129
---------------------------------------------------------
sctp_darn/1517 just changed the state of lock:
(clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
but this lock took another, SOFTIRQ-unsafe lock in the past:
(slock-AF_INET){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
1 lock held by sctp_darn/1517:
#0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
the shortest dependencies between 2nd lock and 1st lock:
-> (slock-AF_INET){+.-...} ops: 0 {
HARDIRQ-ON-W at:
[<c04717ff>] __lock_acquire+0x9af/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762bbd>] _raw_spin_lock_bh+0x3d/0x50
[<c06aa723>] lock_sock_nested+0x33/0xf0
[<c06ed27a>] tcp_close+0x1a/0x390
[<c070eefb>] inet_release+0x3b/0x60
[<c06a8030>] sock_release+0x20/0x70
[<c06a8097>] sock_close+0x17/0x30
[<c04f687b>] __fput+0xfb/0x200
[<c04f699d>] fput+0x1d/0x30
[<c04f341c>] filp_close+0x4c/0x80
[<c04f34c7>] sys_close+0x77/0xc0
[<c0402d8c>] sysenter_do_call+0x12/0x32
IN-SOFTIRQ-W at:
[<c04717e3>] __lock_acquire+0x993/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762968>] _raw_spin_lock+0x38/0x50
[<c070601e>] udp_queue_rcv_skb+0xee/0x2c0
[<c07067ef>] __udp4_lib_rcv+0x1bf/0x6a0
[<c0706ce7>] udp_rcv+0x17/0x20
[<c06e194f>] ip_local_deliver_finish+0xdf/0x2c0
[<c06e1bbf>] ip_local_deliver+0x8f/0xa0
[<c06e12cb>] ip_rcv_finish+0xdb/0x3c0
[<c06e17b6>] ip_rcv+0x206/0x2c0
[<c06b79ef>] netif_receive_skb+0x34f/0x570
[<cc83ac5e>] pcnet32_poll+0x27e/0x7a0 [pcnet32]
[<c06b84b0>] net_rx_action+0x150/0x230
[<c0449520>] __do_softirq+0xa0/0x1c0
INITIAL USE at:
[<c04711bf>] __lock_acquire+0x36f/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762bbd>] _raw_spin_lock_bh+0x3d/0x50
[<c06aa723>] lock_sock_nested+0x33/0xf0
[<c06ed27a>] tcp_close+0x1a/0x390
[<c070eefb>] inet_release+0x3b/0x60
[<c06a8030>] sock_release+0x20/0x70
[<c06a8097>] sock_close+0x17/0x30
[<c04f687b>] __fput+0xfb/0x200
[<c04f699d>] fput+0x1d/0x30
[<c04f341c>] filp_close+0x4c/0x80
[<c04f34c7>] sys_close+0x77/0xc0
[<c0402d8c>] sysenter_do_call+0x12/0x32
}
... key at: [<c0f2d8d0>] af_family_slock_keys+0x10/0x140
... acquired at:
[<c0471fc2>] __lock_acquire+0x1172/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762c0d>] _raw_write_lock_bh+0x3d/0x50
[<c06ab3ad>] sk_common_release+0x2d/0xb0
[<cdfe2ff8>] sctp_close+0xe8/0x1f0 [sctp]
[<c070eefb>] inet_release+0x3b/0x60
[<c06a8030>] sock_release+0x20/0x70
[<c06a8097>] sock_close+0x17/0x30
[<c04f687b>] __fput+0xfb/0x200
[<c04f699d>] fput+0x1d/0x30
[<c04f341c>] filp_close+0x4c/0x80
[<c04f34c7>] sys_close+0x77/0xc0
[<c0402d8c>] sysenter_do_call+0x12/0x32
-> (clock-AF_INET){++.?..} ops: 0 {
HARDIRQ-ON-W at:
[<c04717ff>] __lock_acquire+0x9af/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762c0d>] _raw_write_lock_bh+0x3d/0x50
[<c06ed359>] tcp_close+0xf9/0x390
[<c070eefb>] inet_release+0x3b/0x60
[<c06a8030>] sock_release+0x20/0x70
[<c06a8097>] sock_close+0x17/0x30
[<c04f687b>] __fput+0xfb/0x200
[<c04f699d>] fput+0x1d/0x30
[<c04f341c>] filp_close+0x4c/0x80
[<c04f34c7>] sys_close+0x77/0xc0
[<c0402d8c>] sysenter_do_call+0x12/0x32
HARDIRQ-ON-R at:
[<c0470fd6>] __lock_acquire+0x186/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762fa8>] _raw_read_lock+0x38/0x50
[<c06aaaac>] sock_def_write_space+0x1c/0xb0
[<c06ab33a>] sock_wfree+0x4a/0x60
[<c06af175>] skb_release_head_state+0x45/0xc0
[<c06aeea0>] __kfree_skb+0x10/0x90
[<c06b57a9>] net_tx_action+0x59/0x140
[<c0449520>] __do_softirq+0xa0/0x1c0
IN-SOFTIRQ-R at:
[<c04717e3>] __lock_acquire+0x993/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762fa8>] _raw_read_lock+0x38/0x50
[<c06aaaac>] sock_def_write_space+0x1c/0xb0
[<c06ab33a>] sock_wfree+0x4a/0x60
[<c06af175>] skb_release_head_state+0x45/0xc0
[<c06aeea0>] __kfree_skb+0x10/0x90
[<c06b57a9>] net_tx_action+0x59/0x140
[<c0449520>] __do_softirq+0xa0/0x1c0
SOFTIRQ-ON-R at:
[<c0471824>] __lock_acquire+0x9d4/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762fa8>] _raw_read_lock+0x38/0x50
[<c06aab60>] sock_def_readable+0x20/0x80
[<cdfdebc4>] sctp_ulpq_tail_event+0x134/0x210 [sctp]
[<cdfd253e>] sctp_side_effects+0x8ee/0x10c0 [sctp]
[<cdfd2dc0>] sctp_do_sm+0xb0/0x1c0 [sctp]
[<cdfe6bc2>] sctp_primitive_ABORT+0x42/0x50 [sctp]
[<cdfe3892>] sctp_sendmsg+0x492/0xc00 [sctp]
[<c070df9e>] inet_sendmsg+0x2e/0x60
[<c06a7527>] sock_sendmsg+0xe7/0x110
[<c06a7723>] sys_sendmsg+0x113/0x230
[<c06a956b>] sys_socketcall+0xeb/0x2a0
[<c0402d8c>] sysenter_do_call+0x12/0x32
INITIAL USE at:
[<c04711bf>] __lock_acquire+0x36f/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762c0d>] _raw_write_lock_bh+0x3d/0x50
[<c06ed359>] tcp_close+0xf9/0x390
[<c070eefb>] inet_release+0x3b/0x60
[<c06a8030>] sock_release+0x20/0x70
[<c06a8097>] sock_close+0x17/0x30
[<c04f687b>] __fput+0xfb/0x200
[<c04f699d>] fput+0x1d/0x30
[<c04f341c>] filp_close+0x4c/0x80
[<c04f34c7>] sys_close+0x77/0xc0
[<c0402d8c>] sysenter_do_call+0x12/0x32
}
... key at: [<c0f2da10>] af_callback_keys+0x10/0x128
... acquired at:
[<c04735db>] check_usage_backwards+0x8b/0xd0
[<c046fd9a>] mark_lock+0x1ba/0x5c0
[<c0471824>] __lock_acquire+0x9d4/0x1890
[<c047275f>] lock_acquire+0x7f/0xf0
[<c0762fa8>] _raw_read_lock+0x38/0x50
[<c06aab60>] sock_def_readable+0x20/0x80
[<cdfdebc4>] sctp_ulpq_tail_event+0x134/0x210 [sctp]
[<cdfd253e>] sctp_side_effects+0x8ee/0x10c0 [sctp]
[<cdfd2dc0>] sctp_do_sm+0xb0/0x1c0 [sctp]
[<cdfe6bc2>] sctp_primitive_ABORT+0x42/0x50 [sctp]
[<cdfe3892>] sctp_sendmsg+0x492/0xc00 [sctp]
[<c070df9e>] inet_sendmsg+0x2e/0x60
[<c06a7527>] sock_sendmsg+0xe7/0x110
[<c06a7723>] sys_sendmsg+0x113/0x230
[<c06a956b>] sys_socketcall+0xeb/0x2a0
[<c0402d8c>] sysenter_do_call+0x12/0x32
stack backtrace:
Pid: 1517, comm: sctp_darn Not tainted 2.6.33-rc6 #129
Call Trace:
[<c075f994>] ? printk+0x1d/0x21
[<c047353e>] print_irq_inversion_bug.clone.0+0xfe/0x110
[<c04735db>] check_usage_backwards+0x8b/0xd0
[<c046fd9a>] mark_lock+0x1ba/0x5c0
[<c059e513>] ? string+0x33/0xe0
[<c0473550>] ? check_usage_backwards+0x0/0xd0
[<c0471824>] __lock_acquire+0x9d4/0x1890
[<c059e513>] ? string+0x33/0xe0
[<c0762a4b>] ? _raw_spin_lock_irqsave+0x1b/0x60
[<c07630cf>] ? _raw_spin_unlock_irqrestore+0x4f/0x60
[<c046f37b>] ? trace_hardirqs_off+0xb/0x10
[<c07630cf>] ? _raw_spin_unlock_irqrestore+0x4f/0x60
[<c044338f>] ? release_console_sem+0x1ef/0x240
[<c047275f>] lock_acquire+0x7f/0xf0
[<c06aab60>] ? sock_def_readable+0x20/0x80
[<c0762fa8>] _raw_read_lock+0x38/0x50
[<c06aab60>] ? sock_def_readable+0x20/0x80
[<c06aab60>] sock_def_readable+0x20/0x80
[<cdfdebc4>] sctp_ulpq_tail_event+0x134/0x210 [sctp]
[<cdfd253e>] sctp_side_effects+0x8ee/0x10c0 [sctp]
[<c0762a4b>] ? _raw_spin_lock_irqsave+0x1b/0x60
[<cdfd2dc0>] sctp_do_sm+0xb0/0x1c0 [sctp]
[<c044338f>] ? release_console_sem+0x1ef/0x240
[<cdfe6bc2>] sctp_primitive_ABORT+0x42/0x50 [sctp]
[<cdfe3892>] sctp_sendmsg+0x492/0xc00 [sctp]
[<c070df9e>] inet_sendmsg+0x2e/0x60
[<c06a7527>] sock_sendmsg+0xe7/0x110
[<c046f37b>] ? trace_hardirqs_off+0xb/0x10
[<c04d6810>] ? might_fault+0x50/0xa0
[<c04d6810>] ? might_fault+0x50/0xa0
[<c04d6856>] ? might_fault+0x96/0xa0
[<c04d6810>] ? might_fault+0x50/0xa0
[<c05a140d>] ? _copy_from_user+0x3d/0x130
[<c06a7723>] sys_sendmsg+0x113/0x230
[<c06aa6e7>] ? release_sock+0xd7/0xe0
[<c04704eb>] ? trace_hardirqs_on+0xb/0x10
[<c0449298>] ? local_bh_enable_ip+0x68/0xd0
[<cdfe4b0c>] ? sctp_getsockopt+0x9c/0x1010 [sctp]
[<c0472949>] ? lock_release_non_nested+0x59/0x2f0
[<c04704eb>] ? trace_hardirqs_on+0xb/0x10
[<c06093ce>] ? put_ldisc+0x3e/0xc0
[<c04d6810>] ? might_fault+0x50/0xa0
[<c04d6810>] ? might_fault+0x50/0xa0
[<c06a956b>] sys_socketcall+0xeb/0x2a0
[<c0402dbb>] ? sysenter_exit+0xf/0x16
[<c0402d8c>] sysenter_do_call+0x12/0x32
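
To make the report above easier to follow, here is a minimal userspace sketch in C of the kind of check it describes. It is not lockdep's implementation: struct lock_model, record_acquire() and inversion_possible() are hypothetical names, and only two usage bits are tracked. The outer lock stands in for slock-AF_INET (taken in softirq context, IN-SOFTIRQ-W) and the inner lock for clock-AF_INET, which sk_common_release() acquires under it and which sock_def_readable() then read-locks with softirqs enabled (SOFTIRQ-ON-R).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified per-lock usage state. */
struct lock_model {
	bool used_in_softirq;      /* acquired while running in softirq (BH) context */
	bool used_with_bh_enabled; /* acquired in process context with BH enabled */
};

/* Record one acquisition of the lock in the given context. */
static void record_acquire(struct lock_model *lock, bool in_softirq,
			   bool bh_enabled)
{
	assert(lock != NULL);
	if (in_softirq)
		lock->used_in_softirq = true;
	else if (bh_enabled)
		lock->used_with_bh_enabled = true;
}

/*
 * "inner" is assumed to be acquired while "outer" is held. If "outer" is
 * also taken from softirq context and "inner" is ever held with BH enabled,
 * a softirq can interrupt the holder of "inner" and spin on "outer" while
 * another CPU holds "outer" and waits for "inner".
 */
static bool inversion_possible(const struct lock_model *outer,
			       const struct lock_model *inner)
{
	assert(outer != NULL && inner != NULL);
	return outer->used_in_softirq && inner->used_with_bh_enabled;
}
```

In this model, recording the inner lock with bh_enabled set to false, which is what a read_lock_bh() caller corresponds to, leaves inversion_possible() false.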
* Re: [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
From: Vlad Yasevich @ 2010-02-03 17:14 UTC (permalink / raw)
To: linux-sctp
Wei Yongjun wrote:
> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
> not be used in this case. Therefore, we have to make a new function
> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>
Wouldn't the same inversion happen in TCP as well? TCP can call that
function in _bh and user contexts as well.
-vlad
> ============================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.33-rc6 #129
> ---------------------------------------------------------
> sctp_darn/1517 just changed the state of lock:
> (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
> but this lock took another, SOFTIRQ-unsafe lock in the past:
> (slock-AF_INET){+.-...}
>
> and interrupts could create inverse lock ordering between them.
>
> other info that might help us debug this:
> 1 lock held by sctp_darn/1517:
> #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
>
> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
> ---
> include/net/sctp/sctp.h | 1 +
> net/sctp/endpointola.c | 1 +
> net/sctp/socket.c | 10 ++++++++++
> 3 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 78740ec..fa6cde5 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -128,6 +128,7 @@ extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
> int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
> int sctp_inet_listen(struct socket *sock, int backlog);
> void sctp_write_space(struct sock *sk);
> +void sctp_data_ready(struct sock *sk, int len);
> unsigned int sctp_poll(struct file *file, struct socket *sock,
> poll_table *wait);
> void sctp_sock_rfree(struct sk_buff *skb);
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index 905fda5..7ec09ba 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -144,6 +144,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
> /* Use SCTP specific send buffer space queues. */
> ep->sndbuf_policy = sctp_sndbuf_policy;
>
> + sk->sk_data_ready = sctp_data_ready;
> sk->sk_write_space = sctp_write_space;
> sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 67fdac9..b437e2a 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -6185,6 +6185,16 @@ do_nonblock:
> goto out;
> }
>
> +void sctp_data_ready(struct sock *sk, int len)
> +{
> + read_lock_bh(&sk->sk_callback_lock);
> + if (sk_has_sleeper(sk))
> + wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
> + POLLRDNORM | POLLRDBAND);
> + sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
> + read_unlock_bh(&sk->sk_callback_lock);
> +}
> +
> /* If socket sndbuf has changed, wake up all per association waiters. */
> void sctp_write_space(struct sock *sk)
> {
* Re: [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
From: Wei Yongjun @ 2010-02-04 0:58 UTC (permalink / raw)
To: linux-sctp
Vlad Yasevich wrote:
> Wei Yongjun wrote:
>
>> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
>> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
>> not be used in this case. Therefore, we have to make a new function
>> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>>
>>
>
> Wouldn't the same inversion happen in TCP as well? TCP can call that
> function in _bh and user contexts as well.
>
Not sure, but TCP does not call that function in user context at all.
> -vlad
>
>
>> ============================
>> [ INFO: possible irq lock inversion dependency detected ]
>> 2.6.33-rc6 #129
>> ---------------------------------------------------------
>> sctp_darn/1517 just changed the state of lock:
>> (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>> (slock-AF_INET){+.-...}
>>
>> and interrupts could create inverse lock ordering between them.
>>
>> other info that might help us debug this:
>> 1 lock held by sctp_darn/1517:
>> #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
>>
>> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
>> ---
>> include/net/sctp/sctp.h | 1 +
>> net/sctp/endpointola.c | 1 +
>> net/sctp/socket.c | 10 ++++++++++
>> 3 files changed, 12 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
>> index 78740ec..fa6cde5 100644
>> --- a/include/net/sctp/sctp.h
>> +++ b/include/net/sctp/sctp.h
>> @@ -128,6 +128,7 @@ extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
>> int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
>> int sctp_inet_listen(struct socket *sock, int backlog);
>> void sctp_write_space(struct sock *sk);
>> +void sctp_data_ready(struct sock *sk, int len);
>> unsigned int sctp_poll(struct file *file, struct socket *sock,
>> poll_table *wait);
>> void sctp_sock_rfree(struct sk_buff *skb);
>> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
>> index 905fda5..7ec09ba 100644
>> --- a/net/sctp/endpointola.c
>> +++ b/net/sctp/endpointola.c
>> @@ -144,6 +144,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
>> /* Use SCTP specific send buffer space queues. */
>> ep->sndbuf_policy = sctp_sndbuf_policy;
>>
>> + sk->sk_data_ready = sctp_data_ready;
>> sk->sk_write_space = sctp_write_space;
>> sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
>>
>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>> index 67fdac9..b437e2a 100644
>> --- a/net/sctp/socket.c
>> +++ b/net/sctp/socket.c
>> @@ -6185,6 +6185,16 @@ do_nonblock:
>> goto out;
>> }
>>
>> +void sctp_data_ready(struct sock *sk, int len)
>> +{
>> + read_lock_bh(&sk->sk_callback_lock);
>> + if (sk_has_sleeper(sk))
>> + wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
>> + POLLRDNORM | POLLRDBAND);
>> + sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
>> + read_unlock_bh(&sk->sk_callback_lock);
>> +}
>> +
>> /* If socket sndbuf has changed, wake up all per association waiters. */
>> void sctp_write_space(struct sock *sk)
>> {
>>
>
>
>
* Re: [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
From: Vlad Yasevich @ 2010-02-04 15:58 UTC (permalink / raw)
To: linux-sctp
Wei Yongjun wrote:
>
> Vlad Yasevich wrote:
>> Wei Yongjun wrote:
>>
>>> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
>>> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
>>> not be used in this case. Therefore, we have to make a new function
>>> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>>>
>>>
>> Wouldn't the same inversion happen in TCP as well? TCP can call that
>> function in _bh and user contexts as well.
>>
>
> Not sure, but TCP does not call that function in user context at all.
It might, as part of release_sock(). That calls out to sk_backlog_rcv, which
calls tcp_v4_do_rcv() (the backlog handler). That handler will eventually
call tcp_data_queue(), which will call sk_data_ready(). I don't see any code
paths that disable BH there.
-vlad
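
The path described above can be sketched as a small userspace model in C. This is only an illustration of the control flow described in this message, not kernel code: struct sock_model, rx_model(), release_sock_model() and count_data_ready() are hypothetical names, and the backlog is reduced to a counter. It shows how a packet that arrives while the socket is owned by the user ends up invoking the data_ready callback later, from process context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context tag passed to the callback for bookkeeping only. */
enum ctx_model { CTX_SOFTIRQ, CTX_PROCESS };

struct sock_model {
	int backlog;          /* packets queued while the user owned the socket */
	int ready_in_softirq; /* data_ready calls made from the softirq path */
	int ready_in_process; /* data_ready calls made from the backlog path */
	void (*data_ready)(struct sock_model *sk, enum ctx_model ctx);
};

/* Stand-in for sk->sk_data_ready(): just counts calls per context. */
static void count_data_ready(struct sock_model *sk, enum ctx_model ctx)
{
	assert(sk != NULL);
	if (ctx == CTX_SOFTIRQ)
		sk->ready_in_softirq++;
	else
		sk->ready_in_process++;
}

/* Softirq receive path: deliver now, or defer if the user owns the socket. */
static void rx_model(struct sock_model *sk, bool owned_by_user)
{
	if (owned_by_user)
		sk->backlog++;
	else
		sk->data_ready(sk, CTX_SOFTIRQ);
}

/* Process-context release: drain the backlog, calling data_ready for each. */
static void release_sock_model(struct sock_model *sk)
{
	while (sk->backlog > 0) {
		sk->backlog--;
		sk->data_ready(sk, CTX_PROCESS);
	}
}
```

A caller would set data_ready to count_data_ready, feed packets through rx_model(), and then call release_sock_model(); any packet deferred to the backlog is counted in ready_in_process rather than ready_in_softirq.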
>
>> -vlad
>>
>>
>>> ============================
>>> [ INFO: possible irq lock inversion dependency detected ]
>>> 2.6.33-rc6 #129
>>> ---------------------------------------------------------
>>> sctp_darn/1517 just changed the state of lock:
>>> (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
>>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>>> (slock-AF_INET){+.-...}
>>>
>>> and interrupts could create inverse lock ordering between them.
>>>
>>> other info that might help us debug this:
>>> 1 lock held by sctp_darn/1517:
>>> #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
>>>
>>> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
>>> ---
>>> include/net/sctp/sctp.h | 1 +
>>> net/sctp/endpointola.c | 1 +
>>> net/sctp/socket.c | 10 ++++++++++
>>> 3 files changed, 12 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
>>> index 78740ec..fa6cde5 100644
>>> --- a/include/net/sctp/sctp.h
>>> +++ b/include/net/sctp/sctp.h
>>> @@ -128,6 +128,7 @@ extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
>>> int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
>>> int sctp_inet_listen(struct socket *sock, int backlog);
>>> void sctp_write_space(struct sock *sk);
>>> +void sctp_data_ready(struct sock *sk, int len);
>>> unsigned int sctp_poll(struct file *file, struct socket *sock,
>>> poll_table *wait);
>>> void sctp_sock_rfree(struct sk_buff *skb);
>>> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
>>> index 905fda5..7ec09ba 100644
>>> --- a/net/sctp/endpointola.c
>>> +++ b/net/sctp/endpointola.c
>>> @@ -144,6 +144,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
>>> /* Use SCTP specific send buffer space queues. */
>>> ep->sndbuf_policy = sctp_sndbuf_policy;
>>>
>>> + sk->sk_data_ready = sctp_data_ready;
>>> sk->sk_write_space = sctp_write_space;
>>> sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
>>>
>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>> index 67fdac9..b437e2a 100644
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -6185,6 +6185,16 @@ do_nonblock:
>>> goto out;
>>> }
>>>
>>> +void sctp_data_ready(struct sock *sk, int len)
>>> +{
>>> + read_lock_bh(&sk->sk_callback_lock);
>>> + if (sk_has_sleeper(sk))
>>> + wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
>>> + POLLRDNORM | POLLRDBAND);
>>> + sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
>>> + read_unlock_bh(&sk->sk_callback_lock);
>>> +}
>>> +
>>> /* If socket sndbuf has changed, wake up all per association waiters. */
>>> void sctp_write_space(struct sock *sk)
>>> {
>>>
>>
>>
* Re: [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
From: Vlad Yasevich @ 2010-02-05 19:19 UTC (permalink / raw)
To: linux-sctp
Wei Yongjun wrote:
>
> Vlad Yasevich wrote:
>> Wei Yongjun wrote:
>>
>>> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
>>> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
>>> not be used in this case. Therefore, we have to make a new function
>>> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>>>
>>>
>> Wouldn't the same inversion happen in TCP as well? TCP can call that
>> function in _bh and user contexts as well.
>>
>
> Not sure, but TCP does not call that function in user context at all.
>
Wei,

Can you still trigger this problem with this patch applied?
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 3a95fcb..dabdc50 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3717,9 +3717,9 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk)
 	sp->hmac = NULL;

 	SCTP_DBG_OBJCNT_INC(sock);
-	percpu_counter_inc(&sctp_sockets_allocated);

 	local_bh_disable();
+	percpu_counter_inc(&sctp_sockets_allocated);
 	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
 	local_bh_enable();
-vlad
>> -vlad
>>
>>
>>> ============================
>>> [ INFO: possible irq lock inversion dependency detected ]
>>> 2.6.33-rc6 #129
>>> ---------------------------------------------------------
>>> sctp_darn/1517 just changed the state of lock:
>>> (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
>>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>>> (slock-AF_INET){+.-...}
>>>
>>> and interrupts could create inverse lock ordering between them.
>>>
>>> other info that might help us debug this:
>>> 1 lock held by sctp_darn/1517:
>>> #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
>>>
>>> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
>>> ---
>>> include/net/sctp/sctp.h | 1 +
>>> net/sctp/endpointola.c | 1 +
>>> net/sctp/socket.c | 10 ++++++++++
>>> 3 files changed, 12 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
>>> index 78740ec..fa6cde5 100644
>>> --- a/include/net/sctp/sctp.h
>>> +++ b/include/net/sctp/sctp.h
>>> @@ -128,6 +128,7 @@ extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
>>> int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
>>> int sctp_inet_listen(struct socket *sock, int backlog);
>>> void sctp_write_space(struct sock *sk);
>>> +void sctp_data_ready(struct sock *sk, int len);
>>> unsigned int sctp_poll(struct file *file, struct socket *sock,
>>> poll_table *wait);
>>> void sctp_sock_rfree(struct sk_buff *skb);
>>> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
>>> index 905fda5..7ec09ba 100644
>>> --- a/net/sctp/endpointola.c
>>> +++ b/net/sctp/endpointola.c
>>> @@ -144,6 +144,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
>>> /* Use SCTP specific send buffer space queues. */
>>> ep->sndbuf_policy = sctp_sndbuf_policy;
>>>
>>> + sk->sk_data_ready = sctp_data_ready;
>>> sk->sk_write_space = sctp_write_space;
>>> sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
>>>
>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>> index 67fdac9..b437e2a 100644
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -6185,6 +6185,16 @@ do_nonblock:
>>> goto out;
>>> }
>>>
>>> +void sctp_data_ready(struct sock *sk, int len)
>>> +{
>>> + read_lock_bh(&sk->sk_callback_lock);
>>> + if (sk_has_sleeper(sk))
>>> + wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
>>> + POLLRDNORM | POLLRDBAND);
>>> + sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
>>> + read_unlock_bh(&sk->sk_callback_lock);
>>> +}
>>> +
>>> /* If socket sndbuf has changed, wake up all per association waiters. */
>>> void sctp_write_space(struct sock *sk)
>>> {
>>>
>>
>>
>
* Re: [PATCH] sctp: avoid irq lock inversion while call sk->sk_data_ready()
From: Wei Yongjun @ 2010-02-08 1:55 UTC (permalink / raw)
To: linux-sctp
Vlad Yasevich wrote:
> Wei Yongjun wrote:
>
>> Vlad Yasevich wrote:
>>
>>> Wei Yongjun wrote:
>>>
>>>
>>>> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
>>>> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
>>>> not be used in this case. Therefore, we have to make a new function
>>>> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>>>>
>>>>
>>>>
>>> Wouldn't the same inversion happen in TCP as well? TCP can call that
>>> function in _bh and user contexts as well.
>>>
>>>
>> Not sure, but TCP does not call that function in user context at all.
>>
>>
>
> Wei
>
> Can you trigger this problem with this patch applied?
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 3a95fcb..dabdc50 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -3717,9 +3717,9 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk)
> sp->hmac = NULL;
>
> SCTP_DBG_OBJCNT_INC(sock);
> - percpu_counter_inc(&sctp_sockets_allocated);
>
> local_bh_disable();
> + percpu_counter_inc(&sctp_sockets_allocated);
> sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
> local_bh_enable();
>
> -vlad
This patch did not change anything; the lockdep INFO is still reported.
It happens when I try to make a user abort through
sctp_primitive_ABORT.