* [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
@ 2017-07-31 10:07 gfree.wind
2017-08-01 4:59 ` David Miller
2017-08-01 20:39 ` Cong Wang
0 siblings, 2 replies; 15+ messages in thread
From: gfree.wind @ 2017-07-31 10:07 UTC (permalink / raw)
To: xeb, davem, netdev; +Cc: Gao Feng
From: Gao Feng <gfree.wind@vip.163.com>
The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
trigger this bug when __sk_free is invoked in atomic context, because of
the call path pptp_sock_destruct->del_chan->synchronize_rcu.
Now move the synchronize_rcu to pptp_release from del_chan. This is the
only one case which would free the sock and need the synchronize_rcu.
The following is the panic I met with kernel 3.3.8, but this issue should
exist in current kernel too according to the codes.
BUG: scheduling while atomic
__schedule_bug+0x5e/0x64
__schedule+0x55/0x580
? ppp_unregister_channel+0x1cd5/0x1de0 [ppp_generic]
? dev_hard_start_xmit+0x423/0x530
? sch_direct_xmit+0x73/0x170
__cond_resched+0x16/0x30
_cond_resched+0x22/0x30
wait_for_common+0x18/0x110
? call_rcu_bh+0x10/0x10
wait_for_completion+0x12/0x20
wait_rcu_gp+0x34/0x40
? wait_rcu_gp+0x40/0x40
synchronize_sched+0x1e/0x20
0xf8417298
0xf8417484
? sock_queue_rcv_skb+0x109/0x130
__sk_free+0x16/0x110
? udp_queue_rcv_skb+0x1f2/0x290
sk_free+0x16/0x20
__udp4_lib_rcv+0x3b8/0x650
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
---
drivers/net/ppp/pptp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index eac499c..6dde9a0 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -131,7 +131,6 @@ static void del_chan(struct pppox_sock *sock)
clear_bit(sock->proto.pptp.src_addr.call_id, callid_bitmap);
RCU_INIT_POINTER(callid_sock[sock->proto.pptp.src_addr.call_id], NULL);
spin_unlock(&chan_lock);
- synchronize_rcu();
}
static int pptp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
@@ -520,6 +519,7 @@ static int pptp_release(struct socket *sock)
po = pppox_sk(sk);
del_chan(po);
+ synchronize_rcu();
pppox_unbind_sock(sk);
sk->sk_state = PPPOX_DEAD;
--
1.9.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-07-31 10:07 [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan gfree.wind
@ 2017-08-01 4:59 ` David Miller
2017-08-01 20:39 ` Cong Wang
1 sibling, 0 replies; 15+ messages in thread
From: David Miller @ 2017-08-01 4:59 UTC (permalink / raw)
To: gfree.wind; +Cc: xeb, netdev
From: gfree.wind@vip.163.com
Date: Mon, 31 Jul 2017 18:07:38 +0800
> From: Gao Feng <gfree.wind@vip.163.com>
>
> The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
> trigger this bug when __sk_free is invoked in atomic context, because of
> the call path pptp_sock_destruct->del_chan->synchronize_rcu.
>
> Now move the synchronize_rcu to pptp_release from del_chan. This is the
> only one case which would free the sock and need the synchronize_rcu.
>
> The following is the panic I met with kernel 3.3.8, but this issue should
> exist in current kernel too according to the codes.
...
> Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Applied, thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-07-31 10:07 [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan gfree.wind
2017-08-01 4:59 ` David Miller
@ 2017-08-01 20:39 ` Cong Wang
2017-08-02 17:13 ` Cong Wang
1 sibling, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-01 20:39 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
On Mon, Jul 31, 2017 at 3:07 AM, <gfree.wind@vip.163.com> wrote:
> From: Gao Feng <gfree.wind@vip.163.com>
>
> The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
> trigger this bug when __sk_free is invoked in atomic context, because of
> the call path pptp_sock_destruct->del_chan->synchronize_rcu.
>
> Now move the synchronize_rcu to pptp_release from del_chan. This is the
> only one case which would free the sock and need the synchronize_rcu.
I don't understand the last part.
>From my understanding, this RCU is supposed to protect the pppox_sock
pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
pppox_sock is freed when the last refcnt is gone, that is, when sock
dctor is called. pptp_release() is ONLY called when the fd in user-space
is gone, not necessarily the last refcnt.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-01 20:39 ` Cong Wang
@ 2017-08-02 17:13 ` Cong Wang
2017-08-07 1:32 ` Gao Feng
0 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-02 17:13 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
Hi, Gao
On Tue, Aug 1, 2017 at 1:39 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> From my understanding, this RCU is supposed to protect the pppox_sock
> pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
> pppox_sock is freed when the last refcnt is gone, that is, when sock
> dctor is called. pptp_release() is ONLY called when the fd in user-space
> is gone, not necessarily the last refcnt.
Your commit is probably not the right fix. Can you try the following fix?
diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index 6dde9a0cfe76..e75bb95c107f 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -519,7 +519,6 @@ static int pptp_release(struct socket *sock)
po = pppox_sk(sk);
del_chan(po);
- synchronize_rcu();
pppox_unbind_sock(sk);
sk->sk_state = PPPOX_DEAD;
@@ -564,6 +563,7 @@ static int pptp_create(struct net *net, struct
socket *sock, int kern)
sk->sk_family = PF_PPPOX;
sk->sk_protocol = PX_PROTO_PPTP;
sk->sk_destruct = pptp_sock_destruct;
+ sock_set_flag(sk, SOCK_RCU_FREE);
po = pppox_sk(sk);
opt = &po->proto.pptp;
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re:Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-02 17:13 ` Cong Wang
@ 2017-08-07 1:32 ` Gao Feng
2017-08-07 17:17 ` Cong Wang
0 siblings, 1 reply; 15+ messages in thread
From: Gao Feng @ 2017-08-07 1:32 UTC (permalink / raw)
To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers
At 2017-08-03 01:13:36, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>Hi, Gao
>
>On Tue, Aug 1, 2017 at 1:39 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> From my understanding, this RCU is supposed to protect the pppox_sock
>> pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
>> pppox_sock is freed when the last refcnt is gone, that is, when sock
>> dctor is called. pptp_release() is ONLY called when the fd in user-space
>> is gone, not necessarily the last refcnt.
Hi Cong,
I am sorry to reply you so late, because I took a short trip recently, and didn't check my emails.
I think the RCU should be supposed to avoid the race between del_chan and lookup_chan.
The synchronize_rcu could make sure if there was one which calls lookup_chan in this period, it would be finished and the sock refcnt is increased if necessary.
So I think it is ok to invoke sock_put directly without SOCK_RCU_FREE, because the lookup_chan caller has already hold the sock refcnt,
Best Regards
Feng
>
>Your commit is probably not the right fix. Can you try the following fix?
>
>diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>index 6dde9a0cfe76..e75bb95c107f 100644
>--- a/drivers/net/ppp/pptp.c
>+++ b/drivers/net/ppp/pptp.c
>@@ -519,7 +519,6 @@ static int pptp_release(struct socket *sock)
>
> po = pppox_sk(sk);
> del_chan(po);
>- synchronize_rcu();
>
> pppox_unbind_sock(sk);
> sk->sk_state = PPPOX_DEAD;
>@@ -564,6 +563,7 @@ static int pptp_create(struct net *net, struct
>socket *sock, int kern)
> sk->sk_family = PF_PPPOX;
> sk->sk_protocol = PX_PROTO_PPTP;
> sk->sk_destruct = pptp_sock_destruct;
>+ sock_set_flag(sk, SOCK_RCU_FREE);
>
> po = pppox_sk(sk);
> opt = &po->proto.pptp;
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-07 1:32 ` Gao Feng
@ 2017-08-07 17:17 ` Cong Wang
2017-08-07 17:34 ` Cong Wang
[not found] ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
0 siblings, 2 replies; 15+ messages in thread
From: Cong Wang @ 2017-08-07 17:17 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
On Sun, Aug 6, 2017 at 6:32 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
> I think the RCU should be supposed to avoid the race between del_chan and lookup_chan.
More precisely, it is callid_sock which is protected by RCU.
Unless I miss any other code path, pptp_exit_module() is
problematic too, I don't think it can just vfree() the whole thing.
> The synchronize_rcu could make sure if there was one which calls lookup_chan in this period, it would be finished and the sock refcnt is increased if necessary.
>
> So I think it is ok to invoke sock_put directly without SOCK_RCU_FREE, because the lookup_chan caller has already hold the sock refcnt,
>
If you mean the sock_hold() inside lookup_chan(), no,
it doesn't help because we already dereference the sock
before it.
Also, lookup_chan_dst() does not have a refcnt, I don't
find any code preventing it deref'ing other sock in callid_sock
than the calling one.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-07 17:17 ` Cong Wang
@ 2017-08-07 17:34 ` Cong Wang
[not found] ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
1 sibling, 0 replies; 15+ messages in thread
From: Cong Wang @ 2017-08-07 17:34 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
On Mon, Aug 7, 2017 at 10:17 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> Unless I miss any other code path, pptp_exit_module() is
> problematic too, I don't think it can just vfree() the whole thing.
>
This path should be fine, because:
1. sock holds a refcnt to module via proto->owner
2. gre_del_protocol() already waits for readers
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re:Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
[not found] ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
@ 2017-08-08 1:10 ` Gao Feng
2017-08-08 19:45 ` Cong Wang
0 siblings, 1 reply; 15+ messages in thread
From: Gao Feng @ 2017-08-08 1:10 UTC (permalink / raw)
To: Gao Feng; +Cc: Cong Wang, xeb, David Miller, Linux Kernel Network Developers
At 2017-08-08 01:17:02, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Sun, Aug 6, 2017 at 6:32 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> I think the RCU should be supposed to avoid the race between del_chan and lookup_chan.
>
>More precisely, it is callid_sock which is protected by RCU.
>
>Unless I miss any other code path, pptp_exit_module() is
>problematic too, I don't think it can just vfree() the whole thing.
>
>
>> The synchronize_rcu could make sure if there was one which calls lookup_chan in this period, it would be finished and the sock refcnt is increased if necessary.
>>
>> So I think it is ok to invoke sock_put directly without SOCK_RCU_FREE, because the lookup_chan caller has already hold the sock refcnt,
>>
>
Hi Cong,
I just thought about this issue last night, then I get your this email this morning.
>If you mean the sock_hold() inside lookup_chan(), no,
>it doesn't help because we already dereference the sock
>before it.
>
Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
and the lookup is over.
There is no one could get the sock after synchronize_rcu in pptp_release.
But I think about another problem.
It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
Because when the sock refcnt is 0, the pptp_release must have be invoked already.
There are two cases totally.
1. when pptp_release invokes sock_put, the refcnt is 0. The del_chan and pppox_unbind_sock are invoked.
2. when pptp_release invokes sock_put, the refcnt is not 0. It means someone holds the sock during the period pptp_release invokes del_chan.
Then someone invokes sock_put and the sock refcnt reach 0, it would invoke sk_free and invokes pptp_sock_destruct.
So it is unnecessary to invoke del_chan and pppox_unbind_sock again.
And it would bring a race issue even if the pptp_sock_destruct invoked del_chan.
If so, I would send another patch for it.
>Also, lookup_chan_dst() does not have a refcnt, I don't
>find any code preventing it deref'ing other sock in callid_sock
>than the calling one.
Sorry, the last email is html format, not text.
So I send it with text format again.
Best Regards
Feng
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-08 1:10 ` Gao Feng
@ 2017-08-08 19:45 ` Cong Wang
2017-08-09 5:13 ` Gao Feng
0 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-08 19:45 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
On Mon, Aug 7, 2017 at 6:10 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>
> Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
I already told you, the dereference happends before sock_hold().
sock = rcu_dereference(callid_sock[call_id]);
if (sock) {
opt = &sock->proto.pptp;
if (opt->dst_addr.sin_addr.s_addr != s_addr) <=== HERE
sock = NULL;
else
sock_hold(sk_pppox(sock));
}
If we don't wait for readers properly, sock could be freed at the
same time when deference it.
> The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
> and the lookup is over.
> There is no one could get the sock after synchronize_rcu in pptp_release.
If this were true, then this code in pptp_sock_destruct()
would be unneeded:
if (!(sk->sk_state & PPPOX_DEAD)) {
del_chan(pppox_sk(sk));
pppox_unbind_sock(sk);
}
>
>
> But I think about another problem.
> It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
> Because when the sock refcnt is 0, the pptp_release must have be invoked already.
>
I don't know. Looks like sock_orphan() is only called
in pptp_release(), but I am not sure if there is a case
we call sock destructor before release.
Also note, this socket is very special, it doesn't support
poll(), sendmsg() or recvmsg()..
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-08 19:45 ` Cong Wang
@ 2017-08-09 5:13 ` Gao Feng
2017-08-09 7:17 ` Gao Feng
2017-08-09 18:18 ` Cong Wang
0 siblings, 2 replies; 15+ messages in thread
From: Gao Feng @ 2017-08-09 5:13 UTC (permalink / raw)
To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers
At 2017-08-09 03:45:53, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Mon, Aug 7, 2017 at 6:10 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>>
>> Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
>
>I already told you, the dereference happends before sock_hold().
>
> sock = rcu_dereference(callid_sock[call_id]);
> if (sock) {
> opt = &sock->proto.pptp;
> if (opt->dst_addr.sin_addr.s_addr != s_addr) <=== HERE
> sock = NULL;
> else
> sock_hold(sk_pppox(sock));
> }
>
>If we don't wait for readers properly, sock could be freed at the
>same time when deference it.
Maybe I didn't show my explanation clearly.
I think it won't happen as I mentioned in the last email.
Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD.
In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false.
As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary.
And it even brings confusing.
Best Regards
Feng
>
>> The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
>> and the lookup is over.
>> There is no one could get the sock after synchronize_rcu in pptp_release.
>
>
>If this were true, then this code in pptp_sock_destruct()
>would be unneeded:
>
> if (!(sk->sk_state & PPPOX_DEAD)) {
> del_chan(pppox_sk(sk));
> pppox_unbind_sock(sk);
> }
>
>
>>
>>
>> But I think about another problem.
>> It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
>> Because when the sock refcnt is 0, the pptp_release must have be invoked already.
>>
>
>
>I don't know. Looks like sock_orphan() is only called
>in pptp_release(), but I am not sure if there is a case
>we call sock destructor before release.
>
>Also note, this socket is very special, it doesn't support
>poll(), sendmsg() or recvmsg()..
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re:Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-09 5:13 ` Gao Feng
@ 2017-08-09 7:17 ` Gao Feng
2017-08-09 21:00 ` Cong Wang
2017-08-09 18:18 ` Cong Wang
1 sibling, 1 reply; 15+ messages in thread
From: Gao Feng @ 2017-08-09 7:17 UTC (permalink / raw)
To: Gao Feng; +Cc: Cong Wang, xeb, David Miller, Linux Kernel Network Developers
At 2017-08-09 13:13:53, "Gao Feng" <gfree.wind@vip.163.com> wrote:
>At 2017-08-09 03:45:53, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>>On Mon, Aug 7, 2017 at 6:10 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>>>
>>> Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
>>
>>I already told you, the dereference happends before sock_hold().
>>
>> sock = rcu_dereference(callid_sock[call_id]);
>> if (sock) {
>> opt = &sock->proto.pptp;
>> if (opt->dst_addr.sin_addr.s_addr != s_addr) <=== HERE
>> sock = NULL;
>> else
>> sock_hold(sk_pppox(sock));
>> }
>>
>>If we don't wait for readers properly, sock could be freed at the
>>same time when deference it.
>
>Maybe I didn't show my explanation clearly.
>I think it won't happen as I mentioned in the last email.
>Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
>It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD.
>
>In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
>Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false.
>
>As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary.
>And it even brings confusing.
>
>Best Regards
>Feng
>
>>
>>> The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
>>> and the lookup is over.
>>> There is no one could get the sock after synchronize_rcu in pptp_release.
>>
>>
>>If this were true, then this code in pptp_sock_destruct()
>>would be unneeded:
>>
>> if (!(sk->sk_state & PPPOX_DEAD)) {
>> del_chan(pppox_sk(sk));
>> pppox_unbind_sock(sk);
>> }
>>
>>
>>>
>>>
>>> But I think about another problem.
>>> It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
>>> Because when the sock refcnt is 0, the pptp_release must have be invoked already.
>>>
>>
>>
>>I don't know. Looks like sock_orphan() is only called
>>in pptp_release(), but I am not sure if there is a case
>>we call sock destructor before release.
>>
>>Also note, this socket is very special, it doesn't support
>>poll(), sendmsg() or recvmsg()..
>
>
>
Hi Cong,
Actually I have one question about the SOCK_RCU_FREE.
I don't think it could resolve the issue you raised even though it exists really.
I checked the SOCK_RCU_FREE, it just defer the __sk_destruct after one rcu period.
But when it performs, someone still could find this sock by callid during the del_chan period and it may still deference the sock which may freed soon.
The right flow should be following.
del_chan()
wait a rcu period
sock_put() ------------ It is safe that someone gets the sock because it already hold sock refcnt.
When using SOCK_RCU_FREE, its flow would be following.
wait a rcu period
del_chan()
free the sock directly -------- no sock refcnt check again.
Because the del_chan happens after rcu wait, not before, so it isn't helpful with SOCK_RCU_FREE.
I don't know if I misunderstand the SOCK_RCU_FREE usage.
But it is a good news that the del_chan is only invoked in pptp_release actually and it would wait a rcu period before sock_put.
Best Regards
Feng
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-09 5:13 ` Gao Feng
2017-08-09 7:17 ` Gao Feng
@ 2017-08-09 18:18 ` Cong Wang
2017-08-10 1:25 ` Gao Feng
1 sibling, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-09 18:18 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
On Tue, Aug 8, 2017 at 10:13 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
> Maybe I didn't show my explanation clearly.
> I think it won't happen as I mentioned in the last email.
> Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
> It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD.
>
> In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
> Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false.
Only if sock->sk is always non-NULL for pptp_release(), which
is what I am not sure. If you look at other ->release(), similar checks
are there too, so not just for pptp.
>
> As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary.
> And it even brings confusing.
Sorry, I can't draw any conclusion for this.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-09 7:17 ` Gao Feng
@ 2017-08-09 21:00 ` Cong Wang
2017-08-10 2:41 ` Gao Feng
0 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-09 21:00 UTC (permalink / raw)
To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers
On Wed, Aug 9, 2017 at 12:17 AM, Gao Feng <gfree.wind@vip.163.com> wrote:
> Hi Cong,
>
> Actually I have one question about the SOCK_RCU_FREE.
> I don't think it could resolve the issue you raised even though it exists really.
>
> I checked the SOCK_RCU_FREE, it just defer the __sk_destruct after one rcu period.
> But when it performs, someone still could find this sock by callid during the del_chan period and it may still deference the sock which may freed soon.
>
> The right flow should be following.
> del_chan()
> wait a rcu period
> sock_put() ------------ It is safe that someone gets the sock because it already hold sock refcnt.
>
> When using SOCK_RCU_FREE, its flow would be following.
> wait a rcu period
> del_chan()
> free the sock directly -------- no sock refcnt check again.
> Because the del_chan happens after rcu wait, not before, so it isn't helpful with SOCK_RCU_FREE.
Yes, good point! With SOCK_RCU_FREE the sock_hold() should
not be needed. For RCU, unpublish should indeed happen before
grace period.
>
> I don't know if I misunderstand the SOCK_RCU_FREE usage.
>
> But it is a good news that the del_chan is only invoked in pptp_release actually and it would wait a rcu period before sock_put.
>
Looking at the code again, the reader lookup_chan() is actually
invoked in BH context, but neither add_chan() nor del_chan()
actually disables BH...
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re:Re: Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-09 18:18 ` Cong Wang
@ 2017-08-10 1:25 ` Gao Feng
0 siblings, 0 replies; 15+ messages in thread
From: Gao Feng @ 2017-08-10 1:25 UTC (permalink / raw)
To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers
At 2017-08-10 02:18:44, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Tue, Aug 8, 2017 at 10:13 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> Maybe I didn't show my explanation clearly.
>> I think it won't happen as I mentioned in the last email.
>> Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
>> It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD.
>>
>> In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
>> Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false.
>
>Only if sock->sk is always non-NULL for pptp_release(), which
>is what I am not sure. If you look at other ->release(), similar checks
>are there too, so not just for pptp.
Yes. It seems only if the release() is invoked twice, the sock->sk would be NULL.
But I don't find there is any case which could cause it.
>
>>
>> As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary.
>> And it even brings confusing.
>
>Sorry, I can't draw any conclusion for this.
Thank you all the same, and I have learn a lot from you :)
Wish someone which is familiar with these codes could give more details and explanations.
Best Regards
Feng
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re:Re: Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
2017-08-09 21:00 ` Cong Wang
@ 2017-08-10 2:41 ` Gao Feng
0 siblings, 0 replies; 15+ messages in thread
From: Gao Feng @ 2017-08-10 2:41 UTC (permalink / raw)
To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers
At 2017-08-10 05:00:19, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Wed, Aug 9, 2017 at 12:17 AM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> Hi Cong,
>>
>> Actually I have one question about the SOCK_RCU_FREE.
>> I don't think it could resolve the issue you raised even though it exists really.
>>
>> I checked the SOCK_RCU_FREE, it just defer the __sk_destruct after one rcu period.
>> But when it performs, someone still could find this sock by callid during the del_chan period and it may still deference the sock which may freed soon.
>>
>> The right flow should be following.
>> del_chan()
>> wait a rcu period
>> sock_put() ------------ It is safe that someone gets the sock because it already hold sock refcnt.
>>
>> When using SOCK_RCU_FREE, its flow would be following.
>> wait a rcu period
>> del_chan()
>> free the sock directly -------- no sock refcnt check again.
>> Because the del_chan happens after rcu wait, not before, so it isn't helpful with SOCK_RCU_FREE.
>
>
>Yes, good point! With SOCK_RCU_FREE the sock_hold() should
>not be needed. For RCU, unpublish should indeed happen before
>grace period.
Sorry, I couldn't understand why sock_hold() isn't necessary with SOCK_RCU_FREE.
When lookup_chan finds the sock, it would return and reference it later.
If no refcnt, how to protect the sock ?
Best Regards
Feng
>
>
>>
>> I don't know if I misunderstand the SOCK_RCU_FREE usage.
>>
>> But it is a good news that the del_chan is only invoked in pptp_release actually and it would wait a rcu period before sock_put.
>>
>
>Looking at the code again, the reader lookup_chan() is actually
>invoked in BH context, but neither add_chan() nor del_chan()
>actually disables BH...
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2017-08-10 2:42 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-31 10:07 [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan gfree.wind
2017-08-01 4:59 ` David Miller
2017-08-01 20:39 ` Cong Wang
2017-08-02 17:13 ` Cong Wang
2017-08-07 1:32 ` Gao Feng
2017-08-07 17:17 ` Cong Wang
2017-08-07 17:34 ` Cong Wang
[not found] ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
2017-08-08 1:10 ` Gao Feng
2017-08-08 19:45 ` Cong Wang
2017-08-09 5:13 ` Gao Feng
2017-08-09 7:17 ` Gao Feng
2017-08-09 21:00 ` Cong Wang
2017-08-10 2:41 ` Gao Feng
2017-08-09 18:18 ` Cong Wang
2017-08-10 1:25 ` Gao Feng
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.