All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
@ 2017-07-31 10:07 gfree.wind
  2017-08-01  4:59 ` David Miller
  2017-08-01 20:39 ` Cong Wang
  0 siblings, 2 replies; 15+ messages in thread
From: gfree.wind @ 2017-07-31 10:07 UTC (permalink / raw)
  To: xeb, davem, netdev; +Cc: Gao Feng

From: Gao Feng <gfree.wind@vip.163.com>

The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
trigger this bug when __sk_free is invoked in atomic context, because of
the call path pptp_sock_destruct->del_chan->synchronize_rcu.

Now move the synchronize_rcu to pptp_release from del_chan. This is the
only one case which would free the sock and need the synchronize_rcu.

The following is the panic I met with kernel 3.3.8, but this issue should
exist in current kernel too according to the codes.

BUG: scheduling while atomic
__schedule_bug+0x5e/0x64
__schedule+0x55/0x580
? ppp_unregister_channel+0x1cd5/0x1de0 [ppp_generic]
? dev_hard_start_xmit+0x423/0x530
? sch_direct_xmit+0x73/0x170
__cond_resched+0x16/0x30
_cond_resched+0x22/0x30
wait_for_common+0x18/0x110
? call_rcu_bh+0x10/0x10
wait_for_completion+0x12/0x20
wait_rcu_gp+0x34/0x40
? wait_rcu_gp+0x40/0x40
synchronize_sched+0x1e/0x20
0xf8417298
0xf8417484
? sock_queue_rcv_skb+0x109/0x130
__sk_free+0x16/0x110
? udp_queue_rcv_skb+0x1f2/0x290
sk_free+0x16/0x20
__udp4_lib_rcv+0x3b8/0x650

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
---
 drivers/net/ppp/pptp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index eac499c..6dde9a0 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -131,7 +131,6 @@ static void del_chan(struct pppox_sock *sock)
 	clear_bit(sock->proto.pptp.src_addr.call_id, callid_bitmap);
 	RCU_INIT_POINTER(callid_sock[sock->proto.pptp.src_addr.call_id], NULL);
 	spin_unlock(&chan_lock);
-	synchronize_rcu();
 }
 
 static int pptp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
@@ -520,6 +519,7 @@ static int pptp_release(struct socket *sock)
 
 	po = pppox_sk(sk);
 	del_chan(po);
+	synchronize_rcu();
 
 	pppox_unbind_sock(sk);
 	sk->sk_state = PPPOX_DEAD;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-07-31 10:07 [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan gfree.wind
@ 2017-08-01  4:59 ` David Miller
  2017-08-01 20:39 ` Cong Wang
  1 sibling, 0 replies; 15+ messages in thread
From: David Miller @ 2017-08-01  4:59 UTC (permalink / raw)
  To: gfree.wind; +Cc: xeb, netdev

From: gfree.wind@vip.163.com
Date: Mon, 31 Jul 2017 18:07:38 +0800

> From: Gao Feng <gfree.wind@vip.163.com>
> 
> The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
> trigger this bug when __sk_free is invoked in atomic context, because of
> the call path pptp_sock_destruct->del_chan->synchronize_rcu.
> 
> Now move the synchronize_rcu to pptp_release from del_chan. This is the
> only one case which would free the sock and need the synchronize_rcu.
> 
> The following is the panic I met with kernel 3.3.8, but this issue should
> exist in current kernel too according to the codes.
 ...
> Signed-off-by: Gao Feng <gfree.wind@vip.163.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-07-31 10:07 [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan gfree.wind
  2017-08-01  4:59 ` David Miller
@ 2017-08-01 20:39 ` Cong Wang
  2017-08-02 17:13   ` Cong Wang
  1 sibling, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-01 20:39 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

On Mon, Jul 31, 2017 at 3:07 AM,  <gfree.wind@vip.163.com> wrote:
> From: Gao Feng <gfree.wind@vip.163.com>
>
> The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
> trigger this bug when __sk_free is invoked in atomic context, because of
> the call path pptp_sock_destruct->del_chan->synchronize_rcu.
>
> Now move the synchronize_rcu to pptp_release from del_chan. This is the
> only one case which would free the sock and need the synchronize_rcu.

I don't understand the last part.

>From my understanding, this RCU is supposed to protect the pppox_sock
pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
pppox_sock is freed when the last refcnt is gone, that is, when sock
dctor is called. pptp_release() is ONLY called when the fd in user-space
is gone, not necessarily the last refcnt.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-01 20:39 ` Cong Wang
@ 2017-08-02 17:13   ` Cong Wang
  2017-08-07  1:32     ` Gao Feng
  0 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-02 17:13 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

Hi, Gao

On Tue, Aug 1, 2017 at 1:39 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> From my understanding, this RCU is supposed to protect the pppox_sock
> pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
> pppox_sock is freed when the last refcnt is gone, that is, when sock
> dctor is called. pptp_release() is ONLY called when the fd in user-space
> is gone, not necessarily the last refcnt.

Your commit is probably not the right fix. Can you try the following fix?

diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index 6dde9a0cfe76..e75bb95c107f 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -519,7 +519,6 @@ static int pptp_release(struct socket *sock)

        po = pppox_sk(sk);
        del_chan(po);
-       synchronize_rcu();

        pppox_unbind_sock(sk);
        sk->sk_state = PPPOX_DEAD;
@@ -564,6 +563,7 @@ static int pptp_create(struct net *net, struct
socket *sock, int kern)
        sk->sk_family      = PF_PPPOX;
        sk->sk_protocol    = PX_PROTO_PPTP;
        sk->sk_destruct    = pptp_sock_destruct;
+       sock_set_flag(sk, SOCK_RCU_FREE);

        po = pppox_sk(sk);
        opt = &po->proto.pptp;

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re:Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-02 17:13   ` Cong Wang
@ 2017-08-07  1:32     ` Gao Feng
  2017-08-07 17:17       ` Cong Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Gao Feng @ 2017-08-07  1:32 UTC (permalink / raw)
  To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers

At 2017-08-03 01:13:36, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>Hi, Gao
>
>On Tue, Aug 1, 2017 at 1:39 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> From my understanding, this RCU is supposed to protect the pppox_sock
>> pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
>> pppox_sock is freed when the last refcnt is gone, that is, when sock
>> dctor is called. pptp_release() is ONLY called when the fd in user-space
>> is gone, not necessarily the last refcnt.

Hi Cong,

I am sorry to reply you so late, because I took a short trip recently, and didn't check my emails.

I think the RCU should be supposed to avoid the race between del_chan and lookup_chan.
The synchronize_rcu could make sure if there was one which calls lookup_chan in this period, it would be finished and the sock refcnt is increased if necessary.

So I think it is ok to invoke sock_put directly without SOCK_RCU_FREE, because the lookup_chan caller has already hold the sock refcnt, 

Best Regards
Feng

>
>Your commit is probably not the right fix. Can you try the following fix?
>
>diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
>index 6dde9a0cfe76..e75bb95c107f 100644
>--- a/drivers/net/ppp/pptp.c
>+++ b/drivers/net/ppp/pptp.c
>@@ -519,7 +519,6 @@ static int pptp_release(struct socket *sock)
>
>        po = pppox_sk(sk);
>        del_chan(po);
>-       synchronize_rcu();
>
>        pppox_unbind_sock(sk);
>        sk->sk_state = PPPOX_DEAD;
>@@ -564,6 +563,7 @@ static int pptp_create(struct net *net, struct
>socket *sock, int kern)
>        sk->sk_family      = PF_PPPOX;
>        sk->sk_protocol    = PX_PROTO_PPTP;
>        sk->sk_destruct    = pptp_sock_destruct;
>+       sock_set_flag(sk, SOCK_RCU_FREE);
>
>        po = pppox_sk(sk);
>        opt = &po->proto.pptp;


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-07  1:32     ` Gao Feng
@ 2017-08-07 17:17       ` Cong Wang
  2017-08-07 17:34         ` Cong Wang
       [not found]         ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
  0 siblings, 2 replies; 15+ messages in thread
From: Cong Wang @ 2017-08-07 17:17 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

On Sun, Aug 6, 2017 at 6:32 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
> I think the RCU should be supposed to avoid the race between del_chan and lookup_chan.

More precisely, it is callid_sock which is protected by RCU.

Unless I miss any other code path, pptp_exit_module() is
problematic too, I don't think it can just vfree() the whole thing.


> The synchronize_rcu could make sure if there was one which calls lookup_chan in this period, it would be finished and the sock refcnt is increased if necessary.
>
> So I think it is ok to invoke sock_put directly without SOCK_RCU_FREE, because the lookup_chan caller has already hold the sock refcnt,
>

If you mean the sock_hold() inside lookup_chan(), no,
it doesn't help because we already dereference the sock
before it.

Also, lookup_chan_dst() does not have a refcnt, I don't
find any code preventing it deref'ing other sock in callid_sock
than the calling one.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-07 17:17       ` Cong Wang
@ 2017-08-07 17:34         ` Cong Wang
       [not found]         ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
  1 sibling, 0 replies; 15+ messages in thread
From: Cong Wang @ 2017-08-07 17:34 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

On Mon, Aug 7, 2017 at 10:17 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> Unless I miss any other code path, pptp_exit_module() is
> problematic too, I don't think it can just vfree() the whole thing.
>

This path should be fine, because:

1. sock holds a refcnt to module via proto->owner
2. gre_del_protocol() already waits for readers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re:Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
       [not found]         ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
@ 2017-08-08  1:10           ` Gao Feng
  2017-08-08 19:45             ` Cong Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Gao Feng @ 2017-08-08  1:10 UTC (permalink / raw)
  To: Gao Feng; +Cc: Cong Wang, xeb, David Miller, Linux Kernel Network Developers

At 2017-08-08 01:17:02, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Sun, Aug 6, 2017 at 6:32 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> I think the RCU should be supposed to avoid the race between del_chan and lookup_chan.
>
>More precisely, it is callid_sock which is protected by RCU.
>
>Unless I miss any other code path, pptp_exit_module() is
>problematic too, I don't think it can just vfree() the whole thing.
>
>
>> The synchronize_rcu could make sure if there was one which calls lookup_chan in this period, it would be finished and the sock refcnt is increased if necessary.
>>
>> So I think it is ok to invoke sock_put directly without SOCK_RCU_FREE, because the lookup_chan caller has already hold the sock refcnt,
>>

>


Hi Cong,
I just thought about this issue last night, then I get your this email this morning.

>If you mean the sock_hold() inside lookup_chan(), no,
>it doesn't help because we already dereference the sock
>before it.

>


Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
and the lookup is over.
There is no one could get the sock after synchronize_rcu in pptp_release.


But I think about another problem.
It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
Because when the sock refcnt is 0, the pptp_release must have be invoked already.


There are two cases totally.
1. when pptp_release invokes sock_put, the refcnt is 0. The del_chan and pppox_unbind_sock are invoked.
2. when pptp_release invokes sock_put, the refcnt is not 0. It means someone holds the sock during the period pptp_release invokes del_chan.
Then someone invokes sock_put and the sock refcnt reach 0, it would invoke sk_free and invokes pptp_sock_destruct.
So it is unnecessary to invoke del_chan and pppox_unbind_sock again.
And it would bring a race issue even if the pptp_sock_destruct invoked del_chan.


If so, I would send another patch for it.


>Also, lookup_chan_dst() does not have a refcnt, I don't
>find any code preventing it deref'ing other sock in callid_sock

>than the calling one.


Sorry, the last email is html format, not text.
So I send it with text format again.

Best Regards
Feng






^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-08  1:10           ` Gao Feng
@ 2017-08-08 19:45             ` Cong Wang
  2017-08-09  5:13               ` Gao Feng
  0 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-08 19:45 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

On Mon, Aug 7, 2017 at 6:10 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>
> Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?

I already told you, the dereference happends before sock_hold().

        sock = rcu_dereference(callid_sock[call_id]);
        if (sock) {
                opt = &sock->proto.pptp;
                if (opt->dst_addr.sin_addr.s_addr != s_addr) <=== HERE
                        sock = NULL;
                else
                        sock_hold(sk_pppox(sock));
        }

If we don't wait for readers properly, sock could be freed at the
same time when deference it.

> The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
> and the lookup is over.
> There is no one could get the sock after synchronize_rcu in pptp_release.


If this were true, then this code in pptp_sock_destruct()
would be unneeded:

        if (!(sk->sk_state & PPPOX_DEAD)) {
                del_chan(pppox_sk(sk));
                pppox_unbind_sock(sk);
        }


>
>
> But I think about another problem.
> It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
> Because when the sock refcnt is 0, the pptp_release must have be invoked already.
>


I don't know. Looks like sock_orphan() is only called
in pptp_release(), but I am not sure if there is a case
we call sock destructor before release.

Also note, this socket is very special, it doesn't support
poll(), sendmsg() or recvmsg()..

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-08 19:45             ` Cong Wang
@ 2017-08-09  5:13               ` Gao Feng
  2017-08-09  7:17                 ` Gao Feng
  2017-08-09 18:18                 ` Cong Wang
  0 siblings, 2 replies; 15+ messages in thread
From: Gao Feng @ 2017-08-09  5:13 UTC (permalink / raw)
  To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers

At 2017-08-09 03:45:53, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Mon, Aug 7, 2017 at 6:10 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>>
>> Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
>
>I already told you, the dereference happends before sock_hold().
>
>        sock = rcu_dereference(callid_sock[call_id]);
>        if (sock) {
>                opt = &sock->proto.pptp;
>                if (opt->dst_addr.sin_addr.s_addr != s_addr) <=== HERE
>                        sock = NULL;
>                else
>                        sock_hold(sk_pppox(sock));
>        }
>
>If we don't wait for readers properly, sock could be freed at the
>same time when deference it.

Maybe I didn't show my explanation clearly.
I think it won't happen as I mentioned in the last email.
Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD. 

In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false. 

As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary. 
And it even brings confusing.

Best Regards
Feng

>
>> The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
>> and the lookup is over.
>> There is no one could get the sock after synchronize_rcu in pptp_release.
>
>
>If this were true, then this code in pptp_sock_destruct()
>would be unneeded:
>
>        if (!(sk->sk_state & PPPOX_DEAD)) {
>                del_chan(pppox_sk(sk));
>                pppox_unbind_sock(sk);
>        }
>
>
>>
>>
>> But I think about another problem.
>> It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
>> Because when the sock refcnt is 0, the pptp_release must have be invoked already.
>>
>
>
>I don't know. Looks like sock_orphan() is only called
>in pptp_release(), but I am not sure if there is a case
>we call sock destructor before release.
>
>Also note, this socket is very special, it doesn't support
>poll(), sendmsg() or recvmsg()..




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re:Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-09  5:13               ` Gao Feng
@ 2017-08-09  7:17                 ` Gao Feng
  2017-08-09 21:00                   ` Cong Wang
  2017-08-09 18:18                 ` Cong Wang
  1 sibling, 1 reply; 15+ messages in thread
From: Gao Feng @ 2017-08-09  7:17 UTC (permalink / raw)
  To: Gao Feng; +Cc: Cong Wang, xeb, David Miller, Linux Kernel Network Developers

At 2017-08-09 13:13:53, "Gao Feng" <gfree.wind@vip.163.com> wrote:
>At 2017-08-09 03:45:53, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>>On Mon, Aug 7, 2017 at 6:10 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>>>
>>> Sorry, I don't get you clearly. Why the sock_hold() isn't helpful?
>>
>>I already told you, the dereference happends before sock_hold().
>>
>>        sock = rcu_dereference(callid_sock[call_id]);
>>        if (sock) {
>>                opt = &sock->proto.pptp;
>>                if (opt->dst_addr.sin_addr.s_addr != s_addr) <=== HERE
>>                        sock = NULL;
>>                else
>>                        sock_hold(sk_pppox(sock));
>>        }
>>
>>If we don't wait for readers properly, sock could be freed at the
>>same time when deference it.
>
>Maybe I didn't show my explanation clearly.
>I think it won't happen as I mentioned in the last email.
>Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
>It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD. 
>
>In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
>Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false. 
>
>As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary. 
>And it even brings confusing.
>
>Best Regards
>Feng
>
>>
>>> The pptp_release invokes synchronize_rcu after del_chan, it could make sure the others has increased the sock refcnt if necessary
>>> and the lookup is over.
>>> There is no one could get the sock after synchronize_rcu in pptp_release.
>>
>>
>>If this were true, then this code in pptp_sock_destruct()
>>would be unneeded:
>>
>>        if (!(sk->sk_state & PPPOX_DEAD)) {
>>                del_chan(pppox_sk(sk));
>>                pppox_unbind_sock(sk);
>>        }
>>
>>
>>>
>>>
>>> But I think about another problem.
>>> It seems the pptp_sock_destruct should not invoke del_chan and pppox_unbind_sock.
>>> Because when the sock refcnt is 0, the pptp_release must have be invoked already.
>>>
>>
>>
>>I don't know. Looks like sock_orphan() is only called
>>in pptp_release(), but I am not sure if there is a case
>>we call sock destructor before release.
>>
>>Also note, this socket is very special, it doesn't support
>>poll(), sendmsg() or recvmsg()..
>
>
>

Hi Cong,

Actually I have one question about the SOCK_RCU_FREE.
I don't think it could resolve the issue you raised even though it exists really.

I checked the SOCK_RCU_FREE, it just defer the __sk_destruct after one rcu period.
But when it performs, someone still could find this sock by callid during the del_chan period and it may still deference the sock which may freed soon.

The right flow should be following.
del_chan()
wait a rcu period
sock_put() ------------ It is safe that someone gets the sock because it already hold sock refcnt.

When using SOCK_RCU_FREE, its flow would be following.
wait a rcu period
del_chan()
free the sock directly -------- no sock refcnt check again.
Because the del_chan happens after rcu wait, not before, so it isn't helpful with SOCK_RCU_FREE.

I don't know if I misunderstand the SOCK_RCU_FREE usage.

But it is a good news that the del_chan is only invoked in pptp_release actually and it would wait a rcu period before sock_put.

Best Regards
Feng



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-09  5:13               ` Gao Feng
  2017-08-09  7:17                 ` Gao Feng
@ 2017-08-09 18:18                 ` Cong Wang
  2017-08-10  1:25                   ` Gao Feng
  1 sibling, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-09 18:18 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

On Tue, Aug 8, 2017 at 10:13 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
> Maybe I didn't show my explanation clearly.
> I think it won't happen as I mentioned in the last email.
> Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
> It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD.
>
> In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
> Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false.

Only if sock->sk is always non-NULL for pptp_release(), which
is what I am not sure. If you look at other ->release(), similar checks
are there too, so not just for pptp.

>
> As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary.
> And it even brings confusing.

Sorry, I can't draw any conclusion for this.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-09  7:17                 ` Gao Feng
@ 2017-08-09 21:00                   ` Cong Wang
  2017-08-10  2:41                     ` Gao Feng
  0 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2017-08-09 21:00 UTC (permalink / raw)
  To: Gao Feng; +Cc: xeb, David Miller, Linux Kernel Network Developers

On Wed, Aug 9, 2017 at 12:17 AM, Gao Feng <gfree.wind@vip.163.com> wrote:
> Hi Cong,
>
> Actually I have one question about the SOCK_RCU_FREE.
> I don't think it could resolve the issue you raised even though it exists really.
>
> I checked the SOCK_RCU_FREE, it just defer the __sk_destruct after one rcu period.
> But when it performs, someone still could find this sock by callid during the del_chan period and it may still deference the sock which may freed soon.
>
> The right flow should be following.
> del_chan()
> wait a rcu period
> sock_put() ------------ It is safe that someone gets the sock because it already hold sock refcnt.
>
> When using SOCK_RCU_FREE, its flow would be following.
> wait a rcu period
> del_chan()
> free the sock directly -------- no sock refcnt check again.
> Because the del_chan happens after rcu wait, not before, so it isn't helpful with SOCK_RCU_FREE.


Yes, good point! With SOCK_RCU_FREE the sock_hold() should
not be needed. For RCU, unpublish should indeed happen before
grace period.


>
> I don't know if I misunderstand the SOCK_RCU_FREE usage.
>
> But it is a good news that the del_chan is only invoked in pptp_release actually and it would wait a rcu period before sock_put.
>

Looking at the code again, the reader lookup_chan() is actually
invoked in BH context, but neither add_chan() nor del_chan()
actually disables BH...

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re:Re: Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-09 18:18                 ` Cong Wang
@ 2017-08-10  1:25                   ` Gao Feng
  0 siblings, 0 replies; 15+ messages in thread
From: Gao Feng @ 2017-08-10  1:25 UTC (permalink / raw)
  To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers

At 2017-08-10 02:18:44, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Tue, Aug 8, 2017 at 10:13 PM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> Maybe I didn't show my explanation clearly.
>> I think it won't happen as I mentioned in the last email.
>> Because the pptp_release invokes the synchronize_rcu to make sure it, and actually there is no one which would invoke del_chan except pptp_release.
>> It is guaranteed by that the pptp_release doesn't put the sock refcnt until complete all cleanup include marking sk_state as PPPOX_DEAD.
>>
>> In other words, even though the pptp_release is not the last user of this sock, the other one wouldn't invoke del_chan in pptp_sock_destruct.
>> Because the condition "!(sk->sk_state & PPPOX_DEAD)" must be false.
>
>Only if sock->sk is always non-NULL for pptp_release(), which
>is what I am not sure. If you look at other ->release(), similar checks
>are there too, so not just for pptp.

Yes. It seems only if the release() is invoked twice, the sock->sk would be NULL.
But I don't find there is any case which could cause it.

>
>>
>> As summary, the del_chan and pppox_unbind_sock in pptp_sock_destruct are unnecessary.
>> And it even brings confusing.
>
>Sorry, I can't draw any conclusion for this.

Thank you all the same, and I have learn a lot from you :)
Wish someone which is familiar with these codes could give more details and explanations.

Best Regards
Feng 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re:Re: Re:Re: Re:Re: Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan
  2017-08-09 21:00                   ` Cong Wang
@ 2017-08-10  2:41                     ` Gao Feng
  0 siblings, 0 replies; 15+ messages in thread
From: Gao Feng @ 2017-08-10  2:41 UTC (permalink / raw)
  To: Cong Wang; +Cc: xeb, David Miller, Linux Kernel Network Developers


At 2017-08-10 05:00:19, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>On Wed, Aug 9, 2017 at 12:17 AM, Gao Feng <gfree.wind@vip.163.com> wrote:
>> Hi Cong,
>>
>> Actually I have one question about the SOCK_RCU_FREE.
>> I don't think it could resolve the issue you raised even though it exists really.
>>
>> I checked the SOCK_RCU_FREE, it just defer the __sk_destruct after one rcu period.
>> But when it performs, someone still could find this sock by callid during the del_chan period and it may still deference the sock which may freed soon.
>>
>> The right flow should be following.
>> del_chan()
>> wait a rcu period
>> sock_put() ------------ It is safe that someone gets the sock because it already hold sock refcnt.
>>
>> When using SOCK_RCU_FREE, its flow would be following.
>> wait a rcu period
>> del_chan()
>> free the sock directly -------- no sock refcnt check again.
>> Because the del_chan happens after rcu wait, not before, so it isn't helpful with SOCK_RCU_FREE.
>
>
>Yes, good point! With SOCK_RCU_FREE the sock_hold() should
>not be needed. For RCU, unpublish should indeed happen before
>grace period.

Sorry, I couldn't understand why sock_hold() isn't necessary with SOCK_RCU_FREE.
When lookup_chan finds the sock, it would return and reference it later.
If no refcnt, how to protect the sock ?

Best Regards
Feng

>
>
>>
>> I don't know if I misunderstand the SOCK_RCU_FREE usage.
>>
>> But it is a good news that the del_chan is only invoked in pptp_release actually and it would wait a rcu period before sock_put.
>>
>
>Looking at the code again, the reader lookup_chan() is actually
>invoked in BH context, but neither add_chan() nor del_chan()
>actually disables BH...



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-08-10  2:42 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-31 10:07 [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan gfree.wind
2017-08-01  4:59 ` David Miller
2017-08-01 20:39 ` Cong Wang
2017-08-02 17:13   ` Cong Wang
2017-08-07  1:32     ` Gao Feng
2017-08-07 17:17       ` Cong Wang
2017-08-07 17:34         ` Cong Wang
     [not found]         ` <697dbbd.7911.15dbf5ca3a6.Coremail.gfree.wind@vip.163.com>
2017-08-08  1:10           ` Gao Feng
2017-08-08 19:45             ` Cong Wang
2017-08-09  5:13               ` Gao Feng
2017-08-09  7:17                 ` Gao Feng
2017-08-09 21:00                   ` Cong Wang
2017-08-10  2:41                     ` Gao Feng
2017-08-09 18:18                 ` Cong Wang
2017-08-10  1:25                   ` Gao Feng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.