From mboxrd@z Thu Jan 1 00:00:00 1970 From: Feng Gao Date: Thu, 18 Aug 2016 00:58:31 +0000 Subject: Re: [PATCH v2 1/1] ppp: Fix one deadlock issue of PPP when send frame Message-Id: List-Id: References: <1471347218-24472-1-git-send-email-fgao@ikuai8.com> <20160817174219.y4opmhj4u77m3ufq@alphalink.fr> In-Reply-To: <20160817174219.y4opmhj4u77m3ufq@alphalink.fr> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Guillaume Nault Cc: Gao Feng , paulus@samba.org, linux-ppp@vger.kernel.org, Linux Kernel Network Developers inline answer. On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault wrote: > On Tue, Aug 16, 2016 at 07:33:38PM +0800, fgao@ikuai8.com wrote: >> From: Gao Feng >> >> PPP channel holds one spinlock before send frame. But the skb may >> select the same PPP channel with wrong route policy. As a result, >> the skb reaches the same channel path. It tries to get the same >> spinlock which is held before. Bang, the deadlock comes out. >> > Unless I misunderstood the problem you're trying to solve, this patch > doesn't really help: deadlock still occurs if the same IP is used for > L2TP and PPP's peer address. > The deadlock happens because the same cpu try to hold the spinlock which is already held before by itself. Now the PPP_CHANNEL_LOCK_BH sets the lock owner after hold lock, then when the same cpu invokes PPP_CHANNEL_LOCK_BH again. The cl.owner equals current cpu id, so it only increases the lock_cnt without trying to hold the lock again. So it avoids the deadlock. >> Now add one lock owner to avoid it like xmit_lock_owner of >> netdev_queue. Check the lock owner before try to get the spinlock. >> If the current cpu is already the owner, needn't lock again. When >> PPP channel holds the spinlock at the first time, it sets owner >> with current CPU ID. >> > I think you should forbid lock recursion entirely, and drop the packet > if the owner tries to re-acquire the channel lock. Otherwise you just > move the deadlock down the stack (l2tp_xmit_skb() can't be called > recursively). The reason that fix it in ppp is that there are other layer on the ppp module. We resolve it in ppp module, it could avoid all similar potential issues. > >> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c >> index 70cfa06..6909ab1 100644 >> --- a/drivers/net/ppp/ppp_generic.c >> +++ b/drivers/net/ppp/ppp_generic.c >> @@ -162,6 +162,37 @@ struct ppp { >> |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \ >> |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP) >> >> +struct chennel_lock { > ^ > I guess you meant 'channel_lock'. Yes. It is a typo. Thanks. > >> + spinlock_t lock; >> + u32 owner; > This field's default value is -1. So why not declaring it as 'int'? OK. I will fix it. Best Regards Feng