* [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
@ 2013-05-08 7:41 Cong Wang
2013-05-08 9:47 ` dingtianhong
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Cong Wang @ 2013-05-08 7:41 UTC (permalink / raw)
To: netdev; +Cc: dingtianhong, Hideaki YOSHIFUJI, David S. Miller, Cong Wang
dingtianhong reported the following deadlock detected by lockdep:
======================================================
[ INFO: possible circular locking dependency detected ]
3.4.24.05-0.1-default #1 Not tainted
-------------------------------------------------------
ksoftirqd/0/3 is trying to acquire lock:
(&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
but task is already holding lock:
(&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mc->mca_lock){+.+...}:
[<ffffffff810a8027>] validate_chain+0x637/0x730
[<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
[<ffffffff810a8734>] lock_acquire+0x114/0x150
[<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
[<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
[<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
[<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
[<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
[<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
[<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
[<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
[<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
[<ffffffff813d9360>] dev_change_flags+0x40/0x70
[<ffffffff813ea627>] do_setlink+0x237/0x8a0
[<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
[<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
[<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
[<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
[<ffffffff81403e20>] netlink_unicast+0x140/0x180
[<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
[<ffffffff813c4252>] sock_sendmsg+0x112/0x130
[<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
[<ffffffff813c5544>] sys_sendmsg+0x44/0x70
[<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
-> #0 (&ndev->lock){+.+...}:
[<ffffffff810a798e>] check_prev_add+0x3de/0x440
[<ffffffff810a8027>] validate_chain+0x637/0x730
[<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
[<ffffffff810a8734>] lock_acquire+0x114/0x150
[<ffffffff814f6c82>] rt_read_lock+0x42/0x60
[<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
[<ffffffff8149b036>] mld_newpack+0xb6/0x160
[<ffffffff8149b18b>] add_grhead+0xab/0xc0
[<ffffffff8149d03b>] add_grec+0x3ab/0x460
[<ffffffff8149d14a>] mld_send_report+0x5a/0x150
[<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
[<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
[<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
[<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
[<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
[<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
[<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
[<ffffffff8106c7de>] kthread+0xae/0xc0
[<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
actually we can just hold idev->lock before taking pmc->mca_lock,
and avoid taking idev->lock again when iterating idev->addr_list.
Reported-by: dingtianhong <dingtianhong@huawei.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 84a6440..dbc6db7 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -86,6 +86,9 @@ extern int ipv6_dev_get_saddr(struct net *net,
const struct in6_addr *daddr,
unsigned int srcprefs,
struct in6_addr *saddr);
+extern int __ipv6_get_lladdr(struct inet6_dev *idev,
+ struct in6_addr *addr,
+ unsigned char banned_flags);
extern int ipv6_get_lladdr(struct net_device *dev,
struct in6_addr *addr,
unsigned char banned_flags);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d1ab6ab..a937092 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1448,6 +1448,23 @@ try_nextdev:
}
EXPORT_SYMBOL(ipv6_dev_get_saddr);
+int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr,
+ unsigned char banned_flags)
+{
+ int err = -EADDRNOTAVAIL;
+ struct inet6_ifaddr *ifp;
+
+ list_for_each_entry(ifp, &idev->addr_list, if_list) {
+ if (ifp->scope == IFA_LINK &&
+ !(ifp->flags & banned_flags)) {
+ *addr = ifp->addr;
+ err = 0;
+ break;
+ }
+ }
+ return err;
+}
+
int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
unsigned char banned_flags)
{
@@ -1457,17 +1474,8 @@ int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
rcu_read_lock();
idev = __in6_dev_get(dev);
if (idev) {
- struct inet6_ifaddr *ifp;
-
read_lock_bh(&idev->lock);
- list_for_each_entry(ifp, &idev->addr_list, if_list) {
- if (ifp->scope == IFA_LINK &&
- !(ifp->flags & banned_flags)) {
- *addr = ifp->addr;
- err = 0;
- break;
- }
- }
+ err = __ipv6_get_lladdr(idev, addr, banned_flags);
read_unlock_bh(&idev->lock);
}
rcu_read_unlock();
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index bfa6cc3..c3998c2 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1343,8 +1343,9 @@ static void ip6_mc_hdr(struct sock *sk, struct sk_buff *skb,
hdr->daddr = *daddr;
}
-static struct sk_buff *mld_newpack(struct net_device *dev, int size)
+static struct sk_buff *mld_newpack(struct inet6_dev *idev, int size)
{
+ struct net_device *dev = idev->dev;
struct net *net = dev_net(dev);
struct sock *sk = net->ipv6.igmp_sk;
struct sk_buff *skb;
@@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
skb_reserve(skb, hlen);
- if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
+ if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
/* <draft-ietf-magma-mld-source-05.txt>:
* use unspecified address as the source address
* when a valid link-local address is not available.
@@ -1465,7 +1466,7 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
struct mld2_grec *pgr;
if (!skb)
- skb = mld_newpack(dev, dev->mtu);
+ skb = mld_newpack(pmc->idev, dev->mtu);
if (!skb)
return NULL;
pgr = (struct mld2_grec *)skb_put(skb, sizeof(struct mld2_grec));
@@ -1485,7 +1486,8 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
int type, int gdeleted, int sdeleted)
{
- struct net_device *dev = pmc->idev->dev;
+ struct inet6_dev *idev = pmc->idev;
+ struct net_device *dev = idev->dev;
struct mld2_report *pmr;
struct mld2_grec *pgr = NULL;
struct ip6_sf_list *psf, *psf_next, *psf_prev, **psf_list;
@@ -1514,7 +1516,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
AVAILABLE(skb) < grec_size(pmc, type, gdeleted, sdeleted)) {
if (skb)
mld_sendpack(skb);
- skb = mld_newpack(dev, dev->mtu);
+ skb = mld_newpack(idev, dev->mtu);
}
}
first = 1;
@@ -1541,7 +1543,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
pgr->grec_nsrcs = htons(scount);
if (skb)
mld_sendpack(skb);
- skb = mld_newpack(dev, dev->mtu);
+ skb = mld_newpack(idev, dev->mtu);
first = 1;
scount = 0;
}
@@ -1596,8 +1598,8 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
struct sk_buff *skb = NULL;
int type;
+ read_lock_bh(&idev->lock);
if (!pmc) {
- read_lock_bh(&idev->lock);
for (pmc=idev->mc_list; pmc; pmc=pmc->next) {
if (pmc->mca_flags & MAF_NOREPORT)
continue;
@@ -1609,7 +1611,6 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
skb = add_grec(skb, pmc, type, 0, 0);
spin_unlock_bh(&pmc->mca_lock);
}
- read_unlock_bh(&idev->lock);
} else {
spin_lock_bh(&pmc->mca_lock);
if (pmc->mca_sfcount[MCAST_EXCLUDE])
@@ -1619,6 +1620,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
skb = add_grec(skb, pmc, type, 0, 0);
spin_unlock_bh(&pmc->mca_lock);
}
+ read_unlock_bh(&idev->lock);
if (skb)
mld_sendpack(skb);
}
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
@ 2013-05-08 9:47 ` dingtianhong
2013-05-08 20:16 ` David Miller
` (4 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: dingtianhong @ 2013-05-08 9:47 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, Hideaki YOSHIFUJI, David S. Miller
I think move ahead of the idev->lock is a fit way to solve this, so will test it, thanks.
On 2013/5/8 15:41, Cong Wang wrote:
> dingtianhong reported the following deadlock detected by lockdep:
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.4.24.05-0.1-default #1 Not tainted
> -------------------------------------------------------
> ksoftirqd/0/3 is trying to acquire lock:
> (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
>
> but task is already holding lock:
> (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&mc->mca_lock){+.+...}:
> [<ffffffff810a8027>] validate_chain+0x637/0x730
> [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
> [<ffffffff810a8734>] lock_acquire+0x114/0x150
> [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
> [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
> [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
> [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
> [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
> [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
> [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
> [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
> [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
> [<ffffffff813d9360>] dev_change_flags+0x40/0x70
> [<ffffffff813ea627>] do_setlink+0x237/0x8a0
> [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
> [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
> [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
> [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
> [<ffffffff81403e20>] netlink_unicast+0x140/0x180
> [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
> [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
> [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
> [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
> [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
>
> -> #0 (&ndev->lock){+.+...}:
> [<ffffffff810a798e>] check_prev_add+0x3de/0x440
> [<ffffffff810a8027>] validate_chain+0x637/0x730
> [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
> [<ffffffff810a8734>] lock_acquire+0x114/0x150
> [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
> [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
> [<ffffffff8149b036>] mld_newpack+0xb6/0x160
> [<ffffffff8149b18b>] add_grhead+0xab/0xc0
> [<ffffffff8149d03b>] add_grec+0x3ab/0x460
> [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
> [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
> [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
> [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
> [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
> [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
> [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
> [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
> [<ffffffff8106c7de>] kthread+0xae/0xc0
> [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
>
> actually we can just hold idev->lock before taking pmc->mca_lock,
> and avoid taking idev->lock again when iterating idev->addr_list.
>
> Reported-by: dingtianhong <dingtianhong@huawei.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
>
> ---
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index 84a6440..dbc6db7 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -86,6 +86,9 @@ extern int ipv6_dev_get_saddr(struct net *net,
> const struct in6_addr *daddr,
> unsigned int srcprefs,
> struct in6_addr *saddr);
> +extern int __ipv6_get_lladdr(struct inet6_dev *idev,
> + struct in6_addr *addr,
> + unsigned char banned_flags);
> extern int ipv6_get_lladdr(struct net_device *dev,
> struct in6_addr *addr,
> unsigned char banned_flags);
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index d1ab6ab..a937092 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -1448,6 +1448,23 @@ try_nextdev:
> }
> EXPORT_SYMBOL(ipv6_dev_get_saddr);
>
> +int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr,
> + unsigned char banned_flags)
> +{
> + int err = -EADDRNOTAVAIL;
> + struct inet6_ifaddr *ifp;
> +
> + list_for_each_entry(ifp, &idev->addr_list, if_list) {
> + if (ifp->scope == IFA_LINK &&
> + !(ifp->flags & banned_flags)) {
> + *addr = ifp->addr;
> + err = 0;
> + break;
> + }
> + }
> + return err;
> +}
> +
> int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
> unsigned char banned_flags)
> {
> @@ -1457,17 +1474,8 @@ int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
> rcu_read_lock();
> idev = __in6_dev_get(dev);
> if (idev) {
> - struct inet6_ifaddr *ifp;
> -
> read_lock_bh(&idev->lock);
> - list_for_each_entry(ifp, &idev->addr_list, if_list) {
> - if (ifp->scope == IFA_LINK &&
> - !(ifp->flags & banned_flags)) {
> - *addr = ifp->addr;
> - err = 0;
> - break;
> - }
> - }
> + err = __ipv6_get_lladdr(idev, addr, banned_flags);
> read_unlock_bh(&idev->lock);
> }
> rcu_read_unlock();
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index bfa6cc3..c3998c2 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1343,8 +1343,9 @@ static void ip6_mc_hdr(struct sock *sk, struct sk_buff *skb,
> hdr->daddr = *daddr;
> }
>
> -static struct sk_buff *mld_newpack(struct net_device *dev, int size)
> +static struct sk_buff *mld_newpack(struct inet6_dev *idev, int size)
> {
> + struct net_device *dev = idev->dev;
> struct net *net = dev_net(dev);
> struct sock *sk = net->ipv6.igmp_sk;
> struct sk_buff *skb;
> @@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
>
> skb_reserve(skb, hlen);
>
> - if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
> + if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
> /* <draft-ietf-magma-mld-source-05.txt>:
> * use unspecified address as the source address
> * when a valid link-local address is not available.
> @@ -1465,7 +1466,7 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> struct mld2_grec *pgr;
>
> if (!skb)
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(pmc->idev, dev->mtu);
> if (!skb)
> return NULL;
> pgr = (struct mld2_grec *)skb_put(skb, sizeof(struct mld2_grec));
> @@ -1485,7 +1486,8 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> int type, int gdeleted, int sdeleted)
> {
> - struct net_device *dev = pmc->idev->dev;
> + struct inet6_dev *idev = pmc->idev;
> + struct net_device *dev = idev->dev;
> struct mld2_report *pmr;
> struct mld2_grec *pgr = NULL;
> struct ip6_sf_list *psf, *psf_next, *psf_prev, **psf_list;
> @@ -1514,7 +1516,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> AVAILABLE(skb) < grec_size(pmc, type, gdeleted, sdeleted)) {
> if (skb)
> mld_sendpack(skb);
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(idev, dev->mtu);
> }
> }
> first = 1;
> @@ -1541,7 +1543,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> pgr->grec_nsrcs = htons(scount);
> if (skb)
> mld_sendpack(skb);
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(idev, dev->mtu);
> first = 1;
> scount = 0;
> }
> @@ -1596,8 +1598,8 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> struct sk_buff *skb = NULL;
> int type;
>
> + read_lock_bh(&idev->lock);
> if (!pmc) {
> - read_lock_bh(&idev->lock);
> for (pmc=idev->mc_list; pmc; pmc=pmc->next) {
> if (pmc->mca_flags & MAF_NOREPORT)
> continue;
> @@ -1609,7 +1611,6 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> - read_unlock_bh(&idev->lock);
> } else {
> spin_lock_bh(&pmc->mca_lock);
> if (pmc->mca_sfcount[MCAST_EXCLUDE])
> @@ -1619,6 +1620,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> + read_unlock_bh(&idev->lock);
> if (skb)
> mld_sendpack(skb);
> }
>
> .
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
2013-05-08 9:47 ` dingtianhong
@ 2013-05-08 20:16 ` David Miller
2013-05-11 23:11 ` David Miller
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2013-05-08 20:16 UTC (permalink / raw)
To: amwang; +Cc: netdev, dingtianhong, yoshfuji
From: Cong Wang <amwang@redhat.com>
Date: Wed, 8 May 2013 15:41:54 +0800
> dingtianhong reported the following deadlock detected by lockdep:
...
> Reported-by: dingtianhong <dingtianhong@huawei.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
I'm waiting for testing validation of this patch.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
2013-05-08 9:47 ` dingtianhong
2013-05-08 20:16 ` David Miller
@ 2013-05-11 23:11 ` David Miller
2013-05-27 6:52 ` dingtianhong
2013-06-06 2:35 ` Cong Wang
2013-05-21 7:01 ` dingtianhong
` (2 subsequent siblings)
5 siblings, 2 replies; 13+ messages in thread
From: David Miller @ 2013-05-11 23:11 UTC (permalink / raw)
To: amwang; +Cc: netdev, dingtianhong, yoshfuji
From: Cong Wang <amwang@redhat.com>
Date: Wed, 8 May 2013 15:41:54 +0800
> @@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
>
> skb_reserve(skb, hlen);
>
> - if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
> + if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
> /* <draft-ietf-magma-mld-source-05.txt>:
> * use unspecified address as the source address
> * when a valid link-local address is not available.
You aren't necessarily going to be holding idev->lock, therefore you can't
just do a lockless traversal of idev->addr_list here.
Yes, you can elide the rcu_read_lock() because you have a known reference
to 'idev' in these paths, but you can't get rid of the address list locking
altogether.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
` (2 preceding siblings ...)
2013-05-11 23:11 ` David Miller
@ 2013-05-21 7:01 ` dingtianhong
2013-05-21 10:10 ` Hannes Frederic Sowa
2013-06-26 22:58 ` Hannes Frederic Sowa
5 siblings, 0 replies; 13+ messages in thread
From: dingtianhong @ 2013-05-21 7:01 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, Hideaki YOSHIFUJI, David S. Miller
On 2013/5/8 15:41, Cong Wang wrote:
> dingtianhong reported the following deadlock detected by lockdep:
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.4.24.05-0.1-default #1 Not tainted
> -------------------------------------------------------
> ksoftirqd/0/3 is trying to acquire lock:
> (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
>
> but task is already holding lock:
> (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&mc->mca_lock){+.+...}:
> [<ffffffff810a8027>] validate_chain+0x637/0x730
> [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
> [<ffffffff810a8734>] lock_acquire+0x114/0x150
> [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
> [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
> [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
> [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
> [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
> [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
> [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
> [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
> [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
> [<ffffffff813d9360>] dev_change_flags+0x40/0x70
> [<ffffffff813ea627>] do_setlink+0x237/0x8a0
> [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
> [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
> [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
> [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
> [<ffffffff81403e20>] netlink_unicast+0x140/0x180
> [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
> [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
> [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
> [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
> [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
>
> -> #0 (&ndev->lock){+.+...}:
> [<ffffffff810a798e>] check_prev_add+0x3de/0x440
> [<ffffffff810a8027>] validate_chain+0x637/0x730
> [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
> [<ffffffff810a8734>] lock_acquire+0x114/0x150
> [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
> [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
> [<ffffffff8149b036>] mld_newpack+0xb6/0x160
> [<ffffffff8149b18b>] add_grhead+0xab/0xc0
> [<ffffffff8149d03b>] add_grec+0x3ab/0x460
> [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
> [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
> [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
> [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
> [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
> [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
> [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
> [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
> [<ffffffff8106c7de>] kthread+0xae/0xc0
> [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
>
> actually we can just hold idev->lock before taking pmc->mca_lock,
> and avoid taking idev->lock again when iterating idev->addr_list.
>
> Reported-by: dingtianhong <dingtianhong@huawei.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
>
> ---
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index 84a6440..dbc6db7 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -86,6 +86,9 @@ extern int ipv6_dev_get_saddr(struct net *net,
> const struct in6_addr *daddr,
> unsigned int srcprefs,
> struct in6_addr *saddr);
> +extern int __ipv6_get_lladdr(struct inet6_dev *idev,
> + struct in6_addr *addr,
> + unsigned char banned_flags);
> extern int ipv6_get_lladdr(struct net_device *dev,
> struct in6_addr *addr,
> unsigned char banned_flags);
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index d1ab6ab..a937092 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -1448,6 +1448,23 @@ try_nextdev:
> }
> EXPORT_SYMBOL(ipv6_dev_get_saddr);
>
> +int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr,
> + unsigned char banned_flags)
> +{
> + int err = -EADDRNOTAVAIL;
> + struct inet6_ifaddr *ifp;
> +
> + list_for_each_entry(ifp, &idev->addr_list, if_list) {
> + if (ifp->scope == IFA_LINK &&
> + !(ifp->flags & banned_flags)) {
> + *addr = ifp->addr;
> + err = 0;
> + break;
> + }
> + }
> + return err;
> +}
> +
> int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
> unsigned char banned_flags)
> {
> @@ -1457,17 +1474,8 @@ int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr,
> rcu_read_lock();
> idev = __in6_dev_get(dev);
> if (idev) {
> - struct inet6_ifaddr *ifp;
> -
> read_lock_bh(&idev->lock);
> - list_for_each_entry(ifp, &idev->addr_list, if_list) {
> - if (ifp->scope == IFA_LINK &&
> - !(ifp->flags & banned_flags)) {
> - *addr = ifp->addr;
> - err = 0;
> - break;
> - }
> - }
> + err = __ipv6_get_lladdr(idev, addr, banned_flags);
> read_unlock_bh(&idev->lock);
> }
> rcu_read_unlock();
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index bfa6cc3..c3998c2 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1343,8 +1343,9 @@ static void ip6_mc_hdr(struct sock *sk, struct sk_buff *skb,
> hdr->daddr = *daddr;
> }
>
> -static struct sk_buff *mld_newpack(struct net_device *dev, int size)
> +static struct sk_buff *mld_newpack(struct inet6_dev *idev, int size)
> {
> + struct net_device *dev = idev->dev;
> struct net *net = dev_net(dev);
> struct sock *sk = net->ipv6.igmp_sk;
> struct sk_buff *skb;
> @@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
>
> skb_reserve(skb, hlen);
>
> - if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
> + if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
> /* <draft-ietf-magma-mld-source-05.txt>:
> * use unspecified address as the source address
> * when a valid link-local address is not available.
> @@ -1465,7 +1466,7 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> struct mld2_grec *pgr;
>
> if (!skb)
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(pmc->idev, dev->mtu);
> if (!skb)
> return NULL;
> pgr = (struct mld2_grec *)skb_put(skb, sizeof(struct mld2_grec));
> @@ -1485,7 +1486,8 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> int type, int gdeleted, int sdeleted)
> {
> - struct net_device *dev = pmc->idev->dev;
> + struct inet6_dev *idev = pmc->idev;
> + struct net_device *dev = idev->dev;
> struct mld2_report *pmr;
> struct mld2_grec *pgr = NULL;
> struct ip6_sf_list *psf, *psf_next, *psf_prev, **psf_list;
> @@ -1514,7 +1516,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> AVAILABLE(skb) < grec_size(pmc, type, gdeleted, sdeleted)) {
> if (skb)
> mld_sendpack(skb);
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(idev, dev->mtu);
> }
> }
> first = 1;
> @@ -1541,7 +1543,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> pgr->grec_nsrcs = htons(scount);
> if (skb)
> mld_sendpack(skb);
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(idev, dev->mtu);
> first = 1;
> scount = 0;
> }
> @@ -1596,8 +1598,8 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> struct sk_buff *skb = NULL;
> int type;
>
> + read_lock_bh(&idev->lock);
> if (!pmc) {
> - read_lock_bh(&idev->lock);
> for (pmc=idev->mc_list; pmc; pmc=pmc->next) {
> if (pmc->mca_flags & MAF_NOREPORT)
> continue;
> @@ -1609,7 +1611,6 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> - read_unlock_bh(&idev->lock);
> } else {
> spin_lock_bh(&pmc->mca_lock);
> if (pmc->mca_sfcount[MCAST_EXCLUDE])
> @@ -1619,6 +1620,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> + read_unlock_bh(&idev->lock);
> if (skb)
> mld_sendpack(skb);
> }
>
> .
>
I test the patch in kernel 3.4 stable and work well till now.
Tested-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Chen Weilong <chenweilong@huawei.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
` (3 preceding siblings ...)
2013-05-21 7:01 ` dingtianhong
@ 2013-05-21 10:10 ` Hannes Frederic Sowa
2013-06-26 22:58 ` Hannes Frederic Sowa
5 siblings, 0 replies; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-05-21 10:10 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, dingtianhong, Hideaki YOSHIFUJI, David S. Miller
On Wed, May 08, 2013 at 03:41:54PM +0800, Cong Wang wrote:
> @@ -1596,8 +1598,8 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> struct sk_buff *skb = NULL;
> int type;
>
> + read_lock_bh(&idev->lock);
> if (!pmc) {
> - read_lock_bh(&idev->lock);
> for (pmc=idev->mc_list; pmc; pmc=pmc->next) {
> if (pmc->mca_flags & MAF_NOREPORT)
> continue;
> @@ -1609,7 +1611,6 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> - read_unlock_bh(&idev->lock);
> } else {
> spin_lock_bh(&pmc->mca_lock);
> if (pmc->mca_sfcount[MCAST_EXCLUDE])
> @@ -1619,6 +1620,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> + read_unlock_bh(&idev->lock);
> if (skb)
> mld_sendpack(skb);
> }
Isn't this hunk the only one needed to fix this problem?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-11 23:11 ` David Miller
@ 2013-05-27 6:52 ` dingtianhong
2013-06-06 2:35 ` Cong Wang
1 sibling, 0 replies; 13+ messages in thread
From: dingtianhong @ 2013-05-27 6:52 UTC (permalink / raw)
To: David Miller; +Cc: amwang, netdev, yoshfuji
On 2013/5/12 7:11, David Miller wrote:
> From: Cong Wang <amwang@redhat.com>
> Date: Wed, 8 May 2013 15:41:54 +0800
>
>> @@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
>>
>> skb_reserve(skb, hlen);
>>
>> - if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
>> + if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
>> /* <draft-ietf-magma-mld-source-05.txt>:
>> * use unspecified address as the source address
>> * when a valid link-local address is not available.
>
> You aren't necessarily going to be holding idev->lock, therefore you can't
> just do a lockless traversal of idev->addr_list here.
>
> Yes, you can elide the rcu_read_lock() because you have a known reference
> to 'idev' in these paths, but you can't get rid of the address list locking
> altogether.
>
>
I think the problem is clear:
mld_send_report(...){
read_lock_bh(&idev->lock);
add_grec(...)
read_unlock_bh(&idev->lock);
}
--->add_grec(...){
add_grhead(...)
}
--->add_grhead(...){
mld_newpack(...)
}
--->mld_newpack(...){
ipv6_get_lladdr(...)
}
--->ipv6_get_lladdr(...){
read_lock_bh(&idev->lock);
...
read_unlock_bh(&idev->lock);
}
so I think it is no need to lock twice and its unsafe here
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-11 23:11 ` David Miller
2013-05-27 6:52 ` dingtianhong
@ 2013-06-06 2:35 ` Cong Wang
1 sibling, 0 replies; 13+ messages in thread
From: Cong Wang @ 2013-06-06 2:35 UTC (permalink / raw)
To: David Miller; +Cc: netdev, dingtianhong, yoshfuji
On Sat, 2013-05-11 at 16:11 -0700, David Miller wrote:
> From: Cong Wang <amwang@redhat.com>
> Date: Wed, 8 May 2013 15:41:54 +0800
>
> > @@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
> >
> > skb_reserve(skb, hlen);
> >
> > - if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
> > + if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
> > /* <draft-ietf-magma-mld-source-05.txt>:
> > * use unspecified address as the source address
> > * when a valid link-local address is not available.
>
> You aren't necessarily going to be holding idev->lock, therefore you can't
> just do a lockless traversal of idev->addr_list here.
>
> Yes, you can elide the rcu_read_lock() because you have a known reference
> to 'idev' in these paths, but you can't get rid of the address list locking
> altogether.
(Apologize for the delay...)
The upper callers of mld_newpack() already take
read_lock_bh(&idev->lock):
mld_newpack() ...-> add_grec() -|-> mld_send_cr() LOCKED
|-> mld_send_report() LOCKED
therefore it is safe to travel idev->addr_list locklessly here.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
` (4 preceding siblings ...)
2013-05-21 10:10 ` Hannes Frederic Sowa
@ 2013-06-26 22:58 ` Hannes Frederic Sowa
2013-06-27 3:09 ` Cong Wang
5 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-06-26 22:58 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, dingtianhong, Hideaki YOSHIFUJI, David S. Miller
On Wed, May 08, 2013 at 03:41:54PM +0800, Cong Wang wrote:
> dingtianhong reported the following deadlock detected by lockdep:
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.4.24.05-0.1-default #1 Not tainted
> -------------------------------------------------------
> ksoftirqd/0/3 is trying to acquire lock:
> (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
>
> but task is already holding lock:
> (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&mc->mca_lock){+.+...}:
> [<ffffffff810a8027>] validate_chain+0x637/0x730
> [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
> [<ffffffff810a8734>] lock_acquire+0x114/0x150
> [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
> [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
> [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
> [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
> [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
> [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
> [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
> [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
> [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
> [<ffffffff813d9360>] dev_change_flags+0x40/0x70
> [<ffffffff813ea627>] do_setlink+0x237/0x8a0
> [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
> [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
> [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
> [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
> [<ffffffff81403e20>] netlink_unicast+0x140/0x180
> [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
> [<ffffffff813c4252>] sock_sendmsg+0x112/0x130
> [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
> [<ffffffff813c5544>] sys_sendmsg+0x44/0x70
> [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b
>
> -> #0 (&ndev->lock){+.+...}:
> [<ffffffff810a798e>] check_prev_add+0x3de/0x440
> [<ffffffff810a8027>] validate_chain+0x637/0x730
> [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
> [<ffffffff810a8734>] lock_acquire+0x114/0x150
> [<ffffffff814f6c82>] rt_read_lock+0x42/0x60
> [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
> [<ffffffff8149b036>] mld_newpack+0xb6/0x160
> [<ffffffff8149b18b>] add_grhead+0xab/0xc0
> [<ffffffff8149d03b>] add_grec+0x3ab/0x460
> [<ffffffff8149d14a>] mld_send_report+0x5a/0x150
> [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
> [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
> [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
> [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
> [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
> [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
> [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
> [<ffffffff8106c7de>] kthread+0xae/0xc0
> [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10
>
> actually we can just hold idev->lock before taking pmc->mca_lock,
> and avoid taking idev->lock again when iterating idev->addr_list.
>
> Reported-by: dingtianhong <dingtianhong@huawei.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
Is someone still working on this issue?
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -1448,6 +1448,23 @@ try_nextdev:
> }
> EXPORT_SYMBOL(ipv6_dev_get_saddr);
>
> +int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr,
> + unsigned char banned_flags)
> +{
> + int err = -EADDRNOTAVAIL;
> + struct inet6_ifaddr *ifp;
> +
> + list_for_each_entry(ifp, &idev->addr_list, if_list) {
> + if (ifp->scope == IFA_LINK &&
> + !(ifp->flags & banned_flags)) {
> + *addr = ifp->addr;
> + err = 0;
> + break;
> + }
> + }
> + return err;
> +}
> +
I added this function in net-next commit b7b1bfc ("ipv6: split duplicate
address detection and router solicitation timer"), so a rebase would
be needed (and it should be exported).
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index bfa6cc3..c3998c2 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1343,8 +1343,9 @@ static void ip6_mc_hdr(struct sock *sk, struct sk_buff *skb,
> hdr->daddr = *daddr;
> }
>
> -static struct sk_buff *mld_newpack(struct net_device *dev, int size)
> +static struct sk_buff *mld_newpack(struct inet6_dev *idev, int size)
> {
> + struct net_device *dev = idev->dev;
> struct net *net = dev_net(dev);
> struct sock *sk = net->ipv6.igmp_sk;
> struct sk_buff *skb;
> @@ -1369,7 +1370,7 @@ static struct sk_buff *mld_newpack(struct net_device *dev, int size)
>
> skb_reserve(skb, hlen);
>
> - if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) {
> + if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) {
> /* <draft-ietf-magma-mld-source-05.txt>:
> * use unspecified address as the source address
> * when a valid link-local address is not available.
> @@ -1465,7 +1466,7 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> struct mld2_grec *pgr;
>
> if (!skb)
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(pmc->idev, dev->mtu);
> if (!skb)
> return NULL;
> pgr = (struct mld2_grec *)skb_put(skb, sizeof(struct mld2_grec));
> @@ -1485,7 +1486,8 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> int type, int gdeleted, int sdeleted)
> {
> - struct net_device *dev = pmc->idev->dev;
> + struct inet6_dev *idev = pmc->idev;
> + struct net_device *dev = idev->dev;
> struct mld2_report *pmr;
> struct mld2_grec *pgr = NULL;
> struct ip6_sf_list *psf, *psf_next, *psf_prev, **psf_list;
> @@ -1514,7 +1516,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> AVAILABLE(skb) < grec_size(pmc, type, gdeleted, sdeleted)) {
> if (skb)
> mld_sendpack(skb);
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(idev, dev->mtu);
> }
> }
> first = 1;
> @@ -1541,7 +1543,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
> pgr->grec_nsrcs = htons(scount);
> if (skb)
> mld_sendpack(skb);
> - skb = mld_newpack(dev, dev->mtu);
> + skb = mld_newpack(idev, dev->mtu);
> first = 1;
> scount = 0;
> }
> @@ -1596,8 +1598,8 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> struct sk_buff *skb = NULL;
> int type;
>
> + read_lock_bh(&idev->lock);
> if (!pmc) {
> - read_lock_bh(&idev->lock);
> for (pmc=idev->mc_list; pmc; pmc=pmc->next) {
> if (pmc->mca_flags & MAF_NOREPORT)
> continue;
> @@ -1609,7 +1611,6 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> - read_unlock_bh(&idev->lock);
> } else {
> spin_lock_bh(&pmc->mca_lock);
> if (pmc->mca_sfcount[MCAST_EXCLUDE])
> @@ -1619,6 +1620,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
> skb = add_grec(skb, pmc, type, 0, 0);
> spin_unlock_bh(&pmc->mca_lock);
> }
> + read_unlock_bh(&idev->lock);
> if (skb)
> mld_sendpack(skb);
> }
I do confirm that if this last hunk is applied the idev->addr_list traversal
is safe.
Please let me know if no one is working on this, I would rebase it then.
Thanks,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-06-26 22:58 ` Hannes Frederic Sowa
@ 2013-06-27 3:09 ` Cong Wang
2013-06-27 3:42 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2013-06-27 3:09 UTC (permalink / raw)
To: Hannes Frederic Sowa
Cc: netdev, dingtianhong, Hideaki YOSHIFUJI, David S. Miller
On Thu, 2013-06-27 at 00:58 +0200, Hannes Frederic Sowa wrote:
> I do confirm that if this last hunk is applied the idev->addr_list
> traversal
> is safe.
>
> Please let me know if no one is working on this, I would rebase it
> then.
I will rebase my patch on latest net-next. I am assuming this patch
looks good to you...
Thanks!
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-06-27 3:09 ` Cong Wang
@ 2013-06-27 3:42 ` Hannes Frederic Sowa
2013-06-28 6:26 ` Ding Tianhong
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-06-27 3:42 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, dingtianhong, Hideaki YOSHIFUJI, David S. Miller
On Thu, Jun 27, 2013 at 11:09:44AM +0800, Cong Wang wrote:
> On Thu, 2013-06-27 at 00:58 +0200, Hannes Frederic Sowa wrote:
> > I do confirm that if this last hunk is applied the idev->addr_list
> > traversal
> > is safe.
> >
> > Please let me know if no one is working on this, I would rebase it
> > then.
>
> I will rebase my patch on latest net-next. I am assuming this patch
> looks good to you...
Yes, I am fine with the apporach you took. Perhaps you could describe
why the non-idev-locked call to__ipv6_get_lladdr-call is ok in that place.
Thanks a lot,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-06-27 3:42 ` Hannes Frederic Sowa
@ 2013-06-28 6:26 ` Ding Tianhong
2013-06-28 10:46 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Ding Tianhong @ 2013-06-28 6:26 UTC (permalink / raw)
To: Cong Wang, netdev, Hideaki YOSHIFUJI, David S. Miller
On 2013/6/27 11:42, Hannes Frederic Sowa wrote:
> On Thu, Jun 27, 2013 at 11:09:44AM +0800, Cong Wang wrote:
>> On Thu, 2013-06-27 at 00:58 +0200, Hannes Frederic Sowa wrote:
>>> I do confirm that if this last hunk is applied the idev->addr_list
>>> traversal
>>> is safe.
>>>
>>> Please let me know if no one is working on this, I would rebase it
>>> then.
>>
>> I will rebase my patch on latest net-next. I am assuming this patch
>> looks good to you...
>
> Yes, I am fine with the apporach you took. Perhaps you could describe
> why the non-idev-locked call to__ipv6_get_lladdr-call is ok in that place.
>
> Thanks a lot,
>
> Hannes
>
>
>
I think the problem is clear:
mld_send_report(...){
read_lock_bh(&idev->lock);
add_grec(...)
read_unlock_bh(&idev->lock);
}
--->add_grec(...){
add_grhead(...)
}
--->add_grhead(...){
mld_newpack(...)
}
--->mld_newpack(...){
ipv6_get_lladdr(...)
}
--__ipv6_get_lladdr(..) (after the patch, so it is protect by the idev->lock)
compare
--->ipv6_get_lladdr(...){ (before the patch)
read_lock_bh(&idev->lock);
...
read_unlock_bh(&idev->lock);
}
so i think it is clear to describe the reason.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] ipv6,mcast: always hold idev->lock before mca_lock
2013-06-28 6:26 ` Ding Tianhong
@ 2013-06-28 10:46 ` Hannes Frederic Sowa
0 siblings, 0 replies; 13+ messages in thread
From: Hannes Frederic Sowa @ 2013-06-28 10:46 UTC (permalink / raw)
To: Ding Tianhong; +Cc: Cong Wang, netdev, Hideaki YOSHIFUJI, David S. Miller
On Fri, Jun 28, 2013 at 02:26:07PM +0800, Ding Tianhong wrote:
> On 2013/6/27 11:42, Hannes Frederic Sowa wrote:
> > On Thu, Jun 27, 2013 at 11:09:44AM +0800, Cong Wang wrote:
> >> On Thu, 2013-06-27 at 00:58 +0200, Hannes Frederic Sowa wrote:
> >>> I do confirm that if this last hunk is applied the idev->addr_list
> >>> traversal
> >>> is safe.
> >>>
> >>> Please let me know if no one is working on this, I would rebase it
> >>> then.
> >>
> >> I will rebase my patch on latest net-next. I am assuming this patch
> >> looks good to you...
> >
> > Yes, I am fine with the apporach you took. Perhaps you could describe
> > why the non-idev-locked call to__ipv6_get_lladdr-call is ok in that place.
> >
> > Thanks a lot,
> >
> > Hannes
> >
> >
> >
> I think the problem is clear:
>
> [...]
>
> so i think it is clear to describe the reason.
Sorry, I meant "...in a comment". ;)
Greetings,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-06-28 10:46 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-08 7:41 [Patch net] ipv6,mcast: always hold idev->lock before mca_lock Cong Wang
2013-05-08 9:47 ` dingtianhong
2013-05-08 20:16 ` David Miller
2013-05-11 23:11 ` David Miller
2013-05-27 6:52 ` dingtianhong
2013-06-06 2:35 ` Cong Wang
2013-05-21 7:01 ` dingtianhong
2013-05-21 10:10 ` Hannes Frederic Sowa
2013-06-26 22:58 ` Hannes Frederic Sowa
2013-06-27 3:09 ` Cong Wang
2013-06-27 3:42 ` Hannes Frederic Sowa
2013-06-28 6:26 ` Ding Tianhong
2013-06-28 10:46 ` Hannes Frederic Sowa
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.