* Re: [PATCH net-next 1/2] netlink: ipv4 IGMP join notifications
2018-08-30 9:35 ` [PATCH net-next 1/2] netlink: ipv4 IGMP " Patrick Ruddy
@ 2018-08-30 16:44 ` Patrick Ruddy
2018-08-31 4:23 ` kbuild test robot
` (3 subsequent siblings)
4 siblings, 0 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-08-30 16:44 UTC
To: netdev; +Cc: roopa, jiri, stephen
Don't know what happened to the 0/2 cover letter for this series, so here
it is:
This patch is an update to https://patchwork.ozlabs.org/patch/571127/.
The previous patch was based on sending multicast MAC addresses in the
netlink messages to allow the programming of hardware. It was agreed to
rework this to use RTM_NEW/DELLINK messages, which were more appropriate
for layer 2 addresses.
In the interim period it has become apparent that the applications
actually need to see the L3 multicast addresses which are joined for
FORUS processing, so this patch has been reworked to send the L3
multicast addresses using RTM_NEW/DELADDR.
These new multicast L3 netlink notifications should use the
IFA_MULTICAST address type, but this has been dropped in favour of
IFA_ADDRESS because during testing it was noticed that some applications,
notably getaddrinfo in libc6, assume that there is an IFA_ADDRESS in a
RTM_NEW/DELADDR message and blindly dereference it.
Finally, the RTM_GETADDR handling for both address families has been
modified to include the multicast L3 addresses.
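For readers following along in userspace, here is a minimal sketch of how a
listener might pull IFA_ADDRESS out of an RTM_NEWADDR/RTM_DELADDR message
without assuming the attribute is present, and then tell a multicast group
apart from a unicast address. It assumes the message layout described above
(an ifaddrmsg header followed by rtattr attributes). The helper names
ifa_get_ipv4_address() and ipv4_is_multicast_be() are hypothetical and are
not part of this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/*
 * Hypothetical helper: copy the IPv4 address carried in IFA_ADDRESS
 * (network byte order) into *addr_be and return 1. Return 0 when the
 * message is too short, is not AF_INET, or has no IFA_ADDRESS attribute,
 * instead of dereferencing a missing attribute.
 */
static int ifa_get_ipv4_address(struct nlmsghdr *nlh, uint32_t *addr_be)
{
    struct ifaddrmsg *ifm;
    struct rtattr *rta;
    int len;

    assert(nlh != NULL && addr_be != NULL);

    /* The payload must at least hold the fixed ifaddrmsg header. */
    if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifm)))
        return 0;

    ifm = NLMSG_DATA(nlh);
    if (ifm->ifa_family != AF_INET)
        return 0;

    /* Walk the attributes that follow the ifaddrmsg header. */
    len = IFA_PAYLOAD(nlh);
    for (rta = IFA_RTA(ifm); RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        if (rta->rta_type == IFA_ADDRESS &&
            RTA_PAYLOAD(rta) == (int)sizeof(*addr_be)) {
            memcpy(addr_be, RTA_DATA(rta), sizeof(*addr_be));
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: true for 224.0.0.0/4, input in network byte order. */
static int ipv4_is_multicast_be(uint32_t addr_be)
{
    return IN_MULTICAST(ntohl(addr_be));
}
```

A cache that only wants unicast addresses could skip any message for which
ipv4_is_multicast_be() returns true.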
Patrick Ruddy (2):
netlink: ipv4 IGMP join notifications
netlink: ipv6 MLD join notifications
include/linux/igmp.h | 2 +
net/ipv4/devinet.c | 39 +++++++++++++------
net/ipv4/igmp.c | 90 ++++++++++++++++++++++++++++++++++++++++++++
net/ipv6/addrconf.c | 44 ++++++++++++++++------
net/ipv6/mcast.c | 66 ++++++++++++++++++++++++++++++++
5 files changed, 218 insertions(+), 23 deletions(-)
--
2.17.1
On Thu, 2018-08-30 at 10:35 +0100, Patrick Ruddy wrote:
> Some userspace applications need to know about IGMP joins from the kernel
> for 2 reasons
> 1. To allow the programming of multicast MAC filters in hardware
> 2. To form a multicast FORUS list for non link-local multicast
> groups to be sent to the kernel and from there to the interested
> party.
> (1) can be fulfilled by simply sending the hardware multicast MAC
> address to be programmed but (2) requires the L3 address to be sent
> since this cannot be constructed from the MAC address whereas the
> reverse translation is a standard library function.
>
> This commit provides addition and deletion of multicast addresses
> using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> the RTM_GETADDR extension to allow multicast join state to be read
> from the kernel.
>
> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> ---
> include/linux/igmp.h | 2 +
> net/ipv4/devinet.c | 39 +++++++++++++------
> net/ipv4/igmp.c | 90 ++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 120 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/igmp.h b/include/linux/igmp.h
> index 119f53941c12..1fb417865e7d 100644
> --- a/include/linux/igmp.h
> +++ b/include/linux/igmp.h
> @@ -130,6 +130,8 @@ extern void ip_mc_unmap(struct in_device *);
> extern void ip_mc_remap(struct in_device *);
> extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
> extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
> +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> + struct net_device *dev);
> int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
>
> #endif
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index ea4bd8a52422..42f7dcc4fb5e 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -57,6 +57,7 @@
> #endif
> #include <linux/kmod.h>
> #include <linux/netconf.h>
> +#include <linux/igmp.h>
>
> #include <net/arp.h>
> #include <net/ip.h>
> @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> int h, s_h;
> int idx, s_idx;
> int ip_idx, s_ip_idx;
> + int multicast, mcast_idx;
> struct net_device *dev;
> struct in_device *in_dev;
> struct in_ifaddr *ifa;
> @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> s_h = cb->args[0];
> s_idx = idx = cb->args[1];
> s_ip_idx = ip_idx = cb->args[2];
> + multicast = cb->args[3];
> + mcast_idx = cb->args[4];
>
> for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> idx = 0;
> @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> if (!in_dev)
> goto cont;
>
> - for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> - ifa = ifa->ifa_next, ip_idx++) {
> - if (ip_idx < s_ip_idx)
> - continue;
> - if (inet_fill_ifaddr(skb, ifa,
> - NETLINK_CB(cb->skb).portid,
> - cb->nlh->nlmsg_seq,
> - RTM_NEWADDR, NLM_F_MULTI) < 0) {
> - rcu_read_unlock();
> - goto done;
> + if (!multicast) {
> + for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> + ifa = ifa->ifa_next, ip_idx++) {
> + if (ip_idx < s_ip_idx)
> + continue;
> + if (inet_fill_ifaddr(skb, ifa,
> + NETLINK_CB(cb->skb).portid,
> + cb->nlh->nlmsg_seq,
> + RTM_NEWADDR,
> + NLM_F_MULTI) < 0) {
> + rcu_read_unlock();
> + goto done;
> + }
> + nl_dump_check_consistent(cb,
> + nlmsg_hdr(skb));
> }
> - nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> + /* set for multicast loop */
> + multicast++;
> + }
> + /* loop over multicast addresses */
> + if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
> + rcu_read_unlock();
> + goto done;
> }
> cont:
> idx++;
> @@ -1698,6 +1713,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> cb->args[0] = h;
> cb->args[1] = idx;
> cb->args[2] = ip_idx;
> + cb->args[3] = multicast;
> + cb->args[4] = mcast_idx;
>
> return skb->len;
> }
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index cf75f8944b05..c9bbd1d27124 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -86,6 +86,7 @@
> #include <linux/inetdevice.h>
> #include <linux/igmp.h>
> #include <linux/if_arp.h>
> +#include <net/netlink.h>
> #include <linux/rtnetlink.h>
> #include <linux/times.h>
> #include <linux/pkt_sched.h>
> @@ -1384,6 +1385,91 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
> }
>
>
> +static int fill_addr(struct sk_buff *skb, struct net_device *dev, __be32 addr,
> + int type, unsigned int flags)
> +{
> + struct nlmsghdr *nlh;
> + struct ifaddrmsg *ifm;
> +
> + nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
> + if (!nlh)
> + return -EMSGSIZE;
> +
> + ifm = nlmsg_data(nlh);
> + ifm->ifa_family = AF_INET;
> + ifm->ifa_prefixlen = 32;
> + ifm->ifa_flags = IFA_F_PERMANENT;
> + ifm->ifa_scope = RT_SCOPE_LINK;
> + ifm->ifa_index = dev->ifindex;
> +
> + if (nla_put_in_addr(skb, IFA_ADDRESS, addr))
> + goto nla_put_failure;
> + nlmsg_end(skb, nlh);
> + return 0;
> +
> +nla_put_failure:
> + nlmsg_cancel(skb, nlh);
> + return -EMSGSIZE;
> +}
> +
> +static inline size_t addr_nlmsg_size(void)
> +{
> + return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
> + + nla_total_size(sizeof(__be32));
> +}
> +
> +static void ip_mc_addr_notify(struct net_device *dev, __be32 addr, int type)
> +{
> + struct net *net = dev_net(dev);
> + struct sk_buff *skb;
> + int err = -ENOBUFS;
> +
> + skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
> + if (!skb)
> + goto errout;
> +
> + err = fill_addr(skb, dev, addr, type, 0);
> + if (err < 0) {
> + WARN_ON(err == -EMSGSIZE);
> + kfree_skb(skb);
> + goto errout;
> + }
> + rtnl_notify(skb, net, 0, RTNLGRP_IPV4_IFADDR, NULL, GFP_ATOMIC);
> + return;
> +errout:
> + if (err < 0)
> + rtnl_set_sk_err(net, RTNLGRP_LINK, err);
> +}
> +
> +int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> + struct net_device *dev)
> +{
> + int s_idx;
> + int idx = 0;
> + struct ip_mc_list *im;
> + struct in_device *in_dev;
> +
> + ASSERT_RTNL();
> +
> + s_idx = cb->args[4];
> + in_dev = __in_dev_get_rtnl(dev);
> +
> + for_each_pmc_rtnl(in_dev, im) {
> + if (idx < s_idx)
> + continue;
> + if (fill_addr(skb, dev, im->multiaddr, RTM_NEWADDR,
> + NLM_F_MULTI) < 0)
> + goto done;
> + nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> + idx++;
> + }
> +
> + done:
> + cb->args[4] = idx;
> +
> + return skb->len;
> +}
> +
> /*
> * A socket has joined a multicast group on device dev.
> */
> @@ -1433,6 +1519,8 @@ static void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
> igmpv3_del_delrec(in_dev, im);
> #endif
> igmp_group_added(im);
> +
> + ip_mc_addr_notify(in_dev->dev, addr, RTM_NEWADDR);
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
> out:
> @@ -1664,6 +1752,8 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
> in_dev->mc_count--;
> igmp_group_dropped(i);
> ip_mc_clear_src(i);
> + ip_mc_addr_notify(in_dev->dev, addr,
> + RTM_DELADDR);
>
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
* Re: [PATCH net-next 1/2] netlink: ipv4 IGMP join notifications
2018-08-30 9:35 ` [PATCH net-next 1/2] netlink: ipv4 IGMP " Patrick Ruddy
2018-08-30 16:44 ` Patrick Ruddy
@ 2018-08-31 4:23 ` kbuild test robot
2018-08-31 4:52 ` kbuild test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 24+ messages in thread
From: kbuild test robot @ 2018-08-31 4:23 UTC
To: Patrick Ruddy; +Cc: kbuild-all, netdev, roopa, jiri, stephen
[-- Attachment #1: Type: text/plain, Size: 2775 bytes --]
Hi Patrick,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Patrick-Ruddy/netlink-ipv4-IGMP-join-notifications/20180831-105548
config: i386-randconfig-s0-201834 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All warnings (new ones prefixed by >>):
In file included from net//ipv4/udp.c:92:0:
>> include/linux/igmp.h:133:58: warning: 'struct netlink_callback' declared inside parameter list will not be visible outside of this definition or declaration
extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
^~~~~~~~~~~~~~~~
vim +133 include/linux/igmp.h
108
109 extern int ip_check_mc_rcu(struct in_device *dev, __be32 mc_addr, __be32 src_addr, u8 proto);
110 extern int igmp_rcv(struct sk_buff *);
111 extern int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr);
112 extern int ip_mc_join_group_ssm(struct sock *sk, struct ip_mreqn *imr,
113 unsigned int mode);
114 extern int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr);
115 extern void ip_mc_drop_socket(struct sock *sk);
116 extern int ip_mc_source(int add, int omode, struct sock *sk,
117 struct ip_mreq_source *mreqs, int ifindex);
118 extern int ip_mc_msfilter(struct sock *sk, struct ip_msfilter *msf,int ifindex);
119 extern int ip_mc_msfget(struct sock *sk, struct ip_msfilter *msf,
120 struct ip_msfilter __user *optval, int __user *optlen);
121 extern int ip_mc_gsfget(struct sock *sk, struct group_filter *gsf,
122 struct group_filter __user *optval, int __user *optlen);
123 extern int ip_mc_sf_allow(struct sock *sk, __be32 local, __be32 rmt,
124 int dif, int sdif);
125 extern void ip_mc_init_dev(struct in_device *);
126 extern void ip_mc_destroy_dev(struct in_device *);
127 extern void ip_mc_up(struct in_device *);
128 extern void ip_mc_down(struct in_device *);
129 extern void ip_mc_unmap(struct in_device *);
130 extern void ip_mc_remap(struct in_device *);
131 extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
132 extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
> 133 extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
134 struct net_device *dev);
135 int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
136
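The warning above is what GCC emits when a struct tag first appears inside a
prototype's parameter list: the type is then scoped to that one declaration.
v2 in this thread addresses it by including the defining headers. Another
common approach is a file-scope forward declaration, sketched below as a
self-contained toy. The structure body and demo_dump_ifaddr() here are
stand-ins for illustration only, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * File-scope forward declarations: the tags are now visible to every
 * later declaration in this translation unit, so the prototype below
 * does not introduce a new, prototype-scoped type.
 */
struct sk_buff;
struct netlink_callback;
struct net_device;

/* Prototype using only pointers to the (still incomplete) types. */
int demo_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
                     struct net_device *dev);

/* Toy completion of one type so the example can run; NOT the kernel layout. */
struct netlink_callback {
    long args[6];
};

/* Stand-in body: returns the saved resume index from args[4]. */
int demo_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
                     struct net_device *dev)
{
    (void)skb;
    (void)dev;
    assert(cb != NULL);
    return (int)cb->args[4];
}
```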
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28040 bytes --]
* Re: [PATCH net-next 1/2] netlink: ipv4 IGMP join notifications
2018-08-30 9:35 ` [PATCH net-next 1/2] netlink: ipv4 IGMP " Patrick Ruddy
2018-08-30 16:44 ` Patrick Ruddy
2018-08-31 4:23 ` kbuild test robot
@ 2018-08-31 4:52 ` kbuild test robot
2018-08-31 11:20 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Patrick Ruddy
2018-09-06 9:10 ` [PATCH net-next v3 " Patrick Ruddy
4 siblings, 0 replies; 24+ messages in thread
From: kbuild test robot @ 2018-08-31 4:52 UTC
To: Patrick Ruddy; +Cc: kbuild-all, netdev, roopa, jiri, stephen
[-- Attachment #1: Type: text/plain, Size: 2741 bytes --]
Hi Patrick,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Patrick-Ruddy/netlink-ipv4-IGMP-join-notifications/20180831-105548
config: i386-randconfig-a1-201834 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All warnings (new ones prefixed by >>):
In file included from net//bridge/br_multicast.c:16:0:
>> include/linux/igmp.h:134:16: warning: 'struct netlink_callback' declared inside parameter list
struct net_device *dev);
^
>> include/linux/igmp.h:134:16: warning: its scope is only this definition or declaration, which is probably not what you want
vim +134 include/linux/igmp.h
108
109 extern int ip_check_mc_rcu(struct in_device *dev, __be32 mc_addr, __be32 src_addr, u8 proto);
110 extern int igmp_rcv(struct sk_buff *);
111 extern int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr);
112 extern int ip_mc_join_group_ssm(struct sock *sk, struct ip_mreqn *imr,
113 unsigned int mode);
114 extern int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr);
115 extern void ip_mc_drop_socket(struct sock *sk);
116 extern int ip_mc_source(int add, int omode, struct sock *sk,
117 struct ip_mreq_source *mreqs, int ifindex);
118 extern int ip_mc_msfilter(struct sock *sk, struct ip_msfilter *msf,int ifindex);
119 extern int ip_mc_msfget(struct sock *sk, struct ip_msfilter *msf,
120 struct ip_msfilter __user *optval, int __user *optlen);
121 extern int ip_mc_gsfget(struct sock *sk, struct group_filter *gsf,
122 struct group_filter __user *optval, int __user *optlen);
123 extern int ip_mc_sf_allow(struct sock *sk, __be32 local, __be32 rmt,
124 int dif, int sdif);
125 extern void ip_mc_init_dev(struct in_device *);
126 extern void ip_mc_destroy_dev(struct in_device *);
127 extern void ip_mc_up(struct in_device *);
128 extern void ip_mc_down(struct in_device *);
129 extern void ip_mc_unmap(struct in_device *);
130 extern void ip_mc_remap(struct in_device *);
131 extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
132 extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
133 extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> 134 struct net_device *dev);
135 int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
136
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34377 bytes --]
* [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
2018-08-30 9:35 ` [PATCH net-next 1/2] netlink: ipv4 IGMP " Patrick Ruddy
` (2 preceding siblings ...)
2018-08-31 4:52 ` kbuild test robot
@ 2018-08-31 11:20 ` Patrick Ruddy
2018-08-31 11:20 ` [PATCH net-next v2 2/2] netlink: ipv6 MLD " Patrick Ruddy
2018-08-31 16:29 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Roopa Prabhu
2018-09-06 9:10 ` [PATCH net-next v3 " Patrick Ruddy
4 siblings, 2 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-08-31 11:20 UTC
To: netdev; +Cc: roopa, jiri, stephen
Some userspace applications need to know about IGMP joins from the kernel
for 2 reasons:
1. To allow the programming of multicast MAC filters in hardware
2. To form a multicast FORUS list for non link-local multicast
groups to be sent to the kernel and from there to the interested
party.
(1) can be fulfilled by simply sending the hardware multicast MAC
address to be programmed but (2) requires the L3 address to be sent
since this cannot be constructed from the MAC address whereas the
reverse translation is a standard library function.
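To illustrate why the L3 address cannot be recovered from the MAC, here is a
small sketch of the well-known RFC 1112 mapping from an IPv4 group to an
Ethernet address. Only the low 23 bits of the group survive, so 32 different
groups share each MAC. The function names ipv4_mcast_to_mac() and mac_equal()
are hypothetical; this reimplements the mapping for illustration rather than
calling any particular library routine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Map an IPv4 multicast group (network byte order) to its Ethernet MAC:
 * the fixed prefix 01:00:5e followed by the low 23 bits of the group.
 */
static void ipv4_mcast_to_mac(uint32_t group_be, uint8_t mac[6])
{
    uint32_t group = ntohl(group_be);

    /* Only 224.0.0.0/4 addresses have a multicast MAC mapping. */
    assert((group & 0xf0000000u) == 0xe0000000u);

    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (group >> 16) & 0x7f;  /* top bit of this octet is dropped */
    mac[4] = (group >> 8) & 0xff;
    mac[5] = group & 0xff;
}

/* Compare two 6-byte MAC addresses. */
static int mac_equal(const uint8_t a[6], const uint8_t b[6])
{
    return memcmp(a, b, 6) == 0;
}
```

For example, 224.1.1.1 and 225.129.1.1 both map to 01:00:5e:01:01:01, which
is why a listener that needs the joined groups has to be told the L3 address.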
This commit provides addition and deletion of multicast addresses
using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
the RTM_GETADDR extension to allow multicast join state to be read
from the kernel.
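As a rough userspace illustration of the read side, the sketch below builds
the kind of RTM_GETADDR dump request whose reply this series extends with the
joined groups. The struct name ifaddr_dump_req and the helper
build_getaddr_dump() are hypothetical; the netlink constants come from the
standard uapi headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/* Hypothetical request layout: netlink header followed by ifaddrmsg. */
struct ifaddr_dump_req {
    struct nlmsghdr nlh;
    struct ifaddrmsg ifm;
};

/*
 * Fill in a dump request for all addresses of the given family.
 * NLM_F_DUMP asks the kernel to walk every entry and reply with a
 * multi-part message terminated by NLMSG_DONE.
 */
static void build_getaddr_dump(struct ifaddr_dump_req *req,
                               unsigned char family, uint32_t seq)
{
    assert(req != NULL);

    memset(req, 0, sizeof(*req));
    req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifm));
    req->nlh.nlmsg_type = RTM_GETADDR;
    req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req->nlh.nlmsg_seq = seq;
    req->ifm.ifa_family = family;
}
```

The request would then be sent on an AF_NETLINK socket opened with
NETLINK_ROUTE, and the replies read until NLMSG_DONE arrives.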
Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
---
v2: fix kbuild warnings.
include/linux/igmp.h | 4 ++
net/ipv4/devinet.c | 39 +++++++++++++------
net/ipv4/igmp.c | 90 ++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 122 insertions(+), 11 deletions(-)
diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 119f53941c12..644a548024ed 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -19,6 +19,8 @@
#include <linux/timer.h>
#include <linux/in.h>
#include <linux/refcount.h>
+#include <linux/netlink.h>
+#include <linux/netdevice.h>
#include <uapi/linux/igmp.h>
static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
@@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *);
extern void ip_mc_remap(struct in_device *);
extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
+extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
+ struct net_device *dev);
int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
#endif
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index ea4bd8a52422..42f7dcc4fb5e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -57,6 +57,7 @@
#endif
#include <linux/kmod.h>
#include <linux/netconf.h>
+#include <linux/igmp.h>
#include <net/arp.h>
#include <net/ip.h>
@@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
int h, s_h;
int idx, s_idx;
int ip_idx, s_ip_idx;
+ int multicast, mcast_idx;
struct net_device *dev;
struct in_device *in_dev;
struct in_ifaddr *ifa;
@@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
s_h = cb->args[0];
s_idx = idx = cb->args[1];
s_ip_idx = ip_idx = cb->args[2];
+ multicast = cb->args[3];
+ mcast_idx = cb->args[4];
for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
idx = 0;
@@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
if (!in_dev)
goto cont;
- for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
- ifa = ifa->ifa_next, ip_idx++) {
- if (ip_idx < s_ip_idx)
- continue;
- if (inet_fill_ifaddr(skb, ifa,
- NETLINK_CB(cb->skb).portid,
- cb->nlh->nlmsg_seq,
- RTM_NEWADDR, NLM_F_MULTI) < 0) {
- rcu_read_unlock();
- goto done;
+ if (!multicast) {
+ for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
+ ifa = ifa->ifa_next, ip_idx++) {
+ if (ip_idx < s_ip_idx)
+ continue;
+ if (inet_fill_ifaddr(skb, ifa,
+ NETLINK_CB(cb->skb).portid,
+ cb->nlh->nlmsg_seq,
+ RTM_NEWADDR,
+ NLM_F_MULTI) < 0) {
+ rcu_read_unlock();
+ goto done;
+ }
+ nl_dump_check_consistent(cb,
+ nlmsg_hdr(skb));
}
- nl_dump_check_consistent(cb, nlmsg_hdr(skb));
+ /* set for multicast loop */
+ multicast++;
+ }
+ /* loop over multicast addresses */
+ if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
+ rcu_read_unlock();
+ goto done;
}
cont:
idx++;
@@ -1698,6 +1713,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
cb->args[0] = h;
cb->args[1] = idx;
cb->args[2] = ip_idx;
+ cb->args[3] = multicast;
+ cb->args[4] = mcast_idx;
return skb->len;
}
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index cf75f8944b05..c9bbd1d27124 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -86,6 +86,7 @@
#include <linux/inetdevice.h>
#include <linux/igmp.h>
#include <linux/if_arp.h>
+#include <net/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/times.h>
#include <linux/pkt_sched.h>
@@ -1384,6 +1385,91 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
}
+static int fill_addr(struct sk_buff *skb, struct net_device *dev, __be32 addr,
+ int type, unsigned int flags)
+{
+ struct nlmsghdr *nlh;
+ struct ifaddrmsg *ifm;
+
+ nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
+ if (!nlh)
+ return -EMSGSIZE;
+
+ ifm = nlmsg_data(nlh);
+ ifm->ifa_family = AF_INET;
+ ifm->ifa_prefixlen = 32;
+ ifm->ifa_flags = IFA_F_PERMANENT;
+ ifm->ifa_scope = RT_SCOPE_LINK;
+ ifm->ifa_index = dev->ifindex;
+
+ if (nla_put_in_addr(skb, IFA_ADDRESS, addr))
+ goto nla_put_failure;
+ nlmsg_end(skb, nlh);
+ return 0;
+
+nla_put_failure:
+ nlmsg_cancel(skb, nlh);
+ return -EMSGSIZE;
+}
+
+static inline size_t addr_nlmsg_size(void)
+{
+ return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
+ + nla_total_size(sizeof(__be32));
+}
+
+static void ip_mc_addr_notify(struct net_device *dev, __be32 addr, int type)
+{
+ struct net *net = dev_net(dev);
+ struct sk_buff *skb;
+ int err = -ENOBUFS;
+
+ skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
+ if (!skb)
+ goto errout;
+
+ err = fill_addr(skb, dev, addr, type, 0);
+ if (err < 0) {
+ WARN_ON(err == -EMSGSIZE);
+ kfree_skb(skb);
+ goto errout;
+ }
+ rtnl_notify(skb, net, 0, RTNLGRP_IPV4_IFADDR, NULL, GFP_ATOMIC);
+ return;
+errout:
+ if (err < 0)
+ rtnl_set_sk_err(net, RTNLGRP_LINK, err);
+}
+
+int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
+ struct net_device *dev)
+{
+ int s_idx;
+ int idx = 0;
+ struct ip_mc_list *im;
+ struct in_device *in_dev;
+
+ ASSERT_RTNL();
+
+ s_idx = cb->args[4];
+ in_dev = __in_dev_get_rtnl(dev);
+
+ for_each_pmc_rtnl(in_dev, im) {
+ if (idx < s_idx)
+ continue;
+ if (fill_addr(skb, dev, im->multiaddr, RTM_NEWADDR,
+ NLM_F_MULTI) < 0)
+ goto done;
+ nl_dump_check_consistent(cb, nlmsg_hdr(skb));
+ idx++;
+ }
+
+ done:
+ cb->args[4] = idx;
+
+ return skb->len;
+}
+
/*
* A socket has joined a multicast group on device dev.
*/
@@ -1433,6 +1519,8 @@ static void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
igmpv3_del_delrec(in_dev, im);
#endif
igmp_group_added(im);
+
+ ip_mc_addr_notify(in_dev->dev, addr, RTM_NEWADDR);
if (!in_dev->dead)
ip_rt_multicast_event(in_dev);
out:
@@ -1664,6 +1752,8 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
in_dev->mc_count--;
igmp_group_dropped(i);
ip_mc_clear_src(i);
+ ip_mc_addr_notify(in_dev->dev, addr,
+ RTM_DELADDR);
if (!in_dev->dead)
ip_rt_multicast_event(in_dev);
--
2.17.1
* [PATCH net-next v2 2/2] netlink: ipv6 MLD join notifications
2018-08-31 11:20 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Patrick Ruddy
@ 2018-08-31 11:20 ` Patrick Ruddy
2018-08-31 16:29 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Roopa Prabhu
1 sibling, 0 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-08-31 11:20 UTC
To: netdev; +Cc: roopa, jiri, stephen
Some userspace applications need to know about MLD joins from the
kernel for 2 reasons:
1. To allow the programming of multicast MAC filters in hardware
2. To form a multicast FORUS list for non link-local multicast
groups to be sent to the kernel and from there to the interested
party.
(1) can be fulfilled by simply sending the hardware multicast MAC
address to be programmed but (2) requires the L3 address to be sent
since this cannot be constructed from the MAC address whereas the
reverse translation is a standard library function.
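The IPv6 case loses information in the same way. The sketch below follows the
RFC 2464 mapping, where the Ethernet address is 33:33 followed by the last
four octets of the group, so the scope and the upper bits of the group ID are
not recoverable from the MAC. The function name ipv6_mcast_to_mac() is
hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Map an IPv6 multicast group to its Ethernet MAC: the fixed prefix
 * 33:33 followed by the low 32 bits (last four octets) of the group.
 */
static void ipv6_mcast_to_mac(const struct in6_addr *group, uint8_t mac[6])
{
    assert(group != NULL);
    /* Only ff00::/8 addresses have a multicast MAC mapping. */
    assert(IN6_IS_ADDR_MULTICAST(group));

    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &group->s6_addr[12], 4);
}
```

For example, ff02::1:ff00:1234 and ff05::1:ff00:1234 differ in scope but both
yield 33:33:ff:00:12:34.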
This commit provides addition and deletion of multicast addresses
using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
the RTM_GETADDR extension to allow multicast join state to be read
from the kernel.
Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
---
v2: fix kbuild issues.
net/ipv6/addrconf.c | 44 +++++++++++++++++++++---------
net/ipv6/mcast.c | 66 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 98 insertions(+), 12 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d51a8c0b3372..0b609c7897b4 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4855,11 +4855,13 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa,
}
static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca,
- u32 portid, u32 seq, int event, u16 flags)
+ u32 portid, u32 seq, int event, u16 flags)
{
struct nlmsghdr *nlh;
u8 scope = RT_SCOPE_UNIVERSE;
int ifindex = ifmca->idev->dev->ifindex;
+ int addr_type = (event == RTM_GETMULTICAST) ? IFA_MULTICAST :
+ IFA_ADDRESS;
if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE)
scope = RT_SCOPE_SITE;
@@ -4869,7 +4871,7 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca,
return -EMSGSIZE;
put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex);
- if (nla_put_in6_addr(skb, IFA_MULTICAST, &ifmca->mca_addr) < 0 ||
+ if (nla_put_in6_addr(skb, addr_type, &ifmca->mca_addr) < 0 ||
put_cacheinfo(skb, ifmca->mca_cstamp, ifmca->mca_tstamp,
INFINITY_LIFE_TIME, INFINITY_LIFE_TIME) < 0) {
nlmsg_cancel(skb, nlh);
@@ -4916,7 +4918,7 @@ enum addr_type_t {
/* called with rcu_read_lock() */
static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
struct netlink_callback *cb, enum addr_type_t type,
- int s_ip_idx, int *p_ip_idx)
+ int s_ip_idx, int *p_ip_idx, int msg_type)
{
struct ifmcaddr6 *ifmca;
struct ifacaddr6 *ifaca;
@@ -4935,7 +4937,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
err = inet6_fill_ifaddr(skb, ifa,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
- RTM_NEWADDR,
+ msg_type,
NLM_F_MULTI);
if (err < 0)
break;
@@ -4952,7 +4954,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
err = inet6_fill_ifmcaddr(skb, ifmca,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
- RTM_GETMULTICAST,
+ msg_type,
NLM_F_MULTI);
if (err < 0)
break;
@@ -4967,7 +4969,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
err = inet6_fill_ifacaddr(skb, ifaca,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
- RTM_GETANYCAST,
+ msg_type,
NLM_F_MULTI);
if (err < 0)
break;
@@ -4982,7 +4984,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
}
static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb,
- enum addr_type_t type)
+ enum addr_type_t type, int msg_type)
{
struct net *net = sock_net(skb->sk);
int h, s_h;
@@ -5012,7 +5014,7 @@ static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb,
goto cont;
if (in6_dump_addrs(idev, skb, cb, type,
- s_ip_idx, &ip_idx) < 0)
+ s_ip_idx, &ip_idx, msg_type) < 0)
goto done;
cont:
idx++;
@@ -5029,16 +5031,34 @@ static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb,
static int inet6_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
{
- enum addr_type_t type = UNICAST_ADDR;
+ enum addr_type_t type;
+ int ret;
+
+ type = cb->args[3];
+ if (type == UNICAST_ADDR) {
+ ret = inet6_dump_addr(skb, cb, type, RTM_NEWADDR);
+ if (ret > 0)
+ goto done;
- return inet6_dump_addr(skb, cb, type);
+ /* reset indices and move on to multicast*/
+ cb->args[0] = 0;
+ cb->args[1] = 0;
+ cb->args[2] = 0;
+ type = MULTICAST_ADDR;
+ }
+
+ /* do the RTM_NEWADDR notifications for multicast type */
+ ret = inet6_dump_addr(skb, cb, type, RTM_NEWADDR);
+ done:
+ cb->args[3] = type;
+ return ret;
}
static int inet6_dump_ifmcaddr(struct sk_buff *skb, struct netlink_callback *cb)
{
enum addr_type_t type = MULTICAST_ADDR;
- return inet6_dump_addr(skb, cb, type);
+ return inet6_dump_addr(skb, cb, type, RTM_GETMULTICAST);
}
@@ -5046,7 +5066,7 @@ static int inet6_dump_ifacaddr(struct sk_buff *skb, struct netlink_callback *cb)
{
enum addr_type_t type = ANYCAST_ADDR;
- return inet6_dump_addr(skb, cb, type);
+ return inet6_dump_addr(skb, cb, type, RTM_GETANYCAST);
}
static int inet6_rtm_getaddr(struct sk_buff *in_skb, struct nlmsghdr *nlh,
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 4ae54aaca373..735fb6a8ad34 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -880,6 +880,67 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev,
return mc;
}
+static int fill_addr(struct sk_buff *skb, struct net_device *dev,
+ const struct in6_addr *addr, int type, unsigned int flags)
+{
+ struct nlmsghdr *nlh;
+ struct ifaddrmsg *ifm;
+ u8 scope = RT_SCOPE_UNIVERSE;
+
+ if (ipv6_addr_scope(addr) & IFA_SITE)
+ scope = RT_SCOPE_SITE;
+
+ nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
+ if (!nlh)
+ return -EMSGSIZE;
+
+ ifm = nlmsg_data(nlh);
+ ifm->ifa_family = AF_INET6;
+ ifm->ifa_prefixlen = 128;
+ ifm->ifa_flags = IFA_F_PERMANENT;
+ ifm->ifa_scope = scope;
+ ifm->ifa_index = dev->ifindex;
+
+ if (nla_put_in6_addr(skb, IFA_ADDRESS, addr))
+ goto nla_put_failure;
+ nlmsg_end(skb, nlh);
+ return 0;
+
+nla_put_failure:
+ nlmsg_cancel(skb, nlh);
+ return -EMSGSIZE;
+}
+
+static inline size_t addr_nlmsg_size(void)
+{
+ return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
+ + nla_total_size(sizeof(struct in6_addr));
+}
+
+static void ipv6_mc_addr_notify(struct net_device *dev,
+ const struct in6_addr *addr, int type)
+{
+ struct net *net = dev_net(dev);
+ struct sk_buff *skb;
+ int err = -ENOBUFS;
+
+ skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
+ if (!skb)
+ goto errout;
+
+ err = fill_addr(skb, dev, addr, type, 0);
+ if (err < 0) {
+ WARN_ON(err == -EMSGSIZE);
+ kfree_skb(skb);
+ goto errout;
+ }
+ rtnl_notify(skb, net, 0, RTNLGRP_IPV6_IFADDR, NULL, GFP_ATOMIC);
+ return;
+errout:
+ if (err < 0)
+ rtnl_set_sk_err(net, RTNLGRP_IPV6_IFADDR, err);
+}
+
/*
* device multicast group inc (add if not found)
*/
@@ -932,6 +993,9 @@ static int __ipv6_dev_mc_inc(struct net_device *dev,
mld_del_delrec(idev, mc);
igmp6_group_added(mc);
+
+ ipv6_mc_addr_notify(dev, addr, RTM_NEWADDR);
+
ma_put(mc);
return 0;
}
@@ -960,6 +1024,8 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr)
igmp6_group_dropped(ma);
ip6_mc_clear_src(ma);
+ ipv6_mc_addr_notify(idev->dev, addr,
+ RTM_DELADDR);
ma_put(ma);
return 0;
}
--
2.17.1
* Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
2018-08-31 11:20 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Patrick Ruddy
2018-08-31 11:20 ` [PATCH net-next v2 2/2] netlink: ipv6 MLD " Patrick Ruddy
@ 2018-08-31 16:29 ` Roopa Prabhu
2018-09-02 11:18 ` Patrick Ruddy
1 sibling, 1 reply; 24+ messages in thread
From: Roopa Prabhu @ 2018-08-31 16:29 UTC
To: Patrick Ruddy; +Cc: netdev, Jiří Pírko, Stephen Hemminger
On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
<pruddy@vyatta.att-mail.com> wrote:
> Some userspace applications need to know about IGMP joins from the kernel
> for 2 reasons
> 1. To allow the programming of multicast MAC filters in hardware
> 2. To form a multicast FORUS list for non link-local multicast
> groups to be sent to the kernel and from there to the interested
> party.
> (1) can be fulfilled by simply sending the hardware multicast MAC
> address to be programmed but (2) requires the L3 address to be sent
> since this cannot be constructed from the MAC address whereas the
> reverse translation is a standard library function.
>
> This commit provides addition and deletion of multicast addresses
> using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> the RTM_GETADDR extension to allow multicast join state to be read
> from the kernel.
>
> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> ---
> v2: fix kbuild warnings.
I am still going through the series, but AFAICT, user-space caches listening to
RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default?
>
> include/linux/igmp.h | 4 ++
> net/ipv4/devinet.c | 39 +++++++++++++------
> net/ipv4/igmp.c | 90 ++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 122 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/igmp.h b/include/linux/igmp.h
> index 119f53941c12..644a548024ed 100644
> --- a/include/linux/igmp.h
> +++ b/include/linux/igmp.h
> @@ -19,6 +19,8 @@
> #include <linux/timer.h>
> #include <linux/in.h>
> #include <linux/refcount.h>
> +#include <linux/netlink.h>
> +#include <linux/netdevice.h>
> #include <uapi/linux/igmp.h>
>
> static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
> @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *);
> extern void ip_mc_remap(struct in_device *);
> extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
> extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
> +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> + struct net_device *dev);
> int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
>
> #endif
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index ea4bd8a52422..42f7dcc4fb5e 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -57,6 +57,7 @@
> #endif
> #include <linux/kmod.h>
> #include <linux/netconf.h>
> +#include <linux/igmp.h>
>
> #include <net/arp.h>
> #include <net/ip.h>
> @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> int h, s_h;
> int idx, s_idx;
> int ip_idx, s_ip_idx;
> + int multicast, mcast_idx;
> struct net_device *dev;
> struct in_device *in_dev;
> struct in_ifaddr *ifa;
> @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> s_h = cb->args[0];
> s_idx = idx = cb->args[1];
> s_ip_idx = ip_idx = cb->args[2];
> + multicast = cb->args[3];
> + mcast_idx = cb->args[4];
>
> for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> idx = 0;
> @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> if (!in_dev)
> goto cont;
>
> - for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> - ifa = ifa->ifa_next, ip_idx++) {
> - if (ip_idx < s_ip_idx)
> - continue;
> - if (inet_fill_ifaddr(skb, ifa,
> - NETLINK_CB(cb->skb).portid,
> - cb->nlh->nlmsg_seq,
> - RTM_NEWADDR, NLM_F_MULTI) < 0) {
> - rcu_read_unlock();
> - goto done;
> + if (!multicast) {
> + for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> + ifa = ifa->ifa_next, ip_idx++) {
> + if (ip_idx < s_ip_idx)
> + continue;
> + if (inet_fill_ifaddr(skb, ifa,
> + NETLINK_CB(cb->skb).portid,
> + cb->nlh->nlmsg_seq,
> + RTM_NEWADDR,
> + NLM_F_MULTI) < 0) {
> + rcu_read_unlock();
> + goto done;
> + }
> + nl_dump_check_consistent(cb,
> + nlmsg_hdr(skb));
> }
> - nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> + /* set for multicast loop */
> + multicast++;
> + }
> + /* loop over multicast addresses */
> + if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
> + rcu_read_unlock();
> + goto done;
> }
> cont:
> idx++;
> @@ -1698,6 +1713,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> cb->args[0] = h;
> cb->args[1] = idx;
> cb->args[2] = ip_idx;
> + cb->args[3] = multicast;
> + cb->args[4] = mcast_idx;
>
> return skb->len;
> }
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index cf75f8944b05..c9bbd1d27124 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -86,6 +86,7 @@
> #include <linux/inetdevice.h>
> #include <linux/igmp.h>
> #include <linux/if_arp.h>
> +#include <net/netlink.h>
> #include <linux/rtnetlink.h>
> #include <linux/times.h>
> #include <linux/pkt_sched.h>
> @@ -1384,6 +1385,91 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
> }
>
>
> +static int fill_addr(struct sk_buff *skb, struct net_device *dev, __be32 addr,
> + int type, unsigned int flags)
> +{
> + struct nlmsghdr *nlh;
> + struct ifaddrmsg *ifm;
> +
> + nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
> + if (!nlh)
> + return -EMSGSIZE;
> +
> + ifm = nlmsg_data(nlh);
> + ifm->ifa_family = AF_INET;
> + ifm->ifa_prefixlen = 32;
> + ifm->ifa_flags = IFA_F_PERMANENT;
> + ifm->ifa_scope = RT_SCOPE_LINK;
> + ifm->ifa_index = dev->ifindex;
> +
> + if (nla_put_in_addr(skb, IFA_ADDRESS, addr))
> + goto nla_put_failure;
> + nlmsg_end(skb, nlh);
> + return 0;
> +
> +nla_put_failure:
> + nlmsg_cancel(skb, nlh);
> + return -EMSGSIZE;
> +}
> +
> +static inline size_t addr_nlmsg_size(void)
> +{
> + return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
> + + nla_total_size(sizeof(__be32));
> +}
> +
> +static void ip_mc_addr_notify(struct net_device *dev, __be32 addr, int type)
> +{
> + struct net *net = dev_net(dev);
> + struct sk_buff *skb;
> + int err = -ENOBUFS;
> +
> + skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
> + if (!skb)
> + goto errout;
> +
> + err = fill_addr(skb, dev, addr, type, 0);
> + if (err < 0) {
> + WARN_ON(err == -EMSGSIZE);
> + kfree_skb(skb);
> + goto errout;
> + }
> + rtnl_notify(skb, net, 0, RTNLGRP_IPV4_IFADDR, NULL, GFP_ATOMIC);
> + return;
> +errout:
> + if (err < 0)
> + rtnl_set_sk_err(net, RTNLGRP_LINK, err);
s/RTNLGRP_LINK/RTNLGRP_IPV4_IFADDR/
> +}
> +
> +int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> + struct net_device *dev)
> +{
> + int s_idx;
> + int idx = 0;
> + struct ip_mc_list *im;
> + struct in_device *in_dev;
> +
> + ASSERT_RTNL();
> +
> + s_idx = cb->args[4];
> + in_dev = __in_dev_get_rtnl(dev);
> +
> + for_each_pmc_rtnl(in_dev, im) {
> + if (idx < s_idx)
> + continue;
> + if (fill_addr(skb, dev, im->multiaddr, RTM_NEWADDR,
> + NLM_F_MULTI) < 0)
> + goto done;
> + nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> + idx++;
> + }
> +
> + done:
> + cb->args[4] = idx;
> +
> + return skb->len;
> +}
> +
> /*
> * A socket has joined a multicast group on device dev.
> */
> @@ -1433,6 +1519,8 @@ static void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
> igmpv3_del_delrec(in_dev, im);
> #endif
> igmp_group_added(im);
> +
> + ip_mc_addr_notify(in_dev->dev, addr, RTM_NEWADDR);
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
> out:
> @@ -1664,6 +1752,8 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
> in_dev->mc_count--;
> igmp_group_dropped(i);
> ip_mc_clear_src(i);
> + ip_mc_addr_notify(in_dev->dev, addr,
> + RTM_DELADDR);
>
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
> --
> 2.17.1
>
* Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
2018-08-31 16:29 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Roopa Prabhu
@ 2018-09-02 11:18 ` Patrick Ruddy
2018-09-03 23:12 ` Roopa Prabhu
0 siblings, 1 reply; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-02 11:18 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: netdev, Jiří Pírko, Stephen Hemminger
Hi Roopa
inline
thx
-pr
On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
> On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
> <pruddy@vyatta.att-mail.com> wrote:
> > Some userspace applications need to know about IGMP joins from the kernel
> > for 2 reasons
> > 1. To allow the programming of multicast MAC filters in hardware
> > 2. To form a multicast FORUS list for non link-local multicast
> > groups to be sent to the kernel and from there to the interested
> > party.
> > (1) can be fulfilled by simply sending the hardware multicast MAC
> > address to be programmed but (2) requires the L3 address to be sent
> > since this cannot be constructed from the MAC address whereas the
> > reverse translation is a standard library function.
> >
> > This commit provides addition and deletion of multicast addresses
> > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> > the RTM_GETADDR extension to allow multicast join state to be read
> > from the kernel.
> >
> > Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> > ---
> > v2: fix kbuild warnings.
>
> I am still going through the series, but AFAICT, user-space caches listening to
> RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
>
Yes, that's the crux of this change. It's unfortunate that I could not
use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
would be to create a set of new NEW/DEL/GETMULTICAST messages, but the
partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
slightly. Happy to look at it if you think that would be better.
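For illustration only, here is a minimal userspace sketch of how a
listener on RTNLGRP_IPV4_IFADDR could tell these entries apart from
unicast ones by looking at the address carried in IFA_ADDRESS. The
helper name ifaddr_msg_is_ipv4_mcast() is hypothetical and not part of
the patch; it assumes the layout produced by fill_addr() (an ifaddrmsg
followed by a 4-byte IFA_ADDRESS attribute).

```c
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch.
 * Returns 1 if the message carries an IPv4 multicast IFA_ADDRESS,
 * 0 if it carries a non-multicast one, and -1 if the message is not
 * AF_INET or has no usable IFA_ADDRESS attribute. */
static int ifaddr_msg_is_ipv4_mcast(struct nlmsghdr *nlh)
{
	struct ifaddrmsg *ifm;
	struct rtattr *rta;
	int len;

	if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifm)))
		return -1;
	ifm = NLMSG_DATA(nlh);
	if (ifm->ifa_family != AF_INET)
		return -1;

	len = IFA_PAYLOAD(nlh);
	for (rta = IFA_RTA(ifm); RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
		uint32_t addr;

		if (rta->rta_type != IFA_ADDRESS ||
		    RTA_PAYLOAD(rta) != sizeof(addr))
			continue;
		memcpy(&addr, RTA_DATA(rta), sizeof(addr));
		/* 224.0.0.0/4: the top four bits are 1110 */
		return (ntohl(addr) >> 28) == 0xe;
	}
	return -1;
}
```

A cache that only wants unicast addresses could drop any message for
which this returns 1, at the cost of every listener having to know
about the convention.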
>
> >
> > include/linux/igmp.h | 4 ++
> > net/ipv4/devinet.c | 39 +++++++++++++------
> > net/ipv4/igmp.c | 90 ++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 122 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/linux/igmp.h b/include/linux/igmp.h
> > index 119f53941c12..644a548024ed 100644
> > --- a/include/linux/igmp.h
> > +++ b/include/linux/igmp.h
> > @@ -19,6 +19,8 @@
> > #include <linux/timer.h>
> > #include <linux/in.h>
> > #include <linux/refcount.h>
> > +#include <linux/netlink.h>
> > +#include <linux/netdevice.h>
> > #include <uapi/linux/igmp.h>
> >
> > static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
> > @@ -130,6 +132,8 @@ extern void ip_mc_unmap(struct in_device *);
> > extern void ip_mc_remap(struct in_device *);
> > extern void ip_mc_dec_group(struct in_device *in_dev, __be32 addr);
> > extern void ip_mc_inc_group(struct in_device *in_dev, __be32 addr);
> > +extern int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> > + struct net_device *dev);
> > int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed);
> >
> > #endif
> > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> > index ea4bd8a52422..42f7dcc4fb5e 100644
> > --- a/net/ipv4/devinet.c
> > +++ b/net/ipv4/devinet.c
> > @@ -57,6 +57,7 @@
> > #endif
> > #include <linux/kmod.h>
> > #include <linux/netconf.h>
> > +#include <linux/igmp.h>
> >
> > #include <net/arp.h>
> > #include <net/ip.h>
> > @@ -1651,6 +1652,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> > int h, s_h;
> > int idx, s_idx;
> > int ip_idx, s_ip_idx;
> > + int multicast, mcast_idx;
> > struct net_device *dev;
> > struct in_device *in_dev;
> > struct in_ifaddr *ifa;
> > @@ -1659,6 +1661,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> > s_h = cb->args[0];
> > s_idx = idx = cb->args[1];
> > s_ip_idx = ip_idx = cb->args[2];
> > + multicast = cb->args[3];
> > + mcast_idx = cb->args[4];
> >
> > for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> > idx = 0;
> > @@ -1675,18 +1679,29 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> > if (!in_dev)
> > goto cont;
> >
> > - for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> > - ifa = ifa->ifa_next, ip_idx++) {
> > - if (ip_idx < s_ip_idx)
> > - continue;
> > - if (inet_fill_ifaddr(skb, ifa,
> > - NETLINK_CB(cb->skb).portid,
> > - cb->nlh->nlmsg_seq,
> > - RTM_NEWADDR, NLM_F_MULTI) < 0) {
> > - rcu_read_unlock();
> > - goto done;
> > + if (!multicast) {
> > + for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
> > + ifa = ifa->ifa_next, ip_idx++) {
> > + if (ip_idx < s_ip_idx)
> > + continue;
> > + if (inet_fill_ifaddr(skb, ifa,
> > + NETLINK_CB(cb->skb).portid,
> > + cb->nlh->nlmsg_seq,
> > + RTM_NEWADDR,
> > + NLM_F_MULTI) < 0) {
> > + rcu_read_unlock();
> > + goto done;
> > + }
> > + nl_dump_check_consistent(cb,
> > + nlmsg_hdr(skb));
> > }
> > - nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> > + /* set for multicast loop */
> > + multicast++;
> > + }
> > + /* loop over multicast addresses */
> > + if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
> > + rcu_read_unlock();
> > + goto done;
> > }
> > cont:
> > idx++;
> > @@ -1698,6 +1713,8 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
> > cb->args[0] = h;
> > cb->args[1] = idx;
> > cb->args[2] = ip_idx;
> > + cb->args[3] = multicast;
> > + cb->args[4] = mcast_idx;
> >
> > return skb->len;
> > }
> > diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> > index cf75f8944b05..c9bbd1d27124 100644
> > --- a/net/ipv4/igmp.c
> > +++ b/net/ipv4/igmp.c
> > @@ -86,6 +86,7 @@
> > #include <linux/inetdevice.h>
> > #include <linux/igmp.h>
> > #include <linux/if_arp.h>
> > +#include <net/netlink.h>
> > #include <linux/rtnetlink.h>
> > #include <linux/times.h>
> > #include <linux/pkt_sched.h>
> > @@ -1384,6 +1385,91 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
> > }
> >
> >
> > +static int fill_addr(struct sk_buff *skb, struct net_device *dev, __be32 addr,
> > + int type, unsigned int flags)
> > +{
> > + struct nlmsghdr *nlh;
> > + struct ifaddrmsg *ifm;
> > +
> > + nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
> > + if (!nlh)
> > + return -EMSGSIZE;
> > +
> > + ifm = nlmsg_data(nlh);
> > + ifm->ifa_family = AF_INET;
> > + ifm->ifa_prefixlen = 32;
> > + ifm->ifa_flags = IFA_F_PERMANENT;
> > + ifm->ifa_scope = RT_SCOPE_LINK;
> > + ifm->ifa_index = dev->ifindex;
> > +
> > + if (nla_put_in_addr(skb, IFA_ADDRESS, addr))
> > + goto nla_put_failure;
> > + nlmsg_end(skb, nlh);
> > + return 0;
> > +
> > +nla_put_failure:
> > + nlmsg_cancel(skb, nlh);
> > + return -EMSGSIZE;
> > +}
> > +
> > +static inline size_t addr_nlmsg_size(void)
> > +{
> > + return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
> > + + nla_total_size(sizeof(__be32));
> > +}
> > +
> > +static void ip_mc_addr_notify(struct net_device *dev, __be32 addr, int type)
> > +{
> > + struct net *net = dev_net(dev);
> > + struct sk_buff *skb;
> > + int err = -ENOBUFS;
> > +
> > + skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
> > + if (!skb)
> > + goto errout;
> > +
> > + err = fill_addr(skb, dev, addr, type, 0);
> > + if (err < 0) {
> > + WARN_ON(err == -EMSGSIZE);
> > + kfree_skb(skb);
> > + goto errout;
> > + }
> > + rtnl_notify(skb, net, 0, RTNLGRP_IPV4_IFADDR, NULL, GFP_ATOMIC);
> > + return;
> > +errout:
> > + if (err < 0)
> > + rtnl_set_sk_err(net, RTNLGRP_LINK, err);
>
>
> s/RTNLGRP_LINK/RTNLGRP_IPV4_IFADDR/
>
>
>
>
> > +}
> > +
> > +int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> > + struct net_device *dev)
> > +{
> > + int s_idx;
> > + int idx = 0;
> > + struct ip_mc_list *im;
> > + struct in_device *in_dev;
> > +
> > + ASSERT_RTNL();
> > +
> > + s_idx = cb->args[4];
> > + in_dev = __in_dev_get_rtnl(dev);
> > +
> > + for_each_pmc_rtnl(in_dev, im) {
> > + if (idx < s_idx)
> > + continue;
> > + if (fill_addr(skb, dev, im->multiaddr, RTM_NEWADDR,
> > + NLM_F_MULTI) < 0)
> > + goto done;
> > + nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> > + idx++;
> > + }
> > +
> > + done:
> > + cb->args[4] = idx;
> > +
> > + return skb->len;
> > +}
> > +
> > /*
> > * A socket has joined a multicast group on device dev.
> > */
> > @@ -1433,6 +1519,8 @@ static void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
> > igmpv3_del_delrec(in_dev, im);
> > #endif
> > igmp_group_added(im);
> > +
> > + ip_mc_addr_notify(in_dev->dev, addr, RTM_NEWADDR);
> > if (!in_dev->dead)
> > ip_rt_multicast_event(in_dev);
> > out:
> > @@ -1664,6 +1752,8 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
> > in_dev->mc_count--;
> > igmp_group_dropped(i);
> > ip_mc_clear_src(i);
> > + ip_mc_addr_notify(in_dev->dev, addr,
> > + RTM_DELADDR);
> >
> > if (!in_dev->dead)
> > ip_rt_multicast_event(in_dev);
> > --
> > 2.17.1
> >
* Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
2018-09-02 11:18 ` Patrick Ruddy
@ 2018-09-03 23:12 ` Roopa Prabhu
2018-09-04 7:54 ` Patrick Ruddy
2018-09-04 16:36 ` Patrick Ruddy
0 siblings, 2 replies; 24+ messages in thread
From: Roopa Prabhu @ 2018-09-03 23:12 UTC (permalink / raw)
To: Patrick Ruddy; +Cc: netdev, Jiří Pírko, Stephen Hemminger
On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy
<pruddy@vyatta.att-mail.com> wrote:
> Hi Roopa
>
> inline
>
> thx
>
> -pr
>
> On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
>> On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
>> <pruddy@vyatta.att-mail.com> wrote:
>> > Some userspace applications need to know about IGMP joins from the kernel
>> > for 2 reasons
>> > 1. To allow the programming of multicast MAC filters in hardware
>> > 2. To form a multicast FORUS list for non link-local multicast
>> > groups to be sent to the kernel and from there to the interested
>> > party.
>> > (1) can be fulfilled by simply sending the hardware multicast MAC
>> > address to be programmed but (2) requires the L3 address to be sent
>> > since this cannot be constructed from the MAC address whereas the
>> > reverse translation is a standard library function.
>> >
>> > This commit provides addition and deletion of multicast addresses
>> > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
>> > the RTM_GETADDR extension to allow multicast join state to be read
>> > from the kernel.
>> >
>> > Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
>> > ---
>> > v2: fix kbuild warnings.
>>
>> I am still going through the series, but AFAICT, user-space caches listening to
>> RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
>>
>
> Yes that's the crux of this change. It's unfortunate that I could not
> use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
> would be to create a set of new NEW/DEL/GETMULTICAST messages but the
> partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
> slightly. Happy to look at it if you think that would be better.
>
Yeah, true. Thinking about this some more, you are adding an interface
for multicast entries learnt via IGMP.
There is already a netlink channel for layer 2 multicast addresses learnt
via IGMP, and I can't see why that cannot be used: the RTM_*MDB messages.
They are currently only available for the bridge, but I have a requirement
for them to be available via a vxlan dev, so I am looking at making them
available on other devices.
Can you check whether the RTM_*MDB messages can be made to work for your case?
The reason I think it should be possible is that this is similar to
bridge fdb entries.
The bridge fdb API (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to
notify and dump netdev unicast addresses.
Similarly, I think the mdb API can be overloaded to notify and dump
netdev multicast addresses (statically added or learnt via IGMP).
* Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
2018-09-03 23:12 ` Roopa Prabhu
@ 2018-09-04 7:54 ` Patrick Ruddy
2018-09-04 16:36 ` Patrick Ruddy
1 sibling, 0 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-04 7:54 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: netdev, Jiří Pírko, Stephen Hemminger
On Mon, 2018-09-03 at 16:12 -0700, Roopa Prabhu wrote:
> On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy
> <pruddy@vyatta.att-mail.com> wrote:
> > Hi Roopa
> >
> > inline
> >
> > thx
> >
> > -pr
> >
> > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
> > > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
> > > <pruddy@vyatta.att-mail.com> wrote:
> > > > Some userspace applications need to know about IGMP joins from the kernel
> > > > for 2 reasons
> > > > 1. To allow the programming of multicast MAC filters in hardware
> > > > 2. To form a multicast FORUS list for non link-local multicast
> > > > groups to be sent to the kernel and from there to the interested
> > > > party.
> > > > (1) can be fulfilled by simply sending the hardware multicast MAC
> > > > address to be programmed but (2) requires the L3 address to be sent
> > > > since this cannot be constructed from the MAC address whereas the
> > > > reverse translation is a standard library function.
> > > >
> > > > This commit provides addition and deletion of multicast addresses
> > > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> > > > the RTM_GETADDR extension to allow multicast join state to be read
> > > > from the kernel.
> > > >
> > > > Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> > > > ---
> > > > v2: fix kbuild warnings.
> > >
> > > I am still going through the series, but AFAICT, user-space caches listening to
> > > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
> > >
> >
> > Yes that's the crux of this change. It's unfortunate that I could not
> > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
> > would be to create a set of new NEW/DEL/GETMULTICAST messages but the
> > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
> > slightly. Happy to look at it if you think that would be better.
> >
>
> yeah, true. Thinking about this some more, you are adding an interface
> for multicast entries learnt via igmp.
> There is already a netlink channel for layer2 mc addresses via igmp. I
> can't see why that cannot be used.
> It is RTM_*MDB msgs. It is currently only available for the bridge.
> But, I have a requirement for it to be
> available via a vxlan dev...so, I am looking at making it available on
> other devices.
>
> The reason I think it should be possible is because this is similar to
> bridge fdb entries.
> The bridge fdb api (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to
> notify and dump netdev unicast addresses.
> similarly I think the mdb api can be overloaded to notify and dump
> netdev multicast addresses (statically added or learnt via igmp)
OK, I'll take a look at this.
* Re: [PATCH net-next v2 1/2] netlink: ipv4 igmp join notifications
2018-09-03 23:12 ` Roopa Prabhu
2018-09-04 7:54 ` Patrick Ruddy
@ 2018-09-04 16:36 ` Patrick Ruddy
1 sibling, 0 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-04 16:36 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: netdev, Jiří Pírko, Stephen Hemminger
On Mon, 2018-09-03 at 16:12 -0700, Roopa Prabhu wrote:
> On Sun, Sep 2, 2018 at 4:18 AM, Patrick Ruddy
> <pruddy@vyatta.att-mail.com> wrote:
> > Hi Roopa
> >
> > inline
> >
> > thx
> >
> > -pr
> >
> > On Fri, 2018-08-31 at 09:29 -0700, Roopa Prabhu wrote:
> > > On Fri, Aug 31, 2018 at 4:20 AM, Patrick Ruddy
> > > <pruddy@vyatta.att-mail.com> wrote:
> > > > Some userspace applications need to know about IGMP joins from the kernel
> > > > for 2 reasons
> > > > 1. To allow the programming of multicast MAC filters in hardware
> > > > 2. To form a multicast FORUS list for non link-local multicast
> > > > groups to be sent to the kernel and from there to the interested
> > > > party.
> > > > (1) can be fulfilled by simply sending the hardware multicast MAC
> > > > address to be programmed but (2) requires the L3 address to be sent
> > > > since this cannot be constructed from the MAC address whereas the
> > > > reverse translation is a standard library function.
> > > >
> > > > This commit provides addition and deletion of multicast addresses
> > > > using the RTM_NEWADDR and RTM_DELADDR messages. It also provides
> > > > the RTM_GETADDR extension to allow multicast join state to be read
> > > > from the kernel.
> > > >
> > > > Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> > > > ---
> > > > v2: fix kbuild warnings.
> > >
> > > I am still going through the series, but AFAICT, user-space caches listening to
> > > RTNLGRP_IPV4_IFADDR will now also get multicast addresses by default ?
> > >
> >
> > Yes that's the crux of this change. It's unfortunate that I could not
> > use IFA_MULTICAST to distinguish the SAFI. I suppose the other option
> > would be to create a set of new NEW/DEL/GETMULTICAST messages but the
> > partial code for RTM_GETMULTICAST in ipv6/mcast.c complicates that
> > slightly. Happy to look at it if you think that would be better.
> >
>
> yeah, true. Thinking about this some more, you are adding an interface
> for multicast entries learnt via igmp.
> There is already a netlink channel for layer2 mc addresses via igmp. I
> can't see why that cannot be used.
> It is RTM_*MDB msgs. It is currently only available for the bridge.
> But, I have a requirement for it to be
> available via a vxlan dev...so, I am looking at making it available on
> other devices.
>
> Can you check if RTM_*MDB msgs can be made to work for your case ?.
>
> The reason I think it should be possible is because this is similar to
> bridge fdb entries.
> The bridge fdb api (RTM_NEWNEIGH with AF_BRIDGE) is overloaded to
> notify and dump netdev unicast addresses.
> similarly I think the mdb api can be overloaded to notify and dump
> netdev multicast addresses (statically added or learnt via igmp)
If I'm reading this correctly, I think overloading this channel is
possible.
What you're suggesting is overloading the RTM_*MDB messages with
AF_INET and AF_INET6 to carry the per-interface joined L3 multicast
addresses.
I've thrown together a quick test of this and it looks good. I can
polish this up and resubmit if you're happy with the approach. FWIW,
isolating the multicast addresses this way seems safer, and it's a
smaller patchset.
thx
-pr
* [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-08-30 9:35 ` [PATCH net-next 1/2] netlink: ipv4 IGMP " Patrick Ruddy
` (3 preceding siblings ...)
2018-08-31 11:20 ` [PATCH net-next v2 1/2] netlink: ipv4 igmp " Patrick Ruddy
@ 2018-09-06 9:10 ` Patrick Ruddy
2018-09-06 9:10 ` [PATCH net-next v3 2/2] netlink: ipv6 MLD " Patrick Ruddy
2018-09-07 3:40 ` [PATCH net-next v3 1/2] netlink: ipv4 igmp " Roopa Prabhu
4 siblings, 2 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-06 9:10 UTC (permalink / raw)
To: netdev; +Cc: roopa, jiri, stephen
Some userspace applications need to know about IGMP joins from the
kernel for 2 reasons:
1. To allow the programming of multicast MAC filters in hardware
2. To form a multicast FORUS list for non link-local multicast
groups to be sent to the kernel and from there to the interested
party.
(1) can be fulfilled by simply sending the hardware multicast MAC
address to be programmed, but (2) requires the L3 address to be sent,
since it cannot be reconstructed from the MAC address, whereas the
reverse translation (L3 address to MAC address) is a standard library
function.
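As a rough illustration of why the MAC address alone is not enough, the
sketch below shows the usual IPv4 group-to-Ethernet mapping (01:00:5e
followed by the low 23 bits of the group). The function names are
hypothetical and the code is standalone userspace C, not taken from the
patch. Because the upper bits of the group are discarded, several
groups share one MAC address and the mapping cannot be inverted.

```c
#include <stdint.h>
#include <string.h>

/* Map an IPv4 multicast group (4 bytes, network order) to its Ethernet
 * MAC address: 01:00:5e followed by the low 23 bits of the group. */
static void ipv4_mcast_to_mac(const uint8_t group[4], uint8_t mac[6])
{
	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = group[1] & 0x7f;	/* top bit of the second octet is dropped */
	mac[4] = group[2];
	mac[5] = group[3];
}

/* Return 1 if two groups map to the same MAC address, 0 otherwise. */
static int ipv4_mcast_same_mac(const uint8_t a[4], const uint8_t b[4])
{
	uint8_t mac_a[6], mac_b[6];

	ipv4_mcast_to_mac(a, mac_a);
	ipv4_mcast_to_mac(b, mac_b);
	return memcmp(mac_a, mac_b, sizeof(mac_a)) == 0;
}
```

For example, 224.1.1.1 and 225.129.1.1 both map to 01:00:5e:01:01:01,
so a consumer that is handed only the MAC cannot tell which group was
joined.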
This commit provides addition and deletion of multicast addresses
using the RTM_NEWMDB and RTM_DELMDB messages with AF_INET. It also
provides the RTM_GETMDB extension to allow multicast join state to
be read from the kernel.
Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
---
v3: rework to use RTM_*MDB messages as per review comments.
net/ipv4/igmp.c | 139 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 139 insertions(+)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 4da39446da2d..aed819e2ea93 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -86,6 +86,7 @@
#include <linux/inetdevice.h>
#include <linux/igmp.h>
#include <linux/if_arp.h>
+#include <net/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/times.h>
#include <linux/pkt_sched.h>
@@ -1385,6 +1386,91 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
}
+static int fill_addr(struct sk_buff *skb, struct net_device *dev, __be32 addr,
+ int type, unsigned int flags)
+{
+ struct nlmsghdr *nlh;
+ struct ifaddrmsg *ifm;
+
+ nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
+ if (!nlh)
+ return -EMSGSIZE;
+
+ ifm = nlmsg_data(nlh);
+ ifm->ifa_family = AF_INET;
+ ifm->ifa_prefixlen = 32;
+ ifm->ifa_flags = IFA_F_PERMANENT;
+ ifm->ifa_scope = RT_SCOPE_LINK;
+ ifm->ifa_index = dev->ifindex;
+
+ if (nla_put_in_addr(skb, IFA_ADDRESS, addr))
+ goto nla_put_failure;
+ nlmsg_end(skb, nlh);
+ return 0;
+
+nla_put_failure:
+ nlmsg_cancel(skb, nlh);
+ return -EMSGSIZE;
+}
+
+static inline size_t addr_nlmsg_size(void)
+{
+ return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
+ + nla_total_size(sizeof(__be32));
+}
+
+static void ip_mc_addr_notify(struct net_device *dev, __be32 addr, int type)
+{
+ struct net *net = dev_net(dev);
+ struct sk_buff *skb;
+ int err = -ENOBUFS;
+
+ skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
+ if (!skb)
+ goto errout;
+
+ err = fill_addr(skb, dev, addr, type, 0);
+ if (err < 0) {
+ WARN_ON(err == -EMSGSIZE);
+ kfree_skb(skb);
+ goto errout;
+ }
+ rtnl_notify(skb, net, 0, RTNLGRP_MDB, NULL, GFP_ATOMIC);
+ return;
+errout:
+ if (err < 0)
+ rtnl_set_sk_err(net, RTNLGRP_MDB, err);
+}
+
+int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
+ struct net_device *dev)
+{
+ int s_idx;
+ int idx = 0;
+ struct ip_mc_list *im;
+ struct in_device *in_dev;
+
+ ASSERT_RTNL();
+
+ s_idx = cb->args[2];
+ in_dev = __in_dev_get_rtnl(dev);
+
+ for_each_pmc_rtnl(in_dev, im) {
+ if (idx < s_idx)
+ continue;
+ if (fill_addr(skb, dev, im->multiaddr, RTM_NEWMDB,
+ NLM_F_MULTI) < 0)
+ goto done;
+ nl_dump_check_consistent(cb, nlmsg_hdr(skb));
+ idx++;
+ }
+
+ done:
+ cb->args[2] = idx;
+
+ return skb->len;
+}
+
/*
* A socket has joined a multicast group on device dev.
*/
@@ -1430,6 +1516,8 @@ static void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
igmpv3_del_delrec(in_dev, im);
#endif
igmp_group_added(im);
+
+ ip_mc_addr_notify(in_dev->dev, addr, RTM_NEWMDB);
if (!in_dev->dead)
ip_rt_multicast_event(in_dev);
out:
@@ -1661,6 +1749,8 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
in_dev->mc_count--;
igmp_group_dropped(i);
ip_mc_clear_src(i);
+ ip_mc_addr_notify(in_dev->dev, addr,
+ RTM_DELMDB);
if (!in_dev->dead)
ip_rt_multicast_event(in_dev);
@@ -3051,6 +3141,53 @@ static struct notifier_block igmp_notifier = {
.notifier_call = igmp_netdev_event,
};
+static int igmp_mc_dump_ifaddrs(struct sk_buff *skb,
+ struct netlink_callback *cb)
+{
+ struct net *net = sock_net(skb->sk);
+ int h, s_h;
+ int idx, s_idx;
+ struct net_device *dev;
+ struct in_device *in_dev;
+ struct hlist_head *head;
+
+ s_h = cb->args[0];
+ idx = cb->args[1];
+ s_idx = idx;
+
+ for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
+ idx = 0;
+ head = &net->dev_index_head[h];
+ rcu_read_lock();
+ cb->seq = atomic_read(&net->ipv4.dev_addr_genid) ^
+ net->dev_base_seq;
+ hlist_for_each_entry_rcu(dev, head, index_hlist) {
+ if (idx < s_idx)
+ goto cont;
+ if (h > s_h || idx > s_idx)
+ cb->args[2] = 0;
+ in_dev = __in_dev_get_rcu(dev);
+ if (!in_dev)
+ goto cont;
+
+ /* loop over multicast addresses */
+ if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
+ rcu_read_unlock();
+ goto done;
+ }
+cont:
+ idx++;
+ }
+ rcu_read_unlock();
+ }
+
+done:
+ cb->args[0] = h;
+ cb->args[1] = idx;
+
+ return skb->len;
+}
+
int __init igmp_mc_init(void)
{
#if defined(CONFIG_PROC_FS)
@@ -3064,6 +3201,8 @@ int __init igmp_mc_init(void)
goto reg_notif_fail;
return 0;
+ rtnl_register(PF_INET, RTM_GETMDB, NULL, igmp_mc_dump_ifaddrs, 0);
+
reg_notif_fail:
unregister_pernet_subsys(&igmp_net_ops);
return err;
--
2.17.1
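The dump callbacks above resume a partially completed dump from indices
saved in cb->args[]. A minimal standalone sketch of that skip-and-resume
idiom follows, using hypothetical names (struct dump_ctx, dump_entries)
and a plain array in place of the kernel lists; it is not the patch's
code. One point it tries to make explicit is that the running index has
to advance for skipped entries as well, which the for loop does on
continue.

```c
#include <stddef.h>

/* Hypothetical stand-in for the cb->args[] slot holding the resume index. */
struct dump_ctx {
	size_t next_idx;
};

/* Copy entries into out[] until it is full and record where to resume.
 * Returns the number of entries written by this call. */
static size_t dump_entries(const int *entries, size_t n_entries,
			   int *out, size_t out_cap, struct dump_ctx *ctx)
{
	size_t idx, written = 0;

	for (idx = 0; idx < n_entries; idx++) {
		if (idx < ctx->next_idx)
			continue;	/* already sent by an earlier call */
		if (written == out_cap)
			break;		/* buffer full: resume from idx next time */
		out[written++] = entries[idx];
	}
	ctx->next_idx = idx;
	return written;
}
```

Calling dump_entries() repeatedly with the same context walks the whole
array in buffer-sized pieces and returns 0 once everything has been
sent.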
* [PATCH net-next v3 2/2] netlink: ipv6 MLD join notifications
2018-09-06 9:10 ` [PATCH net-next v3 " Patrick Ruddy
@ 2018-09-06 9:10 ` Patrick Ruddy
2018-09-07 3:40 ` [PATCH net-next v3 1/2] netlink: ipv4 igmp " Roopa Prabhu
1 sibling, 0 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-06 9:10 UTC (permalink / raw)
To: netdev; +Cc: roopa, jiri, stephen
Some userspace applications need to know about MLD joins from the
kernel for 2 reasons:
1. To allow the programming of multicast MAC filters in hardware
2. To form a multicast FORUS list for non link-local multicast
groups to be sent to the kernel and from there to the interested
party.
(1) can be fulfilled by simply sending the hardware multicast MAC
address to be programmed, but (2) requires the L3 address to be sent,
since it cannot be reconstructed from the MAC address, whereas the
reverse translation (L3 address to MAC address) is a standard library
function.
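The IPv6 case has the same property. As a hedged sketch (standalone
userspace C with a hypothetical function name, not taken from the
patch), the usual mapping is 33:33 followed by the last four bytes of
the group address, so the other twelve bytes are lost:

```c
#include <stdint.h>
#include <string.h>

/* Map an IPv6 multicast group (16 bytes) to its Ethernet MAC address:
 * 33:33 followed by the last four bytes of the group address. */
static void ipv6_mcast_to_mac(const uint8_t group[16], uint8_t mac[6])
{
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &group[12], 4);
}
```

For example, ff02::1 and ff05::1 both map to 33:33:00:00:00:01.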
This commit provides addition and deletion of multicast addresses
using the RTM_NEWMDB and RTM_DELMDB messages with AF_INET6. It also
provides the RTM_GETMDB extension to allow multicast join state to
be read from the kernel.
Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
---
v3: rework to use RTM_*MDB messages as per review comments.
net/ipv6/addrconf.c | 34 ++++++++++++++++-------
net/ipv6/mcast.c | 66 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 90 insertions(+), 10 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d51a8c0b3372..d23955c21650 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4860,6 +4860,8 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca,
struct nlmsghdr *nlh;
u8 scope = RT_SCOPE_UNIVERSE;
int ifindex = ifmca->idev->dev->ifindex;
+ int addr_type = (event == RTM_GETMULTICAST) ? IFA_MULTICAST :
+ IFA_ADDRESS;
if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE)
scope = RT_SCOPE_SITE;
@@ -4869,7 +4871,7 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca,
return -EMSGSIZE;
put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex);
- if (nla_put_in6_addr(skb, IFA_MULTICAST, &ifmca->mca_addr) < 0 ||
+ if (nla_put_in6_addr(skb, addr_type, &ifmca->mca_addr) < 0 ||
put_cacheinfo(skb, ifmca->mca_cstamp, ifmca->mca_tstamp,
INFINITY_LIFE_TIME, INFINITY_LIFE_TIME) < 0) {
nlmsg_cancel(skb, nlh);
@@ -4916,7 +4918,7 @@ enum addr_type_t {
/* called with rcu_read_lock() */
static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
struct netlink_callback *cb, enum addr_type_t type,
- int s_ip_idx, int *p_ip_idx)
+ int s_ip_idx, int *p_ip_idx, int msg_type)
{
struct ifmcaddr6 *ifmca;
struct ifacaddr6 *ifaca;
@@ -4935,7 +4937,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
err = inet6_fill_ifaddr(skb, ifa,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
- RTM_NEWADDR,
+ msg_type,
NLM_F_MULTI);
if (err < 0)
break;
@@ -4952,7 +4954,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
err = inet6_fill_ifmcaddr(skb, ifmca,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
- RTM_GETMULTICAST,
+ msg_type,
NLM_F_MULTI);
if (err < 0)
break;
@@ -4967,7 +4969,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
err = inet6_fill_ifacaddr(skb, ifaca,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
- RTM_GETANYCAST,
+ msg_type,
NLM_F_MULTI);
if (err < 0)
break;
@@ -4982,7 +4984,7 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb,
}
static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb,
- enum addr_type_t type)
+ enum addr_type_t type, int msg_type)
{
struct net *net = sock_net(skb->sk);
int h, s_h;
@@ -5012,7 +5014,7 @@ static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb,
goto cont;
if (in6_dump_addrs(idev, skb, cb, type,
- s_ip_idx, &ip_idx) < 0)
+ s_ip_idx, &ip_idx, msg_type) < 0)
goto done;
cont:
idx++;
@@ -5031,14 +5033,22 @@ static int inet6_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
{
enum addr_type_t type = UNICAST_ADDR;
- return inet6_dump_addr(skb, cb, type);
+ return inet6_dump_addr(skb, cb, type, RTM_NEWADDR);
}
static int inet6_dump_ifmcaddr(struct sk_buff *skb, struct netlink_callback *cb)
{
enum addr_type_t type = MULTICAST_ADDR;
- return inet6_dump_addr(skb, cb, type);
+ return inet6_dump_addr(skb, cb, type, RTM_GETMULTICAST);
+}
+
+static int inet6_mdb_dump_ifmcaddr(struct sk_buff *skb,
+ struct netlink_callback *cb)
+{
+ enum addr_type_t type = MULTICAST_ADDR;
+
+ return inet6_dump_addr(skb, cb, type, RTM_NEWMDB);
}
@@ -5046,7 +5056,7 @@ static int inet6_dump_ifacaddr(struct sk_buff *skb, struct netlink_callback *cb)
{
enum addr_type_t type = ANYCAST_ADDR;
- return inet6_dump_addr(skb, cb, type);
+ return inet6_dump_addr(skb, cb, type, RTM_GETANYCAST);
}
static int inet6_rtm_getaddr(struct sk_buff *in_skb, struct nlmsghdr *nlh,
@@ -6774,6 +6784,10 @@ int __init addrconf_init(void)
NULL, inet6_dump_ifmcaddr, 0);
if (err < 0)
goto errout;
+ err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETMDB,
+ NULL, inet6_mdb_dump_ifmcaddr, 0);
+ if (err < 0)
+ goto errout;
err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETANYCAST,
NULL, inet6_dump_ifacaddr, 0);
if (err < 0)
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 4ae54aaca373..5108dbf73516 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -880,6 +880,67 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev,
return mc;
}
+static int fill_addr(struct sk_buff *skb, struct net_device *dev,
+ const struct in6_addr *addr, int type, unsigned int flags)
+{
+ struct nlmsghdr *nlh;
+ struct ifaddrmsg *ifm;
+ u8 scope = RT_SCOPE_UNIVERSE;
+
+ if (ipv6_addr_scope(addr) & IFA_SITE)
+ scope = RT_SCOPE_SITE;
+
+ nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
+ if (!nlh)
+ return -EMSGSIZE;
+
+ ifm = nlmsg_data(nlh);
+ ifm->ifa_family = AF_INET6;
+ ifm->ifa_prefixlen = 128;
+ ifm->ifa_flags = IFA_F_PERMANENT;
+ ifm->ifa_scope = scope;
+ ifm->ifa_index = dev->ifindex;
+
+ if (nla_put_in6_addr(skb, IFA_ADDRESS, addr))
+ goto nla_put_failure;
+ nlmsg_end(skb, nlh);
+ return 0;
+
+nla_put_failure:
+ nlmsg_cancel(skb, nlh);
+ return -EMSGSIZE;
+}
+
+static inline size_t addr_nlmsg_size(void)
+{
+ return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
+ + nla_total_size(sizeof(struct in6_addr));
+}
+
+static void ipv6_mc_addr_notify(struct net_device *dev,
+ const struct in6_addr *addr, int type)
+{
+ struct net *net = dev_net(dev);
+ struct sk_buff *skb;
+ int err = -ENOBUFS;
+
+ skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
+ if (!skb)
+ goto errout;
+
+ err = fill_addr(skb, dev, addr, type, 0);
+ if (err < 0) {
+ WARN_ON(err == -EMSGSIZE);
+ kfree_skb(skb);
+ goto errout;
+ }
+ rtnl_notify(skb, net, 0, RTNLGRP_MDB, NULL, GFP_ATOMIC);
+ return;
+errout:
+ if (err < 0)
+ rtnl_set_sk_err(net, RTNLGRP_MDB, err);
+}
+
/*
* device multicast group inc (add if not found)
*/
@@ -932,6 +993,9 @@ static int __ipv6_dev_mc_inc(struct net_device *dev,
mld_del_delrec(idev, mc);
igmp6_group_added(mc);
+
+ ipv6_mc_addr_notify(dev, addr, RTM_NEWMDB);
+
ma_put(mc);
return 0;
}
@@ -960,6 +1024,8 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr)
igmp6_group_dropped(ma);
ip6_mc_clear_src(ma);
+ ipv6_mc_addr_notify(idev->dev, addr,
+ RTM_DELMDB);
ma_put(ma);
return 0;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-06 9:10 ` [PATCH net-next v3 " Patrick Ruddy
2018-09-06 9:10 ` [PATCH net-next v3 2/2] netlink: ipv6 MLD " Patrick Ruddy
@ 2018-09-07 3:40 ` Roopa Prabhu
2018-09-13 17:03 ` Roopa Prabhu
1 sibling, 1 reply; 24+ messages in thread
From: Roopa Prabhu @ 2018-09-07 3:40 UTC (permalink / raw)
To: Patrick Ruddy; +Cc: netdev, Jiří Pírko, Stephen Hemminger
On Thu, Sep 6, 2018 at 2:10 AM, Patrick Ruddy
<pruddy@vyatta.att-mail.com> wrote:
> Some userspace applications need to know about IGMP joins from the
> kernel for 2 reasons:
> 1. To allow the programming of multicast MAC filters in hardware
> 2. To form a multicast FORUS list for non link-local multicast
> groups to be sent to the kernel and from there to the interested
> party.
> (1) can be fulfilled by simply sending the hardware multicast MAC
> address to be programmed but (2) requires the L3 address to be sent
> since this cannot be constructed from the MAC address whereas the
> reverse translation is a standard library function.
>
> This commit provides addition and deletion of multicast addresses
> using the RTM_NEWMDB and RTM_DELMDB messages with AF_INET. It also
> provides the RTM_GETMDB extension to allow multicast join state to
> be read from the kernel.
>
> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> ---
> v3 rework to use RTM_***MDB messages as per review comments.
Patrick, this version seems to be using RTM_***MDB msgs with the
RTM_*ADDR format.
We can't do that, because existing RTM_MDB users will be confused.
My request was to evaluate the RTM_***MDB msg format. See
nlmsg_populate_mdb_fill for details.
If you can wait a day or two, I can share some experimental code that
moves high-level RTM_*MDB msg handling into net/core/rtnetlink.c,
similar to RTM_*FDB.
>
> net/ipv4/igmp.c | 139 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 139 insertions(+)
>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 4da39446da2d..aed819e2ea93 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -86,6 +86,7 @@
> #include <linux/inetdevice.h>
> #include <linux/igmp.h>
> #include <linux/if_arp.h>
> +#include <net/netlink.h>
> #include <linux/rtnetlink.h>
> #include <linux/times.h>
> #include <linux/pkt_sched.h>
> @@ -1385,6 +1386,91 @@ static void ip_mc_hash_remove(struct in_device *in_dev,
> }
>
>
> +static int fill_addr(struct sk_buff *skb, struct net_device *dev, __be32 addr,
> + int type, unsigned int flags)
> +{
> + struct nlmsghdr *nlh;
> + struct ifaddrmsg *ifm;
> +
> + nlh = nlmsg_put(skb, 0, 0, type, sizeof(*ifm), flags);
> + if (!nlh)
> + return -EMSGSIZE;
> +
> + ifm = nlmsg_data(nlh);
> + ifm->ifa_family = AF_INET;
> + ifm->ifa_prefixlen = 32;
> + ifm->ifa_flags = IFA_F_PERMANENT;
> + ifm->ifa_scope = RT_SCOPE_LINK;
> + ifm->ifa_index = dev->ifindex;
> +
> + if (nla_put_in_addr(skb, IFA_ADDRESS, addr))
> + goto nla_put_failure;
> + nlmsg_end(skb, nlh);
> + return 0;
> +
> +nla_put_failure:
> + nlmsg_cancel(skb, nlh);
> + return -EMSGSIZE;
> +}
> +
> +static inline size_t addr_nlmsg_size(void)
> +{
> + return NLMSG_ALIGN(sizeof(struct ifaddrmsg))
> + + nla_total_size(sizeof(__be32));
> +}
> +
> +static void ip_mc_addr_notify(struct net_device *dev, __be32 addr, int type)
> +{
> + struct net *net = dev_net(dev);
> + struct sk_buff *skb;
> + int err = -ENOBUFS;
> +
> + skb = nlmsg_new(addr_nlmsg_size(), GFP_ATOMIC);
> + if (!skb)
> + goto errout;
> +
> + err = fill_addr(skb, dev, addr, type, 0);
> + if (err < 0) {
> + WARN_ON(err == -EMSGSIZE);
> + kfree_skb(skb);
> + goto errout;
> + }
> + rtnl_notify(skb, net, 0, RTNLGRP_MDB, NULL, GFP_ATOMIC);
> + return;
> +errout:
> + if (err < 0)
> + rtnl_set_sk_err(net, RTNLGRP_MDB, err);
> +}
> +
> +int ip_mc_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb,
> + struct net_device *dev)
> +{
> + int s_idx;
> + int idx = 0;
> + struct ip_mc_list *im;
> + struct in_device *in_dev;
> +
> + ASSERT_RTNL();
> +
> + s_idx = cb->args[2];
> + in_dev = __in_dev_get_rtnl(dev);
> +
> + for_each_pmc_rtnl(in_dev, im) {
> + if (idx < s_idx)
> + continue;
> + if (fill_addr(skb, dev, im->multiaddr, RTM_NEWMDB,
> + NLM_F_MULTI) < 0)
> + goto done;
> + nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> + idx++;
> + }
> +
> + done:
> + cb->args[2] = idx;
> +
> + return skb->len;
> +}
> +
> /*
> * A socket has joined a multicast group on device dev.
> */
> @@ -1430,6 +1516,8 @@ static void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr,
> igmpv3_del_delrec(in_dev, im);
> #endif
> igmp_group_added(im);
> +
> + ip_mc_addr_notify(in_dev->dev, addr, RTM_NEWMDB);
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
> out:
> @@ -1661,6 +1749,8 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
> in_dev->mc_count--;
> igmp_group_dropped(i);
> ip_mc_clear_src(i);
> + ip_mc_addr_notify(in_dev->dev, addr,
> + RTM_DELMDB);
>
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
> @@ -3051,6 +3141,53 @@ static struct notifier_block igmp_notifier = {
> .notifier_call = igmp_netdev_event,
> };
>
> +static int igmp_mc_dump_ifaddrs(struct sk_buff *skb,
> + struct netlink_callback *cb)
> +{
> + struct net *net = sock_net(skb->sk);
> + int h, s_h;
> + int idx, s_idx;
> + struct net_device *dev;
> + struct in_device *in_dev;
> + struct hlist_head *head;
> +
> + s_h = cb->args[0];
> + idx = cb->args[1];
> + s_idx = idx;
> +
> + for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> + idx = 0;
> + head = &net->dev_index_head[h];
> + rcu_read_lock();
> + cb->seq = atomic_read(&net->ipv4.dev_addr_genid) ^
> + net->dev_base_seq;
> + hlist_for_each_entry_rcu(dev, head, index_hlist) {
> + if (idx < s_idx)
> + goto cont;
> + if (h > s_h || idx > s_idx)
> + cb->args[2] = 0;
> + in_dev = __in_dev_get_rcu(dev);
> + if (!in_dev)
> + goto cont;
> +
> + /* loop over multicast addresses */
> + if (ip_mc_dump_ifaddr(skb, cb, dev) < 0) {
> + rcu_read_unlock();
> + goto done;
> + }
> +cont:
> + idx++;
> + }
> + rcu_read_unlock();
> + }
> +
> +done:
> + cb->args[0] = h;
> + cb->args[1] = idx;
> +
> + return skb->len;
> +}
> +
> int __init igmp_mc_init(void)
> {
> #if defined(CONFIG_PROC_FS)
> @@ -3064,6 +3201,8 @@ int __init igmp_mc_init(void)
> goto reg_notif_fail;
> return 0;
>
> + rtnl_register(PF_INET, RTM_GETMDB, NULL, igmp_mc_dump_ifaddrs, 0);
> +
> reg_notif_fail:
> unregister_pernet_subsys(&igmp_net_ops);
> return err;
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-07 3:40 ` [PATCH net-next v3 1/2] netlink: ipv4 igmp " Roopa Prabhu
@ 2018-09-13 17:03 ` Roopa Prabhu
2018-09-13 17:49 ` Patrick Ruddy
2018-09-18 13:12 ` Patrick Ruddy
0 siblings, 2 replies; 24+ messages in thread
From: Roopa Prabhu @ 2018-09-13 17:03 UTC (permalink / raw)
To: Patrick Ruddy
Cc: netdev, Jiří Pírko, Stephen Hemminger,
Nikolay Aleksandrov
On Thu, Sep 6, 2018 at 8:40 PM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> On Thu, Sep 6, 2018 at 2:10 AM, Patrick Ruddy
> <pruddy@vyatta.att-mail.com> wrote:
>> Some userspace applications need to know about IGMP joins from the
>> kernel for 2 reasons:
>> 1. To allow the programming of multicast MAC filters in hardware
>> 2. To form a multicast FORUS list for non link-local multicast
>> groups to be sent to the kernel and from there to the interested
>> party.
>> (1) can be fulfilled by simply sending the hardware multicast MAC
>> address to be programmed but (2) requires the L3 address to be sent
>> since this cannot be constructed from the MAC address whereas the
>> reverse translation is a standard library function.
>>
>> This commit provides addition and deletion of multicast addresses
>> using the RTM_NEWMDB and RTM_DELMDB messages with AF_INET. It also
>> provides the RTM_GETMDB extension to allow multicast join state to
>> be read from the kernel.
>>
>> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
>> ---
>> v3 rework to use RTM_***MDB messages as per review comments.
>
> Patrick, this version seems to be using RTM_***MDB msgs with the
> RTM_*ADDR format.
> We cant do that...because existing RTM_MDB users will be confused.
>
> My request was to evaluate RTM_***MDB msg format. see
> nlmsg_populate_mdb_fill for details.
>
> If you can wait a day or two I can share some experimental code that
> moves high level RTM_*MDB msg handling into net/core/rtnetlink.c
> similar to RTM_*FDB
>
I was trying to get a default per-interface (non-bridge) RTM_*MDB
working, but realized that the dev->mc entries are already dumped as
part of RTM_*FDB msgs instead of RTM_*MDB
(see net/core/rtnetlink.c:ndo_dflt_fdb_dump).
This throws another wrench in the works.
So that puts us back to your use of RTM_NEWADDR.
Instead of using IFA_ADDRESS, you could introduce a new attribute,
IFA_IGMP_MULTICAST (since IFA_MULTICAST is already taken).
To keep existing users of RTM_NEWADDR unaffected, I think you can use
the IPMR family with RTM_NEWADDR.
We can introduce a new notification group. (We could choose to add a new
family too, but that seems unnecessary.)
Since you only need dumps:
rtnl_register(RTNL_FAMILY_IPMR, RTM_GETADDR, NULL, igmp_rtm_dumpaddrs, 0);
For notifications, since we already have many variants for routes, I
don't see a problem adding similar addr variants, e.g.
RTNLGRP_IPV4_MCADDR
(Others on the list may have more feedback).
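For reference, a userspace dump request matching the suggestion above might be built roughly like this. It is only a sketch: it assumes the kernel registers an RTM_GETADDR dump handler for RTNL_FAMILY_IPMR as proposed, which no posted patch does yet, and the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: netlink header followed by ifaddrmsg. */
struct mcaddr_dump_req {
	struct nlmsghdr  nlh;
	struct ifaddrmsg ifm;
};

/* Fill in an RTM_GETADDR dump request for the IPMR pseudo-family. */
static void build_mcaddr_dump_req(struct mcaddr_dump_req *req, unsigned int seq)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len   = NLMSG_LENGTH(sizeof(req->ifm));
	req->nlh.nlmsg_type  = RTM_GETADDR;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->nlh.nlmsg_seq   = seq;
	/* The family selects the (proposed) multicast dump handler. */
	req->ifm.ifa_family  = RTNL_FAMILY_IPMR;
}
```

The request would then be sent on a NETLINK_ROUTE socket in the usual way; existing AF_INET RTM_GETADDR users are untouched because the handler is keyed on the family.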
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-13 17:03 ` Roopa Prabhu
@ 2018-09-13 17:49 ` Patrick Ruddy
2018-09-18 13:12 ` Patrick Ruddy
1 sibling, 0 replies; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-13 17:49 UTC (permalink / raw)
To: Roopa Prabhu
Cc: netdev, Jiří Pírko, Stephen Hemminger,
Nikolay Aleksandrov
On Thu, 2018-09-13 at 10:03 -0700, Roopa Prabhu wrote:
> On Thu, Sep 6, 2018 at 8:40 PM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> > On Thu, Sep 6, 2018 at 2:10 AM, Patrick Ruddy
> > <pruddy@vyatta.att-mail.com> wrote:
> > > Some userspace applications need to know about IGMP joins from the
> > > kernel for 2 reasons:
> > > 1. To allow the programming of multicast MAC filters in hardware
> > > 2. To form a multicast FORUS list for non link-local multicast
> > > groups to be sent to the kernel and from there to the interested
> > > party.
> > > (1) can be fulfilled by simply sending the hardware multicast MAC
> > > address to be programmed but (2) requires the L3 address to be sent
> > > since this cannot be constructed from the MAC address whereas the
> > > reverse translation is a standard library function.
> > >
> > > This commit provides addition and deletion of multicast addresses
> > > using the RTM_NEWMDB and RTM_DELMDB messages with AF_INET. It also
> > > provides the RTM_GETMDB extension to allow multicast join state to
> > > be read from the kernel.
> > >
> > > Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> > > ---
> > > v3 rework to use RTM_***MDB messages as per review comments.
> >
> > Patrick, this version seems to be using RTM_***MDB msgs with the
> > RTM_*ADDR format.
> > We cant do that...because existing RTM_MDB users will be confused.
> >
> > My request was to evaluate RTM_***MDB msg format. see
> > nlmsg_populate_mdb_fill for details.
> >
> > If you can wait a day or two I can share some experimental code that
> > moves high level RTM_*MDB msg handling into net/core/rtnetlink.c
> > similar to RTM_*FDB
> >
>
> I was trying to get a default per interface (non bridge) RTM_*MDB
> working, but realized that the dev->mc
> entries are already getting dumped as part of RTM_*FDB msgs instead of
> RTM_*MDB. (see net/core/rtnetlink.c:ndo_dflt_fdb_dump).
> This adds another wrench.
>
> so, that puts us back to your use of RTM_NEWADDR.
> Instead of using IFA_ADDRESS, you could introduce a new one
> IFA_IGMP_MULTICAST (since IFA_MULTICAST is already taken).
>
>
> To keep existing users of RTM_NEWADDR unaffected. I think you can use
> the IPMR family with RTM_NEWADDR.
> We can introduce new notification group. (We can choose to add a new
> family too, but that seems unnecessary)
>
> since you only need dumps:
> rtnl_register(RTNL_FAMILY_IPMR, RTM_GETADDR, NULL, igmp_rtm_dumpaddrs, 0);
>
> For notifications, since we already have many variants for routes, I
> don't see a problem adding similar addr variants
> RTNLGRP_IPV4_MCADDR
>
> (Others on the list may have more feedback).
Thanks for looking at this Roopa - I'll rehash as suggested.
-pr
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-13 17:03 ` Roopa Prabhu
2018-09-13 17:49 ` Patrick Ruddy
@ 2018-09-18 13:12 ` Patrick Ruddy
2018-09-20 4:47 ` David Ahern
1 sibling, 1 reply; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-18 13:12 UTC (permalink / raw)
To: Roopa Prabhu
Cc: netdev, Jiří Pírko, Stephen Hemminger,
Nikolay Aleksandrov
On Thu, 2018-09-13 at 10:03 -0700, Roopa Prabhu wrote:
> On Thu, Sep 6, 2018 at 8:40 PM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> > On Thu, Sep 6, 2018 at 2:10 AM, Patrick Ruddy
> > <pruddy@vyatta.att-mail.com> wrote:
> > > Some userspace applications need to know about IGMP joins from the
> > > kernel for 2 reasons:
> > > 1. To allow the programming of multicast MAC filters in hardware
> > > 2. To form a multicast FORUS list for non link-local multicast
> > > groups to be sent to the kernel and from there to the interested
> > > party.
> > > (1) can be fulfilled by simply sending the hardware multicast MAC
> > > address to be programmed but (2) requires the L3 address to be sent
> > > since this cannot be constructed from the MAC address whereas the
> > > reverse translation is a standard library function.
> > >
> > > This commit provides addition and deletion of multicast addresses
> > > using the RTM_NEWMDB and RTM_DELMDB messages with AF_INET. It also
> > > provides the RTM_GETMDB extension to allow multicast join state to
> > > be read from the kernel.
> > >
> > > Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com>
> > > ---
> > > v3 rework to use RTM_***MDB messages as per review comments.
> >
> > Patrick, this version seems to be using RTM_***MDB msgs with the
> > RTM_*ADDR format.
> > We cant do that...because existing RTM_MDB users will be confused.
> >
> > My request was to evaluate RTM_***MDB msg format. see
> > nlmsg_populate_mdb_fill for details.
> >
> > If you can wait a day or two I can share some experimental code that
> > moves high level RTM_*MDB msg handling into net/core/rtnetlink.c
> > similar to RTM_*FDB
> >
>
> I was trying to get a default per interface (non bridge) RTM_*MDB
> working, but realized that the dev->mc
> entries are already getting dumped as part of RTM_*FDB msgs instead of
> RTM_*MDB. (see net/core/rtnetlink.c:ndo_dflt_fdb_dump).
> This adds another wrench.
>
> so, that puts us back to your use of RTM_NEWADDR.
> Instead of using IFA_ADDRESS, you could introduce a new one
> IFA_IGMP_MULTICAST (since IFA_MULTICAST is already taken).
>
>
> To keep existing users of RTM_NEWADDR unaffected. I think you can use
> the IPMR family with RTM_NEWADDR.
> We can introduce new notification group. (We can choose to add a new
> family too, but that seems unnecessary)
>
> since you only need dumps:
> rtnl_register(RTNL_FAMILY_IPMR, RTM_GETADDR, NULL, igmp_rtm_dumpaddrs, 0);
>
> For notifications, since we already have many variants for routes, I
> don't see a problem adding similar addr variants
> RTNLGRP_IPV4_MCADDR
>
> (Others on the list may have more feedback).
I've hit a small snag with adding the new groups. The number of defined
groups currently sits at 31, so I can only add one before hitting the
limit defined by the 32-bit groups bitmask in sockaddr_nl. I can use 1
group for both v4 and v6 notifications, which seems like the sensible
option since the AF is carried separately, but it breaks the precedent
where there are separate IPV4 and IPV6 groups for IFADDR.
I have the combined group patches ready and can share them if that's
the preference.
Has there been any previous discussion about extending the number of
available groups?
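To spell out the arithmetic behind that limit, here is a small sketch. It assumes the usual netlink convention that group N is selected by bit N-1 of the 32-bit nl_groups field passed to bind(); the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NL_GROUPS_BITS 32u	/* width of sockaddr_nl.nl_groups */

/* True if the group can be requested through the bind() bitmask at all. */
static bool group_fits_in_bind_mask(unsigned int group)
{
	return group >= 1 && group <= NL_GROUPS_BITS;
}

/* Bit to set in nl_groups for a given group number (group N -> bit N-1). */
static uint32_t group_to_bind_mask(unsigned int group)
{
	assert(group_fits_in_bind_mask(group));
	return UINT32_C(1) << (group - 1);
}
```

Under that convention group 31 maps to 0x40000000 and a group 32 would take the last bit, 0x80000000; anything numbered 33 or higher cannot be expressed in the bind() mask.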
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-18 13:12 ` Patrick Ruddy
@ 2018-09-20 4:47 ` David Ahern
2018-09-25 9:34 ` Patrick Ruddy
0 siblings, 1 reply; 24+ messages in thread
From: David Ahern @ 2018-09-20 4:47 UTC (permalink / raw)
To: pruddy, Roopa Prabhu
Cc: netdev, Jiří Pírko, Stephen Hemminger,
Nikolay Aleksandrov
On 9/18/18 6:12 AM, Patrick Ruddy wrote:
>
> I've hit a small snag with adding the new groups. The number of defined
> groups currently sits at 31 so I can only add one before hitting the
I believe you have no more available. RTNLGRP_* has been defined from 0
(RTNLGRP_NONE) to 31 (RTNLGRP_IPV6_MROUTE_R) which covers the u32 range.
> limit defined by the 32 bit groups bitmask in sockaddr_nl. I can use 1
> group for both v4 and v6 notifications which seems like the sensible
> options since the AF is carried separately, but it breaks the precedent
> where there are separate IPV4 and IPV6 groups for IFADDR.
>
> I have the combined group patches ready and can share them if that's
> the preference.
>
> Has there been any previous discussion about extending the number of
> available groups?
>
I have not tried it, but from a prior code review I believe you have to
use setsockopt to add groups > 31.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-20 4:47 ` David Ahern
@ 2018-09-25 9:34 ` Patrick Ruddy
2018-09-26 17:23 ` Roopa Prabhu
0 siblings, 1 reply; 24+ messages in thread
From: Patrick Ruddy @ 2018-09-25 9:34 UTC (permalink / raw)
To: David Ahern, Roopa Prabhu
Cc: netdev, Jiří Pírko, Stephen Hemminger,
Nikolay Aleksandrov
On Wed, 2018-09-19 at 21:47 -0700, David Ahern wrote:
> On 9/18/18 6:12 AM, Patrick Ruddy wrote:
> >
> > I've hit a small snag with adding the new groups. The number of defined
> > groups currently sits at 31 so I can only add one before hitting the
>
> I believe you have no more available. RTNLGRP_* has been defined from 0
> (RTNLGRP_NONE) to 31 (RTNLGRP_IPV6_MROUTE_R) which covers the u32 range.
>
> > limit defined by the 32 bit groups bitmask in sockaddr_nl. I can use 1
> > group for both v4 and v6 notifications which seems like the sensible
> > options since the AF is carried separately, but it breaks the precedent
> > where there are separate IPV4 and IPV6 groups for IFADDR.
> >
> > I have the combined group patches ready and can share them if that's
> > the preference.
> >
> > Has there been any previous discussion about extending the number of
> > available groups?
> >
>
> I have not tried it, but from a prior code review I believe you have to
> use setsockopt to add groups > 31.
I can certainly join the new groups using setsockopt and
NETLINK_ADD_MEMBERSHIP.
I can't see any examples of extending the defined group list within the
kernel, so I assume I just add to the RTNLGRP enum list with a suitable
comment to indicate that later groups must be joined with the mechanism
above. Or am I missing some other way of dynamically adding groups?
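For anyone following along, joining by group number rather than by bitmask looks roughly like this from userspace. It is a minimal sketch: join_nl_group is a made-up helper name, and the kernel has to know about the group number for the call to succeed.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270	/* fallback for older libc headers */
#endif

/*
 * Join a netlink multicast group by number on an already-open netlink
 * socket. Unlike the nl_groups bitmask passed to bind(), the option
 * value is the group number itself, so it is not capped at 32.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int join_nl_group(int fd, unsigned int group)
{
	assert(group != 0);	/* group 0 means "no group" */
	return setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
			  &group, sizeof(group));
}
```

A listener would call this once per group after socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE), passing whatever number the new RTNLGRP_* entry ends up with.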
thanks
-pr
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-25 9:34 ` Patrick Ruddy
@ 2018-09-26 17:23 ` Roopa Prabhu
2018-10-01 15:38 ` Roopa Prabhu
0 siblings, 1 reply; 24+ messages in thread
From: Roopa Prabhu @ 2018-09-26 17:23 UTC (permalink / raw)
To: Patrick Ruddy
Cc: David Ahern, netdev, Jiří Pírko,
Stephen Hemminger, Nikolay Aleksandrov
On Tue, Sep 25, 2018 at 2:34 AM, Patrick Ruddy
<pruddy@vyatta.att-mail.com> wrote:
> On Wed, 2018-09-19 at 21:47 -0700, David Ahern wrote:
>> On 9/18/18 6:12 AM, Patrick Ruddy wrote:
>> >
>> > I've hit a small snag with adding the new groups. The number of defined
>> > groups currently sits at 31 so I can only add one before hitting the
>>
>> I believe you have no more available. RTNLGRP_* has been defined from 0
>> (RTNLGRP_NONE) to 31 (RTNLGRP_IPV6_MROUTE_R) which covers the u32 range.
>>
>> > limit defined by the 32 bit groups bitmask in sockaddr_nl. I can use 1
>> > group for both v4 and v6 notifications which seems like the sensible
>> > options since the AF is carried separately, but it breaks the precedent
>> > where there are separate IPV4 and IPV6 groups for IFADDR.
>> >
>> > I have the combined group patches ready and can share them if that's
>> > the preference.
>> >
>> > Has there been any previous discussion about extending the number of
>> > available groups?
>> >
>>
>> I have not tried it, but from a prior code review I believe you have to
>> use setsockopt to add groups > 31.
>
> I can certainly join the new groups using setsockopt and
> NETLINK_ADD_MEMBERSHIP.
> I can't see any examples of extending the defined group list within the
> kernel so I assume I just add to the RTNLGRP enum list with a suitable
> comment to indicate that later groups must be joined with the mechanism
> above or am I missing some other way of dynamically adding groups?
>
From a quick look, there are other subsystem-specific groups
(xfrm_nlgroups, nfnetlink_groups, ...) which I see apps joining using
NETLINK_ADD_MEMBERSHIP.
That seems like overkill for your case.
Yet another option to consider:
use family RTNL_FAMILY_IPMR / RTNL_FAMILY_IP6MR with RTM_GETADDR/DELADDR
and use the existing groups RTNLGRP_IPV4_IFADDR / RTNLGRP_IPV6_IFADDR
(please check whether this will break any existing users).
The precedent is the ipmr fib rules.
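To make the "will this break existing users" question concrete: a listener on the existing IFADDR groups would have to look at ifa_family to tell the two kinds of message apart. A sketch of such a check follows, assuming the multicast notifications carry the IPMR/IP6MR pseudo-families as proposed; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

enum addr_msg_kind {
	ADDR_MSG_OTHER,		/* not an address message, or unknown family */
	ADDR_MSG_IFADDR,	/* ordinary interface address */
	ADDR_MSG_MCAST_JOIN,	/* multicast join/leave (proposed encoding) */
};

/* Classify an RTM_NEWADDR/RTM_DELADDR message by its ifa_family. */
static enum addr_msg_kind classify_addr_msg(const struct nlmsghdr *nlh)
{
	const struct ifaddrmsg *ifm;

	assert(nlh != NULL);
	if (nlh->nlmsg_type != RTM_NEWADDR && nlh->nlmsg_type != RTM_DELADDR)
		return ADDR_MSG_OTHER;
	if (nlh->nlmsg_len < (unsigned int)NLMSG_LENGTH(sizeof(*ifm)))
		return ADDR_MSG_OTHER;	/* truncated header */

	ifm = NLMSG_DATA(nlh);
	switch (ifm->ifa_family) {
	case AF_INET:
	case AF_INET6:
		return ADDR_MSG_IFADDR;
	case RTNL_FAMILY_IPMR:
	case RTNL_FAMILY_IP6MR:
		return ADDR_MSG_MCAST_JOIN;
	default:
		return ADDR_MSG_OTHER;
	}
}
```

Applications that already switch on the family would simply ignore the new messages; ones that assume every RTM_NEWADDR on these groups is AF_INET or AF_INET6 are the ones at risk.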
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH net-next v3 1/2] netlink: ipv4 igmp join notifications
2018-09-26 17:23 ` Roopa Prabhu
@ 2018-10-01 15:38 ` Roopa Prabhu
0 siblings, 0 replies; 24+ messages in thread
From: Roopa Prabhu @ 2018-10-01 15:38 UTC (permalink / raw)
To: Patrick Ruddy
Cc: David Ahern, netdev, Jiří Pírko,
Stephen Hemminger, Nikolay Aleksandrov
On Wed, Sep 26, 2018 at 10:23 AM Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>
> On Tue, Sep 25, 2018 at 2:34 AM, Patrick Ruddy
> <pruddy@vyatta.att-mail.com> wrote:
> > On Wed, 2018-09-19 at 21:47 -0700, David Ahern wrote:
> >> On 9/18/18 6:12 AM, Patrick Ruddy wrote:
> >> >
> >> > I've hit a small snag with adding the new groups. The number of defined
> >> > groups currently sits at 31 so I can only add one before hitting the
> >>
> >> I believe you have no more available. RTNLGRP_* has been defined from 0
> >> (RTNLGRP_NONE) to 31 (RTNLGRP_IPV6_MROUTE_R) which covers the u32 range.
> >>
> >> > limit defined by the 32 bit groups bitmask in sockaddr_nl. I can use 1
> >> > group for both v4 and v6 notifications which seems like the sensible
> >> > options since the AF is carried separately, but it breaks the precedent
> >> > where there are separate IPV4 and IPV6 groups for IFADDR.
> >> >
> >> > I have the combined group patches ready and can share them if that's
> >> > the preference.
> >> >
> >> > Has there been any previous discussion about extending the number of
> >> > available groups?
> >> >
> >>
> >> I have not tried it, but from a prior code review I believe you have to
> >> use setsockopt to add groups > 31.
> >
> > I can certainly join the new groups using setsockopt and
> > NETLINK_ADD_MEMBERSHIP.
> > I can't see any examples of extending the defined group list within the
> > kernel so I assume I just add to the RTNLGRP enum list with a suitable
> > comment to indicate that later groups must be joined with the mechanism
> > above or am I missing some other way of dynamically adding groups?
> >
>
> With a quick look, there are other subsystem specific groups:
> xfrm_nlgroups, nfnetlink_groups ...which i see apps registering using
> NETLINK_ADD_MEMBERSHIP.
>
Scratch that. These groups are for different netlink protocols, and the
limit on netlink groups per protocol seems to be 32.
We seem to have hit the max on groups for the NETLINK_ROUTE protocol. We
will have to rework the group handling
to make room for more groups. We do need room for more groups in the
future, and not just for this patchset.
> seems like an overkill to add something like this for your case.
>
> yet another option to consider:
> use family: RTNL_FAMILY_IPMR/ RTNL_FAMILY_IP6MR with RTM_GETADDR/DELADDR
> and use the existing groups: RTNLGRP_IPV4_IFADDR / RTNLGRP_IPV6_IFADDR
>
> (pls check if this will break any existing users)
>
> precedence is ipmr fib rules.
^ permalink raw reply [flat|nested] 24+ messages in thread