* [PATCH] Prefix List against 2.5.70 (re-done)
@ 2003-06-20 20:53 Krishna Kumar
  2003-06-21 14:36 ` YOSHIFUJI Hideaki / 吉藤英明
From: Krishna Kumar @ 2003-06-20 20:53 UTC
  To: davem, kuznet; +Cc: netdev, linux-net

Hi,

The earlier patch to implement the prefix list has been redone to use fib.
Following are the implementation details and a couple of issues :

1. I changed netlink_dump_start() to take another parameter, <type>, which is
    stored in a new field of the cb, also called <type>. All callers have been
    changed to pass -1 since they don't care about the type, except the
    generic routine rtnetlink_rcv_msg(), which calculates the type and passes it.
    So the same routine that dumps the route table can also dump
    the prefix list by checking the type. It might be possible to derive the
    type from the table offset, but that is more complicated (probably doable).
2. Added yoshifuji's patch to store the M/O flags (now it is needed).
3. Added user interface for retrieving M/O flags. This is a separate interface
    from the one for getting the prefix list since the flags are per interface
    while the prefix list is per route. However these two can be merged into one
    if needed.
4. Changed the usage of RTF_ADDRCONF to be used only when the action is being
    performed due to receipt of a RA.
5. Though this patch is modified to use only the routing table for updating and
    accessing the prefix list, I did a performance analysis of this approach vs
    storing the plist on the idev. Following is the result :

    System : 1 CPU, 866 MHz, 256MB memory
    For 1000 VLAN devices (4036 route entries get created automatically as part
    of address assignment), retrieve the prefix list (system times only) :

    #devices  #iterations per dev      plist on IDEV    plist in RTTABLE    % slower
    200          100                   3.95 secs        40.14 secs          916%
    1000         10                    2.60 secs        20.98 secs          706%
    200          1000                  38.44 secs       400.76 secs         942%

6. I have kept #ifdef CONFIG_IPV6_PREFIXLIST in a few places; I can modify the
    patch to remove that if required.
7. I removed the /proc interface since I was not able to cleanly use seq_file
    with fib6_walk(). If needed, I can work on this later (but will need some
    input on how to proceed). So currently, the only user interface is using
    rtnetlink.
8. The patch can be extended to issue events on new prefix addition and on
    prefix deletion. I can do that if required.
9. I have tested using rtnetlink for both interfaces (prefix list and get O/M
    flags), no issues found.
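
For illustration only, a minimal userspace sketch of how a client might walk
the reply to the prefix list request. It assumes each prefix arrives as one
netlink message carrying a struct in6_prefix_msg and that the dump ends with
NLMSG_DONE. collect_prefixes() and PLIST_MSG_TYPE are hypothetical names, and
the struct is redeclared locally because it exists only in this patch.

```c
#include <string.h>
#include <netinet/in.h>
#include <linux/netlink.h>

/* Same layout as struct in6_prefix_msg in the patch; redeclared here
 * because it is not part of any installed header (assumption). */
struct in6_prefix_msg {
	int ifindex;
	int prefix_len;
	struct in6_addr prefix;
};

/* Message type proposed by the patch: RTM_BASE (16) + 38. Hypothetical name. */
#define PLIST_MSG_TYPE (16 + 38)

/*
 * Walk a buffer of netlink messages and copy up to max prefix records
 * into out[]. Returns the number of records copied. The buffer must be
 * aligned suitably for struct nlmsghdr.
 */
static int collect_prefixes(const void *buf, int len,
			    struct in6_prefix_msg *out, int max)
{
	const struct nlmsghdr *nlh = buf;
	int n = 0;

	for (; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
		if (nlh->nlmsg_type == NLMSG_DONE)
			break;		/* end of the dump */
		if (nlh->nlmsg_type != PLIST_MSG_TYPE)
			continue;	/* not a prefix record */
		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*out)))
			continue;	/* payload too short, skip it */
		if (n < max)
			memcpy(&out[n++], NLMSG_DATA(nlh), sizeof(*out));
	}
	return n;
}
```

A caller would pass the bytes returned by recv() on the rtnetlink socket and
repeat until a message of type NLMSG_DONE is seen.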

Please let me know if this looks acceptable, in which case I can also send the 
patch for 2.4 kernel.

Thanks,

- KK

diff -ruN linux-2.5.70.org/include/linux/ipv6_route.h linux-2.5.70.new/include/linux/ipv6_route.h
--- linux-2.5.70.org/include/linux/ipv6_route.h	2003-05-26 18:00:25.000000000 -0700
+++ linux-2.5.70.new/include/linux/ipv6_route.h	2003-06-20 01:45:17.000000000 -0700
@@ -44,4 +44,16 @@
  #define RTMSG_NEWROUTE		0x21
  #define RTMSG_DELROUTE		0x22

+#ifdef CONFIG_IPV6_PREFIXLIST
+
+/* Structure to return prefix and prefix length for all devices */
+
+struct in6_prefix_msg
+{
+	int ifindex;
+	int prefix_len;
+	struct in6_addr prefix;
+};
+#endif
+
  #endif
diff -ruN linux-2.5.70.org/include/linux/netlink.h linux-2.5.70.new/include/linux/netlink.h
--- linux-2.5.70.org/include/linux/netlink.h	2003-05-26 18:00:56.000000000 -0700
+++ linux-2.5.70.new/include/linux/netlink.h	2003-06-20 05:00:47.000000000 -0700
@@ -132,6 +132,7 @@
  	int		(*dump)(struct sk_buff * skb, struct netlink_callback *cb);
  	int		(*done)(struct netlink_callback *cb);
  	int		family;
+	int		type;	/* for overloading functions */
  	long		args[4];
  };

@@ -161,7 +162,7 @@
     __nlmsg_put(skb, pid, seq, type, len); })

  extern int netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
-			      struct nlmsghdr *nlh,
+			      struct nlmsghdr *nlh, int type,
  			      int (*dump)(struct sk_buff *skb, struct netlink_callback*),
  			      int (*done)(struct netlink_callback*));

diff -ruN linux-2.5.70.org/include/linux/rtnetlink.h linux-2.5.70.new/include/linux/rtnetlink.h
--- linux-2.5.70.org/include/linux/rtnetlink.h	2003-05-26 18:00:46.000000000 -0700
+++ linux-2.5.70.new/include/linux/rtnetlink.h	2003-06-20 01:36:19.000000000 -0700
@@ -47,7 +47,14 @@
  #define	RTM_DELTFILTER	(RTM_BASE+29)
  #define	RTM_GETTFILTER	(RTM_BASE+30)

-#define	RTM_MAX		(RTM_BASE+31)
+#define	RTM_GETOMFLAGS	(RTM_BASE+34)
+
+#ifndef CONFIG_IPV6_PREFIXLIST
+#define	RTM_MAX		(RTM_GETOMFLAGS+1)
+#else
+#define	RTM_GETPLIST	(RTM_BASE+38)
+#define	RTM_MAX		(RTM_GETPLIST+1)
+#endif

  /*
     Generic structure for encapsulation optional route information.
@@ -61,6 +68,14 @@
  	unsigned short	rta_type;
  };

+/* Structure to return per interface device flags */
+
+struct ifp_if6info
+{
+	int ifindex;
+	int flags;
+};
+
  /* Macros to handle rtattributes */

  #define RTA_ALIGNTO	4
@@ -201,9 +216,10 @@
  	RTA_FLOW,
  	RTA_CACHEINFO,
  	RTA_SESSION,
+	RTA_RA6INFO,	/* No support yet, send event on prefix event */
  };

-#define RTA_MAX RTA_SESSION
+#define RTA_MAX RTA_RA6INFO

  #define RTM_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct rtmsg))))
  #define RTM_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct rtmsg))
diff -ruN linux-2.5.70.org/include/net/if_inet6.h linux-2.5.70.new/include/net/if_inet6.h
--- linux-2.5.70.org/include/net/if_inet6.h	2003-05-26 18:00:59.000000000 -0700
+++ linux-2.5.70.new/include/net/if_inet6.h	2003-06-20 02:01:39.000000000 -0700
@@ -17,6 +17,9 @@

  #include <net/snmp.h>

+/* inet6_dev.if_flags */
+#define IF_RA_OTHERCONF	0x80
+#define IF_RA_MANAGED	0x40
  #define IF_RA_RCVD	0x20
  #define IF_RS_SENT	0x10

diff -ruN linux-2.5.70.org/net/core/rtnetlink.c linux-2.5.70.new/net/core/rtnetlink.c
--- linux-2.5.70.org/net/core/rtnetlink.c	2003-05-26 18:01:03.000000000 -0700
+++ linux-2.5.70.new/net/core/rtnetlink.c	2003-06-19 06:05:34.000000000 -0700
@@ -380,7 +380,7 @@
  		if (link->dumpit == NULL)
  			goto err_inval;

-		if ((*errp = netlink_dump_start(rtnl, skb, nlh,
+		if ((*errp = netlink_dump_start(rtnl, skb, nlh, type,
  						link->dumpit,
  						rtnetlink_done)) != 0) {
  			return -1;
diff -ruN linux-2.5.70.org/net/ipv4/tcp_diag.c linux-2.5.70.new/net/ipv4/tcp_diag.c
--- linux-2.5.70.org/net/ipv4/tcp_diag.c	2003-05-26 18:00:20.000000000 -0700
+++ linux-2.5.70.new/net/ipv4/tcp_diag.c	2003-06-19 06:09:45.000000000 -0700
@@ -591,7 +591,7 @@
  			if (tcpdiag_bc_audit(RTA_DATA(rta), RTA_PAYLOAD(rta)))
  				goto err_inval;
  		}
-		return netlink_dump_start(tcpnl, skb, nlh,
+		return netlink_dump_start(tcpnl, skb, nlh, -1,
  					  tcpdiag_dump,
  					  tcpdiag_dump_done);
  	} else {
diff -ruN linux-2.5.70.org/net/ipv6/Kconfig linux-2.5.70.new/net/ipv6/Kconfig
--- linux-2.5.70.org/net/ipv6/Kconfig	2003-05-26 18:00:40.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/Kconfig	2003-06-19 05:37:11.000000000 -0700
@@ -42,4 +42,13 @@

  	  If unsure, say Y.

+config IPV6_PREFIXLIST
+	bool "IPv6: Prefix List"
+	depends on IPV6
+	---help---
+	  For applications needing to retrieve the list of prefixes supported
+	  on the system. Defined in RFC2461.
+
+	  If unsure, say Y.
+
  source "net/ipv6/netfilter/Kconfig"
diff -ruN linux-2.5.70.org/net/ipv6/addrconf.c linux-2.5.70.new/net/ipv6/addrconf.c
--- linux-2.5.70.org/net/ipv6/addrconf.c	2003-05-26 18:00:58.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/addrconf.c	2003-06-20 01:34:14.000000000 -0700
@@ -124,7 +124,7 @@

  static int addrconf_ifdown(struct net_device *dev, int how);

-static void addrconf_dad_start(struct inet6_ifaddr *ifp);
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags);
  static void addrconf_dad_timer(unsigned long data);
  static void addrconf_dad_completed(struct inet6_ifaddr *ifp);
  static void addrconf_rs_timer(unsigned long data);
@@ -738,7 +738,7 @@
  	ift->prefered_lft = tmp_prefered_lft;
  	ift->tstamp = ifp->tstamp;
  	spin_unlock_bh(&ift->lock);
-	addrconf_dad_start(ift);
+	addrconf_dad_start(ift, 0);
  	in6_ifa_put(ift);
  	in6_dev_put(idev);
  out:
@@ -1234,7 +1234,7 @@
  	rtmsg.rtmsg_dst_len = 8;
  	rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
  	rtmsg.rtmsg_ifindex = dev->ifindex;
-	rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF;
+	rtmsg.rtmsg_flags = RTF_UP;
  	rtmsg.rtmsg_type = RTMSG_NEWROUTE;
  	ip6_route_add(&rtmsg, NULL, NULL);
  }
@@ -1261,7 +1261,7 @@
  	struct in6_addr addr;

  	ipv6_addr_set(&addr,  htonl(0xFE800000), 0, 0, 0);
-	addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF);
+	addrconf_prefix_route(&addr, 64, dev, 0, 0);
  }

  static struct inet6_dev *addrconf_add_dev(struct net_device *dev)
@@ -1401,7 +1401,7 @@
  			}

  			create = 1;
-			addrconf_dad_start(ifp);
+			addrconf_dad_start(ifp, RTF_ADDRCONF);
  		}

  		if (ifp && valid_lft == 0) {
@@ -1552,7 +1552,7 @@

  	ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT);
  	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
  		in6_ifa_put(ifp);
  		return 0;
  	}
@@ -1727,7 +1727,7 @@

  	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT);
  	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
  		in6_ifa_put(ifp);
  	}
  }
@@ -1965,8 +1965,7 @@
  		memset(&rtmsg, 0, sizeof(struct in6_rtmsg));
  		rtmsg.rtmsg_type = RTMSG_NEWROUTE;
  		rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
-		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF |
-				     RTF_DEFAULT | RTF_UP);
+		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP);

  		rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex;

@@ -1980,7 +1979,7 @@
  /*
   *	Duplicate Address Detection
   */
-static void addrconf_dad_start(struct inet6_ifaddr *ifp)
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags)
  {
  	struct net_device *dev;
  	unsigned long rand_num;
@@ -1990,7 +1989,7 @@
  	addrconf_join_solict(dev, &ifp->addr);

  	if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT))
-		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF);
+		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags);

  	net_srandom(ifp->addr.s6_addr32[3]);
  	rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1);
@@ -2389,6 +2388,42 @@
  	netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC);
  }

+int inet6_dump_omflags(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int flags;
+	struct ifp_if6info *ifp;
+	struct net_device *dev;
+	struct inet6_dev *idev;
+	struct nlmsghdr  *nlh;
+	unsigned char *cur_tail, *org_tail = skb->tail;
+
+	read_lock(&dev_base_lock);
+	for (dev = dev_base; dev; dev = dev->next) {
+		if (dev->flags & IFF_LOOPBACK)
+			continue;
+		if ((idev = in6_dev_get(dev)) == NULL)
+			continue;
+		flags = idev->if_flags;
+		in6_dev_put(idev);
+		cur_tail = skb->tail;
+		nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid,
+				cb->nlh->nlmsg_seq, RTM_GETOMFLAGS,
+				sizeof(*ifp));
+		ifp = NLMSG_DATA(nlh);
+		ifp->ifindex = dev->ifindex;
+		ifp->flags = flags;
+		nlh->nlmsg_len = skb->tail - cur_tail;
+	}
+	read_unlock(&dev_base_lock);
+	return skb->len;
+
+nlmsg_failure:
+	read_unlock(&dev_base_lock);
+	printk(KERN_INFO "inet6_dump_omflags:skb size not enough\n");
+	skb_trim(skb, org_tail - skb->data);
+	return -1;
+}
+
  static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = {
  	[RTM_NEWADDR - RTM_BASE] = { .doit	= inet6_rtm_newaddr, },
  	[RTM_DELADDR - RTM_BASE] = { .doit	= inet6_rtm_deladdr, },
@@ -2397,6 +2432,10 @@
  	[RTM_DELROUTE - RTM_BASE] = { .doit	= inet6_rtm_delroute, },
  	[RTM_GETROUTE - RTM_BASE] = { .doit	= inet6_rtm_getroute,
  				      .dumpit	= inet6_dump_fib, },
+	[RTM_GETOMFLAGS - RTM_BASE] = { .dumpit	= inet6_dump_omflags, },
+#ifdef CONFIG_IPV6_PREFIXLIST
+	[RTM_GETPLIST - RTM_BASE] = { .dumpit	= inet6_dump_fib, },
+#endif
  };

  static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
@@ -2730,7 +2769,7 @@
  #ifdef CONFIG_PROC_FS
  	proc_net_create("if_inet6", 0, iface_proc_info);
  #endif
-	
+
  	addrconf_verify(0);
  	rtnetlink_links[PF_INET6] = inet6_rtnetlink_table;
  #ifdef CONFIG_SYSCTL
diff -ruN linux-2.5.70.org/net/ipv6/ndisc.c linux-2.5.70.new/net/ipv6/ndisc.c
--- linux-2.5.70.org/net/ipv6/ndisc.c	2003-05-26 18:00:41.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/ndisc.c	2003-06-20 02:00:53.000000000 -0700
@@ -1049,6 +1049,16 @@
  		 */
  		in6_dev->if_flags |= IF_RA_RCVD;
  	}
+	/*
+	 * Remember the managed/otherconf flags from most recently
+	 * received RA message (RFC 2462) -- yoshfuji
+	 */
+	in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED |
+				IF_RA_OTHERCONF)) |
+				(ra_msg->icmph.icmp6_addrconf_managed ?
+					IF_RA_MANAGED : 0) |
+				(ra_msg->icmph.icmp6_addrconf_other ?
+					IF_RA_OTHERCONF : 0);

  	lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);

diff -ruN linux-2.5.70.org/net/ipv6/route.c linux-2.5.70.new/net/ipv6/route.c
--- linux-2.5.70.org/net/ipv6/route.c	2003-05-26 18:00:45.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/route.c	2003-06-20 02:05:48.000000000 -0700
@@ -1520,6 +1520,68 @@
  	return 0;
  }

+#ifdef CONFIG_IPV6_PREFIXLIST
+static int rt6_fill_prefix(struct sk_buff *skb, struct rt6_info *rt,
+			 int type, u32 pid, u32 seq)
+{
+	struct in6_prefix_msg *pmsg;
+	struct nlmsghdr  *nlh;
+	unsigned char *b = skb->tail;
+
+	nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*pmsg));
+	pmsg = NLMSG_DATA(nlh);
+	pmsg->ifindex = rt->rt6i_dev->ifindex;
+	pmsg->prefix_len = rt->rt6i_dst.plen;
+	ipv6_addr_copy(&pmsg->prefix, &rt->rt6i_dst.addr);
+	nlh->nlmsg_len = skb->tail - b;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "rt6_fill_prefix:skb size not enough\n");
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+static int rt6_dump_route_prefix(struct rt6_info *rt, void *p_arg)
+{
+	int addr_type;
+	struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg;
+
+	/*
+	 * Definition of a prefix :
+	 * 	- Should be autoconfigured
+	 *	- No nexthop
+	 *	- Not a linklocal, loopback or multicast type.
+	 */
+	if (rt->rt6i_nexthop || (rt->rt6i_flags & RTF_ADDRCONF) == 0)
+		return 0;
+	addr_type = ipv6_addr_type(&rt->rt6i_dst.addr);
+	if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK |
+			IPV6_ADDR_MULTICAST)) != 0 ||
+			addr_type == IPV6_ADDR_ANY)
+		return 0;
+	return rt6_fill_prefix(arg->skb, rt, RTM_GETPLIST,
+		     NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq);
+}
+
+static int fib6_dump_prefix(struct fib6_walker_t *w)
+{
+	int res;
+	struct rt6_info *rt;
+
+	for (rt = w->leaf; rt; rt = rt->u.next) {
+		res = rt6_dump_route_prefix(rt, w->args);
+		if (res < 0) {
+			/* Frame is full, suspend walking */
+			w->leaf = rt;
+			return 1;
+		}
+	}
+	w->leaf = NULL;
+	return 0;
+}
+#endif
+
  static void fib6_dump_end(struct netlink_callback *cb)
  {
  	struct fib6_walker_t *w = (void*)cb->args[0];
@@ -1547,6 +1609,13 @@
  	struct fib6_walker_t *w;
  	int res;

+#ifdef CONFIG_IPV6_PREFIXLIST
+	BUG_TRAP(cb->type + RTM_BASE == RTM_GETROUTE ||
+			cb->type + RTM_BASE == RTM_GETPLIST);
+#else
+	BUG_TRAP(cb->type + RTM_BASE == RTM_GETROUTE);
+#endif
+
  	arg.skb = skb;
  	arg.cb = cb;

@@ -1568,7 +1637,12 @@
  		RT6_TRACE("dump<%p", w);
  		memset(w, 0, sizeof(*w));
  		w->root = &ip6_routing_table;
-		w->func = fib6_dump_node;
+		if (cb->type + RTM_BASE == RTM_GETROUTE)
+			w->func = fib6_dump_node;
+#ifdef CONFIG_IPV6_PREFIXLIST
+		else
+			w->func = fib6_dump_prefix;
+#endif
  		w->args = &arg;
  		cb->args[0] = (long)w;
  		read_lock_bh(&rt6_lock);
diff -ruN linux-2.5.70.org/net/netlink/af_netlink.c linux-2.5.70.new/net/netlink/af_netlink.c
--- linux-2.5.70.org/net/netlink/af_netlink.c	2003-05-26 18:00:40.000000000 -0700
+++ linux-2.5.70.new/net/netlink/af_netlink.c	2003-06-19 06:14:26.000000000 -0700
@@ -842,7 +842,7 @@
  }

  int netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
-		       struct nlmsghdr *nlh,
+		       struct nlmsghdr *nlh, int type,
  		       int (*dump)(struct sk_buff *skb, struct netlink_callback*),
  		       int (*done)(struct netlink_callback*))
  {
@@ -858,6 +858,7 @@
  	cb->dump = dump;
  	cb->done = done;
  	cb->nlh = nlh;
+	cb->type = type;
  	atomic_inc(&skb->users);
  	cb->skb = skb;

diff -ruN linux-2.5.70.org/net/xfrm/xfrm_user.c linux-2.5.70.new/net/xfrm/xfrm_user.c
--- linux-2.5.70.org/net/xfrm/xfrm_user.c	2003-05-26 18:00:41.000000000 -0700
+++ linux-2.5.70.new/net/xfrm/xfrm_user.c	2003-06-19 06:10:17.000000000 -0700
@@ -869,7 +869,7 @@
  		if (link->dump == NULL)
  			goto err_einval;

-		if ((*errp = netlink_dump_start(xfrm_nl, skb, nlh,
+		if ((*errp = netlink_dump_start(xfrm_nl, skb, nlh, -1,
  						link->dump,
  						xfrm_done)) != 0) {
  			return -1;


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-20 20:53 [PATCH] Prefix List against 2.5.70 (re-done) Krishna Kumar
@ 2003-06-21 14:36 ` YOSHIFUJI Hideaki / 吉藤英明
  2003-06-25 17:02   ` Krishna Kumar
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2003-06-21 14:36 UTC
  To: krkumar; +Cc: davem, kuznet, yoshfuji, netdev, linux-net

Grr, I've almost lost this...


In article <3EF37458.3070103@us.ibm.com> (at Fri, 20 Jun 2003 13:53:44 -0700), Krishna Kumar <krkumar@us.ibm.com> says:

> 1. I change the netlink_dump_start to pass another parameter, <type>, which is
>     stored in a new field in the cb, <type>. All users of this function have been
>     changed to pass a -1 since they don't care about the type, except the
>     generic routine rtnetlink_rcv_msg() which calculates the type and stores it.
>     So the same routine which is used to dump route table can be used to dump
>     the prefix list by checking the type. It might be possible to derive the
>     type from the table offset, but that is more complicated (probably doable).

I think this is not required.

Rename inet6_dump_fib() to __inet6_dump_fib(), give it an
extra argument, and call it like
  inet6_dump_fib()    { __inet6_dump_fib(...,0); }
and
  inet6_dump_prefix() { __inet6_dump_fib(...,1); }
etc.
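
A toy, self-contained sketch of this pattern, for illustration only: one
shared walker takes a mode flag, and two thin wrappers keep the fixed callback
signature that a dispatch table expects. All toy_* names and fields are
hypothetical stand-ins, not kernel code, and the prefix filter is reduced to
"autoconfigured and no nexthop".

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative. */
struct toy_route {
	int autoconf;		/* plays the role of RTF_ADDRCONF */
	int has_nexthop;	/* plays the role of rt6i_nexthop != NULL */
};

struct toy_table {
	const struct toy_route *routes;
	size_t count;
};

/* Callback signature of the dispatch table: no room for a mode flag. */
typedef int (*toy_dumpit_t)(const struct toy_table *tbl);

/* Shared walker: prefix_only selects which entries are reported. */
static int toy_dump_common(const struct toy_table *tbl, int prefix_only)
{
	size_t i;
	int reported = 0;

	for (i = 0; i < tbl->count; i++) {
		const struct toy_route *rt = &tbl->routes[i];

		if (prefix_only && (!rt->autoconf || rt->has_nexthop))
			continue;
		reported++;
	}
	return reported;
}

/* Thin wrappers: each one fixes the mode and fits toy_dumpit_t. */
static int toy_dump_fib(const struct toy_table *tbl)
{
	return toy_dump_common(tbl, 0);
}

static int toy_dump_prefix(const struct toy_table *tbl)
{
	return toy_dump_common(tbl, 1);
}
```

With this shape the generic netlink code does not need to carry a type, since
the table entry itself selects the behaviour.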


> 3. Added user interface for retrieving M/O flags. This is a separate interface
>     from the one for getting the prefix list since the flags are per interface
>     while the prefix list is per route. However these two can be merged into one
>     if needed.

Hmm, what I expected is to get information via RTA_NEWLINK message.
This is because this is a per-interface thing.


> 5. Though this patch is modified to use only routing table for updating and
>     accessing the prefix list, I did a performace analysis for this approach vs
>     storing the plist on the idev. Following is the result :
> 
>     System : 1 CPU. 866 MHz, 256MB memory
>     For 1000 VLAN devices (4036 route entries gets created automatically as part
>     of address assignment), retrieve prefix list for (system times only) :
> 
>     #devices  #iteration for each dev  plist on IDEV    plist in RTTABLE     %
>     200          100                   3.95 secs        40.14 secs          916%
>     1000         10                    2.60 secs        20.98 secs          706%
>     200          1000                  38.44 secs       400.76 secs         942%

Hmm...

Well, what should we do...

-- 
Hideaki YOSHIFUJI @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG FP: 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-21 14:36 ` YOSHIFUJI Hideaki / 吉藤英明
@ 2003-06-25 17:02   ` Krishna Kumar
  2003-06-26  6:42     ` David S. Miller
From: Krishna Kumar @ 2003-06-25 17:02 UTC
  To: yoshfuji; +Cc: David S. Miller, netdev, linux-net

Hi Yoshfuji,

 > Rename inet6_dump_fib() to __inet6_dump_fib() and introduce

OK, done.

 > Hmm, what I expected is to get information via RTA_NEWLINK message.

I have made changes to return per interface flags, however I am not very
familiar with netlink and its different interfaces. I wanted to clarify whether
the following code is what you are trying to get done. Otherwise please let me
know what changes need to be done. What I want to happen is :
	1. Return entire prefix list on request from user.
	2. Return flags for a particular interface on request from user.
What I have not yet done is to broadcast events when a new prefix arrives or
an existing prefix gets expired, and to broadcast flags when an RA is received.
Currently there is no need for these from DHCP, but it could be added later on.
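
As a rough sketch of what the per interface flags carry, the helpers below
mirror the update done in the ndisc.c hunk and show one way a DHCP-style
client might read the result. The bit values are the ones from the if_inet6.h
hunk. update_ra_flags(), wanted_config() and the enum are hypothetical names,
and the mapping from M/O bits to a client policy is an assumption for
illustration.

```c
/* Bit values taken from the patch's include/net/if_inet6.h hunk. */
#define IF_RA_OTHERCONF	0x80
#define IF_RA_MANAGED	0x40
#define IF_RA_RCVD	0x20
#define IF_RS_SENT	0x10

/* Replace the M/O bits with those of the latest RA; keep all other bits. */
static int update_ra_flags(int if_flags, int managed, int other)
{
	if_flags &= ~(IF_RA_MANAGED | IF_RA_OTHERCONF);
	if (managed)
		if_flags |= IF_RA_MANAGED;
	if (other)
		if_flags |= IF_RA_OTHERCONF;
	return if_flags;
}

/* Hypothetical client-side policy derived from the returned flags. */
enum toy_config { CONF_NONE, CONF_OTHER_ONLY, CONF_MANAGED };

static enum toy_config wanted_config(int if_flags)
{
	if (if_flags & IF_RA_MANAGED)
		return CONF_MANAGED;	/* M set: addresses via stateful protocol */
	if (if_flags & IF_RA_OTHERCONF)
		return CONF_OTHER_ONLY;	/* O set: only other configuration */
	return CONF_NONE;
}
```

A client would apply wanted_config() to the flags field of the struct
ifp_if6info it gets back for the interface it asked about.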

 >>    #devices  #iteration for each dev  plist on IDEV    plist in RTTABLE     %
 >>     200          100                   3.95 secs        40.14 secs          916%
 > Well, what should we do...

The original code, though faster, adds more code, as Dave said in his initial
mail. I think this operation is not done often enough to be concerned about
performance. Besides, it is taking 40 secs to iterate 20,000 times over a 4K
routing table, effectively about 2ms for getting the prefix list for one device.
And for most systems, this is not an issue since this code will not run at all.
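
The arithmetic behind these figures can be checked with a small sketch. It
assumes the last column of the table is the percentage increase in system time
of the routing table approach over the idev approach; the function names are
made up for illustration.

```c
/* Average cost of one prefix-list retrieval, in milliseconds. */
static double per_retrieval_ms(double total_secs, int devices, int iterations)
{
	return total_secs * 1000.0 / ((double)devices * iterations);
}

/* Extra time of the routing-table approach relative to the idev one, in %. */
static double percent_slower(double idev_secs, double rttable_secs)
{
	return (rttable_secs - idev_secs) / idev_secs * 100.0;
}
```

For the first row, per_retrieval_ms(40.14, 200, 100) gives roughly 2 ms per
retrieval from the routing table, and percent_slower(3.95, 40.14) gives a
little over 916.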

Thanks,

- KK

diff -ruN linux-2.5.70.org/include/linux/ipv6_route.h linux-2.5.70.new/include/linux/ipv6_route.h
--- linux-2.5.70.org/include/linux/ipv6_route.h	2003-05-26 18:00:25.000000000 -0700
+++ linux-2.5.70.new/include/linux/ipv6_route.h	2003-06-24 04:36:39.000000000 -0700
@@ -44,4 +44,19 @@
   #define RTMSG_NEWROUTE		0x21
   #define RTMSG_DELROUTE		0x22

+#ifdef CONFIG_IPV6_PREFIXLIST
+
+/*
+ * Return entire prefix list in array of following structures. Provides the
+ * prefix and prefix length for all devices.
+ */
+
+struct in6_prefix_msg
+{
+	int ifindex;
+	int prefix_len;
+	struct in6_addr prefix;
+};
+#endif
+
   #endif
diff -ruN linux-2.5.70.org/include/linux/rtnetlink.h linux-2.5.70.new/include/linux/rtnetlink.h
--- linux-2.5.70.org/include/linux/rtnetlink.h	2003-05-26 18:00:46.000000000 -0700
+++ linux-2.5.70.new/include/linux/rtnetlink.h	2003-06-24 04:39:59.000000000 -0700
@@ -47,7 +47,14 @@
   #define	RTM_DELTFILTER	(RTM_BASE+29)
   #define	RTM_GETTFILTER	(RTM_BASE+30)

-#define	RTM_MAX		(RTM_BASE+31)
+#define	RTM_GETLNKFLAGS	(RTM_BASE+34)
+
+#ifndef CONFIG_IPV6_PREFIXLIST
+#define	RTM_MAX		(RTM_GETLNKFLAGS+1)
+#else
+#define	RTM_GETPLIST	(RTM_BASE+38)
+#define	RTM_MAX		(RTM_GETPLIST+1)
+#endif

   /*
      Generic structure for encapsulation optional route information.
@@ -61,6 +68,14 @@
   	unsigned short	rta_type;
   };

+/* Structure to return per interface device flags */
+
+struct ifp_if6info
+{
+	int ifindex;
+	int flags;
+};
+
   /* Macros to handle rtattributes */

   #define RTA_ALIGNTO	4
@@ -201,9 +216,11 @@
   	RTA_FLOW,
   	RTA_CACHEINFO,
   	RTA_SESSION,
+	RTA_LINKFLAGS,
+	RTA_RA6INFO,	/* No support yet, send event on new prefix event */
   };

-#define RTA_MAX RTA_SESSION
+#define RTA_MAX RTA_RA6INFO

   #define RTM_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct rtmsg))))
   #define RTM_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct rtmsg))
diff -ruN linux-2.5.70.org/include/net/if_inet6.h linux-2.5.70.new/include/net/if_inet6.h
--- linux-2.5.70.org/include/net/if_inet6.h	2003-05-26 18:00:59.000000000 -0700
+++ linux-2.5.70.new/include/net/if_inet6.h	2003-06-19 05:42:08.000000000 -0700
@@ -17,6 +17,8 @@

   #include <net/snmp.h>

+#define IF_RA_OTHERCONF	0x80
+#define IF_RA_MANAGED	0x40
   #define IF_RA_RCVD	0x20
   #define IF_RS_SENT	0x10

diff -ruN linux-2.5.70.org/include/net/ip6_route.h linux-2.5.70.new/include/net/ip6_route.h
--- linux-2.5.70.org/include/net/ip6_route.h	2003-05-26 18:00:26.000000000 -0700
+++ linux-2.5.70.new/include/net/ip6_route.h	2003-06-23 02:59:06.000000000 -0700
@@ -87,6 +87,7 @@
   struct nlmsghdr;
   struct netlink_callback;
   extern int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb);
+extern int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb);
   extern int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
   extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
   extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
diff -ruN linux-2.5.70.org/net/ipv6/Kconfig linux-2.5.70.new/net/ipv6/Kconfig
--- linux-2.5.70.org/net/ipv6/Kconfig	2003-05-26 18:00:40.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/Kconfig	2003-06-19 05:37:11.000000000 -0700
@@ -42,4 +42,13 @@

   	  If unsure, say Y.

+config IPV6_PREFIXLIST
+	bool "IPv6: Prefix List"
+	depends on IPV6
+	---help---
+	  For applications needing to retrieve the list of prefixes supported
+	  on the system. Defined in RFC2461.
+
+	  If unsure, say Y.
+
   source "net/ipv6/netfilter/Kconfig"
diff -ruN linux-2.5.70.org/net/ipv6/addrconf.c linux-2.5.70.new/net/ipv6/addrconf.c
--- linux-2.5.70.org/net/ipv6/addrconf.c	2003-05-26 18:00:58.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/addrconf.c	2003-06-24 04:40:05.000000000 -0700
@@ -124,7 +124,7 @@

   static int addrconf_ifdown(struct net_device *dev, int how);

-static void addrconf_dad_start(struct inet6_ifaddr *ifp);
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags);
   static void addrconf_dad_timer(unsigned long data);
   static void addrconf_dad_completed(struct inet6_ifaddr *ifp);
   static void addrconf_rs_timer(unsigned long data);
@@ -738,7 +738,7 @@
   	ift->prefered_lft = tmp_prefered_lft;
   	ift->tstamp = ifp->tstamp;
   	spin_unlock_bh(&ift->lock);
-	addrconf_dad_start(ift);
+	addrconf_dad_start(ift, 0);
   	in6_ifa_put(ift);
   	in6_dev_put(idev);
   out:
@@ -1234,7 +1234,7 @@
   	rtmsg.rtmsg_dst_len = 8;
   	rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
   	rtmsg.rtmsg_ifindex = dev->ifindex;
-	rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF;
+	rtmsg.rtmsg_flags = RTF_UP;
   	rtmsg.rtmsg_type = RTMSG_NEWROUTE;
   	ip6_route_add(&rtmsg, NULL, NULL);
   }
@@ -1261,7 +1261,7 @@
   	struct in6_addr addr;

   	ipv6_addr_set(&addr,  htonl(0xFE800000), 0, 0, 0);
-	addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF);
+	addrconf_prefix_route(&addr, 64, dev, 0, 0);
   }

   static struct inet6_dev *addrconf_add_dev(struct net_device *dev)
@@ -1401,7 +1401,7 @@
   			}

   			create = 1;
-			addrconf_dad_start(ifp);
+			addrconf_dad_start(ifp, RTF_ADDRCONF);
   		}

   		if (ifp && valid_lft == 0) {
@@ -1552,7 +1552,7 @@

   	ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT);
   	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
   		in6_ifa_put(ifp);
   		return 0;
   	}
@@ -1727,7 +1727,7 @@

   	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT);
   	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
   		in6_ifa_put(ifp);
   	}
   }
@@ -1965,8 +1965,7 @@
   		memset(&rtmsg, 0, sizeof(struct in6_rtmsg));
   		rtmsg.rtmsg_type = RTMSG_NEWROUTE;
   		rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
-		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF |
-				     RTF_DEFAULT | RTF_UP);
+		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP);

   		rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex;

@@ -1980,7 +1979,7 @@
   /*
    *	Duplicate Address Detection
    */
-static void addrconf_dad_start(struct inet6_ifaddr *ifp)
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags)
   {
   	struct net_device *dev;
   	unsigned long rand_num;
@@ -1990,7 +1989,7 @@
   	addrconf_join_solict(dev, &ifp->addr);

   	if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT))
-		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF);
+		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags);

   	net_srandom(ifp->addr.s6_addr32[3]);
   	rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1);
@@ -2389,6 +2388,43 @@
   	netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC);
   }

+int inet6_dump_linkflags(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int ifindex, flags = 0;
+	struct net_device *dev;
+	struct inet6_dev *idev;
+	struct nlmsghdr *nlh;
+	struct ifp_if6info *ifp = NLMSG_DATA(cb->nlh);
+	unsigned char *org_tail = skb->tail;
+
+	/* ifindex = cb->args[0]; ? */
+	ifindex = ifp->ifindex;
+
+	if ((dev = dev_get_by_index(ifindex)) == NULL)
+		goto out;
+	if ((idev = in6_dev_get(dev)) != NULL) {
+		flags = idev->if_flags;
+		in6_dev_put(idev);
+	}
+	dev_put(dev);
+
+	nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq,
+			RTA_LINKFLAGS, sizeof(*ifp));
+	ifp = NLMSG_DATA(nlh);
+	ifp->flags = flags;
+	ifp->ifindex = ifindex;	/* duplicate information for user to verify */
+
+	nlh->nlmsg_len = skb->tail - org_tail;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "inet6_dump_linkflags:skb size not enough\n");
+	skb_trim(skb, org_tail - skb->data);
+
+out:
+	return -1;
+}
+
   static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = {
   	[RTM_NEWADDR - RTM_BASE] = { .doit	= inet6_rtm_newaddr, },
   	[RTM_DELADDR - RTM_BASE] = { .doit	= inet6_rtm_deladdr, },
@@ -2397,6 +2433,10 @@
   	[RTM_DELROUTE - RTM_BASE] = { .doit	= inet6_rtm_delroute, },
   	[RTM_GETROUTE - RTM_BASE] = { .doit	= inet6_rtm_getroute,
   				      .dumpit	= inet6_dump_fib, },
+	[RTM_GETLNKFLAGS - RTM_BASE] = { .dumpit = inet6_dump_linkflags, },
+#ifdef CONFIG_IPV6_PREFIXLIST
+	[RTM_GETPLIST - RTM_BASE] = { .dumpit	= inet6_dump_prefix, },
+#endif
   };

   static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
@@ -2730,7 +2770,7 @@
   #ifdef CONFIG_PROC_FS
   	proc_net_create("if_inet6", 0, iface_proc_info);
   #endif
-	
+
   	addrconf_verify(0);
   	rtnetlink_links[PF_INET6] = inet6_rtnetlink_table;
   #ifdef CONFIG_SYSCTL
diff -ruN linux-2.5.70.org/net/ipv6/ndisc.c linux-2.5.70.new/net/ipv6/ndisc.c
--- linux-2.5.70.org/net/ipv6/ndisc.c	2003-05-26 18:00:41.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/ndisc.c	2003-06-24 04:09:30.000000000 -0700
@@ -1049,6 +1049,16 @@
   		 */
   		in6_dev->if_flags |= IF_RA_RCVD;
   	}
+	/*
+	 * Remember the managed/otherconf flags from most recently
+	 * received RA message (RFC 2462) -- yoshfuji
+	 */
+	in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED|
+				IF_RA_OTHERCONF)) |
+				(ra_msg->icmph.icmp6_addrconf_managed ?
+					IF_RA_MANAGED : 0) |
+				(ra_msg->icmph.icmp6_addrconf_other ?
+					IF_RA_OTHERCONF : 0);

   	lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);

diff -ruN linux-2.5.70.org/net/ipv6/route.c linux-2.5.70.new/net/ipv6/route.c
--- linux-2.5.70.org/net/ipv6/route.c	2003-05-26 18:00:45.000000000 -0700
+++ linux-2.5.70.new/net/ipv6/route.c	2003-06-23 02:46:42.000000000 -0700
@@ -1520,6 +1520,68 @@
   	return 0;
   }

+#ifdef CONFIG_IPV6_PREFIXLIST
+static int rt6_fill_prefix(struct sk_buff *skb, struct rt6_info *rt,
+			 int type, u32 pid, u32 seq)
+{
+	struct in6_prefix_msg *pmsg;
+	struct nlmsghdr  *nlh;
+	unsigned char *b = skb->tail;
+
+	nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*pmsg));
+	pmsg = NLMSG_DATA(nlh);
+	pmsg->ifindex = rt->rt6i_dev->ifindex;
+	pmsg->prefix_len = rt->rt6i_dst.plen;
+	ipv6_addr_copy(&pmsg->prefix, &rt->rt6i_dst.addr);
+	nlh->nlmsg_len = skb->tail - b;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "rt6_fill_prefix:skb size not enough\n");
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+static int rt6_dump_route_prefix(struct rt6_info *rt, void *p_arg)
+{
+	int addr_type;
+	struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg;
+
+	/*
+	 * Definition of a prefix :
+	 * 	- Should be autoconfigured
+	 *	- No nexthop
+	 *	- Not a linklocal, loopback or multicast type.
+	 */
+	if (rt->rt6i_nexthop || (rt->rt6i_flags & RTF_ADDRCONF) == 0)
+		return 0;
+	addr_type = ipv6_addr_type(&rt->rt6i_dst.addr);
+	if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK |
+			IPV6_ADDR_MULTICAST)) != 0 ||
+			addr_type == IPV6_ADDR_ANY)
+		return 0;
+	return rt6_fill_prefix(arg->skb, rt, RTM_GETPLIST,
+		     NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq);
+}
+
+static int fib6_dump_prefix(struct fib6_walker_t *w)
+{
+	int res;
+	struct rt6_info *rt;
+
+	for (rt = w->leaf; rt; rt = rt->u.next) {
+		res = rt6_dump_route_prefix(rt, w->args);
+		if (res < 0) {
+			/* Frame is full, suspend walking */
+			w->leaf = rt;
+			return 1;
+		}
+	}
+	w->leaf = NULL;
+	return 0;
+}
+#endif
+
   static void fib6_dump_end(struct netlink_callback *cb)
   {
   	struct fib6_walker_t *w = (void*)cb->args[0];
@@ -1541,12 +1603,17 @@
   	return cb->done(cb);
   }

-int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+static int __inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb,
+			int prefix)
   {
   	struct rt6_rtnl_dump_arg arg;
   	struct fib6_walker_t *w;
   	int res;

+#ifndef CONFIG_IPV6_PREFIXLIST
+	BUG_TRAP(prefix == 0);
+#endif
+
   	arg.skb = skb;
   	arg.cb = cb;

@@ -1568,7 +1635,12 @@
   		RT6_TRACE("dump<%p", w);
   		memset(w, 0, sizeof(*w));
   		w->root = &ip6_routing_table;
-		w->func = fib6_dump_node;
+		if (prefix == 0)
+			w->func = fib6_dump_node;
+#ifdef CONFIG_IPV6_PREFIXLIST
+		else
+			w->func = fib6_dump_prefix;
+#endif
   		w->args = &arg;
   		cb->args[0] = (long)w;
   		read_lock_bh(&rt6_lock);
@@ -1595,6 +1667,16 @@
   	return res;
   }

+int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return __inet6_dump_fib(skb, cb, 0);
+}
+
+int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return __inet6_dump_fib(skb, cb, 1);
+}
+
   int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg)
   {
   	struct rtattr **rta = arg;


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-25 17:02   ` Krishna Kumar
@ 2003-06-26  6:42     ` David S. Miller
  2003-06-26 16:32       ` Krishna Kumar
  2003-06-26 22:40       ` [PATCH] Prefix List against 2.4.21 Krishna Kumar
  0 siblings, 2 replies; 13+ messages in thread
From: David S. Miller @ 2003-06-26  6:42 UTC (permalink / raw)
  To: krkumar; +Cc: yoshfuji, netdev, linux-net


I don't think it's wise to make RTNETLINK facilities
dependent upon ifdef values.

Please kill CONFIG_IPV6_PREFIXLIST.  People don't need
to enable funny options to get a fully functional dhcp on
ipv4, they should therefore not have to for ipv6 either.


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-26  6:42     ` David S. Miller
@ 2003-06-26 16:32       ` Krishna Kumar
  2003-06-27  6:07         ` David S. Miller
  2003-06-26 22:40       ` [PATCH] Prefix List against 2.4.21 Krishna Kumar
  1 sibling, 1 reply; 13+ messages in thread
From: Krishna Kumar @ 2003-06-26 16:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: yoshfuji, netdev, linux-net

Hi Dave,

 > I don't think it's wise to make RTNETLINK facilities dependent upon ifdef

Following is the patch without the PREFIXLIST config option. I have remade it
against 2.5.73.

We would like to have this functionality in the 2.4 kernel, so can I go ahead and
send this patch against 2.4.21 too?

Thanks,

- KK

--------------------------------------------------------------------------------
diff -ruN linux-2.5.73.org/include/linux/ipv6_route.h linux-2.5.73.new/include/linux/ipv6_route.h
--- linux-2.5.73.org/include/linux/ipv6_route.h	2003-06-22 11:32:36.000000000 -0700
+++ linux-2.5.73.new/include/linux/ipv6_route.h	2003-06-26 09:05:01.000000000 -0700
@@ -44,4 +44,16 @@
  #define RTMSG_NEWROUTE		0x21
  #define RTMSG_DELROUTE		0x22

+/*
+ * Return entire prefix list in array of following structures. Provides the
+ * prefix and prefix length for all devices.
+ */
+
+struct in6_prefix_msg
+{
+	int ifindex;
+	int prefix_len;
+	struct in6_addr prefix;
+};
+
  #endif
diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h linux-2.5.73.new/include/linux/rtnetlink.h
--- linux-2.5.73.org/include/linux/rtnetlink.h	2003-06-22 11:33:07.000000000 -0700
+++ linux-2.5.73.new/include/linux/rtnetlink.h	2003-06-26 09:05:01.000000000 -0700
@@ -47,7 +47,11 @@
  #define	RTM_DELTFILTER	(RTM_BASE+29)
  #define	RTM_GETTFILTER	(RTM_BASE+30)

-#define	RTM_MAX		(RTM_BASE+31)
+#define	RTM_GETLNKFLAGS	(RTM_BASE+34)
+
+#define	RTM_GETPLIST	(RTM_BASE+38)
+
+#define	RTM_MAX		(RTM_GETPLIST+1)

  /*
     Generic structure for encapsulation of optional route information.
@@ -61,6 +65,14 @@
  	unsigned short	rta_type;
  };

+/* Structure to return per interface device flags */
+
+struct ifp_if6info
+{
+	int ifindex;
+	int flags;
+};
+
  /* Macros to handle rtattributes */

  #define RTA_ALIGNTO	4
@@ -201,9 +213,11 @@
  	RTA_FLOW,
  	RTA_CACHEINFO,
  	RTA_SESSION,
+	RTA_LINKFLAGS,
+	RTA_RA6INFO,	/* No support yet, send event on new prefix event */
  };

-#define RTA_MAX RTA_SESSION
+#define RTA_MAX RTA_RA6INFO

  #define RTM_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct rtmsg))))
  #define RTM_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct rtmsg))
diff -ruN linux-2.5.73.org/include/net/if_inet6.h linux-2.5.73.new/include/net/if_inet6.h
--- linux-2.5.73.org/include/net/if_inet6.h	2003-06-22 11:33:32.000000000 -0700
+++ linux-2.5.73.new/include/net/if_inet6.h	2003-06-26 09:05:01.000000000 -0700
@@ -17,6 +17,8 @@

  #include <net/snmp.h>

+#define IF_RA_OTHERCONF	0x80
+#define IF_RA_MANAGED	0x40
  #define IF_RA_RCVD	0x20
  #define IF_RS_SENT	0x10

diff -ruN linux-2.5.73.org/include/net/ip6_route.h linux-2.5.73.new/include/net/ip6_route.h
--- linux-2.5.73.org/include/net/ip6_route.h	2003-06-22 11:32:37.000000000 -0700
+++ linux-2.5.73.new/include/net/ip6_route.h	2003-06-26 09:05:01.000000000 -0700
@@ -87,6 +87,7 @@
  struct nlmsghdr;
  struct netlink_callback;
  extern int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb);
+extern int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb);
  extern int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
  extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
  extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c linux-2.5.73.new/net/ipv6/addrconf.c
--- linux-2.5.73.org/net/ipv6/addrconf.c	2003-06-22 11:33:17.000000000 -0700
+++ linux-2.5.73.new/net/ipv6/addrconf.c	2003-06-26 09:05:01.000000000 -0700
@@ -129,7 +129,7 @@

  static int addrconf_ifdown(struct net_device *dev, int how);

-static void addrconf_dad_start(struct inet6_ifaddr *ifp);
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags);
  static void addrconf_dad_timer(unsigned long data);
  static void addrconf_dad_completed(struct inet6_ifaddr *ifp);
  static void addrconf_rs_timer(unsigned long data);
@@ -715,7 +715,7 @@
  	ift->prefered_lft = tmp_prefered_lft;
  	ift->tstamp = ifp->tstamp;
  	spin_unlock_bh(&ift->lock);
-	addrconf_dad_start(ift);
+	addrconf_dad_start(ift, 0);
  	in6_ifa_put(ift);
  	in6_dev_put(idev);
  out:
@@ -1211,7 +1211,7 @@
  	rtmsg.rtmsg_dst_len = 8;
  	rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
  	rtmsg.rtmsg_ifindex = dev->ifindex;
-	rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF;
+	rtmsg.rtmsg_flags = RTF_UP;
  	rtmsg.rtmsg_type = RTMSG_NEWROUTE;
  	ip6_route_add(&rtmsg, NULL, NULL);
  }
@@ -1238,7 +1238,7 @@
  	struct in6_addr addr;

  	ipv6_addr_set(&addr,  htonl(0xFE800000), 0, 0, 0);
-	addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF);
+	addrconf_prefix_route(&addr, 64, dev, 0, 0);
  }

  static struct inet6_dev *addrconf_add_dev(struct net_device *dev)
@@ -1378,7 +1378,7 @@
  			}

  			create = 1;
-			addrconf_dad_start(ifp);
+			addrconf_dad_start(ifp, RTF_ADDRCONF);
  		}

  		if (ifp && valid_lft == 0) {
@@ -1529,7 +1529,7 @@

  	ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT);
  	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
  		in6_ifa_put(ifp);
  		return 0;
  	}
@@ -1704,7 +1704,7 @@

  	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT);
  	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
  		in6_ifa_put(ifp);
  	}
  }
@@ -1943,8 +1943,7 @@
  		memset(&rtmsg, 0, sizeof(struct in6_rtmsg));
  		rtmsg.rtmsg_type = RTMSG_NEWROUTE;
  		rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
-		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF |
-				     RTF_DEFAULT | RTF_UP);
+		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP);

  		rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex;

@@ -1958,7 +1957,7 @@
  /*
   *	Duplicate Address Detection
   */
-static void addrconf_dad_start(struct inet6_ifaddr *ifp)
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags)
  {
  	struct net_device *dev;
  	unsigned long rand_num;
@@ -1968,7 +1967,7 @@
  	addrconf_join_solict(dev, &ifp->addr);

  	if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT))
-		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF);
+		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags);

  	net_srandom(ifp->addr.s6_addr32[3]);
  	rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1);
@@ -2451,6 +2450,42 @@
  	netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC);
  }

+int inet6_dump_linkflags(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int ifindex, flags = 0;
+	struct net_device *dev;
+	struct inet6_dev *idev;
+	struct nlmsghdr *nlh;
+	struct ifp_if6info *ifp = NLMSG_DATA(cb->nlh);
+	unsigned char *org_tail = skb->tail;
+
+	ifindex = ifp->ifindex;
+
+	if ((dev = dev_get_by_index(ifindex)) == NULL)
+		goto out;
+	if ((idev = in6_dev_get(dev)) != NULL) {
+		flags = idev->if_flags;
+		in6_dev_put(idev);
+	}
+	dev_put(dev);
+
+	nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq,
+			RTA_LINKFLAGS, sizeof(*ifp));
+	ifp = NLMSG_DATA(nlh);
+	ifp->flags = flags;
+	ifp->ifindex = ifindex;	/* duplicate information for user to verify */
+
+	nlh->nlmsg_len = skb->tail - org_tail;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "inet6_dump_linkflags:skb size not enough\n");
+	skb_trim(skb, org_tail - skb->data);
+
+out:
+	return -1;
+}
+
  static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = {
  	[RTM_NEWADDR - RTM_BASE] = { .doit	= inet6_rtm_newaddr, },
  	[RTM_DELADDR - RTM_BASE] = { .doit	= inet6_rtm_deladdr, },
@@ -2459,6 +2494,8 @@
  	[RTM_DELROUTE - RTM_BASE] = { .doit	= inet6_rtm_delroute, },
  	[RTM_GETROUTE - RTM_BASE] = { .doit	= inet6_rtm_getroute,
  				      .dumpit	= inet6_dump_fib, },
+	[RTM_GETLNKFLAGS - RTM_BASE] = { .dumpit = inet6_dump_linkflags, },
+	[RTM_GETPLIST - RTM_BASE] = { .dumpit	= inet6_dump_prefix, },
  };

  static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
diff -ruN linux-2.5.73.org/net/ipv6/ndisc.c linux-2.5.73.new/net/ipv6/ndisc.c
--- linux-2.5.73.org/net/ipv6/ndisc.c	2003-06-22 11:32:56.000000000 -0700
+++ linux-2.5.73.new/net/ipv6/ndisc.c	2003-06-26 09:05:01.000000000 -0700
@@ -1036,6 +1036,16 @@
  		 */
  		in6_dev->if_flags |= IF_RA_RCVD;
  	}
+	/*
+	 * Remember the managed/otherconf flags from most recently
+	 * received RA message (RFC 2462) -- yoshfuji
+	 */
+	in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED|
+				IF_RA_OTHERCONF)) |
+				(ra_msg->icmph.icmp6_addrconf_managed ?
+					IF_RA_MANAGED : 0) |
+				(ra_msg->icmph.icmp6_addrconf_other ?
+					IF_RA_OTHERCONF : 0);

  	lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);

diff -ruN linux-2.5.73.org/net/ipv6/route.c linux-2.5.73.new/net/ipv6/route.c
--- linux-2.5.73.org/net/ipv6/route.c	2003-06-22 11:33:05.000000000 -0700
+++ linux-2.5.73.new/net/ipv6/route.c	2003-06-26 09:05:01.000000000 -0700
@@ -1511,6 +1511,66 @@
  	return 0;
  }

+static int rt6_fill_prefix(struct sk_buff *skb, struct rt6_info *rt,
+			 int type, u32 pid, u32 seq)
+{
+	struct in6_prefix_msg *pmsg;
+	struct nlmsghdr  *nlh;
+	unsigned char *b = skb->tail;
+
+	nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*pmsg));
+	pmsg = NLMSG_DATA(nlh);
+	pmsg->ifindex = rt->rt6i_dev->ifindex;
+	pmsg->prefix_len = rt->rt6i_dst.plen;
+	ipv6_addr_copy(&pmsg->prefix, &rt->rt6i_dst.addr);
+	nlh->nlmsg_len = skb->tail - b;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "rt6_fill_prefix:skb size not enough\n");
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+static int rt6_dump_route_prefix(struct rt6_info *rt, void *p_arg)
+{
+	int addr_type;
+	struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg;
+
+	/*
+	 * Definition of a prefix :
+	 * 	- Should be autoconfigured
+	 *	- No nexthop
+	 *	- Not a linklocal, loopback or multicast type.
+	 */
+	if (rt->rt6i_nexthop || (rt->rt6i_flags & RTF_ADDRCONF) == 0)
+		return 0;
+	addr_type = ipv6_addr_type(&rt->rt6i_dst.addr);
+	if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK |
+			IPV6_ADDR_MULTICAST)) != 0 ||
+			addr_type == IPV6_ADDR_ANY)
+		return 0;
+	return rt6_fill_prefix(arg->skb, rt, RTM_GETPLIST,
+		     NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq);
+}
+
+static int fib6_dump_prefix(struct fib6_walker_t *w)
+{
+	int res;
+	struct rt6_info *rt;
+
+	for (rt = w->leaf; rt; rt = rt->u.next) {
+		res = rt6_dump_route_prefix(rt, w->args);
+		if (res < 0) {
+			/* Frame is full, suspend walking */
+			w->leaf = rt;
+			return 1;
+		}
+	}
+	w->leaf = NULL;
+	return 0;
+}
+
  static void fib6_dump_end(struct netlink_callback *cb)
  {
  	struct fib6_walker_t *w = (void*)cb->args[0];
@@ -1532,7 +1592,8 @@
  	return cb->done(cb);
  }

-int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+static int __inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb,
+			int prefix)
  {
  	struct rt6_rtnl_dump_arg arg;
  	struct fib6_walker_t *w;
@@ -1559,7 +1620,10 @@
  		RT6_TRACE("dump<%p", w);
  		memset(w, 0, sizeof(*w));
  		w->root = &ip6_routing_table;
-		w->func = fib6_dump_node;
+		if (prefix)
+			w->func = fib6_dump_prefix;
+		else
+			w->func = fib6_dump_node;
  		w->args = &arg;
  		cb->args[0] = (long)w;
  		read_lock_bh(&rt6_lock);
@@ -1586,6 +1650,16 @@
  	return res;
  }

+int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return __inet6_dump_fib(skb, cb, 0);
+}
+
+int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return __inet6_dump_fib(skb, cb, 1);
+}
+
  int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg)
  {
  	struct rtattr **rta = arg;


* [PATCH] Prefix List against 2.4.21
  2003-06-26  6:42     ` David S. Miller
  2003-06-26 16:32       ` Krishna Kumar
@ 2003-06-26 22:40       ` Krishna Kumar
  1 sibling, 0 replies; 13+ messages in thread
From: Krishna Kumar @ 2003-06-26 22:40 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-net, yoshfuji

Hi Dave,

This is the same patch against 2.4.21. The only differences are minor changes in
the initialization of inet6_rtnetlink_table; the rest is the same. I have tested
it, and it works fine.
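
To illustrate the table initialization difference mentioned above: the 2.5.73 patch fills inet6_rtnetlink_table with designated initializers, while the 2.4.21 patch pads the positional table with { NULL, NULL } entries until the new slots are reached. Below is a minimal standalone sketch of the designated form, not kernel code. It assumes RTM_BASE is 16; the names ex_link, ex_table, ex_lookup_dumpit and the two dummy handlers are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: mirrors RTM_BASE from include/linux/rtnetlink.h. */
#define RTM_BASE	16

/* Message numbers as defined by the patch in this thread. */
#define RTM_GETLNKFLAGS	(RTM_BASE + 34)
#define RTM_GETPLIST	(RTM_BASE + 38)
#define RTM_MAX		(RTM_GETPLIST + 1)

/* Hypothetical stand-ins for struct rtnetlink_link and its handlers. */
typedef int (*ex_handler)(void);

struct ex_link {
	ex_handler doit;
	ex_handler dumpit;
};

static int ex_dump_linkflags(void) { return 1; }
static int ex_dump_prefix(void) { return 2; }

/*
 * Designated initializers: only the used slots are named, every other
 * entry is implicitly { NULL, NULL }. A positional table needs explicit
 * filler entries for all the unused slots in between.
 */
static const struct ex_link ex_table[RTM_MAX - RTM_BASE + 1] = {
	[RTM_GETLNKFLAGS - RTM_BASE] = { .dumpit = ex_dump_linkflags },
	[RTM_GETPLIST - RTM_BASE] = { .dumpit = ex_dump_prefix },
};

/* Look up the dump handler for a message type, or NULL if there is none. */
static ex_handler ex_lookup_dumpit(int type)
{
	if (type < RTM_BASE || type > RTM_MAX)
		return NULL;
	return ex_table[type - RTM_BASE].dumpit;
}

/* Small self-check of the slots used in this sketch. */
static void ex_table_self_check(void)
{
	assert(ex_lookup_dumpit(RTM_GETLNKFLAGS) == ex_dump_linkflags);
	assert(ex_lookup_dumpit(RTM_GETPLIST) == ex_dump_prefix);
	assert(ex_lookup_dumpit(RTM_BASE + 35) == NULL);
}
```

Either form yields the same table contents; the positional form is simply more verbose and easier to get wrong by one slot.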

thanks,

- KK

--------------------------------------------------------------------------------
diff -ruN linux-2.4.21.org/include/linux/ipv6_route.h linux-2.4.21/include/linux/ipv6_route.h
--- linux-2.4.21.org/include/linux/ipv6_route.h	1998-08-27 19:33:08.000000000 -0700
+++ linux-2.4.21/include/linux/ipv6_route.h	2003-06-26 09:35:05.000000000 -0700
@@ -53,4 +53,16 @@
  #define RTMSG_NEWROUTE		0x21
  #define RTMSG_DELROUTE		0x22

+/*
+ * Return entire prefix list in array of following structures. Provides the
+ * prefix and prefix length for all devices.
+ */
+
+struct in6_prefix_msg
+{
+	int ifindex;
+	int prefix_len;
+	struct in6_addr prefix;
+};
+
  #endif
diff -ruN linux-2.4.21.org/include/linux/rtnetlink.h linux-2.4.21/include/linux/rtnetlink.h
--- linux-2.4.21.org/include/linux/rtnetlink.h	2002-11-28 15:53:15.000000000 -0800
+++ linux-2.4.21/include/linux/rtnetlink.h	2003-06-26 12:50:46.000000000 -0700
@@ -46,7 +46,11 @@
  #define	RTM_DELTFILTER	(RTM_BASE+29)
  #define	RTM_GETTFILTER	(RTM_BASE+30)

-#define	RTM_MAX		(RTM_BASE+31)
+#define	RTM_GETLNKFLAGS	(RTM_BASE+34)
+
+#define	RTM_GETPLIST	(RTM_BASE+38)
+
+#define	RTM_MAX		(RTM_GETPLIST+1)

  /*
     Generic structure for encapsulation optional route information.
@@ -60,6 +64,14 @@
  	unsigned short	rta_type;
  };

+/* Structure to return per interface device flags */
+
+struct ifp_if6info
+{
+	int ifindex;
+	int flags;
+};
+
  /* Macros to handle rtattributes */

  #define RTA_ALIGNTO	4
@@ -198,10 +210,12 @@
  	RTA_MULTIPATH,
  	RTA_PROTOINFO,
  	RTA_FLOW,
-	RTA_CACHEINFO
+	RTA_CACHEINFO,
+	RTA_LINKFLAGS,
+	RTA_RA6INFO,	/* No support yet, send event on new prefix event */
  };

-#define RTA_MAX RTA_CACHEINFO
+#define RTA_MAX RTA_RA6INFO

  #define RTM_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct rtmsg))))
  #define RTM_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct rtmsg))
diff -ruN linux-2.4.21.org/include/net/if_inet6.h linux-2.4.21/include/net/if_inet6.h
--- linux-2.4.21.org/include/net/if_inet6.h	2003-06-13 07:51:39.000000000 -0700
+++ linux-2.4.21/include/net/if_inet6.h	2003-06-26 09:35:05.000000000 -0700
@@ -15,6 +15,8 @@
  #ifndef _NET_IF_INET6_H
  #define _NET_IF_INET6_H

+#define IF_RA_OTHERCONF	0x80
+#define IF_RA_MANAGED	0x40
  #define IF_RA_RCVD	0x20
  #define IF_RS_SENT	0x10

diff -ruN linux-2.4.21.org/include/net/ip6_route.h linux-2.4.21/include/net/ip6_route.h
--- linux-2.4.21.org/include/net/ip6_route.h	2003-06-13 07:51:39.000000000 -0700
+++ linux-2.4.21/include/net/ip6_route.h	2003-06-26 13:51:22.000000000 -0700
@@ -84,6 +84,7 @@
  struct nlmsghdr;
  struct netlink_callback;
  extern int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb);
+extern int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb);
  extern int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
  extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
  extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
diff -ruN linux-2.4.21.org/net/ipv6/addrconf.c linux-2.4.21/net/ipv6/addrconf.c
--- linux-2.4.21.org/net/ipv6/addrconf.c	2003-06-13 07:51:39.000000000 -0700
+++ linux-2.4.21/net/ipv6/addrconf.c	2003-06-26 15:05:27.000000000 -0700
@@ -101,7 +101,7 @@

  static int addrconf_ifdown(struct net_device *dev, int how);

-static void addrconf_dad_start(struct inet6_ifaddr *ifp);
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags);
  static void addrconf_dad_timer(unsigned long data);
  static void addrconf_dad_completed(struct inet6_ifaddr *ifp);
  static void addrconf_rs_timer(unsigned long data);
@@ -889,7 +889,7 @@
  	rtmsg.rtmsg_dst_len = 8;
  	rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
  	rtmsg.rtmsg_ifindex = dev->ifindex;
-	rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF;
+	rtmsg.rtmsg_flags = RTF_UP;
  	rtmsg.rtmsg_type = RTMSG_NEWROUTE;
  	ip6_route_add(&rtmsg, NULL);
  }
@@ -916,7 +916,7 @@
  	struct in6_addr addr;

  	ipv6_addr_set(&addr,  htonl(0xFE800000), 0, 0, 0);
-	addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF);
+	addrconf_prefix_route(&addr, 64, dev, 0, 0);
  }

  static struct inet6_dev *addrconf_add_dev(struct net_device *dev)
@@ -1054,7 +1054,7 @@
  				return;
  			}

-			addrconf_dad_start(ifp);
+			addrconf_dad_start(ifp, RTF_ADDRCONF);
  		}

  		if (ifp && valid_lft == 0) {
@@ -1166,7 +1166,7 @@

  	ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT);
  	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
  		in6_ifa_put(ifp);
  		return 0;
  	}
@@ -1341,7 +1341,7 @@

  	ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT);
  	if (!IS_ERR(ifp)) {
-		addrconf_dad_start(ifp);
+		addrconf_dad_start(ifp, 0);
  		in6_ifa_put(ifp);
  	}
  }
@@ -1578,8 +1578,7 @@
  		memset(&rtmsg, 0, sizeof(struct in6_rtmsg));
  		rtmsg.rtmsg_type = RTMSG_NEWROUTE;
  		rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF;
-		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF |
-				     RTF_DEFAULT | RTF_UP);
+		rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP);

  		rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex;

@@ -1593,7 +1592,7 @@
  /*
   *	Duplicate Address Detection
   */
-static void addrconf_dad_start(struct inet6_ifaddr *ifp)
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags)
  {
  	struct net_device *dev;
  	unsigned long rand_num;
@@ -1603,7 +1602,7 @@
  	addrconf_join_solict(dev, &ifp->addr);

  	if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT))
-		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF);
+		addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags);

  	net_srandom(ifp->addr.s6_addr32[3]);
  	rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1);
@@ -1971,6 +1970,42 @@
  	netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC);
  }

+int inet6_dump_linkflags(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int ifindex, flags = 0;
+	struct net_device *dev;
+	struct inet6_dev *idev;
+	struct nlmsghdr *nlh;
+	struct ifp_if6info *ifp = NLMSG_DATA(cb->nlh);
+	unsigned char *org_tail = skb->tail;
+
+	ifindex = ifp->ifindex;
+
+	if ((dev = dev_get_by_index(ifindex)) == NULL)
+		goto out;
+	if ((idev = in6_dev_get(dev)) != NULL) {
+		flags = idev->if_flags;
+		in6_dev_put(idev);
+	}
+	dev_put(dev);
+
+	nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq,
+			RTA_LINKFLAGS, sizeof(*ifp));
+	ifp = NLMSG_DATA(nlh);
+	ifp->flags = flags;
+	ifp->ifindex = ifindex;	/* duplicate information for user to verify */
+
+	nlh->nlmsg_len = skb->tail - org_tail;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "inet6_dump_linkflags:skb size not enough\n");
+	skb_trim(skb, org_tail - skb->data);
+
+out:
+	return -1;
+}
+
  static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX-RTM_BASE+1] =
  {
  	{ NULL,			NULL,			},
@@ -1987,6 +2022,41 @@
  	{ inet6_rtm_delroute,	NULL,			},
  	{ inet6_rtm_getroute,	inet6_dump_fib,		},
  	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			inet6_dump_linkflags	},
+	{ NULL,			NULL,			},
+
+	{ NULL,			NULL,			},
+	{ NULL,			NULL,			},
+	{ NULL,			inet6_dump_prefix	},
+	{ NULL,			NULL,			},
  };

  static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
@@ -2200,7 +2270,7 @@
  #ifdef CONFIG_PROC_FS
  	proc_net_create("if_inet6", 0, iface_proc_info);
  #endif
-	
+
  	addrconf_verify(0);
  	rtnetlink_links[PF_INET6] = inet6_rtnetlink_table;
  #ifdef CONFIG_SYSCTL
diff -ruN linux-2.4.21.org/net/ipv6/ndisc.c linux-2.4.21/net/ipv6/ndisc.c
--- linux-2.4.21.org/net/ipv6/ndisc.c	2003-06-13 07:51:39.000000000 -0700
+++ linux-2.4.21/net/ipv6/ndisc.c	2003-06-26 09:35:05.000000000 -0700
@@ -940,6 +940,16 @@
  		 */
  		in6_dev->if_flags |= IF_RA_RCVD;
  	}
+	/*
+	 * Remember the managed/otherconf flags from most recently
+	 * received RA message (RFC 2462) -- yoshfuji
+	 */
+	in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED|
+				IF_RA_OTHERCONF)) |
+				(ra_msg->icmph.icmp6_addrconf_managed ?
+					IF_RA_MANAGED : 0) |
+				(ra_msg->icmph.icmp6_addrconf_other ?
+					IF_RA_OTHERCONF : 0);

  	lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);

diff -ruN linux-2.4.21.org/net/ipv6/route.c linux-2.4.21/net/ipv6/route.c
--- linux-2.4.21.org/net/ipv6/route.c	2003-06-13 07:51:39.000000000 -0700
+++ linux-2.4.21/net/ipv6/route.c	2003-06-26 09:35:05.000000000 -0700
@@ -1627,6 +1627,66 @@
  	return 0;
  }

+static int rt6_fill_prefix(struct sk_buff *skb, struct rt6_info *rt,
+			 int type, u32 pid, u32 seq)
+{
+	struct in6_prefix_msg *pmsg;
+	struct nlmsghdr  *nlh;
+	unsigned char *b = skb->tail;
+
+	nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*pmsg));
+	pmsg = NLMSG_DATA(nlh);
+	pmsg->ifindex = rt->rt6i_dev->ifindex;
+	pmsg->prefix_len = rt->rt6i_dst.plen;
+	ipv6_addr_copy(&pmsg->prefix, &rt->rt6i_dst.addr);
+	nlh->nlmsg_len = skb->tail - b;
+	return skb->len;
+
+nlmsg_failure:
+	printk(KERN_INFO "rt6_fill_prefix:skb size not enough\n");
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+static int rt6_dump_route_prefix(struct rt6_info *rt, void *p_arg)
+{
+	int addr_type;
+	struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg;
+
+	/*
+	 * Definition of a prefix :
+	 * 	- Should be autoconfigured
+	 *	- No nexthop
+	 *	- Not a linklocal, loopback or multicast type.
+	 */
+	if (rt->rt6i_nexthop || (rt->rt6i_flags & RTF_ADDRCONF) == 0)
+		return 0;
+	addr_type = ipv6_addr_type(&rt->rt6i_dst.addr);
+	if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK |
+			IPV6_ADDR_MULTICAST)) != 0 ||
+			addr_type == IPV6_ADDR_ANY)
+		return 0;
+	return rt6_fill_prefix(arg->skb, rt, RTM_GETPLIST,
+		     NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq);
+}
+
+static int fib6_dump_prefix(struct fib6_walker_t *w)
+{
+	int res;
+	struct rt6_info *rt;
+
+	for (rt = w->leaf; rt; rt = rt->u.next) {
+		res = rt6_dump_route_prefix(rt, w->args);
+		if (res < 0) {
+			/* Frame is full, suspend walking */
+			w->leaf = rt;
+			return 1;
+		}
+	}
+	w->leaf = NULL;
+	return 0;
+}
+
  static void fib6_dump_end(struct netlink_callback *cb)
  {
  	struct fib6_walker_t *w = (void*)cb->args[0];
@@ -1648,7 +1708,8 @@
  	return cb->done(cb);
  }

-int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+static int __inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb,
+			int prefix)
  {
  	struct rt6_rtnl_dump_arg arg;
  	struct fib6_walker_t *w;
@@ -1675,7 +1736,10 @@
  		RT6_TRACE("dump<%p", w);
  		memset(w, 0, sizeof(*w));
  		w->root = &ip6_routing_table;
-		w->func = fib6_dump_node;
+		if (prefix)
+			w->func = fib6_dump_prefix;
+		else
+			w->func = fib6_dump_node;
  		w->args = &arg;
  		cb->args[0] = (long)w;
  		read_lock_bh(&rt6_lock);
@@ -1702,6 +1766,16 @@
  	return res;
  }

+int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return __inet6_dump_fib(skb, cb, 0);
+}
+
+int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return __inet6_dump_fib(skb, cb, 1);
+}
+
  int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg)
  {
  	struct rtattr **rta = arg;





* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-26 16:32       ` Krishna Kumar
@ 2003-06-27  6:07         ` David S. Miller
  2003-06-27 15:45           ` Krishna Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2003-06-27  6:07 UTC (permalink / raw)
  To: krkumar; +Cc: yoshfuji, netdev, linux-net

   From: Krishna Kumar <krkumar@us.ibm.com>
   Date: Thu, 26 Jun 2003 09:32:23 -0700

I still have problems with this patch.
   
   -#define	RTM_MAX		(RTM_BASE+31)
   +#define	RTM_GETLNKFLAGS	(RTM_BASE+34)
   +
   +#define	RTM_GETPLIST	(RTM_BASE+38)

Please allocate contiguous numbers to the new messages, don't skip
around like this.

Thanks.  (this of course means you have to redo your 2.4.x patch
as well)


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-27  6:07         ` David S. Miller
@ 2003-06-27 15:45           ` Krishna Kumar
  2003-06-27 21:47             ` David S. Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Krishna Kumar @ 2003-06-27 15:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: yoshfuji, netdev, linux-net

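rtnetlink_rcv_msg() calls dumpit() (via netlink_dump_start) only for those
messages whose type has binary '10' in its last two bits. So I had to use these
values: after RTM_GETTFILTER (RTM_BASE+30), the next two numbers of that form
are RTM_BASE+34 and RTM_BASE+38. All the other *GET* macros use the same
semantics.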
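
The numbering constraint described above can be checked with a minimal standalone sketch (not kernel code). It assumes RTM_BASE is 16 and that the dump path is selected when the low two bits of the message number are binary '10'; the helpers rtm_kind(), rtm_is_get() and rtm_next_get() are made up for this example.

```c
#include <assert.h>

/* Assumption: mirrors RTM_BASE from include/linux/rtnetlink.h. */
#define RTM_BASE	16

/* Message numbers taken from the patch in this thread. */
#define RTM_GETTFILTER	(RTM_BASE + 30)
#define RTM_GETLNKFLAGS	(RTM_BASE + 34)
#define RTM_GETPLIST	(RTM_BASE + 38)

/* Hypothetical helper: low two bits of the message number relative to RTM_BASE. */
static int rtm_kind(int type)
{
	return (type - RTM_BASE) & 3;
}

/* Hypothetical helper: nonzero when the type sits in a "GET" slot (binary '10'). */
static int rtm_is_get(int type)
{
	return rtm_kind(type) == 2;
}

/* Hypothetical helper: the next message number above 'type' that is a GET slot. */
static int rtm_next_get(int type)
{
	int next = type + 1;

	while (!rtm_is_get(next))
		next++;
	return next;
}

/* Self-check: the two new messages are the next two GET slots. */
static void rtm_self_check(void)
{
	assert(rtm_is_get(RTM_GETTFILTER));
	assert(rtm_next_get(RTM_GETTFILTER) == RTM_GETLNKFLAGS);
	assert(rtm_next_get(RTM_GETLNKFLAGS) == RTM_GETPLIST);
}
```

Under these assumptions, strictly contiguous numbers such as RTM_BASE+31 and RTM_BASE+32 would not reach the dump path, which is why the new messages are spaced four apart.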

thanks,

- KK


David S. Miller wrote:
>    From: Krishna Kumar <krkumar@us.ibm.com>
>    Date: Thu, 26 Jun 2003 09:32:23 -0700
> 
> I still have problems with this patch.
>    
>    -#define	RTM_MAX		(RTM_BASE+31)
>    +#define	RTM_GETLNKFLAGS	(RTM_BASE+34)
>    +
>    +#define	RTM_GETPLIST	(RTM_BASE+38)
> 
> Please allocate contiguous numbers to the new messages, don't skip
> around like this.
> 
> Thanks.  (this of course means you have to redo your 2.4.x patch
> as well)
> 



* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-27 15:45           ` Krishna Kumar
@ 2003-06-27 21:47             ` David S. Miller
  2003-06-28  4:06               ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2003-06-27 21:47 UTC (permalink / raw)
  To: krkumar; +Cc: yoshfuji, netdev, linux-net

   From: Krishna Kumar <krkumar@us.ibm.com>
   Date: Fri, 27 Jun 2003 08:45:19 -0700

   rtnetlink_rcv_msg() calls dumpit() (via netlink_dump_start) only
   for those messages for which the last two bits are binary '10'. So
   I had to use these values. All the other *GET* macros use the same
   semantics.

Ok, please retransmit your two patches (2.4.x and 2.5.x) to me
under separate cover.  I don't keep a copy around of patches
I've decided not to apply.



* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-27 21:47             ` David S. Miller
@ 2003-06-28  4:06               ` YOSHIFUJI Hideaki / 吉藤英明
  2003-06-30 18:54                 ` Krishna Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2003-06-28  4:06 UTC (permalink / raw)
  To: davem; +Cc: krkumar, netdev, linux-net

In article <20030627.144752.78715628.davem@redhat.com> (at Fri, 27 Jun 2003 14:47:52 -0700 (PDT)), "David S. Miller" <davem@redhat.com> says:

>    From: Krishna Kumar <krkumar@us.ibm.com>
>    Date: Fri, 27 Jun 2003 08:45:19 -0700
> 
>    rtnetlink_rcv_msg() calls dumpit() (via netlink_dump_start) only
>    for those messages for which the last two bits are binary '10'. So
>    I had to use these values. All the other *GET* macros use the same
>    semantics.
> 
> Ok, please retransmit your two patches (2.4.x and 2.5.x) to me
> under separate cover.  I don't keep a copy around of patches
> I've decided not to apply.

Well...

1. Is it okay to have another hook for grabbing the prefix list?
   Userspace applications can get such information via
   - routing table
   - interface flag

2. Are the "managed" flags etc., which are per-interface variables,
   really NEWROUTE information?
   They are NOT an L2 thing, but they are per-link information.
   I think they are a NEWLINK thing.

What I'm thinking is:

 - fix "ADDRCONF" flag in route information
 - managed / other flags via NEWLINK message
(- No new interface to get prefix itself.)

-- 
Hideaki YOSHIFUJI @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG FP: 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-28  4:06               ` YOSHIFUJI Hideaki / 吉藤英明
@ 2003-06-30 18:54                 ` Krishna Kumar
  2003-07-02  0:18                   ` YOSHIFUJI Hideaki / 吉藤英明
  0 siblings, 1 reply; 13+ messages in thread
From: Krishna Kumar @ 2003-06-30 18:54 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki; +Cc: David S. Miller, netdev, linux-net


> 1. Is it okay to add another hook just for grabbing the prefix list?
>    A userspace application can already get this information via
>    - the routing table
>    - the interface flags
> 
> 2. Is the "managed" flag etc., which is a per-interface variable,
>    really NEWROUTE information?
>    It is NOT an L2 thing, but it is per-link information.
>    I think it is a NEWLINK thing.
> 
> What I'm thinking is:
> 
>  - fix the "ADDRCONF" flag in the route information
>  - provide the managed / other flags via a NEWLINK message
> (- No new interface to get the prefix list itself.)


Well, there are two reasons that I can see not to do so (the ADDRCONF flag is
already fixed in the earlier patch):

-  With the latest submission, the actual code to get the prefix list itself
    is very small: the top-level inet6_dump_fib uses either dump_node or
    dump_prefix, the latter being the new function added. This is the whole
    user interface, 50-odd lines of code including comments.
-  If I understood your point about using the interface flag and the routing
    table, you are suggesting that the user can look at the rttable and pick
    out the prefix entries by making checks. This is non-trivial: e.g. the
    address should not be LL or MC, there should be no nexthop, the route
    should have been added via an RA, etc. A dedicated user interface makes
    it easier to get the prefix list without significant bloat to the kernel,
    and the user doesn't have to make a lot of checks to get the system
    prefixes. I don't see much gain from the routing-table approach.
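
For illustration, the checks listed above could be sketched in user space
roughly as follows. This is a sketch under assumptions, not code from the
patch: struct route_view and both helper functions are hypothetical, and the
two flag values are copied from the Linux headers under local names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Values of RTF_GATEWAY and RTF_ADDRCONF as defined in the Linux headers,
 * redefined under local names to keep the sketch self-contained. */
#define SK_RTF_GATEWAY   0x0002u
#define SK_RTF_ADDRCONF  0x00040000u

/* Hypothetical, simplified view of one IPv6 routing table entry. */
struct route_view {
	struct in6_addr dst;
	uint32_t flags;
};

/* Returns 1 if the entry looks like an on-link prefix learned from an RA. */
static int looks_like_ra_prefix(const struct route_view *r)
{
	assert(r != NULL);
	/* The destination must not be link-local (LL) or multicast (MC). */
	if (IN6_IS_ADDR_LINKLOCAL(&r->dst) || IN6_IS_ADDR_MULTICAST(&r->dst))
		return 0;
	/* There must be no nexthop. */
	if (r->flags & SK_RTF_GATEWAY)
		return 0;
	/* The route must have been added on receipt of an RA. */
	if (!(r->flags & SK_RTF_ADDRCONF))
		return 0;
	return 1;
}

/* Convenience wrapper: parse a textual destination and classify it.
 * Returns -1 if the address does not parse. */
static int looks_like_ra_prefix_str(const char *dst, uint32_t flags)
{
	struct route_view r = { .flags = flags };

	if (inet_pton(AF_INET6, dst, &r.dst) != 1)
		return -1;
	return looks_like_ra_prefix(&r);
}
```

Every application that wants the prefix list would have to carry some
version of this filter, and the "etc." above means the real list of checks
is longer than the three shown here.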

About your point on the managed flag: I think of it as a per-interface flag
that gets returned when a request for the flags on that interface is made.
That's why I have made it per interface, as part of a GETLNKFLAGS operation.
I don't understand why you think it is a NEWLINK thing (I'm not sure what you
mean by that), since it is flag information about an existing device that an
RA is advertising. I want to get this information not on receipt of an RA,
but when a request is made.

Thanks,

- KK



* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-06-30 18:54                 ` Krishna Kumar
@ 2003-07-02  0:18                   ` YOSHIFUJI Hideaki / 吉藤英明
  2003-07-10 22:16                     ` Krishna Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2003-07-02  0:18 UTC (permalink / raw)
  To: krkumar; +Cc: davem, netdev, linux-net, yoshfuji, kuznet

In article <3F008771.5030206@us.ibm.com> (at Mon, 30 Jun 2003 11:54:41 -0700), Krishna Kumar <krkumar@us.ibm.com> says:

> Well, there are two reasons that I can see not to do so (the ADDRCONF flag is
> already fixed in the earlier patch):
:

You do not explain why we (or the kernel) NEED this.
How SMALL it is is not so important,
though how LARGE it is may cause problems.


> About your point on the managed flag: I think of it as a per-interface flag
> that gets returned when a request for the flags on that interface is made.
> That's why I have made it per interface, as part of a GETLNKFLAGS operation.
> I don't understand why you think it is a NEWLINK thing (I'm not sure what you
> mean by that), since it is flag information about an existing device that an
> RA is advertising. I want to get this information not on receipt of an RA,
> but when a request is made.

This is a design issue: how should we provide L3 per-interface
information to userspace, e.g. in_device and/or inet6_dev things,
including per-interface statistics?

Since I think it is not appropriate to provide per-interface
statistics via RTM_xxxROUTE, I don't agree with providing
the RA information (i.e. the Managed/Otherconf flags) via
RTM_xxxROUTE.

Options:
 - use RTM_xxxLINK for L3 operation
 - introduce RTM_xxxIFACE for L3 per-interface operations

I really want to hear from other maintainers here...
David? Alexey?


Well, to move forward, you can split your patch into 3 parts:
  1. fix routing flags
  2. provide Managed/Otherconf flags API
 (3. provide the prefix list API (if it IS required))

I'm not against the first item.
We need to discuss the design of the 2nd item.
I don't think that we really need the 3rd item.


Thank you.

-- 
Hideaki YOSHIFUJI @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG FP: 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA


* Re: [PATCH] Prefix List against 2.5.70 (re-done)
  2003-07-02  0:18                   ` YOSHIFUJI Hideaki / 吉藤英明
@ 2003-07-10 22:16                     ` Krishna Kumar
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kumar @ 2003-07-10 22:16 UTC (permalink / raw)
  To: yoshfuji; +Cc: davem, netdev, linux-net, kuznet

> You do not explain why we (or the kernel) NEED this.
> How SMALL it is is not so important,
> though how LARGE it is may cause problems.

I explained the reasons for having a prefix list i/f in my previous mail. To recap:
-  The user doesn't need to know what the definition of a prefix is; all he has to do is ask
    the kernel and get the list. Otherwise different user apps will have to know the
    definition of a prefix and parse the entries themselves. The parsing is non-trivial (e.g.
    the address should not be LL or MC, there should be no nexthop, it should have been added
    via an RA, etc.).
-  The kernel code to get the prefix list is small: the top-level inet6_dump_fib uses
    either dump_node or dump_prefix, the latter being the new user interface. Having
    a user interface makes it easier to get the prefix list without significant bloat to the
    kernel.

> This is a design issue: how should we provide L3 per-interface
> information to userspace, e.g. in_device and/or inet6_dev things,
> including per-interface statistics?
> 
> Since I think it is not appropriate to provide per-interface
> statistics via RTM_xxxROUTE, I don't agree with providing
> the RA information (i.e. the Managed/Otherconf flags) via
> RTM_xxxROUTE.
> 
> Options:
>  - use RTM_xxxLINK for L3 operation
>  - introduce RTM_xxxIFACE for L3 per-interface operations

Yes, there are a couple of different ways to do this. One is as you have suggested, but there
is a problem with it. The existing RTM_GETLINK interface returns very generic elements of the
dev (mtu, hardware address, dev statistics), while the change you suggested is specific to
IPv6. I am not sure this is a good design to implement. Either we could use the current
(submitted) way or use a different RTM_GETADDR interface in inet6_fill_ifaddr (and introduce
RTM_IFACEFLAGS). This will be specific to IPv6. Are you agreeable to this?
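
To illustrate how generic the existing link interface is, here is a minimal
user-space sketch of an RTM_GETLINK dump request. It assumes the definitions
in <linux/rtnetlink.h>; struct getlink_req and build_getlink_dump() are
hypothetical names, and the sketch only builds the request without sending it.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/* Request layout: netlink header followed by the link-level payload. */
struct getlink_req {
	struct nlmsghdr nlh;
	struct ifinfomsg ifi;
};

/* Hypothetical helper: fill in a request that asks for all links. */
static void build_getlink_dump(struct getlink_req *req, unsigned int seq)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	req->nlh.nlmsg_type = RTM_GETLINK;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->nlh.nlmsg_seq = seq;
	/* No address family is named: the request is protocol-independent. */
	req->ifi.ifi_family = AF_UNSPEC;
}
```

The replies carry attributes such as IFLA_MTU, IFLA_ADDRESS and IFLA_STATS,
none of which is tied to a particular L3 protocol.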

> Well, to move forward, you can split your patch into 3 parts:
>   1. fix routing flags
>   2. provide Managed/Otherconf flags API
>  (3. provide the prefix list API (if it IS required))
> 
> I'm not against the first item.
> We need to discuss the design of the 2nd item.
> I don't think that we really need the 3rd item.

- I am ok with 1 :-)
- I have suggested changes for 2; please let me know what you think, i.e. whether we can go
   with the old way or make the change suggested above.
- I believe we need #3 for the reasons given above.

Thanks,

- KK


Thread overview: 13+ messages
2003-06-20 20:53 [PATCH] Prefix List against 2.5.70 (re-done) Krishna Kumar
2003-06-21 14:36 ` YOSHIFUJI Hideaki / 吉藤英明
2003-06-25 17:02   ` Krishna Kumar
2003-06-26  6:42     ` David S. Miller
2003-06-26 16:32       ` Krishna Kumar
2003-06-27  6:07         ` David S. Miller
2003-06-27 15:45           ` Krishna Kumar
2003-06-27 21:47             ` David S. Miller
2003-06-28  4:06               ` YOSHIFUJI Hideaki / 吉藤英明
2003-06-30 18:54                 ` Krishna Kumar
2003-07-02  0:18                   ` YOSHIFUJI Hideaki / 吉藤英明
2003-07-10 22:16                     ` Krishna Kumar
2003-06-26 22:40       ` [PATCH] Prefix List against 2.4.21 Krishna Kumar
