All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC: rtnetlink problems with Cisco enic and VFs
@ 2014-04-22  4:14 David Gibson
  2014-04-22  4:17 ` David Gibson
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: David Gibson @ 2014-04-22  4:14 UTC (permalink / raw)
  To: netdev
  Cc: Christian Benvenuti, Sujith Sankar, Govindarajulu Varadarajan,
	Neel Patel, Nishank Trivedi

[-- Attachment #1: Type: text/plain, Size: 4840 bytes --]

I believe I've found a problem with netlink handling which can be
triggered on Cisco enic devices with a large number (30-40) of virtual
functions.  I believe this is the cause of a real customer problem
we've seen.

 * When requesting a list of interfaces with RTM_GETLINK, enic devices
   (and currently, _only_ enic devices) report IFLA_VF_PORTS
   information 

 * IFLA_VF_PORTS information has at least 90 bytes ber virtual function

 * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
   regardless of the setting of the IFLA_EXT_MASK parameter

 * When IFLA_EXT_MASK is not specified, the reply packets have maximum
   size NLMSG_GOODSIZE (4k - overheads)

 * If there are enough virtual functions the IFLA_VF_PORTS information
   can cause a single interface's info to exceed NLMSG_GOODSIZE

 * The number of interfaces necessary to trigger this is reduced
   substantially if both IPv4 and IPv6 IFLA_AF_SPEC information is
   present (~972 bytes)

 * If the dump function returns -EMSGSIZE on the first message in a
   packet, netlink_dump() incorrectly assumes the listing is done,
   omitting information for that interface and any later ones.

 * This can cause getifaddrs(3) to go into an infinite loop

 * 'ip link' is not affected, because it supplies IFLA_EXT_MASK which
   triggers rtnl_calcit() to recalculate the required packet size to
   greater than NLMSG_GOODSIZE.


I can see several possible ways to fix this, but they all have
possible problems.  I'm hoping someone here can determine which, if
any, are real problems, and therefore what's the right approach to fix
this.

    A) Always calculate the RTM_NEWLINK packet size, rather than
       assuming NLMSG_GOODSIZE.
    Problem: The NLMSG_GOODSIZE limit was introduced to stop broken
             user tools with limited buffers encountering problems
             (see 115c9b81928360d769a76c632bae62d15206a94a). This
             approach might mean that such tools break again.

    B) Don't issue the VF port information when RTEXT_FILTER_VF isn't
       set
    Problem: Do tools using the port information already set this flag?

    C) Don't include the VF port info when listing interfaces, only
       when doing GETLINK on a specific interface.
    Problem: As (B), plus it's ugly.

    D) Detect the case when the first interface in a packet doesn't fit
       reallocate the packet buffer
    Problem: As (A), plus more complicated.

As an interim band-aid, here's a patch which adds a WARN_ON() in this
situation, which will at least make the problem easier to locate for
the next person to encounter it.

From: David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH] rtnetlink: Warn when interface's information won't fit
         in our packet

Without IFLA_EXT_MASK specified, the information reported for a single
interface in response to RTM_GETLINK is expected to fit within a netlink
packet of NLMSG_GOODSIZE.

If it doesn't, however, things will go badly wrong,  When listing all
interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
message in a packet as the end of the listing and omit information for
that interface and all subsequent ones.  This can cause getifaddrs(3) to
enter an infinite loop.

This patch won't fix the problem, but it will WARN_ON() making it
easier to track down what's going wrong.


Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 net/core/rtnetlink.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d4ff417..5331db2 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1198,6 +1198,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb,
struct netlink_callback *cb) struct hlist_head *head;
 	struct nlattr *tb[IFLA_MAX+1];
 	u32 ext_filter_mask = 0;
+	int err;
 
 	s_h = cb->args[0];
 	s_idx = cb->args[1];
@@ -1218,11 +1219,16 @@ static int rtnl_dump_ifinfo(struct sk_buff
*skb, struct netlink_callback *cb) hlist_for_each_entry_rcu(dev, head,
index_hlist) { if (idx < s_idx)
 				goto cont;
-			if (rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
-					     NETLINK_CB
(cb->skb).portid,
-					     cb->nlh->nlmsg_seq, 0,
-					     NLM_F_MULTI,
-					     ext_filter_mask) <= 0)
+			err = rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
+					       NETLINK_CB
(cb->skb).portid,
+					       cb->nlh->nlmsg_seq, 0,
+					       NLM_F_MULTI,
+					       ext_filter_mask);
+			/* If we ran out of room on the first message,
+			 * we're in trouble */
+			WARN_ON((err == -EMSGSIZE) && (skb->len == 0));
+
+			if (err <= 0)
 				goto out;
 
 			nl_dump_check_consistent(cb, nlmsg_hdr(skb));
-- 
1.9.0



-- 
David Gibson <dgibson@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-22  4:14 RFC: rtnetlink problems with Cisco enic and VFs David Gibson
@ 2014-04-22  4:17 ` David Gibson
  2014-04-22 18:03 ` Ben Hutchings
  2014-04-22 19:14 ` Christian Benvenuti (benve)
  2 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2014-04-22  4:17 UTC (permalink / raw)
  To: David Gibson
  Cc: netdev, Christian Benvenuti, Sujith Sankar,
	Govindarajulu Varadarajan, Neel Patel, Nishank Trivedi

[-- Attachment #1: Type: text/plain, Size: 2440 bytes --]

On Tue, 22 Apr 2014 14:14:25 +1000
David Gibson <dgibson@redhat.com> wrote:
[snip]

Sorry, had the wrong mailer setting, which mangled the patch.  Trying
again:

>From a8c1396c45449c0692560844d148abe7d5d1c2cc Mon Sep 17 00:00:00 2001
From: David Gibson <david@gibson.dropbear.id.au>
Date: Tue, 22 Apr 2014 14:13:16 +1000
Subject: [PATCH] rtnetlink: Warn when interface's information won't fit in our
 packet

Without IFLA_EXT_MASK specified, the information reported for a single
interface in response to RTM_GETLINK is expected to fit within a netlink
packet of NLMSG_GOODSIZE.

If it doesn't, however, things will go badly wrong,  When listing all
interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
message in a packet as the end of the listing and omit information for
that interface and all subsequent ones.  This can cause getifaddrs(3) to
enter an infinite loop.

This patch won't fix the problem, but it will WARN_ON() making it easier to
track down what's going wrong.


Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 net/core/rtnetlink.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d4ff417..5331db2 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1198,6 +1198,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 	struct hlist_head *head;
 	struct nlattr *tb[IFLA_MAX+1];
 	u32 ext_filter_mask = 0;
+	int err;
 
 	s_h = cb->args[0];
 	s_idx = cb->args[1];
@@ -1218,11 +1219,16 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 		hlist_for_each_entry_rcu(dev, head, index_hlist) {
 			if (idx < s_idx)
 				goto cont;
-			if (rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
-					     NETLINK_CB(cb->skb).portid,
-					     cb->nlh->nlmsg_seq, 0,
-					     NLM_F_MULTI,
-					     ext_filter_mask) <= 0)
+			err = rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
+					       NETLINK_CB(cb->skb).portid,
+					       cb->nlh->nlmsg_seq, 0,
+					       NLM_F_MULTI,
+					       ext_filter_mask);
+			/* If we ran out of room on the first message,
+			 * we're in trouble */
+			WARN_ON((err == -EMSGSIZE) && (skb->len == 0));
+
+			if (err <= 0)
 				goto out;
 
 			nl_dump_check_consistent(cb, nlmsg_hdr(skb));
-- 
1.9.0



-- 
David Gibson <dgibson@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-22  4:14 RFC: rtnetlink problems with Cisco enic and VFs David Gibson
  2014-04-22  4:17 ` David Gibson
@ 2014-04-22 18:03 ` Ben Hutchings
  2014-04-22 18:12   ` David Miller
  2014-04-22 19:14 ` Christian Benvenuti (benve)
  2 siblings, 1 reply; 13+ messages in thread
From: Ben Hutchings @ 2014-04-22 18:03 UTC (permalink / raw)
  To: David Gibson
  Cc: netdev, Christian Benvenuti, Sujith Sankar,
	Govindarajulu Varadarajan, Neel Patel, Nishank Trivedi

[-- Attachment #1: Type: text/plain, Size: 874 bytes --]

On Tue, 2014-04-22 at 14:14 +1000, David Gibson wrote:
> I believe I've found a problem with netlink handling which can be
> triggered on Cisco enic devices with a large number (30-40) of virtual
> functions.  I believe this is the cause of a real customer problem
> we've seen.
> 
>  * When requesting a list of interfaces with RTM_GETLINK, enic devices
>    (and currently, _only_ enic devices) report IFLA_VF_PORTS
>    information 
> 
>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual function
> 
>  * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
>    regardless of the setting of the IFLA_EXT_MASK parameter
[...]

So I think you should make reporting of IFLA_VF_PORTS dependent on the
same flag as IFLA_VFINFO_LIST.

Ben.

-- 
Ben Hutchings
Beware of programmers who carry screwdrivers. - Leonard Brandwein

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-22 18:03 ` Ben Hutchings
@ 2014-04-22 18:12   ` David Miller
  2014-04-22 23:26     ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2014-04-22 18:12 UTC (permalink / raw)
  To: ben; +Cc: dgibson, netdev, benve, ssujith, govindarajulu90, neepatel, nistrive

From: Ben Hutchings <ben@decadent.org.uk>
Date: Tue, 22 Apr 2014 19:03:19 +0100

> On Tue, 2014-04-22 at 14:14 +1000, David Gibson wrote:
>> I believe I've found a problem with netlink handling which can be
>> triggered on Cisco enic devices with a large number (30-40) of virtual
>> functions.  I believe this is the cause of a real customer problem
>> we've seen.
>> 
>>  * When requesting a list of interfaces with RTM_GETLINK, enic devices
>>    (and currently, _only_ enic devices) report IFLA_VF_PORTS
>>    information 
>> 
>>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual function
>> 
>>  * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
>>    regardless of the setting of the IFLA_EXT_MASK parameter
> [...]
> 
> So I think you should make reporting of IFLA_VF_PORTS dependent on the
> same flag as IFLA_VFINFO_LIST.

I think that's what we'll have to do.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: rtnetlink problems with Cisco enic and VFs
  2014-04-22  4:14 RFC: rtnetlink problems with Cisco enic and VFs David Gibson
  2014-04-22  4:17 ` David Gibson
  2014-04-22 18:03 ` Ben Hutchings
@ 2014-04-22 19:14 ` Christian Benvenuti (benve)
  2014-04-22 23:24   ` David Gibson
  2 siblings, 1 reply; 13+ messages in thread
From: Christian Benvenuti (benve) @ 2014-04-22 19:14 UTC (permalink / raw)
  To: David Gibson, netdev
  Cc: Sujith Sankar (ssujith),
	Govindarajulu Varadarajan, Neel Patel (neepatel),
	Nishank Trivedi

David,

> -----Original Message-----
> From: David Gibson [mailto:dgibson@redhat.com]
> Sent: Monday, April 21, 2014 9:14 PM
> To: netdev@vger.kernel.org
> Cc: Christian Benvenuti (benve); Sujith Sankar (ssujith); Govindarajulu
> Varadarajan; Neel Patel (neepatel); Nishank Trivedi
> Subject: RFC: rtnetlink problems with Cisco enic and VFs
> 
> I believe I've found a problem with netlink handling which can be triggered
> on Cisco enic devices with a large number (30-40) of virtual functions.  I
> believe this is the cause of a real customer problem we've seen.
> 
>  * When requesting a list of interfaces with RTM_GETLINK, enic devices
>    (and currently, _only_ enic devices) report IFLA_VF_PORTS
>    information

Is the fact that Enic is the only driver implementing ndo_get_vf_port [1]
the root cause of the problem and the reason why this happens only with Enic?

/Chris

[1]
This is what makes rtnl_port_size to account for vf_port_size*dev_num_vf(...)
and rtnl_port_fill to add IFLA_VF_PORTS.
 
>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual function
> 
>  * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
>    regardless of the setting of the IFLA_EXT_MASK parameter
> 
>  * When IFLA_EXT_MASK is not specified, the reply packets have maximum
>    size NLMSG_GOODSIZE (4k - overheads)
> 
>  * If there are enough virtual functions the IFLA_VF_PORTS information
>    can cause a single interface's info to exceed NLMSG_GOODSIZE
> 
>  * The number of interfaces necessary to trigger this is reduced
>    substantially if both IPv4 and IPv6 IFLA_AF_SPEC information is
>    present (~972 bytes)
> 
>  * If the dump function returns -EMSGSIZE on the first message in a
>    packet, netlink_dump() incorrectly assumes the listing is done,
>    omitting information for that interface and any later ones.
> 
>  * This can cause getifaddrs(3) to go into an infinite loop
> 
>  * 'ip link' is not affected, because it supplies IFLA_EXT_MASK which
>    triggers rtnl_calcit() to recalculate the required packet size to
>    greater than NLMSG_GOODSIZE.
> 
> 
> I can see several possible ways to fix this, but they all have possible
> problems.  I'm hoping someone here can determine which, if any, are real
> problems, and therefore what's the right approach to fix this.
> 
>     A) Always calculate the RTM_NEWLINK packet size, rather than
>        assuming NLMSG_GOODSIZE.
>     Problem: The NLMSG_GOODSIZE limit was introduced to stop broken
>              user tools with limited buffers encountering problems
>              (see 115c9b81928360d769a76c632bae62d15206a94a). This
>              approach might mean that such tools break again.
> 
>     B) Don't issue the VF port information when RTEXT_FILTER_VF isn't
>        set
>     Problem: Do tools using the port information already set this flag?
> 
>     C) Don't include the VF port info when listing interfaces, only
>        when doing GETLINK on a specific interface.
>     Problem: As (B), plus it's ugly.
> 
>     D) Detect the case when the first interface in a packet doesn't fit
>        reallocate the packet buffer
>     Problem: As (A), plus more complicated.
> 
> As an interim band-aid, here's a patch which adds a WARN_ON() in this
> situation, which will at least make the problem easier to locate for the next
> person to encounter it.
> 
> From: David Gibson <david@gibson.dropbear.id.au>
> Subject: [PATCH] rtnetlink: Warn when interface's information won't fit
>          in our packet
> 
> Without IFLA_EXT_MASK specified, the information reported for a single
> interface in response to RTM_GETLINK is expected to fit within a netlink
> packet of NLMSG_GOODSIZE.
> 
> If it doesn't, however, things will go badly wrong,  When listing all
> interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
> message in a packet as the end of the listing and omit information for that
> interface and all subsequent ones.  This can cause getifaddrs(3) to enter an
> infinite loop.
> 
> This patch won't fix the problem, but it will WARN_ON() making it easier to
> track down what's going wrong.
> 
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  net/core/rtnetlink.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index
> d4ff417..5331db2 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1198,6 +1198,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb,
> struct netlink_callback *cb) struct hlist_head *head;
>  	struct nlattr *tb[IFLA_MAX+1];
>  	u32 ext_filter_mask = 0;
> +	int err;
> 
>  	s_h = cb->args[0];
>  	s_idx = cb->args[1];
> @@ -1218,11 +1219,16 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb,
> struct netlink_callback *cb) hlist_for_each_entry_rcu(dev, head,
> index_hlist) { if (idx < s_idx)
>  				goto cont;
> -			if (rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
> -					     NETLINK_CB
> (cb->skb).portid,
> -					     cb->nlh->nlmsg_seq, 0,
> -					     NLM_F_MULTI,
> -					     ext_filter_mask) <= 0)
> +			err = rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
> +					       NETLINK_CB
> (cb->skb).portid,
> +					       cb->nlh->nlmsg_seq, 0,
> +					       NLM_F_MULTI,
> +					       ext_filter_mask);
> +			/* If we ran out of room on the first message,
> +			 * we're in trouble */
> +			WARN_ON((err == -EMSGSIZE) && (skb->len == 0));
> +
> +			if (err <= 0)
>  				goto out;
> 
>  			nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> --
> 1.9.0
> 
> 
> 
> --
> David Gibson <dgibson@redhat.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: rtnetlink problems with Cisco enic and VFs
  2014-04-22 19:14 ` Christian Benvenuti (benve)
@ 2014-04-22 23:24   ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2014-04-22 23:24 UTC (permalink / raw)
  To: Christian Benvenuti (benve)
  Cc: netdev, Sujith Sankar (ssujith),
	Govindarajulu Varadarajan, Neel Patel (neepatel),
	Nishank Trivedi

[-- Attachment #1: Type: text/plain, Size: 1268 bytes --]

On Tue, 22 Apr 2014 19:14:04 +0000
"Christian Benvenuti (benve)" <benve@cisco.com> wrote:

> David,
> 
> > -----Original Message-----
> > From: David Gibson [mailto:dgibson@redhat.com]
> > Sent: Monday, April 21, 2014 9:14 PM
> > To: netdev@vger.kernel.org
> > Cc: Christian Benvenuti (benve); Sujith Sankar (ssujith); Govindarajulu
> > Varadarajan; Neel Patel (neepatel); Nishank Trivedi
> > Subject: RFC: rtnetlink problems with Cisco enic and VFs
> > 
> > I believe I've found a problem with netlink handling which can be triggered
> > on Cisco enic devices with a large number (30-40) of virtual functions.  I
> > believe this is the cause of a real customer problem we've seen.
> > 
> >  * When requesting a list of interfaces with RTM_GETLINK, enic devices
> >    (and currently, _only_ enic devices) report IFLA_VF_PORTS
> >    information
> 
> Is the fact that Enic is the only driver implementing ndo_get_vf_port [1]
> the root cause of the problem and the reason why this happens only with Enic?

Yes, or at least it's one of the factors.

> 
> /Chris
> 
> [1]
> This is what makes rtnl_port_size to account for vf_port_size*dev_num_vf(...)
> and rtnl_port_fill to add IFLA_VF_PORTS.

-- 
David Gibson <dgibson@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-22 18:12   ` David Miller
@ 2014-04-22 23:26     ` David Gibson
  2014-04-23  0:04       ` Greg Rose
  2014-04-23  0:59       ` David Miller
  0 siblings, 2 replies; 13+ messages in thread
From: David Gibson @ 2014-04-22 23:26 UTC (permalink / raw)
  To: David Miller
  Cc: ben, netdev, benve, ssujith, govindarajulu90, neepatel, nistrive

[-- Attachment #1: Type: text/plain, Size: 1247 bytes --]

On Tue, 22 Apr 2014 14:12:00 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Tue, 22 Apr 2014 19:03:19 +0100
> 
> > On Tue, 2014-04-22 at 14:14 +1000, David Gibson wrote:
> >> I believe I've found a problem with netlink handling which can be
> >> triggered on Cisco enic devices with a large number (30-40) of virtual
> >> functions.  I believe this is the cause of a real customer problem
> >> we've seen.
> >> 
> >>  * When requesting a list of interfaces with RTM_GETLINK, enic devices
> >>    (and currently, _only_ enic devices) report IFLA_VF_PORTS
> >>    information 
> >> 
> >>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual function
> >> 
> >>  * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
> >>    regardless of the setting of the IFLA_EXT_MASK parameter
> > [...]
> > 
> > So I think you should make reporting of IFLA_VF_PORTS dependent on the
> > same flag as IFLA_VFINFO_LIST.
> 
> I think that's what we'll have to do.

Ok, makes logical sense.

But does anyone know what tools make use of the IFLA_VF_PORTS
information?  Do they set the IFLA_EXT_MASK already?

-- 
David Gibson <dgibson@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-22 23:26     ` David Gibson
@ 2014-04-23  0:04       ` Greg Rose
  2014-04-23  1:12         ` David Gibson
  2014-04-23  0:59       ` David Miller
  1 sibling, 1 reply; 13+ messages in thread
From: Greg Rose @ 2014-04-23  0:04 UTC (permalink / raw)
  To: David Gibson
  Cc: David Miller, ben, netdev, benve, ssujith, govindarajulu90,
	neepatel, nistrive

On Wed, 23 Apr 2014 09:26:06 +1000
David Gibson <dgibson@redhat.com> wrote:

> On Tue, 22 Apr 2014 14:12:00 -0400 (EDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > From: Ben Hutchings <ben@decadent.org.uk>
> > Date: Tue, 22 Apr 2014 19:03:19 +0100
> > 
> > > On Tue, 2014-04-22 at 14:14 +1000, David Gibson wrote:
> > >> I believe I've found a problem with netlink handling which can be
> > >> triggered on Cisco enic devices with a large number (30-40) of
> > >> virtual functions.  I believe this is the cause of a real
> > >> customer problem we've seen.
> > >> 
> > >>  * When requesting a list of interfaces with RTM_GETLINK, enic
> > >> devices (and currently, _only_ enic devices) report IFLA_VF_PORTS
> > >>    information 
> > >> 
> > >>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual
> > >> function
> > >> 
> > >>  * Unlike IFLA_VFINFO_LIST, the ports information is always
> > >> reported, regardless of the setting of the IFLA_EXT_MASK
> > >> parameter
> > > [...]
> > > 
> > > So I think you should make reporting of IFLA_VF_PORTS dependent
> > > on the same flag as IFLA_VFINFO_LIST.
> > 
> > I think that's what we'll have to do.
> 
> Ok, makes logical sense.
> 
> But does anyone know what tools make use of the IFLA_VF_PORTS
> information?  Do they set the IFLA_EXT_MASK already?
> 

So far as I know only the IP route tool 'ip link' sets that.  In fact,
that's the reason I had to add it some number of years and months ago
was because there were tools that didn't expect to get all the
additional VF info and those tools were getting borked by all the
additional goo sent up for VFs.

Beyond that who knows what anyone's been up to with other tools in
other places?

- Greg

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-22 23:26     ` David Gibson
  2014-04-23  0:04       ` Greg Rose
@ 2014-04-23  0:59       ` David Miller
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2014-04-23  0:59 UTC (permalink / raw)
  To: dgibson; +Cc: ben, netdev, benve, ssujith, govindarajulu90, neepatel, nistrive

From: David Gibson <dgibson@redhat.com>
Date: Wed, 23 Apr 2014 09:26:06 +1000

> On Tue, 22 Apr 2014 14:12:00 -0400 (EDT)
> David Miller <davem@davemloft.net> wrote:
> 
>> From: Ben Hutchings <ben@decadent.org.uk>
>> Date: Tue, 22 Apr 2014 19:03:19 +0100
>> 
>> > On Tue, 2014-04-22 at 14:14 +1000, David Gibson wrote:
>> >> I believe I've found a problem with netlink handling which can be
>> >> triggered on Cisco enic devices with a large number (30-40) of virtual
>> >> functions.  I believe this is the cause of a real customer problem
>> >> we've seen.
>> >> 
>> >>  * When requesting a list of interfaces with RTM_GETLINK, enic devices
>> >>    (and currently, _only_ enic devices) report IFLA_VF_PORTS
>> >>    information 
>> >> 
>> >>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual function
>> >> 
>> >>  * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
>> >>    regardless of the setting of the IFLA_EXT_MASK parameter
>> > [...]
>> > 
>> > So I think you should make reporting of IFLA_VF_PORTS dependent on the
>> > same flag as IFLA_VFINFO_LIST.
>> 
>> I think that's what we'll have to do.
> 
> Ok, makes logical sense.
> 
> But does anyone know what tools make use of the IFLA_VF_PORTS
> information?  Do they set the IFLA_EXT_MASK already?

iproute2 makes use of IFLA_VF_PORTS and sets IFLA_EXT_MASK
unconditionally in rtnl_wilddump_request().

libvirt also makes use of IFLA_VF_PORTS, and also unconditionally
sets IFLA_EXT_MASK in virNetDevLinkDump.

I only did a quick search and can't find any more users.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-23  0:04       ` Greg Rose
@ 2014-04-23  1:12         ` David Gibson
  2014-04-23  1:16           ` David Miller
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2014-04-23  1:12 UTC (permalink / raw)
  To: Greg Rose
  Cc: David Miller, ben, netdev, benve, ssujith, govindarajulu90,
	neepatel, nistrive

[-- Attachment #1: Type: text/plain, Size: 2164 bytes --]

On Tue, 22 Apr 2014 17:04:38 -0700
Greg Rose <gregory.v.rose@intel.com> wrote:

> On Wed, 23 Apr 2014 09:26:06 +1000
> David Gibson <dgibson@redhat.com> wrote:
> 
> > On Tue, 22 Apr 2014 14:12:00 -0400 (EDT)
> > David Miller <davem@davemloft.net> wrote:
> > 
> > > From: Ben Hutchings <ben@decadent.org.uk>
> > > Date: Tue, 22 Apr 2014 19:03:19 +0100
> > > 
> > > > On Tue, 2014-04-22 at 14:14 +1000, David Gibson wrote:
> > > >> I believe I've found a problem with netlink handling which can be
> > > >> triggered on Cisco enic devices with a large number (30-40) of
> > > >> virtual functions.  I believe this is the cause of a real
> > > >> customer problem we've seen.
> > > >> 
> > > >>  * When requesting a list of interfaces with RTM_GETLINK, enic
> > > >> devices (and currently, _only_ enic devices) report IFLA_VF_PORTS
> > > >>    information 
> > > >> 
> > > >>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual
> > > >> function
> > > >> 
> > > >>  * Unlike IFLA_VFINFO_LIST, the ports information is always
> > > >> reported, regardless of the setting of the IFLA_EXT_MASK
> > > >> parameter
> > > > [...]
> > > > 
> > > > So I think you should make reporting of IFLA_VF_PORTS dependent
> > > > on the same flag as IFLA_VFINFO_LIST.
> > > 
> > > I think that's what we'll have to do.
> > 
> > Ok, makes logical sense.
> > 
> > But does anyone know what tools make use of the IFLA_VF_PORTS
> > information?  Do they set the IFLA_EXT_MASK already?
> > 
> 
> So far as I know only the IP route tool 'ip link' sets that.  In fact,
> that's the reason I had to add it some number of years and months ago
> was because there were tools that didn't expect to get all the
> additional VF info and those tools were getting borked by all the
> additional goo sent up for VFs.
> 
> Beyond that who knows what anyone's been up to with other tools in
> other places?

And therein lies the problem.  I don't even know what the IFLA_VF_PORTS
info is for, but presumably something uses it.  If they stop receiving
it, they can be expected to break horribly.

-- 
David Gibson <dgibson@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-23  1:12         ` David Gibson
@ 2014-04-23  1:16           ` David Miller
  2014-04-23  2:33             ` Christian Benvenuti (benve)
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2014-04-23  1:16 UTC (permalink / raw)
  To: dgibson
  Cc: gregory.v.rose, ben, netdev, benve, ssujith, govindarajulu90,
	neepatel, nistrive

From: David Gibson <dgibson@redhat.com>
Date: Wed, 23 Apr 2014 11:12:03 +1000

> And therein lies the problem.  I don't even know what the IFLA_VF_PORTS
> info is for, but presumably something uses it.  If they stop receiving
> it, they can be expected to break horribly.

We did the same thing to VFINFO list.

All the users I could find already set the mask unconditionally
for all device dumps.

It's absolutely, positively, the only reasonable fix for this
problem.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-23  1:16           ` David Miller
@ 2014-04-23  2:33             ` Christian Benvenuti (benve)
  2014-04-23  4:15               ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Benvenuti (benve) @ 2014-04-23  2:33 UTC (permalink / raw)
  To: David Miller, dgibson
  Cc: gregory.v.rose, ben, netdev, Sujith Sankar (ssujith),
	govindarajulu90, Neel Patel (neepatel),
	nistrive

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of David Miller
> Sent: Tuesday, April 22, 2014 6:17 PM
> To: dgibson@redhat.com
> Cc: gregory.v.rose@intel.com; ben@decadent.org.uk;
> netdev@vger.kernel.org; Christian Benvenuti (benve); Sujith Sankar
> (ssujith); govindarajulu90@gmail.com; Neel Patel (neepatel);
> nistrive@cisco.com
> Subject: Re: RFC: rtnetlink problems with Cisco enic and VFs
> 
> From: David Gibson <dgibson@redhat.com>
> Date: Wed, 23 Apr 2014 11:12:03 +1000
> 
> > And therein lies the problem.  I don't even know what the
> > IFLA_VF_PORTS info is for, but presumably something uses it.  If they
> > stop receiving it, they can be expected to break horribly.

In the case of Enic, libvirt uses IFLA_VF_PORTS in the context of the port profile
(see virtualport section and 802.1Qbh in the libvirt network xml documentation).

As Miller said, libvirt and iproute2 are the two known users (with libvirt being the main one)
but you never know what else may be using it.

> We did the same thing to VFINFO list.

I guess you refer to Bugzilla 889319.

> All the users I could find already set the mask unconditionally for all device
> dumps.
> 
> It's absolutely, positively, the only reasonable fix for this problem.

The fix based on IFLA_EXT_MASK seems reasonable to me
(IFLA_EXT_MASK is in use in libvirt >= 1.0.3 and iproute2 >=3.4.0 based on a quick check).

/Chris

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: RFC: rtnetlink problems with Cisco enic and VFs
  2014-04-23  2:33             ` Christian Benvenuti (benve)
@ 2014-04-23  4:15               ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2014-04-23  4:15 UTC (permalink / raw)
  To: Christian Benvenuti (benve)
  Cc: David Miller, gregory.v.rose, ben, netdev,
	Sujith Sankar (ssujith), govindarajulu90, Neel Patel (neepatel),
	nistrive

[-- Attachment #1: Type: text/plain, Size: 2837 bytes --]

On Wed, 23 Apr 2014 02:33:06 +0000
"Christian Benvenuti (benve)" <benve@cisco.com> wrote:

> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-
> > owner@vger.kernel.org] On Behalf Of David Miller
> > Sent: Tuesday, April 22, 2014 6:17 PM
> > To: dgibson@redhat.com
> > Cc: gregory.v.rose@intel.com; ben@decadent.org.uk;
> > netdev@vger.kernel.org; Christian Benvenuti (benve); Sujith Sankar
> > (ssujith); govindarajulu90@gmail.com; Neel Patel (neepatel);
> > nistrive@cisco.com
> > Subject: Re: RFC: rtnetlink problems with Cisco enic and VFs
> > 
> > From: David Gibson <dgibson@redhat.com>
> > Date: Wed, 23 Apr 2014 11:12:03 +1000
> > 
> > > And therein lies the problem.  I don't even know what the
> > > IFLA_VF_PORTS info is for, but presumably something uses it.  If they
> > > stop receiving it, they can be expected to break horribly.
> 
> In the case of Enic, libvirt uses IFLA_VF_PORTS in the context of the port profile
> (see virtualport section and 802.1Qbh in the libvirt network xml documentation).
> 
> As Miller said, libvirt and iproute2 are the two known users (with libvirt being the main one)
> but you never know what else may be using it.

Ah, yes, I see it in libvirt.

I don't think it's used in iproute2 though, at least not in the master
branch.

$ git remote show origin
* remote origin
  Fetch URL:
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git
Push  URL:
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git
HEAD branch: master Remote branches: iproute-3.5.1     tracked
    master            tracked
    net-next          tracked
    net-next-3.11     tracked
    net-next-for-3.13 tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)
$ git rev-parse HEAD
ce3436ca05ee2a9f4bf4c5b10eb25638865772cb
$ git grep IFLA_VF_PORTS
include/linux/if_link.h:	IFLA_VF_PORTS,
include/linux/if_link.h: *		[IFLA_VF_PORTS]

Those are declaration and comment only, no actual uses.

> > We did the same thing to VFINFO list.
> 
> I guess you refer to Bugzilla 889319.
> 
> > All the users I could find already set the mask unconditionally for all device
> > dumps.
> > 
> > It's absolutely, positively, the only reasonable fix for this problem.
> 
> The fix based on IFLA_EXT_MASK seems reasonable to me
> (IFLA_EXT_MASK is in use in libvirt >= 1.0.3 and iproute2 >=3.4.0 based on a quick check).

Ok.  And conveniently for me it looks like the EXT_MASK fix is also
backported into the RHEL libvirt.

Alright, I'll whip up a patch series that makes the IFLA_VF_PORTS
information conditional on the RTEXT_FILTER_VF flag.

-- 
David Gibson <dgibson@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-04-23  4:16 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-22  4:14 RFC: rtnetlink problems with Cisco enic and VFs David Gibson
2014-04-22  4:17 ` David Gibson
2014-04-22 18:03 ` Ben Hutchings
2014-04-22 18:12   ` David Miller
2014-04-22 23:26     ` David Gibson
2014-04-23  0:04       ` Greg Rose
2014-04-23  1:12         ` David Gibson
2014-04-23  1:16           ` David Miller
2014-04-23  2:33             ` Christian Benvenuti (benve)
2014-04-23  4:15               ` David Gibson
2014-04-23  0:59       ` David Miller
2014-04-22 19:14 ` Christian Benvenuti (benve)
2014-04-22 23:24   ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.