From: Sabrina Dubroca <sd@queasysnail.net>
To: Taehee Yoo <ap420073@gmail.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com,
	johannes@sipsolutions.net, j.vosburgh@gmail.com,
	vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us,
	roopa@cumulusnetworks.com, saeedm@mellanox.com,
	manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com,
	haiyangz@microsoft.com, stephen@networkplumber.org,
	sashal@kernel.org, hare@suse.de, varun@chelsio.com,
	ubraun@linux.ibm.com, kgraul@linux.ibm.com,
	jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Subject: Re: [PATCH net v4 01/12] net: core: limit nested device depth
Date: Thu, 10 Oct 2019 12:19:25 +0200
Message-ID: <20191010101925.GA93190@bistromath.localdomain>
In-Reply-To: <20190928164843.31800-2-ap420073@gmail.com>

2019-09-28, 16:48:32 +0000, Taehee Yoo wrote:
> @@ -6790,23 +6878,45 @@ int netdev_walk_all_lower_dev(struct net_device *dev,
>  					void *data),
>  			      void *data)
>  {
> -	struct net_device *ldev;
> -	struct list_head *iter;
> -	int ret;
> +	struct net_device *ldev, *next, *now, *dev_stack[MAX_NEST_DEV + 1];
> +	struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1];
> +	int ret, cur = 0;
>  
> -	for (iter = &dev->adj_list.lower,
> -	     ldev = netdev_next_lower_dev(dev, &iter);
> -	     ldev;
> -	     ldev = netdev_next_lower_dev(dev, &iter)) {
> -		/* first is the lower device itself */
> -		ret = fn(ldev, data);
> -		if (ret)
> -			return ret;
> +	now = dev;
> +	iter = &dev->adj_list.lower;
>  
> -		/* then look at all of its lower devices */
> -		ret = netdev_walk_all_lower_dev(ldev, fn, data);
> -		if (ret)
> -			return ret;
> +	while (1) {
> +		if (now != dev) {
> +			ret = fn(now, data);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		next = NULL;
> +		while (1) {
> +			ldev = netdev_next_lower_dev(now, &iter);
> +			if (!ldev)
> +				break;
> +
> +			if (!next) {
> +				next = ldev;
> +				niter = &ldev->adj_list.lower;
> +			} else {
> +				dev_stack[cur] = ldev;
> +				iter_stack[cur++] = &ldev->adj_list.lower;
> +				break;
> +			}
> +		}
> +
> +		if (!next) {
> +			if (!cur)
> +				return 0;

Hmm, I don't think this condition is correct.

If we have this topology:


                bridge0
                /  |  \
               /   |   \
              /    |    \
        dummy0   vlan1   vlan2
                   |       \
                 dummy1    dummy2

We end up with the expected lower/upper levels for all devices:

    | device  | upper | lower |
    |---------+-------+-------|
    | dummy0  |     2 |     1 |
    | dummy1  |     3 |     1 |
    | dummy2  |     3 |     1 |
    | vlan1   |     2 |     2 |
    | vlan2   |     2 |     2 |
    | bridge0 |     1 |     3 |


If we then add macvlan0 on top of bridge0:


                macvlan0
                   |
                   |
                bridge0
                /  |  \
               /   |   \
              /    |    \
        dummy0   vlan1   vlan2
                   |       \
                 dummy1    dummy2


we can observe that __netdev_update_upper_level is only called for
some of the devices under bridge0. I added a perf probe:

 # perf probe -a '__netdev_update_upper_level dev->name:string'

which gets hit for bridge0 (called directly by
__netdev_upper_dev_link) and then dummy0, vlan1, dummy1. It is never
called for vlan2 and dummy2.

After this, we have the following levels (*):

    | device   | upper | lower |
    |----------+-------+-------|
    | dummy0   |     3 |     1 |
    | dummy1   |     4 |     1 |
    | dummy2   |     3 |     1 |
    | vlan1    |     3 |     2 |
    | vlan2    |     2 |     2 |
    | bridge0  |     2 |     3 |
    | macvlan0 |     1 |     4 |

For dummy0, dummy1, vlan1, the upper level has increased by 1, as
expected. For vlan2 and dummy2, it's unchanged (still 2 and 3), which
is wrong: they should have gone up by 1 as well, to 3 and 4.


(*) observed easily by adding another probe:

 # perf probe -a 'dev_get_stats dev->name:string dev->upper_level dev->lower_level'

and running "ip link"

Or you can just add prints and recompile, of course :)

> +			next = dev_stack[--cur];
> +			niter = iter_stack[cur];
> +		}
> +
> +		now = next;
> +		iter = niter;
>  	}
>  
>  	return 0;

-- 
Sabrina

