Linux-WPAN Archive on lore.kernel.org
 help / color / Atom feed
From: Vasily Averin <vvs@virtuozzo.com>
To: dsahern@kernel.org, netdev@vger.kernel.org
Cc: nikita.leshchenko@oracle.com, roopa@cumulusnetworks.com,
	stephen@networkplumber.org, idosch@mellanox.com,
	jiri@mellanox.com, saeedm@mellanox.com, alex.aring@gmail.com,
	linux-wpan@vger.kernel.org, netfilter-devel@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Ahern <dsahern@gmail.com>
Subject: Re: [RFC/RFT, net-next, 00/17] net: Convert neighbor tables to per-namespace
Date: Sun, 12 Aug 2018 09:46:58 +0300
Message-ID: <dc9cd93c-61d4-dc8c-9ba1-cd8584908ba2@virtuozzo.com> (raw)
In-Reply-To: <20180717120651.15748-1-dsahern@kernel.org>

On 07/17/2018 03:06 PM, dsahern@kernel.org wrote:
> From: David Ahern <dsahern@gmail.com>
> 
> Nikita Leshenko reported that neighbor entries in one namespace can
> evict neighbor entries in another. The problem is that the neighbor
> tables have entries across all namespaces without separate accounting
> and with global limits on when to scan for entries to evict.
> 
> Resolve by making the neighbor tables for ipv4, ipv6 and decnet per
> namespace and making the accounting and threshold limits per namespace.

Dear David,
I prepared own patch set to fix this problem and found your one.
It looks perfect for me, and I hope David Miller will merge it soon,
however I have found a few drawbacks:

1) I know that if net_device exist it always have correct net reference,
so dev_net(dev) will be always correct.
However I afraid that device reference itself is correct in some places.
For example, 
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -376,9 +377,10 @@ mlxsw_sp_span_entry_gretap4_parms(const struct net_device *to_dev,
 		return mlxsw_sp_span_entry_unoffloadable(sparmsp);
 
 	l3edev = mlxsw_sp_span_gretap4_route(to_dev, &saddr.addr4, &gw.addr4);
+	tbl = ipv4_neigh_table(dev_net(l3edev));
 	return mlxsw_sp_span_entry_tunnel_parms_common(l3edev, saddr, daddr, gw,
 						       tparm.iph.ttl,
-						       &arp_tbl, sparmsp);
+						       tbl, sparmsp);
 }
 
mlxsw_sp_span_entry_tunnel_parms_common() have "if (!edev)" check inside,
so it seems l3edev can be set to NULL here and lead to crash inside dev_net(l3edev).
There are few other suspicious places and I think they should be carefully re-checked.

2) modified arp_net_init() does not check return value neigh_sysctl_register() and lacks correct rollback.
It was acceptable in arp_init, because it was called only once on boot, but now it will be called 
for each new net namespace, it can have real chances to fail lead to memory crash/memory corruption.

3) modified neigh_table_init() is called many times per netns but it can panic in case failed memory allocation.
I think it should be reworked to return errors in such cases, its callers should check it and add correct rollbacks.

4) currently neigh_table_clear() always return 0, I think it makes sense to change it to return void.

Thank you,
	Vasily Averin 

  parent reply index

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-17 12:06 [PATCH RFC/RFT net-next " dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 01/17] net/ipv4: rename ipv4_neigh_lookup to ipv4_dst_neigh_lookup dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 02/17] net/neigh: export neigh_find_table dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 03/17] net/ipv4: wrappers for arp table references dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 04/17] net/ipv4: Remove open coded use of arp table dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 05/17] net/ipv6: wrappers for neighbor table references dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 06/17] net/ipv6: Remove open coded use of neighbor table dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 07/17] drivers/net: remove open coding of neighbor tables dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 08/17] net: Remove nd_tbl from ipv6 stub dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 09/17] net: Remove arp_tbl and nd_tbl from headers dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 10/17] net: Add key_len to neighbor constructor dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 11/17] net: Change neigh_table_init and neigh_table_clear signature dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 12/17] net/neigh: Change neigh_xmit to take an address family dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 13/17] net/neighbor: Convert internal functions away from neigh_tables dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 14/17] net/ipv4: Convert arp table to per namespace dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 15/17] net/ipv6: Convert neighbor table to per-namespace dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 16/17] net/decnet: Move " dsahern
2018-07-17 12:06 ` [PATCH RFC/RFT net-next 17/17] net/neighbor: Remove neigh_tables and NEIGH enum dsahern
2018-07-17 17:40 ` [PATCH RFC/RFT net-next 00/17] net: Convert neighbor tables to per-namespace Cong Wang
2018-07-17 17:43   ` David Ahern
2018-07-17 17:53     ` Cong Wang
2018-07-17 19:02       ` David Ahern
2018-07-17 20:37         ` Cong Wang
2018-07-18  3:59         ` David Miller
2018-07-19 16:16           ` David Ahern
2018-07-19 17:12             ` Cong Wang
2018-07-24 15:14               ` David Ahern
2018-07-24 17:14                 ` David Miller
2018-07-25 18:23                   ` David Ahern
2018-07-24 22:09                 ` Cong Wang
2018-07-25 12:33                   ` Eric W. Biederman
2018-07-25 14:06                     ` David Ahern
2018-07-25 17:38                       ` Eric W. Biederman
2018-07-25 18:13                         ` David Ahern
2018-07-25 19:17                           ` Eric W. Biederman
2018-08-13 21:48                             ` David Ahern
2018-08-15  4:36                               ` Eric W. Biederman
2018-07-26 11:12                         ` David Laight
2018-07-27 16:27                           ` Eric W. Biederman
2018-07-19  0:54 ` Michael Richardson
2018-07-19 15:49   ` David Ahern
2018-08-12  6:46 ` Vasily Averin [this message]
2018-08-12 17:37   ` [RFC/RFT, net-next, " David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc9cd93c-61d4-dc8c-9ba1-cd8584908ba2@virtuozzo.com \
    --to=vvs@virtuozzo.com \
    --cc=alex.aring@gmail.com \
    --cc=dsahern@gmail.com \
    --cc=dsahern@kernel.org \
    --cc=idosch@mellanox.com \
    --cc=jiri@mellanox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-wpan@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=nikita.leshchenko@oracle.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=saeedm@mellanox.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-WPAN Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-wpan/0 linux-wpan/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-wpan linux-wpan/ https://lore.kernel.org/linux-wpan \
		linux-wpan@vger.kernel.org
	public-inbox-index linux-wpan

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-wpan


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git