From: Nikolay Aleksandrov <razor@blackwall.org>
To: Petr Machata <petrm@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Roopa Prabhu <roopa@nvidia.com>,
	netdev@vger.kernel.org
Cc: bridge@lists.linux-foundation.org, Ido Schimmel <idosch@nvidia.com>
Subject: Re: [PATCH net-next mlxsw v2 07/16] net: bridge: Maintain number of MDB entries in net_bridge_mcast_port
Date: Thu, 2 Feb 2023 10:57:07 +0200
Message-ID: <9ed8f5c0-ef73-3e12-a822-b0153f5237bb@blackwall.org>
In-Reply-To: <18e82e5a-1ee9-94ee-78a7-15bc08b62978@blackwall.org>

On 02/02/2023 10:56, Nikolay Aleksandrov wrote:
> On 01/02/2023 19:28, Petr Machata wrote:
>> The MDB maintained by the bridge is limited. When the bridge is configured
>> for IGMP / MLD snooping, a buggy or malicious client can easily exhaust its
>> capacity. In SW datapath, the capacity is configurable through the
>> IFLA_BR_MCAST_HASH_MAX parameter, but ultimately is finite. Obviously a
>> similar limit exists in the HW datapath for purposes of offloading.
>>
>> In order to prevent the issue of unilateral exhaustion of MDB resources,
>> introduce two parameters in each of two contexts:
>>
>> - Per-port and per-port-VLAN number of MDB entries that the port
>>   is a member of.
>>
>> - Per-port and (when BROPT_MCAST_VLAN_SNOOPING_ENABLED is enabled)
>>   per-port-VLAN maximum permitted number of MDB entries, or 0 for
>>   no limit.
>>
>> The per-port multicast context is used for tracking of MDB entries for the
>> port as a whole. This is available for all bridges.
>>
>> The per-port-VLAN multicast context is then only available on
>> VLAN-filtering bridges on VLANs that have multicast snooping on.
>>
>> With these changes in place, it will be possible to configure an MDB limit
>> for the bridge as a whole, for any one port as a whole, or for any single
>> port-VLAN.
>>
>> Note that unlike the global limit, exhaustion of the per-port and
>> per-port-VLAN maximums does not cause disablement of multicast snooping.
>> It is also permitted to configure the local limit larger than hash_max,
>> even though that is not useful.
>>
>> In this patch, introduce only the accounting for number of entries, and the
>> max field itself, but not the means to toggle the max. The next patch
>> introduces the netlink APIs to toggle and read the values.
>>
>> Signed-off-by: Petr Machata <petrm@nvidia.com>
>> ---
>>
>> Notes:
>>     v2:
>>     - In br_multicast_port_ngroups_inc_one(), bounce
>>       if n>=max, not if n==max
>>     - Adjust extack messages to mention ngroups, now that
>>       the bounces appear when n>=max, not n==max
>>     - In __br_multicast_enable_port_ctx(), do not reset
>>       max to 0. Also do not count number of entries by
>>       going through _inc, as that would end up incorrectly
>>       bouncing the entries.
>>
>>  net/bridge/br_multicast.c | 132 +++++++++++++++++++++++++++++++++++++-
>>  net/bridge/br_private.h   |   2 +
>>  2 files changed, 133 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
>> index 51b622afdb67..e7ae339a8757 100644
>> --- a/net/bridge/br_multicast.c
>> +++ b/net/bridge/br_multicast.c
>> @@ -31,6 +31,7 @@
>>  #include <net/ip6_checksum.h>
>>  #include <net/addrconf.h>
>>  #endif
>> +#include <trace/events/bridge.h>
>>  
>>  #include "br_private.h"
>>  #include "br_private_mcast_eht.h"
>> @@ -234,6 +235,29 @@ br_multicast_pg_to_port_ctx(const struct net_bridge_port_group *pg)
>>  	return pmctx;
>>  }
>>  
>> +static struct net_bridge_mcast_port *
>> +br_multicast_port_vid_to_port_ctx(struct net_bridge_port *port, u16 vid)
>> +{
>> +	struct net_bridge_mcast_port *pmctx = NULL;
>> +	struct net_bridge_vlan *vlan;
>> +
>> +	lockdep_assert_held_once(&port->br->multicast_lock);
>> +
>> +	if (!br_opt_get(port->br, BROPT_MCAST_VLAN_SNOOPING_ENABLED))
>> +		return NULL;
>> +
>> +	/* Take RCU to access the vlan. */
>> +	rcu_read_lock();
>> +
>> +	vlan = br_vlan_find(nbp_vlan_group_rcu(port), vid);
>> +	if (vlan && !br_multicast_port_ctx_vlan_disabled(&vlan->port_mcast_ctx))
>> +		pmctx = &vlan->port_mcast_ctx;
>> +
>> +	rcu_read_unlock();
>> +
>> +	return pmctx;
>> +}
>> +
>>  /* when snooping we need to check if the contexts should be used
>>   * in the following order:
>>   * - if pmctx is non-NULL (port), check if it should be used
>> @@ -668,6 +692,82 @@ void br_multicast_del_group_src(struct net_bridge_group_src *src,
>>  	__br_multicast_del_group_src(src);
>>  }
>>  
>> +static int
>> +br_multicast_port_ngroups_inc_one(struct net_bridge_mcast_port *pmctx,
>> +				  struct netlink_ext_ack *extack)
>> +{
>> +	if (pmctx->mdb_max_entries &&
>> +	    pmctx->mdb_n_entries >= pmctx->mdb_max_entries)
> 
> These should be using *_ONCE() because of the next patch.
> KCSAN might be sad otherwise. :)
> 
>> +		return -E2BIG;
>> +
>> +	pmctx->mdb_n_entries++;
> 
> WRITE_ONCE()
> 
>> +	return 0;
>> +}
>> +
>> +static void br_multicast_port_ngroups_dec_one(struct net_bridge_mcast_port *pmctx)
>> +{
>> +	WARN_ON_ONCE(pmctx->mdb_n_entries-- == 0);
> 
> READ_ONCE()

err, I meant WRITE_ONCE() of course. :)
Need to get coffee.

> 
>> +}
>> +
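For illustration, here is a minimal userspace sketch of how the two helpers
might look with the suggested annotations. It is not the code that was
eventually merged. The READ_ONCE()/WRITE_ONCE() macros below are simplified
stand-ins for the kernel ones, struct mcast_port_ctx is a hypothetical
reduction of net_bridge_mcast_port to the two counters (their 32-bit type is
an assumption), and the function names are shortened. The sketch assumes, as
in the patch, that all writers of mdb_n_entries are serialized by
multicast_lock, so the annotations only matter for lockless readers such as
those the next patch is said to introduce.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical reduction of struct net_bridge_mcast_port (types assumed). */
struct mcast_port_ctx {
	uint32_t mdb_n_entries;   /* current number of MDB entries */
	uint32_t mdb_max_entries; /* 0 means no limit */
};

/* Mirrors br_multicast_port_ngroups_inc_one(): bounce if n >= max. */
static int ngroups_inc_one(struct mcast_port_ctx *pmctx)
{
	uint32_t max = READ_ONCE(pmctx->mdb_max_entries);
	uint32_t n = READ_ONCE(pmctx->mdb_n_entries);

	if (max && n >= max)
		return -E2BIG;

	WRITE_ONCE(pmctx->mdb_n_entries, n + 1);
	return 0;
}

/* Mirrors br_multicast_port_ngroups_dec_one(): like the quoted patch, the
 * counter is decremented unconditionally. Returns 1 where the patch would
 * trigger WARN_ON_ONCE(), that is, when the counter was already 0.
 */
static int ngroups_dec_one(struct mcast_port_ctx *pmctx)
{
	uint32_t n = READ_ONCE(pmctx->mdb_n_entries);

	WRITE_ONCE(pmctx->mdb_n_entries, n - 1);
	return n == 0;
}

/* Small walk-through: with a limit of 2, the third increment bounces. */
int ngroups_demo(void)
{
	struct mcast_port_ctx ctx = { .mdb_n_entries = 0, .mdb_max_entries = 2 };

	assert(ngroups_inc_one(&ctx) == 0);
	assert(ngroups_inc_one(&ctx) == 0);
	assert(ngroups_inc_one(&ctx) == -E2BIG);
	assert(ngroups_dec_one(&ctx) == 0);
	return 0;
}
```

Reading each field once into a local also means the limit check and the
increment work from the same snapshot, even if the max is changed concurrently
from another context.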

