netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Possible to use both dev_mc_sync and __dev_mc_sync?
@ 2022-03-21 16:32 Vladimir Oltean
  2022-03-21 18:37 ` Alexander Duyck
  0 siblings, 1 reply; 6+ messages in thread
From: Vladimir Oltean @ 2022-03-21 16:32 UTC (permalink / raw)
  To: Alexander Duyck, Jakub Kicinski, Jiri Pirko, Florian Fainelli; +Cc: netdev

Hello,

Despite the similar names, the 2 functions above serve quite different
purposes, and as it happens, DSA needs to use both of them, each for its
own purpose.

static void dsa_slave_set_rx_mode(struct net_device *dev)
{
	struct net_device *master = dsa_slave_to_master(dev);
	struct dsa_port *dp = dsa_slave_to_port(dev);
	struct dsa_switch *ds = dp->ds;

	dev_mc_sync(master, dev); // DSA is a stacked device
	dev_uc_sync(master, dev);
	if (dsa_switch_supports_mc_filtering(ds))
		__dev_mc_sync(dev, dsa_slave_sync_mc, dsa_slave_unsync_mc); // DSA is also a hardware device
	if (dsa_switch_supports_uc_filtering(ds))
		__dev_uc_sync(dev, dsa_slave_sync_uc, dsa_slave_unsync_uc);
}

What I'm noticing is that some addresses, for example 33:33:00:00:00:01
(added by addrconf.c as in6addr_linklocal_allnodes) are synced to the
master via dev_mc_sync(), but not to hardware by __dev_mc_sync().

Superficially, this is because dev_mc_sync() -> __hw_addr_sync_one()
will increase ha->sync_cnt to a non-zero value. Then, when __dev_mc_sync()
-> __hw_addr_sync_dev() checks ha->sync_cnt, it sees that it has been
"already synced" (not really), so it doesn't call the "sync" method
(dsa_slave_sync_mc) for this ha.

However I don't understand the deep reasons and I am confused by the
members of struct netdev_hw_addr (synced vs sync_cnt vs refcount).
I can't tell if this was supposed to work, given that "sync address to
another device" is conceptually a different kind of sync than "sync
address to hardware", so I'm a bit surprised that they share the same
variables.

Any ideas?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Possible to use both dev_mc_sync and __dev_mc_sync?
  2022-03-21 16:32 Possible to use both dev_mc_sync and __dev_mc_sync? Vladimir Oltean
@ 2022-03-21 18:37 ` Alexander Duyck
  2022-03-21 18:42   ` Vladimir Oltean
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2022-03-21 18:37 UTC (permalink / raw)
  To: Vladimir Oltean, Jakub Kicinski, Jiri Pirko, Florian Fainelli; +Cc: netdev

> -----Original Message-----
> From: Vladimir Oltean <olteanv@gmail.com>
> Sent: Monday, March 21, 2022 9:32 AM
> To: Alexander Duyck <alexanderduyck@fb.com>; Jakub Kicinski
> <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian Fainelli
> <f.fainelli@gmail.com>
> Cc: netdev@vger.kernel.org
> Subject: Possible to use both dev_mc_sync and __dev_mc_sync?
> 
> Hello,
> 
> Despite the similar names, the 2 functions above serve quite different
> purposes, and as it happens, DSA needs to use both of them, each for its
> own purpose.
> 
> static void dsa_slave_set_rx_mode(struct net_device *dev) {
> 	struct net_device *master = dsa_slave_to_master(dev);
> 	struct dsa_port *dp = dsa_slave_to_port(dev);
> 	struct dsa_switch *ds = dp->ds;
> 
> 	dev_mc_sync(master, dev); // DSA is a stacked device
> 	dev_uc_sync(master, dev);
> 	if (dsa_switch_supports_mc_filtering(ds))
> 		__dev_mc_sync(dev, dsa_slave_sync_mc,
> dsa_slave_unsync_mc); // DSA is also a hardware device
> 	if (dsa_switch_supports_uc_filtering(ds))
> 		__dev_uc_sync(dev, dsa_slave_sync_uc,
> dsa_slave_unsync_uc); }
> 
> What I'm noticing is that some addresses, for example 33:33:00:00:00:01
> (added by addrconf.c as in6addr_linklocal_allnodes) are synced to the master
> via dev_mc_sync(), but not to hardware by __dev_mc_sync().
> 
> Superficially, this is because dev_mc_sync() -> __hw_addr_sync_one() will
> increase ha->sync_cnt to a non-zero value. Then, when __dev_mc_sync()
> -> __hw_addr_sync_dev() checks ha->sync_cnt, it sees that it has been
> "already synced" (not really), so it doesn't call the "sync" method
> (dsa_slave_sync_mc) for this ha.
> 
> However I don't understand the deep reasons and I am confused by the
> members of struct netdev_hw_addr (synced vs sync_cnt vs refcount).
> I can't tell if this was supposed to work, given that "sync address to another
> device" is conceptually a different kind of sync than "sync address to
> hardware", so I'm a bit surprised that they share the same variables.
> 
> Any ideas?

I hadn't intended it to work this way. The expectation was that __dev_mc_sync would be used by hardware devices whereas dev_mc_sync was used by stacked devices such as vlan or macvlan.

Probably the easiest way to address it is to split things up so that you are using __dev_mc_sync if the switch supports mc filtering and have your dsa_slave_sync/unsync_mc call also push it down to the lower device, and then call dev_mc_sync after that so that if it hasn't already been pushed to the lower device it gets pushed. The assumption is that the lower device and the hardware would be synced in the same way. If we can't go that route we may have to look at implementing a different setup in terms of the reference counting such as what is done in __hw_addr_sync_multiple.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible to use both dev_mc_sync and __dev_mc_sync?
  2022-03-21 18:37 ` Alexander Duyck
@ 2022-03-21 18:42   ` Vladimir Oltean
  2022-03-21 18:45     ` Vladimir Oltean
  0 siblings, 1 reply; 6+ messages in thread
From: Vladimir Oltean @ 2022-03-21 18:42 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jakub Kicinski, Jiri Pirko, Florian Fainelli, netdev

On Mon, Mar 21, 2022 at 06:37:05PM +0000, Alexander Duyck wrote:
> > -----Original Message-----
> > From: Vladimir Oltean <olteanv@gmail.com>
> > Sent: Monday, March 21, 2022 9:32 AM
> > To: Alexander Duyck <alexanderduyck@fb.com>; Jakub Kicinski
> > <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian Fainelli
> > <f.fainelli@gmail.com>
> > Cc: netdev@vger.kernel.org
> > Subject: Possible to use both dev_mc_sync and __dev_mc_sync?
> I hadn't intended it to work this way. The expectation was that
> __dev_mc_sync would be used by hardware devices whereas dev_mc_sync
> was used by stacked devices such as vlan or macvlan.

Understood, thanks for confirming.

> Probably the easiest way to address it is to split things up so that
> you are using __dev_mc_sync if the switch supports mc filtering and
> have your dsa_slave_sync/unsync_mc call also push it down to the lower
> device, and then call dev_mc_sync after that so that if it hasn't
> already been pushed to the lower device it gets pushed.

Yes, I have a patch with that change, just wanted to make sure I'm not
missing something. It's less efficient because now we need to check
whether dsa_switch_supports_uc_filtering() for each address, whereas
before we checked only once, before calling __dev_uc_add(). Oh well.

> The assumption is that the lower device and the hardware would be
> synced in the same way. If we can't go that route we may have to look
> at implementing a different setup in terms of the reference counting
> such as what is done in __hw_addr_sync_multiple.

So as mentioned, I haven't really understood the internals of the
reference/sync counting schemes being used here. But why are there
different implementations for dev_mc_sync() and dev_mc_sync_multiple()?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible to use both dev_mc_sync and __dev_mc_sync?
  2022-03-21 18:42   ` Vladimir Oltean
@ 2022-03-21 18:45     ` Vladimir Oltean
  2022-03-21 19:13       ` Alexander Duyck
  0 siblings, 1 reply; 6+ messages in thread
From: Vladimir Oltean @ 2022-03-21 18:45 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jakub Kicinski, Jiri Pirko, Florian Fainelli, netdev

On Mon, Mar 21, 2022 at 08:42:59PM +0200, Vladimir Oltean wrote:
> On Mon, Mar 21, 2022 at 06:37:05PM +0000, Alexander Duyck wrote:
> > > -----Original Message-----
> > > From: Vladimir Oltean <olteanv@gmail.com>
> > > Sent: Monday, March 21, 2022 9:32 AM
> > > To: Alexander Duyck <alexanderduyck@fb.com>; Jakub Kicinski
> > > <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian Fainelli
> > > <f.fainelli@gmail.com>
> > > Cc: netdev@vger.kernel.org
> > > Subject: Possible to use both dev_mc_sync and __dev_mc_sync?
> > I hadn't intended it to work this way. The expectation was that
> > __dev_mc_sync would be used by hardware devices whereas dev_mc_sync
> > was used by stacked devices such as vlan or macvlan.
> 
> Understood, thanks for confirming.
> 
> > Probably the easiest way to address it is to split things up so that
> > you are using __dev_mc_sync if the switch supports mc filtering and
> > have your dsa_slave_sync/unsync_mc call also push it down to the lower
> > device, and then call dev_mc_sync after that so that if it hasn't
> > already been pushed to the lower device it gets pushed.
> 
> Yes, I have a patch with that change, just wanted to make sure I'm not
> missing something. It's less efficient because now we need to check
> whether dsa_switch_supports_uc_filtering() for each address, whereas
> before we checked only once, before calling __dev_uc_add(). Oh well.
> 
> > The assumption is that the lower device and the hardware would be
> > synced in the same way. If we can't go that route we may have to look
> > at implementing a different setup in terms of the reference counting
> > such as what is done in __hw_addr_sync_multiple.
> 
> So as mentioned, I haven't really understood the internals of the
> reference/sync counting schemes being used here. But why are there
> different implementations for dev_mc_sync() and dev_mc_sync_multiple()?

And on the same not of me not quite understanding what goes on, I wonder
why some multicast addresses get passed both to the lower dev and to
dsa_slave_sync_mc (which is why I didn't notice the problem in the first
place), while others don't.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Possible to use both dev_mc_sync and __dev_mc_sync?
  2022-03-21 18:45     ` Vladimir Oltean
@ 2022-03-21 19:13       ` Alexander Duyck
  2022-03-21 20:02         ` Jay Vosburgh
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2022-03-21 19:13 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Jakub Kicinski, Jiri Pirko, Florian Fainelli, netdev

> -----Original Message-----
> From: Vladimir Oltean <olteanv@gmail.com>
> Sent: Monday, March 21, 2022 11:45 AM
> To: Alexander Duyck <alexanderduyck@fb.com>
> Cc: Jakub Kicinski <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian
> Fainelli <f.fainelli@gmail.com>; netdev@vger.kernel.org
> Subject: Re: Possible to use both dev_mc_sync and __dev_mc_sync?
> 
> On Mon, Mar 21, 2022 at 08:42:59PM +0200, Vladimir Oltean wrote:
> > On Mon, Mar 21, 2022 at 06:37:05PM +0000, Alexander Duyck wrote:
> > > > -----Original Message-----
> > > > From: Vladimir Oltean <olteanv@gmail.com>
> > > > Sent: Monday, March 21, 2022 9:32 AM
> > > > To: Alexander Duyck <alexanderduyck@fb.com>; Jakub Kicinski
> > > > <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian Fainelli
> > > > <f.fainelli@gmail.com>
> > > > Cc: netdev@vger.kernel.org
> > > > Subject: Possible to use both dev_mc_sync and __dev_mc_sync?
> > > I hadn't intended it to work this way. The expectation was that
> > > __dev_mc_sync would be used by hardware devices whereas
> dev_mc_sync
> > > was used by stacked devices such as vlan or macvlan.
> >
> > Understood, thanks for confirming.
> >
> > > Probably the easiest way to address it is to split things up so that
> > > you are using __dev_mc_sync if the switch supports mc filtering and
> > > have your dsa_slave_sync/unsync_mc call also push it down to the
> > > lower device, and then call dev_mc_sync after that so that if it
> > > hasn't already been pushed to the lower device it gets pushed.
> >
> > Yes, I have a patch with that change, just wanted to make sure I'm not
> > missing something. It's less efficient because now we need to check
> > whether dsa_switch_supports_uc_filtering() for each address, whereas
> > before we checked only once, before calling __dev_uc_add(). Oh well.
> >
> > > The assumption is that the lower device and the hardware would be
> > > synced in the same way. If we can't go that route we may have to
> > > look at implementing a different setup in terms of the reference
> > > counting such as what is done in __hw_addr_sync_multiple.
> >
> > So as mentioned, I haven't really understood the internals of the
> > reference/sync counting schemes being used here. But why are there
> > different implementations for dev_mc_sync() and
> dev_mc_sync_multiple()?
> 
> And on the same not of me not quite understanding what goes on, I wonder
> why some multicast addresses get passed both to the lower dev and to
> dsa_slave_sync_mc (which is why I didn't notice the problem in the first
> place), while others don't.

It all depends on the complexity of the setup. The standard __hw_addr_sync basically assumes you are operating in one of two states.
Sync: sync_cnt == 0, refcount == 1 -> sync_cnt = 1, refcount++
Unsync: sync_cnt == 1, refcount == 1 -> sync_cnt = 0, entry deleted

I myself am not all that familiar with the multiple approach either, however it seems to operate on the idea that the reference count should always be greater than the sync count. So the device will hold one reference and it will sync the address as long as it doesn't already exist in the lower devices address table based on the rules in __hw_addr_add_ex.

Also this might explain why some were synching while others weren't. It is possible that the lower dev already had the address present and as such was rejected for not being an exclusive address for this device.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possible to use both dev_mc_sync and __dev_mc_sync?
  2022-03-21 19:13       ` Alexander Duyck
@ 2022-03-21 20:02         ` Jay Vosburgh
  0 siblings, 0 replies; 6+ messages in thread
From: Jay Vosburgh @ 2022-03-21 20:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Vladimir Oltean, Jakub Kicinski, Jiri Pirko, Florian Fainelli, netdev

Alexander Duyck <alexanderduyck@fb.com> wrote:

>> -----Original Message-----
>> From: Vladimir Oltean <olteanv@gmail.com>
>> Sent: Monday, March 21, 2022 11:45 AM
>> To: Alexander Duyck <alexanderduyck@fb.com>
>> Cc: Jakub Kicinski <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian
>> Fainelli <f.fainelli@gmail.com>; netdev@vger.kernel.org
>> Subject: Re: Possible to use both dev_mc_sync and __dev_mc_sync?
>> 
>> On Mon, Mar 21, 2022 at 08:42:59PM +0200, Vladimir Oltean wrote:
>> > On Mon, Mar 21, 2022 at 06:37:05PM +0000, Alexander Duyck wrote:
>> > > > -----Original Message-----
>> > > > From: Vladimir Oltean <olteanv@gmail.com>
>> > > > Sent: Monday, March 21, 2022 9:32 AM
>> > > > To: Alexander Duyck <alexanderduyck@fb.com>; Jakub Kicinski
>> > > > <kuba@kernel.org>; Jiri Pirko <jiri@nvidia.com>; Florian Fainelli
>> > > > <f.fainelli@gmail.com>
>> > > > Cc: netdev@vger.kernel.org
>> > > > Subject: Possible to use both dev_mc_sync and __dev_mc_sync?
>> > > I hadn't intended it to work this way. The expectation was that
>> > > __dev_mc_sync would be used by hardware devices whereas
>> dev_mc_sync
>> > > was used by stacked devices such as vlan or macvlan.
>> >
>> > Understood, thanks for confirming.
>> >
>> > > Probably the easiest way to address it is to split things up so that
>> > > you are using __dev_mc_sync if the switch supports mc filtering and
>> > > have your dsa_slave_sync/unsync_mc call also push it down to the
>> > > lower device, and then call dev_mc_sync after that so that if it
>> > > hasn't already been pushed to the lower device it gets pushed.
>> >
>> > Yes, I have a patch with that change, just wanted to make sure I'm not
>> > missing something. It's less efficient because now we need to check
>> > whether dsa_switch_supports_uc_filtering() for each address, whereas
>> > before we checked only once, before calling __dev_uc_add(). Oh well.
>> >
>> > > The assumption is that the lower device and the hardware would be
>> > > synced in the same way. If we can't go that route we may have to
>> > > look at implementing a different setup in terms of the reference
>> > > counting such as what is done in __hw_addr_sync_multiple.
>> >
>> > So as mentioned, I haven't really understood the internals of the
>> > reference/sync counting schemes being used here. But why are there
>> > different implementations for dev_mc_sync() and
>> dev_mc_sync_multiple()?
>> 
>> And on the same not of me not quite understanding what goes on, I wonder
>> why some multicast addresses get passed both to the lower dev and to
>> dsa_slave_sync_mc (which is why I didn't notice the problem in the first
>> place), while others don't.
>
>It all depends on the complexity of the setup. The standard __hw_addr_sync basically assumes you are operating in one of two states.
>Sync: sync_cnt == 0, refcount == 1 -> sync_cnt = 1, refcount++
>Unsync: sync_cnt == 1, refcount == 1 -> sync_cnt = 0, entry deleted
>
>I myself am not all that familiar with the multiple approach either,
>however it seems to operate on the idea that the reference count should
>always be greater than the sync count. So the device will hold one
>reference and it will sync the address as long as it doesn't already
>exist in the lower devices address table based on the rules in
>__hw_addr_add_ex.

	Pretty much, yes.  The _sync_multiple versions are for the case
of a device cloning its entire ucast and/or mcast address set to
multiple subordinate devices, e.g., a bond or team to its interfaces.
I've not poked at this in a while, but if memory serves the bond / team
itself is one reference, and then each subordinate device adds a
refcount and a sync_cnt, so the usual case is refcount == sync_cnt + 1.

	I believe this test in __hw_addr_sync_multiple:

		if (ha->sync_cnt == ha->refcount) {
			__hw_addr_unsync_one(to_list, from_list, ha, addr_len);

	is for the "removed a HW address from bond / team, then resync
to subordinate interfaces" case. I.e., the bond / team's refcount has
been released, and the correct action is to remove the "no longer on
bond / team" HW address from the subordinate.

	-J

>Also this might explain why some were synching while others weren't. It
>is possible that the lower dev already had the address present and as
>such was rejected for not being an exclusive address for this device.

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-21 20:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-21 16:32 Possible to use both dev_mc_sync and __dev_mc_sync? Vladimir Oltean
2022-03-21 18:37 ` Alexander Duyck
2022-03-21 18:42   ` Vladimir Oltean
2022-03-21 18:45     ` Vladimir Oltean
2022-03-21 19:13       ` Alexander Duyck
2022-03-21 20:02         ` Jay Vosburgh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).