netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
To: Arkadi Sharshevsky <arkadis@mellanox.com>,
	Vivien Didelot <vivien.didelot@savoirfairelinux.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	netdev@vger.kernel.org
Cc: davem@davemloft.net, jiri@resnulli.us, ivecera@redhat.com,
	andrew@lunn.ch, Woojung.Huh@microchip.com,
	stephen@networkplumber.org, mlxsw@mellanox.com,
	roopa <roopa@cumulusnetworks.com>,
	Shrijeet Mukherjee <shm@cumulusnetworks.com>
Subject: Re: [PATCH net-next RFC 05/12] net: dsa: Add support for learning FDB through notification
Date: Tue, 11 Jul 2017 18:05:08 +0300	[thread overview]
Message-ID: <df578ae3-ec4e-5aca-6b1c-9bdb8db10774@cumulusnetworks.com> (raw)
In-Reply-To: <ccad80c4-2d1f-2efe-ebf6-29945d84f287@mellanox.com>

On 11/07/17 13:26, Arkadi Sharshevsky wrote:
> 
> 
> On 07/10/2017 11:59 PM, Vivien Didelot wrote:
>> Hi Arkadi,
>>
>> Arkadi Sharshevsky <arkadis@mellanox.com> writes:
>>
>>>>> +		err = dsa_port_fdb_add(p->dp, fdb_info->addr, fdb_info->vid);
>>>>> +		if (err) {
>>>>> +			netdev_dbg(dev, "fdb add failed err=%d\n", err);
>>>>> +			break;
>>>>> +		}
>>>>> +		call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, dev,
>>>>> +					 &fdb_info->info);
>>>>> +		break;
>>>>> +
>>>>> +	case SWITCHDEV_FDB_DEL_TO_DEVICE:
>>>>> +		fdb_info = &switchdev_work->fdb_info;
>>>>> +		err = dsa_port_fdb_del(p->dp, fdb_info->addr, fdb_info->vid);
>>>>> +		if (err)
>>>>> +			netdev_dbg(dev, "fdb del failed err=%d\n", err);
>>>>
>>>> OK I must have missed from the off-list discussion why we are not
>>>> calling the switchdev notifier here?
>>>
>>> We do not agree on it actually, that is why it was moved to the list.
>>> I think that delete should succeed, you should retry until succession.
>>>
>>> The deletion is done under spinlock in the bridge so you cannot block,
>>> thus delete cannot fail due to hardware failure. Calling it here doesn't
>>> make sense because the bridge probably already deleted this FDB.
>>
>> So as we discussed, the problem here is that if dsa_port_fdb_del fails
>> for some probable reasons (MDIO timeout, weak GPIO lines, etc.), Linux
>> bridge will delete the entry in software, dumping bridge fdb will show
>> nothing, but the entry would still be programmed in hardware and the
>> network can thus be inconsistent, unsupposedly switching frames.
>>
>> IMHO the correct way for bridge to use the notification chain is to make
>> SWITCHDEV_FDB_DEL_TO_DEVICE symmetrical to SWITCHDEV_FDB_ADD_TO_DEVICE:
>> if an entry has been marked as offloaded, bridge must mark the entry as
>> to-be-deleted and do not delete the software entry until the driver
>> notifies back the successful deletion.
>>
>> If that is hardly feasible due to some bridge limitations, we must
>> explain this in a comment and use something more explosive than a simple
>> netdev_dbg to warn the user about the broken network setup...
>>
>>
>> Thanks,
>>
>>         Vivien
>>
> 
> Hi Nikolay,
> 
> Vivien raised inconsistency issue with the current switchdev
> notification chain in case of FDB del. In case of static FDB delete,
> notification will be sent to the driver, followed by deletion of the
> software entry without waiting for the hardware delete. In case of
> hardware deletion failure the consecutive FDB dump will not show the
> deleted entry, yet, the entry will stay in hardware.
> 
> The deletion is done under lock thus the hardware deletion is deferred,
> and cannot fail due to hardware removal failure. Thus the above proposed
> solution by Vivien can lead to confusing situation:
> 
> 1. User deletes the entry
> 2. Deletion succeed
> 3. User dumps FDB and still sees this entry due to hardware failure,
>    what should he do? retry to delete until the FDB dump will not show
>    the entry?
> 
> Would like to hear you opinion about this solution.
> 
> IMHO in this case the driver should retry to delete, in case of
> several retries the driver should maybe:
> 1. Trap the traffic to CPU (dint think it possible in case of DSA).
> 2. Disable the port (its more explosive then netdev_dbg).
> 
> Thanks,
> Arkadi
> 
> 

Hi,
Looking at the code - it would seem that retrying is the only current option
with the way these switchdev notifications are handled. They cannot fail, meaning
from the bridge POV these ops must always succeed and errors are ignored, so the
driver should do everything possible to stay in sync, and in case all fails
then disabling the port seems like the best option to me, to show that something is
clearly wrong and avoid further issues, but DSA maintainers can comment more
on how to handle failure.

That being said:
This sounds a lot like the switchdev notifications vs callbacks discussions that we've
had in the past. Also what happened with the prepare+commit and all that ? If the hash_lock
is the main problem let's work towards improving that and making the fdb code handle
switchdev similar to the vlan code.

Cheers,
 Nik

  reply	other threads:[~2017-07-11 15:05 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-05 15:36 [PATCH net-next RFC 00/12] Change DSA's FDB API and perform switchdev cleanup Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 01/12] net: dsa: Change DSA slave FDB API to be switchdev independent Arkadi Sharshevsky
2017-07-05 19:13   ` Florian Fainelli
2017-07-05 15:36 ` [PATCH net-next RFC 02/12] net: dsa: Remove prepare phase for FDB Arkadi Sharshevsky
2017-07-05 19:25   ` Florian Fainelli
2017-07-06 18:20   ` Vivien Didelot
2017-07-05 15:36 ` [PATCH net-next RFC 03/12] net: dsa: Remove switchdev dependency from DSA switch notifier chain Arkadi Sharshevsky
2017-07-05 19:26   ` Florian Fainelli
2017-07-06 18:42   ` Vivien Didelot
2017-07-05 15:36 ` [PATCH net-next RFC 04/12] net: dsa: Add ordered workqueue Arkadi Sharshevsky
2017-07-05 19:37   ` Florian Fainelli
2017-07-06 18:45     ` Vivien Didelot
2017-07-09 12:56       ` Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 05/12] net: dsa: Add support for learning FDB through notification Arkadi Sharshevsky
2017-07-05 19:35   ` Florian Fainelli
2017-07-06 13:04     ` Arkadi Sharshevsky
2017-07-10 20:59       ` Vivien Didelot
2017-07-11 10:26         ` Arkadi Sharshevsky
2017-07-11 15:05           ` Nikolay Aleksandrov [this message]
2017-07-12 11:23             ` Arkadi Sharshevsky
2017-07-12 11:42               ` Nikolay Aleksandrov
2017-07-05 15:36 ` [PATCH net-next RFC 06/12] net: dsa: Remove support for FDB add/del via SELF Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 07/12] net: dsa: Add support for querying supported bridge flags Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 08/12] net: dsa: Remove support for bypass bridge port attributes/vlan set Arkadi Sharshevsky
2017-07-05 19:45   ` Florian Fainelli
2017-07-06 11:00     ` Arkadi Sharshevsky
2017-07-06 21:36       ` Florian Fainelli
2017-07-09 13:56         ` Arkadi Sharshevsky
2017-07-10 17:13     ` Vivien Didelot
2017-07-05 15:36 ` [PATCH net-next RFC 09/12] net: dsa: Remove redundant MDB dump support Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 10/12] net: dsa: Move FDB dump implementation inside DSA Arkadi Sharshevsky
2017-07-10 19:36   ` Vivien Didelot
2017-07-11  8:32     ` Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 11/12] net: bridge: Remove FDB deletion through switchdev object Arkadi Sharshevsky
2017-07-05 15:36 ` [PATCH net-next RFC 12/12] net: switchdev: Remove bridge bypass support from switchdev Arkadi Sharshevsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=df578ae3-ec4e-5aca-6b1c-9bdb8db10774@cumulusnetworks.com \
    --to=nikolay@cumulusnetworks.com \
    --cc=Woojung.Huh@microchip.com \
    --cc=andrew@lunn.ch \
    --cc=arkadis@mellanox.com \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=ivecera@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=mlxsw@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=roopa@cumulusnetworks.com \
    --cc=shm@cumulusnetworks.com \
    --cc=stephen@networkplumber.org \
    --cc=vivien.didelot@savoirfairelinux.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).