netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Oltean <olteanv@gmail.com>
To: Ido Schimmel <idosch@idosch.org>
Cc: Nikolay Aleksandrov <nikolay@nvidia.com>,
	Vladimir Oltean <vladimir.oltean@nxp.com>,
	netdev@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Roopa Prabhu <roopa@nvidia.com>, Andrew Lunn <andrew@lunn.ch>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Vadym Kochan <vkochan@marvell.com>,
	Taras Chornyi <tchornyi@marvell.com>,
	Jiri Pirko <jiri@nvidia.com>, Ido Schimmel <idosch@nvidia.com>,
	UNGLinuxDriver@microchip.com,
	Grygorii Strashko <grygorii.strashko@ti.com>,
	Marek Behun <kabel@blackhole.sk>,
	DENG Qingfang <dqfext@gmail.com>,
	Kurt Kanzenbach <kurt@linutronix.de>,
	Hauke Mehrtens <hauke@hauke-m.de>,
	Woojung Huh <woojung.huh@microchip.com>,
	Sean Wang <sean.wang@mediatek.com>,
	Landen Chao <Landen.Chao@mediatek.com>,
	Claudiu Manoil <claudiu.manoil@nxp.com>,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	George McCollister <george.mccollister@gmail.com>,
	Ioana Ciornei <ioana.ciornei@nxp.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>,
	Lars Povlsen <lars.povlsen@microchip.com>,
	Steen Hegelund <Steen.Hegelund@microchip.com>,
	Julian Wiedmann <jwi@linux.ibm.com>,
	Karsten Graul <kgraul@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Ivan Vecera <ivecera@redhat.com>, Vlad Buslov <vladbu@nvidia.com>,
	Jianbo Liu <jianbol@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	Roi Dayan <roid@nvidia.com>,
	Tobias Waldekranz <tobias@waldekranz.com>,
	Vignesh Raghavendra <vigneshr@ti.com>,
	Jesse Brandeburg <jesse.brandeburg@intel.com>
Subject: Re: [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking
Date: Mon, 23 Aug 2021 19:11:57 +0300	[thread overview]
Message-ID: <20210823161157.mg3rtswzje3ww32k@skbuf> (raw)
In-Reply-To: <YSPGh4Fj3idApeFx@shredder>

On Mon, Aug 23, 2021 at 07:02:15PM +0300, Ido Schimmel wrote:
> > > Inside the work item you would do something like:
> > >
> > > spin_lock_bh()
> > > list_splice_init()
> > > spin_unlock_bh()
> > >
> > > mutex_lock() // rtnl or preferably private lock
> > > list_for_each_entry_safe()
> > > 	// process entry
> > > 	cond_resched()
> > > mutex_unlock()
> >
> > When is the work item scheduled in your proposal?
>
> Calling queue_work() whenever you get a notification. The work item
> might already be queued, which is fine.
>
> > I assume not only when SWITCHDEV_FDB_FLUSH_TO_DEVICE is emitted. Is
> > there some sort of timer to allow for some batching to occur?
>
> You can add an hysteresis timer if you want, but I don't think it's
> necessary. Assuming user space is programming entries at a high rate,
> then by the time you finish a batch, you will have a new one enqueued.

With the current model, nobody really stops any driver from doing that
if so they wish. No switchdev or bridge changes needed. We have maximum
flexibility now, with this async model. Yet it just so happens that no
one is exploiting it, and instead the existing options are poorly
utilized by most drivers.

> > > In del_nbp(), after br_fdb_delete_by_port(), the bridge will emit some
> > > new blocking event (e.g., SWITCHDEV_FDB_FLUSH_TO_DEVICE) that will
> > > instruct the driver to flush all its pending FDB notifications. You
> > > don't strictly need this notification because of the
> > > netdev_upper_dev_unlink() that follows, but it helps in making things
> > > more structured.
> > >
> > > Pros:
> > >
> > > 1. Solves your problem?
> > > 2. Pattern is not worse than what we currently have
> > > 3. Does not force RTNL
> > > 4. Allows for batching. For example, mlxsw has the ability to program up
> > > to 64 entries in one transaction with the device. I assume other devices
> > > in the same grade have similar capabilities
> > >
> > > Cons:
> > >
> > > 1. Asynchronous
> > > 2. Pattern we will see in multiple drivers? Can consider migrating it
> > > into switchdev itself at some point
> >
> > I can already flush_workqueue(dsa_owq) in dsa_port_pre_bridge_leave()
> > and this will solve the problem in the same way, will it not?
>
> Problem is that you will deadlock if your work item tries to take RTNL.

I think we agreed that the rtnl_lock could be dropped from driver FDB work items.
I have not tried that yet though.

> > It's not that I don't have driver-level solutions and hook points.
> > My concern is that there are way too many moving parts and the entrance
> > barrier for a new switchdev driver is getting higher and higher to
> > achieve even basic stuff.
>
> I understand the frustration, but that's my best proposal at the moment.
> IMO, it doesn't make things worse and has some nice advantages.

Reconsidering my options, I don't want to reduce the available optimizations
that other switchdev drivers can make, in the name of a simpler baseline.
I am also not smart enough for reworking the bridge data path.
I will probably do something like flush_workqueue in the PRECHANGEUPPER
handler, see what other common patterns there might be, and try to synthesize
them in library code (a la switchdev_handle_*) that can be used by drivers
that wish to, and ignored by drivers that don't.

  reply	other threads:[~2021-08-23 16:12 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-19 16:07 [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Vladimir Oltean
2021-08-19 16:07 ` [PATCH v2 net-next 1/5] net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking notifier chain Vladimir Oltean
2021-08-19 18:15   ` Vlad Buslov
2021-08-19 23:18     ` Vladimir Oltean
2021-08-20  7:36       ` Vlad Buslov
2021-08-19 16:07 ` [PATCH v2 net-next 2/5] net: bridge: switchdev: make br_fdb_replay offer sleepable context to consumers Vladimir Oltean
2021-08-19 16:07 ` [PATCH v2 net-next 3/5] net: switchdev: drop the atomic notifier block from switchdev_bridge_port_{,un}offload Vladimir Oltean
2021-08-19 16:07 ` [PATCH v2 net-next 4/5] net: switchdev: don't assume RCU context in switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
2021-08-19 16:07 ` [PATCH v2 net-next 5/5] net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously Vladimir Oltean
2021-08-20  9:16 ` [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking Ido Schimmel
2021-08-20  9:37   ` Vladimir Oltean
2021-08-20 16:09     ` Ido Schimmel
2021-08-20 17:06       ` Vladimir Oltean
2021-08-20 23:36         ` Nikolay Aleksandrov
2021-08-21  0:22           ` Vladimir Oltean
2021-08-22  6:48           ` Ido Schimmel
2021-08-22  9:12             ` Nikolay Aleksandrov
2021-08-22 13:31               ` Vladimir Oltean
2021-08-22 17:06                 ` Ido Schimmel
2021-08-22 17:44                   ` Vladimir Oltean
2021-08-23 10:47                     ` Ido Schimmel
2021-08-23 11:00                       ` Vladimir Oltean
2021-08-23 12:16                         ` Ido Schimmel
2021-08-23 14:29                           ` Vladimir Oltean
2021-08-23 15:18                             ` Ido Schimmel
2021-08-23 15:42                               ` Nikolay Aleksandrov
2021-08-23 15:42                               ` Vladimir Oltean
2021-08-23 16:02                                 ` Ido Schimmel
2021-08-23 16:11                                   ` Vladimir Oltean [this message]
2021-08-23 16:23                                   ` Vladimir Oltean
2021-08-20 10:49   ` Vladimir Oltean
2021-08-20 16:11     ` Ido Schimmel
2021-08-21 19:09       ` Vladimir Oltean
2021-08-22  7:19         ` Ido Schimmel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210823161157.mg3rtswzje3ww32k@skbuf \
    --to=olteanv@gmail.com \
    --cc=Landen.Chao@mediatek.com \
    --cc=Steen.Hegelund@microchip.com \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=alexandre.belloni@bootlin.com \
    --cc=andrew@lunn.ch \
    --cc=borntraeger@de.ibm.com \
    --cc=claudiu.manoil@nxp.com \
    --cc=davem@davemloft.net \
    --cc=dqfext@gmail.com \
    --cc=f.fainelli@gmail.com \
    --cc=george.mccollister@gmail.com \
    --cc=gor@linux.ibm.com \
    --cc=grygorii.strashko@ti.com \
    --cc=hauke@hauke-m.de \
    --cc=hca@linux.ibm.com \
    --cc=idosch@idosch.org \
    --cc=idosch@nvidia.com \
    --cc=ioana.ciornei@nxp.com \
    --cc=ivecera@redhat.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=jianbol@nvidia.com \
    --cc=jiri@nvidia.com \
    --cc=jwi@linux.ibm.com \
    --cc=kabel@blackhole.sk \
    --cc=kgraul@linux.ibm.com \
    --cc=kuba@kernel.org \
    --cc=kurt@linutronix.de \
    --cc=lars.povlsen@microchip.com \
    --cc=leon@kernel.org \
    --cc=mbloch@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@nvidia.com \
    --cc=roid@nvidia.com \
    --cc=roopa@nvidia.com \
    --cc=saeedm@nvidia.com \
    --cc=sean.wang@mediatek.com \
    --cc=tchornyi@marvell.com \
    --cc=tobias@waldekranz.com \
    --cc=vigneshr@ti.com \
    --cc=vivien.didelot@gmail.com \
    --cc=vkochan@marvell.com \
    --cc=vladbu@nvidia.com \
    --cc=vladimir.oltean@nxp.com \
    --cc=woojung.huh@microchip.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).