linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
@ 2021-03-11  7:23 Xie He
  2021-03-11 20:43 ` Jakub Kicinski
  2021-03-14  1:59 ` Xie He
  0 siblings, 2 replies; 11+ messages in thread
From: Xie He @ 2021-03-11  7:23 UTC (permalink / raw)
  To: Martin Schiller, David S. Miller, Jakub Kicinski, linux-x25,
	netdev, linux-kernel
  Cc: Xie He

There are two "netif_running" checks in this driver. One is in
"lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
sure that the LAPB APIs called in these functions are called before
"lapb_unregister" is called by the "ndo_stop" function.

However, these "netif_running" checks are unreliable, because it's
possible that immediately after "netif_running" returns true, "ndo_stop"
is called (which causes "lapb_unregister" to be called).

This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
reliably check and ensure the netif is running while doing their work.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <xie.he.0141@gmail.com>
---
 drivers/net/wan/lapbether.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wan/lapbether.c b/drivers/net/wan/lapbether.c
index c3372498f4f1..8fda0446ff71 100644
--- a/drivers/net/wan/lapbether.c
+++ b/drivers/net/wan/lapbether.c
@@ -51,6 +51,8 @@ struct lapbethdev {
 	struct list_head	node;
 	struct net_device	*ethdev;	/* link to ethernet device */
 	struct net_device	*axdev;		/* lapbeth device (lapb#) */
+	bool			up;
+	spinlock_t		up_lock;	/* Protects "up" */
 };
 
 static LIST_HEAD(lapbeth_devices);
@@ -101,8 +103,9 @@ static int lapbeth_rcv(struct sk_buff *skb, struct net_device *dev, struct packe
 	rcu_read_lock();
 	lapbeth = lapbeth_get_x25_dev(dev);
 	if (!lapbeth)
-		goto drop_unlock;
-	if (!netif_running(lapbeth->axdev))
+		goto drop_unlock_rcu;
+	spin_lock_bh(&lapbeth->up_lock);
+	if (!lapbeth->up)
 		goto drop_unlock;
 
 	len = skb->data[0] + skb->data[1] * 256;
@@ -117,11 +120,14 @@ static int lapbeth_rcv(struct sk_buff *skb, struct net_device *dev, struct packe
 		goto drop_unlock;
 	}
 out:
+	spin_unlock_bh(&lapbeth->up_lock);
 	rcu_read_unlock();
 	return 0;
 drop_unlock:
 	kfree_skb(skb);
 	goto out;
+drop_unlock_rcu:
+	rcu_read_unlock();
 drop:
 	kfree_skb(skb);
 	return 0;
@@ -151,13 +157,11 @@ static int lapbeth_data_indication(struct net_device *dev, struct sk_buff *skb)
 static netdev_tx_t lapbeth_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
+	struct lapbethdev *lapbeth = netdev_priv(dev);
 	int err;
 
-	/*
-	 * Just to be *really* sure not to send anything if the interface
-	 * is down, the ethernet device may have gone.
-	 */
-	if (!netif_running(dev))
+	spin_lock_bh(&lapbeth->up_lock);
+	if (!lapbeth->up)
 		goto drop;
 
 	/* There should be a pseudo header of 1 byte added by upper layers.
@@ -194,6 +198,7 @@ static netdev_tx_t lapbeth_xmit(struct sk_buff *skb,
 		goto drop;
 	}
 out:
+	spin_unlock_bh(&lapbeth->up_lock);
 	return NETDEV_TX_OK;
 drop:
 	kfree_skb(skb);
@@ -285,6 +290,7 @@ static const struct lapb_register_struct lapbeth_callbacks = {
  */
 static int lapbeth_open(struct net_device *dev)
 {
+	struct lapbethdev *lapbeth = netdev_priv(dev);
 	int err;
 
 	if ((err = lapb_register(dev, &lapbeth_callbacks)) != LAPB_OK) {
@@ -292,13 +298,22 @@ static int lapbeth_open(struct net_device *dev)
 		return -ENODEV;
 	}
 
+	spin_lock_bh(&lapbeth->up_lock);
+	lapbeth->up = true;
+	spin_unlock_bh(&lapbeth->up_lock);
+
 	return 0;
 }
 
 static int lapbeth_close(struct net_device *dev)
 {
+	struct lapbethdev *lapbeth = netdev_priv(dev);
 	int err;
 
+	spin_lock_bh(&lapbeth->up_lock);
+	lapbeth->up = false;
+	spin_unlock_bh(&lapbeth->up_lock);
+
 	if ((err = lapb_unregister(dev)) != LAPB_OK)
 		pr_err("lapb_unregister error: %d\n", err);
 
@@ -356,6 +371,9 @@ static int lapbeth_new_device(struct net_device *dev)
 	dev_hold(dev);
 	lapbeth->ethdev = dev;
 
+	lapbeth->up = false;
+	spin_lock_init(&lapbeth->up_lock);
+
 	rc = -EIO;
 	if (register_netdevice(ndev))
 		goto fail;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-11  7:23 [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running Xie He
@ 2021-03-11 20:43 ` Jakub Kicinski
  2021-03-11 21:12   ` Xie He
  2021-03-14  1:59 ` Xie He
  1 sibling, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-03-11 20:43 UTC (permalink / raw)
  To: Xie He; +Cc: Martin Schiller, David S. Miller, linux-x25, netdev, linux-kernel

On Wed, 10 Mar 2021 23:23:09 -0800 Xie He wrote:
> There are two "netif_running" checks in this driver. One is in
> "lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
> sure that the LAPB APIs called in these functions are called before
> "lapb_unregister" is called by the "ndo_stop" function.
> 
> However, these "netif_running" checks are unreliable, because it's
> possible that immediately after "netif_running" returns true, "ndo_stop"
> is called (which causes "lapb_unregister" to be called).
> 
> This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
> reliably check and ensure the netif is running while doing their work.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Xie He <xie.he.0141@gmail.com>

Is this a theoretical issues or do you see a path where it triggers?

Who are the callers sending frames to a device which went down?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-11 20:43 ` Jakub Kicinski
@ 2021-03-11 21:12   ` Xie He
  2021-03-11 22:52     ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Xie He @ 2021-03-11 21:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Martin Schiller, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML

On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Is this a theoretical issues or do you see a path where it triggers?
>
> Who are the callers sending frames to a device which went down?

This is a theoretical issue. I didn't see this issue in practice.

When "__dev_queue_xmit" and "sch_direct_xmit" call
"dev_hard_start_xmit", there appears to be no locking mechanism
preventing the netif from going down while "dev_hard_start_xmit" is
doing its work.

David once confirmed in an email that a driver's "ndo_stop" function
would indeed race with its "ndo_start_xmit" function:
https://lore.kernel.org/netdev/20190520.200922.2277656639346033061.davem@davemloft.net/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-11 21:12   ` Xie He
@ 2021-03-11 22:52     ` Jakub Kicinski
  2021-03-11 23:13       ` Xie He
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-03-11 22:52 UTC (permalink / raw)
  To: Xie He
  Cc: Martin Schiller, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML

On Thu, 11 Mar 2021 13:12:25 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > Is this a theoretical issues or do you see a path where it triggers?
> >
> > Who are the callers sending frames to a device which went down?  
> 
> This is a theoretical issue. I didn't see this issue in practice.
> 
> When "__dev_queue_xmit" and "sch_direct_xmit" call
> "dev_hard_start_xmit", there appears to be no locking mechanism
> preventing the netif from going down while "dev_hard_start_xmit" is
> doing its work.

Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
locks, so unless your driver is lockless (LLTX) there should be no xmit
calls after that point.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-11 22:52     ` Jakub Kicinski
@ 2021-03-11 23:13       ` Xie He
  2021-03-12  0:10         ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Xie He @ 2021-03-11 23:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Martin Schiller, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML

On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> locks, so unless your driver is lockless (LLTX) there should be no xmit
> calls after that point.

Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
function as a fix for the racing between "ndo_stop" and
"ndo_start_xmit"?

I can't call "netif_tx_disable" inside my "ndo_stop" function because
"netif_tx_disable" will call "netif_tx_stop_queue", which causes
another racing problem. Please see my recent commit f7d9d4854519
("net: lapbether: Remove netif_start_queue / netif_stop_queue")

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-11 23:13       ` Xie He
@ 2021-03-12  0:10         ` Jakub Kicinski
  2021-03-12  0:28           ` Xie He
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-03-12  0:10 UTC (permalink / raw)
  To: Xie He
  Cc: Martin Schiller, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML

On Thu, 11 Mar 2021 15:13:01 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> > locks, so unless your driver is lockless (LLTX) there should be no xmit
> > calls after that point.  
> 
> Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
> function as a fix for the racing between "ndo_stop" and
> "ndo_start_xmit"?
> 
> I can't call "netif_tx_disable" inside my "ndo_stop" function because
> "netif_tx_disable" will call "netif_tx_stop_queue", which causes
> another racing problem. Please see my recent commit f7d9d4854519
> ("net: lapbether: Remove netif_start_queue / netif_stop_queue")

And the "noqueue" queue is there because it's on top of hdlc_fr.c
somehow or some out of tree driver? Or do you install it manually?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-12  0:10         ` Jakub Kicinski
@ 2021-03-12  0:28           ` Xie He
  2021-03-12  1:40             ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Xie He @ 2021-03-12  0:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Martin Schiller, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML

On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> And the "noqueue" queue is there because it's on top of hdlc_fr.c
> somehow or some out of tree driver? Or do you install it manually?

No, this driver is not related to "hdlc_fr.c" or any out-of-tree
driver. The default qdisc is "noqueue" for this driver because this
driver doesn't set "tx_queue_len". This means the value of
"tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
automatically add the "IFF_NO_QUEUE" flag to the device, then
"attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-12  0:28           ` Xie He
@ 2021-03-12  1:40             ` Jakub Kicinski
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2021-03-12  1:40 UTC (permalink / raw)
  To: Xie He
  Cc: Martin Schiller, David S. Miller, Linux X25,
	Linux Kernel Network Developers, LKML

On Thu, 11 Mar 2021 16:28:47 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > And the "noqueue" queue is there because it's on top of hdlc_fr.c
> > somehow or some out of tree driver? Or do you install it manually?  
> 
> No, this driver is not related to "hdlc_fr.c" or any out-of-tree
> driver. The default qdisc is "noqueue" for this driver because this
> driver doesn't set "tx_queue_len". This means the value of
> "tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
> automatically add the "IFF_NO_QUEUE" flag to the device, then
> "attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
> the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.

I see.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-11  7:23 [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running Xie He
  2021-03-11 20:43 ` Jakub Kicinski
@ 2021-03-14  1:59 ` Xie He
  2021-03-16 15:17   ` Martin Schiller
  1 sibling, 1 reply; 11+ messages in thread
From: Xie He @ 2021-03-14  1:59 UTC (permalink / raw)
  To: Martin Schiller
  Cc: David S. Miller, Jakub Kicinski, Linux X25,
	Linux Kernel Network Developers, LKML

Hi Martin,

Could you ack? Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-14  1:59 ` Xie He
@ 2021-03-16 15:17   ` Martin Schiller
  2021-03-16 15:54     ` Xie He
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Schiller @ 2021-03-16 15:17 UTC (permalink / raw)
  To: Xie He
  Cc: David S. Miller, Jakub Kicinski, Linux X25,
	Linux Kernel Network Developers, LKML

On 2021-03-14 02:59, Xie He wrote:
> Hi Martin,
> 
> Could you ack? Thanks!

Acked-by: Martin Schiller <ms@dev.tdt.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running
  2021-03-16 15:17   ` Martin Schiller
@ 2021-03-16 15:54     ` Xie He
  0 siblings, 0 replies; 11+ messages in thread
From: Xie He @ 2021-03-16 15:54 UTC (permalink / raw)
  To: Martin Schiller
  Cc: David S. Miller, Jakub Kicinski, Linux X25,
	Linux Kernel Network Developers, LKML

On Tue, Mar 16, 2021 at 8:17 AM Martin Schiller <ms@dev.tdt.de> wrote:
>
> On 2021-03-14 02:59, Xie He wrote:
> > Hi Martin,
> >
> > Could you ack? Thanks!
>
> Acked-by: Martin Schiller <ms@dev.tdt.de>

Thank you!!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-03-16 15:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-11  7:23 [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running Xie He
2021-03-11 20:43 ` Jakub Kicinski
2021-03-11 21:12   ` Xie He
2021-03-11 22:52     ` Jakub Kicinski
2021-03-11 23:13       ` Xie He
2021-03-12  0:10         ` Jakub Kicinski
2021-03-12  0:28           ` Xie He
2021-03-12  1:40             ` Jakub Kicinski
2021-03-14  1:59 ` Xie He
2021-03-16 15:17   ` Martin Schiller
2021-03-16 15:54     ` Xie He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).