All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API
@ 2022-02-07 14:47 Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 01/14] net: ieee802154: Move the logic restarting the queue upon transmission Miquel Raynal
                   ` (13 more replies)
  0 siblings, 14 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

The idea here is to provide a fully synchronous Tx API and also be able
to be sure that a transfer as finished. This will be used later by
another series.

The first patches create an error helper and then use it in order to
have only two "end of transmission" helpers that are always called.

Then, a bit of cleanup regarding the naming and the locations of certain
peaces of code is done.

Finally, we create a hot and a slow path, add the necessary logic to be
able to track ongoing transfers and when the queue must be kept on hold,
until we finally create a helper to stop emitting after the last
transfer, which we then use to create a synchronous MLME API.

Changes in v2:
* Adapted with the changes already merged/refused.

Miquel Raynal (14):
  net: ieee802154: Move the logic restarting the queue upon transmission
  net: mac802154: Create a transmit error helper
  net: ieee802154: at86rf230: Call _xmit_error() when a transmission
    fails
  net: ieee802154: atusb: Call _xmit_error() when a transmission fails
  net: ieee802154: ca8210: Call _xmit_error() when a transmission fails
  net: mac802154: Stop exporting ieee802154_wake/stop_queue()
  net: mac802154: Rename the synchronous xmit worker
  net: mac802154: Rename the main tx_work struct
  net: mac802154: Follow the count of ongoing transmissions
  net: mac802154: Hold the transmit queue when relevant
  net: mac802154: Create a hot tx path
  net: mac802154: Add a warning in the hot path
  net: mac802154: Introduce a tx queue flushing mechanism
  net: mac802154: Introduce a synchronous API for MLME commands

 drivers/net/ieee802154/at86rf230.c |  3 +-
 drivers/net/ieee802154/atusb.c     |  4 +--
 drivers/net/ieee802154/ca8210.c    |  7 +++--
 include/net/cfg802154.h            |  5 ++++
 include/net/mac802154.h            | 37 +++++++----------------
 net/ieee802154/core.c              |  1 +
 net/mac802154/cfg.c                |  5 ++--
 net/mac802154/ieee802154_i.h       | 35 ++++++++++++++++++++--
 net/mac802154/main.c               |  2 +-
 net/mac802154/tx.c                 | 48 +++++++++++++++++++++++++-----
 net/mac802154/util.c               | 39 ++++++++++++++++++++----
 11 files changed, 134 insertions(+), 52 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 01/14] net: ieee802154: Move the logic restarting the queue upon transmission
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper Miquel Raynal
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

Create a new helper with the logic restarting the queue upon
transmission, so that we can create a second path for error conditions
which can reuse that code easily.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 net/mac802154/util.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/net/mac802154/util.c b/net/mac802154/util.c
index f2078238718b..6f82418e9dec 100644
--- a/net/mac802154/util.c
+++ b/net/mac802154/util.c
@@ -55,8 +55,9 @@ enum hrtimer_restart ieee802154_xmit_ifs_timer(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
-void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
-			      bool ifs_handling)
+static void
+ieee802154_wakeup_after_xmit_done(struct ieee802154_hw *hw, bool ifs_handling,
+				  unsigned int skb_len)
 {
 	if (ifs_handling) {
 		struct ieee802154_local *local = hw_to_local(hw);
@@ -72,7 +73,7 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
 		else
 			max_sifs_size = IEEE802154_MAX_SIFS_FRAME_SIZE;
 
-		if (skb->len > max_sifs_size)
+		if (skb_len > max_sifs_size)
 			hrtimer_start(&local->ifs_timer,
 				      hw->phy->lifs_period * NSEC_PER_USEC,
 				      HRTIMER_MODE_REL);
@@ -83,8 +84,21 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
 	} else {
 		ieee802154_wake_queue(hw);
 	}
+}
+
+static void ieee802154_xmit_end(struct ieee802154_hw *hw, bool ifs_handling,
+				unsigned int skb_len)
+{
+	ieee802154_wakeup_after_xmit_done(hw, ifs_handling, skb_len);
+}
+
+void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
+			      bool ifs_handling)
+{
+	unsigned int skb_len = skb->len;
 
 	dev_consume_skb_any(skb);
+	ieee802154_xmit_end(hw, ifs_handling, skb_len);
 }
 EXPORT_SYMBOL(ieee802154_xmit_complete);
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 01/14] net: ieee802154: Move the logic restarting the queue upon transmission Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-20 23:31   ` Alexander Aring
  2022-02-07 14:47 ` [PATCH wpan-next v2 03/14] net: ieee802154: at86rf230: Call _xmit_error() when a transmission fails Miquel Raynal
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

So far there is only a helper for successful transmission, which led
device drivers to implement their own handling in case of
error. Unfortunately, we really need all the drivers to give the hand
back to the core once they are done in order to be able to build a
proper synchronous API. So let's create a _xmit_error() helper.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 include/net/mac802154.h | 10 ++++++++++
 net/mac802154/util.c    | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/include/net/mac802154.h b/include/net/mac802154.h
index 2c3bbc6645ba..9fe8cfef1ba0 100644
--- a/include/net/mac802154.h
+++ b/include/net/mac802154.h
@@ -498,4 +498,14 @@ void ieee802154_stop_queue(struct ieee802154_hw *hw);
 void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
 			      bool ifs_handling);
 
+/**
+ * ieee802154_xmit_error - frame transmission failed
+ *
+ * @hw: pointer as obtained from ieee802154_alloc_hw().
+ * @skb: buffer for transmission
+ * @ifs_handling: indicate interframe space handling
+ */
+void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
+			   bool ifs_handling);
+
 #endif /* NET_MAC802154_H */
diff --git a/net/mac802154/util.c b/net/mac802154/util.c
index 6f82418e9dec..9016f634efba 100644
--- a/net/mac802154/util.c
+++ b/net/mac802154/util.c
@@ -102,6 +102,16 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(ieee802154_xmit_complete);
 
+void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
+			   bool ifs_handling)
+{
+	unsigned int skb_len = skb->len;
+
+	dev_kfree_skb_any(skb);
+	ieee802154_xmit_end(hw, ifs_handling, skb_len);
+}
+EXPORT_SYMBOL(ieee802154_xmit_error);
+
 void ieee802154_stop_device(struct ieee802154_local *local)
 {
 	flush_workqueue(local->workqueue);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 03/14] net: ieee802154: at86rf230: Call _xmit_error() when a transmission fails
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 01/14] net: ieee802154: Move the logic restarting the queue upon transmission Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 04/14] net: ieee802154: atusb: " Miquel Raynal
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

ieee802154_xmit_error() is the right helper to call when a transmission
has failed. Let's use it instead of open-coding it.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/net/ieee802154/at86rf230.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
index 563031ce76f0..5d714c6ec9b4 100644
--- a/drivers/net/ieee802154/at86rf230.c
+++ b/drivers/net/ieee802154/at86rf230.c
@@ -346,8 +346,7 @@ at86rf230_async_error_recover_complete(void *context)
 
 	if (lp->was_tx) {
 		lp->was_tx = 0;
-		dev_kfree_skb_any(lp->tx_skb);
-		ieee802154_wake_queue(lp->hw);
+		ieee802154_xmit_error(lp->hw, lp->tx_skb, false);
 	}
 }
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 04/14] net: ieee802154: atusb: Call _xmit_error() when a transmission fails
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (2 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 03/14] net: ieee802154: at86rf230: Call _xmit_error() when a transmission fails Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-20 23:35   ` Alexander Aring
  2022-02-07 14:47 ` [PATCH wpan-next v2 05/14] net: ieee802154: ca8210: " Miquel Raynal
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

ieee802154_xmit_error() is the right helper to call when a transmission
has failed. Let's use it instead of open-coding it.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/net/ieee802154/atusb.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
index f27a5f535808..0e6f180b4e79 100644
--- a/drivers/net/ieee802154/atusb.c
+++ b/drivers/net/ieee802154/atusb.c
@@ -271,9 +271,7 @@ static void atusb_tx_done(struct atusb *atusb, u8 seq)
 		 * unlikely case now that seq == expect is then true, but can
 		 * happen and fail with a tx_skb = NULL;
 		 */
-		ieee802154_wake_queue(atusb->hw);
-		if (atusb->tx_skb)
-			dev_kfree_skb_irq(atusb->tx_skb);
+		ieee802154_xmit_error(atusb->hw, atusb->tx_skb, false);
 	}
 }
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 05/14] net: ieee802154: ca8210: Call _xmit_error() when a transmission fails
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (3 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 04/14] net: ieee802154: atusb: " Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 06/14] net: mac802154: Stop exporting ieee802154_wake/stop_queue() Miquel Raynal
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

ieee802154_xmit_error() is the right helper to call when a transmission
has failed. Let's use it instead of open-coding it.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/net/ieee802154/ca8210.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c
index fc74fa0f1ddd..1dfc5528f295 100644
--- a/drivers/net/ieee802154/ca8210.c
+++ b/drivers/net/ieee802154/ca8210.c
@@ -1765,17 +1765,20 @@ static int ca8210_async_xmit_complete(
 	priv->nextmsduhandle++;
 
 	if (status) {
+		bool ifs_handling =
+			status == MAC_TRANSACTION_OVERFLOW ? true : false;
+
 		dev_err(
 			&priv->spi->dev,
 			"Link transmission unsuccessful, status = %d\n",
 			status
 		);
 		if (status != MAC_TRANSACTION_OVERFLOW) {
-			dev_kfree_skb_any(priv->tx_skb);
-			ieee802154_wake_queue(priv->hw);
+			ieee802154_xmit_error(priv->hw, priv->tx_skb, ifs_handling);
 			return 0;
 		}
 	}
+
 	ieee802154_xmit_complete(priv->hw, priv->tx_skb, true);
 
 	return 0;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 06/14] net: mac802154: Stop exporting ieee802154_wake/stop_queue()
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (4 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 05/14] net: ieee802154: ca8210: " Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 07/14] net: mac802154: Rename the synchronous xmit worker Miquel Raynal
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

Individual drivers do not necessarily need to call these helpers
manually. There are other functions, more suited for this purpose, that
will do that for them. The advantage is that, as no more drivers call
these, it eases the tracking of the ongoing transfers that we are about
to introduce while keeping the possibility to bypass thse counters from
core code.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 include/net/mac802154.h      | 27 ---------------------------
 net/mac802154/ieee802154_i.h | 24 ++++++++++++++++++++++++
 net/mac802154/util.c         |  2 --
 3 files changed, 24 insertions(+), 29 deletions(-)

diff --git a/include/net/mac802154.h b/include/net/mac802154.h
index 9fe8cfef1ba0..9e2e2b2cd65e 100644
--- a/include/net/mac802154.h
+++ b/include/net/mac802154.h
@@ -460,33 +460,6 @@ void ieee802154_unregister_hw(struct ieee802154_hw *hw);
  */
 void ieee802154_rx_irqsafe(struct ieee802154_hw *hw, struct sk_buff *skb,
 			   u8 lqi);
-/**
- * ieee802154_wake_queue - wake ieee802154 queue
- * @hw: pointer as obtained from ieee802154_alloc_hw().
- *
- * Tranceivers usually have either one transmit framebuffer or one framebuffer
- * for both transmitting and receiving. Hence, the core currently only handles
- * one frame at a time for each phy, which means we had to stop the queue to
- * avoid new skb to come during the transmission. The queue then needs to be
- * woken up after the operation.
- *
- * Drivers should use this function instead of netif_wake_queue.
- */
-void ieee802154_wake_queue(struct ieee802154_hw *hw);
-
-/**
- * ieee802154_stop_queue - stop ieee802154 queue
- * @hw: pointer as obtained from ieee802154_alloc_hw().
- *
- * Tranceivers usually have either one transmit framebuffer or one framebuffer
- * for both transmitting and receiving. Hence, the core currently only handles
- * one frame at a time for each phy, which means we need to tell upper layers to
- * stop giving us new skbs while we are busy with the transmitted one. The queue
- * must then be stopped before transmitting.
- *
- * Drivers should use this function instead of netif_stop_queue.
- */
-void ieee802154_stop_queue(struct ieee802154_hw *hw);
 
 /**
  * ieee802154_xmit_complete - frame transmission complete
diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index 702560acc8ce..c9451d18de31 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -128,6 +128,30 @@ netdev_tx_t
 ieee802154_subif_start_xmit(struct sk_buff *skb, struct net_device *dev);
 enum hrtimer_restart ieee802154_xmit_ifs_timer(struct hrtimer *timer);
 
+/**
+ * ieee802154_wake_queue - wake ieee802154 queue
+ * @hw: pointer as obtained from ieee802154_alloc_hw().
+ *
+ * Tranceivers usually have either one transmit framebuffer or one framebuffer
+ * for both transmitting and receiving. Hence, the core currently only handles
+ * one frame at a time for each phy, which means we had to stop the queue to
+ * avoid new skb to come during the transmission. The queue then needs to be
+ * woken up after the operation.
+ */
+void ieee802154_wake_queue(struct ieee802154_hw *hw);
+
+/**
+ * ieee802154_stop_queue - stop ieee802154 queue
+ * @hw: pointer as obtained from ieee802154_alloc_hw().
+ *
+ * Tranceivers usually have either one transmit framebuffer or one framebuffer
+ * for both transmitting and receiving. Hence, the core currently only handles
+ * one frame at a time for each phy, which means we need to tell upper layers to
+ * stop giving us new skbs while we are busy with the transmitted one. The queue
+ * must then be stopped before transmitting.
+ */
+void ieee802154_stop_queue(struct ieee802154_hw *hw);
+
 /* MIB callbacks */
 void mac802154_dev_set_page_channel(struct net_device *dev, u8 page, u8 chan);
 
diff --git a/net/mac802154/util.c b/net/mac802154/util.c
index 9016f634efba..6c56884ed508 100644
--- a/net/mac802154/util.c
+++ b/net/mac802154/util.c
@@ -27,7 +27,6 @@ void ieee802154_wake_queue(struct ieee802154_hw *hw)
 	}
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(ieee802154_wake_queue);
 
 void ieee802154_stop_queue(struct ieee802154_hw *hw)
 {
@@ -43,7 +42,6 @@ void ieee802154_stop_queue(struct ieee802154_hw *hw)
 	}
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(ieee802154_stop_queue);
 
 enum hrtimer_restart ieee802154_xmit_ifs_timer(struct hrtimer *timer)
 {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 07/14] net: mac802154: Rename the synchronous xmit worker
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (5 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 06/14] net: mac802154: Stop exporting ieee802154_wake/stop_queue() Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 08/14] net: mac802154: Rename the main tx_work struct Miquel Raynal
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

There are currently two driver hooks: one is synchronous, the other is
not. We cannot rely on driver implementations to provide a synchronous
API (which is related to the bus medium more than a wish to have a
synchronized implementation) so we are going to introduce a sync API
above any kind of driver transmit function. In order to clarify what
this worker is for (synchronous driver implementation), let's rename it
so that people don't get bothered by the fact that their driver does not
make use of the "xmit worker" which is a too generic name.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 net/mac802154/ieee802154_i.h | 2 +-
 net/mac802154/main.c         | 2 +-
 net/mac802154/tx.c           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index c9451d18de31..a44a8244dc8d 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -121,7 +121,7 @@ ieee802154_sdata_running(struct ieee802154_sub_if_data *sdata)
 extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
 
 void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
-void ieee802154_xmit_worker(struct work_struct *work);
+void ieee802154_xmit_sync_worker(struct work_struct *work);
 netdev_tx_t
 ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t
diff --git a/net/mac802154/main.c b/net/mac802154/main.c
index 5546ef86e231..13c6b3cd0429 100644
--- a/net/mac802154/main.c
+++ b/net/mac802154/main.c
@@ -95,7 +95,7 @@ ieee802154_alloc_hw(size_t priv_data_len, const struct ieee802154_ops *ops)
 
 	skb_queue_head_init(&local->skb_queue);
 
-	INIT_WORK(&local->tx_work, ieee802154_xmit_worker);
+	INIT_WORK(&local->tx_work, ieee802154_xmit_sync_worker);
 
 	/* init supported flags with 802.15.4 default ranges */
 	phy->supported.max_minbe = 8;
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index c829e4a75325..97df5985b830 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -22,7 +22,7 @@
 #include "ieee802154_i.h"
 #include "driver-ops.h"
 
-void ieee802154_xmit_worker(struct work_struct *work)
+void ieee802154_xmit_sync_worker(struct work_struct *work)
 {
 	struct ieee802154_local *local =
 		container_of(work, struct ieee802154_local, tx_work);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 08/14] net: mac802154: Rename the main tx_work struct
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (6 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 07/14] net: mac802154: Rename the synchronous xmit worker Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:47 ` [PATCH wpan-next v2 09/14] net: mac802154: Follow the count of ongoing transmissions Miquel Raynal
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

This entry is dedicated to synchronous transmissions done by drivers
without async hook. Make this clearer that this is not a work that any
driver can use by at least prefixing it with "sync_". While at it, let's
enhance the comment explaining why we choose one or the other.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 net/mac802154/ieee802154_i.h | 2 +-
 net/mac802154/main.c         | 2 +-
 net/mac802154/tx.c           | 9 ++++++---
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index a44a8244dc8d..d3bcc097e491 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -55,7 +55,7 @@ struct ieee802154_local {
 	struct sk_buff_head skb_queue;
 
 	struct sk_buff *tx_skb;
-	struct work_struct tx_work;
+	struct work_struct sync_tx_work;
 };
 
 enum {
diff --git a/net/mac802154/main.c b/net/mac802154/main.c
index 13c6b3cd0429..46258c6d430f 100644
--- a/net/mac802154/main.c
+++ b/net/mac802154/main.c
@@ -95,7 +95,7 @@ ieee802154_alloc_hw(size_t priv_data_len, const struct ieee802154_ops *ops)
 
 	skb_queue_head_init(&local->skb_queue);
 
-	INIT_WORK(&local->tx_work, ieee802154_xmit_sync_worker);
+	INIT_WORK(&local->sync_tx_work, ieee802154_xmit_sync_worker);
 
 	/* init supported flags with 802.15.4 default ranges */
 	phy->supported.max_minbe = 8;
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 97df5985b830..a01689ddd547 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -25,7 +25,7 @@
 void ieee802154_xmit_sync_worker(struct work_struct *work)
 {
 	struct ieee802154_local *local =
-		container_of(work, struct ieee802154_local, tx_work);
+		container_of(work, struct ieee802154_local, sync_tx_work);
 	struct sk_buff *skb = local->tx_skb;
 	struct net_device *dev = skb->dev;
 	int res;
@@ -76,7 +76,10 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 	/* Stop the netif queue on each sub_if_data object. */
 	ieee802154_stop_queue(&local->hw);
 
-	/* async is priority, otherwise sync is fallback */
+	/* Drivers should preferably implement the async callback. In some rare
+	 * cases they only provide a sync callback which we will use as a
+	 * fallback.
+	 */
 	if (local->ops->xmit_async) {
 		unsigned int len = skb->len;
 
@@ -90,7 +93,7 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 		dev->stats.tx_bytes += len;
 	} else {
 		local->tx_skb = skb;
-		queue_work(local->workqueue, &local->tx_work);
+		queue_work(local->workqueue, &local->sync_tx_work);
 	}
 
 	return NETDEV_TX_OK;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 09/14] net: mac802154: Follow the count of ongoing transmissions
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (7 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 08/14] net: mac802154: Rename the main tx_work struct Miquel Raynal
@ 2022-02-07 14:47 ` Miquel Raynal
  2022-02-07 14:48 ` [PATCH wpan-next v2 10/14] net: mac802154: Hold the transmit queue when relevant Miquel Raynal
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:47 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

In order to create a synchronous API for MLME command purposes, we need
to be able to track the end of the ongoing transmissions. Let's
introduce an atomic variable which is incremented and decremented when
relevant and now at any moment if a there is an ongoing transmission.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 include/net/cfg802154.h | 3 +++
 net/mac802154/tx.c      | 3 +++
 net/mac802154/util.c    | 1 +
 3 files changed, 7 insertions(+)

diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
index 85f9e8417688..473ebcb9b155 100644
--- a/include/net/cfg802154.h
+++ b/include/net/cfg802154.h
@@ -214,6 +214,9 @@ struct wpan_phy {
 	/* the network namespace this phy lives in currently */
 	possible_net_t _net;
 
+	/* Transmission monitoring */
+	atomic_t ongoing_txs;
+
 	char priv[] __aligned(NETDEV_ALIGN);
 };
 
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index a01689ddd547..731e86bfe73f 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -45,6 +45,7 @@ void ieee802154_xmit_sync_worker(struct work_struct *work)
 	/* Restart the netif queue on each sub_if_data object. */
 	ieee802154_wake_queue(&local->hw);
 	kfree_skb(skb);
+	atomic_dec(&local->phy->ongoing_txs);
 	netdev_dbg(dev, "transmission failed\n");
 }
 
@@ -80,6 +81,7 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 	 * cases they only provide a sync callback which we will use as a
 	 * fallback.
 	 */
+	atomic_inc(&local->phy->ongoing_txs);
 	if (local->ops->xmit_async) {
 		unsigned int len = skb->len;
 
@@ -99,6 +101,7 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 	return NETDEV_TX_OK;
 
 err_tx:
+	atomic_dec(&local->phy->ongoing_txs);
 	kfree_skb(skb);
 	return NETDEV_TX_OK;
 }
diff --git a/net/mac802154/util.c b/net/mac802154/util.c
index 6c56884ed508..a72d202c212b 100644
--- a/net/mac802154/util.c
+++ b/net/mac802154/util.c
@@ -88,6 +88,7 @@ static void ieee802154_xmit_end(struct ieee802154_hw *hw, bool ifs_handling,
 				unsigned int skb_len)
 {
 	ieee802154_wakeup_after_xmit_done(hw, ifs_handling, skb_len);
+	atomic_dec(&hw->phy->ongoing_txs);
 }
 
 void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 10/14] net: mac802154: Hold the transmit queue when relevant
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (8 preceding siblings ...)
  2022-02-07 14:47 ` [PATCH wpan-next v2 09/14] net: mac802154: Follow the count of ongoing transmissions Miquel Raynal
@ 2022-02-07 14:48 ` Miquel Raynal
  2022-02-07 14:48 ` [PATCH wpan-next v2 11/14] net: mac802154: Create a hot tx path Miquel Raynal
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:48 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

Let's create a hold_txs atomic variable and increment/decrement it when
relevant. A current use is during a suspend. Very soon we will also use
this feature during scans.

When the variable is incremented, any further call to helpers usually
waking up the queue will skip this part because it is the core
responsibility to wake up the queue when relevant.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 include/net/cfg802154.h      | 3 ++-
 net/mac802154/cfg.c          | 4 +++-
 net/mac802154/ieee802154_i.h | 5 +++++
 net/mac802154/tx.c           | 7 +++++--
 net/mac802154/util.c         | 7 ++++++-
 5 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
index 473ebcb9b155..043d8e4359e7 100644
--- a/include/net/cfg802154.h
+++ b/include/net/cfg802154.h
@@ -214,8 +214,9 @@ struct wpan_phy {
 	/* the network namespace this phy lives in currently */
 	possible_net_t _net;
 
-	/* Transmission monitoring */
+	/* Transmission monitoring and control */
 	atomic_t ongoing_txs;
+	atomic_t hold_txs;
 
 	char priv[] __aligned(NETDEV_ALIGN);
 };
diff --git a/net/mac802154/cfg.c b/net/mac802154/cfg.c
index 1e4a9f74ed43..e8aabf215286 100644
--- a/net/mac802154/cfg.c
+++ b/net/mac802154/cfg.c
@@ -46,6 +46,7 @@ static int ieee802154_suspend(struct wpan_phy *wpan_phy)
 	if (!local->open_count)
 		goto suspend;
 
+	atomic_inc(&wpan_phy->hold_txs);
 	ieee802154_stop_queue(&local->hw);
 	synchronize_net();
 
@@ -72,7 +73,8 @@ static int ieee802154_resume(struct wpan_phy *wpan_phy)
 		return ret;
 
 wake_up:
-	ieee802154_wake_queue(&local->hw);
+	if (!atomic_dec_and_test(&wpan_phy->hold_txs))
+		ieee802154_wake_queue(&local->hw);
 	local->suspended = false;
 	return 0;
 }
diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index d3bcc097e491..56fcd7ef5b6f 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -190,6 +190,11 @@ void mac802154_unlock_table(struct net_device *dev);
 
 int mac802154_wpan_update_llsec(struct net_device *dev);
 
+static inline bool mac802154_queue_is_stopped(struct ieee802154_local *local)
+{
+	return atomic_read(&local->phy->hold_txs);
+}
+
 /* interface handling */
 int ieee802154_iface_init(void);
 void ieee802154_iface_exit(void);
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 731e86bfe73f..a8d4d5e175b6 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -43,7 +43,9 @@ void ieee802154_xmit_sync_worker(struct work_struct *work)
 
 err_tx:
 	/* Restart the netif queue on each sub_if_data object. */
-	ieee802154_wake_queue(&local->hw);
+	if (!mac802154_queue_is_stopped(local))
+		ieee802154_wake_queue(&local->hw);
+
 	kfree_skb(skb);
 	atomic_dec(&local->phy->ongoing_txs);
 	netdev_dbg(dev, "transmission failed\n");
@@ -87,7 +89,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 
 		ret = drv_xmit_async(local, skb);
 		if (ret) {
-			ieee802154_wake_queue(&local->hw);
+			if (!mac802154_queue_is_stopped(local))
+				ieee802154_wake_queue(&local->hw);
 			goto err_tx;
 		}
 
diff --git a/net/mac802154/util.c b/net/mac802154/util.c
index a72d202c212b..cc572c12a8f9 100644
--- a/net/mac802154/util.c
+++ b/net/mac802154/util.c
@@ -87,7 +87,12 @@ ieee802154_wakeup_after_xmit_done(struct ieee802154_hw *hw, bool ifs_handling,
 static void ieee802154_xmit_end(struct ieee802154_hw *hw, bool ifs_handling,
 				unsigned int skb_len)
 {
-	ieee802154_wakeup_after_xmit_done(hw, ifs_handling, skb_len);
+	struct ieee802154_local *local = hw_to_local(hw);
+
+	/* Avoid waking-up a transmit queue which needs to remain stopped */
+	if (!mac802154_queue_is_stopped(local))
+		ieee802154_wakeup_after_xmit_done(hw, ifs_handling, skb_len);
+
 	atomic_dec(&hw->phy->ongoing_txs);
 }
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 11/14] net: mac802154: Create a hot tx path
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (9 preceding siblings ...)
  2022-02-07 14:48 ` [PATCH wpan-next v2 10/14] net: mac802154: Hold the transmit queue when relevant Miquel Raynal
@ 2022-02-07 14:48 ` Miquel Raynal
  2022-02-07 14:48 ` [PATCH wpan-next v2 12/14] net: mac802154: Add a warning in the hot path Miquel Raynal
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:48 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

Let's rename the current tx path to show that this is the "hot" path. We
will soon introduce a slower path for MLME commands.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 net/mac802154/tx.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index a8d4d5e175b6..18ee6fcfcd7f 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -109,6 +109,12 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 	return NETDEV_TX_OK;
 }
 
+static netdev_tx_t
+ieee802154_hot_tx(struct ieee802154_local *local, struct sk_buff *skb)
+{
+	return ieee802154_tx(local, skb);
+}
+
 netdev_tx_t
 ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
@@ -116,7 +122,7 @@ ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	skb->skb_iif = dev->ifindex;
 
-	return ieee802154_tx(sdata->local, skb);
+	return ieee802154_hot_tx(sdata->local, skb);
 }
 
 netdev_tx_t
@@ -138,5 +144,5 @@ ieee802154_subif_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	skb->skb_iif = dev->ifindex;
 
-	return ieee802154_tx(sdata->local, skb);
+	return ieee802154_hot_tx(sdata->local, skb);
 }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 12/14] net: mac802154: Add a warning in the hot path
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (10 preceding siblings ...)
  2022-02-07 14:48 ` [PATCH wpan-next v2 11/14] net: mac802154: Create a hot tx path Miquel Raynal
@ 2022-02-07 14:48 ` Miquel Raynal
  2022-02-07 14:48 ` [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism Miquel Raynal
  2022-02-07 14:48 ` [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands Miquel Raynal
  13 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:48 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

We should never start a transmission after the queue has been stopped.

But because it might work we don't kill the function here but rather
warn loudly the user that something is wrong.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 net/mac802154/tx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 18ee6fcfcd7f..abd9a057521e 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -112,6 +112,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
 static netdev_tx_t
 ieee802154_hot_tx(struct ieee802154_local *local, struct sk_buff *skb)
 {
+	WARN_ON_ONCE(mac802154_queue_is_stopped(local));
+
 	return ieee802154_tx(local, skb);
 }
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (11 preceding siblings ...)
  2022-02-07 14:48 ` [PATCH wpan-next v2 12/14] net: mac802154: Add a warning in the hot path Miquel Raynal
@ 2022-02-07 14:48 ` Miquel Raynal
  2022-02-20 23:49   ` Alexander Aring
  2022-02-07 14:48 ` [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands Miquel Raynal
  13 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:48 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

Right now we are able to stop a queue but we have no indication if a
transmission is ongoing or not.

Thanks to recent additions, we can track the number of ongoing
transmissions so we know if the last transmission is over. Adding on top
of it an internal wait queue also allows to be woken up asynchronously
when this happens. If, beforehands, we marked the queue to be held and
stopped it, we end up flushing and stopping the tx queue.

Thanks to this feature, we will soon be able to introduce a synchronous
transmit API.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 include/net/cfg802154.h      |  1 +
 net/ieee802154/core.c        |  1 +
 net/mac802154/cfg.c          |  5 ++---
 net/mac802154/ieee802154_i.h |  1 +
 net/mac802154/tx.c           | 11 ++++++++++-
 net/mac802154/util.c         |  3 ++-
 6 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
index 043d8e4359e7..0d385a214da3 100644
--- a/include/net/cfg802154.h
+++ b/include/net/cfg802154.h
@@ -217,6 +217,7 @@ struct wpan_phy {
 	/* Transmission monitoring and control */
 	atomic_t ongoing_txs;
 	atomic_t hold_txs;
+	wait_queue_head_t sync_txq;
 
 	char priv[] __aligned(NETDEV_ALIGN);
 };
diff --git a/net/ieee802154/core.c b/net/ieee802154/core.c
index de259b5170ab..0953cacafbff 100644
--- a/net/ieee802154/core.c
+++ b/net/ieee802154/core.c
@@ -129,6 +129,7 @@ wpan_phy_new(const struct cfg802154_ops *ops, size_t priv_size)
 	wpan_phy_net_set(&rdev->wpan_phy, &init_net);
 
 	init_waitqueue_head(&rdev->dev_wait);
+	init_waitqueue_head(&rdev->wpan_phy.sync_txq);
 
 	return &rdev->wpan_phy;
 }
diff --git a/net/mac802154/cfg.c b/net/mac802154/cfg.c
index e8aabf215286..da94aaa32fcb 100644
--- a/net/mac802154/cfg.c
+++ b/net/mac802154/cfg.c
@@ -46,8 +46,7 @@ static int ieee802154_suspend(struct wpan_phy *wpan_phy)
 	if (!local->open_count)
 		goto suspend;
 
-	atomic_inc(&wpan_phy->hold_txs);
-	ieee802154_stop_queue(&local->hw);
+	ieee802154_sync_and_stop_tx(local);
 	synchronize_net();
 
 	/* stop hardware - this must stop RX */
@@ -73,7 +72,7 @@ static int ieee802154_resume(struct wpan_phy *wpan_phy)
 		return ret;
 
 wake_up:
-	if (!atomic_dec_and_test(&wpan_phy->hold_txs))
+	if (!atomic_read(&wpan_phy->hold_txs))
 		ieee802154_wake_queue(&local->hw);
 	local->suspended = false;
 	return 0;
diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index 56fcd7ef5b6f..295c9ce091e1 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -122,6 +122,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
 
 void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
 void ieee802154_xmit_sync_worker(struct work_struct *work);
+void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
 netdev_tx_t
 ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index abd9a057521e..06ae2e6cea43 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -47,7 +47,8 @@ void ieee802154_xmit_sync_worker(struct work_struct *work)
 		ieee802154_wake_queue(&local->hw);
 
 	kfree_skb(skb);
-	atomic_dec(&local->phy->ongoing_txs);
+	if (!atomic_dec_and_test(&local->phy->ongoing_txs))
+		wake_up(&local->phy->sync_txq);
 	netdev_dbg(dev, "transmission failed\n");
 }
 
@@ -117,6 +118,14 @@ ieee802154_hot_tx(struct ieee802154_local *local, struct sk_buff *skb)
 	return ieee802154_tx(local, skb);
 }
 
+void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
+{
+	atomic_inc(&local->phy->hold_txs);
+	ieee802154_stop_queue(&local->hw);
+	wait_event(local->phy->sync_txq, !atomic_read(&local->phy->ongoing_txs));
+	atomic_dec(&local->phy->hold_txs);
+}
+
 netdev_tx_t
 ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
diff --git a/net/mac802154/util.c b/net/mac802154/util.c
index cc572c12a8f9..e666ac7a16bd 100644
--- a/net/mac802154/util.c
+++ b/net/mac802154/util.c
@@ -93,7 +93,8 @@ static void ieee802154_xmit_end(struct ieee802154_hw *hw, bool ifs_handling,
 	if (!mac802154_queue_is_stopped(local))
 		ieee802154_wakeup_after_xmit_done(hw, ifs_handling, skb_len);
 
-	atomic_dec(&hw->phy->ongoing_txs);
+	if (!atomic_dec_and_test(&hw->phy->ongoing_txs))
+		wake_up(&hw->phy->sync_txq);
 }
 
 void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands
  2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
                   ` (12 preceding siblings ...)
  2022-02-07 14:48 ` [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism Miquel Raynal
@ 2022-02-07 14:48 ` Miquel Raynal
  2022-02-20 23:52   ` Alexander Aring
  13 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-02-07 14:48 UTC (permalink / raw)
  To: Alexander Aring, Stefan Schmidt, linux-wpan
  Cc: David S. Miller, Jakub Kicinski, netdev, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni, Miquel Raynal

This is the slow path, we need to wait for each command to be processed
before continuing so let's introduce an helper which does the
transmission and blocks until it gets notified of its asynchronous
completion. This helper is going to be used when introducing scan
support.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 net/mac802154/ieee802154_i.h | 1 +
 net/mac802154/tx.c           | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
index 295c9ce091e1..ad76a60af087 100644
--- a/net/mac802154/ieee802154_i.h
+++ b/net/mac802154/ieee802154_i.h
@@ -123,6 +123,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
 void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
 void ieee802154_xmit_sync_worker(struct work_struct *work);
 void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
+void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb);
 netdev_tx_t
 ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 06ae2e6cea43..7c281458942e 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -126,6 +126,12 @@ void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
 	atomic_dec(&local->phy->hold_txs);
 }
 
+void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb)
+{
+	ieee802154_tx(local, skb);
+	ieee802154_sync_and_stop_tx(local);
+}
+
 netdev_tx_t
 ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper
  2022-02-07 14:47 ` [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper Miquel Raynal
@ 2022-02-20 23:31   ` Alexander Aring
  2022-02-21 20:22     ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-20 23:31 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Alexander Aring, Stefan Schmidt, linux-wpan, David S. Miller,
	Jakub Kicinski, Network Development, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Mon, Feb 7, 2022 at 10:09 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> So far there is only a helper for successful transmission, which led
> device drivers to implement their own handling in case of
> error. Unfortunately, we really need all the drivers to give the hand
> back to the core once they are done in order to be able to build a
> proper synchronous API. So let's create a _xmit_error() helper.
>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  include/net/mac802154.h | 10 ++++++++++
>  net/mac802154/util.c    | 10 ++++++++++
>  2 files changed, 20 insertions(+)
>
> diff --git a/include/net/mac802154.h b/include/net/mac802154.h
> index 2c3bbc6645ba..9fe8cfef1ba0 100644
> --- a/include/net/mac802154.h
> +++ b/include/net/mac802154.h
> @@ -498,4 +498,14 @@ void ieee802154_stop_queue(struct ieee802154_hw *hw);
>  void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
>                               bool ifs_handling);
>
> +/**
> + * ieee802154_xmit_error - frame transmission failed
> + *
> + * @hw: pointer as obtained from ieee802154_alloc_hw().
> + * @skb: buffer for transmission
> + * @ifs_handling: indicate interframe space handling
> + */
> +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> +                          bool ifs_handling);
> +
>  #endif /* NET_MAC802154_H */
> diff --git a/net/mac802154/util.c b/net/mac802154/util.c
> index 6f82418e9dec..9016f634efba 100644
> --- a/net/mac802154/util.c
> +++ b/net/mac802154/util.c
> @@ -102,6 +102,16 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
>  }
>  EXPORT_SYMBOL(ieee802154_xmit_complete);
>
> +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> +                          bool ifs_handling)
> +{
> +       unsigned int skb_len = skb->len;
> +
> +       dev_kfree_skb_any(skb);
> +       ieee802154_xmit_end(hw, ifs_handling, skb_len);
> +}

Remove ieee802154_xmit_end() function and just call to wake up the
queue here, also drop the "ifs_handling" parameter here.

- Alex


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 04/14] net: ieee802154: atusb: Call _xmit_error() when a transmission fails
  2022-02-07 14:47 ` [PATCH wpan-next v2 04/14] net: ieee802154: atusb: " Miquel Raynal
@ 2022-02-20 23:35   ` Alexander Aring
  2022-02-24  2:00     ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-20 23:35 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> ieee802154_xmit_error() is the right helper to call when a transmission
> has failed. Let's use it instead of open-coding it.
>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  drivers/net/ieee802154/atusb.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
> index f27a5f535808..0e6f180b4e79 100644
> --- a/drivers/net/ieee802154/atusb.c
> +++ b/drivers/net/ieee802154/atusb.c
> @@ -271,9 +271,7 @@ static void atusb_tx_done(struct atusb *atusb, u8 seq)
>                  * unlikely case now that seq == expect is then true, but can
>                  * happen and fail with a tx_skb = NULL;
>                  */
> -               ieee802154_wake_queue(atusb->hw);
> -               if (atusb->tx_skb)
> -                       dev_kfree_skb_irq(atusb->tx_skb);
> +               ieee802154_xmit_error(atusb->hw, atusb->tx_skb, false);
>
Are you sure you can easily convert this? You should introduce a
"ieee802154_xmit_error_irq()"?

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-02-07 14:48 ` [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism Miquel Raynal
@ 2022-02-20 23:49   ` Alexander Aring
  2022-03-03 18:17     ` Miquel Raynal
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-20 23:49 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Right now we are able to stop a queue but we have no indication if a
> transmission is ongoing or not.
>
> Thanks to recent additions, we can track the number of ongoing
> transmissions so we know if the last transmission is over. Adding on top
> of it an internal wait queue also allows to be woken up asynchronously
> when this happens. If, beforehands, we marked the queue to be held and
> stopped it, we end up flushing and stopping the tx queue.
>
> Thanks to this feature, we will soon be able to introduce a synchronous
> transmit API.
>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  include/net/cfg802154.h      |  1 +
>  net/ieee802154/core.c        |  1 +
>  net/mac802154/cfg.c          |  5 ++---
>  net/mac802154/ieee802154_i.h |  1 +
>  net/mac802154/tx.c           | 11 ++++++++++-
>  net/mac802154/util.c         |  3 ++-
>  6 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
> index 043d8e4359e7..0d385a214da3 100644
> --- a/include/net/cfg802154.h
> +++ b/include/net/cfg802154.h
> @@ -217,6 +217,7 @@ struct wpan_phy {
>         /* Transmission monitoring and control */
>         atomic_t ongoing_txs;
>         atomic_t hold_txs;
> +       wait_queue_head_t sync_txq;
>
>         char priv[] __aligned(NETDEV_ALIGN);
>  };
> diff --git a/net/ieee802154/core.c b/net/ieee802154/core.c
> index de259b5170ab..0953cacafbff 100644
> --- a/net/ieee802154/core.c
> +++ b/net/ieee802154/core.c
> @@ -129,6 +129,7 @@ wpan_phy_new(const struct cfg802154_ops *ops, size_t priv_size)
>         wpan_phy_net_set(&rdev->wpan_phy, &init_net);
>
>         init_waitqueue_head(&rdev->dev_wait);
> +       init_waitqueue_head(&rdev->wpan_phy.sync_txq);
>
>         return &rdev->wpan_phy;
>  }
> diff --git a/net/mac802154/cfg.c b/net/mac802154/cfg.c
> index e8aabf215286..da94aaa32fcb 100644
> --- a/net/mac802154/cfg.c
> +++ b/net/mac802154/cfg.c
> @@ -46,8 +46,7 @@ static int ieee802154_suspend(struct wpan_phy *wpan_phy)
>         if (!local->open_count)
>                 goto suspend;
>
> -       atomic_inc(&wpan_phy->hold_txs);
> -       ieee802154_stop_queue(&local->hw);
> +       ieee802154_sync_and_stop_tx(local);
>         synchronize_net();
>
>         /* stop hardware - this must stop RX */
> @@ -73,7 +72,7 @@ static int ieee802154_resume(struct wpan_phy *wpan_phy)
>                 return ret;
>
>  wake_up:
> -       if (!atomic_dec_and_test(&wpan_phy->hold_txs))
> +       if (!atomic_read(&wpan_phy->hold_txs))
>                 ieee802154_wake_queue(&local->hw);
>         local->suspended = false;
>         return 0;
> diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> index 56fcd7ef5b6f..295c9ce091e1 100644
> --- a/net/mac802154/ieee802154_i.h
> +++ b/net/mac802154/ieee802154_i.h
> @@ -122,6 +122,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
>
>  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
>  void ieee802154_xmit_sync_worker(struct work_struct *work);
> +void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
>  netdev_tx_t
>  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
>  netdev_tx_t
> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> index abd9a057521e..06ae2e6cea43 100644
> --- a/net/mac802154/tx.c
> +++ b/net/mac802154/tx.c
> @@ -47,7 +47,8 @@ void ieee802154_xmit_sync_worker(struct work_struct *work)
>                 ieee802154_wake_queue(&local->hw);
>
>         kfree_skb(skb);
> -       atomic_dec(&local->phy->ongoing_txs);
> +       if (!atomic_dec_and_test(&local->phy->ongoing_txs))
> +               wake_up(&local->phy->sync_txq);
>         netdev_dbg(dev, "transmission failed\n");
>  }
>
> @@ -117,6 +118,14 @@ ieee802154_hot_tx(struct ieee802154_local *local, struct sk_buff *skb)
>         return ieee802154_tx(local, skb);
>  }
>
> +void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
> +{
> +       atomic_inc(&local->phy->hold_txs);
> +       ieee802154_stop_queue(&local->hw);
> +       wait_event(local->phy->sync_txq, !atomic_read(&local->phy->ongoing_txs));
> +       atomic_dec(&local->phy->hold_txs);

In my opinion this _still_ races as I mentioned earlier. You need to
be sure that if you do ieee802154_stop_queue() that no ieee802154_tx()
or hot_tx() is running at this time. Look into the function I
mentioned earlier "?netif_tx_disable()?".

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands
  2022-02-07 14:48 ` [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands Miquel Raynal
@ 2022-02-20 23:52   ` Alexander Aring
  2022-02-21 20:33     ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-20 23:52 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> This is the slow path, we need to wait for each command to be processed
> before continuing so let's introduce an helper which does the
> transmission and blocks until it gets notified of its asynchronous
> completion. This helper is going to be used when introducing scan
> support.
>
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> ---
>  net/mac802154/ieee802154_i.h | 1 +
>  net/mac802154/tx.c           | 6 ++++++
>  2 files changed, 7 insertions(+)
>
> diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> index 295c9ce091e1..ad76a60af087 100644
> --- a/net/mac802154/ieee802154_i.h
> +++ b/net/mac802154/ieee802154_i.h
> @@ -123,6 +123,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
>  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
>  void ieee802154_xmit_sync_worker(struct work_struct *work);
>  void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
> +void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb);
>  netdev_tx_t
>  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
>  netdev_tx_t
> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> index 06ae2e6cea43..7c281458942e 100644
> --- a/net/mac802154/tx.c
> +++ b/net/mac802154/tx.c
> @@ -126,6 +126,12 @@ void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
>         atomic_dec(&local->phy->hold_txs);
>  }
>
> +void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb)
> +{
> +       ieee802154_tx(local, skb);
> +       ieee802154_sync_and_stop_tx(local);

Some of those functions can fail, in async case we can do some stats
but here we can deliver the caller an error. Please do so.

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper
  2022-02-20 23:31   ` Alexander Aring
@ 2022-02-21 20:22     ` Alexander Aring
  2022-02-22  8:43       ` Miquel Raynal
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-21 20:22 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Miquel Raynal, Stefan Schmidt, linux-wpan - ML, David S. Miller,
	Jakub Kicinski, Network Development, David Girault,
	Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Sun, Feb 20, 2022 at 6:31 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> Hi,
>
> On Mon, Feb 7, 2022 at 10:09 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > So far there is only a helper for successful transmission, which led
> > device drivers to implement their own handling in case of
> > error. Unfortunately, we really need all the drivers to give the hand
> > back to the core once they are done in order to be able to build a
> > proper synchronous API. So let's create a _xmit_error() helper.
> >
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  include/net/mac802154.h | 10 ++++++++++
> >  net/mac802154/util.c    | 10 ++++++++++
> >  2 files changed, 20 insertions(+)
> >
> > diff --git a/include/net/mac802154.h b/include/net/mac802154.h
> > index 2c3bbc6645ba..9fe8cfef1ba0 100644
> > --- a/include/net/mac802154.h
> > +++ b/include/net/mac802154.h
> > @@ -498,4 +498,14 @@ void ieee802154_stop_queue(struct ieee802154_hw *hw);
> >  void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
> >                               bool ifs_handling);
> >
> > +/**
> > + * ieee802154_xmit_error - frame transmission failed
> > + *
> > + * @hw: pointer as obtained from ieee802154_alloc_hw().
> > + * @skb: buffer for transmission
> > + * @ifs_handling: indicate interframe space handling
> > + */
> > +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> > +                          bool ifs_handling);
> > +
> >  #endif /* NET_MAC802154_H */
> > diff --git a/net/mac802154/util.c b/net/mac802154/util.c
> > index 6f82418e9dec..9016f634efba 100644
> > --- a/net/mac802154/util.c
> > +++ b/net/mac802154/util.c
> > @@ -102,6 +102,16 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
> >  }
> >  EXPORT_SYMBOL(ieee802154_xmit_complete);
> >
> > +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> > +                          bool ifs_handling)
> > +{
> > +       unsigned int skb_len = skb->len;
> > +
> > +       dev_kfree_skb_any(skb);
> > +       ieee802154_xmit_end(hw, ifs_handling, skb_len);
> > +}
>
> Remove ieee802154_xmit_end() function and just call to wake up the
> queue here, also drop the "ifs_handling" parameter here.

I am sorry, I think I should deliver an explanation here... I think
the handling of success and error paths are just too different. In
error there will also never ifs handling in the error path. Also
please note there are not just errors as bus/transceiver errors, in
future transceiver should also deliver [0] to the caller, in sync
transmit it should return those errors to the caller... in async mode
there exists different ways to deliver errors like (no ack) to user
space by using socket error queue, here again is worth to look into
wireless subsystem which have a similar feature.

The errors in [0] are currently ignored but I think should be switched
some time soon or with an additional patch by you to calling
xmit_error with an int for $REASON. Those errors are happening on the
transceiver because some functionality is offloaded. btw: so far I
know some MLME-ops need to know if an ack is received or not.

You can split the functionality somehow it makes sense, but with the
above change I only see the wake up queue is the only thing that both
(success/error) should have in common.

- Alex

[0] https://elixir.bootlin.com/linux/v5.16-rc7/source/drivers/net/ieee802154/at86rf230.c#L670

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands
  2022-02-20 23:52   ` Alexander Aring
@ 2022-02-21 20:33     ` Alexander Aring
  2022-02-21 20:33       ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-21 20:33 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Sun, Feb 20, 2022 at 6:52 PM Alexander Aring <alex.aring@gmail.com> wrote:
>
> Hi,
>
> On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > This is the slow path, we need to wait for each command to be processed
> > before continuing so let's introduce an helper which does the
> > transmission and blocks until it gets notified of its asynchronous
> > completion. This helper is going to be used when introducing scan
> > support.
> >
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  net/mac802154/ieee802154_i.h | 1 +
> >  net/mac802154/tx.c           | 6 ++++++
> >  2 files changed, 7 insertions(+)
> >
> > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > index 295c9ce091e1..ad76a60af087 100644
> > --- a/net/mac802154/ieee802154_i.h
> > +++ b/net/mac802154/ieee802154_i.h
> > @@ -123,6 +123,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
> >  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
> >  void ieee802154_xmit_sync_worker(struct work_struct *work);
> >  void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
> > +void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb);
> >  netdev_tx_t
> >  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> >  netdev_tx_t
> > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > index 06ae2e6cea43..7c281458942e 100644
> > --- a/net/mac802154/tx.c
> > +++ b/net/mac802154/tx.c
> > @@ -126,6 +126,12 @@ void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
> >         atomic_dec(&local->phy->hold_txs);
> >  }
> >
> > +void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb)
> > +{
> > +       ieee802154_tx(local, skb);
> > +       ieee802154_sync_and_stop_tx(local);
>
> Some of those functions can fail, in async case we can do some stats
> but here we can deliver the caller an error. Please do so.
>

to more specify what I mean here is also to get an error if the async
transmit path ends with the error case and not success. You need to
put a tx result int a global pointer to return it here and fetch it
before wake_up() "ieee802154_sync_and_stop_tx()".

Also I think we ignore here the case that the interface goes down
waiting for tx completion which means we are calling stop() callback
on the driver layer. I am worried that if stop() happens during
ieee802154_mlme_tx() we are running in a deadlock because a "tx
complete" will never occur?

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands
  2022-02-21 20:33     ` Alexander Aring
@ 2022-02-21 20:33       ` Alexander Aring
  0 siblings, 0 replies; 32+ messages in thread
From: Alexander Aring @ 2022-02-21 20:33 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Mon, Feb 21, 2022 at 3:33 PM Alexander Aring <alex.aring@gmail.com> wrote:
>
> Hi,
>
> On Sun, Feb 20, 2022 at 6:52 PM Alexander Aring <alex.aring@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >
> > > This is the slow path, we need to wait for each command to be processed
> > > before continuing so let's introduce an helper which does the
> > > transmission and blocks until it gets notified of its asynchronous
> > > completion. This helper is going to be used when introducing scan
> > > support.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  net/mac802154/ieee802154_i.h | 1 +
> > >  net/mac802154/tx.c           | 6 ++++++
> > >  2 files changed, 7 insertions(+)
> > >
> > > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > > index 295c9ce091e1..ad76a60af087 100644
> > > --- a/net/mac802154/ieee802154_i.h
> > > +++ b/net/mac802154/ieee802154_i.h
> > > @@ -123,6 +123,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
> > >  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
> > >  void ieee802154_xmit_sync_worker(struct work_struct *work);
> > >  void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
> > > +void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb);
> > >  netdev_tx_t
> > >  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> > >  netdev_tx_t
> > > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > > index 06ae2e6cea43..7c281458942e 100644
> > > --- a/net/mac802154/tx.c
> > > +++ b/net/mac802154/tx.c
> > > @@ -126,6 +126,12 @@ void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
> > >         atomic_dec(&local->phy->hold_txs);
> > >  }
> > >
> > > +void ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb)
> > > +{
> > > +       ieee802154_tx(local, skb);
> > > +       ieee802154_sync_and_stop_tx(local);
> >
> > Some of those functions can fail, in async case we can do some stats
> > but here we can deliver the caller an error. Please do so.
> >
>
> to more specify what I mean here is also to get an error if the async
> transmit path ends with the error case and not success. You need to

s/not success/success/

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper
  2022-02-21 20:22     ` Alexander Aring
@ 2022-02-22  8:43       ` Miquel Raynal
  2022-02-24  1:53         ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-02-22  8:43 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Alexander Aring, Stefan Schmidt, linux-wpan - ML,
	David S. Miller, Jakub Kicinski, Network Development,
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi Alexander,

alex.aring@gmail.com wrote on Mon, 21 Feb 2022 15:22:40 -0500:

> Hi,
> 
> On Sun, Feb 20, 2022 at 6:31 PM Alexander Aring <aahringo@redhat.com> wrote:
> >
> > Hi,
> >
> > On Mon, Feb 7, 2022 at 10:09 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > So far there is only a helper for successful transmission, which led
> > > device drivers to implement their own handling in case of
> > > error. Unfortunately, we really need all the drivers to give the hand
> > > back to the core once they are done in order to be able to build a
> > > proper synchronous API. So let's create a _xmit_error() helper.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  include/net/mac802154.h | 10 ++++++++++
> > >  net/mac802154/util.c    | 10 ++++++++++
> > >  2 files changed, 20 insertions(+)
> > >
> > > diff --git a/include/net/mac802154.h b/include/net/mac802154.h
> > > index 2c3bbc6645ba..9fe8cfef1ba0 100644
> > > --- a/include/net/mac802154.h
> > > +++ b/include/net/mac802154.h
> > > @@ -498,4 +498,14 @@ void ieee802154_stop_queue(struct ieee802154_hw *hw);
> > >  void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
> > >                               bool ifs_handling);
> > >
> > > +/**
> > > + * ieee802154_xmit_error - frame transmission failed
> > > + *
> > > + * @hw: pointer as obtained from ieee802154_alloc_hw().
> > > + * @skb: buffer for transmission
> > > + * @ifs_handling: indicate interframe space handling
> > > + */
> > > +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> > > +                          bool ifs_handling);
> > > +
> > >  #endif /* NET_MAC802154_H */
> > > diff --git a/net/mac802154/util.c b/net/mac802154/util.c
> > > index 6f82418e9dec..9016f634efba 100644
> > > --- a/net/mac802154/util.c
> > > +++ b/net/mac802154/util.c
> > > @@ -102,6 +102,16 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
> > >  }
> > >  EXPORT_SYMBOL(ieee802154_xmit_complete);
> > >
> > > +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> > > +                          bool ifs_handling)
> > > +{
> > > +       unsigned int skb_len = skb->len;
> > > +
> > > +       dev_kfree_skb_any(skb);
> > > +       ieee802154_xmit_end(hw, ifs_handling, skb_len);
> > > +}  
> >
> > Remove ieee802154_xmit_end() function and just call to wake up the
> > queue here, also drop the "ifs_handling" parameter here.  
> 
> I am sorry, I think I should deliver an explanation here... I think
> the handling of success and error paths are just too different. In
> error there will also never ifs handling in the error path. Also
> please note there are not just errors as bus/transceiver errors, in
> future transceiver should also deliver [0] to the caller, in sync
> transmit it should return those errors to the caller... in async mode
> there exists different ways to deliver errors like (no ack) to user
> space by using socket error queue, here again is worth to look into
> wireless subsystem which have a similar feature.
> 
> The errors in [0] are currently ignored but I think should be switched
> some time soon or with an additional patch by you to calling
> xmit_error with an int for $REASON. Those errors are happening on the
> transceiver because some functionality is offloaded. btw: so far I
> know some MLME-ops need to know if an ack is received or not.
> 
> You can split the functionality somehow it makes sense, but with the
> above change I only see the wake up queue is the only thing that both
> (success/error) should have in common.

Very clear, thanks for the additional details. Indeed I would find much
better to be able to propagate the error reasons to upper layers. Even
though at this time we don't propagate them all the way up to userspace,
having them *at least* in the ieee core would be a nice feature. I'll
see what I can do.

> 
> - Alex
> 
> [0] https://elixir.bootlin.com/linux/v5.16-rc7/source/drivers/net/ieee802154/at86rf230.c#L670


Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper
  2022-02-22  8:43       ` Miquel Raynal
@ 2022-02-24  1:53         ` Alexander Aring
  0 siblings, 0 replies; 32+ messages in thread
From: Alexander Aring @ 2022-02-24  1:53 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Alexander Aring, Stefan Schmidt, linux-wpan - ML,
	David S. Miller, Jakub Kicinski, Network Development,
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Tue, Feb 22, 2022 at 3:43 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> alex.aring@gmail.com wrote on Mon, 21 Feb 2022 15:22:40 -0500:
>
> > Hi,
> >
> > On Sun, Feb 20, 2022 at 6:31 PM Alexander Aring <aahringo@redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Feb 7, 2022 at 10:09 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > > >
> > > > So far there is only a helper for successful transmission, which led
> > > > device drivers to implement their own handling in case of
> > > > error. Unfortunately, we really need all the drivers to give the hand
> > > > back to the core once they are done in order to be able to build a
> > > > proper synchronous API. So let's create a _xmit_error() helper.
> > > >
> > > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > > ---
> > > >  include/net/mac802154.h | 10 ++++++++++
> > > >  net/mac802154/util.c    | 10 ++++++++++
> > > >  2 files changed, 20 insertions(+)
> > > >
> > > > diff --git a/include/net/mac802154.h b/include/net/mac802154.h
> > > > index 2c3bbc6645ba..9fe8cfef1ba0 100644
> > > > --- a/include/net/mac802154.h
> > > > +++ b/include/net/mac802154.h
> > > > @@ -498,4 +498,14 @@ void ieee802154_stop_queue(struct ieee802154_hw *hw);
> > > >  void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
> > > >                               bool ifs_handling);
> > > >
> > > > +/**
> > > > + * ieee802154_xmit_error - frame transmission failed
> > > > + *
> > > > + * @hw: pointer as obtained from ieee802154_alloc_hw().
> > > > + * @skb: buffer for transmission
> > > > + * @ifs_handling: indicate interframe space handling
> > > > + */
> > > > +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> > > > +                          bool ifs_handling);
> > > > +
> > > >  #endif /* NET_MAC802154_H */
> > > > diff --git a/net/mac802154/util.c b/net/mac802154/util.c
> > > > index 6f82418e9dec..9016f634efba 100644
> > > > --- a/net/mac802154/util.c
> > > > +++ b/net/mac802154/util.c
> > > > @@ -102,6 +102,16 @@ void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
> > > >  }
> > > >  EXPORT_SYMBOL(ieee802154_xmit_complete);
> > > >
> > > > +void ieee802154_xmit_error(struct ieee802154_hw *hw, struct sk_buff *skb,
> > > > +                          bool ifs_handling)
> > > > +{
> > > > +       unsigned int skb_len = skb->len;
> > > > +
> > > > +       dev_kfree_skb_any(skb);
> > > > +       ieee802154_xmit_end(hw, ifs_handling, skb_len);
> > > > +}
> > >
> > > Remove ieee802154_xmit_end() function and just call to wake up the
> > > queue here, also drop the "ifs_handling" parameter here.
> >
> > I am sorry, I think I should deliver an explanation here... I think
> > the handling of success and error paths are just too different. In
> > error there will also never ifs handling in the error path. Also
> > please note there are not just errors as bus/transceiver errors, in
> > future transceiver should also deliver [0] to the caller, in sync
> > transmit it should return those errors to the caller... in async mode
> > there exists different ways to deliver errors like (no ack) to user
> > space by using socket error queue, here again is worth to look into
> > wireless subsystem which have a similar feature.
> >
> > The errors in [0] are currently ignored but I think should be switched
> > some time soon or with an additional patch by you to calling
> > xmit_error with an int for $REASON. Those errors are happening on the
> > transceiver because some functionality is offloaded. btw: so far I
> > know some MLME-ops need to know if an ack is received or not.
> >
> > You can split the functionality somehow it makes sense, but with the
> > above change I only see the wake up queue is the only thing that both
> > (success/error) should have in common.
>
> Very clear, thanks for the additional details. Indeed I would find much
> better to be able to propagate the error reasons to upper layers. Even
> though at this time we don't propagate them all the way up to userspace,
> having them *at least* in the ieee core would be a nice feature. I'll
> see what I can do.
>

For now I think we can live with "success" or "error". It would be
nice to have a reason and it is definitely necessary in future use.
The case NO_ACK is a very specific 802.15.4 case which I think is
important/sometimes even required to know and I would put it as an
error, that means it was not received on the other end. Of course this
error makes only sense if ack request bit was set and the transceiver
should only inform about it if it was set. The other errors which
might occur e.g. channel access failure... it might be that
transceivers handle such cases "differently?" Even I don't even know
if all transceivers have a functionality to request if there was an
ack or not (in case of offloaded ARET handling). In my opinion they
should at least have the possibility to report back if there was an
ack or not, if they don't have such functionality we might always
"think" there was an ack but the transceiver would act sometimes a
little bit weird. At this point I would blame the hardware because it
cannot report it.

However, as I see in the at86rf230 driver there can be different
success cases and different error cases... you don't need to support
that. We currently handle only some bus/misc transceiver issues which
should end in an error in the sync tx handling. Later we can look if
we might add more error cases into it. It would be nice to see that if
we can simply check in general if there was a success or error, if the
caller wants to be more specific there can be an if/switch-case
statement like the well-known handling with errnos.

Don't tackle it now for userspace, I think it will require a lot of
work... even it depends on which socket family you want to support it.
I think it was just worth mentioning...

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 04/14] net: ieee802154: atusb: Call _xmit_error() when a transmission fails
  2022-02-20 23:35   ` Alexander Aring
@ 2022-02-24  2:00     ` Alexander Aring
  2022-02-24 14:43       ` Miquel Raynal
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-02-24  2:00 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Sun, Feb 20, 2022 at 6:35 PM Alexander Aring <alex.aring@gmail.com> wrote:
>
> Hi,
>
> On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > ieee802154_xmit_error() is the right helper to call when a transmission
> > has failed. Let's use it instead of open-coding it.
> >
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  drivers/net/ieee802154/atusb.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
> > index f27a5f535808..0e6f180b4e79 100644
> > --- a/drivers/net/ieee802154/atusb.c
> > +++ b/drivers/net/ieee802154/atusb.c
> > @@ -271,9 +271,7 @@ static void atusb_tx_done(struct atusb *atusb, u8 seq)
> >                  * unlikely case now that seq == expect is then true, but can
> >                  * happen and fail with a tx_skb = NULL;
> >                  */
> > -               ieee802154_wake_queue(atusb->hw);
> > -               if (atusb->tx_skb)
> > -                       dev_kfree_skb_irq(atusb->tx_skb);
> > +               ieee802154_xmit_error(atusb->hw, atusb->tx_skb, false);
> >
> Are you sure you can easily convert this? You should introduce a
> "ieee802154_xmit_error_irq()"?

Should be fine as you are using dev_kfree_skb_any().

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 04/14] net: ieee802154: atusb: Call _xmit_error() when a transmission fails
  2022-02-24  2:00     ` Alexander Aring
@ 2022-02-24 14:43       ` Miquel Raynal
  0 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-02-24 14:43 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi Alexander,

alex.aring@gmail.com wrote on Wed, 23 Feb 2022 21:00:23 -0500:

> Hi,
> 
> On Sun, Feb 20, 2022 at 6:35 PM Alexander Aring <alex.aring@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > ieee802154_xmit_error() is the right helper to call when a transmission
> > > has failed. Let's use it instead of open-coding it.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  drivers/net/ieee802154/atusb.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
> > > index f27a5f535808..0e6f180b4e79 100644
> > > --- a/drivers/net/ieee802154/atusb.c
> > > +++ b/drivers/net/ieee802154/atusb.c
> > > @@ -271,9 +271,7 @@ static void atusb_tx_done(struct atusb *atusb, u8 seq)
> > >                  * unlikely case now that seq == expect is then true, but can
> > >                  * happen and fail with a tx_skb = NULL;
> > >                  */
> > > -               ieee802154_wake_queue(atusb->hw);
> > > -               if (atusb->tx_skb)
> > > -                       dev_kfree_skb_irq(atusb->tx_skb);
> > > +               ieee802154_xmit_error(atusb->hw, atusb->tx_skb, false);
> > >  
> > Are you sure you can easily convert this? You should introduce a
> > "ieee802154_xmit_error_irq()"?  
> 
> Should be fine as you are using dev_kfree_skb_any().

Haha, you answer my questions before I even take the time to write them
down :-)

Yes I agree, because of dev_kfree_skb_any(), my understanding is that
we do not need an _irq specific error handler.

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-02-20 23:49   ` Alexander Aring
@ 2022-03-03 18:17     ` Miquel Raynal
  2022-03-04 10:54       ` Miquel Raynal
  0 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-03-03 18:17 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi Alexander,

alex.aring@gmail.com wrote on Sun, 20 Feb 2022 18:49:06 -0500:

> Hi,
> 
> On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Right now we are able to stop a queue but we have no indication if a
> > transmission is ongoing or not.
> >
> > Thanks to recent additions, we can track the number of ongoing
> > transmissions so we know if the last transmission is over. Adding on top
> > of it an internal wait queue also allows to be woken up asynchronously
> > when this happens. If, beforehands, we marked the queue to be held and
> > stopped it, we end up flushing and stopping the tx queue.
> >
> > Thanks to this feature, we will soon be able to introduce a synchronous
> > transmit API.
> >
> > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > ---
> >  include/net/cfg802154.h      |  1 +
> >  net/ieee802154/core.c        |  1 +
> >  net/mac802154/cfg.c          |  5 ++---
> >  net/mac802154/ieee802154_i.h |  1 +
> >  net/mac802154/tx.c           | 11 ++++++++++-
> >  net/mac802154/util.c         |  3 ++-
> >  6 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
> > index 043d8e4359e7..0d385a214da3 100644
> > --- a/include/net/cfg802154.h
> > +++ b/include/net/cfg802154.h
> > @@ -217,6 +217,7 @@ struct wpan_phy {
> >         /* Transmission monitoring and control */
> >         atomic_t ongoing_txs;
> >         atomic_t hold_txs;
> > +       wait_queue_head_t sync_txq;
> >
> >         char priv[] __aligned(NETDEV_ALIGN);
> >  };
> > diff --git a/net/ieee802154/core.c b/net/ieee802154/core.c
> > index de259b5170ab..0953cacafbff 100644
> > --- a/net/ieee802154/core.c
> > +++ b/net/ieee802154/core.c
> > @@ -129,6 +129,7 @@ wpan_phy_new(const struct cfg802154_ops *ops, size_t priv_size)
> >         wpan_phy_net_set(&rdev->wpan_phy, &init_net);
> >
> >         init_waitqueue_head(&rdev->dev_wait);
> > +       init_waitqueue_head(&rdev->wpan_phy.sync_txq);
> >
> >         return &rdev->wpan_phy;
> >  }
> > diff --git a/net/mac802154/cfg.c b/net/mac802154/cfg.c
> > index e8aabf215286..da94aaa32fcb 100644
> > --- a/net/mac802154/cfg.c
> > +++ b/net/mac802154/cfg.c
> > @@ -46,8 +46,7 @@ static int ieee802154_suspend(struct wpan_phy *wpan_phy)
> >         if (!local->open_count)
> >                 goto suspend;
> >
> > -       atomic_inc(&wpan_phy->hold_txs);
> > -       ieee802154_stop_queue(&local->hw);
> > +       ieee802154_sync_and_stop_tx(local);
> >         synchronize_net();
> >
> >         /* stop hardware - this must stop RX */
> > @@ -73,7 +72,7 @@ static int ieee802154_resume(struct wpan_phy *wpan_phy)
> >                 return ret;
> >
> >  wake_up:
> > -       if (!atomic_dec_and_test(&wpan_phy->hold_txs))
> > +       if (!atomic_read(&wpan_phy->hold_txs))
> >                 ieee802154_wake_queue(&local->hw);
> >         local->suspended = false;
> >         return 0;
> > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > index 56fcd7ef5b6f..295c9ce091e1 100644
> > --- a/net/mac802154/ieee802154_i.h
> > +++ b/net/mac802154/ieee802154_i.h
> > @@ -122,6 +122,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
> >
> >  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
> >  void ieee802154_xmit_sync_worker(struct work_struct *work);
> > +void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
> >  netdev_tx_t
> >  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> >  netdev_tx_t
> > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > index abd9a057521e..06ae2e6cea43 100644
> > --- a/net/mac802154/tx.c
> > +++ b/net/mac802154/tx.c
> > @@ -47,7 +47,8 @@ void ieee802154_xmit_sync_worker(struct work_struct *work)
> >                 ieee802154_wake_queue(&local->hw);
> >
> >         kfree_skb(skb);
> > -       atomic_dec(&local->phy->ongoing_txs);
> > +       if (!atomic_dec_and_test(&local->phy->ongoing_txs))
> > +               wake_up(&local->phy->sync_txq);
> >         netdev_dbg(dev, "transmission failed\n");
> >  }
> >
> > @@ -117,6 +118,14 @@ ieee802154_hot_tx(struct ieee802154_local *local, struct sk_buff *skb)
> >         return ieee802154_tx(local, skb);
> >  }
> >
> > +void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
> > +{
> > +       atomic_inc(&local->phy->hold_txs);
> > +       ieee802154_stop_queue(&local->hw);
> > +       wait_event(local->phy->sync_txq, !atomic_read(&local->phy->ongoing_txs));
> > +       atomic_dec(&local->phy->hold_txs);  
> 
> In my opinion this _still_ races as I mentioned earlier. You need to
> be sure that if you do ieee802154_stop_queue() that no ieee802154_tx()
> or hot_tx() is running at this time. Look into the function I
> mentioned earlier "?netif_tx_disable()?".

I think now I get the problem, but I am having troubles understanding
the logic in netif_tx_disable(), or should I say, the idea that I
should adapt to our situation.

I understand that we should make sure the following situation does not
happen:
- ieee802154_subif_start_xmit() is called
- ieee802154_subif_start_xmit() is called again
- ieee802154_tx() get's executed once and stops the queue
- ongoing_txs gets incremented once
- the first transfer finishes and ongoing_txs gets decremented
- the tx queue is supposedly empty by the current series while
  the second transfer requested earlier has not yet been processed and
  will definitely be tried in a short moment.

I don't find a pretty solution for that. Is your suggestion to use the
netdev tx_global_lock? If yes, then, how? Because it does not appear
clear to me how we should tackle this issue.

In the mean time, I believe the first half of the series is now big
enough to be sent aside given the number of additional commits that
have popped up following your last review :)

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-03-03 18:17     ` Miquel Raynal
@ 2022-03-04 10:54       ` Miquel Raynal
  2022-03-13 20:43         ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-03-04 10:54 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi Alexander,

miquel.raynal@bootlin.com wrote on Thu, 3 Mar 2022 19:17:23 +0100:

> Hi Alexander,
> 
> alex.aring@gmail.com wrote on Sun, 20 Feb 2022 18:49:06 -0500:
> 
> > Hi,
> > 
> > On Mon, Feb 7, 2022 at 9:48 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > Right now we are able to stop a queue but we have no indication if a
> > > transmission is ongoing or not.
> > >
> > > Thanks to recent additions, we can track the number of ongoing
> > > transmissions so we know if the last transmission is over. Adding on top
> > > of it an internal wait queue also allows to be woken up asynchronously
> > > when this happens. If, beforehands, we marked the queue to be held and
> > > stopped it, we end up flushing and stopping the tx queue.
> > >
> > > Thanks to this feature, we will soon be able to introduce a synchronous
> > > transmit API.
> > >
> > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> > > ---
> > >  include/net/cfg802154.h      |  1 +
> > >  net/ieee802154/core.c        |  1 +
> > >  net/mac802154/cfg.c          |  5 ++---
> > >  net/mac802154/ieee802154_i.h |  1 +
> > >  net/mac802154/tx.c           | 11 ++++++++++-
> > >  net/mac802154/util.c         |  3 ++-
> > >  6 files changed, 17 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h
> > > index 043d8e4359e7..0d385a214da3 100644
> > > --- a/include/net/cfg802154.h
> > > +++ b/include/net/cfg802154.h
> > > @@ -217,6 +217,7 @@ struct wpan_phy {
> > >         /* Transmission monitoring and control */
> > >         atomic_t ongoing_txs;
> > >         atomic_t hold_txs;
> > > +       wait_queue_head_t sync_txq;
> > >
> > >         char priv[] __aligned(NETDEV_ALIGN);
> > >  };
> > > diff --git a/net/ieee802154/core.c b/net/ieee802154/core.c
> > > index de259b5170ab..0953cacafbff 100644
> > > --- a/net/ieee802154/core.c
> > > +++ b/net/ieee802154/core.c
> > > @@ -129,6 +129,7 @@ wpan_phy_new(const struct cfg802154_ops *ops, size_t priv_size)
> > >         wpan_phy_net_set(&rdev->wpan_phy, &init_net);
> > >
> > >         init_waitqueue_head(&rdev->dev_wait);
> > > +       init_waitqueue_head(&rdev->wpan_phy.sync_txq);
> > >
> > >         return &rdev->wpan_phy;
> > >  }
> > > diff --git a/net/mac802154/cfg.c b/net/mac802154/cfg.c
> > > index e8aabf215286..da94aaa32fcb 100644
> > > --- a/net/mac802154/cfg.c
> > > +++ b/net/mac802154/cfg.c
> > > @@ -46,8 +46,7 @@ static int ieee802154_suspend(struct wpan_phy *wpan_phy)
> > >         if (!local->open_count)
> > >                 goto suspend;
> > >
> > > -       atomic_inc(&wpan_phy->hold_txs);
> > > -       ieee802154_stop_queue(&local->hw);
> > > +       ieee802154_sync_and_stop_tx(local);
> > >         synchronize_net();
> > >
> > >         /* stop hardware - this must stop RX */
> > > @@ -73,7 +72,7 @@ static int ieee802154_resume(struct wpan_phy *wpan_phy)
> > >                 return ret;
> > >
> > >  wake_up:
> > > -       if (!atomic_dec_and_test(&wpan_phy->hold_txs))
> > > +       if (!atomic_read(&wpan_phy->hold_txs))
> > >                 ieee802154_wake_queue(&local->hw);
> > >         local->suspended = false;
> > >         return 0;
> > > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > > index 56fcd7ef5b6f..295c9ce091e1 100644
> > > --- a/net/mac802154/ieee802154_i.h
> > > +++ b/net/mac802154/ieee802154_i.h
> > > @@ -122,6 +122,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
> > >
> > >  void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
> > >  void ieee802154_xmit_sync_worker(struct work_struct *work);
> > > +void ieee802154_sync_and_stop_tx(struct ieee802154_local *local);
> > >  netdev_tx_t
> > >  ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> > >  netdev_tx_t
> > > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > > index abd9a057521e..06ae2e6cea43 100644
> > > --- a/net/mac802154/tx.c
> > > +++ b/net/mac802154/tx.c
> > > @@ -47,7 +47,8 @@ void ieee802154_xmit_sync_worker(struct work_struct *work)
> > >                 ieee802154_wake_queue(&local->hw);
> > >
> > >         kfree_skb(skb);
> > > -       atomic_dec(&local->phy->ongoing_txs);
> > > +       if (!atomic_dec_and_test(&local->phy->ongoing_txs))
> > > +               wake_up(&local->phy->sync_txq);
> > >         netdev_dbg(dev, "transmission failed\n");
> > >  }
> > >
> > > @@ -117,6 +118,14 @@ ieee802154_hot_tx(struct ieee802154_local *local, struct sk_buff *skb)
> > >         return ieee802154_tx(local, skb);
> > >  }
> > >
> > > +void ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
> > > +{
> > > +       atomic_inc(&local->phy->hold_txs);
> > > +       ieee802154_stop_queue(&local->hw);
> > > +       wait_event(local->phy->sync_txq, !atomic_read(&local->phy->ongoing_txs));
> > > +       atomic_dec(&local->phy->hold_txs);    
> > 
> > In my opinion this _still_ races as I mentioned earlier. You need to
> > be sure that if you do ieee802154_stop_queue() that no ieee802154_tx()
> > or hot_tx() is running at this time. Look into the function I
> > mentioned earlier "?netif_tx_disable()?".  
> 
> I think now I get the problem, but I am having troubles understanding
> the logic in netif_tx_disable(), or should I say, the idea that I
> should adapt to our situation.
> 
> I understand that we should make sure the following situation does not
> happen:
> - ieee802154_subif_start_xmit() is called
> - ieee802154_subif_start_xmit() is called again
> - ieee802154_tx() get's executed once and stops the queue
> - ongoing_txs gets incremented once
> - the first transfer finishes and ongoing_txs gets decremented
> - the tx queue is supposedly empty by the current series while
>   the second transfer requested earlier has not yet been processed and
>   will definitely be tried in a short moment.
> 
> I don't find a pretty solution for that. Is your suggestion to use the
> netdev tx_global_lock? If yes, then, how? Because it does not appear
> clear to me how we should tackle this issue.

I had a second look at it and it appears to me that the issue was
already there and is structural. We just did not really cared about it
because we didn't bother with synchronization issues.

Here is a figure to base our discussions on:

                       enable
                         ┌────────────────────────────────────────────────────────────┐
                         │                                                            │
                         ▼                                                            │
          packet     ┌────────┐   ┌────────────┐   ┌────────────┐   ┌───────┐   ┌─────┴─────┐
            ┌┐       │        │   │            │   │            │   │       │   │           │
User  ──────┴┴──────►│ Queue  ├──►│ ieee*_tx() ├──►│stop_queue()├──►│xmit() ├──►│ wait/sync │
                     │        │   │            │   │            │   │       │   │           │
                     └────────┘   └────────────┘   └─────┬──────┘   └───────┘   └───────────┘
                         ▲                               │
                         │                               │
                         │                               │
                         └───────────────────────────────┘
                      disable

I assumed that we don't have the hand on the queuing mechanism (on the
left of the 'queue' box). I looked at the core code under
net/ieee802154/ and even if incrementing a counter there would be
handy, I assumed this was not an acceptable solution.

So then we end up with the possible situation where there are two (or
more) packets that must be processed by the mac tx logic (at the right
of the 'queue' box). The problem is of course the atomicity of the
stop_queue() compared to the number of times the ieee802154_tx()
function call can be made. We can have several packets being processed,
we don't have any way to know that.

Moving the stop_queue earlier would just reduce the racy area, without
fully preventing it, so not a solution per-se.

Perhaps we could monitor the state of the queue, it would help us know
if we need to retain a packet, but I personally find this a bit crappy,
yet probably working. Here is a drafted implementation, I'm only half
convinced this is a good idea and your input is welcome here:

--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -77,14 +77,26 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb)
                put_unaligned_le16(crc, skb_put(skb, 2));
        }
 
+retry:
+       while (netif_queue_stopped())
+               schedule();
+
+       acquire_lock();
+       if (netif_queue_stopped()) {
+               release_lock();
+               goto retry;
+       }
+
        /* Stop the netif queue on each sub_if_data object. */
        ieee802154_stop_queue(&local->hw);
 
+       atomic_inc(&local->phy->ongoing_txs);
+       release_lock();
+
        /* Drivers should preferably implement the async callback. In some rare
         * cases they only provide a sync callback which we will use as a
         * fallback.
         */
        if (local->ops->xmit_async) {
                unsigned int len = skb->len;
 
@@ -122,8 +134,10 @@ int ieee802154_sync_and_stop_tx(struct ieee802154_local *local)
 {
        int ret;
 
+       acquire_lock();
        atomic_inc(&local->phy->hold_txs);
        ieee802154_stop_queue(&local->hw);
+       release_lock();
        wait_event(local->phy->sync_txq, !atomic_read(&local->phy->ongoing_txs));
        ret = local->tx_result;
        atomic_dec(&local->phy->hold_txs);


If we go in this direction, perhaps it's best to rework the sync API
like you already proposed: just stopping the queue and syncing the
ongoing transfers, so that after that we can use a dedicated tx path
for MLME commands, bypassing the queue-is-stopped check. This way we
avoid risking to deliver data packets between two MLME calls.

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-03-04 10:54       ` Miquel Raynal
@ 2022-03-13 20:43         ` Alexander Aring
  2022-03-18 18:11           ` Miquel Raynal
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-03-13 20:43 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Fri, Mar 4, 2022 at 5:54 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:

> I had a second look at it and it appears to me that the issue was
> already there and is structural. We just did not really cared about it
> because we didn't bother with synchronization issues.
>

I am not sure if I understand correctly. We stop the queue at some
specific moment and we need to make sure that xmit_do() is not called
or can't be called anymore.

I was thinking about:

void ieee802154_disable_queue(struct ieee802154_hw *hw)
{
        struct ieee802154_local *local = hw_to_local(hw);
        struct ieee802154_sub_if_data *sdata;

        rcu_read_lock();
        list_for_each_entry_rcu(sdata, &local->interfaces, list) {
                if (!sdata->dev)
                        continue;

               netif_tx_disable(sdata->dev);
        }
        rcu_read_unlock();
}
EXPORT_SYMBOL(ieee802154_stop_queue);

From my quick view is that "netif_tx_disable()" ensures by holding
locks and other things and doing netif_tx_stop_queue() it we can be
sure there will be no xmit_do() going on while it's called and
afterwards. It can be that there are still transmissions on the
transceiver that are on the way, but then your atomic counter and
wait_event() will come in place.
We need to be sure there will be nothing queued anymore for
transmission what (in my opinion) tx_disable() does. from any context.

We might need to review some netif callbacks... I have in my mind for
example stop(), maybe netif_tx_stop_queue() is enough (because the
context is like netif_tx_disable(), helding similar locks, etc.) but
we might want to be sure that nothing is going on anymore by using
your wait_event() with counter.

Is there any problem which I don't see?

- Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-03-13 20:43         ` Alexander Aring
@ 2022-03-18 18:11           ` Miquel Raynal
  2022-03-27 16:45             ` Alexander Aring
  0 siblings, 1 reply; 32+ messages in thread
From: Miquel Raynal @ 2022-03-18 18:11 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi Alexander,

alex.aring@gmail.com wrote on Sun, 13 Mar 2022 16:43:52 -0400:

> Hi,
> 
> On Fri, Mar 4, 2022 at 5:54 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> 
> > I had a second look at it and it appears to me that the issue was
> > already there and is structural. We just did not really cared about it
> > because we didn't bother with synchronization issues.
> >  
> 
> I am not sure if I understand correctly. We stop the queue at some
> specific moment and we need to make sure that xmit_do() is not called
> or can't be called anymore.
> 
> I was thinking about:
> 
> void ieee802154_disable_queue(struct ieee802154_hw *hw)
> {
>         struct ieee802154_local *local = hw_to_local(hw);
>         struct ieee802154_sub_if_data *sdata;
> 
>         rcu_read_lock();
>         list_for_each_entry_rcu(sdata, &local->interfaces, list) {
>                 if (!sdata->dev)
>                         continue;
> 
>                netif_tx_disable(sdata->dev);
>         }
>         rcu_read_unlock();
> }
> EXPORT_SYMBOL(ieee802154_stop_queue);
> 
> From my quick view is that "netif_tx_disable()" ensures by holding
> locks and other things and doing netif_tx_stop_queue() it we can be
> sure there will be no xmit_do() going on while it's called and
> afterwards. It can be that there are still transmissions on the
> transceiver that are on the way, but then your atomic counter and
> wait_event() will come in place.

I went for a deeper investigation to understand how the net core
was calling our callbacks. And it appeared to go through
dev_hard_start_xmit() and come from __dev_queue_xmit(). This means
the ieee802154 callback could only be called once at a time
because it is protected by the network device transmit lock
(netif_tx_lock()). Which makes the logic safe and not racy as I
initially thought. This was the missing peace in my mental model I
believe.

> We need to be sure there will be nothing queued anymore for
> transmission what (in my opinion) tx_disable() does. from any context.
>
> We might need to review some netif callbacks... I have in my mind for
> example stop(), maybe netif_tx_stop_queue() is enough (because the
> context is like netif_tx_disable(), helding similar locks, etc.) but
> we might want to be sure that nothing is going on anymore by using
> your wait_event() with counter.

I don't see a real reason anymore to use the tx_disable() call. Is
there any reason this could be needed that I don't have in mind? Right
now the only thing that I see is that it could delay a little bit the
moment where we actually stop the queue because we would be waiting for
the lock to be released after the skb has been offloaded to hardware.
Perhaps maybe we would let another frame to be transmitted before we
actually get the lock.

> Is there any problem which I don't see?

One question however, as I understand, if userspace tries to send more
packets, I believe the "if (!stopped)" condition will be false and the
xmit call will simply be skipped, ending with a -ENETDOWN error [1]. Is
it what we want? I initially thought we could actually queue patches and
wait for the queue to be re-enabled again, but it does not look easy.

[1] https://elixir.bootlin.com/linux/latest/source/net/core/dev.c#L4249

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-03-18 18:11           ` Miquel Raynal
@ 2022-03-27 16:45             ` Alexander Aring
  2022-03-29 16:29               ` Miquel Raynal
  0 siblings, 1 reply; 32+ messages in thread
From: Alexander Aring @ 2022-03-27 16:45 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi,

On Fri, Mar 18, 2022 at 2:11 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Alexander,
>
> alex.aring@gmail.com wrote on Sun, 13 Mar 2022 16:43:52 -0400:
>
> > Hi,
> >
> > On Fri, Mar 4, 2022 at 5:54 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > > I had a second look at it and it appears to me that the issue was
> > > already there and is structural. We just did not really cared about it
> > > because we didn't bother with synchronization issues.
> > >
> >
> > I am not sure if I understand correctly. We stop the queue at some
> > specific moment and we need to make sure that xmit_do() is not called
> > or can't be called anymore.
> >
> > I was thinking about:
> >
> > void ieee802154_disable_queue(struct ieee802154_hw *hw)
> > {
> >         struct ieee802154_local *local = hw_to_local(hw);
> >         struct ieee802154_sub_if_data *sdata;
> >
> >         rcu_read_lock();
> >         list_for_each_entry_rcu(sdata, &local->interfaces, list) {
> >                 if (!sdata->dev)
> >                         continue;
> >
> >                netif_tx_disable(sdata->dev);
> >         }
> >         rcu_read_unlock();
> > }
> > EXPORT_SYMBOL(ieee802154_stop_queue);
> >
> > From my quick view is that "netif_tx_disable()" ensures by holding
> > locks and other things and doing netif_tx_stop_queue() it we can be
> > sure there will be no xmit_do() going on while it's called and
> > afterwards. It can be that there are still transmissions on the
> > transceiver that are on the way, but then your atomic counter and
> > wait_event() will come in place.
>
> I went for a deeper investigation to understand how the net core
> was calling our callbacks. And it appeared to go through
> dev_hard_start_xmit() and come from __dev_queue_xmit(). This means
> the ieee802154 callback could only be called once at a time
> because it is protected by the network device transmit lock
> (netif_tx_lock()). Which makes the logic safe and not racy as I
> initially thought. This was the missing peace in my mental model I
> believe.
>

You forget here that you want to stop all transmission and to wait
that all transmissions are done from a sleepable context.

> > We need to be sure there will be nothing queued anymore for
> > transmission what (in my opinion) tx_disable() does. from any context.
> >
> > We might need to review some netif callbacks... I have in my mind for
> > example stop(), maybe netif_tx_stop_queue() is enough (because the
> > context is like netif_tx_disable(), helding similar locks, etc.) but
> > we might want to be sure that nothing is going on anymore by using
> > your wait_event() with counter.
>
> I don't see a real reason anymore to use the tx_disable() call. Is
> there any reason this could be needed that I don't have in mind? Right
> now the only thing that I see is that it could delay a little bit the
> moment where we actually stop the queue because we would be waiting for
> the lock to be released after the skb has been offloaded to hardware.
> Perhaps maybe we would let another frame to be transmitted before we
> actually get the lock.
>
> > Is there any problem which I don't see?
>
> One question however, as I understand, if userspace tries to send more
> packets, I believe the "if (!stopped)" condition will be false and the
> xmit call will simply be skipped, ending with a -ENETDOWN error [1]. Is
> it what we want? I initially thought we could actually queue patches and
> wait for the queue to be re-enabled again, but it does not look easy.
>

The problem is here that netif_tx_stop_queue() will only set some
flags and this will be checked there. [0]
Now you want to do from a sleepable context:

1. stop queue (net core functionality check [0])
2. wait until all ongoing transmissions are done (softmac layer atomic
counter handled in xmit_do())

Example situation for the race:

cpu0:
 - checked _already_ the if queue is stopped [0] it was not the case
BUT not incremented the atomic counter yet to signal that a
transmission is going on  (which is done later in xmit_do)

While cpu0 is in the above mentioned state cpu1 is in the following state:

- mlme message will transmitted
- stop the queue by setting flag [0] (note the check in cpu0 already
happened and a transmission is on the way)
- make a wait_event($ONGOING_TRANSMISSION_COUNTER) which will not wait
- (it's zero because and does not wait because cpu0 didn't incremented
the ongoing transmission counter yet)

---

This will end in that both cpu0 and cpu1 start transmissions... but
this is only "we completed the spi transfer to the transceiver" the
framebuffer is written and transmission is started. That the
transceiver actually transmits the frame is completely handled on the
transceiver side, on the Linux side we only need to care about that we
don't overwrite the framebuffer while a transmission is going on. This
can happen here, e.g. cpu0 writes the framebuffer first, then cpu1
will overwrite the framebuffer because we start another transmission
(writing framebuffer) while the transceiver still reads the
framebuffer for the cpu0 transmission. In short it will break.

If we want to start transmissions from any sleepable context we cannot
use "netif_tx_stop_queue()" because it does not guarantee that
xmit_do() is still going on, "netif_tx_disable()" will do it because
it will held the xmit_lock while setting the flag and we don't run
into the above problem.

Is this more clear? I think it was never clear what I really meant by
this race, I hope the above example helped. Also "netif_tx_disable()"
was my first hit to find netif_tx_disable()what we need, but maybe
there exists better options?
To be sure, I mean we need "netif_tx_disable()" only for any sleepable
context e.g. we trigger mlme transmission and take control of all
transmissions and be sure nothing is going on anymore, then we need to
have still the wait_event(). After this is done we can be sure no
transmission is going on and we can take over until we are done (mlme
sync tx handling) and can call wake_queue() again.

- Alex

[0] https://elixir.bootlin.com/linux/v5.17/source/net/core/dev.c#L4114

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism
  2022-03-27 16:45             ` Alexander Aring
@ 2022-03-29 16:29               ` Miquel Raynal
  0 siblings, 0 replies; 32+ messages in thread
From: Miquel Raynal @ 2022-03-29 16:29 UTC (permalink / raw)
  To: Alexander Aring
  Cc: Stefan Schmidt, linux-wpan - ML, David S. Miller, Jakub Kicinski,
	open list:NETWORKING [GENERAL],
	David Girault, Romuald Despres, Frederic Blain, Nicolas Schodet,
	Thomas Petazzoni

Hi Alexander,

alex.aring@gmail.com wrote on Sun, 27 Mar 2022 12:45:20 -0400:

> Hi,
> 
> On Fri, Mar 18, 2022 at 2:11 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Alexander,
> >
> > alex.aring@gmail.com wrote on Sun, 13 Mar 2022 16:43:52 -0400:
> >  
> > > Hi,
> > >
> > > On Fri, Mar 4, 2022 at 5:54 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >  
> > > > I had a second look at it and it appears to me that the issue was
> > > > already there and is structural. We just did not really cared about it
> > > > because we didn't bother with synchronization issues.
> > > >  
> > >
> > > I am not sure if I understand correctly. We stop the queue at some
> > > specific moment and we need to make sure that xmit_do() is not called
> > > or can't be called anymore.
> > >
> > > I was thinking about:
> > >
> > > void ieee802154_disable_queue(struct ieee802154_hw *hw)
> > > {
> > >         struct ieee802154_local *local = hw_to_local(hw);
> > >         struct ieee802154_sub_if_data *sdata;
> > >
> > >         rcu_read_lock();
> > >         list_for_each_entry_rcu(sdata, &local->interfaces, list) {
> > >                 if (!sdata->dev)
> > >                         continue;
> > >
> > >                netif_tx_disable(sdata->dev);
> > >         }
> > >         rcu_read_unlock();
> > > }
> > > EXPORT_SYMBOL(ieee802154_stop_queue);
> > >
> > > From my quick view is that "netif_tx_disable()" ensures by holding
> > > locks and other things and doing netif_tx_stop_queue() it we can be
> > > sure there will be no xmit_do() going on while it's called and
> > > afterwards. It can be that there are still transmissions on the
> > > transceiver that are on the way, but then your atomic counter and
> > > wait_event() will come in place.  
> >
> > I went for a deeper investigation to understand how the net core
> > was calling our callbacks. And it appeared to go through
> > dev_hard_start_xmit() and come from __dev_queue_xmit(). This means
> > the ieee802154 callback could only be called once at a time
> > because it is protected by the network device transmit lock
> > (netif_tx_lock()). Which makes the logic safe and not racy as I
> > initially thought. This was the missing peace in my mental model I
> > believe.
> >  
> 
> You forget here that you want to stop all transmission and to wait
> that all transmissions are done from a sleepable context.
> 
> > > We need to be sure there will be nothing queued anymore for
> > > transmission what (in my opinion) tx_disable() does. from any context.
> > >
> > > We might need to review some netif callbacks... I have in my mind for
> > > example stop(), maybe netif_tx_stop_queue() is enough (because the
> > > context is like netif_tx_disable(), helding similar locks, etc.) but
> > > we might want to be sure that nothing is going on anymore by using
> > > your wait_event() with counter.  
> >
> > I don't see a real reason anymore to use the tx_disable() call. Is
> > there any reason this could be needed that I don't have in mind? Right
> > now the only thing that I see is that it could delay a little bit the
> > moment where we actually stop the queue because we would be waiting for
> > the lock to be released after the skb has been offloaded to hardware.
> > Perhaps maybe we would let another frame to be transmitted before we
> > actually get the lock.
> >  
> > > Is there any problem which I don't see?  
> >
> > One question however, as I understand, if userspace tries to send more
> > packets, I believe the "if (!stopped)" condition will be false and the
> > xmit call will simply be skipped, ending with a -ENETDOWN error [1]. Is
> > it what we want? I initially thought we could actually queue patches and
> > wait for the queue to be re-enabled again, but it does not look easy.
> >  
> 
> The problem is here that netif_tx_stop_queue() will only set some
> flags and this will be checked there. [0]
> Now you want to do from a sleepable context:
> 
> 1. stop queue (net core functionality check [0])
> 2. wait until all ongoing transmissions are done (softmac layer atomic
> counter handled in xmit_do())
> 
> Example situation for the race:
> 
> cpu0:
>  - checked _already_ the if queue is stopped [0] it was not the case
> BUT not incremented the atomic counter yet to signal that a
> transmission is going on  (which is done later in xmit_do)
> 
> While cpu0 is in the above mentioned state cpu1 is in the following state:
> 
> - mlme message will transmitted
> - stop the queue by setting flag [0] (note the check in cpu0 already
> happened and a transmission is on the way)
> - make a wait_event($ONGOING_TRANSMISSION_COUNTER) which will not wait
> - (it's zero because and does not wait because cpu0 didn't incremented
> the ongoing transmission counter yet)
> 
> ---
> 
> This will end in that both cpu0 and cpu1 start transmissions... but
> this is only "we completed the spi transfer to the transceiver" the
> framebuffer is written and transmission is started. That the
> transceiver actually transmits the frame is completely handled on the
> transceiver side, on the Linux side we only need to care about that we
> don't overwrite the framebuffer while a transmission is going on. This
> can happen here, e.g. cpu0 writes the framebuffer first, then cpu1
> will overwrite the framebuffer because we start another transmission
> (writing framebuffer) while the transceiver still reads the
> framebuffer for the cpu0 transmission. In short it will break.
> 
> If we want to start transmissions from any sleepable context we cannot
> use "netif_tx_stop_queue()" because it does not guarantee that
> xmit_do() is still going on, "netif_tx_disable()" will do it because
> it will held the xmit_lock while setting the flag and we don't run
> into the above problem.
> 
> Is this more clear?

Crystal clear. Thanks for taking the time to explain it. I am now
convinced of the usefulness of calling netif_tx_disable() (and create
our own ieee802154 helper to call it).

> I think it was never clear what I really meant by
> this race, I hope the above example helped. Also "netif_tx_disable()"
> was my first hit to find netif_tx_disable()what we need, but maybe
> there exists better options?
> To be sure, I mean we need "netif_tx_disable()" only for any sleepable
> context e.g. we trigger mlme transmission and take control of all
> transmissions and be sure nothing is going on anymore, then we need to
> have still the wait_event(). After this is done we can be sure no
> transmission is going on and we can take over until we are done (mlme
> sync tx handling) and can call wake_queue() again.
> 
> - Alex
> 
> [0] https://elixir.bootlin.com/linux/v5.17/source/net/core/dev.c#L4114


Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2022-03-29 16:29 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-07 14:47 [PATCH wpan-next v2 00/14] ieee802154: Synchronous Tx API Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 01/14] net: ieee802154: Move the logic restarting the queue upon transmission Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 02/14] net: mac802154: Create a transmit error helper Miquel Raynal
2022-02-20 23:31   ` Alexander Aring
2022-02-21 20:22     ` Alexander Aring
2022-02-22  8:43       ` Miquel Raynal
2022-02-24  1:53         ` Alexander Aring
2022-02-07 14:47 ` [PATCH wpan-next v2 03/14] net: ieee802154: at86rf230: Call _xmit_error() when a transmission fails Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 04/14] net: ieee802154: atusb: " Miquel Raynal
2022-02-20 23:35   ` Alexander Aring
2022-02-24  2:00     ` Alexander Aring
2022-02-24 14:43       ` Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 05/14] net: ieee802154: ca8210: " Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 06/14] net: mac802154: Stop exporting ieee802154_wake/stop_queue() Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 07/14] net: mac802154: Rename the synchronous xmit worker Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 08/14] net: mac802154: Rename the main tx_work struct Miquel Raynal
2022-02-07 14:47 ` [PATCH wpan-next v2 09/14] net: mac802154: Follow the count of ongoing transmissions Miquel Raynal
2022-02-07 14:48 ` [PATCH wpan-next v2 10/14] net: mac802154: Hold the transmit queue when relevant Miquel Raynal
2022-02-07 14:48 ` [PATCH wpan-next v2 11/14] net: mac802154: Create a hot tx path Miquel Raynal
2022-02-07 14:48 ` [PATCH wpan-next v2 12/14] net: mac802154: Add a warning in the hot path Miquel Raynal
2022-02-07 14:48 ` [PATCH wpan-next v2 13/14] net: mac802154: Introduce a tx queue flushing mechanism Miquel Raynal
2022-02-20 23:49   ` Alexander Aring
2022-03-03 18:17     ` Miquel Raynal
2022-03-04 10:54       ` Miquel Raynal
2022-03-13 20:43         ` Alexander Aring
2022-03-18 18:11           ` Miquel Raynal
2022-03-27 16:45             ` Alexander Aring
2022-03-29 16:29               ` Miquel Raynal
2022-02-07 14:48 ` [PATCH wpan-next v2 14/14] net: mac802154: Introduce a synchronous API for MLME commands Miquel Raynal
2022-02-20 23:52   ` Alexander Aring
2022-02-21 20:33     ` Alexander Aring
2022-02-21 20:33       ` Alexander Aring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.