All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 net-next 0/4] MTU changes and other fixes
@ 2016-01-05 11:57 Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 1/4] nfp: return error if MTU change fails Jakub Kicinski
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-05 11:57 UTC (permalink / raw)
  To: netdev; +Cc: simon.horman, rolf.neugebauer, Jakub Kicinski

Hi!

Four small fixes around RX buffer sizing.  First one makes sure
we return an error if .ndo_change_mtu() fails.  Second one corrects
the length used for unmapping DMA buffers when MTU is changed,
third makes sure buffers are big enough to meet FW's expectations.
Last change corrects packet length validation on RX.

v2:
 - add first patch (return error on fail).


Jakub Kicinski (4):
  nfp: return error if MTU change fails
  nfp: free buffers before changing MTU
  nfp: correct RX buffer length calculation
  nfp: fix RX buffer length validation

 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 42 ++++++++++------------
 1 file changed, 18 insertions(+), 24 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCHv2 net-next 1/4] nfp: return error if MTU change fails
  2016-01-05 11:57 [PATCHv2 net-next 0/4] MTU changes and other fixes Jakub Kicinski
@ 2016-01-05 11:57 ` Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 2/4] nfp: free buffers before changing MTU Jakub Kicinski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-05 11:57 UTC (permalink / raw)
  To: netdev; +Cc: simon.horman, rolf.neugebauer, Jakub Kicinski

When reopening device fails after MTU change, let the userspace
know.  MTU remains changed even though error is returned, this
is what all ethernet devices are doing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
Dave,

I know this is not what you asked for but, since we are using FW
commands to disable/enable RX, even if we allocate all required
resources before freeing old ones we still cannot guarantee that
the reenabling operation will not fail.  Should we refuse to do
MTU changes while the interface is running altogether?

---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 43c618bafdb6..006d9600240f 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1911,6 +1911,7 @@ static void nfp_net_set_rx_mode(struct net_device *netdev)
 static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct nfp_net *nn = netdev_priv(netdev);
+	int ret = 0;
 	u32 tmp;
 
 	nn_dbg(nn, "New MTU = %d\n", new_mtu);
@@ -1929,10 +1930,10 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 	/* restart if running */
 	if (netif_running(netdev)) {
 		nfp_net_netdev_close(netdev);
-		nfp_net_netdev_open(netdev);
+		ret = nfp_net_netdev_open(netdev);
 	}
 
-	return 0;
+	return ret;
 }
 
 static struct rtnl_link_stats64 *nfp_net_stat64(struct net_device *netdev,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCHv2 net-next 2/4] nfp: free buffers before changing MTU
  2016-01-05 11:57 [PATCHv2 net-next 0/4] MTU changes and other fixes Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 1/4] nfp: return error if MTU change fails Jakub Kicinski
@ 2016-01-05 11:57 ` Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 3/4] nfp: correct RX buffer length calculation Jakub Kicinski
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-05 11:57 UTC (permalink / raw)
  To: netdev; +Cc: simon.horman, rolf.neugebauer, Jakub Kicinski

For freeing DMA buffers we depend on nfp_net.fl_bufsz having the same
value as during allocation therefore in .ndo_change_mtu we must first
free the buffers and then change the setting.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 006d9600240f..b381682de3d6 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1921,17 +1921,17 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 		return -EINVAL;
 	}
 
+	if (netif_running(netdev))
+		nfp_net_netdev_close(netdev);
+
 	netdev->mtu = new_mtu;
 
 	/* Freelist buffer size rounded up to the nearest 1K */
 	tmp = new_mtu + ETH_HLEN + VLAN_HLEN + NFP_NET_MAX_PREPEND;
 	nn->fl_bufsz = roundup(tmp, 1024);
 
-	/* restart if running */
-	if (netif_running(netdev)) {
-		nfp_net_netdev_close(netdev);
+	if (netif_running(netdev))
 		ret = nfp_net_netdev_open(netdev);
-	}
 
 	return ret;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCHv2 net-next 3/4] nfp: correct RX buffer length calculation
  2016-01-05 11:57 [PATCHv2 net-next 0/4] MTU changes and other fixes Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 1/4] nfp: return error if MTU change fails Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 2/4] nfp: free buffers before changing MTU Jakub Kicinski
@ 2016-01-05 11:57 ` Jakub Kicinski
  2016-01-05 11:57 ` [PATCHv2 net-next 4/4] nfp: fix RX buffer length validation Jakub Kicinski
  2016-01-07 19:57 ` [PATCHv2 net-next 0/4] MTU changes and other fixes David Miller
  4 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-05 11:57 UTC (permalink / raw)
  To: netdev; +Cc: simon.horman, rolf.neugebauer, Jakub Kicinski

When calculating the RX buffer length we need to account for
up to 2 VLAN tags and up to 8 MPLS labels.  Rounding up to 1k
is an relic of a distant past and can be removed.  While at
it also remove trivial print statement.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index b381682de3d6..553ae64e2f7f 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -61,6 +61,7 @@
 
 #include <linux/ktime.h>
 
+#include <net/mpls.h>
 #include <net/vxlan.h>
 
 #include "nfp_net_ctrl.h"
@@ -1912,9 +1913,6 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct nfp_net *nn = netdev_priv(netdev);
 	int ret = 0;
-	u32 tmp;
-
-	nn_dbg(nn, "New MTU = %d\n", new_mtu);
 
 	if (new_mtu < 68 || new_mtu > nn->max_mtu) {
 		nn_err(nn, "New MTU (%d) is not valid\n", new_mtu);
@@ -1925,10 +1923,8 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 		nfp_net_netdev_close(netdev);
 
 	netdev->mtu = new_mtu;
-
-	/* Freelist buffer size rounded up to the nearest 1K */
-	tmp = new_mtu + ETH_HLEN + VLAN_HLEN + NFP_NET_MAX_PREPEND;
-	nn->fl_bufsz = roundup(tmp, 1024);
+	nn->fl_bufsz = NFP_NET_MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
+		MPLS_HLEN * 8 + new_mtu;
 
 	if (netif_running(netdev))
 		ret = nfp_net_netdev_open(netdev);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCHv2 net-next 4/4] nfp: fix RX buffer length validation
  2016-01-05 11:57 [PATCHv2 net-next 0/4] MTU changes and other fixes Jakub Kicinski
                   ` (2 preceding siblings ...)
  2016-01-05 11:57 ` [PATCHv2 net-next 3/4] nfp: correct RX buffer length calculation Jakub Kicinski
@ 2016-01-05 11:57 ` Jakub Kicinski
  2016-01-07 19:57 ` [PATCHv2 net-next 0/4] MTU changes and other fixes David Miller
  4 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-05 11:57 UTC (permalink / raw)
  To: netdev; +Cc: simon.horman, rolf.neugebauer, Jakub Kicinski

Meaning of data_len and meta_len RX WB descriptor fields depend
slightly on whether rx_offset is dynamic or not.  For dynamic
offsets data_len includes mata_len.  This makes the code harder
to follow, in fact our RX buffer length check is incorrect -
we are comparing allocation length to data_len while we should
also account for meta_len.

Let's adjust the values of data_len and meta_len to their natural
meaning and simplify the logic.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 553ae64e2f7f..070645f9bc21 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1259,22 +1259,19 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 
 		meta_len = rxd->rxd.meta_len_dd & PCIE_DESC_RX_META_LEN_MASK;
 		data_len = le16_to_cpu(rxd->rxd.data_len);
+		/* For dynamic offset data_len includes meta_len, adjust */
+		if (nn->rx_offset == NFP_NET_CFG_RX_OFFSET_DYNAMIC)
+			data_len -= meta_len;
+		else
+			meta_len = nn->rx_offset;
 
-		if (WARN_ON_ONCE(data_len > nn->fl_bufsz)) {
+		if (WARN_ON_ONCE(meta_len + data_len > nn->fl_bufsz)) {
 			dev_kfree_skb_any(skb);
 			continue;
 		}
 
-		if (nn->rx_offset == NFP_NET_CFG_RX_OFFSET_DYNAMIC) {
-			/* The packet data starts after the metadata */
-			skb_reserve(skb, meta_len);
-		} else {
-			/* The packet data starts at a fixed offset */
-			skb_reserve(skb, nn->rx_offset);
-		}
-
-		/* Adjust the SKB for the dynamic meta data pre-pended */
-		skb_put(skb, data_len - meta_len);
+		skb_reserve(skb, meta_len);
+		skb_put(skb, data_len);
 
 		nfp_net_set_hash(nn->netdev, skb, rxd);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCHv2 net-next 0/4] MTU changes and other fixes
  2016-01-05 11:57 [PATCHv2 net-next 0/4] MTU changes and other fixes Jakub Kicinski
                   ` (3 preceding siblings ...)
  2016-01-05 11:57 ` [PATCHv2 net-next 4/4] nfp: fix RX buffer length validation Jakub Kicinski
@ 2016-01-07 19:57 ` David Miller
  2016-01-07 20:49   ` Jakub Kicinski
  4 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2016-01-07 19:57 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: netdev, simon.horman, rolf.neugebauer

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Tue,  5 Jan 2016 11:57:27 +0000

> v2:
>  - add first patch (return error on fail).

You didn't read my feedback, so I'm not going to read your patches.

Is that OK with you?  I personally think it's fair.

I said that you need to reorganize how you do the MTU
changes.

You _MUST_ try to allocate the resources (queues, data
structures, whatever) for the new MTU size.

And if that fails, release those new resources and leave
the device exactly in the state it was in prior to the MTU
change call.

This means you can't use the close/open scheme, I'm explicitly
telling you not to use that mechanism to change the MTU because
it can leave the device inoperative if the re-open fails.

That is completely unexpected behavior for the user.

The interface must remain up and functioning at the original
MTU if the MTU change operation fails.  Your code does not
do this.

Yes, this is a lot of more work, but that's the only correct
way to implement this.

If you fail to implement this properly one more time I will start
simply ignoring your patch submissions for a while because you will be
completely wasting my time.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv2 net-next 0/4] MTU changes and other fixes
  2016-01-07 19:57 ` [PATCHv2 net-next 0/4] MTU changes and other fixes David Miller
@ 2016-01-07 20:49   ` Jakub Kicinski
  2016-01-07 21:33     ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-07 20:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, simon.horman, rolf.neugebauer

On Thu, 07 Jan 2016 14:57:50 -0500 (EST), David Miller wrote:
> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Date: Tue,  5 Jan 2016 11:57:27 +0000
> 
> > v2:
> >  - add first patch (return error on fail).
> 
> You didn't read my feedback, so I'm not going to read your patches.

Assuming that I'm that stupid/disrespectful to you makes me sad.
If you did read the patches you would see the note the first patch
contains...  I assumed it would be easier for you to read it there since
patchwork does not pick up cover letters, my bad.  Here is that note:

I know this is not what you asked for but, since we are using FW
commands to disable/enable RX, even if we allocate all required
resources before freeing old ones we still cannot guarantee that
the reenabling operation will not fail.  Should we refuse to do
MTU changes while the interface is running altogether?

> I said that you need to reorganize how you do the MTU
> changes.
> 
> You _MUST_ try to allocate the resources (queues, data
> structures, whatever) for the new MTU size.
> 
> And if that fails, release those new resources and leave
> the device exactly in the state it was in prior to the MTU
> change call.
> 
> This means you can't use the close/open scheme, I'm explicitly
> telling you not to use that mechanism to change the MTU because
> it can leave the device inoperative if the re-open fails.
> 
> That is completely unexpected behavior for the user.
> 
> The interface must remain up and functioning at the original
> MTU if the MTU change operation fails.  Your code does not
> do this.
> 
> Yes, this is a lot of more work, but that's the only correct
> way to implement this.

I understood you the first time 100% and agree with you 100%.
The situation is the same when configuring number or size of
rings FWIW.  All drivers I've seen ignore this problem and at
most return an error which is stupid and makes users expect
that everything can be configured while interface is up.
For rings it's even more stupid since there is currently no way to
increase number of allocated MSI-X irqs without freeing them first
so the prepare/commit paradigm is basically impossible.
At the extreme one could argue that none of the reconfig should
be allowed unless driver guarantees not to drop frames.

Please respond if you want me to proceed with the preallocation
even though it won't guarantee that the whole operation succeeds
or refusing to do runtime changes is the right way to go.

Sorry I didn't ask the clarifying question right away.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv2 net-next 0/4] MTU changes and other fixes
  2016-01-07 20:49   ` Jakub Kicinski
@ 2016-01-07 21:33     ` David Miller
  2016-01-07 21:50       ` Jakub Kicinski
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2016-01-07 21:33 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: netdev, simon.horman, rolf.neugebauer

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Thu, 7 Jan 2016 20:49:28 +0000

> I know this is not what you asked for but, since we are using FW
> commands to disable/enable RX, even if we allocate all required
> resources before freeing old ones we still cannot guarantee that
> the reenabling operation will not fail.  Should we refuse to do
> MTU changes while the interface is running altogether?

If you issue the MTU change command and it fails, then you're still
configured at the old MTU.  There should therefore be no problem
rewinding in that case.

> All drivers I've seen ...

Bad practice in other drivers should be ignored, not copied.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv2 net-next 0/4] MTU changes and other fixes
  2016-01-07 21:33     ` David Miller
@ 2016-01-07 21:50       ` Jakub Kicinski
  2016-01-07 21:55         ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2016-01-07 21:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, simon.horman, rolf.neugebauer

On Thu, 07 Jan 2016 16:33:14 -0500 (EST), David Miller wrote:
> From: Jakub Kicinski <jakub.kicinski@netronome.com>
> Date: Thu, 7 Jan 2016 20:49:28 +0000
> 
> > I know this is not what you asked for but, since we are using FW
> > commands to disable/enable RX, even if we allocate all required
> > resources before freeing old ones we still cannot guarantee that
> > the reenabling operation will not fail.  Should we refuse to do
> > MTU changes while the interface is running altogether?
> 
> If you issue the MTU change command and it fails, then you're still
> configured at the old MTU.  There should therefore be no problem
> rewinding in that case.

No, no...  The FW command is to stop and start the RX path in
the NIC.  Our NIC is NPU-based, it has a ton of programmability
so even though we try to make it work like a run-of-the-mill
NIC there are some gotchas.

Unless there is a way to change MTU without stopping RX which
escapes me.

> > All drivers I've seen ...
> 
> Bad practice in other drivers should be ignored, not copied.

I'll take that to heart.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv2 net-next 0/4] MTU changes and other fixes
  2016-01-07 21:50       ` Jakub Kicinski
@ 2016-01-07 21:55         ` David Miller
  0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2016-01-07 21:55 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: netdev, simon.horman, rolf.neugebauer

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Thu, 7 Jan 2016 21:50:11 +0000

> On Thu, 07 Jan 2016 16:33:14 -0500 (EST), David Miller wrote:
>> From: Jakub Kicinski <jakub.kicinski@netronome.com>
>> Date: Thu, 7 Jan 2016 20:49:28 +0000
>> 
>> > I know this is not what you asked for but, since we are using FW
>> > commands to disable/enable RX, even if we allocate all required
>> > resources before freeing old ones we still cannot guarantee that
>> > the reenabling operation will not fail.  Should we refuse to do
>> > MTU changes while the interface is running altogether?
>> 
>> If you issue the MTU change command and it fails, then you're still
>> configured at the old MTU.  There should therefore be no problem
>> rewinding in that case.
> 
> No, no...  The FW command is to stop and start the RX path in
> the NIC.  Our NIC is NPU-based, it has a ton of programmability
> so even though we try to make it work like a run-of-the-mill
> NIC there are some gotchas.
> 
> Unless there is a way to change MTU without stopping RX which
> escapes me.

Then the best you can do is retry the FW configuration using the
original MTU, and if _that_ fails you must return and error as well as
emit a kernel log message because this is a failure that cannot be
recovered from and the user must be able to figure out what happened.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-01-07 21:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-05 11:57 [PATCHv2 net-next 0/4] MTU changes and other fixes Jakub Kicinski
2016-01-05 11:57 ` [PATCHv2 net-next 1/4] nfp: return error if MTU change fails Jakub Kicinski
2016-01-05 11:57 ` [PATCHv2 net-next 2/4] nfp: free buffers before changing MTU Jakub Kicinski
2016-01-05 11:57 ` [PATCHv2 net-next 3/4] nfp: correct RX buffer length calculation Jakub Kicinski
2016-01-05 11:57 ` [PATCHv2 net-next 4/4] nfp: fix RX buffer length validation Jakub Kicinski
2016-01-07 19:57 ` [PATCHv2 net-next 0/4] MTU changes and other fixes David Miller
2016-01-07 20:49   ` Jakub Kicinski
2016-01-07 21:33     ` David Miller
2016-01-07 21:50       ` Jakub Kicinski
2016-01-07 21:55         ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.