netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs
@ 2020-04-23 14:57 Jesper Dangaard Brouer
  2020-04-23 14:57 ` [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack Jesper Dangaard Brouer
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2020-04-23 14:57 UTC (permalink / raw)
  Cc: Jesper Dangaard Brouer, netdev, Ilias Apalodimas,
	Toke Høiland-Jørgensen, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo

I've been very puzzled why networking on my NXP development board,
using driver dpaa2-eth, stopped working when I updated the kernel
version >= 5.3.  The observable issue were that interface would drop
all TX packets, because it had assigned the qdisc noop.

This turned out the be a NIC driver bug, that would only get triggered
when using sysctl net/core/default_qdisc=fq_codel. It was non-trivial
to find out[1] this was driver related. Thus, this patchset besides
fixing the driver bug, also helps end-user identify the issue.

[1]: https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org

---

Jesper Dangaard Brouer (2):
      net: sched: report ndo_setup_tc failures via extack
      dpaa2-eth: fix return codes used in ndo_setup_tc


 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c |    4 ++--
 net/sched/cls_api.c                              |    5 ++++-
 2 files changed, 6 insertions(+), 3 deletions(-)

--


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack
  2020-04-23 14:57 [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs Jesper Dangaard Brouer
@ 2020-04-23 14:57 ` Jesper Dangaard Brouer
  2020-04-23 19:51   ` Ioana Ciornei
  2020-04-23 14:57 ` [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc Jesper Dangaard Brouer
  2020-04-24 23:45 ` [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs David Miller
  2 siblings, 1 reply; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2020-04-23 14:57 UTC (permalink / raw)
  Cc: Jesper Dangaard Brouer, netdev, Ilias Apalodimas,
	Toke Høiland-Jørgensen, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo

Help end-users of the 'tc' command to see if the drivers ndo_setup_tc
function call fails. Troubleshooting when this happens is non-trivial
(see full process here[1]), and results in net_device getting assigned
the 'qdisc noop', which will drop all TX packets on the interface.

[1]: https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/sched/cls_api.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 55bd1429678f..11b683c45c28 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -735,8 +735,11 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
 	INIT_LIST_HEAD(&bo.cb_list);
 
 	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
-	if (err < 0)
+	if (err < 0) {
+		if (err != -EOPNOTSUPP)
+			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
 		return err;
+	}
 
 	return tcf_block_setup(block, &bo);
 }



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 14:57 [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs Jesper Dangaard Brouer
  2020-04-23 14:57 ` [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack Jesper Dangaard Brouer
@ 2020-04-23 14:57 ` Jesper Dangaard Brouer
  2020-04-23 15:28   ` Stephen Hemminger
  2020-04-23 19:49   ` Ioana Ciornei
  2020-04-24 23:45 ` [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs David Miller
  2 siblings, 2 replies; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2020-04-23 14:57 UTC (permalink / raw)
  Cc: Jesper Dangaard Brouer, netdev, Ilias Apalodimas,
	Toke Høiland-Jørgensen, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo

Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot
support the qdisc type. Other return values will result in failing the
qdisc setup.  This lead to qdisc noop getting assigned, which will
drop all TX packets on the interface.

Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index 873b66ed3aee..a72f5a0d9e7c 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -2055,7 +2055,7 @@ static int dpaa2_eth_setup_tc(struct net_device *net_dev,
 	int i;
 
 	if (type != TC_SETUP_QDISC_MQPRIO)
-		return -EINVAL;
+		return -EOPNOTSUPP;
 
 	mqprio->hw = TC_MQPRIO_HW_OFFLOAD_TCS;
 	num_queues = dpaa2_eth_queue_count(priv);
@@ -2067,7 +2067,7 @@ static int dpaa2_eth_setup_tc(struct net_device *net_dev,
 	if (num_tc  > dpaa2_eth_tc_count(priv)) {
 		netdev_err(net_dev, "Max %d traffic classes supported\n",
 			   dpaa2_eth_tc_count(priv));
-		return -EINVAL;
+		return -EOPNOTSUPP;
 	}
 
 	if (!num_tc) {



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 14:57 ` [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc Jesper Dangaard Brouer
@ 2020-04-23 15:28   ` Stephen Hemminger
  2020-04-23 15:38     ` Jesper Dangaard Brouer
  2020-04-23 19:49   ` Ioana Ciornei
  1 sibling, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2020-04-23 15:28 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, Ilias Apalodimas, Toke Høiland-Jørgensen,
	ruxandra.radulescu, ioana.ciornei, nipun.gupta, shawnguo

On Thu, 23 Apr 2020 16:57:50 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot
> support the qdisc type. Other return values will result in failing the
> qdisc setup.  This lead to qdisc noop getting assigned, which will
> drop all TX packets on the interface.
> 
> Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Would it be possible to use extack as well?
Putting errors in dmesg is unhelpful

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 15:28   ` Stephen Hemminger
@ 2020-04-23 15:38     ` Jesper Dangaard Brouer
  2020-04-23 19:33       ` Stephen Hemminger
  0 siblings, 1 reply; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2020-04-23 15:38 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, Ilias Apalodimas, Toke Høiland-Jørgensen,
	ruxandra.radulescu, ioana.ciornei, nipun.gupta, shawnguo, brouer

On Thu, 23 Apr 2020 08:28:58 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Thu, 23 Apr 2020 16:57:50 +0200
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> 
> > Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot
> > support the qdisc type. Other return values will result in failing the
> > qdisc setup.  This lead to qdisc noop getting assigned, which will
> > drop all TX packets on the interface.
> > 
> > Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> 
> Would it be possible to use extack as well?

That is what patch 1/2 already does.

> Putting errors in dmesg is unhelpful

This patchset does not introduce any dmesg printk.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 15:38     ` Jesper Dangaard Brouer
@ 2020-04-23 19:33       ` Stephen Hemminger
  2020-04-23 19:56         ` Jakub Kicinski
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2020-04-23 19:33 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, Ilias Apalodimas, Toke Høiland-Jørgensen,
	ruxandra.radulescu, ioana.ciornei, nipun.gupta, shawnguo

On Thu, 23 Apr 2020 17:38:04 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> On Thu, 23 Apr 2020 08:28:58 -0700
> Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> > On Thu, 23 Apr 2020 16:57:50 +0200
> > Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >   
> > > Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot
> > > support the qdisc type. Other return values will result in failing the
> > > qdisc setup.  This lead to qdisc noop getting assigned, which will
> > > drop all TX packets on the interface.
> > > 
> > > Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
> > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>    
> > 
> > Would it be possible to use extack as well?  
> 
> That is what patch 1/2 already does.
> 
> > Putting errors in dmesg is unhelpful  
> 
> This patchset does not introduce any dmesg printk.
> 

I was thinking that this  
	if (num_tc  > dpaa2_eth_tc_count(priv)) {
 		netdev_err(net_dev, "Max %d traffic classes supported\n",
 			   dpaa2_eth_tc_count(priv));
-		return -EINVAL;
+		return -EOPNOTSUPP;
 	}

could be an extack message, but doing that would require a change
to the ndo_setup_tc hook to allow driver to return its own error message
as to why the setup failed.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 14:57 ` [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc Jesper Dangaard Brouer
  2020-04-23 15:28   ` Stephen Hemminger
@ 2020-04-23 19:49   ` Ioana Ciornei
  1 sibling, 0 replies; 13+ messages in thread
From: Ioana Ciornei @ 2020-04-23 19:49 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Ioana Ciocoi Radulescu, Nipun Gupta, shawnguo

> Subject: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
> 
> Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot support
> the qdisc type. Other return values will result in failing the qdisc setup.  This lead
> to qdisc noop getting assigned, which will drop all TX packets on the interface.
> 
> Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>

>  drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> index 873b66ed3aee..a72f5a0d9e7c 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> @@ -2055,7 +2055,7 @@ static int dpaa2_eth_setup_tc(struct net_device
> *net_dev,
>  	int i;
> 
>  	if (type != TC_SETUP_QDISC_MQPRIO)
> -		return -EINVAL;
> +		return -EOPNOTSUPP;
> 
>  	mqprio->hw = TC_MQPRIO_HW_OFFLOAD_TCS;
>  	num_queues = dpaa2_eth_queue_count(priv); @@ -2067,7 +2067,7
> @@ static int dpaa2_eth_setup_tc(struct net_device *net_dev,
>  	if (num_tc  > dpaa2_eth_tc_count(priv)) {
>  		netdev_err(net_dev, "Max %d traffic classes supported\n",
>  			   dpaa2_eth_tc_count(priv));
> -		return -EINVAL;
> +		return -EOPNOTSUPP;
>  	}
> 
>  	if (!num_tc) {
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack
  2020-04-23 14:57 ` [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack Jesper Dangaard Brouer
@ 2020-04-23 19:51   ` Ioana Ciornei
  0 siblings, 0 replies; 13+ messages in thread
From: Ioana Ciornei @ 2020-04-23 19:51 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Ioana Ciocoi Radulescu, Nipun Gupta, shawnguo

> Subject: [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via
> extack
> 
> Help end-users of the 'tc' command to see if the drivers ndo_setup_tc function
> call fails. Troubleshooting when this happens is non-trivial (see full process
> here[1]), and results in net_device getting assigned the 'qdisc noop', which will
> drop all TX packets on the interface.
> 
> [1]:
> https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>

>  net/sched/cls_api.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index
> 55bd1429678f..11b683c45c28 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -735,8 +735,11 @@ static int tcf_block_offload_cmd(struct tcf_block
> *block,
>  	INIT_LIST_HEAD(&bo.cb_list);
> 
>  	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> -	if (err < 0)
> +	if (err < 0) {
> +		if (err != -EOPNOTSUPP)
> +			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc
> failed");
>  		return err;
> +	}
> 
>  	return tcf_block_setup(block, &bo);
>  }
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 19:33       ` Stephen Hemminger
@ 2020-04-23 19:56         ` Jakub Kicinski
  2020-04-24  7:04           ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2020-04-23 19:56 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jesper Dangaard Brouer, netdev, Ilias Apalodimas,
	Toke Høiland-Jørgensen, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo

On Thu, 23 Apr 2020 12:33:56 -0700 Stephen Hemminger wrote:
> On Thu, 23 Apr 2020 17:38:04 +0200
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> 
> > On Thu, 23 Apr 2020 08:28:58 -0700
> > Stephen Hemminger <stephen@networkplumber.org> wrote:
> >   
> > > On Thu, 23 Apr 2020 16:57:50 +0200
> > > Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > >     
> > > > Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot
> > > > support the qdisc type. Other return values will result in failing the
> > > > qdisc setup.  This lead to qdisc noop getting assigned, which will
> > > > drop all TX packets on the interface.
> > > > 
> > > > Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
> > > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>      
> > > 
> > > Would it be possible to use extack as well?    
> > 
> > That is what patch 1/2 already does.
> >   
> > > Putting errors in dmesg is unhelpful    
> > 
> > This patchset does not introduce any dmesg printk.
> >   
> 
> I was thinking that this  
> 	if (num_tc  > dpaa2_eth_tc_count(priv)) {
>  		netdev_err(net_dev, "Max %d traffic classes supported\n",
>  			   dpaa2_eth_tc_count(priv));
> -		return -EINVAL;
> +		return -EOPNOTSUPP;
>  	}
> 
> could be an extack message

That's a good question, actually. In this case Jesper was seeing a
failure when creating the default qdisc. The extack would go nowhere,
we'd have to print it to the logs, no? Which we should probably do,
anyway.

> but doing that would require a change
> to the ndo_setup_tc hook to allow driver to return its own error message
> as to why the setup failed.

Yeah :S The block offload command contains extack, but this driver
doesn't understand block offload, so it won't interpret it...

That brings me to an important point - doesn't the extack in patch 1
override any extack driver may have set?

I remember we discussed this when adding extacks to the TC core, but 
I don't remember the conclusion now, ugh.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-23 19:56         ` Jakub Kicinski
@ 2020-04-24  7:04           ` Jesper Dangaard Brouer
  2020-04-25  0:28             ` Jakub Kicinski
  0 siblings, 1 reply; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2020-04-24  7:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Stephen Hemminger, netdev, Ilias Apalodimas,
	Toke Høiland-Jørgensen, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo, brouer

On Thu, 23 Apr 2020 12:56:00 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> On Thu, 23 Apr 2020 12:33:56 -0700 Stephen Hemminger wrote:
> > On Thu, 23 Apr 2020 17:38:04 +0200
> > Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >   
> > > On Thu, 23 Apr 2020 08:28:58 -0700
> > > Stephen Hemminger <stephen@networkplumber.org> wrote:
> > >     
> > > > On Thu, 23 Apr 2020 16:57:50 +0200
> > > > Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > > >       
> > > > > Drivers ndo_setup_tc call should return -EOPNOTSUPP, when it cannot
> > > > > support the qdisc type. Other return values will result in failing the
> > > > > qdisc setup.  This lead to qdisc noop getting assigned, which will
> > > > > drop all TX packets on the interface.
> > > > > 
> > > > > Fixes: ab1e6de2bd49 ("dpaa2-eth: Add mqprio support")
> > > > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>        
> > > > 
> > > > Would it be possible to use extack as well?      
> > > 
> > > That is what patch 1/2 already does.
> > >     
> > > > Putting errors in dmesg is unhelpful      
> > > 
> > > This patchset does not introduce any dmesg printk.
> > >     
> > 
> > I was thinking that this  
> > 	if (num_tc  > dpaa2_eth_tc_count(priv)) {
> >  		netdev_err(net_dev, "Max %d traffic classes supported\n",
> >  			   dpaa2_eth_tc_count(priv));
> > -		return -EINVAL;
> > +		return -EOPNOTSUPP;
> >  	}
> > 
> > could be an extack message  

First of all, this is a fix, and we need to keep it simple, as it needs
to be backported to v5.3.

Talking about converting this warning message this into a extack, I'm
actually not convinced that is a good idea, or will even work.  First
the extack cannot contain the %d number.  Second returning -EOPNOTSUPP
this is actually not an error, and I don't think tc will print the
extack in that case?
 
> That's a good question, actually. In this case Jesper was seeing a
> failure when creating the default qdisc. The extack would go nowhere,
> we'd have to print it to the logs, no? Which we should probably do,
> anyway.

Good point. We probably need a separate dmesg error when we cannot
configure the default qdisc.  As there is not end-user to receive the
extack.   But I would place that at a higher level in qdisc_create_dflt().
It would definitely have helped me to identify what net-subsystem was
dropping packets, and after my patch[1/2] adding the extack, an
end-user would get a meaning full message to ease the troubleshooting.

(Side-note: First I placed an extack in qdisc_create_dflt() but I
realized it was wrong, because it could potentially override messages
from the lower layers.)

(For a separate patch:)

We should discuss, that when creating the default qdisc, we should IMHO
not allow that to fail.  As you can see in [1], this step happens
during the qdisc init function e.g. it could also fail due to low
memory. IMHO we should have a fallback, for when the default qdisc init
fails, e.g. assign pfifo_fast instead or even noqueue.


> > but doing that would require a change
> > to the ndo_setup_tc hook to allow driver to return its own error message
> > as to why the setup failed.  
> 
> Yeah :S The block offload command contains extack, but this driver
> doesn't understand block offload, so it won't interpret it...
> 
> That brings me to an important point - doesn't the extack in patch 1
> override any extack driver may have set?

Nope, see above side-note.  I set the extack at the "lowest level",
e.g. closest to the error that cause the err back-propagation, when I
detect that this will cause a failure at higher level.


> I remember we discussed this when adding extacks to the TC core, but 
> I don't remember the conclusion now, ugh.

When adding the extack code, I as puzzled that during debugging I
managed to override other extack messages.  Have anyone though about a
better way to handle if extack messages gets overridden?


[1] https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs
  2020-04-23 14:57 [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs Jesper Dangaard Brouer
  2020-04-23 14:57 ` [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack Jesper Dangaard Brouer
  2020-04-23 14:57 ` [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc Jesper Dangaard Brouer
@ 2020-04-24 23:45 ` David Miller
  2 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2020-04-24 23:45 UTC (permalink / raw)
  To: brouer
  Cc: netdev, ilias.apalodimas, toke, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Thu, 23 Apr 2020 16:57:40 +0200

> I've been very puzzled why networking on my NXP development board,
> using driver dpaa2-eth, stopped working when I updated the kernel
> version >= 5.3.  The observable issue were that interface would drop
> all TX packets, because it had assigned the qdisc noop.
> 
> This turned out the be a NIC driver bug, that would only get triggered
> when using sysctl net/core/default_qdisc=fq_codel. It was non-trivial
> to find out[1] this was driver related. Thus, this patchset besides
> fixing the driver bug, also helps end-user identify the issue.
> 
> [1]: https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_nxp_ls1088/nxp-board04-troubleshoot-qdisc.org

Series applied to net-next, thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-24  7:04           ` Jesper Dangaard Brouer
@ 2020-04-25  0:28             ` Jakub Kicinski
  2020-04-27 14:18               ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2020-04-25  0:28 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Stephen Hemminger, netdev, Ilias Apalodimas,
	Toke Høiland-Jørgensen, ruxandra.radulescu,
	ioana.ciornei, nipun.gupta, shawnguo, Marcelo Ricardo Leitner

On Fri, 24 Apr 2020 09:04:26 +0200 Jesper Dangaard Brouer wrote:
> (Side-note: First I placed an extack in qdisc_create_dflt() but I
> realized it was wrong, because it could potentially override messages
> from the lower layers.)
>
> > > but doing that would require a change
> > > to the ndo_setup_tc hook to allow driver to return its own error message
> > > as to why the setup failed.    
> > 
> > Yeah :S The block offload command contains extack, but this driver
> > doesn't understand block offload, so it won't interpret it...
> > 
> > That brings me to an important point - doesn't the extack in patch 1
> > override any extack driver may have set?  
> 
> Nope, see above side-note.  I set the extack at the "lowest level",
> e.g. closest to the error that cause the err back-propagation, when I
> detect that this will cause a failure at higher level.

Still, the driver is lower:

diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 2908e0a0d6e1..ffed75453c14 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -209,9 +209,12 @@ static int
 nsim_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data)
 {
        struct netdevsim *ns = netdev_priv(dev);
+       struct flow_block_offload *f;

        switch (type) {
        case TC_SETUP_BLOCK:
+               f = type_data;
+               NL_SET_ERR_MSG_MOD(f->extack, "bla bla bla bla bla");
+               return -EINVAL;
-               return flow_block_cb_setup_simple(type_data,
-                                                 &nsim_block_cb_list,
-                                                 nsim_setup_tc_block_cb,
 	default:
 		return -EOPNOTSUPP;
 	}

# tc qdisc add dev netdevsim0 ingress
Error: Driver ndo_setup_tc failed.


> > I remember we discussed this when adding extacks to the TC core, but 
> > I don't remember the conclusion now, ugh.  
> 
> When adding the extack code, I as puzzled that during debugging I
> managed to override other extack messages.  Have anyone though about a
> better way to handle if extack messages gets overridden?

I think there was more discussion, but this is all I can find now:

https://lore.kernel.org/netdev/20180918131212.20266-4-johannes@sipsolutions.net/#t

Maybe Marcelo will remeber.

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc
  2020-04-25  0:28             ` Jakub Kicinski
@ 2020-04-27 14:18               ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 13+ messages in thread
From: Marcelo Ricardo Leitner @ 2020-04-27 14:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jesper Dangaard Brouer, Stephen Hemminger, netdev,
	Ilias Apalodimas, Toke Høiland-Jørgensen,
	ruxandra.radulescu, ioana.ciornei, nipun.gupta, shawnguo

On Fri, Apr 24, 2020 at 05:28:59PM -0700, Jakub Kicinski wrote:
> On Fri, 24 Apr 2020 09:04:26 +0200 Jesper Dangaard Brouer wrote:
> > (Side-note: First I placed an extack in qdisc_create_dflt() but I
> > realized it was wrong, because it could potentially override messages
> > from the lower layers.)
> >
> > > > but doing that would require a change
> > > > to the ndo_setup_tc hook to allow driver to return its own error message
> > > > as to why the setup failed.    
> > > 
> > > Yeah :S The block offload command contains extack, but this driver
> > > doesn't understand block offload, so it won't interpret it...
> > > 
> > > That brings me to an important point - doesn't the extack in patch 1
> > > override any extack driver may have set?  
> > 
> > Nope, see above side-note.  I set the extack at the "lowest level",
> > e.g. closest to the error that cause the err back-propagation, when I
> > detect that this will cause a failure at higher level.
> 
> Still, the driver is lower:
> 
> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
> index 2908e0a0d6e1..ffed75453c14 100644
> --- a/drivers/net/netdevsim/netdev.c
> +++ b/drivers/net/netdevsim/netdev.c
> @@ -209,9 +209,12 @@ static int
>  nsim_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data)
>  {
>         struct netdevsim *ns = netdev_priv(dev);
> +       struct flow_block_offload *f;
> 
>         switch (type) {
>         case TC_SETUP_BLOCK:
> +               f = type_data;
> +               NL_SET_ERR_MSG_MOD(f->extack, "bla bla bla bla bla");
> +               return -EINVAL;
> -               return flow_block_cb_setup_simple(type_data,
> -                                                 &nsim_block_cb_list,
> -                                                 nsim_setup_tc_block_cb,
>  	default:
>  		return -EOPNOTSUPP;
>  	}
> 
> # tc qdisc add dev netdevsim0 ingress
> Error: Driver ndo_setup_tc failed.
> 
> 
> > > I remember we discussed this when adding extacks to the TC core, but 
> > > I don't remember the conclusion now, ugh.  
> > 
> > When adding the extack code, I as puzzled that during debugging I
> > managed to override other extack messages.  Have anyone though about a
> > better way to handle if extack messages gets overridden?
> 
> I think there was more discussion, but this is all I can find now:
> 
> https://lore.kernel.org/netdev/20180918131212.20266-4-johannes@sipsolutions.net/#t
> 
> Maybe Marcelo will remeber.

There was also this other one, on supporting multiple messages:
https://lore.kernel.org/netdev/673abaddb26351826ca454f46d1271f1f4814c56.1521226621.git.marcelo.leitner%40gmail.com/T/

What I remember is what we have now is enough because of 3 main
reasons:
- Anything logged before the actual error is potentially just noise
- Anything logged after the actual error is just noise,
- The user is probably not prepared to handle warnings from ip/tc/etc
  commands.

For this last, one example using the context here:
"Warning: failed to create the qdisc."
Ok, but what does that mean? Should the sysadmin, potentially unaware
of what a qdisc is, retry or ignore the warning? If retry, then it
probably should have just failed itself in the first place.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-04-27 14:18 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-23 14:57 [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs Jesper Dangaard Brouer
2020-04-23 14:57 ` [PATCH net-next 1/2] net: sched: report ndo_setup_tc failures via extack Jesper Dangaard Brouer
2020-04-23 19:51   ` Ioana Ciornei
2020-04-23 14:57 ` [PATCH net-next 2/2] dpaa2-eth: fix return codes used in ndo_setup_tc Jesper Dangaard Brouer
2020-04-23 15:28   ` Stephen Hemminger
2020-04-23 15:38     ` Jesper Dangaard Brouer
2020-04-23 19:33       ` Stephen Hemminger
2020-04-23 19:56         ` Jakub Kicinski
2020-04-24  7:04           ` Jesper Dangaard Brouer
2020-04-25  0:28             ` Jakub Kicinski
2020-04-27 14:18               ` Marcelo Ricardo Leitner
2020-04-23 19:49   ` Ioana Ciornei
2020-04-24 23:45 ` [PATCH net-next 0/2] Fix qdisc noop issue caused by driver and identify future bugs David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).