netdev.vger.kernel.org archive mirror
* [PATCH net-next v1 0/2] GRO drop accounting
@ 2021-01-06 21:55 Jesse Brandeburg
  2021-01-06 21:55 ` [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
  2021-01-06 21:55 ` [PATCH net-next v1 2/2] ice: remove GRO drop accounting Jesse Brandeburg
  0 siblings, 2 replies; 12+ messages in thread
From: Jesse Brandeburg @ 2021-01-06 21:55 UTC (permalink / raw)
  To: netdev; +Cc: Jesse Brandeburg, intel-wired-lan

Add some accounting for when the stack drops a packet
that a driver tried to indicate with a gro* call. This
helps users track where packets might have disappeared
to and will show up in the netdevice stats that already
exist.

After that, remove the driver specific workaround
that was added to do the same, just scoped too small.

Jesse Brandeburg (2):
  net: core: count drops from GRO
  ice: remove GRO drop accounting

 drivers/net/ethernet/intel/ice/ice.h          | 1 -
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 1 -
 drivers/net/ethernet/intel/ice/ice_main.c     | 4 +---
 drivers/net/ethernet/intel/ice/ice_txrx.h     | 1 -
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 --
 net/core/dev.c                                | 2 ++
 6 files changed, 3 insertions(+), 8 deletions(-)

-- 
2.29.2



* [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-06 21:55 [PATCH net-next v1 0/2] GRO drop accounting Jesse Brandeburg
@ 2021-01-06 21:55 ` Jesse Brandeburg
  2021-01-08  0:50   ` Shannon Nelson
  2021-01-08  9:25   ` Eric Dumazet
  2021-01-06 21:55 ` [PATCH net-next v1 2/2] ice: remove GRO drop accounting Jesse Brandeburg
  1 sibling, 2 replies; 12+ messages in thread
From: Jesse Brandeburg @ 2021-01-06 21:55 UTC (permalink / raw)
  To: netdev; +Cc: Jesse Brandeburg, intel-wired-lan, Eric Dumazet, Jamal Hadi Salim

When drivers call the various receive upcalls to hand an skb
to the stack, the stack can sometimes drop the packet. The good
news is that the return code, NET_RX_DROP or GRO_DROP, is given
to all the drivers. The bad news is that no driver except the
one "ice" driver that I changed checks the return code and
increments the dropped count. As a result, packets currently
arrive at the edge interface, are fully handled by the driver,
and then mysteriously disappear.

Rather than fix all drivers to increment the drop stat when
handling the return code, emulate the already existing statistic
update for NET_RX_DROP events for the two GRO_DROP locations, and
increment the dev->rx_dropped associated with the skb.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/core/dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8fa739259041..ef34043a9550 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi,
 		break;
 
 	case GRO_DROP:
+		atomic_long_inc(&skb->dev->rx_dropped);
 		kfree_skb(skb);
 		break;
 
@@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
 		break;
 
 	case GRO_DROP:
+		atomic_long_inc(&skb->dev->rx_dropped);
 		napi_reuse_skb(napi, skb);
 		break;
 
-- 
2.29.2



* [PATCH net-next v1 2/2] ice: remove GRO drop accounting
  2021-01-06 21:55 [PATCH net-next v1 0/2] GRO drop accounting Jesse Brandeburg
  2021-01-06 21:55 ` [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
@ 2021-01-06 21:55 ` Jesse Brandeburg
  1 sibling, 0 replies; 12+ messages in thread
From: Jesse Brandeburg @ 2021-01-06 21:55 UTC (permalink / raw)
  To: netdev; +Cc: Jesse Brandeburg, intel-wired-lan

The driver was counting GRO drops but now that the stack
does it with the previous patch, the driver can drop
all the logic.  The driver keeps the dev_dbg message in order
to do optional detailed tracing.

This mostly undoes commit a8fffd7ae9a5 ("ice: add useful statistics").

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          | 1 -
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 1 -
 drivers/net/ethernet/intel/ice/ice_main.c     | 4 +---
 drivers/net/ethernet/intel/ice/ice_txrx.h     | 1 -
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 --
 5 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 56725356a17b..dde850045e7e 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -256,7 +256,6 @@ struct ice_vsi {
 	u32 tx_busy;
 	u32 rx_buf_failed;
 	u32 rx_page_failed;
-	u32 rx_gro_dropped;
 	u16 num_q_vectors;
 	u16 base_vector;		/* IRQ base for OS reserved vectors */
 	enum ice_vsi_type type;
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 9e8e9531cd87..025c0a13e724 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -59,7 +59,6 @@ static const struct ice_stats ice_gstrings_vsi_stats[] = {
 	ICE_VSI_STAT("rx_unknown_protocol", eth_stats.rx_unknown_protocol),
 	ICE_VSI_STAT("rx_alloc_fail", rx_buf_failed),
 	ICE_VSI_STAT("rx_pg_alloc_fail", rx_page_failed),
-	ICE_VSI_STAT("rx_gro_dropped", rx_gro_dropped),
 	ICE_VSI_STAT("tx_errors", eth_stats.tx_errors),
 	ICE_VSI_STAT("tx_linearize", tx_linearize),
 	ICE_VSI_STAT("tx_busy", tx_busy),
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c52b9bb0e3ab..e157a2b4fcb9 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5314,7 +5314,6 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
 	vsi->tx_linearize = 0;
 	vsi->rx_buf_failed = 0;
 	vsi->rx_page_failed = 0;
-	vsi->rx_gro_dropped = 0;
 
 	rcu_read_lock();
 
@@ -5329,7 +5328,6 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
 		vsi_stats->rx_bytes += bytes;
 		vsi->rx_buf_failed += ring->rx_stats.alloc_buf_failed;
 		vsi->rx_page_failed += ring->rx_stats.alloc_page_failed;
-		vsi->rx_gro_dropped += ring->rx_stats.gro_dropped;
 	}
 
 	/* update XDP Tx rings counters */
@@ -5361,7 +5359,7 @@ void ice_update_vsi_stats(struct ice_vsi *vsi)
 	ice_update_eth_stats(vsi);
 
 	cur_ns->tx_errors = cur_es->tx_errors;
-	cur_ns->rx_dropped = cur_es->rx_discards + vsi->rx_gro_dropped;
+	cur_ns->rx_dropped = cur_es->rx_discards;
 	cur_ns->tx_dropped = cur_es->tx_discards;
 	cur_ns->multicast = cur_es->rx_multicast;
 
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index ff1a1cbd078e..6ce2046fc349 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -193,7 +193,6 @@ struct ice_rxq_stats {
 	u64 non_eop_descs;
 	u64 alloc_page_failed;
 	u64 alloc_buf_failed;
-	u64 gro_dropped; /* GRO returned dropped */
 };
 
 /* this enum matches hardware bits and is meant to be used by DYN_CTLN
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index bc2f4390b51d..3601b7d8abe5 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -192,8 +192,6 @@ ice_receive_skb(struct ice_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag)
 	    (vlan_tag & VLAN_VID_MASK))
 		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
 	if (napi_gro_receive(&rx_ring->q_vector->napi, skb) == GRO_DROP) {
-		/* this is tracked separately to help us debug stack drops */
-		rx_ring->rx_stats.gro_dropped++;
 		netdev_dbg(rx_ring->netdev, "Receive Queue %d: Dropped packet from GRO\n",
 			   rx_ring->q_index);
 	}
-- 
2.29.2



* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-06 21:55 ` [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
@ 2021-01-08  0:50   ` Shannon Nelson
  2021-01-08 18:26     ` Jesse Brandeburg
  2021-01-08  9:25   ` Eric Dumazet
  1 sibling, 1 reply; 12+ messages in thread
From: Shannon Nelson @ 2021-01-08  0:50 UTC (permalink / raw)
  To: Jesse Brandeburg, netdev; +Cc: intel-wired-lan, Eric Dumazet, Jamal Hadi Salim

On 1/6/21 1:55 PM, Jesse Brandeburg wrote:
> When drivers call the various receive upcalls to receive an skb
> to the stack, sometimes that stack can drop the packet. The good
> news is that the return code is given to all the drivers of
> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
> the one "ice" driver that I changed, check the stat and increment

If the stack is dropping the packet, isn't it up to the stack to track 
that, perhaps with something that shows up in netstat -s?  We don't 
really want to make the driver responsible for any drops that happen 
above its head, do we?

sln

> the dropped count. This is currently leading to packets that
> arrive at the edge interface and are fully handled by the driver
> and then mysteriously disappear.
>
> Rather than fix all drivers to increment the drop stat when
> handling the return code, emulate the already existing statistic
> update for NET_RX_DROP events for the two GRO_DROP locations, and
> increment the dev->rx_dropped associated with the skb.
>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> ---
>   net/core/dev.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8fa739259041..ef34043a9550 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi,
>   		break;
>   
>   	case GRO_DROP:
> +		atomic_long_inc(&skb->dev->rx_dropped);
>   		kfree_skb(skb);
>   		break;
>   
> @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
>   		break;
>   
>   	case GRO_DROP:
> +		atomic_long_inc(&skb->dev->rx_dropped);
>   		napi_reuse_skb(napi, skb);
>   		break;
>   



* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-06 21:55 ` [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
  2021-01-08  0:50   ` Shannon Nelson
@ 2021-01-08  9:25   ` Eric Dumazet
  2021-01-08 18:35     ` Jesse Brandeburg
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2021-01-08  9:25 UTC (permalink / raw)
  To: Jesse Brandeburg; +Cc: netdev, intel-wired-lan, Jamal Hadi Salim

On Wed, Jan 6, 2021 at 10:56 PM Jesse Brandeburg
<jesse.brandeburg@intel.com> wrote:
>
> When drivers call the various receive upcalls to receive an skb
> to the stack, sometimes that stack can drop the packet. The good
> news is that the return code is given to all the drivers of
> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
> the one "ice" driver that I changed, check the stat and increment
> the dropped count. This is currently leading to packets that
> arrive at the edge interface and are fully handled by the driver
> and then mysteriously disappear.
>
> Rather than fix all drivers to increment the drop stat when
> handling the return code, emulate the already existing statistic
> update for NET_RX_DROP events for the two GRO_DROP locations, and
> increment the dev->rx_dropped associated with the skb.
>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> ---
>  net/core/dev.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8fa739259041..ef34043a9550 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi,
>                 break;
>
>         case GRO_DROP:
> +               atomic_long_inc(&skb->dev->rx_dropped);
>                 kfree_skb(skb);
>                 break;
>
> @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
>                 break;
>
>         case GRO_DROP:
> +               atomic_long_inc(&skb->dev->rx_dropped);
>                 napi_reuse_skb(napi, skb);
>                 break;
>


This is not needed. I think we should clean up ice instead.

Drivers are supposed to have allocated the skb (using
napi_get_frags()) before calling napi_gro_frags()

Only napi_gro_frags() would return GRO_DROP, but we supposedly could
crash at that point, since a driver is clearly buggy.

We probably can remove GRO_DROP completely, assuming lazy drivers are fixed.

diff --git a/net/core/dev.c b/net/core/dev.c
index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi)
        gro_result_t ret;
        struct sk_buff *skb = napi_frags_skb(napi);

-       if (!skb)
-               return GRO_DROP;
-
        trace_napi_gro_frags_entry(skb);

        ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));


* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08  0:50   ` Shannon Nelson
@ 2021-01-08 18:26     ` Jesse Brandeburg
  2021-01-08 19:21       ` Shannon Nelson
  0 siblings, 1 reply; 12+ messages in thread
From: Jesse Brandeburg @ 2021-01-08 18:26 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev, intel-wired-lan, Eric Dumazet, Jamal Hadi Salim

Shannon Nelson wrote:

> On 1/6/21 1:55 PM, Jesse Brandeburg wrote:
> > When drivers call the various receive upcalls to receive an skb
> > to the stack, sometimes that stack can drop the packet. The good
> > news is that the return code is given to all the drivers of
> > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
> > the one "ice" driver that I changed, check the stat and increment
> 
> If the stack is dropping the packet, isn't it up to the stack to track 
> that, perhaps with something that shows up in netstat -s?  We don't 
> really want to make the driver responsible for any drops that happen 
> above its head, do we?

I totally agree!

In patch 2/2 I revert the driver-specific changes I had made in an
earlier patch, and this patch *was* my effort to make the stack show the
drops.

Maybe I wasn't clear. I'm seeing packets disappear during TCP
workloads, and this GRO_DROP code was the source of the drops (I see it
returned infrequently but regularly).

The driver processes the packet but the stack never sees it, and there
were no drop counters anywhere tracking it.
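
The accounting idea in patch 1/2 can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every
name prefixed with model_ is hypothetical and is not a kernel API, and
a plain increment stands in for the atomic_long_inc() used in the
patch. It shows the drop being counted once in the stack's finish step,
so a driver that ignores the return code still has its drops recorded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's GRO result codes. */
enum model_gro_result { MODEL_GRO_NORMAL, MODEL_GRO_DROP };

/* Hypothetical stand-in for struct net_device. */
struct model_netdev {
	uint64_t rx_dropped;	/* models the core-maintained dev->rx_dropped */
};

/*
 * Models the stack's finish step: the drop is counted here, in one
 * place, instead of in every driver's receive path.
 */
static enum model_gro_result
model_skb_finish(struct model_netdev *dev, enum model_gro_result res)
{
	assert(dev);
	if (res == MODEL_GRO_DROP)
		dev->rx_dropped++;
	return res;
}

/* Models a driver that hands a packet up and ignores the return code. */
static void model_driver_rx(struct model_netdev *dev,
			    enum model_gro_result res)
{
	(void)model_skb_finish(dev, res);
}
```

Calling model_driver_rx() with MODEL_GRO_DROP twice and
MODEL_GRO_NORMAL once on a zeroed struct model_netdev leaves
rx_dropped at 2, even though the model driver never looks at the
result.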



* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08  9:25   ` Eric Dumazet
@ 2021-01-08 18:35     ` Jesse Brandeburg
  2021-01-08 18:45       ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Jesse Brandeburg @ 2021-01-08 18:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, intel-wired-lan, Jamal Hadi Salim

Eric Dumazet wrote:
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi,
> >                 break;
> >
> >         case GRO_DROP:
> > +               atomic_long_inc(&skb->dev->rx_dropped);
> >                 kfree_skb(skb);
> >                 break;
> >
> > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
> >                 break;
> >
> >         case GRO_DROP:
> > +               atomic_long_inc(&skb->dev->rx_dropped);
> >                 napi_reuse_skb(napi, skb);
> >                 break;
> >
> 
> 
> This is not needed. I think we should clean up ice instead.

My patch 2 already did that. I was trying to address the fact that I'm
*actually seeing* GRO_DROP return codes coming back from stack.

I'll try to reproduce that issue again that I saw. Maybe modern kernels
don't have the problem as frequently or at all.

> Drivers are supposed to have allocated the skb (using
> napi_get_frags()) before calling napi_gro_frags()

ice doesn't use napi_get_frags/napi_gro_frags, so I'm not sure how this
is relevant. 

> Only napi_gro_frags() would return GRO_DROP, but we supposedly could
> crash at that point, since a driver is clearly buggy.

seems unlikely since we don't call those functions.
 
> We probably can remove GRO_DROP completely, assuming lazy drivers are fixed.

This might be ok, but doesn't explain why I was seeing this return
code (which was the whole reason I was trying to count them), however I
may have been running on a distro kernel from redhat/centos 8 when I
was seeing these events. I haven't fully completed spelunking all the
different sources, but might be able to follow down the rabbit hole
further.

 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi)
>         gro_result_t ret;
>         struct sk_buff *skb = napi_frags_skb(napi);
> 
> -       if (!skb)
> -               return GRO_DROP;
> -
>         trace_napi_gro_frags_entry(skb);
> 
>         ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));

This change (noted from your other patches) is fine, and a likely
improvement, thanks for sending those!


* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08 18:35     ` Jesse Brandeburg
@ 2021-01-08 18:45       ` Eric Dumazet
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2021-01-08 18:45 UTC (permalink / raw)
  To: Jesse Brandeburg; +Cc: netdev, intel-wired-lan, Jamal Hadi Salim

On Fri, Jan 8, 2021 at 7:35 PM Jesse Brandeburg
<jesse.brandeburg@intel.com> wrote:
>
> Eric Dumazet wrote:
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi,
> > >                 break;
> > >
> > >         case GRO_DROP:
> > > +               atomic_long_inc(&skb->dev->rx_dropped);
> > >                 kfree_skb(skb);
> > >                 break;
> > >
> > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
> > >                 break;
> > >
> > >         case GRO_DROP:
> > > +               atomic_long_inc(&skb->dev->rx_dropped);
> > >                 napi_reuse_skb(napi, skb);
> > >                 break;
> > >
> >
> >
> > This is not needed. I think we should clean up ice instead.
>
> My patch 2 already did that. I was trying to address the fact that I'm
> *actually seeing* GRO_DROP return codes coming back from stack.
>
> I'll try to reproduce that issue again that I saw. Maybe modern kernels
> don't have the problem as frequently or at all.


Jesse, you are sending a patch for current kernels.

It is pretty clear from reading the source code that the issue you
have cannot happen with current kernels,
even without an actual ICE piece of hardware to test this :)

>
> > Drivers are supposed to have allocated the skb (using
> > napi_get_frags()) before calling napi_gro_frags()
>
> ice doesn't use napi_get_frags/napi_gro_frags, so I'm not sure how this
> is relevant.
>
> > Only napi_gro_frags() would return GRO_DROP, but we supposedly could
> > crash at that point, since a driver is clearly buggy.
>
> seems unlikely since we don't call those functions.
>
> > We probably can remove GRO_DROP completely, assuming lazy drivers are fixed.
>
> This might be ok, but doesn't explain why I was seeing this return
> code (which was the whole reason I was trying to count them), however I
> may have been running on a distro kernel from redhat/centos 8 when I
> was seeing these events. I haven't fully completed spelunking all the
> different sources, but might be able to follow down the rabbit hole
> further.

Yes please :)

>
>
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi)
> >         gro_result_t ret;
> >         struct sk_buff *skb = napi_frags_skb(napi);
> >
> > -       if (!skb)
> > -               return GRO_DROP;
> > -
> >         trace_napi_gro_frags_entry(skb);
> >
> >         ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));
>
> This change (noted from your other patches) is fine, and a likely
> improvement, thanks for sending those!

Sure !


* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08 18:26     ` Jesse Brandeburg
@ 2021-01-08 19:21       ` Shannon Nelson
  2021-01-08 20:26         ` Saeed Mahameed
  2021-01-14 13:53         ` Jamal Hadi Salim
  0 siblings, 2 replies; 12+ messages in thread
From: Shannon Nelson @ 2021-01-08 19:21 UTC (permalink / raw)
  To: Jesse Brandeburg; +Cc: netdev, intel-wired-lan, Eric Dumazet, Jamal Hadi Salim

On 1/8/21 10:26 AM, Jesse Brandeburg wrote:
> Shannon Nelson wrote:
>
>> On 1/6/21 1:55 PM, Jesse Brandeburg wrote:
>>> When drivers call the various receive upcalls to receive an skb
>>> to the stack, sometimes that stack can drop the packet. The good
>>> news is that the return code is given to all the drivers of
>>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
>>> the one "ice" driver that I changed, check the stat and increment
>> If the stack is dropping the packet, isn't it up to the stack to track
>> that, perhaps with something that shows up in netstat -s?  We don't
>> really want to make the driver responsible for any drops that happen
>> above its head, do we?
> I totally agree!
>
> In patch 2/2 I revert the driver-specific changes I had made in an
> earlier patch, and this patch *was* my effort to make the stack show the
> drops.
>
> Maybe I wasn't clear. I'm seeing packets disappear during TCP
> workloads, and this GRO_DROP code was the source of the drops (I see it
> returning infrequently but regularly)
>
> The driver processes the packet but the stack never sees it, and there
> were no drop counters anywhere tracking it.
>

My point is that the patch increments a netdev counter, which to my mind 
immediately implicates the driver and hardware, rather than the stack.  
As a driver maintainer, I don't want to be chasing driver packet drop 
reports that are a stack problem.  I'd rather see a new counter in 
netstat -s that reflects the stack decision and can better imply what 
went wrong.  I don't have a good suggestion for a counter name at the 
moment.

I guess part of the issue is that this is right on the boundary of 
driver-stack.  But if we follow Eric's suggestions, maybe the problem 
magically goes away :-) .

sln



* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08 19:21       ` Shannon Nelson
@ 2021-01-08 20:26         ` Saeed Mahameed
  2021-01-08 22:17           ` Eric Dumazet
  2021-01-14 13:53         ` Jamal Hadi Salim
  1 sibling, 1 reply; 12+ messages in thread
From: Saeed Mahameed @ 2021-01-08 20:26 UTC (permalink / raw)
  To: Shannon Nelson, Jesse Brandeburg
  Cc: netdev, intel-wired-lan, Eric Dumazet, Jamal Hadi Salim

On Fri, 2021-01-08 at 11:21 -0800, Shannon Nelson wrote:
> On 1/8/21 10:26 AM, Jesse Brandeburg wrote:
> > Shannon Nelson wrote:
> > 
> > > On 1/6/21 1:55 PM, Jesse Brandeburg wrote:
> > > > When drivers call the various receive upcalls to receive an skb
> > > > to the stack, sometimes that stack can drop the packet. The
> > > > good
> > > > news is that the return code is given to all the drivers of
> > > > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
> > > > the one "ice" driver that I changed, check the stat and
> > > > increment
> > > If the stack is dropping the packet, isn't it up to the stack to
> > > track
> > > that, perhaps with something that shows up in netstat -s?  We
> > > don't
> > > really want to make the driver responsible for any drops that
> > > happen
> > > above its head, do we?
> > I totally agree!
> > 
> > In patch 2/2 I revert the driver-specific changes I had made in an
> > earlier patch, and this patch *was* my effort to make the stack
> > show the
> > drops.
> > 
> > Maybe I wasn't clear. I'm seeing packets disappear during TCP
> > workloads, and this GRO_DROP code was the source of the drops (I
> > see it
> > returning infrequently but regularly)
> > 
> > The driver processes the packet but the stack never sees it, and
> > there
> > were no drop counters anywhere tracking it.
> > 
> 
> My point is that the patch increments a netdev counter, which to my
> mind 
> immediately implicates the driver and hardware, rather than the
> stack.  
> As a driver maintainer, I don't want to be chasing driver packet
> drop 
> reports that are a stack problem.  I'd rather see a new counter in 
> netstat -s that reflects the stack decision and can better imply
> what 
> went wrong.  I don't have a good suggestion for a counter name at
> the 
> moment.
> 
> I guess part of the issue is that this is right on the boundary of 
> driver-stack.  But if we follow Eric's suggestions, maybe the
> problem 
> magically goes away :-) .
> 
> sln
> 

I think there is still some merit in this patchset even with Eric's
removal of GRO_DROP from gro_receive(). As Eric explained, it is still
possible to silently drop for the same reason when drivers call
napi_get_frags() or even the alloc_skb() APIs. Many drivers do not
account for such packet drops, so maybe the right thing to do is to
inline the packet drop accounting into the skb alloc APIs? The
question is: is it the job of those APIs to update netdev->stats?







* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08 20:26         ` Saeed Mahameed
@ 2021-01-08 22:17           ` Eric Dumazet
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2021-01-08 22:17 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Shannon Nelson, Jesse Brandeburg, netdev, intel-wired-lan,
	Jamal Hadi Salim

On Fri, Jan 8, 2021 at 9:27 PM Saeed Mahameed <saeed@kernel.org> wrote:
>

> I think there is still some merit in this patchset even with Eric's
> removal of GRO_DROP from gro_receive(). As Eric explained, it is still
> possible to silently drop for the same reason when drivers call
> napi_get_frags() or even the alloc_skb() APIs. Many drivers do not
> account for such packet drops, so maybe the right thing to do is to
> inline the packet drop accounting into the skb alloc APIs? The
> question is: is it the job of those APIs to update netdev->stats?
>

You absolutely do not want to have a generic increment of
netdev->stats for multiqueue drivers.
This would add terrible cache line false sharing under DDoS and memory stress.

Each driver maintains (or should maintain) a per rx queue counter for this case.

It seems mlx4 does nothing special, I would suggest you fix it :)
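
The per rx queue counter pattern described above can be sketched in
userspace C. This is a minimal illustration, not code from any driver:
the struct, the function names, the queue count and the 64-byte cache
line size are all assumptions, and the aligned attribute is the
GCC/Clang spelling. Each queue increments only its own counter on the
hot path, and the device-wide total is computed by summing when
statistics are read.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RX_QUEUES 4		/* hypothetical queue count */
#define CACHE_LINE    64	/* assumed cache line size in bytes */

/*
 * Hypothetical per-queue stats. The alignment gives each queue its own
 * cache line, so CPUs servicing different queues do not write to a
 * shared line.
 */
struct rxq_stats {
	uint64_t packets;
	uint64_t alloc_failed;
} __attribute__((aligned(CACHE_LINE)));

static struct rxq_stats rxq[NUM_RX_QUEUES];

/* Hot path: touches only the counter owned by queue q. */
static void rxq_count_alloc_failure(unsigned int q)
{
	assert(q < NUM_RX_QUEUES);
	rxq[q].alloc_failed++;
}

/* Slow path: fold the per-queue counters into one total on read. */
static uint64_t dev_rx_dropped(void)
{
	uint64_t total = 0;
	unsigned int q;

	for (q = 0; q < NUM_RX_QUEUES; q++)
		total += rxq[q].alloc_failed;
	return total;
}
```

The trade-off is that a read has to walk every queue, which is
acceptable because statistics are read far less often than packets
arrive.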


* Re: [PATCH net-next v1 1/2] net: core: count drops from GRO
  2021-01-08 19:21       ` Shannon Nelson
  2021-01-08 20:26         ` Saeed Mahameed
@ 2021-01-14 13:53         ` Jamal Hadi Salim
  1 sibling, 0 replies; 12+ messages in thread
From: Jamal Hadi Salim @ 2021-01-14 13:53 UTC (permalink / raw)
  To: Shannon Nelson, Jesse Brandeburg; +Cc: netdev, intel-wired-lan, Eric Dumazet

On 2021-01-08 2:21 p.m., Shannon Nelson wrote:
> On 1/8/21 10:26 AM, Jesse Brandeburg wrote:
>> Shannon Nelson wrote:
>>
>>> On 1/6/21 1:55 PM, Jesse Brandeburg wrote:
>>>> When drivers call the various receive upcalls to receive an skb
>>>> to the stack, sometimes that stack can drop the packet. The good
>>>> news is that the return code is given to all the drivers of
>>>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
>>>> the one "ice" driver that I changed, check the stat and increment
>>> If the stack is dropping the packet, isn't it up to the stack to track
>>> that, perhaps with something that shows up in netstat -s?  We don't
>>> really want to make the driver responsible for any drops that happen
>>> above its head, do we?
>> I totally agree!
>>
>> In patch 2/2 I revert the driver-specific changes I had made in an
>> earlier patch, and this patch *was* my effort to make the stack show the
>> drops.
>>
>> Maybe I wasn't clear. I'm seeing packets disappear during TCP
>> workloads, and this GRO_DROP code was the source of the drops (I see it
>> returning infrequently but regularly)
>>
>> The driver processes the packet but the stack never sees it, and there
>> were no drop counters anywhere tracking it.
>>
> 
> My point is that the patch increments a netdev counter, which to my mind 
> immediately implicates the driver and hardware, rather than the stack. 
> As a driver maintainer, I don't want to be chasing driver packet drop 
> reports that are a stack problem.  I'd rather see a new counter in 
> netstat -s that reflects the stack decision and can better imply what 
> went wrong.  I don't have a good suggestion for a counter name at the 
> moment.
> 
> I guess part of the issue is that this is right on the boundary of 
> driver-stack.  But if we follow Eric's suggestions, maybe the problem 
> magically goes away :-) .
> 

So: How does one know that the stack-upcall dropped a packet because
of GRO issues? Debugging with kprobe or traces doesn't count as an
answer.

cheers,
jamal




end of thread, other threads:[~2021-01-14 13:54 UTC | newest]

Thread overview: 12+ messages
2021-01-06 21:55 [PATCH net-next v1 0/2] GRO drop accounting Jesse Brandeburg
2021-01-06 21:55 ` [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
2021-01-08  0:50   ` Shannon Nelson
2021-01-08 18:26     ` Jesse Brandeburg
2021-01-08 19:21       ` Shannon Nelson
2021-01-08 20:26         ` Saeed Mahameed
2021-01-08 22:17           ` Eric Dumazet
2021-01-14 13:53         ` Jamal Hadi Salim
2021-01-08  9:25   ` Eric Dumazet
2021-01-08 18:35     ` Jesse Brandeburg
2021-01-08 18:45       ` Eric Dumazet
2021-01-06 21:55 ` [PATCH net-next v1 2/2] ice: remove GRO drop accounting Jesse Brandeburg
