netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
@ 2019-01-31 13:02 Tariq Toukan
  2019-01-31 15:02 ` Eric Dumazet
  2019-01-31 17:38 ` David Miller
  0 siblings, 2 replies; 11+ messages in thread
From: Tariq Toukan @ 2019-01-31 13:02 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eran Ben Elisha, Saeed Mahameed, Eric Dumazet, Tariq Toukan

From: Saeed Mahameed <saeedm@mellanox.com>

When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zero's, which don't affect
checksum. However, it is not guaranteed. For example, switches might
choose to make other use of these octets.
This repeatedly causes kernel hardware checksum fault.

Prior to the cited commit below, skb checksum was forced to be
CHECKSUM_NONE when padding is detected. After it, we need to keep
skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
verify and parse IP headers, it does not worth the effort as the packets
are so small that CHECKSUM_COMPLETE has no significant advantage.

Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

Hi Dave,
Please queue for -stable >= v4.18.

Thanks,
Tariq

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 9a0881cb7f51..fc8a11444e1a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
 }
 #endif
 
+#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
+
 /* We reach this function only after checking that any of
  * the (IPv4 | IPv6) bits are set in cqe->status.
  */
@@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
 		      netdev_features_t dev_features)
 {
 	__wsum hw_checksum = 0;
+	void *hdr;
+
+	/* CQE csum doesn't cover padding octets in short ethernet
+	 * frames. And the pad field is appended prior to calculating
+	 * and appending the FCS field.
+	 *
+	 * Detecting these padded frames requires to verify and parse
+	 * IP headers, so we simply force all those small frames to be
+	 * CHECKSUM_NONE even if they are not padded.
+	 * TODO: better if we report CHECKSUM_UNNECESSARY but this
+	 * demands some refactroing.
+	 */
+	if (short_frame(skb->len))
+		return -EINVAL;
 
-	void *hdr = (u8 *)va + sizeof(struct ethhdr);
-
+	hdr = (u8 *)va + sizeof(struct ethhdr);
 	hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
 
 	if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 13:02 [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames Tariq Toukan
@ 2019-01-31 15:02 ` Eric Dumazet
  2019-01-31 19:04   ` Saeed Mahameed
  2019-01-31 17:38 ` David Miller
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2019-01-31 15:02 UTC (permalink / raw)
  To: Tariq Toukan; +Cc: David S. Miller, netdev, Eran Ben Elisha, Saeed Mahameed

On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com> wrote:
>
> From: Saeed Mahameed <saeedm@mellanox.com>
>
> When an ethernet frame is padded to meet the minimum ethernet frame
> size, the padding octets are not covered by the hardware checksum.
> Fortunately the padding octets are usually zero's, which don't affect
> checksum. However, it is not guaranteed. For example, switches might
> choose to make other use of these octets.
> This repeatedly causes kernel hardware checksum fault.
>
> Prior to the cited commit below, skb checksum was forced to be
> CHECKSUM_NONE when padding is detected. After it, we need to keep
> skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
> verify and parse IP headers, it does not worth the effort as the packets
> are so small that CHECKSUM_COMPLETE has no significant advantage.
>
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> Hi Dave,
> Please queue for -stable >= v4.18.
>
> Thanks,
> Tariq
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index 9a0881cb7f51..fc8a11444e1a 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
>  }
>  #endif
>
> +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> +
>  /* We reach this function only after checking that any of
>   * the (IPv4 | IPv6) bits are set in cqe->status.
>   */
> @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
>                       netdev_features_t dev_features)
>  {
>         __wsum hw_checksum = 0;
> +       void *hdr;
> +
> +       /* CQE csum doesn't cover padding octets in short ethernet
> +        * frames. And the pad field is appended prior to calculating
> +        * and appending the FCS field.
> +        *
> +        * Detecting these padded frames requires to verify and parse
> +        * IP headers, so we simply force all those small frames to be
> +        * CHECKSUM_NONE even if they are not padded.
> +        * TODO: better if we report CHECKSUM_UNNECESSARY but this
> +        * demands some refactroing.

This TODO: looks bogus to me.

mlx4 driver first tries to use CHECKSUM_UNNECESSARY.

if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
                                                  MLX4_CQE_STATUS_UDP)) &&
     (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
      cqe->checksum == cpu_to_be16(0xffff)) {
     ...
      ip_summed = CHECKSUM_UNNECESSARY;
      ...
}

Then if this attempt failed, it tries to use CHECKSUM_COMPLETE (and
calls this check_sum() function)

So there is no refactoring to be done for mlx4 : short
IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already

> +        */
> +       if (short_frame(skb->len))
> +               return -EINVAL;
>
> -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> -
> +       hdr = (u8 *)va + sizeof(struct ethhdr);
>         hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
>
>         if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 13:02 [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames Tariq Toukan
  2019-01-31 15:02 ` Eric Dumazet
@ 2019-01-31 17:38 ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: David Miller @ 2019-01-31 17:38 UTC (permalink / raw)
  To: tariqt; +Cc: netdev, eranbe, saeedm, edumazet

From: Tariq Toukan <tariqt@mellanox.com>
Date: Thu, 31 Jan 2019 15:02:43 +0200

> From: Saeed Mahameed <saeedm@mellanox.com>
> 
> When an ethernet frame is padded to meet the minimum ethernet frame
> size, the padding octets are not covered by the hardware checksum.
> Fortunately the padding octets are usually zero's, which don't affect
> checksum. However, it is not guaranteed. For example, switches might
> choose to make other use of these octets.
> This repeatedly causes kernel hardware checksum fault.
> 
> Prior to the cited commit below, skb checksum was forced to be
> CHECKSUM_NONE when padding is detected. After it, we need to keep
> skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
> verify and parse IP headers, it does not worth the effort as the packets
> are so small that CHECKSUM_COMPLETE has no significant advantage.
> 
> Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> Hi Dave,
> Please queue for -stable >= v4.18.

Please look into Eric's feedback and update the comment as needed.

Thank you.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 15:02 ` Eric Dumazet
@ 2019-01-31 19:04   ` Saeed Mahameed
  2019-01-31 19:07     ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-01-31 19:04 UTC (permalink / raw)
  To: Tariq Toukan, edumazet; +Cc: Eran Ben Elisha, davem, netdev

On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
> On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com>
> wrote:
> > From: Saeed Mahameed <saeedm@mellanox.com>
> > 
> > When an ethernet frame is padded to meet the minimum ethernet frame
> > size, the padding octets are not covered by the hardware checksum.
> > Fortunately the padding octets are usually zero's, which don't
> > affect
> > checksum. However, it is not guaranteed. For example, switches
> > might
> > choose to make other use of these octets.
> > This repeatedly causes kernel hardware checksum fault.
> > 
> > Prior to the cited commit below, skb checksum was forced to be
> > CHECKSUM_NONE when padding is detected. After it, we need to keep
> > skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
> > verify and parse IP headers, it does not worth the effort as the
> > packets
> > are so small that CHECKSUM_COMPLETE has no significant advantage.
> > 
> > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> > are friends")
> > Cc: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > ---
> >  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++-
> > -
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> > 
> > Hi Dave,
> > Please queue for -stable >= v4.18.
> > 
> > Thanks,
> > Tariq
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > index 9a0881cb7f51..fc8a11444e1a 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
> > hw_checksum, struct sk_buff *skb,
> >  }
> >  #endif
> > 
> > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > +
> >  /* We reach this function only after checking that any of
> >   * the (IPv4 | IPv6) bits are set in cqe->status.
> >   */
> > @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe,
> > struct sk_buff *skb, void *va,
> >                       netdev_features_t dev_features)
> >  {
> >         __wsum hw_checksum = 0;
> > +       void *hdr;
> > +
> > +       /* CQE csum doesn't cover padding octets in short ethernet
> > +        * frames. And the pad field is appended prior to
> > calculating
> > +        * and appending the FCS field.
> > +        *
> > +        * Detecting these padded frames requires to verify and
> > parse
> > +        * IP headers, so we simply force all those small frames to
> > be
> > +        * CHECKSUM_NONE even if they are not padded.
> > +        * TODO: better if we report CHECKSUM_UNNECESSARY but this
> > +        * demands some refactroing.
> 
> This TODO: looks bogus to me.
> 
> mlx4 driver first tries to use CHECKSUM_UNNECESSARY.
> 
> if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
>                                                   MLX4_CQE_STATUS_UDP
> )) &&
>      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
>       cqe->checksum == cpu_to_be16(0xffff)) {
>      ...
>       ip_summed = CHECKSUM_UNNECESSARY;
>       ...
> }
> 
> Then if this attempt failed, it tries to use CHECKSUM_COMPLETE (and
> calls this check_sum() function)
> 
> So there is no refactoring to be done for mlx4 : short
> IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already

not exactly, considering mlx4 weird condition to decide to do
CHECKSUM_UNNECESSARY:

if ((cqe csum fields are valid) && cqe->checksum ==
cpu_to_be16(0xffff)) 
     { do CHECKSUM_UNNECESSARY }
else { do CHECKSUM_COMPLETE }

CHECKSUM_UNNECESSARY will be skipped if csum complete field is valid 
i.e (cqe->checksum != 0xffff)

but if checksum complete is to be skipped due to short_frame we want to
go back to CHECKSUM_UNNECESSARY, this is the refactoring i am talking
about :).


I hope now it is clear.. 


> 
> > +        */
> > +       if (short_frame(skb->len))
> > +               return -EINVAL;
> > 
> > -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> > -
> > +       hdr = (u8 *)va + sizeof(struct ethhdr);
> >         hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
> > 
> >         if (cqe->vlan_my_qpn &
> > cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> > --
> > 1.8.3.1
> > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 19:04   ` Saeed Mahameed
@ 2019-01-31 19:07     ` Eric Dumazet
  2019-01-31 19:27       ` Saeed Mahameed
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2019-01-31 19:07 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: Tariq Toukan, Eran Ben Elisha, davem, netdev

On Thu, Jan 31, 2019 at 11:04 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
> > On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com>
> > wrote:
> > > From: Saeed Mahameed <saeedm@mellanox.com>
> > >
> > > When an ethernet frame is padded to meet the minimum ethernet frame
> > > size, the padding octets are not covered by the hardware checksum.
> > > Fortunately the padding octets are usually zero's, which don't
> > > affect
> > > checksum. However, it is not guaranteed. For example, switches
> > > might
> > > choose to make other use of these octets.
> > > This repeatedly causes kernel hardware checksum fault.
> > >
> > > Prior to the cited commit below, skb checksum was forced to be
> > > CHECKSUM_NONE when padding is detected. After it, we need to keep
> > > skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
> > > verify and parse IP headers, it does not worth the effort as the
> > > packets
> > > are so small that CHECKSUM_COMPLETE has no significant advantage.
> > >
> > > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> > > are friends")
> > > Cc: Eric Dumazet <edumazet@google.com>
> > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > > ---
> > >  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19 +++++++++++++++++-
> > > -
> > >  1 file changed, 17 insertions(+), 2 deletions(-)
> > >
> > > Hi Dave,
> > > Please queue for -stable >= v4.18.
> > >
> > > Thanks,
> > > Tariq
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > index 9a0881cb7f51..fc8a11444e1a 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
> > > hw_checksum, struct sk_buff *skb,
> > >  }
> > >  #endif
> > >
> > > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > > +
> > >  /* We reach this function only after checking that any of
> > >   * the (IPv4 | IPv6) bits are set in cqe->status.
> > >   */
> > > @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe *cqe,
> > > struct sk_buff *skb, void *va,
> > >                       netdev_features_t dev_features)
> > >  {
> > >         __wsum hw_checksum = 0;
> > > +       void *hdr;
> > > +
> > > +       /* CQE csum doesn't cover padding octets in short ethernet
> > > +        * frames. And the pad field is appended prior to
> > > calculating
> > > +        * and appending the FCS field.
> > > +        *
> > > +        * Detecting these padded frames requires to verify and
> > > parse
> > > +        * IP headers, so we simply force all those small frames to
> > > be
> > > +        * CHECKSUM_NONE even if they are not padded.
> > > +        * TODO: better if we report CHECKSUM_UNNECESSARY but this
> > > +        * demands some refactroing.
> >
> > This TODO: looks bogus to me.
> >
> > mlx4 driver first tries to use CHECKSUM_UNNECESSARY.
> >
> > if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
> >                                                   MLX4_CQE_STATUS_UDP
> > )) &&
> >      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
> >       cqe->checksum == cpu_to_be16(0xffff)) {
> >      ...
> >       ip_summed = CHECKSUM_UNNECESSARY;
> >       ...
> > }
> >
> > Then if this attempt failed, it tries to use CHECKSUM_COMPLETE (and
> > calls this check_sum() function)
> >
> > So there is no refactoring to be done for mlx4 : short
> > IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already
>
> not exactly, considering mlx4 weird condition to decide to do
> CHECKSUM_UNNECESSARY:
>
> if ((cqe csum fields are valid) && cqe->checksum ==
> cpu_to_be16(0xffff))
>      { do CHECKSUM_UNNECESSARY }
> else { do CHECKSUM_COMPLETE }
>
> CHECKSUM_UNNECESSARY will be skipped if csum complete field is valid
> i.e (cqe->checksum != 0xffff)
>
> but if checksum complete is to be skipped due to short_frame we want to
> go back to CHECKSUM_UNNECESSARY, this is the refactoring i am talking
> about :).
>
>
> I hope now it is clear..

Not really, since in case of short frame, cqe->checksum will be
0xffff, since mlx4 does not include the padding bytes in the checksum
in the first place.

(For native IPv4/IPV6 TCP/UDP frames that is)

>
>
> >
> > > +        */
> > > +       if (short_frame(skb->len))
> > > +               return -EINVAL;
> > >
> > > -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> > > -
> > > +       hdr = (u8 *)va + sizeof(struct ethhdr);
> > >         hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
> > >
> > >         if (cqe->vlan_my_qpn &
> > > cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> > > --
> > > 1.8.3.1
> > >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 19:07     ` Eric Dumazet
@ 2019-01-31 19:27       ` Saeed Mahameed
  2019-01-31 19:42         ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-01-31 19:27 UTC (permalink / raw)
  To: edumazet; +Cc: Eran Ben Elisha, davem, netdev, Tariq Toukan

On Thu, 2019-01-31 at 11:07 -0800, Eric Dumazet wrote:
> On Thu, Jan 31, 2019 at 11:04 AM Saeed Mahameed <saeedm@mellanox.com>
> wrote:
> > On Thu, 2019-01-31 at 07:02 -0800, Eric Dumazet wrote:
> > > On Thu, Jan 31, 2019 at 5:04 AM Tariq Toukan <tariqt@mellanox.com
> > > >
> > > wrote:
> > > > From: Saeed Mahameed <saeedm@mellanox.com>
> > > > 
> > > > When an ethernet frame is padded to meet the minimum ethernet
> > > > frame
> > > > size, the padding octets are not covered by the hardware
> > > > checksum.
> > > > Fortunately the padding octets are usually zero's, which don't
> > > > affect
> > > > checksum. However, it is not guaranteed. For example, switches
> > > > might
> > > > choose to make other use of these octets.
> > > > This repeatedly causes kernel hardware checksum fault.
> > > > 
> > > > Prior to the cited commit below, skb checksum was forced to be
> > > > CHECKSUM_NONE when padding is detected. After it, we need to
> > > > keep
> > > > skb->csum updated. However, fixing up CHECKSUM_COMPLETE
> > > > requires to
> > > > verify and parse IP headers, it does not worth the effort as
> > > > the
> > > > packets
> > > > are so small that CHECKSUM_COMPLETE has no significant
> > > > advantage.
> > > > 
> > > > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and
> > > > CHECKSUM_COMPLETE
> > > > are friends")
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > > > Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
> > > > ---
> > > >  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 19
> > > > +++++++++++++++++-
> > > > -
> > > >  1 file changed, 17 insertions(+), 2 deletions(-)
> > > > 
> > > > Hi Dave,
> > > > Please queue for -stable >= v4.18.
> > > > 
> > > > Thanks,
> > > > Tariq
> > > > 
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > index 9a0881cb7f51..fc8a11444e1a 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > > > @@ -617,6 +617,8 @@ static int get_fixed_ipv6_csum(__wsum
> > > > hw_checksum, struct sk_buff *skb,
> > > >  }
> > > >  #endif
> > > > 
> > > > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > > > +
> > > >  /* We reach this function only after checking that any of
> > > >   * the (IPv4 | IPv6) bits are set in cqe->status.
> > > >   */
> > > > @@ -624,9 +626,22 @@ static int check_csum(struct mlx4_cqe
> > > > *cqe,
> > > > struct sk_buff *skb, void *va,
> > > >                       netdev_features_t dev_features)
> > > >  {
> > > >         __wsum hw_checksum = 0;
> > > > +       void *hdr;
> > > > +
> > > > +       /* CQE csum doesn't cover padding octets in short
> > > > ethernet
> > > > +        * frames. And the pad field is appended prior to
> > > > calculating
> > > > +        * and appending the FCS field.
> > > > +        *
> > > > +        * Detecting these padded frames requires to verify and
> > > > parse
> > > > +        * IP headers, so we simply force all those small
> > > > frames to
> > > > be
> > > > +        * CHECKSUM_NONE even if they are not padded.
> > > > +        * TODO: better if we report CHECKSUM_UNNECESSARY but
> > > > this
> > > > +        * demands some refactroing.
> > > 
> > > This TODO: looks bogus to me.
> > > 
> > > mlx4 driver first tries to use CHECKSUM_UNNECESSARY.
> > > 
> > > if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
> > >                                                   MLX4_CQE_STATUS
> > > _UDP
> > > )) &&
> > >      (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
> > >       cqe->checksum == cpu_to_be16(0xffff)) {
> > >      ...
> > >       ip_summed = CHECKSUM_UNNECESSARY;
> > >       ...
> > > }
> > > 
> > > Then if this attempt failed, it tries to use CHECKSUM_COMPLETE
> > > (and
> > > calls this check_sum() function)
> > > 
> > > So there is no refactoring to be done for mlx4 : short
> > > IPv[46]+{UDP|TCP} frames get CHECKSUM_UNNECESSARY already
> > 
> > not exactly, considering mlx4 weird condition to decide to do
> > CHECKSUM_UNNECESSARY:
> > 
> > if ((cqe csum fields are valid) && cqe->checksum ==
> > cpu_to_be16(0xffff))
> >      { do CHECKSUM_UNNECESSARY }
> > else { do CHECKSUM_COMPLETE }
> > 
> > CHECKSUM_UNNECESSARY will be skipped if csum complete field is
> > valid
> > i.e (cqe->checksum != 0xffff)
> > 
> > but if checksum complete is to be skipped due to short_frame we
> > want to
> > go back to CHECKSUM_UNNECESSARY, this is the refactoring i am
> > talking
> > about :).
> > 
> > 
> > I hope now it is clear..
> 
> Not really, since in case of short frame, cqe->checksum will be
> 0xffff, since mlx4 does not include the padding bytes in the checksum
> in the first place.
> 

Are you sure ? you are claiming that the hardware will skip csum
complete i.e cqe->checksum will be 0xffff for padded short IP frames.
i don't think this is the case, the whole bug is that the hw does
provide a partial cqe->checksum (i.e doesn't included the padding
bytes) even for short eth frames.


> (For native IPv4/IPV6 TCP/UDP frames that is)
> 
> > 
> > > > +        */
> > > > +       if (short_frame(skb->len))
> > > > +               return -EINVAL;
> > > > 
> > > > -       void *hdr = (u8 *)va + sizeof(struct ethhdr);
> > > > -
> > > > +       hdr = (u8 *)va + sizeof(struct ethhdr);
> > > >         hw_checksum = csum_unfold((__force __sum16)cqe-
> > > > >checksum);
> > > > 
> > > >         if (cqe->vlan_my_qpn &
> > > > cpu_to_be32(MLX4_CQE_CVLAN_PRESENT_MASK) &&
> > > > --
> > > > 1.8.3.1
> > > > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 19:27       ` Saeed Mahameed
@ 2019-01-31 19:42         ` Eric Dumazet
  2019-02-01  0:05           ` Saeed Mahameed
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2019-01-31 19:42 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: Eran Ben Elisha, davem, netdev, Tariq Toukan

On Thu, Jan 31, 2019 at 11:27 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> Are you sure ? you are claiming that the hardware will skip csum
> complete i.e cqe->checksum will be 0xffff for padded short IP frames.
> i don't think this is the case, the whole bug is that the hw does
> provide a partial cqe->checksum (i.e doesn't included the padding
> bytes) even for short eth frames.

If the padding is not included, then cqe->checksum is 0xFFFF for
correctly received frames.

Otherwise, what would be cqe->checksum in this case ? A random value ?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-01-31 19:42         ` Eric Dumazet
@ 2019-02-01  0:05           ` Saeed Mahameed
  2019-02-06 20:15             ` Saeed Mahameed
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-01  0:05 UTC (permalink / raw)
  To: edumazet; +Cc: Eran Ben Elisha, davem, netdev, Tariq Toukan

On Thu, 2019-01-31 at 11:42 -0800, Eric Dumazet wrote:
> On Thu, Jan 31, 2019 at 11:27 AM Saeed Mahameed <saeedm@mellanox.com>
> wrote:
> > Are you sure ? you are claiming that the hardware will skip csum
> > complete i.e cqe->checksum will be 0xffff for padded short IP
> > frames.
> > i don't think this is the case, the whole bug is that the hw does
> > provide a partial cqe->checksum (i.e doesn't included the padding
> > bytes) even for short eth frames.
> 
> If the padding is not included, then cqe->checksum is 0xFFFF for
> correctly received frames.
> 
> Otherwise, what would be cqe->checksum in this case ? A random value
> ?

the actual checksum of IP headers+IP payload, while ignoring the
padding bytes, which is the bug, let me double check..



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-02-01  0:05           ` Saeed Mahameed
@ 2019-02-06 20:15             ` Saeed Mahameed
  2019-02-06 20:17               ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 20:15 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: edumazet, Eran Ben Elisha, davem, netdev, Tariq Toukan

On Thu, Jan 31, 2019 at 4:06 PM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Thu, 2019-01-31 at 11:42 -0800, Eric Dumazet wrote:
> > On Thu, Jan 31, 2019 at 11:27 AM Saeed Mahameed <saeedm@mellanox.com>
> > wrote:
> > > Are you sure ? you are claiming that the hardware will skip csum
> > > complete i.e cqe->checksum will be 0xffff for padded short IP
> > > frames.
> > > i don't think this is the case, the whole bug is that the hw does
> > > provide a partial cqe->checksum (i.e doesn't included the padding
> > > bytes) even for short eth frames.
> >
> > If the padding is not included, then cqe->checksum is 0xFFFF for
> > correctly received frames.
> >
> > Otherwise, what would be cqe->checksum in this case ? A random value
> > ?
>
> the actual checksum of IP headers+IP payload, while ignoring the
> padding bytes, which is the bug, let me double check..
>
>

Ok, just verified, csum complete (cqe->checksum) is provided and valid
for non-TCP/UDP ip packets, (on specific ConnectX3 HWs)
e.g. ICMP packets or IP fragments go through csum complete,  that
being said, looking at the code before my patch.
when cqe->checksum complete is not valid a IP non-TCP/UDP packet will
report csum NONE, which means my
TODO comment is valid even without my patch :).

We can remove the TODO, although i am fine with it if it kept there,
since it is valid,
but we must add a future optimization task (to tariq's backlog ;)) for
IP non-TCP/UDP traffic to check for
csum unnecessary when csum complete is not an option.

Thanks,
Saeed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-02-06 20:15             ` Saeed Mahameed
@ 2019-02-06 20:17               ` Eric Dumazet
  2019-02-06 20:23                 ` Saeed Mahameed
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2019-02-06 20:17 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Saeed Mahameed, Eran Ben Elisha, davem, netdev, Tariq Toukan

On Wed, Feb 6, 2019 at 12:15 PM Saeed Mahameed
<saeedm@dev.mellanox.co.il> wrote:
>
> Ok, just verified, csum complete (cqe->checksum) is provided and valid
> for non-TCP/UDP ip packets, (on specific ConnectX3 HWs)
> e.g. ICMP packets or IP fragments go through csum complete,  that
> being said, looking at the code before my patch.
> when cqe->checksum complete is not valid a IP non-TCP/UDP packet will
> report csum NONE, which means my
> TODO comment is valid even without my patch :).
>
> We can remove the TODO, although i am fine with it if it kept there,
> since it is valid,
> but we must add a future optimization task (to tariq's backlog ;)) for
> IP non-TCP/UDP traffic to check for
> csum unnecessary when csum complete is not an option.
>
>

Thanks for double checking.
You might add more details in the changelog for future generations ;)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
  2019-02-06 20:17               ` Eric Dumazet
@ 2019-02-06 20:23                 ` Saeed Mahameed
  0 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-06 20:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Saeed Mahameed, Eran Ben Elisha, davem, netdev, Tariq Toukan

On Wed, Feb 6, 2019 at 12:17 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Feb 6, 2019 at 12:15 PM Saeed Mahameed
> <saeedm@dev.mellanox.co.il> wrote:
> >
> > Ok, just verified, csum complete (cqe->checksum) is provided and valid
> > for non-TCP/UDP ip packets, (on specific ConnectX3 HWs)
> > e.g. ICMP packets or IP fragments go through csum complete,  that
> > being said, looking at the code before my patch.
> > when cqe->checksum complete is not valid a IP non-TCP/UDP packet will
> > report csum NONE, which means my
> > TODO comment is valid even without my patch :).
> >
> > We can remove the TODO, although i am fine with it if it kept there,
> > since it is valid,
> > but we must add a future optimization task (to tariq's backlog ;)) for
> > IP non-TCP/UDP traffic to check for
> > csum unnecessary when csum complete is not an option.
> >
> >
>
> Thanks for double checking.
> You might add more details in the changelog for future generations ;)

Great, I will do that, we will post V2,

Thank you Eric.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-02-06 20:23 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-31 13:02 [PATCH net] net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames Tariq Toukan
2019-01-31 15:02 ` Eric Dumazet
2019-01-31 19:04   ` Saeed Mahameed
2019-01-31 19:07     ` Eric Dumazet
2019-01-31 19:27       ` Saeed Mahameed
2019-01-31 19:42         ` Eric Dumazet
2019-02-01  0:05           ` Saeed Mahameed
2019-02-06 20:15             ` Saeed Mahameed
2019-02-06 20:17               ` Eric Dumazet
2019-02-06 20:23                 ` Saeed Mahameed
2019-01-31 17:38 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).