netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]   ` <alpine.LRH.2.00.1201261642340.31408-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
@ 2012-01-30  4:36     ` Roland Dreier
       [not found]       ` <CAL1RGDUm8ROxFFMa+D1ZD5jF+cK+kV8aEVzspgnZFNXeuai+fA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2012-02-02 21:58     ` Or Gerlitz
  1 sibling, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-01-30  4:36 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 26, 2012 at 6:43 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> From: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> The GRO flow makes a check in every layer to ensure the packets
> are actually merged only if they match at all layers.
>
> The first GRO check, at L2 always fails for IPoIB, since it assumes
> that all packets have 14 bytes of Ethernet link layer header. Using the
> IPoIB header will not help here either, since its only four bytes. To
> overcome this, the skb mac header pointer is set to an area within the
> packet IB GRH headroom, such that later, the L2 check done by GRO
> succeeds and it can move to checks at the network and transport layers.

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> index 4115be5..89cfaf7 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -286,10 +287,20 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>        else
>                skb->pkt_type = PACKET_MULTICAST;
>
> -       skb_pull(skb, IB_GRH_BYTES);
> +       /*
> +        * GRO first does L2 compares (14 bytes). We must not let it start from
> +        * the IPoIB header as ten octets of the IP header, containing fields
> +        * which vary from packet to packet will cause non-merging of packets.
> +        * from the same TCP stream.
> +        */
> +       psgid = skb_pull(skb, offsetof(struct ib_grh, sgid));
> +       /* if there's no GRH, that area could contain random data */
> +       if (!(wc->wc_flags & IB_WC_GRH))
> +               memset(psgid, 0, 16);
> +       skb_reset_mac_header(skb);
> +       skb_pull(skb, IB_GRH_BYTES - offsetof(struct ib_grh, sgid));
>
>        skb->protocol = ((struct ipoib_header *) skb->data)->proto;
> -       skb_reset_mac_header(skb);

This seems like a really weird place to fix this.  Wouldn't it
make more sense to fix the GRO check to handle non-ethernet
L2 headers?

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]       ` <CAL1RGDUm8ROxFFMa+D1ZD5jF+cK+kV8aEVzspgnZFNXeuai+fA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-01-30  7:44         ` Shlomo Pongratz
       [not found]           ` <36F7E4A28C18BE4DB7C86058E7B607240BE9687A-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>
  2012-01-30  7:44         ` Or Gerlitz
  1 sibling, 1 reply; 70+ messages in thread
From: Shlomo Pongratz @ 2012-01-30  7:44 UTC (permalink / raw)
  To: Roland Dreier, Or Gerlitz; +Cc: linux-rdma, netdev-u79uwXL29TY76Z2rM5mHXA

Hi Roland,

I see this fix as part of a more general fix, one that will enable more protocols (e.g. X25, PPP) to enjoy the benefit of GRO.
I think that the more general fix should include adding a MAC comparison function to the header_ops structure in netdev.h.
Then network drivers which are not Ethernet will register their own comparison function, and the GRO code will check if such routine exists and if so will use it.
The problem with this approach is that until such a major fix is integrated the IPoIB performance will suffer.
This simple fix will enable us to use the benefit of GRO until the more general fix is integrated.

Best regards,

S.P.
 

-----Original Message-----
From: roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org [mailto:roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org] On Behalf Of Roland Dreier
Sent: Monday, January 30, 2012 6:44 AM
To: Or Gerlitz
Cc: linux-rdma; Shlomo Pongratz; netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams

On Thu, Jan 26, 2012 at 6:43 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> From: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> The GRO flow makes a check in every layer to ensure the packets are 
> actually merged only if they match at all layers.
>
> The first GRO check, at L2 always fails for IPoIB, since it assumes 
> that all packets have 14 bytes of Ethernet link layer header. Using 
> the IPoIB header will not help here either, since its only four bytes. 
> To overcome this, the skb mac header pointer is set to an area within 
> the packet IB GRH headroom, such that later, the L2 check done by GRO 
> succeeds and it can move to checks at the network and transport layers.

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
> b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> index 4115be5..89cfaf7 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -286,10 +287,20 @@ static void ipoib_ib_handle_rx_wc(struct 
> net_device *dev, struct ib_wc *wc)
>        else
>                skb->pkt_type = PACKET_MULTICAST;
>
> -       skb_pull(skb, IB_GRH_BYTES);
> +       /*
> +        * GRO first does L2 compares (14 bytes). We must not let it 
> + start from
> +        * the IPoIB header as ten octets of the IP header, containing 
> + fields
> +        * which vary from packet to packet will cause non-merging of packets.
> +        * from the same TCP stream.
> +        */
> +       psgid = skb_pull(skb, offsetof(struct ib_grh, sgid));
> +       /* if there's no GRH, that area could contain random data */
> +       if (!(wc->wc_flags & IB_WC_GRH))
> +               memset(psgid, 0, 16);
> +       skb_reset_mac_header(skb);
> +       skb_pull(skb, IB_GRH_BYTES - offsetof(struct ib_grh, sgid));
>
>        skb->protocol = ((struct ipoib_header *) skb->data)->proto;
> -       skb_reset_mac_header(skb);

This seems like a really weird place to fix this.  Wouldn't it make more sense to fix the GRO check to handle non-ethernet
L2 headers?

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]       ` <CAL1RGDUm8ROxFFMa+D1ZD5jF+cK+kV8aEVzspgnZFNXeuai+fA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2012-01-30  7:44         ` Shlomo Pongratz
@ 2012-01-30  7:44         ` Or Gerlitz
       [not found]           ` <4F264A6C.3070706-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-01-30  7:44 UTC (permalink / raw)
  To: Roland Dreier, Herbert Xu, davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

On 1/30/2012 6:36 AM, Roland Dreier wrote:
> On Thu, Jan 26, 2012 at 6:43 AM, Or Gerlitz<ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>  wrote:
>> The GRO flow makes a check in every layer to ensure the packets
>> are actually merged only if they match at all layers.
>>
>> The first GRO check, at L2 always fails for IPoIB, since it assumes
>> that all packets have 14 bytes of Ethernet link layer header. Using the
>> IPoIB header will not help here either, since its only four bytes. To
>> overcome this, the skb mac header pointer is set to an area within the
>> packet IB GRH headroom, such that later, the L2 check done by GRO
>> succeeds and it can move to checks at the network and transport layers.
>
>> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>> @@ -286,10 +287,20 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>>         else
>>                 skb->pkt_type = PACKET_MULTICAST;
>>
>> -       skb_pull(skb, IB_GRH_BYTES);
>> +       /*
>> +        * GRO first does L2 compares (14 bytes). We must not let it start from
>> +        * the IPoIB header as ten octets of the IP header, containing fields
>> +        * which vary from packet to packet will cause non-merging of packets.
>> +        * from the same TCP stream.
>> +        */
>> +       psgid = skb_pull(skb, offsetof(struct ib_grh, sgid));
>> +       /* if there's no GRH, that area could contain random data */
>> +       if (!(wc->wc_flags&  IB_WC_GRH))
>> +               memset(psgid, 0, 16);
>> +       skb_reset_mac_header(skb);
>> +       skb_pull(skb, IB_GRH_BYTES - offsetof(struct ib_grh, sgid));
>>
>>         skb->protocol = ((struct ipoib_header *) skb->data)->proto;
>> -       skb_reset_mac_header(skb);
>
> This seems like a really weird place to fix this.  Wouldn't it
> make more sense to fix the GRO check to handle non-ethernet L2 headers?

Yes, we can do that as well. Herbert, Dave, would it be enough here, to 
skip the Ethernet header and vlan comparison for skbs whose associated 
netdevice type isn't ARPHRD_ETHER? e.g something along the lines of:

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 115dee1..c529f5a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3505,9 +3505,11 @@ __napi_gro_receive(struct napi_struct *napi, 
> struct sk_buff *skb)
>                 unsigned long diffs;
>
>                 diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
> -               diffs |= p->vlan_tci ^ skb->vlan_tci;
> -               diffs |= compare_ether_header(skb_mac_header(p),
> -                                             skb_gro_mac_header(skb));
> +               if (!diffs && p->dev->type == ARPHRD_ETHER) {
> +                       diffs |= p->vlan_tci ^ skb->vlan_tci;
> +                       diffs |= compare_ether_header(skb_mac_header(p),
> +                                                     
> skb_gro_mac_header(skb));
> +               }
>                 NAPI_GRO_CB(p)->same_flow = !diffs;
>                 NAPI_GRO_CB(p)->flush = 0;
>         }

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]           ` <4F264A6C.3070706-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-01-30  8:04             ` Eric Dumazet
  2012-01-30  8:11               ` Or Gerlitz
                                 ` (2 more replies)
  0 siblings, 3 replies; 70+ messages in thread
From: Eric Dumazet @ 2012-01-30  8:04 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Roland Dreier, Herbert Xu, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

Le lundi 30 janvier 2012 à 09:44 +0200, Or Gerlitz a écrit :
> On 1/30/2012 6:36 AM, Roland Dreier wrote:
> > On Thu, Jan 26, 2012 at 6:43 AM, Or Gerlitz<ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>  wrote:
> >> The GRO flow makes a check in every layer to ensure the packets
> >> are actually merged only if they match at all layers.
> >>
> >> The first GRO check, at L2 always fails for IPoIB, since it assumes
> >> that all packets have 14 bytes of Ethernet link layer header. Using the
> >> IPoIB header will not help here either, since its only four bytes. To
> >> overcome this, the skb mac header pointer is set to an area within the
> >> packet IB GRH headroom, such that later, the L2 check done by GRO
> >> succeeds and it can move to checks at the network and transport layers.
> >
> >> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> >> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> >> @@ -286,10 +287,20 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
> >>         else
> >>                 skb->pkt_type = PACKET_MULTICAST;
> >>
> >> -       skb_pull(skb, IB_GRH_BYTES);
> >> +       /*
> >> +        * GRO first does L2 compares (14 bytes). We must not let it start from
> >> +        * the IPoIB header as ten octets of the IP header, containing fields
> >> +        * which vary from packet to packet will cause non-merging of packets.
> >> +        * from the same TCP stream.
> >> +        */
> >> +       psgid = skb_pull(skb, offsetof(struct ib_grh, sgid));
> >> +       /* if there's no GRH, that area could contain random data */
> >> +       if (!(wc->wc_flags&  IB_WC_GRH))
> >> +               memset(psgid, 0, 16);
> >> +       skb_reset_mac_header(skb);
> >> +       skb_pull(skb, IB_GRH_BYTES - offsetof(struct ib_grh, sgid));
> >>
> >>         skb->protocol = ((struct ipoib_header *) skb->data)->proto;
> >> -       skb_reset_mac_header(skb);
> >
> > This seems like a really weird place to fix this.  Wouldn't it
> > make more sense to fix the GRO check to handle non-ethernet L2 headers?
> 
> Yes, we can do that as well. Herbert, Dave, would it be enough here, to 
> skip the Ethernet header and vlan comparison for skbs whose associated 
> netdevice type isn't ARPHRD_ETHER? e.g something along the lines of:
> 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 115dee1..c529f5a 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3505,9 +3505,11 @@ __napi_gro_receive(struct napi_struct *napi, 
> > struct sk_buff *skb)
> >                 unsigned long diffs;
> >
> >                 diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
> > -               diffs |= p->vlan_tci ^ skb->vlan_tci;
> > -               diffs |= compare_ether_header(skb_mac_header(p),
> > -                                             skb_gro_mac_header(skb));
> > +               if (!diffs && p->dev->type == ARPHRD_ETHER) {
> > +                       diffs |= p->vlan_tci ^ skb->vlan_tci;
> > +                       diffs |= compare_ether_header(skb_mac_header(p),
> > +                                                     
> > skb_gro_mac_header(skb));
> > +               }
> >                 NAPI_GRO_CB(p)->same_flow = !diffs;
> >                 NAPI_GRO_CB(p)->flush = 0;

Hmm, do we really need to compare ether header, thats the question.

IMHO, GRO could avoid this check, as legal trafic could be never merged
(eg multipath)





--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-01-30  8:04             ` Eric Dumazet
@ 2012-01-30  8:11               ` Or Gerlitz
  2012-01-30  8:18               ` Herbert Xu
  2012-01-30  8:25               ` Michał Mirosław
  2 siblings, 0 replies; 70+ messages in thread
From: Or Gerlitz @ 2012-01-30  8:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Roland Dreier, Herbert Xu, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

On 1/30/2012 10:04 AM, Eric Dumazet wrote:
> Hmm, do we really need to compare ether header, thats the question. 
> IMHO, GRO could avoid this check, as legal trafic could be never 
> merged (eg multipath) 

Yep, basically I tend to agree that GRO could do very well with L3 and 
L4 checks, the multipath comment is interesting use case to show why the 
ethernet header comparison might be too strict here.


Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-01-30  8:04             ` Eric Dumazet
  2012-01-30  8:11               ` Or Gerlitz
@ 2012-01-30  8:18               ` Herbert Xu
       [not found]                 ` <20120130081849.GA7848-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  2012-01-30  8:25               ` Michał Mirosław
  2 siblings, 1 reply; 70+ messages in thread
From: Herbert Xu @ 2012-01-30  8:18 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Or Gerlitz, Roland Dreier, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Jan 30, 2012 at 09:04:32AM +0100, Eric Dumazet wrote:
>
> Hmm, do we really need to compare ether header, thats the question.

I think we do.  As otherwise macvlan would break.

Cheers,
-- 
Email: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-01-30  8:04             ` Eric Dumazet
  2012-01-30  8:11               ` Or Gerlitz
  2012-01-30  8:18               ` Herbert Xu
@ 2012-01-30  8:25               ` Michał Mirosław
  2 siblings, 0 replies; 70+ messages in thread
From: Michał Mirosław @ 2012-01-30  8:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Or Gerlitz, Roland Dreier, Herbert Xu,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, linux-rdma, Shlomo Pongratz,
	netdev-u79uwXL29TY76Z2rM5mHXA

2012/1/30 Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
> Le lundi 30 janvier 2012 à 09:44 +0200, Or Gerlitz a écrit :
>> On 1/30/2012 6:36 AM, Roland Dreier wrote:
>> > On Thu, Jan 26, 2012 at 6:43 AM, Or Gerlitz<ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>  wrote:
>> >> The GRO flow makes a check in every layer to ensure the packets
>> >> are actually merged only if they match at all layers.
>> >>
>> >> The first GRO check, at L2 always fails for IPoIB, since it assumes
>> >> that all packets have 14 bytes of Ethernet link layer header. Using the
>> >> IPoIB header will not help here either, since its only four bytes. To
>> >> overcome this, the skb mac header pointer is set to an area within the
>> >> packet IB GRH headroom, such that later, the L2 check done by GRO
>> >> succeeds and it can move to checks at the network and transport layers.
>> >
>> >> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>> >> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>> >> @@ -286,10 +287,20 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>> >>         else
>> >>                 skb->pkt_type = PACKET_MULTICAST;
>> >>
>> >> -       skb_pull(skb, IB_GRH_BYTES);
>> >> +       /*
>> >> +        * GRO first does L2 compares (14 bytes). We must not let it start from
>> >> +        * the IPoIB header as ten octets of the IP header, containing fields
>> >> +        * which vary from packet to packet will cause non-merging of packets.
>> >> +        * from the same TCP stream.
>> >> +        */
>> >> +       psgid = skb_pull(skb, offsetof(struct ib_grh, sgid));
>> >> +       /* if there's no GRH, that area could contain random data */
>> >> +       if (!(wc->wc_flags&  IB_WC_GRH))
>> >> +               memset(psgid, 0, 16);
>> >> +       skb_reset_mac_header(skb);
>> >> +       skb_pull(skb, IB_GRH_BYTES - offsetof(struct ib_grh, sgid));
>> >>
>> >>         skb->protocol = ((struct ipoib_header *) skb->data)->proto;
>> >> -       skb_reset_mac_header(skb);
>> >
>> > This seems like a really weird place to fix this.  Wouldn't it
>> > make more sense to fix the GRO check to handle non-ethernet L2 headers?
>>
>> Yes, we can do that as well. Herbert, Dave, would it be enough here, to
>> skip the Ethernet header and vlan comparison for skbs whose associated
>> netdevice type isn't ARPHRD_ETHER? e.g something along the lines of:
>>
>> > diff --git a/net/core/dev.c b/net/core/dev.c
>> > index 115dee1..c529f5a 100644
>> > --- a/net/core/dev.c
>> > +++ b/net/core/dev.c
>> > @@ -3505,9 +3505,11 @@ __napi_gro_receive(struct napi_struct *napi,
>> > struct sk_buff *skb)
>> >                 unsigned long diffs;
>> >
>> >                 diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
>> > -               diffs |= p->vlan_tci ^ skb->vlan_tci;
>> > -               diffs |= compare_ether_header(skb_mac_header(p),
>> > -                                             skb_gro_mac_header(skb));
>> > +               if (!diffs && p->dev->type == ARPHRD_ETHER) {
>> > +                       diffs |= p->vlan_tci ^ skb->vlan_tci;
>> > +                       diffs |= compare_ether_header(skb_mac_header(p),
>> > +
>> > skb_gro_mac_header(skb));
>> > +               }
>> >                 NAPI_GRO_CB(p)->same_flow = !diffs;
>> >                 NAPI_GRO_CB(p)->flush = 0;
>
> Hmm, do we really need to compare ether header, thats the question.
>
> IMHO, GRO could avoid this check, as legal trafic could be never merged
> (eg multipath)

This would allow injecting data to the connection by other host on the
same LAN. GRO does coalescing before any L3 anti-spoofing checks (eg.
rpfilter) are done, doesn't it?

Best Regards,
Michał Mirosław
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                 ` <20120130081849.GA7848-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
@ 2012-01-30  8:53                   ` Eric Dumazet
  2012-01-30  8:57                     ` Herbert Xu
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-01-30  8:53 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Or Gerlitz, Roland Dreier, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

Le lundi 30 janvier 2012 à 19:18 +1100, Herbert Xu a écrit :
> On Mon, Jan 30, 2012 at 09:04:32AM +0100, Eric Dumazet wrote:
> >
> > Hmm, do we really need to compare ether header, thats the question.
> 
> I think we do.  As otherwise macvlan would break.
> 

Thats the theory yes, but practically ?

Really, GRO can merge two TCP frames given they match everything needed,
exactly as our TCP stack would do in the end.

What could be a normal workload where this mismatch of two different tcp
flows could happen with macvlan or any kind of devices ?

If this is an attack, TCP will merge the frames anyway on the same
socket.

Or should we add checks in TCP stack, in case GRO is off ?



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-01-30  8:53                   ` Eric Dumazet
@ 2012-01-30  8:57                     ` Herbert Xu
       [not found]                       ` <20120130085742.GA8262-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Herbert Xu @ 2012-01-30  8:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Or Gerlitz, Roland Dreier, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-rdma, Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Jan 30, 2012 at 09:53:03AM +0100, Eric Dumazet wrote:
>
> If this is an attack, TCP will merge the frames anyway on the same
> socket.

Not if we have a MAC-level filter in place or VLANs.

Cheers,
-- 
Email: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                       ` <20120130085742.GA8262-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
@ 2012-01-30 16:43                         ` David Miller
  2012-02-01  8:23                         ` Or Gerlitz
  1 sibling, 0 replies; 70+ messages in thread
From: David Miller @ 2012-01-30 16:43 UTC (permalink / raw)
  To: herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: Herbert Xu <herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw@public.gmane.org>
Date: Mon, 30 Jan 2012 19:57:42 +1100

> On Mon, Jan 30, 2012 at 09:53:03AM +0100, Eric Dumazet wrote:
>>
>> If this is an attack, TCP will merge the frames anyway on the same
>> socket.
> 
> Not if we have a MAC-level filter in place or VLANs.

Right.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]           ` <36F7E4A28C18BE4DB7C86058E7B607240BE9687A-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>
@ 2012-01-30 18:11             ` Roland Dreier
       [not found]               ` <CAL1RGDXjjQ-PhCv-9WJX45NuovC9XiS=_7507OsyvLW_gBaJ5g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-01-30 18:11 UTC (permalink / raw)
  To: Shlomo Pongratz; +Cc: Or Gerlitz, linux-rdma, netdev-u79uwXL29TY76Z2rM5mHXA

On Sun, Jan 29, 2012 at 11:44 PM, Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> I see this fix as part of a more general fix, one that will enable more protocols (e.g. X25, PPP) to enjoy the benefit of GRO.
> I think that the more general fix should include adding a MAC comparison function to the header_ops structure in netdev.h.
> Then network drivers which are not Ethernet will register their own comparison function, and the GRO code will check if such routine exists and if so will use it.
> The problem with this approach is that until such a major fix is integrated the IPoIB performance will suffer.
> This simple fix will enable us to use the benefit of GRO until the more general fix is integrated.

But is this general fix so difficult that we can't just do the right thing?

Realistically if we add this hack to ipoib then we'll have that ugly
memset in our fast path forever.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]               ` <CAL1RGDXjjQ-PhCv-9WJX45NuovC9XiS=_7507OsyvLW_gBaJ5g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-01-30 18:16                 ` Or Gerlitz
       [not found]                   ` <CAJZOPZLhgDysSASyMLNpOrmGzEyfyHAQjGLVei6ZNSFfb7TM1w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-01-30 18:16 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Shlomo Pongratz, Or Gerlitz, linux-rdma, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Jan 30, 2012 at 8:11 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> But is this general fix so difficult that we can't just do the right thing?

no one said its so difficult, its just not trivial, b/c of the many
use cases over which the GRO logic plays... lets see if the network
folks come with some idea/directive for us.


> Realistically if we add this hack to ipoib then we'll have that ugly
> memset in our fast path forever.

forever, why? once a clever fix is found, we can revert that memset, etc.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                   ` <CAJZOPZLhgDysSASyMLNpOrmGzEyfyHAQjGLVei6ZNSFfb7TM1w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-01-30 19:00                     ` Roland Dreier
  0 siblings, 0 replies; 70+ messages in thread
From: Roland Dreier @ 2012-01-30 19:00 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Shlomo Pongratz, Or Gerlitz, linux-rdma, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Jan 30, 2012 at 10:16 AM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> Realistically if we add this hack to ipoib then we'll have that ugly
>> memset in our fast path forever.
>
> forever, why? once a clever fix is found, we can revert that memset, etc.

Because no one is ever going to work on the right fix if we add a
bandaid to ipoib now.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                       ` <20120130085742.GA8262-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  2012-01-30 16:43                         ` David Miller
@ 2012-02-01  8:23                         ` Or Gerlitz
       [not found]                           ` <CAJZOPZJb1HvcS0XXKLvoDuoi1EfYTY-awwY2g0aHWoS=4qmdyQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-01  8:23 UTC (permalink / raw)
  To: Herbert Xu, Eric Dumazet
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma, Shlomo Pongratz,
	Roland Dreier

Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> wrote:
> Eric Dumazet wrote:

>> If this is an attack, TCP will merge the frames anyway on the same socket.

> Not if we have a MAC-level filter in place or VLANs.

Herbert, Eric

So what would you recommend here? not sure if you saw that, but Shlomo
suggested to add new entry to the header ops, compare_header e.g such
that if skb->dev has this callback use it instead of
compare_ether_header in __napi_gro_receive, makes sense?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                           ` <CAJZOPZJb1HvcS0XXKLvoDuoi1EfYTY-awwY2g0aHWoS=4qmdyQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-01  8:38                             ` Herbert Xu
       [not found]                               ` <20120201083837.GA7081-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Herbert Xu @ 2012-02-01  8:38 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Eric Dumazet, netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma,
	Shlomo Pongratz, Roland Dreier

On Wed, Feb 01, 2012 at 10:23:22AM +0200, Or Gerlitz wrote:
>
> So what would you recommend here? not sure if you saw that, but Shlomo
> suggested to add new entry to the header ops, compare_header e.g such
> that if skb->dev has this callback use it instead of
> compare_ether_header in __napi_gro_receive, makes sense?

If we just turn it into a memcmp with a variable length would
that work for you?

Cheers,
-- 
Email: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                               ` <20120201083837.GA7081-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
@ 2012-02-01  9:43                                 ` Or Gerlitz
  2012-02-01 14:07                                   ` Eric Dumazet
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-01  9:43 UTC (permalink / raw)
  To: Herbert Xu, Roland Dreier
  Cc: Eric Dumazet, netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma, Shlomo Pongratz

On Wed, Feb 1, 2012 at 10:38 AM, Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org> wrote:
> On Wed, Feb 01, 2012 at 10:23:22AM +0200, Or Gerlitz wrote:

>> So what would you recommend here? not sure if you saw that, but Shlomo
>> suggested to add new entry to the header ops, compare_header e.g such
>> that if skb->dev has this callback use it instead of
>> compare_ether_header in __napi_gro_receive, makes sense?

> If we just turn it into a memcmp with a variable length would
> that work for you?

I think yes, FWIW we will compare the IPoIB header, Roland is that okay for you?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-01  9:43                                 ` Or Gerlitz
@ 2012-02-01 14:07                                   ` Eric Dumazet
  2012-02-02 14:01                                     ` Or Gerlitz
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-01 14:07 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Herbert Xu, Roland Dreier, netdev, linux-rdma, Shlomo Pongratz

Le mercredi 01 février 2012 à 11:43 +0200, Or Gerlitz a écrit :
> On Wed, Feb 1, 2012 at 10:38 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > On Wed, Feb 01, 2012 at 10:23:22AM +0200, Or Gerlitz wrote:
> 
> >> So what would you recommend here? not sure if you saw that, but Shlomo
> >> suggested to add new entry to the header ops, compare_header e.g such
> >> that if skb->dev has this callback use it instead of
> >> compare_ether_header in __napi_gro_receive, makes sense?
> 
> > If we just turn it into a memcmp with a variable length would
> > that work for you?
> 
> I think yes, FWIW we will compare the IPoIB header, Roland is that okay for you?

A memcmp(xxx, yyy, variable_len) will be out of line and slow, its a bit
sad ...

Are skb_mac_header(p) / skb_gro_mac_header(skb) going to point to IPoIB
header ?

Maybe we can keep a fastpath for ethernet case...
(the "if (hlen == ETH_HLEN) being always predicted)

Maybe need to introduce gro_hard_header_len as well)

diff --git a/net/core/dev.c b/net/core/dev.c
index 115dee1..62abee4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3500,14 +3500,20 @@ static inline gro_result_t
 __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	unsigned int hlen = skb->dev->hard_header_len;
 
 	for (p = napi->gro_list; p; p = p->next) {
 		unsigned long diffs;
 
 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
 		diffs |= p->vlan_tci ^ skb->vlan_tci;
-		diffs |= compare_ether_header(skb_mac_header(p),
-					      skb_gro_mac_header(skb));
+		if (hlen == ETH_HLEN)
+			diffs |= compare_ether_header(skb_mac_header(p),
+						      skb_gro_mac_header(skb));
+		else if (!diffs)
+			diffs = memcmp(skb_mac_header(p),
+					skb_gro_mac_header(skb),
+					skb->dev->hard_header_len);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 		NAPI_GRO_CB(p)->flush = 0;
 	}

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-01 14:07                                   ` Eric Dumazet
@ 2012-02-02 14:01                                     ` Or Gerlitz
       [not found]                                       ` <4F2A974B.209-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-02 14:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Herbert Xu, Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma, Shlomo Pongratz

On 2/1/2012 4:07 PM, Eric Dumazet wrote:
>>> If we just turn it into a memcmp with a variable length would
>>> that work for you?
>>
>> I think yes, FWIW we will compare the IPoIB header, Roland is that okay for you?
>
> A memcmp(xxx, yyy, variable_len) will be out of line and slow, its a bit sad ... Are skb_mac_header(p) / skb_gro_mac_header(skb) going to point to IPoIB header ?

yes, both skb_mac_header(p) / skb_gro_mac_header(skb) point to IPoIB 
header, however (see next)


> Maybe we can keep a fastpath for ethernet case... (the "if (hlen == ETH_HLEN) being always predicted)
> Maybe need to introduce gro_hard_header_len as well)

today, IPoIB advertizes hard_header_len which is bigger than the IPoIB 
header len, this is done such that skbs sent by the network stack have 
enough headroom for a "pseudoheader" which for few flows (e.g unicast 
arp replies and multicast) is placed there by the ipoib hard_header 
function and later used by the xmit function.

So we can either try and change that, such that hard_header_len will be 
equal to the ipoib header len or add gro_hard_header_len as you 
suggested, any preferences?

Or.

Or.




>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 115dee1..62abee4 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3500,14 +3500,20 @@ static inline gro_result_t
>   __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
>   {
>   	struct sk_buff *p;
> +	unsigned int hlen = skb->dev->hard_header_len;
>
>   	for (p = napi->gro_list; p; p = p->next) {
>   		unsigned long diffs;
>
>   		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
>   		diffs |= p->vlan_tci ^ skb->vlan_tci;
> -		diffs |= compare_ether_header(skb_mac_header(p),
> -					      skb_gro_mac_header(skb));
> +		if (hlen == ETH_HLEN)
> +			diffs |= compare_ether_header(skb_mac_header(p),
> +						      skb_gro_mac_header(skb));
> +		else if (!diffs)
> +			diffs = memcmp(skb_mac_header(p),
> +					skb_gro_mac_header(skb),
> +					skb->dev->hard_header_len);
>   		NAPI_GRO_CB(p)->same_flow = !diffs;
>   		NAPI_GRO_CB(p)->flush = 0;
>   	}
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                                       ` <4F2A974B.209-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-02-02 14:38                                         ` Eric Dumazet
  2012-02-02 14:44                                           ` Eric Dumazet
  2012-02-02 15:43                                           ` Or Gerlitz
  0 siblings, 2 replies; 70+ messages in thread
From: Eric Dumazet @ 2012-02-02 14:38 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Herbert Xu, Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma, Shlomo Pongratz

Le jeudi 02 février 2012 à 16:01 +0200, Or Gerlitz a écrit :
> On 2/1/2012 4:07 PM, Eric Dumazet wrote:
> >>> If we just turn it into a memcmp with a variable length would
> >>> that work for you?
> >>
> >> I think yes, FWIW we will compare the IPoIB header, Roland is that okay for you?
> >
> > A memcmp(xxx, yyy, variable_len) will be out of line and slow, its a bit sad ... Are skb_mac_header(p) / skb_gro_mac_header(skb) going to point to IPoIB header ?
> 
> yes, both skb_mac_header(p) / skb_gro_mac_header(skb) point to IPoIB 
> header, however (see next)
> 
> 
> > Maybe we can keep a fastpath for ethernet case... (the "if (hlen == ETH_HLEN) being always predicted)
> > Maybe need to introduce gro_hard_header_len as well)
> 
> today, IPoIB advertizes hard_header_len which is bigger than the IPoIB 
> header len, this is done such that skbs sent by the network stack have 
> enough headroom for a "pseudoheader" which for few flows (e.g unicast 
> arp replies and multicast) is placed there by the ipoib hard_header 
> function and later used by the xmit function.
> 
> So we can either try and change that, such that hard_header_len will be 
> equal to the ipoib header len or add gro_hard_header_len as you 
> suggested, any preferences?

I guess changing hard_header_len might be difficult (for you).

Adding gro_mac_header_len sounds the easy way.

[ You'll need to set it in your device setup() ]




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-02 14:38                                         ` Eric Dumazet
@ 2012-02-02 14:44                                           ` Eric Dumazet
  2012-02-02 21:42                                             ` Or Gerlitz
  2012-02-02 15:43                                           ` Or Gerlitz
  1 sibling, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-02 14:44 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Herbert Xu, Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma, Shlomo Pongratz

Le jeudi 02 février 2012 à 15:38 +0100, Eric Dumazet a écrit :

> I guess changing hard_header_len might be difficult (for you).
> 
> Adding gro_mac_header_len sounds the easy way.
> 
> [ You'll need to set it in your device setup() ]
> 

Something like :

 include/linux/netdevice.h |    2 +-
 net/core/dev.c            |   11 +++++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0eac07c..99b12c4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1095,7 +1095,7 @@ struct net_device {
 	unsigned int		mtu;	/* interface MTU value		*/
 	unsigned short		type;	/* interface hardware type	*/
 	unsigned short		hard_header_len;	/* hardware hdr length	*/
-
+	unsigned int		gro_mac_header_len;
 	/* extra head- and tailroom the hardware may need, but not in all cases
 	 * can this be guaranteed, especially tailroom. Some cases also use
 	 * LL_MAX_HEADER instead to allocate the skb.
diff --git a/net/core/dev.c b/net/core/dev.c
index f124947..a3af7bc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3491,14 +3491,20 @@ static inline gro_result_t
 __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	unsigned int maclen = skb->dev->gro_mac_header_len;
 
 	for (p = napi->gro_list; p; p = p->next) {
 		unsigned long diffs;
 
 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
 		diffs |= p->vlan_tci ^ skb->vlan_tci;
-		diffs |= compare_ether_header(skb_mac_header(p),
-					      skb_gro_mac_header(skb));
+		if (maclen == ETH_HLEN)
+			diffs |= compare_ether_header(skb_mac_header(p),
+						      skb_gro_mac_header(skb));
+		else if (!diffs)
+			diffs = memcmp(skb_mac_header(p),
+					skb_gro_mac_header(skb),
+					maclen);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 		NAPI_GRO_CB(p)->flush = 0;
 	}
@@ -5962,6 +5968,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
 	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->gro_mac_header_len = ETH_HLEN;
 	setup(dev);
 
 	dev->num_tx_queues = txqs;


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-02 14:38                                         ` Eric Dumazet
  2012-02-02 14:44                                           ` Eric Dumazet
@ 2012-02-02 15:43                                           ` Or Gerlitz
  1 sibling, 0 replies; 70+ messages in thread
From: Or Gerlitz @ 2012-02-02 15:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Herbert Xu, Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma, Shlomo Pongratz

On 2/2/2012 4:38 PM, Eric Dumazet wrote:
>>> Maybe we can keep a fastpath for ethernet case... (the "if (hlen == ETH_HLEN) being always predicted) Maybe need to introduce gro_hard_header_len as well)
>>
>> today, IPoIB advertizes hard_header_len which is bigger than the IPoIB
>> header len, this is done such that skbs sent by the network stack have
>> enough headroom for a "pseudoheader" which for few flows (e.g unicast
>> arp replies and multicast) is placed there by the ipoib hard_header
>> function and later used by the xmit function.
>>
>> So we can either try and change that, such that hard_header_len will be equal to the ipoib header len or add gro_hard_header_len as you  suggested, any preferences?
>
> I guess changing hard_header_len might be difficult (for you).
> Adding gro_mac_header_len sounds the easy way. [ You'll need to set it in your device setup() ]


Eric, yep, I'm okay with adding gro_mac_header_len, such that if the 
device setup function doesn't touch it, the core sets it to 
hard_header_len and later the gro code uses that len in the comparison 
you have placed in your patch.  The ipoib setup code will set that field 
to the IPoIB header size. If you enhance a bit your patch to include 
that I could do the testing or you prefer me to do that? do you think we 
need two or one patch here?

Or.



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-02 14:44                                           ` Eric Dumazet
@ 2012-02-02 21:42                                             ` Or Gerlitz
  0 siblings, 0 replies; 70+ messages in thread
From: Or Gerlitz @ 2012-02-02 21:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-rdma

On Thu, Feb 2, 2012 at 4:44 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > I guess changing hard_header_len might be difficult (for you).
> > Adding gro_mac_header_len sounds the easy way.
> > [ You'll need to set it in your device setup() ]

> Something like :

Hi Eric,

I will test this patch along with setting gro_mac_header_len in ipoib
and let you know

Or.

>
>  include/linux/netdevice.h |    2 +-
>  net/core/dev.c            |   11 +++++++++--
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 0eac07c..99b12c4 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1095,7 +1095,7 @@ struct net_device {
>        unsigned int            mtu;    /* interface MTU value          */
>        unsigned short          type;   /* interface hardware type      */
>        unsigned short          hard_header_len;        /* hardware hdr length  */
> -
> +       unsigned int            gro_mac_header_len;
>        /* extra head- and tailroom the hardware may need, but not in all cases
>         * can this be guaranteed, especially tailroom. Some cases also use
>         * LL_MAX_HEADER instead to allocate the skb.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f124947..a3af7bc 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3491,14 +3491,20 @@ static inline gro_result_t
>  __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
>  {
>        struct sk_buff *p;
> +       unsigned int maclen = skb->dev->gro_mac_header_len;
>
>        for (p = napi->gro_list; p; p = p->next) {
>                unsigned long diffs;
>
>                diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
>                diffs |= p->vlan_tci ^ skb->vlan_tci;
> -               diffs |= compare_ether_header(skb_mac_header(p),
> -                                             skb_gro_mac_header(skb));
> +               if (maclen == ETH_HLEN)
> +                       diffs |= compare_ether_header(skb_mac_header(p),
> +                                                     skb_gro_mac_header(skb));
> +               else if (!diffs)
> +                       diffs = memcmp(skb_mac_header(p),
> +                                       skb_gro_mac_header(skb),
> +                                       maclen);
>                NAPI_GRO_CB(p)->same_flow = !diffs;
>                NAPI_GRO_CB(p)->flush = 0;
>        }
> @@ -5962,6 +5968,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>        INIT_LIST_HEAD(&dev->unreg_list);
>        INIT_LIST_HEAD(&dev->link_watch_list);
>        dev->priv_flags = IFF_XMIT_DST_RELEASE;
> +       dev->gro_mac_header_len = ETH_HLEN;
>        setup(dev);
>
>        dev->num_tx_queues = txqs;
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]   ` <alpine.LRH.2.00.1201261642340.31408-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
  2012-01-30  4:36     ` [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams Roland Dreier
@ 2012-02-02 21:58     ` Or Gerlitz
       [not found]       ` <alpine.LRH.2.00.1202022352560.30300-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
  1 sibling, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-02 21:58 UTC (permalink / raw)
  To: Sean Hefty
  Cc: linux-rdma, Shlomo Pongratz, Roland Dreier,
	netdev-u79uwXL29TY76Z2rM5mHXA, Eric Dumazet, Herbert Xu

On Thu, 26 Jan 2012, Or Gerlitz wrote:
> From: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> The GRO flow makes a check in every layer to ensure the packets
> are actually merged only if they match at all layers.
> The first GRO check, at L2 always fails for IPoIB, since it assumes
> that all packets have 14 bytes of Ethernet link layer header.

Hi Sean, this is the patch posted earlier by Eric enhanced to
have IPoIB to set gro_mac_header_len to the ipoib header len, does
it make things to work okay for you?

Or.

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 3a6848d..397bb88 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1026,6 +1026,11 @@ static void ipoib_setup(struct net_device *dev)
 	 * address "pseudoheader" for skbs without neighbour struct.
 	 */
 	dev->hard_header_len	 = IPOIB_ENCAP_LEN + INFINIBAND_ALEN;
+	/*
+	 * We want the GRO code to conduct link layer comparisons only
+	 * on the IPoIB header.
+	 */
+	dev->gro_mac_header_len	 = IPOIB_ENCAP_LEN;
 	dev->addr_len		 = INFINIBAND_ALEN;
 	dev->type		 = ARPHRD_INFINIBAND;
 	dev->tx_queue_len	 = ipoib_sendq_size * 2;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0eac07c..99b12c4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1095,7 +1095,7 @@ struct net_device {
 	unsigned int		mtu;	/* interface MTU value		*/
 	unsigned short		type;	/* interface hardware type	*/
 	unsigned short		hard_header_len;	/* hardware hdr length	*/
-
+	unsigned int		gro_mac_header_len;
 	/* extra head- and tailroom the hardware may need, but not in all cases
 	 * can this be guaranteed, especially tailroom. Some cases also use
 	 * LL_MAX_HEADER instead to allocate the skb.
diff --git a/net/core/dev.c b/net/core/dev.c
index 115dee1..45738ef 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3500,14 +3500,20 @@ static inline gro_result_t
 __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	unsigned int maclen = skb->dev->gro_mac_header_len;

 	for (p = napi->gro_list; p; p = p->next) {
 		unsigned long diffs;

 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
 		diffs |= p->vlan_tci ^ skb->vlan_tci;
-		diffs |= compare_ether_header(skb_mac_header(p),
-					      skb_gro_mac_header(skb));
+		if (maclen == ETH_HLEN)
+			diffs |= compare_ether_header(skb_mac_header(p),
+						      skb_gro_mac_header(skb));
+		else if (!diffs)
+			diffs = memcmp(skb_mac_header(p),
+					skb_gro_mac_header(skb),
+					maclen);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 		NAPI_GRO_CB(p)->flush = 0;
 	}
@@ -5978,6 +5984,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
 	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->gro_mac_header_len = ETH_HLEN;
 	setup(dev);

 	dev->num_tx_queues = txqs;
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* RE: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]       ` <alpine.LRH.2.00.1202022352560.30300-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
@ 2012-02-03  7:18         ` Hefty, Sean
       [not found]           ` <1828884A29C6694DAF28B7E6B8A823733349A5D2-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Hefty, Sean @ 2012-02-03  7:18 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: linux-rdma, Shlomo Pongratz, Roland Dreier,
	netdev-u79uwXL29TY76Z2rM5mHXA, Eric Dumazet, Herbert Xu

> Hi Sean, this is the patch posted earlier by Eric enhanced to
> have IPoIB to set gro_mac_header_len to the ipoib header len, does
> it make things to work okay for you?

I ran a quick test.  It helps some, but the performance for streaming packets in one direction is still only 0.11 Gbps.  It should be closer to 10-15 Gbps for ipoib on my systems.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]           ` <1828884A29C6694DAF28B7E6B8A823733349A5D2-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-02-03  9:00             ` Eric Dumazet
  2012-02-03 20:24               ` Hefty, Sean
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-03  9:00 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Or Gerlitz, linux-rdma, Shlomo Pongratz, Roland Dreier,
	netdev-u79uwXL29TY76Z2rM5mHXA, Herbert Xu

Le vendredi 03 février 2012 à 07:18 +0000, Hefty, Sean a écrit :
> > Hi Sean, this is the patch posted earlier by Eric enhanced to
> > have IPoIB to set gro_mac_header_len to the ipoib header len, does
> > it make things to work okay for you?
> 
> I ran a quick test.  It helps some, but the performance for streaming packets in one direction is still only 0.11 Gbps.  It should be closer to 10-15 Gbps for ipoib on my systems.
> 
> - Sean

Hmm... a tcpdump would help to understand what is going on (for example
if GRO kicks in)



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-03  9:00             ` Eric Dumazet
@ 2012-02-03 20:24               ` Hefty, Sean
       [not found]                 ` <1828884A29C6694DAF28B7E6B8A823733349A6C9-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Hefty, Sean @ 2012-02-03 20:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Or Gerlitz, linux-rdma, Shlomo Pongratz, Roland Dreier,
	netdev-u79uwXL29TY76Z2rM5mHXA, Herbert Xu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 862 bytes --]

> > > Hi Sean, this is the patch posted earlier by Eric enhanced to
> > > have IPoIB to set gro_mac_header_len to the ipoib header len, does
> > > it make things to work okay for you?
> >
> > I ran a quick test.  It helps some, but the performance for streaming
> packets in one direction is still only 0.11 Gbps.  It should be closer to 10-
> 15 Gbps for ipoib on my systems.
> >
> > - Sean
> 
> Hmm... a tcpdump would help to understand what is going on (for example
> if GRO kicks in)

I should stop trying to apply patches so late.  I must have done something wrong with my quick test.  Re-applying the patch on 3.2, the ipoib UD performance looks okay to me with Eric's patch.  Sorry about the flub.

- Sean
N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±­ÙšŠ{ayº\x1dʇڙë,j\a­¢f£¢·hš‹»öì\x17/oSc¾™Ú³9˜uÀ¦æå‰È&jw¨®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿïêäz¹Þ–Šàþf£¢·hšˆ§~ˆmš

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                 ` <1828884A29C6694DAF28B7E6B8A823733349A6C9-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-02-06 15:05                   ` Or Gerlitz
       [not found]                     ` <4F2FEC36.6090800-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-06 15:05 UTC (permalink / raw)
  To: Hefty, Sean, Eric Dumazet, Herbert Xu
  Cc: linux-rdma, Shlomo Pongratz, Roland Dreier,
	netdev-u79uwXL29TY76Z2rM5mHXA

On 2/3/2012 10:24 PM, Hefty, Sean wrote:
>> Hmm... a tcpdump would help to understand what is going on (for example if GRO kicks in)
>
> I should stop trying to apply patches so late.  I must have done something wrong with my quick test.  Re-applying the patch on 3.2, the ipoib UD performance looks okay to me with Eric's patch.  Sorry about the flub.
>

Sean, same here, with both 3.2 and 3.3-rc1 I see with tcpdump that no 
GRO aggregation is done in the receiving side, when applying the patch I 
sent (Eric's patch enhanced to change ipoib code such that it sets 
gro_mac_header_lento the ipoib header len), I can see GRO aggregation 
and the performance goes well up. So how do we go from here? Herbert, 
Roland, are you okay with the patch?

Eric, if this we're going on that patch, could you please add 
Reported-by: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> to the change-log.

Or.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
       [not found]                     ` <4F2FEC36.6090800-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-02-06 15:21                       ` Eric Dumazet
  2012-02-06 15:22                         ` Or Gerlitz
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-06 15:21 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Hefty, Sean, Herbert Xu, linux-rdma, Shlomo Pongratz,
	Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA

Le lundi 06 février 2012 à 17:05 +0200, Or Gerlitz a écrit :
> On 2/3/2012 10:24 PM, Hefty, Sean wrote:
> >> Hmm... a tcpdump would help to understand what is going on (for example if GRO kicks in)
> >
> > I should stop trying to apply patches so late.  I must have done something wrong with my quick test.  Re-applying the patch on 3.2, the ipoib UD performance looks okay to me with Eric's patch.  Sorry about the flub.
> >
> 
> Sean, same here, with both 3.2 and 3.3-rc1 I see with tcpdump that no 
> GRO aggregation is done in the receiving side, when applying the patch I 
> sent (Eric's patch enhanced to change ipoib code such that it sets 
> gro_mac_header_lento the ipoib header len), I can see GRO aggregation 
> and the performance goes well up. So how do we go from here? Herbert, 
> Roland, are you okay with the patch?
> 
> Eric, if this we're going on that patch, could you please add 
> Reported-by: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> to the change-log.

Sure, I suggest two patches then.

One from me (the one I sent earlier), introducing the core
infrastructure, and one from you adding the "dev->gro_mac_header_len  =
IPOIB_ENCAP_LEN;" in drivers/infiniband/ulp/ipoib/ipoib_main.c

If you agree, I'll resend my part with official submission, and you'll
send your patch after ?



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams
  2012-02-06 15:21                       ` Eric Dumazet
@ 2012-02-06 15:22                         ` Or Gerlitz
       [not found]                           ` <4F2FF050.7040400-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-06 15:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hefty, Sean, Herbert Xu, linux-rdma, Shlomo Pongratz,
	Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA

On 2/6/2012 5:21 PM, Eric Dumazet wrote:
> Sure, I suggest two patches then. One from me (the one I sent 
> earlier), introducing the core infrastructure, and one from you adding 
> the "dev->gro_mac_header_len = IPOIB_ENCAP_LEN;" in 
> drivers/infiniband/ulp/ipoib/ipoib_main.c If you agree, I'll resend my 
> part with official submission, and you'll send your patch after ? 

sure, sounds excellent, lets do that

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH net-next] gro: introduce gro_mac_header_len
       [not found]                           ` <4F2FF050.7040400-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-02-06 16:27                             ` Eric Dumazet
  2012-02-06 16:31                               ` David Miller
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-06 16:27 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Hefty, Sean, Herbert Xu, linux-rdma, Shlomo Pongratz,
	Roland Dreier, netdev-u79uwXL29TY76Z2rM5mHXA

Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
only, and failed on IB/ipoib traffic.

He provided a patch faking a zeroed header to let GRO aggregates frames.

Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
check to be more generic.

This patch introduces a new netdevice field, gro_mac_header_len, giving
L2 header length, default to ETH_HLEN (14 bytes)

A device setup function can override this default value.

__napi_gro_receive() has special handling for the common case (Ethernet)
to avoid a memcmp() call and use an inline optimized function instead.

Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reported-by: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
---
 include/linux/netdevice.h |    1 +
 net/core/dev.c            |   11 +++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0eac07c..d17192b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1095,6 +1095,7 @@ struct net_device {
 	unsigned int		mtu;	/* interface MTU value		*/
 	unsigned short		type;	/* interface hardware type	*/
 	unsigned short		hard_header_len;	/* hardware hdr length	*/
+	unsigned int		gro_mac_header_len;
 
 	/* extra head- and tailroom the hardware may need, but not in all cases
 	 * can this be guaranteed, especially tailroom. Some cases also use
diff --git a/net/core/dev.c b/net/core/dev.c
index f124947..0b43939 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3491,14 +3491,20 @@ static inline gro_result_t
 __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	unsigned int maclen = skb->dev->gro_mac_header_len;
 
 	for (p = napi->gro_list; p; p = p->next) {
 		unsigned long diffs;
 
 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
 		diffs |= p->vlan_tci ^ skb->vlan_tci;
-		diffs |= compare_ether_header(skb_mac_header(p),
-					      skb_gro_mac_header(skb));
+		if (maclen == ETH_HLEN)
+			diffs |= compare_ether_header(skb_mac_header(p),
+						      skb_gro_mac_header(skb));
+		else if (!diffs)
+			diffs = memcmp(skb_mac_header(p),
+				       skb_gro_mac_header(skb),
+				       maclen);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 		NAPI_GRO_CB(p)->flush = 0;
 	}
@@ -5962,6 +5968,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
 	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->gro_mac_header_len = ETH_HLEN;
 	setup(dev);
 
 	dev->num_tx_queues = txqs;


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: introduce gro_mac_header_len
  2012-02-06 16:27                             ` [PATCH net-next] gro: introduce gro_mac_header_len Eric Dumazet
@ 2012-02-06 16:31                               ` David Miller
       [not found]                                 ` <20120206.113145.1284864994961472499.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2012-02-06 16:47                                 ` [PATCH net-next V2] " Eric Dumazet
  0 siblings, 2 replies; 70+ messages in thread
From: David Miller @ 2012-02-06 16:31 UTC (permalink / raw)
  To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Mon, 06 Feb 2012 17:27:07 +0100

> Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
> only, and failed on IB/ipoib traffic.
> 
> He provided a patch faking a zeroed header to let GRO aggregates frames.
> 
> Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
> check to be more generic.
> 
> This patch introduces a new netdevice field, gro_mac_header_len, giving
> L2 header length, default to ETH_HLEN (14 bytes)
> 
> A device setup function can override this default value.
> 
> __napi_gro_receive() has special handling for the common case (Ethernet)
> to avoid a memcmp() call and use an inline optimized function instead.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reported-by: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

We really need an explanation, probably both in the commit message and
the comments next to this new struct member, explaining why in the world
we can't use ->hard_header_len for this.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: introduce gro_mac_header_len
       [not found]                                 ` <20120206.113145.1284864994961472499.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 16:44                                   ` Or Gerlitz
       [not found]                                     ` <4F300353.2080705-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-06 16:44 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On 2/6/2012 6:31 PM, David Miller wrote:
> We really need an explanation, probably both in the commit message and 
> the comments next to this new struct member, explaining why in the 
> world we can't use ->hard_header_len for this. 

Dave,

As I wrote earlier on this thread, the reason is that in order to be 
able to xmit skb's for which the stack doesn't provide a neighbour, 
IPoIB makes sure to have extra INFINIBAND_ALEN bytes headroom on skbs. 
Such that the ipoib hard_header functions writes a "ipoib_pseudoheader" 
(= hw address) on the skb and the xmit function grabs it from there and 
use that pseudoheader to search for the actual IB L2 address handle kept 
for that dest. Example such skbs are unicast ARP replies and multicast. 
see below some relevant code sections from ipoib_main.c

Or.

> static void ipoib_setup(struct net_device *dev)
> [...]
>         /*
>          * We add in INFINIBAND_ALEN to allow for the destination
>          * address "pseudoheader" for skbs without neighbour struct.
>          */
>         dev->hard_header_len     = IPOIB_ENCAP_LEN + INFINIBAND_ALEN;
>         dev->addr_len            = INFINIBAND_ALEN;
>         dev->type                = ARPHRD_INFINIBAND;


> static int ipoib_hard_header(struct sk_buff *skb,
>                              struct net_device *dev,
>                              unsigned short type,
>                              const void *daddr, const void *saddr, 
> unsigned len)
>         [...]
>         /*
>          * If we don't have a neighbour structure, stuff the
>          * destination address onto the front of the skb so we can
>          * figure out where to send the packet later.
>          */
>


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH net-next V2] gro: introduce gro_mac_header_len
  2012-02-06 16:31                               ` David Miller
       [not found]                                 ` <20120206.113145.1284864994961472499.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 16:47                                 ` Eric Dumazet
  2012-02-06 16:58                                   ` David Miller
  2012-02-08 18:51                                   ` [PATCH net-next] gro: more generic L2 header check Eric Dumazet
  1 sibling, 2 replies; 70+ messages in thread
From: Eric Dumazet @ 2012-02-06 16:47 UTC (permalink / raw)
  To: David Miller
  Cc: ogerlitz, sean.hefty, herbert, linux-rdma, shlomop, roland, netdev

Le lundi 06 février 2012 à 11:31 -0500, David Miller a écrit :

> We really need an explanation, probably both in the commit message and
> the comments next to this new struct member, explaining why in the world
> we can't use ->hard_header_len for this.

OK, I added some information from Or Gerlitz in V2

Thanks !

[PATCH net-next V2] gro: introduce gro_mac_header_len

Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
only, and failed on IB/ipoib traffic.

He provided a patch faking a zeroed header to let GRO aggregates frames.

Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
check to be more generic.

This patch introduces a new netdevice field, gro_mac_header_len, giving
L2 header length, default to ETH_HLEN (14 bytes)

A device setup function can override this default value.

gro_max_header_len can be different than hard_header_len because as Or
Gerlitz said :

	IPoIB advertizes hard_header_len which is bigger than the
	IPoIB header len, this is done such that skbs sent by the
	network stack have enough headroom for a "pseudoheader"
	which for few flows (e.g unicast arp replies and multicast)
	is placed there by the ipoib hard_header function and later
	used by the xmit function.

__napi_gro_receive() has special handling for the common case (Ethernet)
to avoid a memcmp() call and use an inline optimized function instead.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Shlomo Pongratz <shlomop@mellanox.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
---
V2: added a comment saying why we dont use hard_header_len but a new
field.

 include/linux/netdevice.h |    1 +
 net/core/dev.c            |   11 +++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0eac07c..903bb6e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1095,6 +1095,7 @@ struct net_device {
 	unsigned int		mtu;	/* interface MTU value		*/
 	unsigned short		type;	/* interface hardware type	*/
 	unsigned short		hard_header_len;	/* hardware hdr length	*/
+	unsigned int		gro_mac_header_len; /* L2 header length for GRO */
 
 	/* extra head- and tailroom the hardware may need, but not in all cases
 	 * can this be guaranteed, especially tailroom. Some cases also use
diff --git a/net/core/dev.c b/net/core/dev.c
index f124947..0b43939 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3491,14 +3491,20 @@ static inline gro_result_t
 __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	unsigned int maclen = skb->dev->gro_mac_header_len;
 
 	for (p = napi->gro_list; p; p = p->next) {
 		unsigned long diffs;
 
 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
 		diffs |= p->vlan_tci ^ skb->vlan_tci;
-		diffs |= compare_ether_header(skb_mac_header(p),
-					      skb_gro_mac_header(skb));
+		if (maclen == ETH_HLEN)
+			diffs |= compare_ether_header(skb_mac_header(p),
+						      skb_gro_mac_header(skb));
+		else if (!diffs)
+			diffs = memcmp(skb_mac_header(p),
+				       skb_gro_mac_header(skb),
+				       maclen);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 		NAPI_GRO_CB(p)->flush = 0;
 	}
@@ -5962,6 +5968,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
 	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->gro_mac_header_len = ETH_HLEN;
 	setup(dev);
 
 	dev->num_tx_queues = txqs;

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
  2012-02-06 16:47                                 ` [PATCH net-next V2] " Eric Dumazet
@ 2012-02-06 16:58                                   ` David Miller
  2012-02-06 17:07                                     ` Eric Dumazet
       [not found]                                     ` <20120206.115859.1384761795375582044.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2012-02-08 18:51                                   ` [PATCH net-next] gro: more generic L2 header check Eric Dumazet
  1 sibling, 2 replies; 70+ messages in thread
From: David Miller @ 2012-02-06 16:58 UTC (permalink / raw)
  To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA, roland-DgEjT+Ai2ygdnm+yROfE0A

From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Mon, 06 Feb 2012 17:47:14 +0100

[ Roland Dreier CC:'d ]

> gro_max_header_len can be different than hard_header_len because as Or
> Gerlitz said :
> 
> 	IPoIB advertizes hard_header_len which is bigger than the
> 	IPoIB header len, this is done such that skbs sent by the
> 	network stack have enough headroom for a "pseudoheader"
> 	which for few flows (e.g unicast arp replies and multicast)
> 	is placed there by the ipoib hard_header function and later
> 	used by the xmit function.

Translation: IPoIB's path resolution mechanism is garbage

So if IPoIB path resolution was properly integrated into the neighbour
cache state machine, instead of being implemented awkwardly in the
device transmit path, this crap wouldn't be necessary right?

So here we have yet another incredibly painful side effect of how
IPoIB path resolution works.

Roland, I want you to seriously consider a way, any way, to get rid of
how IPoIB does path resolution.  It must be fully integrated into the
neighbour layer, the neighbour layer must be knowledgable about how
path resolution is a necessary step for a neighbour entry to enter the
valid state, and I want all of this awkward neighbour handling code
removed from the transmit path of IPoIB.

And finally it must not lie about it's hardware header length.

Then we won't need crap like what is being proposed here, a
"no_this_is_the_real_hard_header_len" struct member.  That's just
rediculious.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: introduce gro_mac_header_len
       [not found]                                     ` <4F300353.2080705-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-02-06 17:00                                       ` David Miller
  0 siblings, 0 replies; 70+ messages in thread
From: David Miller @ 2012-02-06 17:00 UTC (permalink / raw)
  To: ogerlitz-VPRAkNaXOzVWk0Htik3J/w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Date: Mon, 6 Feb 2012 18:44:03 +0200

> As I wrote earlier on this thread, the reason is that in order to be
> able to xmit skb's for which the stack doesn't provide a neighbour,
> IPoIB makes sure to have extra INFINIBAND_ALEN bytes headroom on
> skbs. Such that the ipoib hard_header functions writes a
> "ipoib_pseudoheader" (= hw address) on the skb and the xmit function
> grabs it from there and use that pseudoheader to search for the actual
> IB L2 address handle kept for that dest. Example such skbs are unicast
> ARP replies and multicast. see below some relevant code sections from
> ipoib_main.c

Then IPoIB is broken and must be fixed, see my other reply.

We're not adding a bogus member to the netdevice just because IPoIB
tries to do path resolution and neighbour validation behind the
neighbour cache's back.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
  2012-02-06 16:58                                   ` David Miller
@ 2012-02-06 17:07                                     ` Eric Dumazet
  2012-02-06 17:11                                       ` Or Gerlitz
       [not found]                                     ` <20120206.115859.1384761795375582044.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  1 sibling, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-06 17:07 UTC (permalink / raw)
  To: David Miller
  Cc: ogerlitz, sean.hefty, herbert, linux-rdma, shlomop, roland, netdev

Le lundi 06 février 2012 à 11:58 -0500, David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 06 Feb 2012 17:47:14 +0100
> 
> [ Roland Dreier CC:'d ]
> 
> > gro_max_header_len can be different than hard_header_len because as Or
> > Gerlitz said :
> > 
> > 	IPoIB advertizes hard_header_len which is bigger than the
> > 	IPoIB header len, this is done such that skbs sent by the
> > 	network stack have enough headroom for a "pseudoheader"
> > 	which for few flows (e.g unicast arp replies and multicast)
> > 	is placed there by the ipoib hard_header function and later
> > 	used by the xmit function.
> 
> Translation: IPoIB's path resolution mechanism is garbage
> 
> So if IPoIB path resolution was properly integrated into the neighbour
> cache state machine, instead of being implemented awkwardly in the
> device transmit path, this crap wouldn't be necessary right?
> 
> So here we have yet another incredibly painful side effect of how
> IPoIB path resolution works.
> 
> Roland, I want you to seriously consider a way, any way, to get rid of
> how IPoIB does path resolution.  It must be fully integrated into the
> neighbour layer, the neighbour layer must be knowledgable about how
> path resolution is a necessary step for a neighbour entry to enter the
> valid state, and I want all of this awkward neighbour handling code
> removed from the transmit path of IPoIB.
> 
> And finally it must not lie about it's hardware header length.
> 
> Then we won't need crap like what is being proposed here, a
> "no_this_is_the_real_hard_header_len" struct member.  That's just
> rediculious.

OK, I'll resend my first patch then, using hard_header_len

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                     ` <20120206.115859.1384761795375582044.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 17:09                                       ` Or Gerlitz
  2012-02-06 17:23                                       ` Roland Dreier
  1 sibling, 0 replies; 70+ messages in thread
From: Or Gerlitz @ 2012-02-06 17:09 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On 2/6/2012 6:58 PM, David Miller wrote:
> So if IPoIB path resolution was properly integrated into the neighbour 
> cache state machine, instead of being implemented awkwardly in the 
> device transmit path, this crap wouldn't be necessary right?


Dave, say we have integrated the path resolution into ND cache, how 
would you suggest ipoib to act for skbs for which xmit is called without 
a neighbour? specifically arp replies and multicast. I can think of at 
least another one other location where the HW address can be stored 
between hard_header and xmit - on the sbk->cb storage, which is large 
enough.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
  2012-02-06 17:07                                     ` Eric Dumazet
@ 2012-02-06 17:11                                       ` Or Gerlitz
       [not found]                                         ` <4F3009AE.2090605-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-06 17:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On 2/6/2012 7:07 PM, Eric Dumazet wrote:
> OK, I'll resend my first patch then, using hard_header_len

so, we will be back to square one... as the hard_header_len which 
advertized now by IPoIB will fail
the GRO L2 check... lets see where this discussion evolves.

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                         ` <4F3009AE.2090605-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-02-06 17:19                                           ` Eric Dumazet
  0 siblings, 0 replies; 70+ messages in thread
From: Eric Dumazet @ 2012-02-06 17:19 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

Le lundi 06 février 2012 à 19:11 +0200, Or Gerlitz a écrit :
> On 2/6/2012 7:07 PM, Eric Dumazet wrote:
> > OK, I'll resend my first patch then, using hard_header_len
> 
> so, we will be back to square one... as the hard_header_len which 
> advertized now by IPoIB will fail
> the GRO L2 check... lets see where this discussion evolves.
> 
> Or.
> 

You'll have to define hard_header_len to IPOIB_ENCAP_LEN ?



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                     ` <20120206.115859.1384761795375582044.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2012-02-06 17:09                                       ` Or Gerlitz
@ 2012-02-06 17:23                                       ` Roland Dreier
       [not found]                                         ` <CAG4TOxOBCbEEOtP62ZM1R3-6umebdbJFkK0bnjphLCZOgHzt1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-02-06 17:23 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Feb 6, 2012 at 8:58 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>
> So if IPoIB path resolution was properly integrated into the neighbour
> cache state machine, instead of being implemented awkwardly in the
> device transmit path, this crap wouldn't be necessary right?
>
> So here we have yet another incredibly painful side effect of how
> IPoIB path resolution works.
>
> Roland, I want you to seriously consider a way, any way, to get rid of
> how IPoIB does path resolution.  It must be fully integrated into the
> neighbour layer, the neighbour layer must be knowledgable about how
> path resolution is a necessary step for a neighbour entry to enter the
> valid state, and I want all of this awkward neighbour handling code
> removed from the transmit path of IPoIB.
>
> And finally it must not lie about it's hardware header length.

I would love to clean this up.  But I don't know how to do it.

IMHO the problem is in the IPoIB RFC: (4391) which makes
a distinction between an "encapsulation header" and the "link
layer address".  The LL address is what we put into ARP and
ND packets, and so I think we are forced into exposing that
to the network stack as our hardware address.

However this LL address is not actually what we need to send
a packet -- we need to take the GID of our destination and send
a query to the subnet manager to resolve it to a path.  And this
query really is an RPC to a remote entity somewhere else on
the network, so we have to do it asynchronously etc.

Now the part I don't know how to handle is when the network
stack gives us a LL address to send an ARP or something to
but the skb has no dst attached.  Suppose I don't have a path
for that LL address when I get that skb with no dst into my
.hard_header method.  Where do I stick the LL addr to keep
around until the packet shows up in the xmit function?

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                         ` <CAG4TOxOBCbEEOtP62ZM1R3-6umebdbJFkK0bnjphLCZOgHzt1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-06 19:12                                           ` David Miller
       [not found]                                             ` <20120206.141223.332863167187002998.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-06 19:12 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Mon, 6 Feb 2012 09:23:58 -0800

> IMHO the problem is in the IPoIB RFC: (4391) which makes
> a distinction between an "encapsulation header" and the "link
> layer address".  The LL address is what we put into ARP and
> ND packets, and so I think we are forced into exposing that
> to the network stack as our hardware address.

An address is not a hardware MAC header, we're only talking
about the length of the latter.

If the addressing is such that you need to put the GID
into the ARP/NDISC packets, and that's different from what
ends up in the final encapsulation header, I really don't
see what the problem is specificially with respect to the
MAC header size.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                             ` <20120206.141223.332863167187002998.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 19:23                                               ` Roland Dreier
       [not found]                                                 ` <CAG4TOxNgJ3=AFuA==Km815YzW8eQ7nD_UAzkXLSXcAyaCAAvPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-02-06 19:23 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Feb 6, 2012 at 11:12 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> IMHO the problem is in the IPoIB RFC: (4391) which makes
>> a distinction between an "encapsulation header" and the "link
>> layer address".  The LL address is what we put into ARP and
>> ND packets, and so I think we are forced into exposing that
>> to the network stack as our hardware address.
>
> An address is not a hardware MAC header, we're only talking
> about the length of the latter.
>
> If the addressing is such that you need to put the GID
> into the ARP/NDISC packets, and that's different from what
> ends up in the final encapsulation header, I really don't
> see what the problem is specificially with respect to the
> MAC header size.

Dave, please look at my whole email.  The problem is that
the LL addr is not what we need to send packets but it's all
we get in our .hard_header method.  So for eg unicast ARP
packets without a dst we have nowhere to stash what we
actually will need in our xmit method.

(The same problem for unicast ARPs applies to multicast
sends too).

Does the netdev driver own skb->cb between hard_header
and start_xmit?  If so we could use that instead of stealing
some header space, and that would at least let us not lie
about hard_header_len.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                 ` <CAG4TOxNgJ3=AFuA==Km815YzW8eQ7nD_UAzkXLSXcAyaCAAvPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-06 19:32                                                   ` David Miller
       [not found]                                                     ` <20120206.143230.1415707004934341114.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-06 19:32 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Mon, 6 Feb 2012 11:23:21 -0800

> Does the netdev driver own skb->cb between hard_header
> and start_xmit?  If so we could use that instead of stealing
> some header space, and that would at least let us not lie
> about hard_header_len.

Unfortunately the packet scheduler sits between the hard_header()
call (via neigh_*_output() --> dev_hard_header()) and when the
device xmit method is invoked.

And the packet scheduler can make use of the skb->cb[], for
include/net/sch_generic.h:qdisc_skb_cb


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                     ` <20120206.143230.1415707004934341114.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 19:51                                                       ` David Miller
       [not found]                                                         ` <20120206.145148.558736903670696169.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-06 19:51 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Date: Mon, 06 Feb 2012 14:32:30 -0500 (EST)

> From: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Date: Mon, 6 Feb 2012 11:23:21 -0800
> 
>> Does the netdev driver own skb->cb between hard_header
>> and start_xmit?  If so we could use that instead of stealing
>> some header space, and that would at least let us not lie
>> about hard_header_len.
> 
> Unfortunately the packet scheduler sits between the hard_header()
> call (via neigh_*_output() --> dev_hard_header()) and when the
> device xmit method is invoked.
> 
> And the packet scheduler can make use of the skb->cb[], for
> include/net/sch_generic.h:qdisc_skb_cb

Actually there is a way to make this work.

Define your ipoib_skb_cb something like:

struct ipoib_skb_cb {
	struct qdisc_skb_cb	qdisc_cb;

	... ipoib stuff goes here ...
};

That way you can use the SKB cb area for your ipoib info
without interfering with the packet scheduler.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                         ` <20120206.145148.558736903670696169.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 19:56                                                           ` David Miller
       [not found]                                                             ` <20120206.145652.1575591691467905094.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2012-02-06 19:59                                                           ` [PATCH net-next V2] gro: introduce gro_mac_header_len Roland Dreier
  1 sibling, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-06 19:56 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Date: Mon, 06 Feb 2012 14:51:48 -0500 (EST)

> From: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> Date: Mon, 06 Feb 2012 14:32:30 -0500 (EST)
> 
>> From: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>> Date: Mon, 6 Feb 2012 11:23:21 -0800
>> 
>>> Does the netdev driver own skb->cb between hard_header
>>> and start_xmit?  If so we could use that instead of stealing
>>> some header space, and that would at least let us not lie
>>> about hard_header_len.
>> 
>> Unfortunately the packet scheduler sits between the hard_header()
>> call (via neigh_*_output() --> dev_hard_header()) and when the
>> device xmit method is invoked.
>> 
>> And the packet scheduler can make use of the skb->cb[], for
>> include/net/sch_generic.h:qdisc_skb_cb
> 
> Actually there is a way to make this work.
> 
> Define your ipoib_skb_cb something like:
> 
> struct ipoib_skb_cb {
> 	struct qdisc_skb_cb	qdisc_cb;
> 
> 	... ipoib stuff goes here ...
> };
> 
> That way you can use the SKB cb area for your ipoib info
> without interfering with the packet scheduler.

But this needs a little bit of work since the qdisc_skb_cb ends with a
variable length array, but we can put an upper bound on this just like
we do for skb->cb[] itself to fix this issue.

I'll toss something together.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                         ` <20120206.145148.558736903670696169.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2012-02-06 19:56                                                           ` David Miller
@ 2012-02-06 19:59                                                           ` Roland Dreier
  1 sibling, 0 replies; 70+ messages in thread
From: Roland Dreier @ 2012-02-06 19:59 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

> Actually there is a way to make this work.
>
> Define your ipoib_skb_cb something like:
>
> struct ipoib_skb_cb {
>        struct qdisc_skb_cb     qdisc_cb;
>
>        ... ipoib stuff goes here ...
> };
>
> That way you can use the SKB cb area for your ipoib info
> without interfering with the packet scheduler.

Not sure I see how this could work.  sch_generic.h has:

struct qdisc_skb_cb {
        unsigned int            pkt_len;
        long                    data[];
};

and ipoib has no way of knowing what the biggest
scheduler-specific use of data[] will be.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                             ` <20120206.145652.1575591691467905094.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-06 20:01                                                               ` Roland Dreier
       [not found]                                                                 ` <CAG4TOxNcB8+SP3f0WhZG1GZ481s+=7pVRv91vzAbwDzUZD8g_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-02-06 20:01 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

On Mon, Feb 6, 2012 at 11:56 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> But this needs a little bit of work since the qdisc_skb_cb ends with a
> variable length array, but we can put an upper bound on this just like
> we do for skb->cb[] itself to fix this issue.
>
> I'll toss something together.

OK thanks.  We need something like 20 bytes of space to stash an
ipoib LL addr.  Which I guess leaves 28 bytes for sch use.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                                 ` <CAG4TOxNcB8+SP3f0WhZG1GZ481s+=7pVRv91vzAbwDzUZD8g_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-06 20:15                                                                   ` David Miller
       [not found]                                                                     ` <20120206.151509.1959432192519622134.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-06 20:15 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Mon, 6 Feb 2012 12:01:13 -0800

> On Mon, Feb 6, 2012 at 11:56 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> But this needs a little bit of work since the qdisc_skb_cb ends with a
>> variable length array, but we can put an upper bound on this just like
>> we do for skb->cb[] itself to fix this issue.
>>
>> I'll toss something together.
> 
> OK thanks.  We need something like 20 bytes of space to stash an
> ipoib LL addr.  Which I guess leaves 28 bytes for sch use.

I just pushed the following into net-next, you have 24 bytes
available.

--------------------
net: Make qdisc_skb_cb upper size bound explicit.

Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside
of other data structures.

This is intended to be used by IPoIB so that it can remember
addressing information stored at hard_header_ops->create() time that
it can fetch when the packet gets to the transmit routine.

Signed-off-by: David S. Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
---
 include/net/sch_generic.h |    9 ++++++++-
 net/sched/sch_choke.c     |    3 +--
 net/sched/sch_netem.c     |    3 +--
 net/sched/sch_sfb.c       |    3 +--
 net/sched/sch_sfq.c       |    5 ++---
 5 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index f6bb08b..55ce96b 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -220,9 +220,16 @@ struct tcf_proto {
 
 struct qdisc_skb_cb {
 	unsigned int		pkt_len;
-	long			data[];
+	unsigned char		data[24];
 };
 
+static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
+{
+	struct qdisc_skb_cb *qcb;
+	BUILD_BUG_ON(sizeof(skb->cb) < sizeof(unsigned int) + sz);
+	BUILD_BUG_ON(sizeof(qcb->data) < sz);
+}
+
 static inline int qdisc_qlen(const struct Qdisc *q)
 {
 	return q->q.qlen;
diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index e465064..7e267d7 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -148,8 +148,7 @@ struct choke_skb_cb {
 
 static inline struct choke_skb_cb *choke_skb_cb(const struct sk_buff *skb)
 {
-	BUILD_BUG_ON(sizeof(skb->cb) <
-		sizeof(struct qdisc_skb_cb) + sizeof(struct choke_skb_cb));
+	qdisc_cb_private_validate(skb, sizeof(struct choke_skb_cb));
 	return (struct choke_skb_cb *)qdisc_skb_cb(skb)->data;
 }
 
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 2776012..e83d61c 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -130,8 +130,7 @@ struct netem_skb_cb {
 
 static inline struct netem_skb_cb *netem_skb_cb(struct sk_buff *skb)
 {
-	BUILD_BUG_ON(sizeof(skb->cb) <
-		sizeof(struct qdisc_skb_cb) + sizeof(struct netem_skb_cb));
+	qdisc_cb_private_validate(skb, sizeof(struct netem_skb_cb));
 	return (struct netem_skb_cb *)qdisc_skb_cb(skb)->data;
 }
 
diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
index 96e42ca..d7eea99 100644
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -94,8 +94,7 @@ struct sfb_skb_cb {
 
 static inline struct sfb_skb_cb *sfb_skb_cb(const struct sk_buff *skb)
 {
-	BUILD_BUG_ON(sizeof(skb->cb) <
-		sizeof(struct qdisc_skb_cb) + sizeof(struct sfb_skb_cb));
+	qdisc_cb_private_validate(skb, sizeof(struct sfb_skb_cb));
 	return (struct sfb_skb_cb *)qdisc_skb_cb(skb)->data;
 }
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 67494ae..60d4718 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -166,9 +166,8 @@ struct sfq_skb_cb {
 
 static inline struct sfq_skb_cb *sfq_skb_cb(const struct sk_buff *skb)
 {
-       BUILD_BUG_ON(sizeof(skb->cb) <
-               sizeof(struct qdisc_skb_cb) + sizeof(struct sfq_skb_cb));
-       return (struct sfq_skb_cb *)qdisc_skb_cb(skb)->data;
+	qdisc_cb_private_validate(skb, sizeof(struct sfq_skb_cb));
+	return (struct sfq_skb_cb *)qdisc_skb_cb(skb)->data;
 }
 
 static unsigned int sfq_hash(const struct sfq_sched_data *q,
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                                     ` <20120206.151509.1959432192519622134.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-07 19:51                                                                       ` Roland Dreier
  2012-02-07 20:33                                                                         ` David Miller
  0 siblings, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-02-07 19:51 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 488 bytes --]

On Mon, Feb 6, 2012 at 12:15 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> I just pushed the following into net-next, you have 24 bytes
> available.

OK, I'm testing the attached patch (sorry, stuck on webmail ATM
but wanted to send this out).  Will submit formally later today or
tomorrow.

Sean / Or if you have a chance to perf test this + Eric's patch to
test hard_header_len against ETH_ALEN and see if GRO works
with IPoIb, that would be helpful.

 - R.

[-- Attachment #2: net-next-ipoib-cb.patch --]
[-- Type: text/x-diff, Size: 5775 bytes --]

 drivers/infiniband/ulp/ipoib/ipoib.h           |    6 ++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |   41 +++++++++---------------
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   10 +-----
 3 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index b3cc1e0..86df632 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -44,6 +44,7 @@
 #include <linux/mutex.h>
 
 #include <net/neighbour.h>
+#include <net/sch_generic.h>
 
 #include <linux/atomic.h>
 
@@ -117,8 +118,9 @@ struct ipoib_header {
 	u16	reserved;
 };
 
-struct ipoib_pseudoheader {
-	u8  hwaddr[INFINIBAND_ALEN];
+struct ipoib_cb {
+	struct qdisc_skb_cb	qdisc_cb;
+	u8			hwaddr[INFINIBAND_ALEN];
 };
 
 /* Used for all multicast joins (broadcast, IPv4 mcast and IPv6 mcast) */
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 3514ca0..3e4955b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -653,7 +653,7 @@ static void ipoib_path_lookup(struct sk_buff *skb, struct neighbour *n, struct n
 }
 
 static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
-			     struct ipoib_pseudoheader *phdr)
+			     struct ipoib_cb *cb)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ipoib_path *path;
@@ -661,17 +661,15 @@ static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
 
 	spin_lock_irqsave(&priv->lock, flags);
 
-	path = __path_find(dev, phdr->hwaddr + 4);
+	path = __path_find(dev, cb->hwaddr + 4);
 	if (!path || !path->valid) {
 		int new_path = 0;
 
 		if (!path) {
-			path = path_rec_create(dev, phdr->hwaddr + 4);
+			path = path_rec_create(dev, cb->hwaddr + 4);
 			new_path = 1;
 		}
 		if (path) {
-			/* put pseudoheader back on for next time */
-			skb_push(skb, sizeof *phdr);
 			__skb_queue_tail(&path->queue, skb);
 
 			if (!path->query && path_rec_start(dev, path)) {
@@ -695,12 +693,10 @@ static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
 			  be16_to_cpu(path->pathrec.dlid));
 
 		spin_unlock_irqrestore(&priv->lock, flags);
-		ipoib_send(dev, skb, path->ah, IPOIB_QPN(phdr->hwaddr));
+		ipoib_send(dev, skb, path->ah, IPOIB_QPN(cb->hwaddr));
 		return;
 	} else if ((path->query || !path_rec_start(dev, path)) &&
 		   skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
-		/* put pseudoheader back on for next time */
-		skb_push(skb, sizeof *phdr);
 		__skb_queue_tail(&path->queue, skb);
 	} else {
 		++dev->stats.tx_dropped;
@@ -774,16 +770,14 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			dev_kfree_skb_any(skb);
 		}
 	} else {
-		struct ipoib_pseudoheader *phdr =
-			(struct ipoib_pseudoheader *) skb->data;
-		skb_pull(skb, sizeof *phdr);
+		struct ipoib_cb *cb = (struct ipoib_cb *) skb->cb;
 
-		if (phdr->hwaddr[4] == 0xff) {
+		if (cb->hwaddr[4] == 0xff) {
 			/* Add in the P_Key for multicast*/
-			phdr->hwaddr[8] = (priv->pkey >> 8) & 0xff;
-			phdr->hwaddr[9] = priv->pkey & 0xff;
+			cb->hwaddr[8] = (priv->pkey >> 8) & 0xff;
+			cb->hwaddr[9] = priv->pkey & 0xff;
 
-			ipoib_mcast_send(dev, phdr->hwaddr + 4, skb);
+			ipoib_mcast_send(dev, cb->hwaddr + 4, skb);
 		} else {
 			/* unicast GID -- should be ARP or RARP reply */
 
@@ -792,14 +786,14 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				ipoib_warn(priv, "Unicast, no %s: type %04x, QPN %06x %pI6\n",
 					   skb_dst(skb) ? "neigh" : "dst",
 					   be16_to_cpup((__be16 *) skb->data),
-					   IPOIB_QPN(phdr->hwaddr),
-					   phdr->hwaddr + 4);
+					   IPOIB_QPN(cb->hwaddr),
+					   cb->hwaddr + 4);
 				dev_kfree_skb_any(skb);
 				++dev->stats.tx_dropped;
 				goto unlock;
 			}
 
-			unicast_arp_send(skb, dev, phdr);
+			unicast_arp_send(skb, dev, cb);
 		}
 	}
 unlock:
@@ -843,9 +837,8 @@ static int ipoib_hard_header(struct sk_buff *skb,
 	if (dst)
 		n = dst_get_neighbour_noref_raw(dst);
 	if ((!dst || !n) && daddr) {
-		struct ipoib_pseudoheader *phdr =
-			(struct ipoib_pseudoheader *) skb_push(skb, sizeof *phdr);
-		memcpy(phdr->hwaddr, daddr, INFINIBAND_ALEN);
+		struct ipoib_cb *cb = (struct ipoib_cb *) skb->cb;
+		memcpy(cb->hwaddr, daddr, INFINIBAND_ALEN);
 	}
 
 	return 0;
@@ -1021,11 +1014,7 @@ static void ipoib_setup(struct net_device *dev)
 
 	dev->flags		|= IFF_BROADCAST | IFF_MULTICAST;
 
-	/*
-	 * We add in INFINIBAND_ALEN to allow for the destination
-	 * address "pseudoheader" for skbs without neighbour struct.
-	 */
-	dev->hard_header_len	 = IPOIB_ENCAP_LEN + INFINIBAND_ALEN;
+	dev->hard_header_len	 = IPOIB_ENCAP_LEN;
 	dev->addr_len		 = INFINIBAND_ALEN;
 	dev->type		 = ARPHRD_INFINIBAND;
 	dev->tx_queue_len	 = ipoib_sendq_size * 2;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index f7ff9dd..20ebc6f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -262,21 +262,13 @@ static int ipoib_mcast_join_finish(struct ipoib_mcast *mcast,
 	netif_tx_lock_bh(dev);
 	while (!skb_queue_empty(&mcast->pkt_queue)) {
 		struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue);
-		struct dst_entry *dst = skb_dst(skb);
-		struct neighbour *n = NULL;
 
 		netif_tx_unlock_bh(dev);
 
 		skb->dev = dev;
-		if (dst)
-			n = dst_get_neighbour_noref_raw(dst);
-		if (!dst || !n) {
-			/* put pseudoheader back on for next time */
-			skb_push(skb, sizeof (struct ipoib_pseudoheader));
-		}
-
 		if (dev_queue_xmit(skb))
 			ipoib_warn(priv, "dev_queue_xmit failed to requeue packet\n");
+
 		netif_tx_lock_bh(dev);
 	}
 	netif_tx_unlock_bh(dev);

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
  2012-02-07 19:51                                                                       ` Roland Dreier
@ 2012-02-07 20:33                                                                         ` David Miller
       [not found]                                                                           ` <20120207.153325.1941809701255235550.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-07 20:33 UTC (permalink / raw)
  To: roland
  Cc: eric.dumazet, ogerlitz, sean.hefty, herbert, linux-rdma, shlomop, netdev

From: Roland Dreier <roland@kernel.org>
Date: Tue, 7 Feb 2012 11:51:46 -0800

Overall, looks great.

> @@ -843,9 +837,8 @@ static int ipoib_hard_header(struct sk_buff *skb,
>  	if (dst)
>  		n = dst_get_neighbour_noref_raw(dst);
>  	if ((!dst || !n) && daddr) {
> -		struct ipoib_pseudoheader *phdr =
> -			(struct ipoib_pseudoheader *) skb_push(skb, sizeof *phdr);
> -		memcpy(phdr->hwaddr, daddr, INFINIBAND_ALEN);
> +		struct ipoib_cb *cb = (struct ipoib_cb *) skb->cb;
> +		memcpy(cb->hwaddr, daddr, INFINIBAND_ALEN);

I would simplify this to "!skb_dst(skb)", any failure of
dst_get_neighbour_noref_raw() (now and in the future) would be
transient.

You're trying to see if this is a "neigh resolvable" path or not, and
the correct test for that is whether a dst is attached to the SKB.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next V2] gro: introduce gro_mac_header_len
       [not found]                                                                           ` <20120207.153325.1941809701255235550.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-07 20:34                                                                             ` Roland Dreier
       [not found]                                                                               ` <CAG4TOxP2iEgT64L8vAR6P139U-HyHSxU8gdWw+cLsDwWSGkrwA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Roland Dreier @ 2012-02-07 20:34 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

On Tue, Feb 7, 2012 at 12:33 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> I would simplify this to "!skb_dst(skb)", any failure of
> dst_get_neighbour_noref_raw() (now and in the future) would be
> transient.

OK, will do.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
       [not found]                                                                               ` <CAG4TOxP2iEgT64L8vAR6P139U-HyHSxU8gdWw+cLsDwWSGkrwA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-08  0:51                                                                                 ` Roland Dreier
  2012-02-08  7:29                                                                                   ` Hefty, Sean
  2012-02-08 20:50                                                                                   ` David Miller
  0 siblings, 2 replies; 70+ messages in thread
From: Roland Dreier @ 2012-02-08  0:51 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

From: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>

Commit a0417fa3a18a ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
OK, this works for me.  Definitely looks cleaner and also looks
obvious enough that it's probably even correct.

 drivers/infiniband/ulp/ipoib/ipoib.h           |    6 ++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |   55 ++++++++---------------
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   10 +----
 3 files changed, 24 insertions(+), 47 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index b3cc1e0..86df632 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -44,6 +44,7 @@
 #include <linux/mutex.h>
 
 #include <net/neighbour.h>
+#include <net/sch_generic.h>
 
 #include <linux/atomic.h>
 
@@ -117,8 +118,9 @@ struct ipoib_header {
 	u16	reserved;
 };
 
-struct ipoib_pseudoheader {
-	u8  hwaddr[INFINIBAND_ALEN];
+struct ipoib_cb {
+	struct qdisc_skb_cb	qdisc_cb;
+	u8			hwaddr[INFINIBAND_ALEN];
 };
 
 /* Used for all multicast joins (broadcast, IPv4 mcast and IPv6 mcast) */
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 3514ca0..3974c29 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -653,7 +653,7 @@ static void ipoib_path_lookup(struct sk_buff *skb, struct neighbour *n, struct n
 }
 
 static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
-			     struct ipoib_pseudoheader *phdr)
+			     struct ipoib_cb *cb)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ipoib_path *path;
@@ -661,17 +661,15 @@ static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
 
 	spin_lock_irqsave(&priv->lock, flags);
 
-	path = __path_find(dev, phdr->hwaddr + 4);
+	path = __path_find(dev, cb->hwaddr + 4);
 	if (!path || !path->valid) {
 		int new_path = 0;
 
 		if (!path) {
-			path = path_rec_create(dev, phdr->hwaddr + 4);
+			path = path_rec_create(dev, cb->hwaddr + 4);
 			new_path = 1;
 		}
 		if (path) {
-			/* put pseudoheader back on for next time */
-			skb_push(skb, sizeof *phdr);
 			__skb_queue_tail(&path->queue, skb);
 
 			if (!path->query && path_rec_start(dev, path)) {
@@ -695,12 +693,10 @@ static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
 			  be16_to_cpu(path->pathrec.dlid));
 
 		spin_unlock_irqrestore(&priv->lock, flags);
-		ipoib_send(dev, skb, path->ah, IPOIB_QPN(phdr->hwaddr));
+		ipoib_send(dev, skb, path->ah, IPOIB_QPN(cb->hwaddr));
 		return;
 	} else if ((path->query || !path_rec_start(dev, path)) &&
 		   skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
-		/* put pseudoheader back on for next time */
-		skb_push(skb, sizeof *phdr);
 		__skb_queue_tail(&path->queue, skb);
 	} else {
 		++dev->stats.tx_dropped;
@@ -774,16 +770,14 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			dev_kfree_skb_any(skb);
 		}
 	} else {
-		struct ipoib_pseudoheader *phdr =
-			(struct ipoib_pseudoheader *) skb->data;
-		skb_pull(skb, sizeof *phdr);
+		struct ipoib_cb *cb = (struct ipoib_cb *) skb->cb;
 
-		if (phdr->hwaddr[4] == 0xff) {
+		if (cb->hwaddr[4] == 0xff) {
 			/* Add in the P_Key for multicast*/
-			phdr->hwaddr[8] = (priv->pkey >> 8) & 0xff;
-			phdr->hwaddr[9] = priv->pkey & 0xff;
+			cb->hwaddr[8] = (priv->pkey >> 8) & 0xff;
+			cb->hwaddr[9] = priv->pkey & 0xff;
 
-			ipoib_mcast_send(dev, phdr->hwaddr + 4, skb);
+			ipoib_mcast_send(dev, cb->hwaddr + 4, skb);
 		} else {
 			/* unicast GID -- should be ARP or RARP reply */
 
@@ -792,14 +786,14 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				ipoib_warn(priv, "Unicast, no %s: type %04x, QPN %06x %pI6\n",
 					   skb_dst(skb) ? "neigh" : "dst",
 					   be16_to_cpup((__be16 *) skb->data),
-					   IPOIB_QPN(phdr->hwaddr),
-					   phdr->hwaddr + 4);
+					   IPOIB_QPN(cb->hwaddr),
+					   cb->hwaddr + 4);
 				dev_kfree_skb_any(skb);
 				++dev->stats.tx_dropped;
 				goto unlock;
 			}
 
-			unicast_arp_send(skb, dev, phdr);
+			unicast_arp_send(skb, dev, cb);
 		}
 	}
 unlock:
@@ -825,8 +819,6 @@ static int ipoib_hard_header(struct sk_buff *skb,
 			     const void *daddr, const void *saddr, unsigned len)
 {
 	struct ipoib_header *header;
-	struct dst_entry *dst;
-	struct neighbour *n;
 
 	header = (struct ipoib_header *) skb_push(skb, sizeof *header);
 
@@ -834,18 +826,13 @@ static int ipoib_hard_header(struct sk_buff *skb,
 	header->reserved = 0;
 
 	/*
-	 * If we don't have a neighbour structure, stuff the
-	 * destination address onto the front of the skb so we can
-	 * figure out where to send the packet later.
+	 * If we don't have a dst_entry structure, stuff the
+	 * destination address into skb->cb so we can figure out where
+	 * to send the packet later.
 	 */
-	dst = skb_dst(skb);
-	n = NULL;
-	if (dst)
-		n = dst_get_neighbour_noref_raw(dst);
-	if ((!dst || !n) && daddr) {
-		struct ipoib_pseudoheader *phdr =
-			(struct ipoib_pseudoheader *) skb_push(skb, sizeof *phdr);
-		memcpy(phdr->hwaddr, daddr, INFINIBAND_ALEN);
+	if (!skb_dst(skb)) {
+		struct ipoib_cb *cb = (struct ipoib_cb *) skb->cb;
+		memcpy(cb->hwaddr, daddr, INFINIBAND_ALEN);
 	}
 
 	return 0;
@@ -1021,11 +1008,7 @@ static void ipoib_setup(struct net_device *dev)
 
 	dev->flags		|= IFF_BROADCAST | IFF_MULTICAST;
 
-	/*
-	 * We add in INFINIBAND_ALEN to allow for the destination
-	 * address "pseudoheader" for skbs without neighbour struct.
-	 */
-	dev->hard_header_len	 = IPOIB_ENCAP_LEN + INFINIBAND_ALEN;
+	dev->hard_header_len	 = IPOIB_ENCAP_LEN;
 	dev->addr_len		 = INFINIBAND_ALEN;
 	dev->type		 = ARPHRD_INFINIBAND;
 	dev->tx_queue_len	 = ipoib_sendq_size * 2;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index f7ff9dd..20ebc6f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -262,21 +262,13 @@ static int ipoib_mcast_join_finish(struct ipoib_mcast *mcast,
 	netif_tx_lock_bh(dev);
 	while (!skb_queue_empty(&mcast->pkt_queue)) {
 		struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue);
-		struct dst_entry *dst = skb_dst(skb);
-		struct neighbour *n = NULL;
 
 		netif_tx_unlock_bh(dev);
 
 		skb->dev = dev;
-		if (dst)
-			n = dst_get_neighbour_noref_raw(dst);
-		if (!dst || !n) {
-			/* put pseudoheader back on for next time */
-			skb_push(skb, sizeof (struct ipoib_pseudoheader));
-		}
-
 		if (dev_queue_xmit(skb))
 			ipoib_warn(priv, "dev_queue_xmit failed to requeue packet\n");
+
 		netif_tx_lock_bh(dev);
 	}
 	netif_tx_unlock_bh(dev);
-- 
1.7.9

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* RE: [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
  2012-02-08  0:51                                                                                 ` [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses Roland Dreier
@ 2012-02-08  7:29                                                                                   ` Hefty, Sean
       [not found]                                                                                     ` <1828884A29C6694DAF28B7E6B8A82373374C1C26-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2012-02-08 20:50                                                                                   ` David Miller
  1 sibling, 1 reply; 70+ messages in thread
From: Hefty, Sean @ 2012-02-08  7:29 UTC (permalink / raw)
  To: Roland Dreier, David Miller
  Cc: eric.dumazet, ogerlitz, herbert, linux-rdma, shlomop, netdev

> OK, this works for me.  Definitely looks cleaner and also looks
> obvious enough that it's probably even correct.

I tested this with Dave's patch and Eric's first patch against Linus' latest tree 3.3-rc2+, and things look good so far.

Thanks,
- Sean

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
       [not found]                                                                                     ` <1828884A29C6694DAF28B7E6B8A82373374C1C26-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2012-02-08  7:50                                                                                       ` Eric Dumazet
  2012-02-08 14:28                                                                                         ` Or Gerlitz
  0 siblings, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-08  7:50 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Roland Dreier, David Miller, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA

Le mercredi 08 février 2012 à 07:29 +0000, Hefty, Sean a écrit :
> > OK, this works for me.  Definitely looks cleaner and also looks
> > obvious enough that it's probably even correct.
> 
> I tested this with Dave's patch and Eric's first patch against Linus' latest tree 3.3-rc2+, and things look good so far.
> 
> Thanks,
> - Sean

Thanks for testing, I'll resend my (updated) patch today.



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
  2012-02-08  7:50                                                                                       ` Eric Dumazet
@ 2012-02-08 14:28                                                                                         ` Or Gerlitz
  0 siblings, 0 replies; 70+ messages in thread
From: Or Gerlitz @ 2012-02-08 14:28 UTC (permalink / raw)
  To: Eric Dumazet, Roland Dreier
  Cc: Hefty, Sean, David Miller, herbert, linux-rdma, shlomop, netdev

On 2/8/2012 9:50 AM, Eric Dumazet wrote:
> Le mercredi 08 février 2012 à 07:29 +0000, Hefty, Sean a écrit :
>> I tested this with Dave's patch and Eric's first patch against Linus' latest tree 3.3-rc2+, and things look good so far.
>>
>
> Thanks for testing, I'll resend my (updated) patch today.

same here, I used Roland's patch on top of net-next plus the below patch 
and got GRO to aggregate okay,
great doing! Eric, could you please post your work to Dave after 
Roland's patch is applied?

Or.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index f124947..9b8e2fa 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3491,14 +3491,20 @@ static inline gro_result_t
>  __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
>  {
>         struct sk_buff *p;
> +       unsigned int maclen = skb->dev->hard_header_len;
>
>         for (p = napi->gro_list; p; p = p->next) {
>                 unsigned long diffs;
>
>                 diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
>                 diffs |= p->vlan_tci ^ skb->vlan_tci;
> -               diffs |= compare_ether_header(skb_mac_header(p),
> -                                             skb_gro_mac_header(skb));
> +               if (maclen == ETH_HLEN)
> +                       diffs |= compare_ether_header(skb_mac_header(p),
> +                                                     
> skb_gro_mac_header(skb));
> +               else if (!diffs)
> +                       diffs = memcmp(skb_mac_header(p),
> +                                       skb_gro_mac_header(skb),
> +                                       maclen);
>                 NAPI_GRO_CB(p)->same_flow = !diffs;
>                 NAPI_GRO_CB(p)->flush = 0;
>         }

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH net-next] gro: more generic L2 header check
  2012-02-06 16:47                                 ` [PATCH net-next V2] " Eric Dumazet
  2012-02-06 16:58                                   ` David Miller
@ 2012-02-08 18:51                                   ` Eric Dumazet
  2012-02-08 20:50                                     ` David Miller
  1 sibling, 1 reply; 70+ messages in thread
From: Eric Dumazet @ 2012-02-08 18:51 UTC (permalink / raw)
  To: David Miller
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA, Hefty, Sean

Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
only, and failed on IB/ipoib traffic.

He provided a patch faking a zeroed header to let GRO aggregates frames.

Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
check to be more generic, ie not assuming L2 header is 14 bytes, but
taking into account hard_header_len.

__napi_gro_receive() has special handling for the common case (Ethernet)
to avoid a memcmp() call and use an inline optimized function instead.

Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reported-by: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Tested-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 net/core/dev.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index f124947..763a0ed 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3491,14 +3491,20 @@ static inline gro_result_t
 __napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	unsigned int maclen = skb->dev->hard_header_len;
 
 	for (p = napi->gro_list; p; p = p->next) {
 		unsigned long diffs;
 
 		diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
 		diffs |= p->vlan_tci ^ skb->vlan_tci;
-		diffs |= compare_ether_header(skb_mac_header(p),
-					      skb_gro_mac_header(skb));
+		if (maclen == ETH_HLEN)
+			diffs |= compare_ether_header(skb_mac_header(p),
+						      skb_gro_mac_header(skb));
+		else if (!diffs)
+			diffs = memcmp(skb_mac_header(p),
+				       skb_gro_mac_header(skb),
+				       maclen);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 		NAPI_GRO_CB(p)->flush = 0;
 	}


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
  2012-02-08  0:51                                                                                 ` [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses Roland Dreier
  2012-02-08  7:29                                                                                   ` Hefty, Sean
@ 2012-02-08 20:50                                                                                   ` David Miller
  1 sibling, 0 replies; 70+ messages in thread
From: David Miller @ 2012-02-08 20:50 UTC (permalink / raw)
  To: roland
  Cc: eric.dumazet, ogerlitz, sean.hefty, herbert, linux-rdma, shlomop, netdev

From: Roland Dreier <roland@kernel.org>
Date: Tue,  7 Feb 2012 16:51:21 -0800

> From: Roland Dreier <roland@purestorage.com>
> 
> Commit a0417fa3a18a ("net: Make qdisc_skb_cb upper size bound
> explicit.") made it possible for a netdev driver to use skb->cb
> between its header_ops.create method and its .ndo_start_xmit
> method.  Use this in ipoib_hard_header() to stash away the LL address
> (GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
> IPoIB to stop lying about its hard_header_len, which will let us fix
> the L2 check for GRO.
> 
> Signed-off-by: Roland Dreier <roland@purestorage.com>
> ---
> OK, this works for me.  Definitely looks cleaner and also looks
> obvious enough that it's probably even correct.

I'll toss this, and Eric's updated patch into net-next, thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
  2012-02-08 18:51                                   ` [PATCH net-next] gro: more generic L2 header check Eric Dumazet
@ 2012-02-08 20:50                                     ` David Miller
       [not found]                                       ` <20120208.155027.599792363539233740.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-08 20:50 UTC (permalink / raw)
  To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Wed, 08 Feb 2012 19:51:50 +0100

> Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
> only, and failed on IB/ipoib traffic.
> 
> He provided a patch faking a zeroed header to let GRO aggregates frames.
> 
> Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
> check to be more generic, ie not assuming L2 header is 14 bytes, but
> taking into account hard_header_len.
> 
> __napi_gro_receive() has special handling for the common case (Ethernet)
> to avoid a memcmp() call and use an inline optimized function instead.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reported-by: Shlomo Pongratz <shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
> Tested-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                       ` <20120208.155027.599792363539233740.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-08 21:08                                         ` Or Gerlitz
       [not found]                                           ` <CAJZOPZL_zEbNwUfVOmQeODny7HDf24gOc1HStfBxim_UtD-kvA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-08 21:08 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Wed, Feb 8, 2012 at 10:50 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
> > only, and failed on IB/ipoib traffic.
> > He provided a patch faking a zeroed header to let GRO aggregates frames.
> >
> > Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
> > check to be more generic, ie not assuming L2 header is 14 bytes, but
> > taking into account hard_header_len.
> >
> > __napi_gro_receive() has special handling for the common case (Ethernet)
> > to avoid a memcmp() call and use an inline optimized function instead.

> Applied.

Hi Dave, for correct operation / future bisection, you should 1st
apply Roland's patch which reduces the hard header len advertized by
ipoib to be only the size of the ipoib header without that 20 bytes
headroom, else the gro memcmp will be issued on the ipoib header and
then 20 bytes of the ip header, kind of back to square one...

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                           ` <CAJZOPZL_zEbNwUfVOmQeODny7HDf24gOc1HStfBxim_UtD-kvA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-08 21:31                                             ` David Miller
       [not found]                                               ` <20120208.163159.2229331610142060560.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-08 21:31 UTC (permalink / raw)
  To: or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Wed, 8 Feb 2012 23:08:26 +0200

> On Wed, Feb 8, 2012 at 10:50 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> > Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
>> > only, and failed on IB/ipoib traffic.
>> > He provided a patch faking a zeroed header to let GRO aggregates frames.
>> >
>> > Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
>> > check to be more generic, ie not assuming L2 header is 14 bytes, but
>> > taking into account hard_header_len.
>> >
>> > __napi_gro_receive() has special handling for the common case (Ethernet)
>> > to avoid a memcmp() call and use an inline optimized function instead.
> 
>> Applied.
> 
> Hi Dave, for correct operation / future bisection, you should 1st
> apply Roland's patch which reduces the hard header len advertized by
> ipoib to be only the size of the ipoib header without that 20 bytes
> headroom, else the gro memcmp will be issued on the ipoib header and
> then 20 bytes of the ip header, kind of back to square one...

I did.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                               ` <20120208.163159.2229331610142060560.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-08 21:49                                                 ` Or Gerlitz
       [not found]                                                   ` <CAJZOPZLLteDbm0prTN3-npubtFun=kO2DYT7ea=E-HyJ84gaiA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-08 21:49 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Wed, Feb 8, 2012 at 11:31 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

>> Hi Dave, for correct operation / future bisection, you should 1st
>> apply Roland's patch which reduces the hard header len advertized by
>> ipoib to be only the size of the ipoib header without that 20 bytes
>> headroom, else the gro memcmp will be issued on the ipoib header and
>> then 20 bytes of the ip header, kind of back to square one...

> I did.

Yep, I missed your earlier response to Roland's patch. So a question
here, how would you recommend to get that out... since the fix for the
problem includes two patches to core networking and one patch to
ipoib, we can't just ship a patch to ipoib which people can apply
build against their kernels. You've placed this work in net-next and
not net, so I assume we should wait for 3.4-rc1 and then push to
-stable? any other method to use here that you can recommend? thanks,

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                   ` <CAJZOPZLLteDbm0prTN3-npubtFun=kO2DYT7ea=E-HyJ84gaiA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-08 23:09                                                     ` David Miller
       [not found]                                                       ` <20120208.180917.586628615268005115.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-08 23:09 UTC (permalink / raw)
  To: or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Wed, 8 Feb 2012 23:49:43 +0200

> On Wed, Feb 8, 2012 at 11:31 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> From: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> 
>>> Hi Dave, for correct operation / future bisection, you should 1st
>>> apply Roland's patch which reduces the hard header len advertized by
>>> ipoib to be only the size of the ipoib header without that 20 bytes
>>> headroom, else the gro memcmp will be issued on the ipoib header and
>>> then 20 bytes of the ip header, kind of back to square one...
> 
>> I did.
> 
> Yep, I missed your earlier response to Roland's patch. So a question
> here, how would you recommend to get that out... since the fix for the
> problem includes two patches to core networking and one patch to
> ipoib, we can't just ship a patch to ipoib which people can apply
> build against their kernels. You've placed this work in net-next and
> not net, so I assume we should wait for 3.4-rc1 and then push to
> -stable? any other method to use here that you can recommend? thanks,

I don't think this is an appropriate bug fix at all.

Apparently this problem has existed since day one and the world has
kept on spinning meanwhile.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                       ` <20120208.180917.586628615268005115.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-08 23:20                                                         ` Or Gerlitz
       [not found]                                                           ` <CAJZOPZ+SyGsDpnK89wNWtBn4Jdzc2tg1diKjbHSv=n75XiG+ug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-08 23:20 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 9, 2012 at 1:09 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> I don't think this is an appropriate bug fix at all.

I'm not sure to understand

> Apparently this problem has existed since day one and the world has
> kept on spinning meanwhile.

Dave, the bug was introduced on 2.6.38 when LRO was removed from IPoIB
and GRO added, so users who run distributions containing pre 2.6.38
kernels don't hit the bug.

So with the world keep spinning, GRO becomes critical when throwing
ipoib into the virtualization space, as except for LRO being
deprecated, its not supported by the virtualization stack (e.g the
bridge), and here GRO should come into play, and will, after the
fix... BTW - ipoib surely can't be plugged to bridge/macvtap as is,
but this is different story to cope with, now one less obstacle, GRO
works!

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                           ` <CAJZOPZ+SyGsDpnK89wNWtBn4Jdzc2tg1diKjbHSv=n75XiG+ug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-08 23:26                                                             ` David Miller
       [not found]                                                               ` <20120208.182629.1816927738607872730.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-08 23:26 UTC (permalink / raw)
  To: or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Thu, 9 Feb 2012 01:20:46 +0200

> On Thu, Feb 9, 2012 at 1:09 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> I don't think this is an appropriate bug fix at all.
> 
> I'm not sure to understand
> 
>> Apparently this problem has existed since day one and the world has
>> kept on spinning meanwhile.
> 
> Dave, the bug was introduced on 2.6.38 when LRO was removed from IPoIB
> and GRO added, so users who run distributions containing pre 2.6.38
> kernels don't hit the bug.

This I didn't know, Ok I'll apply this stuff to the net tree and add it
to my -stable queue as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                               ` <20120208.182629.1816927738607872730.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-09 10:46                                                                 ` Or Gerlitz
       [not found]                                                                   ` <4F33A3FC.1050205-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-09 10:46 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

On 2/9/2012 1:26 AM, David Miller wrote:
> the bug was introduced on 2.6.38 when LRO was removed from IPoIB
> and GRO added, so users who run distributions containing pre 2.6.38
> kernels don't hit the bug.
>
> This I didn't know, Ok I'll apply this stuff to the net tree and add it
> to my -stable queue as well.

Dave,

Thanks, going net and -stable will help here. I'm still cloning net, but 
from git web looking it seems that you've applied Roland's and Eric's 
patches on net but not yours... e.g net-next commit 
a0417fa3a18a14be1f4d9cffcf378a7c42d92a91 
"net: Make qdisc_skb_cb upper size bound explicit" is missing from net, 
and it has to be applied before Roland's.

Or.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                                   ` <4F33A3FC.1050205-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-02-09 18:52                                                                     ` David Miller
       [not found]                                                                       ` <20120209.135235.957414875716615693.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-09 18:52 UTC (permalink / raw)
  To: ogerlitz-VPRAkNaXOzVWk0Htik3J/w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	herbert-F6s6mLieUQo7FNHlEwC/lvQIK84fMopw,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	shlomop-VPRAkNaXOzVWk0Htik3J/w, roland-DgEjT+Ai2ygdnm+yROfE0A,
	netdev-u79uwXL29TY76Z2rM5mHXA

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Date: Thu, 9 Feb 2012 12:46:20 +0200

> Thanks, going net and -stable will help here. I'm still cloning net,
> but from git web looking it seems that you've applied Roland's and
> Eric's patches on net but not yours... e.g net-next commit
> a0417fa3a18a14be1f4d9cffcf378a7c42d92a91 "net: Make qdisc_skb_cb upper
> size bound explicit" is missing from net, and it has to be applied
> before Roland's.

I've fix this now, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                                       ` <20120209.135235.957414875716615693.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-09 20:04                                                                         ` Or Gerlitz
  2012-02-09 20:29                                                                           ` David Miller
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-09 20:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 9, 2012 at 8:52 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:

> I've fix this now, thanks.

Yep, I see it there, still, for correctness/biscetion, you should
rebase such that the order of commits when pushed to Linus is

3. 5ca3b72c5da47d95b83857b768def6172fbc080a "gro: more generic L2 header check"

2. 936d7de3d736e0737542641269436f4b5968e9ef "IPoIB: Stop lying about
hard_header_len and use skb->cb to stash LL addresses"

1. 16bda13d90c8d5da243e2cfa1677e62ecce26860 "net: Make qdisc_skb_cb
upper size bound explicit" comes before "

currently its 2,3,1

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
  2012-02-09 20:04                                                                         ` Or Gerlitz
@ 2012-02-09 20:29                                                                           ` David Miller
       [not found]                                                                             ` <20120209.152943.1123862542883793404.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: David Miller @ 2012-02-09 20:29 UTC (permalink / raw)
  To: or.gerlitz; +Cc: eric.dumazet, linux-rdma, netdev

From: Or Gerlitz <or.gerlitz@gmail.com>
Date: Thu, 9 Feb 2012 22:04:34 +0200

> On Thu, Feb 9, 2012 at 8:52 PM, David Miller <davem@davemloft.net> wrote:
> 
>> I've fix this now, thanks.
> 
> Yep, I see it there, still, for correctness/biscetion, you should
> rebase such that the order of commits when pushed to Linus is

I cannot rebase my tree, too many people use it and I would break
their stuff.

We simply have to live with this error.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                                             ` <20120209.152943.1123862542883793404.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-02-09 22:18                                                                               ` Or Gerlitz
       [not found]                                                                                 ` <CAJZOPZKUG-tT+Zd=1EiLhgRxg4VEDrisBJQ99XtARMvK68UWfA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Or Gerlitz @ 2012-02-09 22:18 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 9, 2012 at 10:29 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> I cannot rebase my tree, too many people use it and I would break
> their stuff. We simply have to live with this error.

I see, so you will add CC to stable, such that they will be picked
into -stable once you push them to Linus?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH net-next] gro: more generic L2 header check
       [not found]                                                                                 ` <CAJZOPZKUG-tT+Zd=1EiLhgRxg4VEDrisBJQ99XtARMvK68UWfA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-02-09 22:28                                                                                   ` David Miller
  0 siblings, 0 replies; 70+ messages in thread
From: David Miller @ 2012-02-09 22:28 UTC (permalink / raw)
  To: or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA

From: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Fri, 10 Feb 2012 00:18:30 +0200

> On Thu, Feb 9, 2012 at 10:29 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> I cannot rebase my tree, too many people use it and I would break
>> their stuff. We simply have to live with this error.
> 
> I see, so you will add CC to stable, such that they will be picked
> into -stable once you push them to Linus?

This is so tiring explaining all of this stuff...

I submit stable patches for networking directly myself, at a time of
my own choosing, once I believe that the changes have had enough
exposure and testing.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2012-02-09 22:28 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <alpine.LRH.2.00.1201261624540.30384@ogerlitz.voltaire.com>
     [not found] ` <alpine.LRH.2.00.1201261642340.31408@ogerlitz.voltaire.com>
     [not found]   ` <alpine.LRH.2.00.1201261642340.31408-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2012-01-30  4:36     ` [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated TCP streams Roland Dreier
     [not found]       ` <CAL1RGDUm8ROxFFMa+D1ZD5jF+cK+kV8aEVzspgnZFNXeuai+fA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-01-30  7:44         ` Shlomo Pongratz
     [not found]           ` <36F7E4A28C18BE4DB7C86058E7B607240BE9687A-SlGPd/IId7auSA5JZHE7gA@public.gmane.org>
2012-01-30 18:11             ` Roland Dreier
     [not found]               ` <CAL1RGDXjjQ-PhCv-9WJX45NuovC9XiS=_7507OsyvLW_gBaJ5g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-01-30 18:16                 ` Or Gerlitz
     [not found]                   ` <CAJZOPZLhgDysSASyMLNpOrmGzEyfyHAQjGLVei6ZNSFfb7TM1w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-01-30 19:00                     ` Roland Dreier
2012-01-30  7:44         ` Or Gerlitz
     [not found]           ` <4F264A6C.3070706-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-01-30  8:04             ` Eric Dumazet
2012-01-30  8:11               ` Or Gerlitz
2012-01-30  8:18               ` Herbert Xu
     [not found]                 ` <20120130081849.GA7848-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2012-01-30  8:53                   ` Eric Dumazet
2012-01-30  8:57                     ` Herbert Xu
     [not found]                       ` <20120130085742.GA8262-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2012-01-30 16:43                         ` David Miller
2012-02-01  8:23                         ` Or Gerlitz
     [not found]                           ` <CAJZOPZJb1HvcS0XXKLvoDuoi1EfYTY-awwY2g0aHWoS=4qmdyQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-01  8:38                             ` Herbert Xu
     [not found]                               ` <20120201083837.GA7081-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
2012-02-01  9:43                                 ` Or Gerlitz
2012-02-01 14:07                                   ` Eric Dumazet
2012-02-02 14:01                                     ` Or Gerlitz
     [not found]                                       ` <4F2A974B.209-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-02-02 14:38                                         ` Eric Dumazet
2012-02-02 14:44                                           ` Eric Dumazet
2012-02-02 21:42                                             ` Or Gerlitz
2012-02-02 15:43                                           ` Or Gerlitz
2012-01-30  8:25               ` Michał Mirosław
2012-02-02 21:58     ` Or Gerlitz
     [not found]       ` <alpine.LRH.2.00.1202022352560.30300-VYr5/9ddeaGSIdy2EShu12Xnswh1EIUO@public.gmane.org>
2012-02-03  7:18         ` Hefty, Sean
     [not found]           ` <1828884A29C6694DAF28B7E6B8A823733349A5D2-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-02-03  9:00             ` Eric Dumazet
2012-02-03 20:24               ` Hefty, Sean
     [not found]                 ` <1828884A29C6694DAF28B7E6B8A823733349A6C9-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-02-06 15:05                   ` Or Gerlitz
     [not found]                     ` <4F2FEC36.6090800-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-02-06 15:21                       ` Eric Dumazet
2012-02-06 15:22                         ` Or Gerlitz
     [not found]                           ` <4F2FF050.7040400-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-02-06 16:27                             ` [PATCH net-next] gro: introduce gro_mac_header_len Eric Dumazet
2012-02-06 16:31                               ` David Miller
     [not found]                                 ` <20120206.113145.1284864994961472499.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-06 16:44                                   ` Or Gerlitz
     [not found]                                     ` <4F300353.2080705-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-02-06 17:00                                       ` David Miller
2012-02-06 16:47                                 ` [PATCH net-next V2] " Eric Dumazet
2012-02-06 16:58                                   ` David Miller
2012-02-06 17:07                                     ` Eric Dumazet
2012-02-06 17:11                                       ` Or Gerlitz
     [not found]                                         ` <4F3009AE.2090605-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-02-06 17:19                                           ` Eric Dumazet
     [not found]                                     ` <20120206.115859.1384761795375582044.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-06 17:09                                       ` Or Gerlitz
2012-02-06 17:23                                       ` Roland Dreier
     [not found]                                         ` <CAG4TOxOBCbEEOtP62ZM1R3-6umebdbJFkK0bnjphLCZOgHzt1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-06 19:12                                           ` David Miller
     [not found]                                             ` <20120206.141223.332863167187002998.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-06 19:23                                               ` Roland Dreier
     [not found]                                                 ` <CAG4TOxNgJ3=AFuA==Km815YzW8eQ7nD_UAzkXLSXcAyaCAAvPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-06 19:32                                                   ` David Miller
     [not found]                                                     ` <20120206.143230.1415707004934341114.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-06 19:51                                                       ` David Miller
     [not found]                                                         ` <20120206.145148.558736903670696169.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-06 19:56                                                           ` David Miller
     [not found]                                                             ` <20120206.145652.1575591691467905094.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-06 20:01                                                               ` Roland Dreier
     [not found]                                                                 ` <CAG4TOxNcB8+SP3f0WhZG1GZ481s+=7pVRv91vzAbwDzUZD8g_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-06 20:15                                                                   ` David Miller
     [not found]                                                                     ` <20120206.151509.1959432192519622134.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-07 19:51                                                                       ` Roland Dreier
2012-02-07 20:33                                                                         ` David Miller
     [not found]                                                                           ` <20120207.153325.1941809701255235550.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-07 20:34                                                                             ` Roland Dreier
     [not found]                                                                               ` <CAG4TOxP2iEgT64L8vAR6P139U-HyHSxU8gdWw+cLsDwWSGkrwA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-08  0:51                                                                                 ` [PATCH] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses Roland Dreier
2012-02-08  7:29                                                                                   ` Hefty, Sean
     [not found]                                                                                     ` <1828884A29C6694DAF28B7E6B8A82373374C1C26-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2012-02-08  7:50                                                                                       ` Eric Dumazet
2012-02-08 14:28                                                                                         ` Or Gerlitz
2012-02-08 20:50                                                                                   ` David Miller
2012-02-06 19:59                                                           ` [PATCH net-next V2] gro: introduce gro_mac_header_len Roland Dreier
2012-02-08 18:51                                   ` [PATCH net-next] gro: more generic L2 header check Eric Dumazet
2012-02-08 20:50                                     ` David Miller
     [not found]                                       ` <20120208.155027.599792363539233740.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-08 21:08                                         ` Or Gerlitz
     [not found]                                           ` <CAJZOPZL_zEbNwUfVOmQeODny7HDf24gOc1HStfBxim_UtD-kvA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-08 21:31                                             ` David Miller
     [not found]                                               ` <20120208.163159.2229331610142060560.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-08 21:49                                                 ` Or Gerlitz
     [not found]                                                   ` <CAJZOPZLLteDbm0prTN3-npubtFun=kO2DYT7ea=E-HyJ84gaiA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-08 23:09                                                     ` David Miller
     [not found]                                                       ` <20120208.180917.586628615268005115.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-08 23:20                                                         ` Or Gerlitz
     [not found]                                                           ` <CAJZOPZ+SyGsDpnK89wNWtBn4Jdzc2tg1diKjbHSv=n75XiG+ug-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-08 23:26                                                             ` David Miller
     [not found]                                                               ` <20120208.182629.1816927738607872730.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-09 10:46                                                                 ` Or Gerlitz
     [not found]                                                                   ` <4F33A3FC.1050205-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2012-02-09 18:52                                                                     ` David Miller
     [not found]                                                                       ` <20120209.135235.957414875716615693.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-09 20:04                                                                         ` Or Gerlitz
2012-02-09 20:29                                                                           ` David Miller
     [not found]                                                                             ` <20120209.152943.1123862542883793404.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2012-02-09 22:18                                                                               ` Or Gerlitz
     [not found]                                                                                 ` <CAJZOPZKUG-tT+Zd=1EiLhgRxg4VEDrisBJQ99XtARMvK68UWfA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-02-09 22:28                                                                                   ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).