* [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call
@ 2016-09-28  9:08 Shmulik Ladkani
  2016-09-28 10:30 ` Daniel Borkmann
  0 siblings, 1 reply; 6+ messages in thread
From: Shmulik Ladkani @ 2016-09-28  9:08 UTC (permalink / raw)
  To: David S. Miller, Pravin Shelar
  Cc: Daniel Borkmann, netdev, Shmulik Ladkani, Jiri Pirko

From: Shmulik Ladkani <shmulik.ladkani@gmail.com>

In case 'skb_vlan_push' is called on an skb with a hw-accel vlan tag
present, the existing hw-accel tag is inserted into the payload, and
the new given tag is placed as new hw-accel tag.

In order to insert the existing hw-accel tag, 'skb_vlan_push' moves
the 'data' pointer to the mac_header (if needed), invokes __vlan_insert_tag,
and finally re-adjusts 'data' back to its original position (according
to the remembered "adjustment offset").

However, a successful '__vlan_insert_tag' pushes 4 more bytes at the start
of the frame.
The remembered "adjustment offset" is NOT updated to account for
these additional 4 bytes, so the subsequent '__skb_pull(skb, offset)'
fails to unwind 'data' back to its original location.

Since 'skb->mac_len' IS fixed to account for the additional 4 bytes
(incremented to a total of 18 bytes), any access to
'skb->data - skb->mac_len' points to bytes PRIOR to the start of the frame.

This is problematic, as many constructs in the stack issue
'skb_push(skb, skb->mac_len)' prior to xmit to a device (e.g. tcf_mirred,
tcf_bpf, nf_dup_netdev_egress), resulting in bogus frames being
xmitted (having 4 random bytes at the start of the frame).

For example:

 # ip l add dev d0 type dummy
 # tc filter add dev eth0 parent ffff: pref 1 basic \
     action vlan push protocol 802.1ad id 5 pipe \
     action mirred egress redirect dev d0

 Any 802.1q (hw-accel) tagged frames arriving on eth0 are xmitted as
 bogus frames on d0; whereas the expected behavior is having QinQ frames.

Fix by properly accounting for the additionally pushed 4 bytes (in the case
where an adjustment to point at mac_header was done).

Fixes: 93515d53b1 ("net: move vlan pop/push functions into common code")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Cc: Jiri Pirko <jiri@mellanox.com>
---
 v2: Instead of reducing mac_len by 4 bytes, which was found incorrect,
     fix the problem of wrong unwinding of 'skb->data'

 David, if patch ok, suggest this goes to -stable
   
 net/core/skbuff.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d36c754..3926b79 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4608,6 +4608,8 @@ int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci)
 
 		skb->protocol = skb->vlan_proto;
 		skb->mac_len += VLAN_HLEN;
+		if (offset)
+			offset += VLAN_HLEN;
 
 		skb_postpush_rcsum(skb, skb->data + (2 * ETH_ALEN), VLAN_HLEN);
 		__skb_pull(skb, offset);
-- 
1.9.1
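
To make the offset arithmetic in the commit message concrete, here is a minimal userspace sketch. It is not kernel code: 'struct toy_skb' and the 'toy_*' helpers are hypothetical stand-ins that only track where 'data' and the mac_header sit inside a byte buffer, and the starting index 32 is arbitrary headroom. The 'apply_fix' flag toggles the two lines added by the hunk above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN  6
#define ETH_HLEN  14
#define VLAN_HLEN 4

/* Toy stand-in for struct sk_buff (hypothetical, userspace only). */
struct toy_skb {
	uint8_t buf[64];
	unsigned int data;       /* index of skb->data within buf */
	unsigned int mac_header; /* index of the MAC header within buf */
	unsigned int mac_len;
};

static void toy_push(struct toy_skb *skb, unsigned int n)
{
	assert(skb->data >= n);  /* must not run out of headroom */
	skb->data -= n;
}

static void toy_pull(struct toy_skb *skb, unsigned int n)
{
	skb->data += n;
}

/* Models __vlan_insert_tag(): the frame grows by 4 bytes into the headroom. */
static void toy_insert_tag(struct toy_skb *skb, uint16_t proto, uint16_t tci)
{
	uint8_t *tag;

	assert(skb->data == skb->mac_header);
	toy_push(skb, VLAN_HLEN);
	memmove(&skb->buf[skb->data], &skb->buf[skb->data + VLAN_HLEN],
		2 * ETH_ALEN);
	skb->mac_header -= VLAN_HLEN;
	tag = &skb->buf[skb->data + 2 * ETH_ALEN];
	tag[0] = (uint8_t)(proto >> 8);
	tag[1] = (uint8_t)(proto & 0xff);
	tag[2] = (uint8_t)(tci >> 8);
	tag[3] = (uint8_t)(tci & 0xff);
}

/* Models the hw-accel branch of skb_vlan_push() discussed in this patch. */
static void toy_vlan_push(struct toy_skb *skb, int apply_fix)
{
	unsigned int offset = skb->data - skb->mac_header;

	toy_push(skb, offset);              /* move data to the MAC header */
	toy_insert_tag(skb, 0x8100, 5);     /* example tag values */
	skb->mac_len += VLAN_HLEN;
	if (apply_fix && offset)
		offset += VLAN_HLEN;        /* the two added lines */
	toy_pull(skb, offset);              /* unwind */
}

/* Returns data - mac_header after the push; mac_len is 18 at that point. */
unsigned int toy_vlan_data_offset(unsigned int initial_offset, int apply_fix)
{
	struct toy_skb skb = { .mac_header = 32, .mac_len = ETH_HLEN };

	skb.data = skb.mac_header + initial_offset;
	toy_vlan_push(&skb, apply_fix);
	return skb.data - skb.mac_header;
}
```

Under these assumptions, toy_vlan_data_offset(14, 0) leaves 'data' only 14 bytes past the mac_header while mac_len is already 18, so a later skb_push(skb, skb->mac_len) would land 4 bytes before the frame. toy_vlan_data_offset(14, 1) leaves 'data' 18 bytes past the mac_header. With an initial offset of 0 both variants behave the same.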


* Re: [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call
  2016-09-28  9:08 [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call Shmulik Ladkani
@ 2016-09-28 10:30 ` Daniel Borkmann
  2016-09-28 11:56   ` Shmulik Ladkani
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Borkmann @ 2016-09-28 10:30 UTC (permalink / raw)
  To: Shmulik Ladkani, David S. Miller, Pravin Shelar
  Cc: netdev, Shmulik Ladkani, Jiri Pirko

On 09/28/2016 11:08 AM, Shmulik Ladkani wrote:
> From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
>
> In case 'skb_vlan_push' is called on an skb with a hw-accel vlan tag
> present, the existing hw-accel tag is inserted into the payload, and
> the new given tag is placed as new hw-accel tag.
>
> In order to insert the existing hw-accel tag, 'skb_vlan_push' adjusts
> the 'data' pointer at the mac_header (if needed), invokes __vlan_insert_tag,
> and finally re-adjusts 'data' back to its original position (according
> to the remembered "adjustment offset").
>
> However, successful '__vlan_insert_tag' pushes 4 more bytes at start of
> frame.
> Alas, the remembered "adjustment offset" is NOT fixed to account for
> these additional 4 bytes, so the subsequent '__skb_pull(skb, offset)'
> fails to unwind 'data' back to its original location.
>
> Since 'skb->mac_len' IS fixed to account for the additional 4 bytes
> (incremented to a total of 18 bytes), any access to
> 'skb->data - skb->mac_len' points to bytes PRIOR start of frame.
>
> This is problematic, as many constructs in the stack are issuing
> 'skb_push(skb, skb->mac_len)' prior xmit to a device (e.g tcf_mirred,
> tcf_bpf, nf_dup_netdev_egress), resulting in bogus frames being
> xmitted (having random 4 bytes at start of frame).
>
> For example:
>
>   # ip l add dev d0 type dummy
>   # tc filter add dev eth0 parent ffff: pref 1 basic \
>       action vlan push protocol 802.1ad id 5 pipe \
>       action mirred egress redirect dev d0
>
>   Any 802.1q (hw-accel) tagged frames arriving on eth0 are xmitted as
>   bogus frames on d0; whereas the expected behavior is having QinQ frames.
>
> Fix, by properly accounting the additionally pushed 4 bytes (in the case
> where an adjustment to point at mac_header was done).
>
> Fixes: 93515d53b1 ("net: move vlan pop/push functions into common code")
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Cc: Pravin Shelar <pshelar@ovn.org>
> Cc: Jiri Pirko <jiri@mellanox.com>
> ---
>   v2: Instead of reducing mac_len by 4 bytes, which was found incorrect,
>       fix the problem of wrong unwinding of 'skb->data'
>
>   David, if patch ok, suggest this goes to -stable
>
>   net/core/skbuff.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index d36c754..3926b79 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4608,6 +4608,8 @@ int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci)
>
>   		skb->protocol = skb->vlan_proto;
>   		skb->mac_len += VLAN_HLEN;
> +		if (offset)
> +			offset += VLAN_HLEN;
>
>   		skb_postpush_rcsum(skb, skb->data + (2 * ETH_ALEN), VLAN_HLEN);
>   		__skb_pull(skb, offset);

This looks much better indeed than your v1 of this patch.

v1 would have definitely changed existing behavior for the cls_bpf/act_bpf
case. Both start at the beginning of the mac header, on the ingress and egress
side. So when a frame comes in on ingress, skb->data points to the start of the
net header; we then do __skb_push(skb, skb->mac_len) before running the BPF
prog, and after returning from the BPF prog we __skb_pull(skb, skb->mac_len)
again to return to the original location. Thus, when skb_vlan_push() is called
from the BPF helper, offset is always 0; perhaps it is similar in the ovs case.

With the removed skb->mac_len adjustment in skb_vlan_push() from your v1, we
would then have pointed into vlan header on return to stack instead of net
header location as we do currently.
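
The ingress sequence described above can be traced with a few lines of arithmetic. This is only a sketch under stated assumptions: 'bpf_ingress_data_offset' is a hypothetical helper, 'data' is tracked as a plain offset from the (moving) mac_header rather than as a pointer, and the skb is assumed to already carry a hw-accel tag so that skb_vlan_push() really inserts 4 bytes.

```c
#define ETH_HLEN  14
#define VLAN_HLEN 4

/*
 * Offset of skb->data from the MAC header after the modelled cls_bpf
 * ingress path has run a program that calls skb_vlan_push().
 * keep_mac_len_update = 1 models the code with this v2 patch applied,
 * keep_mac_len_update = 0 models v1, which dropped 'mac_len += VLAN_HLEN'.
 */
unsigned int bpf_ingress_data_offset(int keep_mac_len_update)
{
	unsigned int mac_len = ETH_HLEN;
	unsigned int data = ETH_HLEN;  /* ingress: data at the net header */
	unsigned int offset;

	data -= mac_len;               /* __skb_push(skb, skb->mac_len) */

	offset = data;                 /* inside skb_vlan_push(): always 0 here */
	/* Tag inserted: header grows by 4, data stays at the MAC header. */
	if (keep_mac_len_update)
		mac_len += VLAN_HLEN;
	if (offset)
		offset += VLAN_HLEN;   /* v2 hunk, never taken on this path */
	data += offset;                /* __skb_pull(skb, offset) */

	data += mac_len;               /* __skb_pull(skb, skb->mac_len) */
	return data;
}
```

In this model the v2 behavior returns 18, which is the net header of the now-tagged frame, while the v1 behavior returns 14, which falls inside the freshly inserted vlan header.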

So the issue might only be visible to act_vlan as the other remaining user of
skb_vlan_push(). Above fix looks better to me. So if we don't start at the
mac header yet, we need to do the __skb_push()/__skb_pull() adjustment from
there, and since we expand mac header, we need to take these 4 bytes into
account as well for returning to original location. My only question would
be: what about __skb_vlan_pop(), wouldn't that then need the same adjustment
a la offset -= VLAN_HLEN?

Thanks,
Daniel


* Re: [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call
  2016-09-28 10:30 ` Daniel Borkmann
@ 2016-09-28 11:56   ` Shmulik Ladkani
  2016-09-28 14:43     ` Daniel Borkmann
  0 siblings, 1 reply; 6+ messages in thread
From: Shmulik Ladkani @ 2016-09-28 11:56 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David S. Miller, Pravin Shelar, netdev, Shmulik Ladkani, Jiri Pirko

Hi,

On Wed, 28 Sep 2016 12:30:56 +0200, daniel@iogearbox.net wrote:
> > @@ -4608,6 +4608,8 @@ int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci)
> >
> >   		skb->protocol = skb->vlan_proto;
> >   		skb->mac_len += VLAN_HLEN;
> > +		if (offset)
> > +			offset += VLAN_HLEN;
> >
> >   		skb_postpush_rcsum(skb, skb->data + (2 * ETH_ALEN), VLAN_HLEN);
> >   		__skb_pull(skb, offset);
> 
> This looks much better indeed than your v1 of this patch.

Yep, after some meditation and history digging I happened to notice I
was barking up the wrong tree.

> So the issue might only be visible to act_vlan as the other remaining user of
> skb_vlan_push(). 

Yes, this is correct. I'll amend the log message to express that.
The bug occurs for callers of skb_vlan_push() whose data is not
pointing at mac_header.

> My only question would be:
> what about __skb_vlan_pop(), wouldn't that then need the same adjustment
> a la offset -= VLAN_HLEN?

Well, theoretically, yes; but caller may expect 2 different things:

(assuming tags are in-payload)

(1) suppose upon entry we have

    DA,SA,0x8100,TCI,0x0800,
    ^                ^
    mac_hdr          data

initial offset is 18, and after current unwinding code we'll get

    DA,SA,0x0800,4_bytes,
    ^                    ^
    mac_hdr              data

which is probably incorrect, adjustment 'offset -= VLAN_HLEN' is needed.

(2) suppose upon entry we have

    DA,SA,0x8100,TCI,0x0800
    ^            ^
    mac_hdr      data

initial offset is 14, and after current unwinding code we'll get

    DA,SA,0x0800,
    ^            ^
    mac_hdr      data

which is probably what user has intended.
(had we adjusted offset to be 10, 'data' would point into SA)

From a test I ran using act_vlan on ingress with QinQ tags, the existing call
provides data as in (2).

Thoughts?
Should we adjust "offset" back, only if resulting offset is >=14 ?
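
To compare the options for the pop side, the two cases can be written out as plain arithmetic. This is a hedged sketch, not proposed kernel code: 'enum pop_policy' and 'data_offset_after_pop' are hypothetical names, and the function only computes where 'data' ends up relative to the mac_header once one 4-byte tag has been removed and 'offset' is pulled again.

```c
#define ETH_HLEN  14
#define VLAN_HLEN 4

/* Hypothetical unwinding policies for the pop path. */
enum pop_policy {
	POP_KEEP_OFFSET,      /* current code: pull the remembered offset as is */
	POP_ALWAYS_SUBTRACT,  /* offset -= VLAN_HLEN whenever possible */
	POP_SUBTRACT_IF_ROOM, /* subtract only if the result stays >= ETH_HLEN */
};

/* Offset of data from the MAC header after the tag has been popped. */
unsigned int data_offset_after_pop(unsigned int offset, enum pop_policy policy)
{
	switch (policy) {
	case POP_ALWAYS_SUBTRACT:
		if (offset >= VLAN_HLEN)   /* avoid unsigned underflow */
			offset -= VLAN_HLEN;
		break;
	case POP_SUBTRACT_IF_ROOM:
		if (offset >= ETH_HLEN + VLAN_HLEN)
			offset -= VLAN_HLEN;
		break;
	case POP_KEEP_OFFSET:
	default:
		break;
	}
	return offset;
}
```

With offset 18 (case 1) the current code yields 18, i.e. 4 bytes past the new 14-byte header, while both subtracting policies yield 14. With offset 14 (case 2) unconditional subtraction yields 10, which is inside SA, whereas the current code and the ">= 14" rule both yield 14.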

Thanks,
Shmulik


* Re: [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call
  2016-09-28 11:56   ` Shmulik Ladkani
@ 2016-09-28 14:43     ` Daniel Borkmann
  2016-09-28 17:11       ` Shmulik Ladkani
  2016-09-28 17:42       ` Shmulik Ladkani
  0 siblings, 2 replies; 6+ messages in thread
From: Daniel Borkmann @ 2016-09-28 14:43 UTC (permalink / raw)
  To: Shmulik Ladkani
  Cc: David S. Miller, Pravin Shelar, netdev, Shmulik Ladkani, Jiri Pirko

On 09/28/2016 01:56 PM, Shmulik Ladkani wrote:
> On Wed, 28 Sep 2016 12:30:56 +0200, daniel@iogearbox.net wrote:
>>> @@ -4608,6 +4608,8 @@ int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci)
>>>
>>>    		skb->protocol = skb->vlan_proto;
>>>    		skb->mac_len += VLAN_HLEN;
>>> +		if (offset)
>>> +			offset += VLAN_HLEN;
>>>
>>>    		skb_postpush_rcsum(skb, skb->data + (2 * ETH_ALEN), VLAN_HLEN);
>>>    		__skb_pull(skb, offset);
>>
>> This looks much better indeed than your v1 of this patch.
>
> Yep, after some meditation and history digging I happened to notice I
> was barking at the wrong tree.
>
>> So the issue might only be visible to act_vlan as the other remaining user of
>> skb_vlan_push().
>
> Yes, this is correct. I'll amend the log message to express that.
> The bug occurs for callers of skb_vlan_push() whose data is not
> pointing at mac_header.
>
>> My only question would be:
>> what about __skb_vlan_pop(), wouldn't that then need the same adjustment
>> a la offset -= VLAN_HLEN?
>
> Well, theoretically, yes; but caller may expect 2 different things:
>
> (assuming tags are in-payload)
>
> (1) suppose upon entry we have
>
>      DA,SA,0x8100,TCI,0x0800,
>      ^                ^
>      mac_hdr          data
>
> initial offset is 18, and after current unwinding code we'll get

You mean data points after the 0x0800, right?

>
>      DA,SA,0x0800,4_bytes,
>      ^                    ^
>      mac_hdr              data
>
> which is probably incorrect, adjustment 'offset -= VLAN_HLEN' is needed.
>
> (2) suppose upon entry we have
>
>      DA,SA,0x8100,TCI,0x0800
>      ^            ^
>      mac_hdr      data
>
> initial offset is 14, and after current unwinding code we'll get
>
>      DA,SA,0x0800,
>      ^            ^
>      mac_hdr      data
>
> which is probably what user has intended.
> (had we adjusted offset to be 10, 'data' would point into SA)
>
>  From test I've made using act_vlan upon ingress on QinQ tags, existing call
> provides data as in (2).
>
> Thoughts?

Yeah, so we likely end up at 2) because of things like eth_type_trans()
that would only pull ETH_HLEN.

Couldn't we end up with 1) for the act_vlan case when we'd have the
offset-adjusted skb_vlan_push() fix from here, where we'd then redirect
to ingress where skb_vlan_pop() would be called? If I'm not missing
something, skb_vlan_push() would then point to the data location of 1)
and with your other proposed direct netif_receive_skb() patch, no
further skb->data adjustments would be done, right?

Another potential issue (but unrelated to this fix here) I just noticed
is, whether act_vlan might have the same problem as we fixed in 8065694e6519
("bpf: fix checksum for vlan push/pop helper"). So potentially, we could
end up fixing CHECKSUM_COMPLETE wrongly on ingress, since these 14 bytes
are already pulled out of the sum at that point.

> Should we adjust "offset" back, only if resulting offset is >=14 ?

If also the checksum one might end up as an issue, maybe it's just best
to go through the pain and do the push/pull for data plus csum, so both
skb_vlan_*() functions see the frame starting from mac header temporarily?
Jiri, any thoughts?


* Re: [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call
  2016-09-28 14:43     ` Daniel Borkmann
@ 2016-09-28 17:11       ` Shmulik Ladkani
  2016-09-28 17:42       ` Shmulik Ladkani
  1 sibling, 0 replies; 6+ messages in thread
From: Shmulik Ladkani @ 2016-09-28 17:11 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David S. Miller, Pravin Shelar, netdev, Shmulik Ladkani, Jiri Pirko

On Wed, 28 Sep 2016 16:43:38 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:
> > (1) suppose upon entry we have
> >
> >      DA,SA,0x8100,TCI,0x0800,
> >      ^                ^
> >      mac_hdr          data
> >
> > initial offset is 18, and after current unwinding code we'll get  
> 
> You mean data points after the 0x0800, right?

Sorry. Yes, exactly as you say. Initially 18 bytes ahead:

    DA,SA,0x8100,TCI,0x0800,
    ^                       ^
    mac_hdr                 data


* Re: [PATCH v2 net] net: skbuff: skb_vlan_push: Fix wrong unwinding of skb->data after __vlan_insert_tag call
  2016-09-28 14:43     ` Daniel Borkmann
  2016-09-28 17:11       ` Shmulik Ladkani
@ 2016-09-28 17:42       ` Shmulik Ladkani
  1 sibling, 0 replies; 6+ messages in thread
From: Shmulik Ladkani @ 2016-09-28 17:42 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David S. Miller, Pravin Shelar, netdev, Shmulik Ladkani, Jiri Pirko

On Wed, 28 Sep 2016 16:43:38 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:
> Couldn't we end up with 1) for the act_vlan case when we'd have the
> offset-adjusted skb_vlan_push() fix from here, where we'd then redirect
> to ingress where skb_vlan_pop() would be called? If I'm not missing
> something, skb_vlan_push() would then point to the data location of 1)
> and with your other proposed direct netif_receive_skb() patch, no
> further skb->data adjustments would be done, right?

Right. Then skb_vlan_pop() should expect either (1) or (2).

> Another potential issue (but unrelated to this fix here) I just noticed
> is, whether act_vlan might have the same problem as we fixed in 8065694e6519
> ("bpf: fix checksum for vlan push/pop helper"). So potentially, we could
> end up fixing CHECKSUM_COMPLETE wrongly on ingress, since these 14 bytes
> are already pulled out of the sum at that point.
> 
> > Should we adjust "offset" back, only if resulting offset is >=14 ?  
> 
> If also the checksum one might end up as an issue, maybe it's just best
> to go through the pain and do the push/pull for data plus csum, so both
> skb_vlan_*() functions see the frame starting from mac header temporarily?

Although not related to this specific fix, I see 2 ways of addressing the
rcsum problem:

1. Per your suggestion, have skb_vlan_*() expect 'data' at mac_header.
   That would simplify things, including for this suggested 'data unwind' fix.

2. Within skb_vlan_*(), deduce (according to initial offset) whether
   we're already "pulled out" of the rcsum, and not invoke the
   skb_postpull/push_rcsum update.
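
A small userspace model of option 2 may help in weighing it. Everything here is an assumption made for illustration: 'csum_add' is a plain 16-bit ones' complement sum rather than the kernel's csum helpers, the frame contents are arbitrary, the receive checksum is assumed to cover the bytes from 'data' to the end of the frame, and the guard 'offset <= 2 * ETH_ALEN' is only one possible way to decide whether the inserted tag lies inside the covered region.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN  6
#define ETH_HLEN  14
#define VLAN_HLEN 4

/* Folded 16-bit ones' complement sum of len bytes, continued from sum. */
static uint16_t csum_add(uint16_t sum, const uint8_t *p, size_t len)
{
	uint32_t acc = sum;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		acc += (uint32_t)p[len - 1] << 8;
	while (acc >> 16)
		acc = (acc & 0xffff) + (acc >> 16);
	return (uint16_t)acc;
}

/*
 * Returns 1 if the checksum maintained across a tag push still matches a
 * fresh sum over the bytes from data onward, 0 otherwise.
 * guarded = 0: always add the tag bytes (unconditional postpush update).
 * guarded = 1: add them only if the tag lands inside the covered region.
 */
int rcsum_valid_after_push(unsigned int offset, int guarded)
{
	static const uint8_t tag[VLAN_HLEN] = { 0x81, 0x00, 0x00, 0x05 };
	uint8_t before[ETH_HLEN + 8], after[ETH_HLEN + VLAN_HLEN + 8];
	unsigned int new_offset = offset ? offset + VLAN_HLEN : 0;
	uint16_t rcsum, fresh;
	size_t i;

	for (i = 0; i < sizeof(before); i++)
		before[i] = (uint8_t)(i * 7 + 1);  /* arbitrary contents */

	/* Tagged frame: DA+SA, the 4-byte tag, then the rest unchanged. */
	memcpy(after, before, 2 * ETH_ALEN);
	memcpy(after + 2 * ETH_ALEN, tag, VLAN_HLEN);
	memcpy(after + 2 * ETH_ALEN + VLAN_HLEN, before + 2 * ETH_ALEN,
	       sizeof(before) - 2 * ETH_ALEN);

	rcsum = csum_add(0, before + offset, sizeof(before) - offset);
	if (!guarded || offset <= 2 * ETH_ALEN)
		rcsum = csum_add(rcsum, tag, VLAN_HLEN);

	fresh = csum_add(0, after + new_offset, sizeof(after) - new_offset);
	return rcsum == fresh;
}
```

In this model, with 'data' at the mac_header (offset 0) the unconditional update stays consistent, but with the 14 header bytes already pulled (offset 14) it adds bytes that the sum never covers, and only the guarded variant keeps the checksum valid.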

Will meditate some more.

Thanks
Shmulik
