netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
@ 2020-08-04  9:37 Sabrina Dubroca
  2020-08-04 12:32 ` Bram Yvakh
  2020-08-27  7:08 ` Steffen Klassert
  0 siblings, 2 replies; 8+ messages in thread
From: Sabrina Dubroca @ 2020-08-04  9:37 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, Sabrina Dubroca, Xiumei Mu

xfrm interfaces currently test for !skb->ignore_df when deciding
whether to update the pmtu on the skb's dst. Because of this, no pmtu
exception is created when we do something like:

    ping -s 1438 <dest>

By dropping this check, the pmtu exception will be created and the
next ping attempt will work.

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/xfrm/xfrm_interface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index b615729812e5..ade2eba863b3 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -292,7 +292,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	}
 
 	mtu = dst_mtu(dst);
-	if (!skb->ignore_df && skb->len > mtu) {
+	if (skb->len > mtu) {
 		skb_dst_update_pmtu_no_confirm(skb, mtu);
 
 		if (skb->protocol == htons(ETH_P_IPV6)) {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-04  9:37 [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu Sabrina Dubroca
@ 2020-08-04 12:32 ` Bram Yvakh
  2020-08-07 14:47   ` Sabrina Dubroca
  2020-08-27  7:08 ` Steffen Klassert
  1 sibling, 1 reply; 8+ messages in thread
From: Bram Yvakh @ 2020-08-04 12:32 UTC (permalink / raw)
  To: sd; +Cc: netdev, Steffen Klassert, xmu

On 4/08/2020 11:37, Sabrina Dubroca wrote:
> xfrm interfaces currently test for !skb->ignore_df when deciding
> whether to update the pmtu on the skb's dst. Because of this, no pmtu
> exception is created when we do something like:
>
>     ping -s 1438 <dest>
>
> By dropping this check, the pmtu exception will be created and the
> next ping attempt will work.
>
>
> Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
> Reported-by: Xiumei Mu <xmu@redhat.com>
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
>  net/xfrm/xfrm_interface.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> index b615729812e5..ade2eba863b3 100644
> --- a/net/xfrm/xfrm_interface.c
> +++ b/net/xfrm/xfrm_interface.c
> @@ -292,7 +292,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>  	}
>  
>  	mtu = dst_mtu(dst);
> -	if (!skb->ignore_df && skb->len > mtu) {
> +	if (skb->len > mtu) {
>   

To me dropping the 'ignore_df' check looks incorrect;
When I submitted patches last year for VTI which related to
ptmu/df-bit[1]  I did some testing and also some comparison (also with GRE)
(I also briefly tested with xfrmi but given that the VTI patches were
mostly ignored I did not pursue that further[2])

The conclusion for my testing with GRE: (only the last item is relevant
but rest included for context)
* with 'pmtudisc' set for the GRE device: outgoing GRE packet has the
DF-bit set (this did suffer from some issues when the
to-be-forwarded-packet did not have the DF bit set)
* with 'nopmtudisc' set for the GRE device: outgoing GRE packet copies
DF-bit from the to-be-forwarded packet (this did suffer from some issues
when client did set DF bit..)
* with 'nopmtudisc' and 'ignore-df' bit set: outgoing GRE packet never
has the DF bit set: packet can be fragmented at will

How I understand things (based on my testing last year): when
'ignore-df' is set then the DF-bit should *not* be set. This in turns
means that path-mtu discovery is not possible (which is why it has to be
used in combination with 'nopmtudisc').
In the patch above the 'ignore_df' is removed which looks wrong: if
'ignore_df' is set then the size of the packet should not be checked
since the packet may be fragmented at will.
(I also suppose that means that setting 'ignore_df' is not an option (or
no longer) an option for xfrmi?)

I'm also not sure what the exact case is/was that lead to this patch;
can you shed some light on it?
(What I would expect: if 'ignore_df' is set then pmtu discovery does not
happen.)

[1]:
https://lore.kernel.org/netdev/1552865877-13401-1-git-send-email-bram-yvahk@mail.wizbit.be/
and
https://lore.kernel.org/netdev/1552865877-13401-2-git-send-email-bram-yvahk@mail.wizbit.be/
[2]: https://lore.kernel.org/netdev/5C8EDF7F.2010100@mail.wizbit.be/

>   
>  		skb_dst_update_pmtu_no_confirm(skb, mtu);
>  
>  		if (skb->protocol == htons(ETH_P_IPV6)) {
>   


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-04 12:32 ` Bram Yvakh
@ 2020-08-07 14:47   ` Sabrina Dubroca
  2020-08-07 15:41     ` Bram Yvakh
  0 siblings, 1 reply; 8+ messages in thread
From: Sabrina Dubroca @ 2020-08-07 14:47 UTC (permalink / raw)
  To: Bram Yvakh; +Cc: netdev, Steffen Klassert, xmu

2020-08-04, 14:32:56 +0200, Bram Yvakh wrote:
> On 4/08/2020 11:37, Sabrina Dubroca wrote:
> > xfrm interfaces currently test for !skb->ignore_df when deciding
> > whether to update the pmtu on the skb's dst. Because of this, no pmtu
> > exception is created when we do something like:
> >
> >     ping -s 1438 <dest>
> >
> > By dropping this check, the pmtu exception will be created and the
> > next ping attempt will work.
> >
> >
> > Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
> > Reported-by: Xiumei Mu <xmu@redhat.com>
> > Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> > ---
> >  net/xfrm/xfrm_interface.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> > index b615729812e5..ade2eba863b3 100644
> > --- a/net/xfrm/xfrm_interface.c
> > +++ b/net/xfrm/xfrm_interface.c
> > @@ -292,7 +292,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
> >  	}
> >  
> >  	mtu = dst_mtu(dst);
> > -	if (!skb->ignore_df && skb->len > mtu) {
> > +	if (skb->len > mtu) {
> >   
> 
> To me dropping the 'ignore_df' check looks incorrect;
> When I submitted patches last year for VTI which related to
> ptmu/df-bit[1]  I did some testing and also some comparison (also with GRE)
> (I also briefly tested with xfrmi but given that the VTI patches were
> mostly ignored I did not pursue that further[2])
> 
> The conclusion for my testing with GRE: (only the last item is relevant
> but rest included for context)
> * with 'pmtudisc' set for the GRE device: outgoing GRE packet has the
> DF-bit set (this did suffer from some issues when the
> to-be-forwarded-packet did not have the DF bit set)
> * with 'nopmtudisc' set for the GRE device: outgoing GRE packet copies
> DF-bit from the to-be-forwarded packet (this did suffer from some issues
> when client did set DF bit..)
> * with 'nopmtudisc' and 'ignore-df' bit set: outgoing GRE packet never
> has the DF bit set: packet can be fragmented at will

I'd like to focus on xfrmi for now. If there's a bug in VTI, or in
GRE, we can also fix that. But right now, we have a very simple case
that doesn't work with xfrmi.

> I'm also not sure what the exact case is/was that lead to this patch;
> can you shed some light on it?

Yeah, it's the most simple xfrmi setup possible (directly connected by
a veth), and just run ping on it. ping sets DF, we want an exception
to be created, but this test prevents it.
The packet ends up being dropped in xfrm_output_one because of the mtu
check in xfrm4_tunnel_check_size.

-------- 8< --------

KEY_aead=0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f

ip netns add ha
ip netns add hb

ip link add veth0 netns ha type veth peer name veth0 netns hb
ip -net ha link set veth0 up
ip -net hb link set veth0 up
ip -net ha addr add 192.168.1.1/24 dev veth0
ip -net hb addr add 192.168.1.2/24 dev veth0

ip -net ha xfrm state add src 192.168.1.1 dst 192.168.1.2 spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' $KEY_aead 128 mode tunnel sel src 192.168.2.1 dst 192.168.2.2  if_id 123
ip -net ha xfrm state add src 192.168.1.2 dst 192.168.1.1 spi 0x1001 proto esp aead 'rfc4106(gcm(aes))' $KEY_aead 128 mode tunnel sel src 192.168.2.2 dst 192.168.2.1  if_id 123
ip -net ha xfrm policy add dir out src 192.168.2.1 dst 192.168.2.2  tmpl src 192.168.1.1 dst 192.168.1.2 proto esp mode tunnel if_id 123
ip -net ha xfrm policy add dir in  src 192.168.2.2 dst 192.168.2.1  tmpl src 192.168.1.2 dst 192.168.1.1 proto esp mode tunnel if_id 123
ip -net hb xfrm state add src 192.168.1.2 dst 192.168.1.1 spi 0x1001 proto esp aead 'rfc4106(gcm(aes))' $KEY_aead 128 mode tunnel sel src 192.168.2.2 dst 192.168.2.1  if_id 123
ip -net hb xfrm state add src 192.168.1.1 dst 192.168.1.2 spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' $KEY_aead 128 mode tunnel sel src 192.168.2.1 dst 192.168.2.2  if_id 123
ip -net hb xfrm policy add dir out src 192.168.2.2 dst 192.168.2.1  tmpl src 192.168.1.2 dst 192.168.1.1 proto esp mode tunnel if_id 123
ip -net hb xfrm policy add dir in  src 192.168.2.1 dst 192.168.2.2  tmpl src 192.168.1.1 dst 192.168.1.2 proto esp mode tunnel if_id 123

sleep 1
ip netns exec ha ping -c 3 192.168.1.2

ip -net ha link add xfrm0 type xfrm dev veth0 if_id 123
ip -net ha link set xfrm0 up
ip -net hb link add xfrm0 type xfrm dev veth0 if_id 123
ip -net hb link set xfrm0 up
ip -net ha addr add 192.168.2.1/24 dev xfrm0
ip -net hb addr add 192.168.2.2/24 dev xfrm0

ip netns exec ha ping -s 1438 192.168.2.2

-- 
Sabrina


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-07 14:47   ` Sabrina Dubroca
@ 2020-08-07 15:41     ` Bram Yvakh
  2020-08-10 12:20       ` Sabrina Dubroca
  0 siblings, 1 reply; 8+ messages in thread
From: Bram Yvakh @ 2020-08-07 15:41 UTC (permalink / raw)
  To: sd; +Cc: netdev, Steffen Klassert, xmu


On 7/08/2020 16:47, Sabrina Dubroca wrote:
> 2020-08-04, 14:32:56 +0200, Bram Yvakh wrote:
>   
>> On 4/08/2020 11:37, Sabrina Dubroca wrote:
>>     
>>> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
>>> index b615729812e5..ade2eba863b3 100644
>>> --- a/net/xfrm/xfrm_interface.c
>>> +++ b/net/xfrm/xfrm_interface.c
>>> @@ -292,7 +292,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>>>  	}
>>>  
>>>  	mtu = dst_mtu(dst);
>>> -	if (!skb->ignore_df && skb->len > mtu) {
>>> +	if (skb->len > mtu) {
>>>       
[snip]

>
> Yeah, it's the most simple xfrmi setup possible (directly connected by
> a veth),
Thanks, that gives me something to experiment with;
Could you you share what kernel your testing with? (i.e. what
tree/branch/sha)
> and just run ping on it. ping sets DF, we want an exception
> to be created, but this test prevents it.
>   
As I said dropping the test currently doesn't make sense to me.
I would expect that the 'ignore_df' flag is not set on the device, and
if it's set then I would expect things to work.


> The packet ends up being dropped in xfrm_output_one because of the mtu
> check in xfrm4_tunnel_check_size.
>   
That's the bit that does not (yet) make senes to me..
Looking at net-next/master (bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8)

||

	static int xfrm4_tunnel_check_size(struct sk_buff *skb)
	{
		int mtu, ret = 0;
	
		if (IPCB(skb)->flags & IPSKB_XFRM_TUNNEL_SIZE)
			goto out;
	
		if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) || skb->ignore_df)
			goto out;
	
		mtu = dst_mtu(skb_dst(skb));
		if ((!skb_is_gso(skb) && skb->len > mtu) ||
		    (skb_is_gso(skb) &&
		     !skb_gso_validate_network_len(skb, ip_skb_dst_mtu(skb->sk, skb)))) {
			skb->protocol = htons(ETH_P_IP);
	
			if (skb->sk)
				xfrm_local_error(skb, mtu);
			else
				icmp_send(skb, ICMP_DEST_UNREACH,
					  ICMP_FRAG_NEEDED, htonl(mtu));
			ret = -EMSGSIZE;
		}
	out:
		return ret;
	}

*If* skb->ignore_df is set then it *skips* the mtu check.

In other words: 'xfrm4_tunnel_check_size' only cares about the mtu if ignore_df isn't set.
The original code in 'xfrmi_xmit2': only checks the mtu if ignore_df isn't set. (-> looks consistent)

With your patch: 'xfrmi_xmit2' now always checks the mtu whereas 'xfrm4_tunnel_check_size' only checks it when ignore_df isn't set.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-07 15:41     ` Bram Yvakh
@ 2020-08-10 12:20       ` Sabrina Dubroca
  2020-08-10 15:52         ` Bram Yvakh
  2020-08-26  8:40         ` Steffen Klassert
  0 siblings, 2 replies; 8+ messages in thread
From: Sabrina Dubroca @ 2020-08-10 12:20 UTC (permalink / raw)
  To: Bram Yvakh; +Cc: netdev, Steffen Klassert, xmu

2020-08-07, 17:41:09 +0200, Bram Yvakh wrote:
> 
> On 7/08/2020 16:47, Sabrina Dubroca wrote:
> > 2020-08-04, 14:32:56 +0200, Bram Yvakh wrote:
> >   
> >> On 4/08/2020 11:37, Sabrina Dubroca wrote:
> >>     
> >>> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> >>> index b615729812e5..ade2eba863b3 100644
> >>> --- a/net/xfrm/xfrm_interface.c
> >>> +++ b/net/xfrm/xfrm_interface.c
> >>> @@ -292,7 +292,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
> >>>  	}
> >>>  
> >>>  	mtu = dst_mtu(dst);
> >>> -	if (!skb->ignore_df && skb->len > mtu) {
> >>> +	if (skb->len > mtu) {
> >>>       
> [snip]
> 
> >
> > Yeah, it's the most simple xfrmi setup possible (directly connected by
> > a veth),
> Thanks, that gives me something to experiment with;
> Could you you share what kernel your testing with? (i.e. what
> tree/branch/sha)

Always the latest upstream relevant to the area I'm working on. In
this case, Steffen's ipsec/master.

> > and just run ping on it. ping sets DF, we want an exception
> > to be created, but this test prevents it.
> >   
> As I said dropping the test currently doesn't make sense to me.
> I would expect that the 'ignore_df' flag is not set on the device, and
> if it's set then I would expect things to work.

ignore_df is a property of the packet (skb), not of the device. Only
gre tunnels have an ignore_df property, not vti or xfrmi.

> > The packet ends up being dropped in xfrm_output_one because of the mtu
> > check in xfrm4_tunnel_check_size.
> >   
> That's the bit that does not (yet) make senes to me..
> Looking at net-next/master (bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8)
> 
> ||
> 
> 	static int xfrm4_tunnel_check_size(struct sk_buff *skb)
> 	{
> 		int mtu, ret = 0;
> 	
> 		if (IPCB(skb)->flags & IPSKB_XFRM_TUNNEL_SIZE)
> 			goto out;
> 	
> 		if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) || skb->ignore_df)
> 			goto out;
> 	
> 		mtu = dst_mtu(skb_dst(skb));
> 		if ((!skb_is_gso(skb) && skb->len > mtu) ||
> 		    (skb_is_gso(skb) &&
> 		     !skb_gso_validate_network_len(skb, ip_skb_dst_mtu(skb->sk, skb)))) {
> 			skb->protocol = htons(ETH_P_IP);
> 	
> 			if (skb->sk)
> 				xfrm_local_error(skb, mtu);
> 			else
> 				icmp_send(skb, ICMP_DEST_UNREACH,
> 					  ICMP_FRAG_NEEDED, htonl(mtu));
> 			ret = -EMSGSIZE;
> 		}
> 	out:
> 		return ret;
> 	}
> 
> *If* skb->ignore_df is set then it *skips* the mtu check.

If the packet doesn't have the DF bit set (so the stack can fragment
the packet at will), or if the stack decided that it can ignore it and
fragment anyway, there's no need to check the mtu, because we'll
fragment the packet when we need. Otherwise, we're not allowed to
fragment, so we have to check the packet's size against the mtu.

> In other words: 'xfrm4_tunnel_check_size' only cares about the mtu if ignore_df isn't set.
> The original code in 'xfrmi_xmit2': only checks the mtu if ignore_df isn't set. (-> looks consistent)

Except that we reset skb->ignore_df in between (just after the mtu
handling in xfrmi_xmit2, via xfrmi_scrub_packet).

Why should we care about whether we can fragment the packet that's
being transmitted over a tunnel? We're not fragmenting it, we're going
to encapsulate it inside another IP header, and then we'll have to
fragment that outer IP packet.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-10 12:20       ` Sabrina Dubroca
@ 2020-08-10 15:52         ` Bram Yvakh
  2020-08-26  8:40         ` Steffen Klassert
  1 sibling, 0 replies; 8+ messages in thread
From: Bram Yvakh @ 2020-08-10 15:52 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: netdev, Steffen Klassert, xmu


>>> and just run ping on it. ping sets DF, we want an exception
>>> to be created, but this test prevents it.
>>>   
>>>       
>> As I said dropping the test currently doesn't make sense to me.
>> I would expect that the 'ignore_df' flag is not set on the device, and
>> if it's set then I would expect things to work.
>>     
>
> ignore_df is a property of the packet (skb), not of the device. Only
> gre tunnels have an ignore_df property, not vti or xfrmi.
>   
Correct, I noticed my 'typo' after mail was submitted.

>>> The packet ends up being dropped in xfrm_output_one because of the mtu
>>> check in xfrm4_tunnel_check_size.
>>>   
>>>       
>> That's the bit that does not (yet) make senes to me..
>> Looking at net-next/master (bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8)
>>
>> ||
>>
>> 	static int xfrm4_tunnel_check_size(struct sk_buff *skb)
>> 	{
>> 		int mtu, ret = 0;
>> 	
>> 		if (IPCB(skb)->flags & IPSKB_XFRM_TUNNEL_SIZE)
>> 			goto out;
>> 	
>> 		if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) || skb->ignore_df)
>> 			goto out;
>> 	
>> 		mtu = dst_mtu(skb_dst(skb));
>> 		if ((!skb_is_gso(skb) && skb->len > mtu) ||
>> 		    (skb_is_gso(skb) &&
>> 		     !skb_gso_validate_network_len(skb, ip_skb_dst_mtu(skb->sk, skb)))) {
>> 			skb->protocol = htons(ETH_P_IP);
>> 	
>> 			if (skb->sk)
>> 				xfrm_local_error(skb, mtu);
>> 			else
>> 				icmp_send(skb, ICMP_DEST_UNREACH,
>> 					  ICMP_FRAG_NEEDED, htonl(mtu));
>> 			ret = -EMSGSIZE;
>> 		}
>> 	out:
>> 		return ret;
>> 	}
>>
>> *If* skb->ignore_df is set then it *skips* the mtu check.
>>     
>
> If the packet doesn't have the DF bit set (so the stack can fragment
> the packet at will), or if the stack decided that it can ignore it and
> fragment anyway, there's no need to check the mtu, because we'll
> fragment the packet when we need. Otherwise, we're not allowed to
> fragment, so we have to check the packet's size against the mtu.
>   
Agreed.
>   
>> In other words: 'xfrm4_tunnel_check_size' only cares about the mtu if ignore_df isn't set.
>> The original code in 'xfrmi_xmit2': only checks the mtu if ignore_df isn't set. (-> looks consistent)
>>     
>
> Except that we reset skb->ignore_df in between (just after the mtu
> handling in xfrmi_xmit2, via xfrmi_scrub_packet).
>   
Thanks, that's a bit that was not clear to me;
> Why should we care about whether we can fragment the packet that's
> being transmitted over a tunnel? We're not fragmenting it, we're going
> to encapsulate it inside another IP header, and then we'll have to
> fragment that outer IP packet.
>   
Agreed, but then it makes a bit less sense to be.
If we don't care if we can fragment the packet then why are we checking
the packet size?

Also, assume:
- ignore_df is set on the skb
- packet size is > mtu

Then with your patch applied the code will now send
ICMP_FRAG_NEEDED/ICMPV6_PKT_TOOBIG icmp which to me doesn't make sense
since the stack decided that the packet can be fragmented.
So in what case would in then be appropriate to send the
ICMP_FPRAG_NEEDED to the client? (which is very likely not expecting a
ICMP_FRAG_NEEDED icmp)
But thinking about it some more: I would also expect to see the
'(ip_hdr(skb)->frag_off & htons(IP_DF))' check... (because again: if the
packet says it's ok to fragment then it will not expect/handle a
ICMP_FRAG_NEEDED icmp)

(And a side-note: as to why would we care about fragmentation when we're
going to encapsulate it? *I* wouldn't care but last time I tried to
raise an issue/patch with failing to encapsulate IPv6 in IPv6 (with
xfrm) when pmtu is 1280 (min IPv6 mtu)  I was mostly ignored)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-10 12:20       ` Sabrina Dubroca
  2020-08-10 15:52         ` Bram Yvakh
@ 2020-08-26  8:40         ` Steffen Klassert
  1 sibling, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2020-08-26  8:40 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: Bram Yvakh, netdev, xmu

On Mon, Aug 10, 2020 at 02:20:20PM +0200, Sabrina Dubroca wrote:
> 2020-08-07, 17:41:09 +0200, Bram Yvakh wrote:
> 
> If the packet doesn't have the DF bit set (so the stack can fragment
> the packet at will), or if the stack decided that it can ignore it and
> fragment anyway, there's no need to check the mtu, because we'll
> fragment the packet when we need. Otherwise, we're not allowed to
> fragment, so we have to check the packet's size against the mtu.
> 
> > In other words: 'xfrm4_tunnel_check_size' only cares about the mtu if ignore_df isn't set.
> > The original code in 'xfrmi_xmit2': only checks the mtu if ignore_df isn't set. (-> looks consistent)
> 
> Except that we reset skb->ignore_df in between (just after the mtu
> handling in xfrmi_xmit2, via xfrmi_scrub_packet).

I guess the problem appears with a local ping, right?
Does 'ping -M do' work?

Looks like the comment in __ip_make_skb() on ignore_df
is not true for packets that are sent through a virtual
interface that increases the packet size. It says:

/* Unless user demanded real pmtu discovery (IP_PMTUDISC_DO), we allow
 * to fragment the frame generated here. No matter, what transforms
 * how transforms change size of the packet, it will come out.
 */

If we reset ignore_df before we can fragment it, the packet
won't come out.

I tend to apply your patch because it makes xfrmi consistend with
vti, but that might not be the end of the story. We will then signal
PMTU events also to sockets that can't handle it. Unfortunately, we
can't fragment before we send the packets into the interface, as
we don't know their final size. Alternatively, we can keep the
ignore_df on th skb and fragment the encapsulated packet later on.
But this has problems on its own...


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu
  2020-08-04  9:37 [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu Sabrina Dubroca
  2020-08-04 12:32 ` Bram Yvakh
@ 2020-08-27  7:08 ` Steffen Klassert
  1 sibling, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2020-08-27  7:08 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: netdev, Xiumei Mu

On Tue, Aug 04, 2020 at 11:37:29AM +0200, Sabrina Dubroca wrote:
> xfrm interfaces currently test for !skb->ignore_df when deciding
> whether to update the pmtu on the skb's dst. Because of this, no pmtu
> exception is created when we do something like:
> 
>     ping -s 1438 <dest>
> 
> By dropping this check, the pmtu exception will be created and the
> next ping attempt will work.
> 
> Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
> Reported-by: Xiumei Mu <xmu@redhat.com>
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>

Patch applied, thanks Sabrina!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-08-27  7:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-04  9:37 [PATCH ipsec] xfrmi: drop ignore_df check before updating pmtu Sabrina Dubroca
2020-08-04 12:32 ` Bram Yvakh
2020-08-07 14:47   ` Sabrina Dubroca
2020-08-07 15:41     ` Bram Yvakh
2020-08-10 12:20       ` Sabrina Dubroca
2020-08-10 15:52         ` Bram Yvakh
2020-08-26  8:40         ` Steffen Klassert
2020-08-27  7:08 ` Steffen Klassert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).