All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
@ 2019-11-24 13:24 Oliver Herms
  2019-11-25 22:41 ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Herms @ 2019-11-24 13:24 UTC (permalink / raw)
  To: davem, yoshfuji, kuznet; +Cc: netdev

In IPv4 the identification field ensures that fragments of different datagrams
are not mixed by the receiver. Packets with Don't Fragment (DF) flag set are not
to be fragmented in transit and thus don't need an identification.
From RFC 6864 "Updated Specification of the IPv4 ID Field" section 4.1:
"The IPv4 ID field MUST NOT be used for purposes other than fragmentation and
reassembly."
Calculating the identification takes significant CPU time.
This patch will increase IP tunneling performance by ~10% unless DF is not set.
However, DF is set by default which is best practice.
For security reason this patch will set ID to zero when ID is not needed to
prevent leaking random information.

Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com>
---
Changes in v2:
  - Add source of assertion into commit message.
  - Correct check of DF flag.
  - Set iph->id to 0 when DF if not set to avoid leaking information.

 net/ipv4/ip_tunnel_core.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 1452a97914a0..ea12c8cc9e83 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -73,7 +73,11 @@ void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
 	iph->daddr	=	dst;
 	iph->saddr	=	src;
 	iph->ttl	=	ttl;
-	__ip_select_ident(net, iph, skb_shinfo(skb)->gso_segs ?: 1);
+
+	if (likely((iph->frag_off & htons(IP_DF)) == htons(IP_DF)))
+		iph->id = 0;
+	else
+		__ip_select_ident(net, iph, skb_shinfo(skb)->gso_segs ?: 1);
 
 	err = ip_local_out(net, sk, skb);
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-24 13:24 [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set Oliver Herms
@ 2019-11-25 22:41 ` David Miller
  2019-11-26 19:10   ` Oliver Herms
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2019-11-25 22:41 UTC (permalink / raw)
  To: oliver.peter.herms; +Cc: yoshfuji, kuznet, netdev

From: Oliver Herms <oliver.peter.herms@gmail.com>
Date: Sun, 24 Nov 2019 14:24:18 +0100

> From RFC 6864 "Updated Specification of the IPv4 ID Field" section 4.1:

Just reading the abstract of this RFC I cannot take it seriously:

	This document updates the specification of the IPv4 ID field
	in RFCs 791, 1122, and 2003 to more closely reflect current
	practice...

"more closely reflect current practice" ?!?!

That statement is a joke right?

Linux generates the bulk of the traffic on the internet and we've had
the current behavior of the ID field for decades.

Therefore, I don't think even the premise of this document is valid.

These are all red flags to me, and I think we should keep the current
behavior.

I'm not applying your patch, sorry.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-25 22:41 ` David Miller
@ 2019-11-26 19:10   ` Oliver Herms
  2019-11-26 20:51     ` David Miller
  2019-11-26 22:45     ` Eric Dumazet
  0 siblings, 2 replies; 9+ messages in thread
From: Oliver Herms @ 2019-11-26 19:10 UTC (permalink / raw)
  To: David Miller; +Cc: yoshfuji, kuznet, netdev

Hi Dave,

On 25.11.19 23:41, David Miller wrote:
> From: Oliver Herms <oliver.peter.herms@gmail.com>
> Date: Sun, 24 Nov 2019 14:24:18 +0100
> 
>> From RFC 6864 "Updated Specification of the IPv4 ID Field" section 4.1:
> 
> Just reading the abstract of this RFC I cannot take it seriously:
> 
> 	This document updates the specification of the IPv4 ID field
> 	in RFCs 791, 1122, and 2003 to more closely reflect current
> 	practice...
> 
> "more closely reflect current practice" ?!?!
> 
> That statement is a joke right?
> 
> Linux generates the bulk of the traffic on the internet and we've had
> the current behavior of the ID field for decades.
> 
> Therefore, I don't think even the premise of this document is valid.
> 
> These are all red flags to me, and I think we should keep the current
> behavior.
> 
> I'm not applying your patch, sorry.
> 
I totally understand your argument.

What do you think about making this configurable via sysctl and make the current
behavior the default? I would also like to make this configurable for other 
payload types like TCP and UDP. IMHO there the ID is unnecessary, too, when DF is set.

Would you be willing to merge a patch that offers this?

Thanks
Oliver

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-26 19:10   ` Oliver Herms
@ 2019-11-26 20:51     ` David Miller
  2019-11-26 22:45     ` Eric Dumazet
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2019-11-26 20:51 UTC (permalink / raw)
  To: oliver.peter.herms; +Cc: yoshfuji, kuznet, netdev

From: Oliver Herms <oliver.peter.herms@gmail.com>
Date: Tue, 26 Nov 2019 20:10:52 +0100

> Would you be willing to merge a patch that offers this?

No.

There is lots of precendence for incrementing the ID, look
at SLIP compression:

	deltaS = ntohs(ip->id) - ntohs(cs->cs_ip.id);
	if(deltaS != 1){
		cp = encode(cp,deltaS);
		changes |= NEW_I;
	}

That's just one example, it won't compress unless the ID
field is (unconditionally) incrementing.

The RFC being discussed here is poorly constructed and is founded on a
false basis of reality.  It's the usual tone deaf IETF stuff... and
Linux will not participate.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-26 19:10   ` Oliver Herms
  2019-11-26 20:51     ` David Miller
@ 2019-11-26 22:45     ` Eric Dumazet
  2019-11-26 23:32       ` Oliver Herms
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2019-11-26 22:45 UTC (permalink / raw)
  To: Oliver Herms, David Miller; +Cc: yoshfuji, kuznet, netdev



On 11/26/19 11:10 AM, Oliver Herms wrote:
> 
> What do you think about making this configurable via sysctl and make the current
> behavior the default? I would also like to make this configurable for other 
> payload types like TCP and UDP. IMHO there the ID is unnecessary, too, when DF is set.
>

Certainly not.

I advise you to look at GRO layer (at various stages, depending on linux version)

You can not 'optimize [1]' the sender and break receivers ( including old ones )

[1] Look at ip_select_ident_segs() : the per-socket id generator makes
ID generation quite low cost, there is no real issue here.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-26 22:45     ` Eric Dumazet
@ 2019-11-26 23:32       ` Oliver Herms
  2019-11-27  0:23         ` Eric Dumazet
  2019-11-27  0:28         ` Eric Dumazet
  0 siblings, 2 replies; 9+ messages in thread
From: Oliver Herms @ 2019-11-26 23:32 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: yoshfuji, kuznet, netdev

Hi everyone,

On 26.11.19 23:45, Eric Dumazet wrote:
> 
> 
> On 11/26/19 11:10 AM, Oliver Herms wrote:
>>
>> What do you think about making this configurable via sysctl and make the current
>> behavior the default? I would also like to make this configurable for other 
>> payload types like TCP and UDP. IMHO there the ID is unnecessary, too, when DF is set.
>>
> 
> Certainly not.
> 
> I advise you to look at GRO layer (at various stages, depending on linux version)
> 
> You can not 'optimize [1]' the sender and break receivers ( including old ones )
> 
> [1] Look at ip_select_ident_segs() : the per-socket id generator makes
> ID generation quite low cost, there is no real issue here.
> 

ip_select_ident_segs() is not the issue. The issue is with __ip_select_ident
that calls ip_idents_reserve. That consumes significant amount of CPU time here
while not adding any value (for my use case of company internal IPIP tunneling 
in a well defined environment to be fair).

Here is a flame graph: https://tinyurl.com/s9qv9fx
I'm curious for ideas on how to make this more efficient.
Using a simple incrementation here, as with sockets, would solve my problem well enough.

Thoughts?

Thanks
Oliver

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-26 23:32       ` Oliver Herms
@ 2019-11-27  0:23         ` Eric Dumazet
  2019-11-27  0:28         ` Eric Dumazet
  1 sibling, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2019-11-27  0:23 UTC (permalink / raw)
  To: Oliver Herms, Eric Dumazet, David Miller; +Cc: yoshfuji, kuznet, netdev



On 11/26/19 3:32 PM, Oliver Herms wrote:
> Hi everyone,
> 
> On 26.11.19 23:45, Eric Dumazet wrote:
>>
>>
>> On 11/26/19 11:10 AM, Oliver Herms wrote:
>>>
>>> What do you think about making this configurable via sysctl and make the current
>>> behavior the default? I would also like to make this configurable for other 
>>> payload types like TCP and UDP. IMHO there the ID is unnecessary, too, when DF is set.
>>>
>>
>> Certainly not.
>>
>> I advise you to look at GRO layer (at various stages, depending on linux version)
>>
>> You can not 'optimize [1]' the sender and break receivers ( including old ones )
>>
>> [1] Look at ip_select_ident_segs() : the per-socket id generator makes
>> ID generation quite low cost, there is no real issue here.
>>
> 
> ip_select_ident_segs() is not the issue. The issue is with __ip_select_ident
> that calls ip_idents_reserve. That consumes significant amount of CPU time here
> while not adding any value (for my use case of company internal IPIP tunneling 
> in a well defined environment to be fair).
> 
> Here is a flame graph: https://tinyurl.com/s9qv9fx
> I'm curious for ideas on how to make this more efficient.
> Using a simple incrementation here, as with sockets, would solve my problem well enough.
> 
> Thoughts?
> 

Oliver, TCP flows do not call  __ip_select_ident()

Pleas do not touch TCP payload.

Thank you.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-26 23:32       ` Oliver Herms
  2019-11-27  0:23         ` Eric Dumazet
@ 2019-11-27  0:28         ` Eric Dumazet
  2019-11-27 12:19           ` Oliver Herms
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2019-11-27  0:28 UTC (permalink / raw)
  To: Oliver Herms, Eric Dumazet, David Miller; +Cc: yoshfuji, kuznet, netdev



On 11/26/19 3:32 PM, Oliver Herms wrote:
> Using a simple incrementation here, as with sockets, would solve my problem well enough.
>

I have to ask : Are you aware that linux is SMP OS ?

If on a mostly idle host, two packets need a different ID, using a " simple incrementation" 
wont fit the need.

sockets are protected against concurrent increments by their lock.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set
  2019-11-27  0:28         ` Eric Dumazet
@ 2019-11-27 12:19           ` Oliver Herms
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Herms @ 2019-11-27 12:19 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: yoshfuji, kuznet, netdev

On 27.11.19 01:28, Eric Dumazet wrote:
> 
> 
> On 11/26/19 3:32 PM, Oliver Herms wrote:
>> Using a simple incrementation here, as with sockets, would solve my problem well enough.
>>
> 
> I have to ask : Are you aware that linux is SMP OS ?
> 
> If on a mostly idle host, two packets need a different ID, using a " simple incrementation" 
> wont fit the need.
> 
> sockets are protected against concurrent increments by their lock.
> 
I know and I'm not going to mess around with TCP.
I've double checked and found that for non IP tunnel traffic (UDP, TCP, etc.) the cheap function
ip_select_ident_segs() is being used. That is absolutely fine. Nothing to optimize here.

For IP tunnels __ip_select_ident is being called.
And that one is way more expensive than ip_select_ident_segs().

ip_select_ident_segs() increments a counter (yes, I'm aware it is protected by lock).
If somehow __ip_select_ident could be refactored to work in a similar fashion that
would solve my problem.

Thanks
Oliver

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-11-27 12:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-24 13:24 [PATCH v2] net: ip/tnl: Set iph->id only when don't fragment is not set Oliver Herms
2019-11-25 22:41 ` David Miller
2019-11-26 19:10   ` Oliver Herms
2019-11-26 20:51     ` David Miller
2019-11-26 22:45     ` Eric Dumazet
2019-11-26 23:32       ` Oliver Herms
2019-11-27  0:23         ` Eric Dumazet
2019-11-27  0:28         ` Eric Dumazet
2019-11-27 12:19           ` Oliver Herms

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.