* question: update_pmtu doesn't update dst mtu
@ 2014-04-03 11:37 Ilya V. Matveychikov
  2014-04-03 11:58 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya V. Matveychikov @ 2014-04-03 11:37 UTC (permalink / raw)
  To: netdev

Hello,

Is this a place where I can post questions rather than patches? If so, can
anybody explain what problem I am hitting when trying to update the dst PMTU
value?

So, the scheme is the following:

                  skb->dst                   rt(pmtu)
IN_DEV (MTU 1500) -------> TUNNEL (MTU 1440) -------> OUT_DEV (MTU 1500)

I have a simple tunnel_xmit function that handles all the packets that go
through the tunnel. So, I have an skb with a valid skb->dst value (filled in
earlier by ip_route_input_noref).

Next, when encapsulation is done, I need to get an output route for the packet
(rt in the scheme). At this point I know that the output route may have a PMTU
value that differs from OUT_DEV->mtu. So, I'm trying to update the PMTU of the
input route in skb->dst with the update_pmtu function. Everything seems to be
OK, but when I read the dst mtu value (using dst_mtu(skb_dst(skb))) I always
get 1440 (the TUNNEL's MTU):

tunnel_xmit:
    ...
    pmtu = dst_mtu(&rt->dst) - OVERHEAD; // pmtu = 1300, for example
    skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, pmtu);
    // dst_mtu(skb_dst(skb)) still returns 1440

Looking through the code, I see that rt_pmtu is always 0 for the skb->dst
entry, and ipv4_mtu, which is called via dst->ops->mtu(), uses dev->mtu :(

So, the question is: what am I missing when trying to dynamically tune the path
MTU of the tunnel's input route?

Thanks :)

-- 
Ilya V. Matveychikov


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-03 11:37 question: update_pmtu doesn't update dst mtu Ilya V. Matveychikov
@ 2014-04-03 11:58 ` Hannes Frederic Sowa
  2014-04-03 12:07   ` Ilya V. Matveychikov
  0 siblings, 1 reply; 11+ messages in thread
From: Hannes Frederic Sowa @ 2014-04-03 11:58 UTC (permalink / raw)
  To: Ilya V. Matveychikov; +Cc: netdev

On Thu, Apr 03, 2014 at 03:37:33PM +0400, Ilya V. Matveychikov wrote:
> Hello,
> 
> Is this a place where I can post questions rather than patches? If so, can
> anybody explain what problem I am hitting when trying to update the dst PMTU
> value?
> 
> So, the scheme is the following:
> 
>                   skb->dst                   rt(pmtu)
> IN_DEV (MTU 1500) -------> TUNNEL (MTU 1440) -------> OUT_DEV (MTU 1500)
> 
> I have a simple tunnel_xmit function that handles all the packets that go
> through the tunnel. So, I have an skb with a valid skb->dst value (filled in
> earlier by ip_route_input_noref).
> 
> Next, when encapsulation is done, I need to get an output route for the packet
> (rt in the scheme). At this point I know that the output route may have a PMTU
> value that differs from OUT_DEV->mtu. So, I'm trying to update the PMTU of the
> input route in skb->dst with the update_pmtu function. Everything seems to be
> OK, but when I read the dst mtu value (using dst_mtu(skb_dst(skb))) I always
> get 1440 (the TUNNEL's MTU):
> 
> tunnel_xmit:
>     ...
>     pmtu = dst_mtu(&rt->dst) - OVERHEAD; // pmtu = 1300, for example
>     skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, pmtu);
>     // dst_mtu(skb_dst(skb)) still returns 1440
> 
> Looking through the code, I see that rt_pmtu is always 0 for the skb->dst
> entry, and ipv4_mtu, which is called via dst->ops->mtu(), uses dev->mtu :(

At this point you have to drop skb_dst and have to relookup the route. During
that a new dst will be created which gets the mtu value from the next hop
exception, which got created by update_pmtu.

Normally routes are checked with dst_check if they are still valid
and a relookup should happen. In your example just do the relookup
unconditionally.
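
The behaviour described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: every name in it
(model_fib, model_dst, model_update_pmtu, model_lookup) is hypothetical and
not a kernel API, and the exception table is reduced to a flat array keyed by
destination address. It shows the one point that matters here: recording a
PMTU changes what the next lookup returns, not the dst that is already
attached to the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_EXC 8

struct model_exception {        /* stands in for a next hop exception */
    uint32_t daddr;
    unsigned int pmtu;
};

struct model_fib {              /* stands in for the routing table */
    unsigned int dev_mtu;
    struct model_exception exc[MODEL_MAX_EXC];
    size_t n_exc;
};

struct model_dst {              /* stands in for the dst attached to an skb */
    uint32_t daddr;
    unsigned int pmtu;          /* 0: nothing learned, fall back to dev_mtu */
    unsigned int dev_mtu;
};

static unsigned int model_dst_mtu(const struct model_dst *dst)
{
    return dst->pmtu ? dst->pmtu : dst->dev_mtu;
}

/* Record a learned PMTU for dst's destination. Only the exception table
 * changes; the dst that was passed in is left untouched (hence const). */
static void model_update_pmtu(struct model_fib *fib,
                              const struct model_dst *dst, unsigned int mtu)
{
    size_t i;

    assert(mtu > 0);
    for (i = 0; i < fib->n_exc; i++) {
        if (fib->exc[i].daddr == dst->daddr) {
            fib->exc[i].pmtu = mtu;
            return;
        }
    }
    if (fib->n_exc < MODEL_MAX_EXC) {
        fib->exc[fib->n_exc].daddr = dst->daddr;
        fib->exc[fib->n_exc].pmtu = mtu;
        fib->n_exc++;
    }
}

/* Build a fresh dst; this is where a recorded exception becomes visible. */
static struct model_dst model_lookup(const struct model_fib *fib,
                                     uint32_t daddr)
{
    struct model_dst dst = { daddr, 0, fib->dev_mtu };
    size_t i;

    for (i = 0; i < fib->n_exc; i++)
        if (fib->exc[i].daddr == daddr)
            dst.pmtu = fib->exc[i].pmtu;
    return dst;
}
```

In this model, a dst obtained before model_update_pmtu(..., 1300) keeps
reporting the device MTU, while a second model_lookup() for the same address
reports 1300.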

Greetings,

  Hannes


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-03 11:58 ` Hannes Frederic Sowa
@ 2014-04-03 12:07   ` Ilya V. Matveychikov
  2014-04-03 12:14     ` Hannes Frederic Sowa
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya V. Matveychikov @ 2014-04-03 12:07 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

On 03.04.2014 15:58, Hannes Frederic Sowa wrote:
> On Thu, Apr 03, 2014 at 03:37:33PM +0400, Ilya V. Matveychikov wrote:
>> Looking through the code, I see that rt_pmtu is always 0 for the skb->dst
>> entry, and ipv4_mtu, which is called via dst->ops->mtu(), uses dev->mtu :(
> 
> At this point you have to drop skb_dst and have to relookup the route. During
> that a new dst will be created which gets the mtu value from the next hop
> exception, which got created by update_pmtu.
> 
> Normally routes are checked with dst_check if they are still valid
> and a relookup should happen. In your example just do the relookup
> unconditionally.
> 

Does that mean the next packet will get an updated route without any problems?
I mean, if transmitting the first packet updates the route PMTU by creating (or
updating) the exception, will the following packets get the updated route? Am
I right?


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-03 12:07   ` Ilya V. Matveychikov
@ 2014-04-03 12:14     ` Hannes Frederic Sowa
  2014-04-03 12:27       ` Ilya V. Matveychikov
  0 siblings, 1 reply; 11+ messages in thread
From: Hannes Frederic Sowa @ 2014-04-03 12:14 UTC (permalink / raw)
  To: Ilya V. Matveychikov; +Cc: netdev

On Thu, Apr 03, 2014 at 04:07:21PM +0400, Ilya V. Matveychikov wrote:
> On 03.04.2014 15:58, Hannes Frederic Sowa wrote:
> > On Thu, Apr 03, 2014 at 03:37:33PM +0400, Ilya V. Matveychikov wrote:
> >> Looking through the code, I see that rt_pmtu is always 0 for the skb->dst
> >> entry, and ipv4_mtu, which is called via dst->ops->mtu(), uses dev->mtu :(
> > 
> > At this point you have to drop skb_dst and have to relookup the route. During
> > that a new dst will be created which gets the mtu value from the next hop
> > exception, which got created by update_pmtu.
> > 
> > Normally routes are checked with dst_check if they are still valid
> > and a relookup should happen. In your example just do the relookup
> > unconditionally.
> > 
> 
> Does that mean the next packet will get an updated route without any problems?
> I mean, if transmitting the first packet updates the route PMTU by creating (or
> updating) the exception, will the following packets get the updated route? Am
> I right?

If the next packet causes a lookup in the routing table, yes. Or, if the
code path caches the routing lookup in some structure, checks the route with
dst_check, and conditionally does a relookup (which kernel implementations
should already do), then you should also get the updated mtu value.
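
The "cache, check, conditionally relookup" pattern mentioned above can be
sketched as follows. This is a userspace model, not kernel code: all names
are hypothetical, and validity is reduced to comparing a generation counter,
which is an assumption made for illustration. The real dst_check() delegates
to protocol-specific criteria that this sketch does not try to reproduce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_table {
    unsigned int genid;   /* bumped whenever routing state changes */
    unsigned int mtu;     /* MTU a fresh lookup would currently return */
};

struct model_route {
    unsigned int genid;   /* table generation this route was built from */
    unsigned int mtu;
};

struct model_cache {      /* e.g. a per-socket or per-tunnel cached route */
    bool have_route;
    struct model_route route;
};

static struct model_route model_lookup(const struct model_table *t)
{
    struct model_route r = { t->genid, t->mtu };
    return r;
}

/* A cached route is considered valid only if the table has not changed
 * since the route was built. */
static bool model_dst_check(const struct model_table *t,
                            const struct model_route *r)
{
    return r->genid == t->genid;
}

/* Learning a new PMTU changes the table and invalidates cached routes. */
static void model_learn_pmtu(struct model_table *t, unsigned int mtu)
{
    assert(mtu > 0);
    t->mtu = mtu;
    t->genid++;
}

/* Use the cached route if it is still valid, otherwise relookup first. */
static unsigned int model_cached_mtu(struct model_cache *c,
                                     const struct model_table *t)
{
    if (!c->have_route || !model_dst_check(t, &c->route)) {
        c->route = model_lookup(t);
        c->have_route = true;
    }
    return c->route.mtu;
}
```

A caller that always goes through model_cached_mtu() never needs to know
when the PMTU changed; the stale route is replaced the next time it is used.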

Bye,

  Hannes


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-03 12:14     ` Hannes Frederic Sowa
@ 2014-04-03 12:27       ` Ilya V. Matveychikov
  2014-04-08  9:03         ` Ilya V. Matveychikov
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya V. Matveychikov @ 2014-04-03 12:27 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

On 03.04.2014 16:14, Hannes Frederic Sowa wrote:
> On Thu, Apr 03, 2014 at 04:07:21PM +0400, Ilya V. Matveychikov wrote:
>> On 03.04.2014 15:58, Hannes Frederic Sowa wrote:
>>> On Thu, Apr 03, 2014 at 03:37:33PM +0400, Ilya V. Matveychikov wrote:
>>>> Looking through the code, I see that rt_pmtu is always 0 for the skb->dst
>>>> entry, and ipv4_mtu, which is called via dst->ops->mtu(), uses dev->mtu :(
>>>
>>> At this point you have to drop skb_dst and have to relookup the route. During
>>> that a new dst will be created which gets the mtu value from the next hop
>>> exception, which got created by update_pmtu.
>>>
>>> Normally routes are checked with dst_check if they are still valid
>>> and a relookup should happen. In your example just do the relookup
>>> unconditionally.
>>>
>>
>> Does that mean the next packet will get an updated route without any problems?
>> I mean, if transmitting the first packet updates the route PMTU by creating (or
>> updating) the exception, will the following packets get the updated route? Am
>> I right?
> 
> If the next packet causes a lookup in the routing table, yes. Or, if the
> code path caches the routing lookup in some structure, checks the route with
> dst_check, and conditionally does a relookup (which kernel implementations
> should already do), then you should also get the updated mtu value.
> 

OK, seems that I understand the logic. Thanks a lot.


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-03 12:27       ` Ilya V. Matveychikov
@ 2014-04-08  9:03         ` Ilya V. Matveychikov
  2014-04-08 14:57           ` Hannes Frederic Sowa
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya V. Matveychikov @ 2014-04-08  9:03 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

On 03.04.2014 16:27, Ilya V. Matveychikov wrote:
> On 03.04.2014 16:14, Hannes Frederic Sowa wrote:
>> On Thu, Apr 03, 2014 at 04:07:21PM +0400, Ilya V. Matveychikov wrote:
>>> On 03.04.2014 15:58, Hannes Frederic Sowa wrote:
>>>> On Thu, Apr 03, 2014 at 03:37:33PM +0400, Ilya V. Matveychikov wrote:
>>>>> Looking through the code, I see that rt_pmtu is always 0 for the skb->dst
>>>>> entry, and ipv4_mtu, which is called via dst->ops->mtu(), uses dev->mtu :(
>>>>
>>>> At this point you have to drop skb_dst and have to relookup the route. During
>>>> that a new dst will be created which gets the mtu value from the next hop
>>>> exception, which got created by update_pmtu.
>>>>
>>>> Normally routes are checked with dst_check if they are still valid
>>>> and a relookup should happen. In your example just do the relookup
>>>> unconditionally.
>>>>
>>>
>>> Does that mean the next packet will get an updated route without any problems?
>>> I mean, if transmitting the first packet updates the route PMTU by creating (or
>>> updating) the exception, will the following packets get the updated route? Am
>>> I right?
>>
>> If the next packet causes a lookup in the routing table, yes. Or, if the
>> code path caches the routing lookup in some structure, checks the route with
>> dst_check, and conditionally does a relookup (which kernel implementations
>> should already do), then you should also get the updated mtu value.
>>
> 
> OK, seems that I understand the logic. Thanks a lot.

Just another related question that is giving me trouble. Imagine that there is
an SKB waiting to be transmitted via that tunnel. Let's say that when it
reaches the TUNNEL device its dst reports an MTU of MTU1. Now, someone updates
the PMTU for the route and the mtu decreases from MTU1 to MTU2, so MTU2 < MTU1.

Given that, I suppose that our SKB must be (re)fragmented with ip_fragment, as
its size might be slightly bigger than the path can pass. The problem is that
ip_fragment uses dst_mtu(skb_dst(skb)) to determine the fragment size, but that
still returns MTU1 even after update_pmtu(MTU2) was called, because the call
doesn't actually update the dst MTU.

So the question is: do I need to relookup the route, or can I use the following
hack before ip_fragment:

	// dst_mtu(dst) shows MTU1
	dst->ops->update_pmtu(dst, ..., MTU2)
	...
	skb_rtable(skb)->rt_pmtu = MTU2;
	dst_set_expires(dst, 1);
	...
	// now, ip_fragment knows about real MTU value
	ip_fragment(skb, output...)


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-08  9:03         ` Ilya V. Matveychikov
@ 2014-04-08 14:57           ` Hannes Frederic Sowa
  2014-04-09  8:17             ` Ilya V. Matveychikov
  0 siblings, 1 reply; 11+ messages in thread
From: Hannes Frederic Sowa @ 2014-04-08 14:57 UTC (permalink / raw)
  To: Ilya V. Matveychikov; +Cc: netdev

On Tue, Apr 08, 2014 at 01:03:43PM +0400, Ilya V. Matveychikov wrote:
> Just another related question that is giving me trouble. Imagine that there is
> an SKB waiting to be transmitted via that tunnel. Let's say that when it
> reaches the TUNNEL device its dst reports an MTU of MTU1. Now, someone updates
> the PMTU for the route and the mtu decreases from MTU1 to MTU2, so MTU2 < MTU1.
> 
> Given that, I suppose that our SKB must be (re)fragmented with ip_fragment, as
> its size might be slightly bigger than the path can pass. The problem is that
> ip_fragment uses dst_mtu(skb_dst(skb)) to determine the fragment size, but that
> still returns MTU1 even after update_pmtu(MTU2) was called, because the call
> doesn't actually update the dst MTU.
> 
> So the question is: do I need to relookup the route, or can I use the following
> hack before ip_fragment:
> 
> 	// dst_mtu(dst) shows MTU1
> 	dst->ops->update_pmtu(dst, ..., MTU2)
> 	...
> 	skb_rtable(skb)->rt_pmtu = MTU2;

This might be a cached dst and you would alter the mtu for more nexthops than
you intended.
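
The hazard with writing rt_pmtu directly can be shown with a toy model in
which two packets to different destinations share one cached route object.
The struct and function names below are hypothetical and only loosely mirror
the kernel's rtable and skb; the point is simply that a write through one
packet's pointer is visible through the other's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_rtable {
    unsigned int rt_pmtu;   /* learned PMTU, 0 if none */
    unsigned int dev_mtu;
};

struct model_skb {
    uint32_t daddr;
    struct model_rtable *rt;   /* may point at a route shared with others */
};

static unsigned int model_skb_mtu(const struct model_skb *skb)
{
    assert(skb->rt != NULL);
    return skb->rt->rt_pmtu ? skb->rt->rt_pmtu : skb->rt->dev_mtu;
}

/* The "hack": poke the PMTU straight into whatever route the skb holds. */
static void model_force_pmtu(struct model_skb *skb, unsigned int mtu)
{
    assert(skb->rt != NULL);
    skb->rt->rt_pmtu = mtu;
}
```

If packets A and B both point at the same model_rtable, calling
model_force_pmtu() on A also lowers the MTU that B sees, even though only A's
path was known to be narrower.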

> 	dst_set_expires(dst, 1);

With this you won't get around the time_after_eq check. You would have to
tweak it manually to not retrieve dst_metrics value (this is what you
intended?).
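
To make the expiry point concrete, here is a userspace sketch of the check
being referred to. It is modelled on a reading of how ipv4_mtu and
dst_set_expires behaved in kernels of this era, so treat the exact logic as
an assumption; the names are hypothetical and jiffies is replaced by an
explicit tick argument.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

/* Wrap-safe "a is at or after b", the same idea as time_after_eq(). */
static int model_time_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

struct model_rt {
    unsigned int rt_pmtu;      /* learned PMTU, 0 if none */
    unsigned long expires;     /* tick at which rt_pmtu stops being trusted */
    unsigned int metric_mtu;   /* MTU route metric, 0 if unset */
    unsigned int dev_mtu;
};

/* A learned PMTU is honoured only until it expires; after that the route
 * metric is used, and the device MTU if no metric is set. */
static unsigned int model_mtu(const struct model_rt *rt, unsigned long now)
{
    unsigned int mtu = rt->rt_pmtu;

    assert(rt->dev_mtu > 0);
    if (!mtu || model_time_after_eq(now, rt->expires))
        mtu = rt->metric_mtu;
    return mtu ? mtu : rt->dev_mtu;
}

/* Set an expiry 'timeout' ticks from now, never extending an existing one. */
static void model_set_expires(struct model_rt *rt, unsigned long now,
                              unsigned long timeout)
{
    unsigned long expires = now + timeout;

    if (expires == 0)
        expires = 1;            /* 0 is reserved for "no expiry set" */
    if (rt->expires == 0 || !model_time_after_eq(expires, rt->expires))
        rt->expires = expires;
}
```

With a timeout of one tick, the forced value survives only until the clock
advances: in this model the learned 1300 is returned at tick 1000 and the
device MTU comes back from tick 1001 onwards.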

> 	...
> 	// now, ip_fragment knows about real MTU value
> 	ip_fragment(skb, output...)

Check if you can do something like
skb_dst_drop(skb);
new_dst = ip_route_output*(..., &fl4, ...);
skb_dst_set(_noref)(skb, new_dst);

This should be a very unlikely path, I assume, so should not degrade
performance that much.
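
Once the relookup yields a dst with the smaller MTU, the fragment sizes
follow from that value. The arithmetic can be sketched in plain C; the helper
names are hypothetical, and the only protocol fact relied on is that every
fragment except the last must carry a payload that is a multiple of 8 bytes,
because IPv4 fragment offsets are expressed in 8-byte units.

```c
#include <assert.h>

/* Hypothetical helpers for illustration; not kernel functions. */

/* Largest payload a non-final fragment may carry for a given MTU and IP
 * header length, rounded down to a multiple of 8. */
static unsigned int model_frag_payload_max(unsigned int mtu, unsigned int ihl)
{
    assert(mtu >= ihl + 8);
    return (mtu - ihl) & ~7u;
}

/* Number of fragments needed to send a packet of total_len bytes
 * (header included) over a path with the given MTU. */
static unsigned int model_frag_count(unsigned int total_len, unsigned int ihl,
                                     unsigned int mtu)
{
    unsigned int payload, per_frag;

    assert(total_len >= ihl);
    if (total_len <= mtu)
        return 1;               /* fits as is, no fragmentation */
    payload = total_len - ihl;
    per_frag = model_frag_payload_max(mtu, ihl);
    return (payload + per_frag - 1) / per_frag;
}
```

For example, a 1440-byte packet with a 20-byte header fits unfragmented while
the dst still reports 1440, but needs two fragments once the dst reports
1300, which is why ip_fragment has to see the refreshed value.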

I wonder why you update the mtu in the output path.

Bye,

  Hannes


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-08 14:57           ` Hannes Frederic Sowa
@ 2014-04-09  8:17             ` Ilya V. Matveychikov
  2014-04-09 20:30               ` Hannes Frederic Sowa
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya V. Matveychikov @ 2014-04-09  8:17 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

On 08.04.2014 18:57, Hannes Frederic Sowa wrote:
> On Tue, Apr 08, 2014 at 01:03:43PM +0400, Ilya V. Matveychikov wrote:
>> Just another related question that is giving me trouble. Imagine that there is
>> an SKB waiting to be transmitted via that tunnel. Let's say that when it
>> reaches the TUNNEL device its dst reports an MTU of MTU1. Now, someone updates
>> the PMTU for the route and the mtu decreases from MTU1 to MTU2, so MTU2 < MTU1.
>>
>> Given that, I suppose that our SKB must be (re)fragmented with ip_fragment, as
>> its size might be slightly bigger than the path can pass. The problem is that
>> ip_fragment uses dst_mtu(skb_dst(skb)) to determine the fragment size, but that
>> still returns MTU1 even after update_pmtu(MTU2) was called, because the call
>> doesn't actually update the dst MTU.
>>
>> So the question is: do I need to relookup the route, or can I use the following
>> hack before ip_fragment:
>>
>> 	// dst_mtu(dst) shows MTU1
>> 	dst->ops->update_pmtu(dst, ..., MTU2)
>> 	...
>> 	skb_rtable(skb)->rt_pmtu = MTU2;
> 
> This might be a cached dst and you would alter the mtu for more nexthops than
> you intended.
> 
>> 	dst_set_expires(dst, 1);
> 
> With this you won't get around the time_after_eq check. You would have to
> tweak it manually to not retrieve dst_metrics value (this is what you
> intended?).
> 

Well, I missed it. Also, that was a hack and it's not the best solution...

>> 	...
>> 	// now, ip_fragment knows about real MTU value
>> 	ip_fragment(skb, output...)
> 
> Check if you can do something like
> skb_dst_drop(skb);
> new_dst = ip_route_output*(..., &fl4, ...);
> skb_dst_set(_noref)(skb, new_dst);
> 

Works fine, thanks! By the way, could you briefly explain why routes are
separated into input and output? What are the benefits?

> This should be a very unlikely path, I assume, so should not degrade
> performance that much.

Sure.

> 
> I wonder why you update the mtu in the output path.

Well, the problem is that I don't know how to properly handle packets whose DSTs
have changed during processing. The simple solution is to drop them. But I think
that's not the best approach, as we can re-fragment them if the new MTU value is
smaller than the old one.


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-09  8:17             ` Ilya V. Matveychikov
@ 2014-04-09 20:30               ` Hannes Frederic Sowa
  2014-04-10  8:28                 ` Ilya V. Matveychikov
  0 siblings, 1 reply; 11+ messages in thread
From: Hannes Frederic Sowa @ 2014-04-09 20:30 UTC (permalink / raw)
  To: Ilya V. Matveychikov; +Cc: netdev

On Wed, Apr 09, 2014 at 12:17:33PM +0400, Ilya V. Matveychikov wrote:
> Works fine, thanks! By the way, could you briefly explain why routes are
> separated into input and output? What are the benefits?

The core routing table is not split by input and output; only the dst
construction and the surrounding lookup and policy checks are separated for
input and output.

Output does not have to deal with source address validation, for example, but
it does have to do source address selection.

Quite hard to answer, I guess the design emerged quite naturally. ;)

Bye,

  Hannes


* Re: question: update_pmtu doesn't update dst mtu
  2014-04-09 20:30               ` Hannes Frederic Sowa
@ 2014-04-10  8:28                 ` Ilya V. Matveychikov
  0 siblings, 0 replies; 11+ messages in thread
From: Ilya V. Matveychikov @ 2014-04-10  8:28 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

On 10.04.2014 00:30, Hannes Frederic Sowa wrote:
> On Wed, Apr 09, 2014 at 12:17:33PM +0400, Ilya V. Matveychikov wrote:
>> Works fine, thanks! By the way, could you briefly explain why routes are
>> separated into input and output? What are the benefits?
> 
> The core routing table is not split by input and output; only the dst
> construction and the surrounding lookup and policy checks are separated for
> input and output.
> 
> Output does not have to deal with source address validation, for example, but
> it does have to do source address selection.
> 
> Quite hard to answer, I guess the design emerged quite naturally. ;)

OK, I've got it. Thanks again.


