* Q: bad routing table cache entries
@ 2015-12-29 10:54 Stas Sergeev
  2015-12-29 11:58 ` Sowmini Varadhan
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Stas Sergeev @ 2015-12-29 10:54 UTC (permalink / raw)
  To: netdev

Hello.

I was hitting a strange problem where some internet hosts
suddenly stop responding until I reboot. Pinging these
hosts gives "Destination Host Unreachable". After the
initial confusion, I finally got to
ip route get
and got something quite strange.


Example for GOOD address (the one that I can ping):

ip route get 91.189.89.237
91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
    cache


Example for BAD address (the one that stopped responding):

ip route get 91.189.89.238
91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
    cache <redirected>


Two things differ: the <redirected> mark appears, and the
gateway changed from 192.168.8.1 to 192.168.0.1.
Now, 192.168.0.1 is also a valid gateway, but it is outside
of the network mask for the eth0 interface:

ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:50:43:00:0b:e0
          inet addr:192.168.10.202  Bcast:192.168.11.255  Mask:255.255.252.0


As a result, this route simply doesn't work.
I checked with tcpdump - the icmp packets do not even go
to eth0 - they instead can be captured on lo interface for
some reason.
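
The netmask observation above can be checked mechanically. A minimal sketch, assuming Python's standard ipaddress module; the helper name gateway_is_onlink is hypothetical, and the addresses are copied from the ifconfig output above:

```python
import ipaddress

# Address and mask copied from the ifconfig output in this message.
host_if = ipaddress.ip_interface("192.168.10.202/255.255.252.0")

def gateway_is_onlink(gateway, iface):
    """Return True if gateway lies inside the subnet configured on iface."""
    return ipaddress.ip_address(gateway) in iface.network

print(host_if.network)                             # subnet derived from the mask
print(gateway_is_onlink("192.168.8.1", host_if))   # the working gateway
print(gateway_is_onlink("192.168.0.1", host_if))   # the redirected gateway
```

The mask 255.255.252.0 puts the host in 192.168.8.0/22, so only 192.168.8.1 is on-link by this test.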

So my question is: why does Linux allow invalid redirect
entries? Is it a problem with my setup, a kernel bug,
or a router setup problem? Where should I look to
nail this down?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 10:54 Q: bad routing table cache entries Stas Sergeev
@ 2015-12-29 11:58 ` Sowmini Varadhan
  2015-12-29 12:06   ` Stas Sergeev
  2015-12-29 13:19 ` Stas Sergeev
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Sowmini Varadhan @ 2015-12-29 11:58 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev

On (12/29/15 13:54), Stas Sergeev wrote:
> 
> ip route get 91.189.89.238
> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>     cache <redirected>
         :
> Now, 192.168.0.1 is also a valid gateway, but it is outside
> of the network mask for the eth0 interface:
         :
> So my question is: why does linux allow an invalid redirect
> entries? Is it a problem with my setup, or some kernel bug,
> or some router setup problem? Where should I look into, to
> nail this down?

Seems like the problem is in the router that is sending
the bad redirect. You would have to check into the configuration
and/or implementation of the router- it should not be sending
back a redirect in the above case (different netmasks) even
if the ingress and egress physical interfaces are the same.

--Sowmini

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 11:58 ` Sowmini Varadhan
@ 2015-12-29 12:06   ` Stas Sergeev
  2015-12-29 12:32     ` Sowmini Varadhan
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2015-12-29 12:06 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

29.12.2015 14:58, Sowmini Varadhan wrote:
> On (12/29/15 13:54), Stas Sergeev wrote:
>>
>> ip route get 91.189.89.238
>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>     cache <redirected>
>          :
>> Now, 192.168.0.1 is also a valid gateway, but it is outside
>> of the network mask for the eth0 interface:
>          :
>> So my question is: why does linux allow an invalid redirect
>> entries? Is it a problem with my setup, or some kernel bug,
>> or some router setup problem? Where should I look into, to
>> nail this down?
> 
> Seems like the problem is in the router that is sending
> the bad redirect. You would have to check into the configuration
> and/or implementation of the router- it should not be sending
> back a redirect in the above case (different netmasks) even
> if the ingress and egress physical interfaces are the same.
The router at 192.168.8.1 is just a PC running Ubuntu, without any special
software. I'd be very surprised if it did so. As I understand it,
Linux would accept such an ICMP redirect only from the router, or
could someone else also send them?

But what worries me more is this question:
Should the Linux kernel really silently accept those, breaking
the routing in completely unexpected ways? Isn't it a bug?
The sanity check against the netmask looks trivial, so why is it not there?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 12:06   ` Stas Sergeev
@ 2015-12-29 12:32     ` Sowmini Varadhan
  2015-12-29 12:43       ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Sowmini Varadhan @ 2015-12-29 12:32 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev

On (12/29/15 15:06), Stas Sergeev wrote:
> Router on 192.168.8.1 is just a PC with ubuntu, w/o any special
> software. I'd be very surprised if it does so. As I understand,
> linux would accept such ICMP redirect only from the router, or
> could someone else also send them?

If someone else can spoof redirects on your network, you have
a much bigger network management problem- at that point, how can you
trust anything, e.g., a default rdisc rtradv?

> But what worries me more, is the question:
> Should the linux kernel really silently accept those, breaking
> the routing in a completely unexpected ways? Isn't it a bug?

How is the receiver supposed to know that the redirect was "bad"?

In your example, you claimed that

a "good" redirect was:
     ip route get 91.189.89.237
     91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
         cache

but a "bad" one was:

    ip route get 91.189.89.238
    91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
        cache <redirected>

It's not clear to me what the netmask on eth0 is - is this a /16
(in which case both redirs are "good" as far as the receiver can tell)?
Are the 2 gws also on a /16? or something longer?

> The sanity check against netmask looks trivial, so why it is not there?

According to rfc1812 (pg 82-84)

   Routers MUST NOT generate a Redirect Message unless all the following
   conditions are met:

   o The packet is being forwarded out the same physical interface that
      it was received from,

   o The IP source address in the packet is on the same Logical IP
      (sub)network as the next-hop IP address, and

   o The packet does not contain an IP source route option.

The second condition seems to have been violated by the router. I 
suppose it might not hurt if the receiver can do some sanity checking
on the redirect but this might not eliminate every error, since
it might not be possible to detect netmask mismatch in every case.
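
The three RFC 1812 conditions quoted above can be written as a single predicate. A minimal sketch, not the kernel's implementation; the function name, parameters, and interface labels are hypothetical, and "same Logical IP (sub)network" is approximated as membership in one subnet passed in by the caller:

```python
import ipaddress

def rfc1812_allows_redirect(in_iface, out_iface, src, next_hop,
                            subnet, has_source_route):
    """Return True only if all three quoted conditions hold."""
    net = ipaddress.ip_network(subnet)
    # Condition 1: forwarded out the interface it was received from.
    same_interface = in_iface == out_iface
    # Condition 2: packet source and next hop share the same subnet.
    same_subnet = (ipaddress.ip_address(src) in net
                   and ipaddress.ip_address(next_hop) in net)
    # Condition 3: no IP source route option in the packet.
    return same_interface and same_subnet and not has_source_route

# The reported case: source 192.168.10.202, next hop 192.168.0.1.
print(rfc1812_allows_redirect("eth0", "eth0", "192.168.10.202",
                              "192.168.0.1", "192.168.8.0/22", False))
```

For the reported addresses the second condition fails, which is why the redirect looks like a violation on the router side.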

--Sowmini

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 12:32     ` Sowmini Varadhan
@ 2015-12-29 12:43       ` Stas Sergeev
  0 siblings, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2015-12-29 12:43 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

29.12.2015 15:32, Sowmini Varadhan wrote:
> On (12/29/15 15:06), Stas Sergeev wrote:
>> Router on 192.168.8.1 is just a PC with ubuntu, w/o any special
>> software. I'd be very surprised if it does so. As I understand,
>> linux would accept such ICMP redirect only from the router, or
>> could someone else also send them?
> 
> If someone elase can spoof redirects on your network, you have
> a much bigger network management problem- at that point, how can you
> trust anything, e.g., a default rdisc rtradv?
Well, I have /proc/sys/net/ipv4/conf/all/secure_redirects set to 1,
so it should be a router, I suppose. But this is strange, and I wonder
why it does so only very rarely (but that's something for me to investigate).

>> But what worries me more, is the question:
>> Should the linux kernel really silently accept those, breaking
>> the routing in a completely unexpected ways? Isn't it a bug?
> 
> How is the receiver supposed to know that the redirect was "bad"?
> 
> In your example, you claimed that
> 
> a "good" redirect was:
>      ip route get 91.189.89.237
>      91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>          cache
> 
> but a "bad" one was:
> 
>     ip route get 91.189.89.238
>     91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>         cache <redirected>
> 
> Its not clear to me what the netmask on eth0 is - is this a /16
But I showed the netmask in the very first posting, and here it is:

ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:50:43:00:0b:e0
          inet addr:192.168.10.202  Bcast:192.168.11.255  Mask:255.255.252.0


> (in which case both redirs are "good" as far as the receiver can tell)?
> Are the 2 gws also on a /16? or something longer?
Yes, the problem is exactly that: the mask is longer.
So the route is bad, and the packets are routed to the "lo"
interface instead - I checked that with tcpdump.

>> The sanity check against netmask looks trivial, so why it is not there?
> 
> According to rfc1812 (pg 82-84)
> 
>    Routers MUST NOT generate a Redirect Message unless all the following
>    conditions are met:
> 
>    o The packet is being forwarded out the same physical interface that
>       it was received from,
> 
>    o The IP source address in the packet is on the same Logical IP
>       (sub)network as the next-hop IP address, and
> 
>    o The packet does not contain an IP source route option.
> 
> The second condition seems to have been violated by the router. I 
> suppose it might not hurt if the receiver can do some sanity checking
> on the redirect but this might not eliminate every error, since
> it might not be possible to detect netmask mismatch in every case.
Not sure what case you mean, but at least an error as simple as the
one I am having should be possible to detect.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 10:54 Q: bad routing table cache entries Stas Sergeev
  2015-12-29 11:58 ` Sowmini Varadhan
@ 2015-12-29 13:19 ` Stas Sergeev
  2015-12-29 15:22 ` Sowmini Varadhan
  2016-01-12 15:34 ` Hannes Frederic Sowa
  3 siblings, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2015-12-29 13:19 UTC (permalink / raw)
  To: netdev; +Cc: Sowmini Varadhan

29.12.2015 13:54, Stas Sergeev wrote:
> Hello.
> 
> I was hitting a strange problem when some internet hosts
> suddenly stops responding until I reboot. ping to these
> host gives "Destination Host Unreachable". After the
> initial confusion, I've finally got to
> ip route get
> and got something quite strange.
Another observation:


# ip route get 91.189.90.236
91.189.90.236 via 192.168.0.1 dev eth0  src 192.168.10.202
    cache <redirected>

# cat /proc/net/rt_cache
Iface	Destination	Gateway 	Flags		RefCnt	Use	Metric	Source		MTU	Window	IRTT	TOS	HHRef	HHUptod	SpecDst


So the redirection exists, but rt_cache doesn't show anything.
Am I looking in the wrong place? Where can I get a list of all
the redirections I have?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 10:54 Q: bad routing table cache entries Stas Sergeev
  2015-12-29 11:58 ` Sowmini Varadhan
  2015-12-29 13:19 ` Stas Sergeev
@ 2015-12-29 15:22 ` Sowmini Varadhan
  2015-12-29 15:38   ` Stas Sergeev
                     ` (2 more replies)
  2016-01-12 15:34 ` Hannes Frederic Sowa
  3 siblings, 3 replies; 37+ messages in thread
From: Sowmini Varadhan @ 2015-12-29 15:22 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev


Do you have admin control over the ubuntu router?
If yes, you might want to check the shared_media [#] setting 
on that router for the interfaces with overlapping subnets.
(it is on by default, I would try turning it off).

AFAICT, the code does the right thing per rfc1812 when setting
IPSKB_DOREDIRECT if shared_media is turned off.

--Sowmini

[#] https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 15:22 ` Sowmini Varadhan
@ 2015-12-29 15:38   ` Stas Sergeev
  2015-12-29 17:40     ` Stas Sergeev
  2015-12-30 12:42   ` Stas Sergeev
  2016-01-12 14:40   ` Stas Sergeev
  2 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2015-12-29 15:38 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

29.12.2015 18:22, Sowmini Varadhan wrote:
> Do you have admin control over the ubuntu router?
> If yes, you might want to check the shared_media [#] setting 
> on that router for the interfaces with overlapping subnets.
> (it is on by default, I would try turning it off).
Aha, good catch, thanks!
Done that, then ran
ip route flush cache
on the host, and am waiting for the problem to reappear.
Hope it won't... but to say for sure I'll need a day or two,
as it is slow to appear.


> AFAICT, the code does the right thing per rfc1812 when setting
> IPSKB_DOREDIRECT if shared_media is turned off.
Likely the router's side is doing the right thing, but of
course there are still many questions about the host's side.
Namely, why were the bad entries allowed, and how can I list them?
The problem would not happen if they were rejected based on a
simple netmask check.

Thanks for your help so far! With the shared_media hint you've quite
likely saved me a lot of headaches. :)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 15:38   ` Stas Sergeev
@ 2015-12-29 17:40     ` Stas Sergeev
  0 siblings, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2015-12-29 17:40 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

29.12.2015 18:38, Stas Sergeev wrote:
> Likely the router's side is doing the right thing, but of
Or maybe not?
Here's the ifconfig output of the router:

eth0      Link encap:Ethernet  HWaddr 00:1e:8c:a7:b5:36
          inet addr:192.168.0.220  Bcast:192.168.3.255  Mask:255.255.252.0

eth0:1    Link encap:Ethernet  HWaddr 00:1e:8c:a7:b5:36
          inet addr:192.168.8.1  Bcast:192.168.11.255  Mask:255.255.252.0


# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
10.1.10.0       192.168.10.143  255.255.255.0   UG    0      0        0 eth0
localnet        *               255.255.252.0   U     0      0        0 eth0
192.168.8.0     *               255.255.252.0   U     0      0        0 eth0


Masks look correct.
So why would it send redirects to hosts on the 192.168.8.0 subnet
with a gateway on the 192.168.0.0 subnet?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 15:22 ` Sowmini Varadhan
  2015-12-29 15:38   ` Stas Sergeev
@ 2015-12-30 12:42   ` Stas Sergeev
  2015-12-30 14:17     ` Eric Dumazet
  2016-01-04  1:05     ` Sowmini Varadhan
  2016-01-12 14:40   ` Stas Sergeev
  2 siblings, 2 replies; 37+ messages in thread
From: Stas Sergeev @ 2015-12-30 12:42 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

29.12.2015 18:22, Sowmini Varadhan wrote:
> Do you have admin control over the ubuntu router?
> If yes, you might want to check the shared_media [#] setting 
> on that router for the interfaces with overlapping subnets.
> (it is on by default, I would try turning it off).
That didn't help; the problem reappeared.
Thanks anyway, looks like I am going to disable accept_redirects then.
It seems buggy and obviously no one cares.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-30 12:42   ` Stas Sergeev
@ 2015-12-30 14:17     ` Eric Dumazet
  2015-12-30 17:56       ` David Miller
  2016-01-04  1:05     ` Sowmini Varadhan
  1 sibling, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2015-12-30 14:17 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: Sowmini Varadhan, netdev

On Wed, 2015-12-30 at 15:42 +0300, Stas Sergeev wrote:
> 29.12.2015 18:22, Sowmini Varadhan wrote:
> > Do you have admin control over the ubuntu router?
> > If yes, you might want to check the shared_media [#] setting 
> > on that router for the interfaces with overlapping subnets.
> > (it is on by default, I would try turning it off).
> That didn't help, problem re-appears.
> Thanks anyway, looks like I am going to disable accept_redirects then.
> It seems buggy and obviously no one cares.

Obviously some people take vacations at this period of the year, and do
stay away from netdev traffic.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-30 14:17     ` Eric Dumazet
@ 2015-12-30 17:56       ` David Miller
  0 siblings, 0 replies; 37+ messages in thread
From: David Miller @ 2015-12-30 17:56 UTC (permalink / raw)
  To: eric.dumazet; +Cc: stsp, sowmini.varadhan, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 30 Dec 2015 09:17:42 -0500

> On Wed, 2015-12-30 at 15:42 +0300, Stas Sergeev wrote:
>> 29.12.2015 18:22, Sowmini Varadhan wrote:
>> > Do you have admin control over the ubuntu router?
>> > If yes, you might want to check the shared_media [#] setting 
>> > on that router for the interfaces with overlapping subnets.
>> > (it is on by default, I would try turning it off).
>> That didn't help, problem re-appears.
>> Thanks anyway, looks like I am going to disable accept_redirects then.
>> It seems buggy and obviously no one cares.
> 
> Obviously some people take vacations at this period of the year, and do
> stay away from netdev traffic.

+1

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-30 12:42   ` Stas Sergeev
  2015-12-30 14:17     ` Eric Dumazet
@ 2016-01-04  1:05     ` Sowmini Varadhan
  2016-01-04  1:32       ` Stas Sergeev
  2016-01-04 17:23       ` Stas Sergeev
  1 sibling, 2 replies; 37+ messages in thread
From: Sowmini Varadhan @ 2016-01-04  1:05 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev

On (12/30/15 15:42), Stas Sergeev wrote:
> 29.12.2015 18:22, Sowmini Varadhan wrote:
> > Do you have admin control over the ubuntu router?
> > If yes, you might want to check the shared_media [#] setting 
> > on that router for the interfaces with overlapping subnets.
> > (it is on by default, I would try turning it off).
> That didn't help, problem re-appears.

the code that sets things up for redirect is this:

  if (out_dev == in_dev && err && IN_DEV_TX_REDIRECTS(out_dev) &&
            skb->protocol == htons(ETH_P_IP) &&
            (IN_DEV_SHARED_MEDIA(out_dev) ||
             inet_addr_onlink(out_dev, saddr, FIB_RES_GW(*res))))
                IPCB(skb)->flags |= IPSKB_DOREDIRECT;

If you are still seeing the problematic redirect after disabling
shared_media, then you would need to trace through inet_addr_onlink()
to see why it was not returning false. As I said before, afaict from
reading the code, inet_addr_onlink looks right. So there may be something
unusual with your netmask config on in_dev/out_dev.
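
The shape of that condition can be modelled outside the kernel. A minimal sketch under stated assumptions: addr_onlink is a hypothetical stand-in for inet_addr_onlink() that returns true when both addresses fall inside the subnet of one address configured on the device, err is carried over as an opaque flag, and the router addresses are the eth0 and eth0:1 values posted earlier in the thread:

```python
import ipaddress

def addr_onlink(ifaddrs, a, b):
    """True if a and b both fall inside the subnet of one configured address."""
    a = ipaddress.ip_address(a)
    b = ipaddress.ip_address(b)
    return any(a in ifa.network and b in ifa.network for ifa in ifaddrs)

def do_redirect(same_dev, err, tx_redirects, is_ipv4, shared_media,
                ifaddrs, saddr, gw):
    """Mirror the shape of the quoted condition; every argument is a stand-in."""
    return bool(same_dev and err and tx_redirects and is_ipv4
                and (shared_media or addr_onlink(ifaddrs, saddr, gw)))

# Router addresses as posted earlier in the thread (eth0 and eth0:1).
router_addrs = [ipaddress.ip_interface("192.168.0.220/22"),
                ipaddress.ip_interface("192.168.8.1/22")]

for shared_media in (True, False):
    print(shared_media,
          do_redirect(True, 1, True, True, shared_media,
                      router_addrs, "192.168.10.202", "192.168.0.1"))
```

With these addresses the sketch flags a redirect only while shared_media is set, because no single configured subnet contains both 192.168.10.202 and 192.168.0.1.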

But even if the redirect is suppressed, sounds like the network/netmask
config is sub-optimal, since each packet gets (needlessly?) sent
up/down the router's in_dev/out_dev. That should be avoided, if possible.

--Sowmini

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-04  1:05     ` Sowmini Varadhan
@ 2016-01-04  1:32       ` Stas Sergeev
  2016-01-04 17:23       ` Stas Sergeev
  1 sibling, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2016-01-04  1:32 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

04.01.2016 04:05, Sowmini Varadhan wrote:
> On (12/30/15 15:42), Stas Sergeev wrote:
>> 29.12.2015 18:22, Sowmini Varadhan wrote:
>>> Do you have admin control over the ubuntu router?
>>> If yes, you might want to check the shared_media [#] setting
>>> on that router for the interfaces with overlapping subnets.
>>> (it is on by default, I would try turning it off).
>> That didn't help, problem re-appears.
> the code that sets things up for redirect is this:
>
>    if (out_dev == in_dev && err && IN_DEV_TX_REDIRECTS(out_dev) &&
>              skb->protocol == htons(ETH_P_IP) &&
>              (IN_DEV_SHARED_MEDIA(out_dev) ||
>               inet_addr_onlink(out_dev, saddr, FIB_RES_GW(*res))))
>                  IPCB(skb)->flags |= IPSKB_DOREDIRECT;
>
> If you are still seeing the problematic redirect after disabling
> shared_media, then you would need to trace through inet_addr_onlink()
> to see why it was not returning false. As I said before, afaict from
> reading the code, inet_addr_onlink looks right. So there may be something
> unusual with your netmask config on in_dev/out_dev.
OK, thanks for the hint. It looks like a small function; I'll
trace it with systemtap in a week.
But I am more worried about such redirects being accepted.
Almost certainly a bug.

> But even if the redirect is suppressed, sounds like the network/netmask
> config is sub-optimal, since each packet gets (needlessly?) sent
> up/down the router's in_dev/out_dev. That should be avoided, if possible.
Since I don't have root access to 192.168.0.1, I can't change
the masks. But fixing the 192.168.8.1 router would give enough
relief too.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-04  1:05     ` Sowmini Varadhan
  2016-01-04  1:32       ` Stas Sergeev
@ 2016-01-04 17:23       ` Stas Sergeev
  1 sibling, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2016-01-04 17:23 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev

04.01.2016 04:05, Sowmini Varadhan wrote:
> On (12/30/15 15:42), Stas Sergeev wrote:
>> 29.12.2015 18:22, Sowmini Varadhan wrote:
>>> Do you have admin control over the ubuntu router?
>>> If yes, you might want to check the shared_media [#] setting
>>> on that router for the interfaces with overlapping subnets.
>>> (it is on by default, I would try turning it off).
>> That didn't help, problem re-appears.
> the code that sets things up for redirect is this:
It was also privately suggested to me that I may not have disabled
shared_media fully. Indeed, I've only done that for "all" (net.ipv4.conf.all),
but not for each individual interface. So I'll retry the test.
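
One way to see which entries still have shared_media enabled is to walk the per-interface sysctl directories. A minimal sketch, assuming the usual /proc/sys/net/ipv4/conf layout; the function name is hypothetical. My understanding is that the kernel ORs the "all" value with the per-interface value for this setting, which would explain why clearing "all" alone was not enough, but treat that as an assumption to verify against the kernel source:

```python
from pathlib import Path

def interfaces_with_shared_media(conf_root="/proc/sys/net/ipv4/conf"):
    """Return names of conf entries whose shared_media file reads non-zero."""
    root = Path(conf_root)
    if not root.is_dir():
        return []
    enabled = []
    for entry in sorted(root.iterdir()):
        flag = entry / "shared_media"
        if flag.is_file() and flag.read_text().strip() != "0":
            enabled.append(entry.name)
    return enabled

if __name__ == "__main__":
    # Lists entries such as "all", "default", or "eth0" that are still enabled.
    print(interfaces_with_shared_media())
```

An empty list would indicate that shared_media is off everywhere, including "all" and "default".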

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 15:22 ` Sowmini Varadhan
  2015-12-29 15:38   ` Stas Sergeev
  2015-12-30 12:42   ` Stas Sergeev
@ 2016-01-12 14:40   ` Stas Sergeev
  2016-01-12 14:47     ` Sowmini Varadhan
  2 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 14:40 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: netdev, Alexander Duyck

29.12.2015 18:22, Sowmini Varadhan wrote:
> 
> Do you have admin control over the ubuntu router?
> If yes, you might want to check the shared_media [#] setting 
> on that router for the interfaces with overlapping subnets.
> (it is on by default, I would try turning it off).
Updated testing results:
After I disabled shared_media not only for "all", but
also for _all_ interfaces individually, the problem seems to
have stopped. So thanks for these hints.
Now, as the media is actually really shared (same NIC/cable),
I just wonder what's going on here.

And unfortunately we still don't know why these redirects are
ever accepted...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 14:40   ` Stas Sergeev
@ 2016-01-12 14:47     ` Sowmini Varadhan
  2016-01-12 20:33       ` David Miller
  0 siblings, 1 reply; 37+ messages in thread
From: Sowmini Varadhan @ 2016-01-12 14:47 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev, Alexander Duyck

On (01/12/16 17:40), Stas Sergeev wrote:
> Updated testing results:
> After I disabled shared_media not only for "all", but
> also for _all_ interfaces individually, the problem seem to
> have stopped. So thanks for these hints.
> Now, as the media is actually really shared (same NIC/cable),
> I just wonder what's going on here.

I don't know the history of the shared_media tunable (or
the rationale behind the default) - I was just reading out the
code - perhaps someone on the list who has the history can share 
the motivation behind this tunable.

> And unfortunately we still don't know why these redirects are
> ever accepted...

I would guess that it is accepted because there is nothing
(no RFC chapter/verse) saying it should not. 

But the fact remains that the network is sub-optimally configured-
each packet that triggers the redirect is now amplified at the 
router - one copy gets forwarded, and one redirect gets sent back. 

--Sowmini

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2015-12-29 10:54 Q: bad routing table cache entries Stas Sergeev
                   ` (2 preceding siblings ...)
  2015-12-29 15:22 ` Sowmini Varadhan
@ 2016-01-12 15:34 ` Hannes Frederic Sowa
  2016-01-12 15:52   ` Hannes Frederic Sowa
  2016-01-12 15:57   ` Stas Sergeev
  3 siblings, 2 replies; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 15:34 UTC (permalink / raw)
  To: Stas Sergeev, netdev

On 29.12.2015 11:54, Stas Sergeev wrote:
> Hello.
>
> I was hitting a strange problem when some internet hosts
> suddenly stops responding until I reboot. ping to these
> host gives "Destination Host Unreachable". After the
> initial confusion, I've finally got to
> ip route get
> and got something quite strange.
>
>
> Example for GOOD address (the one that I can ping):
>
> ip route get 91.189.89.237
> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>      cache
>
>
> Example for BAD address (the one that stopped responding):
>
> ip route get 91.189.89.238
> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>      cache <redirected>

I tried to understand this thread and now wonder why this redirect route 
isn't there always. Can you please summarize again why this shouldn't 
happen? It looks totally fine to me from the configuration of your 
router and the subnet masks.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 15:34 ` Hannes Frederic Sowa
@ 2016-01-12 15:52   ` Hannes Frederic Sowa
  2016-01-12 16:03     ` Stas Sergeev
  2016-01-12 15:57   ` Stas Sergeev
  1 sibling, 1 reply; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 15:52 UTC (permalink / raw)
  To: Stas Sergeev, netdev

On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
> On 29.12.2015 11:54, Stas Sergeev wrote:
>> Hello.
>>
>> I was hitting a strange problem when some internet hosts
>> suddenly stops responding until I reboot. ping to these
>> host gives "Destination Host Unreachable". After the
>> initial confusion, I've finally got to
>> ip route get
>> and got something quite strange.
>>
>>
>> Example for GOOD address (the one that I can ping):
>>
>> ip route get 91.189.89.237
>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>      cache
>>
>>
>> Example for BAD address (the one that stopped responding):
>>
>> ip route get 91.189.89.238
>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>      cache <redirected>
>
> I tried to understand this thread and now wonder why this redirect route
> isn't there always. Can you please summarize again why this shouldn't
> happen? It looks totally fine to me from the configuration of your
> router and the subnet masks.

Just an addendum:

In IPv6 a redirect is seen as a notification telling hosts, this new 
address is on the same link as you. I think this semantic is the same 
for IPv4, so we are informing you that in essence you are getting a /32 
route installed to your new interface and can do link layer resolving of 
the new host.

I do think this is valid and fine.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 15:34 ` Hannes Frederic Sowa
  2016-01-12 15:52   ` Hannes Frederic Sowa
@ 2016-01-12 15:57   ` Stas Sergeev
  1 sibling, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 15:57 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

12.01.2016 18:34, Hannes Frederic Sowa wrote:
> On 29.12.2015 11:54, Stas Sergeev wrote:
>> Hello.
>>
>> I was hitting a strange problem when some internet hosts
>> suddenly stops responding until I reboot. ping to these
>> host gives "Destination Host Unreachable". After the
>> initial confusion, I've finally got to
>> ip route get
>> and got something quite strange.
>>
>>
>> Example for GOOD address (the one that I can ping):
>>
>> ip route get 91.189.89.237
>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>      cache
>>
>>
>> Example for BAD address (the one that stopped responding):
>>
>> ip route get 91.189.89.238
>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>      cache <redirected>
> 
> I tried to understand this thread and now wonder why this redirect route isn't there always. Can you please summarize again why this shouldn't happen? It looks totally fine to me from the
> configuration of your router and the subnet masks.
http://www.spinics.net/lists/netdev/msg358200.html
Sowmini Varadhan explains:
---
According to rfc1812 (pg 82-84)

   Routers MUST NOT generate a Redirect Message unless all the following
   conditions are met:

   o The packet is being forwarded out the same physical interface that
      it was received from,

   o The IP source address in the packet is on the same Logical IP
      (sub)network as the next-hop IP address, and

   o The packet does not contain an IP source route option.

The second condition seems to have been violated by the router.
---

And he also shows the tunable that stops the router from violating this.
Good that linux can be at least tuned to do the right thing. :)

The less-explained question is why the bad route is ever accepted.
This is what actually looks risky.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 15:52   ` Hannes Frederic Sowa
@ 2016-01-12 16:03     ` Stas Sergeev
  2016-01-12 16:10       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 16:03 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

12.01.2016 18:52, Hannes Frederic Sowa wrote:
> On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
>> On 29.12.2015 11:54, Stas Sergeev wrote:
>>> Hello.
>>>
>>> I was hitting a strange problem when some internet hosts
>>> suddenly stops responding until I reboot. ping to these
>>> host gives "Destination Host Unreachable". After the
>>> initial confusion, I've finally got to
>>> ip route get
>>> and got something quite strange.
>>>
>>>
>>> Example for GOOD address (the one that I can ping):
>>>
>>> ip route get 91.189.89.237
>>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>>      cache
>>>
>>>
>>> Example for BAD address (the one that stopped responding):
>>>
>>> ip route get 91.189.89.238
>>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>>      cache <redirected>
>>
>> I tried to understand this thread and now wonder why this redirect route
>> isn't there always. Can you please summarize again why this shouldn't
>> happen? It looks totally fine to me from the configuration of your
>> router and the subnet masks.
> 
> Just an addendum:
> 
> In IPv6 a redirect is seen as a notification telling hosts, this new address is on the same link as you. I think this semantic is the same for IPv4, so we are informing you that in essence you are
> getting a /32 route installed to your new interface and can do link layer resolving of the new host.
> 
> I do think this is valid and fine.
You can't call something "valid and fine" when it doesn't
work in the first place. Why and where it fails was the
subject of this thread.
If you think the router did the right thing, then please explain
the breakage from that point of view.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 16:03     ` Stas Sergeev
@ 2016-01-12 16:10       ` Hannes Frederic Sowa
  2016-01-12 16:42         ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 16:10 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev

On 12.01.2016 17:03, Stas Sergeev wrote:
> 12.01.2016 18:52, Hannes Frederic Sowa wrote:
>> On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
>>> On 29.12.2015 11:54, Stas Sergeev wrote:
>>>> Hello.
>>>>
>>>> I was hitting a strange problem when some internet hosts
>>>> suddenly stops responding until I reboot. ping to these
>>>> host gives "Destination Host Unreachable". After the
>>>> initial confusion, I've finally got to
>>>> ip route get
>>>> and got something quite strange.
>>>>
>>>>
>>>> Example for GOOD address (the one that I can ping):
>>>>
>>>> ip route get 91.189.89.237
>>>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>>>       cache
>>>>
>>>>
>>>> Example for BAD address (the one that stopped responding):
>>>>
>>>> ip route get 91.189.89.238
>>>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>>>       cache <redirected>
>>>
>>> I tried to understand this thread and now wonder why this redirect route
>>> isn't there always. Can you please summarize again why this shouldn't
>>> happen? It looks totally fine to me from the configuration of your
>>> router and the subnet masks.
>>
>> Just an addendum:
>>
>> In IPv6 a redirect is seen as a notification telling hosts, this new address is on the same link as you. I think this semantic is the same for IPv4, so we are informing you that in essence you are
>> getting a /32 route installed to your new interface and can do link layer resolving of the new host.
>>
>> I do think this is valid and fine.
> You can't call "valid and fine" something that doesn't
> work, at first place. Why and where does it fail, was the
> subject of this thread.

In terms of the shared media specification 
<https://tools.ietf.org/html/rfc1620> it is valid and fine.

You can also disable shared_media on the client and it won't accept such 
redirects anymore. It is just what we defined as the default.
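
A minimal Python sketch of how the effective shared_media setting of an interface could be checked. It assumes the rule from the kernel's ip-sysctl documentation that shared_media is enabled for an interface if at least one of conf/all and conf/<interface> is set; the helper names are hypothetical and only illustrate the lookup.

```python
from pathlib import Path

# Standard procfs location of the per-interface IPv4 settings.
CONF_ROOT = Path("/proc/sys/net/ipv4/conf")


def read_shared_media(name, root=CONF_ROOT):
    """Read conf/<name>/shared_media as an int (0 or 1)."""
    return int((Path(root) / name / "shared_media").read_text())


def effective_shared_media(all_value, iface_value):
    """Assumed rule: enabled if either 'all' or the interface value is set."""
    return bool(all_value) or bool(iface_value)


if __name__ == "__main__" and CONF_ROOT.is_dir():
    all_value = read_shared_media("all")
    for entry in sorted(CONF_ROOT.iterdir()):
        if entry.name in ("all", "default"):
            continue
        state = effective_shared_media(all_value, read_shared_media(entry.name))
        print(f"{entry.name}: shared_media {'on' if state else 'off'}")
```

Under that rule, switching it off for an interface means clearing both the "all" entry and the interface's own entry, which matches the report elsewhere in this thread that disabling it for "all" alone was not enough.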

> If you think router did the right thing, then please explain
> the breakage from that point of view.

Hope it makes sense.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 16:10       ` Hannes Frederic Sowa
@ 2016-01-12 16:42         ` Stas Sergeev
  2016-01-12 16:56           ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 16:42 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

12.01.2016 19:10, Hannes Frederic Sowa wrote:
> On 12.01.2016 17:03, Stas Sergeev wrote:
>> 12.01.2016 18:52, Hannes Frederic Sowa wrote:
>>> On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
>>>> On 29.12.2015 11:54, Stas Sergeev wrote:
>>>>> Hello.
>>>>>
>>>>> I was hitting a strange problem when some internet hosts
>>>>> suddenly stops responding until I reboot. ping to these
>>>>> host gives "Destination Host Unreachable". After the
>>>>> initial confusion, I've finally got to
>>>>> ip route get
>>>>> and got something quite strange.
>>>>>
>>>>>
>>>>> Example for GOOD address (the one that I can ping):
>>>>>
>>>>> ip route get 91.189.89.237
>>>>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>>>>       cache
>>>>>
>>>>>
>>>>> Example for BAD address (the one that stopped responding):
>>>>>
>>>>> ip route get 91.189.89.238
>>>>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>>>>       cache <redirected>
>>>>
>>>> I tried to understand this thread and now wonder why this redirect route
>>>> isn't there always. Can you please summarize again why this shouldn't
>>>> happen? It looks totally fine to me from the configuration of your
>>>> router and the subnet masks.
>>>
>>> Just an addendum:
>>>
>>> In IPv6 a redirect is seen as a notification telling hosts, this new address is on the same link as you. I think this semantic is the same for IPv4, so we are informing you that in essence you are
>>> getting a /32 route installed to your new interface and can do link layer resolving of the new host.
>>>
>>> I do think this is valid and fine.
>> You can't call "valid and fine" something that doesn't
>> work, at first place. Why and where does it fail, was the
>> subject of this thread.
> 
> In terms of the shared media specification <https://tools.ietf.org/html/rfc1620> it is valid and fine.
Good luck sending users to an RFC without giving any explanation. :)
Well, yes, it is interesting reading, but:
https://tools.ietf.org/html/rfc1812
---
   Routers MUST NOT generate a Redirect Message unless all the following
   conditions are met:

   o The packet is being forwarded out the same physical interface that
      it was received from,

   o The IP source address in the packet is on the same Logical IP
      (sub)network as the next-hop IP address, and

   o The packet does not contain an IP source route option.

   The source address used in the ICMP Redirect MUST belong to the same
   logical (sub)net as the destination address.
---

Could you please explain why the above does not apply?
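
The three RFC 1812 conditions quoted above can be sketched as a small Python predicate. This is only an illustration under stated assumptions: the function name is hypothetical, the router is assumed to use the same /22 mask as the host (the thread does not show the router's configuration), 192.168.9.1 is a made-up on-subnet next hop, and "same logical (sub)network" is approximated as "both addresses fall inside the router's network on that interface".

```python
import ipaddress


def may_send_redirect(in_iface, out_iface, src_ip, next_hop_ip,
                      router_net, has_source_route):
    """Return True only if all three quoted RFC 1812 conditions hold."""
    net = ipaddress.ip_network(router_net, strict=False)
    # 1) forwarded out the same interface it was received on
    same_interface = in_iface == out_iface
    # 2) packet source and next hop are on the same logical subnet
    same_subnet = (ipaddress.ip_address(src_ip) in net
                   and ipaddress.ip_address(next_hop_ip) in net)
    # 3) no IP source route option in the packet
    return same_interface and same_subnet and not has_source_route


# Addresses from this thread; the /22 on the router side is an assumption.
print(may_send_redirect("eth0", "eth0", "192.168.10.202", "192.168.0.1",
                        "192.168.8.1/22", False))   # False
# Hypothetical next hop inside 192.168.8.0/22, for comparison.
print(may_send_redirect("eth0", "eth0", "192.168.10.202", "192.168.9.1",
                        "192.168.8.1/22", False))   # True
```

With these assumptions, a redirect pointing at 192.168.0.1 fails the second condition, because 192.168.8.0/22 only covers 192.168.8.0 through 192.168.11.255.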

> You can also disable shared_media on the client and it won't accept such redirects anymore.
Only "such" redirects, or any redirects?


>> If you think router did the right thing, then please explain
>> the breakage from that point of view.
> Hope it makes sense.
No, because it still doesn't work for me.
What should I do to get such redirects to work?
What should I do to at least list them?
Even if this is in accordance with some RFC (which it does not seem to be),
this doesn't help me one bit unless it also works. :)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 16:42         ` Stas Sergeev
@ 2016-01-12 16:56           ` Stas Sergeev
  2016-01-12 17:06             ` Hannes Frederic Sowa
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 16:56 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Sowmini Varadhan

12.01.2016 19:42, Stas Sergeev wrote:
> 12.01.2016 19:10, Hannes Frederic Sowa wrote:
>> On 12.01.2016 17:03, Stas Sergeev wrote:
>>> 12.01.2016 18:52, Hannes Frederic Sowa wrote:
>>>> On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
>>>>> On 29.12.2015 11:54, Stas Sergeev wrote:
>>>>>> Hello.
>>>>>>
>>>>>> I was hitting a strange problem when some internet hosts
>>>>>> suddenly stops responding until I reboot. ping to these
>>>>>> host gives "Destination Host Unreachable". After the
>>>>>> initial confusion, I've finally got to
>>>>>> ip route get
>>>>>> and got something quite strange.
>>>>>>
>>>>>>
>>>>>> Example for GOOD address (the one that I can ping):
>>>>>>
>>>>>> ip route get 91.189.89.237
>>>>>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>>>>>       cache
>>>>>>
>>>>>>
>>>>>> Example for BAD address (the one that stopped responding):
>>>>>>
>>>>>> ip route get 91.189.89.238
>>>>>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>>>>>       cache <redirected>
>>>>>
>>>>> I tried to understand this thread and now wonder why this redirect route
>>>>> isn't there always. Can you please summarize again why this shouldn't
>>>>> happen? It looks totally fine to me from the configuration of your
>>>>> router and the subnet masks.
>>>>
>>>> Just an addendum:
>>>>
>>>> In IPv6 a redirect is seen as a notification telling hosts, this new address is on the same link as you. I think this semantic is the same for IPv4, so we are informing you that in essence you are
>>>> getting a /32 route installed to your new interface and can do link layer resolving of the new host.
>>>>
>>>> I do think this is valid and fine.
>>> You can't call "valid and fine" something that doesn't
>>> work, at first place. Why and where does it fail, was the
>>> subject of this thread.
>>
>> In terms of the shared media specification <https://tools.ietf.org/html/rfc1620> it is valid and fine.
> Good luck sending users to RFC without giving any explanations. :)
> Well, yes, an interesting reading, but:
> https://tools.ietf.org/html/rfc1812
> ---
>    Routers MUST NOT generate a Redirect Message unless all the following
>    conditions are met:
> 
>    o The packet is being forwarded out the same physical interface that
>       it was received from,
> 
>    o The IP source address in the packet is on the same Logical IP
>       (sub)network as the next-hop IP address, and
> 
>    o The packet does not contain an IP source route option.
> 
>    The source address used in the ICMP Redirect MUST belong to the same
>    logical (sub)net as the destination address.
> ---
> 
> Could you please explain why the above does not apply?
Also, the rfc1620 you pointed to seems to be saying this:

                A Redirect message SHOULD be silently discarded if the
                new router address it specifies is not on the same
                connected (sub-) net through which the Redirect arrived,
                or if the source of the Redirect is not the current
                first-hop router for the specified destination.

It seems this is exactly the rule we were trying to find
during the thread. And it seems to be violated, too. Unless I am
misinterpreting it, of course.
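
To make the quoted sentence concrete, here is a rough Python sketch of that host-side check. It models only the quoted rule, not what the Linux kernel actually does; the function name is hypothetical, and 192.168.9.1 is a made-up on-subnet gateway used for comparison.

```python
import ipaddress


def should_discard_redirect(iface_addr, new_gateway,
                            redirect_source, current_first_hop):
    """Apply the quoted rule: discard if either clause is true."""
    iface = ipaddress.ip_interface(iface_addr)
    # Clause 1: new router address is not on the connected subnet.
    off_subnet = ipaddress.ip_address(new_gateway) not in iface.network
    # Clause 2: the Redirect did not come from the current first-hop router.
    wrong_sender = (ipaddress.ip_address(redirect_source)
                    != ipaddress.ip_address(current_first_hop))
    return off_subnet or wrong_sender


# eth0 is 192.168.10.202 with mask 255.255.252.0, i.e. 192.168.8.0/22.
print(should_discard_redirect("192.168.10.202/22", "192.168.0.1",
                              "192.168.8.1", "192.168.8.1"))   # True
# Hypothetical gateway inside the /22: the rule would let it through.
print(should_discard_redirect("192.168.10.202/22", "192.168.9.1",
                              "192.168.8.1", "192.168.8.1"))   # False
```

Read this way, the redirect to 192.168.0.1 trips the first clause, since that address lies outside 192.168.8.0/22.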

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 16:56           ` Stas Sergeev
@ 2016-01-12 17:06             ` Hannes Frederic Sowa
  2016-01-12 17:18               ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 17:06 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev, Sowmini Varadhan

On 12.01.2016 17:56, Stas Sergeev wrote:
> 12.01.2016 19:42, Stas Sergeev wrote:
>> 12.01.2016 19:10, Hannes Frederic Sowa wrote:
>>> On 12.01.2016 17:03, Stas Sergeev wrote:
>>>> 12.01.2016 18:52, Hannes Frederic Sowa wrote:
>>>>> On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
>>>>>> On 29.12.2015 11:54, Stas Sergeev wrote:
>>>>>>> Hello.
>>>>>>>
>>>>>>> I was hitting a strange problem when some internet hosts
>>>>>>> suddenly stops responding until I reboot. ping to these
>>>>>>> host gives "Destination Host Unreachable". After the
>>>>>>> initial confusion, I've finally got to
>>>>>>> ip route get
>>>>>>> and got something quite strange.
>>>>>>>
>>>>>>>
>>>>>>> Example for GOOD address (the one that I can ping):
>>>>>>>
>>>>>>> ip route get 91.189.89.237
>>>>>>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>>>>>>        cache
>>>>>>>
>>>>>>>
>>>>>>> Example for BAD address (the one that stopped responding):
>>>>>>>
>>>>>>> ip route get 91.189.89.238
>>>>>>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>>>>>>        cache <redirected>
>>>>>>
>>>>>> I tried to understand this thread and now wonder why this redirect route
>>>>>> isn't there always. Can you please summarize again why this shouldn't
>>>>>> happen? It looks totally fine to me from the configuration of your
>>>>>> router and the subnet masks.
>>>>>
>>>>> Just an addendum:
>>>>>
>>>>> In IPv6 a redirect is seen as a notification telling hosts, this new address is on the same link as you. I think this semantic is the same for IPv4, so we are informing you that in essence you are
>>>>> getting a /32 route installed to your new interface and can do link layer resolving of the new host.
>>>>>
>>>>> I do think this is valid and fine.
>>>> You can't call "valid and fine" something that doesn't
>>>> work, at first place. Why and where does it fail, was the
>>>> subject of this thread.
>>>
>>> In terms of the shared media specification <https://tools.ietf.org/html/rfc1620> it is valid and fine.
>> Good luck sending users to RFC without giving any explanations. :)

It explains which extensions need to be added to an IP router or host,
and how, to deal better with shared media.

>> Well, yes, an interesting reading, but:
>> https://tools.ietf.org/html/rfc1812
>> ---
>>     Routers MUST NOT generate a Redirect Message unless all the following
>>     conditions are met:
>>
>>     o The packet is being forwarded out the same physical interface that
>>        it was received from,
>>
>>     o The IP source address in the packet is on the same Logical IP
>>        (sub)network as the next-hop IP address, and
>>
>>     o The packet does not contain an IP source route option.
>>
>>     The source address used in the ICMP Redirect MUST belong to the same
>>     logical (sub)net as the destination address.
>> ---
>>
>> Could you please explain why the above does not apply?
> Also the rfc1620 you pointed, seems to be saying this:
>
>                  A Redirect message SHOULD be silently discarded if the
>                  new router address it specifies is not on the same
>                  connected (sub-) net through which the Redirect arrived,
>                  or if the source of the Redirect is not the current
>                  first-hop router for the specified destination.
>
> It seems, this is exactly the rule we were trying to find
> during the thread. And it seems violated, either. Unless I am
> mis-interpreting it, of course.

If you read on, you will see that with shared_media this exact clause
(the first of those) is no longer in effect.

I don't know why shared_media=1 is the default in Linux; this decision
was made long before I joined here. Anyway, with shared_media=1 this is
absolutely the required behavior.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 17:06             ` Hannes Frederic Sowa
@ 2016-01-12 17:18               ` Stas Sergeev
  2016-01-12 17:26                 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 17:18 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Sowmini Varadhan

12.01.2016 20:06, Hannes Frederic Sowa wrote:
> On 12.01.2016 17:56, Stas Sergeev wrote:
>> 12.01.2016 19:42, Stas Sergeev wrote:
>>> 12.01.2016 19:10, Hannes Frederic Sowa wrote:
>>>> On 12.01.2016 17:03, Stas Sergeev wrote:
>>>>> 12.01.2016 18:52, Hannes Frederic Sowa wrote:
>>>>>> On 12.01.2016 16:34, Hannes Frederic Sowa wrote:
>>>>>>> On 29.12.2015 11:54, Stas Sergeev wrote:
>>>>>>>> Hello.
>>>>>>>>
>>>>>>>> I was hitting a strange problem when some internet hosts
>>>>>>>> suddenly stops responding until I reboot. ping to these
>>>>>>>> host gives "Destination Host Unreachable". After the
>>>>>>>> initial confusion, I've finally got to
>>>>>>>> ip route get
>>>>>>>> and got something quite strange.
>>>>>>>>
>>>>>>>>
>>>>>>>> Example for GOOD address (the one that I can ping):
>>>>>>>>
>>>>>>>> ip route get 91.189.89.237
>>>>>>>> 91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
>>>>>>>>        cache
>>>>>>>>
>>>>>>>>
>>>>>>>> Example for BAD address (the one that stopped responding):
>>>>>>>>
>>>>>>>> ip route get 91.189.89.238
>>>>>>>> 91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
>>>>>>>>        cache <redirected>
>>>>>>>
>>>>>>> I tried to understand this thread and now wonder why this redirect route
>>>>>>> isn't there always. Can you please summarize again why this shouldn't
>>>>>>> happen? It looks totally fine to me from the configuration of your
>>>>>>> router and the subnet masks.
>>>>>>
>>>>>> Just an addendum:
>>>>>>
>>>>>> In IPv6 a redirect is seen as a notification telling hosts, this new address is on the same link as you. I think this semantic is the same for IPv4, so we are informing you that in essence you are
>>>>>> getting a /32 route installed to your new interface and can do link layer resolving of the new host.
>>>>>>
>>>>>> I do think this is valid and fine.
>>>>> You can't call "valid and fine" something that doesn't
>>>>> work, at first place. Why and where does it fail, was the
>>>>> subject of this thread.
>>>>
>>>> In terms of the shared media specification <https://tools.ietf.org/html/rfc1620> it is valid and fine.
>>> Good luck sending users to RFC without giving any explanations. :)
> 
> It explains how and what extensions need to be added to an ip routing/host device to deal better with shared media.
> 
>>> Well, yes, an interesting reading, but:
>>> https://tools.ietf.org/html/rfc1812
>>> ---
>>>     Routers MUST NOT generate a Redirect Message unless all the following
>>>     conditions are met:
>>>
>>>     o The packet is being forwarded out the same physical interface that
>>>        it was received from,
>>>
>>>     o The IP source address in the packet is on the same Logical IP
>>>        (sub)network as the next-hop IP address, and
>>>
>>>     o The packet does not contain an IP source route option.
>>>
>>>     The source address used in the ICMP Redirect MUST belong to the same
>>>     logical (sub)net as the destination address.
>>> ---
>>>
>>> Could you please explain why the above does not apply?
>> Also the rfc1620 you pointed, seems to be saying this:
>>
>>                  A Redirect message SHOULD be silently discarded if the
>>                  new router address it specifies is not on the same
>>                  connected (sub-) net through which the Redirect arrived,
>>                  or if the source of the Redirect is not the current
>>                  first-hop router for the specified destination.
>>
>> It seems, this is exactly the rule we were trying to find
>> during the thread. And it seems violated, either. Unless I am
>> mis-interpreting it, of course.
> 
> If you read on you will read that with shared_media this exact clause (the first of those) is not in effect any more.
OK. But how do I get such a redirect to work if (checked with
tcpdump) the packets do not even go to eth0, but to "lo"?
And how do we deal with the above quote from rfc1812?

> I don't know why shared_media=1 is the default in Linux, this decision was made long before I joined here. Anyway, with shared_media=1 this is absolutely the required behavior.
Then it should work. How? :)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 17:18               ` Stas Sergeev
@ 2016-01-12 17:26                 ` Hannes Frederic Sowa
  2016-01-12 17:33                   ` Stas Sergeev
  2016-01-12 17:41                   ` Stas Sergeev
  0 siblings, 2 replies; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 17:26 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev, Sowmini Varadhan

On 12.01.2016 18:18, Stas Sergeev wrote:
> 12.01.2016 20:06, Hannes Frederic Sowa wrote:
>> On 12.01.2016 17:56, Stas Sergeev wrote:
>>> 12.01.2016 19:42, Stas Sergeev wrote:
>>> Also the rfc1620 you pointed, seems to be saying this:
>>>
>>>                   A Redirect message SHOULD be silently discarded if the
>>>                   new router address it specifies is not on the same
>>>                   connected (sub-) net through which the Redirect arrived,
>>>                   or if the source of the Redirect is not the current
>>>                   first-hop router for the specified destination.
>>>
>>> It seems, this is exactly the rule we were trying to find
>>> during the thread. And it seems violated, either. Unless I am
>>> mis-interpreting it, of course.
>>
>> If you read on you will read that with shared_media this exact clause (the first of those) is not in effect any more.
> OK. But how to get such a redirect to work, if (checked with
> tcpdump) the packets do not even go to eth0, but to "lo"?

I don't know; the router must be on the same shared medium. I guess
physical reconfiguration is required?

Aren't there ARP requests for the host on eth0?

> And how to deal with the above quote from rfc1812?
>
>> I don't know why shared_media=1 is the default in Linux, this decision was made long before I joined here. Anyway, with shared_media=1 this is absolutely the required behavior.
> Then it should work. How? :)

What should work? Sorry, I can't follow you. Everything looks fine to
me. The default is shared_media=1, so servers send such redirects and
clients accept them. If it were 0, rfc1812 would apply and should
stop servers from sending such redirects and clients from accepting them.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 17:26                 ` Hannes Frederic Sowa
@ 2016-01-12 17:33                   ` Stas Sergeev
  2016-01-12 17:47                     ` Hannes Frederic Sowa
  2016-01-12 17:41                   ` Stas Sergeev
  1 sibling, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 17:33 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Sowmini Varadhan

12.01.2016 20:26, Hannes Frederic Sowa wrote:
> On 12.01.2016 18:18, Stas Sergeev wrote:
>> 12.01.2016 20:06, Hannes Frederic Sowa wrote:
>>> On 12.01.2016 17:56, Stas Sergeev wrote:
>>>> 12.01.2016 19:42, Stas Sergeev wrote:
>>>> Also the rfc1620 you pointed, seems to be saying this:
>>>>
>>>>                   A Redirect message SHOULD be silently discarded if the
>>>>                   new router address it specifies is not on the same
>>>>                   connected (sub-) net through which the Redirect arrived,
>>>>                   or if the source of the Redirect is not the current
>>>>                   first-hop router for the specified destination.
>>>>
>>>> It seems, this is exactly the rule we were trying to find
>>>> during the thread. And it seems violated, either. Unless I am
>>>> mis-interpreting it, of course.
>>>
>>> If you read on you will read that with shared_media this exact clause (the first of those) is not in effect any more.
>> OK. But how to get such a redirect to work, if (checked with
>> tcpdump) the packets do not even go to eth0, but to "lo"?
> 
> I don't know, the router must be on the same shared medium. I guess physical reconfiguration is required?
It is the same.
Router 192.168.8.1 has just one Ethernet port.
And even on the 192.168.10.202 node I can do:
# arp -a |grep "0.1"
? (192.168.0.1) at 14:d6:4d:1c:97:3d [ether] on eth0
So even 0.1 should be reachable.
Still nothing works.
Should it work if the 192.168.0.1 router, to which 8.1 redirects,
has shared_media disabled?

>>> I don't know why shared_media=1 is the default in Linux, this decision was made long before I joined here. Anyway, with shared_media=1 this is absolutely the required behavior.
>> Then it should work. How? :)
> 
> What should work? Sorry, I can't follow you. Everything looks fine to me.
Except that pings do not flow.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 17:26                 ` Hannes Frederic Sowa
  2016-01-12 17:33                   ` Stas Sergeev
@ 2016-01-12 17:41                   ` Stas Sergeev
  1 sibling, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 17:41 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Sowmini Varadhan

12.01.2016 20:26, Hannes Frederic Sowa wrote:
> If it would be 0 the rfc1812 applies and
> should stop servers to send such redirects and clients to accept those.
Well, this is a very interesting point that
contradicts all the assumptions of the previous thread.
The assumptions were based on this document:
https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html
which says:
---
The shared_media setting tells the kernel if the physical network connected to a specific network card is a shared media or not. For example, if several different IP networks with different netmasks
operate over the same physical media or not.
*The main effect that this variable makes, is to tell the kernel whether it should send ICMP redirects to specific networks or not.*
---

But you seem to be saying that rfc1812 applies when it
is _disabled_, so the ICMP redirects should still be sent,
just under different rules.
So can you confirm that you are implying the above document is wrong?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 17:33                   ` Stas Sergeev
@ 2016-01-12 17:47                     ` Hannes Frederic Sowa
  2016-01-12 20:43                       ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 17:47 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev, Sowmini Varadhan

On 12.01.2016 18:33, Stas Sergeev wrote:
> 12.01.2016 20:26, Hannes Frederic Sowa wrote:
>> On 12.01.2016 18:18, Stas Sergeev wrote:
>>> 12.01.2016 20:06, Hannes Frederic Sowa wrote:
>>>> On 12.01.2016 17:56, Stas Sergeev wrote:
>>>>> 12.01.2016 19:42, Stas Sergeev wrote:
>>>>> Also the rfc1620 you pointed, seems to be saying this:
>>>>>
>>>>>                    A Redirect message SHOULD be silently discarded if the
>>>>>                    new router address it specifies is not on the same
>>>>>                    connected (sub-) net through which the Redirect arrived,
>>>>>                    or if the source of the Redirect is not the current
>>>>>                    first-hop router for the specified destination.
>>>>>
>>>>> It seems, this is exactly the rule we were trying to find
>>>>> during the thread. And it seems violated, either. Unless I am
>>>>> mis-interpreting it, of course.
>>>>
>>>> If you read on you will read that with shared_media this exact clause (the first of those) is not in effect any more.
>>> OK. But how to get such a redirect to work, if (checked with
>>> tcpdump) the packets do not even go to eth0, but to "lo"?
>>
>> I don't know, the router must be on the same shared medium. I guess physical reconfiguration is required?
> It is same.
> Router 192.168.8.1 has just one ethernet port.
> And even on the 192.168.10.202 node I can do:
> # arp -a |grep "0.1"
> ? (192.168.0.1) at 14:d6:4d:1c:97:3d [ether] on eth0
> So even 0.1 is about to be reachable.
> Still nothing works.
> Should it work if 192.168.0.1 router, to which 8.1 redirects,
> has shared_media disabled?

Can you check with tcpdump? ping also requires the router to find a
correct way back, so packets can get stuck in a lot of places. Also, uRPF
may be active, which more or less defeats shared_media, and please check
netfilter.
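
For the uRPF part, a small Python sketch of how the effective rp_filter mode of an interface is derived. It assumes the rule from the kernel's ip-sysctl documentation that the larger of the conf/all and conf/<interface> values is used; the function names are hypothetical.

```python
# rp_filter values as documented for the sysctl.
RP_FILTER_MODES = {0: "no source validation", 1: "strict", 2: "loose"}


def effective_rp_filter(all_value, iface_value):
    """Assumed rule: the maximum of conf/all and conf/<iface> applies."""
    return max(all_value, iface_value)


def describe_rp_filter(all_value, iface_value):
    """Human-readable name of the effective mode."""
    return RP_FILTER_MODES[effective_rp_filter(all_value, iface_value)]


# Example: "all" is 0 but the interface itself is set to strict mode.
print(describe_rp_filter(0, 1))   # strict
```

The two inputs would come from /proc/sys/net/ipv4/conf/all/rp_filter and the matching per-interface file.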

Bye,
Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 14:47     ` Sowmini Varadhan
@ 2016-01-12 20:33       ` David Miller
  0 siblings, 0 replies; 37+ messages in thread
From: David Miller @ 2016-01-12 20:33 UTC (permalink / raw)
  To: sowmini.varadhan; +Cc: stsp, netdev, alexander.duyck

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Tue, 12 Jan 2016 09:47:25 -0500

> On (01/12/16 17:40), Stas Sergeev wrote:
>> Updated testing results:
>> After I disabled shared_media not only for "all", but
>> also for _all_ interfaces individually, the problem seem to
>> have stopped. So thanks for these hints.
>> Now, as the media is actually really shared (same NIC/cable),
>> I just wonder what's going on here.
> 
> I dont know the history of the shared_media tunable (or
> the rationale behind the default) - I was just reading out the
> code - perhaps someone on the list who has the history can share 
> the motivation behind this tunable.

Most of the choices for defaults for ipv4 routing and ARP behavior
are motivated by two things:

1) host based addressing model

2) increasing the chance of successful communication with peers

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 17:47                     ` Hannes Frederic Sowa
@ 2016-01-12 20:43                       ` Stas Sergeev
  2016-01-12 22:26                         ` Hannes Frederic Sowa
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 20:43 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Sowmini Varadhan

12.01.2016 20:47, Hannes Frederic Sowa wrote:
> On 12.01.2016 18:33, Stas Sergeev wrote:
>> 12.01.2016 20:26, Hannes Frederic Sowa wrote:
>>> On 12.01.2016 18:18, Stas Sergeev wrote:
>>>> 12.01.2016 20:06, Hannes Frederic Sowa wrote:
>>>>> On 12.01.2016 17:56, Stas Sergeev wrote:
>>>>>> 12.01.2016 19:42, Stas Sergeev wrote:
>>>>>> Also the rfc1620 you pointed, seems to be saying this:
>>>>>>
>>>>>>                    A Redirect message SHOULD be silently discarded if the
>>>>>>                    new router address it specifies is not on the same
>>>>>>                    connected (sub-) net through which the Redirect arrived,
>>>>>>                    or if the source of the Redirect is not the current
>>>>>>                    first-hop router for the specified destination.
>>>>>>
>>>>>> It seems, this is exactly the rule we were trying to find
>>>>>> during the thread. And it seems violated, either. Unless I am
>>>>>> mis-interpreting it, of course.
>>>>>
>>>>> If you read on you will read that with shared_media this exact 
>>>>> clause (the first of those) is not in effect any more.
>>>> OK. But how to get such a redirect to work, if (checked with
>>>> tcpdump) the packets do not even go to eth0, but to "lo"?
>>>
>>> I don't know, the router must be on the same shared medium. I guess 
>>> physical reconfiguration is required?
>> It is same.
>> Router 192.168.8.1 has just one ethernet port.
>> And even on the 192.168.10.202 node I can do:
>> # arp -a |grep "0.1"
>> ? (192.168.0.1) at 14:d6:4d:1c:97:3d [ether] on eth0
>> So even 0.1 is about to be reachable.
>> Still nothing works.
>> Should it work if 192.168.0.1 router, to which 8.1 redirects,
>> has shared_media disabled?
>
> Can you check with tcpdump?
That's what I already did.
I monitored on the 8.1 router and on the node itself,
and my conclusion was that the packets do not
even reach the eth0 interface. Instead I captured
them on the "lo" interface, so I assumed such a route is
completely broken.
If it is not, how can I even see that it exists? How do I
list these redirect routes?
I'd like to do some investigation, but this looks like no
more than black magic without proper support
from tools, proper documentation, etc.
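
As a stopgap for spotting such entries one destination at a time, the `ip route get` output shown earlier in this thread can be parsed with a few lines of Python. This is a sketch that assumes the output keeps the layout quoted in this thread (other iproute2 versions may print it differently); the function and constant names are hypothetical, and it inspects a single destination rather than listing all redirect routes.

```python
import re

# First line looks like: "<dst> via <gateway> dev <iface>  src <addr>"
ROUTE_RE = re.compile(r"^(?P<dst>\S+)(?: via (?P<via>\S+))? dev (?P<dev>\S+)")


def parse_route_get(text):
    """Extract destination, gateway, device and the <redirected> flag."""
    lines = text.strip().splitlines()
    match = ROUTE_RE.match(lines[0].strip())
    if match is None:
        raise ValueError("unrecognised 'ip route get' output")
    info = match.groupdict()
    info["redirected"] = "<redirected>" in text
    return info


# The two samples quoted in this thread.
GOOD = "91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202\n    cache\n"
BAD = ("91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202\n"
       "    cache <redirected>\n")

for sample in (GOOD, BAD):
    route = parse_route_get(sample)
    print(route["dst"], route["via"], route["redirected"])
```

To use it on a live system, the text could come from running `ip route get <address>` for each destination of interest and passing the captured output to `parse_route_get`.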

And I suspect that shared_media is disabled on the 0.1
router, so I wonder if this can work at all, even if the node
is fixed to do the right thing with those redirects.
In a nearby message David Miller says:
---

2) increasing the chance of successful communication with peers

---
If this can't work right when one of the gateways has
shared_media disabled, then this rule is clearly violated.

> ping requires the router to also find a correct way back, so packet 
> can get stuck at a lot of places. Also uRPF is maybe active which kind 
> of defeats shared_media and please check netfilter.
I am pretty sure the node has a default Ubuntu install without
any special network tweaks, but I'll double-check.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Q: bad routing table cache entries
  2016-01-12 20:43                       ` Stas Sergeev
@ 2016-01-12 22:26                         ` Hannes Frederic Sowa
  2016-01-12 22:57                           ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 22:26 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev, Sowmini Varadhan

Hi,

On 12.01.2016 21:43, Stas Sergeev wrote:
> 12.01.2016 20:47, Hannes Frederic Sowa wrote:
>> On 12.01.2016 18:33, Stas Sergeev wrote:
>>> 12.01.2016 20:26, Hannes Frederic Sowa пишет:
>>>> On 12.01.2016 18:18, Stas Sergeev wrote:
>>>>> 12.01.2016 20:06, Hannes Frederic Sowa пишет:
>>>>>> On 12.01.2016 17:56, Stas Sergeev wrote:
>>>>>>> 12.01.2016 19:42, Stas Sergeev пишет:
>>>>>>> Also the rfc1620 you pointed, seems to be saying this:
>>>>>>>
>>>>>>>                    A Redirect message SHOULD be silently
>>>>>>> discarded if the
>>>>>>>                    new router address it specifies is not on the
>>>>>>> same
>>>>>>>                    connected (sub-) net through which the
>>>>>>> Redirect arrived,
>>>>>>>                    or if the source of the Redirect is not the
>>>>>>> current
>>>>>>>                    first-hop router for the specified destination.
>>>>>>>
>>>>>>> It seems, this is exactly the rule we were trying to find
>>>>>>> during the thread. And it seems violated, either. Unless I am
>>>>>>> mis-interpreting it, of course.
>>>>>>
>>>>>> If you read on you will read that with shared_media this exact
>>>>>> clause (the first of those) is not in effect any more.
>>>>> OK. But how to get such a redirect to work, if (checked with
>>>>> tcpdump) the packets do not even go to eth0, but to "lo"?
>>>>
>>>> I don't know, the router must be on the same shared medium. I guess
>>>> physical reconfiguration is required?
>>> It is same.
>>> Router 192.168.8.1 has just one ethernet port.
>>> And even on the 192.168.10.202 node I can do:
>>> # arp -a |grep "0.1"
>>> ? (192.168.0.1) at 14:d6:4d:1c:97:3d [ether] on eth0
>>> So even 0.1 is about to be reachable.
>>> Still nothing works.
>>> Should it work if 192.168.0.1 router, to which 8.1 redirects,
>>> has shared_media disabled?
>>
>> Can you check with tcpdump?
> That's what I already did.
> I monitored on 8.1 router and on the node itself,
> and my conclusion was that the packets do not
> even reach the eth0 interface. Instead I captured
> them on "lo" interface, so I assumed such route is
> completely broken.

I didn't check a full-featured setup, but I did some quick-and-dirty
testing with namespaces and saw correct ARP requests on the external
veth for the router that is now assumed to be on-link.

> If it is not - how can I even see that it exist? How to
> list these redirect routes?

Yeah, that might be a minor issue. The rt_cache procfs files are empty
since the routing cache was removed, and we probably don't have an
interface for listing next-hop exceptions; I consider this a todo. :)
ip route get is your only hope right now.
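
Since ip route get is the only window into these entries, a small parser can make the check repeatable across many destinations. This is a sketch only: the function name is hypothetical, and the two sample strings are the outputs posted in the first message of this thread.

```python
import re


def parse_route_get(text):
    """Parse the text printed by `ip route get <addr>` into a small dict."""
    tokens = text.split()
    # Flags appear in angle brackets after "cache", e.g. "<redirected>".
    flags = [f for group in re.findall(r"<([^>]+)>", text)
             for f in group.split(",")]
    info = {"dst": tokens[0] if tokens else None,
            "via": None, "dev": None, "src": None, "flags": flags}
    for key in ("via", "dev", "src"):
        if key in tokens:
            idx = tokens.index(key)
            if idx + 1 < len(tokens):
                info[key] = tokens[idx + 1]
    info["redirected"] = "redirected" in flags
    return info


GOOD = """91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
    cache"""
BAD = """91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
    cache <redirected>"""

if __name__ == "__main__":
    for sample in (GOOD, BAD):
        r = parse_route_get(sample)
        print(r["dst"], "via", r["via"], "redirected:", r["redirected"])
```

Feeding it the output of ip route get for each affected destination would show which ones currently carry a redirected next hop.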

Anyway, it seems there are problems with the redirect timeout somehow.
I am investigating this.

> I'd like to do some investigations, but this looks no
> more than a black magic without a proper support
> from tools, proper documentation, etc.

Hmm, so far I think shared_media is behaving as it should, except that
maybe it shouldn't be the default setting. Maybe someone who
remembers why it is the default could chime in?
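
To make the disagreement concrete, here is a simplified model of the discard rule quoted earlier in this message and of how shared_media lifts its first clause. It is a sketch under stated assumptions, not the kernel's implementation: the function name and parameters are hypothetical, and the real code performs further checks.

```python
import ipaddress


def accept_redirect(new_gw, redirect_src, current_gw, iface_net, shared_media):
    """Simplified host-side decision on an ICMP redirect (illustrative only)."""
    # Second clause of the quoted rule: the redirect must come from the
    # current first-hop router for the destination.
    if ipaddress.ip_address(redirect_src) != ipaddress.ip_address(current_gw):
        return False
    # First clause: the new router must be on the connected subnet,
    # unless shared_media lifts that restriction.
    on_link = ipaddress.ip_address(new_gw) in ipaddress.ip_network(iface_net)
    return on_link or shared_media


if __name__ == "__main__":
    # Addresses from this thread: 8.1 redirects the node to 0.1,
    # and the node's eth0 subnet is 192.168.8.0/22.
    for sm in (True, False):
        ok = accept_redirect("192.168.0.1", "192.168.8.1", "192.168.8.1",
                             "192.168.8.0/22", shared_media=sm)
        print("shared_media =", sm, "-> accepted:", ok)
```

Under this model the redirect to 192.168.0.1 is accepted only when shared_media is on, which matches the redirected entry reported on the node.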

> And I suspect that shared_media is disabled on a 0.1
> router, so I wonder if this can work at all, even if the node
> is cured to do the right thing with those redirects.
> In a nearby message David Miller says:

shared_media is enabled by default, so chances are relatively high
that it is enabled unless someone turned it off.

> ---
>
> 2) increasing the chance of successful communication with peers
>
> ---
> If this can't work right when one of the gateways has
> shared_media disabled, then this rule is clearly violated.

It can work right, and I think the RFC actually gives examples where it
is very useful. Also, IPv6 adopts it as the default, so it might make
sense to have it as the default.

I still think something may be broken in your network setup.

>> ping requires the router to also find a correct way back, so packet
>> can get stuck at a lot of places. Also uRPF is maybe active which kind
>> of defeats shared_media and please check netfilter.
> I am pretty sure the node has a default ubuntu without
> any special network tweaks, but I'll double-check.

Please do that. Thanks!

Bye,
Hannes


* Re: Q: bad routing table cache entries
  2016-01-12 22:26                         ` Hannes Frederic Sowa
@ 2016-01-12 22:57                           ` Stas Sergeev
  2016-01-12 23:07                             ` Hannes Frederic Sowa
  0 siblings, 1 reply; 37+ messages in thread
From: Stas Sergeev @ 2016-01-12 22:57 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Sowmini Varadhan

13.01.2016 01:26, Hannes Frederic Sowa wrote:
> Hi,
>
> On 12.01.2016 21:43, Stas Sergeev wrote:
>> 12.01.2016 20:47, Hannes Frederic Sowa wrote:
>>> On 12.01.2016 18:33, Stas Sergeev wrote:
>>>> 12.01.2016 20:26, Hannes Frederic Sowa wrote:
>>>>> On 12.01.2016 18:18, Stas Sergeev wrote:
>>>>>> 12.01.2016 20:06, Hannes Frederic Sowa wrote:
>>>>>>> On 12.01.2016 17:56, Stas Sergeev wrote:
>>>>>>>> 12.01.2016 19:42, Stas Sergeev wrote:
>>>>>>>> Also the rfc1620 you pointed, seems to be saying this:
>>>>>>>>
>>>>>>>>                    A Redirect message SHOULD be silently
>>>>>>>> discarded if the
>>>>>>>>                    new router address it specifies is not on the
>>>>>>>> same
>>>>>>>>                    connected (sub-) net through which the
>>>>>>>> Redirect arrived,
>>>>>>>>                    or if the source of the Redirect is not the
>>>>>>>> current
>>>>>>>>                    first-hop router for the specified destination.
>>>>>>>>
>>>>>>>> It seems, this is exactly the rule we were trying to find
>>>>>>>> during the thread. And it seems violated, either. Unless I am
>>>>>>>> mis-interpreting it, of course.
>>>>>>>
>>>>>>> If you read on you will read that with shared_media this exact
>>>>>>> clause (the first of those) is not in effect any more.
>>>>>> OK. But how to get such a redirect to work, if (checked with
>>>>>> tcpdump) the packets do not even go to eth0, but to "lo"?
>>>>>
>>>>> I don't know, the router must be on the same shared medium. I guess
>>>>> physical reconfiguration is required?
>>>> It is same.
>>>> Router 192.168.8.1 has just one ethernet port.
>>>> And even on the 192.168.10.202 node I can do:
>>>> # arp -a |grep "0.1"
>>>> ? (192.168.0.1) at 14:d6:4d:1c:97:3d [ether] on eth0
>>>> So even 0.1 is about to be reachable.
>>>> Still nothing works.
>>>> Should it work if 192.168.0.1 router, to which 8.1 redirects,
>>>> has shared_media disabled?
>>>
>>> Can you check with tcpdump?
>> That's what I already did.
>> I monitored on 8.1 router and on the node itself,
>> and my conclusion was that the packets do not
>> even reach the eth0 interface. Instead I captured
>> them on "lo" interface, so I assumed such route is
>> completely broken.
>
> I didn't check a full featured setup but just did some dirty testing 
> with namespaces and I had correct arp request for the now to be 
> assumed on-link router on the external veth.
I haven't checked anything with ARP.
I set up tcpdump to capture only ICMP.
What would you like me to check? Could you please
give detailed instructions?

>> If it is not - how can I even see that it exist? How to
>> list these redirect routes?
>
> Yeah, that might be a minor issue. The rt_cache procfs files are empty 
> since the deletion of the cache and we probably don't have an 
> interface for next hop exceptions, I consider this todo. :) ip route 
> get is your only hope right now.
>
> Anyway, seems like there are problems with redirect timeout somehow. I 
> am investigating this.
>
>> I'd like to do some investigations, but this looks no
>> more than a black magic without a proper support
>> from tools, proper documentation, etc.
>
> Hmm, so far I think shared_media is behaving like it should,
No, unless you correct the documentation:
https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html
It does not say what you say.
So this feature is essentially poorly (or wrongly) documented.

> besides maybe it shouldn't be the default setting. Maybe someone who 
> can remember why it is default could chime in?
>
>> And I suspect that shared_media is disabled on a 0.1
>> router, so I wonder if this can work at all, even if the node
>> is cured to do the right thing with those redirects.
>> In a nearby message David Miller says:
>
> Default is that shared_media is enabled,
On what OS, and since what version?

> so the chances are relatively high that it is enabled if it is not 
> turned off.
I don't even know what runs on the 0.1 router - maybe Windows 95,
who knows. You can't assume the latest Linux kernel is everywhere.

>> ---
>>
>> 2) increasing the chance of successful communication with peers
>>
>> ---
>> If this can't work right when one of the gateways has
>> shared_media disabled, then this rule is clearly violated.
>
> It can work right and I think the RFC actually gives examples where it 
> is very useful. Also IPv6 adapts it as the default, so it might make 
> sense to have it as default.
>
> I still consider something broken in your network setup, maybe.
>
>>> ping requires the router to also find a correct way back, so packet
>>> can get stuck at a lot of places. Also uRPF is maybe active which kind
>>> of defeats shared_media and please check netfilter.
>> I am pretty sure the node has a default ubuntu without
>> any special network tweaks, but I'll double-check.
>
> Please do that. Thanks!
OK, but please make clear what I should check.
iptables does not seem to be involved; what else should I check?


* Re: Q: bad routing table cache entries
  2016-01-12 22:57                           ` Stas Sergeev
@ 2016-01-12 23:07                             ` Hannes Frederic Sowa
  2016-01-13 12:59                               ` Stas Sergeev
  0 siblings, 1 reply; 37+ messages in thread
From: Hannes Frederic Sowa @ 2016-01-12 23:07 UTC (permalink / raw)
  To: Stas Sergeev; +Cc: netdev, Sowmini Varadhan

Hi,

On 12.01.2016 23:57, Stas Sergeev wrote:
> 13.01.2016 01:26, Hannes Frederic Sowa wrote:
>> I didn't check a full featured setup but just did some dirty testing
>> with namespaces and I had correct arp request for the now to be
>> assumed on-link router on the external veth.
> I haven't checked anything with arp.
> I set up tcpdump to only capture icmp.
> What would you like me to check, could you please
> give the detailed instructions?

Simply check for ARP traffic on the interface. ARP requests should leave
your client and ask directly for the new router you got as the next hop.
If it does not answer, that is the problem.

>>> If it is not - how can I even see that it exist? How to
>>> list these redirect routes?
>>
>> Yeah, that might be a minor issue. The rt_cache procfs files are empty
>> since the deletion of the cache and we probably don't have an
>> interface for next hop exceptions, I consider this todo. :) ip route
>> get is your only hope right now.
>>
>> Anyway, seems like there are problems with redirect timeout somehow. I
>> am investigating this.
>>
>>> I'd like to do some investigations, but this looks no
>>> more than a black magic without a proper support
>>> from tools, proper documentation, etc.
>>
>> Hmm, so far I think shared_media is behaving like it should,
> No, unless you correct the documentation:
> https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html
>
> It says not what you say.
> So this feature is essentially poorly (or wrongly) documented.

I am sorry, but I have no access to this website. I just grepped around 
in the Documentation/ directory of the kernel:
<https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/ip-sysctl.txt?id=refs/tags/v4.4#n1014>

It is correctly documented there and also references the RFC. I think 
this is fine. Please feel free to send a patch, it will be welcomed.

>> besides maybe it shouldn't be the default setting. Maybe someone who
>> can remember why it is default could chime in?
>>
>>> And I suspect that shared_media is disabled on a 0.1
>>> router, so I wonder if this can work at all, even if the node
>>> is cured to do the right thing with those redirects.
>>> In a nearby message David Miller says:
>>
>> Default is that shared_media is enabled,
> On what OS, and since what version?
>
>> so the chances are relatively high that it is enabled if it is not
>> turned off.
> I don't even know what is there in a 0.1 router - maybe windows95,
> who knows. You can't assume the latest linux kernel is everywhere.

Looking at the kernel CVS history, the change was made in 2000(!).
It would be pretty strange to find such an old kernel there.

>>> ---
>>>
>>> 2) increasing the chance of successful communication with peers
>>>
>>> ---
>>> If this can't work right when one of the gateways has
>>> shared_media disabled, then this rule is clearly violated.
>>
>> It can work right and I think the RFC actually gives examples where it
>> is very useful. Also IPv6 adapts it as the default, so it might make
>> sense to have it as default.
>>
>> I still consider something broken in your network setup, maybe.
>>
>>>> ping requires the router to also find a correct way back, so packet
>>>> can get stuck at a lot of places. Also uRPF is maybe active which kind
>>>> of defeats shared_media and please check netfilter.
>>> I am pretty sure the node has a default ubuntu without
>>> any special network tweaks, but I'll double-check.
>>
>> Please do that. Thanks!
> OK but please make it clear what should I check.
> iptables do not seem to be in the game, what else to check?

Please look for ARP traffic, as described above.

Thanks,
Hannes


* Re: Q: bad routing table cache entries
  2016-01-12 23:07                             ` Hannes Frederic Sowa
@ 2016-01-13 12:59                               ` Stas Sergeev
  0 siblings, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2016-01-13 12:59 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev

13.01.2016 02:07, Hannes Frederic Sowa wrote:
> Hi,
> 
> On 12.01.2016 23:57, Stas Sergeev wrote:
>> 13.01.2016 01:26, Hannes Frederic Sowa wrote:
>>> I didn't check a full featured setup but just did some dirty testing
>>> with namespaces and I had correct arp request for the now to be
>>> assumed on-link router on the external veth.
>> I haven't checked anything with arp.
>> I set up tcpdump to only capture icmp.
>> What would you like me to check, could you please
>> give the detailed instructions?
> 
> Check simply for arp traffic on the interface. arp requests should leave your client and ask directly for the new router you got as next-hop. If it does not answer, there is the problem.
It does not answer:

tcpdump -vn -i eth0 arp host 192.168.10.202
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
15:38:23.334783 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
15:38:24.329949 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
15:38:25.329946 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28
15:38:26.338987 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28

This is what happens when I try to ping via the redirected route.
I wonder why it should answer. Suppose it has shared_media disabled:
should it answer into a different subnet even then?
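
For longer captures, the same conclusion can be reached mechanically. The sketch below scans tcpdump text output for ARP requests that never get a reply. The request lines are the ones captured above; the reply pattern ("Reply <ip> is-at <mac>") is an assumption about tcpdump's output, since no reply appears in this capture, and the function name is hypothetical.

```python
import re

# Matches request lines such as the ones shown in the capture above.
REQUEST = re.compile(r"Request who-has (\S+) tell (\S+?),")
# Assumed shape of a reply line; not taken from this thread.
REPLY = re.compile(r"Reply (\S+) is-at")


def unanswered_requests(lines):
    """Return {target_ip: request_count} for targets that never replied."""
    asked, answered = {}, set()
    for line in lines:
        m = REQUEST.search(line)
        if m:
            asked[m.group(1)] = asked.get(m.group(1), 0) + 1
            continue
        m = REPLY.search(line)
        if m:
            answered.add(m.group(1))
    return {ip: n for ip, n in asked.items() if ip not in answered}


CAPTURE = [
    "15:38:23.334783 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28",
    "15:38:24.329949 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28",
    "15:38:25.329946 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28",
    "15:38:26.338987 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.10.202, length 28",
]

if __name__ == "__main__":
    print(unanswered_requests(CAPTURE))
```

On this capture it reports four unanswered requests for 192.168.0.1, which is the symptom described above.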

>>>> If it is not - how can I even see that it exist? How to
>>>> list these redirect routes?
>>>
>>> Yeah, that might be a minor issue. The rt_cache procfs files are empty
>>> since the deletion of the cache and we probably don't have an
>>> interface for next hop exceptions, I consider this todo. :) ip route
>>> get is your only hope right now.
>>>
>>> Anyway, seems like there are problems with redirect timeout somehow. I
>>> am investigating this.
>>>
>>>> I'd like to do some investigations, but this looks no
>>>> more than a black magic without a proper support
>>>> from tools, proper documentation, etc.
>>>
>>> Hmm, so far I think shared_media is behaving like it should,
>> No, unless you correct the documentation:
>> https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html
>>
>> It says not what you say.
>> So this feature is essentially poorly (or wrongly) documented.
> 
> I am sorry, but I have no access to this website. I just grepped around in the Documentation/ directory of the kernel:
> <https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/ip-sysctl.txt?id=refs/tags/v4.4#n1014>
> 
> It is correctly documented
With just a single-line description like this:
"Send(router) or accept(host) RFC1620 shared media redirects."

Isn't this too little for a feature that completely changes the
meaning of something as fundamental as the netmask? IMHO books
and articles should have been written before making it the default. :)
And what about /proc/sys/net/ipv4/route/gc_interval, redirect_load,
redirect_number, redirect_silence? Are they documented at all?
I am trying to make the problematic event trigger faster for
debugging, or to make the cache expire faster, but all of this looks
completely undocumented, and intuitive guesses do not work.
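
When experimenting with these knobs it helps to record their values before and after each change. A minimal sketch, assuming the four files named above exist under /proc/sys/net/ipv4/route on the node; the function name is hypothetical, and the values are returned as raw strings because their units are not established in this thread.

```python
from pathlib import Path

ROUTE_KEYS = ("gc_interval", "redirect_load", "redirect_number", "redirect_silence")


def read_route_sysctls(base="/proc/sys/net/ipv4/route", keys=ROUTE_KEYS):
    """Return {name: raw string value}, with None for files that cannot be read."""
    values = {}
    for key in keys:
        try:
            values[key] = (Path(base) / key).read_text().strip()
        except OSError:
            values[key] = None
    return values


if __name__ == "__main__":
    for name, value in read_route_sysctls().items():
        print(f"{name} = {value}")
```

Saving the printed values alongside each test run makes it easier to tell which change, if any, affected how long the redirected entry persists.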

>>> besides maybe it shouldn't be the default setting. Maybe someone who
>>> can remember why it is default could chime in?
>>>
>>>> And I suspect that shared_media is disabled on a 0.1
>>>> router, so I wonder if this can work at all, even if the node
>>>> is cured to do the right thing with those redirects.
>>>> In a nearby message David Miller says:
>>>
>>> Default is that shared_media is enabled,
>> On what OS, and since what version?
>>
>>> so the chances are relatively high that it is enabled if it is not
>>> turned off.
>> I don't even know what is there in a 0.1 router - maybe windows95,
>> who knows. You can't assume the latest linux kernel is everywhere.
> 
> Looking into the kernel cvs history the change was done in 2000(!). Would be pretty strange to find such an old kernel there.
The router at 192.168.0.1 may run some other OS, maybe FreeBSD.
You can't assume Linux is everywhere.


* Q: bad routing table cache entries
@ 2015-12-28 15:26 Stas Sergeev
  0 siblings, 0 replies; 37+ messages in thread
From: Stas Sergeev @ 2015-12-28 15:26 UTC (permalink / raw)
  To: Linux kernel

Hello.

I was hitting a strange problem where some hosts
suddenly stop responding until reboot. Pinging these
hosts gives "Destination Host Unreachable". After the
initial confusion, I finally got to
ip route get
and got something quite strange.


Example for GOOD address (the one that I can ping):

ip route get 91.189.89.237
91.189.89.237 via 192.168.8.1 dev eth0  src 192.168.10.202
    cache


Example for BAD address (the one that stopped responding):

ip route get 91.189.89.238
91.189.89.238 via 192.168.0.1 dev eth0  src 192.168.10.202
    cache <redirected>


Two things differ: the <redirected> mark appears, and the
gateway changed from 192.168.8.1 to 192.168.0.1.
Now, 192.168.0.1 is also a valid gateway, but it is outside
of the network mask for the eth0 interface:

ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:50:43:00:0b:e0
          inet addr:192.168.10.202  Bcast:192.168.11.255  Mask:255.255.252.0
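
The off-subnet claim can be checked with a few lines of arithmetic. This sketch uses Python's standard ipaddress module with the address and mask from the ifconfig output above; the helper name on_link is hypothetical.

```python
import ipaddress

# Address and netmask exactly as reported by ifconfig for eth0.
iface = ipaddress.ip_interface("192.168.10.202/255.255.252.0")


def on_link(gateway, interface=iface):
    """True if the gateway address lies inside the interface's subnet."""
    return ipaddress.ip_address(gateway) in interface.network


if __name__ == "__main__":
    print("network:  ", iface.network)
    print("broadcast:", iface.network.broadcast_address)
    for gw in ("192.168.8.1", "192.168.0.1"):
        print(gw, "on-link:", on_link(gw))
```

The mask 255.255.252.0 is a /22, so the connected subnet is 192.168.8.0/22 with broadcast 192.168.11.255, matching the Bcast field; 192.168.8.1 falls inside it and 192.168.0.1 does not.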


As a result, this route simply doesn't work.
I checked with tcpdump - the ICMP packets do not even go
to eth0 - instead they can be captured on the lo interface for
some reason.

So my question is: why does Linux allow invalid redirect
entries? Is it a problem with my setup, or some kernel bug,
or some router setup problem? Where should I look to
nail this down?


Thread overview: 37+ messages
2015-12-29 10:54 Q: bad routing table cache entries Stas Sergeev
2015-12-29 11:58 ` Sowmini Varadhan
2015-12-29 12:06   ` Stas Sergeev
2015-12-29 12:32     ` Sowmini Varadhan
2015-12-29 12:43       ` Stas Sergeev
2015-12-29 13:19 ` Stas Sergeev
2015-12-29 15:22 ` Sowmini Varadhan
2015-12-29 15:38   ` Stas Sergeev
2015-12-29 17:40     ` Stas Sergeev
2015-12-30 12:42   ` Stas Sergeev
2015-12-30 14:17     ` Eric Dumazet
2015-12-30 17:56       ` David Miller
2016-01-04  1:05     ` Sowmini Varadhan
2016-01-04  1:32       ` Stas Sergeev
2016-01-04 17:23       ` Stas Sergeev
2016-01-12 14:40   ` Stas Sergeev
2016-01-12 14:47     ` Sowmini Varadhan
2016-01-12 20:33       ` David Miller
2016-01-12 15:34 ` Hannes Frederic Sowa
2016-01-12 15:52   ` Hannes Frederic Sowa
2016-01-12 16:03     ` Stas Sergeev
2016-01-12 16:10       ` Hannes Frederic Sowa
2016-01-12 16:42         ` Stas Sergeev
2016-01-12 16:56           ` Stas Sergeev
2016-01-12 17:06             ` Hannes Frederic Sowa
2016-01-12 17:18               ` Stas Sergeev
2016-01-12 17:26                 ` Hannes Frederic Sowa
2016-01-12 17:33                   ` Stas Sergeev
2016-01-12 17:47                     ` Hannes Frederic Sowa
2016-01-12 20:43                       ` Stas Sergeev
2016-01-12 22:26                         ` Hannes Frederic Sowa
2016-01-12 22:57                           ` Stas Sergeev
2016-01-12 23:07                             ` Hannes Frederic Sowa
2016-01-13 12:59                               ` Stas Sergeev
2016-01-12 17:41                   ` Stas Sergeev
2016-01-12 15:57   ` Stas Sergeev
  -- strict thread matches above, loose matches on Subject: below --
2015-12-28 15:26 Stas Sergeev
