All of lore.kernel.org
 help / color / mirror / Atom feed
* oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
@ 2007-05-14 18:30 James Morris
  2007-05-14 18:46 ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: James Morris @ 2007-05-14 18:30 UTC (permalink / raw)
  To: netdev

---------- Forwarded message ----------
Date: Mon, 14 May 2007 08:15:50 -0700 (PDT)
From: Curtis Doty <Curtis@GreenKey.net>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr

Summary: On a multi-homed box, after turning on
/proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr, it now periodically oopses
when trying to lookup the source address to use for sending an ICMP in response
to a jump ipt_REJECT.

I'm still trying to figure out what makes this test case unique. It spuriously
occurs with many fedora builds of 2.6.{18,19,20} all of which don't appear to
have any patches in this area of the kernel. Just _maybe_ it's because of a
combination of dogleg routing and overloading one vlan with multiple subnets:

# ip route list table main proto kernel scope link
10.10.36.208/28 dev eth0.1  src 10.10.36.210
10.10.36.224/27 dev eth0.1  src 10.10.36.226
10.10.1.0/24 dev eth0.5  src 10.10.1.48
10.10.2.0/24 dev eth0.9  src 10.10.2.27
10.10.10.0/24 dev eth0.8  src 10.10.10.1

Policy routing has been disabled and the problem still occurs. Only one route
remains. It is gw:

# ip route list table main proto boot
default via 10.10.36.209 dev eth0.1

Am I on the right track here or is this a distro/build/config issue? Here's the
oops:

BUG: unable to handle kernel NULL pointer dereference at virtual address
000000a8
 printing eip:
c05fe72b
*pde = 3d429067
Oops: 0000 [#1]
SMP
last sysfs file: /devices/platform/i2c-9191/name
Modules linked in: it87 hwmon_vid hwmon i2c_isa eeprom 8021q nf_nat_ftp
nf_conntrack_ftp ipt_REJECT ipt_owner ipt_ULOG xt_limit xt_state xt_multiport
iptable_filter xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack
nfnetlink ip_tables x_tables video sbs i2c_ec dock button battery asus_acpi
backlight ac parport_pc lp parport 8139too 8139cp mii pcspkr i2c_i801 i2c_i810
iTCO_wdt i2c_algo_bit i2c_core iTCO_vendor_support dm_snapshot dm_zero dm_mirror
dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<c05fe72b>]    Not tainted VLI
EFLAGS: 00010246   (2.6.20-1.2948.fc6 #1)
EIP is at inet_select_addr+0x4/0x9f
eax: 00000000   ebx: f8b97046   ecx: 000000fd   edx: 00000000
esi: 000000fd   edi: 00000001   ebp: f71cd0ac   esp: c078bc9c
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, ti=c078b000 task=c06fc480 task.ti=c0746000)
Stack: f8b97046 f601b130 c05fd0b6 f728b980 f728b980 f8b5adbb c05bcb6e c078bd74
       00000003 00000003 00000246 00000246 00000000 f887e014 f8a611a6 f7c1ea80
       f728b9a8 00000000 f727d220 f887e000 00000001 00000072 f7383800 f728b980
Call Trace:
 [<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
 [<c05fd0b6>] icmp_send+0x14d/0x39b
 [<f8b5adbb>] nf_conntrack_free+0x18/0x20 [nf_conntrack]
 [<c05bcb6e>] __kfree_skb+0xc5/0x10d
 [<f8a611a6>] rtl8139_start_xmit+0xe2/0x112 [8139too]
 [<c05c12fc>] dev_hard_start_xmit+0x1bc/0x21b
 [<c060e141>] xfrm_decode_session+0x44/0x4b
 [<c0620667>] _spin_lock_irqsave+0x9/0xd
 [<c042edc9>] lock_timer_base+0x15/0x2f
 [<c042eed5>] __mod_timer+0x94/0x9e
 [<f8b90406>] ipt_ulog_packet+0x2f0/0x3ba [ipt_ULOG]
 [<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
 [<f8b9709c>] reject+0x56/0x4ae [ipt_REJECT]
 [<c06205bd>] _spin_unlock_bh+0x5/0xd
 [<f8b904f6>] ipt_ulog_target+0x26/0x2e [ipt_ULOG]
 [<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
 [<f8b3d454>] ipt_do_table+0x28c/0x2e8 [ip_tables]
 [<c05d7550>] nf_iterate+0x38/0x6a
 [<c05d7681>] nf_hook_slow+0x4d/0xb5
 [<c05dec14>] dst_output+0x0/0x7
 [<c05e0f14>] ip_queue_xmit+0x3a3/0x3f4
 [<c05dec14>] dst_output+0x0/0x7
 [<c0422645>] try_to_wake_up+0x3aa/0x3b4
 [<c05f47f1>] tcp_v4_send_check+0x74/0xaa
 [<c05ef1c3>] tcp_transmit_skb+0x6d5/0x703
 [<c05eff49>] tcp_retransmit_skb+0x4b1/0x592
 [<c05e8db7>] tcp_enter_loss+0x1a8/0x205
 [<c05f258b>] tcp_write_timer+0x43f/0x638
 [<c0620655>] _spin_unlock_irq+0x5/0x7
 [<c0439d17>] hrtimer_run_queues+0x127/0x141
 [<c0422add>] run_rebalance_domains+0x116/0x33e
 [<c05f214c>] tcp_write_timer+0x0/0x638
 [<c042e51b>] run_timer_softirq+0x101/0x164
 [<c042b7b0>] __do_softirq+0x5d/0xba
 [<c040615b>] do_softirq+0x59/0xb1
 [<c0419d4e>] smp_apic_timer_interrupt+0x76/0x80
 [<c04049b0>] apic_timer_interrupt+0x28/0x30
 [<c0402d52>] default_idle+0x0/0x3e
 [<c061007b>] __xfrm_policy_check+0x4c5/0x592
 [<c0402d7e>] default_idle+0x2c/0x3e
 [<c04023d0>] cpu_idle+0x9e/0xb7
 [<c074b812>] start_kernel+0x3b6/0x3be
 [<c074b25a>] unknown_bootoption+0x0/0x202
 =======================
Code: eb 10 39 72 18 75 09 89 f8 33 42 14 85 c6 74 0e 8b 12 85 d2 74 08 f6 42 25
01 74 e6 31 d2 83 c4 0c 89 d0 5b 5e 5f c3 56 89 ce 53 <8b> 80 a8 00 00 00 85 c0
74 39 8b 48 0c 31 db eb 24 0f b6 41 24
EIP: [<c05fe72b>] inet_select_addr+0x4/0x9f SS:ESP 0068:c078bc9c
 <0>Kernel panic - not syncing: Fatal exception in interrupt
 <0>Rebooting in 60 seconds..

And a few relevant details:

# uname -a
Linux host 2.6.20-1.2948.fc6 #1 SMP Fri Apr 27 19:48:40 EDT 2007 i686 i686 i386
GNU/Linux

# lspci -tv
-[0000:00]-+-00.0  Intel Corporation 82845G/GL[Brookdale-G]/GE/PE DRAM
Controller/Host-Hub Interface
           +-02.0  Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset
Integrated Graphics Device
           +-1d.0  Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB
UHCI Controller #1
           +-1d.1  Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB
UHCI Controller #2
           +-1d.2  Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB
UHCI Controller #3
           +-1d.7  Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI
Controller
           +-1e.0-[0000:01]----05.0  Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+
           +-1f.0  Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface
Bridge
           +-1f.1  Intel Corporation 82801DB (ICH4) IDE Controller
           \-1f.3  Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus
Controller

Rather than turn off ipt_REJECT or icmp_errors_use_inbound_ifaddr, I've removed
the system from production to try and capture the error. It's still live and on
the wire, and the oopses occur every few days/weeks. Unfortunately, I cannot
reproduce at will. :-(

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-14 18:30 oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd) James Morris
@ 2007-05-14 18:46 ` Patrick McHardy
  2007-05-14 19:19   ` Patrick McHardy
  2007-05-14 20:24   ` Curtis Doty
  0 siblings, 2 replies; 12+ messages in thread
From: Patrick McHardy @ 2007-05-14 18:46 UTC (permalink / raw)
  To: James Morris; +Cc: netdev, Curtis Doty

James Morris wrote:
> ---------- Forwarded message ----------
> Date: Mon, 14 May 2007 08:15:50 -0700 (PDT)
> From: Curtis Doty <Curtis@GreenKey.net>
> To: Linux Kernel <linux-kernel@vger.kernel.org>
> Subject: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr
> 
> Summary: On a multi-homed box, after turning on
> /proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr, it now periodically oopses
> when trying to lookup the source address to use for sending an ICMP in response
> to a jump ipt_REJECT.
> 
> I'm still trying to figure out what makes this test case unique. It spuriously
> occurs with many fedora builds of 2.6.{18,19,20} all of which don't appear to
> have any patches in this area of the kernel. Just _maybe_ it's because of a
> combination of dogleg routing and overloading one vlan with multiple subnets:
> 
> [..]
> 
> BUG: unable to handle kernel NULL pointer dereference at virtual address
> 000000a8
> [...]
> EIP is at inet_select_addr+0x4/0x9f
> eax: 00000000   ebx: f8b97046   ecx: 000000fd   edx: 00000000
> esi: 000000fd   edi: 00000001   ebp: f71cd0ac   esp: c078bc9c
> ds: 007b   es: 007b   ss: 0068
> Process swapper (pid: 0, ti=c078b000 task=c06fc480 task.ti=c0746000)
> Stack: f8b97046 f601b130 c05fd0b6 f728b980 f728b980 f8b5adbb c05bcb6e c078bd74
>        00000003 00000003 00000246 00000246 00000000 f887e014 f8a611a6 f7c1ea80
>        f728b9a8 00000000 f727d220 f887e000 00000001 00000072 f7383800 f728b980
> Call Trace:
>  [<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
>  [<c05fd0b6>] icmp_send+0x14d/0x39b


A REJECT target in the output chain will trigger this in combination
with icmp_errors_use_inbound_ifaddr because skb->dev is still NULL
at this point and its passed to inet_select_addr.

I'll look into this.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-14 18:46 ` Patrick McHardy
@ 2007-05-14 19:19   ` Patrick McHardy
  2007-05-17 16:52     ` Patrick McHardy
  2007-05-20  5:26     ` Herbert Xu
  2007-05-14 20:24   ` Curtis Doty
  1 sibling, 2 replies; 12+ messages in thread
From: Patrick McHardy @ 2007-05-14 19:19 UTC (permalink / raw)
  To: netdev; +Cc: James Morris, Curtis Doty

Patrick McHardy wrote:
> James Morris wrote:
> 
>>---------- Forwarded message ----------
>>Date: Mon, 14 May 2007 08:15:50 -0700 (PDT)
>>From: Curtis Doty <Curtis@GreenKey.net>
>>To: Linux Kernel <linux-kernel@vger.kernel.org>
>>Subject: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr
>>
>>BUG: unable to handle kernel NULL pointer dereference at virtual address
>>000000a8
>>[...]
>>EIP is at inet_select_addr+0x4/0x9f
>>eax: 00000000   ebx: f8b97046   ecx: 000000fd   edx: 00000000
>>esi: 000000fd   edi: 00000001   ebp: f71cd0ac   esp: c078bc9c
>>ds: 007b   es: 007b   ss: 0068
>>Process swapper (pid: 0, ti=c078b000 task=c06fc480 task.ti=c0746000)
>>Stack: f8b97046 f601b130 c05fd0b6 f728b980 f728b980 f8b5adbb c05bcb6e c078bd74
>>       00000003 00000003 00000246 00000246 00000000 f887e014 f8a611a6 f7c1ea80
>>       f728b9a8 00000000 f727d220 f887e000 00000001 00000072 f7383800 f728b980
>>Call Trace:
>> [<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
>> [<c05fd0b6>] icmp_send+0x14d/0x39b
> 
> 
> 
> A REJECT target in the output chain will trigger this in combination
> with icmp_errors_use_inbound_ifaddr because skb->dev is still NULL
> at this point and its passed to inet_select_addr.
> 
> I'll look into this.


saddr = iph->daddr;
if (!(rt->rt_flags & RTCF_LOCAL)) {
        if (sysctl_icmp_errors_use_inbound_ifaddr)


                saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
        else
                saddr = 0;
}

Fixing the crash is easy, the right thing to do when skb->dev
is not set is to let routing choose the address because the
packet was locally generated and icmp_errors_use_inbound_ifaddr
shouldn't apply (the crash can also happen with IPsec tunnels
by the way).

This leaves the question what to do in the path after ip_output,
when skb->dev points to the output device. We don't know the
input device anymore, so there doesn't seem to be a way to make
it do what the sysctl promises.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-14 18:46 ` Patrick McHardy
  2007-05-14 19:19   ` Patrick McHardy
@ 2007-05-14 20:24   ` Curtis Doty
  1 sibling, 0 replies; 12+ messages in thread
From: Curtis Doty @ 2007-05-14 20:24 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: James Morris, netdev

8:46pm Patrick McHardy said:

> James Morris wrote:
>> ---------- Forwarded message ----------
>> Date: Mon, 14 May 2007 08:15:50 -0700 (PDT)
>> From: Curtis Doty <Curtis@GreenKey.net>
>> To: Linux Kernel <linux-kernel@vger.kernel.org>
>> Subject: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr
>>
>> Summary: On a multi-homed box, after turning on
>> /proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr, it now periodically oopses
>> when trying to lookup the source address to use for sending an ICMP in response
>> to a jump ipt_REJECT.
>>
>> I'm still trying to figure out what makes this test case unique. It spuriously
>> occurs with many fedora builds of 2.6.{18,19,20} all of which don't appear to
>> have any patches in this area of the kernel. Just _maybe_ it's because of a
>> combination of dogleg routing and overloading one vlan with multiple subnets:
>>
>> [..]
>>
>> BUG: unable to handle kernel NULL pointer dereference at virtual address
>> 000000a8
>> [...]
>> EIP is at inet_select_addr+0x4/0x9f
>> eax: 00000000   ebx: f8b97046   ecx: 000000fd   edx: 00000000
>> esi: 000000fd   edi: 00000001   ebp: f71cd0ac   esp: c078bc9c
>> ds: 007b   es: 007b   ss: 0068
>> Process swapper (pid: 0, ti=c078b000 task=c06fc480 task.ti=c0746000)
>> Stack: f8b97046 f601b130 c05fd0b6 f728b980 f728b980 f8b5adbb c05bcb6e c078bd74
>>        00000003 00000003 00000246 00000246 00000000 f887e014 f8a611a6 f7c1ea80
>>        f728b9a8 00000000 f727d220 f887e000 00000001 00000072 f7383800 f728b980
>> Call Trace:
>>  [<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
>>  [<c05fd0b6>] icmp_send+0x14d/0x39b
>
>
> A REJECT target in the output chain will trigger this in combination
> with icmp_errors_use_inbound_ifaddr because skb->dev is still NULL
> at this point and its passed to inet_select_addr.
>

Indeed, thanks!

curtis@host~$ nc kernel.org 42
BUG: unable to handle kernel NULL pointer dereference at virtual address 000000a8
  printing eip:
c05fe72b
*pde = 00000000
Oops: 0000 [#1]
...

And for now, the easy userland workaround is to add '-i ! eth+ -j DROP' 
just before any jumps to REJECT. This now causes delays/timeouts for any 
forbidden outbound traffic. But it's better than an oops.

Now I'm off to figure out what daemon is trying to make these odd 
connections...or if netfilter/conntrack is periodically scrambling its 
brains.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-14 19:19   ` Patrick McHardy
@ 2007-05-17 16:52     ` Patrick McHardy
  2007-05-18  0:57       ` Julian Anastasov
  2007-05-19 21:50       ` David Miller
  2007-05-20  5:26     ` Herbert Xu
  1 sibling, 2 replies; 12+ messages in thread
From: Patrick McHardy @ 2007-05-17 16:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, James Morris, Curtis Doty

[-- Attachment #1: Type: text/plain, Size: 1368 bytes --]

Patrick McHardy wrote:
>>>BUG: unable to handle kernel NULL pointer dereference at virtual address
>>>000000a8
>>>[...]
>>>EIP is at inet_select_addr+0x4/0x9f
>>>Call Trace:
>>>[<f8b97046>] reject+0x0/0x4ae [ipt_REJECT]
>>>[<c05fd0b6>] icmp_send+0x14d/0x39b
>>
>>
>>A REJECT target in the output chain will trigger this in combination
>>with icmp_errors_use_inbound_ifaddr because skb->dev is still NULL
>>at this point and its passed to inet_select_addr.
>>
>>I'll look into this.
> 
> 
> 
> saddr = iph->daddr;
> if (!(rt->rt_flags & RTCF_LOCAL)) {
>         if (sysctl_icmp_errors_use_inbound_ifaddr)
> 
> 
>                 saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
>         else
>                 saddr = 0;
> }
> 
> Fixing the crash is easy, the right thing to do when skb->dev
> is not set is to let routing choose the address because the
> packet was locally generated and icmp_errors_use_inbound_ifaddr
> shouldn't apply (the crash can also happen with IPsec tunnels
> by the way).
> 
> This leaves the question what to do in the path after ip_output,
> when skb->dev points to the output device. We don't know the
> input device anymore, so there doesn't seem to be a way to make
> it do what the sysctl promises.


The last problem seems to be unfixable, so this patch fixes the crash
and adds a comment about the brokeness of this option.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1816 bytes --]

[IPV4]: icmp: fix crash with sysctl_icmp_errors_use_inbound_ifaddr

When icmp_send is called on the local output path before the
packet hits ip_output, skb->dev is not set, causing a crash
when sysctl_icmp_errors_use_inbound_ifaddr is set. This can
happen with the netfilter REJECT target or IPsec tunnels.

Let routing decide the ICMP source address in that case, since the
packet is locally generated there is no inbound interface and
the sysctl should not apply.

The option actually seems to be unfixable broken, on the path
after ip_output() skb->dev points to the outgoing device and
we don't know the incoming device anymore, so its going to do
the absolute wrong thing and pick the address of the outgoing
interface. Add a comment about this.

Reported by Curtis Doty <Curtis@GreenKey.net>.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 637fc540b0ad22bf7971929e906e704236af06cd
tree 0c32138983e1bd7cc9ac96e7c62085e9b74b6217
parent 52ade9b3b97fd3bea42842a056fe0786c28d0555
author Patrick McHardy <kaber@trash.net> Thu, 17 May 2007 18:50:13 +0200
committer Patrick McHardy <kaber@trash.net> Thu, 17 May 2007 18:50:13 +0200

 net/ipv4/icmp.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index d38cbba..e238b17 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -514,7 +514,10 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
 
 	saddr = iph->daddr;
 	if (!(rt->rt_flags & RTCF_LOCAL)) {
-		if (sysctl_icmp_errors_use_inbound_ifaddr)
+		/* This is broken, skb_in->dev points to the outgoing device
+		 * after the packet passes through ip_output().
+		 */
+		if (skb_in->dev && sysctl_icmp_errors_use_inbound_ifaddr)
 			saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
 		else
 			saddr = 0;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-17 16:52     ` Patrick McHardy
@ 2007-05-18  0:57       ` Julian Anastasov
  2007-05-19 21:50       ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Julian Anastasov @ 2007-05-18  0:57 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, netdev, James Morris, Curtis Doty


	Hello,

On Thu, 17 May 2007, Patrick McHardy wrote:

> > saddr = iph->daddr;
> > if (!(rt->rt_flags & RTCF_LOCAL)) {
> >         if (sysctl_icmp_errors_use_inbound_ifaddr)
> > 
> > 
> >                 saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
> >         else
> >                 saddr = 0;
> > }

	While we are fixing this problem, is adding more logic
for the sysctl_icmp_errors_use_inbound_ifaddr case still working for its
users (untested code follows) ?:

	if (!(rt->rt_flags & RTCF_LOCAL)) {
		if (sysctl_icmp_errors_use_inbound_ifaddr && rt->fl.iif)
			saddr = inet_select_addr(skb_in->dev, iph->saddr,
				(rt->rt_flags & RTCF_DIRECTSRC) ?
				RT_SCOPE_LINK : RT_SCOPE_UNIVERSE);
		else
			saddr = 0;
	}

	Because this inet_select_addr call is too risky, it uses blindly
the first address (usually scope link). So,

- assume sysctl_icmp_errors_use_inbound_ifaddr is for packets from network,
work for input routes only (replaces check for skb_in->dev)

- prefer local address from the same subnet as sender (iph->saddr) or
it should be the target: icmp_param.replyopts.srr ? 
icmp_param.replyopts.faddr : iph->saddr as used below? Useful when input 
interface has many subnets.

- don't expose link addresses to sender if they are not known to it, sender
should be onlink to see them. If sender is not onlink and all our addresses
on input interface are scope link then we can not expose such addresses,
we risk to send from private address, for example, when our uplink
interface has only private addresses to talk with gateway and our
public IP is on internal interface where we are router for public subnet.
If we can not select address the routing still has chance to do it 
(from prefsrc or another interface).

	Also, any problems if icmp_send happens after SNAT changes
source? Or that is not possible? Because we think iph->saddr is
sender (target for our ICMP).

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-17 16:52     ` Patrick McHardy
  2007-05-18  0:57       ` Julian Anastasov
@ 2007-05-19 21:50       ` David Miller
  2007-05-21 17:03         ` Patrick McHardy
  1 sibling, 1 reply; 12+ messages in thread
From: David Miller @ 2007-05-19 21:50 UTC (permalink / raw)
  To: kaber; +Cc: netdev, jmorris, Curtis

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 17 May 2007 18:52:29 +0200

> [IPV4]: icmp: fix crash with sysctl_icmp_errors_use_inbound_ifaddr
> 
> When icmp_send is called on the local output path before the
> packet hits ip_output, skb->dev is not set, causing a crash
> when sysctl_icmp_errors_use_inbound_ifaddr is set. This can
> happen with the netfilter REJECT target or IPsec tunnels.
> 
> Let routing decide the ICMP source address in that case, since the
> packet is locally generated there is no inbound interface and
> the sysctl should not apply.
> 
> The option actually seems to be unfixable broken, on the path
> after ip_output() skb->dev points to the outgoing device and
> we don't know the incoming device anymore, so its going to do
> the absolute wrong thing and pick the address of the outgoing
> interface. Add a comment about this.
> 
> Reported by Curtis Doty <Curtis@GreenKey.net>.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied, thanks for the fix Patrick.

The post ip_output() case is very unfortunate.  Perhaps we
can tag the call sites, or if that doesn't work we can
find some way to perhaps tag the dst as an input vs. output
route in order to avoid this problem.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-14 19:19   ` Patrick McHardy
  2007-05-17 16:52     ` Patrick McHardy
@ 2007-05-20  5:26     ` Herbert Xu
  2007-05-21 16:36       ` Patrick McHardy
  1 sibling, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2007-05-20  5:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, jmorris, Curtis, davem

Patrick McHardy <kaber@trash.net> wrote:
> 
> This leaves the question what to do in the path after ip_output,
> when skb->dev points to the output device. We don't know the
> input device anymore, so there doesn't seem to be a way to make
> it do what the sysctl promises.

Perhaps we could change things so that the setting of skb->dev is
delayed until the packet has completely left the IP stack?  It'd
require a massive audit though of the IP stack though.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-20  5:26     ` Herbert Xu
@ 2007-05-21 16:36       ` Patrick McHardy
  2007-05-21 21:28         ` Herbert Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Patrick McHardy @ 2007-05-21 16:36 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, jmorris, Curtis, davem

Herbert Xu wrote:
> Patrick McHardy <kaber@trash.net> wrote:
> 
>>This leaves the question what to do in the path after ip_output,
>>when skb->dev points to the output device. We don't know the
>>input device anymore, so there doesn't seem to be a way to make
>>it do what the sysctl promises.
> 
> 
> Perhaps we could change things so that the setting of skb->dev is
> delayed until the packet has completely left the IP stack?  It'd
> require a massive audit though of the IP stack though.


The IP stack shouldn't be too hard, we currently set skb->dev in
ip_output, after that we only have the POST_ROUTING hook and ip_fragment
which care. It wouldn't help with ipip/ip_gre though, at that point
skb->dev must point to the tunnel device.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-19 21:50       ` David Miller
@ 2007-05-21 17:03         ` Patrick McHardy
  0 siblings, 0 replies; 12+ messages in thread
From: Patrick McHardy @ 2007-05-21 17:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jmorris, Curtis

[-- Attachment #1: Type: text/plain, Size: 553 bytes --]

David Miller wrote:
> The post ip_output() case is very unfortunate.  Perhaps we
> can tag the call sites, or if that doesn't work we can
> find some way to perhaps tag the dst as an input vs. output
> route in order to avoid this problem.


Tagging call-sites should make sure we don't use an address from the
outgoing device, but we would still not always use an address from
the incoming device.

Thinking again, we can simply perform a lookup on rt->fl.iif, that
should always do the right thing.

Signed-off-by: Patrick McHardy <kaber@trash.net>



[-- Attachment #2: x --]
[-- Type: text/plain, Size: 783 bytes --]

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index e238b17..02a899b 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -514,12 +514,15 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
 
 	saddr = iph->daddr;
 	if (!(rt->rt_flags & RTCF_LOCAL)) {
-		/* This is broken, skb_in->dev points to the outgoing device
-		 * after the packet passes through ip_output().
-		 */
-		if (skb_in->dev && sysctl_icmp_errors_use_inbound_ifaddr)
-			saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
-		else
+		struct net_device *dev = NULL;
+
+		if (rt->fl.iif && sysctl_icmp_errors_use_inbound_ifaddr)
+			dev = dev_get_by_index(rt->fl.iif);
+
+		if (dev) {
+			saddr = inet_select_addr(dev, 0, RT_SCOPE_LINK);
+			dev_put(dev);
+		} else
 			saddr = 0;
 	}
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-21 16:36       ` Patrick McHardy
@ 2007-05-21 21:28         ` Herbert Xu
  2007-05-21 21:32           ` Patrick McHardy
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2007-05-21 21:28 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, jmorris, Curtis, davem

On Mon, May 21, 2007 at 06:36:57PM +0200, Patrick McHardy wrote:
> 
> The IP stack shouldn't be too hard, we currently set skb->dev in
> ip_output, after that we only have the POST_ROUTING hook and ip_fragment
> which care. It wouldn't help with ipip/ip_gre though, at that point
> skb->dev must point to the tunnel device.

I think that should be OK though.  Once you pass through ipip/ip_gre's
transmit function, you're conceptually a locally generated packet.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd)
  2007-05-21 21:28         ` Herbert Xu
@ 2007-05-21 21:32           ` Patrick McHardy
  0 siblings, 0 replies; 12+ messages in thread
From: Patrick McHardy @ 2007-05-21 21:32 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, jmorris, Curtis, davem

Herbert Xu wrote:
> On Mon, May 21, 2007 at 06:36:57PM +0200, Patrick McHardy wrote:
> 
>>The IP stack shouldn't be too hard, we currently set skb->dev in
>>ip_output, after that we only have the POST_ROUTING hook and ip_fragment
>>which care. It wouldn't help with ipip/ip_gre though, at that point
>>skb->dev must point to the tunnel device.
> 
> 
> I think that should be OK though.  Once you pass through ipip/ip_gre's
> transmit function, you're conceptually a locally generated packet.


Yes, but it may generate ICMP errors for the original packet before
it has passed through it. The patch I sent earlier should also handle
this case correctly.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2007-05-21 21:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-05-14 18:30 oops in net/ipv4/icmp.c:icmp_send() with icmp_errors_use_inbound_ifaddr (fwd) James Morris
2007-05-14 18:46 ` Patrick McHardy
2007-05-14 19:19   ` Patrick McHardy
2007-05-17 16:52     ` Patrick McHardy
2007-05-18  0:57       ` Julian Anastasov
2007-05-19 21:50       ` David Miller
2007-05-21 17:03         ` Patrick McHardy
2007-05-20  5:26     ` Herbert Xu
2007-05-21 16:36       ` Patrick McHardy
2007-05-21 21:28         ` Herbert Xu
2007-05-21 21:32           ` Patrick McHardy
2007-05-14 20:24   ` Curtis Doty

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.