* [PATCH] ipvs: fix oops on NAT reply in br_nf context
@ 2012-07-07 10:26 Lin Ming
2012-07-10 8:51 ` Simon Horman
0 siblings, 1 reply; 3+ messages in thread
From: Lin Ming @ 2012-07-07 10:26 UTC (permalink / raw)
To: Simon Horman, Julian Anastasov
Cc: Massimo Cetra, Eric Dumazet, David S. Miller, netdev
IPVS should not reset skb->nf_bridge in FORWARD hook
by calling nf_reset for NAT replies. It triggers oops in
br_nf_forward_finish.
[ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[ 579.781792] PGD 218f9067 PUD 0
[ 579.781865] Oops: 0000 [#1] SMP
[ 579.781945] CPU 0
[ 579.781983] Modules linked in:
[ 579.782047]
[ 579.782080]
[ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8
[ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287
[ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[ 579.783919] Stack:
[ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[ 579.784477] Call Trace:
[ 579.784523] <IRQ>
[ 579.784562]
[ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2
[ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30
The steps to reproduce as follow,
1. On Host1, setup brige br0(192.168.1.106)
2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
3. Start IPVS service on Host1
ipvsadm -A -t 192.168.1.106:80 -s rr
ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run apache benchmark on Host2(192.168.1.101)
ab -n 1000 http://192.168.1.106/
ip_vs_reply4
ip_vs_out
handle_response
ip_vs_notrack
nf_reset()
{
skb->nf_bridge = NULL;
}
Actually, IPVS wants in this case just to replace nfct
with untracked version. So replace the nf_reset(skb) call
in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
include/net/ip_vs.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
if (!ct || !nf_ct_is_untracked(ct)) {
- nf_reset(skb);
+ nf_conntrack_put(skb->nfct);
skb->nfct = &nf_ct_untracked_get()->ct_general;
skb->nfctinfo = IP_CT_NEW;
nf_conntrack_get(skb->nfct);
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] ipvs: fix oops on NAT reply in br_nf context
2012-07-07 10:26 [PATCH] ipvs: fix oops on NAT reply in br_nf context Lin Ming
@ 2012-07-10 8:51 ` Simon Horman
2012-07-10 10:24 ` Lin Ming
0 siblings, 1 reply; 3+ messages in thread
From: Simon Horman @ 2012-07-10 8:51 UTC (permalink / raw)
To: Lin Ming
Cc: Julian Anastasov, Massimo Cetra, Eric Dumazet, David S. Miller, netdev
On Sat, Jul 07, 2012 at 06:26:10PM +0800, Lin Ming wrote:
> IPVS should not reset skb->nf_bridge in FORWARD hook
> by calling nf_reset for NAT replies. It triggers oops in
> br_nf_forward_finish.
>
> [ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [ 579.781792] PGD 218f9067 PUD 0
> [ 579.781865] Oops: 0000 [#1] SMP
> [ 579.781945] CPU 0
> [ 579.781983] Modules linked in:
> [ 579.782047]
> [ 579.782080]
> [ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8
> [ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287
> [ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
> [ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
> [ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
> [ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
> [ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
> [ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
> [ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
> [ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
> [ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
> [ 579.783919] Stack:
> [ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
> [ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
> [ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
> [ 579.784477] Call Trace:
> [ 579.784523] <IRQ>
> [ 579.784562]
> [ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
> [ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
> [ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2
> [ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
> [ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
> [ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
> [ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
> [ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
> [ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
> [ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
> [ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
> [ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242
> [ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
> [ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> [ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30
>
> The steps to reproduce as follow,
>
> 1. On Host1, setup brige br0(192.168.1.106)
> 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
> 3. Start IPVS service on Host1
> ipvsadm -A -t 192.168.1.106:80 -s rr
> ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
> 4. Run apache benchmark on Host2(192.168.1.101)
> ab -n 1000 http://192.168.1.106/
>
> ip_vs_reply4
> ip_vs_out
> handle_response
> ip_vs_notrack
> nf_reset()
> {
> skb->nf_bridge = NULL;
> }
>
> Actually, IPVS wants in this case just to replace nfct
> with untracked version. So replace the nf_reset(skb) call
> in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
>
> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
Actually, I'll queue up this version for 3.5 rather than the previous one
as it has a better title.
As per my previous comment (repeated here for reference) it seems to me
that this problem has been present since 2.6.37 and thus is stable material.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] ipvs: fix oops on NAT reply in br_nf context
2012-07-10 8:51 ` Simon Horman
@ 2012-07-10 10:24 ` Lin Ming
0 siblings, 0 replies; 3+ messages in thread
From: Lin Ming @ 2012-07-10 10:24 UTC (permalink / raw)
To: Simon Horman
Cc: Julian Anastasov, Massimo Cetra, Eric Dumazet, David S. Miller, netdev
On Tue, Jul 10, 2012 at 4:51 PM, Simon Horman <horms@verge.net.au> wrote:
> On Sat, Jul 07, 2012 at 06:26:10PM +0800, Lin Ming wrote:
>> IPVS should not reset skb->nf_bridge in FORWARD hook
>> by calling nf_reset for NAT replies. It triggers oops in
>> br_nf_forward_finish.
>>
>> [ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
>> [ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
>> [ 579.781792] PGD 218f9067 PUD 0
>> [ 579.781865] Oops: 0000 [#1] SMP
>> [ 579.781945] CPU 0
>> [ 579.781983] Modules linked in:
>> [ 579.782047]
>> [ 579.782080]
>> [ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8
>> [ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
>> [ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287
>> [ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
>> [ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
>> [ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
>> [ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
>> [ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
>> [ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
>> [ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
>> [ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
>> [ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
>> [ 579.783919] Stack:
>> [ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
>> [ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
>> [ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
>> [ 579.784477] Call Trace:
>> [ 579.784523] <IRQ>
>> [ 579.784562]
>> [ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
>> [ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
>> [ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
>> [ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
>> [ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
>> [ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
>> [ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2
>> [ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
>> [ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
>> [ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
>> [ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
>> [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
>> [ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
>> [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
>> [ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
>> [ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
>> [ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
>> [ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
>> [ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
>> [ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242
>> [ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
>> [ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
>> [ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30
>>
>> The steps to reproduce as follow,
>>
>> 1. On Host1, setup brige br0(192.168.1.106)
>> 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
>> 3. Start IPVS service on Host1
>> ipvsadm -A -t 192.168.1.106:80 -s rr
>> ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
>> 4. Run apache benchmark on Host2(192.168.1.101)
>> ab -n 1000 http://192.168.1.106/
>>
>> ip_vs_reply4
>> ip_vs_out
>> handle_response
>> ip_vs_notrack
>> nf_reset()
>> {
>> skb->nf_bridge = NULL;
>> }
>>
>> Actually, IPVS wants in this case just to replace nfct
>> with untracked version. So replace the nf_reset(skb) call
>> in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
>>
>> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
>> Signed-off-by: Julian Anastasov <ja@ssi.bg>
>
> Actually, I'll queue up this version for 3.5 rather than the previous one
> as it has a better title.
Yes, thanks.
>
> As per my previous comment (repeated here for reference) it seems to me
> that this problem has been present since 2.6.37 and thus is stable material.
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2012-07-10 10:25 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-07 10:26 [PATCH] ipvs: fix oops on NAT reply in br_nf context Lin Ming
2012-07-10 8:51 ` Simon Horman
2012-07-10 10:24 ` Lin Ming
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.