* [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address.
@ 2020-01-14 17:40 Paolo Abeni
0 siblings, 0 replies; 3+ messages in thread
From: Paolo Abeni @ 2020-01-14 17:40 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 2804 bytes --]
On Mon, 2020-01-13 at 18:39 -0800, Mat Martineau wrote:
> On Mon, 13 Jan 2020, Paolo Abeni wrote:
>
> > The TCP code unconditionally overwrite in a few places 'icsk->icsk_af_ops',
> > which lead to the wrong oops being called for ipv4-mapped ipv6 address.
> > e.g. Kasan notice a invalid slub access running packetdrill ipv4-mapped
> > tests.
> >
> > This patch try to address the issue building ipv4-mapped mptcp icsk ops
> > and using it when appropriate. Still needs quite a bit of work:
> >
> > - kasan still notice an use after free at netns cleanup time
> > (for the ns struct ???)
> > - squashing
> >
> > ANy feedback more than welcome!
> > ---
> > include/net/mptcp.h | 2 ++
> > net/ipv6/tcp_ipv6.c | 5 ++++
> > net/mptcp/subflow.c | 66 +++++++++++++++++++++++++++++++++++++++++----
> > 3 files changed, 68 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> > index f65baeacf199..27627e2d1bc2 100644
> > --- a/include/net/mptcp.h
> > +++ b/include/net/mptcp.h
> > @@ -174,6 +174,8 @@ static inline bool mptcp_skb_can_collapse(const struct sk_buff *to,
> >
> > #endif /* CONFIG_MPTCP */
> >
> > +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped);
> > +
> > #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> > int mptcpv6_init(void);
> > #elif IS_ENABLED(CONFIG_IPV6)
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index 60068ffde1d9..522d16d70210 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -238,6 +238,9 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
> >
> > icsk->icsk_af_ops = &ipv6_mapped;
> > + if (sk_is_mptcp(sk))
> > + mptcp_handle_ipv6_mapped(sk, true);
> > +
>
> I think we could get maintainer feedback about assigning icsk->icsk_af_ops
> multiple times. An option is to create an inlineable helper function that
> returns the appropriate value to assign (and does the other
> mptcp_handle_ipv6_mapped() stuff).
double checking I got the obove correctly. Your suggestion is something
alike the following lines:
---
static inline struct inet_connection_sock_af_ops *
tcp_ipv6_mapped_ops(struct sock *sk)
{
if (!sk_is_mptcp(sk))
return &ipv6_mapped;
return mptcp_mapped_ops(sk);
}
// ...
icsk->icsk_af_ops = tcp_ipv6_mapped_ops(sk);
---
plus something similar to reset the ops on error, right?
I personally don't think setting twice icsk_af_ops should be an issue:
it happens only when the socket can't receive/transmit any packet under
the socket lock - should be safe - and in a somewhat uncommon control
path scenario - performances do not matter much. Do you have strong
feeling about this?
Thanks,
Paolo
^ permalink raw reply [flat|nested] 3+ messages in thread
* [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address.
@ 2020-01-14 2:39 Mat Martineau
0 siblings, 0 replies; 3+ messages in thread
From: Mat Martineau @ 2020-01-14 2:39 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 6629 bytes --]
On Mon, 13 Jan 2020, Paolo Abeni wrote:
> The TCP code unconditionally overwrite in a few places 'icsk->icsk_af_ops',
> which lead to the wrong oops being called for ipv4-mapped ipv6 address.
> e.g. Kasan notice a invalid slub access running packetdrill ipv4-mapped
> tests.
>
> This patch try to address the issue building ipv4-mapped mptcp icsk ops
> and using it when appropriate. Still needs quite a bit of work:
>
> - kasan still notice an use after free at netns cleanup time
> (for the ns struct ???)
> - squashing
>
> ANy feedback more than welcome!
> ---
> include/net/mptcp.h | 2 ++
> net/ipv6/tcp_ipv6.c | 5 ++++
> net/mptcp/subflow.c | 66 +++++++++++++++++++++++++++++++++++++++++----
> 3 files changed, 68 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> index f65baeacf199..27627e2d1bc2 100644
> --- a/include/net/mptcp.h
> +++ b/include/net/mptcp.h
> @@ -174,6 +174,8 @@ static inline bool mptcp_skb_can_collapse(const struct sk_buff *to,
>
> #endif /* CONFIG_MPTCP */
>
> +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped);
> +
> #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> int mptcpv6_init(void);
> #elif IS_ENABLED(CONFIG_IPV6)
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 60068ffde1d9..522d16d70210 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -238,6 +238,9 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
>
> icsk->icsk_af_ops = &ipv6_mapped;
> + if (sk_is_mptcp(sk))
> + mptcp_handle_ipv6_mapped(sk, true);
> +
I think we could get maintainer feedback about assigning icsk->icsk_af_ops
multiple times. An option is to create an inlineable helper function that
returns the appropriate value to assign (and does the other
mptcp_handle_ipv6_mapped() stuff).
Mat
> sk->sk_backlog_rcv = tcp_v4_do_rcv;
> #ifdef CONFIG_TCP_MD5SIG
> tp->af_specific = &tcp_sock_ipv6_mapped_specific;
> @@ -252,6 +255,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> #ifdef CONFIG_TCP_MD5SIG
> tp->af_specific = &tcp_sock_ipv6_specific;
> #endif
> + if (sk_is_mptcp(sk))
> + mptcp_handle_ipv6_mapped(sk, false);
> goto failure;
> }
> np->saddr = sk->sk_v6_rcv_saddr;
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 2c0f2ed7a428..4898cb340659 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -20,6 +20,8 @@
> #include <net/mptcp.h>
> #include "protocol.h"
>
> +static void subflow_chk_mapped(struct sock *sk);
> +
> static int subflow_rebuild_header(struct sock *sk)
> {
> struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
> @@ -149,6 +151,7 @@ static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb)
> #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> static struct tcp_request_sock_ops subflow_request_sock_ipv6_ops;
> static struct inet_connection_sock_af_ops subflow_v6_specific;
> +static struct inet_connection_sock_af_ops subflow_v6m_specific;
>
> static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb)
> {
> @@ -219,6 +222,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
> if (!ctx)
> return child;
>
> + subflow_chk_mapped(child);
> +
> if (ctx->mp_capable) {
> if (mptcp_token_new_accept(ctx->token))
> goto close_child;
> @@ -237,6 +242,54 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
>
> static struct inet_connection_sock_af_ops subflow_specific;
>
> +static struct inet_connection_sock_af_ops *
> +subflow_default_af_ops(struct sock *sk)
> +{
> +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> + if (sk->sk_family == AF_INET6)
> + return &subflow_v6_specific;
> +#endif
> + return &subflow_specific;
> +}
> +
> +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped)
> +{
> +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
> + struct inet_connection_sock *icsk = inet_csk(sk);
> + struct inet_connection_sock_af_ops *target;
> +
> + target = mapped ? &subflow_v6m_specific : subflow_default_af_ops(sk);
> +
> + pr_debug("subflow=%p family=%d ops=%p target=%p mapped=%d",
> + subflow, sk->sk_family, icsk->icsk_af_ops, target, mapped);
> +
> + if (likely(icsk->icsk_af_ops == target))
> + return;
> +
> + subflow->icsk_af_ops = icsk->icsk_af_ops;
> + icsk->icsk_af_ops = target;
> +#endif
> +}
> +
> +static void subflow_chk_mapped(struct sock *sk)
> +{
> +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
> + struct inet_connection_sock *icsk = inet_csk(sk);
> +
> + pr_debug("subflow=%p family=%d ops=%p default=%p mapped=%p",
> + subflow, sk->sk_family, icsk->icsk_af_ops,
> + subflow_default_af_ops(sk), &subflow_v6m_specific);
> +
> + if (likely(icsk->icsk_af_ops == subflow_default_af_ops(sk)))
> + return;
> +
> + subflow->icsk_af_ops = icsk->icsk_af_ops;
> + icsk->icsk_af_ops = &subflow_v6m_specific;
> +#endif
> +}
> +
> enum mapping_status {
> MAPPING_OK,
> MAPPING_INVALID,
> @@ -688,11 +741,7 @@ static int subflow_ulp_init(struct sock *sk)
>
> tp->is_mptcp = 1;
> ctx->icsk_af_ops = icsk->icsk_af_ops;
> - icsk->icsk_af_ops = &subflow_specific;
> -#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> - if (sk->sk_family == AF_INET6)
> - icsk->icsk_af_ops = &subflow_v6_specific;
> -#endif
> + icsk->icsk_af_ops = subflow_default_af_ops(sk);
> ctx->tcp_data_ready = sk->sk_data_ready;
> ctx->tcp_state_change = sk->sk_state_change;
> ctx->tcp_write_space = sk->sk_write_space;
> @@ -815,6 +864,13 @@ void mptcp_subflow_init(void)
> subflow_v6_specific.syn_recv_sock = subflow_syn_recv_sock;
> subflow_v6_specific.sk_rx_dst_set = subflow_finish_connect;
> subflow_v6_specific.rebuild_header = subflow_rebuild_header;
> +
> + subflow_v6m_specific = subflow_v6_specific;
> + subflow_v6m_specific.queue_xmit = ipv4_specific.queue_xmit;
> + subflow_v6m_specific.send_check = ipv4_specific.send_check;
> + subflow_v6m_specific.net_header_len = ipv4_specific.net_header_len;
> + subflow_v6m_specific.mtu_reduced = ipv4_specific.mtu_reduced;
> + subflow_v6m_specific.net_frag_header_len = 0;
> #endif
>
> if (tcp_register_ulp(&subflow_ulp_ops) != 0)
> --
> 2.21.0
> _______________________________________________
> mptcp mailing list -- mptcp(a)lists.01.org
> To unsubscribe send an email to mptcp-leave(a)lists.01.org
>
--
Mat Martineau
Intel
^ permalink raw reply [flat|nested] 3+ messages in thread
* [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address.
@ 2020-01-13 16:09 Paolo Abeni
0 siblings, 0 replies; 3+ messages in thread
From: Paolo Abeni @ 2020-01-13 16:09 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 7892 bytes --]
On Mon, 2020-01-13 at 15:34 +0100, Paolo Abeni wrote:
> The TCP code unconditionally overwrite in a few places 'icsk->icsk_af_ops',
> which lead to the wrong oops being called for ipv4-mapped ipv6 address.
> e.g. Kasan notice a invalid slub access running packetdrill ipv4-mapped
> tests.
>
> This patch try to address the issue building ipv4-mapped mptcp icsk ops
> and using it when appropriate. Still needs quite a bit of work:
>
> - kasan still notice an use after free at netns cleanup time
> (for the ns struct ???)
Actually, I can reproduce the UAF with the current git (unpatched) tree
running the packetdrill test v1_connect_tcpfallback_wrongver.pkt with
ipv4 addresses.
So it looks like we have UAF in MPTCP ipv4.
The full splat is as follow:
[ 133.498790] subflow_create_ctx:616: MPTCP: subflow=00000000b82781fb
[ 133.498794] subflow_ulp_init:687: MPTCP: subflow=00000000b82781fb, family=2
[ 133.499845] mptcp_subflow_create_socket:595: MPTCP: subflow=00000000b82781fb
[ 133.500939] subflow_rebuild_header:29: MPTCP: subflow=00000000a3080487
[ 133.501868] mptcp_token_new_connect:109: MPTCP: ssk=00000000a3080487, local_key=14090005032543291656, token=470185988, idsn=6672336186025600666
[ 133.504546] mptcp_syn_options:291: local_key=14090005032543291656
[ 133.506686] mptcp_rcv_synsent:305: subflow=00000000b82781fb
[ 133.507624] subflow_finish_connect:116: MPTCP: subflow=00000000b82781fb, remote_key=0
[ 133.508373] subflow_finish_connect:122: MPTCP: synack seq=0
[ 133.698080] mptcp_sendmsg:329: MPTCP: fallback passthrough
[ 133.903468] ==================================================================
[ 133.904836] BUG: KASAN: use-after-free in fib_get_table+0xfa/0x120
[ 133.905968] Read of size 8 at addr ffff8880a938f7f8 by task swapper/1/0
[ 133.907509] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc4.mptcp_81918d92317d+ #526
[ 133.908952] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
[ 133.910624] Call Trace:
[ 133.911157] <IRQ>
[ 133.911614] dump_stack+0x97/0xe0
[ 133.912227] print_address_description.constprop.0+0x1b/0x210
[ 133.913268] ? fib_get_table+0xfa/0x120
[ 133.913993] ? fib_get_table+0xfa/0x120
[ 133.914710] __kasan_report.cold+0x37/0x7d
[ 133.915481] ? inet_lookup_ifaddr_rcu+0x100/0x1e0
[ 133.916346] ? fib_get_table+0xfa/0x120
[ 133.917086] kasan_report+0xe/0x20
[ 133.917734] fib_get_table+0xfa/0x120
[ 133.918433] __ip_dev_find+0x265/0x2f0
[ 133.919144] ? inet_lookup_ifaddr_rcu+0x1e0/0x1e0
[ 133.920012] ? mark_lock+0xbc/0x1190
[ 133.920679] ? mark_held_locks+0x65/0xf0
[ 133.921437] ip_route_output_key_hash_rcu+0x135d/0x22e0
[ 133.922401] ? ip_route_input_noref+0x1b0/0x1b0
[ 133.923228] ? ip_route_output_key_hash+0x123/0x270
[ 133.924132] ip_route_output_key_hash+0x176/0x270
[ 133.925008] ? ip_route_output_key_hash_rcu+0x22e0/0x22e0
[ 133.926020] ip_route_output_flow+0x1e/0x90
[ 133.926779] inet_sk_rebuild_header+0x769/0x1c90
[ 133.927656] ? inet_sk_set_state+0x330/0x330
[ 133.928461] ? __pskb_trim_head+0x9d0/0x9d0
[ 133.929237] __tcp_retransmit_skb+0x274/0x15e0
[ 133.930088] ? tcp_retrans_try_collapse+0x12c0/0x12c0
[ 133.930766] ? mark_held_locks+0x65/0xf0
[ 133.931310] tcp_send_loss_probe+0x3fb/0x770
[ 133.931906] tcp_write_timer_handler+0x585/0x710
[ 133.932527] tcp_write_timer+0x7e/0x1c0
[ 133.933048] call_timer_fn+0x1b1/0x640
[ 133.933576] ? tcp_write_timer_handler+0x710/0x710
[ 133.934205] ? timer_fixup_init+0x40/0x40
[ 133.934752] ? lock_downgrade+0x720/0x720
[ 133.935295] ? rcu_read_lock_sched_held+0xa1/0xe0
[ 133.935926] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 133.936530] ? _raw_spin_unlock_irq+0x24/0x30
[ 133.937114] __run_timers.part.0+0x4a5/0x9a0
[ 133.937701] ? tcp_write_timer_handler+0x710/0x710
[ 133.938346] ? call_timer_fn+0x640/0x640
[ 133.938873] ? lock_downgrade+0x720/0x720
[ 133.939445] ? rcu_read_lock_sched_held+0xa1/0xe0
[ 133.940069] run_timer_softirq+0x6a/0x100
[ 133.940618] __do_softirq+0x24c/0xa6e
[ 133.941127] irq_exit+0x275/0x2c0
[ 133.941603] smp_apic_timer_interrupt+0x1b0/0x650
[ 133.942222] apic_timer_interrupt+0xf/0x20
[ 133.942777] </IRQ>
[ 133.943090] RIP: 0010:default_idle+0x2d/0x390
[ 133.943676] Code: 00 00 41 56 41 55 41 54 65 44 8b 25 c5 56 2c 79 55 53 0f 1f 44 00 00 e8 b1 95 7c fe e9 07 00 00 00 0f 00 2d 35 80 4c 00 fb f4 <65> 44 8b 25 a3 56 2c 79 0f 1f 44 00 00 5b 5d 41 5c 41 5d 41 5e c3
[ 133.945969] RSP: 0018:ffff88810746fda8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 133.946932] RAX: 0000000000000007 RBX: ffff888107463480 RCX: 0000000000000000
[ 133.947835] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff88810746417c
[ 133.948741] RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
[ 133.949698] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[ 133.950618] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff883e7820
[ 133.951573] ? default_idle+0x1f/0x390
[ 133.952088] do_idle+0x44b/0x5a0
[ 133.952553] ? arch_cpu_idle_exit+0x40/0x40
[ 133.953120] ? _raw_spin_unlock_irqrestore+0x3e/0x50
[ 133.953811] cpu_startup_entry+0x19/0x20
[ 133.954353] start_secondary+0x2f4/0x400
[ 133.954918] ? set_cpu_sibling_map+0x2420/0x2420
[ 133.955556] secondary_startup_64+0xb6/0xc0
[ 133.956401] Allocated by task 3288:
[ 133.956901] save_stack+0x1b/0x80
[ 133.957400] __kasan_kmalloc.constprop.0+0xc2/0xd0
[ 133.958037] kmem_cache_alloc_trace+0x160/0x390
[ 133.958664] fib_net_init+0x133/0x300
[ 133.959180] ops_init+0x96/0x360
[ 133.959644] setup_net+0x2c5/0x790
[ 133.960117] copy_net_ns+0x24f/0x490
[ 133.960625] create_new_namespaces+0x35e/0x6e0
[ 133.961219] unshare_nsproxy_namespaces+0x87/0x1a0
[ 133.961863] ksys_unshare+0x306/0x680
[ 133.962390] __x64_sys_unshare+0x2d/0x40
[ 133.962925] do_syscall_64+0x99/0x500
[ 133.963434] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 133.964327] Freed by task 36:
[ 133.964763] save_stack+0x1b/0x80
[ 133.965217] __kasan_slab_free+0x12c/0x170
[ 133.965768] kfree+0x103/0x370
[ 133.966195] ip_fib_net_exit+0x231/0x340
[ 133.966760] ops_exit_list.isra.0+0x90/0x140
[ 133.967368] cleanup_net+0x44a/0x930
[ 133.967854] process_one_work+0x889/0x15b0
[ 133.968421] worker_thread+0x91/0xcb0
[ 133.968922] kthread+0x2f2/0x3f0
[ 133.969387] ret_from_fork+0x24/0x30
[ 133.970115] The buggy address belongs to the object at ffff8880a938f000
which belongs to the cache kmalloc-2k of size 2048
[ 133.971699] The buggy address is located 2040 bytes inside of
2048-byte region [ffff8880a938f000, ffff8880a938f800)
[ 133.973173] The buggy address belongs to the page:
[ 133.973810] page:ffffea0002a4e200 refcount:1 mapcount:0 mapping:ffff888107c0c000 index:0x0 compound_mapcount: 0
[ 133.975077] raw: 000fffffc0010200 dead000000000100 dead000000000122 ffff888107c0c000
[ 133.976093] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
[ 133.977090] page dumped because: kasan: bad access detected
[ 133.978073] Memory state around the buggy address:
[ 133.978723] ffff8880a938f680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 133.979672] ffff8880a938f700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 133.980597] >ffff8880a938f780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 133.981551] ^
[ 133.982498] ffff8880a938f800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 133.983481] ffff8880a938f880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 133.984404] ==================================================================
[ 133.985413] Disabling lock debugging due to kernel taint
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-01-14 17:40 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-14 17:40 [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address Paolo Abeni
-- strict thread matches above, loose matches on Subject: below --
2020-01-14 2:39 Mat Martineau
2020-01-13 16:09 Paolo Abeni
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.