* [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address.
@ 2020-01-14 17:40 Paolo Abeni
  0 siblings, 0 replies; 3+ messages in thread
From: Paolo Abeni @ 2020-01-14 17:40 UTC (permalink / raw)
  To: mptcp


On Mon, 2020-01-13 at 18:39 -0800, Mat Martineau wrote:
> On Mon, 13 Jan 2020, Paolo Abeni wrote:
> 
> > The TCP code unconditionally overwrites 'icsk->icsk_af_ops' in a few
> > places, which leads to the wrong ops being called for ipv4-mapped ipv6
> > addresses. e.g. KASAN notices an invalid SLUB access when running the
> > packetdrill ipv4-mapped tests.
> > 
> > This patch tries to address the issue by building ipv4-mapped MPTCP
> > icsk ops and using them when appropriate. Still needs quite a bit of
> > work:
> > 
> > - KASAN still notices a use-after-free at netns cleanup time
> >  (for the ns struct ???)
> > - squashing
> > 
> > Any feedback is more than welcome!
> > ---
> > include/net/mptcp.h |  2 ++
> > net/ipv6/tcp_ipv6.c |  5 ++++
> > net/mptcp/subflow.c | 66 +++++++++++++++++++++++++++++++++++++++++----
> > 3 files changed, 68 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> > index f65baeacf199..27627e2d1bc2 100644
> > --- a/include/net/mptcp.h
> > +++ b/include/net/mptcp.h
> > @@ -174,6 +174,8 @@ static inline bool mptcp_skb_can_collapse(const struct sk_buff *to,
> > 
> > #endif /* CONFIG_MPTCP */
> > 
> > +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped);
> > +
> > #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> > int mptcpv6_init(void);
> > #elif IS_ENABLED(CONFIG_IPV6)
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index 60068ffde1d9..522d16d70210 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -238,6 +238,9 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> > 		sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
> > 
> > 		icsk->icsk_af_ops = &ipv6_mapped;
> > +		if (sk_is_mptcp(sk))
> > +			mptcp_handle_ipv6_mapped(sk, true);
> > +
> 
> I think we could get maintainer feedback about assigning icsk->icsk_af_ops 
> multiple times. An option is to create an inlineable helper function that 
> returns the appropriate value to assign (and does the other 
> mptcp_handle_ipv6_mapped() stuff).

Double-checking that I got the above correctly. Your suggestion is
something like the following:
---
static inline struct inet_connection_sock_af_ops *
tcp_ipv6_mapped_ops(struct sock *sk)
{
	if (!sk_is_mptcp(sk))
		return &ipv6_mapped;
	return mptcp_mapped_ops(sk);
}


// ...
	icsk->icsk_af_ops = tcp_ipv6_mapped_ops(sk);
---

plus something similar to reset the ops on error, right?

I personally don't think setting icsk_af_ops twice should be an issue:
it happens only while the socket can't receive/transmit any packets,
under the socket lock - so it should be safe - and in a somewhat
uncommon control-path scenario - so performance does not matter much.
Do you have strong feelings about this?

Thanks,

Paolo


* [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address.
@ 2020-01-14  2:39 Mat Martineau
  0 siblings, 0 replies; 3+ messages in thread
From: Mat Martineau @ 2020-01-14  2:39 UTC (permalink / raw)
  To: mptcp



On Mon, 13 Jan 2020, Paolo Abeni wrote:

> The TCP code unconditionally overwrites 'icsk->icsk_af_ops' in a few
> places, which leads to the wrong ops being called for ipv4-mapped ipv6
> addresses. e.g. KASAN notices an invalid SLUB access when running the
> packetdrill ipv4-mapped tests.
>
> This patch tries to address the issue by building ipv4-mapped MPTCP
> icsk ops and using them when appropriate. Still needs quite a bit of
> work:
>
> - KASAN still notices a use-after-free at netns cleanup time
>  (for the ns struct ???)
> - squashing
>
> Any feedback is more than welcome!
> ---
> include/net/mptcp.h |  2 ++
> net/ipv6/tcp_ipv6.c |  5 ++++
> net/mptcp/subflow.c | 66 +++++++++++++++++++++++++++++++++++++++++----
> 3 files changed, 68 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> index f65baeacf199..27627e2d1bc2 100644
> --- a/include/net/mptcp.h
> +++ b/include/net/mptcp.h
> @@ -174,6 +174,8 @@ static inline bool mptcp_skb_can_collapse(const struct sk_buff *to,
>
> #endif /* CONFIG_MPTCP */
>
> +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped);
> +
> #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> int mptcpv6_init(void);
> #elif IS_ENABLED(CONFIG_IPV6)
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 60068ffde1d9..522d16d70210 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -238,6 +238,9 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> 		sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
>
> 		icsk->icsk_af_ops = &ipv6_mapped;
> +		if (sk_is_mptcp(sk))
> +			mptcp_handle_ipv6_mapped(sk, true);
> +

I think we could get maintainer feedback about assigning icsk->icsk_af_ops 
multiple times. An option is to create an inlineable helper function that 
returns the appropriate value to assign (and does the other 
mptcp_handle_ipv6_mapped() stuff).

Mat


> 		sk->sk_backlog_rcv = tcp_v4_do_rcv;
> #ifdef CONFIG_TCP_MD5SIG
> 		tp->af_specific = &tcp_sock_ipv6_mapped_specific;
> @@ -252,6 +255,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
> #ifdef CONFIG_TCP_MD5SIG
> 			tp->af_specific = &tcp_sock_ipv6_specific;
> #endif
> +			if (sk_is_mptcp(sk))
> +				mptcp_handle_ipv6_mapped(sk, false);
> 			goto failure;
> 		}
> 		np->saddr = sk->sk_v6_rcv_saddr;
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 2c0f2ed7a428..4898cb340659 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -20,6 +20,8 @@
> #include <net/mptcp.h>
> #include "protocol.h"
>
> +static void subflow_chk_mapped(struct sock *sk);
> +
> static int subflow_rebuild_header(struct sock *sk)
> {
> 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
> @@ -149,6 +151,7 @@ static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb)
> #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> static struct tcp_request_sock_ops subflow_request_sock_ipv6_ops;
> static struct inet_connection_sock_af_ops subflow_v6_specific;
> +static struct inet_connection_sock_af_ops subflow_v6m_specific;
>
> static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb)
> {
> @@ -219,6 +222,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
> 		if (!ctx)
> 			return child;
>
> +		subflow_chk_mapped(child);
> +
> 		if (ctx->mp_capable) {
> 			if (mptcp_token_new_accept(ctx->token))
> 				goto close_child;
> @@ -237,6 +242,54 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
>
> static struct inet_connection_sock_af_ops subflow_specific;
>
> +static struct inet_connection_sock_af_ops *
> +subflow_default_af_ops(struct sock *sk)
> +{
> +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> +	if (sk->sk_family == AF_INET6)
> +		return &subflow_v6_specific;
> +#endif
> +	return &subflow_specific;
> +}
> +
> +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped)
> +{
> +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> +	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
> +	struct inet_connection_sock *icsk = inet_csk(sk);
> +	struct inet_connection_sock_af_ops *target;
> +
> +	target = mapped ? &subflow_v6m_specific : subflow_default_af_ops(sk);
> +
> +	pr_debug("subflow=%p family=%d ops=%p target=%p mapped=%d",
> +	         subflow, sk->sk_family, icsk->icsk_af_ops, target, mapped);
> +
> +	if (likely(icsk->icsk_af_ops == target))
> +		return;
> +
> +	subflow->icsk_af_ops = icsk->icsk_af_ops;
> +	icsk->icsk_af_ops = target;
> +#endif
> +}
> +
> +static void subflow_chk_mapped(struct sock *sk)
> +{
> +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> +	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
> +	struct inet_connection_sock *icsk = inet_csk(sk);
> +
> +	pr_debug("subflow=%p family=%d ops=%p default=%p mapped=%p",
> +	         subflow, sk->sk_family, icsk->icsk_af_ops,
> +	         subflow_default_af_ops(sk), &subflow_v6m_specific);
> +
> +	if (likely(icsk->icsk_af_ops == subflow_default_af_ops(sk)))
> +		return;
> +
> +	subflow->icsk_af_ops = icsk->icsk_af_ops;
> +	icsk->icsk_af_ops = &subflow_v6m_specific;
> +#endif
> +}
> +
> enum mapping_status {
> 	MAPPING_OK,
> 	MAPPING_INVALID,
> @@ -688,11 +741,7 @@ static int subflow_ulp_init(struct sock *sk)
>
> 	tp->is_mptcp = 1;
> 	ctx->icsk_af_ops = icsk->icsk_af_ops;
> -	icsk->icsk_af_ops = &subflow_specific;
> -#if IS_ENABLED(CONFIG_MPTCP_IPV6)
> -	if (sk->sk_family == AF_INET6)
> -		icsk->icsk_af_ops = &subflow_v6_specific;
> -#endif
> +	icsk->icsk_af_ops = subflow_default_af_ops(sk);
> 	ctx->tcp_data_ready = sk->sk_data_ready;
> 	ctx->tcp_state_change = sk->sk_state_change;
> 	ctx->tcp_write_space = sk->sk_write_space;
> @@ -815,6 +864,13 @@ void mptcp_subflow_init(void)
> 	subflow_v6_specific.syn_recv_sock = subflow_syn_recv_sock;
> 	subflow_v6_specific.sk_rx_dst_set = subflow_finish_connect;
> 	subflow_v6_specific.rebuild_header = subflow_rebuild_header;
> +
> +	subflow_v6m_specific = subflow_v6_specific;
> +	subflow_v6m_specific.queue_xmit = ipv4_specific.queue_xmit;
> +	subflow_v6m_specific.send_check = ipv4_specific.send_check;
> +	subflow_v6m_specific.net_header_len = ipv4_specific.net_header_len;
> +	subflow_v6m_specific.mtu_reduced = ipv4_specific.mtu_reduced;
> +	subflow_v6m_specific.net_frag_header_len = 0;
> #endif
>
> 	if (tcp_register_ulp(&subflow_ulp_ops) != 0)
> -- 
> 2.21.0
> _______________________________________________
> mptcp mailing list -- mptcp(a)lists.01.org
> To unsubscribe send an email to mptcp-leave(a)lists.01.org
>

--
Mat Martineau
Intel


* [MPTCP] Re: [RFC PATCH] mptcp: handle ipv4-mapped ipv6 address.
@ 2020-01-13 16:09 Paolo Abeni
  0 siblings, 0 replies; 3+ messages in thread
From: Paolo Abeni @ 2020-01-13 16:09 UTC (permalink / raw)
  To: mptcp


On Mon, 2020-01-13 at 15:34 +0100, Paolo Abeni wrote:
> The TCP code unconditionally overwrites 'icsk->icsk_af_ops' in a few
> places, which leads to the wrong ops being called for ipv4-mapped ipv6
> addresses. e.g. KASAN notices an invalid SLUB access when running the
> packetdrill ipv4-mapped tests.
> 
> This patch tries to address the issue by building ipv4-mapped MPTCP
> icsk ops and using them when appropriate. Still needs quite a bit of
> work:
> 
> - KASAN still notices a use-after-free at netns cleanup time
>   (for the ns struct ???)

Actually, I can reproduce the UAF with the current (unpatched) git tree
by running the packetdrill test v1_connect_tcpfallback_wrongver.pkt with
ipv4 addresses.

So it looks like we have a UAF in MPTCP ipv4.

The full splat is as follows:

[  133.498790] subflow_create_ctx:616: MPTCP: subflow=00000000b82781fb
[  133.498794] subflow_ulp_init:687: MPTCP: subflow=00000000b82781fb, family=2
[  133.499845] mptcp_subflow_create_socket:595: MPTCP: subflow=00000000b82781fb
[  133.500939] subflow_rebuild_header:29: MPTCP: subflow=00000000a3080487
[  133.501868] mptcp_token_new_connect:109: MPTCP: ssk=00000000a3080487, local_key=14090005032543291656, token=470185988, idsn=6672336186025600666
[  133.504546] mptcp_syn_options:291: local_key=14090005032543291656
[  133.506686] mptcp_rcv_synsent:305: subflow=00000000b82781fb
[  133.507624] subflow_finish_connect:116: MPTCP: subflow=00000000b82781fb, remote_key=0
[  133.508373] subflow_finish_connect:122: MPTCP: synack seq=0
[  133.698080] mptcp_sendmsg:329: MPTCP: fallback passthrough
[  133.903468] ==================================================================
[  133.904836] BUG: KASAN: use-after-free in fib_get_table+0xfa/0x120
[  133.905968] Read of size 8 at addr ffff8880a938f7f8 by task swapper/1/0

[  133.907509] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc4.mptcp_81918d92317d+ #526
[  133.908952] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
[  133.910624] Call Trace:
[  133.911157]  <IRQ>
[  133.911614]  dump_stack+0x97/0xe0
[  133.912227]  print_address_description.constprop.0+0x1b/0x210
[  133.913268]  ? fib_get_table+0xfa/0x120
[  133.913993]  ? fib_get_table+0xfa/0x120
[  133.914710]  __kasan_report.cold+0x37/0x7d
[  133.915481]  ? inet_lookup_ifaddr_rcu+0x100/0x1e0
[  133.916346]  ? fib_get_table+0xfa/0x120
[  133.917086]  kasan_report+0xe/0x20
[  133.917734]  fib_get_table+0xfa/0x120
[  133.918433]  __ip_dev_find+0x265/0x2f0
[  133.919144]  ? inet_lookup_ifaddr_rcu+0x1e0/0x1e0
[  133.920012]  ? mark_lock+0xbc/0x1190
[  133.920679]  ? mark_held_locks+0x65/0xf0
[  133.921437]  ip_route_output_key_hash_rcu+0x135d/0x22e0
[  133.922401]  ? ip_route_input_noref+0x1b0/0x1b0
[  133.923228]  ? ip_route_output_key_hash+0x123/0x270
[  133.924132]  ip_route_output_key_hash+0x176/0x270
[  133.925008]  ? ip_route_output_key_hash_rcu+0x22e0/0x22e0
[  133.926020]  ip_route_output_flow+0x1e/0x90
[  133.926779]  inet_sk_rebuild_header+0x769/0x1c90
[  133.927656]  ? inet_sk_set_state+0x330/0x330
[  133.928461]  ? __pskb_trim_head+0x9d0/0x9d0
[  133.929237]  __tcp_retransmit_skb+0x274/0x15e0
[  133.930088]  ? tcp_retrans_try_collapse+0x12c0/0x12c0
[  133.930766]  ? mark_held_locks+0x65/0xf0
[  133.931310]  tcp_send_loss_probe+0x3fb/0x770
[  133.931906]  tcp_write_timer_handler+0x585/0x710
[  133.932527]  tcp_write_timer+0x7e/0x1c0
[  133.933048]  call_timer_fn+0x1b1/0x640
[  133.933576]  ? tcp_write_timer_handler+0x710/0x710
[  133.934205]  ? timer_fixup_init+0x40/0x40
[  133.934752]  ? lock_downgrade+0x720/0x720
[  133.935295]  ? rcu_read_lock_sched_held+0xa1/0xe0
[  133.935926]  ? rcu_read_lock_bh_held+0xc0/0xc0
[  133.936530]  ? _raw_spin_unlock_irq+0x24/0x30
[  133.937114]  __run_timers.part.0+0x4a5/0x9a0
[  133.937701]  ? tcp_write_timer_handler+0x710/0x710
[  133.938346]  ? call_timer_fn+0x640/0x640
[  133.938873]  ? lock_downgrade+0x720/0x720
[  133.939445]  ? rcu_read_lock_sched_held+0xa1/0xe0
[  133.940069]  run_timer_softirq+0x6a/0x100
[  133.940618]  __do_softirq+0x24c/0xa6e
[  133.941127]  irq_exit+0x275/0x2c0
[  133.941603]  smp_apic_timer_interrupt+0x1b0/0x650
[  133.942222]  apic_timer_interrupt+0xf/0x20
[  133.942777]  </IRQ>
[  133.943090] RIP: 0010:default_idle+0x2d/0x390
[  133.943676] Code: 00 00 41 56 41 55 41 54 65 44 8b 25 c5 56 2c 79 55 53 0f 1f 44 00 00 e8 b1 95 7c fe e9 07 00 00 00 0f 00 2d 35 80 4c 00 fb f4 <65> 44 8b 25 a3 56 2c 79 0f 1f 44 00 00 5b 5d 41 5c 41 5d 41 5e c3
[  133.945969] RSP: 0018:ffff88810746fda8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[  133.946932] RAX: 0000000000000007 RBX: ffff888107463480 RCX: 0000000000000000
[  133.947835] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff88810746417c
[  133.948741] RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
[  133.949698] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[  133.950618] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff883e7820
[  133.951573]  ? default_idle+0x1f/0x390
[  133.952088]  do_idle+0x44b/0x5a0
[  133.952553]  ? arch_cpu_idle_exit+0x40/0x40
[  133.953120]  ? _raw_spin_unlock_irqrestore+0x3e/0x50
[  133.953811]  cpu_startup_entry+0x19/0x20
[  133.954353]  start_secondary+0x2f4/0x400
[  133.954918]  ? set_cpu_sibling_map+0x2420/0x2420
[  133.955556]  secondary_startup_64+0xb6/0xc0

[  133.956401] Allocated by task 3288:
[  133.956901]  save_stack+0x1b/0x80
[  133.957400]  __kasan_kmalloc.constprop.0+0xc2/0xd0
[  133.958037]  kmem_cache_alloc_trace+0x160/0x390
[  133.958664]  fib_net_init+0x133/0x300
[  133.959180]  ops_init+0x96/0x360
[  133.959644]  setup_net+0x2c5/0x790
[  133.960117]  copy_net_ns+0x24f/0x490
[  133.960625]  create_new_namespaces+0x35e/0x6e0
[  133.961219]  unshare_nsproxy_namespaces+0x87/0x1a0
[  133.961863]  ksys_unshare+0x306/0x680
[  133.962390]  __x64_sys_unshare+0x2d/0x40
[  133.962925]  do_syscall_64+0x99/0x500
[  133.963434]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  133.964327] Freed by task 36:
[  133.964763]  save_stack+0x1b/0x80
[  133.965217]  __kasan_slab_free+0x12c/0x170
[  133.965768]  kfree+0x103/0x370
[  133.966195]  ip_fib_net_exit+0x231/0x340
[  133.966760]  ops_exit_list.isra.0+0x90/0x140
[  133.967368]  cleanup_net+0x44a/0x930
[  133.967854]  process_one_work+0x889/0x15b0
[  133.968421]  worker_thread+0x91/0xcb0
[  133.968922]  kthread+0x2f2/0x3f0
[  133.969387]  ret_from_fork+0x24/0x30
[  133.970115] The buggy address belongs to the object at ffff8880a938f000
 which belongs to the cache kmalloc-2k of size 2048

[  133.971699] The buggy address is located 2040 bytes inside of
 2048-byte region [ffff8880a938f000, ffff8880a938f800)
[  133.973173] The buggy address belongs to the page:
[  133.973810] page:ffffea0002a4e200 refcount:1 mapcount:0 mapping:ffff888107c0c000 index:0x0 compound_mapcount: 0
[  133.975077] raw: 000fffffc0010200 dead000000000100 dead000000000122 ffff888107c0c000
[  133.976093] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
[  133.977090] page dumped because: kasan: bad access detected
[  133.978073] Memory state around the buggy address:
[  133.978723]  ffff8880a938f680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  133.979672]  ffff8880a938f700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  133.980597] >ffff8880a938f780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  133.981551]                                                                 ^
[  133.982498]  ffff8880a938f800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  133.983481]  ffff8880a938f880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  133.984404] ==================================================================
[  133.985413] Disabling lock debugging due to kernel taint

