All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 0/2] net: fix issues when uncloning an skb dst+metadata
@ 2022-02-02 11:01 Antoine Tenart
  2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
  2022-02-02 11:01 ` [PATCH net 2/2] net: fix a memleak " Antoine Tenart
  0 siblings, 2 replies; 10+ messages in thread
From: Antoine Tenart @ 2022-02-02 11:01 UTC (permalink / raw)
  To: davem, kuba; +Cc: Antoine Tenart, netdev, vladbu, pabeni, pshelar

Hi,

This fixes two issues when uncloning an skb dst+metadata in
tun_dst_unclone; this was initially reported by Vlad Buslov[1]. Because
of the memory leak fixed by patch 2, the issue in patch 1 never happened
in practice.

tun_dst_unclone is called from two different places, one in geneve/vxlan
to handle PMTU and one in net/openvswitch/actions.c where it is used to
retrieve tunnel information. While both Vlad and I tested the former, we
could not for the latter. I did spend quite some time trying to, but
that code path is not easy to trigger. Code inspection shows this should
be fine, the tunnel information (dst+metadata) is uncloned and the skb
it is referenced from is only consumed after all accesses to the tunnel
information are done:

  do_execute_actions
    output_userspace
      dev_fill_metadata_dst         <- dst+metadata is uncloned
      ovs_dp_upcall
        queue_userspace_packet
          ovs_nla_put_tunnel_info   <- metadata (tunnel info) is accessed
    consume_skb                     <- dst+metadata is freed

Thanks!
Antoine

[1] https://lore.kernel.org/all/ygnhh79yluw2.fsf@nvidia.com/T/#m2f814614a4f5424cea66bbff7297f692b59b69a0

Antoine Tenart (2):
  net: do not keep the dst cache when uncloning an skb dst and its
    metadata
  net: fix a memleak when uncloning an skb dst and its metadata

 include/net/dst_metadata.h | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 11:01 [PATCH net 0/2] net: fix issues when uncloning an skb dst+metadata Antoine Tenart
@ 2022-02-02 11:01 ` Antoine Tenart
  2022-02-02 11:17   ` Antoine Tenart
                     ` (2 more replies)
  2022-02-02 11:01 ` [PATCH net 2/2] net: fix a memleak " Antoine Tenart
  1 sibling, 3 replies; 10+ messages in thread
From: Antoine Tenart @ 2022-02-02 11:01 UTC (permalink / raw)
  To: davem, kuba; +Cc: Antoine Tenart, netdev, vladbu, pabeni, pshelar

When uncloning an skb dst and its associated metadata a new dst+metadata
is allocated and the tunnel information from the old metadata is copied
over there.

The issue is the tunnel metadata has references to cached dst, which are
copied along the way. When a dst+metadata refcount drops to 0 the
metadata is freed including the cached dst entries. As they are also
referenced in the initial dst+metadata, this ends up in UaFs.

In practice the above did not happen because of another issue, the
dst+metadata was never freed because its refcount never dropped to 0
(this will be fixed in a subsequent patch).

Fix this by initializing the dst cache after copying the tunnel
information from the old metadata to also unshare the dst cache.

Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: Vlad Buslov <vladbu@nvidia.com>
Tested-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 include/net/dst_metadata.h | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index 14efa0ded75d..c8f8b7b56bba 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
 static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
 {
 	struct metadata_dst *md_dst = skb_metadata_dst(skb);
-	int md_size;
 	struct metadata_dst *new_md;
+	int md_size, ret;
 
 	if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
 		return ERR_PTR(-EINVAL);
@@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
 
 	memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
 	       sizeof(struct ip_tunnel_info) + md_size);
+#ifdef CONFIG_DST_CACHE
+	ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
+	if (ret) {
+		/* We can't call metadata_dst_free directly as the still shared
+		 * dst cache would be released.
+		 */
+		kfree(new_md);
+		return ERR_PTR(ret);
+	}
+#endif
+
 	skb_dst_drop(skb);
 	dst_hold(&new_md->dst);
 	skb_dst_set(skb, &new_md->dst);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH net 2/2] net: fix a memleak when uncloning an skb dst and its metadata
  2022-02-02 11:01 [PATCH net 0/2] net: fix issues when uncloning an skb dst+metadata Antoine Tenart
  2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
@ 2022-02-02 11:01 ` Antoine Tenart
  1 sibling, 0 replies; 10+ messages in thread
From: Antoine Tenart @ 2022-02-02 11:01 UTC (permalink / raw)
  To: davem, kuba; +Cc: Antoine Tenart, netdev, vladbu, pabeni, pshelar

When uncloning an skb dst and its associated metadata, a new
dst+metadata is allocated and later replaces the old one in the skb.
This is helpful to have a non-shared dst+metadata attached to a specific
skb.

The issue is the uncloned dst+metadata is initialized with a refcount of
1, which is increased to 2 before attaching it to the skb. When
tun_dst_unclone returns, the dst+metadata is only referenced from a
single place (the skb) while its refcount is 2. Its refcount will never
drop to 0 (when the skb is consumed), leading to a memory leak.

Fix this by removing the call to dst_hold in tun_dst_unclone, as the
dst+metadata refcount is already 1.

Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
Cc: Pravin B Shelar <pshelar@ovn.org>
Reported-by: Vlad Buslov <vladbu@nvidia.com>
Tested-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 include/net/dst_metadata.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index c8f8b7b56bba..edd75c89222d 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -135,7 +135,6 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
 #endif
 
 	skb_dst_drop(skb);
-	dst_hold(&new_md->dst);
 	skb_dst_set(skb, &new_md->dst);
 	return new_md;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
@ 2022-02-02 11:17   ` Antoine Tenart
  2022-02-02 12:13   ` Daniel Borkmann
  2022-02-02 14:24   ` Paolo Abeni
  2 siblings, 0 replies; 10+ messages in thread
From: Antoine Tenart @ 2022-02-02 11:17 UTC (permalink / raw)
  To: davem, kuba; +Cc: netdev, vladbu, pabeni, pshelar

Quoting Antoine Tenart (2022-02-02 12:01:36)
> When uncloning an skb dst and its associated metadata a new dst+metadata
> is allocated and the tunnel information from the old metadata is copied
> over there.
> 
> The issue is the tunnel metadata has references to cached dst, which are
> copied along the way. When a dst+metadata refcount drops to 0 the
> metadata is freed including the cached dst entries. As they are also
> referenced in the initial dst+metadata, this ends up in UaFs.
> 
> In practice the above did not happen because of another issue, the
> dst+metadata was never freed because its refcount never dropped to 0
> (this will be fixed in a subsequent patch).
> 
> Fix this by initializing the dst cache after copying the tunnel
> information from the old metadata to also unshare the dst cache.
> 
> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> Cc: Paolo Abeni <pabeni@redhat.com>
> Reported-by: Vlad Buslov <vladbu@nvidia.com>
> Tested-by: Vlad Buslov <vladbu@nvidia.com>
> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> ---
>  include/net/dst_metadata.h | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> index 14efa0ded75d..c8f8b7b56bba 100644
> --- a/include/net/dst_metadata.h
> +++ b/include/net/dst_metadata.h
> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
>  static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>  {
>         struct metadata_dst *md_dst = skb_metadata_dst(skb);
> -       int md_size;
>         struct metadata_dst *new_md;
> +       int md_size, ret;

Hmmm ret should probably be defined inside a CONFIG_DST_CACHE section.

>         if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
>                 return ERR_PTR(-EINVAL);
> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>  
>         memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
>                sizeof(struct ip_tunnel_info) + md_size);
> +#ifdef CONFIG_DST_CACHE
> +       ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> +       if (ret) {
> +               /* We can't call metadata_dst_free directly as the still shared
> +                * dst cache would be released.
> +                */
> +               kfree(new_md);
> +               return ERR_PTR(ret);
> +       }
> +#endif
> +
>         skb_dst_drop(skb);
>         dst_hold(&new_md->dst);
>         skb_dst_set(skb, &new_md->dst);
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
  2022-02-02 11:17   ` Antoine Tenart
@ 2022-02-02 12:13   ` Daniel Borkmann
  2022-02-02 13:44     ` Antoine Tenart
  2022-02-02 14:24   ` Paolo Abeni
  2 siblings, 1 reply; 10+ messages in thread
From: Daniel Borkmann @ 2022-02-02 12:13 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba; +Cc: netdev, vladbu, pabeni, pshelar

On 2/2/22 12:01 PM, Antoine Tenart wrote:
> When uncloning an skb dst and its associated metadata a new dst+metadata
> is allocated and the tunnel information from the old metadata is copied
> over there.
> 
> The issue is the tunnel metadata has references to cached dst, which are
> copied along the way. When a dst+metadata refcount drops to 0 the
> metadata is freed including the cached dst entries. As they are also
> referenced in the initial dst+metadata, this ends up in UaFs.
> 
> In practice the above did not happen because of another issue, the
> dst+metadata was never freed because its refcount never dropped to 0
> (this will be fixed in a subsequent patch).
> 
> Fix this by initializing the dst cache after copying the tunnel
> information from the old metadata to also unshare the dst cache.
> 
> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> Cc: Paolo Abeni <pabeni@redhat.com>
> Reported-by: Vlad Buslov <vladbu@nvidia.com>
> Tested-by: Vlad Buslov <vladbu@nvidia.com>
> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> ---
>   include/net/dst_metadata.h | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> index 14efa0ded75d..c8f8b7b56bba 100644
> --- a/include/net/dst_metadata.h
> +++ b/include/net/dst_metadata.h
> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
>   static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>   {
>   	struct metadata_dst *md_dst = skb_metadata_dst(skb);
> -	int md_size;
>   	struct metadata_dst *new_md;
> +	int md_size, ret;
>   
>   	if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
>   		return ERR_PTR(-EINVAL);
> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>   
>   	memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
>   	       sizeof(struct ip_tunnel_info) + md_size);
> +#ifdef CONFIG_DST_CACHE
> +	ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> +	if (ret) {
> +		/* We can't call metadata_dst_free directly as the still shared
> +		 * dst cache would be released.
> +		 */
> +		kfree(new_md);
> +		return ERR_PTR(ret);
> +	}
> +#endif

Could you elaborate (e.g. also in commit message) how this interacts or whether it is
needed for TUNNEL_NOCACHE users? (Among others, latter is used by BPF, for example.)

>   	skb_dst_drop(skb);
>   	dst_hold(&new_md->dst);
>   	skb_dst_set(skb, &new_md->dst);
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 12:13   ` Daniel Borkmann
@ 2022-02-02 13:44     ` Antoine Tenart
  2022-02-04 12:33       ` Daniel Borkmann
  0 siblings, 1 reply; 10+ messages in thread
From: Antoine Tenart @ 2022-02-02 13:44 UTC (permalink / raw)
  To: Daniel Borkmann, davem, kuba; +Cc: netdev, vladbu, pabeni, pshelar

Quoting Daniel Borkmann (2022-02-02 13:13:30)
> On 2/2/22 12:01 PM, Antoine Tenart wrote:
> > When uncloning an skb dst and its associated metadata a new dst+metadata
> > is allocated and the tunnel information from the old metadata is copied
> > over there.
> > 
> > The issue is the tunnel metadata has references to cached dst, which are
> > copied along the way. When a dst+metadata refcount drops to 0 the
> > metadata is freed including the cached dst entries. As they are also
> > referenced in the initial dst+metadata, this ends up in UaFs.
> > 
> > In practice the above did not happen because of another issue, the
> > dst+metadata was never freed because its refcount never dropped to 0
> > (this will be fixed in a subsequent patch).
> > 
> > Fix this by initializing the dst cache after copying the tunnel
> > information from the old metadata to also unshare the dst cache.
> > 
> > Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> > Cc: Paolo Abeni <pabeni@redhat.com>
> > Reported-by: Vlad Buslov <vladbu@nvidia.com>
> > Tested-by: Vlad Buslov <vladbu@nvidia.com>
> > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> > ---
> >   include/net/dst_metadata.h | 13 ++++++++++++-
> >   1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> > index 14efa0ded75d..c8f8b7b56bba 100644
> > --- a/include/net/dst_metadata.h
> > +++ b/include/net/dst_metadata.h
> > @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
> >   static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >   {
> >       struct metadata_dst *md_dst = skb_metadata_dst(skb);
> > -     int md_size;
> >       struct metadata_dst *new_md;
> > +     int md_size, ret;
> >   
> >       if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
> >               return ERR_PTR(-EINVAL);
> > @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >   
> >       memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> >              sizeof(struct ip_tunnel_info) + md_size);
> > +#ifdef CONFIG_DST_CACHE
> > +     ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> > +     if (ret) {
> > +             /* We can't call metadata_dst_free directly as the still shared
> > +              * dst cache would be released.
> > +              */
> > +             kfree(new_md);
> > +             return ERR_PTR(ret);
> > +     }
> > +#endif
> 
> Could you elaborate (e.g. also in commit message) how this interacts
> or whether it is needed for TUNNEL_NOCACHE users? (Among others,
> latter is used by BPF, for example.)

My understanding is that TUNNEL_NOCACHE is used to decide whether or not
to use a dst cache, that might or might not come from the tunnel info
attached to an skb. The dst cache being allocated in a tunnel info is
orthogonal to the use of TUNNEL_NOCACHE. While looking around I actually
found a code path explicitly setting both, in nft_tunnel_obj_init (that
might need to be investigated though but it is another topic).

It doesn't look like initializing the dst cache would break
TUNNEL_NOCACHE users as ip_tunnel_dst_cache_usable would return false
anyway. Having said that, we probably want to unshare the dst cache only
if there is one already, checking for
'md_dst->u.tun_info.dst_cache.cache != NULL' first.

Does that make sense?

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
  2022-02-02 11:17   ` Antoine Tenart
  2022-02-02 12:13   ` Daniel Borkmann
@ 2022-02-02 14:24   ` Paolo Abeni
  2022-02-02 14:29     ` Antoine Tenart
  2 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2022-02-02 14:24 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba; +Cc: netdev, vladbu, pshelar

On Wed, 2022-02-02 at 12:01 +0100, Antoine Tenart wrote:
> When uncloning an skb dst and its associated metadata a new dst+metadata
> is allocated and the tunnel information from the old metadata is copied
> over there.
> 
> The issue is the tunnel metadata has references to cached dst, which are
> copied along the way. When a dst+metadata refcount drops to 0 the
> metadata is freed including the cached dst entries. As they are also
> referenced in the initial dst+metadata, this ends up in UaFs.
> 
> In practice the above did not happen because of another issue, the
> dst+metadata was never freed because its refcount never dropped to 0
> (this will be fixed in a subsequent patch).
> 
> Fix this by initializing the dst cache after copying the tunnel
> information from the old metadata to also unshare the dst cache.
> 
> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> Cc: Paolo Abeni <pabeni@redhat.com>
> Reported-by: Vlad Buslov <vladbu@nvidia.com>
> Tested-by: Vlad Buslov <vladbu@nvidia.com>
> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> ---
>  include/net/dst_metadata.h | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> index 14efa0ded75d..c8f8b7b56bba 100644
> --- a/include/net/dst_metadata.h
> +++ b/include/net/dst_metadata.h
> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
>  static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>  {
>  	struct metadata_dst *md_dst = skb_metadata_dst(skb);
> -	int md_size;
>  	struct metadata_dst *new_md;
> +	int md_size, ret;
>  
>  	if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
>  		return ERR_PTR(-EINVAL);
> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>  
>  	memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
>  	       sizeof(struct ip_tunnel_info) + md_size);
> +#ifdef CONFIG_DST_CACHE
> +	ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> +	if (ret) {
> +		/* We can't call metadata_dst_free directly as the still shared
> +		 * dst cache would be released.
> +		 */
> +		kfree(new_md);

I think here you can use metadata_dst_free(): if dst_cache_init fails,
the dst_cache will be zeroed.

> +		return ERR_PTR(ret);
> +	}
> +#endif
> +
>  	skb_dst_drop(skb);
>  	dst_hold(&new_md->dst);
>  	skb_dst_set(skb, &new_md->dst);

Other than that LGTM, thanks!

/P


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 14:24   ` Paolo Abeni
@ 2022-02-02 14:29     ` Antoine Tenart
  0 siblings, 0 replies; 10+ messages in thread
From: Antoine Tenart @ 2022-02-02 14:29 UTC (permalink / raw)
  To: Paolo Abeni, davem, kuba; +Cc: netdev, vladbu, pshelar

Quoting Paolo Abeni (2022-02-02 15:24:51)
> On Wed, 2022-02-02 at 12:01 +0100, Antoine Tenart wrote:
> >       memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> >              sizeof(struct ip_tunnel_info) + md_size);
> > +#ifdef CONFIG_DST_CACHE
> > +     ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> > +     if (ret) {
> > +             /* We can't call metadata_dst_free directly as the still shared
> > +              * dst cache would be released.
> > +              */
> > +             kfree(new_md);
> 
> I think here you can use metadata_dst_free(): if dst_cache_init fails,
> the dst_cache will be zeroed.

You're right, I'll use it in v2.

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-02 13:44     ` Antoine Tenart
@ 2022-02-04 12:33       ` Daniel Borkmann
  2022-02-04 16:19         ` Antoine Tenart
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Borkmann @ 2022-02-04 12:33 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba; +Cc: netdev, vladbu, pabeni, pshelar, wenxu

On 2/2/22 2:44 PM, Antoine Tenart wrote:
> Quoting Daniel Borkmann (2022-02-02 13:13:30)
>> On 2/2/22 12:01 PM, Antoine Tenart wrote:
>>> When uncloning an skb dst and its associated metadata a new dst+metadata
>>> is allocated and the tunnel information from the old metadata is copied
>>> over there.
>>>
>>> The issue is the tunnel metadata has references to cached dst, which are
>>> copied along the way. When a dst+metadata refcount drops to 0 the
>>> metadata is freed including the cached dst entries. As they are also
>>> referenced in the initial dst+metadata, this ends up in UaFs.
>>>
>>> In practice the above did not happen because of another issue, the
>>> dst+metadata was never freed because its refcount never dropped to 0
>>> (this will be fixed in a subsequent patch).
>>>
>>> Fix this by initializing the dst cache after copying the tunnel
>>> information from the old metadata to also unshare the dst cache.
>>>
>>> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
>>> Cc: Paolo Abeni <pabeni@redhat.com>
>>> Reported-by: Vlad Buslov <vladbu@nvidia.com>
>>> Tested-by: Vlad Buslov <vladbu@nvidia.com>
>>> Signed-off-by: Antoine Tenart <atenart@kernel.org>
>>> ---
>>>    include/net/dst_metadata.h | 13 ++++++++++++-
>>>    1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
>>> index 14efa0ded75d..c8f8b7b56bba 100644
>>> --- a/include/net/dst_metadata.h
>>> +++ b/include/net/dst_metadata.h
>>> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
>>>    static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>>>    {
>>>        struct metadata_dst *md_dst = skb_metadata_dst(skb);
>>> -     int md_size;
>>>        struct metadata_dst *new_md;
>>> +     int md_size, ret;
>>>    
>>>        if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
>>>                return ERR_PTR(-EINVAL);
>>> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
>>>    
>>>        memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
>>>               sizeof(struct ip_tunnel_info) + md_size);
>>> +#ifdef CONFIG_DST_CACHE
>>> +     ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
>>> +     if (ret) {
>>> +             /* We can't call metadata_dst_free directly as the still shared
>>> +              * dst cache would be released.
>>> +              */
>>> +             kfree(new_md);
>>> +             return ERR_PTR(ret);
>>> +     }
>>> +#endif
>>
>> Could you elaborate (e.g. also in commit message) how this interacts
>> or whether it is needed for TUNNEL_NOCACHE users? (Among others,
>> latter is used by BPF, for example.)
> 
> My understanding is that TUNNEL_NOCACHE is used to decide whether or not
> to use a dst cache, that might or might not come from the tunnel info
> attached to an skb. The dst cache being allocated in a tunnel info is
> orthogonal to the use of TUNNEL_NOCACHE. While looking around I actually
> found a code path explicitly setting both, in nft_tunnel_obj_init (that
> might need to be investigated though but it is another topic).

Good point, this is coming from 3e511d5652ce ("netfilter: nft_tunnel: Add dst_cache
support") and was added only after af308b94a2a4 ("netfilter: nf_tables: add tunnel
support") which initially indicated TUNNEL_NOCACHE. This is indeed contradictory.
wenxu (+Cc), ptal.

> It doesn't look like initializing the dst cache would break
> TUNNEL_NOCACHE users as ip_tunnel_dst_cache_usable would return false
> anyway. Having said that, we probably want to unshare the dst cache only
> if there is one already, checking for
> 'md_dst->u.tun_info.dst_cache.cache != NULL' first.

Meaning, if that is the case, we wouldn't require the dst_cache_init() and thus
extra alloc, right? Would make sense afaics. db3c6139e6ea ("bpf, vxlan, geneve,
gre: fix usage of dst_cache on xmit") had some details related to BPF use.

Thanks again!
Daniel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
  2022-02-04 12:33       ` Daniel Borkmann
@ 2022-02-04 16:19         ` Antoine Tenart
  0 siblings, 0 replies; 10+ messages in thread
From: Antoine Tenart @ 2022-02-04 16:19 UTC (permalink / raw)
  To: Daniel Borkmann, davem, kuba; +Cc: netdev, vladbu, pabeni, pshelar, wenxu

Quoting Daniel Borkmann (2022-02-04 13:33:20)
> On 2/2/22 2:44 PM, Antoine Tenart wrote:
> > Quoting Daniel Borkmann (2022-02-02 13:13:30)
> >> On 2/2/22 12:01 PM, Antoine Tenart wrote:
> >>> When uncloning an skb dst and its associated metadata a new dst+metadata
> >>> is allocated and the tunnel information from the old metadata is copied
> >>> over there.
> >>>
> >>> The issue is the tunnel metadata has references to cached dst, which are
> >>> copied along the way. When a dst+metadata refcount drops to 0 the
> >>> metadata is freed including the cached dst entries. As they are also
> >>> referenced in the initial dst+metadata, this ends up in UaFs.
> >>>
> >>> In practice the above did not happen because of another issue, the
> >>> dst+metadata was never freed because its refcount never dropped to 0
> >>> (this will be fixed in a subsequent patch).
> >>>
> >>> Fix this by initializing the dst cache after copying the tunnel
> >>> information from the old metadata to also unshare the dst cache.
> >>>
> >>> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> >>> Cc: Paolo Abeni <pabeni@redhat.com>
> >>> Reported-by: Vlad Buslov <vladbu@nvidia.com>
> >>> Tested-by: Vlad Buslov <vladbu@nvidia.com>
> >>> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> >>> ---
> >>>    include/net/dst_metadata.h | 13 ++++++++++++-
> >>>    1 file changed, 12 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> >>> index 14efa0ded75d..c8f8b7b56bba 100644
> >>> --- a/include/net/dst_metadata.h
> >>> +++ b/include/net/dst_metadata.h
> >>> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
> >>>    static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >>>    {
> >>>        struct metadata_dst *md_dst = skb_metadata_dst(skb);
> >>> -     int md_size;
> >>>        struct metadata_dst *new_md;
> >>> +     int md_size, ret;
> >>>    
> >>>        if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
> >>>                return ERR_PTR(-EINVAL);
> >>> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >>>    
> >>>        memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> >>>               sizeof(struct ip_tunnel_info) + md_size);
> >>> +#ifdef CONFIG_DST_CACHE
> >>> +     ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> >>> +     if (ret) {
> >>> +             /* We can't call metadata_dst_free directly as the still shared
> >>> +              * dst cache would be released.
> >>> +              */
> >>> +             kfree(new_md);
> >>> +             return ERR_PTR(ret);
> >>> +     }
> >>> +#endif
> >>
> >> Could you elaborate (e.g. also in commit message) how this interacts
> >> or whether it is needed for TUNNEL_NOCACHE users? (Among others,
> >> latter is used by BPF, for example.)
> > 
> > My understanding is that TUNNEL_NOCACHE is used to decide whether or not
> > to use a dst cache, that might or might not come from the tunnel info
> > attached to an skb. The dst cache being allocated in a tunnel info is
> > orthogonal to the use of TUNNEL_NOCACHE. While looking around I actually
> > found a code path explicitly setting both, in nft_tunnel_obj_init (that
> > might need to be investigated though but it is another topic).
> 
> Good point, this is coming from 3e511d5652ce ("netfilter: nft_tunnel: Add dst_cache
> support") and was added only after af308b94a2a4 ("netfilter: nf_tables: add tunnel
> support") which initially indicated TUNNEL_NOCACHE. This is indeed contradictory.
> wenxu (+Cc), ptal.
> 
> > It doesn't look like initializing the dst cache would break
> > TUNNEL_NOCACHE users as ip_tunnel_dst_cache_usable would return false
> > anyway. Having said that, we probably want to unshare the dst cache only
> > if there is one already, checking for
> > 'md_dst->u.tun_info.dst_cache.cache != NULL' first.
> 
> Meaning, if that is the case, we wouldn't require the dst_cache_init()
> and thus extra alloc, right? Would make sense afaics.

Meaning:

          memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
                 sizeof(struct ip_tunnel_info) + md_size);
  +#ifdef CONFIG_DST_CACHE
  +       if (new_md->u.tun_info.dst_cache.cache) {
  +               int ret = dst_cache_init(&new_md->u.tun_info.dst_cache,
  +                                        GFP_ATOMIC);
  +               if (ret) {
  +                       metadata_dst_free(new_md);
  +                       return ERR_PTR(ret);
  +               }
  +       }
  +#endif

So that the cache is unshared only if there was one in the first place.
If there was no cache to unshare, we can save the extra alloc.

> db3c6139e6ea ("bpf, vxlan, geneve, gre: fix usage of dst_cache on
> xmit") had some details related to BPF use.

With the above commit if TUNNEL_NOCACHE is set the tunnel cache
shouldn't be used, regardless of it being allocated. I guess with that
and the above, we should be good.

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-02-04 16:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-02 11:01 [PATCH net 0/2] net: fix issues when uncloning an skb dst+metadata Antoine Tenart
2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
2022-02-02 11:17   ` Antoine Tenart
2022-02-02 12:13   ` Daniel Borkmann
2022-02-02 13:44     ` Antoine Tenart
2022-02-04 12:33       ` Daniel Borkmann
2022-02-04 16:19         ` Antoine Tenart
2022-02-02 14:24   ` Paolo Abeni
2022-02-02 14:29     ` Antoine Tenart
2022-02-02 11:01 ` [PATCH net 2/2] net: fix a memleak " Antoine Tenart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.