All of lore.kernel.org
 help / color / mirror / Atom feed
From: Antoine Tenart <atenart@kernel.org>
To: Daniel Borkmann <daniel@iogearbox.net>,
	davem@davemloft.net, kuba@kernel.org
Cc: netdev@vger.kernel.org, vladbu@nvidia.com, pabeni@redhat.com,
	pshelar@ovn.org, wenxu@ucloud.cn
Subject: Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
Date: Fri, 04 Feb 2022 17:19:33 +0100	[thread overview]
Message-ID: <164399157371.4980.14890218612337167330@kwain> (raw)
In-Reply-To: <fc060e49-6ddf-0d18-f10d-958425876370@iogearbox.net>

Quoting Daniel Borkmann (2022-02-04 13:33:20)
> On 2/2/22 2:44 PM, Antoine Tenart wrote:
> > Quoting Daniel Borkmann (2022-02-02 13:13:30)
> >> On 2/2/22 12:01 PM, Antoine Tenart wrote:
> >>> When uncloning an skb dst and its associated metadata a new dst+metadata
> >>> is allocated and the tunnel information from the old metadata is copied
> >>> over there.
> >>>
> >>> The issue is the tunnel metadata has references to cached dst, which are
> >>> copied along the way. When a dst+metadata refcount drops to 0 the
> >>> metadata is freed including the cached dst entries. As they are also
> >>> referenced in the initial dst+metadata, this ends up in UaFs.
> >>>
> >>> In practice the above did not happen because of another issue, the
> >>> dst+metadata was never freed because its refcount never dropped to 0
> >>> (this will be fixed in a subsequent patch).
> >>>
> >>> Fix this by initializing the dst cache after copying the tunnel
> >>> information from the old metadata to also unshare the dst cache.
> >>>
> >>> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> >>> Cc: Paolo Abeni <pabeni@redhat.com>
> >>> Reported-by: Vlad Buslov <vladbu@nvidia.com>
> >>> Tested-by: Vlad Buslov <vladbu@nvidia.com>
> >>> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> >>> ---
> >>>    include/net/dst_metadata.h | 13 ++++++++++++-
> >>>    1 file changed, 12 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> >>> index 14efa0ded75d..c8f8b7b56bba 100644
> >>> --- a/include/net/dst_metadata.h
> >>> +++ b/include/net/dst_metadata.h
> >>> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
> >>>    static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >>>    {
> >>>        struct metadata_dst *md_dst = skb_metadata_dst(skb);
> >>> -     int md_size;
> >>>        struct metadata_dst *new_md;
> >>> +     int md_size, ret;
> >>>    
> >>>        if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
> >>>                return ERR_PTR(-EINVAL);
> >>> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >>>    
> >>>        memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> >>>               sizeof(struct ip_tunnel_info) + md_size);
> >>> +#ifdef CONFIG_DST_CACHE
> >>> +     ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> >>> +     if (ret) {
> >>> +             /* We can't call metadata_dst_free directly as the still shared
> >>> +              * dst cache would be released.
> >>> +              */
> >>> +             kfree(new_md);
> >>> +             return ERR_PTR(ret);
> >>> +     }
> >>> +#endif
> >>
> >> Could you elaborate (e.g. also in commit message) how this interacts
> >> or whether it is needed for TUNNEL_NOCACHE users? (Among others,
> >> latter is used by BPF, for example.)
> > 
> > My understanding is that TUNNEL_NOCACHE is used to decide whether or not
> > to use a dst cache, that might or might not come from the tunnel info
> > attached to an skb. The dst cache being allocated in a tunnel info is
> > orthogonal to the use of TUNNEL_NOCACHE. While looking around I actually
> > found a code path explicitly setting both, in nft_tunnel_obj_init (that
> > might need to be investigated though but it is another topic).
> 
> Good point, this is coming from 3e511d5652ce ("netfilter: nft_tunnel: Add dst_cache
> support") and was added only after af308b94a2a4 ("netfilter: nf_tables: add tunnel
> support") which initially indicated TUNNEL_NOCACHE. This is indeed contradictory.
> wenxu (+Cc), ptal.
> 
> > It doesn't look like initializing the dst cache would break
> > TUNNEL_NOCACHE users as ip_tunnel_dst_cache_usable would return false
> > anyway. Having said that, we probably want to unshare the dst cache only
> > if there is one already, checking for
> > 'md_dst->u.tun_info.dst_cache.cache != NULL' first.
> 
> Meaning, if that is the case, we wouldn't require the dst_cache_init()
> and thus extra alloc, right? Would make sense afaics.

Meaning:

          memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
                 sizeof(struct ip_tunnel_info) + md_size);
  +#ifdef CONFIG_DST_CACHE
  +       if (new_md->u.tun_info.dst_cache.cache) {
  +               int ret = dst_cache_init(&new_md->u.tun_info.dst_cache,
  +                                        GFP_ATOMIC);
  +               if (ret) {
  +                       metadata_dst_free(new_md);
  +                       return ERR_PTR(ret);
  +               }
  +       }
  +#endif

So that the cache is unshared only if there was one in the first place.
If there was no cache to unshare, we can save the extra alloc.

> db3c6139e6ea ("bpf, vxlan, geneve, gre: fix usage of dst_cache on
> xmit") had some details related to BPF use.

With the above commit if TUNNEL_NOCACHE is set the tunnel cache
shouldn't be used, regardless of it being allocated. I guess with that
and the above, we should be good.

Thanks!
Antoine

  reply	other threads:[~2022-02-04 16:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-02 11:01 [PATCH net 0/2] net: fix issues when uncloning an skb dst+metadata Antoine Tenart
2022-02-02 11:01 ` [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata Antoine Tenart
2022-02-02 11:17   ` Antoine Tenart
2022-02-02 12:13   ` Daniel Borkmann
2022-02-02 13:44     ` Antoine Tenart
2022-02-04 12:33       ` Daniel Borkmann
2022-02-04 16:19         ` Antoine Tenart [this message]
2022-02-02 14:24   ` Paolo Abeni
2022-02-02 14:29     ` Antoine Tenart
2022-02-02 11:01 ` [PATCH net 2/2] net: fix a memleak " Antoine Tenart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=164399157371.4980.14890218612337167330@kwain \
    --to=atenart@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pshelar@ovn.org \
    --cc=vladbu@nvidia.com \
    --cc=wenxu@ucloud.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.