Re: [PATCH v3] net: sched: fix memory leak in tcindex_set_parms

From: Hawkins Jiawei <yin31149@gmail.com>
To: xiyou.wangcong@gmail.com
Cc: 18801353760@163.com, cong.wang@bytedance.com,
	davem@davemloft.net, dvyukov@google.com, edumazet@google.com,
	jhs@mojatatu.com, jiri@resnulli.us, kuba@kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	pabeni@redhat.com,
	syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com,
	syzkaller-bugs@googlegroups.com, yin31149@gmail.com
Subject: Re: [PATCH v3] net: sched: fix memory leak in tcindex_set_parms
Date: Mon,  5 Dec 2022 23:19:56 +0800	[thread overview]
Message-ID: <20221205151956.28422-1-yin31149@gmail.com> (raw)
In-Reply-To: <Y4uvQA2xxtJXltSM@pop-os.localdomain>

On Sun, 4 Dec 2022 at 04:19, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Tue, Nov 29, 2022 at 10:52:49AM +0800, Hawkins Jiawei wrote:
> > Kernel uses tcindex_change() to change an existing
> > filter properties. During the process of changing,
> > kernel uses tcindex_alloc_perfect_hash() to newly
> > allocate filter results, uses tcindex_filter_result_init()
> > to clear the old filter result.
> >
> > Yet the problem is that, kernel clears the old
> > filter result, without destroying its tcf_exts structure,
> > which triggers the above memory leak.
> >
> > Considering that there already extis a tc_filter_wq workqueue
> > to destroy the old tcindex_data by tcindex_partial_destroy_work()
> > at the end of tcindex_set_parms(), this patch solves this memory
> > leak bug by removing this old filter result clearing part,
> > and delegating it to the tc_filter_wq workqueue.
>
> Hmm?? The tcindex_partial_destroy_work() is to destroy 'oldp' which is
> different from 'old_r'. I mean, you seem assuming that struct
> tcindex_filter_result is always from struct tcindex_data, which is not
> true, check the following tcindex_lookup() which retrieves tcindex_filter_result
> from struct tcindex_filter.
>
> static struct tcindex_filter_result *tcindex_lookup(struct tcindex_data *p,
>                                                     u16 key)
> {
>         if (p->perfect) {
>                 struct tcindex_filter_result *f = p->perfect + key;
>
>                 return tcindex_filter_is_set(f) ? f : NULL;
>         } else if (p->h) {
>                 struct tcindex_filter __rcu **fp;
>                 struct tcindex_filter *f;
>
>                 fp = &p->h[key % p->hash];
>                 for (f = rcu_dereference_bh_rtnl(*fp);
>                      f;
>                      fp = &f->next, f = rcu_dereference_bh_rtnl(*fp))
>                         if (f->key == key)
>                                 return &f->result;
>         }
>
>         return NULL;
> }

Oh, thanks for correcting me! You are right, I wrongly assuming that
struct tcindex_filter_result is always from struct tcindex_data
`perfect` field.

But I think this patch still can fix this problem, after reviewing
the tcindex_set_parms(). Because only the `tcindex_filter_result` is
from `struct tcindex_data`, can the code reaches the deleted part
in this patch.

To be more specific, the simplified logic about original
tcindex_set_parms() is as below:

static int
tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
		  u32 handle, struct tcindex_data *p,
		  struct tcindex_filter_result *r, struct nlattr **tb,
		  struct nlattr *est, u32 flags, struct netlink_ext_ack *extack)
{
	...
	if (p->perfect) {
		int i;

		if (tcindex_alloc_perfect_hash(net, cp) < 0)
			goto errout;
		cp->alloc_hash = cp->hash;
		for (i = 0; i < min(cp->hash, p->hash); i++)
			cp->perfect[i].res = p->perfect[i].res;
		balloc = 1;
	}
	cp->h = p->h;

	...

	if (cp->perfect)
		r = cp->perfect + handle;
	else
		r = tcindex_lookup(cp, handle) ? : &new_filter_result;

	if (old_r && old_r != r) {
		err = tcindex_filter_result_init(old_r, cp, net);
		if (err < 0) {
			kfree(f);
			goto errout_alloc;
		}
	}
	...
}

- cp's h field is directly copied from p's h field

- if `old_r` is retrieved from struct tcindex_filter, in other word,
is retrieved from p's h field. Then the `r` should get the same value
from `tcindex_loopup(cp, handle)`.

- so `old_r == r` is true, code will never uses tcindex_filter_result_init()
to clear the old_r in such case.

So I think this patch still can fix this memory leak caused by 
tcindex_filter_result_init(), But maybe I need to improve my
commit message.

Please correct me If I am wrong.

> > diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
> > index 1c9eeb98d826..3f4e7a6cdd96 100644
> > --- a/net/sched/cls_tcindex.c
> > +++ b/net/sched/cls_tcindex.c
> > @@ -478,14 +478,6 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> >               tcf_bind_filter(tp, &cr, base);
> >       }
> > 
> > -     if (old_r && old_r != r) {
> > -             err = tcindex_filter_result_init(old_r, cp, net);
> > -             if (err < 0) {
> > -                     kfree(f);
> > -                     goto errout_alloc;
> > -             }
> > -     }
> > -
>
> Even if your above analysis is correct, 'old_r' becomes unused (set but not used)
> now, I think you should get some compiler warning.

Oh, it actually didn't trigger any compiler warning,
because there is still a used place as below:

static int
tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
		  u32 handle, struct tcindex_data *p,
		  struct tcindex_filter_result *r, struct nlattr **tb,
		  struct nlattr *est, u32 flags, struct netlink_ext_ack *extack)
{
	struct tcindex_filter_result new_filter_result, *old_r = r;
	...
	err = tcindex_filter_result_init(&new_filter_result, cp, net);
	if (err < 0)
		goto errout_alloc;
	if (old_r)
		cr = r->res;
	...
}

But the `old_r` and `r` has the same value here, so we can just replace
the `old_r` with `r` here, and delete the `old_r` as you suggested.

Thanks for your suggestion!

>
> Thanks.