From: Daniel Xu <dxu@dxuuu.xyz>
To: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	ppenkov@aviatrix.com
Subject: Re: ip_set_hash_netiface
Date: Thu, 27 Oct 2022 03:06:55 -0600
Message-ID: <20221027090655.r54utor2bkty3m5p@k2>
In-Reply-To: <7fcf3bbb-95d2-a286-e3a-4d4dd87f713a@netfilter.org>

Hi Jozsef,

On Wed, Oct 26, 2022 at 02:26:08PM +0200, Jozsef Kadlecsik wrote:
> Hi Daniel,
> 
> On Tue, 25 Oct 2022, Daniel Xu wrote:
> 
> > I'm following up on our hallway chat yesterday about how ipset
> > hash:net,iface can easily OOM.
> > 
> > Here's a quick reproducer (stolen from
> > https://bugzilla.kernel.org/show_bug.cgi?id=199107):
> > 
> >         $ ipset create ACL.IN.ALL_PERMIT hash:net,iface hashsize 1048576 timeout 0
> >         $ for i in $(seq 0 100); do /sbin/ipset add ACL.IN.ALL_PERMIT 0.0.0.0/0,kaf_$i timeout 0 -exist; done
> > 
> > This used to cause a NULL ptr deref panic before
> > https://github.com/torvalds/linux/commit/2b33d6ffa9e38f344418976b06 .
> > 
> > Now it'll either allocate a huge amount of memory or fail a
> > vmalloc():
> > 
> >         [Tue Oct 25 00:13:08 2022] ipset: vmalloc error: size 1073741848, exceeds total pages
> >         <...>
> >         [Tue Oct 25 00:13:08 2022] Call Trace:
> >         [Tue Oct 25 00:13:08 2022]  <TASK>
> >         [Tue Oct 25 00:13:08 2022]  dump_stack_lvl+0x48/0x60
> >         [Tue Oct 25 00:13:08 2022]  warn_alloc+0x155/0x180
> >         [Tue Oct 25 00:13:08 2022]  __vmalloc_node_range+0x72a/0x760
> >         [Tue Oct 25 00:13:08 2022]  ? hash_netiface4_add+0x7c0/0xb20
> >         [Tue Oct 25 00:13:08 2022]  ? __kmalloc_large_node+0x4a/0x90
> >         [Tue Oct 25 00:13:08 2022]  kvmalloc_node+0xa6/0xd0
> >         [Tue Oct 25 00:13:08 2022]  ? hash_netiface4_resize+0x99/0x710
> >         <...>
> > 
> > Note that this behavior is somewhat documented
> > (https://ipset.netfilter.org/ipset.man.html):
> > 
> > >  The internal restriction of the hash:net,iface set type is that the same
> > >  network prefix cannot be stored with more than 64 different interfaces
> > >  in a single set.
> > 
> > I'm not sure how hard it would be to enforce a limit, but I think it would
> > be better to return an error than to allocate many GBs of memory.
> 
> That's a bug: the limit is not actually enforced, in spite of the
> documentation. The patch below fixes it, and I'm going to submit it to Pablo:
> 
> diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
> index 6e391308431d..3f8853ed32e9 100644
> --- a/net/netfilter/ipset/ip_set_hash_gen.h
> +++ b/net/netfilter/ipset/ip_set_hash_gen.h
> @@ -61,10 +61,6 @@ tune_bucketsize(u8 curr, u32 multi)
>  	 */
>  	return n > curr && n <= AHASH_MAX_TUNED ? n : curr;
>  }
> -#define TUNE_BUCKETSIZE(h, multi)	\
> -	((h)->bucketsize = tune_bucketsize((h)->bucketsize, multi))
> -#else
> -#define TUNE_BUCKETSIZE(h, multi)
>  #endif
>  
>  /* A hash bucket */
> @@ -936,7 +932,11 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
>  		goto set_full;
>  	/* Create a new slot */
>  	if (n->pos >= n->size) {
> -		TUNE_BUCKETSIZE(h, multi);
> +#ifdef IP_SET_HASH_WITH_MULTI
> +		if (h->bucketsize >= AHASH_MAX_TUNED)
> +			goto set_full;
> +		h->bucketsize = tune_bucketsize(h->bucketsize, multi);
> +#endif
>  		if (n->size >= AHASH_MAX(h)) {
>  			/* Trigger rehashing */
>  			mtype_data_next(&h->next, d);
> 
> Best regards,
> Jozsef
> -- 
> E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
> PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
> Address : Wigner Research Centre for Physics
>           H-1525 Budapest 114, POB. 49, Hungary

Thank you!

Daniel
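
For readers following the patch quoted above, here is a minimal userspace
sketch of the behaviour it aims for: once the per-bucket limit has been tuned
up to the cap, further adds of the same prefix are refused instead of forcing
ever larger resizes. This is only a model under stated assumptions, not the
kernel code. The cap of 64 is assumed to match the documented limit (the
value of AHASH_MAX_TUNED is not shown in the thread), the growth step of 2
and the starting bucket size of 12 are illustrative guesses, and
struct bucket_model, try_add() and count_accepted() are hypothetical names.
Only the final return expression of tune_bucketsize() is taken from the
patch context; the rest of its body is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TUNED        64  /* assumed value of AHASH_MAX_TUNED */
#define GROW_STEP         2  /* assumed growth step, illustrative */
#define START_BUCKETSIZE 12  /* assumed initial bucket limit, illustrative */

/* Hypothetical stand-in for one hash bucket holding same-prefix entries. */
struct bucket_model {
	uint8_t  bucketsize; /* current per-bucket element limit */
	uint32_t used;       /* elements stored so far */
};

/* Grow the limit by one step, but never past MAX_TUNED. */
static uint8_t tune_bucketsize(uint8_t curr, uint32_t multi)
{
	uint32_t n;

	if (multi < curr)
		return curr;
	n = curr + GROW_STEP;
	return n > curr && n <= MAX_TUNED ? n : curr;
}

/* Returns 0 if the element was stored, -1 if the bucket is full. */
static int try_add(struct bucket_model *b)
{
	if (b->used >= b->bucketsize) {
		/* The check the patch adds: stop once the cap is reached. */
		if (b->bucketsize >= MAX_TUNED)
			return -1;
		b->bucketsize = tune_bucketsize(b->bucketsize, b->used);
		/* The kernel would trigger a rehash here; the model just refuses. */
		if (b->used >= b->bucketsize)
			return -1;
	}
	b->used++;
	assert(b->used <= b->bucketsize);
	return 0;
}

/* Count how many of `attempts` same-prefix adds the model accepts. */
static int count_accepted(int attempts)
{
	struct bucket_model b = { .bucketsize = START_BUCKETSIZE, .used = 0 };
	int accepted = 0;

	for (int i = 0; i < attempts; i++)
		if (try_add(&b) == 0)
			accepted++;
	return accepted;
}
```

With these assumed constants, count_accepted(101), which mirrors the 101
interfaces of the reproducer (seq 0 100), accepts 64 adds and refuses the
rest.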
