* [PATCH 0/1] ipset patch for nf @ 2016-02-24 20:19 Jozsef Kadlecsik 2016-02-24 20:19 ` [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel Jozsef Kadlecsik 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2016-02-24 20:19 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Hi Pablo, Please apply the next patch against the nf tree: - Deniz Eren reported that parallel flush/dump of list:set type of sets can lead to kernel crash. The bug was due to non-RCU compatible flushing, listing in the set type, fixed by me. The following changes since commit 5cc6ce9ff27565949a1001a2889a8dd9fd09e772: netfilter: nft_counter: fix erroneous return values (2016-02-08 13:05:02 +0100) are available in the git repository at: git://blackhole.kfki.hu/nf master for you to fetch changes up to 45040978c8994d1401baf5cc5ac71c1495d4e120: netfilter: ipset: Fix set:list type crash when flush/dump set in parallel (2016-02-24 20:32:21 +0100) ---------------------------------------------------------------- Jozsef Kadlecsik (1): netfilter: ipset: Fix set:list type crash when flush/dump set in parallel net/netfilter/ipset/ip_set_core.c | 3 +++ net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++++++++-------------------------- 2 files changed, 28 insertions(+), 30 deletions(-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel 2016-02-24 20:19 [PATCH 0/1] ipset patch for nf Jozsef Kadlecsik @ 2016-02-24 20:19 ` Jozsef Kadlecsik 2016-02-29 12:22 ` Pablo Neira Ayuso 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2016-02-24 20:19 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Flushing/listing entries was not RCU safe, so parallel flush/dump could lead to kernel crash. Bug reported by Deniz Eren. Fixes netfilter bugzilla id #1050. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> --- net/netfilter/ipset/ip_set_core.c | 3 ++ net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++------------------- 2 files changed, 28 insertions(+), 30 deletions(-) diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 95db43f..7e6568c 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -985,6 +985,9 @@ static int ip_set_destroy(struct net *net, struct sock *ctnl, if (unlikely(protocol_failed(attr))) return -IPSET_ERR_PROTOCOL; + /* Must wait for flush to be really finished in list:set */ + rcu_barrier(); + /* Commands are serialized and references are * protected by the ip_set_ref_lock. * External systems (i.e. xt_set) must call diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c index bbede95..24c6c19 100644 --- a/net/netfilter/ipset/ip_set_list_set.c +++ b/net/netfilter/ipset/ip_set_list_set.c @@ -30,6 +30,7 @@ MODULE_ALIAS("ip_set_list:set"); struct set_elem { struct rcu_head rcu; struct list_head list; + struct ip_set *set; /* Sigh, in order to cleanup reference */ ip_set_id_t id; } __aligned(__alignof__(u64)); @@ -151,30 +152,29 @@ list_set_kadt(struct ip_set *set, const struct sk_buff *skb, /* Userspace interfaces: we are protected by the nfnl mutex */ static void -__list_set_del(struct ip_set *set, struct set_elem *e) +__list_set_del_rcu(struct rcu_head * rcu) { + struct set_elem *e = container_of(rcu, struct set_elem, rcu); + struct ip_set *set = e->set; struct list_set *map = set->data; ip_set_put_byindex(map->net, e->id); - /* We may call it, because we don't have a to be destroyed - * extension which is used by the kernel. - */ ip_set_ext_destroy(set, e); - kfree_rcu(e, rcu); + kfree(e); } static inline void list_set_del(struct ip_set *set, struct set_elem *e) { list_del_rcu(&e->list); - __list_set_del(set, e); + call_rcu(&e->rcu, __list_set_del_rcu); } static inline void -list_set_replace(struct ip_set *set, struct set_elem *e, struct set_elem *old) +list_set_replace(struct set_elem *e, struct set_elem *old) { list_replace_rcu(&old->list, &e->list); - __list_set_del(set, old); + call_rcu(&old->rcu, __list_set_del_rcu); } static void @@ -244,9 +244,6 @@ list_set_uadd(struct ip_set *set, void *value, const struct ip_set_ext *ext, struct set_elem *e, *n, *prev, *next; bool flag_exist = flags & IPSET_FLAG_EXIST; - if (SET_WITH_TIMEOUT(set)) - set_cleanup_entries(set); - /* Find where to add the new entry */ n = prev = next = NULL; list_for_each_entry(e, &map->members, list) { @@ -301,10 +298,11 @@ list_set_uadd(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (!e) return -ENOMEM; e->id = d->id; + e->set = set; INIT_LIST_HEAD(&e->list); list_set_init_extensions(set, ext, e); if (n) - list_set_replace(set, e, n); + list_set_replace(e, n); else if (next) list_add_tail_rcu(&e->list, &next->list); else if (prev) @@ -431,6 +429,7 @@ list_set_destroy(struct ip_set *set) if (SET_WITH_TIMEOUT(set)) del_timer_sync(&map->gc); + list_for_each_entry_safe(e, n, &map->members, list) { list_del(&e->list); ip_set_put_byindex(map->net, e->id); @@ -450,8 +449,10 @@ list_set_head(struct ip_set *set, struct sk_buff *skb) struct set_elem *e; u32 n = 0; - list_for_each_entry(e, &map->members, list) + rcu_read_lock(); + list_for_each_entry_rcu(e, &map->members, list) n++; + rcu_read_unlock(); nested = ipset_nest_start(skb, IPSET_ATTR_DATA); if (!nested) @@ -483,33 +484,25 @@ list_set_list(const struct ip_set *set, atd = ipset_nest_start(skb, IPSET_ATTR_ADT); if (!atd) return -EMSGSIZE; - list_for_each_entry(e, &map->members, list) { - if (i == first) - break; - i++; - } rcu_read_lock(); - list_for_each_entry_from(e, &map->members, list) { - i++; - if (SET_WITH_TIMEOUT(set) && - ip_set_timeout_expired(ext_timeout(e, set))) + list_for_each_entry_rcu(e, &map->members, list) { + if (i < first || + (SET_WITH_TIMEOUT(set) && + ip_set_timeout_expired(ext_timeout(e, set)))) { + i++; continue; + } nested = ipset_nest_start(skb, IPSET_ATTR_DATA); - if (!nested) { - if (i == first) { - nla_nest_cancel(skb, atd); - ret = -EMSGSIZE; - goto out; - } + if (!nested) goto nla_put_failure; - } if (nla_put_string(skb, IPSET_ATTR_NAME, ip_set_name_byindex(map->net, e->id))) goto nla_put_failure; if (ip_set_put_extensions(skb, set, e, true)) goto nla_put_failure; ipset_nest_end(skb, nested); + i++; } ipset_nest_end(skb, atd); @@ -520,10 +513,12 @@ list_set_list(const struct ip_set *set, nla_put_failure: nla_nest_cancel(skb, nested); if (unlikely(i == first)) { + nla_nest_cancel(skb, atd); cb->args[IPSET_CB_ARG0] = 0; ret = -EMSGSIZE; + } else { + cb->args[IPSET_CB_ARG0] = i; } - cb->args[IPSET_CB_ARG0] = i - 1; ipset_nest_end(skb, atd); out: rcu_read_unlock(); -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel 2016-02-24 20:19 ` [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel Jozsef Kadlecsik @ 2016-02-29 12:22 ` Pablo Neira Ayuso 2016-02-29 12:47 ` Jozsef Kadlecsik 0 siblings, 1 reply; 15+ messages in thread From: Pablo Neira Ayuso @ 2016-02-29 12:22 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel On Wed, Feb 24, 2016 at 09:19:26PM +0100, Jozsef Kadlecsik wrote: > Flushing/listing entries was not RCU safe, so parallel flush/dump > could lead to kernel crash. Bug reported by Deniz Eren. > > Fixes netfilter bugzilla id #1050. > > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> > --- > net/netfilter/ipset/ip_set_core.c | 3 ++ > net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++------------------- > 2 files changed, 28 insertions(+), 30 deletions(-) > > diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c > index 95db43f..7e6568c 100644 > --- a/net/netfilter/ipset/ip_set_core.c > +++ b/net/netfilter/ipset/ip_set_core.c > @@ -985,6 +985,9 @@ static int ip_set_destroy(struct net *net, struct sock *ctnl, > if (unlikely(protocol_failed(attr))) > return -IPSET_ERR_PROTOCOL; > > + /* Must wait for flush to be really finished in list:set */ > + rcu_barrier(); Jozsef, are you sure you need this rcu_barrier()? This is waiting for rcu callback completion (ie. decrement the set reference counter and releasing the extension and object itself). The rcu read side should be safe when accessing old copies from the dumping path when using call_rcu(). Let me know, thanks! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel 2016-02-29 12:22 ` Pablo Neira Ayuso @ 2016-02-29 12:47 ` Jozsef Kadlecsik 2016-03-05 12:32 ` Pablo Neira Ayuso 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2016-02-29 12:47 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel On Mon, 29 Feb 2016, Pablo Neira Ayuso wrote: > On Wed, Feb 24, 2016 at 09:19:26PM +0100, Jozsef Kadlecsik wrote: > > Flushing/listing entries was not RCU safe, so parallel flush/dump > > could lead to kernel crash. Bug reported by Deniz Eren. > > > > Fixes netfilter bugzilla id #1050. > > > > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> > > --- > > net/netfilter/ipset/ip_set_core.c | 3 ++ > > net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++------------------- > > 2 files changed, 28 insertions(+), 30 deletions(-) > > > > diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c > > index 95db43f..7e6568c 100644 > > --- a/net/netfilter/ipset/ip_set_core.c > > +++ b/net/netfilter/ipset/ip_set_core.c > > @@ -985,6 +985,9 @@ static int ip_set_destroy(struct net *net, struct sock *ctnl, > > if (unlikely(protocol_failed(attr))) > > return -IPSET_ERR_PROTOCOL; > > > > + /* Must wait for flush to be really finished in list:set */ > > + rcu_barrier(); > > Jozsef, are you sure you need this rcu_barrier()? > > This is waiting for rcu callback completion (ie. decrement the set > reference counter and releasing the extension and object itself). Yes, that's exactly what it's waiting for, in the case of sets which are elements in set:list type of sets. > The rcu read side should be safe when accessing old copies from the > dumping path when using call_rcu(). Three competing actions play role here: flush/delete, destroy, dumping. The rcu_barrier() above waits for the flush/delete actions. The rcu read side has access to the set id only and must lookup the set by id in order to get the name, which is protected by a reference counter. The function ip_set_name_byindex() intentionally checks that the reference counter cannot be equal to zero and I believe it's an important internal sanity checking and I don't want to remove it. The problem was that the reference counter of the set was first decremented at flush/delete and so we could get a BUG_ON() crash in ip_set_name_byindex() when an ongoing dumping was processed. So the reference counter must be released last for a successful parallel dumping. If rcu_barrier() was not added to the destroy path, a "destroy" immediately following a "flush" can lead to the error "set is in use, cannot delete". I tested it and couldn't find a simpler way. It is the destroy path for sets, so it should not be performance critical. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel 2016-02-29 12:47 ` Jozsef Kadlecsik @ 2016-03-05 12:32 ` Pablo Neira Ayuso 2016-03-08 19:27 ` Jozsef Kadlecsik 0 siblings, 1 reply; 15+ messages in thread From: Pablo Neira Ayuso @ 2016-03-05 12:32 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel Hi Jozsef, On Mon, Feb 29, 2016 at 01:47:59PM +0100, Jozsef Kadlecsik wrote: > On Mon, 29 Feb 2016, Pablo Neira Ayuso wrote: > > > On Wed, Feb 24, 2016 at 09:19:26PM +0100, Jozsef Kadlecsik wrote: > > > Flushing/listing entries was not RCU safe, so parallel flush/dump > > > could lead to kernel crash. Bug reported by Deniz Eren. > > > > > > Fixes netfilter bugzilla id #1050. > > > > > > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> > > > --- > > > net/netfilter/ipset/ip_set_core.c | 3 ++ > > > net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++------------------- > > > 2 files changed, 28 insertions(+), 30 deletions(-) > > > > > > diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c > > > index 95db43f..7e6568c 100644 > > > --- a/net/netfilter/ipset/ip_set_core.c > > > +++ b/net/netfilter/ipset/ip_set_core.c > > > @@ -985,6 +985,9 @@ static int ip_set_destroy(struct net *net, struct sock *ctnl, > > > if (unlikely(protocol_failed(attr))) > > > return -IPSET_ERR_PROTOCOL; > > > > > > + /* Must wait for flush to be really finished in list:set */ > > > + rcu_barrier(); > > > > Jozsef, are you sure you need this rcu_barrier()? > > > > This is waiting for rcu callback completion (ie. decrement the set > > reference counter and releasing the extension and object itself). > > Yes, that's exactly what it's waiting for, in the case of sets which are > elements in set:list type of sets. > > > The rcu read side should be safe when accessing old copies from the > > dumping path when using call_rcu(). > > Three competing actions play role here: flush/delete, destroy, dumping. > The rcu_barrier() above waits for the flush/delete actions. The rcu read > side has access to the set id only and must lookup the set by id in order > to get the name, which is protected by a reference counter. The function > ip_set_name_byindex() intentionally checks that the reference counter > cannot be equal to zero and I believe it's an important internal sanity > checking and I don't want to remove it. The problem was that the reference > counter of the set was first decremented at flush/delete and so we could > get a BUG_ON() crash in ip_set_name_byindex() when an ongoing dumping was > processed. So the reference counter must be released last for a successful > parallel dumping. If rcu_barrier() was not added to the destroy path, a > "destroy" immediately following a "flush" can lead to the error "set is in > use, cannot delete". I tested it and couldn't find a simpler way. ip_set_destroy() calls ip_set_destroy_set() which releases the set object and its content without waiting for the rcu grace period. I think readers (dump path) may still be walking on the set by when you're releasing this object. Our nfnetlink dump path is lockless (rcu read side protected), so we unpublish the set object and release objects via rcu callback when operating from the control plane. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel 2016-03-05 12:32 ` Pablo Neira Ayuso @ 2016-03-08 19:27 ` Jozsef Kadlecsik 2016-03-10 18:27 ` Pablo Neira Ayuso 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2016-03-08 19:27 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel Hi Pablo, On Sat, 5 Mar 2016, Pablo Neira Ayuso wrote: > On Mon, Feb 29, 2016 at 01:47:59PM +0100, Jozsef Kadlecsik wrote: > > On Mon, 29 Feb 2016, Pablo Neira Ayuso wrote: > > > > > On Wed, Feb 24, 2016 at 09:19:26PM +0100, Jozsef Kadlecsik wrote: > > > > Flushing/listing entries was not RCU safe, so parallel flush/dump > > > > could lead to kernel crash. Bug reported by Deniz Eren. > > > > > > > > Fixes netfilter bugzilla id #1050. > > > > > > > > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> > > > > --- > > > > net/netfilter/ipset/ip_set_core.c | 3 ++ > > > > net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++------------------- > > > > 2 files changed, 28 insertions(+), 30 deletions(-) > > > > > > > > diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c > > > > index 95db43f..7e6568c 100644 > > > > --- a/net/netfilter/ipset/ip_set_core.c > > > > +++ b/net/netfilter/ipset/ip_set_core.c > > > > @@ -985,6 +985,9 @@ static int ip_set_destroy(struct net *net, struct sock *ctnl, > > > > if (unlikely(protocol_failed(attr))) > > > > return -IPSET_ERR_PROTOCOL; > > > > > > > > + /* Must wait for flush to be really finished in list:set */ > > > > + rcu_barrier(); > > > > > > Jozsef, are you sure you need this rcu_barrier()? > > > > > > This is waiting for rcu callback completion (ie. decrement the set > > > reference counter and releasing the extension and object itself). > > > > Yes, that's exactly what it's waiting for, in the case of sets which are > > elements in set:list type of sets. > > > > > The rcu read side should be safe when accessing old copies from the > > > dumping path when using call_rcu(). > > > > Three competing actions play role here: flush/delete, destroy, dumping. > > The rcu_barrier() above waits for the flush/delete actions. The rcu read > > side has access to the set id only and must lookup the set by id in order > > to get the name, which is protected by a reference counter. The function > > ip_set_name_byindex() intentionally checks that the reference counter > > cannot be equal to zero and I believe it's an important internal sanity > > checking and I don't want to remove it. The problem was that the reference > > counter of the set was first decremented at flush/delete and so we could > > get a BUG_ON() crash in ip_set_name_byindex() when an ongoing dumping was > > processed. So the reference counter must be released last for a successful > > parallel dumping. If rcu_barrier() was not added to the destroy path, a > > "destroy" immediately following a "flush" can lead to the error "set is in > > use, cannot delete". I tested it and couldn't find a simpler way. > > ip_set_destroy() calls ip_set_destroy_set() which releases the set > object and its content without waiting for the rcu grace period. I > think readers (dump path) may still be walking on the set by when > you're releasing this object. > > Our nfnetlink dump path is lockless (rcu read side protected), so we > unpublish the set object and release objects via rcu callback when > operating from the control plane. Yes, but we are dealing with flush and then destroy, when the set we are flushing is a list:set type of set where the elements are sets. It is not possible to unpublish the flushed objects, because those are totally valid sets. Before I sent my patch I tested it without rcu_barrier() and the destroy command failed (quite rightly) because 'Set is in use' - destroy has to wait for flush to be completed anyway. After your email I have been thinking on it a lot and couldn't figure out a better way. Would it be OK for you to limit rcu_barrier() to the list:set case with something like: mutex_lock(&module_mutex); struct module *m = find_module("ip_set_list_set"); if (m && module_refcount(m) > 0) rcu_barrier(); mutex_unlock(&module_mutex); Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel 2016-03-08 19:27 ` Jozsef Kadlecsik @ 2016-03-10 18:27 ` Pablo Neira Ayuso 0 siblings, 0 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2016-03-10 18:27 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel On Tue, Mar 08, 2016 at 08:27:30PM +0100, Jozsef Kadlecsik wrote: > Hi Pablo, > > On Sat, 5 Mar 2016, Pablo Neira Ayuso wrote: > > > On Mon, Feb 29, 2016 at 01:47:59PM +0100, Jozsef Kadlecsik wrote: > > > On Mon, 29 Feb 2016, Pablo Neira Ayuso wrote: > > > > > > > On Wed, Feb 24, 2016 at 09:19:26PM +0100, Jozsef Kadlecsik wrote: > > > > > Flushing/listing entries was not RCU safe, so parallel flush/dump > > > > > could lead to kernel crash. Bug reported by Deniz Eren. > > > > > > > > > > Fixes netfilter bugzilla id #1050. > > > > > > > > > > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> > > > > > --- > > > > > net/netfilter/ipset/ip_set_core.c | 3 ++ > > > > > net/netfilter/ipset/ip_set_list_set.c | 55 ++++++++++++++++------------------- > > > > > 2 files changed, 28 insertions(+), 30 deletions(-) > > > > > > > > > > diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c > > > > > index 95db43f..7e6568c 100644 > > > > > --- a/net/netfilter/ipset/ip_set_core.c > > > > > +++ b/net/netfilter/ipset/ip_set_core.c > > > > > @@ -985,6 +985,9 @@ static int ip_set_destroy(struct net *net, struct sock *ctnl, > > > > > if (unlikely(protocol_failed(attr))) > > > > > return -IPSET_ERR_PROTOCOL; > > > > > > > > > > + /* Must wait for flush to be really finished in list:set */ > > > > > + rcu_barrier(); > > > > > > > > Jozsef, are you sure you need this rcu_barrier()? > > > > > > > > This is waiting for rcu callback completion (ie. decrement the set > > > > reference counter and releasing the extension and object itself). > > > > > > Yes, that's exactly what it's waiting for, in the case of sets which are > > > elements in set:list type of sets. > > > > > > > The rcu read side should be safe when accessing old copies from the > > > > dumping path when using call_rcu(). > > > > > > Three competing actions play role here: flush/delete, destroy, dumping. > > > The rcu_barrier() above waits for the flush/delete actions. The rcu read > > > side has access to the set id only and must lookup the set by id in order > > > to get the name, which is protected by a reference counter. The function > > > ip_set_name_byindex() intentionally checks that the reference counter > > > cannot be equal to zero and I believe it's an important internal sanity > > > checking and I don't want to remove it. The problem was that the reference > > > counter of the set was first decremented at flush/delete and so we could > > > get a BUG_ON() crash in ip_set_name_byindex() when an ongoing dumping was > > > processed. So the reference counter must be released last for a successful > > > parallel dumping. If rcu_barrier() was not added to the destroy path, a > > > "destroy" immediately following a "flush" can lead to the error "set is in > > > use, cannot delete". I tested it and couldn't find a simpler way. > > > > ip_set_destroy() calls ip_set_destroy_set() which releases the set > > object and its content without waiting for the rcu grace period. I > > think readers (dump path) may still be walking on the set by when > > you're releasing this object. > > > > Our nfnetlink dump path is lockless (rcu read side protected), so we > > unpublish the set object and release objects via rcu callback when > > operating from the control plane. > > Yes, but we are dealing with flush and then destroy, when the set we are > flushing is a list:set type of set where the elements are sets. It is not > possible to unpublish the flushed objects, because those are totally valid > sets. > > Before I sent my patch I tested it without rcu_barrier() and the destroy > command failed (quite rightly) because 'Set is in use' - destroy has to > wait for flush to be completed anyway. OK, thanks for explaining. I don't find any better way at this moment to handle the interdependencies that the set:list type creates with this elements that are actually sets, so this rcu_barrier() guarantees the in-order set releasing that you need not to trigger the bug. Sorry, I misunderstood the goal of this and I just wanted to make sure this rcu_barrier() was not actually hiding a real problem. I have pulled this into the nf-next tree, thanks for your patience. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/1] ipset patch for nf @ 2020-02-18 12:19 Jozsef Kadlecsik 0 siblings, 0 replies; 15+ messages in thread From: Jozsef Kadlecsik @ 2020-02-18 12:19 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso, Hillf Danton Hi Pablo, Please consider to apply the next patch to the nf tree. It's larger than usual, but the issue could not be solved simpler. - Fix "INFO: rcu detected stall in hash_xxx" reports of syzbot by introducing region locking and using workqueue instead of timer based gc of timed out entries in hash types of sets in ipset. Best regards, Jozsef The following changes since commit 83d0585f91da441a0b11bc5ff93f4cda56de6703: Merge branch 'Fix-reconnection-latency-caused-by-FIN-ACK-handling-race' (2020-02-02 13:45:05 -0800) are available in the Git repository at: git://blackhole.kfki.hu/nf f431c76a1f36bcd6bbfd for you to fetch changes up to f431c76a1f36bcd6bbfd668a98127ac60226b86f: netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports (2020-02-18 11:19:06 +0100) ---------------------------------------------------------------- Jozsef Kadlecsik (1): netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports include/linux/netfilter/ipset/ip_set.h | 11 +- net/netfilter/ipset/ip_set_core.c | 34 +- net/netfilter/ipset/ip_set_hash_gen.h | 633 +++++++++++++++++++++++---------- 3 files changed, 472 insertions(+), 206 deletions(-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/1] ipset patch for nf @ 2017-09-23 21:37 Jozsef Kadlecsik 0 siblings, 0 replies; 15+ messages in thread From: Jozsef Kadlecsik @ 2017-09-23 21:37 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Hi Pablo, Please apply the next patch for the nf tree. It fixes the issue when an IP range with more than 2^31 addresses was specified to add (bugzilla id #1005) - Fix adding an IPv4 range containing more than 2^31 addresses, which failed silently due to wrong comparison (bugzilla id #1005). Thanks, Jozsef - The following changes since commit 7f4f7dd4417d9efd038b14d39c70170db2e0baa0: netfilter: ipset: ipset list may return wrong member count for set with timeout (2017-09-18 17:35:32 +0200) are available in the git repository at: git://blackhole.kfki.hu/nf for you to fetch changes up to 625a0411a81ae51c12bde4452171f08174b656cd: Fix adding an IPv4 range containing more than 2^31 addresses (2017-09-23 23:28:13 +0200) ---------------------------------------------------------------- Jozsef Kadlecsik (1): Fix adding an IPv4 range containing more than 2^31 addresses net/netfilter/ipset/ip_set_hash_ip.c | 22 ++++++++++++---------- net/netfilter/ipset/ip_set_hash_ipmark.c | 2 +- net/netfilter/ipset/ip_set_hash_ipport.c | 2 +- net/netfilter/ipset/ip_set_hash_ipportip.c | 2 +- net/netfilter/ipset/ip_set_hash_ipportnet.c | 4 ++-- net/netfilter/ipset/ip_set_hash_net.c | 2 +- net/netfilter/ipset/ip_set_hash_netiface.c | 2 +- net/netfilter/ipset/ip_set_hash_netnet.c | 4 ++-- net/netfilter/ipset/ip_set_hash_netport.c | 2 +- net/netfilter/ipset/ip_set_hash_netportnet.c | 4 ++-- 10 files changed, 24 insertions(+), 22 deletions(-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/1] ipset patch for nf @ 2016-03-16 21:02 Jozsef Kadlecsik 2016-03-21 21:37 ` Pablo Neira Ayuso 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2016-03-16 21:02 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Hi Pablo, Please apply the next patch against the nf tree: - There was a race condition between parallel save/swap and delete, which resulted a kernel crash due to the increase ref for save, swap, wrong ref decrease operations. Reported and fixed by Vishwanath Pai. The patch should be applied to the older stable kernel branches too. Best regards, Jozsef The following changes since commit fe1c3e3f630ef7765f34d9585d6b524899502b3f: netfilter: ipset: Check IPSET_ATTR_ETHER netlink attribute length (2016-03-16 21:43:50 +0100) are available in the git repository at: git://blackhole.kfki.hu/nf master for you to fetch changes up to 17f8a7334f3aee9783cfa6787172240c083ef394: netfilter: ipset: fix race condition in ipset save, swap and delete (2016-03-16 21:49:00 +0100) ---------------------------------------------------------------- Vishwanath Pai (1): netfilter: ipset: fix race condition in ipset save, swap and delete include/linux/netfilter/ipset/ip_set.h | 4 ++++ net/netfilter/ipset/ip_set_bitmap_gen.h | 2 +- net/netfilter/ipset/ip_set_core.c | 33 ++++++++++++++++++++++++++++----- net/netfilter/ipset/ip_set_hash_gen.h | 2 +- net/netfilter/ipset/ip_set_list_set.c | 2 +- 5 files changed, 35 insertions(+), 8 deletions(-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/1] ipset patch for nf 2016-03-16 21:02 Jozsef Kadlecsik @ 2016-03-21 21:37 ` Pablo Neira Ayuso 0 siblings, 0 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2016-03-21 21:37 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel On Wed, Mar 16, 2016 at 10:02:29PM +0100, Jozsef Kadlecsik wrote: > Hi Pablo, > > Please apply the next patch against the nf tree: > > - There was a race condition between parallel save/swap and delete, > which resulted a kernel crash due to the increase ref for save, swap, > wrong ref decrease operations. Reported and fixed by Vishwanath Pai. > > The patch should be applied to the older stable kernel branches too. > > Best regards, > Jozsef > > The following changes since commit fe1c3e3f630ef7765f34d9585d6b524899502b3f: > > netfilter: ipset: Check IPSET_ATTR_ETHER netlink attribute length (2016-03-16 21:43:50 +0100) > > are available in the git repository at: > > git://blackhole.kfki.hu/nf master Pulled, thanks Jozsef. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/1] ipset patch for nf @ 2016-03-08 19:44 Jozsef Kadlecsik 2016-03-10 18:28 ` Pablo Neira Ayuso 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2016-03-08 19:44 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Hi Pablo, Please apply the next patch against the nf tree: - Julia Lawall pointed out that IPSET_ATTR_ETHER netlink attribute length was not checked explicitly. The patch adds the missing checkings. The patch should be applied to the older stable kernel branches too. Best regards, Jozsef The following changes since commit 45040978c8994d1401baf5cc5ac71c1495d4e120: netfilter: ipset: Fix set:list type crash when flush/dump set in parallel (2016-02-24 20:32:21 +0100) are available in the git repository at: git://blackhole.kfki.hu/nf master for you to fetch changes up to d8aacd87180141ff6b812b53de77a4336e87c91a: netfilter: ipset: Check IPSET_ATTR_ETHER netlink attribute length (2016-03-08 20:36:17 +0100) ---------------------------------------------------------------- Jozsef Kadlecsik (1): netfilter: ipset: Check IPSET_ATTR_ETHER netlink attribute length net/netfilter/ipset/ip_set_bitmap_ipmac.c | 2 ++ net/netfilter/ipset/ip_set_hash_mac.c | 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/1] ipset patch for nf 2016-03-08 19:44 Jozsef Kadlecsik @ 2016-03-10 18:28 ` Pablo Neira Ayuso 0 siblings, 0 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2016-03-10 18:28 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel On Tue, Mar 08, 2016 at 08:44:19PM +0100, Jozsef Kadlecsik wrote: > Hi Pablo, > > Please apply the next patch against the nf tree: > > - Julia Lawall pointed out that IPSET_ATTR_ETHER netlink attribute > length was not checked explicitly. The patch adds the missing > checkings. > > The patch should be applied to the older stable kernel branches too. > > Best regards, > Jozsef > > The following changes since commit 45040978c8994d1401baf5cc5ac71c1495d4e120: > > netfilter: ipset: Fix set:list type crash when flush/dump set in parallel (2016-02-24 20:32:21 +0100) > > are available in the git repository at: > > git://blackhole.kfki.hu/nf master This patch also came with the previous pull, so this will show up in the nf-next tree too. Thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/1] ipset patch for nf @ 2015-08-28 17:15 Jozsef Kadlecsik 2015-08-28 23:03 ` Pablo Neira Ayuso 0 siblings, 1 reply; 15+ messages in thread From: Jozsef Kadlecsik @ 2015-08-28 17:15 UTC (permalink / raw) To: netfilter-devel; +Cc: Pablo Neira Ayuso Hi Pablo, Please apply the next bugfix patch against the nf tree. - Dave Jones reported that KASan detected out of bounds access in hash:net* types, which is fixed in the next patch Thanks, Jozsef The following changes since commit 18e1db67e93ed75d9dc0d34c8d783ccf10547c2b: netfilter: bridge: fix IPv6 packets not being bridged with CONFIG_IPV6=n (2015-08-19 21:21:41 +0200) are available in the git repository at: git://blackhole.kfki.hu/nf master for you to fetch changes up to 6fe7ccfd77415a6ba250c10c580eb3f9acf79753: netfilter: ipset: Out of bound access in hash:net* types fixed (2015-08-28 18:51:30 +0200) ---------------------------------------------------------------- Jozsef Kadlecsik (1): netfilter: ipset: Out of bound access in hash:net* types fixed net/netfilter/ipset/ip_set_hash_gen.h | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) kadlec@blackhole ~/git/nf $ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/1] ipset patch for nf 2015-08-28 17:15 Jozsef Kadlecsik @ 2015-08-28 23:03 ` Pablo Neira Ayuso 0 siblings, 0 replies; 15+ messages in thread From: Pablo Neira Ayuso @ 2015-08-28 23:03 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel On Fri, Aug 28, 2015 at 07:15:51PM +0200, Jozsef Kadlecsik wrote: > Hi Pablo, > > Please apply the next bugfix patch against the nf tree. > > - Dave Jones reported that KASan detected out of bounds access in hash:net* > types, which is fixed in the next patch > > Thanks, > Jozsef > > The following changes since commit 18e1db67e93ed75d9dc0d34c8d783ccf10547c2b: > > netfilter: bridge: fix IPv6 packets not being bridged with CONFIG_IPV6=n (2015-08-19 21:21:41 +0200) > > are available in the git repository at: > > git://blackhole.kfki.hu/nf master Pulled into nf, thanks Jozsef. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-02-18 12:19 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-02-24 20:19 [PATCH 0/1] ipset patch for nf Jozsef Kadlecsik 2016-02-24 20:19 ` [PATCH 1/1] netfilter: ipset: Fix set:list type crash when flush/dump set in parallel Jozsef Kadlecsik 2016-02-29 12:22 ` Pablo Neira Ayuso 2016-02-29 12:47 ` Jozsef Kadlecsik 2016-03-05 12:32 ` Pablo Neira Ayuso 2016-03-08 19:27 ` Jozsef Kadlecsik 2016-03-10 18:27 ` Pablo Neira Ayuso -- strict thread matches above, loose matches on Subject: below -- 2020-02-18 12:19 [PATCH 0/1] ipset patch for nf Jozsef Kadlecsik 2017-09-23 21:37 Jozsef Kadlecsik 2016-03-16 21:02 Jozsef Kadlecsik 2016-03-21 21:37 ` Pablo Neira Ayuso 2016-03-08 19:44 Jozsef Kadlecsik 2016-03-10 18:28 ` Pablo Neira Ayuso 2015-08-28 17:15 Jozsef Kadlecsik 2015-08-28 23:03 ` Pablo Neira Ayuso
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.