From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: [PATCH v3 nf-next 1/2] netfilter: x_tables: wait until old table isn't used anymore
Date: Wed, 11 Oct 2017 11:23:28 -0700
Message-ID: 
References: <20171011142607.15026-1-fw@strlen.de>
 <20171011142607.15026-2-fw@strlen.de>
 <20171011174832.GB26835@breakpoint.cc>
 <20171011181803.GC26835@breakpoint.cc>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: netfilter-devel@vger.kernel.org, Dan Williams
To: Florian Westphal
Return-path: 
Received: from mail-ua0-f171.google.com ([209.85.217.171]:52982 "EHLO mail-ua0-f171.google.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752502AbdJKSX3
 (ORCPT ); Wed, 11 Oct 2017 14:23:29 -0400
Received: by mail-ua0-f171.google.com with SMTP id i35so1626445uah.9
 for ; Wed, 11 Oct 2017 11:23:29 -0700 (PDT)
In-Reply-To: <20171011181803.GC26835@breakpoint.cc>
Sender: netfilter-devel-owner@vger.kernel.org
List-ID: 

On Wed, Oct 11, 2017 at 11:18 AM, Florian Westphal wrote:
> Eric Dumazet wrote:
>> On Wed, Oct 11, 2017 at 10:48 AM, Florian Westphal wrote:
>> > Eric Dumazet wrote:
>> >> On Wed, Oct 11, 2017 at 7:26 AM, Florian Westphal wrote:
>> >> > xt_replace_table relies on table replacement counter retrieval (which
>> >> > uses xt_recseq to synchronize pcpu counters).
>> >> >
>> >> > This is fine, however with large rule set get_counters() can take
>> >> > a very long time -- it needs to synchronize all counters because
>> >> > it has to assume concurrent modifications can occur.
>> >> >
>> >> > Make xt_replace_table synchronize by itself by waiting until all cpus
>> >> > had an even seqcount.
>> >> >
>> >> > This allows a followup patch to copy the counters of the old ruleset
>> >> > without any synchronization after xt_replace_table has completed.
>> >> >
>> >> > Cc: Dan Williams
>> >> > Cc: Eric Dumazet
>> >> > Signed-off-by: Florian Westphal
>> >> > ---
>> >> > v3: check for 'seq is uneven' OR 'has changed' since
>> >> >     last check. It's fine if seq is uneven iff it's a different
>> >> >     sequence number than the initial one.
>> >> >
>> >> > v2: fix Eric's email address
>> >> >
>> >> >  net/netfilter/x_tables.c | 18 +++++++++++++++---
>> >> >  1 file changed, 15 insertions(+), 3 deletions(-)
>> >> >
>> >>
>> >> Reviewed-by: Eric Dumazet
>> >>
>> >> But it seems we need an extra smp_wmb() after
>> >>
>> >>         smp_wmb();
>> >>         table->private = newinfo;
>> >> +       smp_wmb();
>> >>
>> >> Otherwise we have no guarantee other cpus actually see the new ->private value.
>> >
>> > Seems to be unrelated to this change, so I will submit
>> > a separate patch for nf.git that adds this.
>>
>> This is related to this change, please read the comment before the
>> local_bh_enable()
>>
>>         /*
>>          * Even though table entries have now been swapped, other CPU's
>>          * may still be using the old entries. This is okay, because
>>          * resynchronization happens because of the locking done
>>          * during the get_counters() routine.
>>          */
>
> Hmm, but get_counters() does not issue a wmb, and the 'new' code added
> here essentially is the same as get_counters(), except that we only
> read seqcount until we saw a change (and not for each counter in
> the rule set).
>
> What am I missing?

Your sync code does nothing useful if we are not sure the new
table->private value is visible.

Without barriers, the compiler/CPU could do this:
+       /* ... so wait for even xt_recseq on all cpus */
+       for_each_possible_cpu(cpu) {
+               seqcount_t *s = &per_cpu(xt_recseq, cpu);
+               u32 seq = raw_read_seqcount(s);
+
+               if (seq & 1) {
+                       do {
+                               cond_resched();
+                               cpu_relax();
+                       } while (seq == raw_read_seqcount(s));
+               }
+       }

 /* finally, write new private value */
 table->private = newinfo;

Basically, your loop is now useless and you could remove it.

So there is definitely a bug.
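
[Editor's note] To make the ordering concern concrete, here is a minimal sketch
(not the merged patch) of the region being discussed, with the extra barrier
Eric proposes placed between the store to table->private and the wait loop.
The helper name example_publish_and_wait() is made up for illustration; the
loop body is taken from the quoted hunk, and locking around the swap is
omitted.

#include <linux/netfilter/x_tables.h>
#include <linux/sched.h>        /* cond_resched() */

/* Sketch only: publish the new ruleset, then wait out in-flight readers. */
static void example_publish_and_wait(struct xt_table *table,
                                     struct xt_table_info *newinfo)
{
        unsigned int cpu;

        /* Ensure contents of newinfo are visible before publishing it. */
        smp_wmb();
        table->private = newinfo;

        /*
         * Extra barrier suggested in this thread: publish the new
         * ->private value before sampling xt_recseq below, so the
         * store cannot be reordered past the wait loop.
         */
        smp_wmb();

        /* ... so wait for even xt_recseq on all cpus */
        for_each_possible_cpu(cpu) {
                seqcount_t *s = &per_cpu(xt_recseq, cpu);
                u32 seq = raw_read_seqcount(s);

                if (seq & 1) {
                        do {
                                cond_resched();
                                cpu_relax();
                        } while (seq == raw_read_seqcount(s));
                }
        }
}

With a barrier in that position, the store to ->private can no longer drift
past the seqcount sampling, which is exactly the reordering the quoted
example shows going wrong.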