linux-next.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects
       [not found] ` <1481147576-5690-38-git-send-email-pablo@netfilter.org>
@ 2016-12-09  0:40   ` Paul Gortmaker
  2016-12-09 10:24     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Gortmaker @ 2016-12-09  0:40 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, David Miller, netdev, linux-next

On Wed, Dec 7, 2016 at 4:52 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic
> dump-and-reset of the stateful object. This also comes with add support
> for atomic dump and reset for counter and quota objects.

This triggered a new build failure in linux-next on parisc-32, which a
hands-off bisect
run lists as resulting from this:

ERROR: "__cmpxchg_u64" [net/netfilter/nft_counter.ko] undefined!
make[2]: *** [__modpost] Error 1
make[1]: *** [modules] Error 2
make: *** [sub-make] Error 2
43da04a593d8b2626f1cf4b56efe9402f6b53652 is the first bad commit
commit 43da04a593d8b2626f1cf4b56efe9402f6b53652
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Mon Nov 28 00:05:44 2016 +0100

    netfilter: nf_tables: atomic dump and reset for stateful objects

    This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic
    dump-and-reset of the stateful object. This also comes with add support
    for atomic dump and reset for counter and quota objects.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

:040000 040000 6cd4554f69247e5c837db52342f26888beda1623
5908aca93c89e7922336546c3753bfcf2aceefba M      include
:040000 040000 f25d5831eb30972436bd198c5bb237a0cb0b4856
4ee5751c8de02bb5a8dcaadb2a2df7986d90f8e9 M      net
bisect run success

Guessing this is more an issue with parisc than it is with netfilter, but I
figured I'd mention it anyway.

Paul.
--




>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> ---
>  include/net/netfilter/nf_tables.h        |  3 +-
>  include/uapi/linux/netfilter/nf_tables.h |  2 ++
>  net/netfilter/nf_tables_api.c            | 29 ++++++++++++-----
>  net/netfilter/nft_counter.c              | 56 +++++++++++++++++++++++++++-----
>  net/netfilter/nft_quota.c                | 18 ++++++----
>  5 files changed, 85 insertions(+), 23 deletions(-)
>
> diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
> index 903cd618f50e..6f7d6a1dc09c 100644
> --- a/include/net/netfilter/nf_tables.h
> +++ b/include/net/netfilter/nf_tables.h
> @@ -997,7 +997,8 @@ struct nft_object_type {
>                                                 struct nft_object *obj);
>         void                            (*destroy)(struct nft_object *obj);
>         int                             (*dump)(struct sk_buff *skb,
> -                                               const struct nft_object *obj);
> +                                               struct nft_object *obj,
> +                                               bool reset);
>  };
>
>  int nft_register_obj(struct nft_object_type *obj_type);
> diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
> index 3d47582caa80..399eac1eee91 100644
> --- a/include/uapi/linux/netfilter/nf_tables.h
> +++ b/include/uapi/linux/netfilter/nf_tables.h
> @@ -89,6 +89,7 @@ enum nft_verdicts {
>   * @NFT_MSG_NEWOBJ: create a stateful object (enum nft_obj_attributes)
>   * @NFT_MSG_GETOBJ: get a stateful object (enum nft_obj_attributes)
>   * @NFT_MSG_DELOBJ: delete a stateful object (enum nft_obj_attributes)
> + * @NFT_MSG_GETOBJ_RESET: get and reset a stateful object (enum nft_obj_attributes)
>   */
>  enum nf_tables_msg_types {
>         NFT_MSG_NEWTABLE,
> @@ -112,6 +113,7 @@ enum nf_tables_msg_types {
>         NFT_MSG_NEWOBJ,
>         NFT_MSG_GETOBJ,
>         NFT_MSG_DELOBJ,
> +       NFT_MSG_GETOBJ_RESET,
>         NFT_MSG_MAX,
>  };
>
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> index 2ae717c5dcb8..bfc015af366a 100644
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
> @@ -3972,14 +3972,14 @@ static struct nft_object *nft_obj_init(const struct nft_object_type *type,
>  }
>
>  static int nft_object_dump(struct sk_buff *skb, unsigned int attr,
> -                          const struct nft_object *obj)
> +                          struct nft_object *obj, bool reset)
>  {
>         struct nlattr *nest;
>
>         nest = nla_nest_start(skb, attr);
>         if (!nest)
>                 goto nla_put_failure;
> -       if (obj->type->dump(skb, obj) < 0)
> +       if (obj->type->dump(skb, obj, reset) < 0)
>                 goto nla_put_failure;
>         nla_nest_end(skb, nest);
>         return 0;
> @@ -4096,7 +4096,7 @@ static int nf_tables_newobj(struct net *net, struct sock *nlsk,
>  static int nf_tables_fill_obj_info(struct sk_buff *skb, struct net *net,
>                                    u32 portid, u32 seq, int event, u32 flags,
>                                    int family, const struct nft_table *table,
> -                                  const struct nft_object *obj)
> +                                  struct nft_object *obj, bool reset)
>  {
>         struct nfgenmsg *nfmsg;
>         struct nlmsghdr *nlh;
> @@ -4115,7 +4115,7 @@ static int nf_tables_fill_obj_info(struct sk_buff *skb, struct net *net,
>             nla_put_string(skb, NFTA_OBJ_NAME, obj->name) ||
>             nla_put_be32(skb, NFTA_OBJ_TYPE, htonl(obj->type->type)) ||
>             nla_put_be32(skb, NFTA_OBJ_USE, htonl(obj->use)) ||
> -           nft_object_dump(skb, NFTA_OBJ_DATA, obj))
> +           nft_object_dump(skb, NFTA_OBJ_DATA, obj, reset))
>                 goto nla_put_failure;
>
>         nlmsg_end(skb, nlh);
> @@ -4131,10 +4131,14 @@ static int nf_tables_dump_obj(struct sk_buff *skb, struct netlink_callback *cb)
>         const struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
>         const struct nft_af_info *afi;
>         const struct nft_table *table;
> -       const struct nft_object *obj;
>         unsigned int idx = 0, s_idx = cb->args[0];
>         struct net *net = sock_net(skb->sk);
>         int family = nfmsg->nfgen_family;
> +       struct nft_object *obj;
> +       bool reset = false;
> +
> +       if (NFNL_MSG_TYPE(cb->nlh->nlmsg_type) == NFT_MSG_GETOBJ_RESET)
> +               reset = true;
>
>         rcu_read_lock();
>         cb->seq = net->nft.base_seq;
> @@ -4156,7 +4160,7 @@ static int nf_tables_dump_obj(struct sk_buff *skb, struct netlink_callback *cb)
>                                                             cb->nlh->nlmsg_seq,
>                                                             NFT_MSG_NEWOBJ,
>                                                             NLM_F_MULTI | NLM_F_APPEND,
> -                                                           afi->family, table, obj) < 0)
> +                                                           afi->family, table, obj, reset) < 0)
>                                         goto done;
>
>                                 nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> @@ -4183,6 +4187,7 @@ static int nf_tables_getobj(struct net *net, struct sock *nlsk,
>         const struct nft_table *table;
>         struct nft_object *obj;
>         struct sk_buff *skb2;
> +       bool reset = false;
>         u32 objtype;
>         int err;
>
> @@ -4214,9 +4219,12 @@ static int nf_tables_getobj(struct net *net, struct sock *nlsk,
>         if (!skb2)
>                 return -ENOMEM;
>
> +       if (NFNL_MSG_TYPE(nlh->nlmsg_type) == NFT_MSG_GETOBJ_RESET)
> +               reset = true;
> +
>         err = nf_tables_fill_obj_info(skb2, net, NETLINK_CB(skb).portid,
>                                       nlh->nlmsg_seq, NFT_MSG_NEWOBJ, 0,
> -                                     family, table, obj);
> +                                     family, table, obj, reset);
>         if (err < 0)
>                 goto err;
>
> @@ -4291,7 +4299,7 @@ static int nf_tables_obj_notify(const struct nft_ctx *ctx,
>
>         err = nf_tables_fill_obj_info(skb, ctx->net, ctx->portid, ctx->seq,
>                                       event, 0, ctx->afi->family, ctx->table,
> -                                     obj);
> +                                     obj, false);
>         if (err < 0) {
>                 kfree_skb(skb);
>                 goto err;
> @@ -4482,6 +4490,11 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
>                 .attr_count     = NFTA_OBJ_MAX,
>                 .policy         = nft_obj_policy,
>         },
> +       [NFT_MSG_GETOBJ_RESET] = {
> +               .call           = nf_tables_getobj,
> +               .attr_count     = NFTA_OBJ_MAX,
> +               .policy         = nft_obj_policy,
> +       },
>  };
>
>  static void nft_chain_commit_update(struct nft_trans *trans)
> diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
> index 6f3dd429f865..f6a02c5071c2 100644
> --- a/net/netfilter/nft_counter.c
> +++ b/net/netfilter/nft_counter.c
> @@ -100,10 +100,10 @@ static void nft_counter_obj_destroy(struct nft_object *obj)
>         nft_counter_do_destroy(priv);
>  }
>
> -static void nft_counter_fetch(const struct nft_counter_percpu __percpu *counter,
> +static void nft_counter_fetch(struct nft_counter_percpu __percpu *counter,
>                               struct nft_counter *total)
>  {
> -       const struct nft_counter_percpu *cpu_stats;
> +       struct nft_counter_percpu *cpu_stats;
>         u64 bytes, packets;
>         unsigned int seq;
>         int cpu;
> @@ -122,12 +122,52 @@ static void nft_counter_fetch(const struct nft_counter_percpu __percpu *counter,
>         }
>  }
>
> +static u64 __nft_counter_reset(u64 *counter)
> +{
> +       u64 ret, old;
> +
> +       do {
> +               old = *counter;
> +               ret = cmpxchg64(counter, old, 0);
> +       } while (ret != old);
> +
> +       return ret;
> +}
> +
> +static void nft_counter_reset(struct nft_counter_percpu __percpu *counter,
> +                             struct nft_counter *total)
> +{
> +       struct nft_counter_percpu *cpu_stats;
> +       u64 bytes, packets;
> +       unsigned int seq;
> +       int cpu;
> +
> +       memset(total, 0, sizeof(*total));
> +       for_each_possible_cpu(cpu) {
> +               bytes = packets = 0;
> +
> +               cpu_stats = per_cpu_ptr(counter, cpu);
> +               do {
> +                       seq     = u64_stats_fetch_begin_irq(&cpu_stats->syncp);
> +                       packets += __nft_counter_reset(&cpu_stats->counter.packets);
> +                       bytes   += __nft_counter_reset(&cpu_stats->counter.bytes);
> +               } while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, seq));
> +
> +               total->packets += packets;
> +               total->bytes += bytes;
> +       }
> +}
> +
>  static int nft_counter_do_dump(struct sk_buff *skb,
> -                              const struct nft_counter_percpu_priv *priv)
> +                              const struct nft_counter_percpu_priv *priv,
> +                              bool reset)
>  {
>         struct nft_counter total;
>
> -       nft_counter_fetch(priv->counter, &total);
> +       if (reset)
> +               nft_counter_reset(priv->counter, &total);
> +       else
> +               nft_counter_fetch(priv->counter, &total);
>
>         if (nla_put_be64(skb, NFTA_COUNTER_BYTES, cpu_to_be64(total.bytes),
>                          NFTA_COUNTER_PAD) ||
> @@ -141,11 +181,11 @@ static int nft_counter_do_dump(struct sk_buff *skb,
>  }
>
>  static int nft_counter_obj_dump(struct sk_buff *skb,
> -                               const struct nft_object *obj)
> +                               struct nft_object *obj, bool reset)
>  {
> -       const struct nft_counter_percpu_priv *priv = nft_obj_data(obj);
> +       struct nft_counter_percpu_priv *priv = nft_obj_data(obj);
>
> -       return nft_counter_do_dump(skb, priv);
> +       return nft_counter_do_dump(skb, priv, reset);
>  }
>
>  static const struct nla_policy nft_counter_policy[NFTA_COUNTER_MAX + 1] = {
> @@ -178,7 +218,7 @@ static int nft_counter_dump(struct sk_buff *skb, const struct nft_expr *expr)
>  {
>         const struct nft_counter_percpu_priv *priv = nft_expr_priv(expr);
>
> -       return nft_counter_do_dump(skb, priv);
> +       return nft_counter_do_dump(skb, priv, false);
>  }
>
>  static int nft_counter_init(const struct nft_ctx *ctx,
> diff --git a/net/netfilter/nft_quota.c b/net/netfilter/nft_quota.c
> index 0d344209803a..5d25f57497cb 100644
> --- a/net/netfilter/nft_quota.c
> +++ b/net/netfilter/nft_quota.c
> @@ -83,12 +83,17 @@ static int nft_quota_obj_init(const struct nlattr * const tb[],
>         return nft_quota_do_init(tb, priv);
>  }
>
> -static int nft_quota_do_dump(struct sk_buff *skb, const struct nft_quota *priv)
> +static int nft_quota_do_dump(struct sk_buff *skb, struct nft_quota *priv,
> +                            bool reset)
>  {
>         u32 flags = priv->invert ? NFT_QUOTA_F_INV : 0;
>         u64 consumed;
>
> -       consumed = atomic64_read(&priv->consumed);
> +       if (reset)
> +               consumed = atomic64_xchg(&priv->consumed, 0);
> +       else
> +               consumed = atomic64_read(&priv->consumed);
> +
>         /* Since we inconditionally increment consumed quota for each packet
>          * that we see, don't go over the quota boundary in what we send to
>          * userspace.
> @@ -108,11 +113,12 @@ static int nft_quota_do_dump(struct sk_buff *skb, const struct nft_quota *priv)
>         return -1;
>  }
>
> -static int nft_quota_obj_dump(struct sk_buff *skb, const struct nft_object *obj)
> +static int nft_quota_obj_dump(struct sk_buff *skb, struct nft_object *obj,
> +                             bool reset)
>  {
>         struct nft_quota *priv = nft_obj_data(obj);
>
> -       return nft_quota_do_dump(skb, priv);
> +       return nft_quota_do_dump(skb, priv, reset);
>  }
>
>  static struct nft_object_type nft_quota_obj __read_mostly = {
> @@ -146,9 +152,9 @@ static int nft_quota_init(const struct nft_ctx *ctx,
>
>  static int nft_quota_dump(struct sk_buff *skb, const struct nft_expr *expr)
>  {
> -       const struct nft_quota *priv = nft_expr_priv(expr);
> +       struct nft_quota *priv = nft_expr_priv(expr);
>
> -       return nft_quota_do_dump(skb, priv);
> +       return nft_quota_do_dump(skb, priv, false);
>  }
>
>  static struct nft_expr_type nft_quota_type;
> --
> 2.1.4
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects
  2016-12-09  0:40   ` [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects Paul Gortmaker
@ 2016-12-09 10:24     ` Pablo Neira Ayuso
  2016-12-09 14:24       ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Pablo Neira Ayuso @ 2016-12-09 10:24 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: netfilter-devel, David Miller, netdev, linux-next

[-- Attachment #1: Type: text/plain, Size: 1674 bytes --]

Hi Paul,

On Thu, Dec 08, 2016 at 07:40:14PM -0500, Paul Gortmaker wrote:
> On Wed, Dec 7, 2016 at 4:52 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic
> > dump-and-reset of the stateful object. This also comes with add support
> > for atomic dump and reset for counter and quota objects.
> 
> This triggered a new build failure in linux-next on parisc-32, which a
> hands-off bisect
> run lists as resulting from this:
> 
> ERROR: "__cmpxchg_u64" [net/netfilter/nft_counter.ko] undefined!
> make[2]: *** [__modpost] Error 1
> make[1]: *** [modules] Error 2
> make: *** [sub-make] Error 2
> 43da04a593d8b2626f1cf4b56efe9402f6b53652 is the first bad commit
> commit 43da04a593d8b2626f1cf4b56efe9402f6b53652
> Author: Pablo Neira Ayuso <pablo@netfilter.org>
> Date:   Mon Nov 28 00:05:44 2016 +0100
> 
>     netfilter: nf_tables: atomic dump and reset for stateful objects
> 
>     This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic
>     dump-and-reset of the stateful object. This also comes with add support
>     for atomic dump and reset for counter and quota objects.
> 
>     Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> 
> :040000 040000 6cd4554f69247e5c837db52342f26888beda1623
> 5908aca93c89e7922336546c3753bfcf2aceefba M      include
> :040000 040000 f25d5831eb30972436bd198c5bb237a0cb0b4856
> 4ee5751c8de02bb5a8dcaadb2a2df7986d90f8e9 M      net
> bisect run success
> 
> Guessing this is more an issue with parisc than it is with netfilter, but I
> figured I'd mention it anyway.

I'm planning to submit this patch to parisc, I'm attaching it to this
email.

[-- Attachment #2: 0001-parisc-export-symbol-__cmpxchg_u64.patch --]
[-- Type: text/x-diff, Size: 1313 bytes --]

>From c9d320ac0be2a32a7b2bfad398be549865088ecf Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu, 8 Dec 2016 22:55:33 +0100
Subject: [PATCH] parisc: export symbol __cmpxchg_u64()

kbuild test robot reports:

>> ERROR: "__cmpxchg_u64" [net/netfilter/nft_counter.ko] undefined!

Commit 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for
stateful objects") introduces the first client of cmpxchg64() from
modules.

Patch 54b668009076 ("parisc: Add native high-resolution sched_clock()
implementation") removed __cmpxchg_u64() dependency on CONFIG_64BIT.
So, let's fix this problem by exporting this symbol unconditionally.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 arch/parisc/kernel/parisc_ksyms.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/parisc_ksyms.c b/arch/parisc/kernel/parisc_ksyms.c
index 3cad8aadc69e..cfa704548cf3 100644
--- a/arch/parisc/kernel/parisc_ksyms.c
+++ b/arch/parisc/kernel/parisc_ksyms.c
@@ -40,8 +40,8 @@ EXPORT_SYMBOL(__atomic_hash);
 #endif
 #ifdef CONFIG_64BIT
 EXPORT_SYMBOL(__xchg64);
-EXPORT_SYMBOL(__cmpxchg_u64);
 #endif
+EXPORT_SYMBOL(__cmpxchg_u64);
 
 #include <asm/uaccess.h>
 EXPORT_SYMBOL(lclear_user);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects
  2016-12-09 10:24     ` Pablo Neira Ayuso
@ 2016-12-09 14:24       ` Eric Dumazet
  2016-12-09 15:22         ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2016-12-09 14:24 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Paul Gortmaker, netfilter-devel, David Miller, netdev, linux-next

On Fri, 2016-12-09 at 11:24 +0100, Pablo Neira Ayuso wrote:
> Hi Paul,

Hi Pablo

Given that bytes/packets counters are modified without cmpxchg64()  :

static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
                                       struct nft_regs *regs,
                                       const struct nft_pktinfo *pkt)
{
        struct nft_counter_percpu *this_cpu;

        local_bh_disable();
        this_cpu = this_cpu_ptr(priv->counter);
        u64_stats_update_begin(&this_cpu->syncp);
        this_cpu->counter.bytes += pkt->skb->len;
        this_cpu->counter.packets++;
        u64_stats_update_end(&this_cpu->syncp);
        local_bh_enable();
}

It means that the cmpxchg64() used to clear the stats is not good enough.

It does not help to make sure stats are properly cleared.

On 64 bit, the ->syncp is not there, so the nft_counter_reset() might
not see that a bytes or packets counter was modified by another cpu.


CPU 1                              CPU 2

LOAD PTR->BYTES into REG_A         old = *counter;
REG_A += skb->len;
                                   cmpxchg64(counter, old, 0);
PTR->BYTES = REG_A

It looks that you want a seqcount, even on 64bit arches,
so that CPU 2 can restart its loop, and more importantly you need
to not accumulate the values you read, because they might be old/invalid.

Another way would be to not use cmpxchg64() at all.
Way to expensive in fast path !

The percpu value would never be modified by an other cpu than the owner.

You need a per cpu seqcount, no need to add a syncp per nft percpu counter.


static DEFINE_PERCPU(seqcount_t, nft_pcpu_seq);

static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
                                       struct nft_regs *regs,
                                       const struct nft_pktinfo *pkt)
{
        struct nft_counter_percpu *this_cpu;
	seqcount_t *myseq;

        local_bh_disable();
        this_cpu = this_cpu_ptr(priv->counter);
	myseq = this_cpu_ptr(&nft_pcpu_seq);

	write_seqcount_begin(myseq);

        this_cpu->counter.bytes += pkt->skb->len;
        this_cpu->counter.packets++;

	write_seqcount_end(myseq);
	
        local_bh_enable();
}

Thanks !

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects
  2016-12-09 14:24       ` Eric Dumazet
@ 2016-12-09 15:22         ` Eric Dumazet
  2016-12-10 12:21           ` Pablo Neira Ayuso
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2016-12-09 15:22 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Paul Gortmaker, netfilter-devel, David Miller, netdev, linux-next

On Fri, 2016-12-09 at 06:24 -0800, Eric Dumazet wrote:

> It looks that you want a seqcount, even on 64bit arches,
> so that CPU 2 can restart its loop, and more importantly you need
> to not accumulate the values you read, because they might be old/invalid.

Untested patch to give general idea. I can polish it a bit later today.

 net/netfilter/nft_counter.c |   59 +++++++++++++---------------------
 1 file changed, 23 insertions(+), 36 deletions(-)

diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index f6a02c5071c2aeafca7635da3282a809aa04d6ab..57ed95b024473a2aa76298fe5bb5013bf709801b 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -31,18 +31,25 @@ struct nft_counter_percpu_priv {
 	struct nft_counter_percpu __percpu *counter;
 };
 
+static DEFINE_PER_CPU(seqcount_t, nft_counter_seq);
+
 static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
 				       struct nft_regs *regs,
 				       const struct nft_pktinfo *pkt)
 {
 	struct nft_counter_percpu *this_cpu;
+	seqcount_t *myseq;
 
 	local_bh_disable();
 	this_cpu = this_cpu_ptr(priv->counter);
-	u64_stats_update_begin(&this_cpu->syncp);
+	myseq = this_cpu_ptr(&nft_counter_seq);
+
+	write_seqcount_begin(myseq);
+
 	this_cpu->counter.bytes += pkt->skb->len;
 	this_cpu->counter.packets++;
-	u64_stats_update_end(&this_cpu->syncp);
+
+	write_seqcount_end(myseq);
 	local_bh_enable();
 }
 
@@ -110,52 +117,30 @@ static void nft_counter_fetch(struct nft_counter_percpu __percpu *counter,
 
 	memset(total, 0, sizeof(*total));
 	for_each_possible_cpu(cpu) {
+		seqcount_t *seqp = per_cpu_ptr(&nft_counter_seq, cpu);
+
 		cpu_stats = per_cpu_ptr(counter, cpu);
 		do {
-			seq	= u64_stats_fetch_begin_irq(&cpu_stats->syncp);
+			seq	= read_seqcount_begin(seqp);
 			bytes	= cpu_stats->counter.bytes;
 			packets	= cpu_stats->counter.packets;
-		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, seq));
+		} while (read_seqcount_retry(seqp, seq));
 
 		total->packets += packets;
 		total->bytes += bytes;
 	}
 }
 
-static u64 __nft_counter_reset(u64 *counter)
-{
-	u64 ret, old;
-
-	do {
-		old = *counter;
-		ret = cmpxchg64(counter, old, 0);
-	} while (ret != old);
-
-	return ret;
-}
-
 static void nft_counter_reset(struct nft_counter_percpu __percpu *counter,
 			      struct nft_counter *total)
 {
 	struct nft_counter_percpu *cpu_stats;
-	u64 bytes, packets;
-	unsigned int seq;
-	int cpu;
 
-	memset(total, 0, sizeof(*total));
-	for_each_possible_cpu(cpu) {
-		bytes = packets = 0;
-
-		cpu_stats = per_cpu_ptr(counter, cpu);
-		do {
-			seq	= u64_stats_fetch_begin_irq(&cpu_stats->syncp);
-			packets	+= __nft_counter_reset(&cpu_stats->counter.packets);
-			bytes	+= __nft_counter_reset(&cpu_stats->counter.bytes);
-		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, seq));
-
-		total->packets += packets;
-		total->bytes += bytes;
-	}
+	local_bh_disable();
+	cpu_stats = this_cpu_ptr(counter);
+	cpu_stats->counter.packets -= total->packets;
+	cpu_stats->counter.bytes -= total->bytes;
+	local_bh_enable();
 }
 
 static int nft_counter_do_dump(struct sk_buff *skb,
@@ -164,10 +149,9 @@ static int nft_counter_do_dump(struct sk_buff *skb,
 {
 	struct nft_counter total;
 
+	nft_counter_fetch(priv->counter, &total);
 	if (reset)
 		nft_counter_reset(priv->counter, &total);
-	else
-		nft_counter_fetch(priv->counter, &total);
 
 	if (nla_put_be64(skb, NFTA_COUNTER_BYTES, cpu_to_be64(total.bytes),
 			 NFTA_COUNTER_PAD) ||
@@ -285,7 +269,10 @@ static struct nft_expr_type nft_counter_type __read_mostly = {
 
 static int __init nft_counter_module_init(void)
 {
-	int err;
+	int err, cpu;
+
+	for_each_possible_cpu(cpu)
+		seqcount_init(per_cpu_ptr(&nft_counter_seq, cpu));
 
 	err = nft_register_obj(&nft_counter_obj);
 	if (err < 0)

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects
  2016-12-09 15:22         ` Eric Dumazet
@ 2016-12-10 12:21           ` Pablo Neira Ayuso
  0 siblings, 0 replies; 5+ messages in thread
From: Pablo Neira Ayuso @ 2016-12-10 12:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Paul Gortmaker, netfilter-devel, David Miller, netdev, linux-next

On Fri, Dec 09, 2016 at 07:22:06AM -0800, Eric Dumazet wrote:
> On Fri, 2016-12-09 at 06:24 -0800, Eric Dumazet wrote:
> 
> > It looks that you want a seqcount, even on 64bit arches,
> > so that CPU 2 can restart its loop, and more importantly you need
> > to not accumulate the values you read, because they might be old/invalid.
> 
> Untested patch to give general idea. I can polish it a bit later today.

I'm preparing a patch now, so you can review it.

Eric, thanks a lot for reviewing and proposing a working approach!

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-12-10 12:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1481147576-5690-1-git-send-email-pablo@netfilter.org>
     [not found] ` <1481147576-5690-38-git-send-email-pablo@netfilter.org>
2016-12-09  0:40   ` [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects Paul Gortmaker
2016-12-09 10:24     ` Pablo Neira Ayuso
2016-12-09 14:24       ` Eric Dumazet
2016-12-09 15:22         ` Eric Dumazet
2016-12-10 12:21           ` Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).