From: Eric Dumazet <edumazet@google.com>
To: SeongJae Park <sjpark@amazon.com>
Cc: David Miller <davem@davemloft.net>,
	SeongJae Park <sjpark@amazon.de>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Florian Westphal <fw@strlen.de>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	netdev <netdev@vger.kernel.org>,
	rcu@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 1/1] net/ipv4/inet_fragment: Batch fqdir destroy works
Date: Fri, 11 Dec 2020 09:43:41 +0100	[thread overview]
Message-ID: <CANn89iJdPa-2FQS18p3d_YjZx_5OD=eZr_3+a6LPiAxpj=fowA@mail.gmail.com> (raw)
In-Reply-To: <20201211082032.26965-2-sjpark@amazon.com>

On Fri, Dec 11, 2020 at 9:21 AM SeongJae Park <sjpark@amazon.com> wrote:
>
> From: SeongJae Park <sjpark@amazon.de>
>
> For each 'fqdir_exit()' call, a work item for destruction of the 'fqdir'
> is enqueued.  The work function, 'fqdir_work_fn()', internally calls
> 'rcu_barrier()'.  In case of intensive 'fqdir_exit()' (e.g., frequent
> 'unshare()' system calls), the resulting contention could lead to
> unacceptably high latency of 'rcu_barrier()'.  This commit avoids such
> contention by doing the 'rcu_barrier()' and the subsequent lightweight
> work in a batched manner using a dedicated singlethread worker, similar
> to what 'cleanup_net()' does.


Not sure why you submit a patch series with a single patch.

Your cover letter contains interesting info that would be better
captured in this changelog, IMO.
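
For readers who have not seen this pattern before: llist_add() returns
true only when it adds a node to an empty list, so only the first
producer of a batch queues the (single) free work, and the worker then
drains the whole batch with llist_del_all().  A stripped-down sketch of
the idiom, with made-up names and outside of the fqdir context:

struct item {
	struct llist_node node;
	/* ... payload ... */
};

static LLIST_HEAD(pending);

static void batch_fn(struct work_struct *work)
{
	/* Atomically take the whole batch; later producers start a new one. */
	struct llist_node *list = llist_del_all(&pending);
	struct item *it, *tmp;

	rcu_barrier();	/* one expensive barrier for the whole batch */

	llist_for_each_entry_safe(it, tmp, list, node)
		kfree(it);
}

static DECLARE_WORK(batch_work, batch_fn);

static void queue_item(struct item *it)
{
	/* llist_add() returns true only when the list was empty, so the
	 * single batch_work is queued once per batch, not once per item.
	 */
	if (llist_add(&it->node, &pending))
		queue_work(system_wq, &batch_work);
}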

>
> Signed-off-by: SeongJae Park <sjpark@amazon.de>
> ---
>  include/net/inet_frag.h  |  1 +
>  net/ipv4/inet_fragment.c | 45 +++++++++++++++++++++++++++++++++-------
>  2 files changed, 39 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
> index bac79e817776..48cc5795ceda 100644
> --- a/include/net/inet_frag.h
> +++ b/include/net/inet_frag.h
> @@ -21,6 +21,7 @@ struct fqdir {
>         /* Keep atomic mem on separate cachelines in structs that include it */
>         atomic_long_t           mem ____cacheline_aligned_in_smp;
>         struct work_struct      destroy_work;
> +       struct llist_node       free_list;
>  };
>
>  /**
> diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
> index 10d31733297d..a6fc4afcc323 100644
> --- a/net/ipv4/inet_fragment.c
> +++ b/net/ipv4/inet_fragment.c
> @@ -145,12 +145,17 @@ static void inet_frags_free_cb(void *ptr, void *arg)
>                 inet_frag_destroy(fq);
>  }
>
> -static void fqdir_work_fn(struct work_struct *work)
> +static struct workqueue_struct *fqdir_wq;
> +static LLIST_HEAD(free_list);
> +
> +static void fqdir_free_fn(struct work_struct *work)
>  {
> -       struct fqdir *fqdir = container_of(work, struct fqdir, destroy_work);
> -       struct inet_frags *f = fqdir->f;
> +       struct llist_node *kill_list;
> +       struct fqdir *fqdir, *tmp;
> +       struct inet_frags *f;
>
> -       rhashtable_free_and_destroy(&fqdir->rhashtable, inet_frags_free_cb, NULL);
> +       /* Atomically snapshot the list of fqdirs to free */
> +       kill_list = llist_del_all(&free_list);
>
>         /* We need to make sure all ongoing call_rcu(..., inet_frag_destroy_rcu)
>          * have completed, since they need to dereference fqdir.
> @@ -158,12 +163,38 @@ static void fqdir_work_fn(struct work_struct *work)
>          */
>         rcu_barrier();
>
> -       if (refcount_dec_and_test(&f->refcnt))
> -               complete(&f->completion);
> +       llist_for_each_entry_safe(fqdir, tmp, kill_list, free_list) {
> +               f = fqdir->f;
> +               if (refcount_dec_and_test(&f->refcnt))
> +                       complete(&f->completion);
>
> -       kfree(fqdir);
> +               kfree(fqdir);
> +       }
>  }
>
> +static DECLARE_WORK(fqdir_free_work, fqdir_free_fn);
> +
> +static void fqdir_work_fn(struct work_struct *work)
> +{
> +       struct fqdir *fqdir = container_of(work, struct fqdir, destroy_work);
> +
> +       rhashtable_free_and_destroy(&fqdir->rhashtable, inet_frags_free_cb, NULL);
> +
> +       if (llist_add(&fqdir->free_list, &free_list))
> +               queue_work(fqdir_wq, &fqdir_free_work);

I think you misunderstood me.

Since this fqdir_free_work is a single, statically declared work item,
it will have at most one instance queued at any time, so you can use
system_wq here; there is no risk of abuse.

My suggestion was to not use system_wq for fqdir_exit(), to better
control the number of threads used for rhashtable cleanups.

void fqdir_exit(struct fqdir *fqdir)
{
        INIT_WORK(&fqdir->destroy_work, fqdir_work_fn);
-       queue_work(system_wq, &fqdir->destroy_work);
+       queue_work(fqdir_wq, &fqdir->destroy_work);
}
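
The destroy side would then keep the batching on system_wq, roughly like
this (a sketch only, untested, reusing the names from your patch):

static void fqdir_work_fn(struct work_struct *work)
{
	struct fqdir *fqdir = container_of(work, struct fqdir, destroy_work);

	rhashtable_free_and_destroy(&fqdir->rhashtable, inet_frags_free_cb, NULL);

	/* fqdir_free_work is a single static work item, so system_wq
	 * cannot be flooded from here.
	 */
	if (llist_add(&fqdir->free_list, &free_list))
		queue_work(system_wq, &fqdir_free_work);
}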



> +}
> +
> +static int __init fqdir_wq_init(void)
> +{
> +       fqdir_wq = create_singlethread_workqueue("fqdir");


And here, I suggest using a non-ordered workqueue, allowing one thread
per cpu, so that rhashtable cleanups can run concurrently.

Also, the "fqdir" name is rather vague; this is an implementation detail?

fqdir_wq = create_workqueue("inet_frag_wq");

> +       if (!fqdir_wq)
> +               panic("Could not create fqdir workq");
> +       return 0;
> +}
> +
> +pure_initcall(fqdir_wq_init);
> +
> +
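
Putting the two suggestions together, the init could then look something
like this (just a sketch; the exact names are up to you):

static struct workqueue_struct *inet_frag_wq;

static int __init inet_frag_wq_init(void)
{
	/* Non-ordered (per-cpu) workqueue, so rhashtable cleanups for
	 * different fqdirs can run concurrently, one per cpu.
	 */
	inet_frag_wq = create_workqueue("inet_frag_wq");
	if (!inet_frag_wq)
		panic("Could not create inet frag workq");
	return 0;
}

pure_initcall(inet_frag_wq_init);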
>  int fqdir_init(struct fqdir **fqdirp, struct inet_frags *f, struct net *net)
>  {
>         struct fqdir *fqdir = kzalloc(sizeof(*fqdir), GFP_KERNEL);
> --
> 2.17.1
>
