From: Marcelo Tosatti <mtosatti@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Chris Friesen <chris.friesen@windriver.com>,
linux-kernel@vger.kernel.org, Christoph Lameter <cl@linux.com>,
Jim Somerville <Jim.Somerville@windriver.com>,
Andrew Morton <akpm@linux-foundation.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v2] isolcpus: affine kernel threads to specified cpumask
Date: Fri, 27 Mar 2020 09:07:28 -0300
Message-ID: <20200327120728.GA11108@fuller.cnet>
In-Reply-To: <20200326162002.GA3946@lenoir>

On Thu, Mar 26, 2020 at 05:20:05PM +0100, Frederic Weisbecker wrote:
> On Wed, Mar 25, 2020 at 08:47:36AM -0300, Marcelo Tosatti wrote:
> >
> > Hi Frederic,
> >
> > On Wed, Mar 25, 2020 at 01:30:00AM +0100, Frederic Weisbecker wrote:
> > > On Tue, Mar 24, 2020 at 12:20:16PM -0300, Marcelo Tosatti wrote:
> > > >
> > > > This is a kernel enhancement to configure the cpu affinity of kernel
> > > > threads via kernel boot option isolcpus=no_kthreads,<isolcpus_params>,<cpulist>
> > > >
> > > > When this option is specified, the cpumask is immediately applied upon
> > > > thread launch. This does not affect kernel threads that specify cpu
> > > > and node.
> > > >
> > > > This allows CPU isolation (that is not allowing certain threads
> > > > to execute on certain CPUs) without using the isolcpus=domain parameter,
> > > > making it possible to enable load balancing on such CPUs
> > > > during runtime (see
> > > >
> > > > > Note-1: this is based on Wind River's patch at
> > > > https://github.com/starlingx-staging/stx-integ/blob/master/kernel/kernel-std/centos/patches/affine-compute-kernel-threads.patch
> > > >
> > > > Difference being that this patch is limited to modifying
> > > > kernel thread cpumask: Behaviour of other threads can
> > > > be controlled via cgroups or sched_setaffinity.
> > > >
> > > > Note-2: MontaVista's patch was based off Christoph Lameter's patch at
> > > > https://lwn.net/Articles/565932/ with the only difference being
> > > > the kernel parameter changed from kthread to kthread_cpus.
> > > >
> > > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > >
> > > I'm wondering, why do you need such a boot shift at all when you
> > > can actually affine kthreads on runtime?
> >
> > New, unbound kernel threads inherit the cpumask of kthreadd.
> >
> > Therefore there is a race between kernel thread creation
> > and affine.
> >
> > If you know of a solution to that problem, that can be used instead.
>
> Well, you could first set the affinity of kthreadd and only then the affinity
> of the others. But I can still imagine some tiny races with fork().
>
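For illustration, a minimal user-space sketch of that ordering. Everything here is an assumption rather than part of the patch: the helper names (kthreadd_first, cpu_range, affine_kthreads) are hypothetical, kthreadd is taken to be PID 2 as is conventional, and the caller is expected to supply the list of kernel thread PIDs (for example, gathered from /proc). A thread forked by kthreadd between the individual sched_setaffinity() calls is handled because kthreadd is re-affined first, but the small fork() races mentioned above remain.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

#define KTHREADD_PID 2	/* assumption: kthreadd is conventionally PID 2 */

/* Move kthreadd to the front so it is re-affined before the other threads. */
static void kthreadd_first(pid_t *pids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pids[i] == KTHREADD_PID) {
			pid_t tmp = pids[0];

			pids[0] = pids[i];
			pids[i] = tmp;
			return;
		}
	}
}

/* Fill @set with the CPUs first..last, inclusive. */
static void cpu_range(cpu_set_t *set, int first, int last)
{
	assert(first >= 0 && first <= last && last < CPU_SETSIZE);
	CPU_ZERO(set);
	for (int cpu = first; cpu <= last; cpu++)
		CPU_SET(cpu, set);
}

/*
 * Apply @mask to every PID in @pids, kthreadd first, so that threads it
 * forks afterwards inherit the new mask. Returns the number of threads
 * whose affinity could not be changed (e.g. per-CPU kthreads).
 */
static int affine_kthreads(pid_t *pids, size_t n, const cpu_set_t *mask)
{
	int failed = 0;

	kthreadd_first(pids, n);
	for (size_t i = 0; i < n; i++)
		if (sched_setaffinity(pids[i], sizeof(*mask), mask) != 0)
			failed++;
	return failed;
}
```

A caller would build the housekeeping mask with cpu_range() and pass it, with the PID list, to affine_kthreads(); changing the affinity of kernel threads needs the appropriate privileges.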
> > >
> > > > };
> > > >
> > > > #ifdef CONFIG_CPU_ISOLATION
> > > > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > > > index b262f47046ca..be9c8d53a986 100644
> > > > --- a/kernel/kthread.c
> > > > +++ b/kernel/kthread.c
> > > > @@ -347,7 +347,7 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
> > > > * The kernel thread should not inherit these properties.
> > > > */
> > > > > sched_setscheduler_nocheck(task, SCHED_NORMAL, &param);
> > > > - set_cpus_allowed_ptr(task, cpu_all_mask);
> > > > + set_cpus_allowed_ptr(task, cpu_kthread_mask);
> > >
> > > I'm wondering, why are we using cpu_all_mask and not cpu_possible_mask here?
> > > If we used the latter, you wouldn't need to create cpu_kthread_mask and
> > > you could directly rely on housekeeping_cpumask(HK_FLAG_KTHREAD).
> >
> > I suppose that either works: CPUs can only come online from
> > cpu_possible_mask (the online mask is contained in cpu_possible_mask).
> >
> > Nice cleanup, thanks.
>
> But may I suggest you do:
>
> - set_cpus_allowed_ptr(task, cpu_all_mask);
> + set_cpus_allowed_ptr(task, cpu_possible_mask);
>
> as a first step in its own patch in the series. I just want to make sure that change
> isn't missed by reviewers or bisections, in case someone catches something we
> overlooked.
>
> >
> > >
> > > > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > > > index 008d6ac2342b..e9d48729efd4 100644
> > > > --- a/kernel/sched/isolation.c
> > > > +++ b/kernel/sched/isolation.c
> > > > @@ -169,6 +169,12 @@ static int __init housekeeping_isolcpus_setup(char *str)
> > > > continue;
> > > > }
> > > >
> > > > + if (!strncmp(str, "no_kthreads,", 12)) {
> > > > + str += 12;
> > > > + flags |= HK_FLAG_NO_KTHREADS;
> > >
> > > You will certainly want HK_FLAG_WQ as well since workqueue has its own
> > > way to deal with unbound affinity.
> >
> > Yep. HK_FLAG_WQ is simply a convenience so that the user does not have
> > to configure this separately: OK.
>
> Also, and that's a larger debate, are you interested in isolating kthreads
> only or any kind of kernel unbound work that could be affine outside
> a given CPU?

Any kind of kernel work.

> In case of all the unbound work, I may suggest an all-in-one "unbound"
> flag that would do:
>
> HK_FLAG_KTHREAD | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU | HK_FLAG_MISC
> | HK_FLAG_SCHED
>
> Otherwise we can stick with HK_FLAG_KTHREAD, but I'd be curious about your usecase.
>
> Thanks.
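
To make the all-in-one idea concrete, here is a rough user-space sketch of how a combined flag could be parsed, in the same strncmp() prefix style as housekeeping_isolcpus_setup(). The "unbound," keyword, the HK_FLAGS_UNBOUND macro, the parse_isol_flags() helper and the bit values in the enum are all assumptions made for illustration; the kernel's real enum hk_flags may use different values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative bit values only; not copied from the kernel headers. */
enum hk_flags_sketch {
	HK_FLAG_TIMER	= 1 << 0,
	HK_FLAG_RCU	= 1 << 1,
	HK_FLAG_MISC	= 1 << 2,
	HK_FLAG_SCHED	= 1 << 3,
	HK_FLAG_TICK	= 1 << 4,
	HK_FLAG_DOMAIN	= 1 << 5,
	HK_FLAG_WQ	= 1 << 6,
	HK_FLAG_KTHREAD	= 1 << 7,	/* flag discussed in this thread */
};

/* The combination suggested above for "all unbound kernel work". */
#define HK_FLAGS_UNBOUND (HK_FLAG_KTHREAD | HK_FLAG_WQ | HK_FLAG_TIMER | \
			  HK_FLAG_RCU | HK_FLAG_MISC | HK_FLAG_SCHED)

struct flag_name {
	const char *prefix;
	unsigned int flags;
};

static const struct flag_name flag_names[] = {
	{ "nohz,",	HK_FLAG_TICK },
	{ "domain,",	HK_FLAG_DOMAIN },
	{ "unbound,",	HK_FLAGS_UNBOUND },	/* hypothetical keyword */
};

/*
 * Consume leading "<flag>," prefixes from @str, OR the matching bits into
 * @flags, and return a pointer to the remaining text (the cpulist).
 */
static const char *parse_isol_flags(const char *str, unsigned int *flags)
{
	const size_t n = sizeof(flag_names) / sizeof(flag_names[0]);

	assert(str && flags);
	*flags = 0;
	for (;;) {
		size_t i;

		for (i = 0; i < n; i++) {
			size_t len = strlen(flag_names[i].prefix);

			if (!strncmp(str, flag_names[i].prefix, len)) {
				str += len;
				*flags |= flag_names[i].flags;
				break;
			}
		}
		if (i == n)	/* no known prefix left: rest is the cpulist */
			return str;
	}
}
```

With this sketch, parsing "unbound,nohz,2-7" sets every bit in HK_FLAGS_UNBOUND plus HK_FLAG_TICK and leaves "2-7" as the cpulist.
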

BTW, HK_FLAG_SCHED is not settable at the moment.
Any reason why nohz_full= is not setting it?

Thanks