All of lore.kernel.org
 help / color / mirror / Atom feed
From: Valentin Schneider <valentin.schneider@arm.com>
To: James Morse <james.morse@arm.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	Fenghua Yu <fenghua.yu@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 2/2] x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race
Date: Fri, 20 Nov 2020 15:54:53 +0000	[thread overview]
Message-ID: <jhjtutkuipe.mognet@arm.com> (raw)
In-Reply-To: <c3370c24-98f2-dac8-e390-9c80a3321cf0@arm.com>


Hi James,

On 20/11/20 14:53, James Morse wrote:
> Hi Valentin,
>
> On 18/11/2020 18:00, Valentin Schneider wrote:
>> Upon moving a task to a new control / monitor group, said task's {closid,
>> rmid} fields are updated *after* triggering the move_myself() task_work
>> callback. This can cause said callback to miss the update, e.g. if the
>> triggering thread got preempted before fiddling with task_struct, or if the
>> targeted task was already on its way to return to userspace.
>
> So, if move_myself() runs after task_work_add() but before tsk is written to.
> Sounds fun!
>
>
>> Update the task_struct's {closid, rmid} tuple *before* invoking
>> task_work_add(). As they can happen concurrently, wrap {closid, rmid}
>> accesses with READ_ONCE() and WRITE_ONCE(). Highlight the required ordering
>> with a pair of comments.
>
> ... and this one is if move_myself() or __resctrl_sched_in() runs while tsk is being
> written to on another CPU. It might get torn values, or multiple-reads get different values.
>
> The READ_ONCE/WRITE_ONCEry would have been easier to read as a separate patch as you touch
> all sites, and move/change some of them.
>

True, I initially only fixed up the reads/writes involved with
__rdtgroup_move_task(), but ended up coccinelle'ing the whole lot - which I
should have then moved to a dedicated patch. Thanks for powering through
it, I'll send a v2 with a neater split. 

> Regardless:
> Reviewed-by: James Morse <james.morse@arm.com>
>

Thanks!

>
> I don't 'get' memory-ordering, so one curiosity below:
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index b6b5b95df833..135a51529f70 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -524,11 +524,13 @@ static void move_myself(struct callback_head *head)
>>  	 * If resource group was deleted before this task work callback
>>  	 * was invoked, then assign the task to root group and free the
>>  	 * resource group.
>> +	 *
>> +	 * See pairing atomic_inc() in __rdtgroup_move_task()
>>  	 */
>>  	if (atomic_dec_and_test(&rdtgrp->waitcount) &&
>>  	    (rdtgrp->flags & RDT_DELETED)) {
>> -		current->closid = 0;
>> -		current->rmid = 0;
>> +		WRITE_ONCE(current->closid, 0);
>> +		WRITE_ONCE(current->rmid, 0);
>>  		kfree(rdtgrp);
>>  	}
>>  
>> @@ -553,14 +555,32 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
>
>>  	/*
>>  	 * Take a refcount, so rdtgrp cannot be freed before the
>>  	 * callback has been invoked.
>> +	 *
>> +	 * Also ensures above {closid, rmid} writes are observed by
>> +	 * move_myself(), as it can run immediately after task_work_add().
>> +	 * Otherwise old values may be loaded, and the move will only actually
>> +	 * happen at the next context switch.
>
> But __resctrl_sched_in() can still occur at anytime and READ_ONCE() a pair of values that
> don't go together?

Yes, the thought did cross my mind...

> I don't think this is a problem for RDT as with old-rmid the task was a member of that
> monitor-group previously, and 'freed' rmid are kept in limbo for a while after.
> (old-closid is the same as the task having not schedule()d since the change, which is fine).
>
> For MPAM, this is more annoying as changing just the closid may put the task in a
> monitoring group that never existed, meaning its surprise dirty later.
>
> If this all makes sense, I guess the fix (for much later) is to union closid/rmid, and
> WRITE_ONCE() them together where necessary.
> (I've made a note for when I next pass that part of the MPAM tree)
>

It does make sense to me - one more question back to you: can RDT exist on
an X86_32 system? It shouldn't be a stopper, but would be an inconvenience.

FWIW kernel/sched/fair.c uses two synced u64's for this; see

  struct cfs_rq { .min_vruntime, .min_vruntime_copy }

and

  kernel/sched/fair.c:update_min_vruntime()
  kernel/sched/fair.c:migrate_task_rq_fair()
>
>> +	 *
>> +	 * Pairs with atomic_dec() in move_myself().
>>  	 */
>>  	atomic_inc(&rdtgrp->waitcount);
>> +
>>  	ret = task_work_add(tsk, &callback->work, TWA_RESUME);
>>  	if (ret) {
>>  		/*
>
>
> Thanks!
>
> James


      reply	other threads:[~2020-11-20 15:54 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-18 18:00 [PATCH 0/2] x86/intel_rdt: task_work vs task_struct rmid/closid write race Valentin Schneider
2020-11-18 18:00 ` [PATCH 1/2] x86/intel_rdt: Check monitor group vs control group membership earlier Valentin Schneider
2020-11-20 14:53   ` James Morse
2020-11-20 15:53     ` Valentin Schneider
2020-11-18 18:00 ` [PATCH 2/2] x86/intel_rdt: Plug task_work vs task_struct {rmid,closid} update race Valentin Schneider
2020-11-20 14:53   ` James Morse
2020-11-20 15:54     ` Valentin Schneider [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jhjtutkuipe.mognet@arm.com \
    --to=valentin.schneider@arm.com \
    --cc=bp@alien8.de \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=reinette.chatre@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.