From: James Morse <james.morse@arm.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
	shameerali.kolothum.thodi@huawei.com,
	D Scott Phillips OS <scott@os.amperecomputing.com>,
	carl@os.amperecomputing.com, lcherian@marvell.com,
	bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
	xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>,
	peternewman@google.com
Subject: Re: [PATCH v2 18/18] x86/resctrl: Separate arch and fs resctrl locks
Date: Mon, 6 Mar 2023 11:34:32 +0000
Message-ID: <cc4eed18-1966-ebcf-8ae1-81b7e0833299@arm.com>
In-Reply-To: <d7d43c53-a482-100f-8a25-4dae6500d184@intel.com>

Hi Reinette,

On 02/02/2023 23:50, Reinette Chatre wrote:
> On 1/13/2023 9:54 AM, James Morse wrote:
>> resctrl has one mutex that is taken by the architecture specific code,
>> and the filesystem parts. The two interact via cpuhp, where the
>> architecture code updates the domain list. Filesystem handlers that
>> walk the domains list should not run concurrently with the cpuhp
>> callback modifying the list.
>>
>> Exposing a lock from the filesystem code means the interface is not
>> cleanly defined, and creates the possibility of cross-architecture
>> lock ordering headaches. The interaction only exists so that certain
>> filesystem paths are serialised against cpu hotplug. The cpu hotplug
>> code already has a mechanism to do this using cpus_read_lock().
>>
>> MPAM's monitors have an overflow interrupt, so it needs to be possible
>> to walk the domains list in irq context. RCU is ideal for this,
>> but some paths need to be able to sleep to allocate memory.
>>
>> Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part
>> of a cpuhp callback, cpus_read_lock() must always be taken first.
>> rdtgroup_schemata_write() already does this.
>>
>> All but one of the filesystem code's domain list walkers are
>> currently protected by the rdtgroup_mutex taken in
>> rdtgroup_kn_lock_live(). The exception is rdt_bit_usage_show()
>> which takes the lock directly.
> 
> The new BMEC code also. You can find it on tip's x86/cache branch,
> see mbm_total_bytes_config_write() and mbm_local_bytes_config_write().
> 
>>
>> Make the domain list protected by RCU. An architecture-specific
>> lock prevents concurrent writers. rdt_bit_usage_show() can
>> walk the domain list under rcu_read_lock().
>> The other filesystem list walkers need to be able to sleep.
>> Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the
>> cpuhp callbacks can't be invoked when file system operations are
>> occurring.
>>
>> Add lockdep_assert_cpus_held() in the cases where the
>> rdtgroup_kn_lock_live() call isn't obvious.
>>
>> Resctrl's domain online/offline calls now need to take the
>> rdtgroup_mutex themselves.
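
(To make the fs-side change concrete, rdtgroup_kn_lock_live() would end up
looking roughly like this; a hand-written sketch, not the exact hunk from
the patch:

| static struct rdtgroup *rdtgroup_kn_lock_live(struct kernfs_node *kn)
| {
| 	struct rdtgroup *rdtgrp = kernfs_to_rdtgroup(kn);
|
| 	if (!rdtgrp)
| 		return NULL;
|
| 	rdtgroup_kn_get(rdtgrp, kn);
|
| 	/* Taken before rdtgroup_mutex, to match the cpuhp callbacks: */
| 	cpus_read_lock();
| 	mutex_lock(&rdtgroup_mutex);
|
| 	/* Was this group deleted while we waited? */
| 	if (rdtgrp->flags & RDT_DELETED)
| 		return NULL;
|
| 	return rdtgrp;
| }
)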


>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 7896fcf11df6..dc1ba580c4db 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -25,8 +25,14 @@
>>  #include <asm/resctrl.h>
>>  #include "internal.h"
>>  
>> -/* Mutex to protect rdtgroup access. */
>> -DEFINE_MUTEX(rdtgroup_mutex);
>> +/*
>> + * rdt_domain structures are kfree()d when their last cpu goes offline,
>> + * and allocated when the first cpu in a new domain comes online.
>> + * The rdt_resource's domain list is updated when this happens. The domain
>> + * list is protected by RCU, but callers can also take the cpus_read_lock()
>> + * to prevent modification if they need to sleep. All writers take this mutex:
> 
> Using "callers can" is not specific (compare to "callers should"). Please provide
> clear guidance on how the locks should be used. Reader may wonder "why take cpus_read_lock()
> to prevent modification, why not just take the mutex to prevent modification?"

'if they need to sleep' is the answer to this. I think a certain amount of background
knowledge can be assumed. My aim here wasn't to write an essay, but to indicate that not
all readers do the same thing. This is already the case in resctrl, and the MPAM PMU
stuff makes that worse.

Is this more robust:
| * rdt_domain structures are kfree()d when their last cpu goes offline,
| * and allocated when the first cpu in a new domain comes online.
| * The rdt_resource's domain list is updated when this happens. Readers of
| * the domain list must either take cpus_read_lock(), or rely on an RCU
| * read-side critical section, to avoid observing concurrent modification.
| * For information about RCU, see Documentation/RCU/rcu.rst.
| * All writers take this mutex:

?
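
To spell out the two reader patterns the comment describes ('r' here is any
struct rdt_resource, and the do_something_*() calls are just placeholders):

| struct rdt_domain *d;
|
| /* Readers that may sleep hold cpus_read_lock() across the walk: */
| cpus_read_lock();
| list_for_each_entry(d, &r->domains, list)
| 	do_something_that_may_sleep(d);
| cpus_read_unlock();
|
| /* Readers that can't sleep, e.g. MPAM's overflow interrupt, use RCU: */
| rcu_read_lock();
| list_for_each_entry_rcu(d, &r->domains, list)
| 	do_something_atomic(d);
| rcu_read_unlock();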


>> @@ -541,7 +550,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
>>  	cpumask_clear_cpu(cpu, &d->cpu_mask);
>>  	if (cpumask_empty(&d->cpu_mask)) {
>>  		resctrl_offline_domain(r, d);
>> -		list_del(&d->list);
>> +		list_del_rcu(&d->list);
>> +		synchronize_rcu();
>>  
>>  		/*
>>  		 * rdt_domain "d" is going to be freed below, so clear

> Should domain_remove_cpu() also get a "lockdep_assert_held(&domain_list_lock)"?

Yes, not sure why I didn't do that!
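
Something like this at the top of the function (sketch for v3):

| static void domain_remove_cpu(int cpu, struct rdt_resource *r)
| {
| 	lockdep_assert_held(&domain_list_lock);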


>> @@ -569,30 +579,27 @@ static void clear_closid_rmid(int cpu)
>>  static int resctrl_arch_online_cpu(unsigned int cpu)
>>  {
>>  	struct rdt_resource *r;
>> -	int err;
>>  
>> -	mutex_lock(&rdtgroup_mutex);
>> +	mutex_lock(&domain_list_lock);
>>  	for_each_capable_rdt_resource(r)
>>  		domain_add_cpu(cpu, r);
>>  	clear_closid_rmid(cpu);
>> +	mutex_unlock(&domain_list_lock);

> Why is clear_closid_rmid(cpu) protected by mutex?

It doesn't need to be, it's just an artefact of changing the lock, then moving the
filesystem calls out. (It doesn't need to be protected by rdtgroup_mutex today.)

If you don't think it's churn, I'll move it to make it clearer.
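
Spelled out, the callback would become roughly:

| 	mutex_lock(&domain_list_lock);
| 	for_each_capable_rdt_resource(r)
| 		domain_add_cpu(cpu, r);
| 	mutex_unlock(&domain_list_lock);
|
| 	/* Doesn't need the domain-list lock: */
| 	clear_closid_rmid(cpu);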


Thanks,

James
