Date: Thu, 6 Jul 2017 14:42:16 -0700 (PDT)
From: Shivappa Vikas
To: Thomas Gleixner
cc: Vikas Shivappa, x86@kernel.org, linux-kernel@vger.kernel.org,
    hpa@zytor.com, peterz@infradead.org, "Shankar, Ravi V",
    vikas.shivappa@intel.com, tony.luck@intel.com, "Yu, Fenghua"
Subject: Re: [PATCH 13/21] x86/intel_rdt/cqm: Add cpus file support
References: <1498503368-20173-1-git-send-email-vikas.shivappa@linux.intel.com>
    <1498503368-20173-14-git-send-email-vikas.shivappa@linux.intel.com>

On Sun, 2 Jul 2017, Thomas Gleixner wrote:

> On Mon, 26 Jun 2017, Vikas Shivappa wrote:
>> -static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
>> -				   char *buf, size_t nbytes, loff_t off)
>> +static ssize_t cpus_mon_write(struct kernfs_open_file *of,
>> +			      char *buf, size_t nbytes,
>> +			      struct rdtgroup *rdtgrp)
>
> Again. Please make the split of rdtgroup_cpus_write() as a separate
> preparatory change first and just move the guts of the existing write
> function out into cpus_ctrl_write() and then add the mon_write stuff as an
> extra patch.
>
>> {
>> +	struct rdtgroup *pr = rdtgrp->parent, *cr;
>
> *pr and *cr really suck.
>
>> 	cpumask_var_t tmpmask, newmask;
>> -	struct rdtgroup *rdtgrp, *r;
>> +	struct list_head *llist;
>> 	int ret;
>>
>> -	if (!buf)
>> -		return -EINVAL;
>> -
>> 	if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL))
>> 		return -ENOMEM;
>> 	if (!zalloc_cpumask_var(&newmask, GFP_KERNEL)) {
>> @@ -233,10 +235,89 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
>> 		return -ENOMEM;
>> 	}
>>
>> -	rdtgrp = rdtgroup_kn_lock_live(of->kn);
>> -	if (!rdtgrp) {
>> -		ret = -ENOENT;
>> -		goto unlock;
>> +	if (is_cpu_list(of))
>> +		ret = cpulist_parse(buf, newmask);
>> +	else
>> +		ret = cpumask_parse(buf, newmask);
>
> The cpumask allocation and parsing of the user buffer can be done in the
> common code. No point in duplicating that.
>
>> +
>> +	if (ret)
>> +		goto out;
>> +
>> +	/* check that user didn't specify any offline cpus */
>> +	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
>> +	if (cpumask_weight(tmpmask)) {
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>
> Common code.

Will fix all of the above.

>> +	/* Check whether cpus belong to parent ctrl group */
>> +	cpumask_andnot(tmpmask, newmask, &pr->cpu_mask);
>> +	if (cpumask_weight(tmpmask)) {
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>> +	/* Check whether cpus are dropped from this group */
>> +	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
>> +	if (cpumask_weight(tmpmask)) {
>> +		/* Give any dropped cpus to parent rdtgroup */
>> +		cpumask_or(&pr->cpu_mask, &pr->cpu_mask, tmpmask);
>
> This does not make any sense. The check above verifies that all cpus in
> newmask belong to the parent->cpu_mask. If they don't then you return
> -EINVAL, but here you give them back to parent->cpu_mask. How is that
> supposed to work? You never get into this code path!

If I understand right, parent->cpu_mask is always the same as
parent->cpus_valid_mask. With monitoring groups, a cpu is always present
in exactly one ctrl_mon group and at most one mon group, and a mon group
can only contain cpus that belong to its parent. Maybe it needs a
comment?
(It is explained in the documentation patch.)

 # mkdir /sys/fs/resctrl/p1
 # mkdir /sys/fs/resctrl/p1/mon_groups/m1
 # echo 5-10 > /sys/fs/resctrl/p1/cpus_list

Say p1 has RMID 2; cpus 5-10 now have RMID 2.

 # echo 5-6 > /sys/fs/resctrl/p1/mon_groups/m1/cpus_list

Now cpus 5-6 have RMID 3 and cpus 7-10 have RMID 2.

 # cat /sys/fs/resctrl/p1/cpus_list
 5-10

This is because when we query the data for p1 it adds its own data
(RMID 2) and all the data for its child mon groups (hence all cpus from
5-10). But

>> +		cpumask_or(&pr->cpu_mask, &pr->cpu_mask, tmpmask);

can be removed because, as you suggest, it does nothing: the parent
already has these cpus. We just need the update_closid_rmid(tmpmask, pr).

> So you need a separate mask in the parent rdtgroup to store the CPUs which
> are valid in any monitoring group which belongs to it. So the logic
> becomes:
>
>	/*
>	 * Check whether the CPU mask is a subset of the CPUs
>	 * which belong to the parent group.
>	 */
>	cpumask_andnot(tmpmask, newmask, parent->cpus_valid_mask);
>	if (cpumask_weight(tmpmask))
>		return -EINVAL;
>
> When CAT is not available, then parent->cpus_valid_mask is a pointer to
> cpu_online_mask. When CAT is enabled, then parent->cpus_valid_mask is a
> pointer to the CAT group cpu mask.

When CAT is unavailable we cannot create any ctrl_mon groups.

>> +		update_closid_rmid(tmpmask, pr);
>> +	}
>> +
>> +	/*
>> +	 * If we added cpus, remove them from previous group that owned them
>> +	 * and update per-cpu rmid
>> +	 */
>> +	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
>> +	if (cpumask_weight(tmpmask)) {
>> +		llist = &pr->crdtgrp_list;
>
> llist is a bad name. We have a facility llist, i.e. lockless list.
> head?

Will fix.
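To make the ownership rule above concrete, here is a minimal userspace
sketch of the intended semantics. It is only an illustration, not the
kernel code: a 64-bit integer stands in for cpumask_t, and the names
`struct group`, `mon_cpus_write()` and `update_rmid()` are made up
(loosely mirroring cpus_mon_write() and update_closid_rmid() from the
patch). The key points it models: a mon group may only take CPUs that
its parent ctrl_mon group owns, the parent's cpu_mask never changes,
and only the per-CPU RMID moves between parent and mon group.

```c
#include <stdint.h>

#define NCPUS 64

/* Hypothetical model of a resctrl group: CPU mask plus RMID only */
struct group {
	uint64_t cpu_mask;	/* CPUs owned by this group */
	int rmid;		/* RMID programmed on those CPUs */
};

static int per_cpu_rmid[NCPUS];

/* Program rmid on every CPU set in mask (stand-in for update_closid_rmid()) */
static void update_rmid(uint64_t mask, int rmid)
{
	for (int cpu = 0; cpu < NCPUS; cpu++)
		if (mask & (1ULL << cpu))
			per_cpu_rmid[cpu] = rmid;
}

/*
 * Write newmask to a mon group's cpus file. The mask must be a subset
 * of the parent ctrl_mon group's CPUs; the parent's own cpu_mask is
 * never modified. CPUs dropped from the mon group fall back to the
 * parent's RMID, CPUs added take the mon group's RMID.
 */
static int mon_cpus_write(struct group *parent, struct group *mon,
			  uint64_t newmask)
{
	if (newmask & ~parent->cpu_mask)
		return -1;	/* not a subset of the parent group */

	/* dropped CPUs go back to the parent's RMID */
	update_rmid(mon->cpu_mask & ~newmask, parent->rmid);
	/* added CPUs take the mon group's RMID */
	update_rmid(newmask & ~mon->cpu_mask, mon->rmid);

	mon->cpu_mask = newmask;
	return 0;
}
```

With p1 owning cpus 5-10 under RMID 2 and m1 using RMID 3, writing 5-6
to m1 leaves p1's cpu_mask untouched while cpus 5-6 switch to RMID 3,
matching the shell example above; writing a CPU outside p1 fails.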
>
>> +		list_for_each_entry(cr, llist, crdtgrp_list) {
>> +			if (cr == rdtgrp)
>> +				continue;
>> +			cpumask_andnot(&cr->cpu_mask, &cr->cpu_mask, tmpmask);
>> +		}
>> +		update_closid_rmid(tmpmask, rdtgrp);
>> +	}
>
>> +static void cpumask_rdtgrp_clear(struct rdtgroup *r, struct cpumask *m)
>> +{
>> +	struct rdtgroup *cr;
>> +
>> +	cpumask_andnot(&r->cpu_mask, &r->cpu_mask, m);
>> +	/* update the child mon group masks as well */
>> +	list_for_each_entry(cr, &r->crdtgrp_list, crdtgrp_list)
>> +		cpumask_and(&cr->cpu_mask, &r->cpu_mask, &cr->cpu_mask);
>
> That's equally wrong. See above.

For the same reason as above: each cpu is present in one ctrl_mon group
and may be present in one mon group, so we need to clear both.

Thanks,
Vikas

>
> Thanks,
>
> 	tglx