linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
@ 2017-04-04 19:02 Tracy Smith
  0 siblings, 0 replies; 9+ messages in thread
From: Tracy Smith @ 2017-04-04 19:02 UTC (permalink / raw)
  To: Shivappa Vikas
  Cc: x86, lkml, ravi.v.shankar, tony.luck, fenghua.yu, h.peter.anvin,
	Vikas Shivappa

Apologies, unrelated to MBA. Resent later with a changed subject line.

On Tue, Apr 4, 2017 at 1:50 PM, Shivappa Vikas <vikas.shivappa@intel.com> wrote:
>
>
>
> On Mon, 3 Apr 2017, Tracy Smith wrote:
>
>> Hi All,
>>
>> No JTAG available and need to understand why Linux 4.8.3 doesn't boot on a
>> x86_64 corei7-64.  Hangs at the typical "Starting kernel" location after
>> the last message of the U-boot.  The bootcmd is given below.
>
>
> Do you see the issue when you apply MBA patches . It seems like you use 4.8
> kernel but the mba patches are dependent on :
> https://marc.info/?l=linux-kernel&m=149125583824700
>
> which is applied on 4.11-rc4.
>
>
>>
>> See a fault FFS and a message indicating the image didn't load after
>> failover.
>>
>> 1) How can I verify if the kernel image was loaded from uboot
>> 2) What is this fault?
>> 3) Has the bootargs or bootcmd changed between 4.1 and 4.8.3?
>> 4) If the boot cmd/arg has changed, what should the boot cmd/arg be for
>> 4.8.3 to boot on x86_64 corei7-64?
>>
>> Initial RAM disk at linear address 0x20000000, size 11638378 bytes
>> Kernel command line: "BOOT_IMAGE=/imgx/bzImage LABEL=BOOT root=/dev/ram0
>> imgmnt=/media/sda2 imgdir=imgx img=image.rootfs rootdelay=2 slub_debug=F
>> console=ttyS1,115200  bootcount=1 bootcount_addr=0xa4000
>> acpi_enforce_resources=lax pram_size=0x800000 pram_addr=10000000
>> pram_loc=ddr crashkernel=128M memmap=0x800000$0x10000000  "
>> EFI table at 683392c0, mmap 68339300, mmap size 4c0, version 1, descr.
>> size
>> 0x30
>>
>> Starting kernel ...
>>
>> Timer summary in microseconds:
>>       Mark    Elapsed  Stage
>>          0          0  reset
>>          1          1  board_init_r
>>        105        104  board_init_f
>> 10,180,048 10,179,943  id=64
>> 10,221,985     41,937  id=65
>> 10,356,645    134,660  main_loop
>> 12,366,521  2,009,876  usb_start
>> 18,747,284  6,380,763  start_kernel
>> Accumulated time:
>>            10,162,689  ahci
>>
>> On a 4.1.26-yocto-standard #1 SMP it boots with no issues.  Basically same
>> .config used in both cases except for anything deprecated between 4.1 and
>> 4.8.3.
>>
>> root@:~# cat /proc/cmdline
>> BOOT_IMAGE=/imgy/bzImage LABEL=BOOT root=/dev/ram0 imgmnt=/media/sda2
>> imgdir=imgy img=image.rootfs rootdelay=2 slub_debug=F console=ttyS1,115200
>> fault=FFS bootcount=3 bootcount_addr=0xa4000  acpi_enforce_resources=lax
>> pram_size=0x800000 pram_addr=10000000 pram_loc=ddr crashkernel=128M
>> memmap=0x800000$0x10000000
>> root@CLX3001:~# cat /proc/consoles
>> ttyS1                -W- (EC p a)    4:65
>> netcon0              -W- (E     )
>>
>> thx,
>> Tray
>>
>



-- 
Confidentiality notice: This e-mail message, including any
attachments, may contain legally privileged and/or confidential
information. If you are not the intended recipient(s), please
immediately notify the sender and delete this e-mail message.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-04-08  0:33 [PATCH 0/8 V4] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
@ 2017-04-08  0:33 ` Vikas Shivappa
  0 siblings, 0 replies; 9+ messages in thread
From: Vikas Shivappa @ 2017-04-08  0:33 UTC (permalink / raw)
  To: vikas.shivappa, x86, linux-kernel
  Cc: hpa, tglx, mingo, ravi.v.shankar, tony.luck, fenghua.yu, vikas.shivappa

The MBA feature details like minimum bandwidth supported, b/w
granularity etc are obtained via executing CPUID with EAX=10H
,ECX=3.

Setup and initialize the MBA specific extensions to data structures like
global list of RDT resources, RDT resource structure and RDT domain
structure.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
 arch/x86/include/asm/intel_rdt.h |  92 +++++++++++++++---------
 arch/x86/kernel/cpu/intel_rdt.c  | 151 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 199 insertions(+), 44 deletions(-)

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 4c94f18..097134b 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -12,6 +12,7 @@
 #define IA32_L3_QOS_CFG		0xc81
 #define IA32_L3_CBM_BASE	0xc90
 #define IA32_L2_CBM_BASE	0xd10
+#define IA32_MBA_THRTL_BASE	0xd50
 
 #define L3_QOS_CDP_ENABLE	0x01ULL
 
@@ -69,39 +70,6 @@ struct rftype {
 };
 
 /**
- * struct rdt_resource - attributes of an RDT resource
- * @enabled:			Is this feature enabled on this machine
- * @capable:			Is this feature available on this machine
- * @name:			Name to use in "schemata" file
- * @num_closid:			Number of CLOSIDs available
- * @default_ctrl:		Specifies default cache cbm or mem b/w percent.
- * @data_width:		Character width of data when displaying
- * @min_cbm_bits:		Minimum number of consecutive bits to be set
- *				in a cache bit mask
- * @domains:			All domains for this resource
- * @msr_base:			Base MSR address for CBMs
- * @cache_level:		Which cache level defines scope of this domain
- * @cbm_idx_multi:		Multiplier of CBM index
- * @cbm_idx_offset:		Offset of CBM index. CBM index is computed by:
- *				closid * cbm_idx_multi + cbm_idx_offset
- */
-struct rdt_resource {
-	bool			enabled;
-	bool			capable;
-	char			*name;
-	int			num_closid;
-	int			cbm_len;
-	int			min_cbm_bits;
-	u32			default_ctrl;
-	int			data_width;
-	struct list_head	domains;
-	int			msr_base;
-	int			cache_level;
-	int			cbm_idx_multi;
-	int			cbm_idx_offset;
-};
-
-/**
  * struct rdt_domain - group of cpus sharing an RDT resource
  * @list:	all instances of this resource
  * @id:		unique id for this instance
@@ -131,6 +99,63 @@ struct msr_param {
 	int			high;
 };
 
+/**
+ * struct rdt_resource - attributes of an RDT resource
+ * @enabled:			Is this feature enabled on this machine
+ * @capable:			Is this feature available on this machine
+ * @name:			Name to use in "schemata" file
+ * @num_closid:			Number of CLOSIDs available
+ * @default_ctrl:		Specifies default cache cbm or mem b/w percent.
+ * @data_width:		Character width of data when displaying
+ * @msr_update:		Function pointer to update QOS MSRs
+ * @domains:			All domains for this resource
+ * @msr_base:			Base MSR address for CBMs
+ * @cache_level:		Which cache level defines scope of this domain
+ * @cbm_idx_multi:		Multiplier of CBM index
+ * @cbm_idx_offset:		Offset of CBM index. CBM index is computed by:
+ *				closid * cbm_idx_multi + cbm_idx_offset
+ * @cbm_len			Number of cbm bits
+ * @min_cbm_bits:		Minimum number of consecutive bits to be set
+ *				in a cache bit mask
+ * @max_delay:			Max throttle delay. Delay is the hardware
+ *				understandable value for memory bandwidth.
+ * @min_bw:			Minimum memory bandwidth percentage user
+ *				can request
+ * @bw_gran:			Granularity at which the memory bandwidth
+ *				is allocated
+ * @delay_linear:		True if memory b/w delay is in linear scale
+ * @mb_map:			Mapping of memory b/w percentage to
+ *				memory b/w delay values
+ */
+struct rdt_resource {
+	bool			enabled;
+	bool			capable;
+	char			*name;
+	int			num_closid;
+	u32			default_ctrl;
+	int			data_width;
+	void (*msr_update)	(struct rdt_domain *d, struct msr_param *m,
+				 struct rdt_resource *r);
+	struct list_head	domains;
+	int			msr_base;
+	int			cache_level;
+	int			cbm_idx_multi;
+	int			cbm_idx_offset;
+	union {
+		struct { /*cache ctrls*/
+			int	cbm_len;
+			int	min_cbm_bits;
+		};
+		struct { /*mem ctrls*/
+			u32	max_delay;
+			u32	min_bw;
+			u32	bw_gran;
+			u32	delay_linear;
+			u32	*mb_map;
+		};
+	};
+};
+
 extern struct mutex rdtgroup_mutex;
 
 extern struct rdt_resource rdt_resources_all[];
@@ -144,6 +169,7 @@ enum {
 	RDT_RESOURCE_L3DATA,
 	RDT_RESOURCE_L3CODE,
 	RDT_RESOURCE_L2,
+	RDT_RESOURCE_MBA,
 
 	/* Must be the last */
 	RDT_NUM_RESOURCES,
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index c4cf2e8..b9f1cfc 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -32,6 +32,9 @@
 #include <asm/intel-family.h>
 #include <asm/intel_rdt.h>
 
+#define MAX_MBA_BW	100u
+#define MBA_IS_LINEAR	0x4
+
 /* Mutex to protect rdtgroup access. */
 DEFINE_MUTEX(rdtgroup_mutex);
 
@@ -45,11 +48,17 @@
  */
 int max_name_width, max_data_width;
 
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+static void
+cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+
 struct rdt_resource rdt_resources_all[] = {
 	{
 		.name		= "L3",
 		.domains	= domain_init(RDT_RESOURCE_L3),
 		.msr_base	= IA32_L3_CBM_BASE,
+		.msr_update	= cat_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 3,
 		.cbm_idx_multi	= 1,
@@ -59,6 +68,7 @@ struct rdt_resource rdt_resources_all[] = {
 		.name		= "L3DATA",
 		.domains	= domain_init(RDT_RESOURCE_L3DATA),
 		.msr_base	= IA32_L3_CBM_BASE,
+		.msr_update	= cat_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 3,
 		.cbm_idx_multi	= 2,
@@ -68,6 +78,7 @@ struct rdt_resource rdt_resources_all[] = {
 		.name		= "L3CODE",
 		.domains	= domain_init(RDT_RESOURCE_L3CODE),
 		.msr_base	= IA32_L3_CBM_BASE,
+		.msr_update	= cat_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 3,
 		.cbm_idx_multi	= 2,
@@ -77,11 +88,21 @@ struct rdt_resource rdt_resources_all[] = {
 		.name		= "L2",
 		.domains	= domain_init(RDT_RESOURCE_L2),
 		.msr_base	= IA32_L2_CBM_BASE,
+		.msr_update	= cat_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 2,
 		.cbm_idx_multi	= 1,
 		.cbm_idx_offset	= 0
 	},
+	{
+		.name		= "MB",
+		.domains	= domain_init(RDT_RESOURCE_MBA),
+		.msr_base	= IA32_MBA_THRTL_BASE,
+		.msr_update	= mba_wrmsr,
+		.cache_level	= 3,
+		.cbm_idx_multi	= 1,
+		.cbm_idx_offset = 0
+	},
 };
 
 static int cbm_idx(struct rdt_resource *r, int closid)
@@ -136,6 +157,53 @@ static inline bool cache_alloc_hsw_probe(void)
 	return false;
 }
 
+/*
+ * rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
+ * exposed to user interface and the h/w understandable delay values.
+ *
+ * The non-linear delay values have the granularity of power of two
+ * and also the h/w does not guarantee a curve for configured delay
+ * values vs. actual b/w enforced.
+ * Hence we need a mapping that is pre calibrated so the user can
+ * express the memory b/w as a percentage value.
+ */
+static inline bool rdt_get_mb_table(struct rdt_resource *r)
+{
+	/*
+	 * There are no Intel SKUs as of now to support non-linear delay.
+	 */
+	pr_info("MBA b/w map not implemented for cpu:%d, model:%d",
+	        boot_cpu_data.x86, boot_cpu_data.x86_model);
+
+	return false;
+}
+
+static bool rdt_get_mem_config(struct rdt_resource *r)
+{
+	union cpuid_0x10_3_eax eax;
+	union cpuid_0x10_x_edx edx;
+	u32 ebx, ecx;
+
+	cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
+	r->num_closid = edx.split.cos_max + 1;
+	r->max_delay = eax.split.max_delay + 1;
+	r->default_ctrl = MAX_MBA_BW;
+	if (ecx & MBA_IS_LINEAR) {
+		r->delay_linear = true;
+		r->min_bw = MAX_MBA_BW - r->max_delay;
+		r->bw_gran = MAX_MBA_BW - r->max_delay;
+	} else {
+		if (!rdt_get_mb_table(r))
+			return false;
+	}
+	r->data_width = 3;
+
+	r->capable = true;
+	r->enabled = true;
+
+	return true;
+}
+
 static void rdt_get_cache_config(int idx, struct rdt_resource *r)
 {
 	union cpuid_0x10_1_eax eax;
@@ -212,7 +280,8 @@ static inline bool get_rdt_resources(void)
 		ret = true;
 	}
 
-	if (boot_cpu_has(X86_FEATURE_MBA))
+	if (boot_cpu_has(X86_FEATURE_MBA) &&
+	     rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
 		ret = true;
 
 	rdt_init_padding();
@@ -233,6 +302,47 @@ static int get_cache_id(int cpu, int level)
 	return -1;
 }
 
+/*
+ * Map the memory b/w percentage value to delay values
+ * that can be written to QOS_MSRs.
+ * There are currently no SKUs which support non linear delay values.
+ */
+static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
+{
+	if (r->delay_linear)
+		return MAX_MBA_BW - bw;
+
+	WARN_ONCE(1, "Non Linear delay-bw map not supported but queried\n");
+	return r->default_ctrl;
+}
+
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+	int i;
+
+	for (i = m->low; i < m->high; i++) {
+		int idx = cbm_idx(r, i);
+
+		/*
+		 * Write the delay value for mba.
+		 */
+		wrmsrl(r->msr_base + idx, delay_bw_map(d->ctrl_val[i], r));
+	}
+}
+
+static void
+cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+	int i;
+
+	for (i = m->low; i < m->high; i++) {
+		int idx = cbm_idx(r, i);
+
+		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
+	}
+}
+
 void rdt_ctrl_update(void *arg)
 {
 	struct msr_param *m = (struct msr_param *)arg;
@@ -291,6 +401,33 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
 	return NULL;
 }
 
+static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
+{
+	struct msr_param m;
+	u32 *dc;
+	int i;
+
+	dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+	if (!dc)
+		return -ENOMEM;
+
+	d->ctrl_val = dc;
+
+	/*
+	 * Initialize the Control MSRs to having no control.
+	 * For Cache Allocation: Set all bits in cbm
+	 * For Memory Allocation: Set b/w requested to 100
+	 */
+	for (i = 0; i < r->num_closid; i++, dc++)
+		*dc = r->default_ctrl;
+
+	m.low = 0;
+	m.high = r->num_closid;
+	r->msr_update(d, &m, r);
+
+	return 0;
+}
+
 /*
  * domain_add_cpu - Add a cpu to a resource's domain list.
  *
@@ -306,7 +443,7 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
  */
 static void domain_add_cpu(int cpu, struct rdt_resource *r)
 {
-	int i, id = get_cache_id(cpu, r->cache_level);
+	int id = get_cache_id(cpu, r->cache_level);
 	struct list_head *add_pos = NULL;
 	struct rdt_domain *d;
 
@@ -327,19 +464,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 
 	d->id = id;
 
-	d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
-	if (!d->ctrl_val) {
+	if (domain_setup_ctrlval(r, d)) {
 		kfree(d);
 		return;
 	}
 
-	for (i = 0; i < r->num_closid; i++) {
-		int idx = cbm_idx(r, i);
-
-		d->ctrl_val[i] = r->default_ctrl;
-		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
-	}
-
 	cpumask_set_cpu(cpu, &d->cpu_mask);
 	list_add_tail(&d->list, add_pos);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-04-05 15:40   ` Thomas Gleixner
@ 2017-04-05 18:09     ` Shivappa Vikas
  0 siblings, 0 replies; 9+ messages in thread
From: Shivappa Vikas @ 2017-04-05 18:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Vikas Shivappa, vikas.shivappa, x86, linux-kernel, hpa, mingo,
	peterz, ravi.v.shankar, tony.luck, fenghua.yu, h.peter.anvin



On Wed, 5 Apr 2017, Thomas Gleixner wrote:

> On Mon, 3 Apr 2017, Vikas Shivappa wrote:
>>
>>  /**
>> + * struct rdt_domain - group of cpus sharing an RDT resource
>> + * @list:	all instances of this resource
>> + * @id:		unique id for this instance
>> + * @cpu_mask:	which cpus share this resource
>> + * @ctrl_val:	array of cache or mem ctrl values (indexed by CLOSID)
>> + * @new_cbm:	new cbm value to be loaded
>> + * @have_new_cbm: did user provide new_cbm for this domain
>
> The version which you removed below has the kernel-doc comments correct ....

Will fix

>
>> +/**
>>   * struct rdt_resource - attributes of an RDT resource
>>   * @enabled:			Is this feature enabled on this machine
>>   * @capable:			Is this feature available on this machine
>> @@ -78,6 +109,16 @@ struct rftype {
>>   * @data_width:		Character width of data when displaying
>>   * @min_cbm_bits:		Minimum number of consecutive bits to be set
>>   *				in a cache bit mask
>> + * @msr_update:		Function pointer to update QOS MSRs
>> + * @max_delay:			Max throttle delay. Delay is the hardware
>> + *				understandable value for memory bandwidth.
>> + * @min_bw:			Minimum memory bandwidth percentage user
>> + *				can request
>> + * @bw_gran:			Granularity at which the memory bandwidth
>> + *				is allocated
>> + * @delay_linear:		True if memory b/w delay is in linear scale
>> + * @mb_map:			Mapping of memory b/w percentage to
>> + *				memory b/w delay values
>>   * @domains:			All domains for this resource
>>   * @msr_base:			Base MSR address for CBMs
>>   * @cache_level:		Which cache level defines scope of this domain
>> @@ -94,6 +135,14 @@ struct rdt_resource {
>>  	int			min_cbm_bits;
>>  	u32			default_ctrl;
>>  	int			data_width;
>> +	void (*msr_update)	(struct rdt_domain *d, struct msr_param *m,
>> +				 struct rdt_resource *r);
>> +	u32			max_delay;
>> +	u32			min_bw;
>> +	u32			bw_gran;
>> +	u32			delay_linear;
>> +	u32			*mb_map;
>
> I don't know what other weird controls will be added over time, but we are
> probably better off to have
>
> struct cache_ctrl {
> 	int		cbm_len;
> 	int		min_cbm_bits;
> };
>
> struct mba_ctrl {
> 	u32			max_delay;
> 	u32			min_bw;
> 	u32			bw_gran;
> 	u32			delay_linear;
> 	u32			*mb_map;
> };
>
> and in then in struct rdt_resource:
>
>       <common fields>
>       union {
>       		struct cache_ctrl	foo;
> 		struct mba_ctrl		bla;
> 	} ctrl;
>
>
> That avoids that rdt_resource becomes a hodgepodge of unrelated or even
> contradicting fields.
>
> Hmm?

Ok, makes sense. Will fix. Thought of a union when i had added a couple fields 
and given up but its grown a lot now.

Thanks,
Vikas

>
> Thanks,
>
> 	tglx
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-04-03 21:57 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
       [not found]   ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
@ 2017-04-05 15:40   ` Thomas Gleixner
  2017-04-05 18:09     ` Shivappa Vikas
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2017-04-05 15:40 UTC (permalink / raw)
  To: Vikas Shivappa
  Cc: vikas.shivappa, x86, linux-kernel, hpa, mingo, peterz,
	ravi.v.shankar, tony.luck, fenghua.yu, h.peter.anvin

On Mon, 3 Apr 2017, Vikas Shivappa wrote:
>  
>  /**
> + * struct rdt_domain - group of cpus sharing an RDT resource
> + * @list:	all instances of this resource
> + * @id:		unique id for this instance
> + * @cpu_mask:	which cpus share this resource
> + * @ctrl_val:	array of cache or mem ctrl values (indexed by CLOSID)
> + * @new_cbm:	new cbm value to be loaded
> + * @have_new_cbm: did user provide new_cbm for this domain

The version which you removed below has the kernel-doc comments correct ....

> +/**
>   * struct rdt_resource - attributes of an RDT resource
>   * @enabled:			Is this feature enabled on this machine
>   * @capable:			Is this feature available on this machine
> @@ -78,6 +109,16 @@ struct rftype {
>   * @data_width:		Character width of data when displaying
>   * @min_cbm_bits:		Minimum number of consecutive bits to be set
>   *				in a cache bit mask
> + * @msr_update:		Function pointer to update QOS MSRs
> + * @max_delay:			Max throttle delay. Delay is the hardware
> + *				understandable value for memory bandwidth.
> + * @min_bw:			Minimum memory bandwidth percentage user
> + *				can request
> + * @bw_gran:			Granularity at which the memory bandwidth
> + *				is allocated
> + * @delay_linear:		True if memory b/w delay is in linear scale
> + * @mb_map:			Mapping of memory b/w percentage to
> + *				memory b/w delay values
>   * @domains:			All domains for this resource
>   * @msr_base:			Base MSR address for CBMs
>   * @cache_level:		Which cache level defines scope of this domain
> @@ -94,6 +135,14 @@ struct rdt_resource {
>  	int			min_cbm_bits;
>  	u32			default_ctrl;
>  	int			data_width;
> +	void (*msr_update)	(struct rdt_domain *d, struct msr_param *m,
> +				 struct rdt_resource *r);
> +	u32			max_delay;
> +	u32			min_bw;
> +	u32			bw_gran;
> +	u32			delay_linear;
> +	u32			*mb_map;

I don't know what other weird controls will be added over time, but we are
probably better off to have

struct cache_ctrl {
	int		cbm_len;
	int		min_cbm_bits;
};

struct mba_ctrl {
	u32			max_delay;
	u32			min_bw;
	u32			bw_gran;
	u32			delay_linear;
	u32			*mb_map;
};

and in then in struct rdt_resource:

       <common fields>
       union {
       		struct cache_ctrl	foo;
		struct mba_ctrl		bla;
	} ctrl;


That avoids that rdt_resource becomes a hodgepodge of unrelated or even
contradicting fields.

Hmm?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
       [not found]   ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
@ 2017-04-04 18:50     ` Shivappa Vikas
  0 siblings, 0 replies; 9+ messages in thread
From: Shivappa Vikas @ 2017-04-04 18:50 UTC (permalink / raw)
  To: Tracy Smith
  Cc: x86, vikas.shivappa, lkml, ravi.v.shankar, tony.luck, fenghua.yu,
	h.peter.anvin, Vikas Shivappa




On Mon, 3 Apr 2017, Tracy Smith wrote:

> Hi All,
>
> No JTAG available and need to understand why Linux 4.8.3 doesn't boot on a
> x86_64 corei7-64.  Hangs at the typical "Starting kernel" location after
> the last message of the U-boot.  The bootcmd is given below.

Do you see the issue when you apply MBA patches . It seems like you use 4.8 
kernel but the mba patches are dependent on :
https://marc.info/?l=linux-kernel&m=149125583824700

which is applied on 4.11-rc4.

>
> See a fault FFS and a message indicating the image didn't load after
> failover.
>
> 1) How can I verify if the kernel image was loaded from uboot
> 2) What is this fault?
> 3) Has the bootargs or bootcmd changed between 4.1 and 4.8.3?
> 4) If the boot cmd/arg has changed, what should the boot cmd/arg be for
> 4.8.3 to boot on x86_64 corei7-64?
>
> Initial RAM disk at linear address 0x20000000, size 11638378 bytes
> Kernel command line: "BOOT_IMAGE=/imgx/bzImage LABEL=BOOT root=/dev/ram0
> imgmnt=/media/sda2 imgdir=imgx img=image.rootfs rootdelay=2 slub_debug=F
> console=ttyS1,115200  bootcount=1 bootcount_addr=0xa4000
> acpi_enforce_resources=lax pram_size=0x800000 pram_addr=10000000
> pram_loc=ddr crashkernel=128M memmap=0x800000$0x10000000  "
> EFI table at 683392c0, mmap 68339300, mmap size 4c0, version 1, descr. size
> 0x30
>
> Starting kernel ...
>
> Timer summary in microseconds:
>       Mark    Elapsed  Stage
>          0          0  reset
>          1          1  board_init_r
>        105        104  board_init_f
> 10,180,048 10,179,943  id=64
> 10,221,985     41,937  id=65
> 10,356,645    134,660  main_loop
> 12,366,521  2,009,876  usb_start
> 18,747,284  6,380,763  start_kernel
> Accumulated time:
>            10,162,689  ahci
>
> On a 4.1.26-yocto-standard #1 SMP it boots with no issues.  Basically same
> .config used in both cases except for anything deprecated between 4.1 and
> 4.8.3.
>
> root@:~# cat /proc/cmdline
> BOOT_IMAGE=/imgy/bzImage LABEL=BOOT root=/dev/ram0 imgmnt=/media/sda2
> imgdir=imgy img=image.rootfs rootdelay=2 slub_debug=F console=ttyS1,115200
> fault=FFS bootcount=3 bootcount_addr=0xa4000  acpi_enforce_resources=lax
> pram_size=0x800000 pram_addr=10000000 pram_loc=ddr crashkernel=128M
> memmap=0x800000$0x10000000
> root@CLX3001:~# cat /proc/consoles
> ttyS1                -W- (EC p a)    4:65
> netcon0              -W- (E     )
>
> thx,
> Tray
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-04-03 21:57 [PATCH 0/8 V3] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
@ 2017-04-03 21:57 ` Vikas Shivappa
       [not found]   ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
  2017-04-05 15:40   ` Thomas Gleixner
  0 siblings, 2 replies; 9+ messages in thread
From: Vikas Shivappa @ 2017-04-03 21:57 UTC (permalink / raw)
  To: vikas.shivappa, x86, linux-kernel
  Cc: hpa, tglx, mingo, peterz, ravi.v.shankar, tony.luck, fenghua.yu,
	h.peter.anvin

The MBA feature details like minimum bandwidth supported, b/w
granularity etc are obtained via executing CPUID with EAX=10H
,ECX=3.

Setup and initialize the MBA specific extensions to data structures like
global list of RDT resources, RDT resource structure and RDT domain
structure.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
 arch/x86/include/asm/intel_rdt.h |  80 +++++++++++++--------
 arch/x86/kernel/cpu/intel_rdt.c  | 151 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 190 insertions(+), 41 deletions(-)

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 4c94f18..285cdeb 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -12,6 +12,7 @@
 #define IA32_L3_QOS_CFG		0xc81
 #define IA32_L3_CBM_BASE	0xc90
 #define IA32_L2_CBM_BASE	0xd10
+#define IA32_MBA_THRTL_BASE	0xd50
 
 #define L3_QOS_CDP_ENABLE	0x01ULL
 
@@ -69,6 +70,36 @@ struct rftype {
 };
 
 /**
+ * struct rdt_domain - group of cpus sharing an RDT resource
+ * @list:	all instances of this resource
+ * @id:		unique id for this instance
+ * @cpu_mask:	which cpus share this resource
+ * @ctrl_val:	array of cache or mem ctrl values (indexed by CLOSID)
+ * @new_cbm:	new cbm value to be loaded
+ * @have_new_cbm: did user provide new_cbm for this domain
+ */
+struct rdt_domain {
+	struct list_head	list;
+	int			id;
+	struct cpumask		cpu_mask;
+	u32			*ctrl_val;
+	u32			new_ctrl;
+	bool			have_new_ctrl;
+};
+
+/**
+ * struct msr_param - set a range of MSRs from a domain
+ * @res:       The resource to use
+ * @low:       Beginning index from base MSR
+ * @high:      End index
+ */
+struct msr_param {
+	struct rdt_resource	*res;
+	int			low;
+	int			high;
+};
+
+/**
  * struct rdt_resource - attributes of an RDT resource
  * @enabled:			Is this feature enabled on this machine
  * @capable:			Is this feature available on this machine
@@ -78,6 +109,16 @@ struct rftype {
  * @data_width:		Character width of data when displaying
  * @min_cbm_bits:		Minimum number of consecutive bits to be set
  *				in a cache bit mask
+ * @msr_update:		Function pointer to update QOS MSRs
+ * @max_delay:			Max throttle delay. Delay is the hardware
+ *				understandable value for memory bandwidth.
+ * @min_bw:			Minimum memory bandwidth percentage user
+ *				can request
+ * @bw_gran:			Granularity at which the memory bandwidth
+ *				is allocated
+ * @delay_linear:		True if memory b/w delay is in linear scale
+ * @mb_map:			Mapping of memory b/w percentage to
+ *				memory b/w delay values
  * @domains:			All domains for this resource
  * @msr_base:			Base MSR address for CBMs
  * @cache_level:		Which cache level defines scope of this domain
@@ -94,6 +135,14 @@ struct rdt_resource {
 	int			min_cbm_bits;
 	u32			default_ctrl;
 	int			data_width;
+	void (*msr_update)	(struct rdt_domain *d, struct msr_param *m,
+				 struct rdt_resource *r);
+	u32			max_delay;
+	u32			min_bw;
+	u32			bw_gran;
+	u32			delay_linear;
+	u32			*mb_map;
+
 	struct list_head	domains;
 	int			msr_base;
 	int			cache_level;
@@ -101,36 +150,6 @@ struct rdt_resource {
 	int			cbm_idx_offset;
 };
 
-/**
- * struct rdt_domain - group of cpus sharing an RDT resource
- * @list:	all instances of this resource
- * @id:		unique id for this instance
- * @cpu_mask:	which cpus share this resource
- * @ctrl_val:	array of cache or mem ctrl values (indexed by CLOSID)
- * @new_ctrl:	new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
- */
-struct rdt_domain {
-	struct list_head	list;
-	int			id;
-	struct cpumask		cpu_mask;
-	u32			*ctrl_val;
-	u32			new_ctrl;
-	bool			have_new_ctrl;
-};
-
-/**
- * struct msr_param - set a range of MSRs from a domain
- * @res:       The resource to use
- * @low:       Beginning index from base MSR
- * @high:      End index
- */
-struct msr_param {
-	struct rdt_resource	*res;
-	int			low;
-	int			high;
-};
-
 extern struct mutex rdtgroup_mutex;
 
 extern struct rdt_resource rdt_resources_all[];
@@ -144,6 +163,7 @@ enum {
 	RDT_RESOURCE_L3DATA,
 	RDT_RESOURCE_L3CODE,
 	RDT_RESOURCE_L2,
+	RDT_RESOURCE_MBA,
 
 	/* Must be the last */
 	RDT_NUM_RESOURCES,
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index c4cf2e8..be272b9 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -32,6 +32,9 @@
 #include <asm/intel-family.h>
 #include <asm/intel_rdt.h>
 
+#define MAX_MBA_BW	100u
+#define MBA_IS_LINEAR	0x4
+
 /* Mutex to protect rdtgroup access. */
 DEFINE_MUTEX(rdtgroup_mutex);
 
@@ -45,11 +48,17 @@
  */
 int max_name_width, max_data_width;
 
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+static void
+cqm_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+
 struct rdt_resource rdt_resources_all[] = {
 	{
 		.name		= "L3",
 		.domains	= domain_init(RDT_RESOURCE_L3),
 		.msr_base	= IA32_L3_CBM_BASE,
+		.msr_update	= cqm_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 3,
 		.cbm_idx_multi	= 1,
@@ -59,6 +68,7 @@ struct rdt_resource rdt_resources_all[] = {
 		.name		= "L3DATA",
 		.domains	= domain_init(RDT_RESOURCE_L3DATA),
 		.msr_base	= IA32_L3_CBM_BASE,
+		.msr_update	= cqm_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 3,
 		.cbm_idx_multi	= 2,
@@ -68,6 +78,7 @@ struct rdt_resource rdt_resources_all[] = {
 		.name		= "L3CODE",
 		.domains	= domain_init(RDT_RESOURCE_L3CODE),
 		.msr_base	= IA32_L3_CBM_BASE,
+		.msr_update	= cqm_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 3,
 		.cbm_idx_multi	= 2,
@@ -77,11 +88,21 @@ struct rdt_resource rdt_resources_all[] = {
 		.name		= "L2",
 		.domains	= domain_init(RDT_RESOURCE_L2),
 		.msr_base	= IA32_L2_CBM_BASE,
+		.msr_update	= cqm_wrmsr,
 		.min_cbm_bits	= 1,
 		.cache_level	= 2,
 		.cbm_idx_multi	= 1,
 		.cbm_idx_offset	= 0
 	},
+	{
+		.name		= "MB",
+		.domains	= domain_init(RDT_RESOURCE_MBA),
+		.msr_base	= IA32_MBA_THRTL_BASE,
+		.msr_update	= mba_wrmsr,
+		.cache_level	= 3,
+		.cbm_idx_multi	= 1,
+		.cbm_idx_offset = 0
+	},
 };
 
 static int cbm_idx(struct rdt_resource *r, int closid)
@@ -136,6 +157,53 @@ static inline bool cache_alloc_hsw_probe(void)
 	return false;
 }
 
+/*
+ * rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
+ * exposed to user interface and the h/w understandable delay values.
+ *
+ * The non-linear delay values have the granularity of power of two
+ * and also the h/w does not guarantee a curve for configured delay
+ * values vs. actual b/w enforced.
+ * Hence we need a mapping that is pre calibrated so the user can
+ * express the memory b/w as a percentage value.
+ */
+static inline bool rdt_get_mb_table(struct rdt_resource *r)
+{
+	/*
+	 * There are no Intel SKUs as of now to support non-linear delay.
+	 */
+	pr_info("MBA b/w map not implemented for cpu:%d, model:%d",
+	        boot_cpu_data.x86, boot_cpu_data.x86_model);
+
+	return false;
+}
+
+static bool rdt_get_mem_config(struct rdt_resource *r)
+{
+	union cpuid_0x10_3_eax eax;
+	union cpuid_0x10_x_edx edx;
+	u32 ebx, ecx;
+
+	cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
+	r->num_closid = edx.split.cos_max + 1;
+	r->max_delay = eax.split.max_delay + 1;
+	r->default_ctrl = MAX_MBA_BW;
+	if (ecx & MBA_IS_LINEAR) {
+		r->delay_linear = true;
+		r->min_bw = MAX_MBA_BW - r->max_delay;
+		r->bw_gran = MAX_MBA_BW - r->max_delay;
+	} else {
+		if (!rdt_get_mb_table(r))
+			return false;
+	}
+	r->data_width = 3;
+
+	r->capable = true;
+	r->enabled = true;
+
+	return true;
+}
+
 static void rdt_get_cache_config(int idx, struct rdt_resource *r)
 {
 	union cpuid_0x10_1_eax eax;
@@ -212,7 +280,8 @@ static inline bool get_rdt_resources(void)
 		ret = true;
 	}
 
-	if (boot_cpu_has(X86_FEATURE_MBA))
+	if (boot_cpu_has(X86_FEATURE_MBA) &&
+	     rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
 		ret = true;
 
 	rdt_init_padding();
@@ -233,6 +302,47 @@ static int get_cache_id(int cpu, int level)
 	return -1;
 }
 
+/*
+ * Map the memory b/w percentage value to delay values
+ * that can be written to QOS_MSRs.
+ * There are currently no SKUs which support non linear delay values.
+ */
+static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
+{
+	if (r->delay_linear)
+		return MAX_MBA_BW - bw;
+
+	WARN_ONCE(1, "Non Linear delay-bw map not supported but queried\n");
+	return r->default_ctrl;
+}
+
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+	int i;
+
+	for (i = m->low; i < m->high; i++) {
+		int idx = cbm_idx(r, i);
+
+		/*
+		 * Write the delay value for mba.
+		 */
+		wrmsrl(r->msr_base + idx, delay_bw_map(d->ctrl_val[i], r));
+	}
+}
+
+static void
+cqm_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+	int i;
+
+	for (i = m->low; i < m->high; i++) {
+		int idx = cbm_idx(r, i);
+
+		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
+	}
+}
+
 void rdt_ctrl_update(void *arg)
 {
 	struct msr_param *m = (struct msr_param *)arg;
@@ -291,6 +401,33 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
 	return NULL;
 }
 
+static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
+{
+	struct msr_param m;
+	u32 *dc;
+	int i;
+
+	dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+	if (!dc)
+		return -ENOMEM;
+
+	d->ctrl_val = dc;
+
+	/*
+	 * Initialize the Control MSRs to having no control.
+	 * For Cache Allocation: Set all bits in cbm
+	 * For Memory Allocation: Set b/w requested to 100
+	 */
+	for (i = 0; i < r->num_closid; i++, dc++)
+		*dc = r->default_ctrl;
+
+	m.low = 0;
+	m.high = r->num_closid;
+	r->msr_update(d, &m, r);
+
+	return 0;
+}
+
 /*
  * domain_add_cpu - Add a cpu to a resource's domain list.
  *
@@ -306,7 +443,7 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
  */
 static void domain_add_cpu(int cpu, struct rdt_resource *r)
 {
-	int i, id = get_cache_id(cpu, r->cache_level);
+	int id = get_cache_id(cpu, r->cache_level);
 	struct list_head *add_pos = NULL;
 	struct rdt_domain *d;
 
@@ -327,19 +464,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 
 	d->id = id;
 
-	d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
-	if (!d->ctrl_val) {
+	if (domain_setup_ctrlval(r, d)) {
 		kfree(d);
 		return;
 	}
 
-	for (i = 0; i < r->num_closid; i++) {
-		int idx = cbm_idx(r, i);
-
-		d->ctrl_val[i] = r->default_ctrl;
-		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
-	}
-
 	cpumask_set_cpu(cpu, &d->cpu_mask);
 	list_add_tail(&d->list, add_pos);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-03-01 15:24   ` Thomas Gleixner
@ 2017-03-10 21:51     ` Shivappa Vikas
  0 siblings, 0 replies; 9+ messages in thread
From: Shivappa Vikas @ 2017-03-10 21:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Vikas Shivappa, vikas.shivappa, linux-kernel, x86, hpa, mingo,
	peterz, ravi.v.shankar, tony.luck, fenghua.yu, andi.kleen



On Wed, 1 Mar 2017, Thomas Gleixner wrote:

> On Fri, 17 Feb 2017, Vikas Shivappa wrote:
>> --- a/arch/x86/include/asm/intel_rdt.h
>> +++ b/arch/x86/include/asm/intel_rdt.h
>> @@ -11,6 +11,9 @@
>>  #define IA32_L3_QOS_CFG		0xc81
>>  #define IA32_L3_CBM_BASE	0xc90
>>  #define IA32_L2_CBM_BASE	0xd10
>> +#define IA32_MBA_THRTL_BASE	0xd50
>> +#define MAX_MBA_THRTL		100u
>> +#define MBA_IS_LINEAR		0x4
>
> I have a hard time to figure out how the latter two constants are related
> to this list of registers. MBA_IS_LINEAR is used to check the CPUID bit and
> MAX_MBA_THRTL is obviously a pure software constant because with a
> non-linear scale the maximum value is not 100.
>
> Just slapping defines to random places is equally bad as using hard coded
> constants.
>
>> +/*
>> + * rdt_get_mb_table() - get a mapping of b/w percentage values
>> + * exposed to user interface and the h/w understandable delay values.
>> + *
>> + * The non-linear delay values have the granularity of power of two
>> + * and also the h/w does not guarantee a curve for configured delay
>> + * values vs. actual b/w throttled.
>> + * Hence we need a mapping that is pre caliberated for user to express
>> + * the b/w in terms of any sensible number.
>
> ... calibrated so the user can express the bandwidth as a percentage value.
>
>> +static inline int rdt_get_mb_table(struct rdt_resource *r)
>> +{
>> +	/*
>> +	 * There are no Intel SKUs as of now to support non-linear delay.
>> +	 */
>> +	r->mb_map = NULL;
>
> What's the point of setting this to NULL?
>
> Also it would be helpful to emit log info here so people don't have to
> start digging around.
>
> 	pr_info("Bandwidth map not implemented for ....", ... model);
>
>> +
>> +	return -ENODEV;
>
> Returning -ENODEV to a function which just returns a boolean value is
> pointless.
>
>>  static void rdt_get_cache_config(int idx, struct rdt_resource *r)
>>  {
>>  	union cpuid_0x10_1_eax eax;
>> @@ -184,9 +237,8 @@ static inline bool get_rdt_resources(void)
>>  		ret = true;
>>  	}
>>
>> -	if (boot_cpu_has(X86_FEATURE_MBA)) {
>> -		ret = true;
>> -	}
>> +	if (boot_cpu_has(X86_FEATURE_MBA))
>> +		ret = rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]);
>
> Groan. When rdt_get_mem_config() returns false (because the map is not
> implemented), then the whole function returns false and CAT is disabled.
>
>> +static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
>> +{
>> +	int i;
>> +
>> +	d->ctrl_val = kmalloc_array(r->num_closid,
>> +				     sizeof(*d->ctrl_val), GFP_KERNEL);
>> +	if (!d->ctrl_val)
>> +		return -ENOMEM;
>> +
>> +	/*
>> +	 * Initialize the Control MSRs to having no control.
>> +	 * For Cache Allocation: Set all bits in cbm
>> +	 * For Memory Allocation: Set b/w requested to 100
>> +	 */
>> +	for (i = 0; i < r->num_closid; i++) {
>> +		int idx = cbm_idx(r, i);
>> +
>> +		d->ctrl_val[i] = r->default_ctrl;
>> +		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
>> +	}
>
> So if you use a local pointer for that, this whole mess becomes readable.
>
> static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
> {
> 	u32 *p;
> 	int i;
>
> 	p = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
> 	if (!p)
> 		return -ENOMEM;
>
> 	d->ctrl_val = p;
>
> 	/* Initialize the Control MSRs to the default value */
> 	for (i = 0; i < r->num_closid; i++, p++) {
> 		int idx = cbm_idx(r, i);
>
> 		*p = r->default_ctrl;
> 		wrmsrl(r->msr_base + idx, *p);
> 	}
>> +
>> +	return 0;
>> +}
>
>>  static void domain_add_cpu(int cpu, struct rdt_resource *r)
>>  {
>> -	int i, id = get_cache_id(cpu, r->cache_level);
>> +	int id = get_cache_id(cpu, r->cache_level), ret;
>
> Bah. If you have the same type in one line, then please move the
> uninitialized variables to the front.
>
> 	int ret, id = get_cache_id(cpu, r->cache_level);
>
> But a s/i/ret/ would have been to simple and kept the code readable.
>
>> @@ -298,19 +374,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>>
>>  	d->id = id;
>>
>> -	d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
>> -	if (!d->ctrl_val) {
>> +	ret = domain_setup_ctrlval(r, d);
>> +	if (ret) {
>>  		kfree(d);
>>  		return;
>>  	}
>
> What's the point of this 'ret' variable if the function is void?
>
> 	if (domain_setup_ctrlval(r, d)) {
> 		kfree(d);
> 		return;
> 	}
>
> would have been to easy to read, right?

Will fix all the issues pointed. Thanks for pointing out.

>
> Thanks,
>
> 	tglx
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-02-17 19:58 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
@ 2017-03-01 15:24   ` Thomas Gleixner
  2017-03-10 21:51     ` Shivappa Vikas
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2017-03-01 15:24 UTC (permalink / raw)
  To: Vikas Shivappa
  Cc: vikas.shivappa, linux-kernel, x86, hpa, mingo, peterz,
	ravi.v.shankar, tony.luck, fenghua.yu, andi.kleen

On Fri, 17 Feb 2017, Vikas Shivappa wrote:
> --- a/arch/x86/include/asm/intel_rdt.h
> +++ b/arch/x86/include/asm/intel_rdt.h
> @@ -11,6 +11,9 @@
>  #define IA32_L3_QOS_CFG		0xc81
>  #define IA32_L3_CBM_BASE	0xc90
>  #define IA32_L2_CBM_BASE	0xd10
> +#define IA32_MBA_THRTL_BASE	0xd50
> +#define MAX_MBA_THRTL		100u
> +#define MBA_IS_LINEAR		0x4

I have a hard time to figure out how the latter two constants are related
to this list of registers. MBA_IS_LINEAR is used to check the CPUID bit and
MAX_MBA_THRTL is obviously a pure software constant because with a
non-linear scale the maximum value is not 100.

Just slapping defines to random places is equally bad as using hard coded
constants.

> +/*
> + * rdt_get_mb_table() - get a mapping of b/w percentage values
> + * exposed to user interface and the h/w understandable delay values.
> + *
> + * The non-linear delay values have the granularity of power of two
> + * and also the h/w does not guarantee a curve for configured delay
> + * values vs. actual b/w throttled.
> + * Hence we need a mapping that is pre caliberated for user to express
> + * the b/w in terms of any sensible number.

... calibrated so the user can express the bandwidth as a percentage value.

> +static inline int rdt_get_mb_table(struct rdt_resource *r)
> +{
> +	/*
> +	 * There are no Intel SKUs as of now to support non-linear delay.
> +	 */
> +	r->mb_map = NULL;

What's the point of setting this to NULL?

Also it would be helpful to emit log info here so people don't have to
start digging around.

	pr_info("Bandwidth map not implemented for ....", ... model);

> +
> +	return -ENODEV;

Returning -ENODEV to a function which just returns a boolean value is
pointless.

>  static void rdt_get_cache_config(int idx, struct rdt_resource *r)
>  {
>  	union cpuid_0x10_1_eax eax;
> @@ -184,9 +237,8 @@ static inline bool get_rdt_resources(void)
>  		ret = true;
>  	}
>  
> -	if (boot_cpu_has(X86_FEATURE_MBA)) {
> -		ret = true;
> -	}
> +	if (boot_cpu_has(X86_FEATURE_MBA))
> +		ret = rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]);

Groan. When rdt_get_mem_config() returns false (because the map is not
implemented), then the whole function returns false and CAT is disabled.

> +static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
> +{
> +	int i;
> +
> +	d->ctrl_val = kmalloc_array(r->num_closid,
> +				     sizeof(*d->ctrl_val), GFP_KERNEL);
> +	if (!d->ctrl_val)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Initialize the Control MSRs to having no control.
> +	 * For Cache Allocation: Set all bits in cbm
> +	 * For Memory Allocation: Set b/w requested to 100
> +	 */
> +	for (i = 0; i < r->num_closid; i++) {
> +		int idx = cbm_idx(r, i);
> +
> +		d->ctrl_val[i] = r->default_ctrl;
> +		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
> +	}

So if you use a local pointer for that, this whole mess becomes readable.

static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
	u32 *p;
	int i;

	p = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
	if (!p)
		return -ENOMEM;

	d->ctrl_val = p;

	/* Initialize the Control MSRs to the default value */
	for (i = 0; i < r->num_closid; i++, p++) {
		int idx = cbm_idx(r, i);

		*p = r->default_ctrl;
		wrmsrl(r->msr_base + idx, *p);
	}
> +
> +	return 0;
> +}

>  static void domain_add_cpu(int cpu, struct rdt_resource *r)
>  {
> -	int i, id = get_cache_id(cpu, r->cache_level);
> +	int id = get_cache_id(cpu, r->cache_level), ret;

Bah. If you have the same type in one line, then please move the
uninitialized variables to the front.

	int ret, id = get_cache_id(cpu, r->cache_level);

But a s/i/ret/ would have been to simple and kept the code readable.

> @@ -298,19 +374,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>  
>  	d->id = id;
>  
> -	d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
> -	if (!d->ctrl_val) {
> +	ret = domain_setup_ctrlval(r, d);
> +	if (ret) {
>  		kfree(d);
>  		return;
>  	}

What's the point of this 'ret' variable if the function is void?

	if (domain_setup_ctrlval(r, d)) {
		kfree(d);
		return;
	}

would have been to easy to read, right?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
  2017-02-17 19:58 [PATCH 0/8 V2] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
@ 2017-02-17 19:58 ` Vikas Shivappa
  2017-03-01 15:24   ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Vikas Shivappa @ 2017-02-17 19:58 UTC (permalink / raw)
  To: vikas.shivappa
  Cc: linux-kernel, x86, hpa, tglx, mingo, peterz, ravi.v.shankar,
	tony.luck, fenghua.yu, andi.kleen, vikas.shivappa

The MBA feature details like minimum bandwidth supported, b/w
granularity etc are obtained via executing CPUID with EAX=10H
,ECX=3.

Setup and initialize the MBA specific extensions to data structures like
global list of RDT resources, RDT resource structure and RDT domain
structure.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
 arch/x86/include/asm/intel_rdt.h | 17 +++++++
 arch/x86/kernel/cpu/intel_rdt.c  | 95 ++++++++++++++++++++++++++++++++++------
 2 files changed, 99 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index d2eee45..af65b2a 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -11,6 +11,9 @@
 #define IA32_L3_QOS_CFG		0xc81
 #define IA32_L3_CBM_BASE	0xc90
 #define IA32_L2_CBM_BASE	0xd10
+#define IA32_MBA_THRTL_BASE	0xd50
+#define MAX_MBA_THRTL		100u
+#define MBA_IS_LINEAR		0x4
 
 #define L3_QOS_CDP_ENABLE	0x01ULL
 
@@ -74,6 +77,14 @@ struct rftype {
  * @default_ctrl:		Specifies default cache cbm or mem b/w percent.
  * @min_cbm_bits:		Minimum number of consecutive bits to be set
  *				in a cache bit mask
+ * @max_delay:			Max throttle delay. Delay is the hardware
+ *				understandable value for memory b/w.
+ * @min_bw:			Minimum memory bandwidth in percentage
+ *				user can request
+ * @bw_gran:			Bandwidth granularity
+ * @delay_linear:		True if Mem b/w delay is in linear scale
+ * @mb_map:			Mapping of mem b/w delay to
+ *				b/w throttle percentage
  * @domains:			All domains for this resource
  * @num_domains:		Number of domains active
  * @msr_base:			Base MSR address for CBMs
@@ -92,6 +103,11 @@ struct rdt_resource {
 	int			cbm_len;
 	int			min_cbm_bits;
 	u32			default_ctrl;
+	u32			max_delay;
+	u32			min_bw;
+	u32			bw_gran;
+	u32			delay_linear;
+	u32			*mb_map;
 	struct list_head	domains;
 	int			num_domains;
 	int			msr_base;
@@ -141,6 +157,7 @@ enum {
 	RDT_RESOURCE_L3DATA,
 	RDT_RESOURCE_L3CODE,
 	RDT_RESOURCE_L2,
+	RDT_RESOURCE_MBA,
 
 	/* Must be the last */
 	RDT_NUM_RESOURCES,
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index b76a518..130ce98 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -76,6 +76,14 @@ struct rdt_resource rdt_resources_all[] = {
 		.cbm_idx_multi	= 1,
 		.cbm_idx_offset	= 0
 	},
+	{
+		.name		= "MB",
+		.domains	= domain_init(RDT_RESOURCE_MBA),
+		.msr_base	= IA32_MBA_THRTL_BASE,
+		.cache_level	= 3,
+		.cbm_idx_multi	= 1,
+		.cbm_idx_offset = 0
+	},
 };
 
 static int cbm_idx(struct rdt_resource *r, int closid)
@@ -130,6 +138,51 @@ static inline bool cache_alloc_hsw_probe(void)
 	return false;
 }
 
+/*
+ * rdt_get_mb_table() - get a mapping of b/w percentage values
+ * exposed to user interface and the h/w understandable delay values.
+ *
+ * The non-linear delay values have the granularity of power of two
+ * and also the h/w does not guarantee a curve for configured delay
+ * values vs. actual b/w throttled.
+ * Hence we need a mapping that is pre caliberated for user to express
+ * the b/w in terms of any sensible number.
+ */
+static inline int rdt_get_mb_table(struct rdt_resource *r)
+{
+	/*
+	 * There are no Intel SKUs as of now to support non-linear delay.
+	 */
+	r->mb_map = NULL;
+
+	return -ENODEV;
+}
+
+static bool rdt_get_mem_config(struct rdt_resource *r)
+{
+	union cpuid_0x10_3_eax eax;
+	union cpuid_0x10_x_edx edx;
+	u32 ebx, ecx;
+
+	cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
+	r->num_closid = edx.split.cos_max + 1;
+	r->max_delay = eax.split.max_delay + 1;
+	r->default_ctrl = MAX_MBA_THRTL;
+	if (ecx & MBA_IS_LINEAR) {
+		r->delay_linear = true;
+		r->min_bw = MAX_MBA_THRTL - r->max_delay;
+		r->bw_gran = MAX_MBA_THRTL - r->max_delay;
+	} else {
+		if (rdt_get_mb_table(r))
+			return false;
+	}
+
+	r->capable = true;
+	r->enabled = true;
+
+	return true;
+}
+
 static void rdt_get_cache_config(int idx, struct rdt_resource *r)
 {
 	union cpuid_0x10_1_eax eax;
@@ -184,9 +237,8 @@ static inline bool get_rdt_resources(void)
 		ret = true;
 	}
 
-	if (boot_cpu_has(X86_FEATURE_MBA)) {
-		ret = true;
-	}
+	if (boot_cpu_has(X86_FEATURE_MBA))
+		ret = rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]);
 
 	return ret;
 }
@@ -262,6 +314,30 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
 	return NULL;
 }
 
+static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
+{
+	int i;
+
+	d->ctrl_val = kmalloc_array(r->num_closid,
+				     sizeof(*d->ctrl_val), GFP_KERNEL);
+	if (!d->ctrl_val)
+		return -ENOMEM;
+
+	/*
+	 * Initialize the Control MSRs to having no control.
+	 * For Cache Allocation: Set all bits in cbm
+	 * For Memory Allocation: Set b/w requested to 100
+	 */
+	for (i = 0; i < r->num_closid; i++) {
+		int idx = cbm_idx(r, i);
+
+		d->ctrl_val[i] = r->default_ctrl;
+		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
+	}
+
+	return 0;
+}
+
 /*
  * domain_add_cpu - Add a cpu to a resource's domain list.
  *
@@ -277,7 +353,7 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
  */
 static void domain_add_cpu(int cpu, struct rdt_resource *r)
 {
-	int i, id = get_cache_id(cpu, r->cache_level);
+	int id = get_cache_id(cpu, r->cache_level), ret;
 	struct list_head *add_pos = NULL;
 	struct rdt_domain *d;
 
@@ -298,19 +374,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 
 	d->id = id;
 
-	d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
-	if (!d->ctrl_val) {
+	ret = domain_setup_ctrlval(r, d);
+	if (ret) {
 		kfree(d);
 		return;
 	}
 
-	for (i = 0; i < r->num_closid; i++) {
-		int idx = cbm_idx(r, i);
-
-		d->ctrl_val[i] = r->default_ctrl;
-		wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
-	}
-
 	cpumask_set_cpu(cpu, &d->cpu_mask);
 	list_add_tail(&d->list, add_pos);
 	r->num_domains++;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-04-08  0:34 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-04 19:02 [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Tracy Smith
  -- strict thread matches above, loose matches on Subject: below --
2017-04-08  0:33 [PATCH 0/8 V4] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
2017-04-08  0:33 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
2017-04-03 21:57 [PATCH 0/8 V3] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
2017-04-03 21:57 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
     [not found]   ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
2017-04-04 18:50     ` Shivappa Vikas
2017-04-05 15:40   ` Thomas Gleixner
2017-04-05 18:09     ` Shivappa Vikas
2017-02-17 19:58 [PATCH 0/8 V2] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
2017-02-17 19:58 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
2017-03-01 15:24   ` Thomas Gleixner
2017-03-10 21:51     ` Shivappa Vikas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).