* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
@ 2017-04-04 19:02 Tracy Smith
0 siblings, 0 replies; 9+ messages in thread
From: Tracy Smith @ 2017-04-04 19:02 UTC (permalink / raw)
To: Shivappa Vikas
Cc: x86, lkml, ravi.v.shankar, tony.luck, fenghua.yu, h.peter.anvin,
Vikas Shivappa
Apologies, unrelated to MBA. Resent later with a changed subject line.
On Tue, Apr 4, 2017 at 1:50 PM, Shivappa Vikas <vikas.shivappa@intel.com> wrote:
>
>
>
> On Mon, 3 Apr 2017, Tracy Smith wrote:
>
>> Hi All,
>>
>> No JTAG available and need to understand why Linux 4.8.3 doesn't boot on a
>> x86_64 corei7-64. Hangs at the typical "Starting kernel" location after
>> the last message of the U-boot. The bootcmd is given below.
>
>
> Do you see the issue when you apply MBA patches . It seems like you use 4.8
> kernel but the mba patches are dependent on :
> https://marc.info/?l=linux-kernel&m=149125583824700
>
> which is applied on 4.11-rc4.
>
>
>>
>> See a fault FFS and a message indicating the image didn't load after
>> failover.
>>
>> 1) How can I verify if the kernel image was loaded from uboot
>> 2) What is this fault?
>> 3) Has the bootargs or bootcmd changed between 4.1 and 4.8.3?
>> 4) If the boot cmd/arg has changed, what should the boot cmd/arg be for
>> 4.8.3 to boot on x86_64 corei7-64?
>>
>> Initial RAM disk at linear address 0x20000000, size 11638378 bytes
>> Kernel command line: "BOOT_IMAGE=/imgx/bzImage LABEL=BOOT root=/dev/ram0
>> imgmnt=/media/sda2 imgdir=imgx img=image.rootfs rootdelay=2 slub_debug=F
>> console=ttyS1,115200 bootcount=1 bootcount_addr=0xa4000
>> acpi_enforce_resources=lax pram_size=0x800000 pram_addr=10000000
>> pram_loc=ddr crashkernel=128M memmap=0x800000$0x10000000 "
>> EFI table at 683392c0, mmap 68339300, mmap size 4c0, version 1, descr.
>> size
>> 0x30
>>
>> Starting kernel ...
>>
>> Timer summary in microseconds:
>> Mark Elapsed Stage
>> 0 0 reset
>> 1 1 board_init_r
>> 105 104 board_init_f
>> 10,180,048 10,179,943 id=64
>> 10,221,985 41,937 id=65
>> 10,356,645 134,660 main_loop
>> 12,366,521 2,009,876 usb_start
>> 18,747,284 6,380,763 start_kernel
>> Accumulated time:
>> 10,162,689 ahci
>>
>> On a 4.1.26-yocto-standard #1 SMP it boots with no issues. Basically same
>> .config used in both cases except for anything deprecated between 4.1 and
>> 4.8.3.
>>
>> root@:~# cat /proc/cmdline
>> BOOT_IMAGE=/imgy/bzImage LABEL=BOOT root=/dev/ram0 imgmnt=/media/sda2
>> imgdir=imgy img=image.rootfs rootdelay=2 slub_debug=F console=ttyS1,115200
>> fault=FFS bootcount=3 bootcount_addr=0xa4000 acpi_enforce_resources=lax
>> pram_size=0x800000 pram_addr=10000000 pram_loc=ddr crashkernel=128M
>> memmap=0x800000$0x10000000
>> root@CLX3001:~# cat /proc/consoles
>> ttyS1 -W- (EC p a) 4:65
>> netcon0 -W- (E )
>>
>> thx,
>> Tray
>>
>
--
Confidentiality notice: This e-mail message, including any
attachments, may contain legally privileged and/or confidential
information. If you are not the intended recipient(s), please
immediately notify the sender and delete this e-mail message.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-04-08 0:33 [PATCH 0/8 V4] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
@ 2017-04-08 0:33 ` Vikas Shivappa
0 siblings, 0 replies; 9+ messages in thread
From: Vikas Shivappa @ 2017-04-08 0:33 UTC (permalink / raw)
To: vikas.shivappa, x86, linux-kernel
Cc: hpa, tglx, mingo, ravi.v.shankar, tony.luck, fenghua.yu, vikas.shivappa
The MBA feature details like minimum bandwidth supported, b/w
granularity etc are obtained via executing CPUID with EAX=10H
,ECX=3.
Setup and initialize the MBA specific extensions to data structures like
global list of RDT resources, RDT resource structure and RDT domain
structure.
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
arch/x86/include/asm/intel_rdt.h | 92 +++++++++++++++---------
arch/x86/kernel/cpu/intel_rdt.c | 151 ++++++++++++++++++++++++++++++++++++---
2 files changed, 199 insertions(+), 44 deletions(-)
diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 4c94f18..097134b 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -12,6 +12,7 @@
#define IA32_L3_QOS_CFG 0xc81
#define IA32_L3_CBM_BASE 0xc90
#define IA32_L2_CBM_BASE 0xd10
+#define IA32_MBA_THRTL_BASE 0xd50
#define L3_QOS_CDP_ENABLE 0x01ULL
@@ -69,39 +70,6 @@ struct rftype {
};
/**
- * struct rdt_resource - attributes of an RDT resource
- * @enabled: Is this feature enabled on this machine
- * @capable: Is this feature available on this machine
- * @name: Name to use in "schemata" file
- * @num_closid: Number of CLOSIDs available
- * @default_ctrl: Specifies default cache cbm or mem b/w percent.
- * @data_width: Character width of data when displaying
- * @min_cbm_bits: Minimum number of consecutive bits to be set
- * in a cache bit mask
- * @domains: All domains for this resource
- * @msr_base: Base MSR address for CBMs
- * @cache_level: Which cache level defines scope of this domain
- * @cbm_idx_multi: Multiplier of CBM index
- * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
- * closid * cbm_idx_multi + cbm_idx_offset
- */
-struct rdt_resource {
- bool enabled;
- bool capable;
- char *name;
- int num_closid;
- int cbm_len;
- int min_cbm_bits;
- u32 default_ctrl;
- int data_width;
- struct list_head domains;
- int msr_base;
- int cache_level;
- int cbm_idx_multi;
- int cbm_idx_offset;
-};
-
-/**
* struct rdt_domain - group of cpus sharing an RDT resource
* @list: all instances of this resource
* @id: unique id for this instance
@@ -131,6 +99,63 @@ struct msr_param {
int high;
};
+/**
+ * struct rdt_resource - attributes of an RDT resource
+ * @enabled: Is this feature enabled on this machine
+ * @capable: Is this feature available on this machine
+ * @name: Name to use in "schemata" file
+ * @num_closid: Number of CLOSIDs available
+ * @default_ctrl: Specifies default cache cbm or mem b/w percent.
+ * @data_width: Character width of data when displaying
+ * @msr_update: Function pointer to update QOS MSRs
+ * @domains: All domains for this resource
+ * @msr_base: Base MSR address for CBMs
+ * @cache_level: Which cache level defines scope of this domain
+ * @cbm_idx_multi: Multiplier of CBM index
+ * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
+ * closid * cbm_idx_multi + cbm_idx_offset
+ * @cbm_len Number of cbm bits
+ * @min_cbm_bits: Minimum number of consecutive bits to be set
+ * in a cache bit mask
+ * @max_delay: Max throttle delay. Delay is the hardware
+ * understandable value for memory bandwidth.
+ * @min_bw: Minimum memory bandwidth percentage user
+ * can request
+ * @bw_gran: Granularity at which the memory bandwidth
+ * is allocated
+ * @delay_linear: True if memory b/w delay is in linear scale
+ * @mb_map: Mapping of memory b/w percentage to
+ * memory b/w delay values
+ */
+struct rdt_resource {
+ bool enabled;
+ bool capable;
+ char *name;
+ int num_closid;
+ u32 default_ctrl;
+ int data_width;
+ void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
+ struct rdt_resource *r);
+ struct list_head domains;
+ int msr_base;
+ int cache_level;
+ int cbm_idx_multi;
+ int cbm_idx_offset;
+ union {
+ struct { /*cache ctrls*/
+ int cbm_len;
+ int min_cbm_bits;
+ };
+ struct { /*mem ctrls*/
+ u32 max_delay;
+ u32 min_bw;
+ u32 bw_gran;
+ u32 delay_linear;
+ u32 *mb_map;
+ };
+ };
+};
+
extern struct mutex rdtgroup_mutex;
extern struct rdt_resource rdt_resources_all[];
@@ -144,6 +169,7 @@ enum {
RDT_RESOURCE_L3DATA,
RDT_RESOURCE_L3CODE,
RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
/* Must be the last */
RDT_NUM_RESOURCES,
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index c4cf2e8..b9f1cfc 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -32,6 +32,9 @@
#include <asm/intel-family.h>
#include <asm/intel_rdt.h>
+#define MAX_MBA_BW 100u
+#define MBA_IS_LINEAR 0x4
+
/* Mutex to protect rdtgroup access. */
DEFINE_MUTEX(rdtgroup_mutex);
@@ -45,11 +48,17 @@
*/
int max_name_width, max_data_width;
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+static void
+cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+
struct rdt_resource rdt_resources_all[] = {
{
.name = "L3",
.domains = domain_init(RDT_RESOURCE_L3),
.msr_base = IA32_L3_CBM_BASE,
+ .msr_update = cat_wrmsr,
.min_cbm_bits = 1,
.cache_level = 3,
.cbm_idx_multi = 1,
@@ -59,6 +68,7 @@ struct rdt_resource rdt_resources_all[] = {
.name = "L3DATA",
.domains = domain_init(RDT_RESOURCE_L3DATA),
.msr_base = IA32_L3_CBM_BASE,
+ .msr_update = cat_wrmsr,
.min_cbm_bits = 1,
.cache_level = 3,
.cbm_idx_multi = 2,
@@ -68,6 +78,7 @@ struct rdt_resource rdt_resources_all[] = {
.name = "L3CODE",
.domains = domain_init(RDT_RESOURCE_L3CODE),
.msr_base = IA32_L3_CBM_BASE,
+ .msr_update = cat_wrmsr,
.min_cbm_bits = 1,
.cache_level = 3,
.cbm_idx_multi = 2,
@@ -77,11 +88,21 @@ struct rdt_resource rdt_resources_all[] = {
.name = "L2",
.domains = domain_init(RDT_RESOURCE_L2),
.msr_base = IA32_L2_CBM_BASE,
+ .msr_update = cat_wrmsr,
.min_cbm_bits = 1,
.cache_level = 2,
.cbm_idx_multi = 1,
.cbm_idx_offset = 0
},
+ {
+ .name = "MB",
+ .domains = domain_init(RDT_RESOURCE_MBA),
+ .msr_base = IA32_MBA_THRTL_BASE,
+ .msr_update = mba_wrmsr,
+ .cache_level = 3,
+ .cbm_idx_multi = 1,
+ .cbm_idx_offset = 0
+ },
};
static int cbm_idx(struct rdt_resource *r, int closid)
@@ -136,6 +157,53 @@ static inline bool cache_alloc_hsw_probe(void)
return false;
}
+/*
+ * rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
+ * exposed to user interface and the h/w understandable delay values.
+ *
+ * The non-linear delay values have the granularity of power of two
+ * and also the h/w does not guarantee a curve for configured delay
+ * values vs. actual b/w enforced.
+ * Hence we need a mapping that is pre calibrated so the user can
+ * express the memory b/w as a percentage value.
+ */
+static inline bool rdt_get_mb_table(struct rdt_resource *r)
+{
+ /*
+ * There are no Intel SKUs as of now to support non-linear delay.
+ */
+ pr_info("MBA b/w map not implemented for cpu:%d, model:%d",
+ boot_cpu_data.x86, boot_cpu_data.x86_model);
+
+ return false;
+}
+
+static bool rdt_get_mem_config(struct rdt_resource *r)
+{
+ union cpuid_0x10_3_eax eax;
+ union cpuid_0x10_x_edx edx;
+ u32 ebx, ecx;
+
+ cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
+ r->num_closid = edx.split.cos_max + 1;
+ r->max_delay = eax.split.max_delay + 1;
+ r->default_ctrl = MAX_MBA_BW;
+ if (ecx & MBA_IS_LINEAR) {
+ r->delay_linear = true;
+ r->min_bw = MAX_MBA_BW - r->max_delay;
+ r->bw_gran = MAX_MBA_BW - r->max_delay;
+ } else {
+ if (!rdt_get_mb_table(r))
+ return false;
+ }
+ r->data_width = 3;
+
+ r->capable = true;
+ r->enabled = true;
+
+ return true;
+}
+
static void rdt_get_cache_config(int idx, struct rdt_resource *r)
{
union cpuid_0x10_1_eax eax;
@@ -212,7 +280,8 @@ static inline bool get_rdt_resources(void)
ret = true;
}
- if (boot_cpu_has(X86_FEATURE_MBA))
+ if (boot_cpu_has(X86_FEATURE_MBA) &&
+ rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
ret = true;
rdt_init_padding();
@@ -233,6 +302,47 @@ static int get_cache_id(int cpu, int level)
return -1;
}
+/*
+ * Map the memory b/w percentage value to delay values
+ * that can be written to QOS_MSRs.
+ * There are currently no SKUs which support non linear delay values.
+ */
+static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
+{
+ if (r->delay_linear)
+ return MAX_MBA_BW - bw;
+
+ WARN_ONCE(1, "Non Linear delay-bw map not supported but queried\n");
+ return r->default_ctrl;
+}
+
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+ int i;
+
+ for (i = m->low; i < m->high; i++) {
+ int idx = cbm_idx(r, i);
+
+ /*
+ * Write the delay value for mba.
+ */
+ wrmsrl(r->msr_base + idx, delay_bw_map(d->ctrl_val[i], r));
+ }
+}
+
+static void
+cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+ int i;
+
+ for (i = m->low; i < m->high; i++) {
+ int idx = cbm_idx(r, i);
+
+ wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
+ }
+}
+
void rdt_ctrl_update(void *arg)
{
struct msr_param *m = (struct msr_param *)arg;
@@ -291,6 +401,33 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
return NULL;
}
+static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
+{
+ struct msr_param m;
+ u32 *dc;
+ int i;
+
+ dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+ if (!dc)
+ return -ENOMEM;
+
+ d->ctrl_val = dc;
+
+ /*
+ * Initialize the Control MSRs to having no control.
+ * For Cache Allocation: Set all bits in cbm
+ * For Memory Allocation: Set b/w requested to 100
+ */
+ for (i = 0; i < r->num_closid; i++, dc++)
+ *dc = r->default_ctrl;
+
+ m.low = 0;
+ m.high = r->num_closid;
+ r->msr_update(d, &m, r);
+
+ return 0;
+}
+
/*
* domain_add_cpu - Add a cpu to a resource's domain list.
*
@@ -306,7 +443,7 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
*/
static void domain_add_cpu(int cpu, struct rdt_resource *r)
{
- int i, id = get_cache_id(cpu, r->cache_level);
+ int id = get_cache_id(cpu, r->cache_level);
struct list_head *add_pos = NULL;
struct rdt_domain *d;
@@ -327,19 +464,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
d->id = id;
- d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
- if (!d->ctrl_val) {
+ if (domain_setup_ctrlval(r, d)) {
kfree(d);
return;
}
- for (i = 0; i < r->num_closid; i++) {
- int idx = cbm_idx(r, i);
-
- d->ctrl_val[i] = r->default_ctrl;
- wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
- }
-
cpumask_set_cpu(cpu, &d->cpu_mask);
list_add_tail(&d->list, add_pos);
}
--
1.9.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-04-05 15:40 ` Thomas Gleixner
@ 2017-04-05 18:09 ` Shivappa Vikas
0 siblings, 0 replies; 9+ messages in thread
From: Shivappa Vikas @ 2017-04-05 18:09 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Vikas Shivappa, vikas.shivappa, x86, linux-kernel, hpa, mingo,
peterz, ravi.v.shankar, tony.luck, fenghua.yu, h.peter.anvin
On Wed, 5 Apr 2017, Thomas Gleixner wrote:
> On Mon, 3 Apr 2017, Vikas Shivappa wrote:
>>
>> /**
>> + * struct rdt_domain - group of cpus sharing an RDT resource
>> + * @list: all instances of this resource
>> + * @id: unique id for this instance
>> + * @cpu_mask: which cpus share this resource
>> + * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
>> + * @new_cbm: new cbm value to be loaded
>> + * @have_new_cbm: did user provide new_cbm for this domain
>
> The version which you removed below has the kernel-doc comments correct ....
Will fix
>
>> +/**
>> * struct rdt_resource - attributes of an RDT resource
>> * @enabled: Is this feature enabled on this machine
>> * @capable: Is this feature available on this machine
>> @@ -78,6 +109,16 @@ struct rftype {
>> * @data_width: Character width of data when displaying
>> * @min_cbm_bits: Minimum number of consecutive bits to be set
>> * in a cache bit mask
>> + * @msr_update: Function pointer to update QOS MSRs
>> + * @max_delay: Max throttle delay. Delay is the hardware
>> + * understandable value for memory bandwidth.
>> + * @min_bw: Minimum memory bandwidth percentage user
>> + * can request
>> + * @bw_gran: Granularity at which the memory bandwidth
>> + * is allocated
>> + * @delay_linear: True if memory b/w delay is in linear scale
>> + * @mb_map: Mapping of memory b/w percentage to
>> + * memory b/w delay values
>> * @domains: All domains for this resource
>> * @msr_base: Base MSR address for CBMs
>> * @cache_level: Which cache level defines scope of this domain
>> @@ -94,6 +135,14 @@ struct rdt_resource {
>> int min_cbm_bits;
>> u32 default_ctrl;
>> int data_width;
>> + void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
>> + struct rdt_resource *r);
>> + u32 max_delay;
>> + u32 min_bw;
>> + u32 bw_gran;
>> + u32 delay_linear;
>> + u32 *mb_map;
>
> I don't know what other weird controls will be added over time, but we are
> probably better off to have
>
> struct cache_ctrl {
> int cbm_len;
> int min_cbm_bits;
> };
>
> struct mba_ctrl {
> u32 max_delay;
> u32 min_bw;
> u32 bw_gran;
> u32 delay_linear;
> u32 *mb_map;
> };
>
> and in then in struct rdt_resource:
>
> <common fields>
> union {
> struct cache_ctrl foo;
> struct mba_ctrl bla;
> } ctrl;
>
>
> That avoids that rdt_resource becomes a hodgepodge of unrelated or even
> contradicting fields.
>
> Hmm?
Ok, makes sense. Will fix. Thought of a union when i had added a couple fields
and given up but its grown a lot now.
Thanks,
Vikas
>
> Thanks,
>
> tglx
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-04-03 21:57 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
[not found] ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
@ 2017-04-05 15:40 ` Thomas Gleixner
2017-04-05 18:09 ` Shivappa Vikas
1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2017-04-05 15:40 UTC (permalink / raw)
To: Vikas Shivappa
Cc: vikas.shivappa, x86, linux-kernel, hpa, mingo, peterz,
ravi.v.shankar, tony.luck, fenghua.yu, h.peter.anvin
On Mon, 3 Apr 2017, Vikas Shivappa wrote:
>
> /**
> + * struct rdt_domain - group of cpus sharing an RDT resource
> + * @list: all instances of this resource
> + * @id: unique id for this instance
> + * @cpu_mask: which cpus share this resource
> + * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
> + * @new_cbm: new cbm value to be loaded
> + * @have_new_cbm: did user provide new_cbm for this domain
The version which you removed below has the kernel-doc comments correct ....
> +/**
> * struct rdt_resource - attributes of an RDT resource
> * @enabled: Is this feature enabled on this machine
> * @capable: Is this feature available on this machine
> @@ -78,6 +109,16 @@ struct rftype {
> * @data_width: Character width of data when displaying
> * @min_cbm_bits: Minimum number of consecutive bits to be set
> * in a cache bit mask
> + * @msr_update: Function pointer to update QOS MSRs
> + * @max_delay: Max throttle delay. Delay is the hardware
> + * understandable value for memory bandwidth.
> + * @min_bw: Minimum memory bandwidth percentage user
> + * can request
> + * @bw_gran: Granularity at which the memory bandwidth
> + * is allocated
> + * @delay_linear: True if memory b/w delay is in linear scale
> + * @mb_map: Mapping of memory b/w percentage to
> + * memory b/w delay values
> * @domains: All domains for this resource
> * @msr_base: Base MSR address for CBMs
> * @cache_level: Which cache level defines scope of this domain
> @@ -94,6 +135,14 @@ struct rdt_resource {
> int min_cbm_bits;
> u32 default_ctrl;
> int data_width;
> + void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
> + struct rdt_resource *r);
> + u32 max_delay;
> + u32 min_bw;
> + u32 bw_gran;
> + u32 delay_linear;
> + u32 *mb_map;
I don't know what other weird controls will be added over time, but we are
probably better off to have
struct cache_ctrl {
int cbm_len;
int min_cbm_bits;
};
struct mba_ctrl {
u32 max_delay;
u32 min_bw;
u32 bw_gran;
u32 delay_linear;
u32 *mb_map;
};
and in then in struct rdt_resource:
<common fields>
union {
struct cache_ctrl foo;
struct mba_ctrl bla;
} ctrl;
That avoids that rdt_resource becomes a hodgepodge of unrelated or even
contradicting fields.
Hmm?
Thanks,
tglx
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
[not found] ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
@ 2017-04-04 18:50 ` Shivappa Vikas
0 siblings, 0 replies; 9+ messages in thread
From: Shivappa Vikas @ 2017-04-04 18:50 UTC (permalink / raw)
To: Tracy Smith
Cc: x86, vikas.shivappa, lkml, ravi.v.shankar, tony.luck, fenghua.yu,
h.peter.anvin, Vikas Shivappa
On Mon, 3 Apr 2017, Tracy Smith wrote:
> Hi All,
>
> No JTAG available and need to understand why Linux 4.8.3 doesn't boot on a
> x86_64 corei7-64. Hangs at the typical "Starting kernel" location after
> the last message of the U-boot. The bootcmd is given below.
Do you see the issue when you apply MBA patches . It seems like you use 4.8
kernel but the mba patches are dependent on :
https://marc.info/?l=linux-kernel&m=149125583824700
which is applied on 4.11-rc4.
>
> See a fault FFS and a message indicating the image didn't load after
> failover.
>
> 1) How can I verify if the kernel image was loaded from uboot
> 2) What is this fault?
> 3) Has the bootargs or bootcmd changed between 4.1 and 4.8.3?
> 4) If the boot cmd/arg has changed, what should the boot cmd/arg be for
> 4.8.3 to boot on x86_64 corei7-64?
>
> Initial RAM disk at linear address 0x20000000, size 11638378 bytes
> Kernel command line: "BOOT_IMAGE=/imgx/bzImage LABEL=BOOT root=/dev/ram0
> imgmnt=/media/sda2 imgdir=imgx img=image.rootfs rootdelay=2 slub_debug=F
> console=ttyS1,115200 bootcount=1 bootcount_addr=0xa4000
> acpi_enforce_resources=lax pram_size=0x800000 pram_addr=10000000
> pram_loc=ddr crashkernel=128M memmap=0x800000$0x10000000 "
> EFI table at 683392c0, mmap 68339300, mmap size 4c0, version 1, descr. size
> 0x30
>
> Starting kernel ...
>
> Timer summary in microseconds:
> Mark Elapsed Stage
> 0 0 reset
> 1 1 board_init_r
> 105 104 board_init_f
> 10,180,048 10,179,943 id=64
> 10,221,985 41,937 id=65
> 10,356,645 134,660 main_loop
> 12,366,521 2,009,876 usb_start
> 18,747,284 6,380,763 start_kernel
> Accumulated time:
> 10,162,689 ahci
>
> On a 4.1.26-yocto-standard #1 SMP it boots with no issues. Basically same
> .config used in both cases except for anything deprecated between 4.1 and
> 4.8.3.
>
> root@:~# cat /proc/cmdline
> BOOT_IMAGE=/imgy/bzImage LABEL=BOOT root=/dev/ram0 imgmnt=/media/sda2
> imgdir=imgy img=image.rootfs rootdelay=2 slub_debug=F console=ttyS1,115200
> fault=FFS bootcount=3 bootcount_addr=0xa4000 acpi_enforce_resources=lax
> pram_size=0x800000 pram_addr=10000000 pram_loc=ddr crashkernel=128M
> memmap=0x800000$0x10000000
> root@CLX3001:~# cat /proc/consoles
> ttyS1 -W- (EC p a) 4:65
> netcon0 -W- (E )
>
> thx,
> Tray
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-04-03 21:57 [PATCH 0/8 V3] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
@ 2017-04-03 21:57 ` Vikas Shivappa
[not found] ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
2017-04-05 15:40 ` Thomas Gleixner
0 siblings, 2 replies; 9+ messages in thread
From: Vikas Shivappa @ 2017-04-03 21:57 UTC (permalink / raw)
To: vikas.shivappa, x86, linux-kernel
Cc: hpa, tglx, mingo, peterz, ravi.v.shankar, tony.luck, fenghua.yu,
h.peter.anvin
The MBA feature details like minimum bandwidth supported, b/w
granularity etc are obtained via executing CPUID with EAX=10H
,ECX=3.
Setup and initialize the MBA specific extensions to data structures like
global list of RDT resources, RDT resource structure and RDT domain
structure.
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
arch/x86/include/asm/intel_rdt.h | 80 +++++++++++++--------
arch/x86/kernel/cpu/intel_rdt.c | 151 ++++++++++++++++++++++++++++++++++++---
2 files changed, 190 insertions(+), 41 deletions(-)
diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 4c94f18..285cdeb 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -12,6 +12,7 @@
#define IA32_L3_QOS_CFG 0xc81
#define IA32_L3_CBM_BASE 0xc90
#define IA32_L2_CBM_BASE 0xd10
+#define IA32_MBA_THRTL_BASE 0xd50
#define L3_QOS_CDP_ENABLE 0x01ULL
@@ -69,6 +70,36 @@ struct rftype {
};
/**
+ * struct rdt_domain - group of cpus sharing an RDT resource
+ * @list: all instances of this resource
+ * @id: unique id for this instance
+ * @cpu_mask: which cpus share this resource
+ * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
+ * @new_cbm: new cbm value to be loaded
+ * @have_new_cbm: did user provide new_cbm for this domain
+ */
+struct rdt_domain {
+ struct list_head list;
+ int id;
+ struct cpumask cpu_mask;
+ u32 *ctrl_val;
+ u32 new_ctrl;
+ bool have_new_ctrl;
+};
+
+/**
+ * struct msr_param - set a range of MSRs from a domain
+ * @res: The resource to use
+ * @low: Beginning index from base MSR
+ * @high: End index
+ */
+struct msr_param {
+ struct rdt_resource *res;
+ int low;
+ int high;
+};
+
+/**
* struct rdt_resource - attributes of an RDT resource
* @enabled: Is this feature enabled on this machine
* @capable: Is this feature available on this machine
@@ -78,6 +109,16 @@ struct rftype {
* @data_width: Character width of data when displaying
* @min_cbm_bits: Minimum number of consecutive bits to be set
* in a cache bit mask
+ * @msr_update: Function pointer to update QOS MSRs
+ * @max_delay: Max throttle delay. Delay is the hardware
+ * understandable value for memory bandwidth.
+ * @min_bw: Minimum memory bandwidth percentage user
+ * can request
+ * @bw_gran: Granularity at which the memory bandwidth
+ * is allocated
+ * @delay_linear: True if memory b/w delay is in linear scale
+ * @mb_map: Mapping of memory b/w percentage to
+ * memory b/w delay values
* @domains: All domains for this resource
* @msr_base: Base MSR address for CBMs
* @cache_level: Which cache level defines scope of this domain
@@ -94,6 +135,14 @@ struct rdt_resource {
int min_cbm_bits;
u32 default_ctrl;
int data_width;
+ void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
+ struct rdt_resource *r);
+ u32 max_delay;
+ u32 min_bw;
+ u32 bw_gran;
+ u32 delay_linear;
+ u32 *mb_map;
+
struct list_head domains;
int msr_base;
int cache_level;
@@ -101,36 +150,6 @@ struct rdt_resource {
int cbm_idx_offset;
};
-/**
- * struct rdt_domain - group of cpus sharing an RDT resource
- * @list: all instances of this resource
- * @id: unique id for this instance
- * @cpu_mask: which cpus share this resource
- * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
- * @new_ctrl: new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
- */
-struct rdt_domain {
- struct list_head list;
- int id;
- struct cpumask cpu_mask;
- u32 *ctrl_val;
- u32 new_ctrl;
- bool have_new_ctrl;
-};
-
-/**
- * struct msr_param - set a range of MSRs from a domain
- * @res: The resource to use
- * @low: Beginning index from base MSR
- * @high: End index
- */
-struct msr_param {
- struct rdt_resource *res;
- int low;
- int high;
-};
-
extern struct mutex rdtgroup_mutex;
extern struct rdt_resource rdt_resources_all[];
@@ -144,6 +163,7 @@ enum {
RDT_RESOURCE_L3DATA,
RDT_RESOURCE_L3CODE,
RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
/* Must be the last */
RDT_NUM_RESOURCES,
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index c4cf2e8..be272b9 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -32,6 +32,9 @@
#include <asm/intel-family.h>
#include <asm/intel_rdt.h>
+#define MAX_MBA_BW 100u
+#define MBA_IS_LINEAR 0x4
+
/* Mutex to protect rdtgroup access. */
DEFINE_MUTEX(rdtgroup_mutex);
@@ -45,11 +48,17 @@
*/
int max_name_width, max_data_width;
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+static void
+cqm_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+
struct rdt_resource rdt_resources_all[] = {
{
.name = "L3",
.domains = domain_init(RDT_RESOURCE_L3),
.msr_base = IA32_L3_CBM_BASE,
+ .msr_update = cqm_wrmsr,
.min_cbm_bits = 1,
.cache_level = 3,
.cbm_idx_multi = 1,
@@ -59,6 +68,7 @@ struct rdt_resource rdt_resources_all[] = {
.name = "L3DATA",
.domains = domain_init(RDT_RESOURCE_L3DATA),
.msr_base = IA32_L3_CBM_BASE,
+ .msr_update = cqm_wrmsr,
.min_cbm_bits = 1,
.cache_level = 3,
.cbm_idx_multi = 2,
@@ -68,6 +78,7 @@ struct rdt_resource rdt_resources_all[] = {
.name = "L3CODE",
.domains = domain_init(RDT_RESOURCE_L3CODE),
.msr_base = IA32_L3_CBM_BASE,
+ .msr_update = cqm_wrmsr,
.min_cbm_bits = 1,
.cache_level = 3,
.cbm_idx_multi = 2,
@@ -77,11 +88,21 @@ struct rdt_resource rdt_resources_all[] = {
.name = "L2",
.domains = domain_init(RDT_RESOURCE_L2),
.msr_base = IA32_L2_CBM_BASE,
+ .msr_update = cqm_wrmsr,
.min_cbm_bits = 1,
.cache_level = 2,
.cbm_idx_multi = 1,
.cbm_idx_offset = 0
},
+ {
+ .name = "MB",
+ .domains = domain_init(RDT_RESOURCE_MBA),
+ .msr_base = IA32_MBA_THRTL_BASE,
+ .msr_update = mba_wrmsr,
+ .cache_level = 3,
+ .cbm_idx_multi = 1,
+ .cbm_idx_offset = 0
+ },
};
static int cbm_idx(struct rdt_resource *r, int closid)
@@ -136,6 +157,53 @@ static inline bool cache_alloc_hsw_probe(void)
return false;
}
+/*
+ * rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values
+ * exposed to user interface and the h/w understandable delay values.
+ *
+ * The non-linear delay values have the granularity of power of two
+ * and also the h/w does not guarantee a curve for configured delay
+ * values vs. actual b/w enforced.
+ * Hence we need a mapping that is pre calibrated so the user can
+ * express the memory b/w as a percentage value.
+ */
+static inline bool rdt_get_mb_table(struct rdt_resource *r)
+{
+ /*
+ * There are no Intel SKUs as of now to support non-linear delay.
+ */
+ pr_info("MBA b/w map not implemented for cpu:%d, model:%d",
+ boot_cpu_data.x86, boot_cpu_data.x86_model);
+
+ return false;
+}
+
+static bool rdt_get_mem_config(struct rdt_resource *r)
+{
+ union cpuid_0x10_3_eax eax;
+ union cpuid_0x10_x_edx edx;
+ u32 ebx, ecx;
+
+ cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
+ r->num_closid = edx.split.cos_max + 1;
+ r->max_delay = eax.split.max_delay + 1;
+ r->default_ctrl = MAX_MBA_BW;
+ if (ecx & MBA_IS_LINEAR) {
+ r->delay_linear = true;
+ r->min_bw = MAX_MBA_BW - r->max_delay;
+ r->bw_gran = MAX_MBA_BW - r->max_delay;
+ } else {
+ if (!rdt_get_mb_table(r))
+ return false;
+ }
+ r->data_width = 3;
+
+ r->capable = true;
+ r->enabled = true;
+
+ return true;
+}
+
static void rdt_get_cache_config(int idx, struct rdt_resource *r)
{
union cpuid_0x10_1_eax eax;
@@ -212,7 +280,8 @@ static inline bool get_rdt_resources(void)
ret = true;
}
- if (boot_cpu_has(X86_FEATURE_MBA))
+ if (boot_cpu_has(X86_FEATURE_MBA) &&
+ rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
ret = true;
rdt_init_padding();
@@ -233,6 +302,47 @@ static int get_cache_id(int cpu, int level)
return -1;
}
+/*
+ * Map the memory b/w percentage value to delay values
+ * that can be written to QOS_MSRs.
+ * There are currently no SKUs which support non linear delay values.
+ */
+static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
+{
+ if (r->delay_linear)
+ return MAX_MBA_BW - bw;
+
+ WARN_ONCE(1, "Non Linear delay-bw map not supported but queried\n");
+ return r->default_ctrl;
+}
+
+static void
+mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+ int i;
+
+ for (i = m->low; i < m->high; i++) {
+ int idx = cbm_idx(r, i);
+
+ /*
+ * Write the delay value for mba.
+ */
+ wrmsrl(r->msr_base + idx, delay_bw_map(d->ctrl_val[i], r));
+ }
+}
+
+static void
+cqm_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+ int i;
+
+ for (i = m->low; i < m->high; i++) {
+ int idx = cbm_idx(r, i);
+
+ wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
+ }
+}
+
void rdt_ctrl_update(void *arg)
{
struct msr_param *m = (struct msr_param *)arg;
@@ -291,6 +401,33 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
return NULL;
}
+static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
+{
+ struct msr_param m;
+ u32 *dc;
+ int i;
+
+ dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+ if (!dc)
+ return -ENOMEM;
+
+ d->ctrl_val = dc;
+
+ /*
+ * Initialize the Control MSRs to having no control.
+ * For Cache Allocation: Set all bits in cbm
+ * For Memory Allocation: Set b/w requested to 100
+ */
+ for (i = 0; i < r->num_closid; i++, dc++)
+ *dc = r->default_ctrl;
+
+ m.low = 0;
+ m.high = r->num_closid;
+ r->msr_update(d, &m, r);
+
+ return 0;
+}
+
/*
* domain_add_cpu - Add a cpu to a resource's domain list.
*
@@ -306,7 +443,7 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
*/
static void domain_add_cpu(int cpu, struct rdt_resource *r)
{
- int i, id = get_cache_id(cpu, r->cache_level);
+ int id = get_cache_id(cpu, r->cache_level);
struct list_head *add_pos = NULL;
struct rdt_domain *d;
@@ -327,19 +464,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
d->id = id;
- d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
- if (!d->ctrl_val) {
+ if (domain_setup_ctrlval(r, d)) {
kfree(d);
return;
}
- for (i = 0; i < r->num_closid; i++) {
- int idx = cbm_idx(r, i);
-
- d->ctrl_val[i] = r->default_ctrl;
- wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
- }
-
cpumask_set_cpu(cpu, &d->cpu_mask);
list_add_tail(&d->list, add_pos);
}
--
1.9.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-03-01 15:24 ` Thomas Gleixner
@ 2017-03-10 21:51 ` Shivappa Vikas
0 siblings, 0 replies; 9+ messages in thread
From: Shivappa Vikas @ 2017-03-10 21:51 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Vikas Shivappa, vikas.shivappa, linux-kernel, x86, hpa, mingo,
peterz, ravi.v.shankar, tony.luck, fenghua.yu, andi.kleen
On Wed, 1 Mar 2017, Thomas Gleixner wrote:
> On Fri, 17 Feb 2017, Vikas Shivappa wrote:
>> --- a/arch/x86/include/asm/intel_rdt.h
>> +++ b/arch/x86/include/asm/intel_rdt.h
>> @@ -11,6 +11,9 @@
>> #define IA32_L3_QOS_CFG 0xc81
>> #define IA32_L3_CBM_BASE 0xc90
>> #define IA32_L2_CBM_BASE 0xd10
>> +#define IA32_MBA_THRTL_BASE 0xd50
>> +#define MAX_MBA_THRTL 100u
>> +#define MBA_IS_LINEAR 0x4
>
> I have a hard time to figure out how the latter two constants are related
> to this list of registers. MBA_IS_LINEAR is used to check the CPUID bit and
> MAX_MBA_THRTL is obviously a pure software constant because with a
> non-linear scale the maximum value is not 100.
>
> Just slapping defines to random places is equally bad as using hard coded
> constants.
>
>> +/*
>> + * rdt_get_mb_table() - get a mapping of b/w percentage values
>> + * exposed to user interface and the h/w understandable delay values.
>> + *
>> + * The non-linear delay values have the granularity of power of two
>> + * and also the h/w does not guarantee a curve for configured delay
>> + * values vs. actual b/w throttled.
>> + * Hence we need a mapping that is pre caliberated for user to express
>> + * the b/w in terms of any sensible number.
>
> ... calibrated so the user can express the bandwidth as a percentage value.
>
>> +static inline int rdt_get_mb_table(struct rdt_resource *r)
>> +{
>> + /*
>> + * There are no Intel SKUs as of now to support non-linear delay.
>> + */
>> + r->mb_map = NULL;
>
> What's the point of setting this to NULL?
>
> Also it would be helpful to emit log info here so people don't have to
> start digging around.
>
> pr_info("Bandwidth map not implemented for ....", ... model);
>
>> +
>> + return -ENODEV;
>
> Returning -ENODEV to a function which just returns a boolean value is
> pointless.
>
>> static void rdt_get_cache_config(int idx, struct rdt_resource *r)
>> {
>> union cpuid_0x10_1_eax eax;
>> @@ -184,9 +237,8 @@ static inline bool get_rdt_resources(void)
>> ret = true;
>> }
>>
>> - if (boot_cpu_has(X86_FEATURE_MBA)) {
>> - ret = true;
>> - }
>> + if (boot_cpu_has(X86_FEATURE_MBA))
>> + ret = rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]);
>
> Groan. When rdt_get_mem_config() returns false (because the map is not
> implemented), then the whole function returns false and CAT is disabled.
>
>> +static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
>> +{
>> + int i;
>> +
>> + d->ctrl_val = kmalloc_array(r->num_closid,
>> + sizeof(*d->ctrl_val), GFP_KERNEL);
>> + if (!d->ctrl_val)
>> + return -ENOMEM;
>> +
>> + /*
>> + * Initialize the Control MSRs to having no control.
>> + * For Cache Allocation: Set all bits in cbm
>> + * For Memory Allocation: Set b/w requested to 100
>> + */
>> + for (i = 0; i < r->num_closid; i++) {
>> + int idx = cbm_idx(r, i);
>> +
>> + d->ctrl_val[i] = r->default_ctrl;
>> + wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
>> + }
>
> So if you use a local pointer for that, this whole mess becomes readable.
>
> static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
> {
> u32 *p;
> int i;
>
> p = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
> if (!p)
> return -ENOMEM;
>
> d->ctrl_val = p;
>
> /* Initialize the Control MSRs to the default value */
> for (i = 0; i < r->num_closid; i++, p++) {
> int idx = cbm_idx(r, i);
>
> *p = r->default_ctrl;
> wrmsrl(r->msr_base + idx, *p);
> }
>> +
>> + return 0;
>> +}
>
>> static void domain_add_cpu(int cpu, struct rdt_resource *r)
>> {
>> - int i, id = get_cache_id(cpu, r->cache_level);
>> + int id = get_cache_id(cpu, r->cache_level), ret;
>
> Bah. If you have the same type in one line, then please move the
> uninitialized variables to the front.
>
> int ret, id = get_cache_id(cpu, r->cache_level);
>
> But a s/i/ret/ would have been to simple and kept the code readable.
>
>> @@ -298,19 +374,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>>
>> d->id = id;
>>
>> - d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
>> - if (!d->ctrl_val) {
>> + ret = domain_setup_ctrlval(r, d);
>> + if (ret) {
>> kfree(d);
>> return;
>> }
>
> What's the point of this 'ret' variable if the function is void?
>
> if (domain_setup_ctrlval(r, d)) {
> kfree(d);
> return;
> }
>
> would have been to easy to read, right?
Will fix all the issues pointed. Thanks for pointing out.
>
> Thanks,
>
> tglx
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-02-17 19:58 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
@ 2017-03-01 15:24 ` Thomas Gleixner
2017-03-10 21:51 ` Shivappa Vikas
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2017-03-01 15:24 UTC (permalink / raw)
To: Vikas Shivappa
Cc: vikas.shivappa, linux-kernel, x86, hpa, mingo, peterz,
ravi.v.shankar, tony.luck, fenghua.yu, andi.kleen
On Fri, 17 Feb 2017, Vikas Shivappa wrote:
> --- a/arch/x86/include/asm/intel_rdt.h
> +++ b/arch/x86/include/asm/intel_rdt.h
> @@ -11,6 +11,9 @@
> #define IA32_L3_QOS_CFG 0xc81
> #define IA32_L3_CBM_BASE 0xc90
> #define IA32_L2_CBM_BASE 0xd10
> +#define IA32_MBA_THRTL_BASE 0xd50
> +#define MAX_MBA_THRTL 100u
> +#define MBA_IS_LINEAR 0x4
I have a hard time to figure out how the latter two constants are related
to this list of registers. MBA_IS_LINEAR is used to check the CPUID bit and
MAX_MBA_THRTL is obviously a pure software constant because with a
non-linear scale the maximum value is not 100.
Just slapping defines to random places is equally bad as using hard coded
constants.
> +/*
> + * rdt_get_mb_table() - get a mapping of b/w percentage values
> + * exposed to user interface and the h/w understandable delay values.
> + *
> + * The non-linear delay values have the granularity of power of two
> + * and also the h/w does not guarantee a curve for configured delay
> + * values vs. actual b/w throttled.
> + * Hence we need a mapping that is pre caliberated for user to express
> + * the b/w in terms of any sensible number.
... calibrated so the user can express the bandwidth as a percentage value.
> +static inline int rdt_get_mb_table(struct rdt_resource *r)
> +{
> + /*
> + * There are no Intel SKUs as of now to support non-linear delay.
> + */
> + r->mb_map = NULL;
What's the point of setting this to NULL?
Also it would be helpful to emit log info here so people don't have to
start digging around.
pr_info("Bandwidth map not implemented for ....", ... model);
> +
> + return -ENODEV;
Returning -ENODEV to a function which just returns a boolean value is
pointless.
> static void rdt_get_cache_config(int idx, struct rdt_resource *r)
> {
> union cpuid_0x10_1_eax eax;
> @@ -184,9 +237,8 @@ static inline bool get_rdt_resources(void)
> ret = true;
> }
>
> - if (boot_cpu_has(X86_FEATURE_MBA)) {
> - ret = true;
> - }
> + if (boot_cpu_has(X86_FEATURE_MBA))
> + ret = rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]);
Groan. When rdt_get_mem_config() returns false (because the map is not
implemented), then the whole function returns false and CAT is disabled.
> +static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
> +{
> + int i;
> +
> + d->ctrl_val = kmalloc_array(r->num_closid,
> + sizeof(*d->ctrl_val), GFP_KERNEL);
> + if (!d->ctrl_val)
> + return -ENOMEM;
> +
> + /*
> + * Initialize the Control MSRs to having no control.
> + * For Cache Allocation: Set all bits in cbm
> + * For Memory Allocation: Set b/w requested to 100
> + */
> + for (i = 0; i < r->num_closid; i++) {
> + int idx = cbm_idx(r, i);
> +
> + d->ctrl_val[i] = r->default_ctrl;
> + wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
> + }
So if you use a local pointer for that, this whole mess becomes readable.
static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
u32 *p;
int i;
p = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
if (!p)
return -ENOMEM;
d->ctrl_val = p;
/* Initialize the Control MSRs to the default value */
for (i = 0; i < r->num_closid; i++, p++) {
int idx = cbm_idx(r, i);
*p = r->default_ctrl;
wrmsrl(r->msr_base + idx, *p);
}
> +
> + return 0;
> +}
> static void domain_add_cpu(int cpu, struct rdt_resource *r)
> {
> - int i, id = get_cache_id(cpu, r->cache_level);
> + int id = get_cache_id(cpu, r->cache_level), ret;
Bah. If you have the same type in one line, then please move the
uninitialized variables to the front.
int ret, id = get_cache_id(cpu, r->cache_level);
But a s/i/ret/ would have been to simple and kept the code readable.
> @@ -298,19 +374,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>
> d->id = id;
>
> - d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
> - if (!d->ctrl_val) {
> + ret = domain_setup_ctrlval(r, d);
> + if (ret) {
> kfree(d);
> return;
> }
What's the point of this 'ret' variable if the function is void?
if (domain_setup_ctrlval(r, d)) {
kfree(d);
return;
}
would have been to easy to read, right?
Thanks,
tglx
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA
2017-02-17 19:58 [PATCH 0/8 V2] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
@ 2017-02-17 19:58 ` Vikas Shivappa
2017-03-01 15:24 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Vikas Shivappa @ 2017-02-17 19:58 UTC (permalink / raw)
To: vikas.shivappa
Cc: linux-kernel, x86, hpa, tglx, mingo, peterz, ravi.v.shankar,
tony.luck, fenghua.yu, andi.kleen, vikas.shivappa
The MBA feature details like minimum bandwidth supported, b/w
granularity etc are obtained via executing CPUID with EAX=10H
,ECX=3.
Setup and initialize the MBA specific extensions to data structures like
global list of RDT resources, RDT resource structure and RDT domain
structure.
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
---
arch/x86/include/asm/intel_rdt.h | 17 +++++++
arch/x86/kernel/cpu/intel_rdt.c | 95 ++++++++++++++++++++++++++++++++++------
2 files changed, 99 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index d2eee45..af65b2a 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -11,6 +11,9 @@
#define IA32_L3_QOS_CFG 0xc81
#define IA32_L3_CBM_BASE 0xc90
#define IA32_L2_CBM_BASE 0xd10
+#define IA32_MBA_THRTL_BASE 0xd50
+#define MAX_MBA_THRTL 100u
+#define MBA_IS_LINEAR 0x4
#define L3_QOS_CDP_ENABLE 0x01ULL
@@ -74,6 +77,14 @@ struct rftype {
* @default_ctrl: Specifies default cache cbm or mem b/w percent.
* @min_cbm_bits: Minimum number of consecutive bits to be set
* in a cache bit mask
+ * @max_delay: Max throttle delay. Delay is the hardware
+ * understandable value for memory b/w.
+ * @min_bw: Minimum memory bandwidth in percentage
+ * user can request
+ * @bw_gran: Bandwidth granularity
+ * @delay_linear: True if Mem b/w delay is in linear scale
+ * @mb_map: Mapping of mem b/w delay to
+ * b/w throttle percentage
* @domains: All domains for this resource
* @num_domains: Number of domains active
* @msr_base: Base MSR address for CBMs
@@ -92,6 +103,11 @@ struct rdt_resource {
int cbm_len;
int min_cbm_bits;
u32 default_ctrl;
+ u32 max_delay;
+ u32 min_bw;
+ u32 bw_gran;
+ u32 delay_linear;
+ u32 *mb_map;
struct list_head domains;
int num_domains;
int msr_base;
@@ -141,6 +157,7 @@ enum {
RDT_RESOURCE_L3DATA,
RDT_RESOURCE_L3CODE,
RDT_RESOURCE_L2,
+ RDT_RESOURCE_MBA,
/* Must be the last */
RDT_NUM_RESOURCES,
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index b76a518..130ce98 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -76,6 +76,14 @@ struct rdt_resource rdt_resources_all[] = {
.cbm_idx_multi = 1,
.cbm_idx_offset = 0
},
+ {
+ .name = "MB",
+ .domains = domain_init(RDT_RESOURCE_MBA),
+ .msr_base = IA32_MBA_THRTL_BASE,
+ .cache_level = 3,
+ .cbm_idx_multi = 1,
+ .cbm_idx_offset = 0
+ },
};
static int cbm_idx(struct rdt_resource *r, int closid)
@@ -130,6 +138,51 @@ static inline bool cache_alloc_hsw_probe(void)
return false;
}
+/*
+ * rdt_get_mb_table() - get a mapping of b/w percentage values
+ * exposed to user interface and the h/w understandable delay values.
+ *
+ * The non-linear delay values have the granularity of power of two
+ * and also the h/w does not guarantee a curve for configured delay
+ * values vs. actual b/w throttled.
+ * Hence we need a mapping that is pre caliberated for user to express
+ * the b/w in terms of any sensible number.
+ */
+static inline int rdt_get_mb_table(struct rdt_resource *r)
+{
+ /*
+ * There are no Intel SKUs as of now to support non-linear delay.
+ */
+ r->mb_map = NULL;
+
+ return -ENODEV;
+}
+
+static bool rdt_get_mem_config(struct rdt_resource *r)
+{
+ union cpuid_0x10_3_eax eax;
+ union cpuid_0x10_x_edx edx;
+ u32 ebx, ecx;
+
+ cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
+ r->num_closid = edx.split.cos_max + 1;
+ r->max_delay = eax.split.max_delay + 1;
+ r->default_ctrl = MAX_MBA_THRTL;
+ if (ecx & MBA_IS_LINEAR) {
+ r->delay_linear = true;
+ r->min_bw = MAX_MBA_THRTL - r->max_delay;
+ r->bw_gran = MAX_MBA_THRTL - r->max_delay;
+ } else {
+ if (rdt_get_mb_table(r))
+ return false;
+ }
+
+ r->capable = true;
+ r->enabled = true;
+
+ return true;
+}
+
static void rdt_get_cache_config(int idx, struct rdt_resource *r)
{
union cpuid_0x10_1_eax eax;
@@ -184,9 +237,8 @@ static inline bool get_rdt_resources(void)
ret = true;
}
- if (boot_cpu_has(X86_FEATURE_MBA)) {
- ret = true;
- }
+ if (boot_cpu_has(X86_FEATURE_MBA))
+ ret = rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]);
return ret;
}
@@ -262,6 +314,30 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
return NULL;
}
+static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
+{
+ int i;
+
+ d->ctrl_val = kmalloc_array(r->num_closid,
+ sizeof(*d->ctrl_val), GFP_KERNEL);
+ if (!d->ctrl_val)
+ return -ENOMEM;
+
+ /*
+ * Initialize the Control MSRs to having no control.
+ * For Cache Allocation: Set all bits in cbm
+ * For Memory Allocation: Set b/w requested to 100
+ */
+ for (i = 0; i < r->num_closid; i++) {
+ int idx = cbm_idx(r, i);
+
+ d->ctrl_val[i] = r->default_ctrl;
+ wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
+ }
+
+ return 0;
+}
+
/*
* domain_add_cpu - Add a cpu to a resource's domain list.
*
@@ -277,7 +353,7 @@ static struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
*/
static void domain_add_cpu(int cpu, struct rdt_resource *r)
{
- int i, id = get_cache_id(cpu, r->cache_level);
+ int id = get_cache_id(cpu, r->cache_level), ret;
struct list_head *add_pos = NULL;
struct rdt_domain *d;
@@ -298,19 +374,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
d->id = id;
- d->ctrl_val = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
- if (!d->ctrl_val) {
+ ret = domain_setup_ctrlval(r, d);
+ if (ret) {
kfree(d);
return;
}
- for (i = 0; i < r->num_closid; i++) {
- int idx = cbm_idx(r, i);
-
- d->ctrl_val[i] = r->default_ctrl;
- wrmsrl(r->msr_base + idx, d->ctrl_val[i]);
- }
-
cpumask_set_cpu(cpu, &d->cpu_mask);
list_add_tail(&d->list, add_pos);
r->num_domains++;
--
1.9.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-04-08 0:34 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-04 19:02 [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Tracy Smith
-- strict thread matches above, loose matches on Subject: below --
2017-04-08 0:33 [PATCH 0/8 V4] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
2017-04-08 0:33 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
2017-04-03 21:57 [PATCH 0/8 V3] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
2017-04-03 21:57 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
[not found] ` <CAChUvXM8gWAz6-AJ6jkyKjf5Yz0ze-2XAtvdZvze3Go44TPD8A@mail.gmail.com>
2017-04-04 18:50 ` Shivappa Vikas
2017-04-05 15:40 ` Thomas Gleixner
2017-04-05 18:09 ` Shivappa Vikas
2017-02-17 19:58 [PATCH 0/8 V2] x86/intel_rdt: Intel Memory bandwidth allocation Vikas Shivappa
2017-02-17 19:58 ` [PATCH 4/8] x86/intel_rct/mba: Add MBA structures and initialize MBA Vikas Shivappa
2017-03-01 15:24 ` Thomas Gleixner
2017-03-10 21:51 ` Shivappa Vikas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).