linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Marc Gauthier <consult-mg@gstardust.com>
To: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: linux-riscv@lists.infradead.org, aou@eecs.berkeley.edu,
	Paul Walmsley <paul.walmsley@sifive.com>
Subject: Re: [PATCH v2 2/2] riscv: Set SHMLBA according to cache geometry
Date: Thu, 5 Dec 2019 18:58:25 -0500	[thread overview]
Message-ID: <7faf1c8c-86cb-f47a-511b-bcc6a916b9c4@gstardust.com> (raw)
In-Reply-To: <mhng-344205d1-f6ce-45d3-a420-77c00ab0f4ad@palmerdabbelt-glaptop>

Palmer Dabbelt wrote on 2019-12-05 18:03:
> On Tue, 26 Nov 2019 14:44:46 PST (-0800), consult-mg@gstardust.com wrote:
>> Set SHMLBA to the maximum cache "span" (line size * number of sets) of
>> all CPU L1 instruction and data caches (L2 and up are rarely VIPT).
>> This avoids VIPT cache aliasing with minimal alignment constraints.
>>
>> If the device tree does not provide cache parameters, use a conservative
>> default of 16 KB:  only large enough to avoid aliasing in most VIPT 
>> caches.
>>
>> Signed-off-by: Marc Gauthier <consult-mg@gstardust.com>
>> ---
>>  arch/riscv/include/asm/Kbuild     |  1 -
>>  arch/riscv/include/asm/shmparam.h | 12 +++++++
>>  arch/riscv/kernel/cacheinfo.c     | 52 +++++++++++++++++++++++++++++++
>>  3 files changed, 64 insertions(+), 1 deletion(-)
>>  create mode 100644 arch/riscv/include/asm/shmparam.h
>>
>> diff --git a/arch/riscv/include/asm/Kbuild 
>> b/arch/riscv/include/asm/Kbuild
>> index 16970f246860..3905765807af 100644
>> --- a/arch/riscv/include/asm/Kbuild
>> +++ b/arch/riscv/include/asm/Kbuild
>> @@ -27,7 +27,6 @@ generic-y += percpu.h
>>  generic-y += preempt.h
>>  generic-y += sections.h
>>  generic-y += serial.h
>> -generic-y += shmparam.h
>>  generic-y += topology.h
>>  generic-y += trace_clock.h
>>  generic-y += unaligned.h
>> diff --git a/arch/riscv/include/asm/shmparam.h 
>> b/arch/riscv/include/asm/shmparam.h
>> new file mode 100644
>> index 000000000000..9b6a98153648
>> --- /dev/null
>> +++ b/arch/riscv/include/asm/shmparam.h
>> @@ -0,0 +1,12 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_RISCV_SHMPARAM_H
>> +#define _ASM_RISCV_SHMPARAM_H
>> +
>> +/*
>> + * Minimum alignment of shared memory segments as a function of 
>> cache geometry.
>> + */
>> +#define    SHMLBA    arch_shmlba()
>
> I'd prefer if we inline the memoization, which would avoid the cost of a
> function call in the general case.  You can also avoid that 0 test by
> initializing the variable to PAGE_SIZE and the filling it out in our 
> early init
> code -- maybe setup_vm()?  That's what SPARC is doing.

Good point.
Unlike SPARC, this patch re-uses existing code in 
drivers/base/cacheinfo.c to compute cache parameters.  To preserve that, 
it'll be more robust to initialize shmlba at a point certain to have 
those parameters -- at the comment far below, "Force recalculating 
SHMLBA if cache parameters are updated."  That way it keeps working if 
that point in time changes.


>> +
>> +long arch_shmlba(void);
>> +
>> +#endif /* _ASM_RISCV_SHMPARAM_H */
>> diff --git a/arch/riscv/kernel/cacheinfo.c 
>> b/arch/riscv/kernel/cacheinfo.c
>> index 4c90c07d8c39..1bc7df8577d6 100644
>> --- a/arch/riscv/kernel/cacheinfo.c
>> +++ b/arch/riscv/kernel/cacheinfo.c
>> @@ -1,12 +1,61 @@
>>  // SPDX-License-Identifier: GPL-2.0-only
>>  /*
>>   * Copyright (C) 2017 SiFive
>> + * Copyright (C) 2019 Aril Inc
>>   */
>>
>>  #include <linux/cacheinfo.h>
>>  #include <linux/cpu.h>
>>  #include <linux/of.h>
>>  #include <linux/of_device.h>
>> +#include <linux/mm.h>
>> +
>> +static long shmlba;
>> +
>> +
>> +/*
>> + * Assuming  cache size = line size * #sets * N  for N-way 
>> associative caches,
>> + * return the max cache "span" == (line size * #sets) == (cache size 
>> / N)
>> + * across all L1 caches, or 0 if cache parameters are not available.
>> + * VIPT caches with span > min page size are susceptible to aliasing.
>> + */
>> +static long get_max_cache_span(void)
>> +{
>> +    struct cpu_cacheinfo *this_cpu_ci;
>> +    struct cacheinfo *this_leaf;
>> +    long span, max_span = 0;
>> +    int cpu, leaf;
>> +
>> +    for_each_possible_cpu(cpu) {
>> +        this_cpu_ci = get_cpu_cacheinfo(cpu);
>> +        this_leaf = this_cpu_ci->info_list;
>> +        for (leaf = 0; leaf < this_cpu_ci->num_leaves; leaf++) {
>> +            if (this_leaf->level > 1)
>> +                break;
>> +            span = this_leaf->coherency_line_size *
>> +                   this_leaf->number_of_sets;
>> +            if (span > max_span)
>> +                max_span = span;
>> +            this_leaf++;
>> +        }
>> +    }
>> +    return max_span;
>> +}
>> +
>> +/*
>> + * Align shared mappings to the maximum cache "span" to avoid aliasing
>> + * in VIPT caches, for performance.
>> + * The returned SHMLBA value is always a power-of-two multiple of 
>> PAGE_SIZE.
>> + */
>> +long arch_shmlba(void)
>> +{
>> +    if (shmlba == 0) {
>> +        long max_span = get_max_cache_span();
>> +
>> +        shmlba = max_span ? PAGE_ALIGN(max_span) : 4 * PAGE_SIZE;
>
> I'd prefer to avoid sneaking in a default 4*PAGE_SIZE here, just 
> default to
> PAGE_SIZE and rely on systems with this behavior specifying the 
> correct tuning
> value in the device tree.

Fair enough.


> This avoids changing the behavior for existing
> systems, which is a slight regression as the alignment uses more 
> memory.  It's
> not a big deal, but on systems that don't require alignment for high
> performance there's no reason to just throw away memory -- 
> particularly as we
> have some RISC-V systems with pretty limited memory

Greater alignment takes up more virtual memory, not more physical memory.


> (I'm thinking of the
> Kendryte boards, though I don't know how SHMLBA interacts with NOMMU 
> so it
> might not matter).

There's no virtual memory in NOMMU, so indeed it doesn't matter.

M


>> +    }
>> +    return shmlba;
>> +}
>>
>>  static void ci_leaf_init(struct cacheinfo *this_leaf,
>>               struct device_node *node,
>> @@ -93,6 +142,9 @@ static int __populate_cache_leaves(unsigned int cpu)
>>      }
>>      of_node_put(np);
>>
>> +    /* Force recalculating SHMLBA if cache parameters are updated. */
>> +    shmlba = 0;
>> +
>>      return 0;
>>  }



  reply	other threads:[~2019-12-05 23:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-26 22:44 [PATCH v2 0/2] riscv: Align shared mappings to avoid cache aliasing Marc Gauthier
2019-11-26 22:44 ` [PATCH v2 1/2] riscv: Align shared mappings to SHMLBA Marc Gauthier
2019-12-05 23:03   ` Palmer Dabbelt
2019-12-06  0:24     ` Marc Gauthier
2019-11-26 22:44 ` [PATCH v2 2/2] riscv: Set SHMLBA according to cache geometry Marc Gauthier
2019-12-05 23:03   ` Palmer Dabbelt
2019-12-05 23:58     ` Marc Gauthier [this message]
2019-12-06  0:07       ` Palmer Dabbelt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7faf1c8c-86cb-f47a-511b-bcc6a916b9c4@gstardust.com \
    --to=consult-mg@gstardust.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=linux-riscv@lists.infradead.org \
    --cc=palmerdabbelt@google.com \
    --cc=paul.walmsley@sifive.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).