From: Palmer Dabbelt <palmerdabbelt@google.com>
To: consult-mg@gstardust.com
Cc: linux-riscv@lists.infradead.org, aou@eecs.berkeley.edu,
Paul Walmsley <paul.walmsley@sifive.com>
Subject: Re: [PATCH v2 2/2] riscv: Set SHMLBA according to cache geometry
Date: Thu, 05 Dec 2019 16:07:04 -0800 (PST)
Message-ID: <mhng-56d3620d-5177-4c78-a353-a9e8c5b1a567@palmerdabbelt-glaptop>
In-Reply-To: <7faf1c8c-86cb-f47a-511b-bcc6a916b9c4@gstardust.com>
On Thu, 05 Dec 2019 15:58:25 PST (-0800), consult-mg@gstardust.com wrote:
> Palmer Dabbelt wrote on 2019-12-05 18:03:
>> On Tue, 26 Nov 2019 14:44:46 PST (-0800), consult-mg@gstardust.com wrote:
>>> Set SHMLBA to the maximum cache "span" (line size * number of sets) of
>>> all CPU L1 instruction and data caches (L2 and up are rarely VIPT).
>>> This avoids VIPT cache aliasing with minimal alignment constraints.
>>>
>>> If the device tree does not provide cache parameters, use a conservative
>>> default of 16 KB: only large enough to avoid aliasing in most VIPT
>>> caches.
>>>
>>> Signed-off-by: Marc Gauthier <consult-mg@gstardust.com>
>>> ---
>>> arch/riscv/include/asm/Kbuild | 1 -
>>> arch/riscv/include/asm/shmparam.h | 12 +++++++
>>> arch/riscv/kernel/cacheinfo.c | 52 +++++++++++++++++++++++++++++++
>>> 3 files changed, 64 insertions(+), 1 deletion(-)
>>> create mode 100644 arch/riscv/include/asm/shmparam.h
>>>
>>> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
>>> index 16970f246860..3905765807af 100644
>>> --- a/arch/riscv/include/asm/Kbuild
>>> +++ b/arch/riscv/include/asm/Kbuild
>>> @@ -27,7 +27,6 @@ generic-y += percpu.h
>>> generic-y += preempt.h
>>> generic-y += sections.h
>>> generic-y += serial.h
>>> -generic-y += shmparam.h
>>> generic-y += topology.h
>>> generic-y += trace_clock.h
>>> generic-y += unaligned.h
>>> diff --git a/arch/riscv/include/asm/shmparam.h b/arch/riscv/include/asm/shmparam.h
>>> new file mode 100644
>>> index 000000000000..9b6a98153648
>>> --- /dev/null
>>> +++ b/arch/riscv/include/asm/shmparam.h
>>> @@ -0,0 +1,12 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef _ASM_RISCV_SHMPARAM_H
>>> +#define _ASM_RISCV_SHMPARAM_H
>>> +
>>> +/*
>>> + * Minimum alignment of shared memory segments as a function of cache geometry.
>>> + */
>>> +#define SHMLBA arch_shmlba()
>>
>> I'd prefer if we inline the memoization, which would avoid the cost of a
>> function call in the general case. You can also avoid that 0 test by
>> initializing the variable to PAGE_SIZE and then filling it out in our early
>> init code -- maybe setup_vm()? That's what SPARC is doing.
>
> Good point.
> Unlike SPARC, this patch re-uses existing code in
> drivers/base/cacheinfo.c to compute cache parameters. To preserve that,
> it'll be more robust to initialize shmlba at a point certain to have
> those parameters -- at the comment far below, "Force recalculating
> SHMLBA if cache parameters are updated." That way it keeps working if
> that point in time changes.
Works for me.
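
A minimal user-space sketch of the inlined variant discussed above, for illustration only. The names riscv_shmlba and riscv_set_shmlba() are hypothetical, and the 4 KiB PAGE_SIZE is an assumed stand-in for the kernel's definition. The variable starts at PAGE_SIZE, so readers need neither a function call nor a zero test; the setter would be called from the point where cache parameters are known.

```c
#include <assert.h>

/* Stand-in for the kernel's PAGE_SIZE; 4 KiB is an assumption. */
#define PAGE_SIZE 4096L

/* Hypothetical variable: starts at PAGE_SIZE, so it is never zero. */
static long riscv_shmlba = PAGE_SIZE;

/* Readers expand to a plain variable load, with no function call. */
#define SHMLBA (riscv_shmlba)

/*
 * Hypothetical setter, meant to run once cache parameters are known
 * (in the patch, that is the end of __populate_cache_leaves()).
 */
static void riscv_set_shmlba(long value)
{
	/* The alignment must stay a positive multiple of the page size. */
	assert(value > 0 && value % PAGE_SIZE == 0);
	riscv_shmlba = value;
}
```

With this shape, code that uses SHMLBA before the cache information is populated simply sees PAGE_SIZE.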
>>> +
>>> +long arch_shmlba(void);
>>> +
>>> +#endif /* _ASM_RISCV_SHMPARAM_H */
>>> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>>> index 4c90c07d8c39..1bc7df8577d6 100644
>>> --- a/arch/riscv/kernel/cacheinfo.c
>>> +++ b/arch/riscv/kernel/cacheinfo.c
>>> @@ -1,12 +1,61 @@
>>> // SPDX-License-Identifier: GPL-2.0-only
>>> /*
>>> * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2019 Aril Inc
>>> */
>>>
>>> #include <linux/cacheinfo.h>
>>> #include <linux/cpu.h>
>>> #include <linux/of.h>
>>> #include <linux/of_device.h>
>>> +#include <linux/mm.h>
>>> +
>>> +static long shmlba;
>>> +
>>> +
>>> +/*
>>> + * Assuming cache size = line size * #sets * N for N-way associative caches,
>>> + * return the max cache "span" == (line size * #sets) == (cache size / N)
>>> + * across all L1 caches, or 0 if cache parameters are not available.
>>> + * VIPT caches with span > min page size are susceptible to aliasing.
>>> + */
>>> +static long get_max_cache_span(void)
>>> +{
>>> +	struct cpu_cacheinfo *this_cpu_ci;
>>> +	struct cacheinfo *this_leaf;
>>> +	long span, max_span = 0;
>>> +	int cpu, leaf;
>>> +
>>> +	for_each_possible_cpu(cpu) {
>>> +		this_cpu_ci = get_cpu_cacheinfo(cpu);
>>> +		this_leaf = this_cpu_ci->info_list;
>>> +		for (leaf = 0; leaf < this_cpu_ci->num_leaves; leaf++) {
>>> +			if (this_leaf->level > 1)
>>> +				break;
>>> +			span = this_leaf->coherency_line_size *
>>> +				this_leaf->number_of_sets;
>>> +			if (span > max_span)
>>> +				max_span = span;
>>> +			this_leaf++;
>>> +		}
>>> +	}
>>> +	return max_span;
>>> +}
>>> +
>>> +/*
>>> + * Align shared mappings to the maximum cache "span" to avoid aliasing
>>> + * in VIPT caches, for performance.
>>> + * The returned SHMLBA value is always a power-of-two multiple of PAGE_SIZE.
>>> + */
>>> +long arch_shmlba(void)
>>> +{
>>> +	if (shmlba == 0) {
>>> +		long max_span = get_max_cache_span();
>>> +
>>> +		shmlba = max_span ? PAGE_ALIGN(max_span) : 4 * PAGE_SIZE;
>>
>> I'd prefer to avoid sneaking in a default 4*PAGE_SIZE here, just default to
>> PAGE_SIZE and rely on systems with this behavior specifying the correct
>> tuning value in the device tree.
>
> Fair enough.
>
>
>> This avoids changing the behavior for existing systems, which is a slight
>> regression as the alignment uses more memory. It's not a big deal, but on
>> systems that don't require alignment for high performance there's no reason
>> to just throw away memory -- particularly as we have some RISC-V systems
>> with pretty limited memory
>
> Greater alignment takes up more virtual memory, not more physical memory.
>
>
>> (I'm thinking of the Kendryte boards, though I don't know how SHMLBA
>> interacts with NOMMU so it might not matter).
>
> There's no virtual memory in NOMMU, so indeed it doesn't matter.
Of course :). I'd still like to leave the default alone, if only to prevent
people from relying on an arbitrary default decision.
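
To make the requested default concrete, here is a small user-space sketch, not kernel code. The helper name shmlba_for_span() is hypothetical, and PAGE_SIZE and PAGE_ALIGN() are local stand-ins that assume 4 KiB pages. A span of 0 means the device tree gave no cache parameters, and the result is then plain PAGE_SIZE instead of 4 * PAGE_SIZE.

```c
/* Stand-ins for the kernel macros; 4 KiB pages are an assumption. */
#define PAGE_SIZE 4096L
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/*
 * Hypothetical helper: map the maximum L1 cache span to an SHMLBA value.
 * A span of 0 (no cache parameters) falls back to PAGE_SIZE, so existing
 * systems keep their current alignment.
 */
static long shmlba_for_span(long max_span)
{
	return max_span > 0 ? PAGE_ALIGN(max_span) : PAGE_SIZE;
}
```

Under these assumptions a 16 KiB span still yields 16 KiB, while an unknown or sub-page span yields a single page.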
>
> M
>
>
>>> +	}
>>> +	return shmlba;
>>> +}
>>>
>>> static void ci_leaf_init(struct cacheinfo *this_leaf,
>>> struct device_node *node,
>>> @@ -93,6 +142,9 @@ static int __populate_cache_leaves(unsigned int cpu)
>>> }
>>> of_node_put(np);
>>>
>>> + /* Force recalculating SHMLBA if cache parameters are updated. */
>>> + shmlba = 0;
>>> +
>>> return 0;
>>> }
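
The span arithmetic from the patch description can be checked in user space. This is a rough sketch under stated assumptions: struct cache_leaf is a simplified, hypothetical stand-in for the kernel's struct cacheinfo, and unlike the patch (which stops at the first leaf above level 1) this loop skips non-L1 leaves and keeps going.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct cacheinfo. */
struct cache_leaf {
	unsigned int level;     /* 1 for L1, 2 for L2, ... */
	unsigned int line_size; /* bytes per cache line */
	unsigned int sets;      /* number of sets */
};

/*
 * Span = line size * number of sets = cache size / associativity.
 * Returns the largest span among L1 leaves, or 0 if there are none.
 */
static long max_l1_span(const struct cache_leaf *leaves, size_t n)
{
	long max_span = 0;

	for (size_t i = 0; i < n; i++) {
		long span;

		if (leaves[i].level != 1)
			continue;
		span = (long)leaves[i].line_size * leaves[i].sets;
		if (span > max_span)
			max_span = span;
	}
	return max_span;
}
```

For example, a 32 KiB 8-way L1 with 64-byte lines has 64 sets, giving a 4 KiB span that fits in one 4 KiB page. The same 32 KiB as a 2-way cache has 256 sets, giving a 16 KiB span that covers four such pages, which is the case where VIPT aliasing can occur.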
Thread overview: 8+ messages
2019-11-26 22:44 [PATCH v2 0/2] riscv: Align shared mappings to avoid cache aliasing Marc Gauthier
2019-11-26 22:44 ` [PATCH v2 1/2] riscv: Align shared mappings to SHMLBA Marc Gauthier
2019-12-05 23:03 ` Palmer Dabbelt
2019-12-06 0:24 ` Marc Gauthier
2019-11-26 22:44 ` [PATCH v2 2/2] riscv: Set SHMLBA according to cache geometry Marc Gauthier
2019-12-05 23:03 ` Palmer Dabbelt
2019-12-05 23:58 ` Marc Gauthier
2019-12-06 0:07 ` Palmer Dabbelt [this message]