* [PATCH v3 0/6] Sparse HART id support
@ 2022-01-20  9:09 ` Atish Patra
  0 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Albert Ou, Atish Patra, Anup Patel, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

Currently, sparse hartids are not supported in Linux RISC-V for the following
reasons.
1. Both the spinwait and ordered booting methods use __cpu_up_stack/task_pointer,
   which are arrays of size NR_CPUS.
2. During early boot, any hart with a hartid greater than NR_CPUS is not booted
   at all.
3. riscv_cpuid_to_hartid_mask uses struct cpumask for generating the hartid
   bitmap.
4. The SBI v0.2 implementation uses NR_CPUS as the maximum hartid number while
   generating the hartmask.

In order to support sparse hartids, hartid and NR_CPUS need to be disassociated;
coupling them was logically incorrect anyway. NR_CPUS represents the maximum
logical CPU id configured in the kernel, while the hartid is the physical hart
id stored in the mhartid CSR defined by the privileged specification. Thus, a
hartid can have a much greater value than any logical cpuid.

Currently, we have two methods of booting. In ordered booting, the boot hart
brings up each non-boot hart one by one using the SBI HSM extension. The
spinwait booting method relies on harts jumping into the Linux kernel in
arbitrary order; the boot hart is selected by a lottery, and all other
non-boot harts keep spinning on __cpu_up_stack/task_pointer until the boot
hart initializes the data. Both methods rely on __cpu_up_stack/task_pointer
to set up the stack/task pointer. The spinwait method is mostly used to
support older firmware without the SBI HSM extension and M-mode Linux. The
ordered booting method is the preferred method for general Linux because it
can support cpu hotplug and kexec.

The first patch modifies the ordered booting method to use the opaque
parameter already available in the HSM start API to set up the stack/task
pointer. The third patch resolves issue #1 by limiting the usage of
__cpu_up_stack/task_pointer to the spinwait booting method. The fourth and
fifth patches move the entire hart lottery selection and spinwait method to
a separate config that can be disabled if required, which solves issue #2.
The sixth patch solves issues #3 and #4 by removing
riscv_cpuid_to_hartid_mask completely. All the SBI APIs now pass a pointer
to struct cpumask directly, and the SBI implementation takes care of
generating the hart bitmap from the cpumask.

Supporting sparse hartids with the spinwait booting method is not trivial,
and there is no use case for it either: any platform with sparse hartids
will probably require more advanced features such as cpu hotplug and kexec.
Thus, the series supports sparse hartids via the ordered booting method
only. To maintain backward compatibility, the spinwait booting method is
currently enabled in the defconfig so that M-mode Linux continues to work.
Any platform that requires sparse hartids must disable the spinwait method.

This series also fixes the out-of-bounds access error[1] reported by Geert.
The issue can be reproduced with SMP booting with NR_CPUS=4 on platforms with
discontiguous hart numbering (HiFive Unleashed/Unmatched & PolarFire). The
spinwait method should also be disabled for any configuration where the
NR_CPUS value is less than the maximum hartid on the platform.

[1] https://lore.kernel.org/lkml/CAMuHMdUPWOjJfJohxLJefHOrJBtXZ0xfHQt4=hXpUXnasiN+AQ@mail.gmail.com/#t

The series is based on the queue branch of kvm-riscv, as it contains
KVM-related changes as well. I have tested it on the HiFive Unmatched and
QEMU.

Changes from v2->v3:
1. Rebased on linux-next
2. Removed the redundant variable in PATCH 1.
3. Added the reviewed-by/acked-by tags.

Changes from v1->v2:
1. Fixed a few typos in Kconfig.
2. Moved the boot data structure offsets to asm-offsets.c
3. Removed the redundant config check in head.S

Atish Patra (6):
RISC-V: Avoid using per cpu array for ordered booting
RISC-V: Do not print the SBI version during HSM extension boot print
RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
RISC-V: Move the entire hart selection via lottery to SMP
RISC-V: Move spinwait booting method to its own config
RISC-V: Do not use cpumask data structure for hartid bitmap

arch/riscv/Kconfig                   |  14 ++
arch/riscv/include/asm/cpu_ops.h     |   2 -
arch/riscv/include/asm/cpu_ops_sbi.h |  25 ++++
arch/riscv/include/asm/sbi.h         |  19 +--
arch/riscv/include/asm/smp.h         |   2 -
arch/riscv/kernel/Makefile           |   3 +-
arch/riscv/kernel/asm-offsets.c      |   3 +
arch/riscv/kernel/cpu_ops.c          |  26 ++--
arch/riscv/kernel/cpu_ops_sbi.c      |  26 +++-
arch/riscv/kernel/cpu_ops_spinwait.c |  27 +++-
arch/riscv/kernel/head.S             |  35 ++---
arch/riscv/kernel/head.h             |   6 +-
arch/riscv/kernel/sbi.c              | 189 +++++++++++++++------------
arch/riscv/kernel/setup.c            |  10 --
arch/riscv/kernel/smpboot.c          |   2 +-
arch/riscv/kvm/mmu.c                 |   4 +-
arch/riscv/kvm/vcpu_sbi_replace.c    |  11 +-
arch/riscv/kvm/vcpu_sbi_v01.c        |  11 +-
arch/riscv/kvm/vmid.c                |   4 +-
arch/riscv/mm/cacheflush.c           |   5 +-
arch/riscv/mm/tlbflush.c             |   9 +-
21 files changed, 253 insertions(+), 180 deletions(-)
create mode 100644 arch/riscv/include/asm/cpu_ops_sbi.h

--
2.30.2



* [PATCH v3 1/6] RISC-V: Avoid using per cpu array for ordered booting
  2022-01-20  9:09 ` Atish Patra
@ 2022-01-20  9:09   ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

Currently both the ordered booting and spinwait approaches use a per-cpu
array to update the stack & task pointer. This approach does not work in
the following cases:
1. NR_CPUS is configured to be less than the highest hartid.
2. The platform has sparse hartids.

This issue can be fixed for ordered booting, as the boot cpu brings up one
cpu at a time using the SBI HSM extension, which has an opaque parameter
that was unused until now.

Introduce a common secondary boot data structure that can store the stack
and task pointer. Secondary harts will use this data while booting up to
set up the sp & tp.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/cpu_ops_sbi.h | 25 +++++++++++++++++++++++++
 arch/riscv/kernel/asm-offsets.c      |  3 +++
 arch/riscv/kernel/cpu_ops_sbi.c      | 26 ++++++++++++++++++++------
 arch/riscv/kernel/head.S             | 19 ++++++++++---------
 4 files changed, 58 insertions(+), 15 deletions(-)
 create mode 100644 arch/riscv/include/asm/cpu_ops_sbi.h

diff --git a/arch/riscv/include/asm/cpu_ops_sbi.h b/arch/riscv/include/asm/cpu_ops_sbi.h
new file mode 100644
index 000000000000..56e4b76d09ff
--- /dev/null
+++ b/arch/riscv/include/asm/cpu_ops_sbi.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021 by Rivos Inc.
+ */
+#ifndef __ASM_CPU_OPS_SBI_H
+#define __ASM_CPU_OPS_SBI_H
+
+#ifndef __ASSEMBLY__
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/threads.h>
+
+/**
+ * struct sbi_hart_boot_data - Hart specific boot used during booting and
+ *			       cpu hotplug.
+ * @task_ptr: A pointer to the hart specific tp
+ * @stack_ptr: A pointer to the hart specific sp
+ */
+struct sbi_hart_boot_data {
+	void *task_ptr;
+	void *stack_ptr;
+};
+#endif
+
+#endif /* ifndef __ASM_CPU_OPS_SBI_H */
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 253126e4beef..df0519a64eaf 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -12,6 +12,7 @@
 #include <asm/kvm_host.h>
 #include <asm/thread_info.h>
 #include <asm/ptrace.h>
+#include <asm/cpu_ops_sbi.h>
 
 void asm_offsets(void);
 
@@ -468,4 +469,6 @@ void asm_offsets(void)
 	DEFINE(PT_SIZE_ON_STACK, ALIGN(sizeof(struct pt_regs), STACK_ALIGN));
 
 	OFFSET(KERNEL_MAP_VIRT_ADDR, kernel_mapping, virt_addr);
+	OFFSET(SBI_HART_BOOT_TASK_PTR_OFFSET, sbi_hart_boot_data, task_ptr);
+	OFFSET(SBI_HART_BOOT_STACK_PTR_OFFSET, sbi_hart_boot_data, stack_ptr);
 }
diff --git a/arch/riscv/kernel/cpu_ops_sbi.c b/arch/riscv/kernel/cpu_ops_sbi.c
index 685fae72b7f5..dae29cbfe550 100644
--- a/arch/riscv/kernel/cpu_ops_sbi.c
+++ b/arch/riscv/kernel/cpu_ops_sbi.c
@@ -7,13 +7,22 @@
 
 #include <linux/init.h>
 #include <linux/mm.h>
+#include <linux/sched/task_stack.h>
 #include <asm/cpu_ops.h>
+#include <asm/cpu_ops_sbi.h>
 #include <asm/sbi.h>
 #include <asm/smp.h>
 
 extern char secondary_start_sbi[];
 const struct cpu_operations cpu_ops_sbi;
 
+/*
+ * Ordered booting via HSM brings one cpu at a time. However, cpu hotplug can
+ * be invoked from multiple threads in parallel. Define a per cpu data
+ * to handle that.
+ */
+DEFINE_PER_CPU(struct sbi_hart_boot_data, boot_data);
+
 static int sbi_hsm_hart_start(unsigned long hartid, unsigned long saddr,
 			      unsigned long priv)
 {
@@ -55,14 +64,19 @@ static int sbi_hsm_hart_get_status(unsigned long hartid)
 
 static int sbi_cpu_start(unsigned int cpuid, struct task_struct *tidle)
 {
-	int rc;
 	unsigned long boot_addr = __pa_symbol(secondary_start_sbi);
 	int hartid = cpuid_to_hartid_map(cpuid);
-
-	cpu_update_secondary_bootdata(cpuid, tidle);
-	rc = sbi_hsm_hart_start(hartid, boot_addr, 0);
-
-	return rc;
+	unsigned long hsm_data;
+	struct sbi_hart_boot_data *bdata = &per_cpu(boot_data, cpuid);
+
+	/* Make sure tidle is updated */
+	smp_mb();
+	bdata->task_ptr = tidle;
+	bdata->stack_ptr = task_stack_page(tidle) + THREAD_SIZE;
+	/* Make sure boot data is updated */
+	smp_mb();
+	hsm_data = __pa(bdata);
+	return sbi_hsm_hart_start(hartid, boot_addr, hsm_data);
 }
 
 static int sbi_cpu_prepare(unsigned int cpuid)
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 604d60292dd8..bc24a90ede1c 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -11,6 +11,7 @@
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/csr.h>
+#include <asm/cpu_ops_sbi.h>
 #include <asm/hwcap.h>
 #include <asm/image.h>
 #include "efi-header.S"
@@ -167,15 +168,15 @@ secondary_start_sbi:
 	la a3, .Lsecondary_park
 	csrw CSR_TVEC, a3
 
-	slli a3, a0, LGREG
-	la a4, __cpu_up_stack_pointer
-	XIP_FIXUP_OFFSET a4
-	la a5, __cpu_up_task_pointer
-	XIP_FIXUP_OFFSET a5
-	add a4, a3, a4
-	add a5, a3, a5
-	REG_L sp, (a4)
-	REG_L tp, (a5)
+	/* a0 contains the hartid & a1 contains boot data */
+	li a2, SBI_HART_BOOT_TASK_PTR_OFFSET
+	XIP_FIXUP_OFFSET a2
+	add a2, a2, a1
+	REG_L tp, (a2)
+	li a3, SBI_HART_BOOT_STACK_PTR_OFFSET
+	XIP_FIXUP_OFFSET a3
+	add a3, a3, a1
+	REG_L sp, (a3)
 
 .Lsecondary_start_common:
 
-- 
2.30.2



* [PATCH v3 2/6] RISC-V: Do not print the SBI version during HSM extension boot print
  2022-01-20  9:09 ` Atish Patra
@ 2022-01-20  9:09   ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

The HSM extension information log also prints the SBI version v0.2. This is
misleading, as the underlying firmware's SBI version may be different from
v0.2.

Remove the unnecessary printing of the SBI version.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/kernel/cpu_ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/cpu_ops.c b/arch/riscv/kernel/cpu_ops.c
index 1985884fe829..3f5a38b03044 100644
--- a/arch/riscv/kernel/cpu_ops.c
+++ b/arch/riscv/kernel/cpu_ops.c
@@ -38,7 +38,7 @@ void __init cpu_set_ops(int cpuid)
 #if IS_ENABLED(CONFIG_RISCV_SBI)
 	if (sbi_probe_extension(SBI_EXT_HSM) > 0) {
 		if (!cpuid)
-			pr_info("SBI v0.2 HSM extension detected\n");
+			pr_info("SBI HSM extension detected\n");
 		cpu_ops[cpuid] = &cpu_ops_sbi;
 	} else
 #endif
-- 
2.30.2



* [PATCH v3 3/6] RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
  2022-01-20  9:09 ` Atish Patra
@ 2022-01-20  9:09   ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

The __cpu_up_stack/task_pointer arrays are now only used by the spinwait
method. The per-cpu-array-based lookup is also fragile for platforms with
discontiguous/sparse hartids. The spinwait method is only used for M-mode
Linux or older firmware without the SBI HSM extension. For general Linux
systems, the ordered booting method is preferred anyway, as it supports cpu
hotplug and kexec.

Make sure that __cpu_up_stack/task_pointer is only used by the spinwait
method, and take this opportunity to rename it to
__cpu_spinwait_stack/task_pointer to emphasize its purpose as well.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/cpu_ops.h     |  2 --
 arch/riscv/kernel/cpu_ops.c          | 16 ----------------
 arch/riscv/kernel/cpu_ops_spinwait.c | 27 ++++++++++++++++++++++++++-
 arch/riscv/kernel/head.S             |  4 ++--
 arch/riscv/kernel/head.h             |  4 ++--
 5 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/arch/riscv/include/asm/cpu_ops.h b/arch/riscv/include/asm/cpu_ops.h
index a8ec3c5c1bd2..134590f1b843 100644
--- a/arch/riscv/include/asm/cpu_ops.h
+++ b/arch/riscv/include/asm/cpu_ops.h
@@ -40,7 +40,5 @@ struct cpu_operations {
 
 extern const struct cpu_operations *cpu_ops[NR_CPUS];
 void __init cpu_set_ops(int cpu);
-void cpu_update_secondary_bootdata(unsigned int cpuid,
-				   struct task_struct *tidle);
 
 #endif /* ifndef __ASM_CPU_OPS_H */
diff --git a/arch/riscv/kernel/cpu_ops.c b/arch/riscv/kernel/cpu_ops.c
index 3f5a38b03044..c1e30f403c3b 100644
--- a/arch/riscv/kernel/cpu_ops.c
+++ b/arch/riscv/kernel/cpu_ops.c
@@ -8,31 +8,15 @@
 #include <linux/of.h>
 #include <linux/string.h>
 #include <linux/sched.h>
-#include <linux/sched/task_stack.h>
 #include <asm/cpu_ops.h>
 #include <asm/sbi.h>
 #include <asm/smp.h>
 
 const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init;
 
-void *__cpu_up_stack_pointer[NR_CPUS] __section(".data");
-void *__cpu_up_task_pointer[NR_CPUS] __section(".data");
-
 extern const struct cpu_operations cpu_ops_sbi;
 extern const struct cpu_operations cpu_ops_spinwait;
 
-void cpu_update_secondary_bootdata(unsigned int cpuid,
-				   struct task_struct *tidle)
-{
-	int hartid = cpuid_to_hartid_map(cpuid);
-
-	/* Make sure tidle is updated */
-	smp_mb();
-	WRITE_ONCE(__cpu_up_stack_pointer[hartid],
-		   task_stack_page(tidle) + THREAD_SIZE);
-	WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
-}
-
 void __init cpu_set_ops(int cpuid)
 {
 #if IS_ENABLED(CONFIG_RISCV_SBI)
diff --git a/arch/riscv/kernel/cpu_ops_spinwait.c b/arch/riscv/kernel/cpu_ops_spinwait.c
index b2c957bb68c1..346847f6c41c 100644
--- a/arch/riscv/kernel/cpu_ops_spinwait.c
+++ b/arch/riscv/kernel/cpu_ops_spinwait.c
@@ -6,11 +6,36 @@
 #include <linux/errno.h>
 #include <linux/of.h>
 #include <linux/string.h>
+#include <linux/sched/task_stack.h>
 #include <asm/cpu_ops.h>
 #include <asm/sbi.h>
 #include <asm/smp.h>
 
 const struct cpu_operations cpu_ops_spinwait;
+void *__cpu_spinwait_stack_pointer[NR_CPUS] __section(".data");
+void *__cpu_spinwait_task_pointer[NR_CPUS] __section(".data");
+
+static void cpu_update_secondary_bootdata(unsigned int cpuid,
+				   struct task_struct *tidle)
+{
+	int hartid = cpuid_to_hartid_map(cpuid);
+
+	/*
+	 * The hartid must be less than NR_CPUS to avoid out-of-bound access
+	 * errors for __cpu_spinwait_stack/task_pointer. That is not always possible
+	 * for platforms with discontiguous hartid numbering scheme. That's why
+	 * spinwait booting is not the recommended approach for any platforms
+	 * booting Linux in S-mode and can be disabled in the future.
+	 */
+	if (hartid == INVALID_HARTID || hartid >= NR_CPUS)
+		return;
+
+	/* Make sure tidle is updated */
+	smp_mb();
+	WRITE_ONCE(__cpu_spinwait_stack_pointer[hartid],
+		   task_stack_page(tidle) + THREAD_SIZE);
+	WRITE_ONCE(__cpu_spinwait_task_pointer[hartid], tidle);
+}
 
 static int spinwait_cpu_prepare(unsigned int cpuid)
 {
@@ -28,7 +53,7 @@ static int spinwait_cpu_start(unsigned int cpuid, struct task_struct *tidle)
 	 * selects the first cpu to boot the kernel and causes the remainder
 	 * of the cpus to spin in a loop waiting for their stack pointer to be
 	 * setup by that main cpu.  Writing to bootdata
-	 * (i.e __cpu_up_stack_pointer) signals to the spinning cpus that they
+	 * (i.e __cpu_spinwait_stack_pointer) signals to the spinning cpus that they
 	 * can continue the boot process.
 	 */
 	cpu_update_secondary_bootdata(cpuid, tidle);
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index bc24a90ede1c..1f11de45df9c 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -346,9 +346,9 @@ clear_bss_done:
 	csrw CSR_TVEC, a3
 
 	slli a3, a0, LGREG
-	la a1, __cpu_up_stack_pointer
+	la a1, __cpu_spinwait_stack_pointer
 	XIP_FIXUP_OFFSET a1
-	la a2, __cpu_up_task_pointer
+	la a2, __cpu_spinwait_task_pointer
 	XIP_FIXUP_OFFSET a2
 	add a1, a3, a1
 	add a2, a3, a2
diff --git a/arch/riscv/kernel/head.h b/arch/riscv/kernel/head.h
index aabbc3ac3e48..5393cca77790 100644
--- a/arch/riscv/kernel/head.h
+++ b/arch/riscv/kernel/head.h
@@ -16,7 +16,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa);
 asmlinkage void __init __copy_data(void);
 #endif
 
-extern void *__cpu_up_stack_pointer[];
-extern void *__cpu_up_task_pointer[];
+extern void *__cpu_spinwait_stack_pointer[];
+extern void *__cpu_spinwait_task_pointer[];
 
 #endif /* __ASM_HEAD_H */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH v3 3/6] RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
@ 2022-01-20  9:09   ` Atish Patra
  0 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

The __cpu_up_stack/task_pointer arrays are only used by the spinwait
method now. The per-cpu, array-based lookup is also fragile on platforms
with discontiguous/sparse hartids. The spinwait method is only used for
M-mode Linux or older firmware without the SBI HSM extension. For general
Linux systems, the ordered booting method is preferred anyway because it
supports cpu hotplug and kexec.

Make sure that __cpu_up_stack/task_pointer is only used by the spinwait
method, and take this opportunity to rename it to
__cpu_spinwait_stack/task_pointer to emphasize its purpose.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/cpu_ops.h     |  2 --
 arch/riscv/kernel/cpu_ops.c          | 16 ----------------
 arch/riscv/kernel/cpu_ops_spinwait.c | 27 ++++++++++++++++++++++++++-
 arch/riscv/kernel/head.S             |  4 ++--
 arch/riscv/kernel/head.h             |  4 ++--
 5 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/arch/riscv/include/asm/cpu_ops.h b/arch/riscv/include/asm/cpu_ops.h
index a8ec3c5c1bd2..134590f1b843 100644
--- a/arch/riscv/include/asm/cpu_ops.h
+++ b/arch/riscv/include/asm/cpu_ops.h
@@ -40,7 +40,5 @@ struct cpu_operations {
 
 extern const struct cpu_operations *cpu_ops[NR_CPUS];
 void __init cpu_set_ops(int cpu);
-void cpu_update_secondary_bootdata(unsigned int cpuid,
-				   struct task_struct *tidle);
 
 #endif /* ifndef __ASM_CPU_OPS_H */
diff --git a/arch/riscv/kernel/cpu_ops.c b/arch/riscv/kernel/cpu_ops.c
index 3f5a38b03044..c1e30f403c3b 100644
--- a/arch/riscv/kernel/cpu_ops.c
+++ b/arch/riscv/kernel/cpu_ops.c
@@ -8,31 +8,15 @@
 #include <linux/of.h>
 #include <linux/string.h>
 #include <linux/sched.h>
-#include <linux/sched/task_stack.h>
 #include <asm/cpu_ops.h>
 #include <asm/sbi.h>
 #include <asm/smp.h>
 
 const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init;
 
-void *__cpu_up_stack_pointer[NR_CPUS] __section(".data");
-void *__cpu_up_task_pointer[NR_CPUS] __section(".data");
-
 extern const struct cpu_operations cpu_ops_sbi;
 extern const struct cpu_operations cpu_ops_spinwait;
 
-void cpu_update_secondary_bootdata(unsigned int cpuid,
-				   struct task_struct *tidle)
-{
-	int hartid = cpuid_to_hartid_map(cpuid);
-
-	/* Make sure tidle is updated */
-	smp_mb();
-	WRITE_ONCE(__cpu_up_stack_pointer[hartid],
-		   task_stack_page(tidle) + THREAD_SIZE);
-	WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
-}
-
 void __init cpu_set_ops(int cpuid)
 {
 #if IS_ENABLED(CONFIG_RISCV_SBI)
diff --git a/arch/riscv/kernel/cpu_ops_spinwait.c b/arch/riscv/kernel/cpu_ops_spinwait.c
index b2c957bb68c1..346847f6c41c 100644
--- a/arch/riscv/kernel/cpu_ops_spinwait.c
+++ b/arch/riscv/kernel/cpu_ops_spinwait.c
@@ -6,11 +6,36 @@
 #include <linux/errno.h>
 #include <linux/of.h>
 #include <linux/string.h>
+#include <linux/sched/task_stack.h>
 #include <asm/cpu_ops.h>
 #include <asm/sbi.h>
 #include <asm/smp.h>
 
 const struct cpu_operations cpu_ops_spinwait;
+void *__cpu_spinwait_stack_pointer[NR_CPUS] __section(".data");
+void *__cpu_spinwait_task_pointer[NR_CPUS] __section(".data");
+
+static void cpu_update_secondary_bootdata(unsigned int cpuid,
+				   struct task_struct *tidle)
+{
+	int hartid = cpuid_to_hartid_map(cpuid);
+
+	/*
+	 * The hartid must be less than NR_CPUS to avoid out-of-bound access
+	 * errors for __cpu_spinwait_stack/task_pointer. That is not always possible
+	 * for platforms with discontiguous hartid numbering scheme. That's why
+	 * spinwait booting is not the recommended approach for any platforms
+	 * booting Linux in S-mode and can be disabled in the future.
+	 */
+	if (hartid == INVALID_HARTID || hartid >= NR_CPUS)
+		return;
+
+	/* Make sure tidle is updated */
+	smp_mb();
+	WRITE_ONCE(__cpu_spinwait_stack_pointer[hartid],
+		   task_stack_page(tidle) + THREAD_SIZE);
+	WRITE_ONCE(__cpu_spinwait_task_pointer[hartid], tidle);
+}
 
 static int spinwait_cpu_prepare(unsigned int cpuid)
 {
@@ -28,7 +53,7 @@ static int spinwait_cpu_start(unsigned int cpuid, struct task_struct *tidle)
 	 * selects the first cpu to boot the kernel and causes the remainder
 	 * of the cpus to spin in a loop waiting for their stack pointer to be
 	 * setup by that main cpu.  Writing to bootdata
-	 * (i.e __cpu_up_stack_pointer) signals to the spinning cpus that they
+	 * (i.e __cpu_spinwait_stack_pointer) signals to the spinning cpus that they
 	 * can continue the boot process.
 	 */
 	cpu_update_secondary_bootdata(cpuid, tidle);
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index bc24a90ede1c..1f11de45df9c 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -346,9 +346,9 @@ clear_bss_done:
 	csrw CSR_TVEC, a3
 
 	slli a3, a0, LGREG
-	la a1, __cpu_up_stack_pointer
+	la a1, __cpu_spinwait_stack_pointer
 	XIP_FIXUP_OFFSET a1
-	la a2, __cpu_up_task_pointer
+	la a2, __cpu_spinwait_task_pointer
 	XIP_FIXUP_OFFSET a2
 	add a1, a3, a1
 	add a2, a3, a2
diff --git a/arch/riscv/kernel/head.h b/arch/riscv/kernel/head.h
index aabbc3ac3e48..5393cca77790 100644
--- a/arch/riscv/kernel/head.h
+++ b/arch/riscv/kernel/head.h
@@ -16,7 +16,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa);
 asmlinkage void __init __copy_data(void);
 #endif
 
-extern void *__cpu_up_stack_pointer[];
-extern void *__cpu_up_task_pointer[];
+extern void *__cpu_spinwait_stack_pointer[];
+extern void *__cpu_spinwait_task_pointer[];
 
 #endif /* __ASM_HEAD_H */
-- 
2.30.2


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* [PATCH v3 4/6] RISC-V: Move the entire hart selection via lottery to SMP
  2022-01-20  9:09 ` Atish Patra
                   ` (3 preceding siblings ...)
  (?)
@ 2022-01-20  9:09 ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

The booting hart selection via lottery is only useful for SMP systems.
Moreover, the lottery selection is only necessary for systems using the
spinwait booting method. It is better to keep the entire lottery
selection together so that it can be disabled in the future.

Move the lottery selection code under CONFIG_SMP.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/kernel/head.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 1f11de45df9c..824aaeb5b951 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -263,8 +263,8 @@ pmp_done:
 	blt a0, t0, .Lgood_cores
 	tail .Lsecondary_park
 .Lgood_cores:
-#endif
 
+	/* The lottery system is only required for spinwait booting method */
 #ifndef CONFIG_XIP_KERNEL
 	/* Pick one hart to run the main boot sequence */
 	la a3, hart_lottery
@@ -283,6 +283,10 @@ pmp_done:
 	/* first time here if hart_lottery in RAM is not set */
 	beq t0, t1, .Lsecondary_start
 
+#endif /* CONFIG_XIP */
+#endif /* CONFIG_SMP */
+
+#ifdef CONFIG_XIP_KERNEL
 	la sp, _end + THREAD_SIZE
 	XIP_FIXUP_OFFSET sp
 	mv s0, a0
@@ -339,8 +343,8 @@ clear_bss_done:
 	call soc_early_init
 	tail start_kernel
 
-.Lsecondary_start:
 #ifdef CONFIG_SMP
+.Lsecondary_start:
 	/* Set trap vector to spin forever to help debug */
 	la a3, .Lsecondary_park
 	csrw CSR_TVEC, a3
-- 
2.30.2




* [PATCH v3 5/6] RISC-V: Move spinwait booting method to its own config
  2022-01-20  9:09 ` Atish Patra
@ 2022-01-20  9:09   ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

The spinwait booting method should only be used on platforms with older
firmware lacking the SBI HSM extension, or for M-mode Linux, because the
spinwait method can't support cpu hotplug, kexec or sparse hartids. It
is better to move the entire spinwait implementation under its own
config option, which can be disabled if required. It is enabled by
default to maintain backward compatibility and to keep supporting M-mode
Linux.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/Kconfig          | 14 ++++++++++++++
 arch/riscv/kernel/Makefile  |  3 ++-
 arch/riscv/kernel/cpu_ops.c |  8 ++++++++
 arch/riscv/kernel/head.S    |  8 ++++----
 arch/riscv/kernel/head.h    |  2 ++
 5 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4602cfe92a20..61afe4f1ad1e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -404,6 +404,20 @@ config RISCV_SBI_V01
 	  This config allows kernel to use SBI v0.1 APIs. This will be
 	  deprecated in future once legacy M-mode software are no longer in use.
 
+config RISCV_BOOT_SPINWAIT
+	bool "Spinwait booting method"
+	depends on SMP
+	default y
+	help
+	  This enables support for booting Linux via spinwait method. In the
+	  spinwait method, all cores randomly jump to Linux. One of the cores
+	  gets chosen via lottery and all other keep spinning on a percpu
+	  variable. This method cannot support CPU hotplug and sparse hartid
+	  scheme. It should be only enabled for M-mode Linux or platforms relying
+	  on older firmware without SBI HSM extension. All other platforms should
+	  rely on ordered booting via SBI HSM extension which gets chosen
+	  dynamically at runtime if the firmware supports it.
+
 config KEXEC
 	bool "Kexec system call"
 	select KEXEC_CORE
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 3397ddac1a30..612556faa527 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -43,7 +43,8 @@ obj-$(CONFIG_FPU)		+= fpu.o
 obj-$(CONFIG_SMP)		+= smpboot.o
 obj-$(CONFIG_SMP)		+= smp.o
 obj-$(CONFIG_SMP)		+= cpu_ops.o
-obj-$(CONFIG_SMP)		+= cpu_ops_spinwait.o
+
+obj-$(CONFIG_RISCV_BOOT_SPINWAIT) += cpu_ops_spinwait.o
 obj-$(CONFIG_MODULES)		+= module.o
 obj-$(CONFIG_MODULE_SECTIONS)	+= module-sections.o
 
diff --git a/arch/riscv/kernel/cpu_ops.c b/arch/riscv/kernel/cpu_ops.c
index c1e30f403c3b..170d07e57721 100644
--- a/arch/riscv/kernel/cpu_ops.c
+++ b/arch/riscv/kernel/cpu_ops.c
@@ -15,7 +15,15 @@
 const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init;
 
 extern const struct cpu_operations cpu_ops_sbi;
+#ifdef CONFIG_RISCV_BOOT_SPINWAIT
 extern const struct cpu_operations cpu_ops_spinwait;
+#else
+const struct cpu_operations cpu_ops_spinwait = {
+	.name		= "",
+	.cpu_prepare	= NULL,
+	.cpu_start	= NULL,
+};
+#endif
 
 void __init cpu_set_ops(int cpuid)
 {
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 824aaeb5b951..2dfeea56d5fe 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -258,7 +258,7 @@ pmp_done:
 	li t0, SR_FS
 	csrc CSR_STATUS, t0
 
-#ifdef CONFIG_SMP
+#ifdef CONFIG_RISCV_BOOT_SPINWAIT
 	li t0, CONFIG_NR_CPUS
 	blt a0, t0, .Lgood_cores
 	tail .Lsecondary_park
@@ -284,7 +284,7 @@ pmp_done:
 	beq t0, t1, .Lsecondary_start
 
 #endif /* CONFIG_XIP */
-#endif /* CONFIG_SMP */
+#endif /* CONFIG_RISCV_BOOT_SPINWAIT */
 
 #ifdef CONFIG_XIP_KERNEL
 	la sp, _end + THREAD_SIZE
@@ -343,7 +343,7 @@ clear_bss_done:
 	call soc_early_init
 	tail start_kernel
 
-#ifdef CONFIG_SMP
+#ifdef CONFIG_RISCV_BOOT_SPINWAIT
 .Lsecondary_start:
 	/* Set trap vector to spin forever to help debug */
 	la a3, .Lsecondary_park
@@ -370,7 +370,7 @@ clear_bss_done:
 	fence
 
 	tail .Lsecondary_start_common
-#endif
+#endif /* CONFIG_RISCV_BOOT_SPINWAIT */
 
 END(_start_kernel)
 
diff --git a/arch/riscv/kernel/head.h b/arch/riscv/kernel/head.h
index 5393cca77790..726731ada534 100644
--- a/arch/riscv/kernel/head.h
+++ b/arch/riscv/kernel/head.h
@@ -16,7 +16,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa);
 asmlinkage void __init __copy_data(void);
 #endif
 
+#ifdef CONFIG_RISCV_BOOT_SPINWAIT
 extern void *__cpu_spinwait_stack_pointer[];
 extern void *__cpu_spinwait_task_pointer[];
+#endif
 
 #endif /* __ASM_HEAD_H */
-- 
2.30.2




* [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-20  9:09 ` Atish Patra
@ 2022-01-20  9:09   ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

Currently, SBI APIs accept a hartmask that is generated from struct
cpumask. The cpumask data structure can only hold up to NR_CPUS bits.
Thus, it is not the correct data structure for hartids, which can be
higher than NR_CPUS on platforms with sparse or discontiguous hartids.

Remove all association between hartid mask and struct cpumask.

Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/sbi.h      |  19 +--
 arch/riscv/include/asm/smp.h      |   2 -
 arch/riscv/kernel/sbi.c           | 189 +++++++++++++++++-------------
 arch/riscv/kernel/setup.c         |  10 --
 arch/riscv/kernel/smpboot.c       |   2 +-
 arch/riscv/kvm/mmu.c              |   4 +-
 arch/riscv/kvm/vcpu_sbi_replace.c |  11 +-
 arch/riscv/kvm/vcpu_sbi_v01.c     |  11 +-
 arch/riscv/kvm/vmid.c             |   4 +-
 arch/riscv/mm/cacheflush.c        |   5 +-
 arch/riscv/mm/tlbflush.c          |   9 +-
 11 files changed, 130 insertions(+), 136 deletions(-)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 26ba6f2d7a40..d1c37479d828 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -8,6 +8,7 @@
 #define _ASM_RISCV_SBI_H
 
 #include <linux/types.h>
+#include <linux/cpumask.h>
 
 #ifdef CONFIG_RISCV_SBI
 enum sbi_ext_id {
@@ -128,27 +129,27 @@ long sbi_get_mimpid(void);
 void sbi_set_timer(uint64_t stime_value);
 void sbi_shutdown(void);
 void sbi_clear_ipi(void);
-int sbi_send_ipi(const unsigned long *hart_mask);
-int sbi_remote_fence_i(const unsigned long *hart_mask);
-int sbi_remote_sfence_vma(const unsigned long *hart_mask,
+int sbi_send_ipi(const struct cpumask *cpu_mask);
+int sbi_remote_fence_i(const struct cpumask *cpu_mask);
+int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size);
 
-int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
+int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long asid);
-int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
+int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size);
-int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
+int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long vmid);
-int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
+int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size);
-int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
+int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long asid);
@@ -183,7 +184,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
 
 int sbi_err_map_linux_errno(int err);
 #else /* CONFIG_RISCV_SBI */
-static inline int sbi_remote_fence_i(const unsigned long *hart_mask) { return -1; }
+static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
 static inline void sbi_init(void) {}
 #endif /* CONFIG_RISCV_SBI */
 #endif /* _ASM_RISCV_SBI_H */
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 6ad749f42807..23170c933d73 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -92,8 +92,6 @@ static inline void riscv_clear_ipi(void)
 
 #endif /* CONFIG_SMP */
 
-void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
-
 #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
 bool cpu_has_hotplug(unsigned int cpu);
 #else
diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index 9a84f0cb5175..f72527fcb347 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -16,8 +16,8 @@ unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
 EXPORT_SYMBOL(sbi_spec_version);
 
 static void (*__sbi_set_timer)(uint64_t stime) __ro_after_init;
-static int (*__sbi_send_ipi)(const unsigned long *hart_mask) __ro_after_init;
-static int (*__sbi_rfence)(int fid, const unsigned long *hart_mask,
+static int (*__sbi_send_ipi)(const struct cpumask *cpu_mask) __ro_after_init;
+static int (*__sbi_rfence)(int fid, const struct cpumask *cpu_mask,
 			   unsigned long start, unsigned long size,
 			   unsigned long arg4, unsigned long arg5) __ro_after_init;
 
@@ -67,6 +67,30 @@ int sbi_err_map_linux_errno(int err)
 EXPORT_SYMBOL(sbi_err_map_linux_errno);
 
 #ifdef CONFIG_RISCV_SBI_V01
+static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
+{
+	unsigned long cpuid, hartid;
+	unsigned long hmask = 0;
+
+	/*
+	 * There is no maximum hartid concept in RISC-V and NR_CPUS must not be
+	 * associated with hartid. As SBI v0.1 is only kept for backward compatibility
+	 * and will be removed in the future, there is no point in supporting hartid
+	 * greater than BITS_PER_LONG (32 for RV32 and 64 for RV64). Ideally, SBI v0.2
+	 * should be used for platforms with hartid greater than BITS_PER_LONG.
+	 */
+	for_each_cpu(cpuid, cpu_mask) {
+		hartid = cpuid_to_hartid_map(cpuid);
+		if (hartid >= BITS_PER_LONG) {
+			pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
+			break;
+		}
+		hmask |= 1 << hartid;
+	}
+
+	return hmask;
+}
+
 /**
  * sbi_console_putchar() - Writes given character to the console device.
  * @ch: The data to be written to the console.
@@ -132,33 +156,44 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
 #endif
 }
 
-static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
+static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
-	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)hart_mask,
+	unsigned long hart_mask;
+
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
+	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
+
+	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)(&hart_mask),
 		  0, 0, 0, 0, 0);
 	return 0;
 }
 
-static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
+static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
 			    unsigned long start, unsigned long size,
 			    unsigned long arg4, unsigned long arg5)
 {
 	int result = 0;
+	unsigned long hart_mask;
+
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
+	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
 
 	/* v0.2 function IDs are equivalent to v0.1 extension IDs */
 	switch (fid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
 		sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
-			  (unsigned long)hart_mask, 0, 0, 0, 0, 0);
+			  (unsigned long)&hart_mask, 0, 0, 0, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA, 0,
-			  (unsigned long)hart_mask, start, size,
+			  (unsigned long)&hart_mask, start, size,
 			  0, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID, 0,
-			  (unsigned long)hart_mask, start, size,
+			  (unsigned long)&hart_mask, start, size,
 			  arg4, 0, 0);
 		break;
 	default:
@@ -180,7 +215,7 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
 		sbi_major_version(), sbi_minor_version());
 }
 
-static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
+static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
 	pr_warn("IPI extension is not available in SBI v%lu.%lu\n",
 		sbi_major_version(), sbi_minor_version());
@@ -188,7 +223,7 @@ static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
 	return 0;
 }
 
-static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
+static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
 			    unsigned long start, unsigned long size,
 			    unsigned long arg4, unsigned long arg5)
 {
@@ -212,37 +247,33 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
 #endif
 }
 
-static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
+static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
 {
-	unsigned long hartid, hmask_val, hbase;
-	struct cpumask tmask;
+	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
 	struct sbiret ret = {0};
 	int result;
 
-	if (!hart_mask || !(*hart_mask)) {
-		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
-		hart_mask = cpumask_bits(&tmask);
-	}
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
 
-	hmask_val = 0;
-	hbase = 0;
-	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
-		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
+	for_each_cpu(cpuid, cpu_mask) {
+		hartid = cpuid_to_hartid_map(cpuid);
+		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
 			ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
-					hmask_val, hbase, 0, 0, 0, 0);
+					hmask, hbase, 0, 0, 0, 0);
 			if (ret.error)
 				goto ecall_failed;
-			hmask_val = 0;
+			hmask = 0;
 			hbase = 0;
 		}
-		if (!hmask_val)
+		if (!hmask)
 			hbase = hartid;
-		hmask_val |= 1UL << (hartid - hbase);
+		hmask |= 1UL << (hartid - hbase);
 	}
 
-	if (hmask_val) {
+	if (hmask) {
 		ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
-				hmask_val, hbase, 0, 0, 0, 0);
+				hmask, hbase, 0, 0, 0, 0);
 		if (ret.error)
 			goto ecall_failed;
 	}
@@ -252,11 +283,11 @@ static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
 ecall_failed:
 	result = sbi_err_map_linux_errno(ret.error);
 	pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
-	       __func__, hbase, hmask_val, result);
+	       __func__, hbase, hmask, result);
 	return result;
 }
 
-static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
+static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask,
 				 unsigned long hbase, unsigned long start,
 				 unsigned long size, unsigned long arg4,
 				 unsigned long arg5)
@@ -267,31 +298,31 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
 
 	switch (fid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, 0, 0, 0, 0);
+		ret = sbi_ecall(ext, fid, hmask, hbase, 0, 0, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, arg4, 0);
 		break;
 
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, arg4, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, arg4, 0);
 		break;
 	default:
@@ -303,43 +334,39 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
 	if (ret.error) {
 		result = sbi_err_map_linux_errno(ret.error);
 		pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
-		       __func__, hbase, hmask_val, result);
+		       __func__, hbase, hmask, result);
 	}
 
 	return result;
 }
 
-static int __sbi_rfence_v02(int fid, const unsigned long *hart_mask,
+static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
 			    unsigned long start, unsigned long size,
 			    unsigned long arg4, unsigned long arg5)
 {
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 9af67dbdc66a..f80a34fbf102 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -114,7 +114,6 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
 
 static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
 {
-	struct cpumask hmask;
 	unsigned long size = PAGE_SIZE;
 	struct kvm_vmid *vmid = &kvm->arch.vmid;
 
@@ -127,8 +126,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
 	 * where the Guest/VM is running.
 	 */
 	preempt_disable();
-	riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
-	sbi_remote_hfence_gvma_vmid(cpumask_bits(&hmask), addr, size,
+	sbi_remote_hfence_gvma_vmid(cpu_online_mask, addr, size,
 				    READ_ONCE(vmid->vmid));
 	preempt_enable();
 }
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index 00036b7f83b9..1bc0608a5bfd 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -82,7 +82,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 {
 	int ret = 0;
 	unsigned long i;
-	struct cpumask cm, hm;
+	struct cpumask cm;
 	struct kvm_vcpu *tmp;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	unsigned long hmask = cp->a0;
@@ -90,7 +90,6 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 	unsigned long funcid = cp->a6;
 
 	cpumask_clear(&cm);
-	cpumask_clear(&hm);
 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
 		if (hbase != -1UL) {
 			if (tmp->vcpu_id < hbase)
@@ -103,17 +102,15 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 		cpumask_set_cpu(tmp->cpu, &cm);
 	}
 
-	riscv_cpuid_to_hartid_mask(&cm, &hm);
-
 	switch (funcid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
-		ret = sbi_remote_fence_i(cpumask_bits(&hm));
+		ret = sbi_remote_fence_i(&cm);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
-		ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
+		ret = sbi_remote_hfence_vvma(&cm, cp->a2, cp->a3);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
-		ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
+		ret = sbi_remote_hfence_vvma_asid(&cm, cp->a2,
 						  cp->a3, cp->a4);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
index 4c7e13ec9ccc..07e2de14433a 100644
--- a/arch/riscv/kvm/vcpu_sbi_v01.c
+++ b/arch/riscv/kvm/vcpu_sbi_v01.c
@@ -38,7 +38,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	int i, ret = 0;
 	u64 next_cycle;
 	struct kvm_vcpu *rvcpu;
-	struct cpumask cm, hm;
+	struct cpumask cm;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 
@@ -101,15 +101,12 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 				continue;
 			cpumask_set_cpu(rvcpu->cpu, &cm);
 		}
-		riscv_cpuid_to_hartid_mask(&cm, &hm);
 		if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
-			ret = sbi_remote_fence_i(cpumask_bits(&hm));
+			ret = sbi_remote_fence_i(&cm);
 		else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
-			ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
-						cp->a1, cp->a2);
+			ret = sbi_remote_hfence_vvma(&cm, cp->a1, cp->a2);
 		else
-			ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
-						cp->a1, cp->a2, cp->a3);
+			ret = sbi_remote_hfence_vvma_asid(&cm, cp->a1, cp->a2, cp->a3);
 		break;
 	default:
 		ret = -EINVAL;
diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
index 807228f8f409..2fa4f7b1813d 100644
--- a/arch/riscv/kvm/vmid.c
+++ b/arch/riscv/kvm/vmid.c
@@ -67,7 +67,6 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
 {
 	unsigned long i;
 	struct kvm_vcpu *v;
-	struct cpumask hmask;
 	struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;
 
 	if (!kvm_riscv_stage2_vmid_ver_changed(vmid))
@@ -102,8 +101,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
 		 * running, we force VM exits on all host CPUs using IPI and
 		 * flush all Guest TLBs.
 		 */
-		riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
-		sbi_remote_hfence_gvma(cpumask_bits(&hmask), 0, 0);
+		sbi_remote_hfence_gvma(cpu_online_mask, 0, 0);
 	}
 
 	vmid->vmid = vmid_next;
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 89f81067e09e..6cb7d96ad9c7 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -67,10 +67,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
 		 */
 		smp_mb();
 	} else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
-		cpumask_t hartid_mask;
-
-		riscv_cpuid_to_hartid_mask(&others, &hartid_mask);
-		sbi_remote_fence_i(cpumask_bits(&hartid_mask));
+		sbi_remote_fence_i(&others);
 	} else {
 		on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
 	}
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 64f8201237c2..37ed760d007c 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -32,7 +32,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 				  unsigned long size, unsigned long stride)
 {
 	struct cpumask *cmask = mm_cpumask(mm);
-	struct cpumask hmask;
 	unsigned int cpuid;
 	bool broadcast;
 
@@ -46,9 +45,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 		unsigned long asid = atomic_long_read(&mm->context.id);
 
 		if (broadcast) {
-			riscv_cpuid_to_hartid_mask(cmask, &hmask);
-			sbi_remote_sfence_vma_asid(cpumask_bits(&hmask),
-						   start, size, asid);
+			sbi_remote_sfence_vma_asid(cmask, start, size, asid);
 		} else if (size <= stride) {
 			local_flush_tlb_page_asid(start, asid);
 		} else {
@@ -56,9 +53,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 		}
 	} else {
 		if (broadcast) {
-			riscv_cpuid_to_hartid_mask(cmask, &hmask);
-			sbi_remote_sfence_vma(cpumask_bits(&hmask),
-					      start, size);
+			sbi_remote_sfence_vma(cmask, start, size);
 		} else if (size <= stride) {
 			local_flush_tlb_page(start);
 		} else {
-- 
2.30.2



* [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
@ 2022-01-20  9:09   ` Atish Patra
  0 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-20  9:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Albert Ou, Atish Patra, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

Currently, SBI APIs accept a hartmask that is generated from struct
cpumask. The cpumask data structure can hold up to NR_CPUS bits. Thus,
it is not the correct data structure for hartids, which can be greater
than NR_CPUS on platforms with sparse or discontiguous hartids.

Remove all association between hartid mask and struct cpumask.

Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/sbi.h      |  19 +--
 arch/riscv/include/asm/smp.h      |   2 -
 arch/riscv/kernel/sbi.c           | 189 +++++++++++++++++-------------
 arch/riscv/kernel/setup.c         |  10 --
 arch/riscv/kernel/smpboot.c       |   2 +-
 arch/riscv/kvm/mmu.c              |   4 +-
 arch/riscv/kvm/vcpu_sbi_replace.c |  11 +-
 arch/riscv/kvm/vcpu_sbi_v01.c     |  11 +-
 arch/riscv/kvm/vmid.c             |   4 +-
 arch/riscv/mm/cacheflush.c        |   5 +-
 arch/riscv/mm/tlbflush.c          |   9 +-
 11 files changed, 130 insertions(+), 136 deletions(-)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 26ba6f2d7a40..d1c37479d828 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -8,6 +8,7 @@
 #define _ASM_RISCV_SBI_H
 
 #include <linux/types.h>
+#include <linux/cpumask.h>
 
 #ifdef CONFIG_RISCV_SBI
 enum sbi_ext_id {
@@ -128,27 +129,27 @@ long sbi_get_mimpid(void);
 void sbi_set_timer(uint64_t stime_value);
 void sbi_shutdown(void);
 void sbi_clear_ipi(void);
-int sbi_send_ipi(const unsigned long *hart_mask);
-int sbi_remote_fence_i(const unsigned long *hart_mask);
-int sbi_remote_sfence_vma(const unsigned long *hart_mask,
+int sbi_send_ipi(const struct cpumask *cpu_mask);
+int sbi_remote_fence_i(const struct cpumask *cpu_mask);
+int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size);
 
-int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
+int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long asid);
-int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
+int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size);
-int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
+int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long vmid);
-int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
+int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size);
-int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
+int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long asid);
@@ -183,7 +184,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
 
 int sbi_err_map_linux_errno(int err);
 #else /* CONFIG_RISCV_SBI */
-static inline int sbi_remote_fence_i(const unsigned long *hart_mask) { return -1; }
+static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
 static inline void sbi_init(void) {}
 #endif /* CONFIG_RISCV_SBI */
 #endif /* _ASM_RISCV_SBI_H */
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 6ad749f42807..23170c933d73 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -92,8 +92,6 @@ static inline void riscv_clear_ipi(void)
 
 #endif /* CONFIG_SMP */
 
-void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
-
 #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
 bool cpu_has_hotplug(unsigned int cpu);
 #else
diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index 9a84f0cb5175..f72527fcb347 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -16,8 +16,8 @@ unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
 EXPORT_SYMBOL(sbi_spec_version);
 
 static void (*__sbi_set_timer)(uint64_t stime) __ro_after_init;
-static int (*__sbi_send_ipi)(const unsigned long *hart_mask) __ro_after_init;
-static int (*__sbi_rfence)(int fid, const unsigned long *hart_mask,
+static int (*__sbi_send_ipi)(const struct cpumask *cpu_mask) __ro_after_init;
+static int (*__sbi_rfence)(int fid, const struct cpumask *cpu_mask,
 			   unsigned long start, unsigned long size,
 			   unsigned long arg4, unsigned long arg5) __ro_after_init;
 
@@ -67,6 +67,30 @@ int sbi_err_map_linux_errno(int err)
 EXPORT_SYMBOL(sbi_err_map_linux_errno);
 
 #ifdef CONFIG_RISCV_SBI_V01
+static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
+{
+	unsigned long cpuid, hartid;
+	unsigned long hmask = 0;
+
+	/*
+	 * There is no maximum hartid concept in RISC-V and NR_CPUS must not be
+	 * associated with hartid. As SBI v0.1 is only kept for backward compatibility
+	 * and will be removed in the future, there is no point in supporting hartids
+	 * greater than or equal to BITS_PER_LONG (32 for RV32 and 64 for RV64).
+	 * Ideally, SBI v0.2 should be used on platforms with hartids beyond that range.
+	 */
+	for_each_cpu(cpuid, cpu_mask) {
+		hartid = cpuid_to_hartid_map(cpuid);
+		if (hartid >= BITS_PER_LONG) {
+			pr_warn("Unable to send any request to hartid >= BITS_PER_LONG for SBI v0.1\n");
+			break;
+		}
+		hmask |= 1UL << hartid;
+	}
+
+	return hmask;
+}
+
 /**
  * sbi_console_putchar() - Writes given character to the console device.
  * @ch: The data to be written to the console.
@@ -132,33 +156,44 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
 #endif
 }
 
-static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
+static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
-	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)hart_mask,
+	unsigned long hart_mask;
+
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
+	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
+
+	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)(&hart_mask),
 		  0, 0, 0, 0, 0);
 	return 0;
 }
 
-static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
+static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
 			    unsigned long start, unsigned long size,
 			    unsigned long arg4, unsigned long arg5)
 {
 	int result = 0;
+	unsigned long hart_mask;
+
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
+	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
 
 	/* v0.2 function IDs are equivalent to v0.1 extension IDs */
 	switch (fid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
 		sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
-			  (unsigned long)hart_mask, 0, 0, 0, 0, 0);
+			  (unsigned long)&hart_mask, 0, 0, 0, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA, 0,
-			  (unsigned long)hart_mask, start, size,
+			  (unsigned long)&hart_mask, start, size,
 			  0, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID, 0,
-			  (unsigned long)hart_mask, start, size,
+			  (unsigned long)&hart_mask, start, size,
 			  arg4, 0, 0);
 		break;
 	default:
@@ -180,7 +215,7 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
 		sbi_major_version(), sbi_minor_version());
 }
 
-static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
+static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
 	pr_warn("IPI extension is not available in SBI v%lu.%lu\n",
 		sbi_major_version(), sbi_minor_version());
@@ -188,7 +223,7 @@ static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
 	return 0;
 }
 
-static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
+static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
 			    unsigned long start, unsigned long size,
 			    unsigned long arg4, unsigned long arg5)
 {
@@ -212,37 +247,33 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
 #endif
 }
 
-static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
+static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
 {
-	unsigned long hartid, hmask_val, hbase;
-	struct cpumask tmask;
+	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
 	struct sbiret ret = {0};
 	int result;
 
-	if (!hart_mask || !(*hart_mask)) {
-		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
-		hart_mask = cpumask_bits(&tmask);
-	}
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
 
-	hmask_val = 0;
-	hbase = 0;
-	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
-		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
+	for_each_cpu(cpuid, cpu_mask) {
+		hartid = cpuid_to_hartid_map(cpuid);
+		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
 			ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
-					hmask_val, hbase, 0, 0, 0, 0);
+					hmask, hbase, 0, 0, 0, 0);
 			if (ret.error)
 				goto ecall_failed;
-			hmask_val = 0;
+			hmask = 0;
 			hbase = 0;
 		}
-		if (!hmask_val)
+		if (!hmask)
 			hbase = hartid;
-		hmask_val |= 1UL << (hartid - hbase);
+		hmask |= 1UL << (hartid - hbase);
 	}
 
-	if (hmask_val) {
+	if (hmask) {
 		ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
-				hmask_val, hbase, 0, 0, 0, 0);
+				hmask, hbase, 0, 0, 0, 0);
 		if (ret.error)
 			goto ecall_failed;
 	}
@@ -252,11 +283,11 @@ static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
 ecall_failed:
 	result = sbi_err_map_linux_errno(ret.error);
 	pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
-	       __func__, hbase, hmask_val, result);
+	       __func__, hbase, hmask, result);
 	return result;
 }
 
-static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
+static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask,
 				 unsigned long hbase, unsigned long start,
 				 unsigned long size, unsigned long arg4,
 				 unsigned long arg5)
@@ -267,31 +298,31 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
 
 	switch (fid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, 0, 0, 0, 0);
+		ret = sbi_ecall(ext, fid, hmask, hbase, 0, 0, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, arg4, 0);
 		break;
 
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, arg4, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, 0, 0);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
-		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
+		ret = sbi_ecall(ext, fid, hmask, hbase, start,
 				size, arg4, 0);
 		break;
 	default:
@@ -303,43 +334,39 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
 	if (ret.error) {
 		result = sbi_err_map_linux_errno(ret.error);
 		pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
-		       __func__, hbase, hmask_val, result);
+		       __func__, hbase, hmask, result);
 	}
 
 	return result;
 }
 
-static int __sbi_rfence_v02(int fid, const unsigned long *hart_mask,
+static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
 			    unsigned long start, unsigned long size,
 			    unsigned long arg4, unsigned long arg5)
 {
-	unsigned long hmask_val, hartid, hbase;
-	struct cpumask tmask;
+	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
 	int result;
 
-	if (!hart_mask || !(*hart_mask)) {
-		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
-		hart_mask = cpumask_bits(&tmask);
-	}
+	if (!cpu_mask)
+		cpu_mask = cpu_online_mask;
 
-	hmask_val = 0;
-	hbase = 0;
-	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
-		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
-			result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
+	for_each_cpu(cpuid, cpu_mask) {
+		hartid = cpuid_to_hartid_map(cpuid);
+		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
+			result = __sbi_rfence_v02_call(fid, hmask, hbase,
 						       start, size, arg4, arg5);
 			if (result)
 				return result;
-			hmask_val = 0;
+			hmask = 0;
 			hbase = 0;
 		}
-		if (!hmask_val)
+		if (!hmask)
 			hbase = hartid;
-		hmask_val |= 1UL << (hartid - hbase);
+		hmask |= 1UL << (hartid - hbase);
 	}
 
-	if (hmask_val) {
-		result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
+	if (hmask) {
+		result = __sbi_rfence_v02_call(fid, hmask, hbase,
 					       start, size, arg4, arg5);
 		if (result)
 			return result;
@@ -361,44 +388,44 @@ void sbi_set_timer(uint64_t stime_value)
 
 /**
  * sbi_send_ipi() - Send an IPI to any hart.
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  *
  * Return: 0 on success, appropriate linux error code otherwise.
  */
-int sbi_send_ipi(const unsigned long *hart_mask)
+int sbi_send_ipi(const struct cpumask *cpu_mask)
 {
-	return __sbi_send_ipi(hart_mask);
+	return __sbi_send_ipi(cpu_mask);
 }
 EXPORT_SYMBOL(sbi_send_ipi);
 
 /**
  * sbi_remote_fence_i() - Execute FENCE.I instruction on given remote harts.
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  *
  * Return: 0 on success, appropriate linux error code otherwise.
  */
-int sbi_remote_fence_i(const unsigned long *hart_mask)
+int sbi_remote_fence_i(const struct cpumask *cpu_mask)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_FENCE_I,
-			    hart_mask, 0, 0, 0, 0);
+			    cpu_mask, 0, 0, 0, 0);
 }
 EXPORT_SYMBOL(sbi_remote_fence_i);
 
 /**
  * sbi_remote_sfence_vma() - Execute SFENCE.VMA instructions on given remote
  *			     harts for the specified virtual address range.
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  * @start: Start of the virtual address
  * @size: Total size of the virtual address range.
  *
  * Return: 0 on success, appropriate linux error code otherwise.
  */
-int sbi_remote_sfence_vma(const unsigned long *hart_mask,
+int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
-			    hart_mask, start, size, 0, 0);
+			    cpu_mask, start, size, 0, 0);
 }
 EXPORT_SYMBOL(sbi_remote_sfence_vma);
 
@@ -406,38 +433,38 @@ EXPORT_SYMBOL(sbi_remote_sfence_vma);
  * sbi_remote_sfence_vma_asid() - Execute SFENCE.VMA instructions on given
  * remote harts for a virtual address range belonging to a specific ASID.
  *
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  * @start: Start of the virtual address
  * @size: Total size of the virtual address range.
  * @asid: The value of address space identifier (ASID).
  *
  * Return: 0 on success, appropriate linux error code otherwise.
  */
-int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
+int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long asid)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
-			    hart_mask, start, size, asid, 0);
+			    cpu_mask, start, size, asid, 0);
 }
 EXPORT_SYMBOL(sbi_remote_sfence_vma_asid);
 
 /**
  * sbi_remote_hfence_gvma() - Execute HFENCE.GVMA instructions on given remote
  *			   harts for the specified guest physical address range.
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  * @start: Start of the guest physical address
  * @size: Total size of the guest physical address range.
  *
  * Return: None
  */
-int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
+int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
-			    hart_mask, start, size, 0, 0);
+			    cpu_mask, start, size, 0, 0);
 }
 EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
 
@@ -445,38 +472,38 @@ EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
  * sbi_remote_hfence_gvma_vmid() - Execute HFENCE.GVMA instructions on given
  * remote harts for a guest physical address range belonging to a specific VMID.
  *
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  * @start: Start of the guest physical address
  * @size: Total size of the guest physical address range.
  * @vmid: The value of guest ID (VMID).
  *
  * Return: 0 if success, Error otherwise.
  */
-int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
+int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long vmid)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
-			    hart_mask, start, size, vmid, 0);
+			    cpu_mask, start, size, vmid, 0);
 }
 EXPORT_SYMBOL(sbi_remote_hfence_gvma_vmid);
 
 /**
  * sbi_remote_hfence_vvma() - Execute HFENCE.VVMA instructions on given remote
  *			     harts for the current guest virtual address range.
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  * @start: Start of the current guest virtual address
  * @size: Total size of the current guest virtual address range.
  *
  * Return: None
  */
-int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
+int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
 			   unsigned long start,
 			   unsigned long size)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
-			    hart_mask, start, size, 0, 0);
+			    cpu_mask, start, size, 0, 0);
 }
 EXPORT_SYMBOL(sbi_remote_hfence_vvma);
 
@@ -485,20 +512,20 @@ EXPORT_SYMBOL(sbi_remote_hfence_vvma);
  * remote harts for current guest virtual address range belonging to a specific
  * ASID.
  *
- * @hart_mask: A cpu mask containing all the target harts.
+ * @cpu_mask: A cpu mask containing all the target harts.
  * @start: Start of the current guest virtual address
  * @size: Total size of the current guest virtual address range.
  * @asid: The value of address space identifier (ASID).
  *
  * Return: None
  */
-int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
+int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
 				unsigned long start,
 				unsigned long size,
 				unsigned long asid)
 {
 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
-			    hart_mask, start, size, asid, 0);
+			    cpu_mask, start, size, asid, 0);
 }
 EXPORT_SYMBOL(sbi_remote_hfence_vvma_asid);
 
@@ -591,11 +618,7 @@ long sbi_get_mimpid(void)
 
 static void sbi_send_cpumask_ipi(const struct cpumask *target)
 {
-	struct cpumask hartid_mask;
-
-	riscv_cpuid_to_hartid_mask(target, &hartid_mask);
-
-	sbi_send_ipi(cpumask_bits(&hartid_mask));
+	sbi_send_ipi(target);
 }
 
 static const struct riscv_ipi_ops sbi_ipi_ops = {
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 63241abe84eb..b42bfdc67482 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -59,16 +59,6 @@ atomic_t hart_lottery __section(".sdata")
 unsigned long boot_cpu_hartid;
 static DEFINE_PER_CPU(struct cpu, cpu_devices);
 
-void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out)
-{
-	int cpu;
-
-	cpumask_clear(out);
-	for_each_cpu(cpu, in)
-		cpumask_set_cpu(cpuid_to_hartid_map(cpu), out);
-}
-EXPORT_SYMBOL_GPL(riscv_cpuid_to_hartid_mask);
-
 /*
  * Place kernel memory regions on the resource tree so that
  * kexec-tools can retrieve them from /proc/iomem. While there
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index bd82375db51a..622f226454d5 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -96,7 +96,7 @@ void __init setup_smp(void)
 		if (cpuid >= NR_CPUS) {
 			pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
 				cpuid, hart);
-			break;
+			continue;
 		}
 
 		cpuid_to_hartid_map(cpuid) = hart;
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 9af67dbdc66a..f80a34fbf102 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -114,7 +114,6 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
 
 static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
 {
-	struct cpumask hmask;
 	unsigned long size = PAGE_SIZE;
 	struct kvm_vmid *vmid = &kvm->arch.vmid;
 
@@ -127,8 +126,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
 	 * where the Guest/VM is running.
 	 */
 	preempt_disable();
-	riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
-	sbi_remote_hfence_gvma_vmid(cpumask_bits(&hmask), addr, size,
+	sbi_remote_hfence_gvma_vmid(cpu_online_mask, addr, size,
 				    READ_ONCE(vmid->vmid));
 	preempt_enable();
 }
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index 00036b7f83b9..1bc0608a5bfd 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -82,7 +82,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 {
 	int ret = 0;
 	unsigned long i;
-	struct cpumask cm, hm;
+	struct cpumask cm;
 	struct kvm_vcpu *tmp;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	unsigned long hmask = cp->a0;
@@ -90,7 +90,6 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 	unsigned long funcid = cp->a6;
 
 	cpumask_clear(&cm);
-	cpumask_clear(&hm);
 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
 		if (hbase != -1UL) {
 			if (tmp->vcpu_id < hbase)
@@ -103,17 +102,15 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 		cpumask_set_cpu(tmp->cpu, &cm);
 	}
 
-	riscv_cpuid_to_hartid_mask(&cm, &hm);
-
 	switch (funcid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
-		ret = sbi_remote_fence_i(cpumask_bits(&hm));
+		ret = sbi_remote_fence_i(&cm);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
-		ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
+		ret = sbi_remote_hfence_vvma(&cm, cp->a2, cp->a3);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
-		ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
+		ret = sbi_remote_hfence_vvma_asid(&cm, cp->a2,
 						  cp->a3, cp->a4);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
index 4c7e13ec9ccc..07e2de14433a 100644
--- a/arch/riscv/kvm/vcpu_sbi_v01.c
+++ b/arch/riscv/kvm/vcpu_sbi_v01.c
@@ -38,7 +38,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	int i, ret = 0;
 	u64 next_cycle;
 	struct kvm_vcpu *rvcpu;
-	struct cpumask cm, hm;
+	struct cpumask cm;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 
@@ -101,15 +101,12 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 				continue;
 			cpumask_set_cpu(rvcpu->cpu, &cm);
 		}
-		riscv_cpuid_to_hartid_mask(&cm, &hm);
 		if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
-			ret = sbi_remote_fence_i(cpumask_bits(&hm));
+			ret = sbi_remote_fence_i(&cm);
 		else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
-			ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
-						cp->a1, cp->a2);
+			ret = sbi_remote_hfence_vvma(&cm, cp->a1, cp->a2);
 		else
-			ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
-						cp->a1, cp->a2, cp->a3);
+			ret = sbi_remote_hfence_vvma_asid(&cm, cp->a1, cp->a2, cp->a3);
 		break;
 	default:
 		ret = -EINVAL;
diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
index 807228f8f409..2fa4f7b1813d 100644
--- a/arch/riscv/kvm/vmid.c
+++ b/arch/riscv/kvm/vmid.c
@@ -67,7 +67,6 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
 {
 	unsigned long i;
 	struct kvm_vcpu *v;
-	struct cpumask hmask;
 	struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;
 
 	if (!kvm_riscv_stage2_vmid_ver_changed(vmid))
@@ -102,8 +101,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
 		 * running, we force VM exits on all host CPUs using IPI and
 		 * flush all Guest TLBs.
 		 */
-		riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
-		sbi_remote_hfence_gvma(cpumask_bits(&hmask), 0, 0);
+		sbi_remote_hfence_gvma(cpu_online_mask, 0, 0);
 	}
 
 	vmid->vmid = vmid_next;
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 89f81067e09e..6cb7d96ad9c7 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -67,10 +67,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
 		 */
 		smp_mb();
 	} else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
-		cpumask_t hartid_mask;
-
-		riscv_cpuid_to_hartid_mask(&others, &hartid_mask);
-		sbi_remote_fence_i(cpumask_bits(&hartid_mask));
+		sbi_remote_fence_i(&others);
 	} else {
 		on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
 	}
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 64f8201237c2..37ed760d007c 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -32,7 +32,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 				  unsigned long size, unsigned long stride)
 {
 	struct cpumask *cmask = mm_cpumask(mm);
-	struct cpumask hmask;
 	unsigned int cpuid;
 	bool broadcast;
 
@@ -46,9 +45,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 		unsigned long asid = atomic_long_read(&mm->context.id);
 
 		if (broadcast) {
-			riscv_cpuid_to_hartid_mask(cmask, &hmask);
-			sbi_remote_sfence_vma_asid(cpumask_bits(&hmask),
-						   start, size, asid);
+			sbi_remote_sfence_vma_asid(cmask, start, size, asid);
 		} else if (size <= stride) {
 			local_flush_tlb_page_asid(start, asid);
 		} else {
@@ -56,9 +53,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
 		}
 	} else {
 		if (broadcast) {
-			riscv_cpuid_to_hartid_mask(cmask, &hmask);
-			sbi_remote_sfence_vma(cpumask_bits(&hmask),
-					      start, size);
+			sbi_remote_sfence_vma(cmask, start, size);
 		} else if (size <= stride) {
 			local_flush_tlb_page(start);
 		} else {
-- 
2.30.2


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 0/6] Sparse HART id support
  2022-01-20  9:09 ` Atish Patra
@ 2022-01-20 18:17   ` Palmer Dabbelt
  -1 siblings, 0 replies; 55+ messages in thread
From: Palmer Dabbelt @ 2022-01-20 18:17 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Atish Patra, aou, atishp, anup, damien.lemoal,
	devicetree, jszhang, krzysztof.kozlowski, linux-riscv,
	Paul Walmsley, robh+dt

On Thu, 20 Jan 2022 01:09:12 PST (-0800), Atish Patra wrote:
> Currently, sparse hartid is not supported for Linux RISC-V for the following
> reasons.
> 1. Both the spinwait and ordered booting methods use __cpu_up_stack/task_pointer,
>    which are arrays of size NR_CPUS.
> 2. During early booting, any hart with a hartid greater than NR_CPUS is not booted at all.
> 3. riscv_cpuid_to_hartid_mask uses struct cpumask for generating hartid bitmap.
> 4. The SBI v0.2 implementation uses NR_CPUS as the maximum hartid number while
>    generating the hart mask.
>
> In order to support sparse hartids, hartid and NR_CPUS need to be disassociated,
> which was logically incorrect anyway. NR_CPUS represents the maximum logical
> CPU id configured in the kernel, while the hartid represents the physical hart id
> stored in the mhartid CSR defined by the privileged specification. Thus, a hartid
> can have a much greater value than the logical cpuid.
>
> Currently, we have two methods of booting: ordered booting, where the booting
> hart brings up each non-booting hart one by one using the SBI HSM extension,
> and the spinwait booting method, which relies on harts jumping into the Linux
> kernel in random order, with the boot hart selected by a lottery. All other
> non-booting harts keep spinning on __cpu_up_stack/task_pointer until the boot
> hart initializes the data. Both methods rely on __cpu_up_stack/task_pointer to
> set up the stack/task pointer. The spinwait method is mostly used to support
> older firmware without the SBI HSM extension, and M-mode Linux. The ordered
> booting method is the preferred method for general Linux because it can
> support CPU hotplug and kexec.
>
> The first patch modifies the ordered booting method to use an opaque parameter
> already available in the HSM start API to set up the stack/task pointer. The
> third patch resolves issue #1 by limiting the usage of
> __cpu_up_stack/task_pointer to the spinwait booting method. The fourth and
> fifth patches move the entire hart lottery selection and spinwait method to a
> separate config that can be disabled if required, which solves issue #2. The
> sixth patch solves issues #3 and #4 by removing riscv_cpuid_to_hartid_mask
> completely. All the SBI APIs directly pass a pointer to struct cpumask, and
> the SBI implementation takes care of generating the hart bitmap from the
> cpumask.
>
> It is not trivial to support sparse hartids for the spinwait booting method,
> and there are no use cases that require it either. Any platform with sparse
> hartids will probably require more advanced features such as CPU hotplug and
> kexec. Thus, the series supports sparse hartids via the ordered booting method
> only. To maintain backward compatibility, the spinwait booting method is
> currently enabled in the defconfig so that M-mode Linux will continue to work.
> Any platform that requires sparse hartids must disable the spinwait method.
>
> This series also fixes the out-of-bounds access error[1] reported by Geert.
> The issue can be reproduced with SMP booting with NR_CPUS=4 on platforms with
> discontiguous hart numbering (HiFive Unleashed/Unmatched and PolarFire). The
> spinwait method should also be disabled for any configuration where the
> NR_CPUS value is less than the maximum hartid on the platform.
>
> [1] https://lore.kernel.org/lkml/CAMuHMdUPWOjJfJohxLJefHOrJBtXZ0xfHQt4=hXpUXnasiN+AQ@mail.gmail.com/#t
>
> The series is based on the queue branch of kvm-riscv, as it has KVM-related
> changes as well. I have tested it on the HiFive Unmatched and QEMU.
>
> Changes from v2->v3:
> 1. Rebased on linux-next
> 2. Removed the redundant variable in PATCH 1.
> 3. Added the reviewed-by/acked-by tags.
>
> Changes from v1->v2:
> 1. Fixed a few typos in Kconfig.
> 2. Moved the boot data structure offsets to asm-offsets.c
> 3. Removed the redundant config check in head.S
>
> Atish Patra (6):
> RISC-V: Avoid using per cpu array for ordered booting
> RISC-V: Do not print the SBI version during HSM extension boot print
> RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
> RISC-V: Move the entire hart selection via lottery to SMP
> RISC-V: Move spinwait booting method to its own config
> RISC-V: Do not use cpumask data structure for hartid bitmap
>
> arch/riscv/Kconfig                   |  14 ++
> arch/riscv/include/asm/cpu_ops.h     |   2 -
> arch/riscv/include/asm/cpu_ops_sbi.h |  25 ++++
> arch/riscv/include/asm/sbi.h         |  19 +--
> arch/riscv/include/asm/smp.h         |   2 -
> arch/riscv/kernel/Makefile           |   3 +-
> arch/riscv/kernel/asm-offsets.c      |   3 +
> arch/riscv/kernel/cpu_ops.c          |  26 ++--
> arch/riscv/kernel/cpu_ops_sbi.c      |  26 +++-
> arch/riscv/kernel/cpu_ops_spinwait.c |  27 +++-
> arch/riscv/kernel/head.S             |  35 ++---
> arch/riscv/kernel/head.h             |   6 +-
> arch/riscv/kernel/sbi.c              | 189 +++++++++++++++------------
> arch/riscv/kernel/setup.c            |  10 --
> arch/riscv/kernel/smpboot.c          |   2 +-
> arch/riscv/kvm/mmu.c                 |   4 +-
> arch/riscv/kvm/vcpu_sbi_replace.c    |  11 +-
> arch/riscv/kvm/vcpu_sbi_v01.c        |  11 +-
> arch/riscv/kvm/vmid.c                |   4 +-
> arch/riscv/mm/cacheflush.c           |   5 +-
> arch/riscv/mm/tlbflush.c             |   9 +-
> 21 files changed, 253 insertions(+), 180 deletions(-)
> create mode 100644 arch/riscv/include/asm/cpu_ops_sbi.h

Thanks, these are on for-next.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-20  9:09   ` Atish Patra
@ 2022-01-25 20:12     ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-25 20:12 UTC (permalink / raw)
  To: Atish Patra
  Cc: Linux Kernel Mailing List, Anup Patel, Albert Ou, Atish Patra,
	Damien Le Moal,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring, Emil Renner Berthing

Hi Atish,

On Thu, Jan 20, 2022 at 10:12 AM Atish Patra <atishp@rivosinc.com> wrote:
> Currently, SBI APIs accept a hartmask that is generated from struct
> cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> is not the correct data structure for hartids, as they can be higher
> than NR_CPUS for platforms with sparse or discontiguous hartids.
>
> Remove all association between hartid mask and struct cpumask.
>
> Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> Signed-off-by: Atish Patra <atishp@rivosinc.com>

Thanks for your patch, which is now commit 26fb751ca37846c9 ("RISC-V:
Do not use cpumask data structure for hartid bitmap") in v5.17-rc1.

I am having an issue with random userspace SEGVs on Starlight Beta
(which needs out-of-tree patches).  It doesn't always manifest
itself immediately, so it took a while to bisect, but I suspect the
above commit to be the culprit.

So far the Icicle looks unaffected.

Do you have a clue?
Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-25 20:12     ` Geert Uytterhoeven
@ 2022-01-25 20:17       ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-25 20:17 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Atish Patra, Linux Kernel Mailing List, Anup Patel, Albert Ou,
	Damien Le Moal,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring, Emil Renner Berthing

On Tue, Jan 25, 2022 at 12:12 PM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Atish,
>
> On Thu, Jan 20, 2022 at 10:12 AM Atish Patra <atishp@rivosinc.com> wrote:
> > Currently, SBI APIs accept a hartmask that is generated from struct
> > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> > is not the correct data structure for hartids, as they can be higher
> > than NR_CPUS for platforms with sparse or discontiguous hartids.
> >
> > Remove all association between hartid mask and struct cpumask.
> >
> > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
>
> Thanks for your patch, which is now commit 26fb751ca37846c9 ("RISC-V:
> Do not use cpumask data structure for hartid bitmap") in v5.17-rc1.
>
> I am having an issue with random userspace SEGVs on Starlight Beta
> (which needs out-of-tree patches).  It doesn't always manifest
> itself immediately, so it took a while to bisect, but I suspect the
> above commit to be the culprit.
>

I have never seen one before during my testing. How frequently do you see them?
Does it happen while running anything, or does just idle user space result
in SEGVs randomly?

Do you have a trace that I can look into?

> So far the Icicle looks unaffected.
>
> Do you have a clue?
> Thanks!
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds



-- 
Regards,
Atish

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-25 20:17       ` Atish Patra
@ 2022-01-25 20:52         ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-25 20:52 UTC (permalink / raw)
  To: Atish Patra
  Cc: Atish Patra, Linux Kernel Mailing List, Anup Patel, Albert Ou,
	Damien Le Moal,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring, Emil Renner Berthing

Hi Atish,

On Tue, Jan 25, 2022 at 9:17 PM Atish Patra <atishp@atishpatra.org> wrote:
> On Tue, Jan 25, 2022 at 12:12 PM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> > On Thu, Jan 20, 2022 at 10:12 AM Atish Patra <atishp@rivosinc.com> wrote:
> > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> > > is not the correct data structure for hartids, as they can be higher
> > > than NR_CPUS for platforms with sparse or discontiguous hartids.
> > >
> > > Remove all association between hartid mask and struct cpumask.
> > >
> > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> >
> > Thanks for your patch, which is now commit 26fb751ca37846c9 ("RISC-V:
> > Do not use cpumask data structure for hartid bitmap") in v5.17-rc1.
> >
> > I am having an issue with random userspace SEGVs on Starlight Beta
> > (which needs out-of-tree patches).  It doesn't always manifest
> > itself immediately, so it took a while to bisect, but I suspect the
> > above commit to be the culprit.
>
> I have never seen one before during my testing. How frequently do you see them?
> Does it happen while running anything or just idle user space results
> in SEGVs randomly.

Sometimes they happen during startup (lots of failures from systemd),
sometimes they happen later, during interactive work.
Sometimes while idle, when something runs in the background (e.g. mandb).

> Do you have a trace that I can look into ?

# apt update
[  807.499050] apt[258]: unhandled signal 11 code 0x1 at
0xffffff8300060020 in libapt-pkg.so.6.0.0[3fa49ac000+174000]
[  807.509548] CPU: 0 PID: 258 Comm: apt Not tainted
5.16.0-starlight-11192-g26fb751ca378-dirty #153
[  807.518674] Hardware name: BeagleV Starlight Beta (DT)
[  807.524077] epc : 0000003fa4a47a0a ra : 0000003fa4a479fc sp :
0000003fcb4b39b0
[  807.531383]  gp : 0000002adcef4800 tp : 0000003fa43287b0 t0 :
0000000000000001
[  807.538603]  t1 : 0000000000000009 t2 : 00000000000003ff s0 :
0000000000000000
[  807.545887]  s1 : 0000002adcf3cb60 a0 : 0000000000000003 a1 :
0000000000000000
[  807.553167]  a2 : 0000003fcb4b3a30 a3 : 0000000000000000 a4 :
0000002adcf3cc1c
[  807.560390]  a5 : 0007000300060000 a6 : 0000000000000003 a7 :
1999999999999999
[  807.567654]  s2 : 0000003fcb4b3a28 s3 : 0000000000000002 s4 :
0000003fcb4b3a30
[  807.575039]  s5 : 0000003fa4baa810 s6 : 0000000000000010 s7 :
0000002adcf19a40
[  807.582363]  s8 : 0000003fcb4b4010 s9 : 0000003fa4baa810 s10:
0000003fcb4b3e90
[  807.589606]  s11: 0000003fa4b2a528 t3 : 0000000000000000 t4 :
0000003fa47906a0
[  807.596891]  t5 : 0000000000000005 t6 : ffffffffffffffff
[  807.602302] status: 0000000200004020 badaddr: ffffff8300060020
cause: 000000000000000d

(-dirty due to Starlight DTS and driver updates)

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
@ 2022-01-25 20:52         ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-25 20:52 UTC (permalink / raw)
  To: Atish Patra
  Cc: Atish Patra, Linux Kernel Mailing List, Anup Patel, Albert Ou,
	Damien Le Moal,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring, Emil Renner Berthing

Hi Atish,

On Tue, Jan 25, 2022 at 9:17 PM Atish Patra <atishp@atishpatra.org> wrote:
> On Tue, Jan 25, 2022 at 12:12 PM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> > On Thu, Jan 20, 2022 at 10:12 AM Atish Patra <atishp@rivosinc.com> wrote:
> > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> > > is not the correct data structure for hartids, as they can be higher
> > > than NR_CPUS for platforms with sparse or discontiguous hartids.
> > >
> > > Remove all association between hartid mask and struct cpumask.
> > >
> > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> >
> > Thanks for your patch, which is now commit 26fb751ca37846c9 ("RISC-V:
> > Do not use cpumask data structure for hartid bitmap") in v5.17-rc1.
> >
> > I am having an issue with random userspace SEGVs on Starlight Beta
> > (which needs out-of-tree patches).  It doesn't always manifest
> > itself immediately, so it took a while to bisect, but I suspect the
> > above commit to be the culprit.
>
> I have never seen one before during my testing. How frequently do you see them?
> Does it happen while running anything, or does idle user space result
> in SEGVs randomly?

Sometimes they happen during startup (lots of failures from systemd),
sometimes they happen later, during interactive work, and sometimes
while the system is idle and something runs in the background (e.g. mandb).

> Do you have a trace that I can look into ?

# apt update
[  807.499050] apt[258]: unhandled signal 11 code 0x1 at
0xffffff8300060020 in libapt-pkg.so.6.0.0[3fa49ac000+174000]
[  807.509548] CPU: 0 PID: 258 Comm: apt Not tainted
5.16.0-starlight-11192-g26fb751ca378-dirty #153
[  807.518674] Hardware name: BeagleV Starlight Beta (DT)
[  807.524077] epc : 0000003fa4a47a0a ra : 0000003fa4a479fc sp :
0000003fcb4b39b0
[  807.531383]  gp : 0000002adcef4800 tp : 0000003fa43287b0 t0 :
0000000000000001
[  807.538603]  t1 : 0000000000000009 t2 : 00000000000003ff s0 :
0000000000000000
[  807.545887]  s1 : 0000002adcf3cb60 a0 : 0000000000000003 a1 :
0000000000000000
[  807.553167]  a2 : 0000003fcb4b3a30 a3 : 0000000000000000 a4 :
0000002adcf3cc1c
[  807.560390]  a5 : 0007000300060000 a6 : 0000000000000003 a7 :
1999999999999999
[  807.567654]  s2 : 0000003fcb4b3a28 s3 : 0000000000000002 s4 :
0000003fcb4b3a30
[  807.575039]  s5 : 0000003fa4baa810 s6 : 0000000000000010 s7 :
0000002adcf19a40
[  807.582363]  s8 : 0000003fcb4b4010 s9 : 0000003fa4baa810 s10:
0000003fcb4b3e90
[  807.589606]  s11: 0000003fa4b2a528 t3 : 0000000000000000 t4 :
0000003fa47906a0
[  807.596891]  t5 : 0000000000000005 t6 : ffffffffffffffff
[  807.602302] status: 0000000200004020 badaddr: ffffff8300060020
cause: 000000000000000d

(-dirty due to Starlight DTS and driver updates)

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-25 20:52         ` Geert Uytterhoeven
@ 2022-01-25 21:11           ` Ron Economos
  0 siblings, 0 replies; 55+ messages in thread
From: Ron Economos @ 2022-01-25 21:11 UTC (permalink / raw)
  To: Geert Uytterhoeven, Atish Patra
  Cc: Atish Patra, Linux Kernel Mailing List, Anup Patel, Albert Ou,
	Damien Le Moal,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring, Emil Renner Berthing

On 1/25/22 12:52, Geert Uytterhoeven wrote:
> Hi Atish,
>
> On Tue, Jan 25, 2022 at 9:17 PM Atish Patra <atishp@atishpatra.org> wrote:
>> On Tue, Jan 25, 2022 at 12:12 PM Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> On Thu, Jan 20, 2022 at 10:12 AM Atish Patra <atishp@rivosinc.com> wrote:
>>>> Currently, SBI APIs accept a hartmask that is generated from struct
>>>> cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
>>>> is not the correct data structure for hartids, as they can be higher
>>>> than NR_CPUS for platforms with sparse or discontiguous hartids.
>>>>
>>>> Remove all association between hartid mask and struct cpumask.
>>>>
>>>> Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
>>>> Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
>>>> Signed-off-by: Atish Patra <atishp@rivosinc.com>
>>> Thanks for your patch, which is now commit 26fb751ca37846c9 ("RISC-V:
>>> Do not use cpumask data structure for hartid bitmap") in v5.17-rc1.
>>>
>>> I am having an issue with random userspace SEGVs on Starlight Beta
>>> (which needs out-of-tree patches).  It doesn't always manifest
>>> itself immediately, so it took a while to bisect, but I suspect the
>>> above commit to be the culprit.
>> I have never seen one before during my testing. How frequently do you see them?
>> Does it happen while running anything, or does idle user space result
>> in SEGVs randomly?
> Sometimes they happen during startup (lots of failures from systemd),
> sometimes they happen later, during interactive work, and sometimes
> while the system is idle and something runs in the background (e.g. mandb).
>
>> Do you have a trace that I can look into ?
> # apt update
> [  807.499050] apt[258]: unhandled signal 11 code 0x1 at
> 0xffffff8300060020 in libapt-pkg.so.6.0.0[3fa49ac000+174000]
> [  807.509548] CPU: 0 PID: 258 Comm: apt Not tainted
> 5.16.0-starlight-11192-g26fb751ca378-dirty #153
> [  807.518674] Hardware name: BeagleV Starlight Beta (DT)
> [  807.524077] epc : 0000003fa4a47a0a ra : 0000003fa4a479fc sp :
> 0000003fcb4b39b0
> [  807.531383]  gp : 0000002adcef4800 tp : 0000003fa43287b0 t0 :
> 0000000000000001
> [  807.538603]  t1 : 0000000000000009 t2 : 00000000000003ff s0 :
> 0000000000000000
> [  807.545887]  s1 : 0000002adcf3cb60 a0 : 0000000000000003 a1 :
> 0000000000000000
> [  807.553167]  a2 : 0000003fcb4b3a30 a3 : 0000000000000000 a4 :
> 0000002adcf3cc1c
> [  807.560390]  a5 : 0007000300060000 a6 : 0000000000000003 a7 :
> 1999999999999999
> [  807.567654]  s2 : 0000003fcb4b3a28 s3 : 0000000000000002 s4 :
> 0000003fcb4b3a30
> [  807.575039]  s5 : 0000003fa4baa810 s6 : 0000000000000010 s7 :
> 0000002adcf19a40
> [  807.582363]  s8 : 0000003fcb4b4010 s9 : 0000003fa4baa810 s10:
> 0000003fcb4b3e90
> [  807.589606]  s11: 0000003fa4b2a528 t3 : 0000000000000000 t4 :
> 0000003fa47906a0
> [  807.596891]  t5 : 0000000000000005 t6 : ffffffffffffffff
> [  807.602302] status: 0000000200004020 badaddr: ffffff8300060020
> cause: 000000000000000d
>
> (-dirty due to Starlight DTS and driver updates)
>
> Gr{oetje,eeting}s,
>
>                          Geert
>
> --

I'm not sure if it's related, but I'm also seeing a systemd segfault on 
boot with the HiFive Unmatched and 5.17.0-rc1. I don't have the dmesg 
dump, but here's the journalctl dump. It was built before the tag, so it 
says 5.16.0.

Jan 23 02:41:50 riscv64 systemd-udevd[551]: mmcblk0p12: Failed to wait 
for spawned command '/usr/bin/unshare -m /usr/bin/snap auto-import 
--mount=/dev/mmcblk0p12': Invalid argument
Jan 23 02:41:50 riscv64 systemd-udevd[412]: mmcblk0p12: Process 
'/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/mmcblk0p12' 
terminated by signal SEGV.
Jan 23 02:41:50 riscv64 kernel: systemd-udevd[551]: unhandled signal 11 
code 0x1 at 0x0000000003938700 in udevadm[3fa7eee000+b1000]
Jan 23 02:41:50 riscv64 kernel: CPU: 2 PID: 551 Comm: systemd-udevd Not 
tainted 5.16.0 #1
Jan 23 02:41:50 riscv64 kernel: Hardware name: SiFive HiFive Unmatched 
A00 (DT)
Jan 23 02:41:50 riscv64 kernel: epc : 0000003fa7f14104 ra : 
0000003fa7f14102 sp : 0000003fe3da5320
Jan 23 02:41:50 riscv64 kernel:  gp : 0000003fa7fc3ef8 tp : 
0000003fa79f8530 t0 : 0000003fe3da38f0
Jan 23 02:41:50 riscv64 kernel:  t1 : 0000003fa7f0425c t2 : 
0000000000000000 s0 : 0000003fcd046d88
Jan 23 02:41:50 riscv64 kernel:  s1 : 0000003fcd046d60 a0 : 
ffffffffffffffff a1 : 0000003fcd0cb330
Jan 23 02:41:50 riscv64 kernel:  a2 : 0000003fcd043028 a3 : 
0000000000000007 a4 : c98b6a1813e46d00
Jan 23 02:41:50 riscv64 kernel:  a5 : ffffffffffffffff a6 : 
fefefefefefefeff a7 : 0000000000000039
Jan 23 02:41:50 riscv64 kernel:  s2 : 0000000000000000 s3 : 
ffffffffffffffea s4 : 0000000000000000
Jan 23 02:41:50 riscv64 kernel:  s5 : 0000003fe3da5378 s6 : 
ffffffffffffffea s7 : 0000000003938700
Jan 23 02:41:50 riscv64 kernel:  s8 : 0000003fe3da53e0 s9 : 
0000003fe3da53d8 s10: 0000003fa7fc200c
Jan 23 02:41:50 riscv64 kernel:  s11: 0000000000081000 t3 : 
0000003fa7db3822 t4 : 0000000000000000
Jan 23 02:41:50 riscv64 kernel:  t5 : 0000003fe3da38c8 t6 : 000000000000002a
Jan 23 02:41:50 riscv64 kernel: status: 0000000200004020 badaddr: 
0000000003938700 cause: 000000000000000d
Jan 23 02:41:50 riscv64 systemd-udevd[412]: mmcblk0p12: Failed to wait 
for spawned command '/usr/bin/unshare -m /usr/bin/snap auto-import 
--mount=/dev/mmcblk0p12': Input/output error
Jan 23 02:41:50 riscv64 systemd-udevd[412]: mmcblk0p12: Failed to 
execute '/usr/bin/unshare -m /usr/bin/snap auto-import 
--mount=/dev/mmcblk0p12', ignoring: Input/output error

I'll try removing this patch.

Ron


^ permalink raw reply	[flat|nested] 55+ messages in thread


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-20  9:09   ` Atish Patra
@ 2022-01-25 22:26     ` Jessica Clarke
  0 siblings, 0 replies; 55+ messages in thread
From: Jessica Clarke @ 2022-01-25 22:26 UTC (permalink / raw)
  To: Atish Patra
  Cc: Linux Kernel Mailing List, Anup Patel, Albert Ou, Atish Patra,
	Damien Le Moal, devicetree, Jisheng Zhang, Krzysztof Kozlowski,
	linux-riscv, Palmer Dabbelt, Paul Walmsley, Rob Herring

On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> 
> Currently, SBI APIs accept a hartmask that is generated from struct
> cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> is not the correct data structure for hartids, as they can be higher
> than NR_CPUS for platforms with sparse or discontiguous hartids.
> 
> Remove all association between hartid mask and struct cpumask.
> 
> Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
> arch/riscv/include/asm/sbi.h      |  19 +--
> arch/riscv/include/asm/smp.h      |   2 -
> arch/riscv/kernel/sbi.c           | 189 +++++++++++++++++-------------
> arch/riscv/kernel/setup.c         |  10 --
> arch/riscv/kernel/smpboot.c       |   2 +-
> arch/riscv/kvm/mmu.c              |   4 +-
> arch/riscv/kvm/vcpu_sbi_replace.c |  11 +-
> arch/riscv/kvm/vcpu_sbi_v01.c     |  11 +-
> arch/riscv/kvm/vmid.c             |   4 +-
> arch/riscv/mm/cacheflush.c        |   5 +-
> arch/riscv/mm/tlbflush.c          |   9 +-
> 11 files changed, 130 insertions(+), 136 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 26ba6f2d7a40..d1c37479d828 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -8,6 +8,7 @@
> #define _ASM_RISCV_SBI_H
> 
> #include <linux/types.h>
> +#include <linux/cpumask.h>
> 
> #ifdef CONFIG_RISCV_SBI
> enum sbi_ext_id {
> @@ -128,27 +129,27 @@ long sbi_get_mimpid(void);
> void sbi_set_timer(uint64_t stime_value);
> void sbi_shutdown(void);
> void sbi_clear_ipi(void);
> -int sbi_send_ipi(const unsigned long *hart_mask);
> -int sbi_remote_fence_i(const unsigned long *hart_mask);
> -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> +int sbi_send_ipi(const struct cpumask *cpu_mask);
> +int sbi_remote_fence_i(const struct cpumask *cpu_mask);
> +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size);
> 
> -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid);
> -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size);
> -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long vmid);
> -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size);
> -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid);
> @@ -183,7 +184,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
> 
> int sbi_err_map_linux_errno(int err);
> #else /* CONFIG_RISCV_SBI */
> -static inline int sbi_remote_fence_i(const unsigned long *hart_mask) { return -1; }
> +static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
> static inline void sbi_init(void) {}
> #endif /* CONFIG_RISCV_SBI */
> #endif /* _ASM_RISCV_SBI_H */
> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> index 6ad749f42807..23170c933d73 100644
> --- a/arch/riscv/include/asm/smp.h
> +++ b/arch/riscv/include/asm/smp.h
> @@ -92,8 +92,6 @@ static inline void riscv_clear_ipi(void)
> 
> #endif /* CONFIG_SMP */
> 
> -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
> -
> #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
> bool cpu_has_hotplug(unsigned int cpu);
> #else
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index 9a84f0cb5175..f72527fcb347 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -16,8 +16,8 @@ unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
> EXPORT_SYMBOL(sbi_spec_version);
> 
> static void (*__sbi_set_timer)(uint64_t stime) __ro_after_init;
> -static int (*__sbi_send_ipi)(const unsigned long *hart_mask) __ro_after_init;
> -static int (*__sbi_rfence)(int fid, const unsigned long *hart_mask,
> +static int (*__sbi_send_ipi)(const struct cpumask *cpu_mask) __ro_after_init;
> +static int (*__sbi_rfence)(int fid, const struct cpumask *cpu_mask,
> 			   unsigned long start, unsigned long size,
> 			   unsigned long arg4, unsigned long arg5) __ro_after_init;
> 
> @@ -67,6 +67,30 @@ int sbi_err_map_linux_errno(int err)
> EXPORT_SYMBOL(sbi_err_map_linux_errno);
> 
> #ifdef CONFIG_RISCV_SBI_V01
> +static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
> +{
> +	unsigned long cpuid, hartid;
> +	unsigned long hmask = 0;
> +
> +	/*
> +	 * There is no maximum hartid concept in RISC-V and NR_CPUS must not be
> +	 * associated with hartid. As SBI v0.1 is only kept for backward compatibility
> +	 * and will be removed in the future, there is no point in supporting hartid
> +	 * greater than BITS_PER_LONG (32 for RV32 and 64 for RV64). Ideally, SBI v0.2
> +	 * should be used for platforms with hartid greater than BITS_PER_LONG.
> +	 */
> +	for_each_cpu(cpuid, cpu_mask) {
> +		hartid = cpuid_to_hartid_map(cpuid);
> +		if (hartid >= BITS_PER_LONG) {
> +			pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
> +			break;
> +		}
> +		hmask |= 1 << hartid;

This should be 1UL; otherwise hartids 31 and up cause UB, since the
shift is performed on an int.

> +	}
> +
> +	return hmask;
> +}
> +
> /**
>  * sbi_console_putchar() - Writes given character to the console device.
>  * @ch: The data to be written to the console.
> @@ -132,33 +156,44 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> #endif
> }
> 
> -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> {
> -	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)hart_mask,
> +	unsigned long hart_mask;
> +
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;
> +	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> +
> +	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)(&hart_mask),
> 		  0, 0, 0, 0, 0);
> 	return 0;
> }
> 
> -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> 			    unsigned long start, unsigned long size,
> 			    unsigned long arg4, unsigned long arg5)
> {
> 	int result = 0;
> +	unsigned long hart_mask;
> +
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;
> +	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> 
> 	/* v0.2 function IDs are equivalent to v0.1 extension IDs */
> 	switch (fid) {
> 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> 		sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
> -			  (unsigned long)hart_mask, 0, 0, 0, 0, 0);
> +			  (unsigned long)&hart_mask, 0, 0, 0, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA, 0,
> -			  (unsigned long)hart_mask, start, size,
> +			  (unsigned long)&hart_mask, start, size,
> 			  0, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID, 0,
> -			  (unsigned long)hart_mask, start, size,
> +			  (unsigned long)&hart_mask, start, size,
> 			  arg4, 0, 0);
> 		break;
> 	default:
> @@ -180,7 +215,7 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> 		sbi_major_version(), sbi_minor_version());
> }
> 
> -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> {
> 	pr_warn("IPI extension is not available in SBI v%lu.%lu\n",
> 		sbi_major_version(), sbi_minor_version());
> @@ -188,7 +223,7 @@ static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> 	return 0;
> }
> 
> -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> 			    unsigned long start, unsigned long size,
> 			    unsigned long arg4, unsigned long arg5)
> {
> @@ -212,37 +247,33 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
> #endif
> }
> 
> -static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> +static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
> {
> -	unsigned long hartid, hmask_val, hbase;
> -	struct cpumask tmask;
> +	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> 	struct sbiret ret = {0};
> 	int result;
> 
> -	if (!hart_mask || !(*hart_mask)) {
> -		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> -		hart_mask = cpumask_bits(&tmask);
> -	}
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;

This is a change in behaviour. Are you sure nothing passes an empty mask?

> -	hmask_val = 0;
> -	hbase = 0;
> -	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> -		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> +	for_each_cpu(cpuid, cpu_mask) {
> +		hartid = cpuid_to_hartid_map(cpuid);
> +		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> 			ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> -					hmask_val, hbase, 0, 0, 0, 0);
> +					hmask, hbase, 0, 0, 0, 0);
> 			if (ret.error)
> 				goto ecall_failed;
> -			hmask_val = 0;
> +			hmask = 0;
> 			hbase = 0;
> 		}
> -		if (!hmask_val)
> +		if (!hmask)
> 			hbase = hartid;
> -		hmask_val |= 1UL << (hartid - hbase);
> +		hmask |= 1UL << (hartid - hbase);
> 	}
> 
> -	if (hmask_val) {
> +	if (hmask) {
> 		ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> -				hmask_val, hbase, 0, 0, 0, 0);
> +				hmask, hbase, 0, 0, 0, 0);
> 		if (ret.error)
> 			goto ecall_failed;
> 	}
> @@ -252,11 +283,11 @@ static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> ecall_failed:
> 	result = sbi_err_map_linux_errno(ret.error);
> 	pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> -	       __func__, hbase, hmask_val, result);
> +	       __func__, hbase, hmask, result);
> 	return result;
> }
> 
> -static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> +static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask,
> 				 unsigned long hbase, unsigned long start,
> 				 unsigned long size, unsigned long arg4,
> 				 unsigned long arg5)
> @@ -267,31 +298,31 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> 
> 	switch (fid) {
> 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, 0, 0, 0, 0);
> +		ret = sbi_ecall(ext, fid, hmask, hbase, 0, 0, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, arg4, 0);
> 		break;
> 
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, arg4, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, arg4, 0);
> 		break;
> 	default:
> @@ -303,43 +334,39 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> 	if (ret.error) {
> 		result = sbi_err_map_linux_errno(ret.error);
> 		pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> -		       __func__, hbase, hmask_val, result);
> +		       __func__, hbase, hmask, result);
> 	}
> 
> 	return result;
> }
> 
> -static int __sbi_rfence_v02(int fid, const unsigned long *hart_mask,
> +static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
> 			    unsigned long start, unsigned long size,
> 			    unsigned long arg4, unsigned long arg5)
> {
> -	unsigned long hmask_val, hartid, hbase;
> -	struct cpumask tmask;
> +	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> 	int result;
> 
> -	if (!hart_mask || !(*hart_mask)) {
> -		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> -		hart_mask = cpumask_bits(&tmask);
> -	}
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;

As with __sbi_send_ipi_v02.

Jess

> -	hmask_val = 0;
> -	hbase = 0;
> -	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> -		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> -			result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> +	for_each_cpu(cpuid, cpu_mask) {
> +		hartid = cpuid_to_hartid_map(cpuid);
> +		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> +			result = __sbi_rfence_v02_call(fid, hmask, hbase,
> 						       start, size, arg4, arg5);
> 			if (result)
> 				return result;
> -			hmask_val = 0;
> +			hmask = 0;
> 			hbase = 0;
> 		}
> -		if (!hmask_val)
> +		if (!hmask)
> 			hbase = hartid;
> -		hmask_val |= 1UL << (hartid - hbase);
> +		hmask |= 1UL << (hartid - hbase);
> 	}
> 
> -	if (hmask_val) {
> -		result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> +	if (hmask) {
> +		result = __sbi_rfence_v02_call(fid, hmask, hbase,
> 					       start, size, arg4, arg5);
> 		if (result)
> 			return result;
> @@ -361,44 +388,44 @@ void sbi_set_timer(uint64_t stime_value)
> 
> /**
>  * sbi_send_ipi() - Send an IPI to any hart.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_send_ipi(const unsigned long *hart_mask)
> +int sbi_send_ipi(const struct cpumask *cpu_mask)
> {
> -	return __sbi_send_ipi(hart_mask);
> +	return __sbi_send_ipi(cpu_mask);
> }
> EXPORT_SYMBOL(sbi_send_ipi);
> 
> /**
>  * sbi_remote_fence_i() - Execute FENCE.I instruction on given remote harts.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_remote_fence_i(const unsigned long *hart_mask)
> +int sbi_remote_fence_i(const struct cpumask *cpu_mask)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_FENCE_I,
> -			    hart_mask, 0, 0, 0, 0);
> +			    cpu_mask, 0, 0, 0, 0);
> }
> EXPORT_SYMBOL(sbi_remote_fence_i);
> 
> /**
>  * sbi_remote_sfence_vma() - Execute SFENCE.VMA instructions on given remote
>  *			     harts for the specified virtual address range.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the virtual address
>  * @size: Total size of the virtual address range.
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
> -			    hart_mask, start, size, 0, 0);
> +			    cpu_mask, start, size, 0, 0);
> }
> EXPORT_SYMBOL(sbi_remote_sfence_vma);
> 
> @@ -406,38 +433,38 @@ EXPORT_SYMBOL(sbi_remote_sfence_vma);
>  * sbi_remote_sfence_vma_asid() - Execute SFENCE.VMA instructions on given
>  * remote harts for a virtual address range belonging to a specific ASID.
>  *
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the virtual address
>  * @size: Total size of the virtual address range.
>  * @asid: The value of address space identifier (ASID).
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
> -			    hart_mask, start, size, asid, 0);
> +			    cpu_mask, start, size, asid, 0);
> }
> EXPORT_SYMBOL(sbi_remote_sfence_vma_asid);
> 
> /**
>  * sbi_remote_hfence_gvma() - Execute HFENCE.GVMA instructions on given remote
>  *			   harts for the specified guest physical address range.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the guest physical address
>  * @size: Total size of the guest physical address range.
>  *
>  * Return: None
>  */
> -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
> -			    hart_mask, start, size, 0, 0);
> +			    cpu_mask, start, size, 0, 0);
> }
> EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
> 
> @@ -445,38 +472,38 @@ EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
>  * sbi_remote_hfence_gvma_vmid() - Execute HFENCE.GVMA instructions on given
>  * remote harts for a guest physical address range belonging to a specific VMID.
>  *
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the guest physical address
>  * @size: Total size of the guest physical address range.
>  * @vmid: The value of guest ID (VMID).
>  *
>  * Return: 0 if success, Error otherwise.
>  */
> -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long vmid)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
> -			    hart_mask, start, size, vmid, 0);
> +			    cpu_mask, start, size, vmid, 0);
> }
> EXPORT_SYMBOL(sbi_remote_hfence_gvma_vmid);
> 
> /**
>  * sbi_remote_hfence_vvma() - Execute HFENCE.VVMA instructions on given remote
>  *			     harts for the current guest virtual address range.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the current guest virtual address
>  * @size: Total size of the current guest virtual address range.
>  *
>  * Return: None
>  */
> -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
> -			    hart_mask, start, size, 0, 0);
> +			    cpu_mask, start, size, 0, 0);
> }
> EXPORT_SYMBOL(sbi_remote_hfence_vvma);
> 
> @@ -485,20 +512,20 @@ EXPORT_SYMBOL(sbi_remote_hfence_vvma);
>  * remote harts for current guest virtual address range belonging to a specific
>  * ASID.
>  *
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the current guest virtual address
>  * @size: Total size of the current guest virtual address range.
>  * @asid: The value of address space identifier (ASID).
>  *
>  * Return: None
>  */
> -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
> -			    hart_mask, start, size, asid, 0);
> +			    cpu_mask, start, size, asid, 0);
> }
> EXPORT_SYMBOL(sbi_remote_hfence_vvma_asid);
> 
> @@ -591,11 +618,7 @@ long sbi_get_mimpid(void)
> 
> static void sbi_send_cpumask_ipi(const struct cpumask *target)
> {
> -	struct cpumask hartid_mask;
> -
> -	riscv_cpuid_to_hartid_mask(target, &hartid_mask);
> -
> -	sbi_send_ipi(cpumask_bits(&hartid_mask));
> +	sbi_send_ipi(target);
> }
> 
> static const struct riscv_ipi_ops sbi_ipi_ops = {
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 63241abe84eb..b42bfdc67482 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -59,16 +59,6 @@ atomic_t hart_lottery __section(".sdata")
> unsigned long boot_cpu_hartid;
> static DEFINE_PER_CPU(struct cpu, cpu_devices);
> 
> -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out)
> -{
> -	int cpu;
> -
> -	cpumask_clear(out);
> -	for_each_cpu(cpu, in)
> -		cpumask_set_cpu(cpuid_to_hartid_map(cpu), out);
> -}
> -EXPORT_SYMBOL_GPL(riscv_cpuid_to_hartid_mask);
> -
> /*
>  * Place kernel memory regions on the resource tree so that
>  * kexec-tools can retrieve them from /proc/iomem. While there
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index bd82375db51a..622f226454d5 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -96,7 +96,7 @@ void __init setup_smp(void)
> 		if (cpuid >= NR_CPUS) {
> 			pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
> 				cpuid, hart);
> -			break;
> +			continue;
> 		}
> 
> 		cpuid_to_hartid_map(cpuid) = hart;
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 9af67dbdc66a..f80a34fbf102 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -114,7 +114,6 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
> 
> static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> {
> -	struct cpumask hmask;
> 	unsigned long size = PAGE_SIZE;
> 	struct kvm_vmid *vmid = &kvm->arch.vmid;
> 
> @@ -127,8 +126,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> 	 * where the Guest/VM is running.
> 	 */
> 	preempt_disable();
> -	riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> -	sbi_remote_hfence_gvma_vmid(cpumask_bits(&hmask), addr, size,
> +	sbi_remote_hfence_gvma_vmid(cpu_online_mask, addr, size,
> 				    READ_ONCE(vmid->vmid));
> 	preempt_enable();
> }
> diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> index 00036b7f83b9..1bc0608a5bfd 100644
> --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> @@ -82,7 +82,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> {
> 	int ret = 0;
> 	unsigned long i;
> -	struct cpumask cm, hm;
> +	struct cpumask cm;
> 	struct kvm_vcpu *tmp;
> 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> 	unsigned long hmask = cp->a0;
> @@ -90,7 +90,6 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> 	unsigned long funcid = cp->a6;
> 
> 	cpumask_clear(&cm);
> -	cpumask_clear(&hm);
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		if (hbase != -1UL) {
> 			if (tmp->vcpu_id < hbase)
> @@ -103,17 +102,15 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> 		cpumask_set_cpu(tmp->cpu, &cm);
> 	}
> 
> -	riscv_cpuid_to_hartid_mask(&cm, &hm);
> -
> 	switch (funcid) {
> 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> -		ret = sbi_remote_fence_i(cpumask_bits(&hm));
> +		ret = sbi_remote_fence_i(&cm);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> -		ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
> +		ret = sbi_remote_hfence_vvma(&cm, cp->a2, cp->a3);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> -		ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
> +		ret = sbi_remote_hfence_vvma_asid(&cm, cp->a2,
> 						  cp->a3, cp->a4);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> index 4c7e13ec9ccc..07e2de14433a 100644
> --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> @@ -38,7 +38,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> 	int i, ret = 0;
> 	u64 next_cycle;
> 	struct kvm_vcpu *rvcpu;
> -	struct cpumask cm, hm;
> +	struct cpumask cm;
> 	struct kvm *kvm = vcpu->kvm;
> 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> 
> @@ -101,15 +101,12 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> 				continue;
> 			cpumask_set_cpu(rvcpu->cpu, &cm);
> 		}
> -		riscv_cpuid_to_hartid_mask(&cm, &hm);
> 		if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> -			ret = sbi_remote_fence_i(cpumask_bits(&hm));
> +			ret = sbi_remote_fence_i(&cm);
> 		else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
> -			ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
> -						cp->a1, cp->a2);
> +			ret = sbi_remote_hfence_vvma(&cm, cp->a1, cp->a2);
> 		else
> -			ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
> -						cp->a1, cp->a2, cp->a3);
> +			ret = sbi_remote_hfence_vvma_asid(&cm, cp->a1, cp->a2, cp->a3);
> 		break;
> 	default:
> 		ret = -EINVAL;
> diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
> index 807228f8f409..2fa4f7b1813d 100644
> --- a/arch/riscv/kvm/vmid.c
> +++ b/arch/riscv/kvm/vmid.c
> @@ -67,7 +67,6 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> {
> 	unsigned long i;
> 	struct kvm_vcpu *v;
> -	struct cpumask hmask;
> 	struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;
> 
> 	if (!kvm_riscv_stage2_vmid_ver_changed(vmid))
> @@ -102,8 +101,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> 		 * running, we force VM exits on all host CPUs using IPI and
> 		 * flush all Guest TLBs.
> 		 */
> -		riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> -		sbi_remote_hfence_gvma(cpumask_bits(&hmask), 0, 0);
> +		sbi_remote_hfence_gvma(cpu_online_mask, 0, 0);
> 	}
> 
> 	vmid->vmid = vmid_next;
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index 89f81067e09e..6cb7d96ad9c7 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -67,10 +67,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
> 		 */
> 		smp_mb();
> 	} else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
> -		cpumask_t hartid_mask;
> -
> -		riscv_cpuid_to_hartid_mask(&others, &hartid_mask);
> -		sbi_remote_fence_i(cpumask_bits(&hartid_mask));
> +		sbi_remote_fence_i(&others);
> 	} else {
> 		on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
> 	}
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index 64f8201237c2..37ed760d007c 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -32,7 +32,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> 				  unsigned long size, unsigned long stride)
> {
> 	struct cpumask *cmask = mm_cpumask(mm);
> -	struct cpumask hmask;
> 	unsigned int cpuid;
> 	bool broadcast;
> 
> @@ -46,9 +45,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> 		unsigned long asid = atomic_long_read(&mm->context.id);
> 
> 		if (broadcast) {
> -			riscv_cpuid_to_hartid_mask(cmask, &hmask);
> -			sbi_remote_sfence_vma_asid(cpumask_bits(&hmask),
> -						   start, size, asid);
> +			sbi_remote_sfence_vma_asid(cmask, start, size, asid);
> 		} else if (size <= stride) {
> 			local_flush_tlb_page_asid(start, asid);
> 		} else {
> @@ -56,9 +53,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> 		}
> 	} else {
> 		if (broadcast) {
> -			riscv_cpuid_to_hartid_mask(cmask, &hmask);
> -			sbi_remote_sfence_vma(cpumask_bits(&hmask),
> -					      start, size);
> +			sbi_remote_sfence_vma(cmask, start, size);
> 		} else if (size <= stride) {
> 			local_flush_tlb_page(start);
> 		} else {
> -- 
> 2.30.2
> 
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv



* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
@ 2022-01-25 22:26     ` Jessica Clarke
  0 siblings, 0 replies; 55+ messages in thread
From: Jessica Clarke @ 2022-01-25 22:26 UTC (permalink / raw)
  To: Atish Patra
  Cc: Linux Kernel Mailing List, Anup Patel, Albert Ou, Atish Patra,
	Damien Le Moal, devicetree, Jisheng Zhang, Krzysztof Kozlowski,
	linux-riscv, Palmer Dabbelt, Paul Walmsley, Rob Herring

On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> 
> Currently, SBI APIs accept a hartmask that is generated from struct
> cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> is not the correct data structure for hartids, as they can be higher
> than NR_CPUS for platforms with sparse or discontiguous hartids.
> 
> Remove all association between hartid mask and struct cpumask.
> 
> Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
> arch/riscv/include/asm/sbi.h      |  19 +--
> arch/riscv/include/asm/smp.h      |   2 -
> arch/riscv/kernel/sbi.c           | 189 +++++++++++++++++-------------
> arch/riscv/kernel/setup.c         |  10 --
> arch/riscv/kernel/smpboot.c       |   2 +-
> arch/riscv/kvm/mmu.c              |   4 +-
> arch/riscv/kvm/vcpu_sbi_replace.c |  11 +-
> arch/riscv/kvm/vcpu_sbi_v01.c     |  11 +-
> arch/riscv/kvm/vmid.c             |   4 +-
> arch/riscv/mm/cacheflush.c        |   5 +-
> arch/riscv/mm/tlbflush.c          |   9 +-
> 11 files changed, 130 insertions(+), 136 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 26ba6f2d7a40..d1c37479d828 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -8,6 +8,7 @@
> #define _ASM_RISCV_SBI_H
> 
> #include <linux/types.h>
> +#include <linux/cpumask.h>
> 
> #ifdef CONFIG_RISCV_SBI
> enum sbi_ext_id {
> @@ -128,27 +129,27 @@ long sbi_get_mimpid(void);
> void sbi_set_timer(uint64_t stime_value);
> void sbi_shutdown(void);
> void sbi_clear_ipi(void);
> -int sbi_send_ipi(const unsigned long *hart_mask);
> -int sbi_remote_fence_i(const unsigned long *hart_mask);
> -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> +int sbi_send_ipi(const struct cpumask *cpu_mask);
> +int sbi_remote_fence_i(const struct cpumask *cpu_mask);
> +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size);
> 
> -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid);
> -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size);
> -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long vmid);
> -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size);
> -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid);
> @@ -183,7 +184,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
> 
> int sbi_err_map_linux_errno(int err);
> #else /* CONFIG_RISCV_SBI */
> -static inline int sbi_remote_fence_i(const unsigned long *hart_mask) { return -1; }
> +static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
> static inline void sbi_init(void) {}
> #endif /* CONFIG_RISCV_SBI */
> #endif /* _ASM_RISCV_SBI_H */
> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> index 6ad749f42807..23170c933d73 100644
> --- a/arch/riscv/include/asm/smp.h
> +++ b/arch/riscv/include/asm/smp.h
> @@ -92,8 +92,6 @@ static inline void riscv_clear_ipi(void)
> 
> #endif /* CONFIG_SMP */
> 
> -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
> -
> #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
> bool cpu_has_hotplug(unsigned int cpu);
> #else
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index 9a84f0cb5175..f72527fcb347 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -16,8 +16,8 @@ unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
> EXPORT_SYMBOL(sbi_spec_version);
> 
> static void (*__sbi_set_timer)(uint64_t stime) __ro_after_init;
> -static int (*__sbi_send_ipi)(const unsigned long *hart_mask) __ro_after_init;
> -static int (*__sbi_rfence)(int fid, const unsigned long *hart_mask,
> +static int (*__sbi_send_ipi)(const struct cpumask *cpu_mask) __ro_after_init;
> +static int (*__sbi_rfence)(int fid, const struct cpumask *cpu_mask,
> 			   unsigned long start, unsigned long size,
> 			   unsigned long arg4, unsigned long arg5) __ro_after_init;
> 
> @@ -67,6 +67,30 @@ int sbi_err_map_linux_errno(int err)
> EXPORT_SYMBOL(sbi_err_map_linux_errno);
> 
> #ifdef CONFIG_RISCV_SBI_V01
> +static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
> +{
> +	unsigned long cpuid, hartid;
> +	unsigned long hmask = 0;
> +
> +	/*
> +	 * There is no maximum hartid concept in RISC-V and NR_CPUS must not be
> +	 * associated with hartid. As SBI v0.1 is only kept for backward compatibility
> +	 * and will be removed in the future, there is no point in supporting hartid
> +	 * greater than BITS_PER_LONG (32 for RV32 and 64 for RV64). Ideally, SBI v0.2
> +	 * should be used for platforms with hartid greater than BITS_PER_LONG.
> +	 */
> +	for_each_cpu(cpuid, cpu_mask) {
> +		hartid = cpuid_to_hartid_map(cpuid);
> +		if (hartid >= BITS_PER_LONG) {
> +			pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
> +			break;
> +		}
> +		hmask |= 1 << hartid;

This should be 1UL; otherwise hartid 31 and up cause UB.

> +	}
> +
> +	return hmask;
> +}
> +
> /**
>  * sbi_console_putchar() - Writes given character to the console device.
>  * @ch: The data to be written to the console.
> @@ -132,33 +156,44 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> #endif
> }
> 
> -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> {
> -	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)hart_mask,
> +	unsigned long hart_mask;
> +
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;
> +	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> +
> +	sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)(&hart_mask),
> 		  0, 0, 0, 0, 0);
> 	return 0;
> }
> 
> -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> 			    unsigned long start, unsigned long size,
> 			    unsigned long arg4, unsigned long arg5)
> {
> 	int result = 0;
> +	unsigned long hart_mask;
> +
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;
> +	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> 
> 	/* v0.2 function IDs are equivalent to v0.1 extension IDs */
> 	switch (fid) {
> 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> 		sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
> -			  (unsigned long)hart_mask, 0, 0, 0, 0, 0);
> +			  (unsigned long)&hart_mask, 0, 0, 0, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA, 0,
> -			  (unsigned long)hart_mask, start, size,
> +			  (unsigned long)&hart_mask, start, size,
> 			  0, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> 		sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID, 0,
> -			  (unsigned long)hart_mask, start, size,
> +			  (unsigned long)&hart_mask, start, size,
> 			  arg4, 0, 0);
> 		break;
> 	default:
> @@ -180,7 +215,7 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> 		sbi_major_version(), sbi_minor_version());
> }
> 
> -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> {
> 	pr_warn("IPI extension is not available in SBI v%lu.%lu\n",
> 		sbi_major_version(), sbi_minor_version());
> @@ -188,7 +223,7 @@ static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> 	return 0;
> }
> 
> -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> 			    unsigned long start, unsigned long size,
> 			    unsigned long arg4, unsigned long arg5)
> {
> @@ -212,37 +247,33 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
> #endif
> }
> 
> -static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> +static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
> {
> -	unsigned long hartid, hmask_val, hbase;
> -	struct cpumask tmask;
> +	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> 	struct sbiret ret = {0};
> 	int result;
> 
> -	if (!hart_mask || !(*hart_mask)) {
> -		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> -		hart_mask = cpumask_bits(&tmask);
> -	}
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;

This is a change in behaviour. Are you sure nothing passes an empty mask?

> -	hmask_val = 0;
> -	hbase = 0;
> -	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> -		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> +	for_each_cpu(cpuid, cpu_mask) {
> +		hartid = cpuid_to_hartid_map(cpuid);
> +		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> 			ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> -					hmask_val, hbase, 0, 0, 0, 0);
> +					hmask, hbase, 0, 0, 0, 0);
> 			if (ret.error)
> 				goto ecall_failed;
> -			hmask_val = 0;
> +			hmask = 0;
> 			hbase = 0;
> 		}
> -		if (!hmask_val)
> +		if (!hmask)
> 			hbase = hartid;
> -		hmask_val |= 1UL << (hartid - hbase);
> +		hmask |= 1UL << (hartid - hbase);
> 	}
> 
> -	if (hmask_val) {
> +	if (hmask) {
> 		ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> -				hmask_val, hbase, 0, 0, 0, 0);
> +				hmask, hbase, 0, 0, 0, 0);
> 		if (ret.error)
> 			goto ecall_failed;
> 	}
> @@ -252,11 +283,11 @@ static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> ecall_failed:
> 	result = sbi_err_map_linux_errno(ret.error);
> 	pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> -	       __func__, hbase, hmask_val, result);
> +	       __func__, hbase, hmask, result);
> 	return result;
> }
> 
> -static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> +static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask,
> 				 unsigned long hbase, unsigned long start,
> 				 unsigned long size, unsigned long arg4,
> 				 unsigned long arg5)
> @@ -267,31 +298,31 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> 
> 	switch (fid) {
> 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, 0, 0, 0, 0);
> +		ret = sbi_ecall(ext, fid, hmask, hbase, 0, 0, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, arg4, 0);
> 		break;
> 
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, arg4, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, 0, 0);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
> -		ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> +		ret = sbi_ecall(ext, fid, hmask, hbase, start,
> 				size, arg4, 0);
> 		break;
> 	default:
> @@ -303,43 +334,39 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> 	if (ret.error) {
> 		result = sbi_err_map_linux_errno(ret.error);
> 		pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> -		       __func__, hbase, hmask_val, result);
> +		       __func__, hbase, hmask, result);
> 	}
> 
> 	return result;
> }
> 
> -static int __sbi_rfence_v02(int fid, const unsigned long *hart_mask,
> +static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
> 			    unsigned long start, unsigned long size,
> 			    unsigned long arg4, unsigned long arg5)
> {
> -	unsigned long hmask_val, hartid, hbase;
> -	struct cpumask tmask;
> +	unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> 	int result;
> 
> -	if (!hart_mask || !(*hart_mask)) {
> -		riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> -		hart_mask = cpumask_bits(&tmask);
> -	}
> +	if (!cpu_mask)
> +		cpu_mask = cpu_online_mask;

As with __sbi_send_ipi_v02.

Jess

> -	hmask_val = 0;
> -	hbase = 0;
> -	for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> -		if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> -			result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> +	for_each_cpu(cpuid, cpu_mask) {
> +		hartid = cpuid_to_hartid_map(cpuid);
> +		if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> +			result = __sbi_rfence_v02_call(fid, hmask, hbase,
> 						       start, size, arg4, arg5);
> 			if (result)
> 				return result;
> -			hmask_val = 0;
> +			hmask = 0;
> 			hbase = 0;
> 		}
> -		if (!hmask_val)
> +		if (!hmask)
> 			hbase = hartid;
> -		hmask_val |= 1UL << (hartid - hbase);
> +		hmask |= 1UL << (hartid - hbase);
> 	}
> 
> -	if (hmask_val) {
> -		result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> +	if (hmask) {
> +		result = __sbi_rfence_v02_call(fid, hmask, hbase,
> 					       start, size, arg4, arg5);
> 		if (result)
> 			return result;
> @@ -361,44 +388,44 @@ void sbi_set_timer(uint64_t stime_value)
> 
> /**
>  * sbi_send_ipi() - Send an IPI to any hart.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_send_ipi(const unsigned long *hart_mask)
> +int sbi_send_ipi(const struct cpumask *cpu_mask)
> {
> -	return __sbi_send_ipi(hart_mask);
> +	return __sbi_send_ipi(cpu_mask);
> }
> EXPORT_SYMBOL(sbi_send_ipi);
> 
> /**
>  * sbi_remote_fence_i() - Execute FENCE.I instruction on given remote harts.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_remote_fence_i(const unsigned long *hart_mask)
> +int sbi_remote_fence_i(const struct cpumask *cpu_mask)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_FENCE_I,
> -			    hart_mask, 0, 0, 0, 0);
> +			    cpu_mask, 0, 0, 0, 0);
> }
> EXPORT_SYMBOL(sbi_remote_fence_i);
> 
> /**
>  * sbi_remote_sfence_vma() - Execute SFENCE.VMA instructions on given remote
>  *			     harts for the specified virtual address range.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the virtual address
>  * @size: Total size of the virtual address range.
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
> -			    hart_mask, start, size, 0, 0);
> +			    cpu_mask, start, size, 0, 0);
> }
> EXPORT_SYMBOL(sbi_remote_sfence_vma);
> 
> @@ -406,38 +433,38 @@ EXPORT_SYMBOL(sbi_remote_sfence_vma);
>  * sbi_remote_sfence_vma_asid() - Execute SFENCE.VMA instructions on given
>  * remote harts for a virtual address range belonging to a specific ASID.
>  *
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the virtual address
>  * @size: Total size of the virtual address range.
>  * @asid: The value of address space identifier (ASID).
>  *
>  * Return: 0 on success, appropriate linux error code otherwise.
>  */
> -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
> -			    hart_mask, start, size, asid, 0);
> +			    cpu_mask, start, size, asid, 0);
> }
> EXPORT_SYMBOL(sbi_remote_sfence_vma_asid);
> 
> /**
>  * sbi_remote_hfence_gvma() - Execute HFENCE.GVMA instructions on given remote
>  *			   harts for the specified guest physical address range.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the guest physical address
>  * @size: Total size of the guest physical address range.
>  *
>  * Return: None
>  */
> -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
> -			    hart_mask, start, size, 0, 0);
> +			    cpu_mask, start, size, 0, 0);
> }
> EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
> 
> @@ -445,38 +472,38 @@ EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
>  * sbi_remote_hfence_gvma_vmid() - Execute HFENCE.GVMA instructions on given
>  * remote harts for a guest physical address range belonging to a specific VMID.
>  *
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the guest physical address
>  * @size: Total size of the guest physical address range.
>  * @vmid: The value of guest ID (VMID).
>  *
>  * Return: 0 if success, Error otherwise.
>  */
> -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long vmid)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
> -			    hart_mask, start, size, vmid, 0);
> +			    cpu_mask, start, size, vmid, 0);
> }
> EXPORT_SYMBOL(sbi_remote_hfence_gvma_vmid);
> 
> /**
>  * sbi_remote_hfence_vvma() - Execute HFENCE.VVMA instructions on given remote
>  *			     harts for the current guest virtual address range.
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the current guest virtual address
>  * @size: Total size of the current guest virtual address range.
>  *
>  * Return: None
>  */
> -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> 			   unsigned long start,
> 			   unsigned long size)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
> -			    hart_mask, start, size, 0, 0);
> +			    cpu_mask, start, size, 0, 0);
> }
> EXPORT_SYMBOL(sbi_remote_hfence_vvma);
> 
> @@ -485,20 +512,20 @@ EXPORT_SYMBOL(sbi_remote_hfence_vvma);
>  * remote harts for current guest virtual address range belonging to a specific
>  * ASID.
>  *
> - * @hart_mask: A cpu mask containing all the target harts.
> + * @cpu_mask: A cpu mask containing all the target harts.
>  * @start: Start of the current guest virtual address
>  * @size: Total size of the current guest virtual address range.
>  * @asid: The value of address space identifier (ASID).
>  *
>  * Return: None
>  */
> -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> 				unsigned long start,
> 				unsigned long size,
> 				unsigned long asid)
> {
> 	return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
> -			    hart_mask, start, size, asid, 0);
> +			    cpu_mask, start, size, asid, 0);
> }
> EXPORT_SYMBOL(sbi_remote_hfence_vvma_asid);
> 
> @@ -591,11 +618,7 @@ long sbi_get_mimpid(void)
> 
> static void sbi_send_cpumask_ipi(const struct cpumask *target)
> {
> -	struct cpumask hartid_mask;
> -
> -	riscv_cpuid_to_hartid_mask(target, &hartid_mask);
> -
> -	sbi_send_ipi(cpumask_bits(&hartid_mask));
> +	sbi_send_ipi(target);
> }
> 
> static const struct riscv_ipi_ops sbi_ipi_ops = {
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 63241abe84eb..b42bfdc67482 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -59,16 +59,6 @@ atomic_t hart_lottery __section(".sdata")
> unsigned long boot_cpu_hartid;
> static DEFINE_PER_CPU(struct cpu, cpu_devices);
> 
> -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out)
> -{
> -	int cpu;
> -
> -	cpumask_clear(out);
> -	for_each_cpu(cpu, in)
> -		cpumask_set_cpu(cpuid_to_hartid_map(cpu), out);
> -}
> -EXPORT_SYMBOL_GPL(riscv_cpuid_to_hartid_mask);
> -
> /*
>  * Place kernel memory regions on the resource tree so that
>  * kexec-tools can retrieve them from /proc/iomem. While there
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index bd82375db51a..622f226454d5 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -96,7 +96,7 @@ void __init setup_smp(void)
> 		if (cpuid >= NR_CPUS) {
> 			pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
> 				cpuid, hart);
> -			break;
> +			continue;
> 		}
> 
> 		cpuid_to_hartid_map(cpuid) = hart;
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 9af67dbdc66a..f80a34fbf102 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -114,7 +114,6 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
> 
> static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> {
> -	struct cpumask hmask;
> 	unsigned long size = PAGE_SIZE;
> 	struct kvm_vmid *vmid = &kvm->arch.vmid;
> 
> @@ -127,8 +126,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> 	 * where the Guest/VM is running.
> 	 */
> 	preempt_disable();
> -	riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> -	sbi_remote_hfence_gvma_vmid(cpumask_bits(&hmask), addr, size,
> +	sbi_remote_hfence_gvma_vmid(cpu_online_mask, addr, size,
> 				    READ_ONCE(vmid->vmid));
> 	preempt_enable();
> }
> diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> index 00036b7f83b9..1bc0608a5bfd 100644
> --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> @@ -82,7 +82,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> {
> 	int ret = 0;
> 	unsigned long i;
> -	struct cpumask cm, hm;
> +	struct cpumask cm;
> 	struct kvm_vcpu *tmp;
> 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> 	unsigned long hmask = cp->a0;
> @@ -90,7 +90,6 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> 	unsigned long funcid = cp->a6;
> 
> 	cpumask_clear(&cm);
> -	cpumask_clear(&hm);
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		if (hbase != -1UL) {
> 			if (tmp->vcpu_id < hbase)
> @@ -103,17 +102,15 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> 		cpumask_set_cpu(tmp->cpu, &cm);
> 	}
> 
> -	riscv_cpuid_to_hartid_mask(&cm, &hm);
> -
> 	switch (funcid) {
> 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> -		ret = sbi_remote_fence_i(cpumask_bits(&hm));
> +		ret = sbi_remote_fence_i(&cm);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> -		ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
> +		ret = sbi_remote_hfence_vvma(&cm, cp->a2, cp->a3);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> -		ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
> +		ret = sbi_remote_hfence_vvma_asid(&cm, cp->a2,
> 						  cp->a3, cp->a4);
> 		break;
> 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> index 4c7e13ec9ccc..07e2de14433a 100644
> --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> @@ -38,7 +38,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> 	int i, ret = 0;
> 	u64 next_cycle;
> 	struct kvm_vcpu *rvcpu;
> -	struct cpumask cm, hm;
> +	struct cpumask cm;
> 	struct kvm *kvm = vcpu->kvm;
> 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> 
> @@ -101,15 +101,12 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> 				continue;
> 			cpumask_set_cpu(rvcpu->cpu, &cm);
> 		}
> -		riscv_cpuid_to_hartid_mask(&cm, &hm);
> 		if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> -			ret = sbi_remote_fence_i(cpumask_bits(&hm));
> +			ret = sbi_remote_fence_i(&cm);
> 		else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
> -			ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
> -						cp->a1, cp->a2);
> +			ret = sbi_remote_hfence_vvma(&cm, cp->a1, cp->a2);
> 		else
> -			ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
> -						cp->a1, cp->a2, cp->a3);
> +			ret = sbi_remote_hfence_vvma_asid(&cm, cp->a1, cp->a2, cp->a3);
> 		break;
> 	default:
> 		ret = -EINVAL;
> diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
> index 807228f8f409..2fa4f7b1813d 100644
> --- a/arch/riscv/kvm/vmid.c
> +++ b/arch/riscv/kvm/vmid.c
> @@ -67,7 +67,6 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> {
> 	unsigned long i;
> 	struct kvm_vcpu *v;
> -	struct cpumask hmask;
> 	struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;
> 
> 	if (!kvm_riscv_stage2_vmid_ver_changed(vmid))
> @@ -102,8 +101,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> 		 * running, we force VM exits on all host CPUs using IPI and
> 		 * flush all Guest TLBs.
> 		 */
> -		riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> -		sbi_remote_hfence_gvma(cpumask_bits(&hmask), 0, 0);
> +		sbi_remote_hfence_gvma(cpu_online_mask, 0, 0);
> 	}
> 
> 	vmid->vmid = vmid_next;
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index 89f81067e09e..6cb7d96ad9c7 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -67,10 +67,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
> 		 */
> 		smp_mb();
> 	} else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
> -		cpumask_t hartid_mask;
> -
> -		riscv_cpuid_to_hartid_mask(&others, &hartid_mask);
> -		sbi_remote_fence_i(cpumask_bits(&hartid_mask));
> +		sbi_remote_fence_i(&others);
> 	} else {
> 		on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
> 	}
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index 64f8201237c2..37ed760d007c 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -32,7 +32,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> 				  unsigned long size, unsigned long stride)
> {
> 	struct cpumask *cmask = mm_cpumask(mm);
> -	struct cpumask hmask;
> 	unsigned int cpuid;
> 	bool broadcast;
> 
> @@ -46,9 +45,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> 		unsigned long asid = atomic_long_read(&mm->context.id);
> 
> 		if (broadcast) {
> -			riscv_cpuid_to_hartid_mask(cmask, &hmask);
> -			sbi_remote_sfence_vma_asid(cpumask_bits(&hmask),
> -						   start, size, asid);
> +			sbi_remote_sfence_vma_asid(cmask, start, size, asid);
> 		} else if (size <= stride) {
> 			local_flush_tlb_page_asid(start, asid);
> 		} else {
> @@ -56,9 +53,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> 		}
> 	} else {
> 		if (broadcast) {
> -			riscv_cpuid_to_hartid_mask(cmask, &hmask);
> -			sbi_remote_sfence_vma(cpumask_bits(&hmask),
> -					      start, size);
> +			sbi_remote_sfence_vma(cmask, start, size);
> 		} else if (size <= stride) {
> 			local_flush_tlb_page(start);
> 		} else {
> -- 
> 2.30.2


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-25 22:26     ` Jessica Clarke
@ 2022-01-25 22:29       ` David Laight
  -1 siblings, 0 replies; 55+ messages in thread
From: David Laight @ 2022-01-25 22:29 UTC (permalink / raw)
  To: 'Jessica Clarke', Atish Patra
  Cc: Linux Kernel Mailing List, Anup Patel, Albert Ou, Atish Patra,
	Damien Le Moal, devicetree, Jisheng Zhang, Krzysztof Kozlowski,
	linux-riscv, Palmer Dabbelt, Paul Walmsley, Rob Herring

> On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> >
> > Currently, SBI APIs accept a hartmask that is generated from struct
> > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> > is not the correct data structure for hartids, as they can be higher
> > than NR_CPUS on platforms with sparse or discontiguous hartids.
> >
> > Remove all association between hartid mask and struct cpumask.
....
> > -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> > 			    unsigned long start, unsigned long size,
> > 			    unsigned long arg4, unsigned long arg5)
> > {
> > 	int result = 0;
> > +	unsigned long hart_mask;
> > +
> > +	if (!cpu_mask)
> > +		cpu_mask = cpu_online_mask;
> > +	hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> >
> > 	/* v0.2 function IDs are equivalent to v0.1 extension IDs */
> > 	switch (fid) {
> > 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> > 		sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
> > -			  (unsigned long)hart_mask, 0, 0, 0, 0, 0);
> > +			  (unsigned long)&hart_mask, 0, 0, 0, 0, 0);

You don't need the cast.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 55+ messages in thread


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-25 22:26     ` Jessica Clarke
@ 2022-01-26  2:21       ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-26  2:21 UTC (permalink / raw)
  To: Jessica Clarke, Geert Uytterhoeven
  Cc: Atish Patra, Linux Kernel Mailing List, Anup Patel, Albert Ou,
	Damien Le Moal, devicetree, Jisheng Zhang, Krzysztof Kozlowski,
	linux-riscv, Palmer Dabbelt, Paul Walmsley, Rob Herring

On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>
> On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> >
> > Currently, SBI APIs accept a hartmask that is generated from struct
> > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> > is not the correct data structure for hartids, as they can be higher
> > than NR_CPUS on platforms with sparse or discontiguous hartids.
> >
> > Remove all association between hartid mask and struct cpumask.
> >
> > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> > arch/riscv/include/asm/sbi.h      |  19 +--
> > arch/riscv/include/asm/smp.h      |   2 -
> > arch/riscv/kernel/sbi.c           | 189 +++++++++++++++++-------------
> > arch/riscv/kernel/setup.c         |  10 --
> > arch/riscv/kernel/smpboot.c       |   2 +-
> > arch/riscv/kvm/mmu.c              |   4 +-
> > arch/riscv/kvm/vcpu_sbi_replace.c |  11 +-
> > arch/riscv/kvm/vcpu_sbi_v01.c     |  11 +-
> > arch/riscv/kvm/vmid.c             |   4 +-
> > arch/riscv/mm/cacheflush.c        |   5 +-
> > arch/riscv/mm/tlbflush.c          |   9 +-
> > 11 files changed, 130 insertions(+), 136 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> > index 26ba6f2d7a40..d1c37479d828 100644
> > --- a/arch/riscv/include/asm/sbi.h
> > +++ b/arch/riscv/include/asm/sbi.h
> > @@ -8,6 +8,7 @@
> > #define _ASM_RISCV_SBI_H
> >
> > #include <linux/types.h>
> > +#include <linux/cpumask.h>
> >
> > #ifdef CONFIG_RISCV_SBI
> > enum sbi_ext_id {
> > @@ -128,27 +129,27 @@ long sbi_get_mimpid(void);
> > void sbi_set_timer(uint64_t stime_value);
> > void sbi_shutdown(void);
> > void sbi_clear_ipi(void);
> > -int sbi_send_ipi(const unsigned long *hart_mask);
> > -int sbi_remote_fence_i(const unsigned long *hart_mask);
> > -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> > +int sbi_send_ipi(const struct cpumask *cpu_mask);
> > +int sbi_remote_fence_i(const struct cpumask *cpu_mask);
> > +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size);
> >
> > -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid);
> > -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size);
> > -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long vmid);
> > -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size);
> > -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid);
> > @@ -183,7 +184,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
> >
> > int sbi_err_map_linux_errno(int err);
> > #else /* CONFIG_RISCV_SBI */
> > -static inline int sbi_remote_fence_i(const unsigned long *hart_mask) { return -1; }
> > +static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
> > static inline void sbi_init(void) {}
> > #endif /* CONFIG_RISCV_SBI */
> > #endif /* _ASM_RISCV_SBI_H */
> > diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> > index 6ad749f42807..23170c933d73 100644
> > --- a/arch/riscv/include/asm/smp.h
> > +++ b/arch/riscv/include/asm/smp.h
> > @@ -92,8 +92,6 @@ static inline void riscv_clear_ipi(void)
> >
> > #endif /* CONFIG_SMP */
> >
> > -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
> > -
> > #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
> > bool cpu_has_hotplug(unsigned int cpu);
> > #else
> > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> > index 9a84f0cb5175..f72527fcb347 100644
> > --- a/arch/riscv/kernel/sbi.c
> > +++ b/arch/riscv/kernel/sbi.c
> > @@ -16,8 +16,8 @@ unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
> > EXPORT_SYMBOL(sbi_spec_version);
> >
> > static void (*__sbi_set_timer)(uint64_t stime) __ro_after_init;
> > -static int (*__sbi_send_ipi)(const unsigned long *hart_mask) __ro_after_init;
> > -static int (*__sbi_rfence)(int fid, const unsigned long *hart_mask,
> > +static int (*__sbi_send_ipi)(const struct cpumask *cpu_mask) __ro_after_init;
> > +static int (*__sbi_rfence)(int fid, const struct cpumask *cpu_mask,
> >                          unsigned long start, unsigned long size,
> >                          unsigned long arg4, unsigned long arg5) __ro_after_init;
> >
> > @@ -67,6 +67,30 @@ int sbi_err_map_linux_errno(int err)
> > EXPORT_SYMBOL(sbi_err_map_linux_errno);
> >
> > #ifdef CONFIG_RISCV_SBI_V01
> > +static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
> > +{
> > +     unsigned long cpuid, hartid;
> > +     unsigned long hmask = 0;
> > +
> > +     /*
> > +      * There is no maximum hartid concept in RISC-V and NR_CPUS must not be
> > +      * associated with hartid. As SBI v0.1 is only kept for backward compatibility
> > +      * and will be removed in the future, there is no point in supporting hartid
> > +      * greater than BITS_PER_LONG (32 for RV32 and 64 for RV64). Ideally, SBI v0.2
> > +      * should be used for platforms with hartid greater than BITS_PER_LONG.
> > +      */
> > +     for_each_cpu(cpuid, cpu_mask) {
> > +             hartid = cpuid_to_hartid_map(cpuid);
> > +             if (hartid >= BITS_PER_LONG) {
> > +                     pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
> > +                     break;
> > +             }
> > +             hmask |= 1 << hartid;
>
> This should be 1UL otherwise hartid 31 and up cause UB.
>

Yeah. Thanks for catching it.

> > +     }
> > +
> > +     return hmask;
> > +}
> > +
> > /**
> >  * sbi_console_putchar() - Writes given character to the console device.
> >  * @ch: The data to be written to the console.
> > @@ -132,33 +156,44 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> > #endif
> > }
> >
> > -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> > +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> > {
> > -     sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)hart_mask,
> > +     unsigned long hart_mask;
> > +
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
> > +     hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> > +
> > +     sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)(&hart_mask),
> >                 0, 0, 0, 0, 0);
> >       return 0;
> > }
> >
> > -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> >                           unsigned long start, unsigned long size,
> >                           unsigned long arg4, unsigned long arg5)
> > {
> >       int result = 0;
> > +     unsigned long hart_mask;
> > +
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
> > +     hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> >
> >       /* v0.2 function IDs are equivalent to v0.1 extension IDs */
> >       switch (fid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> >               sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
> > -                       (unsigned long)hart_mask, 0, 0, 0, 0, 0);
> > +                       (unsigned long)&hart_mask, 0, 0, 0, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> >               sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA, 0,
> > -                       (unsigned long)hart_mask, start, size,
> > +                       (unsigned long)&hart_mask, start, size,
> >                         0, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> >               sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID, 0,
> > -                       (unsigned long)hart_mask, start, size,
> > +                       (unsigned long)&hart_mask, start, size,
> >                         arg4, 0, 0);
> >               break;
> >       default:
> > @@ -180,7 +215,7 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> >               sbi_major_version(), sbi_minor_version());
> > }
> >
> > -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> > +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> > {
> >       pr_warn("IPI extension is not available in SBI v%lu.%lu\n",
> >               sbi_major_version(), sbi_minor_version());
> > @@ -188,7 +223,7 @@ static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> >       return 0;
> > }
> >
> > -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> >                           unsigned long start, unsigned long size,
> >                           unsigned long arg4, unsigned long arg5)
> > {
> > @@ -212,37 +247,33 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
> > #endif
> > }
> >
> > -static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> > +static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
> > {
> > -     unsigned long hartid, hmask_val, hbase;
> > -     struct cpumask tmask;
> > +     unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> >       struct sbiret ret = {0};
> >       int result;
> >
> > -     if (!hart_mask || !(*hart_mask)) {
> > -             riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> > -             hart_mask = cpumask_bits(&tmask);
> > -     }
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
>
> This is a change in behaviour. Are you sure nothing passes an empty mask?
>

The change in behavior is not intentional.

I have yet to reproduce it on my end.
@Geert Uytterhoeven: can you please try the below diff on your end?

diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index f72527fcb347..ca1c617407b4 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -85,7 +85,7 @@ static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
                        pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
                        break;
                }
-               hmask |= 1 << hartid;
+               hmask |= 1UL << hartid;
        }

        return hmask;
@@ -160,7 +160,7 @@ static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
        unsigned long hart_mask;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;
        hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);

@@ -176,7 +176,7 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
        int result = 0;
        unsigned long hart_mask;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;
        hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);

@@ -253,7 +253,7 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
        struct sbiret ret = {0};
        int result;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;

        for_each_cpu(cpuid, cpu_mask) {
@@ -347,7 +347,7 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
        unsigned long hartid, cpuid, hmask = 0, hbase = 0;
        int result;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;


> > -     hmask_val = 0;
> > -     hbase = 0;
> > -     for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> > -             if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> > +     for_each_cpu(cpuid, cpu_mask) {
> > +             hartid = cpuid_to_hartid_map(cpuid);
> > +             if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> >                       ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> > -                                     hmask_val, hbase, 0, 0, 0, 0);
> > +                                     hmask, hbase, 0, 0, 0, 0);
> >                       if (ret.error)
> >                               goto ecall_failed;
> > -                     hmask_val = 0;
> > +                     hmask = 0;
> >                       hbase = 0;
> >               }
> > -             if (!hmask_val)
> > +             if (!hmask)
> >                       hbase = hartid;
> > -             hmask_val |= 1UL << (hartid - hbase);
> > +             hmask |= 1UL << (hartid - hbase);
> >       }
> >
> > -     if (hmask_val) {
> > +     if (hmask) {
> >               ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> > -                             hmask_val, hbase, 0, 0, 0, 0);
> > +                             hmask, hbase, 0, 0, 0, 0);
> >               if (ret.error)
> >                       goto ecall_failed;
> >       }
> > @@ -252,11 +283,11 @@ static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> > ecall_failed:
> >       result = sbi_err_map_linux_errno(ret.error);
> >       pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> > -            __func__, hbase, hmask_val, result);
> > +            __func__, hbase, hmask, result);
> >       return result;
> > }
> >
> > -static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> > +static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask,
> >                                unsigned long hbase, unsigned long start,
> >                                unsigned long size, unsigned long arg4,
> >                                unsigned long arg5)
> > @@ -267,31 +298,31 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> >
> >       switch (fid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, 0, 0, 0, 0);
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, 0, 0, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, arg4, 0);
> >               break;
> >
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, arg4, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, arg4, 0);
> >               break;
> >       default:
> > @@ -303,43 +334,39 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> >       if (ret.error) {
> >               result = sbi_err_map_linux_errno(ret.error);
> >               pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> > -                    __func__, hbase, hmask_val, result);
> > +                    __func__, hbase, hmask, result);
> >       }
> >
> >       return result;
> > }
> >
> > -static int __sbi_rfence_v02(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
> >                           unsigned long start, unsigned long size,
> >                           unsigned long arg4, unsigned long arg5)
> > {
> > -     unsigned long hmask_val, hartid, hbase;
> > -     struct cpumask tmask;
> > +     unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> >       int result;
> >
> > -     if (!hart_mask || !(*hart_mask)) {
> > -             riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> > -             hart_mask = cpumask_bits(&tmask);
> > -     }
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
>
> As with __sbi_send_ipi_v02.
>
> Jess
>
> > -     hmask_val = 0;
> > -     hbase = 0;
> > -     for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> > -             if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> > -                     result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> > +     for_each_cpu(cpuid, cpu_mask) {
> > +             hartid = cpuid_to_hartid_map(cpuid);
> > +             if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> > +                     result = __sbi_rfence_v02_call(fid, hmask, hbase,
> >                                                      start, size, arg4, arg5);
> >                       if (result)
> >                               return result;
> > -                     hmask_val = 0;
> > +                     hmask = 0;
> >                       hbase = 0;
> >               }
> > -             if (!hmask_val)
> > +             if (!hmask)
> >                       hbase = hartid;
> > -             hmask_val |= 1UL << (hartid - hbase);
> > +             hmask |= 1UL << (hartid - hbase);
> >       }
> >
> > -     if (hmask_val) {
> > -             result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> > +     if (hmask) {
> > +             result = __sbi_rfence_v02_call(fid, hmask, hbase,
> >                                              start, size, arg4, arg5);
> >               if (result)
> >                       return result;
> > @@ -361,44 +388,44 @@ void sbi_set_timer(uint64_t stime_value)
> >
> > /**
> >  * sbi_send_ipi() - Send an IPI to any hart.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_send_ipi(const unsigned long *hart_mask)
> > +int sbi_send_ipi(const struct cpumask *cpu_mask)
> > {
> > -     return __sbi_send_ipi(hart_mask);
> > +     return __sbi_send_ipi(cpu_mask);
> > }
> > EXPORT_SYMBOL(sbi_send_ipi);
> >
> > /**
> >  * sbi_remote_fence_i() - Execute FENCE.I instruction on given remote harts.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_remote_fence_i(const unsigned long *hart_mask)
> > +int sbi_remote_fence_i(const struct cpumask *cpu_mask)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_FENCE_I,
> > -                         hart_mask, 0, 0, 0, 0);
> > +                         cpu_mask, 0, 0, 0, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_fence_i);
> >
> > /**
> >  * sbi_remote_sfence_vma() - Execute SFENCE.VMA instructions on given remote
> >  *                         harts for the specified virtual address range.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the virtual address
> >  * @size: Total size of the virtual address range.
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> > +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
> > -                         hart_mask, start, size, 0, 0);
> > +                         cpu_mask, start, size, 0, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_sfence_vma);
> >
> > @@ -406,38 +433,38 @@ EXPORT_SYMBOL(sbi_remote_sfence_vma);
> >  * sbi_remote_sfence_vma_asid() - Execute SFENCE.VMA instructions on given
> >  * remote harts for a virtual address range belonging to a specific ASID.
> >  *
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the virtual address
> >  * @size: Total size of the virtual address range.
> >  * @asid: The value of address space identifier (ASID).
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
> > -                         hart_mask, start, size, asid, 0);
> > +                         cpu_mask, start, size, asid, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_sfence_vma_asid);
> >
> > /**
> >  * sbi_remote_hfence_gvma() - Execute HFENCE.GVMA instructions on given remote
> >  *                       harts for the specified guest physical address range.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the guest physical address
> >  * @size: Total size of the guest physical address range.
> >  *
> >  * Return: None
> >  */
> > -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
> > -                         hart_mask, start, size, 0, 0);
> > +                         cpu_mask, start, size, 0, 0);
> > }
> > EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
> >
> > @@ -445,38 +472,38 @@ EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
> >  * sbi_remote_hfence_gvma_vmid() - Execute HFENCE.GVMA instructions on given
> >  * remote harts for a guest physical address range belonging to a specific VMID.
> >  *
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the guest physical address
> >  * @size: Total size of the guest physical address range.
> >  * @vmid: The value of guest ID (VMID).
> >  *
> >  * Return: 0 if success, Error otherwise.
> >  */
> > -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long vmid)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
> > -                         hart_mask, start, size, vmid, 0);
> > +                         cpu_mask, start, size, vmid, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_hfence_gvma_vmid);
> >
> > /**
> >  * sbi_remote_hfence_vvma() - Execute HFENCE.VVMA instructions on given remote
> >  *                         harts for the current guest virtual address range.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the current guest virtual address
> >  * @size: Total size of the current guest virtual address range.
> >  *
> >  * Return: None
> >  */
> > -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
> > -                         hart_mask, start, size, 0, 0);
> > +                         cpu_mask, start, size, 0, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_hfence_vvma);
> >
> > @@ -485,20 +512,20 @@ EXPORT_SYMBOL(sbi_remote_hfence_vvma);
> >  * remote harts for current guest virtual address range belonging to a specific
> >  * ASID.
> >  *
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the current guest virtual address
> >  * @size: Total size of the current guest virtual address range.
> >  * @asid: The value of address space identifier (ASID).
> >  *
> >  * Return: None
> >  */
> > -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
> > -                         hart_mask, start, size, asid, 0);
> > +                         cpu_mask, start, size, asid, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_hfence_vvma_asid);
> >
> > @@ -591,11 +618,7 @@ long sbi_get_mimpid(void)
> >
> > static void sbi_send_cpumask_ipi(const struct cpumask *target)
> > {
> > -     struct cpumask hartid_mask;
> > -
> > -     riscv_cpuid_to_hartid_mask(target, &hartid_mask);
> > -
> > -     sbi_send_ipi(cpumask_bits(&hartid_mask));
> > +     sbi_send_ipi(target);
> > }
> >
> > static const struct riscv_ipi_ops sbi_ipi_ops = {
> > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > index 63241abe84eb..b42bfdc67482 100644
> > --- a/arch/riscv/kernel/setup.c
> > +++ b/arch/riscv/kernel/setup.c
> > @@ -59,16 +59,6 @@ atomic_t hart_lottery __section(".sdata")
> > unsigned long boot_cpu_hartid;
> > static DEFINE_PER_CPU(struct cpu, cpu_devices);
> >
> > -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out)
> > -{
> > -     int cpu;
> > -
> > -     cpumask_clear(out);
> > -     for_each_cpu(cpu, in)
> > -             cpumask_set_cpu(cpuid_to_hartid_map(cpu), out);
> > -}
> > -EXPORT_SYMBOL_GPL(riscv_cpuid_to_hartid_mask);
> > -
> > /*
> >  * Place kernel memory regions on the resource tree so that
> >  * kexec-tools can retrieve them from /proc/iomem. While there
> > diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> > index bd82375db51a..622f226454d5 100644
> > --- a/arch/riscv/kernel/smpboot.c
> > +++ b/arch/riscv/kernel/smpboot.c
> > @@ -96,7 +96,7 @@ void __init setup_smp(void)
> >               if (cpuid >= NR_CPUS) {
> >                       pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
> >                               cpuid, hart);
> > -                     break;
> > +                     continue;
> >               }
> >
> >               cpuid_to_hartid_map(cpuid) = hart;
> > diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> > index 9af67dbdc66a..f80a34fbf102 100644
> > --- a/arch/riscv/kvm/mmu.c
> > +++ b/arch/riscv/kvm/mmu.c
> > @@ -114,7 +114,6 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
> >
> > static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> > {
> > -     struct cpumask hmask;
> >       unsigned long size = PAGE_SIZE;
> >       struct kvm_vmid *vmid = &kvm->arch.vmid;
> >
> > @@ -127,8 +126,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> >        * where the Guest/VM is running.
> >        */
> >       preempt_disable();
> > -     riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> > -     sbi_remote_hfence_gvma_vmid(cpumask_bits(&hmask), addr, size,
> > +     sbi_remote_hfence_gvma_vmid(cpu_online_mask, addr, size,
> >                                   READ_ONCE(vmid->vmid));
> >       preempt_enable();
> > }
> > diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> > index 00036b7f83b9..1bc0608a5bfd 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> > @@ -82,7 +82,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> > {
> >       int ret = 0;
> >       unsigned long i;
> > -     struct cpumask cm, hm;
> > +     struct cpumask cm;
> >       struct kvm_vcpu *tmp;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       unsigned long hmask = cp->a0;
> > @@ -90,7 +90,6 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >       unsigned long funcid = cp->a6;
> >
> >       cpumask_clear(&cm);
> > -     cpumask_clear(&hm);
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               if (hbase != -1UL) {
> >                       if (tmp->vcpu_id < hbase)
> > @@ -103,17 +102,15 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >               cpumask_set_cpu(tmp->cpu, &cm);
> >       }
> >
> > -     riscv_cpuid_to_hartid_mask(&cm, &hm);
> > -
> >       switch (funcid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> > -             ret = sbi_remote_fence_i(cpumask_bits(&hm));
> > +             ret = sbi_remote_fence_i(&cm);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> > -             ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
> > +             ret = sbi_remote_hfence_vvma(&cm, cp->a2, cp->a3);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> > -             ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
> > +             ret = sbi_remote_hfence_vvma_asid(&cm, cp->a2,
> >                                                 cp->a3, cp->a4);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> > diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> > index 4c7e13ec9ccc..07e2de14433a 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> > @@ -38,7 +38,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >       int i, ret = 0;
> >       u64 next_cycle;
> >       struct kvm_vcpu *rvcpu;
> > -     struct cpumask cm, hm;
> > +     struct cpumask cm;
> >       struct kvm *kvm = vcpu->kvm;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >
> > @@ -101,15 +101,12 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >                               continue;
> >                       cpumask_set_cpu(rvcpu->cpu, &cm);
> >               }
> > -             riscv_cpuid_to_hartid_mask(&cm, &hm);
> >               if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> > -                     ret = sbi_remote_fence_i(cpumask_bits(&hm));
> > +                     ret = sbi_remote_fence_i(&cm);
> >               else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
> > -                     ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
> > -                                             cp->a1, cp->a2);
> > +                     ret = sbi_remote_hfence_vvma(&cm, cp->a1, cp->a2);
> >               else
> > -                     ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
> > -                                             cp->a1, cp->a2, cp->a3);
> > +                     ret = sbi_remote_hfence_vvma_asid(&cm, cp->a1, cp->a2, cp->a3);
> >               break;
> >       default:
> >               ret = -EINVAL;
> > diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
> > index 807228f8f409..2fa4f7b1813d 100644
> > --- a/arch/riscv/kvm/vmid.c
> > +++ b/arch/riscv/kvm/vmid.c
> > @@ -67,7 +67,6 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> > {
> >       unsigned long i;
> >       struct kvm_vcpu *v;
> > -     struct cpumask hmask;
> >       struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;
> >
> >       if (!kvm_riscv_stage2_vmid_ver_changed(vmid))
> > @@ -102,8 +101,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> >                * running, we force VM exits on all host CPUs using IPI and
> >                * flush all Guest TLBs.
> >                */
> > -             riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> > -             sbi_remote_hfence_gvma(cpumask_bits(&hmask), 0, 0);
> > +             sbi_remote_hfence_gvma(cpu_online_mask, 0, 0);
> >       }
> >
> >       vmid->vmid = vmid_next;
> > diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> > index 89f81067e09e..6cb7d96ad9c7 100644
> > --- a/arch/riscv/mm/cacheflush.c
> > +++ b/arch/riscv/mm/cacheflush.c
> > @@ -67,10 +67,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
> >                */
> >               smp_mb();
> >       } else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
> > -             cpumask_t hartid_mask;
> > -
> > -             riscv_cpuid_to_hartid_mask(&others, &hartid_mask);
> > -             sbi_remote_fence_i(cpumask_bits(&hartid_mask));
> > +             sbi_remote_fence_i(&others);
> >       } else {
> >               on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
> >       }
> > diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> > index 64f8201237c2..37ed760d007c 100644
> > --- a/arch/riscv/mm/tlbflush.c
> > +++ b/arch/riscv/mm/tlbflush.c
> > @@ -32,7 +32,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> >                                 unsigned long size, unsigned long stride)
> > {
> >       struct cpumask *cmask = mm_cpumask(mm);
> > -     struct cpumask hmask;
> >       unsigned int cpuid;
> >       bool broadcast;
> >
> > @@ -46,9 +45,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> >               unsigned long asid = atomic_long_read(&mm->context.id);
> >
> >               if (broadcast) {
> > -                     riscv_cpuid_to_hartid_mask(cmask, &hmask);
> > -                     sbi_remote_sfence_vma_asid(cpumask_bits(&hmask),
> > -                                                start, size, asid);
> > +                     sbi_remote_sfence_vma_asid(cmask, start, size, asid);
> >               } else if (size <= stride) {
> >                       local_flush_tlb_page_asid(start, asid);
> >               } else {
> > @@ -56,9 +53,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> >               }
> >       } else {
> >               if (broadcast) {
> > -                     riscv_cpuid_to_hartid_mask(cmask, &hmask);
> > -                     sbi_remote_sfence_vma(cpumask_bits(&hmask),
> > -                                           start, size);
> > +                     sbi_remote_sfence_vma(cmask, start, size);
> >               } else if (size <= stride) {
> >                       local_flush_tlb_page(start);
> >               } else {
> > --
> > 2.30.2
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>


-- 
Regards,
Atish


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
@ 2022-01-26  2:21       ` Atish Patra
  0 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-26  2:21 UTC (permalink / raw)
  To: Jessica Clarke, Geert Uytterhoeven
  Cc: Atish Patra, Linux Kernel Mailing List, Anup Patel, Albert Ou,
	Damien Le Moal, devicetree, Jisheng Zhang, Krzysztof Kozlowski,
	linux-riscv, Palmer Dabbelt, Paul Walmsley, Rob Herring

On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>
> On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> >
> > Currently, SBI APIs accept a hartmask that is generated from struct
> > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus, it
> > is not the correct data structure for hartids, which can be higher
> > than NR_CPUS on platforms with sparse or discontiguous hartids.
> >
> > Remove all association between hartid mask and struct cpumask.
> >
> > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> > arch/riscv/include/asm/sbi.h      |  19 +--
> > arch/riscv/include/asm/smp.h      |   2 -
> > arch/riscv/kernel/sbi.c           | 189 +++++++++++++++++-------------
> > arch/riscv/kernel/setup.c         |  10 --
> > arch/riscv/kernel/smpboot.c       |   2 +-
> > arch/riscv/kvm/mmu.c              |   4 +-
> > arch/riscv/kvm/vcpu_sbi_replace.c |  11 +-
> > arch/riscv/kvm/vcpu_sbi_v01.c     |  11 +-
> > arch/riscv/kvm/vmid.c             |   4 +-
> > arch/riscv/mm/cacheflush.c        |   5 +-
> > arch/riscv/mm/tlbflush.c          |   9 +-
> > 11 files changed, 130 insertions(+), 136 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> > index 26ba6f2d7a40..d1c37479d828 100644
> > --- a/arch/riscv/include/asm/sbi.h
> > +++ b/arch/riscv/include/asm/sbi.h
> > @@ -8,6 +8,7 @@
> > #define _ASM_RISCV_SBI_H
> >
> > #include <linux/types.h>
> > +#include <linux/cpumask.h>
> >
> > #ifdef CONFIG_RISCV_SBI
> > enum sbi_ext_id {
> > @@ -128,27 +129,27 @@ long sbi_get_mimpid(void);
> > void sbi_set_timer(uint64_t stime_value);
> > void sbi_shutdown(void);
> > void sbi_clear_ipi(void);
> > -int sbi_send_ipi(const unsigned long *hart_mask);
> > -int sbi_remote_fence_i(const unsigned long *hart_mask);
> > -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> > +int sbi_send_ipi(const struct cpumask *cpu_mask);
> > +int sbi_remote_fence_i(const struct cpumask *cpu_mask);
> > +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size);
> >
> > -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid);
> > -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size);
> > -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long vmid);
> > -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size);
> > -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid);
> > @@ -183,7 +184,7 @@ static inline unsigned long sbi_mk_version(unsigned long major,
> >
> > int sbi_err_map_linux_errno(int err);
> > #else /* CONFIG_RISCV_SBI */
> > -static inline int sbi_remote_fence_i(const unsigned long *hart_mask) { return -1; }
> > +static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
> > static inline void sbi_init(void) {}
> > #endif /* CONFIG_RISCV_SBI */
> > #endif /* _ASM_RISCV_SBI_H */
> > diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> > index 6ad749f42807..23170c933d73 100644
> > --- a/arch/riscv/include/asm/smp.h
> > +++ b/arch/riscv/include/asm/smp.h
> > @@ -92,8 +92,6 @@ static inline void riscv_clear_ipi(void)
> >
> > #endif /* CONFIG_SMP */
> >
> > -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
> > -
> > #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
> > bool cpu_has_hotplug(unsigned int cpu);
> > #else
> > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> > index 9a84f0cb5175..f72527fcb347 100644
> > --- a/arch/riscv/kernel/sbi.c
> > +++ b/arch/riscv/kernel/sbi.c
> > @@ -16,8 +16,8 @@ unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
> > EXPORT_SYMBOL(sbi_spec_version);
> >
> > static void (*__sbi_set_timer)(uint64_t stime) __ro_after_init;
> > -static int (*__sbi_send_ipi)(const unsigned long *hart_mask) __ro_after_init;
> > -static int (*__sbi_rfence)(int fid, const unsigned long *hart_mask,
> > +static int (*__sbi_send_ipi)(const struct cpumask *cpu_mask) __ro_after_init;
> > +static int (*__sbi_rfence)(int fid, const struct cpumask *cpu_mask,
> >                          unsigned long start, unsigned long size,
> >                          unsigned long arg4, unsigned long arg5) __ro_after_init;
> >
> > @@ -67,6 +67,30 @@ int sbi_err_map_linux_errno(int err)
> > EXPORT_SYMBOL(sbi_err_map_linux_errno);
> >
> > #ifdef CONFIG_RISCV_SBI_V01
> > +static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
> > +{
> > +     unsigned long cpuid, hartid;
> > +     unsigned long hmask = 0;
> > +
> > +     /*
> > +      * There is no maximum hartid concept in RISC-V and NR_CPUS must not be
> > +      * associated with hartid. As SBI v0.1 is only kept for backward compatibility
> > +      * and will be removed in the future, there is no point in supporting hartid
> > +      * greater than BITS_PER_LONG (32 for RV32 and 64 for RV64). Ideally, SBI v0.2
> > +      * should be used for platforms with hartid greater than BITS_PER_LONG.
> > +      */
> > +     for_each_cpu(cpuid, cpu_mask) {
> > +             hartid = cpuid_to_hartid_map(cpuid);
> > +             if (hartid >= BITS_PER_LONG) {
> > +                     pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
> > +                     break;
> > +             }
> > +             hmask |= 1 << hartid;
>
> This should be 1UL otherwise hartid 31 and up cause UB.
>

Yeah. Thanks for catching it.

> > +     }
> > +
> > +     return hmask;
> > +}
> > +
> > /**
> >  * sbi_console_putchar() - Writes given character to the console device.
> >  * @ch: The data to be written to the console.
> > @@ -132,33 +156,44 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> > #endif
> > }
> >
> > -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> > +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> > {
> > -     sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)hart_mask,
> > +     unsigned long hart_mask;
> > +
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
> > +     hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> > +
> > +     sbi_ecall(SBI_EXT_0_1_SEND_IPI, 0, (unsigned long)(&hart_mask),
> >                 0, 0, 0, 0, 0);
> >       return 0;
> > }
> >
> > -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> >                           unsigned long start, unsigned long size,
> >                           unsigned long arg4, unsigned long arg5)
> > {
> >       int result = 0;
> > +     unsigned long hart_mask;
> > +
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
> > +     hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
> >
> >       /* v0.2 function IDs are equivalent to v0.1 extension IDs */
> >       switch (fid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> >               sbi_ecall(SBI_EXT_0_1_REMOTE_FENCE_I, 0,
> > -                       (unsigned long)hart_mask, 0, 0, 0, 0, 0);
> > +                       (unsigned long)&hart_mask, 0, 0, 0, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> >               sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA, 0,
> > -                       (unsigned long)hart_mask, start, size,
> > +                       (unsigned long)&hart_mask, start, size,
> >                         0, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> >               sbi_ecall(SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID, 0,
> > -                       (unsigned long)hart_mask, start, size,
> > +                       (unsigned long)&hart_mask, start, size,
> >                         arg4, 0, 0);
> >               break;
> >       default:
> > @@ -180,7 +215,7 @@ static void __sbi_set_timer_v01(uint64_t stime_value)
> >               sbi_major_version(), sbi_minor_version());
> > }
> >
> > -static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> > +static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
> > {
> >       pr_warn("IPI extension is not available in SBI v%lu.%lu\n",
> >               sbi_major_version(), sbi_minor_version());
> > @@ -188,7 +223,7 @@ static int __sbi_send_ipi_v01(const unsigned long *hart_mask)
> >       return 0;
> > }
> >
> > -static int __sbi_rfence_v01(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
> >                           unsigned long start, unsigned long size,
> >                           unsigned long arg4, unsigned long arg5)
> > {
> > @@ -212,37 +247,33 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
> > #endif
> > }
> >
> > -static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> > +static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
> > {
> > -     unsigned long hartid, hmask_val, hbase;
> > -     struct cpumask tmask;
> > +     unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> >       struct sbiret ret = {0};
> >       int result;
> >
> > -     if (!hart_mask || !(*hart_mask)) {
> > -             riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> > -             hart_mask = cpumask_bits(&tmask);
> > -     }
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
>
> This is a change in behaviour. Are you sure nothing passes an empty mask?
>

The change in behavior is not intentional.

I have yet to reproduce it on my end.
@Geert Uytterhoeven: can you please try the below diff on your end.

diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index f72527fcb347..ca1c617407b4 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -85,7 +85,7 @@ static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mask)
                        pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
                        break;
                }
-               hmask |= 1 << hartid;
+               hmask |= 1UL << hartid;
        }

        return hmask;
@@ -160,7 +160,7 @@ static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
        unsigned long hart_mask;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;
        hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);

@@ -176,7 +176,7 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
        int result = 0;
        unsigned long hart_mask;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;
        hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);

@@ -253,7 +253,7 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
        struct sbiret ret = {0};
        int result;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;

        for_each_cpu(cpuid, cpu_mask) {
@@ -347,7 +347,7 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
        unsigned long hartid, cpuid, hmask = 0, hbase = 0;
        int result;

-       if (!cpu_mask)
+       if (!cpu_mask || cpumask_empty(cpu_mask))
                cpu_mask = cpu_online_mask;


> > -     hmask_val = 0;
> > -     hbase = 0;
> > -     for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> > -             if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> > +     for_each_cpu(cpuid, cpu_mask) {
> > +             hartid = cpuid_to_hartid_map(cpuid);
> > +             if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> >                       ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> > -                                     hmask_val, hbase, 0, 0, 0, 0);
> > +                                     hmask, hbase, 0, 0, 0, 0);
> >                       if (ret.error)
> >                               goto ecall_failed;
> > -                     hmask_val = 0;
> > +                     hmask = 0;
> >                       hbase = 0;
> >               }
> > -             if (!hmask_val)
> > +             if (!hmask)
> >                       hbase = hartid;
> > -             hmask_val |= 1UL << (hartid - hbase);
> > +             hmask |= 1UL << (hartid - hbase);
> >       }
> >
> > -     if (hmask_val) {
> > +     if (hmask) {
> >               ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> > -                             hmask_val, hbase, 0, 0, 0, 0);
> > +                             hmask, hbase, 0, 0, 0, 0);
> >               if (ret.error)
> >                       goto ecall_failed;
> >       }
> > @@ -252,11 +283,11 @@ static int __sbi_send_ipi_v02(const unsigned long *hart_mask)
> > ecall_failed:
> >       result = sbi_err_map_linux_errno(ret.error);
> >       pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> > -            __func__, hbase, hmask_val, result);
> > +            __func__, hbase, hmask, result);
> >       return result;
> > }
> >
> > -static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> > +static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask,
> >                                unsigned long hbase, unsigned long start,
> >                                unsigned long size, unsigned long arg4,
> >                                unsigned long arg5)
> > @@ -267,31 +298,31 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> >
> >       switch (fid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, 0, 0, 0, 0);
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, 0, 0, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, arg4, 0);
> >               break;
> >
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, arg4, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, 0, 0);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
> > -             ret = sbi_ecall(ext, fid, hmask_val, hbase, start,
> > +             ret = sbi_ecall(ext, fid, hmask, hbase, start,
> >                               size, arg4, 0);
> >               break;
> >       default:
> > @@ -303,43 +334,39 @@ static int __sbi_rfence_v02_call(unsigned long fid, unsigned long hmask_val,
> >       if (ret.error) {
> >               result = sbi_err_map_linux_errno(ret.error);
> >               pr_err("%s: hbase = [%lu] hmask = [0x%lx] failed (error [%d])\n",
> > -                    __func__, hbase, hmask_val, result);
> > +                    __func__, hbase, hmask, result);
> >       }
> >
> >       return result;
> > }
> >
> > -static int __sbi_rfence_v02(int fid, const unsigned long *hart_mask,
> > +static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
> >                           unsigned long start, unsigned long size,
> >                           unsigned long arg4, unsigned long arg5)
> > {
> > -     unsigned long hmask_val, hartid, hbase;
> > -     struct cpumask tmask;
> > +     unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> >       int result;
> >
> > -     if (!hart_mask || !(*hart_mask)) {
> > -             riscv_cpuid_to_hartid_mask(cpu_online_mask, &tmask);
> > -             hart_mask = cpumask_bits(&tmask);
> > -     }
> > +     if (!cpu_mask)
> > +             cpu_mask = cpu_online_mask;
>
> As with __sbi_send_ipi_v02.
>
> Jess
>
> > -     hmask_val = 0;
> > -     hbase = 0;
> > -     for_each_set_bit(hartid, hart_mask, NR_CPUS) {
> > -             if (hmask_val && ((hbase + BITS_PER_LONG) <= hartid)) {
> > -                     result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> > +     for_each_cpu(cpuid, cpu_mask) {
> > +             hartid = cpuid_to_hartid_map(cpuid);
> > +             if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> > +                     result = __sbi_rfence_v02_call(fid, hmask, hbase,
> >                                                      start, size, arg4, arg5);
> >                       if (result)
> >                               return result;
> > -                     hmask_val = 0;
> > +                     hmask = 0;
> >                       hbase = 0;
> >               }
> > -             if (!hmask_val)
> > +             if (!hmask)
> >                       hbase = hartid;
> > -             hmask_val |= 1UL << (hartid - hbase);
> > +             hmask |= 1UL << (hartid - hbase);
> >       }
> >
> > -     if (hmask_val) {
> > -             result = __sbi_rfence_v02_call(fid, hmask_val, hbase,
> > +     if (hmask) {
> > +             result = __sbi_rfence_v02_call(fid, hmask, hbase,
> >                                              start, size, arg4, arg5);
> >               if (result)
> >                       return result;
> > @@ -361,44 +388,44 @@ void sbi_set_timer(uint64_t stime_value)
> >
> > /**
> >  * sbi_send_ipi() - Send an IPI to any hart.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_send_ipi(const unsigned long *hart_mask)
> > +int sbi_send_ipi(const struct cpumask *cpu_mask)
> > {
> > -     return __sbi_send_ipi(hart_mask);
> > +     return __sbi_send_ipi(cpu_mask);
> > }
> > EXPORT_SYMBOL(sbi_send_ipi);
> >
> > /**
> >  * sbi_remote_fence_i() - Execute FENCE.I instruction on given remote harts.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_remote_fence_i(const unsigned long *hart_mask)
> > +int sbi_remote_fence_i(const struct cpumask *cpu_mask)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_FENCE_I,
> > -                         hart_mask, 0, 0, 0, 0);
> > +                         cpu_mask, 0, 0, 0, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_fence_i);
> >
> > /**
> >  * sbi_remote_sfence_vma() - Execute SFENCE.VMA instructions on given remote
> >  *                         harts for the specified virtual address range.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the virtual address
> >  * @size: Total size of the virtual address range.
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_remote_sfence_vma(const unsigned long *hart_mask,
> > +int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
> > -                         hart_mask, start, size, 0, 0);
> > +                         cpu_mask, start, size, 0, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_sfence_vma);
> >
> > @@ -406,38 +433,38 @@ EXPORT_SYMBOL(sbi_remote_sfence_vma);
> >  * sbi_remote_sfence_vma_asid() - Execute SFENCE.VMA instructions on given
> >  * remote harts for a virtual address range belonging to a specific ASID.
> >  *
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the virtual address
> >  * @size: Total size of the virtual address range.
> >  * @asid: The value of address space identifier (ASID).
> >  *
> >  * Return: 0 on success, appropriate linux error code otherwise.
> >  */
> > -int sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
> > -                         hart_mask, start, size, asid, 0);
> > +                         cpu_mask, start, size, asid, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_sfence_vma_asid);
> >
> > /**
> >  * sbi_remote_hfence_gvma() - Execute HFENCE.GVMA instructions on given remote
> >  *                       harts for the specified guest physical address range.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the guest physical address
> >  * @size: Total size of the guest physical address range.
> >  *
> >  * Return: None
> >  */
> > -int sbi_remote_hfence_gvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
> > -                         hart_mask, start, size, 0, 0);
> > +                         cpu_mask, start, size, 0, 0);
> > }
> > EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
> >
> > @@ -445,38 +472,38 @@ EXPORT_SYMBOL_GPL(sbi_remote_hfence_gvma);
> >  * sbi_remote_hfence_gvma_vmid() - Execute HFENCE.GVMA instructions on given
> >  * remote harts for a guest physical address range belonging to a specific VMID.
> >  *
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the guest physical address
> >  * @size: Total size of the guest physical address range.
> >  * @vmid: The value of guest ID (VMID).
> >  *
> >  * Return: 0 if success, Error otherwise.
> >  */
> > -int sbi_remote_hfence_gvma_vmid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_gvma_vmid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long vmid)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
> > -                         hart_mask, start, size, vmid, 0);
> > +                         cpu_mask, start, size, vmid, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_hfence_gvma_vmid);
> >
> > /**
> >  * sbi_remote_hfence_vvma() - Execute HFENCE.VVMA instructions on given remote
> >  *                         harts for the current guest virtual address range.
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the current guest virtual address
> >  * @size: Total size of the current guest virtual address range.
> >  *
> >  * Return: None
> >  */
> > -int sbi_remote_hfence_vvma(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma(const struct cpumask *cpu_mask,
> >                          unsigned long start,
> >                          unsigned long size)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
> > -                         hart_mask, start, size, 0, 0);
> > +                         cpu_mask, start, size, 0, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_hfence_vvma);
> >
> > @@ -485,20 +512,20 @@ EXPORT_SYMBOL(sbi_remote_hfence_vvma);
> >  * remote harts for current guest virtual address range belonging to a specific
> >  * ASID.
> >  *
> > - * @hart_mask: A cpu mask containing all the target harts.
> > + * @cpu_mask: A cpu mask containing all the target harts.
> >  * @start: Start of the current guest virtual address
> >  * @size: Total size of the current guest virtual address range.
> >  * @asid: The value of address space identifier (ASID).
> >  *
> >  * Return: None
> >  */
> > -int sbi_remote_hfence_vvma_asid(const unsigned long *hart_mask,
> > +int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
> >                               unsigned long start,
> >                               unsigned long size,
> >                               unsigned long asid)
> > {
> >       return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
> > -                         hart_mask, start, size, asid, 0);
> > +                         cpu_mask, start, size, asid, 0);
> > }
> > EXPORT_SYMBOL(sbi_remote_hfence_vvma_asid);
> >
> > @@ -591,11 +618,7 @@ long sbi_get_mimpid(void)
> >
> > static void sbi_send_cpumask_ipi(const struct cpumask *target)
> > {
> > -     struct cpumask hartid_mask;
> > -
> > -     riscv_cpuid_to_hartid_mask(target, &hartid_mask);
> > -
> > -     sbi_send_ipi(cpumask_bits(&hartid_mask));
> > +     sbi_send_ipi(target);
> > }
> >
> > static const struct riscv_ipi_ops sbi_ipi_ops = {
> > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > index 63241abe84eb..b42bfdc67482 100644
> > --- a/arch/riscv/kernel/setup.c
> > +++ b/arch/riscv/kernel/setup.c
> > @@ -59,16 +59,6 @@ atomic_t hart_lottery __section(".sdata")
> > unsigned long boot_cpu_hartid;
> > static DEFINE_PER_CPU(struct cpu, cpu_devices);
> >
> > -void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out)
> > -{
> > -     int cpu;
> > -
> > -     cpumask_clear(out);
> > -     for_each_cpu(cpu, in)
> > -             cpumask_set_cpu(cpuid_to_hartid_map(cpu), out);
> > -}
> > -EXPORT_SYMBOL_GPL(riscv_cpuid_to_hartid_mask);
> > -
> > /*
> >  * Place kernel memory regions on the resource tree so that
> >  * kexec-tools can retrieve them from /proc/iomem. While there
> > diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> > index bd82375db51a..622f226454d5 100644
> > --- a/arch/riscv/kernel/smpboot.c
> > +++ b/arch/riscv/kernel/smpboot.c
> > @@ -96,7 +96,7 @@ void __init setup_smp(void)
> >               if (cpuid >= NR_CPUS) {
> >                       pr_warn("Invalid cpuid [%d] for hartid [%d]\n",
> >                               cpuid, hart);
> > -                     break;
> > +                     continue;
> >               }
> >
> >               cpuid_to_hartid_map(cpuid) = hart;
> > diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> > index 9af67dbdc66a..f80a34fbf102 100644
> > --- a/arch/riscv/kvm/mmu.c
> > +++ b/arch/riscv/kvm/mmu.c
> > @@ -114,7 +114,6 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr,
> >
> > static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> > {
> > -     struct cpumask hmask;
> >       unsigned long size = PAGE_SIZE;
> >       struct kvm_vmid *vmid = &kvm->arch.vmid;
> >
> > @@ -127,8 +126,7 @@ static void stage2_remote_tlb_flush(struct kvm *kvm, u32 level, gpa_t addr)
> >        * where the Guest/VM is running.
> >        */
> >       preempt_disable();
> > -     riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> > -     sbi_remote_hfence_gvma_vmid(cpumask_bits(&hmask), addr, size,
> > +     sbi_remote_hfence_gvma_vmid(cpu_online_mask, addr, size,
> >                                   READ_ONCE(vmid->vmid));
> >       preempt_enable();
> > }
> > diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> > index 00036b7f83b9..1bc0608a5bfd 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> > @@ -82,7 +82,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> > {
> >       int ret = 0;
> >       unsigned long i;
> > -     struct cpumask cm, hm;
> > +     struct cpumask cm;
> >       struct kvm_vcpu *tmp;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       unsigned long hmask = cp->a0;
> > @@ -90,7 +90,6 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >       unsigned long funcid = cp->a6;
> >
> >       cpumask_clear(&cm);
> > -     cpumask_clear(&hm);
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               if (hbase != -1UL) {
> >                       if (tmp->vcpu_id < hbase)
> > @@ -103,17 +102,15 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >               cpumask_set_cpu(tmp->cpu, &cm);
> >       }
> >
> > -     riscv_cpuid_to_hartid_mask(&cm, &hm);
> > -
> >       switch (funcid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> > -             ret = sbi_remote_fence_i(cpumask_bits(&hm));
> > +             ret = sbi_remote_fence_i(&cm);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> > -             ret = sbi_remote_hfence_vvma(cpumask_bits(&hm), cp->a2, cp->a3);
> > +             ret = sbi_remote_hfence_vvma(&cm, cp->a2, cp->a3);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> > -             ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm), cp->a2,
> > +             ret = sbi_remote_hfence_vvma_asid(&cm, cp->a2,
> >                                                 cp->a3, cp->a4);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> > diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> > index 4c7e13ec9ccc..07e2de14433a 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> > @@ -38,7 +38,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >       int i, ret = 0;
> >       u64 next_cycle;
> >       struct kvm_vcpu *rvcpu;
> > -     struct cpumask cm, hm;
> > +     struct cpumask cm;
> >       struct kvm *kvm = vcpu->kvm;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >
> > @@ -101,15 +101,12 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >                               continue;
> >                       cpumask_set_cpu(rvcpu->cpu, &cm);
> >               }
> > -             riscv_cpuid_to_hartid_mask(&cm, &hm);
> >               if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> > -                     ret = sbi_remote_fence_i(cpumask_bits(&hm));
> > +                     ret = sbi_remote_fence_i(&cm);
> >               else if (cp->a7 == SBI_EXT_0_1_REMOTE_SFENCE_VMA)
> > -                     ret = sbi_remote_hfence_vvma(cpumask_bits(&hm),
> > -                                             cp->a1, cp->a2);
> > +                     ret = sbi_remote_hfence_vvma(&cm, cp->a1, cp->a2);
> >               else
> > -                     ret = sbi_remote_hfence_vvma_asid(cpumask_bits(&hm),
> > -                                             cp->a1, cp->a2, cp->a3);
> > +                     ret = sbi_remote_hfence_vvma_asid(&cm, cp->a1, cp->a2, cp->a3);
> >               break;
> >       default:
> >               ret = -EINVAL;
> > diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
> > index 807228f8f409..2fa4f7b1813d 100644
> > --- a/arch/riscv/kvm/vmid.c
> > +++ b/arch/riscv/kvm/vmid.c
> > @@ -67,7 +67,6 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> > {
> >       unsigned long i;
> >       struct kvm_vcpu *v;
> > -     struct cpumask hmask;
> >       struct kvm_vmid *vmid = &vcpu->kvm->arch.vmid;
> >
> >       if (!kvm_riscv_stage2_vmid_ver_changed(vmid))
> > @@ -102,8 +101,7 @@ void kvm_riscv_stage2_vmid_update(struct kvm_vcpu *vcpu)
> >                * running, we force VM exits on all host CPUs using IPI and
> >                * flush all Guest TLBs.
> >                */
> > -             riscv_cpuid_to_hartid_mask(cpu_online_mask, &hmask);
> > -             sbi_remote_hfence_gvma(cpumask_bits(&hmask), 0, 0);
> > +             sbi_remote_hfence_gvma(cpu_online_mask, 0, 0);
> >       }
> >
> >       vmid->vmid = vmid_next;
> > diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> > index 89f81067e09e..6cb7d96ad9c7 100644
> > --- a/arch/riscv/mm/cacheflush.c
> > +++ b/arch/riscv/mm/cacheflush.c
> > @@ -67,10 +67,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
> >                */
> >               smp_mb();
> >       } else if (IS_ENABLED(CONFIG_RISCV_SBI)) {
> > -             cpumask_t hartid_mask;
> > -
> > -             riscv_cpuid_to_hartid_mask(&others, &hartid_mask);
> > -             sbi_remote_fence_i(cpumask_bits(&hartid_mask));
> > +             sbi_remote_fence_i(&others);
> >       } else {
> >               on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
> >       }
> > diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> > index 64f8201237c2..37ed760d007c 100644
> > --- a/arch/riscv/mm/tlbflush.c
> > +++ b/arch/riscv/mm/tlbflush.c
> > @@ -32,7 +32,6 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> >                                 unsigned long size, unsigned long stride)
> > {
> >       struct cpumask *cmask = mm_cpumask(mm);
> > -     struct cpumask hmask;
> >       unsigned int cpuid;
> >       bool broadcast;
> >
> > @@ -46,9 +45,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> >               unsigned long asid = atomic_long_read(&mm->context.id);
> >
> >               if (broadcast) {
> > -                     riscv_cpuid_to_hartid_mask(cmask, &hmask);
> > -                     sbi_remote_sfence_vma_asid(cpumask_bits(&hmask),
> > -                                                start, size, asid);
> > +                     sbi_remote_sfence_vma_asid(cmask, start, size, asid);
> >               } else if (size <= stride) {
> >                       local_flush_tlb_page_asid(start, asid);
> >               } else {
> > @@ -56,9 +53,7 @@ static void __sbi_tlb_flush_range(struct mm_struct *mm, unsigned long start,
> >               }
> >       } else {
> >               if (broadcast) {
> > -                     riscv_cpuid_to_hartid_mask(cmask, &hmask);
> > -                     sbi_remote_sfence_vma(cpumask_bits(&hmask),
> > -                                           start, size);
> > +                     sbi_remote_sfence_vma(cmask, start, size);
> >               } else if (size <= stride) {
> >                       local_flush_tlb_page(start);
> >               } else {
> > --
> > 2.30.2
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>


-- 
Regards,
Atish


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-26  2:21       ` Atish Patra
@ 2022-01-26  8:28         ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-26  8:28 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

Hi Atish,

On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > cpumask. A cpumask data structure can hold up to NR_CPUS values. Thus, it
> > > is not the correct data structure for hartids, as they can be higher
> > > than NR_CPUS on platforms with sparse or discontiguous hartids.
> > >
> > > Remove all association between hartid mask and struct cpumask.
> > >
> > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > Signed-off-by: Atish Patra <atishp@rivosinc.com>

> I am yet to reproduce it on my end.
> @Geert Uytterhoeven: can you please try the below diff on your end.

Unfortunately it doesn't fix the issue for me.

/me debugging...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-26  8:28         ` Geert Uytterhoeven
@ 2022-01-26  9:10           ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-26  9:10 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

Hi Atish,

On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > cpumask. A cpumask data structure can hold up to NR_CPUS values. Thus, it
> > > > is not the correct data structure for hartids, as they can be higher
> > > > than NR_CPUS on platforms with sparse or discontiguous hartids.
> > > >
> > > > Remove all association between hartid mask and struct cpumask.
> > > >
> > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
>
> > I am yet to reproduce it on my end.
> > @Geert Uytterhoeven: can you please try the below diff on your end.
>
> Unfortunately it doesn't fix the issue for me.
>
> /me debugging...

Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
hbase = 0.

cpuid 1 maps to  hartid 0
cpuid 0 maps to hartid 1

    __sbi_rfence_v02:364: cpuid 1 hartid 0
    __sbi_rfence_v02:377: hartid 0 hbase 1
    hmask |= 1UL << (hartid - hbase);

oops

    __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask 8000000000000001 hbase 1

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
@ 2022-01-26  9:10           ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-26  9:10 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

Hi Atish,

On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
> > > > is not the correct data structure for hartids as it can be higher
> > > > than NR_CPUs for platforms with sparse or discontguous hartids.
> > > >
> > > > Remove all association between hartid mask and struct cpumask.
> > > >
> > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
>
> > I am yet to reproduce it on my end.
> > @Geert Uytterhoeven: can you please try the below diff on your end.
>
> Unfortunately it doesn't fix the issue for me.
>
> /me debugging...

Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
hbase = 0.

cpuid 1 maps to  hartid 0
cpuid 0 maps to hartid 1

    __sbi_rfence_v02:364: cpuid 1 hartid 0
    __sbi_rfence_v02:377: hartid 0 hbase 1
    hmask |= 1UL << (hartid - hbase);

oops

    __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask 8000000000000001 hbase 1

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-26  9:10           ` Geert Uytterhoeven
@ 2022-01-27  1:01             ` Atish Patra
  -1 siblings, 0 replies; 55+ messages in thread
From: Atish Patra @ 2022-01-27  1:01 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Atish,
>
> On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > > cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
> > > > > is not the correct data structure for hartids as it can be higher
> > > > > than NR_CPUs for platforms with sparse or discontguous hartids.
> > > > >
> > > > > Remove all association between hartid mask and struct cpumask.
> > > > >
> > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> >
> > > I am yet to reproduce it on my end.
> > > @Geert Uytterhoeven: can you please try the below diff on your end.
> >
> > Unfortunately it doesn't fix the issue for me.
> >
> > /me debugging...
>
> Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> hbase = 0.
>
> cpuid 1 maps to  hartid 0
> cpuid 0 maps to hartid 1
>
>     __sbi_rfence_v02:364: cpuid 1 hartid 0
>     __sbi_rfence_v02:377: hartid 0 hbase 1
>     hmask |= 1UL << (hartid - hbase);
>
> oops
>
>     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> 8000000000000001 hbase 1
>

Ahh yes. hmask will be incorrect if the boot CPU (cpu 0) has a higher
hartid and it is trying to do a remote TLB flush/IPI
to a lower hartid. We should generate the hartid array before the loop.

Can you try this diff? It seems to work for me across multiple boot
cycles on the Unleashed.

You can find the patch here as well
https://github.com/atishp04/linux/commits/v5.17-rc1

--------------------------------------------------------------------------------------------------------------------------------
diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index f72527fcb347..4ebeb5813edc 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -8,6 +8,8 @@
 #include <linux/init.h>
 #include <linux/pm.h>
 #include <linux/reboot.h>
+#include <linux/sort.h>
+
 #include <asm/sbi.h>
 #include <asm/smp.h>

@@ -85,7 +87,7 @@ static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
  pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
  break;
  }
- hmask |= 1 << hartid;
+ hmask |= 1UL << hartid;
  }

  return hmask;
@@ -160,7 +162,7 @@ static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
 {
  unsigned long hart_mask;

- if (!cpu_mask)
+ if (!cpu_mask || cpumask_empty(cpu_mask))
  cpu_mask = cpu_online_mask;
  hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);

@@ -176,7 +178,7 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
  int result = 0;
  unsigned long hart_mask;

- if (!cpu_mask)
+ if (!cpu_mask || cpumask_empty(cpu_mask))
  cpu_mask = cpu_online_mask;
  hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);

@@ -236,6 +238,18 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
 static void sbi_set_power_off(void) {}
 #endif /* CONFIG_RISCV_SBI_V01 */

+static int cmp_ulong(const void *A, const void *B)
+{
+ const unsigned long *a = A, *b = B;
+
+ if (*a < *b)
+ return -1;
+ else if (*a > *b)
+ return 1;
+ else
+ return 0;
+}
+
 static void __sbi_set_timer_v02(uint64_t stime_value)
 {
 #if __riscv_xlen == 32
@@ -251,13 +265,22 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
 {
  unsigned long hartid, cpuid, hmask = 0, hbase = 0;
  struct sbiret ret = {0};
- int result;
+ int result, index = 0, max_index = 0;
+ unsigned long hartid_arr[NR_CPUS] = {0};

- if (!cpu_mask)
+ if (!cpu_mask || cpumask_empty(cpu_mask))
  cpu_mask = cpu_online_mask;

  for_each_cpu(cpuid, cpu_mask) {
  hartid = cpuid_to_hartid_map(cpuid);
+ hartid_arr[index] = hartid;
+ index++;
+ }
+
+ max_index = index;
+ sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
+ for (index = 0; index < max_index; index++) {
+ hartid = hartid_arr[index];
  if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
  ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
  hmask, hbase, 0, 0, 0, 0);
@@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
      unsigned long arg4, unsigned long arg5)
 {
  unsigned long hartid, cpuid, hmask = 0, hbase = 0;
- int result;
+ int result, index = 0, max_index = 0;
+ unsigned long hartid_arr[NR_CPUS] = {0};

- if (!cpu_mask)
+ if (!cpu_mask || cpumask_empty(cpu_mask))
  cpu_mask = cpu_online_mask;

  for_each_cpu(cpuid, cpu_mask) {
  hartid = cpuid_to_hartid_map(cpuid);
+ hartid_arr[index] = hartid;
+ index++;
+ }
+ max_index = index;
+ sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
+ for (index = 0; index < max_index; index++) {
+ hartid = hartid_arr[index];
  if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
  result = __sbi_rfence_v02_call(fid, hmask, hbase,
         start, size, arg4, arg5);

--------------------------------------------------------------------------------------------------------------------------------

> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds



-- 
Regards,
Atish

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-27  1:01             ` Atish Patra
@ 2022-01-27  8:48               ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-27  8:48 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

Hi Atish,

On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
> On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > > > cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
> > > > > > is not the correct data structure for hartids as it can be higher
> > > > > > than NR_CPUs for platforms with sparse or discontguous hartids.
> > > > > >
> > > > > > Remove all association between hartid mask and struct cpumask.
> > > > > >
> > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > >
> > > > I am yet to reproduce it on my end.
> > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> > >
> > > Unfortunately it doesn't fix the issue for me.
> > >
> > > /me debugging...
> >
> > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> > hbase = 0.
> >
> > cpuid 1 maps to  hartid 0
> > cpuid 0 maps to hartid 1
> >
> >     __sbi_rfence_v02:364: cpuid 1 hartid 0
> >     __sbi_rfence_v02:377: hartid 0 hbase 1
> >     hmask |= 1UL << (hartid - hbase);
> >
> > oops
> >
> >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> > 8000000000000001 hbase 1
> >
>
> Ahh yes. hmask will be incorrect if the bootcpu(cpu 0) is a higher
> hartid and it is trying to do a remote tlb flush/IPI
> to lower the hartid. We should generate the hartid array before the loop.
>
> Can you try this diff ? It seems to work for me during multiple boot
> cycle on the unleashed.
>
> You can find the patch here as well
> https://github.com/atishp04/linux/commits/v5.17-rc1

Thanks, that fixes the issue for me.

> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -8,6 +8,8 @@
>  #include <linux/init.h>
>  #include <linux/pm.h>
>  #include <linux/reboot.h>
> +#include <linux/sort.h>
> +
>  #include <asm/sbi.h>
>  #include <asm/smp.h>
>
> @@ -85,7 +87,7 @@ static unsigned long
> __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
>   pr_warn("Unable to send any request to hartid > BITS_PER_LONG for
> SBI v0.1\n");
>   break;
>   }
> - hmask |= 1 << hartid;
> + hmask |= 1UL << hartid;
>   }
>
>   return hmask;
> @@ -160,7 +162,7 @@ static int __sbi_send_ipi_v01(const struct cpumask
> *cpu_mask)
>  {
>   unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>   hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -176,7 +178,7 @@ static int __sbi_rfence_v01(int fid, const struct
> cpumask *cpu_mask,
>   int result = 0;
>   unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>   hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -236,6 +238,18 @@ static int __sbi_rfence_v01(int fid, const struct
> cpumask *cpu_mask,
>  static void sbi_set_power_off(void) {}
>  #endif /* CONFIG_RISCV_SBI_V01 */
>
> +static int cmp_ulong(const void *A, const void *B)
> +{
> + const unsigned long *a = A, *b = B;
> +
> + if (*a < *b)
> + return -1;
> + else if (*a > *b)
> + return 1;
> + else
> + return 0;
> +}
> +
>  static void __sbi_set_timer_v02(uint64_t stime_value)
>  {
>  #if __riscv_xlen == 32
> @@ -251,13 +265,22 @@ static int __sbi_send_ipi_v02(const struct
> cpumask *cpu_mask)
>  {
>   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>   struct sbiret ret = {0};
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>
>   for_each_cpu(cpuid, cpu_mask) {
>   hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;
> + index++;
> + }
> +
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>   if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>   ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
>   hmask, hbase, 0, 0, 0, 0);
> @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
> struct cpumask *cpu_mask,
>       unsigned long arg4, unsigned long arg5)
>  {
>   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};

That's up to 256 bytes on the stack. And more if the maximum
number of cores is increased.

>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>
>   for_each_cpu(cpuid, cpu_mask) {
>   hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;
> + index++;
> + }
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];

That looks expensive to me.

What about shifting hmask and adjusting hbase if a hartid is
lower than the current hbase?

>   if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>   result = __sbi_rfence_v02_call(fid, hmask, hbase,
>          start, size, arg4, arg5);

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-27  8:48               ` Geert Uytterhoeven
@ 2022-01-27  8:48                 ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-27  8:48 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

On Thu, Jan 27, 2022 at 9:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
> > On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > > > > cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
> > > > > > > is not the correct data structure for hartids as it can be higher
> > > > > > > than NR_CPUs for platforms with sparse or discontguous hartids.
> > > > > > >
> > > > > > > Remove all association between hartid mask and struct cpumask.
> > > > > > >
> > > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > > >
> > > > > I am yet to reproduce it on my end.
> > > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> > > >
> > > > Unfortunately it doesn't fix the issue for me.
> > > >
> > > > /me debugging...
> > >
> > > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> > > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> > > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> > > hbase = 0.
> > >
> > > cpuid 1 maps to  hartid 0
> > > cpuid 0 maps to hartid 1
> > >
> > >     __sbi_rfence_v02:364: cpuid 1 hartid 0
> > >     __sbi_rfence_v02:377: hartid 0 hbase 1
> > >     hmask |= 1UL << (hartid - hbase);
> > >
> > > oops
> > >
> > >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> > > 8000000000000001 hbase 1
> > >
> >
> > Ahh yes. hmask will be incorrect if the bootcpu(cpu 0) is a higher
> > hartid and it is trying to do a remote tlb flush/IPI
> > to lower the hartid. We should generate the hartid array before the loop.
> >
> > Can you try this diff? It seems to work for me across multiple boot
> > cycles on the Unleashed.
> >
> > You can find the patch here as well
> > https://github.com/atishp04/linux/commits/v5.17-rc1
>
> Thanks, that fixes the issue for me.

Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread
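The wrap-around Geert traced above can be reproduced outside the kernel. The sketch below is a hypothetical, simplified replay of the pre-fix mask-building loop in __sbi_rfence_v02() (the function and variable names here are illustrative, not the kernel's exact code). It assumes an LP64 target and reduces the shift count mod 64 explicitly, which is what RV64/x86-64 hardware does implicitly; in C the out-of-range shift itself is undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG 64   /* LP64 assumption (RV64, x86-64) */

/* Replay of the pre-fix mask building: hbase is fixed by the first
 * hart visited (in cpuid order), so a later, lower hartid makes the
 * unsigned subtraction wrap.  The explicit "% BITS_PER_LONG" models
 * the hardware's shift-count masking; the kernel code had no such
 * reduction, so the shift was formally undefined. */
static void buggy_build_mask(const unsigned long *harts, size_t n,
                             unsigned long *hmask, unsigned long *hbase)
{
    *hmask = 0;
    *hbase = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned long hartid = harts[i];
        if (!*hmask)
            *hbase = hartid;                 /* first hart fixes the base */
        /* hartid 0 with hbase 1: (0 - 1) wraps to ULONG_MAX -> bit 63 */
        *hmask |= 1UL << ((hartid - *hbase) % BITS_PER_LONG);
    }
}
```

Feeding in the report's cpuid-to-hartid order {1, 0} yields hmask 0x8000000000000001 with hbase 1, exactly the bogus ecall arguments in the debug log above.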

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-27  1:01             ` Atish Patra
@ 2022-01-27  9:56               ` Ron Economos
  -1 siblings, 0 replies; 55+ messages in thread
From: Ron Economos @ 2022-01-27  9:56 UTC (permalink / raw)
  To: Atish Patra, Geert Uytterhoeven
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

On 1/26/22 17:01, Atish Patra wrote:
> On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> Hi Atish,
>>
>> On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>>> On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
>>>> On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>>> On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
>>>>>> Currently, SBI APIs accept a hartmask that is generated from struct
>>>>>> cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus,
>>>>>> it is not the correct data structure for hartids, which can be higher
>>>>>> than NR_CPUS on platforms with sparse or discontiguous hartids.
>>>>>>
>>>>>> Remove all association between hartid mask and struct cpumask.
>>>>>>
>>>>>> Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
>>>>>> Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
>>>>>> Signed-off-by: Atish Patra <atishp@rivosinc.com>
>>>> I am yet to reproduce it on my end.
>>>> @Geert Uytterhoeven: can you please try the below diff on your end.
>>> Unfortunately it doesn't fix the issue for me.
>>>
>>> /me debugging...
>> Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
>> SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
>> hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
>> hbase = 0.
>>
>> cpuid 1 maps to  hartid 0
>> cpuid 0 maps to hartid 1
>>
>>      __sbi_rfence_v02:364: cpuid 1 hartid 0
>>      __sbi_rfence_v02:377: hartid 0 hbase 1
>>      hmask |= 1UL << (hartid - hbase);
>>
>> oops
>>
>>      __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
>> 8000000000000001 hbase 1
>>
> Ahh yes. hmask will be incorrect if the boot cpu (cpu 0) has a higher
> hartid and it is trying to do a remote TLB flush/IPI to a lower hartid.
> We should generate the hartid array before the loop.
>
> Can you try this diff? It seems to work for me across multiple boot
> cycles on the Unleashed.
>
> You can find the patch here as well
> https://github.com/atishp04/linux/commits/v5.17-rc1
>
> --------------------------------------------------------------------------------------------------------------------------------
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index f72527fcb347..4ebeb5813edc 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -8,6 +8,8 @@
>   #include <linux/init.h>
>   #include <linux/pm.h>
>   #include <linux/reboot.h>
> +#include <linux/sort.h>
> +
>   #include <asm/sbi.h>
>   #include <asm/smp.h>
>
> @@ -85,7 +87,7 @@ static unsigned long
> __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
>    pr_warn("Unable to send any request to hartid > BITS_PER_LONG for
> SBI v0.1\n");
>    break;
>    }
> - hmask |= 1 << hartid;
> + hmask |= 1UL << hartid;
>    }
>
>    return hmask;
> @@ -160,7 +162,7 @@ static int __sbi_send_ipi_v01(const struct cpumask
> *cpu_mask)
>   {
>    unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>    cpu_mask = cpu_online_mask;
>    hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -176,7 +178,7 @@ static int __sbi_rfence_v01(int fid, const struct
> cpumask *cpu_mask,
>    int result = 0;
>    unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>    cpu_mask = cpu_online_mask;
>    hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -236,6 +238,18 @@ static int __sbi_rfence_v01(int fid, const struct
> cpumask *cpu_mask,
>   static void sbi_set_power_off(void) {}
>   #endif /* CONFIG_RISCV_SBI_V01 */
>
> +static int cmp_ulong(const void *A, const void *B)
> +{
> + const unsigned long *a = A, *b = B;
> +
> + if (*a < *b)
> + return -1;
> + else if (*a > *b)
> + return 1;
> + else
> + return 0;
> +}
> +
>   static void __sbi_set_timer_v02(uint64_t stime_value)
>   {
>   #if __riscv_xlen == 32
> @@ -251,13 +265,22 @@ static int __sbi_send_ipi_v02(const struct
> cpumask *cpu_mask)
>   {
>    unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>    struct sbiret ret = {0};
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>    cpu_mask = cpu_online_mask;
>
>    for_each_cpu(cpuid, cpu_mask) {
>    hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;
> + index++;
> + }
> +
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>    if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>    ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
>    hmask, hbase, 0, 0, 0, 0);
> @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
> struct cpumask *cpu_mask,
>        unsigned long arg4, unsigned long arg5)
>   {
>    unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>    cpu_mask = cpu_online_mask;
>
>    for_each_cpu(cpuid, cpu_mask) {
>    hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;
> + index++;
> + }
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>    if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>    result = __sbi_rfence_v02_call(fid, hmask, hbase,
>           start, size, arg4, arg5);
>
> --------------------------------------------------------------------------------------------------------------------------------

Works good here. No systemd segfaults on Unmatched.

Tested-by: Ron Economos <re@w6rz.net>


^ permalink raw reply	[flat|nested] 55+ messages in thread
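The intent of the sort-based diff above can be checked in isolation. The model below is a hypothetical user-space reduction (the SBI ecall is replaced by recording each (hmask, hbase) batch; the names are illustrative): once the hartids are sorted ascending, every hart falls at or above the current hbase, so the shift can no longer underflow, and a new batch starts whenever a hart no longer fits in the current BITS_PER_LONG-wide window.

```c
#include <assert.h>
#include <stdlib.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

struct call { unsigned long hmask, hbase; };

static int cmp_ulong(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Sort the hartids, then emit one (hmask, hbase) batch per
 * BITS_PER_LONG-wide window, mirroring the loop in the diff above.
 * Returns the number of batches recorded in out[]. */
static int batch_harts(unsigned long *harts, int n, struct call *out)
{
    unsigned long hmask = 0, hbase = 0;
    int i, ncalls = 0;

    qsort(harts, n, sizeof(*harts), cmp_ulong);
    for (i = 0; i < n; i++) {
        unsigned long hartid = harts[i];
        if (hmask && hbase + BITS_PER_LONG <= hartid) {
            /* current window is full: flush it, as the ecall would */
            out[ncalls++] = (struct call){ hmask, hbase };
            hmask = 0;
        }
        if (!hmask)
            hbase = hartid;
        hmask |= 1UL << (hartid - hbase);   /* sorted, so hartid >= hbase */
    }
    if (hmask)
        out[ncalls++] = (struct call){ hmask, hbase };
    return ncalls;
}
```

With the report's order {1, 0} this collapses to a single batch of hmask 3 and hbase 0, matching the pre-series behaviour, while a sparse pair such as {0, 100} splits into two batches.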

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-27  8:48               ` Geert Uytterhoeven
@ 2022-01-27 10:03                 ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-27 10:03 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

Hi Atish,

On Thu, Jan 27, 2022 at 9:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
> > Ahh yes. hmask will be incorrect if the boot cpu (cpu 0) has a higher
> > hartid and it is trying to do a remote TLB flush/IPI to a lower hartid.
> > We should generate the hartid array before the loop.
> >
> > Can you try this diff? It seems to work for me across multiple boot
> > cycles on the Unleashed.
> >
> > You can find the patch here as well
> > https://github.com/atishp04/linux/commits/v5.17-rc1
>
> Thanks, that fixes the issue for me.
>
> > --- a/arch/riscv/kernel/sbi.c
> > +++ b/arch/riscv/kernel/sbi.c

> > @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
> > struct cpumask *cpu_mask,
> >       unsigned long arg4, unsigned long arg5)
> >  {
> >   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> > - int result;
> > + int result, index = 0, max_index = 0;
> > + unsigned long hartid_arr[NR_CPUS] = {0};
>
> That's up to 256 bytes on the stack. And more if the maximum
> number of cores is increased.

I.e. 4 KiB with the proposed increase to 256 CPUs, as mentioned in
https://lore.kernel.org/all/CAAhSdy2xTW0FkwvS2dExOb7q1dVruFfTP_Vh_jWju+yi7thCeA@mail.gmail.com/

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-27 10:03                 ` Geert Uytterhoeven
@ 2022-01-27 10:17                   ` Andreas Schwab
  -1 siblings, 0 replies; 55+ messages in thread
From: Andreas Schwab @ 2022-01-27 10:17 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Atish Patra, Jessica Clarke, Atish Patra,
	Linux Kernel Mailing List, Anup Patel, Albert Ou, Damien Le Moal,
	devicetree, Jisheng Zhang, Krzysztof Kozlowski, linux-riscv,
	Palmer Dabbelt, Paul Walmsley, Rob Herring

On Jan 27 2022, Geert Uytterhoeven wrote:

> Hi Atish,
>
> On Thu, Jan 27, 2022 at 9:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
>> > Ahh yes. hmask will be incorrect if the boot cpu (cpu 0) has a higher
>> > hartid and it is trying to do a remote TLB flush/IPI to a lower hartid.
>> > We should generate the hartid array before the loop.
>> >
>> > Can you try this diff? It seems to work for me across multiple boot
>> > cycles on the Unleashed.
>> >
>> > You can find the patch here as well
>> > https://github.com/atishp04/linux/commits/v5.17-rc1
>>
>> Thanks, that fixes the issue for me.
>>
>> > --- a/arch/riscv/kernel/sbi.c
>> > +++ b/arch/riscv/kernel/sbi.c
>
>> > @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
>> > struct cpumask *cpu_mask,
>> >       unsigned long arg4, unsigned long arg5)
>> >  {
>> >   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>> > - int result;
>> > + int result, index = 0, max_index = 0;
>> > + unsigned long hartid_arr[NR_CPUS] = {0};
>>
>> That's up to 256 bytes on the stack. And more if the maximum
>> number of cores is increased.
>
> I.e. 4 KiB with the proposed increase to 256 CPUs, as mentioned in

And those 4K need to be cleared each time the function is called, even
if there is only a small number of cpus.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
       [not found]               ` <CAOnJCU+U0xmw-_yTEUo9ZXO5pvoJ6VCGu+jjU-Sa2MnhcAha6Q@mail.gmail.com>
@ 2022-01-28  8:39                   ` Geert Uytterhoeven
  0 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-28  8:39 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

Hi Atish,

On Fri, Jan 28, 2022 at 1:13 AM Atish Patra <atishp@atishpatra.org> wrote:
> On Thu, Jan 27, 2022 at 12:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
>> > On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> > > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> > > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
>> > > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>> > > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
>> > > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
>> > > > > > > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus,
>> > > > > > > it is not the correct data structure for hartids, which can be higher
>> > > > > > > than NR_CPUS on platforms with sparse or discontiguous hartids.
>> > > > > > >
>> > > > > > > Remove all association between hartid mask and struct cpumask.
>> > > > > > >
>> > > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
>> > > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
>> > > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
>> > > >
>> > > > > I am yet to reproduce it on my end.
>> > > > > @Geert Uytterhoeven: can you please try the below diff on your end.
>> > > >
>> > > > Unfortunately it doesn't fix the issue for me.
>> > > >
>> > > > /me debugging...
>> > >
>> > > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
>> > > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
>> > > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
>> > > hbase = 0.
>> > >
>> > > cpuid 1 maps to  hartid 0
>> > > cpuid 0 maps to hartid 1
>> > >
>> > >     __sbi_rfence_v02:364: cpuid 1 hartid 0
>> > >     __sbi_rfence_v02:377: hartid 0 hbase 1
>> > >     hmask |= 1UL << (hartid - hbase);
>> > >
>> > > oops
>> > >
>> > >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
>> > > 8000000000000001 hbase 1
>> > >
>> >
>> > Ahh yes. hmask will be incorrect if the boot cpu (cpu 0) has a higher
>> > hartid and it is trying to do a remote TLB flush/IPI to a lower hartid.
>> > We should generate the hartid array before the loop.
>> >
>> > Can you try this diff? It seems to work for me across multiple boot
>> > cycles on the Unleashed.
>> >
>> > You can find the patch here as well
>> > https://github.com/atishp04/linux/commits/v5.17-rc1

>> > @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
>> > struct cpumask *cpu_mask,
>> >       unsigned long arg4, unsigned long arg5)
>> >  {
>> >   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>> > - int result;
>> > + int result, index = 0, max_index = 0;
>> > + unsigned long hartid_arr[NR_CPUS] = {0};
>>
>> That's up to 256 bytes on the stack. And more if the maximum
>> number of cores is increased.
>>
>
> Yeah. We can switch to dynamic allocation using kmalloc based on
> the number of bits set in the cpumask.

Even more overhead...

>> > - if (!cpu_mask)
>> > + if (!cpu_mask || cpumask_empty(cpu_mask))
>> >   cpu_mask = cpu_online_mask;
>> >
>> >   for_each_cpu(cpuid, cpu_mask) {
>> >   hartid = cpuid_to_hartid_map(cpuid);
>> > + hartid_arr[index] = hartid;
>> > + index++;
>> > + }
>> > + max_index = index;
>> > + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
>> > + for (index = 0; index < max_index; index++) {
>> > + hartid = hartid_arr[index];
>>
>> That looks expensive to me.
>>
>> What about shifting hmask and adjusting hbase if a hartid is
>> lower than the current hbase?
>
> That will probably work for current systems, but it will fail when we have
> hartid > 64. The below logic assumes that the hartids are in order. We can
> have a situation where two consecutive cpuids belong to hartids that require
> two invocations of the SBI call because the number of harts exceeds
> BITS_PER_LONG.

If the number of harts exceeds BITS_PER_LONG, you always need multiple
calls, right?

I think the below (gmail-whitespace-damaged diff) should work:

--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -249,7 +249,7 @@ static void __sbi_set_timer_v02(uint64_t stime_value)

 static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
 {
-       unsigned long hartid, cpuid, hmask = 0, hbase = 0;
+       unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
        struct sbiret ret = {0};
        int result;

@@ -258,16 +258,27 @@ static int __sbi_send_ipi_v02(const struct
cpumask *cpu_mask)

        for_each_cpu(cpuid, cpu_mask) {
                hartid = cpuid_to_hartid_map(cpuid);
-               if (hmask &&
-                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
-                       ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
-                                       hmask, hbase, 0, 0, 0, 0);
-                       if (ret.error)
-                               goto ecall_failed;
-                       hmask = 0;
+               if (hmask) {
+                       if (hartid + BITS_PER_LONG <= htop ||
+                           hartid >= hbase + BITS_PER_LONG) {
+                               ret = sbi_ecall(SBI_EXT_IPI,
+                                               SBI_EXT_IPI_SEND_IPI, hmask,
+                                               hbase, 0, 0, 0, 0);
+                               if (ret.error)
+                                       goto ecall_failed;
+                               hmask = 0;
+                       } else if (hartid < hbase) {
+                               /* shift the mask to fit lower hartid */
+                               hmask <<= hbase - hartid;
+                               hbase = hartid;
+                       }
                }
-               if (!hmask)
+               if (!hmask) {
                        hbase = hartid & -BITS_PER_LONG;
+                       htop = hartid;
+               } else if (hartid > htop) {
+                       htop = hartid;
+               }
                hmask |= 1UL << (hartid - hbase);
        }

@@ -344,7 +355,7 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
                            unsigned long start, unsigned long size,
                            unsigned long arg4, unsigned long arg5)
 {
-       unsigned long hartid, cpuid, hmask = 0, hbase = 0;
+       unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
        int result;

        if (!cpu_mask || cpumask_empty(cpu_mask))
@@ -352,16 +363,26 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,

        for_each_cpu(cpuid, cpu_mask) {
                hartid = cpuid_to_hartid_map(cpuid);
-               if (hmask &&
-                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
-                       result = __sbi_rfence_v02_call(fid, hmask, hbase,
-                                                      start, size, arg4, arg5);
-                       if (result)
-                               return result;
-                       hmask = 0;
+               if (hmask) {
+                       if (hartid + BITS_PER_LONG <= htop ||
+                           hartid >= hbase + BITS_PER_LONG) {
+                               result = __sbi_rfence_v02_call(fid, hmask,
+                                               hbase, start, size, arg4, arg5);
+                               if (result)
+                                       return result;
+                               hmask = 0;
+                       } else if (hartid < hbase) {
+                               /* shift the mask to fit lower hartid */
+                               hmask <<= hbase - hartid;
+                               hbase = hartid;
+                       }
+               }
+               if (!hmask) {
+                       hbase = hartid;
+                       htop = hartid;
+               } else if (hartid > htop) {
+                       htop = hartid;
                }
-               if (!hmask)
-                       hbase = hartid & -BITS_PER_LONG;
                hmask |= 1UL << (hartid - hbase);
        }

Another simpler solution would be to just round hbase down to a
multiple of 32/64 (gmail-whitespace-damaged diff):

--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -258,16 +258,16 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)

        for_each_cpu(cpuid, cpu_mask) {
                hartid = cpuid_to_hartid_map(cpuid);
-               if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
+               if (hmask &&
+                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
                        ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
                                        hmask, hbase, 0, 0, 0, 0);
                        if (ret.error)
                                goto ecall_failed;
                        hmask = 0;
-                       hbase = 0;
                }
                if (!hmask)
-                       hbase = hartid;
+                       hbase = hartid & -BITS_PER_LONG;
                hmask |= 1UL << (hartid - hbase);
        }

@@ -352,16 +352,16 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,

        for_each_cpu(cpuid, cpu_mask) {
                hartid = cpuid_to_hartid_map(cpuid);
-               if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
+               if (hmask &&
+                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
                        result = __sbi_rfence_v02_call(fid, hmask, hbase,
                                                       start, size, arg4, arg5);
                        if (result)
                                return result;
                        hmask = 0;
-                       hbase = 0;
                }
                if (!hmask)
-                       hbase = hartid;
+                       hbase = hartid & -BITS_PER_LONG;
                hmask |= 1UL << (hartid - hbase);
        }

But that means multiple SBI calls if you have e.g. hartids 1-64.
The shifted mask solution doesn't suffer from that.
Neither solution sorts the CPUs, so both are suboptimal for
hartid numberings like 0, 64, 1, 65, ...

What do you think?
Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-28  8:39                   ` Geert Uytterhoeven
@ 2022-01-28  8:55                     ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-28  8:55 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jessica Clarke, Atish Patra, Linux Kernel Mailing List,
	Anup Patel, Albert Ou, Damien Le Moal, devicetree, Jisheng Zhang,
	Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt, Paul Walmsley,
	Rob Herring

On Fri, Jan 28, 2022 at 9:39 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Fri, Jan 28, 2022 at 1:13 AM Atish Patra <atishp@atishpatra.org> wrote:
> > On Thu, Jan 27, 2022 at 12:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> What about shifting hmask and adjusting hbase if a hartid is
> >> lower than the current hbase?
> >
> > That will probably work for current systems but it will fail when we have hartid > 64.
> > The below logic fails, as it assumes that the hartids are in order. We can have a situation
> > where two consecutive cpuids belong to hartids that require two invocations of the SBI call
> > because the number of harts exceeds BITS_PER_LONG.
>
> If the number of harts exceeds BITS_PER_LONG, you always need multiple
> calls, right?
>
> I think the below (gmail-whitespace-damaged diff) should work:
>
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -249,7 +249,7 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
>
>  static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
>  {
> -       unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> +       unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
>         struct sbiret ret = {0};
>         int result;
>
> @@ -258,16 +258,27 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
>
>         for_each_cpu(cpuid, cpu_mask) {
>                 hartid = cpuid_to_hartid_map(cpuid);
> -               if (hmask &&
> -                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {

Oops, I actually sent the diff against the simpler solution below,
not against the current code, but I guess you get the idea.
I can send a proper patch when agreed.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-27  1:01             ` Atish Patra
@ 2022-01-31  8:35               ` Anup Patel
  -1 siblings, 0 replies; 55+ messages in thread
From: Anup Patel @ 2022-01-31  8:35 UTC (permalink / raw)
  To: Atish Patra
  Cc: Geert Uytterhoeven, Jessica Clarke, Atish Patra,
	Linux Kernel Mailing List, Albert Ou, Damien Le Moal, devicetree,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring

On Thu, Jan 27, 2022 at 6:32 AM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > Hi Atish,
> >
> > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > > > cpumask. Cpumask data structure can hold up to NR_CPUs value. Thus, it
> > > > > > is not the correct data structure for hartids as it can be higher
> > > > > > than NR_CPUs for platforms with sparse or discontiguous hartids.
> > > > > >
> > > > > > Remove all association between hartid mask and struct cpumask.
> > > > > >
> > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > >
> > > > I am yet to reproduce it on my end.
> > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> > >
> > > Unfortunately it doesn't fix the issue for me.
> > >
> > > /me debugging...
> >
> > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> > hbase = 0.
> >
> > cpuid 1 maps to  hartid 0
> > cpuid 0 maps to hartid 1
> >
> >     __sbi_rfence_v02:364: cpuid 1 hartid 0
> >     __sbi_rfence_v02:377: hartid 0 hbase 1
> >     hmask |= 1UL << (hartid - hbase);
> >
> > oops
> >
> >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> > 8000000000000001 hbase 1
> >
>
> Ahh yes. hmask will be incorrect if the bootcpu (cpu 0) is a higher
> hartid and it is trying to do a remote TLB flush/IPI
> to a lower hartid. We should generate the hartid array before the loop.
>
> Can you try this diff ? It seems to work for me during multiple boot
> cycle on the unleashed.
>
> You can find the patch here as well
> https://github.com/atishp04/linux/commits/v5.17-rc1
>
> --------------------------------------------------------------------------------------------------------------------------------
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index f72527fcb347..4ebeb5813edc 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -8,6 +8,8 @@
>  #include <linux/init.h>
>  #include <linux/pm.h>
>  #include <linux/reboot.h>
> +#include <linux/sort.h>
> +
>  #include <asm/sbi.h>
>  #include <asm/smp.h>
>
> @@ -85,7 +87,7 @@ static unsigned long __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
> pr_warn("Unable to send any request to hartid > BITS_PER_LONG for SBI v0.1\n");
>   break;
>   }
> - hmask |= 1 << hartid;
> + hmask |= 1UL << hartid;
>   }
>
>   return hmask;
> @@ -160,7 +162,7 @@ static int __sbi_send_ipi_v01(const struct cpumask *cpu_mask)
>  {
>   unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>   hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -176,7 +178,7 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
>   int result = 0;
>   unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>   hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -236,6 +238,18 @@ static int __sbi_rfence_v01(int fid, const struct cpumask *cpu_mask,
>  static void sbi_set_power_off(void) {}
>  #endif /* CONFIG_RISCV_SBI_V01 */
>
> +static int cmp_ulong(const void *A, const void *B)
> +{
> + const unsigned long *a = A, *b = B;
> +
> + if (*a < *b)
> + return -1;
> + else if (*a > *b)
> + return 1;
> + else
> + return 0;
> +}
> +
>  static void __sbi_set_timer_v02(uint64_t stime_value)
>  {
>  #if __riscv_xlen == 32
> @@ -251,13 +265,22 @@ static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
>  {
>   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>   struct sbiret ret = {0};
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};

No need to clear hartid_arr[] because "index" and
"max_index" already tell us the number of entries.

>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>
>   for_each_cpu(cpuid, cpu_mask) {
>   hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;

You can create a sorted array on the fly instead of calling sort()

> + index++;
> + }
> +
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>   if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>   ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
>   hmask, hbase, 0, 0, 0, 0);
> @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
>       unsigned long arg4, unsigned long arg5)
>  {
>   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>
>   for_each_cpu(cpuid, cpu_mask) {
>   hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;
> + index++;
> + }
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>   if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>   result = __sbi_rfence_v02_call(fid, hmask, hbase,
>          start, size, arg4, arg5);
>
> --------------------------------------------------------------------------------------------------------------------------------
>
> > Gr{oetje,eeting}s,
> >
> >                         Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> >
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like that.
> >                                 -- Linus Torvalds
>
>
>
> --
> Regards,
> Atish

My main concern is the size of hartid_arr[] on the stack. Using kmalloc()
will only slow it down further.

Further, for small systems with fewer HARTs, this sorting
business will be an unnecessary overhead.

Regards,
Anup

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
@ 2022-01-31  8:35               ` Anup Patel
  0 siblings, 0 replies; 55+ messages in thread
From: Anup Patel @ 2022-01-31  8:35 UTC (permalink / raw)
  To: Atish Patra
  Cc: Geert Uytterhoeven, Jessica Clarke, Atish Patra,
	Linux Kernel Mailing List, Albert Ou, Damien Le Moal, devicetree,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring

On Thu, Jan 27, 2022 at 6:32 AM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > Hi Atish,
> >
> > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > > > > > cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
> > > > > > is not the correct data structure for hartids as it can be higher
> > > > > > than NR_CPUs for platforms with sparse or discontguous hartids.
> > > > > >
> > > > > > Remove all association between hartid mask and struct cpumask.
> > > > > >
> > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > >
> > > > I am yet to reproduce it on my end.
> > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> > >
> > > Unfortunately it doesn't fix the issue for me.
> > >
> > > /me debugging...
> >
> > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> > hbase = 0.
> >
> > cpuid 1 maps to  hartid 0
> > cpuid 0 maps to hartid 1
> >
> >     __sbi_rfence_v02:364: cpuid 1 hartid 0
> >     __sbi_rfence_v02:377: hartid 0 hbase 1
> >     hmask |= 1UL << (hartid - hbase);
> >
> > oops
> >
> >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> > 8000000000000001 hbase 1
> >
>
> Ahh yes. hmask will be incorrect if the boot CPU (cpu 0) has a higher
> hartid and it is trying to do a remote TLB flush/IPI
> to a lower hartid. We should generate the hartid array before the loop.
>
> Can you try this diff ? It seems to work for me during multiple boot
> cycle on the unleashed.
>
> You can find the patch here as well
> https://github.com/atishp04/linux/commits/v5.17-rc1
>
> --------------------------------------------------------------------------------------------------------------------------------
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index f72527fcb347..4ebeb5813edc 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -8,6 +8,8 @@
>  #include <linux/init.h>
>  #include <linux/pm.h>
>  #include <linux/reboot.h>
> +#include <linux/sort.h>
> +
>  #include <asm/sbi.h>
>  #include <asm/smp.h>
>
> @@ -85,7 +87,7 @@ static unsigned long
> __sbi_v01_cpumask_to_hartmask(const struct cpumask *cpu_mas
>   pr_warn("Unable to send any request to hartid > BITS_PER_LONG for
> SBI v0.1\n");
>   break;
>   }
> - hmask |= 1 << hartid;
> + hmask |= 1UL << hartid;
>   }
>
>   return hmask;
> @@ -160,7 +162,7 @@ static int __sbi_send_ipi_v01(const struct cpumask
> *cpu_mask)
>  {
>   unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>   hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -176,7 +178,7 @@ static int __sbi_rfence_v01(int fid, const struct
> cpumask *cpu_mask,
>   int result = 0;
>   unsigned long hart_mask;
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>   hart_mask = __sbi_v01_cpumask_to_hartmask(cpu_mask);
>
> @@ -236,6 +238,18 @@ static int __sbi_rfence_v01(int fid, const struct
> cpumask *cpu_mask,
>  static void sbi_set_power_off(void) {}
>  #endif /* CONFIG_RISCV_SBI_V01 */
>
> +static int cmp_ulong(const void *A, const void *B)
> +{
> + const unsigned long *a = A, *b = B;
> +
> + if (*a < *b)
> + return -1;
> + else if (*a > *b)
> + return 1;
> + else
> + return 0;
> +}
> +
>  static void __sbi_set_timer_v02(uint64_t stime_value)
>  {
>  #if __riscv_xlen == 32
> @@ -251,13 +265,22 @@ static int __sbi_send_ipi_v02(const struct
> cpumask *cpu_mask)
>  {
>   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
>   struct sbiret ret = {0};
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};

No need to clear the hartid_arr[] because you have "index" and
"max_index" telling us the number of entries.

>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>
>   for_each_cpu(cpuid, cpu_mask) {
>   hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;

You can create a sorted array on the fly instead of calling sort()

> + index++;
> + }
> +
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>   if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>   ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
>   hmask, hbase, 0, 0, 0, 0);
> @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
> struct cpumask *cpu_mask,
>       unsigned long arg4, unsigned long arg5)
>  {
>   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> - int result;
> + int result, index = 0, max_index = 0;
> + unsigned long hartid_arr[NR_CPUS] = {0};
>
> - if (!cpu_mask)
> + if (!cpu_mask || cpumask_empty(cpu_mask))
>   cpu_mask = cpu_online_mask;
>
>   for_each_cpu(cpuid, cpu_mask) {
>   hartid = cpuid_to_hartid_map(cpuid);
> + hartid_arr[index] = hartid;
> + index++;
> + }
> + max_index = index;
> + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> + for (index = 0; index < max_index; index++) {
> + hartid = hartid_arr[index];
>   if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
>   result = __sbi_rfence_v02_call(fid, hmask, hbase,
>          start, size, arg4, arg5);
>
> --------------------------------------------------------------------------------------------------------------------------------
>
> > Gr{oetje,eeting}s,
> >
> >                         Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> >
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like that.
> >                                 -- Linus Torvalds
>
>
>
> --
> Regards,
> Atish

My main concern is the size of hartid_arr[] on the stack. Using kmalloc()
will only slow it down further.

Further, for small systems with fewer HARTs, this sorting
business will be unnecessary overhead.

Regards,
Anup

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-28  8:39                   ` Geert Uytterhoeven
@ 2022-01-31 12:09                     ` Anup Patel
  -1 siblings, 0 replies; 55+ messages in thread
From: Anup Patel @ 2022-01-31 12:09 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Atish Patra, Jessica Clarke, Atish Patra,
	Linux Kernel Mailing List, Albert Ou, Damien Le Moal, devicetree,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring

On Fri, Jan 28, 2022 at 2:09 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Atish,
>
> On Fri, Jan 28, 2022 at 1:13 AM Atish Patra <atishp@atishpatra.org> wrote:
> > On Thu, Jan 27, 2022 at 12:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
> >> > On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> > > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> > > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> >> > > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> >> > > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> >> > > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> >> > > > > > cpumask. Cpumask data structure can hold up to NR_CPUS values. Thus, it
> >> > > > > > is not the correct data structure for hartids as they can be higher
> >> > > > > > than NR_CPUS for platforms with sparse or discontiguous hartids.
> >> > > > > > >
> >> > > > > > > Remove all association between hartid mask and struct cpumask.
> >> > > > > > >
> >> > > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> >> > > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> >> > > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> >> > > >
> >> > > > > I am yet to reproduce it on my end.
> >> > > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> >> > > >
> >> > > > Unfortunately it doesn't fix the issue for me.
> >> > > >
> >> > > > /me debugging...
> >> > >
> >> > > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> >> > > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> >> > > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> >> > > hbase = 0.
> >> > >
> >> > > cpuid 1 maps to  hartid 0
> >> > > cpuid 0 maps to hartid 1
> >> > >
> >> > >     __sbi_rfence_v02:364: cpuid 1 hartid 0
> >> > >     __sbi_rfence_v02:377: hartid 0 hbase 1
> >> > >     hmask |= 1UL << (hartid - hbase);
> >> > >
> >> > > oops
> >> > >
> >> > >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> >> > > 8000000000000001 hbase 1
> >> > >
> >> >
> >> > Ahh yes. hmask will be incorrect if the boot CPU (cpu 0) has a higher
> >> > hartid and it is trying to do a remote TLB flush/IPI
> >> > to a lower hartid. We should generate the hartid array before the loop.
> >> >
> >> > Can you try this diff ? It seems to work for me during multiple boot
> >> > cycle on the unleashed.
> >> >
> >> > You can find the patch here as well
> >> > https://github.com/atishp04/linux/commits/v5.17-rc1
>
> >> > @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
> >> > struct cpumask *cpu_mask,
> >> >       unsigned long arg4, unsigned long arg5)
> >> >  {
> >> >   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> >> > - int result;
> >> > + int result, index = 0, max_index = 0;
> >> > + unsigned long hartid_arr[NR_CPUS] = {0};
> >>
> >> That's up to 256 bytes on the stack. And more if the maximum
> >> number of cores is increased.
> >>
> >
> > Yeah. We can switch to dynamic allocation using kmalloc based on
> > the number of bits set in the cpumask.
>
> Even more overhead...
>
> >> > - if (!cpu_mask)
> >> > + if (!cpu_mask || cpumask_empty(cpu_mask))
> >> >   cpu_mask = cpu_online_mask;
> >> >
> >> >   for_each_cpu(cpuid, cpu_mask) {
> >> >   hartid = cpuid_to_hartid_map(cpuid);
> >> > + hartid_arr[index] = hartid;
> >> > + index++;
> >> > + }
> >> > + max_index = index;
> >> > + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> >> > + for (index = 0; index < max_index; index++) {
> >> > + hartid = hartid_arr[index];
> >>
> >> That looks expensive to me.
> >>
> >> What about shifting hmask and adjusting hbase if a hartid is
> >> lower than the current hbase?
> >
> > That will probably work for current systems but it will fail when we have hartid > 64.
> > The below logic fails as it assumes that the hartids are in order. We can have a situation
> > where two consecutive cpuids belong to hartids that require two invocations of the SBI call
> > because the number of harts exceeds BITS_PER_LONG.
>
> If the number of harts exceeds BITS_PER_LONG, you always need multiple
> calls, right?
>
> I think the below (gmail-whitespace-damaged diff) should work:
>
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -249,7 +249,7 @@ static void __sbi_set_timer_v02(uint64_t stime_value)
>
>  static int __sbi_send_ipi_v02(const struct cpumask *cpu_mask)
>  {
> -       unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> +       unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
>         struct sbiret ret = {0};
>         int result;
>
> @@ -258,16 +258,27 @@ static int __sbi_send_ipi_v02(const struct
> cpumask *cpu_mask)
>
>         for_each_cpu(cpuid, cpu_mask) {
>                 hartid = cpuid_to_hartid_map(cpuid);
> -               if (hmask &&
> -                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
> -                       ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
> -                                       hmask, hbase, 0, 0, 0, 0);
> -                       if (ret.error)
> -                               goto ecall_failed;
> -                       hmask = 0;
> +               if (hmask) {
> +                       if (hartid + BITS_PER_LONG <= htop ||
> +                           hartid >= hbase + BITS_PER_LONG) {
> +                               ret = sbi_ecall(SBI_EXT_IPI,
> +                                               SBI_EXT_IPI_SEND_IPI, hmask,
> +                                               hbase, 0, 0, 0, 0);
> +                               if (ret.error)
> +                                       goto ecall_failed;
> +                               hmask = 0;
> +                       } else if (hartid < hbase) {
> +                               /* shift the mask to fit lower hartid */
> +                               hmask <<= hbase - hartid;
> +                               hbase = hartid;
> +                       }
>                 }
> -               if (!hmask)
> +               if (!hmask) {
>                         hbase = hartid & -BITS_PER_LONG;
> +                       htop = hartid;
> +               } else if (hartid > htop) {
> +                       htop = hartid;
> +               }
>                 hmask |= 1UL << (hartid - hbase);
>         }
>
> @@ -344,7 +355,7 @@ static int __sbi_rfence_v02(int fid, const struct
> cpumask *cpu_mask,
>                             unsigned long start, unsigned long size,
>                             unsigned long arg4, unsigned long arg5)
>  {
> -       unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> +       unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
>         int result;
>
>         if (!cpu_mask || cpumask_empty(cpu_mask))
> @@ -352,16 +363,26 @@ static int __sbi_rfence_v02(int fid, const
> struct cpumask *cpu_mask,
>
>         for_each_cpu(cpuid, cpu_mask) {
>                 hartid = cpuid_to_hartid_map(cpuid);
> -               if (hmask &&
> -                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
> -                       result = __sbi_rfence_v02_call(fid, hmask, hbase,
> -                                                      start, size, arg4, arg5);
> -                       if (result)
> -                               return result;
> -                       hmask = 0;
> +               if (hmask) {
> +                       if (hartid + BITS_PER_LONG <= htop ||
> +                           hartid >= hbase + BITS_PER_LONG) {
> +                               result = __sbi_rfence_v02_call(fid, hmask,
> +                                               hbase, start, size, arg4, arg5);
> +                               if (result)
> +                                       return result;
> +                               hmask = 0;
> +                       } else if (hartid < hbase) {
> +                               /* shift the mask to fit lower hartid */
> +                               hmask <<= hbase - hartid;
> +                               hbase = hartid;
> +                       }
> +               }
> +               if (!hmask) {
> +                       hbase = hartid;
> +                       htop = hartid;
> +               } else if (hartid > htop) {
> +                       htop = hartid;
>                 }
> -               if (!hmask)
> -                       hbase = hartid & -BITS_PER_LONG;
>                 hmask |= 1UL << (hartid - hbase);
>         }
>
> Another simpler solution would be to just round hbase down to a
> multiple of 32/64 (gmail-whitespace-damaged diff):
>
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -258,16 +258,16 @@ static int __sbi_send_ipi_v02(const struct
> cpumask *cpu_mask)
>
>         for_each_cpu(cpuid, cpu_mask) {
>                 hartid = cpuid_to_hartid_map(cpuid);
> -               if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> +               if (hmask &&
> +                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
>                         ret = sbi_ecall(SBI_EXT_IPI, SBI_EXT_IPI_SEND_IPI,
>                                         hmask, hbase, 0, 0, 0, 0);
>                         if (ret.error)
>                                 goto ecall_failed;
>                         hmask = 0;
> -                       hbase = 0;
>                 }
>                 if (!hmask)
> -                       hbase = hartid;
> +                       hbase = hartid & -BITS_PER_LONG;
>                 hmask |= 1UL << (hartid - hbase);
>         }
>
> @@ -352,16 +352,16 @@ static int __sbi_rfence_v02(int fid, const
> struct cpumask *cpu_mask,
>
>         for_each_cpu(cpuid, cpu_mask) {
>                 hartid = cpuid_to_hartid_map(cpuid);
> -               if (hmask && ((hbase + BITS_PER_LONG) <= hartid)) {
> +               if (hmask &&
> +                   (hartid < hbase || hartid >= hbase + BITS_PER_LONG)) {
>                         result = __sbi_rfence_v02_call(fid, hmask, hbase,
>                                                        start, size, arg4, arg5);
>                         if (result)
>                                 return result;
>                         hmask = 0;
> -                       hbase = 0;
>                 }
>                 if (!hmask)
> -                       hbase = hartid;
> +                       hbase = hartid & -BITS_PER_LONG;
>                 hmask |= 1UL << (hartid - hbase);
>         }
>
> But that means multiple SBI calls if you have e.g. hartids 1-64.
> The shifted mask solution doesn't suffer from that.
> Both solutions don't sort the CPUs, so they are suboptimal in case of
> hartid numberings like 0, 64, 1, 65, ...

In most cases, the hartids will be in sorted order under the /cpus DT node,
but it is not guaranteed that the boot CPU will have the smallest hartid.

This means hartid numbering will be something like:
0, 1, 2, .....,
64, 0, 1, 2, ....
31, 0, 1, 2, .....

>
> What do you think?

Assuming hartids under /cpus DT node are ordered, I think your
approach will only have one additional SBI call compared to Atish's
approach but Atish's approach will require more memory with
increasing NR_CPUS so I suggest we go with your approach.

Can you send a patch with your approach ?

Regards,
Anup


> Thanks!
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread


* Re: [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap
  2022-01-31 12:09                     ` Anup Patel
@ 2022-01-31 13:27                       ` Geert Uytterhoeven
  -1 siblings, 0 replies; 55+ messages in thread
From: Geert Uytterhoeven @ 2022-01-31 13:27 UTC (permalink / raw)
  To: Anup Patel
  Cc: Atish Patra, Jessica Clarke, Atish Patra,
	Linux Kernel Mailing List, Albert Ou, Damien Le Moal, devicetree,
	Jisheng Zhang, Krzysztof Kozlowski, linux-riscv, Palmer Dabbelt,
	Paul Walmsley, Rob Herring

Hi Anup,

On Mon, Jan 31, 2022 at 1:09 PM Anup Patel <anup@brainfault.org> wrote:
> On Fri, Jan 28, 2022 at 2:09 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Fri, Jan 28, 2022 at 1:13 AM Atish Patra <atishp@atishpatra.org> wrote:
> > > On Thu, Jan 27, 2022 at 12:48 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >> On Thu, Jan 27, 2022 at 2:02 AM Atish Patra <atishp@atishpatra.org> wrote:
> > >> > On Wed, Jan 26, 2022 at 1:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >> > > On Wed, Jan 26, 2022 at 9:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >> > > > On Wed, Jan 26, 2022 at 3:21 AM Atish Patra <atishp@atishpatra.org> wrote:
> > >> > > > > On Tue, Jan 25, 2022 at 2:26 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > >> > > > > > On 20 Jan 2022, at 09:09, Atish Patra <atishp@rivosinc.com> wrote:
> > >> > > > > > > Currently, SBI APIs accept a hartmask that is generated from struct
> > >> > > > > > > cpumask. The cpumask data structure can hold up to NR_CPUS values. Thus,
> > >> > > > > > > it is not the correct data structure for hartids, which can be higher
> > >> > > > > > > than NR_CPUS on platforms with sparse or discontiguous hartids.
> > >> > > > > > >
> > >> > > > > > > Remove all association between hartid mask and struct cpumask.
> > >> > > > > > >
> > >> > > > > > > Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
> > >> > > > > > > Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
> > >> > > > > > > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > >> > > >
> > >> > > > > I am yet to reproduce it on my end.
> > >> > > > > @Geert Uytterhoeven: can you please try the below diff on your end.
> > >> > > >
> > >> > > > Unfortunately it doesn't fix the issue for me.
> > >> > > >
> > >> > > > /me debugging...
> > >> > >
> > >> > > Found it: after this commit, the SBI_EXT_RFENCE_REMOTE_FENCE_I and
> > >> > > SBI_EXT_RFENCE_REMOTE_SFENCE_VMA ecalls are now called with
> > >> > > hmask = 0x8000000000000001 and hbase = 1 instead of hmask = 3 and
> > >> > > hbase = 0.
> > >> > >
> > >> > > cpuid 1 maps to hartid 0
> > >> > > cpuid 0 maps to hartid 1
> > >> > >
> > >> > >     __sbi_rfence_v02:364: cpuid 1 hartid 0
> > >> > >     __sbi_rfence_v02:377: hartid 0 hbase 1
> > >> > >     hmask |= 1UL << (hartid - hbase);
> > >> > >
> > >> > > oops
> > >> > >
> > >> > >     __sbi_rfence_v02_call:303: SBI_EXT_RFENCE_REMOTE_FENCE_I hmask
> > >> > > 8000000000000001 hbase 1
> > >> > >
> > >> >
> > >> > Ahh yes. hmask will be incorrect if the boot cpu (cpu 0) has a higher
> > >> > hartid and it is trying to do a remote TLB flush/IPI
> > >> > to a lower hartid. We should generate the hartid array before the loop.
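A minimal C sketch of how that bad value arises, assuming (as x86 and RISC-V hardware do) that the shift amount is masked to its low six bits; the helper name is hypothetical, not from the kernel source:

```c
#include <assert.h>

/* Hypothetical model of the buggy accumulation in __sbi_rfence_v02:
 * hbase is taken from the first hartid seen, and each later hartid
 * sets bit (hartid - hbase).  When a later hartid is *smaller* than
 * hbase, the subtraction underflows; "& 63" models the hardware
 * masking the shift amount (the unmasked shift would be undefined
 * behaviour in C).
 */
static unsigned long buggy_hmask(const unsigned long *hartids, int n)
{
	unsigned long hmask = 0, hbase = 0;

	for (int i = 0; i < n; i++) {
		if (!hmask)
			hbase = hartids[i];
		hmask |= 1UL << ((hartids[i] - hbase) & 63);
	}
	return hmask;
}
```

With the cpuid-to-hartid mapping from the log above (hartids seen as 1, then 0), this produces hmask 0x8000000000000001 with hbase 1 — exactly the ecall arguments Geert reported.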
> > >> >
> > >> > Can you try this diff? It seems to work for me across multiple boot
> > >> > cycles on the Unleashed.
> > >> >
> > >> > You can find the patch here as well
> > >> > https://github.com/atishp04/linux/commits/v5.17-rc1
> >
> > >> > @@ -345,13 +368,21 @@ static int __sbi_rfence_v02(int fid, const
> > >> > struct cpumask *cpu_mask,
> > >> >       unsigned long arg4, unsigned long arg5)
> > >> >  {
> > >> >   unsigned long hartid, cpuid, hmask = 0, hbase = 0;
> > >> > - int result;
> > >> > + int result, index = 0, max_index = 0;
> > >> > + unsigned long hartid_arr[NR_CPUS] = {0};
> > >>
> > >> That's up to 256 bytes on the stack. And more if the maximum
> > >> number of cores is increased.
> > >>
> > >
> > > Yeah. We can switch to dynamic allocation using kmalloc based on
> > > the number of bits set in the cpumask.
> >
> > Even more overhead...
> >
> > >> > - if (!cpu_mask)
> > >> > + if (!cpu_mask || cpumask_empty(cpu_mask))
> > >> >   cpu_mask = cpu_online_mask;
> > >> >
> > >> >   for_each_cpu(cpuid, cpu_mask) {
> > >> >   hartid = cpuid_to_hartid_map(cpuid);
> > >> > + hartid_arr[index] = hartid;
> > >> > + index++;
> > >> > + }
> > >> > + max_index = index;
> > >> > + sort(hartid_arr, max_index, sizeof(unsigned long), cmp_ulong, NULL);
> > >> > + for (index = 0; index < max_index; index++) {
> > >> > + hartid = hartid_arr[index];
> > >>
> > >> That looks expensive to me.
> > >>
> > >> What about shifting hmask and adjusting hbase if a hartid is
> > >> lower than the current hbase?
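A sketch of that shifted-mask idea (hypothetical helper, not the final patch): when a hartid falls below the current base, slide the accumulated mask up and lower the base instead of letting the shift underflow:

```c
#include <assert.h>

/* Accumulate a single-window hartmask, adjusting hbase downward when
 * a smaller hartid shows up.  A real implementation would also have
 * to flush an ecall and start a new window whenever (hartid - hbase)
 * reaches BITS_PER_LONG; that part is omitted here.
 */
static unsigned long shifted_hmask(const unsigned long *hartids, int n,
				   unsigned long *hbase_out)
{
	unsigned long hmask = 0, hbase = 0;

	for (int i = 0; i < n; i++) {
		unsigned long hartid = hartids[i];

		if (!hmask)
			hbase = hartid;
		else if (hartid < hbase) {
			hmask <<= hbase - hartid;	/* slide mask up */
			hbase = hartid;			/* lower the base */
		}
		hmask |= 1UL << (hartid - hbase);
	}
	*hbase_out = hbase;
	return hmask;
}
```

For the problem case above (hartids 1, then 0) this yields hmask 3 with hbase 0 — the values the pre-regression kernel passed.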
> > >
> > > That will probably work for current systems, but it will fail when we have hartid > 64.
> > > The below logic fails because it assumes that the hartids are in order. We can have a
> > > situation where two consecutive cpuids belong to hartids that require two invocations
> > > of the SBI call because the number of harts exceeds BITS_PER_LONG.
> >
> > If the number of harts exceeds BITS_PER_LONG, you always need multiple
> > calls, right?
> >
> > I think the below (gmail-whitespace-damaged diff) should work:

[...]

> >
> > Another simpler solution would be to just round hbase down to a
> > multiple of 32/64 (gmail-whitespace-damaged diff):

[...]

> > But that means multiple SBI calls if you have e.g. hartids 1-64.
> > The shifted mask solution doesn't suffer from that.
> > Both solutions don't sort the CPUs, so they are suboptimal in case of
> > hartid numberings like 0, 64, 1, 65, ...
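The trade-off can be illustrated with a hypothetical call-count model of the round-down variant, where hbase is always aligned down to a multiple of BITS_PER_LONG and every change of window forces a separate ecall:

```c
#include <assert.h>

/* Model how many SBI calls the round-down scheme needs: each hartid
 * belongs to the 64-wide window starting at (hartid & ~63UL), and the
 * pending mask must be flushed in its own ecall whenever the window
 * changes.
 */
static int rounddown_call_count(const unsigned long *hartids, int n)
{
	unsigned long hbase = 0, hmask = 0;
	int calls = 0;

	for (int i = 0; i < n; i++) {
		unsigned long base = hartids[i] & ~63UL;

		if (hmask && base != hbase) {
			calls++;	/* flush the previous window */
			hmask = 0;
		}
		if (!hmask)
			hbase = base;
		hmask |= 1UL << (hartids[i] - hbase);
	}
	return hmask ? calls + 1 : calls;
}
```

Hartids 0-63 fit one window (one call), while 1-64 straddle two windows (two calls) — and an unsorted interleaving like 0, 64, 1, 65 bounces between windows on every hart, which is the suboptimal case noted above.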
>
> In most cases, the hartids will be in sorted order under the /cpus DT node,
> but it is not guaranteed that the boot CPU will have the smallest hartid.
>
> This means hartid numbering will be something like:
> 0, 1, 2, .....,
> 64, 0, 1, 2, ....
> 31, 0, 1, 2, .....
>
> >
> > What do you think?
>
> Assuming hartids under /cpus DT node are ordered, I think your
> approach will only have one additional SBI call compared to Atish's
> approach but Atish's approach will require more memory with
> increasing NR_CPUS so I suggest we go with your approach.
>
> Can you send a patch with your approach?

Sure, done.
https://lore.kernel.org/r/cover.1643635156.git.geert@linux-m68k.org/

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 55+ messages in thread


end of thread, other threads:[~2022-01-31 13:28 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-20  9:09 [PATCH v3 0/6] Sparse HART id support Atish Patra
2022-01-20  9:09 ` Atish Patra
2022-01-20  9:09 ` [PATCH v3 1/6] RISC-V: Avoid using per cpu array for ordered booting Atish Patra
2022-01-20  9:09   ` Atish Patra
2022-01-20  9:09 ` [PATCH v3 2/6] RISC-V: Do not print the SBI version during HSM extension boot print Atish Patra
2022-01-20  9:09   ` Atish Patra
2022-01-20  9:09 ` [PATCH v3 3/6] RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method Atish Patra
2022-01-20  9:09   ` Atish Patra
2022-01-20  9:09 ` [PATCH v3 4/6] RISC-V: Move the entire hart selection via lottery to SMP Atish Patra
2022-01-20  9:09 ` [PATCH v3 5/6] RISC-V: Move spinwait booting method to its own config Atish Patra
2022-01-20  9:09   ` Atish Patra
2022-01-20  9:09 ` [PATCH v3 6/6] RISC-V: Do not use cpumask data structure for hartid bitmap Atish Patra
2022-01-20  9:09   ` Atish Patra
2022-01-25 20:12   ` Geert Uytterhoeven
2022-01-25 20:12     ` Geert Uytterhoeven
2022-01-25 20:17     ` Atish Patra
2022-01-25 20:17       ` Atish Patra
2022-01-25 20:52       ` Geert Uytterhoeven
2022-01-25 20:52         ` Geert Uytterhoeven
2022-01-25 21:11         ` Ron Economos
2022-01-25 21:11           ` Ron Economos
2022-01-25 22:26   ` Jessica Clarke
2022-01-25 22:26     ` Jessica Clarke
2022-01-25 22:29     ` David Laight
2022-01-25 22:29       ` David Laight
2022-01-26  2:21     ` Atish Patra
2022-01-26  2:21       ` Atish Patra
2022-01-26  8:28       ` Geert Uytterhoeven
2022-01-26  8:28         ` Geert Uytterhoeven
2022-01-26  9:10         ` Geert Uytterhoeven
2022-01-26  9:10           ` Geert Uytterhoeven
2022-01-27  1:01           ` Atish Patra
2022-01-27  1:01             ` Atish Patra
2022-01-27  8:48             ` Geert Uytterhoeven
2022-01-27  8:48               ` Geert Uytterhoeven
2022-01-27  8:48               ` Geert Uytterhoeven
2022-01-27  8:48                 ` Geert Uytterhoeven
2022-01-27 10:03               ` Geert Uytterhoeven
2022-01-27 10:03                 ` Geert Uytterhoeven
2022-01-27 10:17                 ` Andreas Schwab
2022-01-27 10:17                   ` Andreas Schwab
     [not found]               ` <CAOnJCU+U0xmw-_yTEUo9ZXO5pvoJ6VCGu+jjU-Sa2MnhcAha6Q@mail.gmail.com>
2022-01-28  8:39                 ` Geert Uytterhoeven
2022-01-28  8:39                   ` Geert Uytterhoeven
2022-01-28  8:55                   ` Geert Uytterhoeven
2022-01-28  8:55                     ` Geert Uytterhoeven
2022-01-31 12:09                   ` Anup Patel
2022-01-31 12:09                     ` Anup Patel
2022-01-31 13:27                     ` Geert Uytterhoeven
2022-01-31 13:27                       ` Geert Uytterhoeven
2022-01-27  9:56             ` Ron Economos
2022-01-27  9:56               ` Ron Economos
2022-01-31  8:35             ` Anup Patel
2022-01-31  8:35               ` Anup Patel
2022-01-20 18:17 ` [PATCH v3 0/6] Sparse HART id support Palmer Dabbelt
2022-01-20 18:17   ` Palmer Dabbelt
