linux-kselftest.vger.kernel.org archive mirror
* [PATCH v5 0/6] RISC-V Hardware Probing User Interface
@ 2023-03-27 16:31 Evan Green
  2023-03-27 16:32 ` [PATCH v5 5/6] selftests: Test the new RISC-V hwprobe interface Evan Green
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Evan Green @ 2023-03-27 16:31 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: slewis, vineetg, heiko, Conor Dooley, Evan Green, Albert Ou,
	Andrew Bresticker, Andrew Jones, Andrew Morton, Anup Patel,
	Arnd Bergmann, Atish Patra, Bagas Sanjaya, Catalin Marinas,
	Celeste Liu, Conor Dooley, Dao Lu, Guo Ren, Heiko Stuebner,
	Jann Horn, Jisheng Zhang, Jonathan Corbet, Ley Foon Tan,
	Mark Brown, Mike Kravetz, Nathan Chancellor, Palmer Dabbelt,
	Paul Walmsley, Peter Xu, Philipp Tomsich, Randy Dunlap,
	Samuel Holland, Shuah Khan, Sunil V L, Tobias Klauser, linux-doc,
	linux-kernel, linux-kselftest, linux-riscv


There's been a bunch of off-list discussions about this, including at
Plumbers.  The original plan was to do something involving providing an
ISA string to userspace, but ISA strings just aren't sufficient for a
stable ABI any more: in order to parse an ISA string users need the
version of the specifications that the string is written to, the version
of each extension (sometimes at a finer granularity than the RISC-V
releases/versions encode), and the expected use case for the ISA string
(i.e., is it a U-mode or M-mode string).  That's a lot of complexity to
try and keep ABI compatible and it's probably going to continue to grow,
as even if there's no more complexity in the specifications we'll have
to deal with the various ISA string parsing oddities that end up all
over userspace.

Instead this patch set takes a very different approach and provides a set
of key/value pairs that encode various bits about the system.  The big
advantage here is that we can clearly define what these mean so we can
ensure ABI stability, but it also allows us to encode information that's
unlikely to ever appear in an ISA string (see the misaligned access
performance, for example).  The resulting interface looks a lot like
what arm64 and x86 do, and will hopefully fit well into something like
ACPI in the future.

The actual user interface is a syscall, with a vDSO function in front of
it. The vDSO function can answer some queries without a syscall at all,
and falls back to the syscall for cases it doesn't have answers to.
Currently we prepopulate it with an array of answers for all keys and
a CPU set of "all CPUs". This can be adjusted as necessary to provide
fast answers to the most common queries.
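
To make the split between the vDSO fast path and the syscall concrete,
here is a minimal userspace sketch of the dispatch described above. It is
not the code from this series: hwprobe_fast(), slow_path_syscall(),
cached_all_cpus and the cached values are hypothetical names and numbers
picked for illustration, and the real vDSO function may check more
conditions before trusting its cache.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the key/value pairs this series describes. */
struct hwprobe_pair {
	int64_t key;
	uint64_t value;
};

/* Hypothetical stand-in for the answers the kernel publishes to the vDSO. */
#define CACHED_KEY_COUNT 4
static const uint64_t cached_all_cpus[CACHED_KEY_COUNT] = { 0x1, 0x2, 0x3, 0x1 };

/* Counts trips through the simulated slow path, for illustration only. */
static int slow_path_calls;

/* Hypothetical stand-in for the real syscall; it only records the call. */
static long slow_path_syscall(struct hwprobe_pair *pairs, size_t pair_count,
			      size_t cpu_count, unsigned long *cpus,
			      unsigned int flags)
{
	(void)pairs; (void)pair_count; (void)cpu_count; (void)cpus; (void)flags;
	slow_path_calls++;
	return 0;
}

long hwprobe_fast(struct hwprobe_pair *pairs, size_t pair_count,
		  size_t cpu_count, unsigned long *cpus, unsigned int flags)
{
	bool all_cpus = (cpu_count == 0) && (cpus == NULL);

	/* Only the cached "all CPUs" query can be answered without a syscall. */
	if (flags != 0 || !all_cpus)
		return slow_path_syscall(pairs, pair_count, cpu_count, cpus, flags);

	for (size_t i = 0; i < pair_count; i++) {
		if (pairs[i].key >= 0 && pairs[i].key < CACHED_KEY_COUNT) {
			pairs[i].value = cached_all_cpus[pairs[i].key];
		} else {
			/* Unknown key: mark it with -1, as the syscall would. */
			pairs[i].key = -1;
			pairs[i].value = 0;
		}
	}
	return 0;
}
```

In this sketch a call with a NULL CPU set and a size of 0 never reaches
slow_path_syscall(), while any call that names specific CPUs does.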

An example series in glibc exposing this syscall and using it in an
ifunc selector for memcpy can be found at [1]. I'm about to send a v2
of that series out that incorporates the vDSO function.

I was asked about the performance delta between this and something like
sysfs. I created a small test program [2] and ran it on a Nezha D1
Allwinner board. Doing each operation 100000 times and dividing, these
operations take the following amount of time:
 - open()+read()+close() of /sys/kernel/cpu_byteorder: 3.8us
 - access("/sys/kernel/cpu_byteorder", R_OK): 1.3us
 - riscv_hwprobe() vDSO and syscall: 0.0094us
 - riscv_hwprobe() vDSO with no syscall: 0.0091us

These numbers get farther apart if we query multiple keys, as sysfs will
scale linearly with the number of keys, whereas the dedicated syscall
stays the same. To frame these numbers, I also did a tight
fork/exec/wait loop, which I measured at 4.8ms. So doing 4
open/read/close operations is a delta of about 0.3%, whereas a single vDSO
call is a delta of essentially zero.
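
The percentages above follow directly from the measured numbers. As a
small worked check (probe_overhead_percent() is a hypothetical helper
written for this note, not part of the test program in [2]):

```c
#include <stddef.h>

/*
 * Cost of `calls` probe operations, each taking `per_call_us` microseconds,
 * expressed as a percentage of one fork/exec/wait cycle of `fork_exec_us`
 * microseconds.
 */
double probe_overhead_percent(size_t calls, double per_call_us,
			      double fork_exec_us)
{
	return 100.0 * ((double)calls * per_call_us) / fork_exec_us;
}
```

With the figures quoted here, probe_overhead_percent(4, 3.8, 4800.0) is
roughly 0.32, and probe_overhead_percent(1, 0.0091, 4800.0) is roughly
0.0002.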

[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.com/T/#t
[2] https://pastebin.com/x84NEKaS

Changes in v5:
 - Added tags
 - Fixed misuse of ISA_EXT_c as bitmap, changed to use
   riscv_isa_extension_available() (Heiko, Conor)
 - Document the alternatives approach in the commit message (Conor and
   Heiko).
 - Fix __init call warnings by making probe_vendor_features() and
   thead_feature_probe_func() __init_or_module.
 - Fixed compat vdso compilation failure (lkp).

Changes in v4:
 - Used real types in syscall prototypes (Arnd)
 - Fixed static line break in do_riscv_hwprobe() (Conor)
 - Added newlines between documentation lists (Conor)
 - Crispen up size types to size_t, and cpu indices to int (Joe)
 - Fix copy_from_user() return logic bug (found via kselftests!)
 - Add __user to SYSCALL_DEFINE() to fix warning
 - More newlines in BASE_BEHAVIOR_IMA documentation (Conor)
 - Add newlines to CPUPERF_0 documentation (Conor)
 - Add UNSUPPORTED value (Conor)
 - Switched from DT to alternatives-based probing (Rob)
 - Crispen up cpu index type to always be int (Conor)
 - Fixed selftests commit description, no more tiny libc (Mark Brown)
 - Fixed selftest syscall prototype types to match v4.
 - Added a prototype to fix -Wmissing-prototype warning (lkp@intel.com)
 - Fixed rv32 build failure (lkp@intel.com)
 - Make vdso prototype match syscall types update

Changes in v3:
 - Updated copyright date in cpufeature.h
 - Fixed typo in cpufeature.h comment (Conor)
 - Refactored functions so that kernel mode can query too, in
   preparation for the vDSO data population.
 - Changed the vendor/arch/imp IDs to return a value of -1 on mismatch
   rather than failing the whole call.
 - Const cpumask pointer in hwprobe_mid()
 - Embellished documentation WRT cpu_set and the returned values.
 - Renamed hwprobe_mid() to hwprobe_arch_id() (Conor)
 - Fixed machine ID doc warnings, changed elements to c:macro:.
 - Completed dangling unistd.h comment (Conor)
 - Fixed line breaks and minor logic optimization (Conor).
 - Use riscv_cached_mxxxid() (Conor)
 - Refactored base ISA behavior probe to allow kernel probing as well,
   in prep for vDSO data initialization.
 - Fixed doc warnings in IMA text list, use :c:macro:.
 - Have hwprobe_misaligned return int instead of long.
 - Constify cpumask pointer in hwprobe_misaligned()
 - Fix warnings in _PERF_0 list documentation, use :c:macro:.
 - Move include cpufeature.h to misaligned patch.
 - Fix documentation mismatch for RISCV_HWPROBE_KEY_CPUPERF_0 (Conor)
 - Use for_each_possible_cpu() instead of NR_CPUS (Conor)
 - Break early in misaligned access iteration (Conor)
 - Increase MISALIGNED_MASK from 2 bits to 3 for possible UNSUPPORTED future
   value (Conor)
 - Introduced vDSO function

Changes in v2:
 - Factored the move of struct riscv_cpuinfo to its own header
 - Changed the interface to look more like poll(). Rather than supplying
   key_offset and getting back an array of values with numerically
   contiguous keys, have the user pre-fill the key members of the array,
   and the kernel will fill in the corresponding values. For any key it
   doesn't recognize, it will set the key of that element to -1. This
   allows usermode to quickly ask for exactly the elements it cares
   about, and not get bogged down in a back and forth about newer keys
   that older kernels might not recognize. In other words, the kernel
   can communicate that it doesn't recognize some of the keys while
   still providing the data for the keys it does know.
 - Added a shortcut to the cpuset parameters that if a size of 0 and
   NULL is provided for the CPU set, the kernel will use a cpu mask of
   all online CPUs. This is convenient because I suspect most callers
   will only want to act on a feature if it's supported on all CPUs, and
   it's a headache to dynamically allocate an array of all 1s, not to
   mention a waste to have the kernel loop over all of the offline bits.
 - Fixed logic error in if(of_property_read_string...) that caused crash
 - Include cpufeature.h in cpufeature.c to avoid undeclared variable
   warning.
 - Added a _MASK define
 - Fix random checkpatch complaints
 - Updated the selftests to the new API and added some more.
 - Fixed indentation, comments in .S, and general checkpatch complaints.
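
The CPU set shortcut from the v2 notes can be sketched in plain C as
below. This is an illustration only, under loud assumptions: the per-CPU
feature table, online_mask, SKETCH_NCPUS and common_features() are
hypothetical, the CPU set is limited to a single unsigned long, and the
use of -EINVAL for every rejected case is a guess (the selftest in this
series only checks that such calls fail).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NCPUS 4	/* hypothetical CPU count */

/* Hypothetical per-CPU feature bits; CPU 2 lacks bit 2. */
static const uint64_t per_cpu_features[SKETCH_NCPUS] = { 0x7, 0x7, 0x3, 0x7 };
static const unsigned long online_mask = 0xf;

/*
 * Report the feature bits shared by every selected CPU.  A size of 0
 * together with a NULL set selects all online CPUs; supplying only one
 * of the two is rejected.
 */
long common_features(size_t cpu_count, const unsigned long *cpus, uint64_t *out)
{
	unsigned long mask;
	uint64_t common = ~(uint64_t)0;

	if (cpu_count == 0 && cpus == NULL)
		mask = online_mask;		/* the shortcut */
	else if (cpu_count == 0 || cpus == NULL)
		return -EINVAL;			/* mismatched arguments */
	else
		mask = cpus[0] & online_mask;	/* ignore offline CPUs */

	if (mask == 0)
		return -EINVAL;

	for (int cpu = 0; cpu < SKETCH_NCPUS; cpu++) {
		if (mask & (1UL << cpu))
			common &= per_cpu_features[cpu];
	}

	*out = common;
	return 0;
}
```

A caller that only cares whether a feature is usable everywhere passes
(0, NULL) and never has to build a mask of all 1s.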

Evan Green (6):
  RISC-V: Move struct riscv_cpuinfo to new header
  RISC-V: Add a syscall for HW probing
  RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA
  RISC-V: hwprobe: Support probing of misaligned access performance
  selftests: Test the new RISC-V hwprobe interface
  RISC-V: Add hwprobe vDSO function and data

 Documentation/riscv/hwprobe.rst               |  86 +++++++
 Documentation/riscv/index.rst                 |   1 +
 arch/riscv/Kconfig                            |   1 +
 arch/riscv/errata/thead/errata.c              |  10 +
 arch/riscv/include/asm/alternative.h          |   5 +
 arch/riscv/include/asm/cpufeature.h           |  23 ++
 arch/riscv/include/asm/hwprobe.h              |  13 +
 arch/riscv/include/asm/syscall.h              |   4 +
 arch/riscv/include/asm/vdso/data.h            |  17 ++
 arch/riscv/include/asm/vdso/gettimeofday.h    |   8 +
 arch/riscv/include/uapi/asm/hwprobe.h         |  37 +++
 arch/riscv/include/uapi/asm/unistd.h          |   9 +
 arch/riscv/kernel/alternative.c               |  19 ++
 arch/riscv/kernel/compat_vdso/Makefile        |   2 +-
 arch/riscv/kernel/cpu.c                       |   8 +-
 arch/riscv/kernel/cpufeature.c                |   3 +
 arch/riscv/kernel/smpboot.c                   |   1 +
 arch/riscv/kernel/sys_riscv.c                 | 225 +++++++++++++++++-
 arch/riscv/kernel/vdso.c                      |   6 -
 arch/riscv/kernel/vdso/Makefile               |   4 +
 arch/riscv/kernel/vdso/hwprobe.c              |  52 ++++
 arch/riscv/kernel/vdso/sys_hwprobe.S          |  15 ++
 arch/riscv/kernel/vdso/vdso.lds.S             |   3 +
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/riscv/Makefile        |  58 +++++
 .../testing/selftests/riscv/hwprobe/Makefile  |  10 +
 .../testing/selftests/riscv/hwprobe/hwprobe.c |  90 +++++++
 .../selftests/riscv/hwprobe/sys_hwprobe.S     |  12 +
 28 files changed, 709 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/riscv/hwprobe.rst
 create mode 100644 arch/riscv/include/asm/cpufeature.h
 create mode 100644 arch/riscv/include/asm/hwprobe.h
 create mode 100644 arch/riscv/include/asm/vdso/data.h
 create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h
 create mode 100644 arch/riscv/kernel/vdso/hwprobe.c
 create mode 100644 arch/riscv/kernel/vdso/sys_hwprobe.S
 create mode 100644 tools/testing/selftests/riscv/Makefile
 create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile
 create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c
 create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S

-- 
2.25.1



* [PATCH v5 5/6] selftests: Test the new RISC-V hwprobe interface
  2023-03-27 16:31 [PATCH v5 0/6] RISC-V Hardware Probing User Interface Evan Green
@ 2023-03-27 16:32 ` Evan Green
  2023-03-28  6:45 ` [PATCH v5 0/6] RISC-V Hardware Probing User Interface Conor Dooley
  2023-03-28 20:34 ` Heiko Stübner
  2 siblings, 0 replies; 6+ messages in thread
From: Evan Green @ 2023-03-27 16:32 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: slewis, vineetg, heiko, Conor Dooley, Evan Green, Albert Ou,
	Catalin Marinas, Mark Brown, Palmer Dabbelt, Paul Walmsley,
	Shuah Khan, linux-kernel, linux-kselftest, linux-riscv

This adds a test for the recently added RISC-V interface for probing
hardware capabilities.  It happens to be the first selftest we have for
RISC-V, so I've added some infrastructure for those as well.

Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Evan Green <evan@rivosinc.com>

---

(no changes since v4)

Changes in v4:
 - Fixed selftests commit description, no more tiny libc (Mark Brown)
 - Fixed selftest syscall prototype types to match v4.

Changes in v2:
 - Updated the selftests to the new API and added some more.
 - Fixed indentation, comments in .S, and general checkpatch complaints.


---
 tools/testing/selftests/Makefile              |  1 +
 tools/testing/selftests/riscv/Makefile        | 58 ++++++++++++
 .../testing/selftests/riscv/hwprobe/Makefile  | 10 +++
 .../testing/selftests/riscv/hwprobe/hwprobe.c | 90 +++++++++++++++++++
 .../selftests/riscv/hwprobe/sys_hwprobe.S     | 12 +++
 5 files changed, 171 insertions(+)
 create mode 100644 tools/testing/selftests/riscv/Makefile
 create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile
 create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c
 create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 13a6837a0c6b..4bea26109450 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -63,6 +63,7 @@ TARGETS += pstore
 TARGETS += ptrace
 TARGETS += openat2
 TARGETS += resctrl
+TARGETS += riscv
 TARGETS += rlimits
 TARGETS += rseq
 TARGETS += rtc
diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile
new file mode 100644
index 000000000000..32a72902d045
--- /dev/null
+++ b/tools/testing/selftests/riscv/Makefile
@@ -0,0 +1,58 @@
+# SPDX-License-Identifier: GPL-2.0
+# Originally tools/testing/arm64/Makefile
+
+# When ARCH not overridden for crosscompiling, lookup machine
+ARCH ?= $(shell uname -m 2>/dev/null || echo not)
+
+ifneq (,$(filter $(ARCH),riscv))
+RISCV_SUBTARGETS ?= hwprobe
+else
+RISCV_SUBTARGETS :=
+endif
+
+CFLAGS := -Wall -O2 -g
+
+# A proper top_srcdir is needed by KSFT(lib.mk)
+top_srcdir = $(realpath ../../../../)
+
+# Additional include paths needed by kselftest.h and local headers
+CFLAGS += -I$(top_srcdir)/tools/testing/selftests/
+
+CFLAGS += $(KHDR_INCLUDES)
+
+export CFLAGS
+export top_srcdir
+
+all:
+	@for DIR in $(RISCV_SUBTARGETS); do				\
+		BUILD_TARGET=$(OUTPUT)/$$DIR;			\
+		mkdir -p $$BUILD_TARGET;			\
+		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@;		\
+	done
+
+install: all
+	@for DIR in $(RISCV_SUBTARGETS); do				\
+		BUILD_TARGET=$(OUTPUT)/$$DIR;			\
+		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@;		\
+	done
+
+run_tests: all
+	@for DIR in $(RISCV_SUBTARGETS); do				\
+		BUILD_TARGET=$(OUTPUT)/$$DIR;			\
+		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@;		\
+	done
+
+# Avoid any output on non riscv on emit_tests
+emit_tests: all
+	@for DIR in $(RISCV_SUBTARGETS); do				\
+		BUILD_TARGET=$(OUTPUT)/$$DIR;			\
+		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@;		\
+	done
+
+clean:
+	@for DIR in $(RISCV_SUBTARGETS); do				\
+		BUILD_TARGET=$(OUTPUT)/$$DIR;			\
+		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@;		\
+	done
+
+.PHONY: all clean install run_tests emit_tests
diff --git a/tools/testing/selftests/riscv/hwprobe/Makefile b/tools/testing/selftests/riscv/hwprobe/Makefile
new file mode 100644
index 000000000000..ebdbb3c22e54
--- /dev/null
+++ b/tools/testing/selftests/riscv/hwprobe/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2021 ARM Limited
+# Originally tools/testing/arm64/abi/Makefile
+
+TEST_GEN_PROGS := hwprobe
+
+include ../../lib.mk
+
+$(OUTPUT)/hwprobe: hwprobe.c sys_hwprobe.S
+	$(CC) -o$@ $(CFLAGS) $(LDFLAGS) $^
diff --git a/tools/testing/selftests/riscv/hwprobe/hwprobe.c b/tools/testing/selftests/riscv/hwprobe/hwprobe.c
new file mode 100644
index 000000000000..09f290a67420
--- /dev/null
+++ b/tools/testing/selftests/riscv/hwprobe/hwprobe.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <stddef.h>
+#include <asm/hwprobe.h>
+
+/*
+ * Rather than relying on having a new enough libc to define this, just do it
+ * ourselves.  This way we don't need to be coupled to a new-enough libc to
+ * contain the call.
+ */
+long riscv_hwprobe(struct riscv_hwprobe *pairs, size_t pair_count,
+		   size_t cpu_count, unsigned long *cpus, unsigned int flags);
+
+int main(int argc, char **argv)
+{
+	struct riscv_hwprobe pairs[8];
+	unsigned long cpus;
+	long out;
+
+	/* Fake the CPU_SET ops. */
+	cpus = -1;
+
+	/*
+	 * Just run a basic test: pass enough pairs to get up to the base
+	 * behavior, and then check to make sure it's sane.
+	 */
+	for (long i = 0; i < 8; i++)
+		pairs[i].key = i;
+	out = riscv_hwprobe(pairs, 8, 1, &cpus, 0);
+	if (out != 0)
+		return -1;
+	for (long i = 0; i < 4; ++i) {
+		/* Fail if the kernel claims not to recognize a base key. */
+		if ((i < 4) && (pairs[i].key != i))
+			return -2;
+
+		if (pairs[i].key != RISCV_HWPROBE_KEY_BASE_BEHAVIOR)
+			continue;
+
+		if (pairs[i].value & RISCV_HWPROBE_BASE_BEHAVIOR_IMA)
+			continue;
+
+		return -3;
+	}
+
+	/*
+	 * This should also work with a NULL CPU set, but should not work
+	 * with an improperly supplied CPU set.
+	 */
+	out = riscv_hwprobe(pairs, 8, 0, 0, 0);
+	if (out != 0)
+		return -4;
+
+	out = riscv_hwprobe(pairs, 8, 0, &cpus, 0);
+	if (out == 0)
+		return -5;
+
+	out = riscv_hwprobe(pairs, 8, 1, 0, 0);
+	if (out == 0)
+		return -6;
+
+	/*
+	 * Check that keys work by providing one that we know exists, and
+	 * checking to make sure the resulting pair is what we asked for.
+	 */
+	pairs[0].key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR;
+	out = riscv_hwprobe(pairs, 1, 1, &cpus, 0);
+	if (out != 0)
+		return -7;
+	if (pairs[0].key != RISCV_HWPROBE_KEY_BASE_BEHAVIOR)
+		return -8;
+
+	/*
+	 * Check that an unknown key gets overwritten with -1,
+	 * but doesn't block elements after it.
+	 */
+	pairs[0].key = 0x5555;
+	pairs[1].key = 1;
+	pairs[1].value = 0xAAAA;
+	out = riscv_hwprobe(pairs, 2, 0, 0, 0);
+	if (out != 0)
+		return -9;
+
+	if (pairs[0].key != -1)
+		return -10;
+
+	if ((pairs[1].key != 1) || (pairs[1].value == 0xAAAA))
+		return -11;
+
+	return 0;
+}
diff --git a/tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S b/tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S
new file mode 100644
index 000000000000..ed8d28863b27
--- /dev/null
+++ b/tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2022 Rivos, Inc */
+
+.text
+.global riscv_hwprobe
+riscv_hwprobe:
+	# Put __NR_riscv_hwprobe in the syscall number register, then just shim
+	# back the kernel's return.  This doesn't do any sort of errno
+	# handling, the caller can deal with it.
+	li a7, 258
+	ecall
+	ret
-- 
2.25.1



* Re: [PATCH v5 0/6] RISC-V Hardware Probing User Interface
  2023-03-27 16:31 [PATCH v5 0/6] RISC-V Hardware Probing User Interface Evan Green
  2023-03-27 16:32 ` [PATCH v5 5/6] selftests: Test the new RISC-V hwprobe interface Evan Green
@ 2023-03-28  6:45 ` Conor Dooley
  2023-03-28 22:54   ` Evan Green
  2023-03-28 20:34 ` Heiko Stübner
  2 siblings, 1 reply; 6+ messages in thread
From: Conor Dooley @ 2023-03-28  6:45 UTC (permalink / raw)
  To: Evan Green
  Cc: Palmer Dabbelt, slewis, vineetg, heiko, Conor Dooley, Albert Ou,
	Andrew Bresticker, Andrew Jones, Andrew Morton, Anup Patel,
	Arnd Bergmann, Atish Patra, Bagas Sanjaya, Catalin Marinas,
	Celeste Liu, Dao Lu, Guo Ren, Heiko Stuebner, Jann Horn,
	Jisheng Zhang, Jonathan Corbet, Ley Foon Tan, Mark Brown,
	Mike Kravetz, Nathan Chancellor, Palmer Dabbelt, Paul Walmsley,
	Peter Xu, Philipp Tomsich, Randy Dunlap, Samuel Holland,
	Shuah Khan, Sunil V L, Tobias Klauser, linux-doc, linux-kernel,
	linux-kselftest, linux-riscv


On Mon, Mar 27, 2023 at 09:31:57AM -0700, Evan Green wrote:

Hey Evan,

Patchwork has a rake of complaints about the series unfortunately:
https://patchwork.kernel.org/project/linux-riscv/list/?series=734234

Some of the checkpatch whinging may be spurious, but there's some
definitely valid stuff in there!

> Evan Green (6):
>   RISC-V: Move struct riscv_cpuinfo to new header
>   RISC-V: Add a syscall for HW probing
>   RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA
>   RISC-V: hwprobe: Support probing of misaligned access performance
>   selftests: Test the new RISC-V hwprobe interface

>   RISC-V: Add hwprobe vDSO function and data

And this one breaks the build for !MMU kernels unfortunately.

Thanks,
Conor.


* Re: [PATCH v5 0/6] RISC-V Hardware Probing User Interface
  2023-03-27 16:31 [PATCH v5 0/6] RISC-V Hardware Probing User Interface Evan Green
  2023-03-27 16:32 ` [PATCH v5 5/6] selftests: Test the new RISC-V hwprobe interface Evan Green
  2023-03-28  6:45 ` [PATCH v5 0/6] RISC-V Hardware Probing User Interface Conor Dooley
@ 2023-03-28 20:34 ` Heiko Stübner
  2023-03-28 22:53   ` Evan Green
  2 siblings, 1 reply; 6+ messages in thread
From: Heiko Stübner @ 2023-03-28 20:34 UTC (permalink / raw)
  To: Palmer Dabbelt, Evan Green
  Cc: slewis, vineetg, Conor Dooley, Evan Green, Albert Ou,
	Andrew Bresticker, Andrew Jones, Andrew Morton, Anup Patel,
	Arnd Bergmann, Atish Patra, Bagas Sanjaya, Catalin Marinas,
	Celeste Liu, Conor Dooley, Dao Lu, Guo Ren, Jann Horn,
	Jisheng Zhang, Jonathan Corbet, Ley Foon Tan, Mark Brown,
	Mike Kravetz, Nathan Chancellor, Palmer Dabbelt, Paul Walmsley,
	Peter Xu, Philipp Tomsich, Randy Dunlap, Samuel Holland,
	Shuah Khan, Sunil V L, Tobias Klauser, linux-doc, linux-kernel,
	linux-kselftest, linux-riscv

On Monday, 27 March 2023, 18:31:57 CEST, Evan Green wrote:
> 
> There's been a bunch of off-list discussions about this, including at
> Plumbers.  The original plan was to do something involving providing an
> ISA string to userspace, but ISA strings just aren't sufficient for a
> stable ABI any more: in order to parse an ISA string users need the
> version of the specifications that the string is written to, the version
> of each extension (sometimes at a finer granularity than the RISC-V
> releases/versions encode), and the expected use case for the ISA string
> (ie, is it a U-mode or M-mode string).  That's a lot of complexity to
> try and keep ABI compatible and it's probably going to continue to grow,
> as even if there's no more complexity in the specifications we'll have
> to deal with the various ISA string parsing oddities that end up all
> over userspace.
> 
> Instead this patch set takes a very different approach and provides a set
> of key/value pairs that encode various bits about the system.  The big
> advantage here is that we can clearly define what these mean so we can
> ensure ABI stability, but it also allows us to encode information that's
> unlikely to ever appear in an ISA string (see the misaligned access
> performance, for example).  The resulting interface looks a lot like
> what arm64 and x86 do, and will hopefully fit well into something like
> ACPI in the future.
> 
> The actual user interface is a syscall, with a vDSO function in front of
> it. The vDSO function can answer some queries without a syscall at all,
> and falls back to the syscall for cases it doesn't have answers to.
> Currently we prepopulate it with an array of answers for all keys and
> a CPU set of "all CPUs". This can be adjusted as necessary to provide
> fast answers to the most common queries.
> 
> An example series in glibc exposing this syscall and using it in an
> ifunc selector for memcpy can be found at [1]. I'm about to send a v2
> of that series out that incorporates the vDSO function.
> 
> I was asked about the performance delta between this and something like
> sysfs. I created a small test program [2] and ran it on a Nezha D1
> Allwinner board. Doing each operation 100000 times and dividing, these
> operations take the following amount of time:
>  - open()+read()+close() of /sys/kernel/cpu_byteorder: 3.8us
>  - access("/sys/kernel/cpu_byteorder", R_OK): 1.3us
>  - riscv_hwprobe() vDSO and syscall: .0094us
>  - riscv_hwprobe() vDSO with no syscall: 0.0091us

Looks like this series spawned a thread on one of the riscv-lists [0].

As auxvals were mentioned in that thread, I was wondering what the
difference is between adding a new syscall and exposing the keys + values
as architecture auxvec elements [1]?

I'm probably missing some simple issue, but from looking at that stuff
I gather RISCV_HWPROBE_KEY_BASE_BEHAVIOR could also just be
AT_RISCV_BASE_BEHAVIOR?


Heiko


[0] https://lists.riscv.org/g/sig-toolchains/topic/97886491
[1] https://elixir.bootlin.com/linux/latest/source/arch/riscv/include/uapi/asm/auxvec.h




* Re: [PATCH v5 0/6] RISC-V Hardware Probing User Interface
  2023-03-28 20:34 ` Heiko Stübner
@ 2023-03-28 22:53   ` Evan Green
  0 siblings, 0 replies; 6+ messages in thread
From: Evan Green @ 2023-03-28 22:53 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: Palmer Dabbelt, slewis, vineetg, Conor Dooley, Albert Ou,
	Andrew Bresticker, Andrew Jones, Andrew Morton, Anup Patel,
	Arnd Bergmann, Atish Patra, Bagas Sanjaya, Catalin Marinas,
	Celeste Liu, Conor Dooley, Dao Lu, Guo Ren, Jann Horn,
	Jisheng Zhang, Jonathan Corbet, Ley Foon Tan, Mark Brown,
	Mike Kravetz, Nathan Chancellor, Palmer Dabbelt, Paul Walmsley,
	Peter Xu, Philipp Tomsich, Randy Dunlap, Samuel Holland,
	Shuah Khan, Sunil V L, Tobias Klauser, linux-doc, linux-kernel,
	linux-kselftest, linux-riscv

On Tue, Mar 28, 2023 at 1:35 PM Heiko Stübner <heiko@sntech.de> wrote:
>
> On Monday, 27 March 2023, 18:31:57 CEST, Evan Green wrote:
> >
> > There's been a bunch of off-list discussions about this, including at
> > Plumbers.  The original plan was to do something involving providing an
> > ISA string to userspace, but ISA strings just aren't sufficient for a
> > stable ABI any more: in order to parse an ISA string users need the
> > version of the specifications that the string is written to, the version
> > of each extension (sometimes at a finer granularity than the RISC-V
> > releases/versions encode), and the expected use case for the ISA string
> > (ie, is it a U-mode or M-mode string).  That's a lot of complexity to
> > try and keep ABI compatible and it's probably going to continue to grow,
> > as even if there's no more complexity in the specifications we'll have
> > to deal with the various ISA string parsing oddities that end up all
> > over userspace.
> >
> > Instead this patch set takes a very different approach and provides a set
> > of key/value pairs that encode various bits about the system.  The big
> > advantage here is that we can clearly define what these mean so we can
> > ensure ABI stability, but it also allows us to encode information that's
> > unlikely to ever appear in an ISA string (see the misaligned access
> > performance, for example).  The resulting interface looks a lot like
> > what arm64 and x86 do, and will hopefully fit well into something like
> > ACPI in the future.
> >
> > The actual user interface is a syscall, with a vDSO function in front of
> > it. The vDSO function can answer some queries without a syscall at all,
> > and falls back to the syscall for cases it doesn't have answers to.
> > Currently we prepopulate it with an array of answers for all keys and
> > a CPU set of "all CPUs". This can be adjusted as necessary to provide
> > fast answers to the most common queries.
> >
> > An example series in glibc exposing this syscall and using it in an
> > ifunc selector for memcpy can be found at [1]. I'm about to send a v2
> > of that series out that incorporates the vDSO function.
> >
> > I was asked about the performance delta between this and something like
> > sysfs. I created a small test program [2] and ran it on a Nezha D1
> > Allwinner board. Doing each operation 100000 times and dividing, these
> > operations take the following amount of time:
> >  - open()+read()+close() of /sys/kernel/cpu_byteorder: 3.8us
> >  - access("/sys/kernel/cpu_byteorder", R_OK): 1.3us
> >  - riscv_hwprobe() vDSO and syscall: .0094us
> >  - riscv_hwprobe() vDSO with no syscall: 0.0091us
>
> Looks like this series spawned a thread on one of the riscv-lists [0].
>
> As auxvals were mentioned in that thread, I was wondering what's the
> difference between doing a new syscall vs. putting the keys + values as
> architecture auxvec elements [1] ?

The auxvec approach would also work. The primary difference is that
auxvec bits are actively copied into every new process, forever. If
you predict a slow pace of new bits coming in, the auxvec approach
probably makes more sense. This series was born out of a prediction
that this set of "stuff" was going to be larger than on traditional
x86/ARM architectures, fiddly (i.e., bits possibly representing specific
versions of various extensions), evolving regularly over time, and
heterogeneous between cores. With that sort of rubber band ball in
mind, a key/value interface seemed to make more sense.
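
For comparison, this is roughly what reading an existing auxvec entry
looks like from userspace, using getauxval() from <sys/auxv.h>. It is a
sketch rather than a proposal: hwcap_has_letter() and process_hwcap() are
hypothetical helpers, and the bit layout (one bit per single-letter
extension, at position letter - 'a') is my understanding of how AT_HWCAP
is encoded on RISC-V Linux.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/* True if the single-letter extension `letter` ('a'..'z') is set in hwcap. */
bool hwcap_has_letter(unsigned long hwcap, char letter)
{
	if (letter < 'a' || letter > 'z')
		return false;
	return (hwcap >> (letter - 'a')) & 1UL;
}

/* Read this process's AT_HWCAP entry from the auxiliary vector. */
unsigned long process_hwcap(void)
{
	return getauxval(AT_HWCAP);
}
```

Every such entry is a fixed word handed to each process at exec time,
which is the copying cost mentioned above.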

-Evan


* Re: [PATCH v5 0/6] RISC-V Hardware Probing User Interface
  2023-03-28  6:45 ` [PATCH v5 0/6] RISC-V Hardware Probing User Interface Conor Dooley
@ 2023-03-28 22:54   ` Evan Green
  0 siblings, 0 replies; 6+ messages in thread
From: Evan Green @ 2023-03-28 22:54 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Palmer Dabbelt, slewis, vineetg, heiko, Conor Dooley, Albert Ou,
	Andrew Bresticker, Andrew Jones, Andrew Morton, Anup Patel,
	Arnd Bergmann, Atish Patra, Bagas Sanjaya, Catalin Marinas,
	Celeste Liu, Dao Lu, Guo Ren, Heiko Stuebner, Jann Horn,
	Jisheng Zhang, Jonathan Corbet, Ley Foon Tan, Mark Brown,
	Mike Kravetz, Nathan Chancellor, Palmer Dabbelt, Paul Walmsley,
	Peter Xu, Philipp Tomsich, Randy Dunlap, Samuel Holland,
	Shuah Khan, Sunil V L, Tobias Klauser, linux-doc, linux-kernel,
	linux-kselftest, linux-riscv

On Mon, Mar 27, 2023 at 11:34 PM Conor Dooley
<conor.dooley@microchip.com> wrote:
>
> On Mon, Mar 27, 2023 at 09:31:57AM -0700, Evan Green wrote:
>
> Hey Evan,
>
> Patchwork has a rake of complaints about the series unfortunately:
> https://patchwork.kernel.org/project/linux-riscv/list/?series=734234
>
> Some of the checkpatch whinging may be spurious, but there's some
> definitely valid stuff in there!
>
> > Evan Green (6):
> >   RISC-V: Move struct riscv_cpuinfo to new header
> >   RISC-V: Add a syscall for HW probing
> >   RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA
> >   RISC-V: hwprobe: Support probing of misaligned access performance
> >   selftests: Test the new RISC-V hwprobe interface
>
> >   RISC-V: Add hwprobe vDSO function and data
>
> And this one breaks the build for !MMU kernels unfortunately.

Drat! Ok, thanks for the heads up. I'll go track these down.
-Evan
