* [PATCH v5 00/23] [PATCH v4 00/30]  Enable build of full Xen for RISC-V
@ 2024-02-26 17:38 Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 01/23] xen/riscv: disable unnecessary configs Oleksii Kurochko
                   ` (22 more replies)
  0 siblings, 23 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Doug Goldstein, Stefano Stabellini,
	Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Wei Liu,
	Tamas K Lengyel, Alexandru Isaila, Petre Pircalabu

This patch series performs all of the additions necessary to drop the
build overrides for RISCV and enable the full Xen build. Except in cases
where compatible implementations already exist (e.g. atomic.h and
bitops.h), the newly added definitions are simple.

The patch series is based on the following patch series:
- [PATCH v6 0/9] Introduce generic headers   [1]
- [PATCH] move __read_mostly to xen/cache.h  [2]
- [XEN PATCH v2 1/3] xen: introduce STATIC_ASSERT_UNREACHABLE() [3]
- [PATCH] xen/lib: introduce generic find next bit operations [4]

Right now, the patch series doesn't have a direct dependency on [2] and it
provides __read_mostly in the patch:
    [PATCH v3 26/34] xen/riscv: add definition of __read_mostly
However, it will be dropped as soon as [2] is merged or at least when the
final version of the patch [2] is provided.

[1] https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com/
[2] https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b333e@suse.com/
[3] https://lore.kernel.org/xen-devel/42fc6ae8d3eb802429d29c774502ff232340dc84.1706259490.git.federico.serafini@bugseng.com/
[4] https://lore.kernel.org/xen-devel/52730e6314210ba4164a9934a720c4fda201447b.1706266854.git.oleksii.kurochko@gmail.com/

---
Changes in V5:
 - Update the cover letter as one of the dependencies was merged to staging.
 - Introduced asm-generic for atomic ops and separate patches for asm-generic bit ops.
 - Moved fence.h to a separate patch to deal with some patches' dependencies on fence.h.
 - Patches were dropped as they were merged to staging:
   * [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V
   * [PATCH v4 04/30] xen/riscv: introduce cpufeature.h
   * [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h
   * [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h
   * [PATCH v4 08/30] xen/riscv: introduce setup.h
   * [PATCH v4 10/30] xen/riscv: introduce flushtlb.h
   * [PATCH v4 11/30] xen/riscv: introduce smp.h
   * [PATCH v4 15/30] xen/riscv: introduce irq.h
   * [PATCH v4 16/30] xen/riscv: introduce p2m.h
   * [PATCH v4 17/30] xen/riscv: introduce regs.h
   * [PATCH v4 18/30] xen/riscv: introduce time.h
   * [PATCH v4 19/30] xen/riscv: introduce event.h
   * [PATCH v4 22/30] xen/riscv: define an address of frame table
 - Other changes are specific to individual patches; please see the changelog of each patch.
---
Changes in V4:
 - Update the cover letter message: new patch series dependencies.
 - Some patches were merged to staging, so they were dropped in this patch series:
     [PATCH v3 09/34] xen/riscv: introduce system.h
     [PATCH v3 18/34] xen/riscv: introduce domain.h
     [PATCH v3 19/34] xen/riscv: introduce guest_access.h
 - Was sent out of this patch series:
     [PATCH v3 16/34] xen/lib: introduce generic find next bit operations
 - [PATCH v3 17/34] xen/riscv: add compilation of generic find-next-bit.c was
   dropped as CONFIG_GENERIC_FIND_NEXT_BIT was dropped.
 - All other changes are specific to a specific patch.
---
Changes in V3:
 - Update the cover letter message
 - The following patches were dropped as they were merged to staging:
    [PATCH v2 03/39] xen/riscv:introduce asm/byteorder.h
    [PATCH v2 04/39] xen/riscv: add public arch-riscv.h
    [PATCH v2 05/39] xen/riscv: introduce spinlock.h
    [PATCH v2 20/39] xen/riscv: define bug frame tables in xen.lds.S
    [PATCH v2 34/39] xen: add RISCV support for pmu.h
    [PATCH v2 35/39] xen: add necessary headers to common to build full Xen for RISC-V
 - New patches were introduced in place of the following:
    [PATCH v2 10/39] xen/riscv: introduce asm/iommu.h
    [PATCH v2 11/39] xen/riscv: introduce asm/nospec.h
 - remove "asm/"  for commit messages which start with "xen/riscv:"
 - code style updates.
 - add emulation of {cmp}xchg_* for 1 and 2 bytes types.
 - code style fixes.
 - add SPDX and footer for the newly added headers.
 - introduce generic find-next-bit.c.
 - some other minor changes (details can be found in the individual patches).
---
Changes in V2:
  - Drop the following patches as they are the part of [2]:
      [PATCH v1 06/57] xen/riscv: introduce paging.h
      [PATCH v1 08/57] xen/riscv: introduce asm/device.h
      [PATCH v1 10/57] xen/riscv: introduce asm/grant_table.h
      [PATCH v1 12/57] xen/riscv: introduce asm/hypercall.h
      [PATCH v1 13/57] xen/riscv: introduce asm/iocap.h
      [PATCH v1 15/57] xen/riscv: introduce asm/mem_access.h
      [PATCH v1 18/57] xen/riscv: introduce asm/random.h
      [PATCH v1 21/57] xen/riscv: introduce asm/xenoprof.h
      [PATCH v1 24/57] xen/riscv: introduce asm/percpu.h
      [PATCH v1 29/57] xen/riscv: introduce asm/hardirq.h
      [PATCH v1 33/57] xen/riscv: introduce asm/altp2m.h
      [PATCH v1 38/57] xen/riscv: introduce asm/monitor.h
      [PATCH v1 39/57] xen/riscv: introduce asm/numa.h
      [PATCH v1 42/57] xen/riscv: introduce asm/softirq.h
  - xen/lib.h was changed to xen/bug.h in most cases, as mostly the
    functionality of bug.h is used.
  - align arch-riscv.h with Arm's version of it.
  - change the Author of commit with introduction of asm/atomic.h.
  - update some definition from spinlock.h.
  - code style changes.
---

Oleksii Kurochko (23):
  xen/riscv: disable unnecessary configs
  xen/riscv: use some asm-generic headers
  xen/riscv: introduce nospec.h
  xen/asm-generic: introduce generic fls() and flsl() functions
  xen/asm-generic: introduce generic find first set bit functions
  xen/asm-generic: introduce generic ffz()
  xen/asm-generic: introduce generic hweight64()
  xen/asm-generic: introduce generic non-atomic test_*bit()
  xen/riscv: introduce bitops.h
  xen/riscv: introduces acrquire, release and full barriers
  xen/riscv: introduce cmpxchg.h
  xen/riscv: introduce io.h
  xen/riscv: introduce atomic.h
  xen/riscv: introduce monitor.h
  xen/riscv: add definition of __read_mostly
  xen/riscv: add required things to current.h
  xen/riscv: add minimal stuff to page.h to build full Xen
  xen/riscv: add minimal stuff to processor.h to build full Xen
  xen/riscv: add minimal stuff to mm.h to build full Xen
  xen/riscv: introduce vm_event_*() functions
  xen/rirscv: add minimal amount of stubs to build full Xen
  xen/riscv: enable full Xen build
  xen/README: add compiler and binutils versions for RISC-V64

 README                                        |   9 +
 automation/gitlab-ci/build.yaml               |  24 +
 docs/misc/riscv/booting.txt                   |   8 +
 xen/arch/riscv/Makefile                       |  18 +-
 xen/arch/riscv/arch.mk                        |  12 +-
 xen/arch/riscv/configs/tiny64_defconfig       |  17 +
 xen/arch/riscv/early_printk.c                 | 168 -------
 xen/arch/riscv/include/asm/Makefile           |  12 +
 xen/arch/riscv/include/asm/atomic.h           | 296 +++++++++++++
 xen/arch/riscv/include/asm/bitops.h           | 152 +++++++
 xen/arch/riscv/include/asm/cache.h            |   2 +
 xen/arch/riscv/include/asm/cmpxchg.h          | 241 ++++++++++
 xen/arch/riscv/include/asm/config.h           |   2 +
 xen/arch/riscv/include/asm/current.h          |  19 +
 xen/arch/riscv/include/asm/fence.h            |   9 +
 xen/arch/riscv/include/asm/io.h               | 157 +++++++
 xen/arch/riscv/include/asm/mm.h               | 246 +++++++++++
 xen/arch/riscv/include/asm/monitor.h          |  26 ++
 xen/arch/riscv/include/asm/nospec.h           |  25 ++
 xen/arch/riscv/include/asm/page.h             |  19 +
 xen/arch/riscv/include/asm/processor.h        |  23 +
 xen/arch/riscv/mm.c                           |  52 ++-
 xen/arch/riscv/setup.c                        |  10 +-
 xen/arch/riscv/stubs.c                        | 415 ++++++++++++++++++
 xen/arch/riscv/traps.c                        |  25 ++
 xen/arch/riscv/vm_event.c                     |  19 +
 xen/include/asm-generic/atomic-ops.h          |  92 ++++
 xen/include/asm-generic/bitops/__ffs.h        |  47 ++
 xen/include/asm-generic/bitops/bitops-bits.h  |  21 +
 xen/include/asm-generic/bitops/ffs.h          |   9 +
 xen/include/asm-generic/bitops/ffsl.h         |  16 +
 xen/include/asm-generic/bitops/ffz.h          |  18 +
 .../asm-generic/bitops/find-first-set-bit.h   |  17 +
 xen/include/asm-generic/bitops/fls.h          |  18 +
 xen/include/asm-generic/bitops/flsl.h         |  10 +
 .../asm-generic/bitops/generic-non-atomic.h   |  89 ++++
 xen/include/asm-generic/bitops/hweight.h      |  13 +
 xen/include/asm-generic/bitops/test-bit.h     |  18 +
 38 files changed, 2198 insertions(+), 176 deletions(-)
 create mode 100644 docs/misc/riscv/booting.txt
 create mode 100644 xen/arch/riscv/include/asm/Makefile
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/arch/riscv/include/asm/bitops.h
 create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
 create mode 100644 xen/arch/riscv/include/asm/fence.h
 create mode 100644 xen/arch/riscv/include/asm/io.h
 create mode 100644 xen/arch/riscv/include/asm/monitor.h
 create mode 100644 xen/arch/riscv/include/asm/nospec.h
 create mode 100644 xen/arch/riscv/stubs.c
 create mode 100644 xen/arch/riscv/vm_event.c
 create mode 100644 xen/include/asm-generic/atomic-ops.h
 create mode 100644 xen/include/asm-generic/bitops/__ffs.h
 create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
 create mode 100644 xen/include/asm-generic/bitops/ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffsl.h
 create mode 100644 xen/include/asm-generic/bitops/ffz.h
 create mode 100644 xen/include/asm-generic/bitops/find-first-set-bit.h
 create mode 100644 xen/include/asm-generic/bitops/fls.h
 create mode 100644 xen/include/asm-generic/bitops/flsl.h
 create mode 100644 xen/include/asm-generic/bitops/generic-non-atomic.h
 create mode 100644 xen/include/asm-generic/bitops/hweight.h
 create mode 100644 xen/include/asm-generic/bitops/test-bit.h

-- 
2.43.0




* [PATCH v5 01/23] xen/riscv: disable unnecessary configs
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 02/23] xen/riscv: use some asm-generic headers Oleksii Kurochko
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Doug Goldstein, Stefano Stabellini,
	Alistair Francis, Bob Eshleman, Connor Davis

This patch disables unnecessary configs for two cases:
1. By utilizing EXTRA_FIXED_RANDCONFIG for randconfig builds (GitLab CI jobs).
2. By using tiny64_defconfig for non-randconfig builds.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - Rebase and drop duplicated configs in EXTRA_FIXED_RANDCONFIG list
 - Update the commit message
---
Changes in V4:
 - Nothing changed. Only rebase
---
Changes in V3:
 - Remove EXTRA_FIXED_RANDCONFIG for non-randconfig jobs.
   For non-randconfig jobs, it is sufficient to disable configs by using the defconfig.
 - Remove double blank lines in build.yaml file before archlinux-current-gcc-riscv64-debug
---
Changes in V2:
 - update the commit message.
 - remove xen/arch/riscv/Kconfig changes.
---
 automation/gitlab-ci/build.yaml         | 24 ++++++++++++++++++++++++
 xen/arch/riscv/configs/tiny64_defconfig | 17 +++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
index aac29ee13a..3b3d2c47dc 100644
--- a/automation/gitlab-ci/build.yaml
+++ b/automation/gitlab-ci/build.yaml
@@ -519,6 +519,30 @@ alpine-3.18-gcc-debug-arm64-boot-cpupools:
       CONFIG_EXPERT=y
       CONFIG_GRANT_TABLE=n
       CONFIG_MEM_ACCESS=n
+      CONFIG_SCHED_CREDIT=n
+      CONFIG_SCHED_CREDIT2=n
+      CONFIG_SCHED_RTDS=n
+      CONFIG_SCHED_NULL=n
+      CONFIG_SCHED_ARINC653=n
+      CONFIG_TRACEBUFFER=n
+      CONFIG_HYPFS=n
+      CONFIG_SPECULATIVE_HARDEN_ARRAY=n
+      CONFIG_ARGO=n
+      CONFIG_HYPFS_CONFIG=n
+      CONFIG_CORE_PARKING=n
+      CONFIG_DEBUG_TRACE=n
+      CONFIG_IOREQ_SERVER=n
+      CONFIG_CRASH_DEBUG=n
+      CONFIG_KEXEC=n
+      CONFIG_LIVEPATCH=n
+      CONFIG_NUMA=n
+      CONFIG_PERF_COUNTERS=n
+      CONFIG_HAS_PMAP=n
+      CONFIG_XENOPROF=n
+      CONFIG_COMPAT=n
+      CONFIG_UBSAN=n
+      CONFIG_NEEDS_LIBELF=n
+      CONFIG_XSM=n
 
 archlinux-current-gcc-riscv64:
   extends: .gcc-riscv64-cross-build
diff --git a/xen/arch/riscv/configs/tiny64_defconfig b/xen/arch/riscv/configs/tiny64_defconfig
index 09defe236b..35915255e6 100644
--- a/xen/arch/riscv/configs/tiny64_defconfig
+++ b/xen/arch/riscv/configs/tiny64_defconfig
@@ -7,6 +7,23 @@
 # CONFIG_GRANT_TABLE is not set
 # CONFIG_SPECULATIVE_HARDEN_ARRAY is not set
 # CONFIG_MEM_ACCESS is not set
+# CONFIG_ARGO is not set
+# CONFIG_HYPFS_CONFIG is not set
+# CONFIG_CORE_PARKING is not set
+# CONFIG_DEBUG_TRACE is not set
+# CONFIG_IOREQ_SERVER is not set
+# CONFIG_CRASH_DEBUG is not set
+# CONFIG_KEXEC is not set
+# CONFIG_LIVEPATCH is not set
+# CONFIG_NUMA is not set
+# CONFIG_PERF_COUNTERS is not set
+# CONFIG_HAS_PMAP is not set
+# CONFIG_TRACEBUFFER is not set
+# CONFIG_XENOPROF is not set
+# CONFIG_COMPAT is not set
+# CONFIG_COVERAGE is not set
+# CONFIG_UBSAN is not set
+# CONFIG_NEEDS_LIBELF is not set
 
 CONFIG_RISCV_64=y
 CONFIG_DEBUG=y
-- 
2.43.0




* [PATCH v5 02/23] xen/riscv: use some asm-generic headers
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 01/23] xen/riscv: disable unnecessary configs Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-27  7:35   ` Jan Beulich
  2024-02-26 17:38 ` [PATCH v5 03/23] xen/riscv: introduce nospec.h Oleksii Kurochko
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The following headers end up the same as asm-generic's version:
* altp2m.h
* device.h
* div64.h
* hardirq.h
* hypercall.h
* iocap.h
* paging.h
* percpu.h
* random.h
* softirq.h
* vm_event.h

RISC-V should utilize the asm-generic's version of the mentioned
headers instead of introducing them in the arch-specific folder.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V5:
 - Nothing changed. Only rebase.
 - update the commit message.
 - drop the message above the revision log as this patch does not depend on
   any other patch series.
---
Changes in V4:
 - removed numa.h from asm/include/Makefile because of the patch: [PATCH v2] NUMA: no need for asm/numa.h when !NUMA
 - updated the commit message
---
Changes in V3:
 - remove monitor.h from the RISC-V asm/Makefile list.
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V2:
 - New commit introduced in V2.
---
 xen/arch/riscv/include/asm/Makefile | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/Makefile

diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
new file mode 100644
index 0000000000..ced02e26ed
--- /dev/null
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+generic-y += altp2m.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += hardirq.h
+generic-y += hypercall.h
+generic-y += iocap.h
+generic-y += paging.h
+generic-y += percpu.h
+generic-y += random.h
+generic-y += softirq.h
+generic-y += vm_event.h
-- 
2.43.0




* [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 01/23] xen/riscv: disable unnecessary configs Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 02/23] xen/riscv: use some asm-generic headers Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-27  7:38   ` Jan Beulich
                     ` (2 more replies)
  2024-02-26 17:38 ` [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions Oleksii Kurochko
                   ` (19 subsequent siblings)
  22 siblings, 3 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

From the unprivileged doc:
  No standard hints are presently defined.
  We anticipate standard hints to eventually include memory-system spatial
  and temporal locality hints, branch prediction hints, thread-scheduling
  hints, security tags, and instrumentation flags for simulation/emulation.

Also, there are no speculative execution barriers.

Therefore, functions evaluate_nospec() and block_speculation() should
remain empty until a specific platform has an extension to deal with
speculative execution.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/arch/riscv/include/asm/nospec.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/nospec.h

diff --git a/xen/arch/riscv/include/asm/nospec.h b/xen/arch/riscv/include/asm/nospec.h
new file mode 100644
index 0000000000..4fb404a0a2
--- /dev/null
+++ b/xen/arch/riscv/include/asm/nospec.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2024 Vates */
+
+#ifndef _ASM_GENERIC_NOSPEC_H
+#define _ASM_GENERIC_NOSPEC_H
+
+static inline bool evaluate_nospec(bool condition)
+{
+    return condition;
+}
+
+static inline void block_speculation(void)
+{
+}
+
+#endif /* _ASM_GENERIC_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0
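
To illustrate why identity stubs are enough for the build, here is a minimal
sketch of how a caller might use evaluate_nospec() around a bounds check. The
two stubs have the same shape as the ones in the patch; table_lookup(),
TABLE_SIZE and the table array are hypothetical names used only for this
illustration and are not part of Xen.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stubs with the same shape as the ones in the patch: no barrier is emitted. */
static inline bool evaluate_nospec(bool condition)
{
    return condition;
}

static inline void block_speculation(void)
{
}

/* Hypothetical caller data, not part of the patch. */
#define TABLE_SIZE 8

static const int table[TABLE_SIZE] = { 10, 11, 12, 13, 14, 15, 16, 17 };

/* Returns the entry at idx, or -1 if idx is out of range. */
static inline int table_lookup(size_t idx)
{
    /*
     * On an architecture with a speculation barrier, evaluate_nospec()
     * would keep the load below from running speculatively with an
     * out-of-range idx. With the empty stubs it is a plain comparison.
     */
    if ( evaluate_nospec(idx < TABLE_SIZE) )
        return table[idx];

    block_speculation();
    return -1;
}
```

With the stubs above, table_lookup(3) simply returns table[3]; callers keep the
same source form, so a platform-specific barrier could be added later without
touching them.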




* [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (2 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 03/23] xen/riscv: introduce nospec.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-29 13:54   ` Julien Grall
                     ` (2 more replies)
  2024-02-26 17:38 ` [PATCH v5 05/23] xen/asm-generic: introduce generic find first set bit functions Oleksii Kurochko
                   ` (18 subsequent siblings)
  22 siblings, 3 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

These functions can be useful for architectures that don't
have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
 xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++
 2 files changed, 28 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/fls.h
 create mode 100644 xen/include/asm-generic/bitops/flsl.h

diff --git a/xen/include/asm-generic/bitops/fls.h b/xen/include/asm-generic/bitops/fls.h
new file mode 100644
index 0000000000..369a4c790c
--- /dev/null
+++ b/xen/include/asm-generic/bitops/fls.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FLS_H_
+#define _ASM_GENERIC_BITOPS_FLS_H_
+
+/**
+ * fls - find last (most-significant) bit set
+ * @x: the word to search
+ *
+ * This is defined the same way as ffs.
+ * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
+ */
+
+static inline int fls(unsigned int x)
+{
+    return generic_fls(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */
diff --git a/xen/include/asm-generic/bitops/flsl.h b/xen/include/asm-generic/bitops/flsl.h
new file mode 100644
index 0000000000..d0a2e9c729
--- /dev/null
+++ b/xen/include/asm-generic/bitops/flsl.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
+#define _ASM_GENERIC_BITOPS_FLSL_H_
+
+static inline int flsl(unsigned long x)
+{
+    return generic_flsl(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */
-- 
2.43.0
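
The wrappers in this patch forward to generic_fls() and generic_flsl(), whose
bodies are not shown here. As a rough sketch of what such a software fallback
can look like, the code below finds the most significant set bit by binary
search. sketch_fls() and sketch_flsl() are hypothetical names, and the code
assumes a 32-bit unsigned int; it is not claimed to match Xen's implementation.

```c
#include <limits.h>

/* Hypothetical stand-in for generic_fls(): 1-indexed top set bit, 0 if none. */
static inline int sketch_fls(unsigned int x)
{
    int r = 32;

    if ( !x )
        return 0;
    /* Halve the search window each step, shifting the top bit upwards. */
    if ( !(x & 0xffff0000u) ) { x <<= 16; r -= 16; }
    if ( !(x & 0xff000000u) ) { x <<= 8;  r -= 8; }
    if ( !(x & 0xf0000000u) ) { x <<= 4;  r -= 4; }
    if ( !(x & 0xc0000000u) ) { x <<= 2;  r -= 2; }
    if ( !(x & 0x80000000u) ) { r -= 1; }
    return r;
}

/* Hypothetical stand-in for generic_flsl(): same contract for unsigned long. */
static inline int sketch_flsl(unsigned long x)
{
#if ULONG_MAX > 0xffffffffUL
    /* 64-bit long: check the upper half first. */
    unsigned int hi = (unsigned int)(x >> 32);

    if ( hi )
        return sketch_fls(hi) + 32;
#endif
    return sketch_fls((unsigned int)x);
}
```

The values quoted in the header comment hold for this sketch as well:
sketch_fls(0) is 0, sketch_fls(1) is 1 and sketch_fls(0x80000000) is 32.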




* [PATCH v5 05/23] xen/asm-generic: introduce generic find first set bit functions
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (3 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 06/23] xen/asm-generic: introduce generic ffz() Oleksii Kurochko
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

These functions can be useful for architectures that don't
have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/__ffs.h        | 47 +++++++++++++++++++
 xen/include/asm-generic/bitops/ffs.h          |  9 ++++
 xen/include/asm-generic/bitops/ffsl.h         | 16 +++++++
 .../asm-generic/bitops/find-first-set-bit.h   | 17 +++++++
 4 files changed, 89 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/__ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffsl.h
 create mode 100644 xen/include/asm-generic/bitops/find-first-set-bit.h

diff --git a/xen/include/asm-generic/bitops/__ffs.h b/xen/include/asm-generic/bitops/__ffs.h
new file mode 100644
index 0000000000..fecb4484d9
--- /dev/null
+++ b/xen/include/asm-generic/bitops/__ffs.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS___FFS_H_
+#define _ASM_GENERIC_BITOPS___FFS_H_
+
+/**
+ * __ffs - find first set bit in word.
+ * @word: The word to search
+ *
+ * Returns the 0-indexed bit location; undefined if no bit is set.
+ */
+static inline unsigned int __ffs(unsigned long word)
+{
+    unsigned int num = 0;
+
+#if BITS_PER_LONG == 64
+    if ( (word & 0xffffffff) == 0 )
+    {
+        num += 32;
+        word >>= 32;
+    }
+#endif
+    if ( (word & 0xffff) == 0 )
+    {
+        num += 16;
+        word >>= 16;
+    }
+    if ( (word & 0xff) == 0 )
+    {
+        num += 8;
+        word >>= 8;
+    }
+    if ( (word & 0xf) == 0 )
+    {
+        num += 4;
+        word >>= 4;
+    }
+    if ( (word & 0x3) == 0 )
+    {
+        num += 2;
+        word >>= 2;
+    }
+    if ( (word & 0x1) == 0 )
+        num += 1;
+    return num;
+}
+
+#endif /* _ASM_GENERIC_BITOPS___FFS_H_ */
diff --git a/xen/include/asm-generic/bitops/ffs.h b/xen/include/asm-generic/bitops/ffs.h
new file mode 100644
index 0000000000..3f75fded14
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffs.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFS_H_
+#define _ASM_GENERIC_BITOPS_FFS_H_
+
+#include <xen/macros.h>
+
+#define ffs(x) ({ unsigned int t_ = (x); fls(ISOLATE_LSB(t_)); })
+
+#endif /* _ASM_GENERIC_BITOPS_FFS_H_ */
diff --git a/xen/include/asm-generic/bitops/ffsl.h b/xen/include/asm-generic/bitops/ffsl.h
new file mode 100644
index 0000000000..d0996808f5
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffsl.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFSL_H_
+#define _ASM_GENERIC_BITOPS_FFSL_H_
+
+/**
+ * ffsl - find first bit in long.
+ * @word: The word to search
+ *
+ * Returns 0 if no bit exists, otherwise returns 1-indexed bit location.
+ */
+static inline unsigned int ffsl(unsigned long word)
+{
+    return generic_ffsl(word);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FFSL_H_ */
diff --git a/xen/include/asm-generic/bitops/find-first-set-bit.h b/xen/include/asm-generic/bitops/find-first-set-bit.h
new file mode 100644
index 0000000000..7d28b8a89b
--- /dev/null
+++ b/xen/include/asm-generic/bitops/find-first-set-bit.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_
+#define _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_
+
+/**
+ * find_first_set_bit - find the first set bit in @word
+ * @word: the word to search
+ *
+ * Returns the bit-number of the first set bit (first bit being 0).
+ * The input must *not* be zero.
+ */
+static inline unsigned int find_first_set_bit(unsigned long word)
+{
+    return ffsl(word) - 1;
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_ */
-- 
2.43.0
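
The four headers in this patch differ mainly in indexing convention: __ffs()
and find_first_set_bit() are 0-indexed and require a non-zero input, while
ffs() and ffsl() are 1-indexed and return 0 for a zero input. The sketch below
shows those relationships with a simple loop instead of the binary search used
in the patch. All sketch_* names are hypothetical, and sketch_isolate_lsb() is
only an assumption about what a macro such as ISOLATE_LSB() computes.

```c
#include <assert.h>

/* Keep only the lowest set bit, e.g. 0x18 -> 0x8 (assumed ISOLATE_LSB behaviour). */
static inline unsigned long sketch_isolate_lsb(unsigned long x)
{
    return x & (0UL - x);
}

/* 0-indexed position of the lowest set bit; word must be non-zero. */
static inline unsigned int sketch_lowest_bit(unsigned long word)
{
    unsigned int num = 0;

    assert(word != 0);
    while ( !(word & 1UL) )
    {
        word >>= 1;
        num++;
    }
    return num;
}

/* 1-indexed variant in the style of ffsl(): returns 0 when no bit is set. */
static inline unsigned int sketch_ffsl(unsigned long word)
{
    return word ? sketch_lowest_bit(word) + 1 : 0;
}

/* Same relation as find_first_set_bit() in the patch: ffsl(word) - 1. */
static inline unsigned int sketch_find_first_set_bit(unsigned long word)
{
    assert(word != 0);
    return sketch_ffsl(word) - 1;
}
```

For a non-zero word, sketch_find_first_set_bit() and sketch_lowest_bit() agree,
which is why the zero case has to be excluded by the caller.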




* [PATCH v5 06/23] xen/asm-generic: introduce generic ffz()
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (4 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 05/23] xen/asm-generic: introduce generic find first set bit functions Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 07/23] xen/asm-generic: introduce generic hweight64() Oleksii Kurochko
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

The generic ffz() can be useful for architectures
that don't have corresponding arch-specific instruction.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/ffz.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/ffz.h

diff --git a/xen/include/asm-generic/bitops/ffz.h b/xen/include/asm-generic/bitops/ffz.h
new file mode 100644
index 0000000000..5932fe6695
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffz.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFZ_H_
+#define _ASM_GENERIC_BITOPS_FFZ_H_
+
+/*
+ * ffz - find first zero in word.
+ * @word: The word to search
+ *
+ * Undefined if no zero exists, so code should check against ~0UL first.
+ *
+ * ffz() is defined via __ffs() rather than ffs() because that is how the
+ * Linux kernel (6.4.0), from which this header was taken, defines it; the
+ * header is meant to stay aligned with the Linux version.
+ * Most architectures in Xen define ffz() the same way.
+ */
+#define ffz(x)  __ffs(~(x))
+
+#endif /* _ASM_GENERIC_BITOPS_FFZ_H_ */
-- 
2.43.0
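
Because ffz() is just __ffs() applied to the complement, its result is
0-indexed and the all-ones input is the undefined case. A small sketch of that
identity follows; sketch_lowest_bit() and sketch_ffz() are hypothetical
stand-ins, written with a plain loop rather than the patch's __ffs().

```c
#include <assert.h>

/* 0-indexed position of the lowest set bit; word must be non-zero. */
static inline unsigned int sketch_lowest_bit(unsigned long word)
{
    unsigned int num = 0;

    assert(word != 0);
    while ( !(word & 1UL) )
    {
        word >>= 1;
        num++;
    }
    return num;
}

/* First zero bit of word, 0-indexed; word must not be all ones. */
static inline unsigned int sketch_ffz(unsigned long word)
{
    assert(word != ~0UL);
    return sketch_lowest_bit(~word);
}
```

For example, 0x7 has bits 0 to 2 set, so its first zero bit is bit 3.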




* [PATCH v5 07/23] xen/asm-generic: introduce generic hweight64()
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (5 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 06/23] xen/asm-generic: introduce generic ffz() Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 08/23] xen/asm-generic: introduce generic non-atomic test_*bit() Oleksii Kurochko
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

The generic hweight() function can be useful for architectures
that don't have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/hweight.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/hweight.h

diff --git a/xen/include/asm-generic/bitops/hweight.h b/xen/include/asm-generic/bitops/hweight.h
new file mode 100644
index 0000000000..0d7577054e
--- /dev/null
+++ b/xen/include/asm-generic/bitops/hweight.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_HWEIGHT_H_
+#define _ASM_GENERIC_BITOPS_HWEIGHT_H_
+
+/*
+ * hweightN - returns the hamming weight of a N-bit word
+ * @x: the word to weigh
+ *
+ * The Hamming Weight of a number is the total number of bits set in it.
+ */
+#define hweight64(x) generic_hweight64(x)
+
+#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */
-- 
2.43.0
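
The macro above forwards to generic_hweight64(), which is defined elsewhere.
For readers unfamiliar with software population counts, the following is a
sketch of the well-known parallel bit-summing technique; sketch_hweight64() is
a hypothetical name and the code is not claimed to be Xen's implementation.

```c
#include <stdint.h>

/* Count the set bits of a 64-bit value without a popcount instruction. */
static inline unsigned int sketch_hweight64(uint64_t w)
{
    /* Sum adjacent bits into 2-bit fields. */
    w -= (w >> 1) & 0x5555555555555555ULL;
    /* Sum 2-bit fields into 4-bit fields. */
    w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
    /* Sum 4-bit fields into per-byte counts. */
    w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    /* The multiply accumulates all byte counts into the top byte. */
    return (unsigned int)((w * 0x0101010101010101ULL) >> 56);
}
```

An architecture with a native population-count instruction would normally
provide its own definition instead of a fallback like this one.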




* [PATCH v5 08/23] xen/asm-generic: introduce generic non-atomic test_*bit()
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (6 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 07/23] xen/asm-generic: introduce generic hweight64() Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 09/23] xen/riscv: introduce bitops.h Oleksii Kurochko
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

The patch introduces the following generic functions:
* test_bit
* generic___test_and_set_bit
* generic___test_and_clear_bit
* generic___test_and_change_bit

Also, the patch introduces the following generics which are
used by the functions mentioned above:
* BITOP_BITS_PER_WORD
* BITOP_MASK
* BITOP_WORD
* BITOP_TYPE

These functions and macros can be useful for architectures
that don't have corresponding arch-specific instructions.
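
As a rough illustration of how the word and mask helpers combine, the sketch
below indexes a bitmap stored as 32-bit words and implements a non-atomic
test-and-set on top of it. The SKETCH_* macros and sketch_* functions are
hypothetical names that mirror the 32-bit defaults of this patch; they are not
the patch's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BITS_PER_WORD 32
/* Mask selecting bit nr inside its word. */
#define SKETCH_MASK(nr)  (UINT32_C(1) << ((nr) % SKETCH_BITS_PER_WORD))
/* Index of the word that holds bit nr. */
#define SKETCH_WORD(nr)  ((nr) / SKETCH_BITS_PER_WORD)

/* Returns true if bit nr of the bitmap at addr is set. */
static inline bool sketch_test_bit(unsigned int nr, const uint32_t *addr)
{
    return (addr[SKETCH_WORD(nr)] & SKETCH_MASK(nr)) != 0;
}

/* Non-atomic: sets bit nr and returns its previous value. */
static inline bool sketch_test_and_set_bit(unsigned int nr, uint32_t *addr)
{
    uint32_t mask = SKETCH_MASK(nr);
    uint32_t *p = addr + SKETCH_WORD(nr);
    uint32_t old = *p;

    *p = old | mask;
    return (old & mask) != 0;
}
```

Bit 35, for instance, lives in word 1 under mask 0x8. Since the read and the
write are separate steps, callers of such non-atomic helpers have to provide
their own serialization.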

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/bitops-bits.h  | 21 +++++
 .../asm-generic/bitops/generic-non-atomic.h   | 89 +++++++++++++++++++
 xen/include/asm-generic/bitops/test-bit.h     | 18 ++++
 3 files changed, 128 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
 create mode 100644 xen/include/asm-generic/bitops/generic-non-atomic.h
 create mode 100644 xen/include/asm-generic/bitops/test-bit.h

diff --git a/xen/include/asm-generic/bitops/bitops-bits.h b/xen/include/asm-generic/bitops/bitops-bits.h
new file mode 100644
index 0000000000..4ece2affd6
--- /dev/null
+++ b/xen/include/asm-generic/bitops/bitops-bits.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_BITS_H_
+#define _ASM_GENERIC_BITOPS_BITS_H_
+
+#ifndef BITOP_BITS_PER_WORD
+#define BITOP_BITS_PER_WORD     32
+#endif
+
+#ifndef BITOP_MASK
+#define BITOP_MASK(nr)          ((bitops_uint_t)1 << ((nr) % BITOP_BITS_PER_WORD))
+#endif
+
+#ifndef BITOP_WORD
+#define BITOP_WORD(nr)          ((nr) / BITOP_BITS_PER_WORD)
+#endif
+
+#ifndef BITOP_TYPE
+typedef uint32_t bitops_uint_t;
+#endif
+
+#endif /* _ASM_GENERIC_BITOPS_BITS_H_ */
diff --git a/xen/include/asm-generic/bitops/generic-non-atomic.h b/xen/include/asm-generic/bitops/generic-non-atomic.h
new file mode 100644
index 0000000000..42569d0d7c
--- /dev/null
+++ b/xen/include/asm-generic/bitops/generic-non-atomic.h
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * The file is based on Linux ( 6.4.0 ) header:
+ *   include/asm-generic/bitops/generic-non-atomic.h
+ * 
+ * Only functions that can be reused in Xen were left; others were removed.
+ * 
+ * Also, the following changes were done:
+ *  - the message inside #ifndef ... #endif was updated.
+ *  - __always_inline -> always_inline to be aligned with the definition in
+ *    xen/compiler.h.
+ *  - update function prototypes from
+ *    generic___test_and_*(unsigned long nr, volatile unsigned long *addr) to
+ *    generic___test_and_*(unsigned long nr, volatile void *addr) to be
+ *    consistent with other related macros/defines.
+ *  - convert indentation from tabs to spaces.
+ *  - inside generic___test_and_* use 'bitops_uint_t' instead of 'unsigned long'
+ *    to be generic.
+ */
+
+#ifndef __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+#define __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+
+#include <xen/compiler.h>
+
+#include <asm-generic/bitops/bitops-bits.h>
+
+#ifndef _LINUX_BITOPS_H
+#error only <xen/bitops.h> can be included directly
+#endif
+
+/*
+ * Generic definitions for bit operations, should not be used in regular code
+ * directly.
+ */
+
+/**
+ * generic___test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static always_inline bool
+generic___test_and_set_bit(unsigned long nr, volatile void *addr)
+{
+    bitops_uint_t mask = BITOP_MASK(nr);
+    bitops_uint_t *p = ((bitops_uint_t *)addr) + BITOP_WORD(nr);
+    bitops_uint_t old = *p;
+
+    *p = old | mask;
+    return (old & mask) != 0;
+}
+
+/**
+ * generic___test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static always_inline bool
+generic___test_and_clear_bit(unsigned long nr, volatile void *addr)
+{
+    bitops_uint_t mask = BITOP_MASK(nr);
+    bitops_uint_t *p = ((bitops_uint_t *)addr) + BITOP_WORD(nr);
+    bitops_uint_t old = *p;
+
+    *p = old & ~mask;
+    return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered! */
+static always_inline bool
+generic___test_and_change_bit(unsigned long nr, volatile void *addr)
+{
+    bitops_uint_t mask = BITOP_MASK(nr);
+    bitops_uint_t *p = ((bitops_uint_t *)addr) + BITOP_WORD(nr);
+    bitops_uint_t old = *p;
+
+    *p = old ^ mask;
+    return (old & mask) != 0;
+}
+
+#endif /* __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H */
diff --git a/xen/include/asm-generic/bitops/test-bit.h b/xen/include/asm-generic/bitops/test-bit.h
new file mode 100644
index 0000000000..6fb414d808
--- /dev/null
+++ b/xen/include/asm-generic/bitops/test-bit.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_TESTBIT_H_
+#define _ASM_GENERIC_BITOPS_TESTBIT_H_
+
+#include <asm-generic/bitops/bitops-bits.h>
+
+/**
+ * test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static inline int test_bit(int nr, const volatile void *addr)
+{
+    const volatile bitops_uint_t *p = addr;
+    return 1 & (p[BITOP_WORD(nr)] >> (nr & (BITOP_BITS_PER_WORD - 1)));
+}
+
+#endif /* _ASM_GENERIC_BITOPS_TESTBIT_H_ */
-- 
2.43.0




* [PATCH v5 09/23] xen/riscv: introduce bitops.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (7 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 08/23] xen/asm-generic: introduce generic non-atomic test_*bit() Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 10/23] xen/riscv: introduces acrquire, release and full barriers Oleksii Kurochko
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Taken from Linux-6.4.0-rc1

Xen's bitops.h is composed from several Linux headers:
* linux/arch/include/asm/bitops.h:
  * The following functions were removed as they aren't used in Xen:
        * test_and_set_bit_lock
        * clear_bit_unlock
        * __clear_bit_unlock
  * The following functions were renamed to match how they are
    used by common code:
        * __test_and_set_bit
        * __test_and_clear_bit
  * The declarations and implementations of the following functions
    were updated to make the Xen build happy:
        * clear_bit
        * set_bit
        * __test_and_clear_bit
        * __test_and_set_bit
* linux/include/asm-generic/bitops/generic-non-atomic.h with the
  following changes:
  * Only functions that can be reused in Xen were left;
    others were removed.
  * The message inside #ifndef ... #endif was updated.
  * __always_inline -> always_inline to be aligned with the definition in
    xen/compiler.h.
  * Update function prototypes from
    generic___test_and_*(unsigned long nr, volatile unsigned long *addr)
    to
    generic___test_and_*(unsigned long nr, volatile void *addr) to be
    consistent with other related macros/defines.
  * Convert indentation from tabs to spaces.
  * Inside generic___test_and_* use 'bitops_uint_t' instead of 'unsigned long'
    to be generic.
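
For readers without a RISC-V toolchain at hand, the atomic variants can
be approximated with C11 atomics. This is only a sketch: the sketch_*
names are hypothetical, and a sequentially consistent fetch-op is used
as a rough analogue of the amo<op>.{w|d}.aqrl form, not as a statement
of what the compiler emits:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK(nr) (1UL << ((nr) % SKETCH_BITS_PER_LONG))
#define SKETCH_WORD(nr) ((nr) / SKETCH_BITS_PER_LONG)

/* Atomic fetch-or with the plain mask (the NOP modifier in the patch). */
static bool sketch_test_and_set_bit(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask = SKETCH_MASK(nr);
    unsigned long old = atomic_fetch_or(&addr[SKETCH_WORD(nr)], mask);

    return (old & mask) != 0;
}

/* Atomic fetch-and with the inverted mask (the NOT modifier). */
static bool sketch_test_and_clear_bit(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask = SKETCH_MASK(nr);
    unsigned long old = atomic_fetch_and(&addr[SKETCH_WORD(nr)], ~mask);

    return (old & mask) != 0;
}

/* Atomic fetch-xor toggles the bit and reports its previous state. */
static bool sketch_test_and_change_bit(unsigned int nr, atomic_ulong *addr)
{
    unsigned long mask = SKETCH_MASK(nr);
    unsigned long old = atomic_fetch_xor(&addr[SKETCH_WORD(nr)], mask);

    return (old & mask) != 0;
}

/* Set, toggle and clear bit 3; returns the final value of word 0. */
static unsigned long sketch_demo(void)
{
    atomic_ulong map[1];
    bool was_set, toggled_from_set, still_set;

    atomic_init(&map[0], 0);
    was_set = sketch_test_and_set_bit(3, map);             /* false */
    toggled_from_set = sketch_test_and_change_bit(3, map); /* true  */
    still_set = sketch_test_and_clear_bit(3, map);         /* false */

    assert(!was_set && toggled_from_set && !still_set);
    return atomic_load(&map[0]);
}
```

Each helper is a single read-modify-write, so unlike the
generic___test_and_*() versions no external lock is needed for the bit
update itself.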

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
  Patches 04 - 08 of this patch series are prerequisite for this patch.
---
Changes in V5:
 - Code style fixes
 - s/__NOP/NOP/g
 - s/__NOT/NOT/g
 - update the comments above functions: test_and_set_bit, test_and_clear_bit, set_bit, clear_bit as all of them
   are using atomic operation and a memory barrier, so the operation in it cannot be reordered.
 - s/volatile uint32_t/volatile bitops_uint_t in  test_and_set_bit, test_and_clear_bit, set_bit, clear_bit.
 - update the commit message
 - split introduction of asm-generic functions to separate patches:
   Patches 04 - 08 of this patch series are prerequisite for this patch.
---
Changes in V4:
  - updated the commit message: dropped the message about what was taken from linux/include/asm-generic/bitops/find.h
    as related changes now are located in xen/bitops.h. Also these changes were removed from riscv/bitops.h
  - switch tabs to spaces.
  - update return type of __ffs function, format __ffs according to Xen code style. Move the function to
    respective asm-generic header.
  - format ffsl() according to Xen code style, update the type of num: int -> unsigned to be aligned with
    return type of the function. Move the function to respective asm-generic header.
  - add new line for the files:
      asm-generic/bitops-bits.h
      asm-generic/ffz.h
      asm-generic/find-first-bit-set.h
      asm-generic/fls.h
      asm-generic/flsl.h
      asm-generic/test-bit.h
  - rename asm-generic/find-first-bit-set.h to asm-generic/find-first-set-bit.h to be aligned with the function
    name implemented inside.
  - introduce generic___test_and*() operation for non-atomic bitops.
  - rename current __test_and_*() -> test_and_*() as their implementation are atomic aware.
  - define __test_and_*() to generic___test_and_*().
  - introduce test_and_change_bit().
  - update asm-generic/bitops/bitops-bits.h to make it possible for an architecture to override the BITOP_*() macros.
    Also, the bitops_uint_t type was introduced to make generic___test_and_*() generic.
  - "include asm-generic/bitops/bitops-bits.h" to files which use its definitions.
  - add comment why generic ffz is defined as __ffs().
  - update the commit message.
  - switch ffsl() to generic_ffsl().
---
Changes in V3:
 - update the commit message
 - Introduce the following asm-generic bitops headers:
	create mode 100644 xen/arch/riscv/include/asm/bitops.h
	create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
	create mode 100644 xen/include/asm-generic/bitops/ffs.h
	create mode 100644 xen/include/asm-generic/bitops/ffz.h
	create mode 100644 xen/include/asm-generic/bitops/find-first-bit-set.h
	create mode 100644 xen/include/asm-generic/bitops/fls.h
	create mode 100644 xen/include/asm-generic/bitops/flsl.h
	create mode 100644 xen/include/asm-generic/bitops/hweight.h
	create mode 100644 xen/include/asm-generic/bitops/test-bit.h
 - switch some bitops functions to asm-generic's versions.
 - re-sync some macros with Linux kernel version mentioned in the commit message.
 - Xen code style fixes.
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/bitops.h | 152 ++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/config.h |   2 +
 2 files changed, 154 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/bitops.h

diff --git a/xen/arch/riscv/include/asm/bitops.h b/xen/arch/riscv/include/asm/bitops.h
new file mode 100644
index 0000000000..17b3cf5be5
--- /dev/null
+++ b/xen/arch/riscv/include/asm/bitops.h
@@ -0,0 +1,152 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2012 Regents of the University of California */
+
+#ifndef _ASM_RISCV_BITOPS_H
+#define _ASM_RISCV_BITOPS_H
+
+#include <asm/system.h>
+
+#define BITOP_BITS_PER_WORD BITS_PER_LONG
+
+#define BITOP_TYPE
+typedef uint64_t bitops_uint_t;
+
+#include <asm-generic/bitops/bitops-bits.h>
+
+#define __set_bit(n, p)      set_bit(n, p)
+#define __clear_bit(n, p)    clear_bit(n, p)
+
+/* Based on linux/arch/include/asm/bitops.h */
+
+#if BITS_PER_LONG == 64
+#define __AMO(op)   "amo" #op ".d"
+#elif BITS_PER_LONG == 32
+#define __AMO(op)   "amo" #op ".w"
+#else
+#error "Unexpected BITS_PER_LONG"
+#endif
+
+#define test_and_op_bit_ord(op, mod, nr, addr, ord)     \
+({                                                      \
+    unsigned long res, mask;                            \
+    mask = BITOP_MASK(nr);                              \
+    __asm__ __volatile__ (                              \
+        __AMO(op) #ord " %0, %2, %1"                    \
+        : "=r" (res), "+A" (addr[BITOP_WORD(nr)])       \
+        : "r" (mod(mask))                               \
+        : "memory");                                    \
+    ((res & mask) != 0);                                \
+})
+
+#define __op_bit_ord(op, mod, nr, addr, ord)    \
+    __asm__ __volatile__ (                      \
+        __AMO(op) #ord " zero, %1, %0"          \
+        : "+A" (addr[BITOP_WORD(nr)])           \
+        : "r" (mod(BITOP_MASK(nr)))             \
+        : "memory");
+
+#define test_and_op_bit(op, mod, nr, addr)    \
+    test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
+#define __op_bit(op, mod, nr, addr) \
+    __op_bit_ord(op, mod, nr, addr, )
+
+/* Bitmask modifiers */
+#define NOP(x)    (x)
+#define NOT(x)    (~(x))
+
+/**
+ * test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ */
+static inline int test_and_set_bit(int nr, volatile void *p)
+{
+    volatile bitops_uint_t *addr = p;
+
+    return test_and_op_bit(or, NOP, nr, addr);
+}
+
+/**
+ * test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ */
+static inline int test_and_clear_bit(int nr, volatile void *p)
+{
+    volatile bitops_uint_t *addr = p;
+
+    return test_and_op_bit(and, NOT, nr, addr);
+}
+
+/**
+ * set_bit - Atomically set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+static inline void set_bit(int nr, volatile void *p)
+{
+    volatile bitops_uint_t *addr = p;
+
+    __op_bit(or, NOP, nr, addr);
+}
+
+/**
+ * clear_bit - Clears a bit in memory
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
+ */
+static inline void clear_bit(int nr, volatile void *p)
+{
+    volatile bitops_uint_t *addr = p;
+
+    __op_bit(and, NOT, nr, addr);
+}
+
+/**
+ * test_and_change_bit - Toggle (change) a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is atomic and cannot be reordered.
+ * It also implies a memory barrier.
+ */
+static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
+{
+    return test_and_op_bit(xor, NOP, nr, addr);
+}
+
+#undef test_and_op_bit
+#undef __op_bit
+#undef NOP
+#undef NOT
+#undef __AMO
+
+#include <asm-generic/bitops/generic-non-atomic.h>
+
+#define __test_and_set_bit generic___test_and_set_bit
+#define __test_and_clear_bit generic___test_and_clear_bit
+#define __test_and_change_bit generic___test_and_change_bit
+
+#include <asm-generic/bitops/fls.h>
+#include <asm-generic/bitops/flsl.h>
+#include <asm-generic/bitops/__ffs.h>
+#include <asm-generic/bitops/ffs.h>
+#include <asm-generic/bitops/ffsl.h>
+#include <asm-generic/bitops/ffz.h>
+#include <asm-generic/bitops/find-first-set-bit.h>
+#include <asm-generic/bitops/hweight.h>
+#include <asm-generic/bitops/test-bit.h>
+
+#endif /* _ASM_RISCV_BITOPS_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 2c7f2b1ff9..479da15782 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -113,6 +113,8 @@
 # error "Unsupported RISCV variant"
 #endif
 
+#define BITS_PER_BYTE 8
+
 #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
 #define BITS_PER_LONG  (BYTES_PER_LONG << 3)
 #define POINTER_ALIGN  BYTES_PER_LONG
-- 
2.43.0




* [PATCH v5 10/23] xen/riscv: introduces acrquire, release and full barriers
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (8 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 09/23] xen/riscv: introduce bitops.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-03-05  7:42   ` Jan Beulich
  2024-02-26 17:38 ` [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
   - new patch
---
 xen/arch/riscv/include/asm/fence.h | 9 +++++++++
 1 file changed, 9 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/fence.h

diff --git a/xen/arch/riscv/include/asm/fence.h b/xen/arch/riscv/include/asm/fence.h
new file mode 100644
index 0000000000..27f46fa897
--- /dev/null
+++ b/xen/arch/riscv/include/asm/fence.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_RISCV_FENCE_H
+#define _ASM_RISCV_FENCE_H
+
+#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
+#define RISCV_RELEASE_BARRIER   "\tfence rw, w\n"
+#define RISCV_FULL_BARRIER      "\tfence rw, rw\n"
+
+#endif	/* _ASM_RISCV_FENCE_H */
-- 
2.43.0




* [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (9 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 10/23] xen/riscv: introduces acrquire, release and full barriers Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-03-06 14:56   ` Jan Beulich
  2024-02-26 17:38 ` [PATCH v5 12/23] xen/riscv: introduce io.h Oleksii Kurochko
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The header was taken from Linux kernel 6.4.0-rc1.

Additionally, the following changes were made:
* add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
  access.
* replace tabs with spaces
* replace __* variables with *__
* introduce generic version of xchg_* and cmpxchg_*.
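
To illustrate the 1/2-byte emulation (a sketch only, not the code in
this patch): the patch loops on lr.w/sc.w over the aligned 32-bit
container, while the sketch below uses a C11 compare-exchange loop to
the same effect. All sketch_* names are hypothetical, and the helper
takes the container plus a byte index rather than masking a byte
pointer, to stay within defined C:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Address of the 32-bit container: clear low two bits (size 1) or bit 1 (size 2). */
static uintptr_t sketch_container(uintptr_t addr, size_t size)
{
    return addr & ~(uintptr_t)(4 - size);
}

/* Bit position of the value inside its container (little-endian layout). */
static unsigned int sketch_bit_offset(uintptr_t addr, size_t size)
{
    return (unsigned int)(addr & (4 - size)) * 8;
}

/*
 * Exchange one byte inside an aligned 32-bit word. byte_idx (0..3) counts
 * from the least significant byte.
 */
static uint8_t sketch_xchg_u8(_Atomic uint32_t *word, unsigned int byte_idx,
                              uint8_t new_val)
{
    assert(byte_idx < 4);

    unsigned int shift = byte_idx * 8;
    uint32_t mask = (uint32_t)0xff << shift;
    uint32_t new_bits = (uint32_t)new_val << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);

    /*
     * On failure 'old' is refreshed with the current word, so the desired
     * value is recomputed on every retry with only our byte replaced.
     */
    while ( !atomic_compare_exchange_weak(word, &old,
                                          (old & ~mask) | new_bits) )
        ;

    return (uint8_t)((old & mask) >> shift);
}
```

For example, exchanging byte 1 of 0x11223344 with 0xAB returns 0x33 and
leaves 0x1122AB44 in memory; the neighbouring bytes are written back
unchanged.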

The implementation of the 4- and 8-byte cases was left as it is done in
the Linux kernel, in accordance with the RISC-V spec:
```
Table A.5 ( only part of the table was copied here )

Linux Construct       RVWMO Mapping
atomic <op> relaxed    amo<op>.{w|d}
atomic <op> acquire    amo<op>.{w|d}.aq
atomic <op> release    amo<op>.{w|d}.rl
atomic <op>            amo<op>.{w|d}.aqrl

Linux Construct       RVWMO LR/SC Mapping
atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ; bnez loop OR
                       fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗ ; bnez loop
atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop

The Linux mappings for release operations may seem stronger than necessary,
but these mappings are needed to cover some cases in which Linux requires
stronger orderings than the more intuitive mappings would provide.
In particular, as of the time this text is being written, Linux is actively
debating whether to require load-load, load-store, and store-store orderings
between accesses in one critical section and accesses in a subsequent critical
section in the same hart and protected by the same synchronization object.
Not all combinations of FENCE RW,W/FENCE R,RW mappings with aq/rl mappings
combine to provide such orderings.
There are a few ways around this problem, including:
1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This suffices
   but is undesirable, as it defeats the purpose of the aq/rl modifiers.
2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does not
   currently work due to the lack of load and store opcodes with aq and rl
   modifiers.
3. Strengthen the mappings of release operations such that they would
   enforce sufficient orderings in the presence of either type of acquire mapping.
   This is the currently-recommended solution, and the one shown in Table A.5.
```
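
The difference between the fence-based and the aq/rl-annotated mappings
discussed in the quote can be sketched with C11 atomics. This is an
analogy under stated assumptions, not the patch's implementation: the
sketch_* names are hypothetical, and C11 fences are used as stand-ins
for "fence r, rw" and "fence rw, w":

```c
#include <assert.h>
#include <stdatomic.h>

/* Fence-based acquire: relaxed swap first, then an acquire fence. */
static unsigned long sketch_xchg_acquire(atomic_ulong *p, unsigned long v)
{
    unsigned long old = atomic_exchange_explicit(p, v, memory_order_relaxed);

    atomic_thread_fence(memory_order_acquire); /* stand-in for "fence r, rw" */
    return old;
}

/* Fence-based release: release fence first, then a relaxed swap. */
static unsigned long sketch_xchg_release(atomic_ulong *p, unsigned long v)
{
    atomic_thread_fence(memory_order_release); /* stand-in for "fence rw, w" */
    return atomic_exchange_explicit(p, v, memory_order_relaxed);
}

/* Annotated form: ordering is carried by the swap itself (like .aqrl). */
static unsigned long sketch_xchg_full(atomic_ulong *p, unsigned long v)
{
    return atomic_exchange_explicit(p, v, memory_order_seq_cst);
}

/* Chains the three variants; returns the value left in memory. */
static unsigned long sketch_demo(void)
{
    atomic_ulong v;
    unsigned long a, b, c;

    atomic_init(&v, 1);
    a = sketch_xchg_acquire(&v, 2);
    b = sketch_xchg_release(&v, 3);
    c = sketch_xchg_full(&v, 4);

    assert(a == 1 && b == 2 && c == 3);
    return atomic_load(&v);
}
```

The first two helpers mirror how xchg_acquire() and xchg_release() in
this patch pass a barrier as the post or pre argument around an
unannotated amoswap.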

However, in the Linux kernel the atomics were strengthened with fences:
```
Atomics present the same issue with locking: release and acquire
variants need to be strengthened to meet the constraints defined
by the Linux-kernel memory consistency model [1].

Atomics present a further issue: implementations of atomics such
as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
which do not give full-ordering with .aqrl; for example, current
implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
below to end up with the state indicated in the "exists" clause.

In order to "synchronize" LKMM and RISC-V's implementation, this
commit strengthens the implementations of the atomics operations
by replacing .rl and .aq with the use of ("lightweigth") fences,
and by replacing .aqrl LR/SC pairs in sequences such as:

0:      lr.w.aqrl  %0, %addr
        bne        %0, %old, 1f
        ...
        sc.w.aqrl  %1, %new, %addr
        bnez       %1, 0b
1:

with sequences of the form:

0:      lr.w       %0, %addr
        bne        %0, %old, 1f
              ...
        sc.w.rl    %1, %new, %addr   /* SC-release   */
        bnez       %1, 0b
        fence      rw, rw            /* "full" fence */
1:

following Daniel's suggestion.

These modifications were validated with simulation of the RISC-V
memory consistency model.

C lr-sc-aqrl-pair-vs-full-barrier

{}

P0(int *x, int *y, atomic_t *u)
{
        int r0;
        int r1;

        WRITE_ONCE(*x, 1);
        r0 = atomic_cmpxchg(u, 0, 1);
        r1 = READ_ONCE(*y);
}

P1(int *x, int *y, atomic_t *v)
{
        int r0;
        int r1;

        WRITE_ONCE(*y, 1);
        r0 = atomic_cmpxchg(v, 0, 1);
        r1 = READ_ONCE(*x);
}

exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)

[1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
https://marc.info/?l=linux-kernel&m=151633436614259&w=2
```
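
The strengthened shape described in the quote (plain load, release on
the store, full fence only on success) can be approximated in C11 as
follows. This is a sketch with a hypothetical name, sketch_cmpxchg(),
and C11 orderings are only an approximation of the lr/sc.rl/fence
sequence:

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Compare *p with 'old'; if equal, store 'new_val'. Returns the value that
 * was in memory, so success is detected by comparing the result with 'old'.
 */
static int sketch_cmpxchg(atomic_int *p, int old, int new_val)
{
    int seen = old;

    /* Release ordering on success, relaxed on failure (like sc.w.rl). */
    if ( atomic_compare_exchange_strong_explicit(p, &seen, new_val,
                                                 memory_order_release,
                                                 memory_order_relaxed) )
        /* Full fence only on the success path, as in "fence rw, rw". */
        atomic_thread_fence(memory_order_seq_cst);

    return seen;
}

/* One successful and one failing attempt; returns the final value. */
static int sketch_demo(void)
{
    atomic_int v;
    int first, second;

    atomic_init(&v, 0);
    first = sketch_cmpxchg(&v, 0, 1);  /* succeeds, returns 0 */
    second = sketch_cmpxchg(&v, 0, 2); /* fails, returns 1    */

    assert(first == 0 && second == 1);
    return atomic_load(&v);
}
```

As in the assembly sequence, the failure path skips the trailing fence,
because the branch to label 1 jumps past it.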

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - update the commit message.
 - drop ALIGN_DOWN().
 - update the definition of emulate_xchg_1_2(): 
   - lr.d -> lr.w, sc.d -> sc.w.
   - drop ret argument.
   - code style fixes around asm volatile.
   - update prototype.
   - use asm named operands.
   - rename local variables.
   - add comment above the macros
 - update the definition of __xchg_generic:
   - drop local ptr__ variable.
   - code style fixes around switch()
   - update prototype.
 - introduce RISCV_FULL_BARRIER.
 - redefine cmpxchg()
 - update emulate_cmpxchg_1_2():
   - update prototype
   - update local variables names and usage of them
   - use named asm operands.
   - add comment above the macros
---
Changes in V4:
 - Code style fixes.
 - enforce in __xchg_*() that new and *ptr have the same type; also "\n"
   was removed at the end of the asm instruction.
 - dependency from https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@bugseng.com/
 - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
 - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
 - drop cmpxchg{32,64}_{local} as they aren't used.
 - introduce generic version of xchg_* and cmpxchg_*.
 - update the commit message.
---
Changes in V3:
 - update the commit message
 - add emulation of {cmp}xchg_... for 1 and 2 bytes types
---
Changes in V2:
 - update the comment at the top of the header.
 - change xen/lib.h to xen/bug.h.
 - sort inclusion of headers properly.
---
 xen/arch/riscv/include/asm/cmpxchg.h | 258 +++++++++++++++++++++++++++
 1 file changed, 258 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h

diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
new file mode 100644
index 0000000000..66cbe26737
--- /dev/null
+++ b/xen/arch/riscv/include/asm/cmpxchg.h
@@ -0,0 +1,258 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2014 Regents of the University of California */
+
+#ifndef _ASM_RISCV_CMPXCHG_H
+#define _ASM_RISCV_CMPXCHG_H
+
+#include <xen/compiler.h>
+#include <xen/lib.h>
+
+#include <asm/fence.h>
+#include <asm/io.h>
+#include <asm/system.h>
+
+#define __amoswap_generic(ptr, new, ret, sfx, pre, post) \
+({ \
+    asm volatile( \
+        pre \
+        " amoswap" sfx " %0, %2, %1\n" \
+        post \
+        : "=r" (ret), "+A" (*ptr) \
+        : "r" (new) \
+        : "memory" ); \
+})
+
+/*
+ * For LR and SC, the A extension requires that the address held in rs1 be
+ * naturally aligned to the size of the operand (i.e., eight-byte aligned
+ * for 64-bit words and four-byte aligned for 32-bit words).
+ * If the address is not naturally aligned, an address-misaligned exception
+ * or an access-fault exception will be generated.
+ * 
+ * Thereby:
+ * - for 1-byte xchg, access the containing word by clearing the low two bits.
+ * - for 2-byte xchg, access the containing word by clearing bit 1.
+ *
+ * If the resulting 4-byte access is still misaligned, it will fault just as
+ * a non-emulated 4-byte access would.
+ */
+#define emulate_xchg_1_2(ptr, new, sc_sfx, pre, post) \
+({ \
+    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr & ~(0x4 - sizeof(*ptr))); \
+    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE; \
+    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE) - 1, 0) << new_val_pos; \
+    unsigned int new_ = new << new_val_pos; \
+    unsigned int old_val; \
+    unsigned int xchged_val; \
+    \
+    asm volatile ( \
+        pre \
+        "0: lr.w %[op_oldval], %[op_aligned_ptr]\n" \
+        "   and  %[op_xchged_val], %[op_oldval], %z[op_nmask]\n" \
+        "   or   %[op_xchged_val], %[op_xchged_val], %z[op_new]\n" \
+        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val], %[op_aligned_ptr]\n" \
+        "   bnez %[op_xchged_val], 0b\n" \
+        post \
+        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r" (xchged_val), [op_aligned_ptr]"+A" (*aligned_ptr) \
+        : [op_new] "rJ" (new_), [op_nmask] "rJ" (~mask) \
+        : "memory" ); \
+    \
+    (__typeof__(*(ptr)))((old_val & mask) >> new_val_pos); \
+})
+
+#define __xchg_generic(ptr, new, size, sfx, pre, post) \
+({ \
+    __typeof__(*(ptr)) new__ = (new); \
+    __typeof__(*(ptr)) ret__; \
+    switch ( size ) \
+    { \
+    case 1: \
+    case 2: \
+        ret__ = emulate_xchg_1_2(ptr, new__, sfx, pre, post); \
+        break; \
+    case 4: \
+        __amoswap_generic(ptr, new__, ret__,\
+                          ".w" sfx,  pre, post); \
+        break; \
+    case 8: \
+        __amoswap_generic(ptr, new__, ret__,\
+                          ".d" sfx,  pre, post); \
+        break; \
+    default: \
+        STATIC_ASSERT_UNREACHABLE(); \
+    } \
+    ret__; \
+})
+
+#define xchg_relaxed(ptr, x) \
+({ \
+    __typeof__(*(ptr)) x_ = (x); \
+    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), "", "", ""); \
+})
+
+#define xchg_acquire(ptr, x) \
+({ \
+    __typeof__(*(ptr)) x_ = (x); \
+    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
+                                       "", "", RISCV_ACQUIRE_BARRIER); \
+})
+
+#define xchg_release(ptr, x) \
+({ \
+    __typeof__(*(ptr)) x_ = (x); \
+    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
+                                       "", RISCV_RELEASE_BARRIER, ""); \
+})
+
+#define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
+                                    ".aqrl", "", "")
+
+#define __generic_cmpxchg(ptr, old, new, ret, lr_sfx, sc_sfx, pre, post)	\
+ ({ \
+    register unsigned int rc; \
+    asm volatile( \
+        pre \
+        "0: lr" lr_sfx " %0, %2\n" \
+        "   bne  %0, %z3, 1f\n" \
+        "   sc" sc_sfx " %1, %z4, %2\n" \
+        "   bnez %1, 0b\n" \
+        post \
+        "1:\n" \
+        : "=&r" (ret), "=&r" (rc), "+A" (*ptr) \
+        : "rJ" (old), "rJ" (new) \
+        : "memory"); \
+ })
+
+/*
+ * For LR and SC, the A extension requires that the address held in rs1 be
+ * naturally aligned to the size of the operand (i.e., eight-byte aligned
+ * for 64-bit words and four-byte aligned for 32-bit words).
+ * If the address is not naturally aligned, an address-misaligned exception
+ * or an access-fault exception will be generated.
+ * 
+ * Thereby:
+ * - for 1-byte cmpxchg, access the containing word by clearing the low two bits.
+ * - for 2-byte cmpxchg, access the containing word by clearing bit 1.
+ *
+ * If the resulting 4-byte access is still misaligned, it will fault just as
+ * a non-emulated 4-byte access would.
+ *
+ * old_val is cast to unsigned long at the end of the macro because of
+ * the following issue:
+ * ./arch/riscv/include/asm/cmpxchg.h:166:5: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
+ * 166 |     (__typeof__(*(ptr)))(old_val >> new_val_pos); \
+ *     |     ^
+ * ./arch/riscv/include/asm/cmpxchg.h:184:17: note: in expansion of macro 'emulate_cmpxchg_1_2'
+ * 184 |         ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
+ *     |                 ^~~~~~~~~~~~~~~~~~~
+ * ./arch/riscv/include/asm/cmpxchg.h:227:5: note: in expansion of macro '__cmpxchg_generic'
+ * 227 |     __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
+ *     |     ^~~~~~~~~~~~~~~~~
+ * ./include/xen/lib.h:141:26: note: in expansion of macro '__cmpxchg'
+ * 141 |     ((__typeof__(*(ptr)))__cmpxchg(ptr, (unsigned long)o_,              \
+ *     |                          ^~~~~~~~~
+ * common/event_channel.c:109:13: note: in expansion of macro 'cmpxchgptr'
+ * 109 |             cmpxchgptr(&xen_consumers[i], NULL, fn);
+ */
+#define emulate_cmpxchg_1_2(ptr, old, new, sc_sfx, pre, post) \
+({ \
+    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr & ~(0x4 - sizeof(*ptr))); \
+    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE; \
+    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE) - 1, 0) << new_val_pos; \
+    unsigned int old_ = old << new_val_pos; \
+    unsigned int new_ = new << new_val_pos; \
+    unsigned int old_val; \
+    unsigned int xchged_val; \
+    \
+    __asm__ __volatile__ ( \
+        pre \
+        "0: lr.w %[op_xchged_val], %[op_aligned_ptr]\n" \
+        "   and  %[op_oldval], %[op_xchged_val], %z[op_mask]\n" \
+        "   bne  %[op_oldval], %z[op_old], 1f\n" \
+        "   xor  %[op_xchged_val], %[op_oldval], %[op_xchged_val]\n" \
+        "   or   %[op_xchged_val], %[op_xchged_val], %z[op_new]\n" \
+        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val], %[op_aligned_ptr]\n" \
+        "   bnez %[op_xchged_val], 0b\n" \
+        post \
+        "1:\n" \
+        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r" (xchged_val), [op_aligned_ptr] "+A" (*aligned_ptr) \
+        : [op_old] "rJ" (old_), [op_new] "rJ" (new_), \
+          [op_mask] "rJ" (mask) \
+        : "memory" ); \
+    \
+    (__typeof__(*(ptr)))((unsigned long)old_val >> new_val_pos); \
+})
+
+/*
+ * Atomic compare and exchange.  Compare OLD with MEM, if identical,
+ * store NEW in MEM.  Return the initial value in MEM.  Success is
+ * indicated by comparing RETURN with OLD.
+ */
+#define __cmpxchg_generic(ptr, old, new, size, sc_sfx, pre, post) \
+({ \
+    __typeof__(ptr) ptr__ = (ptr); \
+    __typeof__(*(ptr)) old__ = (__typeof__(*(ptr)))(old); \
+    __typeof__(*(ptr)) new__ = (__typeof__(*(ptr)))(new); \
+    __typeof__(*(ptr)) ret__; \
+    switch ( size ) \
+    { \
+    case 1: \
+    case 2: \
+        ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
+                            sc_sfx, pre, post); \
+        break; \
+    case 4: \
+        __generic_cmpxchg(ptr__, old__, new__, ret__, \
+                          ".w", ".w"sc_sfx, pre, post); \
+        break; \
+    case 8: \
+        __generic_cmpxchg(ptr__, old__, new__, ret__, \
+                          ".d", ".d"sc_sfx, pre, post); \
+        break; \
+    default: \
+        STATIC_ASSERT_UNREACHABLE(); \
+    } \
+    ret__; \
+})
+
+#define cmpxchg_relaxed(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) o_ = (o); \
+    __typeof__(*(ptr)) n_ = (n); \
+    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, \
+                    o_, n_, sizeof(*(ptr)), "", "", ""); \
+})
+
+#define cmpxchg_acquire(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) o_ = (o); \
+    __typeof__(*(ptr)) n_ = (n); \
+    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_, sizeof(*(ptr)), \
+                                          "", "", RISCV_ACQUIRE_BARRIER); \
+})
+
+#define cmpxchg_release(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) o_ = (o); \
+    __typeof__(*(ptr)) n_ = (n); \
+    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_, sizeof(*(ptr)), \
+                                          "", RISCV_RELEASE_BARRIER, ""); \
+})
+
+#define __cmpxchg(ptr, o, n, s) \
+    (__typeof__(*(ptr))) \
+    __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
+                      s, ".rl", "", RISCV_FULL_BARRIER)
+
+#define cmpxchg(ptr, o, n) __cmpxchg(ptr, o, n, sizeof(*(ptr)))
+
+#endif /* _ASM_RISCV_CMPXCHG_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0
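
For readers following the cmpxchg contract described in the comment above
("Return the initial value in MEM.  Success is indicated by comparing RETURN
with OLD"), here is a minimal portable sketch of the same convention using
C11 atomics instead of LR/SC assembly. The name sketch_cmpxchg32() is
hypothetical and not part of the patch; it only illustrates the return-value
convention, not the RISC-V ordering variants.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical illustration (not from the patch): compare *mem with old and,
 * if they are equal, store new_val.  The value that was in *mem before the
 * operation is returned, so the caller detects success by checking
 * "returned value == old".
 */
static inline uint32_t sketch_cmpxchg32(_Atomic uint32_t *mem,
                                        uint32_t old, uint32_t new_val)
{
    uint32_t seen = old;

    /*
     * On success "seen" keeps the value of old; on failure the C11 call
     * overwrites it with the value currently held in *mem.
     */
    atomic_compare_exchange_strong(mem, &seen, new_val);

    return seen;
}
```

A caller would typically write "if ( sketch_cmpxchg32(&word, old, new) == old )"
to find out whether the store happened.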



* [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (10 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-03-06 14:13   ` Jan Beulich
  2024-02-26 17:38 ` [PATCH v5 13/23] xen/riscv: introduce atomic.h Oleksii Kurochko
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The header is taken from Linux 6.4.0-rc1 and is based on
arch/riscv/include/asm/mmio.h with the following changes:
- drop forcing of endianness for read*(), write*() functions: the
  endianness a particular device (and hence its MMIO region(s)) uses
  is entirely independent of the CPU's endianness.
  Hence conversion, where necessary, needs to occur at a layer up.
  Another reason to drop endianness conversion here is:
  https://patchwork.kernel.org/project/linux-riscv/patch/20190411115623.5749-3-hch@lst.de/
  One of the answers of the author of the commit:
    And we don't know if Linux will be around if that ever changes.
    The point is:
     a) the current RISC-V spec is LE only
     b) the current linux port is LE only except for this little bit
    There is no point in leaving just this bitrotting code around.  It
    just confuses developers, (very very slightly) slows down compiles
    and will bitrot.  It also won't be any significant help to a future
    developer down the road doing a hypothetical BE RISC-V Linux port.
- drop unused argument of __io_ar() macros.
- drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
  as they are unnecessary.
- Adopt the Xen code style for this header, considering that significant changes
  are not anticipated in the future.
  In the event of any issues, adapting them to Xen style should be easily
  manageable.
- drop unnecessary __r variables in macros read*_cpu()

Additionally, definitions of ioremap_*() were added to the header.
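
To illustrate the "conversion needs to occur at a layer up" point, here is a
minimal sketch of how a driver for a big-endian device could wrap a
native-endian accessor. All names below (sketch_raw_read32(),
sketch_host_is_le(), sketch_swab32(), sketch_dev_read_be32()) are
hypothetical and not part of this patch; the raw accessor is a plain
volatile load standing in for readl().

```c
#include <stdint.h>

/* Hypothetical stand-in for a native-endian MMIO read such as readl(). */
static inline uint32_t sketch_raw_read32(const volatile void *addr)
{
    return *(const volatile uint32_t *)addr;
}

/* Hypothetical helper: detect the host byte order at run time. */
static inline int sketch_host_is_le(void)
{
    const uint16_t probe = 1;

    return *(const uint8_t *)&probe == 1;
}

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t sketch_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Driver-level accessor for a device whose registers are big-endian.
 * The byte swap lives here, one layer above the native-endian accessor,
 * because only the driver knows the device's byte order.
 */
static inline uint32_t sketch_dev_read_be32(const volatile void *addr)
{
    uint32_t v = sketch_raw_read32(addr);

    return sketch_host_is_le() ? sketch_swab32(v) : v;
}
```

With this layering the generic accessor stays free of any byte-order
assumption, and a little-endian device simply uses the raw accessor on a
little-endian CPU.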

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - Xen code style related fixes
 - drop #define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}
 - drop cpu_to_le16()
 - remove unused argument in __io_ar()
 - update the commit message
 - drop unnecessary __r variables in macros read*_cpu()
 - update the comments at the top of the header.
---
Changes in V4:
 - delete inner parentheses in macros.
 - s/u<N>/uint<N>.
---
Changes in V3:
 - re-sync with linux kernel
 - update the commit message
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/io.h | 157 ++++++++++++++++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/io.h

diff --git a/xen/arch/riscv/include/asm/io.h b/xen/arch/riscv/include/asm/io.h
new file mode 100644
index 0000000000..95a459432c
--- /dev/null
+++ b/xen/arch/riscv/include/asm/io.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ *  The header is taken from Linux 6.4.0-rc1 and is based on
+ *  arch/riscv/include/asm/mmio.h with the following changes:
+ *   - drop forcing of endianness for read*(), write*() functions: the
+ *     endianness a particular device (and hence its MMIO region(s)) uses
+ *     is entirely independent of the CPU's endianness.
+ *     Hence conversion, where necessary, needs to occur at a layer up.
+ *     Another reason to drop endianness conversion is:
+ *     https://patchwork.kernel.org/project/linux-riscv/patch/20190411115623.5749-3-hch@lst.de/
+ *     One of the answers of the author of the commit:
+ *       And we don't know if Linux will be around if that ever changes.
+ *       The point is:
+ *        a) the current RISC-V spec is LE only
+ *        b) the current linux port is LE only except for this little bit
+ *       There is no point in leaving just this bitrotting code around.  It
+ *       just confuses developers, (very very slightly) slows down compiles
+ *       and will bitrot.  It also won't be any significant help to a future
+ *       developer down the road doing a hypothetical BE RISC-V Linux port.
+ *   - drop unused argument of __io_ar() macros.
+ *   - drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
+ *     as they are unnecessary.
+ *   - Adopt the Xen code style for this header, considering that significant changes
+ *     are not anticipated in the future.
+ *     In the event of any issues, adapting them to Xen style should be easily
+ *     manageable.
+ *   - drop unnecessary __r variables in macros read*_cpu()
+ *
+ * Copyright (C) 1996-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2014 Regents of the University of California
+ * Copyright (C) 2024 Vates
+ */
+
+#ifndef _ASM_RISCV_IO_H
+#define _ASM_RISCV_IO_H
+
+#include <asm/byteorder.h>
+
+/*
+ * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
+ * change the properties of memory regions.  This should be fixed by the
+ * upcoming platform spec.
+ */
+#define ioremap_nocache(addr, size) ioremap(addr, size)
+#define ioremap_wc(addr, size) ioremap(addr, size)
+#define ioremap_wt(addr, size) ioremap(addr, size)
+
+/* Generic IO read/write.  These perform native-endian accesses. */
+static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
+{
+    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
+}
+
+static inline void __raw_writew(uint16_t val, volatile void __iomem *addr)
+{
+    asm volatile ( "sh %0, 0(%1)" : : "r" (val), "r" (addr) );
+}
+
+static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
+{
+    asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r" (addr) );
+}
+
+#ifdef CONFIG_64BIT
+static inline void __raw_writeq(uint64_t val, volatile void __iomem *addr)
+{
+    asm volatile ( "sd %0, 0(%1)" : : "r" (val), "r" (addr) );
+}
+#endif
+
+static inline uint8_t __raw_readb(const volatile void __iomem *addr)
+{
+    uint8_t val;
+
+    asm volatile ( "lb %0, 0(%1)" : "=r" (val) : "r" (addr) );
+    return val;
+}
+
+static inline uint16_t __raw_readw(const volatile void __iomem *addr)
+{
+    uint16_t val;
+
+    asm volatile ( "lh %0, 0(%1)" : "=r" (val) : "r" (addr) );
+    return val;
+}
+
+static inline uint32_t __raw_readl(const volatile void __iomem *addr)
+{
+    uint32_t val;
+
+    asm volatile ( "lw %0, 0(%1)" : "=r" (val) : "r" (addr) );
+    return val;
+}
+
+#ifdef CONFIG_64BIT
+static inline uint64_t __raw_readq(const volatile void __iomem *addr)
+{
+    uint64_t val;
+
+    asm volatile ( "ld %0, 0(%1)" : "=r" (val) : "r" (addr) );
+    return val;
+}
+#endif
+
+/*
+ * Unordered I/O memory access primitives.  These are even more relaxed than
+ * the relaxed versions, as they don't even order accesses between successive
+ * operations to the I/O regions.
+ */
+#define readb_cpu(c)        __raw_readb(c)
+#define readw_cpu(c)        __raw_readw(c)
+#define readl_cpu(c)        __raw_readl(c)
+
+#define writeb_cpu(v, c)    __raw_writeb(v, c)
+#define writew_cpu(v, c)    __raw_writew(v, c)
+#define writel_cpu(v, c)    __raw_writel(v, c)
+
+#ifdef CONFIG_64BIT
+#define readq_cpu(c)        __raw_readq(c)
+#define writeq_cpu(v, c)    __raw_writeq(v, c)
+#endif
+
+/*
+ * I/O memory access primitives. Reads are ordered relative to any
+ * following Normal memory access. Writes are ordered relative to any prior
+ * Normal memory access.  The memory barriers here are necessary as RISC-V
+ * doesn't define any ordering between the memory space and the I/O space.
+ */
+#define __io_br()   do { } while (0)
+#define __io_ar()   asm volatile ( "fence i,r" : : : "memory" )
+#define __io_bw()   asm volatile ( "fence w,o" : : : "memory" )
+#define __io_aw()   do { } while (0)
+
+#define readb(c)    ({ uint8_t  v; __io_br(); v = readb_cpu(c); __io_ar(); v; })
+#define readw(c)    ({ uint16_t v; __io_br(); v = readw_cpu(c); __io_ar(); v; })
+#define readl(c)    ({ uint32_t v; __io_br(); v = readl_cpu(c); __io_ar(); v; })
+
+#define writeb(v, c)    ({ __io_bw(); writeb_cpu(v, c); __io_aw(); })
+#define writew(v, c)    ({ __io_bw(); writew_cpu(v, c); __io_aw(); })
+#define writel(v, c)    ({ __io_bw(); writel_cpu(v, c); __io_aw(); })
+
+#ifdef CONFIG_64BIT
+#define readq(c)        ({ uint64_t v; __io_br(); v = readq_cpu(c); __io_ar(); v; })
+#define writeq(v, c)    ({ __io_bw(); writeq_cpu(v, c); __io_aw(); })
+#endif
+
+#endif /* _ASM_RISCV_IO_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0




* [PATCH v5 13/23] xen/riscv: introduce atomic.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (11 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 12/23] xen/riscv: introduce io.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-03-06 15:31   ` Jan Beulich
  2024-02-26 17:38 ` [PATCH v5 14/23] xen/riscv: introduce monitor.h Oleksii Kurochko
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Initially the patch was introduced by Bobby, who took the header from the
Linux kernel.

The following changes were done on top of the Linux kernel header:
 - atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
   to use __*xchg_generic()
 - drop casts in write_atomic() as they are unnecessary
 - drop introduction of WRITE_ONCE() and READ_ONCE().
   Xen provides ACCESS_ONCE()
 - remove zero-length array access in read_atomic()
 - drop defines similar to the pattern
   #define atomic_add_return_relaxed   atomic_add_return_relaxed
 - move non-RISC-V-specific functions to asm-generic/atomic-ops.h
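
As an illustration of the read_atomic() item above, here is a minimal sketch
of a size-dispatched read that returns its result through a union whose char
array is sized to the accessed type (rather than a zero-length member). The
sketch_* names are hypothetical; the plain volatile loads stand in for the
ordered readb()/readw()/readl()/readq() accessors, and the macro relies on
the GNU C statement-expression and __typeof__ extensions, as the header
itself does.

```c
#include <stdint.h>

/* Hypothetical size-dispatched read; volatile loads stand in for read*(). */
static inline void sketch_read_sized(const volatile void *p, void *res,
                                     unsigned int size)
{
    switch ( size )
    {
    case 1: *(uint8_t *)res  = *(const volatile uint8_t *)p;  break;
    case 2: *(uint16_t *)res = *(const volatile uint16_t *)p; break;
    case 4: *(uint32_t *)res = *(const volatile uint32_t *)p; break;
    case 8: *(uint64_t *)res = *(const volatile uint64_t *)p; break;
    default: break; /* the real header calls an undefined function here */
    }
}

/*
 * The union gives the result buffer the size and alignment of the accessed
 * type, so the buffer passed to the helper is always large enough.
 */
#define sketch_read_atomic(p) ({                                    \
    union { __typeof__(*(p)) val; char c[sizeof(*(p))]; } x_;       \
    sketch_read_sized(p, x_.c, sizeof(*(p)));                       \
    x_.val;                                                         \
})
```

Note that every case stores through a pointer of the width it reads, so an
8-byte read must store through uint64_t and not a narrower type.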

Signed-off-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - fence.h changes were moved to a separate patch, as the patches related to io.h and
   cmpxchg.h, which are dependencies for this patch, also needed changes in fence.h
 - remove accessing of zero-length array
 - drop casts in write_atomic()
 - drop introduction of WRITE_ONCE() and READ_ONCE().
 - drop defines similar to pattern #define atomic_add_return_relaxed   atomic_add_return_relaxed
 - Xen code style fixes
 - move non-RISC-V-specific functions to asm-generic/atomic-ops.h
---
Changes in V4:
 - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
 - drop casts in read_atomic_size(), write_atomic(), add_sized()
 - tabs -> spaces
 - drop #ifdef CONFIG_SMP ... #endif in fence.h as it is simpler to handle NR_CPUS=1
   the same as NR_CPUS>1 with accepting less than ideal performance.
---
Changes in V3:
  - update the commit message
  - add SPDX for fence.h
  - code style fixes
  - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
  - re-order the patch
  - merge to this patch fence.h
---
Changes in V2:
 - Change an author of commit. I got this header from Bobby's old repo.
---
 xen/arch/riscv/include/asm/atomic.h  | 296 +++++++++++++++++++++++++++
 xen/include/asm-generic/atomic-ops.h |  92 +++++++++
 2 files changed, 388 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/include/asm-generic/atomic-ops.h

diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
new file mode 100644
index 0000000000..8007ae4c90
--- /dev/null
+++ b/xen/arch/riscv/include/asm/atomic.h
@@ -0,0 +1,296 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Taken and modified from Linux.
+ *
+ * The following changes were done:
+ * - atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
+ *   to use __*xchg_generic()
+ * - drop casts in write_atomic() as they are unnecessary
+ * - drop introduction of WRITE_ONCE() and READ_ONCE().
+ *   Xen provides ACCESS_ONCE()
+ * - remove zero-length array access in read_atomic()
+ * - drop defines similar to the pattern
+ *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
+ * - move non-RISC-V-specific functions to asm-generic/atomic-ops.h
+ *
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ * Copyright (C) 2024 Vates SAS
+ */
+
+#ifndef _ASM_RISCV_ATOMIC_H
+#define _ASM_RISCV_ATOMIC_H
+
+#include <xen/atomic.h>
+
+#include <asm/cmpxchg.h>
+#include <asm/fence.h>
+#include <asm/io.h>
+#include <asm/system.h>
+
+#include <asm-generic/atomic-ops.h>
+
+void __bad_atomic_size(void);
+
+/*
+ * Legacy from Linux kernel. For some reason they wanted to have ordered
+ * read/write access. Therefore read* is used instead of read<X>_cpu()
+ */
+static always_inline void read_atomic_size(const volatile void *p,
+                                           void *res,
+                                           unsigned int size)
+{
+    switch ( size )
+    {
+    case 1: *(uint8_t *)res = readb(p); break;
+    case 2: *(uint16_t *)res = readw(p); break;
+    case 4: *(uint32_t *)res = readl(p); break;
+    case 8: *(uint64_t *)res = readq(p); break;
+    default: __bad_atomic_size(); break;
+    }
+}
+
+#define read_atomic(p) ({                               \
+    union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
+    read_atomic_size(p, x_.c, sizeof(*p));              \
+    x_.val;                                             \
+})
+
+#define write_atomic(p, x)                              \
+({                                                      \
+    typeof(*p) x__ = (x);                               \
+    switch ( sizeof(*p) )                               \
+    {                                                   \
+    case 1: writeb(x__, p); break;                      \
+    case 2: writew(x__, p); break;                      \
+    case 4: writel(x__, p); break;                      \
+    case 8: writeq(x__, p); break;                      \
+    default: __bad_atomic_size(); break;                \
+    }                                                   \
+    x__;                                                \
+})
+
+#define add_sized(p, x)                                 \
+({                                                      \
+    typeof(*(p)) x__ = (x);                             \
+    switch ( sizeof(*(p)) )                             \
+    {                                                   \
+    case 1: writeb(read_atomic(p) + x__, p); break;     \
+    case 2: writew(read_atomic(p) + x__, p); break;     \
+    case 4: writel(read_atomic(p) + x__, p); break;     \
+    default: __bad_atomic_size(); break;                \
+    }                                                   \
+})
+
+#define __atomic_acquire_fence() \
+    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
+
+#define __atomic_release_fence() \
+    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )
+
+/*
+ * First, the atomic ops that have no ordering constraints and therefore don't
+ * have the AQ or RL bits set.  These don't return anything, so there's only
+ * one version to worry about.
+ */
+#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
+static inline                                               \
+void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
+{                                                           \
+    __asm__ __volatile__ (                                  \
+        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
+        : "+A" (v->counter)                                 \
+        : "r" (I)                                           \
+        : "memory" );                                       \
+}                                                           \
+
+#define ATOMIC_OPS(op, asm_op, I)                           \
+        ATOMIC_OP (op, asm_op, I, w, int,   )
+
+ATOMIC_OPS(add, add,  i)
+ATOMIC_OPS(sub, add, -i)
+ATOMIC_OPS(and, and,  i)
+ATOMIC_OPS( or,  or,  i)
+ATOMIC_OPS(xor, xor,  i)
+
+#undef ATOMIC_OP
+#undef ATOMIC_OPS
+
+/*
+ * Atomic ops that have ordered, relaxed, acquire, and release variants.
+ * There are two flavors of these: the arithmetic ops have both fetch and return
+ * versions, while the logical ops only have fetch versions.
+ */
+#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
+static inline                                                       \
+c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
+                         atomic##prefix##_t *v)                     \
+{                                                                   \
+    register c_type ret;                                            \
+    __asm__ __volatile__ (                                          \
+        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
+        : "+A" (v->counter), "=r" (ret)                             \
+        : "r" (I)                                                   \
+        : "memory" );                                               \
+    return ret;                                                     \
+}                                                                   \
+static inline                                                       \
+c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
+{                                                                   \
+    register c_type ret;                                            \
+    __asm__ __volatile__ (                                          \
+        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
+        : "+A" (v->counter), "=r" (ret)                             \
+        : "r" (I)                                                   \
+        : "memory" );                                               \
+    return ret;                                                     \
+}
+
+#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
+static inline                                                           \
+c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
+                          atomic##prefix##_t *v)                        \
+{                                                                       \
+        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
+}                                                                       \
+static inline                                                           \
+c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
+{                                                                       \
+        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
+}
+
+#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
+        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
+        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
+
+ATOMIC_OPS(add, add, +,  i)
+ATOMIC_OPS(sub, add, +, -i)
+
+#undef ATOMIC_OPS
+
+#define ATOMIC_OPS(op, asm_op, I) \
+        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )
+
+ATOMIC_OPS(and, and, i)
+ATOMIC_OPS( or,  or, i)
+ATOMIC_OPS(xor, xor, i)
+
+#undef ATOMIC_OPS
+
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP_RETURN
+
+/* This is required to provide a full barrier on success. */
+static inline int atomic_add_unless(atomic_t *v, int a, int u)
+{
+    int prev, rc;
+
+    __asm__ __volatile__ (
+        "0: lr.w     %[p],  %[c]\n"
+        "   beq      %[p],  %[u], 1f\n"
+        "   add      %[rc], %[p], %[a]\n"
+        "   sc.w.rl  %[rc], %[rc], %[c]\n"
+        "   bnez     %[rc], 0b\n"
+        RISCV_FULL_BARRIER
+        "1:\n"
+        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
+        : [a] "r" (a), [u] "r" (u)
+        : "memory");
+    return prev;
+}
+
+/*
+ * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
+ * {cmp,}xchg and the operations that return, so they need a full barrier.
+ */
+#define ATOMIC_OP(c_t, prefix, size)                            \
+static inline                                                   \
+c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size, "", "", "");  \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          "", "", RISCV_ACQUIRE_BARRIER);       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          "", RISCV_RELEASE_BARRIER, "");       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          ".aqrl", "", "");                     \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", "", "");                       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", "", RISCV_ACQUIRE_BARRIER);    \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", RISCV_RELEASE_BARRIER, "");    \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             ".rl", "", " fence rw, rw\n");     \
+}
+
+#define ATOMIC_OPS() \
+    ATOMIC_OP(int,   , 4)
+
+ATOMIC_OPS()
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP
+
+static inline int atomic_sub_if_positive(atomic_t *v, int offset)
+{
+    int prev, rc;
+
+    __asm__ __volatile__ (
+        "0: lr.w     %[p],  %[c]\n"
+        "   sub      %[rc], %[p], %[o]\n"
+        "   bltz     %[rc], 1f\n"
+        "   sc.w.rl  %[rc], %[rc], %[c]\n"
+        "   bnez     %[rc], 0b\n"
+        "   fence    rw, rw\n"
+        "1:\n"
+        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
+        : [o] "r" (offset)
+        : "memory" );
+    return prev - offset;
+}
+
+#define atomic_dec_if_positive(v) atomic_sub_if_positive(v, 1)
+
+#endif /* _ASM_RISCV_ATOMIC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-generic/atomic-ops.h b/xen/include/asm-generic/atomic-ops.h
new file mode 100644
index 0000000000..fdd5a93ed8
--- /dev/null
+++ b/xen/include/asm-generic/atomic-ops.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
+#define _ASM_GENERIC_ATOMIC_OPS_H_
+
+#include <xen/atomic.h>
+#include <xen/lib.h>
+
+#ifndef ATOMIC_READ
+static inline int atomic_read(const atomic_t *v)
+{
+    return ACCESS_ONCE(v->counter);
+}
+#endif
+
+#ifndef _ATOMIC_READ
+static inline int _atomic_read(atomic_t v)
+{
+    return v.counter;
+}
+#endif
+
+#ifndef ATOMIC_SET
+static inline void atomic_set(atomic_t *v, int i)
+{
+    ACCESS_ONCE(v->counter) = i;
+}
+#endif
+
+#ifndef _ATOMIC_SET
+static inline void _atomic_set(atomic_t *v, int i)
+{
+    v->counter = i;
+}
+#endif
+
+#ifndef ATOMIC_SUB_AND_TEST
+static inline int atomic_sub_and_test(int i, atomic_t *v)
+{
+    return atomic_sub_return(i, v) == 0;
+}
+#endif
+
+#ifndef ATOMIC_INC
+static inline void atomic_inc(atomic_t *v)
+{
+    atomic_add(1, v);
+}
+#endif
+
+#ifndef ATOMIC_INC_RETURN
+static inline int atomic_inc_return(atomic_t *v)
+{
+    return atomic_add_return(1, v);
+}
+#endif
+
+#ifndef ATOMIC_DEC
+static inline void atomic_dec(atomic_t *v)
+{
+    atomic_sub(1, v);
+}
+#endif
+
+#ifndef ATOMIC_DEC_RETURN
+static inline int atomic_dec_return(atomic_t *v)
+{
+    return atomic_sub_return(1, v);
+}
+#endif
+
+#ifndef ATOMIC_DEC_AND_TEST
+static inline int atomic_dec_and_test(atomic_t *v)
+{
+    return atomic_sub_return(1, v) == 0;
+}
+#endif
+
+#ifndef ATOMIC_ADD_NEGATIVE
+static inline int atomic_add_negative(int i, atomic_t *v)
+{
+    return atomic_add_return(i, v) < 0;
+}
+#endif
+
+#ifndef ATOMIC_INC_AND_TEST
+static inline int atomic_inc_and_test(atomic_t *v)
+{
+    return atomic_add_return(1, v) == 0;
+}
+#endif
+
+#endif /* _ASM_GENERIC_ATOMIC_OPS_H_ */
-- 
2.43.0
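
For reviewers who want to check the intended semantics of the two LR/SC
loops in atomic.h above (atomic_add_unless() and atomic_sub_if_positive()),
here is a minimal portable sketch of the same logic using C11
compare-exchange loops. The sketch_* names are hypothetical and the code
only mirrors the return values and the "skip the store" conditions; it makes
no claim about matching the exact RISC-V fence placement.

```c
#include <stdatomic.h>

/*
 * Add a to *v unless *v equals u.  Returns the value observed before the
 * (possibly skipped) update, like the "prev" output of the asm loop.
 */
static inline int sketch_add_unless(atomic_int *v, int a, int u)
{
    int prev = atomic_load_explicit(v, memory_order_relaxed);

    /* A failed exchange refreshes prev with the current value; retry. */
    while ( prev != u &&
            !atomic_compare_exchange_weak(v, &prev, prev + a) )
        ;

    return prev;
}

/*
 * Subtract offset from *v only if the result is not negative.  Returns
 * the would-be new value (prev - offset) whether or not it was stored.
 */
static inline int sketch_sub_if_positive(atomic_int *v, int offset)
{
    int prev = atomic_load_explicit(v, memory_order_relaxed);

    while ( prev - offset >= 0 &&
            !atomic_compare_exchange_weak(v, &prev, prev - offset) )
        ;

    return prev - offset;
}
```

A negative return value from sketch_sub_if_positive() means the counter was
left unchanged, which is how a caller of atomic_dec_if_positive() would tell
the two outcomes apart.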




* [PATCH v5 14/23] xen/riscv: introduce monitor.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (12 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 13/23] xen/riscv: introduce atomic.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 15/23] xen/riscv: add definition of __read_mostly Oleksii Kurochko
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Tamas K Lengyel, Alexandru Isaila,
	Petre Pircalabu, Alistair Francis, Bob Eshleman, Connor Davis

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Waiting for dependency to be merged: [PATCH v6 0/9] Introduce generic headers
 (https://lore.kernel.org/xen-devel/84568b0c24a5ec96244f3f34537e9a148367facf.1707499278.git.oleksii.kurochko@gmail.com/)
---
Changes in V4/V5:
 - Nothing changed. Only rebase.
---
Changes in V3:
 - new patch.
---
 xen/arch/riscv/include/asm/monitor.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/monitor.h

diff --git a/xen/arch/riscv/include/asm/monitor.h b/xen/arch/riscv/include/asm/monitor.h
new file mode 100644
index 0000000000..f4fe2c0690
--- /dev/null
+++ b/xen/arch/riscv/include/asm/monitor.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_MONITOR_H__
+#define __ASM_RISCV_MONITOR_H__
+
+#include <xen/bug.h>
+
+#include <asm-generic/monitor.h>
+
+struct domain;
+
+static inline uint32_t arch_monitor_get_capabilities(struct domain *d)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+#endif /* __ASM_RISCV_MONITOR_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0




* [PATCH v5 15/23] xen/riscv: add definition of __read_mostly
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (13 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 14/23] xen/riscv: introduce monitor.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 16/23] xen/riscv: add required things to current.h Oleksii Kurochko
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The definition of __read_mostly should be removed in:
https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b333e@suse.com/

This patch introduces it in an arch-specific header so as not to
block enabling of the full Xen build for RISC-V.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
- [PATCH] move __read_mostly to xen/cache.h  [2]

Right now, the patch series doesn't have a direct dependency on [2] and it
provides __read_mostly in the patch:
    [PATCH v3 26/34] xen/riscv: add definition of __read_mostly
However, it will be dropped as soon as [2] is merged or at least when the
final version of the patch [2] is provided.

[2] https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b333e@suse.com/
---
Changes in V4-V6:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/cache.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/riscv/include/asm/cache.h b/xen/arch/riscv/include/asm/cache.h
index 69573eb051..94bd94db53 100644
--- a/xen/arch/riscv/include/asm/cache.h
+++ b/xen/arch/riscv/include/asm/cache.h
@@ -3,4 +3,6 @@
 #ifndef _ASM_RISCV_CACHE_H
 #define _ASM_RISCV_CACHE_H
 
+#define __read_mostly __section(".data.read_mostly")
+
 #endif /* _ASM_RISCV_CACHE_H */
-- 
2.43.0




* [PATCH v5 16/23] xen/riscv: add required things to current.h
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (14 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 15/23] xen/riscv: add definition of __read_mostly Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:38 ` [PATCH v5 17/23] xen/riscv: add minimal stuff to page.h to build full Xen Oleksii Kurochko
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Add the minimal required things to be able to build full Xen.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V5:
 - Nothing changed. Only rebase.
---
Changes in V4:
 - BUG() was changed to BUG_ON("unimplemented");
 - Change "xen/bug.h" to "xen/lib.h" as BUG_ON is defined in xen/lib.h.
 - Add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V3:
 - add SPDX
 - drop a forward declaration of struct vcpu;
 - update guest_cpu_user_regs() macros
 - replace get_processor_id with smp_processor_id
 - update the commit message
 - code style fixes
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/current.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index d84f15dc50..aedb6dc732 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -3,6 +3,21 @@
 #ifndef __ASM_CURRENT_H
 #define __ASM_CURRENT_H
 
+#include <xen/lib.h>
+#include <xen/percpu.h>
+#include <asm/processor.h>
+
+#ifndef __ASSEMBLY__
+
+/* Which VCPU is "current" on this PCPU. */
+DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
+
+#define current            this_cpu(curr_vcpu)
+#define set_current(vcpu)  do { current = (vcpu); } while (0)
+#define get_cpu_current(cpu)  per_cpu(curr_vcpu, cpu)
+
+#define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
+
 #define switch_stack_and_jump(stack, fn) do {               \
     asm volatile (                                          \
             "mv sp, %0\n"                                   \
@@ -10,4 +25,8 @@
     unreachable();                                          \
 } while ( false )
 
+#define get_per_cpu_offset() __per_cpu_offset[smp_processor_id()]
+
+#endif /* __ASSEMBLY__ */
+
 #endif /* __ASM_CURRENT_H */
-- 
2.43.0
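
As an illustration of how the current / set_current() / get_cpu_current()
macros fit together, here is a simplified user-space sketch. It models per-CPU
storage as a plain array, which is an assumption made for brevity: Xen's real
per-CPU variables are reached through __per_cpu_offset[]. All demo_* names are
hypothetical.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4

struct vcpu { int vcpu_id; };

/* Hypothetical per-CPU storage: one slot per physical CPU. */
static struct vcpu *demo_curr_vcpu[DEMO_NR_CPUS];

/* The series stubs smp_processor_id() to 0 for now; do the same here. */
#define demo_smp_processor_id()  0

#define demo_per_cpu(var, cpu)   ((var)[(cpu)])
#define demo_this_cpu(var)       demo_per_cpu(var, demo_smp_processor_id())

/* Same shape as the macros added to asm/current.h. */
#define demo_current             demo_this_cpu(demo_curr_vcpu)
#define demo_set_current(v)      do { demo_current = (v); } while (0)
#define demo_get_cpu_current(c)  demo_per_cpu(demo_curr_vcpu, c)
```

Because demo_current expands to an lvalue, demo_set_current() can assign
through it directly, which is the same trick the patch relies on.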



^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v5 17/23] xen/riscv: add minimal stuff to page.h to build full Xen
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (15 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 16/23] xen/riscv: add required things to current.h Oleksii Kurochko
@ 2024-02-26 17:38 ` Oleksii Kurochko
  2024-02-26 17:39 ` [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V5:
 - Nothing changed. Only rebase.
---
Changes in V4:
 - Change message -> subject in "Changes in V3"
 - s/BUG/BUG_ON("...")
 - Do proper rebase ( pfn_to_paddr() and paddr_to_pfn() aren't removed ).
---
Changes in V3:
 - update the commit subject
 - add implementation of PAGE_HYPERVISOR macros
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
 - drop definition of pfn_to_paddr, and paddr_to_pfn in <asm/mm.h>
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/page.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index 95074e29b3..c831e16417 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -6,6 +6,7 @@
 #ifndef __ASSEMBLY__
 
 #include <xen/const.h>
+#include <xen/bug.h>
 #include <xen/types.h>
 
 #include <asm/mm.h>
@@ -32,6 +33,10 @@
 #define PTE_LEAF_DEFAULT            (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
 #define PTE_TABLE                   (PTE_VALID)
 
+#define PAGE_HYPERVISOR_RW          (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
+
+#define PAGE_HYPERVISOR             PAGE_HYPERVISOR_RW
+
 /* Calculate the offsets into the pagetables for a given VA */
 #define pt_linear_offset(lvl, va)   ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
 
@@ -62,6 +67,20 @@ static inline bool pte_is_valid(pte_t p)
     return p.pte & PTE_VALID;
 }
 
+static inline void invalidate_icache(void)
+{
+    BUG_ON("unimplemented");
+}
+
+#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
+#define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
+
+/* TODO: Flush the dcache for an entire page. */
+static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
+{
+    BUG_ON("unimplemented");
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_PAGE_H */
-- 
2.43.0
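
A small user-space sketch of the clear_page() / copy_page() helpers added
above, assuming 4 KiB pages. The DEMO_/demo_ names are hypothetical; in Xen the
page size comes from PAGE_SIZE.

```c
#include <string.h>

#define DEMO_PAGE_SIZE 4096 /* assumption: 4 KiB pages */

/* Same shape as the macros in asm/page.h. */
#define demo_clear_page(page)   memset((void *)(page), 0, DEMO_PAGE_SIZE)
#define demo_copy_page(dp, sp)  memcpy(dp, sp, DEMO_PAGE_SIZE)

/* Returns 1 if every byte of the page is zero, 0 otherwise. */
int demo_page_is_zero(const unsigned char *page)
{
    for ( size_t i = 0; i < DEMO_PAGE_SIZE; i++ )
        if ( page[i] != 0 )
            return 0;

    return 1;
}
```

Note that both helpers operate on exactly one page, so callers are expected to
pass page-sized buffers.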



^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (16 preceding siblings ...)
  2024-02-26 17:38 ` [PATCH v5 17/23] xen/riscv: add minimal stuff to page.h to build full Xen Oleksii Kurochko
@ 2024-02-26 17:39 ` Oleksii Kurochko
  2024-03-05  8:05   ` Jan Beulich
  2024-02-26 17:39 ` [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Alistair Francis,
	Bob Eshleman, Connor Davis

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - Code style fixes.
 - drop introduced TOOLCHAIN_HAS_ZIHINTPAUSE and use as-insn instead.
---
Changes in V4:
 - Change message -> subject in "Changes in V3"
 - Documentation about system requirements was added. In the future, whether the extension is supported by the
   system can be checked with __riscv_isa_extension_available() ( https://gitlab.com/xen-project/people/olkur/xen/-/commit/737998e89ed305eb92059300c374dfa53d2143fa )
 - update cpu_relax() function to check if __riscv_zihintpause is supported by a toolchain
 - add conditional _zihintpause to -march if it is supported by a toolchain
---
Changes in V3:
 - update the commit subject
 - rename get_processor_id to smp_processor_id
 - code style fixes
 - update the cpu_relax instruction: use pause instruction instead of div %0, %0, zero
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 docs/misc/riscv/booting.txt            |  8 ++++++++
 xen/arch/riscv/arch.mk                 |  8 +++++++-
 xen/arch/riscv/include/asm/processor.h | 23 +++++++++++++++++++++++
 3 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/riscv/booting.txt

diff --git a/docs/misc/riscv/booting.txt b/docs/misc/riscv/booting.txt
new file mode 100644
index 0000000000..38fad74956
--- /dev/null
+++ b/docs/misc/riscv/booting.txt
@@ -0,0 +1,8 @@
+System requirements
+===================
+
+The following extensions are expected to be supported by a system on which
+Xen is run:
+- Zihintpause:
+  On a system that doesn't have this extension, cpu_relax() should be
+  implemented properly. Otherwise, an illegal instruction exception will arise.
diff --git a/xen/arch/riscv/arch.mk b/xen/arch/riscv/arch.mk
index 8403f96b6f..fabe323ec5 100644
--- a/xen/arch/riscv/arch.mk
+++ b/xen/arch/riscv/arch.mk
@@ -5,6 +5,12 @@ $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS))
 
 CFLAGS-$(CONFIG_RISCV_64) += -mabi=lp64
 
+ifeq ($(CONFIG_RISCV_64),y)
+has_zihintpause = $(call as-insn,$(CC) -mabi=lp64 -march=rv64i_zihintpause, "pause",_zihintpause,)
+else
+has_zihintpause = $(call as-insn,$(CC) -mabi=ilp32 -march=rv32i_zihintpause, "pause",_zihintpause,)
+endif
+
 riscv-march-$(CONFIG_RISCV_ISA_RV64G) := rv64g
 riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
 
@@ -12,7 +18,7 @@ riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
 # into the upper half _or_ the lower half of the address space.
 # -mcmodel=medlow would force Xen into the lower half.
 
-CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
+CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align -mcmodel=medany
 
 # TODO: Drop override when more of the build is working
 override ALL_OBJS-y = arch/$(SRCARCH)/built_in.o
diff --git a/xen/arch/riscv/include/asm/processor.h b/xen/arch/riscv/include/asm/processor.h
index 6db681d805..b96af07660 100644
--- a/xen/arch/riscv/include/asm/processor.h
+++ b/xen/arch/riscv/include/asm/processor.h
@@ -12,6 +12,9 @@
 
 #ifndef __ASSEMBLY__
 
+/* TODO: needs to be implemented */
+#define smp_processor_id() 0
+
 /* On stack VCPU state */
 struct cpu_user_regs
 {
@@ -53,6 +56,26 @@ struct cpu_user_regs
     unsigned long pregs;
 };
 
+/* TODO: need to implement */
+#define cpu_to_core(cpu)   (0)
+#define cpu_to_socket(cpu) (0)
+
+static inline void cpu_relax(void)
+{
+#ifdef __riscv_zihintpause
+    /*
+     * Reduce instruction retirement.
+     * This assumes the PC changes.
+     */
+    __asm__ __volatile__ ( "pause" );
+#else
+    /* Encoding of the pause instruction */
+    __asm__ __volatile__ ( ".insn 0x100000F" );
+#endif
+
+    barrier();
+}
+
 static inline void wfi(void)
 {
     __asm__ __volatile__ ("wfi");
-- 
2.43.0
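
To show how cpu_relax() is typically used, here is a hedged sketch of a
spin-wait loop built around it. The demo_* names are hypothetical. On RISC-V
without toolchain support for Zihintpause the sketch emits the raw pause
encoding with .4byte (an assumption that differs from the .insn form used in
the patch); on other architectures only the compiler barrier remains.

```c
#include <stdatomic.h>

static inline void demo_cpu_relax(void)
{
#if defined(__riscv) && defined(__riscv_zihintpause)
    __asm__ __volatile__ ( "pause" );
#elif defined(__riscv)
    /* Raw encoding of the pause hint for assemblers lacking Zihintpause. */
    __asm__ __volatile__ ( ".4byte 0x0100000f" );
#endif
    /* Compiler barrier, playing the role of barrier() in the patch. */
    __asm__ __volatile__ ( "" ::: "memory" );
}

/* Spin until *flag becomes non-zero; returns the number of polls made. */
unsigned long demo_wait_for_flag(atomic_int *flag)
{
    unsigned long polls = 0;

    while ( !atomic_load_explicit(flag, memory_order_acquire) )
    {
        demo_cpu_relax();
        polls++;
    }

    return polls;
}
```

The hint goes inside the loop body so that each failed poll gives the hart a
chance to back off before the flag is read again.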



^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (17 preceding siblings ...)
  2024-02-26 17:39 ` [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
@ 2024-02-26 17:39 ` Oleksii Kurochko
  2024-03-05  8:17   ` Jan Beulich
  2024-02-26 17:39 ` [PATCH v5 20/23] xen/riscv: introduce vm_event_*() functions Oleksii Kurochko
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - update the comment around "struct domain *domain;" : zero -> NULL
 - fix indentation for unsigned long val;
 - put page_to_virt() and virt_to_page() close to each other.
 - drop unnecessary leading underscore
 - drop a space before the comment: /* Count of uses of this frame as its current type. */
 - drop comment about a page 'not as a shadow'. it is not necessary for RISC-V
---
Changes in V4:
 - update an argument name of PFN_ORDER macro.
 - drop pad at the end of 'struct page_info'.
 - Change message -> subject in "Changes in V3"
 - delete duplicated macros from riscv/mm.h
 - fix indentation in struct page_info
 - align comment for PGC_ macros
 - update definitions of domain_set_alloc_bitsize() and domain_clamp_alloc_bitsize()
 - drop unnecessary comments.
 - s/BUG/BUG_ON("...")
 - define __virt_to_maddr, __maddr_to_virt as stubs
 - add inclusion of xen/mm-frame.h for mfn_x and others
 - include "xen/mm.h" instead of "asm/mm.h" to fix compilation issues:
	 In file included from arch/riscv/setup.c:7:
	./arch/riscv/include/asm/mm.h:60:28: error: field 'list' has incomplete type
	   60 |     struct page_list_entry list;
	      |                            ^~~~
	./arch/riscv/include/asm/mm.h:81:43: error: 'MAX_ORDER' undeclared here (not in a function)
	   81 |                 unsigned long first_dirty:MAX_ORDER + 1;
	      |                                           ^~~~~~~~~
	./arch/riscv/include/asm/mm.h:81:31: error: bit-field 'first_dirty' width not an integer constant
	   81 |                 unsigned long first_dirty:MAX_ORDER + 1;
 - Define __virt_to_mfn() and __mfn_to_virt() using maddr_to_mfn() and mfn_to_maddr().
---
Changes in V3:
 - update the commit title
 - introduce DIRECTMAP_VIRT_START.
 - drop changes related to pfn_to_paddr() and paddr_to_pfn() as they were removed in
   [PATCH v2 32/39] xen/riscv: add minimal stuff to asm/page.h to build full Xen
 - code style fixes.
 - drop get_page_nr and put_page_nr as they aren't needed for the time being
 - drop CONFIG_STATIC_MEMORY related things
---
Changes in V2:
 - define stub for arch_get_dma_bitsize(void)
---
 xen/arch/riscv/include/asm/mm.h | 246 ++++++++++++++++++++++++++++++++
 xen/arch/riscv/mm.c             |   2 +-
 xen/arch/riscv/setup.c          |   2 +-
 3 files changed, 248 insertions(+), 2 deletions(-)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 07c7a0abba..2f13c1c3c2 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -3,11 +3,252 @@
 #ifndef _ASM_RISCV_MM_H
 #define _ASM_RISCV_MM_H
 
+#include <public/xen.h>
+#include <xen/bug.h>
+#include <xen/mm-frame.h>
+#include <xen/pdx.h>
+#include <xen/types.h>
+
 #include <asm/page-bits.h>
 
 #define pfn_to_paddr(pfn) ((paddr_t)(pfn) << PAGE_SHIFT)
 #define paddr_to_pfn(pa)  ((unsigned long)((pa) >> PAGE_SHIFT))
 
+#define paddr_to_pdx(pa)    mfn_to_pdx(maddr_to_mfn(pa))
+#define gfn_to_gaddr(gfn)   pfn_to_paddr(gfn_x(gfn))
+#define gaddr_to_gfn(ga)    _gfn(paddr_to_pfn(ga))
+#define mfn_to_maddr(mfn)   pfn_to_paddr(mfn_x(mfn))
+#define maddr_to_mfn(ma)    _mfn(paddr_to_pfn(ma))
+#define vmap_to_mfn(va)     maddr_to_mfn(virt_to_maddr((vaddr_t)va))
+#define vmap_to_page(va)    mfn_to_page(vmap_to_mfn(va))
+
+static inline unsigned long __virt_to_maddr(unsigned long va)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+static inline void *__maddr_to_virt(unsigned long ma)
+{
+    BUG_ON("unimplemented");
+    return NULL;
+}
+
+#define virt_to_maddr(va) __virt_to_maddr((unsigned long)(va))
+#define maddr_to_virt(pa) __maddr_to_virt((unsigned long)(pa))
+
+/* Convert between Xen-heap virtual addresses and machine frame numbers. */
+#define __virt_to_mfn(va)  mfn_x(maddr_to_mfn(virt_to_maddr(va)))
+#define __mfn_to_virt(mfn) maddr_to_virt(mfn_to_maddr(_mfn(mfn)))
+
+/*
+ * We define non-underscored wrappers for above conversion functions.
+ * These are overridden in various source files while underscored versions
+ * remain intact.
+ */
+#define virt_to_mfn(va)     __virt_to_mfn(va)
+#define mfn_to_virt(mfn)    __mfn_to_virt(mfn)
+
+struct page_info
+{
+    /* Each frame can be threaded onto a doubly-linked list. */
+    struct page_list_entry list;
+
+    /* Reference count and various PGC_xxx flags and fields. */
+    unsigned long count_info;
+
+    /* Context-dependent fields follow... */
+    union {
+        /* Page is in use: ((count_info & PGC_count_mask) != 0). */
+        struct {
+            /* Type reference count and various PGT_xxx flags and fields. */
+            unsigned long type_info;
+        } inuse;
+        /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
+        union {
+            struct {
+                /*
+                 * Index of the first *possibly* unscrubbed page in the buddy.
+                 * One more bit than maximum possible order to accommodate
+                 * INVALID_DIRTY_IDX.
+                 */
+#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
+                unsigned long first_dirty:MAX_ORDER + 1;
+
+                /* Do TLBs need flushing for safety before next page use? */
+                bool need_tlbflush:1;
+
+#define BUDDY_NOT_SCRUBBING    0
+#define BUDDY_SCRUBBING        1
+#define BUDDY_SCRUB_ABORT      2
+                unsigned long scrub_state:2;
+            };
+
+            unsigned long val;
+        } free;
+    } u;
+
+    union {
+        /* Page is in use */
+        struct {
+            /* Owner of this page (NULL if page is anonymous). */
+            struct domain *domain;
+        } inuse;
+
+        /* Page is on a free list. */
+        struct {
+            /* Order-size of the free chunk this page is the head of. */
+            unsigned int order;
+        } free;
+    } v;
+
+    union {
+        /*
+         * Timestamp from 'TLB clock', used to avoid extra safety flushes.
+         * Only valid for: a) free pages, and b) pages with zero type count
+         */
+        uint32_t tlbflush_timestamp;
+    };
+};
+
+#define frame_table ((struct page_info *)FRAMETABLE_VIRT_START)
+
+/* PDX of the first page in the frame table. */
+extern unsigned long frametable_base_pdx;
+
+/* Convert between machine frame numbers and page-info structures. */
+#define mfn_to_page(mfn)                                            \
+    (frame_table + (mfn_to_pdx(mfn) - frametable_base_pdx))
+#define page_to_mfn(pg)                                             \
+    pdx_to_mfn((unsigned long)((pg) - frame_table) + frametable_base_pdx)
+
+static inline void *page_to_virt(const struct page_info *pg)
+{
+    return mfn_to_virt(mfn_x(page_to_mfn(pg)));
+}
+
+/* Convert between Xen-heap virtual addresses and page-info structures. */
+static inline struct page_info *virt_to_page(const void *v)
+{
+    BUG_ON("unimplemented");
+    return NULL;
+}
+
+/*
+ * Common code requires get_page_type and put_page_type.
+ * We don't care about typecounts so we just do the minimum to make it
+ * happy.
+ */
+static inline int get_page_type(struct page_info *page, unsigned long type)
+{
+    return 1;
+}
+
+static inline void put_page_type(struct page_info *page)
+{
+}
+
+static inline void put_page_and_type(struct page_info *page)
+{
+    put_page_type(page);
+    put_page(page);
+}
+
+/*
+ * RISC-V does not have an M2P, but common code expects a handful of
+ * M2P-related defines and functions. Provide dummy versions of these.
+ */
+#define INVALID_M2P_ENTRY        (~0UL)
+#define SHARED_M2P_ENTRY         (~0UL - 1UL)
+#define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
+
+#define set_gpfn_from_mfn(mfn, pfn) do { (void)(mfn), (void)(pfn); } while (0)
+#define mfn_to_gfn(d, mfn) ((void)(d), _gfn(mfn_x(mfn)))
+
+#define PDX_GROUP_SHIFT (PAGE_SHIFT + VPN_BITS)
+
+static inline unsigned long domain_get_maximum_gpfn(struct domain *d)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+static inline long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+/*
+ * On RISCV, all the RAM is currently direct mapped in Xen.
+ * Hence return always true.
+ */
+static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
+{
+    return true;
+}
+
+#define PG_shift(idx)   (BITS_PER_LONG - (idx))
+#define PG_mask(x, idx) (x ## UL << PG_shift(idx))
+
+#define PGT_none          PG_mask(0, 1)  /* no special uses of this page   */
+#define PGT_writable_page PG_mask(1, 1)  /* has writable mappings?         */
+#define PGT_type_mask     PG_mask(1, 1)  /* Bits 31 or 63.                 */
+
+/* Count of uses of this frame as its current type. */
+#define PGT_count_width   PG_shift(2)
+#define PGT_count_mask    ((1UL << PGT_count_width) - 1)
+
+/*
+ * Page needs to be scrubbed. Since this bit can only be set on a page that is
+ * free (i.e. in PGC_state_free) we can reuse PGC_allocated bit.
+ */
+#define _PGC_need_scrub   _PGC_allocated
+#define PGC_need_scrub    PGC_allocated
+
+/* Cleared when the owning guest 'frees' this page. */
+#define _PGC_allocated    PG_shift(1)
+#define PGC_allocated     PG_mask(1, 1)
+/* Page is Xen heap? */
+#define _PGC_xen_heap     PG_shift(2)
+#define PGC_xen_heap      PG_mask(1, 2)
+/* Page is broken? */
+#define _PGC_broken       PG_shift(7)
+#define PGC_broken        PG_mask(1, 7)
+/* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
+#define PGC_state         PG_mask(3, 9)
+#define PGC_state_inuse   PG_mask(0, 9)
+#define PGC_state_offlining PG_mask(1, 9)
+#define PGC_state_offlined PG_mask(2, 9)
+#define PGC_state_free    PG_mask(3, 9)
+#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+
+/* Count of references to this frame. */
+#define PGC_count_width   PG_shift(9)
+#define PGC_count_mask    ((1UL << PGC_count_width) - 1)
+
+#define _PGC_extra        PG_shift(10)
+#define PGC_extra         PG_mask(1, 10)
+
+#define is_xen_heap_page(page) ((page)->count_info & PGC_xen_heap)
+#define is_xen_heap_mfn(mfn) \
+    (mfn_valid(mfn) && is_xen_heap_page(mfn_to_page(mfn)))
+
+#define is_xen_fixed_mfn(mfn)                                   \
+    ((mfn_to_maddr(mfn) >= virt_to_maddr((vaddr_t)_start)) &&   \
+     (mfn_to_maddr(mfn) <= virt_to_maddr((vaddr_t)_end - 1)))
+
+#define page_get_owner(p)    (p)->v.inuse.domain
+#define page_set_owner(p, d) ((p)->v.inuse.domain = (d))
+
+/* TODO: implement */
+#define mfn_valid(mfn) ({ (void)(mfn); 0; })
+
+#define domain_set_alloc_bitsize(d) ((void)(d))
+#define domain_clamp_alloc_bitsize(d, b) ((void)(d), (b))
+
+#define PFN_ORDER(pfn) ((pfn)->v.free.order)
+
 extern unsigned char cpu0_boot_stack[];
 
 void setup_initial_pagetables(void);
@@ -20,4 +261,9 @@ unsigned long calc_phys_offset(void);
 
 void turn_on_mmu(unsigned long ra);
 
+static inline unsigned int arch_get_dma_bitsize(void)
+{
+    return 32; /* TODO */
+}
+
 #endif /* _ASM_RISCV_MM_H */
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index 053f043a3d..fe3a43be20 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -5,12 +5,12 @@
 #include <xen/init.h>
 #include <xen/kernel.h>
 #include <xen/macros.h>
+#include <xen/mm.h>
 #include <xen/pfn.h>
 
 #include <asm/early_printk.h>
 #include <asm/csr.h>
 #include <asm/current.h>
-#include <asm/mm.h>
 #include <asm/page.h>
 #include <asm/processor.h>
 
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 6593f601c1..98a94c4c48 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -2,9 +2,9 @@
 
 #include <xen/compile.h>
 #include <xen/init.h>
+#include <xen/mm.h>
 
 #include <asm/early_printk.h>
-#include <asm/mm.h>
 
 /* Xen stack for bringing up the first CPU. */
 unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
-- 
2.43.0
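
The PG_shift() / PG_mask() helpers count bit positions from the top of an
unsigned long, which can be easy to misread. The sketch below reuses the macro
shapes from the patch in a user-space program to show where the PGC_ fields
land. DEMO_BITS_PER_LONG and struct demo_page are hypothetical stand-ins for
BITS_PER_LONG and struct page_info.

```c
#include <limits.h>

/* Assumption: stand-in for Xen's BITS_PER_LONG. */
#define DEMO_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* idx counts from the most significant bit: idx 1 is the top bit. */
#define PG_shift(idx)   (DEMO_BITS_PER_LONG - (idx))
#define PG_mask(x, idx) (x ## UL << PG_shift(idx))

#define PGC_allocated   PG_mask(1, 1)
#define PGC_state       PG_mask(3, 9)
#define PGC_state_inuse PG_mask(0, 9)
#define PGC_state_free  PG_mask(3, 9)

/* Everything below the flag bits is the reference count. */
#define PGC_count_width PG_shift(9)
#define PGC_count_mask  ((1UL << PGC_count_width) - 1)

/* Hypothetical cut-down page_info with only the field used here. */
struct demo_page { unsigned long count_info; };

#define page_state_is(pg, st) \
    (((pg)->count_info & PGC_state) == PGC_state_##st)
```

With a 64-bit unsigned long, PGC_allocated is bit 63, the two PGC_state bits
sit at bits 56:55, and the reference count occupies bits 54:0, so the flag
fields and the count never overlap.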



^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v5 20/23] xen/riscv: introduce vm_event_*() functions
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (18 preceding siblings ...)
  2024-02-26 17:39 ` [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
@ 2024-02-26 17:39 ` Oleksii Kurochko
  2024-02-26 17:39 ` [PATCH v5 21/23] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Tamas K Lengyel, Alexandru Isaila, Petre Pircalabu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - Only rebase was done.
---
Changes in V4:
  - New patch.
---
 xen/arch/riscv/Makefile   |  1 +
 xen/arch/riscv/vm_event.c | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 xen/arch/riscv/vm_event.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 2fefe14e7c..1ed1a8369b 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
 obj-y += traps.o
+obj-y += vm_event.o
 
 $(TARGET): $(TARGET)-syms
 	$(OBJCOPY) -O binary -S $< $@
diff --git a/xen/arch/riscv/vm_event.c b/xen/arch/riscv/vm_event.c
new file mode 100644
index 0000000000..bb1fc73bc1
--- /dev/null
+++ b/xen/arch/riscv/vm_event.c
@@ -0,0 +1,19 @@
+#include <xen/bug.h>
+
+struct vm_event_st;
+struct vcpu;
+
+void vm_event_fill_regs(struct vm_event_st *req)
+{
+    BUG_ON("unimplemented");
+}
+
+void vm_event_set_registers(struct vcpu *v, struct vm_event_st *rsp)
+{
+    BUG_ON("unimplemented");
+}
+
+void vm_event_monitor_next_interrupt(struct vcpu *v)
+{
+    /* Not supported on RISCV. */
+}
-- 
2.43.0
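
The stubs in this series use BUG_ON("unimplemented"). A string literal decays
to a non-NULL pointer, so the condition is always true and the message shows up
in the failure report. The sketch below demonstrates the idiom in user space.
DEMO_BUG_ON and demo_bug() are hypothetical; unlike Xen's BUG_ON(), they only
count and log the hit instead of crashing, so the behaviour can be observed.

```c
#include <stdio.h>

static unsigned int demo_bug_hits;

/* Hypothetical handler: records the hit instead of stopping execution. */
static void demo_bug(const char *file, int line, const char *cond)
{
    demo_bug_hits++;
    fprintf(stderr, "BUG at %s:%d: %s\n", file, line, cond);
}

#define DEMO_BUG_ON(cond) \
    do { if ( !!(cond) ) demo_bug(__FILE__, __LINE__, #cond); } while ( 0 )

/* Stub in the style of vm_event_fill_regs(): trips unconditionally. */
void demo_unimplemented_stub(void)
{
    DEMO_BUG_ON("unimplemented");
}
```

Calling demo_unimplemented_stub() always reports a hit, while an ordinary
false condition such as DEMO_BUG_ON(1 + 1 == 3) does not.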



^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH v5 21/23] xen/rirscv: add minimal amount of stubs to build full Xen
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (19 preceding siblings ...)
  2024-02-26 17:39 ` [PATCH v5 20/23] xen/riscv: introduce vm_event_*() functions Oleksii Kurochko
@ 2024-02-26 17:39 ` Oleksii Kurochko
  2024-03-05  8:40   ` Jan Beulich
  2024-02-26 17:39 ` [PATCH v5 22/23] xen/riscv: enable full Xen build Oleksii Kurochko
  2024-02-26 17:39 ` [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - drop unrelated changes
 - assert_failed("unimplmented...") changed to BUG_ON()
---
Changes in V4:
  - added new stubs which are necessary for compilation after rebase: __cpu_up(), __cpu_disable(), __cpu_die()
    from smpboot.c
  - bring back changes related to printk() in early_printk() as they should be removed in the next patch to avoid
    compilation error.
  - update definition of cpu_khz: __read_mostly -> __ro_after_init.
  - drop vm_event_reset_vmtrace(). It is defined in asm-generic/vm_event.h.
  - move vm_event_*() functions from stubs.c to riscv/vm_event.c.
  - s/BUG/BUG_ON("unimplemented") in stubs.c
  - bring back irq_actor_none() and irq_actor_none() as common/irq.c isn't compiled at this moment,
    so these functions are needed to avoid compilation error.
  - define max_page to avoid compilation error; it will be removed as soon as common/page_alloc.c is
    compiled.
---
Changes in V3:
 - code style fixes.
 - update attribute for frametable_base_pdx and frametable_virt_end to __ro_after_init
   instead of read_mostly.
 - use BUG() instead of assert_failed/WARN for newly introduced stubs.
 - drop "#include <public/vm_event.h>" in stubs.c and use forward declaration instead.
 - drop ack_node() and end_node() as they aren't used now.
---
Changes in V2:
 - define udelay stub
 - remove 'select HAS_PDX' from RISC-V Kconfig because of
   https://lore.kernel.org/xen-devel/20231006144405.1078260-1-andrew.cooper3@citrix.com/
---
 xen/arch/riscv/Makefile |   1 +
 xen/arch/riscv/mm.c     |  50 +++++
 xen/arch/riscv/setup.c  |   8 +
 xen/arch/riscv/stubs.c  | 438 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/riscv/traps.c  |  25 +++
 5 files changed, 522 insertions(+)
 create mode 100644 xen/arch/riscv/stubs.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 1ed1a8369b..60afbc0ad9 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -4,6 +4,7 @@ obj-y += mm.o
 obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
+obj-y += stubs.o
 obj-y += traps.o
 obj-y += vm_event.o
 
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index fe3a43be20..2c3fb7d72e 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <xen/bug.h>
 #include <xen/cache.h>
 #include <xen/compiler.h>
 #include <xen/init.h>
@@ -14,6 +15,9 @@
 #include <asm/page.h>
 #include <asm/processor.h>
 
+unsigned long __ro_after_init frametable_base_pdx;
+unsigned long __ro_after_init frametable_virt_end;
+
 struct mmu_desc {
     unsigned int num_levels;
     unsigned int pgtbl_count;
@@ -294,3 +298,49 @@ unsigned long __init calc_phys_offset(void)
     phys_offset = load_start - XEN_VIRT_START;
     return phys_offset;
 }
+
+void put_page(struct page_info *page)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned long get_upper_mfn_bound(void)
+{
+    /* No memory hotplug yet, so current memory limit is the final one. */
+    return max_page - 1;
+}
+
+void arch_dump_shared_mem_info(void)
+{
+    BUG_ON("unimplemented");
+}
+
+int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
+{
+    BUG_ON("unimplemented");
+    return -1;
+}
+
+int xenmem_add_to_physmap_one(struct domain *d, unsigned int space,
+                              union add_to_physmap_extra extra,
+                              unsigned long idx, gfn_t gfn)
+{
+    BUG_ON("unimplemented");
+
+    return 0;
+}
+
+int destroy_xen_mappings(unsigned long s, unsigned long e)
+{
+    BUG_ON("unimplemented");
+    return -1;
+}
+
+int map_pages_to_xen(unsigned long virt,
+                     mfn_t mfn,
+                     unsigned long nr_mfns,
+                     unsigned int flags)
+{
+    BUG_ON("unimplemented");
+    return -1;
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 98a94c4c48..8bb5bdb2ae 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -1,11 +1,19 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <xen/bug.h>
 #include <xen/compile.h>
 #include <xen/init.h>
 #include <xen/mm.h>
 
+#include <public/version.h>
+
 #include <asm/early_printk.h>
 
+void arch_get_xen_caps(xen_capabilities_info_t *info)
+{
+    BUG_ON("unimplemented");
+}
+
 /* Xen stack for bringing up the first CPU. */
 unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
     __aligned(STACK_SIZE);
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
new file mode 100644
index 0000000000..529f1dbe52
--- /dev/null
+++ b/xen/arch/riscv/stubs.c
@@ -0,0 +1,438 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include <xen/cpumask.h>
+#include <xen/domain.h>
+#include <xen/irq.h>
+#include <xen/nodemask.h>
+#include <xen/time.h>
+#include <public/domctl.h>
+
+#include <asm/current.h>
+
+/* smpboot.c */
+
+cpumask_t cpu_online_map;
+cpumask_t cpu_present_map;
+cpumask_t cpu_possible_map;
+
+/* ID of the PCPU we're running on */
+DEFINE_PER_CPU(unsigned int, cpu_id);
+/* XXX these seem awfully x86ish... */
+/* representing HT siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
+/* representing HT and core siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
+
+nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
+
+/*
+ * max_page is defined in page_alloc.c which isn't compiled for now.
+ * definition of max_page will be removed as soon as page_alloc is built.
+ */
+unsigned long __read_mostly max_page;
+
+/* time.c */
+
+unsigned long __ro_after_init cpu_khz;  /* CPU clock frequency in kHz. */
+
+s_time_t get_s_time(void)
+{
+    BUG_ON("unimplemented");
+}
+
+int reprogram_timer(s_time_t timeout)
+{
+    BUG_ON("unimplemented");
+}
+
+void send_timer_event(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds)
+{
+    BUG_ON("unimplemented");
+}
+
+/* shutdown.c */
+
+void machine_restart(unsigned int delay_millisecs)
+{
+    BUG_ON("unimplemented");
+}
+
+void machine_halt(void)
+{
+    BUG_ON("unimplemented");
+}
+
+/* domctl.c */
+
+long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
+                    XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_get_domain_info(const struct domain *d,
+                          struct xen_domctl_getdomaininfo *info)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
+{
+    BUG_ON("unimplemented");
+}
+
+/* monitor.c */
+
+int arch_monitor_domctl_event(struct domain *d,
+                              struct xen_domctl_monitor_op *mop)
+{
+    BUG_ON("unimplemented");
+}
+
+/* smp.c */
+
+void arch_flush_tlb_mask(const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+void smp_send_event_check_mask(const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+void smp_send_call_function_mask(const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+/* irq.c */
+
+struct pirq *alloc_pirq_struct(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+int pirq_guest_bind(struct vcpu *v, struct pirq *pirq, int will_share)
+{
+    BUG_ON("unimplemented");
+}
+
+void pirq_guest_unbind(struct domain *d, struct pirq *pirq)
+{
+    BUG_ON("unimplemented");
+}
+
+void pirq_set_affinity(struct domain *d, int pirq, const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+hw_irq_controller no_irq_type = {
+    .typename = "none",
+    .startup = irq_startup_none,
+    .shutdown = irq_shutdown_none,
+    .enable = irq_enable_none,
+    .disable = irq_disable_none,
+};
+
+int arch_init_one_irq_desc(struct irq_desc *desc)
+{
+    BUG_ON("unimplemented");
+}
+
+void smp_send_state_dump(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
+/* domain.c */
+
+DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
+unsigned long __per_cpu_offset[NR_CPUS];
+
+void context_switch(struct vcpu *prev, struct vcpu *next)
+{
+    BUG_ON("unimplemented");
+}
+
+void continue_running(struct vcpu *same)
+{
+    BUG_ON("unimplemented");
+}
+
+void sync_local_execstate(void)
+{
+    BUG_ON("unimplemented");
+}
+
+void sync_vcpu_execstate(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void startup_cpu_idle_loop(void)
+{
+    BUG_ON("unimplemented");
+}
+
+void free_domain_struct(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void dump_pageframe_info(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void free_vcpu_struct(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_vcpu_create(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_vcpu_destroy(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_switch_to_aarch64_mode(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_domain_create(struct domain *d,
+                       struct xen_domctl_createdomain *config,
+                       unsigned int flags)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_domain_teardown(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_destroy(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_shutdown(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_pause(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_unpause(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_domain_soft_reset(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_creation_finished(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_vcpu_reset(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+int domain_relinquish_resources(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_dump_domain_info(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_dump_vcpu_info(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_mark_events_pending(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_update_evtchn_irq(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_block_unless_event_pending(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_kick(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+struct domain *alloc_domain_struct(void)
+{
+    BUG_ON("unimplemented");
+}
+
+struct vcpu *alloc_vcpu_struct(const struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned long
+hypercall_create_continuation(unsigned int op, const char *format, ...)
+{
+    BUG_ON("unimplemented");
+}
+
+int __init parse_arch_dom0_param(const char *s, const char *e)
+{
+    BUG_ON("unimplemented");
+}
+
+/* guestcopy.c */
+
+unsigned long raw_copy_to_guest(void *to, const void *from, unsigned int len)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned long raw_copy_from_guest(void *to, const void __user *from,
+                                  unsigned int len)
+{
+    BUG_ON("unimplemented");
+}
+
+/* sysctl.c */
+
+long arch_do_sysctl(struct xen_sysctl *sysctl,
+                    XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
+{
+    BUG_ON("unimplemented");
+}
+
+/* p2m.c */
+
+int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
+{
+    BUG_ON("unimplemented");
+}
+
+int unmap_mmio_regions(struct domain *d,
+                       gfn_t start_gfn,
+                       unsigned long nr,
+                       mfn_t mfn)
+{
+    BUG_ON("unimplemented");
+}
+
+int map_mmio_regions(struct domain *d,
+                     gfn_t start_gfn,
+                     unsigned long nr,
+                     mfn_t mfn)
+{
+    BUG_ON("unimplemented");
+}
+
+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
+                          unsigned long gfn, mfn_t mfn)
+{
+    BUG_ON("unimplemented");
+}
+
+/* Return the size of the pool, in bytes. */
+int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
+{
+    BUG_ON("unimplemented");
+}
+
+/* delay.c */
+
+void udelay(unsigned long usecs)
+{
+    BUG_ON("unimplemented");
+}
+
+/* guest_access.h */
+
+static inline unsigned long raw_clear_guest(void *to, unsigned int len)
+{
+    BUG_ON("unimplemented");
+}
+
+/* smpboot.c */
+
+int __cpu_up(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
+void __cpu_disable(void)
+{
+    BUG_ON("unimplemented");
+}
+
+void __cpu_die(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
+/*
+ * The following functions are defined in common/irq.c, which will be built in
+ * the next commit, so these changes will be removed there.
+ */
+
+void cf_check irq_actor_none(struct irq_desc *desc)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned int cf_check irq_startup_none(struct irq_desc *desc)
+{
+    BUG_ON("unimplemented");
+
+    return 0;
+}
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index ccd3593f5a..5415cf8d90 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -4,6 +4,10 @@
  *
  * RISC-V Trap handlers
  */
+
+#include <xen/lib.h>
+#include <xen/sched.h>
+
 #include <asm/processor.h>
 #include <asm/traps.h>
 
@@ -11,3 +15,24 @@ void do_trap(struct cpu_user_regs *cpu_regs)
 {
     die();
 }
+
+void vcpu_show_execution_state(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void show_execution_state(const struct cpu_user_regs *regs)
+{
+    printk("implement show_execution_state(regs)\n");
+}
+
+void arch_hypercall_tasklet_result(struct vcpu *v, long res)
+{
+    BUG_ON("unimplemented");
+}
+
+enum mc_disposition arch_do_multicall_call(struct mc_state *state)
+{
+    BUG_ON("unimplemented");
+    return mc_continue;
+}
-- 
2.43.0
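
A note on the stub pattern used throughout stubs.c above: BUG_ON() is handed a string literal, which decays to a non-NULL pointer, so the condition is always true and the stub traps as soon as it is called, while the literal documents the reason in the source. The following is only a minimal standalone sketch of that idea. The BUG_ON() defined here is a hypothetical userspace stand-in (it prints and calls abort()), not Xen's macro, and arch_example_stub() and stub_condition_is_true() are invented names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for Xen's BUG_ON(); not the real macro. */
#define BUG_ON(cond)                                          \
    do {                                                      \
        if ( cond )                                           \
        {                                                     \
            fprintf(stderr, "BUG at %s:%d: %s\n",             \
                    __FILE__, __LINE__, #cond);               \
            abort();                                          \
        }                                                     \
    } while ( 0 )

/* A string literal is a non-NULL pointer, so it is always "true". */
static int stub_condition_is_true(void)
{
    return !!"unimplemented";
}

/* Shape of a stub: it links fine, but traps as soon as anyone calls it. */
int arch_example_stub(void)
{
    BUG_ON("unimplemented");

    return 0; /* not reached in this sketch, since BUG_ON() aborts first */
}
```

Calling arch_example_stub() in this sketch aborts the program, which mirrors the intent of the stubs: the symbols exist so that the full build links, but any caller that actually reaches one is flagged immediately.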




* [PATCH v5 22/23] xen/riscv: enable full Xen build
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (20 preceding siblings ...)
  2024-02-26 17:39 ` [PATCH v5 21/23] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
@ 2024-02-26 17:39 ` Oleksii Kurochko
  2024-02-26 17:39 ` [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
  22 siblings, 0 replies; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V5:
 - Nothing changed. Only rebase.
---
Changes in V4:
 - drop stubs for irq_actor_none() and irq_startup_none() as common/irq.c is compiled now.
 - drop definition of max_page in stubs.c as common/page_alloc.c is compiled now.
 - drop printk() related changes in riscv/early_printk.c as common version will be used.
---
Changes in V3:
 - Reviewed-by: Jan Beulich <jbeulich@suse.com>
 - unrelated change dropped in tiny64_defconfig
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/Makefile       |  16 +++-
 xen/arch/riscv/arch.mk        |   4 -
 xen/arch/riscv/early_printk.c | 168 ----------------------------------
 xen/arch/riscv/stubs.c        |  23 -----
 4 files changed, 15 insertions(+), 196 deletions(-)

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 60afbc0ad9..81b77b13d6 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -12,10 +12,24 @@ $(TARGET): $(TARGET)-syms
 	$(OBJCOPY) -O binary -S $< $@
 
 $(TARGET)-syms: $(objtree)/prelink.o $(obj)/xen.lds
-	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) -o $@
+	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< \
+	    $(objtree)/common/symbols-dummy.o -o $(dot-target).0
+	$(NM) -pa --format=sysv $(dot-target).0 \
+		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+		> $(dot-target).0.S
+	$(MAKE) $(build)=$(@D) $(dot-target).0.o
+	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< \
+	    $(dot-target).0.o -o $(dot-target).1
+	$(NM) -pa --format=sysv $(dot-target).1 \
+		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+		> $(dot-target).1.S
+	$(MAKE) $(build)=$(@D) $(dot-target).1.o
+	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
+	    $(dot-target).1.o -o $@
 	$(NM) -pa --format=sysv $@ \
 		| $(objtree)/tools/symbols --all-symbols --xensyms --sysv --sort \
 		> $@.map
+	rm -f $(@D)/.$(@F).[0-9]*
 
 $(obj)/xen.lds: $(src)/xen.lds.S FORCE
 	$(call if_changed_dep,cpp_lds_S)
diff --git a/xen/arch/riscv/arch.mk b/xen/arch/riscv/arch.mk
index fabe323ec5..197d5e1893 100644
--- a/xen/arch/riscv/arch.mk
+++ b/xen/arch/riscv/arch.mk
@@ -19,7 +19,3 @@ riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
 # -mcmodel=medlow would force Xen into the lower half.
 
 CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align -mcmodel=medany
-
-# TODO: Drop override when more of the build is working
-override ALL_OBJS-y = arch/$(SRCARCH)/built_in.o
-override ALL_LIBS-y =
diff --git a/xen/arch/riscv/early_printk.c b/xen/arch/riscv/early_printk.c
index 60742a042d..610c814f54 100644
--- a/xen/arch/riscv/early_printk.c
+++ b/xen/arch/riscv/early_printk.c
@@ -40,171 +40,3 @@ void early_printk(const char *str)
         str++;
     }
 }
-
-/*
- * The following #if 1 ... #endif should be removed after printk
- * and related stuff are ready.
- */
-#if 1
-
-#include <xen/stdarg.h>
-#include <xen/string.h>
-
-/**
- * strlen - Find the length of a string
- * @s: The string to be sized
- */
-size_t (strlen)(const char * s)
-{
-    const char *sc;
-
-    for (sc = s; *sc != '\0'; ++sc)
-        /* nothing */;
-    return sc - s;
-}
-
-/**
- * memcpy - Copy one area of memory to another
- * @dest: Where to copy to
- * @src: Where to copy from
- * @count: The size of the area.
- *
- * You should not use this function to access IO space, use memcpy_toio()
- * or memcpy_fromio() instead.
- */
-void *(memcpy)(void *dest, const void *src, size_t count)
-{
-    char *tmp = (char *) dest, *s = (char *) src;
-
-    while (count--)
-        *tmp++ = *s++;
-
-    return dest;
-}
-
-int vsnprintf(char* str, size_t size, const char* format, va_list args)
-{
-    size_t i = 0; /* Current position in the output string */
-    size_t written = 0; /* Total number of characters written */
-    char* dest = str;
-
-    while ( format[i] != '\0' && written < size - 1 )
-    {
-        if ( format[i] == '%' )
-        {
-            i++;
-
-            if ( format[i] == '\0' )
-                break;
-
-            if ( format[i] == '%' )
-            {
-                if ( written < size - 1 )
-                {
-                    dest[written] = '%';
-                    written++;
-                }
-                i++;
-                continue;
-            }
-
-            /*
-             * Handle format specifiers.
-             * For simplicity, only %s and %d are implemented here.
-             */
-
-            if ( format[i] == 's' )
-            {
-                char* arg = va_arg(args, char*);
-                size_t arglen = strlen(arg);
-
-                size_t remaining = size - written - 1;
-
-                if ( arglen > remaining )
-                    arglen = remaining;
-
-                memcpy(dest + written, arg, arglen);
-
-                written += arglen;
-                i++;
-            }
-            else if ( format[i] == 'd' )
-            {
-                int arg = va_arg(args, int);
-
-                /* Convert the integer to string representation */
-                char numstr[32]; /* Assumes a maximum of 32 digits */
-                int numlen = 0;
-                int num = arg;
-                size_t remaining;
-
-                if ( arg < 0 )
-                {
-                    if ( written < size - 1 )
-                    {
-                        dest[written] = '-';
-                        written++;
-                    }
-
-                    num = -arg;
-                }
-
-                do
-                {
-                    numstr[numlen] = '0' + num % 10;
-                    num = num / 10;
-                    numlen++;
-                } while ( num > 0 );
-
-                /* Reverse the string */
-                for (int j = 0; j < numlen / 2; j++)
-                {
-                    char tmp = numstr[j];
-                    numstr[j] = numstr[numlen - 1 - j];
-                    numstr[numlen - 1 - j] = tmp;
-                }
-
-                remaining = size - written - 1;
-
-                if ( numlen > remaining )
-                    numlen = remaining;
-
-                memcpy(dest + written, numstr, numlen);
-
-                written += numlen;
-                i++;
-            }
-        }
-        else
-        {
-            if ( written < size - 1 )
-            {
-                dest[written] = format[i];
-                written++;
-            }
-            i++;
-        }
-    }
-
-    if ( size > 0 )
-        dest[written] = '\0';
-
-    return written;
-}
-
-void printk(const char *format, ...)
-{
-    static char buf[1024];
-
-    va_list args;
-    va_start(args, format);
-
-    (void)vsnprintf(buf, sizeof(buf), format, args);
-
-    early_printk(buf);
-
-    va_end(args);
-}
-
-#endif
-
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 529f1dbe52..bda35fc347 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -24,12 +24,6 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-/*
- * max_page is defined in page_alloc.c which isn't complied for now.
- * definition of max_page will be remove as soon as page_alloc is built.
- */
-unsigned long __read_mostly max_page;
-
 /* time.c */
 
 unsigned long __ro_after_init cpu_khz;  /* CPU clock frequency in kHz. */
@@ -419,20 +413,3 @@ void __cpu_die(unsigned int cpu)
 {
     BUG_ON("unimplemented");
 }
-
-/*
- * The following functions are defined in common/irq.c, which will be built in
- * the next commit, so these changes will be removed there.
- */
-
-void cf_check irq_actor_none(struct irq_desc *desc)
-{
-    BUG_ON("unimplemented");
-}
-
-unsigned int cf_check irq_startup_none(struct irq_desc *desc)
-{
-    BUG_ON("unimplemented");
-
-    return 0;
-}
-- 
2.43.0




* [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (21 preceding siblings ...)
  2024-02-26 17:39 ` [PATCH v5 22/23] xen/riscv: enable full Xen build Oleksii Kurochko
@ 2024-02-26 17:39 ` Oleksii Kurochko
  2024-02-27  7:55   ` Jan Beulich
  22 siblings, 1 reply; 88+ messages in thread
From: Oleksii Kurochko @ 2024-02-26 17:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

This patch doesn't represent a strict lower bound for GCC and
GNU Binutils; rather, these versions are specifically employed by
the Xen RISC-V container and are anticipated to undergo continuous
testing.

While it is feasible to utilize Clang, it's important to note that,
currently, there is no Xen RISC-V CI job in place to verify the
seamless functioning of the build with Clang.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V5:
  - update the commit message and README file with additional explanation about GCC and
    GNU Binutils versions. Additionally, information about Clang was added.
---
 Changes in V4:
  - Update version of GCC (12.2) and GNU Binutils (2.39) to the versions
    which are in Xen's container for RISC-V
---
 Changes in V3:
  - new patch
---
 README | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/README b/README
index c8a108449e..7fd4173743 100644
--- a/README
+++ b/README
@@ -48,6 +48,15 @@ provided by your OS distributor:
       - For ARM 64-bit:
         - GCC 5.1 or later
         - GNU Binutils 2.24 or later
+      - For RISC-V 64-bit:
+        - GCC 12.2 or later
+        - GNU Binutils 2.39 or later
+        This doesn't represent a strict lower bound for GCC and GNU Binutils;
+        rather, these versions are specifically employed by the Xen RISC-V
+        container and are anticipated to undergo continuous testing.
+        While it is feasible to utilize Clang, it's important to note that,
+        currently, there is no Xen RISC-V CI job in place to verify the
+        seamless functioning of the build with Clang.
     * POSIX compatible awk
     * Development install of zlib (e.g., zlib-dev)
     * Development install of Python 2.7 or later (e.g., python-dev)
-- 
2.43.0




* Re: [PATCH v5 02/23] xen/riscv: use some asm-generic headers
  2024-02-26 17:38 ` [PATCH v5 02/23] xen/riscv: use some asm-generic headers Oleksii Kurochko
@ 2024-02-27  7:35   ` Jan Beulich
  0 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-02-27  7:35 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> The following headers end up the same as asm-generic's version:
> * altp2m.h
> * device.h
> * div64.h
> * hardirq.h
> * hypercall.h
> * iocap.h
> * paging.h
> * percpu.h
> * random.h
> * softirq.h
> * vm_event.h
> 
> RISC-V should utilize the asm-generic's version of the mentioned
> headers instead of introducing them in the arch-specific folder.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in V5:
>  - Nothing changed. Only rebase.
>  - update the commit message.
>  - drop the message above revision log as there is no depenency for this patch
>    from other patch series.

Please can you make sure you submit patches against a sufficiently up-to-
date version of the staging branch? I committed v4 of this patch a couple
of days ago already, upon your own confirmation that it was okay to go in
ahead of others.

Jan



* Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-26 17:38 ` [PATCH v5 03/23] xen/riscv: introduce nospec.h Oleksii Kurochko
@ 2024-02-27  7:38   ` Jan Beulich
  2024-02-28  9:59     ` Oleksii
  2024-02-29 13:49   ` Julien Grall
  2024-02-29 16:27   ` Jan Beulich
  2 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-27  7:38 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> From the unpriviliged doc:
>   No standard hints are presently defined.
>   We anticipate standard hints to eventually include memory-system spatial
>   and temporal locality hints, branch prediction hints, thread-scheduling
>   hints, security tags, and instrumentation flags for simulation/emulation.
> 
> Also, there are no speculation execution barriers.
> 
> Therefore, functions evaluate_nospec() and block_speculation() should
> remain empty until a specific platform has an extension to deal with
> speculation execution.

What about array_index_mask_nospec(), though? No custom implementation,
meaning the generic one will be used there? If that's the intention,
then ...

> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-26 17:39 ` [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
@ 2024-02-27  7:55   ` Jan Beulich
  2024-02-28 17:03     ` Oleksii
  2024-02-28 22:58     ` Julien Grall
  0 siblings, 2 replies; 88+ messages in thread
From: Jan Beulich @ 2024-02-27  7:55 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 26.02.2024 18:39, Oleksii Kurochko wrote:
> This patch doesn't represent a strict lower bound for GCC and
> GNU Binutils; rather, these versions are specifically employed by
> the Xen RISC-V container and are anticipated to undergo continuous
> testing.

Up until that container is updated to a newer gcc. I'm
afraid I view this as too weak a criterion, but I'm also not meaning to
stand in the way if somebody else wants to ack this patch in this form;
my bare minimum requirement is now met.

> --- a/README
> +++ b/README
> @@ -48,6 +48,15 @@ provided by your OS distributor:
>        - For ARM 64-bit:
>          - GCC 5.1 or later
>          - GNU Binutils 2.24 or later
> +      - For RISC-V 64-bit:
> +        - GCC 12.2 or later
> +        - GNU Binutils 2.39 or later
> +        This doesn't represent a strict lower bound for GCC and GNU Binutils;
> +        rather, these versions are specifically employed by the Xen RISC-V
> +        container and are anticipated to undergo continuous testing.

As per above, I think here it really needs saying "at the time of writing"
or recording a concrete date. Furthermore I expect "these versions" relates
to the specifically named versions and particularly _not_ to "or later":
With the criteria you apply, using later versions (or in fact any version
other than the very specific ones used in the container) would be similarly
untested. Much like x86 and Arm don't have the full range of permitted
tool chain versions continuously tested. Plus don't forget that distros may
apply their own selection of patches on top of what they take from upstream
(and they may also take random snapshots rather than released versions).

IOW it is hard for me to see why RISC-V needs stronger restrictions here
than other architectures. It ought to be possible to determine a baseline
version. Even if taking the desire to have "pause" available as a
requirement, gas (and presumably gld) 2.36.1 would already suffice.

Jan
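
The baseline question above comes down to comparing version tuples. As a purely illustrative sketch (none of these names exist in Xen; TOOL_VERSION(), version_at_least() and gcc_meets_baseline() are hypothetical), a version can be encoded as one integer and checked against whatever minimum is eventually agreed. The 12.2 figure below is only the number from the patch under discussion, not a decided baseline.

```c
/* Encode major.minor.patch as one comparable integer (each part < 100). */
#define TOOL_VERSION(maj, min, patch) \
    ((maj) * 10000L + (min) * 100L + (patch))

/* True when the version we have is at least the version we want. */
static inline int version_at_least(long have, long want)
{
    return have >= want;
}

/* Check the compiler building this file against the assumed 12.2 minimum. */
static inline int gcc_meets_baseline(void)
{
#if defined(__GNUC__) && !defined(__clang__)
    return version_at_least(TOOL_VERSION(__GNUC__, __GNUC_MINOR__,
                                         __GNUC_PATCHLEVEL__),
                            TOOL_VERSION(12, 2, 0));
#else
    return 0; /* not GCC: this sketch makes no claim about other compilers */
#endif
}
```

With this encoding, Binutils 2.39 satisfies a 2.36.1 minimum while 2.36 does not, which is the kind of distinction a documented baseline would have to make.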



* Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-27  7:38   ` Jan Beulich
@ 2024-02-28  9:59     ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-02-28  9:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Tue, 2024-02-27 at 08:38 +0100, Jan Beulich wrote:
> On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > From the unpriviliged doc:
> >   No standard hints are presently defined.
> >   We anticipate standard hints to eventually include memory-system
> > spatial
> >   and temporal locality hints, branch prediction hints, thread-
> > scheduling
> >   hints, security tags, and instrumentation flags for
> > simulation/emulation.
> > 
> > Also, there are no speculation execution barriers.
> > 
> > Therefore, functions evaluate_nospec() and block_speculation()
> > should
> > remain empty until a specific platform has an extension to deal
> > with
> > speculation execution.
> 
> What about array_index_mask_nospec(), though? No custom
> implementation,
> meaning the generic one will be used there? If that's the intention,
> then ...
Yes, the generic one will be used.

~ Oleksii
> 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> Jan
> 
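
For readers unfamiliar with the helpers discussed in this sub-thread, here is a minimal standalone sketch. The two empty functions follow the commit message's description. The mask helper is modelled on the well-known branch-free idiom used by generic array_index_mask_nospec() implementations; its exact form in the Xen tree is an assumption here. It relies on arithmetic right shift of negative signed values, which GCC and Clang provide, and on index not exceeding LONG_MAX. clamp_index() is a hypothetical helper added only for illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* No speculation barrier available: pass the condition through unchanged. */
static inline bool evaluate_nospec(bool cond)
{
    return cond;
}

/* Nothing to do until a platform provides a suitable extension. */
static inline void block_speculation(void)
{
}

/*
 * Return ~0UL when index < size and 0 otherwise, without a branch.
 * If index >= size, (size - 1 - index) wraps and sets the top bit, so the
 * complement has a clear sign bit and the shift yields 0.
 */
static inline unsigned long array_index_mask_nospec(unsigned long index,
                                                    unsigned long size)
{
    return ~(long)(index | (size - 1 - index)) >> (BITS_PER_LONG - 1);
}

/* Hypothetical helper: force an out-of-bounds index to 0. */
static inline unsigned long clamp_index(unsigned long index,
                                        unsigned long size)
{
    return index & array_index_mask_nospec(index, size);
}
```

An in-bounds index passes through clamp_index() unchanged, while an out-of-bounds one becomes 0, so even a mispredicted bounds check cannot steer a dependent load outside the array.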




* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-27  7:55   ` Jan Beulich
@ 2024-02-28 17:03     ` Oleksii
  2024-02-28 22:58     ` Julien Grall
  1 sibling, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-02-28 17:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On Tue, 2024-02-27 at 08:55 +0100, Jan Beulich wrote:
> On 26.02.2024 18:39, Oleksii Kurochko wrote:
> > This patch doesn't represent a strict lower bound for GCC and
> > GNU Binutils; rather, these versions are specifically employed by
> > the Xen RISC-V container and are anticipated to undergo continuous
> > testing.
> 
> Up and until that container would be updated to a newer gcc. I'm
> afraid I view this as too weak a criteria, but I'm also not meaning
> to
> stand in the way if somebody else wants to ack this patch in this
> form;
> my bare minimum requirement is now met.
> 
> > --- a/README
> > +++ b/README
> > @@ -48,6 +48,15 @@ provided by your OS distributor:
> >        - For ARM 64-bit:
> >          - GCC 5.1 or later
> >          - GNU Binutils 2.24 or later
> > +      - For RISC-V 64-bit:
> > +        - GCC 12.2 or later
> > +        - GNU Binutils 2.39 or later
> > +        This doesn't represent a strict lower bound for GCC and
> > GNU Binutils;
> > +        rather, these versions are specifically employed by the
> > Xen RISC-V
> > +        container and are anticipated to undergo continuous
> > testing.
> 
> As per above, I think here it really needs saying "at the time of
> writing"
> or recording a concrete date. Furthermore I expect "these versions"
> relates
> to the specifically named versions and particularly _not_ to "or
> later":
> With the criteria you apply, using later versions (or in fact any
> version
> other than the very specific ones used in the container) would be
> similarly
> untested. Much like x86 and Arm don't have the full range of
> permitted
> tool chain versions continuously tested. Plus don't forget that
> distros may
> apply their own selection of patches on top of what they take from
> upstream
> (and they may also take random snapshots rather than released
> versions).
> 
> IOW it is hard for me to see why RISC-V needs stronger restrictions
> here
> than other architectures. It ought to be possible to determine a
> baseline
> version. Even if taking the desire to have "pause" available as a
> requirement, gas (and presumably gld) 2.36.1 would already suffice.
I'll be happy to determine a baseline version. RISC-V doesn't need
stronger restrictions; that's why I wrote: "This patch doesn't represent a
strict lower bound for GCC and GNU Binutils".

Would it be good to use for GCC -> "12.2 or later" and for Binutils ->
"2.36.1 or later"?

I missed that I pushed the RISC-V container without pinning the version of
Arch Linux, so you are right that after a container update what I wrote
won't be true, as the compiler version might change.

Just to clarify: once the versions are agreed, does it mean that
I should use a toolchain with the versions mentioned in this file and
verify each time that everything still works with these versions?

~ Oleksii




* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-27  7:55   ` Jan Beulich
  2024-02-28 17:03     ` Oleksii
@ 2024-02-28 22:58     ` Julien Grall
  2024-02-28 23:11       ` Andrew Cooper
  2024-02-29  7:58       ` Jan Beulich
  1 sibling, 2 replies; 88+ messages in thread
From: Julien Grall @ 2024-02-28 22:58 UTC (permalink / raw)
  To: Jan Beulich, Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu, xen-devel

Hi Jan,

On 27/02/2024 07:55, Jan Beulich wrote:
> On 26.02.2024 18:39, Oleksii Kurochko wrote:
>> This patch doesn't represent a strict lower bound for GCC and
>> GNU Binutils; rather, these versions are specifically employed by
>> the Xen RISC-V container and are anticipated to undergo continuous
>> testing.
> 
> Up and until that container would be updated to a newer gcc. I'm
> afraid I view this as too weak a criteria,

I disagree. We have to decide a limit at some point. It is sensible to
say that we are only supporting what we can test. AFAIK, this is what
QEMU has been doing.

> but I'm also not meaning to
> stand in the way if somebody else wants to ack this patch in this form;
> my bare minimum requirement is now met.
> 
>> --- a/README
>> +++ b/README
>> @@ -48,6 +48,15 @@ provided by your OS distributor:
>>         - For ARM 64-bit:
>>           - GCC 5.1 or later
>>           - GNU Binutils 2.24 or later
>> +      - For RISC-V 64-bit:
>> +        - GCC 12.2 or later
>> +        - GNU Binutils 2.39 or later
>> +        This doesn't represent a strict lower bound for GCC and GNU Binutils;
>> +        rather, these versions are specifically employed by the Xen RISC-V
>> +        container and are anticipated to undergo continuous testing.
> 
> As per above, I think here it really needs saying "at the time of writing"
> or recording a concrete date. Furthermore I expect "these versions" relates
> to the specifically named versions and particularly _not_ to "or later":
> With the criteria you apply, using later versions (or in fact any version
> other than the very specific ones used in the container) would be similarly
> untested. Much like x86 and Arm don't have the full range of permitted
> tool chain versions continuously tested. Plus don't forget that distros may
> apply their own selection of patches on top of what they take from upstream
> (and they may also take random snapshots rather than released versions).

TBH, I think this should be dropped from the README. With the wording, 
it implies that older GCC would work, but this is not a guarantee.

The same goes for Arm: I suspect some revisions of GCC below 5.1 may
work, but listing a lower limit is just a convenience.

With the sentence dropped, I would be happy to ack this patch.

> 
> IOW it is hard for me to see why RISC-V needs stronger restrictions here
> than other architectures. It ought to be possible to determine a baseline
> version. Even if taking the desire to have "pause" available as a
> requirement, gas (and presumably gld) 2.36.1 would already suffice.

I think we want to bump it on Arm. There are zero reasons to try to keep
a lower version if nobody tests/uses it in production.

I would suggest doing the same on x86. What's the point of trying to
support Xen with a 15+ year old compiler?

Cheers,

-- 
Julien Grall



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-28 22:58     ` Julien Grall
@ 2024-02-28 23:11       ` Andrew Cooper
  2024-02-29 17:00         ` Oleksii
  2024-02-29  7:58       ` Jan Beulich
  1 sibling, 1 reply; 88+ messages in thread
From: Andrew Cooper @ 2024-02-28 23:11 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich, Oleksii Kurochko
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Roger Pau Monné

On 28/02/2024 10:58 pm, Julien Grall wrote:
> Hi Jan,
>
> On 27/02/2024 07:55, Jan Beulich wrote:
>> On 26.02.2024 18:39, Oleksii Kurochko wrote:
>>> This patch doesn't represent a strict lower bound for GCC and
>>> GNU Binutils; rather, these versions are specifically employed by
>>> the Xen RISC-V container and are anticipated to undergo continuous
>>> testing.
>>
>> Up and until that container would be updated to a newer gcc. I'm
>> afraid I view this as too weak a criteria,
>
> I disagree. We have to decide a limit at some point. It is sensible to
> say that we are only supporting what we can tests. AFAIK, this is what
> QEMU has been doing.
>
>> but I'm also not meaning to
>> stand in the way if somebody else wants to ack this patch in this form;
>> my bare minimum requirement is now met.
>>
>>> --- a/README
>>> +++ b/README
>>> @@ -48,6 +48,15 @@ provided by your OS distributor:
>>>         - For ARM 64-bit:
>>>           - GCC 5.1 or later
>>>           - GNU Binutils 2.24 or later
>>> +      - For RISC-V 64-bit:
>>> +        - GCC 12.2 or later
>>> +        - GNU Binutils 2.39 or later
>>> +        This doesn't represent a strict lower bound for GCC and GNU
>>> Binutils;
>>> +        rather, these versions are specifically employed by the Xen
>>> RISC-V
>>> +        container and are anticipated to undergo continuous testing.
>>
>> As per above, I think here it really needs saying "at the time of
>> writing"
>> or recording a concrete date. Furthermore I expect "these versions"
>> relates
>> to the specifically named versions and particularly _not_ to "or later":
>> With the criteria you apply, using later versions (or in fact any
>> version
>> other than the very specific ones used in the container) would be
>> similarly
>> untested. Much like x86 and Arm don't have the full range of permitted
>> tool chain versions continuously tested. Plus don't forget that
>> distros may
>> apply their own selection of patches on top of what they take from
>> upstream
>> (and they may also take random snapshots rather than released versions).
>
> TBH, I think this should be dropped from the README. With the wording,
> it implies that older GCC would work, but this is not a guarantee.
>
> The same for Arm, I suspect some revision of GCC below 5.1 that may
> work. But that's just convenience to list a lower limit.
>
> With the sentence dropped, I would be happy to ack this patch.
>
>>
>> IOW it is hard for me to see why RISC-V needs stronger restrictions here
>> than other architectures. It ought to be possible to determine a
>> baseline
>> version. Even if taking the desire to have "pause" available as a
>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>
> I think we want to bump it on Arm. There are zero reasons to try to
> keep a lower versions if nobody tests/use it in production.
>
> I would suggest to do the same on x86. What's the point of try to
> support Xen with a 15+ years old compiler?

There is a material cost to supporting ancient toolchains.  I'm
increasingly unwilling to keep paying.

I'm also bored of needing to support versions of binutils which don't
know the Virt instructions, which are approaching 2 decades old now.

There are very good reasons to move to GCC 5.1 minimum across the board,
because that gets us several features (__has_include(), asm goto and
_Generic()) which will make material improvements to our code.  And I'd
like to start using other bits of C11/gnu11.
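
For illustration, two of the features named above can be sketched as
follows. This is a minimal sketch, not code from the Xen tree: the macro
names HAVE_STDINT_H and type_name are hypothetical, and it assumes a
compiler defaulting to C11/gnu11 (as GCC does from 5.x onwards).

```c
/*
 * __has_include lets the preprocessor probe for a header before
 * including it.  The outer defined() check keeps the file parseable
 * on compilers that lack the operator altogether.
 */
#if defined(__has_include)
# if __has_include(<stdint.h>)
#  include <stdint.h>
#  define HAVE_STDINT_H 1   /* hypothetical name, for illustration */
# endif
#endif
#ifndef HAVE_STDINT_H
# define HAVE_STDINT_H 0
#endif

/*
 * _Generic (C11) selects one expression at compile time based on the
 * type of its controlling expression.  type_name is a hypothetical
 * macro mapping a few types to a printable name.
 */
#define type_name(x) _Generic((x),      \
    int:           "int",               \
    unsigned int:  "unsigned int",      \
    unsigned long: "unsigned long",     \
    default:       "other")
```

With this, type_name(1) selects "int" and type_name(1UL) selects
"unsigned long", with no runtime dispatch involved.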

The choice of minimum toolchains should definitely be per-arch, and
"this is the oldest one we regularly test" is an entirely fine reason to
set the minimum bar.

Furthermore, Linux has regularly been bumping minimum toolchain versions
due to code generation issues, and we'd be foolish not to pay attention.

~Andrew



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-28 22:58     ` Julien Grall
  2024-02-28 23:11       ` Andrew Cooper
@ 2024-02-29  7:58       ` Jan Beulich
  2024-02-29 10:23         ` Julien Grall
  2024-02-29 16:54         ` Oleksii
  1 sibling, 2 replies; 88+ messages in thread
From: Jan Beulich @ 2024-02-29  7:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	xen-devel, Oleksii Kurochko

On 28.02.2024 23:58, Julien Grall wrote:
> On 27/02/2024 07:55, Jan Beulich wrote:
>> On 26.02.2024 18:39, Oleksii Kurochko wrote:
>>> This patch doesn't represent a strict lower bound for GCC and
>>> GNU Binutils; rather, these versions are specifically employed by
>>> the Xen RISC-V container and are anticipated to undergo continuous
>>> testing.
>>
>> Up and until that container would be updated to a newer gcc. I'm
>> afraid I view this as too weak a criteria,
> 
> I disagree. We have to decide a limit at some point. It is sensible to 
> say that we are only supporting what we can tests. AFAIK, this is what 
> QEMU has been doing.

I view qemu as a particularly bad example. They raise their baselines
far too aggressively for my taste.

>> IOW it is hard for me to see why RISC-V needs stronger restrictions here
>> than other architectures. It ought to be possible to determine a baseline
>> version. Even if taking the desire to have "pause" available as a
>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
> 
> I think we want to bump it on Arm. There are zero reasons to try to keep 
> a lower versions if nobody tests/use it in production.
> 
> I would suggest to do the same on x86. What's the point of try to 
> support Xen with a 15+ years old compiler?

It could have long been bumped if only a proper scheme to follow for
this and future bumping would have been put forward by anyone keen on
such bumping, like - see his reply - e.g. Andrew. You may recall that
this was discussed more than once in meetings, with no real outcome.
I'm personally not meaning to stand in the way of such bumping as long
as it's done in a predictable manner, but I'm not keen on doing so and
hence I don't view it as my obligation to try to invent a reasonable
scheme. (My personal view is that basic functionality should be
possible to have virtually everywhere, whereas for advanced stuff it
is fine to require a more modern tool chain.)

The one additional concern I've raised in the past is that in the end
it's not just minimal tool chain versions we rely on, but also other
core system tools (see the recent move from "which" to "command -v"
for an example of such a dependency, where luckily it turned out to
not be an issue that the -v had only become a standard thing at some
point). While for the tool chain I can arrange for making newer
versions available, for core system tools I can't. Therefore being too
eager there would mean I can't really / easily (smoke) test Xen
anymore on ancient hardware every once in a while. When afaict we do
too little of such testing already anyway, despite not having any
lower bound on hardware that formally we support running Xen on. (And
no, upgrading the ancient distros on that ancient hardware is not an
option for me.)

Jan



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29  7:58       ` Jan Beulich
@ 2024-02-29 10:23         ` Julien Grall
  2024-02-29 11:56           ` Jan Beulich
  2024-02-29 12:05           ` Andrew Cooper
  2024-02-29 16:54         ` Oleksii
  1 sibling, 2 replies; 88+ messages in thread
From: Julien Grall @ 2024-02-29 10:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	xen-devel, Oleksii Kurochko



On 29/02/2024 07:58, Jan Beulich wrote:
> On 28.02.2024 23:58, Julien Grall wrote:
>> On 27/02/2024 07:55, Jan Beulich wrote:
>>> On 26.02.2024 18:39, Oleksii Kurochko wrote:
>>>> This patch doesn't represent a strict lower bound for GCC and
>>>> GNU Binutils; rather, these versions are specifically employed by
>>>> the Xen RISC-V container and are anticipated to undergo continuous
>>>> testing.
>>>
>>> Up and until that container would be updated to a newer gcc. I'm
>>> afraid I view this as too weak a criteria,
>>
>> I disagree. We have to decide a limit at some point. It is sensible to
>> say that we are only supporting what we can tests. AFAIK, this is what
>> QEMU has been doing.
> 
> I view qemu as a particularly bad example. They raise their baselines
> far too aggressively for my taste.

AFAICT, the decision was based on the supported distros at the time,
which makes sense to me (even though I recently got caught out by
this check). They also seem to be open to relaxing the check if there are
any use cases.

Why would we want to support building Xen on non-supported distros?

> 
>>> IOW it is hard for me to see why RISC-V needs stronger restrictions here
>>> than other architectures. It ought to be possible to determine a baseline
>>> version. Even if taking the desire to have "pause" available as a
>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>
>> I think we want to bump it on Arm. There are zero reasons to try to keep
>> a lower versions if nobody tests/use it in production.
>>
>> I would suggest to do the same on x86. What's the point of try to
>> support Xen with a 15+ years old compiler?
> 
> It could have long been bumped if only a proper scheme to follow for
> this and future bumping would have been put forward by anyone keen on
> such bumping, like - see his reply - e.g. Andrew. You may recall that
> this was discussed more than once on meetings, with no real outcome.
> I'm personally not meaning to stand in the way of such bumping as long
> as it's done in a predictable manner, but I'm not keen on doing so and
> hence I don't view it as my obligation to try to invent a reasonable
> scheme. (My personal view is that basic functionality should be
> possible to have virtually everywhere, whereas for advanced stuff it
> is fine to require a more modern tool chain.)

That's one way to see it. The problem with this statement is that a user
today is misled into thinking you can build Xen with any GCC version since
4.1. I don't believe we can guarantee that, and we are exposing our users
to unnecessary risk.

In addition to that, I agree with Andrew. This is preventing us from
improving our code base, and we have to carry hacks for older compilers.

> 
> The one additional concern I've raised in the past is that in the end
> it's not just minimal tool chain versions we rely on, but also other
> core system tools (see the recent move from "which" to "command -v"
> for an example of such a dependency, where luckily it turned out to
> not be an issue that the -v had only become a standard thing at some
> point). While for the tool chain I can arrange for making newer
> versions available, for core system tools I can't.

I agree we probably want to clarify the minimum versions of the
core tools. However, I think we need to separate the two. Otherwise, we
will be stuck forever with the status quo on x86.

> Therefore being too
> eager there would mean I can't really / easily (smoke) test Xen
> anymore on ancient hardware every once in a while. When afaict we do
> too little of such testing already anyway, despite not having any
> lower bound on hardware that formally we support running Xen on.

Can you provide more details of what you mean by "ancient"?

> (And
> no, upgrading the ancient distros on that ancient hardware is not an
> option for me.)

May I ask why? Is it because newer distros don't support your HW?

Cheers,

-- 
Julien Grall



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 10:23         ` Julien Grall
@ 2024-02-29 11:56           ` Jan Beulich
  2024-02-29 11:59             ` Jan Beulich
  2024-02-29 12:05           ` Andrew Cooper
  1 sibling, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 11:56 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	xen-devel, Oleksii Kurochko

On 29.02.2024 11:23, Julien Grall wrote:
> On 29/02/2024 07:58, Jan Beulich wrote:
>> Therefore being too
>> eager there would mean I can't really / easily (smoke) test Xen
>> anymore on ancient hardware every once in a while. When afaict we do
>> too little of such testing already anyway, despite not having any
>> lower bound on hardware that formally we support running Xen on.
> 
> Can you provide more details of what you mean by "ancient"?

Formally we support running Xen on any x86 hardware supporting 64-bit
mode. I don't think I have any 1st gen systems left, but I think a
2nd gen SVM and a 2nd gen VMX one is what I still have around.

>> (And
>> no, upgrading the ancient distros on that ancient hardware is not an
>> option for me.)
> 
> May I ask why? Is it because newer distros don't support your HW?

Because as part of my job I also need to support ancient versions of
Xen on ancient distros. Since I need to keep those around, it makes
sense to me to then also test modern Xen there (every now and then, as
said, and not really extensively).

Jan



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 11:56           ` Jan Beulich
@ 2024-02-29 11:59             ` Jan Beulich
  0 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 11:59 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	xen-devel, Oleksii Kurochko

On 29.02.2024 12:56, Jan Beulich wrote:
> On 29.02.2024 11:23, Julien Grall wrote:
>> On 29/02/2024 07:58, Jan Beulich wrote:
>>> (And
>>> no, upgrading the ancient distros on that ancient hardware is not an
>>> option for me.)
>>
>> May I ask why? Is it because newer distros don't support your HW?
> 
> Because as part of my job I also need to support ancient versions of
> Xen on ancient distros. Since I need to keep those around, it makes
> sense to me to then also test modern Xen there (every now and then, as
> said, and not really extensively).

Oh, and - because upgrading has proven to take quite a bit of time, if
also accounting for all the follow-on work that needs doing when parts
of the upgrade didn't quite go as intended. Whether newer distros
formally don't support such old hardware is just another possible
factor.

Jan



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 10:23         ` Julien Grall
  2024-02-29 11:56           ` Jan Beulich
@ 2024-02-29 12:05           ` Andrew Cooper
  2024-02-29 12:17             ` Jan Beulich
  2024-02-29 12:27             ` Julien Grall
  1 sibling, 2 replies; 88+ messages in thread
From: Andrew Cooper @ 2024-02-29 12:05 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel, Oleksii Kurochko

On 29/02/2024 10:23 am, Julien Grall wrote:
>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>> here
>>>> than other architectures. It ought to be possible to determine a
>>>> baseline
>>>> version. Even if taking the desire to have "pause" available as a
>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>
>>> I think we want to bump it on Arm. There are zero reasons to try to
>>> keep
>>> a lower versions if nobody tests/use it in production.
>>>
>>> I would suggest to do the same on x86. What's the point of try to
>>> support Xen with a 15+ years old compiler?
>>
>> It could have long been bumped if only a proper scheme to follow for
>> this and future bumping would have been put forward by anyone keen on
>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>> this was discussed more than once on meetings, with no real outcome.
>> I'm personally not meaning to stand in the way of such bumping as long
>> as it's done in a predictable manner, but I'm not keen on doing so and
>> hence I don't view it as my obligation to try to invent a reasonable
>> scheme. (My personal view is that basic functionality should be
>> possible to have virtually everywhere, whereas for advanced stuff it
>> is fine to require a more modern tool chain.)
>
> That's one way to see it. The problem with this statement is a user
> today is mislead to think you can build Xen with any GCC versions
> since 4.1. I don't believe we can guarantee that and we are exposing
> our users to unnecessary risk.
>
> In addition to that, I agree with Andrew. This is preventing us to
> improve our code base and we have to carry hacks for older compilers.

I don't think anyone here is suggesting that we switch to a
bleeding-edge-only policy.  But 15y of support is extreme in the
opposite direction.

Xen ought to be buildable in the contemporary distros of the day, and I
don't think anyone is going to credibly argue otherwise.

But, it's also fine for new things to have newer requirements.

Take CET for example.  I know we have disagreements on exactly how its
toolchain-conditionalness is implemented, but the basic principle of "If
you want shiny new optional feature $X, you need newer toolchain $Y" is
entirely fine.
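
As a rough sketch of that principle (and only a sketch: the feature
name, the TOOLCHAIN_VERSION macro and the 9.1 threshold are made up for
illustration, and this is not how CET is gated in the tree), an optional
feature can be keyed off the compiler's predefined version macros:

```c
/* Fold major.minor into one integer so versions compare with >=. */
#define TOOLCHAIN_VERSION(major, minor) ((major) * 1000 + (minor))

/* __GNUC__ and __GNUC_MINOR__ are predefined by GCC (and by Clang). */
#if defined(__GNUC__)
# define GCC_VERSION TOOLCHAIN_VERSION(__GNUC__, __GNUC_MINOR__)
#else
# define GCC_VERSION 0
#endif

/*
 * Hypothetical optional feature $X needing toolchain $Y: the base
 * build still works with older compilers, the feature is compiled out.
 * The 9.1 threshold is an assumption, not a real requirement.
 */
#if GCC_VERSION >= TOOLCHAIN_VERSION(9, 1)
# define HAVE_SHINY_FEATURE 1
#else
# define HAVE_SHINY_FEATURE 0
#endif

/* Lets callers query at run time whether the feature was built in. */
static inline int shiny_feature_enabled(void)
{
    return HAVE_SHINY_FEATURE;
}
```

A real build system would more likely probe for the specific compiler or
assembler capability than compare version numbers, since distros may
backport features.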

A brand new architecture is exactly the same.  Saying "this is the
minimum, because it's what we test" doesn't preclude someone coming
along and saying "can we use $N-1 ?  See here it works, and here's a
change to CI test it".


Anyway, it's clear we need to write some policy on this, before making
specific adjustments.  To get started, is there going to be any
objection whatsoever on some principles which begin as follows:

* For established architectures, we expect Xen to be buildable on the
common contemporary distros.  (i.e. minima are not newer than what's
available in contemporary distros, without a good reason)

* Optional features explicitly may have newer minima, generally chosen
by when toolchain support landed and/or was bugfixed suitably to be usable.

* Xen won't expect to update minima "just because".  But updates
across-the-board will be considered periodically where it doesn't
conflict with point 1, and where changing the minima allows us to use a
new feature to have a positive impact on the codebase.

* We always reserve the right to update minima to e.g. avoid crippling
code generation bugs, even if it conflicts with point 1.  Where
workarounds can reasonably be done, they ought to be preferred, but this
is ultimately at the discretion of the relevant architecture maintainers.

?

~Andrew



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 12:05           ` Andrew Cooper
@ 2024-02-29 12:17             ` Jan Beulich
  2024-02-29 12:32               ` Julien Grall
  2024-02-29 12:27             ` Julien Grall
  1 sibling, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 12:17 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Oleksii Kurochko, Julien Grall

On 29.02.2024 13:05, Andrew Cooper wrote:
> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>> here
>>>>> than other architectures. It ought to be possible to determine a
>>>>> baseline
>>>>> version. Even if taking the desire to have "pause" available as a
>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>
>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>> keep
>>>> a lower versions if nobody tests/use it in production.
>>>>
>>>> I would suggest to do the same on x86. What's the point of try to
>>>> support Xen with a 15+ years old compiler?
>>>
>>> It could have long been bumped if only a proper scheme to follow for
>>> this and future bumping would have been put forward by anyone keen on
>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>> this was discussed more than once on meetings, with no real outcome.
>>> I'm personally not meaning to stand in the way of such bumping as long
>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>> hence I don't view it as my obligation to try to invent a reasonable
>>> scheme. (My personal view is that basic functionality should be
>>> possible to have virtually everywhere, whereas for advanced stuff it
>>> is fine to require a more modern tool chain.)
>>
>> That's one way to see it. The problem with this statement is a user
>> today is mislead to think you can build Xen with any GCC versions
>> since 4.1. I don't believe we can guarantee that and we are exposing
>> our users to unnecessary risk.
>>
>> In addition to that, I agree with Andrew. This is preventing us to
>> improve our code base and we have to carry hacks for older compilers.
> 
> I don't think anyone here is suggesting that we switch to a
> bleeding-edge-only policy.  But 15y of support is extreme in the
> opposite direction.
> 
> Xen ought to be buildable in the contemporary distros of the day, and I
> don't think anyone is going to credibly argue otherwise.
> 
> But, it's also fine for new things to have newer requirements.
> 
> Take CET for example.  I know we have disagreements on exactly how it's
> toolchain-conditionalness is implemented, but the basic principle of "If
> you want shiny new optional feature $X, you need newer toolchain $Y" is
> entirely fine.
> 
> A brand new architecture is exactly the same.  Saying "this is the
> minimum, because it's what we test" doesn't preclude someone coming
> along and saying "can we use $N-1 ?  See here it works, and here's a
> change to CI test it".
> 
> 
> Anyway, its clear we need to write some policy on this, before making
> specific adjustments.  To get started, is there going to be any
> objection whatsoever on some principles which begin as follows:

Largely not, but one aspect needs clarifying up front:

> * For established architectures, we expect Xen to be buildable on the
> common contemporary distros.  (i.e. minima is not newer than what's
> available in contemporary distros, without a good reason)

What counts as contemporary distro? Still in normal support? LTS? Yet
more extreme forms?

Plus - whose distros would we consider?

Jan

> * Optional features explicitly may have newer minima, generally chosen
> by when toolchain support landed and/or was bugfixed suitably to be usable.
> 
> * Xen won't expect to update minima "just because".  But updates
> across-the-board will be considered periodically where it doesn't
> conflict with point 1, and where changing the minima allows us to use a
> new feature to have a positive impact on the codebase.
> 
> * We always reserve the right to update minima to e.g. avoid crippling
> code generation bugs, even if it conflicts with point 1.  Where
> workarounds can reasonably be done, they ought to be preferred, but this
> is ultimately at the discretion of the relevant architecture maintainers.
> 
> ?
> 
> ~Andrew




* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 12:05           ` Andrew Cooper
  2024-02-29 12:17             ` Jan Beulich
@ 2024-02-29 12:27             ` Julien Grall
  1 sibling, 0 replies; 88+ messages in thread
From: Julien Grall @ 2024-02-29 12:27 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel, Oleksii Kurochko

Hi Andrew,

On 29/02/2024 12:05, Andrew Cooper wrote:
> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>> here
>>>>> than other architectures. It ought to be possible to determine a
>>>>> baseline
>>>>> version. Even if taking the desire to have "pause" available as a
>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>
>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>> keep
>>>> a lower versions if nobody tests/use it in production.
>>>>
>>>> I would suggest to do the same on x86. What's the point of try to
>>>> support Xen with a 15+ years old compiler?
>>>
>>> It could have long been bumped if only a proper scheme to follow for
>>> this and future bumping would have been put forward by anyone keen on
>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>> this was discussed more than once on meetings, with no real outcome.
>>> I'm personally not meaning to stand in the way of such bumping as long
>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>> hence I don't view it as my obligation to try to invent a reasonable
>>> scheme. (My personal view is that basic functionality should be
>>> possible to have virtually everywhere, whereas for advanced stuff it
>>> is fine to require a more modern tool chain.)
>>
>> That's one way to see it. The problem with this statement is a user
>> today is mislead to think you can build Xen with any GCC versions
>> since 4.1. I don't believe we can guarantee that and we are exposing
>> our users to unnecessary risk.
>>
>> In addition to that, I agree with Andrew. This is preventing us to
>> improve our code base and we have to carry hacks for older compilers.
> 
> I don't think anyone here is suggesting that we switch to a
> bleeding-edge-only policy.  But 15y of support is extreme in the
> opposite direction.
> 
> Xen ought to be buildable in the contemporary distros of the day, and I
> don't think anyone is going to credibly argue otherwise.
> 
> But, it's also fine for new things to have newer requirements.
> 
> Take CET for example.  I know we have disagreements on exactly how it's
> toolchain-conditionalness is implemented, but the basic principle of "If
> you want shiny new optional feature $X, you need newer toolchain $Y" is
> entirely fine.
> 
> A brand new architecture is exactly the same.  Saying "this is the
> minimum, because it's what we test" doesn't preclude someone coming
> along and saying "can we use $N-1 ?  See here it works, and here's a
> change to CI test it".
> 
> 
> Anyway, its clear we need to write some policy on this, before making
> specific adjustments.  To get started, is there going to be any
> objection whatsoever on some principles which begin as follows:

No objections.

> 
> * For established architectures, we expect Xen to be buildable on the
> common contemporary distros.  (i.e. minima is not newer than what's
> available in contemporary distros, without a good reason)

I think we would need to list the distros we are taking into account. 
Reading the rest of the principles, I am assuming you would be ok if new 
distros are added if there is a use case.

Cheers,

-- 
Julien Grall



* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 12:17             ` Jan Beulich
@ 2024-02-29 12:32               ` Julien Grall
  2024-02-29 12:51                 ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Julien Grall @ 2024-02-29 12:32 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel, Oleksii Kurochko

Hi,

On 29/02/2024 12:17, Jan Beulich wrote:
> On 29.02.2024 13:05, Andrew Cooper wrote:
>> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>>> here
>>>>>> than other architectures. It ought to be possible to determine a
>>>>>> baseline
>>>>>> version. Even if taking the desire to have "pause" available as a
>>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>>
>>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>>> keep
>>>>> a lower versions if nobody tests/use it in production.
>>>>>
>>>>> I would suggest to do the same on x86. What's the point of try to
>>>>> support Xen with a 15+ years old compiler?
>>>>
>>>> It could have long been bumped if only a proper scheme to follow for
>>>> this and future bumping would have been put forward by anyone keen on
>>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>>> this was discussed more than once on meetings, with no real outcome.
>>>> I'm personally not meaning to stand in the way of such bumping as long
>>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>>> hence I don't view it as my obligation to try to invent a reasonable
>>>> scheme. (My personal view is that basic functionality should be
>>>> possible to have virtually everywhere, whereas for advanced stuff it
>>>> is fine to require a more modern tool chain.)
>>>
>>> That's one way to see it. The problem with this statement is a user
>>> today is mislead to think you can build Xen with any GCC versions
>>> since 4.1. I don't believe we can guarantee that and we are exposing
>>> our users to unnecessary risk.
>>>
>>> In addition to that, I agree with Andrew. This is preventing us to
>>> improve our code base and we have to carry hacks for older compilers.
>>
>> I don't think anyone here is suggesting that we switch to a
>> bleeding-edge-only policy.  But 15y of support is extreme in the
>> opposite direction.
>>
>> Xen ought to be buildable in the contemporary distros of the day, and I
>> don't think anyone is going to credibly argue otherwise.
>>
>> But, it's also fine for new things to have newer requirements.
>>
>> Take CET for example.  I know we have disagreements on exactly how it's
>> toolchain-conditionalness is implemented, but the basic principle of "If
>> you want shiny new optional feature $X, you need newer toolchain $Y" is
>> entirely fine.
>>
>> A brand new architecture is exactly the same.  Saying "this is the
>> minimum, because it's what we test" doesn't preclude someone coming
>> along and saying "can we use $N-1 ?  See here it works, and here's a
>> change to CI test it".
>>
>>
>> Anyway, its clear we need to write some policy on this, before making
>> specific adjustments.  To get started, is there going to be any
>> objection whatsoever on some principles which begin as follows:
> 
> Largely not, but one aspect needs clarifying up front:
> 
>> * For established architectures, we expect Xen to be buildable on the
>> common contemporary distros.  (i.e. minima is not newer than what's
>> available in contemporary distros, without a good reason)
> 
> What counts as contemporary distro? Still in normal support? LTS? Yet
> more extreme forms?

LTS makes sense. Beyond that, I am not sure. I am under the impression that
people using older distros are those who want a stable system, so they
would be unlikely to upgrade the hypervisor.

Even for an LTS, I would argue that if it was released 5 years ago,
then you probably want to update it at the same time as moving to a
newer Xen version.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 12:32               ` Julien Grall
@ 2024-02-29 12:51                 ` Jan Beulich
  2024-02-29 13:44                   ` Julien Grall
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 12:51 UTC (permalink / raw)
  To: Julien Grall
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Oleksii Kurochko, Andrew Cooper

On 29.02.2024 13:32, Julien Grall wrote:
> On 29/02/2024 12:17, Jan Beulich wrote:
>> On 29.02.2024 13:05, Andrew Cooper wrote:
>>> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>>>> here
>>>>>>> than other architectures. It ought to be possible to determine a
>>>>>>> baseline
>>>>>>> version. Even if taking the desire to have "pause" available as a
>>>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>>>
>>>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>>>> keep
>>>>>> a lower versions if nobody tests/use it in production.
>>>>>>
>>>>>> I would suggest to do the same on x86. What's the point of try to
>>>>>> support Xen with a 15+ years old compiler?
>>>>>
>>>>> It could have long been bumped if only a proper scheme to follow for
>>>>> this and future bumping would have been put forward by anyone keen on
>>>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>>>> this was discussed more than once on meetings, with no real outcome.
>>>>> I'm personally not meaning to stand in the way of such bumping as long
>>>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>>>> hence I don't view it as my obligation to try to invent a reasonable
>>>>> scheme. (My personal view is that basic functionality should be
>>>>> possible to have virtually everywhere, whereas for advanced stuff it
>>>>> is fine to require a more modern tool chain.)
>>>>
>>>> That's one way to see it. The problem with this statement is a user
>>>> today is mislead to think you can build Xen with any GCC versions
>>>> since 4.1. I don't believe we can guarantee that and we are exposing
>>>> our users to unnecessary risk.
>>>>
>>>> In addition to that, I agree with Andrew. This is preventing us to
>>>> improve our code base and we have to carry hacks for older compilers.
>>>
>>> I don't think anyone here is suggesting that we switch to a
>>> bleeding-edge-only policy.  But 15y of support is extreme in the
>>> opposite direction.
>>>
>>> Xen ought to be buildable in the contemporary distros of the day, and I
>>> don't think anyone is going to credibly argue otherwise.
>>>
>>> But, it's also fine for new things to have newer requirements.
>>>
>>> Take CET for example.  I know we have disagreements on exactly how it's
>>> toolchain-conditionalness is implemented, but the basic principle of "If
>>> you want shiny new optional feature $X, you need newer toolchain $Y" is
>>> entirely fine.
>>>
>>> A brand new architecture is exactly the same.  Saying "this is the
>>> minimum, because it's what we test" doesn't preclude someone coming
>>> along and saying "can we use $N-1 ?  See here it works, and here's a
>>> change to CI test it".
>>>
>>>
>>> Anyway, its clear we need to write some policy on this, before making
>>> specific adjustments.  To get started, is there going to be any
>>> objection whatsoever on some principles which begin as follows:
>>
>> Largely not, but one aspect needs clarifying up front:
>>
>>> * For established architectures, we expect Xen to be buildable on the
>>> common contemporary distros.  (i.e. minima is not newer than what's
>>> available in contemporary distros, without a good reason)
>>
>> What counts as contemporary distro? Still in normal support? LTS? Yet
>> more extreme forms?
> 
> LTS makes sense. More I am not sure. I am under the impression that 
> people using older distros are those that wants a stable system. So they 
> would unlikely try to upgrade the hypervisor.
> 
> Even for LTS, I would argue that if it has been released 5 years ago, 
> then you probably want to update it at the same time as moving to a 
> newer Xen version.

For the purposes of distros I agree. For the purposes of individuals
I don't: What's wrong with running a newer hypervisor and/or kernel
underneath an older distro?

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 12:51                 ` Jan Beulich
@ 2024-02-29 13:44                   ` Julien Grall
  2024-02-29 14:07                     ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Julien Grall @ 2024-02-29 13:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Oleksii Kurochko, Andrew Cooper

Hi Jan,

On 29/02/2024 12:51, Jan Beulich wrote:
> On 29.02.2024 13:32, Julien Grall wrote:
>> On 29/02/2024 12:17, Jan Beulich wrote:
>>> On 29.02.2024 13:05, Andrew Cooper wrote:
>>>> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>>>>> here
>>>>>>>> than other architectures. It ought to be possible to determine a
>>>>>>>> baseline
>>>>>>>> version. Even if taking the desire to have "pause" available as a
>>>>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>>>>
>>>>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>>>>> keep
>>>>>>> a lower versions if nobody tests/use it in production.
>>>>>>>
>>>>>>> I would suggest to do the same on x86. What's the point of try to
>>>>>>> support Xen with a 15+ years old compiler?
>>>>>>
>>>>>> It could have long been bumped if only a proper scheme to follow for
>>>>>> this and future bumping would have been put forward by anyone keen on
>>>>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>>>>> this was discussed more than once on meetings, with no real outcome.
>>>>>> I'm personally not meaning to stand in the way of such bumping as long
>>>>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>>>>> hence I don't view it as my obligation to try to invent a reasonable
>>>>>> scheme. (My personal view is that basic functionality should be
>>>>>> possible to have virtually everywhere, whereas for advanced stuff it
>>>>>> is fine to require a more modern tool chain.)
>>>>>
>>>>> That's one way to see it. The problem with this statement is a user
>>>>> today is mislead to think you can build Xen with any GCC versions
>>>>> since 4.1. I don't believe we can guarantee that and we are exposing
>>>>> our users to unnecessary risk.
>>>>>
>>>>> In addition to that, I agree with Andrew. This is preventing us to
>>>>> improve our code base and we have to carry hacks for older compilers.
>>>>
>>>> I don't think anyone here is suggesting that we switch to a
>>>> bleeding-edge-only policy.  But 15y of support is extreme in the
>>>> opposite direction.
>>>>
>>>> Xen ought to be buildable in the contemporary distros of the day, and I
>>>> don't think anyone is going to credibly argue otherwise.
>>>>
>>>> But, it's also fine for new things to have newer requirements.
>>>>
>>>> Take CET for example.  I know we have disagreements on exactly how it's
>>>> toolchain-conditionalness is implemented, but the basic principle of "If
>>>> you want shiny new optional feature $X, you need newer toolchain $Y" is
>>>> entirely fine.
>>>>
>>>> A brand new architecture is exactly the same.  Saying "this is the
>>>> minimum, because it's what we test" doesn't preclude someone coming
>>>> along and saying "can we use $N-1 ?  See here it works, and here's a
>>>> change to CI test it".
>>>>
>>>>
>>>> Anyway, its clear we need to write some policy on this, before making
>>>> specific adjustments.  To get started, is there going to be any
>>>> objection whatsoever on some principles which begin as follows:
>>>
>>> Largely not, but one aspect needs clarifying up front:
>>>
>>>> * For established architectures, we expect Xen to be buildable on the
>>>> common contemporary distros.  (i.e. minima is not newer than what's
>>>> available in contemporary distros, without a good reason)
>>>
>>> What counts as contemporary distro? Still in normal support? LTS? Yet
>>> more extreme forms?
>>
>> LTS makes sense. More I am not sure. I am under the impression that
>> people using older distros are those that wants a stable system. So they
>> would unlikely try to upgrade the hypervisor.
>>
>> Even for LTS, I would argue that if it has been released 5 years ago,
>> then you probably want to update it at the same time as moving to a
>> newer Xen version.
> 
> For the purposes of distros I agree. For the purposes of individuals
> I don't: What's wrong with running a newer hypervisor and/or kernel
> underneath an older distro?

There is nothing wrong with it. I just don't understand the benefit for
us of supporting that use case. To me there are two sorts of individuals:
  1. The ones that are using distro packages. They are unlikely to want
to switch to a newer hypervisor.
  2. The ones that are happy to compile and hack their system. Fairly
likely they will use a more recent distro and/or would not be put off by
upgrading it.

What individuals do you have in mind?

Also, for me, the minimum doesn't prevent anyone from trying to compile
with an older compiler. It is only there to say that as a community we
will not investigate or try to work around bugs in those compilers.

I don't see the problem with that if someone decides to use an older dom0
distro with a newer hypervisor. It would likely not be the only issue
they will have.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-26 17:38 ` [PATCH v5 03/23] xen/riscv: introduce nospec.h Oleksii Kurochko
  2024-02-27  7:38   ` Jan Beulich
@ 2024-02-29 13:49   ` Julien Grall
  2024-02-29 14:01     ` Jan Beulich
  2024-02-29 16:27   ` Jan Beulich
  2 siblings, 1 reply; 88+ messages in thread
From: Julien Grall @ 2024-02-29 13:49 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi Oleksii,

On 26/02/2024 17:38, Oleksii Kurochko wrote:
>  From the unpriviliged doc:
>    No standard hints are presently defined.
>    We anticipate standard hints to eventually include memory-system spatial
>    and temporal locality hints, branch prediction hints, thread-scheduling
>    hints, security tags, and instrumentation flags for simulation/emulation.
> 
> Also, there are no speculation execution barriers.
> 
> Therefore, functions evaluate_nospec() and block_speculation() should
> remain empty until a specific platform has an extension to deal with
> speculation execution.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
>   Changes in V5:
>     - new patch
> ---
>   xen/arch/riscv/include/asm/nospec.h | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/nospec.h
> 
> diff --git a/xen/arch/riscv/include/asm/nospec.h b/xen/arch/riscv/include/asm/nospec.h
> new file mode 100644
> index 0000000000..4fb404a0a2
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/nospec.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0 */

New files should use the SPDX tag GPL-2.0-only. I guess this could be
fixed on commit?
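
For illustration, a minimal sketch of what such placeholder helpers could
look like. The patch body is not quoted in this reply, so the signatures
below are an assumption modelled on how these helpers are commonly shaped,
not the actual contents of the new header:

```c
/* SPDX-License-Identifier: GPL-2.0-only */
#include <stdbool.h>

/*
 * Assumed shape: with no speculation barrier available, the condition
 * is simply handed back unchanged.
 */
static inline bool evaluate_nospec(bool condition)
{
    return condition;
}

/*
 * Assumed shape: deliberately empty until a platform provides an
 * extension for dealing with speculative execution.
 */
static inline void block_speculation(void)
{
}
```

Callers can keep using both helpers unconditionally; they cost nothing
until an architecture-specific implementation replaces them.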

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-26 17:38 ` [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions Oleksii Kurochko
@ 2024-02-29 13:54   ` Julien Grall
  2024-02-29 14:03     ` Jan Beulich
  2024-02-29 16:17     ` Oleksii
  2024-02-29 15:52   ` Jan Beulich
  2024-02-29 16:25   ` Andrew Cooper
  2 siblings, 2 replies; 88+ messages in thread
From: Julien Grall @ 2024-02-29 13:54 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi Oleksii,

On 26/02/2024 17:38, Oleksii Kurochko wrote:
> These functions can be useful for architectures that don't
> have corresponding arch-specific instructions.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
>   Changes in V5:
>     - new patch
> ---
>   xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
>   xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++

One header per function seems a little bit excessive to me. Do you have
a pointer to where this request is coming from?

Why not use the following pattern?

In arch implementation:

#define fls
static inline ...

In the generic header (asm-generic or xen/):

#ifndef fls
static inline ...
#endif
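
Spelled out, the pattern above could look roughly like the sketch below.
It is only an illustration: the builtin-based arch body, the loop-based
fallbacks and the "#define fls fls" spelling (the idiom Linux uses) are
assumptions, and both "headers" are shown in one file for brevity.

```c
/* --- arch header (sketch): provide fls() and advertise it --- */
static inline int fls(unsigned int x)
{
    /* __builtin_clz(0) is undefined, so zero is handled explicitly. */
    return x ? (int)(sizeof(x) * 8) - __builtin_clz(x) : 0;
}
#define fls fls

/* --- generic header (sketch): only fill in what the arch left out --- */
#ifndef fls
static inline int fls(unsigned int x)
{
    int r = 0;

    while ( x )
    {
        r++;
        x >>= 1;
    }

    return r;
}
#endif

#ifndef flsl
/* The arch part above did not define flsl, so this fallback is used. */
static inline int flsl(unsigned long x)
{
    int r = 0;

    while ( x )
    {
        r++;
        x >>= 1;
    }

    return r;
}
#endif
```

Here fls() resolves to the arch version and flsl() to the generic one,
at the cost of one #ifndef per function in the generic header.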

>   2 files changed, 28 insertions(+)
>   create mode 100644 xen/include/asm-generic/bitops/fls.h
>   create mode 100644 xen/include/asm-generic/bitops/flsl.h
> 
> diff --git a/xen/include/asm-generic/bitops/fls.h b/xen/include/asm-generic/bitops/fls.h
> new file mode 100644
> index 0000000000..369a4c790c
> --- /dev/null
> +++ b/xen/include/asm-generic/bitops/fls.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */

You should use GPL-2.0-only.

> +#ifndef _ASM_GENERIC_BITOPS_FLS_H_
> +#define _ASM_GENERIC_BITOPS_FLS_H_
> +
> +/**
> + * fls - find last (most-significant) bit set
> + * @x: the word to search
> + *
> + * This is defined the same way as ffs.
> + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
> + */
> +
> +static inline int fls(unsigned int x)
> +{
> +    return generic_fls(x);
> +}
> +
> +#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */

Missing emacs magic. I am probably not going to repeat this remark and 
the one above again. So please have a look.

> diff --git a/xen/include/asm-generic/bitops/flsl.h b/xen/include/asm-generic/bitops/flsl.h
> new file mode 100644
> index 0000000000..d0a2e9c729
> --- /dev/null
> +++ b/xen/include/asm-generic/bitops/flsl.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
> +#define _ASM_GENERIC_BITOPS_FLSL_H_
> +
> +static inline int flsl(unsigned long x)
> +{
> +    return generic_flsl(x);
> +}
> +
> +#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-29 13:49   ` Julien Grall
@ 2024-02-29 14:01     ` Jan Beulich
  2024-02-29 16:09       ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 14:01 UTC (permalink / raw)
  To: Julien Grall, Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu, xen-devel

On 29.02.2024 14:49, Julien Grall wrote:
> On 26/02/2024 17:38, Oleksii Kurochko wrote:
>> --- /dev/null
>> +++ b/xen/arch/riscv/include/asm/nospec.h
>> @@ -0,0 +1,25 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
> 
> New file should use the SPDX tag GPL-2.0-only. I guess this could be 
> fixed on commit?

I wouldn't mind doing so.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-29 13:54   ` Julien Grall
@ 2024-02-29 14:03     ` Jan Beulich
  2024-02-29 14:08       ` Julien Grall
  2024-02-29 16:17     ` Oleksii
  1 sibling, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 14:03 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Oleksii Kurochko, xen-devel

On 29.02.2024 14:54, Julien Grall wrote:
> On 26/02/2024 17:38, Oleksii Kurochko wrote:
>> These functions can be useful for architectures that don't
>> have corresponding arch-specific instructions.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>>   Changes in V5:
>>     - new patch
>> ---
>>   xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
>>   xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++
> 
> One header per function seems a little bit excessive to me. Do you have 
> any pointer where this request is coming from?

That's in an attempt to follow Linux's way of having this, aiui. This way
an arch can mix and match header inclusions and private definitions without
any #ifdef-ary.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 13:44                   ` Julien Grall
@ 2024-02-29 14:07                     ` Jan Beulich
  2024-02-29 14:14                       ` Julien Grall
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 14:07 UTC (permalink / raw)
  To: Julien Grall
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Oleksii Kurochko, Andrew Cooper

On 29.02.2024 14:44, Julien Grall wrote:
> Hi Jan,
> 
> On 29/02/2024 12:51, Jan Beulich wrote:
>> On 29.02.2024 13:32, Julien Grall wrote:
>>> On 29/02/2024 12:17, Jan Beulich wrote:
>>>> On 29.02.2024 13:05, Andrew Cooper wrote:
>>>>> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>>>>>> here
>>>>>>>>> than other architectures. It ought to be possible to determine a
>>>>>>>>> baseline
>>>>>>>>> version. Even if taking the desire to have "pause" available as a
>>>>>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>>>>>
>>>>>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>>>>>> keep
>>>>>>>> a lower versions if nobody tests/use it in production.
>>>>>>>>
>>>>>>>> I would suggest to do the same on x86. What's the point of try to
>>>>>>>> support Xen with a 15+ years old compiler?
>>>>>>>
>>>>>>> It could have long been bumped if only a proper scheme to follow for
>>>>>>> this and future bumping would have been put forward by anyone keen on
>>>>>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>>>>>> this was discussed more than once on meetings, with no real outcome.
>>>>>>> I'm personally not meaning to stand in the way of such bumping as long
>>>>>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>>>>>> hence I don't view it as my obligation to try to invent a reasonable
>>>>>>> scheme. (My personal view is that basic functionality should be
>>>>>>> possible to have virtually everywhere, whereas for advanced stuff it
>>>>>>> is fine to require a more modern tool chain.)
>>>>>>
>>>>>> That's one way to see it. The problem with this statement is a user
>>>>>> today is mislead to think you can build Xen with any GCC versions
>>>>>> since 4.1. I don't believe we can guarantee that and we are exposing
>>>>>> our users to unnecessary risk.
>>>>>>
>>>>>> In addition to that, I agree with Andrew. This is preventing us to
>>>>>> improve our code base and we have to carry hacks for older compilers.
>>>>>
>>>>> I don't think anyone here is suggesting that we switch to a
>>>>> bleeding-edge-only policy.  But 15y of support is extreme in the
>>>>> opposite direction.
>>>>>
>>>>> Xen ought to be buildable in the contemporary distros of the day, and I
>>>>> don't think anyone is going to credibly argue otherwise.
>>>>>
>>>>> But, it's also fine for new things to have newer requirements.
>>>>>
>>>>> Take CET for example.  I know we have disagreements on exactly how it's
>>>>> toolchain-conditionalness is implemented, but the basic principle of "If
>>>>> you want shiny new optional feature $X, you need newer toolchain $Y" is
>>>>> entirely fine.
>>>>>
>>>>> A brand new architecture is exactly the same.  Saying "this is the
>>>>> minimum, because it's what we test" doesn't preclude someone coming
>>>>> along and saying "can we use $N-1 ?  See here it works, and here's a
>>>>> change to CI test it".
>>>>>
>>>>>
>>>>> Anyway, its clear we need to write some policy on this, before making
>>>>> specific adjustments.  To get started, is there going to be any
>>>>> objection whatsoever on some principles which begin as follows:
>>>>
>>>> Largely not, but one aspect needs clarifying up front:
>>>>
>>>>> * For established architectures, we expect Xen to be buildable on the
>>>>> common contemporary distros.  (i.e. minima is not newer than what's
>>>>> available in contemporary distros, without a good reason)
>>>>
>>>> What counts as contemporary distro? Still in normal support? LTS? Yet
>>>> more extreme forms?
>>>
>>> LTS makes sense. More I am not sure. I am under the impression that
>>> people using older distros are those that wants a stable system. So they
>>> would unlikely try to upgrade the hypervisor.
>>>
>>> Even for LTS, I would argue that if it has been released 5 years ago,
>>> then you probably want to update it at the same time as moving to a
>>> newer Xen version.
>>
>> For the purposes of distros I agree. For the purposes of individuals
>> I don't: What's wrong with running a newer hypervisor and/or kernel
>> underneath an older distro?
> 
> There is nothing wrong. I just don't understand the benefits for us to 
> support that use case. To me there are two sorts of individuals:
>   1. The ones that are using distro packages. They will unlikely want to 
> switch to a newer hypervisor
>   2. The ones that are happy to compile and hack their system. Fairly 
> likely they will use a more distros and/or would not be put up by 
> upgrading it.
> 
> What individuals do you have in mind?

People like me.

> Also, for me, the minimum doesn't prevent anyone to try to compile with 
> an older compiler. It is only here to say that as a community we will 
> not investigate or trying to workaround bugs in those compilers.

Besides this also allowing to use functionality you won't have an easy
way of replacing, what you say also doesn't make clear whether - for
cases where the issue can be (reasonably easily) worked around - patches
would be accepted, or rejected on the basis of only helping a below-the-
line compiler.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-29 14:03     ` Jan Beulich
@ 2024-02-29 14:08       ` Julien Grall
  0 siblings, 0 replies; 88+ messages in thread
From: Julien Grall @ 2024-02-29 14:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Oleksii Kurochko, xen-devel

Hi Jan,

On 29/02/2024 14:03, Jan Beulich wrote:
> On 29.02.2024 14:54, Julien Grall wrote:
>> On 26/02/2024 17:38, Oleksii Kurochko wrote:
>>> These functions can be useful for architectures that don't
>>> have corresponding arch-specific instructions.
>>>
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>> ---
>>>    Changes in V5:
>>>      - new patch
>>> ---
>>>    xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
>>>    xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++
>>
>> One header per function seems a little bit excessive to me. Do you have
>> any pointer where this request is coming from?
> 
> That's in an attempt to follow Linux'es way of having this, aiui. This way
> an arch can mix and match header inclusions and private definitions without
> any #ifdef-ary.

Ok. I am not going to oppose it if the goal is to keep the headers in 
sync with Linux.

Although, it might have been useful to mention it in the commit message.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 14:07                     ` Jan Beulich
@ 2024-02-29 14:14                       ` Julien Grall
  2024-02-29 17:43                         ` Stefano Stabellini
  0 siblings, 1 reply; 88+ messages in thread
From: Julien Grall @ 2024-02-29 14:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Oleksii Kurochko, Andrew Cooper



On 29/02/2024 14:07, Jan Beulich wrote:
> On 29.02.2024 14:44, Julien Grall wrote:
>> Hi Jan,
>>
>> On 29/02/2024 12:51, Jan Beulich wrote:
>>> On 29.02.2024 13:32, Julien Grall wrote:
>>>> On 29/02/2024 12:17, Jan Beulich wrote:
>>>>> On 29.02.2024 13:05, Andrew Cooper wrote:
>>>>>> On 29/02/2024 10:23 am, Julien Grall wrote:
>>>>>>>>>> IOW it is hard for me to see why RISC-V needs stronger restrictions
>>>>>>>>>> here
>>>>>>>>>> than other architectures. It ought to be possible to determine a
>>>>>>>>>> baseline
>>>>>>>>>> version. Even if taking the desire to have "pause" available as a
>>>>>>>>>> requirement, gas (and presumably gld) 2.36.1 would already suffice.
>>>>>>>>>
>>>>>>>>> I think we want to bump it on Arm. There are zero reasons to try to
>>>>>>>>> keep
>>>>>>>>> a lower versions if nobody tests/use it in production.
>>>>>>>>>
>>>>>>>>> I would suggest to do the same on x86. What's the point of try to
>>>>>>>>> support Xen with a 15+ years old compiler?
>>>>>>>>
>>>>>>>> It could have long been bumped if only a proper scheme to follow for
>>>>>>>> this and future bumping would have been put forward by anyone keen on
>>>>>>>> such bumping, like - see his reply - e.g. Andrew. You may recall that
>>>>>>>> this was discussed more than once on meetings, with no real outcome.
>>>>>>>> I'm personally not meaning to stand in the way of such bumping as long
>>>>>>>> as it's done in a predictable manner, but I'm not keen on doing so and
>>>>>>>> hence I don't view it as my obligation to try to invent a reasonable
>>>>>>>> scheme. (My personal view is that basic functionality should be
>>>>>>>> possible to have virtually everywhere, whereas for advanced stuff it
>>>>>>>> is fine to require a more modern tool chain.)
>>>>>>>
>>>>>>> That's one way to see it. The problem with this statement is a user
>>>>>>> today is mislead to think you can build Xen with any GCC versions
>>>>>>> since 4.1. I don't believe we can guarantee that and we are exposing
>>>>>>> our users to unnecessary risk.
>>>>>>>
>>>>>>> In addition to that, I agree with Andrew. This is preventing us to
>>>>>>> improve our code base and we have to carry hacks for older compilers.
>>>>>>
>>>>>> I don't think anyone here is suggesting that we switch to a
>>>>>> bleeding-edge-only policy.  But 15y of support is extreme in the
>>>>>> opposite direction.
>>>>>>
>>>>>> Xen ought to be buildable in the contemporary distros of the day, and I
>>>>>> don't think anyone is going to credibly argue otherwise.
>>>>>>
>>>>>> But, it's also fine for new things to have newer requirements.
>>>>>>
>>>>>> Take CET for example.  I know we have disagreements on exactly how it's
>>>>>> toolchain-conditionalness is implemented, but the basic principle of "If
>>>>>> you want shiny new optional feature $X, you need newer toolchain $Y" is
>>>>>> entirely fine.
>>>>>>
>>>>>> A brand new architecture is exactly the same.  Saying "this is the
>>>>>> minimum, because it's what we test" doesn't preclude someone coming
>>>>>> along and saying "can we use $N-1 ?  See here it works, and here's a
>>>>>> change to CI test it".
>>>>>>
>>>>>>
>>>>>> Anyway, its clear we need to write some policy on this, before making
>>>>>> specific adjustments.  To get started, is there going to be any
>>>>>> objection whatsoever on some principles which begin as follows:
>>>>>
>>>>> Largely not, but one aspect needs clarifying up front:
>>>>>
>>>>>> * For established architectures, we expect Xen to be buildable on the
>>>>>> common contemporary distros.  (i.e. minima is not newer than what's
>>>>>> available in contemporary distros, without a good reason)
>>>>>
>>>>> What counts as contemporary distro? Still in normal support? LTS? Yet
>>>>> more extreme forms?
>>>>
>>>> LTS makes sense. More I am not sure. I am under the impression that
>>>> people using older distros are those that wants a stable system. So they
>>>> would unlikely try to upgrade the hypervisor.
>>>>
>>>> Even for LTS, I would argue that if it has been released 5 years ago,
>>>> then you probably want to update it at the same time as moving to a
>>>> newer Xen version.
>>>
>>> For the purposes of distros I agree. For the purposes of individuals
>>> I don't: What's wrong with running a newer hypervisor and/or kernel
>>> underneath an older distro?
>>
>> There is nothing wrong. I just don't understand the benefits for us to
>> support that use case. To me there are two sorts of individuals:
>>    1. The ones that are using distro packages. They will unlikely want to
>> switch to a newer hypervisor
>>    2. The ones that are happy to compile and hack their system. Fairly
>> likely they will use a more distros and/or would not be put up by
>> upgrading it.
>>
>> What individuals do you have in mind?
> 
> People like me.

Which means? From what I read you mostly use an older distro for smoke
testing/convenience.

>> Also, for me, the minimum doesn't prevent anyone to try to compile with
>> an older compiler. It is only here to say that as a community we will
>> not investigate or trying to workaround bugs in those compilers.
> 
> Besides this also allowing to use functionality you won't have an easy
> way of replacing, what you say also doesn't make clear whether - for
> cases where the issue can be (reasonably easily) worked around - patches
> would be accepted, or rejected on the basis of only helping a below-the-
> line compiler.

I would not accept them.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-26 17:38 ` [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions Oleksii Kurochko
  2024-02-29 13:54   ` Julien Grall
@ 2024-02-29 15:52   ` Jan Beulich
  2024-02-29 16:25   ` Andrew Cooper
  2 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 15:52 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/include/asm-generic/bitops/fls.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_BITOPS_FLS_H_
> +#define _ASM_GENERIC_BITOPS_FLS_H_
> +
> +/**
> + * fls - find last (most-significant) bit set
> + * @x: the word to search
> + *
> + * This is defined the same way as ffs.
> + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
> + */
> +
> +static inline int fls(unsigned int x)
> +{
> +    return generic_fls(x);
> +}

This being an inline function, it requires generic_fls() to be declared.
Yet there's no other header included here. I think these headers would
better be self-contained. Or else (e.g. because of this leading to an
#include cycle) something needs saying somewhere.

The other thing here that worries me is the use of plain int as return
type. Yes, generic_fls() is declared like that, too. But no, the return
value there or here cannot be negative.
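
As a rough sketch of both points, a self-contained variant with an
unsigned return type might look as follows. generic_fls_sketch() is a
hypothetical stand-in defined inline here so that the example builds on
its own; in the tree the header would instead #include whatever declares
generic_fls(). The guard name is made up, and a 32-bit unsigned int is
assumed.

```c
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef SKETCH_ASM_GENERIC_BITOPS_FLS_H
#define SKETCH_ASM_GENERIC_BITOPS_FLS_H

/*
 * Hypothetical stand-in for generic_fls(): binary search for the most
 * significant set bit, assuming a 32-bit unsigned int.
 */
static inline unsigned int generic_fls_sketch(unsigned int x)
{
    unsigned int r = 32;

    if ( !x )
        return 0;
    if ( !(x & 0xffff0000u) ) { x <<= 16; r -= 16; }
    if ( !(x & 0xff000000u) ) { x <<= 8;  r -= 8;  }
    if ( !(x & 0xf0000000u) ) { x <<= 4;  r -= 4;  }
    if ( !(x & 0xc0000000u) ) { x <<= 2;  r -= 2;  }
    if ( !(x & 0x80000000u) ) { x <<= 1;  r -= 1;  }

    return r;
}

/* The result is a bit position in 0..32, hence never negative. */
static inline unsigned int fls(unsigned int x)
{
    return generic_fls_sketch(x);
}

#endif /* SKETCH_ASM_GENERIC_BITOPS_FLS_H */
```

Whether changing the return type is acceptable would of course also
depend on the existing callers and on generic_fls() itself.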

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-29 14:01     ` Jan Beulich
@ 2024-02-29 16:09       ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-02-29 16:09 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu, xen-devel

On Thu, 2024-02-29 at 15:01 +0100, Jan Beulich wrote:
> On 29.02.2024 14:49, Julien Grall wrote:
> > On 26/02/2024 17:38, Oleksii Kurochko wrote:
> > > --- /dev/null
> > > +++ b/xen/arch/riscv/include/asm/nospec.h
> > > @@ -0,0 +1,25 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > 
> > New file should use the SPDX tag GPL-2.0-only. I guess this could
> > be 
> > fixed on commit?
> 
> I wouldn't mind doing so.
I would be happy with that. Thanks.

~ Oleksii


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-29 13:54   ` Julien Grall
  2024-02-29 14:03     ` Jan Beulich
@ 2024-02-29 16:17     ` Oleksii
  1 sibling, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-02-29 16:17 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi Julien,

On Thu, 2024-02-29 at 13:54 +0000, Julien Grall wrote:
> Hi Oleksii,
> 
> On 26/02/2024 17:38, Oleksii Kurochko wrote:
> > These functions can be useful for architectures that don't
> > have corresponding arch-specific instructions.
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > ---
> >   Changes in V5:
> >     - new patch
> > ---
> >   xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
> >   xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++
> 
> One header per function seems a little bit excessive to me. Do you
> have 
> any pointer where this request is coming from?
The goal was to be in sync with the Linux kernel, as Jan mentioned.
I will update the commit message as you suggested in one of your replies.

> 
> Why not using the pattern.
> 
> In arch implementation:
> 
> #define fls
> static inline ...
> 
> In the generic header (asm-generic or xen/):
> 
> #ifndef fls
> static inline ...
> #endif
> 
> >   2 files changed, 28 insertions(+)
> >   create mode 100644 xen/include/asm-generic/bitops/fls.h
> >   create mode 100644 xen/include/asm-generic/bitops/flsl.h
> > 
> > diff --git a/xen/include/asm-generic/bitops/fls.h
> > b/xen/include/asm-generic/bitops/fls.h
> > new file mode 100644
> > index 0000000000..369a4c790c
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/fls.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> 
> You should use GPL-2.0-only.
Sure, I'll update the license here and in other files. I automatically
copied this SPDX tag from the Linux kernel.

> 
> > +#ifndef _ASM_GENERIC_BITOPS_FLS_H_
> > +#define _ASM_GENERIC_BITOPS_FLS_H_
> > +
> > +/**
> > + * fls - find last (most-significant) bit set
> > + * @x: the word to search
> > + *
> > + * This is defined the same way as ffs.
> > + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
> > + */
> > +
> > +static inline int fls(unsigned int x)
> > +{
> > +    return generic_fls(x);
> > +}
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */
> 
> Missing emacs magic. I am probably not going to repeat this remark
> and 
> the one above again. So please have a look.
Sure, I'll add the emacs magic to the files.

~ Oleksii
> 
> > diff --git a/xen/include/asm-generic/bitops/flsl.h
> > b/xen/include/asm-generic/bitops/flsl.h
> > new file mode 100644
> > index 0000000000..d0a2e9c729
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/flsl.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
> > +#define _ASM_GENERIC_BITOPS_FLSL_H_
> > +
> > +static inline int flsl(unsigned long x)
> > +{
> > +    return generic_flsl(x);
> > +}
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */
> 
> Cheers,
> 



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-26 17:38 ` [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions Oleksii Kurochko
  2024-02-29 13:54   ` Julien Grall
  2024-02-29 15:52   ` Jan Beulich
@ 2024-02-29 16:25   ` Andrew Cooper
  2024-03-01  9:15     ` Oleksii
  2 siblings, 1 reply; 88+ messages in thread
From: Andrew Cooper @ 2024-02-29 16:25 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

On 26/02/2024 5:38 pm, Oleksii Kurochko wrote:
> These functions can be useful for architectures that don't
> have corresponding arch-specific instructions.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
>  Changes in V5:
>    - new patch
> ---
>  xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
>  xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++
>  2 files changed, 28 insertions(+)
>  create mode 100644 xen/include/asm-generic/bitops/fls.h
>  create mode 100644 xen/include/asm-generic/bitops/flsl.h
>
> diff --git a/xen/include/asm-generic/bitops/fls.h b/xen/include/asm-generic/bitops/fls.h
> new file mode 100644
> index 0000000000..369a4c790c
> --- /dev/null
> +++ b/xen/include/asm-generic/bitops/fls.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_BITOPS_FLS_H_
> +#define _ASM_GENERIC_BITOPS_FLS_H_
> +
> +/**
> + * fls - find last (most-significant) bit set
> + * @x: the word to search
> + *
> + * This is defined the same way as ffs.
> + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
> + */
> +
> +static inline int fls(unsigned int x)
> +{
> +    return generic_fls(x);
> +}
> +
> +#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */
> diff --git a/xen/include/asm-generic/bitops/flsl.h b/xen/include/asm-generic/bitops/flsl.h
> new file mode 100644
> index 0000000000..d0a2e9c729
> --- /dev/null
> +++ b/xen/include/asm-generic/bitops/flsl.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
> +#define _ASM_GENERIC_BITOPS_FLSL_H_
> +
> +static inline int flsl(unsigned long x)
> +{
> +    return generic_flsl(x);
> +}
> +
> +#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */

Please don't do this.  It's compounding existing problems we have with
bitops, and there's a way to simplify things instead.

If you can wait a couple of days, I'll see about finishing and posting
my prototype demonstrating a simplification across all architectures,
and a reduction of code overall.

~Andrew


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h
  2024-02-26 17:38 ` [PATCH v5 03/23] xen/riscv: introduce nospec.h Oleksii Kurochko
  2024-02-27  7:38   ` Jan Beulich
  2024-02-29 13:49   ` Julien Grall
@ 2024-02-29 16:27   ` Jan Beulich
  2 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-02-29 16:27 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/nospec.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (C) 2024 Vates */
> +
> +#ifndef _ASM_GENERIC_NOSPEC_H
> +#define _ASM_GENERIC_NOSPEC_H

Btw, at the very last second I noticed the GENERIC in here, which I
took the liberty to replace. But please be more careful when moving
files around in the tree.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29  7:58       ` Jan Beulich
  2024-02-29 10:23         ` Julien Grall
@ 2024-02-29 16:54         ` Oleksii
  1 sibling, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-02-29 16:54 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu, xen-devel

On Thu, 2024-02-29 at 08:58 +0100, Jan Beulich wrote:
> On 28.02.2024 23:58, Julien Grall wrote:
> > On 27/02/2024 07:55, Jan Beulich wrote:
> > > On 26.02.2024 18:39, Oleksii Kurochko wrote:
> > > > This patch doesn't represent a strict lower bound for GCC and
> > > > GNU Binutils; rather, these versions are specifically employed
> > > > by
> > > > the Xen RISC-V container and are anticipated to undergo
> > > > continuous
> > > > testing.
> > > 
> > > Up and until that container would be updated to a newer gcc. I'm
> > > afraid I view this as too weak a criteria,
> > 
> > I disagree. We have to decide a limit at some point. It is sensible
> > to 
> > say that we are only supporting what we can tests. AFAIK, this is
> > what 
> > QEMU has been doing.
> 
> I view qemu as a particularly bad example. They raise their baselines
> far too aggressively for my taste.
> 
> > > IOW it is hard for me to see why RISC-V needs stronger
> > > restrictions here
> > > than other architectures. It ought to be possible to determine a
> > > baseline
> > > version. Even if taking the desire to have "pause" available as a
> > > requirement, gas (and presumably gld) 2.36.1 would already
> > > suffice.
> > 
> > I think we want to bump it on Arm. There are zero reasons to try to
> > keep 
> > a lower versions if nobody tests/use it in production.
> > 
> > I would suggest to do the same on x86. What's the point of try to 
> > support Xen with a 15+ years old compiler?
> 
> It could have long been bumped if only a proper scheme to follow for
> this and future bumping would have been put forward by anyone keen on
> such bumping, like - see his reply - e.g. Andrew. You may recall that
> this was discussed more than once on meetings, with no real outcome.
> I'm personally not meaning to stand in the way of such bumping as
> long
> as it's done in a predictable manner, but I'm not keen on doing so
> and
> hence I don't view it as my obligation to try to invent a reasonable
> scheme. (My personal view is that basic functionality should be
> possible to have virtually everywhere, whereas for advanced stuff it
> is fine to require a more modern tool chain.)
> 
> The one additional concern I've raised in the past is that in the end
> it's not just minimal tool chain versions we rely on, but also other
> core system tools (see the recent move from "which" to "command -v"
> for an example of such a dependency, where luckily it turned out to
> not be an issue that the -v had only become a standard thing at some
> point). While for the tool chain I can arrange for making newer
> versions available, for core system tools I can't. 
Can't we identify the top X most popular Linux distributions (LTS
versions) and align Xen's minimal toolchain version with the selected
distributions' minimum toolchain versions?

> Therefore being too
> eager there would mean I can't really / easily (smoke) test Xen
> anymore on ancient hardware every once in a while. When afaict we do
> too little of such testing already anyway, despite not having any
> lower bound on hardware that formally we support running Xen on. (And
> no, upgrading the ancient distros on that ancient hardware is not an
> option for me.)
It seems there is no room for upgrading the toolchain version. This
leads to the question of where the threshold lies between keeping the
minimum version as low as possible and deciding to raise it.
I understand your situation, where you have ancient hardware that
necessitates the use of older Linux distributions. However, is this a
common use case?

~ Oleksii


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-28 23:11       ` Andrew Cooper
@ 2024-02-29 17:00         ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-02-29 17:00 UTC (permalink / raw)
  To: Andrew Cooper, Julien Grall, Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Wei Liu, xen-devel,
	Roger Pau Monné

On Wed, 2024-02-28 at 23:11 +0000, Andrew Cooper wrote:
> Furthermore, Linux has regularly been bumping minimum toolchain
> versions
> due to code generation issues, and we'd be foolish not pay attention.
Do they document that?

It looks like their doc is pretty old, because in Documentation/Changes
it is mentioned:
   GNU C                  5.1              gcc --version

And RISC-V support in GCC was added after 7.0 or so...

But there are also the following words:

   This document is originally based on my "Changes" file for 2.0.x
   kernels
   and therefore owes credit to the same people as that file (Jared
   Mauch,
   Axel Boldt, Alessandro Sigala, and countless other users all over
   the
   'net).

Probably the doc wasn't updated for a long time, but at the same time
there is a line with Rust:
   
   Rust (optional)        1.62.0           rustc --version
   
~ Oleksii





^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-29 14:14                       ` Julien Grall
@ 2024-02-29 17:43                         ` Stefano Stabellini
  0 siblings, 0 replies; 88+ messages in thread
From: Stefano Stabellini @ 2024-02-29 17:43 UTC (permalink / raw)
  To: Julien Grall
  Cc: Jan Beulich, George Dunlap, Stefano Stabellini, Wei Liu,
	xen-devel, Oleksii Kurochko, Andrew Cooper

On Thu, 29 Feb 2024, Julien Grall wrote:
> On 29/02/2024 14:07, Jan Beulich wrote:
> > On 29.02.2024 14:44, Julien Grall wrote:
> > > Hi Jan,
> > > 
> > > On 29/02/2024 12:51, Jan Beulich wrote:
> > > > On 29.02.2024 13:32, Julien Grall wrote:
> > > > > On 29/02/2024 12:17, Jan Beulich wrote:
> > > > > > On 29.02.2024 13:05, Andrew Cooper wrote:
> > > > > > > On 29/02/2024 10:23 am, Julien Grall wrote:
> > > > > > > > > > > IOW it is hard for me to see why RISC-V needs stronger
> > > > > > > > > > > restrictions
> > > > > > > > > > > here
> > > > > > > > > > > than other architectures. It ought to be possible to
> > > > > > > > > > > determine a
> > > > > > > > > > > baseline
> > > > > > > > > > > version. Even if taking the desire to have "pause"
> > > > > > > > > > > available as a
> > > > > > > > > > > requirement, gas (and presumably gld) 2.36.1 would already
> > > > > > > > > > > suffice.
> > > > > > > > > > 
> > > > > > > > > > I think we want to bump it on Arm. There are zero reasons to
> > > > > > > > > > try to
> > > > > > > > > > keep
> > > > > > > > > > a lower versions if nobody tests/use it in production.
> > > > > > > > > > 
> > > > > > > > > > I would suggest to do the same on x86. What's the point of
> > > > > > > > > > try to
> > > > > > > > > > support Xen with a 15+ years old compiler?
> > > > > > > > > 
> > > > > > > > > It could have long been bumped if only a proper scheme to
> > > > > > > > > follow for
> > > > > > > > > this and future bumping would have been put forward by anyone
> > > > > > > > > keen on
> > > > > > > > > such bumping, like - see his reply - e.g. Andrew. You may
> > > > > > > > > recall that
> > > > > > > > > this was discussed more than once on meetings, with no real
> > > > > > > > > outcome.
> > > > > > > > > I'm personally not meaning to stand in the way of such bumping
> > > > > > > > > as long
> > > > > > > > > as it's done in a predictable manner, but I'm not keen on
> > > > > > > > > doing so and
> > > > > > > > > hence I don't view it as my obligation to try to invent a
> > > > > > > > > reasonable
> > > > > > > > > scheme. (My personal view is that basic functionality should
> > > > > > > > > be
> > > > > > > > > possible to have virtually everywhere, whereas for advanced
> > > > > > > > > stuff it
> > > > > > > > > is fine to require a more modern tool chain.)
> > > > > > > > 
> > > > > > > > That's one way to see it. The problem with this statement is a
> > > > > > > > user
> > > > > > > > today is mislead to think you can build Xen with any GCC
> > > > > > > > versions
> > > > > > > > since 4.1. I don't believe we can guarantee that and we are
> > > > > > > > exposing
> > > > > > > > our users to unnecessary risk.
> > > > > > > > 
> > > > > > > > In addition to that, I agree with Andrew. This is preventing us
> > > > > > > > to
> > > > > > > > improve our code base and we have to carry hacks for older
> > > > > > > > compilers.
> > > > > > > 
> > > > > > > I don't think anyone here is suggesting that we switch to a
> > > > > > > bleeding-edge-only policy.  But 15y of support is extreme in the
> > > > > > > opposite direction.
> > > > > > > 
> > > > > > > Xen ought to be buildable in the contemporary distros of the day,
> > > > > > > and I
> > > > > > > don't think anyone is going to credibly argue otherwise.
> > > > > > > 
> > > > > > > But, it's also fine for new things to have newer requirements.
> > > > > > > 
> > > > > > > Take CET for example.  I know we have disagreements on exactly how
> > > > > > > it's
> > > > > > > toolchain-conditionalness is implemented, but the basic principle
> > > > > > > of "If
> > > > > > > you want shiny new optional feature $X, you need newer toolchain
> > > > > > > $Y" is
> > > > > > > entirely fine.
> > > > > > > 
> > > > > > > A brand new architecture is exactly the same.  Saying "this is the
> > > > > > > minimum, because it's what we test" doesn't preclude someone
> > > > > > > coming
> > > > > > > along and saying "can we use $N-1 ?  See here it works, and here's
> > > > > > > a
> > > > > > > change to CI test it".
> > > > > > > 
> > > > > > > 
> > > > > > > Anyway, its clear we need to write some policy on this, before
> > > > > > > making
> > > > > > > specific adjustments.  To get started, is there going to be any
> > > > > > > objection whatsoever on some principles which begin as follows:
> > > > > > 
> > > > > > Largely not, but one aspect needs clarifying up front:
> > > > > > 
> > > > > > > * For established architectures, we expect Xen to be buildable on
> > > > > > > the
> > > > > > > common contemporary distros.  (i.e. minima is not newer than
> > > > > > > what's
> > > > > > > available in contemporary distros, without a good reason)
> > > > > > 
> > > > > > What counts as contemporary distro? Still in normal support? LTS?
> > > > > > Yet
> > > > > > more extreme forms?
> > > > > 
> > > > > LTS makes sense. More I am not sure. I am under the impression that
> > > > > people using older distros are those that wants a stable system. So
> > > > > they
> > > > > would unlikely try to upgrade the hypervisor.
> > > > > 
> > > > > Even for LTS, I would argue that if it has been released 5 years ago,
> > > > > then you probably want to update it at the same time as moving to a
> > > > > newer Xen version.
> > > > 
> > > > For the purposes of distros I agree. For the purposes of individuals
> > > > I don't: What's wrong with running a newer hypervisor and/or kernel
> > > > underneath an older distro?
> > > 
> > > There is nothing wrong. I just don't understand the benefits for us to
> > > support that use case. To me there are two sorts of individuals:
> > >    1. The ones that are using distro packages. They will unlikely want to
> > > switch to a newer hypervisor
> > >    2. The ones that are happy to compile and hack their system. Fairly
> > > likely they will use a more distros and/or would not be put up by
> > > upgrading it.
> > > 
> > > What individuals do you have in mind?
> > 
> > People like me.
> 
> Which means? From what I read you mostly use an older distros for smoke
> testing/convenience.

This is a cost/benefit decision. Supporting new Xen on ancient distros
has a cost for maintainers and contributors. Is it worth paying this
cost for the benefits it provides?

It is natural that different people are going to have different opinions
on this, because their use-case and situation are different.

We need to define exactly what "ancient" distro means, but from what I
see my answer is "no, it is not worth it".

For example, for ARM I would raise the minimum GCC version to something
around GCC5.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions
  2024-02-29 16:25   ` Andrew Cooper
@ 2024-03-01  9:15     ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-03-01  9:15 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

On Thu, 2024-02-29 at 16:25 +0000, Andrew Cooper wrote:
> On 26/02/2024 5:38 pm, Oleksii Kurochko wrote:
> > These functions can be useful for architectures that don't
> > have corresponding arch-specific instructions.
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > ---
> >  Changes in V5:
> >    - new patch
> > ---
> >  xen/include/asm-generic/bitops/fls.h  | 18 ++++++++++++++++++
> >  xen/include/asm-generic/bitops/flsl.h | 10 ++++++++++
> >  2 files changed, 28 insertions(+)
> >  create mode 100644 xen/include/asm-generic/bitops/fls.h
> >  create mode 100644 xen/include/asm-generic/bitops/flsl.h
> > 
> > diff --git a/xen/include/asm-generic/bitops/fls.h
> > b/xen/include/asm-generic/bitops/fls.h
> > new file mode 100644
> > index 0000000000..369a4c790c
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/fls.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FLS_H_
> > +#define _ASM_GENERIC_BITOPS_FLS_H_
> > +
> > +/**
> > + * fls - find last (most-significant) bit set
> > + * @x: the word to search
> > + *
> > + * This is defined the same way as ffs.
> > + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
> > + */
> > +
> > +static inline int fls(unsigned int x)
> > +{
> > +    return generic_fls(x);
> > +}
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */
> > diff --git a/xen/include/asm-generic/bitops/flsl.h
> > b/xen/include/asm-generic/bitops/flsl.h
> > new file mode 100644
> > index 0000000000..d0a2e9c729
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/flsl.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
> > +#define _ASM_GENERIC_BITOPS_FLSL_H_
> > +
> > +static inline int flsl(unsigned long x)
> > +{
> > +    return generic_flsl(x);
> > +}
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */
> 
> Please don't do this.  It's compounding existing problems we have
> with
> bitops, and there's a way to simplify things instead.
> 
> If you can wait a couple of days, I'll see about finishing and
> posting
> my prototype demonstrating a simplification across all architectures,
> and a reduction of code overall.
Please add me to CC.

~ Oleksii


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 10/23] xen/riscv: introduces acrquire, release and full barriers
  2024-02-26 17:38 ` [PATCH v5 10/23] xen/riscv: introduces acrquire, release and full barriers Oleksii Kurochko
@ 2024-03-05  7:42   ` Jan Beulich
  0 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-03-05  7:42 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
albeit ...

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/fence.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _ASM_RISCV_FENCE_H
> +#define _ASM_RISCV_FENCE_H
> +
> +#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
> +#define RISCV_RELEASE_BARRIER   "\tfence rw, w\n"
> +#define RISCV_FULL_BARRIER      "\tfence rw, rw\n"

... I'm not really happy with the \t and \n that are put here. My take
on this is that it is the responsibility of the use site to supply such
as and when necessary.
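
A minimal sketch of that alternative, under stated assumptions: the macro
bodies below are a hypothetical whitespace-free variant of the ones in the
patch, and sketch_release_store with its "sw" instruction is an illustrative
template only, not code from the series. The use site adds the tab and
newline through adjacent string literal concatenation.

```c
/* Hypothetical variant of the fence.h macros with no embedded whitespace. */
#define RISCV_ACQUIRE_BARRIER   "fence r, rw"
#define RISCV_RELEASE_BARRIER   "fence rw, w"
#define RISCV_FULL_BARRIER      "fence rw, rw"

/*
 * Illustrative assembly template: the use site decides on separators.
 * Adjacent string literals are concatenated by the compiler, so this is
 * the same text an asm() statement would receive.
 */
static const char sketch_release_store[] =
    "\t" RISCV_RELEASE_BARRIER "\n"
    "\tsw a1, 0(a0)\n";
```

A use site that needs only a single instruction can then pass the macro
without any trailing newline.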

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-26 17:39 ` [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
@ 2024-03-05  8:05   ` Jan Beulich
  2024-03-05 17:34     ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-05  8:05 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On 26.02.2024 18:39, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/docs/misc/riscv/booting.txt
> @@ -0,0 +1,8 @@
> +System requirements
> +===================
> +
> +The following extensions are expected to be supported by a system on which
> +Xen is run:
> +- Zihintpause:
> +  On a system that doesn't have this extension, cpu_relax() should be
> +  implemented properly. Otherwise, an illegal instruction exception will arise.

This decision wants justifying in the (presently once again empty) description.

Furthermore - will there really be an illegal instruction exception otherwise?
Isn't it the nature of hints that they are NOPs if not serving their designated
purpose?
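
For reference, a small sketch that decodes the pause encoding used in the
patch's fallback path. The helper names are hypothetical; the field layout
is that of the base ISA's FENCE instruction. The value decodes as a FENCE
with pred=W, succ=0, fm=0, rd=x0 and rs1=x0, which, as far as I understand
the specification, is why a core without Zihintpause is expected to execute
it as an ordinary fence rather than raise an exception.

```c
#include <stdint.h>

/* Encoding of "pause", written with the leading zero spelled out. */
#define SKETCH_PAUSE_INSN  UINT32_C(0x0100000F)

/* Field extractors for the FENCE instruction format (hypothetical names). */
static inline uint32_t insn_opcode(uint32_t insn) { return insn & 0x7f; }
static inline uint32_t insn_rd(uint32_t insn)     { return (insn >> 7) & 0x1f; }
static inline uint32_t insn_funct3(uint32_t insn) { return (insn >> 12) & 0x7; }
static inline uint32_t insn_rs1(uint32_t insn)    { return (insn >> 15) & 0x1f; }
static inline uint32_t fence_succ(uint32_t insn)  { return (insn >> 20) & 0xf; }
static inline uint32_t fence_pred(uint32_t insn)  { return (insn >> 24) & 0xf; }
static inline uint32_t fence_fm(uint32_t insn)    { return (insn >> 28) & 0xf; }
```

Opcode 0x0f is MISC-MEM with funct3 0, i.e. FENCE; pred 0x1 is the W bit
alone.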

> --- a/xen/arch/riscv/arch.mk
> +++ b/xen/arch/riscv/arch.mk
> @@ -5,6 +5,12 @@ $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS))
>  
>  CFLAGS-$(CONFIG_RISCV_64) += -mabi=lp64
>  
> +ifeq ($(CONFIG_RISCV_64),y)
> +has_zihintpause = $(call as-insn,$(CC) -mabi=lp64 -march=rv64i_zihintpause, "pause",_zihintpause,)
> +else
> +has_zihintpause = $(call as-insn,$(CC) -mabi=ilp32 -march=rv32i_zihintpause, "pause",_zihintpause,)
> +endif

Considering that down the road likely more such tests will want adding, I think
this wants further abstracting for the rv32/rv64 difference (ideally in a way
that wouldn't make future RV128 wrongly and silently take the RV32 branch).
This would include eliminating the -mabi=lp64 redundancy with what's visible in
context, perhaps by way of introducing a separate helper macro, e.g.

riscv-abi-$(CONFIG_RISCV_32) := -mabi=ilp32
riscv-abi-$(CONFIG_RISCV_64) := -mabi=lp64

I further see nothing wrong with also using $(riscv-march-y) here. I.e.
overall

_zihintpause := $(call as-insn,$(CC) $(riscv-abi-y) $(riscv-march-y)_zihintpause,"pause",_zihintpause)

(still with potential of abstracting further through another macro such
that not every such construct would need to spell out the ABI and arch
compiler options).

Plus a macro named has_* imo can be expected to expand to y or n. I would
suggest to simply drop the "has", thus ...

> @@ -12,7 +18,7 @@ riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
>  # into the upper half _or_ the lower half of the address space.
>  # -mcmodel=medlow would force Xen into the lower half.
>  
> -CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
> +CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align -mcmodel=medany

... also making the use site look 

> --- a/xen/arch/riscv/include/asm/processor.h
> +++ b/xen/arch/riscv/include/asm/processor.h
> @@ -12,6 +12,9 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +/* TODO: need to be implemeted */
> +#define smp_processor_id() 0
> +
>  /* On stack VCPU state */
>  struct cpu_user_regs
>  {
> @@ -53,6 +56,26 @@ struct cpu_user_regs
>      unsigned long pregs;
>  };
>  
> +/* TODO: need to implement */
> +#define cpu_to_core(cpu)   (0)
> +#define cpu_to_socket(cpu) (0)

Nit: Like above in smp_processor_id() no need for parentheses here.

> +static inline void cpu_relax(void)
> +{
> +#ifdef __riscv_zihintpause
> +    /*
> +     * Reduce instruction retirement.
> +     * This assumes the PC changes.

What is this 2nd sentence about?

> +     */
> +    __asm__ __volatile__ ( "pause" );
> +#else
> +    /* Encoding of the pause instruction */
> +    __asm__ __volatile__ ( ".insn 0x100000F" );

May I ask that you spell out the leading zero here, to make clear there
aren't, by mistake, one too few zeroes in the middle?

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-02-26 17:39 ` [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
@ 2024-03-05  8:17   ` Jan Beulich
  2024-03-05 16:46     ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-05  8:17 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:39, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/mm.h
> +++ b/xen/arch/riscv/include/asm/mm.h
> @@ -3,11 +3,252 @@
>  #ifndef _ASM_RISCV_MM_H
>  #define _ASM_RISCV_MM_H
>  
> +#include <public/xen.h>
> +#include <xen/bug.h>
> +#include <xen/mm-frame.h>
> +#include <xen/pdx.h>
> +#include <xen/types.h>
> +
>  #include <asm/page-bits.h>
>  
>  #define pfn_to_paddr(pfn) ((paddr_t)(pfn) << PAGE_SHIFT)
>  #define paddr_to_pfn(pa)  ((unsigned long)((pa) >> PAGE_SHIFT))
>  
> +#define paddr_to_pdx(pa)    mfn_to_pdx(maddr_to_mfn(pa))
> +#define gfn_to_gaddr(gfn)   pfn_to_paddr(gfn_x(gfn))
> +#define gaddr_to_gfn(ga)    _gfn(paddr_to_pfn(ga))
> +#define mfn_to_maddr(mfn)   pfn_to_paddr(mfn_x(mfn))
> +#define maddr_to_mfn(ma)    _mfn(paddr_to_pfn(ma))
> +#define vmap_to_mfn(va)     maddr_to_mfn(virt_to_maddr((vaddr_t)va))

va needs parenthesizing here. Also why vaddr_t here but ...

> +#define vmap_to_page(va)    mfn_to_page(vmap_to_mfn(va))
> +
> +static inline unsigned long __virt_to_maddr(unsigned long va)
> +{
> +    BUG_ON("unimplemented");
> +    return 0;
> +}
> +
> +static inline void *__maddr_to_virt(unsigned long ma)
> +{
> +    BUG_ON("unimplemented");
> +    return NULL;
> +}
> +
> +#define virt_to_maddr(va) __virt_to_maddr((unsigned long)(va))
> +#define maddr_to_virt(pa) __maddr_to_virt((unsigned long)(pa))

... unsigned long here? In fact for __maddr_to_virt() I think there
better wouldn't be any cast, such that the compiler can spot if, by
mistake, a pointer type value was passed in. Or, wait, we can go
yet further (also on x86): There are no uses of __maddr_to_virt()
except here. Hence the symbol isn't needed (anymore?) in the first
place.
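
On the parenthesization remark for vmap_to_mfn(), a minimal sketch of what
goes wrong. The macros and helpers below are hypothetical stand-ins rather
than Xen code, and uintptr_t is used in place of vaddr_t so the block is
self-contained. A cast binds more tightly than "+", so an unparenthesized
argument is converted before the arithmetic is applied.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the cast in the patch; not Xen macros. */
#define ADDR_UNPARENTHESIZED(va)  ((uintptr_t)va)
#define ADDR_PARENTHESIZED(va)    ((uintptr_t)(va))

static inline uintptr_t sketch_unparenthesized(const uint32_t *p)
{
    /* Expands to (uintptr_t)p + 1: the cast applies first, adding 1 byte. */
    return ADDR_UNPARENTHESIZED(p + 1);
}

static inline uintptr_t sketch_parenthesized(const uint32_t *p)
{
    /* Expands to (uintptr_t)(p + 1): pointer arithmetic adds sizeof(*p). */
    return ADDR_PARENTHESIZED(p + 1);
}
```

The two helpers differ by sizeof(uint32_t) - 1 bytes for the same argument,
which is the kind of silent miscalculation the extra parentheses prevent.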

> +/* Convert between Xen-heap virtual addresses and machine frame numbers. */
> +#define __virt_to_mfn(va)  mfn_x(maddr_to_mfn(virt_to_maddr(va)))
> +#define __mfn_to_virt(mfn) maddr_to_virt(mfn_to_maddr(_mfn(mfn)))
> +
> +/*
> + * We define non-underscored wrappers for above conversion functions.
> + * These are overriden in various source files while underscored version
> + * remain intact.
> + */
> +#define virt_to_mfn(va)     __virt_to_mfn(va)
> +#define mfn_to_virt(mfn)    __mfn_to_virt(mfn)
> +
> +struct page_info
> +{
> +    /* Each frame can be threaded onto a doubly-linked list. */
> +    struct page_list_entry list;
> +
> +    /* Reference count and various PGC_xxx flags and fields. */
> +    unsigned long count_info;
> +
> +    /* Context-dependent fields follow... */
> +    union {
> +        /* Page is in use: ((count_info & PGC_count_mask) != 0). */
> +        struct {
> +            /* Type reference count and various PGT_xxx flags and fields. */
> +            unsigned long type_info;
> +        } inuse;

Blank line here please.

Jan

> +        /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
> +        union {
> +            struct {
> +                /*
> +                 * Index of the first *possibly* unscrubbed page in the buddy.
> +                 * One more bit than maximum possible order to accommodate
> +                 * INVALID_DIRTY_IDX.
> +                 */
> +#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
> +                unsigned long first_dirty:MAX_ORDER + 1;
> +
> +                /* Do TLBs need flushing for safety before next page use? */
> +                bool need_tlbflush:1;
> +
> +#define BUDDY_NOT_SCRUBBING    0
> +#define BUDDY_SCRUBBING        1
> +#define BUDDY_SCRUB_ABORT      2
> +                unsigned long scrub_state:2;
> +            };
> +
> +            unsigned long val;
> +        } free;
> +    } u;
> +
> +    union {
> +        /* Page is in use */
> +        struct {
> +            /* Owner of this page (NULL if page is anonymous). */
> +            struct domain *domain;
> +        } inuse;
> +
> +        /* Page is on a free list. */
> +        struct {
> +            /* Order-size of the free chunk this page is the head of. */
> +            unsigned int order;
> +        } free;
> +    } v;
> +
> +    union {
> +        /*
> +         * Timestamp from 'TLB clock', used to avoid extra safety flushes.
> +         * Only valid for: a) free pages, and b) pages with zero type count
> +         */
> +        uint32_t tlbflush_timestamp;
> +    };
> +};



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 21/23] xen/rirscv: add minimal amount of stubs to build full Xen
  2024-02-26 17:39 ` [PATCH v5 21/23] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
@ 2024-03-05  8:40   ` Jan Beulich
  0 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-03-05  8:40 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:39, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
with one remark:

> +/*
> + * The following functions are defined in common/irq.c, which will be built in
> + * the next commit, so these changes will be removed there.
> + */
> +
> +void cf_check irq_actor_none(struct irq_desc *desc)
> +{
> +    BUG_ON("unimplemented");
> +}
> +
> +unsigned int cf_check irq_startup_none(struct irq_desc *desc)
> +{
> +    BUG_ON("unimplemented");
> +
> +    return 0;
> +}

Neither patch descriptions nor comments should mention "the next commit" or
anything alike. You want to describe things without making assumptions that
two successively submitted patches are also committed in direct succession.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-03-05  8:17   ` Jan Beulich
@ 2024-03-05 16:46     ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-03-05 16:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Tue, 2024-03-05 at 09:17 +0100, Jan Beulich wrote:
> On 26.02.2024 18:39, Oleksii Kurochko wrote:
> > --- a/xen/arch/riscv/include/asm/mm.h
> > +++ b/xen/arch/riscv/include/asm/mm.h
> > @@ -3,11 +3,252 @@
> >  #ifndef _ASM_RISCV_MM_H
> >  #define _ASM_RISCV_MM_H
> >  
> > +#include <public/xen.h>
> > +#include <xen/bug.h>
> > +#include <xen/mm-frame.h>
> > +#include <xen/pdx.h>
> > +#include <xen/types.h>
> > +
> >  #include <asm/page-bits.h>
> >  
> >  #define pfn_to_paddr(pfn) ((paddr_t)(pfn) << PAGE_SHIFT)
> >  #define paddr_to_pfn(pa)  ((unsigned long)((pa) >> PAGE_SHIFT))
> >  
> > +#define paddr_to_pdx(pa)    mfn_to_pdx(maddr_to_mfn(pa))
> > +#define gfn_to_gaddr(gfn)   pfn_to_paddr(gfn_x(gfn))
> > +#define gaddr_to_gfn(ga)    _gfn(paddr_to_pfn(ga))
> > +#define mfn_to_maddr(mfn)   pfn_to_paddr(mfn_x(mfn))
> > +#define maddr_to_mfn(ma)    _mfn(paddr_to_pfn(ma))
> > +#define vmap_to_mfn(va)    
> > maddr_to_mfn(virt_to_maddr((vaddr_t)va))
> 
> va needs parenthesizing here. Also why vaddr_t here but ...
> 
> > +#define vmap_to_page(va)    mfn_to_page(vmap_to_mfn(va))
> > +
> > +static inline unsigned long __virt_to_maddr(unsigned long va)
> > +{
> > +    BUG_ON("unimplemented");
> > +    return 0;
> > +}
> > +
> > +static inline void *__maddr_to_virt(unsigned long ma)
> > +{
> > +    BUG_ON("unimplemented");
> > +    return NULL;
> > +}
> > +
> > +#define virt_to_maddr(va) __virt_to_maddr((unsigned long)(va))
> > +#define maddr_to_virt(pa) __maddr_to_virt((unsigned long)(pa))
> 
> ... unsigned long here? In fact for __maddr_to_virt() I think there
> better wouldn't be any cast, such that the compiler can spot if, by
> mistake, a pointer type value was passed in. Or, wait, we can go
> yet further (also on x86): There are no uses of __maddr_to_virt()
> except here. Hence the symbol isn't needed (anymore?) in the first
> place.
I used 'unsigned long' only because I declared __virt_to_maddr() with an
unsigned long argument, but I think it should be vaddr_t, as the input
argument is expected to be a VA, and the cast should use vaddr_t too.
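
To illustrate the parenthesizing remark quoted above, here is a minimal
sketch of why a macro argument has to be wrapped before it is cast. The
names TO_VADDR_UNSAFE, TO_VADDR and cast_skew are hypothetical and not
part of the patch, and vaddr_t is only a stand-in typedef for Xen's type:

```c
#include <stdint.h>

typedef unsigned long vaddr_t;   /* assumption: stand-in for Xen's vaddr_t */

/* Unparenthesized argument: the cast binds to the first operand only. */
#define TO_VADDR_UNSAFE(va) ((vaddr_t)va)
/* Parenthesized argument: the whole expression is cast. */
#define TO_VADDR(va)        ((vaddr_t)(va))

/* Byte distance between the two expansions for the argument "p + 1". */
static inline unsigned long cast_skew(const uint32_t *p)
{
    vaddr_t unsafe = TO_VADDR_UNSAFE(p + 1); /* (vaddr_t)p + 1: one byte on */
    vaddr_t safe   = TO_VADDR(p + 1);        /* (vaddr_t)(p + 1): one element on */

    return safe - unsafe;
}
```

With a uint32_t pointer the two expansions differ by three bytes, which
is the kind of silent error the parentheses prevent.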

~ Oleksii


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-03-05  8:05   ` Jan Beulich
@ 2024-03-05 17:34     ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-03-05 17:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On Tue, 2024-03-05 at 09:05 +0100, Jan Beulich wrote:
> On 26.02.2024 18:39, Oleksii Kurochko wrote:
> > --- /dev/null
> > +++ b/docs/misc/riscv/booting.txt
> > @@ -0,0 +1,8 @@
> > +System requirements
> > +===================
> > +
> > +The following extensions are expected to be supported by a system
> > on which
> > +Xen is run:
> > +- Zihintpause:
> > +  On a system that doesn't have this extension, cpu_relax() should
> > be
> > +  implemented properly. Otherwise, an illegal instruction
> > exception will arise.
> 
> This decision wants justifying in the (presently once again empty)
> description.
> 
> Furthermore - will there really be an illegal instruction exception
> otherwise?
> Isn't it the nature of hints that they are NOPs if not serving their
> designated
> purpose?
You are right, they are NOPs, so I will drop the part about an illegal
instruction exception.

> > --- a/xen/arch/riscv/arch.mk
> > +++ b/xen/arch/riscv/arch.mk
> > @@ -5,6 +5,12 @@ $(call cc-options-
> > add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS))
> >  
> >  CFLAGS-$(CONFIG_RISCV_64) += -mabi=lp64
> >  
> > +ifeq ($(CONFIG_RISCV_64),y)
> > +has_zihintpause = $(call as-insn,$(CC) -mabi=lp64 -
> > march=rv64i_zihintpause, "pause",_zihintpause,)
> > +else
> > +has_zihintpause = $(call as-insn,$(CC) -mabi=ilp32 -
> > march=rv32i_zihintpause, "pause",_zihintpause,)
> > +endif
> 
> Considering that down the road likely more such tests will want
> adding, I think
> this wants further abstracting for the rv32/rv64 difference (ideally
> in a way
> that wouldn't make future RV128 wrongly and silently take the RV32
> branch).
> This would include eliminating the -mabi=lp64 redundancy with what's
> visible in
> context, perhaps by way of introducing a separate helper macro, e.g.
> 
> riscv-abi-$(CONFIG_RISCV_32) := -mabi=ilp32
> riscv-abi-$(CONFIG_RISCV_64) := -mabi=lp64
> 
> I further see nothing wrong with also using $(riscv-march-y) here.
> I.e.
> overall
> 
> _zihintpause := $(call as-insn,$(CC) $(riscv-abi-y) $(riscv-march-
> y)_zihintpause,"pause",_zihintpause)
> 
> (still with potential of abstracting further through another macro
> such
> that not every such construct would need to spell out the ABI and
> arch
> compiler options).
> 
> Plus a macro named has_* imo can be expected to expand to y or n. I
> would
> suggest to simply drop the "has", thus ...
> 
> > @@ -12,7 +18,7 @@ riscv-march-$(CONFIG_RISCV_ISA_C)       :=
> > $(riscv-march-y)c
> >  # into the upper half _or_ the lower half of the address space.
> >  # -mcmodel=medlow would force Xen into the lower half.
> >  
> > -CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
> > +CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align
> > -mcmodel=medany
> 
> ... also making the use site look 
> 
> > --- a/xen/arch/riscv/include/asm/processor.h
> > +++ b/xen/arch/riscv/include/asm/processor.h
> > @@ -12,6 +12,9 @@
> >  
> >  #ifndef __ASSEMBLY__
> >  
> > +/* TODO: need to be implemeted */
> > +#define smp_processor_id() 0
> > +
> >  /* On stack VCPU state */
> >  struct cpu_user_regs
> >  {
> > @@ -53,6 +56,26 @@ struct cpu_user_regs
> >      unsigned long pregs;
> >  };
> >  
> > +/* TODO: need to implement */
> > +#define cpu_to_core(cpu)   (0)
> > +#define cpu_to_socket(cpu) (0)
> 
> Nit: Like above in smp_processor_id() no need for parentheses here.
> 
> > +static inline void cpu_relax(void)
> > +{
> > +#ifdef __riscv_zihintpause
> > +    /*
> > +     * Reduce instruction retirement.
> > +     * This assumes the PC changes.
> 
> What is this 2nd sentence about?
The cpu_relax() function was copied from the Linux kernel, where this
comment exists, but I couldn't find anything in the Zihintpause spec
about how it affects the PC/IP, so it seems to me it can be dropped.

My guess is that the 2nd sentence was added because of the following
words from the spec:
   The PAUSE instruction is a HINT that indicates the current hart’s
   rate of instruction retirement should be temporarily reduced or
   paused. The duration of its effect must be bounded and may be zero.
   
So it says reduced or paused, but the comment still doesn't make sense:
no matter how long pause takes to complete, it will still advance the PC.

> 
> > +     */
> > +    __asm__ __volatile__ ( "pause" );
> > +#else
> > +    /* Encoding of the pause instruction */
> > +    __asm__ __volatile__ ( ".insn 0x100000F" );
> 
> May I ask that you spell out the leading zero here, to make clear
> there
> aren't, by mistake, one to few zeroes in the middle?
I will add a leading zero. The encoding is correct; I've verified it with
a disassembler:
   c:   0100000f                pause    
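
The encoding can also be cross-checked by hand. The Zihintpause spec
defines PAUSE as a FENCE with pred=W, succ=0, fm=0, rd=x0 and rs1=x0. A
small sketch that decodes those fields (decode_fence and is_pause are
hypothetical helper names, not part of the patch):

```c
#include <stdint.h>

/* Fields of a RISC-V FENCE instruction (MISC-MEM major opcode). */
struct fence_fields {
    uint32_t opcode; /* bits  6:0  */
    uint32_t rd;     /* bits 11:7  */
    uint32_t funct3; /* bits 14:12 */
    uint32_t rs1;    /* bits 19:15 */
    uint32_t succ;   /* bits 23:20 */
    uint32_t pred;   /* bits 27:24, bit 24 being W */
    uint32_t fm;     /* bits 31:28 */
};

static inline struct fence_fields decode_fence(uint32_t insn)
{
    struct fence_fields f = {
        .opcode = insn & 0x7f,
        .rd     = (insn >> 7) & 0x1f,
        .funct3 = (insn >> 12) & 0x7,
        .rs1    = (insn >> 15) & 0x1f,
        .succ   = (insn >> 20) & 0xf,
        .pred   = (insn >> 24) & 0xf,
        .fm     = (insn >> 28) & 0xf,
    };

    return f;
}

/* PAUSE: FENCE with pred=W (0x1), everything else zero. */
static inline int is_pause(uint32_t insn)
{
    struct fence_fields f = decode_fence(insn);

    return f.opcode == 0x0f && f.rd == 0 && f.funct3 == 0 &&
           f.rs1 == 0 && f.succ == 0 && f.pred == 0x1 && f.fm == 0;
}
```

Here 0x0100000f satisfies is_pause(), while a value with the set bit one
nibble lower (0x0010000f) decodes as succ=W, pred=0 and does not.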

~ Oleksii


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-02-26 17:38 ` [PATCH v5 12/23] xen/riscv: introduce io.h Oleksii Kurochko
@ 2024-03-06 14:13   ` Jan Beulich
  2024-03-07 13:01     ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-06 14:13 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/io.h
> @@ -0,0 +1,157 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + *  The header taken form Linux 6.4.0-rc1 and is based on
> + *  arch/riscv/include/asm/mmio.h with the following changes:
> + *   - drop forcing of endianess for read*(), write*() functions as
> + *     no matter what CPU endianness, what endianness a particular device
> + *     (and hence its MMIO region(s)) is using is entirely independent.
> + *     Hence conversion, where necessary, needs to occur at a layer up.
> + *     Another one reason to drop endianess conversion is:
> + *     https://patchwork.kernel.org/project/linux-riscv/patch/20190411115623.5749-3-hch@lst.de/
> + *     One of the answers of the author of the commit:
> + *       And we don't know if Linux will be around if that ever changes.
> + *       The point is:
> + *        a) the current RISC-V spec is LE only
> + *        b) the current linux port is LE only except for this little bit
> + *       There is no point in leaving just this bitrotting code around.  It
> + *       just confuses developers, (very very slightly) slows down compiles
> +  *      and will bitrot.  It also won't be any significant help to a future

Nit: Stray extra leading blank.

> + *       developer down the road doing a hypothetical BE RISC-V Linux port.
> + *   - drop unused argument of __io_ar() macros.
> + *   - drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
> + *     as they are unnessary.

Nit: unnecessary (also ...

> + *   - Adopt the Xen code style for this header, considering that significant changes
> + *     are not anticipated in the future.
> + *     In the event of any issues, adapting them to Xen style should be easily
> + *     manageable.
> + *   - drop unnessary __r variables in macros read*_cpu()

... again here)

> + * Copyright (C) 1996-2000 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2014 Regents of the University of California
> + * Copyright (C) 2024 Vates
> + */
> +
> +#ifndef _ASM_RISCV_IO_H
> +#define _ASM_RISCV_IO_H
> +
> +#include <asm/byteorder.h>
> +
> +/*
> + * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> + * change the properties of memory regions.  This should be fixed by the
> + * upcoming platform spec.
> + */
> +#define ioremap_nocache(addr, size) ioremap(addr, size)
> +#define ioremap_wc(addr, size) ioremap(addr, size)
> +#define ioremap_wt(addr, size) ioremap(addr, size)
> +
> +/* Generic IO read/write.  These perform native-endian accesses. */
> +static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
> +{
> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}

I realize this is like Linux has it, but how is the compiler to know that
*addr is being accessed here? If the omission of respective constraints here
and below is intentional, I think a comment (covering all instances) is
needed. Note that while supposedly cloned from Arm code, Arm variants do
have such constraints in Linux.

I'm sorry for not having paid (enough) attention earlier.
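
For illustration only, a sketch of what a byte store with an explicit
memory operand might look like, so that the compiler sees *addr being
written. The name write8_sketch is hypothetical, the __iomem annotation
is left out, and the non-RISC-V branch exists only so the example can be
built elsewhere; this is an assumption about one possible form, not the
patch's code:

```c
#include <stdint.h>

static inline void write8_sketch(uint8_t val, volatile void *addr)
{
#if defined(__riscv)
    /* "=m" tells the compiler that the byte at *addr is written. */
    __asm__ __volatile__ ( "sb %1, %0"
                           : "=m" (*(volatile uint8_t *)addr)
                           : "r" (val) );
#else
    /* Portable fallback for building the sketch on other architectures. */
    *(volatile uint8_t *)addr = val;
#endif
}
```

With the memory operand in place, the address no longer has to be passed
as a separate "r" input.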

> +static inline void __raw_writew(uint16_t val, volatile void __iomem *addr)
> +{
> +    asm volatile ( "sh %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}
> +
> +static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
> +{
> +    asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}
> +
> +#ifdef CONFIG_64BIT
> +static inline void __raw_writeq(u64 val, volatile void __iomem *addr)

uint64_t please

> +{
> +    asm volatile ( "sd %0, 0(%1)" : : "r" (val), "r" (addr) );
> +}
> +#endif
> +
> +static inline uint8_t __raw_readb(const volatile void __iomem *addr)
> +{
> +    uint8_t val;
> +
> +    asm volatile ( "lb %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +
> +static inline uint16_t __raw_readw(const volatile void __iomem *addr)
> +{
> +    uint16_t val;
> +
> +    asm volatile ( "lh %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +
> +static inline uint32_t __raw_readl(const volatile void __iomem *addr)
> +{
> +    uint32_t val;
> +
> +    asm volatile ( "lw %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +
> +#ifdef CONFIG_64BIT
> +static inline u64 __raw_readq(const volatile void __iomem *addr)

uint64_t please

> +{
> +    u64 val;

and again

> +    asm volatile ( "ld %0, 0(%1)" : "=r" (val) : "r" (addr) );
> +    return val;
> +}
> +#endif
> +
> +/*
> + * Unordered I/O memory access primitives.  These are even more relaxed than
> + * the relaxed versions, as they don't even order accesses between successive
> + * operations to the I/O regions.
> + */
> +#define readb_cpu(c)        __raw_readb(c)
> +#define readw_cpu(c)        __raw_readw(c)
> +#define readl_cpu(c)        __raw_readl(c)
> +
> +#define writeb_cpu(v, c)    __raw_writeb(v, c)
> +#define writew_cpu(v, c)    __raw_writew(v, c)
> +#define writel_cpu(v, c)    __raw_writel(v, c)
> +
> +#ifdef CONFIG_64BIT
> +#define readq_cpu(c)        __raw_readq(c)
> +#define writeq_cpu(v, c)    __raw_writeq(v, c)
> +#endif
> +
> +/*
> + * I/O memory access primitives. Reads are ordered relative to any
> + * following Normal memory access. Writes are ordered relative to any prior
> + * Normal memory access.  The memory barriers here are necessary as RISC-V
> + * doesn't define any ordering between the memory space and the I/O space.
> + */
> +#define __io_br()   do { } while (0)
> +#define __io_ar()   asm volatile ( "fence i,r" : : : "memory" );
> +#define __io_bw()   asm volatile ( "fence w,o" : : : "memory" );
> +#define __io_aw()   do { } while (0)
> +
> +#define readb(c)    ({ uint8_t  v; __io_br(); v = readb_cpu(c); __io_ar(); v; })
> +#define readw(c)    ({ uint16_t v; __io_br(); v = readw_cpu(c); __io_ar(); v; })
> +#define readl(c)    ({ uint32_t v; __io_br(); v = readl_cpu(c); __io_ar(); v; })
> +
> +#define writeb(v, c)    ({ __io_bw(); writeb_cpu(v, c); __io_aw(); })
> +#define writew(v, c)    ({ __io_bw(); writew_cpu(v, c); __io_aw(); })
> +#define writel(v, c)    ({ __io_bw(); writel_cpu(v, c); __io_aw(); })
> +
> +#ifdef CONFIG_64BIT
> +#define readq(c)        ({ uint64_t v; __io_br(); v = readq_cpu(c); __io_ar(); v; })
> +#define writeq(v, c)    ({ __io_bw(); writeq_cpu(v, c); __io_aw(); })
> +#endif

Overall looks much tidier now, thanks.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-02-26 17:38 ` [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
@ 2024-03-06 14:56   ` Jan Beulich
  2024-03-07 10:35     ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-06 14:56 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> The header was taken from Linux kernl 6.4.0-rc1.
> 
> Addionally, were updated:
> * add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
>   access.
> * replace tabs with spaces
> * replace __* variale with *__
> * introduce generic version of xchg_* and cmpxchg_*.
> 
> Implementation of 4- and 8-byte cases were left as it is done in
> Linux kernel as according to the RISC-V spec:
> ```
> Table A.5 ( only part of the table was copied here )
> 
> Linux Construct       RVWMO Mapping
> atomic <op> relaxed    amo<op>.{w|d}
> atomic <op> acquire    amo<op>.{w|d}.aq
> atomic <op> release    amo<op>.{w|d}.rl
> atomic <op>            amo<op>.{w|d}.aqrl
> 
> Linux Construct       RVWMO LR/SC Mapping
> atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
> atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ; bnez loop OR
>                        fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗ ; bnez loop
> atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop
> 
> The Linux mappings for release operations may seem stronger than necessary,
> but these mappings are needed to cover some cases in which Linux requires
> stronger orderings than the more intuitive mappings would provide.
> In particular, as of the time this text is being written, Linux is actively
> debating whether to require load-load, load-store, and store-store orderings
> between accesses in one critical section and accesses in a subsequent critical
> section in the same hart and protected by the same synchronization object.
> Not all combinations of FENCE RW,W/FENCE R,RW mappings with aq/rl mappings
> combine to provide such orderings.
> There are a few ways around this problem, including:
> 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This suffices
>    but is undesirable, as it defeats the purpose of the aq/rl modifiers.
> 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does not
>    currently work due to the lack of load and store opcodes with aq and rl
>    modifiers.

As before I don't understand this point. Can you give an example of what
sort of opcode / instruction is missing?

> 3. Strengthen the mappings of release operations such that they would
>    enforce sufficient orderings in the presence of either type of acquire mapping.
>    This is the currently-recommended solution, and the one shown in Table A.5.
> ```
> 
> But in Linux kenrel atomics were strengthen with fences:
> ```
> Atomics present the same issue with locking: release and acquire
> variants need to be strengthened to meet the constraints defined
> by the Linux-kernel memory consistency model [1].
> 
> Atomics present a further issue: implementations of atomics such
> as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
> which do not give full-ordering with .aqrl; for example, current
> implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
> below to end up with the state indicated in the "exists" clause.
> 
> In order to "synchronize" LKMM and RISC-V's implementation, this
> commit strengthens the implementations of the atomics operations
> by replacing .rl and .aq with the use of ("lightweigth") fences,
> and by replacing .aqrl LR/SC pairs in sequences such as:
> 
> 0:      lr.w.aqrl  %0, %addr
>         bne        %0, %old, 1f
>         ...
>         sc.w.aqrl  %1, %new, %addr
>         bnez       %1, 0b
> 1:
> 
> with sequences of the form:
> 
> 0:      lr.w       %0, %addr
>         bne        %0, %old, 1f
>               ...
>         sc.w.rl    %1, %new, %addr   /* SC-release   */
>         bnez       %1, 0b
>         fence      rw, rw            /* "full" fence */
> 1:
> 
> following Daniel's suggestion.
> 
> These modifications were validated with simulation of the RISC-V
> memory consistency model.
> 
> C lr-sc-aqrl-pair-vs-full-barrier
> 
> {}
> 
> P0(int *x, int *y, atomic_t *u)
> {
>         int r0;
>         int r1;
> 
>         WRITE_ONCE(*x, 1);
>         r0 = atomic_cmpxchg(u, 0, 1);
>         r1 = READ_ONCE(*y);
> }
> 
> P1(int *x, int *y, atomic_t *v)
> {
>         int r0;
>         int r1;
> 
>         WRITE_ONCE(*y, 1);
>         r0 = atomic_cmpxchg(v, 0, 1);
>         r1 = READ_ONCE(*x);
> }
> 
> exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)

While I'm entirely willing to trust this can happen, I can't bring this
in line with the A extension spec.

Additionally it's not clear to me to what extent all of this applies when
you don't really use LR/SC in the 4- and 8-byte cases (and going forward
likely also not in the 1- and 2-byte case, utilizing Zabha when available).

> ---
> Changes in V5:
>  - update the commit message.
>  - drop ALIGN_DOWN().
>  - update the definition of emulate_xchg_1_2(): 
>    - lr.d -> lr.w, sc.d -> sc.w.
>    - drop ret argument.
>    - code style fixes around asm volatile.
>    - update prototype.
>    - use asm named operands.
>    - rename local variables.
>    - add comment above the macros
>  - update the definition of __xchg_generic:
>    - drop local ptr__ variable.
>    - code style fixes around switch()
>    - update prototype.
>  - introduce RISCV_FULL_BARRIES.
>  - redefine cmpxchg()
>  - update emulate_cmpxchg_1_2():
>    - update prototype
>    - update local variables names and usage of them
>    - use name asm operands.
>    - add comment above the macros
> ---
> Changes in V4:
>  - Code style fixes.
>  - enforce in __xchg_*() has the same type for new and *ptr, also "\n"
>    was removed at the end of asm instruction.
>  - dependency from https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@bugseng.com/
>  - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
>  - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
>  - drop cmpxcg{32,64}_{local} as they aren't used.
>  - introduce generic version of xchg_* and cmpxchg_*.
>  - update the commit message.
> ---
> Changes in V3:
>  - update the commit message
>  - add emulation of {cmp}xchg_... for 1 and 2 bytes types
> ---
> Changes in V2:
>  - update the comment at the top of the header.
>  - change xen/lib.h to xen/bug.h.
>  - sort inclusion of headers properly.
> ---
>  xen/arch/riscv/include/asm/cmpxchg.h | 258 +++++++++++++++++++++++++++
>  1 file changed, 258 insertions(+)
>  create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
> 
> diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
> new file mode 100644
> index 0000000000..66cbe26737
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> @@ -0,0 +1,258 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright (C) 2014 Regents of the University of California */
> +
> +#ifndef _ASM_RISCV_CMPXCHG_H
> +#define _ASM_RISCV_CMPXCHG_H
> +
> +#include <xen/compiler.h>
> +#include <xen/lib.h>
> +
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +#define __amoswap_generic(ptr, new, ret, sfx, pre, post) \
> +({ \
> +    asm volatile( \

Nit: In Xen style this is lacking a blank ahead of the opening parenthesis.

> +        pre \
> +        " amoswap" sfx " %0, %2, %1\n" \
> +        post \
> +        : "=r" (ret), "+A" (*ptr) \
> +        : "r" (new) \
> +        : "memory" ); \
> +})
> +
> +/*
> + * For LR and SC, the A extension requires that the address held in rs1 be
> + * naturally aligned to the size of the operand (i.e., eight-byte aligned
> + * for 64-bit words and four-byte aligned for 32-bit words).
> + * If the address is not naturally aligned, an address-misaligned exception
> + * or an access-fault exception will be generated.
> + * 
> + * Thereby:
> + * - for 1-byte xchg access the containing word by clearing low two bits
> + * - for 2-byte xchg ccess the containing word by clearing first bit.

"first bit" can still be ambiguous. Better say "bit 1".

> + * 

Here and apparently also elsewhere: Stray trailing blank. Git has a config
setting to warn you about (maybe even to automatically strip) such.

> + * If resulting 4-byte access is still misalgined, it will fault just as
> + * non-emulated 4-byte access would.
> + */
> +#define emulate_xchg_1_2(ptr, new, sc_sfx, pre, post) \
> +({ \
> +    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr & ~(0x4 - sizeof(*ptr))); \

Here and elsewhere: sizeof(*(ptr)) (i.e. the inner parentheses are needed
also there).
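
A small sketch of why the inner parentheses matter (SIZE_UNSAFE,
SIZE_SAFE and the two wrappers are hypothetical names used only for this
illustration). When the macro argument is an expression such as "p + 1",
the unparenthesized form measures an integer-promoted sum instead of the
pointed-to object:

```c
#include <stddef.h>
#include <stdint.h>

#define SIZE_UNSAFE(ptr) sizeof(*ptr)    /* expands to sizeof(*p + 1) */
#define SIZE_SAFE(ptr)   sizeof(*(ptr))  /* expands to sizeof(*(p + 1)) */

static inline size_t size_unsafe(const uint8_t *p)
{
    /* *p is promoted to int before the addition, so this is sizeof(int). */
    return SIZE_UNSAFE(p + 1);
}

static inline size_t size_safe(const uint8_t *p)
{
    /* The operand is the uint8_t object itself, so this is 1. */
    return SIZE_SAFE(p + 1);
}
```

In the emulation macros such a mismatch would feed a wrong size into the
pointer alignment and mask calculations.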

> +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE; \

Why uint8_t?

> +    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE) - 1, 0) << new_val_pos; \
> +    unsigned int new_ = new << new_val_pos; \
> +    unsigned int old_val; \
> +    unsigned int xchged_val; \
> +    \
> +    asm volatile ( \
> +        pre \
> +        "0: lr.w %[op_oldval], %[op_aligned_ptr]\n" \
> +        "   and  %[op_xchged_val], %[op_oldval], %z[op_nmask]\n" \
> +        "   or   %[op_xchged_val], %[op_xchged_val], %z[op_new]\n" \
> +        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val], %[op_aligned_ptr]\n" \
> +        "   bnez %[op_xchged_val], 0b\n" \
> +        post \
> +        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r" (xchged_val), [op_aligned_ptr]"+A" (*aligned_ptr) \

Too long line. Partly because you have op_ prefixes here whose purpose
I can't recognize. The val / _val suffixes also don't appear to carry
much useful information. And "xchged", being explicitly past tense,
doesn't look to fit even up to the SC, not to speak of afterwards.
Anything wrong with calling this just tmp, aux, or scratch?
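
As a side note, the address arithmetic used by the emulation can be
checked in isolation. The following sketch mirrors the expressions from
the macro (little-endian layout assumed, as on RISC-V); the helper names
are hypothetical and not part of the patch:

```c
#include <stdint.h>

#define BITS_PER_BYTE 8

/* Address of the 32-bit word containing a 1- or 2-byte object. */
static inline uintptr_t containing_word(uintptr_t addr, unsigned int size)
{
    /* size 1: clears bits 1:0; size 2: clears bit 1 only. */
    return addr & ~(uintptr_t)(4 - size);
}

/* Bit position of the object inside that word. */
static inline unsigned int field_shift(uintptr_t addr, unsigned int size)
{
    return (unsigned int)(addr & (4 - size)) * BITS_PER_BYTE;
}

/* Mask covering the object inside that word. */
static inline uint32_t field_mask(uintptr_t addr, unsigned int size)
{
    uint32_t ones = (1u << (size * BITS_PER_BYTE)) - 1;

    return ones << field_shift(addr, size);
}

/* What the and/or pair in the LR/SC loop computes before the SC. */
static inline uint32_t merge_field(uint32_t word, uint32_t new_val,
                                   uintptr_t addr, unsigned int size)
{
    uint32_t mask = field_mask(addr, size);

    return (word & ~mask) | ((new_val << field_shift(addr, size)) & mask);
}
```

Note that a 2-byte object at an odd address keeps bit 0 set after
containing_word(), so the resulting 4-byte access is still misaligned,
matching what the comment in the patch says about faulting.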

> +        : [op_new] "rJ" (new_), [op_nmask] "rJ" (~mask) \
> +        : "memory" ); \
> +    \
> +    (__typeof__(*(ptr)))((old_val & mask) >> new_val_pos); \
> +})
> +
> +#define __xchg_generic(ptr, new, size, sfx, pre, post) \
> +({ \
> +    __typeof__(*(ptr)) new__ = (new); \
> +    __typeof__(*(ptr)) ret__; \
> +    switch ( size ) \

Can't this use sizeof(*(ptr)), allowing for one less macro parameter?

> +    { \
> +    case 1: \
> +    case 2: \
> +        ret__ = emulate_xchg_1_2(ptr, new__, sfx, pre, post); \
> +        break; \
> +    case 4: \
> +        __amoswap_generic(ptr, new__, ret__,\
> +                          ".w" sfx,  pre, post); \
> +        break; \
> +    case 8: \
> +        __amoswap_generic(ptr, new__, ret__,\
> +                          ".d" sfx,  pre, post); \
> +        break; \

In io.h you make sure to avoid rv64-only insns. Here you don't. The build
would fail either way, but this still looks inconsistent.

Also nit: Stray double blanks (twice) ahead of "pre". Plus with this style
of line continuation you want to consistently have exactly one blank ahead
of each backslash.

> +    default: \
> +        STATIC_ASSERT_UNREACHABLE(); \
> +    } \
> +    ret__; \
> +})
> +
> +#define xchg_relaxed(ptr, x) \
> +({ \
> +    __typeof__(*(ptr)) x_ = (x); \

What is the purpose of this, when __xchg_generic() already does this same
type conversion?

> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), "", "", ""); \
> +})
> +
> +#define xchg_acquire(ptr, x) \
> +({ \
> +    __typeof__(*(ptr)) x_ = (x); \
> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
> +                                       "", "", RISCV_ACQUIRE_BARRIER); \
> +})
> +
> +#define xchg_release(ptr, x) \
> +({ \
> +    __typeof__(*(ptr)) x_ = (x); \
> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
> +                                       "", RISCV_RELEASE_BARRIER, ""); \
> +})

As asked before: Are there going to be any uses of these three? Common
code doesn't require them. And not needing to provide them would
simplify things quite a bit, it seems.

> +#define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
> +                                    ".aqrl", "", "")

According to the earlier comment (where I don't follow the example given),
is .aqrl sufficient here? And even if it was for the 4- and 8-byte cases,
is it sufficient in the 1- and 2-byte emulation case (where it then is
appended to just the SC)?

> +#define __generic_cmpxchg(ptr, old, new, ret, lr_sfx, sc_sfx, pre, post)	\
> + ({ \
> +    register unsigned int rc; \
> +    asm volatile( \
> +        pre \
> +        "0: lr" lr_sfx " %0, %2\n" \
> +        "   bne  %0, %z3, 1f\n" \
> +        "   sc" sc_sfx " %1, %z4, %2\n" \
> +        "   bnez %1, 0b\n" \
> +        post \
> +        "1:\n" \
> +        : "=&r" (ret), "=&r" (rc), "+A" (*ptr) \
> +        : "rJ" (old), "rJ" (new) \
> +        : "memory"); \
> + })
> +
> +/*
> + * For LR and SC, the A extension requires that the address held in rs1 be
> + * naturally aligned to the size of the operand (i.e., eight-byte aligned
> + * for 64-bit words and four-byte aligned for 32-bit words).
> + * If the address is not naturally aligned, an address-misaligned exception
> + * or an access-fault exception will be generated.
> + * 
> + * Thereby:
> + * - for 1-byte xchg access the containing word by clearing low two bits
> + * - for 2-byte xchg ccess the containing word by clearing first bit.
> + * 
> + * If resulting 4-byte access is still misalgined, it will fault just as
> + * non-emulated 4-byte access would.
> + *
> + * old_val was casted to unsigned long at the end of the define because of
> + * the following issue:
> + * ./arch/riscv/include/asm/cmpxchg.h:166:5: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
> + * 166 |     (__typeof__(*(ptr)))(old_val >> new_val_pos); \
> + *     |     ^
> + * ./arch/riscv/include/asm/cmpxchg.h:184:17: note: in expansion of macro 'emulate_cmpxchg_1_2'
> + * 184 |         ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
> + *     |                 ^~~~~~~~~~~~~~~~~~~
> + * ./arch/riscv/include/asm/cmpxchg.h:227:5: note: in expansion of macro '__cmpxchg_generic'
> + * 227 |     __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
> + *     |     ^~~~~~~~~~~~~~~~~
> + * ./include/xen/lib.h:141:26: note: in expansion of macro '__cmpxchg'
> + * 141 |     ((__typeof__(*(ptr)))__cmpxchg(ptr, (unsigned long)o_,              \
> + *     |                          ^~~~~~~~~
> + * common/event_channel.c:109:13: note: in expansion of macro 'cmpxchgptr'
> + * 109 |             cmpxchgptr(&xen_consumers[i], NULL, fn);
> + */

This is too much detail on the compile issue. Just mentioning that said
cast is needed for cmpxchgptr() ought to be sufficient.

> +#define emulate_cmpxchg_1_2(ptr, old, new, sc_sfx, pre, post) \
> +({ \
> +    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr & ~(0x4 - sizeof(*ptr))); \
> +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE; \
> +    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE) - 1, 0) << new_val_pos; \
> +    unsigned int old_ = old << new_val_pos; \
> +    unsigned int new_ = new << new_val_pos; \
> +    unsigned int old_val; \
> +    unsigned int xchged_val; \
> +    \
> +    __asm__ __volatile__ ( \
> +        pre \
> +        "0: lr.w %[op_xchged_val], %[op_aligned_ptr]\n" \
> +        "   and  %[op_oldval], %[op_xchged_val], %z[op_mask]\n" \
> +        "   bne  %[op_oldval], %z[op_old], 1f\n" \
> +        "   xor  %[op_xchged_val], %[op_oldval], %[op_xchged_val]\n" \
> +        "   or   %[op_xchged_val], %[op_xchged_val], %z[op_new]\n" \
> +        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val], %[op_aligned_ptr]\n" \
> +        "   bnez %[op_xchged_val], 0b\n" \
> +        post \
> +        "1:\n" \
> +        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r" (xchged_val), [op_aligned_ptr] "+A" (*aligned_ptr) \
> +        : [op_old] "rJ" (old_), [op_new] "rJ" (new_), \
> +          [op_mask] "rJ" (mask) \
> +        : "memory" ); \
> +    \
> +    (__typeof__(*(ptr)))((unsigned long)old_val >> new_val_pos); \
> +})
> +
> +/*
> + * Atomic compare and exchange.  Compare OLD with MEM, if identical,
> + * store NEW in MEM.  Return the initial value in MEM.  Success is
> + * indicated by comparing RETURN with OLD.
> + */
> +#define __cmpxchg_generic(ptr, old, new, size, sc_sfx, pre, post) \
> +({ \
> +    __typeof__(ptr) ptr__ = (ptr); \
> +    __typeof__(*(ptr)) old__ = (__typeof__(*(ptr)))(old); \
> +    __typeof__(*(ptr)) new__ = (__typeof__(*(ptr)))(new); \
> +    __typeof__(*(ptr)) ret__; \
> +    switch ( size ) \
> +    { \
> +    case 1: \
> +    case 2: \
> +        ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
> +                            sc_sfx, pre, post); \
> +        break; \
> +    case 4: \
> +        __generic_cmpxchg(ptr__, old__, new__, ret__, \
> +                          ".w", ".w"sc_sfx, pre, post); \
> +        break; \
> +    case 8: \
> +        __generic_cmpxchg(ptr__, old__, new__, ret__, \
> +                          ".d", ".d"sc_sfx, pre, post); \
> +        break; \
> +    default: \
> +        STATIC_ASSERT_UNREACHABLE(); \
> +    } \
> +    ret__; \
> +})
> +
> +#define cmpxchg_relaxed(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) o_ = (o); \
> +    __typeof__(*(ptr)) n_ = (n); \
> +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, \
> +                    o_, n_, sizeof(*(ptr)), "", "", ""); \
> +})
> +
> +#define cmpxchg_acquire(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) o_ = (o); \
> +    __typeof__(*(ptr)) n_ = (n); \
> +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_, sizeof(*(ptr)), \
> +                                          "", "", RISCV_ACQUIRE_BARRIER); \
> +})
> +
> +#define cmpxchg_release(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) o_ = (o); \
> +    __typeof__(*(ptr)) n_ = (n); \
> +    (__typeof__(*(ptr)))__cmpxchg_release(ptr, o_, n_, sizeof(*(ptr)), \

There's no __cmpxchg_release() afaics; did you mean __cmpxchg_generic()?

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 13/23] xen/riscv: introduce atomic.h
  2024-02-26 17:38 ` [PATCH v5 13/23] xen/riscv: introduce atomic.h Oleksii Kurochko
@ 2024-03-06 15:31   ` Jan Beulich
  2024-03-07 13:30     ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-06 15:31 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/atomic.h
> @@ -0,0 +1,296 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Taken and modified from Linux.
> + *
> + * The following changes were done:
> + * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
> + *     to use__*xchg_generic()
> + * - drop casts in write_atomic() as they are unnecessary
> + * - drop introduction of WRITE_ONCE() and READ_ONCE().
> + *   Xen provides ACCESS_ONCE()
> + * - remove zero-length array access in read_atomic()
> + * - drop defines similar to pattern
> + *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
> + * - move not RISC-V specific functions to asm-generic/atomics-ops.h
> + * 
> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> + * Copyright (C) 2012 Regents of the University of California
> + * Copyright (C) 2017 SiFive
> + * Copyright (C) 2024 Vates SAS
> + */
> +
> +#ifndef _ASM_RISCV_ATOMIC_H
> +#define _ASM_RISCV_ATOMIC_H
> +
> +#include <xen/atomic.h>
> +
> +#include <asm/cmpxchg.h>
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +#include <asm-generic/atomic-ops.h>

While, because of the forward decls in xen/atomic.h, having this #include
works, I wonder if it wouldn't better be placed further down. The compiler
will likely have an easier time when it sees the inline definitions ahead
of any uses.

> +void __bad_atomic_size(void);
> +
> +/*
> + * Legacy from Linux kernel. For some reason they wanted to have ordered
> + * read/write access. Thereby read* is used instead of read<X>_cpu()
> + */
> +static always_inline void read_atomic_size(const volatile void *p,
> +                                           void *res,
> +                                           unsigned int size)
> +{
> +    switch ( size )
> +    {
> +    case 1: *(uint8_t *)res = readb(p); break;
> +    case 2: *(uint16_t *)res = readw(p); break;
> +    case 4: *(uint32_t *)res = readl(p); break;
> +    case 8: *(uint32_t *)res  = readq(p); break;

This is the point where the lack of constraints in io.h (see my respective
comment) becomes actually harmful: You're accessing not MMIO, but compiler-
visible variables here. It needs to know which ones are read ...

> +    default: __bad_atomic_size(); break;
> +    }
> +}
> +
> +#define read_atomic(p) ({                               \
> +    union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
> +    read_atomic_size(p, x_.c, sizeof(*p));              \
> +    x_.val;                                             \
> +})
> +
> +#define write_atomic(p, x)                              \
> +({                                                      \
> +    typeof(*p) x__ = (x);                               \
> +    switch ( sizeof(*p) )                               \
> +    {                                                   \
> +    case 1: writeb(x__,  p); break;                     \
> +    case 2: writew(x__, p); break;                      \
> +    case 4: writel(x__, p); break;                      \
> +    case 8: writeq(x__, p); break;                      \

... or written.

Nit: There's a stray blank in the writeb() invocation.

> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +    x__;                                                \
> +})
> +
> +#define add_sized(p, x)                                 \
> +({                                                      \
> +    typeof(*(p)) x__ = (x);                             \
> +    switch ( sizeof(*(p)) )                             \

Like you have it here, {read,write}_atomic() also need p properly
parenthesized. There look to be more parenthesization issues further
down.
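
A small sketch of why the missing parentheses matter (the macro and function
names below are hypothetical, chosen only for this illustration): when the
macro argument is an expression such as ptr + 1, sizeof(*p) expands to
sizeof(*ptr + 1), which measures a promoted int expression instead of the
pointed-to object.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macros, for illustration only. */
#define SIZE_UNPARENTHESIZED(p) sizeof(*p)
#define SIZE_PARENTHESIZED(p)   sizeof(*(p))

static inline size_t size_unparenthesized(void)
{
    uint8_t buf[4] = { 0 };
    uint8_t *ptr = buf;

    /* Expands to sizeof(*ptr + 1): the operand has type int. */
    return SIZE_UNPARENTHESIZED(ptr + 1);
}

static inline size_t size_parenthesized(void)
{
    uint8_t buf[4] = { 0 };
    uint8_t *ptr = buf;

    /* Expands to sizeof(*(ptr + 1)): the operand has type uint8_t. */
    return SIZE_PARENTHESIZED(ptr + 1);
}
```

In a size-dispatching macro the first form would silently select the 4-byte
case for a 1-byte object.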

> +    {                                                   \
> +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> +    case 2: writew(read_atomic(p) + x__, p); break;     \
> +    case 4: writel(read_atomic(p) + x__, p); break;     \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +})

Any reason this doesn't have an 8-byte case? x86's at least has one.
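
A rough sketch of what a size dispatch including the 8-byte case could look
like. It is an assumption-laden stand-in, not the patch's macro: plain
volatile accesses replace writeb()/writew()/writel()/writeq(), and the names
add_sized_sketch() and ADD_SIZED_SKETCH() are made up for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for add_sized(): one volatile read, one volatile
 * write per call; the read-modify-write as a whole is not atomic. */
static inline void add_sized_sketch(volatile void *p, uint64_t x,
                                    unsigned int size)
{
    switch ( size )
    {
    case 1:
    {
        volatile uint8_t *q = p;
        *q = (uint8_t)(*q + x);
        break;
    }
    case 2:
    {
        volatile uint16_t *q = p;
        *q = (uint16_t)(*q + x);
        break;
    }
    case 4:
    {
        volatile uint32_t *q = p;
        *q = (uint32_t)(*q + x);
        break;
    }
    case 8: /* the case the quoted macro lacks */
    {
        volatile uint64_t *q = p;
        *q = *q + x;
        break;
    }
    default:
        /* The patch calls __bad_atomic_size() here to fail at link time. */
        break;
    }
}

#define ADD_SIZED_SKETCH(p, x) \
    add_sized_sketch((p), (uint64_t)(x), sizeof(*(p)))
```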

> +#define __atomic_acquire_fence() \
> +    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
> +
> +#define __atomic_release_fence() \
> +    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )

Elsewhere you use asm volatile() - why __asm__ __volatile__() here?
Or why not there (cmpxchg.h, io.h)?

> +/*
> + * First, the atomic ops that have no ordering constraints and therefor don't
> + * have the AQ or RL bits set.  These don't return anything, so there's only
> + * one version to worry about.
> + */
> +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> +static inline                                               \
> +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                           \
> +    __asm__ __volatile__ (                                  \
> +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> +        : "+A" (v->counter)                                 \
> +        : "r" (I)                                           \
> +        : "memory" );                                       \
> +}                                                           \
> +
> +#define ATOMIC_OPS(op, asm_op, I)                           \
> +        ATOMIC_OP (op, asm_op, I, w, int,   )
> +
> +ATOMIC_OPS(add, add,  i)
> +ATOMIC_OPS(sub, add, -i)
> +ATOMIC_OPS(and, and,  i)
> +ATOMIC_OPS( or,  or,  i)
> +ATOMIC_OPS(xor, xor,  i)
> +
> +#undef ATOMIC_OP
> +#undef ATOMIC_OPS
> +
> +/*
> + * Atomic ops that have ordered, relaxed, acquire, and release variants.
> + * There's two flavors of these: the arithmatic ops have both fetch and return
> + * versions, while the logical ops only have fetch versions.
> + */
> +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
> +                         atomic##prefix##_t *v)                     \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}                                                                   \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}
> +
> +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
> +                          atomic##prefix##_t *v)                        \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
> +}                                                                       \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
> +}
> +
> +#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
> +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
> +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )

What purpose is the last macro argument when you only ever pass nothing
for it (here and ...

> +ATOMIC_OPS(add, add, +,  i)
> +ATOMIC_OPS(sub, add, +, -i)
> +
> +#undef ATOMIC_OPS
> +
> +#define ATOMIC_OPS(op, asm_op, I) \
> +        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )

... here)?
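
For readers following the macro layering: the quoted ATOMIC_OP_RETURN derives
the "return" flavour from the "fetch" flavour by re-applying the operation to
the fetched value, and sub is expressed as an add of the negated operand. A
minimal portable sketch using C11 <stdatomic.h> (the *_sketch names are
hypothetical; the patch itself uses AMO instructions):

```c
#include <stdatomic.h>

/* Returns the value held before the addition, like atomic_fetch_add(). */
static inline int fetch_add_sketch(atomic_int *v, int i)
{
    return atomic_fetch_add(v, i);
}

/* Returns the value after the addition: old value plus the operand. */
static inline int add_return_sketch(atomic_int *v, int i)
{
    return atomic_fetch_add(v, i) + i;
}

/* sub is implemented as add of -i, mirroring ATOMIC_OPS(sub, add, +, -i). */
static inline int sub_return_sketch(atomic_int *v, int i)
{
    return atomic_fetch_add(v, -i) + -i;
}
```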

> +ATOMIC_OPS(and, and, i)
> +ATOMIC_OPS( or,  or, i)
> +ATOMIC_OPS(xor, xor, i)
> +
> +#undef ATOMIC_OPS
> +
> +#undef ATOMIC_FETCH_OP
> +#undef ATOMIC_OP_RETURN
> +
> +/* This is required to provide a full barrier on success. */
> +static inline int atomic_add_unless(atomic_t *v, int a, int u)
> +{
> +       int prev, rc;
> +
> +    __asm__ __volatile__ (
> +        "0: lr.w     %[p],  %[c]\n"
> +        "   beq      %[p],  %[u], 1f\n"
> +        "   add      %[rc], %[p], %[a]\n"
> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> +        "   bnez     %[rc], 0b\n"
> +        RISCV_FULL_BARRIER
> +        "1:\n"
> +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> +        : [a] "r" (a), [u] "r" (u)
> +        : "memory");
> +    return prev;
> +}
> +
> +/*
> + * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
> + * {cmp,}xchg and the operations that return, so they need a full barrier.
> + */
> +#define ATOMIC_OP(c_t, prefix, size)                            \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size, "", "", "");  \

The inner parentheses aren't really needed here, are they?

> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size,               \
> +                          "", "", RISCV_ACQUIRE_BARRIER);       \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size,               \
> +                          "", RISCV_RELEASE_BARRIER, "");       \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size,               \
> +                          ".aqrl", "", "");                     \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
> +                     c_t o, c_t n)                              \
> +{                                                               \
> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             "", "", "");                       \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
> +                     c_t o, c_t n)                              \
> +{                                                               \
> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             "", "", RISCV_ACQUIRE_BARRIER);    \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
> +                     c_t o, c_t n)                              \
> +{	                                                            \

A hard tab looks to have been left here.

> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             "", RISCV_RELEASE_BARRIER, "");    \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
> +{                                                               \
> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             ".rl", "", " fence rw, rw\n");     \
> +}
> +
> +#define ATOMIC_OPS() \
> +    ATOMIC_OP(int,   , 4)
> +
> +ATOMIC_OPS()
> +
> +#undef ATOMIC_OPS
> +#undef ATOMIC_OP
> +
> +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> +{
> +       int prev, rc;
> +
> +    __asm__ __volatile__ (
> +        "0: lr.w     %[p],  %[c]\n"
> +        "   sub      %[rc], %[p], %[o]\n"
> +        "   bltz     %[rc], 1f\n"
> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> +        "   bnez     %[rc], 0b\n"
> +        "   fence    rw, rw\n"
> +        "1:\n"
> +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> +        : [o] "r" (offset)
> +        : "memory" );
> +    return prev - offset;
> +}
> +
> +#define atomic_dec_if_positive(v)	atomic_sub_if_positive(v, 1)

Hmm, PPC for some reason also has the latter, but for both: Are they indeed
going to be needed in RISC-V code? They certainly look unnecessary for the
purpose of this series (allowing common code to build).

> --- /dev/null
> +++ b/xen/include/asm-generic/atomic-ops.h
> @@ -0,0 +1,92 @@
> +#/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
> +#define _ASM_GENERIC_ATOMIC_OPS_H_
> +
> +#include <xen/atomic.h>
> +#include <xen/lib.h>

If I'm not mistaken this header provides default implementations for every
xen/atomic.h-provided forward inline declaration that can be synthesized
from other atomic functions. I think a comment to this effect would want
adding somewhere here.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-03-06 14:56   ` Jan Beulich
@ 2024-03-07 10:35     ` Oleksii
  2024-03-07 10:46       ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 10:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
> On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > The header was taken from Linux kernel 6.4.0-rc1.
> > 
> > Additionally, the following changes were made:
> > * add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
> >   access.
> > * replace tabs with spaces
> > * replace __* variable with *__
> > * introduce generic version of xchg_* and cmpxchg_*.
> > 
> > The implementation of the 4- and 8-byte cases was left as it is in the
> > Linux kernel, in line with the RISC-V spec:
> > ```
> > Table A.5 ( only part of the table was copied here )
> > 
> > Linux Construct       RVWMO Mapping
> > atomic <op> relaxed    amo<op>.{w|d}
> > atomic <op> acquire    amo<op>.{w|d}.aq
> > atomic <op> release    amo<op>.{w|d}.rl
> > atomic <op>            amo<op>.{w|d}.aqrl
> > 
> > Linux Construct       RVWMO LR/SC Mapping
> > atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> > atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
> > atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ; bnez loop OR
> >                        fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗ ; bnez loop
> > atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop
> > 
> > The Linux mappings for release operations may seem stronger than necessary,
> > but these mappings are needed to cover some cases in which Linux requires
> > stronger orderings than the more intuitive mappings would provide.
> > In particular, as of the time this text is being written, Linux is actively
> > debating whether to require load-load, load-store, and store-store orderings
> > between accesses in one critical section and accesses in a subsequent critical
> > section in the same hart and protected by the same synchronization object.
> > Not all combinations of FENCE RW,W/FENCE R,RW mappings with aq/rl mappings
> > combine to provide such orderings.
> > There are a few ways around this problem, including:
> > 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This suffices
> >    but is undesirable, as it defeats the purpose of the aq/rl modifiers.
> > 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does not
> >    currently work due to the lack of load and store opcodes with aq and rl
> >    modifiers.
> 
> As before I don't understand this point. Can you give an example of what
> sort of opcode / instruction is missing?
If I understand the spec correctly, the l{b|h|w|d} and s{b|h|w|d}
instructions don't have aq or rl annotations. Here is the text from the
spec:
   ARM Operation                  RVWMO Mapping
   Load                           l{b|h|w|d}
   Load-Acquire                   fence rw, rw; l{b|h|w|d}; fence r,rw 
   Load-Exclusive                 lr.{w|d}
   Load-Acquire-Exclusive         lr.{w|d}.aqrl
   Store                          s{b|h|w|d}
   Store-Release                  fence rw,w; s{b|h|w|d}
   Store-Exclusive                sc.{w|d}
   Store-Release-Exclusive        sc.{w|d}.rl
   dmb                            fence rw,rw
   dmb.ld                         fence r,rw
   dmb.st                         fence w,w
   isb                            fence.i; fence r,r
     Table A.4: Mappings from ARM operations to RISC-V operations

   Table A.4 provides a mapping from ARM memory operations onto RISC-V
   memory instructions.
   Since RISC-V does not currently have plain load and store opcodes with
   aq or rl annotations, ARM load-acquire and store-release operations
   should be mapped using fences instead.
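
As a rough illustration of "mapped using fences", here is a portable C11
sketch in which a plain (relaxed) access is combined with a separate fence.
The function names are hypothetical. On RISC-V, compilers typically lower the
acquire fence to "fence r,rw" and the release fence to "fence rw,w", though
that is a property of the toolchain rather than something this thread states.
The sketch shows only the general pattern and does not reproduce the extra
leading "fence rw, rw" that Table A.4 lists for ARM Load-Acquire.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Plain load followed by a fence: the load cannot be annotated itself. */
static inline uint32_t load_acquire_with_fence(_Atomic uint32_t *p)
{
    uint32_t v = atomic_load_explicit(p, memory_order_relaxed);

    atomic_thread_fence(memory_order_acquire);
    return v;
}

/* Fence followed by a plain store: the store cannot be annotated itself. */
static inline void store_release_with_fence(_Atomic uint32_t *p, uint32_t v)
{
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(p, v, memory_order_relaxed);
}
```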

> 
> > 3. Strengthen the mappings of release operations such that they would
> >    enforce sufficient orderings in the presence of either type of acquire
> >    mapping.
> >    This is the currently-recommended solution, and the one shown in
> >    Table A.5.
> > ```
> > 
> > But in the Linux kernel, atomics were strengthened with fences:
> > ```
> > Atomics present the same issue with locking: release and acquire
> > variants need to be strengthened to meet the constraints defined
> > by the Linux-kernel memory consistency model [1].
> > 
> > Atomics present a further issue: implementations of atomics such
> > as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
> > which do not give full-ordering with .aqrl; for example, current
> > implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
> > below to end up with the state indicated in the "exists" clause.
> > 
> > In order to "synchronize" LKMM and RISC-V's implementation, this
> > commit strengthens the implementations of the atomics operations
> > by replacing .rl and .aq with the use of ("lightweigth") fences,
> > and by replacing .aqrl LR/SC pairs in sequences such as:
> > 
> > 0:      lr.w.aqrl  %0, %addr
> >         bne        %0, %old, 1f
> >         ...
> >         sc.w.aqrl  %1, %new, %addr
> >         bnez       %1, 0b
> > 1:
> > 
> > with sequences of the form:
> > 
> > 0:      lr.w       %0, %addr
> >         bne        %0, %old, 1f
> >               ...
> >         sc.w.rl    %1, %new, %addr   /* SC-release   */
> >         bnez       %1, 0b
> >         fence      rw, rw            /* "full" fence */
> > 1:
> > 
> > following Daniel's suggestion.
> > 
> > These modifications were validated with simulation of the RISC-V
> > memory consistency model.
> > 
> > C lr-sc-aqrl-pair-vs-full-barrier
> > 
> > {}
> > 
> > P0(int *x, int *y, atomic_t *u)
> > {
> >         int r0;
> >         int r1;
> > 
> >         WRITE_ONCE(*x, 1);
> >         r0 = atomic_cmpxchg(u, 0, 1);
> >         r1 = READ_ONCE(*y);
> > }
> > 
> > P1(int *x, int *y, atomic_t *v)
> > {
> >         int r0;
> >         int r1;
> > 
> >         WRITE_ONCE(*y, 1);
> >         r0 = atomic_cmpxchg(v, 0, 1);
> >         r1 = READ_ONCE(*x);
> > }
> > 
> > exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
> 
> While I'm entirely willing to trust this can happen, I can't bring this
> in line with the A extension spec.
> 
> Additionally it's not clear to me in how far all of this applies when
> you don't really use LR/SC in the 4- and 8-byte cases (and going forward
> likely also not in the 1- and 2-byte case, utilizing Zabha when available).
It just explains which combinations of fences, lr/sc, amoswap, and .aq/.rl
annotations can be used together, and why the combinations introduced in
this patch were chosen.

> 
> > ---
> > Changes in V5:
> >  - update the commit message.
> >  - drop ALIGN_DOWN().
> >  - update the definition of emulate_xchg_1_2(): 
> >    - lr.d -> lr.w, sc.d -> sc.w.
> >    - drop ret argument.
> >    - code style fixes around asm volatile.
> >    - update prototype.
> >    - use asm named operands.
> >    - rename local variables.
> >    - add comment above the macros
> >  - update the definition of __xchg_generic:
> >    - drop local ptr__ variable.
> >    - code style fixes around switch()
> >    - update prototype.
> >  - introduce RISCV_FULL_BARRIER.
> >  - redefine cmpxchg()
> >  - update emulate_cmpxchg_1_2():
> >    - update prototype
> >    - update local variables names and usage of them
> >    - use named asm operands.
> >    - add comment above the macros
> > ---
> > Changes in V4:
> >  - Code style fixes.
> >  - enforce in __xchg_*() has the same type for new and *ptr, also
> > "\n"
> >    was removed at the end of asm instruction.
> >  - dependency from
> > https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@bugseng.com/
> >  - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
> >  - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
> >  - drop cmpxcg{32,64}_{local} as they aren't used.
> >  - introduce generic version of xchg_* and cmpxchg_*.
> >  - update the commit message.
> > ---
> > Changes in V3:
> >  - update the commit message
> >  - add emulation of {cmp}xchg_... for 1 and 2 bytes types
> > ---
> > Changes in V2:
> >  - update the comment at the top of the header.
> >  - change xen/lib.h to xen/bug.h.
> >  - sort inclusion of headers properly.
> > ---
> >  xen/arch/riscv/include/asm/cmpxchg.h | 258
> > +++++++++++++++++++++++++++
> >  1 file changed, 258 insertions(+)
> >  create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/cmpxchg.h
> > b/xen/arch/riscv/include/asm/cmpxchg.h
> > new file mode 100644
> > index 0000000000..66cbe26737
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> > @@ -0,0 +1,258 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (C) 2014 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_CMPXCHG_H
> > +#define _ASM_RISCV_CMPXCHG_H
> > +
> > +#include <xen/compiler.h>
> > +#include <xen/lib.h>
> > +
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#define __amoswap_generic(ptr, new, ret, sfx, pre, post) \
> > +({ \
> > +    asm volatile( \
> 
> Nit: In Xen style this is lacking a blank ahead of the opening
> parenthesis.
> 
> > +        pre \
> > +        " amoswap" sfx " %0, %2, %1\n" \
> > +        post \
> > +        : "=r" (ret), "+A" (*ptr) \
> > +        : "r" (new) \
> > +        : "memory" ); \
> > +})
> > +
> > +/*
> > + * For LR and SC, the A extension requires that the address held
> > in rs1 be
> > + * naturally aligned to the size of the operand (i.e., eight-byte
> > aligned
> > + * for 64-bit words and four-byte aligned for 32-bit words).
> > + * If the address is not naturally aligned, an address-misaligned
> > exception
> > + * or an access-fault exception will be generated.
> > + * 
> > + * Thereby:
> > + * - for 1-byte xchg access the containing word by clearing low
> > two bits
> > + * - for 2-byte xchg ccess the containing word by clearing first
> > bit.
> 
> "first bit" can still be ambiguous. Better say "bit 1".
> 
> > + * 
> 
> Here and apparently also elsewhere: Stray trailing blank. Git has a config
> setting to warn you about (maybe even to automatically strip?) such.
It would be useful for me. Thanks a lot for the recommendation.

> 
> > + * If resulting 4-byte access is still misalgined, it will fault
> > just as
> > + * non-emulated 4-byte access would.
> > + */
> > +#define emulate_xchg_1_2(ptr, new, sc_sfx, pre, post) \
> > +({ \
> > +    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr &
> > ~(0x4 - sizeof(*ptr))); \
> 
> Here and elsewhere: sizeof(*(ptr)) (i.e. the inner parentheses are needed
> also there).
> 
> > +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 -
> > sizeof(*ptr))) * BITS_PER_BYTE; \
> 
> Why uint8_t?
It is enough to cover every possible start bit position of the value that
should be updated, so I decided to use uint8_t.
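
To make the range concrete, here is a small sketch of the new_val_pos
expression from the quoted macro, pulled out into a hypothetical helper
(subword_bit_pos() is not in the patch; BITS_PER_BYTE is defined locally only
to keep the example self-contained). For 1-byte operands the result is 0, 8,
16 or 24; for 2-byte operands it is 0 or 16, so the value never exceeds 24.

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/*
 * Hypothetical helper mirroring the patch's expression:
 *   ((unsigned long)(ptr) & (0x4 - sizeof(*ptr))) * BITS_PER_BYTE
 * 'size' is expected to be 1 or 2.
 */
static inline unsigned int subword_bit_pos(uintptr_t addr, size_t size)
{
    return (unsigned int)((addr & (0x4 - size)) * BITS_PER_BYTE);
}
```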

> 
> > +    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE)
> > - 1, 0) << new_val_pos; \
> > +    unsigned int new_ = new << new_val_pos; \
> > +    unsigned int old_val; \
> > +    unsigned int xchged_val; \
> > +    \
> > +    asm volatile ( \
> > +        pre \
> > +        "0: lr.w %[op_oldval], %[op_aligned_ptr]\n" \
> > +        "   and  %[op_xchged_val], %[op_oldval], %z[op_nmask]\n" \
> > +        "   or   %[op_xchged_val], %[op_xchged_val], %z[op_new]\n"
> > \
> > +        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val],
> > %[op_aligned_ptr]\n" \
> > +        "   bnez %[op_xchged_val], 0b\n" \
> > +        post \
> > +        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r"
> > (xchged_val), [op_aligned_ptr]"+A" (*aligned_ptr) \
> 
> Too long line. Partly because you have op_ prefixes here which I can't
> recognize what they would be good for. The val / _val suffixes also
> don't appear to carry much useful information. And "xchged", being
> explicitly past tense, doesn't look to fit even up and until the SC,
> not to speak of afterwards. Anything wrong with calling this just tmp,
> aux, or scratch?
op_ can be dropped, and each named operand can simply match the local
variable name. I thought the prefix would help to show that it is a named
operand, but after rethinking it looks like unneeded overhead.

In the case of emulate_xchg_1_2() the val/_val suffixes make no sense, as
the local variables don't clash with the macro's arguments; the suffixes
were added just to be in sync with the emulate_cmpxchg_1_2() macro. In
emulate_cmpxchg_1_2(ptr, old, new, sc_sfx, pre, post), however, the macro
has an old argument, so _val was added to distinguish the two.
Probably it would be better to rename it to read or read_old.


> 
> > +        : [op_new] "rJ" (new_), [op_nmask] "rJ" (~mask) \
> > +        : "memory" ); \
> > +    \
> > +    (__typeof__(*(ptr)))((old_val & mask) >> new_val_pos); \
> > +})
> > +
> > +#define __xchg_generic(ptr, new, size, sfx, pre, post) \
> > +({ \
> > +    __typeof__(*(ptr)) new__ = (new); \
> > +    __typeof__(*(ptr)) ret__; \
> > +    switch ( size ) \
> 
> Can't this use sizeof(*(ptr)), allowing for one less macro parameter?
> 
> > +    { \
> > +    case 1: \
> > +    case 2: \
> > +        ret__ = emulate_xchg_1_2(ptr, new__, sfx, pre, post); \
> > +        break; \
> > +    case 4: \
> > +        __amoswap_generic(ptr, new__, ret__,\
> > +                          ".w" sfx,  pre, post); \
> > +        break; \
> > +    case 8: \
> > +        __amoswap_generic(ptr, new__, ret__,\
> > +                          ".d" sfx,  pre, post); \
> > +        break; \
> 
> In io.h you make sure to avoid rv64-only insns. Here you don't. The
> build would fail either way, but this still looks inconsistent.
> 
> Also nit: Stray double blanks (twice) ahead of "pre". Plus with this
> style of line continuation you want to consistently have exactly one
> blank ahead of each backslash.
> 
> > +    default: \
> > +        STATIC_ASSERT_UNREACHABLE(); \
> > +    } \
> > +    ret__; \
> > +})
> > +
> > +#define xchg_relaxed(ptr, x) \
> > +({ \
> > +    __typeof__(*(ptr)) x_ = (x); \
> 
> What is the purpose of this, when __xchg_generic() already does this
> same
> type conversion?
> 
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),
> > "", "", ""); \
> > +})
> > +
> > +#define xchg_acquire(ptr, x) \
> > +({ \
> > +    __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
> > +                                       "", "",
> > RISCV_ACQUIRE_BARRIER); \
> > +})
> > +
> > +#define xchg_release(ptr, x) \
> > +({ \
> > +    __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
> > +                                       "", RISCV_RELEASE_BARRIER,
> > ""); \
> > +})
> 
> As asked before: Are there going to be any uses of these three?
> Common
> code doesn't require them. And not needing to provide them would
> simplify things quite a bit, it seems.
I checked my private branches, and it looks like I introduced them only
for the corresponding atomic operations (which were copied from the
Linux kernel), which are not used either.

So we could definitely drop these macros for now, but should
__xchg_generic() be updated as well? Looking at:
 #define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
                                     ".aqrl", "", "")
the last two arguments become unneeded, but I wanted to keep them in
case someone needs to bring back xchg_{release, acquire, ...}. Does
that make sense?

> 
> > +#define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x),
> > sizeof(*(ptr)), \
> > +                                    ".aqrl", "", "")
> 
> According to the earlier comment (where I don't follow the example
> given),
> is .aqrl sufficient here? And even if it was for the 4- and 8-byte
> cases,
> is it sufficient in the 1- and 2-byte emulation case (where it then
> is
> appended to just the SC)?
If I understand your question correctly, then according to the spec
.aqrl is enough for amo<op>.{w|d} instructions:
   Linux Construct        RVWMO AMO Mapping
   atomic <op> relaxed    amo<op>.{w|d}
   atomic <op> acquire    amo<op>.{w|d}.aq
   atomic <op> release    amo<op>.{w|d}.rl
   atomic <op>            amo<op>.{w|d}.aqrl
but in the LR/SC case you are right, SC requires a suffix too:
   Linux Construct        RVWMO LR/SC Mapping
   atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
   atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
   atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop
                          OR fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
   atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop

I will add sc_sfx to emulate_xchg_1_2(). The only remaining question is
whether __xchg_generic(ptr, new, size, sfx, pre, post) should be changed to
__xchg_generic(ptr, new, size, sfx1, sfx2, pre, post) to cover both
cases: amo<op>.{w|d}.sfx1 and lr.{w|d}.sfx1 ... sc.{w|d}.sfx2?

~ Oleksii

> 
> > +#define __generic_cmpxchg(ptr, old, new, ret, lr_sfx, sc_sfx, pre,
> > post)	\
> > + ({ \
> > +    register unsigned int rc; \
> > +    asm volatile( \
> > +        pre \
> > +        "0: lr" lr_sfx " %0, %2\n" \
> > +        "   bne  %0, %z3, 1f\n" \
> > +        "   sc" sc_sfx " %1, %z4, %2\n" \
> > +        "   bnez %1, 0b\n" \
> > +        post \
> > +        "1:\n" \
> > +        : "=&r" (ret), "=&r" (rc), "+A" (*ptr) \
> > +        : "rJ" (old), "rJ" (new) \
> > +        : "memory"); \
> > + })
> > +
> > +/*
> > + * For LR and SC, the A extension requires that the address held
> > in rs1 be
> > + * naturally aligned to the size of the operand (i.e., eight-byte
> > aligned
> > + * for 64-bit words and four-byte aligned for 32-bit words).
> > + * If the address is not naturally aligned, an address-misaligned
> > exception
> > + * or an access-fault exception will be generated.
> > + * 
> > + * Thereby:
> > + * - for 1-byte xchg access the containing word by clearing low
> > two bits
> > + * - for 2-byte xchg ccess the containing word by clearing first
> > bit.
> > + * 
> > + * If resulting 4-byte access is still misalgined, it will fault
> > just as
> > + * non-emulated 4-byte access would.
> > + *
> > + * old_val was casted to unsigned long at the end of the define
> > because of
> > + * the following issue:
> > + * ./arch/riscv/include/asm/cmpxchg.h:166:5: error: cast to
> > pointer from integer of different size [-Werror=int-to-pointer-
> > cast]
> > + * 166 |     (__typeof__(*(ptr)))(old_val >> new_val_pos); \
> > + *     |     ^
> > + * ./arch/riscv/include/asm/cmpxchg.h:184:17: note: in expansion
> > of macro 'emulate_cmpxchg_1_2'
> > + * 184 |         ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
> > + *     |                 ^~~~~~~~~~~~~~~~~~~
> > + * ./arch/riscv/include/asm/cmpxchg.h:227:5: note: in expansion of
> > macro '__cmpxchg_generic'
> > + * 227 |     __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned
> > long)(n), \
> > + *     |     ^~~~~~~~~~~~~~~~~
> > + * ./include/xen/lib.h:141:26: note: in expansion of macro
> > '__cmpxchg'
> > + * 141 |     ((__typeof__(*(ptr)))__cmpxchg(ptr, (unsigned
> > long)o_,              \
> > + *     |                          ^~~~~~~~~
> > + * common/event_channel.c:109:13: note: in expansion of macro
> > 'cmpxchgptr'
> > + * 109 |             cmpxchgptr(&xen_consumers[i], NULL, fn);
> > + */
> 
> This is too much detail on the compile issue. Just mentioning that
> said
> cast is needed for cmpxchgptr() ought to be sufficient.
> 
> > +#define emulate_cmpxchg_1_2(ptr, old, new, sc_sfx, pre, post) \
> > +({ \
> > +    uint32_t *aligned_ptr = (uint32_t *)((unsigned long)ptr &
> > ~(0x4 - sizeof(*ptr))); \
> > +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 -
> > sizeof(*ptr))) * BITS_PER_BYTE; \
> > +    unsigned long mask = GENMASK(((sizeof(*ptr)) * BITS_PER_BYTE)
> > - 1, 0) << new_val_pos; \
> > +    unsigned int old_ = old << new_val_pos; \
> > +    unsigned int new_ = new << new_val_pos; \
> > +    unsigned int old_val; \
> > +    unsigned int xchged_val; \
> > +    \
> > +    __asm__ __volatile__ ( \
> > +        pre \
> > +        "0: lr.w %[op_xchged_val], %[op_aligned_ptr]\n" \
> > +        "   and  %[op_oldval], %[op_xchged_val], %z[op_mask]\n" \
> > +        "   bne  %[op_oldval], %z[op_old], 1f\n" \
> > +        "   xor  %[op_xchged_val], %[op_oldval],
> > %[op_xchged_val]\n" \
> > +        "   or   %[op_xchged_val], %[op_xchged_val], %z[op_new]\n"
> > \
> > +        "   sc.w" sc_sfx " %[op_xchged_val], %[op_xchged_val],
> > %[op_aligned_ptr]\n" \
> > +        "   bnez %[op_xchged_val], 0b\n" \
> > +        post \
> > +        "1:\n" \
> > +        : [op_oldval] "=&r" (old_val), [op_xchged_val] "=&r"
> > (xchged_val), [op_aligned_ptr] "+A" (*aligned_ptr) \
> > +        : [op_old] "rJ" (old_), [op_new] "rJ" (new_), \
> > +          [op_mask] "rJ" (mask) \
> > +        : "memory" ); \
> > +    \
> > +    (__typeof__(*(ptr)))((unsigned long)old_val >> new_val_pos); \
> > +})
> > +
> > +/*
> > + * Atomic compare and exchange.  Compare OLD with MEM, if
> > identical,
> > + * store NEW in MEM.  Return the initial value in MEM.  Success is
> > + * indicated by comparing RETURN with OLD.
> > + */
> > +#define __cmpxchg_generic(ptr, old, new, size, sc_sfx, pre, post)
> > \
> > +({ \
> > +    __typeof__(ptr) ptr__ = (ptr); \
> > +    __typeof__(*(ptr)) old__ = (__typeof__(*(ptr)))(old); \
> > +    __typeof__(*(ptr)) new__ = (__typeof__(*(ptr)))(new); \
> > +    __typeof__(*(ptr)) ret__; \
> > +    switch ( size ) \
> > +    { \
> > +    case 1: \
> > +    case 2: \
> > +        ret__ = emulate_cmpxchg_1_2(ptr, old, new, \
> > +                            sc_sfx, pre, post); \
> > +        break; \
> > +    case 4: \
> > +        __generic_cmpxchg(ptr__, old__, new__, ret__, \
> > +                          ".w", ".w"sc_sfx, pre, post); \
> > +        break; \
> > +    case 8: \
> > +        __generic_cmpxchg(ptr__, old__, new__, ret__, \
> > +                          ".d", ".d"sc_sfx, pre, post); \
> > +        break; \
> > +    default: \
> > +        STATIC_ASSERT_UNREACHABLE(); \
> > +    } \
> > +    ret__; \
> > +})
> > +
> > +#define cmpxchg_relaxed(ptr, o, n) \
> > +({ \
> > +    __typeof__(*(ptr)) o_ = (o); \
> > +    __typeof__(*(ptr)) n_ = (n); \
> > +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, \
> > +                    o_, n_, sizeof(*(ptr)), "", "", ""); \
> > +})
> > +
> > +#define cmpxchg_acquire(ptr, o, n) \
> > +({ \
> > +    __typeof__(*(ptr)) o_ = (o); \
> > +    __typeof__(*(ptr)) n_ = (n); \
> > +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_,
> > sizeof(*(ptr)), \
> > +                                          "", "",
> > RISCV_ACQUIRE_BARRIER); \
> > +})
> > +
> > +#define cmpxchg_release(ptr, o, n) \
> > +({ \
> > +    __typeof__(*(ptr)) o_ = (o); \
> > +    __typeof__(*(ptr)) n_ = (n); \
> > +    (__typeof__(*(ptr)))__cmpxchg_release(ptr, o_, n_,
> > sizeof(*(ptr)), \
> 
> There's no __cmpxchg_release() afaics; dym __cmpxchg_generic()?
> 
> Jan




* Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-03-07 10:35     ` Oleksii
@ 2024-03-07 10:46       ` Jan Beulich
  2024-03-07 11:01         ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-07 10:46 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 11:35, Oleksii wrote:
> On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
>> On 26.02.2024 18:38, Oleksii Kurochko wrote:
>>> The header was taken from Linux kernel 6.4.0-rc1.
>>>
>>> Additionally, the following changes were made:
>>> * add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
>>>   access.
>>> * replace tabs with spaces
>>> * replace __* variables with *__
>>> * introduce generic version of xchg_* and cmpxchg_*.
>>>
>>> Implementation of 4- and 8-byte cases were left as it is done in
>>> Linux kernel as according to the RISC-V spec:
>>> ```
>>> Table A.5 ( only part of the table was copied here )
>>>
>>> Linux Construct       RVWMO Mapping
>>> atomic <op> relaxed    amo<op>.{w|d}
>>> atomic <op> acquire    amo<op>.{w|d}.aq
>>> atomic <op> release    amo<op>.{w|d}.rl
>>> atomic <op>            amo<op>.{w|d}.aqrl
>>>
>>> Linux Construct       RVWMO LR/SC Mapping
>>> atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
>>> atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
>>> atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ; bnez
>>> loop OR
>>>                        fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗ ;
>>> bnez loop
>>> atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez
>>> loop
>>>
>>> The Linux mappings for release operations may seem stronger than
>>> necessary,
>>> but these mappings are needed to cover some cases in which Linux
>>> requires
>>> stronger orderings than the more intuitive mappings would provide.
>>> In particular, as of the time this text is being written, Linux is
>>> actively
>>> debating whether to require load-load, load-store, and store-store
>>> orderings
>>> between accesses in one critical section and accesses in a
>>> subsequent critical
>>> section in the same hart and protected by the same synchronization
>>> object.
>>> Not all combinations of FENCE RW,W/FENCE R,RW mappings with aq/rl
>>> mappings
>>> combine to provide such orderings.
>>> There are a few ways around this problem, including:
>>> 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This
>>> suffices
>>>    but is undesirable, as it defeats the purpose of the aq/rl
>>> modifiers.
>>> 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does
>>> not
>>>    currently work due to the lack of load and store opcodes with aq
>>> and rl
>>>    modifiers.
>>
>> As before I don't understand this point. Can you give an example of
>> what
>> sort of opcode / instruction is missing?
> If I understand the spec correctly then l{b|h|w|d} and s{b|h|w|d}
> instructions don't have aq or rl annotation.

How would load insns other than LR and store insns other than SC come
into play here?

>>> 3. Strengthen the mappings of release operations such that they
>>> would
>>>    enforce sufficient orderings in the presence of either type of
>>> acquire mapping.
>>>    This is the currently-recommended solution, and the one shown in
>>> Table A.5.
>>> ```
>>>
>>> But in the Linux kernel, atomics were strengthened with fences:
>>> ```
>>> Atomics present the same issue with locking: release and acquire
>>> variants need to be strengthened to meet the constraints defined
>>> by the Linux-kernel memory consistency model [1].
>>>
>>> Atomics present a further issue: implementations of atomics such
>>> as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
>>> which do not give full-ordering with .aqrl; for example, current
>>> implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
>>> below to end up with the state indicated in the "exists" clause.
>>>
>>> In order to "synchronize" LKMM and RISC-V's implementation, this
>>> commit strengthens the implementations of the atomics operations
>>> by replacing .rl and .aq with the use of ("lightweight") fences,
>>> and by replacing .aqrl LR/SC pairs in sequences such as:
>>>
>>> 0:      lr.w.aqrl  %0, %addr
>>>         bne        %0, %old, 1f
>>>         ...
>>>         sc.w.aqrl  %1, %new, %addr
>>>         bnez       %1, 0b
>>> 1:
>>>
>>> with sequences of the form:
>>>
>>> 0:      lr.w       %0, %addr
>>>         bne        %0, %old, 1f
>>>               ...
>>>         sc.w.rl    %1, %new, %addr   /* SC-release   */
>>>         bnez       %1, 0b
>>>         fence      rw, rw            /* "full" fence */
>>> 1:
>>>
>>> following Daniel's suggestion.
>>>
>>> These modifications were validated with simulation of the RISC-V
>>> memory consistency model.
>>>
>>> C lr-sc-aqrl-pair-vs-full-barrier
>>>
>>> {}
>>>
>>> P0(int *x, int *y, atomic_t *u)
>>> {
>>>         int r0;
>>>         int r1;
>>>
>>>         WRITE_ONCE(*x, 1);
>>>         r0 = atomic_cmpxchg(u, 0, 1);
>>>         r1 = READ_ONCE(*y);
>>> }
>>>
>>> P1(int *x, int *y, atomic_t *v)
>>> {
>>>         int r0;
>>>         int r1;
>>>
>>>         WRITE_ONCE(*y, 1);
>>>         r0 = atomic_cmpxchg(v, 0, 1);
>>>         r1 = READ_ONCE(*x);
>>> }
>>>
>>> exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
>>
>> While I'm entirely willing to trust this can happen, I can't bring
>> this
>> in line with the A extension spec.
>>
>> Additionally it's not clear to me in how far all of this applies when
>> you don't really use LR/SC in the 4- and 8-byte cases (and going
>> forward
>> likely also not in the 1- and 2-byte case, utilizing Zabha when
>> available).
> It just explains which combinations of fences, lr/sc, amoswap, and
> .aq/.rl annotations can be used, and why the combinations introduced in
> this patch were chosen.

Except that I don't understand that explanation, iow why said combination
of values could be observed even when using suffixes properly.

>>> +    uint8_t new_val_pos = ((unsigned long)(ptr) & (0x4 -
>>> sizeof(*ptr))) * BITS_PER_BYTE; \
>>
>> Why uint8_t?
> It is enough to cover possible start bit position of value that should
> be updated, so I decided to use uint8_t.

Please take a look at the "Types" section in ./CODING_STYLE.

>>> +    { \
>>> +    case 1: \
>>> +    case 2: \
>>> +        ret__ = emulate_xchg_1_2(ptr, new__, sfx, pre, post); \
>>> +        break; \
>>> +    case 4: \
>>> +        __amoswap_generic(ptr, new__, ret__,\
>>> +                          ".w" sfx,  pre, post); \
>>> +        break; \
>>> +    case 8: \
>>> +        __amoswap_generic(ptr, new__, ret__,\
>>> +                          ".d" sfx,  pre, post); \
>>> +        break; \
>>
>> In io.h you make sure to avoid rv64-only insns. Here you don't. The
>> build would fail either way, but this still looks inconsistent.
>>
>> Also nit: Stray double blanks (twice) ahead of "pre". Plus with this
>> style of line continuation you want to consistently have exactly one
>> blank ahead of each backslash.
>>
>>> +    default: \
>>> +        STATIC_ASSERT_UNREACHABLE(); \
>>> +    } \
>>> +    ret__; \
>>> +})
>>> +
>>> +#define xchg_relaxed(ptr, x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) x_ = (x); \
>>
>> What is the purpose of this, when __xchg_generic() already does this
>> same
>> type conversion?
>>
>>> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),
>>> "", "", ""); \
>>> +})
>>> +
>>> +#define xchg_acquire(ptr, x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) x_ = (x); \
>>> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
>>> +                                       "", "",
>>> RISCV_ACQUIRE_BARRIER); \
>>> +})
>>> +
>>> +#define xchg_release(ptr, x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) x_ = (x); \
>>> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
>>> +                                       "", RISCV_RELEASE_BARRIER,
>>> ""); \
>>> +})
>>
>> As asked before: Are there going to be any uses of these three?
>> Common
>> code doesn't require them. And not needing to provide them would
>> simplify things quite a bit, it seems.
> I checked my private branches, and it looks like I introduced them only
> for the corresponding atomic operations (which were copied from the
> Linux kernel), which are not used either.
> 
> So we could definitely drop these macros for now, but should
> __xchg_generic() be updated as well? Looking at:
>  #define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
>                                     ".aqrl", "", "")
> the last two arguments become unneeded, but I wanted to keep them in
> case someone needs to bring back xchg_{release, acquire, ...}. Does
> that make sense?

It all depends on how it's justified in the description.

>>> +#define xchg(ptr, x) __xchg_generic(ptr, (unsigned long)(x),
>>> sizeof(*(ptr)), \
>>> +                                    ".aqrl", "", "")
>>
>> According to the earlier comment (where I don't follow the example
>> given),
>> is .aqrl sufficient here? And even if it was for the 4- and 8-byte
>> cases,
>> is it sufficient in the 1- and 2-byte emulation case (where it then
>> is
>> appended to just the SC)?
> If I understand your question correctly, then according to the spec
> .aqrl is enough for amo<op>.{w|d} instructions:
>    Linux Construct        RVWMO AMO Mapping
>    atomic <op> relaxed    amo<op>.{w|d}
>    atomic <op> acquire    amo<op>.{w|d}.aq
>    atomic <op> release    amo<op>.{w|d}.rl
>    atomic <op>            amo<op>.{w|d}.aqrl
> but in the LR/SC case you are right, SC requires a suffix too:
>    Linux Construct        RVWMO LR/SC Mapping
>    atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
>    atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
>    atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop
>                           OR fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
>    atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop
> 
> I will add sc_sfx to emulate_xchg_1_2(). The only remaining question is
> whether __xchg_generic(ptr, new, size, sfx, pre, post) should be changed to
> __xchg_generic(ptr, new, size, sfx1, sfx2, pre, post) to cover both
> cases: amo<op>.{w|d}.sfx1 and lr.{w|d}.sfx1 ... sc.{w|d}.sfx2?

I expect that's going to be necessary. In the end you'll see what's needed
when making the code adjustment.

Jan



* Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-03-07 10:46       ` Jan Beulich
@ 2024-03-07 11:01         ` Oleksii
  2024-03-07 11:11           ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 11:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Thu, 2024-03-07 at 11:46 +0100, Jan Beulich wrote:
> On 07.03.2024 11:35, Oleksii wrote:
> > On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
> > > On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > > > The header was taken from Linux kernl 6.4.0-rc1.
> > > > 
> > > > Addionally, were updated:
> > > > * add emulation of {cmp}xchg for 1/2 byte types using 32-bit
> > > > atomic
> > > >   access.
> > > > * replace tabs with spaces
> > > > * replace __* variale with *__
> > > > * introduce generic version of xchg_* and cmpxchg_*.
> > > > 
> > > > Implementation of 4- and 8-byte cases were left as it is done
> > > > in
> > > > Linux kernel as according to the RISC-V spec:
> > > > ```
> > > > Table A.5 ( only part of the table was copied here )
> > > > 
> > > > Linux Construct       RVWMO Mapping
> > > > atomic <op> relaxed    amo<op>.{w|d}
> > > > atomic <op> acquire    amo<op>.{w|d}.aq
> > > > atomic <op> release    amo<op>.{w|d}.rl
> > > > atomic <op>            amo<op>.{w|d}.aqrl
> > > > 
> > > > Linux Construct       RVWMO LR/SC Mapping
> > > > atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez
> > > > loop
> > > > atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez
> > > > loop
> > > > atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ;
> > > > bnez
> > > > loop OR
> > > >                        fence.tso; loop: lr.{w|d}; <op>;
> > > > sc.{w|d}∗ ;
> > > > bnez loop
> > > > atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl;
> > > > bnez
> > > > loop
> > > > 
> > > > The Linux mappings for release operations may seem stronger
> > > > than
> > > > necessary,
> > > > but these mappings are needed to cover some cases in which
> > > > Linux
> > > > requires
> > > > stronger orderings than the more intuitive mappings would
> > > > provide.
> > > > In particular, as of the time this text is being written, Linux
> > > > is
> > > > actively
> > > > debating whether to require load-load, load-store, and store-
> > > > store
> > > > orderings
> > > > between accesses in one critical section and accesses in a
> > > > subsequent critical
> > > > section in the same hart and protected by the same
> > > > synchronization
> > > > object.
> > > > Not all combinations of FENCE RW,W/FENCE R,RW mappings with
> > > > aq/rl
> > > > mappings
> > > > combine to provide such orderings.
> > > > There are a few ways around this problem, including:
> > > > 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This
> > > > suffices
> > > >    but is undesirable, as it defeats the purpose of the aq/rl
> > > > modifiers.
> > > > 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This
> > > > does
> > > > not
> > > >    currently work due to the lack of load and store opcodes
> > > > with aq
> > > > and rl
> > > >    modifiers.
> > > 
> > > As before I don't understand this point. Can you give an example
> > > of
> > > what
> > > sort of opcode / instruction is missing?
> > If I understand the spec correctly then l{b|h|w|d} and s{b|h|w|d}
> > instructions don't have aq or rl annotation.
> 
> How would load insns other than LR and store insns other than SC come
> into play here?

This part of the spec is not only about LR and SC, which cover the
Load-Exclusive and Store-Exclusive cases, but also about the
non-Exclusive cases, for which l{b|h|w|d} and s{b|h|w|d} are used.

~ Oleksii




* Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-03-07 11:01         ` Oleksii
@ 2024-03-07 11:11           ` Jan Beulich
  2024-03-07 12:28             ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-07 11:11 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 12:01, Oleksii wrote:
> On Thu, 2024-03-07 at 11:46 +0100, Jan Beulich wrote:
>> On 07.03.2024 11:35, Oleksii wrote:
>>> On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
>>>> On 26.02.2024 18:38, Oleksii Kurochko wrote:
>>>>> The header was taken from Linux kernl 6.4.0-rc1.
>>>>>
>>>>> Addionally, were updated:
>>>>> * add emulation of {cmp}xchg for 1/2 byte types using 32-bit
>>>>> atomic
>>>>>   access.
>>>>> * replace tabs with spaces
>>>>> * replace __* variale with *__
>>>>> * introduce generic version of xchg_* and cmpxchg_*.
>>>>>
>>>>> Implementation of 4- and 8-byte cases were left as it is done
>>>>> in
>>>>> Linux kernel as according to the RISC-V spec:
>>>>> ```
>>>>> Table A.5 ( only part of the table was copied here )
>>>>>
>>>>> Linux Construct       RVWMO Mapping
>>>>> atomic <op> relaxed    amo<op>.{w|d}
>>>>> atomic <op> acquire    amo<op>.{w|d}.aq
>>>>> atomic <op> release    amo<op>.{w|d}.rl
>>>>> atomic <op>            amo<op>.{w|d}.aqrl
>>>>>
>>>>> Linux Construct       RVWMO LR/SC Mapping
>>>>> atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez
>>>>> loop
>>>>> atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez
>>>>> loop
>>>>> atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ;
>>>>> bnez
>>>>> loop OR
>>>>>                        fence.tso; loop: lr.{w|d}; <op>;
>>>>> sc.{w|d}∗ ;
>>>>> bnez loop
>>>>> atomic <op>            loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl;
>>>>> bnez
>>>>> loop

Note the load and store forms mentioned here. How would ...

>>>>> The Linux mappings for release operations may seem stronger
>>>>> than
>>>>> necessary,
>>>>> but these mappings are needed to cover some cases in which
>>>>> Linux
>>>>> requires
>>>>> stronger orderings than the more intuitive mappings would
>>>>> provide.
>>>>> In particular, as of the time this text is being written, Linux
>>>>> is
>>>>> actively
>>>>> debating whether to require load-load, load-store, and store-
>>>>> store
>>>>> orderings
>>>>> between accesses in one critical section and accesses in a
>>>>> subsequent critical
>>>>> section in the same hart and protected by the same
>>>>> synchronization
>>>>> object.
>>>>> Not all combinations of FENCE RW,W/FENCE R,RW mappings with
>>>>> aq/rl
>>>>> mappings
>>>>> combine to provide such orderings.
>>>>> There are a few ways around this problem, including:
>>>>> 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This
>>>>> suffices
>>>>>    but is undesirable, as it defeats the purpose of the aq/rl
>>>>> modifiers.
>>>>> 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This
>>>>> does
>>>>> not
>>>>>    currently work due to the lack of load and store opcodes
>>>>> with aq
>>>>> and rl
>>>>>    modifiers.
>>>>
>>>> As before I don't understand this point. Can you give an example
>>>> of
>>>> what
>>>> sort of opcode / instruction is missing?
>>> If I understand the spec correctly then l{b|h|w|d} and s{b|h|w|d}
>>> instructions don't have aq or rl annotation.
>>
>> How would load insns other than LR and store insns other than SC come
>> into play here?
> 
> This part of the spec is not only about LR and SC, which cover the
> Load-Exclusive and Store-Exclusive cases, but also about the
> non-Exclusive cases, for which l{b|h|w|d} and s{b|h|w|d} are used.

... the spec (obviously covering other forms, too) be relevant when
reasoning whether just suffixes or actual barrier insns need using?

Jan



* Re: [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h
  2024-03-07 11:11           ` Jan Beulich
@ 2024-03-07 12:28             ` Oleksii
  0 siblings, 0 replies; 88+ messages in thread
From: Oleksii @ 2024-03-07 12:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Thu, 2024-03-07 at 12:11 +0100, Jan Beulich wrote:
> On 07.03.2024 12:01, Oleksii wrote:
> > On Thu, 2024-03-07 at 11:46 +0100, Jan Beulich wrote:
> > > On 07.03.2024 11:35, Oleksii wrote:
> > > > On Wed, 2024-03-06 at 15:56 +0100, Jan Beulich wrote:
> > > > > On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > > > > > The header was taken from Linux kernl 6.4.0-rc1.
> > > > > > 
> > > > > > Addionally, were updated:
> > > > > > * add emulation of {cmp}xchg for 1/2 byte types using 32-
> > > > > > bit
> > > > > > atomic
> > > > > >   access.
> > > > > > * replace tabs with spaces
> > > > > > * replace __* variale with *__
> > > > > > * introduce generic version of xchg_* and cmpxchg_*.
> > > > > > 
> > > > > > Implementation of 4- and 8-byte cases were left as it is
> > > > > > done
> > > > > > in
> > > > > > Linux kernel as according to the RISC-V spec:
> > > > > > ```
> > > > > > Table A.5 ( only part of the table was copied here )
> > > > > > 
> > > > > > Linux Construct       RVWMO Mapping
> > > > > > atomic <op> relaxed    amo<op>.{w|d}
> > > > > > atomic <op> acquire    amo<op>.{w|d}.aq
> > > > > > atomic <op> release    amo<op>.{w|d}.rl
> > > > > > atomic <op>            amo<op>.{w|d}.aqrl
> > > > > > 
> > > > > > Linux Construct       RVWMO LR/SC Mapping
> > > > > > atomic <op> relaxed    loop: lr.{w|d}; <op>; sc.{w|d}; bnez
> > > > > > loop
> > > > > > atomic <op> acquire    loop: lr.{w|d}.aq; <op>; sc.{w|d};
> > > > > > bnez
> > > > > > loop
> > > > > > atomic <op> release    loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗
> > > > > > ;
> > > > > > bnez
> > > > > > loop OR
> > > > > >                        fence.tso; loop: lr.{w|d}; <op>;
> > > > > > sc.{w|d}∗ ;
> > > > > > bnez loop
> > > > > > atomic <op>            loop: lr.{w|d}.aq; <op>;
> > > > > > sc.{w|d}.aqrl;
> > > > > > bnez
> > > > > > loop
> 
> Note the load and store forms mentioned here. How would ...
> 
> > > > > > The Linux mappings for release operations may seem stronger
> > > > > > than
> > > > > > necessary,
> > > > > > but these mappings are needed to cover some cases in which
> > > > > > Linux
> > > > > > requires
> > > > > > stronger orderings than the more intuitive mappings would
> > > > > > provide.
> > > > > > In particular, as of the time this text is being written,
> > > > > > Linux
> > > > > > is
> > > > > > actively
> > > > > > debating whether to require load-load, load-store, and
> > > > > > store-
> > > > > > store
> > > > > > orderings
> > > > > > between accesses in one critical section and accesses in a
> > > > > > subsequent critical
> > > > > > section in the same hart and protected by the same
> > > > > > synchronization
> > > > > > object.
> > > > > > Not all combinations of FENCE RW,W/FENCE R,RW mappings with
> > > > > > aq/rl
> > > > > > mappings
> > > > > > combine to provide such orderings.
> > > > > > There are a few ways around this problem, including:
> > > > > > 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl.
> > > > > > This
> > > > > > suffices
> > > > > >    but is undesirable, as it defeats the purpose of the
> > > > > > aq/rl
> > > > > > modifiers.
> > > > > > 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW.
> > > > > > This
> > > > > > does
> > > > > > not
> > > > > >    currently work due to the lack of load and store opcodes
> > > > > > with aq
> > > > > > and rl
> > > > > >    modifiers.
> > > > > 
> > > > > As before I don't understand this point. Can you give an
> > > > > example
> > > > > of
> > > > > what
> > > > > sort of opcode / instruction is missing?
> > > > If I understand the spec correctly then l{b|h|w|d} and
> > > > s{b|h|w|d}
> > > > instructions don't have aq or rl annotation.
> > > 
> > > How would load insns other that LR and store insns other than SC
> > > come
> > > into play here?
> > 
> > This part of the spec. is not only about LR and SC which cover
> > Load-
> > Exclusive and Store-Exclusive cases, but also about non-Exclusive
> > cases
> > for each l{b|h|w|d} and s{b|h|w|d} are used.
> 
> ... the spec (obviously covering other forms, too) be relevant when
> reasoning whether just suffixes or actual barrier insns need using?
Based on the 3 rules which are in the commit message and in the spec,
it makes no difference which option is used (at least, I wasn't able to
find an explanation in that paragraph), but based on the tables provided
in the same paragraph (and partially in the commit message), if an
instruction has an .aq or .rl annotation, it should be used.

And speaking about the xchg and cmpxchg cases and their implementations,
all instructions have .aq/.rl suffixes, so we'd rather use suffixes
instead of barriers.
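
For comparison, the two styles can be sketched in portable C11. This is
not the patch's code: <stdatomic.h> stands in for the hand-written asm,
the function names are made up for illustration, and the expectation
that a RISC-V compiler emits an AMO with .aq/.rl suffixes for the first
function and separate fence instructions for the second is an assumption
about code generation.

```c
#include <stdatomic.h>

/* Ordering attached to the operation itself (the "suffix" style). */
static inline int xchg_suffix_style(atomic_int *p, int n)
{
    return atomic_exchange_explicit(p, n, memory_order_acq_rel);
}

/* Relaxed operation bracketed by explicit fences (the "barrier" style). */
static inline int xchg_fence_style(atomic_int *p, int n)
{
    int old;

    atomic_thread_fence(memory_order_release);
    old = atomic_exchange_explicit(p, n, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);

    return old;
}
```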

Does it make sense?

~ Oleksii



* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-06 14:13   ` Jan Beulich
@ 2024-03-07 13:01     ` Oleksii
  2024-03-07 13:24       ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 13:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
> > +/* Generic IO read/write.  These perform native-endian accesses.
> > */
> > +static inline void __raw_writeb(uint8_t val, volatile void __iomem
> > *addr)
> > +{
> > +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
> > +}
> 
> I realize this is like Linux has it, but how is the compiler to know
> that
> *addr is being access here? 
The assembler syntax tells the compiler that: 0(%1) means the memory
location pointed to by the address in register %1.

> If the omission of respective constraints here
> and below is intentional, I think a comment (covering all instances)
> is
> needed. Note that while supposedly cloned from Arm code, Arm variants
> do
> have such constraints in Linux.
> 
It uses these constraints only in arm32:
#define __raw_writeb __raw_writeb
static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
{
	asm volatile("strb %1, %0"
		     : : "Qo" (*(volatile u8 __force *)addr), "r" (val));
}

But in case of arm64:

#define __raw_writeb __raw_writeb
static __always_inline void __raw_writeb(u8 val, volatile void __iomem *addr)
{
	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
}

And again, looking at the definitions, they use a different form of the
strb instruction, and in the arm64 case they use [%1], which tells the
compiler that %1 is an address which should be dereferenced.

~ Oleksii



* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 13:01     ` Oleksii
@ 2024-03-07 13:24       ` Jan Beulich
  2024-03-07 13:44         ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-07 13:24 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 14:01, Oleksii wrote:
> On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
>>> +/* Generic IO read/write.  These perform native-endian accesses.
>>> */
>>> +static inline void __raw_writeb(uint8_t val, volatile void __iomem
>>> *addr)
>>> +{
>>> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
>>> +}
>>
>> I realize this is like Linux has it, but how is the compiler to know
>> that
>> *addr is being access here? 
> Assembler syntax told compiler that. 0(%1) - means that the memory
> location pointed to by the address in register %1.

No, the compiler doesn't decompose the string to figure out how operands
are used. That's what the constraints are for. The only two things the
compiler does with the string are replacing % operators and counting line
separators.
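
A minimal sketch of a byte store that describes the accessed memory
through an operand constraint rather than through the instruction text.
The name raw_writeb_sketch and the non-RISC-V fallback branch are
assumptions added so the snippet builds on any architecture; the "=m"
output operand is the part that tells the compiler *addr is written.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only. The "=m" output operand
 * names the byte at *addr as written by the asm, so the compiler does
 * not need to understand the instruction string.
 */
static inline void raw_writeb_sketch(uint8_t val, volatile void *addr)
{
#if defined(__riscv)
    /* %0 expands to an address expression such as "0(a5)". */
    asm volatile ( "sb %1, %0"
                   : "=m" (*(volatile uint8_t *)addr)
                   : "r" (val) );
#else
    /* Fallback so the sketch builds on other architectures. */
    *(volatile uint8_t *)addr = val;
#endif
}
```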

>> If the omission of respective constraints here
>> and below is intentional, I think a comment (covering all instances)
>> is
>> needed. Note that while supposedly cloned from Arm code, Arm variants
>> do
>> have such constraints in Linux.
>>
> It uses this constains only in arm32:
> #define __raw_writeb __raw_writeb
> static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
> {
> 	asm volatile("strb %1, %0"
> 		     : : "Qo" (*(volatile u8 __force *)addr), "r"
> (val));
> }
> 
> But in case of arm64:
> 
> #define __raw_writeb __raw_writeb
> static __always_inline void __raw_writeb(u8 val, volatile void __iomem
> *addr)
> {
> 	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
> }
> 
> And again looking at the defintion they use different option of strb
> instruction, and in case of strb they use [%1] which tells compiler
> that %1 is addressed which should be dereferenced.

Same bug here then; I happened to look at Arm32 only. As mentioned in
the other patch using what's provided here, the problem becomes more
than latent only there. And I can't spot such use in Arm64 code, so it
is likely only a latent bug there.

Jan



* Re: [PATCH v5 13/23] xen/riscv: introduce atomic.h
  2024-03-06 15:31   ` Jan Beulich
@ 2024-03-07 13:30     ` Oleksii
  2024-03-07 15:40       ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 13:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Wed, 2024-03-06 at 16:31 +0100, Jan Beulich wrote:
> On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/atomic.h
> > @@ -0,0 +1,296 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Taken and modified from Linux.
> > + *
> > + * The following changes were done:
> > + * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
> > updated
> > + *     to use__*xchg_generic()
> > + * - drop casts in write_atomic() as they are unnecessary
> > + * - drop introduction of WRITE_ONCE() and READ_ONCE().
> > + *   Xen provides ACCESS_ONCE()
> > + * - remove zero-length array access in read_atomic()
> > + * - drop defines similar to pattern
> > + *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
> > + * - move not RISC-V specific functions to asm-generic/atomics-
> > ops.h
> > + * 
> > + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> > + * Copyright (C) 2012 Regents of the University of California
> > + * Copyright (C) 2017 SiFive
> > + * Copyright (C) 2024 Vates SAS
> > + */
> > +
> > +#ifndef _ASM_RISCV_ATOMIC_H
> > +#define _ASM_RISCV_ATOMIC_H
> > +
> > +#include <xen/atomic.h>
> > +
> > +#include <asm/cmpxchg.h>
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#include <asm-generic/atomic-ops.h>
> 
> While, because of the forward decls in xen/atomic.h, having this
> #include
> works, I wonder if it wouldn't better be placed further down. The
> compiler
> will likely have an easier time when it sees the inline definitions
> ahead
> of any uses.
Do you mean to move it after #define __atomic_release_fence() ?

> 
> > +void __bad_atomic_size(void);
> > +
> > +/*
> > + * Legacy from Linux kernel. For some reason they wanted to have
> > ordered
> > + * read/write access. Thereby read* is used instead of
> > read<X>_cpu()
> > + */
> > +static always_inline void read_atomic_size(const volatile void *p,
> > +                                           void *res,
> > +                                           unsigned int size)
> > +{
> > +    switch ( size )
> > +    {
> > +    case 1: *(uint8_t *)res = readb(p); break;
> > +    case 2: *(uint16_t *)res = readw(p); break;
> > +    case 4: *(uint32_t *)res = readl(p); break;
> > +    case 8: *(uint32_t *)res  = readq(p); break;
> 
> This is the point where the lack of constraints in io.h (see my
> respective
> comment) becomes actually harmful: You're accessing not MMIO, but
> compiler-
> visible variables here. It needs to know which ones are read ...
> 
> > +    default: __bad_atomic_size(); break;
> > +    }
> > +}
> > +
> > +#define read_atomic(p) ({                               \
> > +    union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
> > +    read_atomic_size(p, x_.c, sizeof(*p));              \
> > +    x_.val;                                             \
> > +})
> > +
> > +#define write_atomic(p, x)                              \
> > +({                                                      \
> > +    typeof(*p) x__ = (x);                               \
> > +    switch ( sizeof(*p) )                               \
> > +    {                                                   \
> > +    case 1: writeb(x__,  p); break;                     \
> > +    case 2: writew(x__, p); break;                      \
> > +    case 4: writel(x__, p); break;                      \
> > +    case 8: writeq(x__, p); break;                      \
> 
> ... or written.
> 
> Nit: There's a stray blank in the writeb() invocation.
> 
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +    x__;                                                \
> > +})
> > +
> > +#define add_sized(p, x)                                 \
> > +({                                                      \
> > +    typeof(*(p)) x__ = (x);                             \
> > +    switch ( sizeof(*(p)) )                             \
> 
> Like you have it here, {read,write}_atomic() also need p properly
> parenthesized. There look to be more parenthesization issues further
> down.
> 
> > +    {                                                   \
> > +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> > +    case 2: writew(read_atomic(p) + x__, p); break;     \
> > +    case 4: writel(read_atomic(p) + x__, p); break;     \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +})
> 
> Any reason this doesn't have an 8-byte case? x86'es at least has one.
I just missed adding it and got no compiler error, but I'll add case 8.
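
A sketch of what an add_sized() with the 8-byte case could look like.
This is illustrative only: plain volatile accesses stand in for the
readX()/writeX() helpers of the patch, and add_sized_sketch and
bad_size_sketch are hypothetical names. As in the original, the
read-modify-write as a whole is not atomic.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's __bad_atomic_size() hook. */
static inline void bad_size_sketch(void)
{
    abort();
}

/* Dispatch on the pointee size; only the matching case is ever executed. */
#define add_sized_sketch(p, x)                                  \
do {                                                            \
    __typeof__(*(p)) x__ = (x);                                 \
    switch ( sizeof(*(p)) )                                     \
    {                                                           \
    case 1: *(volatile uint8_t  *)(p) += x__; break;            \
    case 2: *(volatile uint16_t *)(p) += x__; break;            \
    case 4: *(volatile uint32_t *)(p) += x__; break;            \
    case 8: *(volatile uint64_t *)(p) += x__; break;            \
    default: bad_size_sketch(); break;                          \
    }                                                           \
} while ( 0 )
```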

> 
> > +#define __atomic_acquire_fence() \
> > +    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
> > +
> > +#define __atomic_release_fence() \
> > +    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )
> 
> Elsewhere you use asm volatile() - why __asm__ __volatile__() here?
> Or why not there (cmpxchg.h, io.h)?
That is how it was defined in the Linux kernel, so I decided to keep
their code style, but considering these macros are unlikely to change I
can update these lines to use asm volatile.

> 
> > +/*
> > + * First, the atomic ops that have no ordering constraints and
> > therefor don't
> > + * have the AQ or RL bits set.  These don't return anything, so
> > there's only
> > + * one version to worry about.
> > + */
> > +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> > +static inline                                               \
> > +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> > +{                                                           \
> > +    __asm__ __volatile__ (                                  \
> > +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> > +        : "+A" (v->counter)                                 \
> > +        : "r" (I)                                           \
> > +        : "memory" );                                       \
> > +}                                                           \
> > +
> > +#define ATOMIC_OPS(op, asm_op, I)                           \
> > +        ATOMIC_OP (op, asm_op, I, w, int,   )
> > +
> > +ATOMIC_OPS(add, add,  i)
> > +ATOMIC_OPS(sub, add, -i)
> > +ATOMIC_OPS(and, and,  i)
> > +ATOMIC_OPS( or,  or,  i)
> > +ATOMIC_OPS(xor, xor,  i)
> > +
> > +#undef ATOMIC_OP
> > +#undef ATOMIC_OPS
> > +
> > +/*
> > + * Atomic ops that have ordered, relaxed, acquire, and release
> > variants.
> > + * There's two flavors of these: the arithmatic ops have both
> > fetch and return
> > + * versions, while the logical ops only have fetch versions.
> > + */
> > +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type,
> > prefix)    \
> > +static
> > inline                                                       \
> > +c_type atomic##prefix##_fetch_##op##_relaxed(c_type
> > i,              \
> > +                         atomic##prefix##_t
> > *v)                     \
> > +{                                                                 
> >   \
> > +    register c_type
> > ret;                                            \
> > +    __asm__ __volatile__
> > (                                          \
> > +        "   amo" #asm_op "." #asm_type " %1, %2,
> > %0"                \
> > +        : "+A" (v->counter), "=r"
> > (ret)                             \
> > +        : "r"
> > (I)                                                   \
> > +        : "memory"
> > );                                               \
> > +    return
> > ret;                                                     \
> > +}                                                                 
> >   \
> > +static
> > inline                                                       \
> > +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t
> > *v) \
> > +{                                                                 
> >   \
> > +    register c_type
> > ret;                                            \
> > +    __asm__ __volatile__
> > (                                          \
> > +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2,
> > %0"          \
> > +        : "+A" (v->counter), "=r"
> > (ret)                             \
> > +        : "r"
> > (I)                                                   \
> > +        : "memory"
> > );                                               \
> > +    return
> > ret;                                                     \
> > +}
> > +
> > +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type,
> > prefix) \
> > +static
> > inline                                                           \
> > +c_type atomic##prefix##_##op##_return_relaxed(c_type
> > i,                 \
> > +                          atomic##prefix##_t
> > *v)                        \
> > +{                                                                 
> >       \
> > +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op
> > I;      \
> > +}                                                                 
> >       \
> > +static
> > inline                                                           \
> > +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t
> > *v)  \
> > +{                                                                 
> >       \
> > +        return atomic##prefix##_fetch_##op(i, v) c_op
> > I;                \
> > +}
> > +
> > +#define ATOMIC_OPS(op, asm_op, c_op,
> > I)                                 \
> > +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,  
> > )               \
> > +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
> 
> What purpose is the last macro argument when you only ever pass
> nothing
> for it (here and ...
> 
> > +ATOMIC_OPS(add, add, +,  i)
> > +ATOMIC_OPS(sub, add, +, -i)
> > +
> > +#undef ATOMIC_OPS
> > +
> > +#define ATOMIC_OPS(op, asm_op, I) \
> > +        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )
> 
> ... here)?
for generic ATOMIC64 it is not used:

#ifdef CONFIG_GENERIC_ATOMIC64
#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
#else
#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )               \
        ATOMIC_FETCH_OP( op, asm_op,       I, d, s64, 64)               \
        ATOMIC_OP_RETURN(op, asm_op, c_op, I, d, s64, 64)
#endif

(The code is from the Linux kernel.)
Only the CONFIG_GENERIC_ATOMIC64=y variant was ported to Xen.
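
To show what the trailing prefix argument does when it is non-empty,
here is a small sketch of the token-pasting pattern. The demo_* names
are hypothetical, and the GCC/Clang __atomic_fetch_add builtin is used
in place of the "amoadd" inline asm so the snippet builds on any
architecture.

```c
#include <stdint.h>

typedef struct { int counter; } demo_atomic_t;
typedef struct { int64_t counter; } demo_atomic64_t;

/*
 * 'prefix' is pasted into both the function name and the type name.
 * An empty argument yields demo_atomic_*, "64" yields demo_atomic64_*.
 */
#define DEMO_FETCH_ADD(c_type, prefix)                                  \
static inline c_type demo_atomic##prefix##_fetch_add(c_type i,          \
                                         demo_atomic##prefix##_t *v)    \
{                                                                       \
    return __atomic_fetch_add(&v->counter, i, __ATOMIC_SEQ_CST);        \
}

DEMO_FETCH_ADD(int,       )   /* defines demo_atomic_fetch_add()   */
DEMO_FETCH_ADD(int64_t, 64)   /* defines demo_atomic64_fetch_add() */

#undef DEMO_FETCH_ADD
```

With only the 32-bit instantiation present, as in the ported code, the
prefix argument is always empty.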

> 
> > +ATOMIC_OPS(and, and, i)
> > +ATOMIC_OPS( or,  or, i)
> > +ATOMIC_OPS(xor, xor, i)
> > +
> > +#undef ATOMIC_OPS
> > +
> > +#undef ATOMIC_FETCH_OP
> > +#undef ATOMIC_OP_RETURN
> > +
> > +/* This is required to provide a full barrier on success. */
> > +static inline int atomic_add_unless(atomic_t *v, int a, int u)
> > +{
> > +       int prev, rc;
> > +
> > +    __asm__ __volatile__ (
> > +        "0: lr.w     %[p],  %[c]\n"
> > +        "   beq      %[p],  %[u], 1f\n"
> > +        "   add      %[rc], %[p], %[a]\n"
> > +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> > +        "   bnez     %[rc], 0b\n"
> > +        RISCV_FULL_BARRIER
> > +        "1:\n"
> > +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> > +        : [a] "r" (a), [u] "r" (u)
> > +        : "memory");
> > +    return prev;
> > +}
> > +
> > +/*
> > + * atomic_{cmp,}xchg is required to have exactly the same ordering
> > semantics as
> > + * {cmp,}xchg and the operations that return, so they need a full
> > barrier.
> > + */
> > +#define ATOMIC_OP(c_t, prefix, size)                            \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size, "", "", "");  \
> 
> The inner parentheses aren't really needed here, are they?
> 
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size,               \
> > +                          "", "", RISCV_ACQUIRE_BARRIER);       \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size,               \
> > +                          "", RISCV_RELEASE_BARRIER, "");       \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size,               \
> > +                          ".aqrl", "", "");                     \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
> > +                     c_t o, c_t n)                              \
> > +{                                                               \
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             "", "", "");                       \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
> > +                     c_t o, c_t n)                              \
> > +{                                                               \
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             "", "", RISCV_ACQUIRE_BARRIER);    \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
> > +                     c_t o, c_t n)                              \
> > +{	                                                          
> >   \
> 
> A hard tab looks to have been left here.
> 
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             "", RISCV_RELEASE_BARRIER, "");    \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n)
> > \
> > +{                                                               \
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             ".rl", "", " fence rw, rw\n");     \
> > +}
> > +
> > +#define ATOMIC_OPS() \
> > +    ATOMIC_OP(int,   , 4)
> > +
> > +ATOMIC_OPS()
> > +
> > +#undef ATOMIC_OPS
> > +#undef ATOMIC_OP
> > +
> > +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> > +{
> > +       int prev, rc;
> > +
> > +    __asm__ __volatile__ (
> > +        "0: lr.w     %[p],  %[c]\n"
> > +        "   sub      %[rc], %[p], %[o]\n"
> > +        "   bltz     %[rc], 1f\n"
> > +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> > +        "   bnez     %[rc], 0b\n"
> > +        "   fence    rw, rw\n"
> > +        "1:\n"
> > +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> > +        : [o] "r" (offset)
> > +        : "memory" );
> > +    return prev - offset;
> > +}
> > +
> > +#define atomic_dec_if_positive(v)	atomic_sub_if_positive(v,
> > 1)
> 
> Hmm, PPC for some reason also has the latter, but for both: Are they
> indeed
> going to be needed in RISC-V code? They certainly look unnecessary
> for the
> purpose of this series (allowing common code to build).
I checked my private branches and I still don't use it, so it makes
sense to drop it.

> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/atomic-ops.h
> > @@ -0,0 +1,92 @@
> > +#/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
> > +#define _ASM_GENERIC_ATOMIC_OPS_H_
> > +
> > +#include <xen/atomic.h>
> > +#include <xen/lib.h>
> 
> If I'm not mistaken this header provides default implementations for
> every
> xen/atomic.h-provided forward inline declaration that can be
> synthesized
> from other atomic functions. I think a comment to this effect would
> want
> adding somewhere here.
I think we can drop this inclusion here, as an inclusion of
asm-generic/atomic-ops.h will always go together with an inclusion of
xen/atomic.h.

~ Oleksii


* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 13:24       ` Jan Beulich
@ 2024-03-07 13:44         ` Oleksii
  2024-03-07 15:32           ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 13:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
> On 07.03.2024 14:01, Oleksii wrote:
> > On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
> > > > +/* Generic IO read/write.  These perform native-endian
> > > > accesses.
> > > > */
> > > > +static inline void __raw_writeb(uint8_t val, volatile void
> > > > __iomem
> > > > *addr)
> > > > +{
> > > > +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
> > > > +}
> > > 
> > > I realize this is like Linux has it, but how is the compiler to
> > > know
> > > that
> > > *addr is being access here? 
> > Assembler syntax told compiler that. 0(%1) - means that the memory
> > location pointed to by the address in register %1.
> 
> No, the compiler doesn't decompose the string to figure how operands
> are used. That's what the constraints are for. The only two things
> the
> compiler does with the string is replace % operators and count line
> separators.
It looks like I am missing something.

addr -> some register (because of the "r" constraint).
val -> also a register (because of the "r" constraint).

So the compiler will insert an instruction:
 sb reg1, 0(reg2)

which means *(uint8_t *)reg2 = (uint8_t)reg1.

What am I missing?

~ Oleksii




* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 13:44         ` Oleksii
@ 2024-03-07 15:32           ` Jan Beulich
  2024-03-07 16:21             ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-07 15:32 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 14:44, Oleksii wrote:
> On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
>> On 07.03.2024 14:01, Oleksii wrote:
>>> On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
>>>>> +/* Generic IO read/write.  These perform native-endian
>>>>> accesses.
>>>>> */
>>>>> +static inline void __raw_writeb(uint8_t val, volatile void
>>>>> __iomem
>>>>> *addr)
>>>>> +{
>>>>> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
>>>>> +}
>>>>
>>>> I realize this is like Linux has it, but how is the compiler to
>>>> know
>>>> that
>>>> *addr is being access here? 
>>> Assembler syntax told compiler that. 0(%1) - means that the memory
>>> location pointed to by the address in register %1.
>>
>> No, the compiler doesn't decompose the string to figure how operands
>> are used. That's what the constraints are for. The only two things
>> the
>> compiler does with the string is replace % operators and count line
>> separators.
> It looks like I am missing something.
> 
> addr -> a some register ( because of "r" contraint ).
> val -> is also register ( because of "r" contraint ).
> 
> So the compiler will update instert an instruction:
>  sb reg1, 0(reg2)
> 
> what means *(uint_8 *)reg2 = (uint8_t)reg1.
> 
> What am I missing?

The fact that the compiler will not know that *(uint8_t *)reg2 actually
changes across this asm(). It may therefore continue to hold a cached
value in a register, without knowing that its contents went stale.
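
The hazard can be made concrete with a small sketch. All names here are
hypothetical and the non-RISC-V branch exists only so that it builds
elsewhere. If the asm had only "r" operands and no "memory" clobber, the
compiler would be allowed to assume 'flag' still holds 0 after the asm
and return that constant; the "=m" operand rules this out.

```c
#include <stdint.h>

static uint8_t flag;

/* Store helper whose asm names the written memory via "=m". */
static inline void store_byte_tracked(uint8_t val, uint8_t *p)
{
#if defined(__riscv)
    asm volatile ( "sb %1, %0" : "=m" (*p) : "r" (val) );
#else
    /* Fallback so the sketch builds on other architectures. */
    *p = val;
#endif
}

/*
 * Writes 0 from C, overwrites it through the helper, then reads back.
 * Because the helper declares *p as an output, the final load cannot be
 * replaced by the earlier constant.
 */
static uint8_t set_and_read_back(void)
{
    flag = 0;
    store_byte_tracked(1, &flag);
    return flag;
}
```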

Jan



* Re: [PATCH v5 13/23] xen/riscv: introduce atomic.h
  2024-03-07 13:30     ` Oleksii
@ 2024-03-07 15:40       ` Jan Beulich
  0 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-03-07 15:40 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 14:30, Oleksii wrote:
> On Wed, 2024-03-06 at 16:31 +0100, Jan Beulich wrote:
>> On 26.02.2024 18:38, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/atomic.h
>>> @@ -0,0 +1,296 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Taken and modified from Linux.
>>> + *
>>> + * The following changes were done:
>>> + * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
>>> updated
>>> + *     to use__*xchg_generic()
>>> + * - drop casts in write_atomic() as they are unnecessary
>>> + * - drop introduction of WRITE_ONCE() and READ_ONCE().
>>> + *   Xen provides ACCESS_ONCE()
>>> + * - remove zero-length array access in read_atomic()
>>> + * - drop defines similar to pattern
>>> + *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
>>> + * - move not RISC-V specific functions to asm-generic/atomics-
>>> ops.h
>>> + * 
>>> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
>>> + * Copyright (C) 2012 Regents of the University of California
>>> + * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2024 Vates SAS
>>> + */
>>> +
>>> +#ifndef _ASM_RISCV_ATOMIC_H
>>> +#define _ASM_RISCV_ATOMIC_H
>>> +
>>> +#include <xen/atomic.h>
>>> +
>>> +#include <asm/cmpxchg.h>
>>> +#include <asm/fence.h>
>>> +#include <asm/io.h>
>>> +#include <asm/system.h>
>>> +
>>> +#include <asm-generic/atomic-ops.h>
>>
>> While, because of the forward decls in xen/atomic.h, having this
>> #include
>> works, I wonder if it wouldn't better be placed further down. The
>> compiler
>> will likely have an easier time when it sees the inline definitions
>> ahead
>> of any uses.
> Do you mean to move it after #define __atomic_release_fence() ?

Perhaps yet further down, at least after the arithmetic ops were defined.

>>> --- /dev/null
>>> +++ b/xen/include/asm-generic/atomic-ops.h
>>> @@ -0,0 +1,92 @@
>>> +#/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
>>> +#define _ASM_GENERIC_ATOMIC_OPS_H_
>>> +
>>> +#include <xen/atomic.h>
>>> +#include <xen/lib.h>
>>
>> If I'm not mistaken this header provides default implementations for
>> every
>> xen/atomic.h-provided forward inline declaration that can be
>> synthesized
>> from other atomic functions. I think a comment to this effect would
>> want
>> adding somewhere here.
> I think we can drop this inclusion here as inclusion of asm-
> generic/atomic-ops.h will be always go with inclusion of xen/atomic.h.

I'm okay with dropping that include, but that wasn't the purpose of my
comment. I was rather asking for a comment to be added here stating
what is (not) to be present in this header.

Jan



* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 15:32           ` Jan Beulich
@ 2024-03-07 16:21             ` Oleksii
  2024-03-07 17:14               ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 16:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Thu, 2024-03-07 at 16:32 +0100, Jan Beulich wrote:
> On 07.03.2024 14:44, Oleksii wrote:
> > On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
> > > On 07.03.2024 14:01, Oleksii wrote:
> > > > On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
> > > > > > +/* Generic IO read/write.  These perform native-endian
> > > > > > accesses.
> > > > > > */
> > > > > > +static inline void __raw_writeb(uint8_t val, volatile void
> > > > > > __iomem
> > > > > > *addr)
> > > > > > +{
> > > > > > +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r"
> > > > > > (addr) );
> > > > > > +}
> > > > > 
> > > > > I realize this is like Linux has it, but how is the compiler
> > > > > to
> > > > > know
> > > > > that
> > > > > *addr is being access here? 
> > > > Assembler syntax told compiler that. 0(%1) - means that the
> > > > memory
> > > > location pointed to by the address in register %1.
> > > 
> > > No, the compiler doesn't decompose the string to figure how
> > > operands
> > > are used. That's what the constraints are for. The only two
> > > things
> > > the
> > > compiler does with the string is replace % operators and count
> > > line
> > > separators.
> > It looks like I am missing something.
> > 
> > addr -> a some register ( because of "r" contraint ).
> > val -> is also register ( because of "r" contraint ).
> > 
> > So the compiler will update instert an instruction:
> >  sb reg1, 0(reg2)
> > 
> > what means *(uint_8 *)reg2 = (uint8_t)reg1.
> > 
> > What am I missing?
> 
> The fact that the compiler will not know that *(uint_8 *)reg2
> actually
> changes across this asm(). It may therefore continue to hold a cached
> value in a register, without knowing that its contents went stale.
Then it makes sense to me. Thanks. It explains why +Q is needed, but
I don't understand why constraint 'o' isn't used for __raw_writew, but
is used for __raw_writeb:

   static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
   {
           asm volatile("strb %1, %0"
                        : "+Qo" (*(volatile u8 __force *)addr)
                        : "r" (val));
   }
   
   static inline void __raw_writew(u16 val, volatile void __iomem *addr)
   {
           asm volatile("strh %1, %0"
                        : "+Q" (*(volatile u16 __force *)addr)
                        : "r" (val));
   } 
   
If I understand correctly, 'o' means that the address is offsettable,
so why can't addr be offsettable in every case?

And one more thing, in Xen constraint "+" is used, but in Linux it was
dropped:
https://patchwork.kernel.org/project/linux-arm-kernel/patch/1426958753-26903-1-git-send-email-peter@hurleysoftware.com/

To me it looks like constraints should be always "+Qo".

~ Oleksii




^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 16:21             ` Oleksii
@ 2024-03-07 17:14               ` Jan Beulich
  2024-03-07 20:49                 ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-07 17:14 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 17:21, Oleksii wrote:
> On Thu, 2024-03-07 at 16:32 +0100, Jan Beulich wrote:
>> On 07.03.2024 14:44, Oleksii wrote:
>>> On Thu, 2024-03-07 at 14:24 +0100, Jan Beulich wrote:
>>>> On 07.03.2024 14:01, Oleksii wrote:
>>>>> On Wed, 2024-03-06 at 15:13 +0100, Jan Beulich wrote:
>>>>>>> +/* Generic IO read/write.  These perform native-endian
>>>>>>> accesses.
>>>>>>> */
>>>>>>> +static inline void __raw_writeb(uint8_t val, volatile void
>>>>>>> __iomem
>>>>>>> *addr)
>>>>>>> +{
>>>>>>> +    asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r"
>>>>>>> (addr) );
>>>>>>> +}
>>>>>>
>>>>>> I realize this is like Linux has it, but how is the compiler
>>>>>> to
>>>>>> know
>>>>>> that
>>>>>> *addr is being access here? 
>>>>> Assembler syntax told compiler that. 0(%1) - means that the
>>>>> memory
>>>>> location pointed to by the address in register %1.
>>>>
>>>> No, the compiler doesn't decompose the string to figure how
>>>> operands
>>>> are used. That's what the constraints are for. The only two
>>>> things
>>>> the
>>>> compiler does with the string is replace % operators and count
>>>> line
>>>> separators.
>>> It looks like I am missing something.
>>>
>>> addr -> a some register ( because of "r" contraint ).
>>> val -> is also register ( because of "r" contraint ).
>>>
>>> So the compiler will update instert an instruction:
>>>  sb reg1, 0(reg2)
>>>
>>> what means *(uint_8 *)reg2 = (uint8_t)reg1.
>>>
>>> What am I missing?
>>
>> The fact that the compiler will not know that *(uint_8 *)reg2
>> actually
>> changes across this asm(). It may therefore continue to hold a cached
>> value in a register, without knowing that its contents went stale.
> Then it makes sense to me. Thanks.

FTAOD similar considerations apply to memory reads. The compiler may need
to know that values held in registers first need writing back to memory
before an asm() can be invoked.
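
To illustrate the read side, here is a minimal sketch (not the code from
the patch). raw_read32() is a hypothetical name, and the non-RISC-V
branch is only a fallback so the snippet builds on other targets.
Passing *addr as an "m" input operand is what tells the compiler that
the asm() reads that location, so any pending store to it has to reach
memory first:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only. */
static inline uint32_t raw_read32(const volatile uint32_t *addr)
{
    uint32_t val;

#if defined(__riscv)
    /* "m" (*addr): the asm() reads this memory location. */
    __asm__ __volatile__ ( "lw %0, %1" : "=r" (val) : "m" (*addr) );
#else
    /* Fallback for non-RISC-V builds (assumption, not part of the thread). */
    val = *addr;
#endif

    return val;
}
```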

> It explains why it is needed +Q, but
> I don't understand why constraint 'o' isn't used for __raw_writew, but
> was used for __raw_writeb:
> 
>    static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>    {
>            asm volatile("strb %1, %0"
>                         : "+Qo" (*(volatile u8 __force *)addr)
>                         : "r" (val));
>    }
>    
>    static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>    {
>            asm volatile("strh %1, %0"
>                         : "+Q" (*(volatile u16 __force *)addr)
>                         : "r" (val));
>    } 
>    
> If I understand correctly 'o' that the address is offsettable, so why
> addr can not be offsettable for everyone case?

I don't know; there may be peculiarities in RISC-V specific constraints.

> And one more thing, in Xen constraint "+" is used, but in Linux it was
> dropped:
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/1426958753-26903-1-git-send-email-peter@hurleysoftware.com/
> 
> To me it looks like constraints should be always "+Qo".

For plain writes it should at least be "=Qo" then, yes. To me making those
input operands on Arm can't have been quite right.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 17:14               ` Jan Beulich
@ 2024-03-07 20:49                 ` Oleksii
  2024-03-07 20:54                   ` Oleksii
  2024-03-08  7:18                   ` Jan Beulich
  0 siblings, 2 replies; 88+ messages in thread
From: Oleksii @ 2024-03-07 20:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> For plain writes it should at least be "=Qo" then, yes.
Constraint Q is a machine-specific constraint, and I am not sure that
it makes sense to use "=o" only; that is probably the reason why "r"
alone is enough. Does it make sense?

>  To me making those
> input operands on Arm can't have been quite right.
I don't understand why they are both inputs; logically it looks like an
address should be an output.

~ Oleksii



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 20:49                 ` Oleksii
@ 2024-03-07 20:54                   ` Oleksii
  2024-03-08  7:26                     ` Jan Beulich
  2024-03-08  7:18                   ` Jan Beulich
  1 sibling, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-07 20:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> > For plain writes it should at least be "=Qo" then, yes.
> Constraints Q is a machine specific constraint, and I am not sure
> that
> it makes sense to use "=o" only and probably it is a reason why it is
> enough only "r". Does it make sense?
For RISC-V, the following can probably be used:
RISC-V—config/riscv/constraints.md
   ...
   A
       An address that is held in a general-purpose register.
   ...

AArch64 family—config/aarch64/constraints.md:
   ...
   Q
       A memory address which uses a single base register with no
       offset
   ...
Also, 'no offset' explains why the 'o' constraint was additionally
added for Arm.

~ Oleksii

> 
> >  To me making those
> > input operands on Arm can't have been quite right.
> I  don't understand why they both are input, logically it looks like
> an
> address should be an output.
> 
> ~ Oleksii
> 



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 20:49                 ` Oleksii
  2024-03-07 20:54                   ` Oleksii
@ 2024-03-08  7:18                   ` Jan Beulich
  1 sibling, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-03-08  7:18 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 21:49, Oleksii wrote:
> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>> For plain writes it should at least be "=Qo" then, yes.
> Constraints Q is a machine specific constraint, and I am not sure that
> it makes sense to use "=o" only and probably it is a reason why it is
> enough only "r". Does it make sense?

Especially the 'only "r"' part doesn't, no. I also don't see why "=o"
would make no sense - that's fundamentally no different than "=m".
Unless the immediates used in the ultimate insns are large enough to
cover the full range of possible values, my take is that "o" is never
appropriate to use. With one exception in a case like we have here:
If the operand isn't used in the actual instruction(s), then that's
fine, as the specific value of the adjusted immediate wouldn't matter
at all.

As to "Q" - that's an Arm constraint anyway, not a RISC-V one? If so,
I'm not sure why we're discussing it here. In any event, I'd be curious
to understand in how far the combination "Qo" makes sense.

>>  To me making those
>> input operands on Arm can't have been quite right.
> I  don't understand why they both are input, logically it looks like an
> address should be an output.

How would an address be an output, when that's needed to be calculated
for the memory access to take place? It would be an output (and
presumably a dummy/fake one) only if the address calculation was done
in the asm() itself (and there I don't mean the operands, but the
actual assembly instruction(s)).

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-07 20:54                   ` Oleksii
@ 2024-03-08  7:26                     ` Jan Beulich
  2024-03-08 10:14                       ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-08  7:26 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 07.03.2024 21:54, Oleksii wrote:
> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>> For plain writes it should at least be "=Qo" then, yes.
>> Constraints Q is a machine specific constraint, and I am not sure
>> that
>> it makes sense to use "=o" only and probably it is a reason why it is
>> enough only "r". Does it make sense?
> Probably for RISC-V can be used:
> RISC-V—config/riscv/constraints.md
>    ...
>    A
>        An address that is held in a general-purpose register.
>    ...

Just from the description I would have said no, but looking at what "A"
actually expands to it is indeed RISC-V's counterpart of Arm's "Q". So
yes, this looks like what amo* want to use, and then as a real operand,
not just a fake one.

> AArch64 family—config/aarch64/constraints.md:
>    ...
>    Q
>        A memory address which uses a single base register with no
>    offset
>    ...
> Also 'no offset' explains why it was added 'o' constraint for Arm
> additionally.

I don't think it does.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-08  7:26                     ` Jan Beulich
@ 2024-03-08 10:14                       ` Oleksii
  2024-03-08 11:49                         ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-08 10:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
> On 07.03.2024 21:54, Oleksii wrote:
> > On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
> > > On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> > > > For plain writes it should at least be "=Qo" then, yes.
> > > Constraints Q is a machine specific constraint, and I am not sure
> > > that
> > > it makes sense to use "=o" only and probably it is a reason why
> > > it is
> > > enough only "r". Does it make sense?
> > Probably for RISC-V can be used:
> > RISC-V—config/riscv/constraints.md
> >    ...
> >    A
> >        An address that is held in a general-purpose register.
> >    ...
> 
> Just from the description I would have said no, but looking at what
> "A"
> actually expands to it is indeed RISC-V's counterpart of Arm's "Q".
> So
> yes, this looks like what amo* want to use, and then as a real
> operand,
> not just a fake one.
I am not sure that I know how to check correctly how "A" expands, but I
tried to look at code which will be generated with and without
constraints and it is the same:
   // static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
   // {
   //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr) );
   // }
   
   static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
   {
       asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile uint32_t __force *)addr) );
   }
   
   ffffffffc003d774:       aabbd7b7                lui     a5,0xaabbd
   ffffffffc003d778:       cdd78793                add     a5,a5,-803 # ffffffffaabbccdd <start-0x15443323>
   ffffffffc003d77c:       f8f42423                sw      a5,-120(s0)
   ffffffffc003d780:       0140000f                fence   w,o
   

> 
> > AArch64 family—config/aarch64/constraints.md:
> >    ...
> >    Q
> >        A memory address which uses a single base register with no
> >    offset
> >    ...
> > Also 'no offset' explains why it was added 'o' constraint for Arm
> > additionally.
> 
> I don't think it does.
> 
> Jan



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-08 10:14                       ` Oleksii
@ 2024-03-08 11:49                         ` Jan Beulich
  2024-03-08 11:52                           ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-08 11:49 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 08.03.2024 11:14, Oleksii wrote:
> On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
>> On 07.03.2024 21:54, Oleksii wrote:
>>> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>>>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>>>> For plain writes it should at least be "=Qo" then, yes.
>>>> Constraints Q is a machine specific constraint, and I am not sure
>>>> that
>>>> it makes sense to use "=o" only and probably it is a reason why
>>>> it is
>>>> enough only "r". Does it make sense?
>>> Probably for RISC-V can be used:
>>> RISC-V—config/riscv/constraints.md
>>>    ...
>>>    A
>>>        An address that is held in a general-purpose register.
>>>    ...
>>
>> Just from the description I would have said no, but looking at what
>> "A"
>> actually expands to it is indeed RISC-V's counterpart of Arm's "Q".
>> So
>> yes, this looks like what amo* want to use, and then as a real
>> operand,
>> not just a fake one.
> I am not sure that I know how to check correctly how "A" expands, but I
> tried to look at code which will be generated with and without
> constraints and it is the same:

As expected.

>    // static inline void __raw_writel(uint32_t val, volatile void
>    __iomem *addr)
>    // {
>    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr) );
>    // }
>    
>    static inline void __raw_writel(uint32_t val, volatile void __iomem
>    *addr)
>    {
>        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
>    uint32_t __force *)addr) );

You want just "A" here though; adding an offset (as "o" permits) would
yield an insn which the assembler would reject. Also just to remind
you: In write functions you need "=A" (and in amo ones "+A"), i.e. the
memory operand then needs to be an output, not an input.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-08 11:49                         ` Jan Beulich
@ 2024-03-08 11:52                           ` Jan Beulich
  2024-03-08 12:17                             ` Oleksii
  0 siblings, 1 reply; 88+ messages in thread
From: Jan Beulich @ 2024-03-08 11:52 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 08.03.2024 12:49, Jan Beulich wrote:
> On 08.03.2024 11:14, Oleksii wrote:
>> On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
>>> On 07.03.2024 21:54, Oleksii wrote:
>>>> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>>>>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>>>>> For plain writes it should at least be "=Qo" then, yes.
>>>>> Constraints Q is a machine specific constraint, and I am not sure
>>>>> that
>>>>> it makes sense to use "=o" only and probably it is a reason why
>>>>> it is
>>>>> enough only "r". Does it make sense?
>>>> Probably for RISC-V can be used:
>>>> RISC-V—config/riscv/constraints.md
>>>>    ...
>>>>    A
>>>>        An address that is held in a general-purpose register.
>>>>    ...
>>>
>>> Just from the description I would have said no, but looking at what
>>> "A"
>>> actually expands to it is indeed RISC-V's counterpart of Arm's "Q".
>>> So
>>> yes, this looks like what amo* want to use, and then as a real
>>> operand,
>>> not just a fake one.
>> I am not sure that I know how to check correctly how "A" expands, but I
>> tried to look at code which will be generated with and without
>> constraints and it is the same:
> 
> As expected.
> 
>>    // static inline void __raw_writel(uint32_t val, volatile void
>>    __iomem *addr)
>>    // {
>>    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr) );
>>    // }
>>    
>>    static inline void __raw_writel(uint32_t val, volatile void __iomem
>>    *addr)
>>    {
>>        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
>>    uint32_t __force *)addr) );
> 
> You want just "A" here though; adding an offset (as "o" permits) would
> yield an insn which the assembler would reject.

Wait - this is plain SW, so can't it even be the more generic "m" then?
(As said, I'm uncertain about "o"; in general I think it's risky to use.)

Jan

> Also just to remind
> you: In write functions you need "=A" (and in amo ones "+A"), i.e. the
> memory operand then needs to be an output, not an input.
> 
> Jan



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-08 11:52                           ` Jan Beulich
@ 2024-03-08 12:17                             ` Oleksii
  2024-03-08 12:54                               ` Jan Beulich
  0 siblings, 1 reply; 88+ messages in thread
From: Oleksii @ 2024-03-08 12:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Fri, 2024-03-08 at 12:52 +0100, Jan Beulich wrote:
> On 08.03.2024 12:49, Jan Beulich wrote:
> > On 08.03.2024 11:14, Oleksii wrote:
> > > On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
> > > > On 07.03.2024 21:54, Oleksii wrote:
> > > > > On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
> > > > > > On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
> > > > > > > For plain writes it should at least be "=Qo" then, yes.
> > > > > > Constraints Q is a machine specific constraint, and I am
> > > > > > not sure
> > > > > > that
> > > > > > it makes sense to use "=o" only and probably it is a reason
> > > > > > why
> > > > > > it is
> > > > > > enough only "r". Does it make sense?
> > > > > Probably for RISC-V can be used:
> > > > > RISC-V—config/riscv/constraints.md
> > > > >    ...
> > > > >    A
> > > > >        An address that is held in a general-purpose register.
> > > > >    ...
> > > > 
> > > > Just from the description I would have said no, but looking at
> > > > what
> > > > "A"
> > > > actually expands to it is indeed RISC-V's counterpart of Arm's
> > > > "Q".
> > > > So
> > > > yes, this looks like what amo* want to use, and then as a real
> > > > operand,
> > > > not just a fake one.
> > > I am not sure that I know how to check correctly how "A" expands,
> > > but I
> > > tried to look at code which will be generated with and without
> > > constraints and it is the same:
> > 
> > As expected.
But if it is expected and the generated code is the same, do we really
need constraints then?

> > 
> > >    // static inline void __raw_writel(uint32_t val, volatile void
> > >    __iomem *addr)
> > >    // {
> > >    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr)
> > > );
> > >    // }
> > >    
> > >    static inline void __raw_writel(uint32_t val, volatile void
> > > __iomem
> > >    *addr)
> > >    {
> > >        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
> > >    uint32_t __force *)addr) );
> > 
> > You want just "A" here though; adding an offset (as "o" permits)
> > would
> > yield an insn which the assembler would reject.
> 
> Wait - this is plain SW, so can't it even be the more generic "m"
> then?
> (As said, I'm uncertain about "o"; in general I think it's risky to
> use.)
What do you mean by "plain SW"?

Are you suggesting changing 'o' to 'm' so that the final result will be
"Am"? Based on the descriptions of 'A' and 'm', it seems to me that we
can just use 'A' alone because both constraints indicate that the
operand is in memory, and 'A' specifically denotes that an address is
held in a register.
> 
> 
> > Also just to remind
> > you: In write functions you need "=A" (and in amo ones "+A"), i.e.
> > the
> > memory operand then needs to be an output, not an input.
Could you please clarify which amo operations you mean? The ones
defined by ATOMIC_OP and ATOMIC_FETCH_OP? They are already using "+A"
constraints:
    __asm__ __volatile__ (                                          \
        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
        : "+A" (v->counter), "=r" (ret)                             \
        : "r" (I)                                                   \
        : "memory" );                                               \

~ Oleksii


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 12/23] xen/riscv: introduce io.h
  2024-03-08 12:17                             ` Oleksii
@ 2024-03-08 12:54                               ` Jan Beulich
  0 siblings, 0 replies; 88+ messages in thread
From: Jan Beulich @ 2024-03-08 12:54 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 08.03.2024 13:17, Oleksii wrote:
> On Fri, 2024-03-08 at 12:52 +0100, Jan Beulich wrote:
>> On 08.03.2024 12:49, Jan Beulich wrote:
>>> On 08.03.2024 11:14, Oleksii wrote:
>>>> On Fri, 2024-03-08 at 08:26 +0100, Jan Beulich wrote:
>>>>> On 07.03.2024 21:54, Oleksii wrote:
>>>>>> On Thu, 2024-03-07 at 21:49 +0100, Oleksii wrote:
>>>>>>> On Thu, 2024-03-07 at 18:14 +0100, Jan Beulich wrote:
>>>>>>>> For plain writes it should at least be "=Qo" then, yes.
>>>>>>> Constraints Q is a machine specific constraint, and I am
>>>>>>> not sure
>>>>>>> that
>>>>>>> it makes sense to use "=o" only and probably it is a reason
>>>>>>> why
>>>>>>> it is
>>>>>>> enough only "r". Does it make sense?
>>>>>> Probably for RISC-V can be used:
>>>>>> RISC-V—config/riscv/constraints.md
>>>>>>    ...
>>>>>>    A
>>>>>>        An address that is held in a general-purpose register.
>>>>>>    ...
>>>>>
>>>>> Just from the description I would have said no, but looking at
>>>>> what
>>>>> "A"
>>>>> actually expands to it is indeed RISC-V's counterpart of Arm's
>>>>> "Q".
>>>>> So
>>>>> yes, this looks like what amo* want to use, and then as a real
>>>>> operand,
>>>>> not just a fake one.
>>>> I am not sure that I know how to check correctly how "A" expands,
>>>> but I
>>>> tried to look at code which will be generated with and without
>>>> constraints and it is the same:
>>>
>>> As expected.
> But if it is epxected and generated code is the same, do we really need
> constraints then?

Yes. Again: Proper constraints are the only way for the compiler to know
everything it needs to know to generate correct code around an asm().
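
A small sketch of that point, assuming GCC- or Clang-style extended
asm; touch_mem() and sum_twice() are hypothetical names used only for
illustration. The template is empty, so no instruction is emitted, yet
the "+m" operand alone makes the compiler treat *p as both read and
written by the asm():

```c
#include <stdint.h>

/*
 * Hypothetical helper: the empty template emits nothing. Only the
 * "+m" operand describes the access to the compiler.
 */
static inline void touch_mem(uint32_t *p)
{
    __asm__ __volatile__ ( "" : "+m" (*p) );
}

uint32_t sum_twice(uint32_t *p)
{
    uint32_t a = *p;

    touch_mem(p);

    /* *p has to be loaded again here; the value in 'a' may be stale. */
    uint32_t b = *p;

    return a + b;
}
```

Without the operand (or a "memory" clobber) the compiler would be free
to reuse the first load for b, even though the emitted instructions for
the asm() itself would be identical.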

>>>>    // static inline void __raw_writel(uint32_t val, volatile void
>>>>    __iomem *addr)
>>>>    // {
>>>>    //     asm volatile ( "sw %0, 0(%1)" : : "r" (val), "r"(addr)
>>>> );
>>>>    // }
>>>>    
>>>>    static inline void __raw_writel(uint32_t val, volatile void
>>>> __iomem
>>>>    *addr)
>>>>    {
>>>>        asm volatile ( "sw %0, %1" : : "r" (val), "Ao" (*(volatile
>>>>    uint32_t __force *)addr) );
>>>
>>> You want just "A" here though; adding an offset (as "o" permits)
>>> would
>>> yield an insn which the assembler would reject.
>>
>> Wait - this is plain SW, so can't it even be the more generic "m"
>> then?
>> (As said, I'm uncertain about "o"; in general I think it's risky to
>> use.)
> What do you mean by "plain SW"?

The plain store instruction, i.e. not SC. That one permits wider addressing
modes iirc, which we ought to permit where possible.

> Are you suggesting changing 'm' to 'o' so that the final result will be
> "Am"? Based on the descriptions of 'A' and 'm', it seems to me that we
> can just use 'A' alone because both constraints indicate that the
> operand is in memory, and 'A' specifically denotes that an address is
> held in a register.

No, no "A" at all. Just "m", which is a superset of "A" anyway.
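
Put together, the plain-store variant might look roughly like the
sketch below. raw_write32() is a hypothetical name, the pointer is
typed directly instead of going through __iomem/__force, and the
non-RISC-V branch exists only so the snippet builds elsewhere. With
"=m" the memory operand is an output, and the compiler may pick any
addressing mode that SW accepts:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only. */
static inline void raw_write32(uint32_t val, volatile uint32_t *addr)
{
#if defined(__riscv)
    /* "=m" (*addr): the asm() writes this memory location. */
    __asm__ __volatile__ ( "sw %1, %0" : "=m" (*addr) : "r" (val) );
#else
    /* Fallback for non-RISC-V builds (assumption, not part of the thread). */
    *addr = val;
#endif
}
```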

>>> Also just to remind
>>> you: In write functions you need "=A" (and in amo ones "+A"), i.e.
>>> the
>>> memory operand then needs to be an output, not an input.
> Could you please clarify about which one amo you are speaking? That one
> who are defined by ATOMIC_OP and ATOMIC_FETCH_OP?

All. They're all read-modify-write operations if I'm not mistaken.

> They are already
> using +A constraints:
>     __asm__ __volatile__ (                                          \
>         "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
>         : "+A" (v->counter), "=r" (ret)                             \
>         : "r" (I)                                                   \
>         : "memory" );                                               \

Good. I merely thought I'd mention that aspect for completeness.

Jan


^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2024-03-08 12:54 UTC | newest]

Thread overview: 88+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-26 17:38 [PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 01/23] xen/riscv: disable unnecessary configs Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 02/23] xen/riscv: use some asm-generic headers Oleksii Kurochko
2024-02-27  7:35   ` Jan Beulich
2024-02-26 17:38 ` [PATCH v5 03/23] xen/riscv: introduce nospec.h Oleksii Kurochko
2024-02-27  7:38   ` Jan Beulich
2024-02-28  9:59     ` Oleksii
2024-02-29 13:49   ` Julien Grall
2024-02-29 14:01     ` Jan Beulich
2024-02-29 16:09       ` Oleksii
2024-02-29 16:27   ` Jan Beulich
2024-02-26 17:38 ` [PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions Oleksii Kurochko
2024-02-29 13:54   ` Julien Grall
2024-02-29 14:03     ` Jan Beulich
2024-02-29 14:08       ` Julien Grall
2024-02-29 16:17     ` Oleksii
2024-02-29 15:52   ` Jan Beulich
2024-02-29 16:25   ` Andrew Cooper
2024-03-01  9:15     ` Oleksii
2024-02-26 17:38 ` [PATCH v5 05/23] xen/asm-generic: introduce generic find first set bit functions Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 06/23] xen/asm-generic: introduce generic ffz() Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 07/23] xen/asm-generic: introduce generic hweight64() Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 08/23] xen/asm-generic: introduce generic non-atomic test_*bit() Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 09/23] xen/riscv: introduce bitops.h Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 10/23] xen/riscv: introduces acrquire, release and full barriers Oleksii Kurochko
2024-03-05  7:42   ` Jan Beulich
2024-02-26 17:38 ` [PATCH v5 11/23] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
2024-03-06 14:56   ` Jan Beulich
2024-03-07 10:35     ` Oleksii
2024-03-07 10:46       ` Jan Beulich
2024-03-07 11:01         ` Oleksii
2024-03-07 11:11           ` Jan Beulich
2024-03-07 12:28             ` Oleksii
2024-02-26 17:38 ` [PATCH v5 12/23] xen/riscv: introduce io.h Oleksii Kurochko
2024-03-06 14:13   ` Jan Beulich
2024-03-07 13:01     ` Oleksii
2024-03-07 13:24       ` Jan Beulich
2024-03-07 13:44         ` Oleksii
2024-03-07 15:32           ` Jan Beulich
2024-03-07 16:21             ` Oleksii
2024-03-07 17:14               ` Jan Beulich
2024-03-07 20:49                 ` Oleksii
2024-03-07 20:54                   ` Oleksii
2024-03-08  7:26                     ` Jan Beulich
2024-03-08 10:14                       ` Oleksii
2024-03-08 11:49                         ` Jan Beulich
2024-03-08 11:52                           ` Jan Beulich
2024-03-08 12:17                             ` Oleksii
2024-03-08 12:54                               ` Jan Beulich
2024-03-08  7:18                   ` Jan Beulich
2024-02-26 17:38 ` [PATCH v5 13/23] xen/riscv: introduce atomic.h Oleksii Kurochko
2024-03-06 15:31   ` Jan Beulich
2024-03-07 13:30     ` Oleksii
2024-03-07 15:40       ` Jan Beulich
2024-02-26 17:38 ` [PATCH v5 14/23] xen/riscv: introduce monitor.h Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 15/23] xen/riscv: add definition of __read_mostly Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 16/23] xen/riscv: add required things to current.h Oleksii Kurochko
2024-02-26 17:38 ` [PATCH v5 17/23] xen/riscv: add minimal stuff to page.h to build full Xen Oleksii Kurochko
2024-02-26 17:39 ` [PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
2024-03-05  8:05   ` Jan Beulich
2024-03-05 17:34     ` Oleksii
2024-02-26 17:39 ` [PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
2024-03-05  8:17   ` Jan Beulich
2024-03-05 16:46     ` Oleksii
2024-02-26 17:39 ` [PATCH v5 20/23] xen/riscv: introduce vm_event_*() functions Oleksii Kurochko
2024-02-26 17:39 ` [PATCH v5 21/23] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
2024-03-05  8:40   ` Jan Beulich
2024-02-26 17:39 ` [PATCH v5 22/23] xen/riscv: enable full Xen build Oleksii Kurochko
2024-02-26 17:39 ` [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
2024-02-27  7:55   ` Jan Beulich
2024-02-28 17:03     ` Oleksii
2024-02-28 22:58     ` Julien Grall
2024-02-28 23:11       ` Andrew Cooper
2024-02-29 17:00         ` Oleksii
2024-02-29  7:58       ` Jan Beulich
2024-02-29 10:23         ` Julien Grall
2024-02-29 11:56           ` Jan Beulich
2024-02-29 11:59             ` Jan Beulich
2024-02-29 12:05           ` Andrew Cooper
2024-02-29 12:17             ` Jan Beulich
2024-02-29 12:32               ` Julien Grall
2024-02-29 12:51                 ` Jan Beulich
2024-02-29 13:44                   ` Julien Grall
2024-02-29 14:07                     ` Jan Beulich
2024-02-29 14:14                       ` Julien Grall
2024-02-29 17:43                         ` Stefano Stabellini
2024-02-29 12:27             ` Julien Grall
2024-02-29 16:54         ` Oleksii
