All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/30]  Enable build of full Xen for RISC-V
@ 2024-02-05 15:32 Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 01/30] xen/riscv: disable unnecessary configs Oleksii Kurochko
                   ` (29 more replies)
  0 siblings, 30 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Doug Goldstein, Stefano Stabellini,
	Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Wei Liu, Paul Durrant,
	Roger Pau Monné,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Shawn Anastasio, Tamas K Lengyel, Alexandru Isaila,
	Petre Pircalabu

This patch series performs all of the additions necessary to drop the
build overrides for RISCV and enable the full Xen build. Except in cases
where compatibile implementations already exist (e.g. atomic.h and
bitops.h), the newly added definitions are simple.

The patch series is based on the following patch series:
-	[PATCH v6 0/9] Introduce generic headers   [1]
- [PATCH] move __read_mostly to xen/cache.h  [2]
- [PATCH] xen: move BUG_ON(), WARN_ON(), ASSERT(), ASSERT_UNREACHABLE() to xen/bug.h [3]
- [XEN PATCH v2 1/3] xen: introduce STATIC_ASSERT_UNREACHABLE() [4]
- [PATCH] xen/lib: introduce generic find next bit operations [5]

Right now, the patch series doesn't have a direct dependency on [2] and it
provides __read_mostly in the patch:
    [PATCH v3 26/34] xen/riscv: add definition of __read_mostly
However, it will be dropped as soon as [2] is merged or at least when the
final version of the patch [2] is provided.

[1] https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com/
[2] https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b333e@suse.com/
[3] https://lore.kernel.org/xen-devel/4887b2d91a4bf2e8b4b66f03964259651981403b.1706897023.git.oleksii.kurochko@gmail.com/
[4] https://lore.kernel.org/xen-devel/42fc6ae8d3eb802429d29c774502ff232340dc84.1706259490.git.federico.serafini@bugseng.com/
[5] https://lore.kernel.org/xen-devel/52730e6314210ba4164a9934a720c4fda201447b.1706266854.git.oleksii.kurochko@gmail.com/

---
Changes in V4:
 - Update the cover letter message: new patch series dependencies.
 - Some patches were merged to staging, so they were dropped in this patch series:
     [PATCH v3 09/34] xen/riscv: introduce system.h
     [PATCH v3 18/34] xen/riscv: introduce domain.h
     [PATCH v3 19/34] xen/riscv: introduce guest_access.h
 - Was sent out of this patch series:
     [PATCH v3 16/34] xen/lib: introduce generic find next bit operations
 - [PATCH v3 17/34] xen/riscv: add compilation of generic find-next-bit.c was
   droped as CONFIG_GENERIC_FIND_NEXT_BIT was dropped.
 - All other changes are specific to a specific patch.
---
Changes in V3:
 - Update the cover letter message
 - The following patches were dropped as they were merged to staging:
    [PATCH v2 03/39] xen/riscv:introduce asm/byteorder.h
    [PATCH v2 04/39] xen/riscv: add public arch-riscv.h
    [PATCH v2 05/39] xen/riscv: introduce spinlock.h
    [PATCH v2 20/39] xen/riscv: define bug frame tables in xen.lds.S
    [PATCH v2 34/39] xen: add RISCV support for pmu.h
    [PATCH v2 35/39] xen: add necessary headers to common to build full Xen for RISC-V
 - Instead of the following patches were introduced new:
    [PATCH v2 10/39] xen/riscv: introduce asm/iommu.h
    [PATCH v2 11/39] xen/riscv: introduce asm/nospec.h
 - remove "asm/"  for commit messages which start with "xen/riscv:"
 - code style updates.
 - add emulation of {cmp}xchg_* for 1 and 2 bytes types.
 - code style fixes.
 - add SPDX and footer for the newly added headers.
 - introduce generic find-next-bit.c.
 - some other mionor changes. ( details please find in a patch )
---
Changes in V2:
  - Drop the following patches as they are the part of [2]:
      [PATCH v1 06/57] xen/riscv: introduce paging.h
      [PATCH v1 08/57] xen/riscv: introduce asm/device.h
      [PATCH v1 10/57] xen/riscv: introduce asm/grant_table.h
      [PATCH v1 12/57] xen/riscv: introduce asm/hypercall.h
      [PATCH v1 13/57] xen/riscv: introduce asm/iocap.h
      [PATCH v1 15/57] xen/riscv: introduce asm/mem_access.h
      [PATCH v1 18/57] xen/riscv: introduce asm/random.h
      [PATCH v1 21/57] xen/riscv: introduce asm/xenoprof.h
      [PATCH v1 24/57] xen/riscv: introduce asm/percpu.h
      [PATCH v1 29/57] xen/riscv: introduce asm/hardirq.h
      [PATCH v1 33/57] xen/riscv: introduce asm/altp2m.h
      [PATCH v1 38/57] xen/riscv: introduce asm/monitor.h
      [PATCH v1 39/57] xen/riscv: introduce asm/numa.h
      [PATCH v1 42/57] xen/riscv: introduce asm/softirq.h
  - xen/lib.h in most of the cases were changed to xen/bug.h as
    mostly functionilty of bug.h is used.
  - align arch-riscv.h with Arm's version of it.
  - change the Author of commit with introduction of asm/atomic.h.
  - update some definition from spinlock.h.
  - code style changes.
---

Bobby Eshleman (1):
  xen/riscv: introduce atomic.h

Oleksii Kurochko (29):
  xen/riscv: disable unnecessary configs
  xen/riscv: use some asm-generic headers
  xen: add support in public/hvm/save.h for PPC and RISC-V
  xen/riscv: introduce cpufeature.h
  xen/riscv: introduce guest_atomics.h
  xen: avoid generation of empty asm/iommu.h
  xen/asm-generic: introdure nospec.h
  xen/riscv: introduce setup.h
  xen/riscv: introduce bitops.h
  xen/riscv: introduce flushtlb.h
  xen/riscv: introduce smp.h
  xen/riscv: introduce cmpxchg.h
  xen/riscv: introduce io.h
  xen/riscv: introduce irq.h
  xen/riscv: introduce p2m.h
  xen/riscv: introduce regs.h
  xen/riscv: introduce time.h
  xen/riscv: introduce event.h
  xen/riscv: introduce monitor.h
  xen/riscv: add definition of __read_mostly
  xen/riscv: define an address of frame table
  xen/riscv: add required things to current.h
  xen/riscv: add minimal stuff to page.h to build full Xen
  xen/riscv: add minimal stuff to processor.h to build full Xen
  xen/riscv: add minimal stuff to mm.h to build full Xen
  xen/riscv: introduce vm_event_*() functions
  xen/rirscv: add minimal amount of stubs to build full Xen
  xen/riscv: enable full Xen build
  xen/README: add compiler and binutils versions for RISC-V64

 README                                        |   3 +
 .../gitlab-ci/riscv-fixed-randconfig.yaml     |  27 ++
 docs/misc/riscv/booting.txt                   |   8 +
 xen/arch/arm/include/asm/Makefile             |   1 +
 xen/arch/ppc/include/asm/Makefile             |   1 +
 xen/arch/ppc/include/asm/nospec.h             |  15 -
 xen/arch/riscv/Kconfig                        |   7 +
 xen/arch/riscv/Makefile                       |  18 +-
 xen/arch/riscv/arch.mk                        |   5 +-
 xen/arch/riscv/configs/tiny64_defconfig       |  17 +
 xen/arch/riscv/early_printk.c                 | 168 -------
 xen/arch/riscv/include/asm/Makefile           |  13 +
 xen/arch/riscv/include/asm/atomic.h           | 395 +++++++++++++++++
 xen/arch/riscv/include/asm/bitops.h           | 164 +++++++
 xen/arch/riscv/include/asm/cache.h            |   2 +
 xen/arch/riscv/include/asm/cmpxchg.h          | 237 ++++++++++
 xen/arch/riscv/include/asm/config.h           | 107 +++--
 xen/arch/riscv/include/asm/cpufeature.h       |  23 +
 xen/arch/riscv/include/asm/current.h          |  19 +
 xen/arch/riscv/include/asm/event.h            |  40 ++
 xen/arch/riscv/include/asm/fence.h            |   8 +
 xen/arch/riscv/include/asm/flushtlb.h         |  34 ++
 xen/arch/riscv/include/asm/guest_atomics.h    |  44 ++
 xen/arch/riscv/include/asm/io.h               | 142 ++++++
 xen/arch/riscv/include/asm/irq.h              |  37 ++
 xen/arch/riscv/include/asm/mm.h               | 246 +++++++++++
 xen/arch/riscv/include/asm/monitor.h          |  26 ++
 xen/arch/riscv/include/asm/p2m.h              | 102 +++++
 xen/arch/riscv/include/asm/page.h             |  19 +
 xen/arch/riscv/include/asm/processor.h        |  23 +
 xen/arch/riscv/include/asm/regs.h             |  29 ++
 xen/arch/riscv/include/asm/setup.h            |  17 +
 xen/arch/riscv/include/asm/smp.h              |  26 ++
 xen/arch/riscv/include/asm/time.h             |  29 ++
 xen/arch/riscv/mm.c                           |  52 ++-
 xen/arch/riscv/setup.c                        |  10 +-
 xen/arch/riscv/stubs.c                        | 415 ++++++++++++++++++
 xen/arch/riscv/traps.c                        |  25 ++
 xen/arch/riscv/vm_event.c                     |  19 +
 xen/include/asm-generic/bitops/__ffs.h        |  47 ++
 xen/include/asm-generic/bitops/bitops-bits.h  |  21 +
 xen/include/asm-generic/bitops/ffs.h          |   9 +
 xen/include/asm-generic/bitops/ffsl.h         |  16 +
 xen/include/asm-generic/bitops/ffz.h          |  18 +
 .../asm-generic/bitops/find-first-set-bit.h   |  17 +
 xen/include/asm-generic/bitops/fls.h          |  18 +
 xen/include/asm-generic/bitops/flsl.h         |  10 +
 .../asm-generic/bitops/generic-non-atomic.h   |  89 ++++
 xen/include/asm-generic/bitops/hweight.h      |  13 +
 xen/include/asm-generic/bitops/test-bit.h     |  18 +
 .../asm => include/asm-generic}/nospec.h      |   6 +-
 xen/include/public/hvm/save.h                 |   4 +-
 xen/include/xen/iommu.h                       |   4 +
 53 files changed, 2640 insertions(+), 223 deletions(-)
 create mode 100644 docs/misc/riscv/booting.txt
 delete mode 100644 xen/arch/ppc/include/asm/nospec.h
 create mode 100644 xen/arch/riscv/include/asm/Makefile
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/arch/riscv/include/asm/bitops.h
 create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
 create mode 100644 xen/arch/riscv/include/asm/cpufeature.h
 create mode 100644 xen/arch/riscv/include/asm/event.h
 create mode 100644 xen/arch/riscv/include/asm/fence.h
 create mode 100644 xen/arch/riscv/include/asm/flushtlb.h
 create mode 100644 xen/arch/riscv/include/asm/guest_atomics.h
 create mode 100644 xen/arch/riscv/include/asm/io.h
 create mode 100644 xen/arch/riscv/include/asm/irq.h
 create mode 100644 xen/arch/riscv/include/asm/monitor.h
 create mode 100644 xen/arch/riscv/include/asm/p2m.h
 create mode 100644 xen/arch/riscv/include/asm/regs.h
 create mode 100644 xen/arch/riscv/include/asm/setup.h
 create mode 100644 xen/arch/riscv/include/asm/smp.h
 create mode 100644 xen/arch/riscv/include/asm/time.h
 create mode 100644 xen/arch/riscv/stubs.c
 create mode 100644 xen/arch/riscv/vm_event.c
 create mode 100644 xen/include/asm-generic/bitops/__ffs.h
 create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
 create mode 100644 xen/include/asm-generic/bitops/ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffsl.h
 create mode 100644 xen/include/asm-generic/bitops/ffz.h
 create mode 100644 xen/include/asm-generic/bitops/find-first-set-bit.h
 create mode 100644 xen/include/asm-generic/bitops/fls.h
 create mode 100644 xen/include/asm-generic/bitops/flsl.h
 create mode 100644 xen/include/asm-generic/bitops/generic-non-atomic.h
 create mode 100644 xen/include/asm-generic/bitops/hweight.h
 create mode 100644 xen/include/asm-generic/bitops/test-bit.h
 rename xen/{arch/arm/include/asm => include/asm-generic}/nospec.h (79%)

-- 
2.43.0



^ permalink raw reply	[flat|nested] 107+ messages in thread

* [PATCH v4 01/30] xen/riscv: disable unnecessary configs
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 02/30] xen/riscv: use some asm-generic headers Oleksii Kurochko
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Doug Goldstein, Stefano Stabellini,
	Alistair Francis, Bob Eshleman, Connor Davis

This patch disables unnecessary configs for two cases:
1. By utilizing EXTRA_FIXED_RANDCONFIG and risc-fixed-randconfig.yaml
   file for randconfig builds (GitLab CI jobs).
2. By using tiny64_defconfig for non-randconfig builds.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - Nothing changed. Only rebase
---
Changes in V3:
 - Remove EXTRA_FIXED_RANDCONFIG for non-randconfig jobs.
   For non-randconfig jobs, it is sufficient to disable configs by using the defconfig.
 - Remove double blank lines in build.yaml file before archlinux-current-gcc-riscv64-debug
---
Changes in V2:
 - update the commit message.
 - remove xen/arch/riscv/Kconfig changes.
---
 .../gitlab-ci/riscv-fixed-randconfig.yaml     | 27 +++++++++++++++++++
 xen/arch/riscv/configs/tiny64_defconfig       | 17 ++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/automation/gitlab-ci/riscv-fixed-randconfig.yaml b/automation/gitlab-ci/riscv-fixed-randconfig.yaml
index f1282b40c9..344f39c2d8 100644
--- a/automation/gitlab-ci/riscv-fixed-randconfig.yaml
+++ b/automation/gitlab-ci/riscv-fixed-randconfig.yaml
@@ -5,3 +5,30 @@
       CONFIG_EXPERT=y
       CONFIG_GRANT_TABLE=n
       CONFIG_MEM_ACCESS=n
+      CONFIG_COVERAGE=n
+      CONFIG_SCHED_CREDIT=n
+      CONFIG_SCHED_CREDIT2=n
+      CONFIG_SCHED_RTDS=n
+      CONFIG_SCHED_NULL=n
+      CONFIG_SCHED_ARINC653=n
+      CONFIG_TRACEBUFFER=n
+      CONFIG_HYPFS=n
+      CONFIG_SPECULATIVE_HARDEN_ARRAY=n
+      CONFIG_ARGO=n
+      CONFIG_HYPFS_CONFIG=n
+      CONFIG_CORE_PARKING=n
+      CONFIG_DEBUG_TRACE=n
+      CONFIG_IOREQ_SERVER=n
+      CONFIG_CRASH_DEBUG=n
+      CONFIG_KEXEC=n
+      CONFIG_LIVEPATCH=n
+      CONFIG_NUMA=n
+      CONFIG_PERF_COUNTERS=n
+      CONFIG_HAS_PMAP=n
+      CONFIG_TRACEBUFFER=n
+      CONFIG_XENOPROF=n
+      CONFIG_COMPAT=n
+      CONFIG_COVERAGE=n
+      CONFIG_UBSAN=n
+      CONFIG_NEEDS_LIBELF=n
+      CONFIG_XSM=n
diff --git a/xen/arch/riscv/configs/tiny64_defconfig b/xen/arch/riscv/configs/tiny64_defconfig
index 09defe236b..35915255e6 100644
--- a/xen/arch/riscv/configs/tiny64_defconfig
+++ b/xen/arch/riscv/configs/tiny64_defconfig
@@ -7,6 +7,23 @@
 # CONFIG_GRANT_TABLE is not set
 # CONFIG_SPECULATIVE_HARDEN_ARRAY is not set
 # CONFIG_MEM_ACCESS is not set
+# CONFIG_ARGO is not set
+# CONFIG_HYPFS_CONFIG is not set
+# CONFIG_CORE_PARKING is not set
+# CONFIG_DEBUG_TRACE is not set
+# CONFIG_IOREQ_SERVER is not set
+# CONFIG_CRASH_DEBUG is not setz
+# CONFIG_KEXEC is not set
+# CONFIG_LIVEPATCH is not set
+# CONFIG_NUMA is not set
+# CONFIG_PERF_COUNTERS is not set
+# CONFIG_HAS_PMAP is not set
+# CONFIG_TRACEBUFFER is not set
+# CONFIG_XENOPROF is not set
+# CONFIG_COMPAT is not set
+# CONFIG_COVERAGE is not set
+# CONFIG_UBSAN is not set
+# CONFIG_NEEDS_LIBELF is not set
 
 CONFIG_RISCV_64=y
 CONFIG_DEBUG=y
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 02/30] xen/riscv: use some asm-generic headers
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 01/30] xen/riscv: disable unnecessary configs Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:03   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V Oleksii Kurochko
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Some headers are the same as asm-generic verions of them
so use them instead of arch-specific headers.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 As [PATCH v6 0/9] Introduce generic headers
 (https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com/)
 is not stable, the list in asm/Makefile can be changed, but the changes will
 be easy.
---
Changes in V4:
- removed numa.h from asm/include/Makefile because of the patch: [PATCH v2] NUMA: no need for asm/numa.h when !NUMA
- updated the commit message
---
Changes in V3:
 - remove monitor.h from the RISC-V asm/Makefile list.
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V2:
 - New commit introduced in V2.
---
 xen/arch/riscv/include/asm/Makefile | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/Makefile

diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
new file mode 100644
index 0000000000..ced02e26ed
--- /dev/null
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+generic-y += altp2m.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += hardirq.h
+generic-y += hypercall.h
+generic-y += iocap.h
+generic-y += paging.h
+generic-y += percpu.h
+generic-y += random.h
+generic-y += softirq.h
+generic-y += vm_event.h
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 01/30] xen/riscv: disable unnecessary configs Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 02/30] xen/riscv: use some asm-generic headers Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:05   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 04/30] xen/riscv: introduce cpufeature.h Oleksii Kurochko
                   ` (26 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

No specific header is needed to include in public/hvm/save.h for
PPC and RISC-V for now.

Code related to PPC was changed based on the comment:
https://lore.kernel.org/xen-devel/c2f3280e-2208-496b-a0b5-fda1a2076b3a@raptorengineering.com/

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
- Updated the commit message
---
Changes in V3:
 - update the commit message.
 - For PPC and RISC-V nothing to include in public/hvm/save.h, so just comment was
   added.
---
Changes in V2:
 - remove copyright an the top of hvm/save.h as the header write now is a newly
   introduced empty header.
---
 xen/include/public/hvm/save.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/public/hvm/save.h b/xen/include/public/hvm/save.h
index 5561495b27..72e16ab5bc 100644
--- a/xen/include/public/hvm/save.h
+++ b/xen/include/public/hvm/save.h
@@ -89,8 +89,8 @@ DECLARE_HVM_SAVE_TYPE(END, 0, struct hvm_save_end);
 #include "../arch-x86/hvm/save.h"
 #elif defined(__arm__) || defined(__aarch64__)
 #include "../arch-arm/hvm/save.h"
-#elif defined(__powerpc64__)
-#include "../arch-ppc.h"
+#elif defined(__powerpc64__) || defined(__riscv)
+/* no specific header to include */
 #else
 #error "unsupported architecture"
 #endif
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 04/30] xen/riscv: introduce cpufeature.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (2 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h Oleksii Kurochko
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - Nothing changed. Only rebase
---
Changes in V3:
 - add SPDX and footer
 - update declaration of cpu_nr_siblings() to return unsigned int instead of int.
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/cpufeature.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/cpufeature.h

diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
new file mode 100644
index 0000000000..c08b7d67ad
--- /dev/null
+++ b/xen/arch/riscv/include/asm/cpufeature.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_CPUFEATURE_H
+#define __ASM_RISCV_CPUFEATURE_H
+
+#ifndef __ASSEMBLY__
+
+static inline unsigned int cpu_nr_siblings(unsigned int cpu)
+{
+    return 1;
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_RISCV_CPUFEATURE_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (3 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 04/30] xen/riscv: introduce cpufeature.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:07   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h Oleksii Kurochko
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - Drop in guest_testop() and guest_bitop() casts of function arguments.
 - Change "commit message" to "commit title" in "Changes in V3" to be more precise about
   what was changed.
 - use BUG_ON("unimplemented") instead of ASSERT_UNREACHABLE
---
Changes in V3:
 - update the commit title
 - drop TODO commit.
 - add ASSERT_UNREACHABLE for stubs guest functions.
 - Add SPDX & footer
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/guest_atomics.h | 44 ++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/guest_atomics.h

diff --git a/xen/arch/riscv/include/asm/guest_atomics.h b/xen/arch/riscv/include/asm/guest_atomics.h
new file mode 100644
index 0000000000..de54914454
--- /dev/null
+++ b/xen/arch/riscv/include/asm/guest_atomics.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_GUEST_ATOMICS_H
+#define __ASM_RISCV_GUEST_ATOMICS_H
+
+#include <xen/bug.h>
+
+#define guest_testop(name)                                                  \
+static inline int guest_##name(struct domain *d, int nr, volatile void *p)  \
+{                                                                           \
+    BUG_ON("unimplemented");                                                \
+                                                                            \
+    return 0;                                                               \
+}
+
+#define guest_bitop(name)                                                   \
+static inline void guest_##name(struct domain *d, int nr, volatile void *p) \
+{                                                                           \
+    BUG_ON("unimplemented");                                                \
+}
+
+guest_bitop(set_bit)
+guest_bitop(clear_bit)
+guest_bitop(change_bit)
+
+#undef guest_bitop
+
+guest_testop(test_and_set_bit)
+guest_testop(test_and_clear_bit)
+guest_testop(test_and_change_bit)
+
+#undef guest_testop
+
+#define guest_test_bit(d, nr, p) ((void)(d), test_bit(nr, p))
+
+#endif /* __ASM_RISCV_GUEST_ATOMICS_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (4 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:10   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 07/30] xen/asm-generic: introdure nospec.h Oleksii Kurochko
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Jan Beulich, Paul Durrant, Roger Pau Monné

asm/iommu.h shouldn't be included when CONFIG_HAS_PASSTHROUGH
isn't enabled.
As <asm/iommu.h> is ifdef-ed by CONFIG_HAS_PASSTHROUGH it should
be also ifdef-ed field "struct arch_iommu arch" in struct domain_iommu
as definition of arch_iommu is located in <asm/iommu.h>.

These amount of changes are enough to avoid generation of empty
asm/iommu.h for now.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - Update the commit message.
---
Changes in V3:
 - new patch.
---
 xen/include/xen/iommu.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index a21f25df9f..7aa6a77209 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -337,7 +337,9 @@ extern int iommu_add_extra_reserved_device_memory(unsigned long start,
 extern int iommu_get_extra_reserved_device_memory(iommu_grdm_t *func,
                                                   void *ctxt);
 
+#ifdef CONFIG_HAS_PASSTHROUGH
 #include <asm/iommu.h>
+#endif
 
 #ifndef iommu_call
 # define iommu_call(ops, fn, args...) ((ops)->fn(args))
@@ -345,7 +347,9 @@ extern int iommu_get_extra_reserved_device_memory(iommu_grdm_t *func,
 #endif
 
 struct domain_iommu {
+#ifdef CONFIG_HAS_PASSTHROUGH
     struct arch_iommu arch;
+#endif
 
     /* iommu_ops */
     const struct iommu_ops *platform_ops;
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (5 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-18 18:30   ` Julien Grall
  2024-02-05 15:32 ` [PATCH v4 08/30] xen/riscv: introduce setup.h Oleksii Kurochko
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
	George Dunlap, Jan Beulich, Wei Liu, Shawn Anastasio,
	Alistair Francis, Bob Eshleman, Connor Davis

The <asm/nospec.h> header is similar between Arm, PPC, and RISC-V,
so it has been moved to asm-generic.

Arm's nospec.h was taken as a base with updated guards:
 _ASM_ARM_NOSPEC_H -> _ASM_GENERIC_NOSPEC_H

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - Rebase the patch. It was conflics in asm/include/Makefile because it doesn't contain
   numa.h in it because of the patch: [PATCH v2] NUMA: no need for asm/numa.h when !NUMA
 - Properly move/rename the Arm's nospec.h with only guards update in the header from
   _ASM_ARM_NOSPEC_H to _ASM_GENERIC_NOSPEC_H.
 - Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V3:
 - new patch.
---
 xen/arch/arm/include/asm/Makefile                 |  1 +
 xen/arch/ppc/include/asm/Makefile                 |  1 +
 xen/arch/ppc/include/asm/nospec.h                 | 15 ---------------
 xen/arch/riscv/include/asm/Makefile               |  1 +
 .../include/asm => include/asm-generic}/nospec.h  |  6 +++---
 5 files changed, 6 insertions(+), 18 deletions(-)
 delete mode 100644 xen/arch/ppc/include/asm/nospec.h
 rename xen/{arch/arm/include/asm => include/asm-generic}/nospec.h (79%)

diff --git a/xen/arch/arm/include/asm/Makefile b/xen/arch/arm/include/asm/Makefile
index 4a4036c951..41f73bf968 100644
--- a/xen/arch/arm/include/asm/Makefile
+++ b/xen/arch/arm/include/asm/Makefile
@@ -3,6 +3,7 @@ generic-y += altp2m.h
 generic-y += device.h
 generic-y += hardirq.h
 generic-y += iocap.h
+generic-y += nospec.h
 generic-y += paging.h
 generic-y += percpu.h
 generic-y += random.h
diff --git a/xen/arch/ppc/include/asm/Makefile b/xen/arch/ppc/include/asm/Makefile
index ced02e26ed..2e8623bb10 100644
--- a/xen/arch/ppc/include/asm/Makefile
+++ b/xen/arch/ppc/include/asm/Makefile
@@ -5,6 +5,7 @@ generic-y += div64.h
 generic-y += hardirq.h
 generic-y += hypercall.h
 generic-y += iocap.h
+generic-y += nospec.h
 generic-y += paging.h
 generic-y += percpu.h
 generic-y += random.h
diff --git a/xen/arch/ppc/include/asm/nospec.h b/xen/arch/ppc/include/asm/nospec.h
deleted file mode 100644
index b97322e48d..0000000000
--- a/xen/arch/ppc/include/asm/nospec.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/* From arch/arm/include/asm/nospec.h. */
-#ifndef __ASM_PPC_NOSPEC_H__
-#define __ASM_PPC_NOSPEC_H__
-
-static inline bool evaluate_nospec(bool condition)
-{
-    return condition;
-}
-
-static inline void block_speculation(void)
-{
-}
-
-#endif /* __ASM_PPC_NOSPEC_H__ */
diff --git a/xen/arch/riscv/include/asm/Makefile b/xen/arch/riscv/include/asm/Makefile
index ced02e26ed..2e8623bb10 100644
--- a/xen/arch/riscv/include/asm/Makefile
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -5,6 +5,7 @@ generic-y += div64.h
 generic-y += hardirq.h
 generic-y += hypercall.h
 generic-y += iocap.h
+generic-y += nospec.h
 generic-y += paging.h
 generic-y += percpu.h
 generic-y += random.h
diff --git a/xen/arch/arm/include/asm/nospec.h b/xen/include/asm-generic/nospec.h
similarity index 79%
rename from xen/arch/arm/include/asm/nospec.h
rename to xen/include/asm-generic/nospec.h
index 51c7aea4f4..65fd745db2 100644
--- a/xen/arch/arm/include/asm/nospec.h
+++ b/xen/include/asm-generic/nospec.h
@@ -1,8 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
 
-#ifndef _ASM_ARM_NOSPEC_H
-#define _ASM_ARM_NOSPEC_H
+#ifndef _ASM_GENERIC_NOSPEC_H
+#define _ASM_GENERIC_NOSPEC_H
 
 static inline bool evaluate_nospec(bool condition)
 {
@@ -13,7 +13,7 @@ static inline void block_speculation(void)
 {
 }
 
-#endif /* _ASM_ARM_NOSPEC_H */
+#endif /* _ASM_GENERIC_NOSPEC_H */
 
 /*
  * Local variables:
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 08/30] xen/riscv: introduce setup.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (6 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 07/30] xen/asm-generic: introdure nospec.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 09/30] xen/riscv: introduce bitops.h Oleksii Kurochko
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
  - Nothing changed. Only rebase
---
Changes in V3:
 - add SPDX
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/setup.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/setup.h

diff --git a/xen/arch/riscv/include/asm/setup.h b/xen/arch/riscv/include/asm/setup.h
new file mode 100644
index 0000000000..7613a5dbd0
--- /dev/null
+++ b/xen/arch/riscv/include/asm/setup.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_RISCV_SETUP_H__
+#define __ASM_RISCV_SETUP_H__
+
+#define max_init_domid (0)
+
+#endif /* __ASM_RISCV_SETUP_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 09/30] xen/riscv: introduce bitops.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (7 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 08/30] xen/riscv: introduce setup.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:58   ` Jan Beulich
  2024-02-13  9:19   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 10/30] xen/riscv: introduce flushtlb.h Oleksii Kurochko
                   ` (20 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Taken from Linux-6.4.0-rc1

Xen's bitops.h consists of several Linux's headers:
* linux/arch/include/asm/bitops.h:
  * The following function were removed as they aren't used in Xen:
      	* test_and_set_bit_lock
      	* clear_bit_unlock
      	* __clear_bit_unlock
   * The following functions were renamed in the way how they are
     used by common code:
     	* __test_and_set_bit
     	* __test_and_clear_bit
   * The declaration and implementation of the following functios
     were updated to make Xen build happy:
	* clear_bit
	* set_bit
	* __test_and_clear_bit
	* __test_and_set_bit
* linux/arch/include/linux/bits.h ( taken only definitions for BIT_MASK,
  BIT_WORD, BITS_PER_BYTE )
* linux/include/asm-generic/bitops/generic-non-atomic.h with the
  following changes:
   * Only functions that can be reused in Xen were left;
     others were removed.
   * it was updated the message inside #ifndef ... #endif.
   * __always_inline -> always_inline to be align with definition in
     xen/compiler.h.
   * update function prototypes from
     generic___test_and_*(unsigned long nr nr, volatile unsigned long *addr)
     to
     generic___test_and_*(unsigned long nr, volatile void *addr) to be
     consistent with other related macros/defines.
   * convert identations from tabs to spaces.
   * inside generic__test_and_* use 'bitops_uint_t' instead of 'unsigned long'
     to be generic.

Additionaly, the following bit ops are introduced:
* __ffs
* ffsl
* fls
* flsl
* ffs
* ffz
* find_first_bit_set
* hweight64
* test_bit

Some of the introduced bit operations are included in asm-generic,
as they exhibit similarity across multiple architectures.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
  - updated the commit message: dropped the message about what was taken from linux/include/asm-generic/bitops/find.h
    as related changes now are located in xen/bitops.h. Also these changes were removed from riscv/bitops.h
  - switch tabs to spaces.
  - update return type of __ffs function, format __ffs according to Xen code style. Move the function to
    respective asm-generic header.
  - format ffsl() according to Xen code style, update the type of num: int -> unsigned to be align with
    return type of the function. Move the function to respective asm-generic header.
  - add new line for the files:
      asm-generic/bitops-bits.h
      asm-generic/ffz.h
      asm-generic/find-first-bit-set.h
      asm-generic/fls.h
      asm-generic/flsl.h
      asm-generic/test-bit.h
  - rename asm-generic/find-first-bit-set.h to asm-generic/find-first-set-bit.h to be aligned with the function
    name implemented inside.
  - introduce generic___test_and*() operation for non-atomic bitops.
  - rename current __test_and_*() -> test_and_*() as their implementation are atomic aware.
  - define __test_and_*() to generic___test_and_*().
  - introduce test_and_change_bit().
  - update asm-generic/bitops/bitops-bits.h to give possoibility to change BITOP_*() macros by architecture.
    Also, it was introduced bitops_uint_t type to make generic___test_and_*() generic.
  - "include asm-generic/bitops/bitops-bits.h" to files which use its definitions.
  - add comment why generic ffz is defined as __ffs().
  - update the commit message.
  - swtich ffsl() to generic_ffsl().
---
Changes in V3:
 - update the commit message
 - Introduce the following asm-generic bitops headers:
	create mode 100644 xen/arch/riscv/include/asm/bitops.h
	create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
	create mode 100644 xen/include/asm-generic/bitops/ffs.h
	create mode 100644 xen/include/asm-generic/bitops/ffz.h
	create mode 100644 xen/include/asm-generic/bitops/find-first-bit-set.h
	create mode 100644 xen/include/asm-generic/bitops/fls.h
	create mode 100644 xen/include/asm-generic/bitops/flsl.h
	create mode 100644 xen/include/asm-generic/bitops/hweight.h
	create mode 100644 xen/include/asm-generic/bitops/test-bit.h
 - switch some bitops functions to asm-generic's versions.
 - re-sync some macros with Linux kernel version mentioned in the commit message.
 - Xen code style fixes.
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/bitops.h           | 164 ++++++++++++++++++
 xen/arch/riscv/include/asm/config.h           |   2 +
 xen/include/asm-generic/bitops/__ffs.h        |  47 +++++
 xen/include/asm-generic/bitops/bitops-bits.h  |  21 +++
 xen/include/asm-generic/bitops/ffs.h          |   9 +
 xen/include/asm-generic/bitops/ffsl.h         |  16 ++
 xen/include/asm-generic/bitops/ffz.h          |  18 ++
 .../asm-generic/bitops/find-first-set-bit.h   |  17 ++
 xen/include/asm-generic/bitops/fls.h          |  18 ++
 xen/include/asm-generic/bitops/flsl.h         |  10 ++
 .../asm-generic/bitops/generic-non-atomic.h   |  89 ++++++++++
 xen/include/asm-generic/bitops/hweight.h      |  13 ++
 xen/include/asm-generic/bitops/test-bit.h     |  18 ++
 13 files changed, 442 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/bitops.h
 create mode 100644 xen/include/asm-generic/bitops/__ffs.h
 create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
 create mode 100644 xen/include/asm-generic/bitops/ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffsl.h
 create mode 100644 xen/include/asm-generic/bitops/ffz.h
 create mode 100644 xen/include/asm-generic/bitops/find-first-set-bit.h
 create mode 100644 xen/include/asm-generic/bitops/fls.h
 create mode 100644 xen/include/asm-generic/bitops/flsl.h
 create mode 100644 xen/include/asm-generic/bitops/generic-non-atomic.h
 create mode 100644 xen/include/asm-generic/bitops/hweight.h
 create mode 100644 xen/include/asm-generic/bitops/test-bit.h

diff --git a/xen/arch/riscv/include/asm/bitops.h b/xen/arch/riscv/include/asm/bitops.h
new file mode 100644
index 0000000000..1225298d35
--- /dev/null
+++ b/xen/arch/riscv/include/asm/bitops.h
@@ -0,0 +1,164 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2012 Regents of the University of California */
+
+#ifndef _ASM_RISCV_BITOPS_H
+#define _ASM_RISCV_BITOPS_H
+
+#include <asm/system.h>
+
+#include <asm-generic/bitops/bitops-bits.h>
+
+/* Based on linux/arch/include/linux/bits.h */
+
+#define BIT_MASK(nr)        (1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr)        ((nr) / BITS_PER_LONG)
+
+#define __set_bit(n,p)      set_bit(n,p)
+#define __clear_bit(n,p)    clear_bit(n,p)
+
+/* Based on linux/arch/include/asm/bitops.h */
+
+#if ( BITS_PER_LONG == 64 )
+#define __AMO(op)   "amo" #op ".d"
+#elif ( BITS_PER_LONG == 32 )
+#define __AMO(op)   "amo" #op ".w"
+#else
+#error "Unexpected BITS_PER_LONG"
+#endif
+
+#define __test_and_op_bit_ord(op, mod, nr, addr, ord)   \
+({                                                      \
+    unsigned long __res, __mask;                        \
+    __mask = BIT_MASK(nr);                              \
+    __asm__ __volatile__ (                              \
+        __AMO(op) #ord " %0, %2, %1"                    \
+        : "=r" (__res), "+A" (addr[BIT_WORD(nr)])       \
+        : "r" (mod(__mask))                             \
+        : "memory");                                    \
+    ((__res & __mask) != 0);                            \
+})
+
+#define __op_bit_ord(op, mod, nr, addr, ord)    \
+    __asm__ __volatile__ (                      \
+        __AMO(op) #ord " zero, %1, %0"          \
+        : "+A" (addr[BIT_WORD(nr)])             \
+        : "r" (mod(BIT_MASK(nr)))               \
+        : "memory");
+
+#define __test_and_op_bit(op, mod, nr, addr)    \
+    __test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
+#define __op_bit(op, mod, nr, addr) \
+    __op_bit_ord(op, mod, nr, addr, )
+
+/* Bitmask modifiers */
+#define __NOP(x)    (x)
+#define __NOT(x)    (~(x))
+
+/**
+ * __test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation may be reordered on other architectures than x86.
+ */
+static inline int test_and_set_bit(int nr, volatile void *p)
+{
+    volatile uint32_t *addr = p;
+
+    return __test_and_op_bit(or, __NOP, nr, addr);
+}
+
+/**
+ * __test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation can be reordered on other architectures other than x86.
+ */
+static inline int test_and_clear_bit(int nr, volatile void *p)
+{
+    volatile uint32_t *addr = p;
+
+    return __test_and_op_bit(and, __NOT, nr, addr);
+}
+
+/**
+ * set_bit - Atomically set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Note: there are no guarantees that this function will not be reordered
+ * on non x86 architectures, so if you are writing portable code,
+ * make sure not to rely on its reordering guarantees.
+ *
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+static inline void set_bit(int nr, volatile void *p)
+{
+    volatile uint32_t *addr = p;
+
+    __op_bit(or, __NOP, nr, addr);
+}
+
+/**
+ * clear_bit - Clears a bit in memory
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
+ *
+ * Note: there are no guarantees that this function will not be reordered
+ * on non x86 architectures, so if you are writing portable code,
+ * make sure not to rely on its reordering guarantees.
+ */
+static inline void clear_bit(int nr, volatile void *p)
+{
+    volatile uint32_t *addr = p;
+
+    __op_bit(and, __NOT, nr, addr);
+}
+
+/**
+ * test_and_change_bit - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is atomic and cannot be reordered.
+ * It also implies a memory barrier.
+ */
+static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
+{
+	return __test_and_op_bit(xor, __NOP, nr, addr);
+}
+
+#undef __test_and_op_bit
+#undef __op_bit
+#undef __NOP
+#undef __NOT
+#undef __AMO
+
+#include <asm-generic/bitops/generic-non-atomic.h>
+
+#define __test_and_set_bit generic___test_and_set_bit
+#define __test_and_clear_bit generic___test_and_clear_bit
+#define __test_and_change_bit generic___test_and_change_bit
+
+#include <asm-generic/bitops/fls.h>
+#include <asm-generic/bitops/flsl.h>
+#include <asm-generic/bitops/__ffs.h>
+#include <asm-generic/bitops/ffs.h>
+#include <asm-generic/bitops/ffsl.h>
+#include <asm-generic/bitops/ffz.h>
+#include <asm-generic/bitops/find-first-set-bit.h>
+#include <asm-generic/bitops/hweight.h>
+#include <asm-generic/bitops/test-bit.h>
+
+#endif /* _ASM_RISCV_BITOPS_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index a80cdd4f85..56387ac159 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -50,6 +50,8 @@
 # error "Unsupported RISCV variant"
 #endif
 
+#define BITS_PER_BYTE 8
+
 #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
 #define BITS_PER_LONG  (BYTES_PER_LONG << 3)
 #define POINTER_ALIGN  BYTES_PER_LONG
diff --git a/xen/include/asm-generic/bitops/__ffs.h b/xen/include/asm-generic/bitops/__ffs.h
new file mode 100644
index 0000000000..fecb4484d9
--- /dev/null
+++ b/xen/include/asm-generic/bitops/__ffs.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS___FFS_H_
+#define _ASM_GENERIC_BITOPS___FFS_H_
+
+/**
+ * ffs - find first bit in word.
+ * @word: The word to search
+ *
+ * Returns 0 if no bit exists, otherwise returns 1-indexed bit location.
+ */
+static inline unsigned int __ffs(unsigned long word)
+{
+    unsigned int num = 0;
+
+#if BITS_PER_LONG == 64
+    if ( (word & 0xffffffff) == 0 )
+    {
+        num += 32;
+        word >>= 32;
+    }
+#endif
+    if ( (word & 0xffff) == 0 )
+    {
+        num += 16;
+        word >>= 16;
+    }
+    if ( (word & 0xff) == 0 )
+    {
+        num += 8;
+        word >>= 8;
+    }
+    if ( (word & 0xf) == 0 )
+    {
+        num += 4;
+        word >>= 4;
+    }
+    if ( (word & 0x3) == 0 )
+    {
+        num += 2;
+        word >>= 2;
+    }
+    if ( (word & 0x1) == 0 )
+        num += 1;
+    return num;
+}
+
+#endif /* _ASM_GENERIC_BITOPS___FFS_H_ */
diff --git a/xen/include/asm-generic/bitops/bitops-bits.h b/xen/include/asm-generic/bitops/bitops-bits.h
new file mode 100644
index 0000000000..4ece2affd6
--- /dev/null
+++ b/xen/include/asm-generic/bitops/bitops-bits.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_BITS_H_
+#define _ASM_GENERIC_BITOPS_BITS_H_
+
+#ifndef BITOP_BITS_PER_WORD
+#define BITOP_BITS_PER_WORD     32
+#endif
+
+#ifndef BITOP_MASK
+#define BITOP_MASK(nr)          (1U << ((nr) % BITOP_BITS_PER_WORD))
+#endif
+
+#ifndef BITOP_WORD
+#define BITOP_WORD(nr)          ((nr) / BITOP_BITS_PER_WORD)
+#endif
+
+#ifndef BITOP_TYPE
+typedef uint32_t bitops_uint_t;
+#endif
+
+#endif /* _ASM_GENERIC_BITOPS_BITS_H_ */
diff --git a/xen/include/asm-generic/bitops/ffs.h b/xen/include/asm-generic/bitops/ffs.h
new file mode 100644
index 0000000000..3f75fded14
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffs.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFS_H_
+#define _ASM_GENERIC_BITOPS_FFS_H_
+
+#include <xen/macros.h>
+
+#define ffs(x) ({ unsigned int t_ = (x); fls(ISOLATE_LSB(t_)); })
+
+#endif /* _ASM_GENERIC_BITOPS_FFS_H_ */
diff --git a/xen/include/asm-generic/bitops/ffsl.h b/xen/include/asm-generic/bitops/ffsl.h
new file mode 100644
index 0000000000..d0996808f5
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffsl.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFSL_H_
+#define _ASM_GENERIC_BITOPS_FFSL_H_
+
+/**
+ * ffsl - find first bit in long.
+ * @word: The word to search
+ *
+ * Returns 0 if no bit exists, otherwise returns 1-indexed bit location.
+ */
+static inline unsigned int ffsl(unsigned long word)
+{
+    return generic_ffsl(word);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FFSL_H_ */
diff --git a/xen/include/asm-generic/bitops/ffz.h b/xen/include/asm-generic/bitops/ffz.h
new file mode 100644
index 0000000000..5932fe6695
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffz.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFZ_H_
+#define _ASM_GENERIC_BITOPS_FFZ_H_
+
+/*
+ * ffz - find first zero in word.
+ * @word: The word to search
+ *
+ * Undefined if no zero exists, so code should check against ~0UL first.
+ *
+ * ffz() is defined as __ffs() and not as ffs() as it is defined in such
+ * a way in Linux kernel (6.4.0 ) from where this header was taken, so this
+ * header is supposed to be aligned with Linux kernel version.
+ * Also, most architectures are defined in the same way in Xen.
+ */
+#define ffz(x)  __ffs(~(x))
+
+#endif /* _ASM_GENERIC_BITOPS_FFZ_H_ */
diff --git a/xen/include/asm-generic/bitops/find-first-set-bit.h b/xen/include/asm-generic/bitops/find-first-set-bit.h
new file mode 100644
index 0000000000..7d28b8a89b
--- /dev/null
+++ b/xen/include/asm-generic/bitops/find-first-set-bit.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_
+#define _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_
+
+/**
+ * find_first_set_bit - find the first set bit in @word
+ * @word: the word to search
+ *
+ * Returns the bit-number of the first set bit (first bit being 0).
+ * The input must *not* be zero.
+ */
+static inline unsigned int find_first_set_bit(unsigned long word)
+{
+        return ffsl(word) - 1;
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_ */
diff --git a/xen/include/asm-generic/bitops/fls.h b/xen/include/asm-generic/bitops/fls.h
new file mode 100644
index 0000000000..369a4c790c
--- /dev/null
+++ b/xen/include/asm-generic/bitops/fls.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FLS_H_
+#define _ASM_GENERIC_BITOPS_FLS_H_
+
+/**
+ * fls - find last (most-significant) bit set
+ * @x: the word to search
+ *
+ * This is defined the same way as ffs.
+ * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
+ */
+
+static inline int fls(unsigned int x)
+{
+    return generic_fls(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */
diff --git a/xen/include/asm-generic/bitops/flsl.h b/xen/include/asm-generic/bitops/flsl.h
new file mode 100644
index 0000000000..d0a2e9c729
--- /dev/null
+++ b/xen/include/asm-generic/bitops/flsl.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
+#define _ASM_GENERIC_BITOPS_FLSL_H_
+
+static inline int flsl(unsigned long x)
+{
+    return generic_flsl(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */
diff --git a/xen/include/asm-generic/bitops/generic-non-atomic.h b/xen/include/asm-generic/bitops/generic-non-atomic.h
new file mode 100644
index 0000000000..07efca245e
--- /dev/null
+++ b/xen/include/asm-generic/bitops/generic-non-atomic.h
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * The file is based on Linux ( 6.4.0 ) header:
+ *   include/asm-generic/bitops/generic-non-atomic.h
+ * 
+ * Only functions that can be reused in Xen were left; others were removed.
+ * 
+ * Also, the following changes were done:
+ *  - it was updated the message inside #ifndef ... #endif.
+ *  - __always_inline -> always_inline to be align with definition in
+ *    xen/compiler.h.
+ *  - update function prototypes from
+ *    generic___test_and_*(unsigned long nr nr, volatile unsigned long *addr) to
+ *    generic___test_and_*(unsigned long nr, volatile void *addr) to be
+ *    consistent with other related macros/defines.
+ *  - convert identations from tabs to spaces.
+ *  - inside generic__test_and_* use 'bitops_uint_t' instead of 'unsigned long'
+ *    to be generic.
+ */
+
+#ifndef __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+#define __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+
+#include <xen/compiler.h>
+
+#include <asm-generic/bitops/bitops-bits.h>
+
+#ifndef _LINUX_BITOPS_H
+#error only <xen/bitops.h> can be included directly
+#endif
+
+/*
+ * Generic definitions for bit operations, should not be used in regular code
+ * directly.
+ */
+
+/**
+ * generic___test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static always_inline bool
+generic___test_and_set_bit(unsigned long nr, volatile void *addr)
+{
+    bitops_uint_t mask = BIT_MASK(nr);
+    bitops_uint_t *p = ((bitops_uint_t *)addr) + BIT_WORD(nr);
+    bitops_uint_t old = *p;
+
+    *p = old | mask;
+    return (old & mask) != 0;
+}
+
+/**
+ * generic___test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static always_inline bool
+generic___test_and_clear_bit(bitops_uint_t nr, volatile void *addr)
+{
+    bitops_uint_t mask = BIT_MASK(nr);
+    bitops_uint_t *p = ((bitops_uint_t *)addr) + BIT_WORD(nr);
+    bitops_uint_t old = *p;
+
+    *p = old & ~mask;
+    return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered! */
+static always_inline bool
+generic___test_and_change_bit(unsigned long nr, volatile void *addr)
+{
+    bitops_uint_t mask = BIT_MASK(nr);
+    bitops_uint_t *p = ((bitops_uint_t *)addr) + BIT_WORD(nr);
+    bitops_uint_t old = *p;
+
+    *p = old ^ mask;
+    return (old & mask) != 0;
+}
+
+#endif /* __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H */
diff --git a/xen/include/asm-generic/bitops/hweight.h b/xen/include/asm-generic/bitops/hweight.h
new file mode 100644
index 0000000000..0d7577054e
--- /dev/null
+++ b/xen/include/asm-generic/bitops/hweight.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_HWEIGHT_H_
+#define _ASM_GENERIC_BITOPS_HWEIGHT_H_
+
+/*
+ * hweightN - returns the hamming weight of a N-bit word
+ * @x: the word to weigh
+ *
+ * The Hamming Weight of a number is the total number of bits set in it.
+ */
+#define hweight64(x) generic_hweight64(x)
+
+#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */
diff --git a/xen/include/asm-generic/bitops/test-bit.h b/xen/include/asm-generic/bitops/test-bit.h
new file mode 100644
index 0000000000..6fb414d808
--- /dev/null
+++ b/xen/include/asm-generic/bitops/test-bit.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_TESTBIT_H_
+#define _ASM_GENERIC_BITOPS_TESTBIT_H_
+
+#include <asm-generic/bitops/bitops-bits.h>
+
+/**
+ * test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static inline int test_bit(int nr, const volatile void *addr)
+{
+    const volatile bitops_uint_t *p = addr;
+    return 1 & (p[BITOP_WORD(nr)] >> (nr & (BITOP_BITS_PER_WORD - 1)));
+}
+
+#endif /* _ASM_GENERIC_BITOPS_TESTBIT_H_ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 10/30] xen/riscv: introduce flushtlb.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (8 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 09/30] xen/riscv: introduce bitops.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 11/30] xen/riscv: introduce smp.h Oleksii Kurochko
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - s/BUG/BUG_ON(...)
---
Changes in V3:
 - add SPDX & footer
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/flushtlb.h | 34 +++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/flushtlb.h

diff --git a/xen/arch/riscv/include/asm/flushtlb.h b/xen/arch/riscv/include/asm/flushtlb.h
new file mode 100644
index 0000000000..7ce32bea0b
--- /dev/null
+++ b/xen/arch/riscv/include/asm/flushtlb.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_FLUSHTLB_H__
+#define __ASM_RISCV_FLUSHTLB_H__
+
+#include <xen/bug.h>
+#include <xen/cpumask.h>
+
+/*
+ * Filter the given set of CPUs, removing those that definitely flushed their
+ * TLB since @page_timestamp.
+ */
+/* XXX lazy implementation just doesn't clear anything.... */
+static inline void tlbflush_filter(cpumask_t *mask, uint32_t page_timestamp) {}
+
+#define tlbflush_current_time() (0)
+
+static inline void page_set_tlbflush_timestamp(struct page_info *page)
+{
+    BUG_ON("unimplemented");
+}
+
+/* Flush specified CPUs' TLBs */
+void arch_flush_tlb_mask(const cpumask_t *mask);
+
+#endif /* __ASM_RISCV_FLUSHTLB_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 11/30] xen/riscv: introduce smp.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (9 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 10/30] xen/riscv: introduce flushtlb.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:13   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
                   ` (18 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - update the V3 changes ( remove that cpu_is_offline() was droped and instead of message
   used subject )
 - drop cpu_is_offline() as it was moved to xen/smp.h.
---
Changes in V3:
 - add SPDX.
 - drop unnessary #ifdef.
 - fix "No new line"
 - update the commit subject
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/smp.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/smp.h

diff --git a/xen/arch/riscv/include/asm/smp.h b/xen/arch/riscv/include/asm/smp.h
new file mode 100644
index 0000000000..b1ea91b1eb
--- /dev/null
+++ b/xen/arch/riscv/include/asm/smp.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_SMP_H
+#define __ASM_RISCV_SMP_H
+
+#include <xen/cpumask.h>
+#include <xen/percpu.h>
+
+DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_mask);
+DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
+
+/*
+ * Do we, for platform reasons, need to actually keep CPUs online when we
+ * would otherwise prefer them to be off?
+ */
+#define park_offline_cpus false
+
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (10 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 11/30] xen/riscv: introduce smp.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-13 10:37   ` Jan Beulich
  2024-02-18 19:00   ` Julien Grall
  2024-02-05 15:32 ` [PATCH v4 13/30] xen/riscv: introduce io.h Oleksii Kurochko
                   ` (17 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The header was taken from Linux kernl 6.4.0-rc1.

Addionally, were updated:
* add emulation of {cmp}xchg for 1/2 byte types
* replace tabs with spaces
* replace __* varialbed with *__
* introduce generic version of xchg_* and cmpxchg_*.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - Code style fixes.
 - enforce in __xchg_*() has the same type for new and *ptr, also "\n"
   was removed at the end of asm instruction.
 - dependency from https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@bugseng.com/
 - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
 - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
 - drop cmpxcg{32,64}_{local} as they aren't used.
 - introduce generic version of xchg_* and cmpxchg_*.
 - update the commit message.
---
Changes in V3:
 - update the commit message
 - add emulation of {cmp}xchg_... for 1 and 2 bytes types
---
Changes in V2:
 - update the comment at the top of the header.
 - change xen/lib.h to xen/bug.h.
 - sort inclusion of headers properly.
---
 xen/arch/riscv/include/asm/cmpxchg.h | 237 +++++++++++++++++++++++++++
 1 file changed, 237 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h

diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
new file mode 100644
index 0000000000..b751a50cbf
--- /dev/null
+++ b/xen/arch/riscv/include/asm/cmpxchg.h
@@ -0,0 +1,237 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2014 Regents of the University of California */
+
+#ifndef _ASM_RISCV_CMPXCHG_H
+#define _ASM_RISCV_CMPXCHG_H
+
+#include <xen/compiler.h>
+#include <xen/lib.h>
+
+#include <asm/fence.h>
+#include <asm/io.h>
+#include <asm/system.h>
+
+#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
+
+#define __amoswap_generic(ptr, new, ret, sfx, release_barrier, acquire_barrier) \
+({ \
+    asm volatile( \
+        release_barrier \
+        " amoswap" sfx " %0, %2, %1\n" \
+        acquire_barrier \
+        : "=r" (ret), "+A" (*ptr) \
+        : "r" (new) \
+        : "memory" ); \
+})
+
+#define emulate_xchg_1_2(ptr, new, ret, release_barrier, acquire_barrier) \
+({ \
+    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned long)ptr, 4); \
+    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr))) * BITS_PER_BYTE; \
+    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
+    uint8_t mask_h = mask_l + mask_size - 1; \
+    unsigned long mask = GENMASK(mask_h, mask_l); \
+    unsigned long new_ = (unsigned long)(new) << mask_l; \
+    unsigned long ret_; \
+    unsigned long rc; \
+    \
+    asm volatile( \
+        release_barrier \
+        "0: lr.d %0, %2\n" \
+        "   and  %1, %0, %z4\n" \
+        "   or   %1, %1, %z3\n" \
+        "   sc.d %1, %1, %2\n" \
+        "   bnez %1, 0b\n" \
+        acquire_barrier \
+        : "=&r" (ret_), "=&r" (rc), "+A" (*ptr_32b_aligned) \
+        : "rJ" (new_), "rJ" (~mask) \
+        : "memory"); \
+    \
+    ret = (__typeof__(*(ptr)))((ret_ & mask) >> mask_l); \
+})
+
+#define __xchg_generic(ptr, new, size, sfx, release_barrier, acquire_barrier) \
+({ \
+    __typeof__(ptr) ptr__ = (ptr); \
+    __typeof__(*(ptr)) new__ = (new); \
+    __typeof__(*(ptr)) ret__; \
+    switch (size) \
+    { \
+    case 1: \
+    case 2: \
+        emulate_xchg_1_2(ptr__, new__, ret__, release_barrier, acquire_barrier); \
+        break; \
+    case 4: \
+        __amoswap_generic(ptr__, new__, ret__,\
+                          ".w" sfx,  release_barrier, acquire_barrier); \
+        break; \
+    case 8: \
+        __amoswap_generic(ptr__, new__, ret__,\
+                          ".d" sfx,  release_barrier, acquire_barrier); \
+        break; \
+    default: \
+        STATIC_ASSERT_UNREACHABLE(); \
+    } \
+    ret__; \
+})
+
+#define xchg_relaxed(ptr, x) \
+({ \
+    __typeof__(*(ptr)) x_ = (x); \
+    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), "", "", ""); \
+})
+
+#define xchg_acquire(ptr, x) \
+({ \
+    __typeof__(*(ptr)) x_ = (x); \
+    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
+                                       "", "", RISCV_ACQUIRE_BARRIER); \
+})
+
+#define xchg_release(ptr, x) \
+({ \
+    __typeof__(*(ptr)) x_ = (x); \
+    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
+                                       "", RISCV_RELEASE_BARRIER, ""); \
+})
+
+#define xchg(ptr,x) \
+({ \
+    __typeof__(*(ptr)) ret__; \
+    ret__ = (__typeof__(*(ptr))) \
+            __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
+                           ".aqrl", "", ""); \
+    ret__; \
+})
+
+#define __generic_cmpxchg(ptr, old, new, ret, lr_sfx, sc_sfx, release_barrier, acquire_barrier)	\
+ ({ \
+    register unsigned int rc; \
+    asm volatile( \
+        release_barrier \
+        "0: lr" lr_sfx " %0, %2\n" \
+        "   bne  %0, %z3, 1f\n" \
+        "   sc" sc_sfx " %1, %z4, %2\n" \
+        "   bnez %1, 0b\n" \
+        acquire_barrier \
+        "1:\n" \
+        : "=&r" (ret), "=&r" (rc), "+A" (*ptr) \
+        : "rJ" (old), "rJ" (new) \
+        : "memory"); \
+ })
+
+#define emulate_cmpxchg_1_2(ptr, old, new, ret, sc_sfx, release_barrier, acquire_barrier) \
+({ \
+    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned long)ptr, 4); \
+    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr))) * BITS_PER_BYTE; \
+    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
+    uint8_t mask_h = mask_l + mask_size - 1; \
+    unsigned long mask = GENMASK(mask_h, mask_l); \
+    unsigned long old_ = (unsigned long)(old) << mask_l; \
+    unsigned long new_ = (unsigned long)(new) << mask_l; \
+    unsigned long ret_; \
+    unsigned long rc; \
+    \
+    __asm__ __volatile__ ( \
+        release_barrier \
+        "0: lr.d %0, %2\n" \
+        "   and  %1, %0, %z5\n" \
+        "   bne  %1, %z3, 1f\n" \
+        "   and  %1, %0, %z6\n" \
+        "   or   %1, %1, %z4\n" \
+        "   sc.d" sc_sfx " %1, %1, %2\n" \
+        "   bnez %1, 0b\n" \
+        acquire_barrier \
+        "1:\n" \
+        : "=&r" (ret_), "=&r" (rc), "+A" (*ptr_32b_aligned) \
+        : "rJ" (old_), "rJ" (new_), \
+          "rJ" (mask), "rJ" (~mask) \
+        : "memory"); \
+    \
+    ret = (__typeof__(*(ptr)))((ret_ & mask) >> mask_l); \
+})
+
+/*
+ * Atomic compare and exchange.  Compare OLD with MEM, if identical,
+ * store NEW in MEM.  Return the initial value in MEM.  Success is
+ * indicated by comparing RETURN with OLD.
+ */
+#define __cmpxchg_generic(ptr, old, new, size, sc_sfx, release_barrier, acquire_barrier) \
+({ \
+    __typeof__(ptr) ptr__ = (ptr); \
+    __typeof__(*(ptr)) old__ = (__typeof__(*(ptr)))(old); \
+    __typeof__(*(ptr)) new__ = (__typeof__(*(ptr)))(new); \
+    __typeof__(*(ptr)) ret__; \
+    switch (size) \
+    { \
+    case 1: \
+    case 2: \
+        emulate_cmpxchg_1_2(ptr, old, new, ret__,\
+                            sc_sfx, release_barrier, acquire_barrier); \
+        break; \
+    case 4: \
+        __generic_cmpxchg(ptr__, old__, new__, ret__, \
+                          ".w", ".w"sc_sfx, release_barrier, acquire_barrier); \
+        break; \
+    case 8: \
+        __generic_cmpxchg(ptr__, old__, new__, ret__, \
+                          ".d", ".d"sc_sfx, release_barrier, acquire_barrier); \
+        break; \
+    default: \
+        STATIC_ASSERT_UNREACHABLE(); \
+    } \
+    ret__; \
+})
+
+#define cmpxchg_relaxed(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) o_ = (o); \
+    __typeof__(*(ptr)) n_ = (n); \
+    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, \
+                    o_, n_, sizeof(*(ptr)), "", "", ""); \
+})
+
+#define cmpxchg_acquire(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) o_ = (o); \
+    __typeof__(*(ptr)) n_ = (n); \
+    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_, sizeof(*(ptr)), \
+                                          "", "", RISCV_ACQUIRE_BARRIER); \
+})
+
+#define cmpxchg_release(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) o_ = (o); \
+    __typeof__(*(ptr)) n_ = (n); \
+    (__typeof__(*(ptr)))__cmpxchg_release(ptr, o_, n_, sizeof(*(ptr)), \
+                                          "", RISCV_RELEASE_BARRIER, ""); \
+})
+
+#define cmpxchg(ptr, o, n) \
+({ \
+    __typeof__(*(ptr)) ret__; \
+    ret__ = (__typeof__(*(ptr))) \
+            __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
+                              sizeof(*(ptr)), ".rl", "", " fence rw, rw\n"); \
+    ret__; \
+})
+
+#define __cmpxchg(ptr, o, n, s) \
+({ \
+    __typeof__(*(ptr)) ret__; \
+    ret__ = (__typeof__(*(ptr))) \
+            __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
+                              s, ".rl", "", " fence rw, rw\n"); \
+    ret__; \
+})
+
+#endif /* _ASM_RISCV_CMPXCHG_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 13/30] xen/riscv: introduce io.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (11 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-13 11:05   ` Jan Beulich
  2024-02-18 19:07   ` Julien Grall
  2024-02-05 15:32 ` [PATCH v4 14/30] xen/riscv: introduce atomic.h Oleksii Kurochko
                   ` (16 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The header taken form Linux 6.4.0-rc1 and is based on
arch/riscv/include/asm/mmio.h.

Addionally, to the header was added definions of ioremap_*().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - delete inner parentheses in macros.
 - s/u<N>/uint<N>.
---
Changes in V3:
 - re-sync with linux kernel
 - update the commit message
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/io.h | 142 ++++++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/io.h

diff --git a/xen/arch/riscv/include/asm/io.h b/xen/arch/riscv/include/asm/io.h
new file mode 100644
index 0000000000..1e61a40522
--- /dev/null
+++ b/xen/arch/riscv/include/asm/io.h
@@ -0,0 +1,142 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * {read,write}{b,w,l,q} based on arch/arm64/include/asm/io.h
+ *   which was based on arch/arm/include/io.h
+ *
+ * Copyright (C) 1996-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2014 Regents of the University of California
+ */
+
+
+#ifndef _ASM_RISCV_IO_H
+#define _ASM_RISCV_IO_H
+
+#include <asm/byteorder.h>
+
+/*
+ * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
+ * change the properties of memory regions.  This should be fixed by the
+ * upcoming platform spec.
+ */
+#define ioremap_nocache(addr, size) ioremap(addr, size)
+#define ioremap_wc(addr, size) ioremap(addr, size)
+#define ioremap_wt(addr, size) ioremap(addr, size)
+
+/* Generic IO read/write.  These perform native-endian accesses. */
+#define __raw_writeb __raw_writeb
+static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
+{
+	asm volatile("sb %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+
+#define __raw_writew __raw_writew
+static inline void __raw_writew(uint16_t val, volatile void __iomem *addr)
+{
+	asm volatile("sh %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+
+#define __raw_writel __raw_writel
+static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
+{
+	asm volatile("sw %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_writeq __raw_writeq
+static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
+{
+	asm volatile("sd %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+#endif
+
+#define __raw_readb __raw_readb
+static inline uint8_t __raw_readb(const volatile void __iomem *addr)
+{
+	uint8_t val;
+
+	asm volatile("lb %0, 0(%1)" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+#define __raw_readw __raw_readw
+static inline uint16_t __raw_readw(const volatile void __iomem *addr)
+{
+	uint16_t val;
+
+	asm volatile("lh %0, 0(%1)" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+#define __raw_readl __raw_readl
+static inline uint32_t __raw_readl(const volatile void __iomem *addr)
+{
+	uint32_t val;
+
+	asm volatile("lw %0, 0(%1)" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_readq __raw_readq
+static inline u64 __raw_readq(const volatile void __iomem *addr)
+{
+	u64 val;
+
+	asm volatile("ld %0, 0(%1)" : "=r" (val) : "r" (addr));
+	return val;
+}
+#endif
+
+/*
+ * Unordered I/O memory access primitives.  These are even more relaxed than
+ * the relaxed versions, as they don't even order accesses between successive
+ * operations to the I/O regions.
+ */
+#define readb_cpu(c)		({ uint8_t  __r = __raw_readb(c); __r; })
+#define readw_cpu(c)		({ uint16_t __r = le16_to_cpu((__force __le16)__raw_readw(c)); __r; })
+#define readl_cpu(c)		({ uint32_t __r = le32_to_cpu((__force __le32)__raw_readl(c)); __r; })
+
+#define writeb_cpu(v,c)		((void)__raw_writeb(v,c))
+#define writew_cpu(v,c)		((void)__raw_writew((__force uint16_t)cpu_to_le16(v),c))
+#define writel_cpu(v,c)		((void)__raw_writel((__force uint32_t)cpu_to_le32(v),c))
+
+#ifdef CONFIG_64BIT
+#define readq_cpu(c)		({ u64 __r = le64_to_cpu((__force __le64)__raw_readq(c)); __r; })
+#define writeq_cpu(v,c)		((void)__raw_writeq((__force u64)cpu_to_le64(v),c))
+#endif
+
+/*
+ * I/O memory access primitives. Reads are ordered relative to any
+ * following Normal memory access. Writes are ordered relative to any prior
+ * Normal memory access.  The memory barriers here are necessary as RISC-V
+ * doesn't define any ordering between the memory space and the I/O space.
+ */
+#define __io_br()	do {} while (0)
+#define __io_ar(v)	__asm__ __volatile__ ("fence i,r" : : : "memory");
+#define __io_bw()	__asm__ __volatile__ ("fence w,o" : : : "memory");
+#define __io_aw()	do { } while (0)
+
+#define readb(c)	({ uint8_t  __v; __io_br(); __v = readb_cpu(c); __io_ar(__v); __v; })
+#define readw(c)	({ uint16_t __v; __io_br(); __v = readw_cpu(c); __io_ar(__v); __v; })
+#define readl(c)	({ uint32_t __v; __io_br(); __v = readl_cpu(c); __io_ar(__v); __v; })
+
+#define writeb(v,c)	({ __io_bw(); writeb_cpu(v,c); __io_aw(); })
+#define writew(v,c)	({ __io_bw(); writew_cpu(v,c); __io_aw(); })
+#define writel(v,c)	({ __io_bw(); writel_cpu(v,c); __io_aw(); })
+
+#ifdef CONFIG_64BIT
+#define readq(c)	({ u64 __v; __io_br(); __v = readq_cpu(c); __io_ar(__v); __v; })
+#define writeq(v,c)	({ __io_bw(); writeq_cpu((v),(c)); __io_aw(); })
+#endif
+
+#endif /* _ASM_RISCV_IO_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 14/30] xen/riscv: introduce atomic.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (12 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 13/30] xen/riscv: introduce io.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-13 11:36   ` Jan Beulich
  2024-02-18 19:22   ` Julien Grall
  2024-02-05 15:32 ` [PATCH v4 15/30] xen/riscv: introduce irq.h Oleksii Kurochko
                   ` (15 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Bobby Eshleman, Alistair Francis, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu, Oleksii Kurochko

From: Bobby Eshleman <bobbyeshleman@gmail.com>

Additionally, this patch introduces macros in fence.h,
which are utilized in atomic.h.

atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n)
were updated to use __*xchg_generic().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
 - drop casts in read_atomic_size(), write_atomic(), add_sized()
 - tabs -> spaces
 - drop #ifdef CONFIG_SMP ... #endif in fence.ha as it is simpler to handle NR_CPUS=1
   the same as NR_CPUS>1 with accepting less than ideal performance.
---
Changes in V3:
  - update the commit message
  - add SPDX for fence.h
  - code style fixes
  - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
  - re-order the patch
  - merge to this patch fence.h
---
Changes in V2:
 - Change an author of commit. I got this header from Bobby's old repo.
---
 xen/arch/riscv/include/asm/atomic.h | 395 ++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/fence.h  |   8 +
 2 files changed, 403 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/arch/riscv/include/asm/fence.h

diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
new file mode 100644
index 0000000000..267d3c0803
--- /dev/null
+++ b/xen/arch/riscv/include/asm/atomic.h
@@ -0,0 +1,395 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Taken and modified from Linux.
+ *
+ * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
+ * __*xchg_generic()
+ * 
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ * Copyright (C) 2021 Vates SAS
+ */
+
+#ifndef _ASM_RISCV_ATOMIC_H
+#define _ASM_RISCV_ATOMIC_H
+
+#include <xen/atomic.h>
+#include <asm/cmpxchg.h>
+#include <asm/fence.h>
+#include <asm/io.h>
+#include <asm/system.h>
+
+void __bad_atomic_size(void);
+
+static always_inline void read_atomic_size(const volatile void *p,
+                                           void *res,
+                                           unsigned int size)
+{
+    switch ( size )
+    {
+    case 1: *(uint8_t *)res = readb(p); break;
+    case 2: *(uint16_t *)res = readw(p); break;
+    case 4: *(uint32_t *)res = readl(p); break;
+    case 8: *(uint32_t *)res  = readq(p); break;
+    default: __bad_atomic_size(); break;
+    }
+}
+
+#define read_atomic(p) ({                               \
+    union { typeof(*p) val; char c[0]; } x_;            \
+    read_atomic_size(p, x_.c, sizeof(*p));              \
+    x_.val;                                             \
+})
+
+#define write_atomic(p, x)                              \
+({                                                      \
+    typeof(*p) x__ = (x);                               \
+    switch ( sizeof(*p) )                               \
+    {                                                   \
+    case 1: writeb((uint8_t)x__,  p); break;            \
+    case 2: writew((uint16_t)x__, p); break;            \
+    case 4: writel((uint32_t)x__, p); break;            \
+    case 8: writeq((uint64_t)x__, p); break;            \
+    default: __bad_atomic_size(); break;                \
+    }                                                   \
+    x__;                                                \
+})
+
+#define add_sized(p, x)                                 \
+({                                                      \
+    typeof(*(p)) x__ = (x);                             \
+    switch ( sizeof(*(p)) )                             \
+    {                                                   \
+    case 1: writeb(read_atomic(p) + x__, p); break;     \
+    case 2: writew(read_atomic(p) + x__, p); break;     \
+    case 4: writel(read_atomic(p) + x__, p); break;     \
+    default: __bad_atomic_size(); break;                \
+    }                                                   \
+})
+
+/*
+ *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ *               non-scalar types unchanged.
+ *
+ * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
+ * is not type-compatible with 'signed char', and we define a separate case.
+ */
+#define __scalar_type_to_expr_cases(type)               \
+    unsigned type:  (unsigned type)0,                   \
+    signed type:    (signed type)0
+
+#define __unqual_scalar_typeof(x) typeof(               \
+    _Generic((x),                                       \
+        char:  (char)0,                                 \
+        __scalar_type_to_expr_cases(char),              \
+        __scalar_type_to_expr_cases(short),             \
+        __scalar_type_to_expr_cases(int),               \
+        __scalar_type_to_expr_cases(long),              \
+        __scalar_type_to_expr_cases(long long),         \
+        default: (x)))
+
+#define READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define WRITE_ONCE(x, val)                                      \
+    do {                                                        \
+        *(volatile typeof(x) *)&(x) = (val);                    \
+    } while (0)
+
+#define __atomic_acquire_fence() \
+    __asm__ __volatile__( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
+
+#define __atomic_release_fence() \
+    __asm__ __volatile__( RISCV_RELEASE_BARRIER "" ::: "memory" );
+
+static inline int atomic_read(const atomic_t *v)
+{
+    return READ_ONCE(v->counter);
+}
+
+static inline int _atomic_read(atomic_t v)
+{
+    return v.counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+    WRITE_ONCE(v->counter, i);
+}
+
+static inline void _atomic_set(atomic_t *v, int i)
+{
+    v->counter = i;
+}
+
+static inline int atomic_sub_and_test(int i, atomic_t *v)
+{
+    return atomic_sub_return(i, v) == 0;
+}
+
+static inline void atomic_inc(atomic_t *v)
+{
+    atomic_add(1, v);
+}
+
+static inline int atomic_inc_return(atomic_t *v)
+{
+    return atomic_add_return(1, v);
+}
+
+static inline void atomic_dec(atomic_t *v)
+{
+    atomic_sub(1, v);
+}
+
+static inline int atomic_dec_return(atomic_t *v)
+{
+    return atomic_sub_return(1, v);
+}
+
+static inline int atomic_dec_and_test(atomic_t *v)
+{
+    return atomic_sub_return(1, v) == 0;
+}
+
+static inline int atomic_add_negative(int i, atomic_t *v)
+{
+    return atomic_add_return(i, v) < 0;
+}
+
+static inline int atomic_inc_and_test(atomic_t *v)
+{
+    return atomic_add_return(1, v) == 0;
+}
+
+/*
+ * First, the atomic ops that have no ordering constraints and therefor don't
+ * have the AQ or RL bits set.  These don't return anything, so there's only
+ * one version to worry about.
+ */
+#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
+static inline                                               \
+void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
+{                                                           \
+    __asm__ __volatile__ (                                  \
+        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
+        : "+A" (v->counter)                                 \
+        : "r" (I)                                           \
+        : "memory" );                                       \
+}                                                           \
+
+#define ATOMIC_OPS(op, asm_op, I)                           \
+        ATOMIC_OP (op, asm_op, I, w, int,   )
+
+ATOMIC_OPS(add, add,  i)
+ATOMIC_OPS(sub, add, -i)
+ATOMIC_OPS(and, and,  i)
+ATOMIC_OPS( or,  or,  i)
+ATOMIC_OPS(xor, xor,  i)
+
+#undef ATOMIC_OP
+#undef ATOMIC_OPS
+
+/*
+ * Atomic ops that have ordered, relaxed, acquire, and release variants.
+ * There's two flavors of these: the arithmatic ops have both fetch and return
+ * versions, while the logical ops only have fetch versions.
+ */
+#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
+static inline                                                       \
+c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
+                         atomic##prefix##_t *v)                     \
+{                                                                   \
+    register c_type ret;                                            \
+    __asm__ __volatile__ (                                          \
+        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
+        : "+A" (v->counter), "=r" (ret)                             \
+        : "r" (I)                                                   \
+        : "memory" );                                               \
+    return ret;                                                     \
+}                                                                   \
+static inline                                                       \
+c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
+{                                                                   \
+    register c_type ret;                                            \
+    __asm__ __volatile__ (                                          \
+        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
+        : "+A" (v->counter), "=r" (ret)                             \
+        : "r" (I)                                                   \
+        : "memory" );                                               \
+    return ret;                                                     \
+}
+
+#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
+static inline                                                           \
+c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
+                          atomic##prefix##_t *v)                        \
+{                                                                       \
+        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
+}                                                                       \
+static inline                                                           \
+c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
+{                                                                       \
+        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
+}
+
+#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
+        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
+        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
+
+ATOMIC_OPS(add, add, +,  i)
+ATOMIC_OPS(sub, add, +, -i)
+
+#define atomic_add_return_relaxed   atomic_add_return_relaxed
+#define atomic_sub_return_relaxed   atomic_sub_return_relaxed
+#define atomic_add_return   atomic_add_return
+#define atomic_sub_return   atomic_sub_return
+
+#define atomic_fetch_add_relaxed    atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed    atomic_fetch_sub_relaxed
+#define atomic_fetch_add    atomic_fetch_add
+#define atomic_fetch_sub    atomic_fetch_sub
+
+#undef ATOMIC_OPS
+
+#define ATOMIC_OPS(op, asm_op, I) \
+        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )
+
+ATOMIC_OPS(and, and, i)
+ATOMIC_OPS( or,  or, i)
+ATOMIC_OPS(xor, xor, i)
+
+#define atomic_fetch_and_relaxed    atomic_fetch_and_relaxed
+#define atomic_fetch_or_relaxed	    atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed    atomic_fetch_xor_relaxed
+#define atomic_fetch_and    atomic_fetch_and
+#define atomic_fetch_or     atomic_fetch_or
+#define atomic_fetch_xor    atomic_fetch_xor
+
+#undef ATOMIC_OPS
+
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP_RETURN
+
+/* This is required to provide a full barrier on success. */
+static inline int atomic_add_unless(atomic_t *v, int a, int u)
+{
+       int prev, rc;
+
+    __asm__ __volatile__ (
+        "0: lr.w     %[p],  %[c]\n"
+        "   beq      %[p],  %[u], 1f\n"
+        "   add      %[rc], %[p], %[a]\n"
+        "   sc.w.rl  %[rc], %[rc], %[c]\n"
+        "   bnez     %[rc], 0b\n"
+        "   fence    rw, rw\n"
+        "1:\n"
+        : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+        : [a]"r" (a), [u]"r" (u)
+        : "memory");
+    return prev;
+}
+#define atomic_fetch_add_unless atomic_fetch_add_unless
+
+/*
+ * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
+ * {cmp,}xchg and the operations that return, so they need a full barrier.
+ */
+#define ATOMIC_OP(c_t, prefix, size)                            \
+static inline                                                   \
+c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size, "", "", "");  \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          "", "", RISCV_ACQUIRE_BARRIER);       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          "", RISCV_RELEASE_BARRIER, "");       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          ".aqrl", "", "");                     \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", "", "");                       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", "", RISCV_ACQUIRE_BARRIER);    \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{	                                                            \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", RISCV_RELEASE_BARRIER, "");    \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             ".rl", "", " fence rw, rw\n");     \
+}
+
+#define ATOMIC_OPS() \
+    ATOMIC_OP(int,   , 4)
+
+ATOMIC_OPS()
+
+#define atomic_xchg_relaxed atomic_xchg_relaxed
+#define atomic_xchg_acquire atomic_xchg_acquire
+#define atomic_xchg_release atomic_xchg_release
+#define atomic_xchg atomic_xchg
+#define atomic_cmpxchg_relaxed atomic_cmpxchg_relaxed
+#define atomic_cmpxchg_acquire atomic_cmpxchg_acquire
+#define atomic_cmpxchg_release atomic_cmpxchg_release
+#define atomic_cmpxchg atomic_cmpxchg
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP
+
+static inline int atomic_sub_if_positive(atomic_t *v, int offset)
+{
+       int prev, rc;
+
+    __asm__ __volatile__ (
+        "0: lr.w     %[p],  %[c]\n"
+        "   sub      %[rc], %[p], %[o]\n"
+        "   bltz     %[rc], 1f\n"
+        "   sc.w.rl  %[rc], %[rc], %[c]\n"
+        "   bnez     %[rc], 0b\n"
+        "   fence    rw, rw\n"
+        "1:\n"
+        : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+        : [o]"r" (offset)
+        : "memory" );
+    return prev - offset;
+}
+
+#define atomic_dec_if_positive(v)	atomic_sub_if_positive(v, 1)
+
+#endif /* _ASM_RISCV_ATOMIC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/riscv/include/asm/fence.h b/xen/arch/riscv/include/asm/fence.h
new file mode 100644
index 0000000000..ff3f23dbd7
--- /dev/null
+++ b/xen/arch/riscv/include/asm/fence.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_RISCV_FENCE_H
+#define _ASM_RISCV_FENCE_H
+
+#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
+#define RISCV_RELEASE_BARRIER   "\tfence rw,  w\n"
+
+#endif	/* _ASM_RISCV_FENCE_H */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 15/30] xen/riscv: introduce irq.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (13 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 14/30] xen/riscv: introduce atomic.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 16/30] xen/riscv: introduce p2m.h Oleksii Kurochko
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - Change BUG to BUG_ON("unimplemented").
 - Add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V3:
 - add SPDX
 - remove all that was wraped to HAS_DEVICETREE_... as for RISC-V it is going to be
   always selected.
 - update the commit message
---
Changes in V2:
 - add ifdef CONFIG_HAS_DEVICE_TREE for things that shouldn't be
   in case !CONFIG_HAS_DEVICE_TREE
 - use proper includes.
---
 xen/arch/riscv/include/asm/irq.h | 37 ++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/irq.h

diff --git a/xen/arch/riscv/include/asm/irq.h b/xen/arch/riscv/include/asm/irq.h
new file mode 100644
index 0000000000..0dfd4d6e8a
--- /dev/null
+++ b/xen/arch/riscv/include/asm/irq.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_IRQ_H__
+#define __ASM_RISCV_IRQ_H__
+
+#include <xen/bug.h>
+
+/* TODO */
+#define nr_irqs 0U
+#define nr_static_irqs 0
+#define arch_hwdom_irqs(domid) 0U
+
+#define domain_pirq_to_irq(d, pirq) (pirq)
+
+#define arch_evtchn_bind_pirq(d, pirq) ((void)((d) + (pirq)))
+
+struct arch_pirq {
+};
+
+struct arch_irq_desc {
+    unsigned int type;
+};
+
+static inline void arch_move_irqs(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+#endif /* __ASM_RISCV_IRQ_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 16/30] xen/riscv: introduce p2m.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (14 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 15/30] xen/riscv: introduce irq.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:16   ` Jan Beulich
  2024-02-18 18:18   ` Julien Grall
  2024-02-05 15:32 ` [PATCH v4 17/30] xen/riscv: introduce regs.h Oleksii Kurochko
                   ` (13 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - update the comment above p2m_type_t. RISC-V has only 2 free for use bits in PTE, not 4 as Arm.
 - update the comment after p2m_ram_rw: s/guest/domain/ as this also applies for dom0.
 - return INVALID_MFN in gfn_to_mfn() instead of mfn(0).
 - drop PPC changes.
---
Changes in V3:
 - add SPDX
 - drop unneeded for now p2m types.
 - return false in all functions implemented with BUG() inside.
 - update the commit message
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/p2m.h | 102 +++++++++++++++++++++++++++++++
 1 file changed, 102 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/p2m.h

diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
new file mode 100644
index 0000000000..8ad020974f
--- /dev/null
+++ b/xen/arch/riscv/include/asm/p2m.h
@@ -0,0 +1,102 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_P2M_H__
+#define __ASM_RISCV_P2M_H__
+
+#include <asm/page-bits.h>
+
+#define paddr_bits PADDR_BITS
+
+/*
+ * List of possible type for each page in the p2m entry.
+ * The number of available bit per page in the pte for this purpose is 2 bits.
+ * So it's possible to only have 4 fields. If we run out of value in the
+ * future, it's possible to use higher value for pseudo-type and don't store
+ * them in the p2m entry.
+ */
+typedef enum {
+    p2m_invalid = 0,    /* Nothing mapped here */
+    p2m_ram_rw,         /* Normal read/write domain RAM */
+} p2m_type_t;
+
+#include <xen/p2m-common.h>
+
+static inline int get_page_and_type(struct page_info *page,
+                                    struct domain *domain,
+                                    unsigned long type)
+{
+    BUG_ON("unimplemented");
+    return -EINVAL;
+}
+
+/* Look up a GFN and take a reference count on the backing page. */
+typedef unsigned int p2m_query_t;
+#define P2M_ALLOC    (1u<<0)   /* Populate PoD and paged-out entries */
+#define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */
+
+static inline struct page_info *get_page_from_gfn(
+    struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
+{
+    BUG_ON("unimplemented");
+    return NULL;
+}
+
+static inline void memory_type_changed(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+
+static inline int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
+                                                        unsigned int order)
+{
+    return -EOPNOTSUPP;
+}
+
+static inline int guest_physmap_add_entry(struct domain *d,
+                            gfn_t gfn,
+                            mfn_t mfn,
+                            unsigned long page_order,
+                            p2m_type_t t)
+{
+    BUG_ON("unimplemented");
+    return -EINVAL;
+}
+
+/* Untyped version for RAM only, for compatibility */
+static inline int __must_check
+guest_physmap_add_page(struct domain *d, gfn_t gfn, mfn_t mfn,
+                       unsigned int page_order)
+{
+    return guest_physmap_add_entry(d, gfn, mfn, page_order, p2m_ram_rw);
+}
+
+static inline mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
+{
+    BUG_ON("unimplemented");
+    return INVALID_MFN;
+}
+
+static inline bool arch_acquire_resource_check(struct domain *d)
+{
+    /*
+     * The reference counting of foreign entries in set_foreign_p2m_entry()
+     * is supported on RISCV.
+     */
+    return true;
+}
+
+static inline void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    /* Not supported on RISCV. */
+}
+
+#endif /* __ASM_RISCV_P2M_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 17/30] xen/riscv: introduce regs.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (15 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 16/30] xen/riscv: introduce p2m.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-18 18:22   ` Julien Grall
  2024-02-05 15:32 ` [PATCH v4 18/30] xen/riscv: introduce time.h Oleksii Kurochko
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
------
Changes in V4:
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
 - s/BUG()/BUG_ON("unimplemented")
---
Changes in V3:
 - update the commit message
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
 - remove "include <asm/current.h>" and use a forward declaration instead.
---
Changes in V2:
 - change xen/lib.h to xen/bug.h
 - remove unnecessary empty line
---
xen/arch/riscv/include/asm/regs.h | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/regs.h

diff --git a/xen/arch/riscv/include/asm/regs.h b/xen/arch/riscv/include/asm/regs.h
new file mode 100644
index 0000000000..c70ea2aa0c
--- /dev/null
+++ b/xen/arch/riscv/include/asm/regs.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ARM_RISCV_REGS_H__
+#define __ARM_RISCV_REGS_H__
+
+#ifndef __ASSEMBLY__
+
+#include <xen/bug.h>
+
+#define hyp_mode(r)     (0)
+
+struct cpu_user_regs;
+
+static inline bool guest_mode(const struct cpu_user_regs *r)
+{
+    BUG_ON("unimplemented");
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ARM_RISCV_REGS_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 18/30] xen/riscv: introduce time.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (16 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 17/30] xen/riscv: introduce regs.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:18   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 19/30] xen/riscv: introduce event.h Oleksii Kurochko
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - s/BUG()/BUG_ON("unimplemented")
---
Changes in V3:
 - Acked-by: Jan Beulich <jbeulich@suse.com>
 - add SPDX
 - Add new line
---
Changes in V2:
 - change xen/lib.h to xen/bug.h
 - remove inclusion of <asm/processor.h> as it's not needed.
---
 xen/arch/riscv/include/asm/time.h | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/time.h

diff --git a/xen/arch/riscv/include/asm/time.h b/xen/arch/riscv/include/asm/time.h
new file mode 100644
index 0000000000..2e359fa046
--- /dev/null
+++ b/xen/arch/riscv/include/asm/time.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_TIME_H__
+#define __ASM_RISCV_TIME_H__
+
+#include <xen/bug.h>
+#include <asm/csr.h>
+
+struct vcpu;
+
+/* TODO: implement */
+static inline void force_update_vcpu_system_time(struct vcpu *v) { BUG_ON("unimplemented"); }
+
+typedef unsigned long cycles_t;
+
+static inline cycles_t get_cycles(void)
+{
+	return csr_read(CSR_TIME);
+}
+
+#endif /* __ASM_RISCV_TIME_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 19/30] xen/riscv: introduce event.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (17 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 18/30] xen/riscv: introduce time.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:20   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 20/30] xen/riscv: introduce monitor.h Oleksii Kurochko
                   ` (10 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - s/BUG()/BUG_ON("unimplemented")
 - s/xen\/bug.h/xen\/lib.h as BUG_ON is defined in xen/lib.h.
---
Changes in V3:
 - add SPDX
 - add BUG() inside stubs.
 - update the commit message
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/event.h | 40 ++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/event.h

diff --git a/xen/arch/riscv/include/asm/event.h b/xen/arch/riscv/include/asm/event.h
new file mode 100644
index 0000000000..b6a76c0f5d
--- /dev/null
+++ b/xen/arch/riscv/include/asm/event.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_EVENT_H__
+#define __ASM_RISCV_EVENT_H__
+
+#include <xen/lib.h>
+
+void vcpu_mark_events_pending(struct vcpu *v);
+
+static inline int vcpu_event_delivery_is_enabled(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+static inline int local_events_need_delivery(void)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+static inline void local_event_delivery_enable(void)
+{
+    BUG_ON("unimplemented");
+}
+
+/* No arch specific virq definition now. Default to global. */
+static inline bool arch_virq_is_global(unsigned int virq)
+{
+    return true;
+}
+
+#endif
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 20/30] xen/riscv: introduce monitor.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (18 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 19/30] xen/riscv: introduce event.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 21/30] xen/riscv: add definition of __read_mostly Oleksii Kurochko
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Tamas K Lengyel, Alexandru Isaila,
	Petre Pircalabu, Alistair Francis, Bob Eshleman, Connor Davis

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Taking into account conversion in [PATCH v6 0/9] Introduce generic headers
 (https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com/)
 this patch can be changed
---
Changes in V4:
 - Nothing changed. Only rebase.
---
Changes in V3:
 - new patch.
---
 xen/arch/riscv/include/asm/monitor.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/monitor.h

diff --git a/xen/arch/riscv/include/asm/monitor.h b/xen/arch/riscv/include/asm/monitor.h
new file mode 100644
index 0000000000..f4fe2c0690
--- /dev/null
+++ b/xen/arch/riscv/include/asm/monitor.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_MONITOR_H__
+#define __ASM_RISCV_MONITOR_H__
+
+#include <xen/bug.h>
+
+#include <asm-generic/monitor.h>
+
+struct domain;
+
+static inline uint32_t arch_monitor_get_capabilities(struct domain *d)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+#endif /* __ASM_RISCV_MONITOR_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 21/30] xen/riscv: add definition of __read_mostly
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (19 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 20/30] xen/riscv: introduce monitor.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 22/30] xen/riscv: define an address of frame table Oleksii Kurochko
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

The definition of __read_mostly should be removed in:
https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b333e@suse.com/

The patch introduces it in arch-specific header to not
block enabling of full Xen build for RISC-V.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
- [PATCH] move __read_mostly to xen/cache.h  [2]

Right now, the patch series doesn't have a direct dependency on [2] and it
provides __read_mostly in the patch:
    [PATCH v3 26/34] xen/riscv: add definition of __read_mostly
However, it will be dropped as soon as [2] is merged or at least when the
final version of the patch [2] is provided.

[2] https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b333e@suse.com/
---
Changes in V4:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/cache.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/riscv/include/asm/cache.h b/xen/arch/riscv/include/asm/cache.h
index 69573eb051..94bd94db53 100644
--- a/xen/arch/riscv/include/asm/cache.h
+++ b/xen/arch/riscv/include/asm/cache.h
@@ -3,4 +3,6 @@
 #ifndef _ASM_RISCV_CACHE_H
 #define _ASM_RISCV_CACHE_H
 
+#define __read_mostly __section(".data.read_mostly")
+
 #endif /* _ASM_RISCV_CACHE_H */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 22/30] xen/riscv: define an address of frame table
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (20 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 21/30] xen/riscv: add definition of __read_mostly Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-13 13:07   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 23/30] xen/riscv: add required things to current.h Oleksii Kurochko
                   ` (7 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Also, the patch adds some helpful macros that assist in avoiding
the redefinition of memory layout for each MMU mode.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - move "#define VPN_BITS (9)" inside CONFIG_RISCV_64 as for SV32 it should be defined differently.
 - drop SLOTN_ENTRY_SIZE and introduce DIRECTMAP_SIZE. It is not needed for now, but will be needed in the
   future.
 - update memory layout table and some related macros.
---
Changes in V3:
 - drop OFFSET_BITS, and use PAGE_SHIFT instead.
 - code style fixes.
 - add comment how macros are useful.
 - move all memory related layout definitions close to comment with memory layout description.
 - make memory layout description generic for any MMU mode.
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/config.h | 105 ++++++++++++++++++++--------
 1 file changed, 77 insertions(+), 28 deletions(-)

diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 56387ac159..479da15782 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -6,6 +6,16 @@
 #include <xen/const.h>
 #include <xen/page-size.h>
 
+#include <asm/riscv_encoding.h>
+
+#ifdef CONFIG_RISCV_64
+#define CONFIG_PAGING_LEVELS 3
+#define RV_STAGE1_MODE SATP_MODE_SV39
+#else
+#define CONFIG_PAGING_LEVELS 2
+#define RV_STAGE1_MODE SATP_MODE_SV32
+#endif
+
 /*
  * RISC-V64 Layout:
  *
@@ -23,25 +33,78 @@
  * It means that:
  *   top VA bits are simply ignored for the purpose of translating to PA.
  *
+ * Amount of slots for Frametable were calculated base on
+ * sizeof(struct page_info) = 48. If the 'struct page_info' is changed,
+ * the table below must be updated.
+ *
  * ============================================================================
- *    Start addr    |   End addr        |  Size  | Slot       |area description
- * ============================================================================
- * FFFFFFFFC0800000 |  FFFFFFFFFFFFFFFF |1016 MB | L2 511     | Unused
- * FFFFFFFFC0600000 |  FFFFFFFFC0800000 |  2 MB  | L2 511     | Fixmap
- * FFFFFFFFC0200000 |  FFFFFFFFC0600000 |  4 MB  | L2 511     | FDT
- * FFFFFFFFC0000000 |  FFFFFFFFC0200000 |  2 MB  | L2 511     | Xen
- *                 ...                  |  1 GB  | L2 510     | Unused
- * 0000003200000000 |  0000007F80000000 | 309 GB | L2 200-509 | Direct map
- *                 ...                  |  1 GB  | L2 199     | Unused
- * 0000003100000000 |  00000031C0000000 |  3 GB  | L2 196-198 | Frametable
- *                 ...                  |  1 GB  | L2 195     | Unused
- * 0000003080000000 |  00000030C0000000 |  1 GB  | L2 194     | VMAP
- *                 ...                  | 194 GB | L2 0 - 193 | Unused
+ * Start addr          | End addr         | Slot       | area description
  * ============================================================================
- *
+ *                   .....                 L2 511          Unused
+ *  0xffffffffc0600000  0xffffffffc0800000 L2 511          Fixmap
+ *  0xffffffffc0200000  0xffffffffc0600000 L2 511          FDT
+ *  0xffffffffc0000000  0xffffffffc0200000 L2 511          Xen
+ *                   .....                 L2 510          Unused
+ *  0x3200000000        0x7f40000000       L2 200-509      Direct map
+ *                   .....                 L2 199          Unused
+ *  0x30c0000000        0x31c0000000       L2 195-198      Frametable
+ *                   .....                 L2 194          Unused
+ *  0x3040000000        0x3080000000       L2 193          VMAP
+ *                   .....                 L2 0-192        Unused
+#elif RV_STAGE1_MODE == SATP_MODE_SV48
+ * Memory layout is the same as for SV39 in terms of slots, so only start and
+ * end addresses should be shifted by 9
 #endif
  */
 
+#define HYP_PT_ROOT_LEVEL (CONFIG_PAGING_LEVELS - 1)
+
+#ifdef CONFIG_RISCV_64
+
+#define VPN_BITS (9)
+
+#define SLOTN_ENTRY_BITS        (HYP_PT_ROOT_LEVEL * VPN_BITS + PAGE_SHIFT)
+#define SLOTN(slot)             (_AT(vaddr_t, slot) << SLOTN_ENTRY_BITS)
+
+#if RV_STAGE1_MODE == SATP_MODE_SV39
+#define XEN_VIRT_START 0xFFFFFFFFC0000000
+#elif RV_STAGE1_MODE == SATP_MODE_SV48
+#define XEN_VIRT_START 0xFFFFFF8000000000
+#else
+#error "unsupported RV_STAGE1_MODE"
+#endif
+
+#define DIRECTMAP_SLOT_END      509
+#define DIRECTMAP_SLOT_START    200
+#define DIRECTMAP_VIRT_START    SLOTN(DIRECTMAP_SLOT_START)
+#define DIRECTMAP_SIZE          (SLOTN(DIRECTMAP_SLOT_END) - SLOTN(DIRECTMAP_SLOT_START))
+
+#define FRAMETABLE_SCALE_FACTOR  (PAGE_SIZE/sizeof(struct page_info))
+#define FRAMETABLE_SIZE_IN_SLOTS (((DIRECTMAP_SIZE / SLOTN(1)) / FRAMETABLE_SCALE_FACTOR) + 1)
+
+/*
+ * We have to skip Unused slot between DIRECTMAP and FRAMETABLE (look at mem.
+ * layout), so -1 is needed
+ */
+#define FRAMETABLE_SLOT_START   (DIRECTMAP_SLOT_START - FRAMETABLE_SIZE_IN_SLOTS - 1)
+#define FRAMETABLE_SIZE         (FRAMETABLE_SIZE_IN_SLOTS * SLOTN(1))
+#define FRAMETABLE_VIRT_START   SLOTN(FRAMETABLE_SLOT_START)
+#define FRAMETABLE_NR           (FRAMETABLE_SIZE / sizeof(*frame_table))
+#define FRAMETABLE_VIRT_END     (FRAMETABLE_VIRT_START + FRAMETABLE_SIZE - 1)
+
+/*
+ * We have to skip Unused slot between Frametable and VMAP (look at mem.
+ * layout), so an additional -1 is needed */
+#define VMAP_SLOT_START         (FRAMETABLE_SLOT_START - 1 - 1)
+#define VMAP_VIRT_START         SLOTN(VMAP_SLOT_START)
+#define VMAP_VIRT_SIZE          GB(1)
+
+#else
+#error "RV32 isn't supported"
+#endif
+
+#define HYPERVISOR_VIRT_START XEN_VIRT_START
+
 #if defined(CONFIG_RISCV_64)
 # define LONG_BYTEORDER 3
 # define ELFSIZE 64
@@ -75,24 +138,10 @@
 #define CODE_FILL /* empty */
 #endif
 
-#ifdef CONFIG_RISCV_64
-#define XEN_VIRT_START 0xFFFFFFFFC0000000 /* (_AC(-1, UL) + 1 - GB(1)) */
-#else
-#error "RV32 isn't supported"
-#endif
-
 #define SMP_CACHE_BYTES (1 << 6)
 
 #define STACK_SIZE PAGE_SIZE
 
-#ifdef CONFIG_RISCV_64
-#define CONFIG_PAGING_LEVELS 3
-#define RV_STAGE1_MODE SATP_MODE_SV39
-#else
-#define CONFIG_PAGING_LEVELS 2
-#define RV_STAGE1_MODE SATP_MODE_SV32
-#endif
-
 #define IDENT_AREA_SIZE 64
 
 #endif /* __RISCV_CONFIG_H__ */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 23/30] xen/riscv: add required things to current.h
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (21 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 22/30] xen/riscv: define an address of frame table Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 24/30] xen/riscv: add minimal stuff to page.h to build full Xen Oleksii Kurochko
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Add minimal requied things to be able to build full Xen.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - BUG() was changed to BUG_ON("unimplemented");
 - Change "xen/bug.h" to "xen/lib.h" as BUG_ON is defined in xen/lib.h.
 - Add Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V3:
 - add SPDX
 - drop a forward declaration of struct vcpu;
 - update guest_cpu_user_regs() macros
 - replace get_processor_id with smp_processor_id
 - update the commit message
 - code style fixes
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/current.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index d84f15dc50..aedb6dc732 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -3,6 +3,21 @@
 #ifndef __ASM_CURRENT_H
 #define __ASM_CURRENT_H
 
+#include <xen/lib.h>
+#include <xen/percpu.h>
+#include <asm/processor.h>
+
+#ifndef __ASSEMBLY__
+
+/* Which VCPU is "current" on this PCPU. */
+DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
+
+#define current            this_cpu(curr_vcpu)
+#define set_current(vcpu)  do { current = (vcpu); } while (0)
+#define get_cpu_current(cpu)  per_cpu(curr_vcpu, cpu)
+
+#define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
+
 #define switch_stack_and_jump(stack, fn) do {               \
     asm volatile (                                          \
             "mv sp, %0\n"                                   \
@@ -10,4 +25,8 @@
     unreachable();                                          \
 } while ( false )
 
+#define get_per_cpu_offset() __per_cpu_offset[smp_processor_id()]
+
+#endif /* __ASSEMBLY__ */
+
 #endif /* __ASM_CURRENT_H */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 24/30] xen/riscv: add minimal stuff to page.h to build full Xen
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (22 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 23/30] xen/riscv: add required things to current.h Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
---
 - Change message -> subject in "Changes in V3"
 - s/BUG/BUG_ON("...")
 - Do proper rebase ( pfn_to_paddr() and paddr_to_pfn() aren't removed ).
---
Changes in V3:
 - update the commit subject
 - add implemetation of PAGE_HYPERVISOR macros
 - add Acked-by: Jan Beulich <jbeulich@suse.com>
 - drop definition of pfn_to_addr, and paddr_to_pfn in <asm/mm.h>
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/page.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index 95074e29b3..c831e16417 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -6,6 +6,7 @@
 #ifndef __ASSEMBLY__
 
 #include <xen/const.h>
+#include <xen/bug.h>
 #include <xen/types.h>
 
 #include <asm/mm.h>
@@ -32,6 +33,10 @@
 #define PTE_LEAF_DEFAULT            (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
 #define PTE_TABLE                   (PTE_VALID)
 
+#define PAGE_HYPERVISOR_RW          (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
+
+#define PAGE_HYPERVISOR             PAGE_HYPERVISOR_RW
+
 /* Calculate the offsets into the pagetables for a given VA */
 #define pt_linear_offset(lvl, va)   ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
 
@@ -62,6 +67,20 @@ static inline bool pte_is_valid(pte_t p)
     return p.pte & PTE_VALID;
 }
 
+static inline void invalidate_icache(void)
+{
+    BUG_ON("unimplemented");
+}
+
+#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
+#define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
+
+/* TODO: Flush the dcache for an entire page. */
+static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
+{
+    BUG_ON("unimplemented");
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_PAGE_H */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (23 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 24/30] xen/riscv: add minimal stuff to page.h to build full Xen Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-13 13:33   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
                   ` (4 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Alistair Francis,
	Bob Eshleman, Connor Davis

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - Change message -> subject in "Changes in V3"
 - Documentation about system requirement was added. In the future, it can be checked if the extension is supported
   by system __riscv_isa_extension_available() ( https://gitlab.com/xen-project/people/olkur/xen/-/commit/737998e89ed305eb92059300c374dfa53d2143fa )
 - update cpu_relax() function to check if __riscv_zihintpause is supported by a toolchain
 - add conditional _zihintpause to -march if it is supported by a toolchain
Changes in V3:
 - update the commit subject
 - rename get_processor_id to smp_processor_id
 - code style fixes
 - update the cpu_relax instruction: use pause instruction instead of div %0, %0, zero
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 docs/misc/riscv/booting.txt            |  8 ++++++++
 xen/arch/riscv/Kconfig                 |  7 +++++++
 xen/arch/riscv/arch.mk                 |  1 +
 xen/arch/riscv/include/asm/processor.h | 23 +++++++++++++++++++++++
 4 files changed, 39 insertions(+)
 create mode 100644 docs/misc/riscv/booting.txt

diff --git a/docs/misc/riscv/booting.txt b/docs/misc/riscv/booting.txt
new file mode 100644
index 0000000000..38fad74956
--- /dev/null
+++ b/docs/misc/riscv/booting.txt
@@ -0,0 +1,8 @@
+System requirements
+===================
+
+The following extensions are expected to be supported by a system on which
+Xen is run:
+- Zihintpause:
+  On a system that doesn't have this extension, cpu_relax() should be
+  implemented properly. Otherwise, an illegal instruction exception will arise.
diff --git a/xen/arch/riscv/Kconfig b/xen/arch/riscv/Kconfig
index f382b36f6c..383ce06771 100644
--- a/xen/arch/riscv/Kconfig
+++ b/xen/arch/riscv/Kconfig
@@ -45,6 +45,13 @@ config RISCV_ISA_C
 
 	  If unsure, say Y.
 
+config TOOLCHAIN_HAS_ZIHINTPAUSE
+	bool
+	default y
+	depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zihintpause)
+	depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zihintpause)
+	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23600
+
 endmenu
 
 source "common/Kconfig"
diff --git a/xen/arch/riscv/arch.mk b/xen/arch/riscv/arch.mk
index 8403f96b6f..a4b53adaf7 100644
--- a/xen/arch/riscv/arch.mk
+++ b/xen/arch/riscv/arch.mk
@@ -7,6 +7,7 @@ CFLAGS-$(CONFIG_RISCV_64) += -mabi=lp64
 
 riscv-march-$(CONFIG_RISCV_ISA_RV64G) := rv64g
 riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
+riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-y)_zihintpause
 
 # Note that -mcmodel=medany is used so that Xen can be mapped
 # into the upper half _or_ the lower half of the address space.
diff --git a/xen/arch/riscv/include/asm/processor.h b/xen/arch/riscv/include/asm/processor.h
index 6db681d805..289dc35ea0 100644
--- a/xen/arch/riscv/include/asm/processor.h
+++ b/xen/arch/riscv/include/asm/processor.h
@@ -12,6 +12,9 @@
 
 #ifndef __ASSEMBLY__
 
+/* TODO: need to be implemeted */
+#define smp_processor_id() 0
+
 /* On stack VCPU state */
 struct cpu_user_regs
 {
@@ -53,6 +56,26 @@ struct cpu_user_regs
     unsigned long pregs;
 };
 
+/* TODO: need to implement */
+#define cpu_to_core(cpu)   (0)
+#define cpu_to_socket(cpu) (0)
+
+static inline void cpu_relax(void)
+{
+#ifdef __riscv_zihintpause
+    /*
+     * Reduce instruction retirement.
+     * This assumes the PC changes.
+     */
+    __asm__ __volatile__ ("pause");
+#else
+    /* Encoding of the pause instruction */
+    __asm__ __volatile__ (".insn 0x100000F");
+#endif
+
+    barrier();
+}
+
 static inline void wfi(void)
 {
     __asm__ __volatile__ ("wfi");
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (24 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-13 14:19   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 27/30] xen/riscv: introduce vm_event_*() functions Oleksii Kurochko
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - update an argument name of PFN_ORDERN macros.
 - drop pad at the end of 'struct page_info'.
 - Change message -> subject in "Changes in V3"
 - delete duplicated macros from riscv/mm.h
 - fix identation in struct page_info
 - align comment for PGC_ macros
 - update definitions of domain_set_alloc_bitsize() and domain_clamp_alloc_bitsize()
 - drop unnessary comments.
 - s/BUG/BUG_ON("...")
 - define __virt_to_maddr, __maddr_to_virt as stubs
 - add inclusion of xen/mm-frame.h for mfn_x and others
 - include "xen/mm.h" instead of "asm/mm.h" to fix compilation issues:
	 In file included from arch/riscv/setup.c:7:
	./arch/riscv/include/asm/mm.h:60:28: error: field 'list' has incomplete type
	   60 |     struct page_list_entry list;
	      |                            ^~~~
	./arch/riscv/include/asm/mm.h:81:43: error: 'MAX_ORDER' undeclared here (not in a function)
	   81 |                 unsigned long first_dirty:MAX_ORDER + 1;
	      |                                           ^~~~~~~~~
	./arch/riscv/include/asm/mm.h:81:31: error: bit-field 'first_dirty' width not an integer constant
	   81 |                 unsigned long first_dirty:MAX_ORDER + 1;
 - Define __virt_to_mfn() and __mfn_to_virt() using maddr_to_mfn() and mfn_to_maddr().
---
Changes in V3:
 - update the commit title
 - introduce DIRECTMAP_VIRT_START.
 - drop changes related pfn_to_paddr() and paddr_to_pfn as they were remvoe in
   [PATCH v2 32/39] xen/riscv: add minimal stuff to asm/page.h to build full Xen
 - code style fixes.
 - drop get_page_nr  and put_page_nr as they don't need for time being
 - drop CONFIG_STATIC_MEMORY related things
 - code style fixes
---
Changes in V2:
 - define stub for arch_get_dma_bitsize(void)
---
 xen/arch/riscv/include/asm/mm.h | 246 ++++++++++++++++++++++++++++++++
 xen/arch/riscv/mm.c             |   2 +-
 xen/arch/riscv/setup.c          |   2 +-
 3 files changed, 248 insertions(+), 2 deletions(-)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 07c7a0abba..0254babcc1 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -3,11 +3,252 @@
 #ifndef _ASM_RISCV_MM_H
 #define _ASM_RISCV_MM_H
 
+#include <public/xen.h>
+#include <xen/bug.h>
+#include <xen/mm-frame.h>
+#include <xen/pdx.h>
+#include <xen/types.h>
+
 #include <asm/page-bits.h>
 
 #define pfn_to_paddr(pfn) ((paddr_t)(pfn) << PAGE_SHIFT)
 #define paddr_to_pfn(pa)  ((unsigned long)((pa) >> PAGE_SHIFT))
 
+#define paddr_to_pdx(pa)    mfn_to_pdx(maddr_to_mfn(pa))
+#define gfn_to_gaddr(gfn)   pfn_to_paddr(gfn_x(gfn))
+#define gaddr_to_gfn(ga)    _gfn(paddr_to_pfn(ga))
+#define mfn_to_maddr(mfn)   pfn_to_paddr(mfn_x(mfn))
+#define maddr_to_mfn(ma)    _mfn(paddr_to_pfn(ma))
+#define vmap_to_mfn(va)     maddr_to_mfn(virt_to_maddr((vaddr_t)va))
+#define vmap_to_page(va)    mfn_to_page(vmap_to_mfn(va))
+
+static inline unsigned long __virt_to_maddr(unsigned long va)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+static inline void *__maddr_to_virt(unsigned long ma)
+{
+    BUG_ON("unimplemented");
+    return NULL;
+}
+
+#define virt_to_maddr(va) __virt_to_maddr((unsigned long)(va))
+#define maddr_to_virt(pa) __maddr_to_virt((unsigned long)(pa))
+
+/* Convert between Xen-heap virtual addresses and machine frame numbers. */
+#define __virt_to_mfn(va)  mfn_x(maddr_to_mfn(virt_to_maddr(va)))
+#define __mfn_to_virt(mfn) maddr_to_virt(mfn_to_maddr(_mfn(mfn)))
+
+/* Convert between Xen-heap virtual addresses and page-info structures. */
+static inline struct page_info *virt_to_page(const void *v)
+{
+    BUG_ON("unimplemented");
+    return NULL;
+}
+
+/*
+ * We define non-underscored wrappers for above conversion functions.
+ * These are overriden in various source files while underscored version
+ * remain intact.
+ */
+#define virt_to_mfn(va)     __virt_to_mfn(va)
+#define mfn_to_virt(mfn)    __mfn_to_virt(mfn)
+
+struct page_info
+{
+    /* Each frame can be threaded onto a doubly-linked list. */
+    struct page_list_entry list;
+
+    /* Reference count and various PGC_xxx flags and fields. */
+    unsigned long count_info;
+
+    /* Context-dependent fields follow... */
+    union {
+        /* Page is in use: ((count_info & PGC_count_mask) != 0). */
+        struct {
+            /* Type reference count and various PGT_xxx flags and fields. */
+            unsigned long type_info;
+        } inuse;
+        /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
+        union {
+            struct {
+                /*
+                 * Index of the first *possibly* unscrubbed page in the buddy.
+                 * One more bit than maximum possible order to accommodate
+                 * INVALID_DIRTY_IDX.
+                 */
+#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
+                unsigned long first_dirty:MAX_ORDER + 1;
+
+                /* Do TLBs need flushing for safety before next page use? */
+                bool need_tlbflush:1;
+
+#define BUDDY_NOT_SCRUBBING    0
+#define BUDDY_SCRUBBING        1
+#define BUDDY_SCRUB_ABORT      2
+                unsigned long scrub_state:2;
+            };
+
+                unsigned long val;
+        } free;
+    } u;
+
+    union {
+        /* Page is in use, but not as a shadow. */
+        struct {
+            /* Owner of this page (zero if page is anonymous). */
+            struct domain *domain;
+        } inuse;
+
+        /* Page is on a free list. */
+        struct {
+            /* Order-size of the free chunk this page is the head of. */
+            unsigned int order;
+        } free;
+    } v;
+
+    union {
+        /*
+         * Timestamp from 'TLB clock', used to avoid extra safety flushes.
+         * Only valid for: a) free pages, and b) pages with zero type count
+         */
+        uint32_t tlbflush_timestamp;
+    };
+};
+
+#define frame_table ((struct page_info *)FRAMETABLE_VIRT_START)
+
+/* PDX of the first page in the frame table. */
+extern unsigned long frametable_base_pdx;
+
+/* Convert between machine frame numbers and page-info structures. */
+#define mfn_to_page(mfn)                                            \
+    (frame_table + (mfn_to_pdx(mfn) - frametable_base_pdx))
+#define page_to_mfn(pg)                                             \
+    pdx_to_mfn((unsigned long)((pg) - frame_table) + frametable_base_pdx)
+
+static inline void *page_to_virt(const struct page_info *pg)
+{
+    return mfn_to_virt(mfn_x(page_to_mfn(pg)));
+}
+
+/*
+ * Common code requires get_page_type and put_page_type.
+ * We don't care about typecounts so we just do the minimum to make it
+ * happy.
+ */
+static inline int get_page_type(struct page_info *page, unsigned long type)
+{
+    return 1;
+}
+
+static inline void put_page_type(struct page_info *page)
+{
+}
+
+static inline void put_page_and_type(struct page_info *page)
+{
+    put_page_type(page);
+    put_page(page);
+}
+
+/*
+ * RISC-V does not have an M2P, but common code expects a handful of
+ * M2P-related defines and functions. Provide dummy versions of these.
+ */
+#define INVALID_M2P_ENTRY        (~0UL)
+#define SHARED_M2P_ENTRY         (~0UL - 1UL)
+#define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
+
+#define set_gpfn_from_mfn(mfn, pfn) do { (void)(mfn), (void)(pfn); } while (0)
+#define mfn_to_gfn(d, mfn) ((void)(d), _gfn(mfn_x(mfn)))
+
+#define PDX_GROUP_SHIFT (16 + 5)
+
+static inline unsigned long domain_get_maximum_gpfn(struct domain *d)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+static inline long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    BUG_ON("unimplemented");
+    return 0;
+}
+
+/*
+ * On RISCV, all the RAM is currently direct mapped in Xen.
+ * Hence return always true.
+ */
+static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
+{
+    return true;
+}
+
+#define PG_shift(idx)   (BITS_PER_LONG - (idx))
+#define PG_mask(x, idx) (x ## UL << PG_shift(idx))
+
+#define PGT_none          PG_mask(0, 1)  /* no special uses of this page   */
+#define PGT_writable_page PG_mask(1, 1)  /* has writable mappings?         */
+#define PGT_type_mask     PG_mask(1, 1)  /* Bits 31 or 63.                 */
+
+ /* Count of uses of this frame as its current type. */
+#define PGT_count_width   PG_shift(2)
+#define PGT_count_mask    ((1UL << PGT_count_width) - 1)
+
+/*
+ * Page needs to be scrubbed. Since this bit can only be set on a page that is
+ * free (i.e. in PGC_state_free) we can reuse PGC_allocated bit.
+ */
+#define _PGC_need_scrub   _PGC_allocated
+#define PGC_need_scrub    PGC_allocated
+
+/* Cleared when the owning guest 'frees' this page. */
+#define _PGC_allocated    PG_shift(1)
+#define PGC_allocated     PG_mask(1, 1)
+/* Page is Xen heap? */
+#define _PGC_xen_heap     PG_shift(2)
+#define PGC_xen_heap      PG_mask(1, 2)
+/* Page is broken? */
+#define _PGC_broken       PG_shift(7)
+#define PGC_broken        PG_mask(1, 7)
+/* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
+#define PGC_state         PG_mask(3, 9)
+#define PGC_state_inuse   PG_mask(0, 9)
+#define PGC_state_offlining PG_mask(1, 9)
+#define PGC_state_offlined PG_mask(2, 9)
+#define PGC_state_free    PG_mask(3, 9)
+#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+
+/* Count of references to this frame. */
+#define PGC_count_width   PG_shift(9)
+#define PGC_count_mask    ((1UL << PGC_count_width) - 1)
+
+#define _PGC_extra        PG_shift(10)
+#define PGC_extra         PG_mask(1, 10)
+
+#define is_xen_heap_page(page) ((page)->count_info & PGC_xen_heap)
+#define is_xen_heap_mfn(mfn) \
+    (mfn_valid(mfn) && is_xen_heap_page(mfn_to_page(mfn)))
+
+#define is_xen_fixed_mfn(mfn)                                   \
+    ((mfn_to_maddr(mfn) >= virt_to_maddr((vaddr_t)_start)) &&   \
+     (mfn_to_maddr(mfn) <= virt_to_maddr((vaddr_t)_end - 1)))
+
+#define page_get_owner(_p)    (_p)->v.inuse.domain
+#define page_set_owner(_p,_d) ((_p)->v.inuse.domain = (_d))
+
+/* TODO: implement */
+#define mfn_valid(mfn) ({ (void)(mfn); 0; })
+
+#define domain_set_alloc_bitsize(d) ((void)(d))
+#define domain_clamp_alloc_bitsize(d, b) ((void)(d), (b))
+
+#define PFN_ORDER(pfn) ((pfn)->v.free.order)
+
 extern unsigned char cpu0_boot_stack[];
 
 void setup_initial_pagetables(void);
@@ -20,4 +261,9 @@ unsigned long calc_phys_offset(void);
 
 void turn_on_mmu(unsigned long ra);
 
+static inline unsigned int arch_get_dma_bitsize(void)
+{
+    return 32; /* TODO */
+}
+
 #endif /* _ASM_RISCV_MM_H */
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index 053f043a3d..fe3a43be20 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -5,12 +5,12 @@
 #include <xen/init.h>
 #include <xen/kernel.h>
 #include <xen/macros.h>
+#include <xen/mm.h>
 #include <xen/pfn.h>
 
 #include <asm/early_printk.h>
 #include <asm/csr.h>
 #include <asm/current.h>
-#include <asm/mm.h>
 #include <asm/page.h>
 #include <asm/processor.h>
 
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 6593f601c1..98a94c4c48 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -2,9 +2,9 @@
 
 #include <xen/compile.h>
 #include <xen/init.h>
+#include <xen/mm.h>
 
 #include <asm/early_printk.h>
-#include <asm/mm.h>
 
 /* Xen stack for bringing up the first CPU. */
 unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 27/30] xen/riscv: introduce vm_event_*() functions
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (25 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 28/30] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Tamas K Lengyel, Alexandru Isaila, Petre Pircalabu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
  - New patch.
---
 xen/arch/riscv/Makefile   |  1 +
 xen/arch/riscv/vm_event.c | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 xen/arch/riscv/vm_event.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 2fefe14e7c..1ed1a8369b 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
 obj-y += traps.o
+obj-y += vm_event.o
 
 $(TARGET): $(TARGET)-syms
 	$(OBJCOPY) -O binary -S $< $@
diff --git a/xen/arch/riscv/vm_event.c b/xen/arch/riscv/vm_event.c
new file mode 100644
index 0000000000..bb1fc73bc1
--- /dev/null
+++ b/xen/arch/riscv/vm_event.c
@@ -0,0 +1,19 @@
+#include <xen/bug.h>
+
+struct vm_event_st;
+struct vcpu;
+
+void vm_event_fill_regs(struct vm_event_st *req)
+{
+    BUG_ON("unimplemented");
+}
+
+void vm_event_set_registers(struct vcpu *v, struct vm_event_st *rsp)
+{
+    BUG_ON("unimplemented");
+}
+
+void vm_event_monitor_next_interrupt(struct vcpu *v)
+{
+    /* Not supported on RISCV. */
+}
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 28/30] xen/rirscv: add minimal amount of stubs to build full Xen
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (26 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 27/30] xen/riscv: introduce vm_event_*() functions Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-12 15:24   ` Jan Beulich
  2024-02-05 15:32 ` [PATCH v4 29/30] xen/riscv: enable full Xen build Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
  - added new stubs which are necessary for compilation after rebase: __cpu_up(), __cpu_disable(), __cpu_die()
    from smpboot.c
  - back changes related to printk() in early_printk() as they should be removed in the next patch to avoid
    compilation error.
  - update definition of cpu_khz: __read_mostly -> __ro_after_init.
  - drop vm_event_reset_vmtrace(). It is defibed in asm-generic/vm_event.h.
  - move vm_event_*() functions from stubs.c to riscv/vm_event.c.
  - s/BUG/BUG_ON("unimplemented") in stubs.c
  - back irq_actor_none() and irq_actor_none() as common/irq.c isn't compiled at this moment,
    so this function are needed to avoid compilation error.
  - defined max_page to avoid compilation error, it will be removed as soon as common/page_alloc.c will
    be compiled.
---
Changes in V3:
 - code style fixes.
 - update attribute for frametable_base_pdx  and frametable_virt_end to __ro_after_init.
   insteaf of read_mostly.
 - use BUG() instead of assert_failed/WARN for newly introduced stubs.
 - drop "#include <public/vm_event.h>" in stubs.c and use forward declaration instead.
 - drop ack_node() and end_node() as they aren't used now.
---
Changes in V2:
 - define udelay stub
 - remove 'select HAS_PDX' from RISC-V Kconfig because of
   https://lore.kernel.org/xen-devel/20231006144405.1078260-1-andrew.cooper3@citrix.com/
---
 xen/arch/riscv/Makefile       |   1 +
 xen/arch/riscv/early_printk.c |   1 -
 xen/arch/riscv/mm.c           |  50 ++++
 xen/arch/riscv/setup.c        |   8 +
 xen/arch/riscv/stubs.c        | 438 ++++++++++++++++++++++++++++++++++
 xen/arch/riscv/traps.c        |  25 ++
 6 files changed, 522 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/riscv/stubs.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 1ed1a8369b..60afbc0ad9 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -4,6 +4,7 @@ obj-y += mm.o
 obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
+obj-y += stubs.o
 obj-y += traps.o
 obj-y += vm_event.o
 
diff --git a/xen/arch/riscv/early_printk.c b/xen/arch/riscv/early_printk.c
index 60742a042d..6d0911659d 100644
--- a/xen/arch/riscv/early_printk.c
+++ b/xen/arch/riscv/early_printk.c
@@ -207,4 +207,3 @@ void printk(const char *format, ...)
 }
 
 #endif
-
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index fe3a43be20..2c3fb7d72e 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <xen/bug.h>
 #include <xen/cache.h>
 #include <xen/compiler.h>
 #include <xen/init.h>
@@ -14,6 +15,9 @@
 #include <asm/page.h>
 #include <asm/processor.h>
 
+unsigned long __ro_after_init frametable_base_pdx;
+unsigned long __ro_after_init frametable_virt_end;
+
 struct mmu_desc {
     unsigned int num_levels;
     unsigned int pgtbl_count;
@@ -294,3 +298,49 @@ unsigned long __init calc_phys_offset(void)
     phys_offset = load_start - XEN_VIRT_START;
     return phys_offset;
 }
+
+void put_page(struct page_info *page)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned long get_upper_mfn_bound(void)
+{
+    /* No memory hotplug yet, so current memory limit is the final one. */
+    return max_page - 1;
+}
+
+void arch_dump_shared_mem_info(void)
+{
+    BUG_ON("unimplemented");
+}
+
+int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
+{
+    BUG_ON("unimplemented");
+    return -1;
+}
+
+int xenmem_add_to_physmap_one(struct domain *d, unsigned int space,
+                              union add_to_physmap_extra extra,
+                              unsigned long idx, gfn_t gfn)
+{
+    BUG_ON("unimplemented");
+
+    return 0;
+}
+
+int destroy_xen_mappings(unsigned long s, unsigned long e)
+{
+    BUG_ON("unimplemented");
+    return -1;
+}
+
+int map_pages_to_xen(unsigned long virt,
+                     mfn_t mfn,
+                     unsigned long nr_mfns,
+                     unsigned int flags)
+{
+    BUG_ON("unimplemented");
+    return -1;
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 98a94c4c48..8bb5bdb2ae 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -1,11 +1,19 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include <xen/bug.h>
 #include <xen/compile.h>
 #include <xen/init.h>
 #include <xen/mm.h>
 
+#include <public/version.h>
+
 #include <asm/early_printk.h>
 
+void arch_get_xen_caps(xen_capabilities_info_t *info)
+{
+    BUG_ON("unimplemented");
+}
+
 /* Xen stack for bringing up the first CPU. */
 unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
     __aligned(STACK_SIZE);
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
new file mode 100644
index 0000000000..529f1dbe52
--- /dev/null
+++ b/xen/arch/riscv/stubs.c
@@ -0,0 +1,438 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include <xen/cpumask.h>
+#include <xen/domain.h>
+#include <xen/irq.h>
+#include <xen/nodemask.h>
+#include <xen/time.h>
+#include <public/domctl.h>
+
+#include <asm/current.h>
+
+/* smpboot.c */
+
+cpumask_t cpu_online_map;
+cpumask_t cpu_present_map;
+cpumask_t cpu_possible_map;
+
+/* ID of the PCPU we're running on */
+DEFINE_PER_CPU(unsigned int, cpu_id);
+/* XXX these seem awfully x86ish... */
+/* representing HT siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
+/* representing HT and core siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
+
+nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
+
+/*
+ * max_page is defined in page_alloc.c which isn't complied for now.
+ * definition of max_page will be remove as soon as page_alloc is built.
+ */
+unsigned long __read_mostly max_page;
+
+/* time.c */
+
+unsigned long __ro_after_init cpu_khz;  /* CPU clock frequency in kHz. */
+
+s_time_t get_s_time(void)
+{
+    BUG_ON("unimplemented");
+}
+
+int reprogram_timer(s_time_t timeout)
+{
+    BUG_ON("unimplemented");
+}
+
+void send_timer_event(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void domain_set_time_offset(struct domain *d, int64_t time_offset_seconds)
+{
+    BUG_ON("unimplemented");
+}
+
+/* shutdown.c */
+
+void machine_restart(unsigned int delay_millisecs)
+{
+    BUG_ON("unimplemented");
+}
+
+void machine_halt(void)
+{
+    BUG_ON("unimplemented");
+}
+
+/* domctl.c */
+
+long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
+                    XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_get_domain_info(const struct domain *d,
+                          struct xen_domctl_getdomaininfo *info)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
+{
+    BUG_ON("unimplemented");
+}
+
+/* monitor.c */
+
+int arch_monitor_domctl_event(struct domain *d,
+                              struct xen_domctl_monitor_op *mop)
+{
+    BUG_ON("unimplemented");
+}
+
+/* smp.c */
+
+void arch_flush_tlb_mask(const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+void smp_send_event_check_mask(const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+void smp_send_call_function_mask(const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+/* irq.c */
+
+struct pirq *alloc_pirq_struct(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+int pirq_guest_bind(struct vcpu *v, struct pirq *pirq, int will_share)
+{
+    BUG_ON("unimplemented");
+}
+
+void pirq_guest_unbind(struct domain *d, struct pirq *pirq)
+{
+    BUG_ON("unimplemented");
+}
+
+void pirq_set_affinity(struct domain *d, int pirq, const cpumask_t *mask)
+{
+    BUG_ON("unimplemented");
+}
+
+hw_irq_controller no_irq_type = {
+    .typename = "none",
+    .startup = irq_startup_none,
+    .shutdown = irq_shutdown_none,
+    .enable = irq_enable_none,
+    .disable = irq_disable_none,
+};
+
+int arch_init_one_irq_desc(struct irq_desc *desc)
+{
+    BUG_ON("unimplemented");
+}
+
+void smp_send_state_dump(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
+/* domain.c */
+
+DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
+unsigned long __per_cpu_offset[NR_CPUS];
+
+void context_switch(struct vcpu *prev, struct vcpu *next)
+{
+    BUG_ON("unimplemented");
+}
+
+void continue_running(struct vcpu *same)
+{
+    BUG_ON("unimplemented");
+}
+
+void sync_local_execstate(void)
+{
+    BUG_ON("unimplemented");
+}
+
+void sync_vcpu_execstate(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void startup_cpu_idle_loop(void)
+{
+    BUG_ON("unimplemented");
+}
+
+void free_domain_struct(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void dump_pageframe_info(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void free_vcpu_struct(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_vcpu_create(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_vcpu_destroy(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_switch_to_aarch64_mode(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_domain_create(struct domain *d,
+                       struct xen_domctl_createdomain *config,
+                       unsigned int flags)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_domain_teardown(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_destroy(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_shutdown(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_pause(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_unpause(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_domain_soft_reset(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_domain_creation_finished(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    BUG_ON("unimplemented");
+}
+
+int arch_vcpu_reset(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+int domain_relinquish_resources(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_dump_domain_info(struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_dump_vcpu_info(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_mark_events_pending(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_update_evtchn_irq(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_block_unless_event_pending(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+void vcpu_kick(struct vcpu *v)
+{
+    BUG_ON("unimplemented");
+}
+
+struct domain *alloc_domain_struct(void)
+{
+    BUG_ON("unimplemented");
+}
+
+struct vcpu *alloc_vcpu_struct(const struct domain *d)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned long
+hypercall_create_continuation(unsigned int op, const char *format, ...)
+{
+    BUG_ON("unimplemented");
+}
+
+int __init parse_arch_dom0_param(const char *s, const char *e)
+{
+    BUG_ON("unimplemented");
+}
+
+/* guestcopy.c */
+
+unsigned long raw_copy_to_guest(void *to, const void *from, unsigned int len)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned long raw_copy_from_guest(void *to, const void __user *from,
+                                  unsigned int len)
+{
+    BUG_ON("unimplemented");
+}
+
+/* sysctl.c */
+
+long arch_do_sysctl(struct xen_sysctl *sysctl,
+                    XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
+{
+    BUG_ON("unimplemented");
+}
+
+void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
+{
+    BUG_ON("unimplemented");
+}
+
+/* p2m.c */
+
+int arch_set_paging_mempool_size(struct domain *d, uint64_t size)
+{
+    BUG_ON("unimplemented");
+}
+
+int unmap_mmio_regions(struct domain *d,
+                       gfn_t start_gfn,
+                       unsigned long nr,
+                       mfn_t mfn)
+{
+    BUG_ON("unimplemented");
+}
+
+int map_mmio_regions(struct domain *d,
+                     gfn_t start_gfn,
+                     unsigned long nr,
+                     mfn_t mfn)
+{
+    BUG_ON("unimplemented");
+}
+
+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
+                          unsigned long gfn, mfn_t mfn)
+{
+    BUG_ON("unimplemented");
+}
+
+/* Return the size of the pool, in bytes. */
+int arch_get_paging_mempool_size(struct domain *d, uint64_t *size)
+{
+    BUG_ON("unimplemented");
+}
+
+/* delay.c */
+
+void udelay(unsigned long usecs)
+{
+    BUG_ON("unimplemented");
+}
+
+/* guest_access.h */ 
+
+static inline unsigned long raw_clear_guest(void *to, unsigned int len)
+{
+    BUG_ON("unimplemented");
+}
+
+/* smpboot.c */
+
+int __cpu_up(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
+void __cpu_disable(void)
+{
+    BUG_ON("unimplemented");
+}
+
+void __cpu_die(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
+/*
+ * The following functions are defined in common/irq.c, which will be built in
+ * the next commit, so these changes will be removed there.
+ */
+
+void cf_check irq_actor_none(struct irq_desc *desc)
+{
+    BUG_ON("unimplemented");
+}
+
+unsigned int cf_check irq_startup_none(struct irq_desc *desc)
+{
+    BUG_ON("unimplemented");
+
+    return 0;
+}
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index ccd3593f5a..ca56df75d8 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -4,6 +4,10 @@
  *
  * RISC-V Trap handlers
  */
+
+#include <xen/lib.h>
+#include <xen/sched.h>
+
 #include <asm/processor.h>
 #include <asm/traps.h>
 
@@ -11,3 +15,24 @@ void do_trap(struct cpu_user_regs *cpu_regs)
 {
     die();
 }
+
+void vcpu_show_execution_state(struct vcpu *v)
+{
+    assert_failed("need to be implented");
+}
+
+void show_execution_state(const struct cpu_user_regs *regs)
+{
+    printk("implement show_execution_state(regs)\n");
+}
+
+void arch_hypercall_tasklet_result(struct vcpu *v, long res)
+{
+    assert_failed("need to be implented");
+}
+
+enum mc_disposition arch_do_multicall_call(struct mc_state *state)
+{
+    assert_failed("need to be implented");
+    return mc_continue;
+}
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 29/30] xen/riscv: enable full Xen build
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (27 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 28/30] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-05 15:32 ` [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
  29 siblings, 0 replies; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Alistair Francis, Bob Eshleman, Connor Davis,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in V4:
 - drop stubs for irq_actor_none() and irq_actor_none() as common/irq.c is compiled now.
 - drop defintion of max_page in stubs.c as common/page_alloc.c is compiled now.
 - drop printk() related changes in riscv/early_printk.c as common version will be used.
---
Changes in V3:
 - Reviewed-by: Jan Beulich <jbeulich@suse.com>
 - unrealted change dropped in tiny64_defconfig
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/Makefile       |  16 +++-
 xen/arch/riscv/arch.mk        |   4 -
 xen/arch/riscv/early_printk.c | 167 ----------------------------------
 xen/arch/riscv/stubs.c        |  23 -----
 4 files changed, 15 insertions(+), 195 deletions(-)

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 60afbc0ad9..81b77b13d6 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -12,10 +12,24 @@ $(TARGET): $(TARGET)-syms
 	$(OBJCOPY) -O binary -S $< $@
 
 $(TARGET)-syms: $(objtree)/prelink.o $(obj)/xen.lds
-	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) -o $@
+	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< \
+	    $(objtree)/common/symbols-dummy.o -o $(dot-target).0
+	$(NM) -pa --format=sysv $(dot-target).0 \
+		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+		> $(dot-target).0.S
+	$(MAKE) $(build)=$(@D) $(dot-target).0.o
+	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< \
+	    $(dot-target).0.o -o $(dot-target).1
+	$(NM) -pa --format=sysv $(dot-target).1 \
+		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+		> $(dot-target).1.S
+	$(MAKE) $(build)=$(@D) $(dot-target).1.o
+	$(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
+	    $(dot-target).1.o -o $@
 	$(NM) -pa --format=sysv $@ \
 		| $(objtree)/tools/symbols --all-symbols --xensyms --sysv --sort \
 		> $@.map
+	rm -f $(@D)/.$(@F).[0-9]*
 
 $(obj)/xen.lds: $(src)/xen.lds.S FORCE
 	$(call if_changed_dep,cpp_lds_S)
diff --git a/xen/arch/riscv/arch.mk b/xen/arch/riscv/arch.mk
index a4b53adaf7..4363776f34 100644
--- a/xen/arch/riscv/arch.mk
+++ b/xen/arch/riscv/arch.mk
@@ -14,7 +14,3 @@ riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-y)_zihintpause
 # -mcmodel=medlow would force Xen into the lower half.
 
 CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
-
-# TODO: Drop override when more of the build is working
-override ALL_OBJS-y = arch/$(SRCARCH)/built_in.o
-override ALL_LIBS-y =
diff --git a/xen/arch/riscv/early_printk.c b/xen/arch/riscv/early_printk.c
index 6d0911659d..610c814f54 100644
--- a/xen/arch/riscv/early_printk.c
+++ b/xen/arch/riscv/early_printk.c
@@ -40,170 +40,3 @@ void early_printk(const char *str)
         str++;
     }
 }
-
-/*
- * The following #if 1 ... #endif should be removed after printk
- * and related stuff are ready.
- */
-#if 1
-
-#include <xen/stdarg.h>
-#include <xen/string.h>
-
-/**
- * strlen - Find the length of a string
- * @s: The string to be sized
- */
-size_t (strlen)(const char * s)
-{
-    const char *sc;
-
-    for (sc = s; *sc != '\0'; ++sc)
-        /* nothing */;
-    return sc - s;
-}
-
-/**
- * memcpy - Copy one area of memory to another
- * @dest: Where to copy to
- * @src: Where to copy from
- * @count: The size of the area.
- *
- * You should not use this function to access IO space, use memcpy_toio()
- * or memcpy_fromio() instead.
- */
-void *(memcpy)(void *dest, const void *src, size_t count)
-{
-    char *tmp = (char *) dest, *s = (char *) src;
-
-    while (count--)
-        *tmp++ = *s++;
-
-    return dest;
-}
-
-int vsnprintf(char* str, size_t size, const char* format, va_list args)
-{
-    size_t i = 0; /* Current position in the output string */
-    size_t written = 0; /* Total number of characters written */
-    char* dest = str;
-
-    while ( format[i] != '\0' && written < size - 1 )
-    {
-        if ( format[i] == '%' )
-        {
-            i++;
-
-            if ( format[i] == '\0' )
-                break;
-
-            if ( format[i] == '%' )
-            {
-                if ( written < size - 1 )
-                {
-                    dest[written] = '%';
-                    written++;
-                }
-                i++;
-                continue;
-            }
-
-            /*
-             * Handle format specifiers.
-             * For simplicity, only %s and %d are implemented here.
-             */
-
-            if ( format[i] == 's' )
-            {
-                char* arg = va_arg(args, char*);
-                size_t arglen = strlen(arg);
-
-                size_t remaining = size - written - 1;
-
-                if ( arglen > remaining )
-                    arglen = remaining;
-
-                memcpy(dest + written, arg, arglen);
-
-                written += arglen;
-                i++;
-            }
-            else if ( format[i] == 'd' )
-            {
-                int arg = va_arg(args, int);
-
-                /* Convert the integer to string representation */
-                char numstr[32]; /* Assumes a maximum of 32 digits */
-                int numlen = 0;
-                int num = arg;
-                size_t remaining;
-
-                if ( arg < 0 )
-                {
-                    if ( written < size - 1 )
-                    {
-                        dest[written] = '-';
-                        written++;
-                    }
-
-                    num = -arg;
-                }
-
-                do
-                {
-                    numstr[numlen] = '0' + num % 10;
-                    num = num / 10;
-                    numlen++;
-                } while ( num > 0 );
-
-                /* Reverse the string */
-                for (int j = 0; j < numlen / 2; j++)
-                {
-                    char tmp = numstr[j];
-                    numstr[j] = numstr[numlen - 1 - j];
-                    numstr[numlen - 1 - j] = tmp;
-                }
-
-                remaining = size - written - 1;
-
-                if ( numlen > remaining )
-                    numlen = remaining;
-
-                memcpy(dest + written, numstr, numlen);
-
-                written += numlen;
-                i++;
-            }
-        }
-        else
-        {
-            if ( written < size - 1 )
-            {
-                dest[written] = format[i];
-                written++;
-            }
-            i++;
-        }
-    }
-
-    if ( size > 0 )
-        dest[written] = '\0';
-
-    return written;
-}
-
-void printk(const char *format, ...)
-{
-    static char buf[1024];
-
-    va_list args;
-    va_start(args, format);
-
-    (void)vsnprintf(buf, sizeof(buf), format, args);
-
-    early_printk(buf);
-
-    va_end(args);
-}
-
-#endif
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 529f1dbe52..bda35fc347 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -24,12 +24,6 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-/*
- * max_page is defined in page_alloc.c which isn't complied for now.
- * definition of max_page will be remove as soon as page_alloc is built.
- */
-unsigned long __read_mostly max_page;
-
 /* time.c */
 
 unsigned long __ro_after_init cpu_khz;  /* CPU clock frequency in kHz. */
@@ -419,20 +413,3 @@ void __cpu_die(unsigned int cpu)
 {
     BUG_ON("unimplemented");
 }
-
-/*
- * The following functions are defined in common/irq.c, which will be built in
- * the next commit, so these changes will be removed there.
- */
-
-void cf_check irq_actor_none(struct irq_desc *desc)
-{
-    BUG_ON("unimplemented");
-}
-
-unsigned int cf_check irq_startup_none(struct irq_desc *desc)
-{
-    BUG_ON("unimplemented");
-
-    return 0;
-}
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
                   ` (28 preceding siblings ...)
  2024-02-05 15:32 ` [PATCH v4 29/30] xen/riscv: enable full Xen build Oleksii Kurochko
@ 2024-02-05 15:32 ` Oleksii Kurochko
  2024-02-14  9:52   ` Jan Beulich
  29 siblings, 1 reply; 107+ messages in thread
From: Oleksii Kurochko @ 2024-02-05 15:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksii Kurochko, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 Changes in V4:
  - Update version of GCC (12.2) and GNU Binutils (2.39) to the version
    which are in Xen's contrainter for RISC-V
---
 Changes in V3:
  - new patch
---
 README | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README b/README
index c8a108449e..9a898125e1 100644
--- a/README
+++ b/README
@@ -48,6 +48,9 @@ provided by your OS distributor:
       - For ARM 64-bit:
         - GCC 5.1 or later
         - GNU Binutils 2.24 or later
+      - For RISC-V 64-bit:
+        - GCC 12.2 or later
+        - GNU Binutils 2.39 or later
     * POSIX compatible awk
     * Development install of zlib (e.g., zlib-dev)
     * Development install of Python 2.7 or later (e.g., python-dev)
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 02/30] xen/riscv: use some asm-generic headers
  2024-02-05 15:32 ` [PATCH v4 02/30] xen/riscv: use some asm-generic headers Oleksii Kurochko
@ 2024-02-12 15:03   ` Jan Beulich
  2024-02-14  9:54     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:03 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Some headers are the same as asm-generic verions of them
> so use them instead of arch-specific headers.

Just to mention it (I'll commit this as is, unless asked to do otherwise):
With this description I'd expect those "some headers" to be removed by
this patch. Yet you're not talking about anything that exists; instead I
think you mean "would end up the same". Yet that's precisely what
asm-generic/ is for. Hence I would have said something along the lines of
"don't need any customization".

> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
>  As [PATCH v6 0/9] Introduce generic headers
>  (https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com/)
>  is not stable, the list in asm/Makefile can be changed, but the changes will
>  be easy.

Or wait - doesn't this mean the change here can't be committed yet? I
know the cover letter specifies dependencies, yet I think we need to come
to a point where this large series won't need re-posting again and again.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V
  2024-02-05 15:32 ` [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V Oleksii Kurochko
@ 2024-02-12 15:05   ` Jan Beulich
  2024-02-14  9:57     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:05 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> No specific header is needed to include in public/hvm/save.h for
> PPC and RISC-V for now.
> 
> Code related to PPC was changed based on the comment:
> https://lore.kernel.org/xen-devel/c2f3280e-2208-496b-a0b5-fda1a2076b3a@raptorengineering.com/
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

Albeit I don't see why ...

> --- a/xen/include/public/hvm/save.h
> +++ b/xen/include/public/hvm/save.h
> @@ -89,8 +89,8 @@ DECLARE_HVM_SAVE_TYPE(END, 0, struct hvm_save_end);
>  #include "../arch-x86/hvm/save.h"
>  #elif defined(__arm__) || defined(__aarch64__)
>  #include "../arch-arm/hvm/save.h"
> -#elif defined(__powerpc64__)
> -#include "../arch-ppc.h"
> +#elif defined(__powerpc64__) || defined(__riscv)
> +/* no specific header to include */
>  #else

... this isn't simply

#elif !defined(__powerpc64__) && !defined(__riscv)

Jan

>  #error "unsupported architecture"
>  #endif



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h
  2024-02-05 15:32 ` [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h Oleksii Kurochko
@ 2024-02-12 15:07   ` Jan Beulich
  2024-02-14 10:01     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:07 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h
  2024-02-05 15:32 ` [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h Oleksii Kurochko
@ 2024-02-12 15:10   ` Jan Beulich
  2024-02-14 10:05     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:10 UTC (permalink / raw)
  To: Oleksii Kurochko; +Cc: Paul Durrant, Roger Pau Monné, xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> asm/iommu.h shouldn't

... need to ...

> be included when CONFIG_HAS_PASSTHROUGH
> isn't enabled.
> As <asm/iommu.h> is ifdef-ed by CONFIG_HAS_PASSTHROUGH it should
> be also ifdef-ed field "struct arch_iommu arch" in struct domain_iommu
> as definition of arch_iommu is located in <asm/iommu.h>.
> 
> These amount of changes are enough to avoid generation of empty
> asm/iommu.h for now.

I'm also inclined to insert "just" here, to make more obvious why e.g.
...

> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V4:
>  - Update the commit message.
> ---
> Changes in V3:
>  - new patch.
> ---
>  xen/include/xen/iommu.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index a21f25df9f..7aa6a77209 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -337,7 +337,9 @@ extern int iommu_add_extra_reserved_device_memory(unsigned long start,
>  extern int iommu_get_extra_reserved_device_memory(iommu_grdm_t *func,
>                                                    void *ctxt);
>  
> +#ifdef CONFIG_HAS_PASSTHROUGH
>  #include <asm/iommu.h>
> +#endif
>  
>  #ifndef iommu_call
>  # define iommu_call(ops, fn, args...) ((ops)->fn(args))
> @@ -345,7 +347,9 @@ extern int iommu_get_extra_reserved_device_memory(iommu_grdm_t *func,
>  #endif
>  
>  struct domain_iommu {
> +#ifdef CONFIG_HAS_PASSTHROUGH
>      struct arch_iommu arch;
> +#endif
>  
>      /* iommu_ops */
>      const struct iommu_ops *platform_ops;

... this is left visible despite quite likely being meaningless without
HAS_PASSTHROUGH.

Then (happy to make the small edits while committing):
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 11/30] xen/riscv: introduce smp.h
  2024-02-05 15:32 ` [PATCH v4 11/30] xen/riscv: introduce smp.h Oleksii Kurochko
@ 2024-02-12 15:13   ` Jan Beulich
  2024-02-14 11:06     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:13 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 16/30] xen/riscv: introduce p2m.h
  2024-02-05 15:32 ` [PATCH v4 16/30] xen/riscv: introduce p2m.h Oleksii Kurochko
@ 2024-02-12 15:16   ` Jan Beulich
  2024-02-14 12:12     ` Oleksii
  2024-02-18 18:18   ` Julien Grall
  1 sibling, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:16 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
with two more nits:

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/p2m.h
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef __ASM_RISCV_P2M_H__
> +#define __ASM_RISCV_P2M_H__
> +
> +#include <asm/page-bits.h>
> +
> +#define paddr_bits PADDR_BITS
> +
> +/*
> + * List of possible type for each page in the p2m entry.
> + * The number of available bit per page in the pte for this purpose is 2 bits.
> + * So it's possible to only have 4 fields. If we run out of value in the
> + * future, it's possible to use higher value for pseudo-type and don't store
> + * them in the p2m entry.
> + */
> +typedef enum {
> +    p2m_invalid = 0,    /* Nothing mapped here */
> +    p2m_ram_rw,         /* Normal read/write domain RAM */
> +} p2m_type_t;
> +
> +#include <xen/p2m-common.h>
> +
> +static inline int get_page_and_type(struct page_info *page,
> +                                    struct domain *domain,
> +                                    unsigned long type)
> +{
> +    BUG_ON("unimplemented");
> +    return -EINVAL;
> +}
> +
> +/* Look up a GFN and take a reference count on the backing page. */
> +typedef unsigned int p2m_query_t;
> +#define P2M_ALLOC    (1u<<0)   /* Populate PoD and paged-out entries */
> +#define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */
> +
> +static inline struct page_info *get_page_from_gfn(
> +    struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
> +{
> +    BUG_ON("unimplemented");
> +    return NULL;
> +}
> +
> +static inline void memory_type_changed(struct domain *d)
> +{
> +    BUG_ON("unimplemented");
> +}
> +
> +
> +static inline int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,

This line looks to be too long.

> +                                                        unsigned int order)
> +{
> +    return -EOPNOTSUPP;
> +}
> +
> +static inline int guest_physmap_add_entry(struct domain *d,
> +                            gfn_t gfn,
> +                            mfn_t mfn,
> +                            unsigned long page_order,
> +                            p2m_type_t t)

Indentation isn't quite right here.

I'll see about dealing with those while committing.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 18/30] xen/riscv: introduce time.h
  2024-02-05 15:32 ` [PATCH v4 18/30] xen/riscv: introduce time.h Oleksii Kurochko
@ 2024-02-12 15:18   ` Jan Beulich
  2024-02-14 12:14     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:18 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Nevertheless ...

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/time.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef __ASM_RISCV_TIME_H__
> +#define __ASM_RISCV_TIME_H__
> +
> +#include <xen/bug.h>
> +#include <asm/csr.h>
> +
> +struct vcpu;
> +
> +/* TODO: implement */
> +static inline void force_update_vcpu_system_time(struct vcpu *v) { BUG_ON("unimplemented"); }

... nit: Too long line. The comment also doesn't look to serve any purpose
anymore, with the BUG_ON() now taking uniform shape.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 19/30] xen/riscv: introduce event.h
  2024-02-05 15:32 ` [PATCH v4 19/30] xen/riscv: introduce event.h Oleksii Kurochko
@ 2024-02-12 15:20   ` Jan Beulich
  2024-02-14 12:16     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:20 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
again with a nit, though:

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/event.h
> @@ -0,0 +1,40 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef __ASM_RISCV_EVENT_H__
> +#define __ASM_RISCV_EVENT_H__
> +
> +#include <xen/lib.h>
> +
> +void vcpu_mark_events_pending(struct vcpu *v);
> +
> +static inline int vcpu_event_delivery_is_enabled(struct vcpu *v)
> +{
> +    BUG_ON("unimplemented");
> +    return 0;
> +}
> +
> +static inline int local_events_need_delivery(void)
> +{
> +    BUG_ON("unimplemented");
> +    return 0;
> +}
> +
> +static inline void local_event_delivery_enable(void)
> +{
> +    BUG_ON("unimplemented");
> +}
> +
> +/* No arch specific virq definition now. Default to global. */
> +static inline bool arch_virq_is_global(unsigned int virq)
> +{
> +    return true;
> +}
> +
> +#endif

This want to gain the usual comment.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 28/30] xen/rirscv: add minimal amount of stubs to build full Xen
  2024-02-05 15:32 ` [PATCH v4 28/30] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
@ 2024-02-12 15:24   ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:24 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/early_printk.c
> +++ b/xen/arch/riscv/early_printk.c
> @@ -207,4 +207,3 @@ void printk(const char *format, ...)
>  }
>  
>  #endif
> -

Unrelated change?

> --- a/xen/arch/riscv/traps.c
> +++ b/xen/arch/riscv/traps.c
> @@ -4,6 +4,10 @@
>   *
>   * RISC-V Trap handlers
>   */
> +
> +#include <xen/lib.h>
> +#include <xen/sched.h>
> +
>  #include <asm/processor.h>
>  #include <asm/traps.h>
>  
> @@ -11,3 +15,24 @@ void do_trap(struct cpu_user_regs *cpu_regs)
>  {
>      die();
>  }
> +
> +void vcpu_show_execution_state(struct vcpu *v)
> +{
> +    assert_failed("need to be implented");
> +}
> +
> +void show_execution_state(const struct cpu_user_regs *regs)
> +{
> +    printk("implement show_execution_state(regs)\n");
> +}
> +
> +void arch_hypercall_tasklet_result(struct vcpu *v, long res)
> +{
> +    assert_failed("need to be implented");
> +}
> +
> +enum mc_disposition arch_do_multicall_call(struct mc_state *state)
> +{
> +    assert_failed("need to be implented");
> +    return mc_continue;
> +}

The assert_failed() here want switching to the "canonical" BUG_ON().

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 09/30] xen/riscv: introduce bitops.h
  2024-02-05 15:32 ` [PATCH v4 09/30] xen/riscv: introduce bitops.h Oleksii Kurochko
@ 2024-02-12 15:58   ` Jan Beulich
  2024-02-14 11:06     ` Oleksii
  2024-02-13  9:19   ` Jan Beulich
  1 sibling, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-12 15:58 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/bitops.h
> @@ -0,0 +1,164 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (C) 2012 Regents of the University of California */
> +
> +#ifndef _ASM_RISCV_BITOPS_H
> +#define _ASM_RISCV_BITOPS_H
> +
> +#include <asm/system.h>
> +
> +#include <asm-generic/bitops/bitops-bits.h>

Especially with ...

> +/* Based on linux/arch/include/linux/bits.h */
> +
> +#define BIT_MASK(nr)        (1UL << ((nr) % BITS_PER_LONG))
> +#define BIT_WORD(nr)        ((nr) / BITS_PER_LONG)

... these it's not entirely obvious why bitops-bits.h would be needed
here.

> +#define __set_bit(n,p)      set_bit(n,p)
> +#define __clear_bit(n,p)    clear_bit(n,p)

Nit (as before?): Missing blanks after commas.

> +/* Based on linux/arch/include/asm/bitops.h */
> +
> +#if ( BITS_PER_LONG == 64 )

Imo the parentheses here make things only harder to read.

> +#define __AMO(op)   "amo" #op ".d"
> +#elif ( BITS_PER_LONG == 32 )
> +#define __AMO(op)   "amo" #op ".w"
> +#else
> +#error "Unexpected BITS_PER_LONG"
> +#endif
> +
> +#define __test_and_op_bit_ord(op, mod, nr, addr, ord)   \

The revision log says __test_and_* were renamed. Same anomaly for
__test_and_op_bit() then.

> +({                                                      \
> +    unsigned long __res, __mask;                        \

Leftover leading underscores?

> +    __mask = BIT_MASK(nr);                              \
> +    __asm__ __volatile__ (                              \
> +        __AMO(op) #ord " %0, %2, %1"                    \
> +        : "=r" (__res), "+A" (addr[BIT_WORD(nr)])       \
> +        : "r" (mod(__mask))                             \
> +        : "memory");                                    \
> +    ((__res & __mask) != 0);                            \
> +})
> +
> +#define __op_bit_ord(op, mod, nr, addr, ord)    \
> +    __asm__ __volatile__ (                      \
> +        __AMO(op) #ord " zero, %1, %0"          \
> +        : "+A" (addr[BIT_WORD(nr)])             \
> +        : "r" (mod(BIT_MASK(nr)))               \
> +        : "memory");
> +
> +#define __test_and_op_bit(op, mod, nr, addr)    \
> +    __test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
> +#define __op_bit(op, mod, nr, addr) \
> +    __op_bit_ord(op, mod, nr, addr, )
> +
> +/* Bitmask modifiers */
> +#define __NOP(x)    (x)
> +#define __NOT(x)    (~(x))

Here the (double) leading underscores are truly worrying: Simple
names like this aren't impossible to be assigned meaninb by a compiler.

> +/**
> + * __test_and_set_bit - Set a bit and return its old value
> + * @nr: Bit to set
> + * @addr: Address to count from
> + *
> + * This operation may be reordered on other architectures than x86.
> + */
> +static inline int test_and_set_bit(int nr, volatile void *p)
> +{
> +    volatile uint32_t *addr = p;

With BIT_WORD() / BIT_MASK() being long-based, is the use of uint32_t
here actually correct?

> +    return __test_and_op_bit(or, __NOP, nr, addr);
> +}
> +
> +/**
> + * __test_and_clear_bit - Clear a bit and return its old value
> + * @nr: Bit to clear
> + * @addr: Address to count from
> + *
> + * This operation can be reordered on other architectures other than x86.

Nit: double "other" (and I think it's the 1st one that wants dropping,
not - as the earlier comment suggests - the 2nd one). Question is: Are
the comments correct? Both resolve to something which is (also) at
least a compiler barrier. Same concern also applies further down, to
at least set_bit() and clear_bit().

> + */
> +static inline int test_and_clear_bit(int nr, volatile void *p)
> +{
> +    volatile uint32_t *addr = p;
> +
> +    return __test_and_op_bit(and, __NOT, nr, addr);
> +}
> +
> +/**
> + * set_bit - Atomically set a bit in memory
> + * @nr: the bit to set
> + * @addr: the address to start counting from
> + *
> + * Note: there are no guarantees that this function will not be reordered
> + * on non x86 architectures, so if you are writing portable code,
> + * make sure not to rely on its reordering guarantees.
> + *
> + * Note that @nr may be almost arbitrarily large; this function is not
> + * restricted to acting on a single-word quantity.
> + */
> +static inline void set_bit(int nr, volatile void *p)
> +{
> +    volatile uint32_t *addr = p;
> +
> +    __op_bit(or, __NOP, nr, addr);
> +}
> +
> +/**
> + * clear_bit - Clears a bit in memory
> + * @nr: Bit to clear
> + * @addr: Address to start counting from
> + *
> + * Note: there are no guarantees that this function will not be reordered
> + * on non x86 architectures, so if you are writing portable code,
> + * make sure not to rely on its reordering guarantees.
> + */
> +static inline void clear_bit(int nr, volatile void *p)
> +{
> +    volatile uint32_t *addr = p;
> +
> +    __op_bit(and, __NOT, nr, addr);
> +}
> +
> +/**
> + * test_and_change_bit - Change a bit and return its old value

How come this one's different? I notice the comments are the same (and
hence as confusing) in Linux; are you sure they're applicable there?

> + * @nr: Bit to change
> + * @addr: Address to count from
> + *
> + * This operation is atomic and cannot be reordered.
> + * It also implies a memory barrier.
> + */
> +static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
> +{
> +	return __test_and_op_bit(xor, __NOP, nr, addr);
> +}
> +
> +#undef __test_and_op_bit
> +#undef __op_bit
> +#undef __NOP
> +#undef __NOT
> +#undef __AMO
> +
> +#include <asm-generic/bitops/generic-non-atomic.h>
> +
> +#define __test_and_set_bit generic___test_and_set_bit
> +#define __test_and_clear_bit generic___test_and_clear_bit
> +#define __test_and_change_bit generic___test_and_change_bit
> +
> +#include <asm-generic/bitops/fls.h>
> +#include <asm-generic/bitops/flsl.h>
> +#include <asm-generic/bitops/__ffs.h>
> +#include <asm-generic/bitops/ffs.h>
> +#include <asm-generic/bitops/ffsl.h>
> +#include <asm-generic/bitops/ffz.h>
> +#include <asm-generic/bitops/find-first-set-bit.h>
> +#include <asm-generic/bitops/hweight.h>
> +#include <asm-generic/bitops/test-bit.h>

To be honest there's too much stuff being added here to asm-generic/,
all in one go. I'll see about commenting on the remaining parts here,
but I'd like to ask that you seriously consider splitting.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 09/30] xen/riscv: introduce bitops.h
  2024-02-05 15:32 ` [PATCH v4 09/30] xen/riscv: introduce bitops.h Oleksii Kurochko
  2024-02-12 15:58   ` Jan Beulich
@ 2024-02-13  9:19   ` Jan Beulich
  1 sibling, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-13  9:19 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/config.h
> +++ b/xen/arch/riscv/include/asm/config.h
> @@ -50,6 +50,8 @@
>  # error "Unsupported RISCV variant"
>  #endif
>  
> +#define BITS_PER_BYTE 8
> +
>  #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>  #define BITS_PER_LONG  (BYTES_PER_LONG << 3)
>  #define POINTER_ALIGN  BYTES_PER_LONG

How does this change relate to this patch? I can't see the new symbol
being used anywhere.

> --- /dev/null
> +++ b/xen/include/asm-generic/bitops/__ffs.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_BITOPS___FFS_H_
> +#define _ASM_GENERIC_BITOPS___FFS_H_
> +
> +/**
> + * ffs - find first bit in word.

__ffs ? Or wait, ...

> + * @word: The word to search
> + *
> + * Returns 0 if no bit exists, otherwise returns 1-indexed bit location.

... this actually describes ffs(), not __ffs(), and the implementation
doesn't match the description. The correct description for this function
(as Linux also has it)

 * Undefined if no bit exists, so code should check against 0 first.

Which raises a question regarding "Taken from Linux-6.4.0-rc1" in the
description. ffs.h pretty clearly also doesn't come from there.

I first I thought I might withdraw my earlier request to split all of
this up. But with just these two observations I now feel it's even
more important that you do, so every piece can be properly attributed
to (and then checked for) its origin.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-05 15:32 ` [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
@ 2024-02-13 10:37   ` Jan Beulich
  2024-02-15 13:41     ` Oleksii
  2024-02-18 19:00   ` Julien Grall
  1 sibling, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-13 10:37 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> @@ -0,0 +1,237 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright (C) 2014 Regents of the University of California */
> +
> +#ifndef _ASM_RISCV_CMPXCHG_H
> +#define _ASM_RISCV_CMPXCHG_H
> +
> +#include <xen/compiler.h>
> +#include <xen/lib.h>
> +
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))

This feels risky: Consider what happens when someone passes 2U as 2nd argument.
The cheapest adjustment to make would be to use 1UL in the expression.

> +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier, acquire_barrier) \
> +({ \
> +    asm volatile( \
> +        release_barrier \
> +        " amoswap" sfx " %0, %2, %1\n" \

While I won't insist, the revision log says \n were dropped from asm()
where not needed. A separator is needed here only if ...

> +        acquire_barrier \

... this isn't blank. Which imo suggests that the separator should be
part of the argument passed in. But yes, one can view this differently,
hence why I said I won't insist.

As to the naming of the two  - I'd generally suggest to make as litte
implications as possible: It doesn't really matter here whether it's
acquire or release; that matters at the use sites. What matters here
is that one is a "pre" barrier and the other is a "post" one.

> +        : "=r" (ret), "+A" (*ptr) \
> +        : "r" (new) \
> +        : "memory" ); \
> +})
> +
> +#define emulate_xchg_1_2(ptr, new, ret, release_barrier, acquire_barrier) \
> +({ \
> +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned long)ptr, 4); \

You now appear to assume that this macro is only used with inputs not
crossing word boundaries. That's okay as long as suitably guaranteed
at the use sites, but imo wants saying in a comment.

> +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr))) * BITS_PER_BYTE; \

Why 0x8 (i.e. spanning 64 bits), not 4 (matching the uint32_t use
above)?

> +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
> +    uint8_t mask_h = mask_l + mask_size - 1; \
> +    unsigned long mask = GENMASK(mask_h, mask_l); \

Personally I find this confusing, naming-wise: GENMASK() takes bit
positions as inputs, not masks. (Initially, because of this, I
thought the calculations all can't be quite right.)

> +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> +    unsigned long ret_; \
> +    unsigned long rc; \

Similarly, why unsigned long here?

I also wonder about the mix of underscore suffixed (or not) variable
names here.

> +    \
> +    asm volatile( \

Nit: Missing blank before opening parenthesis.

> +        release_barrier \
> +        "0: lr.d %0, %2\n" \

Even here it's an 8-byte access. Even if - didn't check - the insn was
okay to use with just a 4-byte aligned pointer, wouldn't it make sense
then to 8-byte align it, and be consistent throughout this macro wrt
the base unit acted upon? Alternatively, why not use lr.w here, thus
reducing possible collisions between multiple CPUs accessing the same
cache line?

> +        "   and  %1, %0, %z4\n" \
> +        "   or   %1, %1, %z3\n" \
> +        "   sc.d %1, %1, %2\n" \
> +        "   bnez %1, 0b\n" \
> +        acquire_barrier \
> +        : "=&r" (ret_), "=&r" (rc), "+A" (*ptr_32b_aligned) \
> +        : "rJ" (new_), "rJ" (~mask) \

I think that as soon as there are more than 2 or maybe 3 operands,
legibility is vastly improved by using named asm() operands.

> +        : "memory"); \

Nit: Missing blank before closing parenthesis.

> +    \
> +    ret = (__typeof__(*(ptr)))((ret_ & mask) >> mask_l); \
> +})

Why does "ret" need to be a macro argument? If you had only the
expression here, not the the assigment, ...

> +#define __xchg_generic(ptr, new, size, sfx, release_barrier, acquire_barrier) \
> +({ \
> +    __typeof__(ptr) ptr__ = (ptr); \

Is this local variable really needed? Can't you use "ptr" directly
in the three macro invocations?

> +    __typeof__(*(ptr)) new__ = (new); \
> +    __typeof__(*(ptr)) ret__; \
> +    switch (size) \
> +    { \
> +    case 1: \
> +    case 2: \
> +        emulate_xchg_1_2(ptr__, new__, ret__, release_barrier, acquire_barrier); \

... this would become

        ret__ = emulate_xchg_1_2(ptr__, new__, release_barrier, acquire_barrier); \

But, unlike assumed above, there's no enforcement here that a 2-byte
quantity won't cross a word, double-word, cache line, or even page
boundary. That might be okay if then the code would simply crash (like
the AMO insns emitted further down would), but aiui silent misbehavior
would result.

Also nit: The switch() higher up is (still/again) missing blanks.

> +        break; \
> +    case 4: \
> +        __amoswap_generic(ptr__, new__, ret__,\
> +                          ".w" sfx,  release_barrier, acquire_barrier); \
> +        break; \
> +    case 8: \
> +        __amoswap_generic(ptr__, new__, ret__,\
> +                          ".d" sfx,  release_barrier, acquire_barrier); \
> +        break; \
> +    default: \
> +        STATIC_ASSERT_UNREACHABLE(); \
> +    } \
> +    ret__; \
> +})
> +
> +#define xchg_relaxed(ptr, x) \
> +({ \
> +    __typeof__(*(ptr)) x_ = (x); \
> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), "", "", ""); \
> +})
> +
> +#define xchg_acquire(ptr, x) \
> +({ \
> +    __typeof__(*(ptr)) x_ = (x); \
> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
> +                                       "", "", RISCV_ACQUIRE_BARRIER); \
> +})
> +
> +#define xchg_release(ptr, x) \
> +({ \
> +    __typeof__(*(ptr)) x_ = (x); \
> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
> +                                       "", RISCV_RELEASE_BARRIER, ""); \
> +})
> +
> +#define xchg(ptr,x) \
> +({ \
> +    __typeof__(*(ptr)) ret__; \
> +    ret__ = (__typeof__(*(ptr))) \
> +            __xchg_generic(ptr, (unsigned long)(x), sizeof(*(ptr)), \
> +                           ".aqrl", "", ""); \

The .aqrl doesn't look to affect the (emulated) 1- and 2-byte cases.

Further, amoswap also exists in release-only and acquire-only forms.
Why do you prefer explicit barrier insns over those? (Looks to
similarly apply to the emulation path as well as to the cmpxchg
machinery then, as both lr and sc also come in all four possible
acquire/release forms. Perhaps for the emulation path using
explicit barriers is better, in case the acquire/release forms of
lr/sc - being used inside the loop - might perform worse.)

> +    ret__; \
> +})
> +
> +#define __generic_cmpxchg(ptr, old, new, ret, lr_sfx, sc_sfx, release_barrier, acquire_barrier)	\
> + ({ \
> +    register unsigned int rc; \
> +    asm volatile( \
> +        release_barrier \
> +        "0: lr" lr_sfx " %0, %2\n" \
> +        "   bne  %0, %z3, 1f\n" \
> +        "   sc" sc_sfx " %1, %z4, %2\n" \
> +        "   bnez %1, 0b\n" \
> +        acquire_barrier \
> +        "1:\n" \
> +        : "=&r" (ret), "=&r" (rc), "+A" (*ptr) \
> +        : "rJ" (old), "rJ" (new) \
> +        : "memory"); \
> + })
> +
> +#define emulate_cmpxchg_1_2(ptr, old, new, ret, sc_sfx, release_barrier, acquire_barrier) \
> +({ \
> +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned long)ptr, 4); \
> +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr))) * BITS_PER_BYTE; \
> +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
> +    uint8_t mask_h = mask_l + mask_size - 1; \
> +    unsigned long mask = GENMASK(mask_h, mask_l); \
> +    unsigned long old_ = (unsigned long)(old) << mask_l; \
> +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> +    unsigned long ret_; \
> +    unsigned long rc; \
> +    \
> +    __asm__ __volatile__ ( \
> +        release_barrier \
> +        "0: lr.d %0, %2\n" \
> +        "   and  %1, %0, %z5\n" \
> +        "   bne  %1, %z3, 1f\n" \
> +        "   and  %1, %0, %z6\n" \

Isn't this equivalent to

        "   xor  %1, %1, %0\n" \

this eliminating one (likely register) input?

Furthermore with the above and ...

> +        "   or   %1, %1, %z4\n" \
> +        "   sc.d" sc_sfx " %1, %1, %2\n" \
> +        "   bnez %1, 0b\n" \

... this re-written to

        "   xor  %0, %1, %0\n" \
        "   or   %0, %0, %z4\n" \
        "   sc.d" sc_sfx " %0, %0, %2\n" \
        "   bnez %0, 0b\n" \

you'd then no longer clobber the ret_ & mask you've already calculated
in %1, so ...

> +        acquire_barrier \
> +        "1:\n" \
> +        : "=&r" (ret_), "=&r" (rc), "+A" (*ptr_32b_aligned) \
> +        : "rJ" (old_), "rJ" (new_), \
> +          "rJ" (mask), "rJ" (~mask) \
> +        : "memory"); \
> +    \
> +    ret = (__typeof__(*(ptr)))((ret_ & mask) >> mask_l); \

... you could use rc here. (Of course variable naming or use then may
want changing, assuming I understand why "rc" is named the way it is.)

> +})
> +
> +/*
> + * Atomic compare and exchange.  Compare OLD with MEM, if identical,
> + * store NEW in MEM.  Return the initial value in MEM.  Success is
> + * indicated by comparing RETURN with OLD.
> + */
> +#define __cmpxchg_generic(ptr, old, new, size, sc_sfx, release_barrier, acquire_barrier) \
> +({ \
> +    __typeof__(ptr) ptr__ = (ptr); \
> +    __typeof__(*(ptr)) old__ = (__typeof__(*(ptr)))(old); \
> +    __typeof__(*(ptr)) new__ = (__typeof__(*(ptr)))(new); \
> +    __typeof__(*(ptr)) ret__; \
> +    switch (size) \
> +    { \
> +    case 1: \
> +    case 2: \
> +        emulate_cmpxchg_1_2(ptr, old, new, ret__,\
> +                            sc_sfx, release_barrier, acquire_barrier); \
> +        break; \
> +    case 4: \
> +        __generic_cmpxchg(ptr__, old__, new__, ret__, \
> +                          ".w", ".w"sc_sfx, release_barrier, acquire_barrier); \
> +        break; \
> +    case 8: \
> +        __generic_cmpxchg(ptr__, old__, new__, ret__, \
> +                          ".d", ".d"sc_sfx, release_barrier, acquire_barrier); \
> +        break; \
> +    default: \
> +        STATIC_ASSERT_UNREACHABLE(); \
> +    } \
> +    ret__; \
> +})
> +
> +#define cmpxchg_relaxed(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) o_ = (o); \
> +    __typeof__(*(ptr)) n_ = (n); \
> +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, \
> +                    o_, n_, sizeof(*(ptr)), "", "", ""); \
> +})
> +
> +#define cmpxchg_acquire(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) o_ = (o); \
> +    __typeof__(*(ptr)) n_ = (n); \
> +    (__typeof__(*(ptr)))__cmpxchg_generic(ptr, o_, n_, sizeof(*(ptr)), \
> +                                          "", "", RISCV_ACQUIRE_BARRIER); \
> +})
> +
> +#define cmpxchg_release(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) o_ = (o); \
> +    __typeof__(*(ptr)) n_ = (n); \
> +    (__typeof__(*(ptr)))__cmpxchg_release(ptr, o_, n_, sizeof(*(ptr)), \
> +                                          "", RISCV_RELEASE_BARRIER, ""); \
> +})
> +
> +#define cmpxchg(ptr, o, n) \
> +({ \
> +    __typeof__(*(ptr)) ret__; \
> +    ret__ = (__typeof__(*(ptr))) \
> +            __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
> +                              sizeof(*(ptr)), ".rl", "", " fence rw, rw\n"); \

No RISCV_..._BARRIER for use here and ...

> +    ret__; \
> +})
> +
> +#define __cmpxchg(ptr, o, n, s) \
> +({ \
> +    __typeof__(*(ptr)) ret__; \
> +    ret__ = (__typeof__(*(ptr))) \
> +            __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
> +                              s, ".rl", "", " fence rw, rw\n"); \

... here? And anyway, wouldn't it make sense to have

#define cmpxchg(ptr, o, n) __cmpxchg(ptr, o, n, sizeof(*(ptr))

to limit redundancy?

Plus wouldn't

#define __cmpxchg(ptr, o, n, s) \
    ((__typeof__(*(ptr))) \
     __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
                       s, ".rl", "", " fence rw, rw\n"))

be shorter and thus easier to follow as well? As I notice only now,
this would apparently apply further up as well.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 13/30] xen/riscv: introduce io.h
  2024-02-05 15:32 ` [PATCH v4 13/30] xen/riscv: introduce io.h Oleksii Kurochko
@ 2024-02-13 11:05   ` Jan Beulich
  2024-02-14 11:34     ` Oleksii
  2024-02-18 19:07   ` Julien Grall
  1 sibling, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-13 11:05 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> The header taken form Linux 6.4.0-rc1 and is based on
> arch/riscv/include/asm/mmio.h.
> 
> Addionally, to the header was added definions of ioremap_*().
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V4:
>  - delete inner parentheses in macros.
>  - s/u<N>/uint<N>.
> ---
> Changes in V3:
>  - re-sync with linux kernel
>  - update the commit message
> ---
> Changes in V2:
>  - Nothing changed. Only rebase.
> ---
>  xen/arch/riscv/include/asm/io.h | 142 ++++++++++++++++++++++++++++++++
>  1 file changed, 142 insertions(+)
>  create mode 100644 xen/arch/riscv/include/asm/io.h
> 
> diff --git a/xen/arch/riscv/include/asm/io.h b/xen/arch/riscv/include/asm/io.h
> new file mode 100644
> index 0000000000..1e61a40522
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/io.h
> @@ -0,0 +1,142 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * {read,write}{b,w,l,q} based on arch/arm64/include/asm/io.h
> + *   which was based on arch/arm/include/io.h
> + *
> + * Copyright (C) 1996-2000 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2014 Regents of the University of California
> + */
> +
> +
> +#ifndef _ASM_RISCV_IO_H
> +#define _ASM_RISCV_IO_H
> +
> +#include <asm/byteorder.h>
> +
> +/*
> + * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> + * change the properties of memory regions.  This should be fixed by the
> + * upcoming platform spec.
> + */
> +#define ioremap_nocache(addr, size) ioremap(addr, size)
> +#define ioremap_wc(addr, size) ioremap(addr, size)
> +#define ioremap_wt(addr, size) ioremap(addr, size)
> +
> +/* Generic IO read/write.  These perform native-endian accesses. */
> +#define __raw_writeb __raw_writeb

What use are this and the similar other #define-s?

> +static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
> +{
> +	asm volatile("sb %0, 0(%1)" : : "r" (val), "r" (addr));

Nit (throughout): Missing blanks. Or wait - is this file intended to
be Linux style? If so, it's just one blank that's missing.

> +/*
> + * Unordered I/O memory access primitives.  These are even more relaxed than
> + * the relaxed versions, as they don't even order accesses between successive
> + * operations to the I/O regions.
> + */
> +#define readb_cpu(c)		({ uint8_t  __r = __raw_readb(c); __r; })
> +#define readw_cpu(c)		({ uint16_t __r = le16_to_cpu((__force __le16)__raw_readw(c)); __r; })
> +#define readl_cpu(c)		({ uint32_t __r = le32_to_cpu((__force __le32)__raw_readl(c)); __r; })

Didn't we settle on the little-endian stuff to be dropped from here?
No matter what CPU endianness, what endianness a particular device
(and hence its MMIO region(s)) is using is entirely independent. Hence
conversion, where necessary, needs to occur at a layer up.

Also, what good do the __r variables do here? If they weren't here,
we also wouldn't need to discuss their naming.

> +#define writeb_cpu(v,c)		((void)__raw_writeb(v,c))
> +#define writew_cpu(v,c)		((void)__raw_writew((__force uint16_t)cpu_to_le16(v),c))
> +#define writel_cpu(v,c)		((void)__raw_writel((__force uint32_t)cpu_to_le32(v),c))

Nit: Blanks after commas please (also again further down).

> +#ifdef CONFIG_64BIT
> +#define readq_cpu(c)		({ u64 __r = le64_to_cpu((__force __le64)__raw_readq(c)); __r; })
> +#define writeq_cpu(v,c)		((void)__raw_writeq((__force u64)cpu_to_le64(v),c))

uint64_t (twice)

> +#endif
> +
> +/*
> + * I/O memory access primitives. Reads are ordered relative to any
> + * following Normal memory access. Writes are ordered relative to any prior
> + * Normal memory access.  The memory barriers here are necessary as RISC-V
> + * doesn't define any ordering between the memory space and the I/O space.
> + */
> +#define __io_br()	do {} while (0)

Nit: This and ...

> +#define __io_ar(v)	__asm__ __volatile__ ("fence i,r" : : : "memory");
> +#define __io_bw()	__asm__ __volatile__ ("fence w,o" : : : "memory");
> +#define __io_aw()	do { } while (0)

... this want to be spelled exactly the same.

Also, why does __io_ar() have a parameter (which it then doesn't use)?

Finally at least within a single file please be consistent about asm()
vs __asm__() use.

> +#define readb(c)	({ uint8_t  __v; __io_br(); __v = readb_cpu(c); __io_ar(__v); __v; })
> +#define readw(c)	({ uint16_t __v; __io_br(); __v = readw_cpu(c); __io_ar(__v); __v; })
> +#define readl(c)	({ uint32_t __v; __io_br(); __v = readl_cpu(c); __io_ar(__v); __v; })

Here the local variables are surely needed. Still they would preferably
not have any underscores as prefixes.

> +#define writeb(v,c)	({ __io_bw(); writeb_cpu(v,c); __io_aw(); })
> +#define writew(v,c)	({ __io_bw(); writew_cpu(v,c); __io_aw(); })
> +#define writel(v,c)	({ __io_bw(); writel_cpu(v,c); __io_aw(); })
> +
> +#ifdef CONFIG_64BIT
> +#define readq(c)	({ u64 __v; __io_br(); __v = readq_cpu(c); __io_ar(__v); __v; })

uint64_t again

> +#define writeq(v,c)	({ __io_bw(); writeq_cpu((v),(c)); __io_aw(); })

Inner parentheses still left?

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 14/30] xen/riscv: introduce atomic.h
  2024-02-05 15:32 ` [PATCH v4 14/30] xen/riscv: introduce atomic.h Oleksii Kurochko
@ 2024-02-13 11:36   ` Jan Beulich
  2024-02-14 12:11     ` Oleksii
  2024-02-18 19:22   ` Julien Grall
  1 sibling, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-13 11:36 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Bobby Eshleman, Alistair Francis, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> From: Bobby Eshleman <bobbyeshleman@gmail.com>
> 
> Additionally, this patch introduces macros in fence.h,
> which are utilized in atomic.h.

These are used in an earlier patch already, so either you want to
re-order the series, or you want to move that introduction ahead.

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/atomic.h
> @@ -0,0 +1,395 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Taken and modified from Linux.
> + *
> + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
> + * __*xchg_generic()
> + * 
> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> + * Copyright (C) 2012 Regents of the University of California
> + * Copyright (C) 2017 SiFive
> + * Copyright (C) 2021 Vates SAS
> + */
> +
> +#ifndef _ASM_RISCV_ATOMIC_H
> +#define _ASM_RISCV_ATOMIC_H
> +
> +#include <xen/atomic.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +void __bad_atomic_size(void);
> +
> +static always_inline void read_atomic_size(const volatile void *p,
> +                                           void *res,
> +                                           unsigned int size)
> +{
> +    switch ( size )
> +    {
> +    case 1: *(uint8_t *)res = readb(p); break;
> +    case 2: *(uint16_t *)res = readw(p); break;
> +    case 4: *(uint32_t *)res = readl(p); break;
> +    case 8: *(uint32_t *)res  = readq(p); break;

Why is it the MMIO primitives you use here, i.e. not read<X>_cpu()?
It's RAM you're accessing after all.

Also - no CONFIG_64BIT conditional here (like you have in the other
patch)?

> +    default: __bad_atomic_size(); break;
> +    }
> +}
> +
> +#define read_atomic(p) ({                               \
> +    union { typeof(*p) val; char c[0]; } x_;            \
> +    read_atomic_size(p, x_.c, sizeof(*p));              \

I'll be curious for how much longer gcc will tolerate this accessing
of a zero-length array, without issuing at least a warning. I'd
recommend using sizeof(*(p)) as the array dimension right away. (From
this not also the missing parentheses in what you have.)

> +    x_.val;                                             \
> +})
> +
> +#define write_atomic(p, x)                              \
> +({                                                      \
> +    typeof(*p) x__ = (x);                               \
> +    switch ( sizeof(*p) )                               \
> +    {                                                   \
> +    case 1: writeb((uint8_t)x__,  p); break;            \
> +    case 2: writew((uint16_t)x__, p); break;            \
> +    case 4: writel((uint32_t)x__, p); break;            \
> +    case 8: writeq((uint64_t)x__, p); break;            \

Are the casts actually necessary here?

> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +    x__;                                                \
> +})
> +
> +#define add_sized(p, x)                                 \
> +({                                                      \
> +    typeof(*(p)) x__ = (x);                             \
> +    switch ( sizeof(*(p)) )                             \
> +    {                                                   \
> +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> +    case 2: writew(read_atomic(p) + x__, p); break;     \
> +    case 4: writel(read_atomic(p) + x__, p); break;     \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +})
> +
> +/*
> + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> + *               non-scalar types unchanged.
> + *
> + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
> + * is not type-compatible with 'signed char', and we define a separate case.
> + */
> +#define __scalar_type_to_expr_cases(type)               \
> +    unsigned type:  (unsigned type)0,                   \
> +    signed type:    (signed type)0
> +
> +#define __unqual_scalar_typeof(x) typeof(               \
> +    _Generic((x),                                       \
> +        char:  (char)0,                                 \
> +        __scalar_type_to_expr_cases(char),              \
> +        __scalar_type_to_expr_cases(short),             \
> +        __scalar_type_to_expr_cases(int),               \
> +        __scalar_type_to_expr_cases(long),              \
> +        __scalar_type_to_expr_cases(long long),         \
> +        default: (x)))

This isn't RISC-V specific, is it? In which case it wants moving to,
perhaps, xen/macros.h (and then also have the leading underscores
dropped).

> +#define READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> +#define WRITE_ONCE(x, val)                                      \
> +    do {                                                        \
> +        *(volatile typeof(x) *)&(x) = (val);                    \
> +    } while (0)

In Xen we use ACCESS_ONCE(); any reason you need to introduce
{READ,WRITE}_ONCE() in addition? Without them, __unqual_scalar_typeof()
may then also not be needed (or, if there's a need to enhance it, may
then be needed for ACCESS_ONCE()). Which in turn raises the question
why only READ_ONCE() uses it here.

> +#define __atomic_acquire_fence() \
> +    __asm__ __volatile__( RISCV_ACQUIRE_BARRIER "" ::: "memory" )

Missing blank here and ...

> +#define __atomic_release_fence() \
> +    __asm__ __volatile__( RISCV_RELEASE_BARRIER "" ::: "memory" );

... here, and stray semicolon additionally just here.

> +static inline int atomic_read(const atomic_t *v)
> +{
> +    return READ_ONCE(v->counter);
> +}
> +
> +static inline int _atomic_read(atomic_t v)
> +{
> +    return v.counter;
> +}
> +
> +static inline void atomic_set(atomic_t *v, int i)
> +{
> +    WRITE_ONCE(v->counter, i);
> +}
> +
> +static inline void _atomic_set(atomic_t *v, int i)
> +{
> +    v->counter = i;
> +}
> +
> +static inline int atomic_sub_and_test(int i, atomic_t *v)
> +{
> +    return atomic_sub_return(i, v) == 0;
> +}
> +
> +static inline void atomic_inc(atomic_t *v)
> +{
> +    atomic_add(1, v);
> +}
> +
> +static inline int atomic_inc_return(atomic_t *v)
> +{
> +    return atomic_add_return(1, v);
> +}
> +
> +static inline void atomic_dec(atomic_t *v)
> +{
> +    atomic_sub(1, v);
> +}
> +
> +static inline int atomic_dec_return(atomic_t *v)
> +{
> +    return atomic_sub_return(1, v);
> +}
> +
> +static inline int atomic_dec_and_test(atomic_t *v)
> +{
> +    return atomic_sub_return(1, v) == 0;
> +}
> +
> +static inline int atomic_add_negative(int i, atomic_t *v)
> +{
> +    return atomic_add_return(i, v) < 0;
> +}
> +
> +static inline int atomic_inc_and_test(atomic_t *v)
> +{
> +    return atomic_add_return(1, v) == 0;
> +}

None of these look RISC-V-specific. Perhaps worth having something in
asm-generic/ that can be utilized here?

> +/*
> + * First, the atomic ops that have no ordering constraints and therefor don't
> + * have the AQ or RL bits set.  These don't return anything, so there's only
> + * one version to worry about.
> + */
> +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> +static inline                                               \
> +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                           \
> +    __asm__ __volatile__ (                                  \
> +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> +        : "+A" (v->counter)                                 \
> +        : "r" (I)                                           \
> +        : "memory" );                                       \
> +}                                                           \
> +
> +#define ATOMIC_OPS(op, asm_op, I)                           \
> +        ATOMIC_OP (op, asm_op, I, w, int,   )

So the last three parameters are to be ready to also support
atomic64, without actually doing so right now?

> +ATOMIC_OPS(add, add,  i)
> +ATOMIC_OPS(sub, add, -i)
> +ATOMIC_OPS(and, and,  i)
> +ATOMIC_OPS( or,  or,  i)
> +ATOMIC_OPS(xor, xor,  i)
> +
> +#undef ATOMIC_OP
> +#undef ATOMIC_OPS
> +
> +/*
> + * Atomic ops that have ordered, relaxed, acquire, and release variants.
> + * There's two flavors of these: the arithmatic ops have both fetch and return
> + * versions, while the logical ops only have fetch versions.
> + */

I'm somewhat confused by the comment: It first talks of 4 variants, but
then says there are only 2 (arithmetic) or 1 (logical) ones.

> +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
> +                         atomic##prefix##_t *v)                     \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}                                                                   \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}
> +
> +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
> +                          atomic##prefix##_t *v)                        \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
> +}                                                                       \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
> +}
> +
> +#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
> +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
> +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
> +
> +ATOMIC_OPS(add, add, +,  i)
> +ATOMIC_OPS(sub, add, +, -i)
> +
> +#define atomic_add_return_relaxed   atomic_add_return_relaxed
> +#define atomic_sub_return_relaxed   atomic_sub_return_relaxed
> +#define atomic_add_return   atomic_add_return
> +#define atomic_sub_return   atomic_sub_return
> +
> +#define atomic_fetch_add_relaxed    atomic_fetch_add_relaxed
> +#define atomic_fetch_sub_relaxed    atomic_fetch_sub_relaxed
> +#define atomic_fetch_add    atomic_fetch_add
> +#define atomic_fetch_sub    atomic_fetch_sub

What are all of these #define-s (any yet more further down) about?

> +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> +{
> +       int prev, rc;
> +
> +    __asm__ __volatile__ (
> +        "0: lr.w     %[p],  %[c]\n"
> +        "   sub      %[rc], %[p], %[o]\n"
> +        "   bltz     %[rc], 1f\n"
> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> +        "   bnez     %[rc], 0b\n"
> +        "   fence    rw, rw\n"
> +        "1:\n"
> +        : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> +        : [o]"r" (offset)

Nit: Blanks please between ] and ".

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/fence.h
> @@ -0,0 +1,8 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _ASM_RISCV_FENCE_H
> +#define _ASM_RISCV_FENCE_H
> +
> +#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
> +#define RISCV_RELEASE_BARRIER   "\tfence rw,  w\n"

Seeing that another "fence rw, rw" appears in this patch, I'm now pretty
sure you want to add e.g. RISCV_FULL_BARRIER here as well.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 22/30] xen/riscv: define an address of frame table
  2024-02-05 15:32 ` [PATCH v4 22/30] xen/riscv: define an address of frame table Oleksii Kurochko
@ 2024-02-13 13:07   ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-13 13:07 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Also, the patch adds some helpful macros that assist in avoiding
> the redefinition of memory layout for each MMU mode.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-05 15:32 ` [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
@ 2024-02-13 13:33   ` Jan Beulich
  2024-02-15 16:38     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-13 13:33 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/Kconfig
> +++ b/xen/arch/riscv/Kconfig
> @@ -45,6 +45,13 @@ config RISCV_ISA_C
>  
>  	  If unsure, say Y.
>  
> +config TOOLCHAIN_HAS_ZIHINTPAUSE
> +	bool
> +	default y

Shorter as "def_bool y".

> +	depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zihintpause)
> +	depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zihintpause)

So for a reason I cannot really see -mabi= is indeed required here,
or else the compiler sees an issue with the D extension. But enabling
both M and A shouldn't really be needed in this check, as being
unrelated?

> +	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23600

What's the linker dependency here? Depending on the answer I might further
ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_ or HAS_AS_.

That said, you may or may not be aware that personally I'm against
encoding such in Kconfig, and my repeated attempts to get the respective
discussion unstuck have not led anywhere. Therefore if you keep this, I'll
be in trouble whether to actually ack the change as a whole.

> --- a/xen/arch/riscv/include/asm/processor.h
> +++ b/xen/arch/riscv/include/asm/processor.h
> @@ -12,6 +12,9 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +/* TODO: need to be implemeted */
> +#define smp_processor_id() 0
> +
>  /* On stack VCPU state */
>  struct cpu_user_regs
>  {
> @@ -53,6 +56,26 @@ struct cpu_user_regs
>      unsigned long pregs;
>  };
>  
> +/* TODO: need to implement */
> +#define cpu_to_core(cpu)   (0)
> +#define cpu_to_socket(cpu) (0)
> +
> +static inline void cpu_relax(void)
> +{
> +#ifdef __riscv_zihintpause
> +    /*
> +     * Reduce instruction retirement.
> +     * This assumes the PC changes.
> +     */
> +    __asm__ __volatile__ ("pause");
> +#else
> +    /* Encoding of the pause instruction */
> +    __asm__ __volatile__ (".insn 0x100000F");
> +#endif

Like elsewhere, nit: Missing blanks immediately inside the parentheses.

> +    barrier();

It's probably okay to be separate, but I'd suggest folding this right
into the asm()-s.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-02-05 15:32 ` [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
@ 2024-02-13 14:19   ` Jan Beulich
  2024-02-16 11:03     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-13 14:19 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V4:
>  - update an argument name of PFN_ORDERN macros.
>  - drop pad at the end of 'struct page_info'.
>  - Change message -> subject in "Changes in V3"
>  - delete duplicated macros from riscv/mm.h
>  - fix identation in struct page_info

I'm sorry, but how does this fit ...

> +struct page_info
> +{
> +    /* Each frame can be threaded onto a doubly-linked list. */
> +    struct page_list_entry list;
> +
> +    /* Reference count and various PGC_xxx flags and fields. */
> +    unsigned long count_info;
> +
> +    /* Context-dependent fields follow... */
> +    union {
> +        /* Page is in use: ((count_info & PGC_count_mask) != 0). */
> +        struct {
> +            /* Type reference count and various PGT_xxx flags and fields. */
> +            unsigned long type_info;
> +        } inuse;
> +        /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
> +        union {
> +            struct {
> +                /*
> +                 * Index of the first *possibly* unscrubbed page in the buddy.
> +                 * One more bit than maximum possible order to accommodate
> +                 * INVALID_DIRTY_IDX.
> +                 */
> +#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
> +                unsigned long first_dirty:MAX_ORDER + 1;
> +
> +                /* Do TLBs need flushing for safety before next page use? */
> +                bool need_tlbflush:1;
> +
> +#define BUDDY_NOT_SCRUBBING    0
> +#define BUDDY_SCRUBBING        1
> +#define BUDDY_SCRUB_ABORT      2
> +                unsigned long scrub_state:2;
> +            };
> +
> +                unsigned long val;

... this?

> +        } free;
> +    } u;
> +
> +    union {
> +        /* Page is in use, but not as a shadow. */

I'm also pretty sure I asked before what shadow this comment alludes to.

> +        struct {
> +            /* Owner of this page (zero if page is anonymous). */
> +            struct domain *domain;

Seeing this is a pointer, I find "zero" in the comment a little
misleading. Better say NULL?

> +        } inuse;
> +
> +        /* Page is on a free list. */
> +        struct {
> +            /* Order-size of the free chunk this page is the head of. */
> +            unsigned int order;
> +        } free;
> +    } v;
> +
> +    union {
> +        /*
> +         * Timestamp from 'TLB clock', used to avoid extra safety flushes.
> +         * Only valid for: a) free pages, and b) pages with zero type count
> +         */
> +        uint32_t tlbflush_timestamp;
> +    };
> +};
> +
> +#define frame_table ((struct page_info *)FRAMETABLE_VIRT_START)
> +
> +/* PDX of the first page in the frame table. */
> +extern unsigned long frametable_base_pdx;
> +
> +/* Convert between machine frame numbers and page-info structures. */
> +#define mfn_to_page(mfn)                                            \
> +    (frame_table + (mfn_to_pdx(mfn) - frametable_base_pdx))
> +#define page_to_mfn(pg)                                             \
> +    pdx_to_mfn((unsigned long)((pg) - frame_table) + frametable_base_pdx)
> +
> +static inline void *page_to_virt(const struct page_info *pg)
> +{
> +    return mfn_to_virt(mfn_x(page_to_mfn(pg)));
> +}

Would be nice if this and the inverse function would live closer to
one another.

> +/*
> + * Common code requires get_page_type and put_page_type.
> + * We don't care about typecounts so we just do the minimum to make it
> + * happy.
> + */
> +static inline int get_page_type(struct page_info *page, unsigned long type)
> +{
> +    return 1;
> +}
> +
> +static inline void put_page_type(struct page_info *page)
> +{
> +}
> +
> +static inline void put_page_and_type(struct page_info *page)
> +{
> +    put_page_type(page);
> +    put_page(page);
> +}
> +
> +/*
> + * RISC-V does not have an M2P, but common code expects a handful of
> + * M2P-related defines and functions. Provide dummy versions of these.
> + */
> +#define INVALID_M2P_ENTRY        (~0UL)
> +#define SHARED_M2P_ENTRY         (~0UL - 1UL)
> +#define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
> +
> +#define set_gpfn_from_mfn(mfn, pfn) do { (void)(mfn), (void)(pfn); } while (0)
> +#define mfn_to_gfn(d, mfn) ((void)(d), _gfn(mfn_x(mfn)))
> +
> +#define PDX_GROUP_SHIFT (16 + 5)

Where are these magic numbers coming from? None of the other three
architectures use literal numbers here, thus making clear what
values are actually meant. If you can't use suitable constants,
please add a comment.

> +static inline unsigned long domain_get_maximum_gpfn(struct domain *d)
> +{
> +    BUG_ON("unimplemented");
> +    return 0;
> +}
> +
> +static inline long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    BUG_ON("unimplemented");
> +    return 0;
> +}
> +
> +/*
> + * On RISCV, all the RAM is currently direct mapped in Xen.
> + * Hence return always true.
> + */
> +static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
> +{
> +    return true;
> +}
> +
> +#define PG_shift(idx)   (BITS_PER_LONG - (idx))
> +#define PG_mask(x, idx) (x ## UL << PG_shift(idx))
> +
> +#define PGT_none          PG_mask(0, 1)  /* no special uses of this page   */
> +#define PGT_writable_page PG_mask(1, 1)  /* has writable mappings?         */
> +#define PGT_type_mask     PG_mask(1, 1)  /* Bits 31 or 63.                 */
> +
> + /* Count of uses of this frame as its current type. */

Imo the PGC_ related revision log item should have covered this one, too.

> +#define PGT_count_width   PG_shift(2)
> +#define PGT_count_mask    ((1UL << PGT_count_width) - 1)
> +
> +/*
> + * Page needs to be scrubbed. Since this bit can only be set on a page that is
> + * free (i.e. in PGC_state_free) we can reuse PGC_allocated bit.
> + */
> +#define _PGC_need_scrub   _PGC_allocated
> +#define PGC_need_scrub    PGC_allocated
> +
> +/* Cleared when the owning guest 'frees' this page. */
> +#define _PGC_allocated    PG_shift(1)
> +#define PGC_allocated     PG_mask(1, 1)
> +/* Page is Xen heap? */
> +#define _PGC_xen_heap     PG_shift(2)
> +#define PGC_xen_heap      PG_mask(1, 2)
> +/* Page is broken? */
> +#define _PGC_broken       PG_shift(7)
> +#define PGC_broken        PG_mask(1, 7)
> +/* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
> +#define PGC_state         PG_mask(3, 9)
> +#define PGC_state_inuse   PG_mask(0, 9)
> +#define PGC_state_offlining PG_mask(1, 9)
> +#define PGC_state_offlined PG_mask(2, 9)
> +#define PGC_state_free    PG_mask(3, 9)
> +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
> +
> +/* Count of references to this frame. */
> +#define PGC_count_width   PG_shift(9)
> +#define PGC_count_mask    ((1UL << PGC_count_width) - 1)
> +
> +#define _PGC_extra        PG_shift(10)
> +#define PGC_extra         PG_mask(1, 10)
> +
> +#define is_xen_heap_page(page) ((page)->count_info & PGC_xen_heap)
> +#define is_xen_heap_mfn(mfn) \
> +    (mfn_valid(mfn) && is_xen_heap_page(mfn_to_page(mfn)))
> +
> +#define is_xen_fixed_mfn(mfn)                                   \
> +    ((mfn_to_maddr(mfn) >= virt_to_maddr((vaddr_t)_start)) &&   \
> +     (mfn_to_maddr(mfn) <= virt_to_maddr((vaddr_t)_end - 1)))
> +
> +#define page_get_owner(_p)    (_p)->v.inuse.domain
> +#define page_set_owner(_p,_d) ((_p)->v.inuse.domain = (_d))

Unnecessary (leading) underscores again.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-05 15:32 ` [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
@ 2024-02-14  9:52   ` Jan Beulich
  2024-02-14 12:21     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-14  9:52 UTC (permalink / raw)
  To: Oleksii Kurochko
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
>  Changes in V4:
>   - Update version of GCC (12.2) and GNU Binutils (2.39) to the version
>     which are in Xen's contrainter for RISC-V
> ---
>  Changes in V3:
>   - new patch
> ---
>  README | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/README b/README
> index c8a108449e..9a898125e1 100644
> --- a/README
> +++ b/README
> @@ -48,6 +48,9 @@ provided by your OS distributor:
>        - For ARM 64-bit:
>          - GCC 5.1 or later
>          - GNU Binutils 2.24 or later
> +      - For RISC-V 64-bit:
> +        - GCC 12.2 or later
> +        - GNU Binutils 2.39 or later

And neither gcc 12.1 nor binutils 2.38 are good enough? Once again the
question likely wouldn't have needed raising if there was a non-empty
description ...

Also - Clang pretty certainly supports RISC-V, too. Any information on
a minimally required version there?

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 02/30] xen/riscv: use some asm-generic headers
  2024-02-12 15:03   ` Jan Beulich
@ 2024-02-14  9:54     ` Oleksii
  2024-02-14 10:03       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-14  9:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:03 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Some headers are the same as asm-generic verions of them
> > so use them instead of arch-specific headers.
> 
> Just to mention it (I'll commit this as is, unless asked to do
> otherwise):
> With this description I'd expect those "some headers" to be removed
> by
> this patch. Yet you're not talking about anything that exists;
> instead I
> think you mean "would end up the same". Yet that's precisely what
> asm-generic/ is for. Hence I would have said something along the
> lines of
> "don't need any customization".
Agree that "some headers" isn't the best one option to describe the
changes. Perhaps, it would be better to specify the headers which would
end up the same as asm-generic's version.
Thanks for such notes!

> 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > Acked-by: Jan Beulich <jbeulich@suse.com>
> > ---
> >  As [PATCH v6 0/9] Introduce generic headers
> >  (
> > https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com
> > /)
> >  is not stable, the list in asm/Makefile can be changed, but the
> > changes will
> >  be easy.
> 
> Or wait - doesn't this mean the change here can't be committed yet? I
> know the cover letter specifies dependencies, yet I think we need to
> come
> to a point where this large series won't need re-posting again and
> again.
We can't committed it now because asm-generic version of device.h,
which is not commited yet.

We can drop the change " generic-y += device.h ", and commit the
current one patch, but it sill will require to create a new patch for
using of asm-generic/device.h. Or as an option, I can merge "generic-y
+= device.h" into PATCH 29/30 xen/riscv: enable full Xen build.

I don't expect that the of asm-generic headers will changed in
riscv/include/asm/Makefile, but it looks to me that it is better to
wait until asm-generic/device.h will be in staging branch.


If you have better ideas, please share it with me.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V
  2024-02-12 15:05   ` Jan Beulich
@ 2024-02-14  9:57     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14  9:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On Mon, 2024-02-12 at 16:05 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > No specific header is needed to include in public/hvm/save.h for
> > PPC and RISC-V for now.
> > 
> > Code related to PPC was changed based on the comment:
> > https://lore.kernel.org/xen-devel/c2f3280e-2208-496b-a0b5-fda1a2076b3a@raptorengineering.com/
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> Albeit I don't see why ...
> 
> > --- a/xen/include/public/hvm/save.h
> > +++ b/xen/include/public/hvm/save.h
> > @@ -89,8 +89,8 @@ DECLARE_HVM_SAVE_TYPE(END, 0, struct
> > hvm_save_end);
> >  #include "../arch-x86/hvm/save.h"
> >  #elif defined(__arm__) || defined(__aarch64__)
> >  #include "../arch-arm/hvm/save.h"
> > -#elif defined(__powerpc64__)
> > -#include "../arch-ppc.h"
> > +#elif defined(__powerpc64__) || defined(__riscv)
> > +/* no specific header to include */
> >  #else
> 
> ... this isn't simply
> 
> #elif !defined(__powerpc64__) && !defined(__riscv)
I can change that to your option in the next patch version if the patch
won't be merged now.

~ Oleksii
> 
> >  #error "unsupported architecture"
> >  #endif
> 



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h
  2024-02-12 15:07   ` Jan Beulich
@ 2024-02-14 10:01     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14 10:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:07 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
Thanks!

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 02/30] xen/riscv: use some asm-generic headers
  2024-02-14  9:54     ` Oleksii
@ 2024-02-14 10:03       ` Jan Beulich
  2024-02-20 18:57         ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-14 10:03 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 14.02.2024 10:54, Oleksii wrote:
> On Mon, 2024-02-12 at 16:03 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>>  As [PATCH v6 0/9] Introduce generic headers
>>>  (
>>> https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com
>>> /)
>>>  is not stable, the list in asm/Makefile can be changed, but the
>>> changes will
>>>  be easy.
>>
>> Or wait - doesn't this mean the change here can't be committed yet? I
>> know the cover letter specifies dependencies, yet I think we need to
>> come
>> to a point where this large series won't need re-posting again and
>> again.
> We can't committed it now because asm-generic version of device.h,
> which is not commited yet.
> 
> We can drop the change " generic-y += device.h ", and commit the
> current one patch, but it sill will require to create a new patch for
> using of asm-generic/device.h. Or as an option, I can merge "generic-y
> += device.h" into PATCH 29/30 xen/riscv: enable full Xen build.
> 
> I don't expect that the of asm-generic headers will changed in
> riscv/include/asm/Makefile, but it looks to me that it is better to
> wait until asm-generic/device.h will be in staging branch.
> 
> 
> If you have better ideas, please share it with me.

My main point was that the interdependencies here have grown too far,
imo. The more that while having dependencies stated in the cover letter
is useful, while committing (and also reviewing) I for one would
typically only look at the individual patches.

For this patch alone, maybe it would be more obvious that said
dependency exists if it was last on the asm-generic series, rather
than part of the series here (which depends on that other series
anyway). That series now looks to be making some progress, and it being
a prereq for here it may be prudent to focus on getting that one in,
before re-posting here.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h
  2024-02-12 15:10   ` Jan Beulich
@ 2024-02-14 10:05     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14 10:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Roger Pau Monné, xen-devel

On Mon, 2024-02-12 at 16:10 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > asm/iommu.h shouldn't
> 
> ... need to ...
> 
> > be included when CONFIG_HAS_PASSTHROUGH
> > isn't enabled.
> > As <asm/iommu.h> is ifdef-ed by CONFIG_HAS_PASSTHROUGH it should
> > be also ifdef-ed field "struct arch_iommu arch" in struct
> > domain_iommu
> > as definition of arch_iommu is located in <asm/iommu.h>.
> > 
> > These amount of changes are enough to avoid generation of empty
> > asm/iommu.h for now.
> 
> I'm also inclined to insert "just" here, to make more obvious why
> e.g.
> ...
> 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > ---
> > Changes in V4:
> >  - Update the commit message.
> > ---
> > Changes in V3:
> >  - new patch.
> > ---
> >  xen/include/xen/iommu.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index a21f25df9f..7aa6a77209 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -337,7 +337,9 @@ extern int
> > iommu_add_extra_reserved_device_memory(unsigned long start,
> >  extern int iommu_get_extra_reserved_device_memory(iommu_grdm_t
> > *func,
> >                                                    void *ctxt);
> >  
> > +#ifdef CONFIG_HAS_PASSTHROUGH
> >  #include <asm/iommu.h>
> > +#endif
> >  
> >  #ifndef iommu_call
> >  # define iommu_call(ops, fn, args...) ((ops)->fn(args))
> > @@ -345,7 +347,9 @@ extern int
> > iommu_get_extra_reserved_device_memory(iommu_grdm_t *func,
> >  #endif
> >  
> >  struct domain_iommu {
> > +#ifdef CONFIG_HAS_PASSTHROUGH
> >      struct arch_iommu arch;
> > +#endif
> >  
> >      /* iommu_ops */
> >      const struct iommu_ops *platform_ops;
> 
> ... this is left visible despite quite likely being meaningless
> without
> HAS_PASSTHROUGH.
> 
> Then (happy to make the small edits while committing):
I'll be happy with that. Thanks.

> Acked-by: Jan Beulich <jbeulich@suse.com>
Thanks for Ack.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 09/30] xen/riscv: introduce bitops.h
  2024-02-12 15:58   ` Jan Beulich
@ 2024-02-14 11:06     ` Oleksii
  2024-02-14 11:20       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-14 11:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:58 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/bitops.h
> > @@ -0,0 +1,164 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/* Copyright (C) 2012 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_BITOPS_H
> > +#define _ASM_RISCV_BITOPS_H
> > +
> > +#include <asm/system.h>
> > +
> > +#include <asm-generic/bitops/bitops-bits.h>
> 
> Especially with ...
> 
> > +/* Based on linux/arch/include/linux/bits.h */
> > +
> > +#define BIT_MASK(nr)        (1UL << ((nr) % BITS_PER_LONG))
> > +#define BIT_WORD(nr)        ((nr) / BITS_PER_LONG)
> 
> ... these it's not entirely obvious why bitops-bits.h would be needed
> here.
They are needed as __test_and_op_bit_ord(), __op_bit_ord() macros are
used them, but probably it makes sense to drop BIT_MASK() and
BIT_WORD(), and just use BITOPS_MASK() and BITOPS_WORD() from asm-
generic/bitops-bits.h or re-define BITOPS_MASK() and BITOPS_WORD()
before inclusion of bitops-bits.h in the way as BIT_MASK and BIT_WORD
macros are defined now to be aligned with Linux.

> 
> > +#define __set_bit(n,p)      set_bit(n,p)
> > +#define __clear_bit(n,p)    clear_bit(n,p)
> 
> Nit (as before?): Missing blanks after commas.
Thanks. I'll add blanks.

> 
> > +/* Based on linux/arch/include/asm/bitops.h */
> > +
> > +#if ( BITS_PER_LONG == 64 )
> 
> Imo the parentheses here make things only harder to read.
I can drop them, this part was copied from Linux, so it was decided to
leave it as is.

> 
> > +#define __AMO(op)   "amo" #op ".d"
> > +#elif ( BITS_PER_LONG == 32 )
> > +#define __AMO(op)   "amo" #op ".w"
> > +#else
> > +#error "Unexpected BITS_PER_LONG"
> > +#endif
> > +
> > +#define __test_and_op_bit_ord(op, mod, nr, addr, ord)   \
> 
> The revision log says __test_and_* were renamed. Same anomaly for
> __test_and_op_bit() then.
I'll double check the namings. Thanks.
> 
> > +({                                                      \
> > +    unsigned long __res, __mask;                        \
> 
> Leftover leading underscores?
It is how it was defined in Linux, so I thought that I've to leave it
as it, but I am OK to rename this variables in next patch version.

> 
> > +    __mask = BIT_MASK(nr);                              \
> > +    __asm__ __volatile__ (                              \
> > +        __AMO(op) #ord " %0, %2, %1"                    \
> > +        : "=r" (__res), "+A" (addr[BIT_WORD(nr)])       \
> > +        : "r" (mod(__mask))                             \
> > +        : "memory");                                    \
> > +    ((__res & __mask) != 0);                            \
> > +})
> > +
> > +#define __op_bit_ord(op, mod, nr, addr, ord)    \
> > +    __asm__ __volatile__ (                      \
> > +        __AMO(op) #ord " zero, %1, %0"          \
> > +        : "+A" (addr[BIT_WORD(nr)])             \
> > +        : "r" (mod(BIT_MASK(nr)))               \
> > +        : "memory");
> > +
> > +#define __test_and_op_bit(op, mod, nr, addr)    \
> > +    __test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
> > +#define __op_bit(op, mod, nr, addr) \
> > +    __op_bit_ord(op, mod, nr, addr, )
> > +
> > +/* Bitmask modifiers */
> > +#define __NOP(x)    (x)
> > +#define __NOT(x)    (~(x))
> 
> Here the (double) leading underscores are truly worrying: Simple
> names like this aren't impossible to be assigned meaninb by a
> compiler.
I am not really understand what is the difference for a compiler
between NOP and __NOP? Do you mean that the leading double underscores
(__) are often used to indicate that these macros are implementation-
specific and might be reserved for the compiler or the standard
library?

> 
> > +/**
> > + * __test_and_set_bit - Set a bit and return its old value
> > + * @nr: Bit to set
> > + * @addr: Address to count from
> > + *
> > + * This operation may be reordered on other architectures than
> > x86.
> > + */
> > +static inline int test_and_set_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> 
> With BIT_WORD() / BIT_MASK() being long-based, is the use of uint32_t
> here actually correct?
No, it is not correct. It seems to me it would be better to use
BITOPS_WORD(), BITOPS_MASK() and bitops_uint_t() and just redefine them
before inclusion of bitops-bit.h to be aligned with Linux
implementation.

> 
> > +    return __test_and_op_bit(or, __NOP, nr, addr);
> > +}
> > +
> > +/**
> > + * __test_and_clear_bit - Clear a bit and return its old value
> > + * @nr: Bit to clear
> > + * @addr: Address to count from
> > + *
> > + * This operation can be reordered on other architectures other
> > than x86.
> 
> Nit: double "other" (and I think it's the 1st one that wants
> dropping,
> not - as the earlier comment suggests - the 2nd one). Question is:
> Are
> the comments correct? Both resolve to something which is (also) at
> least a compiler barrier. Same concern also applies further down, to
> at least set_bit() and clear_bit().
It looks like comments aren't correct as operation inside is atomic,
also it implies compiler memory barrier. So the comments related to
'reordering' should be dropped.
I am not sure that I know why in Linux these comments were left.

> 
> > + */
> > +static inline int test_and_clear_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    return __test_and_op_bit(and, __NOT, nr, addr);
> > +}
> > +
> > +/**
> > + * set_bit - Atomically set a bit in memory
> > + * @nr: the bit to set
> > + * @addr: the address to start counting from
> > + *
> > + * Note: there are no guarantees that this function will not be
> > reordered
> > + * on non x86 architectures, so if you are writing portable code,
> > + * make sure not to rely on its reordering guarantees.
> > + *
> > + * Note that @nr may be almost arbitrarily large; this function is
> > not
> > + * restricted to acting on a single-word quantity.
> > + */
> > +static inline void set_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    __op_bit(or, __NOP, nr, addr);
> > +}
> > +
> > +/**
> > + * clear_bit - Clears a bit in memory
> > + * @nr: Bit to clear
> > + * @addr: Address to start counting from
> > + *
> > + * Note: there are no guarantees that this function will not be
> > reordered
> > + * on non x86 architectures, so if you are writing portable code,
> > + * make sure not to rely on its reordering guarantees.
> > + */
> > +static inline void clear_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    __op_bit(and, __NOT, nr, addr);
> > +}
> > +
> > +/**
> > + * test_and_change_bit - Change a bit and return its old value
> 
> How come this one's different? I notice the comments are the same
> (and
> hence as confusing) in Linux; are you sure they're applicable there?
No, I am not sure. As I mentioned above, all this functions are atomic
and uses compiler memory barrier, so it looks like the comment for 
clear_bit isn't really correct.
In case if all such functions are uses __test_and_op_bit_ord() and
__op_bit_ord() which are atomic and with compiler barrier, it seems
that we can just drop all such comments or even all the comments. I am
not sure that anyone is needed the comment as by default this function
is safe to use:
 * This operation is atomic and cannot be reordered.
 * It also implies a memory barrier.

> 
> > + * @nr: Bit to change
> > + * @addr: Address to count from
> > + *
> > + * This operation is atomic and cannot be reordered.
> > + * It also implies a memory barrier.
> > + */
> > +static inline int test_and_change_bit(int nr, volatile unsigned
> > long *addr)
> > +{
> > +	return __test_and_op_bit(xor, __NOP, nr, addr);
> > +}
> > +
> > +#undef __test_and_op_bit
> > +#undef __op_bit
> > +#undef __NOP
> > +#undef __NOT
> > +#undef __AMO
> > +
> > +#include <asm-generic/bitops/generic-non-atomic.h>
> > +
> > +#define __test_and_set_bit generic___test_and_set_bit
> > +#define __test_and_clear_bit generic___test_and_clear_bit
> > +#define __test_and_change_bit generic___test_and_change_bit
> > +
> > +#include <asm-generic/bitops/fls.h>
> > +#include <asm-generic/bitops/flsl.h>
> > +#include <asm-generic/bitops/__ffs.h>
> > +#include <asm-generic/bitops/ffs.h>
> > +#include <asm-generic/bitops/ffsl.h>
> > +#include <asm-generic/bitops/ffz.h>
> > +#include <asm-generic/bitops/find-first-set-bit.h>
> > +#include <asm-generic/bitops/hweight.h>
> > +#include <asm-generic/bitops/test-bit.h>
> 
> To be honest there's too much stuff being added here to asm-generic/,
> all in one go. I'll see about commenting on the remaining parts here,
> but I'd like to ask that you seriously consider splitting.
Would it be better to send it outside of this patch series? I can
create a separate patch series with a separate patch for each asm-
generic/bitops/*.h

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 11/30] xen/riscv: introduce smp.h
  2024-02-12 15:13   ` Jan Beulich
@ 2024-02-14 11:06     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14 11:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:13 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
Thanks for Ack.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 09/30] xen/riscv: introduce bitops.h
  2024-02-14 11:06     ` Oleksii
@ 2024-02-14 11:20       ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-14 11:20 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 14.02.2024 12:06, Oleksii wrote:
> On Mon, 2024-02-12 at 16:58 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>> +({                                                      \
>>> +    unsigned long __res, __mask;                        \
>>
>> Leftover leading underscores?
> It is how it was defined in Linux, so I thought that I've to leave it
> as it, but I am OK to rename this variables in next patch version.

My view: If you retain Linux style, retaining such names is also (kind
of) okay. If you convert to Xen style, then name changes are to occur
as part of that conversion.

>>> +    __mask = BIT_MASK(nr);                              \
>>> +    __asm__ __volatile__ (                              \
>>> +        __AMO(op) #ord " %0, %2, %1"                    \
>>> +        : "=r" (__res), "+A" (addr[BIT_WORD(nr)])       \
>>> +        : "r" (mod(__mask))                             \
>>> +        : "memory");                                    \
>>> +    ((__res & __mask) != 0);                            \
>>> +})
>>> +
>>> +#define __op_bit_ord(op, mod, nr, addr, ord)    \
>>> +    __asm__ __volatile__ (                      \
>>> +        __AMO(op) #ord " zero, %1, %0"          \
>>> +        : "+A" (addr[BIT_WORD(nr)])             \
>>> +        : "r" (mod(BIT_MASK(nr)))               \
>>> +        : "memory");
>>> +
>>> +#define __test_and_op_bit(op, mod, nr, addr)    \
>>> +    __test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
>>> +#define __op_bit(op, mod, nr, addr) \
>>> +    __op_bit_ord(op, mod, nr, addr, )
>>> +
>>> +/* Bitmask modifiers */
>>> +#define __NOP(x)    (x)
>>> +#define __NOT(x)    (~(x))
>>
>> Here the (double) leading underscores are truly worrying: Simple
>> names like this aren't impossible to be assigned meaninb by a
>> compiler.
> I am not really understand what is the difference for a compiler
> between NOP and __NOP? Do you mean that the leading double underscores
> (__) are often used to indicate that these macros are implementation-
> specific and might be reserved for the compiler or the standard
> library?

It's not "often used". Identifiers starting with two underscores or an
underscore and a capital letter are reserved for the implementation
(i.e. for the compiler's internal use). When not overly generic we
stand a fair chance of getting away. But NOP and NOT are pretty generic.

>>> +#include <asm-generic/bitops/fls.h>
>>> +#include <asm-generic/bitops/flsl.h>
>>> +#include <asm-generic/bitops/__ffs.h>
>>> +#include <asm-generic/bitops/ffs.h>
>>> +#include <asm-generic/bitops/ffsl.h>
>>> +#include <asm-generic/bitops/ffz.h>
>>> +#include <asm-generic/bitops/find-first-set-bit.h>
>>> +#include <asm-generic/bitops/hweight.h>
>>> +#include <asm-generic/bitops/test-bit.h>
>>
>> To be honest there's too much stuff being added here to asm-generic/,
>> all in one go. I'll see about commenting on the remaining parts here,
>> but I'd like to ask that you seriously consider splitting.
> Would it be better to send it outside of this patch series? I can
> create a separate patch series with a separate patch for each asm-
> generic/bitops/*.h

Not sure. Depends in part on whether then you'd effectively introduce
dead code. If the introduction was such that RISC-V used the new ones
right away, then yes, that would quite likely be better.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 13/30] xen/riscv: introduce io.h
  2024-02-13 11:05   ` Jan Beulich
@ 2024-02-14 11:34     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14 11:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Tue, 2024-02-13 at 12:05 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > The header taken form Linux 6.4.0-rc1 and is based on
> > arch/riscv/include/asm/mmio.h.
> > 
> > Addionally, to the header was added definions of ioremap_*().
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > ---
> > Changes in V4:
> >  - delete inner parentheses in macros.
> >  - s/u<N>/uint<N>.
> > ---
> > Changes in V3:
> >  - re-sync with linux kernel
> >  - update the commit message
> > ---
> > Changes in V2:
> >  - Nothing changed. Only rebase.
> > ---
> >  xen/arch/riscv/include/asm/io.h | 142
> > ++++++++++++++++++++++++++++++++
> >  1 file changed, 142 insertions(+)
> >  create mode 100644 xen/arch/riscv/include/asm/io.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/io.h
> > b/xen/arch/riscv/include/asm/io.h
> > new file mode 100644
> > index 0000000000..1e61a40522
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/io.h
> > @@ -0,0 +1,142 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * {read,write}{b,w,l,q} based on arch/arm64/include/asm/io.h
> > + *   which was based on arch/arm/include/io.h
> > + *
> > + * Copyright (C) 1996-2000 Russell King
> > + * Copyright (C) 2012 ARM Ltd.
> > + * Copyright (C) 2014 Regents of the University of California
> > + */
> > +
> > +
> > +#ifndef _ASM_RISCV_IO_H
> > +#define _ASM_RISCV_IO_H
> > +
> > +#include <asm/byteorder.h>
> > +
> > +/*
> > + * The RISC-V ISA doesn't yet specify how to query or modify PMAs,
> > so we can't
> > + * change the properties of memory regions.  This should be fixed
> > by the
> > + * upcoming platform spec.
> > + */
> > +#define ioremap_nocache(addr, size) ioremap(addr, size)
> > +#define ioremap_wc(addr, size) ioremap(addr, size)
> > +#define ioremap_wt(addr, size) ioremap(addr, size)
> > +
> > +/* Generic IO read/write.  These perform native-endian accesses.
> > */
> > +#define __raw_writeb __raw_writeb
> 
> What use are this and the similar other #define-s?
I don't know specific reason for that, this file fully based on Linux's
arch/riscv/include/asm/mmio.h, so we can drop such defines.

> 
> > +static inline void __raw_writeb(uint8_t val, volatile void __iomem
> > *addr)
> > +{
> > +	asm volatile("sb %0, 0(%1)" : : "r" (val), "r" (addr));
> 
> Nit (throughout): Missing blanks. Or wait - is this file intended to
> be Linux style? If so, it's just one blank that's missing.
I started to update the code style, so I am OK to add missing blanks.
Thanks.
> 
> > +/*
> > + * Unordered I/O memory access primitives.  These are even more
> > relaxed than
> > + * the relaxed versions, as they don't even order accesses between
> > successive
> > + * operations to the I/O regions.
> > + */
> > +#define readb_cpu(c)		({ uint8_t  __r = __raw_readb(c);
> > __r; })
> > +#define readw_cpu(c)		({ uint16_t __r =
> > le16_to_cpu((__force __le16)__raw_readw(c)); __r; })
> > +#define readl_cpu(c)		({ uint32_t __r =
> > le32_to_cpu((__force __le32)__raw_readl(c)); __r; })
> 
> Didn't we settle on the little-endian stuff to be dropped from here?
> No matter what CPU endianness, what endianness a particular device
> (and hence its MMIO region(s)) is using is entirely independent.
> Hence
> conversion, where necessary, needs to occur at a layer up.
Yes, just missed to remove that.

> 
> Also, what good do the __r variables do here? If they weren't here,
> we also wouldn't need to discuss their naming.
I don't see to much sense in __r, so it could be dropped.

> 
> > +#define writeb_cpu(v,c)		((void)__raw_writeb(v,c))
> > +#define
> > writew_cpu(v,c)		((void)__raw_writew((__force uint16_t)cpu_to_le16(v),c))
> > +#define
> > writel_cpu(v,c)		((void)__raw_writel((__force uint32_t)cpu_to_le32(v),c))
> 
> Nit: Blanks after commas please (also again further down).
Thanks, I'll update that.

> 
> > +#ifdef CONFIG_64BIT
> > +#define readq_cpu(c)		({ u64 __r = le64_to_cpu((__force
> > __le64)__raw_readq(c)); __r; })
> > +#define
> > writeq_cpu(v,c)		((void)__raw_writeq((__force u64)cpu_to_le64(v),c))
> 
> uint64_t (twice)
> 
> > +#endif
> > +
> > +/*
> > + * I/O memory access primitives. Reads are ordered relative to any
> > + * following Normal memory access. Writes are ordered relative to
> > any prior
> > + * Normal memory access.  The memory barriers here are necessary
> > as RISC-V
> > + * doesn't define any ordering between the memory space and the
> > I/O space.
> > + */
> > +#define __io_br()	do {} while (0)
> 
> Nit: This and ...
> 
> > +#define __io_ar(v)	__asm__ __volatile__ ("fence i,r" : : :
> > "memory");
> > +#define __io_bw()	__asm__ __volatile__ ("fence w,o" : : :
> > "memory");
> > +#define __io_aw()	do { } while (0)
> 
> ... this want to be spelled exactly the same.
Oh, overlooked that. Thanks.

> 
> Also, why does __io_ar() have a parameter (which it then doesn't
> use)?
In case of Xen and RISC-V, it can be droped. in case of Linux,
__io_ar() is also defined, at least, for Arm where this parameter is
used, so I assume that intention was to have the same API for
__io_ar().

> 
> Finally at least within a single file please be consistent about
> asm()
> vs __asm__() use.
> 
> > +#define readb(c)	({ uint8_t  __v; __io_br(); __v =
> > readb_cpu(c); __io_ar(__v); __v; })
> > +#define readw(c)	({ uint16_t __v; __io_br(); __v =
> > readw_cpu(c); __io_ar(__v); __v; })
> > +#define readl(c)	({ uint32_t __v; __io_br(); __v =
> > readl_cpu(c); __io_ar(__v); __v; })
> 
> Here the local variables are surely needed. Still they would
> preferably
> not have any underscores as prefixes.
Thanks. This header was left untouched as it was mostly just copy of
Linux's mmio.h, but I'll update it according to Xen code style. Thanks.

> 
> > +#define writeb(v,c)	({ __io_bw(); writeb_cpu(v,c); __io_aw();
> > })
> > +#define writew(v,c)	({ __io_bw(); writew_cpu(v,c); __io_aw();
> > })
> > +#define writel(v,c)	({ __io_bw(); writel_cpu(v,c); __io_aw();
> > })
> > +
> > +#ifdef CONFIG_64BIT
> > +#define readq(c)	({ u64 __v; __io_br(); __v = readq_cpu(c);
> > __io_ar(__v); __v; })
> 
> uint64_t again
Thanks. I'll update that too.

> 
> > +#define writeq(v,c)	({ __io_bw(); writeq_cpu((v),(c));
> > __io_aw(); })
> 
> Inner parentheses still left?
Overlooked. Thanks, I'll update that in the next patch version.

~ Oleksii



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 14/30] xen/riscv: introduce atomic.h
  2024-02-13 11:36   ` Jan Beulich
@ 2024-02-14 12:11     ` Oleksii
  2024-02-14 13:09       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-14 12:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Bobby Eshleman, Alistair Francis, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Tue, 2024-02-13 at 12:36 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > From: Bobby Eshleman <bobbyeshleman@gmail.com>
> > 
> > Additionally, this patch introduces macros in fence.h,
> > which are utilized in atomic.h.
> 
> These are used in an earlier patch already, so either you want to
> re-order the series, or you want to move that introduction ahead.
> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/atomic.h
> > @@ -0,0 +1,395 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Taken and modified from Linux.
> > + *
> > + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
> > updated to use
> > + * __*xchg_generic()
> > + * 
> > + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> > + * Copyright (C) 2012 Regents of the University of California
> > + * Copyright (C) 2017 SiFive
> > + * Copyright (C) 2021 Vates SAS
> > + */
> > +
> > +#ifndef _ASM_RISCV_ATOMIC_H
> > +#define _ASM_RISCV_ATOMIC_H
> > +
> > +#include <xen/atomic.h>
> > +#include <asm/cmpxchg.h>
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +void __bad_atomic_size(void);
> > +
> > +static always_inline void read_atomic_size(const volatile void *p,
> > +                                           void *res,
> > +                                           unsigned int size)
> > +{
> > +    switch ( size )
> > +    {
> > +    case 1: *(uint8_t *)res = readb(p); break;
> > +    case 2: *(uint16_t *)res = readw(p); break;
> > +    case 4: *(uint32_t *)res = readl(p); break;
> > +    case 8: *(uint32_t *)res  = readq(p); break;
> 
> Why is it the MMIO primitives you use here, i.e. not read<X>_cpu()?
> It's RAM you're accessing after all.
Legacy from Linux kernel. For some reason they wanted to have ordered
read/write access.

> 
> Also - no CONFIG_64BIT conditional here (like you have in the other
> patch)?
Agree, it should be added.

> 
> > +    default: __bad_atomic_size(); break;
> > +    }
> > +}
> > +
> > +#define read_atomic(p) ({                               \
> > +    union { typeof(*p) val; char c[0]; } x_;            \
> > +    read_atomic_size(p, x_.c, sizeof(*p));              \
> 
> I'll be curious for how much longer gcc will tolerate this accessing
> of a zero-length array, without issuing at least a warning. I'd
> recommend using sizeof(*(p)) as the array dimension right away. (From
> this not also the missing parentheses in what you have.)
Thanks. I'll update that.

> 
> > +    x_.val;                                             \
> > +})
> > +
> > +#define write_atomic(p, x)                              \
> > +({                                                      \
> > +    typeof(*p) x__ = (x);                               \
> > +    switch ( sizeof(*p) )                               \
> > +    {                                                   \
> > +    case 1: writeb((uint8_t)x__,  p); break;            \
> > +    case 2: writew((uint16_t)x__, p); break;            \
> > +    case 4: writel((uint32_t)x__, p); break;            \
> > +    case 8: writeq((uint64_t)x__, p); break;            \
> 
> Are the casts actually necessary here?
Not really, we can drop them.

> 
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +    x__;                                                \
> > +})
> > +
> > +#define add_sized(p, x)                                 \
> > +({                                                      \
> > +    typeof(*(p)) x__ = (x);                             \
> > +    switch ( sizeof(*(p)) )                             \
> > +    {                                                   \
> > +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> > +    case 2: writew(read_atomic(p) + x__, p); break;     \
> > +    case 4: writel(read_atomic(p) + x__, p); break;     \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +})
> > +
> > +/*
> > + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar
> > type, leaving
> > + *               non-scalar types unchanged.
> > + *
> > + * Prefer C11 _Generic for better compile-times and simpler code.
> > Note: 'char'
> > + * is not type-compatible with 'signed char', and we define a
> > separate case.
> > + */
> > +#define __scalar_type_to_expr_cases(type)               \
> > +    unsigned type:  (unsigned type)0,                   \
> > +    signed type:    (signed type)0
> > +
> > +#define __unqual_scalar_typeof(x) typeof(               \
> > +    _Generic((x),                                       \
> > +        char:  (char)0,                                 \
> > +        __scalar_type_to_expr_cases(char),              \
> > +        __scalar_type_to_expr_cases(short),             \
> > +        __scalar_type_to_expr_cases(int),               \
> > +        __scalar_type_to_expr_cases(long),              \
> > +        __scalar_type_to_expr_cases(long long),         \
> > +        default: (x)))
> 
> This isn't RISC-V specific, is it? In which case it wants moving to,
> perhaps, xen/macros.h (and then also have the leading underscores
> dropped).
No, at all. But this thing is only used in RISC-V part, but if it would
be better to move it to xen/macros.h I will happy to sent separate
patch.

> 
> > +#define READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x)
> > *)&(x))
> > +#define WRITE_ONCE(x, val)                                      \
> > +    do {                                                        \
> > +        *(volatile typeof(x) *)&(x) = (val);                    \
> > +    } while (0)
> 
> In Xen we use ACCESS_ONCE(); any reason you need to introduce
> {READ,WRITE}_ONCE() in addition? Without them,
> __unqual_scalar_typeof()
> may then also not be needed (or, if there's a need to enhance it, may
> then be needed for ACCESS_ONCE()). Which in turn raises the question
> why only READ_ONCE() uses it here.
Hmm, READ_ONCE() and WRITE_ONCE() can be dropped then, I'll switch
everything in my code to ACCESS_ONCE().

> 
> > +#define __atomic_acquire_fence() \
> > +    __asm__ __volatile__( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
> 
> Missing blank here and ...
> 
> > +#define __atomic_release_fence() \
> > +    __asm__ __volatile__( RISCV_RELEASE_BARRIER "" ::: "memory" );
> 
> ... here, and stray semicolon additionally just here.
Thanks. I'll apply your comments to this part of code.

> 
> > +static inline int atomic_read(const atomic_t *v)
> > +{
> > +    return READ_ONCE(v->counter);
> > +}
> > +
> > +static inline int _atomic_read(atomic_t v)
> > +{
> > +    return v.counter;
> > +}
> > +
> > +static inline void atomic_set(atomic_t *v, int i)
> > +{
> > +    WRITE_ONCE(v->counter, i);
> > +}
> > +
> > +static inline void _atomic_set(atomic_t *v, int i)
> > +{
> > +    v->counter = i;
> > +}
> > +
> > +static inline int atomic_sub_and_test(int i, atomic_t *v)
> > +{
> > +    return atomic_sub_return(i, v) == 0;
> > +}
> > +
> > +static inline void atomic_inc(atomic_t *v)
> > +{
> > +    atomic_add(1, v);
> > +}
> > +
> > +static inline int atomic_inc_return(atomic_t *v)
> > +{
> > +    return atomic_add_return(1, v);
> > +}
> > +
> > +static inline void atomic_dec(atomic_t *v)
> > +{
> > +    atomic_sub(1, v);
> > +}
> > +
> > +static inline int atomic_dec_return(atomic_t *v)
> > +{
> > +    return atomic_sub_return(1, v);
> > +}
> > +
> > +static inline int atomic_dec_and_test(atomic_t *v)
> > +{
> > +    return atomic_sub_return(1, v) == 0;
> > +}
> > +
> > +static inline int atomic_add_negative(int i, atomic_t *v)
> > +{
> > +    return atomic_add_return(i, v) < 0;
> > +}
> > +
> > +static inline int atomic_inc_and_test(atomic_t *v)
> > +{
> > +    return atomic_add_return(1, v) == 0;
> > +}
> 
> None of these look RISC-V-specific. Perhaps worth having something in
> asm-generic/ that can be utilized here?
Looks like we can, at least, PPC has the similar definitions.

> 
> > +/*
> > + * First, the atomic ops that have no ordering constraints and
> > therefor don't
> > + * have the AQ or RL bits set.  These don't return anything, so
> > there's only
> > + * one version to worry about.
> > + */
> > +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> > +static inline                                               \
> > +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> > +{                                                           \
> > +    __asm__ __volatile__ (                                  \
> > +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> > +        : "+A" (v->counter)                                 \
> > +        : "r" (I)                                           \
> > +        : "memory" );                                       \
> > +}                                                           \
> > +
> > +#define ATOMIC_OPS(op, asm_op, I)                           \
> > +        ATOMIC_OP (op, asm_op, I, w, int,   )
> 
> So the last three parameters are to be ready to also support
> atomic64, without actually doing so right now?
Yes, it is ready to support.

> 
> > +ATOMIC_OPS(add, add,  i)
> > +ATOMIC_OPS(sub, add, -i)
> > +ATOMIC_OPS(and, and,  i)
> > +ATOMIC_OPS( or,  or,  i)
> > +ATOMIC_OPS(xor, xor,  i)
> > +
> > +#undef ATOMIC_OP
> > +#undef ATOMIC_OPS
> > +
> > +/*
> > + * Atomic ops that have ordered, relaxed, acquire, and release
> > variants.
> > + * There's two flavors of these: the arithmatic ops have both
> > fetch and return
> > + * versions, while the logical ops only have fetch versions.
> > + */
> 
> I'm somewhat confused by the comment: It first talks of 4 variants,
> but
> then says there are only 2 (arithmetic) or 1 (logical) ones.
Probably they mean that usually they have 4 variants, but it was
implemented only 2 (arith) and 1 (logical).

> 
> > +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type,
> > prefix)    \
> > +static
> > inline                                                       \
> > +c_type atomic##prefix##_fetch_##op##_relaxed(c_type
> > i,              \
> > +                         atomic##prefix##_t
> > *v)                     \
> > +{                                                                 
> >   \
> > +    register c_type
> > ret;                                            \
> > +    __asm__ __volatile__
> > (                                          \
> > +        "   amo" #asm_op "." #asm_type " %1, %2,
> > %0"                \
> > +        : "+A" (v->counter), "=r"
> > (ret)                             \
> > +        : "r"
> > (I)                                                   \
> > +        : "memory"
> > );                                               \
> > +    return
> > ret;                                                     \
> > +}                                                                 
> >   \
> > +static
> > inline                                                       \
> > +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t
> > *v) \
> > +{                                                                 
> >   \
> > +    register c_type
> > ret;                                            \
> > +    __asm__ __volatile__
> > (                                          \
> > +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2,
> > %0"          \
> > +        : "+A" (v->counter), "=r"
> > (ret)                             \
> > +        : "r"
> > (I)                                                   \
> > +        : "memory"
> > );                                               \
> > +    return
> > ret;                                                     \
> > +}
> > +
> > +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type,
> > prefix) \
> > +static
> > inline                                                           \
> > +c_type atomic##prefix##_##op##_return_relaxed(c_type
> > i,                 \
> > +                          atomic##prefix##_t
> > *v)                        \
> > +{                                                                 
> >       \
> > +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op
> > I;      \
> > +}                                                                 
> >       \
> > +static
> > inline                                                           \
> > +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t
> > *v)  \
> > +{                                                                 
> >       \
> > +        return atomic##prefix##_fetch_##op(i, v) c_op
> > I;                \
> > +}
> > +
> > +#define ATOMIC_OPS(op, asm_op, c_op,
> > I)                                 \
> > +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,  
> > )               \
> > +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
> > +
> > +ATOMIC_OPS(add, add, +,  i)
> > +ATOMIC_OPS(sub, add, +, -i)
> > +
> > +#define atomic_add_return_relaxed   atomic_add_return_relaxed
> > +#define atomic_sub_return_relaxed   atomic_sub_return_relaxed
> > +#define atomic_add_return   atomic_add_return
> > +#define atomic_sub_return   atomic_sub_return
> > +
> > +#define atomic_fetch_add_relaxed    atomic_fetch_add_relaxed
> > +#define atomic_fetch_sub_relaxed    atomic_fetch_sub_relaxed
> > +#define atomic_fetch_add    atomic_fetch_add
> > +#define atomic_fetch_sub    atomic_fetch_sub
> 
> What are all of these #define-s (any yet more further down) about?
> 
> > +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> > +{
> > +       int prev, rc;
> > +
> > +    __asm__ __volatile__ (
> > +        "0: lr.w     %[p],  %[c]\n"
> > +        "   sub      %[rc], %[p], %[o]\n"
> > +        "   bltz     %[rc], 1f\n"
> > +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> > +        "   bnez     %[rc], 0b\n"
> > +        "   fence    rw, rw\n"
> > +        "1:\n"
> > +        : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> > +        : [o]"r" (offset)
> 
> Nit: Blanks please between ] and ".
Thanks. I'll update that.

> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/fence.h
> > @@ -0,0 +1,8 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef _ASM_RISCV_FENCE_H
> > +#define _ASM_RISCV_FENCE_H
> > +
> > +#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
> > +#define RISCV_RELEASE_BARRIER   "\tfence rw,  w\n"
> 
> Seeing that another "fence rw, rw" appears in this patch, I'm now
> pretty
> sure you want to add e.g. RISCV_FULL_BARRIER here as well.
It makes sense. I'll do that. Thanks.
> 
> Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 16/30] xen/riscv: introduce p2m.h
  2024-02-12 15:16   ` Jan Beulich
@ 2024-02-14 12:12     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14 12:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:16 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> with two more nits:
> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/p2m.h
> > @@ -0,0 +1,102 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef __ASM_RISCV_P2M_H__
> > +#define __ASM_RISCV_P2M_H__
> > +
> > +#include <asm/page-bits.h>
> > +
> > +#define paddr_bits PADDR_BITS
> > +
> > +/*
> > + * List of possible type for each page in the p2m entry.
> > + * The number of available bit per page in the pte for this
> > purpose is 2 bits.
> > + * So it's possible to only have 4 fields. If we run out of value
> > in the
> > + * future, it's possible to use higher value for pseudo-type and
> > don't store
> > + * them in the p2m entry.
> > + */
> > +typedef enum {
> > +    p2m_invalid = 0,    /* Nothing mapped here */
> > +    p2m_ram_rw,         /* Normal read/write domain RAM */
> > +} p2m_type_t;
> > +
> > +#include <xen/p2m-common.h>
> > +
> > +static inline int get_page_and_type(struct page_info *page,
> > +                                    struct domain *domain,
> > +                                    unsigned long type)
> > +{
> > +    BUG_ON("unimplemented");
> > +    return -EINVAL;
> > +}
> > +
> > +/* Look up a GFN and take a reference count on the backing page.
> > */
> > +typedef unsigned int p2m_query_t;
> > +#define P2M_ALLOC    (1u<<0)   /* Populate PoD and paged-out
> > entries */
> > +#define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */
> > +
> > +static inline struct page_info *get_page_from_gfn(
> > +    struct domain *d, unsigned long gfn, p2m_type_t *t,
> > p2m_query_t q)
> > +{
> > +    BUG_ON("unimplemented");
> > +    return NULL;
> > +}
> > +
> > +static inline void memory_type_changed(struct domain *d)
> > +{
> > +    BUG_ON("unimplemented");
> > +}
> > +
> > +
> > +static inline int guest_physmap_mark_populate_on_demand(struct
> > domain *d, unsigned long gfn,
> 
> This line looks to be too long.
> 
> > +                                                        unsigned
> > int order)
> > +{
> > +    return -EOPNOTSUPP;
> > +}
> > +
> > +static inline int guest_physmap_add_entry(struct domain *d,
> > +                            gfn_t gfn,
> > +                            mfn_t mfn,
> > +                            unsigned long page_order,
> > +                            p2m_type_t t)
> 
> Indentation isn't quite right here.
> 
> I'll see about dealing with those while committing.
Thanks a lot.

~ Oleksii

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 18/30] xen/riscv: introduce time.h
  2024-02-12 15:18   ` Jan Beulich
@ 2024-02-14 12:14     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-14 12:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:18 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> Nevertheless ...
> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/time.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef __ASM_RISCV_TIME_H__
> > +#define __ASM_RISCV_TIME_H__
> > +
> > +#include <xen/bug.h>
> > +#include <asm/csr.h>
> > +
> > +struct vcpu;
> > +
> > +/* TODO: implement */
> > +static inline void force_update_vcpu_system_time(struct vcpu *v) {
> > BUG_ON("unimplemented"); }
> 
> ... nit: Too long line. The comment also doesn't look to serve any
> purpose
> anymore, with the BUG_ON() now taking uniform shape.
I'll drop the comment and move "BUG_ON(...)" to new line.


~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 19/30] xen/riscv: introduce event.h
  2024-02-12 15:20   ` Jan Beulich
@ 2024-02-14 12:16     ` Oleksii
  2024-02-14 13:11       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-14 12:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-12 at 16:20 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> again with a nit, though:
> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/event.h
> > @@ -0,0 +1,40 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef __ASM_RISCV_EVENT_H__
> > +#define __ASM_RISCV_EVENT_H__
> > +
> > +#include <xen/lib.h>
> > +
> > +void vcpu_mark_events_pending(struct vcpu *v);
> > +
> > +static inline int vcpu_event_delivery_is_enabled(struct vcpu *v)
> > +{
> > +    BUG_ON("unimplemented");
> > +    return 0;
> > +}
> > +
> > +static inline int local_events_need_delivery(void)
> > +{
> > +    BUG_ON("unimplemented");
> > +    return 0;
> > +}
> > +
> > +static inline void local_event_delivery_enable(void)
> > +{
> > +    BUG_ON("unimplemented");
> > +}
> > +
> > +/* No arch specific virq definition now. Default to global. */
> > +static inline bool arch_virq_is_global(unsigned int virq)
> > +{
> > +    return true;
> > +}
> > +
> > +#endif
> 
> This want to gain the usual comment.
Do you mean that commit messag should be updated?

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-14  9:52   ` Jan Beulich
@ 2024-02-14 12:21     ` Oleksii
  2024-02-14 13:06       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-14 12:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On Wed, 2024-02-14 at 10:52 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > ---
> >  Changes in V4:
> >   - Update version of GCC (12.2) and GNU Binutils (2.39) to the
> > version
> >     which are in Xen's contrainter for RISC-V
> > ---
> >  Changes in V3:
> >   - new patch
> > ---
> >  README | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/README b/README
> > index c8a108449e..9a898125e1 100644
> > --- a/README
> > +++ b/README
> > @@ -48,6 +48,9 @@ provided by your OS distributor:
> >        - For ARM 64-bit:
> >          - GCC 5.1 or later
> >          - GNU Binutils 2.24 or later
> > +      - For RISC-V 64-bit:
> > +        - GCC 12.2 or later
> > +        - GNU Binutils 2.39 or later
> 
> And neither gcc 12.1 nor binutils 2.38 are good enough? Once again
> the
> question likely wouldn't have needed raising if there was a non-empty
> description ...
I haven't verified gcc 12.1 and binutils 2.38. gcc 12.2 and binutils
2.39 were chosen because this veriosn is used in Xen contrainer for
RISC-V, on my system I have newer versions. So this is the minimal
versions which would be always tested and I can't be sure that the
lessser version will work fine, as there is not any compilation testing
for that.

> 
> Also - Clang pretty certainly supports RISC-V, too. Any information
> on
> a minimally required version there?
I haven't verified that. I am only testing gcc for now.
I can add this information to commit message.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64
  2024-02-14 12:21     ` Oleksii
@ 2024-02-14 13:06       ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-14 13:06 UTC (permalink / raw)
  To: Oleksii
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 14.02.2024 13:21, Oleksii wrote:
> On Wed, 2024-02-14 at 10:52 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>> ---
>>>  Changes in V4:
>>>   - Update version of GCC (12.2) and GNU Binutils (2.39) to the
>>> version
>>>     which are in Xen's contrainter for RISC-V
>>> ---
>>>  Changes in V3:
>>>   - new patch
>>> ---
>>>  README | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/README b/README
>>> index c8a108449e..9a898125e1 100644
>>> --- a/README
>>> +++ b/README
>>> @@ -48,6 +48,9 @@ provided by your OS distributor:
>>>        - For ARM 64-bit:
>>>          - GCC 5.1 or later
>>>          - GNU Binutils 2.24 or later
>>> +      - For RISC-V 64-bit:
>>> +        - GCC 12.2 or later
>>> +        - GNU Binutils 2.39 or later
>>
>> And neither gcc 12.1 nor binutils 2.38 are good enough? Once again
>> the
>> question likely wouldn't have needed raising if there was a non-empty
>> description ...
> I haven't verified gcc 12.1 and binutils 2.38. gcc 12.2 and binutils
> 2.39 were chosen because this veriosn is used in Xen contrainer for
> RISC-V, on my system I have newer versions. So this is the minimal
> versions which would be always tested and I can't be sure that the
> lessser version will work fine, as there is not any compilation testing
> for that.
> 
>>
>> Also - Clang pretty certainly supports RISC-V, too. Any information
>> on
>> a minimally required version there?
> I haven't verified that. I am only testing gcc for now.
> I can add this information to commit message.

Yes please. And if this isn't a firm lower bound, that fact imo wants
reflecting in README itself as well.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 14/30] xen/riscv: introduce atomic.h
  2024-02-14 12:11     ` Oleksii
@ 2024-02-14 13:09       ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-14 13:09 UTC (permalink / raw)
  To: Oleksii
  Cc: Bobby Eshleman, Alistair Francis, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 14.02.2024 13:11, Oleksii wrote:
> On Tue, 2024-02-13 at 12:36 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/atomic.h
>>> @@ -0,0 +1,395 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Taken and modified from Linux.
>>> + *
>>> + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
>>> updated to use
>>> + * __*xchg_generic()
>>> + * 
>>> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
>>> + * Copyright (C) 2012 Regents of the University of California
>>> + * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2021 Vates SAS
>>> + */
>>> +
>>> +#ifndef _ASM_RISCV_ATOMIC_H
>>> +#define _ASM_RISCV_ATOMIC_H
>>> +
>>> +#include <xen/atomic.h>
>>> +#include <asm/cmpxchg.h>
>>> +#include <asm/fence.h>
>>> +#include <asm/io.h>
>>> +#include <asm/system.h>
>>> +
>>> +void __bad_atomic_size(void);
>>> +
>>> +static always_inline void read_atomic_size(const volatile void *p,
>>> +                                           void *res,
>>> +                                           unsigned int size)
>>> +{
>>> +    switch ( size )
>>> +    {
>>> +    case 1: *(uint8_t *)res = readb(p); break;
>>> +    case 2: *(uint16_t *)res = readw(p); break;
>>> +    case 4: *(uint32_t *)res = readl(p); break;
>>> +    case 8: *(uint32_t *)res  = readq(p); break;
>>
>> Why is it the MMIO primitives you use here, i.e. not read<X>_cpu()?
>> It's RAM you're accessing after all.
> Legacy from Linux kernel. For some reason they wanted to have ordered
> read/write access.

Wants expressing in a comment then, or at the very least in the patch
description.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 19/30] xen/riscv: introduce event.h
  2024-02-14 12:16     ` Oleksii
@ 2024-02-14 13:11       ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-14 13:11 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 14.02.2024 13:16, Oleksii wrote:
> On Mon, 2024-02-12 at 16:20 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>> again with a nit, though:
>>
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/event.h
>>> @@ -0,0 +1,40 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +#ifndef __ASM_RISCV_EVENT_H__
>>> +#define __ASM_RISCV_EVENT_H__
>>> +
>>> +#include <xen/lib.h>
>>> +
>>> +void vcpu_mark_events_pending(struct vcpu *v);
>>> +
>>> +static inline int vcpu_event_delivery_is_enabled(struct vcpu *v)
>>> +{
>>> +    BUG_ON("unimplemented");
>>> +    return 0;
>>> +}
>>> +
>>> +static inline int local_events_need_delivery(void)
>>> +{
>>> +    BUG_ON("unimplemented");
>>> +    return 0;
>>> +}
>>> +
>>> +static inline void local_event_delivery_enable(void)
>>> +{
>>> +    BUG_ON("unimplemented");
>>> +}
>>> +
>>> +/* No arch specific virq definition now. Default to global. */
>>> +static inline bool arch_virq_is_global(unsigned int virq)
>>> +{
>>> +    return true;
>>> +}
>>> +
>>> +#endif
>>
>> This want to gain the usual comment.
> Do you mean that commit messag should be updated?

No, I indeed mean "comment". Just go look what I committed yesterday.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-13 10:37   ` Jan Beulich
@ 2024-02-15 13:41     ` Oleksii
  2024-02-19 11:22       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-15 13:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

> > +        : "=r" (ret), "+A" (*ptr) \
> > +        : "r" (new) \
> > +        : "memory" ); \
> > +})
> > +
> > +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
> > acquire_barrier) \
> > +({ \
> > +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned
> > long)ptr, 4); \
> 
> You now appear to assume that this macro is only used with inputs not
> crossing word boundaries. That's okay as long as suitably guaranteed
> at the use sites, but imo wants saying in a comment.
> 
> > +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr)))
> > * BITS_PER_BYTE; \
> 
> Why 0x8 (i.e. spanning 64 bits), not 4 (matching the uint32_t use
> above)?
The idea to read 8 bytes was to deal with crossing word boundary. So if
our address is 0x3 and we have to xchg() 2 bytes, what will cross 4
byte boundary. Instead we align add 0x3, so it will become 0x0 and then
just always work with 8 bytes.

> 
> > +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> > +    unsigned long ret_; \
> > +    unsigned long rc; \
> 
> Similarly, why unsigned long here?
sizeof(unsigned long) is 8 bytes and it was chosen as we are working
with lc/sc.d which are working with 8 bytes.

> 
> I also wonder about the mix of underscore suffixed (or not) variable
> names here.
If the question about ret_, then the same as before size of ret
argument of the macros will be 1 or 2, but {lc/sc}.d expected to work
with 8 bytes.

> 
> > +        release_barrier \
> > +        "0: lr.d %0, %2\n" \
> 
> Even here it's an 8-byte access. Even if - didn't check - the insn
> was
> okay to use with just a 4-byte aligned pointer, wouldn't it make
> sense
> then to 8-byte align it, and be consistent throughout this macro wrt
> the base unit acted upon? Alternatively, why not use lr.w here, thus
> reducing possible collisions between multiple CPUs accessing the same
> cache line?
According to the docs:
LR and SC operate on naturally-aligned 64-bit (RV64 only) or 32-bit
words in memory. Misaligned
addresses will generate misaligned address exceptions.

My intention was to deal with 4-byte crossing boundary. so if ptr is 4-
byte aligned then by reading 8-bytes we shouldn't care about boundary
crossing, if I am not missing something.

But your opinion about reduction of collisions makes sense also...

> 
> > +        "   and  %1, %0, %z4\n" \
> > +        "   or   %1, %1, %z3\n" \
> > +        "   sc.d %1, %1, %2\n" \
> > +        "   bnez %1, 0b\n" \
> > +        acquire_barrier \
> > +        : "=&r" (ret_), "=&r" (rc), "+A" (*ptr_32b_aligned) \
> > +        : "rJ" (new_), "rJ" (~mask) \
> 
> I think that as soon as there are more than 2 or maybe 3 operands,
> legibility is vastly improved by using named asm() operands.
Just to clarify you mean that it would be better to use instead of %0
use names?

> 
> > +        : "memory"); \
> 
> Nit: Missing blank before closing parenthesis.
> 
> > +    \
> > +    ret = (__typeof__(*(ptr)))((ret_ & mask) >> mask_l); \
> > +})
> 
> Why does "ret" need to be a macro argument? If you had only the
> expression here, not the the assigment, ...
> 
> > +#define __xchg_generic(ptr, new, size, sfx, release_barrier,
> > acquire_barrier) \
> > +({ \
> > +    __typeof__(ptr) ptr__ = (ptr); \
> 
> Is this local variable really needed? Can't you use "ptr" directly
> in the three macro invocations?
> 
> > +    __typeof__(*(ptr)) new__ = (new); \
> > +    __typeof__(*(ptr)) ret__; \
> > +    switch (size) \
> > +    { \
> > +    case 1: \
> > +    case 2: \
> > +        emulate_xchg_1_2(ptr__, new__, ret__, release_barrier,
> > acquire_barrier); \
> 
> ... this would become
> 
>         ret__ = emulate_xchg_1_2(ptr__, new__, release_barrier,
> acquire_barrier); \
> 
> But, unlike assumed above, there's no enforcement here that a 2-byte
> quantity won't cross a word, double-word, cache line, or even page
> boundary. That might be okay if then the code would simply crash
> (like
> the AMO insns emitted further down would), but aiui silent
> misbehavior
> would result.
As I mentioned above with 4-byte alignment and then reading and working
with 8-byte then crossing a word or double-word boundary shouldn't be
an issue.

I am not sure that I know how to check that we are crossing cache line
boundary.

Regarding page boundary, if the next page is mapped then all should
work fine, otherwise it will be an exception.

> 
> Also nit: The switch() higher up is (still/again) missing blanks.
> 
> > +        break; \
> > +    case 4: \
> > +        __amoswap_generic(ptr__, new__, ret__,\
> > +                          ".w" sfx,  release_barrier,
> > acquire_barrier); \
> > +        break; \
> > +    case 8: \
> > +        __amoswap_generic(ptr__, new__, ret__,\
> > +                          ".d" sfx,  release_barrier,
> > acquire_barrier); \
> > +        break; \
> > +    default: \
> > +        STATIC_ASSERT_UNREACHABLE(); \
> > +    } \
> > +    ret__; \
> > +})
> > +
> > +#define xchg_relaxed(ptr, x) \
> > +({ \
> > +    __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),
> > "", "", ""); \
> > +})
> > +
> > +#define xchg_acquire(ptr, x) \
> > +({ \
> > +    __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
> > +                                       "", "",
> > RISCV_ACQUIRE_BARRIER); \
> > +})
> > +
> > +#define xchg_release(ptr, x) \
> > +({ \
> > +    __typeof__(*(ptr)) x_ = (x); \
> > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
> > +                                       "", RISCV_RELEASE_BARRIER,
> > ""); \
> > +})
> > +
> > +#define xchg(ptr,x) \
> > +({ \
> > +    __typeof__(*(ptr)) ret__; \
> > +    ret__ = (__typeof__(*(ptr))) \
> > +            __xchg_generic(ptr, (unsigned long)(x),
> > sizeof(*(ptr)), \
> > +                           ".aqrl", "", ""); \
> 
> The .aqrl doesn't look to affect the (emulated) 1- and 2-byte cases.
> 
> Further, amoswap also exists in release-only and acquire-only forms.
> Why do you prefer explicit barrier insns over those? (Looks to
> similarly apply to the emulation path as well as to the cmpxchg
> machinery then, as both lr and sc also come in all four possible
> acquire/release forms. Perhaps for the emulation path using
> explicit barriers is better, in case the acquire/release forms of
> lr/sc - being used inside the loop - might perform worse.)
As 1- and 2-byte cases are emulated I decided that is not to provide
sfx argument for emulation macros as it will not have to much affect on
emulated types and just consume more performance on acquire and release
version of sc/ld instructions.


> > 
> 
> No RISCV_..._BARRIER for use here and ...
> 
> > +    ret__; \
> > +})
> > +
> > +#define __cmpxchg(ptr, o, n, s) \
> > +({ \
> > +    __typeof__(*(ptr)) ret__; \
> > +    ret__ = (__typeof__(*(ptr))) \
> > +            __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned
> > long)(n), \
> > +                              s, ".rl", "", " fence rw, rw\n"); \
> 
> ... here? And anyway, wouldn't it make sense to have
> 
> #define cmpxchg(ptr, o, n) __cmpxchg(ptr, o, n, sizeof(*(ptr))
> 
> to limit redundancy?
> 
> Plus wouldn't
> 
> #define __cmpxchg(ptr, o, n, s) \
>     ((__typeof__(*(ptr))) \
>      __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
>                        s, ".rl", "", " fence rw, rw\n"))
> 
> be shorter and thus easier to follow as well? As I notice only now,
> this would apparently apply further up as well.
I understand your point about "#define cmpxchg(ptr, o, n) __cmpxchg(",
but I can't undestand how the definition of __cmxchng should be done
shorter. Could you please clarify that?

~ Oleksii



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-13 13:33   ` Jan Beulich
@ 2024-02-15 16:38     ` Oleksii
  2024-02-15 16:43       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-15 16:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On Tue, 2024-02-13 at 14:33 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > --- a/xen/arch/riscv/Kconfig
> > +++ b/xen/arch/riscv/Kconfig
> > @@ -45,6 +45,13 @@ config RISCV_ISA_C
> >  
> >  	  If unsure, say Y.
> >  
> > +config TOOLCHAIN_HAS_ZIHINTPAUSE
> > +	bool
> > +	default y
> 
> Shorter as "def_bool y".
> 
> > +	depends on !64BIT || $(cc-option,-mabi=lp64 -
> > march=rv64ima_zihintpause)
> > +	depends on !32BIT || $(cc-option,-mabi=ilp32 -
> > march=rv32ima_zihintpause)
> 
> So for a reason I cannot really see -mabi= is indeed required here,
> or else the compiler sees an issue with the D extension. But enabling
> both M and A shouldn't really be needed in this check, as being
> unrelated?
Agree, that M and A could be dropped.

Regarding -mabi my guess is because D extension can be emulated by
compiler, doesn't matter if D is set in -march.  If it is set then
hardware instruction will be used, otherwise emulated instruction will
be used.
And if D extenstion is always present it is need to know which ABI
should be used. If D extenstion has h/w support then -mabi should be
also update to lp64d instead of lp64.

> 
> > +	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23600
> 
> What's the linker dependency here? Depending on the answer I might
> further
> ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_ or
> HAS_AS_.
I missed to introduce {L}LLD_VERSION config. It should output from the
command:
  riscv64-linux-gnu-ld --version
> 
> That said, you may or may not be aware that personally I'm against
> encoding such in Kconfig, and my repeated attempts to get the
> respective
> discussion unstuck have not led anywhere. Therefore if you keep this,
> I'll
> be in trouble whether to actually ack the change as a whole.
Could I ask what is wrong with introduction of such things on KConfig?

Would it be better to put everything in riscv/arch.mk?

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-15 16:38     ` Oleksii
@ 2024-02-15 16:43       ` Jan Beulich
  2024-02-16 11:16         ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-15 16:43 UTC (permalink / raw)
  To: Oleksii
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On 15.02.2024 17:38, Oleksii wrote:
> On Tue, 2024-02-13 at 14:33 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>> +	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23600
>>
>> What's the linker dependency here? Depending on the answer I might
>> further
>> ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_ or
>> HAS_AS_.
> I missed to introduce {L}LLD_VERSION config. It should output from the
> command:
>   riscv64-linux-gnu-ld --version

Doesn't answer my question though where the linker version matters
here.

>> That said, you may or may not be aware that personally I'm against
>> encoding such in Kconfig, and my repeated attempts to get the
>> respective
>> discussion unstuck have not led anywhere. Therefore if you keep this,
>> I'll
>> be in trouble whether to actually ack the change as a whole.
> Could I ask what is wrong with introduction of such things on KConfig?

Just one of several possible pointers:
https://lists.xen.org/archives/html/xen-devel/2022-09/msg01793.html

> Would it be better to put everything in riscv/arch.mk?

Or a mix of both, as per the proposal. Just to be clear, if I say "yes"
to your question, someone else may come along and tell you to turn
around again.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-02-13 14:19   ` Jan Beulich
@ 2024-02-16 11:03     ` Oleksii
  2024-02-19  8:07       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-16 11:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

> 
> > +        } free;
> > +    } u;
> > +
> > +    union {
> > +        /* Page is in use, but not as a shadow. */
> 
> I'm also pretty sure I asked before what shadow this comment alludes
> to.
I missed your request about 'shadow' before.

The comment arrived from Arm.

I tried to find out the answer by investigation how 'inuse' is used,
and, unfortunately, I couldn't find an answer what 'shadow' alludes to.

> 
> > +/*
> > + * Common code requires get_page_type and put_page_type.
> > + * We don't care about typecounts so we just do the minimum to
> > make it
> > + * happy.
> > + */
> > +static inline int get_page_type(struct page_info *page, unsigned
> > long type)
> > +{
> > +    return 1;
> > +}
> > +
> > +static inline void put_page_type(struct page_info *page)
> > +{
> > +}
> > +
> > +static inline void put_page_and_type(struct page_info *page)
> > +{
> > +    put_page_type(page);
> > +    put_page(page);
> > +}
> > +
> > +/*
> > + * RISC-V does not have an M2P, but common code expects a handful
> > of
> > + * M2P-related defines and functions. Provide dummy versions of
> > these.
> > + */
> > +#define INVALID_M2P_ENTRY        (~0UL)
> > +#define SHARED_M2P_ENTRY         (~0UL - 1UL)
> > +#define SHARED_M2P(_e)           ((_e) == SHARED_M2P_ENTRY)
> > +
> > +#define set_gpfn_from_mfn(mfn, pfn) do { (void)(mfn), (void)(pfn);
> > } while (0)
> > +#define mfn_to_gfn(d, mfn) ((void)(d), _gfn(mfn_x(mfn)))
> > +
> > +#define PDX_GROUP_SHIFT (16 + 5)
> 
> Where are these magic numbers coming from? None of the other three
> architectures use literal numbers here, thus making clear what
> values are actually meant. If you can't use suitable constants,
> please add a comment.
This numbers are incorrect for RISC-V, it should be 12 + 9 ( PAGE_SHIFT
+ VPN_BITS ).
I did some comparision of how some macros are defined in PPC and missed
to update that.

~ Oleksii



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-15 16:43       ` Jan Beulich
@ 2024-02-16 11:16         ` Oleksii
  2024-02-19  8:06           ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-16 11:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On Thu, 2024-02-15 at 17:43 +0100, Jan Beulich wrote:
> On 15.02.2024 17:38, Oleksii wrote:
> > On Tue, 2024-02-13 at 14:33 +0100, Jan Beulich wrote:
> > > On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > > > +	depends on LLD_VERSION >= 150000 || LD_VERSION >=
> > > > 23600
> > > 
> > > What's the linker dependency here? Depending on the answer I
> > > might
> > > further
> > > ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_ or
> > > HAS_AS_.
> > I missed to introduce {L}LLD_VERSION config. It should output from
> > the
> > command:
> >   riscv64-linux-gnu-ld --version
> 
> Doesn't answer my question though where the linker version matters
> here.
Then I misinterpreted your initial question.
Could you please provide further clarification or rephrase it for
better understanding?

Probably, your question was about why linker dependency is needed here,
then
it is not sufficient to check if a toolchain supports a particular  
extension without checking if the linker supports that extension   
too.
For example, Clang 15 supports Zihintpause but GNU bintutils
2.35.2 does not, leading build errors like so:
    
   riscv64-linux-gnu-ld: -march=rv64i_zihintpause2p0: Invalid or
   unknown z ISA extension: 'zihintpause'


~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 16/30] xen/riscv: introduce p2m.h
  2024-02-05 15:32 ` [PATCH v4 16/30] xen/riscv: introduce p2m.h Oleksii Kurochko
  2024-02-12 15:16   ` Jan Beulich
@ 2024-02-18 18:18   ` Julien Grall
  1 sibling, 0 replies; 107+ messages in thread
From: Julien Grall @ 2024-02-18 18:18 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi Oleksii,

On 05/02/2024 15:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V4:
>   - update the comment above p2m_type_t. RISC-V has only 2 free for use bits in PTE, not 4 as Arm.
>   - update the comment after p2m_ram_rw: s/guest/domain/ as this also applies for dom0.
>   - return INVALID_MFN in gfn_to_mfn() instead of mfn(0).
>   - drop PPC changes.
> ---
> Changes in V3:
>   - add SPDX
>   - drop unneeded for now p2m types.
>   - return false in all functions implemented with BUG() inside.
>   - update the commit message
> ---
> Changes in V2:
>   - Nothing changed. Only rebase.
> ---
>   xen/arch/riscv/include/asm/p2m.h | 102 +++++++++++++++++++++++++++++++
>   1 file changed, 102 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/p2m.h
> 
> diff --git a/xen/arch/riscv/include/asm/p2m.h b/xen/arch/riscv/include/asm/p2m.h
> new file mode 100644
> index 0000000000..8ad020974f
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/p2m.h
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef __ASM_RISCV_P2M_H__
> +#define __ASM_RISCV_P2M_H__
> +
> +#include <asm/page-bits.h>
> +
> +#define paddr_bits PADDR_BITS
> +
> +/*
> + * List of possible type for each page in the p2m entry.
> + * The number of available bit per page in the pte for this purpose is 2 bits.

That's not a lot and I expect you will ran out fairly quickly if you 
decide to store whether...

> + * So it's possible to only have 4 fields. If we run out of value in the
> + * future, it's possible to use higher value for pseudo-type and don't store
> + * them in the p2m entry.
> + */
> +typedef enum {
> +    p2m_invalid = 0,    /* Nothing mapped here */
> +    p2m_ram_rw,         /* Normal read/write domain RAM */

... the RAM is Read-Write. Depend on your P2M implementation, you could 
rely on the HW page-attributes to augment you p2m_type. So effectively, 
your two bits would contain information you can't already store.

Anyway, your approach is ok as your aim is to only build Xen for now. 
BUt this likely want to be re-think once you add the P2M support.

> +} p2m_type_t;
> +
> +#include <xen/p2m-common.h>
> +
> +static inline int get_page_and_type(struct page_info *page,
> +                                    struct domain *domain,
> +                                    unsigned long type)
> +{
> +    BUG_ON("unimplemented");
> +    return -EINVAL;
> +}
> +
> +/* Look up a GFN and take a reference count on the backing page. */
> +typedef unsigned int p2m_query_t;
> +#define P2M_ALLOC    (1u<<0)   /* Populate PoD and paged-out entries */
> +#define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */

Coding style: I understansd this is what Arm did, but the style is not 
correct. Please add a space before and after <<.

> +
> +static inline struct page_info *get_page_from_gfn(
> +    struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
> +{
> +    BUG_ON("unimplemented");
> +    return NULL;
> +}
> +
> +static inline void memory_type_changed(struct domain *d)
> +{
> +    BUG_ON("unimplemented");
> +}
> +
> +
> +static inline int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
> +                                                        unsigned int order)
> +{
> +    return -EOPNOTSUPP;
> +}
> +
> +static inline int guest_physmap_add_entry(struct domain *d,
> +                            gfn_t gfn,
> +                            mfn_t mfn,
> +                            unsigned long page_order,
> +                            p2m_type_t t)
> +{
> +    BUG_ON("unimplemented");
> +    return -EINVAL;
> +}
> +
> +/* Untyped version for RAM only, for compatibility */
> +static inline int __must_check
> +guest_physmap_add_page(struct domain *d, gfn_t gfn, mfn_t mfn,
> +                       unsigned int page_order)
> +{
> +    return guest_physmap_add_entry(d, gfn, mfn, page_order, p2m_ram_rw);
> +}
> +
> +static inline mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
> +{
> +    BUG_ON("unimplemented");
> +    return INVALID_MFN;
> +}
> +
> +static inline bool arch_acquire_resource_check(struct domain *d)
> +{
> +    /*
> +     * The reference counting of foreign entries in set_foreign_p2m_entry()
> +     * is supported on RISCV.
> +     */
> +    return true;

AFAICT, the current implementation of set_foreign_p2m_entry() is a 
BUG_ON(). So I think it would make sense to return 'false' as this 
reflects better the current state.

> +}
> +
> +static inline void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
> +{
> +    /* Not supported on RISCV. */
> +}
> +
> +#endif /* __ASM_RISCV_P2M_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 17/30] xen/riscv: introduce regs.h
  2024-02-05 15:32 ` [PATCH v4 17/30] xen/riscv: introduce regs.h Oleksii Kurochko
@ 2024-02-18 18:22   ` Julien Grall
  2024-02-19 14:40     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Julien Grall @ 2024-02-18 18:22 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi,

On 05/02/2024 15:32, Oleksii Kurochko wrote:
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ------
> Changes in V4:
>   - add Acked-by: Jan Beulich <jbeulich@suse.com>
>   - s/BUG()/BUG_ON("unimplemented")
> ---
> Changes in V3:
>   - update the commit message
>   - add Acked-by: Jan Beulich <jbeulich@suse.com>
>   - remove "include <asm/current.h>" and use a forward declaration instead.
> ---
> Changes in V2:
>   - change xen/lib.h to xen/bug.h
>   - remove unnecessary empty line
> ---
> xen/arch/riscv/include/asm/regs.h | 29 +++++++++++++++++++++++++++++
>   1 file changed, 29 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/regs.h
> 
> diff --git a/xen/arch/riscv/include/asm/regs.h b/xen/arch/riscv/include/asm/regs.h
> new file mode 100644
> index 0000000000..c70ea2aa0c
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/regs.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef __ARM_RISCV_REGS_H__
> +#define __ARM_RISCV_REGS_H__
> +
> +#ifndef __ASSEMBLY__
> +
> +#include <xen/bug.h>
> +
> +#define hyp_mode(r)     (0)

I don't understand where here you return 0 (which should really be 
false) but ...

> +
> +struct cpu_user_regs;
> +
> +static inline bool guest_mode(const struct cpu_user_regs *r)
> +{
> +    BUG_ON("unimplemented");
> +}

... here you return BUG_ON(). But I couldn't find any user of both 
guest_mode() and hyp_mode(). So isn't it a bit prematurate to introduce 
the helpers?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-05 15:32 ` [PATCH v4 07/30] xen/asm-generic: introdure nospec.h Oleksii Kurochko
@ 2024-02-18 18:30   ` Julien Grall
  2024-02-19 11:59     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Julien Grall @ 2024-02-18 18:30 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Shawn Anastasio, Alistair Francis, Bob Eshleman,
	Connor Davis

Hi Oleksii,

Title: Typo s/introdure/introduce/

On 05/02/2024 15:32, Oleksii Kurochko wrote:
> The <asm/nospec.h> header is similar between Arm, PPC, and RISC-V,
> so it has been moved to asm-generic.

I am not 100% convinced that moving this header to asm-generic is a good 
idea. At least for Arm, those helpers ought to be non-empty, what about 
RISC-V?

If the answer is they should be non-empty. Then I would consider to keep 
the duplication to make clear that each architecture should take their 
own decision in term of security.

The alternative, is to have a generic implementation that is safe by 
default (if that's even possible).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-05 15:32 ` [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
  2024-02-13 10:37   ` Jan Beulich
@ 2024-02-18 19:00   ` Julien Grall
  2024-02-19 14:00     ` Oleksii
  1 sibling, 1 reply; 107+ messages in thread
From: Julien Grall @ 2024-02-18 19:00 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu



On 05/02/2024 15:32, Oleksii Kurochko wrote:
> The header was taken from Linux kernl 6.4.0-rc1.
> 
> Addionally, were updated:
> * add emulation of {cmp}xchg for 1/2 byte types

This explaination is a little bit light. IIUC, you are implementing them 
using 32-bit atomic access. Is that correct? If so, please spell it out.

Also, I wonder whether it would be better to try to get rid of the 1/2 
bytes access. Do you know where they are used?

> * replace tabs with spaces
Does this mean you are not planning to backport any Linux fixes?

> * replace __* varialbed with *__

s/varialbed/variable/

> * introduce generic version of xchg_* and cmpxchg_*.
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V4:
>   - Code style fixes.
>   - enforce in __xchg_*() has the same type for new and *ptr, also "\n"
>     was removed at the end of asm instruction.
>   - dependency from https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@bugseng.com/
>   - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
>   - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
>   - drop cmpxcg{32,64}_{local} as they aren't used.
>   - introduce generic version of xchg_* and cmpxchg_*.
>   - update the commit message.
> ---
> Changes in V3:
>   - update the commit message
>   - add emulation of {cmp}xchg_... for 1 and 2 bytes types
> ---
> Changes in V2:
>   - update the comment at the top of the header.
>   - change xen/lib.h to xen/bug.h.
>   - sort inclusion of headers properly.
> ---
>   xen/arch/riscv/include/asm/cmpxchg.h | 237 +++++++++++++++++++++++++++
>   1 file changed, 237 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
> 
> diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
> new file mode 100644
> index 0000000000..b751a50cbf
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> @@ -0,0 +1,237 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright (C) 2014 Regents of the University of California */
> +
> +#ifndef _ASM_RISCV_CMPXCHG_H
> +#define _ASM_RISCV_CMPXCHG_H
> +
> +#include <xen/compiler.h>
> +#include <xen/lib.h>
> +
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
> +
> +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier, acquire_barrier) \
> +({ \
> +    asm volatile( \
> +        release_barrier \
> +        " amoswap" sfx " %0, %2, %1\n" \
> +        acquire_barrier \
> +        : "=r" (ret), "+A" (*ptr) \
> +        : "r" (new) \
> +        : "memory" ); \
> +})
> +
> +#define emulate_xchg_1_2(ptr, new, ret, release_barrier, acquire_barrier) \
> +({ \
> +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned long)ptr, 4); \
> +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr))) * BITS_PER_BYTE; \
> +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
> +    uint8_t mask_h = mask_l + mask_size - 1; \
> +    unsigned long mask = GENMASK(mask_h, mask_l); \
> +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> +    unsigned long ret_; \
> +    unsigned long rc; \
> +    \
> +    asm volatile( \
> +        release_barrier \
> +        "0: lr.d %0, %2\n" \

I was going to ask why this is lr.d rather than lr.w. But I see Jan 
already asked. I agree with him that it should probably be a lr.w and ...

> +        "   and  %1, %0, %z4\n" \
> +        "   or   %1, %1, %z3\n" \
> +        "   sc.d %1, %1, %2\n" \

... respectively sc.w. The same applies for cmpxchg.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 13/30] xen/riscv: introduce io.h
  2024-02-05 15:32 ` [PATCH v4 13/30] xen/riscv: introduce io.h Oleksii Kurochko
  2024-02-13 11:05   ` Jan Beulich
@ 2024-02-18 19:07   ` Julien Grall
  2024-02-19 14:32     ` Oleksii
  1 sibling, 1 reply; 107+ messages in thread
From: Julien Grall @ 2024-02-18 19:07 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu



On 05/02/2024 15:32, Oleksii Kurochko wrote:
> The header taken form Linux 6.4.0-rc1 and is based on
> arch/riscv/include/asm/mmio.h.
> 
> Addionally, to the header was added definions of ioremap_*().

s/definions/definitions/

> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in V4:
>   - delete inner parentheses in macros.
>   - s/u<N>/uint<N>.
> ---
> Changes in V3:
>   - re-sync with linux kernel
>   - update the commit message
> ---
> Changes in V2:
>   - Nothing changed. Only rebase.
> ---
>   xen/arch/riscv/include/asm/io.h | 142 ++++++++++++++++++++++++++++++++
>   1 file changed, 142 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/io.h
> 
> diff --git a/xen/arch/riscv/include/asm/io.h b/xen/arch/riscv/include/asm/io.h
> new file mode 100644
> index 0000000000..1e61a40522
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/io.h
> @@ -0,0 +1,142 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * {read,write}{b,w,l,q} based on arch/arm64/include/asm/io.h
> + *   which was based on arch/arm/include/io.h
> + *
> + * Copyright (C) 1996-2000 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2014 Regents of the University of California
> + */
> +
> +
> +#ifndef _ASM_RISCV_IO_H
> +#define _ASM_RISCV_IO_H
> +
> +#include <asm/byteorder.h>
> +
> +/*
> + * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> + * change the properties of memory regions.  This should be fixed by the
> + * upcoming platform spec.
> + */
> +#define ioremap_nocache(addr, size) ioremap(addr, size)
> +#define ioremap_wc(addr, size) ioremap(addr, size)
> +#define ioremap_wt(addr, size) ioremap(addr, size)
> +
> +/* Generic IO read/write.  These perform native-endian accesses. */
> +#define __raw_writeb __raw_writeb
> +static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
> +{
> +	asm volatile("sb %0, 0(%1)" : : "r" (val), "r" (addr));
> +}
> +
> +#define __raw_writew __raw_writew
> +static inline void __raw_writew(uint16_t val, volatile void __iomem *addr)
> +{
> +	asm volatile("sh %0, 0(%1)" : : "r" (val), "r" (addr));
> +}
> +
> +#define __raw_writel __raw_writel
> +static inline void __raw_writel(uint32_t val, volatile void __iomem *addr)
> +{
> +	asm volatile("sw %0, 0(%1)" : : "r" (val), "r" (addr));
> +}
> +
> +#ifdef CONFIG_64BIT
> +#define __raw_writeq __raw_writeq
> +static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
> +{
> +	asm volatile("sd %0, 0(%1)" : : "r" (val), "r" (addr));
> +}
> +#endif
> +
> +#define __raw_readb __raw_readb
> +static inline uint8_t __raw_readb(const volatile void __iomem *addr)
> +{
> +	uint8_t val;
> +
> +	asm volatile("lb %0, 0(%1)" : "=r" (val) : "r" (addr));
> +	return val;
> +}
> +
> +#define __raw_readw __raw_readw
> +static inline uint16_t __raw_readw(const volatile void __iomem *addr)
> +{
> +	uint16_t val;
> +
> +	asm volatile("lh %0, 0(%1)" : "=r" (val) : "r" (addr));
> +	return val;
> +}
> +
> +#define __raw_readl __raw_readl
> +static inline uint32_t __raw_readl(const volatile void __iomem *addr)
> +{
> +	uint32_t val;
> +
> +	asm volatile("lw %0, 0(%1)" : "=r" (val) : "r" (addr));
> +	return val;
> +}
> +
> +#ifdef CONFIG_64BIT
> +#define __raw_readq __raw_readq
> +static inline u64 __raw_readq(const volatile void __iomem *addr)
> +{
> +	u64 val;
> +
> +	asm volatile("ld %0, 0(%1)" : "=r" (val) : "r" (addr));
> +	return val;
> +}
> +#endif
> +
> +/*
> + * Unordered I/O memory access primitives.  These are even more relaxed than
> + * the relaxed versions, as they don't even order accesses between successive
> + * operations to the I/O regions.
> + */
> +#define readb_cpu(c)		({ uint8_t  __r = __raw_readb(c); __r; })
> +#define readw_cpu(c)		({ uint16_t __r = le16_to_cpu((__force __le16)__raw_readw(c)); __r; })
> +#define readl_cpu(c)		({ uint32_t __r = le32_to_cpu((__force __le32)__raw_readl(c)); __r; })
> +
> +#define writeb_cpu(v,c)		((void)__raw_writeb(v,c))
> +#define writew_cpu(v,c)		((void)__raw_writew((__force uint16_t)cpu_to_le16(v),c))
> +#define writel_cpu(v,c)		((void)__raw_writel((__force uint32_t)cpu_to_le32(v),c))

NIT: __raw_write*() are already returning void. So I am not sure to 
understand the pointer of the cast. IIUC, this is coming from Linux, are 
you intend to keep the code as-is (including style)? If not, then I 
woudl consider to drop the cast on the three lines above and ...

> +
> +#ifdef CONFIG_64BIT
> +#define readq_cpu(c)		({ u64 __r = le64_to_cpu((__force __le64)__raw_readq(c)); __r; })
> +#define writeq_cpu(v,c)		((void)__raw_writeq((__force u64)cpu_to_le64(v),c))

... here as well.

> +#endif
> +
> +/*
> + * I/O memory access primitives. Reads are ordered relative to any
> + * following Normal memory access. Writes are ordered relative to any prior
> + * Normal memory access.  The memory barriers here are necessary as RISC-V
> + * doesn't define any ordering between the memory space and the I/O space.
> + */
> +#define __io_br()	do {} while (0)
> +#define __io_ar(v)	__asm__ __volatile__ ("fence i,r" : : : "memory");
> +#define __io_bw()	__asm__ __volatile__ ("fence w,o" : : : "memory");
> +#define __io_aw()	do { } while (0)
> +
> +#define readb(c)	({ uint8_t  __v; __io_br(); __v = readb_cpu(c); __io_ar(__v); __v; })
> +#define readw(c)	({ uint16_t __v; __io_br(); __v = readw_cpu(c); __io_ar(__v); __v; })
> +#define readl(c)	({ uint32_t __v; __io_br(); __v = readl_cpu(c); __io_ar(__v); __v; })
> +
> +#define writeb(v,c)	({ __io_bw(); writeb_cpu(v,c); __io_aw(); })
> +#define writew(v,c)	({ __io_bw(); writew_cpu(v,c); __io_aw(); })
> +#define writel(v,c)	({ __io_bw(); writel_cpu(v,c); __io_aw(); })
> +
> +#ifdef CONFIG_64BIT
> +#define readq(c)	({ u64 __v; __io_br(); __v = readq_cpu(c); __io_ar(__v); __v; })
> +#define writeq(v,c)	({ __io_bw(); writeq_cpu((v),(c)); __io_aw(); })
> +#endif
> +
> +#endif /* _ASM_RISCV_IO_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 14/30] xen/riscv: introduce atomic.h
  2024-02-05 15:32 ` [PATCH v4 14/30] xen/riscv: introduce atomic.h Oleksii Kurochko
  2024-02-13 11:36   ` Jan Beulich
@ 2024-02-18 19:22   ` Julien Grall
  2024-02-19 14:35     ` Oleksii
  1 sibling, 1 reply; 107+ messages in thread
From: Julien Grall @ 2024-02-18 19:22 UTC (permalink / raw)
  To: Oleksii Kurochko, xen-devel
  Cc: Bobby Eshleman, Alistair Francis, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi,

On 05/02/2024 15:32, Oleksii Kurochko wrote:
> From: Bobby Eshleman <bobbyeshleman@gmail.com>
> 
> Additionally, this patch introduces macros in fence.h,
> which are utilized in atomic.h.
> 
> atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n)
> were updated to use __*xchg_generic().
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

The author is Bobby, but I don't see a Signed-off-by. Did you forgot it?

> ---
> Changes in V4:
>   - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
>   - drop casts in read_atomic_size(), write_atomic(), add_sized()
>   - tabs -> spaces
>   - drop #ifdef CONFIG_SMP ... #endif in fence.ha as it is simpler to handle NR_CPUS=1
>     the same as NR_CPUS>1 with accepting less than ideal performance.
> ---
> Changes in V3:
>    - update the commit message
>    - add SPDX for fence.h
>    - code style fixes
>    - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
>    - re-order the patch
>    - merge to this patch fence.h
> ---
> Changes in V2:
>   - Change an author of commit. I got this header from Bobby's old repo.
> ---
>   xen/arch/riscv/include/asm/atomic.h | 395 ++++++++++++++++++++++++++++
>   xen/arch/riscv/include/asm/fence.h  |   8 +
>   2 files changed, 403 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/atomic.h
>   create mode 100644 xen/arch/riscv/include/asm/fence.h
> 
> diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
> new file mode 100644
> index 0000000000..267d3c0803
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/atomic.h
> @@ -0,0 +1,395 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Taken and modified from Linux.

Which version of Linux? Can you also spell out what are the big changes? 
This would be helpful if we need to re-sync.

> + *
> + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
> + * __*xchg_generic()
> + *
> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> + * Copyright (C) 2012 Regents of the University of California
> + * Copyright (C) 2017 SiFive
> + * Copyright (C) 2021 Vates SAS
> + */
> +
> +#ifndef _ASM_RISCV_ATOMIC_H
> +#define _ASM_RISCV_ATOMIC_H
> +
> +#include <xen/atomic.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +void __bad_atomic_size(void);
> +
> +static always_inline void read_atomic_size(const volatile void *p,
> +                                           void *res,
> +                                           unsigned int size)
> +{
> +    switch ( size )
> +    {
> +    case 1: *(uint8_t *)res = readb(p); break;
> +    case 2: *(uint16_t *)res = readw(p); break;
> +    case 4: *(uint32_t *)res = readl(p); break;
> +    case 8: *(uint32_t *)res  = readq(p); break;
> +    default: __bad_atomic_size(); break;
> +    }
> +}
> +
> +#define read_atomic(p) ({                               \
> +    union { typeof(*p) val; char c[0]; } x_;            \
> +    read_atomic_size(p, x_.c, sizeof(*p));              \
> +    x_.val;                                             \
> +})
> +
> +#define write_atomic(p, x)                              \
> +({                                                      \
> +    typeof(*p) x__ = (x);                               \
> +    switch ( sizeof(*p) )                               \
> +    {                                                   \
> +    case 1: writeb((uint8_t)x__,  p); break;            \
> +    case 2: writew((uint16_t)x__, p); break;            \
> +    case 4: writel((uint32_t)x__, p); break;            \
> +    case 8: writeq((uint64_t)x__, p); break;            \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +    x__;                                                \
> +})
> +
> +#define add_sized(p, x)                                 \
> +({                                                      \
> +    typeof(*(p)) x__ = (x);                             \
> +    switch ( sizeof(*(p)) )                             \
> +    {                                                   \
> +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> +    case 2: writew(read_atomic(p) + x__, p); break;     \
> +    case 4: writel(read_atomic(p) + x__, p); break;     \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +})
> +
> +/*
> + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> + *               non-scalar types unchanged.
> + *
> + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'

Xen is technically built using c99/gnu99. So it is feels a bit odd to 
introduce a C11 feature. I see that _Generic is already used in PPC... 
However, if we decide to add more use of it, then I think this should at 
minimum be documented in docs/misra/C-language-toolchain.rst (the more 
if we plan the macro is moved to common as Jan suggested).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-16 11:16         ` Oleksii
@ 2024-02-19  8:06           ` Jan Beulich
  2024-02-23 17:00             ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-19  8:06 UTC (permalink / raw)
  To: Oleksii
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On 16.02.2024 12:16, Oleksii wrote:
> On Thu, 2024-02-15 at 17:43 +0100, Jan Beulich wrote:
>> On 15.02.2024 17:38, Oleksii wrote:
>>> On Tue, 2024-02-13 at 14:33 +0100, Jan Beulich wrote:
>>>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>>>> +	depends on LLD_VERSION >= 150000 || LD_VERSION >=
>>>>> 23600
>>>>
>>>> What's the linker dependency here? Depending on the answer I
>>>> might
>>>> further
>>>> ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_ or
>>>> HAS_AS_.
>>> I missed to introduce {L}LLD_VERSION config. It should output from
>>> the
>>> command:
>>>   riscv64-linux-gnu-ld --version
>>
>> Doesn't answer my question though where the linker version matters
>> here.
> Then I misinterpreted your initial question.
> Could you please provide further clarification or rephrase it for
> better understanding?
> 
> Probably, your question was about why linker dependency is needed here,
> then
> it is not sufficient to check if a toolchain supports a particular  
> extension without checking if the linker supports that extension   
> too.
> For example, Clang 15 supports Zihintpause but GNU bintutils
> 2.35.2 does not, leading build errors like so:
>     
>    riscv64-linux-gnu-ld: -march=rv64i_zihintpause2p0: Invalid or
>    unknown z ISA extension: 'zihintpause'

Hmm, that's certainly "interesting" behavior of the RISC-V linker. Yet
isn't the linker capability expected to be tied to that of gas? I would
find it far more natural if a gas dependency existed here. If such a
connection cannot be taken for granted, I'm pretty sure you'd need to
probe both then anyway.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h to build full Xen
  2024-02-16 11:03     ` Oleksii
@ 2024-02-19  8:07       ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-19  8:07 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 16.02.2024 12:03, Oleksii wrote:
>>
>>> +        } free;
>>> +    } u;
>>> +
>>> +    union {
>>> +        /* Page is in use, but not as a shadow. */
>>
>> I'm also pretty sure I asked before what shadow this comment alludes
>> to.
> I missed your request about 'shadow' before.
> 
> The comment arrived from Arm.
> 
> I tried to find out the answer by investigation how 'inuse' is used,
> and, unfortunately, I couldn't find an answer what 'shadow' alludes to.

That's from x86'es shadow paging, where a page can serve as a "shadow" of
a guest page table page.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-15 13:41     ` Oleksii
@ 2024-02-19 11:22       ` Jan Beulich
  2024-02-19 14:29         ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-19 11:22 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 15.02.2024 14:41, Oleksii wrote:
>>> +        : "=r" (ret), "+A" (*ptr) \
>>> +        : "r" (new) \
>>> +        : "memory" ); \
>>> +})
>>> +
>>> +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
>>> acquire_barrier) \
>>> +({ \
>>> +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned
>>> long)ptr, 4); \
>>
>> You now appear to assume that this macro is only used with inputs not
>> crossing word boundaries. That's okay as long as suitably guaranteed
>> at the use sites, but imo wants saying in a comment.
>>
>>> +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr)))
>>> * BITS_PER_BYTE; \
>>
>> Why 0x8 (i.e. spanning 64 bits), not 4 (matching the uint32_t use
>> above)?
> The idea to read 8 bytes was to deal with crossing word boundary. So if
> our address is 0x3 and we have to xchg() 2 bytes, what will cross 4
> byte boundary. Instead we align add 0x3, so it will become 0x0 and then
> just always work with 8 bytes.

Then what if my 2-byte access crosses a dword boundary? A cache line
one? A page one?

>>> +    unsigned long new_ = (unsigned long)(new) << mask_l; \
>>> +    unsigned long ret_; \
>>> +    unsigned long rc; \
>>
>> Similarly, why unsigned long here?
> sizeof(unsigned long) is 8 bytes and it was chosen as we are working
> with lc/sc.d which are working with 8 bytes.
> 
>>
>> I also wonder about the mix of underscore suffixed (or not) variable
>> names here.
> If the question about ret_, then the same as before size of ret
> argument of the macros will be 1 or 2, but {lc/sc}.d expected to work
> with 8 bytes.

Then what's the uint32_t * about?

>>> +        release_barrier \
>>> +        "0: lr.d %0, %2\n" \
>>
>> Even here it's an 8-byte access. Even if - didn't check - the insn
>> was
>> okay to use with just a 4-byte aligned pointer, wouldn't it make
>> sense
>> then to 8-byte align it, and be consistent throughout this macro wrt
>> the base unit acted upon? Alternatively, why not use lr.w here, thus
>> reducing possible collisions between multiple CPUs accessing the same
>> cache line?
> According to the docs:
> LR and SC operate on naturally-aligned 64-bit (RV64 only) or 32-bit
> words in memory. Misaligned
> addresses will generate misaligned address exceptions.
> 
> My intention was to deal with 4-byte crossing boundary. so if ptr is 4-
> byte aligned then by reading 8-bytes we shouldn't care about boundary
> crossing, if I am not missing something.

If a ptr is 4-byte aligned, there's no point reading more than 4 bytes.

>>> +        "   and  %1, %0, %z4\n" \
>>> +        "   or   %1, %1, %z3\n" \
>>> +        "   sc.d %1, %1, %2\n" \
>>> +        "   bnez %1, 0b\n" \
>>> +        acquire_barrier \
>>> +        : "=&r" (ret_), "=&r" (rc), "+A" (*ptr_32b_aligned) \
>>> +        : "rJ" (new_), "rJ" (~mask) \
>>
>> I think that as soon as there are more than 2 or maybe 3 operands,
>> legibility is vastly improved by using named asm() operands.
> Just to clarify you mean that it would be better to use instead of %0
> use names?

Yes. Just like you have it in one of the other patches that I looked at
later.

>>> +        : "memory"); \
>>
>> Nit: Missing blank before closing parenthesis.
>>
>>> +    \
>>> +    ret = (__typeof__(*(ptr)))((ret_ & mask) >> mask_l); \
>>> +})
>>
>> Why does "ret" need to be a macro argument? If you had only the
>> expression here, not the the assigment, ...
>>
>>> +#define __xchg_generic(ptr, new, size, sfx, release_barrier,
>>> acquire_barrier) \
>>> +({ \
>>> +    __typeof__(ptr) ptr__ = (ptr); \
>>
>> Is this local variable really needed? Can't you use "ptr" directly
>> in the three macro invocations?
>>
>>> +    __typeof__(*(ptr)) new__ = (new); \
>>> +    __typeof__(*(ptr)) ret__; \
>>> +    switch (size) \
>>> +    { \
>>> +    case 1: \
>>> +    case 2: \
>>> +        emulate_xchg_1_2(ptr__, new__, ret__, release_barrier,
>>> acquire_barrier); \
>>
>> ... this would become
>>
>>         ret__ = emulate_xchg_1_2(ptr__, new__, release_barrier,
>> acquire_barrier); \
>>
>> But, unlike assumed above, there's no enforcement here that a 2-byte
>> quantity won't cross a word, double-word, cache line, or even page
>> boundary. That might be okay if then the code would simply crash
>> (like
>> the AMO insns emitted further down would), but aiui silent
>> misbehavior
>> would result.
> As I mentioned above with 4-byte alignment and then reading and working
> with 8-byte then crossing a word or double-word boundary shouldn't be
> an issue.
> 
> I am not sure that I know how to check that we are crossing cache line
> boundary.
> 
> Regarding page boundary, if the next page is mapped then all should
> work fine, otherwise it will be an exception.

Are you sure lr.d / sc.d are happy to access across such a boundary,
when both pages are mapped?

To me it seems pretty clear that for atomic accesses you want to
demand natural alignment, i.e. 2-byte alignment for 2-byte accesses.
This way you can be sure no potentially problematic boundaries will
be crossed.

>>> +        break; \
>>> +    case 4: \
>>> +        __amoswap_generic(ptr__, new__, ret__,\
>>> +                          ".w" sfx,  release_barrier,
>>> acquire_barrier); \
>>> +        break; \
>>> +    case 8: \
>>> +        __amoswap_generic(ptr__, new__, ret__,\
>>> +                          ".d" sfx,  release_barrier,
>>> acquire_barrier); \
>>> +        break; \
>>> +    default: \
>>> +        STATIC_ASSERT_UNREACHABLE(); \
>>> +    } \
>>> +    ret__; \
>>> +})
>>> +
>>> +#define xchg_relaxed(ptr, x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) x_ = (x); \
>>> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),
>>> "", "", ""); \
>>> +})
>>> +
>>> +#define xchg_acquire(ptr, x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) x_ = (x); \
>>> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)), \
>>> +                                       "", "",
>>> RISCV_ACQUIRE_BARRIER); \
>>> +})
>>> +
>>> +#define xchg_release(ptr, x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) x_ = (x); \
>>> +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_, sizeof(*(ptr)),\
>>> +                                       "", RISCV_RELEASE_BARRIER,
>>> ""); \
>>> +})
>>> +
>>> +#define xchg(ptr,x) \
>>> +({ \
>>> +    __typeof__(*(ptr)) ret__; \
>>> +    ret__ = (__typeof__(*(ptr))) \
>>> +            __xchg_generic(ptr, (unsigned long)(x),
>>> sizeof(*(ptr)), \
>>> +                           ".aqrl", "", ""); \
>>
>> The .aqrl doesn't look to affect the (emulated) 1- and 2-byte cases.
>>
>> Further, amoswap also exists in release-only and acquire-only forms.
>> Why do you prefer explicit barrier insns over those? (Looks to
>> similarly apply to the emulation path as well as to the cmpxchg
>> machinery then, as both lr and sc also come in all four possible
>> acquire/release forms. Perhaps for the emulation path using
>> explicit barriers is better, in case the acquire/release forms of
>> lr/sc - being used inside the loop - might perform worse.)
> As 1- and 2-byte cases are emulated I decided that is not to provide
> sfx argument for emulation macros as it will not have to much affect on
> emulated types and just consume more performance on acquire and release
> version of sc/ld instructions.

Question is whether the common case (4- and 8-byte accesses) shouldn't
be valued higher, with 1- and 2-byte emulation being there just to
allow things to not break altogether.

>> No RISCV_..._BARRIER for use here and ...
>>
>>> +    ret__; \
>>> +})
>>> +
>>> +#define __cmpxchg(ptr, o, n, s) \
>>> +({ \
>>> +    __typeof__(*(ptr)) ret__; \
>>> +    ret__ = (__typeof__(*(ptr))) \
>>> +            __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned
>>> long)(n), \
>>> +                              s, ".rl", "", " fence rw, rw\n"); \
>>
>> ... here? And anyway, wouldn't it make sense to have
>>
>> #define cmpxchg(ptr, o, n) __cmpxchg(ptr, o, n, sizeof(*(ptr))
>>
>> to limit redundancy?
>>
>> Plus wouldn't
>>
>> #define __cmpxchg(ptr, o, n, s) \
>>     ((__typeof__(*(ptr))) \
>>      __cmpxchg_generic(ptr, (unsigned long)(o), (unsigned long)(n), \
>>                        s, ".rl", "", " fence rw, rw\n"))
>>
>> be shorter and thus easier to follow as well? As I notice only now,
>> this would apparently apply further up as well.
> I understand your point about "#define cmpxchg(ptr, o, n) __cmpxchg(",
> but I can't undestand how the definition of __cmxchng should be done
> shorter. Could you please clarify that?

You did notice that in my form there's no local variable, and hence
also no macro-wide scope ( ({ ... }) )?

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-18 18:30   ` Julien Grall
@ 2024-02-19 11:59     ` Oleksii
  2024-02-19 12:18       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-19 11:59 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Shawn Anastasio, Alistair Francis, Bob Eshleman,
	Connor Davis

Hi Julien,

On Sun, 2024-02-18 at 18:30 +0000, Julien Grall wrote:
> Hi Oleksii,
> 
> Title: Typo s/introdure/introduce/
> 
> On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > The <asm/nospec.h> header is similar between Arm, PPC, and RISC-V,
> > so it has been moved to asm-generic.
> 
> I am not 100% convinced that moving this header to asm-generic is a
> good 
> idea. At least for Arm, those helpers ought to be non-empty, what
> about 
> RISC-V?
For Arm, they are not taking any action, are they? There are no
specific fences or other mechanisms inside
evaluate_nospec()/block_speculation() to address speculation.

For RISC-V, it can be implemented in a similar manner, at least for
now. Since these functions are only used in the grant tables code ( for
Arm and so for RISC-V ), which is not supported by RISC-V.

> 
> If the answer is they should be non-empty. Then I would consider to
> keep 
> the duplication to make clear that each architecture should take
> their 
> own decision in term of security.
> 
> The alternative, is to have a generic implementation that is safe by 
> default (if that's even possible).
I am not certain that we can have a generic implementation, as each
architecture may have specific speculation issues.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-19 11:59     ` Oleksii
@ 2024-02-19 12:18       ` Jan Beulich
  2024-02-20 20:30         ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-19 12:18 UTC (permalink / raw)
  To: Oleksii
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis,
	Julien Grall, xen-devel

On 19.02.2024 12:59, Oleksii wrote:
> Hi Julien,
> 
> On Sun, 2024-02-18 at 18:30 +0000, Julien Grall wrote:
>> Hi Oleksii,
>>
>> Title: Typo s/introdure/introduce/
>>
>> On 05/02/2024 15:32, Oleksii Kurochko wrote:
>>> The <asm/nospec.h> header is similar between Arm, PPC, and RISC-V,
>>> so it has been moved to asm-generic.
>>
>> I am not 100% convinced that moving this header to asm-generic is a
>> good 
>> idea. At least for Arm, those helpers ought to be non-empty, what
>> about 
>> RISC-V?
> For Arm, they are not taking any action, are they? There are no
> specific fences or other mechanisms inside
> evaluate_nospec()/block_speculation() to address speculation.

The question isn't the status quo, but how things should be looking like
if everything was in place that's (in principle) needed.

> For RISC-V, it can be implemented in a similar manner, at least for
> now. Since these functions are only used in the grant tables code ( for
> Arm and so for RISC-V ), which is not supported by RISC-V.

Same here - the question is whether long term, when gnttab is also
supported, RISC-V would get away without doing anything. Still ...

>> If the answer is they should be non-empty. Then I would consider to
>> keep 
>> the duplication to make clear that each architecture should take
>> their 
>> own decision in term of security.
>>
>> The alternative, is to have a generic implementation that is safe by 
>> default (if that's even possible).
> I am not certain that we can have a generic implementation, as each
> architecture may have specific speculation issues.

... it's theoretically possible that there'd be an arch with no
speculation issues, maybe simply because of not speculating.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-18 19:00   ` Julien Grall
@ 2024-02-19 14:00     ` Oleksii
  2024-02-19 14:12       ` Jan Beulich
  2024-02-19 14:25       ` Julien Grall
  0 siblings, 2 replies; 107+ messages in thread
From: Oleksii @ 2024-02-19 14:00 UTC (permalink / raw)
  To: Julien Grall, xen-devel, Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu

On Sun, 2024-02-18 at 19:00 +0000, Julien Grall wrote:
> 
> 
> On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > The header was taken from Linux kernl 6.4.0-rc1.
> > 
> > Addionally, were updated:
> > * add emulation of {cmp}xchg for 1/2 byte types
> 
> This explaination is a little bit light. IIUC, you are implementing
> them 
> using 32-bit atomic access. Is that correct? If so, please spell it
> out.
Sure, I'll update commit message.

> 
> Also, I wonder whether it would be better to try to get rid of the
> 1/2 
> bytes access. Do you know where they are used?
Right now, the issue is with test_and_clear_bool() which is used in
common/sched/core.c:840
[https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/sched/core.c?ref_type=heads#L840
]

I don't remember details, but in xen-devel chat someone told me that
grant table requires 1/2 bytes access.

> 
> > * replace tabs with spaces
> Does this mean you are not planning to backport any Linux fixes?
If it will be any fixes for sure I'll back port them, but it looks like
this code is stable enough and not to many fixes will be done there, so
it shouldn't be hard to backport them and switch to spaces.

> 
> > * replace __* varialbed with *__
> 
> s/varialbed/variable/
> 
> > * introduce generic version of xchg_* and cmpxchg_*.
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > ---
> > Changes in V4:
> >   - Code style fixes.
> >   - enforce in __xchg_*() has the same type for new and *ptr, also
> > "\n"
> >     was removed at the end of asm instruction.
> >   - dependency from
> > https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@bugseng.com/
> >   - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
> >   - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
> >   - drop cmpxcg{32,64}_{local} as they aren't used.
> >   - introduce generic version of xchg_* and cmpxchg_*.
> >   - update the commit message.
> > ---
> > Changes in V3:
> >   - update the commit message
> >   - add emulation of {cmp}xchg_... for 1 and 2 bytes types
> > ---
> > Changes in V2:
> >   - update the comment at the top of the header.
> >   - change xen/lib.h to xen/bug.h.
> >   - sort inclusion of headers properly.
> > ---
> >   xen/arch/riscv/include/asm/cmpxchg.h | 237
> > +++++++++++++++++++++++++++
> >   1 file changed, 237 insertions(+)
> >   create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/cmpxchg.h
> > b/xen/arch/riscv/include/asm/cmpxchg.h
> > new file mode 100644
> > index 0000000000..b751a50cbf
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> > @@ -0,0 +1,237 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (C) 2014 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_CMPXCHG_H
> > +#define _ASM_RISCV_CMPXCHG_H
> > +
> > +#include <xen/compiler.h>
> > +#include <xen/lib.h>
> > +
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
> > +
> > +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier,
> > acquire_barrier) \
> > +({ \
> > +    asm volatile( \
> > +        release_barrier \
> > +        " amoswap" sfx " %0, %2, %1\n" \
> > +        acquire_barrier \
> > +        : "=r" (ret), "+A" (*ptr) \
> > +        : "r" (new) \
> > +        : "memory" ); \
> > +})
> > +
> > +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
> > acquire_barrier) \
> > +({ \
> > +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned
> > long)ptr, 4); \
> > +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr)))
> > * BITS_PER_BYTE; \
> > +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
> > +    uint8_t mask_h = mask_l + mask_size - 1; \
> > +    unsigned long mask = GENMASK(mask_h, mask_l); \
> > +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> > +    unsigned long ret_; \
> > +    unsigned long rc; \
> > +    \
> > +    asm volatile( \
> > +        release_barrier \
> > +        "0: lr.d %0, %2\n" \
> 
> I was going to ask why this is lr.d rather than lr.w. But I see Jan 
> already asked. I agree with him that it should probably be a lr.w and
> ...
> 
> > +        "   and  %1, %0, %z4\n" \
> > +        "   or   %1, %1, %z3\n" \
> > +        "   sc.d %1, %1, %2\n" \
> 
> ... respectively sc.w. The same applies for cmpxchg.

I agree that it would be better, and my initial attempt was to handle
4-byte or 8-byte boundary crossing during 2-byte access:

0 1 2 3 4 5 6 7 8
X X X 1 1 X X X X

In this case, if I align address 3 to address 0 and then read 4 bytes
instead of 8 bytes, I will not process the byte at address 4. This was
the reason why I started to read 8 bytes.

I also acknowledge that there could be an issue in the following case:

X  4094 4095 4096
X    1   1    X
In this situation, when I read 8 bytes, there could be an issue where
the new page (which starts at 4096) will not be mapped. It seems
correct in this case to check that variable is within one page and read
4 bytes instead of 8.

One more thing I am uncertain about is if we change everything to read
4 bytes with 4-byte alignment, what should be done with the first case?
Should we panic? (I am not sure if this is an option.) Should we
perform the operation twice for addresses 0x0 and 0x4?

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 14:00     ` Oleksii
@ 2024-02-19 14:12       ` Jan Beulich
  2024-02-19 15:20         ` Oleksii
  2024-02-19 14:25       ` Julien Grall
  1 sibling, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-19 14:12 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu, Julien Grall,
	xen-devel

On 19.02.2024 15:00, Oleksii wrote:
> On Sun, 2024-02-18 at 19:00 +0000, Julien Grall wrote:
>> On 05/02/2024 15:32, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/cmpxchg.h
>>> @@ -0,0 +1,237 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/* Copyright (C) 2014 Regents of the University of California */
>>> +
>>> +#ifndef _ASM_RISCV_CMPXCHG_H
>>> +#define _ASM_RISCV_CMPXCHG_H
>>> +
>>> +#include <xen/compiler.h>
>>> +#include <xen/lib.h>
>>> +
>>> +#include <asm/fence.h>
>>> +#include <asm/io.h>
>>> +#include <asm/system.h>
>>> +
>>> +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
>>> +
>>> +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier,
>>> acquire_barrier) \
>>> +({ \
>>> +    asm volatile( \
>>> +        release_barrier \
>>> +        " amoswap" sfx " %0, %2, %1\n" \
>>> +        acquire_barrier \
>>> +        : "=r" (ret), "+A" (*ptr) \
>>> +        : "r" (new) \
>>> +        : "memory" ); \
>>> +})
>>> +
>>> +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
>>> acquire_barrier) \
>>> +({ \
>>> +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned
>>> long)ptr, 4); \
>>> +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr)))
>>> * BITS_PER_BYTE; \
>>> +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
>>> +    uint8_t mask_h = mask_l + mask_size - 1; \
>>> +    unsigned long mask = GENMASK(mask_h, mask_l); \
>>> +    unsigned long new_ = (unsigned long)(new) << mask_l; \
>>> +    unsigned long ret_; \
>>> +    unsigned long rc; \
>>> +    \
>>> +    asm volatile( \
>>> +        release_barrier \
>>> +        "0: lr.d %0, %2\n" \
>>
>> I was going to ask why this is lr.d rather than lr.w. But I see Jan 
>> already asked. I agree with him that it should probably be a lr.w and
>> ...
>>
>>> +        "   and  %1, %0, %z4\n" \
>>> +        "   or   %1, %1, %z3\n" \
>>> +        "   sc.d %1, %1, %2\n" \
>>
>> ... respectively sc.w. The same applies for cmpxchg.
> 
> I agree that it would be better, and my initial attempt was to handle
> 4-byte or 8-byte boundary crossing during 2-byte access:
> 
> 0 1 2 3 4 5 6 7 8
> X X X 1 1 X X X X
> 
> In this case, if I align address 3 to address 0 and then read 4 bytes
> instead of 8 bytes, I will not process the byte at address 4. This was
> the reason why I started to read 8 bytes.
> 
> I also acknowledge that there could be an issue in the following case:
> 
> X  4094 4095 4096
> X    1   1    X
> In this situation, when I read 8 bytes, there could be an issue where
> the new page (which starts at 4096) will not be mapped. It seems
> correct in this case to check that variable is within one page and read
> 4 bytes instead of 8.
> 
> One more thing I am uncertain about is if we change everything to read
> 4 bytes with 4-byte alignment, what should be done with the first case?
> Should we panic? (I am not sure if this is an option.)

Counter question (raised elsewhere already): What if a 4-byte access
crosses a word / cache line / page boundary? Ideally exactly the
same would happen for a 2-byte access crossing a respective boundary.
(Which you can achieve relatively easily by masking off only address
bit 1, keeping address bit 0 unaltered.)

> Should we
> perform the operation twice for addresses 0x0 and 0x4?

That wouldn't be atomic then.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 14:00     ` Oleksii
  2024-02-19 14:12       ` Jan Beulich
@ 2024-02-19 14:25       ` Julien Grall
  1 sibling, 0 replies; 107+ messages in thread
From: Julien Grall @ 2024-02-19 14:25 UTC (permalink / raw)
  To: Oleksii, xen-devel, Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu

Hi Oleksii,

On 19/02/2024 14:00, Oleksii wrote:
> On Sun, 2024-02-18 at 19:00 +0000, Julien Grall wrote:
>>
>>
>> On 05/02/2024 15:32, Oleksii Kurochko wrote:
>>> The header was taken from Linux kernl 6.4.0-rc1.
>>>
>>> Addionally, were updated:
>>> * add emulation of {cmp}xchg for 1/2 byte types
>>
>> This explaination is a little bit light. IIUC, you are implementing
>> them
>> using 32-bit atomic access. Is that correct? If so, please spell it
>> out.
> Sure, I'll update commit message.
> 
>>
>> Also, I wonder whether it would be better to try to get rid of the
>> 1/2
>> bytes access. Do you know where they are used?
> Right now, the issue is with test_and_clear_bool() which is used in
> common/sched/core.c:840
> [https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/sched/core.c?ref_type=heads#L840
> ]
> 
> I don't remember details, but in xen-devel chat someone told me that
> grant table requires 1/2 bytes access.

Ok :/. This would be part of the ABI then and therefore can't be easily 
changed.

> 
>>
>>> * replace tabs with spaces
>> Does this mean you are not planning to backport any Linux fixes?
> If it will be any fixes for sure I'll back port them, but it looks like
> this code is stable enough and not to many fixes will be done there, so
> it shouldn't be hard to backport them and switch to spaces.

Fair enough.

>>> +        "   and  %1, %0, %z4\n" \
>>> +        "   or   %1, %1, %z3\n" \
>>> +        "   sc.d %1, %1, %2\n" \
>>
>> ... respectively sc.w. The same applies for cmpxchg.
> 
> I agree that it would be better, and my initial attempt was to handle
> 4-byte or 8-byte boundary crossing during 2-byte access:
> 
> 0 1 2 3 4 5 6 7 8
> X X X 1 1 X X X X
> 
> In this case, if I align address 3 to address 0 and then read 4 bytes
> instead of 8 bytes, I will not process the byte at address 4. This was
> the reason why I started to read 8 bytes.

At least on Arm, the architecture doesn't support atomic operations if 
the access is not aligned to its size (this will send a data abort). On 
some architecture, this is supported but potentially very slow.

So all the common code should already use properly aligned address. 
Therefore, I don't really see the reason to add support for unaligned 
access.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 11:22       ` Jan Beulich
@ 2024-02-19 14:29         ` Oleksii
  2024-02-19 15:01           ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-19 14:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-19 at 12:22 +0100, Jan Beulich wrote:
> On 15.02.2024 14:41, Oleksii wrote:
> > > > +        : "=r" (ret), "+A" (*ptr) \
> > > > +        : "r" (new) \
> > > > +        : "memory" ); \
> > > > +})
> > > > +
> > > > +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
> > > > acquire_barrier) \
> > > > +({ \
> > > > +    uint32_t *ptr_32b_aligned = (uint32_t
> > > > *)ALIGN_DOWN((unsigned
> > > > long)ptr, 4); \
> > > 
> > > You now appear to assume that this macro is only used with inputs
> > > not
> > > crossing word boundaries. That's okay as long as suitably
> > > guaranteed
> > > at the use sites, but imo wants saying in a comment.
> > > 
> > > > +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 -
> > > > sizeof(*ptr)))
> > > > * BITS_PER_BYTE; \
> > > 
> > > Why 0x8 (i.e. spanning 64 bits), not 4 (matching the uint32_t use
> > > above)?
> > The idea to read 8 bytes was to deal with crossing word boundary.
> > So if
> > our address is 0x3 and we have to xchg() 2 bytes, what will cross 4
> > byte boundary. Instead we align add 0x3, so it will become 0x0 and
> > then
> > just always work with 8 bytes.
> 
> Then what if my 2-byte access crosses a dword boundary? A cache line
> one? A page one?
Everything looks okay to me, except in the case of a page boundary.

In the scenario of a dword boundary:

0 1 2 3 4 5 6 7 8 9 ...
X X X X X X X 1 1 X

Assuming a variable starts at address 7, 4-byte alignment will be
enforced, and 8 bytes will be processed starting from address 4.

Concerning a cache line, it should still work, with potential
performance issues arising only if a part of the variable is cached
while another part is not.

Regarding page crossing, I acknowledge that it could be problematic if
the variable is entirely located at the end of a page, as there is no
guarantee that the next page exists. In this case, it would be
preferable to consistently read 4 bytes with 4-byte alignment:

X 4094 4095 4096?
X  1    1    ?

However, if the variable spans two pages, proper page mapping should be
ensured.

It appears sensible to reconsider the macros and implement 4-byte
alignment and 4-byte access, but then this is not clear how better to
deal with first case ( dword boundary ). Panic ? or use the macros
twice for address 4, and address 8?

> 
> > > > +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> > > > +    unsigned long ret_; \
> > > > +    unsigned long rc; \
> > > 
> > > Similarly, why unsigned long here?
> > sizeof(unsigned long) is 8 bytes and it was chosen as we are
> > working
> > with lc/sc.d which are working with 8 bytes.
> > 
> > > 
> > > I also wonder about the mix of underscore suffixed (or not)
> > > variable
> > > names here.
> > If the question about ret_, then the same as before size of ret
> > argument of the macros will be 1 or 2, but {lc/sc}.d expected to
> > work
> > with 8 bytes.
> 
> Then what's the uint32_t * about?
Agree, then it should be also unsigned long.
> > 

> > > > +    __typeof__(*(ptr)) new__ = (new); \
> > > > +    __typeof__(*(ptr)) ret__; \
> > > > +    switch (size) \
> > > > +    { \
> > > > +    case 1: \
> > > > +    case 2: \
> > > > +        emulate_xchg_1_2(ptr__, new__, ret__, release_barrier,
> > > > acquire_barrier); \
> > > 
> > > ... this would become
> > > 
> > >         ret__ = emulate_xchg_1_2(ptr__, new__, release_barrier,
> > > acquire_barrier); \
> > > 
> > > But, unlike assumed above, there's no enforcement here that a 2-
> > > byte
> > > quantity won't cross a word, double-word, cache line, or even
> > > page
> > > boundary. That might be okay if then the code would simply crash
> > > (like
> > > the AMO insns emitted further down would), but aiui silent
> > > misbehavior
> > > would result.
> > As I mentioned above with 4-byte alignment and then reading and
> > working
> > with 8-byte then crossing a word or double-word boundary shouldn't
> > be
> > an issue.
> > 
> > I am not sure that I know how to check that we are crossing cache
> > line
> > boundary.
> > 
> > Regarding page boundary, if the next page is mapped then all should
> > work fine, otherwise it will be an exception.
> 
> Are you sure lr.d / sc.d are happy to access across such a boundary,
> when both pages are mapped?
If they are mapped, my expectation that lr.d and sc.d should be happy.

> 
> To me it seems pretty clear that for atomic accesses you want to
> demand natural alignment, i.e. 2-byte alignment for 2-byte accesses.
> This way you can be sure no potentially problematic boundaries will
> be crossed.
It makes sense, but I am not sure that I can guarantee that a user of
macros will always have 2-byte alignment (except during a panic) in the
future.

Even now, I am uncertain that everyone will be willing to add
__alignment(...) to struct vcpu->is_urgent
(xen/include/xen/sched.h:218) and other possible cases to accommodate
RISC-V requirements.

> 
> > > > +        break; \
> > > > +    case 4: \
> > > > +        __amoswap_generic(ptr__, new__, ret__,\
> > > > +                          ".w" sfx,  release_barrier,
> > > > acquire_barrier); \
> > > > +        break; \
> > > > +    case 8: \
> > > > +        __amoswap_generic(ptr__, new__, ret__,\
> > > > +                          ".d" sfx,  release_barrier,
> > > > acquire_barrier); \
> > > > +        break; \
> > > > +    default: \
> > > > +        STATIC_ASSERT_UNREACHABLE(); \
> > > > +    } \
> > > > +    ret__; \
> > > > +})
> > > > +
> > > > +#define xchg_relaxed(ptr, x) \
> > > > +({ \
> > > > +    __typeof__(*(ptr)) x_ = (x); \
> > > > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_,
> > > > sizeof(*(ptr)),
> > > > "", "", ""); \
> > > > +})
> > > > +
> > > > +#define xchg_acquire(ptr, x) \
> > > > +({ \
> > > > +    __typeof__(*(ptr)) x_ = (x); \
> > > > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_,
> > > > sizeof(*(ptr)), \
> > > > +                                       "", "",
> > > > RISCV_ACQUIRE_BARRIER); \
> > > > +})
> > > > +
> > > > +#define xchg_release(ptr, x) \
> > > > +({ \
> > > > +    __typeof__(*(ptr)) x_ = (x); \
> > > > +    (__typeof__(*(ptr)))__xchg_generic(ptr, x_,
> > > > sizeof(*(ptr)),\
> > > > +                                       "",
> > > > RISCV_RELEASE_BARRIER,
> > > > ""); \
> > > > +})
> > > > +
> > > > +#define xchg(ptr,x) \
> > > > +({ \
> > > > +    __typeof__(*(ptr)) ret__; \
> > > > +    ret__ = (__typeof__(*(ptr))) \
> > > > +            __xchg_generic(ptr, (unsigned long)(x),
> > > > sizeof(*(ptr)), \
> > > > +                           ".aqrl", "", ""); \
> > > 
> > > The .aqrl doesn't look to affect the (emulated) 1- and 2-byte
> > > cases.
> > > 
> > > Further, amoswap also exists in release-only and acquire-only
> > > forms.
> > > Why do you prefer explicit barrier insns over those? (Looks to
> > > similarly apply to the emulation path as well as to the cmpxchg
> > > machinery then, as both lr and sc also come in all four possible
> > > acquire/release forms. Perhaps for the emulation path using
> > > explicit barriers is better, in case the acquire/release forms of
> > > lr/sc - being used inside the loop - might perform worse.)
> > As 1- and 2-byte cases are emulated I decided that is not to
> > provide
> > sfx argument for emulation macros as it will not have to much
> > affect on
> > emulated types and just consume more performance on acquire and
> > release
> > version of sc/ld instructions.
> 
> Question is whether the common case (4- and 8-byte accesses)
> shouldn't
> be valued higher, with 1- and 2-byte emulation being there just to
> allow things to not break altogether.
If I understand you correctly, it would make sense to add the 'sfx'
argument for the 1/2-byte access case, ensuring that all options are
available for 1/2-byte access case as well.


~ Oleksii



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 13/30] xen/riscv: introduce io.h
  2024-02-18 19:07   ` Julien Grall
@ 2024-02-19 14:32     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-19 14:32 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

On Sun, 2024-02-18 at 19:07 +0000, Julien Grall wrote:
> 
> > +/*
> > + * Unordered I/O memory access primitives.  These are even more
> > relaxed than
> > + * the relaxed versions, as they don't even order accesses between
> > successive
> > + * operations to the I/O regions.
> > + */
> > +#define readb_cpu(c)		({ uint8_t  __r = __raw_readb(c);
> > __r; })
> > +#define readw_cpu(c)		({ uint16_t __r =
> > le16_to_cpu((__force __le16)__raw_readw(c)); __r; })
> > +#define readl_cpu(c)		({ uint32_t __r =
> > le32_to_cpu((__force __le32)__raw_readl(c)); __r; })
> > +
> > +#define writeb_cpu(v,c)		((void)__raw_writeb(v,c))
> > +#define
> > writew_cpu(v,c)		((void)__raw_writew((__force uint16_t)cpu_to_le16(v),c))
> > +#define
> > writel_cpu(v,c)		((void)__raw_writel((__force uint32_t)cpu_to_le32(v),c))
> 
> NIT: __raw_write*() are already returning void. So I am not sure to 
> understand the pointer of the cast. IIUC, this is coming from Linux,
> are 
> you intend to keep the code as-is (including style)? If not, then I 
> woudl consider to drop the cast on the three lines above and ...
Changes have already been made in this header, so it makes sense to
remove these casts. Thanks.


~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 14/30] xen/riscv: introduce atomic.h
  2024-02-18 19:22   ` Julien Grall
@ 2024-02-19 14:35     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-19 14:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Bobby Eshleman, Alistair Francis, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi Julien,

On Sun, 2024-02-18 at 19:22 +0000, Julien Grall wrote:
> Hi,
> 
> On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > From: Bobby Eshleman <bobbyeshleman@gmail.com>
> > 
> > Additionally, this patch introduces macros in fence.h,
> > which are utilized in atomic.h.
> > 
> > atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n)
> > were updated to use __*xchg_generic().
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> The author is Bobby, but I don't see a Signed-off-by. Did you forgot
> it?
I missed to add that as I thought that it would be enough to change a
commit author.

> 
> > ---
> > Changes in V4:
> >   - do changes related to the updates of [PATCH v3 13/34]
> > xen/riscv: introduce cmpxchg.h
> >   - drop casts in read_atomic_size(), write_atomic(), add_sized()
> >   - tabs -> spaces
> >   - drop #ifdef CONFIG_SMP ... #endif in fence.ha as it is simpler
> > to handle NR_CPUS=1
> >     the same as NR_CPUS>1 with accepting less than ideal
> > performance.
> > ---
> > Changes in V3:
> >    - update the commit message
> >    - add SPDX for fence.h
> >    - code style fixes
> >    - Remove /* TODO: ... */ for add_sized macros. It looks correct
> > to me.
> >    - re-order the patch
> >    - merge to this patch fence.h
> > ---
> > Changes in V2:
> >   - Change an author of commit. I got this header from Bobby's old
> > repo.
> > ---
> >   xen/arch/riscv/include/asm/atomic.h | 395
> > ++++++++++++++++++++++++++++
> >   xen/arch/riscv/include/asm/fence.h  |   8 +
> >   2 files changed, 403 insertions(+)
> >   create mode 100644 xen/arch/riscv/include/asm/atomic.h
> >   create mode 100644 xen/arch/riscv/include/asm/fence.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/atomic.h
> > b/xen/arch/riscv/include/asm/atomic.h
> > new file mode 100644
> > index 0000000000..267d3c0803
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/atomic.h
> > @@ -0,0 +1,395 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Taken and modified from Linux.
> 
> Which version of Linux? Can you also spell out what are the big
> changes? 
> This would be helpful if we need to re-sync.
Sure, I'll add the changes here.

> 
> > + *
> > + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
> > updated to use
> > + * __*xchg_generic()
> > + *
> > + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> > + * Copyright (C) 2012 Regents of the University of California
> > + * Copyright (C) 2017 SiFive
> > + * Copyright (C) 2021 Vates SAS
> > + */
> > +
> > +#ifndef _ASM_RISCV_ATOMIC_H
> > +#define _ASM_RISCV_ATOMIC_H
> > +
> > +#include <xen/atomic.h>
> > +#include <asm/cmpxchg.h>
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +void __bad_atomic_size(void);
> > +
> > +static always_inline void read_atomic_size(const volatile void *p,
> > +                                           void *res,
> > +                                           unsigned int size)
> > +{
> > +    switch ( size )
> > +    {
> > +    case 1: *(uint8_t *)res = readb(p); break;
> > +    case 2: *(uint16_t *)res = readw(p); break;
> > +    case 4: *(uint32_t *)res = readl(p); break;
> > +    case 8: *(uint32_t *)res  = readq(p); break;
> > +    default: __bad_atomic_size(); break;
> > +    }
> > +}
> > +
> > +#define read_atomic(p) ({                               \
> > +    union { typeof(*p) val; char c[0]; } x_;            \
> > +    read_atomic_size(p, x_.c, sizeof(*p));              \
> > +    x_.val;                                             \
> > +})
> > +
> > +#define write_atomic(p, x)                              \
> > +({                                                      \
> > +    typeof(*p) x__ = (x);                               \
> > +    switch ( sizeof(*p) )                               \
> > +    {                                                   \
> > +    case 1: writeb((uint8_t)x__,  p); break;            \
> > +    case 2: writew((uint16_t)x__, p); break;            \
> > +    case 4: writel((uint32_t)x__, p); break;            \
> > +    case 8: writeq((uint64_t)x__, p); break;            \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +    x__;                                                \
> > +})
> > +
> > +#define add_sized(p, x)                                 \
> > +({                                                      \
> > +    typeof(*(p)) x__ = (x);                             \
> > +    switch ( sizeof(*(p)) )                             \
> > +    {                                                   \
> > +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> > +    case 2: writew(read_atomic(p) + x__, p); break;     \
> > +    case 4: writel(read_atomic(p) + x__, p); break;     \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +})
> > +
> > +/*
> > + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar
> > type, leaving
> > + *               non-scalar types unchanged.
> > + *
> > + * Prefer C11 _Generic for better compile-times and simpler code.
> > Note: 'char'
> 
> Xen is technically built using c99/gnu99. So it is feels a bit odd to
> introduce a C11 feature. I see that _Generic is already used in
> PPC... 
> However, if we decide to add more use of it, then I think this should
> at 
> minimum be documented in docs/misra/C-language-toolchain.rst (the
> more 
> if we plan the macro is moved to common as Jan suggested).
> 
> Cheers,
> 


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 17/30] xen/riscv: introduce regs.h
  2024-02-18 18:22   ` Julien Grall
@ 2024-02-19 14:40     ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-19 14:40 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi Julien,

On Sun, 2024-02-18 at 18:22 +0000, Julien Grall wrote:
> Hi,
> 
> On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> > Acked-by: Jan Beulich <jbeulich@suse.com>
> > ------
> > Changes in V4:
> >   - add Acked-by: Jan Beulich <jbeulich@suse.com>
> >   - s/BUG()/BUG_ON("unimplemented")
> > ---
> > Changes in V3:
> >   - update the commit message
> >   - add Acked-by: Jan Beulich <jbeulich@suse.com>
> >   - remove "include <asm/current.h>" and use a forward declaration
> > instead.
> > ---
> > Changes in V2:
> >   - change xen/lib.h to xen/bug.h
> >   - remove unnecessary empty line
> > ---
> > xen/arch/riscv/include/asm/regs.h | 29
> > +++++++++++++++++++++++++++++
> >   1 file changed, 29 insertions(+)
> >   create mode 100644 xen/arch/riscv/include/asm/regs.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/regs.h
> > b/xen/arch/riscv/include/asm/regs.h
> > new file mode 100644
> > index 0000000000..c70ea2aa0c
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/regs.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef __ARM_RISCV_REGS_H__
> > +#define __ARM_RISCV_REGS_H__
> > +
> > +#ifndef __ASSEMBLY__
> > +
> > +#include <xen/bug.h>
> > +
> > +#define hyp_mode(r)     (0)
> 
> I don't understand where here you return 0 (which should really be 
> false) but ...
> 
> > +
> > +struct cpu_user_regs;
> > +
> > +static inline bool guest_mode(const struct cpu_user_regs *r)
> > +{
> > +    BUG_ON("unimplemented");
> > +}
> 
> ... here you return BUG_ON(). But I couldn't find any user of both 
> guest_mode() and hyp_mode(). So isn't it a bit prematurate to
> introduce 
> the helpers?

I agree regarding hyp_mode() it can be dropped , but gest_mode() is
used by common/keyhandler.c:142.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 14:29         ` Oleksii
@ 2024-02-19 15:01           ` Jan Beulich
  2024-02-23 12:23             ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-19 15:01 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 19.02.2024 15:29, Oleksii wrote:
> On Mon, 2024-02-19 at 12:22 +0100, Jan Beulich wrote:
>> On 15.02.2024 14:41, Oleksii wrote:
>>> As I mentioned above with 4-byte alignment and then reading and
>>> working
>>> with 8-byte then crossing a word or double-word boundary shouldn't
>>> be
>>> an issue.
>>>
>>> I am not sure that I know how to check that we are crossing cache
>>> line
>>> boundary.
>>>
>>> Regarding page boundary, if the next page is mapped then all should
>>> work fine, otherwise it will be an exception.
>>
>> Are you sure lr.d / sc.d are happy to access across such a boundary,
>> when both pages are mapped?
> If they are mapped, my expectation that lr.d and sc.d should be happy.

How does this expectation of yours fit with the A extension doc having
this:

"For LR and SC, the A extension requires that the address held in rs1 be
 naturally aligned to the size of the operand (i.e., eight-byte aligned
 for 64-bit words and four-byte aligned for 32-bit words). If the
 address is not naturally aligned, an address-misaligned exception or an
 access-fault exception will be generated."

It doesn't even say "may"; it says "will".

>> To me it seems pretty clear that for atomic accesses you want to
>> demand natural alignment, i.e. 2-byte alignment for 2-byte accesses.
>> This way you can be sure no potentially problematic boundaries will
>> be crossed.
> It makes sense, but I am not sure that I can guarantee that a user of
> macros will always have 2-byte alignment (except during a panic) in the
> future.
> 
> Even now, I am uncertain that everyone will be willing to add
> __alignment(...) to struct vcpu->is_urgent
> (xen/include/xen/sched.h:218) and other possible cases to accommodate
> RISC-V requirements.

->is_urgent is bool, i.e. 1 byte and hence okay at any address. For all
normal variables and fields the compiler will guarantee suitable
(natural) alignment. What you prohibit by requiring aligned items is
use of fields of e.g. packed structures.

>>> As 1- and 2-byte cases are emulated I decided that is not to
>>> provide
>>> sfx argument for emulation macros as it will not have to much
>>> affect on
>>> emulated types and just consume more performance on acquire and
>>> release
>>> version of sc/ld instructions.
>>
>> Question is whether the common case (4- and 8-byte accesses)
>> shouldn't
>> be valued higher, with 1- and 2-byte emulation being there just to
>> allow things to not break altogether.
> If I understand you correctly, it would make sense to add the 'sfx'
> argument for the 1/2-byte access case, ensuring that all options are
> available for 1/2-byte access case as well.

That's one of the possibilities. As said, I'm not overly worried about
the emulated cases. For the initial implementation I'd recommend going
with what is easiest there, yielding the best possible result for the
4- and 8-byte cases. If later it turns out repeated acquire/release
accesses are a problem in the emulation loop, things can be changed
to explicit barriers, without touching the 4- and 8-byte cases.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 14:12       ` Jan Beulich
@ 2024-02-19 15:20         ` Oleksii
  2024-02-19 15:29           ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-19 15:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu, Julien Grall,
	xen-devel

On Mon, 2024-02-19 at 15:12 +0100, Jan Beulich wrote:
> On 19.02.2024 15:00, Oleksii wrote:
> > On Sun, 2024-02-18 at 19:00 +0000, Julien Grall wrote:
> > > On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > > > --- /dev/null
> > > > +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> > > > @@ -0,0 +1,237 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > +/* Copyright (C) 2014 Regents of the University of California
> > > > */
> > > > +
> > > > +#ifndef _ASM_RISCV_CMPXCHG_H
> > > > +#define _ASM_RISCV_CMPXCHG_H
> > > > +
> > > > +#include <xen/compiler.h>
> > > > +#include <xen/lib.h>
> > > > +
> > > > +#include <asm/fence.h>
> > > > +#include <asm/io.h>
> > > > +#include <asm/system.h>
> > > > +
> > > > +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
> > > > +
> > > > +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier,
> > > > acquire_barrier) \
> > > > +({ \
> > > > +    asm volatile( \
> > > > +        release_barrier \
> > > > +        " amoswap" sfx " %0, %2, %1\n" \
> > > > +        acquire_barrier \
> > > > +        : "=r" (ret), "+A" (*ptr) \
> > > > +        : "r" (new) \
> > > > +        : "memory" ); \
> > > > +})
> > > > +
> > > > +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
> > > > acquire_barrier) \
> > > > +({ \
> > > > +    uint32_t *ptr_32b_aligned = (uint32_t
> > > > *)ALIGN_DOWN((unsigned
> > > > long)ptr, 4); \
> > > > +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 -
> > > > sizeof(*ptr)))
> > > > * BITS_PER_BYTE; \
> > > > +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
> > > > +    uint8_t mask_h = mask_l + mask_size - 1; \
> > > > +    unsigned long mask = GENMASK(mask_h, mask_l); \
> > > > +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> > > > +    unsigned long ret_; \
> > > > +    unsigned long rc; \
> > > > +    \
> > > > +    asm volatile( \
> > > > +        release_barrier \
> > > > +        "0: lr.d %0, %2\n" \
> > > 
> > > I was going to ask why this is lr.d rather than lr.w. But I see
> > > Jan 
> > > already asked. I agree with him that it should probably be a lr.w
> > > and
> > > ...
> > > 
> > > > +        "   and  %1, %0, %z4\n" \
> > > > +        "   or   %1, %1, %z3\n" \
> > > > +        "   sc.d %1, %1, %2\n" \
> > > 
> > > ... respectively sc.w. The same applies for cmpxchg.
> > 
> > I agree that it would be better, and my initial attempt was to
> > handle
> > 4-byte or 8-byte boundary crossing during 2-byte access:
> > 
> > 0 1 2 3 4 5 6 7 8
> > X X X 1 1 X X X X
> > 
> > In this case, if I align address 3 to address 0 and then read 4
> > bytes
> > instead of 8 bytes, I will not process the byte at address 4. This
> > was
> > the reason why I started to read 8 bytes.
> > 
> > I also acknowledge that there could be an issue in the following
> > case:
> > 
> > X  4094 4095 4096
> > X    1   1    X
> > In this situation, when I read 8 bytes, there could be an issue
> > where
> > the new page (which starts at 4096) will not be mapped. It seems
> > correct in this case to check that variable is within one page and
> > read
> > 4 bytes instead of 8.
> > 
> > One more thing I am uncertain about is if we change everything to
> > read
> > 4 bytes with 4-byte alignment, what should be done with the first
> > case?
> > Should we panic? (I am not sure if this is an option.)
> 
> Counter question (raised elsewhere already): What if a 4-byte access
> crosses a word / cache line / page boundary? Ideally exactly the
> same would happen for a 2-byte access crossing a respective boundary.
> (Which you can achieve relatively easily by masking off only address
> bit 1, keeping address bit 0 unaltered.)
But if we align down on a 4-byte boundary and then access 4 bytes, we
can't cross a boundary. I agree that the algorithm is not correct, as
it can ignore some values in certain situations. For example:
0 1 2 3 4 5 6 7 8
X X X 1 1 X X X X
In this case, the value at address 4 won't be updated.

I agree that introducing a new macro to check if a variable crosses a
boundary is necessary or as an option we can check that addr is 2-byte
aligned:

#define CHECK_BOUNDARY_CROSSING(start, end, boundary_size)
ASSERT((start / boundary_size) != (end / boundary_size))

Then, it is necessary to check:

CHECK_BOUNDARY_CROSSING(start, end, 4)
CHECK_BOUNDARY_CROSSING(start, end, PAGE_SIZE)

But why do we need to check the cache line boundary? In the case of the
cache, the question will only be about performance, but it should still
work, shouldn't it?

~ Oleksii



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 15:20         ` Oleksii
@ 2024-02-19 15:29           ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-19 15:29 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Stefano Stabellini, Wei Liu, Julien Grall,
	xen-devel

On 19.02.2024 16:20, Oleksii wrote:
> On Mon, 2024-02-19 at 15:12 +0100, Jan Beulich wrote:
>> On 19.02.2024 15:00, Oleksii wrote:
>>> On Sun, 2024-02-18 at 19:00 +0000, Julien Grall wrote:
>>>> On 05/02/2024 15:32, Oleksii Kurochko wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/arch/riscv/include/asm/cmpxchg.h
>>>>> @@ -0,0 +1,237 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>>> +/* Copyright (C) 2014 Regents of the University of California
>>>>> */
>>>>> +
>>>>> +#ifndef _ASM_RISCV_CMPXCHG_H
>>>>> +#define _ASM_RISCV_CMPXCHG_H
>>>>> +
>>>>> +#include <xen/compiler.h>
>>>>> +#include <xen/lib.h>
>>>>> +
>>>>> +#include <asm/fence.h>
>>>>> +#include <asm/io.h>
>>>>> +#include <asm/system.h>
>>>>> +
>>>>> +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
>>>>> +
>>>>> +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier,
>>>>> acquire_barrier) \
>>>>> +({ \
>>>>> +    asm volatile( \
>>>>> +        release_barrier \
>>>>> +        " amoswap" sfx " %0, %2, %1\n" \
>>>>> +        acquire_barrier \
>>>>> +        : "=r" (ret), "+A" (*ptr) \
>>>>> +        : "r" (new) \
>>>>> +        : "memory" ); \
>>>>> +})
>>>>> +
>>>>> +#define emulate_xchg_1_2(ptr, new, ret, release_barrier,
>>>>> acquire_barrier) \
>>>>> +({ \
>>>>> +    uint32_t *ptr_32b_aligned = (uint32_t
>>>>> *)ALIGN_DOWN((unsigned
>>>>> long)ptr, 4); \
>>>>> +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 -
>>>>> sizeof(*ptr)))
>>>>> * BITS_PER_BYTE; \
>>>>> +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
>>>>> +    uint8_t mask_h = mask_l + mask_size - 1; \
>>>>> +    unsigned long mask = GENMASK(mask_h, mask_l); \
>>>>> +    unsigned long new_ = (unsigned long)(new) << mask_l; \
>>>>> +    unsigned long ret_; \
>>>>> +    unsigned long rc; \
>>>>> +    \
>>>>> +    asm volatile( \
>>>>> +        release_barrier \
>>>>> +        "0: lr.d %0, %2\n" \
>>>>
>>>> I was going to ask why this is lr.d rather than lr.w. But I see
>>>> Jan 
>>>> already asked. I agree with him that it should probably be a lr.w
>>>> and
>>>> ...
>>>>
>>>>> +        "   and  %1, %0, %z4\n" \
>>>>> +        "   or   %1, %1, %z3\n" \
>>>>> +        "   sc.d %1, %1, %2\n" \
>>>>
>>>> ... respectively sc.w. The same applies for cmpxchg.
>>>
>>> I agree that it would be better, and my initial attempt was to
>>> handle
>>> 4-byte or 8-byte boundary crossing during 2-byte access:
>>>
>>> 0 1 2 3 4 5 6 7 8
>>> X X X 1 1 X X X X
>>>
>>> In this case, if I align address 3 to address 0 and then read 4
>>> bytes
>>> instead of 8 bytes, I will not process the byte at address 4. This
>>> was
>>> the reason why I started to read 8 bytes.
>>>
>>> I also acknowledge that there could be an issue in the following
>>> case:
>>>
>>> X  4094 4095 4096
>>> X    1   1    X
>>> In this situation, when I read 8 bytes, there could be an issue
>>> where
>>> the new page (which starts at 4096) will not be mapped. It seems
>>> correct in this case to check that variable is within one page and
>>> read
>>> 4 bytes instead of 8.
>>>
>>> One more thing I am uncertain about is if we change everything to
>>> read
>>> 4 bytes with 4-byte alignment, what should be done with the first
>>> case?
>>> Should we panic? (I am not sure if this is an option.)
>>
>> Counter question (raised elsewhere already): What if a 4-byte access
>> crosses a word / cache line / page boundary? Ideally exactly the
>> same would happen for a 2-byte access crossing a respective boundary.
>> (Which you can achieve relatively easily by masking off only address
>> bit 1, keeping address bit 0 unaltered.)
> But if we align down on a 4-byte boundary and then access 4 bytes, we
> can't cross a boundary. I agree that the algorithm is not correct, as
> it can ignore some values in certain situations. For example:
> 0 1 2 3 4 5 6 7 8
> X X X 1 1 X X X X
> In this case, the value at address 4 won't be updated.
> 
> I agree that introducing a new macro to check if a variable crosses a
> boundary is necessary or as an option we can check that addr is 2-byte
> aligned:
> 
> #define CHECK_BOUNDARY_CROSSING(start, end, boundary_size)
> ASSERT((start / boundary_size) != (end / boundary_size))
> 
> Then, it is necessary to check:
> 
> CHECK_BOUNDARY_CROSSING(start, end, 4)
> CHECK_BOUNDARY_CROSSING(start, end, PAGE_SIZE)
> 
> But why do we need to check the cache line boundary? In the case of the
> cache, the question will only be about performance, but it should still
> work, shouldn't it?

You don't need to check for any of these boundaries. You can simply
leverage what the hardware does for misaligned accesses. See the
various other replies I've sent - I thought things should have become
pretty much crystal clear by now: For 1-byte accesses you access the
containing word, by clearing the low two bits. For 2-byte accesses
you also access the containing word, by clearing only bit 1 (which
the naturally leaves no bit that needs clearing for the projected
[but not necessary] case of handling a 4-byte access). If the resulting
4-byte access then is still misaligned, it'll fault just as a non-
emulated 4-byte access would. And you don't need to care about any of
the boundaries, not at words, not at cache lines, and not at pages.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 02/30] xen/riscv: use some asm-generic headers
  2024-02-14 10:03       ` Jan Beulich
@ 2024-02-20 18:57         ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-20 18:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Wed, 2024-02-14 at 11:03 +0100, Jan Beulich wrote:
> On 14.02.2024 10:54, Oleksii wrote:
> > On Mon, 2024-02-12 at 16:03 +0100, Jan Beulich wrote:
> > > On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > > >  As [PATCH v6 0/9] Introduce generic headers
> > > >  (
> > > > https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kurochko@gmail.com
> > > > /)
> > > >  is not stable, the list in asm/Makefile can be changed, but
> > > > the
> > > > changes will
> > > >  be easy.
> > > 
> > > Or wait - doesn't this mean the change here can't be committed
> > > yet? I
> > > know the cover letter specifies dependencies, yet I think we need
> > > to
> > > come
> > > to a point where this large series won't need re-posting again
> > > and
> > > again.
> > We can't committed it now because asm-generic version of device.h,
> > which is not commited yet.
> > 
> > We can drop the change " generic-y += device.h ", and commit the
> > current one patch, but it sill will require to create a new patch
> > for
> > using of asm-generic/device.h. Or as an option, I can merge
> > "generic-y
> > += device.h" into PATCH 29/30 xen/riscv: enable full Xen build.
> > 
> > I don't expect that the of asm-generic headers will changed in
> > riscv/include/asm/Makefile, but it looks to me that it is better to
> > wait until asm-generic/device.h will be in staging branch.
> > 
> > 
> > If you have better ideas, please share it with me.
> 
> My main point was that the interdependencies here have grown too far,
> imo. The more that while having dependencies stated in the cover
> letter
> is useful, while committing (and also reviewing) I for one would
> typically only look at the individual patches.
> 
> For this patch alone, maybe it would be more obvious that said
> dependency exists if it was last on the asm-generic series, rather
> than part of the series here (which depends on that other series
> anyway). That series now looks to be making some progress, and it
> being
> a prereq for here it may be prudent to focus on getting that one in,
> before re-posting here.
I'll be more specific next time regarding dependencies and specify what
a prereq changes are.

Considering that asm-generic/device.h was merged to staging branch. It
seems to me that there are no more additional prereqs for this patch.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-19 12:18       ` Jan Beulich
@ 2024-02-20 20:30         ` Oleksii
  2024-02-21 11:00           ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-20 20:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis,
	Julien Grall, xen-devel

On Mon, 2024-02-19 at 13:18 +0100, Jan Beulich wrote:
> On 19.02.2024 12:59, Oleksii wrote:
> > Hi Julien,
> > 
> > On Sun, 2024-02-18 at 18:30 +0000, Julien Grall wrote:
> > > Hi Oleksii,
> > > 
> > > Title: Typo s/introdure/introduce/
> > > 
> > > On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > > > The <asm/nospec.h> header is similar between Arm, PPC, and
> > > > RISC-V,
> > > > so it has been moved to asm-generic.
> > > 
> > > I am not 100% convinced that moving this header to asm-generic is
> > > a
> > > good 
> > > idea. At least for Arm, those helpers ought to be non-empty, what
> > > about 
> > > RISC-V?
> > For Arm, they are not taking any action, are they? There are no
> > specific fences or other mechanisms inside
> > evaluate_nospec()/block_speculation() to address speculation.
> 
> The question isn't the status quo, but how things should be looking
> like
> if everything was in place that's (in principle) needed.
> 
> > For RISC-V, it can be implemented in a similar manner, at least for
> > now. Since these functions are only used in the grant tables code (
> > for
> > Arm and so for RISC-V ), which is not supported by RISC-V.
> 
> Same here - the question is whether long term, when gnttab is also
> supported, RISC-V would get away without doing anything. Still ...
> 
> > > If the answer is they should be non-empty. Then I would consider
> > > to
> > > keep 
> > > the duplication to make clear that each architecture should take
> > > their 
> > > own decision in term of security.
> > > 
> > > The alternative, is to have a generic implementation that is safe
> > > by 
> > > default (if that's even possible).
> > I am not certain that we can have a generic implementation, as each
> > architecture may have specific speculation issues.
> 
> ... it's theoretically possible that there'd be an arch with no
> speculation issues, maybe simply because of not speculating.

I am not sure that understand your and Julien point.

For example, modern CPU uses speculative execution to reduce the cost
of conditional branch instructions using schemes that predict the
execution path of a program based on the history of branch executions.

Arm CPUs are vulnerable for speculative execution, but if to look at
the code of evaluate_nospec()/block_speculation() functions they are
doing nothing for Arm.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-20 20:30         ` Oleksii
@ 2024-02-21 11:00           ` Jan Beulich
  2024-02-21 12:47             ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-21 11:00 UTC (permalink / raw)
  To: Oleksii
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis,
	Julien Grall, xen-devel

On 20.02.2024 21:30, Oleksii wrote:
> On Mon, 2024-02-19 at 13:18 +0100, Jan Beulich wrote:
>> On 19.02.2024 12:59, Oleksii wrote:
>>> Hi Julien,
>>>
>>> On Sun, 2024-02-18 at 18:30 +0000, Julien Grall wrote:
>>>> Hi Oleksii,
>>>>
>>>> Title: Typo s/introdure/introduce/
>>>>
>>>> On 05/02/2024 15:32, Oleksii Kurochko wrote:
>>>>> The <asm/nospec.h> header is similar between Arm, PPC, and
>>>>> RISC-V,
>>>>> so it has been moved to asm-generic.
>>>>
>>>> I am not 100% convinced that moving this header to asm-generic is
>>>> a
>>>> good 
>>>> idea. At least for Arm, those helpers ought to be non-empty, what
>>>> about 
>>>> RISC-V?
>>> For Arm, they are not taking any action, are they? There are no
>>> specific fences or other mechanisms inside
>>> evaluate_nospec()/block_speculation() to address speculation.
>>
>> The question isn't the status quo, but how things should be looking
>> like
>> if everything was in place that's (in principle) needed.
>>
>>> For RISC-V, it can be implemented in a similar manner, at least for
>>> now. Since these functions are only used in the grant tables code (
>>> for
>>> Arm and so for RISC-V ), which is not supported by RISC-V.
>>
>> Same here - the question is whether long term, when gnttab is also
>> supported, RISC-V would get away without doing anything. Still ...
>>
>>>> If the answer is they should be non-empty. Then I would consider
>>>> to
>>>> keep 
>>>> the duplication to make clear that each architecture should take
>>>> their 
>>>> own decision in term of security.
>>>>
>>>> The alternative, is to have a generic implementation that is safe
>>>> by 
>>>> default (if that's even possible).
>>> I am not certain that we can have a generic implementation, as each
>>> architecture may have specific speculation issues.
>>
>> ... it's theoretically possible that there'd be an arch with no
>> speculation issues, maybe simply because of not speculating.
> 
> I am not sure that understand your and Julien point.
> 
> For example, modern CPU uses speculative execution to reduce the cost
> of conditional branch instructions using schemes that predict the
> execution path of a program based on the history of branch executions.
> 
> Arm CPUs are vulnerable for speculative execution, but if to look at
> the code of evaluate_nospec()/block_speculation() functions they are
> doing nothing for Arm.

Which, as I understood Julien say, likely isn't correct. In which case
this header shouldn't be dropped, using the generic one instead. The
generic headers, as pointed out several times before, should imo be used
only if their use results in correct behavior. What is acceptable is if
their use results in sub-optimal behavior (e.g. reduced performance or
lack of a certain feature that another architecture maybe implements).

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-21 11:00           ` Jan Beulich
@ 2024-02-21 12:47             ` Oleksii
  2024-02-21 14:07               ` Julien Grall
  2024-02-21 14:58               ` Jan Beulich
  0 siblings, 2 replies; 107+ messages in thread
From: Oleksii @ 2024-02-21 12:47 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis,
	xen-devel

On Wed, 2024-02-21 at 12:00 +0100, Jan Beulich wrote:
> On 20.02.2024 21:30, Oleksii wrote:
> > On Mon, 2024-02-19 at 13:18 +0100, Jan Beulich wrote:
> > > On 19.02.2024 12:59, Oleksii wrote:
> > > > Hi Julien,
> > > > 
> > > > On Sun, 2024-02-18 at 18:30 +0000, Julien Grall wrote:
> > > > > Hi Oleksii,
> > > > > 
> > > > > Title: Typo s/introdure/introduce/
> > > > > 
> > > > > On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > > > > > The <asm/nospec.h> header is similar between Arm, PPC, and
> > > > > > RISC-V,
> > > > > > so it has been moved to asm-generic.
> > > > > 
> > > > > I am not 100% convinced that moving this header to asm-
> > > > > generic is
> > > > > a
> > > > > good 
> > > > > idea. At least for Arm, those helpers ought to be non-empty,
> > > > > what
> > > > > about 
> > > > > RISC-V?
> > > > For Arm, they are not taking any action, are they? There are no
> > > > specific fences or other mechanisms inside
> > > > evaluate_nospec()/block_speculation() to address speculation.
> > > 
> > > The question isn't the status quo, but how things should be
> > > looking
> > > like
> > > if everything was in place that's (in principle) needed.
> > > 
> > > > For RISC-V, it can be implemented in a similar manner, at least
> > > > for
> > > > now. Since these functions are only used in the grant tables
> > > > code (
> > > > for
> > > > Arm and so for RISC-V ), which is not supported by RISC-V.
> > > 
> > > Same here - the question is whether long term, when gnttab is
> > > also
> > > supported, RISC-V would get away without doing anything. Still
> > > ...
> > > 
> > > > > If the answer is they should be non-empty. Then I would
> > > > > consider
> > > > > to
> > > > > keep 
> > > > > the duplication to make clear that each architecture should
> > > > > take
> > > > > their 
> > > > > own decision in term of security.
> > > > > 
> > > > > The alternative, is to have a generic implementation that is
> > > > > safe
> > > > > by 
> > > > > default (if that's even possible).
> > > > I am not certain that we can have a generic implementation, as
> > > > each
> > > > architecture may have specific speculation issues.
> > > 
> > > ... it's theoretically possible that there'd be an arch with no
> > > speculation issues, maybe simply because of not speculating.
> > 
> > I am not sure that understand your and Julien point.
> > 
> > For example, modern CPU uses speculative execution to reduce the
> > cost
> > of conditional branch instructions using schemes that predict the
> > execution path of a program based on the history of branch
> > executions.
> > 
> > Arm CPUs are vulnerable for speculative execution, but if to look
> > at
> > the code of evaluate_nospec()/block_speculation() functions they
> > are
> > doing nothing for Arm.
> 
> Which, as I understood Julien say, likely isn't correct. In which
> case
> this header shouldn't be dropped, using the generic one instead. The
> generic headers, as pointed out several times before, should imo be
> used
> only if their use results in correct behavior. What is acceptable is
> if
> their use results in sub-optimal behavior (e.g. reduced performance
> or
> lack of a certain feature that another architecture maybe
> implements).

As I understand it, evaluate_nospec()/block_speculation() were
introduced for x86 to address the L1TF vulnerability specific to x86
CPUs. This vulnerability is exclusive to x86 architectures [1], which
explains why evaluate_nospec()/block_speculation() are left empty for
Arm, RISC-V, and PPC.

It is unclear whether these functions should be utilized to mitigate
other speculation vulnerabilities. If they should, then, based on the
current implementation, the Arm platform seems to accept having
speculative vulnerabilities.

The question arises: why can't other architectures make their own
decisions regarding security? By default, if an architecture leaves the
mentioned functions empty, it implies an agreement to potentially have
speculative vulnerabilities. Subsequently, if an architecture needs to
address such vulnerabilities, they can develop arch-specific nospec.h
to implement the required code.

If reaching an agreement to potentially have speculative
vulnerabilities is deemed unfavorable, I am open to reconsidering these
changes and reverting to the use of arch-specific nospec.h.
Your input on this matter is appreciated.

[1] https://docs.kernel.org/admin-guide/hw-vuln/l1tf.html



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-21 12:47             ` Oleksii
@ 2024-02-21 14:07               ` Julien Grall
  2024-02-21 14:58               ` Jan Beulich
  1 sibling, 0 replies; 107+ messages in thread
From: Julien Grall @ 2024-02-21 14:07 UTC (permalink / raw)
  To: Oleksii, Jan Beulich
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis,
	xen-devel

Hi Oleksii,

On 21/02/2024 12:47, Oleksii wrote:
> On Wed, 2024-02-21 at 12:00 +0100, Jan Beulich wrote:
>> On 20.02.2024 21:30, Oleksii wrote:
>>> On Mon, 2024-02-19 at 13:18 +0100, Jan Beulich wrote:
>>>> On 19.02.2024 12:59, Oleksii wrote:
>>>>> Hi Julien,
>>>>>
>>>>> On Sun, 2024-02-18 at 18:30 +0000, Julien Grall wrote:
>>>>>> Hi Oleksii,
>>>>>>
>>>>>> Title: Typo s/introdure/introduce/
>>>>>>
>>>>>> On 05/02/2024 15:32, Oleksii Kurochko wrote:
>>>>>>> The <asm/nospec.h> header is similar between Arm, PPC, and
>>>>>>> RISC-V,
>>>>>>> so it has been moved to asm-generic.
>>>>>>
>>>>>> I am not 100% convinced that moving this header to asm-
>>>>>> generic is
>>>>>> a
>>>>>> good
>>>>>> idea. At least for Arm, those helpers ought to be non-empty,
>>>>>> what
>>>>>> about
>>>>>> RISC-V?
>>>>> For Arm, they are not taking any action, are they? There are no
>>>>> specific fences or other mechanisms inside
>>>>> evaluate_nospec()/block_speculation() to address speculation.
>>>>
>>>> The question isn't the status quo, but how things should be
>>>> looking
>>>> like
>>>> if everything was in place that's (in principle) needed.
>>>>
>>>>> For RISC-V, it can be implemented in a similar manner, at least
>>>>> for
>>>>> now. Since these functions are only used in the grant tables
>>>>> code (
>>>>> for
>>>>> Arm and so for RISC-V ), which is not supported by RISC-V.
>>>>
>>>> Same here - the question is whether long term, when gnttab is
>>>> also
>>>> supported, RISC-V would get away without doing anything. Still
>>>> ...
>>>>
>>>>>> If the answer is they should be non-empty. Then I would
>>>>>> consider
>>>>>> to
>>>>>> keep
>>>>>> the duplication to make clear that each architecture should
>>>>>> take
>>>>>> their
>>>>>> own decision in term of security.
>>>>>>
>>>>>> The alternative, is to have a generic implementation that is
>>>>>> safe
>>>>>> by
>>>>>> default (if that's even possible).
>>>>> I am not certain that we can have a generic implementation, as
>>>>> each
>>>>> architecture may have specific speculation issues.
>>>>
>>>> ... it's theoretically possible that there'd be an arch with no
>>>> speculation issues, maybe simply because of not speculating.
>>>
>>> I am not sure that understand your and Julien point.
>>>
>>> For example, modern CPU uses speculative execution to reduce the
>>> cost
>>> of conditional branch instructions using schemes that predict the
>>> execution path of a program based on the history of branch
>>> executions.
>>>
>>> Arm CPUs are vulnerable for speculative execution, but if to look
>>> at
>>> the code of evaluate_nospec()/block_speculation() functions they
>>> are
>>> doing nothing for Arm.
>>
>> Which, as I understood Julien say, likely isn't correct. In which
>> case
>> this header shouldn't be dropped, using the generic one instead. The
>> generic headers, as pointed out several times before, should imo be
>> used
>> only if their use results in correct behavior. What is acceptable is
>> if
>> their use results in sub-optimal behavior (e.g. reduced performance
>> or
>> lack of a certain feature that another architecture maybe
>> implements).
> 
> As I understand it, evaluate_nospec()/block_speculation() were
> introduced for x86 to address the L1TF vulnerability specific to x86
> CPUs. This vulnerability is exclusive to x86 architectures [1], which
> explains why evaluate_nospec()/block_speculation() are left empty for
> Arm, RISC-V, and PPC.
> 
> It is unclear whether these functions should be utilized to mitigate
> other speculation vulnerabilities. 

The name is generic enough that someone may want to use it for other 
speculations. If we think this is only related to L1TF, then the 
functions names should reflect it. But see below.

> If they should, then, based on the
> current implementation, the Arm platform seems to accept having
> speculative vulnerabilities.

Looking at some of the use in common code (such as the grant-table 
code), it is unclear to me why it is empty on Arm. I think we need a 
speculation barrier.

I would raise the same question for RISC-V/PPC.

> 
> The question arises: why can't other architectures make their own
> decisions regarding security? 

Each architecture can make there own decision. I am not trying to 
prevent that. What I am trying to prevent is a developper including the 
asm-generic without realizing that the header doesn't provide a safe 
version.

> By default, if an architecture leaves the
> mentioned functions empty, it implies an agreement to potentially have
> speculative vulnerabilities. 

See above. That agreement is somewhat implicit. It would be better if 
this is explicit.

So overall, I would prefer if that header is not part of asm-generic.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 07/30] xen/asm-generic: introdure nospec.h
  2024-02-21 12:47             ` Oleksii
  2024-02-21 14:07               ` Julien Grall
@ 2024-02-21 14:58               ` Jan Beulich
  1 sibling, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-21 14:58 UTC (permalink / raw)
  To: Oleksii
  Cc: Stefano Stabellini, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Shawn Anastasio, Alistair Francis, Bob Eshleman, Connor Davis,
	xen-devel, Julien Grall

On 21.02.2024 13:47, Oleksii wrote:
> As I understand it, evaluate_nospec()/block_speculation() were
> introduced for x86 to address the L1TF vulnerability specific to x86
> CPUs.

Well, if you look at one of the later commits altering x86'es variant,
you'll find that this wasn't really correct.

> This vulnerability is exclusive to x86 architectures [1], which
> explains why evaluate_nospec()/block_speculation() are left empty for
> Arm, RISC-V, and PPC.
> 
> It is unclear whether these functions should be utilized to mitigate
> other speculation vulnerabilities. If they should, then, based on the
> current implementation, the Arm platform seems to accept having
> speculative vulnerabilities.
> 
> The question arises: why can't other architectures make their own
> decisions regarding security? By default, if an architecture leaves the
> mentioned functions empty, it implies an agreement to potentially have
> speculative vulnerabilities. Subsequently, if an architecture needs to
> address such vulnerabilities, they can develop arch-specific nospec.h
> to implement the required code.

You can't take different perspectives on security. There is some
hardening which one architecture may go farther with than another,
but e.g. information leaks are information leaks and hence need
addressing. Of course if an arch knew it had no (known) issues, then
using a generic form of this header would be okay (until such time
where an issue would be found).

And btw, looking at how xen/nospec.h came about, it looks pretty clear
to me that array_index_mask_nospec() should move from system.h to
nospec.h. That would make Arm's form immediately different from what
a generic one might have, and quite likely an inline assembly variant
could also do better on RISC-V (unless, as said, RISC-V simply has no
such issues). Then again I notice Arm64 has no override here ...

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-19 15:01           ` Jan Beulich
@ 2024-02-23 12:23             ` Oleksii
  2024-02-26  9:45               ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-23 12:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

> 
> > > > As 1- and 2-byte cases are emulated I decided that is not to
> > > > provide
> > > > sfx argument for emulation macros as it will not have to much
> > > > affect on
> > > > emulated types and just consume more performance on acquire and
> > > > release
> > > > version of sc/ld instructions.
> > > 
> > > Question is whether the common case (4- and 8-byte accesses)
> > > shouldn't
> > > be valued higher, with 1- and 2-byte emulation being there just
> > > to
> > > allow things to not break altogether.
> > If I understand you correctly, it would make sense to add the 'sfx'
> > argument for the 1/2-byte access case, ensuring that all options
> > are
> > available for 1/2-byte access case as well.
> 
> That's one of the possibilities. As said, I'm not overly worried
> about
> the emulated cases. For the initial implementation I'd recommend
> going
> with what is easiest there, yielding the best possible result for the
> 4- and 8-byte cases. If later it turns out repeated acquire/release
> accesses are a problem in the emulation loop, things can be changed
> to explicit barriers, without touching the 4- and 8-byte cases.
I am confused then a little bit if emulated case is not an issue.

For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and aqcuire
version of xchg barries are used.

The similar is done for cmpxchg.

If something will be needed to change in emulation loop it won't
require to change 4- and 8-byte cases.

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-19  8:06           ` Jan Beulich
@ 2024-02-23 17:00             ` Oleksii
  2024-02-26  7:26               ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-23 17:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On Mon, 2024-02-19 at 09:06 +0100, Jan Beulich wrote:
> On 16.02.2024 12:16, Oleksii wrote:
> > On Thu, 2024-02-15 at 17:43 +0100, Jan Beulich wrote:
> > > On 15.02.2024 17:38, Oleksii wrote:
> > > > On Tue, 2024-02-13 at 14:33 +0100, Jan Beulich wrote:
> > > > > On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > > > > > +	depends on LLD_VERSION >= 150000 || LD_VERSION >=
> > > > > > 23600
> > > > > 
> > > > > What's the linker dependency here? Depending on the answer I
> > > > > might
> > > > > further
> > > > > ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_
> > > > > or
> > > > > HAS_AS_.
> > > > I missed to introduce {L}LLD_VERSION config. It should output
> > > > from
> > > > the
> > > > command:
> > > >   riscv64-linux-gnu-ld --version
> > > 
> > > Doesn't answer my question though where the linker version
> > > matters
> > > here.
> > Then I misinterpreted your initial question.
> > Could you please provide further clarification or rephrase it for
> > better understanding?
> > 
> > Probably, your question was about why linker dependency is needed
> > here,
> > then
> > it is not sufficient to check if a toolchain supports a particular 
> > extension without checking if the linker supports that extension   
> > too.
> > For example, Clang 15 supports Zihintpause but GNU bintutils
> > 2.35.2 does not, leading build errors like so:
> >     
> >    riscv64-linux-gnu-ld: -march=rv64i_zihintpause2p0: Invalid or
> >    unknown z ISA extension: 'zihintpause'
> 
> Hmm, that's certainly "interesting" behavior of the RISC-V linker.
> Yet
> isn't the linker capability expected to be tied to that of gas? I
> would
> find it far more natural if a gas dependency existed here. If such a
> connection cannot be taken for granted, I'm pretty sure you'd need to
> probe both then anyway.

Wouldn't it be enough in this case instead of introducing of new
configs and etc, just to do the following:
   +ifeq ($(CONFIG_RISCV_64),y)
   +has_zihintpause = $(call as-insn,$(CC) -mabi=lp64 -
   march=rv64i_zihintpause, "pause",_zihintpause,)
   +else
   +has_zihintpause = $(call as-insn,$(CC) -mabi=ilp32 -
   march=rv32i_zihintpause, "pause",_zihintpause,)
   +endif
   +
    riscv-march-$(CONFIG_RISCV_ISA_RV64G) := rv64g
    riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
   -riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-
   y)_zihintpause
    
    # Note that -mcmodel=medany is used so that Xen can be mapped
    # into the upper half _or_ the lower half of the address space.
    # -mcmodel=medlow would force Xen into the lower half.
    
   -CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
   +CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align
   -
   mcmodel=medany

Probably, it would be better:
   ...
   +CFLAGS += -march=$(riscv-march-y)$(call or,$(has_zihintpause)) -
   mstrict-align -
   mcmodel=medany


~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h to build full Xen
  2024-02-23 17:00             ` Oleksii
@ 2024-02-26  7:26               ` Jan Beulich
  0 siblings, 0 replies; 107+ messages in thread
From: Jan Beulich @ 2024-02-26  7:26 UTC (permalink / raw)
  To: Oleksii
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Alistair Francis, Bob Eshleman, Connor Davis, xen-devel

On 23.02.2024 18:00, Oleksii wrote:
> On Mon, 2024-02-19 at 09:06 +0100, Jan Beulich wrote:
>> On 16.02.2024 12:16, Oleksii wrote:
>>> On Thu, 2024-02-15 at 17:43 +0100, Jan Beulich wrote:
>>>> On 15.02.2024 17:38, Oleksii wrote:
>>>>> On Tue, 2024-02-13 at 14:33 +0100, Jan Beulich wrote:
>>>>>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>>>>>> +	depends on LLD_VERSION >= 150000 || LD_VERSION >=
>>>>>>> 23600
>>>>>>
>>>>>> What's the linker dependency here? Depending on the answer I
>>>>>> might
>>>>>> further
>>>>>> ask why "TOOLCHAIN" when elsewhere we use CC_HAS_ or HAS_CC_
>>>>>> or
>>>>>> HAS_AS_.
>>>>> I missed to introduce {L}LLD_VERSION config. It should output
>>>>> from
>>>>> the
>>>>> command:
>>>>>   riscv64-linux-gnu-ld --version
>>>>
>>>> Doesn't answer my question though where the linker version
>>>> matters
>>>> here.
>>> Then I misinterpreted your initial question.
>>> Could you please provide further clarification or rephrase it for
>>> better understanding?
>>>
>>> Probably, your question was about why linker dependency is needed
>>> here,
>>> then
>>> it is not sufficient to check if a toolchain supports a particular 
>>> extension without checking if the linker supports that extension   
>>> too.
>>> For example, Clang 15 supports Zihintpause but GNU bintutils
>>> 2.35.2 does not, leading build errors like so:
>>>     
>>>    riscv64-linux-gnu-ld: -march=rv64i_zihintpause2p0: Invalid or
>>>    unknown z ISA extension: 'zihintpause'
>>
>> Hmm, that's certainly "interesting" behavior of the RISC-V linker.
>> Yet
>> isn't the linker capability expected to be tied to that of gas? I
>> would
>> find it far more natural if a gas dependency existed here. If such a
>> connection cannot be taken for granted, I'm pretty sure you'd need to
>> probe both then anyway.
> 
> Wouldn't it be enough in this case instead of introducing of new
> configs and etc, just to do the following:
>    +ifeq ($(CONFIG_RISCV_64),y)
>    +has_zihintpause = $(call as-insn,$(CC) -mabi=lp64 -
>    march=rv64i_zihintpause, "pause",_zihintpause,)
>    +else
>    +has_zihintpause = $(call as-insn,$(CC) -mabi=ilp32 -
>    march=rv32i_zihintpause, "pause",_zihintpause,)
>    +endif
>    +
>     riscv-march-$(CONFIG_RISCV_ISA_RV64G) := rv64g
>     riscv-march-$(CONFIG_RISCV_ISA_C)       := $(riscv-march-y)c
>    -riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE) := $(riscv-march-
>    y)_zihintpause
>     
>     # Note that -mcmodel=medany is used so that Xen can be mapped
>     # into the upper half _or_ the lower half of the address space.
>     # -mcmodel=medlow would force Xen into the lower half.
>     
>    -CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
>    +CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align
>    -
>    mcmodel=medany

Yes, this is kind of what I'd expect.

> Probably, it would be better:
>    ...
>    +CFLAGS += -march=$(riscv-march-y)$(call or,$(has_zihintpause)) -
>    mstrict-align -
>    mcmodel=medany

Why the use of "or"? IOW right now I don't see what's "better" here.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-23 12:23             ` Oleksii
@ 2024-02-26  9:45               ` Jan Beulich
  2024-02-26 11:18                 ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-26  9:45 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 23.02.2024 13:23, Oleksii wrote:
>>
>>>>> As 1- and 2-byte cases are emulated I decided that is not to
>>>>> provide
>>>>> sfx argument for emulation macros as it will not have to much
>>>>> affect on
>>>>> emulated types and just consume more performance on acquire and
>>>>> release
>>>>> version of sc/ld instructions.
>>>>
>>>> Question is whether the common case (4- and 8-byte accesses)
>>>> shouldn't
>>>> be valued higher, with 1- and 2-byte emulation being there just
>>>> to
>>>> allow things to not break altogether.
>>> If I understand you correctly, it would make sense to add the 'sfx'
>>> argument for the 1/2-byte access case, ensuring that all options
>>> are
>>> available for 1/2-byte access case as well.
>>
>> That's one of the possibilities. As said, I'm not overly worried
>> about
>> the emulated cases. For the initial implementation I'd recommend
>> going
>> with what is easiest there, yielding the best possible result for the
>> 4- and 8-byte cases. If later it turns out repeated acquire/release
>> accesses are a problem in the emulation loop, things can be changed
>> to explicit barriers, without touching the 4- and 8-byte cases.
> I am confused then a little bit if emulated case is not an issue.
> 
> For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and aqcuire
> version of xchg barries are used.
> 
> The similar is done for cmpxchg.
> 
> If something will be needed to change in emulation loop it won't
> require to change 4- and 8-byte cases.

I'm afraid I don't understand your reply.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-26  9:45               ` Jan Beulich
@ 2024-02-26 11:18                 ` Oleksii
  2024-02-26 11:28                   ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-26 11:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
> On 23.02.2024 13:23, Oleksii wrote:
> > > 
> > > > > > As 1- and 2-byte cases are emulated I decided that is not
> > > > > > to
> > > > > > provide
> > > > > > sfx argument for emulation macros as it will not have to
> > > > > > much
> > > > > > affect on
> > > > > > emulated types and just consume more performance on acquire
> > > > > > and
> > > > > > release
> > > > > > version of sc/ld instructions.
> > > > > 
> > > > > Question is whether the common case (4- and 8-byte accesses)
> > > > > shouldn't
> > > > > be valued higher, with 1- and 2-byte emulation being there
> > > > > just
> > > > > to
> > > > > allow things to not break altogether.
> > > > If I understand you correctly, it would make sense to add the
> > > > 'sfx'
> > > > argument for the 1/2-byte access case, ensuring that all
> > > > options
> > > > are
> > > > available for 1/2-byte access case as well.
> > > 
> > > That's one of the possibilities. As said, I'm not overly worried
> > > about
> > > the emulated cases. For the initial implementation I'd recommend
> > > going
> > > with what is easiest there, yielding the best possible result for
> > > the
> > > 4- and 8-byte cases. If later it turns out repeated
> > > acquire/release
> > > accesses are a problem in the emulation loop, things can be
> > > changed
> > > to explicit barriers, without touching the 4- and 8-byte cases.
> > I am confused then a little bit if emulated case is not an issue.
> > 
> > For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and
> > aqcuire
> > version of xchg barries are used.
> > 
> > The similar is done for cmpxchg.
> > 
> > If something will be needed to change in emulation loop it won't
> > require to change 4- and 8-byte cases.
> 
> I'm afraid I don't understand your reply.
IIUC, emulated cases it is implemented correctly in terms of usage
barriers. And it also OK not to use sfx for lr/sc instructions and use
only barriers.

For 4- and 8-byte cases are used sfx + barrier depending on the
specific case ( relaxed, acquire, release, generic xchg/cmpxchg ).
What also looks to me correct. But you suggested to provide the best
possible result for 4- and 8-byte cases. 

So I don't understand what the best possible result is as the current
one usage of __{cmp}xchg_generic for each specific case  ( relaxed,
acquire, release, generic xchg/cmpxchg ) looks correct to me:
xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
barriers.
xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only release
barrier
xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire
barrier
xchg_relaxed ->  (..., "", "", "") - no barries, no sfx

The similar for cmpxchg().

~ Oleksii


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-26 11:18                 ` Oleksii
@ 2024-02-26 11:28                   ` Jan Beulich
  2024-02-26 12:58                     ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-26 11:28 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 12:18, Oleksii wrote:
> On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
>> On 23.02.2024 13:23, Oleksii wrote:
>>>>
>>>>>>> As 1- and 2-byte cases are emulated I decided that is not
>>>>>>> to
>>>>>>> provide
>>>>>>> sfx argument for emulation macros as it will not have to
>>>>>>> much
>>>>>>> affect on
>>>>>>> emulated types and just consume more performance on acquire
>>>>>>> and
>>>>>>> release
>>>>>>> version of sc/ld instructions.
>>>>>>
>>>>>> Question is whether the common case (4- and 8-byte accesses)
>>>>>> shouldn't
>>>>>> be valued higher, with 1- and 2-byte emulation being there
>>>>>> just
>>>>>> to
>>>>>> allow things to not break altogether.
>>>>> If I understand you correctly, it would make sense to add the
>>>>> 'sfx'
>>>>> argument for the 1/2-byte access case, ensuring that all
>>>>> options
>>>>> are
>>>>> available for 1/2-byte access case as well.
>>>>
>>>> That's one of the possibilities. As said, I'm not overly worried
>>>> about
>>>> the emulated cases. For the initial implementation I'd recommend
>>>> going
>>>> with what is easiest there, yielding the best possible result for
>>>> the
>>>> 4- and 8-byte cases. If later it turns out repeated
>>>> acquire/release
>>>> accesses are a problem in the emulation loop, things can be
>>>> changed
>>>> to explicit barriers, without touching the 4- and 8-byte cases.
>>> I am confused then a little bit if emulated case is not an issue.
>>>
>>> For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and
>>> aqcuire
>>> version of xchg barries are used.
>>>
>>> The similar is done for cmpxchg.
>>>
>>> If something will be needed to change in emulation loop it won't
>>> require to change 4- and 8-byte cases.
>>
>> I'm afraid I don't understand your reply.
> IIUC, emulated cases it is implemented correctly in terms of usage
> barriers. And it also OK not to use sfx for lr/sc instructions and use
> only barriers.
> 
> For 4- and 8-byte cases are used sfx + barrier depending on the
> specific case ( relaxed, acquire, release, generic xchg/cmpxchg ).
> What also looks to me correct. But you suggested to provide the best
> possible result for 4- and 8-byte cases. 
> 
> So I don't understand what the best possible result is as the current
> one usage of __{cmp}xchg_generic for each specific case  ( relaxed,
> acquire, release, generic xchg/cmpxchg ) looks correct to me:
> xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
> barriers.
> xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only release
> barrier
> xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire
> barrier
> xchg_relaxed ->  (..., "", "", "") - no barries, no sfx

So first: While explicit barriers are technically okay, I don't follow why
you insist on using them when you can achieve the same by suitably tagging
the actual insn doing the exchange. Then second: It's somewhat hard for me
to see the final effect on the emulation paths without you actually having
done the switch. Maybe no special handling is necessary there anymore
then. And as said, it may actually be acceptable for the emulation paths
to "only" be correct, but not be ideal in terms of performance. After all,
if you use the normal 4-byte primitive in there, more (non-explicit)
barriers than needed would occur if the involved loop has to take more
than one iteration. Which could (but imo doesn't need to be) avoided by
using a more relaxed 4-byte primitive there and an explicit barrier
outside of the loop.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-26 11:28                   ` Jan Beulich
@ 2024-02-26 12:58                     ` Oleksii
  2024-02-26 14:20                       ` Jan Beulich
  0 siblings, 1 reply; 107+ messages in thread
From: Oleksii @ 2024-02-26 12:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-26 at 12:28 +0100, Jan Beulich wrote:
> On 26.02.2024 12:18, Oleksii wrote:
> > On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
> > > On 23.02.2024 13:23, Oleksii wrote:
> > > > > 
> > > > > > > > As 1- and 2-byte cases are emulated I decided that is
> > > > > > > > not
> > > > > > > > to
> > > > > > > > provide
> > > > > > > > sfx argument for emulation macros as it will not have
> > > > > > > > to
> > > > > > > > much
> > > > > > > > affect on
> > > > > > > > emulated types and just consume more performance on
> > > > > > > > acquire
> > > > > > > > and
> > > > > > > > release
> > > > > > > > version of sc/ld instructions.
> > > > > > > 
> > > > > > > Question is whether the common case (4- and 8-byte
> > > > > > > accesses)
> > > > > > > shouldn't
> > > > > > > be valued higher, with 1- and 2-byte emulation being
> > > > > > > there
> > > > > > > just
> > > > > > > to
> > > > > > > allow things to not break altogether.
> > > > > > If I understand you correctly, it would make sense to add
> > > > > > the
> > > > > > 'sfx'
> > > > > > argument for the 1/2-byte access case, ensuring that all
> > > > > > options
> > > > > > are
> > > > > > available for 1/2-byte access case as well.
> > > > > 
> > > > > That's one of the possibilities. As said, I'm not overly
> > > > > worried
> > > > > about
> > > > > the emulated cases. For the initial implementation I'd
> > > > > recommend
> > > > > going
> > > > > with what is easiest there, yielding the best possible result
> > > > > for
> > > > > the
> > > > > 4- and 8-byte cases. If later it turns out repeated
> > > > > acquire/release
> > > > > accesses are a problem in the emulation loop, things can be
> > > > > changed
> > > > > to explicit barriers, without touching the 4- and 8-byte
> > > > > cases.
> > > > I am confused then a little bit if emulated case is not an
> > > > issue.
> > > > 
> > > > For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and
> > > > aqcuire
> > > > version of xchg barries are used.
> > > > 
> > > > The similar is done for cmpxchg.
> > > > 
> > > > If something will be needed to change in emulation loop it
> > > > won't
> > > > require to change 4- and 8-byte cases.
> > > 
> > > I'm afraid I don't understand your reply.
> > IIUC, emulated cases it is implemented correctly in terms of usage
> > barriers. And it also OK not to use sfx for lr/sc instructions and
> > use
> > only barriers.
> > 
> > For 4- and 8-byte cases are used sfx + barrier depending on the
> > specific case ( relaxed, acquire, release, generic xchg/cmpxchg ).
> > What also looks to me correct. But you suggested to provide the
> > best
> > possible result for 4- and 8-byte cases. 
> > 
> > So I don't understand what the best possible result is as the
> > current
> > one usage of __{cmp}xchg_generic for each specific case  ( relaxed,
> > acquire, release, generic xchg/cmpxchg ) looks correct to me:
> > xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
> > barriers.
> > xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only
> > release
> > barrier
> > xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire
> > barrier
> > xchg_relaxed ->  (..., "", "", "") - no barries, no sfx
> 
> So first: While explicit barriers are technically okay, I don't
> follow why
> you insist on using them when you can achieve the same by suitably
> tagging
> the actual insn doing the exchange. Then second: It's somewhat hard
> for me
> to see the final effect on the emulation paths without you actually
> having
> done the switch. Maybe no special handling is necessary there anymore
> then. And as said, it may actually be acceptable for the emulation
> paths
> to "only" be correct, but not be ideal in terms of performance. After
> all,
> if you use the normal 4-byte primitive in there, more (non-explicit)
> barriers than needed would occur if the involved loop has to take
> more
> than one iteration. Which could (but imo doesn't need to be) avoided
> by
> using a more relaxed 4-byte primitive there and an explicit barrier
> outside of the loop.

According to the spec:
Table A.5 ( part of the table only I copied here )

Linux Construct          RVWMO Mapping
atomic <op> relaxed           amo<op>.{w|d}
atomic <op> acquire           amo<op>.{w|d}.aq
atomic <op> release           amo<op>.{w|d}.rl
atomic <op>                   amo<op>.{w|d}.aqrl

Linux Construct          RVWMO LR/SC Mapping
atomic <op> relaxed       loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
atomic <op> acquire       loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
atomic <op> release       loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ; bnez 
loop OR
                          fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗ ;
bnez loop
atomic <op>               loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez
loop

The Linux mappings for release operations may seem stronger than
necessary, but these mappings
are needed to cover some cases in which Linux requires stronger
orderings than the more intuitive
mappings would provide. In particular, as of the time this text is
being written, Linux is actively
debating whether to require load-load, load-store, and store-store
orderings between accesses in one
critical section and accesses in a subsequent critical section in the
same hart and protected by the
same synchronization object. Not all combinations of FENCE RW,W/FENCE
R,RW mappings
with aq/rl mappings combine to provide such orderings. There are a few
ways around this problem,
including:
1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This suffices
but is undesir-
able, as it defeats the purpose of the aq/rl modifiers.
2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does not
currently work
due to the lack of load and store opcodes with aq and rl modifiers.
3. Strengthen the mappings of release operations such that they would
enforce sufficient order-
ings in the presence of either type of acquire mapping. This is the
currently-recommended
solution, and the one shown in Table A.5.


Based on this it is enough in our case use only suffixed istructions
(amo<op>.{w|d}{.aq, .rl, .aqrl, .aqrl }, lr.{w|d}.{.aq, .aqrl }.


But as far as I understand in Linux atomics were strengthen with
fences:
    Atomics present the same issue with locking: release and acquire
    variants need to be strengthened to meet the constraints defined
    by the Linux-kernel memory consistency model [1].
    
    Atomics present a further issue: implementations of atomics such
    as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
    which do not give full-ordering with .aqrl; for example, current
    implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
    below to end up with the state indicated in the "exists" clause.
    
    In order to "synchronize" LKMM and RISC-V's implementation, this
    commit strengthens the implementations of the atomics operations
    by replacing .rl and .aq with the use of ("lightweigth") fences,
    and by replacing .aqrl LR/SC pairs in sequences such as:
    
      0:      lr.w.aqrl  %0, %addr
              bne        %0, %old, 1f
              ...
              sc.w.aqrl  %1, %new, %addr
              bnez       %1, 0b
      1:
    
    with sequences of the form:
    
      0:      lr.w       %0, %addr
              bne        %0, %old, 1f
              ...
              sc.w.rl    %1, %new, %addr   /* SC-release   */
              bnez       %1, 0b
              fence      rw, rw            /* "full" fence */
      1:
    
    following Daniel's suggestion.
    
    These modifications were validated with simulation of the RISC-V
    with sequences of the form:
    
      0:      lr.w       %0, %addr
              bne        %0, %old, 1f
              ...
              sc.w.rl    %1, %new, %addr   /* SC-release   */
              bnez       %1, 0b
              fence      rw, rw            /* "full" fence */
      1:
    
    following Daniel's suggestion.
    
    These modifications were validated with simulation of the RISC-V
    memory consistency model.
    
    C lr-sc-aqrl-pair-vs-full-barrier
    
    {}
    
    P0(int *x, int *y, atomic_t *u)
    {
            int r0;
            int r1;
    
            WRITE_ONCE(*x, 1);
            r0 = atomic_cmpxchg(u, 0, 1);
            r1 = READ_ONCE(*y);
    }
    
    P1(int *x, int *y, atomic_t *v)
    {
            int r0;
            int r1;
    
            WRITE_ONCE(*y, 1);
            r0 = atomic_cmpxchg(v, 0, 1);
            r1 = READ_ONCE(*x);
    }
    
    exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
    
    [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
     
https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
        https://marc.info/?l=linux-kernel&m=151633436614259&w=2


Thereby Linux kernel implementation seems to me more safe and it is a
reason why I want/wanted to be aligned with it.

~ Oleksii




^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-26 12:58                     ` Oleksii
@ 2024-02-26 14:20                       ` Jan Beulich
  2024-02-26 14:37                         ` Oleksii
  0 siblings, 1 reply; 107+ messages in thread
From: Jan Beulich @ 2024-02-26 14:20 UTC (permalink / raw)
  To: Oleksii
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On 26.02.2024 13:58, Oleksii wrote:
> On Mon, 2024-02-26 at 12:28 +0100, Jan Beulich wrote:
>> On 26.02.2024 12:18, Oleksii wrote:
>>> On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
>>>> On 23.02.2024 13:23, Oleksii wrote:
>>>>>>
>>>>>>>>> As 1- and 2-byte cases are emulated I decided that is
>>>>>>>>> not
>>>>>>>>> to
>>>>>>>>> provide
>>>>>>>>> sfx argument for emulation macros as it will not have
>>>>>>>>> to
>>>>>>>>> much
>>>>>>>>> affect on
>>>>>>>>> emulated types and just consume more performance on
>>>>>>>>> acquire
>>>>>>>>> and
>>>>>>>>> release
>>>>>>>>> version of sc/ld instructions.
>>>>>>>>
>>>>>>>> Question is whether the common case (4- and 8-byte
>>>>>>>> accesses)
>>>>>>>> shouldn't
>>>>>>>> be valued higher, with 1- and 2-byte emulation being
>>>>>>>> there
>>>>>>>> just
>>>>>>>> to
>>>>>>>> allow things to not break altogether.
>>>>>>> If I understand you correctly, it would make sense to add
>>>>>>> the
>>>>>>> 'sfx'
>>>>>>> argument for the 1/2-byte access case, ensuring that all
>>>>>>> options
>>>>>>> are
>>>>>>> available for 1/2-byte access case as well.
>>>>>>
>>>>>> That's one of the possibilities. As said, I'm not overly
>>>>>> worried
>>>>>> about
>>>>>> the emulated cases. For the initial implementation I'd
>>>>>> recommend
>>>>>> going
>>>>>> with what is easiest there, yielding the best possible result
>>>>>> for
>>>>>> the
>>>>>> 4- and 8-byte cases. If later it turns out repeated
>>>>>> acquire/release
>>>>>> accesses are a problem in the emulation loop, things can be
>>>>>> changed
>>>>>> to explicit barriers, without touching the 4- and 8-byte
>>>>>> cases.
>>>>> I am confused then a little bit if emulated case is not an
>>>>> issue.
>>>>>
>>>>> For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and
>>>>> aqcuire
>>>>> version of xchg barries are used.
>>>>>
>>>>> The similar is done for cmpxchg.
>>>>>
>>>>> If something will be needed to change in emulation loop it
>>>>> won't
>>>>> require to change 4- and 8-byte cases.
>>>>
>>>> I'm afraid I don't understand your reply.
>>> IIUC, emulated cases it is implemented correctly in terms of usage
>>> barriers. And it also OK not to use sfx for lr/sc instructions and
>>> use
>>> only barriers.
>>>
>>> For 4- and 8-byte cases are used sfx + barrier depending on the
>>> specific case ( relaxed, acquire, release, generic xchg/cmpxchg ).
>>> What also looks to me correct. But you suggested to provide the
>>> best
>>> possible result for 4- and 8-byte cases. 
>>>
>>> So I don't understand what the best possible result is as the
>>> current
>>> one usage of __{cmp}xchg_generic for each specific case  ( relaxed,
>>> acquire, release, generic xchg/cmpxchg ) looks correct to me:
>>> xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
>>> barriers.
>>> xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only
>>> release
>>> barrier
>>> xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire
>>> barrier
>>> xchg_relaxed ->  (..., "", "", "") - no barries, no sfx
>>
>> So first: While explicit barriers are technically okay, I don't
>> follow why
>> you insist on using them when you can achieve the same by suitably
>> tagging
>> the actual insn doing the exchange. Then second: It's somewhat hard
>> for me
>> to see the final effect on the emulation paths without you actually
>> having
>> done the switch. Maybe no special handling is necessary there anymore
>> then. And as said, it may actually be acceptable for the emulation
>> paths
>> to "only" be correct, but not be ideal in terms of performance. After
>> all,
>> if you use the normal 4-byte primitive in there, more (non-explicit)
>> barriers than needed would occur if the involved loop has to take
>> more
>> than one iteration. Which could (but imo doesn't need to be) avoided
>> by
>> using a more relaxed 4-byte primitive there and an explicit barrier
>> outside of the loop.
> 
> According to the spec:
> Table A.5 ( part of the table only I copied here )
> 
> Linux Construct          RVWMO Mapping
> atomic <op> relaxed           amo<op>.{w|d}
> atomic <op> acquire           amo<op>.{w|d}.aq
> atomic <op> release           amo<op>.{w|d}.rl
> atomic <op>                   amo<op>.{w|d}.aqrl
> 
> Linux Construct          RVWMO LR/SC Mapping
> atomic <op> relaxed       loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> atomic <op> acquire       loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
> atomic <op> release       loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ; bnez 
> loop OR
>                           fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗ ;
> bnez loop
> atomic <op>               loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez
> loop

In your consideration what to implement you will want to limit
things to constructs we actually use. I can't find any use of the
relaxed, acquire, or release forms of atomics as mentioned above.

> The Linux mappings for release operations may seem stronger than
> necessary, but these mappings
> are needed to cover some cases in which Linux requires stronger
> orderings than the more intuitive
> mappings would provide. In particular, as of the time this text is
> being written, Linux is actively
> debating whether to require load-load, load-store, and store-store
> orderings between accesses in one
> critical section and accesses in a subsequent critical section in the
> same hart and protected by the
> same synchronization object. Not all combinations of FENCE RW,W/FENCE
> R,RW mappings
> with aq/rl mappings combine to provide such orderings. There are a few
> ways around this problem,
> including:
> 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This suffices
> but is undesir-
> able, as it defeats the purpose of the aq/rl modifiers.
> 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does not
> currently work
> due to the lack of load and store opcodes with aq and rl modifiers.

I don't understand this point: Which specific load and/or store forms
are missing? According to my reading of the A extension spec all
combination of aq/rl exist with both lr and sc.

> 3. Strengthen the mappings of release operations such that they would
> enforce sufficient order-
> ings in the presence of either type of acquire mapping. This is the
> currently-recommended
> solution, and the one shown in Table A.5.
> 
> 
> Based on this it is enough in our case use only suffixed istructions
> (amo<op>.{w|d}{.aq, .rl, .aqrl, .aqrl }, lr.{w|d}.{.aq, .aqrl }.
> 
> 
> But as far as I understand in Linux atomics were strengthen with
> fences:
>     Atomics present the same issue with locking: release and acquire
>     variants need to be strengthened to meet the constraints defined
>     by the Linux-kernel memory consistency model [1].
>     
>     Atomics present a further issue: implementations of atomics such
>     as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
>     which do not give full-ordering with .aqrl; for example, current
>     implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
>     below to end up with the state indicated in the "exists" clause.
>     
>     In order to "synchronize" LKMM and RISC-V's implementation, this
>     commit strengthens the implementations of the atomics operations
>     by replacing .rl and .aq with the use of ("lightweigth") fences,
>     and by replacing .aqrl LR/SC pairs in sequences such as:
>     
>       0:      lr.w.aqrl  %0, %addr
>               bne        %0, %old, 1f
>               ...
>               sc.w.aqrl  %1, %new, %addr
>               bnez       %1, 0b
>       1:
>     
>     with sequences of the form:
>     
>       0:      lr.w       %0, %addr
>               bne        %0, %old, 1f
>               ...
>               sc.w.rl    %1, %new, %addr   /* SC-release   */
>               bnez       %1, 0b
>               fence      rw, rw            /* "full" fence */
>       1:
>     
>     following Daniel's suggestion.

I'm likely missing something, yet as it looks it does help that the
code fragment above appears to be ...

>     These modifications were validated with simulation of the RISC-V
>     with sequences of the form:
>     
>       0:      lr.w       %0, %addr
>               bne        %0, %old, 1f
>               ...
>               sc.w.rl    %1, %new, %addr   /* SC-release   */
>               bnez       %1, 0b
>               fence      rw, rw            /* "full" fence */
>       1:
>     
>     following Daniel's suggestion.

... entirely the same as this one. Yet there's presumably a reason
for quoting it twice?

>     These modifications were validated with simulation of the RISC-V
>     memory consistency model.
>     
>     C lr-sc-aqrl-pair-vs-full-barrier
>     
>     {}
>     
>     P0(int *x, int *y, atomic_t *u)
>     {
>             int r0;
>             int r1;
>     
>             WRITE_ONCE(*x, 1);
>             r0 = atomic_cmpxchg(u, 0, 1);
>             r1 = READ_ONCE(*y);
>     }
>     
>     P1(int *x, int *y, atomic_t *v)
>     {
>             int r0;
>             int r1;
>     
>             WRITE_ONCE(*y, 1);
>             r0 = atomic_cmpxchg(v, 0, 1);
>             r1 = READ_ONCE(*x);
>     }
>     
>     exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
>     
>     [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
>      
> https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
>         https://marc.info/?l=linux-kernel&m=151633436614259&w=2
> 
> 
> Thereby Linux kernel implementation seems to me more safe and it is a
> reason why I want/wanted to be aligned with it.

Which may end up being okay. I hope you realize though that there's a
lot more explanation needed in the respective commits then compared to
what you've had so far. As a minimum, absolutely anything remotely
unexpected needs to be explained.

Jan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h
  2024-02-26 14:20                       ` Jan Beulich
@ 2024-02-26 14:37                         ` Oleksii
  0 siblings, 0 replies; 107+ messages in thread
From: Oleksii @ 2024-02-26 14:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Alistair Francis, Bob Eshleman, Connor Davis, Andrew Cooper,
	George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu,
	xen-devel

On Mon, 2024-02-26 at 15:20 +0100, Jan Beulich wrote:
> On 26.02.2024 13:58, Oleksii wrote:
> > On Mon, 2024-02-26 at 12:28 +0100, Jan Beulich wrote:
> > > On 26.02.2024 12:18, Oleksii wrote:
> > > > On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
> > > > > On 23.02.2024 13:23, Oleksii wrote:
> > > > > > > 
> > > > > > > > > > As 1- and 2-byte cases are emulated I decided that
> > > > > > > > > > is
> > > > > > > > > > not
> > > > > > > > > > to
> > > > > > > > > > provide
> > > > > > > > > > sfx argument for emulation macros as it will not
> > > > > > > > > > have
> > > > > > > > > > to
> > > > > > > > > > much
> > > > > > > > > > affect on
> > > > > > > > > > emulated types and just consume more performance on
> > > > > > > > > > acquire
> > > > > > > > > > and
> > > > > > > > > > release
> > > > > > > > > > version of sc/ld instructions.
> > > > > > > > > 
> > > > > > > > > Question is whether the common case (4- and 8-byte
> > > > > > > > > accesses)
> > > > > > > > > shouldn't
> > > > > > > > > be valued higher, with 1- and 2-byte emulation being
> > > > > > > > > there
> > > > > > > > > just
> > > > > > > > > to
> > > > > > > > > allow things to not break altogether.
> > > > > > > > If I understand you correctly, it would make sense to
> > > > > > > > add
> > > > > > > > the
> > > > > > > > 'sfx'
> > > > > > > > argument for the 1/2-byte access case, ensuring that
> > > > > > > > all
> > > > > > > > options
> > > > > > > > are
> > > > > > > > available for 1/2-byte access case as well.
> > > > > > > 
> > > > > > > That's one of the possibilities. As said, I'm not overly
> > > > > > > worried
> > > > > > > about
> > > > > > > the emulated cases. For the initial implementation I'd
> > > > > > > recommend
> > > > > > > going
> > > > > > > with what is easiest there, yielding the best possible
> > > > > > > result
> > > > > > > for
> > > > > > > the
> > > > > > > 4- and 8-byte cases. If later it turns out repeated
> > > > > > > acquire/release
> > > > > > > accesses are a problem in the emulation loop, things can
> > > > > > > be
> > > > > > > changed
> > > > > > > to explicit barriers, without touching the 4- and 8-byte
> > > > > > > cases.
> > > > > > I am confused then a little bit if emulated case is not an
> > > > > > issue.
> > > > > > 
> > > > > > For 4- and 8-byte cases for xchg .aqrl is used, for relaxed
> > > > > > and
> > > > > > aqcuire
> > > > > > version of xchg barries are used.
> > > > > > 
> > > > > > The similar is done for cmpxchg.
> > > > > > 
> > > > > > If something will be needed to change in emulation loop it
> > > > > > won't
> > > > > > require to change 4- and 8-byte cases.
> > > > > 
> > > > > I'm afraid I don't understand your reply.
> > > > IIUC, emulated cases it is implemented correctly in terms of
> > > > usage
> > > > barriers. And it also OK not to use sfx for lr/sc instructions
> > > > and
> > > > use
> > > > only barriers.
> > > > 
> > > > For 4- and 8-byte cases are used sfx + barrier depending on the
> > > > specific case ( relaxed, acquire, release, generic xchg/cmpxchg
> > > > ).
> > > > What also looks to me correct. But you suggested to provide the
> > > > best
> > > > possible result for 4- and 8-byte cases. 
> > > > 
> > > > So I don't understand what the best possible result is as the
> > > > current
> > > > one usage of __{cmp}xchg_generic for each specific case  (
> > > > relaxed,
> > > > acquire, release, generic xchg/cmpxchg ) looks correct to me:
> > > > xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
> > > > barriers.
> > > > xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only
> > > > release
> > > > barrier
> > > > xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only
> > > > acquire
> > > > barrier
> > > > xchg_relaxed ->  (..., "", "", "") - no barries, no sfx
> > > 
> > > So first: While explicit barriers are technically okay, I don't
> > > follow why
> > > you insist on using them when you can achieve the same by
> > > suitably
> > > tagging
> > > the actual insn doing the exchange. Then second: It's somewhat
> > > hard
> > > for me
> > > to see the final effect on the emulation paths without you
> > > actually
> > > having
> > > done the switch. Maybe no special handling is necessary there
> > > anymore
> > > then. And as said, it may actually be acceptable for the
> > > emulation
> > > paths
> > > to "only" be correct, but not be ideal in terms of performance.
> > > After
> > > all,
> > > if you use the normal 4-byte primitive in there, more (non-
> > > explicit)
> > > barriers than needed would occur if the involved loop has to take
> > > more
> > > than one iteration. Which could (but imo doesn't need to be)
> > > avoided
> > > by
> > > using a more relaxed 4-byte primitive there and an explicit
> > > barrier
> > > outside of the loop.
> > 
> > According to the spec:
> > Table A.5 ( part of the table only I copied here )
> > 
> > Linux Construct          RVWMO Mapping
> > atomic <op> relaxed           amo<op>.{w|d}
> > atomic <op> acquire           amo<op>.{w|d}.aq
> > atomic <op> release           amo<op>.{w|d}.rl
> > atomic <op>                   amo<op>.{w|d}.aqrl
> > 
> > Linux Construct          RVWMO LR/SC Mapping
> > atomic <op> relaxed       loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> > atomic <op> acquire       loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez
> > loop
> > atomic <op> release       loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗ ;
> > bnez 
> > loop OR
> >                           fence.tso; loop: lr.{w|d}; <op>;
> > sc.{w|d}∗ ;
> > bnez loop
> > atomic <op>               loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl;
> > bnez
> > loop
> 
> In your consideration what to implement you will want to limit
> things to constructs we actually use. I can't find any use of the
> relaxed, acquire, or release forms of atomics as mentioned above.
> 
> > The Linux mappings for release operations may seem stronger than
> > necessary, but these mappings
> > are needed to cover some cases in which Linux requires stronger
> > orderings than the more intuitive
> > mappings would provide. In particular, as of the time this text is
> > being written, Linux is actively
> > debating whether to require load-load, load-store, and store-store
> > orderings between accesses in one
> > critical section and accesses in a subsequent critical section in
> > the
> > same hart and protected by the
> > same synchronization object. Not all combinations of FENCE
> > RW,W/FENCE
> > R,RW mappings
> > with aq/rl mappings combine to provide such orderings. There are a
> > few
> > ways around this problem,
> > including:
> > 1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This
> > suffices
> > but is undesir-
> > able, as it defeats the purpose of the aq/rl modifiers.
> > 2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does
> > not
> > currently work
> > due to the lack of load and store opcodes with aq and rl modifiers.
> 
> I don't understand this point: Which specific load and/or store forms
> are missing? According to my reading of the A extension spec all
> combination of aq/rl exist with both lr and sc.
I think this is not about lr and sc instructions.
It is about l{b|h|w|d} and s{b|h|w|d}, which should be used with fence
in case of acquire and seq_cst.

> 
> > 3. Strengthen the mappings of release operations such that they
> > would
> > enforce sufficient order-
> > ings in the presence of either type of acquire mapping. This is the
> > currently-recommended
> > solution, and the one shown in Table A.5.
> > 
> > 
> > Based on this it is enough in our case use only suffixed
> > istructions
> > (amo<op>.{w|d}{.aq, .rl, .aqrl, .aqrl }, lr.{w|d}.{.aq, .aqrl }.
> > 
> > 
> > But as far as I understand in Linux atomics were strengthen with
> > fences:
> >     Atomics present the same issue with locking: release and
> > acquire
> >     variants need to be strengthened to meet the constraints
> > defined
> >     by the Linux-kernel memory consistency model [1].
> >     
> >     Atomics present a further issue: implementations of atomics
> > such
> >     as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC
> > pairs,
> >     which do not give full-ordering with .aqrl; for example,
> > current
> >     implementations allow the "lr-sc-aqrl-pair-vs-full-barrier"
> > test
> >     below to end up with the state indicated in the "exists"
> > clause.
> >     
> >     In order to "synchronize" LKMM and RISC-V's implementation,
> > this
> >     commit strengthens the implementations of the atomics
> > operations
> >     by replacing .rl and .aq with the use of ("lightweigth")
> > fences,
> >     and by replacing .aqrl LR/SC pairs in sequences such as:
> >     
> >       0:      lr.w.aqrl  %0, %addr
> >               bne        %0, %old, 1f
> >               ...
> >               sc.w.aqrl  %1, %new, %addr
> >               bnez       %1, 0b
> >       1:
> >     
> >     with sequences of the form:
> >     
> >       0:      lr.w       %0, %addr
> >               bne        %0, %old, 1f
> >               ...
> >               sc.w.rl    %1, %new, %addr   /* SC-release   */
> >               bnez       %1, 0b
> >               fence      rw, rw            /* "full" fence */
> >       1:
> >     
> >     following Daniel's suggestion.
> 
> I'm likely missing something, yet as it looks it does help that the
> code fragment above appears to be ...
> 
> >     These modifications were validated with simulation of the RISC-
> > V
> >     with sequences of the form:
> >     
> >       0:      lr.w       %0, %addr
> >               bne        %0, %old, 1f
> >               ...
> >               sc.w.rl    %1, %new, %addr   /* SC-release   */
> >               bnez       %1, 0b
> >               fence      rw, rw            /* "full" fence */
> >       1:
> >     
> >     following Daniel's suggestion.
> 
> ... entirely the same as this one. Yet there's presumably a reason
> for quoting it twice?
I think it was done by accident

~ Oleksii
> 
> >     These modifications were validated with simulation of the RISC-
> > V
> >     memory consistency model.
> >     
> >     C lr-sc-aqrl-pair-vs-full-barrier
> >     
> >     {}
> >     
> >     P0(int *x, int *y, atomic_t *u)
> >     {
> >             int r0;
> >             int r1;
> >     
> >             WRITE_ONCE(*x, 1);
> >             r0 = atomic_cmpxchg(u, 0, 1);
> >             r1 = READ_ONCE(*y);
> >     }
> >     
> >     P1(int *x, int *y, atomic_t *v)
> >     {
> >             int r0;
> >             int r1;
> >     
> >             WRITE_ONCE(*y, 1);
> >             r0 = atomic_cmpxchg(v, 0, 1);
> >             r1 = READ_ONCE(*x);
> >     }
> >     
> >     exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
> >     
> >     [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
> >      
> > https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
> >         https://marc.info/?l=linux-kernel&m=151633436614259&w=2
> > 
> > 
> > Thereby Linux kernel implementation seems to me more safe and it is
> > a
> > reason why I want/wanted to be aligned with it.
> 
> Which may end up being okay. I hope you realize though that there's a
> lot more explanation needed in the respective commits then compared
> to
> what you've had so far. As a minimum, absolutely anything remotely
> unexpected needs to be explained.
> 
> Jan



^ permalink raw reply	[flat|nested] 107+ messages in thread

end of thread, other threads:[~2024-02-26 14:38 UTC | newest]

Thread overview: 107+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-05 15:32 [PATCH v4 00/30] Enable build of full Xen for RISC-V Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 01/30] xen/riscv: disable unnecessary configs Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 02/30] xen/riscv: use some asm-generic headers Oleksii Kurochko
2024-02-12 15:03   ` Jan Beulich
2024-02-14  9:54     ` Oleksii
2024-02-14 10:03       ` Jan Beulich
2024-02-20 18:57         ` Oleksii
2024-02-05 15:32 ` [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V Oleksii Kurochko
2024-02-12 15:05   ` Jan Beulich
2024-02-14  9:57     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 04/30] xen/riscv: introduce cpufeature.h Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h Oleksii Kurochko
2024-02-12 15:07   ` Jan Beulich
2024-02-14 10:01     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h Oleksii Kurochko
2024-02-12 15:10   ` Jan Beulich
2024-02-14 10:05     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 07/30] xen/asm-generic: introdure nospec.h Oleksii Kurochko
2024-02-18 18:30   ` Julien Grall
2024-02-19 11:59     ` Oleksii
2024-02-19 12:18       ` Jan Beulich
2024-02-20 20:30         ` Oleksii
2024-02-21 11:00           ` Jan Beulich
2024-02-21 12:47             ` Oleksii
2024-02-21 14:07               ` Julien Grall
2024-02-21 14:58               ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 08/30] xen/riscv: introduce setup.h Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 09/30] xen/riscv: introduce bitops.h Oleksii Kurochko
2024-02-12 15:58   ` Jan Beulich
2024-02-14 11:06     ` Oleksii
2024-02-14 11:20       ` Jan Beulich
2024-02-13  9:19   ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 10/30] xen/riscv: introduce flushtlb.h Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 11/30] xen/riscv: introduce smp.h Oleksii Kurochko
2024-02-12 15:13   ` Jan Beulich
2024-02-14 11:06     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h Oleksii Kurochko
2024-02-13 10:37   ` Jan Beulich
2024-02-15 13:41     ` Oleksii
2024-02-19 11:22       ` Jan Beulich
2024-02-19 14:29         ` Oleksii
2024-02-19 15:01           ` Jan Beulich
2024-02-23 12:23             ` Oleksii
2024-02-26  9:45               ` Jan Beulich
2024-02-26 11:18                 ` Oleksii
2024-02-26 11:28                   ` Jan Beulich
2024-02-26 12:58                     ` Oleksii
2024-02-26 14:20                       ` Jan Beulich
2024-02-26 14:37                         ` Oleksii
2024-02-18 19:00   ` Julien Grall
2024-02-19 14:00     ` Oleksii
2024-02-19 14:12       ` Jan Beulich
2024-02-19 15:20         ` Oleksii
2024-02-19 15:29           ` Jan Beulich
2024-02-19 14:25       ` Julien Grall
2024-02-05 15:32 ` [PATCH v4 13/30] xen/riscv: introduce io.h Oleksii Kurochko
2024-02-13 11:05   ` Jan Beulich
2024-02-14 11:34     ` Oleksii
2024-02-18 19:07   ` Julien Grall
2024-02-19 14:32     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 14/30] xen/riscv: introduce atomic.h Oleksii Kurochko
2024-02-13 11:36   ` Jan Beulich
2024-02-14 12:11     ` Oleksii
2024-02-14 13:09       ` Jan Beulich
2024-02-18 19:22   ` Julien Grall
2024-02-19 14:35     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 15/30] xen/riscv: introduce irq.h Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 16/30] xen/riscv: introduce p2m.h Oleksii Kurochko
2024-02-12 15:16   ` Jan Beulich
2024-02-14 12:12     ` Oleksii
2024-02-18 18:18   ` Julien Grall
2024-02-05 15:32 ` [PATCH v4 17/30] xen/riscv: introduce regs.h Oleksii Kurochko
2024-02-18 18:22   ` Julien Grall
2024-02-19 14:40     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 18/30] xen/riscv: introduce time.h Oleksii Kurochko
2024-02-12 15:18   ` Jan Beulich
2024-02-14 12:14     ` Oleksii
2024-02-05 15:32 ` [PATCH v4 19/30] xen/riscv: introduce event.h Oleksii Kurochko
2024-02-12 15:20   ` Jan Beulich
2024-02-14 12:16     ` Oleksii
2024-02-14 13:11       ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 20/30] xen/riscv: introduce monitor.h Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 21/30] xen/riscv: add definition of __read_mostly Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 22/30] xen/riscv: define an address of frame table Oleksii Kurochko
2024-02-13 13:07   ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 23/30] xen/riscv: add required things to current.h Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 24/30] xen/riscv: add minimal stuff to page.h to build full Xen Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 25/30] xen/riscv: add minimal stuff to processor.h " Oleksii Kurochko
2024-02-13 13:33   ` Jan Beulich
2024-02-15 16:38     ` Oleksii
2024-02-15 16:43       ` Jan Beulich
2024-02-16 11:16         ` Oleksii
2024-02-19  8:06           ` Jan Beulich
2024-02-23 17:00             ` Oleksii
2024-02-26  7:26               ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 26/30] xen/riscv: add minimal stuff to mm.h " Oleksii Kurochko
2024-02-13 14:19   ` Jan Beulich
2024-02-16 11:03     ` Oleksii
2024-02-19  8:07       ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 27/30] xen/riscv: introduce vm_event_*() functions Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 28/30] xen/rirscv: add minimal amount of stubs to build full Xen Oleksii Kurochko
2024-02-12 15:24   ` Jan Beulich
2024-02-05 15:32 ` [PATCH v4 29/30] xen/riscv: enable full Xen build Oleksii Kurochko
2024-02-05 15:32 ` [PATCH v4 30/30] xen/README: add compiler and binutils versions for RISC-V64 Oleksii Kurochko
2024-02-14  9:52   ` Jan Beulich
2024-02-14 12:21     ` Oleksii
2024-02-14 13:06       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.