* [PATCH v2 00/11] KVM perf support
@ 2022-12-15 17:00 Atish Patra
  2022-12-15 17:00 ` [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count Atish Patra
                   ` (10 more replies)
  0 siblings, 11 replies; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

This series extends perf support for KVM. The KVM implementation relies
on the SBI PMU extension and trap & emulation of the hpmcounter CSRs.
KVM exposes virtual counters to the guest and internally manages them
using kernel perf counters.

This series doesn't support counter overflow yet, as the Sscofpmf extension
doesn't allow a trap & emulation mechanism for the scountovf CSR. The required
changes to allow that are still under discussion. Supporting the overflow
interrupt also requires AIA interrupt filtering support.

1. PATCH 1-4 are generic KVM/PMU driver improvements.
2. PATCH 8 disables hpmcounter access for now. It will be enabled to maintain the
ABI requirement once the ONE_REG interface is settled.

perf stat works in KVM guests with this series.

Here is an example of running perf stat in a guest running on KVM.

===========================================================================
/ # /host/apps/perf stat -e instructions -e cycles -e r8000000000000005 \
> -e r8000000000000006 -e r8000000000000007 -e r8000000000000008 \
> -e r800000000000000a perf bench sched messaging -g 10 -l 10

# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

     Total time: 7.769 [sec]
                 
 Performance counter stats for 'perf bench sched messaging -g 10 -l 10':

       73556259604      cycles
       73387266056      instructions              #    1.00  insn per cycle
                 0      dTLB-store-misses
                 0      iTLB-load-misses
                 0      r8000000000000005
              2595      r8000000000000006
              2272      r8000000000000007
                10      r8000000000000008
                 0      r800000000000000a

      12.173720400 seconds time elapsed

       1.002716000 seconds user
      21.931047000 seconds sys


Note: The SBI_PMU_FW_SET_TIMER count (event id r8000000000000005) is zero
because the KVM guest supports Sstc now.
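
For reference, the raw event IDs above use the SBI firmware event encoding:
bit 63 marks a firmware event and the low bits carry the SBI firmware event
code (the same encoding pmu_get_perf_event_config() uses in PATCH 10). A
rough decode, assuming the standard SBI PMU firmware event codes:

 r8000000000000005   /* code 0x5: SBI_PMU_FW_SET_TIMER     */
 r8000000000000006   /* code 0x6: IPIs sent                */
 r8000000000000007   /* code 0x7: IPIs received            */
 r8000000000000008   /* code 0x8: FENCE.I requests sent    */
 r800000000000000a   /* code 0xa: SFENCE.VMA requests sent */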

This series can be found here as well.
https://github.com/atishp04/linux/tree/kvm_perf_v2

TODO:
1. Add Sscofpmf support.
2. Add a ONE_REG interface for the following operations:
	1. Enable/Disable PMU (should it be at the VM level rather than per vcpu?)
	2. Number of hpmcounters and the width of the counters
	3. Init PMU
	4. Allow the guest user to access cycle & instret without trapping

Changes from v1->v2:
1. Addressed comments from Andrew.
2. Removed the kvpmu sanity check.
3. Added a kvm pmu init flag and a sanity check to the probe function.
4. Improved the Linux vs. SBI error code handling.
 

Atish Patra (11):
RISC-V: Define helper functions expose hpm counter width and count
RISC-V: KVM: Define a probe function for SBI extension data structures
RISC-V: KVM: Return correct code for hsm stop function
RISC-V: KVM: Modify SBI extension handler to return SBI error code
RISC-V: KVM: Improve privilege mode filtering for perf
RISC-V: KVM: Add skeleton support for perf
RISC-V: KVM: Add SBI PMU extension support
RISC-V: KVM: Disable all hpmcounter access for VS/VU mode
RISC-V: KVM: Implement trap & emulate for hpmcounters
RISC-V: KVM: Implement perf support without sampling
RISC-V: KVM: Implement firmware events

arch/riscv/include/asm/kvm_host.h     |   3 +
arch/riscv/include/asm/kvm_vcpu_pmu.h | 108 +++++
arch/riscv/include/asm/kvm_vcpu_sbi.h |  13 +-
arch/riscv/include/asm/sbi.h          |   2 +-
arch/riscv/kvm/Makefile               |   1 +
arch/riscv/kvm/main.c                 |   3 +-
arch/riscv/kvm/tlb.c                  |   6 +-
arch/riscv/kvm/vcpu.c                 |   5 +
arch/riscv/kvm/vcpu_insn.c            |   4 +-
arch/riscv/kvm/vcpu_pmu.c             | 585 ++++++++++++++++++++++++++
arch/riscv/kvm/vcpu_sbi.c             |  56 ++-
arch/riscv/kvm/vcpu_sbi_base.c        |  45 +-
arch/riscv/kvm/vcpu_sbi_hsm.c         |  22 +-
arch/riscv/kvm/vcpu_sbi_pmu.c         |  86 ++++
arch/riscv/kvm/vcpu_sbi_replace.c     |  51 ++-
drivers/perf/riscv_pmu_sbi.c          |  62 ++-
include/linux/perf/riscv_pmu.h        |   5 +
17 files changed, 963 insertions(+), 94 deletions(-)
create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
create mode 100644 arch/riscv/kvm/vcpu_pmu.c
create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c

--
2.25.1


* [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 10:06   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures Atish Patra
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

The KVM module needs to know how many hardware counters the platform
supports and what their width is. Otherwise, it will not be able to expose
the optimal number and width of virtual counters to the guest. The virtual
hardware counters also need to have the same width as the logical hardware
counters for simplicity. However, there shouldn't be a mapping between
virtual hardware counters and logical hardware counters. As we don't
support heterogeneous harts or counters with different widths as of now,
the implementation relies on the counter width of the first available
programmable counter.
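
A minimal sketch of the intended caller (the KVM side added later in this
series) looks roughly like:

	u32 hpm_width = 0, num_hw_ctrs = 0;
	int ret;

	ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
	if (ret < 0)
		return ret;	/* no usable SBI PMU on the host */
	/* size the virtual counters using num_hw_ctrs and hpm_width */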

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 drivers/perf/riscv_pmu_sbi.c   | 35 +++++++++++++++++++++++++++++++++-
 include/linux/perf/riscv_pmu.h |  3 +++
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 3852c18..65d4aa4 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -49,6 +49,9 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
 static union sbi_pmu_ctr_info *pmu_ctr_list;
 static unsigned int riscv_pmu_irq;
 
+/* Cache the available counters in a bitmask */
+unsigned long cmask;
+
 struct sbi_pmu_event_data {
 	union {
 		union {
@@ -264,6 +267,37 @@ static bool pmu_sbi_ctr_is_fw(int cidx)
 	return (info->type == SBI_PMU_CTR_TYPE_FW) ? true : false;
 }
 
+/*
+ * Returns the counter width of a programmable counter and number of hardware
+ * counters. As we don't support heterogeneous CPUs yet, it is okay to just
+ * return the counter width of the first programmable counter.
+ */
+int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
+{
+	int i;
+	union sbi_pmu_ctr_info *info;
+	u32 hpm_width = 0, hpm_count = 0;
+
+	if (!cmask)
+		return -EINVAL;
+
+	for_each_set_bit(i, &cmask, RISCV_MAX_COUNTERS) {
+		info = &pmu_ctr_list[i];
+		if (!info)
+			continue;
+		if (!hpm_width && (info->csr != CSR_CYCLE) && (info->csr != CSR_INSTRET))
+			hpm_width = info->width;
+		if (info->type == SBI_PMU_CTR_TYPE_HW)
+			hpm_count++;
+	}
+
+	*hw_ctr_width = hpm_width;
+	*num_hw_ctr = hpm_count;
+
+	return 0;
+}
+EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
+
 static int pmu_sbi_ctr_get_idx(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -798,7 +832,6 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
 static int pmu_sbi_device_probe(struct platform_device *pdev)
 {
 	struct riscv_pmu *pmu = NULL;
-	unsigned long cmask = 0;
 	int ret = -ENODEV;
 	int num_counters;
 
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index e17e86a..a1c3f77 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -73,6 +73,9 @@ void riscv_pmu_legacy_skip_init(void);
 static inline void riscv_pmu_legacy_skip_init(void) {};
 #endif
 struct riscv_pmu *riscv_pmu_alloc(void);
+#ifdef CONFIG_RISCV_PMU_SBI
+int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
+#endif
 
 #endif /* CONFIG_RISCV_PMU */
 
-- 
2.25.1


* [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
  2022-12-15 17:00 ` [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 10:21   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function Atish Patra
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

Currently, the probe function just checks whether an SBI extension is
registered or not. However, an extension may not want to advertise
itself depending on some other condition.
An additional extension-specific probe function will allow
extensions to decide whether they want to be advertised to the caller or
not. Any extension that does not require additional dependency checks
can avoid implementing this function.
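
For illustration, a hypothetical extension (the "foo" names below are
placeholders; the real user is the PMU extension later in this series)
would wire up the new callback roughly like this:

	static unsigned long kvm_sbi_ext_foo_probe(struct kvm_vcpu *vcpu,
						   unsigned long extid)
	{
		/* return 0 to stay hidden, non-zero to be advertised */
		return kvm_foo_dependency_available(vcpu) ? 1 : 0;
	}

	const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_foo = {
		.extid_start = SBI_EXT_FOO,
		.extid_end = SBI_EXT_FOO,
		.handler = kvm_sbi_ext_foo_handler,
		.probe = kvm_sbi_ext_foo_probe,
	};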

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/kvm_vcpu_sbi.h |  3 +++
 arch/riscv/kvm/vcpu_sbi_base.c        | 13 +++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index f79478a..61dac1b 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -29,6 +29,9 @@ struct kvm_vcpu_sbi_extension {
 	int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		       unsigned long *out_val, struct kvm_cpu_trap *utrap,
 		       bool *exit);
+
+	/* Extension specific probe function */
+	unsigned long (*probe)(struct kvm_vcpu *vcpu, unsigned long extid);
 };
 
 void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
index 5d65c63..89e2415 100644
--- a/arch/riscv/kvm/vcpu_sbi_base.c
+++ b/arch/riscv/kvm/vcpu_sbi_base.c
@@ -19,6 +19,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 {
 	int ret = 0;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+	const struct kvm_vcpu_sbi_extension *sbi_ext;
 
 	switch (cp->a6) {
 	case SBI_EXT_BASE_GET_SPEC_VERSION:
@@ -43,8 +44,16 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			 */
 			kvm_riscv_vcpu_sbi_forward(vcpu, run);
 			*exit = true;
-		} else
-			*out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
+		} else {
+			sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
+			if (sbi_ext) {
+				if (sbi_ext->probe)
+					*out_val = sbi_ext->probe(vcpu, cp->a0);
+				else
+					*out_val = 1;
+			} else
+				*out_val = 0;
+		}
 		break;
 	case SBI_EXT_BASE_GET_MVENDORID:
 		*out_val = vcpu->arch.mvendorid;
-- 
2.25.1


* [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
  2022-12-15 17:00 ` [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count Atish Patra
  2022-12-15 17:00 ` [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 10:28   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code Atish Patra
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

According to the SBI specification, the stop function can only
return the error code SBI_ERR_FAILED. However, it currently returns
-EINVAL, which will be mapped to SBI_ERR_INVALID_PARAM.

Return the appropriate Linux error code.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
index 2e915ca..0f8d9fe 100644
--- a/arch/riscv/kvm/vcpu_sbi_hsm.c
+++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
@@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
 static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
 {
 	if (vcpu->arch.power_off)
-		return -EINVAL;
+		return -EPERM;
 
 	kvm_riscv_vcpu_power_off(vcpu);
 
-- 
2.25.1


* [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (2 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 11:04   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf Atish Patra
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Andrew Jones, Anup Patel, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

Currently, an SBI extension handler is expected to return a Linux error code.
The top SBI layer converts the Linux error code to an SBI-specific error code
that can be returned to the guest invoking the SBI calls. This model works
as long as the SBI and Linux error codes have 1-to-1 mappings between them.
However, that may not always be true. This patch attempts to disassociate
these two error codes by allowing the SBI extension implementation to
return SBI-specific error codes as well.

The extension will continue to return a Linux-specific error code, which
indicates any problem *with* the extension emulation, while the
SBI-specific error indicates the problem *of* the emulation.
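
A rough sketch of a handler under the new convention (the "xyz" names are
placeholders):

	static int kvm_sbi_ext_xyz_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
					   struct kvm_vcpu_sbi_ext_data *edata,
					   struct kvm_cpu_trap *utrap)
	{
		struct kvm_cpu_context *cp = &vcpu->arch.guest_context;

		if (cp->a6 != SBI_EXT_XYZ_FUNC) {
			/* guest-visible failure, propagated via a0 */
			edata->err_val = SBI_ERR_NOT_SUPPORTED;
			return 0;
		}

		/* success value, propagated via a1 */
		edata->out_val = kvm_xyz_do_work(vcpu, cp->a0);
		/* a negative return here would instead abort the emulation
		 * and exit to userspace */
		return 0;
	}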

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++--
 arch/riscv/kvm/vcpu_sbi.c             | 45 ++++++++-------------------
 arch/riscv/kvm/vcpu_sbi_base.c        | 38 +++++++++++-----------
 arch/riscv/kvm/vcpu_sbi_hsm.c         | 22 +++++++------
 arch/riscv/kvm/vcpu_sbi_replace.c     | 44 +++++++++++++-------------
 5 files changed, 74 insertions(+), 85 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index 61dac1b..fee9253 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -18,6 +18,12 @@ struct kvm_vcpu_sbi_context {
 	int return_handled;
 };
 
+struct kvm_vcpu_sbi_ext_data {
+	unsigned long out_val;
+	unsigned long err_val;
+	bool uexit;
+};
+
 struct kvm_vcpu_sbi_extension {
 	unsigned long extid_start;
 	unsigned long extid_end;
@@ -27,8 +33,8 @@ struct kvm_vcpu_sbi_extension {
 	 * specific error codes.
 	 */
 	int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
-		       unsigned long *out_val, struct kvm_cpu_trap *utrap,
-		       bool *exit);
+		       struct kvm_vcpu_sbi_ext_data *edata,
+		       struct kvm_cpu_trap *utrap);
 
 	/* Extension specific probe function */
 	unsigned long (*probe)(struct kvm_vcpu *vcpu, unsigned long extid);
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index f96991d..50c5472 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -12,26 +12,6 @@
 #include <asm/sbi.h>
 #include <asm/kvm_vcpu_sbi.h>
 
-static int kvm_linux_err_map_sbi(int err)
-{
-	switch (err) {
-	case 0:
-		return SBI_SUCCESS;
-	case -EPERM:
-		return SBI_ERR_DENIED;
-	case -EINVAL:
-		return SBI_ERR_INVALID_PARAM;
-	case -EFAULT:
-		return SBI_ERR_INVALID_ADDRESS;
-	case -EOPNOTSUPP:
-		return SBI_ERR_NOT_SUPPORTED;
-	case -EALREADY:
-		return SBI_ERR_ALREADY_AVAILABLE;
-	default:
-		return SBI_ERR_FAILURE;
-	};
-}
-
 #ifndef CONFIG_RISCV_SBI_V01
 static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
 	.extid_start = -1UL,
@@ -125,11 +105,10 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	int ret = 1;
 	bool next_sepc = true;
-	bool userspace_exit = false;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	const struct kvm_vcpu_sbi_extension *sbi_ext;
 	struct kvm_cpu_trap utrap = { 0 };
-	unsigned long out_val = 0;
+	struct kvm_vcpu_sbi_ext_data edata_out = { 0 };
 	bool ext_is_v01 = false;
 
 	sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
@@ -139,7 +118,7 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		    cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
 			ext_is_v01 = true;
 #endif
-		ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
+		ret = sbi_ext->handler(vcpu, run, &edata_out, &utrap);
 	} else {
 		/* Return error for unsupported SBI calls */
 		cp->a0 = SBI_ERR_NOT_SUPPORTED;
@@ -156,25 +135,27 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		goto ecall_done;
 	}
 
+	/* The SBI extension returns Linux error code. Exits the ioctl loop
+	 * and forwards the error to the userspace.
+	 */
+	if (ret < 0) {
+		next_sepc = false;
+		goto ecall_done;
+	}
+
 	/* Exit ioctl loop or Propagate the error code the guest */
-	if (userspace_exit) {
+	if (edata_out.uexit) {
 		next_sepc = false;
 		ret = 0;
 	} else {
-		/**
-		 * SBI extension handler always returns an Linux error code. Convert
-		 * it to the SBI specific error code that can be propagated the SBI
-		 * caller.
-		 */
-		ret = kvm_linux_err_map_sbi(ret);
-		cp->a0 = ret;
+		cp->a0 = edata_out.err_val;
 		ret = 1;
 	}
 ecall_done:
 	if (next_sepc)
 		cp->sepc += 4;
 	if (!ext_is_v01)
-		cp->a1 = out_val;
+		cp->a1 = edata_out.out_val;
 
 	return ret;
 }
diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
index 89e2415..487828d 100644
--- a/arch/riscv/kvm/vcpu_sbi_base.c
+++ b/arch/riscv/kvm/vcpu_sbi_base.c
@@ -14,24 +14,23 @@
 #include <asm/kvm_vcpu_sbi.h>
 
 static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
-				    unsigned long *out_val,
-				    struct kvm_cpu_trap *trap, bool *exit)
+				    struct kvm_vcpu_sbi_ext_data *edata,
+				    struct kvm_cpu_trap *trap)
 {
-	int ret = 0;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	const struct kvm_vcpu_sbi_extension *sbi_ext;
 
 	switch (cp->a6) {
 	case SBI_EXT_BASE_GET_SPEC_VERSION:
-		*out_val = (KVM_SBI_VERSION_MAJOR <<
+		edata->out_val = (KVM_SBI_VERSION_MAJOR <<
 			    SBI_SPEC_VERSION_MAJOR_SHIFT) |
 			    KVM_SBI_VERSION_MINOR;
 		break;
 	case SBI_EXT_BASE_GET_IMP_ID:
-		*out_val = KVM_SBI_IMPID;
+		edata->out_val = KVM_SBI_IMPID;
 		break;
 	case SBI_EXT_BASE_GET_IMP_VERSION:
-		*out_val = LINUX_VERSION_CODE;
+		edata->out_val = LINUX_VERSION_CODE;
 		break;
 	case SBI_EXT_BASE_PROBE_EXT:
 		if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
@@ -43,33 +42,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			 * forward it to the userspace
 			 */
 			kvm_riscv_vcpu_sbi_forward(vcpu, run);
-			*exit = true;
+			edata->uexit = true;
 		} else {
 			sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
 			if (sbi_ext) {
 				if (sbi_ext->probe)
-					*out_val = sbi_ext->probe(vcpu, cp->a0);
+					edata->out_val = sbi_ext->probe(vcpu, cp->a0);
 				else
-					*out_val = 1;
+					edata->out_val = 1;
 			} else
-				*out_val = 0;
+				edata->out_val = 0;
 		}
 		break;
 	case SBI_EXT_BASE_GET_MVENDORID:
-		*out_val = vcpu->arch.mvendorid;
+		edata->out_val = vcpu->arch.mvendorid;
 		break;
 	case SBI_EXT_BASE_GET_MARCHID:
-		*out_val = vcpu->arch.marchid;
+		edata->out_val = vcpu->arch.marchid;
 		break;
 	case SBI_EXT_BASE_GET_MIMPID:
-		*out_val = vcpu->arch.mimpid;
+		edata->out_val = vcpu->arch.mimpid;
 		break;
 	default:
-		ret = -EOPNOTSUPP;
+		edata->err_val = SBI_ERR_NOT_SUPPORTED;
 		break;
 	}
 
-	return ret;
+	return 0;
 }
 
 const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
@@ -79,17 +78,16 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
 };
 
 static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
-					struct kvm_run *run,
-					unsigned long *out_val,
-					struct kvm_cpu_trap *utrap,
-					bool *exit)
+				       struct kvm_run *run,
+				       struct kvm_vcpu_sbi_ext_data *edata,
+				       struct kvm_cpu_trap *utrap)
 {
 	/*
 	 * Both SBI experimental and vendor extensions are
 	 * unconditionally forwarded to userspace.
 	 */
 	kvm_riscv_vcpu_sbi_forward(vcpu, run);
-	*exit = true;
+	edata->uexit = true;
 	return 0;
 }
 
diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
index 0f8d9fe..4188f21 100644
--- a/arch/riscv/kvm/vcpu_sbi_hsm.c
+++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
@@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
 
 	target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
 	if (!target_vcpu)
-		return -EINVAL;
+		return SBI_ERR_INVALID_PARAM;
 	if (!target_vcpu->arch.power_off)
-		return -EALREADY;
+		return SBI_ERR_ALREADY_AVAILABLE;
 
 	reset_cntx = &target_vcpu->arch.guest_reset_context;
 	/* start address */
@@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
 static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
 {
 	if (vcpu->arch.power_off)
-		return -EPERM;
+		return SBI_ERR_FAILURE;
 
 	kvm_riscv_vcpu_power_off(vcpu);
 
@@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
 
 	target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
 	if (!target_vcpu)
-		return -EINVAL;
+		return SBI_ERR_INVALID_PARAM;
 	if (!target_vcpu->arch.power_off)
 		return SBI_HSM_STATE_STARTED;
 	else if (vcpu->stat.generic.blocking)
@@ -66,10 +66,10 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
 		return SBI_HSM_STATE_STOPPED;
 }
 
+
 static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
-				   unsigned long *out_val,
-				   struct kvm_cpu_trap *utrap,
-				   bool *exit)
+				   struct kvm_vcpu_sbi_ext_data *edata,
+				   struct kvm_cpu_trap *utrap)
 {
 	int ret = 0;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
@@ -88,8 +88,8 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	case SBI_EXT_HSM_HART_STATUS:
 		ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
 		if (ret >= 0) {
-			*out_val = ret;
-			ret = 0;
+			edata->out_val = ret;
+			edata->err_val = 0;
 		}
 		break;
 	case SBI_EXT_HSM_HART_SUSPEND:
@@ -108,7 +108,9 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		ret = -EOPNOTSUPP;
 	}
 
-	return ret;
+	edata->err_val = ret;
+
+	return 0;
 }
 
 const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index 03a0198..d029136 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -14,15 +14,17 @@
 #include <asm/kvm_vcpu_sbi.h>
 
 static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
-				    unsigned long *out_val,
-				    struct kvm_cpu_trap *utrap, bool *exit)
+				    struct kvm_vcpu_sbi_ext_data *edata,
+				    struct kvm_cpu_trap *utrap)
 {
 	int ret = 0;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	u64 next_cycle;
 
-	if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
-		return -EINVAL;
+	if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
+		edata->err_val = SBI_ERR_INVALID_PARAM;
+		return 0;
+	}
 
 #if __riscv_xlen == 32
 	next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
@@ -41,8 +43,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
 };
 
 static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
-				   unsigned long *out_val,
-				   struct kvm_cpu_trap *utrap, bool *exit)
+				   struct kvm_vcpu_sbi_ext_data *edata,
+				   struct kvm_cpu_trap *utrap)
 {
 	int ret = 0;
 	unsigned long i;
@@ -51,8 +53,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	unsigned long hmask = cp->a0;
 	unsigned long hbase = cp->a1;
 
-	if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
-		return -EINVAL;
+	if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
+		edata->err_val = SBI_ERR_INVALID_PARAM;
+		return 0;
+	}
 
 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
 		if (hbase != -1UL) {
@@ -76,10 +80,9 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
 };
 
 static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
-				      unsigned long *out_val,
-				      struct kvm_cpu_trap *utrap, bool *exit)
+				      struct kvm_vcpu_sbi_ext_data *edata,
+				      struct kvm_cpu_trap *utrap)
 {
-	int ret = 0;
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	unsigned long hmask = cp->a0;
 	unsigned long hbase = cp->a1;
@@ -116,10 +119,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 		 */
 		break;
 	default:
-		ret = -EOPNOTSUPP;
+		edata->err_val = SBI_ERR_NOT_SUPPORTED;
 	}
 
-	return ret;
+	return 0;
 }
 
 const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
@@ -130,14 +133,13 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
 
 static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
 				    struct kvm_run *run,
-				    unsigned long *out_val,
-				    struct kvm_cpu_trap *utrap, bool *exit)
+				    struct kvm_vcpu_sbi_ext_data *edata,
+				    struct kvm_cpu_trap *utrap)
 {
 	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
 	unsigned long funcid = cp->a6;
 	u32 reason = cp->a1;
 	u32 type = cp->a0;
-	int ret = 0;
 
 	switch (funcid) {
 	case SBI_EXT_SRST_RESET:
@@ -146,24 +148,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
 			kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
 						KVM_SYSTEM_EVENT_SHUTDOWN,
 						reason);
-			*exit = true;
+			edata->uexit = true;
 			break;
 		case SBI_SRST_RESET_TYPE_COLD_REBOOT:
 		case SBI_SRST_RESET_TYPE_WARM_REBOOT:
 			kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
 						KVM_SYSTEM_EVENT_RESET,
 						reason);
-			*exit = true;
+			edata->uexit = true;
 			break;
 		default:
-			ret = -EOPNOTSUPP;
+			edata->err_val = SBI_ERR_NOT_SUPPORTED;
 		}
 		break;
 	default:
-		ret = -EOPNOTSUPP;
+		edata->err_val = SBI_ERR_NOT_SUPPORTED;
 	}
 
-	return ret;
+	return 0;
 }
 
 const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
-- 
2.25.1


* [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (3 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2022-12-15 20:17   ` Conor Dooley
  2022-12-15 17:00 ` [PATCH v2 06/11] RISC-V: KVM: Add skeleton support " Atish Patra
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Andrew Jones, Anup Patel, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

Currently, the host driver doesn't have any method to identify whether the
requested perf event is from KVM or bare metal. As KVM runs in HS
mode, there is no separate hypervisor privilege mode to distinguish
between the attributes for guest/host.

Improve the privilege mode filtering by using the event-specific
config1 field.
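
A rough sketch of how the KVM side (PATCH 10) is expected to tag the perf
events it creates on behalf of a guest (the attribute values here are
illustrative):

	struct perf_event_attr attr = {
		.type = PERF_TYPE_HARDWARE,
		.size = sizeof(struct perf_event_attr),
		.config = PERF_COUNT_HW_CPU_CYCLES,
		/* tell the SBI PMU driver to use the VS/VU inhibit flags */
		.config1 = RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS,
	};
	struct perf_event *event;

	event = perf_event_create_kernel_counter(&attr, -1, current, NULL, NULL);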

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 drivers/perf/riscv_pmu_sbi.c   | 27 ++++++++++++++++++++++-----
 include/linux/perf/riscv_pmu.h |  2 ++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 65d4aa4..df795b7 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -298,6 +298,27 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
 }
 EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
 
+static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
+{
+	unsigned long cflags = 0;
+	bool guest_events = false;
+
+	if (event->attr.config1 & RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS)
+		guest_events = true;
+	if (event->attr.exclude_kernel)
+		cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VSINH : SBI_PMU_CFG_FLAG_SET_SINH;
+	if (event->attr.exclude_user)
+		cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VUINH : SBI_PMU_CFG_FLAG_SET_UINH;
+	if (guest_events && event->attr.exclude_hv)
+		cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
+	if (event->attr.exclude_host)
+		cflags |= SBI_PMU_CFG_FLAG_SET_UINH | SBI_PMU_CFG_FLAG_SET_SINH;
+	if (event->attr.exclude_guest)
+		cflags |= SBI_PMU_CFG_FLAG_SET_VSINH | SBI_PMU_CFG_FLAG_SET_VUINH;
+
+	return cflags;
+}
+
 static int pmu_sbi_ctr_get_idx(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -308,11 +329,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
 	uint64_t cbase = 0;
 	unsigned long cflags = 0;
 
-	if (event->attr.exclude_kernel)
-		cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
-	if (event->attr.exclude_user)
-		cflags |= SBI_PMU_CFG_FLAG_SET_UINH;
-
+	cflags = pmu_sbi_get_filter_flags(event);
 	/* retrieve the available counter index */
 #if defined(CONFIG_32BIT)
 	ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index a1c3f77..1c42146 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -26,6 +26,8 @@
 
 #define RISCV_PMU_STOP_FLAG_RESET 1
 
+#define RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS 0x1
+
 struct cpu_hw_events {
 	/* currently enabled events */
 	int			n_events;
-- 
2.25.1


* [PATCH v2 06/11] RISC-V: KVM: Add skeleton support for perf
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (4 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 15:10   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support Atish Patra
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

This patch only adds the barebones structure of the perf implementation. Most of
the functions return zero at this point and will be implemented
fully in the future.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/kvm_host.h     |   3 +
 arch/riscv/include/asm/kvm_vcpu_pmu.h |  76 ++++++++++++++
 arch/riscv/kvm/Makefile               |   1 +
 arch/riscv/kvm/vcpu.c                 |   5 +
 arch/riscv/kvm/vcpu_insn.c            |   2 +-
 arch/riscv/kvm/vcpu_pmu.c             | 142 ++++++++++++++++++++++++++
 6 files changed, 228 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
 create mode 100644 arch/riscv/kvm/vcpu_pmu.c

diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index 93f43a3..f9874b4 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -18,6 +18,7 @@
 #include <asm/kvm_vcpu_insn.h>
 #include <asm/kvm_vcpu_sbi.h>
 #include <asm/kvm_vcpu_timer.h>
+#include <asm/kvm_vcpu_pmu.h>
 
 #define KVM_MAX_VCPUS			1024
 
@@ -228,6 +229,8 @@ struct kvm_vcpu_arch {
 
 	/* Don't run the VCPU (blocked) */
 	bool pause;
+
+	struct kvm_pmu pmu;
 };
 
 static inline void kvm_arch_hardware_unsetup(void) {}
diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
new file mode 100644
index 0000000..6a8c0f7
--- /dev/null
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 Rivos Inc
+ *
+ * Authors:
+ *     Atish Patra <atishp@rivosinc.com>
+ */
+
+#ifndef __KVM_VCPU_RISCV_PMU_H
+#define __KVM_VCPU_RISCV_PMU_H
+
+#include <linux/perf/riscv_pmu.h>
+#include <asm/kvm_vcpu_sbi.h>
+#include <asm/sbi.h>
+
+#ifdef CONFIG_RISCV_PMU_SBI
+#define RISCV_KVM_MAX_FW_CTRS 32
+#define RISCV_MAX_COUNTERS      64
+
+/* Per virtual pmu counter data */
+struct kvm_pmc {
+	u8 idx;
+	struct perf_event *perf_event;
+	uint64_t counter_val;
+	union sbi_pmu_ctr_info cinfo;
+	/* Event monitoring status */
+	bool started;
+};
+
+/* PMU data structure per vcpu */
+struct kvm_pmu {
+	struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
+	/* Number of the virtual firmware counters available */
+	int num_fw_ctrs;
+	/* Number of the virtual hardware counters available */
+	int num_hw_ctrs;
+	/* A flag to indicate that pmu initialization is done */
+	bool init_done;
+	/* Bit map of all the virtual counter used */
+	DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
+};
+
+#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
+#define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
+
+int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata);
+int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
+				struct kvm_vcpu_sbi_ext_data *edata);
+int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+				 unsigned long ctr_mask, unsigned long flag, uint64_t ival,
+				 struct kvm_vcpu_sbi_ext_data *edata);
+int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+				unsigned long ctr_mask, unsigned long flag,
+				struct kvm_vcpu_sbi_ext_data *edata);
+int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+				     unsigned long ctr_mask, unsigned long flag,
+				     unsigned long eidx, uint64_t edata,
+				     struct kvm_vcpu_sbi_ext_data *extdata);
+int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
+				struct kvm_vcpu_sbi_ext_data *edata);
+int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
+
+#else
+struct kvm_pmu {
+};
+
+static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
+static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
+#endif
+#endif
diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
index 019df920..5de1053 100644
--- a/arch/riscv/kvm/Makefile
+++ b/arch/riscv/kvm/Makefile
@@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
 kvm-y += vcpu_sbi_replace.o
 kvm-y += vcpu_sbi_hsm.o
 kvm-y += vcpu_timer.o
+kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index 7c08567..b746f21 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -137,6 +137,7 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
 
 	WRITE_ONCE(vcpu->arch.irqs_pending, 0);
 	WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
+	kvm_riscv_vcpu_pmu_reset(vcpu);
 
 	vcpu->arch.hfence_head = 0;
 	vcpu->arch.hfence_tail = 0;
@@ -194,6 +195,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	/* Setup VCPU timer */
 	kvm_riscv_vcpu_timer_init(vcpu);
 
+	/* setup performance monitoring */
+	kvm_riscv_vcpu_pmu_init(vcpu);
+
 	/* Reset VCPU */
 	kvm_riscv_reset_vcpu(vcpu);
 
@@ -216,6 +220,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 	/* Cleanup VCPU timer */
 	kvm_riscv_vcpu_timer_deinit(vcpu);
 
+	kvm_riscv_vcpu_pmu_deinit(vcpu);
 	/* Free unused pages pre-allocated for G-stage page table mappings */
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
 }
diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c
index 0bb5276..1ff2649 100644
--- a/arch/riscv/kvm/vcpu_insn.c
+++ b/arch/riscv/kvm/vcpu_insn.c
@@ -213,7 +213,7 @@ struct csr_func {
 		    unsigned long wr_mask);
 };
 
-static const struct csr_func csr_funcs[] = { };
+static const struct csr_func csr_funcs[] = {};
 
 /**
  * kvm_riscv_vcpu_csr_return -- Handle CSR read/write after user space
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
new file mode 100644
index 0000000..0f0748f1
--- /dev/null
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -0,0 +1,142 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Rivos Inc
+ *
+ * Authors:
+ *     Atish Patra <atishp@rivosinc.com>
+ */
+
+#include <linux/errno.h>
+#include <linux/err.h>
+#include <linux/kvm_host.h>
+#include <linux/perf/riscv_pmu.h>
+#include <asm/csr.h>
+#include <asm/kvm_vcpu_sbi.h>
+#include <asm/kvm_vcpu_pmu.h>
+#include <linux/kvm_host.h>
+
+#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
+
+int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata)
+{
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+	edata->out_val = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
+
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
+				struct kvm_vcpu_sbi_ext_data *edata)
+{
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+	if ((cidx > RISCV_MAX_COUNTERS) || (cidx == 1)) {
+		edata->err_val = SBI_ERR_INVALID_PARAM;
+		return 0;
+	}
+
+	edata->out_val = kvpmu->pmc[cidx].cinfo.value;
+
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+				 unsigned long ctr_mask, unsigned long flag, uint64_t ival,
+				 struct kvm_vcpu_sbi_ext_data *edata)
+{
+	/* TODO */
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+				unsigned long ctr_mask, unsigned long flag,
+				struct kvm_vcpu_sbi_ext_data *edata)
+{
+	/* TODO */
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+				     unsigned long ctr_mask, unsigned long flag,
+				     unsigned long eidx, uint64_t edata,
+				     struct kvm_vcpu_sbi_ext_data *extdata)
+{
+	/* TODO */
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
+				struct kvm_vcpu_sbi_ext_data *edata)
+{
+	/* TODO */
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
+{
+	int i = 0, num_fw_ctrs, ret, num_hw_ctrs = 0, hpm_width = 0;
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+	ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
+	if (ret < 0)
+		return ret;
+
+	if (!hpm_width || !num_hw_ctrs) {
+		pr_err("Can not initialize PMU for vcpu with NULL hpmcounter width/count\n");
+		return -EINVAL;
+	}
+
+	if ((num_hw_ctrs + RISCV_KVM_MAX_FW_CTRS) > RISCV_MAX_COUNTERS)
+		num_fw_ctrs = RISCV_MAX_COUNTERS - num_hw_ctrs;
+	else
+		num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
+
+	kvpmu->num_hw_ctrs = num_hw_ctrs;
+	kvpmu->num_fw_ctrs = num_fw_ctrs;
+	/*
+	 * There is no correlation between the logical hardware counters and virtual counters.
+	 * However, we need to encode an hpmcounter CSR in the counter info field so that
+	 * KVM can trap & emulate the read. This works well in the migration use case as
+	 * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
+	 */
+	for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
+		/* TIME CSR shouldn't be read from perf interface */
+		if (i == 1)
+			continue;
+		kvpmu->pmc[i].idx = i;
+		if (i < kvpmu->num_hw_ctrs) {
+			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
+			if (i < 3)
+				/* CY, IR counters */
+				kvpmu->pmc[i].cinfo.width = 63;
+			else
+				kvpmu->pmc[i].cinfo.width = hpm_width;
+			/*
+			 * The CSR number doesn't have any relation with the logical
+			 * hardware counters. The CSR numbers are encoded sequentially
+			 * to avoid maintaining a map between the virtual counter
+			 * and CSR number.
+			 */
+			kvpmu->pmc[i].cinfo.csr = CSR_CYCLE + i;
+		} else {
+			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_FW;
+			kvpmu->pmc[i].cinfo.width = BITS_PER_LONG - 1;
+		}
+	}
+
+	kvpmu->init_done = true;
+
+	return 0;
+}
+
+void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
+{
+	/* TODO */
+}
+
+void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
+{
+	/* TODO */
+}
+
-- 
2.25.1


* [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (5 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 06/11] RISC-V: KVM: Add skeleton support " Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 15:29   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 08/11] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode Atish Patra
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

The SBI PMU extension allows KVM guests to configure/start/stop/query
the PMU counters in a virtualized environment as well.

In order to allow that, KVM implements the entire SBI PMU extension.
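
From the guest's point of view this is exercised through regular SBI calls;
a minimal sketch using the guest kernel's sbi_ecall() helper:

	struct sbiret ret;

	/* lands in kvm_sbi_ext_pmu_handler() with a6 == SBI_EXT_PMU_NUM_COUNTERS */
	ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_NUM_COUNTERS,
			0, 0, 0, 0, 0, 0);
	if (!ret.error)
		pr_info("SBI PMU: %ld counters\n", ret.value);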

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/kvm/Makefile       |  2 +-
 arch/riscv/kvm/vcpu_sbi.c     | 11 +++++
 arch/riscv/kvm/vcpu_sbi_pmu.c | 86 +++++++++++++++++++++++++++++++++++
 3 files changed, 98 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c

diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
index 5de1053..278e97c 100644
--- a/arch/riscv/kvm/Makefile
+++ b/arch/riscv/kvm/Makefile
@@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
 kvm-y += vcpu_sbi_replace.o
 kvm-y += vcpu_sbi_hsm.o
 kvm-y += vcpu_timer.o
-kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
+kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index 50c5472..3b8b84e8 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
 };
 #endif
 
+#ifdef CONFIG_RISCV_PMU_SBI
+extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
+#else
+static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
+	.extid_start = -1UL,
+	.extid_end = -1UL,
+	.handler = NULL,
+};
+#endif
+
 static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
 	&vcpu_sbi_ext_v01,
 	&vcpu_sbi_ext_base,
@@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
 	&vcpu_sbi_ext_rfence,
 	&vcpu_sbi_ext_srst,
 	&vcpu_sbi_ext_hsm,
+	&vcpu_sbi_ext_pmu,
 	&vcpu_sbi_ext_experimental,
 	&vcpu_sbi_ext_vendor,
 };
diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
new file mode 100644
index 0000000..223752f
--- /dev/null
+++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Rivos Inc
+ *
+ * Authors:
+ *     Atish Patra <atishp@rivosinc.com>
+ */
+
+#include <linux/errno.h>
+#include <linux/err.h>
+#include <linux/kvm_host.h>
+#include <asm/csr.h>
+#include <asm/sbi.h>
+#include <asm/kvm_vcpu_sbi.h>
+
+static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				   struct kvm_vcpu_sbi_ext_data *edata,
+				   struct kvm_cpu_trap *utrap)
+{
+	int ret = 0;
+	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	unsigned long funcid = cp->a6;
+	uint64_t temp;
+
+	/* Return not supported if PMU is not initialized */
+	if (!kvpmu->init_done)
+		return -EINVAL;
+
+	switch (funcid) {
+	case SBI_EXT_PMU_NUM_COUNTERS:
+		ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, edata);
+		break;
+	case SBI_EXT_PMU_COUNTER_GET_INFO:
+		ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, edata);
+		break;
+	case SBI_EXT_PMU_COUNTER_CFG_MATCH:
+#if defined(CONFIG_32BIT)
+		temp = ((uint64_t)cp->a5 << 32) | cp->a4;
+#else
+		temp = cp->a4;
+#endif
+		ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
+						       cp->a2, cp->a3, temp, edata);
+		break;
+	case SBI_EXT_PMU_COUNTER_START:
+#if defined(CONFIG_32BIT)
+		temp = ((uint64_t)cp->a4 << 32) | cp->a3;
+#else
+		temp = cp->a3;
+#endif
+		ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
+						   temp, edata);
+		break;
+	case SBI_EXT_PMU_COUNTER_STOP:
+		ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, edata);
+		break;
+	case SBI_EXT_PMU_COUNTER_FW_READ:
+		ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, edata);
+		break;
+	default:
+		edata->err_val = SBI_ERR_NOT_SUPPORTED;
+	}
+
+
+	return ret;
+}
+
+unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu, unsigned long extid)
+{
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+	/*
+	 * PMU Extension is only available to guests if privilege mode filtering
+	 * is available. Otherwise, guest will always count events while the
+	 * execution is in hypervisor mode.
+	 */
+	return kvpmu->init_done && riscv_isa_extension_available(NULL, SSCOFPMF);
+}
+
+const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
+	.extid_start = SBI_EXT_PMU,
+	.extid_end = SBI_EXT_PMU,
+	.handler = kvm_sbi_ext_pmu_handler,
+	.probe = kvm_sbi_ext_pmu_probe,
+};
-- 
2.25.1


* [PATCH v2 08/11] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (6 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-12 15:47   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 09/11] RISC-V: KVM: Implement trap & emulate for hpmcounters Atish Patra
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

Any guest must not get access to any hpmcounter, including cycle/instret,
without proper checks. We achieve that by disabling all the bits except the TM
bit in hcounteren.

However, instret and cycle access for guest userspace can be enabled
upon explicit request (via ONE_REG) or on the first trap from VU mode
to maintain the ABI requirement in the future. This patch doesn't support
that yet as the ONE_REG interface is not settled.
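
For reference, 0x02 corresponds to the TM bit in the counter-enable layout
(CY = bit 0, TM = bit 1, IR = bit 2, HPM3..31 = bits 3..31); an equivalent
spelling would be:

	/* guests may read time; cycle, instret and hpmcounters trap to HS-mode */
	csr_write(CSR_HCOUNTEREN, BIT(1));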

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/kvm/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 58c5489..9c2efd3 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -49,7 +49,8 @@ int kvm_arch_hardware_enable(void)
 	hideleg |= (1UL << IRQ_VS_EXT);
 	csr_write(CSR_HIDELEG, hideleg);
 
-	csr_write(CSR_HCOUNTEREN, -1UL);
+	/* VS should access only TM bit. Everything else should trap */
+	csr_write(CSR_HCOUNTEREN, 0x02);
 
 	csr_write(CSR_HVIP, 0);
 
-- 
2.25.1


* [PATCH v2 09/11] RISC-V: KVM: Implement trap & emulate for hpmcounters
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (7 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 08/11] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-13 11:47   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling Atish Patra
  2022-12-15 17:00 ` [PATCH v2 11/11] RISC-V: KVM: Implement firmware events Atish Patra
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

As KVM guests only see the virtual PMU counters, all hpmcounter
accesses should trap, and KVM emulates the read access on behalf of the guests.
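
On the guest side this is just a plain counter read; with hcounteren
restricted in PATCH 8, such a read traps and is served by the new csr_funcs
entry (a minimal sketch):

	/* in the guest: traps to KVM and is handled by kvm_riscv_vcpu_pmu_read_hpm() */
	unsigned long cycles = csr_read(CSR_CYCLE);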

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/kvm_vcpu_pmu.h | 16 ++++++++++
 arch/riscv/kvm/vcpu_insn.c            |  4 ++-
 arch/riscv/kvm/vcpu_pmu.c             | 44 ++++++++++++++++++++++++++-
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
index 6a8c0f7..7a9a8e6 100644
--- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -43,6 +43,19 @@ struct kvm_pmu {
 #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
 #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
 
+#if defined(CONFIG_32BIT)
+#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
+{ .base = CSR_CYCLEH,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm }, \
+{ .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
+#else
+#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
+{ .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
+#endif
+
+int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
+				unsigned long *val, unsigned long new_val,
+				unsigned long wr_mask);
+
 int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata);
 int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
 				struct kvm_vcpu_sbi_ext_data *edata);
@@ -65,6 +78,9 @@ void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
 #else
 struct kvm_pmu {
 };
+#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
+{ .base = 0,      .count = 0, .func = NULL },
+
 
 static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c
index 1ff2649..f689337 100644
--- a/arch/riscv/kvm/vcpu_insn.c
+++ b/arch/riscv/kvm/vcpu_insn.c
@@ -213,7 +213,9 @@ struct csr_func {
 		    unsigned long wr_mask);
 };
 
-static const struct csr_func csr_funcs[] = {};
+static const struct csr_func csr_funcs[] = {
+	KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS
+};
 
 /**
  * kvm_riscv_vcpu_csr_return -- Handle CSR read/write after user space
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 0f0748f1..53c4163 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -17,6 +17,43 @@
 
 #define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
 
+static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
+			unsigned long *out_val)
+{
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc;
+	u64 enabled, running;
+
+	pmc = &kvpmu->pmc[cidx];
+	if (!pmc->perf_event)
+		return -EINVAL;
+
+	pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
+	*out_val = pmc->counter_val;
+
+	return 0;
+}
+
+int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
+				unsigned long *val, unsigned long new_val,
+				unsigned long wr_mask)
+{
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	int cidx, ret = KVM_INSN_CONTINUE_NEXT_SEPC;
+
+	if (!kvpmu || !kvpmu->init_done)
+		return KVM_INSN_EXIT_TO_USER_SPACE;
+
+	if (wr_mask)
+		return KVM_INSN_ILLEGAL_TRAP;
+	cidx = csr_num - CSR_CYCLE;
+
+	if (pmu_ctr_read(vcpu, cidx, val) < 0)
+		return KVM_INSN_EXIT_TO_USER_SPACE;
+
+	return ret;
+}
+
 int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata)
 {
 	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
@@ -69,7 +106,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
 int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
 				struct kvm_vcpu_sbi_ext_data *edata)
 {
-	/* TODO */
+	int ret;
+
+	ret = pmu_ctr_read(vcpu, cidx, &edata->out_val);
+	if (ret == -EINVAL)
+		edata->err_val = SBI_ERR_INVALID_PARAM;
+
 	return 0;
 }
 
-- 
2.25.1


* [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (8 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 09/11] RISC-V: KVM: Implement trap & emulate for hpmcounters Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-13 11:45   ` Andrew Jones
  2022-12-15 17:00 ` [PATCH v2 11/11] RISC-V: KVM: Implement firmware events Atish Patra
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

The RISC-V SBI PMU & Sscofpmf ISA extensions allow supporting perf in
the virtualization environment as well. The KVM implementation
relies on the SBI PMU extension for the most part while trapping
& emulating the CSR reads for counter access.

This patch doesn't have event sampling support yet.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/kvm/vcpu_pmu.c | 358 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 342 insertions(+), 16 deletions(-)

diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 53c4163..21c1f0f 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -12,10 +12,163 @@
 #include <linux/perf/riscv_pmu.h>
 #include <asm/csr.h>
 #include <asm/kvm_vcpu_sbi.h>
+#include <asm/bitops.h>
 #include <asm/kvm_vcpu_pmu.h>
 #include <linux/kvm_host.h>
 
 #define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
+#define get_event_type(x) ((x & SBI_PMU_EVENT_IDX_TYPE_MASK) >> 16)
+#define get_event_code(x) (x & SBI_PMU_EVENT_IDX_CODE_MASK)
+
+static inline u64 pmu_get_sample_period(struct kvm_pmc *pmc)
+{
+	u64 counter_val_mask = GENMASK(pmc->cinfo.width, 0);
+	u64 sample_period;
+
+	if (!pmc->counter_val)
+		sample_period = counter_val_mask;
+	else
+		sample_period = (-pmc->counter_val) & counter_val_mask;
+
+	return sample_period;
+}
+
+static u32 pmu_get_perf_event_type(unsigned long eidx)
+{
+	enum sbi_pmu_event_type etype = get_event_type(eidx);
+	u32 type;
+
+	if (etype == SBI_PMU_EVENT_TYPE_HW)
+		type = PERF_TYPE_HARDWARE;
+	else if (etype == SBI_PMU_EVENT_TYPE_CACHE)
+		type = PERF_TYPE_HW_CACHE;
+	else if (etype == SBI_PMU_EVENT_TYPE_RAW || etype == SBI_PMU_EVENT_TYPE_FW)
+		type = PERF_TYPE_RAW;
+	else
+		type = PERF_TYPE_MAX;
+
+	return type;
+}
+
+static inline bool pmu_is_fw_event(unsigned long eidx)
+{
+
+	return get_event_type(eidx) == SBI_PMU_EVENT_TYPE_FW;
+}
+
+static void pmu_release_perf_event(struct kvm_pmc *pmc)
+{
+	if (pmc->perf_event) {
+		perf_event_disable(pmc->perf_event);
+		perf_event_release_kernel(pmc->perf_event);
+		pmc->perf_event = NULL;
+	}
+}
+
+static u64 pmu_get_perf_event_hw_config(u32 sbi_event_code)
+{
+	/* SBI PMU HW event code is offset by 1 from perf hw event codes */
+	return (u64)sbi_event_code - 1;
+}
+
+static u64 pmu_get_perf_event_cache_config(u32 sbi_event_code)
+{
+	u64 config = U64_MAX;
+	unsigned int cache_type, cache_op, cache_result;
+
+	/* All the cache event masks lie within 0xFF. No separate masking is necesssary */
+	cache_type = (sbi_event_code & SBI_PMU_EVENT_CACHE_ID_CODE_MASK) >> 3;
+	cache_op = (sbi_event_code & SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK) >> 1;
+	cache_result = sbi_event_code & SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK;
+
+	if (cache_type >= PERF_COUNT_HW_CACHE_MAX ||
+	    cache_op >= PERF_COUNT_HW_CACHE_OP_MAX ||
+	    cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+		return config;
+
+	config = cache_type | (cache_op << 8) | (cache_result << 16);
+
+	return config;
+}
+
+static u64 pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data)
+{
+	enum sbi_pmu_event_type etype = get_event_type(eidx);
+	u32 ecode = get_event_code(eidx);
+	u64 config = U64_MAX;
+
+	if (etype == SBI_PMU_EVENT_TYPE_HW)
+		config = pmu_get_perf_event_hw_config(ecode);
+	else if (etype == SBI_PMU_EVENT_TYPE_CACHE)
+		config = pmu_get_perf_event_cache_config(ecode);
+	else if (etype == SBI_PMU_EVENT_TYPE_RAW)
+		config = evt_data & RISCV_PMU_RAW_EVENT_MASK;
+	else if ((etype == SBI_PMU_EVENT_TYPE_FW) && (ecode < SBI_PMU_FW_MAX))
+		config = (1ULL << 63) | ecode;
+
+	return config;
+}
+
+static int pmu_get_fixed_pmc_index(unsigned long eidx)
+{
+	u32 etype = pmu_get_perf_event_type(eidx);
+	u32 ecode = get_event_code(eidx);
+	int ctr_idx;
+
+	if (etype != SBI_PMU_EVENT_TYPE_HW)
+		return -EINVAL;
+
+	if (ecode == SBI_PMU_HW_CPU_CYCLES)
+		ctr_idx = 0;
+	else if (ecode == SBI_PMU_HW_INSTRUCTIONS)
+		ctr_idx = 2;
+	else
+		return -EINVAL;
+
+	return ctr_idx;
+}
+
+static int pmu_get_programmable_pmc_index(struct kvm_pmu *kvpmu, unsigned long eidx,
+					  unsigned long cbase, unsigned long cmask)
+{
+	int ctr_idx = -1;
+	int i, pmc_idx;
+	int min, max;
+
+	if (pmu_is_fw_event(eidx)) {
+		/* Firmware counters are mapped 1:1 starting from num_hw_ctrs for simplicity */
+		min = kvpmu->num_hw_ctrs;
+		max = min + kvpmu->num_fw_ctrs;
+	} else {
+		/* First 3 counters are reserved for fixed counters */
+		min = 3;
+		max = kvpmu->num_hw_ctrs;
+	}
+
+	for_each_set_bit(i, &cmask, BITS_PER_LONG) {
+		pmc_idx = i + cbase;
+		if ((pmc_idx >= min && pmc_idx < max) &&
+		    !test_bit(pmc_idx, kvpmu->pmc_in_use)) {
+			ctr_idx = pmc_idx;
+			break;
+		}
+	}
+
+	return ctr_idx;
+}
+
+static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
+			     unsigned long cbase, unsigned long cmask)
+{
+	int ret;
+
+	/* Fixed counters need to have a fixed mapping as they have a different width */
+	ret = pmu_get_fixed_pmc_index(eidx);
+	if (ret >= 0)
+		return ret;
+
+	return pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
+}
 
 static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
 			unsigned long *out_val)
@@ -82,7 +235,41 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 				 unsigned long ctr_mask, unsigned long flag, uint64_t ival,
 				 struct kvm_vcpu_sbi_ext_data *edata)
 {
-	/* TODO */
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	int i, num_ctrs, pmc_index, sbiret = 0;
+	struct kvm_pmc *pmc;
+
+	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
+	if (ctr_base + __fls(ctr_mask) >= num_ctrs) {
+		sbiret = SBI_ERR_INVALID_PARAM;
+		goto out;
+	}
+
+	/* Start the counters that have been configured and requested by the guest */
+	for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
+		pmc_index = i + ctr_base;
+		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
+			continue;
+		pmc = &kvpmu->pmc[pmc_index];
+		if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
+			pmc->counter_val = ival;
+		if (pmc->perf_event) {
+			if (unlikely(pmc->started)) {
+				sbiret = SBI_ERR_ALREADY_STARTED;
+				continue;
+			}
+			perf_event_period(pmc->perf_event, pmu_get_sample_period(pmc));
+			perf_event_enable(pmc->perf_event);
+			pmc->started = true;
+		} else {
+			kvm_debug("Cannot start counter due to invalid configuration\n");
+			sbiret = SBI_ERR_INVALID_PARAM;
+		}
+	}
+
+out:
+	edata->err_val = sbiret;
+
 	return 0;
 }
 
@@ -90,16 +277,142 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 				unsigned long ctr_mask, unsigned long flag,
 				struct kvm_vcpu_sbi_ext_data *edata)
 {
-	/* TODO */
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	int i, num_ctrs, pmc_index, sbiret = 0;
+	u64 enabled, running;
+	struct kvm_pmc *pmc;
+
+	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
+	if ((ctr_base + __fls(ctr_mask)) >= num_ctrs) {
+		sbiret = SBI_ERR_INVALID_PARAM;
+		goto out;
+	}
+
+	/* Stop the counters that have been configured and requested by the guest */
+	for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
+		pmc_index = i + ctr_base;
+		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
+			continue;
+		pmc = &kvpmu->pmc[pmc_index];
+		if (pmc->perf_event) {
+			if (pmc->started) {
+				/* Stop counting the counter */
+				perf_event_disable(pmc->perf_event);
+				pmc->started = false;
+			} else
+				sbiret = SBI_ERR_ALREADY_STOPPED;
+
+			if (flag & SBI_PMU_STOP_FLAG_RESET) {
+				/* Release the counter if this is a reset request */
+				pmc->counter_val += perf_event_read_value(pmc->perf_event,
+									  &enabled, &running);
+				pmu_release_perf_event(pmc);
+				clear_bit(pmc_index, kvpmu->pmc_in_use);
+			}
+		} else {
+			kvm_debug("Cannot stop counter due to invalid configuration\n");
+			sbiret = SBI_ERR_INVALID_PARAM;
+		}
+	}
+
+out:
+	edata->err_val = sbiret;
+
 	return 0;
 }
 
 int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 				     unsigned long ctr_mask, unsigned long flag,
-				     unsigned long eidx, uint64_t edata,
-				     struct kvm_vcpu_sbi_ext_data *extdata)
+				     unsigned long eidx, uint64_t evt_data,
+				     struct kvm_vcpu_sbi_ext_data *ext_data)
 {
-	/* TODO */
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	struct perf_event *event;
+	struct perf_event_attr attr;
+	int num_ctrs, ctr_idx;
+	u32 etype = pmu_get_perf_event_type(eidx);
+	u64 config;
+	struct kvm_pmc *pmc;
+	int sbiret = 0;
+
+
+	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
+	if (etype == PERF_TYPE_MAX || (ctr_base + __fls(ctr_mask) >= num_ctrs)) {
+		sbiret = SBI_ERR_INVALID_PARAM;
+		goto out;
+	}
+
+	if (pmu_is_fw_event(eidx)) {
+		sbiret = SBI_ERR_NOT_SUPPORTED;
+		goto out;
+	}
+
+	/*
+	 * SKIP_MATCH flag indicates the caller is aware of the assigned counter
+	 * for this event. Just sanity check that it is already marked as used.
+	 */
+	if (flag & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
+		if (!test_bit(ctr_base, kvpmu->pmc_in_use)) {
+			sbiret = SBI_ERR_FAILURE;
+			goto out;
+		}
+		ctr_idx = ctr_base;
+		goto match_done;
+	}
+
+	ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
+	if (ctr_idx < 0) {
+		sbiret = SBI_ERR_NOT_SUPPORTED;
+		goto out;
+	}
+
+match_done:
+	pmc = &kvpmu->pmc[ctr_idx];
+	pmu_release_perf_event(pmc);
+	pmc->idx = ctr_idx;
+
+	config = pmu_get_perf_event_config(eidx, evt_data);
+	memset(&attr, 0, sizeof(struct perf_event_attr));
+	attr.type = etype;
+	attr.size = sizeof(attr);
+	attr.pinned = true;
+
+	/*
+	 * It should never reach here if the platform doesn't support the sscofpmf
+	 * extension, as mode filtering won't work without it.
+	 */
+	attr.exclude_host = true;
+	attr.exclude_hv = true;
+	attr.exclude_user = !!(flag & SBI_PMU_CFG_FLAG_SET_UINH);
+	attr.exclude_kernel = !!(flag & SBI_PMU_CFG_FLAG_SET_SINH);
+	attr.config = config;
+	attr.config1 = RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS;
+	if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
+		/* TODO: Do we really want to clear the value in the hardware counter? */
+		pmc->counter_val = 0;
+	}
+
+	/*
+	 * Set the default sample_period for now. The guest specified value
+	 * will be updated in the start call.
+	 */
+	attr.sample_period = pmu_get_sample_period(pmc);
+
+	event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
+	if (IS_ERR(event)) {
+		pr_err("kvm pmu event creation failed event %pe for eidx %lx\n", event, eidx);
+		return -EOPNOTSUPP;
+	}
+
+	set_bit(ctr_idx, kvpmu->pmc_in_use);
+	pmc->perf_event = event;
+	if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
+		perf_event_enable(pmc->perf_event);
+
+	ext_data->out_val = ctr_idx;
+out:
+	ext_data->err_val = sbiret;
+
 	return 0;
 }
 
@@ -119,6 +432,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 {
 	int i = 0, num_fw_ctrs, ret, num_hw_ctrs = 0, hpm_width = 0;
 	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc;
 
 	ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
 	if (ret < 0)
@@ -134,6 +448,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 	else
 		num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
 
+	bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
 	kvpmu->num_hw_ctrs = num_hw_ctrs;
 	kvpmu->num_fw_ctrs = num_fw_ctrs;
 	/*
@@ -146,24 +461,26 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 		/* TIME CSR shouldn't be read from perf interface */
 		if (i == 1)
 			continue;
-		kvpmu->pmc[i].idx = i;
+		pmc = &kvpmu->pmc[i];
+		pmc->idx = i;
+		pmc->counter_val = 0;
 		if (i < kvpmu->num_hw_ctrs) {
 			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
 			if (i < 3)
 				/* CY, IR counters */
-				kvpmu->pmc[i].cinfo.width = 63;
+				pmc->cinfo.width = 63;
 			else
-				kvpmu->pmc[i].cinfo.width = hpm_width;
+				pmc->cinfo.width = hpm_width;
 			/*
 			 * The CSR number doesn't have any relation with the logical
 			 * hardware counters. The CSR numbers are encoded sequentially
 			 * to avoid maintaining a map between the virtual counter
 			 * and CSR number.
 			 */
-			kvpmu->pmc[i].cinfo.csr = CSR_CYCLE + i;
+			pmc->cinfo.csr = CSR_CYCLE + i;
 		} else {
-			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_FW;
-			kvpmu->pmc[i].cinfo.width = BITS_PER_LONG - 1;
+			pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
+			pmc->cinfo.width = BITS_PER_LONG - 1;
 		}
 	}
 
@@ -172,13 +489,22 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
+void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
 {
-	/* TODO */
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc;
+	int i;
+
+	if (!kvpmu)
+		return;
+
+	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
+		pmc = &kvpmu->pmc[i];
+		pmu_release_perf_event(pmc);
+	}
 }
 
-void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
+void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
 {
-	/* TODO */
+	kvm_riscv_vcpu_pmu_deinit(vcpu);
 }
-
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 11/11] RISC-V: KVM: Implement firmware events
  2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
                   ` (9 preceding siblings ...)
  2022-12-15 17:00 ` [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling Atish Patra
@ 2022-12-15 17:00 ` Atish Patra
  2023-01-13 12:08   ` Andrew Jones
  10 siblings, 1 reply; 37+ messages in thread
From: Atish Patra @ 2022-12-15 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Anup Patel, Andrew Jones, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

The SBI PMU extension defines a set of firmware events which can provide
guests with useful information about the number of SBI calls. As the
hypervisor implements the SBI PMU extension, these firmware events
correspond to ecall invocations from VS to HS mode. All other firmware
events will always read as zero if monitored, as KVM doesn't implement them.
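
For illustration only (not part of this patch, and with made-up counter
base/mask values), a guest could request one of these firmware counters by
encoding the firmware event type and code into the SBI event index (the
event type sits above the 16-bit event code, cf. get_event_type()):

	unsigned long eidx = (SBI_PMU_EVENT_TYPE_FW << 16) |
			     SBI_PMU_FW_IPI_RECVD;
	struct sbiret ret;

	ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH,
			0, 0x3f, SBI_PMU_CFG_FLAG_AUTO_START, eidx, 0, 0);

KVM then bumps the matching kvpmu->fw_event[] value from
kvm_riscv_vcpu_pmu_incr_fw() when it injects the IPI, and the guest's
reads of that counter are serviced from the same value in pmu_ctr_read().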

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 arch/riscv/include/asm/kvm_vcpu_pmu.h |  16 ++++
 arch/riscv/include/asm/sbi.h          |   2 +-
 arch/riscv/kvm/tlb.c                  |   6 +-
 arch/riscv/kvm/vcpu_pmu.c             | 105 ++++++++++++++++++++++----
 arch/riscv/kvm/vcpu_sbi_replace.c     |   7 ++
 5 files changed, 119 insertions(+), 17 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
index 7a9a8e6..cccc6182 100644
--- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -17,6 +17,14 @@
 #define RISCV_KVM_MAX_FW_CTRS 32
 #define RISCV_MAX_COUNTERS      64
 
+struct kvm_fw_event {
+	/* Current value of the event */
+	unsigned long value;
+
+	/* Event monitoring status */
+	bool started;
+};
+
 /* Per virtual pmu counter data */
 struct kvm_pmc {
 	u8 idx;
@@ -25,11 +33,14 @@ struct kvm_pmc {
 	union sbi_pmu_ctr_info cinfo;
 	/* Event monitoring status */
 	bool started;
+	/* Monitoring event ID */
+	unsigned long event_idx;
 };
 
 /* PMU data structure per vcpu */
 struct kvm_pmu {
 	struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
+	struct kvm_fw_event fw_event[RISCV_KVM_MAX_FW_CTRS];
 	/* Number of the virtual firmware counters available */
 	int num_fw_ctrs;
 	/* Number of the virtual hardware counters available */
@@ -52,6 +63,7 @@ struct kvm_pmu {
 { .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
 #endif
 
+int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid);
 int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
 				unsigned long *val, unsigned long new_val,
 				unsigned long wr_mask);
@@ -81,6 +93,10 @@ struct kvm_pmu {
 #define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
 { .base = 0,      .count = 0, .func = NULL },
 
+static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
+{
+	return 0;
+}
 
 static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 2a0ef738..a192a95a 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -171,7 +171,7 @@ enum sbi_pmu_fw_generic_events_t {
 	SBI_PMU_FW_IPI_SENT		= 6,
 	SBI_PMU_FW_IPI_RECVD		= 7,
 	SBI_PMU_FW_FENCE_I_SENT		= 8,
-	SBI_PMU_FW_FENCE_I_RECVD	= 9,
+	SBI_PMU_FW_FENCE_I_RCVD		= 9,
 	SBI_PMU_FW_SFENCE_VMA_SENT	= 10,
 	SBI_PMU_FW_SFENCE_VMA_RCVD	= 11,
 	SBI_PMU_FW_SFENCE_VMA_ASID_SENT	= 12,
diff --git a/arch/riscv/kvm/tlb.c b/arch/riscv/kvm/tlb.c
index 309d79b..de81920 100644
--- a/arch/riscv/kvm/tlb.c
+++ b/arch/riscv/kvm/tlb.c
@@ -181,6 +181,7 @@ void kvm_riscv_local_tlb_sanitize(struct kvm_vcpu *vcpu)
 
 void kvm_riscv_fence_i_process(struct kvm_vcpu *vcpu)
 {
+	kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_RCVD);
 	local_flush_icache_all();
 }
 
@@ -264,15 +265,18 @@ void kvm_riscv_hfence_process(struct kvm_vcpu *vcpu)
 						d.addr, d.size, d.order);
 			break;
 		case KVM_RISCV_HFENCE_VVMA_ASID_GVA:
+			kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
 			kvm_riscv_local_hfence_vvma_asid_gva(
 						READ_ONCE(v->vmid), d.asid,
 						d.addr, d.size, d.order);
 			break;
 		case KVM_RISCV_HFENCE_VVMA_ASID_ALL:
+			kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
 			kvm_riscv_local_hfence_vvma_asid_all(
 						READ_ONCE(v->vmid), d.asid);
 			break;
 		case KVM_RISCV_HFENCE_VVMA_GVA:
+			kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_RCVD);
 			kvm_riscv_local_hfence_vvma_gva(
 						READ_ONCE(v->vmid),
 						d.addr, d.size, d.order);
@@ -323,7 +327,7 @@ void kvm_riscv_fence_i(struct kvm *kvm,
 		       unsigned long hbase, unsigned long hmask)
 {
 	make_xfence_request(kvm, hbase, hmask, KVM_REQ_FENCE_I,
-			    KVM_REQ_FENCE_I, NULL);
+		    KVM_REQ_FENCE_I, NULL);
 }
 
 void kvm_riscv_hfence_gvma_vmid_gpa(struct kvm *kvm,
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 21c1f0f..a64a7ae 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -170,18 +170,36 @@ static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
 	return pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
 }
 
+int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
+{
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	struct kvm_fw_event *fevent;
+
+	if (!kvpmu || fid >= SBI_PMU_FW_MAX)
+		return -EINVAL;
+
+	fevent = &kvpmu->fw_event[fid];
+	if (fevent->started)
+		fevent->value++;
+
+	return 0;
+}
+
 static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
 			unsigned long *out_val)
 {
 	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
 	struct kvm_pmc *pmc;
 	u64 enabled, running;
+	int fevent_code;
 
 	pmc = &kvpmu->pmc[cidx];
-	if (!pmc->perf_event)
-		return -EINVAL;
 
-	pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
+	if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+		fevent_code = get_event_code(pmc->event_idx);
+		pmc->counter_val = kvpmu->fw_event[fevent_code].value;
+	} else if (pmc->perf_event)
+		pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
 	*out_val = pmc->counter_val;
 
 	return 0;
@@ -238,6 +256,7 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
 	int i, num_ctrs, pmc_index, sbiret = 0;
 	struct kvm_pmc *pmc;
+	int fevent_code;
 
 	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
 	if (ctr_base + __fls(ctr_mask) >= num_ctrs) {
@@ -253,7 +272,22 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 		pmc = &kvpmu->pmc[pmc_index];
 		if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
 			pmc->counter_val = ival;
-		if (pmc->perf_event) {
+		if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+			fevent_code = get_event_code(pmc->event_idx);
+			if (fevent_code >= SBI_PMU_FW_MAX) {
+				sbiret = SBI_ERR_INVALID_PARAM;
+				goto out;
+			}
+
+			/* Check if the counter was already started for some reason */
+			if (kvpmu->fw_event[fevent_code].started) {
+				sbiret = SBI_ERR_ALREADY_STARTED;
+				continue;
+			}
+
+			kvpmu->fw_event[fevent_code].started = true;
+			kvpmu->fw_event[fevent_code].value = pmc->counter_val;
+		} else if (pmc->perf_event) {
 			if (unlikely(pmc->started)) {
 				sbiret = SBI_ERR_ALREADY_STARTED;
 				continue;
@@ -281,6 +315,7 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 	int i, num_ctrs, pmc_index, sbiret = 0;
 	u64 enabled, running;
 	struct kvm_pmc *pmc;
+	int fevent_code;
 
 	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
 	if ((ctr_base + __fls(ctr_mask)) >= num_ctrs) {
@@ -294,7 +329,18 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
 			continue;
 		pmc = &kvpmu->pmc[pmc_index];
-		if (pmc->perf_event) {
+		if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+			fevent_code = get_event_code(pmc->event_idx);
+			if (fevent_code >= SBI_PMU_FW_MAX) {
+				sbiret = SBI_ERR_INVALID_PARAM;
+				goto out;
+			}
+
+			if (!kvpmu->fw_event[fevent_code].started)
+				sbiret = SBI_ERR_ALREADY_STOPPED;
+
+			kvpmu->fw_event[fevent_code].started = false;
+		} else if (pmc->perf_event) {
 			if (pmc->started) {
 				/* Stop counting the counter */
 				perf_event_disable(pmc->perf_event);
@@ -307,12 +353,15 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 				pmc->counter_val += perf_event_read_value(pmc->perf_event,
 									  &enabled, &running);
 				pmu_release_perf_event(pmc);
-				clear_bit(pmc_index, kvpmu->pmc_in_use);
 			}
 		} else {
 			kvm_debug("Cannot stop counter due to invalid configuration\n");
 			sbiret = SBI_ERR_INVALID_PARAM;
 		}
+		if (flag & SBI_PMU_STOP_FLAG_RESET) {
+			pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
+			clear_bit(pmc_index, kvpmu->pmc_in_use);
+		}
 	}
 
 out:
@@ -329,12 +378,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
 	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
 	struct perf_event *event;
 	struct perf_event_attr attr;
-	int num_ctrs, ctr_idx;
+	int num_ctrs, ctr_idx, sbiret = 0;
 	u32 etype = pmu_get_perf_event_type(eidx);
 	u64 config;
-	struct kvm_pmc *pmc;
-	int sbiret = 0;
-
+	struct kvm_pmc *pmc = NULL;
+	bool is_fevent;
+	unsigned long event_code;
 
 	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
 	if (etype == PERF_TYPE_MAX || (ctr_base + __fls(ctr_mask) >= num_ctrs)) {
@@ -342,7 +391,9 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
 		goto out;
 	}
 
-	if (pmu_is_fw_event(eidx)) {
+	event_code = get_event_code(eidx);
+	is_fevent = pmu_is_fw_event(eidx);
+	if (is_fevent && event_code >= SBI_PMU_FW_MAX) {
 		sbiret = SBI_ERR_NOT_SUPPORTED;
 		goto out;
 	}
@@ -357,7 +408,10 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
 			goto out;
 		}
 		ctr_idx = ctr_base;
-		goto match_done;
+		if (is_fevent)
+			goto perf_event_done;
+		else
+			goto match_done;
 	}
 
 	ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
@@ -366,6 +420,13 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
 		goto out;
 	}
 
+	/*
+	 * No need to create perf events for firmware events as the firmware counter
+	 * is supposed to return the measurement of VS->HS mode invocations.
+	 */
+	if (is_fevent)
+		goto perf_event_done;
+
 match_done:
 	pmc = &kvpmu->pmc[ctr_idx];
 	pmu_release_perf_event(pmc);
@@ -404,10 +465,19 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
 		return -EOPNOTSUPP;
 	}
 
-	set_bit(ctr_idx, kvpmu->pmc_in_use);
 	pmc->perf_event = event;
-	if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
-		perf_event_enable(pmc->perf_event);
+perf_event_done:
+	if (flag & SBI_PMU_CFG_FLAG_AUTO_START) {
+		if (is_fevent)
+			kvpmu->fw_event[event_code].started = true;
+		else
+			perf_event_enable(pmc->perf_event);
+	}
+	/* This should be only true for firmware events */
+	if (!pmc)
+		pmc = &kvpmu->pmc[ctr_idx];
+	pmc->event_idx = eidx;
+	set_bit(ctr_idx, kvpmu->pmc_in_use);
 
 	ext_data->out_val = ctr_idx;
 out:
@@ -451,6 +521,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 	bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
 	kvpmu->num_hw_ctrs = num_hw_ctrs;
 	kvpmu->num_fw_ctrs = num_fw_ctrs;
+	memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
 	/*
 	 * There is no correlation between the logical hardware counter and virtual counters.
 	 * However, we need to encode a hpmcounter CSR in the counter info field so that
@@ -464,6 +535,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
 		pmc = &kvpmu->pmc[i];
 		pmc->idx = i;
 		pmc->counter_val = 0;
+		pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
 		if (i < kvpmu->num_hw_ctrs) {
 			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
 			if (i < 3)
@@ -501,7 +573,10 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
 	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
 		pmc = &kvpmu->pmc[i];
 		pmu_release_perf_event(pmc);
+		pmc->counter_val = 0;
+		pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
 	}
+	memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
 }
 
 void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index d029136..3f39711 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -11,6 +11,7 @@
 #include <linux/kvm_host.h>
 #include <asm/sbi.h>
 #include <asm/kvm_vcpu_timer.h>
+#include <asm/kvm_vcpu_pmu.h>
 #include <asm/kvm_vcpu_sbi.h>
 
 static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
@@ -26,6 +27,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		return 0;
 	}
 
+	kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_SET_TIMER);
 #if __riscv_xlen == 32
 	next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
 #else
@@ -58,6 +60,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		return 0;
 	}
 
+	kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_IPI_SENT);
 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
 		if (hbase != -1UL) {
 			if (tmp->vcpu_id < hbase)
@@ -68,6 +71,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		ret = kvm_riscv_vcpu_set_interrupt(tmp, IRQ_VS_SOFT);
 		if (ret < 0)
 			break;
+		kvm_riscv_vcpu_pmu_incr_fw(tmp, SBI_PMU_FW_IPI_RECVD);
 	}
 
 	return ret;
@@ -91,6 +95,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 	switch (funcid) {
 	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
 		kvm_riscv_fence_i(vcpu->kvm, hbase, hmask);
+		kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_SENT);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
 		if (cp->a2 == 0 && cp->a3 == 0)
@@ -98,6 +103,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 		else
 			kvm_riscv_hfence_vvma_gva(vcpu->kvm, hbase, hmask,
 						  cp->a2, cp->a3, PAGE_SHIFT);
+		kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_SENT);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
 		if (cp->a2 == 0 && cp->a3 == 0)
@@ -108,6 +114,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 						       hbase, hmask,
 						       cp->a2, cp->a3,
 						       PAGE_SHIFT, cp->a4);
+		kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_SENT);
 		break;
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
 	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf
  2022-12-15 17:00 ` [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf Atish Patra
@ 2022-12-15 20:17   ` Conor Dooley
  2022-12-15 21:10     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Conor Dooley @ 2022-12-15 20:17 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Andrew Jones, Anup Patel, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

Hey Atish,

On Thu, Dec 15, 2022 at 09:00:40AM -0800, Atish Patra wrote:
> RISC-V: KVM: Improve privilege mode filtering for perf

I almost marked this as "not applicable" in patchwork as I was misled
by the $subject. I know our perf driver is a real mixed bag, but should
it not be something more like:
"perf: RISC-V: Improve privilege mode filtering for KVM"?
It was only when I noticed that the rest of the series had been marked
as "Handled Elsewhere" that I realised that this must not be a KVM patch
;)

Thanks,
Conor

> Currently, the host driver doesn't have any method to identify if the
> requested perf event is from kvm or bare metal. As KVM runs in HS
> mode, there are no separate hypervisor privilege mode to distinguish
> between the attributes for guest/host.
> 
> Improve the privilege mode filtering by using the event specific
> config1 field.
> 
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  drivers/perf/riscv_pmu_sbi.c   | 27 ++++++++++++++++++++++-----
>  include/linux/perf/riscv_pmu.h |  2 ++
>  2 files changed, 24 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 65d4aa4..df795b7 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
> @@ -298,6 +298,27 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
>  }
>  EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
>  
> +static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
> +{
> +	unsigned long cflags = 0;
> +	bool guest_events = false;
> +
> +	if (event->attr.config1 & RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS)
> +		guest_events = true;
> +	if (event->attr.exclude_kernel)
> +		cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VSINH : SBI_PMU_CFG_FLAG_SET_SINH;
> +	if (event->attr.exclude_user)
> +		cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VUINH : SBI_PMU_CFG_FLAG_SET_UINH;
> +	if (guest_events && event->attr.exclude_hv)
> +		cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
> +	if (event->attr.exclude_host)
> +		cflags |= SBI_PMU_CFG_FLAG_SET_UINH | SBI_PMU_CFG_FLAG_SET_SINH;
> +	if (event->attr.exclude_guest)
> +		cflags |= SBI_PMU_CFG_FLAG_SET_VSINH | SBI_PMU_CFG_FLAG_SET_VUINH;
> +
> +	return cflags;
> +}
> +
>  static int pmu_sbi_ctr_get_idx(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> @@ -308,11 +329,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
>  	uint64_t cbase = 0;
>  	unsigned long cflags = 0;
>  
> -	if (event->attr.exclude_kernel)
> -		cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
> -	if (event->attr.exclude_user)
> -		cflags |= SBI_PMU_CFG_FLAG_SET_UINH;
> -
> +	cflags = pmu_sbi_get_filter_flags(event);
>  	/* retrieve the available counter index */
>  #if defined(CONFIG_32BIT)
>  	ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
> diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
> index a1c3f77..1c42146 100644
> --- a/include/linux/perf/riscv_pmu.h
> +++ b/include/linux/perf/riscv_pmu.h
> @@ -26,6 +26,8 @@
>  
>  #define RISCV_PMU_STOP_FLAG_RESET 1
>  
> +#define RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS 0x1
> +
>  struct cpu_hw_events {
>  	/* currently enabled events */
>  	int			n_events;
> -- 
> 2.25.1
> 
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf
  2022-12-15 20:17   ` Conor Dooley
@ 2022-12-15 21:10     ` Atish Kumar Patra
  0 siblings, 0 replies; 37+ messages in thread
From: Atish Kumar Patra @ 2022-12-15 21:10 UTC (permalink / raw)
  To: Conor Dooley
  Cc: linux-kernel, Andrew Jones, Anup Patel, Atish Patra, Guo Ren,
	kvm-riscv, kvm, linux-riscv, Mark Rutland, Palmer Dabbelt,
	Paul Walmsley, Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 12:18 PM Conor Dooley <conor@kernel.org> wrote:
>
> Hey Atish,
>
> On Thu, Dec 15, 2022 at 09:00:40AM -0800, Atish Patra wrote:
> > RISC-V: KVM: Improve privilege mode filtering for perf
>
> I almost marked this as "not applicable" in patchwork as I was misled
> by the $subject. I know our perf driver is a real mixed bag, but should
> it not be something more like:
> "perf: RISC-V: Improve privilege mode filtering for KVM"?

Sure. I will change it in the next version.

> It was only when I noticed that the rest of the series had been marked
> as "Handled Elsewhere" that I realised that this must not be a KVM patch
> ;)
>
> Thanks,
> Conor
>
> > Currently, the host driver doesn't have any method to identify if the
> > requested perf event is from kvm or bare metal. As KVM runs in HS
> > mode, there are no separate hypervisor privilege mode to distinguish
> > between the attributes for guest/host.
> >
> > Improve the privilege mode filtering by using the event specific
> > config1 field.
> >
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  drivers/perf/riscv_pmu_sbi.c   | 27 ++++++++++++++++++++++-----
> >  include/linux/perf/riscv_pmu.h |  2 ++
> >  2 files changed, 24 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> > index 65d4aa4..df795b7 100644
> > --- a/drivers/perf/riscv_pmu_sbi.c
> > +++ b/drivers/perf/riscv_pmu_sbi.c
> > @@ -298,6 +298,27 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
> >  }
> >  EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
> >
> > +static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
> > +{
> > +     unsigned long cflags = 0;
> > +     bool guest_events = false;
> > +
> > +     if (event->attr.config1 & RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS)
> > +             guest_events = true;
> > +     if (event->attr.exclude_kernel)
> > +             cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VSINH : SBI_PMU_CFG_FLAG_SET_SINH;
> > +     if (event->attr.exclude_user)
> > +             cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VUINH : SBI_PMU_CFG_FLAG_SET_UINH;
> > +     if (guest_events && event->attr.exclude_hv)
> > +             cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
> > +     if (event->attr.exclude_host)
> > +             cflags |= SBI_PMU_CFG_FLAG_SET_UINH | SBI_PMU_CFG_FLAG_SET_SINH;
> > +     if (event->attr.exclude_guest)
> > +             cflags |= SBI_PMU_CFG_FLAG_SET_VSINH | SBI_PMU_CFG_FLAG_SET_VUINH;
> > +
> > +     return cflags;
> > +}
> > +
> >  static int pmu_sbi_ctr_get_idx(struct perf_event *event)
> >  {
> >       struct hw_perf_event *hwc = &event->hw;
> > @@ -308,11 +329,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
> >       uint64_t cbase = 0;
> >       unsigned long cflags = 0;
> >
> > -     if (event->attr.exclude_kernel)
> > -             cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
> > -     if (event->attr.exclude_user)
> > -             cflags |= SBI_PMU_CFG_FLAG_SET_UINH;
> > -
> > +     cflags = pmu_sbi_get_filter_flags(event);
> >       /* retrieve the available counter index */
> >  #if defined(CONFIG_32BIT)
> >       ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
> > diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
> > index a1c3f77..1c42146 100644
> > --- a/include/linux/perf/riscv_pmu.h
> > +++ b/include/linux/perf/riscv_pmu.h
> > @@ -26,6 +26,8 @@
> >
> >  #define RISCV_PMU_STOP_FLAG_RESET 1
> >
> > +#define RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS 0x1
> > +
> >  struct cpu_hw_events {
> >       /* currently enabled events */
> >       int                     n_events;
> > --
> > 2.25.1
> >
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count
  2022-12-15 17:00 ` [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count Atish Patra
@ 2023-01-12 10:06   ` Andrew Jones
  2023-01-12 18:18     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 10:06 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:36AM -0800, Atish Patra wrote:
> KVM module needs to know how many hardware counters and the counter
> width that the platform supports. Otherwise, it will not be able to show
> optimal value of virtual counters to the guest. The virtual hardware
> counters also need to have the same width as the logical hardware
> counters for simplicity. However, there shouldn't be mapping between
> virtual hardware counters and logical hardware counters. As we don't
> support hetergeneous harts or counters with different width as of now,
> the implementation relies on the counter width of the first available
> programmable counter.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  drivers/perf/riscv_pmu_sbi.c   | 35 +++++++++++++++++++++++++++++++++-
>  include/linux/perf/riscv_pmu.h |  3 +++
>  2 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 3852c18..65d4aa4 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
> @@ -49,6 +49,9 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
>  static union sbi_pmu_ctr_info *pmu_ctr_list;
>  static unsigned int riscv_pmu_irq;
>  
> +/* Cache the available counters in a bitmask */
> +unsigned long cmask;

I presume this can be static since it's not getting added to the header.
And don't we need this to be a long long for rv32? We should probably
just use u64.

> +
>  struct sbi_pmu_event_data {
>  	union {
>  		union {
> @@ -264,6 +267,37 @@ static bool pmu_sbi_ctr_is_fw(int cidx)
>  	return (info->type == SBI_PMU_CTR_TYPE_FW) ? true : false;
>  }
>  
> +/*
> + * Returns the counter width of a programmable counter and number of hardware
> + * counters. As we don't support heterneous CPUs yet, it is okay to just

heterogeneous

> + * return the counter width of the first programmable counter.
> + */
> +int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
> +{
> +	int i;
> +	union sbi_pmu_ctr_info *info;
> +	u32 hpm_width = 0, hpm_count = 0;
> +
> +	if (!cmask)
> +		return -EINVAL;
> +
> +	for_each_set_bit(i, &cmask, RISCV_MAX_COUNTERS) {
> +		info = &pmu_ctr_list[i];
> +		if (!info)
> +			continue;
> +		if (!hpm_width && (info->csr != CSR_CYCLE) && (info->csr != CSR_INSTRET))

nit: No need for () around the != expressions

> +			hpm_width = info->width;
> +		if (info->type == SBI_PMU_CTR_TYPE_HW)
> +			hpm_count++;
> +	}
> +
> +	*hw_ctr_width = hpm_width;
> +	*num_hw_ctr = hpm_count;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(riscv_pmu_get_hpm_info);

EXPORT_SYMBOL_GPL ?

> +
>  static int pmu_sbi_ctr_get_idx(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> @@ -798,7 +832,6 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
>  static int pmu_sbi_device_probe(struct platform_device *pdev)
>  {
>  	struct riscv_pmu *pmu = NULL;
> -	unsigned long cmask = 0;
>  	int ret = -ENODEV;
>  	int num_counters;
>  
> diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
> index e17e86a..a1c3f77 100644
> --- a/include/linux/perf/riscv_pmu.h
> +++ b/include/linux/perf/riscv_pmu.h
> @@ -73,6 +73,9 @@ void riscv_pmu_legacy_skip_init(void);
>  static inline void riscv_pmu_legacy_skip_init(void) {};
>  #endif
>  struct riscv_pmu *riscv_pmu_alloc(void);
> +#ifdef CONFIG_RISCV_PMU_SBI
> +int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
> +#endif
>  
>  #endif /* CONFIG_RISCV_PMU */
>  
> -- 
> 2.25.1
> 

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures
  2022-12-15 17:00 ` [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures Atish Patra
@ 2023-01-12 10:21   ` Andrew Jones
  2023-01-12 18:19     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 10:21 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:37AM -0800, Atish Patra wrote:
> Currently the probe function just checks if an SBI extension is
> registered or not. However, the extension may not want to advertise
> itself depending on some other condition.
> An additional extension specific probe function will allow
> extensions to decide if they want to be advertised to the caller or
> not. Any extension that does not require additional dependency checks
> can avoid implementing this function.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/include/asm/kvm_vcpu_sbi.h |  3 +++
>  arch/riscv/kvm/vcpu_sbi_base.c        | 13 +++++++++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> index f79478a..61dac1b 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> @@ -29,6 +29,9 @@ struct kvm_vcpu_sbi_extension {
>  	int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		       unsigned long *out_val, struct kvm_cpu_trap *utrap,
>  		       bool *exit);
> +
> +	/* Extension specific probe function */
> +	unsigned long (*probe)(struct kvm_vcpu *vcpu, unsigned long extid);

It doesn't seem like the extid parameter should be necessary since the
probe function is specific to the extension, but it doesn't hurt either.

>  };
>  
>  void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
> diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> index 5d65c63..89e2415 100644
> --- a/arch/riscv/kvm/vcpu_sbi_base.c
> +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> @@ -19,6 +19,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  {
>  	int ret = 0;
>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> +	const struct kvm_vcpu_sbi_extension *sbi_ext;
>  
>  	switch (cp->a6) {
>  	case SBI_EXT_BASE_GET_SPEC_VERSION:
> @@ -43,8 +44,16 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  			 */
>  			kvm_riscv_vcpu_sbi_forward(vcpu, run);
>  			*exit = true;
> -		} else
> -			*out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
> +		} else {
> +			sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> +			if (sbi_ext) {
> +				if (sbi_ext->probe)
> +					*out_val = sbi_ext->probe(vcpu, cp->a0);
> +				else
> +					*out_val = 1;
> +			} else
> +				*out_val = 0;
> +		}
>  		break;
>  	case SBI_EXT_BASE_GET_MVENDORID:
>  		*out_val = vcpu->arch.mvendorid;
> -- 
> 2.25.1
>

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function
  2022-12-15 17:00 ` [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function Atish Patra
@ 2023-01-12 10:28   ` Andrew Jones
  2023-01-12 18:25     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 10:28 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:38AM -0800, Atish Patra wrote:
> According to the SBI specification, the stop function can only
> return error code SBI_ERR_FAILED. However, currently it returns
> -EINVAL which will be mapped SBI_ERR_INVALID_PARAM.

I presume the mapping referred to here is kvm_linux_err_map_sbi().
If so, then -EPERM isn't correct either. That maps to SBI_ERR_DENIED.
The only thing that will ensure we get SBI_ERR_FAILURE (-1) is
anything not handled by the kvm_linux_err_map_sbi switch, as we
need to use the default.
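
For reference, a condensed sketch of the mapping in question as it stands
today (trimmed to the cases relevant here):

	static int kvm_linux_err_map_sbi(int err)
	{
		switch (err) {
		case 0:
			return SBI_SUCCESS;
		case -EPERM:
			return SBI_ERR_DENIED;
		case -EINVAL:
			return SBI_ERR_INVALID_PARAM;
		default:
			/* only the default case yields SBI_ERR_FAILURE */
			return SBI_ERR_FAILURE;
		};
	}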

Thanks,
drew

> 
> Return the appropriate linux error code.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> index 2e915ca..0f8d9fe 100644
> --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
>  static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu->arch.power_off)
> -		return -EINVAL;
> +		return -EPERM;
>  
>  	kvm_riscv_vcpu_power_off(vcpu);
>  
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code
  2022-12-15 17:00 ` [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code Atish Patra
@ 2023-01-12 11:04   ` Andrew Jones
  2023-01-12 18:47     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 11:04 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:39AM -0800, Atish Patra wrote:
> Currently, the SBI extension handle is expected to return Linux error code.

handler

> The top SBI layer converts the Linux error code to SBI specific error code
> that can be returned to guest invoking the SBI calls. This model works
> as long as SBI error codes have 1-to-1 mappings between them.

as long as SBI and Linux error codes have 1-to-1...

> However, that may not be true always. This patch attempts to disassociate
> both these error codes by allowing the SBI extension implementation to
> return SBI specific error codes as well.
> 
> The extension will continue to return the Linux error specific code which
> will indicate any problem *with* the extension emulation while the
> SBI specific error will indicate the problem *of* the emulation.
> 
> Suggested-by: Andrew Jones <ajones@ventanamicro.com>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++--
>  arch/riscv/kvm/vcpu_sbi.c             | 45 ++++++++-------------------
>  arch/riscv/kvm/vcpu_sbi_base.c        | 38 +++++++++++-----------
>  arch/riscv/kvm/vcpu_sbi_hsm.c         | 22 +++++++------
>  arch/riscv/kvm/vcpu_sbi_replace.c     | 44 +++++++++++++-------------
>  5 files changed, 74 insertions(+), 85 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> index 61dac1b..fee9253 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> @@ -18,6 +18,12 @@ struct kvm_vcpu_sbi_context {
>  	int return_handled;
>  };
>  
> +struct kvm_vcpu_sbi_ext_data {
> +	unsigned long out_val;
> +	unsigned long err_val;
> +	bool uexit;
> +};
> +
>  struct kvm_vcpu_sbi_extension {
>  	unsigned long extid_start;
>  	unsigned long extid_end;
> @@ -27,8 +33,8 @@ struct kvm_vcpu_sbi_extension {
>  	 * specific error codes.
>  	 */
>  	int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -		       unsigned long *out_val, struct kvm_cpu_trap *utrap,
> -		       bool *exit);
> +		       struct kvm_vcpu_sbi_ext_data *edata,
> +		       struct kvm_cpu_trap *utrap);
>  
>  	/* Extension specific probe function */
>  	unsigned long (*probe)(struct kvm_vcpu *vcpu, unsigned long extid);
> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> index f96991d..50c5472 100644
> --- a/arch/riscv/kvm/vcpu_sbi.c
> +++ b/arch/riscv/kvm/vcpu_sbi.c
> @@ -12,26 +12,6 @@
>  #include <asm/sbi.h>
>  #include <asm/kvm_vcpu_sbi.h>
>  
> -static int kvm_linux_err_map_sbi(int err)
> -{
> -	switch (err) {
> -	case 0:
> -		return SBI_SUCCESS;
> -	case -EPERM:
> -		return SBI_ERR_DENIED;
> -	case -EINVAL:
> -		return SBI_ERR_INVALID_PARAM;
> -	case -EFAULT:
> -		return SBI_ERR_INVALID_ADDRESS;
> -	case -EOPNOTSUPP:
> -		return SBI_ERR_NOT_SUPPORTED;
> -	case -EALREADY:
> -		return SBI_ERR_ALREADY_AVAILABLE;
> -	default:
> -		return SBI_ERR_FAILURE;
> -	};
> -}
> -
>  #ifndef CONFIG_RISCV_SBI_V01
>  static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
>  	.extid_start = -1UL,
> @@ -125,11 +105,10 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
>  	int ret = 1;
>  	bool next_sepc = true;
> -	bool userspace_exit = false;
>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
>  	const struct kvm_vcpu_sbi_extension *sbi_ext;
>  	struct kvm_cpu_trap utrap = { 0 };
> -	unsigned long out_val = 0;
> +	struct kvm_vcpu_sbi_ext_data edata_out = { 0 };
>  	bool ext_is_v01 = false;
>  
>  	sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
> @@ -139,7 +118,7 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		    cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
>  			ext_is_v01 = true;
>  #endif
> -		ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
> +		ret = sbi_ext->handler(vcpu, run, &edata_out, &utrap);
>  	} else {
>  		/* Return error for unsupported SBI calls */
>  		cp->a0 = SBI_ERR_NOT_SUPPORTED;
> @@ -156,25 +135,27 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		goto ecall_done;
>  	}
>  
> +	/* The SBI extension returns Linux error code. Exits the ioctl loop

Need opening /* wing

And for wording, maybe

 When the SBI extension returns a Linux error code, it exits the ioctl
 loop and forwards the error to userspace.

> +	 * and forwards the error to the userspace.
> +	 */
> +	if (ret < 0) {
> +		next_sepc = false;
> +		goto ecall_done;
> +	}

Shouldn't this ret < 0 check go above the "Handle special error cases..."
block?

> +
>  	/* Exit ioctl loop or Propagate the error code the guest */
> -	if (userspace_exit) {
> +	if (edata_out.uexit) {
>  		next_sepc = false;
>  		ret = 0;
>  	} else {
> -		/**
> -		 * SBI extension handler always returns an Linux error code. Convert
> -		 * it to the SBI specific error code that can be propagated the SBI
> -		 * caller.
> -		 */
> -		ret = kvm_linux_err_map_sbi(ret);
> -		cp->a0 = ret;
> +		cp->a0 = edata_out.err_val;
>  		ret = 1;
>  	}
>  ecall_done:
>  	if (next_sepc)
>  		cp->sepc += 4;
>  	if (!ext_is_v01)
> -		cp->a1 = out_val;
> +		cp->a1 = edata_out.out_val;
>  
>  	return ret;
>  }
> diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> index 89e2415..487828d 100644
> --- a/arch/riscv/kvm/vcpu_sbi_base.c
> +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> @@ -14,24 +14,23 @@
>  #include <asm/kvm_vcpu_sbi.h>
>  
>  static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -				    unsigned long *out_val,
> -				    struct kvm_cpu_trap *trap, bool *exit)
> +				    struct kvm_vcpu_sbi_ext_data *edata,
> +				    struct kvm_cpu_trap *trap)
>  {
> -	int ret = 0;
>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
>  	const struct kvm_vcpu_sbi_extension *sbi_ext;
>  
>  	switch (cp->a6) {
>  	case SBI_EXT_BASE_GET_SPEC_VERSION:
> -		*out_val = (KVM_SBI_VERSION_MAJOR <<
> +		edata->out_val = (KVM_SBI_VERSION_MAJOR <<
>  			    SBI_SPEC_VERSION_MAJOR_SHIFT) |
>  			    KVM_SBI_VERSION_MINOR;
>  		break;
>  	case SBI_EXT_BASE_GET_IMP_ID:
> -		*out_val = KVM_SBI_IMPID;
> +		edata->out_val = KVM_SBI_IMPID;
>  		break;
>  	case SBI_EXT_BASE_GET_IMP_VERSION:
> -		*out_val = LINUX_VERSION_CODE;
> +		edata->out_val = LINUX_VERSION_CODE;
>  		break;
>  	case SBI_EXT_BASE_PROBE_EXT:
>  		if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
> @@ -43,33 +42,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  			 * forward it to the userspace
>  			 */
>  			kvm_riscv_vcpu_sbi_forward(vcpu, run);
> -			*exit = true;
> +			edata->uexit = true;
>  		} else {
>  			sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
>  			if (sbi_ext) {
>  				if (sbi_ext->probe)
> -					*out_val = sbi_ext->probe(vcpu, cp->a0);
> +					edata->out_val = sbi_ext->probe(vcpu, cp->a0);
>  				else
> -					*out_val = 1;
> +					edata->out_val = 1;
>  			} else
> -				*out_val = 0;
> +				edata->out_val = 0;
>  		}
>  		break;
>  	case SBI_EXT_BASE_GET_MVENDORID:
> -		*out_val = vcpu->arch.mvendorid;
> +		edata->out_val = vcpu->arch.mvendorid;
>  		break;
>  	case SBI_EXT_BASE_GET_MARCHID:
> -		*out_val = vcpu->arch.marchid;
> +		edata->out_val = vcpu->arch.marchid;
>  		break;
>  	case SBI_EXT_BASE_GET_MIMPID:
> -		*out_val = vcpu->arch.mimpid;
> +		edata->out_val = vcpu->arch.mimpid;
>  		break;
>  	default:
> -		ret = -EOPNOTSUPP;
> +		edata->err_val = SBI_ERR_NOT_SUPPORTED;
>  		break;
>  	}
>  
> -	return ret;
> +	return 0;
>  }
>  
>  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> @@ -79,17 +78,16 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
>  };
>  
>  static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
> -					struct kvm_run *run,
> -					unsigned long *out_val,
> -					struct kvm_cpu_trap *utrap,
> -					bool *exit)
> +				       struct kvm_run *run,
> +				       struct kvm_vcpu_sbi_ext_data *edata,
> +				       struct kvm_cpu_trap *utrap)
>  {
>  	/*
>  	 * Both SBI experimental and vendor extensions are
>  	 * unconditionally forwarded to userspace.
>  	 */
>  	kvm_riscv_vcpu_sbi_forward(vcpu, run);
> -	*exit = true;
> +	edata->uexit = true;
>  	return 0;
>  }
>  
> diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> index 0f8d9fe..4188f21 100644
> --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> @@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
>  
>  	target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
>  	if (!target_vcpu)
> -		return -EINVAL;
> +		return SBI_ERR_INVALID_PARAM;
>  	if (!target_vcpu->arch.power_off)
> -		return -EALREADY;
> +		return SBI_ERR_ALREADY_AVAILABLE;
>  
>  	reset_cntx = &target_vcpu->arch.guest_reset_context;
>  	/* start address */
> @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
>  static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu->arch.power_off)
> -		return -EPERM;
> +		return SBI_ERR_FAILURE;

-EPERM shouldn't map to SBI_ERR_FAILURE, but this should be addressed in
the previous patch.

>  
>  	kvm_riscv_vcpu_power_off(vcpu);
>  
> @@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
>  
>  	target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
>  	if (!target_vcpu)
> -		return -EINVAL;
> +		return SBI_ERR_INVALID_PARAM;
>  	if (!target_vcpu->arch.power_off)
>  		return SBI_HSM_STATE_STARTED;
>  	else if (vcpu->stat.generic.blocking)
> @@ -66,10 +66,10 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
>  		return SBI_HSM_STATE_STOPPED;
>  }
>  
> +

stray new blank line

>  static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -				   unsigned long *out_val,
> -				   struct kvm_cpu_trap *utrap,
> -				   bool *exit)
> +				   struct kvm_vcpu_sbi_ext_data *edata,
> +				   struct kvm_cpu_trap *utrap)
>  {
>  	int ret = 0;
>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> @@ -88,8 +88,8 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	case SBI_EXT_HSM_HART_STATUS:
>  		ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
>  		if (ret >= 0) {
> -			*out_val = ret;
> -			ret = 0;
> +			edata->out_val = ret;
> +			edata->err_val = 0;

We want a 'return 0;' here. Otherwise below we write ret to err_val.

>  		}
>  		break;
>  	case SBI_EXT_HSM_HART_SUSPEND:
> @@ -108,7 +108,9 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		ret = -EOPNOTSUPP;

This ret = -EOPNOTSUPP in the context and two others above it need to be
converted to SBI errors.

>  	}
>  
> -	return ret;
> +	edata->err_val = ret;
> +
> +	return 0;
>  }
>  
>  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
> diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> index 03a0198..d029136 100644
> --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> @@ -14,15 +14,17 @@
>  #include <asm/kvm_vcpu_sbi.h>
>  
>  static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -				    unsigned long *out_val,
> -				    struct kvm_cpu_trap *utrap, bool *exit)
> +				    struct kvm_vcpu_sbi_ext_data *edata,
> +				    struct kvm_cpu_trap *utrap)
>  {
>  	int ret = 0;

ret is only used at the bottom to return 0. We can drop the variable and
change it to an explicit zero.

>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
>  	u64 next_cycle;
>  
> -	if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
> -		return -EINVAL;
> +	if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
> +		edata->err_val = SBI_ERR_INVALID_PARAM;
> +		return 0;
> +	}
>  
>  #if __riscv_xlen == 32
>  	next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
> @@ -41,8 +43,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
>  };
>  
>  static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -				   unsigned long *out_val,
> -				   struct kvm_cpu_trap *utrap, bool *exit)
> +				   struct kvm_vcpu_sbi_ext_data *edata,
> +				   struct kvm_cpu_trap *utrap)
>  {
>  	int ret = 0;
>  	unsigned long i;
> @@ -51,8 +53,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	unsigned long hmask = cp->a0;
>  	unsigned long hbase = cp->a1;
>  
> -	if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
> -		return -EINVAL;
> +	if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
> +		edata->err_val = SBI_ERR_INVALID_PARAM;
> +		return 0;
> +	}
>  
>  	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
>  		if (hbase != -1UL) {
> @@ -76,10 +80,9 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
>  };
>  
>  static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -				      unsigned long *out_val,
> -				      struct kvm_cpu_trap *utrap, bool *exit)
> +				      struct kvm_vcpu_sbi_ext_data *edata,
> +				      struct kvm_cpu_trap *utrap)
>  {
> -	int ret = 0;
>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
>  	unsigned long hmask = cp->a0;
>  	unsigned long hbase = cp->a1;
> @@ -116,10 +119,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
>  		 */
>  		break;
>  	default:
> -		ret = -EOPNOTSUPP;
> +		edata->err_val = SBI_ERR_NOT_SUPPORTED;
>  	}
>  
> -	return ret;
> +	return 0;
>  }
>  
>  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> @@ -130,14 +133,13 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
>  
>  static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
>  				    struct kvm_run *run,
> -				    unsigned long *out_val,
> -				    struct kvm_cpu_trap *utrap, bool *exit)
> +				    struct kvm_vcpu_sbi_ext_data *edata,
> +				    struct kvm_cpu_trap *utrap)
>  {
>  	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
>  	unsigned long funcid = cp->a6;
>  	u32 reason = cp->a1;
>  	u32 type = cp->a0;
> -	int ret = 0;
>  
>  	switch (funcid) {
>  	case SBI_EXT_SRST_RESET:
> @@ -146,24 +148,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
>  			kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
>  						KVM_SYSTEM_EVENT_SHUTDOWN,
>  						reason);
> -			*exit = true;
> +			edata->uexit = true;
>  			break;
>  		case SBI_SRST_RESET_TYPE_COLD_REBOOT:
>  		case SBI_SRST_RESET_TYPE_WARM_REBOOT:
>  			kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
>  						KVM_SYSTEM_EVENT_RESET,
>  						reason);
> -			*exit = true;
> +			edata->uexit = true;
>  			break;
>  		default:
> -			ret = -EOPNOTSUPP;
> +			edata->err_val = SBI_ERR_NOT_SUPPORTED;
>  		}
>  		break;
>  	default:
> -		ret = -EOPNOTSUPP;
> +		edata->err_val = SBI_ERR_NOT_SUPPORTED;
>  	}
>  
> -	return ret;
> +	return 0;
>  }
>  
>  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
> -- 
> 2.25.1
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 06/11] RISC-V: KVM: Add skeleton support for perf
  2022-12-15 17:00 ` [PATCH v2 06/11] RISC-V: KVM: Add skeleton support " Atish Patra
@ 2023-01-12 15:10   ` Andrew Jones
  2023-01-12 18:09     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 15:10 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:41AM -0800, Atish Patra wrote:
> This patch only adds the barebone structure of the perf implementation.
> Most of the functions return zero at this point and will be implemented
> fully in the future.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/include/asm/kvm_host.h     |   3 +
>  arch/riscv/include/asm/kvm_vcpu_pmu.h |  76 ++++++++++++++
>  arch/riscv/kvm/Makefile               |   1 +
>  arch/riscv/kvm/vcpu.c                 |   5 +
>  arch/riscv/kvm/vcpu_insn.c            |   2 +-
>  arch/riscv/kvm/vcpu_pmu.c             | 142 ++++++++++++++++++++++++++
>  6 files changed, 228 insertions(+), 1 deletion(-)
>  create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
>  create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> 
> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> index 93f43a3..f9874b4 100644
> --- a/arch/riscv/include/asm/kvm_host.h
> +++ b/arch/riscv/include/asm/kvm_host.h
> @@ -18,6 +18,7 @@
>  #include <asm/kvm_vcpu_insn.h>
>  #include <asm/kvm_vcpu_sbi.h>
>  #include <asm/kvm_vcpu_timer.h>
> +#include <asm/kvm_vcpu_pmu.h>
>  
>  #define KVM_MAX_VCPUS			1024
>  
> @@ -228,6 +229,8 @@ struct kvm_vcpu_arch {
>  
>  	/* Don't run the VCPU (blocked) */
>  	bool pause;
> +
> +	struct kvm_pmu pmu;
>  };
>  
>  static inline void kvm_arch_hardware_unsetup(void) {}
> diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> new file mode 100644
> index 0000000..6a8c0f7
> --- /dev/null
> +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> @@ -0,0 +1,76 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022 Rivos Inc
> + *
> + * Authors:
> + *     Atish Patra <atishp@rivosinc.com>
> + */
> +
> +#ifndef __KVM_VCPU_RISCV_PMU_H
> +#define __KVM_VCPU_RISCV_PMU_H
> +
> +#include <linux/perf/riscv_pmu.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +#include <asm/sbi.h>
> +
> +#ifdef CONFIG_RISCV_PMU_SBI
> +#define RISCV_KVM_MAX_FW_CTRS 32
> +#define RISCV_MAX_COUNTERS      64
> +
> +/* Per virtual pmu counter data */
> +struct kvm_pmc {
> +	u8 idx;
> +	struct perf_event *perf_event;
> +	uint64_t counter_val;
> +	union sbi_pmu_ctr_info cinfo;
> +	/* Event monitoring status */
> +	bool started;
> +};
> +
> +/* PMU data structure per vcpu */
> +struct kvm_pmu {
> +	struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> +	/* Number of the virtual firmware counters available */
> +	int num_fw_ctrs;
> +	/* Number of the virtual hardware counters available */
> +	int num_hw_ctrs;
> +	/* A flag to indicate that pmu initialization is done */
> +	bool init_done;
> +	/* Bit map of all the virtual counter used */
> +	DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
> +};
> +
> +#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
> +#define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
> +
> +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata);
> +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> +				struct kvm_vcpu_sbi_ext_data *edata);
> +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> +				 unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> +				 struct kvm_vcpu_sbi_ext_data *edata);
> +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> +				unsigned long ctr_mask, unsigned long flag,
> +				struct kvm_vcpu_sbi_ext_data *edata);
> +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> +				     unsigned long ctr_mask, unsigned long flag,
> +				     unsigned long eidx, uint64_t edata,
> +				     struct kvm_vcpu_sbi_ext_data *extdata);

How about replacing 'edata' with 'evtdata' and then using 'edata' for the
struct kvm_vcpu_sbi_ext_data pointer in order to keep the struct pointer
name consistent with the other functions?
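
i.e. something like

 int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
				      unsigned long ctr_mask, unsigned long flag,
				      unsigned long eidx, uint64_t evtdata,
				      struct kvm_vcpu_sbi_ext_data *edata);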

> +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> +				struct kvm_vcpu_sbi_ext_data *edata);
> +int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
> +
> +#else
> +struct kvm_pmu {
> +};
> +
> +static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> +{
> +	return 0;
> +}
> +static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> +#endif
> +#endif

nit: it'd be nice to have

 #endif /* CONFIG_RISCV_PMU_SBI */
 #endif /* !__KVM_VCPU_RISCV_PMU_H */

> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> index 019df920..5de1053 100644
> --- a/arch/riscv/kvm/Makefile
> +++ b/arch/riscv/kvm/Makefile
> @@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
>  kvm-y += vcpu_sbi_replace.o
>  kvm-y += vcpu_sbi_hsm.o
>  kvm-y += vcpu_timer.o
> +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index 7c08567..b746f21 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -137,6 +137,7 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
>  
>  	WRITE_ONCE(vcpu->arch.irqs_pending, 0);
>  	WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
> +	kvm_riscv_vcpu_pmu_reset(vcpu);
>  
>  	vcpu->arch.hfence_head = 0;
>  	vcpu->arch.hfence_tail = 0;
> @@ -194,6 +195,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  	/* Setup VCPU timer */
>  	kvm_riscv_vcpu_timer_init(vcpu);
>  
> +	/* setup performance monitoring */
> +	kvm_riscv_vcpu_pmu_init(vcpu);
> +
>  	/* Reset VCPU */
>  	kvm_riscv_reset_vcpu(vcpu);
>  
> @@ -216,6 +220,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	/* Cleanup VCPU timer */
>  	kvm_riscv_vcpu_timer_deinit(vcpu);
>  
> +	kvm_riscv_vcpu_pmu_deinit(vcpu);
>  	/* Free unused pages pre-allocated for G-stage page table mappings */
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>  }
> diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c
> index 0bb5276..1ff2649 100644
> --- a/arch/riscv/kvm/vcpu_insn.c
> +++ b/arch/riscv/kvm/vcpu_insn.c
> @@ -213,7 +213,7 @@ struct csr_func {
>  		    unsigned long wr_mask);
>  };
>  
> -static const struct csr_func csr_funcs[] = { };
> +static const struct csr_func csr_funcs[] = {};

Stray change

>  
>  /**
>   * kvm_riscv_vcpu_csr_return -- Handle CSR read/write after user space
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> new file mode 100644
> index 0000000..0f0748f1
> --- /dev/null
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -0,0 +1,142 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 Rivos Inc
> + *
> + * Authors:
> + *     Atish Patra <atishp@rivosinc.com>
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/err.h>
> +#include <linux/kvm_host.h>
> +#include <linux/perf/riscv_pmu.h>
> +#include <asm/csr.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +#include <asm/kvm_vcpu_pmu.h>
> +#include <linux/kvm_host.h>
> +
> +#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> +
> +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata)
> +{
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> +	edata->out_val = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;

 edata->out_val = kvm_pmu_num_counters(kvpmu); 

> +
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> +				struct kvm_vcpu_sbi_ext_data *edata)
> +{
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> +	if ((cidx > RISCV_MAX_COUNTERS) || (cidx == 1)) {

nit: No need for () around the expressions
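
e.g.

	if (cidx > RISCV_MAX_COUNTERS || cidx == 1) {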

> +		edata->err_val = SBI_ERR_INVALID_PARAM;
> +		return 0;
> +	}
> +
> +	edata->out_val = kvpmu->pmc[cidx].cinfo.value;
> +
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> +				 unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> +				 struct kvm_vcpu_sbi_ext_data *edata)
> +{
> +	/* TODO */
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> +				unsigned long ctr_mask, unsigned long flag,
> +				struct kvm_vcpu_sbi_ext_data *edata)
> +{
> +	/* TODO */
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> +				     unsigned long ctr_mask, unsigned long flag,
> +				     unsigned long eidx, uint64_t edata,
> +				     struct kvm_vcpu_sbi_ext_data *extdata)
> +{
> +	/* TODO */
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> +				struct kvm_vcpu_sbi_ext_data *edata)
> +{
> +	/* TODO */
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> +{
> +	int i = 0, num_fw_ctrs, ret, num_hw_ctrs = 0, hpm_width = 0;
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> +	ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (!hpm_width || !num_hw_ctrs) {
> +		pr_err("Can not initialize PMU for vcpu with NULL hpmcounter width/count\n");
                        ^ Cannot                   ^ VCPU                         ^ or number counters

> +		return -EINVAL;
> +	}
> +
> +	if ((num_hw_ctrs + RISCV_KVM_MAX_FW_CTRS) > RISCV_MAX_COUNTERS)

Shouldn't we warn about this condition? Presumably it means Linux selected
RISCV_MAX_COUNTERS too small, so a warning would let us know we need to
bump it up.
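
Roughly (just a sketch; the warning text is only an example):

	if (num_hw_ctrs + RISCV_KVM_MAX_FW_CTRS > RISCV_MAX_COUNTERS) {
		pr_warn("Limiting the number of firmware counters; RISCV_MAX_COUNTERS may need to be increased\n");
		num_fw_ctrs = RISCV_MAX_COUNTERS - num_hw_ctrs;
	} else {
		num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
	}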

> +		num_fw_ctrs = RISCV_MAX_COUNTERS - num_hw_ctrs;
> +	else
> +		num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
> +
> +	kvpmu->num_hw_ctrs = num_hw_ctrs;
> +	kvpmu->num_fw_ctrs = num_fw_ctrs;

nit: add blank line here

> +	/*
> +	 * There is no corelation betwen the logical hardware counter and virtual counters.

correlation 

> +	 * However, we need to encode a hpmcounter CSR in the counter info field so that
> +	 * KVM can trap n emulate the read. This works well in the migraiton usecase as

migration

> +	 * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
> +	 */
> +	for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
> +		/* TIME CSR shouldn't be read from perf interface */
> +		if (i == 1)
> +			continue;
> +		kvpmu->pmc[i].idx = i;
> +		if (i < kvpmu->num_hw_ctrs) {
> +			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
> +			if (i < 3)
> +				/* CY, IR counters */
> +				kvpmu->pmc[i].cinfo.width = 63;
> +			else
> +				kvpmu->pmc[i].cinfo.width = hpm_width;
> +			/*
> +			 * The CSR number doesn't have any relation with the logical
> +			 * hardware counters. The CSR numbers are encoded sequentially
> +			 * to avoid maintaining a map between the virtual counter
> +			 * and CSR number.
> +			 */
> +			kvpmu->pmc[i].cinfo.csr = CSR_CYCLE + i;
> +		} else {
> +			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_FW;
> +			kvpmu->pmc[i].cinfo.width = BITS_PER_LONG - 1;
> +		}
> +	}
> +
> +	kvpmu->init_done = true;
> +
> +	return 0;
> +}
> +
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> +{
> +	/* TODO */
> +}
> +
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> +{
> +	/* TODO */
> +}
> +
> -- 
> 2.25.1
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support
  2022-12-15 17:00 ` [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support Atish Patra
@ 2023-01-12 15:29   ` Andrew Jones
  2023-01-12 18:04     ` Atish Kumar Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 15:29 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:42AM -0800, Atish Patra wrote:
> The SBI PMU extension allows KVM guests to configure/start/stop/query
> the PMU counters in a virtualized environment as well.
> 
> In order to allow that, KVM implements the entire SBI PMU extension.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/kvm/Makefile       |  2 +-
>  arch/riscv/kvm/vcpu_sbi.c     | 11 +++++
>  arch/riscv/kvm/vcpu_sbi_pmu.c | 86 +++++++++++++++++++++++++++++++++++
>  3 files changed, 98 insertions(+), 1 deletion(-)
>  create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c
> 
> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> index 5de1053..278e97c 100644
> --- a/arch/riscv/kvm/Makefile
> +++ b/arch/riscv/kvm/Makefile
> @@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
>  kvm-y += vcpu_sbi_replace.o
>  kvm-y += vcpu_sbi_hsm.o
>  kvm-y += vcpu_timer.o
> -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> index 50c5472..3b8b84e8 100644
> --- a/arch/riscv/kvm/vcpu_sbi.c
> +++ b/arch/riscv/kvm/vcpu_sbi.c
> @@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
>  };
>  #endif
>  
> +#ifdef CONFIG_RISCV_PMU_SBI
> +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
> +#else
> +static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> +	.extid_start = -1UL,
> +	.extid_end = -1UL,
> +	.handler = NULL,
> +};
> +#endif
> +
>  static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
>  	&vcpu_sbi_ext_v01,
>  	&vcpu_sbi_ext_base,
> @@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
>  	&vcpu_sbi_ext_rfence,
>  	&vcpu_sbi_ext_srst,
>  	&vcpu_sbi_ext_hsm,
> +	&vcpu_sbi_ext_pmu,
>  	&vcpu_sbi_ext_experimental,
>  	&vcpu_sbi_ext_vendor,
>  };
> diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
> new file mode 100644
> index 0000000..223752f
> --- /dev/null
> +++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
> @@ -0,0 +1,86 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 Rivos Inc
> + *
> + * Authors:
> + *     Atish Patra <atishp@rivosinc.com>
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/err.h>
> +#include <linux/kvm_host.h>
> +#include <asm/csr.h>
> +#include <asm/sbi.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +
> +static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +				   struct kvm_vcpu_sbi_ext_data *edata,
> +				   struct kvm_cpu_trap *utrap)
> +{
> +	int ret = 0;
> +	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	unsigned long funcid = cp->a6;
> +	uint64_t temp;
> +
> +	/* Return not supported if PMU is not initialized */
> +	if (!kvpmu->init_done)
> +		return -EINVAL;
> +
> +	switch (funcid) {
> +	case SBI_EXT_PMU_NUM_COUNTERS:
> +		ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, edata);
> +		break;
> +	case SBI_EXT_PMU_COUNTER_GET_INFO:
> +		ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, edata);
> +		break;
> +	case SBI_EXT_PMU_COUNTER_CFG_MATCH:
> +#if defined(CONFIG_32BIT)
> +		temp = ((uint64_t)cp->a5 << 32) | cp->a4;
> +#else
> +		temp = cp->a4;
> +#endif
> +		ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
> +						       cp->a2, cp->a3, temp, edata);
> +		break;
> +	case SBI_EXT_PMU_COUNTER_START:
> +#if defined(CONFIG_32BIT)
> +		temp = ((uint64_t)cp->a4 << 32) | cp->a3;
> +#else
> +		temp = cp->a3;
> +#endif
> +		ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
> +						   temp, edata);
> +		break;
> +	case SBI_EXT_PMU_COUNTER_STOP:
> +		ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, edata);
> +		break;
> +	case SBI_EXT_PMU_COUNTER_FW_READ:
> +		ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, edata);
> +		break;
> +	default:
> +		edata->err_val = SBI_ERR_NOT_SUPPORTED;
> +	}
> +
> +
> +	return ret;
> +}
> +
> +unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu, unsigned long extid)
> +{
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> +	/*
> +	 * PMU Extension is only available to guests if privilege mode filtering
> +	 * is available. Otherwise, guest will always count events while the
> +	 * execution is in hypervisor mode.
> +	 */
> +	return kvpmu->init_done && riscv_isa_extension_available(NULL, SSCOFPMF);

Assuming we're only supporting homogeneous systems, then can't we just
check for Sscofpmf at PMU init time? When the extension isn't present
we'd fail to init and then here init_done wouldn't be set.
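
A rough sketch (assuming homogeneous harts) would be to add, early in
kvm_riscv_vcpu_pmu_init(),

	if (!riscv_isa_extension_available(NULL, SSCOFPMF))
		return -EOPNOTSUPP;

and then this probe function would only need to return kvpmu->init_done.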

> +}
> +
> +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> +	.extid_start = SBI_EXT_PMU,
> +	.extid_end = SBI_EXT_PMU,
> +	.handler = kvm_sbi_ext_pmu_handler,
> +	.probe = kvm_sbi_ext_pmu_probe,
> +};
> -- 
> 2.25.1
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 08/11] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode
  2022-12-15 17:00 ` [PATCH v2 08/11] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode Atish Patra
@ 2023-01-12 15:47   ` Andrew Jones
  0 siblings, 0 replies; 37+ messages in thread
From: Andrew Jones @ 2023-01-12 15:47 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:43AM -0800, Atish Patra wrote:
> Any guest must not get access to any hpmcounter including cycle/instret
> without any checks. We achieve that by disabling all the bits except TM
> bit in hcountern.

hcounteren

> 
> However, instret and cycle access for guest userspace can be enabled
> upon explicit request (via ONE REG) or on first trap from VU mode
> to maintain ABI requirement in the future. This patch doesn't support
> that as ONE REG inteface is not settled yet.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/kvm/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index 58c5489..9c2efd3 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -49,7 +49,8 @@ int kvm_arch_hardware_enable(void)
>  	hideleg |= (1UL << IRQ_VS_EXT);
>  	csr_write(CSR_HIDELEG, hideleg);
>  
> -	csr_write(CSR_HCOUNTEREN, -1UL);
> +	/* VS should access only TM bit. Everything else should trap */

s/TM bit/the time counter/

> +	csr_write(CSR_HCOUNTEREN, 0x02);
>  
>  	csr_write(CSR_HVIP, 0);
>  
> -- 
> 2.25.1
> 

Otherwise,

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support
  2023-01-12 15:29   ` Andrew Jones
@ 2023-01-12 18:04     ` Atish Kumar Patra
  0 siblings, 0 replies; 37+ messages in thread
From: Atish Kumar Patra @ 2023-01-12 18:04 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 7:29 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:42AM -0800, Atish Patra wrote:
> > The SBI PMU extension allows KVM guests to configure/start/stop/query
> > the PMU counters in a virtualized environment as well.
> >
> > In order to allow that, KVM implements the entire SBI PMU extension.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/kvm/Makefile       |  2 +-
> >  arch/riscv/kvm/vcpu_sbi.c     | 11 +++++
> >  arch/riscv/kvm/vcpu_sbi_pmu.c | 86 +++++++++++++++++++++++++++++++++++
> >  3 files changed, 98 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c
> >
> > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> > index 5de1053..278e97c 100644
> > --- a/arch/riscv/kvm/Makefile
> > +++ b/arch/riscv/kvm/Makefile
> > @@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
> >  kvm-y += vcpu_sbi_replace.o
> >  kvm-y += vcpu_sbi_hsm.o
> >  kvm-y += vcpu_timer.o
> > -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> > +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
> > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> > index 50c5472..3b8b84e8 100644
> > --- a/arch/riscv/kvm/vcpu_sbi.c
> > +++ b/arch/riscv/kvm/vcpu_sbi.c
> > @@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> >  };
> >  #endif
> >
> > +#ifdef CONFIG_RISCV_PMU_SBI
> > +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
> > +#else
> > +static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> > +     .extid_start = -1UL,
> > +     .extid_end = -1UL,
> > +     .handler = NULL,
> > +};
> > +#endif
> > +
> >  static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> >       &vcpu_sbi_ext_v01,
> >       &vcpu_sbi_ext_base,
> > @@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> >       &vcpu_sbi_ext_rfence,
> >       &vcpu_sbi_ext_srst,
> >       &vcpu_sbi_ext_hsm,
> > +     &vcpu_sbi_ext_pmu,
> >       &vcpu_sbi_ext_experimental,
> >       &vcpu_sbi_ext_vendor,
> >  };
> > diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
> > new file mode 100644
> > index 0000000..223752f
> > --- /dev/null
> > +++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
> > @@ -0,0 +1,86 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2022 Rivos Inc
> > + *
> > + * Authors:
> > + *     Atish Patra <atishp@rivosinc.com>
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/err.h>
> > +#include <linux/kvm_host.h>
> > +#include <asm/csr.h>
> > +#include <asm/sbi.h>
> > +#include <asm/kvm_vcpu_sbi.h>
> > +
> > +static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > +                                struct kvm_vcpu_sbi_ext_data *edata,
> > +                                struct kvm_cpu_trap *utrap)
> > +{
> > +     int ret = 0;
> > +     struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     unsigned long funcid = cp->a6;
> > +     uint64_t temp;
> > +
> > +     /* Return not supported if PMU is not initialized */
> > +     if (!kvpmu->init_done)
> > +             return -EINVAL;
> > +
> > +     switch (funcid) {
> > +     case SBI_EXT_PMU_NUM_COUNTERS:
> > +             ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, edata);
> > +             break;
> > +     case SBI_EXT_PMU_COUNTER_GET_INFO:
> > +             ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, edata);
> > +             break;
> > +     case SBI_EXT_PMU_COUNTER_CFG_MATCH:
> > +#if defined(CONFIG_32BIT)
> > +             temp = ((uint64_t)cp->a5 << 32) | cp->a4;
> > +#else
> > +             temp = cp->a4;
> > +#endif
> > +             ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
> > +                                                    cp->a2, cp->a3, temp, edata);
> > +             break;
> > +     case SBI_EXT_PMU_COUNTER_START:
> > +#if defined(CONFIG_32BIT)
> > +             temp = ((uint64_t)cp->a4 << 32) | cp->a3;
> > +#else
> > +             temp = cp->a3;
> > +#endif
> > +             ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
> > +                                                temp, edata);
> > +             break;
> > +     case SBI_EXT_PMU_COUNTER_STOP:
> > +             ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, edata);
> > +             break;
> > +     case SBI_EXT_PMU_COUNTER_FW_READ:
> > +             ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, edata);
> > +             break;
> > +     default:
> > +             edata->err_val = SBI_ERR_NOT_SUPPORTED;
> > +     }
> > +
> > +
> > +     return ret;
> > +}
> > +
> > +unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu, unsigned long extid)
> > +{
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > +     /*
> > +      * PMU Extension is only available to guests if privilege mode filtering
> > +      * is available. Otherwise, guest will always count events while the
> > +      * execution is in hypervisor mode.
> > +      */
> > +     return kvpmu->init_done && riscv_isa_extension_available(NULL, SSCOFPMF);
>
> Assuming we're only supporting homogeneous systems, then can't we just
> check for Sscofpmf at PMU init time? When the extension isn't present
> we'd fail to init and then here init_done wouldn't be set.
>

Sure. We can do that too. Will revise it in v3.

> > +}
> > +
> > +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> > +     .extid_start = SBI_EXT_PMU,
> > +     .extid_end = SBI_EXT_PMU,
> > +     .handler = kvm_sbi_ext_pmu_handler,
> > +     .probe = kvm_sbi_ext_pmu_probe,
> > +};
> > --
> > 2.25.1
> >
>
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 06/11] RISC-V: KVM: Add skeleton support for perf
  2023-01-12 15:10   ` Andrew Jones
@ 2023-01-12 18:09     ` Atish Kumar Patra
  0 siblings, 0 replies; 37+ messages in thread
From: Atish Kumar Patra @ 2023-01-12 18:09 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 7:10 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:41AM -0800, Atish Patra wrote:
> > This patch only adds the barebone structure of the perf implementation.
> > Most of the functions return zero at this point and will be implemented
> > fully in the future.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/include/asm/kvm_host.h     |   3 +
> >  arch/riscv/include/asm/kvm_vcpu_pmu.h |  76 ++++++++++++++
> >  arch/riscv/kvm/Makefile               |   1 +
> >  arch/riscv/kvm/vcpu.c                 |   5 +
> >  arch/riscv/kvm/vcpu_insn.c            |   2 +-
> >  arch/riscv/kvm/vcpu_pmu.c             | 142 ++++++++++++++++++++++++++
> >  6 files changed, 228 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> >  create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> >
> > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > index 93f43a3..f9874b4 100644
> > --- a/arch/riscv/include/asm/kvm_host.h
> > +++ b/arch/riscv/include/asm/kvm_host.h
> > @@ -18,6 +18,7 @@
> >  #include <asm/kvm_vcpu_insn.h>
> >  #include <asm/kvm_vcpu_sbi.h>
> >  #include <asm/kvm_vcpu_timer.h>
> > +#include <asm/kvm_vcpu_pmu.h>
> >
> >  #define KVM_MAX_VCPUS                        1024
> >
> > @@ -228,6 +229,8 @@ struct kvm_vcpu_arch {
> >
> >       /* Don't run the VCPU (blocked) */
> >       bool pause;
> > +
> > +     struct kvm_pmu pmu;
> >  };
> >
> >  static inline void kvm_arch_hardware_unsetup(void) {}
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > new file mode 100644
> > index 0000000..6a8c0f7
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > @@ -0,0 +1,76 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (c) 2022 Rivos Inc
> > + *
> > + * Authors:
> > + *     Atish Patra <atishp@rivosinc.com>
> > + */
> > +
> > +#ifndef __KVM_VCPU_RISCV_PMU_H
> > +#define __KVM_VCPU_RISCV_PMU_H
> > +
> > +#include <linux/perf/riscv_pmu.h>
> > +#include <asm/kvm_vcpu_sbi.h>
> > +#include <asm/sbi.h>
> > +
> > +#ifdef CONFIG_RISCV_PMU_SBI
> > +#define RISCV_KVM_MAX_FW_CTRS 32
> > +#define RISCV_MAX_COUNTERS      64
> > +
> > +/* Per virtual pmu counter data */
> > +struct kvm_pmc {
> > +     u8 idx;
> > +     struct perf_event *perf_event;
> > +     uint64_t counter_val;
> > +     union sbi_pmu_ctr_info cinfo;
> > +     /* Event monitoring status */
> > +     bool started;
> > +};
> > +
> > +/* PMU data structure per vcpu */
> > +struct kvm_pmu {
> > +     struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> > +     /* Number of the virtual firmware counters available */
> > +     int num_fw_ctrs;
> > +     /* Number of the virtual hardware counters available */
> > +     int num_hw_ctrs;
> > +     /* A flag to indicate that pmu initialization is done */
> > +     bool init_done;
> > +     /* Bit map of all the virtual counter used */
> > +     DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
> > +};
> > +
> > +#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
> > +#define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
> > +
> > +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata);
> > +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> > +                             struct kvm_vcpu_sbi_ext_data *edata);
> > +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > +                              unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> > +                              struct kvm_vcpu_sbi_ext_data *edata);
> > +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > +                             unsigned long ctr_mask, unsigned long flag,
> > +                             struct kvm_vcpu_sbi_ext_data *edata);
> > +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > +                                  unsigned long ctr_mask, unsigned long flag,
> > +                                  unsigned long eidx, uint64_t edata,
> > +                                  struct kvm_vcpu_sbi_ext_data *extdata);
>
> How about replacing 'edata' with 'evtdata' and then using 'edata' for the
> struct kvm_vcpu_sbi_ext_data pointer in order to keep the struct pointer
> name consistent with the other functions?
>

Sure.

> > +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> > +                             struct kvm_vcpu_sbi_ext_data *edata);
> > +int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
> > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
> > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
> > +
> > +#else
> > +struct kvm_pmu {
> > +};
> > +
> > +static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> > +{
> > +     return 0;
> > +}
> > +static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> > +#endif
> > +#endif
>
> nit: it'd be nice to have
>
>  #endif /* CONFIG_RISCV_PMU_SBI */
>  #endif /* !__KVM_VCPU_RISCV_PMU_H */
>
> > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> > index 019df920..5de1053 100644
> > --- a/arch/riscv/kvm/Makefile
> > +++ b/arch/riscv/kvm/Makefile
> > @@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
> >  kvm-y += vcpu_sbi_replace.o
> >  kvm-y += vcpu_sbi_hsm.o
> >  kvm-y += vcpu_timer.o
> > +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> > index 7c08567..b746f21 100644
> > --- a/arch/riscv/kvm/vcpu.c
> > +++ b/arch/riscv/kvm/vcpu.c
> > @@ -137,6 +137,7 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
> >
> >       WRITE_ONCE(vcpu->arch.irqs_pending, 0);
> >       WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
> > +     kvm_riscv_vcpu_pmu_reset(vcpu);
> >
> >       vcpu->arch.hfence_head = 0;
> >       vcpu->arch.hfence_tail = 0;
> > @@ -194,6 +195,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >       /* Setup VCPU timer */
> >       kvm_riscv_vcpu_timer_init(vcpu);
> >
> > +     /* setup performance monitoring */
> > +     kvm_riscv_vcpu_pmu_init(vcpu);
> > +
> >       /* Reset VCPU */
> >       kvm_riscv_reset_vcpu(vcpu);
> >
> > @@ -216,6 +220,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> >       /* Cleanup VCPU timer */
> >       kvm_riscv_vcpu_timer_deinit(vcpu);
> >
> > +     kvm_riscv_vcpu_pmu_deinit(vcpu);
> >       /* Free unused pages pre-allocated for G-stage page table mappings */
> >       kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> >  }
> > diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c
> > index 0bb5276..1ff2649 100644
> > --- a/arch/riscv/kvm/vcpu_insn.c
> > +++ b/arch/riscv/kvm/vcpu_insn.c
> > @@ -213,7 +213,7 @@ struct csr_func {
> >                   unsigned long wr_mask);
> >  };
> >
> > -static const struct csr_func csr_funcs[] = { };
> > +static const struct csr_func csr_funcs[] = {};
>
> Stray change
>
> >
> >  /**
> >   * kvm_riscv_vcpu_csr_return -- Handle CSR read/write after user space
> > diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> > new file mode 100644
> > index 0000000..0f0748f1
> > --- /dev/null
> > +++ b/arch/riscv/kvm/vcpu_pmu.c
> > @@ -0,0 +1,142 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2022 Rivos Inc
> > + *
> > + * Authors:
> > + *     Atish Patra <atishp@rivosinc.com>
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/err.h>
> > +#include <linux/kvm_host.h>
> > +#include <linux/perf/riscv_pmu.h>
> > +#include <asm/csr.h>
> > +#include <asm/kvm_vcpu_sbi.h>
> > +#include <asm/kvm_vcpu_pmu.h>
> > +#include <linux/kvm_host.h>
> > +
> > +#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> > +
> > +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata)
> > +{
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > +     edata->out_val = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
>
>  edata->out_val = kvm_pmu_num_counters(kvpmu);
>

ok.

> > +
> > +     return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> > +                             struct kvm_vcpu_sbi_ext_data *edata)
> > +{
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > +     if ((cidx > RISCV_MAX_COUNTERS) || (cidx == 1)) {
>
> nit: No need for () around the expressions
>
> > +             edata->err_val = SBI_ERR_INVALID_PARAM;
> > +             return 0;
> > +     }
> > +
> > +     edata->out_val = kvpmu->pmc[cidx].cinfo.value;
> > +
> > +     return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > +                              unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> > +                              struct kvm_vcpu_sbi_ext_data *edata)
> > +{
> > +     /* TODO */
> > +     return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > +                             unsigned long ctr_mask, unsigned long flag,
> > +                             struct kvm_vcpu_sbi_ext_data *edata)
> > +{
> > +     /* TODO */
> > +     return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > +                                  unsigned long ctr_mask, unsigned long flag,
> > +                                  unsigned long eidx, uint64_t edata,
> > +                                  struct kvm_vcpu_sbi_ext_data *extdata)
> > +{
> > +     /* TODO */
> > +     return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> > +                             struct kvm_vcpu_sbi_ext_data *edata)
> > +{
> > +     /* TODO */
> > +     return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> > +{
> > +     int i = 0, num_fw_ctrs, ret, num_hw_ctrs = 0, hpm_width = 0;
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > +     ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> > +     if (ret < 0)
> > +             return ret;
> > +
> > +     if (!hpm_width || !num_hw_ctrs) {
> > +             pr_err("Can not initialize PMU for vcpu with NULL hpmcounter width/count\n");
>                         ^ Cannot                   ^ VCPU                         ^ or number counters
>
> > +             return -EINVAL;
> > +     }
> > +
> > +     if ((num_hw_ctrs + RISCV_KVM_MAX_FW_CTRS) > RISCV_MAX_COUNTERS)
>
> Shouldn't we warn about this condition? Presumably it means Linux selected
> RISCV_MAX_COUNTERS too small, so a warning would let us know we need to
> bump it up.
>

OK. I will leave a warning message. We are unlikely to hit that
scenario, since RISCV_MAX_COUNTERS is 64, unless we add a bunch of
firmware counters for KVM. At that point both RISCV_MAX_COUNTERS and
RISCV_FW_MAX_COUNTERS need to be increased.

> > +             num_fw_ctrs = RISCV_MAX_COUNTERS - num_hw_ctrs;
> > +     else
> > +             num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
> > +
> > +     kvpmu->num_hw_ctrs = num_hw_ctrs;
> > +     kvpmu->num_fw_ctrs = num_fw_ctrs;
>
> nit: add blank line here
>
> > +     /*
> > +      * There is no corelation betwen the logical hardware counter and virtual counters.
>
> correlation
>
> > +      * However, we need to encode a hpmcounter CSR in the counter info field so that
> > +      * KVM can trap n emulate the read. This works well in the migraiton usecase as
>
> migration
>

Thanks for the review. I will address all other comments as well.

> > +      * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
> > +      */
> > +     for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
> > +             /* TIME CSR shouldn't be read from perf interface */
> > +             if (i == 1)
> > +                     continue;
> > +             kvpmu->pmc[i].idx = i;
> > +             if (i < kvpmu->num_hw_ctrs) {
> > +                     kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
> > +                     if (i < 3)
> > +                             /* CY, IR counters */
> > +                             kvpmu->pmc[i].cinfo.width = 63;
> > +                     else
> > +                             kvpmu->pmc[i].cinfo.width = hpm_width;
> > +                     /*
> > +                      * The CSR number doesn't have any relation with the logical
> > +                      * hardware counters. The CSR numbers are encoded sequentially
> > +                      * to avoid maintaining a map between the virtual counter
> > +                      * and CSR number.
> > +                      */
> > +                     kvpmu->pmc[i].cinfo.csr = CSR_CYCLE + i;
> > +             } else {
> > +                     kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_FW;
> > +                     kvpmu->pmc[i].cinfo.width = BITS_PER_LONG - 1;
> > +             }
> > +     }
> > +
> > +     kvpmu->init_done = true;
> > +
> > +     return 0;
> > +}
> > +
> > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> > +{
> > +     /* TODO */
> > +}
> > +
> > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> > +{
> > +     /* TODO */
> > +}
> > +
> > --
> > 2.25.1
> >
>
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count
  2023-01-12 10:06   ` Andrew Jones
@ 2023-01-12 18:18     ` Atish Kumar Patra
  2023-01-13  7:22       ` Andrew Jones
  0 siblings, 1 reply; 37+ messages in thread
From: Atish Kumar Patra @ 2023-01-12 18:18 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 2:06 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:36AM -0800, Atish Patra wrote:
> > The KVM module needs to know how many hardware counters and the counter
> > width that the platform supports. Otherwise, it will not be able to show
> > the optimal value of the virtual counters to the guest. The virtual hardware
> > counters also need to have the same width as the logical hardware
> > counters for simplicity. However, there shouldn't be a mapping between
> > virtual hardware counters and logical hardware counters. As we don't
> > support heterogeneous harts or counters with different widths as of now,
> > the implementation relies on the counter width of the first available
> > programmable counter.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  drivers/perf/riscv_pmu_sbi.c   | 35 +++++++++++++++++++++++++++++++++-
> >  include/linux/perf/riscv_pmu.h |  3 +++
> >  2 files changed, 37 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> > index 3852c18..65d4aa4 100644
> > --- a/drivers/perf/riscv_pmu_sbi.c
> > +++ b/drivers/perf/riscv_pmu_sbi.c
> > @@ -49,6 +49,9 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
> >  static union sbi_pmu_ctr_info *pmu_ctr_list;
> >  static unsigned int riscv_pmu_irq;
> >
> > +/* Cache the available counters in a bitmask */
> > +unsigned long cmask;
>
> I presume this can be static since it's not getting added to the header.
> And don't we need this to be a long long for rv32? We should probably
> just use u64.
>

Yeah. u64 would be better. I will change it along with static. Thanks.
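
i.e. presumably just

	static u64 cmask;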

> > +
> >  struct sbi_pmu_event_data {
> >       union {
> >               union {
> > @@ -264,6 +267,37 @@ static bool pmu_sbi_ctr_is_fw(int cidx)
> >       return (info->type == SBI_PMU_CTR_TYPE_FW) ? true : false;
> >  }
> >
> > +/*
> > + * Returns the counter width of a programmable counter and number of hardware
> > + * counters. As we don't support heterneous CPUs yet, it is okay to just
>
> heterogeneous
>

Fixed.

> > + * return the counter width of the first programmable counter.
> > + */
> > +int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
> > +{
> > +     int i;
> > +     union sbi_pmu_ctr_info *info;
> > +     u32 hpm_width = 0, hpm_count = 0;
> > +
> > +     if (!cmask)
> > +             return -EINVAL;
> > +
> > +     for_each_set_bit(i, &cmask, RISCV_MAX_COUNTERS) {
> > +             info = &pmu_ctr_list[i];
> > +             if (!info)
> > +                     continue;
> > +             if (!hpm_width && (info->csr != CSR_CYCLE) && (info->csr != CSR_INSTRET))
>
> nit: No need for () around the != expressions
>

Fixed.
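
i.e. presumably something like

	if (!hpm_width && info->csr != CSR_CYCLE && info->csr != CSR_INSTRET)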

> > +                     hpm_width = info->width;
> > +             if (info->type == SBI_PMU_CTR_TYPE_HW)
> > +                     hpm_count++;
> > +     }
> > +
> > +     *hw_ctr_width = hpm_width;
> > +     *num_hw_ctr = hpm_count;
> > +
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
>
> EXPORT_SYMBOL_GPL ?
>

Is that mandatory? I have seen usage of both in arch/riscv and other
places though.
I am also not sure whether any non-GPL module should need access to this.

> > +
> >  static int pmu_sbi_ctr_get_idx(struct perf_event *event)
> >  {
> >       struct hw_perf_event *hwc = &event->hw;
> > @@ -798,7 +832,6 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
> >  static int pmu_sbi_device_probe(struct platform_device *pdev)
> >  {
> >       struct riscv_pmu *pmu = NULL;
> > -     unsigned long cmask = 0;
> >       int ret = -ENODEV;
> >       int num_counters;
> >
> > diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
> > index e17e86a..a1c3f77 100644
> > --- a/include/linux/perf/riscv_pmu.h
> > +++ b/include/linux/perf/riscv_pmu.h
> > @@ -73,6 +73,9 @@ void riscv_pmu_legacy_skip_init(void);
> >  static inline void riscv_pmu_legacy_skip_init(void) {};
> >  #endif
> >  struct riscv_pmu *riscv_pmu_alloc(void);
> > +#ifdef CONFIG_RISCV_PMU_SBI
> > +int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
> > +#endif
> >
> >  #endif /* CONFIG_RISCV_PMU */
> >
> > --
> > 2.25.1
> >
>
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures
  2023-01-12 10:21   ` Andrew Jones
@ 2023-01-12 18:19     ` Atish Kumar Patra
  0 siblings, 0 replies; 37+ messages in thread
From: Atish Kumar Patra @ 2023-01-12 18:19 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 2:21 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:37AM -0800, Atish Patra wrote:
> > Currently the probe function just checks if an SBI extension is
> > registered or not. However, the extension may not want to advertise
> > itself depending on some other condition.
> > An additional extension specific probe function will allow
> > extensions to decide if they want to be advertised to the caller or
> > not. Any extension that does not require additional dependency checks
> > can avoid implementing this function.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/include/asm/kvm_vcpu_sbi.h |  3 +++
> >  arch/riscv/kvm/vcpu_sbi_base.c        | 13 +++++++++++--
> >  2 files changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > index f79478a..61dac1b 100644
> > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > @@ -29,6 +29,9 @@ struct kvm_vcpu_sbi_extension {
> >       int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >                      unsigned long *out_val, struct kvm_cpu_trap *utrap,
> >                      bool *exit);
> > +
> > +     /* Extension specific probe function */
> > +     unsigned long (*probe)(struct kvm_vcpu *vcpu, unsigned long extid);
>
> It doesn't seem like the extid parameter should be necessary since the
> probe function is specific to the extension, but it doesn't hurt either.
>

Yeah. You are correct. I will drop it. Thanks.
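
i.e. the callback would then look something like

	/* Extension specific probe function */
	unsigned long (*probe)(struct kvm_vcpu *vcpu);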

> >  };
> >
> >  void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
> > diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> > index 5d65c63..89e2415 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_base.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> > @@ -19,6 +19,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >  {
> >       int ret = 0;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > +     const struct kvm_vcpu_sbi_extension *sbi_ext;
> >
> >       switch (cp->a6) {
> >       case SBI_EXT_BASE_GET_SPEC_VERSION:
> > @@ -43,8 +44,16 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >                        */
> >                       kvm_riscv_vcpu_sbi_forward(vcpu, run);
> >                       *exit = true;
> > -             } else
> > -                     *out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
> > +             } else {
> > +                     sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> > +                     if (sbi_ext) {
> > +                             if (sbi_ext->probe)
> > +                                     *out_val = sbi_ext->probe(vcpu, cp->a0);
> > +                             else
> > +                                     *out_val = 1;
> > +                     } else
> > +                             *out_val = 0;
> > +             }
> >               break;
> >       case SBI_EXT_BASE_GET_MVENDORID:
> >               *out_val = vcpu->arch.mvendorid;
> > --
> > 2.25.1
> >
>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function
  2023-01-12 10:28   ` Andrew Jones
@ 2023-01-12 18:25     ` Atish Kumar Patra
  2023-01-13  7:25       ` Andrew Jones
  0 siblings, 1 reply; 37+ messages in thread
From: Atish Kumar Patra @ 2023-01-12 18:25 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 2:28 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:38AM -0800, Atish Patra wrote:
> > According to the SBI specification, the stop function can only
> > return error code SBI_ERR_FAILED. However, currently it returns
> > -EINVAL which will be mapped to SBI_ERR_INVALID_PARAM.
>
> I presume the mapping referred to here is kvm_linux_err_map_sbi().
> If so, then -EPERM isn't correct either. That maps to SBI_ERR_DENIED.
> The only thing that will ensure we get SBI_ERR_FAILURE (-1) is
> anything not handled by the kvm_linux_err_map_sbi switch, as we
> need to use the default.
>

It returns SBI_ERR_FAILURE in the next patch, when kvm_linux_err_map_sbi
is removed. Maybe I should drop this patch. The next patch does the
correct thing anyway.

> Thanks,
> drew
>
> >
> > Return the appropriate linux error code.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> > index 2e915ca..0f8d9fe 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> > @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> >  static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
> >  {
> >       if (vcpu->arch.power_off)
> > -             return -EINVAL;
> > +             return -EPERM;
> >
> >       kvm_riscv_vcpu_power_off(vcpu);
> >
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code
  2023-01-12 11:04   ` Andrew Jones
@ 2023-01-12 18:47     ` Atish Kumar Patra
  2023-01-13  7:42       ` Andrew Jones
  0 siblings, 1 reply; 37+ messages in thread
From: Atish Kumar Patra @ 2023-01-12 18:47 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 3:04 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:39AM -0800, Atish Patra wrote:
> > Currently, the SBI extension handle is expected to return Linux error code.
>
> handler
>
> > The top SBI layer converts the Linux error code to SBI specific error code
> > that can be returned to guest invoking the SBI calls. This model works
> > as long as SBI error codes have 1-to-1 mappings between them.
>
> as long as SBI and Linux error codes have 1-to-1...
>
> > However, that may not always be true. This patch attempts to disassociate
> > both these error codes by allowing the SBI extension implementation to
> > return SBI specific error codes as well.
> >
> > The extension will continue to return the Linux specific error code, which
> > will indicate any problem *with* the extension emulation, while the
> > SBI specific error will indicate the problem *of* the emulation.
> >
> > Suggested-by: Andrew Jones <ajones@ventanamicro.com>
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++--
> >  arch/riscv/kvm/vcpu_sbi.c             | 45 ++++++++-------------------
> >  arch/riscv/kvm/vcpu_sbi_base.c        | 38 +++++++++++-----------
> >  arch/riscv/kvm/vcpu_sbi_hsm.c         | 22 +++++++------
> >  arch/riscv/kvm/vcpu_sbi_replace.c     | 44 +++++++++++++-------------
> >  5 files changed, 74 insertions(+), 85 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > index 61dac1b..fee9253 100644
> > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > @@ -18,6 +18,12 @@ struct kvm_vcpu_sbi_context {
> >       int return_handled;
> >  };
> >
> > +struct kvm_vcpu_sbi_ext_data {
> > +     unsigned long out_val;
> > +     unsigned long err_val;
> > +     bool uexit;
> > +};
> > +
> >  struct kvm_vcpu_sbi_extension {
> >       unsigned long extid_start;
> >       unsigned long extid_end;
> > @@ -27,8 +33,8 @@ struct kvm_vcpu_sbi_extension {
> >        * specific error codes.
> >        */
> >       int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > -                    unsigned long *out_val, struct kvm_cpu_trap *utrap,
> > -                    bool *exit);
> > +                    struct kvm_vcpu_sbi_ext_data *edata,
> > +                    struct kvm_cpu_trap *utrap);
> >
> >       /* Extension specific probe function */
> >       unsigned long (*probe)(struct kvm_vcpu *vcpu, unsigned long extid);
> > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> > index f96991d..50c5472 100644
> > --- a/arch/riscv/kvm/vcpu_sbi.c
> > +++ b/arch/riscv/kvm/vcpu_sbi.c
> > @@ -12,26 +12,6 @@
> >  #include <asm/sbi.h>
> >  #include <asm/kvm_vcpu_sbi.h>
> >
> > -static int kvm_linux_err_map_sbi(int err)
> > -{
> > -     switch (err) {
> > -     case 0:
> > -             return SBI_SUCCESS;
> > -     case -EPERM:
> > -             return SBI_ERR_DENIED;
> > -     case -EINVAL:
> > -             return SBI_ERR_INVALID_PARAM;
> > -     case -EFAULT:
> > -             return SBI_ERR_INVALID_ADDRESS;
> > -     case -EOPNOTSUPP:
> > -             return SBI_ERR_NOT_SUPPORTED;
> > -     case -EALREADY:
> > -             return SBI_ERR_ALREADY_AVAILABLE;
> > -     default:
> > -             return SBI_ERR_FAILURE;
> > -     };
> > -}
> > -
> >  #ifndef CONFIG_RISCV_SBI_V01
> >  static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> >       .extid_start = -1UL,
> > @@ -125,11 +105,10 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >  {
> >       int ret = 1;
> >       bool next_sepc = true;
> > -     bool userspace_exit = false;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       const struct kvm_vcpu_sbi_extension *sbi_ext;
> >       struct kvm_cpu_trap utrap = { 0 };
> > -     unsigned long out_val = 0;
> > +     struct kvm_vcpu_sbi_ext_data edata_out = { 0 };
> >       bool ext_is_v01 = false;
> >
> >       sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
> > @@ -139,7 +118,7 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >                   cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
> >                       ext_is_v01 = true;
> >  #endif
> > -             ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
> > +             ret = sbi_ext->handler(vcpu, run, &edata_out, &utrap);
> >       } else {
> >               /* Return error for unsupported SBI calls */
> >               cp->a0 = SBI_ERR_NOT_SUPPORTED;
> > @@ -156,25 +135,27 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >               goto ecall_done;
> >       }
> >
> > +     /* The SBI extension returns Linux error code. Exits the ioctl loop
>
> Need opening /* wing
>
> And for wording, maybe
>
>  When the SBI extension returns a Linux error code, it exits the ioctl
>  loop and forwards the error to userspace.
>

Sure.

> > +      * and forwards the error to the userspace.
> > +      */
> > +     if (ret < 0) {
> > +             next_sepc = false;
> > +             goto ecall_done;
> > +     }
>
> Shouldn't this ret < 0 check go above the "Handle special error cases..."
> block?
>

Why? I thought that if an extension handler both updates utrap and
returns an error, the user space redirection should take precedence.
Let me know if it is wrong.

Currently, only the legacy SBI extension handler uses utrap for unpriv reads,
so I don't feel strongly about it either way. I just wanted to understand
your reasoning.

> > +
> >       /* Exit ioctl loop or Propagate the error code the guest */
> > -     if (userspace_exit) {
> > +     if (edata_out.uexit) {
> >               next_sepc = false;
> >               ret = 0;
> >       } else {
> > -             /**
> > -              * SBI extension handler always returns an Linux error code. Convert
> > -              * it to the SBI specific error code that can be propagated the SBI
> > -              * caller.
> > -              */
> > -             ret = kvm_linux_err_map_sbi(ret);
> > -             cp->a0 = ret;
> > +             cp->a0 = edata_out.err_val;
> >               ret = 1;
> >       }
> >  ecall_done:
> >       if (next_sepc)
> >               cp->sepc += 4;
> >       if (!ext_is_v01)
> > -             cp->a1 = out_val;
> > +             cp->a1 = edata_out.out_val;
> >
> >       return ret;
> >  }
> > diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> > index 89e2415..487828d 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_base.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> > @@ -14,24 +14,23 @@
> >  #include <asm/kvm_vcpu_sbi.h>
> >
> >  static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > -                                 unsigned long *out_val,
> > -                                 struct kvm_cpu_trap *trap, bool *exit)
> > +                                 struct kvm_vcpu_sbi_ext_data *edata,
> > +                                 struct kvm_cpu_trap *trap)
> >  {
> > -     int ret = 0;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       const struct kvm_vcpu_sbi_extension *sbi_ext;
> >
> >       switch (cp->a6) {
> >       case SBI_EXT_BASE_GET_SPEC_VERSION:
> > -             *out_val = (KVM_SBI_VERSION_MAJOR <<
> > +             edata->out_val = (KVM_SBI_VERSION_MAJOR <<
> >                           SBI_SPEC_VERSION_MAJOR_SHIFT) |
> >                           KVM_SBI_VERSION_MINOR;
> >               break;
> >       case SBI_EXT_BASE_GET_IMP_ID:
> > -             *out_val = KVM_SBI_IMPID;
> > +             edata->out_val = KVM_SBI_IMPID;
> >               break;
> >       case SBI_EXT_BASE_GET_IMP_VERSION:
> > -             *out_val = LINUX_VERSION_CODE;
> > +             edata->out_val = LINUX_VERSION_CODE;
> >               break;
> >       case SBI_EXT_BASE_PROBE_EXT:
> >               if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
> > @@ -43,33 +42,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >                        * forward it to the userspace
> >                        */
> >                       kvm_riscv_vcpu_sbi_forward(vcpu, run);
> > -                     *exit = true;
> > +                     edata->uexit = true;
> >               } else {
> >                       sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> >                       if (sbi_ext) {
> >                               if (sbi_ext->probe)
> > -                                     *out_val = sbi_ext->probe(vcpu, cp->a0);
> > +                                     edata->out_val = sbi_ext->probe(vcpu, cp->a0);
> >                               else
> > -                                     *out_val = 1;
> > +                                     edata->out_val = 1;
> >                       } else
> > -                             *out_val = 0;
> > +                             edata->out_val = 0;
> >               }
> >               break;
> >       case SBI_EXT_BASE_GET_MVENDORID:
> > -             *out_val = vcpu->arch.mvendorid;
> > +             edata->out_val = vcpu->arch.mvendorid;
> >               break;
> >       case SBI_EXT_BASE_GET_MARCHID:
> > -             *out_val = vcpu->arch.marchid;
> > +             edata->out_val = vcpu->arch.marchid;
> >               break;
> >       case SBI_EXT_BASE_GET_MIMPID:
> > -             *out_val = vcpu->arch.mimpid;
> > +             edata->out_val = vcpu->arch.mimpid;
> >               break;
> >       default:
> > -             ret = -EOPNOTSUPP;
> > +             edata->err_val = SBI_ERR_NOT_SUPPORTED;
> >               break;
> >       }
> >
> > -     return ret;
> > +     return 0;
> >  }
> >
> >  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> > @@ -79,17 +78,16 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> >  };
> >
> >  static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
> > -                                     struct kvm_run *run,
> > -                                     unsigned long *out_val,
> > -                                     struct kvm_cpu_trap *utrap,
> > -                                     bool *exit)
> > +                                    struct kvm_run *run,
> > +                                    struct kvm_vcpu_sbi_ext_data *edata,
> > +                                    struct kvm_cpu_trap *utrap)
> >  {
> >       /*
> >        * Both SBI experimental and vendor extensions are
> >        * unconditionally forwarded to userspace.
> >        */
> >       kvm_riscv_vcpu_sbi_forward(vcpu, run);
> > -     *exit = true;
> > +     edata->uexit = true;
> >       return 0;
> >  }
> >
> > diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> > index 0f8d9fe..4188f21 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> > @@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> >
> >       target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> >       if (!target_vcpu)
> > -             return -EINVAL;
> > +             return SBI_ERR_INVALID_PARAM;
> >       if (!target_vcpu->arch.power_off)
> > -             return -EALREADY;
> > +             return SBI_ERR_ALREADY_AVAILABLE;
> >
> >       reset_cntx = &target_vcpu->arch.guest_reset_context;
> >       /* start address */
> > @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> >  static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
> >  {
> >       if (vcpu->arch.power_off)
> > -             return -EPERM;
> > +             return SBI_ERR_FAILURE;
>
> -EPERM shouldn't map to SBI_ERR_FAILURE, but this should be addressed in
> the previous patch.
>
> >
> >       kvm_riscv_vcpu_power_off(vcpu);
> >
> > @@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
> >
> >       target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> >       if (!target_vcpu)
> > -             return -EINVAL;
> > +             return SBI_ERR_INVALID_PARAM;
> >       if (!target_vcpu->arch.power_off)
> >               return SBI_HSM_STATE_STARTED;
> >       else if (vcpu->stat.generic.blocking)
> > @@ -66,10 +66,10 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
> >               return SBI_HSM_STATE_STOPPED;
> >  }
> >
> > +
>
> stray new blank line
>

Fixed.

> >  static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > -                                unsigned long *out_val,
> > -                                struct kvm_cpu_trap *utrap,
> > -                                bool *exit)
> > +                                struct kvm_vcpu_sbi_ext_data *edata,
> > +                                struct kvm_cpu_trap *utrap)
> >  {
> >       int ret = 0;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > @@ -88,8 +88,8 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >       case SBI_EXT_HSM_HART_STATUS:
> >               ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
> >               if (ret >= 0) {
> > -                     *out_val = ret;
> > -                     ret = 0;
> > +                     edata->out_val = ret;
> > +                     edata->err_val = 0;
>
> We want a 'return 0;' here. Otherwise below we write ret to err_val.
>

Ahh yes. Thanks for catching that. Will fix it.

> >               }
> >               break;
> >       case SBI_EXT_HSM_HART_SUSPEND:
> > @@ -108,7 +108,9 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >               ret = -EOPNOTSUPP;
>
> This ret = -EOPNOTSUPP in the context and two others above it need to be
> converted to SBI errors.
>

Will do. Thanks.

> >       }
> >
> > -     return ret;
> > +     edata->err_val = ret;
> > +
> > +     return 0;
> >  }
> >
> >  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
> > diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> > index 03a0198..d029136 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> > @@ -14,15 +14,17 @@
> >  #include <asm/kvm_vcpu_sbi.h>
> >
> >  static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > -                                 unsigned long *out_val,
> > -                                 struct kvm_cpu_trap *utrap, bool *exit)
> > +                                 struct kvm_vcpu_sbi_ext_data *edata,
> > +                                 struct kvm_cpu_trap *utrap)
> >  {
> >       int ret = 0;
>
> ret is only used at the bottom to return 0. We can drop the variable and
> change it to an explicit zero.
>

Sure.

> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       u64 next_cycle;
> >
> > -     if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
> > -             return -EINVAL;
> > +     if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
> > +             edata->err_val = SBI_ERR_INVALID_PARAM;
> > +             return 0;
> > +     }
> >
> >  #if __riscv_xlen == 32
> >       next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
> > @@ -41,8 +43,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> >  };
> >
> >  static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > -                                unsigned long *out_val,
> > -                                struct kvm_cpu_trap *utrap, bool *exit)
> > +                                struct kvm_vcpu_sbi_ext_data *edata,
> > +                                struct kvm_cpu_trap *utrap)
> >  {
> >       int ret = 0;
> >       unsigned long i;
> > @@ -51,8 +53,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >       unsigned long hmask = cp->a0;
> >       unsigned long hbase = cp->a1;
> >
> > -     if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
> > -             return -EINVAL;
> > +     if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
> > +             edata->err_val = SBI_ERR_INVALID_PARAM;
> > +             return 0;
> > +     }
> >
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               if (hbase != -1UL) {
> > @@ -76,10 +80,9 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
> >  };
> >
> >  static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > -                                   unsigned long *out_val,
> > -                                   struct kvm_cpu_trap *utrap, bool *exit)
> > +                                   struct kvm_vcpu_sbi_ext_data *edata,
> > +                                   struct kvm_cpu_trap *utrap)
> >  {
> > -     int ret = 0;
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       unsigned long hmask = cp->a0;
> >       unsigned long hbase = cp->a1;
> > @@ -116,10 +119,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >                */
> >               break;
> >       default:
> > -             ret = -EOPNOTSUPP;
> > +             edata->err_val = SBI_ERR_NOT_SUPPORTED;
> >       }
> >
> > -     return ret;
> > +     return 0;
> >  }
> >
> >  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> > @@ -130,14 +133,13 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> >
> >  static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> >                                   struct kvm_run *run,
> > -                                 unsigned long *out_val,
> > -                                 struct kvm_cpu_trap *utrap, bool *exit)
> > +                                 struct kvm_vcpu_sbi_ext_data *edata,
> > +                                 struct kvm_cpu_trap *utrap)
> >  {
> >       struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> >       unsigned long funcid = cp->a6;
> >       u32 reason = cp->a1;
> >       u32 type = cp->a0;
> > -     int ret = 0;
> >
> >       switch (funcid) {
> >       case SBI_EXT_SRST_RESET:
> > @@ -146,24 +148,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> >                       kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> >                                               KVM_SYSTEM_EVENT_SHUTDOWN,
> >                                               reason);
> > -                     *exit = true;
> > +                     edata->uexit = true;
> >                       break;
> >               case SBI_SRST_RESET_TYPE_COLD_REBOOT:
> >               case SBI_SRST_RESET_TYPE_WARM_REBOOT:
> >                       kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> >                                               KVM_SYSTEM_EVENT_RESET,
> >                                               reason);
> > -                     *exit = true;
> > +                     edata->uexit = true;
> >                       break;
> >               default:
> > -                     ret = -EOPNOTSUPP;
> > +                     edata->err_val = SBI_ERR_NOT_SUPPORTED;
> >               }
> >               break;
> >       default:
> > -             ret = -EOPNOTSUPP;
> > +             edata->err_val = SBI_ERR_NOT_SUPPORTED;
> >       }
> >
> > -     return ret;
> > +     return 0;
> >  }
> >
> >  const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
> > --
> > 2.25.1
> >
>
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count
  2023-01-12 18:18     ` Atish Kumar Patra
@ 2023-01-13  7:22       ` Andrew Jones
  2023-01-24 20:41         ` Atish Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-13  7:22 UTC (permalink / raw)
  To: Atish Kumar Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 10:18:05AM -0800, Atish Kumar Patra wrote:
> On Thu, Jan 12, 2023 at 2:06 AM Andrew Jones <ajones@ventanamicro.com> wrote:
> >
> > On Thu, Dec 15, 2022 at 09:00:36AM -0800, Atish Patra wrote:
...
> > > +EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
> >
> > EXPORT_SYMBOL_GPL ?
> >
> 
> Is that mandatory? I have seen usage of both in arch/riscv and other
> places, though.
> I am also not sure whether any other non-GPL module should need access to this.

TBH, I'm not sure what the best policy is, but I presumed we should use
_GPL when we aren't aware of anything non-GPL and then when a day comes
that something non-GPL would like this to be exported, the patch that
flips it will provide the justification in its commit message.

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function
  2023-01-12 18:25     ` Atish Kumar Patra
@ 2023-01-13  7:25       ` Andrew Jones
  0 siblings, 0 replies; 37+ messages in thread
From: Andrew Jones @ 2023-01-13  7:25 UTC (permalink / raw)
  To: Atish Kumar Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 10:25:06AM -0800, Atish Kumar Patra wrote:
> On Thu, Jan 12, 2023 at 2:28 AM Andrew Jones <ajones@ventanamicro.com> wrote:
> >
> > On Thu, Dec 15, 2022 at 09:00:38AM -0800, Atish Patra wrote:
> > > According to the SBI specification, the stop function can only
> > > return error code SBI_ERR_FAILED. However, currently it returns
> > > -EINVAL which will be mapped SBI_ERR_INVALID_PARAM.
> >
> > I presume the mapping referred to here is kvm_linux_err_map_sbi().
> > If so, then -EPERM isn't correct either. That maps to SBI_ERR_DENIED.
> > The only thing that will ensure we get SBI_ERR_FAILURE (-1) is
> > anything not handled by the kvm_linux_err_map_sbi switch, as we
> > need to use the default.
> >
> 
> It returns SBI_ERR_FAILURE in the next patch when kvm_linux_err_map_sbi
> is removed. Maybe I should drop this patch. The next patch does the
> correct thing anyway.

Yeah, I saw that, but then we silently fix a bug in the next patch.
I like that this is a separate patch, but it should do what it says
it's doing :-)

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code
  2023-01-12 18:47     ` Atish Kumar Patra
@ 2023-01-13  7:42       ` Andrew Jones
  0 siblings, 0 replies; 37+ messages in thread
From: Andrew Jones @ 2023-01-13  7:42 UTC (permalink / raw)
  To: Atish Kumar Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 10:47:13AM -0800, Atish Kumar Patra wrote:
> On Thu, Jan 12, 2023 at 3:04 AM Andrew Jones <ajones@ventanamicro.com> wrote:
> >
> > On Thu, Dec 15, 2022 at 09:00:39AM -0800, Atish Patra wrote:
...
> > > +      * and forwards the error to the userspace.
> > > +      */
> > > +     if (ret < 0) {
> > > +             next_sepc = false;
> > > +             goto ecall_done;
> > > +     }
> >
> > Shouldn't this ret < 0 check go above the "Handle special error cases..."
> > block?
> >
> 
> Why ? I thought if an extension handler has both utrap updated and
> returns an error,
> user space redirection should take precedence. Let me know if it is wrong.

My thinking was that utrap cannot be trusted if the handler returned an
error. The handler may not even have had a chance to set it before hitting
the error condition. IOW, we should check 'ret' right after the handler
call. If it's < 0, then from the caller's perspective the handler didn't
execute due to a KVM failure and it should be reported to userspace.
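
Roughly this is the ordering I have in mind (just a sketch against the
code in this patch, untested):

	ret = sbi_ext->handler(vcpu, run, &edata_out, &utrap);
	if (ret < 0) {
		/* KVM failure: don't trust utrap or edata_out, go to userspace */
		next_sepc = false;
		goto ecall_done;
	}

	/* only now look at utrap and the other special cases */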

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling
  2022-12-15 17:00 ` [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling Atish Patra
@ 2023-01-13 11:45   ` Andrew Jones
  2023-01-23  7:23     ` Andrew Jones
  2023-01-26  0:50     ` Atish Patra
  0 siblings, 2 replies; 37+ messages in thread
From: Andrew Jones @ 2023-01-13 11:45 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:45AM -0800, Atish Patra wrote:
> RISC-V SBI PMU & Sscofpmf ISA extension allows supporting perf in
> the virtualization enviornment as well. KVM implementation
> relies on SBI PMU extension for most the most part while trapping

s/most the most/the most/

> & emulating the CSRs read for counter access.
> 
> This patch doesn't have the event sampling support yet.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/kvm/vcpu_pmu.c | 358 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 342 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 53c4163..21c1f0f 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -12,10 +12,163 @@
>  #include <linux/perf/riscv_pmu.h>
>  #include <asm/csr.h>
>  #include <asm/kvm_vcpu_sbi.h>
> +#include <asm/bitops.h>
>  #include <asm/kvm_vcpu_pmu.h>
>  #include <linux/kvm_host.h>
>  
>  #define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> +#define get_event_type(x) ((x & SBI_PMU_EVENT_IDX_TYPE_MASK) >> 16)
> +#define get_event_code(x) (x & SBI_PMU_EVENT_IDX_CODE_MASK)

Should put () around x

> +
> +static inline u64 pmu_get_sample_period(struct kvm_pmc *pmc)

I'd rather we use kvm_pmu_ for the prefix instead of just pmu for
this and all the other functions.

Also I'd drop the inline attribute here and in other functions below to
let the compiler completely decide what to do.

> +{
> +	u64 counter_val_mask = GENMASK(pmc->cinfo.width, 0);
> +	u64 sample_period;
> +
> +	if (!pmc->counter_val)
> +		sample_period = counter_val_mask;
> +	else
> +		sample_period = (-pmc->counter_val) & counter_val_mask;

I probably don't understand this, since I see arm64/kvm doing the same
thing, but if sample_period is the number of remaining counts a counter
has, then I'd write it as

  sample_period = counter_val_mask - (pmc->counter_val & counter_val_mask);

As it stands, the pmc->counter_val == 0 case would be the same, but
the other case differs by 1. arm64/kvm doesn't appear to handle the
zero case.
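
For example (a worked case, assuming a counter width of 47, i.e.
counter_val_mask = 0xffffffffffff, and pmc->counter_val = 0x10):

	(-pmc->counter_val) & counter_val_mask                    = 0xfffffffffff0
	counter_val_mask - (pmc->counter_val & counter_val_mask)  = 0xffffffffffef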

> +
> +	return sample_period;
> +}
> +
> +static u32 pmu_get_perf_event_type(unsigned long eidx)
> +{
> +	enum sbi_pmu_event_type etype = get_event_type(eidx);
> +	u32 type;
> +
> +	if (etype == SBI_PMU_EVENT_TYPE_HW)
> +		type = PERF_TYPE_HARDWARE;
> +	else if (etype == SBI_PMU_EVENT_TYPE_CACHE)
> +		type = PERF_TYPE_HW_CACHE;
> +	else if (etype == SBI_PMU_EVENT_TYPE_RAW || etype == SBI_PMU_EVENT_TYPE_FW)
> +		type = PERF_TYPE_RAW;
> +	else
> +		type = PERF_TYPE_MAX;

nit: This might look nicer as a switch
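
e.g. (same mapping as above, untested):

	switch (etype) {
	case SBI_PMU_EVENT_TYPE_HW:
		return PERF_TYPE_HARDWARE;
	case SBI_PMU_EVENT_TYPE_CACHE:
		return PERF_TYPE_HW_CACHE;
	case SBI_PMU_EVENT_TYPE_RAW:
	case SBI_PMU_EVENT_TYPE_FW:
		return PERF_TYPE_RAW;
	default:
		return PERF_TYPE_MAX;
	}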

> +
> +	return type;
> +}
> +
> +static inline bool pmu_is_fw_event(unsigned long eidx)
> +{
> +

extra blank line here

> +	return get_event_type(eidx) == SBI_PMU_EVENT_TYPE_FW;
> +}
> +
> +static void pmu_release_perf_event(struct kvm_pmc *pmc)
> +{
> +	if (pmc->perf_event) {
> +		perf_event_disable(pmc->perf_event);
> +		perf_event_release_kernel(pmc->perf_event);
> +		pmc->perf_event = NULL;
> +	}
> +}
> +
> +static u64 pmu_get_perf_event_hw_config(u32 sbi_event_code)
> +{
> +	/* SBI PMU HW event code is offset by 1 from perf hw event codes */
> +	return (u64)sbi_event_code - 1;

This is probably fine to do since we're mapping specified codes to UAPI
codes, so it's unlikely something will change this relationship, but if we
expose the mapping explicitly it'll give code readers a chance to see
what's what without looking stuff up elsewhere. How about creating an
array indexed by sbi_event_code with the mapping?

  enum perf_hw_id hw_event_perf_map[SBI_PMU_HW_GENERAL_MAX] = {
  [SBI_PMU_HW_CPU_CYCLES] = PERF_COUNT_HW_CPU_CYCLES,
  ...
  };
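
and then the lookup is just an array access (sketch, assuming the table
above is fully populated and sbi_event_code has already been validated):

  static u64 pmu_get_perf_event_hw_config(u32 sbi_event_code)
  {
          return hw_event_perf_map[sbi_event_code];
  }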

> +}
> +
> +static u64 pmu_get_perf_event_cache_config(u32 sbi_event_code)
> +{
> +	u64 config = U64_MAX;
> +	unsigned int cache_type, cache_op, cache_result;
> +
> +	/* All the cache event masks lie within 0xFF. No separate masking is necesssary */
> +	cache_type = (sbi_event_code & SBI_PMU_EVENT_CACHE_ID_CODE_MASK) >> 3;

It'd be nice to have a define for that 3 shift.

> +	cache_op = (sbi_event_code & SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK) >> 1;

Also for this 1 shift.
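
e.g. (define names made up here, whatever fits the existing
SBI_PMU_EVENT_* naming):

	#define SBI_PMU_EVENT_CACHE_ID_SHIFT	3
	#define SBI_PMU_EVENT_CACHE_OP_SHIFT	1

	cache_type = (sbi_event_code & SBI_PMU_EVENT_CACHE_ID_CODE_MASK) >>
		     SBI_PMU_EVENT_CACHE_ID_SHIFT;
	cache_op = (sbi_event_code & SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK) >>
		   SBI_PMU_EVENT_CACHE_OP_SHIFT;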

> +	cache_result = sbi_event_code & SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK;
> +
> +	if (cache_type >= PERF_COUNT_HW_CACHE_MAX ||
> +	    cache_op >= PERF_COUNT_HW_CACHE_OP_MAX ||
> +	    cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
> +		return config;
> +
> +	config = cache_type | (cache_op << 8) | (cache_result << 16);
> +
> +	return config;
> +}
> +
> +static u64 pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data)
> +{
> +	enum sbi_pmu_event_type etype = get_event_type(eidx);
> +	u32 ecode = get_event_code(eidx);
> +	u64 config = U64_MAX;
> +
> +	if (etype == SBI_PMU_EVENT_TYPE_HW)
> +		config = pmu_get_perf_event_hw_config(ecode);
> +	else if (etype == SBI_PMU_EVENT_TYPE_CACHE)
> +		config = pmu_get_perf_event_cache_config(ecode);
> +	else if (etype == SBI_PMU_EVENT_TYPE_RAW)
> +		config = evt_data & RISCV_PMU_RAW_EVENT_MASK;
> +	else if ((etype == SBI_PMU_EVENT_TYPE_FW) && (ecode < SBI_PMU_FW_MAX))
> +		config = (1ULL << 63) | ecode;

nit: switch?

> +
> +	return config;
> +}
> +
> +static int pmu_get_fixed_pmc_index(unsigned long eidx)
> +{
> +	u32 etype = pmu_get_perf_event_type(eidx);
> +	u32 ecode = get_event_code(eidx);
> +	int ctr_idx;
> +
> +	if (etype != SBI_PMU_EVENT_TYPE_HW)
> +		return -EINVAL;
> +
> +	if (ecode == SBI_PMU_HW_CPU_CYCLES)
> +		ctr_idx = 0;
> +	else if (ecode == SBI_PMU_HW_INSTRUCTIONS)
> +		ctr_idx = 2;

nit: Could drop the ctr_idx variable and return directly

> +	else
> +		return -EINVAL;
> +
> +	return ctr_idx;
> +}
> +
> +static int pmu_get_programmable_pmc_index(struct kvm_pmu *kvpmu, unsigned long eidx,
> +					  unsigned long cbase, unsigned long cmask)
> +{
> +	int ctr_idx = -1;
> +	int i, pmc_idx;
> +	int min, max;
> +
> +	if (pmu_is_fw_event(eidx)) {
> +		/* Firmware counters are mapped 1:1 starting from num_hw_ctrs for simplicity */
> +		min = kvpmu->num_hw_ctrs;
> +		max = min + kvpmu->num_fw_ctrs;
> +	} else {
> +		/* First 3 counters are reserved for fixed counters */
> +		min = 3;
> +		max = kvpmu->num_hw_ctrs;
> +	}
> +
> +	for_each_set_bit(i, &cmask, BITS_PER_LONG) {
> +		pmc_idx = i + cbase;
> +		if ((pmc_idx >= min && pmc_idx < max) &&
> +		    !test_bit(pmc_idx, kvpmu->pmc_in_use)) {
> +			ctr_idx = pmc_idx;
> +			break;

nit: Could drop the ctr_idx variable and return directly

> +		}
> +	}
> +
> +	return ctr_idx;
> +}
> +
> +static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
> +			     unsigned long cbase, unsigned long cmask)
> +{
> +	int ret;
> +
> +	/* Fixed counters need to be have fixed mapping as they have different width */
> +	ret = pmu_get_fixed_pmc_index(eidx);
> +	if (ret >= 0)
> +		return ret;
> +
> +	return pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
> +}
>  
>  static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
>  			unsigned long *out_val)
> @@ -82,7 +235,41 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  				 unsigned long ctr_mask, unsigned long flag, uint64_t ival,
>  				 struct kvm_vcpu_sbi_ext_data *edata)
>  {
> -	/* TODO */
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	int i, num_ctrs, pmc_index, sbiret = 0;
> +	struct kvm_pmc *pmc;
> +
> +	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;

We can put kvm_pmu_num_counters(kvpmu) in its one use below and
drop num_ctrs.

> +	if (ctr_base + __fls(ctr_mask) >= num_ctrs) {

__fls is undefined when ctr_mask is zero, so we should check it first,
unless the caller of this function is required to check it.

(Same two comments for the functions below that have the same check.)
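
i.e. something along the lines of (sketch, combining this with the
kvm_pmu_num_counters() comment above):

	if (!ctr_mask ||
	    ctr_base + __fls(ctr_mask) >= kvm_pmu_num_counters(kvpmu)) {
		sbiret = SBI_ERR_INVALID_PARAM;
		goto out;
	}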

> +		sbiret = SBI_ERR_INVALID_PARAM;
> +		goto out;
> +	}
> +
> +	/* Start the counters that have been configured and requested by the guest */
> +	for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> +		pmc_index = i + ctr_base;
> +		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> +			continue;
> +		pmc = &kvpmu->pmc[pmc_index];
> +		if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> +			pmc->counter_val = ival;
> +		if (pmc->perf_event) {
> +			if (unlikely(pmc->started)) {
> +				sbiret = SBI_ERR_ALREADY_STARTED;
> +				continue;
> +			}
> +			perf_event_period(pmc->perf_event, pmu_get_sample_period(pmc));
> +			perf_event_enable(pmc->perf_event);
> +			pmc->started = true;
> +		} else {
> +			kvm_debug("Can not start counter due to invalid confiugartion\n");
                                   ^ Cannot                             ^ configuration

> +			sbiret = SBI_ERR_INVALID_PARAM;
> +		}
> +	}

Possibly a spec oversight is that we continue to try and start counters,
even when we've seen errors. The problem with implementing that is that
if we have both errors we only return the last one. I.e. one counter
was already started and another counter resulted in invalid-param, we
only return invalid-param. We also don't say anything about the number
of failures / successes. I think we should bail on the first error and
even stop counters that we started. Callers can then try again after
correcting their input without potentially getting already-started errors.
We'd need to change the spec to do that though.

> +
> +out:
> +	edata->err_val = sbiret;
> +
>  	return 0;
>  }
>  
> @@ -90,16 +277,142 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  				unsigned long ctr_mask, unsigned long flag,
>  				struct kvm_vcpu_sbi_ext_data *edata)
>  {
> -	/* TODO */
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	int i, num_ctrs, pmc_index, sbiret = 0;
> +	u64 enabled, running;
> +	struct kvm_pmc *pmc;
> +
> +	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> +	if ((ctr_base + __fls(ctr_mask)) >= num_ctrs) {

nit: Can drop ()

> +		sbiret = SBI_ERR_INVALID_PARAM;
> +		goto out;
> +	}
> +
> +	/* Stop the counters that have been configured and requested by the guest */
> +	for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> +		pmc_index = i + ctr_base;
> +		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> +			continue;
> +		pmc = &kvpmu->pmc[pmc_index];
> +		if (pmc->perf_event) {
> +			if (pmc->started) {
> +				/* Stop counting the counter */
> +				perf_event_disable(pmc->perf_event);
> +				pmc->started = false;
> +			} else
> +				sbiret = SBI_ERR_ALREADY_STOPPED;
> +
> +			if (flag & SBI_PMU_STOP_FLAG_RESET) {
> +				/* Relase the counter if this is a reset request */
> +				pmc->counter_val += perf_event_read_value(pmc->perf_event,
> +									  &enabled, &running);
> +				pmu_release_perf_event(pmc);
> +				clear_bit(pmc_index, kvpmu->pmc_in_use);
> +			}
> +		} else {
> +			kvm_debug("Can not stop counter due to invalid confiugartion\n");
                                   ^Cannot                             ^ configuration

> +			sbiret = SBI_ERR_INVALID_PARAM;
> +		}
> +	}

Same comment about the multiple errors problem.

> +
> +out:
> +	edata->err_val = sbiret;
> +
>  	return 0;
>  }
>  
>  int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  				     unsigned long ctr_mask, unsigned long flag,
> -				     unsigned long eidx, uint64_t edata,
> -				     struct kvm_vcpu_sbi_ext_data *extdata)
> +				     unsigned long eidx, uint64_t evt_data,
> +				     struct kvm_vcpu_sbi_ext_data *ext_data)

This should be part of the skeleton patch.

>  {
> -	/* TODO */
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	struct perf_event *event;
> +	struct perf_event_attr attr;
> +	int num_ctrs, ctr_idx;
> +	u32 etype = pmu_get_perf_event_type(eidx);
> +	u64 config;
> +	struct kvm_pmc *pmc;
> +	int sbiret = 0;
> +
> +
> +	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> +	if (etype == PERF_TYPE_MAX || (ctr_base + __fls(ctr_mask) >= num_ctrs)) {
> +		sbiret = SBI_ERR_INVALID_PARAM;
> +		goto out;
> +	}
> +
> +	if (pmu_is_fw_event(eidx)) {
> +		sbiret = SBI_ERR_NOT_SUPPORTED;
> +		goto out;
> +	}
> +
> +	/*
> +	 * SKIP_MATCH flag indicates the caller is aware of the assigned counter
> +	 * for this event. Just do a sanity check if it already marked used.
> +	 */
> +	if (flag & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
> +		if (!test_bit(ctr_base, kvpmu->pmc_in_use)) {
> +			sbiret = SBI_ERR_FAILURE;
> +			goto out;

I see this is the same way OpenSBI implements this, but I don't really
understand it. The spec says

"""
NOTE: When SBI_PMU_CFG_FLAG_SKIP_MATCH is set in config_flags, the SBI
implementation will unconditionally select the first counter from the set
of counters specified by the counter_idx_base and counter_idx_mask.
"""

which doesn't say anything about the caller just wanting to confirm the
counter is in use. Indeed, the spec says this function is for

"""
Find and configure a counter from a set of counters which is not started
(or enabled) and can monitor the specified event.
"""

which would imply errors should be returned any time we match something
already started, rather than the opposite.

Also, I think we should be using ctr_base + first-set-bit(ctr_mask),
because the spec says to look at both the base and the mask, and take
the first counter.

Finally, the spec doesn't have SBI_ERR_FAILURE listed as a possible
error.
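
i.e. for the SKIP_MATCH case, select the counter with something like
(sketch, assuming ctr_mask has already been checked for zero):

	ctr_idx = ctr_base + __ffs(ctr_mask);

and then do the in-use check on ctr_idx rather than on ctr_base.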

> +		}
> +		ctr_idx = ctr_base;
> +		goto match_done;

nit: We don't need the match_done label and goto. We can just use
an if-else.
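
i.e. (same logic, just restructured without the label; untested):

	if (flag & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
		if (!test_bit(ctr_base, kvpmu->pmc_in_use)) {
			sbiret = SBI_ERR_FAILURE;
			goto out;
		}
		ctr_idx = ctr_base;
	} else {
		ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
		if (ctr_idx < 0) {
			sbiret = SBI_ERR_NOT_SUPPORTED;
			goto out;
		}
	}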

> +	}
> +
> +	ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
> +	if (ctr_idx < 0) {
> +		sbiret = SBI_ERR_NOT_SUPPORTED;
> +		goto out;
> +	}
> +
> +match_done:
> +	pmc = &kvpmu->pmc[ctr_idx];
> +	pmu_release_perf_event(pmc);
> +	pmc->idx = ctr_idx;
> +
> +	config = pmu_get_perf_event_config(eidx, evt_data);
> +	memset(&attr, 0, sizeof(struct perf_event_attr));

At the top of the function we can use

        struct perf_event_attr attr = {
          .type = etype,
          .size = sizeof(struct perf_event_attr),

	  /* anything else we know at the top */

	};

instead of the memset and some of the assignments here.

> +	attr.type = etype;
> +	attr.size = sizeof(attr);
> +	attr.pinned = true;
> +
> +	/*
> +	 * It should never reach here if the platform doesn't support sscofpmf extensio
                                                                     ^ the     ^ extension

> +	 * as mode filtering won't work without it.
> +	 */
> +	attr.exclude_host = true;
> +	attr.exclude_hv = true;
> +	attr.exclude_user = !!(flag & SBI_PMU_CFG_FLAG_SET_UINH);
> +	attr.exclude_kernel = !!(flag & SBI_PMU_CFG_FLAG_SET_SINH);
> +	attr.config = config;
> +	attr.config1 = RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS;
> +	if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
> +		//TODO: Do we really want to clear the value in hardware counter
> +		pmc->counter_val = 0;
> +	}
> +
> +	/*
> +	 * Set the default sample_period for now. The guest specified value
> +	 * will be updated in the start call.
> +	 */
> +	attr.sample_period = pmu_get_sample_period(pmc);
> +
> +	event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
> +	if (IS_ERR(event)) {
> +		pr_err("kvm pmu event creation failed event %pe for eidx %lx\n", event, eidx);
> +		return -EOPNOTSUPP;

event is an error, so we shouldn't call it an event in the error message.
How about

  pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));

and then return PTR_ERR(event) instead of -EOPNOTSUPP.

> +	}
> +
> +	set_bit(ctr_idx, kvpmu->pmc_in_use);
> +	pmc->perf_event = event;
> +	if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> +		perf_event_enable(pmc->perf_event);
> +
> +	ext_data->out_val = ctr_idx;
> +out:
> +	ext_data->err_val = sbiret;
> +
>  	return 0;
>  }
>  
> @@ -119,6 +432,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  {
>  	int i = 0, num_fw_ctrs, ret, num_hw_ctrs = 0, hpm_width = 0;
>  	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	struct kvm_pmc *pmc;
>  
>  	ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
>  	if (ret < 0)
> @@ -134,6 +448,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  	else
>  		num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
>  
> +	bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);

Could move this bitmap clearing to deinit as it should be clear the
first time already since vcpus are allocated with __GFP_ZERO.

>  	kvpmu->num_hw_ctrs = num_hw_ctrs;
>  	kvpmu->num_fw_ctrs = num_fw_ctrs;
>  	/*
> @@ -146,24 +461,26 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  		/* TIME CSR shouldn't be read from perf interface */
>  		if (i == 1)
>  			continue;
> -		kvpmu->pmc[i].idx = i;
> +		pmc = &kvpmu->pmc[i];
> +		pmc->idx = i;
> +		pmc->counter_val = 0;

Also could be moved to deinit.

>  		if (i < kvpmu->num_hw_ctrs) {
>  			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
>  			if (i < 3)
>  				/* CY, IR counters */
> -				kvpmu->pmc[i].cinfo.width = 63;
> +				pmc->cinfo.width = 63;
>  			else
> -				kvpmu->pmc[i].cinfo.width = hpm_width;
> +				pmc->cinfo.width = hpm_width;
>  			/*
>  			 * The CSR number doesn't have any relation with the logical
>  			 * hardware counters. The CSR numbers are encoded sequentially
>  			 * to avoid maintaining a map between the virtual counter
>  			 * and CSR number.
>  			 */
> -			kvpmu->pmc[i].cinfo.csr = CSR_CYCLE + i;
> +			pmc->cinfo.csr = CSR_CYCLE + i;
>  		} else {
> -			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_FW;
> -			kvpmu->pmc[i].cinfo.width = BITS_PER_LONG - 1;
> +			pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
> +			pmc->cinfo.width = BITS_PER_LONG - 1;

Almost all of these changes can be avoided by using the pmc pointer in the
skeleton patch from the start.

>  		}
>  	}
>  
> @@ -172,13 +489,22 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> -void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
>  {
> -	/* TODO */
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	struct kvm_pmc *pmc;
> +	int i;
> +
> +	if (!kvpmu)
> +		return;
> +
> +	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
> +		pmc = &kvpmu->pmc[i];
> +		pmu_release_perf_event(pmc);
> +	}
>  }
>  
> -void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
>  {
> -	/* TODO */
> +	kvm_riscv_vcpu_pmu_deinit(vcpu);

The skeleton patch could put deinit and reset in the right order to avoid
this change.

>  }
> -
> -- 
> 2.25.1
> 

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 09/11] RISC-V: KVM: Implement trap & emulate for hpmcounters
  2022-12-15 17:00 ` [PATCH v2 09/11] RISC-V: KVM: Implement trap & emulate for hpmcounters Atish Patra
@ 2023-01-13 11:47   ` Andrew Jones
  0 siblings, 0 replies; 37+ messages in thread
From: Andrew Jones @ 2023-01-13 11:47 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:44AM -0800, Atish Patra wrote:
> As the KVM guests only see the virtual PMU counters, all hpmcounter
> access should trap and KVM emulates the read access on behalf of guests.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/include/asm/kvm_vcpu_pmu.h | 16 ++++++++++
>  arch/riscv/kvm/vcpu_insn.c            |  4 ++-
>  arch/riscv/kvm/vcpu_pmu.c             | 44 ++++++++++++++++++++++++++-
>  3 files changed, 62 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> index 6a8c0f7..7a9a8e6 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> @@ -43,6 +43,19 @@ struct kvm_pmu {
>  #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
>  #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
>  
> +#if defined(CONFIG_32BIT)
> +#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
> +{ .base = CSR_CYCLEH,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm }, \
> +{ .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
> +#else
> +#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
> +{ .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
> +#endif
> +
> +int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
> +				unsigned long *val, unsigned long new_val,
> +				unsigned long wr_mask);
> +
>  int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata);
>  int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
>  				struct kvm_vcpu_sbi_ext_data *edata);
> @@ -65,6 +78,9 @@ void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
>  #else
>  struct kvm_pmu {
>  };
> +#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
> +{ .base = 0,      .count = 0, .func = NULL },
> +
>  
>  static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  {
> diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c
> index 1ff2649..f689337 100644
> --- a/arch/riscv/kvm/vcpu_insn.c
> +++ b/arch/riscv/kvm/vcpu_insn.c
> @@ -213,7 +213,9 @@ struct csr_func {
>  		    unsigned long wr_mask);
>  };
>  
> -static const struct csr_func csr_funcs[] = {};
> +static const struct csr_func csr_funcs[] = {
> +	KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS
> +};
>  
>  /**
>   * kvm_riscv_vcpu_csr_return -- Handle CSR read/write after user space
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 0f0748f1..53c4163 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -17,6 +17,43 @@
>  
>  #define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
>  
> +static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> +			unsigned long *out_val)
> +{
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	struct kvm_pmc *pmc;
> +	u64 enabled, running;
> +
> +	pmc = &kvpmu->pmc[cidx];
> +	if (!pmc->perf_event)
> +		return -EINVAL;
> +
> +	pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
> +	*out_val = pmc->counter_val;
> +
> +	return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
> +				unsigned long *val, unsigned long new_val,
> +				unsigned long wr_mask)
> +{
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	int cidx, ret = KVM_INSN_CONTINUE_NEXT_SEPC;
> +
> +	if (!kvpmu || !kvpmu->init_done)
> +		return KVM_INSN_EXIT_TO_USER_SPACE;
> +
> +	if (wr_mask)
> +		return KVM_INSN_ILLEGAL_TRAP;

nit: add blank line here

> +	cidx = csr_num - CSR_CYCLE;
> +
> +	if (pmu_ctr_read(vcpu, cidx, val) < 0)
> +		return KVM_INSN_EXIT_TO_USER_SPACE;
> +
> +	return ret;
> +}
> +
>  int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_ext_data *edata)
>  {
>  	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> @@ -69,7 +106,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
>  int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
>  				struct kvm_vcpu_sbi_ext_data *edata)
>  {
> -	/* TODO */
> +	int ret;
> +
> +	ret = pmu_ctr_read(vcpu, cidx, &edata->out_val);
> +	if (ret == -EINVAL)
> +		edata->err_val = SBI_ERR_INVALID_PARAM;
> +
>  	return 0;
>  }
>  
> -- 
> 2.25.1
>

Otherwise

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/11] RISC-V: KVM: Implement firmware events
  2022-12-15 17:00 ` [PATCH v2 11/11] RISC-V: KVM: Implement firmware events Atish Patra
@ 2023-01-13 12:08   ` Andrew Jones
  2023-01-26  3:08     ` Atish Patra
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Jones @ 2023-01-13 12:08 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Dec 15, 2022 at 09:00:46AM -0800, Atish Patra wrote:
> SBI PMU extension defines a set of firmware events which can provide
> useful information to guests about number of SBI calls. As hypervisor
> implements the SBI PMU extension, these firmware events corresponds
> to ecall invocations between VS->HS mode. All other firmware events
> will always report zero if monitored as KVM doesn't implement them.
> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  arch/riscv/include/asm/kvm_vcpu_pmu.h |  16 ++++
>  arch/riscv/include/asm/sbi.h          |   2 +-
>  arch/riscv/kvm/tlb.c                  |   6 +-
>  arch/riscv/kvm/vcpu_pmu.c             | 105 ++++++++++++++++++++++----
>  arch/riscv/kvm/vcpu_sbi_replace.c     |   7 ++
>  5 files changed, 119 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> index 7a9a8e6..cccc6182 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> @@ -17,6 +17,14 @@
>  #define RISCV_KVM_MAX_FW_CTRS 32
>  #define RISCV_MAX_COUNTERS      64
>  
> +struct kvm_fw_event {
> +	/* Current value of the event */
> +	unsigned long value;
> +
> +	/* Event monitoring status */
> +	bool started;
> +};
> +
>  /* Per virtual pmu counter data */
>  struct kvm_pmc {
>  	u8 idx;
> @@ -25,11 +33,14 @@ struct kvm_pmc {
>  	union sbi_pmu_ctr_info cinfo;
>  	/* Event monitoring status */
>  	bool started;
> +	/* Monitoring event ID */
> +	unsigned long event_idx;
>  };
>  
>  /* PMU data structure per vcpu */
>  struct kvm_pmu {
>  	struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> +	struct kvm_fw_event fw_event[RISCV_KVM_MAX_FW_CTRS];
>  	/* Number of the virtual firmware counters available */
>  	int num_fw_ctrs;
>  	/* Number of the virtual hardware counters available */
> @@ -52,6 +63,7 @@ struct kvm_pmu {
>  { .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
>  #endif
>  
> +int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid);
>  int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
>  				unsigned long *val, unsigned long new_val,
>  				unsigned long wr_mask);
> @@ -81,6 +93,10 @@ struct kvm_pmu {
>  #define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
>  { .base = 0,      .count = 0, .func = NULL },
>  
> +static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
> +{
> +	return 0;
> +}
>  
>  static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  {
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 2a0ef738..a192a95a 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -171,7 +171,7 @@ enum sbi_pmu_fw_generic_events_t {
>  	SBI_PMU_FW_IPI_SENT		= 6,
>  	SBI_PMU_FW_IPI_RECVD		= 7,
>  	SBI_PMU_FW_FENCE_I_SENT		= 8,
> -	SBI_PMU_FW_FENCE_I_RECVD	= 9,
> +	SBI_PMU_FW_FENCE_I_RCVD		= 9,

This should probably be in its own patch.

>  	SBI_PMU_FW_SFENCE_VMA_SENT	= 10,
>  	SBI_PMU_FW_SFENCE_VMA_RCVD	= 11,
>  	SBI_PMU_FW_SFENCE_VMA_ASID_SENT	= 12,
> diff --git a/arch/riscv/kvm/tlb.c b/arch/riscv/kvm/tlb.c
> index 309d79b..de81920 100644
> --- a/arch/riscv/kvm/tlb.c
> +++ b/arch/riscv/kvm/tlb.c
> @@ -181,6 +181,7 @@ void kvm_riscv_local_tlb_sanitize(struct kvm_vcpu *vcpu)
>  
>  void kvm_riscv_fence_i_process(struct kvm_vcpu *vcpu)
>  {
> +	kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_RCVD);
>  	local_flush_icache_all();
>  }
>  
> @@ -264,15 +265,18 @@ void kvm_riscv_hfence_process(struct kvm_vcpu *vcpu)
>  						d.addr, d.size, d.order);
>  			break;
>  		case KVM_RISCV_HFENCE_VVMA_ASID_GVA:
> +			kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
>  			kvm_riscv_local_hfence_vvma_asid_gva(
>  						READ_ONCE(v->vmid), d.asid,
>  						d.addr, d.size, d.order);
>  			break;
>  		case KVM_RISCV_HFENCE_VVMA_ASID_ALL:
> +			kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
>  			kvm_riscv_local_hfence_vvma_asid_all(
>  						READ_ONCE(v->vmid), d.asid);
>  			break;
>  		case KVM_RISCV_HFENCE_VVMA_GVA:
> +			kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_RCVD);
>  			kvm_riscv_local_hfence_vvma_gva(
>  						READ_ONCE(v->vmid),
>  						d.addr, d.size, d.order);
> @@ -323,7 +327,7 @@ void kvm_riscv_fence_i(struct kvm *kvm,
>  		       unsigned long hbase, unsigned long hmask)
>  {
>  	make_xfence_request(kvm, hbase, hmask, KVM_REQ_FENCE_I,
> -			    KVM_REQ_FENCE_I, NULL);
> +		    KVM_REQ_FENCE_I, NULL);

stray change, and whitespace was correct before

>  }
>  
>  void kvm_riscv_hfence_gvma_vmid_gpa(struct kvm *kvm,
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 21c1f0f..a64a7ae 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -170,18 +170,36 @@ static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
>  	return pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
>  }
>  
> +int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
> +{
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	struct kvm_fw_event *fevent;
> +
> +	if (!kvpmu || fid >= SBI_PMU_FW_MAX)
> +		return -EINVAL;
> +
> +	fevent = &kvpmu->fw_event[fid];
> +	if (fevent->started)
> +		fevent->value++;
> +
> +	return 0;
> +}
> +
>  static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
>  			unsigned long *out_val)
>  {
>  	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
>  	struct kvm_pmc *pmc;
>  	u64 enabled, running;
> +	int fevent_code;
>  
>  	pmc = &kvpmu->pmc[cidx];
> -	if (!pmc->perf_event)
> -		return -EINVAL;
>  
> -	pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
> +	if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> +		fevent_code = get_event_code(pmc->event_idx);
> +		pmc->counter_val = kvpmu->fw_event[fevent_code].value;
> +	} else if (pmc->perf_event)
> +		pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
>  	*out_val = pmc->counter_val;
>  
>  	return 0;
> @@ -238,6 +256,7 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
>  	int i, num_ctrs, pmc_index, sbiret = 0;
>  	struct kvm_pmc *pmc;
> +	int fevent_code;
>  
>  	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
>  	if (ctr_base + __fls(ctr_mask) >= num_ctrs) {
> @@ -253,7 +272,22 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  		pmc = &kvpmu->pmc[pmc_index];
>  		if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
>  			pmc->counter_val = ival;
> -		if (pmc->perf_event) {
> +		if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> +			fevent_code = get_event_code(pmc->event_idx);
> +			if (fevent_code >= SBI_PMU_FW_MAX) {
> +				sbiret = SBI_ERR_INVALID_PARAM;
> +				goto out;
> +			}
> +
> +			/* Check if the counter was already started for some reason */
> +			if (kvpmu->fw_event[fevent_code].started) {
> +				sbiret = SBI_ERR_ALREADY_STARTED;
> +				continue;
> +			}
> +
> +			kvpmu->fw_event[fevent_code].started = true;
> +			kvpmu->fw_event[fevent_code].value = pmc->counter_val;
> +		} else if (pmc->perf_event) {
>  			if (unlikely(pmc->started)) {
>  				sbiret = SBI_ERR_ALREADY_STARTED;
>  				continue;
> @@ -281,6 +315,7 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  	int i, num_ctrs, pmc_index, sbiret = 0;
>  	u64 enabled, running;
>  	struct kvm_pmc *pmc;
> +	int fevent_code;
>  
>  	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
>  	if ((ctr_base + __fls(ctr_mask)) >= num_ctrs) {
> @@ -294,7 +329,18 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
>  			continue;
>  		pmc = &kvpmu->pmc[pmc_index];
> -		if (pmc->perf_event) {
> +		if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> +			fevent_code = get_event_code(pmc->event_idx);
> +			if (fevent_code >= SBI_PMU_FW_MAX) {
> +				sbiret = SBI_ERR_INVALID_PARAM;
> +				goto out;
> +			}
> +
> +			if (!kvpmu->fw_event[fevent_code].started)
> +				sbiret = SBI_ERR_ALREADY_STOPPED;
> +
> +			kvpmu->fw_event[fevent_code].started = false;
> +		} else if (pmc->perf_event) {
>  			if (pmc->started) {
>  				/* Stop counting the counter */
>  				perf_event_disable(pmc->perf_event);
> @@ -307,12 +353,15 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  				pmc->counter_val += perf_event_read_value(pmc->perf_event,
>  									  &enabled, &running);
>  				pmu_release_perf_event(pmc);
> -				clear_bit(pmc_index, kvpmu->pmc_in_use);
>  			}
>  		} else {
>  			kvm_debug("Can not stop counter due to invalid confiugartion\n");
>  			sbiret = SBI_ERR_INVALID_PARAM;
>  		}
> +		if (flag & SBI_PMU_STOP_FLAG_RESET) {
> +			pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> +			clear_bit(pmc_index, kvpmu->pmc_in_use);

nit: I'd probably just leave clear_bit where it was and add

if (flag & SBI_PMU_STOP_FLAG_RESET)
   pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;

to the firmware arm.

> +		}
>  	}
>  
>  out:
> @@ -329,12 +378,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
>  	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
>  	struct perf_event *event;
>  	struct perf_event_attr attr;
> -	int num_ctrs, ctr_idx;
> +	int num_ctrs, ctr_idx, sbiret = 0;
>  	u32 etype = pmu_get_perf_event_type(eidx);
>  	u64 config;
> -	struct kvm_pmc *pmc;
> -	int sbiret = 0;
> -
> +	struct kvm_pmc *pmc = NULL;
> +	bool is_fevent;
> +	unsigned long event_code;
>  
>  	num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
>  	if (etype == PERF_TYPE_MAX || (ctr_base + __fls(ctr_mask) >= num_ctrs)) {
> @@ -342,7 +391,9 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
>  		goto out;
>  	}
>  
> -	if (pmu_is_fw_event(eidx)) {
> +	event_code = get_event_code(eidx);
> +	is_fevent = pmu_is_fw_event(eidx);
> +	if (is_fevent && event_code >= SBI_PMU_FW_MAX) {
>  		sbiret = SBI_ERR_NOT_SUPPORTED;
>  		goto out;
>  	}
> @@ -357,7 +408,10 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
>  			goto out;
>  		}
>  		ctr_idx = ctr_base;
> -		goto match_done;
> +		if (is_fevent)
> +			goto perf_event_done;
> +		else
> +			goto match_done;
>  	}
>  
>  	ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
> @@ -366,6 +420,13 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
>  		goto out;
>  	}
>  
> +	/*
> +	 * No need to create perf events for firmware events as the firmware counter
> +	 * is supposed to return the measurement of VS->HS mode invocations.
> +	 */
> +	if (is_fevent)
> +		goto perf_event_done;
> +
>  match_done:
>  	pmc = &kvpmu->pmc[ctr_idx];
>  	pmu_release_perf_event(pmc);
> @@ -404,10 +465,19 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
>  		return -EOPNOTSUPP;
>  	}
>  
> -	set_bit(ctr_idx, kvpmu->pmc_in_use);
>  	pmc->perf_event = event;
> -	if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> -		perf_event_enable(pmc->perf_event);

Maybe we can move the perf setup stuff into a helper function and
then guard it with an if-statement rather than have the gotos?
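
Something like this shape is what I'm picturing (the helper name is made
up, and it would contain the attr setup and
perf_event_create_kernel_counter() bits from above):

	if (!is_fevent) {
		pmc = &kvpmu->pmc[ctr_idx];
		ret = kvm_pmu_create_perf_event(vcpu, pmc, ctr_idx, eidx, evt_data, flag);
		if (ret)
			return ret;
	}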

> +perf_event_done:
> +	if (flag & SBI_PMU_CFG_FLAG_AUTO_START) {
> +		if (is_fevent)
> +			kvpmu->fw_event[event_code].started = true;
> +		else
> +			perf_event_enable(pmc->perf_event);
> +	}
> +	/* This should be only true for firmware events */
> +	if (!pmc)
> +		pmc = &kvpmu->pmc[ctr_idx];
> +	pmc->event_idx = eidx;
> +	set_bit(ctr_idx, kvpmu->pmc_in_use);
>  
>  	ext_data->out_val = ctr_idx;
>  out:
> @@ -451,6 +521,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  	bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
>  	kvpmu->num_hw_ctrs = num_hw_ctrs;
>  	kvpmu->num_fw_ctrs = num_fw_ctrs;
> +	memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));

I'm wondering if we need this array. We already have an underused pmc for
the fw events which has counter_val and started. Can't we just use those?

>  	/*
>  	 * There is no corelation betwen the logical hardware counter and virtual counters.
>  	 * However, we need to encode a hpmcounter CSR in the counter info field so that
> @@ -464,6 +535,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>  		pmc = &kvpmu->pmc[i];
>  		pmc->idx = i;
>  		pmc->counter_val = 0;
> +		pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
>  		if (i < kvpmu->num_hw_ctrs) {
>  			kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
>  			if (i < 3)
> @@ -501,7 +573,10 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
>  	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
>  		pmc = &kvpmu->pmc[i];
>  		pmu_release_perf_event(pmc);
> +		pmc->counter_val = 0;
> +		pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
>  	}
> +	memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
>  }
>  
>  void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> index d029136..3f39711 100644
> --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> @@ -11,6 +11,7 @@
>  #include <linux/kvm_host.h>
>  #include <asm/sbi.h>
>  #include <asm/kvm_vcpu_timer.h>
> +#include <asm/kvm_vcpu_pmu.h>
>  #include <asm/kvm_vcpu_sbi.h>
>  
>  static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> @@ -26,6 +27,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		return 0;
>  	}
>  
> +	kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_SET_TIMER);
>  #if __riscv_xlen == 32
>  	next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
>  #else
> @@ -58,6 +60,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		return 0;
>  	}
>  
> +	kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_IPI_SENT);
>  	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
>  		if (hbase != -1UL) {
>  			if (tmp->vcpu_id < hbase)
> @@ -68,6 +71,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		ret = kvm_riscv_vcpu_set_interrupt(tmp, IRQ_VS_SOFT);
>  		if (ret < 0)
>  			break;
> +		kvm_riscv_vcpu_pmu_incr_fw(tmp, SBI_PMU_FW_IPI_RECVD);
>  	}
>  
>  	return ret;
> @@ -91,6 +95,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
>  	switch (funcid) {
>  	case SBI_EXT_RFENCE_REMOTE_FENCE_I:
>  		kvm_riscv_fence_i(vcpu->kvm, hbase, hmask);
> +		kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_SENT);
>  		break;
>  	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
>  		if (cp->a2 == 0 && cp->a3 == 0)
> @@ -98,6 +103,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
>  		else
>  			kvm_riscv_hfence_vvma_gva(vcpu->kvm, hbase, hmask,
>  						  cp->a2, cp->a3, PAGE_SHIFT);
> +		kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_SENT);
>  		break;
>  	case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
>  		if (cp->a2 == 0 && cp->a3 == 0)
> @@ -108,6 +114,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
>  						       hbase, hmask,
>  						       cp->a2, cp->a3,
>  						       PAGE_SHIFT, cp->a4);
> +		kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_SENT);
>  		break;
>  	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
>  	case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> -- 
> 2.25.1
>

I think it'd be nice to break the application of
kvm_riscv_vcpu_pmu_incr_fw() out of this patch. I.e. introduce
kvm_riscv_vcpu_pmu_incr_fw() in this patch and then a second patch
applies it to all the ecalls.

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling
  2023-01-13 11:45   ` Andrew Jones
@ 2023-01-23  7:23     ` Andrew Jones
  2023-01-26  0:50     ` Atish Patra
  1 sibling, 0 replies; 37+ messages in thread
From: Andrew Jones @ 2023-01-23  7:23 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Anup Patel, Atish Patra, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Fri, Jan 13, 2023 at 12:45:02PM +0100, Andrew Jones wrote:
> On Thu, Dec 15, 2022 at 09:00:45AM -0800, Atish Patra wrote:
...
> > +	/* Start the counters that have been configured and requested by the guest */
> > +	for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> > +		pmc_index = i + ctr_base;
> > +		if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> > +			continue;
> > +		pmc = &kvpmu->pmc[pmc_index];
> > +		if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> > +			pmc->counter_val = ival;
> > +		if (pmc->perf_event) {
> > +			if (unlikely(pmc->started)) {
> > +				sbiret = SBI_ERR_ALREADY_STARTED;
> > +				continue;
> > +			}
> > +			perf_event_period(pmc->perf_event, pmu_get_sample_period(pmc));
> > +			perf_event_enable(pmc->perf_event);
> > +			pmc->started = true;
> > +		} else {
> > +			kvm_debug("Can not start counter due to invalid confiugartion\n");
>                                    ^ Cannot                             ^ configuration
> 
> > +			sbiret = SBI_ERR_INVALID_PARAM;
> > +		}
> > +	}
> 
> Possibly a spec oversight is that we continue to try and start counters,
> even when we've seen errors. The problem with implementing that is that
> if we have both errors we only return the last one. I.e. one counter
> was already started and another counter resulted in invalid-param, we
> only return invalid-param. We also don't say anything about the number
> of failures / successes. I think we should bail on the first error and
> even stop counters that we started. Callers can then try again after
> correcting their input without potentially getting already-started errors.
> We'd need to change the spec to do that though.
>

Thinking about this some more, the spec doesn't prohibit implementations
from bailing on the first error, so we can do that. But maybe we don't
need to stop the counters we started. We can leave it to the driver to
sort out what got configured/started and what didn't when it gets an
error.
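
For illustration, the start loop could then look roughly like this
(a sketch, not a proposed hunk):

  for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
          pmc_index = i + ctr_base;
          if (!test_bit(pmc_index, kvpmu->pmc_in_use))
                  continue;
          pmc = &kvpmu->pmc[pmc_index];
          if (!pmc->perf_event) {
                  sbiret = SBI_ERR_INVALID_PARAM;
                  break;          /* bail on the first error */
          }
          if (unlikely(pmc->started)) {
                  sbiret = SBI_ERR_ALREADY_STARTED;
                  break;
          }
          if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
                  pmc->counter_val = ival;
          perf_event_period(pmc->perf_event, pmu_get_sample_period(pmc));
          perf_event_enable(pmc->perf_event);
          pmc->started = true;
  }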

Thanks,
drew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count
  2023-01-13  7:22       ` Andrew Jones
@ 2023-01-24 20:41         ` Atish Patra
  0 siblings, 0 replies; 37+ messages in thread
From: Atish Patra @ 2023-01-24 20:41 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Atish Kumar Patra, linux-kernel, Anup Patel, Guo Ren, kvm-riscv,
	kvm, linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Thu, Jan 12, 2023 at 11:22 PM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Jan 12, 2023 at 10:18:05AM -0800, Atish Kumar Patra wrote:
> > On Thu, Jan 12, 2023 at 2:06 AM Andrew Jones <ajones@ventanamicro.com> wrote:
> > >
> > > On Thu, Dec 15, 2022 at 09:00:36AM -0800, Atish Patra wrote:
> ...
> > > > +EXPORT_SYMBOL(riscv_pmu_get_hpm_info);
> > >
> > > EXPORT_SYMBOL_GPL ?
> > >
> >
> > Is that mandatory ? I have seen usage of both in arch/riscv and other
> > places though.
> > I am also not sure if any other non-GPL module should/need access to this.
>
> TBH, I'm not sure what the best policy is, but I presumed we should use
> _GPL when we aren't aware of anything non-GPL and then when a day comes
> that something non-GPL would like this to be exported, the patch that
> flips it will provide the justification in its commit message.
>

Sgtm. Changed it to EXPORT_SYMBOL_GPL for now.

> Thanks,
> drew



-- 
Regards,
Atish

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling
  2023-01-13 11:45   ` Andrew Jones
  2023-01-23  7:23     ` Andrew Jones
@ 2023-01-26  0:50     ` Atish Patra
  1 sibling, 0 replies; 37+ messages in thread
From: Atish Patra @ 2023-01-26  0:50 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Atish Patra, linux-kernel, Anup Patel, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Fri, Jan 13, 2023 at 3:45 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:45AM -0800, Atish Patra wrote:
> > RISC-V SBI PMU & Sscofpmf ISA extension allows supporting perf in
> > the virtualization enviornment as well. KVM implementation
> > relies on SBI PMU extension for most the most part while trapping
>
> s/most the most/the most/
>

Done.

> > & emulating the CSRs read for counter access.
> >
> > This patch doesn't have the event sampling support yet.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/kvm/vcpu_pmu.c | 358 ++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 342 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> > index 53c4163..21c1f0f 100644
> > --- a/arch/riscv/kvm/vcpu_pmu.c
> > +++ b/arch/riscv/kvm/vcpu_pmu.c
> > @@ -12,10 +12,163 @@
> >  #include <linux/perf/riscv_pmu.h>
> >  #include <asm/csr.h>
> >  #include <asm/kvm_vcpu_sbi.h>
> > +#include <asm/bitops.h>
> >  #include <asm/kvm_vcpu_pmu.h>
> >  #include <linux/kvm_host.h>
> >
> >  #define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> > +#define get_event_type(x) ((x & SBI_PMU_EVENT_IDX_TYPE_MASK) >> 16)
> > +#define get_event_code(x) (x & SBI_PMU_EVENT_IDX_CODE_MASK)
>
> Should put () around x
>

Sure,

> > +
> > +static inline u64 pmu_get_sample_period(struct kvm_pmc *pmc)
>
> I'd rather we use kvm_pmu_ for the prefix instead of just pmu for
> this and all the other functions.
>
> Also I'd drop the inline attribute here and in other functions below to
> let the compiler completely decide what to do.
>

Done.

> > +{
> > +     u64 counter_val_mask = GENMASK(pmc->cinfo.width, 0);
> > +     u64 sample_period;
> > +
> > +     if (!pmc->counter_val)
> > +             sample_period = counter_val_mask;
> > +     else
> > +             sample_period = (-pmc->counter_val) & counter_val_mask;
>
> I probably don't understand this, since I see arm64/kvm doing the same
> thing, but if sample_period is the number of remaining counts a counter
> has, then I'd write it as

That's correct.

>
>   sample_period = counter_val_mask - (pmc->counter_val & counter_val_mask);
>
> As it stands, the pmc->counter_val == 0 case would be the same, but
> the other case differs by 1. arm64/kvm doesn't appear to handle the
> zero case.
>

We do have to account for the last increment as well, i.e. the one that
actually overflows the counter. Thus, if we rewrite it your way it should be

sample_period = counter_val_mask - (pmc->counter_val & counter_val_mask) + 1;

For the zero case, I got it wrong. It should be

counter_val_mask + 1

If a user in a guest sets the sampling period to 0 (even if it is
impractical to do that), that's the user's choice. KVM should just create
the perf event accordingly. I cross-checked the x86 implementation; it does
the same. All the counters are 64 bits wide on ARM64, which is probably why
ARM64 didn't have to deal with the special 0 case.

Thanks for catching this case.
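
For reference, a sketch of the helper with both of those folded in (using
the kvm_pmu_ prefix from the rename above; a sketch only, not the merged
code):

  static u64 kvm_pmu_get_sample_period(struct kvm_pmc *pmc)
  {
          u64 counter_val_mask = GENMASK(pmc->cinfo.width, 0);

          /*
           * The sample period is the number of counts remaining until the
           * counter overflows, including the overflowing increment itself,
           * so a zero start value maps to a full period.
           */
          if (!pmc->counter_val)
                  return counter_val_mask + 1;

          return counter_val_mask - (pmc->counter_val & counter_val_mask) + 1;
  }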

> > +
> > +     return sample_period;
> > +}
> > +
> > +static u32 pmu_get_perf_event_type(unsigned long eidx)
> > +{
> > +     enum sbi_pmu_event_type etype = get_event_type(eidx);
> > +     u32 type;
> > +
> > +     if (etype == SBI_PMU_EVENT_TYPE_HW)
> > +             type = PERF_TYPE_HARDWARE;
> > +     else if (etype == SBI_PMU_EVENT_TYPE_CACHE)
> > +             type = PERF_TYPE_HW_CACHE;
> > +     else if (etype == SBI_PMU_EVENT_TYPE_RAW || etype == SBI_PMU_EVENT_TYPE_FW)
> > +             type = PERF_TYPE_RAW;
> > +     else
> > +             type = PERF_TYPE_MAX;
>
> nit: This might look nicer as a switch
>
> > +
> > +     return type;
> > +}
> > +
> > +static inline bool pmu_is_fw_event(unsigned long eidx)
> > +{
> > +
>
> extra blank line here
>
> > +     return get_event_type(eidx) == SBI_PMU_EVENT_TYPE_FW;
> > +}
> > +
> > +static void pmu_release_perf_event(struct kvm_pmc *pmc)
> > +{
> > +     if (pmc->perf_event) {
> > +             perf_event_disable(pmc->perf_event);
> > +             perf_event_release_kernel(pmc->perf_event);
> > +             pmc->perf_event = NULL;
> > +     }
> > +}
> > +
> > +static u64 pmu_get_perf_event_hw_config(u32 sbi_event_code)
> > +{
> > +     /* SBI PMU HW event code is offset by 1 from perf hw event codes */
> > +     return (u64)sbi_event_code - 1;
>
> This is probably fine to do since we're mapping specified codes to UAPI
> codes, so it's unlikely something will change this relationship, but if we
> expose the mapping explicitly it'll give code readers a chance to see
> what's what without looking stuff up elsewhere. How about creating an
> array indexed by sbi_event_code with the mapping?
>
>   enum perf_hw_id hw_event_perf_map[SBI_PMU_HW_GENERAL_MAX] = {
>   [SBI_PMU_HW_CPU_CYCLES] = PERF_COUNT_HW_CPU_CYCLES,
>   ...
>   };
>

Done.
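
FWIW, filled out with the generic event IDs the map would look something
like this (a sketch; the exact table in the next revision may differ):

  static const enum perf_hw_id hw_event_perf_map[SBI_PMU_HW_GENERAL_MAX] = {
          /* index 0 (SBI_PMU_HW_NO_EVENT) stays zero; callers filter it out */
          [SBI_PMU_HW_CPU_CYCLES]              = PERF_COUNT_HW_CPU_CYCLES,
          [SBI_PMU_HW_INSTRUCTIONS]            = PERF_COUNT_HW_INSTRUCTIONS,
          [SBI_PMU_HW_CACHE_REFERENCES]        = PERF_COUNT_HW_CACHE_REFERENCES,
          [SBI_PMU_HW_CACHE_MISSES]            = PERF_COUNT_HW_CACHE_MISSES,
          [SBI_PMU_HW_BRANCH_INSTRUCTIONS]     = PERF_COUNT_HW_BRANCH_INSTRUCTIONS,
          [SBI_PMU_HW_BRANCH_MISSES]           = PERF_COUNT_HW_BRANCH_MISSES,
          [SBI_PMU_HW_BUS_CYCLES]              = PERF_COUNT_HW_BUS_CYCLES,
          [SBI_PMU_HW_STALLED_CYCLES_FRONTEND] = PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
          [SBI_PMU_HW_STALLED_CYCLES_BACKEND]  = PERF_COUNT_HW_STALLED_CYCLES_BACKEND,
          [SBI_PMU_HW_REF_CPU_CYCLES]          = PERF_COUNT_HW_REF_CPU_CYCLES,
  };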

> > +}
> > +
> > +static u64 pmu_get_perf_event_cache_config(u32 sbi_event_code)
> > +{
> > +     u64 config = U64_MAX;
> > +     unsigned int cache_type, cache_op, cache_result;
> > +
> > +     /* All the cache event masks lie within 0xFF. No separate masking is necesssary */
> > +     cache_type = (sbi_event_code & SBI_PMU_EVENT_CACHE_ID_CODE_MASK) >> 3;
>
> It'd be nice to have a define for that 3 shift.
>
> > +     cache_op = (sbi_event_code & SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK) >> 1;
>
> Also for this 1 shift.
>

Done.

> > +     cache_result = sbi_event_code & SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK;
> > +
> > +     if (cache_type >= PERF_COUNT_HW_CACHE_MAX ||
> > +         cache_op >= PERF_COUNT_HW_CACHE_OP_MAX ||
> > +         cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
> > +             return config;
> > +
> > +     config = cache_type | (cache_op << 8) | (cache_result << 16);
> > +
> > +     return config;
> > +}
> > +
> > +static u64 pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data)
> > +{
> > +     enum sbi_pmu_event_type etype = get_event_type(eidx);
> > +     u32 ecode = get_event_code(eidx);
> > +     u64 config = U64_MAX;
> > +
> > +     if (etype == SBI_PMU_EVENT_TYPE_HW)
> > +             config = pmu_get_perf_event_hw_config(ecode);
> > +     else if (etype == SBI_PMU_EVENT_TYPE_CACHE)
> > +             config = pmu_get_perf_event_cache_config(ecode);
> > +     else if (etype == SBI_PMU_EVENT_TYPE_RAW)
> > +             config = evt_data & RISCV_PMU_RAW_EVENT_MASK;
> > +     else if ((etype == SBI_PMU_EVENT_TYPE_FW) && (ecode < SBI_PMU_FW_MAX))
> > +             config = (1ULL << 63) | ecode;
>
> nit: switch?
>
> > +
> > +     return config;
> > +}
> > +
> > +static int pmu_get_fixed_pmc_index(unsigned long eidx)
> > +{
> > +     u32 etype = pmu_get_perf_event_type(eidx);
> > +     u32 ecode = get_event_code(eidx);
> > +     int ctr_idx;
> > +
> > +     if (etype != SBI_PMU_EVENT_TYPE_HW)
> > +             return -EINVAL;
> > +
> > +     if (ecode == SBI_PMU_HW_CPU_CYCLES)
> > +             ctr_idx = 0;
> > +     else if (ecode == SBI_PMU_HW_INSTRUCTIONS)
> > +             ctr_idx = 2;
>
> nit: Could drop the ctr_idx variable and return directly
>
> > +     else
> > +             return -EINVAL;
> > +
> > +     return ctr_idx;
> > +}
> > +
> > +static int pmu_get_programmable_pmc_index(struct kvm_pmu *kvpmu, unsigned long eidx,
> > +                                       unsigned long cbase, unsigned long cmask)
> > +{
> > +     int ctr_idx = -1;
> > +     int i, pmc_idx;
> > +     int min, max;
> > +
> > +     if (pmu_is_fw_event(eidx)) {
> > +             /* Firmware counters are mapped 1:1 starting from num_hw_ctrs for simplicity */
> > +             min = kvpmu->num_hw_ctrs;
> > +             max = min + kvpmu->num_fw_ctrs;
> > +     } else {
> > +             /* First 3 counters are reserved for fixed counters */
> > +             min = 3;
> > +             max = kvpmu->num_hw_ctrs;
> > +     }
> > +
> > +     for_each_set_bit(i, &cmask, BITS_PER_LONG) {
> > +             pmc_idx = i + cbase;
> > +             if ((pmc_idx >= min && pmc_idx < max) &&
> > +                 !test_bit(pmc_idx, kvpmu->pmc_in_use)) {
> > +                     ctr_idx = pmc_idx;
> > +                     break;
>
> nit: Could drop the ctr_idx variable and return directly
>
> > +             }
> > +     }
> > +
> > +     return ctr_idx;
> > +}
> > +
> > +static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
> > +                          unsigned long cbase, unsigned long cmask)
> > +{
> > +     int ret;
> > +
> > +     /* Fixed counters need to be have fixed mapping as they have different width */
> > +     ret = pmu_get_fixed_pmc_index(eidx);
> > +     if (ret >= 0)
> > +             return ret;
> > +
> > +     return pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
> > +}
> >
> >  static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> >                       unsigned long *out_val)
> > @@ -82,7 +235,41 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >                                unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> >                                struct kvm_vcpu_sbi_ext_data *edata)
> >  {
> > -     /* TODO */
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     int i, num_ctrs, pmc_index, sbiret = 0;
> > +     struct kvm_pmc *pmc;
> > +
> > +     num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
>
> We can put kvm_pmu_num_counters(kvpmu) in its one use below and
> drop num_ctrs.
>
> > +     if (ctr_base + __fls(ctr_mask) >= num_ctrs) {
>
> __fls is undefined when ctr_mask is zero, so we should check it first,
> unless the caller of this function is required to check it.
>
> (Same two comments for the functions below that have the same check.)
>

Moved these checks to a common counter validate function that can be called
from start/stop/config.
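
Something along these lines (the helper name and return convention here are
placeholders, not the final code):

  static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu,
                                           unsigned long ctr_base,
                                           unsigned long ctr_mask)
  {
          /* __fls() is undefined for a zero mask, so reject that up front */
          if (!ctr_mask)
                  return -EINVAL;

          if (ctr_base + __fls(ctr_mask) >= kvm_pmu_num_counters(kvpmu))
                  return -EINVAL;

          return 0;
  }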

> > +             sbiret = SBI_ERR_INVALID_PARAM;
> > +             goto out;
> > +     }
> > +
> > +     /* Start the counters that have been configured and requested by the guest */
> > +     for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> > +             pmc_index = i + ctr_base;
> > +             if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> > +                     continue;
> > +             pmc = &kvpmu->pmc[pmc_index];
> > +             if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> > +                     pmc->counter_val = ival;
> > +             if (pmc->perf_event) {
> > +                     if (unlikely(pmc->started)) {
> > +                             sbiret = SBI_ERR_ALREADY_STARTED;
> > +                             continue;
> > +                     }
> > +                     perf_event_period(pmc->perf_event, pmu_get_sample_period(pmc));
> > +                     perf_event_enable(pmc->perf_event);
> > +                     pmc->started = true;
> > +             } else {
> > +                     kvm_debug("Can not start counter due to invalid confiugartion\n");
>                                    ^ Cannot                             ^ configuration
>
> > +                     sbiret = SBI_ERR_INVALID_PARAM;
> > +             }
> > +     }
>
> Possibly a spec oversight is that we continue to try and start counters,
> even when we've seen errors. The problem with implementing that is that
> if we have both errors we only return the last one. I.e. one counter
> was already started and another counter resulted in invalid-param, we
> only return invalid-param. We also don't say anything about the number
> of failures / successes. I think we should bail on the first error and
> even stop counters that we started. Callers can then try again after
> correcting their input without potentially getting already-started errors.
> We'd need to change the spec to do that though.
>

The idea was to provide a best-effort service for batching use cases. As
long as the caller has a valid counter, the SBI implementation should try
to start that counter. For the invalid counters, it notifies the caller
that one or more counters in the requested set are invalid. It's up to the
caller to decide whether it actually cares about the error.

E.g. the caller may just want to stop all the counters without tracking
which counters are actually configured/started.

As per the spec:
SBI_ERR_INVALID_PARAM: the set of counters has at least one invalid counter.

SBI_ERR_ALREADY_STARTED: the set of counters includes at least one counter
which is already started.


> > +
> > +out:
> > +     edata->err_val = sbiret;
> > +
> >       return 0;
> >  }
> >
> > @@ -90,16 +277,142 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >                               unsigned long ctr_mask, unsigned long flag,
> >                               struct kvm_vcpu_sbi_ext_data *edata)
> >  {
> > -     /* TODO */
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     int i, num_ctrs, pmc_index, sbiret = 0;
> > +     u64 enabled, running;
> > +     struct kvm_pmc *pmc;
> > +
> > +     num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> > +     if ((ctr_base + __fls(ctr_mask)) >= num_ctrs) {
>
> nit: Can drop ()
>
> > +             sbiret = SBI_ERR_INVALID_PARAM;
> > +             goto out;
> > +     }
> > +
> > +     /* Stop the counters that have been configured and requested by the guest */
> > +     for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> > +             pmc_index = i + ctr_base;
> > +             if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> > +                     continue;
> > +             pmc = &kvpmu->pmc[pmc_index];
> > +             if (pmc->perf_event) {
> > +                     if (pmc->started) {
> > +                             /* Stop counting the counter */
> > +                             perf_event_disable(pmc->perf_event);
> > +                             pmc->started = false;
> > +                     } else
> > +                             sbiret = SBI_ERR_ALREADY_STOPPED;
> > +
> > +                     if (flag & SBI_PMU_STOP_FLAG_RESET) {
> > +                             /* Relase the counter if this is a reset request */
> > +                             pmc->counter_val += perf_event_read_value(pmc->perf_event,
> > +                                                                       &enabled, &running);
> > +                             pmu_release_perf_event(pmc);
> > +                             clear_bit(pmc_index, kvpmu->pmc_in_use);
> > +                     }
> > +             } else {
> > +                     kvm_debug("Can not stop counter due to invalid confiugartion\n");
>                                    ^Cannot                             ^ configuration
>
> > +                     sbiret = SBI_ERR_INVALID_PARAM;
> > +             }
> > +     }
>
> Same comment about the multiple errors problem.
>
> > +
> > +out:
> > +     edata->err_val = sbiret;
> > +
> >       return 0;
> >  }
> >
> >  int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >                                    unsigned long ctr_mask, unsigned long flag,
> > -                                  unsigned long eidx, uint64_t edata,
> > -                                  struct kvm_vcpu_sbi_ext_data *extdata)
> > +                                  unsigned long eidx, uint64_t evt_data,
> > +                                  struct kvm_vcpu_sbi_ext_data *ext_data)
>
> This should be part of the skeleton patch.
>
> >  {
> > -     /* TODO */
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     struct perf_event *event;
> > +     struct perf_event_attr attr;
> > +     int num_ctrs, ctr_idx;
> > +     u32 etype = pmu_get_perf_event_type(eidx);
> > +     u64 config;
> > +     struct kvm_pmc *pmc;
> > +     int sbiret = 0;
> > +
> > +
> > +     num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> > +     if (etype == PERF_TYPE_MAX || (ctr_base + __fls(ctr_mask) >= num_ctrs)) {
> > +             sbiret = SBI_ERR_INVALID_PARAM;
> > +             goto out;
> > +     }
> > +
> > +     if (pmu_is_fw_event(eidx)) {
> > +             sbiret = SBI_ERR_NOT_SUPPORTED;
> > +             goto out;
> > +     }
> > +
> > +     /*
> > +      * SKIP_MATCH flag indicates the caller is aware of the assigned counter
> > +      * for this event. Just do a sanity check if it already marked used.
> > +      */
> > +     if (flag & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
> > +             if (!test_bit(ctr_base, kvpmu->pmc_in_use)) {
> > +                     sbiret = SBI_ERR_FAILURE;
> > +                     goto out;
>
> I see this is the same way OpenSBI implements this, but I don't really
> understand it. The spec says
>
> """
> NOTE: When SBI_PMU_CFG_FLAG_SKIP_MATCH is set in config_flags, the SBI
> implementation will unconditionally select the first counter from the set
> of counters specified by the counter_idx_base and counter_idx_mask.
> """
>
> which doesn't say anything about the caller just wanting to confirm the
> counter is in use. Indeed, the spec says this function is for
>
> """
> Find and configure a counter from a set of counters which is not started
> (or enabled) and can monitor the specified event.
> """
>
> which would imply errors should be returned anytime we match something
> already started, rather than the opposite.
>

If the caller requested a specific counter and that counter is already
configured, the SBI implementation cannot assign another one. In that case,
all it can do is return an error.

> Also, I think we should be using ctr_base + first-set-bit(ctr_mask),
> because the spec says to look at both the base and the mask, and take
> the first counter.
>

Ahh yes. Thanks for catching. Fixed.

> Finally, the spec doesn't have SBI_ERR_FAILURE listed as a possible
> error.
>

It should be SBI_ERR_INVALID_PARAM. Fixed.

> > +             }
> > +             ctr_idx = ctr_base;
> > +             goto match_done;
>
> nit: We don't need the match_done label and goto. We can just use
> an if-else.
>
> > +     }
> > +
> > +     ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
> > +     if (ctr_idx < 0) {
> > +             sbiret = SBI_ERR_NOT_SUPPORTED;
> > +             goto out;
> > +     }
> > +
> > +match_done:
> > +     pmc = &kvpmu->pmc[ctr_idx];
> > +     pmu_release_perf_event(pmc);
> > +     pmc->idx = ctr_idx;
> > +
> > +     config = pmu_get_perf_event_config(eidx, evt_data);
> > +     memset(&attr, 0, sizeof(struct perf_event_attr));
>
> At the top of the function we can use
>
>         struct perf_event_attr attr = {
>           .type = etype,
>           .size = sizeof(struct perf_event_attr),
>
>           /* anything else we know at the top */
>
>         };
>
> instead of the memset and some of the assignments here.
>

Done.
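
i.e. roughly the following, with .config and .sample_period still assigned
later once they are computed (a sketch of the shape, not the final hunk):

  struct perf_event_attr attr = {
          .type = etype,
          .size = sizeof(struct perf_event_attr),
          .pinned = true,
          .exclude_host = true,
          .exclude_hv = true,
          .exclude_user = !!(flag & SBI_PMU_CFG_FLAG_SET_UINH),
          .exclude_kernel = !!(flag & SBI_PMU_CFG_FLAG_SET_SINH),
          .config1 = RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS,
  };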

> > +     attr.type = etype;
> > +     attr.size = sizeof(attr);
> > +     attr.pinned = true;
> > +
> > +     /*
> > +      * It should never reach here if the platform doesn't support sscofpmf extensio
>                                                                      ^ the     ^ extension
>
> > +      * as mode filtering won't work without it.
> > +      */
> > +     attr.exclude_host = true;
> > +     attr.exclude_hv = true;
> > +     attr.exclude_user = !!(flag & SBI_PMU_CFG_FLAG_SET_UINH);
> > +     attr.exclude_kernel = !!(flag & SBI_PMU_CFG_FLAG_SET_SINH);
> > +     attr.config = config;
> > +     attr.config1 = RISCV_KVM_PMU_CONFIG1_GUEST_EVENTS;
> > +     if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
> > +             //TODO: Do we really want to clear the value in hardware counter
> > +             pmc->counter_val = 0;
> > +     }
> > +
> > +     /*
> > +      * Set the default sample_period for now. The guest specified value
> > +      * will be updated in the start call.
> > +      */
> > +     attr.sample_period = pmu_get_sample_period(pmc);
> > +
> > +     event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
> > +     if (IS_ERR(event)) {
> > +             pr_err("kvm pmu event creation failed event %pe for eidx %lx\n", event, eidx);
> > +             return -EOPNOTSUPP;
>
> event is an error, so we shouldn't call it an event in the error message.
> How about
>
>   pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
>
> and then PTR_ERR(event) instead of -EOPNOTSUPP.
>

Done.

> > +     }
> > +
> > +     set_bit(ctr_idx, kvpmu->pmc_in_use);
> > +     pmc->perf_event = event;
> > +     if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> > +             perf_event_enable(pmc->perf_event);
> > +
> > +     ext_data->out_val = ctr_idx;
> > +out:
> > +     ext_data->err_val = sbiret;
> > +
> >       return 0;
> >  }
> >
> > @@ -119,6 +432,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >  {
> >       int i = 0, num_fw_ctrs, ret, num_hw_ctrs = 0, hpm_width = 0;
> >       struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     struct kvm_pmc *pmc;
> >
> >       ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> >       if (ret < 0)
> > @@ -134,6 +448,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >       else
> >               num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
> >
> > +     bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
>
> Could move this bitmap clearing to deinit as it should be clear the
> first time already since vcpus are allocated with __GFP_ZERO.
>
> >       kvpmu->num_hw_ctrs = num_hw_ctrs;
> >       kvpmu->num_fw_ctrs = num_fw_ctrs;
> >       /*
> > @@ -146,24 +461,26 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >               /* TIME CSR shouldn't be read from perf interface */
> >               if (i == 1)
> >                       continue;
> > -             kvpmu->pmc[i].idx = i;
> > +             pmc = &kvpmu->pmc[i];
> > +             pmc->idx = i;
> > +             pmc->counter_val = 0;
>
> Also could be moved to deinit.
>
> >               if (i < kvpmu->num_hw_ctrs) {
> >                       kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
> >                       if (i < 3)
> >                               /* CY, IR counters */
> > -                             kvpmu->pmc[i].cinfo.width = 63;
> > +                             pmc->cinfo.width = 63;
> >                       else
> > -                             kvpmu->pmc[i].cinfo.width = hpm_width;
> > +                             pmc->cinfo.width = hpm_width;
> >                       /*
> >                        * The CSR number doesn't have any relation with the logical
> >                        * hardware counters. The CSR numbers are encoded sequentially
> >                        * to avoid maintaining a map between the virtual counter
> >                        * and CSR number.
> >                        */
> > -                     kvpmu->pmc[i].cinfo.csr = CSR_CYCLE + i;
> > +                     pmc->cinfo.csr = CSR_CYCLE + i;
> >               } else {
> > -                     kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_FW;
> > -                     kvpmu->pmc[i].cinfo.width = BITS_PER_LONG - 1;
> > +                     pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
> > +                     pmc->cinfo.width = BITS_PER_LONG - 1;
>
> Almost all of these changes can be avoided by using the pmc pointer in the
> skeleton patch from the start.
>

Fixed.

> >               }
> >       }
> >
> > @@ -172,13 +489,22 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >       return 0;
> >  }
> >
> > -void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> >  {
> > -     /* TODO */
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     struct kvm_pmc *pmc;
> > +     int i;
> > +
> > +     if (!kvpmu)
> > +             return;
> > +
> > +     for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
> > +             pmc = &kvpmu->pmc[i];
> > +             pmu_release_perf_event(pmc);
> > +     }
> >  }
> >
> > -void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> >  {
> > -     /* TODO */
> > +     kvm_riscv_vcpu_pmu_deinit(vcpu);
>
> The skeleton patch could put deinit and reset in the right order to avoid
> this change.
>

Done. I have addressed all the nit comments as well.
Thanks for the detailed review.

> >  }
> > -
> > --
> > 2.25.1
> >
>
> Thanks,
> drew



-- 
Regards,
Atish

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/11] RISC-V: KVM: Implement firmware events
  2023-01-13 12:08   ` Andrew Jones
@ 2023-01-26  3:08     ` Atish Patra
  0 siblings, 0 replies; 37+ messages in thread
From: Atish Patra @ 2023-01-26  3:08 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Atish Patra, linux-kernel, Anup Patel, Guo Ren, kvm-riscv, kvm,
	linux-riscv, Mark Rutland, Palmer Dabbelt, Paul Walmsley,
	Sergey Matyukevich, Eric Lin, Will Deacon

On Fri, Jan 13, 2023 at 4:08 AM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Thu, Dec 15, 2022 at 09:00:46AM -0800, Atish Patra wrote:
> > SBI PMU extension defines a set of firmware events which can provide
> > useful information to guests about number of SBI calls. As hypervisor
> > implements the SBI PMU extension, these firmware events corresponds
> > to ecall invocations between VS->HS mode. All other firmware events
> > will always report zero if monitored as KVM doesn't implement them.
> >
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  arch/riscv/include/asm/kvm_vcpu_pmu.h |  16 ++++
> >  arch/riscv/include/asm/sbi.h          |   2 +-
> >  arch/riscv/kvm/tlb.c                  |   6 +-
> >  arch/riscv/kvm/vcpu_pmu.c             | 105 ++++++++++++++++++++++----
> >  arch/riscv/kvm/vcpu_sbi_replace.c     |   7 ++
> >  5 files changed, 119 insertions(+), 17 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > index 7a9a8e6..cccc6182 100644
> > --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > @@ -17,6 +17,14 @@
> >  #define RISCV_KVM_MAX_FW_CTRS 32
> >  #define RISCV_MAX_COUNTERS      64
> >
> > +struct kvm_fw_event {
> > +     /* Current value of the event */
> > +     unsigned long value;
> > +
> > +     /* Event monitoring status */
> > +     bool started;
> > +};
> > +
> >  /* Per virtual pmu counter data */
> >  struct kvm_pmc {
> >       u8 idx;
> > @@ -25,11 +33,14 @@ struct kvm_pmc {
> >       union sbi_pmu_ctr_info cinfo;
> >       /* Event monitoring status */
> >       bool started;
> > +     /* Monitoring event ID */
> > +     unsigned long event_idx;
> >  };
> >
> >  /* PMU data structure per vcpu */
> >  struct kvm_pmu {
> >       struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> > +     struct kvm_fw_event fw_event[RISCV_KVM_MAX_FW_CTRS];
> >       /* Number of the virtual firmware counters available */
> >       int num_fw_ctrs;
> >       /* Number of the virtual hardware counters available */
> > @@ -52,6 +63,7 @@ struct kvm_pmu {
> >  { .base = CSR_CYCLE,      .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
> >  #endif
> >
> > +int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid);
> >  int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
> >                               unsigned long *val, unsigned long new_val,
> >                               unsigned long wr_mask);
> > @@ -81,6 +93,10 @@ struct kvm_pmu {
> >  #define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
> >  { .base = 0,      .count = 0, .func = NULL },
> >
> > +static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
> > +{
> > +     return 0;
> > +}
> >
> >  static inline int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >  {
> > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> > index 2a0ef738..a192a95a 100644
> > --- a/arch/riscv/include/asm/sbi.h
> > +++ b/arch/riscv/include/asm/sbi.h
> > @@ -171,7 +171,7 @@ enum sbi_pmu_fw_generic_events_t {
> >       SBI_PMU_FW_IPI_SENT             = 6,
> >       SBI_PMU_FW_IPI_RECVD            = 7,
> >       SBI_PMU_FW_FENCE_I_SENT         = 8,
> > -     SBI_PMU_FW_FENCE_I_RECVD        = 9,
> > +     SBI_PMU_FW_FENCE_I_RCVD         = 9,
>
> This should probably be in its own patch.

done.

>
> >       SBI_PMU_FW_SFENCE_VMA_SENT      = 10,
> >       SBI_PMU_FW_SFENCE_VMA_RCVD      = 11,
> >       SBI_PMU_FW_SFENCE_VMA_ASID_SENT = 12,
> > diff --git a/arch/riscv/kvm/tlb.c b/arch/riscv/kvm/tlb.c
> > index 309d79b..de81920 100644
> > --- a/arch/riscv/kvm/tlb.c
> > +++ b/arch/riscv/kvm/tlb.c
> > @@ -181,6 +181,7 @@ void kvm_riscv_local_tlb_sanitize(struct kvm_vcpu *vcpu)
> >
> >  void kvm_riscv_fence_i_process(struct kvm_vcpu *vcpu)
> >  {
> > +     kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_RCVD);
> >       local_flush_icache_all();
> >  }
> >
> > @@ -264,15 +265,18 @@ void kvm_riscv_hfence_process(struct kvm_vcpu *vcpu)
> >                                               d.addr, d.size, d.order);
> >                       break;
> >               case KVM_RISCV_HFENCE_VVMA_ASID_GVA:
> > +                     kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
> >                       kvm_riscv_local_hfence_vvma_asid_gva(
> >                                               READ_ONCE(v->vmid), d.asid,
> >                                               d.addr, d.size, d.order);
> >                       break;
> >               case KVM_RISCV_HFENCE_VVMA_ASID_ALL:
> > +                     kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
> >                       kvm_riscv_local_hfence_vvma_asid_all(
> >                                               READ_ONCE(v->vmid), d.asid);
> >                       break;
> >               case KVM_RISCV_HFENCE_VVMA_GVA:
> > +                     kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_RCVD);
> >                       kvm_riscv_local_hfence_vvma_gva(
> >                                               READ_ONCE(v->vmid),
> >                                               d.addr, d.size, d.order);
> > @@ -323,7 +327,7 @@ void kvm_riscv_fence_i(struct kvm *kvm,
> >                      unsigned long hbase, unsigned long hmask)
> >  {
> >       make_xfence_request(kvm, hbase, hmask, KVM_REQ_FENCE_I,
> > -                         KVM_REQ_FENCE_I, NULL);
> > +                 KVM_REQ_FENCE_I, NULL);
>
> stray change, and whitespace was correct before
>

Fixed.

> >  }
> >
> >  void kvm_riscv_hfence_gvma_vmid_gpa(struct kvm *kvm,
> > diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> > index 21c1f0f..a64a7ae 100644
> > --- a/arch/riscv/kvm/vcpu_pmu.c
> > +++ b/arch/riscv/kvm/vcpu_pmu.c
> > @@ -170,18 +170,36 @@ static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
> >       return pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
> >  }
> >
> > +int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
> > +{
> > +     struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +     struct kvm_fw_event *fevent;
> > +
> > +     if (!kvpmu || fid >= SBI_PMU_FW_MAX)
> > +             return -EINVAL;
> > +
> > +     fevent = &kvpmu->fw_event[fid];
> > +     if (fevent->started)
> > +             fevent->value++;
> > +
> > +     return 0;
> > +}
> > +
> >  static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> >                       unsigned long *out_val)
> >  {
> >       struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> >       struct kvm_pmc *pmc;
> >       u64 enabled, running;
> > +     int fevent_code;
> >
> >       pmc = &kvpmu->pmc[cidx];
> > -     if (!pmc->perf_event)
> > -             return -EINVAL;
> >
> > -     pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
> > +     if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> > +             fevent_code = get_event_code(pmc->event_idx);
> > +             pmc->counter_val = kvpmu->fw_event[fevent_code].value;
> > +     } else if (pmc->perf_event)
> > +             pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
> >       *out_val = pmc->counter_val;
> >
> >       return 0;
> > @@ -238,6 +256,7 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >       struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> >       int i, num_ctrs, pmc_index, sbiret = 0;
> >       struct kvm_pmc *pmc;
> > +     int fevent_code;
> >
> >       num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> >       if (ctr_base + __fls(ctr_mask) >= num_ctrs) {
> > @@ -253,7 +272,22 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >               pmc = &kvpmu->pmc[pmc_index];
> >               if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> >                       pmc->counter_val = ival;
> > -             if (pmc->perf_event) {
> > +             if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> > +                     fevent_code = get_event_code(pmc->event_idx);
> > +                     if (fevent_code >= SBI_PMU_FW_MAX) {
> > +                             sbiret = SBI_ERR_INVALID_PARAM;
> > +                             goto out;
> > +                     }
> > +
> > +                     /* Check if the counter was already started for some reason */
> > +                     if (kvpmu->fw_event[fevent_code].started) {
> > +                             sbiret = SBI_ERR_ALREADY_STARTED;
> > +                             continue;
> > +                     }
> > +
> > +                     kvpmu->fw_event[fevent_code].started = true;
> > +                     kvpmu->fw_event[fevent_code].value = pmc->counter_val;
> > +             } else if (pmc->perf_event) {
> >                       if (unlikely(pmc->started)) {
> >                               sbiret = SBI_ERR_ALREADY_STARTED;
> >                               continue;
> > @@ -281,6 +315,7 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >       int i, num_ctrs, pmc_index, sbiret = 0;
> >       u64 enabled, running;
> >       struct kvm_pmc *pmc;
> > +     int fevent_code;
> >
> >       num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> >       if ((ctr_base + __fls(ctr_mask)) >= num_ctrs) {
> > @@ -294,7 +329,18 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >               if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> >                       continue;
> >               pmc = &kvpmu->pmc[pmc_index];
> > -             if (pmc->perf_event) {
> > +             if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> > +                     fevent_code = get_event_code(pmc->event_idx);
> > +                     if (fevent_code >= SBI_PMU_FW_MAX) {
> > +                             sbiret = SBI_ERR_INVALID_PARAM;
> > +                             goto out;
> > +                     }
> > +
> > +                     if (!kvpmu->fw_event[fevent_code].started)
> > +                             sbiret = SBI_ERR_ALREADY_STOPPED;
> > +
> > +                     kvpmu->fw_event[fevent_code].started = false;
> > +             } else if (pmc->perf_event) {
> >                       if (pmc->started) {
> >                               /* Stop counting the counter */
> >                               perf_event_disable(pmc->perf_event);
> > @@ -307,12 +353,15 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> >                               pmc->counter_val += perf_event_read_value(pmc->perf_event,
> >                                                                         &enabled, &running);
> >                               pmu_release_perf_event(pmc);
> > -                             clear_bit(pmc_index, kvpmu->pmc_in_use);
> >                       }
> >               } else {
> >                       kvm_debug("Can not stop counter due to invalid confiugartion\n");
> >                       sbiret = SBI_ERR_INVALID_PARAM;
> >               }
> > +             if (flag & SBI_PMU_STOP_FLAG_RESET) {
> > +                     pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> > +                     clear_bit(pmc_index, kvpmu->pmc_in_use);
>
> nit: I'd probably just leave clear_bit where it was and add
>
> if (flag & SBI_PMU_STOP_FLAG_RESET)
>    pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
>
> to the firmware arm.
>

We have to do the clear_bit() for firmware counters as well. That's why I
moved it below, to avoid duplicating the same code.

> > +             }
> >       }
> >
> >  out:
> > @@ -329,12 +378,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> >       struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> >       struct perf_event *event;
> >       struct perf_event_attr attr;
> > -     int num_ctrs, ctr_idx;
> > +     int num_ctrs, ctr_idx, sbiret = 0;
> >       u32 etype = pmu_get_perf_event_type(eidx);
> >       u64 config;
> > -     struct kvm_pmc *pmc;
> > -     int sbiret = 0;
> > -
> > +     struct kvm_pmc *pmc = NULL;
> > +     bool is_fevent;
> > +     unsigned long event_code;
> >
> >       num_ctrs = kvpmu->num_fw_ctrs + kvpmu->num_hw_ctrs;
> >       if (etype == PERF_TYPE_MAX || (ctr_base + __fls(ctr_mask) >= num_ctrs)) {
> > @@ -342,7 +391,9 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> >               goto out;
> >       }
> >
> > -     if (pmu_is_fw_event(eidx)) {
> > +     event_code = get_event_code(eidx);
> > +     is_fevent = pmu_is_fw_event(eidx);
> > +     if (is_fevent && event_code >= SBI_PMU_FW_MAX) {
> >               sbiret = SBI_ERR_NOT_SUPPORTED;
> >               goto out;
> >       }
> > @@ -357,7 +408,10 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> >                       goto out;
> >               }
> >               ctr_idx = ctr_base;
> > -             goto match_done;
> > +             if (is_fevent)
> > +                     goto perf_event_done;
> > +             else
> > +                     goto match_done;
> >       }
> >
> >       ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
> > @@ -366,6 +420,13 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> >               goto out;
> >       }
> >
> > +     /*
> > +      * No need to create perf events for firmware events as the firmware counter
> > +      * is supposed to return the measurement of VS->HS mode invocations.
> > +      */
> > +     if (is_fevent)
> > +             goto perf_event_done;
> > +
> >  match_done:
> >       pmc = &kvpmu->pmc[ctr_idx];
> >       pmu_release_perf_event(pmc);
> > @@ -404,10 +465,19 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> >               return -EOPNOTSUPP;
> >       }
> >
> > -     set_bit(ctr_idx, kvpmu->pmc_in_use);
> >       pmc->perf_event = event;
> > -     if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> > -             perf_event_enable(pmc->perf_event);
>
> Maybe we can move the perf setup stuff into a helper function and
> then guard it with an if-statement rather than have the gotos?
>

Sure. Done.

> > +perf_event_done:
> > +     if (flag & SBI_PMU_CFG_FLAG_AUTO_START) {
> > +             if (is_fevent)
> > +                     kvpmu->fw_event[event_code].started = true;
> > +             else
> > +                     perf_event_enable(pmc->perf_event);
> > +     }
> > +     /* This should be only true for firmware events */
> > +     if (!pmc)
> > +             pmc = &kvpmu->pmc[ctr_idx];
> > +     pmc->event_idx = eidx;
> > +     set_bit(ctr_idx, kvpmu->pmc_in_use);
> >
> >       ext_data->out_val = ctr_idx;
> >  out:
> > @@ -451,6 +521,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >       bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
> >       kvpmu->num_hw_ctrs = num_hw_ctrs;
> >       kvpmu->num_fw_ctrs = num_fw_ctrs;
> > +     memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
>
> I'm wondering if we need this array. We already have an underused pmc for
> the fw events which has counter_val and started. Can't we just use those?
>

My initial implementation did the same. However, the pmc array is indexed
by counter while fw_event is indexed by firmware event ID. We would either
have to maintain a mapping between the two or iterate over the pmc array
one by one to find the correct entry in kvm_riscv_vcpu_pmu_incr_fw().

As kvm_riscv_vcpu_pmu_incr_fw() may be called from a performance-sensitive
path, a separate fw_event array gives a quick lookup.
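
For comparison, without that array the increment path would need a scan
like this (hypothetical, for illustration only):

  /* hypothetical fid -> counter lookup if there were no fw_event[] array */
  for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
          pmc = &kvpmu->pmc[i];
          if (pmc->cinfo.type != SBI_PMU_CTR_TYPE_FW ||
              get_event_code(pmc->event_idx) != fid)
                  continue;
          if (pmc->started)
                  pmc->counter_val++;
          break;
  }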

> >       /*
> >        * There is no corelation betwen the logical hardware counter and virtual counters.
> >        * However, we need to encode a hpmcounter CSR in the counter info field so that
> > @@ -464,6 +535,7 @@ int kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> >               pmc = &kvpmu->pmc[i];
> >               pmc->idx = i;
> >               pmc->counter_val = 0;
> > +             pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> >               if (i < kvpmu->num_hw_ctrs) {
> >                       kvpmu->pmc[i].cinfo.type = SBI_PMU_CTR_TYPE_HW;
> >                       if (i < 3)
> > @@ -501,7 +573,10 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> >       for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
> >               pmc = &kvpmu->pmc[i];
> >               pmu_release_perf_event(pmc);
> > +             pmc->counter_val = 0;
> > +             pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> >       }
> > +     memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
> >  }
> >
> >  void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> > diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> > index d029136..3f39711 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/kvm_host.h>
> >  #include <asm/sbi.h>
> >  #include <asm/kvm_vcpu_timer.h>
> > +#include <asm/kvm_vcpu_pmu.h>
> >  #include <asm/kvm_vcpu_sbi.h>
> >
> >  static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > @@ -26,6 +27,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >               return 0;
> >       }
> >
> > +     kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_SET_TIMER);
> >  #if __riscv_xlen == 32
> >       next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
> >  #else
> > @@ -58,6 +60,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >               return 0;
> >       }
> >
> > +     kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_IPI_SENT);
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               if (hbase != -1UL) {
> >                       if (tmp->vcpu_id < hbase)
> > @@ -68,6 +71,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >               ret = kvm_riscv_vcpu_set_interrupt(tmp, IRQ_VS_SOFT);
> >               if (ret < 0)
> >                       break;
> > +             kvm_riscv_vcpu_pmu_incr_fw(tmp, SBI_PMU_FW_IPI_RECVD);
> >       }
> >
> >       return ret;
> > @@ -91,6 +95,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >       switch (funcid) {
> >       case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> >               kvm_riscv_fence_i(vcpu->kvm, hbase, hmask);
> > +             kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_SENT);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> >               if (cp->a2 == 0 && cp->a3 == 0)
> > @@ -98,6 +103,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >               else
> >                       kvm_riscv_hfence_vvma_gva(vcpu->kvm, hbase, hmask,
> >                                                 cp->a2, cp->a3, PAGE_SHIFT);
> > +             kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_SENT);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> >               if (cp->a2 == 0 && cp->a3 == 0)
> > @@ -108,6 +114,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> >                                                      hbase, hmask,
> >                                                      cp->a2, cp->a3,
> >                                                      PAGE_SHIFT, cp->a4);
> > +             kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_SENT);
> >               break;
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> >       case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> > --
> > 2.25.1
> >
>
> It think it'd be nice to break the application of
> kvm_riscv_vcpu_pmu_incr_fw() out of this patch. I.e. introduce
> kvm_riscv_vcpu_pmu_incr_fw() in this patch and then a second patch
> applies it to all the ecalls.
>

Done.

> Thanks,
> drew



-- 
Regards,
Atish

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2023-01-26  3:08 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-15 17:00 [PATCH v2 00/11] KVM perf support Atish Patra
2022-12-15 17:00 ` [PATCH v2 01/11] RISC-V: Define helper functions expose hpm counter width and count Atish Patra
2023-01-12 10:06   ` Andrew Jones
2023-01-12 18:18     ` Atish Kumar Patra
2023-01-13  7:22       ` Andrew Jones
2023-01-24 20:41         ` Atish Patra
2022-12-15 17:00 ` [PATCH v2 02/11] RISC-V: KVM: Define a probe function for SBI extension data structures Atish Patra
2023-01-12 10:21   ` Andrew Jones
2023-01-12 18:19     ` Atish Kumar Patra
2022-12-15 17:00 ` [PATCH v2 03/11] RISC-V: KVM: Return correct code for hsm stop function Atish Patra
2023-01-12 10:28   ` Andrew Jones
2023-01-12 18:25     ` Atish Kumar Patra
2023-01-13  7:25       ` Andrew Jones
2022-12-15 17:00 ` [PATCH v2 04/11] RISC-V: KVM: Modify SBI extension handler to return SBI error code Atish Patra
2023-01-12 11:04   ` Andrew Jones
2023-01-12 18:47     ` Atish Kumar Patra
2023-01-13  7:42       ` Andrew Jones
2022-12-15 17:00 ` [PATCH v2 05/11] RISC-V: KVM: Improve privilege mode filtering for perf Atish Patra
2022-12-15 20:17   ` Conor Dooley
2022-12-15 21:10     ` Atish Kumar Patra
2022-12-15 17:00 ` [PATCH v2 06/11] RISC-V: KVM: Add skeleton support " Atish Patra
2023-01-12 15:10   ` Andrew Jones
2023-01-12 18:09     ` Atish Kumar Patra
2022-12-15 17:00 ` [PATCH v2 07/11] RISC-V: KVM: Add SBI PMU extension support Atish Patra
2023-01-12 15:29   ` Andrew Jones
2023-01-12 18:04     ` Atish Kumar Patra
2022-12-15 17:00 ` [PATCH v2 08/11] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode Atish Patra
2023-01-12 15:47   ` Andrew Jones
2022-12-15 17:00 ` [PATCH v2 09/11] RISC-V: KVM: Implement trap & emulate for hpmcounters Atish Patra
2023-01-13 11:47   ` Andrew Jones
2022-12-15 17:00 ` [PATCH v2 10/11] RISC-V: KVM: Implement perf support without sampling Atish Patra
2023-01-13 11:45   ` Andrew Jones
2023-01-23  7:23     ` Andrew Jones
2023-01-26  0:50     ` Atish Patra
2022-12-15 17:00 ` [PATCH v2 11/11] RISC-V: KVM: Implement firmware events Atish Patra
2023-01-13 12:08   ` Andrew Jones
2023-01-26  3:08     ` Atish Patra
