* [KVMTOOL][V1] Implement support for ppc64le
@ 2016-03-21  7:17 Balbir Singh
  2016-03-21  7:17 ` [PATCH 1/4] Add basic little endian support Balbir Singh
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Balbir Singh @ 2016-03-21  7:17 UTC (permalink / raw)
  To: will.deacon, kvm; +Cc: penberg, mpe, mikey, aik

This patchset adds support for ppc64le. Patch 1/4 converts key data
structures in the fdt to big endian. Patch 2/4 introduces the h_set_mode
hypercall to support little endian interrupt processing. This requires the
ability to queue commands to a particular vcpu and execute them there, so
patch 2/4 also adds generic infrastructure for that. Patch 3/4 fixes a race
condition found during exit. Patch 4/4 fixes spapr_pci to support little
endian guests so that virtio-pci can be detected and virtio can work.

This patchset was tested on x64 (on my laptop) and on a ppc64le system.



* [PATCH 1/4] Add basic little endian support.
  2016-03-21  7:17 [KVMTOOL][V1] Implement support for ppc64le Balbir Singh
@ 2016-03-21  7:17 ` Balbir Singh
  2016-03-21 21:55   ` Michael Neuling
  2016-03-21  7:17 ` [PATCH 2/4] Implement H_SET_MODE for ppc64le Balbir Singh
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Balbir Singh @ 2016-03-21  7:17 UTC (permalink / raw)
  To: will.deacon, kvm; +Cc: penberg, mpe, mikey, aik, Balbir Singh

Currently kvmtool is designed for, and works well on, big endian ppc64
systems. This patch adds support for little endian systems.

The system does not yet boot, because support for h_set_mode is required to
handle exceptions, which are otherwise taken in big endian mode; the guest
fails at its first page fault. That support comes in the next patch of the
series.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
---
 powerpc/kvm.c   | 24 ++++++++++++------------
 powerpc/spapr.h |  5 +++--
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/powerpc/kvm.c b/powerpc/kvm.c
index b4c3310..d147e0c 100644
--- a/powerpc/kvm.c
+++ b/powerpc/kvm.c
@@ -253,21 +253,21 @@ static void generate_segment_page_sizes(struct kvm_ppc_smmu_info *info, struct f
 		if (sps->page_shift == 0)
 			break;
 
-		*p++ = sps->page_shift;
-		*p++ = sps->slb_enc;
+		*p++ = cpu_to_be32(sps->page_shift);
+		*p++ = cpu_to_be32(sps->slb_enc);
 
 		for (j = 0; j < KVM_PPC_PAGE_SIZES_MAX_SZ; j++)
 			if (!info->sps[i].enc[j].page_shift)
 				break;
 
-		*p++ = j;	/* count of enc */
+		*p++ = cpu_to_be32(j);	/* count of enc */
 
 		for (j = 0; j < KVM_PPC_PAGE_SIZES_MAX_SZ; j++) {
 			if (!info->sps[i].enc[j].page_shift)
 				break;
 
-			*p++ = info->sps[i].enc[j].page_shift;
-			*p++ = info->sps[i].enc[j].pte_enc;
+			*p++ = cpu_to_be32(info->sps[i].enc[j].page_shift);
+			*p++ = cpu_to_be32(info->sps[i].enc[j].pte_enc);
 		}
 	}
 }
@@ -292,7 +292,7 @@ static int setup_fdt(struct kvm *kvm)
 	u8		staging_fdt[FDT_MAX_SIZE];
 	struct cpu_info *cpu_info = find_cpu_info(kvm);
 	struct fdt_prop segment_page_sizes;
-	u32 segment_sizes_1T[] = {0x1c, 0x28, 0xffffffff, 0xffffffff};
+	u32 segment_sizes_1T[] = {cpu_to_be32(0x1c), cpu_to_be32(0x28), 0xffffffff, 0xffffffff};
 
 	/* Generate an appropriate DT at kvm->arch.fdt_gra */
 	void *fdt_dest = guest_flat_to_host(kvm, kvm->arch.fdt_gra);
@@ -364,7 +364,7 @@ static int setup_fdt(struct kvm *kvm)
 	_FDT(fdt_property_cell(fdt, "#size-cells", 0x0));
 
 	for (i = 0; i < smp_cpus; i += SMT_THREADS) {
-		int32_t pft_size_prop[] = { 0, HPT_ORDER };
+		int32_t pft_size_prop[] = { 0, cpu_to_be32(HPT_ORDER) };
 		uint32_t servers_prop[SMT_THREADS];
 		uint32_t gservers_prop[SMT_THREADS * 2];
 		int threads = (smp_cpus - i) >= SMT_THREADS ? SMT_THREADS :
@@ -503,11 +503,11 @@ int kvm__arch_setup_firmware(struct kvm *kvm)
 	 */
 	uint32_t *rtas = guest_flat_to_host(kvm, kvm->arch.rtas_gra);
 
-	rtas[0] = 0x7c641b78;
-	rtas[1] = 0x3c600000;
-	rtas[2] = 0x6063f000;
-	rtas[3] = 0x44000022;
-	rtas[4] = 0x4e800020;
+	rtas[0] = cpu_to_be32(0x7c641b78);
+	rtas[1] = cpu_to_be32(0x3c600000);
+	rtas[2] = cpu_to_be32(0x6063f000);
+	rtas[3] = cpu_to_be32(0x44000022);
+	rtas[4] = cpu_to_be32(0x4e800020);
 	kvm->arch.rtas_size = 20;
 
 	pr_info("Set up %ld bytes of RTAS at 0x%lx\n",
diff --git a/powerpc/spapr.h b/powerpc/spapr.h
index 7a377d0..8b294d1 100644
--- a/powerpc/spapr.h
+++ b/powerpc/spapr.h
@@ -15,6 +15,7 @@
 #define __HW_SPAPR_H__
 
 #include <inttypes.h>
+#include <linux/byteorder.h>
 
 #include "kvm/kvm.h"
 #include "kvm/kvm-cpu.h"
@@ -80,12 +81,12 @@ int spapr_rtas_fdt_setup(struct kvm *kvm, void *fdt);
 
 static inline uint32_t rtas_ld(struct kvm *kvm, target_ulong phys, int n)
 {
-	return *((uint32_t *)guest_flat_to_host(kvm, phys + 4*n));
+	return cpu_to_be32(*((uint32_t *)guest_flat_to_host(kvm, phys + 4*n)));
 }
 
 static inline void rtas_st(struct kvm *kvm, target_ulong phys, int n, uint32_t val)
 {
-	*((uint32_t *)guest_flat_to_host(kvm, phys + 4*n)) = val;
+	*((uint32_t *)guest_flat_to_host(kvm, phys + 4*n)) = cpu_to_be32(val);
 }
 
 typedef void (*spapr_rtas_fn)(struct kvm_cpu *vcpu, uint32_t token,
-- 
2.5.0
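
The byte order conversions in this patch can be illustrated without
depending on the host's endianness. The following is a minimal sketch, not
kvmtool code: store_be32() and load_be32() are hypothetical helpers that
write and read a 32-bit cell most significant byte first, which is the
layout the fdt and the RTAS words need on both big and little endian hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store val at p, most significant byte first. */
static void store_be32(uint8_t *p, uint32_t val)
{
	p[0] = (uint8_t)(val >> 24);
	p[1] = (uint8_t)(val >> 16);
	p[2] = (uint8_t)(val >> 8);
	p[3] = (uint8_t)val;
}

/* Hypothetical helper: read a big endian 32-bit value back from p. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Round trip using the first RTAS instruction word from the patch. */
static void be32_selfcheck(void)
{
	uint8_t buf[4];

	store_be32(buf, 0x7c641b78);
	assert(buf[0] == 0x7c && buf[3] == 0x78);
	assert(load_be32(buf) == 0x7c641b78);
}
```

Because the store and the load are symmetric, the same byte swap serves
both directions on a little endian host, which is why rtas_ld() and
rtas_st() in the patch can both use cpu_to_be32().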



* [PATCH 2/4] Implement H_SET_MODE for ppc64le
  2016-03-21  7:17 [KVMTOOL][V1] Implement support for ppc64le Balbir Singh
  2016-03-21  7:17 ` [PATCH 1/4] Add basic little endian support Balbir Singh
@ 2016-03-21  7:17 ` Balbir Singh
  2016-03-30  5:39   ` Michael Ellerman
  2016-03-21  7:17 ` [PATCH 3/4] Fix a race during exit processing Balbir Singh
  2016-03-21  7:17 ` [PATCH 4/4] Implement spapr pci for little endian systems Balbir Singh
  3 siblings, 1 reply; 9+ messages in thread
From: Balbir Singh @ 2016-03-21  7:17 UTC (permalink / raw)
  To: will.deacon, kvm; +Cc: penberg, mpe, mikey, aik, Balbir Singh

Basic infrastructure for queuing a task to a specific CPU and
the use of that in setting ILE (Little Endian Interrupt Handling)
on power via the h_set_mode hypercall.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
---
 include/kvm/kvm-cpu.h              |   7 +++
 include/kvm/kvm.h                  |   1 +
 kvm-cpu.c                          |  50 +++++++++++++++++
 powerpc/include/kvm/kvm-cpu-arch.h |   2 +
 powerpc/kvm.c                      |   2 +-
 powerpc/spapr.h                    |  15 ++++-
 powerpc/spapr_hcall.c              | 111 +++++++++++++++++++++++++++++++++++++
 7 files changed, 185 insertions(+), 3 deletions(-)

diff --git a/include/kvm/kvm-cpu.h b/include/kvm/kvm-cpu.h
index aa0cb54..5009681 100644
--- a/include/kvm/kvm-cpu.h
+++ b/include/kvm/kvm-cpu.h
@@ -4,6 +4,11 @@
 #include "kvm/kvm-cpu-arch.h"
 #include <stdbool.h>
 
+struct kvm_cpu_task {
+	void (*task)(void *data);
+	void *data;
+};
+
 int kvm_cpu__init(struct kvm *kvm);
 int kvm_cpu__exit(struct kvm *kvm);
 struct kvm_cpu *kvm_cpu__arch_init(struct kvm *kvm, unsigned long cpu_id);
@@ -23,5 +28,7 @@ void kvm_cpu__show_code(struct kvm_cpu *vcpu);
 void kvm_cpu__show_registers(struct kvm_cpu *vcpu);
 void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu);
 void kvm_cpu__arch_nmi(struct kvm_cpu *cpu);
+int kvm_cpu__queue_task(struct kvm_cpu *cpu, void (*task)(void *data),
+			void *data);
 
 #endif /* KVM__KVM_CPU_H */
diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index 37155db..731abee 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -15,6 +15,7 @@
 
 #define SIGKVMEXIT		(SIGRTMIN + 0)
 #define SIGKVMPAUSE		(SIGRTMIN + 1)
+#define SIGKVMTASK		(SIGRTMIN + 2)
 
 #define KVM_PID_FILE_PATH	"/.lkvm/"
 #define HOME_DIR		getenv("HOME")
diff --git a/kvm-cpu.c b/kvm-cpu.c
index ad4441b..438414f 100644
--- a/kvm-cpu.c
+++ b/kvm-cpu.c
@@ -83,10 +83,59 @@ void kvm_cpu__reboot(struct kvm *kvm)
 	}
 }
 
+static void kvm_cpu__run_task(int sig, siginfo_t * info, void *context)
+{
+	union sigval val;
+	struct kvm_cpu_task *task_ptr;
+
+	if (!info) {
+		pr_warning("signal queued without info\n");
+		return;
+	}
+
+	val = info->si_value;
+	task_ptr = val.sival_ptr;
+	if (!task_ptr) {
+		pr_warning("Task queued without data\n");
+		return;
+	}
+
+	if (!task_ptr->task || !task_ptr->data) {
+		pr_warning("Failed to get task information\n");
+		return;
+	}
+
+	task_ptr->task(task_ptr->data);
+	free(task_ptr);
+}
+
+int kvm_cpu__queue_task(struct kvm_cpu *cpu, void (*task)(void *data),
+			void *data)
+{
+	struct kvm_cpu_task *task_ptr = NULL;
+	union sigval val;
+
+	task_ptr = malloc(sizeof(struct kvm_cpu_task));
+	if (!task_ptr)
+		return -ENOMEM;
+
+	task_ptr->task = task;
+	task_ptr->data = data;
+	val.sival_ptr = task_ptr;
+
+	pthread_sigqueue(cpu->thread, SIGKVMTASK, val);
+	return 0;
+}
+
 int kvm_cpu__start(struct kvm_cpu *cpu)
 {
 	sigset_t sigset;
 
+	struct sigaction action = {
+		.sa_sigaction = kvm_cpu__run_task,
+		.sa_flags = SA_SIGINFO,
+	};
+
 	sigemptyset(&sigset);
 	sigaddset(&sigset, SIGALRM);
 
@@ -94,6 +143,7 @@ int kvm_cpu__start(struct kvm_cpu *cpu)
 
 	signal(SIGKVMEXIT, kvm_cpu_signal_handler);
 	signal(SIGKVMPAUSE, kvm_cpu_signal_handler);
+	sigaction(SIGKVMTASK, &action, NULL);
 
 	kvm_cpu__reset_vcpu(cpu);
 
diff --git a/powerpc/include/kvm/kvm-cpu-arch.h b/powerpc/include/kvm/kvm-cpu-arch.h
index 01eafdf..033b702 100644
--- a/powerpc/include/kvm/kvm-cpu-arch.h
+++ b/powerpc/include/kvm/kvm-cpu-arch.h
@@ -38,6 +38,8 @@
 
 #define POWER7_EXT_IRQ	0
 
+#define LPCR_ILE (1 << (63-38))
+
 struct kvm;
 
 struct kvm_cpu {
diff --git a/powerpc/kvm.c b/powerpc/kvm.c
index d147e0c..2dbd0fe 100644
--- a/powerpc/kvm.c
+++ b/powerpc/kvm.c
@@ -286,7 +286,7 @@ static int setup_fdt(struct kvm *kvm)
 	uint32_t	int_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
 	char 		hypertas_prop_kvm[] = "hcall-pft\0hcall-term\0"
 		"hcall-dabr\0hcall-interrupt\0hcall-tce\0hcall-vio\0"
-		"hcall-splpar\0hcall-bulk";
+		"hcall-splpar\0hcall-bulk\0hcall-set-mode";
 	int 		i, j;
 	char 		cpu_name[30];
 	u8		staging_fdt[FDT_MAX_SIZE];
diff --git a/powerpc/spapr.h b/powerpc/spapr.h
index 8b294d1..f851f4a 100644
--- a/powerpc/spapr.h
+++ b/powerpc/spapr.h
@@ -27,7 +27,7 @@ typedef uintptr_t target_phys_addr_t;
 #define H_HARDWARE	-1	/* Hardware error */
 #define H_FUNCTION	-2	/* Function not supported */
 #define H_PARAMETER	-4	/* Parameter invalid, out-of-range or conflicting */
-
+#define H_P2		-55
 #define H_SET_DABR		0x28
 #define H_LOGICAL_CI_LOAD	0x3c
 #define H_LOGICAL_CI_STORE	0x40
@@ -41,7 +41,18 @@ typedef uintptr_t target_phys_addr_t;
 #define H_EOI			0x64
 #define H_IPI			0x6c
 #define H_XIRR			0x74
-#define MAX_HCALL_OPCODE	H_XIRR
+#define H_SET_MODE		0x31C
+#define MAX_HCALL_OPCODE	H_SET_MODE
+
+/* Values for 2nd argument to H_SET_MODE */
+#define H_SET_MODE_RESOURCE_SET_CIABR		1
+#define H_SET_MODE_RESOURCE_SET_DAWR		2
+#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE	3
+#define H_SET_MODE_RESOURCE_LE			4
+
+/* Flags for H_SET_MODE_RESOURCE_LE */
+#define H_SET_MODE_ENDIAN_BIG		0
+#define H_SET_MODE_ENDIAN_LITTLE	1
 
 /*
  * The hcalls above are standardized in PAPR and implemented by pHyp
diff --git a/powerpc/spapr_hcall.c b/powerpc/spapr_hcall.c
index ff1d63a..682fad5 100644
--- a/powerpc/spapr_hcall.c
+++ b/powerpc/spapr_hcall.c
@@ -18,6 +18,9 @@
 
 #include <stdio.h>
 #include <assert.h>
+#include <sys/eventfd.h>
+
+static int task_event;
 
 static spapr_hcall_fn papr_hypercall_table[(MAX_HCALL_OPCODE / 4) + 1];
 static spapr_hcall_fn kvmppc_hypercall_table[KVMPPC_HCALL_MAX -
@@ -74,6 +77,113 @@ static target_ulong h_logical_dcbf(struct kvm_cpu *vcpu, target_ulong opcode, ta
 	return H_SUCCESS;
 }
 
+struct lpcr_data {
+	struct kvm_cpu	*cpu;
+	int		mode;
+};
+
+static int get_cpu_lpcr(struct kvm_cpu *vcpu, target_ulong *lpcr)
+{
+	struct kvm_one_reg reg = {
+		.id = KVM_REG_PPC_LPCR_64,
+		.addr = (__u64)lpcr
+	};
+
+	return ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg);
+}
+
+static int set_cpu_lpcr(struct kvm_cpu *vcpu, target_ulong *lpcr)
+{
+	struct kvm_one_reg reg = {
+		.id = KVM_REG_PPC_LPCR_64,
+		.addr = (__u64)lpcr
+	};
+
+	return ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg);
+}
+
+static void set_lpcr_cpu(void *data)
+{
+	struct lpcr_data *fn_data = (struct lpcr_data *)data;
+	int ret;
+	target_ulong lpcr;
+	u64 task_done = 1;
+
+	if (!fn_data || !fn_data->cpu)
+		return;
+
+	ret = get_cpu_lpcr(fn_data->cpu, &lpcr);
+	if (ret < 0)
+		return;
+
+	if (fn_data->mode == H_SET_MODE_ENDIAN_BIG)
+		lpcr &= ~LPCR_ILE;
+	else
+		lpcr |= LPCR_ILE;
+
+	ret = set_cpu_lpcr(fn_data->cpu, &lpcr);
+	if (ret < 0)
+		return;
+
+	free(data);
+	if (write(task_event, &task_done, sizeof(task_done)) < 0)
+		pr_warning("Failed to notify of lpcr task done\n");
+}
+
+#define for_each_vcpu(cpu, kvm, i) \
+	for ((i) = 0, (cpu) = (kvm)->cpus[i]; (i) < (kvm)->nrcpus; (i)++, (cpu) = (kvm)->cpus[i])
+
+static target_ulong h_set_mode(struct kvm_cpu *vcpu, target_ulong opcode, target_ulong *args)
+{
+	int ret = H_SUCCESS;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_cpu *cpu;
+	int i;
+
+	switch (args[1]) {
+	case H_SET_MODE_RESOURCE_LE: {
+		u64 total_done = 0;
+		u64 task_read;
+
+		task_event = eventfd(0, 0);
+		if (task_event < 0) {
+			pr_warning("Failed to create task_event");
+			break;
+		}
+		for_each_vcpu(cpu, kvm, i) {
+			struct lpcr_data *data;
+
+			data = malloc(sizeof(struct lpcr_data));
+			if (!data) {
+				ret = H_P2;
+				break;
+			}
+			data->cpu = cpu;
+			data->mode = args[0];
+
+			kvm_cpu__queue_task(cpu, set_lpcr_cpu, data);
+		}
+
+		while ((int)total_done < kvm->nrcpus) {
+			int err;
+			err = read(task_event, &task_read, sizeof(task_read));
+			if (err < 0) {
+				ret = H_P2;
+				break;
+			}
+			total_done += task_read;
+		}
+		close(task_event);
+		break;
+	}
+	default:
+		ret = H_FUNCTION;
+		break;
+	}
+	return (ret < 0) ? H_P2 : H_SUCCESS;
+}
+
+
 void spapr_register_hypercall(target_ulong opcode, spapr_hcall_fn fn)
 {
 	spapr_hcall_fn *slot;
@@ -128,6 +238,7 @@ void hypercall_init(void)
 	spapr_register_hypercall(H_LOGICAL_CACHE_STORE, h_logical_store);
 	spapr_register_hypercall(H_LOGICAL_ICBI, h_logical_icbi);
 	spapr_register_hypercall(H_LOGICAL_DCBF, h_logical_dcbf);
+	spapr_register_hypercall(H_SET_MODE, h_set_mode);
 
 	/* KVM-PPC specific hcalls */
 	spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);
-- 
2.5.0
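
The LPCR_ILE definition in this patch uses IBM bit numbering, where bit 0
is the most significant bit of the 64-bit register, so bit 38 corresponds
to 1 << 25. A small sketch of that mapping follows; IBM_BIT64,
SKETCH_LPCR_ILE and lpcr_set_ile() are hypothetical names, and the macro
uses a 64-bit constant (an assumption beyond the patch) so that it stays
well defined for bit numbers below 32 as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: IBM (MSB-0) bit number n of a 64-bit register. */
#define IBM_BIT64(n)	(UINT64_C(1) << (63 - (n)))
#define SKETCH_LPCR_ILE	IBM_BIT64(38)

/* Return lpcr with the ILE bit set (little endian) or cleared (big). */
static uint64_t lpcr_set_ile(uint64_t lpcr, int little_endian)
{
	if (little_endian)
		return lpcr | SKETCH_LPCR_ILE;
	return lpcr & ~SKETCH_LPCR_ILE;
}
```

This mirrors the read-modify-write that set_lpcr_cpu() performs between
the KVM_GET_ONE_REG and KVM_SET_ONE_REG calls.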



* [PATCH 3/4] Fix a race during exit processing
  2016-03-21  7:17 [KVMTOOL][V1] Implement support for ppc64le Balbir Singh
  2016-03-21  7:17 ` [PATCH 1/4] Add basic little endian support Balbir Singh
  2016-03-21  7:17 ` [PATCH 2/4] Implement H_SET_MODE for ppc64le Balbir Singh
@ 2016-03-21  7:17 ` Balbir Singh
  2016-03-21  7:17 ` [PATCH 4/4] Implement spapr pci for little endian systems Balbir Singh
  3 siblings, 0 replies; 9+ messages in thread
From: Balbir Singh @ 2016-03-21  7:17 UTC (permalink / raw)
  To: will.deacon, kvm; +Cc: penberg, mpe, mikey, aik, Balbir Singh

Fix a race, described below

	lkvm stop ...	handle_stop
			kvm_cpu__reboot
			kvm_cmd_run_exit
			vcpus exit
			...
			dev_exit
			...
			ioport__unregister
			..serial...
			kvm__pause --> br_write_lock
			pthread_kill

But the thread is already dead by the time pthread_kill is called.

We set a flag (kvm_cmd_exit) when exit processing starts so that kvm__pause()
does nothing from then on. This should not break any semantics.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
---
 builtin-run.c | 3 +++
 kvm.c         | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/builtin-run.c b/builtin-run.c
index 17b1428..cdc7158 100644
--- a/builtin-run.c
+++ b/builtin-run.c
@@ -58,6 +58,7 @@ __thread struct kvm_cpu *current_kvm_cpu;
 static int  kvm_run_wrapper;
 
 bool do_debug_print = false;
+int kvm_cmd_exit;
 
 static const char * const run_usage[] = {
 	"lkvm run [<options>] [<kernel image>]",
@@ -648,6 +649,7 @@ static void kvm_cmd_run_exit(struct kvm *kvm, int guest_ret)
 {
 	compat__print_all_messages();
 
+	kvm_cmd_exit = 1;
 	init_list__exit(kvm);
 
 	if (guest_ret == 0 && do_debug_print)
@@ -659,6 +661,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 	int ret = -EFAULT;
 	struct kvm *kvm;
 
+	kvm_cmd_exit = 0;
 	kvm = kvm_cmd_run_init(argc, argv);
 	if (IS_ERR(kvm))
 		return PTR_ERR(kvm);
diff --git a/kvm.c b/kvm.c
index 1081072..53cf0e2 100644
--- a/kvm.c
+++ b/kvm.c
@@ -33,6 +33,8 @@
 
 #define DEFINE_KVM_EXIT_REASON(reason) [reason] = #reason
 
+extern int kvm_cmd_exit;
+
 const char *kvm_exit_reasons[] = {
 	DEFINE_KVM_EXIT_REASON(KVM_EXIT_UNKNOWN),
 	DEFINE_KVM_EXIT_REASON(KVM_EXIT_EXCEPTION),
@@ -435,6 +437,9 @@ void kvm__pause(struct kvm *kvm)
 	if (!kvm->cpus[0] || kvm->cpus[0]->thread == 0)
 		return;
 
+	if (kvm_cmd_exit)
+		return;
+
 	mutex_lock(&pause_lock);
 
 	pause_event = eventfd(0, 0);
-- 
2.5.0



* [PATCH 4/4] Implement spapr pci for little endian systems.
  2016-03-21  7:17 [KVMTOOL][V1] Implement support for ppc64le Balbir Singh
                   ` (2 preceding siblings ...)
  2016-03-21  7:17 ` [PATCH 3/4] Fix a race during exit processing Balbir Singh
@ 2016-03-21  7:17 ` Balbir Singh
  3 siblings, 0 replies; 9+ messages in thread
From: Balbir Singh @ 2016-03-21  7:17 UTC (permalink / raw)
  To: will.deacon, kvm; +Cc: penberg, mpe, mikey, aik, Balbir Singh

Port the spapr_pci implementation to ppc64le.
Based on suggestions by Alexey Kardashevskiy <aik@ozlabs.ru>.
We should always have used phys_hi with a 64-bit addr and size.

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
---
 powerpc/spapr_pci.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/powerpc/spapr_pci.c b/powerpc/spapr_pci.c
index 768e3f2..a15f7d8 100644
--- a/powerpc/spapr_pci.c
+++ b/powerpc/spapr_pci.c
@@ -234,8 +234,11 @@ int spapr_populate_pci_devices(struct kvm *kvm,
 	int bus_off, node_off = 0, devid, fn, i, n, devices;
 	struct device_header *dev_hdr;
 	char nodename[256];
-	struct of_pci_unit_address reg[PCI_NUM_REGIONS + 1],
-				   assigned_addresses[PCI_NUM_REGIONS];
+	struct of_pci_unit64_address {
+		u32 phys_hi;
+		u64 addr;
+		u64 size;
+	} __attribute((packed)) reg[PCI_NUM_REGIONS + 1], assigned_addresses[PCI_NUM_REGIONS];
 	uint32_t bus_range[] = { cpu_to_be32(0), cpu_to_be32(0xff) };
 	struct of_pci_ranges_entry ranges[] = {
 		{
@@ -339,7 +342,7 @@ int spapr_populate_pci_devices(struct kvm *kvm,
 				      le16_to_cpu(hdr->subsys_vendor_id)));
 
 		/* Config space region comes first */
-		reg[0].hi = cpu_to_be32(
+		reg[0].phys_hi = cpu_to_be32(
 			of_pci_b_n(0) |
 			of_pci_b_p(0) |
 			of_pci_b_t(0) |
@@ -347,8 +350,8 @@ int spapr_populate_pci_devices(struct kvm *kvm,
 			of_pci_b_bbbbbbbb(0) |
 			of_pci_b_ddddd(devid) |
 			of_pci_b_fff(fn));
-		reg[0].mid = 0;
-		reg[0].lo = 0;
+		reg[0].addr = 0;
+		reg[0].size = 0;
 
 		n = 0;
 		/* Six BARs, no ROM supported, addresses are 32bit */
@@ -357,7 +360,7 @@ int spapr_populate_pci_devices(struct kvm *kvm,
 				continue;
 			}
 
-			reg[n+1].hi = cpu_to_be32(
+			reg[n+1].phys_hi = cpu_to_be32(
 				of_pci_b_n(0) |
 				of_pci_b_p(0) |
 				of_pci_b_t(0) |
@@ -366,10 +369,10 @@ int spapr_populate_pci_devices(struct kvm *kvm,
 				of_pci_b_ddddd(devid) |
 				of_pci_b_fff(fn) |
 				of_pci_b_rrrrrrrr(bars[i]));
-			reg[n+1].mid = 0;
-			reg[n+1].lo = cpu_to_be64(hdr->bar_size[i]);
+			reg[n+1].size = cpu_to_be64(hdr->bar_size[i]);
+			reg[n+1].addr = 0;
 
-			assigned_addresses[n].hi = cpu_to_be32(
+			assigned_addresses[n].phys_hi = cpu_to_be32(
 				of_pci_b_n(1) |
 				of_pci_b_p(0) |
 				of_pci_b_t(0) |
@@ -383,8 +386,8 @@ int spapr_populate_pci_devices(struct kvm *kvm,
 			 * Writing zeroes to assigned_addresses causes the guest kernel to
 			 * reassign BARs
 			 */
-			assigned_addresses[n].mid = cpu_to_be64(bar_to_addr(le32_to_cpu(hdr->bar[i])));
-			assigned_addresses[n].lo = reg[n+1].lo;
+			assigned_addresses[n].addr = cpu_to_be64(bar_to_addr(le32_to_cpu(hdr->bar[i])));
+			assigned_addresses[n].size = reg[n+1].size;
 
 			++n;
 		}
-- 
2.5.0
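
The struct introduced in this patch has to match the five 32-bit cells of
a PCI "reg" or "assigned-addresses" entry (three address cells, two size
cells), so it must be exactly 20 bytes with no padding. A minimal sketch
of that layout check, assuming a GCC or Clang compatible compiler;
pci_unit64_sketch is a hypothetical stand-in for the patch's struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the layout: phys.hi cell, 64-bit addr, 64-bit size. */
struct pci_unit64_sketch {
	uint32_t phys_hi;
	uint64_t addr;
	uint64_t size;
} __attribute__((packed));

/* Without "packed" the compiler would insert 4 bytes after phys_hi. */
_Static_assert(sizeof(struct pci_unit64_sketch) == 20,
	       "entry must be five 32-bit cells");
_Static_assert(offsetof(struct pci_unit64_sketch, addr) == 4,
	       "addr must follow phys_hi directly");
```

Each field still has to be stored big endian, as the patch does with
cpu_to_be32() and cpu_to_be64().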



* Re: [PATCH 1/4] Add basic little endian support.
  2016-03-21  7:17 ` [PATCH 1/4] Add basic little endian support Balbir Singh
@ 2016-03-21 21:55   ` Michael Neuling
  2016-03-21 22:30     ` Michael Ellerman
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Neuling @ 2016-03-21 21:55 UTC (permalink / raw)
  To: Balbir Singh, will.deacon, kvm; +Cc: penberg, mpe, aik

On Mon, 2016-03-21 at 18:17 +1100, Balbir Singh wrote:

> Currently kvmtool works well/was designed for big endian ppc64 systems.
> This patch adds support for little endian systems
> 
> The system does not yet boot as support for h_set_mode is required to help
> with exceptions in big endian mode -- first page fault. The support comes in
> the next patch of the series

Can we define some of the variables below with the appropriate endian?
pft_size_prop, segment_sizes_1T, and rtas could all be defined as
big endian.

Mikey



> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> ---
>  powerpc/kvm.c   | 24 ++++++++++++------------
>  powerpc/spapr.h |  5 +++--
>  2 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/powerpc/kvm.c b/powerpc/kvm.c
> index b4c3310..d147e0c 100644
> --- a/powerpc/kvm.c
> +++ b/powerpc/kvm.c
> @@ -253,21 +253,21 @@ static void generate_segment_page_sizes(struct kvm_ppc_smmu_info *info, struct f
>    	  	  if (sps->page_shift == 0)
>    	  	  	  break;
>  
> -  	  	  *p++ = sps->page_shift;
> -  	  	  *p++ = sps->slb_enc;
> +  	  	  *p++ = cpu_to_be32(sps->page_shift);
> +  	  	  *p++ = cpu_to_be32(sps->slb_enc);
>  
>    	  	  for (j = 0; j < KVM_PPC_PAGE_SIZES_MAX_SZ; j++)
>    	  	  	  if (!info->sps[i].enc[j].page_shift)
>    	  	  	  	  break;
>  
> -  	  	  *p++ = j;  	  /* count of enc */
> +  	  	  *p++ = cpu_to_be32(j);  	  /* count of enc */
>  
>    	  	  for (j = 0; j < KVM_PPC_PAGE_SIZES_MAX_SZ; j++) {
>    	  	  	  if (!info->sps[i].enc[j].page_shift)
>    	  	  	  	  break;
>  
> -  	  	  	  *p++ = info->sps[i].enc[j].page_shift;
> -  	  	  	  *p++ = info->sps[i].enc[j].pte_enc;
> +  	  	  	  *p++ = cpu_to_be32(info->sps[i].enc[j].page_shift);
> +  	  	  	  *p++ = cpu_to_be32(info->sps[i].enc[j].pte_enc);
>    	  	  }
>    	  }
>  }
> @@ -292,7 +292,7 @@ static int setup_fdt(struct kvm *kvm)
>    	  u8  	  	  staging_fdt[FDT_MAX_SIZE];
>    	  struct cpu_info *cpu_info = find_cpu_info(kvm);
>    	  struct fdt_prop segment_page_sizes;
> -  	  u32 segment_sizes_1T[] = {0x1c, 0x28, 0xffffffff, 0xffffffff};
> +  	  u32 segment_sizes_1T[] = {cpu_to_be32(0x1c), cpu_to_be32(0x28), 0xffffffff, 0xffffffff};
>  
>    	  /* Generate an appropriate DT at kvm->arch.fdt_gra */
>    	  void *fdt_dest = guest_flat_to_host(kvm, kvm->arch.fdt_gra);
> @@ -364,7 +364,7 @@ static int setup_fdt(struct kvm *kvm)
>    	  _FDT(fdt_property_cell(fdt, "#size-cells", 0x0));
>  
>    	  for (i = 0; i < smp_cpus; i += SMT_THREADS) {
> -  	  	  int32_t pft_size_prop[] = { 0, HPT_ORDER };
> +  	  	  int32_t pft_size_prop[] = { 0, cpu_to_be32(HPT_ORDER) };
>    	  	  uint32_t servers_prop[SMT_THREADS];
>    	  	  uint32_t gservers_prop[SMT_THREADS * 2];
>    	  	  int threads = (smp_cpus - i) >= SMT_THREADS ? SMT_THREADS :
> @@ -503,11 +503,11 @@ int kvm__arch_setup_firmware(struct kvm *kvm)
>    	   */
>    	  uint32_t *rtas = guest_flat_to_host(kvm, kvm->arch.rtas_gra);
>  
> -  	  rtas[0] = 0x7c641b78;
> -  	  rtas[1] = 0x3c600000;
> -  	  rtas[2] = 0x6063f000;
> -  	  rtas[3] = 0x44000022;
> -  	  rtas[4] = 0x4e800020;
> +  	  rtas[0] = cpu_to_be32(0x7c641b78);
> +  	  rtas[1] = cpu_to_be32(0x3c600000);
> +  	  rtas[2] = cpu_to_be32(0x6063f000);
> +  	  rtas[3] = cpu_to_be32(0x44000022);
> +  	  rtas[4] = cpu_to_be32(0x4e800020);
>    	  kvm->arch.rtas_size = 20;
>  
>    	  pr_info("Set up %ld bytes of RTAS at 0x%lx\n",
> diff --git a/powerpc/spapr.h b/powerpc/spapr.h
> index 7a377d0..8b294d1 100644
> --- a/powerpc/spapr.h
> +++ b/powerpc/spapr.h
> @@ -15,6 +15,7 @@
>  #define __HW_SPAPR_H__
>  
>  #include <inttypes.h>
> +#include <linux/byteorder.h>
>  
>  #include "kvm/kvm.h"
>  #include "kvm/kvm-cpu.h"
> @@ -80,12 +81,12 @@ int spapr_rtas_fdt_setup(struct kvm *kvm, void *fdt);
>  
>  static inline uint32_t rtas_ld(struct kvm *kvm, target_ulong phys, int n)
>  {
> -  	  return *((uint32_t *)guest_flat_to_host(kvm, phys + 4*n));
> +  	  return cpu_to_be32(*((uint32_t *)guest_flat_to_host(kvm, phys + 4*n)));
>  }
>  
>  static inline void rtas_st(struct kvm *kvm, target_ulong phys, int n, uint32_t val)
>  {
> -  	  *((uint32_t *)guest_flat_to_host(kvm, phys + 4*n)) = val;
> +  	  *((uint32_t *)guest_flat_to_host(kvm, phys + 4*n)) = cpu_to_be32(val);
>  }
>  
>  typedef void (*spapr_rtas_fn)(struct kvm_cpu *vcpu, uint32_t token,


* Re: [PATCH 1/4] Add basic little endian support.
  2016-03-21 21:55   ` Michael Neuling
@ 2016-03-21 22:30     ` Michael Ellerman
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2016-03-21 22:30 UTC (permalink / raw)
  To: Michael Neuling, Balbir Singh, will.deacon, kvm; +Cc: penberg, aik

On Tue, 2016-03-22 at 08:55 +1100, Michael Neuling wrote:
> On Mon, 2016-03-21 at 18:17 +1100, Balbir Singh wrote:
>
> > Currently kvmtool works well/was designed for big endian ppc64 systems.
> > This patch adds support for little endian systems
> >
> > The system does not yet boot as support for h_set_mode is required to help
> > with exceptions in big endian mode -- first page fault. The support comes in
> > the next patch of the series
>
> Can we define some of the variables below with the appropriate endian?
> pft_size_prop, segment_sizes_1T, and rtas could all be defined as
> big endian.

Yeah that would be good.

kvmtool does have the definitions for __be32 etc.

I can't see any support in the Makefiles for running sparse, but you can always
run it manually and/or add support.
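
As a rough illustration of what endian-typed variables buy, here is a
sketch of a sparse-checkable big endian type. It is not kvmtool's header:
sk_be32, sk_bitwise, sk_force, sk_cpu_to_be32() and fill_pft_size() are
hypothetical names, and the value passed for the HPT order is arbitrary.
The annotations are only active when sparse defines __CHECKER__; a normal
compiler sees a plain uint32_t.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __CHECKER__			/* defined when sparse runs */
#define sk_bitwise	__attribute__((bitwise))
#define sk_force	__attribute__((force))
#else
#define sk_bitwise
#define sk_force
#endif

typedef uint32_t sk_bitwise sk_be32;	/* hypothetical stand-in for __be32 */

/* Build the big endian byte image of v, independent of host byte order. */
static sk_be32 sk_cpu_to_be32(uint32_t v)
{
	union { uint8_t b[4]; uint32_t w; } u = {
		.b = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
		       (uint8_t)(v >> 8), (uint8_t)v }
	};

	return (sk_force sk_be32)u.w;
}

/* A property declared as sk_be32 cannot silently take a host-endian int. */
static void fill_pft_size(sk_be32 prop[2], uint32_t hpt_order)
{
	prop[0] = sk_cpu_to_be32(0);
	prop[1] = sk_cpu_to_be32(hpt_order);
}
```

With types like this, sparse would warn about an assignment such as
prop[1] = hpt_order, which is the class of mistake the patch fixes by
hand.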

cheers



* Re: [PATCH 2/4] Implement H_SET_MODE for ppc64le
  2016-03-21  7:17 ` [PATCH 2/4] Implement H_SET_MODE for ppc64le Balbir Singh
@ 2016-03-30  5:39   ` Michael Ellerman
  2016-03-30 11:08     ` Balbir Singh
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2016-03-30  5:39 UTC (permalink / raw)
  To: Balbir Singh, will.deacon, kvm; +Cc: penberg, mikey, aik

Hi Balbir,

So I got this running and it seems to work well.

I have some comments on the implementation though, see below ...

On Mon, 2016-03-21 at 18:17 +1100, Balbir Singh wrote:

> Basic infrastructure for queuing a task to a specific CPU and
> the use of that in setting ILE (Little Endian Interrupt Handling)
> on power via the h_set_mode hypercall.
> 
> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
> index 37155db..731abee 100644
> --- a/include/kvm/kvm.h
> +++ b/include/kvm/kvm.h
> @@ -15,6 +15,7 @@
>  
>  #define SIGKVMEXIT		(SIGRTMIN + 0)
>  #define SIGKVMPAUSE		(SIGRTMIN + 1)
> +#define SIGKVMTASK		(SIGRTMIN + 2)
>  
>  #define KVM_PID_FILE_PATH	"/.lkvm/"
>  #define HOME_DIR		getenv("HOME")
> diff --git a/kvm-cpu.c b/kvm-cpu.c
> index ad4441b..438414f 100644
> --- a/kvm-cpu.c
> +++ b/kvm-cpu.c
> @@ -83,10 +83,59 @@ void kvm_cpu__reboot(struct kvm *kvm)
>  	}
>  }
>  
> +static void kvm_cpu__run_task(int sig, siginfo_t * info, void *context)
> +{
> +	union sigval val;
> +	struct kvm_cpu_task *task_ptr;
> +
> +	if (!info) {
> +		pr_warning("signal queued without info\n");
> +		return;
> +	}
> +
> +	val = info->si_value;
> +	task_ptr = val.sival_ptr;
> +	if (!task_ptr) {
> +		pr_warning("Task queued without data\n");
> +		return;
> +	}
> +
> +	if (!task_ptr->task || !task_ptr->data) {
> +		pr_warning("Failed to get task information\n");
> +		return;
> +	}
> +
> +	task_ptr->task(task_ptr->data);
> +	free(task_ptr);
> +}

I don't think it's safe to do the actual task call from signal context. Rather
it should set a flag that the main loop detects and then runs the task there.
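
A minimal sketch of the flag-based approach suggested here, under stated
assumptions: the handler only records that work is pending, and the task
itself runs later in normal thread context, where calls such as free()
(which is not async-signal-safe) are allowed. All sk_ names are
hypothetical, and where exactly the vcpu loop would call sk_run_pending()
is left open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the signal handler, consumed by the loop. */
static volatile sig_atomic_t sk_task_pending;

/* Handler does nothing but store a flag, which is async-signal-safe. */
static void sk_task_signal(int sig)
{
	(void)sig;
	sk_task_pending = 1;
}

static int sk_install_handler(int signo)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sk_task_signal;
	sigemptyset(&sa.sa_mask);
	return sigaction(signo, &sa, NULL);
}

/* Called from the loop in normal context; runs the task at most once. */
static void sk_run_pending(void (*task)(void *data), void *data)
{
	if (sk_task_pending) {
		sk_task_pending = 0;
		task(data);
	}
}

/* Example task: count how many times it was run. */
static void sk_count_task(void *data)
{
	(*(int *)data)++;
}
```

A single flag cannot carry the task pointer the way si_value does in the
patch, so a real version would also need a per-vcpu slot or queue that
the sender fills before raising the signal.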

> +int kvm_cpu__queue_task(struct kvm_cpu *cpu, void (*task)(void *data),
> +			void *data)
> +{
> +	struct kvm_cpu_task *task_ptr = NULL;
> +	union sigval val;
> +
> +	task_ptr = malloc(sizeof(struct kvm_cpu_task));
> +	if (!task_ptr)
> +		return -ENOMEM;
> +
> +	task_ptr->task = task;
> +	task_ptr->data = data;
> +	val.sival_ptr = task_ptr;
> +
> +	pthread_sigqueue(cpu->thread, SIGKVMTASK, val);
> +	return 0;
> +}

I think it would be nicer if this interface dealt with waiting for the
response, rather than the caller having to do it.

Possibly in future we'll want to do an async task, but we can refactor the code
then to skip doing the wait.

> diff --git a/powerpc/include/kvm/kvm-cpu-arch.h b/powerpc/include/kvm/kvm-cpu-arch.h
> index 01eafdf..033b702 100644
> --- a/powerpc/include/kvm/kvm-cpu-arch.h
> +++ b/powerpc/include/kvm/kvm-cpu-arch.h
> @@ -38,6 +38,8 @@
>  
>  #define POWER7_EXT_IRQ	0
>  
> +#define LPCR_ILE (1 << (63-38))
> +
>  struct kvm;
>  
>  struct kvm_cpu {
> diff --git a/powerpc/spapr.h b/powerpc/spapr.h
> index 8b294d1..f851f4a 100644
> --- a/powerpc/spapr.h
> +++ b/powerpc/spapr.h
> @@ -27,7 +27,7 @@ typedef uintptr_t target_phys_addr_t;
>  #define H_HARDWARE	-1	/* Hardware error */
>  #define H_FUNCTION	-2	/* Function not supported */
>  #define H_PARAMETER	-4	/* Parameter invalid, out-of-range or conflicting */
> -
> +#define H_P2		-55
>  #define H_SET_DABR		0x28
>  #define H_LOGICAL_CI_LOAD	0x3c
>  #define H_LOGICAL_CI_STORE	0x40
> @@ -41,7 +41,18 @@ typedef uintptr_t target_phys_addr_t;
>  #define H_EOI			0x64
>  #define H_IPI			0x6c
>  #define H_XIRR			0x74
> -#define MAX_HCALL_OPCODE	H_XIRR
> +#define H_SET_MODE		0x31C
> +#define MAX_HCALL_OPCODE	H_SET_MODE
> +
> +/* Values for 2nd argument to H_SET_MODE */
> +#define H_SET_MODE_RESOURCE_SET_CIABR		1
> +#define H_SET_MODE_RESOURCE_SET_DAWR		2
> +#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE	3
> +#define H_SET_MODE_RESOURCE_LE			4
> +
> +/* Flags for H_SET_MODE_RESOURCE_LE */
> +#define H_SET_MODE_ENDIAN_BIG		0
> +#define H_SET_MODE_ENDIAN_LITTLE	1
>  
>  /*
>   * The hcalls above are standardized in PAPR and implemented by pHyp
> diff --git a/powerpc/spapr_hcall.c b/powerpc/spapr_hcall.c
> index ff1d63a..682fad5 100644
> --- a/powerpc/spapr_hcall.c
> +++ b/powerpc/spapr_hcall.c
> @@ -18,6 +18,9 @@
>  
>  #include <stdio.h>
>  #include <assert.h>
> +#include <sys/eventfd.h>
> +
> +static int task_event;
>  
>  static spapr_hcall_fn papr_hypercall_table[(MAX_HCALL_OPCODE / 4) + 1];
>  static spapr_hcall_fn kvmppc_hypercall_table[KVMPPC_HCALL_MAX -
> @@ -74,6 +77,113 @@ static target_ulong h_logical_dcbf(struct kvm_cpu *vcpu, target_ulong opcode, ta
>  	return H_SUCCESS;
>  }
>  
> +struct lpcr_data {
> +	struct kvm_cpu	*cpu;
> +	int		mode;
> +};
> +
> +static int get_cpu_lpcr(struct kvm_cpu *vcpu, target_ulong *lpcr)
> +{
> +	struct kvm_one_reg reg = {
> +		.id = KVM_REG_PPC_LPCR_64,
> +		.addr = (__u64)lpcr
> +	};
> +
> +	return ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg);
> +}
> +
> +static int set_cpu_lpcr(struct kvm_cpu *vcpu, target_ulong *lpcr)

This function has a reasonable name ..

> +{
> +	struct kvm_one_reg reg = {
> +		.id = KVM_REG_PPC_LPCR_64,
> +		.addr = (__u64)lpcr
> +	};
> +
> +	return ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg);
> +}
> +
> +static void set_lpcr_cpu(void *data)

But then this one is *very* similar.

I think this should actually be called set_cpu_ile(), because that's what it
does. And maybe have "task" in the name because it's the version for using with
kvm_cpu__queue_task().

> +{
> +	struct lpcr_data *fn_data = (struct lpcr_data *)data;
> +	int ret;
> +	target_ulong lpcr;
> +	u64 task_done = 1;
> +
> +	if (!fn_data || !fn_data->cpu)
> +		return;

These should be hard errors IMHO.

> +	ret = get_cpu_lpcr(fn_data->cpu, &lpcr);
> +	if (ret < 0)
> +		return;

Uh oh!

It looks like most code calls die() if KVM_SET_ONE_REG fails, that would be
preferable I think than running some cpus with a different endian :)
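
As a rough illustration of the point above, the ILE update could be split into a pure bit-manipulation helper plus a check that treats a failed register access as fatal. This is only a sketch: `lpcr_apply_ile()` and `check_reg_access()` are hypothetical names, `check_reg_access()` merely stands in for kvmtool's die(), and the constant is written with `1ULL` (an assumption, the patch uses a plain `1`) so that it is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Values taken from the patch under review (LPCR_ILE widened to 64 bits). */
#define LPCR_ILE			(1ULL << (63 - 38))
#define H_SET_MODE_ENDIAN_BIG		0
#define H_SET_MODE_ENDIAN_LITTLE	1

/* Hypothetical helper: return lpcr with ILE cleared (big) or set (little). */
uint64_t lpcr_apply_ile(uint64_t lpcr, int mode)
{
	if (mode == H_SET_MODE_ENDIAN_BIG)
		return lpcr & ~LPCR_ILE;
	return lpcr | LPCR_ILE;
}

/*
 * Hypothetical stand-in for die(): abort instead of returning quietly,
 * so that no vcpu keeps running with a different interrupt endianness.
 */
void check_reg_access(int ret, const char *what)
{
	if (ret < 0) {
		fprintf(stderr, "fatal: %s failed\n", what);
		abort();
	}
}
```

The caller would wrap both the get and the set of the LPCR in `check_reg_access()` and apply `lpcr_apply_ile()` in between.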

> +	if (fn_data->mode == H_SET_MODE_ENDIAN_BIG)
> +		lpcr &= ~LPCR_ILE;
> +	else
> +		lpcr |= LPCR_ILE;
> +
> +	ret = set_cpu_lpcr(fn_data->cpu, &lpcr);
> +	if (ret < 0)
> +		return;
> +
> +	free(data);

I don't think we should be doing the free here.

> +	if (write(task_event, &task_done, sizeof(task_done)) < 0)
> +		pr_warning("Failed to notify of lpcr task done\n");
> +}
> +
> +#define for_each_vcpu(cpu, kvm, i) \
> +	for ((i) = 0, (cpu) = (kvm)->cpus[i]; (i) < (kvm)->nrcpus; (i)++, (cpu) = (kvm)->cpus[i])

That should probably be in a header.

>
> +static target_ulong h_set_mode(struct kvm_cpu *vcpu, target_ulong opcode, target_ulong *args)
> +{
> +	int ret = H_SUCCESS;

That init should be removed.

> +	struct kvm *kvm = vcpu->kvm;
> +	struct kvm_cpu *cpu;
> +	int i;
> +
> +	switch (args[1]) {
> +	case H_SET_MODE_RESOURCE_LE: {
> +		u64 total_done = 0;
> +		u64 task_read;
> +
> +		task_event = eventfd(0, 0);
> +		if (task_event < 0) {
> +			pr_warning("Failed to create task_event");
> +			break;

That will return H_SUCCESS which is not OK.

> +		}
> +		for_each_vcpu(cpu, kvm, i) {
> +			struct lpcr_data *data;
> +
> +			data = malloc(sizeof(struct lpcr_data));

Is there any reason not to do this synchronously?

That would allow you to put data on the stack. And also avoid the while loop
below.
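
One possible shape for a synchronous variant is sketched here, under the assumption that the target is an ordinary thread that can take a mutex. All names (`struct sync_call`, `sync_call_run()`, `sync_call_wait()`, `sync_call_demo()`) are hypothetical and not part of kvmtool. A real vcpu thread interrupted by a signal could not safely take the mutex from the handler, so this only shows why the data can live on the caller's stack once the caller waits.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical request object. It can live on the caller's stack because
 * the caller does not return until the target has marked it done.
 */
struct sync_call {
	void		(*fn)(void *data);
	void		*data;
	bool		done;
	pthread_mutex_t	lock;
	pthread_cond_t	cond;
};

/* Target side: run the task, then wake the waiting caller. */
void sync_call_run(struct sync_call *call)
{
	call->fn(call->data);

	pthread_mutex_lock(&call->lock);
	call->done = true;
	pthread_cond_signal(&call->cond);
	pthread_mutex_unlock(&call->lock);
}

/* Caller side: block until the target has finished. */
void sync_call_wait(struct sync_call *call)
{
	pthread_mutex_lock(&call->lock);
	while (!call->done)
		pthread_cond_wait(&call->cond, &call->lock);
	pthread_mutex_unlock(&call->lock);
}

/* Stand-ins for the task and for the thread that is asked to run it. */
static void store_mode(void *data)
{
	*(int *)data = 1;
}

static void *target_thread(void *arg)
{
	sync_call_run(arg);
	return NULL;
}

/* Returns the value written by the task, or -1 if the thread cannot start. */
int sync_call_demo(void)
{
	int mode = 0;			/* stack data, no malloc()/free() */
	struct sync_call call = { .fn = store_mode, .data = &mode, .done = false };
	pthread_t thread;

	pthread_mutex_init(&call.lock, NULL);
	pthread_cond_init(&call.cond, NULL);

	if (pthread_create(&thread, NULL, target_thread, &call) != 0)
		return -1;
	sync_call_wait(&call);
	pthread_join(thread, NULL);	/* target no longer touches call */

	pthread_cond_destroy(&call.cond);
	pthread_mutex_destroy(&call.lock);
	return mode;
}
```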

> +			if (!data) {
> +				ret = H_P2;
> +				break;
> +			}
> +			data->cpu = cpu;
> +			data->mode = args[0];
> +
> +			kvm_cpu__queue_task(cpu, set_lpcr_cpu, data);
> +		}
> +
> +		while ((int)total_done < kvm->nrcpus) {
> +			int err;
> +			err = read(task_event, &task_read, sizeof(task_read));
> +			if (err < 0) {
> +				ret = H_P2;
> +				break;
> +			}
> +			total_done += task_read;
> +		}
> +		close(task_event);
> +		break;
> +	}
> +	default:
> +		ret = H_FUNCTION;
> +		break;
> +	}
> +	return (ret < 0) ? H_P2 : H_SUCCESS;
> +}

I think that ends up being correct, but it's pretty obscure. ie. for an
unsupported resource we should return H_P2, and you get that to happen by
setting ret to H_FUNCTION (-2).
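
A less obscure structure might return the hcall status directly from each branch instead of funnelling everything through one final conversion. The sketch below keeps the patch's choices (H_P2 for an unsupported resource and for a failed update). `h_set_mode_status()` and its `apply_failed` parameter are hypothetical, and `H_SUCCESS` is assumed to be 0 since its definition is not shown in this patch.

```c
#include <assert.h>

#define H_SUCCESS			0	/* assumed, not shown in the patch */
#define H_P2				(-55)	/* from the patch */
#define H_SET_MODE_RESOURCE_LE		4	/* from the patch */

/*
 * Hypothetical helper: map the requested resource and the outcome of
 * applying it onto the value that h_set_mode() hands back to the guest.
 */
long h_set_mode_status(unsigned long resource, int apply_failed)
{
	switch (resource) {
	case H_SET_MODE_RESOURCE_LE:
		/* The patch reports H_P2 when the ILE update fails. */
		return apply_failed ? H_P2 : H_SUCCESS;
	default:
		/* Unsupported resource: say so explicitly. */
		return H_P2;
	}
}
```

With this shape no branch can fall through to H_SUCCESS by accident, which is the failure mode flagged for the eventfd() error path.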

cheers


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] Implement H_SET_MODE for ppc64le
  2016-03-30  5:39   ` Michael Ellerman
@ 2016-03-30 11:08     ` Balbir Singh
  0 siblings, 0 replies; 9+ messages in thread
From: Balbir Singh @ 2016-03-30 11:08 UTC (permalink / raw)
  To: Michael Ellerman, will.deacon, kvm; +Cc: penberg, mikey, aik



On 30/03/16 16:39, Michael Ellerman wrote:
> Hi Balbir,
>
> So I got this running and it seems to work well.
>
> I have some comments on the implementation though, see below ...
>
> On Mon, 2016-03-21 at 18:17 +1100, Balbir Singh wrote:
>
>> Basic infrastructure for queuing a task to a specific CPU and
>> the use of that in setting ILE (Little Endian Interrupt Handling)
>> on power via h_set_mode hypercall
>>
>> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
>> diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
>> index 37155db..731abee 100644
>> --- a/include/kvm/kvm.h
>> +++ b/include/kvm/kvm.h
>> @@ -15,6 +15,7 @@
>>  
>>  #define SIGKVMEXIT		(SIGRTMIN + 0)
>>  #define SIGKVMPAUSE		(SIGRTMIN + 1)
>> +#define SIGKVMTASK		(SIGRTMIN + 2)
>>  
>>  #define KVM_PID_FILE_PATH	"/.lkvm/"
>>  #define HOME_DIR		getenv("HOME")
>> diff --git a/kvm-cpu.c b/kvm-cpu.c
>> index ad4441b..438414f 100644
>> --- a/kvm-cpu.c
>> +++ b/kvm-cpu.c
>> @@ -83,10 +83,59 @@ void kvm_cpu__reboot(struct kvm *kvm)
>>  	}
>>  }
>>  
>> +static void kvm_cpu__run_task(int sig, siginfo_t * info, void *context)
>> +{
>> +	union sigval val;
>> +	struct kvm_cpu_task *task_ptr;
>> +
>> +	if (!info) {
>> +		pr_warning("signal queued without info\n");
>> +		return;
>> +	}
>> +
>> +	val = info->si_value;
>> +	task_ptr = val.sival_ptr;
>> +	if (!task_ptr) {
>> +		pr_warning("Task queued without data\n");
>> +		return;
>> +	}
>> +
>> +	if (!task_ptr->task || !task_ptr->data) {
>> +		pr_warning("Failed to get task information\n");
>> +		return;
>> +	}
>> +
>> +	task_ptr->task(task_ptr->data);
>> +	free(task_ptr);
>> +}
> I don't think it's safe to do the actual task call from signal context. Rather
> it should set a flag that the main loop detects and then runs the task there.
I don't think it matters. We do other stuff like kvm__pause() from signal context.
The problem I see with main-loop detection is that we could potentially have
both modes queued up.
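
For reference, the flag-based alternative discussed here usually looks something like the sketch below: the handler only records that work is pending, and the loop runs the task outside signal context. The names are hypothetical, SIGUSR1 stands in for SIGKVMTASK, and a single flag deliberately does not address the concern about several requests being queued; that would need a proper queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set from the signal handler, consumed by the main loop. */
static volatile sig_atomic_t task_pending;

/* Handler does the minimum: writing a sig_atomic_t flag is signal-safe. */
static void task_signal_handler(int sig)
{
	(void)sig;
	task_pending = 1;
}

/* Install the handler; SIGUSR1 is a stand-in for SIGKVMTASK. */
int install_task_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = task_signal_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Called once per main-loop iteration. Runs the task outside signal
 * context and returns 1 if it ran, 0 if nothing was pending.
 */
int run_pending_task(void (*task)(void *data), void *data)
{
	if (!task_pending)
		return 0;
	task_pending = 0;
	task(data);
	return 1;
}

/* Trivial example task: count how often it has been run. */
void count_task(void *data)
{
	++*(int *)data;
}
```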

>> +int kvm_cpu__queue_task(struct kvm_cpu *cpu, void (*task)(void *data),
>> +			void *data)
>> +{
>> +	struct kvm_cpu_task *task_ptr = NULL;
>> +	union sigval val;
>> +
>> +	task_ptr = malloc(sizeof(struct kvm_cpu_task));
>> +	if (!task_ptr)
>> +		return -ENOMEM;
>> +
>> +	task_ptr->task = task;
>> +	task_ptr->data = data;
>> +	val.sival_ptr = task_ptr;
>> +
>> +	pthread_sigqueue(cpu->thread, SIGKVMTASK, val);
>> +	return 0;
>> +}
> I think it would be nicer if this interface dealt with waiting for the
> response. Rather than the caller having to do it.
>
> Possibly in future we'll want to do an async task, but we can refactor the code
> then to skip doing the wait.
I wrote the core code (the powerpc-independent bits) to be async, with our
code as an example of how to do sync.
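
The sync-on-top-of-async pattern in the patch relies on eventfd counter semantics: each write adds to a kernel counter, and a read returns the accumulated value and resets it to zero. A minimal sketch of that handshake follows, with hypothetical helper names (`notify_done()`, `wait_for_done()`); it is Linux-specific.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Each finished task adds 1 to the eventfd counter. */
int notify_done(int efd)
{
	uint64_t one = 1;

	if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		return -1;
	return 0;
}

/*
 * Block until `expected` completions have been seen. One read() may
 * report several completions at once, hence the accumulation.
 */
int wait_for_done(int efd, uint64_t expected)
{
	uint64_t total = 0;
	uint64_t count;

	while (total < expected) {
		if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
			return -1;
		total += count;
	}
	return 0;
}
```

The issuing vcpu would create the eventfd, queue one task per vcpu, and then call `wait_for_done()` with the number of tasks that were actually queued.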
>> diff --git a/powerpc/include/kvm/kvm-cpu-arch.h b/powerpc/include/kvm/kvm-cpu-arch.h
>> index 01eafdf..033b702 100644
>> --- a/powerpc/include/kvm/kvm-cpu-arch.h
>> +++ b/powerpc/include/kvm/kvm-cpu-arch.h
>> @@ -38,6 +38,8 @@
>>  
>>  #define POWER7_EXT_IRQ	0
>>  
>> +#define LPCR_ILE (1 << (63-38))
>> +
>>  struct kvm;
>>  
>>  struct kvm_cpu {
>> diff --git a/powerpc/spapr.h b/powerpc/spapr.h
>> index 8b294d1..f851f4a 100644
>> --- a/powerpc/spapr.h
>> +++ b/powerpc/spapr.h
>> @@ -27,7 +27,7 @@ typedef uintptr_t target_phys_addr_t;
>>  #define H_HARDWARE	-1	/* Hardware error */
>>  #define H_FUNCTION	-2	/* Function not supported */
>>  #define H_PARAMETER	-4	/* Parameter invalid, out-of-range or conflicting */
>> -
>> +#define H_P2		-55
>>  #define H_SET_DABR		0x28
>>  #define H_LOGICAL_CI_LOAD	0x3c
>>  #define H_LOGICAL_CI_STORE	0x40
>> @@ -41,7 +41,18 @@ typedef uintptr_t target_phys_addr_t;
>>  #define H_EOI			0x64
>>  #define H_IPI			0x6c
>>  #define H_XIRR			0x74
>> -#define MAX_HCALL_OPCODE	H_XIRR
>> +#define H_SET_MODE		0x31C
>> +#define MAX_HCALL_OPCODE	H_SET_MODE
>> +
>> +/* Values for 2nd argument to H_SET_MODE */
>> +#define H_SET_MODE_RESOURCE_SET_CIABR		1
>> +#define H_SET_MODE_RESOURCE_SET_DAWR		2
>> +#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE	3
>> +#define H_SET_MODE_RESOURCE_LE			4
>> +
>> +/* Flags for H_SET_MODE_RESOURCE_LE */
>> +#define H_SET_MODE_ENDIAN_BIG		0
>> +#define H_SET_MODE_ENDIAN_LITTLE	1
>>  
>>  /*
>>   * The hcalls above are standardized in PAPR and implemented by pHyp
>> diff --git a/powerpc/spapr_hcall.c b/powerpc/spapr_hcall.c
>> index ff1d63a..682fad5 100644
>> --- a/powerpc/spapr_hcall.c
>> +++ b/powerpc/spapr_hcall.c
>> @@ -18,6 +18,9 @@
>>  
>>  #include <stdio.h>
>>  #include <assert.h>
>> +#include <sys/eventfd.h>
>> +
>> +static int task_event;
>>  
>>  static spapr_hcall_fn papr_hypercall_table[(MAX_HCALL_OPCODE / 4) + 1];
>>  static spapr_hcall_fn kvmppc_hypercall_table[KVMPPC_HCALL_MAX -
>> @@ -74,6 +77,113 @@ static target_ulong h_logical_dcbf(struct kvm_cpu *vcpu, target_ulong opcode, ta
>>  	return H_SUCCESS;
>>  }
>>  
>> +struct lpcr_data {
>> +	struct kvm_cpu	*cpu;
>> +	int		mode;
>> +};
>> +
>> +static int get_cpu_lpcr(struct kvm_cpu *vcpu, target_ulong *lpcr)
>> +{
>> +	struct kvm_one_reg reg = {
>> +		.id = KVM_REG_PPC_LPCR_64,
>> +		.addr = (__u64)lpcr
>> +	};
>> +
>> +	return ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg);
>> +}
>> +
>> +static int set_cpu_lpcr(struct kvm_cpu *vcpu, target_ulong *lpcr)
> This function has a reasonable name ..
>
>> +{
>> +	struct kvm_one_reg reg = {
>> +		.id = KVM_REG_PPC_LPCR_64,
>> +		.addr = (__u64)lpcr
>> +	};
>> +
>> +	return ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg);
>> +}
>> +
>> +static void set_lpcr_cpu(void *data)
> But then this one is *very* similar.
>
> I think this should actually be called set_cpu_ile(), because that's what it
> does. And maybe have "task" in the name because it's the version for using with
> kvm_cpu__queue_task().
I'll change this to set_cpu_ile -- good catch
>
>> +{
>> +	struct lpcr_data *fn_data = (struct lpcr_data *)data;
>> +	int ret;
>> +	target_ulong lpcr;
>> +	u64 task_done = 1;
>> +
>> +	if (!fn_data || !fn_data->cpu)
>> +		return;
> This should be hard errors IMHO.
I wanted to avoid hard errors to see if the OS can fail with an OOPS
later. My concern is that hard errors always leave the system in a very
bad state with no scope to debug. I can change that if required
>> +	ret = get_cpu_lpcr(fn_data->cpu, &lpcr);
>> +	if (ret < 0)
>> +		return;
> Uh oh!
>
> It looks like most code calls die() if KVM_SET_ONE_REG fails, that would be
> preferable I think than running some cpus with a different endian :)

>> +	if (fn_data->mode == H_SET_MODE_ENDIAN_BIG)
>> +		lpcr &= ~LPCR_ILE;
>> +	else
>> +		lpcr |= LPCR_ILE;
>> +
>> +	ret = set_cpu_lpcr(fn_data->cpu, &lpcr);
>> +	if (ret < 0)
>> +		return;
>> +
>> +	free(data);
> I don't think we should be doing the free here.
From the point the callback gets the data, it owns it. Otherwise we'd have to
implement an "I am done with this processing" notification.

>> +	if (write(task_event, &task_done, sizeof(task_done)) < 0)
>> +		pr_warning("Failed to notify of lpcr task done\n");
>> +}
>> +
>> +#define for_each_vcpu(cpu, kvm, i) \
>> +	for ((i) = 0, (cpu) = (kvm)->cpus[i]; (i) < (kvm)->nrcpus; (i)++, (cpu) = (kvm)->cpus[i])
> That should probably be in a header.
Yep, I did not see any use outside this function, but I can move it
>> +static target_ulong h_set_mode(struct kvm_cpu *vcpu, target_ulong opcode, target_ulong *args)
>> +{
>> +	int ret = H_SUCCESS;
> That init should be removed.
OK, I'll revisit the error handling
>> +	struct kvm *kvm = vcpu->kvm;
>> +	struct kvm_cpu *cpu;
>> +	int i;
>> +
>> +	switch (args[1]) {
>> +	case H_SET_MODE_RESOURCE_LE: {
>> +		u64 total_done = 0;
>> +		u64 task_read;
>> +
>> +		task_event = eventfd(0, 0);
>> +		if (task_event < 0) {
>> +			pr_warning("Failed to create task_event");
>> +			break;
> That will return H_SUCCESS which is not OK.
Good catch!
>> +		}
>> +		for_each_vcpu(cpu, kvm, i) {
>> +			struct lpcr_data *data;
>> +
>> +			data = malloc(sizeof(struct lpcr_data));
> Is there any reason not to do this synchronously?
>
> That would allow you to put data on the stack. And also avoid the while loop
> below.
Synchronous implies a two-way communication mechanism between this vCPU
and all the others to signal the beginning and end of a task.
>
>> +			if (!data) {
>> +				ret = H_P2;
>> +				break;
>> +			}
>> +			data->cpu = cpu;
>> +			data->mode = args[0];
>> +
>> +			kvm_cpu__queue_task(cpu, set_lpcr_cpu, data);
>> +		}
>> +
>> +		while ((int)total_done < kvm->nrcpus) {
>> +			int err;
>> +			err = read(task_event, &task_read, sizeof(task_read));
>> +			if (err < 0) {
>> +				ret = H_P2;
>> +				break;
>> +			}
>> +			total_done += task_read;
>> +		}
>> +		close(task_event);
>> +		break;
>> +	}
>> +	default:
>> +		ret = H_FUNCTION;
>> +		break;
>> +	}
>> +	return (ret < 0) ? H_P2 : H_SUCCESS;
>> +}
> I think that ends up being correct, but it's pretty obscure. ie. for an
> unsupported resource we should return H_P2, and you get that to happen by
> setting ret to H_FUNCTION (-2).
Good catch! I'll fix this
>
> cheers
>

Thanks for the review
Balbir

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-03-30 11:08 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-21  7:17 [KVMTOOL][V1] Implement support for ppc64le Balbir Singh
2016-03-21  7:17 ` [PATCH 1/4] Add basic little endian support Balbir Singh
2016-03-21 21:55   ` Michael Neuling
2016-03-21 22:30     ` Michael Ellerman
2016-03-21  7:17 ` [PATCH 2/4] Implement H_SET_MODE for ppc64le Balbir Singh
2016-03-30  5:39   ` Michael Ellerman
2016-03-30 11:08     ` Balbir Singh
2016-03-21  7:17 ` [PATCH 3/4] Fix a race during exit processing Balbir Singh
2016-03-21  7:17 ` [PATCH 4/4] Implement spapr pci for little endian systems Balbir Singh
