* [PATCH V3 0/3] Use efi_rts_wq to invoke EFI Runtime Services
@ 2018-05-22  3:13 Sai Praneeth Prakhya
  2018-05-22  3:13 ` [PATCH V3 1/3] x86/efi: Call efi_delete_dummy_variable() after creating efi_rts_wq Sai Praneeth Prakhya
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Sai Praneeth Prakhya @ 2018-05-22  3:13 UTC (permalink / raw)
  To: linux-efi, linux-kernel
  Cc: Sai Praneeth, Lee Chun-Yi, Borislav Petkov, Tony Luck,
	Will Deacon, Dave Hansen, Mark Rutland, Bhupesh Sharma,
	Naresh Bhat, Ricardo Neri, Peter Zijlstra, Ravi Shankar,
	Matt Fleming, Dan Williams, Ard Biesheuvel, Miguel Ojeda

From: Sai Praneeth <sai.praneeth.prakhya@intel.com>

Problem statement:
------------------
Presently, efi_runtime_services() silently switch %cr3 from swapper_pgd
to efi_pgd. As a consequence, kernel code that runs in efi_pgd (e.g.,
perf code via an NMI) will have incorrect user space mappings[1]. This
could lead to otherwise unexpected access errors and, worse, unauthorized
access to firmware code and data.

Detailed discussion of problem statement:
-----------------------------------------
Because this switch is not propagated to other kernel subsystems, they
wrongly assume that swapper_pgd is still in use, which can lead to the
following issues:

1. If kernel code tries to access user space addresses while in efi_pgd,
it could lead to unauthorized accesses to firmware code/data.
(e.g: <__>/copy_from_user_nmi()).
[This could also be disastrous if the frame pointer happens to point at
MMIO in the EFI runtime mappings] - Mark Rutland.

An example of a subsystem that could touch user space while in efi_pgd is
perf. Assume that we are in efi_pgd and a user uses perf to profile some
user data. Depending on the address the user is trying to profile, two
things could happen:
1. If the mappings are absent, perf fails to profile.
2. If efi_pgd does have mappings for the requested address (these
  mappings are erroneous), perf profiles firmware code/data. If the
  address is MMIO'ed, perf could have potentially changed some device state.

The culprit in both cases is the EFI subsystem swapping out the pgd, not
perf, because the EFI subsystem has broken the *general assumption* that
all other subsystems rely on - "user space might be valid and nobody has
switched %cr3".

Solutions:
----------
There are two ways to fix this issue:
1. Inform *all* the subsystems that could potentially access user space
   while in efi_pgd about the pgd change.
  On x86, AFAIK, it could happen only when someone touches user space
  from the back of an NMI (a quick audit of <__>/copy_from_user_nmi
  showed perf and oprofile). On arm, it could happen from multiple
  places as arm runs efi_runtime_services() with interrupts enabled (ARM
  folks, please comment on this as I might be wrong), whereas x86 runs
  efi_runtime_services() with interrupts disabled.

  I think this solution isn't holistic because:
  a. Any other subsystem might well do the same, if not now, in future.
  b. This solution looks simple on x86, but I am not sure the same is
    true for arm (ARM folks, please comment on this as I might be wrong).
  c. This solution looks like a workaround rather than addressing the issue.

2. Running efi_runtime_services() in kthread context.
  This makes sense because efi_pgd doesn't have user space and kthread
  by definition means that user space is not valid. Any kernel code that
  tries to touch user space while in kthread is buggy in itself. If so,
  it should be an easy fix in the other subsystem. This also takes us one
  step closer to Andy's long-awaited proposal - Running EFI at CPL 3.

What does this patch set do?
----------------------------
Introduce efi_rts_wq (EFI runtime services work queue).
When a user process requests the kernel to execute any efi_runtime_service(),
kernel queues the work to efi_rts_wq, a kthread comes along, switches to
efi_pgd and executes efi_runtime_service() in kthread context. IOW, this
patch set adds support to the EFI subsystem to handle all calls to
efi_runtime_services() using a work queue (which in turn uses kthread).

How does running efi_runtime_services() in a kthread fix the issues above?
--------------------------------------------------------------------------
If we run efi_runtime_services() in kthread context and perf checks for
it, perf can handle both of the above scenarios correctly by aborting
the profiling. Not only perf, but any subsystem that tries to touch user
space should first check for kthread context and, if so, abort.

Q. If we still need to check for kthread context in other subsystems that
access user space, what does this patch set fix?
A. This patch set makes sure that the EFI subsystem is not at fault.
Without this patch set the blame is upon the EFI subsystem, because it's
the one that changed the pgd, hasn't communicated this change to everyone
and hence broke the general assumption. Running efi_runtime_services() in
a kthread means explicitly communicating that user space is invalid; it is
then the responsibility of the other subsystem to make sure that it's
running in the right context.
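
The check described above can be sketched in plain user space C. This is
only an illustration under stated assumptions: struct sketch_task, the
SKETCH_TASK_KTHREAD flag and sketch_copy_from_user() are hypothetical
names invented here, not kernel interfaces, and the real check in perf
or any other subsystem may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a task flag that marks kernel threads. */
#define SKETCH_TASK_KTHREAD 0x1u

/* Hypothetical, simplified task descriptor (not the kernel's task_struct). */
struct sketch_task {
	unsigned int flags;
	const char *user_mem;	/* simulated user space mapping, NULL if none */
	size_t user_len;
};

/* True when the task's user space mappings must not be trusted. */
static bool sketch_in_kthread(const struct sketch_task *t)
{
	return (t->flags & SKETCH_TASK_KTHREAD) != 0;
}

/*
 * Copy up to n bytes of simulated user memory, as a profiler might.
 * Returns the number of bytes copied; 0 means the access was refused.
 */
static size_t sketch_copy_from_user(const struct sketch_task *t,
				    char *dst, size_t n)
{
	assert(dst != NULL);

	/* Abort instead of touching whatever happens to be mapped. */
	if (sketch_in_kthread(t) || !t->user_mem)
		return 0;
	if (n > t->user_len)
		n = t->user_len;
	memcpy(dst, t->user_mem, n);
	return n;
}
```

With this shape, a caller running on behalf of a user process gets its
data, while the same call made from the (simulated) kthread context
returns 0 without reading anything.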

Testing:
--------
Tested using LUV (Linux UEFI Validation) for x86_64, x86_32 and arm64
(qemu only). I would appreciate it if someone could test the patches on
real ARM/ARM64 machines.
LUV: https://01.org/linux-uefi-validation

Credits:
--------
Thanks to Ricardo, Dan, Miguel and Mark for initial reviews and
suggestions. Thanks to Boris and Andy for making me think through/help
on what I am addressing with this patch set. Please feel free to pour in
your comments and concerns.

Note:
-----
Patches are based on Linus's kernel v4.17-rc6

[1] Backup: Detailing efi_pgd:
------------------------------
efi_pgd has mappings for EFI Runtime Code/Data (on x86, plus EFI Boot time
Code/Data) regions. Due to the nature of these mappings, they fall
in user space address ranges and they are not the same as swapper.

[On arm64, the EFI mappings are in the VA range usually used for user
space. The two halves of the address space are managed by separate
tables, TTBR0 and TTBR1. We always map the kernel in TTBR1, and we map
user space or EFI runtime mappings in TTBR0.] - Mark Rutland

Changes from V2 to V3:
----------------------
1. Rewrite the cover letter to clearly state the problem: what we are
fixing and what we are not fixing.
2. Make efi_delete_dummy_variable() change local to x86.
3. Avoid using BUG(), instead, print error message and exit gracefully.
4. Move struct efi_runtime_work to runtime-wrappers.c file.
5. Give enum a name (efi_rts_ids) and use it in efi_runtime_work.
6. Add Naresh (maintainer of LUV for ARM) and Miguel to the CC list.

Changes from V1 to V2:
----------------------
1. Remove unnecessary include of asm/efi.h file - Fixes build error on
ia64, reported by 0-day
2. Use enum to identify efi_runtime_services()
3. Use alloc_ordered_workqueue() to create efi_rts_wq as
create_workqueue() is scheduled for deprecation.
4. Make efi_call_rts() static, as it has no callers outside
runtime-wrappers.c
5. Use BUG(), when we are unable to queue work or unable to identify
requested efi_runtime_service() - Because these two situations should
*never* happen.

Sai Praneeth (3):
  x86/efi: Call efi_delete_dummy_variable() after creating efi_rts_wq
  efi: Introduce efi_queue_work() to queue any efi_runtime_service() on 
       efi_rts_wq
  efi: Use efi_rts_wq to invoke EFI Runtime Services

 arch/x86/platform/efi/efi.c             |  15 +-
 drivers/firmware/efi/arm-runtime.c      |   3 +
 drivers/firmware/efi/efi.c              |  25 ++++
 drivers/firmware/efi/runtime-wrappers.c | 250 +++++++++++++++++++++++++++++---
 include/linux/efi.h                     |   4 +
 5 files changed, 271 insertions(+), 26 deletions(-)

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Cc: Lee Chun-Yi <jlee@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Naresh Bhat <naresh.bhat@linaro.org>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>

-- 
2.7.4



* [PATCH V3 1/3] x86/efi: Call efi_delete_dummy_variable() after creating efi_rts_wq
  2018-05-22  3:13 [PATCH V3 0/3] Use efi_rts_wq to invoke EFI Runtime Services Sai Praneeth Prakhya
@ 2018-05-22  3:13 ` Sai Praneeth Prakhya
  2018-05-22  3:13 ` [PATCH V3 2/3] efi: Introduce efi_queue_work() to queue any efi_runtime_service() on efi_rts_wq Sai Praneeth Prakhya
  2018-05-22  3:13 ` [PATCH V3 3/3] efi: Use efi_rts_wq to invoke EFI Runtime Services Sai Praneeth Prakhya
  2 siblings, 0 replies; 6+ messages in thread
From: Sai Praneeth Prakhya @ 2018-05-22  3:13 UTC (permalink / raw)
  To: linux-efi, linux-kernel
  Cc: Sai Praneeth, Lee Chun-Yi, Borislav Petkov, Tony Luck,
	Will Deacon, Dave Hansen, Mark Rutland, Bhupesh Sharma,
	Naresh Bhat, Ricardo Neri, Peter Zijlstra, Ravi Shankar,
	Matt Fleming, Dan Williams, Ard Biesheuvel, Miguel Ojeda

From: Sai Praneeth <sai.praneeth.prakhya@intel.com>

Create a workqueue named efi_rts_wq (efi runtime services workqueue), so
that all efi_runtime_services() are executed in kthread context.

Invoking efi_runtime_services() through efi_rts_wq means all accesses to
efi_runtime_services() should be done after efi_rts_wq has been created.
efi_delete_dummy_variable() calls set_variable(), hence
efi_delete_dummy_variable() should be called after efi_rts_wq has been
created.

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Cc: Lee Chun-Yi <jlee@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Naresh Bhat <naresh.bhat@linaro.org>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
---
 arch/x86/platform/efi/efi.c        | 15 +++++++++------
 drivers/firmware/efi/arm-runtime.c |  3 +++
 drivers/firmware/efi/efi.c         | 25 +++++++++++++++++++++++++
 include/linux/efi.h                |  4 ++++
 4 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 9061babfbc83..adcc55cd25ce 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -893,9 +893,6 @@ static void __init kexec_enter_virtual_mode(void)
 
 	if (efi_enabled(EFI_OLD_MEMMAP) && (__supported_pte_mask & _PAGE_NX))
 		runtime_code_page_mkexec();
-
-	/* clean DUMMY object */
-	efi_delete_dummy_variable();
 #endif
 }
 
@@ -1015,9 +1012,6 @@ static void __init __efi_enter_virtual_mode(void)
 	 * necessary relocation fixups for the new virtual addresses.
 	 */
 	efi_runtime_update_mappings();
-
-	/* clean DUMMY object */
-	efi_delete_dummy_variable();
 }
 
 void __init efi_enter_virtual_mode(void)
@@ -1031,6 +1025,15 @@ void __init efi_enter_virtual_mode(void)
 		__efi_enter_virtual_mode();
 
 	efi_dump_pagetable();
+
+	if (!efi_create_rts_wq())
+		return;
+
+	/*
+	 * Clean DUMMY object calls EFI Runtime Service, set_variable(), so
+	 * it should be invoked only after efi_rts_wq is ready.
+	 */
+	efi_delete_dummy_variable();
 }
 
 static int __init arch_parse_efi_cmdline(char *str)
diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
index 5889cbea60b8..6fb06130b53f 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -139,6 +139,9 @@ static int __init arm_enable_runtime_services(void)
 		return -ENOMEM;
 	}
 
+	if (!efi_create_rts_wq())
+		return 0;
+
 	/* Set up runtime services function pointers */
 	efi_native_runtime_setup();
 	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 232f4915223b..b9103caa03b4 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -84,6 +84,8 @@ struct mm_struct efi_mm = {
 	.mmlist			= LIST_HEAD_INIT(efi_mm.mmlist),
 };
 
+struct workqueue_struct *efi_rts_wq;
+
 static bool disable_runtime;
 static int __init setup_noefi(char *arg)
 {
@@ -337,6 +339,13 @@ static int __init efisubsys_init(void)
 	if (!efi_enabled(EFI_BOOT))
 		return 0;
 
+	/*
+	 * If we failed to create efi_rts_wq, EFI_RUNTIME_SERVICES would
+	 * have been cleared, check for that condition.
+	 */
+	if (!efi_enabled(EFI_RUNTIME_SERVICES))
+		return 0;
+
 	/* We register the efi directory at /sys/firmware/efi */
 	efi_kobj = kobject_create_and_add("efi", firmware_kobj);
 	if (!efi_kobj) {
@@ -971,3 +980,19 @@ static int register_update_efi_random_seed(void)
 }
 late_initcall(register_update_efi_random_seed);
 #endif
+
+bool __init efi_create_rts_wq(void)
+{
+	/*
+	 * Since we process only one efi_runtime_service() at a time, an
+	 * ordered workqueue (which creates only one execution context)
+	 * should suffice for all our needs.
+	 */
+	efi_rts_wq = alloc_ordered_workqueue("efi_rts_wq", 0);
+	if (!efi_rts_wq) {
+		pr_err("Creating efi_rts_wq failed, EFI runtime services disabled.\n");
+		clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
+		return false;
+	}
+	return true;
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 3016d8c456bc..565955010b18 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -987,6 +987,7 @@ extern void efi_map_pal_code (void);
 extern void efi_memmap_walk (efi_freemem_callback_t callback, void *arg);
 extern void efi_gettimeofday (struct timespec64 *ts);
 extern void efi_enter_virtual_mode (void);	/* switch EFI to virtual mode, if possible */
+extern bool __init efi_create_rts_wq(void);
 #ifdef CONFIG_X86
 extern void efi_late_init(void);
 extern void efi_free_boot_services(void);
@@ -1651,4 +1652,7 @@ struct linux_efi_tpm_eventlog {
 
 extern int efi_tpm_eventlog_init(void);
 
+/* Workqueue to queue EFI Runtime Services */
+extern struct workqueue_struct *efi_rts_wq;
+
 #endif /* _LINUX_EFI_H */
-- 
2.7.4



* [PATCH V3 2/3] efi: Introduce efi_queue_work() to queue any efi_runtime_service() on efi_rts_wq
  2018-05-22  3:13 [PATCH V3 0/3] Use efi_rts_wq to invoke EFI Runtime Services Sai Praneeth Prakhya
  2018-05-22  3:13 ` [PATCH V3 1/3] x86/efi: Call efi_delete_dummy_variable() after creating efi_rts_wq Sai Praneeth Prakhya
@ 2018-05-22  3:13 ` Sai Praneeth Prakhya
  2018-05-22  7:36   ` Peter Zijlstra
  2018-05-22  3:13 ` [PATCH V3 3/3] efi: Use efi_rts_wq to invoke EFI Runtime Services Sai Praneeth Prakhya
  2 siblings, 1 reply; 6+ messages in thread
From: Sai Praneeth Prakhya @ 2018-05-22  3:13 UTC (permalink / raw)
  To: linux-efi, linux-kernel
  Cc: Sai Praneeth, Lee Chun-Yi, Borislav Petkov, Tony Luck,
	Will Deacon, Dave Hansen, Mark Rutland, Bhupesh Sharma,
	Naresh Bhat, Ricardo Neri, Peter Zijlstra, Ravi Shankar,
	Matt Fleming, Dan Williams, Ard Biesheuvel, Miguel Ojeda

From: Sai Praneeth <sai.praneeth.prakhya@intel.com>

When a process requests the kernel to execute any efi_runtime_service(),
the requested efi_runtime_service (represented as an identifier) and its
arguments are packed into a struct named efi_runtime_work and queued
onto work queue named efi_rts_wq. The caller then waits until the work
is completed.

Introduce efi_queue_work(), which:
1. Populates efi_runtime_work,
2. Queues the work onto efi_rts_wq, and
3. Waits until the worker thread returns.

The caller thread has to wait until the worker thread returns, because
it depends on the return status of efi_runtime_service() and, in
specific cases, the arguments populated by efi_runtime_service(). Some
efi_runtime_services() take a pointer to a buffer as an argument and
fill up the buffer with the requested data, for instance
efi_get_variable() and efi_get_next_variable(). Hence, the caller process
cannot just post the work and get going.
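
The populate, queue and wait steps can be sketched in user space with
POSIX threads. This is a rough analogy only: it starts one thread per
request instead of using a persistent workqueue worker, and all names
(struct sketch_work, sketch_queue_work(), sketch_get_value()) are
hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work item: the request, its argument, and the result. */
struct sketch_work {
	int (*fn)(void *arg);	/* service to run in the worker thread */
	void *arg;		/* buffer the service may fill in */
	int status;		/* written by the worker, read by the caller */
};

/* Worker thread body: run the service and record its status. */
static void *sketch_worker(void *p)
{
	struct sketch_work *w = p;

	w->status = w->fn(w->arg);
	return NULL;
}

/*
 * 1. populate the work item, 2. hand it to a worker thread,
 * 3. wait for the worker. Returns the service status, or -1 if the
 * worker could not be started.
 */
static int sketch_queue_work(int (*fn)(void *), void *arg)
{
	struct sketch_work w = { .fn = fn, .arg = arg, .status = -1 };
	pthread_t tid;

	assert(fn != NULL);
	if (pthread_create(&tid, NULL, sketch_worker, &w) != 0)
		return w.status;

	/* The work item lives on this stack frame, so we must wait. */
	pthread_join(tid, NULL);
	return w.status;
}

/* Example service that fills a caller-provided buffer. */
static int sketch_get_value(void *arg)
{
	*(int *)arg = 42;
	return 0;
}
```

Calling sketch_queue_work(sketch_get_value, &out) returns only after the
worker has written both the status and the buffer, which is why the
caller cannot simply post the work and continue.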

Some facts about efi_runtime_services():
1. A quick look at all the efi_runtime_services() shows that any
efi_runtime_service() has five or fewer arguments.
2. An argument of efi_runtime_service() can be a value (of any type) or
a pointer (of any type).
Hence, efi_runtime_work has five void pointers to store these arguments.

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Cc: Lee Chun-Yi <jlee@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Naresh Bhat <naresh.bhat@linaro.org>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
---
 drivers/firmware/efi/runtime-wrappers.c | 80 +++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index ae54870b2788..a9866045ed52 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -1,6 +1,14 @@
 /*
  * runtime-wrappers.c - Runtime Services function call wrappers
  *
+ * Implementation summary:
+ * -----------------------
+ * 1. When user/kernel thread requests to execute efi_runtime_service(),
+ * enqueue work to efi_rts_wq.
+ * 2. Caller thread waits until the work is finished because it's
+ * dependent on the return status and execution of efi_runtime_service().
+ * For instance, get_variable() and get_next_variable().
+ *
  * Copyright (C) 2014 Linaro Ltd. <ard.biesheuvel@linaro.org>
  *
  * Split off from arch/x86/platform/efi/efi.c
@@ -22,6 +30,8 @@
 #include <linux/mutex.h>
 #include <linux/semaphore.h>
 #include <linux/stringify.h>
+#include <linux/workqueue.h>
+
 #include <asm/efi.h>
 
 /*
@@ -33,6 +43,76 @@
 #define __efi_call_virt(f, args...) \
 	__efi_call_virt_pointer(efi.systab->runtime, f, args)
 
+/* efi_runtime_service() function identifiers */
+enum efi_rts_ids {
+	GET_TIME,
+	SET_TIME,
+	GET_WAKEUP_TIME,
+	SET_WAKEUP_TIME,
+	GET_VARIABLE,
+	GET_NEXT_VARIABLE,
+	SET_VARIABLE,
+	SET_VARIABLE_NONBLOCKING,
+	QUERY_VARIABLE_INFO,
+	QUERY_VARIABLE_INFO_NONBLOCKING,
+	GET_NEXT_HIGH_MONO_COUNT,
+	RESET_SYSTEM,
+	UPDATE_CAPSULE,
+	QUERY_CAPSULE_CAPS,
+};
+
+/*
+ * efi_runtime_work:	Details of EFI Runtime Service work
+ * @efi_rts_id:	EFI Runtime Service function identifier
+ * @arg<1-5>:		EFI Runtime Service function arguments
+ * @status:		Status of executing EFI Runtime Service
+ */
+struct efi_runtime_work {
+	void *arg1;
+	void *arg2;
+	void *arg3;
+	void *arg4;
+	void *arg5;
+	efi_status_t status;
+	struct work_struct work;
+	enum efi_rts_ids efi_rts_id;
+};
+
+/*
+ * efi_queue_work:	Queue efi_runtime_service() and wait until it's done
+ * @rts:		efi_runtime_service() function identifier
+ * @rts_arg<1-5>:	efi_runtime_service() function arguments
+ *
+ * Accesses to efi_runtime_services() are serialized by a binary
+ * semaphore (efi_runtime_lock) and caller waits until the work is
+ * finished, hence _only_ one work is queued at a time and the queued
+ * work gets flushed.
+ */
+#define efi_queue_work(_rts, _arg1, _arg2, _arg3, _arg4, _arg5)	\
+({									\
+	struct efi_runtime_work efi_rts_work;				\
+	efi_rts_work.status = EFI_ABORTED;				\
+									\
+	INIT_WORK_ONSTACK(&efi_rts_work.work, efi_call_rts);		\
+	efi_rts_work.arg1 = _arg1;					\
+	efi_rts_work.arg2 = _arg2;					\
+	efi_rts_work.arg3 = _arg3;					\
+	efi_rts_work.arg4 = _arg4;					\
+	efi_rts_work.arg5 = _arg5;					\
+	efi_rts_work.efi_rts_id = _rts;					\
+									\
+	/*								\
+	 * queue_work() returns 0 if work was already on queue,         \
+	 * _ideally_ this should never happen.                          \
+	 */								\
+	if (queue_work(efi_rts_wq, &efi_rts_work.work))			\
+		flush_work(&efi_rts_work.work);				\
+	else								\
+		pr_err("Failed to queue work to efi_rts_wq.\n");	\
+									\
+	efi_rts_work.status;						\
+})
+
 void efi_call_virt_check_flags(unsigned long flags, const char *call)
 {
 	unsigned long cur_flags, mismatch;
-- 
2.7.4



* [PATCH V3 3/3] efi: Use efi_rts_wq to invoke EFI Runtime Services
  2018-05-22  3:13 [PATCH V3 0/3] Use efi_rts_wq to invoke EFI Runtime Services Sai Praneeth Prakhya
  2018-05-22  3:13 ` [PATCH V3 1/3] x86/efi: Call efi_delete_dummy_variable() after creating efi_rts_wq Sai Praneeth Prakhya
  2018-05-22  3:13 ` [PATCH V3 2/3] efi: Introduce efi_queue_work() to queue any efi_runtime_service() on efi_rts_wq Sai Praneeth Prakhya
@ 2018-05-22  3:13 ` Sai Praneeth Prakhya
  2 siblings, 0 replies; 6+ messages in thread
From: Sai Praneeth Prakhya @ 2018-05-22  3:13 UTC (permalink / raw)
  To: linux-efi, linux-kernel
  Cc: Sai Praneeth, Lee Chun-Yi, Borislav Petkov, Tony Luck,
	Will Deacon, Dave Hansen, Mark Rutland, Bhupesh Sharma,
	Naresh Bhat, Ricardo Neri, Peter Zijlstra, Ravi Shankar,
	Matt Fleming, Dan Williams, Ard Biesheuvel, Miguel Ojeda

From: Sai Praneeth <sai.praneeth.prakhya@intel.com>

Presently, when a user process requests the kernel to execute any
efi_runtime_service(), kernel switches the page directory (%cr3) from
swapper_pgd to efi_pgd. Other subsystems in the kernel aren't aware of
this switch and might assume that user space is still valid (i.e. that
the user space mappings still point to the process that requested to
run efi_runtime_service()), but in reality it is not so.

A solution for this issue is to use a kthread to run efi_runtime_service().
When a user process requests the kernel to execute any
efi_runtime_service(), kernel queues the work to efi_rts_wq, a kthread
comes along, switches to efi_pgd and executes efi_runtime_service() in
kthread context. Anything that tries to touch user space addresses while
in kthread is terminally broken.

Implementation summary:
-----------------------
1. When user/kernel thread requests to execute efi_runtime_service(),
enqueue work to efi_rts_wq.
2. Caller thread waits until the work is finished because it's dependent
on the return status of efi_runtime_service().

Semantics to pack arguments in efi_runtime_work (has void pointers):
1. If argument is a pointer (of any type), pass it as is.
2. If argument is a value (of any type), address of the value is passed.

Introduce a handler function (called efi_call_rts()) that
1. Understands efi_runtime_work and
2. Invokes the appropriate efi_runtime_service() with the appropriate
arguments

Semantics followed by efi_call_rts() to understand efi_runtime_work:
1. If argument was a pointer, recast it from void pointer to original
pointer type.
2. If argument was a value, recast it from void pointer to original
pointer type and dereference it.
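
The packing and unpacking semantics above can be illustrated with a
small user space sketch. Everything here is hypothetical (the sketch_
names, the two toy services, and the local container_of-style macro);
it only mirrors the idea of void pointer slots plus an identifier, not
the actual efi_runtime_work layout.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_id { SKETCH_GET, SKETCH_SET };

/* Stands in for the embedded work handle the handler receives. */
struct sketch_item { int unused; };

struct sketch_rts_work {
	void *arg1;		/* pointer as is, or address of a value */
	int status;
	struct sketch_item item;
	enum sketch_id id;
};

/* Recover the enclosing structure from a pointer to its member. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static unsigned long sketch_store;

/* Toy "get" service: takes a pointer and fills it in. */
static int sketch_get(unsigned long *out) { *out = sketch_store; return 0; }

/* Toy "set" service: takes a plain value. */
static int sketch_set(unsigned long val) { sketch_store = val; return 0; }

/* Handler: decode the identifier and unpack arguments accordingly. */
static void sketch_call(struct sketch_item *item)
{
	struct sketch_rts_work *w =
		sketch_container_of(item, struct sketch_rts_work, item);

	switch (w->id) {
	case SKETCH_GET:
		/* Pointer argument: cast back to the original type. */
		w->status = sketch_get((unsigned long *)w->arg1);
		break;
	case SKETCH_SET:
		/* Value argument: cast back, then dereference. */
		w->status = sketch_set(*(unsigned long *)w->arg1);
		break;
	default:
		w->status = -1;
	}
}

/* Caller side: a value is packed by passing its address. */
static int sketch_do_set(unsigned long val)
{
	struct sketch_rts_work w = { .arg1 = &val, .status = -1,
				     .id = SKETCH_SET };

	sketch_call(&w.item);
	return w.status;
}

/* Caller side: a pointer is packed as is. */
static int sketch_do_get(unsigned long *out)
{
	struct sketch_rts_work w = { .arg1 = out, .status = -1,
				     .id = SKETCH_GET };

	assert(out != NULL);
	sketch_call(&w.item);
	return w.status;
}
```

Passing the address of a by-value parameter, as sketch_do_set() does, is
safe only because the caller does not return until the handler has
finished; otherwise the pointer would refer to a dead stack slot.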

pstore writes could potentially be invoked in atomic context, and pstore
uses set_variable<>() and query_variable_info<>() to store logs. If we
invoke efi_runtime_services() through efi_rts_wq while in atomic context,
the kernel issues a warning ("scheduling while atomic") and prints a
stack trace. One way to overcome this is to not make the caller process
wait for the worker thread to finish, but this approach breaks pstore,
i.e. the log messages aren't written to efi variables. Hence, pstore
calls efi_runtime_services() without using efi_rts_wq. In other words,
efi_rts_wq is used unconditionally for all the efi_runtime_services()
except set_variable<>() and query_variable_info<>(), which bypass it
when called in atomic context.
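
The choice between the queued path and the direct path can be sketched
as follows. This is a user space illustration with hypothetical names;
sketch_atomic_ctx is a plain flag that stands in for the kernel's
atomic-context check, and the toy firmware call does no real work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag standing in for the kernel's atomic-context check. */
static bool sketch_atomic_ctx;

enum sketch_path { SKETCH_PATH_QUEUED, SKETCH_PATH_DIRECT };

/* Records which path the last request took, for inspection. */
static enum sketch_path sketch_last_path;

/* Toy firmware call: succeeds for non-negative values. */
static int sketch_firmware_set(int value)
{
	return value >= 0 ? 0 : -1;
}

/* Queued path: would sleep waiting for a worker, so never when atomic. */
static int sketch_set_queued(int value)
{
	assert(!sketch_atomic_ctx);
	sketch_last_path = SKETCH_PATH_QUEUED;
	return sketch_firmware_set(value);
}

/* Direct path: calls the service in the caller's own context. */
static int sketch_set_direct(int value)
{
	sketch_last_path = SKETCH_PATH_DIRECT;
	return sketch_firmware_set(value);
}

/* Wrapper: use the queue unless the caller cannot sleep. */
static int sketch_set_variable(int value)
{
	if (!sketch_atomic_ctx)
		return sketch_set_queued(value);
	return sketch_set_direct(value);
}
```

The trade-off is that the direct path keeps the log write working in
atomic context, at the cost of running that one call outside the worker
thread.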

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Cc: Lee Chun-Yi <jlee@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Naresh Bhat <naresh.bhat@linaro.org>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
---
 drivers/firmware/efi/runtime-wrappers.c | 170 ++++++++++++++++++++++++++++----
 1 file changed, 150 insertions(+), 20 deletions(-)

diff --git a/drivers/firmware/efi/runtime-wrappers.c b/drivers/firmware/efi/runtime-wrappers.c
index a9866045ed52..23ff128fcb2f 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -170,13 +170,107 @@ void efi_call_virt_check_flags(unsigned long flags, const char *call)
  */
 static DEFINE_SEMAPHORE(efi_runtime_lock);
 
+/*
+ * Calls the appropriate efi_runtime_service() with the appropriate
+ * arguments.
+ *
+ * Semantics followed by efi_call_rts() to understand efi_runtime_work:
+ * 1. If argument was a pointer, recast it from void pointer to original
+ * pointer type.
+ * 2. If argument was a value, recast it from void pointer to original
+ * pointer type and dereference it.
+ */
+static void efi_call_rts(struct work_struct *work)
+{
+	struct efi_runtime_work *efi_rts_work;
+	void *arg1, *arg2, *arg3, *arg4, *arg5;
+	efi_status_t status = EFI_NOT_FOUND;
+
+	efi_rts_work = container_of(work, struct efi_runtime_work, work);
+	arg1 = efi_rts_work->arg1;
+	arg2 = efi_rts_work->arg2;
+	arg3 = efi_rts_work->arg3;
+	arg4 = efi_rts_work->arg4;
+	arg5 = efi_rts_work->arg5;
+
+	switch (efi_rts_work->efi_rts_id) {
+	case GET_TIME:
+		status = efi_call_virt(get_time, (efi_time_t *)arg1,
+				       (efi_time_cap_t *)arg2);
+		break;
+	case SET_TIME:
+		status = efi_call_virt(set_time, (efi_time_t *)arg1);
+		break;
+	case GET_WAKEUP_TIME:
+		status = efi_call_virt(get_wakeup_time, (efi_bool_t *)arg1,
+				       (efi_bool_t *)arg2, (efi_time_t *)arg3);
+		break;
+	case SET_WAKEUP_TIME:
+		status = efi_call_virt(set_wakeup_time, *(efi_bool_t *)arg1,
+				       (efi_time_t *)arg2);
+		break;
+	case GET_VARIABLE:
+		status = efi_call_virt(get_variable, (efi_char16_t *)arg1,
+				       (efi_guid_t *)arg2, (u32 *)arg3,
+				       (unsigned long *)arg4, (void *)arg5);
+		break;
+	case GET_NEXT_VARIABLE:
+		status = efi_call_virt(get_next_variable, (unsigned long *)arg1,
+				       (efi_char16_t *)arg2,
+				       (efi_guid_t *)arg3);
+		break;
+	case SET_VARIABLE:
+		/* fall through */
+	case SET_VARIABLE_NONBLOCKING:
+		status = efi_call_virt(set_variable, (efi_char16_t *)arg1,
+				       (efi_guid_t *)arg2, *(u32 *)arg3,
+				       *(unsigned long *)arg4, (void *)arg5);
+		break;
+	case QUERY_VARIABLE_INFO:
+		/* fall through */
+	case QUERY_VARIABLE_INFO_NONBLOCKING:
+		status = efi_call_virt(query_variable_info, *(u32 *)arg1,
+				       (u64 *)arg2, (u64 *)arg3, (u64 *)arg4);
+		break;
+	case GET_NEXT_HIGH_MONO_COUNT:
+		status = efi_call_virt(get_next_high_mono_count, (u32 *)arg1);
+		break;
+	case RESET_SYSTEM:
+		__efi_call_virt(reset_system, *(int *)arg1,
+				*(efi_status_t *)arg2,
+				*(unsigned long *)arg3,
+				(efi_char16_t *)arg4);
+		break;
+	case UPDATE_CAPSULE:
+		status = efi_call_virt(update_capsule,
+				       (efi_capsule_header_t **)arg1,
+				       *(unsigned long *)arg2,
+				       *(unsigned long *)arg3);
+		break;
+	case QUERY_CAPSULE_CAPS:
+		status = efi_call_virt(query_capsule_caps,
+				       (efi_capsule_header_t **)arg1,
+				       *(unsigned long *)arg2, (u64 *)arg3,
+				       (int *)arg4);
+		break;
+	default:
+		/*
+		 * Ideally, we should never reach here because a caller of this
+		 * function should have put the right efi_runtime_service()
+		 * function identifier into efi_rts_work->efi_rts_id
+		 */
+		pr_err("Requested executing invalid EFI Runtime Service.\n");
+	}
+	efi_rts_work->status = status;
+}
+
 static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc)
 {
 	efi_status_t status;
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(get_time, tm, tc);
+	status = efi_queue_work(GET_TIME, tm, tc, NULL, NULL, NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -187,7 +281,7 @@ static efi_status_t virt_efi_set_time(efi_time_t *tm)
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(set_time, tm);
+	status = efi_queue_work(SET_TIME, tm, NULL, NULL, NULL, NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -200,7 +294,8 @@ static efi_status_t virt_efi_get_wakeup_time(efi_bool_t *enabled,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(get_wakeup_time, enabled, pending, tm);
+	status = efi_queue_work(GET_WAKEUP_TIME, enabled, pending, tm, NULL,
+				NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -211,7 +306,8 @@ static efi_status_t virt_efi_set_wakeup_time(efi_bool_t enabled, efi_time_t *tm)
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(set_wakeup_time, enabled, tm);
+	status = efi_queue_work(SET_WAKEUP_TIME, &enabled, tm, NULL, NULL,
+				NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -226,8 +322,8 @@ static efi_status_t virt_efi_get_variable(efi_char16_t *name,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(get_variable, name, vendor, attr, data_size,
-			       data);
+	status = efi_queue_work(GET_VARIABLE, name, vendor, attr, data_size,
+				data);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -240,7 +336,8 @@ static efi_status_t virt_efi_get_next_variable(unsigned long *name_size,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(get_next_variable, name_size, name, vendor);
+	status = efi_queue_work(GET_NEXT_VARIABLE, name_size, name, vendor,
+				NULL, NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -255,8 +352,15 @@ static efi_status_t virt_efi_set_variable(efi_char16_t *name,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(set_variable, name, vendor, attr, data_size,
-			       data);
+
+	/* pstore shouldn't use efi_rts_wq while in atomic */
+	if (!in_atomic())
+		status = efi_queue_work(SET_VARIABLE, name, vendor, &attr,
+					&data_size, data);
+	else
+		status = efi_call_virt(set_variable, name, vendor, attr,
+				       data_size, data);
+
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -271,8 +375,14 @@ virt_efi_set_variable_nonblocking(efi_char16_t *name, efi_guid_t *vendor,
 	if (down_trylock(&efi_runtime_lock))
 		return EFI_NOT_READY;
 
-	status = efi_call_virt(set_variable, name, vendor, attr, data_size,
-			       data);
+	/* pstore shouldn't use efi_rts_wq while in atomic */
+	if (!in_atomic())
+		status = efi_queue_work(SET_VARIABLE_NONBLOCKING, name, vendor,
+					&attr, &data_size, data);
+	else
+		status = efi_call_virt(set_variable, name, vendor, attr,
+				       data_size, data);
+
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -290,8 +400,17 @@ static efi_status_t virt_efi_query_variable_info(u32 attr,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(query_variable_info, attr, storage_space,
-			       remaining_space, max_variable_size);
+
+	/* pstore shouldn't use efi_rts_wq while in atomic */
+	if (!in_atomic())
+		status = efi_queue_work(QUERY_VARIABLE_INFO, &attr,
+					storage_space, remaining_space,
+					max_variable_size, NULL);
+	else
+		status = efi_call_virt(query_variable_info, attr,
+				       storage_space, remaining_space,
+				       max_variable_size);
+
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -310,8 +429,16 @@ virt_efi_query_variable_info_nonblocking(u32 attr,
 	if (down_trylock(&efi_runtime_lock))
 		return EFI_NOT_READY;
 
-	status = efi_call_virt(query_variable_info, attr, storage_space,
-			       remaining_space, max_variable_size);
+	/* pstore shouldn't use efi_rts_wq while in atomic */
+	if (!in_atomic())
+		status = efi_queue_work(QUERY_VARIABLE_INFO_NONBLOCKING, &attr,
+					storage_space, remaining_space,
+					max_variable_size, NULL);
+	else
+		status = efi_call_virt(query_variable_info, attr,
+				       storage_space, remaining_space,
+				       max_variable_size);
+
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -322,7 +449,8 @@ static efi_status_t virt_efi_get_next_high_mono_count(u32 *count)
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(get_next_high_mono_count, count);
+	status = efi_queue_work(GET_NEXT_HIGH_MONO_COUNT, count, NULL, NULL,
+				NULL, NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -337,7 +465,8 @@ static void virt_efi_reset_system(int reset_type,
 			"could not get exclusive access to the firmware\n");
 		return;
 	}
-	__efi_call_virt(reset_system, reset_type, status, data_size, data);
+	efi_queue_work(RESET_SYSTEM, &reset_type, &status, &data_size, data,
+		       NULL);
 	up(&efi_runtime_lock);
 }
 
@@ -352,7 +481,8 @@ static efi_status_t virt_efi_update_capsule(efi_capsule_header_t **capsules,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(update_capsule, capsules, count, sg_list);
+	status = efi_queue_work(UPDATE_CAPSULE, capsules, &count, &sg_list,
+				NULL, NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
@@ -369,8 +499,8 @@ static efi_status_t virt_efi_query_capsule_caps(efi_capsule_header_t **capsules,
 
 	if (down_interruptible(&efi_runtime_lock))
 		return EFI_ABORTED;
-	status = efi_call_virt(query_capsule_caps, capsules, count, max_size,
-			       reset_type);
+	status = efi_queue_work(QUERY_CAPSULE_CAPS, capsules, &count,
+				max_size, reset_type, NULL);
 	up(&efi_runtime_lock);
 	return status;
 }
-- 
2.7.4



* Re: [PATCH V3 2/3] efi: Introduce efi_queue_work() to queue any efi_runtime_service() on efi_rts_wq
  2018-05-22  3:13 ` [PATCH V3 2/3] efi: Introduce efi_queue_work() to queue any efi_runtime_service() on efi_rts_wq Sai Praneeth Prakhya
@ 2018-05-22  7:36   ` Peter Zijlstra
  2018-05-22 22:44     ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2018-05-22  7:36 UTC (permalink / raw)
  To: Sai Praneeth Prakhya
  Cc: linux-efi, linux-kernel, Lee Chun-Yi, Borislav Petkov, Tony Luck,
	Will Deacon, Dave Hansen, Mark Rutland, Bhupesh Sharma,
	Naresh Bhat, Ricardo Neri, Ravi Shankar, Matt Fleming,
	Dan Williams, Ard Biesheuvel, Miguel Ojeda

On Mon, May 21, 2018 at 08:13:03PM -0700, Sai Praneeth Prakhya wrote:
> +	/*								\
> +	 * queue_work() returns 0 if work was already on queue,         \
> +	 * _ideally_ this should never happen.                          \
> +	 */								\
> +	if (queue_work(efi_rts_wq, &efi_rts_work.work))			\
> +		flush_work(&efi_rts_work.work);				\

Since you're _always_ going to wait for it, it is _much_ cheaper to put
a completion in your actual work and wait for that.
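
The suggestion above can be modelled in userspace. This is a minimal sketch, not kernel code and not the patch author's implementation: `struct completion`, `init_completion()`, `complete()` and `wait_for_completion()` are re-implemented with pthreads only to mirror the kernel names, and `struct rts_work`, `rts_worker()` and `rts_call()` are hypothetical names invented for the illustration. The point it shows is that the completion travels inside the work item, so the caller sleeps on that one object instead of flushing the work.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion (assumption). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Mark the completion done and wake the waiter. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Sleep until complete() has been called on this completion. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical work item: the completion is part of the request. */
struct rts_work {
	int arg;
	long status;
	struct completion done;
};

/* Runs on the worker thread, standing in for the efi_rts_wq handler. */
static void *rts_worker(void *data)
{
	struct rts_work *w = data;

	w->status = w->arg * 2L;	/* placeholder for the firmware call */
	complete(&w->done);		/* signal the waiting caller */
	return NULL;
}

/* Caller side: hand the work to the worker, then wait on the completion. */
static long rts_call(int arg)
{
	struct rts_work w = { .arg = arg, .status = -1 };
	pthread_t worker;

	init_completion(&w.done);
	if (pthread_create(&worker, NULL, rts_worker, &w))
		return -1;
	wait_for_completion(&w.done);
	pthread_join(worker, NULL);
	return w.status;
}
```

In the kernel the worker would be the workqueue handler and the caller would use the real completion API; the pthread thread here only plays the role of efi_rts_wq.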



* RE: [PATCH V3 2/3] efi: Introduce efi_queue_work() to queue any efi_runtime_service() on efi_rts_wq
  2018-05-22  7:36   ` Peter Zijlstra
@ 2018-05-22 22:44     ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 6+ messages in thread
From: Prakhya, Sai Praneeth @ 2018-05-22 22:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-efi, linux-kernel, Lee Chun-Yi, Borislav Petkov, Luck,
	Tony, Will Deacon, Hansen, Dave, Mark Rutland, Bhupesh Sharma,
	Naresh Bhat, Neri, Ricardo, Shankar, Ravi V, Matt Fleming,
	Williams, Dan J, Ard Biesheuvel, Miguel Ojeda

> On Mon, May 21, 2018 at 08:13:03PM -0700, Sai Praneeth Prakhya wrote:
> > +	/*								\
> > +	 * queue_work() returns 0 if work was already on queue,         \
> > +	 * _ideally_ this should never happen.                          \
> > +	 */								\
> > +	if (queue_work(efi_rts_wq, &efi_rts_work.work))			\
> > +		flush_work(&efi_rts_work.work);				\
> 
> Since you're _always_ going to wait for it, it is _much_ cheaper to put a
> completion in your actual work and wait for that.

Sure! I will change it.
I also noticed that flush_work() in turn calls wait_for_completion().

Will also wait for a couple of days before posting V4.

Regards,
Sai

