linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH] x86: Verify access_ok() context
@ 2016-11-22  9:57 Peter Zijlstra
  2016-11-22 17:28 ` Andy Lutomirski
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-11-22  9:57 UTC (permalink / raw)
  To: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Andy Lutomirski,
	H. Peter Anvin
  Cc: linux-kernel


I recently encountered wreckage because access_ok() was used where it
should not be, add an explicit WARN when access_ok() is used wrongly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/uaccess.h |  7 +++++--
 include/linux/preempt.h        | 21 +++++++++++++--------
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index faf3687f1035..b139c46ba122 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -88,8 +88,11 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
  * checks that the pointer is in the user space range - after calling
  * this function, memory access functions may still return -EFAULT.
  */
-#define access_ok(type, addr, size) \
-	likely(!__range_not_ok(addr, size, user_addr_max()))
+#define access_ok(type, addr, size)					\
+({									\
+	WARN_ON_ONCE(!in_task());					\
+	likely(!__range_not_ok(addr, size, user_addr_max()));		\
+})
 
 /*
  * These are the main single-value transfer routines.  They automatically
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 75e4e30677f1..7eeceac52dea 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -65,19 +65,24 @@
 
 /*
  * Are we doing bottom half or hardware interrupt processing?
- * Are we in a softirq context? Interrupt context?
- * in_softirq - Are we currently processing softirq or have bh disabled?
- * in_serving_softirq - Are we currently processing softirq?
+ *
+ * in_irq()       - We're in (hard) IRQ context
+ * in_softirq()   - We have BH disabled, or are processing softirqs
+ * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
+ * in_serving_softirq() - We're in softirq context
+ * in_nmi()       - We're in NMI context
+ * in_task()	  - We're in task context
+ *
+ * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
+ *       should not be used in new code.
  */
 #define in_irq()		(hardirq_count())
 #define in_softirq()		(softirq_count())
 #define in_interrupt()		(irq_count())
 #define in_serving_softirq()	(softirq_count() & SOFTIRQ_OFFSET)
-
-/*
- * Are we in NMI context?
- */
-#define in_nmi()	(preempt_count() & NMI_MASK)
+#define in_nmi()		(preempt_count() & NMI_MASK)
+#define in_task()		(!(preempt_count() & \
+				   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
 
 /*
  * The preempt_count offset after preempt_disable();

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2016-11-22  9:57 [RFC][PATCH] x86: Verify access_ok() context Peter Zijlstra
@ 2016-11-22 17:28 ` Andy Lutomirski
  2016-11-22 19:37   ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: Andy Lutomirski @ 2016-11-22 17:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	linux-kernel

On Tue, Nov 22, 2016 at 1:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> I recently encountered wreckage because access_ok() was used where it
> should not be, add an explicit WARN when access_ok() is used wrongly.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/include/asm/uaccess.h |  7 +++++--
>  include/linux/preempt.h        | 21 +++++++++++++--------
>  2 files changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index faf3687f1035..b139c46ba122 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -88,8 +88,11 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
>   * checks that the pointer is in the user space range - after calling
>   * this function, memory access functions may still return -EFAULT.
>   */
> -#define access_ok(type, addr, size) \
> -       likely(!__range_not_ok(addr, size, user_addr_max()))
> +#define access_ok(type, addr, size)                                    \
> +({                                                                     \
> +       WARN_ON_ONCE(!in_task());                                       \

Should this be guarded by some debug option?  This may hurt
performance on production systems quite a bit.

> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 75e4e30677f1..7eeceac52dea 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -65,19 +65,24 @@
>
>  /*
>   * Are we doing bottom half or hardware interrupt processing?
> - * Are we in a softirq context? Interrupt context?
> - * in_softirq - Are we currently processing softirq or have bh disabled?
> - * in_serving_softirq - Are we currently processing softirq?
> + *
> + * in_irq()       - We're in (hard) IRQ context
> + * in_softirq()   - We have BH disabled, or are processing softirqs
> + * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
> + * in_serving_softirq() - We're in softirq context
> + * in_nmi()       - We're in NMI context
> + * in_task()     - We're in task context
> + *
> + * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
> + *       should not be used in new code.
>   */
>  #define in_irq()               (hardirq_count())
>  #define in_softirq()           (softirq_count())
>  #define in_interrupt()         (irq_count())
>  #define in_serving_softirq()   (softirq_count() & SOFTIRQ_OFFSET)
> -
> -/*
> - * Are we in NMI context?
> - */
> -#define in_nmi()       (preempt_count() & NMI_MASK)
> +#define in_nmi()               (preempt_count() & NMI_MASK)
> +#define in_task()              (!(preempt_count() & \
> +                                  (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))

LGTM.

For what it's worth, I think ARM recently started saving the address
limit and resetting it to USER_DS on NMI entry.

--Andy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2016-11-22 17:28 ` Andy Lutomirski
@ 2016-11-22 19:37   ` Peter Zijlstra
  2016-11-22 19:42     ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-11-22 19:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	linux-kernel

On Tue, Nov 22, 2016 at 09:28:01AM -0800, Andy Lutomirski wrote:
> On Tue, Nov 22, 2016 at 1:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:

> > +#define access_ok(type, addr, size)                                    \
> > +({                                                                     \
> > +       WARN_ON_ONCE(!in_task());                                       \
> 
> Should this be guarded by some debug option?  This may hurt
> performance on production systems quite a bit.

I suspected something like that; any suitable CONFIG come to mind? I'm
somewhat reluctant to create yet another one for this.

CONFIG_DEBUG_VM seems somehow inappropriate.

> For what it's worth, I think ARM recently started saving the address
> limit and resetting it to USER_DS on NMI entry.

Up to them of course, but doing less on interrupt entry/exit seems
better.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2016-11-22 19:37   ` Peter Zijlstra
@ 2016-11-22 19:42     ` Linus Torvalds
  2016-12-05 10:27       ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2016-11-22 19:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	linux-kernel

On Tue, Nov 22, 2016 at 11:37 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> CONFIG_DEBUG_VM seems somehow inappropriate.

The usual might_fault() logic? That uses

    defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)

(and "might_sleep()" uses just CONFIG_DEBUG_ATOMIC_SLEEP, maybe that's fine).

               Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2016-11-22 19:42     ` Linus Torvalds
@ 2016-12-05 10:27       ` Peter Zijlstra
  2017-01-16 20:27         ` David Smith
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-12-05 10:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	linux-kernel

On Tue, Nov 22, 2016 at 11:42:19AM -0800, Linus Torvalds wrote:
> On Tue, Nov 22, 2016 at 11:37 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > CONFIG_DEBUG_VM seems somehow inappropriate.
> 
> The usual might_fault() logic? That uses
> 
>     defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
> 
> (and "might_sleep()" uses just CONFIG_DEBUG_ATOMIC_SLEEP, maybe that's fine).
> 

Fair enough; something like so then?

---
Subject: x86: Verify access_ok() context
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 22 Nov 2016 10:57:15 +0100

I recently encountered wreckage because access_ok() was used where it
should not be, add an explicit WARN when access_ok() is used wrongly.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/uaccess.h |   13 +++++++++++--
 include/linux/preempt.h        |   21 +++++++++++++--------
 2 files changed, 24 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -68,6 +68,12 @@ static inline bool __chk_range_not_ok(un
 	__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
 })
 
+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+#define ACCESS_OK_WARN()	WARN_ON_ONCE(!in_task())
+#else
+#define ACCESS_OK_WARN()
+#endif
+
 /**
  * access_ok: - Checks if a user space pointer is valid
  * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE.  Note that
@@ -88,8 +94,11 @@ static inline bool __chk_range_not_ok(un
  * checks that the pointer is in the user space range - after calling
  * this function, memory access functions may still return -EFAULT.
  */
-#define access_ok(type, addr, size) \
-	likely(!__range_not_ok(addr, size, user_addr_max()))
+#define access_ok(type, addr, size)					\
+({									\
+	ACCESS_OK_WARN();						\
+	likely(!__range_not_ok(addr, size, user_addr_max()));		\
+})
 
 /*
  * These are the main single-value transfer routines.  They automatically
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -65,19 +65,24 @@
 
 /*
  * Are we doing bottom half or hardware interrupt processing?
- * Are we in a softirq context? Interrupt context?
- * in_softirq - Are we currently processing softirq or have bh disabled?
- * in_serving_softirq - Are we currently processing softirq?
+ *
+ * in_irq()       - We're in (hard) IRQ context
+ * in_softirq()   - We have BH disabled, or are processing softirqs
+ * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
+ * in_serving_softirq() - We're in softirq context
+ * in_nmi()       - We're in NMI context
+ * in_task()	  - We're in task context
+ *
+ * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
+ *       should not be used in new code.
  */
 #define in_irq()		(hardirq_count())
 #define in_softirq()		(softirq_count())
 #define in_interrupt()		(irq_count())
 #define in_serving_softirq()	(softirq_count() & SOFTIRQ_OFFSET)
-
-/*
- * Are we in NMI context?
- */
-#define in_nmi()	(preempt_count() & NMI_MASK)
+#define in_nmi()		(preempt_count() & NMI_MASK)
+#define in_task()		(!(preempt_count() & \
+				   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
 
 /*
  * The preempt_count offset after preempt_disable();

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2016-12-05 10:27       ` Peter Zijlstra
@ 2017-01-16 20:27         ` David Smith
  2017-01-16 21:14           ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: David Smith @ 2017-01-16 20:27 UTC (permalink / raw)
  To: Peter Zijlstra, Linus Torvalds
  Cc: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	linux-kernel, Frank Ch. Eigler

If you call access_ok() with page faulting disabled, you'll still see this new warning. If you put that new access_ok() call in a module that gets loaded/unloaded, you see one warning for every module load, which gets a bit annoying.

How about modifying it like this:

---
From: David Smith <dsmith@redhat.com>
Date: Mon, 16 Jan 2017 14:07:31 -0600
Subject: [PATCH] Relax x86 new access_ok() warning a bit.

Signed-off-by: David Smith <dsmith@redhat.com>
---
 arch/x86/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ea148313570f..0cbd3cca5e7b 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -69,7 +69,7 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
 })
 
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-# define WARN_ON_IN_IRQ()	WARN_ON_ONCE(!in_task())
+# define WARN_ON_IN_IRQ()	WARN_ON_ONCE(!in_task() && !pagefault_disabled())
 #else
 # define WARN_ON_IN_IRQ()
 #endif
-- 


-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-16 20:27         ` David Smith
@ 2017-01-16 21:14           ` Thomas Gleixner
  2017-01-18 22:16             ` David Smith
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2017-01-16 21:14 UTC (permalink / raw)
  To: David Smith
  Cc: Peter Zijlstra, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On Mon, 16 Jan 2017, David Smith wrote:

> If you call access_ok() with page faulting disabled, you'll still see
> this new warning.

And how so? It's just checking for task context. page fault disable/enable
has absolutely nothing to do with that.

> If you put that new access_ok() call in a module that gets
> loaded/unloaded, you see one warning for every module load, which gets a
> bit annoying.

Can you please elaborate where this access_ok() is placed in the module
code?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-16 21:14           ` Thomas Gleixner
@ 2017-01-18 22:16             ` David Smith
  2017-01-19  0:19               ` Andy Lutomirski
  2017-01-19 18:12               ` Thomas Gleixner
  0 siblings, 2 replies; 18+ messages in thread
From: David Smith @ 2017-01-18 22:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On 01/16/2017 03:14 PM, Thomas Gleixner wrote:
> On Mon, 16 Jan 2017, David Smith wrote:
> 
>> If you call access_ok() with page faulting disabled, you'll still see
>> this new warning.
> 
> And how so? It's just checking for task context. page fault disable/enable
> has absolutely nothing to do with that.

True, task context and page fault disable/enable have nothing to do with each other. However, the access_ok() comment states:

 * Context: User context only. This function may sleep if pagefaults are        
 *          enabled.                                                            

That seems to indicate that the function won't sleep if pagefaults are disabled, and thus there is no need for a CONFIG_DEBUG_ATOMIC_SLEEP warning if pagefaults are disabled.

>> If you put that new access_ok() call in a module that gets
>> loaded/unloaded, you see one warning for every module load, which gets a
>> bit annoying.
> 
> Can you please elaborate where this access_ok() is placed in the module
> code?

It doesn't really matter where you place the access_ok() call in the module code. If you call access_ok() in a module, then that module has its own WARN_ON_ONCE() static variable. If access_ok() was a function exported from the kernel, then there would be only one copy of the WARN_ON_ONCE() static variable.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-18 22:16             ` David Smith
@ 2017-01-19  0:19               ` Andy Lutomirski
  2017-01-19 15:37                 ` David Smith
  2017-01-20  8:24                 ` Peter Zijlstra
  2017-01-19 18:12               ` Thomas Gleixner
  1 sibling, 2 replies; 18+ messages in thread
From: Andy Lutomirski @ 2017-01-19  0:19 UTC (permalink / raw)
  To: David Smith
  Cc: Thomas Gleixner, Peter Zijlstra, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On Wed, Jan 18, 2017 at 2:16 PM, David Smith <dsmith@redhat.com> wrote:
> On 01/16/2017 03:14 PM, Thomas Gleixner wrote:
>> On Mon, 16 Jan 2017, David Smith wrote:
>>
>>> If you call access_ok() with page faulting disabled, you'll still see
>>> this new warning.
>>
>> And how so? It's just checking for task context. page fault disable/enable
>> has absolutely nothing to do with that.
>
> True, task context and page fault disable/enable have nothing to do with each other. However, the access_ok() comment states:
>
>  * Context: User context only. This function may sleep if pagefaults are
>  *          enabled.
>
> That seems to indicate that the function won't sleep if pagefaults are disabled, and thus there is no need for a CONFIG_DEBUG_ATOMIC_SLEEP warning if pagefaults are disabled.

ISTM even with pagefault_disable() in play, using access_ok() from,
say, interrupt context is dangerous unless you've first checked that
you're in a task.  But I guess that in_task() would still return
false, e.g. in perf.

>
>>> If you put that new access_ok() call in a module that gets
>>> loaded/unloaded, you see one warning for every module load, which gets a
>>> bit annoying.
>>
>> Can you please elaborate where this access_ok() is placed in the module
>> code?
>
> It doesn't really matter where you place the access_ok() call in the module code. If you call access_ok() in a module, then that module has its own WARN_ON_ONCE() static variable. If access_ok() was a function exported from the kernel, then there would be only one copy of the WARN_ON_ONCE() static variable.

That doesn't seem like such a big deal to me.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19  0:19               ` Andy Lutomirski
@ 2017-01-19 15:37                 ` David Smith
  2017-01-20  8:24                 ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: David Smith @ 2017-01-19 15:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Peter Zijlstra, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On 01/18/2017 06:19 PM, Andy Lutomirski wrote:
> On Wed, Jan 18, 2017 at 2:16 PM, David Smith <dsmith@redhat.com> wrote:
>> On 01/16/2017 03:14 PM, Thomas Gleixner wrote:
>>> On Mon, 16 Jan 2017, David Smith wrote:

... stuff deleted ...

>>>> If you put that new access_ok() call in a module that gets
>>>> loaded/unloaded, you see one warning for every module load, which gets a
>>>> bit annoying.
>>>
>>> Can you please elaborate where this access_ok() is placed in the module
>>> code?
>>
>> It doesn't really matter where you place the access_ok() call in the module
>> code. If you call access_ok() in a module, then that module has its own
>> WARN_ON_ONCE() static variable. If access_ok() was a function exported
>> from the kernel, then there would be only one copy of the WARN_ON_ONCE()
>> static variable.
> 
> That doesn't seem like such a big deal to me.

To be clear here, I'm not suggesting we replace the access_ok() macro
with an exported kernel function. I'm just stating the fact that if you
have several modules that call the access_ok() macro (or one module that
gets loaded/unloaded/reloaded multiple times), each one can produce the
new warning. I'm not seeing a "flood" of these new warnings, but a
steady enough stream of them to be annoying.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-18 22:16             ` David Smith
  2017-01-19  0:19               ` Andy Lutomirski
@ 2017-01-19 18:12               ` Thomas Gleixner
  2017-01-19 20:22                 ` Frank Ch. Eigler
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2017-01-19 18:12 UTC (permalink / raw)
  To: David Smith
  Cc: Peter Zijlstra, Linus Torvalds, Andy Lutomirski, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On Wed, 18 Jan 2017, David Smith wrote:
> On 01/16/2017 03:14 PM, Thomas Gleixner wrote:
> >> If you put that new access_ok() call in a module that gets
> >> loaded/unloaded, you see one warning for every module load, which gets a
> >> bit annoying.
> > 
> > Can you please elaborate where this access_ok() is placed in the module
> > code?
> 
> It doesn't really matter where you place the access_ok() call in the
> module code.

It does matter very much, because the fact that the warning triggers tells
me that it's placed in code which is NOT executed in task context.

> If you call access_ok() in a module, then that module has
> its own WARN_ON_ONCE() static variable. If access_ok() was a function
> exported from the kernel, then there would be only one copy of the
> WARN_ON_ONCE() static variable.

Not a big deal. If access_ok() is called from the wrong context in that
module then this should be fixed and not the warning supressed.

We are not papering over problems.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19 18:12               ` Thomas Gleixner
@ 2017-01-19 20:22                 ` Frank Ch. Eigler
  2017-01-19 20:50                   ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Frank Ch. Eigler @ 2017-01-19 20:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: David Smith, Peter Zijlstra, Linus Torvalds, Andy Lutomirski,
	Ingo Molnar, H. Peter Anvin, linux-kernel

Hi, Thomas -

On Thu, Jan 19, 2017 at 07:12:48PM +0100, Thomas Gleixner wrote:
> [...]
> It does matter very much, because the fact that the warning triggers tells
> me that it's placed in code which is NOT executed in task context.
> [...]
> We are not papering over problems.

Understood.  We were interpreting the comments around access_ok to
mean that the underlying hazard condition was different (stricter)
than in_task().  If the warning could be made to match that hazard
condition more precisely, then safe but non-in_task() callers can use
access_ok() without the warning.

- FChE

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19 20:22                 ` Frank Ch. Eigler
@ 2017-01-19 20:50                   ` Thomas Gleixner
  2017-01-19 21:27                     ` Frank Ch. Eigler
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2017-01-19 20:50 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: David Smith, Peter Zijlstra, Linus Torvalds, Andy Lutomirski,
	Ingo Molnar, H. Peter Anvin, linux-kernel

On Thu, 19 Jan 2017, Frank Ch. Eigler wrote:

> Hi, Thomas -
> 
> On Thu, Jan 19, 2017 at 07:12:48PM +0100, Thomas Gleixner wrote:
> > [...]
> > It does matter very much, because the fact that the warning triggers tells
> > me that it's placed in code which is NOT executed in task context.
> > [...]
> > We are not papering over problems.
> 
> Understood.  We were interpreting the comments around access_ok to
> mean that the underlying hazard condition was different (stricter)
> than in_task().  If the warning could be made to match that hazard
> condition more precisely, then safe but non-in_task() callers can use
> access_ok() without the warning.

Well, if you are not in thread context then the check is pointless:

	__range_not_ok(addr, size, user_addr_max())

and:

#define user_addr_max() (current->thread.addr_limit.seg)

So what guarantees when you are not in context of current, i.e. in thread
context, that the addr/size which is checked against the limits of current
actually belongs to current?

I assume this is about systemtap modules. Can you please explain what you
are trying to achieve? I guess you know that you actually access current,
but then we need a seperate special function and not relaxing of the
checks.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19 20:50                   ` Thomas Gleixner
@ 2017-01-19 21:27                     ` Frank Ch. Eigler
  2017-01-19 22:20                       ` Peter Zijlstra
  2017-01-19 23:04                       ` Thomas Gleixner
  0 siblings, 2 replies; 18+ messages in thread
From: Frank Ch. Eigler @ 2017-01-19 21:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: David Smith, Peter Zijlstra, Linus Torvalds, Andy Lutomirski,
	Ingo Molnar, H. Peter Anvin, linux-kernel

Hi, Thomas -

> Well, if you are not in thread context then the check is pointless:
> 	__range_not_ok(addr, size, user_addr_max())
> and:
> #define user_addr_max() (current->thread.addr_limit.seg)
> 
> So what guarantees when you are not in context of current, i.e. in thread
> context, that the addr/size which is checked against the limits of current
> actually belongs to current?

We're probably in task context in that there is a valid current(), but
running with preemption and/or interrupts and/or pagefaults disabled
at that point, so in_task() objects.  Think of it like from a kprobes
handler callback, except maybe more temporary preemption blocking.


> I assume this is about systemtap modules. Can you please explain
> what you are trying to achieve? I guess you know that you actually
> access current, but then we need a seperate special function and not
> relaxing of the checks.

This part is used in a part of the runtime that is a userspace
analogue of probe_kernel_address(), where we're given a potential
userspace address.  We would like to quickly test whether it's even
plausible as a userspace address, before doing a (pagefault-disabled)
trial fetch/store to it.


- FChE

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19 21:27                     ` Frank Ch. Eigler
@ 2017-01-19 22:20                       ` Peter Zijlstra
  2017-01-19 23:04                       ` Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2017-01-19 22:20 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Thomas Gleixner, David Smith, Linus Torvalds, Andy Lutomirski,
	Ingo Molnar, H. Peter Anvin, linux-kernel

On Thu, Jan 19, 2017 at 04:27:18PM -0500, Frank Ch. Eigler wrote:
> Hi, Thomas -
> 
> > Well, if you are not in thread context then the check is pointless:
> > 	__range_not_ok(addr, size, user_addr_max())
> > and:
> > #define user_addr_max() (current->thread.addr_limit.seg)
> > 
> > So what guarantees when you are not in context of current, i.e. in thread
> > context, that the addr/size which is checked against the limits of current
> > actually belongs to current?
> 
> We're probably in task context in that there is a valid current(), but
> running with preemption and/or interrupts and/or pagefaults disabled
> at that point, so in_task() objects.  Think of it like from a kprobes
> handler callback, except maybe more temporary preemption blocking.

#define in_task()               (!(preempt_count() & \ 
                                   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) 

So it doesn't care about preempt_disable(), and it doesn't care about
local_irq_disable(), it also doesn't care about local_bh_disable().

What it does care about are nmi_enter(), irq_enter() and __do_softirq().

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19 21:27                     ` Frank Ch. Eigler
  2017-01-19 22:20                       ` Peter Zijlstra
@ 2017-01-19 23:04                       ` Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas Gleixner @ 2017-01-19 23:04 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: David Smith, Peter Zijlstra, Linus Torvalds, Andy Lutomirski,
	Ingo Molnar, H. Peter Anvin, linux-kernel

Frank.

On Thu, 19 Jan 2017, Frank Ch. Eigler wrote:
> > Well, if you are not in thread context then the check is pointless:
> > 	__range_not_ok(addr, size, user_addr_max())
> > and:
> > #define user_addr_max() (current->thread.addr_limit.seg)
> > 
> > So what guarantees when you are not in context of current, i.e. in thread
> > context, that the addr/size which is checked against the limits of current
> > actually belongs to current?
> 
> We're probably in task context in that there is a valid current(), but

current is always accessible no matter in which context you are - task,
softirq, hardirq, nmi ...

> running with preemption and/or interrupts and/or pagefaults disabled
> at that point, so in_task() objects.

As Peter explained, neither preempt disable nor interrupt disable not
pagefault disabled have any influence on in_task(). It merily checks the
context: !in_softirq() && !in_hardirq() && !in_nmi().

So that warning happens definitely not from task context.

Care to share the code?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-19  0:19               ` Andy Lutomirski
  2017-01-19 15:37                 ` David Smith
@ 2017-01-20  8:24                 ` Peter Zijlstra
  2017-01-20  8:50                   ` Thomas Gleixner
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2017-01-20  8:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Smith, Thomas Gleixner, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On Wed, Jan 18, 2017 at 04:19:47PM -0800, Andy Lutomirski wrote:
> ISTM even with pagefault_disable() in play, using access_ok() from,
> say, interrupt context is dangerous unless you've first checked that
> you're in a task.  But I guess that in_task() would still return
> false, e.g. in perf.

The test was created exactly because perf was using access_ok()
_wrongly_. See commit: ae31fe51a3cc ("perf/x86: Restore TASK_SIZE check
on frame pointer").

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][PATCH] x86: Verify access_ok() context
  2017-01-20  8:24                 ` Peter Zijlstra
@ 2017-01-20  8:50                   ` Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Gleixner @ 2017-01-20  8:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, David Smith, Linus Torvalds, Ingo Molnar,
	H. Peter Anvin, linux-kernel, Frank Ch. Eigler

On Fri, 20 Jan 2017, Peter Zijlstra wrote:

> On Wed, Jan 18, 2017 at 04:19:47PM -0800, Andy Lutomirski wrote:
> > ISTM even with pagefault_disable() in play, using access_ok() from,
> > say, interrupt context is dangerous unless you've first checked that
> > you're in a task.  But I guess that in_task() would still return
> > false, e.g. in perf.
> 
> The test was created exactly because perf was using access_ok()
> _wrongly_. See commit: ae31fe51a3cc ("perf/x86: Restore TASK_SIZE check
> on frame pointer").

If you validate a user space address against current outside the task
context, then what guarantees that this user space address belongs to
current? Nothing!

Sure, there are interrupts like breakpoints, etc. where we exactly know
that the address which we are looking at belongs to current, because the
code accesses soemthing which belongs exactly to that breakpoint. And in
these cases we need a check which is designed specifically for that case.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-01-20  8:50 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-22  9:57 [RFC][PATCH] x86: Verify access_ok() context Peter Zijlstra
2016-11-22 17:28 ` Andy Lutomirski
2016-11-22 19:37   ` Peter Zijlstra
2016-11-22 19:42     ` Linus Torvalds
2016-12-05 10:27       ` Peter Zijlstra
2017-01-16 20:27         ` David Smith
2017-01-16 21:14           ` Thomas Gleixner
2017-01-18 22:16             ` David Smith
2017-01-19  0:19               ` Andy Lutomirski
2017-01-19 15:37                 ` David Smith
2017-01-20  8:24                 ` Peter Zijlstra
2017-01-20  8:50                   ` Thomas Gleixner
2017-01-19 18:12               ` Thomas Gleixner
2017-01-19 20:22                 ` Frank Ch. Eigler
2017-01-19 20:50                   ` Thomas Gleixner
2017-01-19 21:27                     ` Frank Ch. Eigler
2017-01-19 22:20                       ` Peter Zijlstra
2017-01-19 23:04                       ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).