* [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

As Peter asked me to also do the decoupling in one shot, this is
the new series.

I recently discovered that might_fault() doesn't call might_sleep()
anymore. Therefore bugs like:

  spin_lock(&lock);
  rc = copy_to_user(...);
  spin_unlock(&lock);

would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code had been
changed to avoid false positives for code like:

  pagefault_disable();
  rc = copy_to_user(...);
  pagefault_enable();

Here, the caller wants to deal with failures.

If we schedule while holding a spinlock, bad things can happen.
Preemption is disabled, but we still get scheduled away - on s390 this
can lead to spinlocks never getting unlocked (as the unlocking code
checks for the cpu id that locked the spinlock), therefore resulting
in a deadlock.

Until now, pagefault_(dis|en)able() simply modified the preempt count,
thereby telling the pagefault handler that the context is atomic and
sleeping is disallowed.

With !CONFIG_PREEMPT_COUNT, the preempt count does not exist, so
preempt_disable() as well as pagefault_disable() is a NOP.
in_atomic() checks the preempt count, so we can't detect atomic
context on systems without preemption.
That is also mentioned in the comment of in_atomic():

  "WARNING: this macro cannot always detect atomic context; in particular,
   it cannot know about held spinlocks in non-preemptible kernels."

!CONFIG_PREEMPT_COUNT is automatically the case with !CONFIG_PREEMPT
and !CONFIG_DEBUG_ATOMIC_SLEEP, so such a system cannot detect
disabled pagefaults.

All fault handlers currently rely on in_atomic() to check for disabled
pagefaults.

This series therefore does 2 things:


1. Decouple pagefault_disable() from preempt_disable()

pagefault_(dis|en)able() now modifies its own counter and doesn't touch
preemption anymore. The fault handlers now only check for
disabled pagefaults.

I checked (hopefully) every caller of pagefault_disable() to see
whether it implicitly relies on preempt_disable(), and if so, added
explicit preempt_disable() calls. I hope I haven't missed any cases.

I didn't check whether any users of preempt_disable() rely on it
disabling pagefaults. My assumption is that such code is broken
either way (on non-preemptible systems), as pagefaults would only be
disabled with CONFIG_PREEMPT_COUNT.

I also thought about leaving the in_atomic() check inside the
fault handlers in addition to the new check. In my opinion that
can only hide bugs that would happen with !CONFIG_PREEMPT_COUNT.

We could even add a warning at that point, like:
  WARN_ONCE(in_atomic() && !pagefault_disabled())

But most of these cases should be found with the following part.


2. Reenable might_sleep() checks for might_fault()

As we can now decide if we really have pagefaults disabled,
we can reenable the might_sleep() check in might_fault().

So this should now work:

  spin_lock(&lock); /* may also be omitted */
  pagefault_disable();
  rc = copy_to_user(...);
  pagefault_enable();
  spin_unlock(&lock);

And this should report a warning again:

  spin_lock(&lock);
  rc = copy_to_user(...);
  spin_unlock(&lock);


Cross compiled on powerpc, arm, sparc, sparc64, arm64, x86_64, i386,
mips, alpha, ia64, xtensa, m68k, microblaze.

Tested on s390x.

Any feedback very welcome!

Thanks!


David Hildenbrand (15):
  uaccess: count pagefault_disable() levels in pagefault_disabled
  mm, uaccess: trigger might_sleep() in might_fault() with disabled
    pagefaults
  uaccess: clarify that uaccess may only sleep if pagefaults are enabled
  mm: explicitly disable/enable preemption in kmap_atomic_*
  mips: kmap_coherent relies on disabled preemption
  mm: use pagefault_disabled() to check for disabled pagefaults
  drm/i915: use pagefault_disabled() to check for disabled pagefaults
  futex: UP futex_atomic_op_inuser() relies on disabled preemption
  futex: UP futex_atomic_cmpxchg_inatomic() relies on disabled
    preemption
  arm/futex: UP futex_atomic_cmpxchg_inatomic() relies on disabled
    preemption
  arm/futex: UP futex_atomic_op_inuser() relies on disabled preemption
  futex: clarify that preemption doesn't have to be disabled
  powerpc: enable_kernel_altivec() requires disabled preemption
  mips: properly lock access to the fpu
  uaccess: decouple preemption from the pagefault logic

 arch/alpha/mm/fault.c                      |  5 ++--
 arch/arc/include/asm/futex.h               | 10 +++----
 arch/arc/mm/fault.c                        |  2 +-
 arch/arm/include/asm/futex.h               | 13 +++++++--
 arch/arm/mm/fault.c                        |  2 +-
 arch/arm/mm/highmem.c                      |  3 ++
 arch/arm64/include/asm/futex.h             |  4 +--
 arch/arm64/mm/fault.c                      |  2 +-
 arch/avr32/include/asm/uaccess.h           | 12 +++++---
 arch/avr32/mm/fault.c                      |  4 +--
 arch/cris/mm/fault.c                       |  4 +--
 arch/frv/mm/fault.c                        |  4 +--
 arch/frv/mm/highmem.c                      |  2 ++
 arch/hexagon/include/asm/uaccess.h         |  3 +-
 arch/ia64/mm/fault.c                       |  4 +--
 arch/m32r/include/asm/uaccess.h            | 30 +++++++++++++-------
 arch/m32r/mm/fault.c                       |  4 +--
 arch/m68k/mm/fault.c                       |  4 +--
 arch/metag/mm/fault.c                      |  2 +-
 arch/metag/mm/highmem.c                    |  4 ++-
 arch/microblaze/include/asm/uaccess.h      |  6 ++--
 arch/microblaze/mm/fault.c                 |  6 ++--
 arch/microblaze/mm/highmem.c               |  4 ++-
 arch/mips/include/asm/uaccess.h            | 45 ++++++++++++++++++++----------
 arch/mips/kernel/signal-common.h           |  9 ++----
 arch/mips/mm/fault.c                       |  4 +--
 arch/mips/mm/highmem.c                     |  5 +++-
 arch/mips/mm/init.c                        |  2 ++
 arch/mn10300/include/asm/highmem.h         |  3 ++
 arch/mn10300/mm/fault.c                    |  4 +--
 arch/nios2/mm/fault.c                      |  2 +-
 arch/parisc/include/asm/cacheflush.h       |  2 ++
 arch/parisc/kernel/traps.c                 |  4 +--
 arch/parisc/mm/fault.c                     |  4 +--
 arch/powerpc/lib/vmx-helper.c              | 11 ++++----
 arch/powerpc/mm/fault.c                    |  9 +++---
 arch/powerpc/mm/highmem.c                  |  4 ++-
 arch/s390/include/asm/uaccess.h            | 15 ++++++----
 arch/s390/mm/fault.c                       |  2 +-
 arch/score/include/asm/uaccess.h           | 15 ++++++----
 arch/score/mm/fault.c                      |  3 +-
 arch/sh/mm/fault.c                         |  3 +-
 arch/sparc/mm/fault_32.c                   |  4 +--
 arch/sparc/mm/fault_64.c                   |  4 +--
 arch/sparc/mm/highmem.c                    |  4 ++-
 arch/sparc/mm/init_64.c                    |  2 +-
 arch/tile/include/asm/uaccess.h            | 18 ++++++++----
 arch/tile/mm/fault.c                       |  4 +--
 arch/tile/mm/highmem.c                     |  3 +-
 arch/um/kernel/trap.c                      |  4 +--
 arch/unicore32/mm/fault.c                  |  2 +-
 arch/x86/include/asm/uaccess.h             | 15 ++++++----
 arch/x86/include/asm/uaccess_32.h          |  6 ++--
 arch/x86/lib/usercopy_32.c                 |  6 ++--
 arch/x86/mm/fault.c                        |  5 ++--
 arch/x86/mm/highmem_32.c                   |  3 +-
 arch/x86/mm/iomap_32.c                     |  2 ++
 arch/xtensa/mm/fault.c                     |  4 +--
 arch/xtensa/mm/highmem.c                   |  2 ++
 drivers/crypto/vmx/aes.c                   |  8 +++++-
 drivers/crypto/vmx/aes_cbc.c               |  6 ++++
 drivers/crypto/vmx/ghash.c                 |  8 ++++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  3 +-
 include/asm-generic/futex.h                |  7 +++--
 include/linux/highmem.h                    |  2 ++
 include/linux/io-mapping.h                 |  2 ++
 include/linux/kernel.h                     |  3 +-
 include/linux/sched.h                      |  1 +
 include/linux/uaccess.h                    | 36 +++++++++++++++---------
 kernel/fork.c                              |  3 ++
 lib/strnlen_user.c                         |  6 ++--
 mm/memory.c                                | 18 ++++--------
 72 files changed, 302 insertions(+), 169 deletions(-)

-- 
2.1.4




* [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Until now, pagefault_disable()/pagefault_enable() used the preempt
count to track whether we are in an environment with pagefaults
disabled (which can be queried via in_atomic()).

This patch introduces a separate counter in task_struct to count the
level of pagefault_disable() calls. We'll keep manipulating the preempt
count to retain compatibility with existing pagefault handlers.

It is now possible to verify whether we are in a pagefault_disable()
environment by calling pagefault_disabled(). In contrast to
in_atomic(), it will not be influenced by
preempt_enable()/preempt_disable().

This patch is based on a patch from Ingo Molnar.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/sched.h   |  1 +
 include/linux/uaccess.h | 36 +++++++++++++++++++++++++++++-------
 kernel/fork.c           |  3 +++
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ede26ca..75778cb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1724,6 +1724,7 @@ struct task_struct {
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 	unsigned long	task_state_change;
 #endif
+	int pagefault_disabled;
 };
 
 /* Future-safe accessor for struct task_struct's cpus_allowed. */
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index ecd3319..23290cc 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -2,20 +2,36 @@
 #define __LINUX_UACCESS_H__
 
 #include <linux/preempt.h>
+#include <linux/sched.h>
 #include <asm/uaccess.h>
 
+static __always_inline void pagefault_disabled_inc(void)
+{
+	current->pagefault_disabled++;
+}
+
+static __always_inline void pagefault_disabled_dec(void)
+{
+	current->pagefault_disabled--;
+	WARN_ON(current->pagefault_disabled < 0);
+}
+
 /*
- * These routines enable/disable the pagefault handler in that
- * it will not take any locks and go straight to the fixup table.
+ * These routines enable/disable the pagefault handler. If disabled, it will
+ * not take any locks and go straight to the fixup table.
+ *
+ * We increase the preempt and the pagefault count, to be able to distinguish
+ * whether we run in simple atomic context or in a real pagefault_disable()
+ * context.
+ *
+ * For now, after pagefault_disabled() has been called, we run in atomic
+ * context. User access methods will not sleep.
  *
- * They have great resemblance to the preempt_disable/enable calls
- * and in fact they are identical; this is because currently there is
- * no other way to make the pagefault handlers do this. So we do
- * disable preemption but we don't necessarily care about that.
  */
 static inline void pagefault_disable(void)
 {
 	preempt_count_inc();
+	pagefault_disabled_inc();
 	/*
 	 * make sure to have issued the store before a pagefault
 	 * can hit.
@@ -25,18 +41,24 @@ static inline void pagefault_disable(void)
 
 static inline void pagefault_enable(void)
 {
-#ifndef CONFIG_PREEMPT
 	/*
 	 * make sure to issue those last loads/stores before enabling
 	 * the pagefault handler again.
 	 */
 	barrier();
+	pagefault_disabled_dec();
+#ifndef CONFIG_PREEMPT
 	preempt_count_dec();
 #else
 	preempt_enable();
 #endif
 }
 
+/*
+ * Is the pagefault handler disabled? If so, user access methods will not sleep.
+ */
+#define pagefault_disabled() (current->pagefault_disabled != 0)
+
 #ifndef ARCH_HAS_NOCACHE_UACCESS
 
 static inline unsigned long __copy_from_user_inatomic_nocache(void *to,
diff --git a/kernel/fork.c b/kernel/fork.c
index 03c1eaa..c344d27 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1396,6 +1396,9 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	p->hardirq_context = 0;
 	p->softirq_context = 0;
 #endif
+
+	p->pagefault_disabled = 0;
+
 #ifdef CONFIG_LOCKDEP
 	p->lockdep_depth = 0; /* no locks held yet */
 	p->curr_chain_key = 0;
-- 
2.1.4




* [PATCH RFC 02/15] mm, uaccess: trigger might_sleep() in might_fault() with disabled pagefaults
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()") removed might_sleep() checks for all user access
code (that uses might_fault()).

The reason was to suppress false "sleep in atomic" warnings in the
following scenario:
    pagefault_disable();
    rc = copy_to_user(...);
    pagefault_enable();

This is valid, as pagefault_disable() increments the preempt counter
and therefore disables the pagefault handler. copy_to_user() will not
sleep; it returns an error code if a page is not available.

However, as all might_sleep() checks are removed,
CONFIG_DEBUG_ATOMIC_SLEEP would no longer detect the following scenario:
    spin_lock(&lock);
    rc = copy_to_user(...);
    spin_unlock(&lock);

If the kernel is compiled with preemption turned on, preempt_disable()
will make in_atomic() detect disabled preemption. The fault handler would
correctly never sleep on user access.
However, with preemption turned off, preempt_disable() is usually a NOP
(with !CONFIG_PREEMPT_COUNT), so in_atomic() can detect neither
disabled preemption nor disabled pagefaults. The fault handler
could sleep.
We really want to enable CONFIG_DEBUG_ATOMIC_SLEEP checks for user access
functions again, otherwise we can end up with horrible deadlocks.

Root of all evil is that pagefault_disable() acts almost as
preempt_disable(), depending on preemption being turned on/off.

As we now have pagefault_disabled(), we can use it to distinguish
whether user access functions might sleep.

Convert might_fault() into a macro that calls __might_fault(), to
allow proper file + line messages in case of a might_sleep() warning.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/kernel.h |  3 ++-
 mm/memory.c            | 18 ++++++------------
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 3a5b48e..060dd7b 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -244,7 +244,8 @@ static inline u32 reciprocal_scale(u32 val, u32 ep_ro)
 
 #if defined(CONFIG_MMU) && \
 	(defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP))
-void might_fault(void);
+#define might_fault() __might_fault(__FILE__, __LINE__)
+void __might_fault(const char *file, int line);
 #else
 static inline void might_fault(void) { }
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index d1fa0c1..2ddd80a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3737,7 +3737,7 @@ void print_vma_addr(char *prefix, unsigned long ip)
 }
 
 #if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
-void might_fault(void)
+void __might_fault(const char *file, int line)
 {
 	/*
 	 * Some code (nfs/sunrpc) uses socket ops on kernel memory while
@@ -3747,21 +3747,15 @@ void might_fault(void)
 	 */
 	if (segment_eq(get_fs(), KERNEL_DS))
 		return;
-
-	/*
-	 * it would be nicer only to annotate paths which are not under
-	 * pagefault_disable, however that requires a larger audit and
-	 * providing helpers like get_user_atomic.
-	 */
-	if (in_atomic())
+	if (pagefault_disabled())
 		return;
-
-	__might_sleep(__FILE__, __LINE__, 0);
-
+	__might_sleep(file, line, 0);
+#if defined(CONFIG_DEBUG_ATOMIC_SLEEP)
 	if (current->mm)
 		might_lock_read(&current->mm->mmap_sem);
+#endif
 }
-EXPORT_SYMBOL(might_fault);
+EXPORT_SYMBOL(__might_fault);
 #endif
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
-- 
2.1.4



* [PATCH RFC 02/15] mm, uaccess: trigger might_sleep() in might_fault() with disabled pagefaults
@ 2015-05-06 17:50   ` David Hildenbrand
  0 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()") removed might_sleep() checks for all user access
code (that uses might_fault()).

The reason was to disable wrong "sleep in atomic" warnings in the
following scenario:
    pagefault_disable()
    rc = copy_to_user(...)
    pagefault_enable()

Which is valid, as pagefault_disable() increments the preempt counter
and therefore disables the pagefault handler. copy_to_user() will not
sleep and return an error code if a page is not available.

However, as all might_sleep() checks are removed,
CONFIG_DEBUG_ATOMIC_SLEEP would no longer detect the following scenario:
    spin_lock(&lock);
    rc = copy_to_user(...)
    spin_unlock(&lock)

If the kernel is compiled with preemption turned on, the implied
preempt_disable() lets in_atomic() detect the atomic context, and the
fault handler correctly never sleeps on user access.
With preemption turned off, however, preempt_disable() is usually a NOP
(with !CONFIG_PREEMPT_COUNT), so in_atomic() can detect neither disabled
preemption nor disabled pagefaults, and the fault handler could sleep.
We really want to enable CONFIG_DEBUG_ATOMIC_SLEEP checks for user access
functions again, otherwise we can end up with horrible deadlocks.

Root of all evil is that pagefault_disable() acts almost as
preempt_disable(), depending on preemption being turned on/off.

As we now have pagefault_disabled(), we can use it to distinguish
whether user access functions might sleep.

Convert might_fault() into a macro that calls __might_fault(), to
allow proper file + line messages in case of a might_sleep() warning.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/kernel.h |  3 ++-
 mm/memory.c            | 18 ++++++------------
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 3a5b48e..060dd7b 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -244,7 +244,8 @@ static inline u32 reciprocal_scale(u32 val, u32 ep_ro)
 
 #if defined(CONFIG_MMU) && \
 	(defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP))
-void might_fault(void);
+#define might_fault() __might_fault(__FILE__, __LINE__)
+void __might_fault(const char *file, int line);
 #else
 static inline void might_fault(void) { }
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index d1fa0c1..2ddd80a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3737,7 +3737,7 @@ void print_vma_addr(char *prefix, unsigned long ip)
 }
 
 #if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
-void might_fault(void)
+void __might_fault(const char *file, int line)
 {
 	/*
 	 * Some code (nfs/sunrpc) uses socket ops on kernel memory while
@@ -3747,21 +3747,15 @@ void might_fault(void)
 	 */
 	if (segment_eq(get_fs(), KERNEL_DS))
 		return;
-
-	/*
-	 * it would be nicer only to annotate paths which are not under
-	 * pagefault_disable, however that requires a larger audit and
-	 * providing helpers like get_user_atomic.
-	 */
-	if (in_atomic())
+	if (pagefault_disabled())
 		return;
-
-	__might_sleep(__FILE__, __LINE__, 0);
-
+	__might_sleep(file, line, 0);
+#if defined(CONFIG_DEBUG_ATOMIC_SLEEP)
 	if (current->mm)
 		might_lock_read(&current->mm->mmap_sem);
+#endif
 }
-EXPORT_SYMBOL(might_fault);
+EXPORT_SYMBOL(__might_fault);
 #endif
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 03/15] uaccess: clarify that uaccess may only sleep if pagefaults are enabled
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

In general, non-atomic variants of user access functions must not sleep
if pagefaults are disabled.

Let's update all relevant comments in uaccess code. This also reflects
the might_sleep() checks in might_fault().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/avr32/include/asm/uaccess.h      | 12 ++++++----
 arch/hexagon/include/asm/uaccess.h    |  3 ++-
 arch/m32r/include/asm/uaccess.h       | 30 +++++++++++++++--------
 arch/microblaze/include/asm/uaccess.h |  6 +++--
 arch/mips/include/asm/uaccess.h       | 45 +++++++++++++++++++++++------------
 arch/s390/include/asm/uaccess.h       | 15 ++++++++----
 arch/score/include/asm/uaccess.h      | 15 ++++++++----
 arch/tile/include/asm/uaccess.h       | 18 +++++++++-----
 arch/x86/include/asm/uaccess.h        | 15 ++++++++----
 arch/x86/include/asm/uaccess_32.h     |  6 +++--
 arch/x86/lib/usercopy_32.c            |  6 +++--
 lib/strnlen_user.c                    |  6 +++--
 12 files changed, 118 insertions(+), 59 deletions(-)

diff --git a/arch/avr32/include/asm/uaccess.h b/arch/avr32/include/asm/uaccess.h
index a46f7cf..68cf638 100644
--- a/arch/avr32/include/asm/uaccess.h
+++ b/arch/avr32/include/asm/uaccess.h
@@ -97,7 +97,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -116,7 +117,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -136,7 +138,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -158,7 +161,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index e4127e4..f000a38 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -36,7 +36,8 @@
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
diff --git a/arch/m32r/include/asm/uaccess.h b/arch/m32r/include/asm/uaccess.h
index 71adff2..cac7014 100644
--- a/arch/m32r/include/asm/uaccess.h
+++ b/arch/m32r/include/asm/uaccess.h
@@ -91,7 +91,8 @@ static inline void set_fs(mm_segment_t s)
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -155,7 +156,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -175,7 +177,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -194,7 +197,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -274,7 +278,8 @@ do {									\
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -568,7 +573,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -588,7 +594,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -606,7 +613,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -626,7 +634,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -677,7 +686,8 @@ unsigned long clear_user(void __user *mem, unsigned long len);
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 62942fd..331b0d3 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -178,7 +178,8 @@ extern long __user_bad(void);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -290,7 +291,8 @@ extern long __user_bad(void);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index bf8b324..9722357 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -103,7 +103,8 @@ extern u64 __ua_limit;
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -138,7 +139,8 @@ extern u64 __ua_limit;
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -157,7 +159,8 @@ extern u64 __ua_limit;
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -177,7 +180,8 @@ extern u64 __ua_limit;
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -199,7 +203,8 @@ extern u64 __ua_limit;
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -498,7 +503,8 @@ extern void __put_user_unknown(void);
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -517,7 +523,8 @@ extern void __put_user_unknown(void);
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -537,7 +544,8 @@ extern void __put_user_unknown(void);
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -559,7 +567,8 @@ extern void __put_user_unknown(void);
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -815,7 +824,8 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -888,7 +898,8 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -1075,7 +1086,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -1107,7 +1119,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -1329,7 +1342,8 @@ strncpy_from_user(char *__to, const char __user *__from, long __len)
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
@@ -1398,7 +1412,8 @@ static inline long __strnlen_user(const char __user *s, long n)
  * strnlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index d64a7a6..9dd4cc4 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -98,7 +98,8 @@ static inline unsigned long extable_fixup(const struct exception_table_entry *x)
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -118,7 +119,8 @@ unsigned long __must_check __copy_from_user(void *to, const void __user *from,
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -264,7 +266,8 @@ int __get_user_bad(void) __attribute__((noreturn));
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -290,7 +293,8 @@ __compiletime_warning("copy_from_user() buffer size is not provably correct")
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -348,7 +352,8 @@ static inline unsigned long strnlen_user(const char __user *src, unsigned long n
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/score/include/asm/uaccess.h b/arch/score/include/asm/uaccess.h
index ab66ddd..20a3591 100644
--- a/arch/score/include/asm/uaccess.h
+++ b/arch/score/include/asm/uaccess.h
@@ -36,7 +36,8 @@
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -61,7 +62,8 @@
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -79,7 +81,8 @@
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -98,7 +101,8 @@
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -119,7 +123,8 @@
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/tile/include/asm/uaccess.h b/arch/tile/include/asm/uaccess.h
index d598c11..0a9c4265 100644
--- a/arch/tile/include/asm/uaccess.h
+++ b/arch/tile/include/asm/uaccess.h
@@ -85,7 +85,8 @@ int __range_ok(unsigned long addr, unsigned long size);
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -199,7 +200,8 @@ extern int __get_user_bad(void)
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -281,7 +283,8 @@ extern int __put_user_bad(void)
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -337,7 +340,8 @@ extern int __put_user_bad(void)
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -373,7 +377,8 @@ copy_to_user(void __user *to, const void *from, unsigned long n)
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -444,7 +449,8 @@ static inline unsigned long __must_check copy_from_user(void *to,
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to user space.  Caller must check
  * the specified blocks with access_ok() before calling this function.
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ace9dec..a8df874 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -74,7 +74,8 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -145,7 +146,8 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -240,7 +242,8 @@ extern void __put_user_8(void);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -455,7 +458,8 @@ struct __large_struct { unsigned long buf[100]; };
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -479,7 +483,8 @@ struct __large_struct { unsigned long buf[100]; };
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 0ed5504..f5dcb52 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -74,7 +74,8 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -121,7 +122,8 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index e2f5e21..91d93b9 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -647,7 +647,8 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -668,7 +669,8 @@ EXPORT_SYMBOL(_copy_to_user);
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index a28df52..36c15a2 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -84,7 +84,8 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
  * @str: The string to measure.
  * @count: Maximum count (including NUL character)
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
@@ -113,7 +114,8 @@ EXPORT_SYMBOL(strnlen_user);
  * strlen_user: - Get the size of a user string INCLUDING final NUL.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

@@ -116,7 +117,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -136,7 +138,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -158,7 +161,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index e4127e4..f000a38 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -36,7 +36,8 @@
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
diff --git a/arch/m32r/include/asm/uaccess.h b/arch/m32r/include/asm/uaccess.h
index 71adff2..cac7014 100644
--- a/arch/m32r/include/asm/uaccess.h
+++ b/arch/m32r/include/asm/uaccess.h
@@ -91,7 +91,8 @@ static inline void set_fs(mm_segment_t s)
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -155,7 +156,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -175,7 +177,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -194,7 +197,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -274,7 +278,8 @@ do {									\
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -568,7 +573,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -588,7 +594,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -606,7 +613,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -626,7 +634,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -677,7 +686,8 @@ unsigned long clear_user(void __user *mem, unsigned long len);
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 62942fd..331b0d3 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -178,7 +178,8 @@ extern long __user_bad(void);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -290,7 +291,8 @@ extern long __user_bad(void);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index bf8b324..9722357 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -103,7 +103,8 @@ extern u64 __ua_limit;
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -138,7 +139,8 @@ extern u64 __ua_limit;
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -157,7 +159,8 @@ extern u64 __ua_limit;
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -177,7 +180,8 @@ extern u64 __ua_limit;
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -199,7 +203,8 @@ extern u64 __ua_limit;
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -498,7 +503,8 @@ extern void __put_user_unknown(void);
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -517,7 +523,8 @@ extern void __put_user_unknown(void);
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -537,7 +544,8 @@ extern void __put_user_unknown(void);
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -559,7 +567,8 @@ extern void __put_user_unknown(void);
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -815,7 +824,8 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -888,7 +898,8 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -1075,7 +1086,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -1107,7 +1119,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -1329,7 +1342,8 @@ strncpy_from_user(char *__to, const char __user *__from, long __len)
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
@@ -1398,7 +1412,8 @@ static inline long __strnlen_user(const char __user *s, long n)
  * strnlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index d64a7a6..9dd4cc4 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -98,7 +98,8 @@ static inline unsigned long extable_fixup(const struct exception_table_entry *x)
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -118,7 +119,8 @@ unsigned long __must_check __copy_from_user(void *to, const void __user *from,
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -264,7 +266,8 @@ int __get_user_bad(void) __attribute__((noreturn));
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -290,7 +293,8 @@ __compiletime_warning("copy_from_user() buffer size is not provably correct")
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -348,7 +352,8 @@ static inline unsigned long strnlen_user(const char __user *src, unsigned long n
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/score/include/asm/uaccess.h b/arch/score/include/asm/uaccess.h
index ab66ddd..20a3591 100644
--- a/arch/score/include/asm/uaccess.h
+++ b/arch/score/include/asm/uaccess.h
@@ -36,7 +36,8 @@
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -61,7 +62,8 @@
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -79,7 +81,8 @@
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -98,7 +101,8 @@
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -119,7 +123,8 @@
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/tile/include/asm/uaccess.h b/arch/tile/include/asm/uaccess.h
index d598c11..0a9c4265 100644
--- a/arch/tile/include/asm/uaccess.h
+++ b/arch/tile/include/asm/uaccess.h
@@ -85,7 +85,8 @@ int __range_ok(unsigned long addr, unsigned long size);
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -199,7 +200,8 @@ extern int __get_user_bad(void)
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -281,7 +283,8 @@ extern int __put_user_bad(void)
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -337,7 +340,8 @@ extern int __put_user_bad(void)
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -373,7 +377,8 @@ copy_to_user(void __user *to, const void *from, unsigned long n)
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -444,7 +449,8 @@ static inline unsigned long __must_check copy_from_user(void *to,
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to user space.  Caller must check
  * the specified blocks with access_ok() before calling this function.
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ace9dec..a8df874 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -74,7 +74,8 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -145,7 +146,8 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -240,7 +242,8 @@ extern void __put_user_8(void);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -455,7 +458,8 @@ struct __large_struct { unsigned long buf[100]; };
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -479,7 +483,8 @@ struct __large_struct { unsigned long buf[100]; };
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 0ed5504..f5dcb52 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -74,7 +74,8 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -121,7 +122,8 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index e2f5e21..91d93b9 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -647,7 +647,8 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -668,7 +669,8 @@ EXPORT_SYMBOL(_copy_to_user);
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Copy data from user space to kernel space.
  *
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index a28df52..36c15a2 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -84,7 +84,8 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
  * @str: The string to measure.
  * @count: Maximum count (including NUL character)
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
@@ -113,7 +114,8 @@ EXPORT_SYMBOL(strnlen_user);
  * strlen_user: - Get the size of a user string INCLUDING final NUL.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are
+ *          enabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
-- 
2.1.4



* [PATCH RFC 04/15] mm: explicitly disable/enable preemption in kmap_atomic_*
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

The existing code relies on pagefault_disable() implicitly disabling
preemption, so that no schedule will happen between kmap_atomic() and
kunmap_atomic().

Let's make this explicit, to prepare for pagefault_disable() not
touching preemption anymore.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/arm/mm/highmem.c                | 3 +++
 arch/frv/mm/highmem.c                | 2 ++
 arch/metag/mm/highmem.c              | 4 +++-
 arch/microblaze/mm/highmem.c         | 4 +++-
 arch/mips/mm/highmem.c               | 5 ++++-
 arch/mn10300/include/asm/highmem.h   | 3 +++
 arch/parisc/include/asm/cacheflush.h | 2 ++
 arch/powerpc/mm/highmem.c            | 4 +++-
 arch/sparc/mm/highmem.c              | 4 +++-
 arch/tile/mm/highmem.c               | 3 ++-
 arch/x86/mm/highmem_32.c             | 3 ++-
 arch/x86/mm/iomap_32.c               | 2 ++
 arch/xtensa/mm/highmem.c             | 2 ++
 include/linux/highmem.h              | 2 ++
 include/linux/io-mapping.h           | 2 ++
 15 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index b98895d..ee8dfa7 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -59,6 +59,7 @@ void *kmap_atomic(struct page *page)
 	void *kmap;
 	int type;
 
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -121,6 +122,7 @@ void __kunmap_atomic(void *kvaddr)
 		kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)]));
 	}
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
 
@@ -130,6 +132,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
 	int idx, type;
 	struct page *page = pfn_to_page(pfn);
 
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
diff --git a/arch/frv/mm/highmem.c b/arch/frv/mm/highmem.c
index bed9a9b..785344b 100644
--- a/arch/frv/mm/highmem.c
+++ b/arch/frv/mm/highmem.c
@@ -42,6 +42,7 @@ void *kmap_atomic(struct page *page)
 	unsigned long paddr;
 	int type;
 
+	preempt_disable();
 	pagefault_disable();
 	type = kmap_atomic_idx_push();
 	paddr = page_to_phys(page);
@@ -85,5 +86,6 @@ void __kunmap_atomic(void *kvaddr)
 	}
 	kmap_atomic_idx_pop();
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/metag/mm/highmem.c b/arch/metag/mm/highmem.c
index d71f621..807f1b1 100644
--- a/arch/metag/mm/highmem.c
+++ b/arch/metag/mm/highmem.c
@@ -43,7 +43,7 @@ void *kmap_atomic(struct page *page)
 	unsigned long vaddr;
 	int type;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -82,6 +82,7 @@ void __kunmap_atomic(void *kvaddr)
 	}
 
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
 
@@ -95,6 +96,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
 	unsigned long vaddr;
 	int type;
 
+	preempt_disable();
 	pagefault_disable();
 
 	type = kmap_atomic_idx_push();
diff --git a/arch/microblaze/mm/highmem.c b/arch/microblaze/mm/highmem.c
index 5a92576..2fcc5a5 100644
--- a/arch/microblaze/mm/highmem.c
+++ b/arch/microblaze/mm/highmem.c
@@ -37,7 +37,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	unsigned long vaddr;
 	int idx, type;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -63,6 +63,7 @@ void __kunmap_atomic(void *kvaddr)
 
 	if (vaddr < __fix_to_virt(FIX_KMAP_END)) {
 		pagefault_enable();
+		preempt_enable();
 		return;
 	}
 
@@ -84,5 +85,6 @@ void __kunmap_atomic(void *kvaddr)
 #endif
 	kmap_atomic_idx_pop();
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/mips/mm/highmem.c b/arch/mips/mm/highmem.c
index da815d2..11661cb 100644
--- a/arch/mips/mm/highmem.c
+++ b/arch/mips/mm/highmem.c
@@ -47,7 +47,7 @@ void *kmap_atomic(struct page *page)
 	unsigned long vaddr;
 	int idx, type;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -72,6 +72,7 @@ void __kunmap_atomic(void *kvaddr)
 
 	if (vaddr < FIXADDR_START) { // FIXME
 		pagefault_enable();
+		preempt_enable();
 		return;
 	}
 
@@ -92,6 +93,7 @@ void __kunmap_atomic(void *kvaddr)
 #endif
 	kmap_atomic_idx_pop();
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
 
@@ -104,6 +106,7 @@ void *kmap_atomic_pfn(unsigned long pfn)
 	unsigned long vaddr;
 	int idx, type;
 
+	preempt_disable();
 	pagefault_disable();
 
 	type = kmap_atomic_idx_push();
diff --git a/arch/mn10300/include/asm/highmem.h b/arch/mn10300/include/asm/highmem.h
index 2fbbe4d..1ddea5a 100644
--- a/arch/mn10300/include/asm/highmem.h
+++ b/arch/mn10300/include/asm/highmem.h
@@ -75,6 +75,7 @@ static inline void *kmap_atomic(struct page *page)
 	unsigned long vaddr;
 	int idx, type;
 
+	preempt_disable();
 	pagefault_disable();
 	if (page < highmem_start_page)
 		return page_address(page);
@@ -98,6 +99,7 @@ static inline void __kunmap_atomic(unsigned long vaddr)
 
 	if (vaddr < FIXADDR_START) { /* FIXME */
 		pagefault_enable();
+		preempt_enable();
 		return;
 	}
 
@@ -122,6 +124,7 @@ static inline void __kunmap_atomic(unsigned long vaddr)
 
 	kmap_atomic_idx_pop();
 	pagefault_enable();
+	preempt_enable();
 }
 #endif /* __KERNEL__ */
 
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index de65f66..ec2df4b 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -142,6 +142,7 @@ static inline void kunmap(struct page *page)
 
 static inline void *kmap_atomic(struct page *page)
 {
+	preempt_disable();
 	pagefault_disable();
 	return page_address(page);
 }
@@ -150,6 +151,7 @@ static inline void __kunmap_atomic(void *addr)
 {
 	flush_kernel_dcache_page_addr(addr);
 	pagefault_enable();
+	preempt_enable();
 }
 
 #define kmap_atomic_prot(page, prot)	kmap_atomic(page)
diff --git a/arch/powerpc/mm/highmem.c b/arch/powerpc/mm/highmem.c
index e7450bd..e292c8a 100644
--- a/arch/powerpc/mm/highmem.c
+++ b/arch/powerpc/mm/highmem.c
@@ -34,7 +34,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	unsigned long vaddr;
 	int idx, type;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -59,6 +59,7 @@ void __kunmap_atomic(void *kvaddr)
 
 	if (vaddr < __fix_to_virt(FIX_KMAP_END)) {
 		pagefault_enable();
+		preempt_enable();
 		return;
 	}
 
@@ -82,5 +83,6 @@ void __kunmap_atomic(void *kvaddr)
 
 	kmap_atomic_idx_pop();
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c
index 449f864..a454ec5 100644
--- a/arch/sparc/mm/highmem.c
+++ b/arch/sparc/mm/highmem.c
@@ -53,7 +53,7 @@ void *kmap_atomic(struct page *page)
 	unsigned long vaddr;
 	long idx, type;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -91,6 +91,7 @@ void __kunmap_atomic(void *kvaddr)
 
 	if (vaddr < FIXADDR_START) { // FIXME
 		pagefault_enable();
+		preempt_enable();
 		return;
 	}
 
@@ -126,5 +127,6 @@ void __kunmap_atomic(void *kvaddr)
 
 	kmap_atomic_idx_pop();
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/tile/mm/highmem.c b/arch/tile/mm/highmem.c
index 6aa2f26..fcd5450 100644
--- a/arch/tile/mm/highmem.c
+++ b/arch/tile/mm/highmem.c
@@ -201,7 +201,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	int idx, type;
 	pte_t *pte;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 
 	/* Avoid icache flushes by disallowing atomic executable mappings. */
@@ -259,6 +259,7 @@ void __kunmap_atomic(void *kvaddr)
 	}
 
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
 
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 4500142..eecb207a 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -35,7 +35,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	unsigned long vaddr;
 	int idx, type;
 
-	/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
+	preempt_disable();
 	pagefault_disable();
 
 	if (!PageHighMem(page))
@@ -100,6 +100,7 @@ void __kunmap_atomic(void *kvaddr)
 #endif
 
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
 
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 9ca35fc..2b7ece0 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -59,6 +59,7 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)
 	unsigned long vaddr;
 	int idx, type;
 
+	preempt_disable();
 	pagefault_disable();
 
 	type = kmap_atomic_idx_push();
@@ -117,5 +118,6 @@ iounmap_atomic(void __iomem *kvaddr)
 	}
 
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(iounmap_atomic);
diff --git a/arch/xtensa/mm/highmem.c b/arch/xtensa/mm/highmem.c
index 8cfb71e..184cead 100644
--- a/arch/xtensa/mm/highmem.c
+++ b/arch/xtensa/mm/highmem.c
@@ -42,6 +42,7 @@ void *kmap_atomic(struct page *page)
 	enum fixed_addresses idx;
 	unsigned long vaddr;
 
+	preempt_disable();
 	pagefault_disable();
 	if (!PageHighMem(page))
 		return page_address(page);
@@ -79,6 +80,7 @@ void __kunmap_atomic(void *kvaddr)
 	}
 
 	pagefault_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
 
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 9286a46..6aefcd0 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -65,6 +65,7 @@ static inline void kunmap(struct page *page)
 
 static inline void *kmap_atomic(struct page *page)
 {
+	preempt_disable();
 	pagefault_disable();
 	return page_address(page);
 }
@@ -73,6 +74,7 @@ static inline void *kmap_atomic(struct page *page)
 static inline void __kunmap_atomic(void *addr)
 {
 	pagefault_enable();
+	preempt_enable();
 }
 
 #define kmap_atomic_pfn(pfn)	kmap_atomic(pfn_to_page(pfn))
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 657fab4..c27dde7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -141,6 +141,7 @@ static inline void __iomem *
 io_mapping_map_atomic_wc(struct io_mapping *mapping,
 			 unsigned long offset)
 {
+	preempt_disable();
 	pagefault_disable();
 	return ((char __force __iomem *) mapping) + offset;
 }
@@ -149,6 +150,7 @@ static inline void
 io_mapping_unmap_atomic(void __iomem *vaddr)
 {
 	pagefault_enable();
+	preempt_enable();
 }
 
 /* Non-atomic map/unmap */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 05/15] mips: kmap_coherent relies on disabled preemption
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

k(un)map_coherent relies on pagefault_disable() to also disable
preemption.

Let's make this explicit, to prepare for pagefault_disable() not
touching preemption anymore.

This patch is based on a patch by Yang Shi on the -rt tree:
"k{un}map_coherent are just called when cpu_has_dc_aliases == 1 with VIPT
cache. However, actually, the most modern MIPS processors have PIPT dcache
without dcache alias issue. In such case, k{un}map_atomic will be called
with preempt enabled."

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/mips/mm/init.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index faa5c98..198a314 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -90,6 +90,7 @@ static void *__kmap_pgprot(struct page *page, unsigned long addr, pgprot_t prot)
 
 	BUG_ON(Page_dcache_dirty(page));
 
+	preempt_disable();
 	pagefault_disable();
 	idx = (addr >> PAGE_SHIFT) & (FIX_N_COLOURS - 1);
 	idx += in_interrupt() ? FIX_N_COLOURS : 0;
@@ -152,6 +153,7 @@ void kunmap_coherent(void)
 	write_c0_entryhi(old_ctx);
 	local_irq_restore(flags);
 	pagefault_enable();
+	preempt_enable();
 }
 
 void copy_user_highpage(struct page *to, struct page *from,
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 06/15] mm: use pagefault_disabled() to check for disabled pagefaults
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Now that the pagefault disabled counter is in place, we can replace
the in_atomic() checks by pagefault_disabled() checks.

If a fault happens while in_atomic(), it can only be a bug in the code.
Either a pagefault_disable() call is missing or the specific *in_atomic()
variants of uaccess should have been used. That's what might_fault()
checks when calling might_sleep().

pagefault_disable() is defined in linux/uaccess.h, so let's properly
add that include to all relevant files.

This patch is based on a patch from Thomas Gleixner.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/alpha/mm/fault.c      | 5 ++---
 arch/arc/mm/fault.c        | 2 +-
 arch/arm/mm/fault.c        | 2 +-
 arch/arm64/mm/fault.c      | 2 +-
 arch/avr32/mm/fault.c      | 4 ++--
 arch/cris/mm/fault.c       | 4 ++--
 arch/frv/mm/fault.c        | 4 ++--
 arch/ia64/mm/fault.c       | 4 ++--
 arch/m32r/mm/fault.c       | 4 ++--
 arch/m68k/mm/fault.c       | 4 ++--
 arch/metag/mm/fault.c      | 2 +-
 arch/microblaze/mm/fault.c | 6 +++---
 arch/mips/mm/fault.c       | 4 ++--
 arch/mn10300/mm/fault.c    | 4 ++--
 arch/nios2/mm/fault.c      | 2 +-
 arch/parisc/kernel/traps.c | 4 ++--
 arch/parisc/mm/fault.c     | 4 ++--
 arch/powerpc/mm/fault.c    | 9 +++++----
 arch/s390/mm/fault.c       | 2 +-
 arch/score/mm/fault.c      | 3 ++-
 arch/sh/mm/fault.c         | 3 ++-
 arch/sparc/mm/fault_32.c   | 4 ++--
 arch/sparc/mm/fault_64.c   | 4 ++--
 arch/sparc/mm/init_64.c    | 2 +-
 arch/tile/mm/fault.c       | 4 ++--
 arch/um/kernel/trap.c      | 4 ++--
 arch/unicore32/mm/fault.c  | 2 +-
 arch/x86/mm/fault.c        | 5 +++--
 arch/xtensa/mm/fault.c     | 4 ++--
 29 files changed, 55 insertions(+), 52 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 9d0ac09..f702a27 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -23,8 +23,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
-
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 
 extern void die_if_kernel(char *,struct pt_regs *,long, unsigned long *);
 
@@ -107,7 +106,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 
 	/* If we're in an interrupt context, or have no user context,
 	   we must not take the fault.  */
-	if (!mm || in_atomic())
+	if (!mm || pagefault_disabled())
 		goto no_context;
 
 #ifdef CONFIG_ALPHA_LARGE_VMALLOC
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 6a2e006..2f632a0 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -86,7 +86,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 6333d9c..841a1b6 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -276,7 +276,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 96da131..d93c129 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -211,7 +211,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 * If we're in an interrupt or have no user context, we must not take
 	 * the fault.
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c
index d223a8b..09f7ced 100644
--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -14,11 +14,11 @@
 #include <linux/pagemap.h>
 #include <linux/kdebug.h>
 #include <linux/kprobes.h>
+#include <linux/uaccess.h>
 
 #include <asm/mmu_context.h>
 #include <asm/sysreg.h>
 #include <asm/tlb.h>
-#include <asm/uaccess.h>
 
 #ifdef CONFIG_KPROBES
 static inline int notify_page_fault(struct pt_regs *regs, int trap)
@@ -81,7 +81,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user context, we must
 	 * not take the fault...
 	 */
-	if (in_atomic() || !mm || regs->sr & SYSREG_BIT(GM))
+	if (pagefault_disabled() || !mm || regs->sr & SYSREG_BIT(GM))
 		goto no_context;
 
 	local_irq_enable();
diff --git a/arch/cris/mm/fault.c b/arch/cris/mm/fault.c
index 83f12f2..41f83a7 100644
--- a/arch/cris/mm/fault.c
+++ b/arch/cris/mm/fault.c
@@ -8,7 +8,7 @@
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/wait.h>
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 #include <arch/system.h>
 
 extern int find_fixup_code(struct pt_regs *);
@@ -113,7 +113,7 @@ do_page_fault(unsigned long address, struct pt_regs *regs,
 	 * user context, we must not take the fault.
 	 */
 
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/frv/mm/fault.c b/arch/frv/mm/fault.c
index ec4917d..d1f67d1 100644
--- a/arch/frv/mm/fault.c
+++ b/arch/frv/mm/fault.c
@@ -19,9 +19,9 @@
 #include <linux/kernel.h>
 #include <linux/ptrace.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
-#include <asm/uaccess.h>
 #include <asm/gdb-stub.h>
 
 /*****************************************************************************/
@@ -78,7 +78,7 @@ asmlinkage void do_page_fault(int datammu, unsigned long esr0, unsigned long ear
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(__frame))
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index ba5ba7a..5a636aa 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -11,10 +11,10 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/prefetch.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/uaccess.h>
 
 extern int die(char *, struct pt_regs *, long);
 
@@ -96,7 +96,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	/*
 	 * If we're in an interrupt or have no user context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
diff --git a/arch/m32r/mm/fault.c b/arch/m32r/mm/fault.c
index e3d4d48901..4e1892e 100644
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -24,9 +24,9 @@
 #include <linux/vt_kern.h>		/* For unblank_screen() */
 #include <linux/highmem.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/m32r.h>
-#include <asm/uaccess.h>
 #include <asm/hardirq.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -114,7 +114,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * If we're in an interrupt or have no user context or are running in an
 	 * atomic region then we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (error_code & ACE_USERMODE)
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index b2f04ae..156309b0 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -10,10 +10,10 @@
 #include <linux/ptrace.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/setup.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 extern void die_if_kernel(char *, struct pt_regs *, long);
@@ -81,7 +81,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/metag/mm/fault.c b/arch/metag/mm/fault.c
index 2de5dc6..713700d 100644
--- a/arch/metag/mm/fault.c
+++ b/arch/metag/mm/fault.c
@@ -105,7 +105,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	mm = tsk->mm;
 
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index d46a5eb..6a75a42 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -107,13 +107,13 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	if ((error_code & 0x13) == 0x13 || (error_code & 0x11) == 0x11)
 		is_write = 0;
 
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(pagefault_disabled() || !mm)) {
 		if (kernel_mode(regs))
 			goto bad_area_nosemaphore;
 
-		/* in_atomic() in user mode is really bad,
+		/* pagefault_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
-		pr_emerg("Page fault in user mode with in_atomic(), mm = %p\n",
+		pr_emerg("Page fault in user mode with pagefault_disabled(), mm = %p\n",
 									mm);
 		pr_emerg("r15 = %lx  MSR = %lx\n",
 		       regs->r15, regs->msr);
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 7ff8637..ccba212 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -21,10 +21,10 @@
 #include <linux/module.h>
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
+#include <linux/uaccess.h>
 
 #include <asm/branch.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/ptrace.h>
 #include <asm/highmem.h>		/* For VMALLOC_END */
 #include <linux/kdebug.h>
@@ -94,7 +94,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/mn10300/mm/fault.c b/arch/mn10300/mm/fault.c
index 0c2cc5d..1d441a8 100644
--- a/arch/mn10300/mm/fault.c
+++ b/arch/mn10300/mm/fault.c
@@ -23,8 +23,8 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/vt_kern.h>		/* For unblank_screen() */
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 #include <asm/hardirq.h>
 #include <asm/cpu-regs.h>
@@ -168,7 +168,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long fault_code,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if ((fault_code & MMUFCR_xFC_ACCESS) == MMUFCR_xFC_ACCESS_USR)
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 0c9b6af..fdae091 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -77,7 +77,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 47ee620..e72483c 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -26,9 +26,9 @@
 #include <linux/console.h>
 #include <linux/bug.h>
 #include <linux/ratelimit.h>
+#include <linux/uaccess.h>
 
 #include <asm/assembly.h>
-#include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/traps.h>
@@ -800,7 +800,7 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
 	     * unless pagefault_disable() was called before.
 	     */
 
-	    if (fault_space == 0 && !in_atomic())
+	    if (fault_space == 0 && !pagefault_disabled())
 	    {
 		pdc_chassis_send_status(PDC_CHASSIS_DIRECT_PANIC);
 		parisc_terminate("Kernel Fault", regs, code, fault_address);
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index e5120e6..15503ad 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -15,8 +15,8 @@
 #include <linux/sched.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/traps.h>
 
 /* Various important other fields */
@@ -207,7 +207,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 	int fault;
 	unsigned int flags;
 
-	if (in_atomic())
+	if (pagefault_disabled())
 		goto no_context;
 
 	tsk = current;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b396868..04658b5 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -33,13 +33,13 @@
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 #include <linux/hugetlb.h>
+#include <linux/uaccess.h>
 
 #include <asm/firmware.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
 #include <asm/debug.h>
@@ -272,15 +272,16 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (!arch_irq_disabled_regs(regs))
 		local_irq_enable();
 
-	if (in_atomic() || mm == NULL) {
+	if (pagefault_disabled() || mm == NULL) {
 		if (!user_mode(regs)) {
 			rc = SIGSEGV;
 			goto bail;
 		}
-		/* in_atomic() in user mode is really bad,
+		/* pagefault_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
 		printk(KERN_EMERG "Page fault in user mode with "
-		       "in_atomic() = %d mm = %p\n", in_atomic(), mm);
+		       "pagefault_disabled() = %d mm = %p\n",
+		       pagefault_disabled(), mm);
 		printk(KERN_EMERG "NIP = %lx  MSR = %lx\n",
 		       regs->nip, regs->msr);
 		die("Weird page fault", regs, SIGSEGV);
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 76515bc..a3dcc10 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -399,7 +399,7 @@ static inline int do_exception(struct pt_regs *regs, int access)
 	 * user context.
 	 */
 	fault = VM_FAULT_BADCONTEXT;
-	if (unlikely(!user_space_fault(regs) || in_atomic() || !mm))
+	if (unlikely(!user_space_fault(regs) || pagefault_disabled() || !mm))
 		goto out;
 
 	address = trans_exc_code & __FAIL_ADDR_MASK;
diff --git a/arch/score/mm/fault.c b/arch/score/mm/fault.c
index 6860beb..37a6c2e 100644
--- a/arch/score/mm/fault.c
+++ b/arch/score/mm/fault.c
@@ -34,6 +34,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/ptrace.h>
+#include <linux/uaccess.h>
 
 /*
  * This routine handles page faults.  It determines the address,
@@ -73,7 +74,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
 	* If we're in an interrupt or have no user
 	* context, we must not take the fault..
 	*/
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index a58fec9..f7a83a9 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 #include <asm/io_trapped.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -440,7 +441,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 	 * If we're in an interrupt, have no user context or are running
 	 * in an atomic region then we must not take the fault:
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(pagefault_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 70d8171..c399e7b 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -21,6 +21,7 @@
 #include <linux/perf_event.h>
 #include <linux/interrupt.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -29,7 +30,6 @@
 #include <asm/setup.h>
 #include <asm/smp.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 
 #include "mm_32.h"
 
@@ -196,7 +196,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 4798232..0ff9e50 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -22,12 +22,12 @@
 #include <linux/kdebug.h>
 #include <linux/percpu.h>
 #include <linux/context_tracking.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/openprom.h>
 #include <asm/oplib.h>
-#include <asm/uaccess.h>
 #include <asm/asi.h>
 #include <asm/lsu.h>
 #include <asm/sections.h>
@@ -330,7 +330,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto intr_or_no_mm;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 4ca0d6b..795ea96 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2706,7 +2706,7 @@ void hugetlb_setup(struct pt_regs *regs)
 	struct mm_struct *mm = current->mm;
 	struct tsb_config *tp;
 
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		const struct exception_table_entry *entry;
 
 		entry = search_exception_tables(regs->tpc);
diff --git a/arch/tile/mm/fault.c b/arch/tile/mm/fault.c
index e83cc99..3f4f58d 100644
--- a/arch/tile/mm/fault.c
+++ b/arch/tile/mm/fault.c
@@ -354,9 +354,9 @@ static int handle_page_fault(struct pt_regs *regs,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running in an
-	 * atomic region then we must not take the fault.
+	 * region with pagefaults disabled then we must not take the fault.
 	 */
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		vma = NULL;  /* happy compiler */
 		goto bad_area_nosemaphore;
 	}
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 8e4daf4..7317c87 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -35,10 +35,10 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 	*code_out = SEGV_MAPERR;
 
 	/*
-	 * If the fault was during atomic operation, don't take the fault, just
+	 * If the fault was with pagefaults disabled, don't take the fault, just
 	 * fail.
 	 */
-	if (in_atomic())
+	if (pagefault_disabled())
 		goto out_nosemaphore;
 
 	if (is_user)
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 0dc922d..b3cf385 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -218,7 +218,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 181c53b..be394b1 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -13,6 +13,7 @@
 #include <linux/hugetlb.h>		/* hstate_index_to_shift	*/
 #include <linux/prefetch.h>		/* prefetchw			*/
 #include <linux/context_tracking.h>	/* exception_enter(), ...	*/
+#include <linux/uaccess.h>		/* pagefault_disabled()		*/
 
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
 #include <asm/pgalloc.h>		/* pgd_*(), ...			*/
@@ -1126,9 +1127,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running
-	 * in an atomic region then we must not take the fault:
+	 * in a region with pagefaults disabled then we must not take the fault
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(pagefault_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 9e3571a..99ba449 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -15,10 +15,10 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
 #include <asm/hardirq.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 DEFINE_PER_CPU(unsigned long, asid_cache) = ASID_USER_FIRST;
@@ -57,7 +57,7 @@ void do_page_fault(struct pt_regs *regs)
 	/* If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		bad_page_fault(regs, address, SIGSEGV);
 		return;
 	}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 06/15] mm: use pagefault_disabled() to check for disabled pagefaults
@ 2015-05-06 17:50   ` David Hildenbrand
  0 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Now that the pagefault-disabled counter is in place, we can replace
the in_atomic() checks with pagefault_disabled() checks.

If a fault happens while in_atomic() but with pagefaults enabled, it can
only be a bug in the code: either a pagefault_disable() call is missing,
or one of the *_inatomic() uaccess variants should have been used. That's
what might_fault() checks for when calling might_sleep().

pagefault_disable() is defined in linux/uaccess.h, so let's properly
add that include to all relevant files.

This patch is based on a patch from Thomas Gleixner.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/alpha/mm/fault.c      | 5 ++---
 arch/arc/mm/fault.c        | 2 +-
 arch/arm/mm/fault.c        | 2 +-
 arch/arm64/mm/fault.c      | 2 +-
 arch/avr32/mm/fault.c      | 4 ++--
 arch/cris/mm/fault.c       | 4 ++--
 arch/frv/mm/fault.c        | 4 ++--
 arch/ia64/mm/fault.c       | 4 ++--
 arch/m32r/mm/fault.c       | 4 ++--
 arch/m68k/mm/fault.c       | 4 ++--
 arch/metag/mm/fault.c      | 2 +-
 arch/microblaze/mm/fault.c | 6 +++---
 arch/mips/mm/fault.c       | 4 ++--
 arch/mn10300/mm/fault.c    | 4 ++--
 arch/nios2/mm/fault.c      | 2 +-
 arch/parisc/kernel/traps.c | 4 ++--
 arch/parisc/mm/fault.c     | 4 ++--
 arch/powerpc/mm/fault.c    | 9 +++++----
 arch/s390/mm/fault.c       | 2 +-
 arch/score/mm/fault.c      | 3 ++-
 arch/sh/mm/fault.c         | 3 ++-
 arch/sparc/mm/fault_32.c   | 4 ++--
 arch/sparc/mm/fault_64.c   | 4 ++--
 arch/sparc/mm/init_64.c    | 2 +-
 arch/tile/mm/fault.c       | 4 ++--
 arch/um/kernel/trap.c      | 4 ++--
 arch/unicore32/mm/fault.c  | 2 +-
 arch/x86/mm/fault.c        | 5 +++--
 arch/xtensa/mm/fault.c     | 4 ++--
 29 files changed, 55 insertions(+), 52 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 9d0ac09..f702a27 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -23,8 +23,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
-
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 
 extern void die_if_kernel(char *,struct pt_regs *,long, unsigned long *);
 
@@ -107,7 +106,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 
 	/* If we're in an interrupt context, or have no user context,
 	   we must not take the fault.  */
-	if (!mm || in_atomic())
+	if (!mm || pagefault_disabled())
 		goto no_context;
 
 #ifdef CONFIG_ALPHA_LARGE_VMALLOC
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 6a2e006..2f632a0 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -86,7 +86,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 6333d9c..841a1b6 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -276,7 +276,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 96da131..d93c129 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -211,7 +211,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 * If we're in an interrupt or have no user context, we must not take
 	 * the fault.
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c
index d223a8b..09f7ced 100644
--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -14,11 +14,11 @@
 #include <linux/pagemap.h>
 #include <linux/kdebug.h>
 #include <linux/kprobes.h>
+#include <linux/uaccess.h>
 
 #include <asm/mmu_context.h>
 #include <asm/sysreg.h>
 #include <asm/tlb.h>
-#include <asm/uaccess.h>
 
 #ifdef CONFIG_KPROBES
 static inline int notify_page_fault(struct pt_regs *regs, int trap)
@@ -81,7 +81,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user context, we must
 	 * not take the fault...
 	 */
-	if (in_atomic() || !mm || regs->sr & SYSREG_BIT(GM))
+	if (pagefault_disabled() || !mm || regs->sr & SYSREG_BIT(GM))
 		goto no_context;
 
 	local_irq_enable();
diff --git a/arch/cris/mm/fault.c b/arch/cris/mm/fault.c
index 83f12f2..41f83a7 100644
--- a/arch/cris/mm/fault.c
+++ b/arch/cris/mm/fault.c
@@ -8,7 +8,7 @@
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/wait.h>
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 #include <arch/system.h>
 
 extern int find_fixup_code(struct pt_regs *);
@@ -113,7 +113,7 @@ do_page_fault(unsigned long address, struct pt_regs *regs,
 	 * user context, we must not take the fault.
 	 */
 
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/frv/mm/fault.c b/arch/frv/mm/fault.c
index ec4917d..d1f67d1 100644
--- a/arch/frv/mm/fault.c
+++ b/arch/frv/mm/fault.c
@@ -19,9 +19,9 @@
 #include <linux/kernel.h>
 #include <linux/ptrace.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
-#include <asm/uaccess.h>
 #include <asm/gdb-stub.h>
 
 /*****************************************************************************/
@@ -78,7 +78,7 @@ asmlinkage void do_page_fault(int datammu, unsigned long esr0, unsigned long ear
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(__frame))
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index ba5ba7a..5a636aa 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -11,10 +11,10 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/prefetch.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/uaccess.h>
 
 extern int die(char *, struct pt_regs *, long);
 
@@ -96,7 +96,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	/*
 	 * If we're in an interrupt or have no user context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
diff --git a/arch/m32r/mm/fault.c b/arch/m32r/mm/fault.c
index e3d4d48901..4e1892e 100644
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -24,9 +24,9 @@
 #include <linux/vt_kern.h>		/* For unblank_screen() */
 #include <linux/highmem.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/m32r.h>
-#include <asm/uaccess.h>
 #include <asm/hardirq.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -114,7 +114,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	 * If we're in an interrupt or have no user context or are running in an
 	 * atomic region then we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (error_code & ACE_USERMODE)
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index b2f04ae..156309b0 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -10,10 +10,10 @@
 #include <linux/ptrace.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/setup.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 extern void die_if_kernel(char *, struct pt_regs *, long);
@@ -81,7 +81,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/metag/mm/fault.c b/arch/metag/mm/fault.c
index 2de5dc6..713700d 100644
--- a/arch/metag/mm/fault.c
+++ b/arch/metag/mm/fault.c
@@ -105,7 +105,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	mm = tsk->mm;
 
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index d46a5eb..6a75a42 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -107,13 +107,13 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	if ((error_code & 0x13) == 0x13 || (error_code & 0x11) == 0x11)
 		is_write = 0;
 
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(pagefault_disabled() || !mm)) {
 		if (kernel_mode(regs))
 			goto bad_area_nosemaphore;
 
-		/* in_atomic() in user mode is really bad,
+		/* pagefault_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
-		pr_emerg("Page fault in user mode with in_atomic(), mm = %p\n",
+		pr_emerg("Page fault in user mode with pagefault_disabled(), mm = %p\n",
 									mm);
 		pr_emerg("r15 = %lx  MSR = %lx\n",
 		       regs->r15, regs->msr);
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 7ff8637..ccba212 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -21,10 +21,10 @@
 #include <linux/module.h>
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
+#include <linux/uaccess.h>
 
 #include <asm/branch.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/ptrace.h>
 #include <asm/highmem.h>		/* For VMALLOC_END */
 #include <linux/kdebug.h>
@@ -94,7 +94,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/mn10300/mm/fault.c b/arch/mn10300/mm/fault.c
index 0c2cc5d..1d441a8 100644
--- a/arch/mn10300/mm/fault.c
+++ b/arch/mn10300/mm/fault.c
@@ -23,8 +23,8 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/vt_kern.h>		/* For unblank_screen() */
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 #include <asm/hardirq.h>
 #include <asm/cpu-regs.h>
@@ -168,7 +168,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long fault_code,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if ((fault_code & MMUFCR_xFC_ACCESS) == MMUFCR_xFC_ACCESS_USR)
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 0c9b6af..fdae091 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -77,7 +77,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 47ee620..e72483c 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -26,9 +26,9 @@
 #include <linux/console.h>
 #include <linux/bug.h>
 #include <linux/ratelimit.h>
+#include <linux/uaccess.h>
 
 #include <asm/assembly.h>
-#include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/traps.h>
@@ -800,7 +800,7 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
 	     * unless pagefault_disable() was called before.
 	     */
 
-	    if (fault_space == 0 && !in_atomic())
+	    if (fault_space == 0 && !pagefault_disabled())
 	    {
 		pdc_chassis_send_status(PDC_CHASSIS_DIRECT_PANIC);
 		parisc_terminate("Kernel Fault", regs, code, fault_address);
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index e5120e6..15503ad 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -15,8 +15,8 @@
 #include <linux/sched.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/traps.h>
 
 /* Various important other fields */
@@ -207,7 +207,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 	int fault;
 	unsigned int flags;
 
-	if (in_atomic())
+	if (pagefault_disabled())
 		goto no_context;
 
 	tsk = current;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b396868..04658b5 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -33,13 +33,13 @@
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 #include <linux/hugetlb.h>
+#include <linux/uaccess.h>
 
 #include <asm/firmware.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
 #include <asm/debug.h>
@@ -272,15 +272,16 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (!arch_irq_disabled_regs(regs))
 		local_irq_enable();
 
-	if (in_atomic() || mm == NULL) {
+	if (pagefault_disabled() || mm == NULL) {
 		if (!user_mode(regs)) {
 			rc = SIGSEGV;
 			goto bail;
 		}
-		/* in_atomic() in user mode is really bad,
+		/* pagefault_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
 		printk(KERN_EMERG "Page fault in user mode with "
-		       "in_atomic() = %d mm = %p\n", in_atomic(), mm);
+		       "pagefault_disabled() = %d mm = %p\n",
+		       pagefault_disabled(), mm);
 		printk(KERN_EMERG "NIP = %lx  MSR = %lx\n",
 		       regs->nip, regs->msr);
 		die("Weird page fault", regs, SIGSEGV);
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 76515bc..a3dcc10 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -399,7 +399,7 @@ static inline int do_exception(struct pt_regs *regs, int access)
 	 * user context.
 	 */
 	fault = VM_FAULT_BADCONTEXT;
-	if (unlikely(!user_space_fault(regs) || in_atomic() || !mm))
+	if (unlikely(!user_space_fault(regs) || pagefault_disabled() || !mm))
 		goto out;
 
 	address = trans_exc_code & __FAIL_ADDR_MASK;
diff --git a/arch/score/mm/fault.c b/arch/score/mm/fault.c
index 6860beb..37a6c2e 100644
--- a/arch/score/mm/fault.c
+++ b/arch/score/mm/fault.c
@@ -34,6 +34,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/ptrace.h>
+#include <linux/uaccess.h>
 
 /*
  * This routine handles page faults.  It determines the address,
@@ -73,7 +74,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
 	* If we're in an interrupt or have no user
 	* context, we must not take the fault..
 	*/
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index a58fec9..f7a83a9 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 #include <asm/io_trapped.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -440,7 +441,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 	 * If we're in an interrupt, have no user context or are running
 	 * in an atomic region then we must not take the fault:
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(pagefault_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 70d8171..c399e7b 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -21,6 +21,7 @@
 #include <linux/perf_event.h>
 #include <linux/interrupt.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -29,7 +30,6 @@
 #include <asm/setup.h>
 #include <asm/smp.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 
 #include "mm_32.h"
 
@@ -196,7 +196,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 4798232..0ff9e50 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -22,12 +22,12 @@
 #include <linux/kdebug.h>
 #include <linux/percpu.h>
 #include <linux/context_tracking.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/openprom.h>
 #include <asm/oplib.h>
-#include <asm/uaccess.h>
 #include <asm/asi.h>
 #include <asm/lsu.h>
 #include <asm/sections.h>
@@ -330,7 +330,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto intr_or_no_mm;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 4ca0d6b..795ea96 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2706,7 +2706,7 @@ void hugetlb_setup(struct pt_regs *regs)
 	struct mm_struct *mm = current->mm;
 	struct tsb_config *tp;
 
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		const struct exception_table_entry *entry;
 
 		entry = search_exception_tables(regs->tpc);
diff --git a/arch/tile/mm/fault.c b/arch/tile/mm/fault.c
index e83cc99..3f4f58d 100644
--- a/arch/tile/mm/fault.c
+++ b/arch/tile/mm/fault.c
@@ -354,9 +354,9 @@ static int handle_page_fault(struct pt_regs *regs,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running in an
-	 * atomic region then we must not take the fault.
+	 * region with pagefaults disabled then we must not take the fault.
 	 */
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		vma = NULL;  /* happy compiler */
 		goto bad_area_nosemaphore;
 	}
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 8e4daf4..7317c87 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -35,10 +35,10 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 	*code_out = SEGV_MAPERR;
 
 	/*
-	 * If the fault was during atomic operation, don't take the fault, just
+	 * If the fault was with pagefaults disabled, don't take the fault, just
 	 * fail.
 	 */
-	if (in_atomic())
+	if (pagefault_disabled())
 		goto out_nosemaphore;
 
 	if (is_user)
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 0dc922d..b3cf385 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -218,7 +218,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 181c53b..be394b1 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -13,6 +13,7 @@
 #include <linux/hugetlb.h>		/* hstate_index_to_shift	*/
 #include <linux/prefetch.h>		/* prefetchw			*/
 #include <linux/context_tracking.h>	/* exception_enter(), ...	*/
+#include <linux/uaccess.h>		/* pagefault_disabled()		*/
 
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
 #include <asm/pgalloc.h>		/* pgd_*(), ...			*/
@@ -1126,9 +1127,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running
-	 * in an atomic region then we must not take the fault:
+	 * in a region with pagefaults disabled then we must not take the fault
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(pagefault_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 9e3571a..99ba449 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -15,10 +15,10 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
 #include <asm/hardirq.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 DEFINE_PER_CPU(unsigned long, asid_cache) = ASID_USER_FIRST;
@@ -57,7 +57,7 @@ void do_page_fault(struct pt_regs *regs)
 	/* If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		bad_page_fault(regs, address, SIGSEGV);
 		return;
 	}
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 07/15] drm/i915: use pagefault_disabled() to check for disabled pagefaults
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Now that the pagefault disabled counter is in place, we can replace
the in_atomic() check with a pagefault_disabled() check.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7ab63d9..98dc211 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -32,6 +32,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include <linux/dma_remapping.h>
+#include <linux/uaccess.h>
 
 #define  __EXEC_OBJECT_HAS_PIN (1<<31)
 #define  __EXEC_OBJECT_HAS_FENCE (1<<30)
@@ -458,7 +459,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	}
 
 	/* We can't wait for rendering with pagefaults disabled */
-	if (obj->active && in_atomic())
+	if (obj->active && pagefault_disabled())
 		return -EFAULT;
 
 	if (use_cpu_reloc(obj))
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 08/15] futex: UP futex_atomic_op_inuser() relies on disabled preemption
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Let's explicitly disable/enable preemption in the !CONFIG_SMP version
of futex_atomic_op_inuser, to prepare for pagefault_disable() not
touching preemption anymore.

Otherwise we might break mutual exclusion when relying on a get_user()/
put_user() implementation.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/asm-generic/futex.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/futex.h b/include/asm-generic/futex.h
index b59b5a5..3586017 100644
--- a/include/asm-generic/futex.h
+++ b/include/asm-generic/futex.h
@@ -8,8 +8,7 @@
 #ifndef CONFIG_SMP
 /*
  * The following implementation only for uniprocessor machines.
- * For UP, it's relies on the fact that pagefault_disable() also disables
- * preemption to ensure mutual exclusion.
+ * It relies on preempt_disable() ensuring mutual exclusion.
  *
  */
 
@@ -38,6 +37,7 @@ futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 	if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
 		oparg = 1 << oparg;
 
+	preempt_disable();
 	pagefault_disable();
 
 	ret = -EFAULT;
@@ -72,6 +72,7 @@ futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 
 out_pagefault_enable:
 	pagefault_enable();
+	preempt_enable();
 
 	if (ret == 0) {
 		switch (cmp) {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 09/15] futex: UP futex_atomic_cmpxchg_inatomic() relies on disabled preemption
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Let's explicitly disable/enable preemption in the !CONFIG_SMP version
of futex_atomic_cmpxchg_inatomic, to prepare for pagefault_disable() not
touching preemption anymore. This is needed for this function to be
callable from both atomic and non-atomic context.

Otherwise we might break mutual exclusion when relying on a get_user()/
put_user() implementation.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/asm-generic/futex.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/asm-generic/futex.h b/include/asm-generic/futex.h
index 3586017..e56272c 100644
--- a/include/asm-generic/futex.h
+++ b/include/asm-generic/futex.h
@@ -107,6 +107,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 {
 	u32 val;
 
+	preempt_disable();
 	if (unlikely(get_user(val, uaddr) != 0))
 		return -EFAULT;
 
@@ -114,6 +115,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 		return -EFAULT;
 
 	*uval = val;
+	preempt_enable();
 
 	return 0;
 }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 10/15] arm/futex: UP futex_atomic_cmpxchg_inatomic() relies on disabled preemption
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

The !CONFIG_SMP implementation of futex_atomic_cmpxchg_inatomic()
requires preemption to be disabled to guarantee mutual exclusion.
Let's make this explicit.

This patch is based on a patch by Sebastian Andrzej Siewior on the
-rt branch.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/arm/include/asm/futex.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 4e78065..255bfd1 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -93,6 +93,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
 
+	preempt_disable();
 	__asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
 	"1:	" TUSER(ldr) "	%1, [%4]\n"
 	"	teq	%1, %2\n"
@@ -104,6 +105,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	: "cc", "memory");
 
 	*uval = val;
+	preempt_enable();
+
 	return ret;
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 11/15] arm/futex: UP futex_atomic_op_inuser() relies on disabled preemption
  2015-05-06 17:50 ` David Hildenbrand
  (?)
  (?)
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

The !CONFIG_SMP implementation of futex_atomic_op_inuser() seems to rely
on disabled preemption to guarantee mutual exclusion.

From commit e589ed23dd27:
    "For UP it's enough to disable preemption to ensure mutual exclusion..."
From the code itself:
    "!SMP, we can work around lack of atomic ops by disabling preemption"

Let's make this explicit, to prepare for pagefault_disable() not
touching preemption anymore.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/arm/include/asm/futex.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 255bfd1..5eed828 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -127,7 +127,10 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
 
-	pagefault_disable();	/* implies preempt_disable() */
+#ifndef CONFIG_SMP
+	preempt_disable();
+#endif
+	pagefault_disable();
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -149,7 +152,10 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();	/* subsumes preempt_enable() */
+	pagefault_enable();
+#ifndef CONFIG_SMP
+	preempt_enable();
+#endif
 
 	if (!ret) {
 		switch (cmp) {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 12/15] futex: clarify that preemption doesn't have to be disabled
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

As arm64 and arc have no special implementations for !CONFIG_SMP, mutual
exclusion doesn't seem to rely on preemption.

Let's make it clear in the comments that preemption doesn't have to be
disabled when accessing user space in the futex code, so we can remove
preempt_disable() from pagefault_disable().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/arc/include/asm/futex.h   | 10 +++++-----
 arch/arm64/include/asm/futex.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
index 4dc64dd..05b5aaf 100644
--- a/arch/arc/include/asm/futex.h
+++ b/arch/arc/include/asm/futex.h
@@ -53,7 +53,7 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
-	pagefault_disable();	/* implies preempt_disable() */
+	pagefault_disable();
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -75,7 +75,7 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();	/* subsumes preempt_enable() */
+	pagefault_enable();
 
 	if (!ret) {
 		switch (cmp) {
@@ -104,7 +104,7 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 	return ret;
 }
 
-/* Compare-xchg with preemption disabled.
+/* Compare-xchg with pagefaults disabled.
  *  Notes:
  *      -Best-Effort: Exchg happens only if compare succeeds.
  *          If compare fails, returns; leaving retry/looping to upper layers
@@ -121,7 +121,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
-	pagefault_disable();	/* implies preempt_disable() */
+	pagefault_disable();
 
 	/* TBD : can use llock/scond */
 	__asm__ __volatile__(
@@ -142,7 +142,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
 	: "r"(oldval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
 	: "cc", "memory");
 
-	pagefault_enable();	/* subsumes preempt_enable() */
+	pagefault_enable();
 
 	*uval = val;
 	return val;
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 5f750dc..74069b3 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -58,7 +58,7 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
 
-	pagefault_disable();	/* implies preempt_disable() */
+	pagefault_disable();
 
 	switch (op) {
 	case FUTEX_OP_SET:
@@ -85,7 +85,7 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
 		ret = -ENOSYS;
 	}
 
-	pagefault_enable();	/* subsumes preempt_enable() */
+	pagefault_enable();
 
 	if (!ret) {
 		switch (cmp) {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 13/15] powerpc: enable_kernel_altivec() requires disabled preemption
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

enable_kernel_altivec() has to be called with preemption disabled.
Let's make this explicit, to prepare for pagefault_disable() not
touching preemption anymore.
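
The bracketing this patch introduces around every AltiVec section can be
sketched in user space as follows. This is an illustration only: the
counters and enable_kernel_altivec() below are stand-ins for the real
kernel primitives, and the assert models the invariant the patch makes
explicit (AltiVec may only be enabled while preemption is off).

```c
#include <assert.h>

/* User-space stand-ins for the kernel primitives (illustration only). */
static int preempt_count;
static int pagefault_count;
static int altivec_enabled;

static void preempt_disable(void)   { preempt_count++; }
static void preempt_enable(void)    { preempt_count--; }
static void pagefault_disable(void) { pagefault_count++; }
static void pagefault_enable(void)  { pagefault_count--; }

/* The invariant made explicit by this patch: enable_kernel_altivec()
 * must only run with preemption disabled. */
static void enable_kernel_altivec(void)
{
	assert(preempt_count > 0);
	altivec_enabled = 1;
}

/* The bracketing pattern used at every call site in the diff below:
 * disable preemption first, then page faults; release in reverse. */
static void vmx_section(void)
{
	preempt_disable();
	pagefault_disable();
	enable_kernel_altivec();
	/* ... AltiVec work ... */
	pagefault_enable();
	preempt_enable();
}
```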

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/powerpc/lib/vmx-helper.c | 11 ++++++-----
 drivers/crypto/vmx/aes.c      |  8 +++++++-
 drivers/crypto/vmx/aes_cbc.c  |  6 ++++++
 drivers/crypto/vmx/ghash.c    |  8 ++++++++
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
index 3cf529c..ac93a3b 100644
--- a/arch/powerpc/lib/vmx-helper.c
+++ b/arch/powerpc/lib/vmx-helper.c
@@ -27,11 +27,11 @@ int enter_vmx_usercopy(void)
 	if (in_interrupt())
 		return 0;
 
-	/* This acts as preempt_disable() as well and will make
-	 * enable_kernel_altivec(). We need to disable page faults
-	 * as they can call schedule and thus make us lose the VMX
-	 * context. So on page faults, we just fail which will cause
-	 * a fallback to the normal non-vmx copy.
+	preempt_disable();
+	/*
+	 * We need to disable page faults as they can call schedule and
+	 * thus make us lose the VMX context. So on page faults, we just
+	 * fail which will cause a fallback to the normal non-vmx copy.
 	 */
 	pagefault_disable();
 
@@ -47,6 +47,7 @@ int enter_vmx_usercopy(void)
 int exit_vmx_usercopy(void)
 {
 	pagefault_enable();
+	preempt_enable();
 	return 0;
 }
 
diff --git a/drivers/crypto/vmx/aes.c b/drivers/crypto/vmx/aes.c
index ab300ea..a9064e3 100644
--- a/drivers/crypto/vmx/aes.c
+++ b/drivers/crypto/vmx/aes.c
@@ -78,12 +78,14 @@ static int p8_aes_setkey(struct crypto_tfm *tfm, const u8 *key,
     int ret;
     struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
 
+    preempt_disable();
     pagefault_disable();
     enable_kernel_altivec();
     ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
     ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
     pagefault_enable();
-    
+    preempt_enable();
+
     ret += crypto_cipher_setkey(ctx->fallback, key, keylen);
     return ret;
 }
@@ -95,10 +97,12 @@ static void p8_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
     if (in_interrupt()) {
         crypto_cipher_encrypt_one(ctx->fallback, dst, src);
     } else {
+	preempt_disable();
         pagefault_disable();
         enable_kernel_altivec();
         aes_p8_encrypt(src, dst, &ctx->enc_key);
         pagefault_enable();
+	preempt_enable();
     }
 }
 
@@ -109,10 +113,12 @@ static void p8_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
     if (in_interrupt()) {
         crypto_cipher_decrypt_one(ctx->fallback, dst, src);
     } else {
+	preempt_disable();
         pagefault_disable();
         enable_kernel_altivec();
         aes_p8_decrypt(src, dst, &ctx->dec_key);
         pagefault_enable();
+	preempt_enable();
     }
 }
 
diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
index 1a559b7..477284a 100644
--- a/drivers/crypto/vmx/aes_cbc.c
+++ b/drivers/crypto/vmx/aes_cbc.c
@@ -79,11 +79,13 @@ static int p8_aes_cbc_setkey(struct crypto_tfm *tfm, const u8 *key,
     int ret;
     struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
 
+    preempt_disable();
     pagefault_disable();
     enable_kernel_altivec();
     ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
     ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
     pagefault_enable();
+    preempt_enable();
 
     ret += crypto_blkcipher_setkey(ctx->fallback, key, keylen);
     return ret;
@@ -106,6 +108,7 @@ static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
     if (in_interrupt()) {
         ret = crypto_blkcipher_encrypt(&fallback_desc, dst, src, nbytes);
     } else {
+	preempt_disable();
         pagefault_disable();
         enable_kernel_altivec();
 
@@ -119,6 +122,7 @@ static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
 	}
 
         pagefault_enable();
+	preempt_enable();
     }
 
     return ret;
@@ -141,6 +145,7 @@ static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
     if (in_interrupt()) {
         ret = crypto_blkcipher_decrypt(&fallback_desc, dst, src, nbytes);
     } else {
+	preempt_disable();
         pagefault_disable();
         enable_kernel_altivec();
 
@@ -154,6 +159,7 @@ static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
 		}
 
         pagefault_enable();
+	preempt_enable();
     }
 
     return ret;
diff --git a/drivers/crypto/vmx/ghash.c b/drivers/crypto/vmx/ghash.c
index d0ffe27..f255ec4 100644
--- a/drivers/crypto/vmx/ghash.c
+++ b/drivers/crypto/vmx/ghash.c
@@ -114,11 +114,13 @@ static int p8_ghash_setkey(struct crypto_shash *tfm, const u8 *key,
     if (keylen != GHASH_KEY_LEN)
         return -EINVAL;
 
+    preempt_disable();
     pagefault_disable();
     enable_kernel_altivec();
     enable_kernel_fp();
     gcm_init_p8(ctx->htable, (const u64 *) key);
     pagefault_enable();
+    preempt_enable();
     return crypto_shash_setkey(ctx->fallback, key, keylen);
 }
 
@@ -140,23 +142,27 @@ static int p8_ghash_update(struct shash_desc *desc,
             }
             memcpy(dctx->buffer + dctx->bytes, src,
                     GHASH_DIGEST_SIZE - dctx->bytes);
+	    preempt_disable();
             pagefault_disable();
             enable_kernel_altivec();
             enable_kernel_fp();
             gcm_ghash_p8(dctx->shash, ctx->htable, dctx->buffer,
                     GHASH_DIGEST_SIZE);
             pagefault_enable();
+	    preempt_enable();
             src += GHASH_DIGEST_SIZE - dctx->bytes;
             srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
             dctx->bytes = 0;
         }
         len = srclen & ~(GHASH_DIGEST_SIZE - 1);
         if (len) {
+	    preempt_disable();
             pagefault_disable();
             enable_kernel_altivec();
             enable_kernel_fp();
             gcm_ghash_p8(dctx->shash, ctx->htable, src, len);
             pagefault_enable();
+	    preempt_enable();
             src += len;
             srclen -= len;
         }
@@ -180,12 +186,14 @@ static int p8_ghash_final(struct shash_desc *desc, u8 *out)
         if (dctx->bytes) {
             for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
                 dctx->buffer[i] = 0;
+	    preempt_disable();
             pagefault_disable();
             enable_kernel_altivec();
             enable_kernel_fp();
             gcm_ghash_p8(dctx->shash, ctx->htable, dctx->buffer,
                     GHASH_DIGEST_SIZE);
             pagefault_enable();
+	    preempt_enable();
             dctx->bytes = 0;
         }
         memcpy(out, dctx->shash, GHASH_DIGEST_SIZE);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 14/15] mips: properly lock access to the fpu
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Let's always disable preemption and pagefaults when locking the fpu,
so we can be sure that the owner won't change in between.

This is a preparation for pagefault_disable() not touching preemption
anymore.
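
The new macro pair can be modelled in user space like this (counters
stand in for the kernel's preempt and pagefault state; the statement
expressions mirror the form used in the diff below, and the unlock path
releases in the reverse order of the lock path):

```c
#include <assert.h>

/* User-space model of the locking discipline; not kernel code. */
static int preempt_count, pagefault_count;

#define preempt_disable()	({ preempt_count++; })
#define preempt_enable()	({ preempt_count--; })
#define pagefault_disable()	({ pagefault_count++; })
#define pagefault_enable()	({ pagefault_count--; })

/* Always take both disables, in a fixed order; release in reverse. */
#define lock_fpu_owner()	({ preempt_disable(); pagefault_disable(); })
#define unlock_fpu_owner()	({ pagefault_enable(); preempt_enable(); })
```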

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/mips/kernel/signal-common.h | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/mips/kernel/signal-common.h b/arch/mips/kernel/signal-common.h
index 06805e0..0b85f82 100644
--- a/arch/mips/kernel/signal-common.h
+++ b/arch/mips/kernel/signal-common.h
@@ -28,12 +28,7 @@ extern void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 extern int fpcsr_pending(unsigned int __user *fpcsr);
 
 /* Make sure we will not lose FPU ownership */
-#ifdef CONFIG_PREEMPT
-#define lock_fpu_owner()	preempt_disable()
-#define unlock_fpu_owner()	preempt_enable()
-#else
-#define lock_fpu_owner()	pagefault_disable()
-#define unlock_fpu_owner()	pagefault_enable()
-#endif
+#define lock_fpu_owner()	({ preempt_disable(); pagefault_disable(); })
+#define unlock_fpu_owner()	({ pagefault_enable(); preempt_enable(); })
 
 #endif	/* __SIGNAL_COMMON_H */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH RFC 15/15] uaccess: decouple preemption from the pagefault logic
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 17:50   ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-06 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

As the fault handlers now all rely on the pagefault_disabled() checks,
and the implicit preempt_disable() calls in pagefault_disable() have
been made explicit, we can rely completely on the pagefault_disabled
counter.

So let's no longer touch the preempt count when disabling/enabling
pagefaults. After a call to pagefault_disable(), pagefault_disabled()
will return true, but in_atomic() won't.
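
A user-space sketch of what this patch leaves behind: the two helpers
become a plain nesting counter, with no preempt-count manipulation.
The names mirror the kernel API, but this is an illustration; in the
kernel the counter lives in current->pagefault_disabled.

```c
/* Illustrative nesting counter; not the kernel implementation. */
static int pagefault_disabled;

static void pagefault_disable(void)    { pagefault_disabled++; }
static void pagefault_enable(void)     { pagefault_disabled--; }

/* True while at least one pagefault_disable() level is active. */
static int  pagefault_disabled_p(void) { return pagefault_disabled > 0; }
```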

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/uaccess.h | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 23290cc..83f2a7b 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -1,7 +1,6 @@
 #ifndef __LINUX_UACCESS_H__
 #define __LINUX_UACCESS_H__
 
-#include <linux/preempt.h>
 #include <linux/sched.h>
 #include <asm/uaccess.h>
 
@@ -20,17 +19,11 @@ static __always_inline void pagefault_disabled_dec(void)
  * These routines enable/disable the pagefault handler. If disabled, it will
  * not take any locks and go straight to the fixup table.
  *
- * We increase the preempt and the pagefault count, to be able to distinguish
- * whether we run in simple atomic context or in a real pagefault_disable()
- * context.
- *
- * For now, after pagefault_disabled() has been called, we run in atomic
- * context. User access methods will not sleep.
- *
+ * User access methods will not sleep when called from a pagefault_disabled()
+ * environment.
  */
 static inline void pagefault_disable(void)
 {
-	preempt_count_inc();
 	pagefault_disabled_inc();
 	/*
 	 * make sure to have issued the store before a pagefault
@@ -47,11 +40,6 @@ static inline void pagefault_enable(void)
 	 */
 	barrier();
 	pagefault_disabled_dec();
-#ifndef CONFIG_PREEMPT
-	preempt_count_dec();
-#else
-	preempt_enable();
-#endif
 }
 
 /*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
  2015-05-06 17:50 ` David Hildenbrand
@ 2015-05-06 22:01   ` Andrew Morton
  -1 siblings, 0 replies; 79+ messages in thread
From: Andrew Morton @ 2015-05-06 22:01 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, peterz, yang.shi, bigeasy, benh, paulus,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Wed,  6 May 2015 19:50:24 +0200 David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:

> As Peter asked me to also do the decoupling in one shot, this is
> the new series.
> 
> I recently discovered that might_fault() doesn't call might_sleep()
> anymore. Therefore bugs like:
> 
>   spin_lock(&lock);
>   rc = copy_to_user(...);
>   spin_unlock(&lock);
> 
> would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was
> changed to disable false positives for code like:
> 
>   pagefault_disable();
>   rc = copy_to_user(...);
>   pagefault_enable();
> 
> Whereby the caller wants to deal with failures.

hm, that was a significant screwup.  I wonder how many bugs we
subsequently added.

>
> ..
>

> This series therefore does 2 things:
> 
> 
> 1. Decouple pagefault_disable() from preempt_enable()
> 
> ...
> 
> 2. Reenable might_sleep() checks for might_fault()

All seems sensible to me.  pagefault_disabled has to go into the
task_struct (rather than being per-cpu) because
pagefault_disabled_inc() doesn't disable preemption, yes?
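
The point above can be modelled in user space (hypothetical sketch, not
kernel code): since pagefault_disabled_inc() no longer disables
preemption, the task may migrate between CPUs between the inc and the
dec, so a per-CPU counter could be decremented on a different CPU than
the one it was incremented on. A per-task counter, modelled here as a
per-thread variable, travels with the task and stays balanced.

```c
/* Per-thread variable models current->pagefault_disabled. */
static __thread int pagefault_disabled;

static void pagefault_disabled_inc(void) { pagefault_disabled++; }
static void pagefault_disabled_dec(void) { pagefault_disabled--; }

static int run_task(void)
{
	pagefault_disabled_inc();
	/* ...the scheduler may move this task to another CPU here... */
	pagefault_disabled_dec();
	return pagefault_disabled;	/* balanced for this task */
}
```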


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 13/15] powerpc: enable_kernel_altivec() requires disabled preemption
  2015-05-06 17:50   ` David Hildenbrand
@ 2015-05-07  0:21     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 79+ messages in thread
From: Benjamin Herrenschmidt @ 2015-05-07  0:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, peterz, yang.shi, bigeasy, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Wed, 2015-05-06 at 19:50 +0200, David Hildenbrand wrote:
> enable_kernel_altivec() has to be called with disabled preemption.
> Let's make this explicit, to prepare for pagefault_disable() not
> touching preemption anymore.
>
> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

> ---
>  arch/powerpc/lib/vmx-helper.c | 11 ++++++-----
>  drivers/crypto/vmx/aes.c      |  8 +++++++-
>  drivers/crypto/vmx/aes_cbc.c  |  6 ++++++
>  drivers/crypto/vmx/ghash.c    |  8 ++++++++
>  4 files changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
> index 3cf529c..ac93a3b 100644
> --- a/arch/powerpc/lib/vmx-helper.c
> +++ b/arch/powerpc/lib/vmx-helper.c
> @@ -27,11 +27,11 @@ int enter_vmx_usercopy(void)
>  	if (in_interrupt())
>  		return 0;
>  
> -	/* This acts as preempt_disable() as well and will make
> -	 * enable_kernel_altivec(). We need to disable page faults
> -	 * as they can call schedule and thus make us lose the VMX
> -	 * context. So on page faults, we just fail which will cause
> -	 * a fallback to the normal non-vmx copy.
> +	preempt_disable();
> +	/*
> +	 * We need to disable page faults as they can call schedule and
> +	 * thus make us lose the VMX context. So on page faults, we just
> +	 * fail which will cause a fallback to the normal non-vmx copy.
>  	 */
>  	pagefault_disable();
>  
> @@ -47,6 +47,7 @@ int enter_vmx_usercopy(void)
>  int exit_vmx_usercopy(void)
>  {
>  	pagefault_enable();
> +	preempt_enable();
>  	return 0;
>  }
>  
> diff --git a/drivers/crypto/vmx/aes.c b/drivers/crypto/vmx/aes.c
> index ab300ea..a9064e3 100644
> --- a/drivers/crypto/vmx/aes.c
> +++ b/drivers/crypto/vmx/aes.c
> @@ -78,12 +78,14 @@ static int p8_aes_setkey(struct crypto_tfm *tfm, const u8 *key,
>      int ret;
>      struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
>  
> +    preempt_disable();
>      pagefault_disable();
>      enable_kernel_altivec();
>      ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
>      ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
>      pagefault_enable();
> -    
> +    preempt_enable();
> +
>      ret += crypto_cipher_setkey(ctx->fallback, key, keylen);
>      return ret;
>  }
> @@ -95,10 +97,12 @@ static void p8_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
>      if (in_interrupt()) {
>          crypto_cipher_encrypt_one(ctx->fallback, dst, src);
>      } else {
> +	preempt_disable();
>          pagefault_disable();
>          enable_kernel_altivec();
>          aes_p8_encrypt(src, dst, &ctx->enc_key);
>          pagefault_enable();
> +	preempt_enable();
>      }
>  }
>  
> @@ -109,10 +113,12 @@ static void p8_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
>      if (in_interrupt()) {
>          crypto_cipher_decrypt_one(ctx->fallback, dst, src);
>      } else {
> +	preempt_disable();
>          pagefault_disable();
>          enable_kernel_altivec();
>          aes_p8_decrypt(src, dst, &ctx->dec_key);
>          pagefault_enable();
> +	preempt_enable();
>      }
>  }
>  
> diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
> index 1a559b7..477284a 100644
> --- a/drivers/crypto/vmx/aes_cbc.c
> +++ b/drivers/crypto/vmx/aes_cbc.c
* Re: [PATCH RFC 13/15] powerpc: enable_kernel_altivec() requires disabled preemption
@ 2015-05-07  0:21     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 79+ messages in thread
From: Benjamin Herrenschmidt @ 2015-05-07  0:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, peterz, yang.shi, bigeasy, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Wed, 2015-05-06 at 19:50 +0200, David Hildenbrand wrote:
> enable_kernel_altivec() has to be called with disabled preemption.
> Let's make this explicit, to prepare for pagefault_disable() not
> touching preemption anymore.
>
> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

> ---
>  arch/powerpc/lib/vmx-helper.c | 11 ++++++-----
>  drivers/crypto/vmx/aes.c      |  8 +++++++-
>  drivers/crypto/vmx/aes_cbc.c  |  6 ++++++
>  drivers/crypto/vmx/ghash.c    |  8 ++++++++
>  4 files changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
> index 3cf529c..ac93a3b 100644
> --- a/arch/powerpc/lib/vmx-helper.c
> +++ b/arch/powerpc/lib/vmx-helper.c
> @@ -27,11 +27,11 @@ int enter_vmx_usercopy(void)
>  	if (in_interrupt())
>  		return 0;
>  
> -	/* This acts as preempt_disable() as well and will make
> -	 * enable_kernel_altivec(). We need to disable page faults
> -	 * as they can call schedule and thus make us lose the VMX
> -	 * context. So on page faults, we just fail which will cause
> -	 * a fallback to the normal non-vmx copy.
> +	preempt_disable();
> +	/*
> +	 * We need to disable page faults as they can call schedule and
> +	 * thus make us lose the VMX context. So on page faults, we just
> +	 * fail which will cause a fallback to the normal non-vmx copy.
>  	 */
>  	pagefault_disable();
>  
> @@ -47,6 +47,7 @@ int enter_vmx_usercopy(void)
>  int exit_vmx_usercopy(void)
>  {
>  	pagefault_enable();
> +	preempt_enable();
>  	return 0;
>  }
>  
> diff --git a/drivers/crypto/vmx/aes.c b/drivers/crypto/vmx/aes.c
> index ab300ea..a9064e3 100644
> --- a/drivers/crypto/vmx/aes.c
> +++ b/drivers/crypto/vmx/aes.c
> @@ -78,12 +78,14 @@ static int p8_aes_setkey(struct crypto_tfm *tfm, const u8 *key,
>      int ret;
>      struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
>  
> +    preempt_disable();
>      pagefault_disable();
>      enable_kernel_altivec();
>      ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
>      ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
>      pagefault_enable();
> -    
> +    preempt_enable();
> +
>      ret += crypto_cipher_setkey(ctx->fallback, key, keylen);
>      return ret;
>  }
> @@ -95,10 +97,12 @@ static void p8_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
>      if (in_interrupt()) {
>          crypto_cipher_encrypt_one(ctx->fallback, dst, src);
>      } else {
> +	preempt_disable();
>          pagefault_disable();
>          enable_kernel_altivec();
>          aes_p8_encrypt(src, dst, &ctx->enc_key);
>          pagefault_enable();
> +	preempt_enable();
>      }
>  }
>  
> @@ -109,10 +113,12 @@ static void p8_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
>      if (in_interrupt()) {
>          crypto_cipher_decrypt_one(ctx->fallback, dst, src);
>      } else {
> +	preempt_disable();
>          pagefault_disable();
>          enable_kernel_altivec();
>          aes_p8_decrypt(src, dst, &ctx->dec_key);
>          pagefault_enable();
> +	preempt_enable();
>      }
>  }
>  
> diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
> index 1a559b7..477284a 100644
> --- a/drivers/crypto/vmx/aes_cbc.c
> +++ b/drivers/crypto/vmx/aes_cbc.c
> @@ -79,11 +79,13 @@ static int p8_aes_cbc_setkey(struct crypto_tfm *tfm, const u8 *key,
>      int ret;
>      struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
>  
> +    preempt_disable();
>      pagefault_disable();
>      enable_kernel_altivec();
>      ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
>      ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
>      pagefault_enable();
> +    preempt_enable();
>  
>      ret += crypto_blkcipher_setkey(ctx->fallback, key, keylen);
>      return ret;
> @@ -106,6 +108,7 @@ static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
>      if (in_interrupt()) {
>          ret = crypto_blkcipher_encrypt(&fallback_desc, dst, src, nbytes);
>      } else {
> +	preempt_disable();
>          pagefault_disable();
>          enable_kernel_altivec();
>  
> @@ -119,6 +122,7 @@ static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
>  	}
>  
>          pagefault_enable();
> +	preempt_enable();
>      }
>  
>      return ret;
> @@ -141,6 +145,7 @@ static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
>      if (in_interrupt()) {
>          ret = crypto_blkcipher_decrypt(&fallback_desc, dst, src, nbytes);
>      } else {
> +	preempt_disable();
>          pagefault_disable();
>          enable_kernel_altivec();
>  
> @@ -154,6 +159,7 @@ static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
>  		}
>  
>          pagefault_enable();
> +	preempt_enable();
>      }
>  
>      return ret;
> diff --git a/drivers/crypto/vmx/ghash.c b/drivers/crypto/vmx/ghash.c
> index d0ffe27..f255ec4 100644
> --- a/drivers/crypto/vmx/ghash.c
> +++ b/drivers/crypto/vmx/ghash.c
> @@ -114,11 +114,13 @@ static int p8_ghash_setkey(struct crypto_shash *tfm, const u8 *key,
>      if (keylen != GHASH_KEY_LEN)
>          return -EINVAL;
>  
> +    preempt_disable();
>      pagefault_disable();
>      enable_kernel_altivec();
>      enable_kernel_fp();
>      gcm_init_p8(ctx->htable, (const u64 *) key);
>      pagefault_enable();
> +    preempt_enable();
>      return crypto_shash_setkey(ctx->fallback, key, keylen);
>  }
>  
> @@ -140,23 +142,27 @@ static int p8_ghash_update(struct shash_desc *desc,
>              }
>              memcpy(dctx->buffer + dctx->bytes, src,
>                      GHASH_DIGEST_SIZE - dctx->bytes);
> +	    preempt_disable();
>              pagefault_disable();
>              enable_kernel_altivec();
>              enable_kernel_fp();
>              gcm_ghash_p8(dctx->shash, ctx->htable, dctx->buffer,
>                      GHASH_DIGEST_SIZE);
>              pagefault_enable();
> +	    preempt_enable();
>              src += GHASH_DIGEST_SIZE - dctx->bytes;
>              srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
>              dctx->bytes = 0;
>          }
>          len = srclen & ~(GHASH_DIGEST_SIZE - 1);
>          if (len) {
> +	    preempt_disable();
>              pagefault_disable();
>              enable_kernel_altivec();
>              enable_kernel_fp();
>              gcm_ghash_p8(dctx->shash, ctx->htable, src, len);
>              pagefault_enable();
> +	    preempt_enable();
>              src += len;
>              srclen -= len;
>          }
> @@ -180,12 +186,14 @@ static int p8_ghash_final(struct shash_desc *desc, u8 *out)
>          if (dctx->bytes) {
>              for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
>                  dctx->buffer[i] = 0;
> +	    preempt_disable();
>              pagefault_disable();
>              enable_kernel_altivec();
>              enable_kernel_fp();
>              gcm_ghash_p8(dctx->shash, ctx->htable, dctx->buffer,
>                      GHASH_DIGEST_SIZE);
>              pagefault_enable();
> +	    preempt_enable();
>              dctx->bytes = 0;
>          }
>          memcpy(out, dctx->shash, GHASH_DIGEST_SIZE);


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
  2015-05-06 22:01   ` Andrew Morton
@ 2015-05-07  6:23     ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07  6:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, mingo, peterz, yang.shi, bigeasy, benh, paulus,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

> > This series therefore does 2 things:
> > 
> > 
> > 1. Decouple pagefault_disable() from preempt_enable()
> > 
> > ...
> > 
> > 2. Reenable might_sleep() checks for might_fault()
> 
> All seems sensible to me.  pagefault_disabled has to go into the
> task_struct (rather than being per-cpu) because
> pagefault_disabled_inc() doesn't disable preemption, yes?
> 

Right, we can now get scheduled while in pagefault_disable() (if preemption
hasn't been disabled manually). So we have to store it per task/thread not per
cpu.

Actually even the preempt disable counter is only per-cpu for x86 and lives in
thread_info for all other archs (which is also not 100% clean but doesn't
matter at that point).

I had that pagefault disable counter in thread_info before, but that required
messing with asm-offsets of some arch (I had a proper version but this one
feels cleaner).

Thanks for having a look!

David


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
  2015-05-06 22:01   ` Andrew Morton
@ 2015-05-07  9:48     ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2015-05-07  9:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, linux-kernel, mingo, peterz, yang.shi,
	bigeasy, benh, paulus, heiko.carstens, schwidefsky, borntraeger,
	mst, tglx, David.Laight, hughd, hocko, ralf, herbert, linux,
	airlied, daniel.vetter, linux-mm, linux-arch


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed,  6 May 2015 19:50:24 +0200 David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:
> 
> > As Peter asked me to also do the decoupling in one shot, this is
> > the new series.
> > 
> > I recently discovered that might_fault() doesn't call might_sleep()
> > anymore. Therefore bugs like:
> > 
> >   spin_lock(&lock);
> >   rc = copy_to_user(...);
> >   spin_unlock(&lock);
> > 
> > would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was
> > changed to disable false positives for code like:
> > 
> >   pagefault_disable();
> >   rc = copy_to_user(...);
> >   pagefault_enable();
> > 
> > Whereby the caller wants to deal with failures.
> 
> hm, that was a significant screwup.  I wonder how many bugs we
> subsequently added.

So I'm wondering what the motivation was to allow things like:

   pagefault_disable();
   rc = copy_to_user(...);
   pagefault_enable();

and to declare it a false positive?

AFAICS most uses are indeed atomic:

        pagefault_disable();
        ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
        pagefault_enable();

so why not make it explicitly atomic again?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-06 17:50   ` David Hildenbrand
@ 2015-05-07 10:22     ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 10:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Wed, May 06, 2015 at 07:50:25PM +0200, David Hildenbrand wrote:
> +/*
> + * Is the pagefault handler disabled? If so, user access methods will not sleep.
> + */
> +#define pagefault_disabled() (current->pagefault_disabled != 0)

So -RT has:

static inline bool pagefault_disabled(void)
{
	return current->pagefault_disabled || in_atomic();
}

AFAICR we did this to avoid having to do both:

	preempt_disable();
	pagefault_disable();

in a fair number of places -- just like this patch-set does, this is
touching two cachelines where one would have been enough.

Also, removing in_atomic() from fault handlers like you did
significantly changes semantics for interrupts (soft, hard and NMI).

So while I agree with most of these patches, I'm very hesitant on the
above little detail.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 10:22     ` Peter Zijlstra
  (?)
@ 2015-05-07 10:50       ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 10:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch, tglx

> On Wed, May 06, 2015 at 07:50:25PM +0200, David Hildenbrand wrote:
> > +/*
> > + * Is the pagefault handler disabled? If so, user access methods will not sleep.
> > + */
> > +#define pagefault_disabled() (current->pagefault_disabled != 0)
> 
> So -RT has:
> 
> static inline bool pagefault_disabled(void)
> {
> 	return current->pagefault_disabled || in_atomic();
> }
> 
> AFAICR we did this to avoid having to do both:
> 
> 	preempt_disable();
> 	pagefault_disable();
> 
> in a fair number of places -- just like this patch-set does, this is
> touching two cachelines where one would have been enough.
> 
> Also, removing in_atomic() from fault handlers like you did
> significantly changes semantics for interrupts (soft, hard and NMI).
> 
> So while I agree with most of these patches, I'm very hesitant on the
> above little detail.
> 

Just to make sure we have a common understanding (as written in my cover
letter):

Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
there is no preempt counter, in_atomic() won't work. So doing a
preempt_disable() instead of a pagefault_disable() is not going to work.
(not sure how -RT handles that - most probably with CONFIG_PREEMPT_COUNT being
enabled, due to atomic debug).

That's why I dropped that check for a reason.

So in my opinion, in_atomic() should never be used in any fault handler - it
has nothing to do with disabled pagefaults. It doesn't give us anything more
besides some false security for atomic environments.


This patchset is about decoupling both concepts (not ending up with two
mechanisms doing almost the same thing).

That's also what Thomas Gleixner suggested
https://lkml.org/lkml/2014/11/27/820 .


David


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
  2015-05-07  9:48     ` Ingo Molnar
@ 2015-05-07 10:51       ` Christian Borntraeger
  -1 siblings, 0 replies; 79+ messages in thread
From: Christian Borntraeger @ 2015-05-07 10:51 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton
  Cc: David Hildenbrand, linux-kernel, mingo, peterz, yang.shi,
	bigeasy, benh, paulus, heiko.carstens, schwidefsky, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

Am 07.05.2015 um 11:48 schrieb Ingo Molnar:
> 
> * Andrew Morton <akpm@linux-foundation.org> wrote:
> 
>> On Wed,  6 May 2015 19:50:24 +0200 David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:
>>
>>> As Peter asked me to also do the decoupling in one shot, this is
>>> the new series.
>>>
>>> I recently discovered that might_fault() doesn't call might_sleep()
>>> anymore. Therefore bugs like:
>>>
>>>   spin_lock(&lock);
>>>   rc = copy_to_user(...);
>>>   spin_unlock(&lock);
>>>
>>> would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was
>>> changed to disable false positives for code like:
>>>
>>>   pagefault_disable();
>>>   rc = copy_to_user(...);
>>>   pagefault_enable();
>>>
>>> Whereby the caller wants to deal with failures.
>>
>> hm, that was a significant screwup.  I wonder how many bugs we
>> subsequently added.
> 
> So I'm wondering what the motivation was to allow things like:
> 
>    pagefault_disable();
>    rc = copy_to_user(...);
>    pagefault_enable();
> 
> and to declare it a false positive?
> 
> AFAICS most uses are indeed atomic:
> 
>         pagefault_disable();
>         ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
>         pagefault_enable();
> 
> so why not make it explicitly atomic again?

Hmm, I am probably misreading that, but it sounds as if you are suggesting
to go back to David's first proposal
https://lkml.org/lkml/2014/11/25/436
which made might_fault() also contain might_sleep(). Correct?

Christian


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
  2015-05-07 10:51       ` Christian Borntraeger
@ 2015-05-07 11:08         ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2015-05-07 11:08 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Andrew Morton, David Hildenbrand, linux-kernel, mingo, peterz,
	yang.shi, bigeasy, benh, paulus, heiko.carstens, schwidefsky,
	mst, tglx, David.Laight, hughd, hocko, ralf, herbert, linux,
	airlied, daniel.vetter, linux-mm, linux-arch


* Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> Am 07.05.2015 um 11:48 schrieb Ingo Molnar:
> > 
> > * Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> >> On Wed,  6 May 2015 19:50:24 +0200 David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:
> >>
> >>> As Peter asked me to also do the decoupling in one shot, this is
> >>> the new series.
> >>>
> >>> I recently discovered that might_fault() doesn't call might_sleep()
> >>> anymore. Therefore bugs like:
> >>>
> >>>   spin_lock(&lock);
> >>>   rc = copy_to_user(...);
> >>>   spin_unlock(&lock);
> >>>
> >>> would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was
> >>> changed to disable false positives for code like:
> >>>
> >>>   pagefault_disable();
> >>>   rc = copy_to_user(...);
> >>>   pagefault_enable();
> >>>
> >>> Whereby the caller wants to deal with failures.
> >>
> >> hm, that was a significant screwup.  I wonder how many bugs we
> >> subsequently added.
> > 
> > So I'm wondering what the motivation was to allow things like:
> > 
> >    pagefault_disable();
> >    rc = copy_to_user(...);
> >    pagefault_enable();
> > 
> > and to declare it a false positive?
> > 
> > AFAICS most uses are indeed atomic:
> > 
> >         pagefault_disable();
> >         ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
> >         pagefault_enable();
> > 
> > so why not make it explicitly atomic again?
> 
> Hmm, I am probably misreading that, but it sounds as if you suggest going back
> to David's first proposal
> https://lkml.org/lkml/2014/11/25/436
> which makes might_fault() also contain might_sleep(). Correct?

Yes, but I'm wondering what I'm missing: is there any deep reason for 
making pagefault-disabled sections non-atomic?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 10:50       ` David Hildenbrand
@ 2015-05-07 11:12         ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 11:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Thu, May 07, 2015 at 12:50:53PM +0200, David Hildenbrand wrote:
> Just to make sure we have a common understanding (as written in my cover
> letter):
> 
> Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
> there is no preempt counter, in_atomic() won't work. 

But there is, we _always_ have a preempt_count, and irq_enter() et al.
_always_ increment the relevant bits.

The thread_info::preempt_count field is never conditional on PREEMPT_COUNT;
include/asm-generic/preempt.h provides the accessors regardless of
PREEMPT_COUNT.

See how __irq_enter() -> preempt_count_add(HARDIRQ_OFFSET) ->
__preempt_count_add() _always_ just works.

It's only things like preempt_disable() / preempt_enable() that get
munged depending on PREEMPT_COUNT/PREEMPT.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 10:50       ` David Hildenbrand
@ 2015-05-07 11:12         ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2015-05-07 11:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Peter Zijlstra, linux-kernel, mingo, yang.shi, bigeasy, benh,
	paulus, akpm, heiko.carstens, schwidefsky, borntraeger, mst,
	tglx, David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch


* David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:

> > AFAICR we did this to avoid having to do both:
> > 
> > 	preempt_disable();
> > 	pagefault_disable();
> > 
> > in a fair number of places -- just like this patch-set does, this is
> > touching two cachelines where one would have been enough.
> > 
> > Also, removing in_atomic() from fault handlers like you did
> > significantly changes semantics for interrupts (soft, hard and NMI).
> > 
> > So while I agree with most of these patches, I'm very hesitant on the
> > above little detail.
> 
> Just to make sure we have a common understanding (as written in my 
> cover letter):
> 
> Your suggestion won't work with !CONFIG_PREEMPT 
> (!CONFIG_PREEMPT_COUNT). If there is no preempt counter, in_atomic() 
> won't work. So doing a preempt_disable() instead of a 
> pagefault_disable() is not going to work. (not sure how -RT handles 
> that - most probably with CONFIG_PREEMPT_COUNT being enabled, due to 
> atomic debug).
> 
> That's why I dropped that check for a reason.

So, what's the point of disabling the preempt counter?

Looks like the much simpler (and faster) solution would be to 
eliminate CONFIG_PREEMPT_COUNT (i.e. make it always available), and 
use it for pagefault-disable.

> This patchset is about decoupling both concepts (not ending up with
> two mechanisms doing almost the same).

So that's really backwards: just because we might not have a handy 
counter we introduce _another one_, and duplicate checks for it ;-)

Why not keep a single counter, if indeed what we care about most in 
the pagefault_disable() case is atomicity?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:12         ` Peter Zijlstra
@ 2015-05-07 11:23           ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 11:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

> On Thu, May 07, 2015 at 12:50:53PM +0200, David Hildenbrand wrote:
> > Just to make sure we have a common understanding (as written in my cover
> > letter):
> > 
> > Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
> > there is no preempt counter, in_atomic() won't work. 
> 
> But there is, we _always_ have a preempt_count, and irq_enter() et al.
> _always_ increment the relevant bits.
> 
> The thread_info::preempt_count field is never conditional on PREEMPT_COUNT;
> include/asm-generic/preempt.h provides the accessors regardless of
> PREEMPT_COUNT.
> 
> See how __irq_enter() -> preempt_count_add(HARDIRQ_OFFSET) ->
> __preempt_count_add() _always_ just works.
> 
> It's only things like preempt_disable() / preempt_enable() that get
> munged depending on PREEMPT_COUNT/PREEMPT.
> 

Sorry for the confusion. Sure, there is always the count.

My point is that preempt_disable() won't result in an in_atomic() == true
with !PREEMPT_COUNT, so I don't see any point in adding it to the pagefault
handlers. It is not reliable.


David


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:23           ` David Hildenbrand
@ 2015-05-07 11:25             ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2015-05-07 11:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Peter Zijlstra, linux-kernel, mingo, yang.shi, bigeasy, benh,
	paulus, akpm, heiko.carstens, schwidefsky, borntraeger, mst,
	tglx, David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch


* David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:

> > On Thu, May 07, 2015 at 12:50:53PM +0200, David Hildenbrand wrote:
> > > Just to make sure we have a common understanding (as written in my cover
> > > letter):
> > > 
> > > Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
> > > there is no preempt counter, in_atomic() won't work. 
> > 
> > But there is, we _always_ have a preempt_count, and irq_enter() et al.
> > _always_ increment the relevant bits.
> > 
> > The thread_info::preempt_count field is never conditional on PREEMPT_COUNT;
> > include/asm-generic/preempt.h provides the accessors regardless of
> > PREEMPT_COUNT.
> > 
> > See how __irq_enter() -> preempt_count_add(HARDIRQ_OFFSET) ->
> > __preempt_count_add() _always_ just works.
> > 
> > It's only things like preempt_disable() / preempt_enable() that get
> > munged depending on PREEMPT_COUNT/PREEMPT.
> > 
> 
> Sorry for the confusion. Sure, there is always the count.
> 
> My point is that preempt_disable() won't result in an in_atomic() == true
> with !PREEMPT_COUNT, so I don't see any point in adding it to the pagefault
> handlers. It is not reliable.

That's why we have the preempt_count_inc()/dec() methods that are 
always available.

So where's the problem?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:25             ` Ingo Molnar
@ 2015-05-07 11:30               ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 11:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, mingo, yang.shi, bigeasy, benh,
	paulus, akpm, heiko.carstens, schwidefsky, borntraeger, mst,
	tglx, David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

> 
> * David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:
> 
> > > On Thu, May 07, 2015 at 12:50:53PM +0200, David Hildenbrand wrote:
> > > > Just to make sure we have a common understanding (as written in my cover
> > > > letter):
> > > > 
> > > > Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
> > > > there is no preempt counter, in_atomic() won't work. 
> > > 
> > > But there is, we _always_ have a preempt_count, and irq_enter() et al.
> > > _always_ increment the relevant bits.
> > > 
> > > The thread_info::preempt_count field is never conditional on PREEMPT_COUNT;
> > > include/asm-generic/preempt.h provides the accessors regardless of
> > > PREEMPT_COUNT.
> > > 
> > > See how __irq_enter() -> preempt_count_add(HARDIRQ_OFFSET) ->
> > > __preempt_count_add() _always_ just works.
> > > 
> > > It's only things like preempt_disable() / preempt_enable() that get
> > > munged depending on PREEMPT_COUNT/PREEMPT.
> > > 
> > 
> > Sorry for the confusion. Sure, there is always the count.
> > 
> > My point is that preempt_disable() won't result in an in_atomic() == true
> > with !PREEMPT_COUNT, so I don't see any point in adding it to the pagefault
> > handlers. It is not reliable.
> 
> That's why we have the preempt_count_inc()/dec() methods that are 
> always available.
> 
> So where's the problem?


My point:

Getting rid of PREEMPT_COUNT (and therefore always doing
preempt_count_inc()/dec()) will make preempt_disable() __never__ be a NOP.

So with !CONFIG_PREEMPT we will do preemption stuff that is simply not needed.

Two concepts that share one mechanism. I think this is broken.

David


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable()
  2015-05-07 11:08         ` Ingo Molnar
@ 2015-05-07 11:40           ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 11:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christian Borntraeger, Andrew Morton, David Hildenbrand,
	linux-kernel, mingo, yang.shi, bigeasy, benh, paulus,
	heiko.carstens, schwidefsky, mst, tglx, David.Laight, hughd,
	hocko, ralf, herbert, linux, airlied, daniel.vetter, linux-mm,
	linux-arch

On Thu, May 07, 2015 at 01:08:28PM +0200, Ingo Molnar wrote:
> Yes, but I'm wondering what I'm missing: is there any deep reason for 
> making pagefaults-disabled sections non-atomic?

This all comes from -rt, where we had significant latencies due to these
things.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:12         ` Peter Zijlstra
@ 2015-05-07 11:40           ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 11:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

> On Thu, May 07, 2015 at 12:50:53PM +0200, David Hildenbrand wrote:
> > Just to make sure we have a common understanding (as written in my cover
> > letter):
> > 
> > Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
> > there is no preempt counter, in_atomic() won't work. 
> 
> But there is, we _always_ have a preempt_count, and irq_enter() et al.
> _always_ increment the relevant bits.
> 
> The thread_info::preempt_count field is never conditional on PREEMPT_COUNT;
> include/asm-generic/preempt.h provides the accessors regardless of
> PREEMPT_COUNT.
> 
> See how __irq_enter() -> preempt_count_add(HARDIRQ_OFFSET) ->
> __preempt_count_add() _always_ just works.


Okay, thinking about this further, I think I got your point. That basically
means the in_atomic() check makes sense for IRQs.

But in my opinion, it does not help to replace

preempt_disable()
pagefault_disable()

by

preempt_disable()


(as discussed because of the PREEMPT_COUNT stuff)

So I agree that we should add it, so as not to mess with hard/soft IRQs.

> 
> It's only things like preempt_disable() / preempt_enable() that get
> munged depending on PREEMPT_COUNT/PREEMPT.
> 

But anyhow, opinions seem to differ how to best handle that whole stuff.

I think a separate counter just makes sense, as we are dealing with two
different concepts, and we don't want to lose preempt_disable() being a NOP
for !CONFIG_PREEMPT.

I also think that

pagefault_disable()
rc = copy_from_user(...)
pagefault_enable()

is a valid use case.

So any suggestions how to continue?

Thanks!

David


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:23           ` David Hildenbrand
@ 2015-05-07 11:42             ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 11:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Thu, May 07, 2015 at 01:23:35PM +0200, David Hildenbrand wrote:
> > On Thu, May 07, 2015 at 12:50:53PM +0200, David Hildenbrand wrote:
> > > Just to make sure we have a common understanding (as written in my cover
> > > letter):
> > > 
> > > Your suggestion won't work with !CONFIG_PREEMPT (!CONFIG_PREEMPT_COUNT). If
> > > there is no preempt counter, in_atomic() won't work. 
> > 
> > But there is, we _always_ have a preempt_count, and irq_enter() et al.
> > _always_ increment the relevant bits.
> > 
> > The thread_info::preempt_count field is never conditional on PREEMPT_COUNT;
> > include/asm-generic/preempt.h provides the accessors regardless of
> > PREEMPT_COUNT.
> > 
> > See how __irq_enter() -> preempt_count_add(HARDIRQ_OFFSET) ->
> > __preempt_count_add() _always_ just works.
> > 
> > It's only things like preempt_disable() / preempt_enable() that get
> > munged depending on PREEMPT_COUNT/PREEMPT.
> > 
> 
> Sorry for the confusion. Sure, there is always the count.
> 
> My point is that preempt_disable() won't result in an in_atomic() == true
> with !PREEMPT_COUNT, so I don't see any point in adding it to the pagefault
> handlers. It is not reliable.

It _very_ reliably tells us whether we're in an interrupt! Which your patches
break.

It also very much avoids touching two cachelines in a number of places.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:40           ` David Hildenbrand
@ 2015-05-07 11:48             ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 11:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Thu, May 07, 2015 at 01:40:30PM +0200, David Hildenbrand wrote:
> I think a separate counter just makes sense, as we are dealing with two
> different concepts and we don't want to lose the preempt_disable =^ NOP
> for !CONFIG_PREEMPT.

Right, let me try and get my head on straight -- I'm so used to
PREEMPT=y being the 'hard' case :-)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:40           ` David Hildenbrand
@ 2015-05-07 11:51             ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 11:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Thu, May 07, 2015 at 01:40:30PM +0200, David Hildenbrand wrote:
> But anyhow, opinions seem to differ on how best to handle all of this.
> 
> I think a separate counter just makes sense, as we are dealing with two
> different concepts and we don't want to lose the preempt_disable =^ NOP
> for !CONFIG_PREEMPT.
> 
> I also think that
> 
> pagefault_disable()
> rc = copy_from_user()
> pagefault_enable()
> 
> is a valid use case.
> 
> So any suggestions how to continue?


static inline bool __pagefault_disabled(void)
{
	return current->pagefault_disabled;
}

static inline bool pagefault_disabled(void)
{
	return in_atomic() || __pagefault_disabled();
}

And leave the preempt_disable() + pagefault_disable() for now. You're
right in that that is clearest.

If we ever get to the point where that really is an issue, I'll try and
be clever then :-)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 11:51             ` Peter Zijlstra
@ 2015-05-07 12:14               ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 12:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

> On Thu, May 07, 2015 at 01:40:30PM +0200, David Hildenbrand wrote:
> > But anyhow, opinions seem to differ on how best to handle all of this.
> > 
> > I think a separate counter just makes sense, as we are dealing with two
> > different concepts and we don't want to lose the preempt_disable =^ NOP
> > for !CONFIG_PREEMPT.
> > 
> > I also think that
> > 
> > pagefault_disable()
> > rc = copy_from_user()
> > pagefault_enable()
> > 
> > is a valid use case.
> > 
> > So any suggestions how to continue?
> 
> 
> static inline bool __pagefault_disabled(void)
> {
> 	return current->pagefault_disabled;
> }
> 
> static inline bool pagefault_disabled(void)
> {
> 	return in_atomic() || __pagefault_disabled();
> }
> 
> And leave the preempt_disable() + pagefault_disable() for now. You're
> right in that that is clearest.
> 
> If we ever get to the point where that really is an issue, I'll try and
> be clever then :-)
> 

Thanks :), well just to make sure I got your opinion on this correctly:

1. You think that 2 counters is the way to go for now

2. You agree that we can't replace preempt_disable()+pagefault_disable() with
preempt_disable() (CONFIG_PREEMPT stuff), so we need to have them separately

3. We need in_atomic() (in the fault handlers only!) in addition to make sure we
don't mess with irq contexts (In that case I would add a good comment to that
place, describing why preempt_disable() won't help)


I think this is the right way to go because:

a) This way we don't have to modify preempt_disable() logic (including
PREEMPT_COUNT).

b) There are not that many users relying on
preempt_disable()+pagefault_disable()  (compared to pure preempt_disable() or
pagefault_disable() users), so the performance overhead of two cache lines
should be small. Users only making use of one of them should see no difference
in performance.

c) We correctly decouple preemption and pagefault logic. Therefore we can now
preempt when pagefaults are disabled, which feels right.

d) We can get some of that -rt flavor into the base kernel

e) We don't require inatomic variants in pagefault_disable() context as Ingo
suggested (For me, this now feels wrong - I had a different opinion back then
when working on the first revision of this patchset).
We would use inatomic() because preempt_disable() behaves differently with
PREEMPT_COUNT, mixing concepts at the user level.


@Ingo, do you have a strong feeling against this whole patchset/idea?


David


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 12:14               ` David Hildenbrand
@ 2015-05-07 12:27                 ` Ingo Molnar
  -1 siblings, 0 replies; 79+ messages in thread
From: Ingo Molnar @ 2015-05-07 12:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Peter Zijlstra, linux-kernel, mingo, yang.shi, bigeasy, benh,
	paulus, akpm, heiko.carstens, schwidefsky, borntraeger, mst,
	tglx, David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch


* David Hildenbrand <dahi@linux.vnet.ibm.com> wrote:

> @Ingo, do you have a strong feeling against this whole 
> patchset/idea?

No objections, sounds good to me now.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled
  2015-05-07 12:14               ` David Hildenbrand
@ 2015-05-07 12:32                 ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2015-05-07 12:32 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, mingo, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, hocko, ralf, herbert, linux, airlied,
	daniel.vetter, linux-mm, linux-arch

On Thu, May 07, 2015 at 02:14:39PM +0200, David Hildenbrand wrote:
> Thanks :), well just to make sure I got your opinion on this correctly:
> 
> 1. You think that 2 counters is the way to go for now

ack

> 2. You agree that we can't replace preempt_disable()+pagefault_disable() with
> preempt_disable() (CONFIG_PREEMPT stuff), so we need to have them separately

ack

> 3. We need in_atomic() (in the fault handlers only!) in addition to make sure we
> don't mess with irq contexts (In that case I would add a good comment to that
> place, describing why preempt_disable() won't help)

ack

> I think this is the right way to go because:
> 
> a) This way we don't have to modify preempt_disable() logic (including
> PREEMPT_COUNT).
> 
> b) There are not that many users relying on
> preempt_disable()+pagefault_disable()  (compared to pure preempt_disable() or
> pagefault_disable() users), so the performance overhead of two cache lines
> should be small. Users only making use of one of them should see no difference
> in performance.

indeed.

> c) We correctly decouple preemption and pagefault logic. Therefore we can now
> preempt when pagefaults are disabled, which feels right.

Right, that's always been the intent of introducing pagefault_disable().

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH draft] mm: use pagefault_disable() to check for disabled pagefaults in the handler
  2015-05-07 11:51             ` Peter Zijlstra
@ 2015-05-07 15:45               ` David Hildenbrand
  -1 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 15:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, ralf, herbert, linux, airlied,
	daniel.vetter, cmetcalf, linux-mm, linux-arch

---- zip ---

This will replace patch nr 6 if we have an agreement :)

So this is my suggestion to fix the irq context problem. We should not modify
pagefault_disabled() to check for in_atomic(), otherwise we can't reuse it
for might_fault()/might_sleep() checks.

This also allows to clean it up later (if ever :) ).

Thanks!

---- zip ---

Introduce faulthandler_disabled() and use it to check for irq context and
disabled pagefaults (via pagefault_disable()) in the pagefault handlers.

Please note that we keep the in_atomic() checks in place. This is needed
to detect irq context (in which case preemption is always properly disabled).

In contrast, preempt_disable() should never be used to disable pagefaults.
With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt
counter, and therefore the result of in_atomic() differs.
We validate that condition by using might_fault() checks when calling
might_sleep().

Therefore, we add a comment to faulthandler_disabled(), describing why this
is needed.

faulthandler_disabled() and pagefault_disable() are defined in
linux/uaccess.h, so let's properly add that include to all relevant files.

This patch is based on a patch from Thomas Gleixner.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/alpha/mm/fault.c      |  5 ++---
 arch/arc/mm/fault.c        |  2 +-
 arch/arm/mm/fault.c        |  2 +-
 arch/arm64/mm/fault.c      |  2 +-
 arch/avr32/mm/fault.c      |  4 ++--
 arch/cris/mm/fault.c       |  6 +++---
 arch/frv/mm/fault.c        |  4 ++--
 arch/ia64/mm/fault.c       |  4 ++--
 arch/m32r/mm/fault.c       |  8 ++++----
 arch/m68k/mm/fault.c       |  4 ++--
 arch/metag/mm/fault.c      |  2 +-
 arch/microblaze/mm/fault.c |  8 ++++----
 arch/mips/mm/fault.c       |  4 ++--
 arch/mn10300/mm/fault.c    |  4 ++--
 arch/nios2/mm/fault.c      |  2 +-
 arch/parisc/kernel/traps.c |  4 ++--
 arch/parisc/mm/fault.c     |  4 ++--
 arch/powerpc/mm/fault.c    |  9 +++++----
 arch/s390/mm/fault.c       |  2 +-
 arch/score/mm/fault.c      |  3 ++-
 arch/sh/mm/fault.c         |  5 +++--
 arch/sparc/mm/fault_32.c   |  4 ++--
 arch/sparc/mm/fault_64.c   |  4 ++--
 arch/sparc/mm/init_64.c    |  2 +-
 arch/tile/mm/fault.c       |  4 ++--
 arch/um/kernel/trap.c      |  4 ++--
 arch/unicore32/mm/fault.c  |  2 +-
 arch/x86/mm/fault.c        |  5 +++--
 arch/xtensa/mm/fault.c     |  4 ++--
 include/linux/uaccess.h    | 12 ++++++++++++
 30 files changed, 72 insertions(+), 57 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 9d0ac09..4a905bd 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -23,8 +23,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
-
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 
 extern void die_if_kernel(char *,struct pt_regs *,long, unsigned long *);
 
@@ -107,7 +106,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 
 	/* If we're in an interrupt context, or have no user context,
 	   we must not take the fault.  */
-	if (!mm || in_atomic())
+	if (!mm || faulthandler_disabled())
 		goto no_context;
 
 #ifdef CONFIG_ALPHA_LARGE_VMALLOC
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 6a2e006..d948e4e 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -86,7 +86,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 6333d9c..0d629b8 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -276,7 +276,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 96da131..0948d32 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -211,7 +211,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 * If we're in an interrupt or have no user context, we must not take
 	 * the fault.
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c
index d223a8b..c035339 100644
--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -14,11 +14,11 @@
 #include <linux/pagemap.h>
 #include <linux/kdebug.h>
 #include <linux/kprobes.h>
+#include <linux/uaccess.h>
 
 #include <asm/mmu_context.h>
 #include <asm/sysreg.h>
 #include <asm/tlb.h>
-#include <asm/uaccess.h>
 
 #ifdef CONFIG_KPROBES
 static inline int notify_page_fault(struct pt_regs *regs, int trap)
@@ -81,7 +81,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user context, we must
 	 * not take the fault...
 	 */
-	if (in_atomic() || !mm || regs->sr & SYSREG_BIT(GM))
+	if (faulthandler_disabled() || !mm || regs->sr & SYSREG_BIT(GM))
 		goto no_context;
 
 	local_irq_enable();
diff --git a/arch/cris/mm/fault.c b/arch/cris/mm/fault.c
index 83f12f2..3066d40 100644
--- a/arch/cris/mm/fault.c
+++ b/arch/cris/mm/fault.c
@@ -8,7 +8,7 @@
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/wait.h>
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 #include <arch/system.h>
 
 extern int find_fixup_code(struct pt_regs *);
@@ -109,11 +109,11 @@ do_page_fault(unsigned long address, struct pt_regs *regs,
 	info.si_code = SEGV_MAPERR;
 
 	/*
-	 * If we're in an interrupt or "atomic" operation or have no
+	 * If we're in an interrupt, have pagefaults disabled or have no
 	 * user context, we must not take the fault.
 	 */
 
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/frv/mm/fault.c b/arch/frv/mm/fault.c
index ec4917d..61d9976 100644
--- a/arch/frv/mm/fault.c
+++ b/arch/frv/mm/fault.c
@@ -19,9 +19,9 @@
 #include <linux/kernel.h>
 #include <linux/ptrace.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
-#include <asm/uaccess.h>
 #include <asm/gdb-stub.h>
 
 /*****************************************************************************/
@@ -78,7 +78,7 @@ asmlinkage void do_page_fault(int datammu, unsigned long esr0, unsigned long ear
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(__frame))
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index ba5ba7a..70b40d1 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -11,10 +11,10 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/prefetch.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/uaccess.h>
 
 extern int die(char *, struct pt_regs *, long);
 
@@ -96,7 +96,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	/*
 	 * If we're in an interrupt or have no user context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
diff --git a/arch/m32r/mm/fault.c b/arch/m32r/mm/fault.c
index e3d4d48901..8f9875b 100644
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -24,9 +24,9 @@
 #include <linux/vt_kern.h>		/* For unblank_screen() */
 #include <linux/highmem.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/m32r.h>
-#include <asm/uaccess.h>
 #include <asm/hardirq.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -111,10 +111,10 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	mm = tsk->mm;
 
 	/*
-	 * If we're in an interrupt or have no user context or are running in an
-	 * atomic region then we must not take the fault..
+	 * If we're in an interrupt or have no user context or have pagefaults
+	 * disabled then we must not take the fault.
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (error_code & ACE_USERMODE)
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index b2f04ae..6a94cdd 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -10,10 +10,10 @@
 #include <linux/ptrace.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/setup.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 extern void die_if_kernel(char *, struct pt_regs *, long);
@@ -81,7 +81,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/metag/mm/fault.c b/arch/metag/mm/fault.c
index 2de5dc6..f57edca 100644
--- a/arch/metag/mm/fault.c
+++ b/arch/metag/mm/fault.c
@@ -105,7 +105,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	mm = tsk->mm;
 
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index d46a5eb..177dfc0 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -107,14 +107,14 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	if ((error_code & 0x13) == 0x13 || (error_code & 0x11) == 0x11)
 		is_write = 0;
 
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(faulthandler_disabled() || !mm)) {
 		if (kernel_mode(regs))
 			goto bad_area_nosemaphore;
 
-		/* in_atomic() in user mode is really bad,
+		/* faulthandler_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
-		pr_emerg("Page fault in user mode with in_atomic(), mm = %p\n",
-									mm);
+		pr_emerg("Page fault in user mode with faulthandler_disabled(), mm = %p\n",
+			 mm);
 		pr_emerg("r15 = %lx  MSR = %lx\n",
 		       regs->r15, regs->msr);
 		die("Weird page fault", regs, SIGSEGV);
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 7ff8637..36c0f26 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -21,10 +21,10 @@
 #include <linux/module.h>
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
+#include <linux/uaccess.h>
 
 #include <asm/branch.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/ptrace.h>
 #include <asm/highmem.h>		/* For VMALLOC_END */
 #include <linux/kdebug.h>
@@ -94,7 +94,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/mn10300/mm/fault.c b/arch/mn10300/mm/fault.c
index 0c2cc5d..4a1d181 100644
--- a/arch/mn10300/mm/fault.c
+++ b/arch/mn10300/mm/fault.c
@@ -23,8 +23,8 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/vt_kern.h>		/* For unblank_screen() */
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 #include <asm/hardirq.h>
 #include <asm/cpu-regs.h>
@@ -168,7 +168,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long fault_code,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if ((fault_code & MMUFCR_xFC_ACCESS) == MMUFCR_xFC_ACCESS_USR)
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 0c9b6af..b51878b 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -77,7 +77,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 47ee620..6548fd1 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -26,9 +26,9 @@
 #include <linux/console.h>
 #include <linux/bug.h>
 #include <linux/ratelimit.h>
+#include <linux/uaccess.h>
 
 #include <asm/assembly.h>
-#include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/traps.h>
@@ -800,7 +800,7 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
 	     * unless pagefault_disable() was called before.
 	     */
 
-	    if (fault_space == 0 && !in_atomic())
+	    if (fault_space == 0 && !faulthandler_disabled())
 	    {
 		pdc_chassis_send_status(PDC_CHASSIS_DIRECT_PANIC);
 		parisc_terminate("Kernel Fault", regs, code, fault_address);
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index e5120e6..15503ad 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -15,8 +15,8 @@
 #include <linux/sched.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/traps.h>
 
 /* Various important other fields */
@@ -207,7 +207,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 	int fault;
 	unsigned int flags;
 
-	if (in_atomic())
+	if (pagefault_disabled())
 		goto no_context;
 
 	tsk = current;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b396868..6d53597 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -33,13 +33,13 @@
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 #include <linux/hugetlb.h>
+#include <linux/uaccess.h>
 
 #include <asm/firmware.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
 #include <asm/debug.h>
@@ -272,15 +272,16 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (!arch_irq_disabled_regs(regs))
 		local_irq_enable();
 
-	if (in_atomic() || mm == NULL) {
+	if (faulthandler_disabled() || mm == NULL) {
 		if (!user_mode(regs)) {
 			rc = SIGSEGV;
 			goto bail;
 		}
-		/* in_atomic() in user mode is really bad,
+		/* faulthandler_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
 		printk(KERN_EMERG "Page fault in user mode with "
-		       "in_atomic() = %d mm = %p\n", in_atomic(), mm);
+		       "faulthandler_disabled() = %d mm = %p\n",
+		       faulthandler_disabled(), mm);
 		printk(KERN_EMERG "NIP = %lx  MSR = %lx\n",
 		       regs->nip, regs->msr);
 		die("Weird page fault", regs, SIGSEGV);
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 76515bc..4c8f5d7 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -399,7 +399,7 @@ static inline int do_exception(struct pt_regs *regs, int access)
 	 * user context.
 	 */
 	fault = VM_FAULT_BADCONTEXT;
-	if (unlikely(!user_space_fault(regs) || in_atomic() || !mm))
+	if (unlikely(!user_space_fault(regs) || faulthandler_disabled() || !mm))
 		goto out;
 
 	address = trans_exc_code & __FAIL_ADDR_MASK;
diff --git a/arch/score/mm/fault.c b/arch/score/mm/fault.c
index 6860beb..37a6c2e 100644
--- a/arch/score/mm/fault.c
+++ b/arch/score/mm/fault.c
@@ -34,6 +34,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/ptrace.h>
+#include <linux/uaccess.h>
 
 /*
  * This routine handles page faults.  It determines the address,
@@ -73,7 +74,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
 	* If we're in an interrupt or have no user
 	* context, we must not take the fault..
 	*/
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index a58fec9..79d8276 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 #include <asm/io_trapped.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -438,9 +439,9 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running
-	 * in an atomic region then we must not take the fault:
+	 * with pagefaults disabled then we must not take the fault:
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(faulthandler_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 70d8171..c399e7b 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -21,6 +21,7 @@
 #include <linux/perf_event.h>
 #include <linux/interrupt.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -29,7 +30,6 @@
 #include <asm/setup.h>
 #include <asm/smp.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 
 #include "mm_32.h"
 
@@ -196,7 +196,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 4798232..e9268ea 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -22,12 +22,12 @@
 #include <linux/kdebug.h>
 #include <linux/percpu.h>
 #include <linux/context_tracking.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/openprom.h>
 #include <asm/oplib.h>
-#include <asm/uaccess.h>
 #include <asm/asi.h>
 #include <asm/lsu.h>
 #include <asm/sections.h>
@@ -330,7 +330,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto intr_or_no_mm;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 4ca0d6b..cee9b77 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2706,7 +2706,7 @@ void hugetlb_setup(struct pt_regs *regs)
 	struct mm_struct *mm = current->mm;
 	struct tsb_config *tp;
 
-	if (in_atomic() || !mm) {
+	if (faulthandler_disabled() || !mm) {
 		const struct exception_table_entry *entry;
 
 		entry = search_exception_tables(regs->tpc);
diff --git a/arch/tile/mm/fault.c b/arch/tile/mm/fault.c
index e83cc99..3f4f58d 100644
--- a/arch/tile/mm/fault.c
+++ b/arch/tile/mm/fault.c
@@ -354,9 +354,9 @@ static int handle_page_fault(struct pt_regs *regs,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running in an
-	 * atomic region then we must not take the fault.
+	 * region with pagefaults disabled then we must not take the fault.
 	 */
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		vma = NULL;  /* happy compiler */
 		goto bad_area_nosemaphore;
 	}
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 8e4daf4..f9c9e5a 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -35,10 +35,10 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 	*code_out = SEGV_MAPERR;
 
 	/*
-	 * If the fault was during atomic operation, don't take the fault, just
+	 * If the fault was with pagefaults disabled, don't take the fault, just
 	 * fail.
 	 */
-	if (in_atomic())
+	if (faulthandler_disabled())
 		goto out_nosemaphore;
 
 	if (is_user)
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 0dc922d..afccef552 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -218,7 +218,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 181c53b..9dc9098 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -13,6 +13,7 @@
 #include <linux/hugetlb.h>		/* hstate_index_to_shift	*/
 #include <linux/prefetch.h>		/* prefetchw			*/
 #include <linux/context_tracking.h>	/* exception_enter(), ...	*/
+#include <linux/uaccess.h>		/* faulthandler_disabled()	*/
 
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
 #include <asm/pgalloc.h>		/* pgd_*(), ...			*/
@@ -1126,9 +1127,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running
-	 * in an atomic region then we must not take the fault:
+	 * in a region with pagefaults disabled then we must not take the fault
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(faulthandler_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 9e3571a..83a44a3 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -15,10 +15,10 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
 #include <asm/hardirq.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 DEFINE_PER_CPU(unsigned long, asid_cache) = ASID_USER_FIRST;
@@ -57,7 +57,7 @@ void do_page_fault(struct pt_regs *regs)
 	/* If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm) {
+	if (faulthandler_disabled() || !mm) {
 		bad_page_fault(regs, address, SIGSEGV);
 		return;
 	}
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 23290cc..90786d2 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -59,6 +59,18 @@ static inline void pagefault_enable(void)
  */
 #define pagefault_disabled() (current->pagefault_disabled != 0)
 
+/*
+ * The pagefault handler is in general disabled by pagefault_disable() or
+ * when in irq context (via in_atomic()).
+ *
+ * This function should only be used by the fault handlers. Other users should
+ * stick to pagefault_disabled().
+ * Please NEVER use preempt_disable() to disable the fault handler. With
+ * !CONFIG_PREEMPT_COUNT, this is like a NOP. So the handler won't be disabled.
+ * in_atomic() will report different values based on !CONFIG_PREEMPT_COUNT.
+ */
+#define faulthandler_disabled() (pagefault_disabled() || in_atomic())
+
 #ifndef ARCH_HAS_NOCACHE_UACCESS
 
 static inline unsigned long __copy_from_user_inatomic_nocache(void *to,
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH draft] mm: use pagefault_disable() to check for disabled pagefaults in the handler
@ 2015-05-07 15:45               ` David Hildenbrand
  0 siblings, 0 replies; 79+ messages in thread
From: David Hildenbrand @ 2015-05-07 15:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: dahi, mingo, peterz, yang.shi, bigeasy, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, tglx,
	David.Laight, hughd, ralf, herbert, linux, airlied,
	daniel.vetter, cmetcalf, linux-mm, linux-arch

---- zip ---

This will replace patch nr 6 if we have an agreement :)

So this is my suggestion to fix the irq context problem. We should not modify
pagefault_disabled() to check for in_atomic(), otherwise we can't reuse it
for might_fault()/might_sleep() checks.

This also allows us to clean it up later (if ever :) ).

Thanks!

---- zip ---

Introduce faulthandler_disabled() and use it to check for irq context and
disabled pagefaults (via pagefault_disable()) in the pagefault handlers.

Please note that we keep the in_atomic() checks in place. This is needed
to detect irq context (in which case preemption is always properly disabled).

In contrast, preempt_disable() should never be used to disable pagefaults.
With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt
counter, and therefore the result of in_atomic() differs.
We catch such misuse via the might_sleep() checks performed by
might_fault().

Therefore, we add a comment to faulthandler_disabled(), describing why this
is needed.

faulthandler_disabled() and pagefault_disable() are defined in
linux/uaccess.h, so let's properly add that include to all relevant files.

This patch is based on a patch from Thomas Gleixner.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/alpha/mm/fault.c      |  5 ++---
 arch/arc/mm/fault.c        |  2 +-
 arch/arm/mm/fault.c        |  2 +-
 arch/arm64/mm/fault.c      |  2 +-
 arch/avr32/mm/fault.c      |  4 ++--
 arch/cris/mm/fault.c       |  6 +++---
 arch/frv/mm/fault.c        |  4 ++--
 arch/ia64/mm/fault.c       |  4 ++--
 arch/m32r/mm/fault.c       |  8 ++++----
 arch/m68k/mm/fault.c       |  4 ++--
 arch/metag/mm/fault.c      |  2 +-
 arch/microblaze/mm/fault.c |  8 ++++----
 arch/mips/mm/fault.c       |  4 ++--
 arch/mn10300/mm/fault.c    |  4 ++--
 arch/nios2/mm/fault.c      |  2 +-
 arch/parisc/kernel/traps.c |  4 ++--
 arch/parisc/mm/fault.c     |  4 ++--
 arch/powerpc/mm/fault.c    |  9 +++++----
 arch/s390/mm/fault.c       |  2 +-
 arch/score/mm/fault.c      |  3 ++-
 arch/sh/mm/fault.c         |  5 +++--
 arch/sparc/mm/fault_32.c   |  4 ++--
 arch/sparc/mm/fault_64.c   |  4 ++--
 arch/sparc/mm/init_64.c    |  2 +-
 arch/tile/mm/fault.c       |  4 ++--
 arch/um/kernel/trap.c      |  4 ++--
 arch/unicore32/mm/fault.c  |  2 +-
 arch/x86/mm/fault.c        |  5 +++--
 arch/xtensa/mm/fault.c     |  4 ++--
 include/linux/uaccess.h    | 12 ++++++++++++
 30 files changed, 72 insertions(+), 57 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index 9d0ac09..4a905bd 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -23,8 +23,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
-
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 
 extern void die_if_kernel(char *,struct pt_regs *,long, unsigned long *);
 
@@ -107,7 +106,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 
 	/* If we're in an interrupt context, or have no user context,
 	   we must not take the fault.  */
-	if (!mm || in_atomic())
+	if (!mm || faulthandler_disabled())
 		goto no_context;
 
 #ifdef CONFIG_ALPHA_LARGE_VMALLOC
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 6a2e006..d948e4e 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -86,7 +86,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 6333d9c..0d629b8 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -276,7 +276,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 96da131..0948d32 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -211,7 +211,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	 * If we're in an interrupt or have no user context, we must not take
 	 * the fault.
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c
index d223a8b..c035339 100644
--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -14,11 +14,11 @@
 #include <linux/pagemap.h>
 #include <linux/kdebug.h>
 #include <linux/kprobes.h>
+#include <linux/uaccess.h>
 
 #include <asm/mmu_context.h>
 #include <asm/sysreg.h>
 #include <asm/tlb.h>
-#include <asm/uaccess.h>
 
 #ifdef CONFIG_KPROBES
 static inline int notify_page_fault(struct pt_regs *regs, int trap)
@@ -81,7 +81,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user context, we must
 	 * not take the fault...
 	 */
-	if (in_atomic() || !mm || regs->sr & SYSREG_BIT(GM))
+	if (faulthandler_disabled() || !mm || regs->sr & SYSREG_BIT(GM))
 		goto no_context;
 
 	local_irq_enable();
diff --git a/arch/cris/mm/fault.c b/arch/cris/mm/fault.c
index 83f12f2..3066d40 100644
--- a/arch/cris/mm/fault.c
+++ b/arch/cris/mm/fault.c
@@ -8,7 +8,7 @@
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/wait.h>
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
 #include <arch/system.h>
 
 extern int find_fixup_code(struct pt_regs *);
@@ -109,11 +109,11 @@ do_page_fault(unsigned long address, struct pt_regs *regs,
 	info.si_code = SEGV_MAPERR;
 
 	/*
-	 * If we're in an interrupt or "atomic" operation or have no
+	 * If we're in an interrupt, have pagefaults disabled or have no
 	 * user context, we must not take the fault.
 	 */
 
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/frv/mm/fault.c b/arch/frv/mm/fault.c
index ec4917d..61d9976 100644
--- a/arch/frv/mm/fault.c
+++ b/arch/frv/mm/fault.c
@@ -19,9 +19,9 @@
 #include <linux/kernel.h>
 #include <linux/ptrace.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
-#include <asm/uaccess.h>
 #include <asm/gdb-stub.h>
 
 /*****************************************************************************/
@@ -78,7 +78,7 @@ asmlinkage void do_page_fault(int datammu, unsigned long esr0, unsigned long ear
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(__frame))
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index ba5ba7a..70b40d1 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -11,10 +11,10 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/prefetch.h>
+#include <linux/uaccess.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
-#include <asm/uaccess.h>
 
 extern int die(char *, struct pt_regs *, long);
 
@@ -96,7 +96,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	/*
 	 * If we're in an interrupt or have no user context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
diff --git a/arch/m32r/mm/fault.c b/arch/m32r/mm/fault.c
index e3d4d48901..8f9875b 100644
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -24,9 +24,9 @@
 #include <linux/vt_kern.h>		/* For unblank_screen() */
 #include <linux/highmem.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/m32r.h>
-#include <asm/uaccess.h>
 #include <asm/hardirq.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -111,10 +111,10 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	mm = tsk->mm;
 
 	/*
-	 * If we're in an interrupt or have no user context or are running in an
-	 * atomic region then we must not take the fault..
+	 * If we're in an interrupt or have no user context or have pagefaults
+	 * disabled then we must not take the fault.
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (error_code & ACE_USERMODE)
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index b2f04ae..6a94cdd 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -10,10 +10,10 @@
 #include <linux/ptrace.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
 #include <asm/setup.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 extern void die_if_kernel(char *, struct pt_regs *, long);
@@ -81,7 +81,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/metag/mm/fault.c b/arch/metag/mm/fault.c
index 2de5dc6..f57edca 100644
--- a/arch/metag/mm/fault.c
+++ b/arch/metag/mm/fault.c
@@ -105,7 +105,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	mm = tsk->mm;
 
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index d46a5eb..177dfc0 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -107,14 +107,14 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	if ((error_code & 0x13) == 0x13 || (error_code & 0x11) == 0x11)
 		is_write = 0;
 
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(faulthandler_disabled() || !mm)) {
 		if (kernel_mode(regs))
 			goto bad_area_nosemaphore;
 
-		/* in_atomic() in user mode is really bad,
+		/* faulthandler_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
-		pr_emerg("Page fault in user mode with in_atomic(), mm = %p\n",
-									mm);
+		pr_emerg("Page fault in user mode with faulthandler_disabled(), mm = %p\n",
+			 mm);
 		pr_emerg("r15 = %lx  MSR = %lx\n",
 		       regs->r15, regs->msr);
 		die("Weird page fault", regs, SIGSEGV);
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 7ff8637..36c0f26 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -21,10 +21,10 @@
 #include <linux/module.h>
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
+#include <linux/uaccess.h>
 
 #include <asm/branch.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/ptrace.h>
 #include <asm/highmem.h>		/* For VMALLOC_END */
 #include <linux/kdebug.h>
@@ -94,7 +94,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/mn10300/mm/fault.c b/arch/mn10300/mm/fault.c
index 0c2cc5d..4a1d181 100644
--- a/arch/mn10300/mm/fault.c
+++ b/arch/mn10300/mm/fault.c
@@ -23,8 +23,8 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/vt_kern.h>		/* For unblank_screen() */
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 #include <asm/hardirq.h>
 #include <asm/cpu-regs.h>
@@ -168,7 +168,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long fault_code,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if ((fault_code & MMUFCR_xFC_ACCESS) == MMUFCR_xFC_ACCESS_USR)
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 0c9b6af..b51878b 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -77,7 +77,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 47ee620..6548fd1 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -26,9 +26,9 @@
 #include <linux/console.h>
 #include <linux/bug.h>
 #include <linux/ratelimit.h>
+#include <linux/uaccess.h>
 
 #include <asm/assembly.h>
-#include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/traps.h>
@@ -800,7 +800,7 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
 	     * unless pagefault_disable() was called before.
 	     */
 
-	    if (fault_space == 0 && !in_atomic())
+	    if (fault_space == 0 && !faulthandler_disabled())
 	    {
 		pdc_chassis_send_status(PDC_CHASSIS_DIRECT_PANIC);
 		parisc_terminate("Kernel Fault", regs, code, fault_address);
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index e5120e6..15503ad 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -15,8 +15,8 @@
 #include <linux/sched.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 
-#include <asm/uaccess.h>
 #include <asm/traps.h>
 
 /* Various important other fields */
@@ -207,7 +207,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 	int fault;
 	unsigned int flags;
 
-	if (in_atomic())
+	if (pagefault_disabled())
 		goto no_context;
 
 	tsk = current;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b396868..6d53597 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -33,13 +33,13 @@
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 #include <linux/hugetlb.h>
+#include <linux/uaccess.h>
 
 #include <asm/firmware.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
-#include <asm/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
 #include <asm/debug.h>
@@ -272,15 +272,16 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (!arch_irq_disabled_regs(regs))
 		local_irq_enable();
 
-	if (in_atomic() || mm == NULL) {
+	if (faulthandler_disabled() || mm == NULL) {
 		if (!user_mode(regs)) {
 			rc = SIGSEGV;
 			goto bail;
 		}
-		/* in_atomic() in user mode is really bad,
+		/* faulthandler_disabled() in user mode is really bad,
 		   as is current->mm == NULL. */
 		printk(KERN_EMERG "Page fault in user mode with "
-		       "in_atomic() = %d mm = %p\n", in_atomic(), mm);
+		       "faulthandler_disabled() = %d mm = %p\n",
+		       faulthandler_disabled(), mm);
 		printk(KERN_EMERG "NIP = %lx  MSR = %lx\n",
 		       regs->nip, regs->msr);
 		die("Weird page fault", regs, SIGSEGV);
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 76515bc..4c8f5d7 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -399,7 +399,7 @@ static inline int do_exception(struct pt_regs *regs, int access)
 	 * user context.
 	 */
 	fault = VM_FAULT_BADCONTEXT;
-	if (unlikely(!user_space_fault(regs) || in_atomic() || !mm))
+	if (unlikely(!user_space_fault(regs) || faulthandler_disabled() || !mm))
 		goto out;
 
 	address = trans_exc_code & __FAIL_ADDR_MASK;
diff --git a/arch/score/mm/fault.c b/arch/score/mm/fault.c
index 6860beb..37a6c2e 100644
--- a/arch/score/mm/fault.c
+++ b/arch/score/mm/fault.c
@@ -34,6 +34,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/ptrace.h>
+#include <linux/uaccess.h>
 
 /*
  * This routine handles page faults.  It determines the address,
@@ -73,7 +74,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
 	* If we're in an interrupt or have no user
 	* context, we must not take the fault..
 	*/
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto bad_area_nosemaphore;
 
 	if (user_mode(regs))
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index a58fec9..79d8276 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -17,6 +17,7 @@
 #include <linux/kprobes.h>
 #include <linux/perf_event.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 #include <asm/io_trapped.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
@@ -438,9 +439,9 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running
-	 * in an atomic region then we must not take the fault:
+	 * with pagefaults disabled then we must not take the fault:
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(faulthandler_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 70d8171..c399e7b 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -21,6 +21,7 @@
 #include <linux/perf_event.h>
 #include <linux/interrupt.h>
 #include <linux/kdebug.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -29,7 +30,6 @@
 #include <asm/setup.h>
 #include <asm/smp.h>
 #include <asm/traps.h>
-#include <asm/uaccess.h>
 
 #include "mm_32.h"
 
@@ -196,7 +196,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (pagefault_disabled() || !mm)
 		goto no_context;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 4798232..e9268ea 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -22,12 +22,12 @@
 #include <linux/kdebug.h>
 #include <linux/percpu.h>
 #include <linux/context_tracking.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/openprom.h>
 #include <asm/oplib.h>
-#include <asm/uaccess.h>
 #include <asm/asi.h>
 #include <asm/lsu.h>
 #include <asm/sections.h>
@@ -330,7 +330,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto intr_or_no_mm;
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 4ca0d6b..cee9b77 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2706,7 +2706,7 @@ void hugetlb_setup(struct pt_regs *regs)
 	struct mm_struct *mm = current->mm;
 	struct tsb_config *tp;
 
-	if (in_atomic() || !mm) {
+	if (faulthandler_disabled() || !mm) {
 		const struct exception_table_entry *entry;
 
 		entry = search_exception_tables(regs->tpc);
diff --git a/arch/tile/mm/fault.c b/arch/tile/mm/fault.c
index e83cc99..3f4f58d 100644
--- a/arch/tile/mm/fault.c
+++ b/arch/tile/mm/fault.c
@@ -354,9 +354,9 @@ static int handle_page_fault(struct pt_regs *regs,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running in an
-	 * atomic region then we must not take the fault.
+	 * region with pagefaults disabled then we must not take the fault.
 	 */
-	if (in_atomic() || !mm) {
+	if (pagefault_disabled() || !mm) {
 		vma = NULL;  /* happy compiler */
 		goto bad_area_nosemaphore;
 	}
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 8e4daf4..f9c9e5a 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -35,10 +35,10 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 	*code_out = SEGV_MAPERR;
 
 	/*
-	 * If the fault was during atomic operation, don't take the fault, just
+	 * If the fault was with pagefaults disabled, don't take the fault, just
 	 * fail.
 	 */
-	if (in_atomic())
+	if (faulthandler_disabled())
 		goto out_nosemaphore;
 
 	if (is_user)
diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c
index 0dc922d..afccef552 100644
--- a/arch/unicore32/mm/fault.c
+++ b/arch/unicore32/mm/fault.c
@@ -218,7 +218,7 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm)
+	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
 	if (user_mode(regs))
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 181c53b..9dc9098 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -13,6 +13,7 @@
 #include <linux/hugetlb.h>		/* hstate_index_to_shift	*/
 #include <linux/prefetch.h>		/* prefetchw			*/
 #include <linux/context_tracking.h>	/* exception_enter(), ...	*/
+#include <linux/uaccess.h>		/* faulthandler_disabled()	*/
 
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
 #include <asm/pgalloc.h>		/* pgd_*(), ...			*/
@@ -1126,9 +1127,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 
 	/*
 	 * If we're in an interrupt, have no user context or are running
-	 * in an atomic region then we must not take the fault:
+	 * in a region with pagefaults disabled then we must not take the fault
 	 */
-	if (unlikely(in_atomic() || !mm)) {
+	if (unlikely(faulthandler_disabled() || !mm)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 9e3571a..83a44a3 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -15,10 +15,10 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
 #include <asm/hardirq.h>
-#include <asm/uaccess.h>
 #include <asm/pgalloc.h>
 
 DEFINE_PER_CPU(unsigned long, asid_cache) = ASID_USER_FIRST;
@@ -57,7 +57,7 @@ void do_page_fault(struct pt_regs *regs)
 	/* If we're in an interrupt or have no user
 	 * context, we must not take the fault..
 	 */
-	if (in_atomic() || !mm) {
+	if (faulthandler_disabled() || !mm) {
 		bad_page_fault(regs, address, SIGSEGV);
 		return;
 	}
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 23290cc..90786d2 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -59,6 +59,18 @@ static inline void pagefault_enable(void)
  */
 #define pagefault_disabled() (current->pagefault_disabled != 0)
 
+/*
+ * The pagefault handler is in general disabled by pagefault_disable() or
+ * when in irq context (via in_atomic()).
+ *
+ * This function should only be used by the fault handlers. Other users should
+ * stick to pagefault_disabled().
+ * Please NEVER use preempt_disable() to disable the fault handler. With
+ * !CONFIG_PREEMPT_COUNT, this is like a NOP. So the handler won't be disabled.
+ * in_atomic() will report different values based on !CONFIG_PREEMPT_COUNT.
+ */
+#define faulthandler_disabled() (pagefault_disabled() || in_atomic())
+
 #ifndef ARCH_HAS_NOCACHE_UACCESS
 
 static inline unsigned long __copy_from_user_inatomic_nocache(void *to,
-- 
2.1.4



Thread overview: 79+ messages
2015-05-06 17:50 [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable() David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled David Hildenbrand
2015-05-07 10:22   ` Peter Zijlstra
2015-05-07 10:50     ` David Hildenbrand
2015-05-07 11:12       ` Peter Zijlstra
2015-05-07 11:23         ` David Hildenbrand
2015-05-07 11:25           ` Ingo Molnar
2015-05-07 11:30             ` David Hildenbrand
2015-05-07 11:42           ` Peter Zijlstra
2015-05-07 11:40         ` David Hildenbrand
2015-05-07 11:48           ` Peter Zijlstra
2015-05-07 11:51           ` Peter Zijlstra
2015-05-07 12:14             ` David Hildenbrand
2015-05-07 12:27               ` Ingo Molnar
2015-05-07 12:32               ` Peter Zijlstra
2015-05-07 15:45             ` [PATCH draft] mm: use pagefault_disable() to check for disabled pagefaults in the handler David Hildenbrand
2015-05-07 11:12       ` [PATCH RFC 01/15] uaccess: count pagefault_disable() levels in pagefault_disabled Ingo Molnar
2015-05-06 17:50 ` [PATCH RFC 02/15] mm, uaccess: trigger might_sleep() in might_fault() with disabled pagefaults David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 03/15] uaccess: clarify that uaccess may only sleep if pagefaults are enabled David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 04/15] mm: explicitly disable/enable preemption in kmap_atomic_* David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 05/15] mips: kmap_coherent relies on disabled preemption David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 06/15] mm: use pagefault_disabled() to check for disabled pagefaults David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 07/15] drm/i915: " David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 08/15] futex: UP futex_atomic_op_inuser() relies on disabled preemption David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 09/15] futex: UP futex_atomic_cmpxchg_inatomic() " David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 10/15] arm/futex: " David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 11/15] arm/futex: UP futex_atomic_op_inuser() " David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 12/15] futex: clarify that preemption doesn't have to be disabled David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 13/15] powerpc: enable_kernel_altivec() requires disabled preemption David Hildenbrand
2015-05-07  0:21   ` Benjamin Herrenschmidt
2015-05-06 17:50 ` [PATCH RFC 14/15] mips: properly lock access to the fpu David Hildenbrand
2015-05-06 17:50 ` [PATCH RFC 15/15] uaccess: decouple preemption from the pagefault logic David Hildenbrand
2015-05-06 22:01 ` [PATCH RFC 00/15] decouple pagefault_disable() from preempt_disable() Andrew Morton
2015-05-07  6:23   ` David Hildenbrand
2015-05-07  9:48   ` Ingo Molnar
2015-05-07 10:51     ` Christian Borntraeger
2015-05-07 11:08       ` Ingo Molnar
2015-05-07 11:40         ` Peter Zijlstra
