linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
@ 2014-12-10 14:23 David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 1/5] uaccess: add pagefault_count to thread_info David Hildenbrand
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: David Hildenbrand @ 2014-12-10 14:23 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: benh, paulus, akpm, heiko.carstens, dahi, schwidefsky,
	borntraeger, mst, tglx, David.Laight, peterz, hughd, hocko

v1 -> v2:
- moved pagefault_count to the end of thread_info for all archs that would have
  required manually calculating asm-offsets - to keep changes minimal.
- remove unlikely() from "mm, uaccess: trigger might_sleep() in" and keep
  changes minimal (in_atomic() -> pagefault_disabled())

----

I recently discovered that might_fault() doesn't call might_sleep() anymore.
Therefore bugs like:
	spin_lock(&lock);
	rc = copy_to_user(...);
	spin_unlock(&lock);
would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was changed to
disable false positives for code like:
	pagefault_disable();
	rc = copy_to_user(...);
	pagefault_enable();

Until now, pagefault_disable() and pagefault_enable() simply modified the
preempt count, therefore telling the pagefault handler that the context is
atomic and sleeping is disallowed.

In order to reenable might_sleep() checks for the correct path, we need a way to
detect whether we run in a pagefault_disable() context.

This series therefore introduces a separate pagefault_count and uses it to count
the levels of pagefault_disable() per thread. might_sleep() checks are
reactivated for the !pagefault_disable() path.

So this should now work:
	spin_lock(&lock); /* also if left away */
	pagefault_disable()
	rc = copy_to_user(...);
	pagefault_enable();
	spin_unlock(&lock);
And this should report a warning again:
	spin_lock(&lock);
	rc = copy_to_user(...);
	spin_unlock(&lock);

Please note that this series will not completely split the handling of
pagefault_disable() and the preempt count. This will be done in another series.
Purpose of this series is to reenable might_sleep() checks for might_fault(),
avoiding to produce false positives.

Cross compiled on powerpc, arm, sparc, sparc64, arm64, x86_64, i386, mips,
alpha, ia64, xtensa, m68k, microblaze.

Tested on s390.


David Hildenbrand (5):
  uaccess: add pagefault_count to thread_info
  uaccess: count pagefault_disable() levels in pagefault_count
  mm, uaccess: trigger might_sleep() in might_fault() when pagefaults
    are disabled
  uaccess: clarify that uaccess may only sleep if pagefaults are not
    disabled
  uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count

 arch/alpha/include/asm/thread_info.h      |  1 +
 arch/arc/include/asm/thread_info.h        |  1 +
 arch/arm/include/asm/thread_info.h        |  1 +
 arch/arm64/include/asm/thread_info.h      |  1 +
 arch/avr32/include/asm/thread_info.h      |  1 +
 arch/avr32/include/asm/uaccess.h          | 12 +++++---
 arch/blackfin/include/asm/thread_info.h   |  1 +
 arch/c6x/include/asm/thread_info.h        |  1 +
 arch/cris/include/asm/thread_info.h       |  1 +
 arch/frv/include/asm/thread_info.h        |  1 +
 arch/hexagon/include/asm/thread_info.h    |  1 +
 arch/hexagon/include/asm/uaccess.h        |  3 +-
 arch/ia64/include/asm/thread_info.h       |  1 +
 arch/m32r/include/asm/thread_info.h       |  1 +
 arch/m32r/include/asm/uaccess.h           | 30 ++++++++++++------
 arch/m68k/include/asm/thread_info.h       |  1 +
 arch/metag/include/asm/thread_info.h      |  1 +
 arch/microblaze/include/asm/thread_info.h |  1 +
 arch/microblaze/include/asm/uaccess.h     |  6 ++--
 arch/mips/include/asm/thread_info.h       |  1 +
 arch/mips/include/asm/uaccess.h           | 45 ++++++++++++++++++---------
 arch/mn10300/include/asm/thread_info.h    |  1 +
 arch/openrisc/include/asm/thread_info.h   |  1 +
 arch/parisc/include/asm/thread_info.h     |  1 +
 arch/powerpc/include/asm/thread_info.h    |  1 +
 arch/s390/include/asm/thread_info.h       |  1 +
 arch/s390/include/asm/uaccess.h           | 15 ++++++---
 arch/score/include/asm/thread_info.h      |  1 +
 arch/score/include/asm/uaccess.h          | 15 ++++++---
 arch/sh/include/asm/thread_info.h         |  1 +
 arch/sparc/include/asm/thread_info_32.h   |  1 +
 arch/sparc/include/asm/thread_info_64.h   |  1 +
 arch/tile/include/asm/thread_info.h       |  1 +
 arch/tile/include/asm/uaccess.h           | 21 ++++++++-----
 arch/um/include/asm/thread_info.h         |  1 +
 arch/unicore32/include/asm/thread_info.h  |  1 +
 arch/x86/include/asm/thread_info.h        |  1 +
 arch/x86/include/asm/uaccess.h            | 15 ++++++---
 arch/x86/include/asm/uaccess_32.h         |  6 ++--
 arch/x86/lib/usercopy_32.c                |  6 ++--
 arch/xtensa/include/asm/thread_info.h     |  1 +
 include/linux/kernel.h                    |  3 +-
 include/linux/uaccess.h                   | 51 ++++++++++++++++++++++++++-----
 lib/Kconfig.debug                         |  9 ++++++
 lib/strnlen_user.c                        |  6 ++--
 mm/maccess.c                              | 11 +++++++
 mm/memory.c                               | 18 ++++-------
 47 files changed, 222 insertions(+), 80 deletions(-)

-- 
1.8.5.5


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/5] uaccess: add pagefault_count to thread_info
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
@ 2014-12-10 14:23 ` David Hildenbrand
  2014-12-15 10:07   ` LF.Tan
  2014-12-10 14:23 ` [PATCH v2 2/5] uaccess: count pagefault_disable() levels in pagefault_count David Hildenbrand
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2014-12-10 14:23 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: benh, paulus, akpm, heiko.carstens, dahi, schwidefsky,
	borntraeger, mst, tglx, David.Laight, peterz, hughd, hocko

This patch adds the pagefault_count to the thread_info of all
architectures. It will be used to count the pagefault_disable() levels
on a per-thread basis.

We are not reusing the preempt_count as this is per cpu on x86 and we want to
demangle pagefault_disable() from preemption in the future.

The new counter is added directly below the preempt_count, except for archs
relying on a manual calculation of asm offsets - to minimize the changes.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/alpha/include/asm/thread_info.h      | 1 +
 arch/arc/include/asm/thread_info.h        | 1 +
 arch/arm/include/asm/thread_info.h        | 1 +
 arch/arm64/include/asm/thread_info.h      | 1 +
 arch/avr32/include/asm/thread_info.h      | 1 +
 arch/blackfin/include/asm/thread_info.h   | 1 +
 arch/c6x/include/asm/thread_info.h        | 1 +
 arch/cris/include/asm/thread_info.h       | 1 +
 arch/frv/include/asm/thread_info.h        | 1 +
 arch/hexagon/include/asm/thread_info.h    | 1 +
 arch/ia64/include/asm/thread_info.h       | 1 +
 arch/m32r/include/asm/thread_info.h       | 1 +
 arch/m68k/include/asm/thread_info.h       | 1 +
 arch/metag/include/asm/thread_info.h      | 1 +
 arch/microblaze/include/asm/thread_info.h | 1 +
 arch/mips/include/asm/thread_info.h       | 1 +
 arch/mn10300/include/asm/thread_info.h    | 1 +
 arch/openrisc/include/asm/thread_info.h   | 1 +
 arch/parisc/include/asm/thread_info.h     | 1 +
 arch/powerpc/include/asm/thread_info.h    | 1 +
 arch/s390/include/asm/thread_info.h       | 1 +
 arch/score/include/asm/thread_info.h      | 1 +
 arch/sh/include/asm/thread_info.h         | 1 +
 arch/sparc/include/asm/thread_info_32.h   | 1 +
 arch/sparc/include/asm/thread_info_64.h   | 1 +
 arch/tile/include/asm/thread_info.h       | 1 +
 arch/um/include/asm/thread_info.h         | 1 +
 arch/unicore32/include/asm/thread_info.h  | 1 +
 arch/x86/include/asm/thread_info.h        | 1 +
 arch/xtensa/include/asm/thread_info.h     | 1 +
 30 files changed, 30 insertions(+)

diff --git a/arch/alpha/include/asm/thread_info.h b/arch/alpha/include/asm/thread_info.h
index 48bbea6..1830671 100644
--- a/arch/alpha/include/asm/thread_info.h
+++ b/arch/alpha/include/asm/thread_info.h
@@ -22,6 +22,7 @@ struct thread_info {
 	mm_segment_t		addr_limit;	/* thread address space */
 	unsigned		cpu;		/* current CPU */
 	int			preempt_count; /* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	unsigned int		status;		/* thread-synchronous flags */
 
 	int bpt_nsaved;
diff --git a/arch/arc/include/asm/thread_info.h b/arch/arc/include/asm/thread_info.h
index 02bc5ec..2fde704 100644
--- a/arch/arc/include/asm/thread_info.h
+++ b/arch/arc/include/asm/thread_info.h
@@ -41,6 +41,7 @@
 struct thread_info {
 	unsigned long flags;		/* low level flags */
 	int preempt_count;		/* 0 => preemptable, <0 => BUG */
+	int pagefault_count;		/* pagefault_disable() levels */
 	struct task_struct *task;	/* main task structure */
 	mm_segment_t addr_limit;	/* thread address space */
 	struct exec_domain *exec_domain;/* execution domain */
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index ce73ab6..bf47d2d 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -51,6 +51,7 @@ struct cpu_context_save {
 struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
+	int			pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t		addr_limit;	/* address limit */
 	struct task_struct	*task;		/* main task structure */
 	struct exec_domain	*exec_domain;	/* execution domain */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 459bf8e..2469f15 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -50,6 +50,7 @@ struct thread_info {
 	struct exec_domain	*exec_domain;	/* execution domain */
 	struct restart_block	restart_block;
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
+	int			pagefault_count;/* pagefault_disable() levels */
 	int			cpu;		/* cpu */
 };
 
diff --git a/arch/avr32/include/asm/thread_info.h b/arch/avr32/include/asm/thread_info.h
index a978f3f..0c1d6f7 100644
--- a/arch/avr32/include/asm/thread_info.h
+++ b/arch/avr32/include/asm/thread_info.h
@@ -25,6 +25,7 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	__u32			cpu;
 	__s32			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	__s32			pagefault_count;/* pagefault_disable() levels */
 	__u32			rar_saved;	/* return address... */
 	__u32			rsr_saved;	/* ...and status register
 						   saved by debug handler
diff --git a/arch/blackfin/include/asm/thread_info.h b/arch/blackfin/include/asm/thread_info.h
index 55f473b..3ba26aa 100644
--- a/arch/blackfin/include/asm/thread_info.h
+++ b/arch/blackfin/include/asm/thread_info.h
@@ -41,6 +41,7 @@ struct thread_info {
 	unsigned long flags;	/* low level flags */
 	int cpu;		/* cpu we're on */
 	int preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int pagefault_count;	/* pagefault_disable() levels */
 	mm_segment_t addr_limit;	/* address limit */
 	struct restart_block restart_block;
 #ifndef CONFIG_SMP
diff --git a/arch/c6x/include/asm/thread_info.h b/arch/c6x/include/asm/thread_info.h
index d4e9ef8..6b2dcac 100644
--- a/arch/c6x/include/asm/thread_info.h
+++ b/arch/c6x/include/asm/thread_info.h
@@ -44,6 +44,7 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			cpu;		/* cpu we're on */
 	int			preempt_count;	/* 0 = preemptable, <0 = BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t		addr_limit;	/* thread address space */
 	struct restart_block	restart_block;
 };
diff --git a/arch/cris/include/asm/thread_info.h b/arch/cris/include/asm/thread_info.h
index 55dede1..3356902 100644
--- a/arch/cris/include/asm/thread_info.h
+++ b/arch/cris/include/asm/thread_info.h
@@ -32,6 +32,7 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	__u32			cpu;		/* current CPU */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	__u32			tls;		/* TLS for this thread */
 
 	mm_segment_t		addr_limit;	/* thread address space:
diff --git a/arch/frv/include/asm/thread_info.h b/arch/frv/include/asm/thread_info.h
index af29e17..79a97ee 100644
--- a/arch/frv/include/asm/thread_info.h
+++ b/arch/frv/include/asm/thread_info.h
@@ -36,6 +36,7 @@ struct thread_info {
 	unsigned long		status;		/* thread-synchronous flags */
 	__u32			cpu;		/* current CPU */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 
 	mm_segment_t		addr_limit;	/* thread address space:
 						 * 0-0xBFFFFFFF for user-thead
diff --git a/arch/hexagon/include/asm/thread_info.h b/arch/hexagon/include/asm/thread_info.h
index a59dad3..d54042e 100644
--- a/arch/hexagon/include/asm/thread_info.h
+++ b/arch/hexagon/include/asm/thread_info.h
@@ -51,6 +51,7 @@ struct thread_info {
 	unsigned long		flags;          /* low level flags */
 	__u32                   cpu;            /* current cpu */
 	int                     preempt_count;  /* 0=>preemptible,<0=>BUG */
+	int                     pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t            addr_limit;     /* segmentation sux */
 	/*
 	 * used for syscalls somehow;
diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index 5b17418..14f128c 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -27,6 +27,7 @@ struct thread_info {
 	__u32 status;			/* Thread synchronous flags */
 	mm_segment_t addr_limit;	/* user-level address space limit */
 	int preempt_count;		/* 0=premptable, <0=BUG; will also serve as bh-counter */
+	int pagefault_count;		/* pagefault_disable() levels */
 	struct restart_block restart_block;
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	__u64 ac_stamp;
diff --git a/arch/m32r/include/asm/thread_info.h b/arch/m32r/include/asm/thread_info.h
index 0017170..bf14efb 100644
--- a/arch/m32r/include/asm/thread_info.h
+++ b/arch/m32r/include/asm/thread_info.h
@@ -35,6 +35,7 @@ struct thread_info {
 						   0-0xFFFFFFFF for kernel-thread
 						*/
 	struct restart_block    restart_block;
+	int			pagefault_count;/* pagefault_disable() levels */
 
 	__u8			supervisor_stack[0];
 };
diff --git a/arch/m68k/include/asm/thread_info.h b/arch/m68k/include/asm/thread_info.h
index 21a4784..5a6a203 100644
--- a/arch/m68k/include/asm/thread_info.h
+++ b/arch/m68k/include/asm/thread_info.h
@@ -29,6 +29,7 @@ struct thread_info {
 	struct exec_domain	*exec_domain;	/* execution domain */
 	mm_segment_t		addr_limit;	/* thread address space */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	__u32			cpu;		/* should always be 0 on m68k */
 	unsigned long		tp_value;	/* thread pointer */
 	struct restart_block    restart_block;
diff --git a/arch/metag/include/asm/thread_info.h b/arch/metag/include/asm/thread_info.h
index 4771133..91729f5 100644
--- a/arch/metag/include/asm/thread_info.h
+++ b/arch/metag/include/asm/thread_info.h
@@ -33,6 +33,7 @@ struct thread_info {
 	unsigned long status;	/* thread-synchronous flags */
 	u32 cpu;		/* current CPU */
 	int preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int pagefault_count;	/* pagefault_disable() levels */
 
 	mm_segment_t addr_limit;	/* thread address space */
 	struct restart_block restart_block;
diff --git a/arch/microblaze/include/asm/thread_info.h b/arch/microblaze/include/asm/thread_info.h
index 8c9d365..f905b02 100644
--- a/arch/microblaze/include/asm/thread_info.h
+++ b/arch/microblaze/include/asm/thread_info.h
@@ -70,6 +70,7 @@ struct thread_info {
 	unsigned long		status; /* thread-synchronous flags */
 	__u32			cpu; /* current CPU */
 	__s32			preempt_count; /* 0 => preemptable,< 0 => BUG*/
+	__s32			pagefault_count; /* pagefault_disable() levels */
 	mm_segment_t		addr_limit; /* thread address space */
 	struct restart_block	restart_block;
 
diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h
index 7de8658..f9f27ac 100644
--- a/arch/mips/include/asm/thread_info.h
+++ b/arch/mips/include/asm/thread_info.h
@@ -28,6 +28,7 @@ struct thread_info {
 	unsigned long		tp_value;	/* thread pointer */
 	__u32			cpu;		/* current CPU */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 
 	mm_segment_t		addr_limit;	/*
 						 * thread address space limit:
diff --git a/arch/mn10300/include/asm/thread_info.h b/arch/mn10300/include/asm/thread_info.h
index bf280ea..f6c03a5 100644
--- a/arch/mn10300/include/asm/thread_info.h
+++ b/arch/mn10300/include/asm/thread_info.h
@@ -45,6 +45,7 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	__u32			cpu;		/* current CPU */
 	__s32			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	__s32			pagefault_count;/* pagefault_disable() levels */
 
 	mm_segment_t		addr_limit;	/* thread address space:
 						   0-0xBFFFFFFF for user-thead
diff --git a/arch/openrisc/include/asm/thread_info.h b/arch/openrisc/include/asm/thread_info.h
index d797acc..bdabd6e 100644
--- a/arch/openrisc/include/asm/thread_info.h
+++ b/arch/openrisc/include/asm/thread_info.h
@@ -52,6 +52,7 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	__u32			cpu;		/* current CPU */
 	__s32			preempt_count; /* 0 => preemptable, <0 => BUG */
+	__s32			pagefault_count;/* pagefault_disable() levels */
 
 	mm_segment_t		addr_limit; /* thread address space:
 					       0-0x7FFFFFFF for user-thead
diff --git a/arch/parisc/include/asm/thread_info.h b/arch/parisc/include/asm/thread_info.h
index a846118..e37b76b 100644
--- a/arch/parisc/include/asm/thread_info.h
+++ b/arch/parisc/include/asm/thread_info.h
@@ -14,6 +14,7 @@ struct thread_info {
 	mm_segment_t addr_limit;	/* user-level address space limit */
 	__u32 cpu;			/* current CPU */
 	int preempt_count;		/* 0=premptable, <0=BUG; will also serve as bh-counter */
+	int pagefault_count;		/* pagefault_disable() levels */
 	struct restart_block restart_block;
 };
 
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index b034ecd..e8585fd 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -43,6 +43,7 @@ struct thread_info {
 	int		cpu;			/* cpu we're on */
 	int		preempt_count;		/* 0 => preemptable,
 						   <0 => BUG */
+	int		pagefault_count;	/* pagefault_disable() levels */
 	struct restart_block restart_block;
 	unsigned long	local_flags;		/* private flags for thread */
 
diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index 4d62fd5..bbf0513f 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -39,6 +39,7 @@ struct thread_info {
 	unsigned long		sys_call_table;	/* System call table address */
 	unsigned int		cpu;		/* current CPU */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	struct restart_block	restart_block;
 	unsigned int		system_call;
 	__u64			user_timer;
diff --git a/arch/score/include/asm/thread_info.h b/arch/score/include/asm/thread_info.h
index 656b7ad..d7f748d 100644
--- a/arch/score/include/asm/thread_info.h
+++ b/arch/score/include/asm/thread_info.h
@@ -35,6 +35,7 @@ struct thread_info {
 
 	/* 0 => preemptable, < 0 => BUG */
 	int			preempt_count;
+	int			pagefault_count;/* pagefault_disable() levels */
 
 	/*
 	 * thread address space:
diff --git a/arch/sh/include/asm/thread_info.h b/arch/sh/include/asm/thread_info.h
index ad27ffa..682a466 100644
--- a/arch/sh/include/asm/thread_info.h
+++ b/arch/sh/include/asm/thread_info.h
@@ -32,6 +32,7 @@ struct thread_info {
 	__u32			status;		/* thread synchronous flags */
 	__u32			cpu;
 	int			preempt_count; /* 0 => preemptable, <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t		addr_limit;	/* thread address space */
 	struct restart_block	restart_block;
 	unsigned long		previous_sp;	/* sp of previous stack in case
diff --git a/arch/sparc/include/asm/thread_info_32.h b/arch/sparc/include/asm/thread_info_32.h
index 025c984..ff0b112 100644
--- a/arch/sparc/include/asm/thread_info_32.h
+++ b/arch/sparc/include/asm/thread_info_32.h
@@ -49,6 +49,7 @@ struct thread_info {
 	unsigned long		w_saved;
 
 	struct restart_block	restart_block;
+	int			pagefault_count;/* pagefault_disable() levels */
 };
 
 /*
diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
index 798f027..76a60ab 100644
--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -62,6 +62,7 @@ struct thread_info {
 
 	struct pt_regs		*kern_una_regs;
 	unsigned int		kern_una_insn;
+	int			pagefault_count;/* pagefault_disable() levels */
 
 	unsigned long		fpregs[(7 * 256) / sizeof(unsigned long)]
 		__attribute__ ((aligned(64)));
diff --git a/arch/tile/include/asm/thread_info.h b/arch/tile/include/asm/thread_info.h
index 48e4fd0..57032b6 100644
--- a/arch/tile/include/asm/thread_info.h
+++ b/arch/tile/include/asm/thread_info.h
@@ -33,6 +33,7 @@ struct thread_info {
 	__u32			cpu;		/* current CPU */
 	int			preempt_count;	/* 0 => preemptable,
 						   <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 
 	mm_segment_t		addr_limit;	/* thread address space
 						   (KERNEL_DS or USER_DS) */
diff --git a/arch/um/include/asm/thread_info.h b/arch/um/include/asm/thread_info.h
index 1c5b2a8..90b193c 100644
--- a/arch/um/include/asm/thread_info.h
+++ b/arch/um/include/asm/thread_info.h
@@ -19,6 +19,7 @@ struct thread_info {
 	__u32			cpu;		/* current CPU */
 	int			preempt_count;  /* 0 => preemptable,
 						   <0 => BUG */
+	int			pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t		addr_limit;	/* thread address space:
 					 	   0-0xBFFFFFFF for user
 						   0-0xFFFFFFFF for kernel */
diff --git a/arch/unicore32/include/asm/thread_info.h b/arch/unicore32/include/asm/thread_info.h
index af36d8e..1d50fb3 100644
--- a/arch/unicore32/include/asm/thread_info.h
+++ b/arch/unicore32/include/asm/thread_info.h
@@ -69,6 +69,7 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable */
 						/* <0 => bug */
+	int			pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t		addr_limit;	/* address limit */
 	struct task_struct	*task;		/* main task structure */
 	struct exec_domain	*exec_domain;	/* execution domain */
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 547e344..fa075ab 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -30,6 +30,7 @@ struct thread_info {
 	__u32			status;		/* thread synchronous flags */
 	__u32			cpu;		/* current CPU */
 	int			saved_preempt_count;
+	int			pagefault_count;/* pagefault_disable() levels */
 	mm_segment_t		addr_limit;
 	struct restart_block    restart_block;
 	void __user		*sysenter_return;
diff --git a/arch/xtensa/include/asm/thread_info.h b/arch/xtensa/include/asm/thread_info.h
index 470153e..a866129 100644
--- a/arch/xtensa/include/asm/thread_info.h
+++ b/arch/xtensa/include/asm/thread_info.h
@@ -53,6 +53,7 @@ struct thread_info {
 	mm_segment_t		addr_limit;	/* thread address space */
 	struct restart_block    restart_block;
 
+	__s32			pagefault_count;/* pagefault_disable() levels */
 	unsigned long		cpenable;
 
 	/* Allocate storage for extra user states and coprocessor states. */
-- 
1.8.5.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 2/5] uaccess: count pagefault_disable() levels in pagefault_count
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 1/5] uaccess: add pagefault_count to thread_info David Hildenbrand
@ 2014-12-10 14:23 ` David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 3/5] mm, uaccess: trigger might_sleep() in might_fault() when pagefaults are disabled David Hildenbrand
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2014-12-10 14:23 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: benh, paulus, akpm, heiko.carstens, dahi, schwidefsky,
	borntraeger, mst, tglx, David.Laight, peterz, hughd, hocko

pagefault_disable() and pagefault_enable() have to increment/decrement the
pagefault_count. We keep manipulating the preempt count to retain compability
to existing pagefault handlers. This has to be demangled in separate patches.

It is now possible to verify whether in a pagefault_disable() envionment by
calling pagefault_disabled().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/uaccess.h | 45 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index ecd3319..1dfc678 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -2,20 +2,45 @@
 #define __LINUX_UACCESS_H__
 
 #include <linux/preempt.h>
+#include <linux/thread_info.h>
 #include <asm/uaccess.h>
 
+static __always_inline int pagefault_count(void)
+{
+	return current_thread_info()->pagefault_count;
+}
+
+static __always_inline int *pagefault_count_ptr(void)
+{
+	return &current_thread_info()->pagefault_count;
+}
+
+static __always_inline void pagefault_count_inc(void)
+{
+	(*pagefault_count_ptr())++;
+}
+
+static __always_inline void pagefault_count_dec(void)
+{
+	(*pagefault_count_ptr())--;
+}
+
 /*
- * These routines enable/disable the pagefault handler in that
- * it will not take any locks and go straight to the fixup table.
+ * These routines enable/disable the pagefault handler. If disabled, it will
+ * not take any locks and go straight to the fixup table.
+ *
+ * We increase the preempt and the pagefault count, to be able to distinguish
+ * whether we run in simple atomic context or in a real pagefault_disable()
+ * context.
+ *
+ * For now, after pagefault_disabled() has been called, we run in atomic
+ * context. User access methods will not sleep.
  *
- * They have great resemblance to the preempt_disable/enable calls
- * and in fact they are identical; this is because currently there is
- * no other way to make the pagefault handlers do this. So we do
- * disable preemption but we don't necessarily care about that.
  */
 static inline void pagefault_disable(void)
 {
 	preempt_count_inc();
+	pagefault_count_inc();
 	/*
 	 * make sure to have issued the store before a pagefault
 	 * can hit.
@@ -25,18 +50,24 @@ static inline void pagefault_disable(void)
 
 static inline void pagefault_enable(void)
 {
-#ifndef CONFIG_PREEMPT
 	/*
 	 * make sure to issue those last loads/stores before enabling
 	 * the pagefault handler again.
 	 */
 	barrier();
+	pagefault_count_dec();
+#ifndef CONFIG_PREEMPT
 	preempt_count_dec();
 #else
 	preempt_enable();
 #endif
 }
 
+/*
+ * Is the pagefault handler disabled? If so, user access methods will not sleep.
+ */
+#define pagefault_disabled() (pagefault_count() != 0)
+
 #ifndef ARCH_HAS_NOCACHE_UACCESS
 
 static inline unsigned long __copy_from_user_inatomic_nocache(void *to,
-- 
1.8.5.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 3/5] mm, uaccess: trigger might_sleep() in might_fault() when pagefaults are disabled
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 1/5] uaccess: add pagefault_count to thread_info David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 2/5] uaccess: count pagefault_disable() levels in pagefault_count David Hildenbrand
@ 2014-12-10 14:23 ` David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 4/5] uaccess: clarify that uaccess may only sleep if pagefaults are not disabled David Hildenbrand
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2014-12-10 14:23 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: benh, paulus, akpm, heiko.carstens, dahi, schwidefsky,
	borntraeger, mst, tglx, David.Laight, peterz, hughd, hocko

Commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()") removed might_sleep() checks for all user access code
(that uses might_fault()).

The reason was to disable wrong "sleep in atomic" warnings in the following
scenario:
    pagefault_disable()
    rc = copy_to_user(...)
    pagefault_enable()

Which is valid, as pagefault_disable() increments the preempt counter and
therefore disables the pagefault handler. copy_to_user() will not sleep and return
an invalid return code if a page is not available.

However, as all might_sleep() checks are removed, CONFIG_DEBUG_ATOMIC_SLEEP
would no longer detect the following scenario:
    spin_lock(&lock);
    rc = copy_to_user(...)
    spin_unlock(&lock)

If the kernel is compiled with preemption turned on, the preempt counter would
be incremented and copy_to_user() would never sleep. However, with preemption
turned off, the preempt counter will not be touched, we will therefore sleep in
atomic context. We really want to enable CONFIG_DEBUG_ATOMIC_SLEEP checks for
user access functions again, otherwise horrible deadlocks might be hard to debug.

Root of all evil is that pagefault_disable() acted almost as preempt_disable(),
depending on preemption being turned on/off.

As we now have pagefault_disabled(), we can use it to distingusih whether user
acces functions might sleep.

Convert might_fault() into a makro that calls __might_fault(), to allow proper
file + line messages in case of a might_sleep() warning. We can't move the code
directly into kernel.h for now, as that results in ugly header recursions we
can't avoid for now.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/kernel.h |  3 ++-
 mm/memory.c            | 18 ++++++------------
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 446d76a..7e65a55 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -232,7 +232,8 @@ static inline u32 reciprocal_scale(u32 val, u32 ep_ro)
 
 #if defined(CONFIG_MMU) && \
 	(defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP))
-void might_fault(void);
+#define might_fault() __might_fault(__FILE__, __LINE__)
+void __might_fault(const char *file, int line);
 #else
 static inline void might_fault(void) { }
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 0b3f6c7..563720a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3686,7 +3686,7 @@ void print_vma_addr(char *prefix, unsigned long ip)
 }
 
 #if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
-void might_fault(void)
+void __might_fault(const char *file, int line)
 {
 	/*
 	 * Some code (nfs/sunrpc) uses socket ops on kernel memory while
@@ -3696,21 +3696,15 @@ void might_fault(void)
 	 */
 	if (segment_eq(get_fs(), KERNEL_DS))
 		return;
-
-	/*
-	 * it would be nicer only to annotate paths which are not under
-	 * pagefault_disable, however that requires a larger audit and
-	 * providing helpers like get_user_atomic.
-	 */
-	if (in_atomic())
+	if (pagefault_disabled())
 		return;
-
-	__might_sleep(__FILE__, __LINE__, 0);
-
+	__might_sleep(file, line, 0);
+#if defined(CONFIG_DEBUG_ATOMIC_SLEEP)
 	if (current->mm)
 		might_lock_read(&current->mm->mmap_sem);
+#endif
 }
-EXPORT_SYMBOL(might_fault);
+EXPORT_SYMBOL(__might_fault);
 #endif
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
-- 
1.8.5.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 4/5] uaccess: clarify that uaccess may only sleep if pagefaults are not disabled
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
                   ` (2 preceding siblings ...)
  2014-12-10 14:23 ` [PATCH v2 3/5] mm, uaccess: trigger might_sleep() in might_fault() when pagefaults are disabled David Hildenbrand
@ 2014-12-10 14:23 ` David Hildenbrand
  2014-12-10 14:23 ` [PATCH v2 5/5] uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count David Hildenbrand
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2014-12-10 14:23 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: benh, paulus, akpm, heiko.carstens, dahi, schwidefsky,
	borntraeger, mst, tglx, David.Laight, peterz, hughd, hocko

In general, non-atomic variants of user access functions may not sleep if
pagefaults are disabled.

Let's update all relevant comments in uaccess code. This also refelects the
might_sleep() checks in might_fault().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 arch/avr32/include/asm/uaccess.h      | 12 ++++++----
 arch/hexagon/include/asm/uaccess.h    |  3 ++-
 arch/m32r/include/asm/uaccess.h       | 30 +++++++++++++++--------
 arch/microblaze/include/asm/uaccess.h |  6 +++--
 arch/mips/include/asm/uaccess.h       | 45 +++++++++++++++++++++++------------
 arch/s390/include/asm/uaccess.h       | 15 ++++++++----
 arch/score/include/asm/uaccess.h      | 15 ++++++++----
 arch/tile/include/asm/uaccess.h       | 21 ++++++++++------
 arch/x86/include/asm/uaccess.h        | 15 ++++++++----
 arch/x86/include/asm/uaccess_32.h     |  6 +++--
 arch/x86/lib/usercopy_32.c            |  6 +++--
 lib/strnlen_user.c                    |  6 +++--
 12 files changed, 120 insertions(+), 60 deletions(-)

diff --git a/arch/avr32/include/asm/uaccess.h b/arch/avr32/include/asm/uaccess.h
index 245b2ee..6b6dd58 100644
--- a/arch/avr32/include/asm/uaccess.h
+++ b/arch/avr32/include/asm/uaccess.h
@@ -97,7 +97,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -116,7 +117,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -136,7 +138,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -158,7 +161,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index e4127e4..af98618 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -36,7 +36,8 @@
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
diff --git a/arch/m32r/include/asm/uaccess.h b/arch/m32r/include/asm/uaccess.h
index 84fe7ba..83bfa33 100644
--- a/arch/m32r/include/asm/uaccess.h
+++ b/arch/m32r/include/asm/uaccess.h
@@ -91,7 +91,8 @@ static inline void set_fs(mm_segment_t s)
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -155,7 +156,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -175,7 +177,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -194,7 +197,8 @@ extern int fixup_exception(struct pt_regs *regs);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -274,7 +278,8 @@ do {									\
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -568,7 +573,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -588,7 +594,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -606,7 +613,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -626,7 +634,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -677,7 +686,8 @@ unsigned long clear_user(void __user *mem, unsigned long len);
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 59a89a6..53bfbb8 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -178,7 +178,8 @@ extern long __user_bad(void);
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -290,7 +291,8 @@ extern long __user_bad(void);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 22a5624..eded117 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -103,7 +103,8 @@ extern u64 __ua_limit;
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -138,7 +139,8 @@ extern u64 __ua_limit;
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -157,7 +159,8 @@ extern u64 __ua_limit;
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -177,7 +180,8 @@ extern u64 __ua_limit;
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -199,7 +203,8 @@ extern u64 __ua_limit;
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -498,7 +503,8 @@ extern void __put_user_unknown(void);
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -517,7 +523,8 @@ extern void __put_user_unknown(void);
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -537,7 +544,8 @@ extern void __put_user_unknown(void);
  * @x:	 Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -559,7 +567,8 @@ extern void __put_user_unknown(void);
  * @x:	 Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -815,7 +824,8 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -888,7 +898,8 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -1075,7 +1086,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -1107,7 +1119,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -1356,7 +1369,8 @@ static inline long __strlen_user(const char __user *s)
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
@@ -1425,7 +1439,8 @@ static inline long __strnlen_user(const char __user *s, long n)
  * strnlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index cd4c68e..1291da5 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -98,7 +98,8 @@ static inline unsigned long extable_fixup(const struct exception_table_entry *x)
  * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -118,7 +119,8 @@ unsigned long __must_check __copy_from_user(void *to, const void __user *from,
  * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
- * Context: User context only.	This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -264,7 +266,8 @@ int __get_user_bad(void) __attribute__((noreturn));
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -290,7 +293,8 @@ __compiletime_warning("copy_from_user() buffer size is not provably correct")
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.
  *
@@ -348,7 +352,8 @@ static inline unsigned long strnlen_user(const char __user *src, unsigned long n
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/score/include/asm/uaccess.h b/arch/score/include/asm/uaccess.h
index ab66ddd..010a800 100644
--- a/arch/score/include/asm/uaccess.h
+++ b/arch/score/include/asm/uaccess.h
@@ -36,7 +36,8 @@
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -61,7 +62,8 @@
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -79,7 +81,8 @@
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -98,7 +101,8 @@
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -119,7 +123,8 @@
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/tile/include/asm/uaccess.h b/arch/tile/include/asm/uaccess.h
index b6cde32..9efe668 100644
--- a/arch/tile/include/asm/uaccess.h
+++ b/arch/tile/include/asm/uaccess.h
@@ -78,7 +78,8 @@ int __range_ok(unsigned long addr, unsigned long size);
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -192,7 +193,8 @@ extern int __get_user_bad(void)
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -272,7 +274,8 @@ extern int __put_user_bad(void)
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -327,7 +330,8 @@ extern int __put_user_bad(void)
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -363,7 +367,8 @@ copy_to_user(void __user *to, const void *from, unsigned long n)
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -434,7 +439,8 @@ static inline unsigned long __must_check copy_from_user(void *to,
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to user space.  Caller must check
  * the specified blocks with access_ok() before calling this function.
@@ -466,7 +472,8 @@ copy_in_user(void __user *to, const void __user *from, unsigned long n)
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 0d592e0..014d8cb 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -74,7 +74,8 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
  * @addr: User space pointer to start of block to check
  * @size: Size of block to check
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Checks if a pointer to a block of memory in user space is valid.
  *
@@ -145,7 +146,8 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -240,7 +242,8 @@ extern void __put_user_8(void);
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
@@ -455,7 +458,8 @@ struct __large_struct { unsigned long buf[100]; };
  * @x:   Variable to store result.
  * @ptr: Source address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple variable from user space to kernel
  * space.  It supports simple types like char and int, but not larger
@@ -479,7 +483,8 @@ struct __large_struct { unsigned long buf[100]; };
  * @x:   Value to copy to user space.
  * @ptr: Destination address, in user space.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * This macro copies a single simple value from kernel space to user
  * space.  It supports simple types like char and int, but not larger
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 3c03a5d..ae5b37f 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -70,7 +70,8 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.  Caller must check
  * the specified block with access_ok() before calling this function.
@@ -117,7 +118,8 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.  Caller must check
  * the specified block with access_ok() before calling this function.
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index e2f5e21..64ba86e 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -647,7 +647,8 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
  * @from: Source address, in kernel space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from kernel space to user space.
  *
@@ -668,7 +669,8 @@ EXPORT_SYMBOL(_copy_to_user);
  * @from: Source address, in user space.
  * @n:    Number of bytes to copy.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Copy data from user space to kernel space.
  *
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index a28df52..0013abd 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -84,7 +84,8 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
  * @str: The string to measure.
  * @count: Maximum count (including NUL character)
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
@@ -113,7 +114,8 @@ EXPORT_SYMBOL(strnlen_user);
  * strlen_user: - Get the size of a user string INCLUDING final NUL.
  * @str: The string to measure.
  *
- * Context: User context only.  This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ *          disabled.
  *
  * Get the size of a NUL-terminated string in user space.
  *
-- 
1.8.5.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 5/5] uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
                   ` (3 preceding siblings ...)
  2014-12-10 14:23 ` [PATCH v2 4/5] uaccess: clarify that uaccess may only sleep if pagefaults are not disabled David Hildenbrand
@ 2014-12-10 14:23 ` David Hildenbrand
  2014-12-15 10:45 ` [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() Peter Zijlstra
  2015-01-12 14:19 ` David Hildenbrand
  6 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2014-12-10 14:23 UTC (permalink / raw)
  To: linux-kernel, linux-arch
  Cc: benh, paulus, akpm, heiko.carstens, dahi, schwidefsky,
	borntraeger, mst, tglx, David.Laight, peterz, hughd, hocko

This patch introduces CONFIG_DEBUG_PAGEFAULT_COUNT, to detect over-/underflows
in the pagefault_count resulting from a wrong usage of pagefault_enable() and
pagefault_disable().

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 include/linux/uaccess.h |  8 +++++++-
 lib/Kconfig.debug       |  9 +++++++++
 mm/maccess.c            | 11 +++++++++++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 1dfc678..6ffb90b 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -20,11 +20,17 @@ static __always_inline void pagefault_count_inc(void)
 	(*pagefault_count_ptr())++;
 }
 
-static __always_inline void pagefault_count_dec(void)
+static __always_inline void __pagefault_count_dec(void)
 {
 	(*pagefault_count_ptr())--;
 }
 
+#ifdef CONFIG_DEBUG_PAGEFAULT_COUNT
+extern void pagefault_count_dec(void);
+#else
+#define pagefault_count_dec() __pagefault_count_dec()
+#endif
+
 /*
  * These routines enable/disable the pagefault handler. If disabled, it will
  * not take any locks and go straight to the fixup table.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d780351..9dea6e0 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -860,6 +860,15 @@ config DEBUG_PREEMPT
 	  if kernel code uses it in a preemption-unsafe way. Also, the kernel
 	  will detect preemption count underflows.
 
+config DEBUG_PAGEFAULT_COUNT
+	bool "Debug pagefault_disable / pagefault_enable"
+	depends on DEBUG_KERNEL
+	default n
+	help
+	  If you say Y here then the kernel will detect pagefault count
+	  over-/underflows and therefore non-matching pagefault_enable() and
+	  pagefault_disable() calls.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
 config DEBUG_RT_MUTEXES
diff --git a/mm/maccess.c b/mm/maccess.c
index d53adf9..4b72aa1 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -5,6 +5,17 @@
 #include <linux/mm.h>
 #include <linux/uaccess.h>
 
+#ifdef CONFIG_DEBUG_PAGEFAULT_COUNT
+void pagefault_count_dec(void)
+{
+	__pagefault_count_dec();
+
+	/* underflow / previous overflow ? */
+	WARN_ON(pagefault_count() < 0);
+}
+EXPORT_SYMBOL(pagefault_count_dec);
+#endif
+
 /**
  * probe_kernel_read(): safely attempt to read from a location
  * @dst: pointer to the buffer that shall take the data
-- 
1.8.5.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info
  2014-12-10 14:23 ` [PATCH v2 1/5] uaccess: add pagefault_count to thread_info David Hildenbrand
@ 2014-12-15 10:07   ` LF.Tan
  2014-12-15 11:23     ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: LF.Tan @ 2014-12-15 10:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Linux-Arch, benh, paulus, akpm, heiko.carstens,
	schwidefsky, borntraeger, Michael S. Tsirkin, Thomas Gleixner,
	David.Laight, peterz, hughd, hocko

On Wed, Dec 10, 2014 at 10:23 PM, David Hildenbrand
<dahi@linux.vnet.ibm.com> wrote:
> This patch adds the pagefault_count to the thread_info of all
> architectures. It will be used to count the pagefault_disable() levels
> on a per-thread basis.
>
> We are not reusing the preempt_count as this is per cpu on x86 and we want to
> demangle pagefault_disable() from preemption in the future.
>
> The new counter is added directly below the preempt_count, except for archs
> relying on a manual calculation of asm offsets - to minimize the changes.
>
Hi David

Is this patchset targeted for 3.19? If yes, then we need this for
arch/nios2 as well (new arch in 3.19).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
                   ` (4 preceding siblings ...)
  2014-12-10 14:23 ` [PATCH v2 5/5] uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count David Hildenbrand
@ 2014-12-15 10:45 ` Peter Zijlstra
  2014-12-15 11:21   ` David Hildenbrand
  2015-01-12 14:19 ` David Hildenbrand
  6 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2014-12-15 10:45 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-arch, benh, paulus, akpm, heiko.carstens,
	schwidefsky, borntraeger, mst, tglx, David.Laight, hughd, hocko

On Wed, Dec 10, 2014 at 03:23:29PM +0100, David Hildenbrand wrote:

Did you look at the -rt patches where this comes from?

https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=b389ced19ab649438196d132768fe6522d2f052b
https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=4fb7f9d416f7b34036d9d1b209e77c462ac0ee10
https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=c730a4aade9e5c9b84f65de01d6612bf70c577e3
https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=d365f5bf933e988a39874b33302f02ff6c7989b7
https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=93eb18f43dfa5d49d99e2b8ad236eea2c35dd539
https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=8947442e896921e1b645f9e1fd0f2beee103bba0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2014-12-15 10:45 ` [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() Peter Zijlstra
@ 2014-12-15 11:21   ` David Hildenbrand
  2014-12-15 12:50     ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2014-12-15 11:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-arch, benh, paulus, akpm, heiko.carstens,
	schwidefsky, borntraeger, mst, tglx, David.Laight, hughd, hocko

> On Wed, Dec 10, 2014 at 03:23:29PM +0100, David Hildenbrand wrote:
> 
> Did you look at the -rt patches where this comes from?
> 
> https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=b389ced19ab649438196d132768fe6522d2f052b
> https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=4fb7f9d416f7b34036d9d1b209e77c462ac0ee10
> https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=c730a4aade9e5c9b84f65de01d6612bf70c577e3
> https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=d365f5bf933e988a39874b33302f02ff6c7989b7
> https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=93eb18f43dfa5d49d99e2b8ad236eea2c35dd539
> https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=8947442e896921e1b645f9e1fd0f2beee103bba0
> 
> 

Thanks for the links - haven't seen these patches so far (somebody on the list
just mentioned that someone tried to demangle that stuff some time ago). But
good to know that somebody is working on that pagefault_disable() thing.

Do you know what the plans for this series are? So I can base my patches
(might_sleep() checks for might_fault()) on that queue.

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info
  2014-12-15 10:07   ` LF.Tan
@ 2014-12-15 11:23     ` David Hildenbrand
  2014-12-15 12:48       ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2014-12-15 11:23 UTC (permalink / raw)
  To: LF.Tan
  Cc: linux-kernel, Linux-Arch, benh, paulus, akpm, heiko.carstens,
	schwidefsky, borntraeger, Michael S. Tsirkin, Thomas Gleixner,
	David.Laight, peterz, hughd, hocko

> On Wed, Dec 10, 2014 at 10:23 PM, David Hildenbrand
> <dahi@linux.vnet.ibm.com> wrote:
> > This patch adds the pagefault_count to the thread_info of all
> > architectures. It will be used to count the pagefault_disable() levels
> > on a per-thread basis.
> >
> > We are not reusing the preempt_count as this is per cpu on x86 and we want to
> > demangle pagefault_disable() from preemption in the future.
> >
> > The new counter is added directly below the preempt_count, except for archs
> > relying on a manual calculation of asm offsets - to minimize the changes.
> >
> Hi David
> 
> Is this patchset targeted for 3.19? If yes, then we need this for
> arch/nios2 as well (new arch in 3.19).
> 

Hi,

Peter just showed my that there is some work ongoing on that
pagefault_disable() topic. So I am not sure if the arch-specific part of this
series is still relevant.

But thanks for the info!

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info
  2014-12-15 11:23     ` David Hildenbrand
@ 2014-12-15 12:48       ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2014-12-15 12:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: LF.Tan, linux-kernel, Linux-Arch, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, Michael S. Tsirkin,
	Thomas Gleixner, David.Laight, hughd, hocko

On Mon, Dec 15, 2014 at 12:23:01PM +0100, David Hildenbrand wrote:
> Peter just showed my that there is some work ongoing on that
> pagefault_disable() topic. So I am not sure if the arch-specific part of this
> series is still relevant.

Well 'on-going' would be over stating it. We did those bits in -rt many
years ago, but never really made a strong enough effort to get them
upstream.

If there now is a 'semi' sane use-case for upstreaming them we should
indeed look at that.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2014-12-15 11:21   ` David Hildenbrand
@ 2014-12-15 12:50     ` Peter Zijlstra
  2014-12-15 13:08       ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2014-12-15 12:50 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-arch, benh, paulus, akpm, heiko.carstens,
	schwidefsky, borntraeger, mst, tglx, David.Laight, hughd, hocko

On Mon, Dec 15, 2014 at 12:21:27PM +0100, David Hildenbrand wrote:
> > On Wed, Dec 10, 2014 at 03:23:29PM +0100, David Hildenbrand wrote:
> > 
> > Did you look at the -rt patches where this comes from?
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=b389ced19ab649438196d132768fe6522d2f052b
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=4fb7f9d416f7b34036d9d1b209e77c462ac0ee10
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=c730a4aade9e5c9b84f65de01d6612bf70c577e3
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=d365f5bf933e988a39874b33302f02ff6c7989b7
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=93eb18f43dfa5d49d99e2b8ad236eea2c35dd539
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=8947442e896921e1b645f9e1fd0f2beee103bba0
> > 
> > 
> 
> Thanks for the links - haven't seen these patches so far (somebody on the list
> just mentioned that someone tried to demangle that stuff some time ago). But
> good to know that somebody is working on that pagefault_disable() thing.
> 
> Do you know what the plans for this series are? So I can base my patches
> (might_sleep() checks for might_fault()) on that queue.

As stated in that other email, there's no active work on this atm. Its
just what -rt needed the pagefault_{en,dis}able() bits for. I think we
should try and merge some of that upstream now that there is a stronger
use case.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2014-12-15 12:50     ` Peter Zijlstra
@ 2014-12-15 13:08       ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2014-12-15 13:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-arch, benh, paulus, akpm, heiko.carstens,
	schwidefsky, borntraeger, mst, tglx, David.Laight, hughd, hocko

> On Mon, Dec 15, 2014 at 12:21:27PM +0100, David Hildenbrand wrote:
> > > On Wed, Dec 10, 2014 at 03:23:29PM +0100, David Hildenbrand wrote:
> > > 
> > > Did you look at the -rt patches where this comes from?
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=b389ced19ab649438196d132768fe6522d2f052b
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=4fb7f9d416f7b34036d9d1b209e77c462ac0ee10
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=c730a4aade9e5c9b84f65de01d6612bf70c577e3
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=d365f5bf933e988a39874b33302f02ff6c7989b7
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=93eb18f43dfa5d49d99e2b8ad236eea2c35dd539
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=8947442e896921e1b645f9e1fd0f2beee103bba0
> > > 
> > > 
> > 
> > Thanks for the links - haven't seen these patches so far (somebody on the list
> > just mentioned that someone tried to demangle that stuff some time ago). But
> > good to know that somebody is working on that pagefault_disable() thing.
> > 
> > Do you know what the plans for this series are? So I can base my patches
> > (might_sleep() checks for might_fault()) on that queue.
> 
> As stated in that other email, there's no active work on this atm. Its
> just what -rt needed the pagefault_{en,dis}able() bits for. I think we
> should try and merge some of that upstream now that there is a stronger
> use case.
> 

Ah, now I get it. So the main question is which approach is better:

a) -rt version: Store the pagefault_count in struct task_struct()
b) my version: Storing it in thread_info

IOW: My series first and the -rt part (pagefault handlers, preempt fixup) on
top or -rt version first and my work (patch 3 + 4 ) on top.

Getting rid of that whole preemption handling in pagefault_disable() / fixing up
the pagefault handlers is something I would have addressed in future patches,
but that part seems to be just fine in the -rt code.

Thanks for having a look!

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
                   ` (5 preceding siblings ...)
  2014-12-15 10:45 ` [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() Peter Zijlstra
@ 2015-01-12 14:19 ` David Hildenbrand
  2015-01-30 15:52   ` Christian Borntraeger
  2015-02-09 14:42   ` Peter Zijlstra
  6 siblings, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2015-01-12 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-arch, tglx, peterz
  Cc: benh, paulus, akpm, heiko.carstens, schwidefsky, borntraeger,
	mst, David.Laight, hughd, hocko

Thomas, Peter,

anything that speaks against putting the pagefault_disable counter into
thread_info (my series) instead of task_struct (rt tree)?

IOW, what would be the right place for it?

Would be good to know for me how to proceed with this series.

Thanks!

David

> v1 -> v2:
> - moved pagefault_count to the end of thread_info for all archs that would have
>   required manually calculating asm-offsets - to keep changes minimal.
> - remove unlikely() from "mm, uaccess: trigger might_sleep() in" and keep
>   changes minimal (in_atomic() -> pagefault_disabled())
> 
> ----
> 
> I recently discovered that might_fault() doesn't call might_sleep() anymore.
> Therefore bugs like:
> 	spin_lock(&lock);
> 	rc = copy_to_user(...);
> 	spin_unlock(&lock);
> would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was changed to
> disable false positives for code like:
> 	pagefault_disable();
> 	rc = copy_to_user(...);
> 	pagefault_enable();
> 
> Until now, pagefault_disable() and pagefault_enable() simply modified the
> preempt count, therefore telling the pagefault handler that the context is
> atomic and sleeping is disallowed.
> 
> In order to reenable might_sleep() checks for the correct path, we need a way to
> detect whether we run in a pagefault_disable() context.
> 
> This series therefore introduces a separate pagefault_count and uses it to count
> the levels of pagefault_disable() per thread. might_sleep() checks are
> reactivated for the !pagefault_disable() path.
> 
> So this should now work:
> 	spin_lock(&lock); /* also if left away */
> 	pagefault_disable()
> 	rc = copy_to_user(...);
> 	pagefault_enable();
> 	spin_unlock(&lock);
> And this should report a warning again:
> 	spin_lock(&lock);
> 	rc = copy_to_user(...);
> 	spin_unlock(&lock);
> 
> Please note that this series will not completely split the handling of
> pagefault_disable() and the preempt count. This will be done in another series.
> Purpose of this series is to reenable might_sleep() checks for might_fault(),
> avoiding to produce false positives.
> 
> Cross compiled on powerpc, arm, sparc, sparc64, arm64, x86_64, i386, mips,
> alpha, ia64, xtensa, m68k, microblaze.
> 
> Tested on s390.
> 
> 
> David Hildenbrand (5):
>   uaccess: add pagefault_count to thread_info
>   uaccess: count pagefault_disable() levels in pagefault_count
>   mm, uaccess: trigger might_sleep() in might_fault() when pagefaults
>     are disabled
>   uaccess: clarify that uaccess may only sleep if pagefaults are not
>     disabled
>   uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count
> 
>  arch/alpha/include/asm/thread_info.h      |  1 +
>  arch/arc/include/asm/thread_info.h        |  1 +
>  arch/arm/include/asm/thread_info.h        |  1 +
>  arch/arm64/include/asm/thread_info.h      |  1 +
>  arch/avr32/include/asm/thread_info.h      |  1 +
>  arch/avr32/include/asm/uaccess.h          | 12 +++++---
>  arch/blackfin/include/asm/thread_info.h   |  1 +
>  arch/c6x/include/asm/thread_info.h        |  1 +
>  arch/cris/include/asm/thread_info.h       |  1 +
>  arch/frv/include/asm/thread_info.h        |  1 +
>  arch/hexagon/include/asm/thread_info.h    |  1 +
>  arch/hexagon/include/asm/uaccess.h        |  3 +-
>  arch/ia64/include/asm/thread_info.h       |  1 +
>  arch/m32r/include/asm/thread_info.h       |  1 +
>  arch/m32r/include/asm/uaccess.h           | 30 ++++++++++++------
>  arch/m68k/include/asm/thread_info.h       |  1 +
>  arch/metag/include/asm/thread_info.h      |  1 +
>  arch/microblaze/include/asm/thread_info.h |  1 +
>  arch/microblaze/include/asm/uaccess.h     |  6 ++--
>  arch/mips/include/asm/thread_info.h       |  1 +
>  arch/mips/include/asm/uaccess.h           | 45 ++++++++++++++++++---------
>  arch/mn10300/include/asm/thread_info.h    |  1 +
>  arch/openrisc/include/asm/thread_info.h   |  1 +
>  arch/parisc/include/asm/thread_info.h     |  1 +
>  arch/powerpc/include/asm/thread_info.h    |  1 +
>  arch/s390/include/asm/thread_info.h       |  1 +
>  arch/s390/include/asm/uaccess.h           | 15 ++++++---
>  arch/score/include/asm/thread_info.h      |  1 +
>  arch/score/include/asm/uaccess.h          | 15 ++++++---
>  arch/sh/include/asm/thread_info.h         |  1 +
>  arch/sparc/include/asm/thread_info_32.h   |  1 +
>  arch/sparc/include/asm/thread_info_64.h   |  1 +
>  arch/tile/include/asm/thread_info.h       |  1 +
>  arch/tile/include/asm/uaccess.h           | 21 ++++++++-----
>  arch/um/include/asm/thread_info.h         |  1 +
>  arch/unicore32/include/asm/thread_info.h  |  1 +
>  arch/x86/include/asm/thread_info.h        |  1 +
>  arch/x86/include/asm/uaccess.h            | 15 ++++++---
>  arch/x86/include/asm/uaccess_32.h         |  6 ++--
>  arch/x86/lib/usercopy_32.c                |  6 ++--
>  arch/xtensa/include/asm/thread_info.h     |  1 +
>  include/linux/kernel.h                    |  3 +-
>  include/linux/uaccess.h                   | 51 ++++++++++++++++++++++++++-----
>  lib/Kconfig.debug                         |  9 ++++++
>  lib/strnlen_user.c                        |  6 ++--
>  mm/maccess.c                              | 11 +++++++
>  mm/memory.c                               | 18 ++++-------
>  47 files changed, 222 insertions(+), 80 deletions(-)
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-01-12 14:19 ` David Hildenbrand
@ 2015-01-30 15:52   ` Christian Borntraeger
  2015-02-09 14:42   ` Peter Zijlstra
  1 sibling, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2015-01-30 15:52 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel, linux-arch, tglx, peterz
  Cc: benh, paulus, akpm, heiko.carstens, schwidefsky, mst,
	David.Laight, hughd, hocko

Am 12.01.2015 um 15:19 schrieb David Hildenbrand:
> Thomas, Peter,
> 
> anything that speaks against putting the pagefault_disable counter into
> thread_info (my series) instead of task_struct (rt tree)?
> 
> IOW, what would be the right place for it?
> 
> Would be good to know for me how to proceed with this series.

Given that I just had to debug another piece of code were the might sleep
check would have shown the problem, I want some progress here. From the
non-rt perspective thread_info looks as good as task_struct, so for
whatever its worth

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>

to the approach that has a smaller patch (to improve upstreaming and review).

If my assumption is true, rt should be able to cope with both ways and the
adoption should be small in case of thread_info.
Would be good if Thomas/Ingo can ack or nack, though

Christian

> 
> Thanks!
> 
> David
> 
>> v1 -> v2:
>> - moved pagefault_count to the end of thread_info for all archs that would have
>>   required manually calculating asm-offsets - to keep changes minimal.
>> - remove unlikely() from "mm, uaccess: trigger might_sleep() in" and keep
>>   changes minimal (in_atomic() -> pagefault_disabled())
>>
>> ----
>>
>> I recently discovered that might_fault() doesn't call might_sleep() anymore.
>> Therefore bugs like:
>> 	spin_lock(&lock);
>> 	rc = copy_to_user(...);
>> 	spin_unlock(&lock);
>> would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was changed to
>> disable false positives for code like:
>> 	pagefault_disable();
>> 	rc = copy_to_user(...);
>> 	pagefault_enable();
>>
>> Until now, pagefault_disable() and pagefault_enable() simply modified the
>> preempt count, therefore telling the pagefault handler that the context is
>> atomic and sleeping is disallowed.
>>
>> In order to reenable might_sleep() checks for the correct path, we need a way to
>> detect whether we run in a pagefault_disable() context.
>>
>> This series therefore introduces a separate pagefault_count and uses it to count
>> the levels of pagefault_disable() per thread. might_sleep() checks are
>> reactivated for the !pagefault_disable() path.
>>
>> So this should now work:
>> 	spin_lock(&lock); /* also if left away */
>> 	pagefault_disable()
>> 	rc = copy_to_user(...);
>> 	pagefault_enable();
>> 	spin_unlock(&lock);
>> And this should report a warning again:
>> 	spin_lock(&lock);
>> 	rc = copy_to_user(...);
>> 	spin_unlock(&lock);
>>
>> Please note that this series will not completely split the handling of
>> pagefault_disable() and the preempt count. This will be done in another series.
>> Purpose of this series is to reenable might_sleep() checks for might_fault(),
>> avoiding to produce false positives.
>>
>> Cross compiled on powerpc, arm, sparc, sparc64, arm64, x86_64, i386, mips,
>> alpha, ia64, xtensa, m68k, microblaze.
>>
>> Tested on s390.
>>
>>
>> David Hildenbrand (5):
>>   uaccess: add pagefault_count to thread_info
>>   uaccess: count pagefault_disable() levels in pagefault_count
>>   mm, uaccess: trigger might_sleep() in might_fault() when pagefaults
>>     are disabled
>>   uaccess: clarify that uaccess may only sleep if pagefaults are not
>>     disabled
>>   uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count
>>
>>  arch/alpha/include/asm/thread_info.h      |  1 +
>>  arch/arc/include/asm/thread_info.h        |  1 +
>>  arch/arm/include/asm/thread_info.h        |  1 +
>>  arch/arm64/include/asm/thread_info.h      |  1 +
>>  arch/avr32/include/asm/thread_info.h      |  1 +
>>  arch/avr32/include/asm/uaccess.h          | 12 +++++---
>>  arch/blackfin/include/asm/thread_info.h   |  1 +
>>  arch/c6x/include/asm/thread_info.h        |  1 +
>>  arch/cris/include/asm/thread_info.h       |  1 +
>>  arch/frv/include/asm/thread_info.h        |  1 +
>>  arch/hexagon/include/asm/thread_info.h    |  1 +
>>  arch/hexagon/include/asm/uaccess.h        |  3 +-
>>  arch/ia64/include/asm/thread_info.h       |  1 +
>>  arch/m32r/include/asm/thread_info.h       |  1 +
>>  arch/m32r/include/asm/uaccess.h           | 30 ++++++++++++------
>>  arch/m68k/include/asm/thread_info.h       |  1 +
>>  arch/metag/include/asm/thread_info.h      |  1 +
>>  arch/microblaze/include/asm/thread_info.h |  1 +
>>  arch/microblaze/include/asm/uaccess.h     |  6 ++--
>>  arch/mips/include/asm/thread_info.h       |  1 +
>>  arch/mips/include/asm/uaccess.h           | 45 ++++++++++++++++++---------
>>  arch/mn10300/include/asm/thread_info.h    |  1 +
>>  arch/openrisc/include/asm/thread_info.h   |  1 +
>>  arch/parisc/include/asm/thread_info.h     |  1 +
>>  arch/powerpc/include/asm/thread_info.h    |  1 +
>>  arch/s390/include/asm/thread_info.h       |  1 +
>>  arch/s390/include/asm/uaccess.h           | 15 ++++++---
>>  arch/score/include/asm/thread_info.h      |  1 +
>>  arch/score/include/asm/uaccess.h          | 15 ++++++---
>>  arch/sh/include/asm/thread_info.h         |  1 +
>>  arch/sparc/include/asm/thread_info_32.h   |  1 +
>>  arch/sparc/include/asm/thread_info_64.h   |  1 +
>>  arch/tile/include/asm/thread_info.h       |  1 +
>>  arch/tile/include/asm/uaccess.h           | 21 ++++++++-----
>>  arch/um/include/asm/thread_info.h         |  1 +
>>  arch/unicore32/include/asm/thread_info.h  |  1 +
>>  arch/x86/include/asm/thread_info.h        |  1 +
>>  arch/x86/include/asm/uaccess.h            | 15 ++++++---
>>  arch/x86/include/asm/uaccess_32.h         |  6 ++--
>>  arch/x86/lib/usercopy_32.c                |  6 ++--
>>  arch/xtensa/include/asm/thread_info.h     |  1 +
>>  include/linux/kernel.h                    |  3 +-
>>  include/linux/uaccess.h                   | 51 ++++++++++++++++++++++++++-----
>>  lib/Kconfig.debug                         |  9 ++++++
>>  lib/strnlen_user.c                        |  6 ++--
>>  mm/maccess.c                              | 11 +++++++
>>  mm/memory.c                               | 18 ++++-------
>>  47 files changed, 222 insertions(+), 80 deletions(-)
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-01-12 14:19 ` David Hildenbrand
  2015-01-30 15:52   ` Christian Borntraeger
@ 2015-02-09 14:42   ` Peter Zijlstra
  2015-02-19 14:48     ` David Hildenbrand
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2015-02-09 14:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

On Mon, Jan 12, 2015 at 03:19:11PM +0100, David Hildenbrand wrote:
> Thomas, Peter,
> 
> anything that speaks against putting the pagefault_disable counter into
> thread_info (my series) instead of task_struct (rt tree)?
> 
> IOW, what would be the right place for it?

I think we put it in task_struct because lazy; ARM seems one of the few
popular archs where current still goes through thread_info.

And that I think is the only reason to maybe use thread_info, cost of
access. The down-side of using thread_info is of course that it reduces
stack size.

In any case; I think that if you want to go do this; please consider the
route -rt took and completely separate the two, don't leave the
preempt_count_{inc,dec} remnant in pagefault_{en,dis}able() at all.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-02-09 14:42   ` Peter Zijlstra
@ 2015-02-19 14:48     ` David Hildenbrand
  2015-02-19 15:07       ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2015-02-19 14:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

> On Mon, Jan 12, 2015 at 03:19:11PM +0100, David Hildenbrand wrote:
> > Thomas, Peter,
> > 
> > anything that speaks against putting the pagefault_disable counter into
> > thread_info (my series) instead of task_struct (rt tree)?
> > 
> > IOW, what would be the right place for it?
> 
> I think we put it in task_struct because lazy; ARM seems one of the few
> popular archs where current still goes through thread_info.
> 
> And that I think is the only reason to maybe use thread_info, cost of
> access. The down-side of using thread_info is of course that it reduces
> stack size.
> 
> In any case; I think that if you want to go do this; please consider the
> route -rt took and completely separate the two, don't leave the
> preempt_count_{inc,dec} remnant in pagefault_{en,dis}able() at all.
> 
> 

Thanks Peter,

I am currently preparing/testing a series that does the requested separation
(getting rid of preempt_count_{inc,dec} ...) while putting the pagefault disable
count into task_info.

Downside is that now that I have to touch all fault handlers, I have to go
through all archs again.

Think I'll have something to show in a couple of days.

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-02-19 14:48     ` David Hildenbrand
@ 2015-02-19 15:07       ` Peter Zijlstra
  2015-02-19 15:14         ` David Hildenbrand
  2015-03-27 15:40         ` David Hildenbrand
  0 siblings, 2 replies; 22+ messages in thread
From: Peter Zijlstra @ 2015-02-19 15:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

On Thu, Feb 19, 2015 at 03:48:05PM +0100, David Hildenbrand wrote:
> Downside is that now that I have to touch all fault handlers, I have to go
> through all archs again.

You should be able to borrow from the -rt patches there. They have all
that.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-02-19 15:07       ` Peter Zijlstra
@ 2015-02-19 15:14         ` David Hildenbrand
  2015-03-27 15:40         ` David Hildenbrand
  1 sibling, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2015-02-19 15:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

> On Thu, Feb 19, 2015 at 03:48:05PM +0100, David Hildenbrand wrote:
> > Downside is that now that I have to touch all fault handlers, I have to go
> > through all archs again.
> 
> You should be able to borrow from the -rt patches there. They have all
> that.
> 

Jup, that's what I partially did.

Thanks

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-02-19 15:07       ` Peter Zijlstra
  2015-02-19 15:14         ` David Hildenbrand
@ 2015-03-27 15:40         ` David Hildenbrand
  2015-03-27 16:15           ` Peter Zijlstra
  1 sibling, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2015-03-27 15:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

> On Thu, Feb 19, 2015 at 03:48:05PM +0100, David Hildenbrand wrote:
> > Downside is that now that I have to touch all fault handlers, I have to go
> > through all archs again.
> 
> You should be able to borrow from the -rt patches there. They have all
> that.
> 

Hi Peter,

I hadn't much time to work on this lately, and it seems like this will be much
bigger that I expected.

We have various places in the code that rely on pagefault_disable() to also
disable preemtpion. Most of these places were ignored on -rt, because not
supported.

One of these places is e.g. powerpc's vmx based usercopy. While these places
are easy to handle, I was struggeling e.g. with asm-generic futex functions.

e.g. futex_atomic_op_inuser(): easy to fix, add preempt_enable/disable
respectively.

e.g. futex_atomic_cmpxchg_inatomic(): not so easy / nice to fix.

The "inatomic" variants rely on the caller to make sure that preemption is
disabled.

        pagefault_disable();
        ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
        pagefault_enable();

1. We could simply add preempt_disable/enable to the calling code. However that
results in _all_ futex_atomic_cmpxchg_inatomic() running with disabled
preemption, although the implementation doesn't really need it. So there is not
really a "decoupling", but to counters to set.

2. We could add the preempt_disable/enable to the implementations that only
need it, leaving calling code as is. However, then the name
"futex_atomic_cmpxchg_inatomic" is misleading, because it has nothing to do
with "inatomic" anymore.

3. We could move the pagefault_ calls into the implementation and add
the preempt_ calls to the calling code. Once again, functions that don't rely
on preemption have it disabled.

The "inatomic" part is now somewhat wrong. Because they can't be called from
atomic context. They have to be called from a pagefault-disabled
environment.The preemption part is implementation specific.
So I wonder if what we really want is something like

/* can be called from atomic context, but it's not required */
int futex_atomic_cmpxchg_nopfault(...) {
	int ret;

	pagefault_disable();
	ret = futex_atomic_cmpxchg_disabled_pfault(...)
	pagefault_enable();
	return ret;	
}

/* has to be called with disabled pagefaults */
int futex_atomic_cmpxchg_disabled_pfault(...) {
	int ret;

	/* do architecture specific stuff */

	return ret;	
}

/* has to be called with disabled pagefaults */
int futex_atomic_cmpxchg_disabled_pfault(...) {
	int ret;

	preempt_disable()
	/* do architecture common stuff as default */
	preempt_enable()

	return ret;	
}

The same applies to other "inatomic" functions. I think most of these functions
rely on pagefaults to be disabled in order to work correctly, not disabled
preemption.

Any idea how to fix this or what would be the way to go?

Thanks!

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-03-27 15:40         ` David Hildenbrand
@ 2015-03-27 16:15           ` Peter Zijlstra
  2015-03-27 19:05             ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2015-03-27 16:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

On Fri, Mar 27, 2015 at 04:40:50PM +0100, David Hildenbrand wrote:
> e.g. futex_atomic_op_inuser(): easy to fix, add preempt_enable/disable
> respectively.
> 
> e.g. futex_atomic_cmpxchg_inatomic(): not so easy / nice to fix.
> 
> The "inatomic" variants rely on the caller to make sure that preemption is
> disabled.
> 
>         pagefault_disable();
>         ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
>         pagefault_enable();

Typically the _inatomic() variants of functions have the exception
tables required for fixups and can return -EFAULT. In that regard the
futex_atomic_cmpxchg_inatomic() is consistently named.

In specific the above is taken from cmpxchg_futex_value_locked(), which
is private to futex.c, so we don't really need to worry about it.

Furthermore, the futex.c helpers that wrap them in pagefault_disable()
do so because they want to handle the fault themselves. I don't think we
need to worry about that.

> 1. We could simply add preempt_disable/enable to the calling code. However that
> results in _all_ futex_atomic_cmpxchg_inatomic() running with disabled
> preemption, although the implementation doesn't really need it. So there is not
> really a "decoupling", but to counters to set.

Not really needed, the few callsites where they are not already under a
lock is where we want to explicitly handle the -EFAULT case ourselves.

> 2. We could add the preempt_disable/enable to the implementations that only
> need it, leaving calling code as is. However, then the name
> "futex_atomic_cmpxchg_inatomic" is misleading, because it has nothing to do
> with "inatomic" anymore.

The _inatomic() naming is because it _can_ be called from atomic
context, like __copy_to_user_inatomic(). It doesn't mean it has to.
These functions work just fine outside of atomic regions.

And they still can be used in atomic regions, but now
pagefault_disable() will also trigger the exception fixup.

I don't think we should worry too much about this.

> The same applies to other "inatomic" functions. I think most of these functions
> rely on pagefaults to be disabled in order to work correctly, not disabled
> preemption.
> 
> Any idea how to fix this or what would be the way to go?

I have the feeling you're over thinking this. _inatomic() has exception
fixups and will return -EFAULT when it cannot do the pagefault in place,
for whatever reason -- traditionally because of atomic context, now also
pagefault_disable().

And esp. things like futexes have been extensively used under -rt and
are known good.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()
  2015-03-27 16:15           ` Peter Zijlstra
@ 2015-03-27 19:05             ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2015-03-27 19:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-arch, tglx, benh, paulus, akpm,
	heiko.carstens, schwidefsky, borntraeger, mst, David.Laight,
	hughd, hocko

> On Fri, Mar 27, 2015 at 04:40:50PM +0100, David Hildenbrand wrote:
> > e.g. futex_atomic_op_inuser(): easy to fix, add preempt_enable/disable
> > respectively.
> > 
> > e.g. futex_atomic_cmpxchg_inatomic(): not so easy / nice to fix.
> > 
> > The "inatomic" variants rely on the caller to make sure that preemption is
> > disabled.
> > 
> >         pagefault_disable();
> >         ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval);
> >         pagefault_enable();
> 
> Typically the _inatomic() variants of functions have the exception
> tables required for fixups and can return -EFAULT. In that regard the
> futex_atomic_cmpxchg_inatomic() is consistently named.
> 
> In specific the above is taken from cmpxchg_futex_value_locked(), which
> is private to futex.c, so we don't really need to worry about it.
> 
> Furthermore, the futex.c helpers that wrap them in pagefault_disable()
> do so because they want to handle the fault themselves. I don't think we
> need to worry about that.

I totally agree with pagefault_disable() and that -EFAULT logic to handle that
themselves. I'm basically only concerned about implicitly used disabled
preemption.

> 
> > 1. We could simply add preempt_disable/enable to the calling code. However that
> > results in _all_ futex_atomic_cmpxchg_inatomic() running with disabled
> > preemption, although the implementation doesn't really need it. So there is not
> > really a "decoupling", but to counters to set.
> 
> Not really needed, the few callsites where they are not already under a
> lock is where we want to explicitly handle the -EFAULT case ourselves.
> 
> > 2. We could add the preempt_disable/enable to the implementations that only
> > need it, leaving calling code as is. However, then the name
> > "futex_atomic_cmpxchg_inatomic" is misleading, because it has nothing to do
> > with "inatomic" anymore.
> 
> The _inatomic() naming is because it _can_ be called from atomic
> context, like __copy_to_user_inatomic(). It doesn't mean it has to.

Well, they have to be called from an pagefault_disabled environment (for now
atomic). Atomic context is optional, with a few exceptions (see next section).

> These functions work just fine outside of atomic regions.

To make clear what I'm worried about, have a look at the following code taken
from include/asm-generic/futex.h):

static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
                              u32 oldval, u32 newval)
{
        u32 val;

        if (unlikely(get_user(val, uaddr) != 0))
                return -EFAULT;

        if (val == oldval && unlikely(put_user(newval, uaddr) != 0))
                return -EFAULT;

        *uval = val;

        return 0;
}

This _has to_ be called from an atomic context. Otherwise the logic is broken
(mutual exclusion). Not adding a preempt_disable() somewhere in the calling code
(or the function itself) will not allow this function to work properly. At
least that's my understanding :)

And we have exactly that case when we drop preempt_disable() from pagefault_disable()
in the futex code.

My quick hack for this special case would be to add preempt_disable/enable to
that function body. But maybe I am totally wrong about that given code and
preemption.

> 
> And they still can be used in atomic regions, but now
> pagefault_disable() will also trigger the exception fixup.
> 
> I don't think we should worry too much about this.
> 
> > The same applies to other "inatomic" functions. I think most of these functions
> > rely on pagefaults to be disabled in order to work correctly, not disabled
> > preemption.

I agree. The kmap_atomic stuff is another candidate I identified that needs
additional preempt_disable().

> > 
> > Any idea how to fix this or what would be the way to go?
> 
> I have the feeling you're over thinking this. _inatomic() has exception
> fixups and will return -EFAULT when it cannot do the pagefault in place,
> for whatever reason -- traditionally because of atomic context, now also
> pagefault_disable().

Haha, well I don't want to break things. And places like the futex code look
suspicious. That's why I better double check with an expert.

> 
> And esp. things like futexes have been extensively used under -rt and
> are known good.

Yes, on most configuration, but maybe not all (archs that use asm-generic code
+ !CONFIG_SMP + CONFIG_PREEMPT)

Thanks for your reply.

David


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-03-27 19:05 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-10 14:23 [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() David Hildenbrand
2014-12-10 14:23 ` [PATCH v2 1/5] uaccess: add pagefault_count to thread_info David Hildenbrand
2014-12-15 10:07   ` LF.Tan
2014-12-15 11:23     ` David Hildenbrand
2014-12-15 12:48       ` Peter Zijlstra
2014-12-10 14:23 ` [PATCH v2 2/5] uaccess: count pagefault_disable() levels in pagefault_count David Hildenbrand
2014-12-10 14:23 ` [PATCH v2 3/5] mm, uaccess: trigger might_sleep() in might_fault() when pagefaults are disabled David Hildenbrand
2014-12-10 14:23 ` [PATCH v2 4/5] uaccess: clarify that uaccess may only sleep if pagefaults are not disabled David Hildenbrand
2014-12-10 14:23 ` [PATCH v2 5/5] uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count David Hildenbrand
2014-12-15 10:45 ` [PATCH v2 0/5] Reenable might_sleep() checks for might_fault() Peter Zijlstra
2014-12-15 11:21   ` David Hildenbrand
2014-12-15 12:50     ` Peter Zijlstra
2014-12-15 13:08       ` David Hildenbrand
2015-01-12 14:19 ` David Hildenbrand
2015-01-30 15:52   ` Christian Borntraeger
2015-02-09 14:42   ` Peter Zijlstra
2015-02-19 14:48     ` David Hildenbrand
2015-02-19 15:07       ` Peter Zijlstra
2015-02-19 15:14         ` David Hildenbrand
2015-03-27 15:40         ` David Hildenbrand
2015-03-27 16:15           ` Peter Zijlstra
2015-03-27 19:05             ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).