From: "Michael S. Tsirkin" <mst@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	David Howells <dhowells@redhat.com>,
	Hirokazu Takata <takata@linux-m32r.org>,
	Michal Simek <monstr@monstr.eu>,
	Koichi Yasutake <yasutake.koichi@jp.panasonic.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Chris Metcalf <cmetcalf@tilera.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, Arnd Bergmann <arnd@arndb.de>,
	linux-arm-kernel@lists.infradead.org,
	linux-m32r@ml.linux-m32r.org, linux-m32r-ja@ml.linux-m32r.org,
	microblaze-uclinux@itee.uq.edu.au, linux-am33-list@redhat.com,
	linuxppc-dev@lists.ozlabs.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org, kvm@vger.kernel.org, rostedt@goodmis.org
Subject: Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
Date: Sun, 19 May 2013 12:35:26 +0300
Message-ID: <20130519093526.GD19883@redhat.com>
In-Reply-To: <20130516184041.GP19669@dyad.programming.kicks-ass.net>

On Thu, May 16, 2013 at 08:40:41PM +0200, Peter Zijlstra wrote:
> On Thu, May 16, 2013 at 02:16:10PM +0300, Michael S. Tsirkin wrote:
> > There are several ways to make sure that a function calling
> > might_fault() does not sleep.
> > One is to use it on kernel or otherwise locked memory - apparently
> > nfs/sunrpc does this. As noted by Ingo, this is handled by the
> > might_fault() implementation in mm/memory.c but not by the one in
> > linux/kernel.h, so in the current code might_fault() schedules
> > differently depending on CONFIG_PROVE_LOCKING, which is an undesired
> > semantic side effect.
> > 
> > Another is to call pagefault_disable(): in this case the page fault
> > handler will go to fixup processing and we get an error instead of
> > sleeping, so the might_sleep() annotation is a false positive.
> > The vhost driver wants to do this now in order to reuse socket ops
> > under a spinlock (and fall back on the slower thread handler
> > on error).
> 
> Are you using the assumption that spin_lock() implies preempt_disable() implies
> pagefault_disable()? Note that this assumption isn't valid for -rt where the
> spinlock becomes preemptible but we'll not disable pagefaults.

No, I was not assuming that. What I'm trying to say is that a caller
that does something like this under a spinlock:
	preempt_disable
	pagefault_disable
	error = copy_to_user
	pagefault_enable
	preempt_enable_no_resched

is not doing anything wrong and should not get a warning,
as long as error is handled correctly later.
Right?
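
To make the sequence above concrete, here is a rough userspace model of
it. This is only a sketch: every model_* name, the counters and the
page_present flag are stand-ins invented for illustration, not kernel
APIs, and the two counters are kept separate only to keep the model
readable. As in the kernel, the copy helper returns the number of bytes
that were not copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for kernel state; not kernel code. */
static int model_preempt_count;    /* models preempt_disable() nesting */
static int model_pagefault_count;  /* models pagefault_disable() nesting */
static int model_slept;            /* set if the fault path would have slept */

static void model_preempt_disable(void)   { model_preempt_count++; }
static void model_pagefault_disable(void) { model_pagefault_count++; }

static void model_preempt_enable_no_resched(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

static void model_pagefault_enable(void)
{
	assert(model_pagefault_count > 0);
	model_pagefault_count--;
}

/*
 * Models copy_to_user(): returns the number of bytes NOT copied.
 * If the destination page is absent while page faults are disabled,
 * the fixup path is taken and the copy fails instead of sleeping.
 */
static size_t model_copy_to_user(void *dst, const void *src, size_t len,
				 int page_present)
{
	if (!page_present) {
		if (model_pagefault_count > 0)
			return len;	/* fixup: report failure, no sleep */
		model_slept = 1;	/* would sleep to fault the page in */
	}
	memcpy(dst, src, len);
	return 0;
}

/* The sequence from the mail: 0 on success, -1 if the caller must fall back. */
static int model_atomic_copy(void *dst, const void *src, size_t len,
			     int page_present)
{
	size_t left;

	model_preempt_disable();
	model_pagefault_disable();
	left = model_copy_to_user(dst, src, len, page_present);
	model_pagefault_enable();
	model_preempt_enable_no_resched();

	return left ? -1 : 0;
}
```

The point of the model is that the failing case never sets model_slept:
the caller sees an error and can retry from a context that is allowed to
sleep, which is the fallback the vhost use case relies on.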

> > Address both issues by:
> > 	- dropping the unconditional call to might_sleep
> > 	  from the fast might_fault code in linux/kernel.h
> > 	- checking for pagefault_disable() in the
> > 	  CONFIG_PROVE_LOCKING implementation
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/linux/kernel.h |  1 -
> >  mm/memory.c            | 14 +++++++++-----
> >  2 files changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index e96329c..322b065 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -198,7 +198,6 @@ void might_fault(void);
> >  #else
> >  static inline void might_fault(void)
> >  {
> > -	might_sleep();
> 
> This removes potential resched points for PREEMPT_VOLUNTARY -- was that
> intentional?

No, it's a bug. Thanks for pointing this out.
OK, so I guess it should be might_sleep_if(!in_atomic()),
and this means might_fault() would have to move from linux/kernel.h to
linux/uaccess.h, since in_atomic() is defined in linux/hardirq.h.

Makes sense?
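
Roughly, the fast-path variant I have in mind behaves like the userspace
model below. The model_* names and counters are hypothetical stand-ins:
in the kernel, in_atomic() is derived from the preempt count and
might_sleep() is what provides the voluntary reschedule point.

```c
#include <assert.h>

static int model_preempt_count;   /* hypothetical stand-in for preempt_count() */
static int model_resched_points;  /* how often a reschedule point was offered */

static int model_in_atomic(void)
{
	return model_preempt_count != 0;
}

/* Stand-in for might_sleep(): with PREEMPT_VOLUNTARY this may reschedule. */
static void model_might_sleep(void)
{
	model_resched_points++;
}

/* Stand-in for might_sleep_if(cond). */
static void model_might_sleep_if(int cond)
{
	if (cond)
		model_might_sleep();
}

/* Proposed fast-path might_fault(): keep the resched point unless atomic. */
static void model_might_fault(void)
{
	model_might_sleep_if(!model_in_atomic());
}
```

So non-atomic callers keep their reschedule point, and callers that have
disabled preemption no longer hit might_sleep() at all.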

> >  }
> >  #endif
> >  
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 6dc1882..1b8327b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4222,13 +4222,17 @@ void might_fault(void)
> >  	if (segment_eq(get_fs(), KERNEL_DS))
> >  		return;
> >  
> > -	might_sleep();
> >  	/*
> > -	 * it would be nicer only to annotate paths which are not under
> > -	 * pagefault_disable, however that requires a larger audit and
> > -	 * providing helpers like get_user_atomic.
> > +	 * It would be nicer to annotate paths which are under preempt_disable
> > +	 * but not under pagefault_disable, however that requires a new flag
> > +	 * for differentiating between the two.
> 
> -rt has this, pagefault_disable() doesn't change the preempt count but pokes
> at task_struct::pagefault_disable.

Good to know.

So maybe we can import this at least for CONFIG_PROVE_LOCKING?
To make the patch smaller I'd prefer doing both for now,
this way this patchset does not have to poke in too many
mm internals.
I can try doing that - unless
someone else has plans to merge this part soon anyway?
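
To show what a separate flag would buy us, here is a small userspace
model of a debug-only check that can tell the two states apart. It is a
sketch under assumptions: struct model_task, the verdict names and the
helpers are all invented here, loosely following the description of the
-rt per-task pagefault_disable counter above, and are not the -rt code.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for task_struct. */
struct model_task {
	int preempt_count;	/* preempt_disable() nesting */
	int pagefault_disabled;	/* pagefault_disable() nesting, tracked separately */
};

enum model_verdict {
	MODEL_OK_NO_SLEEP,	/* faults go to fixup, nothing to check */
	MODEL_OK_MAY_SLEEP,	/* sleeping here is legal */
	MODEL_WARN_ATOMIC,	/* could sleep while atomic: worth a warning */
};

static void model_pagefault_disable(struct model_task *t)
{
	t->pagefault_disabled++;
}

static void model_pagefault_enable(struct model_task *t)
{
	assert(t->pagefault_disabled > 0);
	t->pagefault_disabled--;
}

/* What might_fault() could decide once the two states are distinguishable. */
static enum model_verdict model_might_fault(const struct model_task *t)
{
	if (t->pagefault_disabled)
		return MODEL_OK_NO_SLEEP;
	if (t->preempt_count)
		return MODEL_WARN_ATOMIC;
	return MODEL_OK_MAY_SLEEP;
}
```

With a single shared counter the first two branches cannot be told apart,
which is why the patch below has to return early for every atomic caller.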

> >  	 */
> > -	if (!in_atomic() && current->mm)
> > +	if (in_atomic())
> > +		return;
> > +
> > +	might_sleep();
> > +
> > +	if (current->mm)
> >  		might_lock_read(&current->mm->mmap_sem);
> >  }
> >  EXPORT_SYMBOL(might_fault);
> > -- 
> > MST
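
For reference, the control flow of the patched CONFIG_PROVE_LOCKING
version in mm/memory.c can be summarised with the userspace model below.
The struct, the flag names and the function are hypothetical and exist
only to show which debug checks run for which caller state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for kernel state at the call site. */
struct model_ctx {
	bool kernel_ds;	/* segment_eq(get_fs(), KERNEL_DS) */
	bool in_atomic;	/* in_atomic() */
	bool has_mm;	/* current->mm != NULL */
};

/* Which debug checks the patched function would run. */
enum {
	MODEL_CHECK_NONE     = 0,
	MODEL_CHECK_SLEEP    = 1 << 0,	/* might_sleep() */
	MODEL_CHECK_MMAP_SEM = 1 << 1,	/* might_lock_read(&mm->mmap_sem) */
};

static unsigned int model_might_fault_checks(const struct model_ctx *c)
{
	unsigned int checks = MODEL_CHECK_NONE;

	if (c->kernel_ds)
		return checks;	/* access to kernel memory: no checks */

	if (c->in_atomic)
		return checks;	/* atomic caller: skip both checks */

	checks |= MODEL_CHECK_SLEEP;

	if (c->has_mm)
		checks |= MODEL_CHECK_MMAP_SEM;

	return checks;
}
```

The difference from the old code is the in_atomic case: previously
might_sleep() still ran there and only the mmap_sem annotation was
skipped.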

  reply	other threads:[~2013-05-19  9:37 UTC|newest]

Thread overview: 228+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-16 11:07 [PATCH v2 00/10] uaccess: better might_sleep/might_fault behavior Michael S. Tsirkin
2013-05-16 11:07 ` Michael S. Tsirkin
2013-05-16 11:07 ` Michael S. Tsirkin
2013-05-16 11:07 ` Michael S. Tsirkin
2013-05-16 11:07 ` Michael S. Tsirkin
2013-05-16 11:07 ` Michael S. Tsirkin
2013-05-16 11:10 ` [PATCH v2 01/10] asm-generic: uaccess s/might_sleep/might_fault/ Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10 ` [PATCH v2 02/10] arm64: " Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 13:29   ` Catalin Marinas
2013-05-16 13:29     ` Catalin Marinas
2013-05-16 13:29     ` Catalin Marinas
2013-05-16 13:29     ` Catalin Marinas
2013-05-16 13:29     ` Catalin Marinas
2013-05-16 13:29     ` Catalin Marinas
2013-05-16 13:29     ` Catalin Marinas
2013-05-16 11:10 ` [PATCH v2 03/10] frv: " Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:10   ` Michael S. Tsirkin
2013-05-16 11:11 ` [PATCH v2 04/10] m32r: " Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11 ` [PATCH v2 05/10] microblaze: " Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:11   ` Michael S. Tsirkin
2013-05-16 11:12 ` [PATCH v2 06/10] mn10300: " Michael S. Tsirkin
2013-05-16 11:12   ` Michael S. Tsirkin
2013-05-16 11:12   ` Michael S. Tsirkin
2013-05-16 11:12   ` Michael S. Tsirkin
2013-05-16 11:12   ` Michael S. Tsirkin
2013-05-16 11:12   ` Michael S. Tsirkin
2013-05-16 11:15 ` [PATCH v2 07/10] powerpc: " Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-22 13:59   ` Arnd Bergmann
2013-05-22 13:59     ` Arnd Bergmann
2013-05-22 13:59     ` Arnd Bergmann
2013-05-22 13:59     ` Arnd Bergmann
2013-05-22 14:30     ` Michael S. Tsirkin
2013-05-22 14:30       ` Michael S. Tsirkin
2013-05-22 14:30       ` Michael S. Tsirkin
2013-05-22 14:30       ` Michael S. Tsirkin
2013-05-22 14:30       ` Michael S. Tsirkin
2013-05-24 13:00     ` Michael S. Tsirkin
2013-05-24 13:00       ` Michael S. Tsirkin
2013-05-24 13:00       ` Michael S. Tsirkin
2013-05-24 13:00       ` Michael S. Tsirkin
2013-05-24 13:00       ` Michael S. Tsirkin
2013-05-24 13:11       ` Michael S. Tsirkin
2013-05-24 13:11         ` Michael S. Tsirkin
2013-05-24 13:11         ` Michael S. Tsirkin
2013-05-24 13:11         ` Michael S. Tsirkin
2013-05-24 13:11         ` Michael S. Tsirkin
2013-05-24 13:30         ` Arnd Bergmann
2013-05-24 13:30           ` Arnd Bergmann
2013-05-24 13:30           ` Arnd Bergmann
2013-05-24 13:30           ` Arnd Bergmann
2013-05-24 13:30           ` Arnd Bergmann
2013-05-16 11:15 ` [PATCH v2 08/10] tile: " Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 13:33   ` Chris Metcalf
2013-05-16 13:33     ` Chris Metcalf
2013-05-16 13:33     ` Chris Metcalf
2013-05-16 13:33     ` Chris Metcalf
2013-05-16 13:33     ` Chris Metcalf
2013-05-16 13:33     ` Chris Metcalf
2013-05-16 11:15 ` [PATCH v2 09/10] x86: " Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:15   ` Michael S. Tsirkin
2013-05-16 11:16 ` [PATCH v2 10/10] kernel: might_fault does not imply might_sleep Michael S. Tsirkin
2013-05-16 11:16   ` Michael S. Tsirkin
2013-05-16 11:16   ` Michael S. Tsirkin
2013-05-16 11:16   ` Michael S. Tsirkin
2013-05-16 11:16   ` Michael S. Tsirkin
2013-05-16 11:16   ` Michael S. Tsirkin
2013-05-16 18:40   ` Peter Zijlstra
2013-05-16 18:40     ` Peter Zijlstra
2013-05-16 18:40     ` Peter Zijlstra
2013-05-16 18:40     ` Peter Zijlstra
2013-05-16 18:40     ` Peter Zijlstra
2013-05-19  9:35     ` Michael S. Tsirkin [this message]
2013-05-19  9:35       ` Michael S. Tsirkin
2013-05-19  9:35       ` Michael S. Tsirkin
2013-05-19  9:35       ` Michael S. Tsirkin
2013-05-19  9:35       ` Michael S. Tsirkin
2013-05-19 12:34       ` Steven Rostedt
2013-05-19 12:34         ` Steven Rostedt
2013-05-19 12:34         ` Steven Rostedt
2013-05-19 12:34         ` Steven Rostedt
2013-05-19 12:34         ` Steven Rostedt
2013-05-19 13:34         ` Michael S. Tsirkin
2013-05-19 13:34           ` Michael S. Tsirkin
2013-05-19 13:34           ` Michael S. Tsirkin
2013-05-19 13:34           ` Michael S. Tsirkin
2013-05-19 13:34           ` Michael S. Tsirkin
2013-05-19 16:06           ` Steven Rostedt
2013-05-19 16:06             ` Steven Rostedt
2013-05-19 16:06             ` Steven Rostedt
2013-05-19 16:06             ` Steven Rostedt
2013-05-19 16:06             ` Steven Rostedt
2013-05-19 16:40             ` Michael S. Tsirkin
2013-05-19 16:40               ` Michael S. Tsirkin
2013-05-19 16:40               ` Michael S. Tsirkin
2013-05-19 16:40               ` Michael S. Tsirkin
2013-05-19 16:40               ` Michael S. Tsirkin
2013-05-19 20:23               ` Steven Rostedt
2013-05-19 20:23                 ` Steven Rostedt
2013-05-19 20:23                 ` Steven Rostedt
2013-05-19 20:23                 ` Steven Rostedt
2013-05-19 20:23                 ` Steven Rostedt
2013-05-19 20:35                 ` Michael S. Tsirkin
2013-05-19 20:35                   ` Michael S. Tsirkin
2013-05-19 20:35                   ` Michael S. Tsirkin
2013-05-19 20:35                   ` Michael S. Tsirkin
2013-05-19 20:35                   ` Michael S. Tsirkin
2013-05-21 11:18               ` Peter Zijlstra
2013-05-21 11:21       ` Peter Zijlstra
2013-05-21 11:57       ` Peter Zijlstra
2013-05-21 13:28         ` Michael S. Tsirkin
2013-05-22  9:47         ` Michael S. Tsirkin
2013-05-22 10:16           ` Peter Zijlstra
2013-05-22 20:38         ` Michael S. Tsirkin
2013-05-22 20:36     ` Michael S. Tsirkin
2013-05-22  9:25 ` [PATCH v2 00/10] uaccess: better might_sleep/might_fault behavior Arnd Bergmann
2013-05-22  9:58   ` Michael S. Tsirkin
2013-05-22 10:19   ` Peter Zijlstra
2013-05-22 11:07     ` Michael S. Tsirkin
2013-05-22 11:27       ` Peter Zijlstra
2013-05-22 13:41   ` Russell King - ARM Linux
2013-05-22 14:04     ` Arnd Bergmann
2013-05-22 14:44       ` Michael S. Tsirkin
2013-05-24 14:17 ` [PATCH v3 01/11] asm-generic: uaccess s/might_sleep/might_fault/ Michael S. Tsirkin
2013-05-24 14:17 ` [PATCH v3 02/11] arm64: " Michael S. Tsirkin
2013-05-24 14:17 ` [PATCH v3 03/11] frv: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 04/11] m32r: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 05/11] microblaze: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 06/11] mn10300: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 07/11] powerpc: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 08/11] tile: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 09/11] x86: " Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 10/11] kernel: drop voluntary schedule from might_fault Michael S. Tsirkin
2013-05-24 14:18 ` [PATCH v3 11/11] kernel: uaccess in atomic with pagefault_disable Michael S. Tsirkin
