All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:07 ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:07 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-fsdevel, linux-mm, Andrew Morton, Jan Kara, lustre-devel,
	cluster-devel

Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
results in bailing out from the fault code, the CPU then retries the
access, and we fault again effectively doing what the handler wanted.

CC: lustre-devel@lists.lustre.org
CC: cluster-devel@redhat.com
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
 include/linux/buffer_head.h                      | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index ee01f20d8b11..9afa6bec3e6f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 		result = VM_FAULT_LOCKED;
 		break;
 	case -ENODATA:
+	case -EAGAIN:
 	case -EFAULT:
 		result = VM_FAULT_NOPAGE;
 		break;
 	case -ENOMEM:
 		result = VM_FAULT_OOM;
 		break;
-	case -EAGAIN:
-		result = VM_FAULT_RETRY;
-		break;
 	default:
 		result = VM_FAULT_SIGBUS;
 		break;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index d67ab83823ad..79591c3660cc 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
 {
 	if (err == 0)
 		return VM_FAULT_LOCKED;
-	if (err == -EFAULT)
+	if (err == -EFAULT || err == -EAGAIN)
 		return VM_FAULT_NOPAGE;
 	if (err == -ENOMEM)
 		return VM_FAULT_OOM;
-	if (err == -EAGAIN)
-		return VM_FAULT_RETRY;
 	/* -ENOSPC, -EDQUOT, -EIO ... */
 	return VM_FAULT_SIGBUS;
 }
-- 
2.10.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:07 ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:07 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-fsdevel, linux-mm, Andrew Morton, Jan Kara, lustre-devel,
	cluster-devel

Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
results in bailing out from the fault code, the CPU then retries the
access, and we fault again effectively doing what the handler wanted.

CC: lustre-devel@lists.lustre.org
CC: cluster-devel@redhat.com
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
 include/linux/buffer_head.h                      | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index ee01f20d8b11..9afa6bec3e6f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 		result = VM_FAULT_LOCKED;
 		break;
 	case -ENODATA:
+	case -EAGAIN:
 	case -EFAULT:
 		result = VM_FAULT_NOPAGE;
 		break;
 	case -ENOMEM:
 		result = VM_FAULT_OOM;
 		break;
-	case -EAGAIN:
-		result = VM_FAULT_RETRY;
-		break;
 	default:
 		result = VM_FAULT_SIGBUS;
 		break;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index d67ab83823ad..79591c3660cc 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
 {
 	if (err == 0)
 		return VM_FAULT_LOCKED;
-	if (err == -EFAULT)
+	if (err == -EFAULT || err == -EAGAIN)
 		return VM_FAULT_NOPAGE;
 	if (err == -ENOMEM)
 		return VM_FAULT_OOM;
-	if (err == -EAGAIN)
-		return VM_FAULT_RETRY;
 	/* -ENOSPC, -EDQUOT, -EIO ... */
 	return VM_FAULT_SIGBUS;
 }
-- 
2.10.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:07 ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:07 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-fsdevel, linux-mm, Andrew Morton, Jan Kara, lustre-devel,
	cluster-devel

Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
results in bailing out from the fault code, the CPU then retries the
access, and we fault again effectively doing what the handler wanted.

CC: lustre-devel at lists.lustre.org
CC: cluster-devel at redhat.com
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
 include/linux/buffer_head.h                      | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index ee01f20d8b11..9afa6bec3e6f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 		result = VM_FAULT_LOCKED;
 		break;
 	case -ENODATA:
+	case -EAGAIN:
 	case -EFAULT:
 		result = VM_FAULT_NOPAGE;
 		break;
 	case -ENOMEM:
 		result = VM_FAULT_OOM;
 		break;
-	case -EAGAIN:
-		result = VM_FAULT_RETRY;
-		break;
 	default:
 		result = VM_FAULT_SIGBUS;
 		break;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index d67ab83823ad..79591c3660cc 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
 {
 	if (err == 0)
 		return VM_FAULT_LOCKED;
-	if (err == -EFAULT)
+	if (err == -EFAULT || err == -EAGAIN)
 		return VM_FAULT_NOPAGE;
 	if (err == -ENOMEM)
 		return VM_FAULT_OOM;
-	if (err == -EAGAIN)
-		return VM_FAULT_RETRY;
 	/* -ENOSPC, -EDQUOT, -EIO ... */
 	return VM_FAULT_SIGBUS;
 }
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:07 ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:07 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
results in bailing out from the fault code, the CPU then retries the
access, and we fault again effectively doing what the handler wanted.

CC: lustre-devel at lists.lustre.org
CC: cluster-devel at redhat.com
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
 include/linux/buffer_head.h                      | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index ee01f20d8b11..9afa6bec3e6f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 		result = VM_FAULT_LOCKED;
 		break;
 	case -ENODATA:
+	case -EAGAIN:
 	case -EFAULT:
 		result = VM_FAULT_NOPAGE;
 		break;
 	case -ENOMEM:
 		result = VM_FAULT_OOM;
 		break;
-	case -EAGAIN:
-		result = VM_FAULT_RETRY;
-		break;
 	default:
 		result = VM_FAULT_SIGBUS;
 		break;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index d67ab83823ad..79591c3660cc 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
 {
 	if (err == 0)
 		return VM_FAULT_LOCKED;
-	if (err == -EFAULT)
+	if (err == -EFAULT || err == -EAGAIN)
 		return VM_FAULT_NOPAGE;
 	if (err == -ENOMEM)
 		return VM_FAULT_OOM;
-	if (err == -EAGAIN)
-		return VM_FAULT_RETRY;
 	/* -ENOSPC, -EDQUOT, -EIO ... */
 	return VM_FAULT_SIGBUS;
 }
-- 
2.10.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:07 ` Jan Kara
@ 2017-02-03 15:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 30+ messages in thread
From: Matthew Wilcox @ 2017-02-03 15:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, linux-fsdevel, linux-mm, Andrew Morton, lustre-devel,
	cluster-devel

On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.

Reading this commit message makes me wonder if this is the best fix.
It would seem logical that if I want the fault to be retried that I should
return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
driver / filesystem writers one fewer way to shoot themselves in the foot?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:13   ` Matthew Wilcox
  0 siblings, 0 replies; 30+ messages in thread
From: Matthew Wilcox @ 2017-02-03 15:13 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.

Reading this commit message makes me wonder if this is the best fix.
It would seem logical that if I want the fault to be retried that I should
return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
driver / filesystem writers one fewer way to shoot themselves in the foot?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:13   ` [Cluster-devel] " Matthew Wilcox
  (?)
@ 2017-02-03 15:46     ` Jan Kara
  -1 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm, Andrew Morton,
	lustre-devel, cluster-devel

On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> 
> Reading this commit message makes me wonder if this is the best fix.
> It would seem logical that if I want the fault to be retried that I should
> return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> driver / filesystem writers one fewer way to shoot themselves in the foot?

VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
was set in page fault flags and it means - we have dropped mmap_sem, we
loaded page needed to satisfy the fault and now we need to try again (have
a look at __lock_page_or_retry()). I have my reservations about this
interface but it works...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:46     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm, Andrew Morton,
	lustre-devel, cluster-devel

On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> 
> Reading this commit message makes me wonder if this is the best fix.
> It would seem logical that if I want the fault to be retried that I should
> return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> driver / filesystem writers one fewer way to shoot themselves in the foot?

VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
was set in page fault flags and it means - we have dropped mmap_sem, we
loaded page needed to satisfy the fault and now we need to try again (have
a look at __lock_page_or_retry()). I have my reservations about this
interface but it works...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:46     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 15:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> 
> Reading this commit message makes me wonder if this is the best fix.
> It would seem logical that if I want the fault to be retried that I should
> return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> driver / filesystem writers one fewer way to shoot themselves in the foot?

VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
was set in page fault flags and it means - we have dropped mmap_sem, we
loaded page needed to satisfy the fault and now we need to try again (have
a look at __lock_page_or_retry()). I have my reservations about this
interface but it works...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:46     ` [lustre-devel] " Jan Kara
@ 2017-02-03 15:53       ` Matthew Wilcox
  -1 siblings, 0 replies; 30+ messages in thread
From: Matthew Wilcox @ 2017-02-03 15:53 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, linux-fsdevel, linux-mm, Andrew Morton, lustre-devel,
	cluster-devel

On Fri, Feb 03, 2017 at 04:46:40PM +0100, Jan Kara wrote:
> On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> > On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > > from ->page_mkwrite is completely unhandled by the mm code and results
> > > in locking and writeably mapping the page which definitely is not what
> > > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > > results in bailing out from the fault code, the CPU then retries the
> > > access, and we fault again effectively doing what the handler wanted.
> > 
> > Reading this commit message makes me wonder if this is the best fix.
> > It would seem logical that if I want the fault to be retried that I should
> > return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> > treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> > driver / filesystem writers one fewer way to shoot themselves in the foot?
> 
> VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
> was set in page fault flags and it means - we have dropped mmap_sem, we
> loaded page needed to satisfy the fault and now we need to try again (have
> a look at __lock_page_or_retry()). I have my reservations about this
> interface but it works...

Oh, I understand what it's *supposed* to be used for ;-)  It's just
a bit of an attractive nuisance.  Maybe renaming it to something like
VM_FAULT_PAGE_RETRY would stop people from thinking that it meant "retry
the fault".  And we could #define VM_FAULT_RETRY VM_FAULT_NOPAGE so that
people who want to retry the fault in a normal way could use a return
value that sounds like it does what they want instead of a return value
that is supposed to be used to indicate that we put a PFN into the
page table?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 15:53       ` Matthew Wilcox
  0 siblings, 0 replies; 30+ messages in thread
From: Matthew Wilcox @ 2017-02-03 15:53 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri, Feb 03, 2017 at 04:46:40PM +0100, Jan Kara wrote:
> On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> > On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > > from ->page_mkwrite is completely unhandled by the mm code and results
> > > in locking and writeably mapping the page which definitely is not what
> > > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > > results in bailing out from the fault code, the CPU then retries the
> > > access, and we fault again effectively doing what the handler wanted.
> > 
> > Reading this commit message makes me wonder if this is the best fix.
> > It would seem logical that if I want the fault to be retried that I should
> > return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> > treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> > driver / filesystem writers one fewer way to shoot themselves in the foot?
> 
> VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
> was set in page fault flags and it means - we have dropped mmap_sem, we
> loaded page needed to satisfy the fault and now we need to try again (have
> a look at __lock_page_or_retry()). I have my reservations about this
> interface but it works...

Oh, I understand what it's *supposed* to be used for ;-)  It's just
a bit of an attractive nuisance.  Maybe renaming it to something like
VM_FAULT_PAGE_RETRY would stop people from thinking that it meant "retry
the fault".  And we could #define VM_FAULT_RETRY VM_FAULT_NOPAGE so that
people who want to retry the fault in a normal way could use a return
value that sounds like it does what they want instead of a return value
that is supposed to be used to indicate that we put a PFN into the
page table?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:53       ` [Cluster-devel] " Matthew Wilcox
  (?)
@ 2017-02-03 16:10         ` Jan Kara
  -1 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 16:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm, Andrew Morton,
	lustre-devel, cluster-devel

On Fri 03-02-17 07:53:26, Matthew Wilcox wrote:
> On Fri, Feb 03, 2017 at 04:46:40PM +0100, Jan Kara wrote:
> > On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> > > On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > > > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > > > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > > > from ->page_mkwrite is completely unhandled by the mm code and results
> > > > in locking and writeably mapping the page which definitely is not what
> > > > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > > > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > > > results in bailing out from the fault code, the CPU then retries the
> > > > access, and we fault again effectively doing what the handler wanted.
> > > 
> > > Reading this commit message makes me wonder if this is the best fix.
> > > It would seem logical that if I want the fault to be retried that I should
> > > return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> > > treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> > > driver / filesystem writers one fewer way to shoot themselves in the foot?
> > 
> > VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
> > was set in page fault flags and it means - we have dropped mmap_sem, we
> > loaded page needed to satisfy the fault and now we need to try again (have
> > a look at __lock_page_or_retry()). I have my reservations about this
> > interface but it works...
> 
> Oh, I understand what it's *supposed* to be used for ;-)  It's just
> a bit of an attractive nuisance.  Maybe renaming it to something like
> VM_FAULT_PAGE_RETRY would stop people from thinking that it meant "retry
> the fault".  And we could #define VM_FAULT_RETRY VM_FAULT_NOPAGE so that
> people who want to retry the fault in a normal way could use a return
> value that sounds like it does what they want instead of a return value
> that is supposed to be used to indicate that we put a PFN into the
> page table?

So a better name for VM_FAULT_RETRY and disentangling VM_FAULT_NOPAGE so
that it is not misused for retrying the fault would be nice. But it is a
much larger endeavor (I actually had a look into this some time ago) than
this simple bugfix...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 16:10         ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 16:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm, Andrew Morton,
	lustre-devel, cluster-devel

On Fri 03-02-17 07:53:26, Matthew Wilcox wrote:
> On Fri, Feb 03, 2017 at 04:46:40PM +0100, Jan Kara wrote:
> > On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> > > On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > > > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > > > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > > > from ->page_mkwrite is completely unhandled by the mm code and results
> > > > in locking and writeably mapping the page which definitely is not what
> > > > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > > > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > > > results in bailing out from the fault code, the CPU then retries the
> > > > access, and we fault again effectively doing what the handler wanted.
> > > 
> > > Reading this commit message makes me wonder if this is the best fix.
> > > It would seem logical that if I want the fault to be retried that I should
> > > return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> > > treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> > > driver / filesystem writers one fewer way to shoot themselves in the foot?
> > 
> > VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
> > was set in page fault flags and it means - we have dropped mmap_sem, we
> > loaded page needed to satisfy the fault and now we need to try again (have
> > a look at __lock_page_or_retry()). I have my reservations about this
> > interface but it works...
> 
> Oh, I understand what it's *supposed* to be used for ;-)  It's just
> a bit of an attractive nuisance.  Maybe renaming it to something like
> VM_FAULT_PAGE_RETRY would stop people from thinking that it meant "retry
> the fault".  And we could #define VM_FAULT_RETRY VM_FAULT_NOPAGE so that
> people who want to retry the fault in a normal way could use a return
> value that sounds like it does what they want instead of a return value
> that is supposed to be used to indicate that we put a PFN into the
> page table?

So a better name for VM_FAULT_RETRY and disentangling VM_FAULT_NOPAGE so
that it is not misused for retrying the fault would be nice. But it is a
much larger endeavor (I actually had a look into this some time ago) than
this simple bugfix...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 16:10         ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-03 16:10 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri 03-02-17 07:53:26, Matthew Wilcox wrote:
> On Fri, Feb 03, 2017 at 04:46:40PM +0100, Jan Kara wrote:
> > On Fri 03-02-17 07:13:59, Matthew Wilcox wrote:
> > > On Fri, Feb 03, 2017 at 04:07:29PM +0100, Jan Kara wrote:
> > > > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > > > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > > > from ->page_mkwrite is completely unhandled by the mm code and results
> > > > in locking and writeably mapping the page which definitely is not what
> > > > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > > > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > > > results in bailing out from the fault code, the CPU then retries the
> > > > access, and we fault again effectively doing what the handler wanted.
> > > 
> > > Reading this commit message makes me wonder if this is the best fix.
> > > It would seem logical that if I want the fault to be retried that I should
> > > return VM_FAULT_RETRY, not VM_FAULT_NOPAGE.  Why don't we have the MM
> > > treat VM_FAULT_RETRY the same way that it treats VM_FAULT_NOPAGE and give
> > > driver / filesystem writers one fewer way to shoot themselves in the foot?
> > 
> > VM_FAULT_RETRY is special, it may be used only if FAULT_FLAG_ALLOW_RETRY
> > was set in page fault flags and it means - we have dropped mmap_sem, we
> > loaded page needed to satisfy the fault and now we need to try again (have
> > a look at __lock_page_or_retry()). I have my reservations about this
> > interface but it works...
> 
> Oh, I understand what it's *supposed* to be used for ;-)  It's just
> a bit of an attractive nuisance.  Maybe renaming it to something like
> VM_FAULT_PAGE_RETRY would stop people from thinking that it meant "retry
> the fault".  And we could #define VM_FAULT_RETRY VM_FAULT_NOPAGE so that
> people who want to retry the fault in a normal way could use a return
> value that sounds like it does what they want instead of a return value
> that is supposed to be used to indicate that we put a PFN into the
> page table?

So a better name for VM_FAULT_RETRY and disentangling VM_FAULT_NOPAGE so
that it is not misused for retrying the fault would be nice. But it is a
much larger endeavor (I actually had a look into this some time ago) than
this simple bugfix...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:07 ` Jan Kara
  (?)
@ 2017-02-03 23:20   ` Andrew Morton
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2017-02-03 23:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Al Viro, linux-fsdevel, linux-mm, lustre-devel, cluster-devel

On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack@suse.cz> wrote:

> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.

I'm not getting any sense of the urgency of this fix.  The bug *sounds*
bad?  Which kernel versions need fixing?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 23:20   ` Andrew Morton
  0 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2017-02-03 23:20 UTC (permalink / raw)
  To: Jan Kara; +Cc: Al Viro, linux-fsdevel, linux-mm, lustre-devel, cluster-devel

On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack@suse.cz> wrote:

> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.

I'm not getting any sense of the urgency of this fix.  The bug *sounds*
bad?  Which kernel versions need fixing?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 23:20   ` Andrew Morton
  0 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2017-02-03 23:20 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack@suse.cz> wrote:

> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.

I'm not getting any sense of the urgency of this fix.  The bug *sounds*
bad?  Which kernel versions need fixing?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:07 ` Jan Kara
  (?)
@ 2017-02-03 23:44   ` Xiong, Jinshan
  -1 siblings, 0 replies; 30+ messages in thread
From: Xiong, Jinshan @ 2017-02-03 23:44 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, cluster-devel, linux-mm, linux-fsdevel, Andrew Morton,
	lustre-devel

Hi Jan,

Thanks for the patch. 

The proposed patch should be able to fix the problem, however, do you think it would be a better approach by revising it as:

…
case -EAGAIN:
	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		return VM_FAULT_RETRY;
	}
	return VM_FAULT_NOPAGE;
…

This way it can retry fault routine in mm instead of letting CPU have a new fault access.

Thanks,
Jinshan

> On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.
> 
> CC: lustre-devel@lists.lustre.org
> CC: cluster-devel@redhat.com
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> include/linux/buffer_head.h                      | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> index ee01f20d8b11..9afa6bec3e6f 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> 		result = VM_FAULT_LOCKED;
> 		break;
> 	case -ENODATA:
> +	case -EAGAIN:
> 	case -EFAULT:
> 		result = VM_FAULT_NOPAGE;
> 		break;
> 	case -ENOMEM:
> 		result = VM_FAULT_OOM;
> 		break;
> -	case -EAGAIN:
> -		result = VM_FAULT_RETRY;
> -		break;
> 	default:
> 		result = VM_FAULT_SIGBUS;
> 		break;
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d67ab83823ad..79591c3660cc 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> {
> 	if (err == 0)
> 		return VM_FAULT_LOCKED;
> -	if (err == -EFAULT)
> +	if (err == -EFAULT || err == -EAGAIN)
> 		return VM_FAULT_NOPAGE;
> 	if (err == -ENOMEM)
> 		return VM_FAULT_OOM;
> -	if (err == -EAGAIN)
> -		return VM_FAULT_RETRY;
> 	/* -ENOSPC, -EDQUOT, -EIO ... */
> 	return VM_FAULT_SIGBUS;
> }
> -- 
> 2.10.2
> 
> _______________________________________________
> lustre-devel mailing list
> lustre-devel@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 23:44   ` Xiong, Jinshan
  0 siblings, 0 replies; 30+ messages in thread
From: Xiong, Jinshan @ 2017-02-03 23:44 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, cluster-devel, linux-mm, linux-fsdevel, Andrew Morton,
	lustre-devel

Hi Jan,

Thanks for the patch. 

The proposed patch should be able to fix the problem, however, do you think it would be a better approach by revising it as:

?
case -EAGAIN:
	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		return VM_FAULT_RETRY;
	}
	return VM_FAULT_NOPAGE;
?

This way it can retry fault routine in mm instead of letting CPU have a new fault access.

Thanks,
Jinshan

> On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.
> 
> CC: lustre-devel at lists.lustre.org
> CC: cluster-devel at redhat.com
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> include/linux/buffer_head.h                      | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> index ee01f20d8b11..9afa6bec3e6f 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> 		result = VM_FAULT_LOCKED;
> 		break;
> 	case -ENODATA:
> +	case -EAGAIN:
> 	case -EFAULT:
> 		result = VM_FAULT_NOPAGE;
> 		break;
> 	case -ENOMEM:
> 		result = VM_FAULT_OOM;
> 		break;
> -	case -EAGAIN:
> -		result = VM_FAULT_RETRY;
> -		break;
> 	default:
> 		result = VM_FAULT_SIGBUS;
> 		break;
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d67ab83823ad..79591c3660cc 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> {
> 	if (err == 0)
> 		return VM_FAULT_LOCKED;
> -	if (err == -EFAULT)
> +	if (err == -EFAULT || err == -EAGAIN)
> 		return VM_FAULT_NOPAGE;
> 	if (err == -ENOMEM)
> 		return VM_FAULT_OOM;
> -	if (err == -EAGAIN)
> -		return VM_FAULT_RETRY;
> 	/* -ENOSPC, -EDQUOT, -EIO ... */
> 	return VM_FAULT_SIGBUS;
> }
> -- 
> 2.10.2
> 
> _______________________________________________
> lustre-devel mailing list
> lustre-devel at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-03 23:44   ` Xiong, Jinshan
  0 siblings, 0 replies; 30+ messages in thread
From: Xiong, Jinshan @ 2017-02-03 23:44 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Jan,

Thanks for the patch. 

The proposed patch should be able to fix the problem, however, do you think it would be a better approach by revising it as:

?
case -EAGAIN:
	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		return VM_FAULT_RETRY;
	}
	return VM_FAULT_NOPAGE;
?

This way it can retry fault routine in mm instead of letting CPU have a new fault access.

Thanks,
Jinshan

> On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.
> 
> CC: lustre-devel at lists.lustre.org
> CC: cluster-devel at redhat.com
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> include/linux/buffer_head.h                      | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> index ee01f20d8b11..9afa6bec3e6f 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> 		result = VM_FAULT_LOCKED;
> 		break;
> 	case -ENODATA:
> +	case -EAGAIN:
> 	case -EFAULT:
> 		result = VM_FAULT_NOPAGE;
> 		break;
> 	case -ENOMEM:
> 		result = VM_FAULT_OOM;
> 		break;
> -	case -EAGAIN:
> -		result = VM_FAULT_RETRY;
> -		break;
> 	default:
> 		result = VM_FAULT_SIGBUS;
> 		break;
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d67ab83823ad..79591c3660cc 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> {
> 	if (err == 0)
> 		return VM_FAULT_LOCKED;
> -	if (err == -EFAULT)
> +	if (err == -EFAULT || err == -EAGAIN)
> 		return VM_FAULT_NOPAGE;
> 	if (err == -ENOMEM)
> 		return VM_FAULT_OOM;
> -	if (err == -EAGAIN)
> -		return VM_FAULT_RETRY;
> 	/* -ENOSPC, -EDQUOT, -EIO ... */
> 	return VM_FAULT_SIGBUS;
> }
> -- 
> 2.10.2
> 
> _______________________________________________
> lustre-devel mailing list
> lustre-devel at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 23:44   ` Xiong, Jinshan
  (?)
  (?)
@ 2017-02-06  8:59     ` Jan Kara
  -1 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  8:59 UTC (permalink / raw)
  To: Xiong, Jinshan
  Cc: Jan Kara, Al Viro, cluster-devel, linux-mm, linux-fsdevel,
	Andrew Morton, lustre-devel

Hi Xiong,

On Fri 03-02-17 23:44:57, Xiong, Jinshan wrote:
> Thanks for the patch. 
> 
> The proposed patch should be able to fix the problem, however, do you
> think it would be a better approach by revising it as:
> 
> …
> case -EAGAIN:
> 	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> 		up_read(&mm->mmap_sem);
> 		return VM_FAULT_RETRY;
> 	}
> 	return VM_FAULT_NOPAGE;
> …
> 
> This way it can retry fault routine in mm instead of letting CPU have a
> new fault access.

Well, we could do that but IMHO that is a negligible benefit not worth the
complications in the code. After all these retries should better be rare or
you have bigger problems with your fault handler... What would be
worthwhile is something like:

	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		<wait for condition causing EAGAIN to resolve>
		return VM_FAULT_RETRY;
	}

However that wait is specific to the fault handler so we cannot do that in
the generic code.

								Honza

> > On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> > 
> > CC: lustre-devel@lists.lustre.org
> > CC: cluster-devel@redhat.com
> > Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> > include/linux/buffer_head.h                      | 4 +---
> > 2 files changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > index ee01f20d8b11..9afa6bec3e6f 100644
> > --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > 		result = VM_FAULT_LOCKED;
> > 		break;
> > 	case -ENODATA:
> > +	case -EAGAIN:
> > 	case -EFAULT:
> > 		result = VM_FAULT_NOPAGE;
> > 		break;
> > 	case -ENOMEM:
> > 		result = VM_FAULT_OOM;
> > 		break;
> > -	case -EAGAIN:
> > -		result = VM_FAULT_RETRY;
> > -		break;
> > 	default:
> > 		result = VM_FAULT_SIGBUS;
> > 		break;
> > diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> > index d67ab83823ad..79591c3660cc 100644
> > --- a/include/linux/buffer_head.h
> > +++ b/include/linux/buffer_head.h
> > @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> > {
> > 	if (err == 0)
> > 		return VM_FAULT_LOCKED;
> > -	if (err == -EFAULT)
> > +	if (err == -EFAULT || err == -EAGAIN)
> > 		return VM_FAULT_NOPAGE;
> > 	if (err == -ENOMEM)
> > 		return VM_FAULT_OOM;
> > -	if (err == -EAGAIN)
> > -		return VM_FAULT_RETRY;
> > 	/* -ENOSPC, -EDQUOT, -EIO ... */
> > 	return VM_FAULT_SIGBUS;
> > }
> > -- 
> > 2.10.2
> > 
> > _______________________________________________
> > lustre-devel mailing list
> > lustre-devel@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06  8:59     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  8:59 UTC (permalink / raw)
  To: Xiong, Jinshan
  Cc: Jan Kara, Al Viro, cluster-devel, linux-mm, linux-fsdevel,
	Andrew Morton, lustre-devel

Hi Xiong,

On Fri 03-02-17 23:44:57, Xiong, Jinshan wrote:
> Thanks for the patch. 
> 
> The proposed patch should be able to fix the problem, however, do you
> think it would be a better approach by revising it as:
> 
> a?|
> case -EAGAIN:
> 	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> 		up_read(&mm->mmap_sem);
> 		return VM_FAULT_RETRY;
> 	}
> 	return VM_FAULT_NOPAGE;
> a?|
> 
> This way it can retry fault routine in mm instead of letting CPU have a
> new fault access.

Well, we could do that but IMHO that is a negligible benefit not worth the
complications in the code. After all these retries should better be rare or
you have bigger problems with your fault handler... What would be
worthwhile is something like:

	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		<wait for condition causing EAGAIN to resolve>
		return VM_FAULT_RETRY;
	}

However that wait is specific to the fault handler so we cannot do that in
the generic code.

								Honza

> > On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> > 
> > CC: lustre-devel@lists.lustre.org
> > CC: cluster-devel@redhat.com
> > Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> > include/linux/buffer_head.h                      | 4 +---
> > 2 files changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > index ee01f20d8b11..9afa6bec3e6f 100644
> > --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > 		result = VM_FAULT_LOCKED;
> > 		break;
> > 	case -ENODATA:
> > +	case -EAGAIN:
> > 	case -EFAULT:
> > 		result = VM_FAULT_NOPAGE;
> > 		break;
> > 	case -ENOMEM:
> > 		result = VM_FAULT_OOM;
> > 		break;
> > -	case -EAGAIN:
> > -		result = VM_FAULT_RETRY;
> > -		break;
> > 	default:
> > 		result = VM_FAULT_SIGBUS;
> > 		break;
> > diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> > index d67ab83823ad..79591c3660cc 100644
> > --- a/include/linux/buffer_head.h
> > +++ b/include/linux/buffer_head.h
> > @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> > {
> > 	if (err == 0)
> > 		return VM_FAULT_LOCKED;
> > -	if (err == -EFAULT)
> > +	if (err == -EFAULT || err == -EAGAIN)
> > 		return VM_FAULT_NOPAGE;
> > 	if (err == -ENOMEM)
> > 		return VM_FAULT_OOM;
> > -	if (err == -EAGAIN)
> > -		return VM_FAULT_RETRY;
> > 	/* -ENOSPC, -EDQUOT, -EIO ... */
> > 	return VM_FAULT_SIGBUS;
> > }
> > -- 
> > 2.10.2
> > 
> > _______________________________________________
> > lustre-devel mailing list
> > lustre-devel@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06  8:59     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  8:59 UTC (permalink / raw)
  To: Xiong, Jinshan
  Cc: Jan Kara, Al Viro, cluster-devel, linux-mm, linux-fsdevel,
	Andrew Morton, lustre-devel

Hi Xiong,

On Fri 03-02-17 23:44:57, Xiong, Jinshan wrote:
> Thanks for the patch. 
> 
> The proposed patch should be able to fix the problem, however, do you
> think it would be a better approach by revising it as:
> 
> ?
> case -EAGAIN:
> 	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> 		up_read(&mm->mmap_sem);
> 		return VM_FAULT_RETRY;
> 	}
> 	return VM_FAULT_NOPAGE;
> ?
> 
> This way it can retry fault routine in mm instead of letting CPU have a
> new fault access.

Well, we could do that but IMHO that is a negligible benefit not worth the
complications in the code. After all these retries should better be rare or
you have bigger problems with your fault handler... What would be
worthwhile is something like:

	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		<wait for condition causing EAGAIN to resolve>
		return VM_FAULT_RETRY;
	}

However that wait is specific to the fault handler so we cannot do that in
the generic code.

								Honza

> > On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> > 
> > CC: lustre-devel at lists.lustre.org
> > CC: cluster-devel at redhat.com
> > Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> > include/linux/buffer_head.h                      | 4 +---
> > 2 files changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > index ee01f20d8b11..9afa6bec3e6f 100644
> > --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > 		result = VM_FAULT_LOCKED;
> > 		break;
> > 	case -ENODATA:
> > +	case -EAGAIN:
> > 	case -EFAULT:
> > 		result = VM_FAULT_NOPAGE;
> > 		break;
> > 	case -ENOMEM:
> > 		result = VM_FAULT_OOM;
> > 		break;
> > -	case -EAGAIN:
> > -		result = VM_FAULT_RETRY;
> > -		break;
> > 	default:
> > 		result = VM_FAULT_SIGBUS;
> > 		break;
> > diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> > index d67ab83823ad..79591c3660cc 100644
> > --- a/include/linux/buffer_head.h
> > +++ b/include/linux/buffer_head.h
> > @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> > {
> > 	if (err == 0)
> > 		return VM_FAULT_LOCKED;
> > -	if (err == -EFAULT)
> > +	if (err == -EFAULT || err == -EAGAIN)
> > 		return VM_FAULT_NOPAGE;
> > 	if (err == -ENOMEM)
> > 		return VM_FAULT_OOM;
> > -	if (err == -EAGAIN)
> > -		return VM_FAULT_RETRY;
> > 	/* -ENOSPC, -EDQUOT, -EIO ... */
> > 	return VM_FAULT_SIGBUS;
> > }
> > -- 
> > 2.10.2
> > 
> > _______________________________________________
> > lustre-devel mailing list
> > lustre-devel at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06  8:59     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  8:59 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Xiong,

On Fri 03-02-17 23:44:57, Xiong, Jinshan wrote:
> Thanks for the patch. 
> 
> The proposed patch should be able to fix the problem, however, do you
> think it would be a better approach by revising it as:
> 
> ?
> case -EAGAIN:
> 	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> 		up_read(&mm->mmap_sem);
> 		return VM_FAULT_RETRY;
> 	}
> 	return VM_FAULT_NOPAGE;
> ?
> 
> This way it can retry fault routine in mm instead of letting CPU have a
> new fault access.

Well, we could do that but IMHO that is a negligible benefit not worth the
complications in the code. After all these retries should better be rare or
you have bigger problems with your fault handler... What would be
worthwhile is something like:

	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
		up_read(&mm->mmap_sem);
		<wait for condition causing EAGAIN to resolve>
		return VM_FAULT_RETRY;
	}

However that wait is specific to the fault handler so we cannot do that in
the generic code.

								Honza

> > On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> > 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> > 
> > CC: lustre-devel at lists.lustre.org
> > CC: cluster-devel at redhat.com
> > Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> > include/linux/buffer_head.h                      | 4 +---
> > 2 files changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > index ee01f20d8b11..9afa6bec3e6f 100644
> > --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> > @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > 		result = VM_FAULT_LOCKED;
> > 		break;
> > 	case -ENODATA:
> > +	case -EAGAIN:
> > 	case -EFAULT:
> > 		result = VM_FAULT_NOPAGE;
> > 		break;
> > 	case -ENOMEM:
> > 		result = VM_FAULT_OOM;
> > 		break;
> > -	case -EAGAIN:
> > -		result = VM_FAULT_RETRY;
> > -		break;
> > 	default:
> > 		result = VM_FAULT_SIGBUS;
> > 		break;
> > diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> > index d67ab83823ad..79591c3660cc 100644
> > --- a/include/linux/buffer_head.h
> > +++ b/include/linux/buffer_head.h
> > @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> > {
> > 	if (err == 0)
> > 		return VM_FAULT_LOCKED;
> > -	if (err == -EFAULT)
> > +	if (err == -EFAULT || err == -EAGAIN)
> > 		return VM_FAULT_NOPAGE;
> > 	if (err == -ENOMEM)
> > 		return VM_FAULT_OOM;
> > -	if (err == -EAGAIN)
> > -		return VM_FAULT_RETRY;
> > 	/* -ENOSPC, -EDQUOT, -EIO ... */
> > 	return VM_FAULT_SIGBUS;
> > }
> > -- 
> > 2.10.2
> > 
> > _______________________________________________
> > lustre-devel mailing list
> > lustre-devel at lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 23:20   ` Andrew Morton
  (?)
@ 2017-02-06  9:24     ` Jan Kara
  -1 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  9:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm, lustre-devel, cluster-devel

On Fri 03-02-17 15:20:54, Andrew Morton wrote:
> On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack@suse.cz> wrote:
> 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> 
> I'm not getting any sense of the urgency of this fix.  The bug *sounds*
> bad?  Which kernel versions need fixing?

So I did more analysis of GFS2 and Lustre behavior. AFAICS GFS2 returns
EAGAIN only for truncated page, when we then return with VM_FAULT_RETRY,
do_page_mkwrite() locks the page, sees it is truncated and bails out
properly thus silently fixes up the problem. The Lustre bug looks like it
could actually result in some real problems and the bug is there since the
initial commit in which Lustre was added in 3.11 (d7e09d0397e84).

So overall the issue doesn't look like too serious currently but it is
certainly a serious bug waiting to happen.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06  9:24     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  9:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm, lustre-devel, cluster-devel

On Fri 03-02-17 15:20:54, Andrew Morton wrote:
> On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack@suse.cz> wrote:
> 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> 
> I'm not getting any sense of the urgency of this fix.  The bug *sounds*
> bad?  Which kernel versions need fixing?

So I did more analysis of GFS2 and Lustre behavior. AFAICS GFS2 returns
EAGAIN only for truncated page, when we then return with VM_FAULT_RETRY,
do_page_mkwrite() locks the page, sees it is truncated and bails out
properly thus silently fixes up the problem. The Lustre bug looks like it
could actually result in some real problems and the bug is there since the
initial commit in which Lustre was added in 3.11 (d7e09d0397e84).

So overall the issue doesn't look like too serious currently but it is
certainly a serious bug waiting to happen.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06  9:24     ` Jan Kara
  0 siblings, 0 replies; 30+ messages in thread
From: Jan Kara @ 2017-02-06  9:24 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri 03-02-17 15:20:54, Andrew Morton wrote:
> On Fri,  3 Feb 2017 16:07:29 +0100 Jan Kara <jack@suse.cz> wrote:
> 
> > Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> > code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> > from ->page_mkwrite is completely unhandled by the mm code and results
> > in locking and writeably mapping the page which definitely is not what
> > the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> > filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> > results in bailing out from the fault code, the CPU then retries the
> > access, and we fault again effectively doing what the handler wanted.
> 
> I'm not getting any sense of the urgency of this fix.  The bug *sounds*
> bad?  Which kernel versions need fixing?

So I did more analysis of GFS2 and Lustre behavior. AFAICS GFS2 returns
EAGAIN only for truncated page, when we then return with VM_FAULT_RETRY,
do_page_mkwrite() locks the page, sees it is truncated and bails out
properly thus silently fixes up the problem. The Lustre bug looks like it
could actually result in some real problems and the bug is there since the
initial commit in which Lustre was added in 3.11 (d7e09d0397e84).

So overall the issue doesn't look like too serious currently but it is
certainly a serious bug waiting to happen.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  2017-02-03 15:07 ` Jan Kara
  (?)
@ 2017-02-06 20:52   ` Xiong, Jinshan
  -1 siblings, 0 replies; 30+ messages in thread
From: Xiong, Jinshan @ 2017-02-06 20:52 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, cluster-devel, linux-mm, linux-fsdevel, Andrew Morton,
	lustre-devel

looks good to me.

Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>

> On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.
> 
> CC: lustre-devel@lists.lustre.org
> CC: cluster-devel@redhat.com
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> include/linux/buffer_head.h                      | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> index ee01f20d8b11..9afa6bec3e6f 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> 		result = VM_FAULT_LOCKED;
> 		break;
> 	case -ENODATA:
> +	case -EAGAIN:
> 	case -EFAULT:
> 		result = VM_FAULT_NOPAGE;
> 		break;
> 	case -ENOMEM:
> 		result = VM_FAULT_OOM;
> 		break;
> -	case -EAGAIN:
> -		result = VM_FAULT_RETRY;
> -		break;
> 	default:
> 		result = VM_FAULT_SIGBUS;
> 		break;
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d67ab83823ad..79591c3660cc 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> {
> 	if (err == 0)
> 		return VM_FAULT_LOCKED;
> -	if (err == -EFAULT)
> +	if (err == -EFAULT || err == -EAGAIN)
> 		return VM_FAULT_NOPAGE;
> 	if (err == -ENOMEM)
> 		return VM_FAULT_OOM;
> -	if (err == -EAGAIN)
> -		return VM_FAULT_RETRY;
> 	/* -ENOSPC, -EDQUOT, -EIO ... */
> 	return VM_FAULT_SIGBUS;
> }
> -- 
> 2.10.2
> 
> _______________________________________________
> lustre-devel mailing list
> lustre-devel@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06 20:52   ` Xiong, Jinshan
  0 siblings, 0 replies; 30+ messages in thread
From: Xiong, Jinshan @ 2017-02-06 20:52 UTC (permalink / raw)
  To: Jan Kara
  Cc: Al Viro, cluster-devel, linux-mm, linux-fsdevel, Andrew Morton,
	lustre-devel

looks good to me.

Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>

> On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.
> 
> CC: lustre-devel at lists.lustre.org
> CC: cluster-devel at redhat.com
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> include/linux/buffer_head.h                      | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> index ee01f20d8b11..9afa6bec3e6f 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> 		result = VM_FAULT_LOCKED;
> 		break;
> 	case -ENODATA:
> +	case -EAGAIN:
> 	case -EFAULT:
> 		result = VM_FAULT_NOPAGE;
> 		break;
> 	case -ENOMEM:
> 		result = VM_FAULT_OOM;
> 		break;
> -	case -EAGAIN:
> -		result = VM_FAULT_RETRY;
> -		break;
> 	default:
> 		result = VM_FAULT_SIGBUS;
> 		break;
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d67ab83823ad..79591c3660cc 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> {
> 	if (err == 0)
> 		return VM_FAULT_LOCKED;
> -	if (err == -EFAULT)
> +	if (err == -EFAULT || err == -EAGAIN)
> 		return VM_FAULT_NOPAGE;
> 	if (err == -ENOMEM)
> 		return VM_FAULT_OOM;
> -	if (err == -EAGAIN)
> -		return VM_FAULT_RETRY;
> 	/* -ENOSPC, -EDQUOT, -EIO ... */
> 	return VM_FAULT_SIGBUS;
> }
> -- 
> 2.10.2
> 
> _______________________________________________
> lustre-devel mailing list
> lustre-devel at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Cluster-devel] [lustre-devel] [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
@ 2017-02-06 20:52   ` Xiong, Jinshan
  0 siblings, 0 replies; 30+ messages in thread
From: Xiong, Jinshan @ 2017-02-06 20:52 UTC (permalink / raw)
  To: cluster-devel.redhat.com

looks good to me.

Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>

> On Feb 3, 2017, at 7:07 AM, Jan Kara <jack@suse.cz> wrote:
> 
> Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
> code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
> from ->page_mkwrite is completely unhandled by the mm code and results
> in locking and writeably mapping the page which definitely is not what
> the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other
> filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which
> results in bailing out from the fault code, the CPU then retries the
> access, and we fault again effectively doing what the handler wanted.
> 
> CC: lustre-devel at lists.lustre.org
> CC: cluster-devel at redhat.com
> Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/staging/lustre/lustre/llite/llite_mmap.c | 4 +---
> include/linux/buffer_head.h                      | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> index ee01f20d8b11..9afa6bec3e6f 100644
> --- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
> +++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
> @@ -390,15 +390,13 @@ static int ll_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> 		result = VM_FAULT_LOCKED;
> 		break;
> 	case -ENODATA:
> +	case -EAGAIN:
> 	case -EFAULT:
> 		result = VM_FAULT_NOPAGE;
> 		break;
> 	case -ENOMEM:
> 		result = VM_FAULT_OOM;
> 		break;
> -	case -EAGAIN:
> -		result = VM_FAULT_RETRY;
> -		break;
> 	default:
> 		result = VM_FAULT_SIGBUS;
> 		break;
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index d67ab83823ad..79591c3660cc 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -243,12 +243,10 @@ static inline int block_page_mkwrite_return(int err)
> {
> 	if (err == 0)
> 		return VM_FAULT_LOCKED;
> -	if (err == -EFAULT)
> +	if (err == -EFAULT || err == -EAGAIN)
> 		return VM_FAULT_NOPAGE;
> 	if (err == -ENOMEM)
> 		return VM_FAULT_OOM;
> -	if (err == -EAGAIN)
> -		return VM_FAULT_RETRY;
> 	/* -ENOSPC, -EDQUOT, -EIO ... */
> 	return VM_FAULT_SIGBUS;
> }
> -- 
> 2.10.2
> 
> _______________________________________________
> lustre-devel mailing list
> lustre-devel at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org




^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2017-02-06 20:52 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-03 15:07 [PATCH] mm: Avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers Jan Kara
2017-02-03 15:07 ` [Cluster-devel] " Jan Kara
2017-02-03 15:07 ` [lustre-devel] " Jan Kara
2017-02-03 15:07 ` Jan Kara
2017-02-03 15:13 ` Matthew Wilcox
2017-02-03 15:13   ` [Cluster-devel] " Matthew Wilcox
2017-02-03 15:46   ` Jan Kara
2017-02-03 15:46     ` [Cluster-devel] " Jan Kara
2017-02-03 15:46     ` [lustre-devel] " Jan Kara
2017-02-03 15:53     ` Matthew Wilcox
2017-02-03 15:53       ` [Cluster-devel] " Matthew Wilcox
2017-02-03 16:10       ` Jan Kara
2017-02-03 16:10         ` [Cluster-devel] " Jan Kara
2017-02-03 16:10         ` [lustre-devel] " Jan Kara
2017-02-03 23:20 ` Andrew Morton
2017-02-03 23:20   ` [Cluster-devel] " Andrew Morton
2017-02-03 23:20   ` Andrew Morton
2017-02-06  9:24   ` Jan Kara
2017-02-06  9:24     ` [Cluster-devel] " Jan Kara
2017-02-06  9:24     ` [lustre-devel] " Jan Kara
2017-02-03 23:44 ` Xiong, Jinshan
2017-02-03 23:44   ` [Cluster-devel] " Xiong, Jinshan
2017-02-03 23:44   ` Xiong, Jinshan
2017-02-06  8:59   ` Jan Kara
2017-02-06  8:59     ` [Cluster-devel] " Jan Kara
2017-02-06  8:59     ` Jan Kara
2017-02-06  8:59     ` Jan Kara
2017-02-06 20:52 ` Xiong, Jinshan
2017-02-06 20:52   ` [Cluster-devel] " Xiong, Jinshan
2017-02-06 20:52   ` Xiong, Jinshan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.