linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] mm/gup.c: fix the wrong comments
@ 2019-04-08  2:37 Huang Shijie
  2019-04-08  2:37 ` [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages Huang Shijie
  2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox
  0 siblings, 2 replies; 13+ messages in thread
From: Huang Shijie @ 2019-04-08  2:37 UTC (permalink / raw)
  To: akpm
  Cc: william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm,
	linux-kernel, Huang Shijie

When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
get_user_pages_fast().

In the following scenario, we will may meet the bug in the DMA case:
	    .....................
	    get_user_pages_fast(start,,, pages);
	        ......
	    sg_alloc_table_from_pages(, pages, ...);
	    .....................

The root cause is that sg_alloc_table_from_pages() requires the
page order to keep the same as it used in the user space, but
get_user_pages_fast() will mess it up.

So change the comments, and make it more clear for the driver
users.

Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
---
 mm/gup.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 22acdd0f79ff..fb11ff90ba3b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1129,10 +1129,6 @@ EXPORT_SYMBOL(get_user_pages_locked);
  *  with:
  *
  *      get_user_pages_unlocked(tsk, mm, ..., pages);
- *
- * It is functionally equivalent to get_user_pages_fast so
- * get_user_pages_fast should be used instead if specific gup_flags
- * (e.g. FOLL_FORCE) are not required.
  */
 long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 			     struct page **pages, unsigned int gup_flags)
@@ -2147,6 +2143,10 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
  * If not successful, it will fall back to taking the lock and
  * calling get_user_pages().
  *
+ * Note this routine may fill the pages array with entries in a
+ * different order than get_user_pages_unlocked(), which may cause
+ * issues for callers expecting the routines to be equivalent.
+ *
  * Returns number of pages pinned. This may be fewer than the number
  * requested. If nr_pages is 0 or negative, returns 0. If no pages
  * were pinned, returns -errno.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages
  2019-04-08  2:37 [PATCH 1/2] mm/gup.c: fix the wrong comments Huang Shijie
@ 2019-04-08  2:37 ` Huang Shijie
  2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox
  1 sibling, 0 replies; 13+ messages in thread
From: Huang Shijie @ 2019-04-08  2:37 UTC (permalink / raw)
  To: akpm
  Cc: william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm,
	linux-kernel, Huang Shijie

The get_user_pages_fast() may mess up the page order in @pages array,
We will get the wrong DMA results in this case.

Add more commit to clarify it.

Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
---
 lib/scatterlist.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 739dc9fe2c55..c170afb1a25e 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -450,6 +450,9 @@ EXPORT_SYMBOL(__sg_alloc_table_from_pages);
  *    specified by the page array. The returned sg table is released by
  *    sg_free_table.
  *
+ * Note: Do not use get_user_pages_fast() to pin the pages for @pages array,
+ *       it may mess up the page order, and we will get the wrong DMA results.
+
  * Returns:
  *   0 on success, negative error on failure
  */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-08  2:37 [PATCH 1/2] mm/gup.c: fix the wrong comments Huang Shijie
  2019-04-08  2:37 ` [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages Huang Shijie
@ 2019-04-08 14:13 ` Matthew Wilcox
  2019-04-09  1:08   ` Huang Shijie
  1 sibling, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2019-04-08 14:13 UTC (permalink / raw)
  To: Huang Shijie
  Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> get_user_pages_fast().
> 
> In the following scenario, we will may meet the bug in the DMA case:
> 	    .....................
> 	    get_user_pages_fast(start,,, pages);
> 	        ......
> 	    sg_alloc_table_from_pages(, pages, ...);
> 	    .....................
> 
> The root cause is that sg_alloc_table_from_pages() requires the
> page order to keep the same as it used in the user space, but
> get_user_pages_fast() will mess it up.

I don't understand how get_user_pages_fast() can return the pages in a
different order in the array from the order they appear in userspace.
Can you explain?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox
@ 2019-04-09  1:08   ` Huang Shijie
  2019-04-09  2:49     ` Matthew Wilcox
  2019-04-09 20:23     ` Ira Weiny
  0 siblings, 2 replies; 13+ messages in thread
From: Huang Shijie @ 2019-04-09  1:08 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > get_user_pages_fast().
> > 
> > In the following scenario, we will may meet the bug in the DMA case:
> > 	    .....................
> > 	    get_user_pages_fast(start,,, pages);
> > 	        ......
> > 	    sg_alloc_table_from_pages(, pages, ...);
> > 	    .....................
> > 
> > The root cause is that sg_alloc_table_from_pages() requires the
> > page order to keep the same as it used in the user space, but
> > get_user_pages_fast() will mess it up.
> 
> I don't understand how get_user_pages_fast() can return the pages in a
> different order in the array from the order they appear in userspace.
> Can you explain?
Please see the code in gup.c:

	int get_user_pages_fast(unsigned long start, int nr_pages,
				unsigned int gup_flags, struct page **pages)
	{
		.......
		if (gup_fast_permitted(start, nr_pages)) {
			local_irq_disable();
			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.
			local_irq_enable();
			ret = nr;
		}
		.......
		if (nr < nr_pages) {
			/* Try to get the remaining pages with get_user_pages */
			start += nr << PAGE_SHIFT;
			pages += nr;                                                  // The @pages is moved forward.

			if (gup_flags & FOLL_LONGTERM) {
				down_read(&current->mm->mmap_sem);
				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time
							    start, nr_pages - nr,
							    pages, NULL, gup_flags);
				up_read(&current->mm->mmap_sem);
			} else {
				/*
				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
				 * possible
				 */
				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
							      pages, gup_flags);
			}
		}


		.....................

BTW, I do not know why we mess up the page order. It maybe used in some special case.

Thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09  1:08   ` Huang Shijie
@ 2019-04-09  2:49     ` Matthew Wilcox
  2019-04-09  3:04       ` Huang Shijie
  2019-04-09 20:23     ` Ira Weiny
  1 sibling, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2019-04-09  2:49 UTC (permalink / raw)
  To: Huang Shijie
  Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > The root cause is that sg_alloc_table_from_pages() requires the
> > > page order to keep the same as it used in the user space, but
> > > get_user_pages_fast() will mess it up.
> > 
> > I don't understand how get_user_pages_fast() can return the pages in a
> > different order in the array from the order they appear in userspace.
> > Can you explain?
> Please see the code in gup.c:
> 
> 	int get_user_pages_fast(unsigned long start, int nr_pages,
> 				unsigned int gup_flags, struct page **pages)
> 	{
> 		.......
> 		if (gup_fast_permitted(start, nr_pages)) {
> 			local_irq_disable();
> 			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.

Right ... but if it's not filled entirely, it will be filled part-way,
and then we stop.

> 			local_irq_enable();
> 			ret = nr;
> 		}
> 		.......
> 		if (nr < nr_pages) {
> 			/* Try to get the remaining pages with get_user_pages */
> 			start += nr << PAGE_SHIFT;
> 			pages += nr;                                                  // The @pages is moved forward.

Yes, to the point where gup_pgd_range() stopped.

> 			if (gup_flags & FOLL_LONGTERM) {
> 				down_read(&current->mm->mmap_sem);
> 				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time

Right.

> 				/*
> 				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
> 				 * possible
> 				 */
> 				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
> 							      pages, gup_flags);

Yes.  But they'll be in the same order.

> BTW, I do not know why we mess up the page order. It maybe used in some special case.

I'm not discounting the possibility that you've found a bug.
But documenting that a bug exists is not the solution; the solution is
fixing the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09  2:49     ` Matthew Wilcox
@ 2019-04-09  3:04       ` Huang Shijie
  2019-04-09 11:19         ` Matthew Wilcox
  0 siblings, 1 reply; 13+ messages in thread
From: Huang Shijie @ 2019-04-09  3:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > page order to keep the same as it used in the user space, but
> > > > get_user_pages_fast() will mess it up.
> > > 
> > > I don't understand how get_user_pages_fast() can return the pages in a
> > > different order in the array from the order they appear in userspace.
> > > Can you explain?
> > Please see the code in gup.c:
> > 
> > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > 				unsigned int gup_flags, struct page **pages)
> > 	{
> > 		.......
> > 		if (gup_fast_permitted(start, nr_pages)) {
> > 			local_irq_disable();
> > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.
> 
> Right ... but if it's not filled entirely, it will be filled part-way,
> and then we stop.
> 
> > 			local_irq_enable();
> > 			ret = nr;
> > 		}
> > 		.......
> > 		if (nr < nr_pages) {
> > 			/* Try to get the remaining pages with get_user_pages */
> > 			start += nr << PAGE_SHIFT;
> > 			pages += nr;                                                  // The @pages is moved forward.
> 
> Yes, to the point where gup_pgd_range() stopped.
> 
> > 			if (gup_flags & FOLL_LONGTERM) {
> > 				down_read(&current->mm->mmap_sem);
> > 				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time
> 
> Right.
> 
> > 				/*
> > 				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > 				 * possible
> > 				 */
> > 				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
> > 							      pages, gup_flags);
> 
> Yes.  But they'll be in the same order.
> 
> > BTW, I do not know why we mess up the page order. It maybe used in some special case.
> 
> I'm not discounting the possibility that you've found a bug.
> But documenting that a bug exists is not the solution; the solution is
> fixing the bug.
I do not think it is a bug :)

If we use the get_user_pages_unlocked(), DMA is okay, such as:
                     ....
		     get_user_pages_unlocked()
		     sg_alloc_table_from_pages()
	             .....

I think the comment is not accurate enough. So just add more comments, and tell the driver
users how to use the GUPs.

Thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09  3:04       ` Huang Shijie
@ 2019-04-09 11:19         ` Matthew Wilcox
  2019-04-09 14:55           ` Weiny, Ira
  0 siblings, 1 reply; 13+ messages in thread
From: Matthew Wilcox @ 2019-04-09 11:19 UTC (permalink / raw)
  To: Huang Shijie
  Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > > page order to keep the same as it used in the user space, but
> > > > > get_user_pages_fast() will mess it up.
> > > > 
> > > > I don't understand how get_user_pages_fast() can return the pages in a
> > > > different order in the array from the order they appear in userspace.
> > > > Can you explain?
> > > Please see the code in gup.c:
> > > 
> > > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > > 				unsigned int gup_flags, struct page **pages)
> > > 	{
> > > 		.......
> > > 		if (gup_fast_permitted(start, nr_pages)) {
> > > 			local_irq_disable();
> > > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.
> > 
> > Right ... but if it's not filled entirely, it will be filled part-way,
> > and then we stop.
> > 
> > > 			local_irq_enable();
> > > 			ret = nr;
> > > 		}
> > > 		.......
> > > 		if (nr < nr_pages) {
> > > 			/* Try to get the remaining pages with get_user_pages */
> > > 			start += nr << PAGE_SHIFT;
> > > 			pages += nr;                                                  // The @pages is moved forward.
> > 
> > Yes, to the point where gup_pgd_range() stopped.
> > 
> > > 			if (gup_flags & FOLL_LONGTERM) {
> > > 				down_read(&current->mm->mmap_sem);
> > > 				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time
> > 
> > Right.
> > 
> > > 				/*
> > > 				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > > 				 * possible
> > > 				 */
> > > 				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
> > > 							      pages, gup_flags);
> > 
> > Yes.  But they'll be in the same order.
> > 
> > > BTW, I do not know why we mess up the page order. It maybe used in some special case.
> > 
> > I'm not discounting the possibility that you've found a bug.
> > But documenting that a bug exists is not the solution; the solution is
> > fixing the bug.
> I do not think it is a bug :)
> 
> If we use the get_user_pages_unlocked(), DMA is okay, such as:
>                      ....
> 		     get_user_pages_unlocked()
> 		     sg_alloc_table_from_pages()
> 	             .....
> 
> I think the comment is not accurate enough. So just add more comments, and tell the driver
> users how to use the GUPs.

gup_fast() and gup_unlocked() should return the pages in the same order.
If they do not, then it is a bug.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09 11:19         ` Matthew Wilcox
@ 2019-04-09 14:55           ` Weiny, Ira
  2019-04-10  1:20             ` Huang Shijie
  0 siblings, 1 reply; 13+ messages in thread
From: Weiny, Ira @ 2019-04-09 14:55 UTC (permalink / raw)
  To: Matthew Wilcox, Huang Shijie
  Cc: akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel

> On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > > The root cause is that sg_alloc_table_from_pages() requires
> > > > > > the page order to keep the same as it used in the user space,
> > > > > > but
> > > > > > get_user_pages_fast() will mess it up.
> > > > >
> > > > > I don't understand how get_user_pages_fast() can return the
> > > > > pages in a different order in the array from the order they appear in
> userspace.
> > > > > Can you explain?
> > > > Please see the code in gup.c:
> > > >
> > > > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > 				unsigned int gup_flags, struct page **pages)
> > > > 	{
> > > > 		.......
> > > > 		if (gup_fast_permitted(start, nr_pages)) {
> > > > 			local_irq_disable();
> > > > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);
> // The @pages array maybe filled at the first time.
> > >
> > > Right ... but if it's not filled entirely, it will be filled
> > > part-way, and then we stop.
> > >
> > > > 			local_irq_enable();
> > > > 			ret = nr;
> > > > 		}
> > > > 		.......
> > > > 		if (nr < nr_pages) {
> > > > 			/* Try to get the remaining pages with
> get_user_pages */
> > > > 			start += nr << PAGE_SHIFT;
> > > > 			pages += nr;                                                  // The
> @pages is moved forward.
> > >
> > > Yes, to the point where gup_pgd_range() stopped.
> > >
> > > > 			if (gup_flags & FOLL_LONGTERM) {
> > > > 				down_read(&current->mm->mmap_sem);
> > > > 				ret = __gup_longterm_locked(current,
> current->mm,      // The @pages maybe filled at the second time
> > >
> > > Right.
> > >
> > > > 				/*
> > > > 				 * retain FAULT_FOLL_ALLOW_RETRY
> optimization if
> > > > 				 * possible
> > > > 				 */
> > > > 				ret = get_user_pages_unlocked(start,
> nr_pages - nr,    // The @pages maybe filled at the second time.
> > > > 							      pages, gup_flags);
> > >
> > > Yes.  But they'll be in the same order.
> > >
> > > > BTW, I do not know why we mess up the page order. It maybe used in
> some special case.
> > >
> > > I'm not discounting the possibility that you've found a bug.
> > > But documenting that a bug exists is not the solution; the solution
> > > is fixing the bug.
> > I do not think it is a bug :)
> >
> > If we use the get_user_pages_unlocked(), DMA is okay, such as:
> >                      ....
> > 		     get_user_pages_unlocked()
> > 		     sg_alloc_table_from_pages()
> > 	             .....
> >
> > I think the comment is not accurate enough. So just add more comments,
> > and tell the driver users how to use the GUPs.
> 
> gup_fast() and gup_unlocked() should return the pages in the same order.
> If they do not, then it is a bug.

Is there a reproducer for this?  Or do you have some debug output which shows this problem?

Ira


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09  1:08   ` Huang Shijie
  2019-04-09  2:49     ` Matthew Wilcox
@ 2019-04-09 20:23     ` Ira Weiny
  2019-04-10  1:18       ` Huang Shijie
  1 sibling, 1 reply; 13+ messages in thread
From: Ira Weiny @ 2019-04-09 20:23 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > > get_user_pages_fast().
> > > 
> > > In the following scenario, we will may meet the bug in the DMA case:
> > > 	    .....................
> > > 	    get_user_pages_fast(start,,, pages);
> > > 	        ......
> > > 	    sg_alloc_table_from_pages(, pages, ...);
> > > 	    .....................
> > > 
> > > The root cause is that sg_alloc_table_from_pages() requires the
> > > page order to keep the same as it used in the user space, but
> > > get_user_pages_fast() will mess it up.
> > 
> > I don't understand how get_user_pages_fast() can return the pages in a
> > different order in the array from the order they appear in userspace.
> > Can you explain?
> Please see the code in gup.c:
> 
> 	int get_user_pages_fast(unsigned long start, int nr_pages,
> 				unsigned int gup_flags, struct page **pages)
> 	{
> 		.......
> 		if (gup_fast_permitted(start, nr_pages)) {
> 			local_irq_disable();
> 			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.
> 			local_irq_enable();
> 			ret = nr;
> 		}
> 		.......
> 		if (nr < nr_pages) {
> 			/* Try to get the remaining pages with get_user_pages */
> 			start += nr << PAGE_SHIFT;
> 			pages += nr;                                                  // The @pages is moved forward.
> 
> 			if (gup_flags & FOLL_LONGTERM) {
> 				down_read(&current->mm->mmap_sem);
> 				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time
>

Neither this nor the get_user_pages_unlocked is filling the pages a second
time.  It is adding to the page array having moved start and the page array
forward.

Are you doing a FOLL_LONGTERM GUP?  Or are you in the else clause below when
you get this bug?

Ira

> 							    start, nr_pages - nr,
> 							    pages, NULL, gup_flags);
> 				up_read(&current->mm->mmap_sem);
> 			} else {
> 				/*
> 				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
> 				 * possible
> 				 */
> 				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
> 							      pages, gup_flags);
> 			}
> 		}
> 
> 
> 		.....................
> 
> BTW, I do not know why we mess up the page order. It maybe used in some special case.
> 
> Thanks
> Huang Shijie
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09 20:23     ` Ira Weiny
@ 2019-04-10  1:18       ` Huang Shijie
  2019-04-10 18:08         ` Ira Weiny
  0 siblings, 1 reply; 13+ messages in thread
From: Huang Shijie @ 2019-04-10  1:18 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Tue, Apr 09, 2019 at 01:23:16PM -0700, Ira Weiny wrote:
> On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > > > get_user_pages_fast().
> > > > 
> > > > In the following scenario, we will may meet the bug in the DMA case:
> > > > 	    .....................
> > > > 	    get_user_pages_fast(start,,, pages);
> > > > 	        ......
> > > > 	    sg_alloc_table_from_pages(, pages, ...);
> > > > 	    .....................
> > > > 
> > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > page order to keep the same as it used in the user space, but
> > > > get_user_pages_fast() will mess it up.
> > > 
> > > I don't understand how get_user_pages_fast() can return the pages in a
> > > different order in the array from the order they appear in userspace.
> > > Can you explain?
> > Please see the code in gup.c:
> > 
> > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > 				unsigned int gup_flags, struct page **pages)
> > 	{
> > 		.......
> > 		if (gup_fast_permitted(start, nr_pages)) {
> > 			local_irq_disable();
> > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.
> > 			local_irq_enable();
> > 			ret = nr;
> > 		}
> > 		.......
> > 		if (nr < nr_pages) {
> > 			/* Try to get the remaining pages with get_user_pages */
> > 			start += nr << PAGE_SHIFT;
> > 			pages += nr;                                                  // The @pages is moved forward.
> > 
> > 			if (gup_flags & FOLL_LONGTERM) {
> > 				down_read(&current->mm->mmap_sem);
> > 				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time
> >
> 
> Neither this nor the get_user_pages_unlocked is filling the pages a second
The get_user_pages_unlocked() will call the handle_mm_fault which will allocate a
new page for the empty PTE, and save the new page into the @pages array.


> time.  It is adding to the page array having moved start and the page array
> forward.

Yes. This will mess up the page order.

I will read the code again to check if I am wrong :)

> 
> Are you doing a FOLL_LONGTERM GUP?  Or are you in the else clause below when
> you get this bug?
I do not use FOLL_LONGTERM, I just use the FOLL_WRITE.

So it seems it runs into the else clause below.

Thanks
Huang Shijie

> 
> Ira
> 
> > 							    start, nr_pages - nr,
> > 							    pages, NULL, gup_flags);
> > 				up_read(&current->mm->mmap_sem);
> > 			} else {
> > 				/*
> > 				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > 				 * possible
> > 				 */
> > 				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
> > 							      pages, gup_flags);
> > 			}
> > 		}
> > 
> > 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-09 14:55           ` Weiny, Ira
@ 2019-04-10  1:20             ` Huang Shijie
  2019-04-10 18:04               ` Ira Weiny
  0 siblings, 1 reply; 13+ messages in thread
From: Huang Shijie @ 2019-04-10  1:20 UTC (permalink / raw)
  To: Weiny, Ira
  Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Tue, Apr 09, 2019 at 02:55:31PM +0000, Weiny, Ira wrote:
> > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > > > The root cause is that sg_alloc_table_from_pages() requires
> > > > > > > the page order to keep the same as it used in the user space,
> > > > > > > but
> > > > > > > get_user_pages_fast() will mess it up.
> > > > > >
> > > > > > I don't understand how get_user_pages_fast() can return the
> > > > > > pages in a different order in the array from the order they appear in
> > userspace.
> > > > > > Can you explain?
> > > > > Please see the code in gup.c:
> > > > >
> > > > > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > > 				unsigned int gup_flags, struct page **pages)
> > > > > 	{
> > > > > 		.......
> > > > > 		if (gup_fast_permitted(start, nr_pages)) {
> > > > > 			local_irq_disable();
> > > > > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);
> > // The @pages array maybe filled at the first time.
> > > >
> > > > Right ... but if it's not filled entirely, it will be filled
> > > > part-way, and then we stop.
> > > >
> > > > > 			local_irq_enable();
> > > > > 			ret = nr;
> > > > > 		}
> > > > > 		.......
> > > > > 		if (nr < nr_pages) {
> > > > > 			/* Try to get the remaining pages with
> > get_user_pages */
> > > > > 			start += nr << PAGE_SHIFT;
> > > > > 			pages += nr;                                                  // The
> > @pages is moved forward.
> > > >
> > > > Yes, to the point where gup_pgd_range() stopped.
> > > >
> > > > > 			if (gup_flags & FOLL_LONGTERM) {
> > > > > 				down_read(&current->mm->mmap_sem);
> > > > > 				ret = __gup_longterm_locked(current,
> > current->mm,      // The @pages maybe filled at the second time
> > > >
> > > > Right.
> > > >
> > > > > 				/*
> > > > > 				 * retain FAULT_FOLL_ALLOW_RETRY
> > optimization if
> > > > > 				 * possible
> > > > > 				 */
> > > > > 				ret = get_user_pages_unlocked(start,
> > nr_pages - nr,    // The @pages maybe filled at the second time.
> > > > > 							      pages, gup_flags);
> > > >
> > > > Yes.  But they'll be in the same order.
> > > >
> > > > > BTW, I do not know why we mess up the page order. It maybe used in
> > some special case.
> > > >
> > > > I'm not discounting the possibility that you've found a bug.
> > > > But documenting that a bug exists is not the solution; the solution
> > > > is fixing the bug.
> > > I do not think it is a bug :)
> > >
> > > If we use the get_user_pages_unlocked(), DMA is okay, such as:
> > >                      ....
> > > 		     get_user_pages_unlocked()
> > > 		     sg_alloc_table_from_pages()
> > > 	             .....
> > >
> > > I think the comment is not accurate enough. So just add more comments,
> > > and tell the driver users how to use the GUPs.
> > 
> > gup_fast() and gup_unlocked() should return the pages in the same order.
> > If they do not, then it is a bug.
> 
> Is there a reproducer for this?  Or do you have some debug output which shows this problem?
Is Matthew right?

 " gup_fast() and gup_unlocked() should return the pages in the same order.
 If they do not, then it is a bug."

If Matthew is right,
I need more time to debug the DMA issue...
	

Thanks
Huang Shijie
 

> 
> Ira
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-10  1:20             ` Huang Shijie
@ 2019-04-10 18:04               ` Ira Weiny
  0 siblings, 0 replies; 13+ messages in thread
From: Ira Weiny @ 2019-04-10 18:04 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Wed, Apr 10, 2019 at 09:20:36AM +0800, Huang Shijie wrote:
> On Tue, Apr 09, 2019 at 02:55:31PM +0000, Weiny, Ira wrote:
> > > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote:
> > > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote:
> > > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > > > > The root cause is that sg_alloc_table_from_pages() requires
> > > > > > > > the page order to keep the same as it used in the user space,
> > > > > > > > but
> > > > > > > > get_user_pages_fast() will mess it up.
> > > > > > >
> > > > > > > I don't understand how get_user_pages_fast() can return the
> > > > > > > pages in a different order in the array from the order they appear in
> > > userspace.
> > > > > > > Can you explain?
> > > > > > Please see the code in gup.c:
> > > > > >
> > > > > > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > > > 				unsigned int gup_flags, struct page **pages)
> > > > > > 	{
> > > > > > 		.......
> > > > > > 		if (gup_fast_permitted(start, nr_pages)) {
> > > > > > 			local_irq_disable();
> > > > > > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);
> > > // The @pages array maybe filled at the first time.
> > > > >
> > > > > Right ... but if it's not filled entirely, it will be filled
> > > > > part-way, and then we stop.
> > > > >
> > > > > > 			local_irq_enable();
> > > > > > 			ret = nr;
> > > > > > 		}
> > > > > > 		.......
> > > > > > 		if (nr < nr_pages) {
> > > > > > 			/* Try to get the remaining pages with
> > > get_user_pages */
> > > > > > 			start += nr << PAGE_SHIFT;
> > > > > > 			pages += nr;                                                  // The
> > > @pages is moved forward.
> > > > >
> > > > > Yes, to the point where gup_pgd_range() stopped.
> > > > >
> > > > > > 			if (gup_flags & FOLL_LONGTERM) {
> > > > > > 				down_read(&current->mm->mmap_sem);
> > > > > > 				ret = __gup_longterm_locked(current,
> > > current->mm,      // The @pages maybe filled at the second time
> > > > >
> > > > > Right.
> > > > >
> > > > > > 				/*
> > > > > > 				 * retain FAULT_FOLL_ALLOW_RETRY
> > > optimization if
> > > > > > 				 * possible
> > > > > > 				 */
> > > > > > 				ret = get_user_pages_unlocked(start,
> > > nr_pages - nr,    // The @pages maybe filled at the second time.
> > > > > > 							      pages, gup_flags);
> > > > >
> > > > > Yes.  But they'll be in the same order.
> > > > >
> > > > > > BTW, I do not know why we mess up the page order. It maybe used in
> > > some special case.
> > > > >
> > > > > I'm not discounting the possibility that you've found a bug.
> > > > > But documenting that a bug exists is not the solution; the solution
> > > > > is fixing the bug.
> > > > I do not think it is a bug :)
> > > >
> > > > If we use the get_user_pages_unlocked(), DMA is okay, such as:
> > > >                      ....
> > > > 		     get_user_pages_unlocked()
> > > > 		     sg_alloc_table_from_pages()
> > > > 	             .....
> > > >
> > > > I think the comment is not accurate enough. So just add more comments,
> > > > and tell the driver users how to use the GUPs.
> > > 
> > > gup_fast() and gup_unlocked() should return the pages in the same order.
> > > If they do not, then it is a bug.
> > 
> > Is there a reproducer for this?  Or do you have some debug output which shows this problem?
> Is Matthew right?
> 
>  " gup_fast() and gup_unlocked() should return the pages in the same order.
>  If they do not, then it is a bug."

Yes I think he is...

Ira

> 
> If Matthew is right,
> I need more time to debug the DMA issue...
> 	
> 
> Thanks
> Huang Shijie
>  
> 
> > 
> > Ira
> > 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments
  2019-04-10  1:18       ` Huang Shijie
@ 2019-04-10 18:08         ` Ira Weiny
  0 siblings, 0 replies; 13+ messages in thread
From: Ira Weiny @ 2019-04-10 18:08 UTC (permalink / raw)
  To: Huang Shijie
  Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook,
	linux-mm, linux-kernel

On Wed, Apr 10, 2019 at 09:18:50AM +0800, Huang Shijie wrote:
> On Tue, Apr 09, 2019 at 01:23:16PM -0700, Ira Weiny wrote:
> > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote:
> > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote:
> > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote:
> > > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own
> > > > > get_user_pages_fast().
> > > > > 
> > > > > In the following scenario, we will may meet the bug in the DMA case:
> > > > > 	    .....................
> > > > > 	    get_user_pages_fast(start,,, pages);
> > > > > 	        ......
> > > > > 	    sg_alloc_table_from_pages(, pages, ...);
> > > > > 	    .....................
> > > > > 
> > > > > The root cause is that sg_alloc_table_from_pages() requires the
> > > > > page order to keep the same as it used in the user space, but
> > > > > get_user_pages_fast() will mess it up.
> > > > 
> > > > I don't understand how get_user_pages_fast() can return the pages in a
> > > > different order in the array from the order they appear in userspace.
> > > > Can you explain?
> > > Please see the code in gup.c:
> > > 
> > > 	int get_user_pages_fast(unsigned long start, int nr_pages,
> > > 				unsigned int gup_flags, struct page **pages)
> > > 	{
> > > 		.......
> > > 		if (gup_fast_permitted(start, nr_pages)) {
> > > 			local_irq_disable();
> > > 			gup_pgd_range(addr, end, gup_flags, pages, &nr);               // The @pages array maybe filled at the first time.
> > > 			local_irq_enable();
> > > 			ret = nr;
> > > 		}
> > > 		.......
> > > 		if (nr < nr_pages) {
> > > 			/* Try to get the remaining pages with get_user_pages */
> > > 			start += nr << PAGE_SHIFT;
> > > 			pages += nr;                                                  // The @pages is moved forward.
> > > 
> > > 			if (gup_flags & FOLL_LONGTERM) {
> > > 				down_read(&current->mm->mmap_sem);
> > > 				ret = __gup_longterm_locked(current, current->mm,      // The @pages maybe filled at the second time
> > >
> > 
> > Neither this nor the get_user_pages_unlocked is filling the pages a second
> The get_user_pages_unlocked() will call the handle_mm_fault which will allocate a
> new page for the empty PTE, and save the new page into the @pages array.

But shouldn't this happen if get_user_pages_unlocked() is called directly?

> 
> 
> > time.  It is adding to the page array having moved start and the page array
> > forward.
> 
> Yes. This will mess up the page order.
> 
> I will read the code again to check if I am wrong :)
> 
> > 
> > Are you doing a FOLL_LONGTERM GUP?  Or are you in the else clause below when
> > you get this bug?
> I do not use FOLL_LONGTERM, I just use the FOLL_WRITE.
> 
> So it seems it runs into the else clause below.

Ok thanks,
Ira

> 
> Thanks
> Huang Shijie
> 
> > 
> > Ira
> > 
> > > 							    start, nr_pages - nr,
> > > 							    pages, NULL, gup_flags);
> > > 				up_read(&current->mm->mmap_sem);
> > > 			} else {
> > > 				/*
> > > 				 * retain FAULT_FOLL_ALLOW_RETRY optimization if
> > > 				 * possible
> > > 				 */
> > > 				ret = get_user_pages_unlocked(start, nr_pages - nr,    // The @pages maybe filled at the second time.
> > > 							      pages, gup_flags);
> > > 			}
> > > 		}
> > > 
> > > 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-04-10 18:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-08  2:37 [PATCH 1/2] mm/gup.c: fix the wrong comments Huang Shijie
2019-04-08  2:37 ` [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages Huang Shijie
2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox
2019-04-09  1:08   ` Huang Shijie
2019-04-09  2:49     ` Matthew Wilcox
2019-04-09  3:04       ` Huang Shijie
2019-04-09 11:19         ` Matthew Wilcox
2019-04-09 14:55           ` Weiny, Ira
2019-04-10  1:20             ` Huang Shijie
2019-04-10 18:04               ` Ira Weiny
2019-04-09 20:23     ` Ira Weiny
2019-04-10  1:18       ` Huang Shijie
2019-04-10 18:08         ` Ira Weiny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).