* [PATCH 1/2] mm/gup.c: fix the wrong comments @ 2019-04-08 2:37 Huang Shijie 2019-04-08 2:37 ` [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages Huang Shijie 2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox 0 siblings, 2 replies; 13+ messages in thread From: Huang Shijie @ 2019-04-08 2:37 UTC (permalink / raw) To: akpm Cc: william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel, Huang Shijie When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own get_user_pages_fast(). In the following scenario, we will may meet the bug in the DMA case: ..................... get_user_pages_fast(start,,, pages); ...... sg_alloc_table_from_pages(, pages, ...); ..................... The root cause is that sg_alloc_table_from_pages() requires the page order to keep the same as it used in the user space, but get_user_pages_fast() will mess it up. So change the comments, and make it more clear for the driver users. Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai> --- mm/gup.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 22acdd0f79ff..fb11ff90ba3b 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1129,10 +1129,6 @@ EXPORT_SYMBOL(get_user_pages_locked); * with: * * get_user_pages_unlocked(tsk, mm, ..., pages); - * - * It is functionally equivalent to get_user_pages_fast so - * get_user_pages_fast should be used instead if specific gup_flags - * (e.g. FOLL_FORCE) are not required. */ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages, struct page **pages, unsigned int gup_flags) @@ -2147,6 +2143,10 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, * If not successful, it will fall back to taking the lock and * calling get_user_pages(). * + * Note this routine may fill the pages array with entries in a + * different order than get_user_pages_unlocked(), which may cause + * issues for callers expecting the routines to be equivalent. + * * Returns number of pages pinned. This may be fewer than the number * requested. If nr_pages is 0 or negative, returns 0. If no pages * were pinned, returns -errno. -- 2.17.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages 2019-04-08 2:37 [PATCH 1/2] mm/gup.c: fix the wrong comments Huang Shijie @ 2019-04-08 2:37 ` Huang Shijie 2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox 1 sibling, 0 replies; 13+ messages in thread From: Huang Shijie @ 2019-04-08 2:37 UTC (permalink / raw) To: akpm Cc: william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel, Huang Shijie The get_user_pages_fast() may mess up the page order in @pages array, We will get the wrong DMA results in this case. Add more commit to clarify it. Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai> --- lib/scatterlist.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/scatterlist.c b/lib/scatterlist.c index 739dc9fe2c55..c170afb1a25e 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -450,6 +450,9 @@ EXPORT_SYMBOL(__sg_alloc_table_from_pages); * specified by the page array. The returned sg table is released by * sg_free_table. * + * Note: Do not use get_user_pages_fast() to pin the pages for @pages array, + * it may mess up the page order, and we will get the wrong DMA results. + * Returns: * 0 on success, negative error on failure */ -- 2.17.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-08 2:37 [PATCH 1/2] mm/gup.c: fix the wrong comments Huang Shijie 2019-04-08 2:37 ` [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages Huang Shijie @ 2019-04-08 14:13 ` Matthew Wilcox 2019-04-09 1:08 ` Huang Shijie 1 sibling, 1 reply; 13+ messages in thread From: Matthew Wilcox @ 2019-04-08 14:13 UTC (permalink / raw) To: Huang Shijie Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own > get_user_pages_fast(). > > In the following scenario, we will may meet the bug in the DMA case: > ..................... > get_user_pages_fast(start,,, pages); > ...... > sg_alloc_table_from_pages(, pages, ...); > ..................... > > The root cause is that sg_alloc_table_from_pages() requires the > page order to keep the same as it used in the user space, but > get_user_pages_fast() will mess it up. I don't understand how get_user_pages_fast() can return the pages in a different order in the array from the order they appear in userspace. Can you explain? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox @ 2019-04-09 1:08 ` Huang Shijie 2019-04-09 2:49 ` Matthew Wilcox 2019-04-09 20:23 ` Ira Weiny 0 siblings, 2 replies; 13+ messages in thread From: Huang Shijie @ 2019-04-09 1:08 UTC (permalink / raw) To: Matthew Wilcox Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own > > get_user_pages_fast(). > > > > In the following scenario, we will may meet the bug in the DMA case: > > ..................... > > get_user_pages_fast(start,,, pages); > > ...... > > sg_alloc_table_from_pages(, pages, ...); > > ..................... > > > > The root cause is that sg_alloc_table_from_pages() requires the > > page order to keep the same as it used in the user space, but > > get_user_pages_fast() will mess it up. > > I don't understand how get_user_pages_fast() can return the pages in a > different order in the array from the order they appear in userspace. > Can you explain? Please see the code in gup.c: int get_user_pages_fast(unsigned long start, int nr_pages, unsigned int gup_flags, struct page **pages) { ....... if (gup_fast_permitted(start, nr_pages)) { local_irq_disable(); gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. local_irq_enable(); ret = nr; } ....... if (nr < nr_pages) { /* Try to get the remaining pages with get_user_pages */ start += nr << PAGE_SHIFT; pages += nr; // The @pages is moved forward. if (gup_flags & FOLL_LONGTERM) { down_read(¤t->mm->mmap_sem); ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time start, nr_pages - nr, pages, NULL, gup_flags); up_read(¤t->mm->mmap_sem); } else { /* * retain FAULT_FOLL_ALLOW_RETRY optimization if * possible */ ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. pages, gup_flags); } } ..................... BTW, I do not know why we mess up the page order. It maybe used in some special case. Thanks Huang Shijie ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 1:08 ` Huang Shijie @ 2019-04-09 2:49 ` Matthew Wilcox 2019-04-09 3:04 ` Huang Shijie 2019-04-09 20:23 ` Ira Weiny 1 sibling, 1 reply; 13+ messages in thread From: Matthew Wilcox @ 2019-04-09 2:49 UTC (permalink / raw) To: Huang Shijie Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > The root cause is that sg_alloc_table_from_pages() requires the > > > page order to keep the same as it used in the user space, but > > > get_user_pages_fast() will mess it up. > > > > I don't understand how get_user_pages_fast() can return the pages in a > > different order in the array from the order they appear in userspace. > > Can you explain? > Please see the code in gup.c: > > int get_user_pages_fast(unsigned long start, int nr_pages, > unsigned int gup_flags, struct page **pages) > { > ....... > if (gup_fast_permitted(start, nr_pages)) { > local_irq_disable(); > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. Right ... but if it's not filled entirely, it will be filled part-way, and then we stop. > local_irq_enable(); > ret = nr; > } > ....... > if (nr < nr_pages) { > /* Try to get the remaining pages with get_user_pages */ > start += nr << PAGE_SHIFT; > pages += nr; // The @pages is moved forward. Yes, to the point where gup_pgd_range() stopped. > if (gup_flags & FOLL_LONGTERM) { > down_read(¤t->mm->mmap_sem); > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time Right. > /* > * retain FAULT_FOLL_ALLOW_RETRY optimization if > * possible > */ > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. > pages, gup_flags); Yes. But they'll be in the same order. > BTW, I do not know why we mess up the page order. It maybe used in some special case. I'm not discounting the possibility that you've found a bug. But documenting that a bug exists is not the solution; the solution is fixing the bug. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 2:49 ` Matthew Wilcox @ 2019-04-09 3:04 ` Huang Shijie 2019-04-09 11:19 ` Matthew Wilcox 0 siblings, 1 reply; 13+ messages in thread From: Huang Shijie @ 2019-04-09 3:04 UTC (permalink / raw) To: Matthew Wilcox Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote: > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > The root cause is that sg_alloc_table_from_pages() requires the > > > > page order to keep the same as it used in the user space, but > > > > get_user_pages_fast() will mess it up. > > > > > > I don't understand how get_user_pages_fast() can return the pages in a > > > different order in the array from the order they appear in userspace. > > > Can you explain? > > Please see the code in gup.c: > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > unsigned int gup_flags, struct page **pages) > > { > > ....... > > if (gup_fast_permitted(start, nr_pages)) { > > local_irq_disable(); > > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. > > Right ... but if it's not filled entirely, it will be filled part-way, > and then we stop. > > > local_irq_enable(); > > ret = nr; > > } > > ....... > > if (nr < nr_pages) { > > /* Try to get the remaining pages with get_user_pages */ > > start += nr << PAGE_SHIFT; > > pages += nr; // The @pages is moved forward. > > Yes, to the point where gup_pgd_range() stopped. > > > if (gup_flags & FOLL_LONGTERM) { > > down_read(¤t->mm->mmap_sem); > > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time > > Right. > > > /* > > * retain FAULT_FOLL_ALLOW_RETRY optimization if > > * possible > > */ > > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. > > pages, gup_flags); > > Yes. But they'll be in the same order. > > > BTW, I do not know why we mess up the page order. It maybe used in some special case. > > I'm not discounting the possibility that you've found a bug. > But documenting that a bug exists is not the solution; the solution is > fixing the bug. I do not think it is a bug :) If we use the get_user_pages_unlocked(), DMA is okay, such as: .... get_user_pages_unlocked() sg_alloc_table_from_pages() ..... I think the comment is not accurate enough. So just add more comments, and tell the driver users how to use the GUPs. Thanks Huang Shijie ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 3:04 ` Huang Shijie @ 2019-04-09 11:19 ` Matthew Wilcox 2019-04-09 14:55 ` Weiny, Ira 0 siblings, 1 reply; 13+ messages in thread From: Matthew Wilcox @ 2019-04-09 11:19 UTC (permalink / raw) To: Huang Shijie Cc: akpm, william.kucharski, ira.weiny, palmer, axboe, keescook, linux-mm, linux-kernel On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote: > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote: > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > > The root cause is that sg_alloc_table_from_pages() requires the > > > > > page order to keep the same as it used in the user space, but > > > > > get_user_pages_fast() will mess it up. > > > > > > > > I don't understand how get_user_pages_fast() can return the pages in a > > > > different order in the array from the order they appear in userspace. > > > > Can you explain? > > > Please see the code in gup.c: > > > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > > unsigned int gup_flags, struct page **pages) > > > { > > > ....... > > > if (gup_fast_permitted(start, nr_pages)) { > > > local_irq_disable(); > > > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. > > > > Right ... but if it's not filled entirely, it will be filled part-way, > > and then we stop. > > > > > local_irq_enable(); > > > ret = nr; > > > } > > > ....... > > > if (nr < nr_pages) { > > > /* Try to get the remaining pages with get_user_pages */ > > > start += nr << PAGE_SHIFT; > > > pages += nr; // The @pages is moved forward. > > > > Yes, to the point where gup_pgd_range() stopped. > > > > > if (gup_flags & FOLL_LONGTERM) { > > > down_read(¤t->mm->mmap_sem); > > > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time > > > > Right. > > > > > /* > > > * retain FAULT_FOLL_ALLOW_RETRY optimization if > > > * possible > > > */ > > > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. > > > pages, gup_flags); > > > > Yes. But they'll be in the same order. > > > > > BTW, I do not know why we mess up the page order. It maybe used in some special case. > > > > I'm not discounting the possibility that you've found a bug. > > But documenting that a bug exists is not the solution; the solution is > > fixing the bug. > I do not think it is a bug :) > > If we use the get_user_pages_unlocked(), DMA is okay, such as: > .... > get_user_pages_unlocked() > sg_alloc_table_from_pages() > ..... > > I think the comment is not accurate enough. So just add more comments, and tell the driver > users how to use the GUPs. gup_fast() and gup_unlocked() should return the pages in the same order. If they do not, then it is a bug. ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 11:19 ` Matthew Wilcox @ 2019-04-09 14:55 ` Weiny, Ira 2019-04-10 1:20 ` Huang Shijie 0 siblings, 1 reply; 13+ messages in thread From: Weiny, Ira @ 2019-04-09 14:55 UTC (permalink / raw) To: Matthew Wilcox, Huang Shijie Cc: akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote: > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote: > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > > > The root cause is that sg_alloc_table_from_pages() requires > > > > > > the page order to keep the same as it used in the user space, > > > > > > but > > > > > > get_user_pages_fast() will mess it up. > > > > > > > > > > I don't understand how get_user_pages_fast() can return the > > > > > pages in a different order in the array from the order they appear in > userspace. > > > > > Can you explain? > > > > Please see the code in gup.c: > > > > > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > > > unsigned int gup_flags, struct page **pages) > > > > { > > > > ....... > > > > if (gup_fast_permitted(start, nr_pages)) { > > > > local_irq_disable(); > > > > gup_pgd_range(addr, end, gup_flags, pages, &nr); > // The @pages array maybe filled at the first time. > > > > > > Right ... but if it's not filled entirely, it will be filled > > > part-way, and then we stop. > > > > > > > local_irq_enable(); > > > > ret = nr; > > > > } > > > > ....... > > > > if (nr < nr_pages) { > > > > /* Try to get the remaining pages with > get_user_pages */ > > > > start += nr << PAGE_SHIFT; > > > > pages += nr; // The > @pages is moved forward. > > > > > > Yes, to the point where gup_pgd_range() stopped. > > > > > > > if (gup_flags & FOLL_LONGTERM) { > > > > down_read(¤t->mm->mmap_sem); > > > > ret = __gup_longterm_locked(current, > current->mm, // The @pages maybe filled at the second time > > > > > > Right. > > > > > > > /* > > > > * retain FAULT_FOLL_ALLOW_RETRY > optimization if > > > > * possible > > > > */ > > > > ret = get_user_pages_unlocked(start, > nr_pages - nr, // The @pages maybe filled at the second time. > > > > pages, gup_flags); > > > > > > Yes. But they'll be in the same order. > > > > > > > BTW, I do not know why we mess up the page order. It maybe used in > some special case. > > > > > > I'm not discounting the possibility that you've found a bug. > > > But documenting that a bug exists is not the solution; the solution > > > is fixing the bug. > > I do not think it is a bug :) > > > > If we use the get_user_pages_unlocked(), DMA is okay, such as: > > .... > > get_user_pages_unlocked() > > sg_alloc_table_from_pages() > > ..... > > > > I think the comment is not accurate enough. So just add more comments, > > and tell the driver users how to use the GUPs. > > gup_fast() and gup_unlocked() should return the pages in the same order. > If they do not, then it is a bug. Is there a reproducer for this? Or do you have some debug output which shows this problem? Ira ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 14:55 ` Weiny, Ira @ 2019-04-10 1:20 ` Huang Shijie 2019-04-10 18:04 ` Ira Weiny 0 siblings, 1 reply; 13+ messages in thread From: Huang Shijie @ 2019-04-10 1:20 UTC (permalink / raw) To: Weiny, Ira Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel On Tue, Apr 09, 2019 at 02:55:31PM +0000, Weiny, Ira wrote: > > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote: > > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote: > > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > > > > The root cause is that sg_alloc_table_from_pages() requires > > > > > > > the page order to keep the same as it used in the user space, > > > > > > > but > > > > > > > get_user_pages_fast() will mess it up. > > > > > > > > > > > > I don't understand how get_user_pages_fast() can return the > > > > > > pages in a different order in the array from the order they appear in > > userspace. > > > > > > Can you explain? > > > > > Please see the code in gup.c: > > > > > > > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > > > > unsigned int gup_flags, struct page **pages) > > > > > { > > > > > ....... > > > > > if (gup_fast_permitted(start, nr_pages)) { > > > > > local_irq_disable(); > > > > > gup_pgd_range(addr, end, gup_flags, pages, &nr); > > // The @pages array maybe filled at the first time. > > > > > > > > Right ... but if it's not filled entirely, it will be filled > > > > part-way, and then we stop. > > > > > > > > > local_irq_enable(); > > > > > ret = nr; > > > > > } > > > > > ....... > > > > > if (nr < nr_pages) { > > > > > /* Try to get the remaining pages with > > get_user_pages */ > > > > > start += nr << PAGE_SHIFT; > > > > > pages += nr; // The > > @pages is moved forward. > > > > > > > > Yes, to the point where gup_pgd_range() stopped. > > > > > > > > > if (gup_flags & FOLL_LONGTERM) { > > > > > down_read(¤t->mm->mmap_sem); > > > > > ret = __gup_longterm_locked(current, > > current->mm, // The @pages maybe filled at the second time > > > > > > > > Right. > > > > > > > > > /* > > > > > * retain FAULT_FOLL_ALLOW_RETRY > > optimization if > > > > > * possible > > > > > */ > > > > > ret = get_user_pages_unlocked(start, > > nr_pages - nr, // The @pages maybe filled at the second time. > > > > > pages, gup_flags); > > > > > > > > Yes. But they'll be in the same order. > > > > > > > > > BTW, I do not know why we mess up the page order. It maybe used in > > some special case. > > > > > > > > I'm not discounting the possibility that you've found a bug. > > > > But documenting that a bug exists is not the solution; the solution > > > > is fixing the bug. > > > I do not think it is a bug :) > > > > > > If we use the get_user_pages_unlocked(), DMA is okay, such as: > > > .... > > > get_user_pages_unlocked() > > > sg_alloc_table_from_pages() > > > ..... > > > > > > I think the comment is not accurate enough. So just add more comments, > > > and tell the driver users how to use the GUPs. > > > > gup_fast() and gup_unlocked() should return the pages in the same order. > > If they do not, then it is a bug. > > Is there a reproducer for this? Or do you have some debug output which shows this problem? Is Matthew right? " gup_fast() and gup_unlocked() should return the pages in the same order. If they do not, then it is a bug." If Matthew is right, I need more time to debug the DMA issue... Thanks Huang Shijie > > Ira > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-10 1:20 ` Huang Shijie @ 2019-04-10 18:04 ` Ira Weiny 0 siblings, 0 replies; 13+ messages in thread From: Ira Weiny @ 2019-04-10 18:04 UTC (permalink / raw) To: Huang Shijie Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel On Wed, Apr 10, 2019 at 09:20:36AM +0800, Huang Shijie wrote: > On Tue, Apr 09, 2019 at 02:55:31PM +0000, Weiny, Ira wrote: > > > On Tue, Apr 09, 2019 at 11:04:18AM +0800, Huang Shijie wrote: > > > > On Mon, Apr 08, 2019 at 07:49:29PM -0700, Matthew Wilcox wrote: > > > > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > > > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > > > > > The root cause is that sg_alloc_table_from_pages() requires > > > > > > > > the page order to keep the same as it used in the user space, > > > > > > > > but > > > > > > > > get_user_pages_fast() will mess it up. > > > > > > > > > > > > > > I don't understand how get_user_pages_fast() can return the > > > > > > > pages in a different order in the array from the order they appear in > > > userspace. > > > > > > > Can you explain? > > > > > > Please see the code in gup.c: > > > > > > > > > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > > > > > unsigned int gup_flags, struct page **pages) > > > > > > { > > > > > > ....... > > > > > > if (gup_fast_permitted(start, nr_pages)) { > > > > > > local_irq_disable(); > > > > > > gup_pgd_range(addr, end, gup_flags, pages, &nr); > > > // The @pages array maybe filled at the first time. > > > > > > > > > > Right ... but if it's not filled entirely, it will be filled > > > > > part-way, and then we stop. > > > > > > > > > > > local_irq_enable(); > > > > > > ret = nr; > > > > > > } > > > > > > ....... > > > > > > if (nr < nr_pages) { > > > > > > /* Try to get the remaining pages with > > > get_user_pages */ > > > > > > start += nr << PAGE_SHIFT; > > > > > > pages += nr; // The > > > @pages is moved forward. > > > > > > > > > > Yes, to the point where gup_pgd_range() stopped. > > > > > > > > > > > if (gup_flags & FOLL_LONGTERM) { > > > > > > down_read(¤t->mm->mmap_sem); > > > > > > ret = __gup_longterm_locked(current, > > > current->mm, // The @pages maybe filled at the second time > > > > > > > > > > Right. > > > > > > > > > > > /* > > > > > > * retain FAULT_FOLL_ALLOW_RETRY > > > optimization if > > > > > > * possible > > > > > > */ > > > > > > ret = get_user_pages_unlocked(start, > > > nr_pages - nr, // The @pages maybe filled at the second time. > > > > > > pages, gup_flags); > > > > > > > > > > Yes. But they'll be in the same order. > > > > > > > > > > > BTW, I do not know why we mess up the page order. It maybe used in > > > some special case. > > > > > > > > > > I'm not discounting the possibility that you've found a bug. > > > > > But documenting that a bug exists is not the solution; the solution > > > > > is fixing the bug. > > > > I do not think it is a bug :) > > > > > > > > If we use the get_user_pages_unlocked(), DMA is okay, such as: > > > > .... > > > > get_user_pages_unlocked() > > > > sg_alloc_table_from_pages() > > > > ..... > > > > > > > > I think the comment is not accurate enough. So just add more comments, > > > > and tell the driver users how to use the GUPs. > > > > > > gup_fast() and gup_unlocked() should return the pages in the same order. > > > If they do not, then it is a bug. > > > > Is there a reproducer for this? Or do you have some debug output which shows this problem? > Is Matthew right? > > " gup_fast() and gup_unlocked() should return the pages in the same order. > If they do not, then it is a bug." Yes I think he is... Ira > > If Matthew is right, > I need more time to debug the DMA issue... > > > Thanks > Huang Shijie > > > > > > Ira > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 1:08 ` Huang Shijie 2019-04-09 2:49 ` Matthew Wilcox @ 2019-04-09 20:23 ` Ira Weiny 2019-04-10 1:18 ` Huang Shijie 1 sibling, 1 reply; 13+ messages in thread From: Ira Weiny @ 2019-04-09 20:23 UTC (permalink / raw) To: Huang Shijie Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own > > > get_user_pages_fast(). > > > > > > In the following scenario, we will may meet the bug in the DMA case: > > > ..................... > > > get_user_pages_fast(start,,, pages); > > > ...... > > > sg_alloc_table_from_pages(, pages, ...); > > > ..................... > > > > > > The root cause is that sg_alloc_table_from_pages() requires the > > > page order to keep the same as it used in the user space, but > > > get_user_pages_fast() will mess it up. > > > > I don't understand how get_user_pages_fast() can return the pages in a > > different order in the array from the order they appear in userspace. > > Can you explain? > Please see the code in gup.c: > > int get_user_pages_fast(unsigned long start, int nr_pages, > unsigned int gup_flags, struct page **pages) > { > ....... > if (gup_fast_permitted(start, nr_pages)) { > local_irq_disable(); > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. > local_irq_enable(); > ret = nr; > } > ....... > if (nr < nr_pages) { > /* Try to get the remaining pages with get_user_pages */ > start += nr << PAGE_SHIFT; > pages += nr; // The @pages is moved forward. > > if (gup_flags & FOLL_LONGTERM) { > down_read(¤t->mm->mmap_sem); > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time > Neither this nor the get_user_pages_unlocked is filling the pages a second time. It is adding to the page array having moved start and the page array forward. Are you doing a FOLL_LONGTERM GUP? Or are you in the else clause below when you get this bug? Ira > start, nr_pages - nr, > pages, NULL, gup_flags); > up_read(¤t->mm->mmap_sem); > } else { > /* > * retain FAULT_FOLL_ALLOW_RETRY optimization if > * possible > */ > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. > pages, gup_flags); > } > } > > > ..................... > > BTW, I do not know why we mess up the page order. It maybe used in some special case. > > Thanks > Huang Shijie > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-09 20:23 ` Ira Weiny @ 2019-04-10 1:18 ` Huang Shijie 2019-04-10 18:08 ` Ira Weiny 0 siblings, 1 reply; 13+ messages in thread From: Huang Shijie @ 2019-04-10 1:18 UTC (permalink / raw) To: Ira Weiny Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel On Tue, Apr 09, 2019 at 01:23:16PM -0700, Ira Weiny wrote: > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own > > > > get_user_pages_fast(). > > > > > > > > In the following scenario, we will may meet the bug in the DMA case: > > > > ..................... > > > > get_user_pages_fast(start,,, pages); > > > > ...... > > > > sg_alloc_table_from_pages(, pages, ...); > > > > ..................... > > > > > > > > The root cause is that sg_alloc_table_from_pages() requires the > > > > page order to keep the same as it used in the user space, but > > > > get_user_pages_fast() will mess it up. > > > > > > I don't understand how get_user_pages_fast() can return the pages in a > > > different order in the array from the order they appear in userspace. > > > Can you explain? > > Please see the code in gup.c: > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > unsigned int gup_flags, struct page **pages) > > { > > ....... > > if (gup_fast_permitted(start, nr_pages)) { > > local_irq_disable(); > > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. > > local_irq_enable(); > > ret = nr; > > } > > ....... > > if (nr < nr_pages) { > > /* Try to get the remaining pages with get_user_pages */ > > start += nr << PAGE_SHIFT; > > pages += nr; // The @pages is moved forward. > > > > if (gup_flags & FOLL_LONGTERM) { > > down_read(¤t->mm->mmap_sem); > > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time > > > > Neither this nor the get_user_pages_unlocked is filling the pages a second The get_user_pages_unlocked() will call the handle_mm_fault which will allocate a new page for the empty PTE, and save the new page into the @pages array. > time. It is adding to the page array having moved start and the page array > forward. Yes. This will mess up the page order. I will read the code again to check if I am wrong :) > > Are you doing a FOLL_LONGTERM GUP? Or are you in the else clause below when > you get this bug? I do not use FOLL_LONGTERM, I just use the FOLL_WRITE. So it seems it runs into the else clause below. Thanks Huang Shijie > > Ira > > > start, nr_pages - nr, > > pages, NULL, gup_flags); > > up_read(¤t->mm->mmap_sem); > > } else { > > /* > > * retain FAULT_FOLL_ALLOW_RETRY optimization if > > * possible > > */ > > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. > > pages, gup_flags); > > } > > } > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] mm/gup.c: fix the wrong comments 2019-04-10 1:18 ` Huang Shijie @ 2019-04-10 18:08 ` Ira Weiny 0 siblings, 0 replies; 13+ messages in thread From: Ira Weiny @ 2019-04-10 18:08 UTC (permalink / raw) To: Huang Shijie Cc: Matthew Wilcox, akpm, william.kucharski, palmer, axboe, keescook, linux-mm, linux-kernel On Wed, Apr 10, 2019 at 09:18:50AM +0800, Huang Shijie wrote: > On Tue, Apr 09, 2019 at 01:23:16PM -0700, Ira Weiny wrote: > > On Tue, Apr 09, 2019 at 09:08:33AM +0800, Huang Shijie wrote: > > > On Mon, Apr 08, 2019 at 07:13:13AM -0700, Matthew Wilcox wrote: > > > > On Mon, Apr 08, 2019 at 10:37:45AM +0800, Huang Shijie wrote: > > > > > When CONFIG_HAVE_GENERIC_GUP is defined, the kernel will use its own > > > > > get_user_pages_fast(). > > > > > > > > > > In the following scenario, we will may meet the bug in the DMA case: > > > > > ..................... > > > > > get_user_pages_fast(start,,, pages); > > > > > ...... > > > > > sg_alloc_table_from_pages(, pages, ...); > > > > > ..................... > > > > > > > > > > The root cause is that sg_alloc_table_from_pages() requires the > > > > > page order to keep the same as it used in the user space, but > > > > > get_user_pages_fast() will mess it up. > > > > > > > > I don't understand how get_user_pages_fast() can return the pages in a > > > > different order in the array from the order they appear in userspace. > > > > Can you explain? > > > Please see the code in gup.c: > > > > > > int get_user_pages_fast(unsigned long start, int nr_pages, > > > unsigned int gup_flags, struct page **pages) > > > { > > > ....... > > > if (gup_fast_permitted(start, nr_pages)) { > > > local_irq_disable(); > > > gup_pgd_range(addr, end, gup_flags, pages, &nr); // The @pages array maybe filled at the first time. > > > local_irq_enable(); > > > ret = nr; > > > } > > > ....... > > > if (nr < nr_pages) { > > > /* Try to get the remaining pages with get_user_pages */ > > > start += nr << PAGE_SHIFT; > > > pages += nr; // The @pages is moved forward. > > > > > > if (gup_flags & FOLL_LONGTERM) { > > > down_read(¤t->mm->mmap_sem); > > > ret = __gup_longterm_locked(current, current->mm, // The @pages maybe filled at the second time > > > > > > > Neither this nor the get_user_pages_unlocked is filling the pages a second > The get_user_pages_unlocked() will call the handle_mm_fault which will allocate a > new page for the empty PTE, and save the new page into the @pages array. But shouldn't this happen if get_user_pages_unlocked() is called directly? > > > > time. It is adding to the page array having moved start and the page array > > forward. > > Yes. This will mess up the page order. > > I will read the code again to check if I am wrong :) > > > > > Are you doing a FOLL_LONGTERM GUP? Or are you in the else clause below when > > you get this bug? > I do not use FOLL_LONGTERM, I just use the FOLL_WRITE. > > So it seems it runs into the else clause below. Ok thanks, Ira > > Thanks > Huang Shijie > > > > > Ira > > > > > start, nr_pages - nr, > > > pages, NULL, gup_flags); > > > up_read(¤t->mm->mmap_sem); > > > } else { > > > /* > > > * retain FAULT_FOLL_ALLOW_RETRY optimization if > > > * possible > > > */ > > > ret = get_user_pages_unlocked(start, nr_pages - nr, // The @pages maybe filled at the second time. > > > pages, gup_flags); > > > } > > > } > > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-04-10 18:09 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-04-08 2:37 [PATCH 1/2] mm/gup.c: fix the wrong comments Huang Shijie 2019-04-08 2:37 ` [PATCH 2/2] lib/scatterlist.c: add more commit for sg_alloc_table_from_pages Huang Shijie 2019-04-08 14:13 ` [PATCH 1/2] mm/gup.c: fix the wrong comments Matthew Wilcox 2019-04-09 1:08 ` Huang Shijie 2019-04-09 2:49 ` Matthew Wilcox 2019-04-09 3:04 ` Huang Shijie 2019-04-09 11:19 ` Matthew Wilcox 2019-04-09 14:55 ` Weiny, Ira 2019-04-10 1:20 ` Huang Shijie 2019-04-10 18:04 ` Ira Weiny 2019-04-09 20:23 ` Ira Weiny 2019-04-10 1:18 ` Huang Shijie 2019-04-10 18:08 ` Ira Weiny
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).