linux-kernel.vger.kernel.org archive mirror
* [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
@ 2023-01-05 11:23 Yishai Hadas
  2023-01-05 13:36 ` Jason Gunthorpe
  2023-01-05 20:06 ` Jason Gunthorpe
  0 siblings, 2 replies; 8+ messages in thread
From: Yishai Hadas @ 2023-01-05 11:23 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-mm, axboe, logang
  Cc: hch, alex.williamson, jgg, yishaih, leonro, maorg

When sg_alloc_append_table_from_pages() calls pages_are_mergeable()
in its 'sgt_append->prv' flow to check whether it can merge contiguous
pages into the last SG, it passes the page arguments in the wrong order.

The first parameter should be the next candidate page to be merged to
the last page and not the opposite.

The current code leads to a corrupted SG, which results in oopses and
unexpected errors when non-contiguous pages are wrongly merged.

Fix to pass the page parameters in the right order.

Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 lib/scatterlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index a0ad2a7959b5..f72aa50c6654 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -476,7 +476,7 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
 		/* Merge contiguous pages into the last SG */
 		prv_len = sgt_append->prv->length;
 		last_pg = sg_page(sgt_append->prv);
-		while (n_pages && pages_are_mergeable(last_pg, pages[0])) {
+		while (n_pages && pages_are_mergeable(pages[0], last_pg)) {
 			if (sgt_append->prv->length + PAGE_SIZE > max_segment)
 				break;
 			sgt_append->prv->length += PAGE_SIZE;
-- 
2.18.1
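
To illustrate why the argument order matters, here is a minimal userspace
sketch. It assumes the helper treats its first argument as the page that
must immediately follow the second, as the commit message describes.
struct model_page and model_pages_are_mergeable() are hypothetical
stand-ins for illustration, not the kernel's definitions, and the pgmap
check is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the page frame number. */
struct model_page {
	unsigned long pfn;
};

/*
 * Order-sensitive check (assumed semantics): true only when 'a' is the
 * page frame immediately after 'b'. Swapping the arguments asks the
 * opposite question.
 */
static bool model_pages_are_mergeable(const struct model_page *a,
				      const struct model_page *b)
{
	return a->pfn == b->pfn + 1;
}
```

With last at pfn 100, the fixed call order (candidate, last) accepts a
candidate at pfn 101. The swapped order (last, candidate) rejects it and
instead accepts a candidate at pfn 99, which is not a continuation of
the last SG.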



* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 11:23 [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly Yishai Hadas
@ 2023-01-05 13:36 ` Jason Gunthorpe
  2023-01-05 16:48   ` Yishai Hadas
  2023-01-05 20:06 ` Jason Gunthorpe
  1 sibling, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2023-01-05 13:36 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: linux-kernel, linux-block, linux-mm, axboe, logang, hch,
	alex.williamson, leonro, maorg

On Thu, Jan 05, 2023 at 01:23:39PM +0200, Yishai Hadas wrote:
> When sg_alloc_append_table_from_pages() calls pages_are_mergeable()
> in its 'sgt_append->prv' flow to check whether it can merge contiguous
> pages into the last SG, it passes the page arguments in the wrong order.
> 
> The first parameter should be the next candidate page to be merged to
> the last page and not the opposite.
> 
> The current code leads to a corrupted SG, which results in oopses and
> unexpected errors when non-contiguous pages are wrongly merged.
> 
> Fix to pass the page parameters in the right order.
> 
> Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages")
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  lib/scatterlist.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Also, I'm looking more closely at 1567b49d1a40 and this is not right either:

-               unsigned long paddr =
-                       (page_to_pfn(sg_page(sgt_append->prv)) * PAGE_SIZE +
-                        sgt_append->prv->offset + sgt_append->prv->length) /
-                       PAGE_SIZE;
-
-               while (n_pages && page_to_pfn(pages[0]) == paddr) {
+               last_pg = sg_page(sgt_append->prv);
+               while (n_pages && pages_are_mergeable(last_pg, pages[0])) {

This change will break things like multi-page combining, sub page
scenarios and maybe more.

The contiguity test here has to be done on the physical address, and it
should then go back to struct page to check if the pgmap is OK.

Can you fix it as well?

Thanks,
Jason
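
A minimal userspace sketch of the physical-address test described above,
mirroring the arithmetic of the removed lines. The names struct model_sg,
model_sg_next_pfn(), model_can_append() and the 4096-byte
MODEL_PAGE_SIZE are assumptions made for illustration only, and the pgmap
check is omitted.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */

/* Hypothetical stand-in for the last SG entry. */
struct model_sg {
	unsigned long pfn;	/* pfn of the entry's first page */
	unsigned long offset;	/* byte offset into that page */
	unsigned long length;	/* bytes covered; may span many pages */
};

/* pfn that starts right at the physical end of the entry. */
static unsigned long model_sg_next_pfn(const struct model_sg *sg)
{
	return (sg->pfn * MODEL_PAGE_SIZE + sg->offset + sg->length) /
	       MODEL_PAGE_SIZE;
}

/* A candidate page can be appended only if it starts at that pfn. */
static bool model_can_append(const struct model_sg *sg,
			     unsigned long cand_pfn)
{
	return cand_pfn == model_sg_next_pfn(sg);
}
```

For an entry that starts at pfn 100 and already covers two pages, the
next mergeable pfn is 102. A test that only compares against the first
page of the entry would look for pfn 101, which is why multi-page
combining breaks.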


* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 13:36 ` Jason Gunthorpe
@ 2023-01-05 16:48   ` Yishai Hadas
  0 siblings, 0 replies; 8+ messages in thread
From: Yishai Hadas @ 2023-01-05 16:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, linux-block, linux-mm, axboe, logang, hch,
	alex.williamson, leonro, maorg

On 05/01/2023 15:36, Jason Gunthorpe wrote:
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Thanks Jason

>
> Also, I'm looking more closely at 1567b49d1a40 and this is not right either:
>
> -               unsigned long paddr =
> -                       (page_to_pfn(sg_page(sgt_append->prv)) * PAGE_SIZE +
> -                        sgt_append->prv->offset + sgt_append->prv->length) /
> -                       PAGE_SIZE;
> -
> -               while (n_pages && page_to_pfn(pages[0]) == paddr) {
> +               last_pg = sg_page(sgt_append->prv);
> +               while (n_pages && pages_are_mergeable(last_pg, pages[0])) {
>
> This change will break things like multi-page combining, sub page
> scenarios and maybe more.
>
> The contiguity test here has to be done on the physical address, and it
> should then go back to struct page to check if the pgmap is OK.
>
> Can you fix it as well?


Yes, I have a candidate patch locally for what you asked, on top of this one.

I would like to run some extra testing on it, and then I may send it.

Yishai



* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 11:23 [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly Yishai Hadas
  2023-01-05 13:36 ` Jason Gunthorpe
@ 2023-01-05 20:06 ` Jason Gunthorpe
  2023-01-05 20:21   ` Keith Busch
  2023-01-05 20:23   ` Logan Gunthorpe
  1 sibling, 2 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2023-01-05 20:06 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: linux-kernel, linux-block, linux-mm, axboe, logang, hch,
	alex.williamson, leonro, maorg

On Thu, Jan 05, 2023 at 01:23:39PM +0200, Yishai Hadas wrote:
> When sg_alloc_append_table_from_pages() calls pages_are_mergeable()
> in its 'sgt_append->prv' flow to check whether it can merge contiguous
> pages into the last SG, it passes the page arguments in the wrong order.
> 
> The first parameter should be the next candidate page to be merged to
> the last page and not the opposite.
> 
> The current code leads to a corrupted SG, which results in oopses and
> unexpected errors when non-contiguous pages are wrongly merged.
> 
> Fix to pass the page parameters in the right order.
> 
> Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages")
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  lib/scatterlist.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

rdma is pretty much the only user of this API and this bug is causing
bad data corruption, so I'm going to take it to the rdma tree and send
it tomorrow.

Which raises the question why the original patch was done at all,
nothing ever inputs pgmap pages into this function?

Thanks,
Jason


* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 20:06 ` Jason Gunthorpe
@ 2023-01-05 20:21   ` Keith Busch
  2023-01-05 20:23     ` Jason Gunthorpe
  2023-01-05 20:23   ` Logan Gunthorpe
  1 sibling, 1 reply; 8+ messages in thread
From: Keith Busch @ 2023-01-05 20:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, linux-kernel, linux-block, linux-mm, axboe, logang,
	hch, alex.williamson, leonro, maorg

On Thu, Jan 05, 2023 at 04:06:11PM -0400, Jason Gunthorpe wrote:
> rdma is pretty much the only user of this API and this bug is causing
> bad data corruption, so I'm going to take it to the rdma tree and send
> it tomorrow.
> 
> Which raises the question why the original patch was done at all,
> nothing ever inputs pgmap pages into this function?

This just takes any arbitrary user addresses, right? The user could
provide addresses from mmap'ing pci resource files that resolve to pgmap
pages.


* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 20:21   ` Keith Busch
@ 2023-01-05 20:23     ` Jason Gunthorpe
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2023-01-05 20:23 UTC (permalink / raw)
  To: Keith Busch
  Cc: Yishai Hadas, linux-kernel, linux-block, linux-mm, axboe, logang,
	hch, alex.williamson, leonro, maorg

On Thu, Jan 05, 2023 at 01:21:43PM -0700, Keith Busch wrote:
> On Thu, Jan 05, 2023 at 04:06:11PM -0400, Jason Gunthorpe wrote:
> > rdma is pretty much the only user of this API and this bug is causing
> > bad data corruption, so I'm going to take it to the rdma tree and send
> > it tomorrow.
> > 
> > Which raises the question why the original patch was done at all,
> > nothing ever inputs pgmap pages into this function?
> 
> This just takes any arbitrary user addresses, right? The user could
> provide addresses from mmap'ing pci resource files that resolve to pgmap
> pages.

No, it passes FOLL_LONGTERM and pin_user_pages will not return any pgmaps
in that case.

Jason


* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 20:06 ` Jason Gunthorpe
  2023-01-05 20:21   ` Keith Busch
@ 2023-01-05 20:23   ` Logan Gunthorpe
  2023-01-05 20:25     ` Jason Gunthorpe
  1 sibling, 1 reply; 8+ messages in thread
From: Logan Gunthorpe @ 2023-01-05 20:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Yishai Hadas
  Cc: linux-kernel, linux-block, linux-mm, axboe, hch, alex.williamson,
	leonro, maorg



On 2023-01-05 13:06, Jason Gunthorpe wrote:
> rdma is pretty much the only user of this API and this bug is causing
> bad data corruption, so I'm going to take it to the rdma tree and send
> it tomorrow.
> 
> Which raises the question why the original patch was done at all,
> nothing ever inputs pgmap pages into this function?

It was done solely because you had suggested it was necessary.

https://lore.kernel.org/all/20210929224653.GZ964074@nvidia.com/

Though the patch was correct when I originally wrote it, it looks like
I merged it poorly somewhere along the line (roughly v5 of the series)
when the paddr stuff was added. Sorry about that.
The paddr stuff was messy and really hard to understand.

Anyway, Yishai's first patch looks correct to me, but I guess we need to
fix it further. For what it's worth:

Reviewed-by: Logan Gunthorpe <logang@deltatee.com>

Logan


* Re: [PATCH] lib/scatterlist: Fix to merge contiguous pages into the last SG properly
  2023-01-05 20:23   ` Logan Gunthorpe
@ 2023-01-05 20:25     ` Jason Gunthorpe
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2023-01-05 20:25 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Yishai Hadas, linux-kernel, linux-block, linux-mm, axboe, hch,
	alex.williamson, leonro, maorg

On Thu, Jan 05, 2023 at 01:23:52PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 2023-01-05 13:06, Jason Gunthorpe wrote:
> > rdma is pretty much the only user of this API and this bug is causing
> > bad data corruption, so I'm going to take it to the rdma tree and send
> > it tomorrow.
> > 
> > Which raises the question why the original patch was done at all,
> > nothing ever inputs pgmap pages into this function?
> 
> It was done solely because you had suggested it was necessary.
> 
> https://lore.kernel.org/all/20210929224653.GZ964074@nvidia.com/

Yes, but that was when I was expecting this would work with
FOLL_LONGTERM and PUP..

Jason
