linux-kernel.vger.kernel.org archive mirror
* Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
@ 2022-02-03  6:27 Lukas Bulwahn
  2022-02-03  8:38 ` John Hubbard
  0 siblings, 1 reply; 9+ messages in thread
From: Lukas Bulwahn @ 2022-02-03  6:27 UTC (permalink / raw)
  To: Andrew Morton, John Hubbard, Linux Kernel Mailing List, Linux-MM
  Cc: Jason Gunthorpe, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Jason Gunthorpe, Kirill A. Shutemov

Dear John,

Your change "mm/gup: clean up follow_pfn_pte() slightly" (see Link),
visible in linux-next as commit 05fef840b5c6 ("mm/gup: clean up
follow_pfn_pte() slightly"), is somehow weird.

In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
out. However, at the label out, the value of page is not used, but the
return uses the variables i and ret.

Static analysis tools, such as clang-analyzer, rightfully complain
about such weird code.

Maybe you can have another look at what you intended to set in the
branch of that commit or if you intend to jump to the label out?


Best regards,

Lukas

Link: https://lkml.kernel.org/r/20220201101108.306062-3-jhubbard@nvidia.com


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-03  6:27 Weird code with change "mm/gup: clean up follow_pfn_pte() slightly" Lukas Bulwahn
@ 2022-02-03  8:38 ` John Hubbard
  2022-02-03 13:01   ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: John Hubbard @ 2022-02-03  8:38 UTC (permalink / raw)
  To: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List, Linux-MM
  Cc: Jason Gunthorpe, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Jason Gunthorpe, Kirill A. Shutemov

On 2/2/22 22:27, Lukas Bulwahn wrote:
> Dear John,
> 
> Your change "mm/gup: clean up follow_pfn_pte() slightly" (see Link),
> visible in linux-next as commit 05fef840b5c6 ("mm/gup: clean up
> follow_pfn_pte() slightly"), is somehow weird.

Well. That sounds like something to be avoided. :)

> 
> In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
> out. However, at the label out, the value of page is not used, but the
> return uses the variables i and ret.

Yes, I think that the complaint is accurate. The intent of this code is
to return either number of pages so far (i) or ret (which should be zero
in this case), because we are just stopping early, rather than calling
this an actual error.

And since we do skip over setting pages[i] = page, it's pointless to
assign page to anything.

So instead of this:

	if (pages) {
		page = ERR_PTR(-EFAULT);
		goto out;
	}

...I should have written this:

	if (pages)
		goto out;


I'll send an updated series with this correction.

Thank you for the report!


thanks,
-- 
John Hubbard
NVIDIA
> 
> Static analysis tools, such as clang-analyzer, rightfully complain
> about such weird code.
> 
> Maybe you can have another look at what you intended to set in the
> branch of that commit or if you intend to jump to the label out?
> 
> 
> Best regards,
> 
> Lukas
> 
> Link: https://lkml.kernel.org/r/20220201101108.306062-3-jhubbard@nvidia.com



* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-03  8:38 ` John Hubbard
@ 2022-02-03 13:01   ` Jason Gunthorpe
  2022-02-03 20:44     ` John Hubbard
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2022-02-03 13:01 UTC (permalink / raw)
  To: John Hubbard
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On Thu, Feb 03, 2022 at 12:38:33AM -0800, John Hubbard wrote:
> On 2/2/22 22:27, Lukas Bulwahn wrote:
> > Dear John,
> > 
> > Your change "mm/gup: clean up follow_pfn_pte() slightly" (see Link),
> > visible in linux-next as commit 05fef840b5c6 ("mm/gup: clean up
> > follow_pfn_pte() slightly"), is somehow weird.
> 
> Well. That sounds like something to be avoided. :)
> 
> > 
> > In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
> > out. However, at the label out, the value of page is not used, but the
> > return uses the variables i and ret.
> 
> Yes, I think that the complaint is accurate. The intent of this code is
> to return either number of pages so far (i) or ret (which should be zero
> in this case), because we are just stopping early, rather than calling
> this an actual error.

IIRC GUP shouldn't return 0, it should return an error code, not zero.

Jason


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-03 13:01   ` Jason Gunthorpe
@ 2022-02-03 20:44     ` John Hubbard
  2022-02-04  0:45       ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: John Hubbard @ 2022-02-03 20:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On 2/3/22 05:01, Jason Gunthorpe wrote:
...
>>> In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
>>> out. However, at the label out, the value of page is not used, but the
>>> return uses the variables i and ret.
>>
>> Yes, I think that the complaint is accurate. The intent of this code is
>> to return either number of pages so far (i) or ret (which should be zero
>> in this case), because we are just stopping early, rather than calling
>> this an actual error.
> 
> IIRC GUP shouldn't return 0, it should return an error code, not zero.
> 
> Jason

Errors work for single pages, but GUP is a multi-page API call. If it
returned an error part way through the list of pages, then callers would
have no way of knowing how many pages to release.

With that in mind, the API returns the number of pages that were
successfully pinned, if that number is > 0 (or even 0, in some cases),
even when an error has been encountered.

__get_user_pages()'s kerneldoc documentation covers it. And I see now
that it needs tweaking to include the FOLL_PIN case, but anyway:

  * Returns either number of pages pinned (which may be less than the
  * number requested), or an error. Details about the return value:
  *
  * -- If nr_pages is 0, returns 0.
  * -- If nr_pages is >0, but no pages were pinned, returns -errno.
  * -- If nr_pages is >0, and some pages were pinned, returns the number of
  *    pages pinned. Again, this may be less than nr_pages.
  * -- 0 return value is possible when the fault would need to be retried.
  *
  * The caller is responsible for releasing returned @pages, via put_page().
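
The return rule quoted above can be modelled in user space. This is only
a minimal sketch, not the kernel implementation: model_gup() and its
fail_at parameter are hypothetical names, and the final expression is
assumed to mirror the "return i ? i : ret" pattern that the report refers
to when it says the return uses the variables i and ret.

```c
#include <errno.h>

/*
 * Hypothetical user-space model of the documented return contract.
 * fail_at is the index of the first page that cannot be pinned, or -1
 * if every page can be pinned. This is not kernel code.
 */
static long model_gup(long nr_pages, long fail_at)
{
	long i;
	long ret = 0;

	for (i = 0; i < nr_pages; i++) {
		if (i == fail_at) {
			ret = -EFAULT;	/* remember why we stopped */
			break;
		}
	}

	/* Pages pinned so far take priority over the error code. */
	return i ? i : ret;
}
```

In this model the error is only visible to the caller when no page was
pinned before the failure; otherwise the caller sees the partial count.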


thanks,
-- 
John Hubbard
NVIDIA


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-03 20:44     ` John Hubbard
@ 2022-02-04  0:45       ` Jason Gunthorpe
  2022-02-04  0:59         ` John Hubbard
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2022-02-04  0:45 UTC (permalink / raw)
  To: John Hubbard
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
> On 2/3/22 05:01, Jason Gunthorpe wrote:
> ...
> > > > In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
> > > > out. However, at the label out, the value of page is not used, but the
> > > > return uses the variables i and ret.
> > > 
> > > Yes, I think that the complaint is accurate. The intent of this code is
> > > to return either number of pages so far (i) or ret (which should be zero
> > > in this case), because we are just stopping early, rather than calling
> > > this an actual error.
> > 
> > IIRC GUP shouldn't return 0, it should return an error code, not zero.
> > 
> > Jason
> 
> Errors work for single pages, but GUP is a multi-page API call. If it
> returned an error part way through the list of pages, then callers would
> have no way of knowing how many pages to release.

Yes, but that is returning a positive error code, I said it should not
return zero.

When it hits an error with pages already loaded it returns that number
and the caller will then do gup once more with the VA pointing at the
problematic page. Then GUP can return the error code because it has 0
pages on the next iteration.
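
The caller pattern described above can be sketched as follows. This is a
user-space illustration under stated assumptions, not code from the
kernel: model_gup_range(), pin_all() and bad_index are hypothetical
names, and treating a zero return as -EFAULT is a choice made here only
so that the loop cannot spin without making progress.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for GUP: pins pages starting at index "start"
 * and stops at bad_index, which cannot be pinned (-1 means no bad page).
 * Returns the number pinned, or -EFAULT if the very first page fails.
 */
static long model_gup_range(long start, long nr_pages, long bad_index)
{
	long pinned = 0;

	while (pinned < nr_pages && start + pinned != bad_index)
		pinned++;

	if (pinned == 0 && nr_pages > 0)
		return -EFAULT;
	return pinned;
}

/*
 * Caller loop: keep calling until everything is pinned or an error is
 * returned. *total_pinned is the number of pages the caller must release.
 */
static long pin_all(long nr_pages, long bad_index, long *total_pinned)
{
	long done = 0;

	while (done < nr_pages) {
		long ret = model_gup_range(done, nr_pages - done, bad_index);

		if (ret <= 0) {
			/* Assumption: a zero return is treated as a failure. */
			*total_pinned = done;
			return ret ? ret : -EFAULT;
		}
		done += ret;	/* next call starts at the problematic page */
	}

	*total_pinned = done;
	return 0;
}
```

With a bad page at index 3, the first call returns 3 and the second call,
which starts at the bad page, returns the error code while the caller
still knows that 3 pages need to be released.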

It should not return 0 here when it got an error.

>  * Returns either number of pages pinned (which may be less than the
>  * number requested), or an error. Details about the return value:
>  *
>  * -- If nr_pages is 0, returns 0.
>  * -- If nr_pages is >0, but no pages were pinned, returns -errno.
>  * -- If nr_pages is >0, and some pages were pinned, returns the number of
>  *    pages pinned. Again, this may be less than nr_pages.
>  * -- 0 return value is possible when the fault would need to be retried.

I actually don't know of any place that handles the 0 return code, or
what 'fault would need to be retried' is supposed to mean for the
caller ...

Jason


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-04  0:45       ` Jason Gunthorpe
@ 2022-02-04  0:59         ` John Hubbard
  2022-02-04  1:06           ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: John Hubbard @ 2022-02-04  0:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On 2/3/22 16:45, Jason Gunthorpe wrote:
> On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
>> On 2/3/22 05:01, Jason Gunthorpe wrote:
>> ...
>>>>> In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
>>>>> out. However, at the label out, the value of page is not used, but the
>>>>> return uses the variables i and ret.
>>>>
>>>> Yes, I think that the complaint is accurate. The intent of this code is
>>>> to return either number of pages so far (i) or ret (which should be zero
>>>> in this case), because we are just stopping early, rather than calling
>>>> this an actual error.
>>>
>>> IIRC GUP shouldn't return 0, it should return an error code, not zero.
>>>
>>> Jason
>>
>> Errors work for single pages, but GUP is a multi-page API call. If it
>> returned an error part way through the list of pages, then callers would
>> have no way of knowing how many pages to release.
> 
> Yes, but that is returning a positive error code, I said it should not
> return zero.
> 
> When it hits an error with pages already loaded it returns that number
> and the caller will then do gup once more with the VA pointing at the
> problematic page. Then GUP can return the error code because it has 0
> pages on the next iteration.
> 
> It should not return 0 here when it got an error.

This is perhaps better API design, but it's not what exists now. The call
sites today handle 0 pages ret value correctly, already. There are lots
of call sites. Is this worth changing?

Also, to be clear, are you proposing just handling zero as a special
case, or something more extensive? Because after we get N pages into it,
someone has to unpin those pages, and it's been up to the caller so far.

> 
>>   * Returns either number of pages pinned (which may be less than the
>>   * number requested), or an error. Details about the return value:
>>   *
>>   * -- If nr_pages is 0, returns 0.
>>   * -- If nr_pages is >0, but no pages were pinned, returns -errno.
>>   * -- If nr_pages is >0, and some pages were pinned, returns the number of
>>   *    pages pinned. Again, this may be less than nr_pages.
>>   * -- 0 return value is possible when the fault would need to be retried.
> 
> I actually don't know of any place that handles the 0 return code, or
> what 'fault would need to be retried' is supposed to mean for the
> caller ...
> 

There are quite a few places that handle a 0 return, and they understand
that it is an error for their case. For example:

static int non_atomic_pte_lookup(struct vm_area_struct *vma,
				 unsigned long vaddr, int write,
				 unsigned long *paddr, int *pageshift)
{
	struct page *page;

#ifdef CONFIG_HUGETLB_PAGE
	*pageshift = is_vm_hugetlb_page(vma) ? HPAGE_SHIFT : PAGE_SHIFT;
#else
	*pageshift = PAGE_SHIFT;
#endif
	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
		return -EFAULT;
	*paddr = page_to_phys(page);
	put_page(page);
	return 0;
}


thanks,
-- 
John Hubbard
NVIDIA


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-04  0:59         ` John Hubbard
@ 2022-02-04  1:06           ` Jason Gunthorpe
  2022-02-04  1:22             ` John Hubbard
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2022-02-04  1:06 UTC (permalink / raw)
  To: John Hubbard
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On Thu, Feb 03, 2022 at 04:59:56PM -0800, John Hubbard wrote:
> On 2/3/22 16:45, Jason Gunthorpe wrote:
> > On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
> > > On 2/3/22 05:01, Jason Gunthorpe wrote:
> > > ...
> > > > > > In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
> > > > > > out. However, at the label out, the value of page is not used, but the
> > > > > > return uses the variables i and ret.
> > > > > 
> > > > > Yes, I think that the complaint is accurate. The intent of this code is
> > > > > to return either number of pages so far (i) or ret (which should be zero
> > > > > in this case), because we are just stopping early, rather than calling
> > > > > this an actual error.
> > > > 
> > > > IIRC GUP shouldn't return 0, it should return an error code, not zero.
> > > > 
> > > > Jason
> > > 
> > > Errors work for single pages, but GUP is a multi-page API call. If it
> > > returned an error part way through the list of pages, then callers would
> > > have no way of knowing how many pages to release.
> > 
> > Yes, but that is returning a positive error code, I said it should not
> > return zero.
> > 
> > When it hits an error with pages already loaded it returns that number
> > and the caller will then do gup once more with the VA pointing at the
> > problematic page. Then GUP can return the error code because it has 0
> > pages on the next iteration.
> > 
> > It should not return 0 here when it got an error.
> 
> This is perhaps better API design, but it's not what exists now. 

I think it is what exists today, 0 certainly is not implemented as
'need retry' anywhere I found.

So why do we return 0, if it means an error, instead of returning the
actual errno?

> The call sites today handle 0 pages ret value correctly,

This isn't correct though:

 	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
 		return -EFAULT;

If GUP wanted the caller to permanently fail with -EFAULT, it should
have directly returned EFAULT.

0 means 'to be retried', whatever that means, and there is no retry
in the above.

IOW, the above does not handle a 0 return correctly, according to the
comment.

Jason


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-04  1:06           ` Jason Gunthorpe
@ 2022-02-04  1:22             ` John Hubbard
  2022-02-04  1:26               ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: John Hubbard @ 2022-02-04  1:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On 2/3/22 17:06, Jason Gunthorpe wrote:
> On Thu, Feb 03, 2022 at 04:59:56PM -0800, John Hubbard wrote:
>> On 2/3/22 16:45, Jason Gunthorpe wrote:
>>> On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
>>>> On 2/3/22 05:01, Jason Gunthorpe wrote:
>>>> ...
>>>>>>> In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
>>>>>>> out. However, at the label out, the value of page is not used, but the
>>>>>>> return uses the variables i and ret.
>>>>>>
>>>>>> Yes, I think that the complaint is accurate. The intent of this code is
>>>>>> to return either number of pages so far (i) or ret (which should be zero
>>>>>> in this case), because we are just stopping early, rather than calling
>>>>>> this an actual error.
>>>>>
>>>>> IIRC GUP shouldn't return 0, it should return an error code, not zero.
>>>>>
>>>>> Jason
>>>>
>>>> Errors work for single pages, but GUP is a multi-page API call. If it
>>>> returned an error part way through the list of pages, then callers would
>>>> have no way of knowing how many pages to release.
>>>
>>> Yes, but that is returning a positive error code, I said it should not
>>> return zero.
>>>
>>> When it hits an error with pages already loaded it returns that number
>>> and the caller will then do gup once more with the VA pointing at the
>>> problematic page. Then GUP can return the error code because it has 0
>>> pages on the next iteration.
>>>
>>> It should not return 0 here when it got an error.
>>
>> This is perhaps better API design, but it's not what exists now.
> 
> I think it is what exists today, 0 certainly is not implemented as
> 'need retry' anywhere I found.
> 
> So why do we return 0, if it means an error, instead of returning the
> actual errno?

Well, now returning 0 sounds all wrong, when you put it like that. :)

So, simply this approach? :

@@ -1205,8 +1201,15 @@ static long __get_user_pages(struct mm_struct *mm,
  		} else if (PTR_ERR(page) == -EEXIST) {
  			/*
  			 * Proper page table entry exists, but no corresponding
-			 * struct page.
+			 * struct page. If the caller expects **pages to be
+			 * filled in, bail out now, because that can't be done
+			 * for this page.
  			 */
+			if (pages) {
+				ret = PTR_ERR(page);
+				goto out;
+			}
+
  			goto next_page;
  		} else if (IS_ERR(page)) {
  			ret = PTR_ERR(page);
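
The effect of storing the error in ret before the goto can be sketched in
user space. This is a simplified model, not the kernel function:
model_loop(), lookup[], want_pages and set_ret are hypothetical names,
and the loop is assumed to end with the same "return i ? i : ret" pattern
discussed earlier in the thread.

```c
#include <errno.h>

/*
 * Hypothetical model of the -EEXIST branch. lookup[i] is 0 when page i
 * has a struct page and -EEXIST when a page table entry exists without
 * one. want_pages models a non-NULL **pages argument. set_ret selects
 * the behavior in the diff above (store the errno before bailing out).
 */
static long model_loop(const int *lookup, long nr_pages,
		       int want_pages, int set_ret)
{
	long i = 0;
	long ret = 0;

	while (i < nr_pages) {
		if (lookup[i] == -EEXIST && want_pages) {
			if (set_ret)
				ret = lookup[i];
			break;		/* corresponds to "goto out" */
		}
		i++;			/* corresponds to "goto next_page" */
	}

	return i ? i : ret;
}
```

When the first page hits -EEXIST, the model returns 0 without the
assignment and -EEXIST with it; when earlier pages succeeded, both
variants return the partial count.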

> 
>> The call sites today handle 0 pages ret value correctly,
> 
> This isn't correct though:
> 
>   	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
>   		return -EFAULT;
> 
> If GUP wanted the caller to permanently fail with -EFAULT, it should
> have directly returned EFAULT.
> 
> 0 means 'to be retried', whatever that means, and there is no retry
> in the above.
> 
> IOW, the above does not handle a 0 return correctly, according to the
> comment.
> 

I recall seeing several sites that do a quick attempt at one page and
force a -errno failure if anything other than ret==1 occurs. I guess the
good news is that changing GUP to return -errno instead of 0 won't affect
them.
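
A minimal sketch of why such single-page call sites are unaffected,
assuming a caller shaped like the non_atomic_pte_lookup() example quoted
earlier in the thread. lookup_one_page() is a hypothetical helper that
takes the GUP return value directly instead of calling into the kernel.

```c
#include <errno.h>

/*
 * Hypothetical single-page caller: anything other than a successful pin
 * (a positive return for one requested page) becomes -EFAULT, so a GUP
 * return of 0 and a GUP return of -errno are indistinguishable here.
 */
static int lookup_one_page(long gup_ret)
{
	if (gup_ret <= 0)
		return -EFAULT;
	return 0;
}
```

Because the check is "<= 0", replacing a 0 return with -EEXIST or any
other negative errno leaves the result of this caller unchanged.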


thanks,
-- 
John Hubbard
NVIDIA


* Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
  2022-02-04  1:22             ` John Hubbard
@ 2022-02-04  1:26               ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2022-02-04  1:26 UTC (permalink / raw)
  To: John Hubbard
  Cc: Lukas Bulwahn, Andrew Morton, Linux Kernel Mailing List,
	Linux-MM, Peter Xu, Alex Williamson, Andrea Arcangeli,
	David Hildenbrand, Jan Kara, Kirill A. Shutemov

On Thu, Feb 03, 2022 at 05:22:36PM -0800, John Hubbard wrote:
> > So why do we return 0, if it means an error, instead of returning the
> > actual errno?
> 
> Well, now returning 0 sounds all wrong, when you put it like that. :)
> 
> So, simply this approach? :
> 
> @@ -1205,8 +1201,15 @@ static long __get_user_pages(struct mm_struct *mm,
>  		} else if (PTR_ERR(page) == -EEXIST) {
>  			/*
>  			 * Proper page table entry exists, but no corresponding
> -			 * struct page.
> +			 * struct page. If the caller expects **pages to be
> +			 * filled in, bail out now, because that can't be done
> +			 * for this page.
>  			 */
> +			if (pages) {
> +				ret = PTR_ERR(page);
> +				goto out;
> +			}
> +
>  			goto next_page;
>  		} else if (IS_ERR(page)) {
>  			ret = PTR_ERR(page);

It is what I had in mind, I certainly wouldn't add a new condition
that returns 0..

Jason

