All of lore.kernel.org
 help / color / mirror / Atom feed
* st.c page_to_pfn() usage
@ 2004-10-23 11:35 Arjan van de Ven
  2004-10-23 13:14 ` William Lee Irwin III
  2004-10-23 20:20 ` Kai Makisara
  0 siblings, 2 replies; 4+ messages in thread
From: Arjan van de Ven @ 2004-10-23 11:35 UTC (permalink / raw)
  To: Kai.Makisara; +Cc: linux-scsi

Hi,

I'm auditing kernel uses of page_to_pfn() and I came across st.c. 
st.c uses page_to_pfn() twice, the first usage is:

if (page_to_pfn(STbp->sg[i].page) == page_to_pfn(STbp->sg[i-1].page) + 1)
	STp->nbr_combinable++;

which looks quite incorrect, I *think* the code wants to check "physically
contiguous" however pfn is the wrong thing to check there; with discontigmem
(which is basically true now for all 64 bit architectures), the assumption
"pfn and pfn+1 are physically next to each other" isn't per se true,
example:

|----- zone 1 ----------|     gap     |-------- zone 2 ----------|

the last page of zone 1 and the first page of zone 2 may be 1 pfn appart in the
discontigmem case, but physically far appart.


the other page_to_pfn() user is st_map_user_pages():
        for (i=0; i < nr_pages; i++) {
                if (page_to_pfn(sgl[i].page) > max_pfn)
                        goto out_unmap;
        }

and again I get the suspicion this is not the right thing to do...
it *looks* like you want to detect userspace mmaped vmalloc() pages or
mmaped pci mmio ranges and the like, however the test is entirely the wrong
thing for that, take an average x86 server PC with say 6Gb of memory:

0               3.5Gb              4Gb           6.5Gb
|-------------------||-- PCI GAP --||---------------|

and something similar happens for vmalloc; however I fully admit that I
don't quite entirely understand the purpose of this code yet, so maybe I'm
misreading the intention of the code

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: st.c page_to_pfn() usage
  2004-10-23 11:35 st.c page_to_pfn() usage Arjan van de Ven
@ 2004-10-23 13:14 ` William Lee Irwin III
  2004-10-23 20:20 ` Kai Makisara
  1 sibling, 0 replies; 4+ messages in thread
From: William Lee Irwin III @ 2004-10-23 13:14 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Kai.Makisara, linux-scsi

On Sat, Oct 23, 2004 at 01:35:47PM +0200, Arjan van de Ven wrote:
> I'm auditing kernel uses of page_to_pfn() and I came across st.c. 
> st.c uses page_to_pfn() twice, the first usage is:
> if (page_to_pfn(STbp->sg[i].page) == page_to_pfn(STbp->sg[i-1].page) + 1)
> 	STp->nbr_combinable++;
> which looks quite incorrect, I *think* the code wants to check "physically
> contiguous" however pfn is the wrong thing to check there; with discontigmem
> (which is basically true now for all 64 bit architectures), the assumption
> "pfn and pfn+1 are physically next to each other" isn't per se true,
> example:

I'm rather hard pressed to see how contiguous pfn's would result in
discontiguous memory. They're meant to be truly physical; the only
abuses of this I'm aware of are for hypervisor-related use, whose IO
invariants are such that all these things about bouncing and most all
handling of physical limitations of devices are irrelevant anyway.


On Sat, Oct 23, 2004 at 01:35:47PM +0200, Arjan van de Ven wrote:
> |----- zone 1 ----------|     gap     |-------- zone 2 ----------|
> the last page of zone 1 and the first page of zone 2 may be 1 pfn
> appart in the discontigmem case, but physically far appart.

I'm not sure what the numbers described in that diagram have relations
to. The 2.4.x invariant was that within a zone pfn's were monotonically
increasing, the 2.6.x invariant is that zones are literally intervals
of pfn's, which as one might expect makes various machines unhappy.


On Sat, Oct 23, 2004 at 01:35:47PM +0200, Arjan van de Ven wrote:
> the other page_to_pfn() user is st_map_user_pages():
>         for (i=0; i < nr_pages; i++) {
>                 if (page_to_pfn(sgl[i].page) > max_pfn)
>                         goto out_unmap;
>         }
> and again I get the suspicion this is not the right thing to do...
> it *looks* like you want to detect userspace mmaped vmalloc() pages or
> mmaped pci mmio ranges and the like, however the test is entirely the wrong
> thing for that, take an average x86 server PC with say 6Gb of memory:

The name max_pfn is almost certainly an accident, as the name of the
formal parameter clashes with the global variable. This seems to be a
speculative attempt to do IO without bouncing for the direct IO case.


On Sat, Oct 23, 2004 at 01:35:47PM +0200, Arjan van de Ven wrote:
> 0               3.5Gb              4Gb           6.5Gb
> |-------------------||-- PCI GAP --||---------------|
> 
> and something similar happens for vmalloc; however I fully admit that I
> don't quite entirely understand the purpose of this code yet, so maybe I'm
> misreading the intention of the code

The max_pfn passed is STp->max_pfn, which I suppose is Hungarian notation
for some pfn stored in a structure meant to represent the bounce_pfn for
the tape device. I suppose one could hunt for where it's initialized.


-- wli

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: st.c page_to_pfn() usage
  2004-10-23 11:35 st.c page_to_pfn() usage Arjan van de Ven
  2004-10-23 13:14 ` William Lee Irwin III
@ 2004-10-23 20:20 ` Kai Makisara
  2004-10-23 23:40   ` Arjan van de Ven
  1 sibling, 1 reply; 4+ messages in thread
From: Kai Makisara @ 2004-10-23 20:20 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: William Lee Irwin III, linux-scsi

On Sat, 23 Oct 2004, Arjan van de Ven wrote:

> Hi,
> 
> I'm auditing kernel uses of page_to_pfn() and I came across st.c. 
> st.c uses page_to_pfn() twice, the first usage is:
> 
> if (page_to_pfn(STbp->sg[i].page) == page_to_pfn(STbp->sg[i-1].page) + 1)
> 	STp->nbr_combinable++;
> 
> which looks quite incorrect, I *think* the code wants to check "physically
> contiguous" however pfn is the wrong thing to check there; with discontigmem
> (which is basically true now for all 64 bit architectures), the assumption
> "pfn and pfn+1 are physically next to each other" isn't per se true,
> example:
> 
> |----- zone 1 ----------|     gap     |-------- zone 2 ----------|
> 
> the last page of zone 1 and the first page of zone 2 may be 1 pfn appart in the
> discontigmem case, but physically far appart.
> 
The code is inside DEB(), i.e., it is debugging code. The purpose of the 
code is to count the occasions where physically contiguous pages could be 
combined to a single scatter/gather segment. No decisions are made based 
on the result. My experiments suggested that combining is not usually 
possible and it is not worth trying. I did not rip off the code in case it 
would prove useful later. If it is incorrect, the easiest solution is to 
rip off everything related to STp->nbr_combinable.

> 
> the other page_to_pfn() user is st_map_user_pages():
>         for (i=0; i < nr_pages; i++) {
>                 if (page_to_pfn(sgl[i].page) > max_pfn)
>                         goto out_unmap;
>         }
> 
> and again I get the suspicion this is not the right thing to do...
> it *looks* like you want to detect userspace mmaped vmalloc() pages or
> mmaped pci mmio ranges and the like, however the test is entirely the wrong
> thing for that, take an average x86 server PC with say 6Gb of memory:
> 
> 0               3.5Gb              4Gb           6.5Gb
> |-------------------||-- PCI GAP --||---------------|
> 
> and something similar happens for vmalloc; however I fully admit that I
> don't quite entirely understand the purpose of this code yet, so maybe I'm
> misreading the intention of the code

The purpose of the code is to check if the pages are directly reachable by 
the SCSI HBA. If not, the driver reverts to bouncing all pages of this 
request. As William points out, a better name for the field would
have been bounce_pfn. The value is computed by the code fragment:

        bounce_limit = scsi_calculate_bounce_limit(SDp->host) >> PAGE_SHIFT;
        if (bounce_limit > ULONG_MAX)
                bounce_limit = ULONG_MAX;
        tpnt->max_pfn = bounce_limit;

I have never been quite sure that this is correct in all architectures. 
However, I have copied the usage from ll_rw_blk.c. If you suggest a 
better method, please see that ll_rw_blk.c is also fixed (unless I have 
read ll_rw_blk.c incorrectly).

-- 
Kai

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: st.c page_to_pfn() usage
  2004-10-23 20:20 ` Kai Makisara
@ 2004-10-23 23:40   ` Arjan van de Ven
  0 siblings, 0 replies; 4+ messages in thread
From: Arjan van de Ven @ 2004-10-23 23:40 UTC (permalink / raw)
  To: Kai Makisara; +Cc: William Lee Irwin III, linux-scsi


> The code is inside DEB(), i.e., it is debugging code. The purpose of the 
> code is to count the occasions where physically contiguous pages could be 
> combined to a single scatter/gather segment. No decisions are made based 
> on the result. My experiments suggested that combining is not usually 
> possible and it is not worth trying. I did not rip off the code in case it 
> would prove useful later. If it is incorrect, the easiest solution is to 
> rip off everything related to STp->nbr_combinable.

just ignore my email; as william points out I'm flat out wrong....
-- 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-10-23 23:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-23 11:35 st.c page_to_pfn() usage Arjan van de Ven
2004-10-23 13:14 ` William Lee Irwin III
2004-10-23 20:20 ` Kai Makisara
2004-10-23 23:40   ` Arjan van de Ven

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.