linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Question about free_one_pgd() changes in these 3.5G patches
@ 2003-07-16  0:52 Ron Niles
  2003-07-16 12:31 ` Hugh Dickins
  0 siblings, 1 reply; 5+ messages in thread
From: Ron Niles @ 2003-07-16  0:52 UTC (permalink / raw)
  To: linux-kernel


I'm trying to expand my kernel memory space to 3GB and am checking out these
patches. I'm kind of mystified by the changes to free_one_pgd in these
patches, both the one that Chuck Luciano recently posted and the one in AA
(00_3.5G-address-space-4). Both of these seem to change the loop from:

	int j;
	...

	for (j = 0; j < PTRS_PER_PMD ; j++) {
 		prefetchw(pmd+j+(PREFETCH_STRIDE/16));
		free_one_pmd(pmd+j);
	}
to:

	pmd_t * pmd, * md, * emd;
	...

	/*
	 * Beware if changing the loop below.  It once used int j,
	 *	for (j = 0; j < PTRS_PER_PMD; j++)
	 *		free_one_pmd(pmd+j);
	 * but some older i386 compilers (e.g. egcs-2.91.66, gcc-2.95.3)
	 * terminated the loop with a _signed_ address comparison
	 * using "jle", when configured for HIGHMEM64GB (X86_PAE).
	 * If also configured for 3GB of kernel virtual address space,
	 * if page at physical 0x3ffff000 virtual 0x7ffff000 is used as
	 * a pmd, when that mm exits the loop goes on to free "entries"
	 * found at 0x80000000 onwards.  The loop below compiles instead
	 * to be terminated by unsigned address comparison using "jb". 

	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++) {
		prefetchw(md+(PREFETCH_STRIDE/16));
		free_one_pmd(md);
 	}

The comment (found in the AA patch) makes no sense to me. Since j is an int,
you would expect the loop to exit with jle. If you want it to exit on jb,
just change j to unsigned, right? Also PTRS_PER_PMD is never very large,
around 512 I think, so it really doesn't matter unless PTRS_PER_PMD exceeds
0x7fffffff, which is really far from reality.

Secondly, the new code is dangerous if the pmd happens to be on the page at
the top of memory. In this case pmd is something like 0xfffff000, emd = pmd
+ PTRS_PER_PMD rolls over to zero, and the loop never gets executed since md
is never less than zero.

It seems to me the change is unnecessary, but if it is needed, it should
protect against rollover on the top of memory page, assuming PTRS_PER_PMD is
never zero:

	for (md = pmd, emd = pmd + PTRS_PER_PMD - 1; md <= emd; md++)

Do we guarantee that the top of memory page is never used and the rollover
is impossible? Even so, am I missing something as to why this change is
necessary?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about free_one_pgd() changes in these 3.5G patches
  2003-07-16  0:52 Question about free_one_pgd() changes in these 3.5G patches Ron Niles
@ 2003-07-16 12:31 ` Hugh Dickins
  2003-07-16 15:25   ` Dave Jones
  0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2003-07-16 12:31 UTC (permalink / raw)
  To: Ron Niles; +Cc: linux-kernel

On Tue, 15 Jul 2003, Ron Niles wrote:
> 
> I'm trying to expand my kernel memory space to 3GB and am checking out these
> patches. I'm kind of mystified by the changes to free_one_pgd in these
> patches, both the one that Chuck Luciano recently posted and the one in AA
> (00_3.5G-address-space-4). Both of these seem to change the loop from:
> 
> 	int j;
> 	...
> 
> 	for (j = 0; j < PTRS_PER_PMD ; j++) {
>  		prefetchw(pmd+j+(PREFETCH_STRIDE/16));
> 		free_one_pmd(pmd+j);
> 	}
> to:
> 
> 	pmd_t * pmd, * md, * emd;
> 	...
> 
> 	/*
> 	 * Beware if changing the loop below.  It once used int j,
> 	 *	for (j = 0; j < PTRS_PER_PMD; j++)
> 	 *		free_one_pmd(pmd+j);
> 	 * but some older i386 compilers (e.g. egcs-2.91.66, gcc-2.95.3)
> 	 * terminated the loop with a _signed_ address comparison
> 	 * using "jle", when configured for HIGHMEM64GB (X86_PAE).
> 	 * If also configured for 3GB of kernel virtual address space,
> 	 * if page at physical 0x3ffff000 virtual 0x7ffff000 is used as
> 	 * a pmd, when that mm exits the loop goes on to free "entries"
> 	 * found at 0x80000000 onwards.  The loop below compiles instead
> 	 * to be terminated by unsigned address comparison using "jb". 
> 
> 	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++) {
> 		prefetchw(md+(PREFETCH_STRIDE/16));
> 		free_one_pmd(md);
>  	}
> 
> The comment (found in the AA patch) makes no sense to me. Since j is an int,
> you would expect the loop to exit with jle. If you want it to exit on jb,
> just change j to unsigned, right? Also PTRS_PER_PMD is never very large,
> around 512 I think, so it really doesn't matter unless PTRS_PER_PMD exceeds
> 0x7fffffff, which is really far from reality.

That comment (and the rewritten loop) originally came from me.
I thought it was a champion comment, I'm saddened that you disagree!

I've tried to cover the point by saying they terminated the loop with
"a _signed_ address comparison": the loop got optimized in such a way
that it wasn't testing int j as the C shows, but the address pmd+j.

Even so, it's conceivable that your proposed change, to unsigned j,
might be enough to jolt those compilers into doing the right thing.
But I never tried that, preferring to code the pointers explicitly.

> Secondly, the new code is dangerous if the pmd happens to be on the page at
> the top of memory. In this case pmd is something like 0xfffff000, emd = pmd
> + PTRS_PER_PMD rolls over to zero, and the loop never gets executed since md
> is never less than zero.
> 
> It seems to me the change is unnecessary, but if it is needed, it should
> protect against rollover on the top of memory page, assuming PTRS_PER_PMD is
> never zero:
> 
> 	for (md = pmd, emd = pmd + PTRS_PER_PMD - 1; md <= emd; md++)
> 
> Do we guarantee that the top of memory page is never used and the rollover
> is impossible? Even so, am I missing something as to why this change is
> necessary?

It is the case that we don't use the page-worth of virtual addresses at
the very top of virtual memory (I'm trying to phrase that pedantically to
make the point that it is virtual not physical addresses relevant here).

I know that to be true of i386, I believe it to be true of all arches,
you're right to pose the question.

In 2.5 Linus did very briefly use it for the FIX_VSYSCALL page, but
quickly accepted the value of keeping that page unmapped, and it's now
dignified as FIX_HOLE.  Not that we'd be using the fixmap area for pmds,
which have to come from directly mapped lowmem (or be temporarily
kmapped in wli's highmem pmd patch in -mm).

Hugh


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about free_one_pgd() changes in these 3.5G patches
  2003-07-16 12:31 ` Hugh Dickins
@ 2003-07-16 15:25   ` Dave Jones
  2003-07-16 18:57     ` Hugh Dickins
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Jones @ 2003-07-16 15:25 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Ron Niles, linux-kernel

On Wed, Jul 16, 2003 at 01:31:39PM +0100, Hugh Dickins wrote:

Both this..

 > > 	for (j = 0; j < PTRS_PER_PMD ; j++) {
 > >  		prefetchw(pmd+j+(PREFETCH_STRIDE/16));
 > > 		free_one_pmd(pmd+j);
 > > 	}

and this..

 > > 	pmd_t * pmd, * md, * emd;
 > > 	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++) {
 > > 		prefetchw(md+(PREFETCH_STRIDE/16));
 > > 		free_one_pmd(md);
 > >  	}

both use the prefetch that was removed in 2.5 for 'being bogus'.
It can prefetch past the end iirc, which is fatal on some CPUs.

		Dave


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about free_one_pgd() changes in these 3.5G patches
  2003-07-16 15:25   ` Dave Jones
@ 2003-07-16 18:57     ` Hugh Dickins
  0 siblings, 0 replies; 5+ messages in thread
From: Hugh Dickins @ 2003-07-16 18:57 UTC (permalink / raw)
  To: Dave Jones; +Cc: Ron Niles, David Mosberger, linux-kernel

On Wed, 16 Jul 2003, Dave Jones wrote:
> On Wed, Jul 16, 2003 at 01:31:39PM +0100, Hugh Dickins wrote:
> 
> Both this..
>  > >  		prefetchw(pmd+j+(PREFETCH_STRIDE/16));
> and this..
>  > > 		prefetchw(md+(PREFETCH_STRIDE/16));
> 
> both use the prefetch that was removed in 2.5 for 'being bogus'.
> It can prefetch past the end iirc, which is fatal on some CPUs.

That prefetchw never made sense to me, nor to DavidM whom I falsely
accused of it; but I had never noticed that someone (aha - you!)
got up the courage to remove it from 2.5.  A 2.4.22 patch to Marcelo
would come with greater authority from you than from me, Dave.

Hugh


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Question about free_one_pgd() changes in these 3.5G patches
@ 2003-07-16 13:23 Ron Niles
  0 siblings, 0 replies; 5+ messages in thread
From: Ron Niles @ 2003-07-16 13:23 UTC (permalink / raw)
  To: linux-kernel; +Cc: Hugh Dickins

>On Tue, 15 Jul 2003, Ron Niles wrote:
>> 
>> 	/*
>> 	 * Beware if changing the loop below.  It once used int j,
>> 	 *	for (j = 0; j < PTRS_PER_PMD; j++)
>> 	 *		free_one_pmd(pmd+j);
>> 	 * but some older i386 compilers (e.g. egcs-2.91.66, gcc-2.95.3)
>> 	 * terminated the loop with a _signed_ address comparison
>> 	 * using "jle", when configured for HIGHMEM64GB (X86_PAE).
>> 	 * If also configured for 3GB of kernel virtual address space,
>> 	 * if page at physical 0x3ffff000 virtual 0x7ffff000 is used as
>> 	 * a pmd, when that mm exits the loop goes on to free "entries"
>> 	 * found at 0x80000000 onwards.  The loop below compiles instead
>> 	 * to be terminated by unsigned address comparison using "jb". 
>> 
>> 	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++) {
>> 		prefetchw(md+(PREFETCH_STRIDE/16));
>> 		free_one_pmd(md);
>>  	}
>> 
>> The comment (found in the AA patch) makes no sense to me. Since j is an
int,
>> you would expect the loop to exit with jle. If you want it to exit on jb,
>> just change j to unsigned, right? Also PTRS_PER_PMD is never very large,
>> around 512 I think, so it really doesn't matter unless PTRS_PER_PMD
exceeds
>> 0x7fffffff, which is really far from reality.
>
>That comment (and the rewritten loop) originally came from me.
>I thought it was a champion comment, I'm saddened that you disagree!
>
>I've tried to cover the point by saying they terminated the loop with
>"a _signed_ address comparison": the loop got optimized in such a way
>that it wasn't testing int j as the C shows, but the address pmd+j.
>

Thanks, Hugh, it _is_ a champion comment, and it makes perfect sense now
that
I understand the compiler-optimized comparison is on pmd+j, not j.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2003-07-16 18:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-07-16  0:52 Question about free_one_pgd() changes in these 3.5G patches Ron Niles
2003-07-16 12:31 ` Hugh Dickins
2003-07-16 15:25   ` Dave Jones
2003-07-16 18:57     ` Hugh Dickins
2003-07-16 13:23 Ron Niles

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).