linux-kernel.vger.kernel.org archive mirror
* [i386] Questions regarding provisional page tables initialization
@ 2007-07-01 20:38 Ahmed S. Darwish
  2007-07-01 22:02 ` Andreas Schwab
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Ahmed S. Darwish @ 2007-07-01 20:38 UTC (permalink / raw)
  To: linux-kernel

Hi list,

AFAIK, in the initialization phase, the kernel builds page tables with two
mappings: an identity mapping and a PAGE_OFFSET + C mapping. The provisional
_global directory_ is contained in the swapper_pg_dir variable, while the
provisional _page tables_ are stored starting from pg0, right after _end.

There are a few things in the code (head.S) implementing the above that
confused me for a full day:

	movl $(pg0 - __PAGE_OFFSET), %edi
	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx	
	movl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */
10:
	leal 0x007(%edi),%ecx			/* Create PDE entry */

What does the address at a 7-byte displacement from %edi (the physical address
of pg0) represent? Why not just put the value of %edi (the address of the
page table to be mapped by swapper_pg_dir) in %ecx without a displacement?

        page_pde_offset = (__PAGE_OFFSET >> 20)
	movl %ecx,(%edx)			/* Store identity PDE entry */
	movl %ecx,page_pde_offset(%edx)		/* Store kernel PDE entry */

Why is page_pde_offset PAGE_OFFSET >> 20 instead of PAGE_OFFSET >> 22?
* I expected 22, to shift out the page offset bits (12) and the page table
  index bits (10).

        [...]
	/* Initialize the 1024 _page table_ cells with %eax (0x007) */
	movl $1024, %ecx
11:
	stosl
	addl $0x1000,%eax
	loop 11b

The page table entries beginning from pg0 (pointed to by %edi) and the following
pages are initialized with the series 7 + 8 + 8 + ... for each cell. This series
has the property of setting the PRESENT+RW+USER bits to 1 in all entries, but it
also sets the BASE address of many entries to 0. Why is this done?

Thanks,

-- 
Ahmed S. Darwish
HomePage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-01 20:38 [i386] Questions regarding provisional page tables initialization Ahmed S. Darwish
@ 2007-07-01 22:02 ` Andreas Schwab
  2007-07-01 22:19 ` Jeremy Fitzhardinge
  2007-07-02 10:43 ` Andi Kleen
  2 siblings, 0 replies; 9+ messages in thread
From: Andreas Schwab @ 2007-07-01 22:02 UTC (permalink / raw)
  To: Ahmed S. Darwish; +Cc: linux-kernel

"Ahmed S. Darwish" <darwish.07@gmail.com> writes:

>         page_pde_offset = (__PAGE_OFFSET >> 20)
> 	movl %ecx,(%edx)			/* Store identity PDE entry */
> 	movl %ecx,page_pde_offset(%edx)		/* Store kernel PDE entry */
>
> Why the pde_offset is PAGE_OFFSET >> 20 instead of PAGE_OFFSET >> 22 ?
> * 22 to right shift the whole page_shift (12) and pgdir_shift (10) bits.

4 is the element size.
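
To spell out the arithmetic, here is a minimal C sketch. It assumes the
default 3G/1G split (PAGE_OFFSET = 0xC0000000) and 4-byte directory
entries; the helper names pde_index and pde_byte_offset are made up for
illustration and do not appear in head.S.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u /* assumed: default i386 3G/1G split */
#define PGDIR_SHIFT 22          /* 12 page offset bits + 10 table index bits */
#define PDE_SIZE    4u          /* each page directory entry is 4 bytes */
#define PTRS_PER_PGD 1024u

/* Index of the page directory entry that covers a virtual address. */
static uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT;
}

/* Byte offset of that entry from the start of the directory. */
static uint32_t pde_byte_offset(uint32_t vaddr)
{
    uint32_t off = pde_index(vaddr) * PDE_SIZE;

    assert(off < PTRS_PER_PGD * PDE_SIZE); /* must stay inside the directory */
    return off;
}
```

pde_index(PAGE_OFFSET) is 0x300 (entry 768), and pde_byte_offset(PAGE_OFFSET)
is 0xC00, which is the same value as PAGE_OFFSET >> 20. The two forms only
agree because bits 20 and 21 of PAGE_OFFSET are zero.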

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-01 20:38 [i386] Questions regarding provisional page tables initialization Ahmed S. Darwish
  2007-07-01 22:02 ` Andreas Schwab
@ 2007-07-01 22:19 ` Jeremy Fitzhardinge
  2007-07-02  1:13   ` Ahmed S. Darwish
  2007-07-02 10:43 ` Andi Kleen
  2 siblings, 1 reply; 9+ messages in thread
From: Jeremy Fitzhardinge @ 2007-07-01 22:19 UTC (permalink / raw)
  To: Ahmed S. Darwish; +Cc: linux-kernel

Ahmed S. Darwish wrote:
> Hi list,
>
> AFAIK, in the initializaion phase, kernel builds pages tables with two
> mappings, identity and PAGE_OFFSET + C mapping. The provisional _global
> directory_ is contained in swapper_pg_dir variable. while the provisional 
> _page tables_ are stored starting from pg0, right after _end.
>
> There're some stuff that confused me for a full day about the code (head.S)
> that  accomplishes the above words:
>
> 	movl $(pg0 - __PAGE_OFFSET), %edi
> 	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx	
> 	movl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */
> 10:
> 	leal $0x007(%edi),%ecx			/* Create PDE entry */
>
> What does the address of 7 bytes displacement after %edi - the physical address
> of pg0 - represent ?. Why not just putting the address of %edi (the address of
> pagetable cell to be mapped by swapper_pg_dir) in %ecx without displacement?
>   

The pte format contains the pfn in the top 20 bits, and flags in the 
lower 12 bits.  As the comment says "0x007 = PRESENT+RW+USER".

>         page_pde_offset = (__PAGE_OFFSET >> 20)
> 	movl %ecx,(%edx)			/* Store identity PDE entry */
> 	movl %ecx,page_pde_offset(%edx)		/* Store kernel PDE entry */
>
> Why the pde_offset is PAGE_OFFSET >> 20 instead of PAGE_OFFSET >> 22 ?
> * 22 to right shift the whole page_shift (12) and pgdir_shift (10) bits.
>   

As Andreas said, it's (PAGE_OFFSET >> 22) << 2.

>         [...]
> 	/* Initialize the 1024 _page table_ cells with %eax (0x007) */
> 	movl $1024, %ecx
> 11:
> 	stosl
> 	addl $0x1000,%eax
> 	loop 11b
>
> The page table entries beginning from pg0 (pointed by %edi) and following pages 
> are initialized with  the series 7 + 8 + 8 + ... for each cell. This series has
> the property of setting the PRESENT+RW+USER bits in the whole entries to 1 but it
> sets lots of the entries BASE address to 0 too. Why is this done ?
>   

I don't follow you.  Are you overlooking the 'L' on stosl?
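
One way to check what the quoted loop writes is to model it in C. This is
only a sketch: fill_page_table is a made-up name, the array stands in for
the memory at (%edi), and the eax parameter stands in for the register.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_PTE 1024

/*
 * Model of:  movl $1024,%ecx ; 11: stosl ; addl $0x1000,%eax ; loop 11b
 * Each stosl stores the full 32-bit %eax and advances %edi by 4 bytes.
 */
static void fill_page_table(uint32_t table[PTRS_PER_PTE], uint32_t *eax)
{
    assert(table && eax);

    for (int i = 0; i < PTRS_PER_PTE; i++) {
        table[i] = *eax;  /* stosl: store %eax into the next 4-byte cell */
        *eax += 0x1000;   /* addl $0x1000,%eax: move on to the next 4 KiB frame */
    }
}
```

Starting from eax = 0x007, the cells come out as 0x00000007, 0x00001007,
0x00002007, and so on up to 0x003FF007. The step is 0x1000 (4096), not 8, so
every entry has a different frame address and the same three flag bits.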

    J


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-01 22:19 ` Jeremy Fitzhardinge
@ 2007-07-02  1:13   ` Ahmed S. Darwish
  2007-07-02  9:18     ` Andreas Schwab
  0 siblings, 1 reply; 9+ messages in thread
From: Ahmed S. Darwish @ 2007-07-02  1:13 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: linux-kernel

Hi Jeremy,

On Sun, Jul 01, 2007 at 03:19:30PM -0700, Jeremy Fitzhardinge wrote:
> Ahmed S. Darwish wrote:
> >Hi list,
> >
> >AFAIK, in the initializaion phase, kernel builds pages tables with two
> >mappings, identity and PAGE_OFFSET + C mapping. The provisional _global
> >directory_ is contained in swapper_pg_dir variable. while the provisional 
> >_page tables_ are stored starting from pg0, right after _end.
> >
> >There're some stuff that confused me for a full day about the code (head.S)
> >that  accomplishes the above words:
> >
> >	movl $(pg0 - __PAGE_OFFSET), %edi
> >	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx	
> >	movl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */
> >10:
> >	leal $0x007(%edi),%ecx			/* Create PDE entry */
> >
> >What does the address of 7 bytes displacement after %edi - the physical 
> >address
> >of pg0 - represent ?. Why not just putting the address of %edi (the 
> >address of
> >pagetable cell to be mapped by swapper_pg_dir) in %ecx without 
> >displacement?
> >  
> 
> The pte format contains the pfn in the top 20 bits, and flags in the 
> lower 12 bits.  As the comment says "0x007 = PRESENT+RW+USER".
> 

Yes, but isn't the displacement here (0x007) a _byte_ displacement? So
effectively, %ecx now contains the physical address of pg0 + 7 bytes. Is that
a meaningful place/address?

> >        page_pde_offset = (__PAGE_OFFSET >> 20)
> >	movl %ecx,(%edx)			/* Store identity PDE entry 
> >	*/
> >	movl %ecx,page_pde_offset(%edx)		/* Store kernel PDE entry */
> >
> >Why the pde_offset is PAGE_OFFSET >> 20 instead of PAGE_OFFSET >> 22 ?
> >* 22 to right shift the whole page_shift (12) and pgdir_shift (10) bits.
> >  
> 
> As Andreas said, its (PAGE_OFFSET >> 22) << 2.
> 

Great! Thanks a lot.

> >        [...]
> >	/* Initialize the 1024 _page table_ cells with %eax (0x007) */
> >	movl $1024, %ecx
> >11:
> >	stosl
> >	addl $0x1000,%eax
> >	loop 11b
> >
> >The page table entries beginning from pg0 (pointed by %edi) and following 
> >pages are initialized with  the series 7 + 8 + 8 + ... for each cell. This 
> >series has
> >the property of setting the PRESENT+RW+USER bits in the whole entries to 1 
> >but it
> >sets lots of the entries BASE address to 0 too. Why is this done ?
> >  
> 
> I don't follow you.  Are you overlooking the 'L' on stosl?
> 

Sorry for not making the question clear. My point was that the first entry in
the page table pointed to by (%edi) is initialized with %eax = 0x007, a
reasonable value (it sets the 3 pte flags). Beginning from entry 2, the entries
get initialized with "new %eax = old %eax + 8", generating a table of entries
initialized with 7, 15, 23, ... While this scheme keeps the PRESENT, RW and
USER flags set, it produces a lot of ptes with the same pfn. Hence my
question: why initialize pg0 that way?

Thanks,

-- 
Ahmed S. Darwish
HomePage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-02  1:13   ` Ahmed S. Darwish
@ 2007-07-02  9:18     ` Andreas Schwab
  2007-07-02 10:23       ` Ahmed S. Darwish
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Schwab @ 2007-07-02  9:18 UTC (permalink / raw)
  To: Ahmed S. Darwish; +Cc: Jeremy Fitzhardinge, linux-kernel

"Ahmed S. Darwish" <darwish.07@gmail.com> writes:

> yes, but isn't the displacement here (0x007) a _bytes_ displacement ?. so
> effectively, %ecx now contains physical address of pg0 + 7bytes. Is it A 
> meaningful place/address ?.

It's not pg0 + 7bytes, it is pg0 plus 3 flag bits.  Since a page address
is always page aligned, the low bits are reused for flags.
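
As a rough illustration in C (make_entry, entry_base and entry_flags are
names invented for this sketch, and the macro names are not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_MASK   0xFFFu  /* low 12 bits of an entry hold flags */
#define PTE_PRESENT 0x001u
#define PTE_RW      0x002u
#define PTE_USER    0x004u

/* Combine a page-aligned physical address with flag bits. */
static uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    assert((phys & FLAG_MASK) == 0);   /* address must be page aligned */
    assert((flags & ~FLAG_MASK) == 0); /* flags must fit in the low 12 bits */
    return phys + flags;               /* same result as phys | flags here */
}

/* Physical base address stored in the top 20 bits. */
static uint32_t entry_base(uint32_t entry)
{
    return entry & ~FLAG_MASK;
}

/* Flag bits stored in the low 12 bits. */
static uint32_t entry_flags(uint32_t entry)
{
    return entry & FLAG_MASK;
}
```

Because the low 12 bits of a page-aligned address are zero, adding 0x007
cannot carry into the address part, so the addition behaves like a bitwise
OR of the three flags.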

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-02  9:18     ` Andreas Schwab
@ 2007-07-02 10:23       ` Ahmed S. Darwish
  2007-07-02 11:34         ` Brian Gerst
  0 siblings, 1 reply; 9+ messages in thread
From: Ahmed S. Darwish @ 2007-07-02 10:23 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Jeremy Fitzhardinge, linux-kernel

Hi Andreas,

On Mon, Jul 02, 2007 at 11:18:08AM +0200, Andreas Schwab wrote:
> "Ahmed S. Darwish" <darwish.07@gmail.com> writes:
> 
> > yes, but isn't the displacement here (0x007) a _bytes_ displacement ?. so
> > effectively, %ecx now contains physical address of pg0 + 7bytes. Is it A 
> > meaningful place/address ?.
> 
> It's not pg0 + 7bytes, it is pg0 plus 3 flag bits.  Since a page address
> is always page aligned, the low bits are reused for flags.
> 

I'm sure there's a problem in _my_ understanding, but isn't the displacement,
as specified by AT&T syntax, expressed in bytes? I've written a small
assembly function to be sure:

.data
integer:
	.string "%d\n"

.text
test_func:
	push	%ebp
	mov	%esp, %ebp
	push	0x008(%ebp)   ## 8 bytes displacement (the first arg), right ?
	push	$integer
	call 	printf
	mov	%ebp, %esp
	pop	%ebp
	ret

The above function works fine and prints "5" to stdout when called from this code:

.global	main
main:
	mov	$5, %eax
	push	%eax
	call	test_func

	movl	$1, %eax
	movl	$0, %ebx
	int	$0x80

Now back to the head.S code:
	leal    0x007(%edi),%ecx	/* Create PDE entry */

Isn't the above line the same case (a displacement in bytes, not bits)?
Thanks for your patience!

(To the other kind replies: please don't get me wrong. I did my homework and
 studied the pte format before asking ;). It's just the bytes/bits issue
 above that confuses me.)

-- 
Ahmed S. Darwish
HomePage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-01 20:38 [i386] Questions regarding provisional page tables initialization Ahmed S. Darwish
  2007-07-01 22:02 ` Andreas Schwab
  2007-07-01 22:19 ` Jeremy Fitzhardinge
@ 2007-07-02 10:43 ` Andi Kleen
  2 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2007-07-02 10:43 UTC (permalink / raw)
  To: Ahmed S. Darwish; +Cc: linux-kernel

"Ahmed S. Darwish" <darwish.07@gmail.com> writes:
> 
> There're some stuff that confused me for a full day about the code (head.S)
> that  accomplishes the above words:
> 
> 	movl $(pg0 - __PAGE_OFFSET), %edi
> 	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx	
> 	movl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */
> 10:
> 	leal $0x007(%edi),%ecx			/* Create PDE entry */
> 
> What does the address of 7 bytes displacement after %edi - the physical address
> of pg0 - represent ?. 

The trick is not to know everything, but to know where to look ...

Page table entries use the low 12 bits for various flags. Take a look
at the Intel or AMD x86 documentation on their websites.

-Andi


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-02 10:23       ` Ahmed S. Darwish
@ 2007-07-02 11:34         ` Brian Gerst
  2007-07-02 15:43           ` Ahmed S. Darwish
  0 siblings, 1 reply; 9+ messages in thread
From: Brian Gerst @ 2007-07-02 11:34 UTC (permalink / raw)
  To: Ahmed S. Darwish; +Cc: Andreas Schwab, Jeremy Fitzhardinge, linux-kernel

Ahmed S. Darwish wrote:
> now back to head.S code:
>  	leal    0x007(%edi),%ecx	/* Create PDE entry */
> 
> Isn't the above line the same condition (bytes, not bits displacement) ?. 
> Thanks for your patience !.

The leal instruction (Load Effective Address) is often used as a way to
add a constant to one register and store the result in another register
in a single instruction.  The values don't even have to be addresses at
all, since no memory is actually referenced.  Otherwise this would be
written as:

	movl %edi,%ecx
	addl $0x007, %ecx
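
The difference from a memory operand can be modeled in C. This is just a
sketch with invented helper names; mem is a byte array standing in for
memory, and base and disp stand in for the register and the displacement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of "movl disp(%base),%dst" or "push disp(%base)":
 * the sum base + disp is used as an address and memory is read.
 */
static uint32_t model_movl_load(const uint8_t *mem, uint32_t base, uint32_t disp)
{
    uint32_t value;

    assert(mem);
    memcpy(&value, mem + base + disp, sizeof(value)); /* 32-bit memory read */
    return value;
}

/*
 * Model of "leal disp(%base),%dst":
 * the same sum is computed, but nothing is read from memory.
 */
static uint32_t model_leal(uint32_t base, uint32_t disp)
{
    return base + disp;
}
```

With base = 16 and disp = 8, model_movl_load returns whatever is stored at
byte 24 of mem, while model_leal simply returns 24. In both cases the
displacement is a plain number added to the register; only the load treats
the result as a location.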

--
				Brian Gerst


* Re: [i386] Questions regarding provisional page tables initialization
  2007-07-02 11:34         ` Brian Gerst
@ 2007-07-02 15:43           ` Ahmed S. Darwish
  0 siblings, 0 replies; 9+ messages in thread
From: Ahmed S. Darwish @ 2007-07-02 15:43 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Andreas Schwab, Jeremy Fitzhardinge, linux-kernel

On Mon, Jul 02, 2007 at 07:34:20AM -0400, Brian Gerst wrote:
> Ahmed S. Darwish wrote:
> > now back to head.S code:
> >  	leal    0x007(%edi),%ecx	/* Create PDE entry */
> > 
> > Isn't the above line the same condition (bytes, not bits displacement) ?. 
> > Thanks for your patience !.
> 
> The leal instruction (Load Effective Address) is often used as a way to
> add a constant to one register and store the result in another register
> in a single instruction.
> [...]

At last I got it, thanks to everybody's very nice explanations :).
Here's the provisional page table initialization scenario:

We want to store the physical address of each page table (starting at pg0) in
the high 20 bits of the swapper_pg_dir entries, and we also want the PRESENT,
RW and USER flags to be set in those entries. This is done by taking the
address of each page table and adding 7 to it (Brian's note). Since page
table addresses are page aligned (Andreas's and Jeremy's notes), adding 7
sets the low 3 bits without affecting the high 20 bits.
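
For anyone who wants to play with it, here is a small C model of the
scenario. It is a sketch under several assumptions: only the first page
table is modeled (head.S keeps going until enough memory is covered),
PG0_PHYS is a made-up physical address, PAGE_OFFSET is the default
0xC0000000, and the function names are mine, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u /* assumed: default 3G/1G split */
#define ENTRIES     1024
#define FLAGS       0x007u      /* PRESENT + RW + USER */
#define PG0_PHYS    0x00300000u /* hypothetical physical address of pg0 */

static uint32_t pgdir[ENTRIES]; /* stands in for swapper_pg_dir */
static uint32_t pg0[ENTRIES];   /* stands in for the first page table */

static void build_provisional_tables(void)
{
    uint32_t pde = PG0_PHYS + FLAGS;  /* leal 0x007(%edi),%ecx */
    uint32_t eax = FLAGS;

    pgdir[0] = pde;                   /* identity mapping */
    pgdir[PAGE_OFFSET >> 22] = pde;   /* kernel mapping, same page table */

    for (int i = 0; i < ENTRIES; i++) {
        pg0[i] = eax;                 /* stosl */
        eax += 0x1000;                /* next 4 KiB frame */
    }
}

/* Walk the two levels; returns 1 and fills *paddr if the address is mapped. */
static int translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t pde = pgdir[vaddr >> 22];
    uint32_t pte;

    if (!(pde & 0x001u))
        return 0;                            /* directory entry not present */
    assert((pde & ~0xFFFu) == PG0_PHYS);     /* this model has one table only */

    pte = pg0[(vaddr >> 12) & 0x3FFu];
    if (!(pte & 0x001u))
        return 0;                            /* page not present */

    *paddr = (pte & ~0xFFFu) | (vaddr & 0xFFFu);
    return 1;
}
```

After build_provisional_tables(), both 0x00001234 and 0xC0001234 translate
to physical 0x00001234, which is the pair of mappings (identity and
PAGE_OFFSET + C) from my first mail. An address such as 0x80000000 is not
mapped at all.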

Big Thanks!,

-- 
Ahmed S. Darwish
HomePage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com
