linux-kernel.vger.kernel.org archive mirror
* Assignment of GDT entries
@ 2006-09-13 18:58 Jeremy Fitzhardinge
  2006-09-13 19:16 ` Arjan van de Ven
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2006-09-13 18:58 UTC
  To: Linus Torvalds, Ingo Molnar, Andi Kleen, Eric W. Biederman,
	Arjan van de Ven, Zachary Amsden
  Cc: Linux Kernel Mailing List, Michael A Fetterman

What's the rationale for the current assignment of GDT entries?  In 
particular, this section:

 *   0 - null
 *   1 - reserved
 *   2 - reserved
 *   3 - reserved
 *
 *   4 - unused			<==== new cacheline
 *   5 - unused
 *
 *  ------- start of TLS (Thread-Local Storage) segments:
 *
 *   6 - TLS segment #1			[ glibc's TLS segment ]
 *   7 - TLS segment #2			[ Wine's %fs Win32 segment ]
 *   8 - TLS segment #3
 *   9 - reserved
 *  10 - reserved
 *  11 - reserved


What are entries 1-3 and 9-11 reserved for?  Must they be unused for 
some reason, or is there some proposed use that has not been implemented yet?

Also, is there a particular reason kernel GDT entries start at 12?  
Would there be a problem in using either 4 or 5 for a kernel GDT descriptor?

I'm asking because I'd like to use one of these entries for the PDA 
descriptor, so that it is on the same cache line as the TLS 
descriptors.  That way, the entry/exit segment register reloads would 
still only need to touch two GDT cache lines.  Would there be a real 
problem in doing this?
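
To make the cache-line argument concrete, here is a minimal sketch in C. It assumes 8-byte descriptors (as on i386), a GDT whose base is cache-line aligned, and a cache-line size passed in as a parameter. The helper names are illustrative only and are not kernel identifiers.

```c
#include <assert.h>

/* Size of one i386 segment descriptor in bytes. */
#define GDT_DESC_SIZE 8u

/*
 * Return the number of the cache line holding GDT entry `entry`,
 * assuming the GDT base is aligned to `line_size` bytes.
 * (Hypothetical helper, for illustration only.)
 */
unsigned gdt_cache_line(unsigned entry, unsigned line_size)
{
    assert(line_size >= GDT_DESC_SIZE && line_size % GDT_DESC_SIZE == 0);
    return (entry * GDT_DESC_SIZE) / line_size;
}

/* Nonzero if two GDT entries share a cache line under the same assumptions. */
int gdt_same_line(unsigned a, unsigned b, unsigned line_size)
{
    return gdt_cache_line(a, line_size) == gdt_cache_line(b, line_size);
}
```

If the line size is 32 bytes, which is what the "new cacheline" marker at entry 4 suggests, entries 4 through 7 share a line. A PDA descriptor at 4 or 5 would then sit alongside TLS segments #1 and #2, while the kernel segments starting at 12 fall on a different line.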

Thanks,
    J


* Re: Assignment of GDT entries
@ 2006-09-14  3:23 Albert Cahalan
  2006-09-14  6:11 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Albert Cahalan @ 2006-09-14  3:23 UTC
  To: torvalds, jeremy, mingo, ak, ebiederm, arjan, zach, linux-kernel

Linus Torvalds writes:
> On Wed, 13 Sep 2006, Jeremy Fitzhardinge wrote:

>> So does this mean that moving the user-visible cs/ds isn't
>> likely to break stuff, if it has been done before?
>
> Yes. I _think_ we could do it. It's been done before, and nobody noticed.
>
> That said, it may actually be that programs have since become much more
> aware of segments, for a rather perverse reason: the TLS stuff. Old
> programs are all very much coded and compiled for a totally flat model,
> and as such they really don't know _anything_ about segments. But with
> more TLS stuff, it's possible that a modern threaded program is at least
> aware of _some_ of it.

We actually have an ABI problem right now because of this.
Note that i386 and x86_64 use different GDT slots.

As far as I can tell, users need to hard-code the mapping
from TLS slot to segment number. They use 0,1,2 to ask the
kernel to set things up (via set_thread_area), but can't
just pop that into %fs or %gs.

So a 32-bit app using set_thread_area can work on i386 or x86_64,
but not both. I guess glibc gets %gs set up for free via clone() with
the right flags, and thus does not need to detect which kernel is running.
For anything involving set_thread_area though, it gets nasty.
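
For reference, the value loaded into %fs or %gs is a segment selector, not a bare GDT index: the index occupies bits 3 and up, bit 2 selects the GDT (0) or the LDT (1), and bits 0-1 hold the requested privilege level. The sketch below shows why the same TLS slot yields different selectors when the GDT index differs. The helper names are illustrative, and the example indices (6 on i386, 12 on x86-64) are the ones quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_RPL_USER 3u   /* requested privilege level for user space */
#define SEL_TI_GDT   0u   /* table indicator: 0 = GDT, 1 = LDT */

/* Build a user-mode GDT selector from a GDT entry index. */
uint16_t gdt_selector(unsigned gdt_index)
{
    assert(gdt_index < 8192u);   /* the index field is 13 bits wide */
    return (uint16_t)((gdt_index << 3) | (SEL_TI_GDT << 2) | SEL_RPL_USER);
}

/* Recover the GDT entry index from a selector. */
unsigned selector_index(uint16_t selector)
{
    return (unsigned)selector >> 3;
}
```

GDT index 6 gives selector 0x33 and index 12 gives 0x63, so a selector hard-coded for one layout is wrong on the other.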

Typical hacks that result from this:

call uname() and look for "x86_64"
see if the addresses of local variables exceed 0xbfffffff
examine /proc/1/maps
check for a /lib64 directory
change SSE register 8 in a signal handler frame and see if it sticks
checksum the vdso code
...

Please save us from these foul hacks.

* Re: Assignment of GDT entries
@ 2006-09-14  4:06 Albert Cahalan
  2006-09-14  4:44 ` Eric W. Biederman
  0 siblings, 1 reply; 32+ messages in thread
From: Albert Cahalan @ 2006-09-14  4:06 UTC
  To: torvalds, jeremy, mingo, ak, ebiederm, arjan, zach, linux-kernel

Jeremy Fitzhardinge writes:
> Zachary Amsden wrote:

>> I believe 9,10,11 are reserved for future users like yourself or
>> expanded TLS segments.  I think a bank of 3 TLS segments in the
>> GDT is working fine now (does NPTL even use more than one?).
>
> Nope.  And there's a comment that wine uses one more.  I think
> the third is completely unused.

I use the third. The sucky thing is that I need to determine if
the kernel is 64-bit to know what I must load into the segment
register. Fortunately this code is not yet out in the wild, so
you can still fix the ABI situation for me at least.

>>> Otherwise line 1 would be ideal for putting 3 TLS, kernel+user
>>> code+data and PDA into, thereby making 99.999% of GDT descriptor
>>> uses come from one cache line.
>>
>> That change is visible to userspace, unfortunately.
>
> Don't think it matters much.  32-bit processes on x86-64 seem
> perfectly happy with the TLS being in a different place.

Heh. I wish. Well, OK, but only because I detect the kernel!

> I think the ABI is defined in terms of "use the selector for
> the entry that set_thread_area/clone returns", and so is not
> a constant.  But I agree it would be better not to.
>
> Hm, moving user cs/ds would be pretty visible too... Hm, and
> it would have a greater chance of breaking stuff if they changed,
> compared to moving the TLS...

I think that would be a lower chance, not a greater chance.
Reasons why an app might care:

a. identify a 64-bit kernel
b. far jumps between 32-bit and 64-bit code
c. reload of ds/es after a string operation on thread-private data

Perhaps i386 should change to match x86_64.

* Re: Assignment of GDT entries
@ 2006-09-15  7:55 Mikael Pettersson
  2006-09-15  8:20 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Mikael Pettersson @ 2006-09-15  7:55 UTC
  To: acahalan, jeremy; +Cc: ak, arjan, ebiederm, linux-kernel, mingo, torvalds, zach

On Wed, 13 Sep 2006 23:11:05 -0700, Jeremy Fitzhardinge wrote:
>Albert Cahalan wrote:
>> We actually have an ABI problem right now because of this.
>> Note that i386 and x86_64 use different GDT slots.
>>
>> As far as I can tell, users need to hard-code the mapping
>> from TLS slot to segment number. They use 0,1,2 to ask the
>> kernel to set things up (via set_thread_area), but can't
>> just pop that into %fs or %gs.
>
>That's not true at all.  The program I posted earlier in this thread 
>uses set_thread_area() to allocate a GDT slot, and it works on both 
>native 32 bit and 32-under-64.

The i386 TLS API has three components:

(1) set_thread_area(entry_number == -1):
    allocates and sets up the first available TLS entry and
    copies the chosen GDT index back to user-space
(2) set_thread_area(6 <= entry_number && entry_number <= 8):
    allocates and sets up the indicated GDT entry
(3) get_thread_area(6 <= entry_number && entry_number <= 8):
    retrieves the contents of the indicated GDT entry

Only (1) works in x86-64's ia32 emulation, the other two fail
with EINVAL because x86-64 only accepts GDT indices 12 to 14
for TLS entries. glibc only uses (1).
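
The asymmetry can be modelled in a few lines of C. This is a user-space simulation of the entry_number checks described above, not kernel code. The struct, the function name and the tls_min field are hypothetical, with tls_min set to 6 for native i386 and 12 for x86-64's ia32 emulation. For brevity the model returns the chosen GDT index directly, whereas the real syscall returns 0 and writes the index back into the caller's descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TLS_ENTRIES 3   /* three TLS slots per thread */

/* Hypothetical model of one thread's TLS slot state. */
struct tls_model {
    int tls_min;                /* first GDT index used for TLS */
    int in_use[TLS_ENTRIES];    /* nonzero if the slot is allocated */
};

/*
 * Model of set_thread_area(): entry_number == -1 picks the first free
 * slot; otherwise the caller must name a GDT index in the TLS range.
 * Returns the GDT index on success, or a negative errno value.
 */
int model_set_thread_area(struct tls_model *t, int entry_number)
{
    int slot;

    assert(t != NULL);
    if (entry_number == -1) {
        for (slot = 0; slot < TLS_ENTRIES; slot++)
            if (!t->in_use[slot])
                break;
        if (slot == TLS_ENTRIES)
            return -ESRCH;      /* model's choice: no free slot */
    } else {
        slot = entry_number - t->tls_min;
        if (slot < 0 || slot >= TLS_ENTRIES)
            return -EINVAL;     /* index outside this kernel's TLS range */
    }
    t->in_use[slot] = 1;
    return t->tls_min + slot;
}
```

With tls_min = 6 a request for entry 7 succeeds, while the same request with tls_min = 12 fails with EINVAL. A request with -1 succeeds in both cases, which is why component (1) is the only portable one.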

If you move the i386 TLS GDT entries to other indices then you
break (2) and (3) also on i386.

It's not difficult to design a better i386 TLS API that avoids
requiring user-space to know the actual GDT indices (just use
logical TLS indices and always copy the GDT index to user-space),
but unfortunately that doesn't help us now because the TLS GDT
indices must remain fixed as long as the current API is supported.
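
A sketch of what such an interface could look like, assuming a hypothetical call that takes a logical TLS index (0 to 2) and always reports the resulting GDT index and selector back to the caller. None of these names exist in the kernel; tls_min stands for whatever GDT index the running kernel uses for its first TLS entry.

```c
#include <assert.h>
#include <stddef.h>

#define TLS_SLOTS 3

/* Hypothetical result handed back to user space. */
struct tls_result {
    int gdt_index;          /* where the kernel actually put the entry */
    unsigned selector;      /* value to load into %fs or %gs */
};

/*
 * Map a logical TLS index onto this kernel's GDT layout.
 * Returns 0 on success, -1 if the logical index is out of range.
 */
int tls_resolve(int logical, int tls_min, struct tls_result *out)
{
    assert(out != NULL);
    if (logical < 0 || logical >= TLS_SLOTS)
        return -1;
    out->gdt_index = tls_min + logical;
    out->selector = ((unsigned)out->gdt_index << 3) | 3u;  /* GDT, RPL 3 */
    return 0;
}
```

The caller never hard-codes a GDT index: logical slot 2 resolves to index 8 when tls_min is 6 and to index 14 when tls_min is 12, and the program simply loads whichever selector comes back.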

I _personally_ could certainly handle a post-2.6.18 kernel where
the improved API (new syscalls) is in place, the GDT indices have
been moved, and consequently components (2) and (3) of the old API
are broken. However, this still implies breaking binary compatibility,
which is not something to be done lightly.

(What's _really_ sad is that the implementation of the i386 TLS API
internally operates on logical TLS indices, it's just the syscall
interface that insists on requiring actual GDT indices from user-space.)

/Mikael


end of thread, other threads:[~2006-09-15 18:27 UTC | newest]

Thread overview: 32+ messages
2006-09-13 18:58 Assignment of GDT entries Jeremy Fitzhardinge
2006-09-13 19:16 ` Arjan van de Ven
2006-09-13 20:00   ` Alan Cox
2006-09-13 20:02     ` Jeremy Fitzhardinge
2006-09-13 20:20   ` Jeremy Fitzhardinge
2006-09-13 20:59     ` Zachary Amsden
2006-09-13 21:15       ` Jeremy Fitzhardinge
2006-09-13 21:35       ` Alan Cox
2006-09-14  0:25         ` Zachary Amsden
2006-09-14  1:40           ` Stephen Rothwell
2006-09-14 13:03           ` Alan Cox
2006-09-13 19:55 ` linux-os (Dick Johnson)
2006-09-13 20:08   ` Jeremy Fitzhardinge
2006-09-13 20:32     ` linux-os (Dick Johnson)
2006-09-13 21:21 ` Linus Torvalds
2006-09-13 21:47   ` Jeremy Fitzhardinge
2006-09-13 22:05     ` Linus Torvalds
2006-09-13 22:22       ` Jeremy Fitzhardinge
2006-09-14  6:00 ` Andi Kleen
2006-09-14  3:23 Albert Cahalan
2006-09-14  6:11 ` Jeremy Fitzhardinge
2006-09-14  4:06 Albert Cahalan
2006-09-14  4:44 ` Eric W. Biederman
2006-09-14  6:19   ` Albert Cahalan
2006-09-14  6:28     ` Zachary Amsden
2006-09-14  7:12       ` Albert Cahalan
2006-09-14  7:24         ` Zachary Amsden
2006-09-14  6:29     ` Jeremy Fitzhardinge
2006-09-15  7:55 Mikael Pettersson
2006-09-15  8:20 ` Jeremy Fitzhardinge
2006-09-15  8:58   ` Mikael Pettersson
2006-09-15 18:27     ` Jeremy Fitzhardinge
