linux-kernel.vger.kernel.org archive mirror
* Re: What is the truth about Linux 2.4's RAM limitations?
@ 2001-07-10 18:12 Jesse Pollard
  2001-07-10 18:22 ` Jonathan Lundell
  2001-07-10 18:28 ` Brian Gerst
  0 siblings, 2 replies; 29+ messages in thread
From: Jesse Pollard @ 2001-07-10 18:12 UTC (permalink / raw)
  To: ttabi, linux-kernel

Timur Tabi <ttabi@interactivesi.com>:
> Jesse Pollard wrote:
> >>So what are the limits without using PAE? Here I'm still having a little
> >>problem finding definitive answers but ...
> >>
> >3 GB. Final answers are in the FAQ, and this has been discussed here
> >before. You can also look in the Intel 80x86 CPU specifications.
> >
> >The only way to exceed the current limits is via some form of
> >segment-register usage, which would require a different compiler and a
> >replacement of the memory architecture of the x86 Linux implementation.
> >
> 
> Are you talking about using 48-bit pointers?
> 
> (48-bit pointers, aka 16:32 pointers, on x86 are basically "far 32-bit 
> pointers".  That is, each pointer is stored as a 48-bit value, where 16 
> bits are for the selector/segment, and 32 bits are for the offset.)

That sounds right - I'm not yet fully familiar with the low-level Intel
x86 design. There is also (based on list email) a limit to how many page
tables can be active at once. Two would be desirable (one system, one
user), but the x86 design only supports one. This causes Linux (and maybe
others too) to split the 32-bit range into a 3G (user) and a 1G (system)
address range so that the page cache/CPU cache can work in a more optimal
manner. If the entire page table were given to a user, then a full cache
flush would have to be done on every context switch and system call. That
would be very slow, but would allow a full 4G address for the user.
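
For reference, the split shows up directly as a constant in the kernel
source; the following is simplified from 2.4's include/asm-i386/page.h
and is only a sketch:

    /* The kernel lives in the top 1GB of every process's address space. */
    #include <stdio.h>

    #define PAGE_OFFSET 0xC0000000UL                   /* the 3GB mark */
    #define __pa(x) ((unsigned long)(x) - PAGE_OFFSET) /* kernel virt -> phys */
    #define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET)) /* phys -> virt */

    int main(void)
    {
        /* Kernel virtual 0xC0100000 corresponds to physical 0x00100000. */
        printf("phys of 0xC0100000 = 0x%08lx\n", __pa(0xC0100000UL));
        return 0;
    }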

The use of 48-bit addresses has the same problem. Doing the remapping for
the segment + offset requires flushing the cache as well (the cache seems
to sit between the segment registers and the page tables - not sure, not
necessarily correct... I still have to get the new CPU specs...).
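
As far as I can tell from the Intel docs, the order is: segmentation
first produces a 32-bit linear address, then paging maps linear to
physical. A rough simulation of the two stages (illustrative only, with a
tiny fake page directory/table):

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t page_dir[1024];    /* page directory (PDEs) */
    static uint32_t page_table[1024];  /* one page table (PTEs) */

    static uint32_t translate(uint32_t seg_base, uint32_t offset)
    {
        uint32_t linear = seg_base + offset;        /* stage 1: segmentation */
        uint32_t pde = page_dir[linear >> 22];      /* top 10 bits pick the PDE */
        (void)pde;  /* in hardware the PDE names a page table; we have one */
        uint32_t pte = page_table[(linear >> 12) & 0x3FFu]; /* next 10 bits */
        return (pte & ~0xFFFu) | (linear & 0xFFFu); /* frame base + page offset */
    }

    int main(void)
    {
        page_table[0x100] = 0x00200007u; /* linear 0x100000 -> phys 0x200000 */
        printf("phys = 0x%08x\n", (unsigned)translate(0, 0x00100123u));
        return 0;
    }

Note that the page tables only ever see the linear address; the segment
base is applied before paging, so the translations being cached are for
linear addresses, not selectors.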

Anybody want to offer a full reference? Or a tutorial on Intel addressing
capability?
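
For concreteness, the 16:32 pointer Timur describes could be modeled
something like this (purely illustrative; not how any current compiler
actually lays out pointers):

    #include <stdint.h>
    #include <stdio.h>

    /* A 48-bit "far" pointer: 16-bit segment selector + 32-bit offset. */
    struct far_ptr {
        uint32_t offset;    /* offset within the segment */
        uint16_t selector;  /* selects the segment (and thus its base) */
    };

    int main(void)
    {
        struct far_ptr p = { 0x00001000u, 0x002Bu }; /* made-up values */
        printf("selector %#06x, offset %#010x\n", p.selector, p.offset);
        return 0;
    }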


-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: What is the truth about Linux 2.4's RAM limitations?
@ 2001-07-11  4:31 alad
  0 siblings, 0 replies; 29+ messages in thread
From: alad @ 2001-07-11  4:31 UTC (permalink / raw)
  To: Jesse Pollard; +Cc: root, Timur Tabi, linux-kernel

Jesse Pollard <pollard@tomcat.admin.navo.hpc.mil> on 07/11/2001 01:08:02 AM

To:   root@chaos.analogic.com, Timur Tabi <ttabi@interactivesi.com>
cc:   linux-kernel@vger.kernel.org (bcc: Amol Lad/HSS)

Subject:  Re: What is the truth about Linux 2.4's RAM limitations?

"Richard B. Johnson" <root@chaos.analogic.com>
...
> In Unix and Unix variants, it is by design that the kernel exists
> within every process's address space. Early 'nixes, like Ultrix,
> simply called the kernel entry point. Since it was protected, this
> trapped to the real kernel and the page-fault handler actually
> performed the work on behalf of the caller.
>
> Unlike some OSes (like VMS), a context switch does not occur
> when the kernel provides services for the calling task.
> Therefore, it was most reasonable to have the kernel exist within
> each task's address space. With modern processors, it doesn't make
> very much difference; you could have user space start at virtual
> address 0 and extend to virtual address 0xffffffff. However, this would
> not be Unix. It would also force the kernel to use additional
> CPU cycles when addressing a task's virtual address space,
> i.e., when data are copied between user and kernel space.

I believe the VAX/VMS implementation shared OS and user space:

     p0   - user application          0x00000000
     p1   - system shared libraries   0x3fffffff
     p2   - kernel                    0x7fffffff
          rest was I/O, cache memory  0xffffffff

It was a hardware design, not a function of the software.

UNIX's origins were on the PDP-11. There were two sets of addressing
registers: 1 kernel, 1 user (except on the 11/45: 1 kernel, 1 user, and
1 "executive", never used except in some really strange form of extended
shared library).

A full context switch was required. The kernel had to map a single 4KW
window into user space for access to the parameters. Another 4KW window
was used to map the I/O space. The remaining 6 mapping registers were used
to support the kernel virtual address space. BTW, 1 KW = 2K bytes; a
mapping register could map anything from 16 bytes to 8K bytes, if I
remember correctly. The PDP-11 with memory management only had 16 mapping
registers (8 user, 8 kernel) with a maximum address of 64K bytes (16-bit
addresses... my, how far we've come). The base hardware could only handle
a maximum of 256K bytes. More recent CPUs expanded the size of the mapping
registers (more bits/register) but did not increase the number of
registers. The last systems (PDP-11/70 level) could handle 4 MB of
physical memory, but with all of the restrictions of the smaller systems;
they just handled more processes.

It was not possible to share memory between kernel and user other than
through that one 4KW window. The Linux 3G/1G split is a design choice for
speed. It would still be Linux even if it did 4G/0, just with a different
MM architecture and a lot more overhead on Intel x86 hardware.
>>>>> Can you please explain exactly what the 'overhead' is, and why the
same overhead is not present with the 3G/1G split?

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: What is the truth about Linux 2.4's RAM limitations?
@ 2001-07-10 21:49 Jesse Pollard
  2001-07-10 22:07 ` Jonathan Lundell
  0 siblings, 1 reply; 29+ messages in thread
From: Jesse Pollard @ 2001-07-10 21:49 UTC (permalink / raw)
  To: cw, Brian Gerst; +Cc: Jesse Pollard, ttabi, linux-kernel


> On Tue, Jul 10, 2001 at 02:28:54PM -0400, Brian Gerst wrote:
> 
>     Jesse Pollard wrote:
> 
>         > If the entire page table were given to a user, then a full cache
>         > flush would have to be done on every context switch and system
>         > call. That would be very slow, but would allow a full 4G address
>         > for the user.
> 
>     A full cache flush would be needed at every entry into the kernel,
>     including hardware interrupts.  Very poor for performance.
> 
> Why would a cache flush be necessary at all? I assumed ia32 caches
> were physically, not virtually, mapped?

Because the entire virtual mapping is replaced by that of the kernel,
which would invalidate all of the cached virtual-to-physical translations.
It was also pointed out that this would have to be done for every
interrupt too.
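
(Strictly speaking, what gets invalidated wholesale on such a switch is
the set of cached virtual-to-physical translations, the TLB; on IA-32 the
flush happens as a side effect of reloading CR3. A kernel-mode sketch,
gcc inline asm, i386 only -- it will fault if run from user space:)

    /* Reloading CR3 flushes all (non-global) TLB entries; this is the
     * per-switch cost being discussed.  Must execute in ring 0. */
    static inline void load_page_tables(unsigned long pgdir_phys)
    {
        __asm__ __volatile__("movl %0, %%cr3" : : "r"(pgdir_phys) : "memory");
    }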

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: What is the truth about Linux 2.4's RAM limitations?
@ 2001-07-10 18:38 Jesse Pollard
  2001-07-10 19:14 ` Mark H. Wood
  0 siblings, 1 reply; 29+ messages in thread
From: Jesse Pollard @ 2001-07-10 18:38 UTC (permalink / raw)
  To: root, Timur Tabi; +Cc: linux-kernel

"Richard B. Johnson" <root@chaos.analogic.com>
...
> In Unix and Unix variants, it is by design that the kernel exists
> within every process's address space. Early 'nixes, like Ultrix,
> simply called the kernel entry point. Since it was protected, this
> trapped to the real kernel and the page-fault handler actually
> performed the work on behalf of the caller.
>
> Unlike some OSes (like VMS), a context switch does not occur
> when the kernel provides services for the calling task.
> Therefore, it was most reasonable to have the kernel exist within
> each task's address space. With modern processors, it doesn't make
> very much difference; you could have user space start at virtual
> address 0 and extend to virtual address 0xffffffff. However, this would
> not be Unix. It would also force the kernel to use additional
> CPU cycles when addressing a task's virtual address space,
> i.e., when data are copied between user and kernel space.

I believe the VAX/VMS implementation shared OS and user space:

	p0   - user application          0x00000000
	p1   - system shared libraries   0x3fffffff
	p2   - kernel                    0x7fffffff
	     rest was I/O, cache memory  0xffffffff

It was a hardware design, not a function of the software.

UNIX's origins were on the PDP-11. There were two sets of addressing
registers: 1 kernel, 1 user (except on the 11/45: 1 kernel, 1 user, and
1 "executive", never used except in some really strange form of extended
shared library).

A full context switch was required. The kernel had to map a single 4KW
window into user space for access to the parameters. Another 4KW window
was used to map the I/O space. The remaining 6 mapping registers were used
to support the kernel virtual address space. BTW, 1 KW = 2K bytes; a
mapping register could map anything from 16 bytes to 8K bytes, if I
remember correctly. The PDP-11 with memory management only had 16 mapping
registers (8 user, 8 kernel) with a maximum address of 64K bytes (16-bit
addresses... my, how far we've come). The base hardware could only handle
a maximum of 256K bytes. More recent CPUs expanded the size of the mapping
registers (more bits/register) but did not increase the number of
registers. The last systems (PDP-11/70 level) could handle 4 MB of
physical memory, but with all of the restrictions of the smaller systems;
they just handled more processes.

It was not possible to share memory between kernel and user other than
through that one 4KW window. The Linux 3G/1G split is a design choice for
speed. It would still be Linux even if it did 4G/0, just with a different
MM architecture and a lot more overhead on Intel x86 hardware.
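
To make the speed argument concrete: because user addresses stay mapped
below PAGE_OFFSET in the kernel's own page tables, copying user data
needs no remapping at all. A rough sketch (not the real copy_from_user):

    #include <stddef.h>

    #define PAGE_OFFSET 0xC0000000UL  /* i386 Linux 3G/1G boundary */

    /* With the shared 3G/1G layout the kernel can dereference user
     * pointers directly; a bounds check replaces any page-table switch
     * or flush.  Sketch only. */
    long sketch_copy_from_user(void *dst, const void *user_src, size_t n)
    {
        char *d = dst;
        const char *s = user_src;

        if ((unsigned long)user_src + n > PAGE_OFFSET)
            return -1;               /* would be -EFAULT for real */
        while (n--)
            *d++ = *s++;             /* plain loads, same address space */
        return 0;
    }

With a 4G/0 split, every such copy would first have to map the user page
somewhere the kernel could see it, or switch page tables and flush; that
is where the extra overhead would come from.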

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: What is the truth about Linux 2.4's RAM limitations?
@ 2001-07-09 21:29 Jesse Pollard
  2001-07-10 17:01 ` Timur Tabi
  0 siblings, 1 reply; 29+ messages in thread
From: Jesse Pollard @ 2001-07-09 21:29 UTC (permalink / raw)
  To: larry, linux-kernel

---------  Received message begins Here  ---------

> 
> 
> Where I just started work, we run large processes for simulation and
> testing of semiconductors.  Currently we use Solaris because of past
> limitations on the amount of RAM that a single process can address under
> Linux.  Recently we tried to run some tests on a Dell dual Xeon 1.7GHz
> box with 4GB of RAM running Redhat 7.1 (with the stock Redhat SMP
> kernel).  Speedwise it kicked the crap out of our Sunblade (dual 750MHz
> with 8GB of RAM), but we had problems with processes dying right around
> 2.3GB (according to top).
> 
> So I started to investigate, and quickly discovered that there is no good
> source for finding this sort of information online.  At least none that I
> could find.  Nearly every piece of information I found conflicted in at
> least some small way with another piece.  So I ask here in the hope of a
> definitive answer.
> 
>  * What is the maximum amount of RAM that a *single* process can address
>    under a 2.4 kernel, with PAE enabled?  Without?

3GB tops. PAE only allows more processes to be kept in physical memory.
A single process cannot under any circumstances address more than 4G of
virtual space, and due to other limits 3G is the practical maximum.
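
This is easy to probe empirically; a crude test like the following (a
sketch -- the chunk size and the deliberate leak are arbitrary) shows
where a single process tops out:

    #include <stdio.h>
    #include <stdlib.h>

    /* Allocate until malloc fails, to see how much address space one
     * process can actually get. */
    int main(void)
    {
        const size_t chunk = 64UL << 20;  /* 64MB at a time */
        size_t total = 0;

        while (malloc(chunk) != NULL)     /* leaked on purpose; just probing */
            total += chunk;
        printf("got about %lu MB before malloc failed\n",
               (unsigned long)(total >> 20));
        return 0;
    }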

>  * And, what (if any) parameters can affect this (recompiling the app
>    etc)?

You need a 64-bit CPU.

> What I think I know so far is listed below.  I welcome being flamed, told
> that I'm stupid and that I should have looked "here" so long as said
> messages also contain pointers to definitive information :-)
> 
> Linux 2.4 does support greater than 4GB of RAM, with these caveats ...
> 
>  * It does this by supporting Intel's PAE (Physical Address Extension)
>    features, which are in all Pentium Pro and newer CPUs.
>  * The PAE extensions allow up to a maximum of 64GB of RAM that the OS
>    (not a process) can address.
>  * It does this via indirect pointers to the higher memory locations, so
>    there is a CPU and RAM hit for using this.
>  * Benchmarks seem to indicate around a 3-6% CPU hit just for using the
>    PAE extensions (i.e. it applies regardless of whether you are actually
>    accessing memory locations greater than 4GB).
>  * If the kernel is compiled to use PAE, Linux will not boot on a computer
>    whose hardware doesn't support PAE.
>  * PAE does not increase Linux's ability for *single* processes to see
>    greater than 3GB of RAM (see below).
> 
> So what are the limits without using PAE? Here I'm still having a little
> problem finding definitive answers but ...

3 GB. Final answers are in the FAQ, and this has been discussed here
before. You can also look in the Intel 80x86 CPU specifications.

The only way to exceed the current limits is via some form of
segment-register usage, which would require a different compiler and a
replacement of the memory architecture of the x86 Linux implementation.

> 
>  * With PAE compiled into the kernel the OS can address a maximum of 4GB
>    of RAM.
>  * With 2.4 kernels (with a large memory configuration) a single process
>    can address up to the total amount of RAM in the machine minus 1GB
>    (reserved for the kernel), to a maximum of 3GB.
>  * By default the kernel reserves 1GB for its own use; however, I think
>    that this is a tunable parameter, so if we have 4GB of RAM in a box we
>    can tune it so that most of that should be available to the processes
>    (?).
> 
> I have documented the above information on my web site, and will post
> whatever answers I receive there:
> 
> 	http://www.spack.org/index.cgi/LinuxRamLimits

And it isn't really a Linux limit; it's the hardware. If you need more
virtual space, get a 64-bit processor.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

[parent not found: <Pine.LNX.4.32.0107091250170.25061-100000@maus.spack.org.suse.lists.linux.kernel>]
* What is the truth about Linux 2.4's RAM limitations?
@ 2001-07-09 20:01 Adam Shand
  2001-07-09 21:15 ` Brian Gerst
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Adam Shand @ 2001-07-09 20:01 UTC (permalink / raw)
  To: linux-kernel


Where I just started work, we run large processes for simulation and
testing of semiconductors.  Currently we use Solaris because of past
limitations on the amount of RAM that a single process can address under
Linux.  Recently we tried to run some tests on a Dell dual Xeon 1.7GHz
box with 4GB of RAM running Redhat 7.1 (with the stock Redhat SMP
kernel).  Speedwise it kicked the crap out of our Sunblade (dual 750MHz
with 8GB of RAM), but we had problems with processes dying right around
2.3GB (according to top).

So I started to investigate, and quickly discovered that there is no good
source for finding this sort of information online.  At least none that I
could find.  Nearly every piece of information I found conflicted in at
least some small way with another piece.  So I ask here in the hope of a
definitive answer.

 * What is the maximum amount of RAM that a *single* process can address
   under a 2.4 kernel, with PAE enabled?  Without?

 * And, what (if any) parameters can affect this (recompiling the app
   etc)?

What I think I know so far is listed below.  I welcome being flamed, told
that I'm stupid and that I should have looked "here" so long as said
messages also contain pointers to definitive information :-)

Linux 2.4 does support greater than 4GB of RAM, with these caveats ...

 * It does this by supporting Intel's PAE (Physical Address Extension)
   features, which are in all Pentium Pro and newer CPUs.
 * The PAE extensions allow up to a maximum of 64GB of RAM that the OS
   (not a process) can address.
 * It does this via indirect pointers to the higher memory locations, so
   there is a CPU and RAM hit for using this (see the sketch after this
   list).
 * Benchmarks seem to indicate around a 3-6% CPU hit just for using the
   PAE extensions (i.e. it applies regardless of whether you are actually
   accessing memory locations greater than 4GB).
 * If the kernel is compiled to use PAE, Linux will not boot on a computer
   whose hardware doesn't support PAE.
 * PAE does not increase Linux's ability for *single* processes to see
   greater than 3GB of RAM (see below).
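
A sketch of the mechanism as I understand it: under PAE the page-table
entries grow from 32 to 64 bits, so an entry can name a physical frame
above 4GB even though each process's linear addresses stay 32-bit. The
PTE value here is made up purely for illustration:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t pte = 0x0000000fc0000067ULL;         /* hypothetical PAE PTE */
        uint64_t frame = pte & 0x0000000ffffff000ULL; /* 36-bit frame base */
        printf("frame base = 0x%09llx (well above 4GB)\n",
               (unsigned long long)frame);
        return 0;
    }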

So what are the limits without using PAE? Here I'm still having a little
problem finding definitive answers but ...

 * With PAE compiled into the kernel the OS can address a maximum of 4GB
   of RAM.
 * With 2.4 kernels (with a large memory configuration) a single process
   can address up to the total amount of RAM in the machine minus 1GB
   (reserved for the kernel), to a maximum of 3GB.
 * By default the kernel reserves 1GB for its own use; however, I think
   that this is a tunable parameter, so if we have 4GB of RAM in a box we
   can tune it so that most of that should be available to the processes
   (?).

I have documented the above information on my web site, and will post
whatever answers I receive there:

	http://www.spack.org/index.cgi/LinuxRamLimits

Thanks,
Adam.


end of thread, other threads:[~2001-07-16  8:38 UTC | newest]

Thread overview: 29+ messages
2001-07-10 18:12 What is the truth about Linux 2.4's RAM limitations? Jesse Pollard
2001-07-10 18:22 ` Jonathan Lundell
2001-07-10 18:28 ` Brian Gerst
2001-07-10 18:43   ` Chris Wedgwood
2001-07-10 19:35     ` Brian Gerst
  -- strict thread matches above, loose matches on Subject: below --
2001-07-11  4:31 alad
2001-07-10 21:49 Jesse Pollard
2001-07-10 22:07 ` Jonathan Lundell
2001-07-10 18:38 Jesse Pollard
2001-07-10 19:14 ` Mark H. Wood
2001-07-09 21:29 Jesse Pollard
2001-07-10 17:01 ` Timur Tabi
     [not found] <Pine.LNX.4.32.0107091250170.25061-100000@maus.spack.org.suse.lists.linux.kernel>
2001-07-09 21:03 ` Andi Kleen
2001-07-09 20:01 Adam Shand
2001-07-09 21:15 ` Brian Gerst
2001-07-09 21:18 ` Rik van Riel
2001-07-09 22:17 ` Matti Aarnio
2001-07-10 13:49   ` Chris Wedgwood
2001-07-10 17:03     ` Timur Tabi
2001-07-10 17:35       ` Richard B. Johnson
2001-07-10 18:01         ` Timur Tabi
2001-07-10 18:08         ` Jonathan Lundell
2001-07-10 18:45           ` Richard B. Johnson
2001-07-10 19:26             ` Jonathan Lundell
2001-07-10 23:56             ` Jesse Pollard
2001-07-10 20:19         ` Malcolm Beattie
2001-07-10  3:01 ` jlnance
2001-07-10  3:29   ` Michael Bacarella
2001-07-16  8:37   ` Ingo Oeser
