* [RFC] VM: I have a dream...
@ 2006-01-21 18:08 Al Boldi
  2006-01-21 18:42 ` Jamie Lokier
                   ` (4 more replies)
  0 siblings, 5 replies; 75+ messages in thread
From: Al Boldi @ 2006-01-21 18:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel

A long time ago, when I was a kid, I had a dream.  It went like this:

I am waking up in the twenty-first century and start my computer.
After completing the boot sequence, I start top to find that my memory
is equal to total disk capacity.  What's more, there is no more swap.
Apps are executed in place, as if already loaded.
Physical RAM is used to cache slower storage RAM, much the same as the
CPU cache RAM caches slower physical RAM.

When I woke up, I was really looking forward to the new century.

Sadly, the current way of dealing with memory can at best be described 
as schizophrenic.  The reason, again, is that we are still running in 
last-century mode.

Wouldn't it be nice to take advantage of today's 64bit archs and TB drives, 
and run a more modern way of life w/o this memory/storage split personality?

All comments, other than "dream on", are most welcome!

Thanks!

--
Al


* Re: [RFC] VM: I have a dream...
@ 2006-02-01 13:58 Al Boldi
  2006-02-01 14:38 ` Jamie Lokier
  0 siblings, 1 reply; 75+ messages in thread
From: Al Boldi @ 2006-02-01 13:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel

Thanks for your detailed responses!

Kyle Moffett wrote:
> BTW, unless you have a patch or something to propose, let's take this
> off-list, it's getting kind of OT now.

No patches yet, but even if there were, would they get accepted?

> On Jan 31, 2006, at 10:56, Al Boldi wrote:
> > Kyle Moffett wrote:
> >> Is it necessarily faulty?  It seems to me that the current way
> >> works pretty well so far, and unless you can prove a really strong
> >> point the other way, there's no point in changing.  You have to
> >> remember that change introduces bugs which then have to be located
> >> and removed again, so change is not necessarily cheap.
> >
> > Faulty, because we are currently running a legacy solution to
> > workaround an 8,16,(32) arch bits address space limitation, which
> > does not exist in 64bits+ archs for most purposes.
>
> There are a lot of reasons for paging, only _one_ of them is/was to
> deal with too-small address spaces.  Other reasons are that sometimes
> you really _do_ want a nonlinear mapping of data/files/libs/etc.  It
> also allows easy remapping of IO space or video RAM into application
> address spaces, etc.  If you have a direct linear mapping from
> storage into RAM, common non-linear mappings become _extremely_
> complex and CPU-intensive.
>
> Besides, you never did address the issue of large changes causing
> large bugs.  Any large change needs to have advantages proportional
> to the bugs it will cause, and you have not yet proven this case.

How could reverting a workaround introduce large bugs?

> > Trying to defend the current way would be similar to rejecting the
> > move from  16bit to 32bit. Do you remember that time?  One of the
> > arguments used was:  the current way works pretty well so far.
>
> Arbitrary analogies do not prove things.

Analogies are there to make a long story short.

> Can you cite examples that
> clearly indicate how paged-memory is to direct-linear-mapping as 16-
> bit processors are to 32-bit processors?

I mentioned this in a previous message.

> > There is a lot to gain, for one there is no more swapping w/ all
> > its related side-effects.
>
> This is *NOT* true.  When you have more data than RAM, you have to
> put data on disk, which means swapping, regardless of the method in
> which it is done.
>
> > You're dealing with memory only.  You can also run your fs inside
> > memory, like tmpfs, which is definitely faster.
>
> Not on Linux.  We have a whole unique dcache system precisely so that
> a frequently accessed filesystem _is_ as fast as tmpfs (Unless you're
> writing and syncing a lot, in which case you still need to wait for
> disk hardware to commit data).

This is true, and may very well explain why dcache is so CPU intensive.

> > And there may be lots of other advantages, due to the simplified
> > architecture applied.
>
> Can you describe in detail your "simplified architecture"?? I can't
> see any significant complexity advantages over the standard paging
> model that Linux has.
>
> >>> Why would you think that the shortest path between two points is
> >>> complicated, when you have the ability to fly?
> >>
> >> Bad analogy.
> >
> > If you didn't understand its meaning: the shortest path means
> > accessing hw w/o running workarounds; using 64bits+ to fly over
> > past limitations.
>
> This makes *NO* technical sense and is uselessly vague.  Applying
> vague indirect analogies to technical topics is a fruitless
> endeavor.  Please provide technical points and reasons why it _is_
> indeed shorter/better/faster, and then you can still leave out the
> analogy because the technical argument is sufficient.
>
> >>>> But unless the stumbling block since 1980 has been that it was too
> >>>> hard to get/make a CPU with a 64 bit address space, I don't see
> >>>> what's different today.
> >>>
> >>> You are hitting the nail right on its head here. Nothing moves the
> >>> masses like mass-production.
> >>
> >> Uhh, no, you misread his argument: If there were other reasons that
> >> this was not done in the past than lack of 64-bit CPUS, then this is
> >> probably still not practical/feasible/desirable.
> >
> > Uhh?
> > The point here is: Even if there were 64bit archs available in the
> > past, this did not mean that moving into native 64bits would be
> > commercially viable, due to its unavailability on the mass-market.
>
> Are you even reading these messages?

Bryan Henderson wrote:
> >1) IF the ONLY reason this was not done before is that 64-bit archs
> >were hard to get, then you are right.
> >
> >2) IF there were OTHER reasons, then you are not correct.
> >
> >This is the argument.  You keep discussing how 64-bit archs were not
> >easily available before and are now, and I AGREE, but that is NOT
> >RELEVANT to the point he made.
>
> As I remember it, my argument was that single level storage was known and
> practical for 25 years and people did not flock to it, therefore they must
> not see it as useful.  So if 64 bit processors were not available enough
> during that time, that blows away my argument, because people might have
> liked the idea but just couldn't afford the necessary address width.  It
> doesn't matter if there were other reasons to shun the technology; all it
> takes is one.  And if 64 bit processors are more available today, that
> might tip the balance in favor of making the change away from multilevel
> storage.

Thanks for clarifying this!

> But I don't really buy that 64 bit processors weren't available until
> recently.  I think they weren't produced in commodity fashion because
> people didn't have a need for them.  They saw what you can do with 128 bit
> addresses (i.e. single level storage) in the IBM iSeries line, but
> weren't impressed.  People added lots of other new technology to the
> mainstream CPU lines, but not additional address bits.  Not until they
> wanted to address more than 4G of main memory at a time did they see any
> reason to make 64 bit processors in volume.

True, so with 64 bits covering some 16 million TB, what reason would 
there be to stick with a swapped memory model?
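
To put numbers on that figure (my arithmetic, purely for illustration):

	2^32 bytes =  4 GiB                    (the old ceiling)
	2^64 bytes = 16 EiB = 2^24 TiB         (~16 million TiB)

A 64-bit address can cover every byte of every disk on the market 
today, with room to spare for decades of disk growth.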

Jamie Lokier wrote:
> Al Boldi wrote:
> > There is a lot to gain, for one there is no more swapping w/ all its
> > related side-effects.  You're dealing with memory only.
>
> I'm sorry, I think I don't understand.  My weakness.  Can you please
> explain?
>
> Presumably you will want access to more data than you have RAM,
> because RAM is still limited to a few GB these days, whereas a typical
> personal data store is a few 100s of GB.
>
> 64-bit architecture doesn't change this mismatch.  So how do you
> propose to avoid swapping to/from a disk, with all the time delays and
> I/O scheduling algorithms that needs?

This is exactly what a linear-mapped memory model avoids.
Everything is already mapped into memory/disk.
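
As a rough sketch of what I mean (illustrative only, not a proposal for 
kernel code; "datafile" is just a placeholder path): mmap() already 
gives one process a demand-paged linear view of one file, with physical 
RAM acting as the cache for the backing store.  A linear-mapped model 
would make that view the system-wide default:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	struct stat st;
	char *p;
	int fd;

	fd = open("datafile", O_RDONLY);	/* hypothetical file */
	if (fd < 0 || fstat(fd, &st) < 0) {
		perror("datafile");
		return 1;
	}

	/* One flat mapping of the whole file; pages fault in on demand. */
	p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Touching p[i] pulls in only the pages actually used. */
	printf("first byte: %d\n", p[0]);

	munmap(p, st.st_size);
	close(fd);
	return 0;
}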

Lennart Sorensen wrote:
> Of course there is swapping.  The cpu only executes things from physical
> memory, so at some point you have to load stuff from disk to physical
> memory.  That seems amazingly much like the definition of swapping too.
> Sometimes you call it loading.  Not much difference really.  If
> something else is occupying physical memory so there isn't room, it has
> to be put somewhere, which if it is just caching some physical disk
> space, you just dump it, but if it is some giant chunk of data you are
> currently generating, then it needs to go to some other place that
> handles temporary data that doesn't already have a place in the
> filesystem.  Unless you have infinite physical memory, at some point you
> will have to move temporary data from physical memory to somewhere else.
> That is swapping no matter how you view the system's address space.
> Making it be called something else doesn't change the facts.

Would you call reading and writing to memory/disk swapping?

> Applications don't currently care if they are swapped to disk or in
> physical memory.  That is handled by the OS and is transparent to the
> application.

Yes, a linear-mapped memory model extends this transparency to the OS.

> > If you didn't understand its meaning: the shortest path means
> > accessing hw w/o running workarounds; using 64bits+ to fly over past
> > limitations.
>
> The OS still has to map the address space to where it physically exists.
> Mapping all disk space into the address space may actually be a lot less
> efficient than using the filesystem interface for a block device.

Did you try tmpfs?

> > Uhh?
> > The point here is: Even if there were 64bit archs available in the past,
> > this did not mean that moving into native 64bits would be commercially
> > viable, due to its unavailability on the mass-market.
> >
> > So with 64bits widely available now, and to let Linux spread its wings
> > and really fly, how could tmpfs merged w/ swap be tweaked to provide
> > direct mapped access into this linear address space?
>
> Applications can mmap files if they want to.  Your idea seems likely to
> make the OS much more complex, and waste a lot of resources on mapping
> disk space to the address space, and from the applications point of view
> it doesn't seem to make any difference at all.  It might be a fun idea
> for some academic research OS somewhere to go work out the kinks and see
> if it has any efficiency at all in real use.  Given Linux runs on lots
> of architectures, trying to make it work completely differently on 64bit
> systems doesn't make that much sense really, especially when there is no
> apparent benefit to the change.

Arch bits have nothing to do with a linear-mapped memory model; they only 
limit its usefulness.  With 8,16,(32) bits this linear-mapped model isn't 
really viable because of its address-space limit, but with a 64bit+ arch 
the limits are wide enough to make it viable.  A 32bit arch is in between, 
so for some a 4GB limit may be acceptable.
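
To make the address-width point concrete, here is a sketch (illustrative 
only; /dev/sda stands in for any disk and needs read permission): mapping 
a whole disk as one flat byte array only works when the disk fits in the 
virtual address space.  On 32 bits anything past 4GB cannot even be 
expressed; on 64 bits the window is effectively unlimited:

#include <fcntl.h>
#include <linux/fs.h>		/* BLKGETSIZE64 */
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	unsigned char *disk;
	uint64_t bytes;
	int fd;

	fd = open("/dev/sda", O_RDONLY);	/* any block device */
	if (fd < 0 || ioctl(fd, BLKGETSIZE64, &bytes) < 0) {
		perror("/dev/sda");
		return 1;
	}

	/* This check is the 8,16,(32) limit in one line of code. */
	if (bytes > SIZE_MAX) {
		fprintf(stderr, "disk does not fit this address space\n");
		return 1;
	}

	/* The whole disk as one linear, demand-paged mapping. */
	disk = mmap(NULL, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
	if (disk == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	printf("mapped %llu bytes at %p\n",
	       (unsigned long long)bytes, (void *)disk);
	munmap(disk, bytes);
	return 0;
}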

Barry K. Nathan wrote:
> On 1/31/06, Al Boldi <a1426z@gawab.com> wrote:
> > Faulty, because we are currently running a legacy solution to workaround
> > an 8,16,(32) arch bits address space limitation, which does not exist in
> > 64bits+ archs for most purposes.
>
> In the early 1990's (and maybe even the mid 90's), the typical hard
> disk's storage could theoretically be byte-addressed using 32-bit
> addresses -- just as (if I understand you correctly) you are arguing
> that today's hard disks can be byte-addressed using 64-bit addresses.
>
> If this was going to be practical ever (on commodity hardware anyway),
> I would have expected someone to try it on a 32-bit PC or Mac when
> hard drives were in the 100MB-3GB range... That suggests to me that
> there's a more fundamental reason (i.e. other than lack of address
> space) that caused people to stick with the current scheme.

32bits is in brackets - 8,16,(32) - to highlight that it's an in-between case.

> tmpfs isn't "definitely faster". Remember those benchmarks where Linux
> ext2 beat Solaris tmpfs?

Linux tmpfs is faster because it can short-circuit the dcache, in effect 
doing an O_SYNC.  It slows down when swapping kicks in.

> Also, the only way I see where "there is no more swapping" and
> "[y]ou're dealing with memory only" is if the disk *becomes* main
> memory, and main memory becomes an L3 (or L4) cache for the CPU [and
> as a consequence, main memory also becomes the main form of long-term
> storage]. Is that what you're proposing?

In the long term, yes, maybe even move it into hardware.  But in the 
short term there is no need to blow things out of proportion; a simple 
tweak of tmpfs merged w/ swap may do the trick quickly and easily.

> If so, then it actually makes *less* sense to me than before -- with
> your scheme, you've reduced the speed of main memory by 100x or more,
> then you try to compensate with a huge cache. IOW, you've reduced the
> speed of *main* memory to (more or less) the speed of today's swap!
> Suddenly it doesn't sound so good anymore...

There really isn't anything new here; we swap and access the fs on disk 
and compensate with a huge dcache now.  All this idea implies is removing 
certain barriers that could not easily be passed before, thus moving swap 
and the fs into main memory.

Can you see how removing barriers would aid performance?

Thanks!

--
Al




Thread overview: 75+ messages
2006-01-21 18:08 [RFC] VM: I have a dream Al Boldi
2006-01-21 18:42 ` Jamie Lokier
2006-01-21 18:46 ` Avi Kivity
2006-01-23 19:52   ` Bryan Henderson
2006-01-25 22:04     ` Al Boldi
2006-01-26 19:18       ` Bryan Henderson
2006-01-27 16:12         ` Al Boldi
2006-01-27 19:17           ` Bryan Henderson
2006-01-30 13:21             ` Al Boldi
2006-01-30 13:35               ` Kyle Moffett
2006-01-31 15:56                 ` Al Boldi
2006-01-31 16:34                   ` Kyle Moffett
2006-01-31 23:14                     ` Bryan Henderson
2006-01-31 16:34                   ` Lennart Sorensen
2006-01-31 19:23                   ` Jamie Lokier
2006-02-01  4:06                   ` Barry K. Nathan
2006-02-01  9:51                     ` Andrew Walrond
2006-02-01 17:51                       ` Lennart Sorensen
2006-02-01 18:21                         ` Andrew Walrond
2006-02-01 18:25                           ` Lennart Sorensen
2006-02-02 15:11                   ` Alan Cox
2006-02-02 18:59                     ` Al Boldi
2006-02-02 22:33                       ` Bryan Henderson
2006-02-03 14:46                       ` Alan Cox
2006-01-30 16:49               ` Bryan Henderson
2006-01-26  0:03     ` Jon Smirl
2006-01-26 19:48       ` Bryan Henderson
2006-01-22  8:16 ` Pavel Machek
2006-01-22 12:33 ` Robin Holt
2006-01-23 18:03   ` Al Boldi
2006-01-23 18:40     ` Valdis.Kletnieks
2006-01-23 19:26       ` Benjamin LaHaise
2006-01-23 19:40         ` Valdis.Kletnieks
2006-01-23 22:26     ` Pavel Machek
2006-01-22 19:55 ` Barry K. Nathan
2006-01-23  5:23   ` Michael Loftis
2006-01-23  5:46     ` Chase Venters
2006-01-23  8:20       ` Barry K. Nathan
2006-01-23 13:17       ` Jamie Lokier
2006-01-23 20:21         ` Peter Chubb
2006-01-23 15:05     ` Ram Gupta
2006-01-23 15:26       ` Diego Calleja
2006-01-23 16:11         ` linux-os (Dick Johnson)
2006-01-23 16:50           ` Jamie Lokier
2006-01-24  2:08           ` Horst von Brand
2006-01-25  6:13             ` Jamie Lokier
2006-01-25  9:23             ` Bernd Petrovitsch
2006-01-25  9:42               ` Lee Revell
2006-01-25 15:02                 ` Jamie Lokier
2006-01-25 23:24                   ` Lee Revell
2006-01-25 15:05               ` Jamie Lokier
2006-01-25 15:47                 ` Bernd Petrovitsch
2006-01-25 16:09                 ` Diego Calleja
2006-01-25 17:26                   ` Jamie Lokier
2006-01-26 19:13                     ` Bryan Henderson
2006-01-25 23:28                 ` Lee Revell
2006-01-26  1:29                   ` Diego Calleja
2006-01-26  5:01                   ` Jamie Lokier
2006-01-26  5:11                     ` Lee Revell
2006-01-26 14:46                       ` Dave Kleikamp
2006-01-24  2:10           ` Horst von Brand
2006-01-25 22:27         ` Nix
2006-01-26 15:13           ` Denis Vlasenko
2006-01-26 16:23             ` Nix
2006-01-23 20:43       ` Michael Loftis
2006-01-23 22:42         ` Nikita Danilov
2006-01-24 14:36           ` Ram Gupta
2006-01-24 15:04             ` Diego Calleja
2006-01-24 20:59               ` Bryan Henderson
2006-01-24 15:11             ` Nikita Danilov
2006-01-23 22:57         ` Ram Gupta
2006-01-24 10:08         ` Meelis Roos
2006-02-01 13:58 Al Boldi
2006-02-01 14:38 ` Jamie Lokier
2006-02-02 12:26   ` Al Boldi
