linux-kernel.vger.kernel.org archive mirror
* Things that Longhorn seems to be doing right
@ 2003-10-29  8:50 Hans Reiser
  2003-10-29 22:42 ` Erik Andersen
                   ` (2 more replies)
  0 siblings, 3 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-29  8:50 UTC (permalink / raw)
  To: linux-kernel

They are building in support for transactions into the OS.

Everything will be in XML.  (It is not so important what format it is 
in, as it is that they are going to do it in one format.)

Support for browsing versions in the FS.

Support for browsing and querying XML, their unified format.  Ok, so SQL 
sucks, but this is still better than what we offer today in Linux.


-- 
Hans



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-29 23:03   ` Hans Reiser
@ 2003-10-29 22:25     ` Dax Kelson
  2003-10-30  0:20       ` Joseph Pingenot
  0 siblings, 1 reply; 83+ messages in thread
From: Dax Kelson @ 2003-10-29 22:25 UTC (permalink / raw)
  To: Hans Reiser; +Cc: andersen, linux-kernel

On Wed, 2003-10-29 at 16:03, Hans Reiser wrote:

> If they have a beta today, and we are not doing anything today in that 
> area, they are probably going to beat us to shipping something in that 
> area unless we make a real effort.  That means well-earned advantage for 
> them.

Except, they didn't release a beta.

They released a developer preview (not even alpha), mostly to show off
the APIs.

AFAIK the developer preview has no WinFS bits in it at all.

Dax Kelson
Guru Labs



* Re: Things that Longhorn seems to be doing right
  2003-10-29  8:50 Things that Longhorn seems to be doing right Hans Reiser
@ 2003-10-29 22:42 ` Erik Andersen
  2003-10-29 23:03   ` Hans Reiser
  2003-10-30  1:52   ` Theodore Ts'o
  2003-10-30  7:25 ` Christian Axelsson
       [not found] ` <200311011731.10052.ioe-lkml@rameria.de>
  2 siblings, 2 replies; 83+ messages in thread
From: Erik Andersen @ 2003-10-29 22:42 UTC (permalink / raw)
  To: Hans Reiser; +Cc: linux-kernel

On Wed Oct 29, 2003 at 11:50:46AM +0300, Hans Reiser wrote:
> They are building in support for transactions into the OS.
> 
> Everything will be in XML.  (It is not so important what format it is 
> in, as it is that they are going to do it in one format.)
> 
> Support for browsing versions in the FS.
> 

s/seems to be doing right/might eventually get right after
it is released in 2006 and after you apply sp1 + sp2 + sp3
which will probably be released around Q3 2007/

:-)

> Support for browsing and querying XML, their unified format.  Ok, so SQL 
> sucks, but this is still better than what we offer today in Linux.

Linux today vs an MS vaporware press release is not exactly a
fair comparison.  Based on their release schedule, it looks like
we have a couple of years yet before anything materializes from
the folk in Redmond.

 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--


* Re: Things that Longhorn seems to be doing right
  2003-10-29 22:42 ` Erik Andersen
@ 2003-10-29 23:03   ` Hans Reiser
  2003-10-29 22:25     ` Dax Kelson
  2003-10-30  1:52   ` Theodore Ts'o
  1 sibling, 1 reply; 83+ messages in thread
From: Hans Reiser @ 2003-10-29 23:03 UTC (permalink / raw)
  To: andersen; +Cc: linux-kernel

Erik Andersen wrote:

>
>
>Linux today vs an MS vaporware press release is not exactly a
>fair comparison.  Based on their release schedule, it looks like
>we have a couple of years yet before anything materializes from
>the folk in Redmond.
>
> -Erik
>
>--
>Erik B. Andersen             http://codepoet-consulting.com/
>--This message was written using 73% post-consumer electrons--
>
>
>  
>
If they have a beta today, and we are not doing anything today in that 
area, they are probably going to beat us to shipping something in that 
area unless we make a real effort.  That means well-earned advantage for 
them.

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-10-29 22:25     ` Dax Kelson
@ 2003-10-30  0:20       ` Joseph Pingenot
  2003-10-30  0:54         ` Neil Brown
  2003-10-30  2:09         ` Alex Belits
  0 siblings, 2 replies; 83+ messages in thread
From: Joseph Pingenot @ 2003-10-30  0:20 UTC (permalink / raw)
  To: Dax Kelson; +Cc: Hans Reiser, andersen, linux-kernel

From Dax Kelson on Wednesday, 29 October, 2003:
>On Wed, 2003-10-29 at 16:03, Hans Reiser wrote:
>> If they have a beta today, and we are not doing anything today in that 
>> area, they are probably going to beat us to shipping something in that 
>> area unless we make a real effort.  That means well-earned advantage for 
>> them.
>Except, they didn't release a beta.
>They released a developer preview (not even alpha), mostly to show off
>the APIs.
>AFAIK the developer preview has no WinFS bits in it at all.

Regardless, it's an interesting idea, and one which might be fruitful.  

I give you then two bits: our treatment of the tech and the reality of their
  tech:

00: ISVAPOR | TAKESERIOUSLY
01: ISVAPOR | IGNORE
10: NOTVAPOR | TAKESERIOUSLY
11: NOTVAPOR | IGNORE

If we come up with a working implementation and it *is* just vaporware, then
  we're ahead.
We're way ahead.
  
If we merely dismiss it as vaporware and it turns out to be,
no net change.

If it's not vaporware and we take it seriously and look at something similar
  for Linux and 'nix, we're still ahead (especially if we get to it first,
  and do it better).
We're somewhat ahead.

If we merely dismiss it as vaporware and it turns out NOT to be, 
we are behind, _potentially_with_patents_blocking_our_progress_.


Conclusion: the optimal case would be for it to truly be vaporware and we
  make it real.  Next case would be for it to not be vaporware, but for us
  to get there first and/or do it better.  Next to last would be for us to not
  take it seriously and for it to actually be vaporware.  LAST and certainly
  LEAST would be for it to *not* be vaporware, and for us to not take it
  seriously.  In that case, we face not only being behind in tech, but also
  potentially _the_INABILITY_ to work towards this, since Microsoft would
  have patents (it's highly likely) on the work and would use them to block
  our Freedom to Innovate.
Conclusion: best to take it seriously and work on it; those two cases
  are the most optimal.
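
The four cases can be written down as a small table and checked mechanically.
  This is only a rough sketch in Python: the ranks (1 = best, 4 = worst) are
  read off the ordering in the conclusion above, and the names and the
  "pick the least bad worst case" framing are illustrative assumptions.

```python
from itertools import product

# Ordinal ranks taken from the conclusion above (1 = best, 4 = worst).
# Keys are (their_tech, our_response); the names are illustrative only.
RANK = {
    ("vapor", "take_seriously"): 1,  # we make it real, they never ship
    ("real", "take_seriously"): 2,   # we may get there first or do it better
    ("vapor", "ignore"): 3,          # no net change
    ("real", "ignore"): 4,           # behind, possibly blocked by patents
}


def worst_case(our_response):
    """Worst rank we can end up with for a given choice of ours."""
    return max(RANK[(tech, our_response)] for tech in ("vapor", "real"))


def best_response():
    """Pick the response whose worst case is least bad."""
    return min(("take_seriously", "ignore"), key=worst_case)


if __name__ == "__main__":
    for tech, resp in product(("vapor", "real"), ("take_seriously", "ignore")):
        print(f"{tech:5} | {resp:14} -> rank {RANK[(tech, resp)]}")
    print("best response:", best_response())
```

Since we only control one of the two bits, the comparison is between the
  worst outcome of each of our choices, whatever Microsoft ends up doing.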


The Linux community should investigate it and potentially offer a similar
  functionality (e.g. improved ability to search for document content),
  since it looks interesting, and we could have it way before they do.
Maybe a search engine group could team up with a filesystems group and
  potentially others.  This is something where maybe Google and other
  minor players would like to get in on the action, given Microsoft's current
  bent to control the world's searching via the MicroSoft Network (see
  also, slashdot).  We need to team up for the best chances of beating
  the 800lb Orc.  :)

My two pfennig; take it or leave it.

-Joseph
-- 
Joseph===============================================trelane@digitasaru.net
"Asked by CollabNet CTO Brian Behlendorf whether Microsoft will enforce its
 patents against open source projects, Mundie replied, 'Yes, absolutely.'
 An audience member pointed out that many open source projects aren't
 funded and so can't afford legal representation to rival Microsoft's. 'Oh
 well,' said Mundie. 'Get your money, and let's go to court.' 
Microsoft's patents only defensive? http://swpat.ffii.org/players/microsoft


* Re: Things that Longhorn seems to be doing right
  2003-10-30  0:20       ` Joseph Pingenot
@ 2003-10-30  0:54         ` Neil Brown
  2003-10-30  1:34           ` Joseph Pingenot
  2003-10-30  2:09         ` Alex Belits
  1 sibling, 1 reply; 83+ messages in thread
From: Neil Brown @ 2003-10-30  0:54 UTC (permalink / raw)
  To: trelane; +Cc: Dax Kelson, Hans Reiser, andersen, linux-kernel

On Wednesday October 29, trelane@digitasaru.net wrote:
> 
> Regardless, it's an interesting idea, and one which might be fruitful.  
> 
> I give you then two bits: our treatment of the tech and the reality of their
>   tech:
> 
> 00: ISVAPOR | TAKESERIOUSLY
> 01: ISVAPOR | IGNORE
> 10: NOTVAPOR | TAKESERIOUSLY
> 11: NOTVAPOR | IGNORE
> 
> If we come up with a working implementation and it *is* just vaporware, then
>   we're ahead.
> We're way ahead.
>   
> If we merely dismiss it as vaporware and it turns out to be,
> no net change.
...snip...
> Conclusion: best to take it seriously and work on it; those two cases
>   are the most optimal.
> 

Sounds like the same argument that is used in "Pascal's Wager" for
belief in God, and I seriously don't think the argument works in
either case.  (note that I'm not making a statement about the
conclusion in either case, only about the argument).

NeilBrown


* Re: Things that Longhorn seems to be doing right
  2003-10-30  0:54         ` Neil Brown
@ 2003-10-30  1:34           ` Joseph Pingenot
  2003-10-30  2:54             ` Bernd Eckenfels
                               ` (3 more replies)
  0 siblings, 4 replies; 83+ messages in thread
From: Joseph Pingenot @ 2003-10-30  1:34 UTC (permalink / raw)
  To: Neil Brown, Dax Kelson, Hans Reiser, andersen, linux-kernel

From Neil Brown on Thursday, 30 October, 2003:
>On Wednesday October 29, trelane@digitasaru.net wrote:
>> Regardless, it's an interesting idea, and one which might be fruitful.  
>> I give you then two bits: our treatment of the tech and the reality of their
>>   tech:
>> 00: ISVAPOR | TAKESERIOUSLY
>> 01: ISVAPOR | IGNORE
>> 10: NOTVAPOR | TAKESERIOUSLY
>> 11: NOTVAPOR | IGNORE
>> If we come up with a working implementation and it *is* just vaporware, then
>>   we're ahead.
>> We're way ahead.
>> If we merely dismiss it as vaporware and it turns out to be,
>> no net change.
>...snip...
>> Conclusion: best to take it seriously and work on it; those two cases
>>   are the most optimal.
>Sounds like the same argument that is used in "Pascal's Wager" for
>belief in God, and I seriously don't think the argument works in
>either case.  (note that I'm not making a statement about the
>conclusion in either case, only about the argument).

"Sounds like?"  Sure.  It kind of does, now that you mention it.

Regardless of the similarities and the validity of Pascal's argument, my
  argument, I think, stands.  I outlined the four potential futures.  We
  have control over only one bit, Microsoft has the other.  The tech sounds
  nice, it is an interesting avenue to pursue, Pascal aside.

I don't see any reason why we *shouldn't* look at the problem and try to
  do it.  What reasons do you see for not pursuing the problem to its
  inevitable implementation?

I see big pitfalls in *not* looking at the problem.  In what respect are
  the pitfalls of ignoring it as outlined by me invalid?

mfG,

Joseph


* Re: Things that Longhorn seems to be doing right
  2003-10-29 22:42 ` Erik Andersen
  2003-10-29 23:03   ` Hans Reiser
@ 2003-10-30  1:52   ` Theodore Ts'o
  2003-10-30  2:03     ` Joseph Pingenot
                       ` (4 more replies)
  1 sibling, 5 replies; 83+ messages in thread
From: Theodore Ts'o @ 2003-10-30  1:52 UTC (permalink / raw)
  To: Erik Andersen, Hans Reiser, linux-kernel

Keep in mind that just because Windows does things a certain way
doesn't mean we have to provide the same functionality in exactly the
same way.

Also keep in mind that Microsoft very deliberately blurs what they do
in their "kernel" versus what they provide via system libraries (i.e.,
API's provided via their DLL's, or shared libraries).

At some level what they have done can be very easily replicated by
having a userspace database which is tied to the filesystem so you can
do select statements to search on metadata associated with files.  We
can do this simply by associating UUID's to files, and storing the
file metadata in a MySQL database which can be searched via
appropriate userspace libraries which we provide.
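
A minimal sketch of that userspace approach, in Python.  Assumptions: SQLite
(from the standard library) stands in for MySQL so the example is
self-contained, the UUID-to-path mapping is kept in a table rather than on
the file itself, and the schema and function names are hypothetical.

```python
import sqlite3
import uuid


def open_index(db_path=":memory:"):
    """Create the (hypothetical) schema: one row per file, one per attribute."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files (uuid TEXT PRIMARY KEY, path TEXT UNIQUE)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS metadata "
        "(uuid TEXT REFERENCES files(uuid), key TEXT, value TEXT)"
    )
    return db


def tag_file(db, path, **attrs):
    """Assign a UUID to path (if it has none yet) and record its metadata."""
    row = db.execute("SELECT uuid FROM files WHERE path = ?", (path,)).fetchone()
    file_id = row[0] if row else str(uuid.uuid4())
    if not row:
        db.execute("INSERT INTO files VALUES (?, ?)", (file_id, path))
    db.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [(file_id, k, str(v)) for k, v in attrs.items()],
    )
    db.commit()
    return file_id


def find_files(db, key, value):
    """The 'select statement' part: paths whose metadata matches key = value."""
    rows = db.execute(
        "SELECT f.path FROM files f JOIN metadata m ON m.uuid = f.uuid "
        "WHERE m.key = ? AND m.value = ? ORDER BY f.path",
        (key, value),
    )
    return [r[0] for r in rows]
```

Usage would look like `db = open_index()`, then
`tag_file(db, "/home/ted/notes.txt", author="ted")`, then
`find_files(db, "author", "ted")`.  Nothing here touches the kernel; the
database is just another userspace process.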

Please do **not** assume that just because of the vaporware press
releases released by Microsoft that (a) they have pushed an SQL Query
optimizer into the kernel, or that (b) even if they did, we should
follow their bad example and attempt to do the same.  

There are multiple ways of skinning this particular cat, and we don't
need to blindly follow Microsoft's design mistakes.

Fortunately, I have enough faith in Linus Torvalds' taste that I'm not
particularly worried what would happen if someone were to send him a
patch that attempted to cram MySQL or Postgres into the guts of the
Linux kernel....  although I would like to watch when someone proposes
such a thing!

						- Ted


* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:52   ` Theodore Ts'o
@ 2003-10-30  2:03     ` Joseph Pingenot
  2003-10-30  9:23       ` Ingo Oeser
  2003-10-30  3:57     ` Scott Robert Ladd
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 83+ messages in thread
From: Joseph Pingenot @ 2003-10-30  2:03 UTC (permalink / raw)
  To: Theodore Ts'o, Erik Andersen, Hans Reiser, linux-kernel

From Theodore Ts'o on Wednesday, 29 October, 2003:
>Keep in mind that just because Windows does things a certain way
>doesn't mean we have to provide the same functionality in exactly the
>same way.
>Also keep in mind that Microsoft very deliberately blurs what they do
>in their "kernel" versus what they provide via system libraries (i.e.,
>API's provided via their DLL's, or shared libraries).

Indeed, although certain things could be half-kernel, half-user
  (OK, 0.01% kernel, 99.99% user, e.g. userspace daemon that
  intercepts certain writes).  Of course, at that point, you might
  make a special library to interact with the daemon directly, although
  it's then not at all like just calling write().

Actually, thinking about it, it's ideal to have as a pluggable userspace
  daemon: on open() or a little after, determine the filetype, and forward
  interactions to a module/plugin that knows how to deal with that
  data format.  The plugin then calls some under-process (either back to
  the daemon or some other thing) to then archive off the information.
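
Roughly, the dispatch part of such a daemon might look like the Python sketch
  below.  Everything in it is an assumption made for illustration: the
  filetype is guessed from the file extension only, the two plugins are toy
  examples, and ARCHIVE is an in-memory stand-in for the under-process that
  archives off the information.

```python
import os

# Registry mapping a file extension to an indexing plugin (names hypothetical).
PLUGINS = {}


def plugin(*extensions):
    """Decorator that registers a function as the handler for some extensions."""
    def register(func):
        for ext in extensions:
            PLUGINS[ext] = func
        return func
    return register


@plugin(".txt")
def index_text(data):
    # Index plain text by its distinct lowercase words.
    return sorted(set(data.decode("utf-8", errors="replace").lower().split()))


@plugin(".csv")
def index_csv(data):
    # Index a CSV file by the column names on its first line.
    lines = data.decode("utf-8", errors="replace").splitlines()
    first = lines[0] if lines else ""
    return [col.strip() for col in first.split(",") if col.strip()]


ARCHIVE = {}  # stands in for whatever store the plugins hand their results to


def handle_file(path, data):
    """Pick a plugin for the file's type and archive what it extracts."""
    handler = PLUGINS.get(os.path.splitext(path)[1].lower())
    if handler is None:
        return False  # unknown format: leave the file unindexed
    ARCHIVE[path] = handler(data)
    return True
```

Adding support for a new format is then a matter of registering one more
  function, without touching the daemon's core or the kernel.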

>At some level what they have done can be very easily replicated by
>having a userspace database which is tied to the filesystem so you can
>do select statements to search on metadata associated with files.  We
>can do this simply by associating UUID's to files, and storing the
>file metadata in a MySQL database which can be searched via
>appropriate userspace libraries which we provide.
>
>Please do **not** assume that just because of the vaporware press
>releases released by Microsoft that (a) they have pushed an SQL Query
>optimizer into the kernel, or that (b) even if they did, we should
>follow their bad example and attempt to do the same.  

Indeed.  I prefer to keep my crashes in userspace.  :)

>There are multiple ways of skinning this particular cat, and we don't
>need to blindly follow Microsoft's design mistakes.

Amen, my brother!  ;)

>Fortunately, I have enough faith in Linus Torvalds' taste that I'm not
>particularly worried what would happen if someone were to send him a
>patch that attempted to cram MySQL or Postgres into the guts of the
>Linux kernel....  although I would like to watch when someone proposes
>such a thing!

heh.

-Joseph

* Re: Things that Longhorn seems to be doing right
  2003-10-30  0:20       ` Joseph Pingenot
  2003-10-30  0:54         ` Neil Brown
@ 2003-10-30  2:09         ` Alex Belits
  2003-10-30  3:12           ` Joseph Pingenot
  2003-10-30  4:06           ` Scott Robert Ladd
  1 sibling, 2 replies; 83+ messages in thread
From: Alex Belits @ 2003-10-30  2:09 UTC (permalink / raw)
  To: Joseph Pingenot; +Cc: Dax Kelson, Hans Reiser, andersen, linux-kernel

On Wed, 29 Oct 2003, Joseph Pingenot wrote:

> >> them.
> >Except, they didn't release a beta.
> >They released a developer preview (not even alpha), mostly to show off
> >the APIs.
> >AFAIK the developer preview has no WinFS bits in it at all.
>
> Regardless, it's an interesting idea, and one which might be fruitful.

  There is another possibility -- that the only implementation of the
standardized indexable/searchable format that Microsoft wants to base this
system on is a horrendous resource pig, infected with inflexible
restrictions and requirements that everything will have to follow, and
will be unable to make any further progress in the various directions where
non-Microsoft software has an advantage.

  Which is what most XML-based formats certainly are. If further development
blindly takes this road, we will lose a huge amount of flexibility in
exchange for a certain Microsoft-compatible (for a while) system of
organizing data. But, say, using grep on a text file will become
impossible without making an XML-ified file, and an XML-ified grep. Pipes and
sockets will have to be redesigned, too, and many kinds of low-level
functionality that Unixlike systems enjoyed thanks to unified file
descriptors and nonintrusive way of OS handling the data will become
cumbersome second-class citizens in a world where structured data files
(VMS? Mainframes?) and strong file-type binding (MacOS? PalmOS?) are what
the system is based on. Not to mention niceties like having to stuff the
whole expat into the kernel, and enjoy memory bloat and various kinds of
DoS based on that. It won't harm Microsoft a single bit -- it would be
their wet dream to outlaw all file formats but MS Office, and make every
program talk through the Office-based interface, but it will turn Linux
(or any other system that follows this idea) into something else.

It may be a great idea to add additional interfaces that will provide
similar functionality through multiple userspace applications that will
form another layer of data access. But those can't be just stuffed into
the kernel, or have one set-in-stone format imposed on files and queries. It
may allow something compatible with Microsoft, but it certainly should not
grant immortality to the current incarnation of XML, SQL and derivatives of
those. Linux's greatest strength is in providing good infrastructure, and
just stuffing particular (bound to be bad) implementations of some ideas
(that are not necessarily good beyond their basic core) into the system
instead of providing sufficient infrastructure to provide those in various
ways, makes it more like an ideologically-charged finished environment
than an infrastructure for creating such environments. Microsoft always
created narrowly-defined, bloated, followed-the-party-line environments
that captured and confined the developers. There is no need to imitate
that in a system that is known for being just the opposite.

-- 
Alex


* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:34           ` Joseph Pingenot
@ 2003-10-30  2:54             ` Bernd Eckenfels
  2003-10-30  2:58               ` Arnaldo Carvalho de Melo
  2003-10-30  3:16               ` Joseph Pingenot
  2003-10-30  3:16             ` Neil Brown
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 83+ messages in thread
From: Bernd Eckenfels @ 2003-10-30  2:54 UTC (permalink / raw)
  To: linux-kernel

In article <20031030013418.GD3094@digitasaru.net> you wrote:
> I don't see any reason why we *shouldn't* look at the problem and try to
>  do it.  What reasons do you see for not pursuing the problem to its
>  inevitable implementation?

Just do it. :)

Greetings
Bernd
-- 
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/


* Re: Things that Longhorn seems to be doing right
  2003-10-30  2:54             ` Bernd Eckenfels
@ 2003-10-30  2:58               ` Arnaldo Carvalho de Melo
  2003-10-30  3:16               ` Joseph Pingenot
  1 sibling, 0 replies; 83+ messages in thread
From: Arnaldo Carvalho de Melo @ 2003-10-30  2:58 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

Em Thu, Oct 30, 2003 at 03:54:44AM +0100, Bernd Eckenfels escreveu:
> In article <20031030013418.GD3094@digitasaru.net> you wrote:
> > I don't see any reason why we *shouldn't* look at the problem and try to
> >  do it.  What reasons do you see for not pursuing the problem to its
> >  inevitable implementation?
> 
> Just do it. :)

Exactly, the idea is controversial, etc, but you can always show us an
implementation for comments :)

- Arnaldo


* Re: Things that Longhorn seems to be doing right
  2003-10-30  2:09         ` Alex Belits
@ 2003-10-30  3:12           ` Joseph Pingenot
  2003-10-30  4:21             ` Scott Robert Ladd
  2003-10-30  9:52             ` Ingo Oeser
  2003-10-30  4:06           ` Scott Robert Ladd
  1 sibling, 2 replies; 83+ messages in thread
From: Joseph Pingenot @ 2003-10-30  3:12 UTC (permalink / raw)
  To: Alex Belits, Dax Kelson, Hans Reiser, andersen, linux-kernel

From Alex Belits on Wednesday, 29 October, 2003:
>On Wed, 29 Oct 2003, Joseph Pingenot wrote:
>> >> them.
>> >Except, they didn't release a beta.
>> >They released a developer preview (not even alpha), mostly to show off
>> >the APIs.
>> >AFAIK the developer preview has no WinFS bits in it at all.
>> Regardless, it's an interesting idea, and one which might be fruitful.
>  There is another possibility -- that the only implementation of the
>standardized indexable/searchable format that Microsoft wants to base this
>system on is a horrendous resource pig, infected with inflexible
>restrictions and requirements that everything will have to follow, and
>will be unable to make any further progress in the various directions where
>non-Microsoft software has an advantage.

Interesting take on XML.

>  Which is what most XML-based formats certainly are. If further development
>blindly takes this road, we will lose a huge amount of flexibility in
>exchange for a certain Microsoft-compatible (for a while) system of
>organizing data. But, say, using grep on a text file will become

Actually, the point isn't to be Microsoft-compatible; rather, it's to
  aid in the indexing of information for quick search and cross-reference
  (their thrust at the 'knowledge worker').

>impossible without making an XML-ified file, and an XML-ified grep. Pipes and
>sockets will have to be redesigned, too, and many kinds of low-level
>functionality that Unixlike systems enjoyed thanks to unified file
>descriptors and nonintrusive way of OS handling the data will become
>cumbersome second-class citizens in a world where structured data files
>(VMS? Mainframes?) and strong file-type binding (MacOS? PalmOS?) are what
>the system is based on. Not to mention niceties like having to stuff the

Well, the point of making this a modular userspace daemon is that we don't
  *have* to dictate any such thing.  The idea is that writes could be
  piped through the indexing daemon, and the daemon would then have plugins
  that understand *different* formats.  Optionally, I suppose, one could
  add a new open() flag to say "don't index this".
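
As a userspace approximation of that opt-out, here is a short Python sketch.
  O_NOINDEX is purely hypothetical (no such open() flag exists), so the
  wrapper strips it before calling the real open(), and INDEXED is an
  in-memory stand-in for handing the data to the indexing daemon.

```python
import os

O_NOINDEX = 1 << 30  # hypothetical flag; no such open() flag exists in Linux

INDEXED = []  # paths the (imaginary) indexing daemon was told about


def write_file(path, data, flags=0):
    """Write data to path, notifying the indexer unless O_NOINDEX is set."""
    # Strip the made-up flag so the kernel never sees it.
    real_flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | (flags & ~O_NOINDEX)
    fd = os.open(path, real_flags, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    if not flags & O_NOINDEX:
        INDEXED.append(path)  # a real daemon would parse the data here
```

A caller that does not want a scratch file indexed would pass
  `write_file(path, data, O_NOINDEX)`; everything else is indexed by default.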

>whole expat into the kernel, and enjoy memory bloat and various kinds of
>DoS based on that. It won't harm Microsoft a single bit -- it would be

The idea is also to keep the kernel as clean as possible, while keeping
  it also as transparent/opaque (depending on your viewpoint) as possible.

There are two extrema: completely in-kernel (either dictating the choice
  of backend and formats or using modules to allow choice) and completely
  userspace.  The nicety of in-kernel is speed and the fact that the process
  need not know anything about the indexing unless it wants to.  The cost
  is stability and security.  The nicety of userspace is that it is very
  highly modular, with obvious, strict APIs and the security that goes
  with being in userspace.  The cost is the context switching performance
  hit and the fact that each process that wants to index its stuff must
  tell the filesystem and the indexing service its data (effectively, two
  writes and a completely separate API).  [I'm likely preaching to the
  choir here, but it's good to outline it]

I think the Golden Mean is, erm, in the mean.  :)

>their wet dream to outlaw all file formats but MS Office, and make every
>program talk through the Office-based interface, but it will turn Linux
>(or any other system that follows this idea) into something else.

Indeed, but we're not trying to dictate a format.  That's why it'd have
  to be pluggable.  Unix/Linux is all about choice and freedom.  Let
  Microsoft straitjacket their product; we should build it open, transparent,
  and free, and welcome the Microsoft refugees.

>It may be a great idea to add additional interfaces that will provide
>similar functionality through multiple userspace applications that will
>form another layer of data access. But those can't be just stuffed into
>the kernel, or have one set-in-stone format imposed on files and queries. It

Dang straight.

>may allow something compatible with Microsoft, but it certainly should not
>grant immortality to the current incarnation of XML, SQL and derivatives of
>those. Linux's greatest strength is in providing good infrastructure, and

Indeed.  That's the whole point of choice and freedom.  If/when something
  better comes along, the implementer can quickly add the format to the
  indexing service, and people will find the transition that much easier.
  And humanity is better off for the ease in migration.

>just stuffing particular (bound to be bad) implementations of some ideas
>(that are not necessarily good beyond their basic core) into the system
>instead of providing sufficient infrastructure to provide those in various
>ways, makes it more like an ideologically-charged finished environment
>than an infrastructure for creating such environments. Microsoft always
>created narrowly-defined, bloated, followed-the-party-line environments
>that captured and confined the developers. There is no need to imitate
>that in a system that is known for being just the opposite.

Indeed.  Couldn't agree more.  That's why we should create *infrastructure*
  for such an indexing service, and allow the community to create plugins
  as needed.

I'm sure the user community can come up with excellent ways of using such
  a service, especially when it's open and free.  (Indeed, I just thought
  of a new use: identifying .desktop files on the system, and potentially
  linking them together in menus automatically, regardless of their location.
  Such an implementation would require a .desktop file indexing plugin, but
  with such an open and free system, it's quite easy, I'd think.)
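
To give an idea of how small such a plugin could be, here is a sketch in
  Python using the standard configparser module.  It assumes the usual
  "[Desktop Entry]" layout with Name, Exec and semicolon-separated Categories
  keys; the function names and the "Uncategorized" bucket are made up for
  the example.

```python
import configparser


def index_desktop_entry(text):
    """Extract name, command and menu categories from .desktop file contents."""
    parser = configparser.ConfigParser(interpolation=None)  # Exec may contain '%'
    parser.optionxform = str  # keep key case; .desktop keys are case-sensitive
    parser.read_string(text)
    entry = parser["Desktop Entry"]
    categories = [c for c in entry.get("Categories", "").split(";") if c]
    return {
        "name": entry.get("Name"),
        "exec": entry.get("Exec"),
        "categories": categories,
    }


def build_menu(entries):
    """Group indexed entries by category, wherever their files were found."""
    menu = {}
    for e in entries:
        for cat in e["categories"] or ["Uncategorized"]:
            menu.setdefault(cat, []).append(e["name"])
    return menu
```

The indexing service would call `index_desktop_entry()` on each .desktop
  file it sees, and a menu builder would only ever talk to the index, not
  walk the filesystem.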

-Joseph

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:34           ` Joseph Pingenot
  2003-10-30  2:54             ` Bernd Eckenfels
@ 2003-10-30  3:16             ` Neil Brown
  2003-10-30  3:39               ` Joseph Pingenot
  2003-10-30 10:27             ` Thorsten Körner
  2003-10-30 21:28             ` jlnance
  3 siblings, 1 reply; 83+ messages in thread
From: Neil Brown @ 2003-10-30  3:16 UTC (permalink / raw)
  To: trelane; +Cc: Dax Kelson, Hans Reiser, andersen, linux-kernel

On Wednesday October 29, trelane@digitasaru.net wrote:
> 
> "Sounds like?"  Sure.  It kind of does, now that you mention it.
> 
> Regardless of the similarities and the validity of Pascal's argument, my
>   argument, I think, stands.  I outlined the four potential futures.  We
>   have control over only one bit, Microsoft has the other.  The tech sounds
>   nice, it is an interesting avenue to pursue, Pascal aside.
> 
> I don't see any reason why we *shouldn't* look at the problem and try to
>   do it.  What reasons do you see for not pursuing the problem to its
>   inevitable implementation?

 There are lots of other interesting problems.  Why pursue this one?
(as with Pascal: there are many who claim to be "god" and demand
 worship, which do you follow).
 That doesn't mean you shouldn't pursue this one.  It just means that
 you haven't given a good reason.
  "Microsoft might do something that we haven't so we should" isn't a
  good reason.
  "I find this interesting" or "I have an immediate need for this" are
  both good reasons, and if they apply, then by all means, pursue it.

> 
> I see big pitfalls in *not* looking at the problem.  In what respect are
>   the pitfalls of ignoring it as outlined by me invalid?

 The future is full of pits that we cannot see, and many that we do
 see are mirages.
 Your main pitfall seems to be patents.  There are lots of patents out
 there that might be a problem, and lots more that will undoubtedly be
 taken out.  Why target this one?
 History seems to suggest that patents for seriously clever ideas
 aren't a problem.  It is usually possible to come up with a different
 clever idea that achieves the same end (I gather the RTLinux patent
 has been avoided that way, but I don't follow RT much. gzip and
 vorbis are other examples).  It is mainly the trivial patents that
 are a problem. 

 
If you look at your argument, you will see that it gives no hint of
what the technology is that we might be wanting to pursue.  It is
purely a "Microsoft says they will do it, so I think we should too"
argument, and there are many things that Microsoft do that I
definitely don't think we should do.

NeilBrown


[Just for the record 
  - I don't think database transaction support should go in the kernel.
    I'd rather take things out of filesystems than put them in.
  - I do think there is a God worth worshiping
]



* Re: Things that Longhorn seems to be doing right
  2003-10-30  2:54             ` Bernd Eckenfels
  2003-10-30  2:58               ` Arnaldo Carvalho de Melo
@ 2003-10-30  3:16               ` Joseph Pingenot
  2003-10-30  5:28                 ` Jeff Garzik
  1 sibling, 1 reply; 83+ messages in thread
From: Joseph Pingenot @ 2003-10-30  3:16 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

From Bernd Eckenfels on Thursday, 30 October, 2003:
>In article <20031030013418.GD3094@digitasaru.net> you wrote:
>> I don't see any reason why we *shouldn't* look at the problem and try to
>>  do it.  What reasons do you see for not pursuing the problem to its
>>  inevitable implementation?
>Just do it. :)
>Greetings
>Bernd

Hee hee.  The Other Shoe Drops.  This ought to be written down as
  Some-and-Such's Law: All Open Source Debates Will Continue Until Someone
  Tells Someone Else to Shut Up and Code It.  ;)

Actually, I'm highly, highly tempted.  Only problem is that I have a
  ton of other projects that require my attention.  If someone wants to
  fight the CAP server and collaboration battles, I will.  Otherwise,
  it'll have to wait until after I finish the projects I currently have
  on the table.  :/

Or maybe if someone will get my professors off my back (and fill my brain
  with the material so that I don't need to learn QM or E&M)  ;)   Aaah,
  my good friend Jackson....

-Joseph
-- 
Joseph===============================================trelane@digitasaru.net
"Asked by CollabNet CTO Brian Behlendorf whether Microsoft will enforce its
 patents against open source projects, Mundie replied, 'Yes, absolutely.'
 An audience member pointed out that many open source projects aren't
 funded and so can't afford legal representation to rival Microsoft's. 'Oh
 well,' said Mundie. 'Get your money, and let's go to court.' 
Microsoft's patents only defensive? http://swpat.ffii.org/players/microsoft

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:16             ` Neil Brown
@ 2003-10-30  3:39               ` Joseph Pingenot
  0 siblings, 0 replies; 83+ messages in thread
From: Joseph Pingenot @ 2003-10-30  3:39 UTC (permalink / raw)
  To: Neil Brown; +Cc: Dax Kelson, Hans Reiser, andersen, linux-kernel

From Neil Brown on Thursday, 30 October, 2003:
>On Wednesday October 29, trelane@digitasaru.net wrote:
>> "Sounds like?"  Sure.  It kind of does, now that you mention it.
>> Regardless of the similarities and the validity of Pascal's argument, my
>>   argument, I think, stands.  I outlined the four potential futures.  We
>>   have control over only one bit, Microsoft has the other.  The tech sounds
>>   nice, it is an interesting avenue to pursue, Pascal aside.
>> I don't see any reason why we *shouldn't* look at the problem and try to
>>   do it.  What reasons do you see for not pursuing the problem to its
>>   inevitable implementation?
> There are lots of other interesting problems.  Why pursue this one?
>(as with Pascal: there are many who claim to be "god" and demand
> worship, which do you follow).
> That doesn't mean you shouldn't pursue this one.  It just means that
> you haven't given a good reason.
>  "Microsoft might do something that we haven't so we should" isn't a
>  good reason.
>  "I find this interesting" or "I have an immediate need for this" are
>  both good reasons, and if they apply, then by all means, pursue it.
>> I see big pitfalls in *not* looking at the problem.  In what respect are
>>   the pitfalls of ignoring it as outlined by me invalid?
> The future is full of pits that we cannot see, and many that we do
> see are mirages.
> Your main pitfall seems to be patents.  There are lots of patents out
> there that might be a problem, and lots more that will undoubtedly be
> taken out.  Why target this one?
> History seems to suggest that patents for seriously clever ideas
> aren't a problem.  It is usually possible to come up with a different
> clever idea that achieves the same end (I gather the RTLinux patent
> has been avoided that way, but I don't follow RT much. gzip and
> vorbis are other examples).  It is mainly the trivial patents that
> are a problem. 

Excellent points.  I see your counterargument much better now, thank you.

I do think it's an interesting tech of potential value.  Only time will
  tell if it's worth it, and for whom it might be valuable.

>If you look at your argument, you will see that it gives no hint of
>what the technology is that we might be wanting to pursue.  It is
>purely a "Microsoft says they will do it, so I think we should too"
>argument, and there are many things that Microsoft do that I
>definitely don't think we should do.

True.  I was likely being a little overly anxious that Linux might fall behind
  in something.  That said, the stated intentions of others, especially
  those who seek to destroy your stuff, shouldn't be ignored lightly.
  I'd really like to see it pursued in FOSS, 'cause it's potentially
  interesting for various types of people ('knowledge worker' buzzword
  aside).  I do see some potential applications if there's a way to keep
  track of the content of various files (e.g. .desktop files).  I'm
  sure there are plenty of people smarter and/or more creative than I
  who can think of even more useful stuff to do with improved indexing.

I don't think that it was purely a case of following Microsoft, and I
  definitely concur that there are many things that Microsoft does that
  we should not.

>[Just for the record 
>  - I don't think database transaction support should go in the kernel.

I agree.  Generally, the less that goes in the kernel, the better, IMHO.

>    I'd rather take things out of filesystems than put them in.

Not sure if I completely agree.  I do agree with simplicity being
  generally the best course.  Generally, however, should be emphasized.

>  - I do think there is a God worth worshiping

Agreed.  Others likely disagree.  ;)

>]

-- 
Joseph===============================================trelane@digitasaru.net
"Asked by CollabNet CTO Brian Behlendorf whether Microsoft will enforce its
 patents against open source projects, Mundie replied, 'Yes, absolutely.'
 An audience member pointed out that many open source projects aren't
 funded and so can't afford legal representation to rival Microsoft's. 'Oh
 well,' said Mundie. 'Get your money, and let's go to court.' 
Microsoft's patents only defensive? http://swpat.ffii.org/players/microsoft

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:52   ` Theodore Ts'o
  2003-10-30  2:03     ` Joseph Pingenot
@ 2003-10-30  3:57     ` Scott Robert Ladd
  2003-10-30  4:08       ` Larry McVoy
                         ` (2 more replies)
  2003-10-30  7:33     ` Diego Calleja García
                       ` (2 subsequent siblings)
  4 siblings, 3 replies; 83+ messages in thread
From: Scott Robert Ladd @ 2003-10-30  3:57 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Erik Andersen, Hans Reiser, linux-kernel

Theodore Ts'o wrote:
> Keep in mind that just because Windows does things a certain way 
> doesn't mean we have to provide the same functionality in exactly the
> same way.

Very true. Linux is best defined by those who proactively implement
powerful ideas.

That doesn't mean, however, that the folks in Redmond can't come up with
an interesting and useful idea that we might just want to consider.

> Also keep in mind that Microsoft very deliberately blurs what they do
> in their "kernel" versus what they provide via system libraries
> (i.e., API's provided via their DLL's, or shared libraries).

Any database-style file system should be implemented in a modular
fashion, just like current Linux file systems.

Microsoft's penchant for integrating everything is their greatest
weakness (in terms of security) as well as their greatest strength (in
terms of customer lock-in). Since we don't care about locking anyone
into anything, we don't have those nasty marketing droids forcing us to
make poor technical choices.

> There are multiple ways of skinning this particular cat, and we don't
> need to blindly follow Microsoft's design mistakes.

Agreed -- but we might want to pay attention, in case skinning cats has
some actual value.

(Disclaimer: No felines were harmed in the production of this e-mail.)

> Fortunately, I have enough faith in Linus Torvalds' taste that I'm
> not particularly worried what would happen if someone were to send
> him a patch that attempted to cram MySQL or Postgres into the guts of
> the Linux kernel....  although I would like to watch when someone
> proposes such a thing!

MySQL wouldn't need to be shoved into the kernel; a small, fast database 
engine (one of my professional specialities, BTW) could provide metadata 
services in a file system module. SQL is a bloated pig; an effective 
file system needs to be both useful and efficient, leading me to think 
that we should consider a more succinct query mechanism for any 
metadata-based file system.

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  2:09         ` Alex Belits
  2003-10-30  3:12           ` Joseph Pingenot
@ 2003-10-30  4:06           ` Scott Robert Ladd
  1 sibling, 0 replies; 83+ messages in thread
From: Scott Robert Ladd @ 2003-10-30  4:06 UTC (permalink / raw)
  To: Alex Belits
  Cc: Joseph Pingenot, Dax Kelson, Hans Reiser, andersen, linux-kernel

Alex Belits wrote:
>   There is another possibility -- that the only implementation of the
> standardized indexable/searchable format that Microsoft wants to base this
> system on is a horrendous resource pig, infected with inflexible
> restrictions and requirements...

I suspect one motivation for Longhorn's file system is DRM, encoded into 
the metadata.

I don't think Hans' original message was suggesting that we clone 
Microsoft's new file system. Rather, my impression was that he was 
interested in the kind of functionality envisioned by Microsoft, and in 
pre-empting any conceptual patents Redmond might be planning.

It might be very interesting to peruse Microsoft's recent patent 
applications...

>   What most of XML-based formats certainly are. If further development
> will blindly take this road, we will lose huge amount of flexibility in
> exchange for a certain Microsoft-compatible (for a while) system of
> organizing data.

I am not fond of XML in many circumstances; it is inefficient both in 
terms of storage and processing, and is overkill for many applications. 
A file system should be mean and lean, even when it implements advanced 
features like metadata. So I think we're in agreement that Linux should 
find a better path to a similar solution.

And do it soon, before Microsoft patents the concept of a file system 
itself!

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:57     ` Scott Robert Ladd
@ 2003-10-30  4:08       ` Larry McVoy
  2003-10-30 13:46       ` Jesse Pollard
  2003-10-31  4:50       ` Stephen Satchell
  2 siblings, 0 replies; 83+ messages in thread
From: Larry McVoy @ 2003-10-30  4:08 UTC (permalink / raw)
  To: Scott Robert Ladd
  Cc: Theodore Ts'o, Erik Andersen, Hans Reiser, linux-kernel

> SQL is a bloated pig; an effective 
> file system needs to be both useful and efficient, leading me to think 
> that we should consider a more succinct query mechanism for any 
> metadata-based file system.

It's certainly possible to make a mostly/partially SQL compliant query
engine which is lean.  We did so for the commercial version of BK, i.e.,
this does exactly what you think it does:

select ID,STATUS,SEVERITY,PRIORITY,SUMMARY
from bugs
where	(SEVERITY == 1 or SEVERITY == 2 or SEVERITY == 3) and 
	(PRIORITY == 1 or PRIORITY == 2 or PRIORITY == 3) and 
	(STATUS == "new" or STATUS == "open" or STATUS == "assigned")
order by
	ID

The code which implements that:

wc query.c select.y
    284     830    5679 query.c
    650    2033   13775 select.y
    934    2863   19454 total

SQL compatibility isn't the problem, full (and bloated) SQL implementations are.
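
To give a feel for how small an evaluator for this kind of SQL-like subset
can be, here is a minimal Python sketch.  It is not BK's implementation
(query.c and select.y are not shown in this thread); the tokenizer, the
run_query() helper, and the sample rows are all assumptions made up for
illustration, and only "==", "and", "or" and parentheses are handled.

```python
import re

# One token: a paren, ==, a double-quoted string, or a bare word/number.
TOKEN = re.compile(r'\s*(\(|\)|==|"[^"]*"|\w+)')

def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"bad token at {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_where(tokens):
    """Recursive-descent parser; returns a predicate over a row dict."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():  # '(' expr ')'  |  FIELD == VALUE
        if peek() == "(":
            take()
            pred = expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return pred
        field, op, raw = take(), take(), take()
        if op != "==":
            raise ValueError("only == is supported in this sketch")
        value = raw[1:-1] if raw.startswith('"') else int(raw)
        return lambda row: row[field] == value

    def term():  # factor ('and' factor)*
        left = factor()
        while peek() == "and":
            take()
            left = (lambda l, r: lambda row: l(row) and r(row))(left, factor())
        return left

    def expr():  # term ('or' term)*
        left = term()
        while peek() == "or":
            take()
            left = (lambda l, r: lambda row: l(row) or r(row))(left, term())
        return left

    pred = expr()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return pred

def run_query(rows, columns, where, order_by):
    pred = parse_where(tokenize(where))
    hits = sorted((r for r in rows if pred(r)), key=lambda r: r[order_by])
    return [tuple(r[c] for c in columns) for r in hits]

# Hypothetical sample data, loosely shaped like the bug table above.
bugs = [
    {"ID": 3, "STATUS": "closed", "SEVERITY": 1, "SUMMARY": "fixed long ago"},
    {"ID": 2, "STATUS": "open", "SEVERITY": 2, "SUMMARY": "oops on mount"},
    {"ID": 1, "STATUS": "new", "SEVERITY": 5, "SUMMARY": "typo in docs"},
]
where = '(SEVERITY == 1 or SEVERITY == 2) and (STATUS == "new" or STATUS == "open")'
result = run_query(bugs, ["ID", "SUMMARY"], where, "ID")
```

Only bug 2 satisfies both parenthesised groups, so result holds a single
(ID, SUMMARY) tuple.  A real engine would of course need more operators,
error reporting, and an index-aware planner.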
-- 
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:12           ` Joseph Pingenot
@ 2003-10-30  4:21             ` Scott Robert Ladd
  2003-10-31 16:42               ` Timothy Miller
  2003-10-30  9:52             ` Ingo Oeser
  1 sibling, 1 reply; 83+ messages in thread
From: Scott Robert Ladd @ 2003-10-30  4:21 UTC (permalink / raw)
  To: trelane; +Cc: Alex Belits, Dax Kelson, Hans Reiser, andersen, linux-kernel

Joseph Pingenot wrote:
> Actually, the point isn't to be Microsoft-compatible; rather, it's to
> aid in the indexing of information for quick search and cross-reference
> (their thrust at the 'knowledge worker').

Exactly. In my experience the most significant problem programmers face 
is inventing a system for categorizing and organizing information. I 
have 20 years of collected data on everything from fly fishing to solar 
sails; lord knows I can't ever find anything, no matter how carefully I 
organize my directories.

Another problem with metadata is that it is largely generated by the 
user, who is notoriously lazy. A truly powerful system would use 
contextual analysis and other algorithms to automatically generate 
metadata, freeing the user from an onerous task (which is what computers 
should do). Certainly, some search engines are bordering on this capability.

> Well, the point of making this a modular userspace daemon is that we don't
> *have* to dictate any such thing.  The idea is that writes could be
> piped through the indexing daemon, and the daemon would then have plugins
> that understand *different* formats.  Optionally, I suppose, one could
> add a new open() flag to say "don't index this".

I've been working on a daemon that actively examines and updates 
metadata, similar to a web-indexing spider, but more directed; the 
metadata is then stored in "spatial" structures such as R-trees or M-trees 
(this daemon is very experimental at this point, and I'm still exploring 
structures and algorithms).

> Indeed.  Couldn't agree more.  That's why we should create *infrastructure*
> for such an indexing service, and allow the community to create plugins
> as needed.

A wise approach.
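
As a rough illustration of the "infrastructure plus plugins" idea, the
following Python sketch shows a registry in which each plugin declares the
file types it understands and the daemon dispatches to it.  Nothing here
comes from an existing project: the plugin() decorator, index_write(), and
the two sample plugins are hypothetical names chosen for the example, and
dispatching on the file extension is a simplifying assumption.

```python
import os

PLUGINS = {}  # file extension -> function(text) returning a metadata dict

def plugin(*extensions):
    """Decorator that registers a format handler for the given extensions."""
    def register(func):
        for ext in extensions:
            PLUGINS[ext] = func
        return func
    return register

@plugin(".txt")
def index_plain_text(text):
    return {"words": len(text.split())}

@plugin(".desktop")
def index_desktop_entry(text):
    # Collect Key=Value lines; section headers and comments are skipped.
    meta = {}
    for line in text.splitlines():
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            meta[key.strip()] = value.strip()
    return meta

def index_write(path, text):
    """What the daemon might do when a write is piped through it."""
    handler = PLUGINS.get(os.path.splitext(path)[1])
    return handler(text) if handler else None  # None: no plugin, not indexed

entry = index_write("editor.desktop",
                    "[Desktop Entry]\nName=Editor\nType=Application\n")
```

Files with no matching plugin are simply left unindexed, which keeps the
daemon from dictating any single format.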

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:16               ` Joseph Pingenot
@ 2003-10-30  5:28                 ` Jeff Garzik
  2003-10-30  5:56                   ` Valdis.Kletnieks
  0 siblings, 1 reply; 83+ messages in thread
From: Jeff Garzik @ 2003-10-30  5:28 UTC (permalink / raw)
  To: trelane; +Cc: Bernd Eckenfels, linux-kernel

Joseph Pingenot wrote:
>>From Bernd Eckenfels on Thursday, 30 October, 2003:
> 
>>In article <20031030013418.GD3094@digitasaru.net> you wrote:
>>
>>>I don't see any reason why we *shouldn't* look at the problem and try to
>>> do it.  What reasons do you see for not persuing the problem to its
>>> inevitible implementation?
>>
>>Just do it. :)
>>Greetings
>>Bernd
> 
> 
> Hee hee.  The Other Shoe Drops.  This ought to be written down as
>   Some-and-Such's Law: All Open Source Debates Will Continue Until Someone
>   Tells Someone Else to Shut Up and Code It.  ;)


What, it's not yet time to invoke Godwin's Law?  ;)

	Jeff




^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  5:28                 ` Jeff Garzik
@ 2003-10-30  5:56                   ` Valdis.Kletnieks
  0 siblings, 0 replies; 83+ messages in thread
From: Valdis.Kletnieks @ 2003-10-30  5:56 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: trelane, Bernd Eckenfels, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 336 bytes --]

On Thu, 30 Oct 2003 00:28:13 EST, Jeff Garzik said:
> Joseph Pingenot wrote:

> >   Some-and-Such's Law: All Open Source Debates Will Continue Until Someone
> >   Tells Someone Else to Shut Up and Code It.  ;)
> What, it's not yet time to invoke Godwin's Law?  ;)

Only if they're wearing jackboots when they say "shut up and code" :)


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-29  8:50 Things that Longhorn seems to be doing right Hans Reiser
  2003-10-29 22:42 ` Erik Andersen
@ 2003-10-30  7:25 ` Christian Axelsson
  2003-10-30  8:10   ` Hans Reiser
       [not found] ` <200311011731.10052.ioe-lkml@rameria.de>
  2 siblings, 1 reply; 83+ messages in thread
From: Christian Axelsson @ 2003-10-30  7:25 UTC (permalink / raw)
  To: Hans Reiser; +Cc: linux-kernel

Hans Reiser wrote:
> They are building in support for transactions into the OS.
> 
> Everything will be in XML.  (It is not so important what format it is 
> in, as it is that they are going to do it in one format.)
> 
> Support for browsing versions in the FS.
> 
> Support for browsing and querying XML their unified format.  Ok, so SQL 
> sucks, but this is still better than what we offer today in Linux.

I think this is better implemented in userspace somehow to keep the 
kernel away from bloat.
What do you think of the GNOME Storage project: 
http://www.gnome.org/~seth/storage/

It's not exactly the same but maybe it's going in that direction.

-- 
Christan Axelsson
smiler@lanil.mine.nu



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:52   ` Theodore Ts'o
  2003-10-30  2:03     ` Joseph Pingenot
  2003-10-30  3:57     ` Scott Robert Ladd
@ 2003-10-30  7:33     ` Diego Calleja García
  2003-10-30  8:43       ` Giuliano Pochini
  2003-10-30  8:05     ` Hans Reiser
  2003-10-30 11:21     ` Felipe Alfaro Solana
  4 siblings, 1 reply; 83+ messages in thread
From: Diego Calleja García @ 2003-10-30  7:33 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: andersen, reiser, linux-kernel

El Wed, 29 Oct 2003 20:52:12 -0500 Theodore Ts'o <tytso@mit.edu> escribió:

> At some level what they have done can be very easily replicated by
> having a userspace database which is tied to the filesystem so you can
> do select statements to search on metadata associated with files.  We
> can do this simply by associating UUID's to files, and storing the
> file metadata in a MySQL database which can be searched via
> appropriate userspace libraries which we provide.
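
For concreteness, the quoted proposal could be sketched roughly as follows
in Python.  This is only an illustration under stated assumptions: SQLite
(via the standard sqlite3 module) stands in for MySQL so the example is
self-contained, the UUID-to-path mapping lives in a table rather than in
the filesystem, and the table layout, add_file() and find() are invented
for the example.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")  # assumption: SQLite instead of MySQL
db.execute("CREATE TABLE files (uuid TEXT PRIMARY KEY, path TEXT UNIQUE)")
db.execute("CREATE TABLE metadata (uuid TEXT, attr TEXT, val TEXT)")

def add_file(path, **attrs):
    """Give a file a UUID and record its metadata under that UUID."""
    file_id = str(uuid.uuid4())
    db.execute("INSERT INTO files VALUES (?, ?)", (file_id, path))
    db.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [(file_id, attr, val) for attr, val in attrs.items()],
    )
    return file_id

def find(attr, val):
    """Return the paths of all files whose metadata has attr == val."""
    rows = db.execute(
        "SELECT f.path FROM files f JOIN metadata m ON m.uuid = f.uuid "
        "WHERE m.attr = ? AND m.val = ? ORDER BY f.path",
        (attr, val),
    )
    return [path for (path,) in rows]

# Hypothetical files and attributes.
add_file("/home/ted/notes/ext3.txt", author="ted", topic="filesystems")
add_file("/home/ted/mail/lkml.mbox", author="ted", topic="mail")
add_file("/home/hans/reiser4.tex", author="hans", topic="filesystems")
matches = find("topic", "filesystems")
```

The sketch leaves open the hard part raised elsewhere in the thread, namely
how the database is kept in sync when files change or move.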


Something like this?: http://www.gnome.org/~seth/storage/index.html

Another thing that Longhorn *is* doing right is the revamp of the "graphic
subsystem" aka Avalon, they seem to be trying to catch up with Mac OS X
in that field. 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:52   ` Theodore Ts'o
                       ` (2 preceding siblings ...)
  2003-10-30  7:33     ` Diego Calleja García
@ 2003-10-30  8:05     ` Hans Reiser
  2003-10-30  8:17       ` Wichert Akkerman
                         ` (2 more replies)
  2003-10-30 11:21     ` Felipe Alfaro Solana
  4 siblings, 3 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-30  8:05 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Erik Andersen, linux-kernel

Theodore Ts'o wrote:

>Keep in mind that just because Windows does things a certain way
>doesn't mean we have to provide the same functionality in exactly the
>same way.
>
>Also keep in mind that Microsoft very deliberately blurs what they do
>in their "kernel" versus what they provide via system libraries (i.e.,
>API's provided via their DLL's, or shared libraries).
>
>At some level what they have done can be very easily replicated by
>having a userspace database which is tied to the filesystem so you can
>do select statements to search on metadata associated with files. 
>

> We
>can do this simply by associating UUID's to files, and storing the
>file metadata in a MySQL database which can be searched via
>appropriate userspace libraries which we provide.
>  
>
What a performance nightmare.  Updating a user space database every time 
a file changes --- let's move to a micro-kernel architecture for all of 
the kernel the same day.....;-)

Not to mention that SQL is utterly unsuited for semi-structured data 
queries (what people store in filesystems is semi-structured data), and 
would only be effective for those fields that you require every file to 
have.

>Please do **not** assume that just because of the vaporware press
>releases released by Microsoft that (a) they have pushed an SQL Query
>optimizer into the kernel or that (b) even if they did, we should
>follow their bad example and attempt to do the same.  
>
>There are multiple ways of skinning this particular cat, and we don't
>need to blindly follow Microsoft's design mistakes.
>
>Fortunately, I have enough faith in Linus Torvalds' taste that I'm not
>particularly worried what would happen if someone were to send him a
>patch that attempted to cram MySQL or Postgres into the guts of the
>Linux kernel....  although I would like to watch when someone proposes
>such a thing!
>
>						- Ted
>
>
>  
>
How about you send him a patch that removes all of that networking stuff 
from the kernel and puts it into user space where it belongs.;-)  There 
was this Windows user on Slashdot some time ago who claimed that it 
wasn't just the browser that should be unbundled from the kernel, the 
whole networking stack was unfairly bundled and locked out the companies 
that used to provide DOS with networking stacks (the user didn't have in 
mind patching the windows kernel and recompiling, he really thought it 
should all be in user space).  Your kind of fellow.....

It is true that there are many features, such as an automatic text 
indexer, that belong in user space, but the basic indexes (aka 
directories) and index traversal code belong in the kernel.

-- 
Hans



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  7:25 ` Christian Axelsson
@ 2003-10-30  8:10   ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-30  8:10 UTC (permalink / raw)
  To: Christian Axelsson; +Cc: linux-kernel

Christian Axelsson wrote:

> Hans Reiser wrote:
>
>> They are building in support for transactions into the OS.
>>
>> Everything will be in XML.  (It is not so important what format it is 
>> in, as it is that they are going to do it in one format.)
>>
>> Support for browsing versions in the FS.
>>
>> Support for browsing and querying XML their unified format.  Ok, so 
>> SQL sucks, but this is still better than what we offer today in Linux.
>
>
> I think this is better implemented in userspace somehow to keep the 
> kernel away from bloat.
> What do you think of the GNOME Storage project: 
> http://www.gnome.org/~seth/storage/
>
> It's not exactly the same but maybe it's going in that direction.
>
well, I much much prefer www.namesys.com/whitepaper.html ;-)

The reasons why SQL sucks for semi-structured data are described there.

-- 
Hans



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  8:05     ` Hans Reiser
@ 2003-10-30  8:17       ` Wichert Akkerman
  2003-10-30 11:59         ` Hans Reiser
  2003-10-30  9:14       ` Giuliano Pochini
  2003-10-30 17:48       ` Theodore Ts'o
  2 siblings, 1 reply; 83+ messages in thread
From: Wichert Akkerman @ 2003-10-30  8:17 UTC (permalink / raw)
  To: linux-kernel

Previously Hans Reiser wrote:
> It is true that there are many features, such as an automatic text 
> indexer, that belong in user space, but the basic indexes (aka 
> directories) and index traversal code belong in the kernel.

Sure, but if you have a kernel which supports arbitrary extended
attributes for files you don't need much more kernel support. You
can implement things like metadata for files and query languages on
top of that in userspace. If you modify applications to (also) put some
metadata (meta tags from html pages, document properties from office
documents, etc.) in those extended attributes you might already be where
microsoft is going.
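
A minimal Python sketch of what "metadata in extended attributes" looks
like from userspace, assuming Linux and a filesystem mounted with user
xattr support.  The os.setxattr/os.getxattr/os.listxattr calls are the
standard library's Linux-only wrappers; tag_file(), read_tags(), demo() and
the attribute names are made up for the example.

```python
import os
import tempfile

def tag_file(path, **attrs):
    """Store each attribute in the user.* extended-attribute namespace."""
    for name, value in attrs.items():
        os.setxattr(path, "user." + name, value.encode("utf-8"))

def read_tags(path):
    """Return all user.* extended attributes of a file as a dict."""
    prefix = "user."
    return {
        name[len(prefix):]: os.getxattr(path, name).decode("utf-8")
        for name in os.listxattr(path)
        if name.startswith(prefix)
    }

def demo():
    """Tag a scratch file; return None where user xattrs are unavailable."""
    if not hasattr(os, "setxattr"):
        return None  # not Linux
    with tempfile.NamedTemporaryFile() as scratch:
        try:
            tag_file(scratch.name, title="Longhorn notes", author="wichert")
            return read_tags(scratch.name)
        except OSError:
            return None  # filesystem without user.* xattr support

tags = demo()
```

A query layer on top of this would still have to walk the tree or keep its
own index, since the kernel offers no lookup by attribute value.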

You would only need some kernel interaction if you want to keep an
updated index of file contents (dnotify for a whole filesystem and
reindexing whole files instead of blocks doesn't sound very attractive).

Wichert.

-- 
Wichert Akkerman <wichert@wiggy.net>    It is simple to make things.
http://www.wiggy.net/                   It is hard to make things simple.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  7:33     ` Diego Calleja García
@ 2003-10-30  8:43       ` Giuliano Pochini
  0 siblings, 0 replies; 83+ messages in thread
From: Giuliano Pochini @ 2003-10-30  8:43 UTC (permalink / raw)
  To: Diego Calleja García; +Cc: linux-kernel



On Thu, 30 Oct 2003, Diego Calleja [ISO-8859-15] García wrote:

> Another thing that Longhorn *is* doing right is the revamp of the "graphic
> subsystem" aka Avalon, they seem to be trying to catch up with Mac OS X
> in that field.

I have a Mac, but I don't see any advantage in it. It works just like X
and it's not faster. The real problem with X is the lack of drivers.


--
Giuliano.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  8:05     ` Hans Reiser
  2003-10-30  8:17       ` Wichert Akkerman
@ 2003-10-30  9:14       ` Giuliano Pochini
  2003-10-30  9:55         ` Hans Reiser
  2003-10-30 17:48       ` Theodore Ts'o
  2 siblings, 1 reply; 83+ messages in thread
From: Giuliano Pochini @ 2003-10-30  9:14 UTC (permalink / raw)
  To: Hans Reiser; +Cc: linux-kernel



On Thu, 30 Oct 2003, Hans Reiser wrote:

> >We can do this simply by associating UUID's to files, and storing the
> >file metadata in a MySQL database which can be searched via
> >appropriate userspace libraries which we provide.
> >
> What a performance nightmare.  Updating a user space database every time
> a file changes --- let's move to a micro-kernel architecture for all of
> the kernel the same day.....;-)

If applications do not cooperate explicitly, there is no other way than
scanning the files after they have been modified. Sure, it's slow, but
there is no need to do the work immediately. Take into account that MS's
goal is to make the system seem fast to the normal (desktop) user. I
guess that system is aimed at speeding up searches in Word and text files,
not in the whole filesystem. And the normal desktop user writes files
only occasionally, so performance isn't a problem (unless you're copying a
whole CD of Word files onto the HD). I think that intercepting
open,write,close is enough to provide the same functionality in Linux.
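
The "scan later, not immediately" approach might look something like this
Python sketch.  It only models the bookkeeping: the hook that notices a
written file being closed is assumed to exist and is represented by the
on_close() method, and LazyIndexer, run_batch() and the fake file contents
are hypothetical names for illustration.

```python
from collections import defaultdict

class LazyIndexer:
    def __init__(self):
        self.dirty = set()              # paths modified since the last batch
        self.index = defaultdict(set)   # word -> set of paths containing it

    def on_close(self, path):
        """Called by the (assumed) hook when a written file is closed."""
        self.dirty.add(path)

    def run_batch(self, read_file):
        """Reindex everything queued; meant to run later, e.g. when idle."""
        for path in sorted(self.dirty):
            for paths in self.index.values():
                paths.discard(path)     # drop stale entries for this file
            for word in read_file(path).lower().split():
                self.index[word].add(path)
        done = len(self.dirty)
        self.dirty.clear()
        return done

    def search(self, word):
        return sorted(self.index.get(word.lower(), ()))

# Fake file contents standing in for real reads.
files = {"a.txt": "Longhorn preview", "b.txt": "kernel preview"}
indexer = LazyIndexer()
indexer.on_close("a.txt")
indexer.on_close("a.txt")   # repeated writes collapse into one queue entry
indexer.on_close("b.txt")
before = indexer.search("preview")            # nothing indexed yet
processed = indexer.run_batch(files.__getitem__)
after = indexer.search("preview")
```

The cost of the lag is that searches issued before the batch runs miss
recent changes, which is the trade-off being argued for here.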


--
Giuliano.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  2:03     ` Joseph Pingenot
@ 2003-10-30  9:23       ` Ingo Oeser
  0 siblings, 0 replies; 83+ messages in thread
From: Ingo Oeser @ 2003-10-30  9:23 UTC (permalink / raw)
  To: trelane; +Cc: Theodore Ts'o, Erik Andersen, Hans Reiser, linux-kernel

On Thursday 30 October 2003 03:03, Joseph Pingenot wrote:
> Actually, thinking about it, it's ideal to have as a pluggable userspace
>   daemon: on open() or a little after, determine the filetype, and forward
>   interactions to a module/plugin that knows how to deal with that
>   data format.  The plugin then calls some under-process (either back to
>   the daemon or some other thing) to then archive off the information.

Sounds a little bit like the STREAMS interface.

You could also have a wrapper around your own open() which can do more.
But something like what you suggest is already done internally by KDE (and
maybe Gnome too).



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:12           ` Joseph Pingenot
  2003-10-30  4:21             ` Scott Robert Ladd
@ 2003-10-30  9:52             ` Ingo Oeser
  1 sibling, 0 replies; 83+ messages in thread
From: Ingo Oeser @ 2003-10-30  9:52 UTC (permalink / raw)
  To: trelane; +Cc: Alex Belits, Dax Kelson, Hans Reiser, andersen, linux-kernel

On Thursday 30 October 2003 04:12, Joseph Pingenot wrote:
> being in userspace.  The cost is the context switching performance hit and
> the fact that each process that wants to index its stuff must tell the
> filesystem and the indexing service its data (effectively, two writes and a
> completely separate API).  [I'm likely preaching to the choir here, but
> it's good to outline it]

To tell the data, you just have to create a mapping of a file and
pass that somehow. So you get it fresh from the pagecache.

Since the indexer will just read it, this mapping will be even shared,
which means it will not affect disk performance of the other
applications using this file.
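
A small Python sketch of an indexer reading a file through a read-only
mapping rather than through read() copies.  On Unix, Python's mmap creates
a shared mapping by default, so the pages should come from the page cache
as described; count_word() and the scratch file are assumptions made for
the example.

```python
import mmap
import os
import tempfile

def count_word(path, word):
    """Count occurrences of a byte string via a read-only shared mapping."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            count, pos = 0, view.find(word)
            while pos != -1:
                count += 1
                pos = view.find(word, pos + len(word))
            return count

# Scratch file standing in for a document some other application wrote.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"index me, then index me again\n")
hits = count_word(scratch.name, b"index")
os.unlink(scratch.name)
```

How the mapping would be handed to a separate indexing process (the
"pass that somehow" part) is deliberately left out.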

Also notice that what is needed is not one global index but per-user
ones, which must be mergeable with a global one.

Rationale: User A should not know the file contents of user B.

And currently indexing is sloooow. I tried glimpse and htdig and they
ran for several hours just to index the KDE and QT documentation.

This led me to the conclusion that a small keyword generator
(strings?) run after fsync, which stores autogenerated keywords in the
nearest index (per file, per directory, per user, global) might be better.
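
A crude keyword generator in the spirit of strings(1) could be as small as
the Python sketch below.  The minimum run length of four characters, the
frequency ranking, and the keywords() name are arbitrary choices for the
example, and choosing which index receives the result is not modelled.

```python
import re

# A "word" here is a letter followed by at least three more word characters.
PRINTABLE_RUN = re.compile(rb"[A-Za-z][A-Za-z0-9_]{3,}")

def keywords(data, limit=5):
    """Return the most frequent printable runs found in raw bytes."""
    counts = {}
    for match in PRINTABLE_RUN.finditer(data):
        word = match.group().decode("ascii").lower()
        counts[word] = counts.get(word, 0) + 1
    # Most frequent first; ties broken alphabetically for stable output.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:limit]

# Made-up binary blob with some embedded text.
blob = b"\x00\x01kernel\x00index kernel\xff\xfeqt index kernel"
top = keywords(blob, limit=2)
```

Because it never parses the file format, such a generator is cheap enough
to run after every fsync, at the price of noisy keywords.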

Another interesting idea might be using existing indexes by letting
applications define a search handler. This might prove useful for
databases and will allow for unified search. But that might prove to be
quite hard.

Regards

Ingo Oeser



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  9:14       ` Giuliano Pochini
@ 2003-10-30  9:55         ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-30  9:55 UTC (permalink / raw)
  To: Giuliano Pochini; +Cc: linux-kernel

Giuliano Pochini wrote:

>On Thu, 30 Oct 2003, Hans Reiser wrote:
>
>  
>
>>>We can do this simply by associating UUID's to files, and storing the
>>>file metadata in a MySQL database which can be searched via
>>>appropriate userspace libraries which we provide.
>>>
>>>      
>>>
>>What a performance nightmare.  Updating a user space database every time
>>a file changes --- let's move to a micro-kernel architecture for all of
>>the kernel the same day.....;-)
>>    
>>
>
>If applications do not cooperate explicitly, there is no other way than
>scanning the files after they have been modified. Sure, it's slow, but
>there is no need to do the work immediately. Take into account that MS's
>goal is to make the system seem fast to the normal (desktop) user. I
>guess that system is aimed at speeding up searches in Word and text files,
>not in the whole filesystem. And the normal desktop user writes files
>only occasionally, so performance isn't a problem (unless you're copying a
>whole CD of Word files onto the HD). I think that intercepting
>open,write,close is enough to provide the same functionality in Linux.
>
>
>--
>Giuliano.
>
>
>  
>
I was not very articulate here.  I agree that automatic text indexing 
should be done with a lag in batches for performance reasons, rather 
than in the BeFS style.  I also think that it should not be done for all 
files, that the user should have control over what he runs the indexer 
on, and what indexer he likes to run, and what settings it uses, etc.  
In particular, the user should be free to index by hand.

All that said, the indexes themselves should just be feature enhanced 
directories accessed via the kernel.  Feature enhancements might include 
such things as better space efficiency, ordering plugins, etc.
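
Indexing "with a lag in batches" could be sketched as follows. This is only an illustration under stated assumptions: the class `LazyIndexer`, its batch size, and the word-splitting indexer are all made up here, and file contents are held in memory rather than read from disk.

```python
from collections import defaultdict
from typing import Dict, List, Set


class LazyIndexer:
    """Collect changed files and index them later, in batches."""

    def __init__(self, batch_size: int = 3) -> None:
        self.batch_size = batch_size
        self.pending: List[str] = []          # files waiting to be indexed
        self.contents: Dict[str, str] = {}    # stand-in for file contents
        self.index: Dict[str, Set[str]] = defaultdict(set)  # word -> files

    def file_changed(self, path: str, text: str) -> None:
        """Record a change; cheap, and does no indexing work by itself."""
        self.contents[path] = text
        if path not in self.pending:
            self.pending.append(path)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Index everything that is pending in one batch."""
        for path in self.pending:
            for files in self.index.values():
                files.discard(path)           # drop stale entries first
            for word in self.contents[path].lower().split():
                self.index[word].add(path)
        self.pending.clear()
```

Because `flush()` is an ordinary method, the user can also run it by hand, which matches the point that the user should stay in control of when and what gets indexed.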

-- 
Hans



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:34           ` Joseph Pingenot
  2003-10-30  2:54             ` Bernd Eckenfels
  2003-10-30  3:16             ` Neil Brown
@ 2003-10-30 10:27             ` Thorsten Körner
  2003-10-30 21:28             ` jlnance
  3 siblings, 0 replies; 83+ messages in thread
From: Thorsten Körner @ 2003-10-30 10:27 UTC (permalink / raw)
  To: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Joseph
On Thursday, 30 October 2003 02:34, Joseph Pingenot wrote:
> From Neil Brown on Thursday, 30 October, 2003:
> >On Wednesday October 29, trelane@digitasaru.net wrote:
> >> Regardless, it's an interesting idea, and one which might be
> >> fruitful. I give you then two bits: our treatment of the tech
> >> and the reality of their tech:
> >> 00: ISVAPOR | TAKESERIOUSLY
> >> 01: ISVAPOR | IGNORE
> >> 10: NOTVAPOR | TAKESERIOUSLY
> >> 11: NOTVAPOR | IGNORE
> >> If we come up with a working implementation and it *is* just
> >> vaporware, then we're ahead.
> >> We're way ahead.
> >> If we merely dismiss it as vaporware and it turns out to be,
> >> no net change.
> >
> >...snip...
> >
> >> Conclusion: best to take it seriously and work on it; those
> >> two cases are the most optimal.
> >
> >Sounds like the same argument that is used in "Pascal's Wager"
> > for belief in God, and I seriously don't think the argument
> > works in either case.  (note that I'm not making a statement
> > about the conclusion in either case, only about the argument).
>
> "Sounds like?"  Sure.  It kind of does, now that you mention it.
>
> Regardless of the similarities and the validity of Pascal's
> argument, my argument, I think, stands.  I outlined the four
> potential futures.  We have control over only one bit, Microsoft
> has the other.  The tech sounds nice, it is an interesting avenue
> to pursue, Pascal aside.
You are absolutely right. It seems to be more than just an
interesting tech. It sounds like a really outstanding idea. And
no one should say it's a bad idea just because it's from Microsoft (tm).
They have had many good ideas (also many bad implementations of
them).
>
> I don't see any reason why we *shouldn't* look at the problem and
> try to do it.  What reasons do you see for not pursuing the
> problem to its inevitable implementation?
ACK

CU

Thorsten

- --
Thorsten Körner			|  e-Commerce-Consulting
Dannenkoppel 51 		|  D-22391 Hamburg
Tel.: 040/536 308 27		|  Fax.: 040/536 308 26
mailto:t.koerner@123tk.com	|  http://www.123tk.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/oOd0s5R35vLkl/cRAp20AJ9PmBjKqnX5f8l9CaQZGGPBEa0vSQCgqEP3
O+OPB2lVef6BGSc3Ay672Ew=
=sZjd
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:52   ` Theodore Ts'o
                       ` (3 preceding siblings ...)
  2003-10-30  8:05     ` Hans Reiser
@ 2003-10-30 11:21     ` Felipe Alfaro Solana
  4 siblings, 0 replies; 83+ messages in thread
From: Felipe Alfaro Solana @ 2003-10-30 11:21 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Erik Andersen, Hans Reiser, Linux Kernel Mailinglist

On Thu, 2003-10-30 at 02:52, Theodore Ts'o wrote:
> Keep in mind that just because Windows does things a certain way
> doesn't mean we have to provide the same functionality in exactly the
> same way.
> 
> Also keep in mind that Microsoft very deliberately blurs what they do
> in their "kernel" versus what they provide via system libraries (i.e.,
> API's provided via their DLL's, or shared libraries).
> 
> At some level what they have done can be very easily replicated by
> having a userspace database which is tied to the filesystem so you can
> do select statements to search on metadata associated with files.  We
> can do this simply by associating UUID's to files, and storing the
> file metadata in a MySQL database which can be searched via
> appropriate userspace libraries which we provide.
> 
> Please do **not** assume that just because of the vaporware press
> releases released by Microsoft that (a) they have pushed an SQL Query
> optimizer into the kernel, or that (b) even if they did, we should
> follow their bad example and attempt to do the same.  
> 
> There are multiple ways of skinning this particular cat, and we don't
> need to blindly follow Microsoft's design mistakes.
> 
> Fortunately, I have enough faith in Linus Torvalds' taste that I'm not
> particularly worried what would happen if someone were to send him a
> patch that attempted to cram MySQL or Postgres into the guts of the
> Linux kernel....  although I would like to watch when someone proposes
> such a thing!

In fact, the GNOME taskforce is already working on something much like
the much-touted-but-nearly-inexistent WinFS. It's all done in userspace:

http://www.gnome.org/~seth/storage/index.html
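
The userspace scheme quoted above (a UUID per file plus a searchable metadata database) can be sketched in a few lines. This is a rough illustration, not the design of GNOME Storage or WinFS: SQLite, through Python's `sqlite3` module, stands in for the MySQL server, and the table layout, function names and sample paths are all assumptions.

```python
import sqlite3
import uuid

# In-memory SQLite stands in for the MySQL server mentioned in the quote.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (uuid TEXT PRIMARY KEY, path TEXT NOT NULL)")
db.execute(
    "CREATE TABLE metadata ("
    " uuid TEXT NOT NULL REFERENCES files(uuid),"
    " attr TEXT NOT NULL,"
    " val TEXT NOT NULL)"
)


def add_file(path: str, **attrs: str) -> str:
    """Assign a UUID to a file and store its metadata as attr/val rows."""
    file_id = str(uuid.uuid4())
    db.execute("INSERT INTO files VALUES (?, ?)", (file_id, path))
    db.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [(file_id, k, v) for k, v in attrs.items()],
    )
    return file_id


def find(attr: str, val: str) -> list:
    """Return the paths of all files whose metadata has attr == val."""
    rows = db.execute(
        "SELECT f.path FROM files f JOIN metadata m ON m.uuid = f.uuid"
        " WHERE m.attr = ? AND m.val = ? ORDER BY f.path",
        (attr, val),
    )
    return [path for (path,) in rows]


# Invented sample data.
add_file("/home/ted/report.sxw", author="ted", kind="document")
add_file("/home/ted/notes.txt", author="ted", kind="text")
add_file("/home/hans/v4.html", author="hans", kind="document")
```

What the sketch leaves open is exactly the point argued in this thread: how the database learns that a file was created, renamed or removed.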


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  8:17       ` Wichert Akkerman
@ 2003-10-30 11:59         ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-30 11:59 UTC (permalink / raw)
  To: Wichert Akkerman; +Cc: linux-kernel

You don't need extended attributes, you just need files and 
directories....  www.namesys.com/v4.html says more.

Hans

Wichert Akkerman wrote:

>Previously Hans Reiser wrote:
>  
>
>>It is true that there are many features, such as an automatic text 
>>indexer, that belong in user space, but the basic indexes (aka 
>>directories) and index traversal code belong in the kernel.
>>    
>>
>
>Sure, but if you have a kernel which supports arbitrary extended
>attributes for files you don't need much more kernel support. You
>can implement things like metadata for files and query languages on
>top of that in userspace. If you modify applications to (also) put some
>metadata (meta tags from html pages, document properties from office
>documents, etc.) in those extended attributes you might already be where
>microsoft is going.
>
>You only would need some kernel interaction if you want to keep an
>updated index of file contents (dnotify for a whole filesystem and
>reindexing whole files instead of blocks doesn't sound very attractive).
>
>Wichert.
>
>  
>
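
For reference, the extended-attribute approach Wichert describes might be used from an application as sketched below. `os.setxattr` and `os.getxattr` are real Python calls but exist only on Linux, and the filesystem must allow `user.*` attributes; the helper names and the attribute name `html.title` are invented for this example.

```python
import os
import tempfile
from typing import Optional


def tag_file(path: str, name: str, value: str) -> bool:
    """Store one metadata item in a user.* extended attribute.

    Returns False when the platform or filesystem does not support it.
    """
    if not hasattr(os, "setxattr"):   # os.setxattr is Linux-only
        return False
    try:
        os.setxattr(path, f"user.{name}", value.encode("utf-8"))
        return True
    except OSError:                   # e.g. filesystem lacks user xattrs
        return False


def read_tag(path: str, name: str) -> Optional[str]:
    """Read a metadata item back, or None if it is absent or unsupported."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, f"user.{name}").decode("utf-8")
    except OSError:
        return None


# Tag a scratch file the way an HTML editor might store a page title.
with tempfile.NamedTemporaryFile(suffix=".html") as page:
    stored = tag_file(page.name, "html.title", "Reiser4 overview")
    title = read_tag(page.name, "html.title")
```

Querying across many files would still need an index built on top of these attributes, which is the part the two posters disagree about.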


-- 
Hans



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:57     ` Scott Robert Ladd
  2003-10-30  4:08       ` Larry McVoy
@ 2003-10-30 13:46       ` Jesse Pollard
  2003-10-31  4:50       ` Stephen Satchell
  2 siblings, 0 replies; 83+ messages in thread
From: Jesse Pollard @ 2003-10-30 13:46 UTC (permalink / raw)
  To: Scott Robert Ladd, Theodore Ts'o
  Cc: Erik Andersen, Hans Reiser, linux-kernel

On Wednesday 29 October 2003 21:57, Scott Robert Ladd wrote:
> Theodore Ts'o wrote:
> > Keep in mind that just because Windows does things a certain way
> > doesn't mean we have to provide the same functionality in exactly the
> > same way.
>
> Very true. Linux is best defined by those who proactively implement
> powerful ideas.
>
> That doesn't mean, however, that the folks in Redmond can't come up with
> an interesting and useful idea that we might just want to consider.
>
> > Also keep in mind that Microsoft very deliberately blurs what they do
> > in their "kernel" versus what they provide via system libraries
> > (i.e., API's provided via their DLL's, or shared libraries).
>
> Any database-style file system should be implemented in a modular
> fashion, just like current Linux file systems.
>
[snip]
> > Fortunately, I have enough faith in Linus Torvalds' taste that I'm
> > not particularly worried what would happen if someone were to send
> > him a patch that attempted to cram MySQL or Postgres into the guts of
> > the Linux kernel....  although I would like to watch when someone
> > proposes such a thing!
>
> MySQL wouldn't need to be shoved into the kernel; a small, fast database
> engine (one of my professional specialities, BTW) could provide metadata
> services in a file system module. SQL is a bloated pig; an effective
> file system needs to be both useful and efficient, leading me to think
> that we should consider a more succinct query mechanism for any
> metadata-based file system.

Umm... keep in mind that every system that has an in-kernel database system
has tanked. Remember PIC systems? How about MUMPS?

The closest thing to this that hasn't died (quite?) has been the VMS
datatrieve facility. But even that wasn't in the kernel proper, it was
a layered facility added to the DCL user API that was accessible to
applications. It basically provided multiple ISAM support to read/write.
And no, the files were not portable... they had to be converted to normal
RMS files/stream files first; and you lost the keys when you did.

The problem with the database system (anywhere) is that it is absolutely
horrible for I/O throughput. Having to reference schemas, multiple key
hashing, even key identification all takes multiple I/O operations to do.
Not to mention the duplications caused by having to store the results in
addition to storing the raw data.

And you no longer get to do a simple "read" of data. You have to completely
drop the concept of "data stream" and data portability. If you DO keep the
original semantics of a file, then you double/triple the data I/O (once for
the data, once for the keys, and once for the correlation/compound keys).
Then you have to deal with the huge amount of metadata that maintains the
above. On top of that, you also have to include a general purpose locking
facility (NOT advisory either) or you WILL get a corrupted data file with
only one writer (do a "tail -f" on a file, and the file gets corrupted during
updates to the keys or base data).

The last time I saw this amount of crap/problems was with a Clearcase file
system (couldn't even back the system up).

If you REALLY want to test this, make it a user mode NFS server, and mount
it through a loopback.

I think it would be more useful to have file migration support added to the
current metadata (extra index, extra modes, compressed data, compressed date,
migrated date, unmigrated date, migration expiration date... possible with
XFS maybe. and only a little more changes needed...).

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  8:05     ` Hans Reiser
  2003-10-30  8:17       ` Wichert Akkerman
  2003-10-30  9:14       ` Giuliano Pochini
@ 2003-10-30 17:48       ` Theodore Ts'o
  2003-10-30 19:23         ` Hans Reiser
  2003-10-31 11:01         ` Kenneth Johansson
  2 siblings, 2 replies; 83+ messages in thread
From: Theodore Ts'o @ 2003-10-30 17:48 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Erik Andersen, linux-kernel

On Thu, Oct 30, 2003 at 11:05:05AM +0300, Hans Reiser wrote:
> What a performance nightmare.  Updating a user space database every time 
> a file changes --- let's move to a micro-kernel architecture for all of 
> the kernel the same day.....;-)

Nope, the user space database only needs to change when the file
metadata changes.

> Not to mention that SQL is utterly unsuited for semi-structured data 
> queries (what people store in filesystems is semi-structured data), and 
> would only be effective for those fields that you require every file to 
> have.

Your assumption here is that the only thing that people search and
index on is semi-structured data.  While this might be interesting for
text-based data, in fact, the problem space which WinFS has been
addressing isn't necessarily text files.  For example, one of the
examples given in the WinFS paper is the scenario where the user has a
large number of digital photographs, where some of the metadata might
be extracted from the EXIF headers, and some might be inserted by the
user him/herself (for example, the list of names of the people in the
picture, or the subject matter of the photograph: flowers, mountains,
etc --- the latter being very important for professional or
semi-professional stock photographers).  Such information is in fact
very structured, and is much more likely to stay constant even when
the file is modified.

In addition, even for text-based files, in the future, files will very
likely not be straight ASCII, but some kind of rich text based format
with formatting, unicode, etc.  And even general, unstructured
text-based indexing is hard enough that putting that into the kernel
is just as bad as putting an SQL optimizer into the kernel.  That I
would claim would have to be done in userspace, as part of the
overhead when OpenOffice saves the file.  (Note that some of the
Linux-based office suites store files as gzip'ed XML files, which
again argues that putting it in the kernel is insane --- why should we
compress the file, only to have the kernel uncompress it and then
re-parse the XML just so they can index it?  Much better to have
OpenOffice do the indexing while it has the uncompressed, parsed out
text tree in memory.  And if the indexes need to be updated in
userspace, then life is much, much, much simpler if the lookups are
also done in userspace --- especially when complex SQL query
optimizations may be required.)

> How about you send him a patch that removes all of that networking stuff 
> from the kernel and puts it into user space where it belongs.;-)  There 
> was this Windows user on Slashdot some time ago who claimed that it 
> wasn't just the browser that should be unbundled from the kernel, the 
> whole networking stack was unfairly bundled and locked out the companies 
> that used to provide DOS with networking stacks (the user didn't have in 
> mind patching the windows kernel and recompiling, he really thought it 
> should all be in user space).  Your kind of fellow.....

Networking has definite performance requirements on a per-packet basis
which requires that it be in the kernel.  Given that indexing happens
rarely (i.e., only when a file is saved), the same arguments simply
don't apply.  If you consider how often a user is going to ask the
question, "Give me a list of all photographs taken between June 10,
1993 and July 24, 1996 which contains Mary Schmidt as a subject and
whose resolution is at least 150 dpi", it definitely demonstrates why
this doesn't need to be in the kernel.
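
For illustration, the photograph query above maps directly onto a userspace SQL engine. The sketch below assumes SQLite (via Python's `sqlite3`) as a stand-in for whatever database would be used; the schema, the file names and the idea of storing one dpi value per photo are all invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE photos (
        id INTEGER PRIMARY KEY,
        path TEXT NOT NULL,
        taken TEXT NOT NULL,       -- ISO date, e.g. from the EXIF header
        dpi INTEGER NOT NULL
    );
    CREATE TABLE subjects (        -- names entered by the user
        photo_id INTEGER NOT NULL REFERENCES photos(id),
        name TEXT NOT NULL
    );
    INSERT INTO photos VALUES
        (1, 'beach.jpg',   '1994-08-02', 300),
        (2, 'wedding.jpg', '1995-05-20', 72),
        (3, 'hike.jpg',    '1997-01-15', 300),
        (4, 'garden.jpg',  '1993-06-10', 150);
    INSERT INTO subjects VALUES
        (1, 'Mary Schmidt'), (1, 'Tom'),
        (2, 'Mary Schmidt'),
        (3, 'Mary Schmidt'),
        (4, 'Mary Schmidt');
""")

# Photos of one subject, within a date range, at or above a resolution.
QUERY = """
    SELECT p.path
    FROM photos p JOIN subjects s ON s.photo_id = p.id
    WHERE s.name = ?
      AND p.taken BETWEEN ? AND ?
      AND p.dpi >= ?
    ORDER BY p.taken
"""
matches = [path for (path,) in db.execute(
    QUERY, ("Mary Schmidt", "1993-06-10", "1996-07-24", 150))]
```

Here `matches` holds the two photos that satisfy all three conditions; the low-resolution and out-of-range ones are filtered out by the database, not by the filesystem.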

If you consider the amount of data that needs to be shovelled back and
forth between the kernel's network device driver to a userspace
networking stack and then back down into the kernel to the socket
layer when processing a TCP connection over a 10 gigabit Ethernet
link, it's clear why it has to be in the kernel.  When you consider
how much data needs to be referenced when doing indexing, and in fact
that it may exist in uncompressed form only in the userspace
application, you'll see why indeed it's better to do it in userspace.

The bottom line is that if a case can be made that some portion of the
functionality required by WinFS needs to be in the kernel, and in the
filesystem layer specifically, I'm all in favor of it.  But it has to
be justified.  To date, I haven't seen a justification for why the
database processing aspect of things needs to be in the kernel.

						- Ted

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 17:48       ` Theodore Ts'o
@ 2003-10-30 19:23         ` Hans Reiser
  2003-10-30 20:31           ` Theodore Ts'o
  2003-10-31 11:01         ` Kenneth Johansson
  1 sibling, 1 reply; 83+ messages in thread
From: Hans Reiser @ 2003-10-30 19:23 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Erik Andersen, linux-kernel

Theodore Ts'o wrote:

>On Thu, Oct 30, 2003 at 11:05:05AM +0300, Hans Reiser wrote:
>  
>
>>What a performance nightmare.  Updating a user space database every time 
>>a file changes --- let's move to a micro-kernel architecture for all of 
>>the kernel the same day.....;-)
>>    
>>
>
>Nope, the user space database only needs to change when the file
>metadata changes.
>  
>
Do you mean like it does with every file create?

>  
>
>>Not to mention that SQL is utterly unsuited for semi-structured data 
>>queries (what people store in filesystems is semi-structured data), and 
>>would only be effective for those fields that you require every file to 
>>have.
>>    
>>
>
>Your assumption here is that the only thing that people search and
>index on is semi-structured data.
>
No, my assumption is that structured data is a special case of 
semi-structured data, and should be modeled that way.

>
>In addition, even for text-based files, in the future, files will very
>likely not be straight ASCII, but some kind of rich text based format
>with formatting, unicode, etc.
>
Formatting does not make text table structured.

>  And even general, unstructured
>text-based indexing is hard enough that putting that into the kernel
>is just as bad as putting an SQL optimizer into the kernel.
>
Well, since I don't think that SQL belongs in the filesystem, and I 
think that text indexing should be done by users choosing how to index 
their text, including choosing whether to use an automatic indexer or do 
it by hand, and I think that the automatic indexer probably belongs in 
user space (I could be wrong, but I would at least choose to do version 
1 of such a thing in user space, perhaps using a language other than C), 
I have to say that we are agreeing here.  Surely it is an accident, but 
oh well.;-)

>  That I
>would claim would have to be done in userspace, as part of the
>overhead when OpenOffice saves the file.  (Note that some of the
>Linux-based office suites store files as gzip'ed XML files, which
>again argues that putting it in the kernel is insane --- why should we
>compress the file, only to have the kernel uncompress it and then
>re-parse the XML just so they can index it?  Much better to have
>OpenOffice do the indexing while it has the uncompressed, parsed out
>text tree in memory.  And if the indexes need to be updated in
>userspace, then life is much, much, much simpler if the lookups are
>also done in userspace --- especially when complex SQL query
>optimizations may be required.)
>  
>
Well I agree.

You are missing my argument.  I am saying that the indexes and name 
space belong in the kernel, not that the auto-indexer belongs in the kernel.

>  
>
>>How about you send him a patch that removes all of that networking stuff 
>>from the kernel and puts it into user space where it belongs.;-)  There 
>>was this Windows user on Slashdot some time ago who claimed that it 
>>wasn't just the browser that should be unbundled from the kernel, the 
>>whole networking stack was unfairly bundled and locked out the companies 
>>that used to provide DOS with networking stacks (the user didn't have in 
>>mind patching the windows kernel and recompiling, he really thought it 
>>should all be in user space).  Your kind of fellow.....
>>    
>>
>
>Networking has definite performance requirements on a per-packet basis
>which requires that it be in the kernel.  Given that indexing happens
>rarely (i.e., only when a file is saved), the same arguments simply
>don't apply.  If you consider how often a user is going to ask the
>question, "Give me a list of all photographs taken between June 10,
>1993 and July 24, 1996 which contains Mary Schmidt as a subject and
>whose resolution is at least 150 dpi",
>
uh, all the time, if there is a namespace that lets him.  How often do 
you use google?  How often do you memorize the primary key of an object 
in a relational database, and use only that versus how often do you do a 
richer query?

> it definitely demonstrates why
>this doesn't need to be in the kernel.
>
>If you consider the amount of data that needs to be shovelled back and
>forth between the kernel's network device driver to a userspace
>networking stack and then back down into the kernel to the socket
>layer when processing a TCP connection over a 10 gigabit Ethernet
>link, it's clear why it has to be in the kernel. 
>
> When you consider
>how much data needs to be referenced when doing indexing, and in fact
>that it may exist in uncompressed form only in the userspace
>application, you'll see why indeed it's better to do it in userspace.
>
>The bottom line is that if a case can be made that some portion of the
>functionality required by WinFS needs to be in the kernel, and in the
>filesystem layer specifically, I'm all in favor of it.  But it has to
>be justified.  To date, I haven't seen a justification for why the
>database processing aspect of things needs to be in the kernel.
>
>						- Ted
>
>
>  
>
In general, arguments over whether functionality belongs in the kernel 
or a userspace library are not as easy as you tend to suggest.  I think 
you are a bit inclined to assume that what Unix does today is the right 
thing for 2006.  The kernel is going to grow at probably roughly the 
same rate that computer horsepower grows, and the 30 year trend of 
putting more and more into the kernel will continue.

Most filesystem namespace functionality belongs in the kernel because 
subnames tend to invoke the functionality of other subnames when one 
creates a richly compounding filesystem name space.  There are however 
exceptions to this.  I would put directory lookup in the kernel.  I 
would put vicinity set intersection in the kernel.  I would put set 
difference in the kernel.  I would put set union in the kernel.  I would 
put inheritance in the kernel.  I would generally continue to put namei 
in the kernel.  Maybe macro expansion belongs in user space libraries, I 
haven't thought enough about it to say.  Probably the main reason I 
don't want the auto-indexer in the kernel is irrational: I don't want to 
design it, I want to see a lot of experiments, and I think the 
psychological barriers to entry are lower for user space experiments.  
Other valid reasons might be that string processing tools are richer in 
user space, and sys_reiser4() will provide efficient batch operations 
that will overcome most of the pain of context switches to the kernel 
for each index update.
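
To make the set operations concrete, here is a toy model in which each directory is an index mapping a name to a set of objects. It is a sketch only: it uses plain set intersection rather than the "vicinity" semantics mentioned above, it does not model sys_reiser4() or inheritance, and every directory and file name is invented.

```python
from typing import Dict, Set

# Each "directory" is an index: a name mapped to the set of objects it lists.
# The object identifiers and directory names are made up for illustration.
directories: Dict[str, Set[str]] = {
    "author/hans": {"v4.html", "whitepaper.html", "todo.txt"},
    "author/ted": {"ext3-notes.txt"},
    "type/html": {"v4.html", "whitepaper.html", "index.html"},
    "topic/reiser4": {"v4.html", "todo.txt"},
}


def lookup(name: str) -> Set[str]:
    """Plain directory lookup: return the members of one index."""
    return directories.get(name, set())


def intersect(*names: str) -> Set[str]:
    """Objects listed in every one of the named indexes."""
    sets = [lookup(n) for n in names]
    return set.intersection(*sets) if sets else set()


def union(*names: str) -> Set[str]:
    """Objects listed in at least one of the named indexes."""
    return set().union(*(lookup(n) for n in names))


def difference(name: str, *excluded: str) -> Set[str]:
    """Objects in the first index that appear in none of the others."""
    return lookup(name) - union(*excluded)
```

For example, `intersect("author/hans", "type/html")` names the HTML files written by hans without anyone having built a dedicated index for that combination.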

-- 
Hans



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 19:23         ` Hans Reiser
@ 2003-10-30 20:31           ` Theodore Ts'o
  2003-10-31  7:40             ` Hans Reiser
  0 siblings, 1 reply; 83+ messages in thread
From: Theodore Ts'o @ 2003-10-30 20:31 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Erik Andersen, linux-kernel

On Thu, Oct 30, 2003 at 10:23:49PM +0300, Hans Reiser wrote:
> >Your assumption here is that the only thing that people search and
> >index on is semi-structured data.
> >
> No, my assumption is that structured data is a special case of 
> semi-structured data, and should be modeled that way.

There are much more powerful ways of handling structured data (as
opposed to generalized text searches).  What WinFS is specifically
addressing is searching and selecting based on structured data.  

> >In addition, even for text-based files, in the future, files will very
> >likely not be straight ASCII, but some kind of rich text based format
> >with formatting, unicode, etc.
> >
> Formatting does not make text table structured.

No, but it means that doing searches on formatted text is very
difficult, and should be done in userspace, not kernel space.

> You are missing my argument.  I am saying that the indexes and name 
> space belong in the kernel, not that the auto-indexer belongs in the kernel.

Searching and name spaces are different things.  Fundamentally I
disagree with your belief that they are the same thing (and yes I've
read your whitepaper on the namesys web page).  You can do much, much
more powerful select statements than makes sense to do via the
directory abstraction.  (Think about arbitrary select statements,
possibly with subselect statements.  That's what Microsoft is
promising in WinFS.  Do you really want to support an opendir system
call where its argument is an arbitrary SQL select statement?  I
didn't think so.)

There is a very, very big difference between a search and a pathname,
which is guaranteed to refer to a single unique file, such as might be
used in a Makefile.  This is what most people consider a real namespace.
When addressing people, a passport number, or a driver's license
number, or a social security number, are all examples of a namespace.
Each one of these is guaranteed to return either no result, or a
single specific person.  

In contrast, consider searching for someone who is male, between 30
and 40, is named Tom, and lived in Libertyville, Illinois sometime
between 1960 and 1970, and is married to someone named Mary who was
born in California.  This might return several people, and most people
would **NOT** consider the space of all queries about people to be a
"name space".  Searches are not names.  They do not uniquely identify
people or objects, which is a fundamental requirement of a name.

We can create a filesystem with a directory indexed by social security
number, and another directory with hard links that indexes people's
records by driver's ID.  That makes sense.  But putting in sufficient
indexes so that the above query of looking for someone named Tom who is
married to someone named Mary (and this is an example where a query
optimizer would be needed) is simple, pure insanity.
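
The distinction between a name and a search can be shown with ordinary directories and hard links. The sketch below is illustrative only: the directory names, record format and all identification numbers are invented, and `search` is a naive scan rather than anything resembling a query optimizer.

```python
import os
import tempfile

root = tempfile.mkdtemp()
by_ssn = os.path.join(root, "by-ssn")
by_license = os.path.join(root, "by-license")
os.mkdir(by_ssn)
os.mkdir(by_license)


def add_person(ssn: str, license_no: str, **fields: str) -> None:
    """Store one record and make it reachable under two unique names."""
    record = os.path.join(by_ssn, ssn)
    with open(record, "w") as f:
        for key, value in fields.items():
            f.write(f"{key}={value}\n")
    os.link(record, os.path.join(by_license, license_no))  # hard link


def lookup(directory: str, key: str):
    """A name resolves to exactly one file, or to nothing at all."""
    path = os.path.join(directory, key)
    return path if os.path.exists(path) else None


def search(directory: str, field: str, value: str):
    """A search scans the records and may match any number of them."""
    hits = []
    for entry in sorted(os.listdir(directory)):
        with open(os.path.join(directory, entry)) as f:
            if f"{field}={value}" in f.read().splitlines():
                hits.append(entry)
    return hits


# Invented sample records.
add_person("000-00-0001", "L-1001", name="Tom", city="Libertyville")
add_person("000-00-0002", "L-1002", name="Tom", city="Chicago")
add_person("000-00-0003", "L-1003", name="Mary", city="Sacramento")
```

`lookup(by_ssn, "000-00-0001")` and `lookup(by_license, "L-1001")` reach the same file, while `search(by_ssn, "name", "Tom")` returns two entries, which is why a search result cannot serve as a name.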

> uh, all the time, if there is a namespace that lets him.  How often do 
> you use google?  How often do you memorize the primary key of an object 
> in a relational database, and use only that versus how often do you do a 
> richer query?

I use google dozens of times a day.  I type commands to bash hundreds
of times a day.  Does that mean that bash command line parsing should
be in the kernel?  Of course not!

The bottom line is that for something that happens dozens or even
hundreds of times a day, that's an argument that it *shouldn't* be
done in the kernel.  Compare and contrast that with handling incoming
network packets, which can happen millions of times per hour.
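
A back-of-the-envelope calculation shows the size of the gap. The figures are assumptions chosen for illustration: a saturated 10 gigabit link carrying full-size 1500-byte frames, and 200 queries per day standing in for "dozens or even hundreds".

```python
# Rough event rates; the frame size and the query count are assumptions.
LINK_BITS_PER_SECOND = 10 * 10**9      # 10 gigabit Ethernet
FRAME_BYTES = 1500                     # typical full-size Ethernet payload
QUERIES_PER_DAY = 200                  # "dozens or even hundreds"

# Packets per second on a saturated link (whole packets only).
packets_per_second = LINK_BITS_PER_SECOND // (FRAME_BYTES * 8)

# Queries per second, averaged over a whole day.
queries_per_second = QUERIES_PER_DAY / (24 * 60 * 60)

# How many packets arrive for every query that is issued.
ratio = packets_per_second / queries_per_second
```

Under these assumptions the packet rate exceeds the query rate by more than eight orders of magnitude, so per-event overhead matters in one case and is negligible in the other.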

						- Ted

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  1:34           ` Joseph Pingenot
                               ` (2 preceding siblings ...)
  2003-10-30 10:27             ` Thorsten Körner
@ 2003-10-30 21:28             ` jlnance
  2003-10-30 22:29               ` Måns Rullgård
  2003-10-31  1:04               ` Clemens Schwaighofer
  3 siblings, 2 replies; 83+ messages in thread
From: jlnance @ 2003-10-30 21:28 UTC (permalink / raw)
  To: linux-kernel

On Wed, Oct 29, 2003 at 07:34:19PM -0600, Joseph Pingenot wrote:

> I don't see any reason why we *shouldn't* look at the problem and try to
>   do it.  What reasons do you see for not pursuing the problem to its
>   inevitable implementation?

Well I am not qualified to comment on this particular technology, I have
not investigated it enough to understand what it is trying to accomplish.

But there is a danger in letting MS marketing drive the direction of our
OS.  Particularly since, as you note in another post, working on project
X means that project Y suffers due to lack of developer attention.  I
don't think that the MS idea of what a computer should look like bears
much resemblance to what I want mine to look like.  Which is why I started
using Linux.

Unfortunately, we probably don't really have a choice.  MS has enough market
share that we must emulate not only their good, but even their bad ideas
if we want Linux to be used by people other than those who develop it.

my $0.02

Jim

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 21:28             ` jlnance
@ 2003-10-30 22:29               ` Måns Rullgård
  2003-10-31  2:03                 ` Daniel B.
  2003-10-31  1:04               ` Clemens Schwaighofer
  1 sibling, 1 reply; 83+ messages in thread
From: Måns Rullgård @ 2003-10-30 22:29 UTC (permalink / raw)
  To: linux-kernel

jlnance@unity.ncsu.edu writes:

> Unfortunately, we probably don't really have a choice.  MS has enough market
> share that we must emulate not only their good, but even their bad ideas
> if we want Linux to be used by people other than those who develop it.

Well, do we, necessarily?  Is the goal with Linux to get as many users
as possible, or to create the best OS possible?  I was hoping for the
latter.

-- 
Måns Rullgård
mru@kth.se


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 21:28             ` jlnance
  2003-10-30 22:29               ` Måns Rullgård
@ 2003-10-31  1:04               ` Clemens Schwaighofer
  1 sibling, 0 replies; 83+ messages in thread
From: Clemens Schwaighofer @ 2003-10-31  1:04 UTC (permalink / raw)
  To: jlnance; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

jlnance@unity.ncsu.edu wrote:

> Unfortunately, we probably don't really have a choice.  MS has enough market
> share that we must emulate not only their good, but even their bad ideas
> if we want Linux to be used by people other than those who develop it.

*Wrong*, we do have a choice. Why do _we_ use Linux? Why do _I_ use
Linux? Because it gives me things and possibilities I don't have
with M$, like the possibility to choose. If we copy everything M$ does,
we don't need to develop Linux further. Plus, most of these cool Windows
things are definitely not part of the kernel anyway, e.g. this WinFS:
100% a user space thing with a DB.

- --
Clemens Schwaighofer - IT Engineer & System Administration
==========================================================
Tequila Japan, 6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN
Tel: +81-(0)3-3545-7703            Fax: +81-(0)3-3545-7343
http://www.tequila.jp
==========================================================
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE/obUijBz/yQjBxz8RAmHhAKDHnfnCFfpOuwtElbYMueERLkrMSwCg2a7D
GlI2KYq8CL3pZnaZnJn9iuY=
=OaXf
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 22:29               ` Måns Rullgård
@ 2003-10-31  2:03                 ` Daniel B.
  0 siblings, 0 replies; 83+ messages in thread
From: Daniel B. @ 2003-10-31  2:03 UTC (permalink / raw)
  Cc: linux-kernel

Måns Rullgård wrote:
> 
> jlnance@unity.ncsu.edu writes:
> 
> > Unfortunately, we probably don't really have a choice.  MS has enough market
> > share that we must emulate not only their good, but even their bad ideas
> > if we want Linux to be used by people other than those who develop it.
> 
> Well, do we, necessarily?  Is the goal with Linux to get as many users
> as possible, or to create the best OS possible?  I was hoping for the
> latter.

How about using usefulness as a measure?  (Well, yeah, then the question
is "useful to whom"?)  


Daniel
-- 
Daniel Barclay
dsb@smart.net

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30  3:57     ` Scott Robert Ladd
  2003-10-30  4:08       ` Larry McVoy
  2003-10-30 13:46       ` Jesse Pollard
@ 2003-10-31  4:50       ` Stephen Satchell
  2 siblings, 0 replies; 83+ messages in thread
From: Stephen Satchell @ 2003-10-31  4:50 UTC (permalink / raw)
  To: Scott Robert Ladd, Theodore Ts'o
  Cc: Erik Andersen, Hans Reiser, linux-kernel

At 10:57 PM 10/29/2003 -0500, Scott Robert Ladd wrote:
>MySQL wouldn't need to be shoved into the kernel; a small, fast database 
>engine (one of my professional specialities, BTW) could provide metadata 
>services in a file system module. SQL is a bloated pig; an effective file 
>system needs to be both useful and efficient, leading me to think that we 
>should consider a more succinct query mechanism for any metadata-based 
>file system.

I might add that Microsoft didn't invent the 
database-in-the-operating-system concept -- IBM has had that concept for 
years with the AS/400 line.  Indeed, the announcement by Microsoft sounds 
like MS is really courting the business marketplace; perhaps it is willing 
to give up some markets to go after what it may view as its core 
customers.  How many science/engineering types willingly use Windows as 
other than a window (pardon the pun) to a Unix-type system and as an 
office-productivity tool?

One way Microsoft could "beat Linux" is to bolster its integration with 
business mainframes, providing data layering tools designed for business 
applications "out of the box" -- a task that Linux itself (the operating 
system) could never, and should never, do.  If some company or group wants 
to put in the skull sweat and build a distribution that would do the deed, 
with kernel mods and userland support, more power to them.


-- 
"People who seem to have had a new idea have often just stopped having an 
old idea." -- Dr. Edwin H. Land  



* Re: Things that Longhorn seems to be doing right
  2003-10-30 20:31           ` Theodore Ts'o
@ 2003-10-31  7:40             ` Hans Reiser
  2003-10-31 19:30               ` Theodore Ts'o
  0 siblings, 1 reply; 83+ messages in thread
From: Hans Reiser @ 2003-10-31  7:40 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Erik Andersen, linux-kernel

Theodore Ts'o wrote:

>On Thu, Oct 30, 2003 at 10:23:49PM +0300, Hans Reiser wrote:
>  
>
>>>Your assumption here is that the only thing that people search and
>>>index on is semi-structured data.
>>>
>>>      
>>>
>>No, my assumption is that structured data is a special case of 
>>semi-structured data, and should be modeled that way.
>>    
>>
>
>There are much more powerful ways of handling structured data (as
>opposed to generalized text searches). 
>
Special cases of general theorems are not more powerful than the general 
theorems, they are simply special cases.   You can design a language 
that has the power of both relational algebra and boolean algebra.

> What WinFS is specifically
>addressing is searching and selecting based on structured data.  
>
>  
>
>>>In addition, even for text-based files, in the future, files will very
>>>likely not be straight ASCII, but some kind of rich text based format
>>>with formatting, unicode, etc.
>>>
>>>      
>>>
>>Formatting does not make text table structured.
>>    
>>
>
>No, but it means that doing searches on formatted text is very
>difficult,
>
When you say formatted text, do you mean fonts and stuff, or do you mean 
object storage models?  Object storage models should generally be 
replaced with files and directories. 

Are you saying that auto-indexers should not parse the formatted text, 
index the document, and allow users to find the document, with the 
auto-indexer running in user space, but the indexes being traversed by 
the filesystem namespace resolver?  The kernel does not need to 
understand how to parse a document, it just needs to support queries 
that use the indexes created by an auto-indexer that does understand it.
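
A rough sketch of that division of labour, assuming a toy in-memory index. 
The indexer (which would run in user space) is the only part that knows how 
to parse a document; the resolver only intersects sets of paths.  The names 
build_index and resolve, and the sample documents, are hypothetical and are 
not any Reiser4 interface.

```python
import re
from collections import defaultdict


def build_index(documents):
    """User-space side: parse each document, record which paths contain each word."""
    index = defaultdict(set)
    for path, text in documents.items():
        # Crude "parser": drop anything that looks like markup, then split into words.
        plain = re.sub(r"<[^>]+>", " ", text)
        for word in re.findall(r"[a-z0-9]+", plain.lower()):
            index[word].add(path)
    return index


def resolve(index, *terms):
    """Resolver side: knows nothing about document formats, only intersects sets."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Made-up sample documents, for illustration only.
docs = {
    "/home/a/diet.html": "<b>High-fat</b> diets in dogs",
    "/home/a/notes.txt": "dogs and cats",
}
idx = build_index(docs)
print(sorted(resolve(idx, "dogs", "diets")))
```

Whether resolve belongs in the kernel or in user space is exactly the point 
under dispute; the sketch only shows that the parsing and the lookup can be 
separated.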


> and should be done in userspace, not kernel space.
>
>  
>
>>You are missing my argument.  I am saying that the indexes and name 
>>space belong in the kernel, not that the auto-indexer belongs in the kernel.
>>    
>>
>
>Searching and name spaces are different things.  Fundamentally I
>disagree with your belief that they are the same thing (and yes I've
>read your whitepaper on the namesys web page).  You can do much, much
>more powerful select statements than makes sense to do via the
>directory abstraction.  (Think about arbitrary select statements,
>possibly with subselect statements.  That's what Microsoft is
>promising in WinFS.  Do you really want to support an opendir system
>call where its argument is an arbitrary SQL select statement?
>
No, I hate SQL.  I want to allow people to use Reiser6 queries to find 
things.;-)

>  I
>didn't think so.)
>
>There is a very, very big difference between a pathname, which is
>guaranteed to refer to a single unique file, such as might be used
>in a Makefile.  This is what most people consider a real namespace.
>  
>
You mean, it is what most people consider a primary key.  Or at least I 
hope you mean that, because the whole point of all those articles (in 
what, the 80's, was it?) that strove to coin the name "namespace" was 
that filesystems and databases and search engines and so on are all 
namespaces, and they strove to imply that unifying them was possible and 
desirable.

>When addressing people, a passport number, or a driver's license
>number, or a social security number, are all examples of a namespace.
>Each one of these is guaranteed to return either no result, or a
>single specific person.  
>
>In contrast, consider searching for someone who is male, between 30
>and 40, is named Tom, and lived in Libertyville, Illinois sometime
>between 1960 and 1970, and is married to someone named Mary who was
>born in California.  This might return several people, and most people
>would **NOT** consider the space of all queries about people to be a
>"name space". 
>
Oh god, did you read the literature?

> Searches are not names.  They do not uniquely identify
>people or objects, which is a fundamental requirement of a name.
>  
>
You mean like Theodore?  Are you saying that Theodore is not a name 
because it does not uniquely identify you?

>We can create a filesystem with a directory indexed by social security
>number, and another directory with hard links that indexes people's
>records by driver's ID.  That makes sense.  But putting in sufficient
>indexes so that the above query of looking for someone named Tom who is
>married to someone named Mary (and this is an example where a query
>optimizer would be needed) is simple, pure insanity.
>  
>
I bet it will be less code than balanced trees were.

>  
>
>>uh, all the time, if there is a namespace that lets him.  How often do 
>>you use google?  How often do you memorize the primary key of an object 
>>in a relational database, and use only that versus how often do you do a 
>>richer query?
>>    
>>
>
>I use google dozens of times a day.  I type commands to bash hundreds
>of times a day.  Does that mean that bash command line parsing should
>be in the kernel?  Of course not!
>
>The bottom line is that for something that happens dozens or even
>hundreds of times a day, that's an argument that it *shouldn't* be
>done in the kernel.  Compare and contrast that with handling incoming
>network packets, which can happen millions of times per hour.
>  
>
Actually the relevant measure is, not how often do you use it, but how 
often would it context switch if it was not in the kernel.  Users rarely 
use the networking code directly.

Naming is used by programs a lot.  Enhance naming, and the programs will 
use enhanced naming a lot.

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-10-30 17:48       ` Theodore Ts'o
  2003-10-30 19:23         ` Hans Reiser
@ 2003-10-31 11:01         ` Kenneth Johansson
  2003-10-31 13:52           ` Jesse Pollard
  1 sibling, 1 reply; 83+ messages in thread
From: Kenneth Johansson @ 2003-10-31 11:01 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Hans Reiser, Erik Andersen, linux-kernel

On Thu, 2003-10-30 at 18:48, Theodore Ts'o wrote:

> The bottom line is that if a case can be made that some portion of the
> functionality required by WinFS needs to be in the kernel, and in the
> filesystem layer specifically, I'm all in favor of it.  But it has to

What about some way to quickly detect changes to the filesystem? That
would really help any type of indexing function to avoid scanning the
entire disk. 

It would help things like backup and even the locate database. 

It could be something as simple as a modification number that increases
with every change, combined with a size-limited list of what every change
was. Then every indexing task could just store what the modification
number was the last time it did its work, compare that number to the
current number, and read all the changes from the change log. If the
stored modification number had fallen out of the log, it would have to go
over the entire filesystem, but that would not have to happen often with
a big enough log. 

Some optimisation would probably have to be done to keep the log small;
you do not want to store every putc as a separate event.
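
A minimal sketch of that scheme, assuming an in-memory log and ignoring
persistence and the coalescing of small writes.  The class and method
names (ChangeLog, record, changes_since) are hypothetical; no existing
kernel interface is implied.

```python
from collections import deque


class ChangeLog:
    """Size-limited log of changes, numbered by an ever-increasing counter."""

    def __init__(self, capacity):
        self.seq = 0                            # modification number of the latest change
        self.entries = deque(maxlen=capacity)   # oldest entries fall off automatically

    def record(self, path, kind):
        """Note one change and return its modification number."""
        self.seq += 1
        self.entries.append((self.seq, path, kind))
        return self.seq

    def changes_since(self, last_seen):
        """Return changes after last_seen, or None if the log no longer reaches that far back."""
        if last_seen == self.seq:
            return []                           # nothing has changed
        oldest = self.entries[0][0] if self.entries else self.seq + 1
        if last_seen + 1 < oldest:
            return None                         # caller has to rescan the whole filesystem
        return [entry for entry in self.entries if entry[0] > last_seen]


log = ChangeLog(capacity=3)
for name in ("a", "b", "c", "d"):
    log.record("/tmp/" + name, "write")

print(log.changes_since(3))   # the indexer is one change behind
print(log.changes_since(0))   # too far behind: None, so do a full scan
```

An indexer would store log.seq after each run and pass it back the next
time; a None result is the "fallen out of the log" case.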



* Re: Things that Longhorn seems to be doing right
  2003-10-31 11:01         ` Kenneth Johansson
@ 2003-10-31 13:52           ` Jesse Pollard
  0 siblings, 0 replies; 83+ messages in thread
From: Jesse Pollard @ 2003-10-31 13:52 UTC (permalink / raw)
  To: Kenneth Johansson, Theodore Ts'o
  Cc: Hans Reiser, Erik Andersen, linux-kernel

On Friday 31 October 2003 05:01, Kenneth Johansson wrote:
> On Thu, 2003-10-30 at 18:48, Theodore Ts'o wrote:
> > The bottom line is that if a case can be made that some portion of the
> > functionality required by WinFS needs to be in the kernel, and in the
> > filesystem layer specifically, I'm all in favor of it.  But it has to
>
> What about some way to quickly detect changes to the filesystem? That
> would really help any type of indexing function to avoid scanning the
> entire disk.
>
> It would help things like backup and even the locate database.
>
> It could be something as simple as a modification number that increases
> with every change, combined with a size-limited list of what every change
> was. Then every indexing task could just store what the modification
> number was the last time it did its work, compare that number to the
> current number, and read all the changes from the change log. If the
> stored modification number had fallen out of the log, it would have to go
> over the entire filesystem, but that would not have to happen often with
> a big enough log.
>
> Some optimisation would probably have to be done to keep the log small;
> you do not want to store every putc as a separate event.

Putc isn't the problem - that caches up full blocks of data before giving
them to the kernel.
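
A small illustration of that buffering, using Python's io.BufferedWriter
as a stand-in for C stdio (an analogy, not the same code path).  The
CountingRaw class is hypothetical and merely counts how many writes would
reach the kernel.

```python
import io


class CountingRaw(io.RawIOBase):
    """Stand-in for the kernel side: counts the write calls that actually arrive."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.total = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.total += len(b)
        return len(b)       # report that every byte was accepted


raw = CountingRaw()
buffered = io.BufferedWriter(raw, buffer_size=4096)
for _ in range(10000):
    buffered.write(b"x")    # one byte at a time, like putc
buffered.flush()

print(raw.calls, "underlying writes for", raw.total, "bytes")
```

Ten thousand one-byte writes turn into a handful of block-sized ones, so
a change log would see blocks, not individual characters.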

The problem would be something like syslog, which you really might like to
search/index frequently (real time analysis).

No log would be able to handle all cases, and you will have to figure out
what to do for the exceptions, and recovery procedures when that fails.


* Re: Things that Longhorn seems to be doing right
  2003-10-31 20:47                 ` Hans Reiser
@ 2003-10-31 13:59                   ` Herman
  2003-10-31 21:23                     ` Richard B. Johnson
  2003-10-31 21:08                   ` David S. Miller
  1 sibling, 1 reply; 83+ messages in thread
From: Herman @ 2003-10-31 13:59 UTC (permalink / raw)
  To: linux-kernel

On Friday 31 October 2003 8:47 pm, Hans Reiser wrote:
> I can't get US bank accounts for my programmers working for me.  Why?  
> Because every US bank without exception uses social security numbers as
> a primary key.  A person without a social security number cannot be
> coped with.  This is a weakness directly due to molding rather than
> matching structure in data.

No, that is a legal requirement, not a weakness due to molding, but I get your 
point.

BTW, to my mind, the killer app in a business environment is the automatic 
file versioning feature in Longhorn.  This protects people against fat finger 
mistakes, and geez, any business has its fair share of fat head, fat finger 
and dumb blond types.  This is the only feature from VMS that I am longing 
for...

Cheers,

H.


* Re: Things that Longhorn seems to be doing right
  2003-10-30  4:21             ` Scott Robert Ladd
@ 2003-10-31 16:42               ` Timothy Miller
  2003-10-31 19:15                 ` Hans Reiser
  0 siblings, 1 reply; 83+ messages in thread
From: Timothy Miller @ 2003-10-31 16:42 UTC (permalink / raw)
  To: Scott Robert Ladd
  Cc: trelane, Alex Belits, Dax Kelson, Hans Reiser, andersen, linux-kernel



Scott Robert Ladd wrote:

> 
> Another problem with metadata is that it is largely generated by the 
> user, who is notoriously lazy. A truly powerful system would use 
> contextual analysis and other algorithms to automatically generate 
> metadata, freeing the user from an onerous task (which is what computers 
> should do). Certainly, some search engines are bordering on this 
> capability.
> 

There is a French company called Pertimm which develops a search engine 
that does this with documents.  It even does cross-language queries 
based on sophisticated linguistic analysis.  Often, I wish Google had 
some of those features, even if only a primitive synonym table.

The relevance here, though, is that the Pertimm index is much larger 
than the actual text that is being indexed.  That's not a problem, 
really, because the same is true for Google.  You need that for 
efficient searches.  But there is no place for such a thing in a file 
system.  I don't think any Linux developers would want the metadata to 
even APPROACH the size of the file data, let alone get LARGER.

Indexing of this sort has its place, but applying it to a whole file 
system is much too broad of a use.  For instance, you wouldn't want to 
index the contents of your binary programs, or even shell scripts for 
that matter.  So text, data, and code need to have different kinds of 
indexing.



* Re: Things that Longhorn seems to be doing right
  2003-10-31 16:42               ` Timothy Miller
@ 2003-10-31 19:15                 ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-31 19:15 UTC (permalink / raw)
  To: Timothy Miller
  Cc: Scott Robert Ladd, trelane, Alex Belits, Dax Kelson, andersen,
	linux-kernel

Timothy Miller wrote:

>
>
>
> Indexing of this sort has its place, but applying it to a whole file 
> system is much too broad of a use. 
>
>
>
You should instead say that applying it blindly and inflexibly without 
user control is bad....

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-10-31  7:40             ` Hans Reiser
@ 2003-10-31 19:30               ` Theodore Ts'o
  2003-10-31 20:47                 ` Hans Reiser
  2003-11-03 12:42                 ` Nikita Danilov
  0 siblings, 2 replies; 83+ messages in thread
From: Theodore Ts'o @ 2003-10-31 19:30 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Erik Andersen, linux-kernel

On Fri, Oct 31, 2003 at 10:40:03AM +0300, Hans Reiser wrote:
> Special cases of general theorems are not more powerful than the general 
> theorems, they are simply special cases.   You can design a language 
> that has the power of both relational algebra and boolean algebra.

Just because you can reduce everything to a Turing machine doesn't
mean that the best way to implement a filesystem is with an infinitely
long tape which can only contain zeros and ones.  There are plenty
of optimizations which mean that you can quickly and with minimal
overhead do searches based on structured data, which is far, far more
difficult to do if you are doing unstructured searches.  (In fact, in
some cases, if you don't have structured data to distinguish between
the author and the subject, you have to do the equivalent of natural
language processing if you are trying, via an unstructured search,
to find all papers written *about* a famous author, while not getting
false hits that were *written* by that same famous author.  Doing this
requires structured, not unstructured data.)
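
A small sketch of the author/subject distinction, using two made-up
records.  The field names, the sample data and both helper functions are
hypothetical and only illustrate the point.

```python
# Made-up catalogue records, for illustration only.
papers = [
    {"title": "Notes on sorting", "author": "Knuth", "subject": "algorithms"},
    {"title": "A profile of Knuth", "author": "Smith", "subject": "Knuth"},
]


def unstructured(records, term):
    """Match the term anywhere in a record, with no idea which field it came from."""
    term = term.lower()
    return [r["title"] for r in records
            if any(term in str(value).lower() for value in r.values())]


def structured(records, field, value):
    """Match the value only in the named field."""
    return [r["title"] for r in records if r[field] == value]


print(unstructured(papers, "Knuth"))           # both papers: by him and about him
print(structured(papers, "subject", "Knuth"))  # only the paper about him
```

The unstructured search cannot tell the two papers apart without some
extra analysis; the structured one can, because the field carries the
distinction.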

> >No, but it means that doing searches on formatted text is very
> >difficult,
> >
> When you say formatted text, do you mean fonts and stuff, or do you mean 
> object storage models?  Object storage models should generally be 
> replaced with files and directories. 

I mean fonts and stuff.  Stripping out fonts, tables, etc. for doing
generalized, unstructured text search, clearly needs to be done in
userspace.  Actually, I think we both agree on this point.  The point
of disagreement is whether the searches utilizing such indexes should
be done in the kernel as an intrinsic part of the filesystem,
or in userspace.  I believe that we need to draw a very firm line
between what you call "primary keys", which uniquely identify a file,
and generalized searches.  You believe the two should be unified.

> Are you saying that auto-indexers should not parse the formatted text, 
> index the document, and allow users to find the document, with the 
> auto-indexer running in user space, but the indexes being traversed by 
> the filesystem namespace resolver?  The kernel does not need to 
> understand how to parse a document, it just needs to support queries 
> that use the indexes created by an auto-indexer that does understand it.

I believe that there is a big difference between, "I want the file
named /home/tytso/src/e2fsprogs/e2fsck/e2fsck.c", and "I remember
vaguely that 5 years ago, I read a paper about the effects of high-fat
diets on Akitas, where the first name of the author was Tom".  The
first is a filename lookup.  The second is a search.  I would like
better search tools for files in a filesystem, no doubt.  But I would
never, ever put a search that might return an ambiguous number of
responses (that might change over time as more files are added to the
filesystem) in a Makefile as a source file.  

You are conflating these two concepts, pointing out that filename path
resolution happens a lot, and so therefore generalized searches should
also be done in the kernel.  What I am saying is that generalized
searches where the user needs to look at the returned set of files,
and then apply human intelligence to see which of the returned set of
files was the one they were looking for is a FUNDAMENTALLY DIFFERENT
OPERATION from a filename lookup via a primary key.  The latter should
be done in the kernel, as is the case today.  The former should by no
means be in the kernel, and should be done in userspace, preferably
with a graphical interface lookup so the user can look at the returned
files, look at the context in which the search parameters appear, and
select the one which actually is the document they were looking for.

Sure, Google has the concept of the "I'm feeling lucky" button.  But
there is a fundamental difference between a URL, and saying, "Type
'Akita fat diet' into Google and hit 'I'm feeling lucky'".  The latter
is something that you would never put into a hypertext document as a
link, because it changes over time, and what works today might not
work tomorrow.  That is the difference between a name (a URL), and a
search string (what you type into Google).

> >In contrast, consider searching for someone who is male, between 30
> >and 40, is named Tom, and lived in Libertyville, Illinois sometime
> >between 1960 and 1970, and is married to someone named Mary who was
> >born in California.  This might return several people, and most people
> >would **NOT** consider the space of all queries about people to be a
> >"name space". 
> >
> Oh god, did you read the literature?

Is this the same literature as the one which said that Microkernels
were the way, the truth and the light?  Is this the same literature as
the stuff written by Professor Tanenbaum, who said he would have
given Linus a failing grade if he submitted Linux as a project?  There
are plenty of things in the Literature that I consider to be pure
stuff and nonsense, and people who claim searches and "name
spaces" to be identical fall into that category as far as I'm
concerned....

> >Searches are not names.  They do not uniquely identify
> >people or objects, which is a fundamental requirement of a name.
> > 
> >
> You mean like Theodore?  Are you saying that Theodore is not a name 
> because it does not uniquely identify you?

In the computer science usage, yes, "Theodore" is not a name.  It is a
nick name; it is a convenient handle by which I can be identified; but
it does not uniquely identify me.  (I am reminded of a story from when
I was at MIT, and someone called up a fraternity, Tau Epsilon Theta,
and asked for "Mike", and was told, "which one".  "Well, the one who
lives at TEP".  "There's more than one".  "Well, the one with blond
hair".  "Sorry, there are three Mikes with blond hair at TEP".  The
result was a run of frat shirts that were labelled, "Blond Mike from
TEP".  The moral of the story?  "Mike" is not a useful name when
trying to contact a specific person at this specific fraternity at MIT
back in the late 80's.)

> >The bottom line is that for something that happens dozens or even
> >hundreds of times a day, that's an argument that it *shouldn't* be
> >done in the kernel.  Compare and contrast that with handling incoming
> >network packets, which can happen millions of times per hour.
> > 
> >
> Actually the relevant measure is, not how often do you use it, but how 
> often would it context switch if it was not in the kernel.  Users rarely 
> use the networking code directly.

If random generic searches that return an ambiguous number of matches,
some of which may be the one the user wants, and some of them not,
happen only a few dozen times a day (which is about how often I use
Google), then an extra context switch, which is really fast in Linux,
is completely lost in the noise.  

> Naming is used by programs a lot.  Enhance naming, and the programs will 
> use enhanced naming a lot.

Searching and Naming are not the same thing.  Period.

						- Ted


* Re: Things that Longhorn seems to be doing right
  2003-10-31 19:30               ` Theodore Ts'o
@ 2003-10-31 20:47                 ` Hans Reiser
  2003-10-31 13:59                   ` Herman
  2003-10-31 21:08                   ` David S. Miller
  2003-11-03 12:42                 ` Nikita Danilov
  1 sibling, 2 replies; 83+ messages in thread
From: Hans Reiser @ 2003-10-31 20:47 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Erik Andersen, linux-kernel

Theodore Ts'o wrote:

>On Fri, Oct 31, 2003 at 10:40:03AM +0300, Hans Reiser wrote:
>  
>
>>Special cases of general theorems are not more powerful than the general 
>>theorems, they are simply special cases.   You can design a language 
>>that has the power of both relational algebra and boolean algebra.
>>    
>>
>
>Just because you can reduce everything to a Turing machine doesn't
>mean that the best way to implement a filesystem is with an infinitely
>long tape which can only contain zeros and ones.
>
Beautifully poetic.:)

>  There are plenty
>of optimizations which mean that you can quickly and with minimal
>overhead do searches based on structured data, which is far, far more
>difficult to do if you are doing unstructured searches. 
>
Which is why you should feel free to perform relational algebra on data 
that happens to have a table structure.

A general purpose filesystem should match rather than mold the structure 
of the data.  If the data happens to be tabular, go for it, use tuple 
structures in your queries, but if you can devise semantics that allow 
you to use what structure is there and known to you, and does not 
require you to use structure when it is not appropriate, that is much 
more powerful.

I can't get US bank accounts for my programmers working for me.  Why?  
Because every US bank without exception uses social security numbers as 
a primary key.  A person without a social security number cannot be 
coped with.  This is a weakness directly due to molding rather than 
matching structure in data.

> (In fact, in
>some cases, if you don't have structured data to distinguish between
>the author and the subject, you have to do the equivalent of natural
>language processing if you are trying, via an unstructured search,
>to find all papers written *about* a famous author, while not getting
>false hits that were *written* by that same famous author.  Doing this
>requires structured, not unstructured data.)
>  
>
So use structure when it is there and you know it, and use a model that 
allows you to use structure when it is there and you know it, but don't 
use a model that requires that you mold the data into a table structure.

With library systems that use structured card catalogs, sometimes it is 
annoying that you cannot find all papers associated with an author, both 
the ones by him and about him, without having to say author/authorname 
OR subject/authorname.  It cuts both ways.....

>  
>
>>>No, but it means that doing searches on formatted text is very
>>>difficult,
>>>
>>>      
>>>
>>When you say formatted text, do you mean fonts and stuff, or do you mean 
>>object storage models?  Object storage models should generally be 
>>replaced with files and directories. 
>>    
>>
>
>I mean fonts and stuff.  Stripping out fonts, tables, etc. for doing
>generalized, unstructured text search, clearly needs to be done in
>userspace.  Actually, I think we both agree on this point.  The point
>of disagreement is whether the searches utilizing such indexes should
>be done in the kernel as an intrinsic part of the filesystem,
>or in userspace.  I believe that we need to draw a very firm line
>between what you call "primary keys", which uniquely identify a file,
>and generalized searches.  You believe the two should be unified.
>  
>
In reiser4, objectids uniquely identify a file, and all pathnames can be 
renamed.  Long ago I used to think that one should be able to find a 
file by its objectid, but I lost that argument, and deserved to lose it.

>  
>
>>Are you saying that auto-indexers should not parse the formatted text, 
>>index the document, and allow users to find the document, with the 
>>auto-indexer running in user space, but the indexes being traversed by 
>>the filesystem namespace resolver?  The kernel does not need to 
>>understand how to parse a document, it just needs to support queries 
>>that use the indexes created by an auto-indexer that does understand it.
>>    
>>
>
>I believe that there is a big difference between, "I want the file
>named /home/tytso/src/e2fsprogs/e2fsck/e2fsck.c", and "I remember
>vaguely that 5 years ago, I read a paper about the effects of high-fat
>diets on Akitas, where the first name of the author was Tom".  The
>first is a filename lookup.  The second is a search.  I would like
>better search tools for files in a filesystem, no doubt.  But I would
>never, ever put a search that might return an ambiguous number of
>responses (that might change over time as more files are added to the
>filesystem) in a Makefile as a source file.  
>  
>
Suppose you want to only recompile files that have been changed since 
the last recompile.....  that will return an ambiguous number of 
responses.....

Suppose you want to only recompile files that are in a particular 
directory, and you don't know how many that is before performing the query?

Suppose you want to recompile all files that have been used in the last 
year and were compiled by the distro for a 386 rather than by you for 
your AMD 64?
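
The first of these is roughly what make already does with timestamps: the
rule names a set whose size is not known until it is evaluated.  A minimal
sketch, assuming modification times are supplied as plain dictionaries;
the function stale_sources and the sample timestamps are hypothetical.

```python
def stale_sources(src_mtimes, obj_mtimes):
    """Return the sources whose object file is missing or older than the source."""
    never_built = float("-inf")
    return sorted(
        src for src, mtime in src_mtimes.items()
        if obj_mtimes.get(src.rsplit(".", 1)[0] + ".o", never_built) < mtime
    )


# Made-up timestamps, for illustration only.
sources = {"a.c": 10, "b.c": 5, "c.c": 7}
objects = {"a.o": 8, "b.o": 9}

print(stale_sources(sources, objects))
```

The answer is a set that changes from run to run, which is the behaviour
being argued over here.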

>You are conflating these two concepts, pointing out that filename path
>resolution happens a lot, and so therefore generalized searches should
>also be done in the kernel.  What I am saying is that generalized
>searches where the user needs to look at the returned set of files,
>and then apply human intelligence to see which of the returned set of
>files was the one they were looking 
>
Why do you assume they are looking for one of the set instead of the 
whole set?

>for is a FUNDAMENTALLY DIFFERENT
>OPERATION from a filename lookup via a primary key.  The latter should
>be done in the kernel, as is the case today.  The former should by no
>means be in the kernel, and should be done in userspace, preferably
>with a graphical interface lookup so the user can look at the returned
>files, look at the context in which the search parameters appear, and
>select the one which actually is the document they were looking for.
>  
>
Prompting the user for search refinement does indeed deserve a context 
switch to user space.

>Sure, Google has the concept of the "I'm feeling lucky" button.  But
>there is a fundamental difference between a URL, and saying, "Type
>'Akita fat diet' into Google and hit 'I'm feeling lucky'".  The latter
>is something that you would never put into a hypertext document as a
>link, because it changes over time, and what works today might not
>work tomorrow.
>
Getting different results each day is sometimes desirable in a query.

>  That is the difference between a name (a URL), and a
>search string (what you type into Google).
>
>  
>
>>>In contrast, consider searching for someone who is male, between 30
>>>and 40, is named Tom, and lived in Libertyville, Illinois sometime
>>>between 1960 and 1970, and is married to someone named Mary who was
>>>born in California.  This might return several people, and most people
>>>would **NOT** consider the space of all queries about people to be a
>>>"name space". 
>>>
>>>      
>>>
>>Oh god, did you read the literature?
>>    
>>
>
>Is this the same literature as the one which said that Microkernels
>were the way, the truth and the light?  Is this the same literature as
>the stuff written by Professor Tanenbaum, who said he would have
>given Linus a failing grade if he submitted Linux as a project?  There
>are plenty of things in the Literature that I consider to be pure
>stuff and nonsense, and people who claim searches and "name
>spaces" to be identical fall into that category as far as I'm
>concerned....
>  
>
If you make a statement about what most people consider a name space to 
be, you should be consistent with the literature that created the term.

If you want to make a statement about what name spaces SHOULD be, that 
is different, yes?

>
>
>  
>
>>Naming is used by programs a lot.  Enhance naming, and the programs will 
>>use enhanced naming a lot.
>>    
>>
>
>Searching and Naming are not the same thing.  Period.
>
>						- Ted
>  
>
Makefiles will use enhanced naming a lot when it is there for them to 
conveniently employ.  Many other programs will as well.

I am going to avoid directly responding to "Searching and Naming are not 
the same thing. Period." because there are articles in the literature on 
the different objectives of naming, and on why X different categories of 
name space usage (The first article had 5 different ones, one was 
navigating, and I forget the others.  A later article had more.  The 
articles are worth reading.) are deeply different, and these articles 
have truth to them.

If you say that user refinement of searches belongs in user space, we 
agree.  If you say that names resolve to single objects and never should 
resolve to sets of objects, we disagree.

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-10-31 20:47                 ` Hans Reiser
  2003-10-31 13:59                   ` Herman
@ 2003-10-31 21:08                   ` David S. Miller
  2003-11-02 21:42                     ` Hans Reiser
  1 sibling, 1 reply; 83+ messages in thread
From: David S. Miller @ 2003-10-31 21:08 UTC (permalink / raw)
  To: Hans Reiser; +Cc: tytso, andersen, linux-kernel

On Fri, 31 Oct 2003 23:47:26 +0300
Hans Reiser <reiser@namesys.com> wrote:

> If you say that names resolve to single objects and never should 
> resolve to sets of objects, we disagree.

While I have no personal opinion either way on the utility of such an
idea, I do think that if we ever do support a "one to many" mapping of
names to inodes we should make you do the security audit of a full
Linux system in the presence of this feature, deal?  :-)


* Re: Things that Longhorn seems to be doing right
  2003-10-31 13:59                   ` Herman
@ 2003-10-31 21:23                     ` Richard B. Johnson
  2003-11-01 18:30                       ` Hans Reiser
  0 siblings, 1 reply; 83+ messages in thread
From: Richard B. Johnson @ 2003-10-31 21:23 UTC (permalink / raw)
  To: Herman; +Cc: linux-kernel

On Fri, 31 Oct 2003, Herman wrote:

> On Friday 31 October 2003 8:47 pm, Hans Reiser wrote:
> > I can't get US bank accounts for my programmers working for me.  Why?
> > Because every US bank without exception uses social security numbers as
> > a primary key.  A person without a social security number cannot be
> > coped with.  This is a weakness directly due to molding rather than
> > matching structure in data.
>
> No, that is a legal requirement, not a weakness due to molding,
> but I get your point.
>

Not a legal requirement in the United States. In fact, using a
"social security" or "taxpayer identification number" for
identification is contrary to federal law, and an SS card contains
the words: "Not for identification".

This is rigidly enforced by many federal agencies and completely
ignored by others (go figure). Banks say they need to have a
SS number for 1099 forms, even for accounts that earn no
interest! It's just that their databases have an entry for
SS numbers (if required by the kind of account), and the
software is defective, requiring that field to be filled.
The result is that many people's rights are violated because
of defective software!

Incidentally, I recently obtained a so-called identification
badge that is now required for me to have access to my airplane.
I purposely left the SS# entry blank when I filled out the
form. This raised a stink that likely went all the way to the
state house. I got the badge. It seems that a State Government,
that has no business regulating air commerce, also wants to
keep my SS# on hand for surveillance. You need to keep putting
those attempts down.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: Things that Longhorn seems to be doing right
  2003-10-31 21:23                     ` Richard B. Johnson
@ 2003-11-01 18:30                       ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-11-01 18:30 UTC (permalink / raw)
  To: root; +Cc: Herman, linux-kernel

Good man!

Richard B. Johnson wrote:

>
>
>Incidentally, I recently obtained a so-called identification
>badge that is now required for me to have access to my airplane.
>I purposely left the SS# entry blank when I filled out the
>form. This raised a stink that likely went all the way to the
>state house. I got the badge. It seems that a State Government,
>that has no business regulating air commerce, also wants to
>keep my SS# on hand for surveillance. You need to keep putting
>those attempts down.
>
>  
>

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-10-31 21:08                   ` David S. Miller
@ 2003-11-02 21:42                     ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-11-02 21:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: tytso, andersen, linux-kernel

David S. Miller wrote:

>On Fri, 31 Oct 2003 23:47:26 +0300
>Hans Reiser <reiser@namesys.com> wrote:
>
>  
>
>>If you say that names resolve to single objects and never should 
>>resolve to sets of objects, we disagree.
>>    
>>
>
>While I have no personal opinion either way on the utility of such an
>idea, I do think that if we ever do support a "one to many" mapping of
>names to inodes we should make you do the security audit of a full
>Linux system in the presence of this feature, deal?  :-)
>
>
>  
>
;-)

I don't know how seriously you desire me to take your comment, so 
forgive me if I take it too seriously.

You can't upgrade existing APIs to handle sets of inodes without 
changing them in ways that require source code modification, so one can 
presume that the app writer used the new APIs as correctly as he 
performs all his other changes to his code.

Agreed?

Of course, bash would be much more secure if we got rid of globbing (*), 
yes?  Ted, can you write and send a patch in to the bash maintainer for 
that?    ;-)

-- 
Hans




* Re: Things that Longhorn seems to be doing right
       [not found]   ` <3FA3FF46.7010309@namesys.com>
@ 2003-11-03 10:55     ` Ingo Oeser
  2003-11-04  8:10       ` Hans Reiser
  0 siblings, 1 reply; 83+ messages in thread
From: Ingo Oeser @ 2003-11-03 10:55 UTC (permalink / raw)
  To: Hans Reiser; +Cc: linux-kernel

Hi Hans,
hi LKML,

On Saturday 01 November 2003 19:45, Hans Reiser wrote:
> Ingo Oeser wrote:
> >2) In an out-of-the-box Linux system it's getting harder and harder to find
> >	the issuer of a search request to do the refinement.
> >
> >The latter is due to the heavy asynchrony of modern user interfaces.
> >
> >I usually have several consoles, several xterms with screen in them,
> >several desktops with xterms and other programs, several X-Servers
> >running, some screen sessions attached to some virtual terminals and
> >some people even have multihead.
> >
> >Where does the refinement get sent to, without application support?

> probably looking at the DISPLAY environment variable for the relevant
> process is the right answer.

It's not always passed through, since each layer only knows the relative
issuer, but not the absolute one. And some systems just have a piece of
memory and a semaphore for notification.
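
For illustration, a minimal sketch of the DISPLAY lookup suggested above,
assuming a Linux /proc filesystem. The helper names parse_environ and
display_of are made up for this example. It also shows the limit: a process
started outside X (say, inside a detached screen session) has no DISPLAY
to find, and the lookup simply returns None.

```python
from typing import Optional


def parse_environ(raw: bytes) -> dict:
    """Split a NUL-separated environment block into a name -> value dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            name, _, value = entry.partition(b"=")
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def display_of(pid: int) -> Optional[str]:
    """Return DISPLAY from the process's initial environment, or None."""
    try:
        # /proc/<pid>/environ holds the environment the process was started with.
        with open(f"/proc/{pid}/environ", "rb") as f:
            return parse_environ(f.read()).get("DISPLAY")
    except OSError:
        # No such process, no permission, or no /proc on this system.
        return None
```

Even where this works, it only names the X server, not which of several
xterms or screen windows on that server issued the request.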

This is a bit hard. Try experimentally to get a notification delivered to
each leaf UI in the UI tree situation described above. My desktop is KDE
3.1.4, to ease your experiment.

Regards

Ingo Oeser




* Re: Things that Longhorn seems to be doing right
  2003-10-31 19:30               ` Theodore Ts'o
  2003-10-31 20:47                 ` Hans Reiser
@ 2003-11-03 12:42                 ` Nikita Danilov
  2003-11-03 16:58                   ` Timothy Miller
  1 sibling, 1 reply; 83+ messages in thread
From: Nikita Danilov @ 2003-11-03 12:42 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Hans Reiser, Erik Andersen, linux-kernel

Theodore Ts'o writes:

[...]

 > 
 > I believe that there is a big difference between, "I want the file
 > named /home/tytso/src/e2fsprogs/e2fsck/e2fsck.c", and "I remember
 > vaguely that 5 years ago, I read a paper about the effects of high-fat
 > diets on Akitas, where the first name of the author was Tom".  The
 > first is a filename lookup.  The second is a search.  I would like
 > better search tools for files in a filesystem, no doubt.  But I would
 > never, ever put a search that might return an ambiguous number of
 > responses (that might change over time as more files are added to the
 > filesystem) in a Makefile as a source file.  

It is called "a directory". :) There is no crime in putting

cc src/*.c

into Makefile. I think that Hans' query-result-object denoting multiple
objects is more like directory than single regular file.
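
A small sketch of the same point, assuming nothing beyond the Python
standard library. The function names resolve and demo are invented for
the example. One name (the pattern src/*.c) resolves to a set of files,
and that set is exactly what listing the directory and filtering gives.

```python
import glob
import os
import tempfile


def resolve(pattern):
    """Resolve one name (here a glob pattern) to a sorted list of paths."""
    return sorted(glob.glob(pattern))


def demo():
    """Build a throwaway src/ tree and resolve 'src/*.c' two ways."""
    with tempfile.TemporaryDirectory() as top:
        src = os.path.join(top, "src")
        os.mkdir(src)
        for name in ("main.c", "util.c", "README"):
            open(os.path.join(src, name), "w").close()
        # One name, many objects: the pattern resolves to a set of files.
        by_glob = [os.path.basename(p) for p in resolve(os.path.join(src, "*.c"))]
        # The same set, obtained by listing the directory and filtering.
        by_dir = sorted(n for n in os.listdir(src) if n.endswith(".c"))
        return by_glob, by_dir
```

The result changes as files are added to src/, which is the behaviour
Makefiles already accept when they use wildcards.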

[...]

 > 
 > 
 > 						- Ted

Nikita.


* Re: Things that Longhorn seems to be doing right
  2003-11-03 12:42                 ` Nikita Danilov
@ 2003-11-03 16:58                   ` Timothy Miller
  2003-11-04  8:13                     ` Hans Reiser
  0 siblings, 1 reply; 83+ messages in thread
From: Timothy Miller @ 2003-11-03 16:58 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Theodore Ts'o, Hans Reiser, Erik Andersen, linux-kernel



Nikita Danilov wrote:

> It is called "a directory". :) There is no crime in putting
> 
> cc src/*.c
> 
> into Makefile. I think that Hans' query-result-object denoting multiple
> objects is more like directory than single regular file.

So a file system query that results in multiple files generates a 
"virtual directory"?



* Re: Things that Longhorn seems to be doing right
  2003-11-03 10:55     ` Ingo Oeser
@ 2003-11-04  8:10       ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-11-04  8:10 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: linux-kernel

Ingo Oeser wrote:

> This is a bit hard. Try it experimentally to get a notification done for
>
>each leaf UI in the UI tree situation described above. My desktop is KDE
>3.1.4, to ease your experiment.
>
>  
>
I'll trust you that it is hard.  If I get funding for query refinement 
someday, I'll budget a bit extra for this issue.

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-11-03 16:58                   ` Timothy Miller
@ 2003-11-04  8:13                     ` Hans Reiser
  2003-11-05 13:51                       ` Ingo Oeser
  0 siblings, 1 reply; 83+ messages in thread
From: Hans Reiser @ 2003-11-04  8:13 UTC (permalink / raw)
  To: Timothy Miller
  Cc: Nikita Danilov, Theodore Ts'o, Erik Andersen, linux-kernel

Timothy Miller wrote:

>
>
> Nikita Danilov wrote:
>
>> It is called "a directory". :) There is no crime in putting
>>
>> cc src/*.c
>>
>> into Makefile. I think that Hans' query-result-object denoting multiple
>> objects is more like directory than single regular file.
>
>
> So a file system query that results in multiple files generates a 
> "virtual directory"?
>
>
>
Remember that this code does not exist yet.....;-)

Sounds like it might be a good way to do it though.

-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-11-05 13:51                       ` Ingo Oeser
@ 2003-11-05  2:07                         ` Hans Reiser
  0 siblings, 0 replies; 83+ messages in thread
From: Hans Reiser @ 2003-11-05  2:07 UTC (permalink / raw)
  To: Ingo Oeser
  Cc: Nikita Danilov, Theodore Ts'o, Erik Andersen, linux-kernel,
	Timothy Miller

Ingo Oeser wrote:

>On Tuesday 04 November 2003 09:13, Hans Reiser wrote:
>  
>
>>Timothy Miller wrote:
>>    
>>
>>>Nikita Danilov wrote:
>>>      
>>>
>>>>It is called "a directory". :) There is no crime in putting
>>>>
>>>>cc src/*.c
>>>>
>>>>into Makefile. I think that Hans' query-result-object denoting multiple
>>>>objects is more like directory than single regular file.
>>>>        
>>>>
>>>So a file system query that results in multiple files generates a
>>>"virtual directory"?
>>>      
>>>
>>Remember that this code does not exist yet.....;-)
>>
>>Sounds like it might be a good way to do it though.
>>    
>>
>
>Yes and this also solves the "refine feedback" problem: Just return
>something useful in stat->nlink for that directory,
>or even create a new stat-like syscall.
>  
>
I don't understand what you are saying about nlink.  Can you say more?

>Now the issuer can decide on ANY level, whether to refine the search or
>accept the result to present it in a listing.
>
>A proper replacement for nlink is looong overdue.
>
>But even with the crappy one, we have now, it can be decided since a
>list of 65K is too much for a proper selection and cannot be handled by
>a user. Somebody even said that every search pattern revealing more
>than 50 records is not refined enough.
>  
>
If the user is looking for only one record....

>PS: Hans, we just saved you the funding on this topic.
>
>Regards
>
>Ingo Oeser
>
>
>
>
>  
>


-- 
Hans




* Re: Things that Longhorn seems to be doing right
  2003-11-04  8:13                     ` Hans Reiser
@ 2003-11-05 13:51                       ` Ingo Oeser
  2003-11-05  2:07                         ` Hans Reiser
  0 siblings, 1 reply; 83+ messages in thread
From: Ingo Oeser @ 2003-11-05 13:51 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Nikita Danilov, Theodore Ts'o, Erik Andersen, linux-kernel,
	Timothy Miller

On Tuesday 04 November 2003 09:13, Hans Reiser wrote:
> Timothy Miller wrote:
> > Nikita Danilov wrote:
> >> It is called "a directory". :) There is no crime in putting
> >>
> >> cc src/*.c
> >>
> >> into Makefile. I think that Hans' query-result-object denoting multiple
> >> objects is more like directory than single regular file.
> >
> > So a file system query that results in multiple files generates a
> > "virtual directory"?
>
> Remember that this code does not exist yet.....;-)
>
> Sounds like it might be a good way to do it though.

Yes and this also solves the "refine feedback" problem: Just return
something useful in stat->nlink for that directory,
or even create a new stat-like syscall.

Now the issuer can decide on ANY level, whether to refine the search or
accept the result to present it in a listing.

A proper replacement for nlink is looong overdue.

But even with the crappy one we have now, it can be decided, since a
list of 65K entries is too much for a proper selection and cannot be
handled by a user. Somebody even said that every search pattern returning
more than 50 records is not refined enough.
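
A rough sketch of that decision, under loud assumptions: the query result
is presented as a directory, and the names REFINE_LIMIT, result_count,
should_refine and demo are invented here. On ordinary filesystems the
st_nlink of a directory does not count the regular files in it, so this
stand-in counts entries explicitly; a query filesystem reporting the count
through a stat-like call would make result_count a single cheap lookup.

```python
import os
import tempfile

REFINE_LIMIT = 50  # the "more than 50 records" rule of thumb


def result_count(path):
    """Count the entries of a (virtual) result directory."""
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)


def should_refine(path, limit=REFINE_LIMIT):
    """True if the result set is too large to present as a listing."""
    return result_count(path) > limit


def demo(n_results):
    """Fake a result directory with n_results hits and apply the rule."""
    with tempfile.TemporaryDirectory() as results:
        for i in range(n_results):
            open(os.path.join(results, f"hit{i}"), "w").close()
        return should_refine(results)
```

Any layer that can see the directory can make this check, so no layer
needs to know who the original issuer was.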

PS: Hans, we just saved you the funding on this topic.

Regards

Ingo Oeser




* Re: Things that Longhorn seems to be doing right
  2003-11-04  0:35     ` Daniel B.
@ 2003-11-04 14:05       ` Brian Beattie
  0 siblings, 0 replies; 83+ messages in thread
From: Brian Beattie @ 2003-11-04 14:05 UTC (permalink / raw)
  To: linux-kernel

On Mon, 2003-11-03 at 19:35, Daniel B. wrote:
> Brian Beattie wrote:
> > 
> > ... a real paradyne(sp?) shift.  
> 
> Close.  It's "paradigm."
> 
> :-)

And that I think is the one thing we(the mailing list) do agree on.

:) :)


-- 
Brian Beattie            | Experienced kernel hacker/embedded systems
beattie@beattie-home.net | programmer, direct or contract, short or
www.beattie-home.net     | long term, available immediately.

"Honor isn't about making the right choices.
It's about dealing with the consequences." -- Midori Koto



* Re: Things that Longhorn seems to be doing right
  2003-11-03 20:54         ` Richard B. Johnson
  2003-11-03 21:01           ` Valdis.Kletnieks
  2003-11-04  8:47           ` Michael Clark
@ 2003-11-04 14:02           ` Brian Beattie
  2 siblings, 0 replies; 83+ messages in thread
From: Brian Beattie @ 2003-11-04 14:02 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Linux kernel

On Mon, 2003-11-03 at 15:54, Richard B. Johnson wrote:
> On Mon, 3 Nov 2003 Valdis.Kletnieks@vt.edu wrote:
> 
> > On Mon, 03 Nov 2003 15:17:55 EST, "Richard B. Johnson" said:
> >
> > > Yeah. Nobody should ever use a word that's spelled like paradigm!
> > > How about "really good example..." (which is what it means).
> >
> > No, a paradigm is "a way of looking at things" or "a world view".
> > So this would include paradigms like "the world is flat" or "monolithic
> > kernels are preferable" or "gcc is the only compiler for the kernel".
> >
> 
> No. don't you think I look things up in a dictionary before I
> write a definition?
> 
> The Webster's "New Collegiate Dictionary", 150th Anniversary
> Edition clearly  states;
> 
> 1. EXAMPLE, PATTERN;esp: an outstandingly clear or typical
> example or archetype.

And I read the complete reference before I get pissy

"One entry found for paradigm.
Main Entry: par·a·digm
Pronunciation: 'par-&-"dIm also -"dim
Function: noun
Etymology: Late Latin paradigma, from Greek paradeigma, from
paradeiknynai to show side by side, from para- + deiknynai to show --
more at DICTION
Date: 15th century
1 : EXAMPLE, PATTERN; especially : an outstandingly clear or typical
example or archetype
2 : an example of a conjugation or declension showing a word in all its
inflectional forms
3 : a philosophical and theoretical framework of a scientific school or
discipline within which theories, laws, and generalizations and the
experiments performed in support of them are formulated"

Further, English is a plastic language, and meanings and nuances change
over time.
-- 
Brian Beattie            | Experienced kernel hacker/embedded systems
beattie@beattie-home.net | programmer, direct or contract, short or
www.beattie-home.net     | long term, available immediately.

"Honor isn't about making the right choices.
It's about dealing with the consequences." -- Midori Koto



* Re: Things that Longhorn seems to be doing right
  2003-11-04  8:47           ` Michael Clark
@ 2003-11-04 12:47             ` Richard B. Johnson
  0 siblings, 0 replies; 83+ messages in thread
From: Richard B. Johnson @ 2003-11-04 12:47 UTC (permalink / raw)
  To: Michael Clark; +Cc: Valdis.Kletnieks, Brian Beattie, Linux kernel

On Tue, 4 Nov 2003, Michael Clark wrote:

> On 11/04/03 04:54, Richard B. Johnson wrote:
> > It is particularly irksome to me because I studied Latin in
> > High School, where I first encountered this word. My second
> > encounter was where somebody corrupted it to mean some kind
> > of new idea. Then some idiot named a company Paradigm and
> > the end was clear.
>
> How do you know the guy is an idiot - did you meet him?
>
> Quite a good name for a loudspeaker company in respect to providing
> a 'reference example' for sound.
>
> Although funny they have a registered trademark for 'Paradigm'
> - a bit generic methinks.
>
> ~mc

Yes, metaparadigm is much more meaningful --and it can mean anything
you want it to because it had not been previously defined for a few
hundred years!

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: Things that Longhorn seems to be doing right
  2003-11-03 20:54         ` Richard B. Johnson
  2003-11-03 21:01           ` Valdis.Kletnieks
@ 2003-11-04  8:47           ` Michael Clark
  2003-11-04 12:47             ` Richard B. Johnson
  2003-11-04 14:02           ` Brian Beattie
  2 siblings, 1 reply; 83+ messages in thread
From: Michael Clark @ 2003-11-04  8:47 UTC (permalink / raw)
  To: root; +Cc: Valdis.Kletnieks, Brian Beattie, Linux kernel

On 11/04/03 04:54, Richard B. Johnson wrote:
> It is particularly irksome to me because I studied Latin in
> High School, where I first encountered this word. My second
> encounter was where somebody corrupted it to mean some kind
> of new idea. Then some idiot named a company Paradigm and
> the end was clear.

How do you know the guy is an idiot - did you meet him?

Quite a good name for a loudspeaker company in respect to providing
a 'reference example' for sound.

Although funny they have a registered trademark for 'Paradigm'
- a bit generic methinks.

~mc



* Re: Things that Longhorn seems to be doing right
  2003-11-03 19:35   ` Brian Beattie
  2003-11-03 20:17     ` Richard B. Johnson
@ 2003-11-04  0:35     ` Daniel B.
  2003-11-04 14:05       ` Brian Beattie
  1 sibling, 1 reply; 83+ messages in thread
From: Daniel B. @ 2003-11-04  0:35 UTC (permalink / raw)
  Cc: linux-kernel

Brian Beattie wrote:
> 
> ... a real paradyne(sp?) shift.  

Close.  It's "paradigm."

:-)


Daniel
-- 
Daniel Barclay
dsb@smart.net


* Re: Things that Longhorn seems to be doing right
  2003-11-03 21:01           ` Valdis.Kletnieks
@ 2003-11-03 22:06             ` Måns Rullgård
  0 siblings, 0 replies; 83+ messages in thread
From: Måns Rullgård @ 2003-11-03 22:06 UTC (permalink / raw)
  To: linux-kernel

Valdis.Kletnieks@vt.edu writes:

>> No. don't you think I look things up in a dictionary before I
>> write a definition?
>> 
>> The Webster's "New Collegiate Dictionary", 150th Anniversary
>> Edition clearly  states;
>> 
>> 1. EXAMPLE, PATTERN;esp: an outstandingly clear or typical
>> example or archetype.
>
> The current prevailing paradigm is that words mean what convention says they
> mean, not what the dictionary says.
>
> You got as much chance of reclaiming that word as you do "hacker". ;)

Well, hacker still has the right meaning among hackers.  Maybe
paradigm survives in some group, as well, though I'm not sure where.

-- 
Måns Rullgård
mru@kth.se



* Re: Things that Longhorn seems to be doing right
  2003-11-03 20:54         ` Richard B. Johnson
@ 2003-11-03 21:01           ` Valdis.Kletnieks
  2003-11-03 22:06             ` Måns Rullgård
  2003-11-04  8:47           ` Michael Clark
  2003-11-04 14:02           ` Brian Beattie
  2 siblings, 1 reply; 83+ messages in thread
From: Valdis.Kletnieks @ 2003-11-03 21:01 UTC (permalink / raw)
  To: root; +Cc: Linux kernel


On Mon, 03 Nov 2003 15:54:29 EST, "Richard B. Johnson" said:

> No. don't you think I look things up in a dictionary before I
> write a definition?
> 
> The Webster's "New Collegiate Dictionary", 150th Anniversary
> Edition clearly  states;
> 
> 1. EXAMPLE, PATTERN;esp: an outstandingly clear or typical
> example or archetype.

The current prevailing paradigm is that words mean what convention says they
mean, not what the dictionary says.

You got as much chance of reclaiming that word as you do "hacker". ;)



* Re: Things that Longhorn seems to be doing right
  2003-11-03 20:23       ` Valdis.Kletnieks
  2003-11-03 20:54         ` Richard B. Johnson
@ 2003-11-03 20:55         ` Roland Dreier
  1 sibling, 0 replies; 83+ messages in thread
From: Roland Dreier @ 2003-11-03 20:55 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: root, Brian Beattie, Linux kernel

    Richard> Yeah. Nobody should ever use a word that's spelled like
    Richard> paradigm!  How about "really good example..." (which is
    Richard> what it means).

    Valdis> No, a paradigm is "a way of looking at things" or "a world
    Valdis> view".  So this would include paradigms like "the world is
    Valdis> flat" or "monolithic kernels are preferable" or "gcc is
    Valdis> the only compiler for the kernel".

Actually "paradigm" has both meanings -- it may mean "example" _or_
"framework."

 - R.


* Re: Things that Longhorn seems to be doing right
  2003-11-03 20:23       ` Valdis.Kletnieks
@ 2003-11-03 20:54         ` Richard B. Johnson
  2003-11-03 21:01           ` Valdis.Kletnieks
                             ` (2 more replies)
  2003-11-03 20:55         ` Roland Dreier
  1 sibling, 3 replies; 83+ messages in thread
From: Richard B. Johnson @ 2003-11-03 20:54 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Brian Beattie, Linux kernel

On Mon, 3 Nov 2003 Valdis.Kletnieks@vt.edu wrote:

> On Mon, 03 Nov 2003 15:17:55 EST, "Richard B. Johnson" said:
>
> > Yeah. Nobody should ever use a word that's spelled like paradigm!
> > How about "really good example..." (which is what it means).
>
> No, a paradigm is "a way of looking at things" or "a world view".
> So this would include paradigms like "the world is flat" or "monolithic
> kernels are preferable" or "gcc is the only compiler for the kernel".
>

No. don't you think I look things up in a dictionary before I
write a definition?

The Webster's "New Collegiate Dictionary", 150th Anniversary
Edition clearly  states;

1. EXAMPLE, PATTERN;esp: an outstandingly clear or typical
example or archetype.


It was stolen from the study of languages where it was used
to exhibit a word in all of its inflectional forms. It has
been corrupted to mean some "new view" of something while, in
fact, it is defined to mean EXAMPLE or PATTERN.

It is particularly irksome to me because I studied Latin in
High School, where I first encountered this word. My second
encounter was where somebody corrupted it to mean some kind
of new idea. Then some idiot named a company Paradigm and
the end was clear.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: Things that Longhorn seems to be doing right
  2003-11-03 20:17     ` Richard B. Johnson
@ 2003-11-03 20:23       ` Valdis.Kletnieks
  2003-11-03 20:54         ` Richard B. Johnson
  2003-11-03 20:55         ` Roland Dreier
  0 siblings, 2 replies; 83+ messages in thread
From: Valdis.Kletnieks @ 2003-11-03 20:23 UTC (permalink / raw)
  To: root; +Cc: Brian Beattie, Linux kernel


On Mon, 03 Nov 2003 15:17:55 EST, "Richard B. Johnson" said:

> Yeah. Nobody should ever use a word that's spelled like paradigm!
> How about "really good example..." (which is what it means).

No, a paradigm is "a way of looking at things" or "a world view".
So this would include paradigms like "the world is flat" or "monolithic
kernels are preferable" or "gcc is the only compiler for the kernel".



* Re: Things that Longhorn seems to be doing right
  2003-11-03 19:35   ` Brian Beattie
@ 2003-11-03 20:17     ` Richard B. Johnson
  2003-11-03 20:23       ` Valdis.Kletnieks
  2003-11-04  0:35     ` Daniel B.
  1 sibling, 1 reply; 83+ messages in thread
From: Richard B. Johnson @ 2003-11-03 20:17 UTC (permalink / raw)
  To: Brian Beattie; +Cc: Valdis.Kletnieks, Linux kernel

On Mon, 3 Nov 2003, Brian Beattie wrote:

> On Sun, 2003-11-02 at 12:15, Valdis.Kletnieks@vt.edu wrote:
> > On Sun, 02 Nov 2003 08:11:32 EST, Brian Beattie <beattie@beattie-home.net>  said:
> >
[SNIPPED...]

>
> I don't know about anytime soon and it would be a real paradyne(sp?)

Yeah. Nobody should ever use a word that's spelled like paradigm!
How about "really good example..." (which is what it means).

[SNIPPED...]

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: Things that Longhorn seems to be doing right
  2003-11-02 17:15 ` Valdis.Kletnieks
@ 2003-11-03 19:35   ` Brian Beattie
  2003-11-03 20:17     ` Richard B. Johnson
  2003-11-04  0:35     ` Daniel B.
  0 siblings, 2 replies; 83+ messages in thread
From: Brian Beattie @ 2003-11-03 19:35 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Sun, 2003-11-02 at 12:15, Valdis.Kletnieks@vt.edu wrote:
> On Sun, 02 Nov 2003 08:11:32 EST, Brian Beattie <beattie@beattie-home.net>  said:
> 
> > for storage might be feasible soon.  The idea is that you have a
> > permanent store, using raid or raid-like redundancy and file versioning
> > so that nothing is ever deleted, you just keep adding drives and
> > replacing those that fail.  Of course you'd need some geographic
> > diversity and a way for storage to migrate to newer "file stores" to
> > really work, but just think, no more backups to fail...ever!   
> 
> This may be very nice for the high end, but getting "geographic diversity"
> means you have to get space in a colo of some sort (unless you're a big enough
> site that you have another building of your own at least a mile or two away),
> and bandwidth between the two sites.

I don't know about anytime soon, and it would be a real paradyne(sp?)
shift.  As it is right now, how many home users do backups at all?  The
replication could well be rather low bandwidth, more to CYA in the event
of a fire or natural disaster, and really a secondary issue.  What
intrigues me is the notion of a permanent file store that is never
deleted.

I have no idea if this will ever make sense, but I notice that I
always have more disk than will fit on any backup media I can afford.
So this is really just a "what if?" thought.

-- 
Brian Beattie            | Experienced kernel hacker/embedded systems
beattie@beattie-home.net | programmer, direct or contract, short or
www.beattie-home.net     | long term, available immediately.

"Honor isn't about making the right choices.
It's about dealing with the consequences." -- Midori Koto



* Re: Things that Longhorn seems to be doing right
  2003-11-02 13:11 Brian Beattie
@ 2003-11-02 17:15 ` Valdis.Kletnieks
  2003-11-03 19:35   ` Brian Beattie
  0 siblings, 1 reply; 83+ messages in thread
From: Valdis.Kletnieks @ 2003-11-02 17:15 UTC (permalink / raw)
  To: Brian Beattie; +Cc: linux-kernel


On Sun, 02 Nov 2003 08:11:32 EST, Brian Beattie <beattie@beattie-home.net>  said:

> for storage might be feasible soon.  The idea is that you have a
> permanent store, using raid or raid-like redundancy and file versioning
> so that nothing is ever deleted, you just keep adding drives and
> replacing those that fail.  Of course you'd need some geographic
> diversity and a way for storage to migrate to newer "file stores" to
> really work, but just think, no more backups to fail...ever!   

This may be very nice for the high end, but getting "geographic diversity"
means you have to get space in a colo of some sort (unless you're a big enough
site that you have another building of your own at least a mile or two away),
and bandwidth between the two sites.

Somehow, I don't see this anytime soon for the home user, the SOHO user, or the
small company that has 7-8 internal servers and a T-1.

Remember that for this to work, the bandwidth and off-site storage has to be
available at a cost the user can afford.  Remember that a lot of people aren't
too happy with the current price point for cable or DSL access - and those
price points are set with a high overcommitment of bandwidth.  If everybody
starts trying to do backups over the network, the provider will have to build
out more capacity, and raise the price to cover it.

Yes, we're looking at offsite disk mirroring as a backup solution.  But we're
lucky that we have a large open space in a switch room some 3 miles from the
data center and dark fiber from here to there.  But it's STILL going to be a
big chunk of change. I dread to think what it would cost per month if we had to
pay for the space and bandwidth.




* Re: Things that Longhorn seems to be doing right
@ 2003-11-02 13:11 Brian Beattie
  2003-11-02 17:15 ` Valdis.Kletnieks
  0 siblings, 1 reply; 83+ messages in thread
From: Brian Beattie @ 2003-11-02 13:11 UTC (permalink / raw)
  To: linux-kernel

On Fri, 2003-10-31 at 08:59, Herman wrote:
 
> BTW, to my mind, the killer app in a business environment is the automatic 
> file versioning feature in longhorn.  This protects people against fat finger 
> mistakes, and geez, any business has its fair share of fat head, fat finger 
> and dumb blond types.  This is the only feature from VMS that I am longing 
> for...
> 

I have had this idea for a while: with the continued fall in price per
bit of storage, and with back-up strategies not catching up and perhaps
falling behind, maybe a new paradigm for storage might be feasible
soon.  The idea is that you have a permanent store, using raid or
raid-like redundancy and file versioning, so that nothing is ever
deleted; you just keep adding drives and replacing those that fail.
Of course you'd need some geographic diversity and a way for storage
to migrate to newer "file stores" to really work, but just think, no
more backups to fail...ever!
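
A minimal sketch of the "nothing is ever deleted" part, assuming an
in-memory dict stands in for the redundant disks. The class and method
names are made up for illustration; redundancy, geographic diversity and
migration are not modelled at all.

```python
class VersionedStore:
    """Append-only store: every write adds a version, nothing is overwritten."""

    def __init__(self):
        # name -> list of versions; None marks a deletion
        self._versions = {}

    def write(self, name, data):
        self._versions.setdefault(name, []).append(bytes(data))
        return len(self._versions[name]) - 1  # version number just created

    def delete(self, name):
        # "Deleting" only records a marker; older versions stay readable.
        if name not in self._versions:
            raise FileNotFoundError(name)
        self._versions[name].append(None)

    def read(self, name, version=-1):
        history = self._versions.get(name)
        if not history:
            raise FileNotFoundError(name)
        data = history[version]  # default -1 means the latest version
        if data is None:
            raise FileNotFoundError(f"{name} is deleted at this version")
        return data
```

The point of the design is that a fat-finger delete is just one more
entry in the history, so read("file", 0) still works afterwards.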

-- 
Brian Beattie            | Experienced kernel hacker/embedded systems
beattie@beattie-home.net | programmer, direct or contract, short or
www.beattie-home.net     | long term, available immediately.

"Honor isn't about making the right choices.
It's about dealing with the consequences." -- Midori Koto


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-31  1:46           ` Daniel B.
@ 2003-10-31  1:57             ` Philippe Troin
  0 siblings, 0 replies; 83+ messages in thread
From: Philippe Troin @ 2003-10-31  1:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List

"Daniel B." <dsb@smart.net> writes:

> Alex Belits wrote:
> > 
> > On Thu, 30 Oct 2003, Ihar 'Philips' Filipau wrote:
> > 
> ...
> > 
> > 3. Pluggable directory generator -- a userspace process can tell the
> > system to make an object that looks exactly like a directory, except that
> > its contents are provided by the process, that is being queried when the
> > directory is accessed.
> 
> That sounds like ClearCase's dynamically generated views of directories
> and files.

Or hurd's translators.

Phil.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 17:23         ` Alex Belits
@ 2003-10-31  1:46           ` Daniel B.
  2003-10-31  1:57             ` Philippe Troin
  0 siblings, 1 reply; 83+ messages in thread
From: Daniel B. @ 2003-10-31  1:46 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Alex Belits wrote:
> 
> On Thu, 30 Oct 2003, Ihar 'Philips' Filipau wrote:
> 
...
> 
> 3. Pluggable directory generator -- a userspace process can tell the
> system to make an object that looks exactly like a directory, except that
> its contents are provided by the process, that is being queried when the
> directory is accessed.

That sounds like ClearCase's dynamically generated views of directories
and files.

Daniel
-- 
Daniel Barclay
dsb@smart.net

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
  2003-10-30 11:10       ` Ihar 'Philips' Filipau
@ 2003-10-30 17:23         ` Alex Belits
  2003-10-31  1:46           ` Daniel B.
  0 siblings, 1 reply; 83+ messages in thread
From: Alex Belits @ 2003-10-30 17:23 UTC (permalink / raw)
  To: Ihar 'Philips' Filipau
  Cc: trelane, Linux Kernel Mailing List, Theodore Ts'o

On Thu, 30 Oct 2003, Ihar 'Philips' Filipau wrote:

> >>Keep in mind that just because Windows does thing a certain way
> >>doesn't mean we have to provide the same functionality in exactly the
> >>same way.
> >>Also keep in mind that Microsoft very deliberately blurs what they do
> >>in their "kernel" versus what they provide via system libraries (i.e.,
> >>API's provided via their DLL's, or shared libraries).
> >
> > Indeed, although certain things could be half-kernel, half-user
> >   (OK, 0.01% kernel, 99.99% user, e.g. userspace daemon that
> >   intercepts certain writes).  Of course, at that point, you might
> >   make a special library to interact with the daemon directly, although
> >   it's then not at all like just calling write().
> >
>
>    I believe this is a 100% user space issue.
>
>    And I think if one really wants to do something like this - Gnome's
> VFS is a good candidate for this. They already have all abstractions in
> place.

  Why not just provide a general-purpose interface for:

1. Userspace-visible transactions. A userspace process can mark a set of
fds, inodes, files, or "whatever this set of processes does from now on", and
tell the filesystem to keep a log of changes to that. Journaling will then
mark relevant changes (and possibly create an additional log depending on
the design, or pass the log-related information to another userspace
program), and treat them as a transaction, with the possibility of
rollback on kernel-originated error, on userspace request, or, possibly, at
the request of a transaction manager daemon that may have its own reason to
fail the transaction.

2. Update notifications. A set of files or directories, or whatever a
certain set of processes accesses, is being monitored, and the list
of changes (pages, byte ranges, lists of created/deleted directory
entries) is somehow maintained and being passed to a set of processes.
Processes can have passive monitoring (they will know what has been
changed -- good for indexers and other kinds of application-specific
daemons) or intrusive pass-through monitoring (the change is not applied
until the process confirms it, and transaction interface applies to this
if enabled -- this will be a performance hit, and can be done for, say, a
distributed transaction manager).

3. Pluggable directory generator -- a userspace process can tell the
system to make an object that looks exactly like a directory, except that
its contents are provided by the process, which is queried when the
directory is accessed.

Obviously, the need for performance/asynchronous access and security
requirements should be addressed in the implementations of those things;
however, the relatively small scope of the interfaces should make it possible
to do that in a more or less sane manner. Then userspace can have all kinds
of indexing monstrosities, transaction-using databases, transaction managers, etc.
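
As a rough illustration of item 1, here is a toy undo-log transaction
with rollback on error. It is only a sketch: a plain dict stands in for
the marked set of files, and the Transaction class and its methods are
hypothetical names, not an existing kernel or library interface.

```python
class Transaction:
    """Undo-log transaction over a dict that stands in for a set of files."""

    _MISSING = object()  # marks "file did not exist before the write"

    def __init__(self, files):
        self.files = files
        self.undo_log = []  # (name, previous content), in write order

    def write(self, name, data):
        # Log the old state before changing anything.
        self.undo_log.append((name, self.files.get(name, self._MISSING)))
        self.files[name] = data

    def rollback(self):
        # Replay the log backwards to restore the pre-transaction state.
        while self.undo_log:
            name, old = self.undo_log.pop()
            if old is self._MISSING:
                self.files.pop(name, None)
            else:
                self.files[name] = old

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            self.rollback()
        else:
            self.undo_log.clear()  # commit: forget the undo information
        return False  # never swallow the error that caused the rollback
```

Used as "with Transaction(files) as t: t.write(...)", any exception
inside the block undoes every write made through t, which is the
userspace-request and error-triggered rollback case from the list above.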

-- 
Alex

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
       [not found]         ` <MhLf.1pF.9@gated-at.bofh.it>
@ 2003-10-30 12:16           ` Ihar 'Philips' Filipau
  0 siblings, 0 replies; 83+ messages in thread
From: Ihar 'Philips' Filipau @ 2003-10-30 12:16 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Linux Kernel Mailing List

Hans Reiser wrote:
> 
> All that said, the indexes themselves should just be feature enhanced 
> directories accessed via the kernel.  Feature enhancements might include 
> such things as better space efficiency, ordering plugins, etc.
> 

   I still see no point in putting this into kernel space.

   As a proof of concept - if someone wants to try - one can implement 
this system on top of any other fs, in user space.

   open("/aaa.txt") ->
     inode ->
      underlying_fs.open(itoa(inode)+".meta")
      underlying_fs.open(itoa(inode)+".data")

   write(fd) -> fd<->inode -> updateindex(inode) + write(inode). [1]

   and so on and so forth. LD_PRELOAD=libcoolfs.so my_cool_app

   In other words, the file system can be used as smart storage.
   Hard links can be 'mis'used to implement search indices.

[1]  mmap() is the notable problem.
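
The mapping above can be sketched roughly as follows. This is a toy under
stated assumptions: a plain Python class rather than an LD_PRELOAD shim,
an in-memory word index rather than hard links, and made-up names (CoolFS,
lookup, search). Only the <inode>.meta / <inode>.data layout and the
"update index, then write" order are taken from the description.

```python
import json
import os


class CoolFS:
    """Toy user-space layer: each logical file is <inode>.data plus <inode>.meta."""

    def __init__(self, backing_dir):
        self.backing_dir = backing_dir
        self.names = {}   # logical path -> inode number
        self.index = {}   # word -> set of inode numbers (the "search index")
        self.next_inode = 1

    def _path(self, inode, suffix):
        return os.path.join(self.backing_dir, f"{inode}.{suffix}")

    def lookup(self, path):
        """Return the inode for path, allocating one on first use."""
        if path not in self.names:
            self.names[path] = self.next_inode
            self.next_inode += 1
        return self.names[path]

    def write(self, path, text):
        inode = self.lookup(path)
        # updateindex(inode): drop stale entries, then index the new words
        for inodes in self.index.values():
            inodes.discard(inode)
        for word in set(text.split()):
            self.index.setdefault(word, set()).add(inode)
        # write(inode): data and metadata go to the underlying fs
        with open(self._path(inode, "data"), "w") as f:
            f.write(text)
        with open(self._path(inode, "meta"), "w") as f:
            json.dump({"path": path, "chars": len(text)}, f)

    def read(self, path):
        with open(self._path(self.names[path], "data")) as f:
            return f.read()

    def search(self, word):
        """Logical paths of files whose last written text contains word."""
        by_inode = {inode: path for path, inode in self.names.items()}
        return sorted(by_inode[i] for i in self.index.get(word, ()))
```

Every write goes through the layer, so the index can never miss an
update; mmap() is exactly the case where that guarantee would break.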

-- 
Ihar 'Philips' Filipau  / with best regards from Saarbruecken.
--
   "... and for $64000 question, could you get yourself vaguely
      familiar with the notion of on-topic posting?"
				-- Al Viro @ LKML


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
       [not found]       ` <Mcs2.2FJ.5@gated-at.bofh.it>
@ 2003-10-30 12:04         ` Ihar 'Philips' Filipau
  0 siblings, 0 replies; 83+ messages in thread
From: Ihar 'Philips' Filipau @ 2003-10-30 12:04 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Linux Kernel Mailing List

Larry McVoy wrote:
> 
> select ID,STATUS,SEVERITY,PRIORITY,SUMMARY
> from bugs
> where	(SEVERITY == 1 or SEVERITY == 2 or SEVERITY == 3) and 
> 	(PRIORITY == 1 or PRIORITY == 2 or PRIORITY == 3) and 
> 	(STATUS == "new" or STATUS == "open" or STATUS == "assigned")
> order by
> 	ID
> 

   In fact it needn't be text.
   Some kind of encoding like ASN.1 will do the job.
   Parsing text in C is really painful.

   my 0.02 euro.
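
   To illustrate what a non-text form of one clause of that query could
look like, here is a sketch of a tag/count/length-prefixed layout. It is
not real ASN.1, only a simplified encoding in the same spirit; the numeric
field tags and the function names are assumptions made up for the example.

```python
import struct

# Hypothetical numeric tags for the columns in the quoted query.
FIELDS = {"SEVERITY": 1, "PRIORITY": 2, "STATUS": 3}


def encode_in_clause(field, values):
    """Encode 'field is one of values' as tag, count, then length-prefixed values."""
    out = struct.pack("!BB", FIELDS[field], len(values))
    for value in values:
        raw = str(value).encode("utf-8")
        out += struct.pack("!B", len(raw)) + raw  # values limited to 255 bytes
    return out


def decode_in_clause(buf, offset=0):
    """Inverse of encode_in_clause; returns (field, values, next offset).

    Values come back as strings, whatever type was passed to the encoder.
    """
    tag, count = struct.unpack_from("!BB", buf, offset)
    offset += 2
    values = []
    for _ in range(count):
        (length,) = struct.unpack_from("!B", buf, offset)
        offset += 1
        values.append(buf[offset:offset + length].decode("utf-8"))
        offset += length
    field = next(name for name, t in FIELDS.items() if t == tag)
    return field, values, offset
```

   The decoder never has to tokenize or handle quoting; it only reads
fixed-size headers and counted byte strings.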

-- 
Ihar 'Philips' Filipau  / with best regards from Saarbruecken.
--
   "... and for $64000 question, could you get yourself vaguely
      familiar with the notion of on-topic posting?"
				-- Al Viro @ LKML


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Things that Longhorn seems to be doing right
       [not found]     ` <Maqe.8l3.9@gated-at.bofh.it>
@ 2003-10-30 11:10       ` Ihar 'Philips' Filipau
  2003-10-30 17:23         ` Alex Belits
  0 siblings, 1 reply; 83+ messages in thread
From: Ihar 'Philips' Filipau @ 2003-10-30 11:10 UTC (permalink / raw)
  To: trelane; +Cc: Linux Kernel Mailing List, Theodore Ts'o

Joseph Pingenot wrote:
> From Theodore Ts'o on Wednesday, 29 October, 2003:
> 
>>Keep in mind that just because Windows does thing a certain way
>>doesn't mean we have to provide the same functionality in exactly the
>>same way.
>>Also keep in mind that Microsoft very deliberately blurs what they do
>>in their "kernel" versus what they provide via system libraries (i.e.,
>>API's provided via their DLL's, or shared libraries).
> 
> Indeed, although certain things could be half-kernel, half-user
>   (OK, 0.01% kernel, 99.99% user, e.g. userspace daemon that
>   intercepts certain writes).  Of course, at that point, you might
>   make a special library to interact with the daemon directly, although
>   it's then not at all like just calling write().
> 

   I believe this is a 100% user space issue.

   And I think if one really wants to do something like this - Gnome's 
VFS is a good candidate for this. They already have all abstractions in 
place.

   [ Yes, sure I'm not using gnome myself - but knowing the nature of the 
project I bet they already started doing something like this ;-))) Ashes 
to ashes, dust to dust - bloat to bloat. ]

-- 
Ihar 'Philips' Filipau  / with best regards from Saarbruecken.
--
   "... and for $64000 question, could you get yourself vaguely
      familiar with the notion of on-topic posting?"
				-- Al Viro @ LKML


^ permalink raw reply	[flat|nested] 83+ messages in thread

end of thread, other threads:[~2003-11-05 14:07 UTC | newest]

Thread overview: 83+ messages
2003-10-29  8:50 Things that Longhorn seems to be doing right Hans Reiser
2003-10-29 22:42 ` Erik Andersen
2003-10-29 23:03   ` Hans Reiser
2003-10-29 22:25     ` Dax Kelson
2003-10-30  0:20       ` Joseph Pingenot
2003-10-30  0:54         ` Neil Brown
2003-10-30  1:34           ` Joseph Pingenot
2003-10-30  2:54             ` Bernd Eckenfels
2003-10-30  2:58               ` Arnaldo Carvalho de Melo
2003-10-30  3:16               ` Joseph Pingenot
2003-10-30  5:28                 ` Jeff Garzik
2003-10-30  5:56                   ` Valdis.Kletnieks
2003-10-30  3:16             ` Neil Brown
2003-10-30  3:39               ` Joseph Pingenot
2003-10-30 10:27             ` Thorsten Körner
2003-10-30 21:28             ` jlnance
2003-10-30 22:29               ` Måns Rullgård
2003-10-31  2:03                 ` Daniel B.
2003-10-31  1:04               ` Clemens Schwaighofer
2003-10-30  2:09         ` Alex Belits
2003-10-30  3:12           ` Joseph Pingenot
2003-10-30  4:21             ` Scott Robert Ladd
2003-10-31 16:42               ` Timothy Miller
2003-10-31 19:15                 ` Hans Reiser
2003-10-30  9:52             ` Ingo Oeser
2003-10-30  4:06           ` Scott Robert Ladd
2003-10-30  1:52   ` Theodore Ts'o
2003-10-30  2:03     ` Joseph Pingenot
2003-10-30  9:23       ` Ingo Oeser
2003-10-30  3:57     ` Scott Robert Ladd
2003-10-30  4:08       ` Larry McVoy
2003-10-30 13:46       ` Jesse Pollard
2003-10-31  4:50       ` Stephen Satchell
2003-10-30  7:33     ` Diego Calleja García
2003-10-30  8:43       ` Giuliano Pochini
2003-10-30  8:05     ` Hans Reiser
2003-10-30  8:17       ` Wichert Akkerman
2003-10-30 11:59         ` Hans Reiser
2003-10-30  9:14       ` Giuliano Pochini
2003-10-30  9:55         ` Hans Reiser
2003-10-30 17:48       ` Theodore Ts'o
2003-10-30 19:23         ` Hans Reiser
2003-10-30 20:31           ` Theodore Ts'o
2003-10-31  7:40             ` Hans Reiser
2003-10-31 19:30               ` Theodore Ts'o
2003-10-31 20:47                 ` Hans Reiser
2003-10-31 13:59                   ` Herman
2003-10-31 21:23                     ` Richard B. Johnson
2003-11-01 18:30                       ` Hans Reiser
2003-10-31 21:08                   ` David S. Miller
2003-11-02 21:42                     ` Hans Reiser
2003-11-03 12:42                 ` Nikita Danilov
2003-11-03 16:58                   ` Timothy Miller
2003-11-04  8:13                     ` Hans Reiser
2003-11-05 13:51                       ` Ingo Oeser
2003-11-05  2:07                         ` Hans Reiser
2003-10-31 11:01         ` Kenneth Johansson
2003-10-31 13:52           ` Jesse Pollard
2003-10-30 11:21     ` Felipe Alfaro Solana
2003-10-30  7:25 ` Christian Axelsson
2003-10-30  8:10   ` Hans Reiser
     [not found] ` <200311011731.10052.ioe-lkml@rameria.de>
     [not found]   ` <3FA3FF46.7010309@namesys.com>
2003-11-03 10:55     ` Ingo Oeser
2003-11-04  8:10       ` Hans Reiser
     [not found] <LUlv.31e.5@gated-at.bofh.it>
     [not found] ` <M7iG.41B.7@gated-at.bofh.it>
     [not found]   ` <MagC.82U.7@gated-at.bofh.it>
     [not found]     ` <Maqe.8l3.9@gated-at.bofh.it>
2003-10-30 11:10       ` Ihar 'Philips' Filipau
2003-10-30 17:23         ` Alex Belits
2003-10-31  1:46           ` Daniel B.
2003-10-31  1:57             ` Philippe Troin
     [not found]     ` <Mcig.2uf.1@gated-at.bofh.it>
     [not found]       ` <Mcs2.2FJ.5@gated-at.bofh.it>
2003-10-30 12:04         ` Ihar 'Philips' Filipau
     [not found]     ` <Mg2B.7wf.9@gated-at.bofh.it>
     [not found]       ` <Mh8n.BT.9@gated-at.bofh.it>
     [not found]         ` <MhLf.1pF.9@gated-at.bofh.it>
2003-10-30 12:16           ` Ihar 'Philips' Filipau
2003-11-02 13:11 Brian Beattie
2003-11-02 17:15 ` Valdis.Kletnieks
2003-11-03 19:35   ` Brian Beattie
2003-11-03 20:17     ` Richard B. Johnson
2003-11-03 20:23       ` Valdis.Kletnieks
2003-11-03 20:54         ` Richard B. Johnson
2003-11-03 21:01           ` Valdis.Kletnieks
2003-11-03 22:06             ` Måns Rullgård
2003-11-04  8:47           ` Michael Clark
2003-11-04 12:47             ` Richard B. Johnson
2003-11-04 14:02           ` Brian Beattie
2003-11-03 20:55         ` Roland Dreier
2003-11-04  0:35     ` Daniel B.
2003-11-04 14:05       ` Brian Beattie
