linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: udev and devfs - The final word
@ 2004-01-08 13:53 "Andrey Borzenkov" 
  2004-01-08 15:40 ` Ian Kent
  2004-01-09  8:51 ` Helge Hafting
  0 siblings, 2 replies; 158+ messages in thread
From: "Andrey Borzenkov"  @ 2004-01-08 13:53 UTC (permalink / raw)
  To: "Greg KH" ; +Cc: linux-kernel


> So, how does devfs stack up to the above problems and constraints:
>   Problems:
>     1) devfs only shows the dev entries for the devices in the system.

Is this a problem? Where exactly this problem lies?

>     2) devfs does not handle the need for dynamic major/minor numbers

Neither does udev. Both take whatever driver gives them.

>     3) devfs does not provide a way to name devices in a persistent
>        fashion.

I am not sure what exactly you mean here.

>     4) devfs does provide a deamon that userspace programs can hook > into
>        to listen to see what devices are being created or removed.
>   Constraints:
>     1) devfs forces the devfs naming policy into the kernel.  If you
>        don't like this naming scheme, tough.

kernel imposes naming scheme for exporting devices in sysfs. It is
possible to get rid of devfs_name in kernel and use those names
that must exist anyway to support udev as well. devfs has
devfsd that can call whatever naming agent you like.

>     2) devfs does not follow the LSB device naming standard.

it is user-space (devfsd) issue, not kernel space (devfs)

>     3) devfs is small, and embedded devices use it.  However it is
>        implemented in non-pagable memory.

Same for sysfs. Other Unices have pageable kernel memory. If Linux
had it any memory based filesystem could benefit from it. I did not
look at backing store for sysfs patches but it is likely that same
idea could be used for devfs.

> Oh yeah, and there are the insolvable race conditions with the devfs
> implementation in the kernel, but I'm not going to talk about them > right

I do not argue that current devfs implementation is ugly and racy. I
just beg you to point at what makes those races "unsolvable".

> now, sorry.  See the linux-kernel archives if you care about them (and
> if you use devfs, you should care...)

I do care. Searching archives for devfs mostly brings "everyone knows
this is crap". That is why I kindly ask you to show real evidence that
the problems it has are unsolvable.

> So devfs is 2 for 7, ignoring the kernel races.

Hmm ... I really see only one - devfs names that are historically
used. Assuming that

- devfs just exports kernel names (that must exist anyway)
- sysfs provides consistent cdev view as it does for bdev

devfsd simply can take kernel name and call whatever program you like
to implement naming policy including udev. With added benefit of
removable devices support :)

regards

-andrey

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 13:53 udev and devfs - The final word "Andrey Borzenkov" 
@ 2004-01-08 15:40 ` Ian Kent
  2004-01-08 17:26   ` Diego Calleja
  2004-01-08 18:14   ` Alex Goddard
  2004-01-09  8:51 ` Helge Hafting
  1 sibling, 2 replies; 158+ messages in thread
From: Ian Kent @ 2004-01-08 15:40 UTC (permalink / raw)
  To: "Andrey Borzenkov" ; +Cc: "Greg KH" , linux-kernel

On Thu, 8 Jan 2004, [koi8-r] "Andrey Borzenkov[koi8-r] "  wrote:

>
> >     4) devfs does provide a deamon that userspace programs can hook > into
> >        to listen to see what devices are being created or removed.
> >   Constraints:
> >     1) devfs forces the devfs naming policy into the kernel.  If you
> >        don't like this naming scheme, tough.
>
> kernel imposes naming scheme for exporting devices in sysfs. It is
> possible to get rid of devfs_name in kernel and use those names
> that must exist anyway to support udev as well. devfs has
> devfsd that can call whatever naming agent you like.

Yes. I'm having trouble finding justification for that statement as
well.

devfs appears to have almost no device name info within it.

>
> >     2) devfs does not follow the LSB device naming standard.
>
> it is user-space (devfsd) issue, not kernel space (devfs)

And there is heaps of device naming going on in devfsd. As is what people
seem to be recommending.

> > Oh yeah, and there are the insolvable race conditions with the devfs
> > implementation in the kernel, but I'm not going to talk about them > right
>
> I do not argue that current devfs implementation is ugly and racy. I
> just beg you to point at what makes those races "unsolvable".
>
> > now, sorry.  See the linux-kernel archives if you care about them (and
> > if you use devfs, you should care...)
>
> I do care. Searching archives for devfs mostly brings "everyone knows
> this is crap". That is why I kindly ask you to show real evidence that
> the problems it has are unsolvable.

Again I'm also unable to find descriptions of the 'unsolvable' races.

I wouldn't mind knowing what they are either. Anyone?

Ian



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 15:40 ` Ian Kent
@ 2004-01-08 17:26   ` Diego Calleja
  2004-01-08 19:25     ` Andrey Borzenkov
  2004-01-08 18:14   ` Alex Goddard
  1 sibling, 1 reply; 158+ messages in thread
From: Diego Calleja @ 2004-01-08 17:26 UTC (permalink / raw)
  To: Ian Kent; +Cc: arvidjaar, greg, linux-kernel

El Thu, 8 Jan 2004 23:40:16 +0800 (WST) Ian Kent <raven@themaw.net> escribió:

> 
> Again I'm also unable to find descriptions of the 'unsolvable' races.
> 
> I wouldn't mind knowing what they are either. Anyone?

You can find tons of examples (several of them patches by Al Viro to fix them) by
searching with google with keywords like "devfs races". The "should fix" list
(http://www.kernel.org/pub/linux/kernel/people/akpm/must-fix) has this:

hch: devfs: there's a fundamental lookup vs devfsd race that's only
  fixable by introducing a lookup vs devfs deadlock.  I can't see how this is
  fixable without getting rid of the current devfsd design.  Mandrake seems
  to have a workaround for this so this is at least not triggered so easily,
  but that's not what I'd consider a fix..

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 15:40 ` Ian Kent
  2004-01-08 17:26   ` Diego Calleja
@ 2004-01-08 18:14   ` Alex Goddard
  2004-01-08 18:35     ` Alex Goddard
  2004-01-08 19:22     ` Andrey Borzenkov
  1 sibling, 2 replies; 158+ messages in thread
From: Alex Goddard @ 2004-01-08 18:14 UTC (permalink / raw)
  To: Ian Kent; +Cc: "Andrey Borzenkov" , linux-kernel

On Thu, 8 Jan 2004, Ian Kent wrote:

> On Thu, 8 Jan 2004, [koi8-r] "Andrey Borzenkov[koi8-r] "  wrote:

[Snip]

> > I do care. Searching archives for devfs mostly brings "everyone knows
> > this is crap". That is why I kindly ask you to show real evidence that
> > the problems it has are unsolvable.
> 
> Again I'm also unable to find descriptions of the 'unsolvable' races.
> 
> I wouldn't mind knowing what they are either. Anyone?

Can no one think of good search terms for these sorts of things these
days?  From the first and only page of results for 'devfs unsolvable
races':

http://marc.theaimsgroup.com/?l=linux-kernel&m=105851630726585&w=2

-- 
Alex Goddard
agoddard at purdue.edu

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 18:14   ` Alex Goddard
@ 2004-01-08 18:35     ` Alex Goddard
  2004-01-08 19:22     ` Andrey Borzenkov
  1 sibling, 0 replies; 158+ messages in thread
From: Alex Goddard @ 2004-01-08 18:35 UTC (permalink / raw)
  To: Ian Kent; +Cc: "Andrey Borzenkov" , linux-kernel

On Thu, 8 Jan 2004, Alex Goddard wrote:

> On Thu, 8 Jan 2004, Ian Kent wrote:
> 
> > On Thu, 8 Jan 2004, [koi8-r] "Andrey Borzenkov[koi8-r] "  wrote:
> 
> [Snip]
> 
> > > I do care. Searching archives for devfs mostly brings "everyone knows
> > > this is crap". That is why I kindly ask you to show real evidence that
> > > the problems it has are unsolvable.
> > 
> > Again I'm also unable to find descriptions of the 'unsolvable' races.
> > 
> > I wouldn't mind knowing what they are either. Anyone?
> 
> Can no one think of good search terms for these sorts of things these
> days?  From the first and only page of results for 'devfs unsolvable
> races':
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=105851630726585&w=2

Someone else already mentioned the many, many posts by Al Viro on this
subject I was about to post a URL or two for.  They make for good (and
entertainingly flame-ridden) reading.

-- 
Alex Goddard
agoddard at purdue.edu

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 18:14   ` Alex Goddard
  2004-01-08 18:35     ` Alex Goddard
@ 2004-01-08 19:22     ` Andrey Borzenkov
  1 sibling, 0 replies; 158+ messages in thread
From: Andrey Borzenkov @ 2004-01-08 19:22 UTC (permalink / raw)
  To: Alex Goddard, Ian Kent; +Cc: linux-kernel

On Thursday 08 January 2004 21:14, Alex Goddard wrote:
> On Thu, 8 Jan 2004, Ian Kent wrote:
> > On Thu, 8 Jan 2004, [koi8-r] "Andrey Borzenkov[koi8-r] "  wrote:
>
> [Snip]
>
> > > I do care. Searching archives for devfs mostly brings "everyone knows
> > > this is crap". That is why I kindly ask you to show real evidence that
> > > the problems it has are unsolvable.
> >
> > Again I'm also unable to find descriptions of the 'unsolvable' races.
> >
> > I wouldn't mind knowing what they are either. Anyone?
>
> Can no one think of good search terms for these sorts of things these
> days?  From the first and only page

hmm ... that does not sound like "tons of" as someone else mentioned does it?

> of results for 'devfs unsolvable 
> races':
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=105851630726585&w=2

oh, that same again ... you are beating dead horse. See reply to another 
similar post.

thank you

-andrey


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 17:26   ` Diego Calleja
@ 2004-01-08 19:25     ` Andrey Borzenkov
  2004-01-08 22:40       ` Alex Goddard
  0 siblings, 1 reply; 158+ messages in thread
From: Andrey Borzenkov @ 2004-01-08 19:25 UTC (permalink / raw)
  To: Diego Calleja, Ian Kent; +Cc: greg, linux-kernel

On Thursday 08 January 2004 20:26, Diego Calleja wrote:
> El Thu, 8 Jan 2004 23:40:16 +0800 (WST) Ian Kent <raven@themaw.net> 
escribió:
> > Again I'm also unable to find descriptions of the 'unsolvable' races.
> >
> > I wouldn't mind knowing what they are either. Anyone?
>
> You can find tons of examples (several of them patches by Al Viro to fix
> them) by searching with google with keywords like "devfs races". The
> "should fix" list
> (http://www.kernel.org/pub/linux/kernel/people/akpm/must-fix) has this:
>

is it a gospel?

> hch: devfs: there's a fundamental lookup vs devfsd race that's only
>   fixable by introducing a lookup vs devfs deadlock.  I can't see how this
> is fixable without getting rid of the current devfsd design.  Mandrake
> seems to have a workaround for this so this is at least not triggered so
> easily, but that's not what I'd consider a fix..

oh, well ... if you selected this example ...

Mandrake workaround it mentions was my first attempt to fix this; this did not 
fix the devfs but rather fixed the user-space program that provoked this on 
boot (and that was buggy irrespectively of this problem).

Current 2.6 kernel includes my fix to deadlock condition. Current -mm includes 
one possible fix for race condition; Andrew Morton mentioned that it is 
unlikely to be accepted due to minor changes in VFS layer; I am working on 
another less intrusive fix and overall devfs cleanup.

Would you please instead of citing long obsolete paper show me real example 
and explain *why* it is not fixable. Better yet, would you take some time to 
try to provoke any of those huge races and report back your success (stack 
trace and instructions how to reproduce them are welcome :)

Thank you

-andrey


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 19:25     ` Andrey Borzenkov
@ 2004-01-08 22:40       ` Alex Goddard
  2004-01-09  7:03         ` "Andrey Borzenkov" 
  0 siblings, 1 reply; 158+ messages in thread
From: Alex Goddard @ 2004-01-08 22:40 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: Diego Calleja, Ian Kent, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN, Size: 3059 bytes --]

To shorten my reply from one post to two:

I wasn't aware the lookup race had been fixed.  Silly me.  As to the
number of results for the terms I used, I was looking for that specific
post, as I'd cited it before.  'devfs deadlock' also returns plenty of
good (and irrelevant)  results at that same archive.

On Thu, 8 Jan 2004, Andrey Borzenkov wrote:

> On Thursday 08 January 2004 20:26, Diego Calleja wrote:
> > El Thu, 8 Jan 2004 23:40:16 +0800 (WST) Ian Kent <raven@themaw.net> 
> escribió:
> > > Again I'm also unable to find descriptions of the 'unsolvable' races.
> > >
> > > I wouldn't mind knowing what they are either. Anyone?
> >
> > You can find tons of examples (several of them patches by Al Viro to fix
> > them) by searching with google with keywords like "devfs races". The
> > "should fix" list
> > (http://www.kernel.org/pub/linux/kernel/people/akpm/must-fix) has this:
> >
> 
> is it a gospel?

Given akpm's track record and, the fact that he's going to be maintaining
2.6, his word is enough for me.  But...

[lookup vs. devfsd deadlock]

> oh, well ... if you selected this example ...
> 
> Mandrake workaround it mentions was my first attempt to fix this; this
> did not fix the devfs but rather fixed the user-space program that
> provoked this on boot (and that was buggy irrespectively of this
> problem).
> 
> Current 2.6 kernel includes my fix to deadlock condition. Current -mm
> includes one possible fix for race condition; Andrew Morton mentioned
> that it is unlikely to be accepted due to minor changes in VFS layer; I
> am working on another less intrusive fix and overall devfs cleanup.
> 
> Would you please instead of citing long obsolete paper show me real example 
> and explain *why* it is not fixable. Better yet, would you take some time to 
> try to provoke any of those huge races and report back your success (stack 
> trace and instructions how to reproduce them are welcome :)
> 
> Thank you

One should do one's own research.  If I'd done my own better, I would've
found a better example than the one I posted previously.  However, since
you seem unwilling or unable to do your own homework, and I have nearly
unlimited free time until next Monday, here's some more:

http://marc.theaimsgroup.com/?l=linux-kernel&m=103696697201341&w=2

Some parts of that post have been fixed.  I haven't taken the time to read
the rest of the thread (I don't remember that thread being very long).

However, I'm all but positive the problems Viro points out in this post
still apply:  
http://marc.theaimsgroup.com/?l=linux-kernel&m=107223747630894&w=2

These posts are out there and they are _not_ hard to find at all.  That
last one was from about two weeks ago.  The other is much older, and I'd
have to do some digging in ugly, ugly code to find points that still
apply.  However, as I started this paragraph by saying, this stuff isn't
hard to find at all.

Please consider doing your own homework.  It's not like discussion of
devfs and its problems has been rare.

Thank you.

-- 
Alex Goddard
agoddard at purdue.edu

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 22:40       ` Alex Goddard
@ 2004-01-09  7:03         ` "Andrey Borzenkov" 
  0 siblings, 0 replies; 158+ messages in thread
From: "Andrey Borzenkov"  @ 2004-01-09  7:03 UTC (permalink / raw)
  To: "Alex Goddard" 
  Cc: "Diego Calleja" , "Ian Kent" , linux-kernel



> One should do one's own research.  If I'd done my own better, I would've
> found a better example than the one I posted previously.  However, since
> you seem unwilling or unable to do your own homework, and I have nearly
> unlimited free time until next Monday,

lucky you; I wish I could say the same :(

> here's some more:
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=103696697201341&w=2
> 
> Some parts of that post have been fixed.  I haven't taken the time to read
> the rest of the thread (I don't remember that thread being very long).
> 
> However, I'm all but positive the problems Viro points out in this post
> still apply:  
> http://marc.theaimsgroup.com/?l=linux-kernel&m=107223747630894&w=2
> 

May I ask you once more - please show me the _unsolvable_ parts.
Articles you quote discuss devfs shortcomings and design problems;
many of them are fixed; others can be fixed; none of them are
unfixable.

thank you

-andrey

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-08 13:53 udev and devfs - The final word "Andrey Borzenkov" 
  2004-01-08 15:40 ` Ian Kent
@ 2004-01-09  8:51 ` Helge Hafting
  1 sibling, 0 replies; 158+ messages in thread
From: Helge Hafting @ 2004-01-09  8:51 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: Greg KH, linux-kernel

Andrey Borzenkov wrote:
>>So, how does devfs stack up to the above problems and constraints:
>>  Problems:
>>    1) devfs only shows the dev entries for the devices in the system.
> 
> 
> Is this a problem? Where exactly this problem lies?
> 

Some people like to unload their (modular) devices so they don't use
memory when not in use.  This saves them a few kB.
It is then useful to have the device node sitting there anyway, functioning
as a trigger.  The kernel will autoload the module when the device is opened,
but that won't happen if there's no device node to open.

This approach can be made to work with devfs too - by splitting the device
into two pieces.  One that manages the device node, and autoloads the rest
when needed.


> 
>>    2) devfs does not handle the need for dynamic major/minor numbers
> 
> 
> Neither does udev. Both take whatever driver gives them.
> 
> 
>>    3) devfs does not provide a way to name devices in a persistent
>>       fashion.
> 
> 
> I am not sure what exactly you mean here.
> 
Probably something like
mv /dev/hda /dev/mydisk
and have it remembered upon the next reboot.  devfsd can do that though.

> 
>>    4) devfs does provide a deamon that userspace programs can hook > into
>>       to listen to see what devices are being created or removed.
>>  Constraints:
>>    1) devfs forces the devfs naming policy into the kernel.  If you
>>       don't like this naming scheme, tough.
> 
> 
> kernel imposes naming scheme for exporting devices in sysfs. It is
> possible to get rid of devfs_name in kernel and use those names
> that must exist anyway to support udev as well. devfs has
> devfsd that can call whatever naming agent you like.
> 
> 
>>    2) devfs does not follow the LSB device naming standard.
> 
> 
> it is user-space (devfsd) issue, not kernel space (devfs)
> 
> 
>>    3) devfs is small, and embedded devices use it.  However it is
>>       implemented in non-pagable memory.
> 
> 
> Same for sysfs. Other Unices have pageable kernel memory. If Linux
> had it any memory based filesystem could benefit from it. I did not
> look at backing store for sysfs patches but it is likely that same
> idea could be used for devfs.
> 
> 
>>Oh yeah, and there are the insolvable race conditions with the devfs
>>implementation in the kernel, but I'm not going to talk about them > right
> 
> 
> I do not argue that current devfs implementation is ugly and racy. I
> just beg you to point at what makes those races "unsolvable".
> 
They are of course not unsolvable.  Those understanding them seems
to believe that a complete rewrite will be easier than the rather
invasive changes necessary to fix the races.  This is why the
_current implementation_ is marked obsolete.  Nobody is volunteering
to fix devfs or rewrite it though.


Helge Hafting


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:52                                                       ` Linus Torvalds
  2004-01-05  6:11                                                         ` viro
  2004-01-05  7:47                                                         ` Greg KH
@ 2004-01-11 22:12                                                         ` Ed L Cashin
  2 siblings, 0 replies; 158+ messages in thread
From: Ed L Cashin @ 2004-01-11 22:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds

Linus Torvalds <torvalds@osdl.org> writes:

...
> But since you brought it up: do you actually have anything else that can
> open a remote IMAP file with a few thousand messages without taking ages
> for it, and that you don't have to mouse around with? I'd like a graphical
> interface for configuring stuff etc, but I sure as hell don't want to find
> some f*ing icon to save a few messages that I selected in-order to my
> "doit" queue or go to the next one, or pipe the thing to a shell-script,
> or any number of things that are my actual _job_.

Don't you already use emacs?  Emacs has gnus!  Gnus now has nnimap, an
imap backend.

> And the "no mousing" means that I don't want to have some popup window 
> that asks me what file I want to save into or similar crap. I can type 
> fast enough if I stay on the keyboard and can focus on one part of the 
> screen, but if I have to switch my focus around, I'm a goner.
>
> On a related matter, I'm probably a retard, but I've tried alternatives to
> "trn" too, and there really aren't any. None of the graphical news readers
> can show me one full page of threads, select the 3-4 threads from _that_
> one page that I want (from the keyboard), and then kill _that_ one page.
> Not the whole newsgroup: only the part that shows in the window at that
> time.

Man, you sound so ready for gnus.  It does nntp as well.

-- 
--Ed L Cashin            |   PGP public key:
  ecashin@uga.edu        |   http://noserose.net/e/pgp/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
@ 2004-01-08 13:05 "Andrey Borzenkov" 
  0 siblings, 0 replies; 158+ messages in thread
From: "Andrey Borzenkov"  @ 2004-01-08 13:05 UTC (permalink / raw)
  To: "Greg KH" ; +Cc: linux-kernel

> > > 
> > >  2) We are (well, were) running out of major and minor numbers for
> > >     devices.
> > 
> > devfs tried to fix this one by _getting rid_ of those numbers.
> > Seriously - what are they needed for?  
>
> But devfs failed in this.  The devfs kernel interface still requires a
> major/minor number to create device nodes.
>
> Hopefully I can work on fixing this up in 2.7.

You must be kidding here. Where exactly do you see devfs fault?

It is not "devfs kernel interface". It is "kernel userland interface"
that requires a major/minor to associate device node with device.

Devfs (in its current shape) is no more than just a repository of
(device node, major/minor) relations. Devfs does not care where
dev number comes from. Driver may hardcode it or request
dynamically from kernel, either way is fine.

Answering your other mail - "devfs kernel interface" provided ability
to request dynamic device number when registering a node. Sounds
very much like what you'd wish. Somebody decided it was evil and
removed it. I personally do not care either way.

regards

-andrey

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-07 13:39                                                     ` Robin Rosenberg
@ 2004-01-07 17:16                                                       ` Nigel Cunningham
  0 siblings, 0 replies; 158+ messages in thread
From: Nigel Cunningham @ 2004-01-07 17:16 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1868 bytes --]

Ah. Well if you've unmounted filesystems prior to suspending, I would
expect you should be fine. The device numbers might change - if they can
change between mounts - but that won't be any different because of
suspending. If you're talking about suspending with the file systems
mounted, that ought to work to (once the appropriate power management
support is done). If the user fails to reconnect the device before
resuming, they should expect the same problems that they would encounter
if they pulled it out without suspending. Of course I'm saying 'should'
a lot here. Let me use it one more time... in my mind at least, the fact
that we've suspended should be irrelevant to how things work.

Regards,

Nigel

On Thu, 2004-01-08 at 02:39, Robin Rosenberg wrote:
> måndagen den 5 januari 2004 13.39 skrev Nigel Cunningham:
> > Hi.
> >
> > The suspend to disk implementations all assume that devices are not
> > [dis]appearing under us while we're suspended. If you do go adding and
> > removing devices while the power is off, you can expect the same
> > problems you'd get if you removed them without suspending the machine.
> > It would be roughly equivalent to hot[un]plugging devices.
> 
> Yes. It's very unclear unless you do mind reading, but I had in mind mounted filesystems
> such as /home on a USB stick or firewire Reasonable? yes! But such devices have to
> be rediscovered and allocated in such a way that the user can resume using the device
> as soon as it has been found. And it should not fail miserably if the user forgets to connect
> the device before resuming the machine. As you cannot unmount /home (usually) the
> kernel must remember the device somehow or make mounting file systems more loosely
> than today.
> 
> -- robin
-- 
My work on Software Suspend is graciously brought to you by
LinuxFund.org.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  1:41                                                             ` Andries Brouwer
@ 2004-01-07 17:14                                                               ` Greg KH
  0 siblings, 0 replies; 158+ messages in thread
From: Greg KH @ 2004-01-07 17:14 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Daniel Jacobowitz, Rob Love, rob, Pascal Schmidt,
	linux-kernel

On Tue, Jan 06, 2004 at 02:41:15AM +0100, Andries Brouwer wrote:
> On Mon, Jan 05, 2004 at 04:00:15PM -0800, Greg KH wrote:
> 
> > > > Have you even _tried_ udev?
> > > 
> > > Yes, and it works reasonably well. I have version 012 here.
> > > Some flaws will be fixed in 013 or so.
> > 
> > What flaws would that be?  The short time delay for partitions?  Or
> > something else?
> 
> Yes, partitions are not handled very well.
> So far I have never seen udev discover partitions on its own.

That is because it can not.  Please see the current thread "removable
media revalidation - udev vs. devfs or static /dev" on lkml for a
solution to this.

> > > Some difficulties are of a more fundamental type, not so easy to fix.
> > 
> > Such as?
> 
> Udev cannot do anything when there are no events.
> And media insertion or removal does not always give events.

Exactly.  That's why userspace needs to poll for this.

> [By the way, a compilation warning for every C file:
> % make
> gcc  -pipe -Wall -Wmore.. -Os -fomit-frame-pointer -D_GNU_SOURCE \
>   -I/usr/lib/gcc-lib/i486-suse-linux/3.2/include -I.../udev-012/libsysfs
>   -c -o udev.o udev.c
> cc1: warning: changing search order for system directory
>      "/usr/lib/gcc-lib/i486-suse-linux/3.2/include"
> cc1: warning: as it has already been specified as a non-system directory]

Odd, it works here just fine on a number of different Red Hat boxes :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 12:39                                                   ` Nigel Cunningham
@ 2004-01-07 13:39                                                     ` Robin Rosenberg
  2004-01-07 17:16                                                       ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Robin Rosenberg @ 2004-01-07 13:39 UTC (permalink / raw)
  To: ncunningham; +Cc: Linux Kernel Mailing List

måndagen den 5 januari 2004 13.39 skrev Nigel Cunningham:
> Hi.
>
> The suspend to disk implementations all assume that devices are not
> [dis]appearing under us while we're suspended. If you do go adding and
> removing devices while the power is off, you can expect the same
> problems you'd get if you removed them without suspending the machine.
> It would be roughly equivalent to hot[un]plugging devices.

Yes. It's very unclear unless you do mind reading, but I had in mind mounted filesystems
such as /home on a USB stick or firewire Reasonable? yes! But such devices have to
be rediscovered and allocated in such a way that the user can resume using the device
as soon as it has been found. And it should not fail miserably if the user forgets to connect
the device before resuming the machine. As you cannot unmount /home (usually) the
kernel must remember the device somehow or make mounting file systems more loosely
than today.

-- robin



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-07 13:26                   ` viro
@ 2004-01-07 13:27                     ` Olaf Hering
  0 siblings, 0 replies; 158+ messages in thread
From: Olaf Hering @ 2004-01-07 13:27 UTC (permalink / raw)
  To: viro; +Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

 On Wed, Jan 07, viro@parcelfarce.linux.theplanet.co.uk wrote:

> On Wed, Jan 07, 2004 at 02:00:36PM +0100, Olaf Hering wrote:
>  
> > Ok, it was mkfs.minix and an older distro.
> 
> mkfs should simply pass O_EXCL to open().  Which is what you really want
> and yes, it should work on 2.6 (not sure if it got backported on 2.4).

Thanks! I will play with it and see what current tools use.

-- 
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-07 13:00                 ` Olaf Hering
@ 2004-01-07 13:26                   ` viro
  2004-01-07 13:27                     ` Olaf Hering
  0 siblings, 1 reply; 158+ messages in thread
From: viro @ 2004-01-07 13:26 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Wed, Jan 07, 2004 at 02:00:36PM +0100, Olaf Hering wrote:
 
> Ok, it was mkfs.minix and an older distro.

mkfs should simply pass O_EXCL to open().  Which is what you really want
and yes, it should work on 2.6 (not sure if it got backported on 2.4).

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-07 11:18               ` viro
@ 2004-01-07 13:00                 ` Olaf Hering
  2004-01-07 13:26                   ` viro
  0 siblings, 1 reply; 158+ messages in thread
From: Olaf Hering @ 2004-01-07 13:00 UTC (permalink / raw)
  To: viro; +Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

 On Wed, Jan 07, viro@parcelfarce.linux.theplanet.co.uk wrote:

> On Wed, Jan 07, 2004 at 11:15:59AM +0100, Olaf Hering wrote:
> > Now, thats just fine and it was always been that way.
> > What if I chroot into /foo, proc is mounted on /foo/proc,
> > and run fsck /dev/sda3 in that chroot? 
> > That silly app looks for /etc/mtab (oh my...) and start the work.
> > Fine. Now, /dev/root is in reality /dev/sda3. Bad for me.
> 
> Huh?

For short: noone knows that /dev/sda3 is busy/used.

> Note that you're not only adding ad-hackery (which filesystems get that
> major:minor printed and which do not?), you *STILL* hadn't solved your
> problem.  Why?  Because you still won't catch e.g. ext3 on /dev/sda5 with
> external journal on /dev/sda3.  And if you hack parsing ext3 lines in
> /proc/mounts, there's always reiserfs, jfs, etc., etc.  _And_ there's
> RAID with the same problems wrt. access to components.  Real funny
> when you have raid0 on md0, have md0 mounted and try to fsck one of
> components.

That makes sense. Is there a sane way to inform userland apps that some
stuff is used (mounted, part of a volume group or raid)? Sure, the raid
or lvm specific tools will tell you...

> Scanning /etc/mtab or /proc/mounts in such situations is wrong.  If fsck
> is doing that, it's broken.  The right way to fix it depends on what you
> really want and whatever the hell it is, putting new and new fs-specific
> code that would parse /proc/mounts lines into fsck(8) is not an answer.

Ok, it was mkfs.minix and an older distro. But still, is '/dev/root' or
'/dev/fred' really correct?

-- 
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-07 10:15             ` Olaf Hering
@ 2004-01-07 11:18               ` viro
  2004-01-07 13:00                 ` Olaf Hering
  0 siblings, 1 reply; 158+ messages in thread
From: viro @ 2004-01-07 11:18 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Wed, Jan 07, 2004 at 11:15:59AM +0100, Olaf Hering wrote:
> Now, thats just fine and it was always been that way.
> What if I chroot into /foo, proc is mounted on /foo/proc,
> and run fsck /dev/sda3 in that chroot? 
> That silly app looks for /etc/mtab (oh my...) and start the work.
> Fine. Now, /dev/root is in reality /dev/sda3. Bad for me.

Huh?
 
> the whole thing would work as expected of /proc/self/mounts would have
> a sane format:
> olh@melon:~> cat /proc/mounts
> 0:0 / rootfs rw 0 0
> 8:3 / ext3 rw 0 0
> proc /proc proc rw 0 0
> devpts /dev/pts devpts rw 0 0
> 58:0 /abuild ext3 rw 0 0
> 58:1 /data1 ext3 rw 0 0
> 58:2 /data2 ext3 rw 0 0
> shmfs /dev/shm shm rw 0 0
> automount(pid937) /suse autofs rw 0 0
> wotan:/real-home/jplack /suse/jplack nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,intr,tcp,nolock,addr=wotan 0 0
> wotan:/real-home/olh /suse/olh nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,intr,tcp,nolock,addr=wotan 0 0

It's *NOT* a sane format.

> Now fsck could look for /dev/sda3, realize that it is a block
> device node and look for that in the kernel mount table.
> If it is mounted, abort with a nice and meaningful error message.
> 
> So my question is: why was this strange format invented in the first place?
> And: will 2.7 get a sane /proc/self/mounts format for block devices?

Yes.  It already has one.

Note that you're not only adding ad-hackery (which filesystems get that
major:minor printed and which do not?), you *STILL* hadn't solved your
problem.  Why?  Because you still won't catch e.g. ext3 on /dev/sda5 with
external journal on /dev/sda3.  And if you hack parsing ext3 lines in
/proc/mounts, there's always reiserfs, jfs, etc., etc.  _And_ there's
RAID with the same problems wrt. access to components.  Real funny
when you have raid0 on md0, have md0 mounted and try to fsck one of
components.

Scanning /etc/mtab or /proc/mounts in such situations is wrong.  If fsck
is doing that, it's broken.  The right way to fix it depends on what you
really want and whatever the hell it is, putting new and new fs-specific
code that would parse /proc/mounts lines into fsck(8) is not an answer.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 12:34           ` Rob Landley
                               ` (2 preceding siblings ...)
  2004-01-02  0:17             ` Maciej Zenczykowski
@ 2004-01-07 10:23             ` Olaf Hering
  3 siblings, 0 replies; 158+ messages in thread
From: Olaf Hering @ 2004-01-07 10:23 UTC (permalink / raw)
  To: Rob Landley
  Cc: Rob Love, Andries Brouwer, Pascal Schmidt, linux-kernel, Greg KH

 On Thu, Jan 01, Rob Landley wrote:

> Fundamental problem: "Unique" depends on the other devices in the system.  You 
> can't guarantee unique by looking at one device, more or less by definition.

This is certainly not true. (well, maybe for a few device types).

Almost everything can be reached via a well defined bus (or more than
one bus). Each of them does obviously require an identifier. Thats the
hardware part.
Software tends to put a unique identifier into the 'logical' stuff, like
filesystem UUIDs.
So you can construct a unique device node for every device in the
system. And this will work even across distributions!
Stuff like sda3, mouse1 or dsp0 will obviously break. It just happend to
work because everyone on this list knows what to do and where to look.

Sure, there are exceptions, like 2 identical mice, or 2 identical USB
audio devices. But this cant be fixed.

-- 
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:55           ` viro
  2003-12-31 23:05             ` Rob Love
  2003-12-31 23:48             ` Andreas Dilger
@ 2004-01-07 10:15             ` Olaf Hering
  2004-01-07 11:18               ` viro
  2 siblings, 1 reply; 158+ messages in thread
From: Olaf Hering @ 2004-01-07 10:15 UTC (permalink / raw)
  To: viro; +Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

 On Wed, Dec 31, viro@parcelfarce.linux.theplanet.co.uk wrote:

> On Wed, Dec 31, 2003 at 05:20:18PM -0500, Rob Love wrote:
> > On Wed, 2003-12-31 at 17:01, Nathan Conrad wrote:

> > Uh, Unix systems (Linux included) do not use the filename of the device
> > node at all.  Those are just names for you, the user.
> > 
> > The kernel uses the device number to understand what device user-space
> > is trying to access.  The kernel associates the device with a device
> > number.  Normally that number is static, and known a priori, so we just
> > create a huge /dev directory with all possible devices and their
> > assigned numbers (you can see these numbers with ls -la).
> > 
> > But if the kernel _tells_ user-space what the device number is, for each
> > device as it is created, we do not need a static /dev directory.  We can
> > assemble the directory on the fly and device numbers really no longer
> > matter.  This is what udev does.
> 
> I think you've missed a point here.  There are several places where kernel
> deals with device identification.
> 	a) when normal pathname lookup results in a device node on filesystem.
> That's the regular way.
> 	b) when we create a new device node; device number is passed to
> ->mknod() and new device node is created.  Also a normal codepath.
> 	c) when late-boot code mounts the final root.  It used to be black
> magic, but these days it's done by regular syscalls.  Namely, we parse the
> "device name" (most of the work is done by lookups in sysfs), do mknod(2)
> and mount(2).  It's still done from the kernel mode, but it could be moved
> to userland.  Should be, actually.
> 	d) when kernel deals with resume/suspend stuff.  Currently - black
> magic.  Should be moved to early userland (same parser as for final root
> name + mknod on rootfs + open() to get the device in question).
> 	e) in several pathological syscalls we pass device number to
> identify a device.  ustat(2) and its ilk - bad API that can't die.
> 	f) /dev/raw passes device number to bind raw device to block device.
> Bad API; we probably ought to replace it with saner one at some point.
> 	g) RAID setup - mix of both pathologies; should be done in userland
> and interfaces are in bad need of cleanup.
> 	h) nfsd uses device number as a substitute for export ID if said
> ID is not given explicitly.  That, BTW, is a big problem for crackpipe
> dreams about random device numbers - export ID _must_ be stable across
> reboots.
> 	i) mtdblk parses "device name" on boot; should be take to early
> userland, same as RAID et.al.

This is about the /proc/self/mounts format:

Why does it contain stuff like "/dev/root" or "/dev/sda3" or
"/dev/myblockdevice"? Does anyone __really__ care about it? I doubt
that. What I have here (with 2.4) is:

olh@melon:~> cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw 0 0
devpts /dev/pts devpts rw 0 0
/dev/vg_melon/abuild /abuild ext3 rw 0 0
/dev/vg_melon/data1 /data1 ext3 rw 0 0
/dev/vg_melon/data2 /data2 ext3 rw 0 0
shmfs /dev/shm shm rw 0 0
automount(pid937) /suse autofs rw 0 0
wotan:/real-home/jplack /suse/jplack nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,intr,tcp,nolock,addr=wotan 0 0
wotan:/real-home/olh /suse/olh nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,intr,tcp,nolock,addr=wotan 0 0

Now, thats just fine and it was always been that way.
What if I chroot into /foo, proc is mounted on /foo/proc,
and run fsck /dev/sda3 in that chroot? 
That silly app looks for /etc/mtab (oh my...) and start the work.
Fine. Now, /dev/root is in reality /dev/sda3. Bad for me.

the whole thing would work as expected of /proc/self/mounts would have
a sane format:
olh@melon:~> cat /proc/mounts
0:0 / rootfs rw 0 0
8:3 / ext3 rw 0 0
proc /proc proc rw 0 0
devpts /dev/pts devpts rw 0 0
58:0 /abuild ext3 rw 0 0
58:1 /data1 ext3 rw 0 0
58:2 /data2 ext3 rw 0 0
shmfs /dev/shm shm rw 0 0
automount(pid937) /suse autofs rw 0 0
wotan:/real-home/jplack /suse/jplack nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,intr,tcp,nolock,addr=wotan 0 0
wotan:/real-home/olh /suse/olh nfs rw,nosuid,v3,rsize=8192,wsize=8192,hard,intr,tcp,nolock,addr=wotan 0 0

Now fsck could look for /dev/sda3, realize that it is a block
device node and look for that in the kernel mount table.
If it is mounted, abort with a nice and meaningful error message.

So my question is: why was this strange format invented in the first place?
And: will 2.7 get a sane /proc/self/mounts format for block devices?

-- 
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:02                                                   ` Linus Torvalds
  2004-01-05  4:38                                                     ` viro
@ 2004-01-07  9:57                                                     ` Pavel Machek
  1 sibling, 0 replies; 158+ messages in thread
From: Pavel Machek @ 2004-01-07  9:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: viro, Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

Hi!

> If nothing else, things like SATA will end up meaning that the device you 
> were used to seeign as /dev/hdc will suddenly show up as /dev/scd0 
> instead. Just because you changed the cabling while you upgraded to a 

I do not see easy solution for cdroms... UUID is not going to work there...
				Pavel


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  1:06                                                             ` Andries Brouwer
@ 2004-01-06 15:00                                                               ` Mark Mielke
  0 siblings, 0 replies; 158+ messages in thread
From: Mark Mielke @ 2004-01-06 15:00 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Linus Torvalds, linux-kernel, Greg KH

On Tue, Jan 06, 2004 at 02:06:48AM +0100, Andries Brouwer wrote:
> On Mon, Jan 05, 2004 at 03:32:03PM -0800, Linus Torvalds wrote:
> > > Something reproducible is better.
> > And I've told you why reproducibility is a BAD THING
> > Basically, if you cannot 100% guarantee reproducibility,
> > then the _appearance_ of reproducibility is literally a mistake.
> OK. We now understand perfectly each others point of view.
> It was a pleasure to provoke this discussion - can hardly
> wait for 2.7 :-)

Hehe.... s/provoke/perpetuate/g

Cheers,
mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 19:52                                                     ` Andries Brouwer
  2004-01-05 20:38                                                       ` Linus Torvalds
@ 2004-01-06  7:14                                                       ` Vojtech Pavlik
  1 sibling, 0 replies; 158+ messages in thread
From: Vojtech Pavlik @ 2004-01-06  7:14 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Daniel Jacobowitz, Rob Love, rob, Pascal Schmidt,
	linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 08:52:28PM +0100, Andries Brouwer wrote:

> > udev can then use those serial numbers to have a stable pathname
> 
> True. Provided that it knows how to get them.
> The kernel driver knew all about the device.
> Must udev also know all about all possible devices?

No. But it must have rules about what to do with all possible device
types (at least very generic default rules), based on the data the
drivers can provide to identify the device.

> Do I/O to these devices?

If the using an UUID stored on the device (like the filesystem UUID), yes.

> Or must sysfs export all data that could possibly be used?

Not necessarily. But udev must get the all the data that could possibly
be used for assigning a name to the device. It can get them either as
hotplug command line arguments and environment variables or via sysfs,
or by any other means.

-- 
Vojtech Pavlik
SuSE Labs, SuSE CR

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  4:28                                                                 ` viro
@ 2004-01-06  5:07                                                                   ` Linus Torvalds
  0 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-06  5:07 UTC (permalink / raw)
  To: viro
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH



On Tue, 6 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:
> > 
> > Oh, don't look too closely at some pseudo-code, it's not like the code
> > would actually do that for a minor number. But for things like major
> > number allocation for disk devices, it might not be too far off. And we 
> > migth even want to start off the minors at some "random" offset (obviously 
> > while keeping the alignment right for the partition handling)
> 
> True, but...  Let me put it that way - entire area is a minefield and
> I would really like to avoid nasty surprises from "obvious" patches,
> what with having just spent 4 months dealing with the fallout from one
> such beast.

Hey, it's entirely possible that we won't be able to do it at _all_ during 
2.7.x, since it would require that all the distributions have started 
using udev or equivalent. Which is by no means certain at all. It's 
possible that just lack of ubiqutous infrastructure will mean that it 
would be too painful to even try this in a few months..

Do don't worry too much.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  1:17                                                               ` Linus Torvalds
@ 2004-01-06  4:28                                                                 ` viro
  2004-01-06  5:07                                                                   ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: viro @ 2004-01-06  4:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 05:17:20PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 6 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:
> > 
> > Cute.  There's a little issue of, say it, meaningful relationship between
> > sda and sda4, completely lost that way.  And _that_ has nothing to do with
> > device enumeration.
> 
> Oh, don't look too closely at some pseudo-code, it's not like the code
> would actually do that for a minor number. But for things like major
> number allocation for disk devices, it might not be too far off. And we 
> migth even want to start off the minors at some "random" offset (obviously 
> while keeping the alignment right for the partition handling)

True, but...  Let me put it that way - entire area is a minefield and
I would really like to avoid nasty surprises from "obvious" patches,
what with having just spent 4 months dealing with the fallout from one
such beast.

Let's clean the things up first; then it will be easier to see what can
and should be done.  Sure thing, reducing amount of places that deal with
device numbers is a good thing.  Let's see how far we can get it, what
obstacles still remain (and during 2.5 a _lot_ of them had been killed)
and what is needed to remove the rest.

Once they are gone (and that will be one-by-one, keeping the list of
things to grep for and checking the results of greps as we go) - then
we'll have cleaner playing field for any experiments in that area.
_And_ there will be less temptation to play the bundling games for
everyone involved (cf. devfs disaster, aka. "my glorious idea allows
to do $NEEDED_THING that way; merge the entire thing and nevermind
the fact that doing $NEEDED_THING essentially the same way is possible
without the rest of patch and can be split out of it").

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  0:00                                                           ` Greg KH
@ 2004-01-06  1:41                                                             ` Andries Brouwer
  2004-01-07 17:14                                                               ` Greg KH
  0 siblings, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-06  1:41 UTC (permalink / raw)
  To: Greg KH
  Cc: Andries Brouwer, Linus Torvalds, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, linux-kernel

On Mon, Jan 05, 2004 at 04:00:15PM -0800, Greg KH wrote:

> > > Have you even _tried_ udev?
> > 
> > Yes, and it works reasonably well. I have version 012 here.
> > Some flaws will be fixed in 013 or so.
> 
> What flaws would that be?  The short time delay for partitions?  Or
> something else?

Yes, partitions are not handled very well.
So far I have never seen udev discover partitions on its own.
I provoke it using "blockdev --rereadpt".
The result is that partitions appear in /proc/partitions and in /udev.
After removing the media another "blockdev --rereadpt" returns
"No such device or address" and the entry in /proc/partitions
disappears, but that in /udev stays.

> > Some difficulties are of a more fundamental type, not so easy to fix.
> 
> Such as?

Udev cannot do anything when there are no events.
And media insertion or removal does not always give events.

Andries

[By the way, a compilation warning for every C file:
% make
gcc  -pipe -Wall -Wmore.. -Os -fomit-frame-pointer -D_GNU_SOURCE \
  -I/usr/lib/gcc-lib/i486-suse-linux/3.2/include -I.../udev-012/libsysfs
  -c -o udev.o udev.c
cc1: warning: changing search order for system directory
     "/usr/lib/gcc-lib/i486-suse-linux/3.2/include"
cc1: warning: as it has already been specified as a non-system directory]




^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
@ 2004-01-06  1:20 Paul Zimmerman
  0 siblings, 0 replies; 158+ messages in thread
From: Paul Zimmerman @ 2004-01-06  1:20 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

On Tuesday 06 January 2004 00:31 Rob Landley wrote:
> What about kernel upgrades?  Future backwards compatability when
developers
> change the device enumeration methods?  (The sata driver got completely
> rewritten from scratch, and now it detects devices in a wildly different
> order, but we need this shim layer for backwards compatability with a
> guarantee we never should have made because we encouraged old scripts to
> remain broken.)  This plants hidden land mines restricting future
> development.  You're basically proposing a whole "device number
stabilization
> infrastructure" for future kernels if it's to have ANY meaning at all...

Did people really write scripts that used major:minor numbers to refer to
devices? I would have thought they would use the /dev/xxx name, and those
will
not change when "random" device numbers are implemented, will they?

- Paul


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  0:59                                                             ` viro
@ 2004-01-06  1:17                                                               ` Linus Torvalds
  2004-01-06  4:28                                                                 ` viro
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-06  1:17 UTC (permalink / raw)
  To: viro
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH



On Tue, 6 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:
> 
> Cute.  There's a little issue of, say it, meaningful relationship between
> sda and sda4, completely lost that way.  And _that_ has nothing to do with
> device enumeration.

Oh, don't look too closely at some pseudo-code, it's not like the code
would actually do that for a minor number. But for things like major
number allocation for disk devices, it might not be too far off. And we 
migth even want to start off the minors at some "random" offset (obviously 
while keeping the alignment right for the partition handling)

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:32                                                           ` Linus Torvalds
  2004-01-06  0:59                                                             ` viro
@ 2004-01-06  1:06                                                             ` Andries Brouwer
  2004-01-06 15:00                                                               ` Mark Mielke
  1 sibling, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-06  1:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 03:32:03PM -0800, Linus Torvalds wrote:

> > Something reproducible is better.
> 
> And I've told you why reproducibility is a BAD THING
> 
> Basically, if you cannot 100% guarantee reproducibility,
> then the _appearance_ of reproducibility is literally a mistake.

OK. We now understand perfectly each others point of view.
It was a pleasure to provoke this discussion - can hardly
wait for 2.7 :-)

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:32                                                           ` Linus Torvalds
@ 2004-01-06  0:59                                                             ` viro
  2004-01-06  1:17                                                               ` Linus Torvalds
  2004-01-06  1:06                                                             ` Andries Brouwer
  1 sibling, 1 reply; 158+ messages in thread
From: viro @ 2004-01-06  0:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 03:32:03PM -0800, Linus Torvalds wrote:
> 	dev_t lbt_devno()
> 	{
> 		return nr++;
> 	}
> 
> since the numbers do have to be unique "per boot". They just shouldn't be 
> considered "stable" _nor_ "meaningful".

Cute.  There's a little issue of, say it, meaningful relationship between
sda and sda4, completely lost that way.  And _that_ has nothing to do with
device enumeration.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-06  0:43                                                               ` Greg KH
@ 2004-01-06  0:53                                                                 ` Shawn
  0 siblings, 0 replies; 158+ messages in thread
From: Shawn @ 2004-01-06  0:53 UTC (permalink / raw)
  To: Greg KH
  Cc: Mark Mielke, Linus Torvalds, Andries Brouwer, Daniel Jacobowitz,
	Rob Love, rob, Pascal Schmidt, linux-kernel

I'm embarrassed to say I did not read that.

I'm starting to wonder what some folks are complaining about. WRT
practicality and useability, udev about covers it once alsa and vmware
;) get sysfs-ified.

My own foray into udev was a little lacking owing to these little
issues.

On Mon, 2004-01-05 at 18:43, Greg KH wrote:
> In summary, udev doesn't care squat about the major/minor that the
> kernel has used for a device.  It merely uses those numbers and creates
> a /dev entry with them, assigned to a name that it comes up with.
> 
> Does that help out?  The udev OLS paper might also help explain some of
> this.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:05                                                             ` Shawn
  2004-01-05 23:23                                                               ` Shawn
@ 2004-01-06  0:43                                                               ` Greg KH
  2004-01-06  0:53                                                                 ` Shawn
  1 sibling, 1 reply; 158+ messages in thread
From: Greg KH @ 2004-01-06  0:43 UTC (permalink / raw)
  To: Shawn
  Cc: Mark Mielke, Linus Torvalds, Andries Brouwer, Daniel Jacobowitz,
	Rob Love, rob, Pascal Schmidt, linux-kernel

On Mon, Jan 05, 2004 at 05:05:16PM -0600, Shawn wrote:
> On Mon, 2004-01-05 at 16:25, Mark Mielke wrote:
> > On Mon, Jan 05, 2004 at 04:17:57PM -0600, Shawn wrote:
> > > ...
> > > As an admin, would I at least theoretically have /some/ consistency if
> > > merely for my own sanity when dealing with block devices by hand (I do
> > > need to setup LVM stuff from time to time)??
> > 
> > If all you care about is that /dev names remain consistent, you need
> > not fear. udev and devfs are two different ways of providing this
> > consistency. They abstract the device numbers from the /dev names,
> > meaning that you don't have to care if the numbers change. The names
> > don't.
> I'm obviously confused if this is true, as then I do not know how the
> great and powerful udev derives the names if not from the numbers, or
> some other sysfs info.

udev can derive the names for the /dev entries from just about anything
you can think of:
	- sysfs files
	- bus topology
	- bus ids
	- any script/program that you might want to run
	- the kernel name

It will default back to the "kernel name" that shows up in sysfs, and is
what we currently use, if it can not match up any other name to it.  The
method of creating these rules that udev uses, are contained in the
udev.rules file.  See the udev man page for the syntax and some example
rules.  Also see the example udev.rules and udev.rules.devfs files for
lots more example rules that you might want to come up with.

The strength in this is that udev can poke around and try to find a
unique "tag" that a specific device exports (be it UUID, or a CDDB
entry) and use that to match up a name to.  That enables your cdrom to
always be called /dev/cdrom no matter where in the scsi chain it happens
to be.

In summary, udev doesn't care squat about the major/minor that the
kernel has used for a device.  It merely uses those numbers and creates
a /dev entry with them, assigned to a name that it comes up with.

Does that help out?  The udev OLS paper might also help explain some of
this.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:13                                                         ` Andries Brouwer
  2004-01-05 23:32                                                           ` Linus Torvalds
  2004-01-06  0:00                                                           ` Greg KH
@ 2004-01-06  0:31                                                           ` Rob Landley
  2 siblings, 0 replies; 158+ messages in thread
From: Rob Landley @ 2004-01-06  0:31 UTC (permalink / raw)
  To: Andries Brouwer, Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, Pascal Schmidt,
	linux-kernel, Greg KH

On Monday 05 January 2004 17:13, Andries Brouwer wrote:
> An earlier fragment of the discussion was concerned with the fact
> that random(); is a bad idea. Something reproducible is better.

To find people making bad assuptions that will only break after widespread 
deployment, random() is much better than "usually reproducible".

> Let us abbreviate the above function f. Some driver determines that
> a disk has serial number A809ADGC. Another driver determines that
> some device was produced by HP but otherwise has no opinion.
> A third driver has no stable information at all about the device.
> They assign device numbers f("A809ADGC"), f("HP"), f("").
>
> What is the result? Yes, device numbers are cookies, but a reasonable
> attempt has been made to make the device numbers stable.

Should the same argument be made about process ID's?  When your system boots 
up, your daemons generally start in the same order.  But any script that 
depends on this is broken.

Or filehandles.  They're cookies.  There's whole pages on why it's a bad idea 
to make assumptions about what filehandles point to:

http://en.tldp.org/HOWTO/Secure-Programs-HOWTO/avoid-race.html

> No guarantees anywhere - this is best effort. Better than no effort.

You're suggesting that it should be easier to write things that are 
fundamentally unclean, and bake in assumptions that WILL break, but not on 
the developer's machine, only for end-users who aren't expecting it.

What's the advantage?  Making it easier for people to do something stupid?  
(You can sort of trust this thing we can't make any guarantees about.  Since 
when is "sort of trust" a condition that's encouraged?  At the very least, 
those kinds of cases are backed up by a detection and recovery mechanism and 
the whole thing's called a heuristic.)  Why is there a need for this?

Either the kernel can make a guarantee, or it should very much not make a 
guarantee.  Where is an example of a middle ground?

> And this information helps udev. It may make a callout superfluous,
> or even give udev information that cannot be obtained from userspace.

I'm waiting for the udev maintainer to weight in on this and say "no, it 
doesn't".  If there is information that "cannot be obtained from userspace", 
then we should fix the sysfs exports.  Encoding something in a semi-stable 
cookie and actually trying to USE that information is stupid.

What about kernel upgrades?  Future backwards compatability when developers 
change the device enumeration methods?  (The sata driver got completely 
rewritten from scratch, and now it detects devices in a wildly different 
order, but we need this shim layer for backwards compatability with a 
guarantee we never should have made because we encouraged old scripts to 
remain broken.)  This plants hidden land mines restricting future 
development.  You're basically proposing a whole "device number stabilization 
infrastructure" for future kernels if it's to have ANY meaning at all...

Where's the advantage?  Name a single real-world case that's more difficult to 
fix than it would be to make the kernel pander to it in perpetuity.

> Andries

Rob


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 21:06                                                               ` Vojtech Pavlik
  2004-01-05 22:22                                                                 ` Theodore Ts'o
@ 2004-01-06  0:14                                                                 ` Rob Landley
  1 sibling, 0 replies; 158+ messages in thread
From: Rob Landley @ 2004-01-06  0:14 UTC (permalink / raw)
  To: Vojtech Pavlik, Theodore Ts'o, Greg KH, Linus Torvalds, viro,
	Daniel Jacobowitz, Andries Brouwer, Rob Love, Pascal Schmidt,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1951 bytes --]

On Monday 05 January 2004 15:06, Vojtech Pavlik wrote:
> On Mon, Jan 05, 2004 at 03:11:44PM -0500, Theodore Ts'o wrote:
> > On Mon, Jan 05, 2004 at 12:15:56PM +0100, Vojtech Pavlik wrote:
> > > Mutt with IMAP is rather bearable even on a GPRS connection (40kbps,
> > > 1sec latency). On a 100baseTX it's not distinguishable from local
> > > operation.
> >
> > Hmm... I've tried using mutt/IMAP over GPRS connection, and I find it
> > extremely unpleasant, myself.  My solution is to use isync to provide
> > a local cached copy of the IMAP server on my laptop, and then run mutt
> > against the local cached copy.
> >
> > I have a patch to isync which allows it to issue multiple IMAP
> > commands in parallel (instead of operating in lockstep fashion):
> >
> > http://bugs.debian.org/cgi-bin/bugreport.cgi//tmp/async-imap-patch?bug=22
> >6222&msg=3&att=1
> >
> > With this patch, isync works very well, even over high latency, slow
> > speed links.
>
> That looks very nice. Now, if there were a way how to make the isync
> IMAP connections go over a compressed ssh link (like I'm doing with
> Mutt/IMAP) that'd be very cool.

You can run any tcp/ip service over ssh.

Tell isync that the imap server it's synchronizing with lives on the loopback 
interface, and then run a variant this little python script I use to check my 
email (adjusting the last line for your connection info).  (Note that the far 
end needs netcat.  If you haven't got it, try the version in busybox.)

Yeah, the script's a quick and dirty hack, but really easy to modify.  I have 
a more complicated one using SO_ORIGINAL_DEST and a lookup table if you 
prefer to setup some firewall rules and tell your imap server it lives in the 
192.168.x.x or 10.x.x.x address range...  But I've never gotten around to 
configuring my laptop to use it just to tunnel pop. :)

I keep meaning to put the full solution up on http://dvpn.sf.net, but nobody's 
pestered me about it. :)

Rob

[-- Attachment #2: boing.py --]
[-- Type: text/x-python, Size: 605 bytes --]

#!/usr/bin/python

import socket,struct,sys,os,signal

vpnip="127.0.0.1"
vpnport=int(sys.argv[1]) # 110 25

signal.signal(signal.SIGCHLD, lambda a,b: os.waitpid(-1,os.WNOHANG))

sock=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
sock.bind((vpnip,vpnport))
sock.listen(10)

while 1:
  try: (conn,addr)=sock.accept()
  except socket.error: continue

  if os.fork():
    conn.close()
    continue

  os.dup2(conn.fileno(),0)
  os.dup2(conn.fileno(),1)
  conn.close()
  sock.close()

  os.execvp("ssh",("ssh","-i","/home/landley/.ssh/id_dsa","landley@66.92.53.140","./netcat","192.168.1.31",str(vpnport)))

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:13                                                         ` Andries Brouwer
  2004-01-05 23:32                                                           ` Linus Torvalds
@ 2004-01-06  0:00                                                           ` Greg KH
  2004-01-06  1:41                                                             ` Andries Brouwer
  2004-01-06  0:31                                                           ` Rob Landley
  2 siblings, 1 reply; 158+ messages in thread
From: Greg KH @ 2004-01-06  0:00 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Daniel Jacobowitz, Rob Love, rob, Pascal Schmidt,
	linux-kernel

On Tue, Jan 06, 2004 at 12:13:26AM +0100, Andries Brouwer wrote:
> On Mon, Jan 05, 2004 at 12:38:54PM -0800, Linus Torvalds wrote:
> 
> > Have you even _tried_ udev?
> 
> Yes, and it works reasonably well. I have version 012 here.
> Some flaws will be fixed in 013 or so.

What flaws would that be?  The short time delay for partitions?  Or
something else?

> Some difficulties are of a more fundamental type, not so easy to fix.

Such as?

> But udev is an entirely different discussion. Some other time.

Feel free to bring it up on the linux-hotplug-devel list whenever you
wish.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:13                                                         ` Andries Brouwer
@ 2004-01-05 23:32                                                           ` Linus Torvalds
  2004-01-06  0:59                                                             ` viro
  2004-01-06  1:06                                                             ` Andries Brouwer
  2004-01-06  0:00                                                           ` Greg KH
  2004-01-06  0:31                                                           ` Rob Landley
  2 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05 23:32 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Daniel Jacobowitz, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Tue, 6 Jan 2004, Andries Brouwer wrote:
> 
> Now compare our setups:
> 
> dev_t lbt_devno(void) { return random(); }

Actually, I'd have something like

	int nr;

	initialize()
	{
	#ifdef CONFIG_DEBUG_BAD_USERS
		nr = random();
	#endif
	}

	dev_t lbt_devno()
	{
		return nr++;
	}

since the numbers do have to be unique "per boot". They just shouldn't be 
considered "stable" _nor_ "meaningful".

> dev_t aeb_devno(char *s) { dev_t d = hash(s); while (inuse(d)) d++; return d; }
> 
> An earlier fragment of the discussion was concerned with the fact
> that random(); is a bad idea. Something reproducible is better.

And I've told you why reproducibility is a BAD THING, and why I disagree.

Basically, if you cannot 100% guarantee reproducibility (and nobody can,
not your hashes, not anything else), then the _appearance_ of 
reproducibility is literally a mistake. Because it ends up being a bug 
waiting to happen - and one that is very very hard to reproduce on a 
developer machine.

You seem to continually ignore this issue.

I'm  not going to bother arguign this for another week. I'm just going to 
state once and for all:

 - total device number reproducability is fundamentally impossible. It's 
   not just impossible in theory, it is impossible in practice too.
 - with that in mind, anything that depends on stable device numbers is a 
   BUG.
 - Thus all your arguments boil down to: "I want to encourage bugs".

My argument is that we should find and fix the bugs. And we should do so 
by making the lack of meaning of the device numbers as well-known as 
possible. And that shouldn't just be due to long and boring threads on the 
kernel mailing list, but by actually actively trying to trigger the bad 
cases.

Bugs are good at hiding, and then showing up at the most inopportune 
times when you can't debug them. It's much better to try to trigger them 
where a developer can see them, and you do that by doing strange things.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 23:05                                                             ` Shawn
@ 2004-01-05 23:23                                                               ` Shawn
  2004-01-06  0:43                                                               ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: Shawn @ 2004-01-05 23:23 UTC (permalink / raw)
  To: Mark Mielke
  Cc: Linus Torvalds, Andries Brouwer, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, linux-kernel, Greg KH

And looking back on some of these emails, it seems there was more than
just me being confused. Seems this is a point worth emphasizing.

On Mon, 2004-01-05 at 17:05, Shawn wrote:
> On Mon, 2004-01-05 at 16:25, Mark Mielke wrote:
> > On Mon, Jan 05, 2004 at 04:17:57PM -0600, Shawn wrote:
> > > ...
> > > As an admin, would I at least theoretically have /some/ consistency if
> > > merely for my own sanity when dealing with block devices by hand (I do
> > > need to setup LVM stuff from time to time)??
> > 
> > If all you care about is that /dev names remain consistent, you need
> > not fear. udev and devfs are two different ways of providing this
> > consistency. They abstract the device numbers from the /dev names,
> > meaning that you don't have to care if the numbers change. The names
> > don't.
> I'm obviously confused if this is true, as then I do not know how the
> great and powerful udev derives the names if not from the numbers, or
> some other sysfs info.
> 
> Anyway, assuming this is true, I have much less concern.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 20:38                                                       ` Linus Torvalds
  2004-01-05 22:17                                                         ` Shawn
@ 2004-01-05 23:13                                                         ` Andries Brouwer
  2004-01-05 23:32                                                           ` Linus Torvalds
                                                                             ` (2 more replies)
  1 sibling, 3 replies; 158+ messages in thread
From: Andries Brouwer @ 2004-01-05 23:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 12:38:54PM -0800, Linus Torvalds wrote:

> Have you even _tried_ udev?

Yes, and it works reasonably well. I have version 012 here.
Some flaws will be fixed in 013 or so. Some difficulties are of a
more fundamental type, not so easy to fix. But udev is an entirely
different discussion. Some other time.

> In particular, the kernel should never have policy encoded in it, and 
> naming of a device is about pretty much nothing _but_ policy.

Of course. But this is not about naming.

The kernel invents device numbers, and user space names.

Now compare our setups:

dev_t lbt_devno(void) { return random(); }

dev_t aeb_devno(char *s) { dev_t d = hash(s); while (inuse(d)) d++; return d; }

An earlier fragment of the discussion was concerned with the fact
that random(); is a bad idea. Something reproducible is better.

Let us abbreviate the above function f. Some driver determines that
a disk has serial number A809ADGC. Another driver determines that
some device was produced by HP but otherwise has no opinion.
A third driver has no stable information at all about the device.
They assign device numbers f("A809ADGC"), f("HP"), f("").

What is the result? Yes, device numbers are cookies, but a reasonable
attempt has been made to make the device numbers stable.
No guarantees anywhere - this is best effort. Better than no effort.

And this information helps udev. It may make a callout superfluous,
or even give udev information that cannot be obtained from userspace.

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 22:25                                                           ` Mark Mielke
@ 2004-01-05 23:05                                                             ` Shawn
  2004-01-05 23:23                                                               ` Shawn
  2004-01-06  0:43                                                               ` Greg KH
  0 siblings, 2 replies; 158+ messages in thread
From: Shawn @ 2004-01-05 23:05 UTC (permalink / raw)
  To: Mark Mielke
  Cc: Linus Torvalds, Andries Brouwer, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, linux-kernel, Greg KH

On Mon, 2004-01-05 at 16:25, Mark Mielke wrote:
> On Mon, Jan 05, 2004 at 04:17:57PM -0600, Shawn wrote:
> > ...
> > As an admin, would I at least theoretically have /some/ consistency if
> > merely for my own sanity when dealing with block devices by hand (I do
> > need to setup LVM stuff from time to time)??
> 
> If all you care about is that /dev names remain consistent, you need
> not fear. udev and devfs are two different ways of providing this
> consistency. They abstract the device numbers from the /dev names,
> meaning that you don't have to care if the numbers change. The names
> don't.
I'm obviously confused if this is true, as then I do not know how the
great and powerful udev derives the names if not from the numbers, or
some other sysfs info.

Anyway, assuming this is true, I have much less concern.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 22:17                                                         ` Shawn
@ 2004-01-05 22:25                                                           ` Mark Mielke
  2004-01-05 23:05                                                             ` Shawn
  0 siblings, 1 reply; 158+ messages in thread
From: Mark Mielke @ 2004-01-05 22:25 UTC (permalink / raw)
  To: Shawn
  Cc: Linus Torvalds, Andries Brouwer, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 04:17:57PM -0600, Shawn wrote:
> Having said that, I will say that they are /somewhat/ stable. You can,
> in general, say 'fdisk /dev/hdb' and be editing the same block device's
> partition table... That is, if nothing has changed in the BIOS or
> hardware or kernel or....
> ...
> As an admin, would I at least theoretically have /some/ consistency if
> merely for my own sanity when dealing with block devices by hand (I do
> need to setup LVM stuff from time to time)??

If all you care about is that /dev names remain consistent, you need
not fear. udev and devfs are two different ways of providing this
consistency. They abstract the device numbers from the /dev names,
meaning that you don't have to care if the numbers change. The names
don't.

Cheers,
mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 21:06                                                               ` Vojtech Pavlik
@ 2004-01-05 22:22                                                                 ` Theodore Ts'o
  2004-01-06  0:14                                                                 ` Rob Landley
  1 sibling, 0 replies; 158+ messages in thread
From: Theodore Ts'o @ 2004-01-05 22:22 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Greg KH, Linus Torvalds, viro, Daniel Jacobowitz,
	Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel

On Mon, Jan 05, 2004 at 10:06:25PM +0100, Vojtech Pavlik wrote:
> 
> That looks very nice. Now, if there were a way how to make the isync
> IMAP connections go over a compressed ssh link (like I'm doing with
> Mutt/IMAP) that'd be very cool.
> 

The following in your .isyncrc file will do the trick:

Mailbox thunk
Box Inbox
Host thunk.org
Tunnel "socat SOCKS4A:127.0.0.1:thunk.org:143 STDIO"

You can also do this via secure IMAP, but then ssh's compression won't
be able to do much.  Nevertheless, I do this when synchronizing
against an IMAP server where I don't have ssh access, and where I want
the connection between the thunk.org and po14.mit.edu to be secured.
So I use the following syntax in .isyncrc to achieve to do this:

Mailbox Inbox
Box Inbox
Host imaps:po14.mit.edu
Tunnel "socat SOCKS4A:127.0.0.1:po14.mit.edu:993 STDIO"
UseSSLv2 yes
UseSSLv3 yes
UseTLSv1 yes

						- Ted

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 16:36                                                   ` Andreas Schwab
@ 2004-01-05 22:18                                                     ` Mark Mielke
  0 siblings, 0 replies; 158+ messages in thread
From: Mark Mielke @ 2004-01-05 22:18 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Rob Landley, linux-kernel

On Mon, Jan 05, 2004 at 05:36:09PM +0100, Andreas Schwab wrote:
> Mark Mielke <mark@mark.mielke.cc> writes:
> > There are a few cases that we might be forced to maintain regular
> > numbers: mkfifo() creates a named pipe, and bind() creates a named
> > socket.
> Neither fifos nor sockets are devices.

Well, then, as long as things like this don't break... :-)

Other than backing up /dev, does anybody have *real* cases where a
program assumes major:minor is consistent across reboots? We should
start notifying the authors now... NFS seems to be one, given the
explanation offered for how fsid's are derived...

mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 20:38                                                       ` Linus Torvalds
@ 2004-01-05 22:17                                                         ` Shawn
  2004-01-05 22:25                                                           ` Mark Mielke
  2004-01-05 23:13                                                         ` Andries Brouwer
  1 sibling, 1 reply; 158+ messages in thread
From: Shawn @ 2004-01-05 22:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

Linus is correct. I say this somewhat because I find his arguments to
make perfect sense in a philosophical way, but more because it is his
kernel.

Anyway, I'll weigh in with my 0.02 pesos:
Right now, as things are, hardware devices' numbers are not very stable
as it is. Detection order can and will change, and you should not rely
on them being the same. PERIOD.

Having said that, I will say that they are /somewhat/ stable. You can,
in general, say 'fdisk /dev/hdb' and be editing the same block device's
partition table... That is, if nothing has changed in the BIOS or
hardware or kernel or....

Now, correct me if I'm wrong, but I don't believe we are expecting
device numbers to change nearly every time you reboot given there are no
hardware changes with a dynamic numbering scheme, right?

I would, as an admin, have need for distinguishing between my 4
identical SATA hard drives with identical partition tables without
having to resort to examining UUIDs, serial number or FS labels by hand,
especially if I dd(1) stuff between them. I understand this is not as
simple as with ide(0-N)(pri|sec)(master|slave) (ignoring that ide(0-N)
could be detected in arbitrary order) as SATA is different.

As an admin, would I at least theoretically have /some/ consistency if
merely for my own sanity when dealing with block devices by hand (I do
need to setup LVM stuff from time to time)??

On Mon, 2004-01-05 at 14:38, Linus Torvalds wrote:
> On Mon, 5 Jan 2004, Andries Brouwer wrote:
> > 
> > > udev can then use those serial numbers to have a stable pathname
> > 
> > True. Provided that it knows how to get them.
> 
> And that is the _only_ thing that the "device number" actually is. It is a 
> cookie that the kernel has allocated for the device that the kernel knows 
> about. Nothing more.
> 
> Go back and read my emails. Device numbers cannot have any meaning, they 
> literally are _only_ useful as cookies. 
> 
> > The kernel driver knew all about the device.
> 
> No. The kernel driver knows _of_ the device, it does not know "all about"  
> the device. And that's a big difference.
> 
> Quite often the kernel only knows that it found "a device". It has very
> limited knowledge about what the device is, and what it can do. That's why 
> we have tools like "smartd" etc, that know a lot more about devices than 
> the kernel often does.
> 
> In particular, the kernel driver knows _nothing_ about potential serial
> numbers or how to read them for different classes of devices.
> 
> > Must udev also know all about all possible devices? Do I/O to these devices?
> > Or must sysfs export all data that could possibly be used?
> 
> There is nothing to export. You seem to imply that the kernel somehow
> knows more than user space, but the reverse is generally true. 
> 
> In particular, the kernel should never have policy encoded in it, and 
> naming of a device is about pretty much nothing _but_ policy. Stuff that 
> the kernel literally has _zero_ knowledged about.
> 
> Yes, the kernel knows the physical location, but that doesn't actually 
> help the kernel itself. It's exported through sysfs, yes, and udev, 
> together with the hotplug stuff, can be used to make up the "stable name". 
> 
> Have you even _tried_ udev? Udev can do exactly things like find UUID's 
> off disks - something the kernel doesn't have a _clue_ about. When the 
> kernel sees a disk, it's just a disk. The kernel doesn't know if there is 
> an UUID embedded on the disk, and the kernel SHOULD NOT HAVE A POLICY to 
> try to find one.
> 
> But for user space, the thing is trivially done: the kernel will notify
> user space about the fact that it found a device (without necessarily
> knowing what the heck the device is - quite common with USB or specialty
> SCSI devices). The kernel pretty much doesn't know _anything_ about things
> like laser range finders, cameras etc. It ends up classifying the device 
> on a very rough level, nothing more.
> 
> And without knowing practically _anythign_ about the device, it still has
> to allocate a device number. Exactly so that somebody else can come around
> and poke at it, and maybe know that "ahh, this device is a USB-attached
> camera" or similar.
> 
> Do you not see that fundamental issue? The kernel has to allocate a number 
> before a UUID or anythign else is necessarily available. 
> 
> The UUID/serial number/type policy comes _later_.
> 
> 		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 20:11                                                             ` Theodore Ts'o
@ 2004-01-05 21:06                                                               ` Vojtech Pavlik
  2004-01-05 22:22                                                                 ` Theodore Ts'o
  2004-01-06  0:14                                                                 ` Rob Landley
  0 siblings, 2 replies; 158+ messages in thread
From: Vojtech Pavlik @ 2004-01-05 21:06 UTC (permalink / raw)
  To: Theodore Ts'o, Vojtech Pavlik, Greg KH, Linus Torvalds, viro,
	Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel

On Mon, Jan 05, 2004 at 03:11:44PM -0500, Theodore Ts'o wrote:

> On Mon, Jan 05, 2004 at 12:15:56PM +0100, Vojtech Pavlik wrote:
> > 
> > Mutt with IMAP is rather bearable even on a GPRS connection (40kbps,
> > 1sec latency). On a 100baseTX it's not distinguishable from local
> > operation.
> 
> Hmm... I've tried using mutt/IMAP over GPRS connection, and I find it
> extremely unpleasant, myself.  My solution is to use isync to provide
> a local cached copy of the IMAP server on my laptop, and then run mutt
> against the local cached copy.  
> 
> I have a patch to isync which allows it to issue multiple IMAP
> commands in parallel (instead of operating in lockstep fashion):
> 
> http://bugs.debian.org/cgi-bin/bugreport.cgi//tmp/async-imap-patch?bug=226222&msg=3&att=1
> 
> With this patch, isync works very well, even over high latency, slow
> speed links.

That looks very nice. Now, if there were a way how to make the isync
IMAP connections go over a compressed ssh link (like I'm doing with
Mutt/IMAP) that'd be very cool.

-- 
Vojtech Pavlik
SuSE Labs, SuSE CR

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 19:52                                                     ` Andries Brouwer
@ 2004-01-05 20:38                                                       ` Linus Torvalds
  2004-01-05 22:17                                                         ` Shawn
  2004-01-05 23:13                                                         ` Andries Brouwer
  2004-01-06  7:14                                                       ` Vojtech Pavlik
  1 sibling, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05 20:38 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Daniel Jacobowitz, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Mon, 5 Jan 2004, Andries Brouwer wrote:
> 
> > udev can then use those serial numbers to have a stable pathname
> 
> True. Provided that it knows how to get them.

And that is the _only_ thing that the "device number" actually is. It is a 
cookie that the kernel has allocated for the device that the kernel knows 
about. Nothing more.

Go back and read my emails. Device numbers cannot have any meaning, they 
literally are _only_ useful as cookies. 

> The kernel driver knew all about the device.

No. The kernel driver knows _of_ the device, it does not know "all about"  
the device. And that's a big difference.

Quite often the kernel only knows that it found "a device". It has very
limited knowledge about what the device is, and what it can do. That's why 
we have tools like "smartd" etc, that know a lot more about devices than 
the kernel often does.

In particular, the kernel driver knows _nothing_ about potential serial
numbers or how to read them for different classes of devices.

> Must udev also know all about all possible devices? Do I/O to these devices?
> Or must sysfs export all data that could possibly be used?

There is nothing to export. You seem to imply that the kernel somehow
knows more than user space, but the reverse is generally true. 

In particular, the kernel should never have policy encoded in it, and 
naming of a device is about pretty much nothing _but_ policy. Stuff that 
the kernel literally has _zero_ knowledged about.

Yes, the kernel knows the physical location, but that doesn't actually 
help the kernel itself. It's exported through sysfs, yes, and udev, 
together with the hotplug stuff, can be used to make up the "stable name". 

Have you even _tried_ udev? Udev can do exactly things like find UUID's 
off disks - something the kernel doesn't have a _clue_ about. When the 
kernel sees a disk, it's just a disk. The kernel doesn't know if there is 
an UUID embedded on the disk, and the kernel SHOULD NOT HAVE A POLICY to 
try to find one.

But for user space, the thing is trivially done: the kernel will notify
user space about the fact that it found a device (without necessarily
knowing what the heck the device is - quite common with USB or specialty
SCSI devices). The kernel pretty much doesn't know _anything_ about things
like laser range finders, cameras etc. It ends up classifying the device 
on a very rough level, nothing more.

And without knowing practically _anythign_ about the device, it still has
to allocate a device number. Exactly so that somebody else can come around
and poke at it, and maybe know that "ahh, this device is a USB-attached
camera" or similar.

Do you not see that fundamental issue? The kernel has to allocate a number 
before a UUID or anythign else is necessarily available. 

The UUID/serial number/type policy comes _later_.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 11:15                                                           ` Vojtech Pavlik
@ 2004-01-05 20:11                                                             ` Theodore Ts'o
  2004-01-05 21:06                                                               ` Vojtech Pavlik
  0 siblings, 1 reply; 158+ messages in thread
From: Theodore Ts'o @ 2004-01-05 20:11 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Greg KH, Linus Torvalds, viro, Daniel Jacobowitz,
	Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel

On Mon, Jan 05, 2004 at 12:15:56PM +0100, Vojtech Pavlik wrote:
> 
> Mutt with IMAP is rather bearable even on a GPRS connection (40kbps,
> 1sec latency). On a 100baseTX it's not distinguishable from local
> operation.

Hmm... I've tried using mutt/IMAP over GPRS connection, and I find it
extremely unpleasant, myself.  My solution is to use isync to provide
a local cached copy of the IMAP server on my laptop, and then run mutt
against the local cached copy.  

I have a patch to isync which allows it to issue multiple IMAP
commands in parallel (instead of operating in lockstep fashion):

http://bugs.debian.org/cgi-bin/bugreport.cgi//tmp/async-imap-patch?bug=226222&msg=3&att=1

With this patch, isync works very well, even over high latency, slow
speed links.

						- Ted

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 16:13                                                   ` Linus Torvalds
  2004-01-05 17:29                                                     ` Vojtech Pavlik
@ 2004-01-05 19:52                                                     ` Andries Brouwer
  2004-01-05 20:38                                                       ` Linus Torvalds
  2004-01-06  7:14                                                       ` Vojtech Pavlik
  1 sibling, 2 replies; 158+ messages in thread
From: Andries Brouwer @ 2004-01-05 19:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 08:13:26AM -0800, Linus Torvalds wrote:

> > You keep repeating that enumerating is impossible, and that therefore
> > stable device numbers are impossible, and that consequently, since we
> > cannot have stable device numbers expecting them to be stable is broken.
> 
> Right.

> When I talk about "enumerate", I do not mean "give numbers starting at 1".
> It boils down to not how many devices there can be, but to whether there 
> is any way to "walk the space of devices".

Yes, that is what one commonly calls to enumerate.  Let us say,
an effective way, given some integer, to find the associated device.

[You can leave the mathematics out - this enumerable is not the same as
denumerable or countable. The set of devices on earth is finite.]

> And there fundamentally isn't. And _that_ is the basic issue: if you
> _cannot_ number a space, you cannot have a stable device number.

If there is no effective way to find a disk given some number,
there may very well be an effective way to find a number given some disk.
And indeed, there usually is.

> There are no "serial numbers".
> Please. Where do you think those numbers would come from?

Most of my devices do have them...

> My point is that for the subset of devices that _do_ have serial numbers 

Ah, wait! You also have heard about devices with serial numbers! Good!
It is those devices I was talking about. Remember? ["important special case"]

> udev can then use those serial numbers to have a stable pathname

True. Provided that it knows how to get them.
The kernel driver knew all about the device.
Must udev also know all about all possible devices? Do I/O to these devices?
Or must sysfs export all data that could possibly be used?

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 17:52                                                       ` Davide Libenzi
  2004-01-05 18:03                                                         ` Linus Torvalds
  2004-01-05 18:09                                                         ` Hugo Mills
@ 2004-01-05 19:10                                                         ` Paul Rolland
  2 siblings, 0 replies; 158+ messages in thread
From: Paul Rolland @ 2004-01-05 19:10 UTC (permalink / raw)
  To: 'Davide Libenzi', 'Vojtech Pavlik'
  Cc: 'Linus Torvalds', 'Andries Brouwer',
	'Daniel Jacobowitz', 'Rob Love',
	rob, 'Pascal Schmidt',
	'Linux Kernel Mailing List', 'Greg KH'

Hello,

> > Two dimensional discrete space (*) is enumerable. Just 
> start at [0,0]
> > and assign numbers going around the center in a growing spiral (**).
> > That way you assign a number to every point in that space. 
> This is very
> > similar to the trick used to demonstrate fractions are enumerable.
> 
> Vojtech, a spiral (in the math sense) won't work because whatever 
> continuos function you choose for the radius, you are going to skip 
> integers when the radius grows (and duplicate them when it's 
> small). Also, 
> IIRC, fractions are enumerable because they're a mapping from two 
> enumerable spaces (integers): F = F(I1, I2) = I1 / I2.
> 
No, I think Vojtech was meaning this kind of spiral and
enumeration :

  ...16 15 14 13
     5  4  3  12
     6  1  2  11
     7  8  9  10  

and so on... The spiral in not to be taken in the math sense...

Regards,
Paul


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 17:52                                                       ` Davide Libenzi
  2004-01-05 18:03                                                         ` Linus Torvalds
@ 2004-01-05 18:09                                                         ` Hugo Mills
  2004-01-05 19:10                                                         ` Paul Rolland
  2 siblings, 0 replies; 158+ messages in thread
From: Hugo Mills @ 2004-01-05 18:09 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Linus Torvalds, Andries Brouwer, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, Linux Kernel Mailing List, Greg KH

[-- Attachment #1: Type: text/plain, Size: 1745 bytes --]

On Mon, Jan 05, 2004 at 09:52:45AM -0800, Davide Libenzi wrote:
> On Mon, 5 Jan 2004, Vojtech Pavlik wrote:
> 
> > On Mon, Jan 05, 2004 at 08:13:26AM -0800, Linus Torvalds wrote:
> > 
> > > But the thing is, some things you simply _cannot_ number. For example, a
> > > two-dimensional space is innumerable - you need more than one integer
> > > number to look things up.  So is the set of real numbers (but not the set 
> > > of fractions), etc etc.
> > 
> > Two dimensional discrete space (*) is enumerable. Just start at [0,0]
> > and assign numbers going around the center in a growing spiral (**).
> > That way you assign a number to every point in that space. This is very
> > similar to the trick used to demonstrate fractions are enumerable.
> 
> Vojtech, a spiral (in the math sense) won't work because whatever 
> continuos function you choose for the radius, you are going to skip 
> integers when the radius grows (and duplicate them when it's small). Also, 
> IIRC, fractions are enumerable because they're a mapping from two 
> enumerable spaces (integers): F = F(I1, I2) = I1 / I2.

   I think he meant something like this:

( 0,  0)
( 1,  0)
( 0,  1)
(-1,  0)
( 0, -1)
( 2,  0)
( 1,  1)
( 0,  2)
(-1,  1)
(-2,  0)
(-1, -1)
etc.

   Rationals are countable since they're the product of the integers
(numerator) and the natural numbers without zero (denominator). You
can count them in a similar way to the above "spiral", making sure
that you don't count 1/2 and 2/4 as two different numbers. :)

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 1C335860 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
      --- Try everything once,  except incest and folk-dancing. ---      

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 17:52                                                       ` Davide Libenzi
@ 2004-01-05 18:03                                                         ` Linus Torvalds
  2004-01-05 18:09                                                         ` Hugo Mills
  2004-01-05 19:10                                                         ` Paul Rolland
  2 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05 18:03 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Vojtech Pavlik, Andries Brouwer, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, Linux Kernel Mailing List, Greg KH



On Mon, 5 Jan 2004, Davide Libenzi wrote:
> 
> Vojtech, a spiral (in the math sense) won't work

It's not a spiral in that sense - it's just that the pattern you get when
walking the "dots" looks like a spiral.

> Also, IIRC, fractions are enumerable because they're a mapping from two
> enumerable spaces (integers): F = F(I1, I2) = I1 / I2.

Which is exactly the thing that Vojtech is really talking about: the
enumerable space of a _discrete_ two-dimensional shape, ie folding two
enumerable spaces onto one.

The negative values don't matter, since you can effectively enumerate both
ways starting from zero (ie the full set of integers is not any less
enumerable than the positive numbers are):

	0, 1, -1, 2, -2, 3, -3, ...

so it doesn't really matter if you only enumerate one quadrant (which is
effectively the same thing as enumerating fractions) or all four
quadrants.

The "spiral" pattern for a two-dimensional enumeration ends up being
something like

  (0,0) -> (1,0) -> (0,1) -> (-1,0) -> (0, -1) -> (1,-1) -> (2,0) -> ...

(do it on a graph paper and it's obvious, the above is probably wrong 
since I'm trying to visualize it)


		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 17:29                                                     ` Vojtech Pavlik
  2004-01-05 17:33                                                       ` Linus Torvalds
@ 2004-01-05 17:52                                                       ` Davide Libenzi
  2004-01-05 18:03                                                         ` Linus Torvalds
                                                                           ` (2 more replies)
  1 sibling, 3 replies; 158+ messages in thread
From: Davide Libenzi @ 2004-01-05 17:52 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Linus Torvalds, Andries Brouwer, Daniel Jacobowitz, Rob Love,
	rob, Pascal Schmidt, Linux Kernel Mailing List, Greg KH

On Mon, 5 Jan 2004, Vojtech Pavlik wrote:

> On Mon, Jan 05, 2004 at 08:13:26AM -0800, Linus Torvalds wrote:
> 
> > But the thing is, some things you simply _cannot_ number. For example, a
> > two-dimensional space is innumerable - you need more than one integer
> > number to look things up.  So is the set of real numbers (but not the set 
> > of fractions), etc etc.
> 
> Two dimensional discrete space (*) is enumerable. Just start at [0,0]
> and assign numbers going around the center in a growing spiral (**).
> That way you assign a number to every point in that space. This is very
> similar to the trick used to demonstrate fractions are enumerable.

Vojtech, a spiral (in the math sense) won't work because whatever 
continuos function you choose for the radius, you are going to skip 
integers when the radius grows (and duplicate them when it's small). Also, 
IIRC, fractions are enumerable because they're a mapping from two 
enumerable spaces (integers): F = F(I1, I2) = I1 / I2.



- Davide




^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 17:29                                                     ` Vojtech Pavlik
@ 2004-01-05 17:33                                                       ` Linus Torvalds
  2004-01-05 17:52                                                       ` Davide Libenzi
  1 sibling, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05 17:33 UTC (permalink / raw)
  To: Vojtech Pavlik
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH



On Mon, 5 Jan 2004, Vojtech Pavlik wrote:
> 
> Two dimensional discrete space (*) is enumerable.

Yeah, I'm sorry - you're the second person to point it out, and I really 
knew that but had all the wrong associations (I was thinking of the 
complex plane, not a discrete thing).

> (**) Assuming the coordinates can be negative. For non-negative
>      it's even easier.

It ends up being exactly the same pattern as for fractions (ignoring 0,
which just shifts it), which I explicitly listed as being enumerable, so I
was just being stupid.

I can only say that it's been some time since I actually did my early 
math.. 

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 16:13                                                   ` Linus Torvalds
@ 2004-01-05 17:29                                                     ` Vojtech Pavlik
  2004-01-05 17:33                                                       ` Linus Torvalds
  2004-01-05 17:52                                                       ` Davide Libenzi
  2004-01-05 19:52                                                     ` Andries Brouwer
  1 sibling, 2 replies; 158+ messages in thread
From: Vojtech Pavlik @ 2004-01-05 17:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Daniel Jacobowitz, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 08:13:26AM -0800, Linus Torvalds wrote:

> But the thing is, some things you simply _cannot_ number. For example, a
> two-dimensional space is innumerable - you need more than one integer
> number to look things up.  So is the set of real numbers (but not the set 
> of fractions), etc etc.

Two dimensional discrete space (*) is enumerable. Just start at [0,0]
and assign numbers going around the center in a growing spiral (**).
That way you assign a number to every point in that space. This is very
similar to the trick used to demonstrate fractions are enumerable.

(*)  The one where you can use two integers to look things up.
(**) Assuming the coordinates can be negative. For non-negative
     it's even easier.

-- 
Vojtech Pavlik
SuSE Labs, SuSE CR

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 15:13                                                 ` Mark Mielke
@ 2004-01-05 16:36                                                   ` Andreas Schwab
  2004-01-05 22:18                                                     ` Mark Mielke
  0 siblings, 1 reply; 158+ messages in thread
From: Andreas Schwab @ 2004-01-05 16:36 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

Mark Mielke <mark@mark.mielke.cc> writes:

> There are a few cases that we might be forced to maintain regular
> numbers: mkfifo() creates a named pipe, and bind() creates a named
> socket.

Neither fifos nor sockets are devices.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 12:27                                                 ` Andries Brouwer
@ 2004-01-05 16:13                                                   ` Linus Torvalds
  2004-01-05 17:29                                                     ` Vojtech Pavlik
  2004-01-05 19:52                                                     ` Andries Brouwer
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05 16:13 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Daniel Jacobowitz, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Mon, 5 Jan 2004, Andries Brouwer wrote:
> 
> You have this strange hangup concerning "enumerate", and then keep
> repeating to others that enumerating is impossible, and that therefore
> stable device numbers are impossible, and that consequently, since we
> cannot have stable device numbers expecting them to be stable is broken.

Right.

> It is an old misconception - I recall you telling me how many billion years
> an "ls /dev" would take with 64-bit device numbers.

No. When I talk abotu "enumerate", I do not mean "give numbers starting at
1". In the mathematical sense it means that you _can_ number them with
integers, not that it is necessarily a sequence from 1...n.

For example, PCI device slots are "enumerable". That doesn't mean that we
give them numbers 1..n, it only means that we can encode their address in
a single number. So if everything was a PCI slot, we could enumerate the
whole address space, and "stable" device numbers would be possible (they'd
be stable by _slot_, not by actual device, but that's good enough for some
people).

But the thing is, some things you simply _cannot_ number. For example, a
two-dimensional space is innumerable - you need more than one integer
number to look things up.  So is the set of real numbers (but not the set 
of fractions), etc etc.

It boils down to not how many devices there can be, but to whether there 
is any way to "walk the space of devices".

And there fundamentally isn't. And _that_ is the basic issue: if you
_cannot_ number a space, you cannot have a stable device number.

> No - I never advocated "find a device number by enumeration".
> Quite the opposite, I advocated "use a hash of the serial number
> as the device number of a disk". And more generally, "it is the
> driver's job to assign a device number".

There _is_ no such number as you are talking about. You are talking pure 
theory that has nothing to do with reality. There are no "serial numbers".

Don't you see? This is what "enumeration" is all about. You are assuming a 
model that simply DOES NOT EXIST. Your "serial numbers" are exactly what 
I'm talking about when I say "enumerate". Whenever you claim that a device 
has a "serial number", you literally claim that the device space is 
enumerable, and that is what I have been telling you from day one IS NOT 
TRUE!

Whether you then hash the serial number or not is totally irrelevant: an 
enumeration of hashes is still an enumeration.

Devices do not _have_ serial numbers. They are not enumerable. In other
words, they do not have some kind of explicit identity that we can use to
give them numbers. That is what "innumerable" MEANS, and that is why I 
have been harping on the issue.

Please. Where do you think those numbers would come from?

So I claim as an axiom for device numbering that devices are not
enumerable, and that this _fundamentally_ leads to the corollary that you 
cannot give them stable numbers. Not with hashes, not with _anything_. The 
best you can do is to _literally_ just give them some per-session unique 
integer that is simply the discovery ordering, nothing more.

> So it is not difficult at all to give this network attached storage
> a stable device number.

It is not only difficult, it is fundamentally _impossible_.

> And if one can, there is no reason not to do so.
> It may even allow udev to give stable names as well.

My point is that for the subset of devices that _do_ have serial numbers 
(and it is a subset, nothing more), udev can then use those serial numbers 
to have a stable pathname to the device. But it's a _pathname_, not a 
number.

And for devices that don't have serial numbers, udev can try to use other 
heuristics instead to give those stable names. Sometimes those other 
heuristics would be looking at the actual _content_ of the thing.

For example, if you wanted to, you could make udev do a cddb lookup on the
CD-ROM, and use that as the pathname, so that when you insert your
favorite audio disk, it will always show up in the same place, regardless 
of whether you put it in the DVD slot or the CD-RW drive. 

[ Yeah, that sounds like a singularly silly thing to do, but it's a good 
  example of something where there is no actual serial number, but you can 
  "identify" it automatically through its contents, and name it stably 
  according to that. ]

That is indeed the point of udev.  Doing things that the kenrel 
_obviously_ should not do.

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:48                                               ` Rob Landley
  2004-01-05  4:52                                                 ` Trond Myklebust
@ 2004-01-05 15:13                                                 ` Mark Mielke
  2004-01-05 16:36                                                   ` Andreas Schwab
  1 sibling, 1 reply; 158+ messages in thread
From: Mark Mielke @ 2004-01-05 15:13 UTC (permalink / raw)
  To: Rob Landley
  Cc: David Lang, Linus Torvalds, Andries Brouwer, Rob Love,
	Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 09:48:24PM -0600, Rob Landley wrote:
> On Sunday 04 January 2004 21:06, David Lang wrote:
> > Linus, what Andries is saying is that if you export a directory (say
> > /home) the process of exporting it somehow uses the /dev device number so
> > if the server reboots and gets a different device number for the partition
> > that /home is on the clients won't see it as the same export, breaking the
> > NFS requirement that a server can be rebooted.
> NFS always struck me as a peverse design.  "The fileserver must be
> stateless with regard to clients, even though maintainging state is
> what a filesystem DOES, and the point of the thing is to export a
> filesystem."  Okay...  (If it was exporting read-only filesystems
> with no locking of any kind, maybe they'd have a point, but come on
> guys...)

Statelessness translated to capacity back in the day when maintaining state
for hundreds or thousands of machines was expensive...

I don't buy NFS as an excuse, though. I refuse to believe that a
shared /dev is necessary or desirable for *any* environment. /dev/pts
is one example of where everybody seems to have already agreed on
this.

With udev, or with devfs, a shared /dev becomes unnecessary. /dev will
no longer need to be 7000+ entries. It could be a few hundred or less
for common configurations, and 0% persistence/remote storage for
tmpfs-udev or devfs.

There are a few cases that we might be forced to maintain regular
numbers: mkfifo() creates a named pipe, and bind() creates a named
socket. These might be accessed between reboots over NFS, or local
mounts by many existing programs. I think these must be guaranteed to
keep the same major:minor numbers across reboots (preferably, even
across kernel releases). These are exceptional cases, though, and
should be considered as such.

Cheers,
mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05 11:01                                                 ` Robin Rosenberg
@ 2004-01-05 12:39                                                   ` Nigel Cunningham
  2004-01-07 13:39                                                     ` Robin Rosenberg
  0 siblings, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2004-01-05 12:39 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 781 bytes --]

Hi.

The suspend to disk implementations all assume that devices are not
[dis]appearing under us while we're suspended. If you do go adding and
removing devices while the power is off, you can expect the same
problems you'd get if you removed them without suspending the machine.
It would be roughly equivalent to hot[un]plugging devices.

To return to the original point though, userspace may see a sudden big
jump in the time clock if it's looking, but it won't suddenly find major
& minor numbers are different.

Regards,

Nigel

On Tue, 2004-01-06 at 00:01, Robin Rosenberg wrote:
> > Yes. You end up running the original kernel.
> 
> But not necessarily the same devices.

-- 
My work on Software Suspend is graciously brought to you by
LinuxFund.org.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:33                                               ` Linus Torvalds
  2004-01-05  3:50                                                 ` viro
@ 2004-01-05 12:27                                                 ` Andries Brouwer
  2004-01-05 16:13                                                   ` Linus Torvalds
  1 sibling, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-05 12:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 07:33:16PM -0800, Linus Torvalds wrote:

[A mailbox full of messages, too many to reply to.
Yes, Daniel Jacobowitz understood that I referred to fsid in the NFS case:

  There is a great variation here in what various servers and clients do,
  but roughly speaking filehandles tend to contain a fsid, and this fsid
  often (no fsid= given) involves (major,minor,ino).

No, I have not talked this year about exporting /dev. Also interesting.
Yes, as I said, one can avoid NFS problems by giving fsid=.
It is similar elsewhere. A thousand minor problems are caused by
unstable device numbers. All annoying, each can be solved easily
once one has figured out what goes wrong and why. That is why I say
"preferably stable across reboots".]


What remains to be said?

Linus, let me try a bit more to address what I see as a misconception
in your posts.

> It shouldn't be fixed by saying "device numbers have to be stable across 
> reboots", because the fact is, we're most likely going to have storage 
> that is really really hard to enumerate in a repeatable fashion.

You have this strange hangup concerning "enumerate", and then keep
repeating to others that enumerating is impossible, and that therefore
stable device numbers are impossible, and that consequently, since we
cannot have stable device numbers expecting them to be stable is broken.

It is an old misconception - I recall you telling me how many billion years
an "ls /dev" would take with 64-bit device numbers.

No - I never advocated "find a device number by enumeration".
Quite the opposite, I advocated "use a hash of the serial number
as the device number of a disk". And more generally, "it is the
driver's job to assign a device number".

So it is not difficult at all to give this network attached storage
a stable device number.

And if one can, there is no reason not to do so.
It may even allow udev to give stable names as well.


Andries

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  7:47                                                         ` Greg KH
@ 2004-01-05 11:15                                                           ` Vojtech Pavlik
  2004-01-05 20:11                                                             ` Theodore Ts'o
  0 siblings, 1 reply; 158+ messages in thread
From: Vojtech Pavlik @ 2004-01-05 11:15 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, viro, Daniel Jacobowitz, Andries Brouwer,
	Rob Love, rob, Pascal Schmidt, linux-kernel

On Sun, Jan 04, 2004 at 11:47:17PM -0800, Greg KH wrote:

> > But since you brought it up: do you actually have anything else that can
> > open a remote IMAP file with a few thousand messages without taking ages
> > for it, and that you don't have to mouse around with? I'd like a graphical
> > interface for configuring stuff etc, but I sure as hell don't want to find
> > some f*ing icon to save a few messages that I selected in-order to my
> > "doit" queue or go to the next one, or pipe the thing to a shell-script,
> > or any number of things that are my actual _job_.
> 
> mutt can provide a path for a recovering pine addict.  I did that a
> number of years ago and have been quite happy since.  I can't vouch for
> its IMAP speeds (seems to be fast enough for me, as long as I don't try
> to do a filter on a large IMAP folder), but the other tasks you do
> (selecting, piping, etc.) work very well.

Mutt with IMAP is rather bearable even on a GPRS connection (40kbps,
1sec latency). On a 100baseTX it's not distinguishable from local
operation.

One thing missing in mutt is a persistent message and message header
cache - opening a folder can take a lot of time over a slow connection.
But there is a patch at least for the message header cache persistence
floating on the 'net somewhere.

Another thing that bugs me often in mutt is its inability to service
keystrokes while doing something else (like checking for new mail over
IMAP with a slow link). It becomes unresponsive until that task is done.

> I even think there's a mutt config file that duplicates all of the
> default pine keystrokes just to make moving easier.
> 
> The message threading was reason enough for me to switch, although I've
> heard rumors that pine can handle that now.
> 
> thanks,
> 
> greg k-h
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Vojtech Pavlik
SuSE Labs, SuSE CR

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  7:45                                               ` Nigel Cunningham
@ 2004-01-05 11:01                                                 ` Robin Rosenberg
  2004-01-05 12:39                                                   ` Nigel Cunningham
  0 siblings, 1 reply; 158+ messages in thread
From: Robin Rosenberg @ 2004-01-05 11:01 UTC (permalink / raw)
  To: linux-kernel

måndagen den 5 januari 2004 08.45 skrev Nigel Cunningham:
> Hi.
>
> On Mon, 2004-01-05 at 20:44, James H. Cloos Jr. wrote:
> > >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
> >
> > Linus> Why? Becuase that _program_ sure as hell isn't
> > Linus> running across a reboot.
> >
> > Is that strictly true?  With (software) suspend to disk,
> > will the old device enumeration data be recovered from
> > the suspend partition?
>
> Yes. You end up running the original kernel.

But not necessarily the same devices.

> Regards,
>
> Nigel

-- rob in


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  7:44                                             ` James H. Cloos Jr.
  2004-01-05  7:45                                               ` Nigel Cunningham
@ 2004-01-05  9:06                                               ` Valdis.Kletnieks
  1 sibling, 0 replies; 158+ messages in thread
From: Valdis.Kletnieks @ 2004-01-05  9:06 UTC (permalink / raw)
  To: James H. Cloos Jr.; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 431 bytes --]

On Mon, 05 Jan 2004 02:44:10 EST, "James H. Cloos Jr." said:
> >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> Linus> Why? Becuase that _program_ sure as hell isn't
> Linus> running across a reboot.
> 
> Is that strictly true?  With (software) suspend to disk,
> will the old device enumeration data be recovered from
> the suspend partition?

That would be a suspend, not a reboot, if we're speaking strictly....

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:52                                                       ` Linus Torvalds
  2004-01-05  6:11                                                         ` viro
@ 2004-01-05  7:47                                                         ` Greg KH
  2004-01-05 11:15                                                           ` Vojtech Pavlik
  2004-01-11 22:12                                                         ` Ed L Cashin
  2 siblings, 1 reply; 158+ messages in thread
From: Greg KH @ 2004-01-05  7:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: viro, Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel

On Sun, Jan 04, 2004 at 08:52:56PM -0800, Linus Torvalds wrote:
> 
> But since you brought it up: do you actually have anything else that can
> open a remote IMAP file with a few thousand messages without taking ages
> for it, and that you don't have to mouse around with? I'd like a graphical
> interface for configuring stuff etc, but I sure as hell don't want to find
> some f*ing icon to save a few messages that I selected in-order to my
> "doit" queue or go to the next one, or pipe the thing to a shell-script,
> or any number of things that are my actual _job_.

mutt can provide a path for a recovering pine addict.  I did that a
number of years ago and have been quite happy since.  I can't vouch for
its IMAP speeds (seems to be fast enough for me, as long as I don't try
to do a filter on a large IMAP folder), but the other tasks you do
(selecting, piping, etc.) work very well.

I even think there's a mutt config file that duplicates all of the
default pine keystrokes just to make moving easier.

The message threading was reason enough for me to switch, although I've
heard rumors that pine can handle that now.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  7:44                                             ` James H. Cloos Jr.
@ 2004-01-05  7:45                                               ` Nigel Cunningham
  2004-01-05 11:01                                                 ` Robin Rosenberg
  2004-01-05  9:06                                               ` Valdis.Kletnieks
  1 sibling, 1 reply; 158+ messages in thread
From: Nigel Cunningham @ 2004-01-05  7:45 UTC (permalink / raw)
  To: James H. Cloos Jr.
  Cc: Linus Torvalds, Andries Brouwer, Rob Love, rob, Pascal Schmidt,
	Linux Kernel Mailing List, Greg KH

[-- Attachment #1: Type: text/plain, Size: 523 bytes --]

Hi.

On Mon, 2004-01-05 at 20:44, James H. Cloos Jr. wrote:
> >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> Linus> Why? Becuase that _program_ sure as hell isn't
> Linus> running across a reboot.
> 
> Is that strictly true?  With (software) suspend to disk,
> will the old device enumeration data be recovered from
> the suspend partition?

Yes. You end up running the original kernel.

Regards,

Nigel

-- 
My work on Software Suspend is graciously brought to you by
LinuxFund.org.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  2:52                                           ` Linus Torvalds
  2004-01-05  3:06                                             ` David Lang
  2004-01-05  3:07                                             ` Daniel Jacobowitz
@ 2004-01-05  7:44                                             ` James H. Cloos Jr.
  2004-01-05  7:45                                               ` Nigel Cunningham
  2004-01-05  9:06                                               ` Valdis.Kletnieks
  2 siblings, 2 replies; 158+ messages in thread
From: James H. Cloos Jr. @ 2004-01-05  7:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

>>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:

Linus> Why? Becuase that _program_ sure as hell isn't
Linus> running across a reboot.

Is that strictly true?  With (software) suspend to disk,
will the old device enumeration data be recovered from
the suspend partition?

-JimC


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:38                                                     ` viro
  2004-01-05  4:52                                                       ` Linus Torvalds
  2004-01-05  5:26                                                       ` Eric W. Biederman
@ 2004-01-05  7:39                                                       ` Greg KH
  2 siblings, 0 replies; 158+ messages in thread
From: Greg KH @ 2004-01-05  7:39 UTC (permalink / raw)
  To: viro
  Cc: Linus Torvalds, Daniel Jacobowitz, Andries Brouwer, Rob Love,
	rob, Pascal Schmidt, linux-kernel

On Mon, Jan 05, 2004 at 04:38:30AM +0000, viro@parcelfarce.linux.theplanet.co.uk wrote:
> 
> Then we'd better have a very good idea of the things that are going to
> break.  Note that right now even late-boot code in kernel itself will
> break on that - there are explicit checks for ROOT_DEV==MKDEV(2,0),
> all sorts of weird crap deep in the bowels of arch/ppc/*/*, etc.
> 
> It won't be an easy transition - I know that Greg is very optimistic
> about it, but there will be a *lot* of crap to take care of.

Oh I know it's going to be tough, and there's going to be a lot of crap
to take care of, but in the end, I think it will be worth it...hopefully
if I'm still sane then...

> ObOtherStraightforwardThings: net_device refcounting.  Take a look at
> Jeff's queue someday - by now it's one big merge short of getting it
> right for practically all drivers.  1.9Mb total + 247Kb pending patches
> here.  Several hundreds changesets, practically all of them fixing
> exploitable holes.  And yes, most of them had been bugs all along -
> since 2.2 if not earlier.  Sure, that made things better, but if somebody
> comes along and makes similar "fun" necessary for e.g. ALSA...

Yeah, ALSA scares me, along with the input layer code.  I had dreams of
easily converting them to use proper refcounting, but now know there's
no way that would be an easy conversion and have pretty much given up on
it.  For 2.6 at least.  

That's why my "simple_class" patch will have to be a band-aid for now to
get sysfs representation for those types of devices.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:52                                                       ` Linus Torvalds
@ 2004-01-05  6:11                                                         ` viro
  2004-01-05  7:47                                                         ` Greg KH
  2004-01-11 22:12                                                         ` Ed L Cashin
  2 siblings, 0 replies; 158+ messages in thread
From: viro @ 2004-01-05  6:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 08:52:56PM -0800, Linus Torvalds wrote:
 
> That's where "mount by label" does part of the job. But if the system is 
> _always_ set up to do things like NFS exports according to some separate 
> UUID, that too would "just work".

mount by label does part of the job, until you decide to use dd(1) to copy
a disk.  At which point you have, AFAICS, no way tell which copy will get
mounted.
 
> Those are them fighting words.
> 
> But since you brought it up: do you actually have anything else that can
> open a remote IMAP file with a few thousand messages without taking ages
> for it, and that you don't have to mouse around with? I'd like a graphical
> interface for configuring stuff etc, but I sure as hell don't want to find
> some f*ing icon to save a few messages that I selected in-order to my
> "doit" queue or go to the next one, or pipe the thing to a shell-script,
> or any number of things that are my actual _job_.

I prefer to ssh to another box and use mutt.  Seriously, I've made a mistake
of reading imapd source and that was enough to decide that I'm _not_ touching
uw-<anything> and that protocol in general unless I really have no other
options.  So far I've managed to avoid that...

> On a related matter, I'm probably a retard, but I've tried alternatives to
> "trn" too, and there really aren't any.

Same here.  There are things about trn command set I'd prefer to see changed,
but it's better than other newsreaders I've seen...

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:38                                                     ` viro
  2004-01-05  4:52                                                       ` Linus Torvalds
@ 2004-01-05  5:26                                                       ` Eric W. Biederman
  2004-01-05  7:39                                                       ` Greg KH
  2 siblings, 0 replies; 158+ messages in thread
From: Eric W. Biederman @ 2004-01-05  5:26 UTC (permalink / raw)
  To: viro
  Cc: Linus Torvalds, Daniel Jacobowitz, Andries Brouwer, Rob Love,
	rob, Pascal Schmidt, linux-kernel, Greg KH

viro@parcelfarce.linux.theplanet.co.uk writes:

> On Sun, Jan 04, 2004 at 08:02:20PM -0800, Linus Torvalds wrote:
> > Now, we'd probably not want to force the switch, but I do suspect we'll 
> > have exactly this as a switch in the "Kernel Debugging Config" section. 
> > Where even _common_ things like disks could end up with per-bootup values. 
> > Just to verify that every part of the system ends up having it right.
> 
> Then we'd better have a very good idea of the things that are going to
> break.  Note that right now even late-boot code in kernel itself will
> break on that - there are explicit checks for ROOT_DEV==MKDEV(2,0),
> all sorts of weird crap deep in the bowels of arch/ppc/*/*, etc.

/sbin/lilo and possibly some of the other bootloaders.  Relationships
between devices are a challenge to work with.  How do you go from a
partition to it's actual block device etc.  I don't remember how many
major numbers lilo has hard coded, I just remember looking at it once
and realizing I couldn't think of a better way to accomplish what it
was trying to do.

Eric


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:38                                                     ` viro
@ 2004-01-05  4:52                                                       ` Linus Torvalds
  2004-01-05  6:11                                                         ` viro
                                                                           ` (2 more replies)
  2004-01-05  5:26                                                       ` Eric W. Biederman
  2004-01-05  7:39                                                       ` Greg KH
  2 siblings, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05  4:52 UTC (permalink / raw)
  To: viro
  Cc: Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH



On Mon, 5 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:

> > If nothing else, things like SATA will end up meaning that the device you 
> > were used to seeign as /dev/hdc will suddenly show up as /dev/scd0 
> > instead. Just because you changed the cabling while you upgraded to a 
> > newer version of your CD-ROM drive.
> 
> If I open the damn box, I sure as hell can be bothered to edit stuff in
> /etc...

Actually, not necessarily.

The thing is, _the_ most common reason I have for opening the box is that 
the effing thing started having problems.

At which point I want to just remove the disk, move it to another box, and 
boot up the other box.

And THAT is exactly the kind of situation where I sure as hell don't want
to care exactly where the disk was. I can't "prepare" for it by editing
files in /etc, since I don't know that the CPU fan or whatever is going to
die on me.

And this is _exactly_ why we should try to get away from device numbering
having any meaning. Because if we do this right, something like the CPU
fan dying, and me moving a disk to a new machine that has SATA (with the
disk having both SATA and PATA connectors), I shouldn't need to even
_think_ about it.

That's where "mount by label" does part of the job. But if the system is 
_always_ set up to do things like NFS exports according to some separate 
UUID, that too would "just work".

There's a lot to be said for "just work".  Even if sometimes it takes some 
pain when you break old (and broken) assumptions.

> > because "pine" still doesn't get UTF-8 right, and nobody is apparently
> > ever going to fix it. Oh, well. But at least I know I'm doing something
> > _wrong_, which in itself is a good thing.).
> 
> Heh.  Took you long enough - "using pine" should've been a dead giveaway
> from the very beginning ;-)

Those are them fighting words.

But since you brought it up: do you actually have anything else that can
open a remote IMAP file with a few thousand messages without taking ages
for it, and that you don't have to mouse around with? I'd like a graphical
interface for configuring stuff etc, but I sure as hell don't want to find
some f*ing icon to save a few messages that I selected in-order to my
"doit" queue or go to the next one, or pipe the thing to a shell-script,
or any number of things that are my actual _job_.

And the "no mousing" means that I don't want to have some popup window 
that asks me what file I want to save into or similar crap. I can type 
fast enough if I stay on the keyboard and can focus on one part of the 
screen, but if I have to switch my focus around, I'm a goner.

On a related matter, I'm probably a retard, but I've tried alternatives to
"trn" too, and there really aren't any. None of the graphical news readers
can show me one full page of threads, select the 3-4 threads from _that_
one page that I want (from the keyboard), and then kill _that_ one page.
Not the whole newsgroup: only the part that shows in the window at that
time.

In "trn", the magic command is capital-D, for "discard".

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:48                                               ` Rob Landley
@ 2004-01-05  4:52                                                 ` Trond Myklebust
  2004-01-05 15:13                                                 ` Mark Mielke
  1 sibling, 0 replies; 158+ messages in thread
From: Trond Myklebust @ 2004-01-05  4:52 UTC (permalink / raw)
  To: rob; +Cc: linux-kernel

På su , 04/01/2004 klokka 22:48, skreiv Rob Landley:

> NFS always struck me as a peverse design.  "The fileserver must be stateless 
> with regard to clients, even though maintainging state is what a filesystem 
> DOES, and the point of the thing is to export a filesystem."  Okay...  (If it 
> was exporting read-only filesystems with no locking of any kind, maybe they'd 
> have a point, but come on guys...)

Sigh... What has that got to do with anything?

Read the RFCs: NFS *was* entirely stateless until v4 was drafted.
Locking was never part of the NFS protocol, but was an external addition
that was documented by the Open Group. So, yes, there is a history and a
reason behind all the talk of statelessness.

As for the current thread about remembering device numbers: as far as
NFS is concerned, that is entirely an implementation issue. There is no
need for any extra NFS protocol support for this sort of crap.

> So why, exactly, can the NFS server not maintain whatever extra state it needs 
> to remember between reboots in a filesystem?  (Not even necessarily the one 
> it's exporting, just some rc file something under /var.)  The device node it 
> was exporting USED to be in the filesystem, you know, ala mknod.  Now that 
> the kernel's not keeping that stable, have the #*%(&# server generate a 
> number and make a note of it somewhere.  (Is requiring an NFS server to have 
> access to persistent storage too much to ask?)

It could be done (and probably entirely in userspace). I assume you are
volunteering to do the work?

> Personally, I could never figure out why Samba servers are in userspace but 
> NFS servers seem to want to live in the kernel.  I can almost secure a samba 
> server for access to the outside world, but a NFS system that isn't behind a 
> firewall automatically says to me "this machine has already been compromised 
> eight ways from sunday within five minutes of being exposed to the internet".  
> Call me paranoid...

Sun was doing Kerberos for NFS years before the Samba project was
started.

Security has bugger all to do with kernel or userland and everything to
do with the short-sighted "munitions" policies of certain governments at
the time around when the Sun RPC protocol was being drafted. The same
policies were still around to dictate our implementation much later when
we were doing RPC for Linux. Now the laws have changed, and so we've
finally been able to add strong authentication in 2.6.x.

Cheers,
  Trond

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:15                                           ` Peter Chubb
@ 2004-01-05  4:42                                             ` Linus Torvalds
  0 siblings, 0 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05  4:42 UTC (permalink / raw)
  To: Peter Chubb
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Mon, 5 Jan 2004, Peter Chubb wrote:
> 
> It's worse than that.  You can do
>      mknod fred b maj minor
> anywhere on any UNIX filesystem and expect it to a) work and b) refer
> to the same device for all time until it is removed.

Hmm.. I can see (a) (except for the fact that pretty  much all unixes have 
mount-flags to say "no device files") but I don't see why you'd _ever_ 
expect (b) to be true.

It's patently not true for such rather traditional unix devices as pty's, 
for example. The "same device" ends up being true only for as long as the 
master at the other end exists - and the same numbers get re-used in all 
normal usage for different virtual devices.

> I know that Linux already breaks this (the stupid /dev/sg[0-9] that
> depend not on the SCSI bus and lun but on the order they're detected,
> for example) 

That "stupid" thing is a hell of a lot less stupid than the alternatives, 
and is very much equivalent to how pty's work. 

In fact, the "number according to detection" is pretty much the best 
device number allocation strategy. It's the _only_ one that doesn't have 
some incorrect bias built-in. 

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  4:02                                                   ` Linus Torvalds
@ 2004-01-05  4:38                                                     ` viro
  2004-01-05  4:52                                                       ` Linus Torvalds
                                                                         ` (2 more replies)
  2004-01-07  9:57                                                     ` Pavel Machek
  1 sibling, 3 replies; 158+ messages in thread
From: viro @ 2004-01-05  4:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 08:02:20PM -0800, Linus Torvalds wrote:
> 
> 
> On Mon, 5 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:
> > 
> > What is _not_ OK, though, is to have folks suddenly see /dev/hda3 changing
> > its device number - then we would break existing setups that worked all
> > along; even if admin can fix the breakage, it's not a good thing to do.
> 
> Ehh, it will actually happen.
> 
> If nothing else, things like SATA will end up meaning that the device you 
> were used to seeign as /dev/hdc will suddenly show up as /dev/scd0 
> instead. Just because you changed the cabling while you upgraded to a 
> newer version of your CD-ROM drive.

If I open the damn box, I sure as hell can be bothered to edit stuff in
/etc...

> And the thing is, with fs labels and udev, even "existing systems" really
> shouldn't much care.
> 
> Now, we'd probably not want to force the switch, but I do suspect we'll 
> have exactly this as a switch in the "Kernel Debugging Config" section. 
> Where even _common_ things like disks could end up with per-bootup values. 
> Just to verify that every part of the system ends up having it right.

Then we'd better have a very good idea of the things that are going to
break.  Note that right now even late-boot code in kernel itself will
break on that - there are explicit checks for ROOT_DEV==MKDEV(2,0),
all sorts of weird crap deep in the bowels of arch/ppc/*/*, etc.

It won't be an easy transition - I know that Greg is very optimistic
about it, but there will be a *lot* of crap to take care of.  In theory
getting bigger dev_t should've been very straightforward, but if you
check what really had been involved...

ObOtherStraightforwardThings: net_device refcounting.  Take a look at
Jeff's queue someday - by now it's one big merge short of getting it
right for practically all drivers.  1.9Mb total + 247Kb pending patches
here.  Several hundreds changesets, practically all of them fixing
exploitable holes.  And yes, most of them had been bugs all along -
since 2.2 if not earlier.  Sure, that made things better, but if somebody
comes along and makes similar "fun" necessary for e.g. ALSA...

> because "pine" still doesn't get UTF-8 right, and nobody is apparently
> ever going to fix it. Oh, well. But at least I know I'm doing something
> _wrong_, which in itself is a good thing.).

Heh.  Took you long enough - "using pine" should've been a dead giveaway
from the very beginning ;-)

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:01                                         ` Andries Brouwer
                                                             ` (3 preceding siblings ...)
  2004-01-05  2:52                                           ` Linus Torvalds
@ 2004-01-05  4:15                                           ` Peter Chubb
  2004-01-05  4:42                                             ` Linus Torvalds
  4 siblings, 1 reply; 158+ messages in thread
From: Peter Chubb @ 2004-01-05  4:15 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

>>>>> "Andries" == Andries Brouwer <aebr@win.tue.nl> writes:

Andries> On Sun, Jan 04, 2004 at 01:05:20PM -0800, Linus Torvalds
Andries> wrote:

Andries> Surprise! Are you leaving POSIX? Or ditching NFS?  Or
Andries> demanding that NFS servers must never reboot?

Andries> A common Unix idiom is testing for the identity of two files
Andries> by comparing st_ino and st_dev.  A broken idiom?

It's worse than that.  You can do
     mknod fred b maj minor
anywhere on any UNIX filesystem and expect it to a) work and b) refer
to the same device for all time until it is removed. However, this
doesn't appear to be guaranteed by SUS -- the only guarantees are that
the dev_t returned from the stat() family of calls is unique within a LAN.

I know that Linux already breaks this (the stupid /dev/sg[0-9] that
depend not on the SCSI bus and lun but on the order they're detected,
for example) 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:50                                                 ` viro
@ 2004-01-05  4:02                                                   ` Linus Torvalds
  2004-01-05  4:38                                                     ` viro
  2004-01-07  9:57                                                     ` Pavel Machek
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05  4:02 UTC (permalink / raw)
  To: viro
  Cc: Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH



On Mon, 5 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:
> 
> What is _not_ OK, though, is to have folks suddenly see /dev/hda3 changing
> its device number - then we would break existing setups that worked all
> along; even if admin can fix the breakage, it's not a good thing to do.

Ehh, it will actually happen.

If nothing else, things like SATA will end up meaning that the device you 
were used to seeign as /dev/hdc will suddenly show up as /dev/scd0 
instead. Just because you changed the cabling while you upgraded to a 
newer version of your CD-ROM drive.

And the thing is, with fs labels and udev, even "existing systems" really
shouldn't much care.

Now, we'd probably not want to force the switch, but I do suspect we'll 
have exactly this as a switch in the "Kernel Debugging Config" section. 
Where even _common_ things like disks could end up with per-bootup values. 
Just to verify that every part of the system ends up having it right.

Think of it this way: RedHat not that long ago decided to break with a
_lot_ of tradition by switching over to UTF-8 as the common text encoring.  
It broke some _major_ programs in how they dealt with "simple" things like
keyboard input that had worked for literally _decades_.

And you could switch it off if you really wanted to, but quite frankly, it 
wasn't even a simple choice in the install. You had to know what you were 
doing to switch it off.

And the thing is, that is _the_ single thing that cleaned up a lot of 
remaining problems wrt UTF-8 on Linux. Yes, almost all of them had been 
solved already, or RH wouldn't have dared do the switch. But to get there 
all the way, you had to literally force the cut-over.

(Yeah, I'm a bad person, and I personally went back to the C locale,
because "pine" still doesn't get UTF-8 right, and nobody is apparently
ever going to fix it. Oh, well. But at least I know I'm doing something
_wrong_, which in itself is a good thing.).

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:33                                               ` Linus Torvalds
@ 2004-01-05  3:50                                                 ` viro
  2004-01-05  4:02                                                   ` Linus Torvalds
  2004-01-05 12:27                                                 ` Andries Brouwer
  1 sibling, 1 reply; 158+ messages in thread
From: viro @ 2004-01-05  3:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Jacobowitz, Andries Brouwer, Rob Love, rob,
	Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 07:33:16PM -0800, Linus Torvalds wrote:
> Ahh. I'll buy into that, and yes, this is an example of something that 
> needs to be fixed. 
> 
> It shouldn't be fixed by saying "device numbers have to be stable across 
> reboots", because the fact is, we're most likely going to have storage 
> that is really really hard to enumerate in a repeatable fashion.
> 
> So the _proper_ thing to do is to have the NFS server not use the device 
> number as part of fsid. It should use the _stable_ UUID of the filesystem 
> or some similar label.

... and we already have a way to specify it explicitly.  Which, BTW, allows
to take server down, copy exported fs from failing IDE disk to SCSI one and
reexport.  With clients remaining happy with you.  Remember discussions
circa 2.5.50 or so about that stuff?

So we have tools for that.  And it's 100% OK to say "if you are doing NFS
export of filesystem that lives on $new_weird_device, explicit fsid= is
not just a good idea, it's a must-have".

What is _not_ OK, though, is to have folks suddenly see /dev/hda3 changing
its device number - then we would break existing setups that worked all
along; even if admin can fix the breakage, it's not a good thing to do.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:06                                             ` David Lang
@ 2004-01-05  3:48                                               ` Rob Landley
  2004-01-05  4:52                                                 ` Trond Myklebust
  2004-01-05 15:13                                                 ` Mark Mielke
  0 siblings, 2 replies; 158+ messages in thread
From: Rob Landley @ 2004-01-05  3:48 UTC (permalink / raw)
  To: David Lang, Linus Torvalds
  Cc: Andries Brouwer, Rob Love, Pascal Schmidt, linux-kernel, Greg KH

On Sunday 04 January 2004 21:06, David Lang wrote:
> Linus, what Andries is saying is that if you export a directory (say
> /home) the process of exporting it somehow uses the /dev device number so
> if the server reboots and gets a different device number for the partition
> that /home is on the clients won't see it as the same export, breaking the
> NFS requirement that a server can be rebooted.

NFS always struck me as a peverse design.  "The fileserver must be stateless 
with regard to clients, even though maintainging state is what a filesystem 
DOES, and the point of the thing is to export a filesystem."  Okay...  (If it 
was exporting read-only filesystems with no locking of any kind, maybe they'd 
have a point, but come on guys...)

So here's an example of where the fileserver _does_ expect to maintain 
non-file state across reboots.  "Ooh, the device node we're exporting is part 
of the ID, gee, we missed one!"

So why, exactly, can the NFS server not maintain whatever extra state it needs 
to remember between reboots in a filesystem?  (Not even necessarily the one 
it's exporting, just some rc file something under /var.)  The device node it 
was exporting USED to be in the filesystem, you know, ala mknod.  Now that 
the kernel's not keeping that stable, have the #*%(&# server generate a 
number and make a note of it somewhere.  (Is requiring an NFS server to have 
access to persistent storage too much to ask?)

Personally, I could never figure out why Samba servers are in userspace but 
NFS servers seem to want to live in the kernel.  I can almost secure a samba 
server for access to the outside world, but a NFS system that isn't behind a 
firewall automatically says to me "this machine has already been compromised 
eight ways from sunday within five minutes of being exposed to the internet".  
Call me paranoid...

Rob


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  2:29                                             ` Andries Brouwer
@ 2004-01-05  3:42                                               ` viro
  0 siblings, 0 replies; 158+ messages in thread
From: viro @ 2004-01-05  3:42 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Mon, Jan 05, 2004 at 03:29:01AM +0100, Andries Brouwer wrote:
> On Sun, Jan 04, 2004 at 10:37:10PM +0000, viro@parcelfarce.linux.theplanet.co.uk wrote:
> 
> Hi Al - a happy 2004 to you too!
> 
> > Now, care to explain how preserving aforementioned common Unix idiom
> > is related to your expostulations?
> 
> Hmm. You sound like you agree that random device numbers and NFS
> are a bad combination, but don't see why my example might be
> relevant.

No.  I don't see what the fuck does it have to POSIX compliance, ability
to determine whether two files are identical by st_ino/st_dev and common
UNIX idioms.
 
> There is a great variation here in what various servers and clients do,
> but roughly speaking filehandles tend to contain a fsid, and this fsid
> often (no fsid= given) involves (major,minor,ino).

Now, _that_ is true.  And yes, I agree that setups with unstable device
numbers do need explicit actions on part of admin.  In particular, editing
/etc/exports to add fsid= in each relevant entry.

Which means that *in* *setups* *where* *numbers* *are* *currently* *stable*
we should not make them random without admin's knowledge.  And /etc/exports
is not the only problem - RAID, journaling filesystems with device number of
log stored on-disk, etc.

*However*, if we are talking about new classes of devices, all bets are off
and proper fix is to stop using unsuitable interfaces for those devices.
For exports it means "use explicit fsid".  For RAID we both agreed, IIRC,
that raidtools will need to switch to saner API, etc.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  3:07                                             ` Daniel Jacobowitz
@ 2004-01-05  3:33                                               ` Linus Torvalds
  2004-01-05  3:50                                                 ` viro
  2004-01-05 12:27                                                 ` Andries Brouwer
  0 siblings, 2 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05  3:33 UTC (permalink / raw)
  To: Daniel Jacobowitz
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sun, 4 Jan 2004, Daniel Jacobowitz wrote:
> 
> I think you two are talking straight past each other.  Andries is
> talking about the fsid, which is determined by the NFS server, based on
> its idea of the device number of the filesystem underlying the exported
> directory.  Right now, I can reboot my host system, and when it comes
> up then the NFS directories it exports to clients will have the same
> fsid.  With random device numbers it won't work; after rebooting the
> NFS server all clients will be forced to explicitly unmount and
> remount.

Ahh. I'll buy into that, and yes, this is an example of something that 
needs to be fixed. 

It shouldn't be fixed by saying "device numbers have to be stable across 
reboots", because the fact is, we're most likely going to have storage 
that is really really hard to enumerate in a repeatable fashion.

So the _proper_ thing to do is to have the NFS server not use the device 
number as part of fsid. It should use the _stable_ UUID of the filesystem 
or some similar label.

And it should do that exactly because the device number isn't as stable as 
NFS exporting would like it to be. Exactly because things like network- 
attached disks etc.  How would you otherwise export a disk that perhaps 
gets its address from DHCP? 

[ I incredulously asked a NetApp person why you'd ever want to expose the
  _disk_ over ethenet, rather than just have the NAS device export a
  filesystem of its own. It turns out that some people really want to just
  see a block device, either because Windows sucks at network filesystems
  or because they want to do things like databases onto them. The point
  being that once you do that, you'll likely want to export the thing as
  an SMB share from the thing that "owns" the disk. 

  So you would literally have a _disk_ whose IP address changed depending 
  on what other machines were booted on the same network. ]

Issues like this is also why Linux vendors have already started doing
things like "mount by label" - because disks have a nasty tendency to move
around, and specifying the fstab contents (or "root=xxx" on the kernel
command line) with physical location or similar just doesn't work all
that well. It happens today with things like USB2 or firewire disks. They 
get moved around, and they get a new device number.

It's still not _common_, but it's slowly getting there.

> Now, it seems to me that this isn't much of an argument against random
> device numbers.  Have userspace set a UUID for the device if you want,
> and use that in the fsid instead.  But that's the argument; it has
> nothing to do with the NFS server exporting its /dev.

I buy into that, and I agree 100% with you that this is just a case where 
you should use a UUID.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  2:52                                           ` Linus Torvalds
  2004-01-05  3:06                                             ` David Lang
@ 2004-01-05  3:07                                             ` Daniel Jacobowitz
  2004-01-05  3:33                                               ` Linus Torvalds
  2004-01-05  7:44                                             ` James H. Cloos Jr.
  2 siblings, 1 reply; 158+ messages in thread
From: Daniel Jacobowitz @ 2004-01-05  3:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 06:52:56PM -0800, Linus Torvalds wrote:
> 
> 
> On Sun, 4 Jan 2004, Andries Brouwer wrote:
> > 
> > Surprise! Are you leaving POSIX? Or ditching NFS?
> > Or demanding that NFS servers must never reboot?
> 
> Ok, Andries, time for you to take a deep breath, and calm down. Because 
> your arguments are getting ridiculous in the extreme.
> 
> A NFS server is sure as hell not going to export _its_ dynamic /dev to its 
> clients. That would be not just stupid, but crazy. Next you tell me that 
> you were using devfs and exporting that over NFS. 
> 
> A NFS server is going to export something _totally_ different than its own 
> /dev directory - it needs to be _client_-specific anyway. That's true with 
> stable numbers too, btw - ever tried to mount a Solaris /dev on a Linux 
> client? No workee.

I think you two are talking straight past each other.  Andries is
talking about the fsid, which is determined by the NFS server, based on
its idea of the device number of the filesystem underlying the exported
directory.  Right now, I can reboot my host system, and when it comes
up then the NFS directories it exports to clients will have the same
fsid.  With random device numbers it won't work; after rebooting the
NFS server all clients will be forced to explicitly unmount and
remount.

Now, it seems to me that this isn't much of an argument against random
device numbers.  Have userspace set a UUID for the device if you want,
and use that in the fsid instead.  But that's the argument; it has
nothing to do with the NFS server exporting its /dev.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  2:52                                           ` Linus Torvalds
@ 2004-01-05  3:06                                             ` David Lang
  2004-01-05  3:48                                               ` Rob Landley
  2004-01-05  3:07                                             ` Daniel Jacobowitz
  2004-01-05  7:44                                             ` James H. Cloos Jr.
  2 siblings, 1 reply; 158+ messages in thread
From: David Lang @ 2004-01-05  3:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

Linus, what Andries is saying is that if you export a directory (say
/home) the process of exporting it somehow uses the /dev device number so
if the server reboots and gets a different device number for the partition
that /home is on the clients won't see it as the same export, breaking the
NFS requirement that a server can be rebooted.

I don't agree with him becouse if the NFS server does include /dev info in
what it shows to the outside world it's already broken.

David Lang


 On Sun, 4 Jan 2004, Linus Torvalds wrote:

> Date: Sun, 4 Jan 2004 18:52:56 -0800 (PST)
> From: Linus Torvalds <torvalds@osdl.org>
> To: Andries Brouwer <aebr@win.tue.nl>
> Cc: Rob Love <rml@ximian.com>, rob@landley.net,
>      Pascal Schmidt <der.eremit@email.de>, linux-kernel@vger.kernel.org,
>      Greg KH <greg@kroah.com>
> Subject: Re: udev and devfs - The final word
>
>
>
> On Sun, 4 Jan 2004, Andries Brouwer wrote:
> >
> > Surprise! Are you leaving POSIX? Or ditching NFS?
> > Or demanding that NFS servers must never reboot?
>
> Ok, Andries, time for you to take a deep breath, and calm down. Because
> your arguments are getting ridiculous in the extreme.
>
> A NFS server is sure as hell not going to export _its_ dynamic /dev to its
> clients. That would be not just stupid, but crazy. Next you tell me that
> you were using devfs and exporting that over NFS.
>
> A NFS server is going to export something _totally_ different than its own
> /dev directory - it needs to be _client_-specific anyway. That's true with
> stable numbers too, btw - ever tried to mount a Solaris /dev on a Linux
> client? No workee.
>
> > A common Unix idiom is testing for the identity
> > of two files by comparing st_ino and st_dev.
> > A broken idiom?
>
> No. It still works. Even if the device numbers change across reboots.
>
> Why? Becuase that _program_ sure as hell isn't running across a reboot.
>
> And again, this is not something we haven't seen before. Have you ever
> looked at the "st_dev" values? Try once - look at what it returns for a
> NFS-mounted filesystem. Ponder. Notice how it already is NOT stable across
> reboots.
>
> In other words, the stuff you're complaining about is all stuff that
> nobody has _ever_ been able to rely on, and that has nothign to do with
> udev or anythign else. It all just shows how 100% right I am for saying
> that you cannot rely on stable numbers.
>
> So I repeat: calm down, and think it through.
>
> 		Linus
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:01                                         ` Andries Brouwer
                                                             ` (2 preceding siblings ...)
  2004-01-04 23:35                                           ` Valdis.Kletnieks
@ 2004-01-05  2:52                                           ` Linus Torvalds
  2004-01-05  3:06                                             ` David Lang
                                                               ` (2 more replies)
  2004-01-05  4:15                                           ` Peter Chubb
  4 siblings, 3 replies; 158+ messages in thread
From: Linus Torvalds @ 2004-01-05  2:52 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sun, 4 Jan 2004, Andries Brouwer wrote:
> 
> Surprise! Are you leaving POSIX? Or ditching NFS?
> Or demanding that NFS servers must never reboot?

Ok, Andries, time for you to take a deep breath, and calm down. Because 
your arguments are getting ridiculous in the extreme.

A NFS server is sure as hell not going to export _its_ dynamic /dev to its 
clients. That would be not just stupid, but crazy. Next you tell me that 
you were using devfs and exporting that over NFS. 

A NFS server is going to export something _totally_ different than its own 
/dev directory - it needs to be _client_-specific anyway. That's true with 
stable numbers too, btw - ever tried to mount a Solaris /dev on a Linux 
client? No workee.

> A common Unix idiom is testing for the identity
> of two files by comparing st_ino and st_dev.
> A broken idiom?

No. It still works. Even if the device numbers change across reboots.

Why? Becuase that _program_ sure as hell isn't running across a reboot.

And again, this is not something we haven't seen before. Have you ever 
looked at the "st_dev" values? Try once - look at what it returns for a 
NFS-mounted filesystem. Ponder. Notice how it already is NOT stable across 
reboots.

In other words, the stuff you're complaining about is all stuff that 
nobody has _ever_ been able to rely on, and that has nothign to do with 
udev or anythign else. It all just shows how 100% right I am for saying 
that you cannot rely on stable numbers.

So I repeat: calm down, and think it through.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:37                                           ` viro
  2004-01-05  1:02                                             ` Mark Mielke
@ 2004-01-05  2:29                                             ` Andries Brouwer
  2004-01-05  3:42                                               ` viro
  1 sibling, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-05  2:29 UTC (permalink / raw)
  To: viro
  Cc: Andries Brouwer, Linus Torvalds, Rob Love, rob, Pascal Schmidt,
	linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 10:37:10PM +0000, viro@parcelfarce.linux.theplanet.co.uk wrote:

Hi Al - a happy 2004 to you too!

> Now, care to explain how preserving aforementioned common Unix idiom
> is related to your expostulations?

Hmm. You sound like you agree that random device numbers and NFS
are a bad combination, but don't see why my example might be
relevant.

There is a great variation here in what various servers and clients do,
but roughly speaking filehandles tend to contain a fsid, and this fsid
often (no fsid= given) involves (major,minor,ino). When device numbers
vary randomly, the fsid may vary randomly. Various bad things may happen:
maybe all file handles go stale (or, worse, refer to something else),
or maybe device numbers on the client vary randomly.

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  1:02                                             ` Mark Mielke
@ 2004-01-05  2:24                                               ` Valdis.Kletnieks
  0 siblings, 0 replies; 158+ messages in thread
From: Valdis.Kletnieks @ 2004-01-05  2:24 UTC (permalink / raw)
  To: Mark Mielke; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 886 bytes --]

On Sun, 04 Jan 2004 20:02:36 EST, Mark Mielke said:

> If and when this comes up in 2.7 development, I would like to see an
> option of the sort: 1) Try to maintain major:minor numbers across
> reboots (even at the expense of complexity and efficiency), 2) Try to
> maintain a subset of the major:minor numbers across reboots
> (compromise) 3) Provide the most efficient implementation, making no
> guarantees regarding the numbering scheme, unless using a numbering
> scheme turns out to be more efficient. Deprecate 1), and let 2) and 3)
> evolve until we see who the victor is... :-) As long as the interface
> that maps device to number is abstracted, the above should be pluggable.

I'd recommend (at least during 2.7) some code in the allocator:

	if (LINUX_VERSION_CODE % 3) {
		major ^= get_random_bytes(4);
		minor ^= get_random_bytes(4);
	}

Just to keep everybody honest. :)

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  1:58                                               ` viro
@ 2004-01-05  2:12                                                 ` Jeremy Maitin-Shepard
  0 siblings, 0 replies; 158+ messages in thread
From: Jeremy Maitin-Shepard @ 2004-01-05  2:12 UTC (permalink / raw)
  To: viro; +Cc: linux-kernel

viro@parcelfarce.linux.theplanet.co.uk writes:

> On Sun, Jan 04, 2004 at 08:43:27PM -0500, Jeremy Maitin-Shepard wrote:
>> Unfortunately, programs such as tar depend on inode numbers of distinct
>> files being distinct even when the file is not open over a period of
>> several minutes/seconds.  This is needed to avoid dumping hard links
>> more than once.  Furthermore, there is no efficient way to write
>> programs such as tar without depending on this capability.  Thus, if
>> st_ino cannot be used reliably for this purpose, it would be useful for
>> there to be a system call for retrieving a true
>> unique-within-the-filesystem identifier for the file.

> No such thing.  It's not the matter of having a syscall to extract such
> identifier - it's that on a lot of filesystems (including many common Unix
> ones) there's nothing that would qualify.

Even if the files in question aren't being modified, created, deleted,
etc.?  Even if nothing on the filesystem is being modified, created,
deleted, etc.?

> [snip]

-- 
Jeremy Maitin-Shepard

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-05  1:43                                             ` Jeremy Maitin-Shepard
@ 2004-01-05  1:58                                               ` viro
  2004-01-05  2:12                                                 ` Jeremy Maitin-Shepard
  0 siblings, 1 reply; 158+ messages in thread
From: viro @ 2004-01-05  1:58 UTC (permalink / raw)
  To: Jeremy Maitin-Shepard; +Cc: linux-kernel

On Sun, Jan 04, 2004 at 08:43:27PM -0500, Jeremy Maitin-Shepard wrote:

> Unfortunately, programs such as tar depend on inode numbers of distinct
> files being distinct even when the file is not open over a period of
> several minutes/seconds.  This is needed to avoid dumping hard links
> more than once.  Furthermore, there is no efficient way to write
> programs such as tar without depending on this capability.  Thus, if
> st_ino cannot be used reliably for this purpose, it would be useful for
> there to be a system call for retrieving a true
> unique-within-the-filesystem identifier for the file.

No such thing.  It's not the matter of having a syscall to extract such
identifier - it's that on a lot of filesystems (including many common Unix
ones) there's nothing that would qualify.

Note that tar et.al. do not behave well if used on actively modified directory
tree and ->st_ino reuse is the least of the problems in that area.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 23:35                                           ` Valdis.Kletnieks
@ 2004-01-05  1:43                                             ` Jeremy Maitin-Shepard
  2004-01-05  1:58                                               ` viro
  0 siblings, 1 reply; 158+ messages in thread
From: Jeremy Maitin-Shepard @ 2004-01-05  1:43 UTC (permalink / raw)
  To: linux-kernel

Valdis.Kletnieks@vt.edu writes:

> On Sun, 04 Jan 2004 23:01:04 +0100, Andries Brouwer said:
>> A common Unix idiom is testing for the identity
>> of two files by comparing st_ino and st_dev.
>> A broken idiom?

> Comparing two of these obtained at the same time is *usually* a good
> test, although racy even on current systems. (Consider the case of an
> unlink()/creat() pair between the two stat() calls - there's been more than
> one race condition resulting in a security hole based on THIS one).  It's
> only safe if you actually have an open reference to both files before you
> fstat() either one.  And yes, it has to be fstat(), as you can't guarantee
> that the file referenced by path in stat() is the one you did an
> open() on.

Unfortunately, programs such as tar depend on inode numbers of distinct
files being distinct even when the file is not open over a period of
several minutes/seconds.  This is needed to avoid dumping hard links
more than once.  Furthermore, there is no efficient way to write
programs such as tar without depending on this capability.  Thus, if
st_ino cannot be used reliably for this purpose, it would be useful for
there to be a system call for retrieving a true
unique-within-the-filesystem identifier for the file.

-- 
Jeremy Maitin-Shepard

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:37                                           ` viro
@ 2004-01-05  1:02                                             ` Mark Mielke
  2004-01-05  2:24                                               ` Valdis.Kletnieks
  2004-01-05  2:29                                             ` Andries Brouwer
  1 sibling, 1 reply; 158+ messages in thread
From: Mark Mielke @ 2004-01-05  1:02 UTC (permalink / raw)
  To: viro
  Cc: Andries Brouwer, Linus Torvalds, Rob Love, rob, Pascal Schmidt,
	linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 10:37:10PM +0000, viro@parcelfarce.linux.theplanet.co.uk wrote:
> On Sun, Jan 04, 2004 at 11:01:04PM +0100, Andries Brouwer wrote:
> > A common Unix idiom is testing for the identity
> > of two files by comparing st_ino and st_dev.
> > A broken idiom?
> 	No, just your usual highly selective reading.  First of all, that
> idiom relies only on different ->s_dev *among* *currently* *mounted*
> *filesystems*.
> ...
> Now, care to explain how preserving aforementioned common Unix idiom
> is related to your expostulations?

I think he is defending bad design practices by pointing out common
bad design practices, and asking why these bad practices shouldn't be
allowed to continue, given that they are so common... :-)

Are there any real programs that assume st_dev/st_ino values are constant
across mount/unmount/mount? If so, Linus is saying we should break these
programs, so that the authors can become aware of the problem, rather than
leaving the problem as a subtle corner case.

I see no reason at all to keep these programs running. They are incorrect,
and that is that.

If and when this comes up in 2.7 development, I would like to see an
option of the sort: 1) Try to maintain major:minor numbers across
reboots (even at the expense of complexity and efficiency), 2) Try to
maintain a subset of the major:minor numbers across reboots
(compromise) 3) Provide the most efficient implementation, making no
guarantees regarding the numbering scheme, unless using a numbering
scheme turns out to be more efficient. Deprecate 1), and let 2) and 3)
evolve until we see who the victor is... :-) As long as the interface
that maps device to number is abstracted, the above should be pluggable.

mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:01                                         ` Andries Brouwer
  2004-01-04 22:37                                           ` viro
  2004-01-04 22:37                                           ` Helge Hafting
@ 2004-01-04 23:35                                           ` Valdis.Kletnieks
  2004-01-05  1:43                                             ` Jeremy Maitin-Shepard
  2004-01-05  2:52                                           ` Linus Torvalds
  2004-01-05  4:15                                           ` Peter Chubb
  4 siblings, 1 reply; 158+ messages in thread
From: Valdis.Kletnieks @ 2004-01-04 23:35 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 790 bytes --]

On Sun, 04 Jan 2004 23:01:04 +0100, Andries Brouwer said:

> A common Unix idiom is testing for the identity
> of two files by comparing st_ino and st_dev.
> A broken idiom?

Comparing two of these obtained at the same time is *usually* a good
test, although racy even on current systems. (Consider the case of an
unlink()/creat() pair between the two stat() calls - there's been more than
one race condition resulting in a security hole based on THIS one).  It's
only safe if you actually have an open reference to both files before you
fstat() either one.  And yes, it has to be fstat(), as you can't guarantee
that the file referenced by path in stat() is the one you did an open() on.

Comparing the st_ino/st_dev for a file to day with one from last Friday has
NEVER been a good idea.

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:01                                         ` Andries Brouwer
  2004-01-04 22:37                                           ` viro
@ 2004-01-04 22:37                                           ` Helge Hafting
  2004-01-04 23:35                                           ` Valdis.Kletnieks
                                                             ` (2 subsequent siblings)
  4 siblings, 0 replies; 158+ messages in thread
From: Helge Hafting @ 2004-01-04 22:37 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 11:01:04PM +0100, Andries Brouwer wrote:
> On Sun, Jan 04, 2004 at 01:05:20PM -0800, Linus Torvalds wrote:
> 
> > Oh, _I_ always understood. You were the one that was arguing for
> > stable numbers as somehow important.
> 
> Indeed. I said "preferably stable across reboots".
> 
> > I'm just telling you that they aren't stable, and that a
> > user application that depends on their stability or
> > their uniqueness is BROKEN.
> 
> Surprise! Are you leaving POSIX? Or ditching NFS?
> Or demanding that NFS servers must never reboot?
> 
> A common Unix idiom is testing for the identity
> of two files by comparing st_ino and st_dev.
> A broken idiom?
> 
> No idea what part of our Unix heritage you now have decided to call broken.
> 

You worry about /dev over nfs, with the server booting in the middle of
such a comparison?  This can work even with randomized device numbers,
just don't let that nfs server populate the exported /dev itself.

Let the client(s) run udev, and have one /dev for each on persistent 
storage.  If the nfs server reboots it simply keeps serving /dev's
in whatever shape the clients set them up with.

Helge Hafting

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 22:01                                         ` Andries Brouwer
@ 2004-01-04 22:37                                           ` viro
  2004-01-05  1:02                                             ` Mark Mielke
  2004-01-05  2:29                                             ` Andries Brouwer
  2004-01-04 22:37                                           ` Helge Hafting
                                                             ` (3 subsequent siblings)
  4 siblings, 2 replies; 158+ messages in thread
From: viro @ 2004-01-04 22:37 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 11:01:04PM +0100, Andries Brouwer wrote:
> A common Unix idiom is testing for the identity
> of two files by comparing st_ino and st_dev.
> A broken idiom?

	No, just your usual highly selective reading.  First of all, that
idiom relies only on different ->s_dev *among* *currently* *mounted*
*filesystems*.  In part that has anything to do with devices, it means
only one thing:

	Any two different block devices that are both currently opened by
	the kernel and are both alive must have different device numbers.

Note the "are alive" part - we can even allow reuse of device numbers
as long as we make sure that stat() will fail on filesystems mounted
from dead ones.

Now, care to explain how preserving aforementioned common Unix idiom
is related to your expostulations?

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 21:05                                       ` Linus Torvalds
@ 2004-01-04 22:01                                         ` Andries Brouwer
  2004-01-04 22:37                                           ` viro
                                                             ` (4 more replies)
  0 siblings, 5 replies; 158+ messages in thread
From: Andries Brouwer @ 2004-01-04 22:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 01:05:20PM -0800, Linus Torvalds wrote:

> Oh, _I_ always understood. You were the one that was arguing for
> stable numbers as somehow important.

Indeed. I said "preferably stable across reboots".

> I'm just telling you that they aren't stable, and that a
> user application that depends on their stability or
> their uniqueness is BROKEN.

Surprise! Are you leaving POSIX? Or ditching NFS?
Or demanding that NFS servers must never reboot?

A common Unix idiom is testing for the identity
of two files by comparing st_ino and st_dev.
A broken idiom?

No idea what part of our Unix heritage you now have decided to call broken.

Andries



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04 13:21                                     ` Andries Brouwer
@ 2004-01-04 21:05                                       ` Linus Torvalds
  2004-01-04 22:01                                         ` Andries Brouwer
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-04 21:05 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sun, 4 Jan 2004, Andries Brouwer wrote:
>
> On Sat, Jan 03, 2004 at 07:04:17PM -0800, Linus Torvalds wrote:
> > 
> > I agree that for a stable kernel we should then go back to "best effort" 
> > mode, where for simple politeness reasons we should try to keep device 
> > numbers as stable as we can.
> 
> Good - you understand now.

Oh, _I_ always understood. You were the one that was arguing for stable
numbers as somehow important. I'm just telling you that they aren't
stable, and that a user application that depends on their stability or
their uniqieness is BROKEN.

> So, the right setup - you call it politeness, I call it quality
> of implementation - is to have both stable names and stable numbers,
> in as many cases as possible.

And I still disagree. You seem to think that this is an "absolute 
goodness", and call it a quality issue.

While I personally strongly believe that it is a bug in user space to
care, and that it is not a quality issue at all, but rather a "allow buggy
and/or nonconverted user space to work".

In other words, it's not about "quality", as much as about compatibility 
with applications that are old and/or braindead. Big difference.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04  1:54                                 ` Valdis.Kletnieks
@ 2004-01-04 18:44                                   ` Mark Mielke
  0 siblings, 0 replies; 158+ messages in thread
From: Mark Mielke @ 2004-01-04 18:44 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Andries Brouwer, Linus Torvalds, Rob Love, rob, Pascal Schmidt,
	linux-kernel, Greg KH

On Sat, Jan 03, 2004 at 08:54:36PM -0500, Valdis.Kletnieks@vt.edu wrote:
> ISTR that SunOS 4.0 handled an NFS-mounted /dev and swap just fine
> some 15 years ago? (in fact, due to performance differences between
> the disks on a Sun3/ 2xx server and the shoebox disk on a 3/50, you
> could page faster over the net than to a local /dev/swap).

Whether it did at some point, or whether it didn't, doesn't really matter.

It doesn't need to, and with the amount of memory that most computers come
with these days, remote access storage for tiny kernel data structures, like
that which would be required for tmpfs /dev that is only populated with the
devices that actually exist, just isn't worth it.

> So it's more a case of "we have decided to do it differently" than
> "that's so nuts that it shouldn't be expected to work"....

I was saying "why do you think this is a good model?" not "I can't imagine
why you would do it..." :-) Sorry it didn't come across as I intended.

mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04  3:04                                   ` Linus Torvalds
@ 2004-01-04 13:21                                     ` Andries Brouwer
  2004-01-04 21:05                                       ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-04 13:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sat, Jan 03, 2004 at 07:04:17PM -0800, Linus Torvalds wrote:

> I agree that for a stable kernel we should then go back to "best effort" 
> mode, where for simple politeness reasons we should try to keep device 
> numbers as stable as we can.

Good - you understand now.
So, the right setup - you call it politeness, I call it quality
of implementation - is to have both stable names and stable numbers,
in as many cases as possible.

Concerning the names, we are in reasonable shape. We have nameif
that binds a stable name to a MAC address. Much beter than eth2.
Also udev is a good step in the right direction - it gives
stable names under certain circumstances.

(And since udev can use the kernel device number, it can give stable
names under more circumstances when the kernel device number is
more often stable.)

Concerning the numbers, numbers based on enumeration are less than
satisfactory - they must be the last fallback when nothing else
can be found. And the ordering then is the ordering in time.

Almost always something better can be found. It is the drivers' job
to invent the device number. For the important special case of
SCSI or IDE disk, the disk serial number can be used.

Our helper function takes a string and an integer and a range, and
produces a device number in the given range, distinct from already
existing numbers. If you prefer random device numbers you make this
function ignore the string argument. I prefer stable device numbers
so would do an md5sum-like thing.

And that brings us back to the start of this thread:
Life is simpler when there is more room.
So it is a pity that we chose for less room.

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04  8:57                 ` Greg KH
@ 2004-01-04  9:43                   ` Rob Landley
  0 siblings, 0 replies; 158+ messages in thread
From: Rob Landley @ 2004-01-04  9:43 UTC (permalink / raw)
  To: Greg KH; +Cc: Kai Henningsen, linux-kernel

On Sunday 04 January 2004 02:57, Greg KH wrote:
> On Fri, Jan 02, 2004 at 01:26:44AM -0600, Rob Landley wrote:
> > > Moral: keep the identifier creation framework flexible enough so that
> > > you can chose device-specific means to produce useful identifiers.
> > > (And, use long identifiers, as they're less likely to be duplicated in
> > > general.)
> >
> > Seems to be what udev is for.  When we do go to random major and minor
> > numbers, maybe it would be useful to let udev request specific ones? 
> > (Just a thought...)
>
> Let udev request specific what?  Major/minor numbers?  Huh?  I think you
> are very confused here...

Currently, NFS exports are using device major/minor as part of the identifier 
for an exported direcory, and device numbers are going to be dynamically 
allocated in 2.7 to support hotplug, so i was wondering if there was a need 
to have some way for root to go "I know this device hotplugged in at major 3 
minor 99, but if major 53 minor 12 is free, could you change it to that?")  A 
bit like dup2, only for devices.

The discussion has moved on since then, and now it seems pretty clear that NFS 
is going to be expected to use something OTHER than device numbers, and Linus 
wants a clean break with device nodes being cookies.  Better solution all 
around, really...

But the original question did make sense.  (The answer was "no", but that's 
often the sign of a good question. :)

> thanks,
>
> greg k-h

Rob


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-02  7:26               ` Rob Landley
@ 2004-01-04  8:57                 ` Greg KH
  2004-01-04  9:43                   ` Rob Landley
  0 siblings, 1 reply; 158+ messages in thread
From: Greg KH @ 2004-01-04  8:57 UTC (permalink / raw)
  To: Rob Landley; +Cc: Kai Henningsen, linux-kernel

On Fri, Jan 02, 2004 at 01:26:44AM -0600, Rob Landley wrote:
> > Moral: keep the identifier creation framework flexible enough so that you
> > can chose device-specific means to produce useful identifiers. (And, use
> > long identifiers, as they're less likely to be duplicated in general.)
> 
> Seems to be what udev is for.  When we do go to random major and minor 
> numbers, maybe it would be useful to let udev request specific ones?  (Just a 
> thought...)

Let udev request specific what?  Major/minor numbers?  Huh?  I think you
are very confused here...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04  2:49                                 ` Andries Brouwer
@ 2004-01-04  3:04                                   ` Linus Torvalds
  2004-01-04 13:21                                     ` Andries Brouwer
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-04  3:04 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sun, 4 Jan 2004, Andries Brouwer wrote:
> 
> You write long stories - but it really is desirable to have
> stable device numbers.

And I write the long stories because you do not seem to _get_ the point.

The point is that we will most likely ON PURPOSE break those stable device 
numbers, for debugging reasons. Because it is _not_ desirable to have 
people _believe_ that they can depend on stable device numbers.

> I don't see why that would be relevant. One identifies
> things by their UUID. Order is never important.

And this is exactly how it should be. However, it requires that user code 
actually does the right thing.

And to _verify_ that user code properly identifies devices by other things 
than device numbers, we should during 2.7.x explicitly _break_ all 
dependencies on stable device numbers.

And UUID's are _not_ "device numbers". They fundamentally _cannot_ be 
that, because the kernel just doesn't have any information on how to 
generate a unique identifier that is actually stable.

The kernel doesn't know what it can depend on - should it look at the UUID
in the boot sector of the disk, or should it look up the UUID using IP
number reverse lookup, or what? 

The only thing that can generate a UUID is literally user mode. Which is 
_exactly_ why things like udev exists.

So device numbers are _not_ UUID's. Device numbers are needed before the 
UUID's have been identified. 

And that has been my point all along: device numbers do not have any
meaning. They are neither unique nor stable across reboots. They have no
information AT ALL associated with them. Anybody who thinks that they are
is fundamentally _wrong_ about it.

I agree that for a stable kernel we should then go back to "best effort" 
mode, where for simple politeness reasons we should try to keep device 
numbers as stable as we can. 

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04  2:09                               ` Linus Torvalds
@ 2004-01-04  2:49                                 ` Andries Brouwer
  2004-01-04  3:04                                   ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-04  2:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sat, Jan 03, 2004 at 06:09:47PM -0800, Linus Torvalds wrote:
> On Sun, 4 Jan 2004, Andries Brouwer wrote:

> > Empty talk. This is not about finding and fixing bugs.
> > We know very precisely what properties the NFS protocol has.
> > Now one can have a system that works as well as possible with NFS.
> > And one can have a worse system.
> 
> Oh, things can be _much_ worse than /dev over NFS.

Yes, but why do you start saying that?

Our topic is the statement that it is good to have device numbers
stable across a reboot. Not absolutely necessary, but good.

For example, given an NFS mount, if the server reboots and
suddenly the client sees different stat data, that would be
less than optimal. A low quality NFS implementation.

You write long stories - but it really is desirable to have
stable device numbers.

> You don't seem to realize what I mean with "not enumerable".

One of your side avenues is the matter of enumeration.
I don't see why that would be relevant. One identifies
things by their UUID. Order is never important.

> And there just _isn't_ any way to make them the same or to "describe" the 
> storage in any integer of any finite length. It has nothing to do with 
> 32-bit vs 64-bit vs 1024-bit.

A UUID usually takes 128 bits.

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 23:08                             ` Andries Brouwer
  2004-01-04  1:16                               ` Mark Mielke
@ 2004-01-04  2:09                               ` Linus Torvalds
  2004-01-04  2:49                                 ` Andries Brouwer
  1 sibling, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-04  2:09 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sun, 4 Jan 2004, Andries Brouwer wrote:
> 
> Empty talk. This is not about finding and fixing bugs.
> We know very precisely what properties the NFS protocol has.
> Now one can have a system that works as well as possible with NFS.
> And one can have a worse system.

Oh, things can be _much_ worse than /dev over NFS. 

You don't seem to realize what I men with "not enumerable".

With NFS, you could have some strange per-mount device number mapping etc, 
and it wouldn't need to be all that complicated.

But if you start considering network-attached storage (as in "disks over
IP", not as in "samba"), the problem is that you fundamentally cannot
enumerate the things on a kernel level. EVER. There is no way to do
automatic discovery, because the bus fundamentally isn't enumerable. It
isn't even _repeatable_, ie if you do broadcast "tell me what disks
exists", the results won't be ordered some way.

In other words, the device numbers that eventually get attached to these 
disks (however the discovery ends up working - with the sysadmin 
explicitly mentioning them, or with some kind of broadcast protocol) 
simply WILL NOT NECESSARILY be the same across reboots. 

And there just _isn't_ any way to make them the same or to "describe" the 
storage in any integer of any finite length. It has nothing to do with 
32-bit vs 64-bit vs 1024-bit.

Once you accept that fact, you should accept the fact that device numbers 
not only have no meaning, they literally have no permanence across reboots 
either.

Yes, the common case is permanent. What I'm saying is that the common case 
_cannot_ be the generic case. 

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-04  1:16                               ` Mark Mielke
@ 2004-01-04  1:54                                 ` Valdis.Kletnieks
  2004-01-04 18:44                                   ` Mark Mielke
  0 siblings, 1 reply; 158+ messages in thread
From: Valdis.Kletnieks @ 2004-01-04  1:54 UTC (permalink / raw)
  To: Mark Mielke
  Cc: Andries Brouwer, Linus Torvalds, Rob Love, rob, Pascal Schmidt,
	linux-kernel, Greg KH

[-- Attachment #1: Type: text/plain, Size: 772 bytes --]

On Sat, 03 Jan 2004 20:16:26 EST, Mark Mielke said:

> It seems to me that as long as /dev is always a local mount (tmpfs in
> the case of an NFS-root installation), it doesn't really matter. Maintaining
> system-specific information on a remote machine seems dirty, and something
> that shouldn't be *expected* to work. You wouldn't expect /proc to work
> over NFS, would you? :-)

ISTR that SunOS 4.0 handled an NFS-mounted /dev and swap just fine some 15
years ago? (in fact, due to performance differences between the disks on a Sun3/
2xx server and the shoebox disk on a 3/50, you could page faster over the net
than to a local /dev/swap).

So it's more a case of "we have decided to do it differently" than "that's so nuts
that it shouldn't be expected to work"....

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 23:08                             ` Andries Brouwer
@ 2004-01-04  1:16                               ` Mark Mielke
  2004-01-04  1:54                                 ` Valdis.Kletnieks
  2004-01-04  2:09                               ` Linus Torvalds
  1 sibling, 1 reply; 158+ messages in thread
From: Mark Mielke @ 2004-01-04  1:16 UTC (permalink / raw)
  To: Andries Brouwer
  Cc: Linus Torvalds, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sun, Jan 04, 2004 at 12:08:40AM +0100, Andries Brouwer wrote:
> On Sat, Jan 03, 2004 at 02:27:47PM -0800, Linus Torvalds wrote:
> > And then a high-quality implementation actually ends up being 
> > _detrimental_. It's hiding problems that can still happen, they just 
> > happen rarely enough that the bugs don't get found and fixed.
> Empty talk. This is not about finding and fixing bugs.
> We know very precisely what properties the NFS protocol has.
> Now one can have a system that works as well as possible with NFS.
> And one can have a worse system.

It seems to me that as long as /dev is always a local mount (tmpfs in
the case of an NFS-root installation), it doesn't really matter. Maintaining
system-specific information on a remote machine seems dirty, and something
that shouldn't be *expected* to work. You wouldn't expect /proc to work
over NFS, would you? :-)

mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 22:27                           ` Linus Torvalds
@ 2004-01-03 23:08                             ` Andries Brouwer
  2004-01-04  1:16                               ` Mark Mielke
  2004-01-04  2:09                               ` Linus Torvalds
  0 siblings, 2 replies; 158+ messages in thread
From: Andries Brouwer @ 2004-01-03 23:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Sat, Jan 03, 2004 at 02:27:47PM -0800, Linus Torvalds wrote:

> > Sure. It is not "need". It is "quality of implementation".
> > Consider NFS.

> And then a high-quality implementation actually ends up being 
> _detrimental_. It's hiding problems that can still happen, they just 
> happen rarely enough that the bugs don't get found and fixed.

Empty talk. This is not about finding and fixing bugs.
We know very precisely what properties the NFS protocol has.
Now one can have a system that works as well as possible with NFS.
And one can have a worse system.

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 22:16       ` Greg KH
@ 2004-01-03 22:33         ` Christoph Hellwig
  0 siblings, 0 replies; 158+ messages in thread
From: Christoph Hellwig @ 2004-01-03 22:33 UTC (permalink / raw)
  To: Greg KH; +Cc: Witukind, linux-kernel, linux-hotplug-devel

On Sat, Jan 03, 2004 at 02:16:04PM -0800, Greg KH wrote:
> > If devfs works good on FreeBSD, it probably means that the current
> > devfs for Linux is badly designed, not that the idea of devfs is bad.
> 
> I have no idea how FreeBSD implemented devfs.
> 
> If you know how FreeBSD implemented devfs, and how it solves all of the
> problems that I detailed in my original posting, I would be interested.

The FreeBSD implementation is pretty similar to the devfs we have in 2.6
API- and implementation wise.  Just because it works somehow in most
situation doesn't mean it's right..


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 13:10                         ` Andries Brouwer
@ 2004-01-03 22:27                           ` Linus Torvalds
  2004-01-03 23:08                             ` Andries Brouwer
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-03 22:27 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sat, 3 Jan 2004, Andries Brouwer wrote:
> 
> Sure. It is not "need". It is "quality of implementation".
> Consider NFS.

The problems occur when there are things we _cannot_ guarantee, and that
user space starts unnecessarily to depend on. And that ends up resulting
in bugs waiting to happen. Bugs that many "normal" developers may never 
hit, simply because the quality of implementation ends up being so good 
that it hides the problem cases in regular usage.

And then a high-quality implementation actually ends up being 
_detrimental_. It's hiding problems that can still happen, they just 
happen rarely enough that the bugs don't get found and fixed.

And then the painful thing of forcing "stupid", aka "bad QoI" behaviour, 
actually ends up being the better thing in the long run.

			Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
       [not found]     ` <20040103140140.3b848e9f.witukind@nsbm.kicks-ass.org>
@ 2004-01-03 22:16       ` Greg KH
  2004-01-03 22:33         ` Christoph Hellwig
  0 siblings, 1 reply; 158+ messages in thread
From: Greg KH @ 2004-01-03 22:16 UTC (permalink / raw)
  To: Witukind; +Cc: linux-kernel, linux-hotplug-devel

On Sat, Jan 03, 2004 at 02:01:40PM +0100, Witukind wrote:
> On Fri, 2 Jan 2004 21:59:38 -0800
> Greg KH <greg@kroah.com> wrote:
> 
> > On Thu, Jan 01, 2004 at 02:18:55AM +0100, Helge Hafting wrote:
> > > On Tue, Dec 30, 2003 at 04:29:42PM -0800, Greg KH wrote:
> > > > 
> > > >  2) We are (well, were) running out of major and minor numbers for
> > > >     devices.
> > > 
> > > devfs tried to fix this one by _getting rid_ of those numbers.
> > > Seriously - what are they needed for?  
> > 
> > But devfs failed in this.  The devfs kernel interface still requires a
> > major/minor number to create device nodes.
> 
> Let's be more precise and not say that "devfs" failed this, but that the
> current implementation of devfs failed this.

Um, that's all we have to go by right now, sorry.

> If devfs works good on FreeBSD, it probably means that the current
> devfs for Linux is badly designed, not that the idea of devfs is bad.

I have no idea how FreeBSD implemented devfs.

If you know how FreeBSD implemented devfs, and how it solves all of the
problems that I detailed in my original posting, I would be interested.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 15:22     ` Helge Hafting
  2004-01-03 21:18       ` viro
@ 2004-01-03 22:11       ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: Greg KH @ 2004-01-03 22:11 UTC (permalink / raw)
  To: Helge Hafting; +Cc: linux-hotplug-devel, linux-kernel

On Sat, Jan 03, 2004 at 04:22:41PM +0100, Helge Hafting wrote:
> > Hopefully I can work on fixing this up in 2.7.
> 
> Interesting - how do you plan to do this?  

Probably something like the current interface for USB minor numbers when
CONFIG_USB_DYNAMIC_MINORS is enabled.  The drivers will request a
certian major/minor, but the kernel will just give it whatever it feels
like.

That's my first guess, actual implementation will probably differ wildly
:)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03  6:51                   ` Valdis.Kletnieks
  2004-01-03 11:57                     ` Ian Kent
@ 2004-01-03 22:08                     ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: Greg KH @ 2004-01-03 22:08 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Sat, Jan 03, 2004 at 01:51:08AM -0500, Valdis.Kletnieks@vt.edu wrote:
> On Fri, 02 Jan 2004 22:07:48 PST, Greg KH <greg@kroah.com>  said:
> 
> > What is "efficiently"?  No one really cares about milliseconds here,
> > seconds are even tollerable at least for small seconds :)
> 
> Anybody who's had to sit and watch a Sun E10K enumerate 400+ disks
> will disagree with that, unless "small seconds" are tiny fractions thereof. :)

It's "small seconds" _after_ the kernel has enumerated them.  That's the
majority of the time spent enumerating scsi disks.

Also, udev will be running while the kernel is off detecting the next
disk.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03 15:22     ` Helge Hafting
@ 2004-01-03 21:18       ` viro
  2004-01-03 22:11       ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: viro @ 2004-01-03 21:18 UTC (permalink / raw)
  To: Helge Hafting; +Cc: Greg KH, linux-hotplug-devel, linux-kernel

On Sat, Jan 03, 2004 at 04:22:41PM +0100, Helge Hafting wrote:
> On Fri, Jan 02, 2004 at 09:59:38PM -0800, Greg KH wrote:
> > On Thu, Jan 01, 2004 at 02:18:55AM +0100, Helge Hafting wrote:
> > > On Tue, Dec 30, 2003 at 04:29:42PM -0800, Greg KH wrote:
> > > > 
> > > >  2) We are (well, were) running out of major and minor numbers for
> > > >     devices.
> > > 
> > > devfs tried to fix this one by _getting rid_ of those numbers.
> > > Seriously - what are they needed for?  
> > 
> > But devfs failed in this.  The devfs kernel interface still requires a
> > major/minor number to create device nodes.
> > 
> Yes.  The numbers went unused in the common case of opening a device by name though.

No, they were not.  RTFS, please.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03  5:59   ` Greg KH
@ 2004-01-03 15:22     ` Helge Hafting
  2004-01-03 21:18       ` viro
  2004-01-03 22:11       ` Greg KH
       [not found]     ` <20040103140140.3b848e9f.witukind@nsbm.kicks-ass.org>
  1 sibling, 2 replies; 158+ messages in thread
From: Helge Hafting @ 2004-01-03 15:22 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-hotplug-devel, linux-kernel

On Fri, Jan 02, 2004 at 09:59:38PM -0800, Greg KH wrote:
> On Thu, Jan 01, 2004 at 02:18:55AM +0100, Helge Hafting wrote:
> > On Tue, Dec 30, 2003 at 04:29:42PM -0800, Greg KH wrote:
> > > 
> > >  2) We are (well, were) running out of major and minor numbers for
> > >     devices.
> > 
> > devfs tried to fix this one by _getting rid_ of those numbers.
> > Seriously - what are they needed for?  
> 
> But devfs failed in this.  The devfs kernel interface still requires a
> major/minor number to create device nodes.
> 
Yes.  The numbers went unused in the common case of opening a device by name though.

> Hopefully I can work on fixing this up in 2.7.

Interesting - how do you plan to do this?  
There must be some connection from device node to driver.  Devfs had
a pointer in the inode.  The old way has numbers, and spend time on
a search.  

Are you considering a sort of "minimal devfs" managed by udev?

Helge Hafting

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03  4:46                       ` Linus Torvalds
@ 2004-01-03 13:10                         ` Andries Brouwer
  2004-01-03 22:27                           ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-03 13:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andries Brouwer, Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH

On Fri, Jan 02, 2004 at 08:46:33PM -0800, Linus Torvalds wrote:

> > Random cookies? I prefer "arbitrary" over "random". The value plays no role
> > at all, but it must be unique, preferably stable across reboots.
> 
> The operative word in "preferably stable across reboots" is
> "preferably". Because it basically cannot be in the general case,
> and thus nothing must ever _assume_ it is.

Sure. It is not "need". It is "quality of implementation".
Consider NFS.

Andries


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03  6:51                   ` Valdis.Kletnieks
@ 2004-01-03 11:57                     ` Ian Kent
  2004-01-03 22:08                     ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: Ian Kent @ 2004-01-03 11:57 UTC (permalink / raw)
  To: Kernel Mailing List

On Sat, 3 Jan 2004 Valdis.Kletnieks@vt.edu wrote:

> On Fri, 02 Jan 2004 22:07:48 PST, Greg KH <greg@kroah.com>  said:
> 
> > What is "efficiently"?  No one really cares about milliseconds here,
> > seconds are even tollerable at least for small seconds :)
> 
> Anybody who's had to sit and watch a Sun E10K enumerate 400+ disks
> will disagree with that, unless "small seconds" are tiny fractions thereof. :)
> 
> 
> 

Even an old E3500 with only 70 or so disks and the evil RDAC is enough.

Ian



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03  6:07                 ` Greg KH
@ 2004-01-03  6:51                   ` Valdis.Kletnieks
  2004-01-03 11:57                     ` Ian Kent
  2004-01-03 22:08                     ` Greg KH
  0 siblings, 2 replies; 158+ messages in thread
From: Valdis.Kletnieks @ 2004-01-03  6:51 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

On Fri, 02 Jan 2004 22:07:48 PST, Greg KH <greg@kroah.com>  said:

> What is "efficiently"?  No one really cares about milliseconds here,
> seconds are even tollerable at least for small seconds :)

Anybody who's had to sit and watch a Sun E10K enumerate 400+ disks
will disagree with that, unless "small seconds" are tiny fractions thereof. :)



[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
       [not found]               ` <20040102103104.GA28168@mark.mielke.cc>
@ 2004-01-03  6:07                 ` Greg KH
  2004-01-03  6:51                   ` Valdis.Kletnieks
  0 siblings, 1 reply; 158+ messages in thread
From: Greg KH @ 2004-01-03  6:07 UTC (permalink / raw)
  To: Maciej Zenczykowski, Rob Landley, Rob Love, Andries Brouwer,
	Pascal Schmidt, linux-kernel

On Fri, Jan 02, 2004 at 05:31:04AM -0500, Mark Mielke wrote:
> On Fri, Jan 02, 2004 at 01:17:20AM +0100, Maciej Zenczykowski wrote:
> > Wouldn't this be a classical birthday problem with 50% collision chance
> > popping up in and around a few hundred devices? [20 for 8 bits, 23 for
> > 365, 302 for 16 bits, 77163 for 32 bits], and that's only in a single
> > system - with hundreds of thousands of systems even a 0.1% collision rate
> > is deadly. [0.1% collision rate at 32 bits with 2932 devices]  Even with 
> > only 300 devices per system, you'll still get a collision (at 32 bits) on 
> > more than 1 system in a hundred thousand.
> 
> I don't see this (multiple systems) as relevant. Device numbers do not need
> to be unique across systems, and they shouldn't even need to be unique across
> system reboots. Even when collisions occur, it doesn't matter, as it can just
> pick a different random number, or follow a free list, or hundreds of other
> algorithms.
> 
> Isn't this all just a question of device registration performance? 1) The
> device module needs to register the appropriate numbers efficiently.

What is "efficiently"?  No one really cares about milliseconds here,
seconds are even tollerable at least for small seconds :)

> 2) /dev needs to be populated or updated efficiently. devfs tried for
> a just in time approach, whereas udev tries for a proactive approach.

"proactive"?  udev is "reactive" in that it reacts to the number that
the kernel exports to userspace.  That's all.

Remember, devfs also uses those same, hardcoded numbers...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-02  0:17                 ` Hollis Blanchard
  2004-01-02  0:36                   ` viro
@ 2004-01-03  6:04                   ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: Greg KH @ 2004-01-03  6:04 UTC (permalink / raw)
  To: Hollis Blanchard
  Cc: Tommi Virtanen, Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel

On Thu, Jan 01, 2004 at 06:17:43PM -0600, Hollis Blanchard wrote:
> On Wednesday, Dec 31, 2003, at 15:52 US/Central, Tommi Virtanen wrote:
> >I think devfs names are accepted as root= arguments, so that's a bit of
> >a loss.. with udev, your /dev and your root= are equal only if you
> >follow the standard naming.
> >
> >For root=, I can see how early userspace can move that to userspace.
> >But what about swsuspend?
> >
> >Are there any more kernel options taking file names? I think now would
> >be a good time to stop adding more of them :)
> 
> "console=" takes driver-supplied names which usually happen to match 
> /dev node names. For example, drivers/serial/8250.c names itself 
> "ttyS", so "console=ttyS0" will end up going to that driver, regardless 
> of the state of /dev.

These are just string matches that the different console drivers use.
They have nothing to do with an actual /dev node.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01  1:18 ` Helge Hafting
@ 2004-01-03  5:59   ` Greg KH
  2004-01-03 15:22     ` Helge Hafting
       [not found]     ` <20040103140140.3b848e9f.witukind@nsbm.kicks-ass.org>
  0 siblings, 2 replies; 158+ messages in thread
From: Greg KH @ 2004-01-03  5:59 UTC (permalink / raw)
  To: Helge Hafting; +Cc: linux-hotplug-devel, linux-kernel

On Thu, Jan 01, 2004 at 02:18:55AM +0100, Helge Hafting wrote:
> On Tue, Dec 30, 2003 at 04:29:42PM -0800, Greg KH wrote:
> > 
> >  2) We are (well, were) running out of major and minor numbers for
> >     devices.
> 
> devfs tried to fix this one by _getting rid_ of those numbers.
> Seriously - what are they needed for?  

But devfs failed in this.  The devfs kernel interface still requires a
major/minor number to create device nodes.

Hopefully I can work on fixing this up in 2.7.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-03  3:00                     ` Andries Brouwer
@ 2004-01-03  4:46                       ` Linus Torvalds
  2004-01-03 13:10                         ` Andries Brouwer
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-03  4:46 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Rob Love, rob, Pascal Schmidt, linux-kernel, Greg KH



On Sat, 3 Jan 2004, Andries Brouwer wrote:
> 
> > Note that one reason I didn't much like the 64-bit versions is that not 
> > only are they bigger, they also encourage insanity. Ie you'd find SCSI 
> > people who want to try to encode device/controller/bus/target/lun info 
> > into the device number. 
> 
> Weak. "We don't want this power that has good uses because it also
> can be used stupidly." That is not Unix-style.

No.

That's not the argument: the argument is that the _only_ thing that 64-bit 
stuff can be used for is stupid things.

For everything else, a 32-bit dev_t is sufficient.

And the UNIX way is definitely: "do one thing, and do it well" and "small
is beautiful". It has _never_ been "overdesign everything to accomodate
stupidity".

You may have confused UNIX with Multics. Where overdesign was the rule, 
not the exception.

> > We should resist any effort that makes the numbers "mean" something. They 
> > are random cookies. Not "unique identifiers", and not "addresses".
> 
> Random cookies? I prefer "arbitrary" over "random". The value plays no role
> at all, but it must be unique, preferably stable across reboots.

Don't use "unique". It has way too many connotations of _true_ uniqieness 
in computer science.

And the operative word in "preferably stable across reboots" is
"preferably". Because it basically cannot be in the general case (it 
can't be unique for things that aren't enumerable, and clearly a lot of 
things aren't), and thus nothing must ever _assume_ it is.

And the thing is, to break those wrong assumptions (that are true in many
common cases, but are _not_ true in the rare general case), we may have to
actively do things that are "silly" on purpose. For example, for 
debugging, we start the "jiffies" counter not at zero, but at -300. That's 
patently _silly_, but it was very useful in finding the cases where the 
rare general case was not handled correctly.

Similarly, I'll probably advocate at some point (when distributions are
using udev) that we purposefully try to make device numbers _unstable_
across reboots, to find cases that do the wrong thing and have things
hardcoded. Exactly to find and fix them, so that the distribution works 
correctly even when things aren't enumerable.

(As to examples of inumerable devices, iSCSI comes to mind. As does pretty 
much anything else that is connected over IP - you can't even enumerate 
according to path or IP, since those may change too).

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-02 20:42                   ` Linus Torvalds
@ 2004-01-03  3:00                     ` Andries Brouwer
  2004-01-03  4:46                       ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-03  3:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rob Love, Andries Brouwer, rob, Pascal Schmidt, linux-kernel, Greg KH

On Fri, Jan 02, 2004 at 12:42:41PM -0800, Linus Torvalds wrote:

Hi Linus - A happy 2004 !


> Note that one reason I didn't much like the 64-bit versions is that not 
> only are they bigger, they also encourage insanity. Ie you'd find SCSI 
> people who want to try to encode device/controller/bus/target/lun info 
> into the device number. 

Weak. "We don't want this power that has good uses because it also
can be used stupidly." That is not Unix-style.

> We should resist any effort that makes the numbers "mean" something. They 
> are random cookies. Not "unique identifiers", and not "addresses".

Random cookies? I prefer "arbitrary" over "random". The value plays no role
at all, but it must be unique, preferably stable across reboots.

Andries




^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 15:54                 ` Rob Love
@ 2004-01-02 20:42                   ` Linus Torvalds
  2004-01-03  3:00                     ` Andries Brouwer
  0 siblings, 1 reply; 158+ messages in thread
From: Linus Torvalds @ 2004-01-02 20:42 UTC (permalink / raw)
  To: Rob Love; +Cc: Andries Brouwer, rob, Pascal Schmidt, linux-kernel, Greg KH



On Thu, 1 Jan 2004, Rob Love wrote:
>
> On Thu, 2004-01-01 at 10:48, Andries Brouwer wrote: 
> > I am afraid I have to disappoint you. I made them 64-bit,
> > and I think they were 64-bit for a few months in the -mm tree,
> > forgot the details, but unfortunately Al went back to 32-bit again.
> 
> You did disappoint me!  My heart is crushed and my aspirations for the
> future ruined.
> 
> But you are right, dunno what I was thinking.

Note that one reason I didn't much like the 64-bit versions is that not 
only are they bigger, they also encourage insanity. Ie you'd find SCSI 
people who want to try to encode device/controller/bus/target/lun info 
into the device number. 

We should resist any effort that makes the numbers "mean" something. They 
are random cookies. Not "unique identifiers", and not "addresses".

The unique identifiers you get from things like udev, using contents of
the device itself or user preferences etc. That's outside the scope of the
kernel. The addresses you get from /sys.

		Linus

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-02 17:54 ` Andreas Jellinghaus
@ 2004-01-02 18:19   ` Shawn
  0 siblings, 0 replies; 158+ messages in thread
From: Shawn @ 2004-01-02 18:19 UTC (permalink / raw)
  To: Andreas Jellinghaus; +Cc: linux-kernel

Let me begin by pointing out that I was a proponent of devfs from when
it first got written.

On Fri, 2004-01-02 at 11:54, Andreas Jellinghaus wrote:
> On Wed, 31 Dec 2003 00:32:58 +0000, Greg KH wrote:
> > The Problems:
> >  1) A static /dev is unwieldy and big.  It would be nice to only show
> >     the /dev entries for the devices we actually have running in the
> >     system.
> neither devfs nor udev handle the virtual part. only devpts does, 
> and only for one special class of virtual devices. and usb devices
> are neither handled by devfs nor udev, but by usbfs.
I'm thinking maybe this is just fine.

> Actually udev is a regression:
>  - devfs was a first efford at a sane /dev naming policy, udev returns to
>    the old and cryptic lsb device naming.
Every way of doing things is just another say of doing it. Location
based naming has it's major issues. It's solved by UUID or LABEL, so
device naming is just a matter of preference anyway. You can change it
with udev, IIRC. You could not with devfs. Chances are you use devfsd
anyway, right?

>  - devfs made makedev obsolete, udev doesn't work without it / can
>    currently not create all devices because of missing sysfs support.
No one is saying it is currently perfect for everyone, however, it suits
many people just fine. devfs went through the same thing and this is an
invalid argument when debating the technical merit of either.

> Ignore this mail if you want, but people might be unhappy with udev
> because of these regressions and not caring about it will not improve
> the situation.
By the time devfs goes away enough testing will have happened. Don't
look for it to go away within 2.6.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31  0:29 Greg KH
                   ` (2 preceding siblings ...)
  2004-01-01  1:18 ` Helge Hafting
@ 2004-01-02 17:54 ` Andreas Jellinghaus
  2004-01-02 18:19   ` Shawn
  3 siblings, 1 reply; 158+ messages in thread
From: Andreas Jellinghaus @ 2004-01-02 17:54 UTC (permalink / raw)
  To: linux-kernel

On Wed, 31 Dec 2003 00:32:58 +0000, Greg KH wrote:
> The Problems:
>  1) A static /dev is unwieldy and big.  It would be nice to only show
>     the /dev entries for the devices we actually have running in the
>     system.

last time i checked, devices for physical resources are only a part
of the devices in /dev. the other big part are those devices for
virtual resources, like virtual master/slave tty, network block devices,
loop devices, virtual consoles, etc.

neither devfs nor udev handle the virtual part. only devpts does, 
and only for one special class of virtual devices. and usb devices
are neither handled by devfs nor udev, but by usbfs.

Actually udev is a regression:
 - devfs was a first efford at a sane /dev naming policy, udev returns to
   the old and cryptic lsb device naming.
 - devfs made makedev obsolete, udev doesn't work without it / can
   currently not create all devices because of missing sysfs support.

Ignore this mail if you want, but people might be unhappy with udev
because of these regressions and not caring about it will not improve
the situation.

Andreas


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 19:17   ` Greg KH
@ 2004-01-02 16:45     ` Shawn
  0 siblings, 0 replies; 158+ messages in thread
From: Shawn @ 2004-01-02 16:45 UTC (permalink / raw)
  To: Greg KH; +Cc: Prakash K. Cheemplavam, linux-hotplug-devel, linux-kernel

On Wed, 2003-12-31 at 13:17, Greg KH wrote:
> In fact, now that I know Gentoo works without devfs, I'm considering
> putting it on an old laptop I have around here...

If you use an "old" laptop you might want to use the distcc option... ;)
Unless you like you installs to take three weeks... Literally.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 19:43             ` Kai Henningsen
@ 2004-01-02  7:26               ` Rob Landley
  2004-01-04  8:57                 ` Greg KH
  0 siblings, 1 reply; 158+ messages in thread
From: Rob Landley @ 2004-01-02  7:26 UTC (permalink / raw)
  To: Kai Henningsen, linux-kernel

On Thursday 01 January 2004 13:43, Kai Henningsen wrote:
> rob@landley.net (Rob Landley)  wrote on 01.01.04 in 
<200401010634.28559.rob@landley.net>:
> > On Wednesday 31 December 2003 18:31, Rob Love wrote:
> > > On Wed, 2003-12-31 at 19:15, Andries Brouwer wrote:
> > > > My plan has been to essentially use a hashed disk serial number
> > > > for this "any old unique value". The problem is that "any old"
> > > > is easy enough, but "unique" is more difficult.
> > > > Naming devices is very difficult, but in some important cases,
> > > > like SCSI or IDE disks, that would work and give a stable name.
> > >
> > > Yup.
> > >
> > > > The kernel must not invent consecutive numbers - that does not
> > > > lead to stable names. Setting this up correctly is nontrivial.
> > >
> > > This is definitely an interesting problem space.
> > >
> > > I agree wrt just inventing consecutive numbers.  If there was a nice
> > > way to trivially generate a random and unique number from some
> > > device-inherent information, that would be nice.
> > >
> > > 	Rob Love
> >
> > Fundamental problem: "Unique" depends on the other devices in the system.
> > You can't guarantee unique by looking at one device, more or less by
> > definition.
>
> This is actually not fundamental at all.
>
> The best-known exception is probably the MAC address. But it is not the
> only example of devices having true unique information.

I thought of mentioning this, but deleted it as a digression.  But since you 
brought it up:

A) There are ethernet cards that have the same mac address.  (Over the years, 
the cheap manufacturers have managed to screw this up.  Ask Alan Cox.)  They 
show up randomly and cause real headaches for network administrators if you 
don't think to look for it.

B) You can override the mac address thing thing comes with.  This is done all 
the time.  (Hot failover comes to mind, but it's not the only one.  I 
remember how the cable modem company that serviced my mother's house snagged 
the mac address of the cable modem as part of the inital setup, and refused 
to work with a different mac address.  (I asked their support guys: They 
wanted to make sure you were still using the machine they'd installed their 
special software on, which was a windows machine and I was installing a linux 
firewall.  And predicting THIS digression: yes I power cycled and hit the 
reset button on the cable modem, it didn't help.  The problem was at the 
other end, their gateway dropped packets from the wrong mac address.)

So I changed the mac address of the other machine as part of its init scripts, 
and it worked again...

> It is certainly true, though, that there are devices without this kind of
> info.
>
> And remember that you can sometimes use secondary information. With any
> kind of read-write storage device, it might be possible to create such a
> piece of information and store it onto that device.

I.E. a udev config entry?

> Moral: keep the identifier creation framework flexible enough so that you
> can chose device-specific means to produce useful identifiers. (And, use
> long identifiers, as they're less likely to be duplicated in general.)

Seems to be what udev is for.  When we do go to random major and minor 
numbers, maybe it would be useful to let udev request specific ones?  (Just a 
thought...)

> MfG Kai

Rob


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 23:14           ` Rob
@ 2004-01-02  3:53             ` Tyler Hall
  0 siblings, 0 replies; 158+ messages in thread
From: Tyler Hall @ 2004-01-02  3:53 UTC (permalink / raw)
  To: rpc; +Cc: linux-kernel

Since we're moving toward treating device numbers as unique handles for 
devices in a system, why can't we just dynamically allocate them like 
process ID's? As each device driver loads and registers with the kernel, 
it can request a device number and the kernel can assign the next 
available one.

Tyler

Rob wrote:

>On Wednesday 31 December 2003 07:31 pm, Rob Love wrote:
>
><snip>
>  
>
>>This is definitely an interesting problem space.
>>
>>I agree wrt just inventing consecutive numbers.  If there was a nice way
>>to trivially generate a random and unique number from some
>>device-inherent information, that would be nice.
>>
>>	Rob Love
>>    
>>
>
>my first thought was hardware serial numbers, but i'm guessing they mostly 
>don't exist based on the discomfort caused by the pentium 3 serial number in 
>the past. my second thought was raw latency. in the real world, 2 identical 
>devices of any nature are going to respond electrically at different rates. i 
>kind of stole the concept from what i read about the i810 rng... quantum 
>differences can distinguish between 2 of anything, and based on the response 
>time, 'cookies' can be written out to keep them separately ID'd. some devices 
>will get slower over time, e.g. increasing error rates and aging silicon will 
>throw the 'cookie' off, so you'd re-calibrate every so often, like on a 
>reboot. those are rare for some of us ;)
>
>the big IF: can you measure that with enough precision to at least decrease 
>the probablity of collision? 
>
>  
>


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-02  0:17                 ` Hollis Blanchard
@ 2004-01-02  0:36                   ` viro
  2004-01-03  6:04                   ` Greg KH
  1 sibling, 0 replies; 158+ messages in thread
From: viro @ 2004-01-02  0:36 UTC (permalink / raw)
  To: Hollis Blanchard
  Cc: Tommi Virtanen, Rob Love, Nathan Conrad, Pascal Schmidt,
	linux-kernel, Greg KH

On Thu, Jan 01, 2004 at 06:17:43PM -0600, Hollis Blanchard wrote:
> "console=" takes driver-supplied names which usually happen to match 
> /dev node names. For example, drivers/serial/8250.c names itself 
> "ttyS", so "console=ttyS0" will end up going to that driver, regardless 
> of the state of /dev.
> 
> I'm not saying that's good or bad, but what's the alternative? 
> "console=class/tty/ttyS0"?

Console code will need serious work anyway; note that current names
do _not_ refer to tty devices - there is some overlap, but right now
we have
	* console drivers
	* some of them being connected with tty drivers; those can tell
which tty driver corresponds to them
	* console ouput code maintaining chain of console drivers; output
is sent to them, attempt to open() /dev/console ends up picking the first
console driver that has corresponding tty one (== has console->device())
and opening the tty device in question
	* unholy mess with redirects.

There's no device nodes for console drivers.  So names in console=... are
something very odd, indeed.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 21:52               ` Tommi Virtanen
@ 2004-01-02  0:17                 ` Hollis Blanchard
  2004-01-02  0:36                   ` viro
  2004-01-03  6:04                   ` Greg KH
  0 siblings, 2 replies; 158+ messages in thread
From: Hollis Blanchard @ 2004-01-02  0:17 UTC (permalink / raw)
  To: Tommi Virtanen
  Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Wednesday, Dec 31, 2003, at 15:52 US/Central, Tommi Virtanen wrote:
> I think devfs names are accepted as root= arguments, so that's a bit of
> a loss.. with udev, your /dev and your root= are equal only if you
> follow the standard naming.
>
> For root=, I can see how early userspace can move that to userspace.
> But what about swsuspend?
>
> Are there any more kernel options taking file names? I think now would
> be a good time to stop adding more of them :)

"console=" takes driver-supplied names which usually happen to match 
/dev node names. For example, drivers/serial/8250.c names itself 
"ttyS", so "console=ttyS0" will end up going to that driver, regardless 
of the state of /dev.

I'm not saying that's good or bad, but what's the alternative? 
"console=class/tty/ttyS0"?

-- 
Hollis Blanchard
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 12:34           ` Rob Landley
  2004-01-01 15:22             ` Rob Love
  2004-01-01 19:43             ` Kai Henningsen
@ 2004-01-02  0:17             ` Maciej Zenczykowski
       [not found]               ` <20040102103104.GA28168@mark.mielke.cc>
  2004-01-07 10:23             ` Olaf Hering
  3 siblings, 1 reply; 158+ messages in thread
From: Maciej Zenczykowski @ 2004-01-02  0:17 UTC (permalink / raw)
  To: Rob Landley
  Cc: Rob Love, Andries Brouwer, Pascal Schmidt, linux-kernel, Greg KH

> Solve 90% of the problem space and have a human deal with the exceptions.
> How big's the unique number being exported, anyway?  (If it's 32 bits, the 
> exceptions are 1 in 4 billion.  It may never be seen in the wild...)

Wouldn't this be a classical birthday problem with 50% collision chance
popping up in and around a few hundred devices? [20 for 8 bits, 23 for
365, 302 for 16 bits, 77163 for 32 bits], and that's only in a single
system - with hundreds of thousands of systems even a 0.1% collision rate
is deadly. [0.1% collision rate at 32 bits with 2932 devices]  Even with 
only 300 devices per system, you'll still get a collision (at 32 bits) on 
more than 1 system in a hundred thousand.

Cheers,
MaZe.



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01  0:31         ` Rob Love
  2004-01-01 12:34           ` Rob Landley
@ 2004-01-01 23:14           ` Rob
  2004-01-02  3:53             ` Tyler Hall
  1 sibling, 1 reply; 158+ messages in thread
From: Rob @ 2004-01-01 23:14 UTC (permalink / raw)
  To: linux-kernel

On Wednesday 31 December 2003 07:31 pm, Rob Love wrote:

<snip>
> This is definitely an interesting problem space.
>
> I agree wrt just inventing consecutive numbers.  If there was a nice way
> to trivially generate a random and unique number from some
> device-inherent information, that would be nice.
>
> 	Rob Love

my first thought was hardware serial numbers, but i'm guessing they mostly 
don't exist based on the discomfort caused by the pentium 3 serial number in 
the past. my second thought was raw latency. in the real world, 2 identical 
devices of any nature are going to respond electrically at different rates. i 
kind of stole the concept from what i read about the i810 rng... quantum 
differences can distinguish between 2 of anything, and based on the response 
time, 'cookies' can be written out to keep them separately ID'd. some devices 
will get slower over time, e.g. increasing error rates and aging silicon will 
throw the 'cookie' off, so you'd re-calibrate every so often, like on a 
reboot. those are rare for some of us ;)

the big IF: can you measure that with enough precision to at least decrease 
the probablity of collision? 

-- 
Rob Couto
rpc@cafe4111.org
Rules for computing success:
1) Attitude is no substitute for competence.
2) Ease of use is no substitute for power.
3) Safety matters; use a static-free hammer.
--


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 19:53   ` walt
@ 2004-01-01 21:53     ` Martin Schlemmer
  0 siblings, 0 replies; 158+ messages in thread
From: Martin Schlemmer @ 2004-01-01 21:53 UTC (permalink / raw)
  To: walt; +Cc: Linux Kernel Mailing Lists, Greg KH

[-- Attachment #1: Type: text/plain, Size: 828 bytes --]

On Thu, 2004-01-01 at 21:53, walt wrote:
> Martin Schlemmer wrote:
> > On Thu, 2004-01-01 at 00:17, walt wrote:
> 
> >> ...I have  not been able to get udev working yet...
> 
> > Hmm, It works fine here?  I was under the impression that
>  > it should _just_work_ if you have latest everything unstable...
> 
> Yes!  I want to confirm that it DOES 'just work' with this one
> little thingy I missed:
> 
> I needed to add TWO boot flags because of the way I have my
> kernel configured:  'nodevfs' AND 'devfs=nomount'.
> 
> Without the 'devfs=nomount' flag the kernel was starting devfsd
> anyway, which keeps udev from working, apparently.
> 

Hmm, right, that will do it.

Perhaps I could change this to display a warning if udev is present,
but devfs is mounted over /dev ...


-- 
Martin Schlemmer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 16:17     ` Pascal Schmidt
@ 2004-01-01 20:03       ` Greg KH
  0 siblings, 0 replies; 158+ messages in thread
From: Greg KH @ 2004-01-01 20:03 UTC (permalink / raw)
  To: Pascal Schmidt; +Cc: linux-kernel

On Thu, Jan 01, 2004 at 05:17:50PM +0100, Pascal Schmidt wrote:
> On Wed, 31 Dec 2003, Greg KH wrote:
> 
> > You would not have any "extra" overhead if you don't add any new devices
> > to your system.  udev only runs when /sbin/hotplug runs. As for extra
> > space on your disk, this email thread is almost as big as the udev
> > binary is :)
> 
> Well, but if random device numbers become a reality, udev would have
> to run at boot time or I wouldn't get usable device nodes.

Exactly, it's on the TODO list :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
       [not found] ` <fa.hv9hpq7.1l1q9p3@ifi.uio.no>
@ 2004-01-01 19:53   ` walt
  2004-01-01 21:53     ` Martin Schlemmer
  0 siblings, 1 reply; 158+ messages in thread
From: walt @ 2004-01-01 19:53 UTC (permalink / raw)
  Cc: linux-kernel, Greg KH

Martin Schlemmer wrote:
> On Thu, 2004-01-01 at 00:17, walt wrote:

>> ...I have  not been able to get udev working yet...

> Hmm, It works fine here?  I was under the impression that
 > it should _just_work_ if you have latest everything unstable...

Yes!  I want to confirm that it DOES 'just work' with this one
little thingy I missed:

I needed to add TWO boot flags because of the way I have my
kernel configured:  'nodevfs' AND 'devfs=nomount'.

Without the 'devfs=nomount' flag the kernel was starting devfsd
anyway, which keeps udev from working, apparently.

So, Greg, please be nice to Martin, who is working hard to
get gentoo people out of your mailbox.

Thanks to both of you, and Happy New Year!

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 12:34           ` Rob Landley
  2004-01-01 15:22             ` Rob Love
@ 2004-01-01 19:43             ` Kai Henningsen
  2004-01-02  7:26               ` Rob Landley
  2004-01-02  0:17             ` Maciej Zenczykowski
  2004-01-07 10:23             ` Olaf Hering
  3 siblings, 1 reply; 158+ messages in thread
From: Kai Henningsen @ 2004-01-01 19:43 UTC (permalink / raw)
  To: linux-kernel

rob@landley.net (Rob Landley)  wrote on 01.01.04 in <200401010634.28559.rob@landley.net>:

> On Wednesday 31 December 2003 18:31, Rob Love wrote:
> > On Wed, 2003-12-31 at 19:15, Andries Brouwer wrote:
> > > My plan has been to essentially use a hashed disk serial number
> > > for this "any old unique value". The problem is that "any old"
> > > is easy enough, but "unique" is more difficult.
> > > Naming devices is very difficult, but in some important cases,
> > > like SCSI or IDE disks, that would work and give a stable name.
> >
> > Yup.
> >
> > > The kernel must not invent consecutive numbers - that does not
> > > lead to stable names. Setting this up correctly is nontrivial.
> >
> > This is definitely an interesting problem space.
> >
> > I agree wrt just inventing consecutive numbers.  If there was a nice way
> > to trivially generate a random and unique number from some
> > device-inherent information, that would be nice.
> >
> > 	Rob Love
>
> Fundamental problem: "Unique" depends on the other devices in the system.
> You can't guarantee unique by looking at one device, more or less by
> definition.

This is actually not fundamental at all.

The best-known exception is probably the MAC address. But it is not the  
only example of devices having true unique information.

It is certainly true, though, that there are devices without this kind of  
info.

And remember that you can sometimes use secondary information. With any  
kind of read-write storage device, it might be possible to create such a  
piece of information and store it onto that device.

Moral: keep the identifier creation framework flexible enough so that you  
can chose device-specific means to produce useful identifiers. (And, use  
long identifiers, as they're less likely to be duplicated in general.)

MfG Kai

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
@ 2004-01-01 16:59 Shaheed
  0 siblings, 0 replies; 158+ messages in thread
From: Shaheed @ 2004-01-01 16:59 UTC (permalink / raw)
  To: linux-kernel

Rob Landley wrote:

>Combine that with hotplug and you have a world of pain. Generating a number 
> from a device is just a fancy hashing function, but as soon as you have two 
> devices that generate the same number independently (when in separate 
> systems) and you plug them both into the same system: boom.

If one has two otherwise identical devices, the only thing that distinguishes 
them to the system is their point of attachment. Even from a user's point of 
view, the only difference is the connector it is plugged into. That implies 
that the hash resolution value ought to be based on the point of attachment.

It seems to me that the key to making this system as transparent as possible 
is to make these source value of the hash and the attachment point visible 
and navigable by userspace/humans. Perhaps something like this:

- every driver exports its name and some driver-or-devicetype-dependant value 
(serial number, MAC address, disk WWID, pty number, kernel address of kobject 
or whatever) to /sbin/hotplug. The userspace logic gets to hash+uniquify the 
value as required, and then create a sysfs tree node ("/uid/xxx") whose 
leaves contain the point of attachment.

- At the bottom of the sysfs tree for the device add a leaf that points back 
to the entry into "/uid" tree.

Thus, userspace can navigate in either direction between the point of 
attachment, and the identifiying characteristic of the deivce.

Thanks, Shaheed

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 19:23   ` Greg KH
  2003-12-31 20:19     ` Rob Love
@ 2004-01-01 16:17     ` Pascal Schmidt
  2004-01-01 20:03       ` Greg KH
  1 sibling, 1 reply; 158+ messages in thread
From: Pascal Schmidt @ 2004-01-01 16:17 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

On Wed, 31 Dec 2003, Greg KH wrote:

> You would not have any "extra" overhead if you don't add any new devices
> to your system.  udev only runs when /sbin/hotplug runs. As for extra
> space on your disk, this email thread is almost as big as the udev
> binary is :)

Well, but if random device numbers become a reality, udev would have
to run at boot time or I wouldn't get usable device nodes. So there
is some setup complexity (because so far I don't need a correctly setup
hotplug system at all). Not much of a problem, granted, distributions
will do this for most of us and only a few people will do it by hand.

-- 
Ciao,
Pascal


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 15:48               ` Andries Brouwer
@ 2004-01-01 15:54                 ` Rob Love
  2004-01-02 20:42                   ` Linus Torvalds
  0 siblings, 1 reply; 158+ messages in thread
From: Rob Love @ 2004-01-01 15:54 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: rob, Pascal Schmidt, linux-kernel, Greg KH

On Thu, 2004-01-01 at 10:48, Andries Brouwer wrote:

> I am afraid I have to disappoint you. I made them 64-bit,
> and I think they were 64-bit for a few months in the -mm tree,
> forgot the details, but unfortunately Al went back to 32-bit again.

You did disappoint me!  My heart is crushed and my aspirations for the
future ruined.

But you are right, dunno what I was thinking.

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 15:22             ` Rob Love
@ 2004-01-01 15:48               ` Andries Brouwer
  2004-01-01 15:54                 ` Rob Love
  0 siblings, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-01 15:48 UTC (permalink / raw)
  To: Rob Love; +Cc: rob, Andries Brouwer, Pascal Schmidt, linux-kernel, Greg KH

On Thu, Jan 01, 2004 at 10:22:53AM -0500, Rob Love wrote:

> Device numbers are 64-bit now.
> 
> 	Rob Love

I am afraid I have to disappoint you. I made them 64-bit,
and I think they were 64-bit for a few months in the -mm tree,
forgot the details, but unfortunately Al went back to 32-bit again.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01 12:34           ` Rob Landley
@ 2004-01-01 15:22             ` Rob Love
  2004-01-01 15:48               ` Andries Brouwer
  2004-01-01 19:43             ` Kai Henningsen
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 158+ messages in thread
From: Rob Love @ 2004-01-01 15:22 UTC (permalink / raw)
  To: rob; +Cc: Andries Brouwer, Pascal Schmidt, linux-kernel, Greg KH

On Thu, 2004-01-01 at 07:34, Rob Landley wrote:

> Fundamental problem: "Unique" depends on the other devices in the system.  You 
> can't guarantee unique by looking at one device, more or less by definition.

Of course.

> Combine that with hotplug and you have a world of pain.  Generating a number 
> from a device is just a fancy hashing function, but as soon as you have two 
> devices that generate the same number independently (when in separate 
> systems) and you plug them both into the same system: boom.

A solution would have to deal with collisions.

> Of course the EASY way to deal with collisions is to just fail the hash thingy 
> in a detectable way, and punt to some kind of udev override.  So if you yank 
> a drive from system A, throw it in system B, try to re-export it NFS, and 
> it's not going to work, it TELLS you.

No no no.  Nothing this complicated.  No punting to udev.

> Solve 90% of the problem space and have a human deal with the exceptions.  How 
> big's the unique number being exported, anyway?  (If it's 32 bits, the 
> exceptions are 1 in 4 billion.  It may never be seen in the wild...)

Device numbers are 64-bit now.

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01  0:31         ` Rob Love
@ 2004-01-01 12:34           ` Rob Landley
  2004-01-01 15:22             ` Rob Love
                               ` (3 more replies)
  2004-01-01 23:14           ` Rob
  1 sibling, 4 replies; 158+ messages in thread
From: Rob Landley @ 2004-01-01 12:34 UTC (permalink / raw)
  To: Rob Love, Andries Brouwer; +Cc: Pascal Schmidt, linux-kernel, Greg KH

On Wednesday 31 December 2003 18:31, Rob Love wrote:
> On Wed, 2003-12-31 at 19:15, Andries Brouwer wrote:
> > My plan has been to essentially use a hashed disk serial number
> > for this "any old unique value". The problem is that "any old"
> > is easy enough, but "unique" is more difficult.
> > Naming devices is very difficult, but in some important cases,
> > like SCSI or IDE disks, that would work and give a stable name.
>
> Yup.
>
> > The kernel must not invent consecutive numbers - that does not
> > lead to stable names. Setting this up correctly is nontrivial.
>
> This is definitely an interesting problem space.
>
> I agree wrt just inventing consecutive numbers.  If there was a nice way
> to trivially generate a random and unique number from some
> device-inherent information, that would be nice.
>
> 	Rob Love

Fundamental problem: "Unique" depends on the other devices in the system.  You 
can't guarantee unique by looking at one device, more or less by definition.

Combine that with hotplug and you have a world of pain.  Generating a number 
from a device is just a fancy hashing function, but as soon as you have two 
devices that generate the same number independently (when in separate 
systems) and you plug them both into the same system: boom.

Now if you don't care about hotplug, it gets a little easier.  You can have a 
collission handler that does some kind of hashing thing, figuring out which 
device needs to get bumped and bumping it.  (As long as it consistently picks 
the same victim, you're okay, although that in and of itself could get 
interesting.  And if you remove the earlier device it conflicted with and 
reboot, the device could get renumbered which is evil...)

Of course the EASY way to deal with collisions is to just fail the hash thingy 
in a detectable way, and punt to some kind of udev override.  So if you yank 
a drive from system A, throw it in system B, try to re-export it NFS, and 
it's not going to work, it TELLS you.

Solve 90% of the problem space and have a human deal with the exceptions.  How 
big's the unique number being exported, anyway?  (If it's 32 bits, the 
exceptions are 1 in 4 billion.  It may never be seen in the wild...)

Rob


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01  2:03     ` Martin Schlemmer
@ 2004-01-01  2:05       ` Martin Schlemmer
  0 siblings, 0 replies; 158+ messages in thread
From: Martin Schlemmer @ 2004-01-01  2:05 UTC (permalink / raw)
  To: walt; +Cc: Linux Kernel Mailing Lists, Greg KH

[-- Attachment #1: Type: text/plain, Size: 369 bytes --]

On Thu, 2004-01-01 at 04:03, Martin Schlemmer wrote:
> On Thu, 2004-01-01 at 00:17, walt wrote:
> 
> > Note that the portage system already includes 'hotplug' and 'udev'
> > but possibly lagging behind a bit:  hotplug-20030805-r3 and udev-011.
> > 
> 
> Afiak, we are current on udev :D 

Err, correction - I just saw 012 is out =p


-- 
Martin Schlemmer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:17   ` walt
@ 2004-01-01  2:03     ` Martin Schlemmer
  2004-01-01  2:05       ` Martin Schlemmer
  0 siblings, 1 reply; 158+ messages in thread
From: Martin Schlemmer @ 2004-01-01  2:03 UTC (permalink / raw)
  To: walt; +Cc: Linux Kernel Mailing Lists, Greg KH

[-- Attachment #1: Type: text/plain, Size: 1785 bytes --]

On Thu, 2004-01-01 at 00:17, walt wrote:

> Note that the portage system already includes 'hotplug' and 'udev'
> but possibly lagging behind a bit:  hotplug-20030805-r3 and udev-011.
> 

Afiak, we are current on udev :D  As for hotplug, I will have to check -
I see the latest usb patches cause usb.agent to complain about "09" not
valid token or such, but I have not looked into it yet.

> I have installed them both but just have not been able to get udev
> working yet -- I don't yet understand the problems well enough to tell
> you why, unfortutately.  (udev is still marked 'experimental' so I'm
> probably omitting important steps somewhere.)
> 

Well, ideally you need baselayout-1.8.6.12-r3 as well ...  But if you
do have issues, try to bother me first, as it could be something I did
or did not do ;)

> If you could get udev working in gentoo you would become an instant
> hero rather than the target of nasty emails.  Think of how great
> that would be for your New Year!  We would become the wind beneath
> your wings instead of the rotten tomatoes in your mailbox  ;0)

Hmm, It works fine here?  With sysfs patches from Greg (not yet into
official linux bk), I only had to run alsa's script to create device
nodes, and create /dev/{core,stdin,stdout,stderr} - the rest udev
creates - although, yes we do have the ramdisk/tarball feature to
save permissions/additions.

But once again, drop me a mail first with versions of udev, baselayout,
kernel, hotplug, etc, if you have latest unstable baselaout and still
cannot get it working - it is a Gentoo issue after all (as well as the
fact that I was under the impression that it should _just_work_ if you
have latest everything unstable =) ...


Thanks,

-- 

Martin Schlemmer


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31  0:29 Greg KH
  2003-12-31  0:53 ` Prakash K. Cheemplavam
  2003-12-31 12:43 ` Paulo Marques
@ 2004-01-01  1:18 ` Helge Hafting
  2004-01-03  5:59   ` Greg KH
  2004-01-02 17:54 ` Andreas Jellinghaus
  3 siblings, 1 reply; 158+ messages in thread
From: Helge Hafting @ 2004-01-01  1:18 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-hotplug-devel, linux-kernel

On Tue, Dec 30, 2003 at 04:29:42PM -0800, Greg KH wrote:
> 
>  2) We are (well, were) running out of major and minor numbers for
>     devices.

devfs tried to fix this one by _getting rid_ of those numbers.
Seriously - what are they needed for?  
(Yes, I know why they're needed with /dev on ext2)
Opening a device in devfs went straight to the device from the
inode - no extra lookup of "device numbers"
Numbers were provided mostly for backward compatibility - they
weren't used for the main task of accessing devices.

udev has many other advantages of course, too bad we still
have to carry those numbers around.

Helge Hafting

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2004-01-01  0:15       ` Andries Brouwer
@ 2004-01-01  0:31         ` Rob Love
  2004-01-01 12:34           ` Rob Landley
  2004-01-01 23:14           ` Rob
  0 siblings, 2 replies; 158+ messages in thread
From: Rob Love @ 2004-01-01  0:31 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Pascal Schmidt, linux-kernel, Greg KH

On Wed, 2003-12-31 at 19:15, Andries Brouwer wrote:

> My plan has been to essentially use a hashed disk serial number
> for this "any old unique value". The problem is that "any old"
> is easy enough, but "unique" is more difficult.
> Naming devices is very difficult, but in some important cases,
> like SCSI or IDE disks, that would work and give a stable name.

Yup.

> The kernel must not invent consecutive numbers - that does not
> lead to stable names. Setting this up correctly is nontrivial.

This is definitely an interesting problem space.

I agree wrt just inventing consecutive numbers.  If there was a nice way
to trivially generate a random and unique number from some
device-inherent information, that would be nice.

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 20:19     ` Rob Love
  2003-12-31 22:01       ` Nathan Conrad
@ 2004-01-01  0:15       ` Andries Brouwer
  2004-01-01  0:31         ` Rob Love
  1 sibling, 1 reply; 158+ messages in thread
From: Andries Brouwer @ 2004-01-01  0:15 UTC (permalink / raw)
  To: Rob Love; +Cc: Pascal Schmidt, linux-kernel, Greg KH

On Wed, Dec 31, 2003 at 03:19:22PM -0500, Rob Love wrote:

> We can get to the point where we don't even need the explicit concept of
> device numbers, but just "any old unique value" to use as a cookie.  The
> kernel can pull that number from anywhere, and notify user-space via
> udev ala hotplug.

My plan has been to essentially use a hashed disk serial number
for this "any old unique value". The problem is that "any old"
is easy enough, but "unique" is more difficult.
Naming devices is very difficult, but in some important cases,
like SCSI or IDE disks, that would work and give a stable name.

The kernel must not invent consecutive numbers - that does not
lead to stable names. Setting this up correctly is nontrivial.


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:55           ` viro
  2003-12-31 23:05             ` Rob Love
@ 2003-12-31 23:48             ` Andreas Dilger
  2004-01-07 10:15             ` Olaf Hering
  2 siblings, 0 replies; 158+ messages in thread
From: Andreas Dilger @ 2003-12-31 23:48 UTC (permalink / raw)
  To: viro; +Cc: Rob Love, Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Dec 31, 2003  22:55 +0000, viro@parcelfarce.linux.theplanet.co.uk wrote:
> 	h) nfsd uses device number as a substitute for export ID if said
> ID is not given explicitly.  That, BTW, is a big problem for crackpipe
> dreams about random device numbers - export ID _must_ be stable across
> reboots.

We had a problem with this and Lustre, when we NFS export it.  Lustre is
already a network filesystem so we don't have a device number.  I had a
discussion with Neil Brown about this and suggested that we allow NFS to
get a _real_ stable export ID from the filesystem (e.g. superblock UUID
or similar) instead of the device number hackery which only has a vague
relationship to stable.

We implemented it for Lustre with a filesystem option FS_NFSEXP_FSID
that tells nfsd it can export such a filesystem in the absence of
FS_REQUIRES_DEV and then put our export ID into sb->s_dev (although I'd
prefer something slightly cleaner than that).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 21:45           ` Tommi Virtanen
@ 2003-12-31 23:10             ` Rob Love
  2003-12-31 21:52               ` Tommi Virtanen
  0 siblings, 1 reply; 158+ messages in thread
From: Rob Love @ 2003-12-31 23:10 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Wed, 2003-12-31 at 16:45, Tommi Virtanen wrote:

> Let me try to rephrase Nathan's question more explicitly.
> 
> If user policy decides all naming, how does the kernel parse e.g. 
> root=/dev/foo arguments? Or the swap partition to use for swsuspend?

Oh.  That has always been a hack, ala name_to_dev_t().

We will have to continue doing that hack so long as those users are in
the kernel proper (and not early user-space, for example).

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:55           ` viro
@ 2003-12-31 23:05             ` Rob Love
  2003-12-31 23:48             ` Andreas Dilger
  2004-01-07 10:15             ` Olaf Hering
  2 siblings, 0 replies; 158+ messages in thread
From: Rob Love @ 2003-12-31 23:05 UTC (permalink / raw)
  To: viro; +Cc: Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Wed, 2003-12-31 at 17:55, viro@parcelfarce.linux.theplanet.co.uk
wrote:

> I think you've missed a point here.  There are several places where kernel
> deals with device identification.

I know all of this.  I was trying to explain how Unix VFS understands
devices (via major/minor number, not filename).  Different audience.

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:20         ` Rob Love
  2003-12-31 21:45           ` Tommi Virtanen
@ 2003-12-31 22:55           ` viro
  2003-12-31 23:05             ` Rob Love
                               ` (2 more replies)
  1 sibling, 3 replies; 158+ messages in thread
From: viro @ 2003-12-31 22:55 UTC (permalink / raw)
  To: Rob Love; +Cc: Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

On Wed, Dec 31, 2003 at 05:20:18PM -0500, Rob Love wrote:
> On Wed, 2003-12-31 at 17:01, Nathan Conrad wrote:
> 
> > One thing that I'm confused about with respect to device files is how
> > kernel arguments are supposed to work. Now, we _seem_ to have a
> > mish-mash of different ways to tell the kernel which device to open as
> > a console, which device to use as a suspend device, etc.... Now, all
> > of the device names are being migrated to userland. How is the kernel
> > supposed to determine which device to use when it is told use
> > /dev/hda3 or /dev/ide/host0/something/part3 as the suspend partition?
> > The kernel no longer knows to which device this string this device is
> > connected.
> 
> Uh, Unix systems (Linux included) do not use the filename of the device
> node at all.  Those are just names for you, the user.
> 
> The kernel uses the device number to understand what device user-space
> is trying to access.  The kernel associates the device with a device
> number.  Normally that number is static, and known a priori, so we just
> create a huge /dev directory with all possible devices and their
> assigned numbers (you can see these numbers with ls -la).
> 
> But if the kernel _tells_ user-space what the device number is, for each
> device as it is created, we do not need a static /dev directory.  We can
> assemble the directory on the fly and device numbers really no longer
> matter.  This is what udev does.

I think you've missed a point here.  There are several places where kernel
deals with device identification.
	a) when normal pathname lookup results in a device node on filesystem.
That's the regular way.
	b) when we create a new device node; device number is passed to
->mknod() and new device node is created.  Also a normal codepath.
	c) when late-boot code mounts the final root.  It used to be black
magic, but these days it's done by regular syscalls.  Namely, we parse the
"device name" (most of the work is done by lookups in sysfs), do mknod(2)
and mount(2).  It's still done from the kernel mode, but it could be moved
to userland.  Should be, actually.
	d) when kernel deals with resume/suspend stuff.  Currently - black
magic.  Should be moved to early userland (same parser as for final root
name + mknod on rootfs + open() to get the device in question).
	e) in several pathological syscalls we pass device number to
identify a device.  ustat(2) and its ilk - bad API that can't die.
	f) /dev/raw passes device number to bind raw device to block device.
Bad API; we probably ought to replace it with saner one at some point.
	g) RAID setup - mix of both pathologies; should be done in userland
and interfaces are in bad need of cleanup.
	h) nfsd uses device number as a substitute for export ID if said
ID is not given explicitly.  That, BTW, is a big problem for crackpipe
dreams about random device numbers - export ID _must_ be stable across
reboots.
	i) mtdblk parses "device name" on boot; should be take to early
userland, same as RAID et.al.

	Eventually name_to_dev_t() should be gone from kernel mode
completely - all callers should be shifted to early userland.  But
that will take a lot of work - currently we have a big mess in that
area.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:01       ` Nathan Conrad
@ 2003-12-31 22:20         ` Rob Love
  2003-12-31 21:45           ` Tommi Virtanen
  2003-12-31 22:55           ` viro
  0 siblings, 2 replies; 158+ messages in thread
From: Rob Love @ 2003-12-31 22:20 UTC (permalink / raw)
  To: Nathan Conrad; +Cc: Pascal Schmidt, linux-kernel, Greg KH

On Wed, 2003-12-31 at 17:01, Nathan Conrad wrote:

> One thing that I'm confused about with respect to device files is how
> kernel arguments are supposed to work. Now, we _seem_ to have a
> mish-mash of different ways to tell the kernel which device to open as
> a console, which device to use as a suspend device, etc.... Now, all
> of the device names are being migrated to userland. How is the kernel
> supposed to determine which device to use when it is told use
> /dev/hda3 or /dev/ide/host0/something/part3 as the suspend partition?
> The kernel no longer knows to which device this string this device is
> connected.

Uh, Unix systems (Linux included) do not use the filename of the device
node at all.  Those are just names for you, the user.

The kernel uses the device number to understand what device user-space
is trying to access.  The kernel associates the device with a device
number.  Normally that number is static, and known a priori, so we just
create a huge /dev directory with all possible devices and their
assigned numbers (you can see these numbers with ls -la).

But if the kernel _tells_ user-space what the device number is, for each
device as it is created, we do not need a static /dev directory.  We can
assemble the directory on the fly and device numbers really no longer
matter.  This is what udev does.

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
       [not found] ` <fa.de7jae9.1jk0pjt@ifi.uio.no>
@ 2003-12-31 22:17   ` walt
  2004-01-01  2:03     ` Martin Schlemmer
  0 siblings, 1 reply; 158+ messages in thread
From: walt @ 2003-12-31 22:17 UTC (permalink / raw)
  Cc: linux-kernel

Greg KH wrote:

> In fact, now that I know Gentoo works without devfs, I'm considering
> putting it on an old laptop I have around here...

That would be ideal.  I'm sure you will like the 'portage' system as
much as we (the gentoo hordes) do.

Note that the portage system already includes 'hotplug' and 'udev'
but possibly lagging behind a bit:  hotplug-20030805-r3 and udev-011.

I have installed them both but just have not been able to get udev
working yet -- I don't yet understand the problems well enough to tell
you why, unfortutately.  (udev is still marked 'experimental' so I'm
probably omitting important steps somewhere.)

If you could get udev working in gentoo you would become an instant
hero rather than the target of nasty emails.  Think of how great
that would be for your New Year!  We would become the wind beneath
your wings instead of the rotten tomatoes in your mailbox  ;0)

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 20:19     ` Rob Love
@ 2003-12-31 22:01       ` Nathan Conrad
  2003-12-31 22:20         ` Rob Love
  2004-01-01  0:15       ` Andries Brouwer
  1 sibling, 1 reply; 158+ messages in thread
From: Nathan Conrad @ 2003-12-31 22:01 UTC (permalink / raw)
  To: Rob Love; +Cc: Pascal Schmidt, linux-kernel, Greg KH

One thing that I'm confused about with respect to device files is how
kernel arguments are supposed to work. Now, we _seem_ to have a
mish-mash of different ways to tell the kernel which device to open as
a console, which device to use as a suspend device, etc.... Now, all
of the device names are being migrated to userland. How is the kernel
supposed to determine which device to use when it is told use
/dev/hda3 or /dev/ide/host0/something/part3 as the suspend partition?
The kernel no longer knows to which device this string this device is
connected.

(I have not looked into how these parameters are parsed; this is pure
speculation)

One solution that I see if the device names are totally removed from
the kernel is specifying these parameters as sysfs paths. Would this
work? Or is there a better way?

-Nathan

On Wed, Dec 31, 2003 at 03:19:22PM -0500, Rob Love wrote:
> On Wed, 2003-12-31 at 14:23, Greg KH wrote:
> 
> > What benefit would there be in "random" numbers? More compressed number
> > space by giving out numbers sequentially?
> 
> That is one advantage.
> 
> > Or less having to work with the numbers because they become just
> > cookies and never need to be inspected except in very small parts of
> > the kernel?
> 
> Yup, especially this one.  It is not so much "let's make the device
> numbers random" but "let's just not care what they are."
> 
> We can get to the point where we don't even need the explicit concept of
> device numbers, but just "any old unique value" to use as a cookie.  The
> kernel can pull that number from anywhere, and notify user-space via
> udev ala hotplug.
> 
> 	Rob Love
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Nathan J. Conrad                     Campus phone #5930
301 Scott hall, UNC Charlotte        http://bungled.net
GPG: F4FC 7E25 9308 ECE1 735C  0798 CE86 DA45 9170 3112

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 23:10             ` Rob Love
@ 2003-12-31 21:52               ` Tommi Virtanen
  2004-01-02  0:17                 ` Hollis Blanchard
  0 siblings, 1 reply; 158+ messages in thread
From: Tommi Virtanen @ 2003-12-31 21:52 UTC (permalink / raw)
  To: Rob Love; +Cc: Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

Rob Love wrote:
>>Let me try to rephrase Nathan's question more explicitly.
>>
>>If user policy decides all naming, how does the kernel parse e.g. 
>>root=/dev/foo arguments? Or the swap partition to use for swsuspend?
> Oh.  That has always been a hack, ala name_to_dev_t().
> 
> We will have to continue doing that hack so long as those users are in
> the kernel proper (and not early user-space, for example).

I think devfs names are accepted as root= arguments, so that's a bit of
a loss.. with udev, your /dev and your root= are equal only if you
follow the standard naming.

For root=, I can see how early userspace can move that to userspace.
But what about swsuspend?

Are there any more kernel options taking file names? I think now would
be a good time to stop adding more of them :)


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 22:20         ` Rob Love
@ 2003-12-31 21:45           ` Tommi Virtanen
  2003-12-31 23:10             ` Rob Love
  2003-12-31 22:55           ` viro
  1 sibling, 1 reply; 158+ messages in thread
From: Tommi Virtanen @ 2003-12-31 21:45 UTC (permalink / raw)
  To: Rob Love; +Cc: Nathan Conrad, Pascal Schmidt, linux-kernel, Greg KH

Rob Love wrote:
>>One thing that I'm confused about with respect to device files is how
>>kernel arguments are supposed to work. Now, we _seem_ to have a
>>mish-mash of different ways to tell the kernel which device to open as
>>a console, which device to use as a suspend device, etc.... Now, all
>>of the device names are being migrated to userland. How is the kernel
>>supposed to determine which device to use when it is told use
>>/dev/hda3 or /dev/ide/host0/something/part3 as the suspend partition?
>>The kernel no longer knows to which device this string this device is
>>connected.
...

> The kernel uses the device number to understand what device user-space
> is trying to access.  The kernel associates the device with a device
> number.  Normally that number is static, and known a priori, so we just
> create a huge /dev directory with all possible devices and their
> assigned numbers (you can see these numbers with ls -la).

Let me try to rephrase Nathan's question more explicitly.

If user policy decides all naming, how does the kernel parse e.g. 
root=/dev/foo arguments? Or the swap partition to use for swsuspend?


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31 19:23   ` Greg KH
@ 2003-12-31 20:19     ` Rob Love
  2003-12-31 22:01       ` Nathan Conrad
  2004-01-01  0:15       ` Andries Brouwer
  2004-01-01 16:17     ` Pascal Schmidt
  1 sibling, 2 replies; 158+ messages in thread
From: Rob Love @ 2003-12-31 20:19 UTC (permalink / raw)
  To: Pascal Schmidt; +Cc: linux-kernel, Greg KH

On Wed, 2003-12-31 at 14:23, Greg KH wrote:

> What benefit would there be in "random" numbers? More compressed number
> space by giving out numbers sequentially?

That is one advantage.

> Or less having to work with the numbers because they become just
> cookies and never need to be inspected except in very small parts of
> the kernel?

Yup, especially this one.  It is not so much "let's make the device
numbers random" but "let's just not care what they are."

We can get to the point where we don't even need the explicit concept of
device numbers, but just "any old unique value" to use as a cookie.  The
kernel can pull that number from anywhere, and notify user-space via
udev ala hotplug.

	Rob Love



^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31  3:05 ` Pascal Schmidt
@ 2003-12-31 19:23   ` Greg KH
  2003-12-31 20:19     ` Rob Love
  2004-01-01 16:17     ` Pascal Schmidt
  0 siblings, 2 replies; 158+ messages in thread
From: Greg KH @ 2003-12-31 19:23 UTC (permalink / raw)
  To: Pascal Schmidt; +Cc: linux-kernel

On Wed, Dec 31, 2003 at 04:05:59AM +0100, Pascal Schmidt wrote:
> On Wed, 31 Dec 2003 01:40:09 +0100, you wrote in linux.kernel:
> 
> >     2) udev does not care about the major/minor number schemes.  If the
> >        kernel tomorrow switches to randomly assign major and minor numbers
> >        to different devices, it would work just fine (this is exactly
> >        what I am proposing to do in 2.7...)
> 
> Why? I want to keep my static device files in /dev. I don't even have
> hotpluggable devices, and many months do pass before even one piece
> of hardware gets changed (in which case I know what I have to do).
> I don't want to eat any overhead or run any daemons or hotplug agents.

You would not have any "extra" overhead if you don't add any new devices
to your system.  udev only runs when /sbin/hotplug runs.  As for extra
space on your disk, this email thread is almost as big as the udev
binary is :)

> What benefit would there be in "random" numbers? More compressed number
> space by giving out numbers sequentially?

Yes.

> Or less having to work with the numbers because they become just
> cookies and never need to be inspected except in very small parts of
> the kernel?

That is already happening today in the kernel.  

And 2.8 will probably have the "random number" assignment be a compile
option, depending on the maturity of udev.  We'll just have to see how
it works out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31  0:53 ` Prakash K. Cheemplavam
@ 2003-12-31 19:17   ` Greg KH
  2004-01-02 16:45     ` Shawn
  0 siblings, 1 reply; 158+ messages in thread
From: Greg KH @ 2003-12-31 19:17 UTC (permalink / raw)
  To: Prakash K. Cheemplavam; +Cc: linux-hotplug-devel, linux-kernel

On Wed, Dec 31, 2003 at 01:53:55AM +0100, Prakash K. Cheemplavam wrote:
> Greg KH wrote:
> 
> [big snip]
> > All the people wanting to bring up the udev vs. devfs argument go back
> > and read the previous paragraph.  Yes, all Gentoo users who keep filling
> > up my inbox with smoking emails, I mean you.
> [yet another big snip]
> 
> Hihi, life is unfair to you. ;-) I am one of those nasty gentoo users 
> and still use devfs, but I want to switch asap, as I found a thread in 
> gentoo forums about it and furthermore tend to do experiments with my 
> installation. So not all gentoo users are bad users. ;-) I really 
> appreciate your work and hope you will find more time in developing udev 
> instead of wasting time (though it was quite interesting for me to read 
> your text) with arguing for it. So I hope when I do the transition it 
> goes smoothly, but even if not, I won't bash onto your head. ;-)

Thanks, I have gotten a lot of response to this message from Gentoo
users appologizing for the "bad seeds".  By no means did I mean to
disparage all Gentoo users, just the ones that keep bothering me with
this pointless argument.

In fact, now that I know Gentoo works without devfs, I'm considering
putting it on an old laptop I have around here...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31  0:29 Greg KH
  2003-12-31  0:53 ` Prakash K. Cheemplavam
@ 2003-12-31 12:43 ` Paulo Marques
  2004-01-01  1:18 ` Helge Hafting
  2004-01-02 17:54 ` Andreas Jellinghaus
  3 siblings, 0 replies; 158+ messages in thread
From: Paulo Marques @ 2003-12-31 12:43 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

Greg KH wrote:

> Oh yeah, and there are the insolvable race conditions with the devfs
> implementation in the kernel, but I'm not going to talk about them right
> now, sorry.  See the linux-kernel archives if you care about them (and
> if you use devfs, you should care...)

I really think you should, because IMHO this is *the* major argument against devfs.

I spent days trying to tweak a mandrake distribution into running from a Compact 
Flash card.

The init sequence would fail with I/O errors as if the card had hardware 
problems. It took me a long time to realize that it was devfs and devfsd the 
culprits. With *exactly* the same setup, but static device nodes the system 
worked just fine.

Maybe it was the slow compact flash PIO modes that were triggering the bug, but 
the truth was that devfs had bugs in it, and I never saw anyone trying to 
correct them later.

So my opinion is: udev is *really* needed and you're doing a great job with it. 
Don't let anyone tell you otherwise :)

Just my 2 cents,

-- 
Paulo Marques - www.grupopie.com

"In a world without walls and fences who needs windows and gates?"


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
       [not found] <18Cz7-7Ep-7@gated-at.bofh.it>
@ 2003-12-31  3:05 ` Pascal Schmidt
  2003-12-31 19:23   ` Greg KH
  0 siblings, 1 reply; 158+ messages in thread
From: Pascal Schmidt @ 2003-12-31  3:05 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

On Wed, 31 Dec 2003 01:40:09 +0100, you wrote in linux.kernel:

>     2) udev does not care about the major/minor number schemes.  If the
>        kernel tomorrow switches to randomly assign major and minor numbers
>        to different devices, it would work just fine (this is exactly
>        what I am proposing to do in 2.7...)

Why? I want to keep my static device files in /dev. I don't even have
hotpluggable devices, and many months do pass before even one piece
of hardware gets changed (in which case I know what I have to do).
I don't want to eat any overhead or run any daemons or hotplug agents.

What benefit would there be in "random" numbers? More compressed number
space by giving out numbers sequentially? Or less having to work with
the numbers because they become just cookies and never need to be
inspected except in very small parts of the kernel?

-- 
Ciao,
Pascal

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: udev and devfs - The final word
  2003-12-31  0:29 Greg KH
@ 2003-12-31  0:53 ` Prakash K. Cheemplavam
  2003-12-31 19:17   ` Greg KH
  2003-12-31 12:43 ` Paulo Marques
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 158+ messages in thread
From: Prakash K. Cheemplavam @ 2003-12-31  0:53 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-hotplug-devel, linux-kernel

Greg KH wrote:

[big snip]
 > All the people wanting to bring up the udev vs. devfs argument go back
 > and read the previous paragraph.  Yes, all Gentoo users who keep filling
 > up my inbox with smoking emails, I mean you.
[yet another big snip]

Hihi, life is unfair to you. ;-) I am one of those nasty gentoo users 
and still use devfs, but I want to switch asap, as I found a thread in 
gentoo forums about it and furthermore tend to do experiments with my 
installation. So not all gentoo users are bad users. ;-) I really 
appreciate your work and hope you will find more time in developing udev 
instead of wasting time (though it was quite interesting for me to read 
your text) with arguing for it. So I hope when I do the transition it 
goes smoothly, but even if not, I won't bash onto your head. ;-)

Cheers,

Prakash

^ permalink raw reply	[flat|nested] 158+ messages in thread

* udev and devfs - The final word
@ 2003-12-31  0:29 Greg KH
  2003-12-31  0:53 ` Prakash K. Cheemplavam
                   ` (3 more replies)
  0 siblings, 4 replies; 158+ messages in thread
From: Greg KH @ 2003-12-31  0:29 UTC (permalink / raw)
  To: linux-hotplug-devel, linux-kernel

(This text can be found at
kernel.org/pub/linux/utils/kernel/hotplug/udev_vs_devfs for those who
want to link to it.  I'll also update it with info based on the thread I
know is going to spawn from this post...)


Executive summary for those too lazy to read this whole thing:
	I don't care about devfs, and I don't want to talk about it at
	all anymore.  If you love devfs, fine, I'm not trying to tell
	anyone what to do.  But you really should be looking into using
	udev instead.  All further email messages sent to me about devfs
	will be gladly ignored.


First off, some background.  For a description of udev, and what it's
original design goals were, please see the OLS 2003 paper on udev,
available at:
	<http://www.kroah.com/linux/talks/ols_2003_udev_paper/Reprint-Kroah-Hartman-OLS2003.pdf>
and the slides for the talk, available at:
	<http://www.kroah.com/linux/talks/ols_2003_udev_talk/>
The OLS paper can also be found in the docs/ directory of the udev
tarball, available on kernel.org in the /pub/linux/utils/kernel/hotplug/
directory.

In that OLS paper, I described the current situation of a static /dev
and the current problems that a number of people have with it.  I also
detailed how devfs tries to solve a number of these problems.  In
hindsight, I should have never mentioned the word, devfs, when talking
about udev.  I did so only because it seemed like a good place to start
with.  Most people understood what devfs is, and what it does.  To
compare udev against it, showing how udev was more powerful, and a more
complete solution to the problems people were having, seemed like a
natural comparison to me.

But no more.  I hereby never want to compare devfs and udev again.  With
the exception of this message...

The Problems:
 1) A static /dev is unwieldy and big.  It would be nice to only show
    the /dev entries for the devices we actually have running in the
    system.
 2) We are (well, were) running out of major and minor numbers for
    devices.
 3) Users want a way to name devices in a persistent fashion (i.e. "This
    disk here, must _always_ be called "boot_disk" no matter where in
    the scsi tree I put it", or "This USB camera must always be called
    "camera" no matter if I have other USB scsi devices plugged in or
    not.")
 4) Userspace programs want to know when devices are created or removed,
    and what /dev entry is associated with them.

The constraints:
 1) No policy in the kernel!
 2) Follow standards (like the LSB)
 3) must be small so embedded devices will use it.


So, how does devfs stack up to the above problems and constraints:
  Problems:
    1) devfs only shows the dev entries for the devices in the system.
    2) devfs does not handle the need for dynamic major/minor numbers
    3) devfs does not provide a way to name devices in a persistent
       fashion.
    4) devfs does provide a deamon that userspace programs can hook into
       to listen to see what devices are being created or removed.
  Constraints:
    1) devfs forces the devfs naming policy into the kernel.  If you
       don't like this naming scheme, tough.
    2) devfs does not follow the LSB device naming standard.
    3) devfs is small, and embedded devices use it.  However it is
       implemented in non-pagable memory.

Oh yeah, and there are the insolvable race conditions with the devfs
implementation in the kernel, but I'm not going to talk about them right
now, sorry.  See the linux-kernel archives if you care about them (and
if you use devfs, you should care...)

So devfs is 2 for 7, ignoring the kernel races.

And now for udev:
  Problems:
    1) using udev, the /dev tree only is populated for the devices that
       are currently present in the system.
    2) udev does not care about the major/minor number schemes.  If the
       kernel tomorrow switches to randomly assign major and minor numbers
       to different devices, it would work just fine (this is exactly
       what I am proposing to do in 2.7...)
    3) This is the main reason udev is around.  It provides the ability
       to name devices in a persistent manner.  More on that below.
    4) udev emits D-BUS messages so that any other userspace program
       (like HAL) can listen to see what devices are created or removed.
       It also allows userspace programs to query it's database to see
       what devices are present and what they are currently named as
       (providing a pointer into the sysfs tree for that specific device
       node.)
  Constraints:
    1) udev moves _all_ naming policies out of the kernel and into
       userspace.
    2) udev defaults to using the LSB device naming standard.  If users
       want to deviate away from this standard (for example when naming
       some devices in a persistent manner), it is easily possible to do
       so.
    3) udev is small (49Kb binary) and is entirely in userspace, which
       is swapable, and doesn't have to be running at all times.

Nice, 7 out of 7 for udev.  Makes you think the problems and constraints
were picked by a udev developer, right?  No, the problems and
constraints are ones I've seen over the years and so udev, along with
the kernel driver model and sysfs, were created to solve these real
problems.  I also have had the luxury to see the problems that the
current devfs implementation has, and have taken the time to work out
something that does not have those same problems.

So by just looking at the above descriptions, everyone should instantly
realize that udev is far better than devfs and start helping out udev
development, right?  Oh, you want more info, ok...

Back in May 2003 I released a very tiny version of udev that implemented
everything that devfs currently does, in about 6Kb of userspace code:
	http://marc.theaimsgroup.com/?l=linux-kernel&m=105003185331553

Yes, that's right, 6Kb.  So, you are asking, why are you still working
on udev if it did everything devfs did back in May 2003?  That's because
just managing static device nodes based on what the kernel calls the
devices is _not_ the primary goal of udev.  It's just a tiny side affect
of it's primary goal, the ability to never worry about major/minor
number assignments and provide the ability to achieve persistent device
names if wanted.

All the people wanting to bring up the udev vs. devfs argument go back
and read the previous paragraph.  Yes, all Gentoo users who keep filling
up my inbox with smoking emails, I mean you.

So, how well does udev solve it's goals:
  Prevent users from ever worrying about major/minor numbers
    And here you were, not knowing you ever needed to worry about
    major/minor numbers in the first place, right?  Ah, I see you
    haven't plugged in 2 USB printers and tried to figure out which
    printer was which /dev entry?  Or plugged in 4000 SCSI disks and
    tried to figure out how to access that 3642nd disk and what it was
    called in /dev.  Or plugged in a USB camera and a USB flash drive
    and then tried to download the pictures off of the flash drive by
    accident?

    As the above scenarios show, both desktop users and big iron users
    both need to not worry about which device is assigned to what
    major/minor device.
   
    udev doesn't care what major/minor number is assigned to a device.
    It merely takes the numbers that the kernel says it assigned to the
    device and creates a device node based on it, which the user can
    then use (if you don't understand the whole major/minor to device
    node issue, or even what a device node is, trust me, you don't
    really want to, go install udev and don't worry about it...)  As
    stated above, if the kernel decides to start randomly assigning
    major numbers to all devices, then udev will still work just fine.

  Provide a persistent device naming solution:
    Lots of people want to assign a specific name that they can talk to
    a device to, no matter where it is in the system, or what order they
    plugged the device in.  USB printers, SCSI disks, PCI sound cards,
    Firewire disks, USB mice, and lots of other devices all need to be
    assigned a name in a consistent manner (udev doesn't handle network
    devices, naming them is already a solved solution, using nameif).
    udev allows users to create simple rules to describe what device to
    name.  If users want to call a program running a large database
    half-way around the world, asking it what to name this device, it
    can.  We don't put the naming database into the kernel (like other
    Unix variants have), everything is in userspace, and easily
    accessible.  You can even run a perl script to name your device if
    you are that crazy...

    For more information on how to create udev rules to name devices,
    please see the udev man page, and look at the example udev rules
    that ship with the tarball.
 

So, convinced already why you should use udev instead of devfs?  No.
Ok, fine, I'm not forcing you to abandon your bloated, stifling policy,
nonextensible, end of life feature if you don't want to.  But please
don't bother me about it either, I don't care about devfs, only about
udev.

This is my last posting about this topic, all further emails sent to me
about why devfs is wonderful, and why are you making fun of this
wonderful, stable gift from the gods, will be gleefully ignored and
possibly posted in a public place where others can see.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 158+ messages in thread

end of thread, other threads:[~2004-01-12 20:59 UTC | newest]

Thread overview: 158+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-01-08 13:53 udev and devfs - The final word "Andrey Borzenkov" 
2004-01-08 15:40 ` Ian Kent
2004-01-08 17:26   ` Diego Calleja
2004-01-08 19:25     ` Andrey Borzenkov
2004-01-08 22:40       ` Alex Goddard
2004-01-09  7:03         ` "Andrey Borzenkov" 
2004-01-08 18:14   ` Alex Goddard
2004-01-08 18:35     ` Alex Goddard
2004-01-08 19:22     ` Andrey Borzenkov
2004-01-09  8:51 ` Helge Hafting
  -- strict thread matches above, loose matches on Subject: below --
2004-01-08 13:05 "Andrey Borzenkov" 
2004-01-06  1:20 Paul Zimmerman
     [not found] <fa.flhsork.uka2hg@ifi.uio.no>
     [not found] ` <fa.hv9hpq7.1l1q9p3@ifi.uio.no>
2004-01-01 19:53   ` walt
2004-01-01 21:53     ` Martin Schlemmer
2004-01-01 16:59 Shaheed
     [not found] <fa.af64864.ugabhg@ifi.uio.no>
     [not found] ` <fa.de7jae9.1jk0pjt@ifi.uio.no>
2003-12-31 22:17   ` walt
2004-01-01  2:03     ` Martin Schlemmer
2004-01-01  2:05       ` Martin Schlemmer
     [not found] <18Cz7-7Ep-7@gated-at.bofh.it>
2003-12-31  3:05 ` Pascal Schmidt
2003-12-31 19:23   ` Greg KH
2003-12-31 20:19     ` Rob Love
2003-12-31 22:01       ` Nathan Conrad
2003-12-31 22:20         ` Rob Love
2003-12-31 21:45           ` Tommi Virtanen
2003-12-31 23:10             ` Rob Love
2003-12-31 21:52               ` Tommi Virtanen
2004-01-02  0:17                 ` Hollis Blanchard
2004-01-02  0:36                   ` viro
2004-01-03  6:04                   ` Greg KH
2003-12-31 22:55           ` viro
2003-12-31 23:05             ` Rob Love
2003-12-31 23:48             ` Andreas Dilger
2004-01-07 10:15             ` Olaf Hering
2004-01-07 11:18               ` viro
2004-01-07 13:00                 ` Olaf Hering
2004-01-07 13:26                   ` viro
2004-01-07 13:27                     ` Olaf Hering
2004-01-01  0:15       ` Andries Brouwer
2004-01-01  0:31         ` Rob Love
2004-01-01 12:34           ` Rob Landley
2004-01-01 15:22             ` Rob Love
2004-01-01 15:48               ` Andries Brouwer
2004-01-01 15:54                 ` Rob Love
2004-01-02 20:42                   ` Linus Torvalds
2004-01-03  3:00                     ` Andries Brouwer
2004-01-03  4:46                       ` Linus Torvalds
2004-01-03 13:10                         ` Andries Brouwer
2004-01-03 22:27                           ` Linus Torvalds
2004-01-03 23:08                             ` Andries Brouwer
2004-01-04  1:16                               ` Mark Mielke
2004-01-04  1:54                                 ` Valdis.Kletnieks
2004-01-04 18:44                                   ` Mark Mielke
2004-01-04  2:09                               ` Linus Torvalds
2004-01-04  2:49                                 ` Andries Brouwer
2004-01-04  3:04                                   ` Linus Torvalds
2004-01-04 13:21                                     ` Andries Brouwer
2004-01-04 21:05                                       ` Linus Torvalds
2004-01-04 22:01                                         ` Andries Brouwer
2004-01-04 22:37                                           ` viro
2004-01-05  1:02                                             ` Mark Mielke
2004-01-05  2:24                                               ` Valdis.Kletnieks
2004-01-05  2:29                                             ` Andries Brouwer
2004-01-05  3:42                                               ` viro
2004-01-04 22:37                                           ` Helge Hafting
2004-01-04 23:35                                           ` Valdis.Kletnieks
2004-01-05  1:43                                             ` Jeremy Maitin-Shepard
2004-01-05  1:58                                               ` viro
2004-01-05  2:12                                                 ` Jeremy Maitin-Shepard
2004-01-05  2:52                                           ` Linus Torvalds
2004-01-05  3:06                                             ` David Lang
2004-01-05  3:48                                               ` Rob Landley
2004-01-05  4:52                                                 ` Trond Myklebust
2004-01-05 15:13                                                 ` Mark Mielke
2004-01-05 16:36                                                   ` Andreas Schwab
2004-01-05 22:18                                                     ` Mark Mielke
2004-01-05  3:07                                             ` Daniel Jacobowitz
2004-01-05  3:33                                               ` Linus Torvalds
2004-01-05  3:50                                                 ` viro
2004-01-05  4:02                                                   ` Linus Torvalds
2004-01-05  4:38                                                     ` viro
2004-01-05  4:52                                                       ` Linus Torvalds
2004-01-05  6:11                                                         ` viro
2004-01-05  7:47                                                         ` Greg KH
2004-01-05 11:15                                                           ` Vojtech Pavlik
2004-01-05 20:11                                                             ` Theodore Ts'o
2004-01-05 21:06                                                               ` Vojtech Pavlik
2004-01-05 22:22                                                                 ` Theodore Ts'o
2004-01-06  0:14                                                                 ` Rob Landley
2004-01-11 22:12                                                         ` Ed L Cashin
2004-01-05  5:26                                                       ` Eric W. Biederman
2004-01-05  7:39                                                       ` Greg KH
2004-01-07  9:57                                                     ` Pavel Machek
2004-01-05 12:27                                                 ` Andries Brouwer
2004-01-05 16:13                                                   ` Linus Torvalds
2004-01-05 17:29                                                     ` Vojtech Pavlik
2004-01-05 17:33                                                       ` Linus Torvalds
2004-01-05 17:52                                                       ` Davide Libenzi
2004-01-05 18:03                                                         ` Linus Torvalds
2004-01-05 18:09                                                         ` Hugo Mills
2004-01-05 19:10                                                         ` Paul Rolland
2004-01-05 19:52                                                     ` Andries Brouwer
2004-01-05 20:38                                                       ` Linus Torvalds
2004-01-05 22:17                                                         ` Shawn
2004-01-05 22:25                                                           ` Mark Mielke
2004-01-05 23:05                                                             ` Shawn
2004-01-05 23:23                                                               ` Shawn
2004-01-06  0:43                                                               ` Greg KH
2004-01-06  0:53                                                                 ` Shawn
2004-01-05 23:13                                                         ` Andries Brouwer
2004-01-05 23:32                                                           ` Linus Torvalds
2004-01-06  0:59                                                             ` viro
2004-01-06  1:17                                                               ` Linus Torvalds
2004-01-06  4:28                                                                 ` viro
2004-01-06  5:07                                                                   ` Linus Torvalds
2004-01-06  1:06                                                             ` Andries Brouwer
2004-01-06 15:00                                                               ` Mark Mielke
2004-01-06  0:00                                                           ` Greg KH
2004-01-06  1:41                                                             ` Andries Brouwer
2004-01-07 17:14                                                               ` Greg KH
2004-01-06  0:31                                                           ` Rob Landley
2004-01-06  7:14                                                       ` Vojtech Pavlik
2004-01-05  7:44                                             ` James H. Cloos Jr.
2004-01-05  7:45                                               ` Nigel Cunningham
2004-01-05 11:01                                                 ` Robin Rosenberg
2004-01-05 12:39                                                   ` Nigel Cunningham
2004-01-07 13:39                                                     ` Robin Rosenberg
2004-01-07 17:16                                                       ` Nigel Cunningham
2004-01-05  9:06                                               ` Valdis.Kletnieks
2004-01-05  4:15                                           ` Peter Chubb
2004-01-05  4:42                                             ` Linus Torvalds
2004-01-01 19:43             ` Kai Henningsen
2004-01-02  7:26               ` Rob Landley
2004-01-04  8:57                 ` Greg KH
2004-01-04  9:43                   ` Rob Landley
2004-01-02  0:17             ` Maciej Zenczykowski
     [not found]               ` <20040102103104.GA28168@mark.mielke.cc>
2004-01-03  6:07                 ` Greg KH
2004-01-03  6:51                   ` Valdis.Kletnieks
2004-01-03 11:57                     ` Ian Kent
2004-01-03 22:08                     ` Greg KH
2004-01-07 10:23             ` Olaf Hering
2004-01-01 23:14           ` Rob
2004-01-02  3:53             ` Tyler Hall
2004-01-01 16:17     ` Pascal Schmidt
2004-01-01 20:03       ` Greg KH
2003-12-31  0:29 Greg KH
2003-12-31  0:53 ` Prakash K. Cheemplavam
2003-12-31 19:17   ` Greg KH
2004-01-02 16:45     ` Shawn
2003-12-31 12:43 ` Paulo Marques
2004-01-01  1:18 ` Helge Hafting
2004-01-03  5:59   ` Greg KH
2004-01-03 15:22     ` Helge Hafting
2004-01-03 21:18       ` viro
2004-01-03 22:11       ` Greg KH
     [not found]     ` <20040103140140.3b848e9f.witukind@nsbm.kicks-ass.org>
2004-01-03 22:16       ` Greg KH
2004-01-03 22:33         ` Christoph Hellwig
2004-01-02 17:54 ` Andreas Jellinghaus
2004-01-02 18:19   ` Shawn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).