linux-kernel.vger.kernel.org archive mirror
* Documentation for sysfs, hotplug, and firmware loading.
@ 2007-07-17 21:03 Rob Landley
  2007-07-17 21:55 ` Randy Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-17 21:03 UTC
  To: linux-kernel
  Cc: Greg KH, Michael-Luke Jones, Krzysztof Halasa, Rod Whitby,
	Russell King, khc, david

[-- Attachment #1: Type: text/plain, Size: 860 bytes --]

Here's some sysfs/hotplug/firmware loading documentation I wrote.  I finally 
tracked down the netlink bits to finish it up, so I can send it out to the 
world.

What's wrong with it? :)

Note, I still need to actually confirm that /sbin/hotplug can be called from 
initramfs by a statically linked device to load firmware before init gets 
spawned.  It should work, and was explicitly discussed as a design goal a 
year or two back, but it might need a bugfix patch to actually, you know, 
_work_.

(P.S.  I'd cc Kay Sievers on this, but he's still spam-blocking my email.
Thanks to Kay for answering lots of questions about this at OLS, and to Frank
Sorenson who wrote a netlink implementation of mdev back in 2005 that I dug
up to figure out how that part works.)
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

[-- Attachment #2: sysfs.txt --]
[-- Type: text/plain, Size: 9060 bytes --]

hotplug and firmware loading with sysfs.
========================================

The 2.6.x Linux kernels export a device tree through sysfs, which is a
synthetic filesystem generally mounted at "/sys".  Among other things,
this filesystem tells userspace what hardware is available, so userspace tools
(such as udev or mdev) can dynamically populate a "/dev" directory with device
nodes representing the currently available hardware.

Notification when hardware is inserted or removed is provided by the
hotplug mechanism.  Linux provides two hotplug interfaces: /sbin/hotplug and
netlink.

The combination of sysfs and hotplug obsoleted the older "devfs", which was
removed from the 2.6.16 kernel.

Device nodes:
=============

Sysfs exports major and minor numbers for device nodes with which to populate
/dev via mknod(2).  These major and minor numbers are found in files named
"dev", which contain two colon-separated ASCII decimal numbers followed by
exactly one newline.  For example:

  $ cat /sys/class/mem/zero/dev
  1:5

Note that the name of the directory containing a dev entry is usually the
traditional name for the device node.  (The above entry is for "/dev/zero".)
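
The "dev" file format described above can be split without spawning sed.
This is only a minimal sketch: read_dev_numbers is a made-up helper name,
and it assumes the file holds exactly one MAJOR:MINOR line as described.

```shell
# Print "MAJOR MINOR" for the sysfs "dev" file given as $1.
# read_dev_numbers is a hypothetical helper, not an existing tool.
read_dev_numbers() {
  local devfile="$1" major minor
  # IFS=: splits on the colon; read drops the trailing newline.
  IFS=: read -r major minor < "$devfile" || return 1
  # Both fields must be present.
  [ -n "$major" ] && [ -n "$minor" ] || return 1
  # Reject anything that is not plain decimal digits.
  case "$major$minor" in
    *[!0-9]*) return 1 ;;
  esac
  echo "$major $minor"
}
```

Called as "read_dev_numbers /sys/class/mem/zero/dev", it should print the
two numbers separated by a space, or fail if the file looks malformed.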

Entries for block devices are found at the following locations:

  /sys/block/*/dev
  /sys/block/*/*/dev

Entries for char devices are found at the following locations:

  /sys/bus/*/devices/*/dev
  /sys/class/*/*/dev

A very simple bash script to populate /dev from /sys (without addressing
ownership or permissions of the resulting /dev nodes) might look like:

  #!/bin/bash

  # Populate block devices

  for i in /sys/block/*/dev /sys/block/*/*/dev
  do
    if [ -f "$i" ]
    then
      MAJOR=$(sed 's/:.*//' < "$i")
      MINOR=$(sed 's/.*://' < "$i")
      # Anchor the match so only the trailing "/dev" component is removed.
      DEVNAME=$(echo "$i" | sed -e 's@/dev$@@' -e 's@.*/@@')
      mknod "/dev/$DEVNAME" b "$MAJOR" "$MINOR"
    fi
  done

  # Populate char devices

  for i in /sys/bus/*/devices/*/dev /sys/class/*/*/dev
  do
    if [ -f "$i" ]
    then
      MAJOR=$(sed 's/:.*//' < "$i")
      MINOR=$(sed 's/.*://' < "$i")
      # Anchor the match: an unanchored "/dev" would hit "/devices" first.
      DEVNAME=$(echo "$i" | sed -e 's@/dev$@@' -e 's@.*/@@')
      mknod "/dev/$DEVNAME" c "$MAJOR" "$MINOR"
    fi
  done

Hotplug:
========

The hotplug mechanism asynchronously notifies userspace when hardware is
inserted, removed, or undergoes a similar significant state change.  Linux
provides two interfaces to hotplug; the kernel can spawn a usermode helper
process, or it can send a message to an existing daemon listening to a netlink
socket.

-- Usermode helper

The usermode helper hotplug mechanism spawns a new process to handle each
hotplug event.  Each such helper process belongs to the root user (UID 0) and
is a child of the init task (PID 1).  The kernel spawns one process per hotplug
event, supplying environment variables to each new process describing that
particular hotplug event.  By default the kernel spawns instances of
"/sbin/hotplug", but this default can be changed by writing a new path into
"/proc/sys/kernel/hotplug" (assuming /proc is mounted).

A simple bash script to record variables from hotplug events might look like:

  #!/bin/bash

  env >> /filename

It's possible to disable the usermode helper hotplug mechanism (by writing an
empty string into /proc/sys/kernel/hotplug), but there's little reason to
do this since a usermode helper won't be spawned if /sbin/hotplug doesn't
exist, and negative dentries will record the fact it doesn't exist after
the first lookup attempt.

-- Netlink

A daemon listening to the netlink socket receives a packet of data for each
hotplug event, containing the same information a usermode helper would receive
in environment variables.

The netlink packet contains a set of null terminated text lines.
Each line but the first contains a KEYWORD=VALUE pair defining a hotplug
event variable.  The first line of the netlink packet combines the $ACTION
and $DEVPATH values, separated by an @ (at sign).

Here's a C program to print hotplug netlink events to stdout:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <sys/poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <linux/types.h>
#include <linux/netlink.h>

void die(char *s)
{
	write(2,s,strlen(s));
	exit(1);
}

int main(int argc, char *argv[])
{
	struct sockaddr_nl nls;
	struct pollfd pfd;
	char buf[512];

	// Open hotplug event netlink socket

	memset(&nls,0,sizeof(struct sockaddr_nl));
	nls.nl_family = AF_NETLINK;
	nls.nl_pid = getpid();
	nls.nl_groups = -1;

	pfd.events = POLLIN;
	pfd.fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
	if (pfd.fd==-1)
		die("Not root\n");

	// Listen to netlink socket

	if (bind(pfd.fd, (void *)&nls, sizeof(struct sockaddr_nl)))
		die("Bind failed\n");
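	while (-1!=poll(&pfd, 1, -1)) {
		// Leave room for a terminating NUL so strlen() can't run past the data.
		int i, len = recv(pfd.fd, buf, sizeof(buf)-1, MSG_DONTWAIT);
		if (len == -1)
			continue;	// Ignore failed reads and wait for the next event.
		buf[len] = 0;

		// Print the data to stdout.
		i = 0;
		while (i<len) {
			printf("%s\n", buf+i);
			i += strlen(buf+i)+1;
		}
	}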
	}
	die("poll\n");

	// Dear gcc: shut up.
	return 0;
}
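
The packet layout described above (null terminated strings, with
ACTION@DEVPATH first) can also be picked apart in shell once the raw bytes
have been captured somewhere.  A rough sketch, where decode_uevent is a
made-up name and the packet is assumed to arrive on stdin with no newlines
embedded in any value:

```shell
# Decode one hotplug netlink packet from stdin.
# decode_uevent is a hypothetical helper name.
decode_uevent() {
  local line first=1
  # Turn each NUL terminator into a newline so read can split the strings.
  tr '\0' '\n' | while IFS= read -r line; do
    if [ "$first" = 1 ]; then
      first=0
      # First string is ACTION@DEVPATH.
      echo "action=${line%%@*}"
      echo "devpath=${line#*@}"
    else
      # Remaining strings are KEYWORD=VALUE pairs.
      echo "${line%%=*} -> ${line#*=}"
    fi
  done
}
```

The first string is split at the first @ sign; every later string is split
at its first = sign, so values containing = survive intact.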

Hotplug event variables:
========================

Every hotplug event should provide at least the following variables:

  ACTION
    The current hotplug action: "add" to add the device...
    [QUESTION: Full list of actions?]

  DEVPATH
    Path under /sys at which this device's sysfs directory can be found.
    If $DEVPATH begins with /block/ the event refers to a block device,
    otherwise it refers to a char device.

  SUBSYSTEM
    If this is "block", it's a block device.  Anything else is a char device.

The following variables are also provided for some devices:

  MAJOR and MINOR
    If these are present, a device node can be created in /dev for this device.
    Some devices (such as network cards) don't generate a /dev node.

    [QUESTION: Any way to get the default name?]

  DRIVER
    If present, a suggested driver (module) for handling this device.  No
    relation to whether or not a driver is currently handling the device.

  INTERFACE and IFINDEX
    When SUBSYSTEM=net, these variables indicate the name of the interface
    and a unique integer for the interface.  (Note that "INTERFACE=eth0" could
    be paired with "IFINDEX=2" because eth0 isn't guaranteed to come before lo
    and the count doesn't start at 0.)

  FIRMWARE
    The system is requesting firmware for the device.  See "Firmware loading"
    below.
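
As a sketch of how a helper might use these variables, the function below
prints the mknod command it would run for an "add" event.  plan_node is a
made-up name.  It assumes that an event carrying MAJOR and MINOR in the
"block" subsystem wants a block node and any other such event wants a char
node, and that the last component of DEVPATH is a usable node name; treat
both as assumptions rather than rules.

```shell
# Print the mknod command for the current hotplug event, if any.
# Reads ACTION, DEVPATH, SUBSYSTEM, MAJOR and MINOR from the environment.
# plan_node is a hypothetical helper name.
plan_node() {
  local kind=c
  # Only "add" events create nodes in this sketch.
  [ "$ACTION" = add ] || return 0
  # Events without MAJOR and MINOR (network cards, say) get no node.
  [ -n "$MAJOR" ] && [ -n "$MINOR" ] || return 0
  if [ "$SUBSYSTEM" = block ]; then
    kind=b
  fi
  # Assumption: the sysfs directory name doubles as the node name.
  echo "mknod /dev/${DEVPATH##*/} $kind $MAJOR $MINOR"
}
```

Printing the command instead of running it keeps the sketch harmless;
ownership and permissions are not addressed at all.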

Injecting events into hotplug via "uevent":
===========================================

Events can be injected into the hotplug mechanism through sysfs via the
"uevent" files.  Each directory in sysfs containing a "dev" file should also
contain a "uevent" file.

Note that in newer kernel versions, "uevent" is readable.  Reading from uevent
provides the set of "extra" variables associated with this event.

Firmware loading
================

If the hotplug variable FIRMWARE is set, the kernel is requesting firmware
for a device (identified by $DEVPATH).  To provide the firmware to the kernel,
do the following:

  echo 1 > /sys/$DEVPATH/loading
  cat /path/to/$FIRMWARE > /sys/$DEVPATH/data
  echo 0 > /sys/$DEVPATH/loading

Note that "echo -1 > /sys/$DEVPATH/loading" will cancel the firmware load
and return an error to the kernel, and /sys/class/firmware/timeout contains a
timeout (in seconds) for firmware loads.
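
The three steps above, plus the cancel case, can be wrapped in one
function.  This is a sketch only: load_firmware is a made-up name, and the
directory and image path are whatever the caller passes in.

```shell
# load_firmware SYSFS_DIR IMAGE
# Feed IMAGE to the firmware loader files under SYSFS_DIR.
# load_firmware is a hypothetical helper name.
load_firmware() {
  local dir="$1" image="$2"
  if [ ! -r "$image" ]; then
    # Cancel the request so the kernel need not wait for the timeout.
    echo -1 > "$dir/loading"
    return 1
  fi
  # Start the load, copy the image, then mark the load complete.
  echo 1 > "$dir/loading" &&
  cat "$image" > "$dir/data" &&
  echo 0 > "$dir/loading"
}
```

A hotplug helper might call it as
load_firmware "/sys/$DEVPATH" "/path/to/$FIRMWARE" when FIRMWARE is set.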

See Documentation/firmware_class/ for more information.

Loading firmware for statically linked devices
==============================================

An advantage of the usermode helper hotplug mechanism is that if initramfs
contains an executable /sbin/hotplug, it can be called even before the kernel
runs init.  This allows /sbin/hotplug to supply firmware (out of initramfs) to
statically linked device drivers.  (The netlink mechanism requires a daemon to
listen to a socket, and such a daemon cannot be spawned before init runs.)

For licensing reasons, binary-only firmware should not be linked into the
kernel image, but instead placed in an externally supplied initramfs which
can be passed to the Linux kernel through the old initrd mechanism.
See Documentation/filesystems/ramfs-rootfs-initramfs.txt for details.

stable_api_nonsense:
====================

Note: Sysfs exports a lot of kernel internal state, and the maintainers of
sysfs do not believe that exposing information to userspace for use by
userspace programs constitutes an "API" that must be "stable".  The sysfs
infrastructure is maintained by the author of
Documentation/stable_api_nonsense.txt, who seems to believe it applies to
userspace as well.  Therefore, at best only a subset of the information in
sysfs can be considered stable from version to version.

The information documented here should remain stable.  Some other parts of
sysfs are documented under Documentation/API, although that directory comes
with a warning that anything documented there can go away after two years.
Any other information exported by sysfs should be considered debugging info
at best, and probably shouldn't have been exported at all since it's not a
"stable API" intended for use by actual programs.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-17 21:03 Documentation for sysfs, hotplug, and firmware loading Rob Landley
@ 2007-07-17 21:55 ` Randy Dunlap
  2007-07-18  7:58 ` Cornelia Huck
  2007-07-18 10:33 ` Kay Sievers
  2 siblings, 0 replies; 27+ messages in thread
From: Randy Dunlap @ 2007-07-17 21:55 UTC
  To: Rob Landley
  Cc: linux-kernel, Greg KH, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Tue, 17 Jul 2007 17:03:31 -0400 Rob Landley wrote:

> Here's some sysfs/hotplug/firmware loading documentation I wrote.  I finally 
> tracked down the netlink bits to finish it up, so I can send it out to the 
> world.
> 
> What's wrong with it? :)

(1.  It's an attachment. :)


2.  Here's a C program to print hotplug nelink events to stdout:

s/nelink/netlink/

3.  See Documentation/firmware_class for more information.

Perhaps add a '/' after "firmware_class" to indicate a directory?

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-17 21:03 Documentation for sysfs, hotplug, and firmware loading Rob Landley
  2007-07-17 21:55 ` Randy Dunlap
@ 2007-07-18  7:58 ` Cornelia Huck
  2007-07-18 17:39   ` Rob Landley
  2007-07-18 10:33 ` Kay Sievers
  2 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2007-07-18  7:58 UTC
  To: Rob Landley
  Cc: linux-kernel, Greg KH, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Tue, 17 Jul 2007 17:03:31 -0400,
Rob Landley <rob@landley.net> wrote:

> Here's some sysfs/hotplug/firmware loading documentation I wrote.  I finally 
> tracked down the netlink bits to finish it up, so I can send it out to the 
> world.
> 
> What's wrong with it? :)

OK, some comments from me:


> Entires for block devices are found at the following locations:
  ^^^^^^^ typo
>
>  /sys/block/*/dev
>  /sys/block/*/*/dev

Note that this will change to /sys/class/block/ in the future.

> Entries for char devices are found at the following locations:
> 
>  /sys/bus/*/devices/*/dev
>  /sys/class/*/*/dev

Uh, that is actually the generic location?

It may be enough (and less confusing) to just state that the dev
attribute will belong to the associated "class" device sitting
under /sys/class/ (with the current exception of /sys/block/).

(And how about referring to Documentation/sysfs-rules.txt?)

> A simple bash script to record variables from hotplug events might look like:

Using a bash script is actually a very bad idea in the general case. It
can lead to OOM very quickly on large installations.

> It's possible to disable the usermode helper hotplug mechanism (by writing an
> empty string into /proc/sys/kernel/hotplug), but there's little reason to
> do this since a usermode helper won't be spawned if /sbin/hotplug doesn't
> exist, and negative dentries will record the fact it doesn't exist after
> the first lookup attempt.

AFAIK, the normal mode of operation is to use the hotplug mechanism
during early setup but to disable it once you have a listener on
netlink in place. My systems have an empty /proc/sys/kernel/hotplug.


>   ACTION
>     The current hotplug action: "add" to add the device...
>     [QUESTION: Full list of actions?]

Would be good. See lib/kobject_uevent.c.

>   DEVPATH
>     Path under /sys at which this device's sysfs directory can be found.
>     If $DEVPATH begins with /block/ the event refers to a block device,
>     otherwise it refers to a char device.

Huh? That's just the path in sysfs. And there's more than block and
char :) Check SUBSYSTEM for what your device actually is.

>   SUBSYSTEM
>     If this is "block", it's a block device.  Anything else is a char device.

No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
bus (like 'pci').

>   DRIVER
>     If present, a suggested driver (module) for handling this device.  No
>     relation to whether or not a driver is currently handling the device.

No, this actually is the current driver.

> stable_api_nonsense:
> ====================
> 
> Note: Sysfs exports a lot of kernel internal state, and the maintainers of
> sysfs do not believe that exposing information to userspace for use by
> userspace programs constitues an "API" that must be "stable".  The sysfs
> infrastructure is maintained by the author of
> Documentation/stable_api_nonsense.txt, who seems to believe it applies to
> userspace as well.  Therefore, at best only a subset of the information in
> sysfs can be considered stable from version to version.
> 
> The information documented here should remain stable.  Some other parts
> of sysfs are documented under Documentation/API, although that
> directory comes with a warning that anything documented there can go
> away after two years. Any other information exported by sysfs should be
> considered debugging info at best, and probably shouldn't have been
> exported at all since it's not a "stable API" intended for use by
> actual programs.

Uh. Please refer to Documentation/sysfs-rules.txt.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-17 21:03 Documentation for sysfs, hotplug, and firmware loading Rob Landley
  2007-07-17 21:55 ` Randy Dunlap
  2007-07-18  7:58 ` Cornelia Huck
@ 2007-07-18 10:33 ` Kay Sievers
  2 siblings, 0 replies; 27+ messages in thread
From: Kay Sievers @ 2007-07-18 10:33 UTC
  To: Rob Landley
  Cc: linux-kernel, Greg KH, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Tue, 2007-07-17 at 17:03 -0400, Rob Landley wrote:
> Here's some sysfs/hotplug/firmware loading documentation I wrote.  I finally 
> tracked down the netlink bits to finish it up, so I can send it out to the 
> world.
> 
> What's wrong with it? :)

A lot. :)

> Note, I still need to actually confirm that /sbin/hotplug can be called from 
> initramfs by a statically linked device to load firmware before init gets 
> spawned.  It should work, and was explicitly discussed as a design goal a 
> year or two back, but it might need a bugfix patch to actually, you know, 
> _work_.
> 
> (P.S.  I'd cc Kay Sievers on this, but he's still spam-blocking my email.  
> Thanks to Kay for answer lots of questions about this at OLS, and to Fank 
> Sorenson who wrote a netlink implementation of mdev back in 2005 that I dug 
> up to figure out how that part works.)

hotplug and firmware loading with sysfs.
========================================

The 2.6.x Linux kernels export a device tree through sysfs, which is a
synthetic filesystem generally mounted at "/sys".  Among other things,
this filesystem tells userspace what hardware is available, so userspace tools
(such as udev or mdev) can dynamically populate a "/dev" directory with device
nodes representing the currently available hardware.

@@ It exposes the current state of kernel devices, it's not necessarily
@@ related to any hardware.

Notification when hardware is inserted or removed is provided by the
hotplug mechanism.  Linux provides two hotplug interfaces: /sbin/hotplug and
netlink.

@@ Same here, it's about kernel device creation and not hardware "insertion".

The combination of sysfs and hotplug obsoleted the older "devfs", which was
removed from the 2.6.16 kernel.

@@ Udev replaced devfs, not hotplug or sysfs.

Device nodes:
=============

Sysfs exports major and minor numbers for device nodes with which to populate
/dev via mknod(2).  These major and minor numbers are found in files named
"dev", which contain two colon separated ascii decimal numbers followed by
exactly one newline.  I.E.

  $ cat /sys/class/mem/zero/dev
  1:5

Note that the name of the directory containing a dev entry is usually the
traditional name for the device node.  (The above entry is for "/dev/zero".)

Entires for block devices are found at the following locations:

  /sys/block/*/dev
  /sys/block/*/*/dev

Entries for char devices are found at the following locations:

  /sys/bus/*/devices/*/dev
  /sys/class/*/*/dev

@@ Wrong! /sys/class/block/* will be block-devices. Please read the stuff mentioned
@@ in the document Greg has posted. You _must_ always determine the subsystem.
@@ This rule is plain wrong.

A very simple bash script to populate /dev from /sys (without addressing
ownership or permissions of the resulting /dev nodes) might look like:

  #!/bin/bash

  # Populate block devices

  for i in /sys/block/*/dev /sys/block/*/*/dev
  do
    if [ -f $i ]
    then
      MAJOR=$(sed 's/:.*//' < $i)
      MINOR=$(sed 's/.*://' < $i)
      DEVNAME=$(echo $i | sed -e 's@/dev@@' -e 's@.*/@@')
      mknod /dev/$DEVNAME b $MAJOR $MINOR
    fi
  done

  # Populate char devices

  for i in /sys/bus/*/devices/*/dev /sys/class/*/*/dev
  do
    if [ -f $i ]
    then
      MAJOR=$(sed 's/:.*//' < $i)
      MINOR=$(sed 's/.*://' < $i)
      DEVNAME=$(echo $i | sed -e 's@/dev@@' -e 's@.*/@@')
      mknod /dev/$DEVNAME c $MAJOR $MINOR
    fi
  done

@@ That will fail badly. And again, please don't confuse
@@ "class" and "char", they are not the same, and never have been.

Hotplug:
========

The hotplug mechanism asynchronously notifies userspace when hardware is
inserted, removed, or undergoes a similar significant state change.  Linux
provides two interfaces to hotplug; the kernel can spawn a usermode helper
process, or it can send a message to an existing daemon listening to a netlink
socket.

-- Usermode helper

The usermode helper hotplug mechanism spawns a new process to handle each
hotplug event.  Each such helper process belongs to the root user (UID 0) and
is a child of the init task (PID 1).  The kernel spawns one process per hotplug
event, supplying environment variables to each new process describing that
particular hotplug event.  By default the kernel spawns instances of
"/sbin/hotplug", but this default can be changed by writing a new path into
"/proc/sys/kernel/hotplug" (assuming /proc is mounted).

@@ Are you sure it's a child of init?

A simple bash script to record variables from hotplug events might look like:

  #!/bin/bash

  env >> /filename

It's possible to disable the usermode helper hotplug mechanism (by writing an
empty string into /proc/sys/kernel/hotplug), but there's little reason to
do this since a usermode helper won't be spawned if /sbin/hotplug doesn't
exist, and negative dentries will record the fact it doesn't exist after
the first lookup attempt.

@@ That negative dentries will exist is no reason not to disable it.
@@ Almost no system is using /sbin/hotplug anymore, so I don't see the
@@ point of this sentence. If you don't need to, why run all the code
@@ and let the kernel clone a process that is known to fail?
@@ Suggesting that just doesn't make sense.

-- Netlink

A daemon listening to the netlink socket receives a packet of data for each
hotplug event, containing the same information a usermode helper would receive
in environment variables.

The netlink packet contains a set of null terminated text lines.
Each line but the first contains a KEYWORD=VALUE pair defining a hotplug
event variable.  The first line of the netlink packet combines the $ACTION
and $DEVPATH values, separated by an @ (at sign).

Here's a C program to print hotplug nelink events to stdout:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <sys/poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <linux/types.h>
#include <linux/netlink.h>

void die(char *s)
{
        write(2,s,strlen(s));
        exit(1);
}

int main(int argc, char *argv[])
{
        struct sockaddr_nl nls;
        struct pollfd pfd;
        char buf[512];

        // Open hotplug event netlink socket

        memset(&nls,0,sizeof(struct sockaddr_nl));
        nls.nl_family = AF_NETLINK;
        nls.nl_pid = getpid();
        nls.nl_groups = -1;

@@ You are not supposed to listen to anything other than group 1!

        pfd.events = POLLIN;
        pfd.fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
        if (pfd.fd==-1)
                die("Not root\n");

        // Listen to netlink socket

        if (bind(pfd.fd, (void *)&nls, sizeof(struct sockaddr_nl)))
                die("Bind failed\n");
        while (-1!=poll(&pfd, 1, -1)) {
                int i, len = recv(pfd.fd, buf, sizeof(buf), MSG_DONTWAIT);
                if (len == -1) //die("recv\n");

                // Print the data to stdout.
                i = 0;
                while (i<len) {
                        printf("%s\n", buf+i);
                        i += strlen(buf+i)+1;
                }
        }
        die("poll\n");

        // Dear gcc: shut up.
        return 0;
}

Hotplug event variables:
========================

Every hotplug event should provide at least the following variables:

  ACTION
    The current hotplug action: "add" to add the device...
    [QUESTION: Full list of actions?]

  DEVPATH
    Path under /sys at which this device's sysfs directory can be found.
    If $DEVPATH begins with /block/ the event refers to a block device,
    otherwise it refers to a char device.

@@ What a devpath starts with is not defined, so it's wrong to document
@@ it that way. If you don't have the event env, or you start from
@@ /sys/{class,bus,block,subsystem}, devices have a "subsystem" link,
@@ which is the only valid way to retrieve the "subsystem" from sysfs.
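
Reading that link could look something like the sketch below.
device_subsystem is a made-up name, and readlink(1) is assumed to be
available (it is in coreutils and busybox, but it is not in POSIX).

```shell
# Print the subsystem of the device directory given as $1 by reading its
# "subsystem" symlink.  device_subsystem is a hypothetical helper name.
device_subsystem() {
  local link
  # Fail if the directory has no "subsystem" link.
  link=$(readlink "$1/subsystem") || return 1
  # The last path component of the link target names the subsystem.
  echo "${link##*/}"
}
```

The function only looks at the link itself, so it does not depend on where
in the tree the device directory currently lives.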

  SUBSYSTEM
    If this is "block", it's a block device.  Anything else is a char device.

The following variables are also provided for some devices:

  MAJOR and MINOR
    If these are present, a device node can be created in /dev for this device.
    Some devices (such as network cards) don't generate a /dev node.

    [QUESTION: Any way to get the default name?]

@@ The directory name is the default name.

  DRIVER
    If present, a suggested driver (module) for handling this device.  No
    relation to whether or not a driver is currently handling the device.

@@ Wrong, it's the currently bound driver, no "suggestion" at all.
@@ And don't confuse modules and drivers they are not always the same
@@ strings. Some modules have multiple drivers.

  INTERFACE and IFINDEX
    When SUBSYSTEM=net, these variables indicate the name of the interface
    and a unique integer for the interface.  (Note that "INTERFACE=eth0" could
    be paired with "IFINDEX=2" because eth0 isn't guaranteed to come before lo
    and the count doesn't start at 0.)

@@ What could be paired?

  FIRMWARE
    The system is requesting firmware for the device.  See "Firmware loading"
    below.

Injecting events into hotplug via "uevent":
===========================================

Events can be injected into the hotplug mechanism through sysfs via the
"uevent" files.  Each directory in sysfs containing a "dev" file should also
contain a "uevent" file.

@@ The name of the action should be written to "uevent".
@@ Currently only "add" is supported.
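
A rough sketch of writing "add" to the uevent files below a directory
follows.  replay_add_events is a made-up name, and walking a whole tree
with find is a simplification for illustration; it is not a description of
how udevtrigger selects devices.

```shell
# replay_add_events ROOT
# Write "add" to every regular file named "uevent" below ROOT.
# replay_add_events is a hypothetical helper name.
replay_add_events() {
  local root="$1" f
  [ -d "$root" ] || return 1
  # find does not follow symlinks by default, so each real directory is
  # visited once and symlink loops are not entered.
  find "$root" -type f -name uevent | while IFS= read -r f; do
    echo add > "$f"
  done
}
```

Pointing it at a small subtree rather than all of /sys keeps the number of
generated events manageable.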

Note that in newer kernel versions, "uevent" is readable.  Reading from uevent
provides the set of "extra" variables associated with this event.

Firmware loading
================

If the hotplug variable FIRMWARE is set, the kernel is requesting firmware
for a device (identified by $DEVPATH).  To provide the firmware to the kernel,
do the following:

  echo 1 > /sys/$DEVPATH/loading
  cat /path/to/$FIRMWARE > /sys/$DEVPATH/data
  echo 0 > /sys/$DEVPATH/loading

Note that "echo -1 > /sys/$DEVPATH/loading" will cancel the firmware load
and return an error to the kernel, and /sys/class/firmware/timeout contains a
timeout (in seconds) for firmware loads.

See Documentation/firmware_class for more information.

Loading firmware for statically linked devices
==============================================

An advantage of the usermode helper hotplug mechanism is that if initramfs
contains an executable /sbin/hotplug, it can be called even before the kernel
runs init.  This allows /sbin/hotplug to supply firmware (out of initramfs) to
statically linked device drivers.  (The netlink mechanism requires a daemon to
listen to a socket, and such a daemon cannot be spawned before init runs.)

@@ There is no difference between /sbin/hotplug and netlink, this
@@ advantage simply doesn't exist. Most systems run udevd in initramfs. 
@@
@@ The only advantage of /sbin/hotplug is that you easily run OOM, because
@@ of too many events running at the same time. :)

For licensing reasons, binary-only firmware should not be linked into the
kernel image, but instead placed in an externally supplied initramfs which
can be passed to the Linux kernel through the old initrd mechanism.
See Documentation/filesystems/ramfs-rootfs-initramfs.txt for details.

@@ Firmware licensing issues definitely don't belong here.

stable_api_nonsense:
====================

Note: Sysfs exports a lot of kernel internal state, and the maintainers of
sysfs do not believe that exposing information to userspace for use by
userspace programs constitues an "API" that must be "stable".  The sysfs
infrastructure is maintained by the author of

@@ You still seem to misunderstand what this all is about.
@@ It's not that we "believe" something, it's that sysfs is
@@ a construct where the location of information can move around
@@ at runtime.
@@ It's a dynamic tree of devices, and not values at a defined
@@ location. That makes it totally different from anything else.
@@ The next second, a device can be somewhere else and that
@@ is nothing you could compare to any other API, so we need
@@ different rules here. It's not about "believing in something".
@@ Again, please read the document Greg posted.

Documentation/stable_api_nonsense.txt, who seems to believe it applies to
userspace as well.  Therefore, at best only a subset of the information in
sysfs can be considered stable from version to version.

@@ Only well defined ways of retrieving information from sysfs are
@@ expected to produce predictable results. It's a dump of the kernel
@@ state and you can't read it reliably without following some rules.

The information documented here should remain stable.  Some other parts of
sysfs are documented under Documentation/API, although that directory comes
with a warning that anything documented there can go away after two years.
Any other information exported by sysfs should be considered debugging info
at best, and probably shouldn't have been exported at all since it's not a
"stable API" intended for use by actual programs.

@@ It's not debugging, but you have to reach this information
@@ in a specific way. There is not much information we don't need.
@@ But we can't just expect it to look the same, even on the same
@@ system, a second later sysfs can look differently, devices can get
@@ renamed or move around dynamically.

I'm very unhappy with this text. I think you need to rework most of
this document to be useful. You really can't ignore the rules to access
the information in sysfs. You try to point things down to a static
behavior, but that isn't what sysfs is about, or what it provides.

Please try to understand the intention of the document Greg posted,
it may be wrong in the wording, but it still contains a lot of things
you will need to describe to document sysfs.

Maybe I misunderstand you, and you only want to describe how to reliably
initialize /dev from sysfs, but then please change the title of the
document and describe what the udev or HAL code is doing at coldplug time,
instead of calling it "Documentation for sysfs". And again, as I have already
told you many times, read the udevtrigger code to see how to "get all
devices" properly; your example scripts are wrong.

Thanks,
Kay



* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-18  7:58 ` Cornelia Huck
@ 2007-07-18 17:39   ` Rob Landley
  2007-07-18 23:33     ` Kay Sievers
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-18 17:39 UTC
  To: Cornelia Huck
  Cc: linux-kernel, Greg KH, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Wednesday 18 July 2007 3:58:57 am Cornelia Huck wrote:
> On Tue, 17 Jul 2007 17:03:31 -0400,
>
> Rob Landley <rob@landley.net> wrote:
> > Here's some sysfs/hotplug/firmware loading documentation I wrote.  I
> > finally tracked down the netlink bits to finish it up, so I can send it
> > out to the world.
> >
> > What's wrong with it? :)
>
> OK, some comments from me:
> > Entires for block devices are found at the following locations:
>
>   ^^^^^^^ typo

Got it.

> >  /sys/block/*/dev
> >  /sys/block/*/*/dev
>
> Note that this will change to /sys/class/block/ in the future.

At OLS, Kay Sievers said in a future version they were going to move it 
to "/sys/subsystem/block", which I can't document right now because no 
current kernel does this, and that path will never work with any previous 
kernel, but there should be a compatibility symlink from the old path to the
new one.  He never mentioned /sys/class/block.

To all of this, I would like to humbly ask:

PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!

I don't care where it is.  Just put it somewhere I can find it, and keep it 
there.  All this gratuitous moving stuff around serves NO PURPOSE other than 
to break userspace.  I'm trying to document this so that the next time you 
go "oh wait, it should be at /sys/tarantula/fruitbat" I can show that you're 
breaking an existing documented userspace API.

There's a kernel config option to make symlinks from the old 
location.  /sys/block makes as much sense as any other location, and it's 
what's there now.

> > Entries for char devices are found at the following locations:
> >
> >  /sys/bus/*/devices/*/dev
> >  /sys/class/*/*/dev
>
> Uh, that is actually the generic location?

It's what Kay Sievers and Greg KH told me at OLS when I tracked them down to 
ask.  I've also experimentally verified it working on Ubuntu 7.04.  That was 
cut and pasted from Kay's email, and it works today.

> It may be enough (and less confusing) to just state that the dev
> attribute will belong to the associated "class" device sitting
> under /sys/class/ (with the current exception of /sys/block/).

Nope.  If you recurse down under /sys/class following symlinks, you go into an 
endless loop bouncing off of /sys/devices and getting pointed back.  If you 
don't follow symlinks, it works fine up until about 2.6.20 at which point 
things that were previously directories BECAME symlinks because the 
directories got moved, and it all broke.

Which is why I want it documented where to look for these suckers.  Just give 
me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.

This document is trying to document just enough information to make hotplug 
work using sysfs (which includes firmware loading if necessary).

> (And how about referring to Documentation/sysfs-rules.txt?)

Because there isn't one in 2.6.22, and I've been writing this file on and off 
for a month as I tracked down various bits of information?

> > A simple bash script to record variables from hotplug events might look
> > like:
>
> Using a bash script is actually a very bad idea in the general case. It
> can lead to OOM very quickly on large installations.

I know.  I'm just trying to show people how to do it.  Notice that this script 
doesn't DO anything, it just dumps the variables (and proves 
that /sbin/hotplug got called).  You're worried about the scalability of a 
debugging script.

> > It's possible to disable the usermode helper hotplug mechanism (by
> > writing an empty string into /proc/sys/kernel/hotplug), but there's
> > little reason to do this since a usermode helper won't be spawned if
> > /sbin/hotplug doesn't exist, and negative dentries will record the fact
> > it doesn't exist after the first lookup attempt.
>
> AFAIK, the normal mode of operation is to use the hotplug mechanism
> during early setup but to disable it once you have a listener on
> netlink in place. My systems have an empty /proc/sys/kernel/hotplug.

And I documented how to blank it.  However, lots of embedded systems stick 
with one mechanism because having two is something they won't waste space on.  
(I note that mdev is about 5k and can be made to use very little memory per 
instance.)

Also, you can hold off on all device probing until a netlink daemon is up and 
then echo "add" to all the "uevent" entries you find.  At one point, udev did 
this, dunno what it's doing now.
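
A minimal sketch of that replay step, not taken from udev or mdev.  It
assumes every device directory sits somewhere under <sys_root>/devices and
carries a writable "uevent" file; the name retrigger() and the choice of
starting directory are illustrative assumptions.

```python
import os


def retrigger(sys_root="/sys", action="add"):
    """Write `action` to each 'uevent' file below sys_root/devices.

    Returns the list of files that were written.  os.walk() does not
    follow symlinks by default, so the walk cannot loop.
    """
    written = []
    for dirpath, _dirnames, filenames in os.walk(os.path.join(sys_root, "devices")):
        if "uevent" not in filenames:
            continue
        path = os.path.join(dirpath, "uevent")
        try:
            with open(path, "w") as f:
                f.write(action)
        except OSError:
            # The device may have vanished, or the file may be read-only.
            continue
        written.append(path)
    return written
```

On a live system this would be run as root once the netlink listener is up;
pointing sys_root at a scratch directory allows a dry run.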

> >   ACTION
> >     The current hotplug action: "add" to add the device...
> >     [QUESTION: Full list of actions?]
>
> Would be good. See lib/kobject_uevent.c.

Ah, I left a TODO item in there that I forgot to mark TODO. :)

(Rummage)  Seems to be "add, remove, change, online, offline, move"?

I can list 'em.  Now I'm vaguely curious what generates online and offline 
events (MII transceiver state transitions on a network card, or does this 
have to do with power saving modes?)  And I have no idea what the difference 
between "change" and "move" is....

> >   DEVPATH
> >     Path under /sys at which this device's sysfs directory can be found.
> >     If $DEVPATH begins with /block/ the event refers to a block device,
> >     otherwise it refers to a char device.
>
> Huh? That's just the path in sysfs. And there's more than block and
> char :) Check SUBSYSTEM for what your device actually is.

If you are doing mknod, you need three pieces of information:
1) Major, 2) Minor, 3) Block or Char device.  That's pretty much it.  If 
you're trying to populate /dev you need that info.
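
Those three pieces map directly onto the arguments of mknod(2).  A minimal
sketch, assuming the sysfs "dev" attribute holds "major:minor" followed by a
newline; mknod_args() is a hypothetical helper and the 0600 permission bits
are an arbitrary choice.

```python
import os
import stat


def mknod_args(dev_text, is_block):
    """Turn the text of a sysfs 'dev' file into (mode, device) for os.mknod()."""
    major_s, minor_s = dev_text.strip().split(":")
    node_type = stat.S_IFBLK if is_block else stat.S_IFCHR
    return node_type | 0o600, os.makedev(int(major_s), int(minor_s))
```

Creating the node is then os.mknod(path, mode, device), which needs root.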

> >   SUBSYSTEM
> >     If this is "block", it's a block device.  Anything else is a char
> > device.
>
> No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
> bus (like 'pci').

Do you make a /dev node for either one?

I'm trying to, at minimum, document what you pass to mknod.  I consider it 
important to know.

> >   DRIVER
> >     If present, a suggested driver (module) for handling this device.  No
> >     relation to whether or not a driver is currently handling the device.
>
> No, this actually is the current driver.

I've had it suggest drivers for devices that didn't have any loaded, and I had 
it _not_ specify drivers for devices that were loaded.  (I checked.)

Could I get some clarification here?

> > stable_api_nonsense:
> > ====================
> >
> > Note: Sysfs exports a lot of kernel internal state, and the maintainers
> > of sysfs do not believe that exposing information to userspace for use by
> > userspace programs constitutes an "API" that must be "stable".  The sysfs
> > infrastructure is maintained by the author of
> > Documentation/stable_api_nonsense.txt, who seems to believe it applies to
> > userspace as well.  Therefore, at best only a subset of the information
> > in sysfs can be considered stable from version to version.
> >
> > The information documented here should remain stable.  Some other parts
> > of sysfs are documented under Documentation/ABI, although that
> > directory comes with a warning that anything documented there can go
> > away after two years. Any other information exported by sysfs should be
> > considered debugging info at best, and probably shouldn't have been
> > exported at all since it's not a "stable API" intended for use by
> > actual programs.
>
> Uh. Please refer to Documentation/sysfs-rules.txt.

Where do I find this?  It's not in 2.6.22, and -rc1 isn't out yet, so I assume 
you mean it's in the random git snapshot du jour?

(Rummages...)

Sigh.  Missed that thread, and yes I've been working on this document since 
well before that was posted...

Ah yes.  I replied to that when it was first posted.  It's still "here's a 
list of things NOT to do" rather than telling you what you CAN do.  I'm 
trying to document what you can do.

Useful documentation is not "Doing THIS is forbidden.  Doing THIS is 
forbidden.  Doing THIS is forbidden.  What are you allowed to do?  Guess!  
Oh, and anything I didn't explicitly mention could change at any time.  Have 
fun."

Sysfs CAN export a stable API.  It may only be a subset of what it's 
exporting, but it can still do so.

Sigh.  I'll dig through this to try to find useful information amongst 
the "thou shalt nots", but I've got to catch a plane in a couple hours.  
Maybe on the flight...

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-18 17:39   ` Rob Landley
@ 2007-07-18 23:33     ` Kay Sievers
  2007-07-20  5:14       ` Rob Landley
  2007-07-18 23:40     ` Greg KH
  2007-07-19  8:16     ` Cornelia Huck
  2 siblings, 1 reply; 27+ messages in thread
From: Kay Sievers @ 2007-07-18 23:33 UTC (permalink / raw)
  To: Rob Landley
  Cc: Cornelia Huck, linux-kernel, Greg KH, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On 7/18/07, Rob Landley <rob@landley.net> wrote:
> > >  /sys/block/*/dev
> > >  /sys/block/*/*/dev
> >
> > Note that this will change to /sys/class/block/ in the future.
>
> At OLS, Kay Sievers said in a future version they were going to move it
> to "/sys/subsystem/block", which I can't document right now because no
> current kernel does this, and that path will never work with any previous
> kernel, but there should be a compatibility symlink from the old path to the
> new one.

That will be the case.

> He never mentioned /sys/class/block.

So? How about reading your email?
  http://marc.info/?l=linux-kernel&m=118260305012165&w=2

You seem to lack the very basic skills to collect the information needed
to do the job of documenting something.

> To all of this, I would like to humbly ask:
>
> PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!

Man, you totally miss the point.

> > > Entries for char devices are found at the following locations:
> > >
> > >  /sys/bus/*/devices/*/dev
> > >  /sys/class/*/*/dev
> >
> > Uh, that is actually the generic location?
>
> It's what Kay Sievers and Greg KH told me at OLS when I tracked them down to
> ask.  I've also experimentally verified it working on Ubuntu 7.04.  That was
> cut and pasted from Kay's email, and it works today.

That is still true, but it still does not tell you the type of node to
create, as you seem to insist on.

> > It may be enough (and less confusing) to just state that the dev
> > attribute will belong to the associated "class" device sitting
> > under /sys/class/ (with the current exception of /sys/block/).
>
> Nope.  If you recurse down under /sys/class following symlinks, you go into an
> endless loop bouncing off of /sys/devices and getting pointed back.  If you
> don't follow symlinks, it works fine up until about 2.6.20 at which point
> things that were previously directories BECAME symlinks because the
> directories got moved, and it all broke.

That's total nonsense.

> Which is why I want it documented where to look for these suckers.  Just give
> me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.
>
> This document is trying to document just enough information to make hotplug
> work using sysfs (which includes firmware loading if necessary).
>
> > (And how about referring to Documentation/sysfs-rules.txt?)
>
> Because there isn't one in 2.6.22, and I've been writing this file on and off
> for a month as I tracked down various bits of information?

I invested a lot of time explaining stuff to you in email and
personally, but really, that seems just like a total waste of time. I
will not reply to any of your mails until you have proven to have read
the udevtrigger code, and got a clue how to do stuff reliably, and get
the basic knowledge needed to document it.

Kay


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-18 17:39   ` Rob Landley
  2007-07-18 23:33     ` Kay Sievers
@ 2007-07-18 23:40     ` Greg KH
  2007-07-21  0:37       ` Rob Landley
  2007-07-19  8:16     ` Cornelia Huck
  2 siblings, 1 reply; 27+ messages in thread
From: Greg KH @ 2007-07-18 23:40 UTC (permalink / raw)
  To: Rob Landley
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Wed, Jul 18, 2007 at 01:39:53PM -0400, Rob Landley wrote:
> PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!
> 
> I don't care where it is.  Just put it somewhere I can find it, and keep it 
> there.  All this gratuitous moving stuff around serves NO PURPOSE other than 
> to break userspace.  I'm trying to document this so that the next time you 
> go "oh wait, it should be at /sys/tarantula/fruitbat" I can show that you're 
> breaking an existing documented userspace API.
> 
> There's a kernel config option to make symlinks from the old 
> location.  /sys/block makes as much sense as any other location, and it's 
> what's there now.

Read the sysfs documentation file we just added, it describes how this
is all documented and should be used.  So well that I do not think you
need to try to document it again.

thanks,

greg k-h


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-18 17:39   ` Rob Landley
  2007-07-18 23:33     ` Kay Sievers
  2007-07-18 23:40     ` Greg KH
@ 2007-07-19  8:16     ` Cornelia Huck
  2007-07-21  0:21       ` Rob Landley
  2 siblings, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2007-07-19  8:16 UTC (permalink / raw)
  To: Rob Landley
  Cc: linux-kernel, Greg KH, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Wed, 18 Jul 2007 13:39:53 -0400,
Rob Landley <rob@landley.net> wrote:

> Nope.  If you recurse down under /sys/class following symlinks, you go into an 
> endless loop bouncing off of /sys/devices and getting pointed back.  If you 
> don't follow symlinks, it works fine up until about 2.6.20 at which point 
> things that were previously directories BECAME symlinks because the 
> directories got moved, and it all broke.

I have no idea what you're doing.

> Which is why I want it documented where to look for these suckers.  Just give 
> me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.

See Documentation/sysfs-rules.txt.

> This document is trying to document just enough information to make hotplug 
> work using sysfs (which includes firmware loading if necessary).
> 
> > (And how about referring to Documentation/sysfs-rules.txt?)
> 
> Because there isn't one in 2.6.22, and I've been writing this file on and off 
> for a month as I tracked down various bits of information?

That was a _suggestion_.

> I know.  I'm just trying to show people how to do it.  Notice that this script 
> doesn't DO anything, it just dumps the variables (and proves 
> that /sbin/hotplug got called).  You're worried about the scalability of a 
> debugging script.

If you use bash scripts as examples, people will write bash scripts.

> (Rummage)  Seems to be "add, remove, change, online, offline, move"?
> 
> I can list 'em.  Now I'm vaguely curious what generates online and offline 
> events (MII transceiver state transitions on a network card, or does this 
> have to do with power saving modes?)  And I have no idea what the difference 
> between "change" and "move" is....

"change" - something about the device has changed
"move" - the device is in a different position in the tree now

You may want to grep for the usage...

> 
> > >   DEVPATH
> > >     Path under /sys at which this device's sysfs directory can be found.
> > >     If $DEVPATH begins with /block/ the event refers to a block device,
> > >     otherwise it refers to a char device.
> >
> > Huh? That's just the path in sysfs. And there's more than block and
> > char :) Check SUBSYSTEM for what your device actually is.
> 
> If you are doing mknod, you need three pieces of information:
> 1) Major, 2) Minor, 3) Block or Char device.  That's pretty much it.  If 
> you're trying to populate /dev you need that info.
> 
> > >   SUBSYSTEM
> > >     If this is "block", it's a block device.  Anything else is a char
> > > device.
> >
> > No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
> > bus (like 'pci').
> 
> Do you make a /dev node for either one?
> 
> I'm trying to, at minimum, document what you pass to mknod.  I consider it 
> important to know.

The problem is that your information is wrong. Imagine someone reading
this document, thinking "cool, I'll create a char node if
SUBSYSTEM!=block" and subsequently getting completely confused about
all those SUBSYSTEM==pci events.

> 
> > >   DRIVER
> > >     If present, a suggested driver (module) for handling this device.  No
> > >     relation to whether or not a driver is currently handling the device.
> >
> > No, this actually is the current driver.
> 
> I've had it suggest drivers for devices that didn't have any loaded, and I had 
> it _not_ specify drivers for devices that were loaded.  (I checked.)

The code disagrees with you. If a driver matches and probing succeeds,
it will be specified, otherwise not. Maybe you were checking the wrong
devices?

> Ah yes.  I replied to that when it was first posted.  It's still "here's a 
> list of things NOT to do" rather than telling you what you CAN do.  I'm 
> trying to document what you can do.
> 
> Useful documentation is not "Doing THIS is forbidden.  Doing THIS is 
> forbidden.  Doing THIS is forbidden.  What are you allowed to do?  Guess!  
> Oh, and anything I didn't explicitly mention could change at any time.  Have 
> fun."

It _does_ specify what you may rely on. Don't rely on anything else.

> Sysfs CAN export a stable API.  It may only be a subset of what it's 
> exporting, but it can still do so.

And that is exactly what sysfs-rules.txt is doing. I don't understand
your problem.

If you think that getting this information from sysfs-rules.txt could
be made easier, do a patch against it.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-18 23:33     ` Kay Sievers
@ 2007-07-20  5:14       ` Rob Landley
  2007-07-20  7:00         ` Greg KH
  0 siblings, 1 reply; 27+ messages in thread
From: Rob Landley @ 2007-07-20  5:14 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Cornelia Huck, linux-kernel, Greg KH, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Wednesday 18 July 2007 7:33:19 pm Kay Sievers wrote:
> On 7/18/07, Rob Landley <rob@landley.net> wrote:
> > > >  /sys/block/*/dev
> > > >  /sys/block/*/*/dev
> > >
> > > Note that this will change to /sys/class/block/ in the future.
> >
> > At OLS, Kay Sievers said in a future version they were going to move it
> > to "/sys/subsystem/block", which I can't document right now because no
> > current kernel does this, and that path will never work with any previous
> > kernel, but there should be a compatibility symlink from the old path to
> > the new one.
>
> That will be the case.
>
> > He never mentioned /sys/class/block.
>
> So? How about reading your email?
>   http://marc.info/?l=linux-kernel&m=118260305012165&w=2

It wasn't in the notes you sent me from OLS, and I didn't compare against the 
earlier one because I thought the OLS notes were complete.

> Yes, in this order (if you want to use it, but /sys/block will still be
> there):
>   /sys/subsystem/block/devices/*
>   /sys/class/block/*
>   /sys/block/*/*

"if you want to use it" said to me that sys/class/block/* was optional, so I 
didn't add it to the document I was writing.

> You seem to lack the very basic skills to collect the information needed
> to do the job of documenting something.

I've gotten a lot of contradictory information, a lot of gratuitous changes 
from previous versions, a lot of notice of as of yet unmerged plans, and a 
lot of useless information (mostly in the form of "don't"s) while researching 
this topic.  I'm trying to find a useful subset.

I asked you what I needed when we met in person, and you didn't 
put /sys/class/block/* in the list.

Sysfs is almost unique in that examining the implementation tells me NOTHING 
about how to use it.  This is a defect in sysfs that I'm attempting to 
rectify by writing documentation about what you can rely on when trying to 
use it for hotplug and firmware loading.  This is a specific, limited use 
that I'm familiar with the requirements for.

Is there anything in /sys/class/block that _isn't_ in /sys/block?  Does "if 
you want to use it, but /sys/block will still be there" NOT mean, as I 
assumed at the time, that I could safely ignore it?  (My impression from the 
meeting at OLS was that adding /sys/block to /sys/class/block had been just 
an idea rejected in favor of adding it to /sys/subsystem/block.  I note that 
neither my Ubuntu 7.04 laptop nor the 2.6.22 system I built has either 
a /sys/class/block or a /sys/subsystem/block, so anything written attempting 
to use that won't work on any currently deployed Linux system.  You can't use 
it today, and will never be able to use it on any kernel version deployed 
today.)

> > To all of this, I would like to humbly ask:
> >
> > PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!
>
> Man, you totally miss the point.

I want to document a stable API, including the subset of sysfs that will 
remain stable.  The "point" appears to be that there isn't one because sysfs 
is "special", and udev should be in the kernel source tarball.

I'm trying to write down here the minimal information needed to find the "dev" 
nodes to populate /dev.  There's no functional reason I'm aware of for them 
to keep moving around.

> > > > Entries for char devices are found at the following locations:
> > > >
> > > >  /sys/bus/*/devices/*/dev
> > > >  /sys/class/*/*/dev
> > >
> > > Uh, that is actually the generic location?
> >
> > It's what Kay Sievers and Greg KH told me at OLS when I tracked them down
> > to ask.  I've also experimentally verified it working on Ubuntu 7.04. 
> > That was cut and pasted from Kay's email, and it works today.
>
> That is still true, but it still does not tell you the type of node to
> create, as you seem to insist on.

I don't insist on it, mknod insists on it.  You cannot mknod a dev node 
without specifying block or char.

You're saying that sysfs should provide major and minor numbers without 
anywhere specifying "char" or "block", meaning the major and minor numbers 
cannot be _used_.  I am insisting on getting the third piece of information 
without which "major" and "minor" are useless.

I asked very specifically about this at OLS, several times.  What you're 
telling me now seems to contradict what you told me then.

What I documented works in today's kernel.  You've talked about adding new 
mechanisms that won't work in today's kernel, which I'm not worrying about as 
long as the mechanisms that work with today's kernel continue to work.  Now 
you say you're going to break today's kernel by adding block devices 
to /sys/class, which I got the impression was NOT going to happen at OLS 
(that it was going to move to /sys/subsystem but that the /sys/block symlink would 
still track it).  I specifically asked "what paths do I need to look at to 
find char devices" and "what paths do I need to look at to find block 
devices", and the paths in the documentation are the ones I got when I asked.

If block is going to move to /sys/class, I can put in a warning about this 
pending breakage in the documentation, and modify my example code to filter 
it out.

> > > It may be enough (and less confusing) to just state that the dev
> > > attribute will belong to the associated "class" device sitting
> > > under /sys/class/ (with the current exception of /sys/block/).
> >
> > Nope.  If you recurse down under /sys/class following symlinks, you go
> > into an endless loop bouncing off of /sys/devices and getting pointed
> > back.  If you don't follow symlinks, it works fine up until about 2.6.20
> > at which point things that were previously directories BECAME symlinks
> > because the directories got moved, and it all broke.
>
> That's total nonsense.

Which part, the "following symlinks produced an endless loop" or 
the "directories turned into symlinks so not following them broke?"

Let's see...

According to my blog, Frank Sorensen first sent me a C port of my /dev 
populating script on December 12, 2005.  The current kernel at the time was 
2.6.14, so grab that, build User Mode Linux...  Huh, it won't build with gcc 
4.1.2.  Or 3.4.  Ok, defconfig?  Nope, that wants a stack check symbol?  
Let's see...  Ah, google says add -fno-stack-protector to CFLAGS.  Right...  
Fire it up under qemu, "mount -t sysfs /sys /sys", and:

In 2.6.14, /sys/block/hda/device points 
to ../../devices/pci0000:00/0000:00:01.1/ide0/0.0

/sys/block/hda/device/block points to ../../../../../block/hda

So in 2.6.14 you could 
go /sys/block/hda/device/block/device/block/device/block... endlessly, which 
is the reason I wrote mdev not to follow symlinks but to instead only look at 
actual subdirectories.  (It uses the same code to traverse down 
beneath /sys/block and /sys/class to look for "dev" entries.)  This works 
fine up through the 2.6.20 in ubuntu 7.04, where everything 
in /sys/class/tty/* is still a subdirectory.  But in 2.6.22, /sys/class/tty/* 
is all symlinks.  Hence the code that was working before changed, due to 
something that worked fine for a couple years but broke because it wasn't 
considered part of a stable API.

Which part of this is "total nonsense"?
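
One way to sidestep both failure modes described above is to follow
symlinks but remember which directories have already been visited.  A sketch
under stated assumptions: directories are identified by their
(st_dev, st_ino) pair, find_dev_entries() is a hypothetical name, and
nothing here is claimed to match what mdev or udevtrigger actually does.

```python
import os


def find_dev_entries(top):
    """Return paths of all 'dev' files reachable from top, visiting each
    real directory only once even when symlinks form a cycle."""
    seen = set()
    found = []
    stack = [top]
    while stack:
        directory = stack.pop()
        try:
            st = os.stat(directory)  # os.stat() resolves symlinks
            names = os.listdir(directory)
        except OSError:
            continue  # dangling link or device removed mid-scan
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue  # already reached through another path
        seen.add(key)
        for name in names:
            path = os.path.join(directory, name)
            if name == "dev" and os.path.isfile(path):
                found.append(path)
            elif os.path.isdir(path):
                stack.append(path)
    return found
```

Because a directory is keyed by identity rather than by path, the
/sys/block/hda/device/block/device/... chain terminates after one pass, and
it makes no difference whether an entry is a real directory or a symlink.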

> > Which is why I want it documented where to look for these suckers.  Just
> > give me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.
> >
> > This document is trying to document just enough information to make
> > hotplug work using sysfs (which includes firmware loading if necessary).
> >
> > > (And how about referring to Documentation/sysfs-rules.txt?)
> >
> > Because there isn't one in 2.6.22, and I've been writing this file on and
> > off for a month as I tracked down various bits of information?
>
> I invested a lot of time explaining stuff to you in email and
> personally, but really, that seems just like a total waste of time.

I wrote up a document.  Started writing it before OLS, incorporated the 
information I got from you while at OLS, and took a while tracking down some 
old code doing netlink so I could include enough for people to puzzle out how 
that works.

I would have bounced earlier unfinished drafts off of you, but you were 
spam-blocking my email.  (Might still be, I don't know.  This is why I wanted 
to talk to you in person at OLS.)

> I will not reply to any of your mails

And I _can't_ reply to yours off-list because of your out of control spam 
filter.

> until you have proven to have read 
> the udevtrigger code,

I read the udev code when it was first posted.  I read it again 20 versions 
later, and read it again 20 versions after that.  I couldn't COMPILE the darn 
thing for its first ~40 releases, the code got ripped out and re-written 
several times, I watched as it grew and then threw out libsysfs.

So essentially you're saying "well read it again, we've finally got it right 
now"?

> and got a clue how to do stuff reliably, and get  
> the basic knowledge needed to document it.

Because talking to you and having you email me the notes from this 
conversation did not provide the basic knowledge needed to document hotplug 
and firmware loading.  Nor did asking for feedback on the document I wrote 
up.  Thanks ever so much.

I point out that udev changes from version to version, so that running an old 
version of udev against a new kernel has been known to break.  Udev was more 
or less completely rewritten three times while I was still paying attention 
to it.  Reading the udev code and seeing what it's doing struck me as about 
as likely to reveal a stable API as reading the kernel source, or 
experimenting with sysfs from userspace.  (Both of which I've _done_ at 
various points, and it keeps changing.)

Are you saying that the current version of udev will work with all future 
kernels, and thus if I can figure out what udev is doing today, I can just 
document that as the stable API?

> Kay

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-20  5:14       ` Rob Landley
@ 2007-07-20  7:00         ` Greg KH
  2007-07-20  7:54           ` Cornelia Huck
  2007-07-21  6:23           ` Rob Landley
  0 siblings, 2 replies; 27+ messages in thread
From: Greg KH @ 2007-07-20  7:00 UTC (permalink / raw)
  To: Rob Landley
  Cc: Kay Sievers, Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Fri, Jul 20, 2007 at 01:14:27AM -0400, Rob Landley wrote:
> Is there anything in /sys/class/block that _isn't_ in /sys/block?

No.

> Does "if you want to use it, but /sys/block will still be there" NOT
> mean, as I assumed at the time, that I could safely ignore it?

Ignore what?  /sys/block?  If you see /sys/class/block, then yes, you
can ignore it as they are just symlinks back to each other.

> (My impression from the meeting at OLS was that adding /sys/block to
> /sys/class/block had been just an idea rejected in favor of adding it
> to /sys/subsystem/block.

No, the /sys/class/block is in -mm and has been for some time.

> I note that neither my Ubuntu 7.04 laptop nor the 2.6.22 system I
> built has either a /sys/class/block or a /sys/subsystem/block, so
> anything written attempting to use that won't work on any currently
> deployed Linux system.  You can't use it today, and will never be able
> to use it on any kernel version deployed today.)

Not true at all, it works just fine here on my machines, and on all
distros released in the past year or so as tested out by a lot of users.

The only reason it isn't in Linus's tree just yet, is that some very old
mkinitrd programs don't seem to like it (we are talking Fedora Core 3
based distros, not Fedora itself.)  People are trying to work out what
the proper fix is for userspace there when they get the time.

So, expect the change to show up in 2.6.24.

> > > To all of this, I would like to humbly ask:
> > >
> > > PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!
> >
> > Man, you totally miss the point.
> 
> I want to document a stable API, including the subset of sysfs that will 
> remain stable.  The "point" appears to be that there isn't one because sysfs 
> is "special", and udev should be in the kernel source tarball.

What?  Since when does udev have to be in the kernel source tarball?
Who ever said that?

The issue here is that if you follow the rules specified in
Documentation/sysfs-rules.txt and Documentation/ABI/*/sysfs-*, you should
be just fine.  Ignoring them, as you have done in your examples, will
cause problems.

> I'm trying to write down here the minimal information needed to find the "dev" 
> nodes to populate /dev.  There's no functional reason I'm aware of for them 
> to keep moving around.

The issue is that the devices themselves keep moving around in the sysfs
tree all the time as systems are dynamic and change.

Again, look at the udevtrigger program for a simple way to achieve this
/dev population that you so desire.

But also realize that sysfs is much bigger than just trying to get the
information to create a /dev tree.

> > > > > Entries for char devices are found at the following locations:
> > > > >
> > > > >  /sys/bus/*/devices/*/dev
> > > > >  /sys/class/*/*/dev
> > > >
> > > > Uh, that is actually the generic location?
> > >
> > > It's what Kay Sievers and Greg KH told me at OLS when I tracked them down
> > > to ask.  I've also experimentally verified it working on Ubuntu 7.04. 
> > > That was cut and pasted from Kay's email, and it works today.
> >
> > That is still true, but it still does not tell you the type of node to
> > create, as you seem to insist on.
> 
> I don't insist on it, mknod insists on it.  You cannot mknod a dev node 
> without specifying block or char.
> 
> You're saying that sysfs should provide major and minor numbers without 
> anywhere specifying "char" or "block", meaning the major and minor numbers 
> cannot be _used_.  I am insisting on getting the third piece of information 
> without which "major" and "minor" are useless.
> 
> I asked very specifically about this at OLS, several times.  What you're 
> telling me now seems to contradict what you told me then.

Here's the rule:
	If the SUBSYSTEM is "block", it's a block device.  Otherwise
	it's a char device.

But also realize that the majority of events you will get have nothing
to do with device nodes.  I think you are forgetting this fact.
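
The rule above, combined with the reminder that most events carry no device
node, can be sketched as follows.  The MAJOR and MINOR variable names are an
assumption here (this message does not list them), and node_for_event() is a
hypothetical helper.

```python
def node_for_event(env):
    """Map a hotplug event's variables to ('b' or 'c', major, minor).

    Returns None when the event carries no device number, which is the
    common case (for example, events with SUBSYSTEM=pci).
    """
    if "MAJOR" not in env or "MINOR" not in env:
        return None
    kind = "b" if env.get("SUBSYSTEM") == "block" else "c"
    return kind, int(env["MAJOR"]), int(env["MINOR"])
```

Checking for a device number first keeps the "anything else is a char device"
shortcut from being applied to events that need no node at all.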

> If block is going to move to /sys/class, I can put in a warning about this 
> pending breakage in the documentation, and modify my example code to filter 
> it out.

It's not a "breakage", we are preserving a symlink.  The point is that
you should not rely on the fact that /sys/block will be there in the
future, as the documentation I pointed to above describes.

> > > > It may be enough (and less confusing) to just state that the dev
> > > > attribute will belong to the associated "class" device sitting
> > > > under /sys/class/ (with the current exception of /sys/block/).
> > >
> > > Nope.  If you recurse down under /sys/class following symlinks, you go
> > > into an endless loop bouncing off of /sys/devices and getting pointed
> > > back.  If you don't follow symlinks, it works fine up until about 2.6.20
> > > at which point things that were previously directories BECAME symlinks
> > > because the directories got moved, and it all broke.
> >
> > That's total nonsense.
> 
> Which part, the "following symlinks produced an endless loop" or 
> the "directories turned into symlinks so not following them broke?"
> 
> Let's see...
> 
> According to my blog, Frank Sorensen first sent me a C port of my /dev 
> populating script on December 12, 2005.  The current kernel at the time was 
> 2.6.14, so grab that, build user Mode Linux...  Huh, it won't build with gcc 
> 4.1.2.  Or 3.4.  Ok, defconfig?  Nope, that wants a stack check symbol?  
> Let's see...  Ah, google says add -fno-stack-protector to CFLAGS.  Right...  
> Fire it up under qemu, "mount -t sysfs /sys /sys", and:
> 
> In 2.6.14, /sys/block/hda/device points 
> to ../../devices/pci0000:00/0000:00:01.1/ide0/0.0
> 
> /sys/block/hda/device/block points to ../../../../../block/hda
> 
> So in 2.6.14 you could 
> go /sys/block/hda/device/block/device/block/device/block... endlessly, which 
> is the reason I wrote mdev not to follow symlinks but to instead only look at 
> actual subdirectories.

That was the problem right there.  Why would you ever want to traverse
symlinks blindly without realizing what you were walking?  You can't
just run 'find' on sysfs and expect to not get caught in endless loops,
as the goal of the different parts of sysfs is to be able to start in
one place, and figure out all of the needed information from there.

For example, if you have a device, you can get the subsystem it belongs
to, the driver bound to it, and other stuff.  If you start with a
driver, you can get the devices it binds.  Can you see the circle
already?

So, you need to watch what you are trying to find, and if you do that,
you never will get caught in circles.  We never had that problem in udev
at all, as we just work with what was passed to us, not blindly try to
walk the whole sysfs tree.

Please use the proper context in order to get the information you need.
And at any time, a directory can turn into a symlink in order to keep
the same information available.
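
To make the symlink cycle described above concrete, here is a minimal
sketch of a walk that follows symlinks but remembers every real directory
it has already visited, so a device -> block -> device loop terminates.
This is not how udev or mdev work; the function name find_dev_entries and
the choice to return resolved paths are assumptions made for illustration.

```python
import os


def find_dev_entries(root):
    """Return sorted real paths of 'dev' files reachable from root.

    Symlinked directories are followed, but each real directory is
    visited only once, which breaks cycles such as
    block/hda/device/block/device/...
    """
    seen = set()      # real paths of directories already visited
    found = []
    stack = [root]
    while stack:
        real = os.path.realpath(stack.pop())
        if real in seen:
            continue  # already walked via another path (or a loop)
        seen.add(real)
        try:
            entries = sorted(os.listdir(real))
        except OSError:
            continue  # unreadable or vanished directory
        for name in entries:
            child = os.path.join(real, name)
            if name == "dev" and os.path.isfile(child):
                found.append(child)
            elif os.path.isdir(child):  # isdir() follows symlinks
                stack.append(child)
    return sorted(found)
```

Pointed at a real /sys this would wander well beyond the starting
directory, which is the "walking blindly" cost mentioned above; it only
shows that the loop itself is avoidable.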

> (It uses the same code to traverse down 
> beneath /sys/block and /sys/class to look for "dev" entries.)  This works 
> fine up through the 2.6.20 in ubuntu 7.04, where everything 
> in /sys/class/tty/* is still a subdirectory.  But in 2.6.22, /sys/class/tty/* 
> is all symlinks.  Hence the code that was working before changed, due to 
> something that worked fine for a couple years but broke because it wasn't 
> considered part of a stable API.
> 
> Which part of this is "total nonsense"?

Your code :)

> > until you have proven to have read 
> > the udevtrigger code,
> 
> I read the udev code when it was first posted.  I read it again 20 versions 
> later, and read it again 20 versions after that.  I couldn't COMPILE the darn 
> thing for its first ~40 releases, the code got ripped out and re-written 
> several times, I watched as it grew and then threw out libsysfs.

You could not build it?  Why not?  Did you send me a patch for this
problem that was major enough to keep you from using the project?

> So essentially you're saying "well read it again, we've finally got it right 
> now"?

Not at all, we are saying to look at how to achieve what you are trying
to achieve by reading a very small and well documented .c file (530
lines with comments) that explains how to easily and quickly achieve
what you are trying to duplicate.

Heck, I did the same thing in a bash script for the Gentoo startup code
a while back that still works, but has ordering issues that the .c file
fixes up.  Hence it was dropped for the replacement that we are pointing
you at.

> > and got a clue how to do stuff reliably, and get  
> > the basic knowledge needed to document it.
> 
> Because talking to you and having you email me the notes from this 
> conversation did not provide the basic knowledge needed to document hotplug 
> and firmware loading.  Nor did asking for feedback on the document I wrote 
> up.  Thanks ever so much.
> 
> I point out that udev changes from version to version, so that running an old 
> version of udev against a new kernel has been known to break.

Hence the Documentation/CHANGES file documents the version that is
needed.  Right now it shows a version that is over a year and a half
old.  I do know that you can get away with running versions that are
even older than that if you want to, but it's not really recommended.

> Udev was more or less completely rewritten three times while I was
> still paying attention to it.  Reading the udev code and seeing what
> it's doing struck me as about as likely to reveal a stable API as
> reading the kernel source, or experimenting with sysfs from userspace.
> (Both of which I've _done_ at various points, and it keeps changing.)

The development cycle of udev has nothing to do with sysfs here.  Other
than the fact that we learned how to interact with a kernel interface
that directly exposes the internals of the kernel itself, something that
no one had done before.  In learning how to handle such major changes,
udev has changed in order to support zillions of devices, small memory
footprints, and lightning fast speed, all changes that required big
udev internal changes, but had _nothing_ to do with the kernel and/or
sysfs.

> Are you saying that the current version of udev will work with all future 
> kernels, and thus if I can figure out what udev is doing today, I can just 
> document that as the stable API?

If you want to figure out how to create a dynamic /dev filesystem that
can handle persistent device names, dynamic rules created by users,
zillions of devices on small and big systems, small footprint, and very
quick speed, then yes, read the udev source code.

What is the goal of this document here?  You start out trying to explain
the hotplug interface, and then get side tracked into talking about
creating a dynamic /dev/ filesystem in userspace and then ramble on into
how sysfs is laid out.  These are three separate things.

While the act of creating such a /dev filesystem does have something to
do with the hotplug/uevent interface of the kernel, it isn't reliant on
it.  And the layout of sysfs also doesn't really have much effect on the
creation of such a /dev filesystem, as udev proves (it works just fine
without sysfs even being mounted.)

If you want to just document the hotplug/uevent interface then do
that.

If you want to document sysfs and its structure, do that too, after
reading the existing documentation and understanding that.

thanks,

greg k-h


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-20  7:00         ` Greg KH
@ 2007-07-20  7:54           ` Cornelia Huck
  2007-07-20  8:09             ` Greg KH
  2007-07-21  6:23           ` Rob Landley
  1 sibling, 1 reply; 27+ messages in thread
From: Cornelia Huck @ 2007-07-20  7:54 UTC (permalink / raw)
  To: Greg KH
  Cc: Rob Landley, Kay Sievers, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Fri, 20 Jul 2007 00:00:01 -0700,
Greg KH <greg@kroah.com> wrote:

> > I don't insist on it, mknod insists on it.  You cannot mknod a dev node 
> > without specifying block or char.
> > 
> > You're saying that sysfs should provide major and minor numbers without 
> > anywhere specifying "char" or "block", meaning the major and minor numbers 
> > cannot be _used_.  I am insisting on getting the third piece of information 
> > without which "major" and "minor" are useless.
> > 
> > I asked very specifically about this at OLS, several times.  What you're 
> > telling me now seems to contradict what you told me then.
> 
> Here's the rule:
> 	If the SUBSYSTEM is "block", it's a block device.  Otherwise
> 	it's a char device.

That's actually quite confusing to the casual reader, since:

> But also realize that the majority of events you will get have nothing
> to do with device nodes.  I think you are forgetting this fact.

So the rule should be:
	If the SUBSYSTEM is "block" (implying major/minor are provided),
	it's a block device.
	If the SUBSYSTEM is not "block", and major/minor are provided,
	it's a char device.
	If major/minor are not provided, the event/device is not
	relevant to device node creation.
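
The three-way rule above can be sketched as follows.  This is a
hypothetical helper, not code from udev or mdev: it assumes the event's
variables have already been collected into a dict of strings keyed by
SUBSYSTEM, MAJOR and MINOR, and the names node_type and mknod_args are
made up for this example.

```python
import os
import stat


def node_type(env):
    """Return 'block', 'char', or None for one event's variables."""
    if "MAJOR" not in env or "MINOR" not in env:
        return None  # no major/minor: not relevant to device nodes
    return "block" if env.get("SUBSYSTEM") == "block" else "char"


def mknod_args(env, perms=0o600):
    """Return (mode, device) suitable for os.mknod(), or None."""
    kind = node_type(env)
    if kind is None:
        return None
    type_bit = stat.S_IFBLK if kind == "block" else stat.S_IFCHR
    device = os.makedev(int(env["MAJOR"]), int(env["MINOR"]))
    return type_bit | perms, device
```

A caller running as root could then pass the pair to
os.mknod(path, mode, device); the 0o600 default permission is an
arbitrary choice for the sketch.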


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-20  7:54           ` Cornelia Huck
@ 2007-07-20  8:09             ` Greg KH
  2007-07-21  3:48               ` Rob Landley
  0 siblings, 1 reply; 27+ messages in thread
From: Greg KH @ 2007-07-20  8:09 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Rob Landley, Kay Sievers, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Fri, Jul 20, 2007 at 09:54:01AM +0200, Cornelia Huck wrote:
> On Fri, 20 Jul 2007 00:00:01 -0700,
> Greg KH <greg@kroah.com> wrote:
> 
> > > I don't insist on it, mknod insists on it.  You cannot mknod a dev node 
> > > without specifying block or char.
> > > 
> > > You're saying that sysfs should provide major and minor numbers without 
> > > anywhere specifying "char" or "block", meaning the major and minor numbers 
> > > cannot be _used_.  I am insisting on getting the third piece of information 
> > > without which "major" and "minor" are useless.
> > > 
> > > I asked very specifically about this at OLS, several times.  What you're 
> > > telling me now seems to contradict what you told me then.
> > 
> > Here's the rule:
> > 	If the SUBSYSTEM is "block", it's a block device.  Otherwise
> > 	it's a char device.
> 
> That's actually quite confusing to the casual reader, since:
> 
> > But also realize that the majority of events you will get have nothing
> > to do with device nodes.  I think you are forgetting this fact.
> 
> So the rule should be:
> 	If the SUBSYSTEM is "block" (implying major/minor are provided),
> 	it's a block device.
> 	If the SUBSYSTEM is not "block", and major/minor are provided,
> 	it's a char device.
> 	If major/minor are not provided, the event/device is not
> 	relevant to device node creation.

Yes, that is much more descriptive, thanks.

greg k-h


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-19  8:16     ` Cornelia Huck
@ 2007-07-21  0:21       ` Rob Landley
  2007-07-21  0:43         ` Greg KH
                           ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-21  0:21 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-kernel, Greg KH, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Thursday 19 July 2007 4:16:17 am Cornelia Huck wrote:
> On Wed, 18 Jul 2007 13:39:53 -0400,
>
> Rob Landley <rob@landley.net> wrote:
> > Nope.  If you recurse down under /sys/class following symlinks, you go
> > into an endless loop bouncing off of /sys/devices and getting pointed
> > back.  If you don't follow symlinks, it works fine up until about 2.6.20
> > at which point things that were previously directories BECAME symlinks
> > because the directories got moved, and it all broke.
>
> I have no idea what you're doing.

See the email to Kay Sievers.  In 2.6.14 following symlinks hit an endless
/sys/block/hda/device/block/device/block/device/block...  This has changed 
since, like much of sysfs, but in the absence of either a spec or a stable 
API there's no guarantee it won't reoccur.

> > Which is why I want it documented where to look for these suckers.  Just
> > give me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE.
>
> See Documentation/sysfs-rules.txt.

Ok:

Paragraph 1: "It's not stable."
Paragraph 2: "It's not stable."
Paragraph 3: If you really really need to access it directly...
Paragraph 4: DO NOT DO $XXX.
Paragraph 5: Expect it to be mounted at /sys
Paragraph 6: DO NOT DO $XXX.  (Specifically, the way you were distinguishing 
between block and char devices?  Don't do that.  No, we won't tell you what 
to replace it with, keep reading.)

So far, not exactly gripping reading.

Paragraph 7: What a devpath is.  Ok, is it just me or does it say that 
applications shouldn't use the symlinks in sysfs?  Why are they there, then?

Paragraph 8: The kernel has a name for the device.
Paragraph 9: Subsystem is a string.  What it means, we leave for you to guess.
Paragraph 10: Driver is the name of a driver.  (Does this mean a driver is 
currently loaded and handling the device, or that the kernel is suggesting a 
driver based on something like PCI ID, through the kind of mechanism that 
used to be used to request module loading?  Experimentally, it looks like the 
first, which makes sense but isn't specified.  Does something 
like /sys/class/mem/zero have a driver?  Experimentally, no, it hasn't got 
a device link.)
Paragraph 11: Attributes, and yet more DO NOT DO $XXX.  It took me three reads 
of that to figure out they probably meant "Attributes belong to a device, 
don't confuse the attributes of another device with attributes of this 
device."  (Following _which_ device symlink?)

Ok, back up.  /sys/devices does not contain all the information necessary to 
populate /dev, because it hasn't got things like 
ramdisks, /dev/zero, /dev/console which are THERE in sysfs, which may or may 
not be supported by the kernel (the kernel might have ramdisk support, might 
not).  These things could also, in future, have their major and minor numbers 
dynamically (even randomly) assigned.  That's been discussed on this list.

I'm not trying to document /sys/devices.  I'm trying to document hotplug, 
populating /dev, and things like firmware loading that fall out of that.  
This requires use of sysfs, and I'm only trying to document as much of sysfs 
as you need to do that.  I'm not documenting stuff 
like /sys/devices/system/cpu.

The consensus so far is "the udev implementation is the spec", except I 
watched the udev implementation change rather a lot before I stopped tracking 
it, and saw a number of people complain on this list about things breaking 
when they upgraded the kernel but not udev.

Back to reading the document:
> - Properties of parent devices never belong into a child device.

Belong into?

>  Always look at the parent devices themselves for determining device
>  context properties.

For determining?

What was the original language of this document?

> If the device 'eth0' or 'sda' does not have a
>   "driver"-link, then this device does not have a driver.

Again, whether they mean "the kernel was not built with a driver that can 
handle this device" or "no driver is currently loaded and handling this 
device".  It _sounds_ like "this device is not supported by Linux", which 
probably isn't what they meant.

> Never copy any property of the parent-device into a child-device.

I note that the only mention made so far of parent-child relationships in 
devices is in terms of "don'ts".  I assume they're talking about how a 
partition can be the child of a block device, and a network controller card 
can be the child of a pci bus device?

Ah, I see.  The next paragraph is on hierarchy, yet doesn't actually explain 
anything, other than to imply that the device hierarchy being fully 
represented there is a dream to be achieved sometime in the future but not 
necessarily the truth with today's kernels, because stuff is still being 
_moved_ into /sys/devices.

> - Classification by subsystem
>  There are currently three places for classification of devices:
>  /sys/block, /sys/class and /sys/bus.

So if somebody wants to write code that runs on a current kernel, they have no 
alternative but to look in these three places.  If future kernels want to be 
compatible with current kernels, they must retain these three places as 
symlinks or some such.  Therefore, these three places form an API, and that's 
what I'm trying to document.

>  It is planned that these will 
>  not contain any device-directories themselves, but only flat lists of
>  symlinks pointing to the unified /sys/devices tree.

So presumably /sys/class/mem/zero and friends will show up under /sys/devices 
at some point, but that path should still find the thing via symlinks.  So 
moving it is an implementation detail irrelevant to documenting an API 
through which /dev can be populated by a userspace application.

>  Assuming /sys/class/<subsystem> and /sys/bus/<subsystem>, or
>  /sys/block and /sys/class/block are not interchangeable, is a bug in
>  the application.

So the fact Kay didn't mention /sys/class/block when we talked at OLS is a 
simple oversight.  You know, the type of feedback I was asking for was "This 
is wrong, you should change it to X", rather than "this is wrong, you're 
stupid".  I already _know_ I don't understand every nook and cranny of the 
Linux kernel, and that this is unlikely to change, thanks.  If you were 
wondering why writing documentation is like pulling teeth, and that there 
isn't enough of it, now you know.

Anyway, it seems like the paragraph I quoted can be rephrased "The contents 
of /sys/class/* and /sys/bus/* are interchangeable, and the contents 
of /sys/block and /sys/class/block are interchangeable."  However, this 
implies that /sys/class/block and /sys/block are interchangeable 
because /sys/class/block is part of /sys/class/*...

> - Block
>   The converted block-subsystem at /sys/class/block, or
>   /sys/subsystem/block will contain the links for disks and partitions
>   at the same level, never in a hierarchy.

A) The converted block subsystem isn't in 2.6.22 or any earlier kernel, thus 
software has to be written to use the current /sys/block path to be 
compatible with any kernel actually deployed anywhere today.  And if you've 
got the old mechanism working, what's the advantage of the new one?

B) This is the first mention (by implication only) that the current /sys/block 
might have a hierarchy under it.  The point of this document is obviously not 
to document how it works _today_.  The point of mine _is_.

> - "device"-link and <subsystem>:<kernel name>-links
>   Never depend on the "device"-link.

This paragraph is a big "DO NOT DO $XXX" repeated six times, and I note that 
this link was the reason I made mdev not follow symlinks in the first place.

>   Never depend on the class-specific links back to the /sys/class
>   directory.

DO NOT DO $XXX.

>   Never depend on a specific parent device position in the devpath,
>   or the chain of parent devices.

DO NOT DO $XXX.

Although I am curious about this paragraph.  Not that I'm really trying to 
document this bit, but "You must always request the parent device you are 
looking for by its subsystem value." does not explain HOW to request a 
specific device by its subsystem value.  According to the earlier bullet 
point about "subsystem", it's a simple string "(block, tty, pci, ...)" that 
seems to indicate a category of devices rather than a specific device...

And that is the end of the document.

Now, where in it did it explain how to determine whether a device is a block 
device or a char device when attempting to use the major and minor numbers 
extracted from sysfs to do a mknod?

> > This document is trying to document just enough information to make
> > hotplug work using sysfs (which includes firmware loading if necessary).
> >
> > > (And how about referring to Documentation/sysfs-rules.txt?)
> >
> > Because there isn't one in 2.6.22, and I've been writing this file on and
> > off for a month as I tracked down various bits of information?
>
> That was a _suggestion_.
>
> > I know.  I'm just trying to show people how to do it.  Notice that this
> > script doesn't DO anything, it just dumps the variables (and proves
> > that /sys/hotplug got called).  You're worried about the scalability of a
> > debugging script.
>
> If you use bash scripts as examples, people will write bash scripts.

And they'll find out, as I did, that it scales horribly.  (Scanning /dev with 
a shell script took about 15 seconds last time I tried it, which was 2 
laptops ago...)

I can document scalability issues, though.  (That and sequencing are why the 
netlink method exists in the first place...)

I was writing up a "history of hotplug" document that got trashed when my 
laptop died last month, but when I get around to rewriting it I plan to 
mention how /sbin/hotplug was originally a big shell script and why they 
stopped doing that:
http://lwn.net/2001/0830/a/diet-hotplug.php3

> > (Rummage)  Seems to be "add, remove, change, online, offline, move"?
> >
> > I can list 'em.  Now I'm vaguely curious what generates online and
> > offline events (MII transciever state transitions on a network card, or
> > does this have to do with power saving modes?)  And I have no idea what
> > the difference between "change" and "move" is....
>
> "change" - something about the device has changed
> "move" - the device is in a different position in the tree now
>
> You may want to grep for the usage...

I did, thanks.
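
For reference, a rough sketch of pulling the action out of one event
received over netlink.  It assumes the payload is an "action@devpath"
header followed by NUL-terminated KEY=VALUE strings, which is my
understanding of what the kernel sends rather than something stated in
this thread; parse_uevent is a hypothetical name.

```python
def parse_uevent(payload):
    """Split one uevent payload (bytes) into (action, devpath, env)."""
    parts = [p for p in payload.split(b"\0") if p]
    if not parts or b"@" not in parts[0]:
        raise ValueError("not a kernel uevent payload")
    # Header looks like "add@/class/tty/tty1".
    action, devpath = parts[0].decode().split("@", 1)
    env = {}
    for item in parts[1:]:
        key, sep, value = item.decode().partition("=")
        if sep:  # ignore anything that is not KEY=VALUE
            env[key] = value
    return action, devpath, env
```

The action would then be one of the strings listed above (add, remove,
change, online, offline, move), and env carries variables such as
SUBSYSTEM, MAJOR and MINOR when they are present.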

> > > No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
> > > bus (like 'pci').
> >
> > Do you make a /dev node for either one?
> >
> > I'm trying to, at minimum, document what you pass to mknod.  I consider
> > it important to know.
>
> The problem is that your information is wrong. Imagine someone reading
> this document, thinking "cool, I'll create a char node if
> SUBSYSTEM!=block" and subsequently getting completely confused about
> all those SUBSYSTEM==pci events.

They get major and minor numbers for SUBSYSTEM==pci?

Is there something wrong other than the need to filter out /sys/class/block if 
that starts contaminating the pool of char devices (which it isn't in current 
kernels)?

> > > >   DRIVER
> > > >     If present, a suggested driver (module) for handling this device.
> > > >  No relation to whether or not a driver is currently handling the
> > > > device.
> > >
> > > No, this actually is the current driver.
> >
> > I've had it suggest drivers for devices that didn't have any loaded, and
> > I had it _not_ specify drivers for devices that were loaded.  (I
> > checked.)
>
> The code disagrees with you. If a driver matches and probing succeeds,
> it will be specified, otherwise not. Maybe you were checking the wrong
> devices?

Could be.  Or I could have been looking at an old kernel.

> > Ah yes.  I replied to that when it was first posted.  It's still "here's
> > a list of things NOT to do" rather then telling you what you CAN do.  I'm
> > trying to document what you can do.
> >
> > Useful documentation is not "Doing THIS is forbidden.  Doing THIS is
> > forbidden.  Doing THIS is forbidden.  What are you allowed to do?  Guess!
> > Oh, and anything I didn't explicitly mention could change at any time. 
> > Have fun."
>
> It _does_ specify what you may rely on. Don't rely on anything else.

Ok, how do I find a list of:

A) all char devices in the system.
B) all block devices currently in the system
C) whether a device that just got hotplugged (and I just got a hotplug event 
for) is a char or block device so I can call mknod?  (I know how to tell when 
there's no device node for it, there's no "dev" entry or no MAJOR=/MINOR= 
variables.  But when I do have those, how do I tell if they refer to a block 
or a char device?)

Are you saying there's no reliable way to do that, or are you saying the 
document explains how to do that?

> > Sysfs CAN export a stable API.  It may only be a subset of what it's
> > exporting, but it can still do so.
>
> And that is exactly what sysfs-rules.txt is doing.

What does going on about buggy apps have to do with listing what you're 
allowed to do?  Listing things not to do isn't the same as listing what 
you're allowed to do, unless attempting to arrive at an API document by 
process of elimination.

> I don't understand your problem.

I've noticed.  I suspect I phrased my goals unclearly.

> If you think that getting this information from sysfs-rules.txt could
> be made easier, do a patch against it.

I'm trying to document hotplug, which includes scanning to initially 
populate /dev, and firmware loading.  Documenting a portion of sysfs is a 
side effect of documenting hotplug.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-18 23:40     ` Greg KH
@ 2007-07-21  0:37       ` Rob Landley
  0 siblings, 0 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-21  0:37 UTC (permalink / raw)
  To: Greg KH
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Wednesday 18 July 2007 7:40:20 pm Greg KH wrote:
> On Wed, Jul 18, 2007 at 01:39:53PM -0400, Rob Landley wrote:
> > PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!
> >
> > I don't care where it is.  Just put it somewhere I can find it, and keep
> > it there.  All this gratuitous moving stuff around serves NO PURPOSE
> > other than to break userspace.  I'm trying to document this so that the
> > next time you go "oh wait, it should be at "/sys/tarantula/fruitbat" I
> > can show that you're breaking an existing documented userspace API.
> >
> > There's a kernel config option to make symlinks from the old
> > location.  /sys/block makes as much sense as any other location, and it's
> > what's there now.
>
> Read the sysfs documentation file we just added, it describes how this
> is all documented and should be used.  So well that I do not think you
> need to try to document it again.

I'm not trying to document all of sysfs, I'm trying to document hotplug.  I 
realize now I should have been more clear about that.

I've been working on the document I just posted on and off since May.  
(Possibly longer but I lost a lot of data in the hard drive crash on my 
laptop last month.  For example, I can't find a copy of my 
half-finished "history of hotplug" document and will probably need to start 
over, although I've still got a few places to look to see if I backed up a 
copy...)

This document has been sitting mostly unchanged on my hard drive since OLS, 
until I finally tracked down example code to do the netlink bit so I could 
finish it.  I tried to bounce a copy of the "everything but netlink" version 
off of Kay by replying to his email with notes from OLS, and that's when I 
bumped into the "he's spam-blocking me" issue.  It got lost in the shuffle of 
OLS, and I just got back to it at the start of this thread.

Earlier today I read (and commented on, in the message to Cornelia Huck) the 
copy of Documentation/sysfs-rules.txt.  (Ah, darn it.  I have too many open 
windows on my desktop.  Hits "send" on message to Cornelia Huck I _wrote_ 
earlier today.)

Documentation/sysfs-rules.txt doesn't talk about /sbin/hotplug or netlink 
hotplug.  It doesn't say how to distinguish a char device from a block 
device.  It mostly talks about finding stuff under the "/sys/devices" 
directory, most of which isn't relevant to populating /dev.  It doesn't 
clearly distinguish where you can find information in current kernels (2.6.22 
and earlier) from stuff that hasn't gone into any existing release.  Ideally 
I'd like to identify a subset of that information which is not only present 
in current kernels but should remain findable at that location in future 
kernels.  Over half the document is about what _not_ to do, and consists of 
warnings about "buggy apps", despite the assumption that anything _not_ 
explicitly documented is forbidden because most of the things sysfs exports 
are considered unmaintainable.

I've read the stuff under Documentation/ABI/{stable,testing}, and would be 
happy to refer to it rather than duplicating if I could get the info I needed 
out of it.  Documentation/filesystems/sysfs.txt is still from Patrick Mochel 
in 2003 and mostly about the kernel side rather than an API exported to 
userspace, and sysfs-pci.txt in that directory is similar.  Is there more I 
missed?

> thanks,
>
> greg k-h

Sorry, I'm not trying to be a pain.  I'm trying to document something I had to 
figure out for myself experimentally in 2005, which has been broken for me by 
kernel changes twice since then (when the "device" symlink went in back 
around 2.6.14, and when subdirs turned to symlinks recently), and I'm told is 
changing again with the addition of /sys/class/block (which means /sys/class/* 
no longer contains just char devices).

Ideally I'd like to come up with documentation that allows somebody to write 
one program that works on existing AND on new kernels, hence "stable API".

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21  0:21       ` Rob Landley
@ 2007-07-21  0:43         ` Greg KH
  2007-07-23 23:26           ` Rob Landley
  2007-07-21  0:49         ` Greg KH
  2007-07-21  0:52         ` Greg KH
  2 siblings, 1 reply; 27+ messages in thread
From: Greg KH @ 2007-07-21  0:43 UTC (permalink / raw)
  To: Rob Landley
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> Ok, back up.  /sys/devices does not contain all the information necessary to 
> populate /dev, because it hasn't got things like 
> ramdisks, /dev/zero, /dev/console which are THERE in sysfs, which may or may 
> not be supported by the kernel (the kernel might have ramdisk support, might 
> not).

Welcome to 2007:

$ ls /sys/devices/virtual/mem/
full  kmem  kmsg  mem  null  port  random  urandom  zero
$ ls /sys/devices/virtual/tty/
console  tty12  tty19  tty25  tty31  tty38  tty44  tty50  tty57  tty63
ptmx     tty13  tty2   tty26  tty32  tty39  tty45  tty51  tty58  tty7
tty      tty14  tty20  tty27  tty33  tty4   tty46  tty52  tty59  tty8
tty0     tty15  tty21  tty28  tty34  tty40  tty47  tty53  tty6   tty9
tty1     tty16  tty22  tty29  tty35  tty41  tty48  tty54  tty60
tty10    tty17  tty23  tty3   tty36  tty42  tty49  tty55  tty61
tty11    tty18  tty24  tty30  tty37  tty43  tty5   tty56  tty62

I suggest you take a close look at the kernel before making statements
like the above :)

> These things could also, in future, have their major and minor numbers 
> dynamically (even randomly) assigned.  That's been discussed on this list.

I tried that once, it will require some core api kernel changes and a
lot of infrastructure work to get that to work properly.  Not that it
will never happen in the future, but it's just not a trivial change at
the moment...

thanks,

greg k-h


* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21  0:21       ` Rob Landley
  2007-07-21  0:43         ` Greg KH
@ 2007-07-21  0:49         ` Greg KH
  2007-07-21  0:52         ` Greg KH
  2 siblings, 0 replies; 27+ messages in thread
From: Greg KH @ 2007-07-21  0:49 UTC (permalink / raw)
  To: Rob Landley
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> I'm not trying to document /sys/devices.  I'm trying to document hotplug, 
> populating /dev, and things like firmware loading that fall out of that.  
> This requires use of sysfs, and I'm only trying to document as much of sysfs 
> as you need to do that.

Like I stated before, you do not need to even have sysfs mounted to have
a dynamic /dev.

And why do you need to document populating /dev dynamically?  udev
already solves this problem for you; it's not as if people going off
and reinventing udev for their own enjoyment would not at least look at
how it solves this problem first.

To do otherwise would be foolish :)

Firmware loading is fine to document if you wish to do so.  But again,
why?  We already have multiple userspace programs that provide this
feature for them.  Perhaps you want to document how to add firmware to a
system in order for these different programs to pick them up?

Or perhaps you want to document how to add this kind of functionality to
your kernel driver so that it can handle firmware loading by using the
firmware interface that the kernel provides?

If you just want to document the hotplug/uevent api, then do just that.
However I think you are overreaching with your scope here and getting
mighty confused in the process.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21  0:21       ` Rob Landley
  2007-07-21  0:43         ` Greg KH
  2007-07-21  0:49         ` Greg KH
@ 2007-07-21  0:52         ` Greg KH
  2007-07-21  6:32           ` Rob Landley
  2 siblings, 1 reply; 27+ messages in thread
From: Greg KH @ 2007-07-21  0:52 UTC (permalink / raw)
  To: Rob Landley
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> >  Always look at the parent devices themselves for determining device
> >  context properties.
> 
> For determining?
> 
> What was the original language of this document?

Ok, that's just being mean, cut it out right now if you ever want my
help again.

I'll gladly accept patches for this document that is in the kernel tree
now if you want to send them.  But criticizing the grammar of a document
with statements like this one gets you nowhere and is damn rude.

I suggest you start this thread over if you want my feedback, I'm not
going to respond anymore to this one.

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-20  8:09             ` Greg KH
@ 2007-07-21  3:48               ` Rob Landley
  0 siblings, 0 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-21  3:48 UTC (permalink / raw)
  To: Greg KH
  Cc: Cornelia Huck, Kay Sievers, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Friday 20 July 2007 4:09:36 am Greg KH wrote:
> On Fri, Jul 20, 2007 at 09:54:01AM +0200, Cornelia Huck wrote:
> > On Fri, 20 Jul 2007 00:00:01 -0700,
> >
> > Greg KH <greg@kroah.com> wrote:
> > > > I don't insist on it, mknod insists on it.  You cannot mknod a dev
> > > > node without specifying block or char.
> > > >
> > > > You're saying that sysfs should provide major and minor numbers
> > > > without anywhere specifying "char" or "block", meaning the major and
> > > > minor numbers cannot be _used_.  I am insisting on getting the third
> > > > piece of information without which "major" and "minor" are useless.
> > > >
> > > > I asked very specifically about this at OLS, several times.  What
> > > > you're telling me now seems to contradict what you told me then.
> > >
> > > Here's the rule:
> > > 	If the SUBSYSTEM is "block", it's a block device.  Otherwise
> > > 	it's a char device.
> >
> > That's actually quite confusing to the casual reader, since:
> > > But also realize that the majority of events you will get have nothing
> > > to do with device nodes.  I think you are forgetting this fact.
> >
> > So the rule should be:
> > 	If the SUBSYSTEM is "block" (implying major/minor are provided),
> > 	it's a block device.
> > 	If the SUBSYSTEM is not "block", and major/minor are provided,
> > 	it's a char device.
> > 	If major/minor are not provided, the event/device is not
> > 	relevant to device node creation.
>
> Yes, that is much more descriptive, thanks.

agreed, thanks.
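
The three-case rule above can be written down as a short sketch.  This is
Python, assuming the event's environment variables (SUBSYSTEM, MAJOR, MINOR)
have already been collected into a dict; the function name node_type is made
up for illustration and is not part of any kernel or udev interface.

```python
def node_type(env):
    """Return 'b', 'c', or None for one hotplug/uevent environment."""
    # No MAJOR/MINOR: the event is not relevant to device node creation.
    if "MAJOR" not in env or "MINOR" not in env:
        return None
    # SUBSYSTEM "block" means a block device; anything else is a char device.
    return "b" if env.get("SUBSYSTEM") == "block" else "c"
```

A /sbin/hotplug-style helper could call node_type(os.environ) and hand the
result to mknod as the type argument; events that return None are ignored.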

I'll try to post an updated version of my hotplug documentation later tonight.  
(Just a _touch_ jetlagged at the moment, though.  It may only be 9:47 
California time, but it's 11:47 on the East Coast.  I think.)

> greg k-h

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-20  7:00         ` Greg KH
  2007-07-20  7:54           ` Cornelia Huck
@ 2007-07-21  6:23           ` Rob Landley
  1 sibling, 0 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-21  6:23 UTC (permalink / raw)
  To: Greg KH
  Cc: Kay Sievers, Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Friday 20 July 2007 3:00:01 am Greg KH wrote:
> On Fri, Jul 20, 2007 at 01:14:27AM -0400, Rob Landley wrote:
> > Is there anything in /sys/class/block that _isn't_ in /sys/block?
>
> No.
>
> > Does "if you want to use it, but /sys/block will still be there" NOT
> > mean, as I assumed at the time, that I could safely ignore it?
>
> Ignore what?  /sys/block?  If you see /sys/class/block, then yes, you
> can ignore it as they are just symlinks back to each other.

I read "if you want to use it" as meaning "if you want to 
use /sys/class/block", i.e. it was optional.

If /sys/block is going to remain a symlink to /sys/class/block then using the 
path "/sys/block" should work on both existing kernels and on new kernels 
without modification.

What moving it will force me to do is edit out "/sys/class/block" if I find it 
looking for char devices under "/sys/class".  Moving it forces me to add code 
to remove it.  I don't know what the supposed benefit is...

Ah, hang on.  Most likely this is an implementation detail from moving 
everything under /sys/subsystem and making /sys/class a symlink -> subsystem 
and /sys/block a symlink -> subsystem/block.  So it sounds like it's not 
really intentional breakage, just an implementation side effect.  Ok, that 
makes sense.

(Sorry, this hadn't occurred to me until now.  I'm not the one implementing 
this stuff, so I haven't spent the past few months thinking through the 
ramifications already.  This entire thread is one of about 12 things I'm 
working on at the moment.)

> > (My impression from the meeting at OLS was that adding /sys/block to
> > /sys/class/block had been just an idea rejected in favor of adding it
> > to /sys/subsystem/block.
>
> No, the /sys/class/block is in -mm and has been for some time.

Ah yes, "set back up an automated -mm testing thing that went away when previous 
laptop did".  I need to bump that up on my todo list...

> > I note that neither my Ubuntu 7.04 laptop nor the 2.6.22 system I
> > built has either a /sys/class/block or a /sys/subsystem/block, so
> > anything written attempting to use that won't work on any currently
> > deployed Linux system.  You can't use it today, and will never be able
> > to use it on any kernel version deployed today.)
>
> Not true at all, it works just fine here on my machines, and on all
> distros released in the past year or so as tested out by a lot of users.

"All the distros" meaning "not Ubuntu 7.04"?  I just checked again, it doesn't 
have either a /sys/class/block or /sys/subsystem/block.  I realize they're 
coming...

> The only reason it isn't in Linus's tree just yet, is that some very old
> mkinitrd programs don't seem to like it (we are talking Fedora Core 3
> based distros, not Fedora itself.)  People are trying to work out what
> the proper fix is for userspace there when they get the time.
>
> So, expect the change to show up in 2.6.24.

Ok.

> > > > To all of this, I would like to humbly ask:
> > > >
> > > > PICK ONE!  JUST #*%(&#%& PICK ONE!  AAAAAAAAHHHHHHH!!!!!!!!!
> > >
> > > Man, you totally miss the point.
> >
> > I want to document a stable API, including the subset of sysfs that will
> > remain stable.  The "point" appears to be that there isn't one because
> > sysfs is "special", and udev should be in the kernel source tarball.
>
> What?  Since when does udev have to be in the kernel source tarball?
> Who ever said that?

I got that impression from:
  http://lkml.org/lkml/2006/7/30/228
  http://lwn.net/Articles/193603/

> The issue here is that if you follow the rules as specified by the file,
> Documentation/sysfs-rules.txt and Documentation/ABI/*/sysfs-* you should
> be just fine.  To ignore them, as you have done in your examples, will
> cause problems.

I read the Documentation/ABI ones in the current linus -git (well, as of... 
tuesday?)  and am unaware of any conflicts.  I read sysfs-rules.txt this 
morning, and the only _required_ change I'm aware of is filtering 
out /sys/class/block if it occurs.  (Is there more Documentation in Andrew 
Morton's tree that's not in Linus's?)

Lemme re-read my document...

Ok, /sys/bus/*/devices/*/dev is the path Kay told me to use during our 
discussion at OLS.  It's one of the paths I cut and pasted out of the email 
he sent me.  He typed that path, I didn't.  I see that the bit 
in "sysfs-rules.txt" about never using the "devices" symlink contradicts 
that.  Ok, so what _should_ I do?

Personally, I've never seen a dev link under "/sys/bus".  I neither own nor 
have personally encountered hardware that does that, and the first I heard 
that there _was_ such hardware was when talking to him at OLS.

Kay: if you get this, what path should I use?  Your email said:

> /sys/bus/*/devices/*/dev
> /sys/class/*/*/dev
> /sys/block/*/dev
> /sys/block/*/*/dev
>
> /sys/subsystem/*/devices/*/dev

The fifth of which isn't in currently deployed kernels (not in kernel.org, not 
in most recent ubuntu release, that counts as "not currently deployed" to 
me), and the first four should continue to work even when that goes in, so I 
didn't include it in "how to find this information", but I note that it also 
follows "devices"...
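
As a rough sketch of scanning the paths that exist on currently deployed
kernels (Python; the function name find_dev_entries and the sysroot parameter
are hypothetical, and inferring block vs. char from the path prefix is an
assumption of this sketch, not a documented guarantee).  The
/sys/bus/*/devices/*/dev pattern is left out because whether it may be used is
exactly the open question here, and /sys/class/block is skipped so that block
devices are not reported as char devices:

```python
import glob
import os

# (node type, glob pattern relative to the sysfs mount point)
PATTERNS = [
    ("c", "class/*/*/dev"),
    ("b", "block/*/dev"),
    ("b", "block/*/*/dev"),
]


def find_dev_entries(sysroot="/sys"):
    """Yield (type, name, major, minor) for each "dev" file found."""
    for kind, pattern in PATTERNS:
        for path in sorted(glob.glob(os.path.join(sysroot, pattern))):
            parts = os.path.relpath(path, sysroot).split(os.sep)
            # Filter out /sys/class/block if the kernel provides it.
            if parts[0] == "class" and parts[1] == "block":
                continue
            # A "dev" file holds "major:minor" followed by a newline.
            with open(path) as f:
                major, minor = f.read().strip().split(":")
            yield kind, parts[-2], int(major), int(minor)
```

The name reported is the directory holding the "dev" file, which is the same
default-name convention discussed elsewhere in this thread.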

> > I'm trying to write down here the minimal information needed to find the
> > "dev" nodes to populate /dev.  There's no functional reason I'm aware of
> > for them to keep moving around.
>
> The issue is that the devices themselves keep moving around in the sysfs
> tree all the time as systems are dynamic and change.
>
> Again, look at the udevtrigger program for a simple way to achieve this
> /dev population that you so desire.

I'm aware of that program, and can iterate through the tree and write "add" 
to "uevent" instead of reading "dev".  Technically the /sbin/hotplug approach 
would have the same potential race condition with remove happening during 
scan: the potential downside is leaving a node in /dev that will give 
an -ENODEV if you try to open it, which is annoying but not fatal (no obvious 
security implications or anything), and a small enough race condition that it 
would seldom if ever inconvenience real users.

I can document this, though...
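
A minimal sketch of the write-"add"-to-"uevent" approach, in Python.  The
function name trigger_add_events is hypothetical, and walking everything
below /sys/devices is this sketch's own choice; udevtrigger may well walk a
different set of paths, and on kernels where class devices do not live under
/sys/devices this would miss them.

```python
import os


def trigger_add_events(sysroot="/sys"):
    """Write "add" to each uevent file below sysroot/devices; return the count."""
    count = 0
    # os.walk does not follow symlinks by default, so it cannot loop.
    for dirpath, dirnames, filenames in os.walk(os.path.join(sysroot, "devices")):
        if "uevent" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "uevent"), "w") as f:
                f.write("add")
            count += 1
        except OSError:
            # The device may have been removed during the scan; skip it.
            pass
    return count
```

The removal race described above still applies: a device can vanish between
the directory listing and the write, which is why the write is wrapped in a
try block rather than treated as fatal.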

> But also realize that sysfs is much bigger than just trying to get the
> information to create a /dev tree.

I know.  At the moment I'm only trying to document the subset of sysfs needed 
to maintain a /dev tree via hotplug.  In future this may expand to include 
enough information to persistently name certain types of devices, but I'd be 
just as happy to have that under Documentation/ABI and refer to it instead.

> > > > > > Entries for char devices are found at the following locations:
> > > > > >
> > > > > >  /sys/bus/*/devices/*/dev
> > > > > >  /sys/class/*/*/dev
> > > > >
> > > > > Uh, that is actually the generic location?
> > > >
> > > > It's what Kay Sievers and Greg KH told me at OLS when I tracked them
> > > > down to ask.  I've also experimentally verified it working on Ubuntu
> > > > 7.04. That was cut and pasted from Kay's email, and it works today.
> > >
> > > That is still true, but it still does not tell you the type of node to
> > > create, as you seem to insist on.
> >
> > I don't insist on it, mknod insists on it.  You cannot mknod a dev node
> > without specifying block or char.
> >
> > You're saying that sysfs should provide major and minor numbers without
> > anywhere specifying "char" or "block", meaning the major and minor
> > numbers cannot be _used_.  I am insisting on getting the third piece of
> > information without which "major" and "minor" are useless.
> >
> > I asked very specifically about this at OLS, several times.  What you're
> > telling me now seems to contradict what you told me then.
>
> Here's the rule:
> 	If the SUBSYSTEM is "block", it's a block device.  Otherwise
> 	it's a char device.

Ok.  Cornelia Huck seemed to disagree, but I see that's been resolved in 
another message.

> But also realize that the majority of events you will get have nothing
> to do with device nodes.  I think you are forgetting this fact.

Actually, I'm filtering them out, but I should make a note of it in the 
documentation.  (I'd happily document other events you might want to respond 
to that come into the hotplug mechanism, but I don't know what they are and 
am trying to start with the basics and flesh it out later.  Persistent device 
naming is a can of worms I'll have to open eventually.  Ubuntu 7.04 put uuids 
on every _partition_ in my laptop, and spins up my external usb hard drive 
trying to mount root.  When connected to a machine with an IDE hard drive, 
that's now going through the scsi layer.  Sigh...)

> > If block is going to move to sys/class, I can put in a warning about this
> > pending breakage in the documentation, and modify my example code to
> > filter it out.
>
> It's not a "breakage", we are preserving a symlink.  The point is that
> you should not rely on the fact that /sys/block will be there in the
> future, as the documentation I pointed to above describes.

It doesn't say that /sys/block is deprecated or will be removed.  The closest 
it says is:

>   If /sys/subsystem exists, /sys/bus, /sys/class and /sys/block can be
>   ignored.

Which isn't the same as "must be ignored" or "may be removed".  I asked about 
this explicitly at OLS ("can I just keep using /sys/block and /sys/class") 
and was told that there were no plans to remove them.

Existing systems require using the older names, and I'm unaware of any 
information the new names provide that the old ones don't.

> > > > > It may be enough (and less confusing) to just state that the dev
> > > > > attribute will belong to the associated "class" device sitting
> > > > > under /sys/class/ (with the current exception of /sys/block/).
> > > >
> > > > Nope.  If you recurse down under /sys/class following symlinks, you
> > > > go into an endless loop bouncing off of /sys/devices and getting
> > > > pointed back.  If you don't follow symlinks, it works fine up until
> > > > about 2.6.20 at which point things that were previously directories
> > > > BECAME symlinks because the directories got moved, and it all broke.
> > >
> > > That's total nonsense.
> >
> > Which part, the "following symlinks produced an endless loop" or
> > the "directories turned into symlinks so not following them broke?"
> >
> > Let's see...
> >
> > According to my blog, Frank Sorensen first sent me a C port of my /dev
> > populating script on December 12, 2005.  The current kernel at the time
> > was 2.6.14, so grab that, build user Mode Linux...  Huh, it won't build
> > with gcc 4.1.2.  Or 3.4.  Ok, defconfig?  Nope, that wants a stack check
> > symbol? Let's see...  Ah, google says add -fno-stack-protector to CFLAGS.
> >  Right... Fire it up under qemu, "mount -t sysfs /sys /sys", and:
> >
> > In 2.6.14, /sys/block/hda/device points
> > to ../../devices/pci0000:00/0000:00:01.1/ide0/0.0
> >
> > /sys/block/hda/device/block points to ../../../../../block/hda
> >
> > So in 2.6.14 you could
> > go /sys/block/hda/device/block/device/block/device/block... endlessly,
> > which is the reason I wrote mdev not to follow symlinks but to instead
> > only look at actual subdirectories.
>
> That was the problem right there.  Why would you ever want to traverse
> symlinks blindly without realizing what you were walking?

Because I didn't want to encode an unknown structure of sysfs into the 
program?  Partitions were at the same level as hard drives one release (when 
I first came up with a working probing script for my Firmware Linux project, 
which according to my blog was October 27, 2005 and was using something like 
Linux 2.6.10), and moved into subdirectories the next, and I had no way of 
knowing if or when a third layer was going to be added to some future device 
I didn't know about.

Keep in mind I've been following this, on and off, for a while now:
http://lkml.org/lkml/2003/12/9/1
http://lkml.org/lkml/2003/12/10/16

And what I did in response to the endless loop was _stop_ traversing 
symlinks at all, and only followed subdirectories.  Which worked fine until 
subdirectories got moved and replaced by symlinks, which is when I started 
asking "so what paths can I follow that will reliably be there in both 
current and future releases"?  Which is what I'm trying to document now.
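
For illustration only, here is a sketch (Python, with a made-up function name
find_dev_files) of a third option between "follow every symlink" and "follow
none": follow symlinks, but remember the real path of every directory already
visited.  It terminates on loops like device/block/device/block, and keeps
working when a directory turns into a symlink, but it is still a blind walk
that can wander over most of sysfs, so it does not replace knowing which
paths to look at.

```python
import os


def find_dev_files(top):
    """Return sorted real paths of "dev" files under top, visiting each directory once."""
    seen = set()   # real paths of directories already visited
    found = set()
    stack = [top]
    while stack:
        real = os.path.realpath(stack.pop())
        if real in seen or not os.path.isdir(real):
            continue
        seen.add(real)
        try:
            names = os.listdir(real)
        except OSError:
            continue  # unreadable or vanished directory
        for name in names:
            path = os.path.join(real, name)
            if name == "dev" and os.path.isfile(path):
                found.add(path)
            elif os.path.isdir(path):  # true for symlinks to directories too
                stack.append(path)
    return sorted(found)
```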

> You can't 
> just run 'find' on sysfs and expect to not get caught in endless loops,
> as the goal of the different parts of sysfs is to be able to start in
> one place, and figure out all of the needed information from there.

I'm trying to document what those paths are.

People keep wanting to tell me about future plans that aren't merged yet.  A 
year ago the future plans were directories becoming symlinks, now the plans 
are /sys/subsystem, I'm sure in a year there will be new future plans.  I'd 
really like not to have to change existing code to still work with them, 
hence an attempt to document the API I _SHOULD_ use so that I don't have to.

If I discard what's there now and document the current future plans that 
aren't merged yet, how do I know that they won't themselves be ripped out a 
year after that?

> For example, if you have a device, you can get the subsystem it belongs
> to, the driver bound to it, and other stuff.

Can you get the default name of the device currently encoded as the last 
element of the path that "DEVPATH" points to (ala /class/mem/zero), but which 
I won't necessarily get if DEVPATH starts to point to /device/12345/:00 as 
some people keep saying DEVPATH should point to?

It's not encoded as one of the hotplug variables, other than extractable from 
DEVPATH in a way that may or may not continue to work...
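
What that extraction looks like today, as a sketch (Python; the function name
default_node_name is hypothetical).  It assumes the last element of DEVPATH is
the name to use, which is precisely the assumption that may stop holding:

```python
import posixpath


def default_node_name(devpath):
    """Return the last path element of DEVPATH, e.g. "zero" for /class/mem/zero."""
    # Strip any trailing slash so basename does not return an empty string.
    return posixpath.basename(devpath.rstrip("/"))
```

If DEVPATH ever pointed at something like a bus address instead, this would
return the address rather than a usable /dev name.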

> If you start with a 
> driver, you can get the devices it binds.  Can you see the circle
> already?
>
> So, you need to watch what you are trying to find, and if you do that,
> you never will get caught in circles.  We never had that problem in udev
> at all, as we just work with what was passed to us, not blindly try to
> walk the whole sysfs tree.

You wrote both the sysfs code and the udev code.  You wrote both sides of the 
export, changed both sides of the export fairly freely, and you know what you 
intended to do and what was merely an implementation artifact.

> Please use the proper context in order to get the information you need.
> And at all times, a directory can turn into a symlink in order to keep
> the same information possible.

I am aware of that, therefore I need to look at a known set of paths, which is 
what I'm trying to document now.

> > (It uses the same code to traverse down
> > beneath /sys/block and /sys/class to look for "dev" entries.)  This works
> > fine up through the 2.6.20 in ubuntu 7.04, where everything
> > in /sys/class/tty/* is still a subdirectory.  But in 2.6.22,
> > /sys/class/tty/* is all symlinks.  Hence the code that was working before
> > changed, due to something that worked fine for a couple years but broke
> > because it wasn't considered part of a stable API.
> >
> > Which part of this is "total nonsense"?
>
> Your code :)

*shrug*  There was a better way to do it back in 2005 without reading your 
mind?

> > > until you have proven to have read
> > > the udevtrigger code,
> >
> > I read the udev code when it was first posted.  I read it again 20
> > versions later, and read it again 20 versions after that.  I couldn't
> > COMPILE the darn thing for its first ~40 releases, the code got ripped
> > out and re-written several times, I watched as it grew and then threw out
> > libsysfs.
>
> You could not build it?  Why not?

I don't clearly remember the details from two years ago, but after glancing at 
my blog http://landley.net/notes-2005.html#27-10-2005 I vaguely recall that 
it had large numbers of undocumented environmental dependencies and I got 
sick of playing whack-a-mole installing packages, plus it had no 
documentation whatsoever and required a complicated configuration file to the 
point it was actually _easier_ to write a shell script to parse sysfs 
directly.  (Trying that got results in about 15 minutes.  Staring at udev for 
half a day did not.)

Plus I remember downloading different early versions of udev and finding 
things hugely rewritten between each update to the point that trying to pick 
it apart until it stabilized was a waste of time.

I also remember thinking that libsysfs sounded like a horrible idea (having 
your own copy of a shared library in the source tree defeats the purpose of 
having a shared library).  It was something that bothered me about the design 
from day one, libsysfs was in theory an external library but udev included 
its own copy, which made as much sense to me as including its own copy of 
glibc.  Here's the problem back in 2003:
http://www.ussg.iu.edu/hypermail/linux/kernel/0311.2/0716.html

Here's you replying to me on that topic in 2005:
http://www.ussg.iu.edu/hypermail/linux/kernel/0512.1/0617.html


> Did you send me a patch for this 
> problem that was major enough to keep you from using the project?

That problem was only one reason I didn't use the project.  I objected to most 
of the design, at some length, in this post back in 2005:
http://lkml.org/lkml/2005/10/30/189

> > So essentially you're saying "well read it again, we've finally got it
> > right now"?
>
> Not at all, we are saying to look at how to achieve what you are trying
> to achieve by reading a very small and well documented .c file (530
> lines with comments) that explains how to easily and quickly achieve
> what you are trying to duplicate.

Ok.  I note that most of the stuff I was objecting to in 2005 wouldn't _fit_ 
in 530 lines, and my first gripe in my blog post was lack of documentation 
and unnecessary complexity, neither of which appear to be the case now...

> Heck, I did the same thing in a bash script for the Gentoo startup code
> a while back that still works, but has ordering issues that the .c file
> fixes up.  Hence it was dropped for the replacement that we are pointing
> you at.

Ok.

> > > and got a clue how to do stuff reliably, and get
> > > the basic knowledge needed to document it.
> >
> > Because talking to you and having you email me the notes from this
> > conversation did not provide the basic knowledge needed to document
> > hotplug and firmware loading.  Nor did asking for feedback on the
> > document I wrote up.  Thanks ever so much.
> >
> > I point out that udev changes from version to version, so that running an
> > old version of udev against a new kernel has been known to break.
>
> Hence the Documentation/CHANGES file documents the version that is
> needed.  Right now it shows a version that is over a year and a half
> old.  I do know that you can get away with running versions that are
> even older than that if you want to, but it's not really recommended.

Udev appears to have changed, for the better.  I'm still uncomfortable 
with "the implementation is the specification".

> > Udev was more or less completely rewritten three times while I was
> > still paying attention to it.  Reading the udev code and seeing what
> > it's doing struck me as about as likely to reveal a stable API as
> > reading the kernel source, or experimenting with sysfs from userspace.
> > (Both of which I've _done_ at various points, and it keeps changing.)
>
> The development cycle of udev has nothing to do with sysfs here.

I'm trying to figure out how to decouple them, yes. :)

> Other 
> than the fact that we learned how to interact with a kernel interface
> that directly exposes the internals of the kernel itself, something that
> no one had done before.  In learning how to handle such major changes,
> udev has changed in order to support zillions of devices, small memory
> footprints, and lightning fast speed, all changes that required big
> udev internal changes, but had _nothing_ to do with the kernel and/or
> sysfs.

Arriving at simple can take a lot of work.  You don't have to tell me that. :)

> > Are you saying that the current version of udev will work with all future
> > kernels, and thus if I can figure out what udev is doing today, I can
> > just document that as the stable API?
>
> If you want to figure out how to create a dynamic /dev filesystem that
> can handle persistent device names, dynamic rules created by users,
> zillions of devices on small and big systems, small footprint, and very
> quick speed, then yes, read the udev source code.

I did all that but the persistent device names in mdev, without ever referring 
to udev (after bad experiences with it in 2005), although I no longer 
maintain any part of busybox.

> What is the goal of this document here?  You start out trying to explain
> the hotplug interface, and then get side tracked into talking about
> creating a dynamic /dev/ filesystem in userspace and then ramble on into
> how sysfs is laid out.  These are three separate things.

Various people asked me for documentation on hotplug and firmware loading, and 
what I know how to do with hotplug (because I had to work out how in 2005, 
and I'd like to nail down the approved way of doing it) is create /dev nodes.

> While the act of creating such a /dev filesystem does have something to
> do with the hotplug/uevent interface of the kernel, it isn't reliant on
> it.  And the layout of sysfs also doesn't really have much effect on the
> creation of such a /dev filesystem, as udev proves (it works just fine
> without sysfs even being mounted.)

Via netlink events?  I vaguely recall a thread about deferring all the "add" 
events until after a netlink daemon was up, but I thought you needed sysfs 
for that.

> If you want to just document the hotplug/uevent interface then do
> that.
>
> If you want to document sysfs and its structure, do that too, after
> reading the existing documentation and understanding that.

I've read the existing documentation that I've seen.  Unfortunately I'm too 
jetlagged at the moment to finish collating it, and I need to go look at this 
cleaned-up no-longer-scary udev when I'm awake.

> thanks,
>
> greg k-h

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21  0:52         ` Greg KH
@ 2007-07-21  6:32           ` Rob Landley
  0 siblings, 0 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-21  6:32 UTC (permalink / raw)
  To: Greg KH
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Friday 20 July 2007 8:52:11 pm Greg KH wrote:
> On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> > >  Always look at the parent devices themselves for determining device
> > >  context properties.
> >
> > For determining?
> >
> > What was the original language of this document?
>
> Ok, that's just being mean, cut it out right now if you ever want my
> help again.

You're right, that was uncalled for and I apologize.

I got a little on edge from some of Kay's earlier comments ala:
> You seem to miss the very basic skills to collect the needed
> information to do the job of documenting something.

Probably combined with jetlag.  I'll stop posting until I get some sleep.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21  0:43         ` Greg KH
@ 2007-07-23 23:26           ` Rob Landley
  2007-07-24  7:38             ` Cornelia Huck
  0 siblings, 1 reply; 27+ messages in thread
From: Rob Landley @ 2007-07-23 23:26 UTC (permalink / raw)
  To: Greg KH
  Cc: Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Friday 20 July 2007 8:43:49 pm Greg KH wrote:
> On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> > Ok, back up.  /sys/devices does not contain all the information necessary
> > to populate /dev, because it hasn't got things like
> > ramdisks, /dev/zero, /dev/console which are THERE in sysfs, which may or
> > may not be supported by the kernel (the kernel might have ramdisk
> > support, might not).
>
> Welcome to 2007:
>
> $ ls /sys/devices/virtual/mem/
> full  kmem  kmsg  mem  null  port  random  urandom  zero
> $ ls /sys/devices/virtual/tty/
> console  tty12  tty19  tty25  tty31  tty38  tty44  tty50  tty57  tty63
> ptmx     tty13  tty2   tty26  tty32  tty39  tty45  tty51  tty58  tty7
> tty      tty14  tty20  tty27  tty33  tty4   tty46  tty52  tty59  tty8
> tty0     tty15  tty21  tty28  tty34  tty40  tty47  tty53  tty6   tty9
> tty1     tty16  tty22  tty29  tty35  tty41  tty48  tty54  tty60
> tty10    tty17  tty23  tty3   tty36  tty42  tty49  tty55  tty61
> tty11    tty18  tty24  tty30  tty37  tty43  tty5   tty56  tty62
>
> I suggest you take a close look at the kernel before making statements
> like the above :)

I did:

landley@dell:/sys/devices$ ls /sys/devices/virtual
ls: /sys/devices/virtual: No such file or directory
landley@dell:/sys/devices$ cat /proc/version
Linux version 2.6.20-16-generic (root@terranova) (gcc version 4.1.2 (Ubuntu 
4.1.2-0ubuntu4)) #2 SMP Thu Jun 7 20:19:32 UTC 2007

I.E. Ubuntu 7.04, stock.  The most recent release, using a 2.6.20 kernel.

I see that what you're talking about is in 2.6.22.  Back when I started 
writing my document in May, I forgot to check 2.6.22.

> > These things could also, in future, have their major and minor numbers
> > dynamically (even randomly) assigned.  That's been discussed on this
> > list.
>
> I tried that once, it will require some core api kernel changes and a
> lot of infrastructure work to get that to work properly.  Not that it
> will never happen in the future, but it's just not a trivial change at
> the moment...

Understood.

> thanks,
>
> greg k-h

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-23 23:26           ` Rob Landley
@ 2007-07-24  7:38             ` Cornelia Huck
  0 siblings, 0 replies; 27+ messages in thread
From: Cornelia Huck @ 2007-07-24  7:38 UTC (permalink / raw)
  To: Rob Landley
  Cc: Greg KH, linux-kernel, Michael-Luke Jones, Krzysztof Halasa,
	Rod Whitby, Russell King, david

On Mon, 23 Jul 2007 19:26:54 -0400,
Rob Landley <rob@landley.net> wrote:

<tons of stuff>

Sorry, that's way too much for me to bother reading. I'd look at any
patches against the existing documentation, though. But I'll stop
looking at this thread, there's too much ranting for my taste in there.

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-23 21:49           ` Rob Landley
@ 2007-08-05 12:57             ` Bodo Eggert
  0 siblings, 0 replies; 27+ messages in thread
From: Bodo Eggert @ 2007-08-05 12:57 UTC (permalink / raw)
  To: Rob Landley
  Cc: 7eggert, Greg KH, Cornelia Huck, linux-kernel,
	Michael-Luke Jones, Krzysztof Halasa, Rod Whitby, Russell King,
	david

On Mon, 23 Jul 2007, Rob Landley wrote:
> On Saturday 21 July 2007 8:14:41 am Bodo Eggert wrote:
> > Greg KH <greg@kroah.com> wrote:
> > > On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> > >> I'm not trying to document /sys/devices.  I'm trying to document
> > >> hotplug, populating /dev, and things like firmware loading that fall out
> > >> of that. This requires use of sysfs, and I'm only trying to document as
> > >> much of sysfs as you need to do that.
> > >
> > > Like I stated before, you do not need to even have sysfs mounted to have
> > > a dynamic /dev.
> > >
> > > And why do you need to document populating /dev dynamically?  udev
> > > already solves this problem for you, it's not like people going off
> > > and reinventing udev for their own enjoyment would not at least look at
> > > how it solves this problem first.
> >
> > Turning your words around, you get: "Whatever one of these programs does
> > documents how dynamic devices should be handled." If this is true, any
> > change that makes one of these programs break is a kernel bug.
> >
> > Besides that: How am I supposed to be able to correctly change udev if
> > there is no document telling me what would work and what happens to
> > work by accident?
> 
> You aren't expected to.  Remember that udev and sysfs are written by the same 
> people, working together off-list.  They're free to break the exported data 
> format on a whim, because they write the code at both ends and fundamentally 
> they're talking to themselves.  They honestly say you can't expect a new 
> kernel to work with an old udev, and they say it with a straight face.  (To 
> me, this sounds like saying you can't expect a new kernel to work with an old 
> version of ps, because of /proc.)
> 
> Documentation is a threat to this way of working, because it would impose 
> restrictions on them.  A spec is only of use if you introduce the radical 
> idea that the information exported by sysfs exists for some purpose _other_ 
> than simply to provide udev with information (and a specific version of udev 
> matched to that kernel version, at that).

And having no documentation is, as you can see in this thread, a threat to
open software. Having exactly one blob of software, and being left on your
own once you change a bit, is one step away from having a binary only module.

> > > To do otherwise would be foolish :)
> >
> > Some people like to fool around and create even smaller wheels.
> > E.g. I'm changing the ACPI button driver to just call Ctrl_alt_del
> > in order not to have an extra process running and free 0.2 % of my RAM.
> 
> When I started looking at udev in 2005, it was a disaster.  My commentary at 
> the time is at http://lkml.org/lkml/2005/10/30/189 and the relevant bit is:

[...]

> And so I made mdev, a utility which populated /dev _with_ a config file in 7k.  
> Greg's upset I didn't just patch udev to remove libsysfs, remove the 
> duplicated klibc code, remove the gratuitous database, remove the 
> overcomplicated config file parser (with rules compiler), and so on.  They're 
> boggling that I could ever have been unhappy with the One True Project to 
> populate /dev.

I see, we agree on this point. Besides that, I like to see the steps to be
done, instead of having a letter sent to a voodoo doctor in Africa (called 
udev) and getting back a magic spell to be chanted on my system (unless 
he just pokes some voodoo doll).
-- 
Top 100 things you don't want the sysadmin to say:
87. Sorry, the new equipment didn't get budgeted.

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-24  6:38           ` Greg KH
@ 2007-07-25 19:28             ` Rob Landley
  0 siblings, 0 replies; 27+ messages in thread
From: Rob Landley @ 2007-07-25 19:28 UTC (permalink / raw)
  To: Greg KH
  Cc: Bodo Eggert, Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Tuesday 24 July 2007 2:38:18 am Greg KH wrote:
> > In other words: Grasping sysfs is not a feasible task? If this is true,
> > how can anybody reliably use sysfs?
>
> Huh, I never stated that at all.  If you wish to fully document sysfs
> and how it works, then great, do that.  But that was not the stated
> intent of this document, and is why I think the author got confused as
> he was attempting to put a narrow portion of how sysfs works as a
> reflection on how the whole of the body works.

I am often confused.  Confusion is my natural state.  It's how I got started 
writing documentation, as "notes to me" so I'd be able to reproduce what I 
did.

Currently my "one more thing to research before I can write this up" is your 
earlier comment that I can coldplug the existing set of devices without 
talking to sysfs.  Possibly there's a way to do this through netlink, but I 
don't know of any way to send data _back_ to the kernel with the usermode 
helper mechanism, and telling the kernel to do that for every device in the 
system at once seems like a fork bomb waiting to happen anyway.

Yes, some embedded developers want to remove the networking layer from the 
kernel, meaning they can't use netlink, meaning if we ever _do_ go to 
dynamically allocated /dev nodes and you _must_ populate /dev by getting the 
numbers out of the kernel, this is an issue.

So far, I've been able to devote about 15 minutes to this topic since the 
weekend, and that was stolen from something else...

> thanks,
>
> greg k-h

Rob

-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21 12:14         ` Bodo Eggert
  2007-07-23 21:49           ` Rob Landley
@ 2007-07-24  6:38           ` Greg KH
  2007-07-25 19:28             ` Rob Landley
  1 sibling, 1 reply; 27+ messages in thread
From: Greg KH @ 2007-07-24  6:38 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Rob Landley, Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Sat, Jul 21, 2007 at 02:14:41PM +0200, Bodo Eggert wrote:
> Greg KH <greg@kroah.com> wrote:
> > On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> 
> >> I'm not trying to document /sys/devices.  I'm trying to document hotplug,
> >> populating /dev, and things like firmware loading that fall out of that.
> >> This requires use of sysfs, and I'm only trying to document as much of sysfs
> >> as you need to do that.
> > 
> > Like I stated before, you do not need to even have sysfs mounted to have
> > a dynamic /dev.
> > 
> > And why do you need to document populating /dev dynamically?  udev
> > already solves this problem for you, it's not like people going off
> > and reinventing udev for their own enjoyment would not at least look at
> > how it solves this problem first.
> 
> Turning your words around, you get: "Whatever one of these programs does
> documents how dynamic devices should be handled." If this is true, any
> change that makes one of these programs break is a kernel bug.

Not at all.  The kernel changed things numerous times that showed up as
bugs in udev, and I fully admit that (and have in the past, numerous
times.)

> Besides that: How am I supposed to be able to correctly change udev if
> there is no document telling me what would work and what happens to
> work by accident?

Um, the same way you change any codebase?  :)

> > To do otherwise would be foolish :)
> 
> Some people like to fool around and create even smaller wheels.
> E.g. I'm changing the ACPI button driver to just call Ctrl_alt_del
> in order not to have an extra process running and free 0.2 % of my RAM.

That's great, and I have nothing against that, and encourage you to do
so.

But I don't suppose you are trying to complain to the ACPI developers
about this whole thing now are you?  Are you hassling them and
demanding that they fully document their interfaces that you need to use
so that you can hook into their code differently than they wish you to
do so?

> > Firmware loading is fine to document if you wish to do so.  But again,
> > why?  We already have multiple userspace programs that provide this
> > feature for them.  Perhaps you want to document how to add firmware to a
> > system in order for these different programs to pick them up?
> 
> I once tried to install a firmware for hotplug. Even finding the place where
> I'm supposed to put it was harder than rewriting that *beep* from start,
> but I could not rewrite it because I didn't have any documentation.

The firmware layer has never been fully documented, and the maintainer
of the code died a few years ago.  It has been well known that this is
one area of the kernel that needs a lot of attention and help.  Please
feel free to chip in if you can do so.

> Even digging in that pile of wrapper scripts in order to debug that thing
> was a nightmare. (Having a number of places where the firmware will be
> expected in one of many versions and formats stored using one of many
> filenames can drive you nuts.)

I fully agree.

> > Or perhaps you want to document how to add this kind of functionality to
> > your kernel driver so that it can handle firmware loading by using the
> > firmware interface that the kernel provides?
> 
> I suppose that's missing, too. Or scattered in a number of contradicting
> and mostly outdated howtos across the internet.

It probably is, hence my suggestion on something that would be very
valuable to have documented.

> > If you just want to document the hotplug/uevent api, then do just that.
> > However I think you are overreaching with your scope here and getting
> > mighty confused in the process.
> 
> In other words: Grasping sysfs is not a feasible task? If this is true,
> how can anybody reliably use sysfs?

Huh, I never stated that at all.  If you wish to fully document sysfs
and how it works, then great, do that.  But that was not the stated
intent of this document, and is why I think the author got confused as
he was attempting to put a narrow portion of how sysfs works as a
reflection on how the whole of the body works.

thanks,

greg k-h

* Re: Documentation for sysfs, hotplug, and firmware loading.
  2007-07-21 12:14         ` Bodo Eggert
@ 2007-07-23 21:49           ` Rob Landley
  2007-08-05 12:57             ` Bodo Eggert
  2007-07-24  6:38           ` Greg KH
  1 sibling, 1 reply; 27+ messages in thread
From: Rob Landley @ 2007-07-23 21:49 UTC (permalink / raw)
  To: 7eggert
  Cc: Greg KH, Cornelia Huck, linux-kernel, Michael-Luke Jones,
	Krzysztof Halasa, Rod Whitby, Russell King, david

On Saturday 21 July 2007 8:14:41 am Bodo Eggert wrote:
> Greg KH <greg@kroah.com> wrote:
> > On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> >> I'm not trying to document /sys/devices.  I'm trying to document
> >> hotplug, populating /dev, and things like firmware loading that fall out
> >> of that. This requires use of sysfs, and I'm only trying to document as
> >> much of sysfs as you need to do that.
> >
> > Like I stated before, you do not need to even have sysfs mounted to have
> > a dynamic /dev.
> >
> > And why do you need to document populating /dev dynamically?  udev
> > already solves this problem for you, it's not like people going off
> > and reinventing udev for their own enjoyment would not at least look at
> > how it solves this problem first.
>
> Turning your words around, you get: "Whatever one of these programs does
> documents how dynamic devices should be handled." If this is true, any
> change that makes one of these programs break is a kernel bug.
>
> Besides that: How am I supposed to be able to correctly change udev if
> there is no document telling me what would work and what happens to
> work by accident?

You aren't expected to.  Remember that udev and sysfs are written by the same 
people, working together off-list.  They're free to break the exported data 
format on a whim, because they write the code at both ends and fundamentally 
they're talking to themselves.  They honestly say you can't expect a new 
kernel to work with an old udev, and they say it with a straight face.  (To 
me, this sounds like saying you can't expect a new kernel to work with an old 
version of ps, because of /proc.)

Documentation is a threat to this way of working, because it would impose 
restrictions on them.  A spec is only of use if you introduce the radical 
idea that the information exported by sysfs exists for some purpose _other_ 
than simply to provide udev with information (and a specific version of udev 
matched to that kernel version, at that).

> > To do otherwise would be foolish :)
>
> Some people like to fool around and create even smaller wheels.
> E.g. I'm changing the ACPI button driver to just call Ctrl_alt_del
> in order not to have an extra process running and free 0.2 % of my RAM.

When I started looking at udev in 2005, it was a disaster.  My commentary at 
the time is at http://lkml.org/lkml/2005/10/30/189 and the relevant bit is:

> It turns out udev is smaller than it seems because it block copies
> code out of klibc and libsysfs (yes, having a standard interface library to
> sysfs is _such_ a good idea we should fork our own copy and bundle it. 
> After all, that's what shared libraries are for...)  And once you chop out
> all that, 90% of what's left is _still_ optional (try "grep ' main(' *.c"
> and notice we have 12 separate occurrences of main().  For something that
> needs at most two (bootup and hotplug) and the bootup version can be a
> command line option. You don't need udevinfo, udevmonitor, udevsend, or
> udevtest.)  Add in the fact that udev/udevd use a gratuitous database that
> can be ditched, and then contemplate simplifying the config file (cut down
> the parser, and embedded users should NOT need a rules compiler; dunno
> whether it's worth it to keep the same config file syntax or come up with
> something tiny and dumb for embedded use)...

And so I made mdev, a utility which populated /dev _with_ a config file in 7k.  
Greg's upset I didn't just patch udev to remove libsysfs, remove the 
duplicated klibc code, remove the gratuitous database, remove the 
overcomplicated config file parser (with rules compiler), and so on.  They're 
boggling that I could ever have been unhappy with the One True Project to 
populate /dev.

> > Firmware loading is fine to document if you wish to do so.  But again,
> > why?  We already have multiple userspace programs that provide this
> > feature for them.  Perhaps you want to document how to add firmware to a
> > system in order for these different programs to pick them up?
>
> I once tried to install a firmware for hotplug. Even finding the place where
> I'm supposed to put it was harder than rewriting that *beep* from start,
> but I could not rewrite it because I didn't have any documentation.
> Even digging in that pile of wrapper scripts in order to debug that thing
> was a nightmare. (Having a number of places where the firmware will be
> expected in one of many versions and formats stored using one of many
> filenames can drive you nuts.)

One of my todo list items is to see if loading firmware out of initramfs for 
statically linked devices, before /init gets spawned, can be made to work.  
Documenting how it works under normal circumstances is step 1.

Then again I'm weird and think that documentation is a good thing in 
itself.  "You want to document this?  Why would you do a crazy thing like 
that?"  Because I'm that kind of crazy?  Because people have _ASKED_ me for 
this documentation off-list?  (Some of the people cc'd on this thread, in 
fact, although I didn't track down all of the requestors, just a few for 
review purposes...)

> > Or perhaps you want to document how to add this kind of functionality to
> > your kernel driver so that it can handle firmware loading by using the
> > firmware interface that the kernel provides?
>
> I suppose that's missing, too. Or scattered in a number of contradicting
> and mostly outdated howtos across the internet.

As I said, I really don't think they wanted it documented.  Look at the hugely 
defensive reaction they've had to my recent attempt.  Kay Sievers is _still_ 
spam blocking my email:

>                    The mail system
> <kay.sievers@vrfy.org>: host mx01.kundenserver.de[212.227.15.150] refused
> to talk to me: 421 mails from 71.162.243.5 refused: local dynamic IP
> address 71.162.243.5

Which it isn't, it's a FIOS that's had the same static IP since last year, and 
I told him about this at OLS, and I've had Greg forward email to him since 
then, but that cut and paste was from a bounce message that came in 
yesterday.  But that's a side issue.

The email Kay sent me from OLS (where I spoke to him in person for most of an 
hour) tells me to use /sys/bus/*/devices/*/dev but the documentation Greg 
posted a week ago explicitly says that any path following the "devices" 
symlink is a bug in the application.  (A document which of course, I wasn't 
CC'd on but was chastised for not having seen at the start of this thread, 
documentation which like the earlier version I _did_ comment on consists 
primarily of warnings "DO NOT DO $XXX".  And you wonder why I start to 
suspect they don't want third parties to use sysfs?)

I asked for clarification about this (how should I find devices under /sys/bus 
without following the devices symlink), but Greg said he won't reply to this 
thread anymore because his feelings were hurt.  Kay Sievers telling me I'm 
wasting his time and too stupid to understand what he tells me isn't expected 
to hurt my feelings, though:

> I invested a lot of time explaining stuff to you in email and
> personally, but really, that seems just like a total waste of time. I
> will not reply to any of your mails until you have proven to have read
> the udevtrigger code, and got a clue how to do stuff reliably, and get
> the basic knowledge needed to document it.

So him telling me to follow the "devices" symlink and Greg telling me never to 
do that are, collectively, my fault.  And he won't tell me anymore because 
I'm not worthy.  Add in Greg Kroah-Hartman saying he won't respond to this 
thread anymore and it's a bit...  frustrating, really.

I have Cornelia Huck presumably trying to help, saying things like the 
following exchange: http://lkml.org/lkml/2007/7/19/52
> > > >   SUBSYSTEM
> > > >     If this is "block", it's a block device.  Anything else is a char
> > > > device.
> > >
> > > No. For devices, SUBSYSTEM may be the class (like 'scsi_device') or the
> > > bus (like 'pci').
> >
> > Do you make a /dev node for either one?
> >
> > I'm trying to, at minimum, document what you pass to mknod.  I consider
> > it important to know.
>
> The problem is that your information is wrong. Imagine someone reading
> this document, thinking "cool, I'll create a char node if
> SUBSYSTEM!=block" and subsequently getting completely confused about
> all those SUBSYSTEM==pci events.

Except, it turns out, my information on that point _wasn't_ wrong, but even 
she (an existing sysfs developer) didn't know it and had to be corrected by 
Greg:
http://lkml.org/lkml/2007/7/20/66

I'm not blaming Cornelia, I'm thankful she's trying to help.  I'm just 
pointing out that this stuff is hard to document, because even developers who 
think they know it don't.  The fact that sysfs can give me a major and minor 
number without telling me whether it's "char" or "block" (without which major 
and minor are useless) strikes me as a design flaw, to be honest.  This 
ENTIRE THREAD boils down to me trying to figure out A) how do I find (iterate 
through) all the char devices in sysfs, B) how do I find all the block 
devices in sysfs, C) when I get a hotplug event that gives me major and 
minor, how do I tell if it's char or block.  (I now have an answer to C from 
Greg, although he gave it to Cornelia, not me, possibly because she wasn't 
trying to write documentation.)
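
As a minimal sketch of (C), assuming the event variables are SUBSYSTEM, MAJOR
and MINOR as in the draft quoted above, and that the "block, otherwise char"
rule holds as stated, the decision could look like this in Python.  The
function name and the dict argument are illustrative only; they are not part
of any kernel or udev interface:

```python
import stat


def node_type(event):
    """Return stat.S_IFBLK, stat.S_IFCHR, or None for a hotplug event.

    `event` is a dict of the variables delivered with the event
    (hypothetical representation; a /sbin/hotplug helper could pass
    os.environ here).
    """
    # Events without device numbers (e.g. SUBSYSTEM=pci bus events)
    # describe hardware that gets no /dev node at all.
    if "MAJOR" not in event or "MINOR" not in event:
        return None
    # Rule from the thread: "block" means a block device; anything else
    # that carries MAJOR/MINOR is treated as a char device.
    if event.get("SUBSYSTEM") == "block":
        return stat.S_IFBLK
    return stat.S_IFCHR
```

A helper would then hand the result, together with os.makedev(major, minor),
to os.mknod(); events that return None are simply skipped.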

When I ask questions of the sysfs developers they seldom bother to _correct_ 
me, they just tell me I'm wrong and make fun of me for not knowing it, and if 
I'm lucky they tell me to go read udev or the kernel sources.

Except the problem I have is that I _looked_ at the kernel many times over the 
years (and older versions of udev), and worked out how to do this circa 
2.6.12 and they changed it (by turning the flat /sys/block into a nested 
thing where partitions are subdirectories under the block device's 
subdirectory).  Then I had it working again circa 2.6.14 and they changed it 
again around 2.6.20 (turning subdirectories to symlinks so my strategy for 
avoiding the endless loops broke).  Then they changed it again by 
adding "/sys/class/block" so there are now block devices in a directory that 
formerly contained only char devices.
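
For illustration, a minimal sketch of that kind of scan in Python, assuming
only that a device directory advertises its node through a "dev" file holding
"major:minor".  It never descends into symlinked directories, which is one way
to stay out of the loops described above; find_dev_files is a hypothetical
name, and which subtree to start from is exactly the open question here:

```python
import os


def find_dev_files(root):
    """Yield (directory, major, minor) for every "dev" file under root."""
    # followlinks=False: symlinked directories are listed but never
    # descended into, so a link pointing back up the tree cannot
    # cause an endless loop.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        if "dev" not in filenames:
            continue
        with open(os.path.join(dirpath, "dev")) as f:
            text = f.read().strip()
        # The file is expected to hold "major:minor"; skip anything else.
        major, sep, minor = text.partition(":")
        if sep and major.isdigit() and minor.isdigit():
            yield dirpath, int(major), int(minor)
```

The cost of that choice is that devices reachable only through symlinks are
missed, so on a layout where the entries are links the scan would have to
start from the directory the links point into.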

Plus they added char devices to /sys/bus that aren't referred to 
from /sys/class, so looking under /sys/class doesn't find all char devices 
anymore.  (They _say_ they did, anyway.  I still haven't actually seen a 
system that has one, but I'm taking them at their word here.)  I _still_ only 
know how to find the /sys/bus devices by following the path Kay told me, 
which Greg told me not to use, and resolution on that can't come in this 
thread because both Kay and Greg have told me they won't reply anymore.  (And 
I can't email Kay off-list, because he's still spam-blocking me.)

Looking at the code doesn't tell me their intent.  It doesn't get a statement 
from them "this is how programs are expected to use information exported by 
the kernel, this subset of sysfs is a stable API, and it will continue to 
work with future kernels".  I've merely worked out a method that works, but 
they can always change it and blame me for not having done it the way they 
were thinking.  (Even when, like adding /sys/bus or /sys/class/block, the way 
they were thinking didn't exist yet.)

> > If you just want to document the hotplug/uevent api, then do just that.
> > However I think you are overreaching with your scope here and getting
> > mighty confused in the process.
>
> In other words: Grasping sysfs is not a feasible task? If this is true,
> how can anybody reliably use sysfs?

If it was possible to reliably use sysfs (without having written it), would 
you ever be _required_ to upgrade udev, except for security reasons or to 
gain new features?

As far as I can tell, "why do you need to document this" translates to "I do 
not want you to document this".  They consider sysfs, like the kernel 
internal APIs, something that's expected to change incompatibly from release 
to release.

Despite that, I'll post an updated version of the document that started this 
thread when I get back to Pittsburgh.  (I'm funny that way.)  I'm just 
working on other less-frustrating things for a while first.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

* Re: Documentation for sysfs, hotplug, and firmware loading.
       [not found]       ` <8Jbuh-84N-3@gated-at.bofh.it>
@ 2007-07-21 12:14         ` Bodo Eggert
  2007-07-23 21:49           ` Rob Landley
  2007-07-24  6:38           ` Greg KH
  0 siblings, 2 replies; 27+ messages in thread
From: Bodo Eggert @ 2007-07-21 12:14 UTC (permalink / raw)
  To: Greg KH, Rob Landley, Cornelia Huck, linux-kernel,
	Michael-Luke Jones, Krzysztof Halasa, Rod Whitby, Russell King,
	david

Greg KH <greg@kroah.com> wrote:
> On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:

>> I'm not trying to document /sys/devices.  I'm trying to document hotplug,
>> populating /dev, and things like firmware loading that fall out of that.
>> This requires use of sysfs, and I'm only trying to document as much of sysfs
>> as you need to do that.
> 
> Like I stated before, you do not need to even have sysfs mounted to have
> a dynamic /dev.
> 
> And why do you need to document populating /dev dynamically?  udev
> already solves this problem for you, it's not like people going off
> and reinventing udev for their own enjoyment would not at least look at
> how it solves this problem first.

Turning your words around, you get: "Whatever one of these programs does
documents how dynamic devices should be handled." If this is true, any
change that makes one of these programs break is a kernel bug.

Besides that: How am I supposed to be able to correctly change udev if
there is no document telling me what would work and what happens to
work by accident?

> To do otherwise would be foolish :)

Some people like to fool around and create even smaller wheels.
E.g. I'm changing the ACPI button driver to just call Ctrl_alt_del
in order not to have an extra process running and free 0.2 % of my RAM.

> Firmware loading is fine to document if you wish to do so.  But again,
> why?  We already have multiple userspace programs that provide this
> feature for them.  Perhaps you want to document how to add firmware to a
> system in order for these different programs to pick them up?

I once tried to install a firmware for hotplug. Even finding the place where
I'm supposed to put it was harder than rewriting that *beep* from start,
but I could not rewrite it because I didn't have any documentation.
Even digging in that pile of wrapper scripts in order to debug that thing
was a nightmare. (Having a number of places where the firmware will be
expected in one of many versions and formats stored using one of many
filenames can drive you nuts.)

> Or perhaps you want to document how to add this kind of functionality to
> your kernel driver so that it can handle firmware loading by using the
> firmware interface that the kernel provides?

I suppose that's missing, too. Or scattered in a number of contradicting
and mostly outdated howtos across the internet.

> If you just want to document the hotplug/uevent api, then do just that.
> However I think you are overreaching with your scope here and getting
> mighty confused in the process.

In other words: Grasping sysfs is not a feasible task? If this is true,
how can anybody reliably use sysfs?
-- 
Top 100 things you don't want the sysadmin to say:
99. Shit!!

Friß, Spammer: G@r.7eggert.dyndns.org Sln3@hI.7eggert.dyndns.org

Thread overview: 27+ messages
2007-07-17 21:03 Documentation for sysfs, hotplug, and firmware loading Rob Landley
2007-07-17 21:55 ` Randy Dunlap
2007-07-18  7:58 ` Cornelia Huck
2007-07-18 17:39   ` Rob Landley
2007-07-18 23:33     ` Kay Sievers
2007-07-20  5:14       ` Rob Landley
2007-07-20  7:00         ` Greg KH
2007-07-20  7:54           ` Cornelia Huck
2007-07-20  8:09             ` Greg KH
2007-07-21  3:48               ` Rob Landley
2007-07-21  6:23           ` Rob Landley
2007-07-18 23:40     ` Greg KH
2007-07-21  0:37       ` Rob Landley
2007-07-19  8:16     ` Cornelia Huck
2007-07-21  0:21       ` Rob Landley
2007-07-21  0:43         ` Greg KH
2007-07-23 23:26           ` Rob Landley
2007-07-24  7:38             ` Cornelia Huck
2007-07-21  0:49         ` Greg KH
2007-07-21  0:52         ` Greg KH
2007-07-21  6:32           ` Rob Landley
2007-07-18 10:33 ` Kay Sievers
     [not found] <8I2t1-7jt-5@gated-at.bofh.it>
     [not found] ` <8IlPa-3Gl-61@gated-at.bofh.it>
     [not found]   ` <8IzyM-86P-47@gated-at.bofh.it>
     [not found]     ` <8Jb1c-7vL-3@gated-at.bofh.it>
     [not found]       ` <8Jbuh-84N-3@gated-at.bofh.it>
2007-07-21 12:14         ` Bodo Eggert
2007-07-23 21:49           ` Rob Landley
2007-08-05 12:57             ` Bodo Eggert
2007-07-24  6:38           ` Greg KH
2007-07-25 19:28             ` Rob Landley
