linux-kernel.vger.kernel.org archive mirror
* driverfs problems
@ 2002-05-17 16:08 Arnd Bergmann
  2002-05-20 16:39 ` driverfs problem Patrick Mochel
  2002-05-21 21:24 ` driverfs problems Pavel Machek
  0 siblings, 2 replies; 5+ messages in thread
From: Arnd Bergmann @ 2002-05-17 16:08 UTC
  To: Patrick Mochel; +Cc: linux-kernel, Arnd Bergmann

Hi,

I'm trying to write a new bus driver for the s/390 channel subsystem and 
stumbled over a few problems.

- If I register 'struct device's for my devices and unregister them at 
module unload time with put_device, the memory for the d_entries
is never freed because the refcount is still '3' (or '7' for directories) at 
the beginning of  'driverfs_unlink()'. Consequently, 
'driverfs_d_delete_file()' never gets called at all. So far, I could not
find the place where the refcount is incremented and not decremented
again. Any idea?

- When using many devices, a lot of memory seems to be wasted by 
identical files. When I simulated 65536 devices, which would be the
architectural limit of the channel subsystem, my ~200 MB of free memory was 
immediately filled and the out-of-memory handler started killing my
user space programs. More investigation showed that each dummy device
needs between 3 and 4 KB on a 31-bit s390 system. That is probably not too 
much if you assume that people who can afford thousands of devices typically 
also have a lot of RAM, but I could imagine that there is still room for
improvement.
Would it be feasible to allocate the dentries for standard files 
(name/power/status plus the ones provided by architecture and bus driver) 
only when the parent directory is accessed?

- I'm not sure about how to name the device directories. I don't have anything
like a hierarchical structure (except for something like scsi devices behind
a channel device) but rather a flat list of up to 65536 devices that are 
accessed by a device number that was defined by the system administrator. 
Each device also has a control unit type, comparable to a PCI ID, and in the 
general case each device driver knows about one control unit type. A 
hypothetical system might have
- one console, control unit type 0x3215, device number 0x0000
- three network devices, control unit type  0x1732, devno 0x0100 to 0x0102
- 1024 storage devices, control unit type 0x3390, devno 0x1000 to 0x13ff

I have a.t.m. three different ideas for how to structure the driverfs, in 
this case:
a) flat listing:
/root/channel/{0000,0100,0101,0102,1000-13ff}
advantage: reflects the real physical layout, no policy
disadvantage: difficult to parse as a human (similar to pre-devfs /dev/*),
possible scalability problems when scanning through long lists in kernel

b) by control unit type:
/root/channel/1732/{0100,0101,0102}
/root/channel/3215/0000
/root/channel/3390/{1000-13ff}
advantage: easy to find e.g. the console if you don't know the devno
disadvantage: control unit type might be unknown in a few special cases,
a control unit type is not really a bus but a common property of the devices

c) split device number:
/root/channel/00/00
/root/channel/01/{00,01,02}
/root/channel/{10,11,12,13}/{00-ff}
advantage: no large directories
disadvantage: does not reflect physical structure but policy

I personally favour solution b), but I want to be consistent with other
architectures. Is there any other bus handling 'many' devices? How 
do they do it?
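
To make the three layouts concrete, here is a minimal user-space sketch that
formats the directory path for one device under each scheme. The function
names are hypothetical and nothing here is driverfs API; only the
"/root/channel" prefix and the number formats are taken from the examples
above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helpers, not part of driverfs: each one formats the
 * directory path of a single channel device under one proposed layout. */

/* a) flat listing: /root/channel/<devno> */
static int path_flat(char *buf, size_t len, unsigned int devno)
{
	return snprintf(buf, len, "/root/channel/%04x", devno & 0xffff);
}

/* b) by control unit type: /root/channel/<cu_type>/<devno> */
static int path_by_cu_type(char *buf, size_t len,
			   unsigned int cu_type, unsigned int devno)
{
	return snprintf(buf, len, "/root/channel/%04x/%04x",
			cu_type & 0xffff, devno & 0xffff);
}

/* c) split device number: /root/channel/<high byte>/<low byte> */
static int path_split(char *buf, size_t len, unsigned int devno)
{
	return snprintf(buf, len, "/root/channel/%02x/%02x",
			(devno >> 8) & 0xff, devno & 0xff);
}
```

With devno 0x0101 and control unit type 0x1732, these produce a path with one
level below the bus for a), and two levels for b) and c). Only c) bounds every
directory at 256 entries.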

Arnd <><


* Re: driverfs problem
  2002-05-17 16:08 driverfs problems Arnd Bergmann
@ 2002-05-20 16:39 ` Patrick Mochel
  2002-05-21 15:52   ` Arnd Bergmann
  2002-05-21 21:26   ` Pavel Machek
  2002-05-21 21:24 ` driverfs problems Pavel Machek
  1 sibling, 2 replies; 5+ messages in thread
From: Patrick Mochel @ 2002-05-20 16:39 UTC
  To: Arnd Bergmann; +Cc: linux-kernel, Arnd Bergmann


Hi. Sorry about the delay in responding.

> - If I register 'struct device's for my devices and unregister them at 
> module unload time with put_device, the memory for the d_entries
> is never freed because the refcount is still '3' (or '7' for directories) at 
> the beginning of  'driverfs_unlink()'. Consequently, 
> 'driverfs_d_delete_file()' never gets called at all. So far, I could not
> find the place where the refcount is incremented and not decremented
> again. Any idea?

This is interesting. It appears that I need to do an extra dput() when 
removing files to push the refcount to 0. This counters the lookup_hash() 
that is done when creating the files and directories, and is analogous to 
the path_release() call in sys_unlink() and sys_rmdir(). Please try the 
attached patch, and let me know if it fixes the problem.

Note: I _think_ this is right. Can anyone confirm or deny this? 
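
The imbalance can be illustrated with a toy user-space model. This is only a
sketch under simplifying assumptions: the toy_* names are invented, a single
counter stands in for the dentry refcount, and the real counts of 3 and 7
reported above are not modelled. It shows only that a reference taken at
creation time must be matched by a put at removal time before the object can
be freed.

```c
#include <assert.h>

/* Toy stand-in for a dentry: only the reference count is modelled. */
struct toy_dentry {
	int d_count;	/* outstanding references */
	int freed;	/* set once the last reference is dropped */
};

static void toy_dget(struct toy_dentry *d)
{
	d->d_count++;
}

static void toy_dput(struct toy_dentry *d)
{
	assert(d->d_count > 0);
	if (--d->d_count == 0)
		d->freed = 1;
}

/* Creation: the lookup leaves one reference held for the file's lifetime. */
static void toy_create(struct toy_dentry *d)
{
	d->d_count = 0;
	d->freed = 0;
	toy_dget(d);
}

/* Removal without the extra put: the creation reference is leaked. */
static void toy_remove_unbalanced(struct toy_dentry *d)
{
	(void)d;	/* the unlink step is reference-neutral in this model */
}

/* Removal with the extra put: the count reaches 0 and the entry is freed. */
static void toy_remove_balanced(struct toy_dentry *d)
{
	toy_dput(d);
}
```

After toy_create() followed by toy_remove_unbalanced() the count stays at 1
and the entry is never freed; with toy_remove_balanced() it drops to 0.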

> - When using many devices, a lot of memory seems to be wasted by 
> identical files. When I simulated 65536 devices, which would be the
> architectural limit of the channel subsystem, my ~200 MB of free memory was 
> immediately filled and the out-of-memory handler started killing my
> user space programs. More investigation showed that each dummy device
> needs between 3 and 4 KB on a 31-bit s390 system. That is probably not too 
> much if you assume that people who can afford thousands of devices typically 
> also have a lot of RAM, but I could imagine that there is still room for
> improvement.

Yes, that was the assumption. However, there is much room for improvement. 
We shouldn't require that much memory.
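
For scale, the quoted figures can be checked with simple arithmetic. The
helper below is a hypothetical sketch, not kernel code: 65536 devices at
3 to 4 KB each come to roughly 192 to 256 MB, which is consistent with about
200 MB of free memory being exhausted.

```c
/* Hypothetical helper: estimated driverfs footprint in MB (1 MB = 1024 KB)
 * for ndev devices that each need kb_per_dev KB. */
static unsigned long driverfs_mem_mb(unsigned long ndev,
				     unsigned long kb_per_dev)
{
	return ndev * kb_per_dev / 1024;
}
```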

> Would it be feasible to allocate the dentries for standard files 
> (name/power/status plus the ones provided by architecture and bus driver) 
> only when the parent directory is accessed?

Theoretically, I believe that's possible. Al Viro has also spoken of a 
per-device filesystem, which could help in that area. However, I have not 
pursued either option. Though I hate to say it, there is other progress to 
be made before I have a chance to seriously investigate these options.

> - I'm not sure about how to name the device directories. I don't have anything
> like a hierarchical structure (except for something like scsi devices behind
> a channel device) but rather a flat list of up to 65536 devices that are 
> accessed by a device number that was defined by the system administrator. 
> Each device also has a control unit type, comparable to a PCI ID, and in the 
> general case each device driver knows about one control unit type. A 
> hypothetical system might have
> - one console, control unit type 0x3215, device number 0x0000
> - three network devices, control unit type  0x1732, devno 0x0100 to 0x0102
> - 1024 storage devices, control unit type 0x3390, devno 0x1000 to 0x13ff

The control unit types are irrelevant at this point, as they dictate the 
type of device. You want to accurately represent the physical layout of 
the system. 

I honestly don't know how the controllers look on the s390, so I will go 
off what I do know about x86 PCI SCSI controllers. You typically have a 
hierarchy like:

.
`-- controller
    |-- chan0
    |   |-- 0
    |   |-- 1
    |   |-- 2
    |   `-- 3
	...
    `-- chan1
        |-- 0
        |-- 1
        |-- 2
        `-- 3
	...

right? (Note: naming is fictional here) I would think the layout would 
be similar, but that's only a guess. How many devices can be on a channel? 
Does splitting them up like this help with the large directories?

Thanks,

	-pat

===== fs/driverfs/inode.c 1.18 vs edited =====
--- 1.18/fs/driverfs/inode.c	Tue Mar 12 14:22:16 2002
+++ edited/fs/driverfs/inode.c	Mon May 20 09:25:48 2002
@@ -592,6 +592,7 @@
 		if (!strcmp(entry->name,name)) {
 			list_del_init(node);
 			vfs_unlink(entry->dentry->d_parent->d_inode,entry->dentry);
+			dput(entry->dentry);
 			put_mount();
 			break;
 		}
@@ -616,7 +617,6 @@
 	if (!dentry)
 		goto done;
 
-	dget(dentry);
 	down(&dentry->d_parent->d_inode->i_sem);
 	down(&dentry->d_inode->i_sem);
 
@@ -627,6 +627,7 @@
 
 		list_del_init(node);
 		vfs_unlink(dentry->d_inode,entry->dentry);
+		dput(entry->dentry);
 		put_mount();
 		node = dir->files.next;
 	}



* Re: driverfs problem
  2002-05-20 16:39 ` driverfs problem Patrick Mochel
@ 2002-05-21 15:52   ` Arnd Bergmann
  2002-05-21 21:26   ` Pavel Machek
  1 sibling, 0 replies; 5+ messages in thread
From: Arnd Bergmann @ 2002-05-21 15:52 UTC
  To: Patrick Mochel; +Cc: linux-kernel, Arnd Bergmann, Cornelia Huck

On Monday 20 May 2002 18:39, Patrick Mochel wrote:

> This is interesting. It appears that I need to do an extra dput() when
> removing files to push the refcount to 0. This counters the lookup_hash()
> that is done when creating the files and directories, and is analogous to
> the path_release() call in sys_unlink() and sys_rmdir(). Please try the
> attached patch, and let me know if it fixes the problem.
Yes, all files are correctly removed and the memory freed with that patch.

> Theoretically, I believe that's possible. Al Viro has also spoken of a
> per-device filesystem, which could help in that area. However, I have not
> pursued either option. Though I hate to say it, there is other progress to
> be made before I have a chance to seriously investigate these options.
That's ok. Right now it's only a potential problem for us, and if it becomes
a real one, I can investigate it further. My current idea is to have a
device_{create,remove}_{platform,bus}_file() interface that adds files
to all devices (*_platform_file) or to all devices that belong to one bus
subdir (*_bus_file). They would then share a struct inode and thus 
appear hard linked. The space wasted by the dentries is a lot less than
what the inodes need.
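
A rough user-space sketch of that sharing idea follows. All names are
invented for illustration and the proposed device_*_bus_file() interface does
not exist; the sketch assumes only that each device keeps a small per-file
entry while the larger inode-like object is allocated once and reference
counted, as with hard links.

```c
#include <stddef.h>

/* Toy model: one shared inode-like object, many small per-device entries. */
struct toy_inode {
	const char *name;	/* attribute file name, e.g. "status" */
	unsigned int nlink;	/* number of entries pointing at this inode */
};

struct toy_entry {
	struct toy_inode *inode;	/* shared, not allocated per device */
};

/* Point one per-device entry at the shared inode. */
static void toy_link(struct toy_entry *e, struct toy_inode *ino)
{
	e->inode = ino;
	ino->nlink++;
}

/* Add the same file to every device on a bus without copying the inode. */
static void toy_bus_create_file(struct toy_entry *entries, size_t n,
				struct toy_inode *ino)
{
	size_t i;

	for (i = 0; i < n; i++)
		toy_link(&entries[i], ino);
}
```

In this model the per-device cost is one pointer-sized entry per file, and
the link count on the shared object tells how many devices expose it.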

> The control unit types are irrelevant at this point, as they dictate the
> type of device. You want to accurately represent the physical layout of
> the system.
Yes, I know. Unfortunately, the physical layout is not a tree and therefore
not easy to represent in a file system. The architecture is also trying
hard to hide the physical layout from the OS because 99% of the time,
we don't care.

> I honestly don't know how the controllers look on the s390, so I will go
> off what I do know about x86 PCI SCSI controllers. You typically have a
> hierarchy like:
>
> .
> `-- controller
>     |-- chan0
>     |   |-- 0
>     |   |-- 1
>     |   |-- 2
>     |   `-- 3
> 	...
>     `-- chan1
>         |-- 0
>         |-- 1
>         |-- 2
>         `-- 3
> 	...
>
> right? (Note: naming is fictional here) I would think the layout would be
If it was that easy, I probably wouldn't have needed to ask ;-). The main 
problem is that a device can be connected to multiple control units 
(something vaguely like a bridge in PCI semantics) and each of these can 
itself be connected to multiple channel paths (i.e. connectors of the 
'channel subsystem', our root bus within driverfs).
The I/O driver in Linux can mostly ignore these details, because it is based
on 'subchannels', which are logical connections from the channel subsystem
to a single device, independent of the path(s) in between.

> similar, but that's only a guess. How many devices can be on a channel?
> Does splitting them up like this help with the large directories?
There can be 256 channel paths, each going to a control unit, and each 
control unit can have a number of devices depending on its type, at least 
thousands of devices for certain types (actually, it's a bit more 
complicated still ;-).

The total number of devices is currently limited to 65536; each device has 
one unique subchannel, irq (irq number equals subchannel number) and device 
number.
[Conny, please correct me if I've been telling nonsense]

It might be possible to list the devices by their channel path if I could
use symbolic links inside driverfs for devices that are accessed through
multiple channel paths, but I don't think there is much to gain from that
compared to the extra effort.

Thanks for your help,

Arnd <><


* Re: driverfs problems
  2002-05-17 16:08 driverfs problems Arnd Bergmann
  2002-05-20 16:39 ` driverfs problem Patrick Mochel
@ 2002-05-21 21:24 ` Pavel Machek
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Machek @ 2002-05-21 21:24 UTC
  To: Arnd Bergmann; +Cc: Patrick Mochel, linux-kernel, Arnd Bergmann

Hi!

> I have a.t.m. three different ideas for how to structure the driverfs, in 
> this case:
> a) flat listing:
> /root/channel/{0000,0100,0101,0102,1000-13ff}
> advantage: reflects the real physical layout, no policy
> disadvantage: difficult to parse as a human (similar to pre-devfs /dev/*),
> possible scalability problems when scanning through long lists in
> kernel

I'd prefer this. 65000 is not that much, and anything else is ugly...
									Pavel
-- 
(about SSSCA) "I don't say this lightly.  However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa


* Re: driverfs problem
  2002-05-20 16:39 ` driverfs problem Patrick Mochel
  2002-05-21 15:52   ` Arnd Bergmann
@ 2002-05-21 21:26   ` Pavel Machek
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Machek @ 2002-05-21 21:26 UTC
  To: Patrick Mochel; +Cc: Arnd Bergmann, linux-kernel, Arnd Bergmann

Hi!

> > - I'm not sure about how to name the device directories. I don't have anything
> > like a hierarchical structure (except for something like scsi devices behind
> > a channel device) but rather a flat list of up to 65536 devices that are 
> > accessed by a device number that was defined by the system administrator. 
> > Each device also has a control unit type, comparable to a PCI ID, and in the 
> > general case each device driver knows about one control unit type. A 
> > hypothetical system might have
> > - one console, control unit type 0x3215, device number 0x0000
> > - three network devices, control unit type  0x1732, devno 0x0100 to 0x0102
> > - 1024 storage devices, control unit type 0x3390, devno 0x1000 to 0x13ff
> 
> The control unit types are irrelevant at this point, as they dictate the 
> type of device. You want to accurately represent the physical layout of 
> the system. 

s390 is Linux on top of VM. Linux runs under a VMware-like kind of
machine... It's difficult to talk about "physical".
								Pavel
-- 
(about SSSCA) "I don't say this lightly.  However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

