* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-25 19:02 ` Linus Torvalds
@ 2004-01-25 20:21 ` viro
2004-01-27 6:51 ` Rusty Russell
2004-01-25 23:12 ` Steve Youngs
` (2 subsequent siblings)
3 siblings, 1 reply; 38+ messages in thread
From: viro @ 2004-01-25 20:21 UTC (permalink / raw)
To: Linus Torvalds
Cc: Alan Stern, Greg KH, Kernel development list, Patrick Mochel
On Sun, Jan 25, 2004 at 11:02:58AM -0800, Linus Torvalds wrote:
> The proper thing to do (and what we _have_ done) is to say "unloading of
> modules is not supported". It's a debugging feature, and you literally
> shouldn't do it unless you are actively developing that module.
>
> Sadly, some modules are broken. Old 16-bit PCMCIA in particular _depends_
> on unloading modules, since the old PCMCIA layer doesn't do hotplug: it
> literally thinks of module load/unload as the "plug/unplug" event.
>
> But it basically boils down to: don't think of module unload as a "normal
> event". It isn't. Getting it truly right is (a) too painful and (b) would
> be too slow, so we're not even going to try.
>
> (As an example of "too painful, too slow", think of something like a
> packet filter module. You'd literally have to increment the count in every
> part that gets a packet, and decrement the count at every point where it
> lets the packet go. And since it would have to be SMP-safe, it would have
> to be a locked cycle, or we'd have to have per-CPU counters - at which
> point you now also have to worry about things like preemption and
> sleeping, which just means that it would be a _lot_ of very fragile code).
Packet filter is hardly a normal module. For absolute majority of modules
it's nowhere near that bad.
HOWEVER, module unload is not the real problem. We have objects with
limited lifetimes. Always had, always will. Whether we remove pieces
of .text from the in-core kernel or not, we must be able to deal with
that. Even if methods themselves are present, they won't do you any
good when data structures belonging to object are destroyed.
If that is handled right, rmmod is trivial for 99% of modules. The
rest (including Rusty's stuff) can simply say "we can't be unloaded
at all" and be done with that.
Basically, "protect the module" is wrong - it should be "protect specific
object" and we need that anyway. We already have that for the largest
class of modules - 300-odd netdev ones. We have that for filesystems.
We have that for block devices. We have infrastructure for doing that
to character devices, ttys and ldiscs. Which leaves us with truly weird
stuff that doesn't work in such terms and yes, there your arguments
apply.
Frankly, I'm much more concerned about the stuff that _can't_ be disabled.
You can disable module unloading; hell, you can disable modules completely.
You can't disable ifdown(8). Which currently means very bad things for
stuff in /proc/sys/net/ipv4/{conf,neigh}/*. And all arguments re rmmod
deadlocks apply to that sort of situations...
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-25 20:21 ` viro
@ 2004-01-27 6:51 ` Rusty Russell
2004-01-27 13:56 ` Roman Zippel
0 siblings, 1 reply; 38+ messages in thread
From: Rusty Russell @ 2004-01-27 6:51 UTC (permalink / raw)
To: viro; +Cc: torvalds, stern, greg, linux-kernel, mochel
On Sun, 25 Jan 2004 20:21:37 +0000
viro@parcelfarce.linux.theplanet.co.uk wrote:
> Basically, "protect the module" is wrong - it should be "protect specific
> object" and we need that anyway.
Agreed. You're oversimplifying a little, though.
In this model, the object here is the function text. So if you hand out
a pointer to the function text, you need to hold a refcount.
BUT since the module itself is the only one which can hand these out,
and it unregisters everything it has registered, and all those references
fall to zero, it's trivial to prove that there are no more references to
the module functions.
This (as Al points out by referring to lifetime) is the same problem if you
want to kfree() the thing you've registered: either deregistration is
synchronous or it supplies a callback which does the actual kfree. And most
registration interfaces in the kernel are headed towards this model, and
it can be pressed into service for module removal as well.
Hope that clarifies,
Rusty.
--
there are those who do and those who hang on and you don't see too
many doers quoting their contemporaries. -- Larry McVoy
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-27 6:51 ` Rusty Russell
@ 2004-01-27 13:56 ` Roman Zippel
2004-01-27 23:29 ` Rusty Russell
0 siblings, 1 reply; 38+ messages in thread
From: Roman Zippel @ 2004-01-27 13:56 UTC (permalink / raw)
To: Rusty Russell; +Cc: viro, torvalds, stern, greg, linux-kernel, mochel
Hi,
On Tue, 27 Jan 2004, Rusty Russell wrote:
> viro@parcelfarce.linux.theplanet.co.uk wrote:
>
> > Basically, "protect the module" is wrong - it should be "protect specific
> > object" and we need that anyway.
>
> Agreed. You're oversimplifying a little, though.
>
> In this model, the object here is the function text. So if you hand out
> a pointer to the function text, you need to hold a refcount.
>
> BUT since the module itself is the only one which can hand these out,
> and it unregisters everything it has registered, and all those references
> fall to zero, it's trivial to prove that there are no more references to
> the module functions.
There is a very simple rule: If two or more objects have the same
lifetime, they all only need to be protected once.
So if you have two static objects (e.g. a static structure with a
reference to a function), the module count is indeed enough to protect
both objects. The problem starts when you mix dynamic and static objects,
then the module count is not enough anymore and you have to use proper
refcounts, but the generic module code currently has no decent way to get
to that information, so we have to maintain the module count additionally
to the refcount. Now we only have to find a way to utilize the object
protection to also protect the module it refers to and you can get rid of
the module count.
Rusty, you need to get away from this idea, that every single function
pointer must be protected by itself, it can also be protected via another
object, the real problem is that the generic module code doesn't know
anything about this. Fixing this requires changing every single module,
but in the end it would be worth it, as it avoids the duplicated
protection and we had decent module unload semantics.
bye, Roman
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-27 13:56 ` Roman Zippel
@ 2004-01-27 23:29 ` Rusty Russell
2004-01-28 2:36 ` Roman Zippel
0 siblings, 1 reply; 38+ messages in thread
From: Rusty Russell @ 2004-01-27 23:29 UTC (permalink / raw)
To: Roman Zippel; +Cc: viro, torvalds, stern, greg, linux-kernel, mochel
In message <Pine.LNX.4.58.0401271142510.7855@serv> you write:
> Hi,
Hi Roman!
> Fixing this requires changing every single module, but in the end it
> would be worth it, as it avoids the duplicated protection and we had
> decent module unload semantics.
And I still disagree. <shrug>
If it's any consolation, I don't plan any significant module work in
2.7. If you want to work on this, you're welcome to it. Perhaps you
can convince Linus et al that it's worth the pain?
Cheers,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-27 23:29 ` Rusty Russell
@ 2004-01-28 2:36 ` Roman Zippel
2004-01-28 3:54 ` Rusty Russell
0 siblings, 1 reply; 38+ messages in thread
From: Roman Zippel @ 2004-01-28 2:36 UTC (permalink / raw)
To: Rusty Russell; +Cc: viro, torvalds, stern, greg, linux-kernel, mochel
Hi Rusty,
On Wed, 28 Jan 2004, Rusty Russell wrote:
> > Fixing this requires changing every single module, but in the end it
> > would be worth it, as it avoids the duplicated protection and we had
> > decent module unload semantics.
>
> And I still disagree. <shrug>
And I still don't know why. :(
> If it's any consolation, I don't plan any significant module work in
> 2.7. If you want to work on this, you're welcome to it. Perhaps you
> can convince Linus et al that it's worth the pain?
Well, the problem is that this won't be an one man show, it requires that
a number of kernel hackers understand the problem and the possible
solutions are discussed beforehand. I can understand that a lot here are
scared of such big change, but either we either continue complaining about
module unloading or we do something about it and this requires exploring
the various possibilities.
Rusty, you are the modules maintainer, you are supposed to understand
these issues, if you already block a discussion like that, what am I
supposed to expect from others? I really don't claim to have all the
answers and I really would like to discuss the various possible approaches
to solve this problem, so people also know what they can expect.
bye, Roman
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-28 2:36 ` Roman Zippel
@ 2004-01-28 3:54 ` Rusty Russell
0 siblings, 0 replies; 38+ messages in thread
From: Rusty Russell @ 2004-01-28 3:54 UTC (permalink / raw)
To: Roman Zippel; +Cc: viro, torvalds, stern, greg, linux-kernel, mochel
In message <Pine.LNX.4.58.0401280304180.7851@serv> you write:
> Hi Rusty,
>
> On Wed, 28 Jan 2004, Rusty Russell wrote:
>
> > > Fixing this requires changing every single module, but in the end it
> > > would be worth it, as it avoids the duplicated protection and we had
> > > decent module unload semantics.
> >
> > And I still disagree. <shrug>
>
> And I still don't know why. :(
Exactly. So we have this same conversation over and over. It's the
single most frustrating experience I've ever had in kernel
development. 8( I was very disappointed you didn't make it to the
kernel summit.
> Well, the problem is that this won't be an one man show, it requires that
> a number of kernel hackers understand the problem and the possible
> solutions are discussed beforehand. I can understand that a lot here are
> scared of such big change, but either we either continue complaining about
> module unloading or we do something about it and this requires exploring
> the various possibilities.
Even if the perfect scheme were achieved, I don't think Linus would
accept changing every module. I was originally agitating for a
"perfect" solution, so few of us cared.
Linus has said it simply isn't important. Many kernel developers
basically agree.
> Rusty, you are the modules maintainer, you are supposed to understand
> these issues, if you already block a discussion like that, what am I
> supposed to expect from others?
I'm sorry. I tried to stay out of these discussions (hey maybe
someone will come up with a great solution!), but when Linus posted
something which was basically incorrect, I felt I had to clear the
record.
For me, this issue long ago used up its timeslice.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-25 19:02 ` Linus Torvalds
2004-01-25 20:21 ` viro
@ 2004-01-25 23:12 ` Steve Youngs
2004-01-26 3:22 ` Adam Kropelin
2004-01-26 16:22 ` Roman Zippel
2004-01-27 6:41 ` Rusty Russell
3 siblings, 1 reply; 38+ messages in thread
From: Steve Youngs @ 2004-01-25 23:12 UTC (permalink / raw)
To: Linux Kernel List; +Cc: Linus Torvalds
[-- Attachment #1: Type: text/plain, Size: 2963 bytes --]
* Linus Torvalds <torvalds@osdl.org> writes:
> - doing proper refcounting of modules is _really_ really
> hard. The reason is that proper refcounting is a "local"
> issue: you reference count a single data structure. It's
> basically impossible to make a "global" reference count
> without jumping through hoops.
Please understand that I coming from an _extremely_ naive perspective,
but why do refcounting at all? Couldn't the refcount be a simple
boolean? For example:
- The cdrom module is "in use" when a cdrom filesystem is mounted
and it isn't "in use" when there are _no_ cdrom filesystems
mounted.
- An ethernet module is "in use" while the device is attached to an
IP.
- A packet filter module is "in use" while there are iptables rules
(or similar) existing that require it.
I see the process working along these lines: When a module is loaded
into the kernel it (the module) exports a symbol (a function) that the
kernel can use for determining whether or not the module is still in
use. If no such symbol is exported the kernel would mark the module
as being "non-unloadable" (its refcount would always be non-nil), and
send an appropriate message to root via syslog saying "xyz module
cannot be automatically unloaded".
It should also be possible for modules to tell the kernel, through the
same mechanism, that they should _never_ be unloaded.
I don't, yet, have the knowledge to turn my ideas into code so I won't
be offended in the slightest if you ignore them. Just tell me
why. :-)
> - lack of testing.
A moot point once the kernel can safely and efficiently do module
unloading.
> Unloading a module happens once in a blue moon, if even then.
We get an awful lot of blue moons here.
> The proper thing to do (and what we _have_ done) is to say
> "unloading of modules is not supported".
Wouldn't it be better to say "unloading of modules is currently
discouraged because we haven't _yet_ solved the problems related to
unloading modules".
> But it basically boils down to: don't think of module unload as a "normal
> event". It isn't. Getting it truly right is (a) too painful and (b) would
> be too slow, so we're not even going to try.
Now there's a cop out if ever I saw one. Surely, Linus, you've
overcome _much_ bigger problems than this at different times. Just
because the only solutions that have been thought of so far are hard
to implement, are too inefficient, and aren't very safe doesn't mean
you should give up. It just means that you need a different
solution.
> (As an example of "too painful, too slow", think of something like a
> packet filter module.
See my examples above.
--
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
| Ashes to ashes, dust to dust. |
| The proof of the pudding, is under the crust. |
|------------------------------<sryoungs@bigpond.net.au>---|
[-- Attachment #2: Type: application/pgp-signature, Size: 256 bytes --]
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-25 23:12 ` Steve Youngs
@ 2004-01-26 3:22 ` Adam Kropelin
2004-01-26 5:06 ` Steve Youngs
0 siblings, 1 reply; 38+ messages in thread
From: Adam Kropelin @ 2004-01-26 3:22 UTC (permalink / raw)
To: Linux Kernel List, Linus Torvalds
On Mon, Jan 26, 2004 at 09:12:58AM +1000, Steve Youngs wrote:
> * Linus Torvalds <torvalds@osdl.org> writes:
>
> > - doing proper refcounting of modules is _really_ really
> > hard. The reason is that proper refcounting is a "local"
> > issue: you reference count a single data structure. It's
> > basically impossible to make a "global" reference count
> > without jumping through hoops.
>
> Please understand that I coming from an _extremely_ naive perspective,
> but why do refcounting at all? Couldn't the refcount be a simple
> boolean?
A boolean is just a one-bit reference count. If the maximum number of
simultaneous 'users' for a given module is one, then a boolean will work.
If there is potential for more than one simultaneous user then you need
more bits.
Either way, it doesn't simplify the problem.
> I see the process working along these lines: When a module is loaded
> into the kernel it (the module) exports a symbol (a function) that the
> kernel can use for determining whether or not the module is still in
> use.
And how will the module know when it is or is not "in use"? By keeping
a count of the number of current users, of course.
> > - lack of testing.
>
> A moot point once the kernel can safely and efficiently do module
> unloading.
I don't follow your logic. Once it works we don't have to test it so
therefore we never need to test it? I don't buy the premise or the
conclusion.
> > Unloading a module happens once in a blue moon, if even then.
>
> We get an awful lot of blue moons here.
This moon's not worth barking at.
> > But it basically boils down to: don't think of module unload as a "normal
> > event". It isn't. Getting it truly right is (a) too painful and (b) would
> > be too slow, so we're not even going to try.
>
> Now there's a cop out if ever I saw one. Surely, Linus, you've
> overcome _much_ bigger problems than this at different times.
Linus can of course speak for himself but from my perspective it's just
a simple cost/benefit analysis. This one's just not worth any more toil.
Several extremely bright people have tackled the problem and eventually
concluded it's extremely hard to solve and generally not worth the
trouble. It's time to let go.
--Adam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 3:22 ` Adam Kropelin
@ 2004-01-26 5:06 ` Steve Youngs
2004-01-26 5:21 ` Valdis.Kletnieks
2004-01-26 15:50 ` Adam Kropelin
0 siblings, 2 replies; 38+ messages in thread
From: Steve Youngs @ 2004-01-26 5:06 UTC (permalink / raw)
To: Linux Kernel List
[-- Attachment #1: Type: text/plain, Size: 3939 bytes --]
* Adam Kropelin <akropel1@rochester.rr.com> writes:
> On Mon, Jan 26, 2004 at 09:12:58AM +1000, Steve Youngs wrote:
>> * Linus Torvalds <torvalds@osdl.org> writes:
>>
>> > - doing proper refcounting of modules is _really_ really
>> > hard. The reason is that proper refcounting is a "local"
>> > issue: you reference count a single data structure. It's
>> > basically impossible to make a "global" reference count
>> > without jumping through hoops.
>>
>> Please understand that I coming from an _extremely_ naive perspective,
>> but why do refcounting at all? Couldn't the refcount be a simple
>> boolean?
> A boolean is just a one-bit reference count. If the maximum number of
> simultaneous 'users' for a given module is one, then a boolean will work.
> If there is potential for more than one simultaneous user then you need
> more bits.
Why? A module is either being used or it isn't, the number of uses
shouldn't even come into it.
>> I see the process working along these lines: When a module is loaded
>> into the kernel it (the module) exports a symbol (a function) that the
>> kernel can use for determining whether or not the module is still in
>> use.
> And how will the module know when it is or is not "in use"? By keeping
> a count of the number of current users, of course.
No, the number of current users wouldn't have any bearing on it whatsoever.
How each module does it would be up to the module itself. For an ethernet
card it could be while the card is associated with a IP; for a USB keyboard
it could be while the keyboard is plugged in; for a sound driver it could
be while anything is accessing the sound devices. etc etc.
I'm suggesting that the responsibility for determining when it is safe
to unload a particular module should lay with the module itself and
not with the kernel.
>> > - lack of testing.
>>
>> A moot point once the kernel can safely and efficiently do module
>> unloading.
> I don't follow your logic. Once it works we don't have to test it so
> therefore we never need to test it?
Possibly a poor choice of words on my part. What I meant was that
once the functionality goes into the kernel testing will happen on
every single Linux box in the land that has this future kernel. Some
of those users will report bugs if there are any. And some of those
users may even help to fix those bugs.
Also what I meant is that you can't test something that doesn't
exist.
>> > Unloading a module happens once in a blue moon, if even then.
>>
>> We get an awful lot of blue moons here.
> This moon's not worth barking at.
There are an awful lot of users out there who would disagree with you
on that. _One_ of the reasons that I believe that this moon is worth
barking at is:
If a module should never, in the normal course of events, be unloaded,
then there isn't _any_ point to being able to load them in the first
place. Not being able to unload them _totally_ defeats the purpose of
modules.
Yes, I could be wrong, I sure as anything don't know the full story,
but I'm not convinced that I am wrong yet.
> Several extremely bright people have tackled the problem and eventually
> concluded it's extremely hard to solve and generally not worth the
> trouble. It's time to let go.
Several extremely bright people thought Galileo was a heretic and a
fool.
Tell me that this problem is _impossible_ to solve and providing you
can show me _why_ it is impossible I'll speak no more on this
matter.
But if you tell me that this problem is too hard to solve, then I have
no alternative than to think: "you're not trying hard enough".
--
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
| Ashes to ashes, dust to dust. |
| The proof of the pudding, is under the crust. |
|------------------------------<sryoungs@bigpond.net.au>---|
[-- Attachment #2: Type: application/pgp-signature, Size: 256 bytes --]
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 5:06 ` Steve Youngs
@ 2004-01-26 5:21 ` Valdis.Kletnieks
2004-01-26 5:55 ` Steve Youngs
2004-01-26 15:50 ` Adam Kropelin
1 sibling, 1 reply; 38+ messages in thread
From: Valdis.Kletnieks @ 2004-01-26 5:21 UTC (permalink / raw)
To: Steve Youngs; +Cc: Linux Kernel List
[-- Attachment #1: Type: text/plain, Size: 656 bytes --]
On Mon, 26 Jan 2004 15:06:48 +1000, Steve Youngs <sryoungs@bigpond.net.au> said:
> > A boolean is just a one-bit reference count. If the maximum number of
> > simultaneous 'users' for a given module is one, then a boolean will work.
> > If there is potential for more than one simultaneous user then you need
> > more bits.
>
> Why? A module is either being used or it isn't, the number of uses
> shouldn't even come into it.
OK. There's 2 users of the module. The first one exits. How does it (or
anything else) know that it's NOT safe to just clear the in-use bit and clean
it up?
And how does the second one know it IS safe to clean up?
[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 5:21 ` Valdis.Kletnieks
@ 2004-01-26 5:55 ` Steve Youngs
2004-01-26 6:25 ` Valdis.Kletnieks
2004-01-26 8:48 ` Helge Hafting
0 siblings, 2 replies; 38+ messages in thread
From: Steve Youngs @ 2004-01-26 5:55 UTC (permalink / raw)
To: Linux Kernel List
[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]
* Valdis Kletnieks <Valdis.Kletnieks@vt.edu> writes:
> On Mon, 26 Jan 2004 15:06:48 +1000, Steve Youngs <sryoungs@bigpond.net.au> said:
>> > A boolean is just a one-bit reference count. If the maximum number of
>> > simultaneous 'users' for a given module is one, then a boolean will work.
>> > If there is potential for more than one simultaneous user then you need
>> > more bits.
>>
>> Why? A module is either being used or it isn't, the number of uses
>> shouldn't even come into it.
> OK. There's 2 users of the module. The first one exits. How does
> it (or anything else) know that it's NOT safe to just clear the
> in-use bit and clean it up?
Because the 2nd user is still using the module so its in-use bit
should still be set. Remember that when the module was first loaded
it registered a function with the kernel for testing whether the
module is in use.
I must be overlooking something because I see the answer so clearly.
Maybe if someone could give me a real world example of a situation
where it'd be hard/impossible/unsafe to unload a module and I'll see
if my ideas can be applied.
--
|---<Steve Youngs>---------------<GnuPG KeyID: A94B3003>---|
| Ashes to ashes, dust to dust. |
| The proof of the pudding, is under the crust. |
|------------------------------<sryoungs@bigpond.net.au>---|
[-- Attachment #2: Type: application/pgp-signature, Size: 256 bytes --]
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 5:55 ` Steve Youngs
@ 2004-01-26 6:25 ` Valdis.Kletnieks
2004-01-26 8:48 ` Helge Hafting
1 sibling, 0 replies; 38+ messages in thread
From: Valdis.Kletnieks @ 2004-01-26 6:25 UTC (permalink / raw)
To: Steve Youngs; +Cc: Linux Kernel List
[-- Attachment #1: Type: text/plain, Size: 487 bytes --]
On Mon, 26 Jan 2004 15:55:06 +1000, Steve Youngs <sryoungs@bigpond.net.au> said:
> Because the 2nd user is still using the module so its in-use bit
> should still be set. Remember that when the module was first loaded
> it registered a function with the kernel for testing whether the
> module is in use.
Anybody who's ever been in a bathroom stall and somebody turned off the
lights on their way out will intuitively understand why you need a reference
count and not an in-use bit,
[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 5:55 ` Steve Youngs
2004-01-26 6:25 ` Valdis.Kletnieks
@ 2004-01-26 8:48 ` Helge Hafting
1 sibling, 0 replies; 38+ messages in thread
From: Helge Hafting @ 2004-01-26 8:48 UTC (permalink / raw)
To: Steve Youngs; +Cc: Linux Kernel List
Steve Youngs wrote:
> * Valdis Kletnieks <Valdis.Kletnieks@vt.edu> writes:
>
> > On Mon, 26 Jan 2004 15:06:48 +1000, Steve Youngs <sryoungs@bigpond.net.au> said:
> >> > A boolean is just a one-bit reference count. If the maximum number of
> >> > simultaneous 'users' for a given module is one, then a boolean will work.
> >> > If there is potential for more than one simultaneous user then you need
> >> > more bits.
> >>
> >> Why? A module is either being used or it isn't, the number of uses
> >> shouldn't even come into it.
>
> > OK. There's 2 users of the module. The first one exits. How does
> > it (or anything else) know that it's NOT safe to just clear the
> > in-use bit and clean it up?
>
> Because the 2nd user is still using the module so its in-use bit
> should still be set. Remember that when the module was first loaded
> it registered a function with the kernel for testing whether the
> module is in use.
Are you talking about an in-use bit _per module_ or an in-use bit _per user_ ???
Either way has some problems:
In-use bit per module:
This is what everybody thought yopu were talking about up until now.
The problem above is real: There are 2 (or more) users. One exits.
How does the code know that it can't clear the module's in-use bit?
This one user doesn't know about the other users - how could it?
And there is no "number of users" anywhere because you said you
didn't want that. (The number would be the refcount you tries to get rid of)
Problems, and you haven't provided a solution yet.
Possible solution: Let each user know about all the others so it
can check wether it is ok to clear the in-use bit. This is unworkable!
In-use bit per user:
Your comment about the second users in-use bit being set seems to imply
that you want an in-use bit per module user. (instead of per module).
The rule now becomes "no unloading as long as there is at least one
in-use bit set for this module." There may be several such bits.
This seems straightforward enough until you try to manage such
a set of bits. I might try to unload a module: The unload code checks
the first user bit - it is not set. So that user isn't using the module
now. Fine. Check the next bit - not in use there either. And so on.
It is a lengthy procedure, which is dangerous. Think SMP or preempt:
The unload function are checking the last in-use bits (after first checking the first ones
and finding them all zeroed), and then the first
module user sets his bit because he starts using the module again.
This is what we call a "race". The module unloader comes to the wrong
conclusion - that the module isn't in use - because a user came in and
used it _while_ the bits were being counted.
Solutions:
1. Use locks for manipulating the multiple in-use bits. Too slow.
2. Use refcounting instead. Better, because the count can be changed and tested
atomically, avoiding the above mentioned race _and_ the slow locking.
There are some cases where even this is too slow, but such a module could
be made "not unloadable". To take Linus' example of a packet filter
module: It cannot be unloaded on its own because an interrupt can
make use of it anytime, making refcounting too expensive.
Unloading it is possible however, as part
of unloading the entire TCP/IP module as a whole. You may then
reload tcp/ip without packet filtering if you so wish...
>
> I must be overlooking something
Sure.
> because I see the answer so clearly.
> Maybe if someone could give me a real world example of a situation
> where it'd be hard/impossible/unsafe to unload a module and I'll see
> if my ideas can be applied.
Try the above. Or look at actual code people have problems with. Run tests
(on SMP) and get some surprises. Basically, the surprises revolve
around several things happening simultaneously.
Note that many modules (but not all of the existing ones) _can_ be made safely unloadable.
Those that can do it simply don't bother - because they have more important
and/or interesting work to do. Feel free to volunteer for fixing things . . .
Helge Hafting
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 5:06 ` Steve Youngs
2004-01-26 5:21 ` Valdis.Kletnieks
@ 2004-01-26 15:50 ` Adam Kropelin
1 sibling, 0 replies; 38+ messages in thread
From: Adam Kropelin @ 2004-01-26 15:50 UTC (permalink / raw)
To: Linux Kernel List; +Cc: Steve Youngs
On Mon, Jan 26, 2004 at 03:06:48PM +1000, Steve Youngs wrote:
> * Adam Kropelin <akropel1@rochester.rr.com> writes:
>
> > On Mon, Jan 26, 2004 at 09:12:58AM +1000, Steve Youngs wrote:
> >> * Linus Torvalds <torvalds@osdl.org> writes:
> >>
> >> > - doing proper refcounting of modules is _really_ really
> >> > hard. The reason is that proper refcounting is a "local"
> >>
> >> Please understand that I coming from an _extremely_ naive perspective,
> >> but why do refcounting at all? Couldn't the refcount be a simple
> >> boolean?
>
> > A boolean is just a one-bit reference count. If the maximum number of
> > simultaneous 'users' for a given module is one, then a boolean will work.
> > If there is potential for more than one simultaneous user then you need
> > more bits.
>
> Why? A module is either being used or it isn't, the number of uses
> shouldn't even come into it.
I think others in the thread have adequately explained the details
you're missing.
> I'm suggesting that the responsibility for determining when it is safe
> to unload a particular module should lay with the module itself and
> not with the kernel.
That does not change anything at all. The same problems apply and the
same pool of solutions is still available. The easy cases are still easy
and the hard cases are still *really* hard.
> >> > - lack of testing.
> >>
> >> A moot point once the kernel can safely and efficiently do module
> >> unloading.
>
> > I don't follow your logic. Once it works we don't have to test it so
> > therefore we never need to test it?
>
> Possibly a poor choice of words on my part. What I meant was that
> once the functionality goes into the kernel testing will happen on
> every single Linux box in the land that has this future kernel. Some
> of those users will report bugs if there are any. And some of those
> users may even help to fix those bugs.
Hello? 2.4, etc. support removing modules. Linus was speaking from
experience. One of the reasons module removal is perpetually broken in
subtle ways on those kernels is that it simply does not receive enough
testing. Having some new implementation on 2.6 doesn't change that.
> Also what I meant is that you can't test something that doesn't
> exist.
As I pointed out above, it does exist.
> >> We get an awful lot of blue moons here.
>
> > This moon's not worth barking at.
>
> There are an awful lot of users out there who would disagree with you
> on that. _One_ of the reasons that I believe that this moon is worth
> barking at is:
>
> If a module should never, in the normal course of events, be unloaded,
> then there isn't _any_ point to being able to load them in the first
> place. Not being able to unload them _totally_ defeats the purpose of
> modules.
Think a little harder and you'll see why that's a ridiculous conclusion.
Hint #1: Distributions. Hint #2: SILO.
> Tell me that this problem is _impossible_ to solve and providing you
> can show me _why_ it is impossible I'll speak no more on this
> matter.
No one yet has claimed this is "impossible" to solve, only that it's not
worth solving. Clearly it's important to you, so have at it. Further
discussion of how you "see the answer so clearly" and the rest of us are
"not trying hard enough" should come in the form of patches.
--Adam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-25 19:02 ` Linus Torvalds
2004-01-25 20:21 ` viro
2004-01-25 23:12 ` Steve Youngs
@ 2004-01-26 16:22 ` Roman Zippel
2004-01-27 19:32 ` Russell King
2004-01-27 20:29 ` Greg KH
2004-01-27 6:41 ` Rusty Russell
3 siblings, 2 replies; 38+ messages in thread
From: Roman Zippel @ 2004-01-26 16:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Alan Stern, Greg KH, Kernel development list, Patrick Mochel
Hi,
On Sun, 25 Jan 2004, Linus Torvalds wrote:
> - doing proper refcounting of modules is _really_ really hard. The reason
> is that proper refcounting is a "local" issue: you reference count a
> single data structure. It's basically impossible to make a "global"
> reference count without jumping through hoops.
IOW module reference counts are evil, no matter what you do it will always
add more overhead. I fully agree with Al, once we have proper per object
protection it's rather easy to build the module infrastructure on top of
it. I only want to add a bit what it means practically.
For example pci drivers currently do something like:
int init()
{
if (pci_register_driver(drv) < 0)
pci_unregister_driver(drv);
}
void exit()
{
pci_unregister_driver(drv);
}
All this is done without a module count, this means that
pci_unregister_driver() cannot return before the last reference is gone.
For network devices this is not that much of a problem, as they can be
rather easily deconfigured automatically, but that's not that easy for
mounted block devices, so one has to be damned careful when to call the
exit function.
To prevent module unloading we usually added a reference to the module, so
the generic module code knows, when it's safe to unload the module. OTOH
with the driver object we can easily tell without any additional overhead
whether the driver is busy, we only have to find a way to tell that the
generic module code.
Technically there are a number of ways to do this, but practically we have
to decide which module unload semantics we want to have and how easy it
should be to upgrade the drivers to this.
We are now in the situation, where we should decide whether we want to
continue the status quo and module unloading stays a PITA or we fix it
properly once and for all, but that requires changing every single driver.
For a large number of drivers (all which Al mentioned) the updates should
be rather staightforward, the rest would simply be not unloadable, unless
they fix their interfaces.
bye, Roman
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 16:22 ` Roman Zippel
@ 2004-01-27 19:32 ` Russell King
2004-01-27 20:28 ` Greg KH
2004-01-27 20:29 ` Greg KH
1 sibling, 1 reply; 38+ messages in thread
From: Russell King @ 2004-01-27 19:32 UTC (permalink / raw)
To: Roman Zippel
Cc: Linus Torvalds, Alan Stern, Greg KH, Kernel development list,
Patrick Mochel
On Mon, Jan 26, 2004 at 05:22:41PM +0100, Roman Zippel wrote:
> For example pci drivers currently do something like:
>
> int init()
> {
> if (pci_register_driver(drv) < 0)
> pci_unregister_driver(drv);
> }
>
> void exit()
> {
> pci_unregister_driver(drv);
> }
I'd like to take this opportunity to mention that the above is buggy
as written. If pci_register_driver() fails, the device_driver structure
is not registered, and therefore pci_unregister_driver() may cause
Bad Things(tm) to happen.
(and yes, pci_module_init() is buggy as it currently stands, and I
believe GregKH has a patch in his queue from the stability freeze
from yours truely to fix it.)
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-27 19:32 ` Russell King
@ 2004-01-27 20:28 ` Greg KH
0 siblings, 0 replies; 38+ messages in thread
From: Greg KH @ 2004-01-27 20:28 UTC (permalink / raw)
To: Roman Zippel, Linus Torvalds, Alan Stern,
Kernel development list, Patrick Mochel
On Tue, Jan 27, 2004 at 07:32:28PM +0000, Russell King wrote:
>
> (and yes, pci_module_init() is buggy as it currently stands, and I
> believe GregKH has a patch in his queue from the stability freeze
> from yours truely to fix it.)
Yes I still have it, I need to dig that out and send it onward... Sorry
about the delay.
greg k-h
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-26 16:22 ` Roman Zippel
2004-01-27 19:32 ` Russell King
@ 2004-01-27 20:29 ` Greg KH
2004-01-28 2:03 ` Roman Zippel
1 sibling, 1 reply; 38+ messages in thread
From: Greg KH @ 2004-01-27 20:29 UTC (permalink / raw)
To: Roman Zippel
Cc: Linus Torvalds, Alan Stern, Kernel development list, Patrick Mochel
On Mon, Jan 26, 2004 at 05:22:41PM +0100, Roman Zippel wrote:
>
> All this is done without a module count, this means that
> pci_unregister_driver() cannot return before the last reference is gone.
> For network devices this is not that much of a problem, as they can be
> rather easily deconfigured automatically, but that's not that easy for
> mounted block devices, so one has to be damned careful when to call the
> exit function.
Um, not anymore. I can yank out a mounted block device and watch the
scsi core recover just fine. The need to make everything hotpluggable
has fixed up a lot of issues like this (as well as made more
problems...)
thanks,
greg k-h
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-27 20:29 ` Greg KH
@ 2004-01-28 2:03 ` Roman Zippel
2004-01-28 2:17 ` viro
0 siblings, 1 reply; 38+ messages in thread
From: Roman Zippel @ 2004-01-28 2:03 UTC (permalink / raw)
To: Greg KH
Cc: Linus Torvalds, Alan Stern, Kernel development list, Patrick Mochel
Hi,
On Tue, 27 Jan 2004, Greg KH wrote:
> > All this is done without a module count, this means that
> > pci_unregister_driver() cannot return before the last reference is gone.
> > For network devices this is not that much of a problem, as they can be
> > rather easily deconfigured automatically, but that's not that easy for
> > mounted block devices, so one has to be damned careful when to call the
> > exit function.
>
> Um, not anymore. I can yank out a mounted block device and watch the
> scsi core recover just fine. The need to make everything hotpluggable
> has fixed up a lot of issues like this (as well as made more
> problems...)
Recovery of the scsi core is IMO one the smallest problems, but how do you
recover at the block layer? The point is that you have here theoretically
more one recovery strategy, but simply pulling out the module leaves you
not much room for a controlled recovery.
bye, Roman
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-28 2:03 ` Roman Zippel
@ 2004-01-28 2:17 ` viro
2004-01-28 2:53 ` Roman Zippel
0 siblings, 1 reply; 38+ messages in thread
From: viro @ 2004-01-28 2:17 UTC (permalink / raw)
To: Roman Zippel
Cc: Greg KH, Linus Torvalds, Alan Stern, Kernel development list,
Patrick Mochel
On Wed, Jan 28, 2004 at 03:03:31AM +0100, Roman Zippel wrote:
> Hi,
>
> On Tue, 27 Jan 2004, Greg KH wrote:
>
> > > All this is done without a module count, this means that
> > > pci_unregister_driver() cannot return before the last reference is gone.
> > > For network devices this is not that much of a problem, as they can be
> > > rather easily deconfigured automatically, but that's not that easy for
> > > mounted block devices, so one has to be damned careful when to call the
> > > exit function.
> >
> > Um, not anymore. I can yank out a mounted block device and watch the
> > scsi core recover just fine. The need to make everything hotpluggable
> > has fixed up a lot of issues like this (as well as made more
> > problems...)
>
> Recovery of the scsi core is IMO one the smallest problems, but how do you
> recover at the block layer? The point is that you have here theoretically
> more one recovery strategy, but simply pulling out the module leaves you
> not much room for a controlled recovery.
Block layer is not too big issue. We have almost everything in the tree
already - the main problem is to get check_disk_change() use regularized.
Now, sound and character devices in general...
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-28 2:17 ` viro
@ 2004-01-28 2:53 ` Roman Zippel
0 siblings, 0 replies; 38+ messages in thread
From: Roman Zippel @ 2004-01-28 2:53 UTC (permalink / raw)
To: viro
Cc: Greg KH, Linus Torvalds, Alan Stern, Kernel development list,
Patrick Mochel
Hi,
On Wed, 28 Jan 2004 viro@parcelfarce.linux.theplanet.co.uk wrote:
> > Recovery of the scsi core is IMO one the smallest problems, but how do you
> > recover at the block layer? The point is that you have here theoretically
> > more one recovery strategy, but simply pulling out the module leaves you
> > not much room for a controlled recovery.
>
> Block layer is not too big issue. We have almost everything in the tree
> already - the main problem is to get check_disk_change() use regularized.
> Now, sound and character devices in general...
Hmm, I more meant "user controlled recovery", the simplest strategy is of
course to throw everything away and that should be indeed not too
difficult, but I don't really think that this is user prefered strategy if
he accidentally unplugs/plugs a device. OTOH that the simple strategy
works reliably is of course a prerequisite to even think about an any more
advanced recovery.
But with the current module infrastructure the user has not much choice
anyway, without any indication of module usage state the user can only
guess what will happen when he tries to unload a module, so that currently
the best advice is indeed: don't do it.
bye, Roman
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: PATCH: (as177) Add class_device_unregister_wait() and platform_device_unregister_wait() to the driver model core
2004-01-25 19:02 ` Linus Torvalds
` (2 preceding siblings ...)
2004-01-26 16:22 ` Roman Zippel
@ 2004-01-27 6:41 ` Rusty Russell
3 siblings, 0 replies; 38+ messages in thread
From: Rusty Russell @ 2004-01-27 6:41 UTC (permalink / raw)
To: Linus Torvalds; +Cc: stern, greg, linux-kernel, mochel
On Sun, 25 Jan 2004 11:02:58 -0800 (PST)
Linus Torvalds <torvalds@osdl.org> wrote:
> On Sun, 25 Jan 2004, Alan Stern wrote:
> >
> > Is there some reason why modules don't work like this?
>
> There's a few:
>
> - pain. pain. pain.
>
> - doing proper refcounting of modules is _really_ really hard. The reason
> is that proper refcounting is a "local" issue: you reference count a
> single data structure. It's basically impossible to make a "global"
> reference count without jumping through hoops.
>
> - lack of testing. Unloading a module happens once in a blue moon, if
> even then.
And modules do work like you proposed, if you use "rmmod --wait".
Doing proper refcounting is actually fairly easy: every function pointer
has an associated reference count (or pointer to the module containing
the refcount).
But how much pain are you prepared to put up with to have a pseudo-pagable
kernel?
> (As an example of "too painful, too slow", think of something like a
> packet filter module. You'd literally have to increment the count in every
> part that gets a packet, and decrement the count at every point where it
> lets the packet go. And since it would have to be SMP-safe, it would have
> to be a locked cycle, or we'd have to have per-CPU counters - at which
> point you now also have to worry about things like preemption and
> sleeping, which just means that it would be a _lot_ of very fragile code).
Actually, this is already handled. The module reference counts are per-cpu
and don't contain any barriers. We go to an *awful* lot of pain on remove
to synchronize, but as Linus says, it's not the normal case.
Since we hit the (atomic_t) ref to the devices on every packet, bumping
the refcount on the module is lost in the noise.
But Dave doesn't want to do it: it makes the code uglier and painful.
Cheers,
Rusty.
--
there are those who do and those who hang on and you don't see too
many doers quoting their contemporaries. -- Larry McVoy
^ permalink raw reply [flat|nested] 38+ messages in thread