linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [OKS] Module removal
@ 2002-07-01 17:48 Bill Davidsen
  2002-07-01 18:35 ` Richard B. Johnson
                   ` (5 more replies)
  0 siblings, 6 replies; 83+ messages in thread
From: Bill Davidsen @ 2002-07-01 17:48 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
comments on points raied.

The suggestion was made that kernel module removal be depreciated or
removed. I'd like to note that there are two common uses for this
capability, and the problems addressed by module removal should be kept in
mind. These are in addition to the PCMCIA issue raised.

1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
better (or usefully at all) with one or the other driver used to read CDs.
I've noted that several programs for reading the image from an ISO CD
(readcd and sdd) have end of data problems with the SCSI interface.

2 - restarting NICs when total reinitialization is needed. In server
applications it's sometimes necessary to move or clear a NIC connection,
force renegotiation because the blade on the switch was set wrong, etc.
It's preferable to take down one NIC for a moment than suffer a full
outage via reboot.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 17:48 [OKS] Module removal Bill Davidsen
@ 2002-07-01 18:35 ` Richard B. Johnson
  2002-07-01 18:42 ` Jose Luis Domingo Lopez
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 83+ messages in thread
From: Richard B. Johnson @ 2002-07-01 18:35 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux-Kernel Mailing List

On Mon, 1 Jul 2002, Bill Davidsen wrote:

> Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
> comments on points raied.
> 
> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
> 

One of the best features of Linux is the ability to install and remove
modules. With this capability, designing drivers in Linux is much easier
than, for instance, Sun or NT. With Linux, one can make and test
individual portions of drivers much like writing and testing individual
procedures for user-mode code. As long as the programmer doesn't do
something that destroys the kernel, one can remove, code more, install,
then continue until done.

If the ability to remove modules is eliminated, we are degenerating
the kernel. I would have to write code that assumes that everything
works, i.e., a complete module, then throw it off the cliff to see if
it flies. I know this is the way many of the recent grads do it, but
they get promoted to management long before their code is tested and
I end up having to fix their stuff. So please don't eliminate module-
removal capability.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 17:48 [OKS] Module removal Bill Davidsen
  2002-07-01 18:35 ` Richard B. Johnson
@ 2002-07-01 18:42 ` Jose Luis Domingo Lopez
  2002-07-01 18:45   ` Shawn
  2002-07-01 19:57 ` Diego Calleja
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 83+ messages in thread
From: Jose Luis Domingo Lopez @ 2002-07-01 18:42 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

On Monday, 01 July 2002, at 13:48:55 -0400,
Bill Davidsen wrote:

> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
> 
>From my non-kernel non-programmer point of view, module removal can be
useful under more circunstances than the ones you said. For example,
trying some combination of parameters for a module to get you damned
piece of hardware working, and having to reboot each time you want to
pass it a different set of parameters, doesn't seem reasonable to me.
Examples such as network cards (as Bill said), your nice TV capture
card, or just setting a "debug=1" for a module that seems not to be
working OK.

Except there was a way to pass parameters to modules once loaded, and
have them "reconfigure" themselves for the new parameters.

Just the opinion of a Linux user :)

-- 
Jose Luis Domingo Lopez
Linux Registered User #189436     Debian Linux Woody (Linux 2.4.19-pre6aa1)

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 18:42 ` Jose Luis Domingo Lopez
@ 2002-07-01 18:45   ` Shawn
  0 siblings, 0 replies; 83+ messages in thread
From: Shawn @ 2002-07-01 18:45 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

On 07/01, Jose Luis Domingo Lopez said something like:
> On Monday, 01 July 2002, at 13:48:55 -0400,
> Bill Davidsen wrote:
> 
> > The suggestion was made that kernel module removal be depreciated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be kept in
> > mind. These are in addition to the PCMCIA issue raised.
> > 
> From my non-kernel non-programmer point of view, module removal can be
> useful under more circunstances than the ones you said. For example,
> trying some combination of parameters for a module to get you damned

How about simply upgrading a driver w/out rebooting?

--
Shawn Leas
core@enodev.com

I planted some bird seed.  A bird came up.  Now I don't know
what to feed it.
						-- Stephen Wright

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 17:48 [OKS] Module removal Bill Davidsen
  2002-07-01 18:35 ` Richard B. Johnson
  2002-07-01 18:42 ` Jose Luis Domingo Lopez
@ 2002-07-01 19:57 ` Diego Calleja
  2002-07-01 20:03   ` Diego Calleja
                     ` (2 more replies)
  2002-07-02 11:37 ` Stephen C. Tweedie
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 83+ messages in thread
From: Diego Calleja @ 2002-07-01 19:57 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-kernel

On Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
Bill Davidsen <davidsen@tmr.com> escribió:

> Having read some notes on the Ottawa Kernel Summit, I'd like to offer
> some comments on points raied.
> 
> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be
> kept in mind. These are in addition to the PCMCIA issue raised.

And why people wants to remove this nice feature? Only because they
don't use it, or there's a more profund reason?

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 19:57 ` Diego Calleja
@ 2002-07-01 20:03   ` Diego Calleja
  2002-07-01 22:20   ` Jose Luis Domingo Lopez
  2002-07-01 22:56   ` Ryan Anderson
  2 siblings, 0 replies; 83+ messages in thread
From: Diego Calleja @ 2002-07-01 20:03 UTC (permalink / raw)
  To: Diego Calleja; +Cc: davidsen, linux-kernel

On Mon, 1 Jul 2002 21:57:18 +0200
Diego Calleja <diegocg@teleline.es> escribió:

> On Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
> Bill Davidsen <davidsen@tmr.com> escribió:
> 
> > Having read some notes on the Ottawa Kernel Summit, I'd like to
> > offer some comments on points raied.
> > 
> > The suggestion was made that kernel module removal be depreciated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be
> > kept in mind. These are in addition to the PCMCIA issue raised.
> 
> And why people wants to remove this nice feature? Only because they
> don't use it, or there's a more profund reason?

profund->deep (sorry, not all the world speak english ;) )

> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 19:57 ` Diego Calleja
  2002-07-01 20:03   ` Diego Calleja
@ 2002-07-01 22:20   ` Jose Luis Domingo Lopez
  2002-07-01 22:56   ` Ryan Anderson
  2 siblings, 0 replies; 83+ messages in thread
From: Jose Luis Domingo Lopez @ 2002-07-01 22:20 UTC (permalink / raw)
  To: linux-kernel

On Monday, 01 July 2002, at 13:45:28 -0500,
Diego Calleja wrote:

> And why people wants to remove this nice feature? Only because they
> don't use it, or there's a more profund reason?
> 
This was discussed in the recent Linux Kernel Summit. Read:
http://lwn.net/Articles/3327/

for more information on the subject.

-- 
José Luis Domingo López
Linux Registered User #189436     Debian Linux Woody (Linux 2.4.19-pre6aa1)
 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 19:57 ` Diego Calleja
  2002-07-01 20:03   ` Diego Calleja
  2002-07-01 22:20   ` Jose Luis Domingo Lopez
@ 2002-07-01 22:56   ` Ryan Anderson
  2 siblings, 0 replies; 83+ messages in thread
From: Ryan Anderson @ 2002-07-01 22:56 UTC (permalink / raw)
  To: linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN, Size: 1047 bytes --]

On Mon, 1 Jul 2002, Diego Calleja wrote:

> On Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
> Bill Davidsen <davidsen@tmr.com> escribió:
> 
> > Having read some notes on the Ottawa Kernel Summit, I'd like to offer
> > some comments on points raied.
> > 
> > The suggestion was made that kernel module removal be depreciated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be
> > kept in mind. These are in addition to the PCMCIA issue raised.
> 
> And why people wants to remove this nice feature? Only because they
> don't use it, or there's a more profund reason?

Summary from listening to the mp3: It has issues with races that were
only made worse by preempt.

The alternative solution proposed (I forget who mentioned this, though.)
was to allow loading a second module on top of the first.

Those who spoke at the summit will probably want to have the "300 email flame war"
here before this spirals too far. :)



--
Ryan Anderson
  sometimes Pug Majere


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 17:48 [OKS] Module removal Bill Davidsen
                   ` (2 preceding siblings ...)
  2002-07-01 19:57 ` Diego Calleja
@ 2002-07-02 11:37 ` Stephen C. Tweedie
  2002-07-02 12:04   ` Richard B. Johnson
                     ` (2 more replies)
  2002-07-03  0:09 ` Vojtech Pavlik
  2002-07-12 21:51 ` David Lang
  5 siblings, 3 replies; 83+ messages in thread
From: Stephen C. Tweedie @ 2002-07-02 11:37 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux-Kernel Mailing List, Stephen Tweedie

Hi,

> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
 
> 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> better (or usefully at all) with one or the other driver used to read CDs.

The proposal was to deprecate module removal, *not* to deprecate
dynamic registration of devices.  If you want to maintain the above
functionality, then there's no reason why a device can't be
deregistered from one driver and reregistered on another while both
drivers are present.  Note that the scsi stack already allows you to
dynamically register and deregister specific targets on the fly.

> 2 - restarting NICs when total reinitialization is needed. In server
> applications it's sometimes necessary to move or clear a NIC connection,
> force renegotiation because the blade on the switch was set wrong, etc.
> It's preferable to take down one NIC for a moment than suffer a full
> outage via reboot.

Again, you might want to do this even with a non-modular driver, or if
you had one module driving two separate NICs --- the shutdown of one
card shouldn't necessarily require the removal of the module code from
the kernel, which is all Rusty was talking about doing.

Cheers,
 Stephen

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 11:37 ` Stephen C. Tweedie
@ 2002-07-02 12:04   ` Richard B. Johnson
  2002-07-02 13:13     ` jlnance
  2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
  2002-07-02 15:20   ` Bill Davidsen
       [not found]   ` <Pine.LNX.3.95.1020702075957.24872A-100000@chaos.analogic.c om>
  2 siblings, 2 replies; 83+ messages in thread
From: Richard B. Johnson @ 2002-07-02 12:04 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Bill Davidsen, Linux-Kernel Mailing List

On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:

> Hi,
> 
> > The suggestion was made that kernel module removal be depreciated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be kept in
> > mind. These are in addition to the PCMCIA issue raised.
>  
> > 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> > better (or usefully at all) with one or the other driver used to read CDs.
> 
> The proposal was to deprecate module removal, *not* to deprecate
> dynamic registration of devices.  If you want to maintain the above
> functionality, then there's no reason why a device can't be
> deregistered from one driver and reregistered on another while both
> drivers are present.  Note that the scsi stack already allows you to
> dynamically register and deregister specific targets on the fly.

As I am led to understand from reading this thread, there is some
known bug caused by module removal. Therefore the "solution" is to
remove module removal capability.

This is absurd. Next, somebody will remove network capability because
there's some bug in it.  Hello there........?  Are there any carbon-
based life-forms out there?

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 12:04   ` Richard B. Johnson
@ 2002-07-02 13:13     ` jlnance
  2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
  1 sibling, 0 replies; 83+ messages in thread
From: jlnance @ 2002-07-02 13:13 UTC (permalink / raw)
  To: linux-kernel

On Tue, Jul 02, 2002 at 08:04:14AM -0400, Richard B. Johnson wrote:

> As I am led to understand from reading this thread, there is some
> known bug caused by module removal. Therefore the "solution" is to
> remove module removal capability.
> 
> This is absurd. Next, somebody will remove network capability because
> there's some bug in it.  Hello there........?  Are there any carbon-
> based life-forms out there?

I may be misundersting things, but I dont think anyone is proposing
removing the module unloading capability.  I think what people are
saying is that the bugs are hard to trigger and fixing them introduces
performance problems so its better to tell people not to unload modules.
I dont think anyone is proposing preventing you from ignoring this advise.

I dont know this for sure, so I would love someone to make a definitive
comment.

Thanks,

Jim

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 11:37 ` Stephen C. Tweedie
  2002-07-02 12:04   ` Richard B. Johnson
@ 2002-07-02 15:20   ` Bill Davidsen
  2002-07-02 15:53     ` Jonathan Corbet
  2002-07-02 16:08     ` Werner Almesberger
       [not found]   ` <Pine.LNX.3.95.1020702075957.24872A-100000@chaos.analogic.c om>
  2 siblings, 2 replies; 83+ messages in thread
From: Bill Davidsen @ 2002-07-02 15:20 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Linux-Kernel Mailing List

On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:

> > 2 - restarting NICs when total reinitialization is needed. In server
> > applications it's sometimes necessary to move or clear a NIC connection,
> > force renegotiation because the blade on the switch was set wrong, etc.
> > It's preferable to take down one NIC for a moment than suffer a full
> > outage via reboot.
> 
> Again, you might want to do this even with a non-modular driver, or if
> you had one module driving two separate NICs --- the shutdown of one
> card shouldn't necessarily require the removal of the module code from
> the kernel, which is all Rusty was talking about doing.

Then you need a new ioctl to get the driver to go through the
initialization all over again, because the only time the cards are fully
probed is when the module is loaded. You can ifconfig up and down all day
and nothing improves. Oh, and reload resets the counts in the driver, that
sometimes very desirable as well.

Not that it can't be done, just that it works now, and reinventing modules
without remove seems a lot of work when we can just have some broken
modules which don't remove.

Also, as someone mentioned, it means a reboot every time you need to try
something new while doing module development. That doesn't sound like a
great idea...

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 15:20   ` Bill Davidsen
@ 2002-07-02 15:53     ` Jonathan Corbet
  2002-07-02 16:07       ` Oliver Neukum
  2002-07-02 16:08     ` Werner Almesberger
  1 sibling, 1 reply; 83+ messages in thread
From: Jonathan Corbet @ 2002-07-02 15:53 UTC (permalink / raw)
  To: linux-kernel

> Also, as someone mentioned, it means a reboot every time you need to try
> something new while doing module development. That doesn't sound like a
> great idea...

This is a misconception I've seen go by a few times now; maybe I wrote up
that session badly.  Anyway...

...the module remove option, under the proposal, would not disappear
entirely.  It would just no longer be represented as safe.  Removable
modules would be a "kernel hacking" option, still available to people
developing drivers and such.  Aunt Tillie would no longer be able to remove
modules from her kernel, but that's not likely to bother her too much...

jon

Jonathan Corbet
Executive editor, LWN.net
corbet@lwn.net

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 15:53     ` Jonathan Corbet
@ 2002-07-02 16:07       ` Oliver Neukum
  2002-07-02 17:48         ` Tom Rini
  0 siblings, 1 reply; 83+ messages in thread
From: Oliver Neukum @ 2002-07-02 16:07 UTC (permalink / raw)
  To: Jonathan Corbet, linux-kernel


> developing drivers and such.  Aunt Tillie would no longer be able to
> remove modules from her kernel, but that's not likely to bother her too
> much...

It would very much bother uncle John, who is in high availability.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 15:20   ` Bill Davidsen
  2002-07-02 15:53     ` Jonathan Corbet
@ 2002-07-02 16:08     ` Werner Almesberger
  1 sibling, 0 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-02 16:08 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Stephen C. Tweedie, Linux-Kernel Mailing List

Bill Davidsen wrote:
> On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:
> > Again, you might want to do this even with a non-modular driver, or if
> > you had one module driving two separate NICs --- the shutdown of one
> > card shouldn't necessarily require the removal of the module code from
> > the kernel, which is all Rusty was talking about doing.
[...]
> Also, as someone mentioned, it means a reboot every time you need to try
> something new while doing module development. That doesn't sound like a
> great idea...

They key phrase is "removal of the module code".
                                          ====
The proposal was to leave the code in the kernel, but to drop all
references such that it would not interfere with new versions of
the same module. The issue is strictly making sure *something* is
there, if an after-removal reference happens.

As I wrote in the other thread on this topic, it seems that only
the "return" case is truly module-specific. Since that one could
probably be fixed by other means, I don't quite see what not
freeing the memory area occupied by a module would really buy.

On the other hand, if there are cases where other after-removal
references can happen, this would also break other areas of the
kernel, and should be fixed no matter what happens with modules.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 16:07       ` Oliver Neukum
@ 2002-07-02 17:48         ` Tom Rini
  2002-07-02 18:10           ` Oliver Neukum
  0 siblings, 1 reply; 83+ messages in thread
From: Tom Rini @ 2002-07-02 17:48 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Jonathan Corbet, linux-kernel

On Tue, Jul 02, 2002 at 06:07:06PM +0200, Oliver Neukum wrote:

> > developing drivers and such.  Aunt Tillie would no longer be able to
> > remove modules from her kernel, but that's not likely to bother her too
> > much...
> 
> It would very much bother uncle John, who is in high availability.

Then the HA kernel turns on the ability to still remove modules, along
with all of the other things needed in an HA environment but not
elsewhere.  Provided removing a module doesn't become a horribly racy,
barely usable bit of functionality, which I hope it won't.

-- 
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 17:48         ` Tom Rini
@ 2002-07-02 18:10           ` Oliver Neukum
  2002-07-02 21:50             ` Ryan Anderson
  0 siblings, 1 reply; 83+ messages in thread
From: Oliver Neukum @ 2002-07-02 18:10 UTC (permalink / raw)
  To: Tom Rini; +Cc: Jonathan Corbet, linux-kernel

Am Dienstag, 2. Juli 2002 19:48 schrieb Tom Rini:
> On Tue, Jul 02, 2002 at 06:07:06PM +0200, Oliver Neukum wrote:
> > > developing drivers and such.  Aunt Tillie would no longer be able to
> > > remove modules from her kernel, but that's not likely to bother her
> > > too much...
> >
> > It would very much bother uncle John, who is in high availability.
>
> Then the HA kernel turns on the ability to still remove modules, along
> with all of the other things needed in an HA environment but not
> elsewhere.  Provided removing a module doesn't become a horribly racy,
> barely usable bit of functionality, which I hope it won't.

Either there is a race or there isn't. Such a thing is unusable on a
production system.

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 18:10           ` Oliver Neukum
@ 2002-07-02 21:50             ` Ryan Anderson
  2002-07-03 22:26               ` Diego Calleja
  2002-07-04  8:04               ` Helge Hafting
  0 siblings, 2 replies; 83+ messages in thread
From: Ryan Anderson @ 2002-07-02 21:50 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Tom Rini, Jonathan Corbet, linux-kernel

On Tue, 2 Jul 2002, Oliver Neukum wrote:

> Am Dienstag, 2. Juli 2002 19:48 schrieb Tom Rini:
> > On Tue, Jul 02, 2002 at 06:07:06PM +0200, Oliver Neukum wrote:
> > > > developing drivers and such.  Aunt Tillie would no longer be able to
> > > > remove modules from her kernel, but that's not likely to bother her
> > > > too much...
> > >
> > > It would very much bother uncle John, who is in high availability.
> >
> > Then the HA kernel turns on the ability to still remove modules, along
> > with all of the other things needed in an HA environment but not
> > elsewhere.  Provided removing a module doesn't become a horribly racy,
> > barely usable bit of functionality, which I hope it won't.
> 
> Either there is a race or there isn't. Such a thing is unusable on a
> production system.

In a single processor, no preempt kernel, there is no race.
Turn on SMP or preempt and there is one.

Anyway, on a HA machine, how the heck are you going to be removing
modules anyway?  If one of your two identical network cards fails, you
lose your HA status if you need to shut down both to remove the module
and thus be able to remove the second card, right?

I would think the HA guys would be pushing for a status of "modules are
never removed, so design around that" to make their lives easier.  That
would also mean you would have a way to say, "hey driver, rescan for
devices - I think you'll find a new one that you should manage".


--
Ryan Anderson
  sometimes Pug Majere


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 17:48 [OKS] Module removal Bill Davidsen
                   ` (3 preceding siblings ...)
  2002-07-02 11:37 ` Stephen C. Tweedie
@ 2002-07-03  0:09 ` Vojtech Pavlik
  2002-07-12 21:51 ` David Lang
  5 siblings, 0 replies; 83+ messages in thread
From: Vojtech Pavlik @ 2002-07-03  0:09 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux-Kernel Mailing List

On Mon, Jul 01, 2002 at 01:48:55PM -0400, Bill Davidsen wrote:

> Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
> comments on points raied.
> 
> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
> 
> 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> better (or usefully at all) with one or the other driver used to read CDs.
> I've noted that several programs for reading the image from an ISO CD
> (readcd and sdd) have end of data problems with the SCSI interface.

This will go away once the IDE rewrite is finished. There will be only
one way to access an IDE CD. And it'll be fixed so that all programs
work with it. Or the programs get fixed.

> 2 - restarting NICs when total reinitialization is needed. In server
> applications it's sometimes necessary to move or clear a NIC connection,
> force renegotiation because the blade on the switch was set wrong, etc.
> It's preferable to take down one NIC for a moment than suffer a full
> outage via reboot.

This can be solved without module unload as well. I think something
that'll fulfill the reset functionality will grow out of the hotplug/pm
system once it's fully implemented.

-- 
Vojtech Pavlik
SuSE Labs

^ permalink raw reply	[flat|nested] 83+ messages in thread

* simple handling of module removals Re: [OKS] Module removal
  2002-07-02 12:04   ` Richard B. Johnson
  2002-07-02 13:13     ` jlnance
@ 2002-07-03  3:48     ` Pavel Machek
  2002-07-03 17:25       ` Richard B. Johnson
                         ` (3 more replies)
  1 sibling, 4 replies; 83+ messages in thread
From: Pavel Machek @ 2002-07-03  3:48 UTC (permalink / raw)
  To: Richard B. Johnson
  Cc: Stephen C. Tweedie, Bill Davidsen, Linux-Kernel Mailing List

Hi!

Okay. So we want modules and want them unload. And we want it bugfree.

So... then its okay if module unload is *slow*, right?

I believe you can just freeze_processes(), unload module [now its
safe, you *know* noone is using that module, because all processes are
in your refrigerator], thaw_processes().

That's going to take *lot* of time, but should be very simple and very
effective.

								Pavel
-- 
Worst form of spam? Adding advertisment signatures ala sourceforge.net.
What goes next? Inserting advertisment *into* email?

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
@ 2002-07-03 17:25       ` Richard B. Johnson
  2002-07-03 23:46       ` Daniel Phillips
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 83+ messages in thread
From: Richard B. Johnson @ 2002-07-03 17:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Stephen C. Tweedie, Bill Davidsen, Linux-Kernel Mailing List

On Wed, 3 Jul 2002, Pavel Machek wrote:

> Hi!
> 
> Okay. So we want modules and want them unload. And we want it bugfree.
> 
> So... then its okay if module unload is *slow*, right?
> 
> I believe you can just freeze_processes(), unload module [now its
> safe, you *know* noone is using that module, because all processes are
> in your refrigerator], thaw_processes().
> 
> That's going to take *lot* of time, but should be very simple and very
> effective.
> 

Absolutely. Nobody cares if it takes many milliseconds to unload a
module. We just don't want to take the 30 or more seconds necessary to
reboot the machine (3 minutes on some embedded '486es with network boots).


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 21:50             ` Ryan Anderson
@ 2002-07-03 22:26               ` Diego Calleja
  2002-07-04  0:00                 ` Keith Owens
  2002-07-04  8:04               ` Helge Hafting
  1 sibling, 1 reply; 83+ messages in thread
From: Diego Calleja @ 2002-07-03 22:26 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: oliver, trini, corbet, linux-kernel

On Tue, 2 Jul 2002 17:50:19 -0400 (EDT)
Ryan Anderson <ryan@michonline.com> escribió:

> In a single processor, no preempt kernel, there is no race.
> Turn on SMP or preempt and there is one.

So we _can't_ talk about remove module removal, but _disabling_ module
removal in the worst case.

Because if the above is correct, single processors without preempt works
well and can use module removal safely...

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
  2002-07-03 17:25       ` Richard B. Johnson
@ 2002-07-03 23:46       ` Daniel Phillips
  2002-07-08 12:21         ` Richard B. Johnson
  2002-07-03 23:48       ` Daniel Phillips
  2002-07-04  1:18       ` Keith Owens
  3 siblings, 1 reply; 83+ messages in thread
From: Daniel Phillips @ 2002-07-03 23:46 UTC (permalink / raw)
  To: Pavel Machek, Richard B. Johnson
  Cc: Stephen C. Tweedie, Bill Davidsen, Linux-Kernel Mailing List

On Wednesday 03 July 2002 05:48, Pavel Machek wrote:
> Hi!
> 
> Okay. So we want modules and want them unload. And we want it bugfree.
> 
> So... then its okay if module unload is *slow*, right?
> 
> I believe you can just freeze_processes(), unload module [now its
> safe, you *know* noone is using that module, because all processes are
> in your refrigerator], thaw_processes().
> 
> That's going to take *lot* of time, but should be very simple and very
> effective.

Hi Pavel,

Is it just the mod_dec_use_count; return/unload race we're worried about?  
I'm not clear on why this is hard.  I'd think it would be sufficient just to 
walk all runnable processes to ensure none has an execution address inside the
module.  For smp, an ipi would pick up the current process on each cpu.

At this point the use count must be zero and the module deregistered, so all 
we're interested in is that every process that dec'ed the module's use count 
has succeeded in executing its way out of the module.  If not, we try again 
later, or if we're impatient, also bump any processes still inside the module 
to the front of the run queue.

I'm sure I must have missed something.

-- 
Daniel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
  2002-07-03 17:25       ` Richard B. Johnson
  2002-07-03 23:46       ` Daniel Phillips
@ 2002-07-03 23:48       ` Daniel Phillips
  2002-07-05  9:40         ` Stephen Tweedie
  2002-07-04  1:18       ` Keith Owens
  3 siblings, 1 reply; 83+ messages in thread
From: Daniel Phillips @ 2002-07-03 23:48 UTC (permalink / raw)
  To: Pavel Machek, Richard B. Johnson
  Cc: Stephen C. Tweedie, Bill Davidsen, Linux-Kernel Mailing List

On Wednesday 03 July 2002 05:48, Pavel Machek wrote:
> Hi!
> 
> Okay. So we want modules and want them unload. And we want it bugfree.
> 
> So... then its okay if module unload is *slow*, right?
> 
> I believe you can just freeze_processes(), unload module [now its
> safe, you *know* noone is using that module, because all processes are
> in your refrigerator], thaw_processes().
> 
> That's going to take *lot* of time, but should be very simple and very
> effective.

Hi Pavel,

Is it just the mod_dec_use_count; return/unload race we're worried about?  
I'm not clear on why this is hard.  I'd think it would be sufficient just to 
walk all runnable processes to ensure none has an execution address inside the
module.  For smp, an ipi would pick up the current process on each cpu.

At this point the use count must be zero and the module deregistered, so all 
we're interested in is that every process that dec'ed the module's use count 
has succeeded in executing its way out of the module.  If not, we try again 
later, or if we're impatient, also bump any processes still inside the module 
to the front of the run queue.

I'm sure I must have missed something.

-- 
Daniel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-03 22:26               ` Diego Calleja
@ 2002-07-04  0:00                 ` Keith Owens
  0 siblings, 0 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-04  0:00 UTC (permalink / raw)
  To: Diego Calleja; +Cc: linux-kernel

On Thu, 4 Jul 2002 00:26:36 +0200, 
Diego Calleja <diegocg@teleline.es> wrote:
>On Tue, 2 Jul 2002 17:50:19 -0400 (EDT)
>Ryan Anderson <ryan@michonline.com> escribió:
>
>> In a single processor, no preempt kernel, there is no race.
>> Turn on SMP or preempt and there is one.
>
>So we _can't_ talk about remove module removal, but _disabling_ module
>removal in the worst case.
>
>Because if the above is correct, single processors without preempt works
>well and can use module removal safely...

Module removal is not safe even on UP without preempt.  UP with no
preempt only removes this race

Read usecount
                        Enter module
                        Increment usecount
Check usecount == 0
Clean up module

There are other race conditions on module unload, which is why the
problem is so "interesting".


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
                         ` (2 preceding siblings ...)
  2002-07-03 23:48       ` Daniel Phillips
@ 2002-07-04  1:18       ` Keith Owens
  2002-07-04  1:53         ` Andrew Morton
                           ` (2 more replies)
  3 siblings, 3 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-04  1:18 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

On Wed, 3 Jul 2002 05:48:09 +0200, 
Pavel Machek <pavel@ucw.cz> wrote:
>Okay. So we want modules and want them unload. And we want it bugfree.
>
>So... then its okay if module unload is *slow*, right?
>
>I believe you can just freeze_processes(), unload module [now its
>safe, you *know* noone is using that module, because all processes are
>in your refrigerator], thaw_processes().

The devil is in the details.

You must not freeze the process doing rmmod, that way lies deadlock.

Modules can run their own kernel threads.  When the module shuts down
it terminates the threads but we must wait until the process entries
for the threads have been reaped.  If you are not careful, the zombie
clean up code can refer to the module that no longer exists.  You must
not freeze any threads that belong to the module.

You must not freeze any process that has entered the module but not yet
incremented the use count, nor any process that has decremented the use
count but not yet left the module.  Simply looking at the EIP after
freeze is not enough.  Module code with a use count of 0 is allowed to
call any function as long as that function does not sleep.  That rule
used to be safe, but adding preempt has turned that safe rule into a
race, freezing processes has the same effect as preempt.

Using freeze or any other quiesce style operation requires that the
module clean up be split into two parts.  The logic must be :-

Check usecount == 0

Call module unregister routine.  Unlike the existing clean up code,
this only removes externally visible interfaces, it does not delete
module structures.

<handwaving>
  Outside the module, do whatever it takes to ensure that nothing is
  executing any module code, including threads, command callbacks etc.
</handwaving>

Check the usecount again.

If usecount is non-zero then some other code entered the module after
checking the usecount the first time and before unregister completed.
Either mark the module for delayed delete or reactivate the module by
calling the module's register routine.

If usecount is still 0 after the handwaving, then it is safe to call
the final module clean up routine to destroy its structures, release
hardware etc.  Then (and only then) is it safe to free the module.


Rusty and I agree that if (and it's a big if) we want to support module
unloading safely then this is the only sane way to do it.  It requires
some moderately complex handwaving code, changes to every module (split
init and cleanup in two) and a new version of modutils in order to do
this method.  Because of the high cost, Rusty is exploring other
options before diving into a kernel wide change.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  1:18       ` Keith Owens
@ 2002-07-04  1:53         ` Andrew Morton
  2002-07-04  4:00           ` Keith Owens
  2002-07-04  2:25         ` Brian Gerst
  2002-07-05 13:48         ` Pavel Machek
  2 siblings, 1 reply; 83+ messages in thread
From: Andrew Morton @ 2002-07-04  1:53 UTC (permalink / raw)
  To: Keith Owens; +Cc: Linux-Kernel Mailing List

Keith Owens wrote:
> 
> ...
> Rusty and I agree that if (and it's a big if) we want to support module
> unloading safely then this is the only sane way to do it.

Dumb question: what's wrong with the current code as-is?  I don't think
I've ever seen a module removal bug report which wasn't attributable to
some straightforward driver-failed-to-clean-something-up bug.

What problem are you trying to solve here?

-

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  1:18       ` Keith Owens
  2002-07-04  1:53         ` Andrew Morton
@ 2002-07-04  2:25         ` Brian Gerst
  2002-07-04  3:54           ` David Gibson
  2002-07-04  4:08           ` Keith Owens
  2002-07-05 13:48         ` Pavel Machek
  2 siblings, 2 replies; 83+ messages in thread
From: Brian Gerst @ 2002-07-04  2:25 UTC (permalink / raw)
  To: Keith Owens; +Cc: Linux-Kernel Mailing List

		 Leaders
PrivateKeith Owens wrote:
> On Wed, 3 Jul 2002 05:48:09 +0200, 
> Pavel Machek <pavel@ucw.cz> wrote:
> 
>>Okay. So we want modules and want them unload. And we want it bugfree.
>>
>>So... then its okay if module unload is *slow*, right?
>>
>>I believe you can just freeze_processes(), unload module [now its
>>safe, you *know* noone is using that module, because all processes are
>>in your refrigerator], thaw_processes().
> 
> 
> The devil is in the details.
> 
> You must not freeze the process doing rmmod, that way lies deadlock.
> 
> Modules can run their own kernel threads.  When the module shuts down
> it terminates the threads but we must wait until the process entries
> for the threads have been reaped.  If you are not careful, the zombie
> clean up code can refer to the module that no longer exists.  You must
> not freeze any threads that belong to the module.
> 
> You must not freeze any process that has entered the module but not yet
> incremented the use count, nor any process that has decremented the use
> count but not yet left the module.  Simply looking at the EIP after
> freeze is not enough.  Module code with a use count of 0 is allowed to
> call any function as long as that function does not sleep.  That rule
> used to be safe, but adding preempt has turned that safe rule into a
> race, freezing processes has the same effect as preempt.
> 
> Using freeze or any other quiesce style operation requires that the
> module clean up be split into two parts.  The logic must be :-
> 
> Check usecount == 0
> 
> Call module unregister routine.  Unlike the existing clean up code,
> this only removes externally visible interfaces, it does not delete
> module structures.
> 
> <handwaving>
>   Outside the module, do whatever it takes to ensure that nothing is
>   executing any module code, including threads, command callbacks etc.
> </handwaving>
> 
> Check the usecount again.
> 
> If usecount is non-zero then some other code entered the module after
> checking the usecount the first time and before unregister completed.
> Either mark the module for delayed delete or reactivate the module by
> calling the module's register routine.
> 
> If usecount is still 0 after the handwaving, then it is safe to call
> the final module clean up routine to destroy its structures, release
> hardware etc.  Then (and only then) is it safe to free the module.
> 
> 
> Rusty and I agree that if (and it's a big if) we want to support module
> unloading safely then this is the only sane way to do it.  It requires
> some moderately complex handwaving code, changes to every module (split
> init and cleanup in two) and a new version of modutils in order to do
> this method.  Because of the high cost, Rusty is exploring other
> options before diving into a kernel wide change.

Why not treat a module just like any other structure?  Obtain a 
reference to it _before_ using it.  I propose this change:

- When a module is inserted, it has a reference count of 1.  This keeps 
it loaded.
- get_module() obtains a reference to a module.  It cannot fail.
- put_module() releases a reference to a module.  If it goes to zero, 
the exit callback is called and then module is safely freeable.
- sys_delete_module calls the deregister callback.  When this function 
returns, it is gauranteed that the reference count will not increase 
again.  It then does a put_module().

pseudo code here:

void get_module(struct module *mod)
{
	atomic_inc(&mod->uc.usecount);
}

void put_module(struct module *mod)
{
	if (atomic_dec_and_lock(&mod->uc.usecount, &unload_lock)) {
		if (mod->exit)
			mod->exit();
		free_module(mod);
		spin_unlock(&unload_lock);
	}
}

sys_delete_module()
{
	mod->deregister();
	put_module(mod);
}


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  2:25         ` Brian Gerst
@ 2002-07-04  3:54           ` David Gibson
  2002-07-04  4:08           ` Keith Owens
  1 sibling, 0 replies; 83+ messages in thread
From: David Gibson @ 2002-07-04  3:54 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Keith Owens, Linux-Kernel Mailing List

On Wed, Jul 03, 2002 at 10:25:34PM -0400, Brian Gerst wrote:
> 		 Leaders
> PrivateKeith Owens wrote:
> >On Wed, 3 Jul 2002 05:48:09 +0200, 
> >Pavel Machek <pavel@ucw.cz> wrote:
> >
> >>Okay. So we want modules and want them unload. And we want it bugfree.
> >>
> >>So... then its okay if module unload is *slow*, right?
> >>
> >>I believe you can just freeze_processes(), unload module [now its
> >>safe, you *know* noone is using that module, because all processes are
> >>in your refrigerator], thaw_processes().
> >
> >
> >The devil is in the details.
> >
> >You must not freeze the process doing rmmod, that way lies deadlock.
> >
> >Modules can run their own kernel threads.  When the module shuts down
> >it terminates the threads but we must wait until the process entries
> >for the threads have been reaped.  If you are not careful, the zombie
> >clean up code can refer to the module that no longer exists.  You must
> >not freeze any threads that belong to the module.
> >
> >You must not freeze any process that has entered the module but not yet
> >incremented the use count, nor any process that has decremented the use
> >count but not yet left the module.  Simply looking at the EIP after
> >freeze is not enough.  Module code with a use count of 0 is allowed to
> >call any function as long as that function does not sleep.  That rule
> >used to be safe, but adding preempt has turned that safe rule into a
> >race, freezing processes has the same effect as preempt.
> >
> >Using freeze or any other quiesce style operation requires that the
> >module clean up be split into two parts.  The logic must be :-
> >
> >Check usecount == 0
> >
> >Call module unregister routine.  Unlike the existing clean up code,
> >this only removes externally visible interfaces, it does not delete
> >module structures.
> >
> ><handwaving>
> >  Outside the module, do whatever it takes to ensure that nothing is
> >  executing any module code, including threads, command callbacks etc.
> ></handwaving>
> >
> >Check the usecount again.
> >
> >If usecount is non-zero then some other code entered the module after
> >checking the usecount the first time and before unregister completed.
> >Either mark the module for delayed delete or reactivate the module by
> >calling the module's register routine.
> >
> >If usecount is still 0 after the handwaving, then it is safe to call
> >the final module clean up routine to destroy its structures, release
> >hardware etc.  Then (and only then) is it safe to free the module.
> >
> >
> >Rusty and I agree that if (and it's a big if) we want to support module
> >unloading safely then this is the only sane way to do it.  It requires
> >some moderately complex handwaving code, changes to every module (split
> >init and cleanup in two) and a new version of modutils in order to do
> >this method.  Because of the high cost, Rusty is exploring other
> >options before diving into a kernel wide change.
> 
> Why not treat a module just like any other structure?  Obtain a 
> reference to it _before_ using it.  I propose this change:

Because in general you don't know you're going to use a module before
you use it.  Using a module is (necessarily) not a narrow well-defined
interface.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.  -- H.L. Mencken
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  1:53         ` Andrew Morton
@ 2002-07-04  4:00           ` Keith Owens
  0 siblings, 0 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-04  4:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-Kernel Mailing List

On Wed, 03 Jul 2002 18:53:55 -0700, 
Andrew Morton <akpm@zip.com.au> wrote:
>Keith Owens wrote:
>> Rusty and I agree that if (and it's a big if) we want to support module
>> unloading safely then this is the only sane way to do it.
>
>Dumb question: what's wrong with the current code as-is?  I don't think
>I've ever seen a module removal bug report which wasn't attributable to
>some straightforward driver-failed-to-clean-something-up bug.
>
>What problem are you trying to solve here?

Uncounted references to module code and/or data.

  To be race safe, the reference count must be bumped before calling
  any function that might be in a module.  With the current unload
  method, MOD_INC_USE_COUNT inside the module is too late.

  Al Viro closed some of these races by adding the owner field and
  using try_inc_mod_count.  That assumes that every bit of code that
  dereferences a function pointer will handle the module locking.

Code to externally bump the use count must itself be race safe.

  drivers/char/busmouse.c::busmouse_open() has
    if (mse->ops->owner && !try_inc_mod_count(mse->ops->owner))
  This is safe because it first does
    down(&mouse_sem)
  which locks mse and its operations.  At least I assume it is safe, I
  hope that mse operations cannot be deregistered without mouse_sem.

  Although this appears to be safe, it is not obvious that it is safe,
  you have to trace the interaction between mouse_sem, the module
  unload_lock, the MOD_DELETED flag for the module and the individual
  module clean up routines to be sure that there are no races.

  OTOH, HiSax_mod_inc_use_count appears to be unsafe.  It has no
  locking of its own before calling try_inc_mod_count().  AFAICT this
  race exists :-

    HiSax_command() -> HiSax_mod_inc_use_count()
        mod = cs->hw.hisax_d_if->owner;
	// race here
	if (mod)
	  try_inc_mod_count(mod);

  Without any (obvious) locking to prevent hisax unregistration on
  another cpu, mod can be deleted in the middle of that code.  To add
  insult to injury, HiSax_mod_inc_use_count does not check if it locked
  the module or not, it blindly assumes it worked and continues to use
  the module structures!

  I cannot be sure if get_mtd_device() (calls try_inc_mod_count) is
  safe or not.  There is no locking around calls to get_mtd_device()
  which makes it look unsafe.  OTOH it is invoked from mtdblock_open()
  via mtd_fops.open, which may mean that a higher routine is locking
  the module down.  But it is not obvious that the code is safe.

  If get_mtd_device() is being protected by a higher routine such as
  get_fops() then the module use count is already non-zero and
  try_inc_mod_count will always succeed.  In that case,
  try_inc_mod_count is overkill, an unconditional __MOD_INC_USE_COUNT
  will do the job just as well.  In either case, there is a question
  mark over this code.

  Similar comments for net/core/dev.c::dev_open().  No obvious locks to
  protect the parameters passed to try_inc_mod_count, they may or may
  not be protected by a lock in a higher routine.  try_inc_mod_count
  may or may not be overkill here.

The main objections to the existing reference counting model are :-

* Complex and fragile.  The interactions between try_inc_mod_count,
  some other lock that protects the parameters to try_inc_mod_count and
  the module clean up routines are not obvious.  Every module writer
  has to get the interaction exactly right.

* Difficult to audit.  Is get_mtd_device() safe or not?  If it is safe,
  why is it safe and is try_inc_mod_count overkill?

* Easy to omit.  The comments above only apply to the code that is
  using try_inc_mod_count.  How much code exists that should be doing
  module locking but is not?

* All the external locking is scattered around the kernel, anywhere
  that might call a function in a module.

* The external locking is extra cpu overhead (and storage, but to a
  lesser extent) all the time.  Just to cope with a relatively rare
  unload event.

Given those objections, Rusty is looking at alternative methods,
ranging from no unload at all to an unload method that moves all the
complexity into module.c instead of spreading it around the kernel.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  2:25         ` Brian Gerst
  2002-07-04  3:54           ` David Gibson
@ 2002-07-04  4:08           ` Keith Owens
  2002-07-04 15:02             ` Brian Gerst
  1 sibling, 1 reply; 83+ messages in thread
From: Keith Owens @ 2002-07-04  4:08 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

On Wed, 03 Jul 2002 22:25:34 -0400, 
Brian Gerst <bgerst@didntduck.org> wrote:
>Why not treat a module just like any other structure?  Obtain a 
>reference to it _before_ using it.

That is what try_inc_use_count() does.  But the interface is messy and
difficult to audit.  It relies on the caller taking some other lock
first to ensure that the module address will not change while you are
trying to call try_inc_use_count.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 21:50             ` Ryan Anderson
  2002-07-03 22:26               ` Diego Calleja
@ 2002-07-04  8:04               ` Helge Hafting
  1 sibling, 0 replies; 83+ messages in thread
From: Helge Hafting @ 2002-07-04  8:04 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: linux-kernel

Ryan Anderson wrote:
> 
> On Tue, 2 Jul 2002, Oliver Neukum wrote:
[...]
> > Either there is a race or there isn't. Such a thing is unusable on a
> > production system.
> 
> In a single processor, no preempt kernel, there is no race.
> Turn on SMP or preempt and there is one.

Seems to me that module _replacement_ (as opposed to
unloading in order to free memory) is an easier case.

Just load the new module and initialize it.  If some
other preempted processor manages to race and activate
the old module - no problem, because the code isn't gone.

The old module may be unloaded once we know the new one
will get all future requests and the old one has 0 references.

Helge Hafting

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
       [not found]   ` <Pine.LNX.3.95.1020702075957.24872A-100000@chaos.analogic.c om>
@ 2002-07-04  8:36     ` Mike Galbraith
  0 siblings, 0 replies; 83+ messages in thread
From: Mike Galbraith @ 2002-07-04  8:36 UTC (permalink / raw)
  To: root, Stephen C. Tweedie; +Cc: Bill Davidsen, Linux-Kernel Mailing List

At 08:04 AM 7/2/2002 -0400, Richard B. Johnson wrote:
>On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:
>
> > Hi,
> >
> > > The suggestion was made that kernel module removal be depreciated or
> > > removed. I'd like to note that there are two common uses for this
> > > capability, and the problems addressed by module removal should be 
> kept in
> > > mind. These are in addition to the PCMCIA issue raised.
> >
> > > 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> > > better (or usefully at all) with one or the other driver used to read 
> CDs.
> >
> > The proposal was to deprecate module removal, *not* to deprecate
> > dynamic registration of devices.  If you want to maintain the above
> > functionality, then there's no reason why a device can't be
> > deregistered from one driver and reregistered on another while both
> > drivers are present.  Note that the scsi stack already allows you to
> > dynamically register and deregister specific targets on the fly.
>
>As I am led to understand from reading this thread, there is some
>known bug caused by module removal. Therefore the "solution" is to
>remove module removal capability.

Read the thread more carefully, and you'll understand more than "some
known bug".  Heck, you may even be able to contribute to a more
palatable race solution than depreciating module unload entirely.

>This is absurd. Next, somebody will remove network capability because
>there's some bug in it.  Hello there........?  Are there any carbon-
>based life-forms out there?

I'm not absolutely certain that those life-forms discussing options are
carbon-based ;-) but they have demonstrated considerable knowledge
of the subject IMO.

         -Mike


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  4:08           ` Keith Owens
@ 2002-07-04 15:02             ` Brian Gerst
  2002-07-04 19:18               ` Werner Almesberger
  0 siblings, 1 reply; 83+ messages in thread
From: Brian Gerst @ 2002-07-04 15:02 UTC (permalink / raw)
  To: Keith Owens; +Cc: Linux-Kernel Mailing List

Keith Owens wrote:
> On Wed, 03 Jul 2002 22:25:34 -0400, 
> Brian Gerst <bgerst@didntduck.org> wrote:
> 
>>Why not treat a module just like any other structure?  Obtain a 
>>reference to it _before_ using it.
> 
> 
> That is what try_inc_use_count() does.  But the interface is messy and
> difficult to audit.  It relies on the caller taking some other lock
> first to ensure that the module address will not change while you are
> trying to call try_inc_use_count.

And that is almost always the case anyways, since most cases traverse a 
linked list that must already be protected.  You are trying to 
overengineer a solution to a simple, but subtle, problem.

PS.  If you really want to make the broken cases show themselves, poison 
the module memory when it is unloaded.  The same can be done for dumping 
init data and text.

--
				Brian Gerst


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04 15:02             ` Brian Gerst
@ 2002-07-04 19:18               ` Werner Almesberger
  0 siblings, 0 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-04 19:18 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Keith Owens, Linux-Kernel Mailing List

Brian Gerst wrote:
> PS.  If you really want to make the broken cases show themselves, poison 
> the module memory when it is unloaded.  The same can be done for dumping 
> init data and text.

Unfortunately, many of the code paths leading to the races are
sufficiently different that it's currently unlikely to actually
hit them. (E.g. it's kind of hard for some 1000+ instructions
code path to compete successfully with a cached, pre-fetched,
and pre-decoded "ret" instruction. Entry-after-removal races
(and the related data races) may be a little bit easier to catch,
but I suppose we're still talking about a likely difference of
orders of magnitude.)

I don't think preemption changes that fundamentally.
Microthreading may, and NUMA will make execution times more
variable, increasing the risk of hitting a race.

The best way to trigger such races is probably to simulate
microthreading, e.g. with an UML-SMP where only one "CPU" can
run at a time, and where execution switching is under script
control.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-03 23:48       ` Daniel Phillips
@ 2002-07-05  9:40         ` Stephen Tweedie
  2002-07-06 19:40           ` Daniel Phillips
  0 siblings, 1 reply; 83+ messages in thread
From: Stephen Tweedie @ 2002-07-05  9:40 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Pavel Machek, Richard B. Johnson, Stephen C. Tweedie,
	Bill Davidsen, Linux-Kernel Mailing List

Hi,

On Thu, Jul 04, 2002 at 01:48:59AM +0200, Daniel Phillips
<phillips@arcor.de> wrote:

> Is it just the mod_dec_use_count; return/unload race we're worried about?  
> I'm not clear on why this is hard.  I'd think it would be sufficient just to 
> walk all runnable processes to ensure none has an execution address inside the
> module.

That fails if:

the module function has called somewhere else in the kernel (and
with -fomit-frame-pointer, you can't reliably walk back up the stack
to find out if there is a stack frame higher up the stack which is in
the module);

the module has taken an interrupt into an unrelated driver;

we have computed a call into the module but haven't actually executed
the call yet;

etc. 


> For smp, an ipi would pick up the current process on each cpu.

Without freezing the other CPUs, that still leaves the race wide open.

--Stephen

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-04  1:18       ` Keith Owens
  2002-07-04  1:53         ` Andrew Morton
  2002-07-04  2:25         ` Brian Gerst
@ 2002-07-05 13:48         ` Pavel Machek
  2002-07-07 14:56           ` Keith Owens
  2 siblings, 1 reply; 83+ messages in thread
From: Pavel Machek @ 2002-07-05 13:48 UTC (permalink / raw)
  To: Keith Owens; +Cc: Linux-Kernel Mailing List

Hi!

> >Okay. So we want modules and want them unload. And we want it bugfree.
> >
> >So... then its okay if module unload is *slow*, right?
> >
> >I believe you can just freeze_processes(), unload module [now its
> >safe, you *know* noone is using that module, because all processes are
> >in your refrigerator], thaw_processes().
> 
> The devil is in the details.

I don't think so.

> You must not freeze the process doing rmmod, that way lies deadlock.

That's automagic. Process doing freeze will not freeze itself.

> Modules can run their own kernel threads.  When the module shuts
> down
> it terminates the threads but we must wait until the process entries
> for the threads have been reaped.  If you are not careful, the zombie
> clean up code can refer to the module that no longer exists.  You must
> not freeze any threads that belong to the module.

Look at the code. freezer will try for 5 seconds, then give up. So, in
rare case module has some threads, rmmod will simply fail. I believe
we can fix rare remaining modules one by one.

> You must not freeze any process that has entered the module but not yet
> incremented the use count, nor any process that has decremented the use
> count but not yet left the module.  Simply looking at the EIP after

Look how freezer works. refrigerator() is blocking, by definition. So
if all processes reach refrigerator(), and the use count == 0, it is
indeed safe to unload.

> freeze is not enough.  Module code with a use count of 0 is allowed to
> call any function as long as that function does not sleep.  That
> rule

And that's okay. If it only calls non-sleeping functions, it is not
going to call refrigerator(), and it will exit module code before
calling refrigerator().

> Using freeze or any other quiesce style operation requires that the
> module clean up be split into two parts.  The logic must be :-

I still think it does not.
								Pavel
-- 
Worst form of spam? Adding advertisment signatures ala sourceforge.net.
What goes next? Inserting advertisment *into* email?

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-05  9:40         ` Stephen Tweedie
@ 2002-07-06 19:40           ` Daniel Phillips
  2002-07-06 19:47             ` Pavel Machek
  0 siblings, 1 reply; 83+ messages in thread
From: Daniel Phillips @ 2002-07-06 19:40 UTC (permalink / raw)
  To: Stephen Tweedie
  Cc: Pavel Machek, Richard B. Johnson, Stephen C. Tweedie,
	Bill Davidsen, Linux-Kernel Mailing List

On Friday 05 July 2002 11:40, Stephen Tweedie wrote:
> Hi,
> 
> On Thu, Jul 04, 2002 at 01:48:59AM +0200, Daniel Phillips
> <phillips@arcor.de> wrote:
> 
> > Is it just the mod_dec_use_count; return/unload race we're worried about? 

> > I'm not clear on why this is hard.  I'd think it would be sufficient just 
> > to walk all runnable processes to ensure none has an execution address 
> > inside the module.
> 
> That fails if:
> 
> the module function has called somewhere else in the kernel (and
> with -fomit-frame-pointer, you can't reliably walk back up the stack
> to find out if there is a stack frame higher up the stack which is in
> the module);

Hi Stephen,

I'm assuming for the sake of argument that we're requiring the use count to 
be incremented for any call outside the module, a rule we might want anyway, 
since it is less fragile than the no-sleeping rule.

> the module has taken an interrupt into an unrelated driver;

With Ben's new separate interrupt stacks the current IP would be available at 
a known place at the base of the interrupt stack.

> we have computed a call into the module but haven't actually executed
> the call yet;

This one is problematic, and yes, I now agree the problem is hard. This is 
where Keith's handwaving comes in: we are supposed to have deregistered the 
module's services and ensured all processes are out of the module at this 
point.  I don't know how that helps, really.  I just want to note that this 
seems to be the only really hard problem.  It's not insoluable though: going 
to extremes we could record each region of code from which module calls 
originate and check for execution addresses in that region, along with 
execution addresses in the module.  Picking up the call address would have to 
be an atomic read.  You don't have to tell me this is ugly and slow, but it 
would work.

> etc. 

You enumerated all the areas of concern that I'd identified, so I'm curious 
what the etc stands for.

> > For smp, an ipi would pick up the current process on each cpu.
> 
> Without freezing the other CPUs, that still leaves the race wide open.

With per-cpu runqueues, we take an ipi onto each processor and take the 
task's runqueue lock.   We can now check each runnable task, and the task the 
ipi interrupted.  Repeating for each processor, the only door we have to 
close is inter-processor task migration, which is not a fast path.  So this 
part, at least, seems doable.

Anyway, I'm beginning to see what all the fuss is about.

-- 
Daniel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-06 19:40           ` Daniel Phillips
@ 2002-07-06 19:47             ` Pavel Machek
  0 siblings, 0 replies; 83+ messages in thread
From: Pavel Machek @ 2002-07-06 19:47 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Stephen Tweedie, Richard B. Johnson, Bill Davidsen,
	Linux-Kernel Mailing List

Hi!

> I'm assuming for the sake of argument that we're requiring the use count to 
> be incremented for any call outside the module, a rule we might want anyway, 
> since it is less fragile than the no-sleeping rule.
> 
> > the module has taken an interrupt into an unrelated driver;
> 
> With Ben's new separate interrupt stacks the current IP would be available at 
> a known place at the base of the interrupt stack.
> 
> > we have computed a call into the module but haven't actually executed
> > the call yet;
> 
> This one is problematic, and yes, I now agree the problem is hard. This is 
> where Keith's handwaving comes in: we are supposed to have deregistered the 
> module's services and ensured all processes are out of the module at this 
> point.  I don't know how that helps, really.  I just want to note that this 
> seems to be the only really hard problem.  It's not insoluable though: going 
> to extremes we could record each region of code from which module calls 
> originate and check for execution addresses in that region, along with 
> execution addresses in the module.  Picking up the call address would have to 
> be an atomic read.  You don't have to tell me this is ugly and slow, but it 
> would work.

freeze_processes(), and now you know that rmmod is only runnable job
on the system [approximately; you'd have to audit all threads marked
as non-stoppable]. So... if rmmod() does not do any computed calls to
module being removed, noone is doing that and all is safe.
								Pavel
-- 
Casualities in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-05 13:48         ` Pavel Machek
@ 2002-07-07 14:56           ` Keith Owens
  2002-07-07 22:36             ` Roman Zippel
                               ` (2 more replies)
  0 siblings, 3 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-07 14:56 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux-Kernel Mailing List

On Fri, 5 Jul 2002 15:48:17 +0200, 
Pavel Machek <pavel@ucw.cz> wrote:
>Keith Owens wrote
>> Modules can run their own kernel threads.  When the module shuts down
>> it terminates the threads but we must wait until the process entries
>> for the threads have been reaped.  If you are not careful, the zombie
>> clean up code can refer to the module that no longer exists.  You must
>> not freeze any threads that belong to the module.
>
>Look at the code. freezer will try for 5 seconds, then give up. So, in
>rare case module has some threads, rmmod will simply fail. I believe
>we can fix rare remaining modules one by one.

There is no failure path for rmmod.  Once rmmod sees a use count of 0,
it must succeed.  Which is why both Rusty and I agree that rmmod must
be split in two, one piece that is allowed to fail and a second piece
that is not.

BTW, freeze_processes() silently ignores zombie processes.  The test is
not obvious, it is hidden in the INTERESTING macro which has an
embedded 'continue' to break out of the loop.  Easy to miss.

>> You must not freeze any process that has entered the module but not yet
>> incremented the use count, nor any process that has decremented the use
>> count but not yet left the module.  Simply looking at the EIP after
>
>Look how freezer works. refrigerator() is blocking, by definition. So
>if all processes reach refrigerator(), and the use count == 0, it is
>indeed safe to unload.

freeze_processes()
  signal_wake_up() - sets TIF_SIGPENDING for other task
    kick_if_running()
      resched_task() - calls preempt_disable() for this cpu
        smp_send_reschedule()
	  smp_reschedule_interrupt() - now on another cpu
	    ret_from_intr
	      resume_kernel - on other cpu

With CONFIG_PREEMPT, a process running on another cpu without a lock
when freeze_processes() is called should immediately end up in
schedule.  I don't see anything in that code path that disables
preemption on other cpus.  If I am right, then a second cpu could be in
this window when freeze_processes is called

  if (xxx->func)
    xxx->func()

where func is a module function.  There is still a window from loading
the function address, through calling the function and up to the point
where the function does MOD_INC_USE_COUNT.  Any reschedule in that
window opens a race with rmmod.  Without preemption, freeze might be
safe, with preemption the race is back again.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-07 14:56           ` Keith Owens
@ 2002-07-07 22:36             ` Roman Zippel
  2002-07-08  1:09             ` Daniel Mose
  2002-07-08 18:13             ` Pavel Machek
  2 siblings, 0 replies; 83+ messages in thread
From: Roman Zippel @ 2002-07-07 22:36 UTC (permalink / raw)
  To: Keith Owens; +Cc: Pavel Machek, Linux-Kernel Mailing List

Hi,

On Mon, 8 Jul 2002, Keith Owens wrote:

> With CONFIG_PREEMPT, a process running on another cpu without a lock
> when freeze_processes() is called should immediately end up in
> schedule.  I don't see anything in that code path that disables
> preemption on other cpus.  If I am right, then a second cpu could be in
> this window when freeze_processes is called
>
>   if (xxx->func)
>     xxx->func()
>
> where func is a module function.  There is still a window from loading
> the function address, through calling the function and up to the point
> where the function does MOD_INC_USE_COUNT.  Any reschedule in that
> window opens a race with rmmod.  Without preemption, freeze might be
> safe, with preemption the race is back again.

While I agree that the current module interface is broken, but the current
suggestions are IMHO not any better. They rely on specific knowledge about
the scheduler. Next time someone wants to change the scheduler or the
scheduling behaviour, he runs into the risk breaking modules (again).

All we have to do is to make sure that there is no reference to the module
left. In other words we have to make sure there is no data structure with
a reference to the module left. Managing data structures is something we
already have to do anyway, so why don't we just the existing interfaces?

As an example I have an alternative "fix" (that means it compiles) for the
bdev layer. First it makes (un)register_blkdev smp/preempt safe, but the
important change is in unregister_blkdev, which basically does:

	for_all_bdev(bdev) {
		if (bdev->bd_op == op)
			return -EBUSY;
	}

After that we know that noone can call into the module anymore, now we
only have to make use of -EBUSY information somehow.
That means that the module count is not strictly necessary anymore and
structures don't have to be polluted with module owner fields.
IMO that's the far cleaner solution and doesn't require messing with the
scheduler. The module interface has to be fixed anyway and I'd prefer it
wouldn't require some scheduling magic.

bye, Roman

Index: fs/block_dev.c
===================================================================
RCS file: /usr/src/cvsroot/linux-2.5/fs/block_dev.c,v
retrieving revision 1.1.1.17
diff -u -p -r1.1.1.17 block_dev.c
--- fs/block_dev.c	6 Jul 2002 00:33:38 -0000	1.1.1.17
+++ fs/block_dev.c	7 Jul 2002 22:12:50 -0000
@@ -417,55 +417,98 @@ int get_blkdev_list(char * p)
 	Return the function table of a device.
 	Load the driver if needed.
 */
-const struct block_device_operations * get_blkfops(unsigned int major)
+const struct block_device_operations *__get_blkfops(unsigned int major)
 {
 	const struct block_device_operations *ret = NULL;

 	/* major 0 is used for non-device mounts */
 	if (major && major < MAX_BLKDEV) {
 #ifdef CONFIG_KMOD
+		spin_unlock(&bdev_lock);
 		if (!blkdevs[major].bdops) {
 			char name[20];
 			sprintf(name, "block-major-%d", major);
 			request_module(name);
 		}
+		spin_lock(&bdev_lock);
 #endif
 		ret = blkdevs[major].bdops;
 	}
 	return ret;
 }

+const struct block_device_operations *get_blkfops(unsigned int major)
+{
+	const struct block_device_operations *bd;
+	spin_lock(&bdev_lock);
+	bd = __get_blkfops(major);
+	spin_unlock(&bdev_lock);
+	return bd;
+}
+
 int register_blkdev(unsigned int major, const char * name, struct block_device_operations *bdops)
 {
+	int res = 0;
+
+	spin_lock(&bdev_lock);
 	if (major == 0) {
 		for (major = MAX_BLKDEV-1; major > 0; major--) {
 			if (blkdevs[major].bdops == NULL) {
 				blkdevs[major].name = name;
 				blkdevs[major].bdops = bdops;
-				return major;
+				res = major;
+				goto out;
 			}
 		}
-		return -EBUSY;
+		res = -EBUSY;
+		goto out;
+	}
+	if (major >= MAX_BLKDEV) {
+		res = -EINVAL;
+		goto out;
+	}
+	if (blkdevs[major].bdops && blkdevs[major].bdops != bdops) {
+		res = -EBUSY;
+		goto out;
 	}
-	if (major >= MAX_BLKDEV)
-		return -EINVAL;
-	if (blkdevs[major].bdops && blkdevs[major].bdops != bdops)
-		return -EBUSY;
 	blkdevs[major].name = name;
 	blkdevs[major].bdops = bdops;
-	return 0;
+out:
+	spin_unlock(&bdev_lock);
+	return res;
 }

 int unregister_blkdev(unsigned int major, const char * name)
 {
+	struct block_device *bdev;
+	struct list_head *p;
+	int i, res = 0;
+
 	if (major >= MAX_BLKDEV)
 		return -EINVAL;
-	if (!blkdevs[major].bdops)
-		return -EINVAL;
-	if (strcmp(blkdevs[major].name, name))
-		return -EINVAL;
+	spin_lock(&bdev_lock);
+	if (!blkdevs[major].bdops) {
+		res = -EINVAL;
+		goto out;
+	}
+	if (strcmp(blkdevs[major].name, name)) {
+		res = -EINVAL;
+		goto out;
+	}
+	for (i = 0; i < HASH_SIZE; i++) {
+		list_for_each(p, &bdev_hashtable[i]) {
+			bdev = list_entry(p, struct block_device, bd_hash);
+			if (bdev->bd_op == blkdevs[major].bdops) {
+				res = -EBUSY;
+				goto out;
+			}
+		}
+	}
+
 	blkdevs[major].name = NULL;
 	blkdevs[major].bdops = NULL;
+out:
+	spin_unlock(&bdev_lock);
 	return 0;
 }

@@ -481,11 +524,15 @@ int unregister_blkdev(unsigned int major
 int check_disk_change(kdev_t dev)
 {
 	int i;
+	struct block_device *bdev = bdget(kdev_t_to_nr(dev));
 	const struct block_device_operations * bdops = NULL;

 	i = major(dev);
-	if (i < MAX_BLKDEV)
-		bdops = blkdevs[i].bdops;
+	spin_lock(&bdev_lock);
+	if (!bdev->bd_op && i < MAX_BLKDEV)
+		bdev->bd_op = blkdevs[i].bdops;
+	bdops = bdev->bd_op;
+	spin_unlock(&bdev_lock);
 	if (bdops == NULL) {
 		devfs_handle_t de;

@@ -496,12 +543,12 @@ int check_disk_change(kdev_t dev)
 			devfs_put_ops (de); /* We're running in owner module */
 		}
 	}
-	if (bdops == NULL)
-		return 0;
-	if (bdops->check_media_change == NULL)
-		return 0;
-	if (!bdops->check_media_change(dev))
+	if (bdops == NULL ||
+	    bdops->check_media_change == NULL ||
+	    !bdops->check_media_change(dev)) {
+		bdput(bdev);
 		return 0;
+	}

 	printk(KERN_DEBUG "VFS: Disk change detected on device %s\n",
 		__bdevname(dev));
@@ -511,6 +558,7 @@ int check_disk_change(kdev_t dev)

 	if (bdops->revalidate)
 		bdops->revalidate(dev);
+	bdput(bdev);
 	return 1;
 }

@@ -536,7 +584,9 @@ static int do_open(struct block_device *
 	down(&bdev->bd_sem);
 	lock_kernel();
 	if (!bdev->bd_op) {
-		bdev->bd_op = get_blkfops(major(dev));
+		spin_lock(&bdev_lock);
+		bdev->bd_op = __get_blkfops(major(dev));
+		spin_unlock(&bdev_lock);
 		if (!bdev->bd_op)
 			goto out;
 		owner = bdev->bd_op->owner;


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-07 14:56           ` Keith Owens
  2002-07-07 22:36             ` Roman Zippel
@ 2002-07-08  1:09             ` Daniel Mose
  2002-07-09 17:07               ` Daniel Mose
  2002-07-08 18:13             ` Pavel Machek
  2 siblings, 1 reply; 83+ messages in thread
From: Daniel Mose @ 2002-07-08  1:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: lixmeta

Keith Owens wrote:
> On Fri, 5 Jul 2002 15:48:17 +0200, 
> Pavel Machek <pavel@ucw.cz> wrote:
> >Keith Owens wrote
> >> Modules can run their own kernel threads.  When the module shuts down
> >> it terminates the threads but we must wait until the process entries
> >> for the threads have been reaped.  If you are not careful, the zombie
> >> clean up code can refer to the module that no longer exists.  You must
> >> not freeze any threads that belong to the module.

I am just out to share a possible angle. -all though not really a programmer.

Can one module replace another -running- twin-module without having to hand-
shake with the kernel? -and without freezing ? (transparently)

If obvious no - no need to read further.
 
Otherwhise, there might be a possibility to create a shut-down module out of 
the existing module it self?
In this case it could probably be a smooth solution to ask the module creator.
(if known, and available) if he/she would be able to make this modification 
on the fly, as he/she probably knows the most about how the signals operate. 
That is to strip down the existing loadable module into sort of a dummy. 
Of course this won't work if hazardous removal issues are shared among all or 
most of the existing kernel modules.

An apology for taking up time and bandwith. But rmmod is such a neat command 
to have a round.

/Daniel Mose


    






^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-03 23:46       ` Daniel Phillips
@ 2002-07-08 12:21         ` Richard B. Johnson
  2002-07-08 12:41           ` Thunder from the hill
  2002-07-08 13:06           ` Keith Owens
  0 siblings, 2 replies; 83+ messages in thread
From: Richard B. Johnson @ 2002-07-08 12:21 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Pavel Machek, Stephen C. Tweedie, Bill Davidsen,
	Linux-Kernel Mailing List

On Thu, 4 Jul 2002, Daniel Phillips wrote:

> On Wednesday 03 July 2002 05:48, Pavel Machek wrote:
> > Hi!
> > 
> > Okay. So we want modules and want them unload. And we want it bugfree.
> > 
> > So... then its okay if module unload is *slow*, right?
> > 
> > I believe you can just freeze_processes(), unload module [now its
> > safe, you *know* noone is using that module, because all processes are
> > in your refrigerator], thaw_processes().
> > 
> > That's going to take *lot* of time, but should be very simple and very
> > effective.
> 
> Hi Pavel,
> 
> Is it just the mod_dec_use_count; return/unload race we're worried about?  
> I'm not clear on why this is hard.  I'd think it would be sufficient just to 
> walk all runnable processes to ensure none has an execution address inside the
> module.  For smp, an ipi would pick up the current process on each cpu.
> 
> At this point the use count must be zero and the module deregistered, so all 
> we're interested in is that every process that dec'ed the module's use count 
> has succeeded in executing its way out of the module.  If not, we try again 
> later, or if we're impatient, also bump any processes still inside the module 
> to the front of the run queue.
> 
> I'm sure I must have missed something.
> 
> -- 
> Daniel
> 

Assuming that there are no users of a module, i.e., the current
MOD_DEC_USE_COUNT mechanism, checked under a lock, I think all
you have to do to assure race-free module removal is to:

(1)	Make sure that `current` isn't preempted. There are several
        ways to do this. This may require another global variable.

(2)	Execute cleanup_module(). It is assumed that the user-code
	will deallocate interrupts, and free resources, etc.

(3)	Set a global flag "module_remove", it doesn't have to be atomic.
	It needs only to be volatile. It is used in schedule() to trap
	all CPUs.
        schedule()
        {
            while(module_remove)
                ;
        }

(4)	Wait number-of-CPUs timer-ticks with interrupts alive and well.
        This will make certain that everything queued on the timer gets
        a chance to be executed and everything that would be preempted
        gets trapped in schedule().
 
(5)	Remove the module, reset the flag.

The cost is the single variable that must be checked every time schedule()
is executed plus the 'don't preempt me' variable (if necessary).


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 12:21         ` Richard B. Johnson
@ 2002-07-08 12:41           ` Thunder from the hill
  2002-07-08 12:57             ` Richard B. Johnson
  2002-07-08 13:06           ` Keith Owens
  1 sibling, 1 reply; 83+ messages in thread
From: Thunder from the hill @ 2002-07-08 12:41 UTC (permalink / raw)
  To: Richard B. Johnson
  Cc: Daniel Phillips, Pavel Machek, Stephen C. Tweedie, Bill Davidsen,
	Linux-Kernel Mailing List

Hi,

On Mon, 8 Jul 2002, Richard B. Johnson wrote:
> (3)	Set a global flag "module_remove", it doesn't have to be atomic.
> 	It needs only to be volatile. It is used in schedule() to trap
> 	all CPUs.
>         schedule()
>         {
>             while(module_remove)
>                 ;
>         }

That doesn't sound too clean to me...

Maybe we should lock that module explicitly, instead of halting anything 
that is schedule()d.

We should possibly add something to lock in struct module (or 
module_info), be it some kind of integer or be it a semaphore (which is 
clearly a bit too much, I think) or a spinlock, or whatever. This 
shouldn't protect the module from being used in parallel, but from being 
used in removal. So on removal, we do something like module->remove |= 1 
or even up(module->m_sem), and when we're done, we do something related to 
undo the up, remove or whatever...

BTW, looking at struct module, we have this union

union {
	atomic_t usecount;
	long pad;
}

Fair enough, but if long pad is to pad (as it name tells us), shouldn't it 
be atomic_t then (I mean, what if we change the type for atomic_t)?

							Regards,
							Thunder
-- 
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o?  K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y- 
------END GEEK CODE BLOCK------


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 12:41           ` Thunder from the hill
@ 2002-07-08 12:57             ` Richard B. Johnson
  2002-07-08 13:58               ` Thunder from the hill
  0 siblings, 1 reply; 83+ messages in thread
From: Richard B. Johnson @ 2002-07-08 12:57 UTC (permalink / raw)
  To: Thunder from the hill
  Cc: Daniel Phillips, Pavel Machek, Stephen C. Tweedie, Bill Davidsen,
	Linux-Kernel Mailing List

On Mon, 8 Jul 2002, Thunder from the hill wrote:

> Hi,
> 
> On Mon, 8 Jul 2002, Richard B. Johnson wrote:
> > (3)	Set a global flag "module_remove", it doesn't have to be atomic.
> > 	It needs only to be volatile. It is used in schedule() to trap
> > 	all CPUs.
> >         schedule()
> >         {
> >             while(module_remove)
> >                 ;
> >         }
> 
> That doesn't sound too clean to me...
> 
> Maybe we should lock that module explicitly, instead of halting anything 
> that is schedule()d.

Locking the module does nothing except increase module overhead. The
premise is that we don't care how long it takes to remove a module. It
just must be done safely. So, what we need to do it make certain that...

(1) All calls from the module have returned.
(2) All calls to the module code have returned.
(3) All user-access has completed.

That's what the trap in schedule() does, in conjunction with the
wait-for-timer-ticks. We don't want to have lots of locks and semiphores
that have to be accessed during normal execution paths.

> 
> We should possibly add something to lock in struct module (or 
> module_info), be it some kind of integer or be it a semaphore (which is 
> clearly a bit too much, I think) or a spinlock, or whatever.

But this doesn't solve the module-removal problem .

> shouldn't protect the module from being used in parallel, but from being 
> used in removal. So on removal, we do something like module->remove |= 1 
> or even up(module->m_sem), and when we're done, we do something related to 
> undo the up, remove or whatever...
> 

Again, it's not the problem I'm addressing.


> BTW, looking at struct module, we have this union
> 
> union {
> 	atomic_t usecount;
> 	long pad;
> }
> 
> Fair enough, but if long pad is to pad (as it name tells us), shouldn't it 
> be atomic_t then (I mean, what if we change the type for atomic_t)?
> 

Good point. Member usecount could be anything. A 'long' isn't the correct
pad for all types, but it will probably handle everything that was
intended.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 12:21         ` Richard B. Johnson
  2002-07-08 12:41           ` Thunder from the hill
@ 2002-07-08 13:06           ` Keith Owens
  2002-07-08 13:15             ` Keith Owens
  1 sibling, 1 reply; 83+ messages in thread
From: Keith Owens @ 2002-07-08 13:06 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

On Mon, 8 Jul 2002 08:21:59 -0400 (EDT), 
"Richard B. Johnson" <root@chaos.analogic.com> wrote:
>On Thu, 4 Jul 2002, Daniel Phillips wrote:
>> Is it just the mod_dec_use_count; return/unload race we're worried about?  
>> I'm not clear on why this is hard.  I'd think it would be sufficient just to 
>> walk all runnable processes to ensure none has an execution address inside the
>> module.  For smp, an ipi would pick up the current process on each cpu.
>> 
>> At this point the use count must be zero and the module deregistered, so all 
>> we're interested in is that every process that dec'ed the module's use count 
>> has succeeded in executing its way out of the module.  If not, we try again 
>> later, or if we're impatient, also bump any processes still inside the module 
>> to the front of the run queue.
>
>Assuming that there are no users of a module, i.e., the current
>MOD_DEC_USE_COUNT mechanism, checked under a lock, I think all
>you have to do to assure race-free module removal is to:
>
>(1)	Make sure that `current` isn't preempted. There are several
>        ways to do this. This may require another global variable.
>
>(2)	Execute cleanup_module(). It is assumed that the user-code
>	will deallocate interrupts, and free resources, etc.

You must not deallocate any resources until you have flushed out any
callers who have loaded the address of a module function but not yet
done MOD_INC_USE_COUNT.  It races between cleanup and a new consumer
entering the module.

>(3)	Set a global flag "module_remove", it doesn't have to be atomic.
>	It needs only to be volatile. It is used in schedule() to trap
>	all CPUs.
>        schedule()
>        {
>            while(module_remove)
>                ;
>        }
>
>(4)	Wait number-of-CPUs timer-ticks with interrupts alive and well.
>        This will make certain that everything queued on the timer gets
>        a chance to be executed and everything that would be preempted
>        gets trapped in schedule().
> 
>(5)	Remove the module, reset the flag.
>
>The cost is the single variable that must be checked every time schedule()
>is executed plus the 'don't preempt me' variable (if necessary).

(3) and (4) are the handwaving[1] bit that says 'wait until all
consumers have either left the module or incremented the use count'.  I
don't like the idea of completely stopping schedule.  Apart from the
long latency it introduces on module removal, I suspect that threads
belong to the module being removed will be a problem, they must proceed
to completion and full thread clean up.

Frankly the quiesce code is not the real problem.  The real race is
this one :-

Check use count, 0 so proceed with unload.
      Another task enters the module on another cpu.
Call cleanup_module(), remove resources.
      Increment use count.
      Use resources, oops.

There are only two viable solutions to this race.  Either _always_
increment the use count outside the module (via the owner field in
structures) or split module exit code in two, as described in [1].

The current code tries to use 'increment use count outside module' but
that has its own race in getting the address of the module.  Closing
that race relies on the interaction between three (yes, three)
unrelated locks which have to be obtained and released in the right
order.  Not only is this complex and fragile, a quick scan of the
kernel found one outright bug and several dubious sections of code.

Rusty and I think that the current method is difficult to use and to
audit, as well as scattering the complexity around the kernel.  Not to
mention that preempt breaks the existing code.  We both think that we
should give up on the current method and put all the complexity of
flushing stale references in one place[2].  The downside is the need to
split module init and exit code into two phases, i.e. change every
module to separate unregister from unallocate.

[1] http://marc.theaimsgroup.com/?l=linux-kernel&m=102574568025442&w=2
[2] http://marc.theaimsgroup.com/?l=linux-kernel&m=102575536930661&w=2 


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 13:06           ` Keith Owens
@ 2002-07-08 13:15             ` Keith Owens
  0 siblings, 0 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-08 13:15 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

On Mon, 08 Jul 2002 23:06:05 +1000, 
Keith Owens <kaos@ocs.com.au> wrote:
>The current code tries to use 'increment use count outside module' but
>that has its own race in getting the address of the module.  Closing
>that race relies on the interaction between three (yes, three)
>unrelated locks which have to be obtained and released in the right
>order.  Not only is this complex and fragile, a quick scan of the
>kernel found one outright bug and several dubious sections of code.

Correction: make that two outright bugs.  I have just been told that
one of the dubious bits of code is broken.

Tracking the interaction of multiple locks to ensure that they
correctly prevent a race on incrementing the use count is too messy and
fragile.  The offending bits of code were not written by beginners but
by experienced kernel programmers.  If experienced programmers get it
wrong, then the method is wrong.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 12:57             ` Richard B. Johnson
@ 2002-07-08 13:58               ` Thunder from the hill
  2002-07-08 15:48                 ` Daniel Gryniewicz
  0 siblings, 1 reply; 83+ messages in thread
From: Thunder from the hill @ 2002-07-08 13:58 UTC (permalink / raw)
  To: Richard B. Johnson
  Cc: Thunder from the hill, Daniel Phillips, Pavel Machek,
	Stephen C. Tweedie, Bill Davidsen, Linux-Kernel Mailing List

Hi,

On Mon, 8 Jul 2002, Richard B. Johnson wrote:
> > That doesn't sound too clean to me...
> > 
> > Maybe we should lock that module explicitly, instead of halting anything 
> > that is schedule()d.

Not to mention that we even stop our own cleanup thread, looking at this.

> Locking the module does nothing except increase module overhead. The
> premise is that we don't care how long it takes to remove a module. It
> just must be done safely. So, what we need to do it make certain that...
> 
> (1) All calls from the module have returned.
> (2) All calls to the module code have returned.
> (3) All user-access has completed.
> 
> That's what the trap in schedule() does, in conjunction with the
> wait-for-timer-ticks. We don't want to have lots of locks and semiphores
> that have to be accessed during normal execution paths.

Still, we shouldn't lock everything. I could do awful lots of interesting 
things while the only thing that is being done is to remove a module. It 
doesn't make sense IMO to lock things that are completely unrelated to 
modules.

And BTW, what's so much of an overhead if we tell everyone who tries to 
mess around with a certain module that he'd better wait until we unloaded 
it? It could be done like your schedule hack, but cleaner in that respect 
that those who got nothing to do with the module can keep on running.

> Good point. Member usecount could be anything. A 'long' isn't the
> correct pad for all types, but it will probably handle everything that
> was intended.

But as I mentioned - atomic_t is changed (e.g. to long long) -> 
module->pad blows up, because the sizeof(struct module) is different, 
depending on which part of the union we're using.

							Regards,
							Thunder
-- 
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o?  K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y- 
------END GEEK CODE BLOCK------


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 13:58               ` Thunder from the hill
@ 2002-07-08 15:48                 ` Daniel Gryniewicz
  2002-07-08 17:23                   ` Thunder from the hill
  0 siblings, 1 reply; 83+ messages in thread
From: Daniel Gryniewicz @ 2002-07-08 15:48 UTC (permalink / raw)
  To: Thunder from the hill
  Cc: Richard B. Johnson, Daniel Phillips, Pavel Machek,
	Stephen C. Tweedie, Bill Davidsen, Linux-Kernel Mailing List

Okay, maybe this is a bit naive, but isn't this problem already solved? 
Couldn't we just put a read/write lock on the module, where using is
reading, and removing is writing?  As I understand it, this should
impose little overhead on the use (read) case, and ensure that, when a
context has the remove (write) lock there are no no users (readers) and
cannot be any?

Daniel

On Mon, 2002-07-08 at 09:58, Thunder from the hill wrote:
> Hi,
> 
> Still, we shouldn't lock everything. I could do awful lots of interesting 
> things while the only thing that is being done is to remove a module. It 
> doesn't make sense IMO to lock things that are completely unrelated to 
> modules.
> 
> And BTW, what's so much of an overhead if we tell everyone who tries to 
> mess around with a certain module that he'd better wait until we unloaded 
> it? It could be done like your schedule hack, but cleaner in that respect 
> that those who got nothing to do with the module can keep on running.
> 
> > Good point. Member usecount could be anything. A 'long' isn't the
> > correct pad for all types, but it will probably handle everything that
> > was intended.
> 
> But as I mentioned - atomic_t is changed (e.g. to long long) -> 
> module->pad blows up, because the sizeof(struct module) is different, 
> depending on which part of the union we're using.
> 
> 							Regards,
> 							Thunder
> -- 
> (Use http://www.ebb.org/ungeek if you can't decode)
> ------BEGIN GEEK CODE BLOCK------
> Version: 3.12
> GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
> N--- o?  K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
> e++++ h* r--- y- 
> ------END GEEK CODE BLOCK------

-- 
Recursion n.:
        See Recursion.
                        -- Random Shack Data Processing Dictionary



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 15:48                 ` Daniel Gryniewicz
@ 2002-07-08 17:23                   ` Thunder from the hill
  0 siblings, 0 replies; 83+ messages in thread
From: Thunder from the hill @ 2002-07-08 17:23 UTC (permalink / raw)
  To: Daniel Gryniewicz
  Cc: Thunder from the hill, Richard B. Johnson, Daniel Phillips,
	Pavel Machek, Stephen C. Tweedie, Bill Davidsen,
	Linux-Kernel Mailing List

Hi,

On 8 Jul 2002, Daniel Gryniewicz wrote:
> Okay, maybe this is a bit naive, but isn't this problem already solved? 
> Couldn't we just put a read/write lock on the module, where using is
> reading, and removing is writing?  As I understand it, this should
> impose little overhead on the use (read) case, and ensure that, when a
> context has the remove (write) lock there are no no users (readers) and
> cannot be any?

My suggestion could cope with just one lock, but there seems to be 
something speaking against that.

							Regards,
							Thunder
-- 
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o?  K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y- 
------END GEEK CODE BLOCK------


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-07 14:56           ` Keith Owens
  2002-07-07 22:36             ` Roman Zippel
  2002-07-08  1:09             ` Daniel Mose
@ 2002-07-08 18:13             ` Pavel Machek
  2002-07-08 22:43               ` Keith Owens
  2 siblings, 1 reply; 83+ messages in thread
From: Pavel Machek @ 2002-07-08 18:13 UTC (permalink / raw)
  To: Keith Owens; +Cc: Linux-Kernel Mailing List

Hi!

> >> Modules can run their own kernel threads.  When the module shuts down
> >> it terminates the threads but we must wait until the process entries
> >> for the threads have been reaped.  If you are not careful, the zombie
> >> clean up code can refer to the module that no longer exists.  You must
> >> not freeze any threads that belong to the module.
> >
> >Look at the code. freezer will try for 5 seconds, then give up. So, in
> >rare case module has some threads, rmmod will simply fail. I believe
> >we can fix rare remaining modules one by one.
> 
> There is no failure path for rmmod.  Once rmmod sees a use count of 0,
> it must succeed.  Which is why both Rusty and I agree that rmmod must
> be split in two, one piece that is allowed to fail and a second piece
> that is not.
> 
> BTW, freeze_processes() silently ignores zombie processes.  The test is
> not obvious, it is hidden in the INTERESTING macro which has an
> embedded 'continue' to break out of the loop.  Easy to miss.

Is there problems? Zombie processes are basically dead, not
interesting, processes.

> >> You must not freeze any process that has entered the module but not yet
> >> incremented the use count, nor any process that has decremented the use
> >> count but not yet left the module.  Simply looking at the EIP after
> >
> >Look how freezer works. refrigerator() is blocking, by definition. So
> >if all processes reach refrigerator(), and the use count == 0, it is
> >indeed safe to unload.
> 
> freeze_processes()
>   signal_wake_up() - sets TIF_SIGPENDING for other task
>     kick_if_running()
>       resched_task() - calls preempt_disable() for this cpu
>         smp_send_reschedule()
> 	  smp_reschedule_interrupt() - now on another cpu
> 	    ret_from_intr
> 	      resume_kernel - on other cpu
> 
> With CONFIG_PREEMPT, a process running on another cpu without a lock
> when freeze_processes() is called should immediately end up in
> schedule.  I don't see anything in that code path that disables
> preemption on other cpus.  If I am right, then a second cpu could be in
> this window when freeze_processes is called
> 
>   if (xxx->func)
>     xxx->func()

okay, so we have

	if (xxx->func)
		interrupt
			schedule()

but schedule at this point is certainly not going to enter signal
handling code, so refrigerator is deffered at run to userspace, so it
continues with
		interrupt return
		xxx->func()
	...
	attempt to return to userspace
		signal code
			refrigerator().

So I believe that is safe.

> where func is a module function.  There is still a window from loading
> the function address, through calling the function and up to the point
> where the function does MOD_INC_USE_COUNT.  Any reschedule in that
> window opens a race with rmmod.  Without preemption, freeze might be
> safe, with preemption the race is back again.

								Pavel
-- 
Casualities in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 18:13             ` Pavel Machek
@ 2002-07-08 22:43               ` Keith Owens
  2002-07-09 14:00                 ` Pavel Machek
  0 siblings, 1 reply; 83+ messages in thread
From: Keith Owens @ 2002-07-08 22:43 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux-Kernel Mailing List

On Mon, 8 Jul 2002 20:13:31 +0200, 
Pavel Machek <pavel@ucw.cz> wrote:
>Keith Owens wrote
>> BTW, freeze_processes() silently ignores zombie processes.  The test is
>> not obvious, it is hidden in the INTERESTING macro which has an
>> embedded 'continue' to break out of the loop.  Easy to miss.
>
>Is there problems? Zombie processes are basically dead, not
>interesting, processes.

I have had reports of module threads being shut down by the cleanup
routine, the module was removed then the zombie threads were reaped and
got an oops because the module code that spawned them no longer
existed.  It is not clear why this is a problem, I will follow up with
the reporter.

>> freeze_processes()
>>   signal_wake_up() - sets TIF_SIGPENDING for other task
>>     kick_if_running()
>>       resched_task() - calls preempt_disable() for this cpu
>>         smp_send_reschedule()
>>           smp_reschedule_interrupt() - now on another cpu
>>             ret_from_intr
>>               resume_kernel - on other cpu
>> 
>> With CONFIG_PREEMPT, a process running on another cpu without a lock
>> when freeze_processes() is called should immediately end up in
>> schedule.  I don't see anything in that code path that disables
>> preemption on other cpus.  If I am right, then a second cpu could be in
>> this window when freeze_processes is called
>> 
>>   if (xxx->func)
>>     xxx->func()
>
>okay, so we have
>
>	if (xxx->func)
>		interrupt
>			schedule()
>
>but schedule at this point is certainly not going to enter signal
>handling code

With preempt it will.  The interrupt drops back through ret_from_intr
-> resume_kernel.  The preempt count is 0, preemption has only been
disabled on the sending cpu, not the receiving cpu.  need_resched is
entered immediately.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08 22:43               ` Keith Owens
@ 2002-07-09 14:00                 ` Pavel Machek
  0 siblings, 0 replies; 83+ messages in thread
From: Pavel Machek @ 2002-07-09 14:00 UTC (permalink / raw)
  To: Keith Owens; +Cc: Pavel Machek, Linux-Kernel Mailing List

Hi!

> >> freeze_processes()
> >>   signal_wake_up() - sets TIF_SIGPENDING for other task
> >>     kick_if_running()
> >>       resched_task() - calls preempt_disable() for this cpu
> >>         smp_send_reschedule()
> >>           smp_reschedule_interrupt() - now on another cpu
> >>             ret_from_intr
> >>               resume_kernel - on other cpu
> >> 
> >> With CONFIG_PREEMPT, a process running on another cpu without a lock
> >> when freeze_processes() is called should immediately end up in
> >> schedule.  I don't see anything in that code path that disables
> >> preemption on other cpus.  If I am right, then a second cpu could be in
> >> this window when freeze_processes is called
> >> 
> >>   if (xxx->func)
> >>     xxx->func()
> >
> >okay, so we have
> >
> >	if (xxx->func)
> >		interrupt
> >			schedule()
> >
> >but schedule at this point is certainly not going to enter signal
> >handling code
> 
> With preempt it will.  The interrupt drops back through ret_from_intr
> -> resume_kernel.  The preempt count is 0, preemption has only been
> disabled on the sending cpu, not the receiving cpu.  need_resched is
> entered immediately.

Yep but signal handling code will not be entered because signal
handling is only entered when returning to userspace.
								Pavel
-- 
Worst form of spam? Adding advertisment signatures ala sourceforge.net.
What goes next? Inserting advertisment *into* email?

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: simple handling of module removals Re: [OKS] Module removal
  2002-07-08  1:09             ` Daniel Mose
@ 2002-07-09 17:07               ` Daniel Mose
  0 siblings, 0 replies; 83+ messages in thread
From: Daniel Mose @ 2002-07-09 17:07 UTC (permalink / raw)
  To: Linux-Kernel Mailing List

Daniel Mose wrote:
> Keith Owens wrote:
> > On Fri, 5 Jul 2002 15:48:17 +0200, 
> > Pavel Machek <pavel@ucw.cz> wrote:
> > >Keith Owens wrote
> > >> Modules can run their own kernel threads.  When the module shuts down
> > >> it terminates the threads but we must wait until the process entries
> > >> for the threads have been reaped.  If you are not careful, the zombie
> > >> clean up code can refer to the module that no longer exists.  You must
> > >> not freeze any threads that belong to the module.
> 
> I am just out to share a possible angle. -all though not really a programmer.
> 
> Can one module replace another -running- twin-module without having to hand-
> shake with the kernel? -and without freezing ? (transparently)
> 
> If obvious no - no need to read further.
. . .

As some conversations have implied rently in the thread, that it would be 
possible to disconnect,and re-connect the original module - This might also be.

But I will not pretend that I know anything. Just trying to use a bit of wits
It is only an angle. And I'm not exactly boored with Ansi-drawing.

HWclock.
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=->-=-=-=-=-=-=-=-=-
1.    				2.			    3.
  derivate.o==== |L            derivate.o====#|L  100kB    derivate.o====|L=#	
    | G'day!     |i @            A           |i  @   in       |Bye!      |i @
    V		 |n #  1MB       | Up2date?  |n      use      V          |n 
realmodule.o=====|u=#@  in   realmodule.o====#|u=#       realmodule.o=== |u
                  x     use                       x                          x
HWclock.
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=->-=-=-=-=-=-=-=-=-
4.    	 Users=0?              5.    Users=0?		 6.    Users=0?
/dev/0<<-derivate.o=<<=|L# /dev/0xx derivate.o xx|Lx /dev/0<<derivate.o====|L#
              \_EOT=>>=|i#@           Me uglu!>>#|i#@      >> Me brken!=>>=|i#@
                       |n                        |n                        |n
    ( re lm dule.o )== |u         r  lm d l .o   |u                d l .   |u
                        x                         x                         x		
HWclock
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=->-=-=-=-=-=-=-=-=-
7.                                            8.             
                  n         n         n                        |L
derivate.o x|x    i u	>   u i   >   i u             - \ -    |i
I am 0+!         L   x     x   L      L x                      |n
                                                               |u
                                                                x

No this is not really what happened or?.

Actually what can happen is <of course anything>, but hopefully that the "real"
module is removed safely and replaced by a dummy permanently, or until another 
module (maybe the original "real" module) is loaded back into the kernel.
Perhaps the kernel doesn't get to free all of the memory pages that were 
allocated to the "real" module, But the original module is gone, which is what 
was desired, no?  The dummy should probably be designed to be able to answer 
kernel processes in a sort of lame way (rest easy), as well as to listen for
user processes, to deny them acess, if it is not able to unload completely.

I think of this as both a workaround And a design issue. Dummy modules is
not something new. It has been (and is) used to trick kernel processes before.
(and now)

With dummy modules you will probably not have to risk either the functionality 
of the kernel it self, while running, by severe re-programming, opening up new 
possible bug holes, nor the Real, original modules brilliant functionality that
we might want to unload just to momentarily NOT achieve this purpose.

There is just one BIG issue here as far as I'm concerned. 

I Don't know if theese are relevant thoughts or not, since I lack SeVereLy in 
Experience of Linux processing shapes, and communication features. 
So I'm fumbelling more than just a bit here.

(But at Least I'm not having High school summer classes in Fundamental C 
programming rules on Linux-kernel Mailing list. ;-). 
No one named or forgotten.)

I use rmmod  as a security feature at home. What the kernel don't know about, 
it can't reveal to possible intruders -right? So I use a ping script, and 
mailto, to unload eth0 from one of my lan boxes, while dialing on line, so 
I really DO load and UN-load modules all the time. ( 2.2.12 )

---

Image:

- makes sure u make sense.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-01 17:48 [OKS] Module removal Bill Davidsen
                   ` (4 preceding siblings ...)
  2002-07-03  0:09 ` Vojtech Pavlik
@ 2002-07-12 21:51 ` David Lang
  5 siblings, 0 replies; 83+ messages in thread
From: David Lang @ 2002-07-12 21:51 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux-Kernel Mailing List

One oddball use of module removal.

On my tivo I have a module loaded that disables scrambling of the shows it
records (allowing me to download them to standard mpeg files) I need to
not have this module loaded to watch shows that were recorded without the
module loaded.

if module unloading is removed then I will have to reboot my tivo to
unload this module.

yes this module could probably be rewritten to have an interface that
allowed the encryption to be enabled or disabled, but that makes the
module significanlty more complicated then mearly implementing null
functions to replace the existing encryption functions.

David Lang


 On Mon, 1 Jul 2002, Bill Davidsen wrote:

> Date: Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
> From: Bill Davidsen <davidsen@tmr.com>
> To: Linux-Kernel Mailing List <linux-kernel@vger.kernel.org>
> Subject: [OKS] Module removal
>
> Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
> comments on points raied.
>
> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
>
> 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> better (or usefully at all) with one or the other driver used to read CDs.
> I've noted that several programs for reading the image from an ISO CD
> (readcd and sdd) have end of data problems with the SCSI interface.
>
> 2 - restarting NICs when total reinitialization is needed. In server
> applications it's sometimes necessary to move or clear a NIC connection,
> force renegotiation because the blade on the switch was set wrong, etc.
> It's preferable to take down one NIC for a moment than suffer a full
> outage via reboot.
>
> --
> bill davidsen <davidsen@tmr.com>
>   CTO, TMR Associates, Inc
> Doing interesting things with little computers since 1979.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-09  8:59                         ` Kai Germaschewski
@ 2002-07-09 11:06                           ` Roman Zippel
  0 siblings, 0 replies; 83+ messages in thread
From: Roman Zippel @ 2002-07-09 11:06 UTC (permalink / raw)
  To: Kai Germaschewski; +Cc: Keith Owens, linux-kernel

Hi,

On Tue, 9 Jul 2002, Kai Germaschewski wrote:

> I tend to see this differently: cleanup_module() cannot fail, but it can
> sleep. So it's perfectly fine to deregister, wait until all references are
> gone, clean up and return. So a kind of two-stage unregister is already
> happening.

That's a possibility, if you can live with a noninterruptable rmmod
process sleeping for a very long time...

> It's different in that it does use explicit refcounting, but
> when the right interfaces are provided, the driver author doesn't need to
> care - the author should just call pci_unregister/netdev_unregister/..,
> that'll sleep until all references are gone (which also means no one will
> use callbacks into the module anymore) and be done.

The unregister function can't prevent someone from start using the device
again (at least not with reasonable effort), but it can detect this. The
author should just check the return value, like he already (hopefully)
does during initialization.

bye, Roman


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-09  6:25                       ` Roman Zippel
@ 2002-07-09  8:59                         ` Kai Germaschewski
  2002-07-09 11:06                           ` Roman Zippel
  0 siblings, 1 reply; 83+ messages in thread
From: Kai Germaschewski @ 2002-07-09  8:59 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Keith Owens, linux-kernel

On Tue, 9 Jul 2002, Roman Zippel wrote:

> Who is changing the use count? How do you ensure that someone doesn't
> cache a reference past that magic sync point?
> IMO the real problem is that cleanup_module() can't fail. If modules
> already do proper data acounting, it's not too difficult to detect if
> there still exists a reference. In the driverfs example the problem is
> that remove_driver() returns void instead of int, so it can't fail.

I tend to see this differently: cleanup_module() cannot fail, but it can 
sleep. So it's perfectly fine to deregister, wait until all references are 
gone, clean up and return. So a kind of two-stage unregister is already 
happening. 

It's different in that it does use explicit refcounting, but 
when the right interfaces are provided, the driver author doesn't need to 
care - the author should just call pci_unregister/netdev_unregister/.., 
that'll sleep until all references are gone (which also means no one will 
use callbacks into the module anymore) and be done.

--Kai



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-09  0:27                     ` Keith Owens
  2002-07-09  3:09                       ` Werner Almesberger
@ 2002-07-09  6:25                       ` Roman Zippel
  2002-07-09  8:59                         ` Kai Germaschewski
  1 sibling, 1 reply; 83+ messages in thread
From: Roman Zippel @ 2002-07-09  6:25 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

Hi,

On Tue, 9 Jul 2002, Keith Owens wrote:

>   Check use count.
>   Unregister.
>   Ensure that all code that had a reference to the module has either
>   dropped that reference or bumped the use count.
>   Check use count again, it may have been bumped above.
>   If use count is still 0, there are no stale references left.  It is
>   now safe to unload, call the module unallocate routine then free its
>   area.
>   If use count != 0, either schedule for removal when the count goes to
>   0 or reregister.  I prefer reregister, Rusty prefers delayed removal.
>
> Ensuring that all code has either dropped stale references or bumped
> the use count means that all processes that were not sleeping when
> rmmod was started must proceed to a sync point.  It is (and always has
> been) illegal to sleep while using module code/data (note: using, not
> referencing).

Who is changing the use count? How do you ensure that someone doesn't
cache a reference past that magic sync point?
IMO the real problem is that cleanup_module() can't fail. If modules
already do proper data acounting, it's not too difficult to detect if
there still exists a reference. In the driverfs example the problem is
that remove_driver() returns void instead of int, so it can't fail.

bye, Roman


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-09  0:27                     ` Keith Owens
@ 2002-07-09  3:09                       ` Werner Almesberger
  2002-07-09  6:25                       ` Roman Zippel
  1 sibling, 0 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-09  3:09 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

Keith Owens wrote:
> The only difference between module and non-module is whether the code
> and data areas are freed or left in the kernel for the next user.

Yes, that's exactly why the entry-after-removal problem worries me:
this is something that also means trouble if you're a non-module. 

Just remember the "timer pops after del_timer" problem a while back.
De-registration that leaves stuff behind is extremely dangerous, no
matter if you're a module or not.

I fully agree that return-after-removal is a different issue and
that there are several perfectly reasonable ways for dealing with
this.

> Ensuring that all code has either dropped stale references or bumped
> the use count means that all processes that were not sleeping when
> rmmod was started must proceed to a sync point.  It is (and always has
> been) illegal to sleep while using module code/data (note: using, not
> referencing).

That's the problem with entry-after-removal: you may simply not be
able to know when all references are gone.

Since basically all references to modules go through some sort of
registration function, why not fix the registration functions ?

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-09  1:50           ` Kevin O'Connor
@ 2002-07-09  2:37             ` Werner Almesberger
  0 siblings, 0 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-09  2:37 UTC (permalink / raw)
  To: Kevin O'Connor; +Cc: Bill Davidsen, Keith Owens, linux-kernel

Kevin O'Connor wrote:
> I think the above works, but the locking is a bit hairy.  How about the
> following.  (Taxonomy 3)

Looks like an efficient and almost correct implementation of this,
yes.

One bug: you should use yield() instead of schedule() (2.5) or do
	current->policy |= SCHED_YIELD;
before calling schedule (2.4). Or play it extra-safe and use a
traditional wait queue, even though this has more overhead.

> I get the feeling you already had something like this in mind, but your
> example above was rather complicated.  The latter implementation is just
> standard reference counting.

Yup, my example was simply an approximation of the current module
code, with the obvious races fixed.

> With a standard reference counting implementation I don't see any of these
> issues being a problem..

Amazing, isn't it ? :-)

> Did I miss something?

I see two possible reasons why somebody may not want this solution:

 - those wishing to squeeze the very last bit of performance out of
   this may find holding the spin lock during the call a better
   compromise than the reference counting
 - in some cases, there could be a lock that's being held by the
   caller of foo_deregister and that's also needed by
   bar_whatever_op. In this case, you have to drop the lock while
   waiting (or re-think the locking design).

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-04  6:29         ` Werner Almesberger
  2002-07-04  6:50           ` Werner Almesberger
  2002-07-04 12:29           ` Bill Davidsen
@ 2002-07-09  1:50           ` Kevin O'Connor
  2002-07-09  2:37             ` Werner Almesberger
  2 siblings, 1 reply; 83+ messages in thread
From: Kevin O'Connor @ 2002-07-09  1:50 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Bill Davidsen, Keith Owens, linux-kernel

On Thu, Jul 04, 2002 at 03:29:29AM -0300, Werner Almesberger wrote:
> This certainly seems to be the most understandable way, yes. I think
> modules follow basically the principle illistrated below (that's case
> 4b in the taxonomy I posted earlier), where deregistration doesn't
> stop everything, but which should be safe except for
> return-after-removal:
> 
> foo_dispatch(...)
> {
>     read_lock(global_foo_lock);
>     ...
>     xxx = find_stuff();
>     ...
>     xxx->whatever_op(xxx->data);
> }
> 
> foo_deregister(...)
> {
>     write_lock(global_foo_lock);
>     ...
>     write_unlock(global_foo_lock);
> }
> 
> bar_whatever_op(my_data)
> {   
>     lock(my_data); /* think MOD_INC_USE_COUNT */
>     ...
>     read_unlock(global_foo_lock);
>     ...
>     unlock(my_data); /* think MOD_DEC_USE_COUNT */
>     /* return-after-removal race if my_data also protects code,
>        not only data ! */
> }
> 
> bar_main(...)
> {
>     foo_register(&bar_ops,&bar_data);
>     ...
>     foo_deregister(&bar_ops);
>     ...
>     lock(bar_data); /* barrier */
>     unlock(bar_data);
>     ...
>     destroy(&bar_data);
> }

Hi Werner,

I think the above works, but the locking is a bit hairy.  How about the
following.  (Taxonomy 3)

foo_dispatch(...)
{
    spin_lock(global_foo_lock);
    xxx = find_stuff();            // atomic_inc(xxx->refcnt)
    spin_unlock(global_foo_lock);
    ...
    xxx->whatever_op(xxx->data);
    ...
    put_stuff(xxx);                // atomic_dec(xxx->refcnt)
}

foo_deregister(...)
{
    spin_lock(global_foo_lock);
    remove_stuff(xxx);
    spin_unlock(global_foo_lock);

    while (atomic_read(xxx->refcnt))
        schedule();
}

bar_whatever_op(my_data)
{   
    /* no return-after-removal race because bar is pinned. */
}

bar_main(...)
{
    foo_register(&bar_ops,&bar_data);
    ...
    foo_deregister(&bar_ops);

    /* Now guaranteed that no references to bar_ops exist, and guaranteed
       that no concurrent accesses exist. */

    destroy(&bar_data);
}


I get the feeling you already had something like this in mind, but your
example above was rather complicated.  The latter implementation is just
standard reference counting.

> [...]
> I can understand why people don't want to use totally "safe"
> deregistration, e.g.
> 
>  - locking gets more complex and you may run into hairy deadlock
>    scenarios
>  - accomplishing timely removal may become more difficult
>  - nobody likes patches that change the world

With a standard reference counting implementation I don't see any of these
issues being a problem..

Did I miss something?
-Kevin

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-09  0:07                   ` Werner Almesberger
@ 2002-07-09  0:27                     ` Keith Owens
  2002-07-09  3:09                       ` Werner Almesberger
  2002-07-09  6:25                       ` Roman Zippel
  0 siblings, 2 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-09  0:27 UTC (permalink / raw)
  To: linux-kernel

On Mon, 8 Jul 2002 21:07:13 -0300, 
Werner Almesberger <wa@almesberger.net> wrote:
>What troubles me most in this discussion is that nobody seems to
>think of how these issues affect non-modules.
>
>Entry-after-removal is an anomaly in the subsystem making that call.
>Fixing it for modules is fine as long as only modules will ever
>un-register, but what do you do if non-module code ever wants to
>un-register too ? (E.g. think hot-plugging or software devices.)

The only difference between module and non-module is whether the code
and data areas are freed or left in the kernel for the next user.  All
other processing should be the same for hotplug.  This is why
CONFIG_HOTPLUG retains the __exit code in the kernel instead of
discarding it at link time.

>Besides, a common pattern for those "hold this down, kick that, then
>slowly count to ten, and now it's safe to release" solutions seems
>to be that it takes roughly ten times as long to find the race
>condition buried deep within them, than it took for the respective
>predecessor.

I don't like 'wait for n seconds then release' methods, as you say they
just hide the problem.  Neither Rusty nor I are suggesting that
approach.  Rusty's idea is

  Check use count.
  Unregister.
  Ensure that all code that had a reference to the module has either
  dropped that reference or bumped the use count.
  Check use count again, it may have been bumped above.
  If use count is still 0, there are no stale references left.  It is
  now safe to unload, call the module unallocate routine then free its
  area.
  If use count != 0, either schedule for removal when the count goes to
  0 or reregister.  I prefer reregister, Rusty prefers delayed removal.

Ensuring that all code has either dropped stale references or bumped
the use count means that all processes that were not sleeping when
rmmod was started must proceed to a sync point.  It is (and always has
been) illegal to sleep while using module code/data (note: using, not
referencing).


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-08  0:46                 ` Jamie Lokier
  2002-07-08  6:22                   ` Daniel Phillips
@ 2002-07-09  0:07                   ` Werner Almesberger
  2002-07-09  0:27                     ` Keith Owens
  1 sibling, 1 reply; 83+ messages in thread
From: Werner Almesberger @ 2002-07-09  0:07 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Bill Davidsen, Keith Owens, linux-kernel

Jamie Lokier wrote:
> The more I think about it, the more I think the `owner' thing in
> file_operations et al. is the right thing in all cases,

What troubles me most in this discussion is that nobody seems to
think of how these issues affect non-modules.

Entry-after-removal is an anomaly in the subsystem making that call.
Fixing it for modules is fine as long as only modules will ever
un-register, but what do you do if non-module code ever wants to
un-register too ? (E.g. think hot-plugging or software devices.)

Invent some pseudo-module framework for it ? Require all code using
that subsystem to be a module ? What happens if there are multiple
registrations per chunk of code ?

Besides, a common pattern for those "hold this down, kick that, then
slowly count to ten, and now it's safe to release" solutions seems
to be that it takes roughly ten times as long to find the race
condition buried deep within them, than it took for the respective
predecessor.

I'm not sure the incremental band aid approach works in this case.

I'm sorry that I don't have to offer anything more constructive at
the moment, but I first want to build some testing framework. One
oops says more than ten pages of analysis explaining a race
condition :-)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-08  6:22                   ` Daniel Phillips
@ 2002-07-08 10:16                     ` Bill Davidsen
  0 siblings, 0 replies; 83+ messages in thread
From: Bill Davidsen @ 2002-07-08 10:16 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Jamie Lokier, Werner Almesberger, Keith Owens, linux-kernel

On Mon, 8 Jul 2002, Daniel Phillips wrote:

> That's not nice.  It requires the calling code to know it's calling a
> module and it imposes the inc/dec overhead on callers even when the
> target isn't compiled as a module.

I don't think so... the module knows, there's a bit of hard code for the
module anyway, so that could do the inc/dec and know when the module was
in the "about to be removed" state. It's not clear if "it works most of
the time now" means we just need to address some simple corner cases or if
the whole design of modules needs to be reimplemented to really sweep out
the corners, and they're not "simple" at all.

This has the feeling that someone could say "we could just..." at any
time, and the whole problem would vanish.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-08  0:31                 ` Jamie Lokier
@ 2002-07-08  6:42                   ` Oliver Neukum
  0 siblings, 0 replies; 83+ messages in thread
From: Oliver Neukum @ 2002-07-08  6:42 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Werner Almesberger, Bill Davidsen, Keith Owens, linux-kernel


> Catching the entry points is what the current `try_inc_mod_count' code
> does.  I can't think of another way to do that.

The old way "2.2-rules" are safe on UP without preempt. So if you
temporarily, through the freeze hack, can introduce these conditions,
you've solved it. Except that you need to deal with a failure due to
an elevated usage count.

IMHO you gain very little by finding partial solutions which won't
help you in the hard cases. Module unload is very rare. It just needs
to work. You need to optimise use of modules, not the unload.
Strictly speaking, by having owner fields you punish drivers compiled
statically. The effect is small and not worth doing something about
it, but it should show the direction.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-08  0:46                 ` Jamie Lokier
@ 2002-07-08  6:22                   ` Daniel Phillips
  2002-07-08 10:16                     ` Bill Davidsen
  2002-07-09  0:07                   ` Werner Almesberger
  1 sibling, 1 reply; 83+ messages in thread
From: Daniel Phillips @ 2002-07-08  6:22 UTC (permalink / raw)
  To: Jamie Lokier, Bill Davidsen; +Cc: Werner Almesberger, Keith Owens, linux-kernel

On Monday 08 July 2002 02:46, Jamie Lokier wrote:
> Bill Davidsen wrote:
> > I think you have to do it with the use count, and there may well be
> > modules you can't remove safely.
> 
> I agree, this is the correct and clean thing to do.
> 
> It rather implies that any function in a module which calls
> MOD_{INC,DEC}_USE_COUNT should always be called from a non-module
> function which _itself_ protects the module from removal by temporarily
> bumping the use count.

That's not nice.  It requires the calling code to know it's calling a
module and it imposes the inc/dec overhead on callers even when the
target isn't compiled as a module.

-- 
Daniel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-08  0:18               ` Bill Davidsen
@ 2002-07-08  0:46                 ` Jamie Lokier
  2002-07-08  6:22                   ` Daniel Phillips
  2002-07-09  0:07                   ` Werner Almesberger
  0 siblings, 2 replies; 83+ messages in thread
From: Jamie Lokier @ 2002-07-08  0:46 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Werner Almesberger, Keith Owens, linux-kernel

Bill Davidsen wrote:
> I think you have to do it with the use count, and there may well be
> modules you can't remove safely.

I agree, this is the correct and clean thing to do.

It rather implies that any function in a module which calls
MOD_{INC,DEC}_USE_COUNT should always be called from a non-module
function which _itself_ protects the module from removal by temporarily
bumping the use count.

> But to add re-init code to modules,
> define new ioctls to call it, etc, etc, doesn't seem satisfactory. I think
> we really need to bump the use counter more carefully, to really know when
> a module is in use, and when we can clear it out.
>
> The smp case looks doable, the preempt case may be harder. I really like
> the idea of simply queueing a remove and then doing it when the use count
> drops to zero. But we have have to provide a "don't use" to keep new
> things out. That's hard.

That's already done in `try_inc_mod_count', it's just a bit slow.  But
arguably it's only impact is when you're getting a handle or creating an
object, which is usually relatively slow anyway, such as opening a
device or socket, or adding a firewall rule.

The more I think about it, the more I think the `owner' thing in
file_operations et al. is the right thing in all cases, and that Al Viro
is right about the overhead being reasonable.  Perhaps the interface
could be made a little more generic (`{get,put}_module_reference'?), and
double check the corner cases such as when a module is in "don't use"
mode, blocking and scheduling a reload.

-- Jamie

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-07 21:41               ` Oliver Neukum
@ 2002-07-08  0:31                 ` Jamie Lokier
  2002-07-08  6:42                   ` Oliver Neukum
  0 siblings, 1 reply; 83+ messages in thread
From: Jamie Lokier @ 2002-07-08  0:31 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Werner Almesberger, Bill Davidsen, Keith Owens, linux-kernel

Oliver Neukum wrote:
> How do you find CPU's that are about to execute module code ?
> 
> IMHO you need to do this freeze trick before you check the module
> usage count.
> 
> [..]
> > Another possibility would be the RCU thing: execute the module's exit
> > function, but keep the module's memory allocated until some safe
> > scheduling point later, when you are sure that no CPU can possibly be
> > running the module.
> 
> But what do you do if that CPU increases the module usage count?

Those are the cases where I said this does not help.
You basically need:

      (a) to catch the exiting case properly
      (b) to catch entry points

Catching the entry points is what the current `try_inc_mod_count' code
does.  I can't think of another way to do that.

-- Jamie

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-07 21:09             ` Jamie Lokier
  2002-07-07 21:41               ` Oliver Neukum
@ 2002-07-08  0:18               ` Bill Davidsen
  2002-07-08  0:46                 ` Jamie Lokier
  1 sibling, 1 reply; 83+ messages in thread
From: Bill Davidsen @ 2002-07-08  0:18 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Werner Almesberger, Keith Owens, linux-kernel

On Sun, 7 Jul 2002, Jamie Lokier wrote:

> Werner Almesberger wrote:
> > Okay, here's an almost correct example (except for the usual
> > return-after-removal, plus an unlikely data race described
> > below). foo_1 runs first:
> 
> This can be fixed if we assume a way to ask "is any CPU still executing
> module code?".
> 
> To do this, have the `free_module' function use `smp_call_function' to
> ask every CPU "are you executing code for module XXX?".  The question is
> answered by a routine which walks the stack, checking the instruction
> pointer at each place on the stack to see whether it's inside the module
> of interest.
> 
> Yes this is complex, but it's not that complex -- provided you can rely
> on stack walking to find the module.  (It wouldn't work if x86 used
> -fomit-frame-pointer, for example).

Complex, may not be reliable, and the question is "are you now or will you
ever run code in or for module XXX" because there may be pointers for
threads, interrupt handlers, etc. Gets ugly doing it from the back.

I think you have to do it with the use count, and there may well be
modules you can't remove safely. But to add re-init code to modules,
define new ioctls to call it, etc, etc, doesn't seem satisfactory. I think
we really need to bump the use counter more carefully, to really know when
a module is in use, and when we can clear it out.

The smp case looks doable, the preempt case may be harder. I really like
the idea of simply queueing a remove and then doing it when the use count
drops to zero. But we have have to provide a "don't use" to keep new
things out. That's hard.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-07 21:09             ` Jamie Lokier
@ 2002-07-07 21:41               ` Oliver Neukum
  2002-07-08  0:31                 ` Jamie Lokier
  2002-07-08  0:18               ` Bill Davidsen
  1 sibling, 1 reply; 83+ messages in thread
From: Oliver Neukum @ 2002-07-07 21:41 UTC (permalink / raw)
  To: Jamie Lokier, Werner Almesberger; +Cc: Bill Davidsen, Keith Owens, linux-kernel

Hi,

> To do this, have the `free_module' function use `smp_call_function' to
> ask every CPU "are you executing code for module XXX?".  The question is
> answered by a routine which walks the stack, checking the instruction
> pointer at each place on the stack to see whether it's inside the module
> of interest.
>
> Yes this is complex, but it's not that complex -- provided you can rely
> on stack walking to find the module.  (It wouldn't work if x86 used
> -fomit-frame-pointer, for example).

How do you find CPU's that are about to execute module code ?

IMHO you need to do this freeze trick before you check the module
usage count.

[..]
> Another possibility would be the RCU thing: execute the module's exit
> function, but keep the module's memory allocated until some safe
> scheduling point later, when you are sure that no CPU can possibly be
> running the module.

But what do you do if that CPU increases the module usage count?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-04  6:50           ` Werner Almesberger
@ 2002-07-07 21:09             ` Jamie Lokier
  2002-07-07 21:41               ` Oliver Neukum
  2002-07-08  0:18               ` Bill Davidsen
  0 siblings, 2 replies; 83+ messages in thread
From: Jamie Lokier @ 2002-07-07 21:09 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Bill Davidsen, Keith Owens, linux-kernel

Werner Almesberger wrote:
> Okay, here's an almost correct example (except for the usual
> return-after-removal, plus an unlikely data race described
> below). foo_1 runs first:

This can be fixed if we assume a way to ask "is any CPU still executing
module code?".

To do this, have the `free_module' function use `smp_call_function' to
ask every CPU "are you executing code for module XXX?".  The question is
answered by a routine which walks the stack, checking the instruction
pointer at each place on the stack to see whether it's inside the module
of interest.

Yes this is complex, but it's not that complex -- provided you can rely
on stack walking to find the module.  (It wouldn't work if x86 used
-fomit-frame-pointer, for example).

So module removal works like this:

	int free_module(...)
	{
		/* ...stuff... */
		if (module->uc.usecount != 0)
			return -EAGAIN;
		if (smp_call_function (check_if_in_module, module, 1, 1))
			return -EGAIN;

		/* remove module. */
	}

Admittedly, this _only_ deals with this particular class of race
conditions.

Another possibility would be the RCU thing: execute the module's exit
function, but keep the module's memory allocated until some safe
scheduling point later, when you are sure that no CPU can possibly be
running the module.

These don't help at all with races against parallel opens -- food for
thought, though.

-- Jamie

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-04  6:29         ` Werner Almesberger
  2002-07-04  6:50           ` Werner Almesberger
@ 2002-07-04 12:29           ` Bill Davidsen
  2002-07-09  1:50           ` Kevin O'Connor
  2 siblings, 0 replies; 83+ messages in thread
From: Bill Davidsen @ 2002-07-04 12:29 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Keith Owens, linux-kernel

On Thu, 4 Jul 2002, Werner Almesberger wrote:

> Bill Davidsen wrote:
> > Isn't the right thing to make everything stop using the module before
> > ending it, for any definition of ending?
> 
> This certainly seems to be the most understandable way, yes. I think
> modules follow basically the principle illistrated below (that's case
> 4b in the taxonomy I posted earlier), where deregistration doesn't
> stop everything, but which should be safe except for
> return-after-removal:

There seems no right thing to do when a module get a service request for
something which is not active. Anything at that point would be likely to
corrupt data structures, oops, or leave a process in some unkillable
state.

> I can understand why people don't want to use totally "safe"
> deregistration, e.g.
> 
>  - locking gets more complex and you may run into hairy deadlock
>    scenarios
>  - accomplishing timely removal may become more difficult
>  - nobody likes patches that change the world

Everybody loves them in development kernels ;-) Actually, I think this is
unlikely to result in more than a hang if it fails one way, or no worse
than what we have if it fails the other.

And if I understand the problem, it only is needed when either smp or
preempt are built in, so there would be no overhead for small systems, one
other possible objection to doing it safely.
 
> So the entry/return-after-removal issues may still need to be
> resolved. I'm mainly worried about entry-after-removal, because
> it seems to me that this is likely to imply data races in
> non-modules too, so try_inc_mod_count neither sufficient (for it
> doesn't help with non-modules) nor required (because fixing the
> race would also eliminate entry-after-removal).
> 
> I've seen very few reponses to my analysis. Does everybody think
> I'm nuts (quite possible :-), or is this worth continuing ?

My read is that eliminating actual unloading of modules won't solve these
issues, it would just change them. Therefore if you're having a good time
looking at this I think it would make the kernel more stable, which will
be a goal of 2.6.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-04  6:29         ` Werner Almesberger
@ 2002-07-04  6:50           ` Werner Almesberger
  2002-07-07 21:09             ` Jamie Lokier
  2002-07-04 12:29           ` Bill Davidsen
  2002-07-09  1:50           ` Kevin O'Connor
  2 siblings, 1 reply; 83+ messages in thread
From: Werner Almesberger @ 2002-07-04  6:50 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Keith Owens, linux-kernel

I wrote:
> This is correct if we can be sure that the use count never reaches
> 0 here, but then the whole inc/dec exercise is probably redundant.
> ("probably" as in "it doesn't have to be, but I'd be surprised if
> it isn't"; I'll give an example in a later posting.)

Okay, here's an almost correct example (except for the usual
return-after-removal, plus an unlikely data race described
below). foo_1 runs first:


foo_1(...)
{
    MOD_INC_USE_COUNT;
    ...
    initiate_asynchronous_execution_of(foo_2);
    rendezvous(foo_a);
        /* wait until foo_2 has incremented the use count */
    ...
    MOD_DEC_USE_COUNT;
    ...
    rendezvous(foo_b); /* release foo_2 */
    /* cool return-after-removal race */
}   

foo_2(...)
{
    MOD_INC_USE_COUNT;
    rendezvous(foo_a);
    rendezvous(foo_b);
    MOD_DEC_USE_COUNT;
}


(The pseudo-primitive redezvous(X) stops execution until all
"threads" have reached that point, then they all continue.)

I think the easiest solution is to simply declare such constructs
illegal.

Note that I'm using "return" loosely - whatever is used to implement
rendezvous() may execute instructions after unblocking, which would
race with removal too. Also note that in this case, we may, at least
theoretically, data race with removal if accessing foo_b after
unblocking foo_2.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-04  4:11       ` Bill Davidsen
@ 2002-07-04  6:29         ` Werner Almesberger
  2002-07-04  6:50           ` Werner Almesberger
                             ` (2 more replies)
  0 siblings, 3 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-04  6:29 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Keith Owens, linux-kernel

Bill Davidsen wrote:
> Isn't the right thing to make everything stop using the module before
> ending it, for any definition of ending?

This certainly seems to be the most understandable way, yes. I think
modules follow basically the principle illistrated below (that's case
4b in the taxonomy I posted earlier), where deregistration doesn't
stop everything, but which should be safe except for
return-after-removal:


foo_dispatch(...)
{
    read_lock(global_foo_lock);
    ...
    xxx = find_stuff();
    ...
    xxx->whatever_op(xxx->data);
}

foo_deregister(...)
{
    write_lock(global_foo_lock);
    ...
    write_unlock(global_foo_lock);
}

bar_whatever_op(my_data)
{   
    lock(my_data); /* think MOD_INC_USE_COUNT */
    ...
    read_unlock(global_foo_lock);
    ...
    unlock(my_data); /* think MOD_DEC_USE_COUNT */
    /* return-after-removal race if my_data also protects code,
       not only data ! */
}

bar_main(...)
{
    foo_register(&bar_ops,&bar_data);
    ...
    foo_deregister(&bar_ops);
    ...
    lock(bar_data); /* barrier */
    unlock(bar_data);
    ...
    destroy(&bar_data);
}


/* I'm a lazy bastard for not validating this with something like
   Spin, so there may be races left. */

Now there are quite a few things you can omit, without ever noticing
problems. E.g. the read_unlock from bar_whatever_op could be before
the call to whatever_op in foo_dispatch, and all you get is a tiny
bar_whatever_op vs. destroy race. Likewise, moving the barrier in
bar_main may go unnoticed for a long time. If bar_data is never
destroyed, or synchronized by some other means, most of the locking
is actually superfluous, so in many situations, using this model
without strictly following every single detail is actually okay, at
last as far as data races are concerned.

So I think we need to distinguish the entry-after-removal race and
the return-after-removal race, because the former may also be a data
race, while the latter is typically only a code race. (The latter
becomes a data race if you touch shared data after releasing the
lock, but this would be a fairly obvious mistake.)

The entry-after-removal race is not module specific, and may exist
as a true data race in the kernel. (Didn't search for it - I'm not
happy with the fact that I wouldn't be able to catch it in the act
anyway, so I'd rather play a bit with infrastructure first.)

The return-after-removal race could be solved by not deallocating
module memory, or - probably cheaper, but requiring either code
changes or advanced gcc wizardry - by using decrement_and_return.

By the way, there are cases where MOD_DEC_USE_COUNT is followed by
code other than a direct return, e.g. (arbitrary example) in
drivers/net/aironet4500_core.c

This is correct if we can be sure that the use count never reaches
0 here, but then the whole inc/dec exercise is probably redundant.
("probably" as in "it doesn't have to be, but I'd be surprised if
it isn't"; I'll give an example in a later posting.)

However, if this MOD_DEC_USE_COUNT may ever decrement the count to
zero, we have something worse than the usual return-after-removal
race. This may even be a data race, so such code is suspicious in
any case.

> It seems slightly like that tree falling in the forest, and no one to hear
> it. Much easier to handle removal right than service requests after close. 

I can understand why people don't want to use totally "safe"
deregistration, e.g.

 - locking gets more complex and you may run into hairy deadlock
   scenarios
 - accomplishing timely removal may become more difficult
 - nobody likes patches that change the world

So the entry/return-after-removal issues may still need to be
resolved. I'm mainly worried about entry-after-removal, because
it seems to me that this is likely to imply data races in
non-modules too, so try_inc_mod_count neither sufficient (for it
doesn't help with non-modules) nor required (because fixing the
race would also eliminate entry-after-removal).

I've seen very few reponses to my analysis. Does everybody think
I'm nuts (quite possible :-), or is this worth continuing ?

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 16:36     ` Werner Almesberger
  2002-07-02 16:50       ` Benjamin Herrenschmidt
@ 2002-07-04  4:11       ` Bill Davidsen
  2002-07-04  6:29         ` Werner Almesberger
  1 sibling, 1 reply; 83+ messages in thread
From: Bill Davidsen @ 2002-07-04  4:11 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Keith Owens, linux-kernel

On Tue, 2 Jul 2002, Werner Almesberger wrote:

> Actually, if module exit synchronizes properly, even the
> return-after-removal case shouldn't exist, because we'd simply
> wait for this call to return.
> 
> Hmm, interesting. Did I just make the whole problem go away,
> or is travel fatigue playing tricks on my brain ? :-)

You redefined it in what might be a useful way...

My question is, if the race condition involves use of a module after it's
removed, is removing it and leaving it in memory going to be better? With
a driver for an unregistered device be more likely do the right thing even
if it's in memory? Isn't the right thing to make everything stop using the
module before ending it, for any definition of ending? Because I certainly
can't define what the right thing to do would be otherwise, at least in
any general sense.

It seems slightly like that tree falling in the forest, and no one to hear
it. Much easier to handle removal right than service requests after close. 

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 16:50       ` Benjamin Herrenschmidt
  2002-07-02 18:05         ` Werner Almesberger
@ 2002-07-03  3:50         ` Pavel Machek
  1 sibling, 0 replies; 83+ messages in thread
From: Pavel Machek @ 2002-07-03  3:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Werner Almesberger, Keith Owens, linux-kernel

Hi!

> >> It's not really just the module information. If I can, say, get
> >> callbacks from something even after I unregister, I may well
> >> have destroyed the data I need to process the callbacks, and
> >> oops or worse.
> >
> >Actually, if module exit synchronizes properly, even the
> >return-after-removal case shouldn't exist, because we'd simply
> >wait for this call to return.
> >
> >Hmm, interesting. Did I just make the whole problem go away,
> >or is travel fatigue playing tricks on my brain ? :-)
> 
> That was one of the solutions proposed by Rusty, that is basically
> waiting for all CPUs to have scheduled upon exit from module_exit
> and before doing the actual removal.

That seems reasonable... lighter version of freeze_processes() I was
thinking about.
								Pavel
-- 
Worst form of spam? Adding advertisment signatures ala sourceforge.net.
What goes next? Inserting advertisment *into* email?

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02  4:53 ` Keith Owens
  2002-07-02  5:43   ` Werner Almesberger
@ 2002-07-02 23:35   ` Alexander Viro
  1 sibling, 0 replies; 83+ messages in thread
From: Alexander Viro @ 2002-07-02 23:35 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel



On Tue, 2 Jul 2002, Keith Owens wrote:

> For netfilter, the use count reflects the number of packets being
> processed.  Complex and potentially high overhead.

<blink>

_Why_???

If some rule in your chains needs a module foo, you probably want foo
to stay around until that rule is removed.  Even if there's no traffic
whatsoever.

So what's the problem with making use count equal to number of rules
referencing the module?  You need exclusion between chain changes and
chain traversing anyway...
 
> All of this requires that the module information be passed in multiple
> structures and assumes that all code is careful about reference
> counting the code it is about to execute.  There has to be a better
> way!

You know, I'd rather trust core code to do something right than expect
all drivers to follow any non-trivial rules (and "you should not block
in <areas>" _is_ non-trivial enough).

No comments on the network devices - I'll need to read through the code
to answer that one...


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 16:50       ` Benjamin Herrenschmidt
@ 2002-07-02 18:05         ` Werner Almesberger
  2002-07-03  3:50         ` Pavel Machek
  1 sibling, 0 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-02 18:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Keith Owens, linux-kernel

Benjamin Herrenschmidt wrote:
> That was one of the solutions proposed by Rusty, that is basically
> waiting for all CPUs to have scheduled upon exit from module_exit
> and before doing the actual removal.

No, that's not what I mean. If you have a de-registration function,
it may give you one of the following assurances:

 1) none (e.g. of xxx_deregister just unlinks some ops structure,
    but makes no attempt to squash concurrent accesses)
 2) guarantee that no further access to ops structure will occur
    after xxx_deregister returns, and
    2a) one can probe for cached references/execution of callbacks,
        e.g.
	  foo_deregister(&my_ops);
	  while (foo_running()) schedule();
	  /* can destroy local state/code now */
	This is a common construct in the Linux kernel. Note that
	none of the module code executes after foo_running returns,
	so return-after-removal can't happen.
    2b) we're guaranteed that any running callbacks or such will
	have acquired their own locks/counts/etc. before
	xxx_deregister returns. Modules would (also) use the use
	count to protect code. Not sure if this happens anywhere
	in the kernel. return-after-removal would be possible
	here (unless we're going through a layer of trampolines
	or such). E.g.
	  foo_deregister(&my_ops);
	  while (myself_running()) schedule();
	  /* may still have to execute code after unlocking */
	  while (MOD_IN_USE) schedule();
	Note that this one still protects data !
 3) guarantee that no direct effects of accesses to ops structure
    occur after xxx_deregister returns, e.g.
      foo_deregister(&my_ops);
      /* can destroy local state/code now */
    This also eliminates return-after-removal, because deregister
    would wait for callbacks to return.
 4a/4b) Like 3, but the callback signals back completion without
    returning (i.e. to indicate that it has copied all shared data,
    and is now working on a private copy), we're back to case 2a or
    2b. In the case of modules, the usage count would be used to
    protect code, so decrement_and_return would be sufficient here.

I think these are the most common cases. The point is that only
case 1) leaves you completely in the dark, and only cases 2b and
4b are special when it comes to modules. Cases 2a, 3, and 4a are
always safe.

Moreover, case 1, and cases 2a, 2b, 4a, and 4b without
synchronizing after de-registration, can also cause problems in
non-module code, e.g.

bar_callback(void *my_data)
{
    *(int **) my_data = 1;
}

...
int whatever = 0,*ptr = &whatever;
...
foo_register(&bar_ops,&ptr);
...
foo_deregister(&bar_ops); /* case 1, 2a, 2b, 4a, or 4b */
ptr = NULL;
/* BUG: bar_callback may still be running ! */
...

Such code would only be data-safe (but not code-safe) if all shared
data is static, e.g. if the callback writes to some hardware
register, and the address is either static or constant.

AFAIK, most de-registration functions should be in case 2a, 3, or
4a. Well, we probably have a bunch of cases 1, too :-)

Now, what does this all boil down to ? Cases 2a, 3, and 4a are
non-issues. Cases 2b and 4b still need decrement_and_return, so I
was a bit too optimistic in assuming we're totally safe. Case 1 is
pathological also for non-modules, unless a few unlikely
constraints are met.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02 16:36     ` Werner Almesberger
@ 2002-07-02 16:50       ` Benjamin Herrenschmidt
  2002-07-02 18:05         ` Werner Almesberger
  2002-07-03  3:50         ` Pavel Machek
  2002-07-04  4:11       ` Bill Davidsen
  1 sibling, 2 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2002-07-02 16:50 UTC (permalink / raw)
  To: Werner Almesberger, Keith Owens; +Cc: linux-kernel

>> It's not really just the module information. If I can, say, get
>> callbacks from something even after I unregister, I may well
>> have destroyed the data I need to process the callbacks, and
>> oops or worse.
>
>Actually, if module exit synchronizes properly, even the
>return-after-removal case shouldn't exist, because we'd simply
>wait for this call to return.
>
>Hmm, interesting. Did I just make the whole problem go away,
>or is travel fatigue playing tricks on my brain ? :-)

That was one of the solutions proposed by Rusty, that is basically
waiting for all CPUs to have scheduled upon exit from module_exit
and before doing the actual removal.

Ben.



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02  5:43   ` Werner Almesberger
@ 2002-07-02 16:36     ` Werner Almesberger
  2002-07-02 16:50       ` Benjamin Herrenschmidt
  2002-07-04  4:11       ` Bill Davidsen
  0 siblings, 2 replies; 83+ messages in thread
From: Werner Almesberger @ 2002-07-02 16:36 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

I wrote:
> It's not really just the module information. If I can, say, get
> callbacks from something even after I unregister, I may well
> have destroyed the data I need to process the callbacks, and
> oops or worse.

Actually, if module exit synchronizes properly, even the
return-after-removal case shouldn't exist, because we'd simply
wait for this call to return.

Hmm, interesting. Did I just make the whole problem go away,
or is travel fatigue playing tricks on my brain ? :-)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
@ 2002-07-02 12:20 Ariel Garcia
  0 siblings, 0 replies; 83+ messages in thread
From: Ariel Garcia @ 2002-07-02 12:20 UTC (permalink / raw)
  To: linux-kernel


On Mon, 1 Jul 2002, Bill Davidsen wrote:

> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.

I would add another one: switching hardware in a laptop.
  I have a laptop in which the cdrom and the floppy drives both share the
same bay, ie, I can put either the floppy or the cd in the bay but not
both simultaneously.
  To be able to use the cdrom after I have booted with the floppy drive
installed (and used it), I have to remove the floppy module first.
Otherwise the cdrom modules cannot be loaded...
Not having the rmmod possibility would be a big step backwards.


Cheers,

Ariel Garcia


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02  4:53 ` Keith Owens
@ 2002-07-02  5:43   ` Werner Almesberger
  2002-07-02 16:36     ` Werner Almesberger
  2002-07-02 23:35   ` Alexander Viro
  1 sibling, 1 reply; 83+ messages in thread
From: Werner Almesberger @ 2002-07-02  5:43 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

Keith Owens wrote:
> For netfilter, the use count reflects the number of packets being
> processed.  Complex and potentially high overhead.

Good example - netfilter may access a huge number of tiny
modules when working on a packet. While this by itself is a
performance issue, the need to keep reference counts right
doesn't necessarily help.

> All of this requires that the module information be passed in multiple
> structures and assumes that all code is careful about reference
> counting the code it is about to execute.

It's not really just the module information. If I can, say, get
callbacks from something even after I unregister, I may well
have destroyed the data I need to process the callbacks, and
oops or worse.

> There has to be a better way!

Well yes, there are other approaches that make sure something is
not used, e.g.

If you can a) make sure no new references are generated, and b)
send a sequential marker through the subsystem, you can be sure
that all references are gone, when the marker re-emerges. (Or use
multiple markers, e.g. one per CPU.)

Likewise, if you can disable restarting of a subsystem, and then
wait until the subsystem is idle, you're safe. E.g. tasklet_disable
works like this.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [OKS] Module removal
  2002-07-02  4:08 RE2: " Brian Gerst
@ 2002-07-02  4:53 ` Keith Owens
  2002-07-02  5:43   ` Werner Almesberger
  2002-07-02 23:35   ` Alexander Viro
  0 siblings, 2 replies; 83+ messages in thread
From: Keith Owens @ 2002-07-02  4:53 UTC (permalink / raw)
  To: linux-kernel

On Tue, 02 Jul 2002 00:08:55 -0400, 
Brian Gerst <bgerst@quark.didntduck.org> wrote:
>Keith Owens wrote:
>> 1) Do the reference counting outside the module, before it is entered.
>>    Not only does this pollute all structures that contain function
>>    pointers, it introduces overhead on every function dereference.  All
>>    of this just to cope with the relatively low possibility that a
>>    module will be removed.
>
>Only "first use" (ie. ->open) functions need gaurding against unloads. 
>Any subsequent functions are guaranteed to have a reference to the 
>module, and don't need to bother with the refcount.  I have a few ideas 
>to optimize the refcounting better than it is now.

Also the close routine, otherwise there is a window where the use count
is 0 but code is still executing in the module.

Network operations such as SIOCGIFHWADDR take an interface name and do
not call any 'open' routine.  The only lock I can see around dev_ifsioc
is dev_base_lock, AFAICT that will not protect against a module being
unloaded while SIOCGIFHWADDR is running.  If dev_base_lock does protect
against module unload, it is not clear that it does so.

For netfilter, the use count reflects the number of packets being
processed.  Complex and potentially high overhead.

All of this requires that the module information be passed in multiple
structures and assumes that all code is careful about reference
counting the code it is about to execute.  There has to be a better
way!


^ permalink raw reply	[flat|nested] 83+ messages in thread

end of thread, other threads:[~2002-07-12 21:54 UTC | newest]

Thread overview: 83+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-07-01 17:48 [OKS] Module removal Bill Davidsen
2002-07-01 18:35 ` Richard B. Johnson
2002-07-01 18:42 ` Jose Luis Domingo Lopez
2002-07-01 18:45   ` Shawn
2002-07-01 19:57 ` Diego Calleja
2002-07-01 20:03   ` Diego Calleja
2002-07-01 22:20   ` Jose Luis Domingo Lopez
2002-07-01 22:56   ` Ryan Anderson
2002-07-02 11:37 ` Stephen C. Tweedie
2002-07-02 12:04   ` Richard B. Johnson
2002-07-02 13:13     ` jlnance
2002-07-03  3:48     ` simple handling of module removals " Pavel Machek
2002-07-03 17:25       ` Richard B. Johnson
2002-07-03 23:46       ` Daniel Phillips
2002-07-08 12:21         ` Richard B. Johnson
2002-07-08 12:41           ` Thunder from the hill
2002-07-08 12:57             ` Richard B. Johnson
2002-07-08 13:58               ` Thunder from the hill
2002-07-08 15:48                 ` Daniel Gryniewicz
2002-07-08 17:23                   ` Thunder from the hill
2002-07-08 13:06           ` Keith Owens
2002-07-08 13:15             ` Keith Owens
2002-07-03 23:48       ` Daniel Phillips
2002-07-05  9:40         ` Stephen Tweedie
2002-07-06 19:40           ` Daniel Phillips
2002-07-06 19:47             ` Pavel Machek
2002-07-04  1:18       ` Keith Owens
2002-07-04  1:53         ` Andrew Morton
2002-07-04  4:00           ` Keith Owens
2002-07-04  2:25         ` Brian Gerst
2002-07-04  3:54           ` David Gibson
2002-07-04  4:08           ` Keith Owens
2002-07-04 15:02             ` Brian Gerst
2002-07-04 19:18               ` Werner Almesberger
2002-07-05 13:48         ` Pavel Machek
2002-07-07 14:56           ` Keith Owens
2002-07-07 22:36             ` Roman Zippel
2002-07-08  1:09             ` Daniel Mose
2002-07-09 17:07               ` Daniel Mose
2002-07-08 18:13             ` Pavel Machek
2002-07-08 22:43               ` Keith Owens
2002-07-09 14:00                 ` Pavel Machek
2002-07-02 15:20   ` Bill Davidsen
2002-07-02 15:53     ` Jonathan Corbet
2002-07-02 16:07       ` Oliver Neukum
2002-07-02 17:48         ` Tom Rini
2002-07-02 18:10           ` Oliver Neukum
2002-07-02 21:50             ` Ryan Anderson
2002-07-03 22:26               ` Diego Calleja
2002-07-04  0:00                 ` Keith Owens
2002-07-04  8:04               ` Helge Hafting
2002-07-02 16:08     ` Werner Almesberger
     [not found]   ` <Pine.LNX.3.95.1020702075957.24872A-100000@chaos.analogic.c om>
2002-07-04  8:36     ` Mike Galbraith
2002-07-03  0:09 ` Vojtech Pavlik
2002-07-12 21:51 ` David Lang
2002-07-02  4:08 RE2: " Brian Gerst
2002-07-02  4:53 ` Keith Owens
2002-07-02  5:43   ` Werner Almesberger
2002-07-02 16:36     ` Werner Almesberger
2002-07-02 16:50       ` Benjamin Herrenschmidt
2002-07-02 18:05         ` Werner Almesberger
2002-07-03  3:50         ` Pavel Machek
2002-07-04  4:11       ` Bill Davidsen
2002-07-04  6:29         ` Werner Almesberger
2002-07-04  6:50           ` Werner Almesberger
2002-07-07 21:09             ` Jamie Lokier
2002-07-07 21:41               ` Oliver Neukum
2002-07-08  0:31                 ` Jamie Lokier
2002-07-08  6:42                   ` Oliver Neukum
2002-07-08  0:18               ` Bill Davidsen
2002-07-08  0:46                 ` Jamie Lokier
2002-07-08  6:22                   ` Daniel Phillips
2002-07-08 10:16                     ` Bill Davidsen
2002-07-09  0:07                   ` Werner Almesberger
2002-07-09  0:27                     ` Keith Owens
2002-07-09  3:09                       ` Werner Almesberger
2002-07-09  6:25                       ` Roman Zippel
2002-07-09  8:59                         ` Kai Germaschewski
2002-07-09 11:06                           ` Roman Zippel
2002-07-04 12:29           ` Bill Davidsen
2002-07-09  1:50           ` Kevin O'Connor
2002-07-09  2:37             ` Werner Almesberger
2002-07-02 23:35   ` Alexander Viro
2002-07-02 12:20 Ariel Garcia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).