linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] ATA over Ethernet __init calling __exit
@ 2005-01-13  0:09 Russell King
  2005-01-13  8:50 ` Jens Axboe
  2005-01-13 17:35 ` [PATCH] " Ed L Cashin
  0 siblings, 2 replies; 10+ messages in thread
From: Russell King @ 2005-01-13  0:09 UTC (permalink / raw)
  To: Linux Kernel List, ecashin

static void __exit
aoe_exit(void)
{
...
}

static int __init
aoe_init(void)
{
...
                        aoe_exit();
...
}

Enough said, please shoot the author of that in the foot.  You can't
call functions marked __exit (which may be discarded) from functions
which aren't.  If you want to do this, please loose the __exit on
aoe_exit().

In addition, please shoot the author in the other foot for:

config ATA_OVER_ETH
        tristate "ATA over Ethernet support"
        depends on NET
        default m               <==== this line.

That's not nice for embedded guys who do a "make xxx_defconfig" and
unsuspectingly end up with ATA over Ethernet built in for their
platform.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13  0:09 [BUG] ATA over Ethernet __init calling __exit Russell King
@ 2005-01-13  8:50 ` Jens Axboe
  2005-01-13 19:36   ` Andi Kleen
  2005-01-13 17:35 ` [PATCH] " Ed L Cashin
  1 sibling, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2005-01-13  8:50 UTC (permalink / raw)
  To: Linux Kernel List, ecashin

On Thu, Jan 13 2005, Russell King wrote:
> In addition, please shoot the author in the other foot for:
> 
> config ATA_OVER_ETH
>         tristate "ATA over Ethernet support"
>         depends on NET
>         default m               <==== this line.
> 
> That's not nice for embedded guys who do a "make xxx_defconfig" and
> unsuspectingly end up with ATA over Ethernet built in for their
> platform.

That annoyed me, too. There's no reason for standard kernel driver
modules to assume they should be selected, especially true for something
as special case as ata over ethernet.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13  0:09 [BUG] ATA over Ethernet __init calling __exit Russell King
  2005-01-13  8:50 ` Jens Axboe
@ 2005-01-13 17:35 ` Ed L Cashin
  1 sibling, 0 replies; 10+ messages in thread
From: Ed L Cashin @ 2005-01-13 17:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: Russell King, Jens Axboe, Greg KH

[-- Attachment #1: Type: text/plain, Size: 1029 bytes --]

Russell King <rmk+lkml@arm.linux.org.uk> writes:

> static void __exit
> aoe_exit(void)
> {
> ...
> }
>
> static int __init
> aoe_init(void)
> {
> ...
>                         aoe_exit();
> ...
> }

Thanks for catching that.  I cleaned up the error handling, too.

> In addition, please shoot the author in the other foot for:
>
> config ATA_OVER_ETH
>         tristate "ATA over Ethernet support"
>         depends on NET
>         default m               <==== this line.
>
> That's not nice for embedded guys who do a "make xxx_defconfig" and
> unsuspectingly end up with ATA over Ethernet built in for their
> platform.

OK, thanks.  The patch below gets rid of that Kconfig line and
fixes the bug where __init code calls __exit code.

It depends on the recently-submitted patch that elimimates sleeping
with interrupts off, so it applies cleanly to Greg KH's usb tree.


Don't call __exit functions from __init functions.
Default to not configuring AoE support in kernel.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>


[-- Attachment #2: diff-exit-usb --]
[-- Type: text/plain, Size: 3163 bytes --]

diff -urpN usb-2.6-export-a/drivers/block/Kconfig usb-2.6-export-b/drivers/block/Kconfig
--- usb-2.6-export-a/drivers/block/Kconfig	2005-01-13 09:44:25.000000000 -0500
+++ usb-2.6-export-b/drivers/block/Kconfig	2005-01-13 12:07:56.000000000 -0500
@@ -449,7 +449,6 @@ source "drivers/block/Kconfig.iosched"
 config ATA_OVER_ETH
 	tristate "ATA over Ethernet support"
 	depends on NET
-	default m
 	help
 	This driver provides Support for ATA over Ethernet block
 	devices like the Coraid EtherDrive (R) Storage Blade.
diff -urpN usb-2.6-export-a/drivers/block/aoe/aoechr.c usb-2.6-export-b/drivers/block/aoe/aoechr.c
--- usb-2.6-export-a/drivers/block/aoe/aoechr.c	2005-01-13 09:47:33.000000000 -0500
+++ usb-2.6-export-b/drivers/block/aoe/aoechr.c	2005-01-13 10:59:23.000000000 -0500
@@ -266,7 +266,7 @@ aoechr_init(void)
 	return 0;
 }
 
-void __exit
+void
 aoechr_exit(void)
 {
 	int i;
diff -urpN usb-2.6-export-a/drivers/block/aoe/aoedev.c usb-2.6-export-b/drivers/block/aoe/aoedev.c
--- usb-2.6-export-a/drivers/block/aoe/aoedev.c	2005-01-13 09:44:00.000000000 -0500
+++ usb-2.6-export-b/drivers/block/aoe/aoedev.c	2005-01-13 10:59:23.000000000 -0500
@@ -159,7 +159,7 @@ aoedev_freedev(struct aoedev *d)
 	kfree(d);
 }
 
-void __exit
+void
 aoedev_exit(void)
 {
 	struct aoedev *d;
diff -urpN usb-2.6-export-a/drivers/block/aoe/aoemain.c usb-2.6-export-b/drivers/block/aoe/aoemain.c
--- usb-2.6-export-a/drivers/block/aoe/aoemain.c	2005-01-13 09:46:44.000000000 -0500
+++ usb-2.6-export-b/drivers/block/aoe/aoemain.c	2005-01-13 10:59:23.000000000 -0500
@@ -53,7 +53,7 @@ discover_timer(ulong vp)
 	}
 }
 
-static void __exit
+static void
 aoe_exit(void)
 {
 	discover_timer(TKILL);
@@ -67,25 +67,43 @@ aoe_exit(void)
 static int __init
 aoe_init(void)
 {
-	int n, (**p)(void);
-	int (*fns[])(void) = {
-		aoedev_init, aoechr_init, aoeblk_init, aoenet_init, NULL
-	};
-
-	for (p=fns; *p != NULL; p++) {
-		n = (*p)();
-		if (n) {
-			aoe_exit();
-			printk(KERN_INFO "aoe: aoe_init: initialisation failure.\n");
-			return n;
-		}
+	int ret;
+
+	ret = aoedev_init();
+	if (ret)
+		return ret;
+	ret = aoechr_init();
+	if (ret)
+		goto chr_fail;
+	ret = aoeblk_init();
+	if (ret)
+		goto blk_fail;
+	ret = aoenet_init();
+	if (ret)
+		goto net_fail;
+	ret = register_blkdev(AOE_MAJOR, DEVICE_NAME);
+	if (ret < 0) {
+		printk(KERN_ERR "aoe: aoeblk_init: can't register major\n");
+		goto blkreg_fail;
 	}
+
 	printk(KERN_INFO
 	       "aoe: aoe_init: AoE v2.6-%s initialised.\n",
 	       VERSION);
-
 	discover_timer(TINIT);
 	return 0;
+
+ blkreg_fail:
+	aoenet_exit();
+ net_fail:
+	aoeblk_exit();
+ blk_fail:
+	aoechr_exit();
+ chr_fail:
+	aoedev_exit();
+	
+	printk(KERN_INFO "aoe: aoe_init: initialisation failure.\n");
+	return ret;
 }
 
 module_init(aoe_init);
diff -urpN usb-2.6-export-a/drivers/block/aoe/aoenet.c usb-2.6-export-b/drivers/block/aoe/aoenet.c
--- usb-2.6-export-a/drivers/block/aoe/aoenet.c	2005-01-13 09:45:02.000000000 -0500
+++ usb-2.6-export-b/drivers/block/aoe/aoenet.c	2005-01-13 10:59:23.000000000 -0500
@@ -167,7 +167,7 @@ aoenet_init(void)
 	return 0;
 }
 
-void __exit
+void
 aoenet_exit(void)
 {
 	dev_remove_pack(&aoe_pt);

[-- Attachment #3: Type: text/plain, Size: 41 bytes --]



-- 
  Ed L Cashin <ecashin@coraid.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13  8:50 ` Jens Axboe
@ 2005-01-13 19:36   ` Andi Kleen
  2005-01-13 20:52     ` Ed L Cashin
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2005-01-13 19:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel List, ecashin, jgarzik

Jens Axboe <axboe@suse.de> writes:

> On Thu, Jan 13 2005, Russell King wrote:
>> In addition, please shoot the author in the other foot for:
>> 
>> config ATA_OVER_ETH
>>         tristate "ATA over Ethernet support"
>>         depends on NET
>>         default m               <==== this line.
>> 
>> That's not nice for embedded guys who do a "make xxx_defconfig" and
>> unsuspectingly end up with ATA over Ethernet built in for their
>> platform.
>
> That annoyed me, too. There's no reason for standard kernel driver
> modules to assume they should be selected, especially true for something
> as special case as ata over ethernet.

In general I think it was a bad idea to merge this driver at all.
The protocol is obviously broken by design - they use a 16 bit sequence
number space which has been proven for many years (in ip fragmentation)
to be far too small for modern network performance.

Also the memory allocation without preallocation in the block write
out path looks quite broken too and will most likely will lead to deadlocks
under high load.

(I wrote a detailed review when it was posted but apparently it 
disappeared or I never got any answer at least) 

IMHO this thing should have never been merged in this form. Can it 
still be backed out?

-Andi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13 19:36   ` Andi Kleen
@ 2005-01-13 20:52     ` Ed L Cashin
  2005-01-13 21:53       ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Ed L Cashin @ 2005-01-13 20:52 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jens Axboe, Linux Kernel List, jgarzik

Andi Kleen <ak@muc.de> writes:

...
> In general I think it was a bad idea to merge this driver at all.
> The protocol is obviously broken by design - they use a 16 bit sequence
> number space which has been proven for many years (in ip fragmentation)
> to be far too small for modern network performance.

While that experience may apply well to IP, this is a non-IP protocol
for a single LAN.  For any given AoE device, there are only a few
outstanding packets at any given time.

For existing AoE devices that number of outstanding packets is only
three!  So, with only three packets on the wire at any time for a
given device, 16 bits is overkill.  In fact, the AoE protocol allows
the AoE device to specify how many outstanding packets it supports.
That number is only 16 bits wide.  

If it ever did become desirable, we could use a couple more bits for
the sequence number by borrowing from the low bits of jiffies that we
use to estimate the RTT, but it doesn't seem likely to ever be
desirable.

> Also the memory allocation without preallocation in the block write
> out path looks quite broken too and will most likely will lead to deadlocks
> under high load.
>
> (I wrote a detailed review when it was posted but apparently it 
> disappeared or I never got any answer at least) 

I think you're talking about your suggestion that the skb allocation
could lead to a deadlock.  If I'm correct, this issue is similar to
the one that led us to create a mempool for the buf structs we use.

> IMHO this thing should have never been merged in this form. Can it 
> still be backed out?

The sequence number is a non-issue, and for the deadlock danger you
originally suggested that we add a warning against using aoe storage
for swap.  We could add such a warning to the driver's documentation
and start working on a patch that avoids deadlocking on the skb
allocation.  There's no need to panic and force users go get a third
party driver.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13 20:52     ` Ed L Cashin
@ 2005-01-13 21:53       ` Andi Kleen
  2005-01-13 22:09         ` Alan Cox
  2005-01-13 22:12         ` Ed L Cashin
  0 siblings, 2 replies; 10+ messages in thread
From: Andi Kleen @ 2005-01-13 21:53 UTC (permalink / raw)
  To: Ed L Cashin; +Cc: Jens Axboe, Linux Kernel List, jgarzik

On Thu, Jan 13, 2005 at 03:52:05PM -0500, Ed L Cashin wrote:
> Andi Kleen <ak@muc.de> writes:
> 
> ...
> > In general I think it was a bad idea to merge this driver at all.
> > The protocol is obviously broken by design - they use a 16 bit sequence
> > number space which has been proven for many years (in ip fragmentation)
> > to be far too small for modern network performance.
> 
> While that experience may apply well to IP, this is a non-IP protocol
> for a single LAN.  For any given AoE device, there are only a few
> outstanding packets at any given time.
> 
> For existing AoE devices that number of outstanding packets is only
> three!  So, with only three packets on the wire at any time for a
> given device, 16 bits is overkill.  In fact, the AoE protocol allows
> the AoE device to specify how many outstanding packets it supports.
> That number is only 16 bits wide.  
> 
> If it ever did become desirable, we could use a couple more bits for

It likely will if someone ever adds significant write cache to such
devices.

> the sequence number by borrowing from the low bits of jiffies that we
> use to estimate the RTT, but it doesn't seem likely to ever be
> desirable.

Can this be done now? 

> 
> > Also the memory allocation without preallocation in the block write
> > out path looks quite broken too and will most likely will lead to deadlocks
> > under high load.
> >
> > (I wrote a detailed review when it was posted but apparently it 
> > disappeared or I never got any answer at least) 
> 
> I think you're talking about your suggestion that the skb allocation
> could lead to a deadlock.  If I'm correct, this issue is similar to
> the one that led us to create a mempool for the buf structs we use.

For the skbuffs? I don't think it's possible to preallocate them
because the network stack (intentionally) misses hooks to give
them back to you.

BTW iirc your submit patch did too much allocations anyways because
in modern Linux skbs you can just stick a pointer to the page
into the skb when the device announces NETIF_F_SG. 

> 
> > IMHO this thing should have never been merged in this form. Can it 
> > still be backed out?
> 
> The sequence number is a non-issue, and for the deadlock danger you
> originally suggested that we add a warning against using aoe storage
> for swap.  We could add such a warning to the driver's documentation

It's not only swap, the same problem occurs when any file on such
a device is writeable mmaped.

-Andi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13 21:53       ` Andi Kleen
@ 2005-01-13 22:09         ` Alan Cox
  2005-01-13 22:12         ` Ed L Cashin
  1 sibling, 0 replies; 10+ messages in thread
From: Alan Cox @ 2005-01-13 22:09 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ed L Cashin, Jens Axboe, Linux Kernel Mailing List, jgarzik

On Iau, 2005-01-13 at 21:53, Andi Kleen wrote:
> > If it ever did become desirable, we could use a couple more bits for
> 
> It likely will if someone ever adds significant write cache to such
> devices.

ATA doesn't support tagged queueing. Therefore write cache is
irrelevant.

Alan


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13 21:53       ` Andi Kleen
  2005-01-13 22:09         ` Alan Cox
@ 2005-01-13 22:12         ` Ed L Cashin
  2005-01-13 22:48           ` Andi Kleen
  1 sibling, 1 reply; 10+ messages in thread
From: Ed L Cashin @ 2005-01-13 22:12 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jens Axboe, Linux Kernel List, jgarzik

Andi Kleen <ak@muc.de> writes:

> On Thu, Jan 13, 2005 at 03:52:05PM -0500, Ed L Cashin wrote:
>> Andi Kleen <ak@muc.de> writes:
>> 
>> ...
>> > In general I think it was a bad idea to merge this driver at all.
>> > The protocol is obviously broken by design - they use a 16 bit sequence
>> > number space which has been proven for many years (in ip fragmentation)
>> > to be far too small for modern network performance.
>> 
>> While that experience may apply well to IP, this is a non-IP protocol
>> for a single LAN.  For any given AoE device, there are only a few
>> outstanding packets at any given time.
>> 
>> For existing AoE devices that number of outstanding packets is only
>> three!  So, with only three packets on the wire at any time for a
>> given device, 16 bits is overkill.  In fact, the AoE protocol allows
>> the AoE device to specify how many outstanding packets it supports.
>> That number is only 16 bits wide.  
>> 
>> If it ever did become desirable, we could use a couple more bits for
>
> It likely will if someone ever adds significant write cache to such
> devices.
>
>> the sequence number by borrowing from the low bits of jiffies that we
>> use to estimate the RTT, but it doesn't seem likely to ever be
>> desirable.
>
> Can this be done now? 

It seems rash to make the change now, because there is no need for it.

>> > Also the memory allocation without preallocation in the block write
>> > out path looks quite broken too and will most likely will lead to deadlocks
>> > under high load.
>> >
>> > (I wrote a detailed review when it was posted but apparently it 
>> > disappeared or I never got any answer at least) 
>> 
>> I think you're talking about your suggestion that the skb allocation
>> could lead to a deadlock.  If I'm correct, this issue is similar to
>> the one that led us to create a mempool for the buf structs we use.
>
> For the skbuffs? I don't think it's possible to preallocate them
> because the network stack (intentionally) misses hooks to give
> them back to you.

For the skbuffs, we could use a mempool with GFP_NOIO allocation and
then skb_get them before the network layer ever sees them.  We already
keep track of the packets that we've sent out, so we'll just keep a
pointer to the skbuffs.

> BTW iirc your submit patch did too much allocations anyways because
> in modern Linux skbs you can just stick a pointer to the page
> into the skb when the device announces NETIF_F_SG. 

I thought there was some shared information at the end of the data,
but that's interesting, thanks.  I'll look into it.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13 22:12         ` Ed L Cashin
@ 2005-01-13 22:48           ` Andi Kleen
  2005-01-13 22:59             ` Ed L Cashin
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2005-01-13 22:48 UTC (permalink / raw)
  To: Ed L Cashin; +Cc: Jens Axboe, Linux Kernel List, jgarzik

On Thu, Jan 13, 2005 at 05:12:21PM -0500, Ed L Cashin wrote:

> It seems rash to make the change now, because there is no need for it.

It would be better to future proof the driver at least a bit.

> 
> For the skbuffs, we could use a mempool with GFP_NOIO allocation and
> then skb_get them before the network layer ever sees them.  We already
> keep track of the packets that we've sent out, so we'll just keep a
> pointer to the skbuffs.

That won't work because you don't know when the driver is done with it.

There is a callback, but it's already used by socket buffer management
(if you don't use a socket it may be usable) 


-Andi 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] ATA over Ethernet __init calling __exit
  2005-01-13 22:48           ` Andi Kleen
@ 2005-01-13 22:59             ` Ed L Cashin
  0 siblings, 0 replies; 10+ messages in thread
From: Ed L Cashin @ 2005-01-13 22:59 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jens Axboe, Linux Kernel List, jgarzik

Andi Kleen <ak@muc.de> writes:

> On Thu, Jan 13, 2005 at 05:12:21PM -0500, Ed L Cashin wrote:
>
>> It seems rash to make the change now, because there is no need for it.
>
> It would be better to future proof the driver at least a bit.
>
>> 
>> For the skbuffs, we could use a mempool with GFP_NOIO allocation and
>> then skb_get them before the network layer ever sees them.  We already
>> keep track of the packets that we've sent out, so we'll just keep a
>> pointer to the skbuffs.
>
> That won't work because you don't know when the driver is done with it.
>
> There is a callback, but it's already used by socket buffer management

The destructor member of struct sk_buff, right?

> (if you don't use a socket it may be usable) 

You mean a struct sock?  No, we don't use that because we're not IP.
Thanks for the lead.

-- 
  Ed L Cashin <ecashin@coraid.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2005-01-13 23:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-01-13  0:09 [BUG] ATA over Ethernet __init calling __exit Russell King
2005-01-13  8:50 ` Jens Axboe
2005-01-13 19:36   ` Andi Kleen
2005-01-13 20:52     ` Ed L Cashin
2005-01-13 21:53       ` Andi Kleen
2005-01-13 22:09         ` Alan Cox
2005-01-13 22:12         ` Ed L Cashin
2005-01-13 22:48           ` Andi Kleen
2005-01-13 22:59             ` Ed L Cashin
2005-01-13 17:35 ` [PATCH] " Ed L Cashin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).