linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] DAC960_Release bug (2.4.x)
@ 2003-04-16  9:55 John v/d Kamp
  2003-04-16 22:40 ` Dave Olien
  0 siblings, 1 reply; 6+ messages in thread
From: John v/d Kamp @ 2003-04-16  9:55 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 571 bytes --]

Hi,

It seems the DAC960_Release function doesn't work correctly when
DAC960_Open is called with File->f_flags has O_NONBLOCK set. This causes
BLKRRPART to fail, as an unsigned int gets decreased below 0.
The File struct passed to DAC960_Release is NULL, so in Open the counters
aren't increased, but in Release they are decreased. I've added a simple
check that prevents the decrements if the counters are 0.

Allso, I've no idea why there are two counters. It seems that the
ControllerUsageCount is only used to increment and decrement.

--
John van der Kamp, ConnecTUX

[-- Attachment #2: Type: TEXT/PLAIN, Size: 681 bytes --]

diff -ur linux-2.4.19/drivers/block/DAC960.c patched-2.4.19/drivers/block/DAC960.c
--- linux-2.4.19/drivers/block/DAC960.c	2002-09-13 17:41:30.000000000 +0200
+++ patched-2.4.19/drivers/block/DAC960.c	2003-04-16 11:07:16.000000000 +0200
@@ -5398,8 +5398,10 @@
   /*
     Decrement the Logical Drive and Controller Usage Counts.
   */
-  Controller->LogicalDriveUsageCount[LogicalDriveNumber]--;
-  Controller->ControllerUsageCount--;
+  if (Controller->LogicalDriveUsageCount[LogicalDriveNumber] > 0)
+    Controller->LogicalDriveUsageCount[LogicalDriveNumber]--;
+  if (Controller->ControllerUsageCount > 0)
+    Controller->ControllerUsageCount--;
   return 0;
 }

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] DAC960_Release bug (2.4.x)
  2003-04-16  9:55 [PATCH] DAC960_Release bug (2.4.x) John v/d Kamp
@ 2003-04-16 22:40 ` Dave Olien
  2003-04-17  8:48   ` John v/d Kamp
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Olien @ 2003-04-16 22:40 UTC (permalink / raw)
  To: John v/d Kamp; +Cc: linux-kernel


As you observed, ControllerUsageCount doesn't seem to serve
any purpose.  It should just be removed, I think.

I'm curious, what application you have that calls DAC960_Open() with NONBLOCK.

Here's my interpretation of what the drivers' trying to do here.
If you can validate these assumptions, I'd appreciate it.

For the DAC960, it looks like the NONBLOCK flag is just a trick
to allow a user-level DAC960 RAID manager utility to open the driver
without actually interacting with any disk devices on the controller.
The file descriptor you get back is not really associated with a specific
device or controller.

The DAC960_IOCTL function calls DAC960_UserIOCTL only when the
NONBLOCK flag is set.  DAC960_UserIOCTL is designed to operate on
the controller specified by a controller number argument, rather than
a controller associated with its file descriptor argument.

This lets the RAID manager utility operate on any controller at a time
when perhaps there are NO logical devices ONLINE (i.e. you would
be normally UNABLE to get ANY file descriptor for that controller).

The ModuleOnly label in DAC960_open() apparently is meant to indicate
we're opening "the DAC960 module" rather than a specific device.

The problem is, the test for opening this file descriptor is iffy:
	"If the file descriptor is for controller 0, logical device
	0, and NONBLOCK is set, then return this `special` file descriptor."

It's perfectly reasonable for an application to actually want to open
REAL devices this way.  Instead, it gets this peculiar file descriptor.

The problem is there are DAC960 utilities out there that I'm sure rely
on this behavior.  So, it'll be hard to fix "the right way".


On Wed, Apr 16, 2003 at 11:55:51AM +0200, John v/d Kamp wrote:
> Hi,
> 
> It seems the DAC960_Release function doesn't work correctly when
> DAC960_Open is called with File->f_flags has O_NONBLOCK set. This causes
> BLKRRPART to fail, as an unsigned int gets decreased below 0.
> The File struct passed to DAC960_Release is NULL, so in Open the counters
> aren't increased, but in Release they are decreased. I've added a simple
> check that prevents the decrements if the counters are 0.
> 
> Allso, I've no idea why there are two counters. It seems that the
> ControllerUsageCount is only used to increment and decrement.
> 
> --
> John van der Kamp, ConnecTUX
> diff -ur linux-2.4.19/drivers/block/DAC960.c patched-2.4.19/drivers/block/DAC960.c
> --- linux-2.4.19/drivers/block/DAC960.c	2002-09-13 17:41:30.000000000 +0200
> +++ patched-2.4.19/drivers/block/DAC960.c	2003-04-16 11:07:16.000000000 +0200
> @@ -5398,8 +5398,10 @@
>    /*
>      Decrement the Logical Drive and Controller Usage Counts.
>    */
> -  Controller->LogicalDriveUsageCount[LogicalDriveNumber]--;
> -  Controller->ControllerUsageCount--;
> +  if (Controller->LogicalDriveUsageCount[LogicalDriveNumber] > 0)
> +    Controller->LogicalDriveUsageCount[LogicalDriveNumber]--;
> +  if (Controller->ControllerUsageCount > 0)
> +    Controller->ControllerUsageCount--;
>    return 0;
>  }


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] DAC960_Release bug (2.4.x)
  2003-04-16 22:40 ` Dave Olien
@ 2003-04-17  8:48   ` John v/d Kamp
  2003-04-17 20:27     ` Dave Olien
  0 siblings, 1 reply; 6+ messages in thread
From: John v/d Kamp @ 2003-04-17  8:48 UTC (permalink / raw)
  To: Dave Olien; +Cc: linux-kernel

We have an application that detects hardware using the libhd from SuSE.
This lib loads the driver with the NONBLOCK flag, because it doesn't want
to use the drive, but only detect it. When it releases, the File parameter
is NULL, so the counters get decremented, but were never incremented.

When we try to partition the drive, the BLKRRPART ioctl fails because the
counter is 4294967295. (-1)
I don't think any program relies on this counter being < "0".

The whole point on the file descriptor is beyond me, but I think it's a
good thing to address.

--
John van der Kamp, ConnecTUX

On Wed, 16 Apr 2003, Dave Olien wrote:

>
> As you observed, ControllerUsageCount doesn't seem to serve
> any purpose.  It should just be removed, I think.
>
> I'm curious, what application you have that calls DAC960_Open() with NONBLOCK.
>
> Here's my interpretation of what the drivers' trying to do here.
> If you can validate these assumptions, I'd appreciate it.
>
> For the DAC960, it looks like the NONBLOCK flag is just a trick
> to allow a user-level DAC960 RAID manager utility to open the driver
> without actually interacting with any disk devices on the controller.
> The file descriptor you get back is not really associated with a specific
> device or controller.
>
> The DAC960_IOCTL function calls DAC960_UserIOCTL only when the
> NONBLOCK flag is set.  DAC960_UserIOCTL is designed to operate on
> the controller specified by a controller number argument, rather than
> a controller associated with its file descriptor argument.
>
> This lets the RAID manager utility operate on any controller at a time
> when perhaps there are NO logical devices ONLINE (i.e. you would
> be normally UNABLE to get ANY file descriptor for that controller).
>
> The ModuleOnly label in DAC960_open() apparently is meant to indicate
> we're opening "the DAC960 module" rather than a specific device.
>
> The problem is, the test for opening this file descriptor is iffy:
> 	"If the file descriptor is for controller 0, logical device
> 	0, and NONBLOCK is set, then return this `special` file descriptor."
>
> It's perfectly reasonable for an application to actually want to open
> REAL devices this way.  Instead, it gets this peculiar file descriptor.
>
> The problem is there are DAC960 utilities out there that I'm sure rely
> on this behavior.  So, it'll be hard to fix "the right way".
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] DAC960_Release bug (2.4.x)
  2003-04-17  8:48   ` John v/d Kamp
@ 2003-04-17 20:27     ` Dave Olien
  2003-04-18  9:20       ` John v/d Kamp
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Olien @ 2003-04-17 20:27 UTC (permalink / raw)
  To: John v/d Kamp; +Cc: linux-kernel


I've been looking over the kernel's code paths that call a block driver's
release method.  In linux 2.4 and 2.5, it looks like nowhere does the
kernel EVER call the release method with a non-NULL file descriptor.
The file pointer argument to the release method seems to be a left-over
from linux 2.2.

I think the DAC960_Open and DAC960_Release methods in 2.4 and 2.5
are more broken than it first appears.  

The SPECIAL file descriptor that you get with O_NONBLOCK
is a BAD idea.  But we're probably stuck because applications use it.
Could you pass me a URL to the libhd library you're using?  I'd like
to look it over.  What behavior does it expect with the O_NBLOCK flag?

I think what I'll do is assert that there can be ONLY ONE such SPECIAL
file descriptor open at a time.  At Open time, we'll save a pointer to
the inode for this special file descriptor in a module-local variable.
If at open time, we discover there is already such an open file descriptor,
we'll refuse to open another one.

In the release function, we'll compare the inode pointer passed in with
the saved inode pointer, and do the SPECIAL close case, and then zero
out the saved inode pointer.  This isn't a completely reliable solution.
But, it's the best I think of for now.

A patch for 2.4 is attached at the end of this mail.  I'd appreciate it if
you could give it a try and let me know how it works.

On Thu, Apr 17, 2003 at 10:48:51AM +0200, John v/d Kamp wrote:
> We have an application that detects hardware using the libhd from SuSE.
> This lib loads the driver with the NONBLOCK flag, because it doesn't want
> to use the drive, but only detect it. When it releases, the File parameter
> is NULL, so the counters get decremented, but were never incremented.
> 
> When we try to partition the drive, the BLKRRPART ioctl fails because the
> counter is 4294967295. (-1)
> I don't think any program relies on this counter being < "0".
> 
> The whole point on the file descriptor is beyond me, but I think it's a
> good thing to address.
> 

------------------------------------------------------------------------------

diff -ur linux-2.4.18_original/drivers/block/DAC960.c linux-2.4.18_DAC/drivers/block/DAC960.c
--- linux-2.4.18_original/drivers/block/DAC960.c	2001-10-25 13:58:35.000000000 -0700
+++ linux-2.4.18_DAC/drivers/block/DAC960.c	2003-04-17 13:11:15.000000000 -0700
@@ -48,6 +48,31 @@
 #include <asm/uaccess.h>
 #include "DAC960.h"
 
+/*
+ * DAC960_SpecialInode is used to indicate that the DAC960_Open
+ * method was called with the file descriptor that has its O_NONBLOCK
+ * flag set, and was with an inode for controller 0, logical device 0.
+ *
+ * Under this case, DAC960_Open enables a "special" file descriptor that
+ * does not reference a real disk device.  This file descriptor can
+ * be used ONLY for calls to DAC960_UserIOCTL(), which is able to
+ * operate on DAC960 controllers to do "pass through" commands.  These
+ * "pass through" commands can be used to query the state of the devices
+ * on the controller, and modify the state of those devices.
+ *
+ * This "special" file descriptor implementation is a bad idea.  But,
+ * we're stuck with it because there applications that rely on it.
+ * (although this has been broken throughout the entire linux 2.4
+ * release.  So, maybe this could in fact be removed.)
+ *
+ * This fix isn't completely reliable.  But, it should handle  the
+ * common cases.  Almost certainly, there is only one inode for
+ * the (controller 0, disk 0) device in any system.  So, the
+ * SpecialInode pointer is really just a flag in this case.  But,
+ * we'll save the inode pointer as well, just in case.
+ */
+static Inode_T		*DAC960_SpecialInode = NULL;
+static spinlock_t	DAC960_SpecialInodeLock;
 
 /*
   DAC960_ControllerCount is the number of DAC960 Controllers detected.
@@ -5340,9 +5365,29 @@
   int ControllerNumber = DAC960_ControllerNumber(Inode->i_rdev);
   int LogicalDriveNumber = DAC960_LogicalDriveNumber(Inode->i_rdev);
   DAC960_Controller_T *Controller;
+
+  /*
+   * Open a "Special" file descriptor that can be used
+   * to operate on any DAC960 controller, even if there are
+   * no logical devices online.  This hooks into code
+   * in DAC960_IOCTL and DAC960_Close.
+   *
+   * This "Special" file descriptor is a bad idea, but
+   * we're probably stuck with it because of existing 
+   * applications that use it.
+   */
   if (ControllerNumber == 0 && LogicalDriveNumber == 0 &&
-      (File->f_flags & O_NONBLOCK))
-    goto ModuleOnly;
+		(File->f_flags & O_NONBLOCK) && capable(CAP_SYS_ADMIN)) {
+      spin_lock_irq(&DAC960_SpecialInodeLock);
+      if (DAC960_SpecialInode != NULL) {
+      	spin_unlock_irq(&DAC960_SpecialInodeLock);
+	return -ENXIO;
+      }
+      DAC960_SpecialInode = Inode;
+      spin_unlock_irq(&DAC960_SpecialInodeLock);
+      goto ModuleOnly;
+  }
+
   if (ControllerNumber < 0 || ControllerNumber > DAC960_ControllerCount - 1)
     return -ENXIO;
   Controller = DAC960_Controllers[ControllerNumber];
@@ -5376,7 +5421,6 @@
   /*
     Increment Controller and Logical Drive Usage Counts.
   */
-  Controller->ControllerUsageCount++;
   Controller->LogicalDriveUsageCount[LogicalDriveNumber]++;
  ModuleOnly:
   return 0;
@@ -5392,15 +5436,12 @@
   int ControllerNumber = DAC960_ControllerNumber(Inode->i_rdev);
   int LogicalDriveNumber = DAC960_LogicalDriveNumber(Inode->i_rdev);
   DAC960_Controller_T *Controller = DAC960_Controllers[ControllerNumber];
-  if (ControllerNumber == 0 && LogicalDriveNumber == 0 &&
-      File != NULL && (File->f_flags & O_NONBLOCK))
-    goto ModuleOnly;
-  /*
-    Decrement the Logical Drive and Controller Usage Counts.
-  */
-  Controller->LogicalDriveUsageCount[LogicalDriveNumber]--;
-  Controller->ControllerUsageCount--;
- ModuleOnly:
+
+  if ((Inode == DAC960_SpecialInode) && capable(CAP_SYS_ADMIN))
+    DAC960_SpecialInode = NULL;
+  else
+     Controller->LogicalDriveUsageCount[LogicalDriveNumber]--;
+
   return 0;
 }
 
diff -ur linux-2.4.18_original/drivers/block/DAC960.h linux-2.4.18_DAC/drivers/block/DAC960.h
--- linux-2.4.18_original/drivers/block/DAC960.h	2001-10-17 14:46:29.000000000 -0700
+++ linux-2.4.18_DAC/drivers/block/DAC960.h	2003-04-17 11:17:45.000000000 -0700
@@ -2341,7 +2341,6 @@
   unsigned short MaxBlocksPerCommand;
   unsigned short ControllerScatterGatherLimit;
   unsigned short DriverScatterGatherLimit;
-  unsigned int ControllerUsageCount;
   unsigned int CombinedStatusBufferLength;
   unsigned int InitialStatusLength;
   unsigned int CurrentStatusLength;

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] DAC960_Release bug (2.4.x)
  2003-04-17 20:27     ` Dave Olien
@ 2003-04-18  9:20       ` John v/d Kamp
  2003-04-18 16:12         ` Dave Olien
  0 siblings, 1 reply; 6+ messages in thread
From: John v/d Kamp @ 2003-04-18  9:20 UTC (permalink / raw)
  To: Dave Olien; +Cc: linux-kernel

The libhd can be found here:
ftp://ftp.suse.com/pub/suse/i386/8.1/suse/src/hwinfo-5.39-2.src.rpm
We use a somewhat older version, but that probably doesn't matter.

I've compiled the module, and our install software ran just fine.
Repartitioning the drive was no problem. Using the driver on a normal
system was no problem either (never was).

--
John van der Kamp, ConnecTUX

On Thu, 17 Apr 2003, Dave Olien wrote:

>
> I've been looking over the kernel's code paths that call a block driver's
> release method.  In linux 2.4 and 2.5, it looks like nowhere does the
> kernel EVER call the release method with a non-NULL file descriptor.
> The file pointer argument to the release method seems to be a left-over
> from linux 2.2.
>
> I think the DAC960_Open and DAC960_Release methods in 2.4 and 2.5
> are more broken than it first appears.
>
> The SPECIAL file descriptor that you get with O_NONBLOCK
> is a BAD idea.  But we're probably stuck because applications use it.
> Could you pass me a URL to the libhd library you're using?  I'd like
> to look it over.  What behavior does it expect with the O_NBLOCK flag?
>
> I think what I'll do is assert that there can be ONLY ONE such SPECIAL
> file descriptor open at a time.  At Open time, we'll save a pointer to
> the inode for this special file descriptor in a module-local variable.
> If at open time, we discover there is already such an open file descriptor,
> we'll refuse to open another one.
>
> In the release function, we'll compare the inode pointer passed in with
> the saved inode pointer, and do the SPECIAL close case, and then zero
> out the saved inode pointer.  This isn't a completely reliable solution.
> But, it's the best I think of for now.
>
> A patch for 2.4 is attached at the end of this mail.  I'd appreciate it if
> you could give it a try and let me know how it works.
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] DAC960_Release bug (2.4.x)
  2003-04-18  9:20       ` John v/d Kamp
@ 2003-04-18 16:12         ` Dave Olien
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Olien @ 2003-04-18 16:12 UTC (permalink / raw)
  To: John v/d Kamp; +Cc: linux-kernel


Thanks for testing this patch and the rpm pointer!
I'll look over the rpm this week end, and probably send this patch to
marcello next week. Then, generate a patch for 2.5.


On Fri, Apr 18, 2003 at 11:20:01AM +0200, John v/d Kamp wrote:
> The libhd can be found here:
> ftp://ftp.suse.com/pub/suse/i386/8.1/suse/src/hwinfo-5.39-2.src.rpm
> We use a somewhat older version, but that probably doesn't matter.
> 
> I've compiled the module, and our install software ran just fine.
> Repartitioning the drive was no problem. Using the driver on a normal
> system was no problem either (never was).
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2003-04-18 16:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-16  9:55 [PATCH] DAC960_Release bug (2.4.x) John v/d Kamp
2003-04-16 22:40 ` Dave Olien
2003-04-17  8:48   ` John v/d Kamp
2003-04-17 20:27     ` Dave Olien
2003-04-18  9:20       ` John v/d Kamp
2003-04-18 16:12         ` Dave Olien

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).