linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] Bug fix for the s390 dcssblk driver
       [not found] <200710201451.57138.elendil@planet.nl>
@ 2007-10-20 17:24 ` emist
  2007-10-21 10:09   ` Heiko Carstens
  0 siblings, 1 reply; 7+ messages in thread
From: emist @ 2007-10-20 17:24 UTC (permalink / raw)
  To: Linux

Frans Pop wrote:
> emist wrote:
>> The following patch fixes and issue in the s390 dcssblk driver. The
>> issue is caused when an unsuccessful attempt is made in order to change
>> a segment's type through the device attribute file "shared". This causes
>> the driver to remove the device in question, removing with it the device
>> attribute which is currently handling the call. The result is a hang on
>> the driver as it removes memory from under its feet.
>>
>> Not exactly sure if this explanation makes sense or its entirely
>> accurate. This is what I believe at this point from encountering and
>> fixing the error. Anyway here is the patch, hope it helps.
> 
> Hi,
> 
> If you don't get any reactions to your patch during the next few days, I 
> suggest you resend it and then CC the linux-s390@vger.kernel.org list and 
> possibly also the maintainer at linux390@de.ibm.com.
> 
> In general you should always try to CC the relevant list/people as listed in 
> the MAINTAINERS file and not just the linux-kernel list, both for patches 
> and when reporting problems.
> 
> Cheers,
> Frans Pop
> 

Thanks Frans, I will do as you suggest.

Have a good one,

Igor H.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Bug fix for the s390 dcssblk driver
  2007-10-20 17:24 ` [PATCH] Bug fix for the s390 dcssblk driver emist
@ 2007-10-21 10:09   ` Heiko Carstens
  2007-10-22  3:46     ` emist
  0 siblings, 1 reply; 7+ messages in thread
From: Heiko Carstens @ 2007-10-21 10:09 UTC (permalink / raw)
  To: emist; +Cc: Linux, Gerald Schaefer, Carsten Otte, Martin Schwidefsky

On Sat, Oct 20, 2007 at 01:24:34PM -0400, emist wrote:
> Frans Pop wrote:
> > emist wrote:
> >> The following patch fixes and issue in the s390 dcssblk driver. The
> >> issue is caused when an unsuccessful attempt is made in order to change
> >> a segment's type through the device attribute file "shared". This causes
> >> the driver to remove the device in question, removing with it the device
> >> attribute which is currently handling the call. The result is a hang on
> >> the driver as it removes memory from under its feet.
> >>
> >> Not exactly sure if this explanation makes sense or its entirely
> >> accurate. This is what I believe at this point from encountering and
> >> fixing the error. Anyway here is the patch, hope it helps.
> > 
> > Hi,
> > 
> > If you don't get any reactions to your patch during the next few days, I 
> > suggest you resend it and then CC the linux-s390@vger.kernel.org list and 
> > possibly also the maintainer at linux390@de.ibm.com.
> > 
> > In general you should always try to CC the relevant list/people as listed in 
> > the MAINTAINERS file and not just the linux-kernel list, both for patches 
> > and when reporting problems.
> > 
> > Cheers,
> > Frans Pop
> > 
> 
> Thanks Frans, I will do as you suggest.
> 
> Have a good one,
> 
> Igor H.

Gerald or Carsten (cc'ed) should look into this.
Thanks for reporting.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Bug fix for the s390 dcssblk driver
  2007-10-21 10:09   ` Heiko Carstens
@ 2007-10-22  3:46     ` emist
  2007-10-22 11:37       ` Cornelia Huck
  0 siblings, 1 reply; 7+ messages in thread
From: emist @ 2007-10-22  3:46 UTC (permalink / raw)
  To: Linux; +Cc: geraldsc, cotte, linux390, linux-s390

[-- Attachment #1: Type: text/plain, Size: 1605 bytes --]

Heiko Carstens wrote:
> On Sat, Oct 20, 2007 at 01:24:34PM -0400, emist wrote:
>> Frans Pop wrote:
>>> emist wrote:
>>>> The following patch fixes and issue in the s390 dcssblk driver. The
>>>> issue is caused when an unsuccessful attempt is made in order to change
>>>> a segment's type through the device attribute file "shared". This causes
>>>> the driver to remove the device in question, removing with it the device
>>>> attribute which is currently handling the call. The result is a hang on
>>>> the driver as it removes memory from under its feet.
>>>>
>>>> Not exactly sure if this explanation makes sense or its entirely
>>>> accurate. This is what I believe at this point from encountering and
>>>> fixing the error. Anyway here is the patch, hope it helps.
>>> Hi,
>>>
>>> If you don't get any reactions to your patch during the next few days, I 
>>> suggest you resend it and then CC the linux-s390@vger.kernel.org list and 
>>> possibly also the maintainer at linux390@de.ibm.com.
>>>
>>> In general you should always try to CC the relevant list/people as listed in 
>>> the MAINTAINERS file and not just the linux-kernel list, both for patches 
>>> and when reporting problems.
>>>
>>> Cheers,
>>> Frans Pop
>>>
>> Thanks Frans, I will do as you suggest.
>>
>> Have a good one,
>>
>> Igor H.
> 
> Gerald or Carsten (cc'ed) should look into this.
> Thanks for reporting.
> 

Hello,

I realized that I did not fix one of the cases where this bug manifests
in my last patch. Here is the complete patch to fix the issue. And this
time I cc'ed the relevant people.

Have a good one,

Igor H.



[-- Attachment #2: dcssblk_fix --]
[-- Type: text/plain, Size: 1539 bytes --]

# This patch fixes a memory corruption bug in the s390 dcssblk driver.
# The bug occurs when an attempt to change the type of a segment
# returns an error. At this point the driver tries to remove the segment in
# question while some of the device's attributes are in use. This causes the
# driver to hang.
#
# questions/comments @ emistz@gmail.com


diff -urN linux-2.6.23.1/drivers/s390/block/dcssblk.c linuxx/drivers/s390/block/dcssblk.c
--- linux-2.6.23.1/drivers/s390/block/dcssblk.c	2007-10-20 01:19:29.000000000 -0400
+++ linuxx/drivers/s390/block/dcssblk.c	2007-10-20 01:16:13.000000000 -0400
@@ -230,8 +230,15 @@
 					   SEGMENT_SHARED);
 		if (rc < 0) {
 			BUG_ON(rc == -EINVAL);
-			if (rc != -EAGAIN)
-				goto removeseg;
+			if (rc != -EAGAIN){
+						PRINT_DEBUG
+				    ("Could not reload segment %s in the specified format, reloading\n",
+				     dev_info->segment_name);
+				rc = segment_modify_shared(dev_info->
+							   segment_name,
+							   SEGMENT_EXCLUSIVE);
+				goto out;
+			}
 		} else {
 			dev_info->is_shared = 1;
 			switch (dev_info->segment_type) {
@@ -253,8 +260,12 @@
 					   SEGMENT_EXCLUSIVE);
 		if (rc < 0) {
 			BUG_ON(rc == -EINVAL);
-			if (rc != -EAGAIN)
-				goto removeseg;
+			if (rc != -EAGAIN){
+				PRINT_DEBUG("Could not reload segment %s in the specified format, reloading\n",
+					dev_info->segment_name);
+				rc = segment_modify_shared(dev_info->segment_name, SEGMENT_SHARED);
+				goto out;	
+			}
 		} else {
 			dev_info->is_shared = 0;
 			set_disk_ro(dev_info->gd, 0);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Bug fix for the s390 dcssblk driver
  2007-10-22  3:46     ` emist
@ 2007-10-22 11:37       ` Cornelia Huck
  2007-10-23 13:22         ` Gerald Schaefer
  0 siblings, 1 reply; 7+ messages in thread
From: Cornelia Huck @ 2007-10-22 11:37 UTC (permalink / raw)
  To: emist; +Cc: Linux, geraldsc, cotte, linux390, linux-s390, Tejun Heo

On Sun, 21 Oct 2007 23:46:49 -0400,
emist <emistz@gmail.com> wrote:

> # This patch fixes a memory corruption bug in the s390 dcssblk driver.
> # The bug occurs when an attempt to change the type of a segment
> # returns an error. At this point the driver tries to remove the segment in
> # question while some of the device's attributes are in use. This causes the
> # driver to hang.

Hm, seems we missed another of those device attributes exhibiting
suicidal tendencies...

Tejun has a patchset allowing device attributes to commit suicide (see
http://marc.info/?l=linux-kernel&m=119027371416452&w=2), although I'm
not sure what its current status is. Until then, you would need to use
device_schedule_callback() to commit suicide.

This all of course only applies if killing the segment is better than
leaving it in its current state, but others can make a better judgement
on that :)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Bug fix for the s390 dcssblk driver
  2007-10-22 11:37       ` Cornelia Huck
@ 2007-10-23 13:22         ` Gerald Schaefer
  2007-10-23 22:03           ` emist
  0 siblings, 1 reply; 7+ messages in thread
From: Gerald Schaefer @ 2007-10-23 13:22 UTC (permalink / raw)
  To: emist
  Cc: Cornelia Huck, Linux, geraldsc, cotte, linux390, linux-s390, Tejun Heo

On Mon, 2007-10-22 at 13:37 +0200, Cornelia Huck wrote:
> On Sun, 21 Oct 2007 23:46:49 -0400,
> emist <emistz@gmail.com> wrote:
> 
> > # This patch fixes a memory corruption bug in the s390 dcssblk driver.
> > # The bug occurs when an attempt to change the type of a segment
> > # returns an error. At this point the driver tries to remove the segment in
> > # question while some of the device's attributes are in use. This causes the
> > # driver to hang.
> 
> Hm, seems we missed another of those device attributes exhibiting
> suicidal tendencies...
> 
> Tejun has a patchset allowing device attributes to commit suicide (see
> http://marc.info/?l=linux-kernel&m=119027371416452&w=2), although I'm
> not sure what its current status is. Until then, you would need to use
> device_schedule_callback() to commit suicide.
> 
> This all of course only applies if killing the segment is better than
> leaving it in its current state, but others can make a better judgement
> on that :)

Hi,

thanks for reporting this bug, seems like we forgot to consider the
suicidal behavior of this driver when the device_unregister() stuff was
changed.

The best solution for now would be to use the scheduled callback
function, like Cornelia described. If segment_modify_shared() should
fail, the DCSS segment will be unloaded. Calling the function again
with the old "shared" flag will not help because it will not reload
the segment. So we need to remove/unregister the device in this error
path, and for now this should be done with device_schedule_callback().

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---

 dcssblk.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6.23/drivers/s390/block/dcssblk.c
===================================================================
--- linux-2.6.23.orig/drivers/s390/block/dcssblk.c
+++ linux-2.6.23/drivers/s390/block/dcssblk.c
@@ -193,6 +193,12 @@ dcssblk_segment_warn(int rc, char* seg_n
        }
 }

+static void dcssblk_unregister_callback(struct device *dev)
+{
+       device_unregister(dev);
+       put_device(dev);
+}
+
 /*
  * device attribute for switching shared/nonshared (exclusive)
  * operation (show + store)
@@ -276,8 +282,7 @@ removeseg:
        blk_cleanup_queue(dev_info->dcssblk_queue);
        dev_info->gd->queue = NULL;
        put_disk(dev_info->gd);
-       device_unregister(dev);
-       put_device(dev);
+       rc = device_schedule_callback(dev, dcssblk_unregister_callback);
 out:
        up_write(&dcssblk_devices_sem);
        return rc;


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Bug fix for the s390 dcssblk driver
  2007-10-23 13:22         ` Gerald Schaefer
@ 2007-10-23 22:03           ` emist
  0 siblings, 0 replies; 7+ messages in thread
From: emist @ 2007-10-23 22:03 UTC (permalink / raw)
  To: Gerald Schaefer, Linux, cornelia.huck

Gerald Schaefer wrote:
> On Mon, 2007-10-22 at 13:37 +0200, Cornelia Huck wrote:
>> On Sun, 21 Oct 2007 23:46:49 -0400,
>> emist <emistz@gmail.com> wrote:
>>
>>> # This patch fixes a memory corruption bug in the s390 dcssblk driver.
>>> # The bug occurs when an attempt to change the type of a segment
>>> # returns an error. At this point the driver tries to remove the segment in
>>> # question while some of the device's attributes are in use. This causes the
>>> # driver to hang.
>> Hm, seems we missed another of those device attributes exhibiting
>> suicidal tendencies...
>>
>> Tejun has a patchset allowing device attributes to commit suicide (see
>> http://marc.info/?l=linux-kernel&m=119027371416452&w=2), although I'm
>> not sure what its current status is. Until then, you would need to use
>> device_schedule_callback() to commit suicide.
>>
>> This all of course only applies if killing the segment is better than
>> leaving it in its current state, but others can make a better judgement
>> on that :)
> 
> Hi,
> 
> thanks for reporting this bug, seems like we forgot to consider the
> suicidal behavior of this driver when the device_unregister() stuff was
> changed.
> 
> The best solution for now would be to use the scheduled callback
> function, like Cornelia described. If segment_modify_shared() should
> fail, the DCSS segment will be unloaded. Calling the function again
> with the old "shared" flag will not help because it will not reload
> the segment. So we need to remove/unregister the device in this error
> path, and for now this should be done with device_schedule_callback().
> 
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> ---
> 
>  dcssblk.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.23/drivers/s390/block/dcssblk.c
> ===================================================================
> --- linux-2.6.23.orig/drivers/s390/block/dcssblk.c
> +++ linux-2.6.23/drivers/s390/block/dcssblk.c
> @@ -193,6 +193,12 @@ dcssblk_segment_warn(int rc, char* seg_n
>         }
>  }
> 
> +static void dcssblk_unregister_callback(struct device *dev)
> +{
> +       device_unregister(dev);
> +       put_device(dev);
> +}
> +
>  /*
>   * device attribute for switching shared/nonshared (exclusive)
>   * operation (show + store)
> @@ -276,8 +282,7 @@ removeseg:
>         blk_cleanup_queue(dev_info->dcssblk_queue);
>         dev_info->gd->queue = NULL;
>         put_disk(dev_info->gd);
> -       device_unregister(dev);
> -       put_device(dev);
> +       rc = device_schedule_callback(dev, dcssblk_unregister_callback);
>  out:
>         up_write(&dcssblk_devices_sem);
>         return rc;
> 
> 
Hey,

That makes sense. I no longer have access to an s390 system so I will
not be able to provide a tested patch for this bug. I will however
attempt to fix this issue and submit a patch using the scheduled
callback function and maybe someone could make sure that it works properly.

Have a good one,

Igor H.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Bug fix for the s390 dcssblk driver
@ 2007-10-20  5:07 emist
  0 siblings, 0 replies; 7+ messages in thread
From: emist @ 2007-10-20  5:07 UTC (permalink / raw)
  To: Linux

[-- Attachment #1: Type: text/plain, Size: 615 bytes --]

Hello,

The following patch fixes and issue in the s390 dcssblk driver. The
issue is caused when an unsuccessful attempt is made in order to change
a segment's type through the device attribute file "shared". This causes
the driver to remove the device in question, removing with it the device
attribute which is currently handling the call. The result is a hang on
the driver as it removes memory from under its feet.

Not exactly sure if this explanation makes sense or its entirely
accurate. This is what I believe at this point from encountering and
fixing the error. Anyway here is the patch, hope it helps.



[-- Attachment #2: dcssblk_fix --]
[-- Type: text/plain, Size: 1050 bytes --]

# This patch fixes a memory corruption bug in the s390 dcssblk driver.
# The bug occurs when an attempt to change the type of a segment 
# returns an error. At this point the driver tries to remove the segment in
# question while some of the device's attributes are in use. This causes the
# driver to hang.
# 
# questions/comments @ emistz@gmail.com

diff -urN linux-2.6.23.1/drivers/s390/block/dcssblk.c linuxx/drivers/s390/block/dcssblk.c
--- linux-2.6.23.1/drivers/s390/block/dcssblk.c	2007-10-12 12:43:44.000000000 -0400
+++ linuxx/drivers/s390/block/dcssblk.c	2007-10-20 00:51:19.000000000 -0400
@@ -253,8 +253,12 @@
 					   SEGMENT_EXCLUSIVE);
 		if (rc < 0) {
 			BUG_ON(rc == -EINVAL);
-			if (rc != -EAGAIN)
-				goto removeseg;
+			if (rc != -EAGAIN){
+				PRINT_DEBUG("Could not reload segment %s in the specified format, reloading\n",
+					dev_info->segment_name);
+				rc = segment_modify_shared(dev_info->segment_name, SEGMENT_SHARED);
+				goto out;	
+			}
 		} else {
 			dev_info->is_shared = 0;
 			set_disk_ro(dev_info->gd, 0);

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2007-10-23 22:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <200710201451.57138.elendil@planet.nl>
2007-10-20 17:24 ` [PATCH] Bug fix for the s390 dcssblk driver emist
2007-10-21 10:09   ` Heiko Carstens
2007-10-22  3:46     ` emist
2007-10-22 11:37       ` Cornelia Huck
2007-10-23 13:22         ` Gerald Schaefer
2007-10-23 22:03           ` emist
2007-10-20  5:07 emist

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).