linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.5.37 broke the floppy driver
@ 2002-09-21 23:01 Mikael Pettersson
  2002-09-22 18:37 ` Thomas Molina
  2002-09-23  9:10 ` Jens Axboe
  0 siblings, 2 replies; 8+ messages in thread
From: Mikael Pettersson @ 2002-09-21 23:01 UTC (permalink / raw)
  To: linux-kernel

With 2.5.37, doing a write to floppy makes the kernel print
"blk: request botched" and a few seconds later instantly reboot
the machine (w/o any further messages). 2.5.36 works fine.

"dd bs=8k if=bzImage of=/dev/fd0" triggers this every time.

/Mikael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
  2002-09-21 23:01 2.5.37 broke the floppy driver Mikael Pettersson
@ 2002-09-22 18:37 ` Thomas Molina
  2002-09-23  9:10 ` Jens Axboe
  1 sibling, 0 replies; 8+ messages in thread
From: Thomas Molina @ 2002-09-22 18:37 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel

On Sun, 22 Sep 2002, Mikael Pettersson wrote:

> With 2.5.37, doing a write to floppy makes the kernel print
> "blk: request botched" and a few seconds later instantly reboot
> the machine (w/o any further messages). 2.5.36 works fine.
> 
> "dd bs=8k if=bzImage of=/dev/fd0" triggers this every time.

I duplicated this on 2.5.37-bk as well as 2.5.38-bk.  Maybe we have an 
off-by-one error?  I see the following under 2.5.38-bk:

[tmolina@dad boot]$ dd if=bzImage of=/dev/fd0
dd: writing to `/dev/fd0': No space left on device
5+0 records in
4+0 records out

If I repeate the command I get lines of the 
blk: request botched messages with the following flashed briefly on the 
screen (I wouldn't have seen it if I weren't looking for it):

dd: writing to `/dev/fd0': No space left on device
1441+0 records in
1440+0 records out



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
  2002-09-21 23:01 2.5.37 broke the floppy driver Mikael Pettersson
  2002-09-22 18:37 ` Thomas Molina
@ 2002-09-23  9:10 ` Jens Axboe
  1 sibling, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2002-09-23  9:10 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel

On Sun, Sep 22 2002, Mikael Pettersson wrote:
> With 2.5.37, doing a write to floppy makes the kernel print
> "blk: request botched" and a few seconds later instantly reboot
> the machine (w/o any further messages). 2.5.36 works fine.
> 
> "dd bs=8k if=bzImage of=/dev/fd0" triggers this every time.

Attached patch should fix the partial completion thing for floppy.

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.601   -> 1.602  
#	drivers/block/ll_rw_blk.c	1.107   -> 1.108  
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/09/23	axboe@burns.home.kernel.dk	1.602
# cleanup end_that_request_first() end_io handling, and fix bug where
# partial completes didn't get accounted right wrt blk_recalc_rq_sectors()
# --------------------------------------------
#
diff -Nru a/drivers/block/ll_rw_blk.c b/drivers/block/ll_rw_blk.c
--- a/drivers/block/ll_rw_blk.c	Mon Sep 23 11:10:18 2002
+++ b/drivers/block/ll_rw_blk.c	Mon Sep 23 11:10:18 2002
@@ -1904,18 +1904,19 @@
 
 int end_that_request_first(struct request *req, int uptodate, int nr_sectors)
 {
-	int nsect, total_nsect;
+	int total_nsect = 0, error = 0;
 	struct bio *bio;
 
 	req->errors = 0;
-	if (!uptodate)
+	if (!uptodate) {
 		printk("end_request: I/O error, dev %s, sector %lu\n",
 			kdevname(req->rq_dev), req->sector);
+		error = -EIO;
+	}
 
-	total_nsect = 0;
 	while ((bio = req->bio)) {
-		nsect = bio_iovec(bio)->bv_len >> 9;
-		total_nsect += nsect;
+		const int nsect = bio_iovec(bio)->bv_len >> 9;
+		int new_bio = 0;
 
 		BIO_BUG_ON(bio_iovec(bio)->bv_len > bio->bi_size);
 
@@ -1927,36 +1928,52 @@
 
 			bio_iovec(bio)->bv_offset += partial;
 			bio_iovec(bio)->bv_len -= partial;
-			blk_recalc_rq_sectors(req, total_nsect);
-			blk_recalc_rq_segments(req);
-			bio_endio(bio, partial, !uptodate ? -EIO : 0);
-			return 1;
+			bio_endio(bio, partial, error);
+			total_nsect += nr_sectors;
+			break;
 		}
 
 		/*
-		 * if bio->bi_end_io returns 0, this bio is done. move on
+		 * we are ending the last part of the bio, advance req pointer
 		 */
-		req->bio = bio->bi_next;
-		if (bio_endio(bio, nsect << 9, !uptodate ? -EIO : 0)) {
-			bio->bi_idx++;
-			req->bio = bio;
+		if ((nsect << 9) >= bio->bi_size) {
+			req->bio = bio->bi_next;
+			new_bio = 1;
 		}
 
+		bio_endio(bio, nsect << 9, error);
+
+		total_nsect += nsect;
 		nr_sectors -= nsect;
 
+		/*
+		 * if we didn't advance the req->bio pointer, advance bi_idx
+		 * to indicate we are now on the next bio_vec
+		 */
+		if (!new_bio)
+			bio->bi_idx++;
+
 		if ((bio = req->bio)) {
 			/*
 			 * end more in this run, or just return 'not-done'
 			 */
-			if (unlikely(nr_sectors <= 0)) {
-				blk_recalc_rq_sectors(req, total_nsect);
-				blk_recalc_rq_segments(req);
-				return 1;
-			}
+			if (unlikely(nr_sectors <= 0))
+				break;
 		}
 	}
 
-	return 0;
+	/*
+	 * completely done
+	 */
+	if (!req->bio)
+		return 0;
+
+	/*
+	 * if the request wasn't completed, update state
+	 */
+	blk_recalc_rq_sectors(req, total_nsect);
+	blk_recalc_rq_segments(req);
+	return 1;
 }
 
 void end_that_request_last(struct request *req)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
  2002-09-23 19:09       ` Alexander Viro
@ 2002-09-23 20:04         ` Mikael Pettersson
  0 siblings, 0 replies; 8+ messages in thread
From: Mikael Pettersson @ 2002-09-23 20:04 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Jens Axboe, Linus Torvalds, linux-kernel

Alexander Viro writes:
 > 
 > 
 > On Mon, 23 Sep 2002, Mikael Pettersson wrote:
 > 
 > > With O100-get_gendisk-C38 the oops is cured, but the floppy size is
 > > still wrong. Freshly booted, dd if=/dev/fd0H1440 bs=72k of=/dev/null
 > > reads only 720K instead of 1440K. Same thing on write: trying to
 > > write more than 720K results in an ENOSPC error.
 > 
 > 
 > Arrrgh.  O/O101-floppy_sizes-C38 and it's also my fsckup - missed the
 > size in kilobytes/size in sectors.  Fortunately that kind of crap is
 > over - blk_size[] is no more...

With that patch the /dev/fd0H1440 device now seems to work. However,
the /dev/fd0 (autosensing) device gets its very first read or write
after boot wrong. For example, dd if=/dev/fd0 bs=8k count=1 of=/dev/null
only reads 4096 bytes, and likewise a write will only write 4096 bytes
before reporting ENOSPC. Accesses after the first work though.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
  2002-09-23 18:53     ` Mikael Pettersson
@ 2002-09-23 19:09       ` Alexander Viro
  2002-09-23 20:04         ` Mikael Pettersson
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Viro @ 2002-09-23 19:09 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: Jens Axboe, Linus Torvalds, linux-kernel



On Mon, 23 Sep 2002, Mikael Pettersson wrote:

> With O100-get_gendisk-C38 the oops is cured, but the floppy size is
> still wrong. Freshly booted, dd if=/dev/fd0H1440 bs=72k of=/dev/null
> reads only 720K instead of 1440K. Same thing on write: trying to
> write more than 720K results in an ENOSPC error.


Arrrgh.  O/O101-floppy_sizes-C38 and it's also my fsckup - missed the
size in kilobytes/size in sectors.  Fortunately that kind of crap is
over - blk_size[] is no more...


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
       [not found]   ` <Pine.GSO.4.21.0209231104090.3948-100000@weyl.math.psu.edu>
@ 2002-09-23 18:53     ` Mikael Pettersson
  2002-09-23 19:09       ` Alexander Viro
  0 siblings, 1 reply; 8+ messages in thread
From: Mikael Pettersson @ 2002-09-23 18:53 UTC (permalink / raw)
  To: Alexander Viro
  Cc: Jens Axboe, Linus Torvalds, Mikael Pettersson, linux-kernel

Alexander Viro writes:
 > 
 > 
 > On Mon, 23 Sep 2002, Jens Axboe wrote:
 > 
 > > Al? The first bug was a legitimate partial completion error in
 > > ll_rw_blk, however there appears to be other breakage hitting floppy as
 > > well.
 > 
 > The third is my fault - I've screwed up reordering patches; get_gendisk
 > prototype change should've been submitted before floppy gendisks.  Missing
 > patch is O/O100-get_gendisk-C38 in usual place, I'll send it to Linus
 > when I finish with the next 4 chunks (ubd fixes on top of jdike's ones +
 > unexporting register_disk() + tapeblock switch to gendisk + removal of
 > "what if opened device has no gendisk" logics).  No comments on the
 > second one, though.

With O100-get_gendisk-C38 the oops is cured, but the floppy size is
still wrong. Freshly booted, dd if=/dev/fd0H1440 bs=72k of=/dev/null
reads only 720K instead of 1440K. Same thing on write: trying to
write more than 720K results in an ENOSPC error.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
  2002-09-23 13:05 Mikael Pettersson
@ 2002-09-23 14:52 ` Jens Axboe
       [not found]   ` <Pine.GSO.4.21.0209231104090.3948-100000@weyl.math.psu.edu>
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2002-09-23 14:52 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel, Alexander Viro

On Mon, Sep 23 2002, Mikael Pettersson wrote:
> On Mon, 23 Sep 2002 11:10:41 +0200, Jens Axboe wrote:
> >On Sun, Sep 22 2002, Mikael Pettersson wrote:
> >> With 2.5.37, doing a write to floppy makes the kernel print
> >> "blk: request botched" and a few seconds later instantly reboot
> >> the machine (w/o any further messages). 2.5.36 works fine.
> >> 
> >> "dd bs=8k if=bzImage of=/dev/fd0" triggers this every time.
> >
> >Attached patch should fix the partial completion thing for floppy.
> >
> ># This is a BitKeeper generated patch for the following project:
> ># Project Name: Linux kernel tree
> ># This patch format is intended for GNU patch command version 2.5 or higher.
> ># This patch includes the following deltas:
> >#	           ChangeSet	1.601   -> 1.602  
> >#	drivers/block/ll_rw_blk.c	1.107   -> 1.108  
> 
> It's an improvement (the kernel doesn't reboot as soon as I
> read or write /dev/fd0), but there are still some strange
> things going on with floppy in 2.5.38 (this all worked in 2.5.36):
> 
> 1. dd if=/dev/fd0 bs=72k of=/tmp/a
>    (after reboot) only reads 2048 bytes; /dev/fd0 is actually 1.44M
> 2. dd if=/dev/fd0 bs=72k of=/tmp/a
>    (repeat the command) now it reads 10 records = 720K,
>    which is still only half of the true size
> 3. dd if=/dev/fd0H1440 bs=72k of=/tmp/a
>    oopses in blk_dev.c:do_open() line 676, see below:

Al? The first bug was a legitimate partial completion error in
ll_rw_blk, however there appears to be other breakage hitting floppy as
well.

> (2.5.38 tarball, plain UP config, gcc-2.95.3)
> 
> Unable to handle kernel paging request at virtual address 00001738
> c01370f0
> *pde = 00000000
> Oops: 0000
> CPU:    0
> EIP:    0060:[<c01370f0>]    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010292
> eax: c02fe12c   ebx: c11a02a0   ecx: 00001810   edx: c7b09350
> esi: c11a0240   edi: c11a021c   ebp: 00000000   esp: c6c2df10
> ds: 0068   es: 0068   ss: 0068
> Stack: c7b09350 c6f09aa0 ffffffe9 c114b360 c11a0260 00000000 00000000 c0137276 
>        c11a0240 c7b09350 c6f09aa0 c7b09350 c6f09aa0 c7b09350 c0130869 c7b09350 
>        c6f09aa0 00000000 c118c000 00008000 bffff9a8 c01307a6 c6c96ca0 c114b360 
> Call Trace: [<c0137276>] [<c0130869>] [<c01307a6>] [<c0130b13>] [<c0106dbf>] 
> Code: 83 b9 28 ff ff ff 00 75 1f 8b 46 44 ff 48 50 8b 46 44 8d 48 
> 
> 
> >>EIP; c01370f0 <do_open+258/344>   <=====
> 
> >>eax; c02fe12c <blk_dev+22c/d340>
> >>ebx; c11a02a0 <END_OF_CODE+e87c20/????>
> >>ecx; 00001810 Before first symbol
> >>edx; c7b09350 <END_OF_CODE+77f0cd0/????>
> >>esi; c11a0240 <END_OF_CODE+e87bc0/????>
> >>edi; c11a021c <END_OF_CODE+e87b9c/????>
> >>esp; c6c2df10 <END_OF_CODE+6915890/????>
> 
> Trace; c0137276 <blkdev_open+22/28>
> Trace; c0130869 <dentry_open+b9/16c>
> Trace; c01307a6 <filp_open+52/5c>
> Trace; c0130b13 <sys_open+37/78>
> Trace; c0106dbf <syscall_call+7/b>
> 
> Code;  c01370f0 <do_open+258/344>
> 00000000 <_EIP>:
> Code;  c01370f0 <do_open+258/344>   <=====
>    0:   83 b9 28 ff ff ff 00      cmpl   $0x0,0xffffff28(%ecx)   <=====
> Code;  c01370f7 <do_open+25f/344>
>    7:   75 1f                     jne    28 <_EIP+0x28> c0137118 <do_open+280/344>
> Code;  c01370f9 <do_open+261/344>
>    9:   8b 46 44                  mov    0x44(%esi),%eax
> Code;  c01370fc <do_open+264/344>
>    c:   ff 48 50                  decl   0x50(%eax)
> Code;  c01370ff <do_open+267/344>
>    f:   8b 46 44                  mov    0x44(%esi),%eax
> Code;  c0137102 <do_open+26a/344>
>   12:   8d 48 00                  lea    0x0(%eax),%ecx
> 
> fs/block_dev.c:
> static int do_open(struct block_device *bdev, struct inode *inode, struct file *file)
> {
> 	...
> 	if (bdev->bd_contains == bdev) {
> 		...
> 	} else {
> 		down(&bdev->bd_contains->bd_sem);
> 		bdev->bd_contains->bd_part_count++;
> 		if (!bdev->bd_openers) {
> 			struct gendisk *g = get_gendisk(dev);
> 			struct hd_struct *p;
> BOGUS? ->		p = g->part + minor(dev) - g->first_minor - 1;
> 			inode->i_data.backing_dev_info =
> 			   bdev->bd_inode->i_data.backing_dev_info =
> 			   bdev->bd_contains->bd_inode->i_data.backing_dev_info;
> OOPS HERE ->		if (!p->nr_sects) {
> 				bdev->bd_contains->bd_part_count--;
> 				up(&bdev->bd_contains->bd_sem);
> 				ret = -ENXIO;
> 				goto out2;
> 			}
> 
> I correlated a gdb disassembly with do_open(), and it looks like
> 'p' got a bogus value (ecx, 0x1810) at the indicated line.
> 
> /Mikael
> 

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 2.5.37 broke the floppy driver
@ 2002-09-23 13:05 Mikael Pettersson
  2002-09-23 14:52 ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Mikael Pettersson @ 2002-09-23 13:05 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel

On Mon, 23 Sep 2002 11:10:41 +0200, Jens Axboe wrote:
>On Sun, Sep 22 2002, Mikael Pettersson wrote:
>> With 2.5.37, doing a write to floppy makes the kernel print
>> "blk: request botched" and a few seconds later instantly reboot
>> the machine (w/o any further messages). 2.5.36 works fine.
>> 
>> "dd bs=8k if=bzImage of=/dev/fd0" triggers this every time.
>
>Attached patch should fix the partial completion thing for floppy.
>
># This is a BitKeeper generated patch for the following project:
># Project Name: Linux kernel tree
># This patch format is intended for GNU patch command version 2.5 or higher.
># This patch includes the following deltas:
>#	           ChangeSet	1.601   -> 1.602  
>#	drivers/block/ll_rw_blk.c	1.107   -> 1.108  

It's an improvement (the kernel doesn't reboot as soon as I
read or write /dev/fd0), but there are still some strange
things going on with floppy in 2.5.38 (this all worked in 2.5.36):

1. dd if=/dev/fd0 bs=72k of=/tmp/a
   (after reboot) only reads 2048 bytes; /dev/fd0 is actually 1.44M
2. dd if=/dev/fd0 bs=72k of=/tmp/a
   (repeat the command) now it reads 10 records = 720K,
   which is still only half of the true size
3. dd if=/dev/fd0H1440 bs=72k of=/tmp/a
   oopses in blk_dev.c:do_open() line 676, see below:

(2.5.38 tarball, plain UP config, gcc-2.95.3)

Unable to handle kernel paging request at virtual address 00001738
c01370f0
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0060:[<c01370f0>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010292
eax: c02fe12c   ebx: c11a02a0   ecx: 00001810   edx: c7b09350
esi: c11a0240   edi: c11a021c   ebp: 00000000   esp: c6c2df10
ds: 0068   es: 0068   ss: 0068
Stack: c7b09350 c6f09aa0 ffffffe9 c114b360 c11a0260 00000000 00000000 c0137276 
       c11a0240 c7b09350 c6f09aa0 c7b09350 c6f09aa0 c7b09350 c0130869 c7b09350 
       c6f09aa0 00000000 c118c000 00008000 bffff9a8 c01307a6 c6c96ca0 c114b360 
Call Trace: [<c0137276>] [<c0130869>] [<c01307a6>] [<c0130b13>] [<c0106dbf>] 
Code: 83 b9 28 ff ff ff 00 75 1f 8b 46 44 ff 48 50 8b 46 44 8d 48 


>>EIP; c01370f0 <do_open+258/344>   <=====

>>eax; c02fe12c <blk_dev+22c/d340>
>>ebx; c11a02a0 <END_OF_CODE+e87c20/????>
>>ecx; 00001810 Before first symbol
>>edx; c7b09350 <END_OF_CODE+77f0cd0/????>
>>esi; c11a0240 <END_OF_CODE+e87bc0/????>
>>edi; c11a021c <END_OF_CODE+e87b9c/????>
>>esp; c6c2df10 <END_OF_CODE+6915890/????>

Trace; c0137276 <blkdev_open+22/28>
Trace; c0130869 <dentry_open+b9/16c>
Trace; c01307a6 <filp_open+52/5c>
Trace; c0130b13 <sys_open+37/78>
Trace; c0106dbf <syscall_call+7/b>

Code;  c01370f0 <do_open+258/344>
00000000 <_EIP>:
Code;  c01370f0 <do_open+258/344>   <=====
   0:   83 b9 28 ff ff ff 00      cmpl   $0x0,0xffffff28(%ecx)   <=====
Code;  c01370f7 <do_open+25f/344>
   7:   75 1f                     jne    28 <_EIP+0x28> c0137118 <do_open+280/344>
Code;  c01370f9 <do_open+261/344>
   9:   8b 46 44                  mov    0x44(%esi),%eax
Code;  c01370fc <do_open+264/344>
   c:   ff 48 50                  decl   0x50(%eax)
Code;  c01370ff <do_open+267/344>
   f:   8b 46 44                  mov    0x44(%esi),%eax
Code;  c0137102 <do_open+26a/344>
  12:   8d 48 00                  lea    0x0(%eax),%ecx

fs/block_dev.c:
static int do_open(struct block_device *bdev, struct inode *inode, struct file *file)
{
	...
	if (bdev->bd_contains == bdev) {
		...
	} else {
		down(&bdev->bd_contains->bd_sem);
		bdev->bd_contains->bd_part_count++;
		if (!bdev->bd_openers) {
			struct gendisk *g = get_gendisk(dev);
			struct hd_struct *p;
BOGUS? ->		p = g->part + minor(dev) - g->first_minor - 1;
			inode->i_data.backing_dev_info =
			   bdev->bd_inode->i_data.backing_dev_info =
			   bdev->bd_contains->bd_inode->i_data.backing_dev_info;
OOPS HERE ->		if (!p->nr_sects) {
				bdev->bd_contains->bd_part_count--;
				up(&bdev->bd_contains->bd_sem);
				ret = -ENXIO;
				goto out2;
			}

I correlated a gdb disassembly with do_open(), and it looks like
'p' got a bogus value (ecx, 0x1810) at the indicated line.

/Mikael

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2002-09-23 19:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-09-21 23:01 2.5.37 broke the floppy driver Mikael Pettersson
2002-09-22 18:37 ` Thomas Molina
2002-09-23  9:10 ` Jens Axboe
2002-09-23 13:05 Mikael Pettersson
2002-09-23 14:52 ` Jens Axboe
     [not found]   ` <Pine.GSO.4.21.0209231104090.3948-100000@weyl.math.psu.edu>
2002-09-23 18:53     ` Mikael Pettersson
2002-09-23 19:09       ` Alexander Viro
2002-09-23 20:04         ` Mikael Pettersson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).