linux-kernel.vger.kernel.org archive mirror
* [PATCH] Linux Raid5/6 abover 2 Terabytes
@ 2004-04-06 21:00 Evan Felix
  2004-04-06 23:02 ` Andreas Schwab
  0 siblings, 1 reply; 2+ messages in thread
From: Evan Felix @ 2004-04-06 21:00 UTC
  To: linux-raid, linux-kernel, neilb

[-- Attachment #1: Type: text/plain, Size: 2342 bytes --]

Here is a patch that fixes a major issue in the raid5/6 code.  It seems
that in the code:

logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1);
(sector_t)     = (sector_t)    & (constant)

the right side of the & does not get extended correctly when the
constant is promoted to the sector_t type.  I have CONFIG_LBD turned on,
so sector_t should be 64 bits wide.  The mask should leave a sector value
of 4294967296 (2TB/512) unchanged at 4294967296, but in my case it was
coming out as 0.  This causes the loop following this code to read from 0
to 4294967296 blocks so that it can write one character.

As you might imagine this makes a format of a 3.5TB filesystem take a
very long time.

Here is the patch:
Binary files linux-2.6.5/drivers/md/mktables and linux-2.6.5fixraid/drivers/md/mktables differ
diff -urN -X /home/efelix/.cvsignore linux-2.6.5/drivers/md/raid5.c linux-2.6.5fixraid/drivers/md/raid5.c
--- linux-2.6.5/drivers/md/raid5.c	2004-04-04 03:36:26.000000000 +0000
+++ linux-2.6.5fixraid/drivers/md/raid5.c	2004-04-06 18:26:05.000000000 +0000
@@ -1334,8 +1334,9 @@
 		disk_stat_add(mddev->gendisk, read_sectors, bio_sectors(bi));
 	}

-	logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1);
+	logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
 	last_sector = bi->bi_sector + (bi->bi_size>>9);
+	PRINTK("Bio: %Lu logical %Lu   last %Lu\n",bi->bi_sector,logical_sector,last_sector);
 
 	bi->bi_next = NULL;
 	bi->bi_phys_segments = 1;	/* over-loaded to count active stripes */
diff -urN -X /home/efelix/.cvsignore linux-2.6.5/drivers/md/raid6main.c linux-2.6.5fixraid/drivers/md/raid6main.c
--- linux-2.6.5/drivers/md/raid6main.c	2004-04-04 03:36:14.000000000 +0000
+++ linux-2.6.5fixraid/drivers/md/raid6main.c	2004-04-06 18:31:30.000000000 +0000
@@ -1496,7 +1496,7 @@
 		disk_stat_add(mddev->gendisk, read_sectors, bio_sectors(bi));
 	}
 
-	logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1);
+	logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
 	last_sector = bi->bi_sector + (bi->bi_size>>9);
 
 	bi->bi_next = NULL;


I have tested this on at least two arrays, with ext2 and some long dd runs.

Evan
-- 
-------------------------
Evan Felix
Administrator of Supercomputer #5 in Top 500, Nov 2003
Environmental Molecular Sciences Laboratory
Pacific Northwest National Laboratory
Operated for the U.S. DOE by Battelle

[-- Attachment #2: fix_block_mask.patch --]
[-- Type: text/x-patch, Size: 1383 bytes --]

Binary files linux-2.6.5/drivers/md/mktables and linux-2.6.5fixraid/drivers/md/mktables differ
diff -urN -X /home/efelix/.cvsignore linux-2.6.5/drivers/md/raid5.c linux-2.6.5fixraid/drivers/md/raid5.c
--- linux-2.6.5/drivers/md/raid5.c	2004-04-04 03:36:26.000000000 +0000
+++ linux-2.6.5fixraid/drivers/md/raid5.c	2004-04-06 18:26:05.000000000 +0000
@@ -1334,8 +1334,9 @@
 		disk_stat_add(mddev->gendisk, read_sectors, bio_sectors(bi));
 	}

-	logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1);
+	logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
 	last_sector = bi->bi_sector + (bi->bi_size>>9);
+	PRINTK("Bio: %Lu logical %Lu   last %Lu\n",bi->bi_sector,logical_sector,last_sector);
 
 	bi->bi_next = NULL;
 	bi->bi_phys_segments = 1;	/* over-loaded to count active stripes */
diff -urN -X /home/efelix/.cvsignore linux-2.6.5/drivers/md/raid6main.c linux-2.6.5fixraid/drivers/md/raid6main.c
--- linux-2.6.5/drivers/md/raid6main.c	2004-04-04 03:36:14.000000000 +0000
+++ linux-2.6.5fixraid/drivers/md/raid6main.c	2004-04-06 18:31:30.000000000 +0000
@@ -1496,7 +1496,7 @@
 		disk_stat_add(mddev->gendisk, read_sectors, bio_sectors(bi));
 	}
 
-	logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1);
+	logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
 	last_sector = bi->bi_sector + (bi->bi_size>>9);
 
 	bi->bi_next = NULL;


* Re: [PATCH] Linux Raid5/6 abover 2 Terabytes
  2004-04-06 21:00 [PATCH] Linux Raid5/6 abover 2 Terabytes Evan Felix
@ 2004-04-06 23:02 ` Andreas Schwab
  0 siblings, 0 replies; 2+ messages in thread
From: Andreas Schwab @ 2004-04-06 23:02 UTC
  To: Evan Felix; +Cc: linux-raid, linux-kernel, neilb

Evan Felix <evan.felix@pnl.gov> writes:

> Here is a patch that fixes a major issue in the raid5/6 code.  It seems
> that in the code:
>
> logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1);
> (sector_t)     = (sector_t)    & (constant)
>
> the right side of the & does not get extended correctly when the
> constant is promoted to the sector_t type.  I have CONFIG_LBD turned on,
> so sector_t should be 64 bits wide.  The mask should leave a sector value
> of 4294967296 (2TB/512) unchanged at 4294967296, but in my case it was
> coming out as 0.  This causes the loop following this code to read from 0
> to 4294967296 blocks so that it can write one character.
>
> As you might imagine this makes a format of a 3.5TB filesystem take a
> very long time.
>
> Here is the patch:

Alternatively replace ~(STRIPE_SECTORS-1) by -STRIPE_SECTORS, which
doesn't need a cast.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
