* [patch for playing] 2.5.65 patch to support > 256 disks
@ 2003-03-21 18:56 Badari Pulavarty
2003-03-22 11:00 ` Douglas Gilbert
0 siblings, 1 reply; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-21 18:56 UTC (permalink / raw)
To: linux-kernel, linux-scsi; +Cc: Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]
Hi,
Andries Brouwer recently submitted 32-bit dev_t patches,
which are in 2.5.65-mm2. This patch applies on top of those patches to support
more than 256 disks. This is for playing only.
I tested this with 4000 disks using scsi_debug. I attached my actual
disks (50) after 4000 scsi_debug disks. I am able to access my disks
fine and do IO on them.
Problems (so far):
1) sd.c - the sd_index_bits[] arrays became big - needs to be fixed.
2) 4000 disks eat up a lot of low memory (~460 MB). Here is the
/proc/meminfo output before & after insmod.
Before:
MemTotal: 3883276 kB
MemFree: 3808028 kB
Buffers: 3240 kB
Cached: 41860 kB
SwapCached: 0 kB
Active: 45360 kB
Inactive: 7288 kB
HighTotal: 3014616 kB
HighFree: 2961856 kB
LowTotal: 868660 kB
LowFree: 846172 kB
SwapTotal: 2040244 kB
SwapFree: 2040244 kB
Dirty: 192 kB
Writeback: 0 kB
Mapped: 14916 kB
Slab: 7164 kB
Committed_AS: 12952 kB
PageTables: 312 kB
ReverseMaps: 1895
====
After:
MemTotal: 3883276 kB
MemFree: 3224140 kB
Buffers: 3880 kB
Cached: 140376 kB
SwapCached: 0 kB
Active: 47512 kB
Inactive: 105508 kB
HighTotal: 3014616 kB
HighFree: 2838144 kB
LowTotal: 868660 kB
LowFree: 385996 kB
SwapTotal: 2040244 kB
SwapFree: 2040244 kB
Dirty: 92 kB
Writeback: 0 kB
Mapped: 16172 kB
Slab: 464364 kB
Committed_AS: 14996 kB
PageTables: 412 kB
ReverseMaps: 2209
[-- Attachment #2: sd.patch --]
[-- Type: text/x-diff, Size: 1353 bytes --]
--- linux/drivers/scsi/sd.c Thu Mar 20 15:06:00 2003
+++ linux.new/drivers/scsi/sd.c Fri Mar 21 11:50:54 2003
@@ -56,7 +56,9 @@
* Remaining dev_t-handling stuff
*/
#define SD_MAJORS 16
-#define SD_DISKS (SD_MAJORS << 4)
+#define SD_DISKS_PER_MAJOR_SHIFT (KDEV_MINOR_BITS - 4)
+#define SD_DISKS_PER_MAJOR (1 << SD_DISKS_PER_MAJOR_SHIFT)
+#define SD_DISKS (SD_MAJORS << SD_DISKS_PER_MAJOR_SHIFT)
/*
* Time out in seconds for disks and Magneto-opticals (which are slower).
@@ -1328,17 +1330,23 @@ static int sd_attach(struct scsi_device
sdkp->index = index;
gd->de = sdp->de;
- gd->major = sd_major(index >> 4);
- gd->first_minor = (index & 15) << 4;
+ gd->major = sd_major(index >> SD_DISKS_PER_MAJOR_SHIFT);
+ gd->first_minor = (index & (SD_DISKS_PER_MAJOR - 1)) << 4;
gd->minors = 16;
gd->fops = &sd_fops;
- if (index >= 26) {
+ if (index < 26) {
+ sprintf(gd->disk_name, "sd%c", 'a' + index % 26);
+ } else if (index < (26*27)) {
sprintf(gd->disk_name, "sd%c%c",
'a' + index/26-1,'a' + index % 26);
} else {
- sprintf(gd->disk_name, "sd%c", 'a' + index % 26);
- }
+ const unsigned int m1 = (index / 26 - 1) / 26 - 1;
+ const unsigned int m2 = (index / 26 - 1) % 26;
+ const unsigned int m3 = index % 26;
+ sprintf(gd->disk_name, "sd%c%c%c",
+ 'a' + m1, 'a' + m2, 'a' + m3);
+ }
sd_init_onedisk(sdkp, gd);
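For reference, the three-tier naming scheme the hunk above introduces (sda..sdz, sdaa..sdzz, sdaaa..sdzzz) can be sketched as a standalone helper; sd_disk_name() is a hypothetical name used for illustration, not a symbol in sd.c:

```c
#include <stdio.h>

/* Mirror of the naming logic in the patch: one letter for the first 26
 * indexes, two letters up to 26*27, three letters beyond that. */
static void sd_disk_name(unsigned int index, char *buf)
{
	if (index < 26) {
		sprintf(buf, "sd%c", 'a' + index % 26);
	} else if (index < (26 * 27)) {
		sprintf(buf, "sd%c%c",
			'a' + index / 26 - 1, 'a' + index % 26);
	} else {
		const unsigned int m1 = (index / 26 - 1) / 26 - 1;
		const unsigned int m2 = (index / 26 - 1) % 26;
		const unsigned int m3 = index % 26;
		sprintf(buf, "sd%c%c%c", 'a' + m1, 'a' + m2, 'a' + m3);
	}
}
```

With this encoding, index 702 is the first three-letter name ("sdaaa") and index 18277 ("sdzzz") is the last, so three letters comfortably cover the 4000-disk test above.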
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-21 18:56 [patch for playing] 2.5.65 patch to support > 256 disks Badari Pulavarty
@ 2003-03-22 11:00 ` Douglas Gilbert
2003-03-22 11:04 ` Andrew Morton
0 siblings, 1 reply; 22+ messages in thread
From: Douglas Gilbert @ 2003-03-22 11:00 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: linux-kernel, linux-scsi, Andrew Morton
Badari Pulavarty wrote:
> Hi,
>
> Andries Brouwer recently submitted 32 bit dev_t patches,
> which are in 2.5.65-mm2. This patch applies on those patches to support
> more than 256 disks. This is for playing only.
>
> I tested this with 4000 disks using scsi_debug. I attached my actual
> disks (50) after 4000 scsi_debug disks. I am able to access my disks
> fine and do IO on them.
>
> Problems (so far):
>
> 1) sd.c - the sd_index_bits[] arrays became big - needs to be fixed.
>
> 2) 4000 disks eat up a lot of low memory (~460 MB). Here is the
> /proc/meminfo output before & after insmod.
>
> Before:
> MemTotal: 3883276 kB
> MemFree: 3808028 kB
> Buffers: 3240 kB
> Cached: 41860 kB
> SwapCached: 0 kB
> Active: 45360 kB
> Inactive: 7288 kB
> HighTotal: 3014616 kB
> HighFree: 2961856 kB
> LowTotal: 868660 kB
> LowFree: 846172 kB
> SwapTotal: 2040244 kB
> SwapFree: 2040244 kB
> Dirty: 192 kB
> Writeback: 0 kB
> Mapped: 14916 kB
> Slab: 7164 kB
> Committed_AS: 12952 kB
> PageTables: 312 kB
> ReverseMaps: 1895
> ====
> After:
> MemTotal: 3883276 kB
> MemFree: 3224140 kB
> Buffers: 3880 kB
> Cached: 140376 kB
> SwapCached: 0 kB
> Active: 47512 kB
> Inactive: 105508 kB
> HighTotal: 3014616 kB
> HighFree: 2838144 kB
> LowTotal: 868660 kB
> LowFree: 385996 kB
> SwapTotal: 2040244 kB
> SwapFree: 2040244 kB
> Dirty: 92 kB
> Writeback: 0 kB
> Mapped: 16172 kB
> Slab: 464364 kB
> Committed_AS: 14996 kB
> PageTables: 412 kB
> ReverseMaps: 2209
Badari,
I poked around looking for data on the size issue.
Here are the byte sizes for the per device and per host
structures in scsi_debug and the scsi mid level for
i386, non-smp in lk 2.5.65:
sizeof(sdebug_dev_info)=60, sizeof(scsi_device)=376
sizeof(sdebug_host_info)=24, sizeof(Scsi_Host)=224
So for 4000 disks they should be responsible for about
2 MB.
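As a back-of-envelope check of that "about 2 MB" figure, a sketch using only the i386 sizes quoted above (the helper and its name are made up for illustration):

```c
/* Byte cost of the scsi_debug and mid-level structures for a given number
 * of simulated devices and hosts, using the i386 non-smp lk 2.5.65 sizes. */
enum {
	SDEBUG_DEV_INFO_SZ  = 60,  SCSI_DEVICE_SZ = 376,
	SDEBUG_HOST_INFO_SZ = 24,  SCSI_HOST_SZ   = 224,
};

static unsigned long sdebug_struct_bytes(unsigned int devs, unsigned int hosts)
{
	return (unsigned long)devs  * (SDEBUG_DEV_INFO_SZ + SCSI_DEVICE_SZ) +
	       (unsigned long)hosts * (SDEBUG_HOST_INFO_SZ + SCSI_HOST_SZ);
}
```

4000 devices on a handful of hosts come to ~1.7 MB, i.e. these structures explain about 2 MB of the ~460 MB loss, and something else must account for the rest.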
The scsi_cmd_cache slab info went from this before those
84 pseudo disks were added:
# cat slabinfo_pre.txt
scsi_cmd_cache 3 11 356 1 1 1 : 32 16 :
22 301 6 5 0 0 43 : 3235 31 3264 0
to this afterwards:
# cat slabinfo_post.txt
scsi_cmd_cache 44 55 356 5 5 1 : 32 16 :
66 398 12 7 0 0 43 : 5837 40 5833 0
I did notice a rather large growth of nodes
in sysfs. For 84 added scsi_debug pseudo disks the number
of sysfs nodes went from 686 to 3347.
Does anybody know what is the per node memory cost of sysfs?
Doug Gilbert
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-22 11:00 ` Douglas Gilbert
@ 2003-03-22 11:04 ` Andrew Morton
2003-03-22 11:46 ` Douglas Gilbert
0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2003-03-22 11:04 UTC (permalink / raw)
To: dougg; +Cc: pbadari, linux-kernel, linux-scsi
Douglas Gilbert <dougg@torque.net> wrote:
>
> > Slab: 464364 kB
It's all in slab.
> I did notice a rather large growth of nodes
> in sysfs. For 84 added scsi_debug pseudo disks the number
> of sysfs nodes went from 686 to 3347.
>
> Does anybody know what is the per node memory cost of sysfs?
Let's see all of /proc/slabinfo please.
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-22 11:04 ` Andrew Morton
@ 2003-03-22 11:46 ` Douglas Gilbert
2003-03-22 12:05 ` Andrew Morton
0 siblings, 1 reply; 22+ messages in thread
From: Douglas Gilbert @ 2003-03-22 11:46 UTC (permalink / raw)
To: Andrew Morton; +Cc: pbadari, linux-kernel, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 532 bytes --]
Andrew Morton wrote:
> Douglas Gilbert <dougg@torque.net> wrote:
>
>>>Slab: 464364 kB
>>
>
> It's all in slab.
>
>
>>I did notice a rather large growth of nodes
>>in sysfs. For 84 added scsi_debug pseudo disks the number
>>of sysfs nodes went from 686 to 3347.
>>
>>Does anybody know what is the per node memory cost of sysfs?
>
>
> Let's see all of /proc/slabinfo please.
Andrew,
Attachments are /proc/slabinfo pre and post:
$ modprobe scsi_debug add_host=42 num_devs=2
which adds 84 pseudo disks.
Doug Gilbert
[-- Attachment #2: slabinfo_pre.txt --]
[-- Type: text/plain, Size: 14921 bytes --]
slabinfo - version: 1.2 (statistics)
hpsb_packet 0 0 132 0 0 1 : 32 16 : 16 16 1 1 0 0 61 : 46 1 47 0
rpc_buffers 8 8 2048 4 4 1 : 32 16 : 8 8 4 0 0 0 34 : 4 4 0 0
rpc_tasks 8 19 204 1 1 1 : 32 16 : 16 16 1 0 0 0 51 : 7 1 0 0
rpc_inode_cache 0 0 480 0 0 1 : 32 16 : 0 0 0 0 0 0 40 : 0 0 0 0
unix_sock 10 21 576 3 3 1 : 32 16 : 21 32 3 0 0 0 39 : 593 4 587 0
tcp_tw_bucket 0 0 100 0 0 1 : 32 16 : 0 0 0 0 0 0 71 : 0 0 0 0
tcp_bind_bucket 7 125 28 1 1 1 : 32 16 : 16 16 1 0 0 0 157 : 6 1 0 0
tcp_open_request 0 0 68 0 0 1 : 32 16 : 0 0 0 0 0 0 88 : 0 0 0 0
inet_peer_cache 0 0 52 0 0 1 : 32 16 : 0 0 0 0 0 0 104 : 0 0 0 0
secpath_cache 0 0 32 0 0 1 : 32 16 : 0 0 0 0 0 0 144 : 0 0 0 0
flow_cache 0 0 60 0 0 1 : 32 16 : 0 0 0 0 0 0 94 : 0 0 0 0
xfrm4_dst_cache 0 0 220 0 0 1 : 32 16 : 0 0 0 0 0 0 50 : 0 0 0 0
ip_fib_hash 10 125 28 1 1 1 : 32 16 : 16 16 1 0 0 0 157 : 10 1 1 0
ip_dst_cache 4 19 208 1 1 1 : 32 16 : 19 48 1 0 0 0 51 : 3 3 2 0
arp_cache 2 21 188 1 1 1 : 32 16 : 17 32 1 0 0 0 53 : 0 2 0 0
raw4_sock 0 0 576 0 0 1 : 32 16 : 7 14 2 2 0 0 39 : 4 2 6 0
udp_sock 7 7 576 1 1 1 : 32 16 : 14 17 2 1 0 0 39 : 37 3 33 0
tcp_sock 16 21 1088 3 3 2 : 32 16 : 21 30 3 0 0 0 39 : 49 6 39 0
scsi_cmd_cache 4 11 356 1 1 1 : 32 16 : 22 141 4 3 0 0 43 : 3155 14 3167 0
nfs_write_data 36 40 452 5 5 1 : 32 16 : 40 40 5 0 0 0 40 : 31 5 0 0
nfs_read_data 32 36 436 4 4 1 : 32 16 : 36 36 4 0 0 0 41 : 28 4 0 0
nfs_inode_cache 0 0 704 0 0 2 : 32 16 : 0 0 0 0 0 0 43 : 0 0 0 0
nfs_page 0 0 108 0 0 1 : 32 16 : 0 0 0 0 0 0 68 : 0 0 0 0
isofs_inode_cache 0 0 440 0 0 1 : 32 16 : 0 0 0 0 0 0 41 : 0 0 0 0
ext2_inode_cache 4 7 576 1 1 1 : 32 16 : 7 7 1 0 0 0 39 : 3 1 0 0
journal_head 26 62 60 1 1 1 : 32 16 : 420 1092 9 2 0 1 94 : 3741 71 3746 55
revoke_table 1 144 24 1 1 1 : 32 16 : 16 16 1 0 0 0 176 : 0 1 0 0
revoke_record 0 0 28 0 0 1 : 32 16 : 16 32 2 2 0 0 157 : 3 2 5 0
ext3_inode_cache 2268 2268 576 324 324 1 : 32 16 : 2268 2279 324 0 0 0 39 : 1967 329 28 0
ext3_xattr 0 0 60 0 0 1 : 32 16 : 0 0 0 0 0 0 94 : 0 0 0 0
eventpoll_pwq 0 0 48 0 0 1 : 32 16 : 0 0 0 0 0 0 109 : 0 0 0 0
eventpoll_epi 0 0 72 0 0 1 : 32 16 : 0 0 0 0 0 0 85 : 0 0 0 0
kioctx 0 0 268 0 0 1 : 32 16 : 0 0 0 0 0 0 46 : 0 0 0 0
kiocb 0 0 172 0 0 1 : 32 16 : 0 0 0 0 0 0 55 : 0 0 0 0
dnotify_cache 0 0 32 0 0 1 : 32 16 : 0 0 0 0 0 0 144 : 0 0 0 0
file_lock_cache 17 30 128 1 1 1 : 32 16 : 30 160 1 0 0 0 62 : 1052 12 1047 0
fasync_cache 0 0 28 0 0 1 : 32 16 : 0 0 0 0 0 0 157 : 0 0 0 0
shmem_inode_cache 2 7 576 1 1 1 : 32 16 : 7 13 1 0 0 0 39 : 0 2 0 0
idr_layer_cache 0 0 148 0 0 1 : 32 16 : 0 0 0 0 0 0 58 : 0 0 0 0
posix_timers_cache 0 0 136 0 0 1 : 32 16 : 0 0 0 0 0 0 60 : 0 0 0 0
uid_cache 5 101 36 1 1 1 : 32 16 : 18 32 1 0 0 0 133 : 3 2 0 0
sgpool-MAX_PHYS_SEGMENTS 32 32 2048 16 16 1 : 32 16 : 34 38 18 2 0 0 34 : 0 35 3 0
sgpool-64 32 32 1024 8 8 1 : 32 16 : 36 48 12 4 0 0 36 : 6 36 10 0
sgpool-32 32 32 512 4 4 1 : 32 16 : 40 72 9 5 0 0 40 : 33 37 38 0
sgpool-16 32 42 268 3 3 1 : 32 16 : 42 72 3 0 0 0 46 : 225 36 229 0
sgpool-8 48 54 140 2 2 1 : 32 16 : 54 195 2 0 0 0 59 : 2832 43 2843 0
deadline_drq 768 826 64 14 14 1 : 32 16 : 1035 1302 21 0 0 0 91 : 1191 89 482 30
blkdev_requests 768 775 156 31 31 1 : 32 16 : 1025 1264 47 3 0 1 57 : 1180 100 484 28
biovec-BIO_MAX_PAGES 256 260 3072 52 52 4 : 32 16 : 256 256 52 0 0 0 37 : 0 256 0 0
biovec-128 256 260 1536 52 52 2 : 32 16 : 256 256 52 0 0 0 37 : 0 256 0 0
biovec-64 256 260 768 52 52 1 : 32 16 : 260 268 52 0 0 0 37 : 8 259 11 0
biovec-16 256 266 204 14 14 1 : 32 16 : 266 286 14 0 0 0 51 : 291 259 294 0
biovec-4 256 310 60 5 5 1 : 32 16 : 272 288 5 0 0 0 94 : 481 258 483 0
biovec-1 267 288 24 2 2 1 : 32 16 : 512 1229 6 3 0 0 176 : 5079 317 5093 47
bio 266 318 72 6 6 1 : 32 16 : 509 1337 13 2 0 1 85 : 5855 329 5874 54
sock_inode_cache 36 40 484 5 5 1 : 32 16 : 40 55 5 0 0 0 40 : 545 10 519 0
skbuff_head_cache 16 24 160 1 1 1 : 32 16 : 24 42 1 0 0 0 56 : 13 3 0 0
sock 2 9 444 1 1 1 : 32 16 : 9 16 1 0 0 0 41 : 14 3 15 0
proc_inode_cache 82 99 440 11 11 1 : 32 16 : 99 147 11 0 0 0 41 : 572 15 519 1
sigqueue 9 27 144 1 1 1 : 32 16 : 16 16 1 0 0 0 59 : 744 1 745 0
radix_tree_node 805 812 272 58 58 1 : 32 16 : 812 840 58 0 0 0 46 : 743 69 11 0
cdev_cache 143 250 28 2 2 1 : 32 16 : 158 173 2 0 0 0 157 : 131 12 0 0
bdev_cache 4 32 120 1 1 1 : 32 16 : 16 16 1 0 0 0 64 : 6 1 3 0
mnt_cache 19 53 72 1 1 1 : 32 16 : 35 54 1 0 0 0 85 : 24 9 14 0
inode_cache 503 513 424 57 57 1 : 32 16 : 513 546 57 0 0 0 41 : 703 305 505 0
dentry_cache 3854 3857 204 203 203 1 : 32 16 : 3968 4474 216 2 0 1 51 : 5098 699 1909 34
filp 286 300 156 12 12 1 : 32 16 : 291 307 12 0 0 0 57 : 261 25 0 0
names_cache 1 1 4096 1 1 1 : 32 16 : 8 44 22 21 0 0 33 : 14723 44 14767 0
buffer_head 1506 1534 64 26 26 1 : 32 16 : 1523 2071 26 0 0 1 91 : 3797 139 2404 28
mm_struct 35 35 576 5 5 1 : 32 16 : 35 73 5 0 0 0 39 : 1324 9 1307 0
vm_area_struct 500 500 76 10 10 1 : 32 16 : 527 8798 13 1 0 1 82 : 26991 558 26555 513
fs_cache 40 84 44 1 1 1 : 32 16 : 40 96 2 1 0 0 116 : 736 6 717 0
files_cache 27 27 424 3 3 1 : 32 16 : 27 53 4 1 0 0 41 : 734 8 717 0
signal_cache 65 72 52 1 1 1 : 32 16 : 65 134 1 0 0 0 104 : 743 14 707 0
sighand_cache 45 45 1344 15 15 1 : 32 16 : 48 60 16 1 0 0 35 : 724 23 703 0
task_struct 55 55 1632 11 11 2 : 32 16 : 55 68 11 0 0 0 37 : 736 21 704 0
pte_chain 1650 1650 76 33 33 1 : 32 16 : 1650 6660 42 1 0 1 82 : 9854 453 8357 310
pgd 27 27 4096 27 27 1 : 32 16 : 33 46 43 16 0 0 33 : 1288 45 1307 0
size-131072(DMA) 0 0 131072 0 0 32 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-131072 0 0 131072 0 0 32 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-65536(DMA) 0 0 65536 0 0 16 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-65536 0 0 65536 0 0 16 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-32768(DMA) 0 0 32768 0 0 8 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-32768 0 0 32768 0 0 8 : 8 4 : 1 1 1 1 0 0 9 : 0 1 1 0
size-16384(DMA) 0 0 16384 0 0 4 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-16384 1 1 16384 1 1 4 : 8 4 : 2 4 4 3 0 0 9 : 2 4 5 0
size-8192(DMA) 0 0 8192 0 0 2 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-8192 6 6 8192 6 6 2 : 8 4 : 7 10 9 3 0 0 9 : 24 10 28 0
size-4096(DMA) 0 0 4096 0 0 1 : 32 16 : 0 0 0 0 0 0 33 : 0 0 0 0
size-4096 27 27 4096 27 27 1 : 32 16 : 34 50 49 22 0 0 33 : 343 50 366 0
size-2048(DMA) 0 0 2048 0 0 1 : 32 16 : 0 0 0 0 0 0 34 : 0 0 0 0
size-2048 10 10 2048 5 5 1 : 32 16 : 10 12 5 0 0 0 34 : 4 7 1 0
size-1024(DMA) 0 0 1024 0 0 1 : 32 16 : 0 0 0 0 0 0 36 : 0 0 0 0
size-1024 40 40 1024 10 10 1 : 32 16 : 40 61 11 1 0 0 36 : 531 37 530 0
size-512(DMA) 0 0 512 0 0 1 : 32 16 : 8 8 1 1 0 0 40 : 0 1 1 0
size-512 62 64 512 8 8 1 : 32 16 : 72 94 9 1 0 0 40 : 426 21 385 0
size-256(DMA) 0 0 268 0 0 1 : 32 16 : 14 14 1 1 0 0 46 : 31 1 32 0
size-256 9 14 268 1 1 1 : 32 16 : 28 60 2 1 0 0 46 : 594 7 592 0
size-192(DMA) 0 0 204 0 0 1 : 32 16 : 0 0 0 0 0 0 51 : 0 0 0 0
size-192 133 133 204 7 7 1 : 32 16 : 133 188 7 0 0 0 51 : 481 65 422 0
size-128(DMA) 0 0 140 0 0 1 : 32 16 : 0 0 0 0 0 0 59 : 0 0 0 0
size-128 554 621 140 23 23 1 : 32 16 : 617 811 23 0 0 0 59 : 1781 287 1514 6
size-64(DMA) 0 0 76 0 0 1 : 32 16 : 0 0 0 0 0 0 82 : 0 0 0 0
size-64 338 400 76 8 8 1 : 32 16 : 366 589 8 0 0 0 82 : 285 213 144 27
size-32(DMA) 0 0 44 0 0 1 : 32 16 : 0 0 0 0 0 0 116 : 0 0 0 0
size-32 232 252 44 3 3 1 : 32 16 : 232 399 3 0 0 0 116 : 14239 174 14042 142
kmem_cache 110 110 180 5 5 1 : 32 16 : 110 110 5 0 0 0 54 : 34 72 0 0
[-- Attachment #3: slabinfo_post.txt --]
[-- Type: text/plain, Size: 14921 bytes --]
slabinfo - version: 1.2 (statistics)
hpsb_packet 0 0 132 0 0 1 : 32 16 : 16 16 1 1 0 0 61 : 46 1 47 0
rpc_buffers 8 8 2048 4 4 1 : 32 16 : 8 8 4 0 0 0 34 : 4 4 0 0
rpc_tasks 8 19 204 1 1 1 : 32 16 : 16 16 1 0 0 0 51 : 7 1 0 0
rpc_inode_cache 0 0 480 0 0 1 : 32 16 : 0 0 0 0 0 0 40 : 0 0 0 0
unix_sock 10 21 576 3 3 1 : 32 16 : 21 32 3 0 0 0 39 : 593 4 587 0
tcp_tw_bucket 0 0 100 0 0 1 : 32 16 : 0 0 0 0 0 0 71 : 0 0 0 0
tcp_bind_bucket 7 125 28 1 1 1 : 32 16 : 16 16 1 0 0 0 157 : 6 1 0 0
tcp_open_request 0 0 68 0 0 1 : 32 16 : 0 0 0 0 0 0 88 : 0 0 0 0
inet_peer_cache 0 0 52 0 0 1 : 32 16 : 0 0 0 0 0 0 104 : 0 0 0 0
secpath_cache 0 0 32 0 0 1 : 32 16 : 0 0 0 0 0 0 144 : 0 0 0 0
flow_cache 0 0 60 0 0 1 : 32 16 : 0 0 0 0 0 0 94 : 0 0 0 0
xfrm4_dst_cache 0 0 220 0 0 1 : 32 16 : 0 0 0 0 0 0 50 : 0 0 0 0
ip_fib_hash 10 125 28 1 1 1 : 32 16 : 16 16 1 0 0 0 157 : 10 1 1 0
ip_dst_cache 5 19 208 1 1 1 : 32 16 : 19 63 1 0 0 0 51 : 3 4 2 0
arp_cache 2 21 188 1 1 1 : 32 16 : 17 32 1 0 0 0 53 : 0 2 0 0
raw4_sock 0 0 576 0 0 1 : 32 16 : 7 14 2 2 0 0 39 : 4 2 6 0
udp_sock 7 7 576 1 1 1 : 32 16 : 14 17 2 1 0 0 39 : 37 3 33 0
tcp_sock 16 21 1088 3 3 2 : 32 16 : 21 30 3 0 0 0 39 : 49 6 39 0
scsi_cmd_cache 66 66 356 6 6 1 : 32 16 : 66 220 9 3 0 0 43 : 5735 22 5713 0
nfs_write_data 36 40 452 5 5 1 : 32 16 : 40 40 5 0 0 0 40 : 31 5 0 0
nfs_read_data 32 36 436 4 4 1 : 32 16 : 36 36 4 0 0 0 41 : 28 4 0 0
nfs_inode_cache 0 0 704 0 0 2 : 32 16 : 0 0 0 0 0 0 43 : 0 0 0 0
nfs_page 0 0 108 0 0 1 : 32 16 : 0 0 0 0 0 0 68 : 0 0 0 0
isofs_inode_cache 0 0 440 0 0 1 : 32 16 : 0 0 0 0 0 0 41 : 0 0 0 0
ext2_inode_cache 4 7 576 1 1 1 : 32 16 : 7 7 1 0 0 0 39 : 3 1 0 0
journal_head 21 62 60 1 1 1 : 32 16 : 420 1124 9 2 0 1 94 : 6048 73 6046 55
revoke_table 1 144 24 1 1 1 : 32 16 : 16 16 1 0 0 0 176 : 0 1 0 0
revoke_record 0 0 28 0 0 1 : 32 16 : 16 32 2 2 0 0 157 : 3 2 5 0
ext3_inode_cache 2275 2275 576 325 325 1 : 32 16 : 2275 2290 325 0 0 0 39 : 1969 331 28 0
ext3_xattr 0 0 60 0 0 1 : 32 16 : 0 0 0 0 0 0 94 : 0 0 0 0
eventpoll_pwq 0 0 48 0 0 1 : 32 16 : 0 0 0 0 0 0 109 : 0 0 0 0
eventpoll_epi 0 0 72 0 0 1 : 32 16 : 0 0 0 0 0 0 85 : 0 0 0 0
kioctx 0 0 268 0 0 1 : 32 16 : 0 0 0 0 0 0 46 : 0 0 0 0
kiocb 0 0 172 0 0 1 : 32 16 : 0 0 0 0 0 0 55 : 0 0 0 0
dnotify_cache 0 0 32 0 0 1 : 32 16 : 0 0 0 0 0 0 144 : 0 0 0 0
file_lock_cache 17 30 128 1 1 1 : 32 16 : 30 173 1 0 0 0 62 : 1093 13 1089 0
fasync_cache 0 0 28 0 0 1 : 32 16 : 0 0 0 0 0 0 157 : 0 0 0 0
shmem_inode_cache 2 7 576 1 1 1 : 32 16 : 7 13 1 0 0 0 39 : 0 2 0 0
idr_layer_cache 0 0 148 0 0 1 : 32 16 : 0 0 0 0 0 0 58 : 0 0 0 0
posix_timers_cache 0 0 136 0 0 1 : 32 16 : 0 0 0 0 0 0 60 : 0 0 0 0
uid_cache 5 101 36 1 1 1 : 32 16 : 18 32 1 0 0 0 133 : 3 2 0 0
sgpool-MAX_PHYS_SEGMENTS 32 32 2048 16 16 1 : 32 16 : 34 38 18 2 0 0 34 : 0 35 3 0
sgpool-64 32 32 1024 8 8 1 : 32 16 : 36 48 12 4 0 0 36 : 6 36 10 0
sgpool-32 32 32 512 4 4 1 : 32 16 : 40 72 9 5 0 0 40 : 33 37 38 0
sgpool-16 32 42 268 3 3 1 : 32 16 : 42 82 3 0 0 0 46 : 228 37 233 0
sgpool-8 50 54 140 2 2 1 : 32 16 : 54 227 2 0 0 0 59 : 4322 45 4335 0
deadline_drq 22274 22361 64 379 379 1 : 32 16 : 22538 32880 507 0 0 1 91 : 31317 2219 10606 658
blkdev_requests 22273 22300 156 892 892 1 : 32 16 : 22541 32829 1204 5 0 2 57 : 30944 2592 10609 655
biovec-BIO_MAX_PAGES 256 260 3072 52 52 4 : 32 16 : 256 256 52 0 0 0 37 : 0 256 0 0
biovec-128 256 260 1536 52 52 2 : 32 16 : 256 256 52 0 0 0 37 : 0 256 0 0
biovec-64 256 260 768 52 52 1 : 32 16 : 260 268 52 0 0 0 37 : 8 259 11 0
biovec-16 256 266 204 14 14 1 : 32 16 : 266 296 14 0 0 0 51 : 293 260 297 0
biovec-4 256 310 60 5 5 1 : 32 16 : 272 304 5 0 0 0 94 : 481 259 484 0
biovec-1 285 288 24 2 2 1 : 32 16 : 512 1277 6 3 0 0 176 : 7486 320 7503 47
bio 274 318 72 6 6 1 : 32 16 : 509 1391 14 3 0 1 85 : 8265 333 8287 55
sock_inode_cache 36 40 484 5 5 1 : 32 16 : 40 55 5 0 0 0 40 : 545 10 519 0
skbuff_head_cache 16 24 160 1 1 1 : 32 16 : 24 42 1 0 0 0 56 : 13 3 0 0
sock 2 9 444 1 1 1 : 32 16 : 9 16 1 0 0 0 41 : 14 3 15 0
proc_inode_cache 82 99 440 11 11 1 : 32 16 : 99 179 11 0 0 0 41 : 572 17 521 1
sigqueue 9 27 144 1 1 1 : 32 16 : 16 16 1 0 0 0 59 : 767 1 768 0
radix_tree_node 840 840 272 60 60 1 : 32 16 : 840 884 60 0 0 0 46 : 774 73 11 0
cdev_cache 143 250 28 2 2 1 : 32 16 : 158 173 2 0 0 0 157 : 131 12 0 0
bdev_cache 4 32 120 1 1 1 : 32 16 : 20 32 1 0 0 0 64 : 89 2 87 0
mnt_cache 19 53 72 1 1 1 : 32 16 : 35 54 1 0 0 0 85 : 24 9 14 0
inode_cache 3375 3375 424 375 375 1 : 32 16 : 3375 3418 375 0 0 0 41 : 3340 624 589 0
dentry_cache 6744 6745 204 355 355 1 : 32 16 : 6744 7379 368 2 0 1 51 : 7671 1004 1911 34
filp 286 300 156 12 12 1 : 32 16 : 291 307 12 0 0 0 57 : 261 25 0 0
names_cache 1 1 4096 1 1 1 : 32 16 : 8 47 23 22 0 0 33 : 15424 47 15471 0
buffer_head 2458 2478 64 42 42 1 : 32 16 : 2458 3060 42 0 0 1 91 : 5662 206 3396 28
mm_struct 35 35 576 5 5 1 : 32 16 : 35 93 5 0 0 0 39 : 1494 11 1479 0
vm_area_struct 500 500 76 10 10 1 : 32 16 : 527 8855 13 1 0 1 82 : 29247 562 28815 513
fs_cache 40 84 44 1 1 1 : 32 16 : 40 128 2 1 0 0 116 : 946 8 929 0
files_cache 27 27 424 3 3 1 : 32 16 : 27 59 4 1 0 0 41 : 944 10 929 0
signal_cache 107 144 52 2 2 1 : 32 16 : 107 205 2 0 0 0 104 : 950 19 877 0
sighand_cache 87 87 1344 29 29 1 : 32 16 : 87 106 30 1 0 0 35 : 920 39 873 0
task_struct 95 95 1632 19 19 2 : 32 16 : 100 117 20 1 0 0 37 : 937 32 874 0
pte_chain 1650 1650 76 33 33 1 : 32 16 : 1650 6684 42 1 0 1 82 : 10041 455 8545 310
pgd 27 27 4096 27 27 1 : 32 16 : 33 51 48 21 0 0 33 : 1455 50 1479 0
size-131072(DMA) 0 0 131072 0 0 32 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-131072 0 0 131072 0 0 32 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-65536(DMA) 0 0 65536 0 0 16 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-65536 0 0 65536 0 0 16 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-32768(DMA) 0 0 32768 0 0 8 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-32768 0 0 32768 0 0 8 : 8 4 : 1 1 1 1 0 0 9 : 0 1 1 0
size-16384(DMA) 0 0 16384 0 0 4 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-16384 1 1 16384 1 1 4 : 8 4 : 2 4 4 3 0 0 9 : 2 4 5 0
size-8192(DMA) 0 0 8192 0 0 2 : 8 4 : 0 0 0 0 0 0 9 : 0 0 0 0
size-8192 91 91 8192 91 91 2 : 8 4 : 92 96 95 4 0 0 9 : 149 96 154 0
size-4096(DMA) 0 0 4096 0 0 1 : 32 16 : 0 0 0 0 0 0 33 : 0 0 0 0
size-4096 27 27 4096 27 27 1 : 32 16 : 34 57 55 28 0 0 33 : 489 57 519 0
size-2048(DMA) 0 0 2048 0 0 1 : 32 16 : 0 0 0 0 0 0 34 : 0 0 0 0
size-2048 94 94 2048 47 47 1 : 32 16 : 94 96 47 0 0 0 34 : 88 49 43 0
size-1024(DMA) 0 0 1024 0 0 1 : 32 16 : 0 0 0 0 0 0 36 : 0 0 0 0
size-1024 40 40 1024 10 10 1 : 32 16 : 40 67 11 1 0 0 36 : 702 39 703 0
size-512(DMA) 0 0 512 0 0 1 : 32 16 : 8 16 2 2 0 0 40 : 83 2 85 0
size-512 234 248 512 31 31 1 : 32 16 : 248 282 32 1 0 0 40 : 1721 46 1537 0
size-256(DMA) 0 0 268 0 0 1 : 32 16 : 14 28 2 2 0 0 46 : 534 2 536 0
size-256 94 98 268 7 7 1 : 32 16 : 98 149 8 1 0 0 46 : 736 14 657 0
size-192(DMA) 0 0 204 0 0 1 : 32 16 : 0 0 0 0 0 0 51 : 0 0 0 0
size-192 209 209 204 11 11 1 : 32 16 : 209 274 11 0 0 0 51 : 725 75 592 0
size-128(DMA) 0 0 140 0 0 1 : 32 16 : 0 0 0 0 0 0 59 : 0 0 0 0
size-128 681 702 140 26 26 1 : 32 16 : 691 955 26 0 0 0 59 : 2740 297 2351 6
size-64(DMA) 0 0 76 0 0 1 : 32 16 : 0 0 0 0 0 0 82 : 0 0 0 0
size-64 750 750 76 15 15 1 : 32 16 : 750 1041 15 0 0 0 82 : 1141 247 623 27
size-32(DMA) 0 0 44 0 0 1 : 32 16 : 0 0 0 0 0 0 116 : 0 0 0 0
size-32 452 504 44 6 6 1 : 32 16 : 452 619 6 0 0 0 116 : 18166 190 17772 142
kmem_cache 110 110 180 5 5 1 : 32 16 : 110 110 5 0 0 0 54 : 34 72 0 0
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-22 11:46 ` Douglas Gilbert
@ 2003-03-22 12:05 ` Andrew Morton
2003-03-24 21:32 ` Badari Pulavarty
2003-03-25 10:56 ` Jens Axboe
0 siblings, 2 replies; 22+ messages in thread
From: Andrew Morton @ 2003-03-22 12:05 UTC (permalink / raw)
To: dougg; +Cc: pbadari, linux-kernel, linux-scsi, Jens Axboe
Douglas Gilbert <dougg@torque.net> wrote:
>
> Andrew Morton wrote:
> > Douglas Gilbert <dougg@torque.net> wrote:
> >
> >>>Slab: 464364 kB
> >>
> >
> > It's all in slab.
> >
> >
> >>I did notice a rather large growth of nodes
> >>in sysfs. For 84 added scsi_debug pseudo disks the number
> >>of sysfs nodes went from 686 to 3347.
> >>
> >>Does anybody know what is the per node memory cost of sysfs?
> >
> >
> > Let's see all of /proc/slabinfo please.
>
> Andrew,
> Attachments are /proc/slabinfo pre and post:
> $ modprobe scsi_debug add_host=42 num_devs=2
> which adds 84 pseudo disks.
>
OK, thanks.  So with 84 disks you've lost five megabytes to blkdev_requests
and deadline_drq objects. With 4000 disks, you're toast. That's enough
request structures to put 200 gigabytes of memory under I/O ;)
We need to make the request structures dynamically allocated for other
reasons (which I cannot immediately remember) but it didn't happen. I guess
we have some motivation now.
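The arithmetic behind those two statements can be sketched as follows. The ~256 requests per queue is inferred from the slab deltas above (roughly 21,500 new deadline_drq objects across 84 disks), not read out of the block layer code, so treat it as an assumption:

```c
/* Estimated slab cost of the preallocated per-queue request pools:
 * each request pins one blkdev_requests object (156 bytes) plus one
 * deadline_drq object (64 bytes), at ~256 requests per disk queue. */
enum { REQS_PER_QUEUE = 256, BLKDEV_REQUEST_SZ = 156, DEADLINE_DRQ_SZ = 64 };

static unsigned long request_pool_bytes(unsigned int disks)
{
	return (unsigned long)disks * REQS_PER_QUEUE *
	       (BLKDEV_REQUEST_SZ + DEADLINE_DRQ_SZ);
}
```

84 disks come to ~4.7 MB (the "five megabytes"), and 4000 disks to ~225 MB, before counting sysfs.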
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-22 12:05 ` Andrew Morton
@ 2003-03-24 21:32 ` Badari Pulavarty
2003-03-24 22:22 ` Douglas Gilbert
2003-03-25 0:10 ` Andrew Morton
2003-03-25 10:56 ` Jens Axboe
1 sibling, 2 replies; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-24 21:32 UTC (permalink / raw)
To: Andrew Morton, dougg; +Cc: linux-kernel, linux-scsi, Jens Axboe
On Saturday 22 March 2003 04:05 am, Andrew Morton wrote:
> OK, thanks.  So with 84 disks you've lost five megabytes to blkdev_requests
> and deadline_drq objects. With 4000 disks, you're toast. That's enough
> request structures to put 200 gigabytes of memory under I/O ;)
>
> We need to make the request structures dynamically allocated for other
> reasons (which I cannot immediately remember) but it didn't happen. I
> guess we have some motivation now.
Here is the list of slab caches which consumed more than 1 MB
in the process of inserting 4000 disks.
#insmod scsi_debug.ko add_host=4 num_devs=1000
deadline_drq before:1280 after:1025420 diff:1024140 size:64 incr:65544960
blkdev_requests before:1280 after:1025400 diff:1024120 size:156 incr:159762720
* deadline_drq, blkdev_requests consumed almost 225 MB (65.5 MB + 159.8 MB). We need to fix this.
inode_cache before:700 after:140770 diff:140070 size:364 incr:50985480
dentry_cache before:4977 after:145061 diff:140084 size:172 incr:24094448
* inode cache increased by 50 MB, dentry cache 24 MB.
It looks like we cached 140,000 inodes. I wonder why?
size-8192 before:8 after:4010 diff:4002 size:8192 incr:32784384
* 32 MB is the scsi_debug ram disk storage for 4 hosts (4 * 8 MB)
size-2048 before:112 after:4102 diff:3990 size:2060 incr:8219400
size-512 before:87 after:8085 diff:7998 size:524 incr:4190952
size-192 before:907 after:16910 diff:16003 size:204 incr:3264612
size-64 before:459 after:76500 diff:76041 size:76 incr:5779116
size-32 before:523 after:24528 diff:24005 size:44 incr:1056220
* 30MB for all other structures. I need to look closely at what these are.
Thanks,
Badari
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-24 21:32 ` Badari Pulavarty
@ 2003-03-24 22:22 ` Douglas Gilbert
2003-03-24 22:54 ` Badari Pulavarty
2003-03-25 0:10 ` Andrew Morton
1 sibling, 1 reply; 22+ messages in thread
From: Douglas Gilbert @ 2003-03-24 22:22 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: Andrew Morton, linux-kernel, linux-scsi, Jens Axboe
Badari Pulavarty wrote:
> On Saturday 22 March 2003 04:05 am, Andrew Morton wrote:
>
>>OK, thanks.  So with 84 disks you've lost five megabytes to blkdev_requests
>>and deadline_drq objects. With 4000 disks, you're toast. That's enough
>>request structures to put 200 gigabytes of memory under I/O ;)
>>
>>We need to make the request structures dynamically allocated for other
>>reasons (which I cannot immediately remember) but it didn't happen. I
>>guess we have some motivation now.
>
>
> Here is the list of slab caches which consumed more than 1 MB
> in the process of inserting 4000 disks.
>
> #insmod scsi_debug.ko add_host=4 num_devs=1000
>
> deadline_drq before:1280 after:1025420 diff:1024140 size:64 incr:65544960
> blkdev_requests before:1280 after:1025400 diff:1024120 size:156 incr:159762720
>
> * deadline_drq, blkdev_requests consumed almost 225 MB. We need to fix this.
>
> inode_cache before:700 after:140770 diff:140070 size:364 incr:50985480
> dentry_cache before:4977 after:145061 diff:140084 size:172 incr:24094448
>
> * inode cache increased by 50 MB, dentry cache 24 MB.
> It looks like we cached 140,000 inodes. I wonder why ?
> <snip/>
Badari,
What number do you get from
# cd /sys; du -a | wc -l
when you have 4000 disks?
With two disks the count on my system is 528.
Also scsi_debug should use only 8 MB (default) for
simulated storage shared between all pseudo disks
(i.e. not 8 MB per simulated host).
Doug Gilbert
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-24 22:22 ` Douglas Gilbert
@ 2003-03-24 22:54 ` Badari Pulavarty
0 siblings, 0 replies; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-24 22:54 UTC (permalink / raw)
To: dougg; +Cc: Andrew Morton, linux-kernel, linux-scsi, Jens Axboe
On Monday 24 March 2003 02:22 pm, Douglas Gilbert wrote:
> Badari Pulavarty wrote:
> > On Saturday 22 March 2003 04:05 am, Andrew Morton wrote:
> >>OK, thanks.  So with 84 disks you've lost five megabytes to
> >> blkdev_requests and deadline_drq objects. With 4000 disks, you're
> >> toast. That's enough request structures to put 200 gigabytes of memory
> >> under I/O ;)
> >>
> >>We need to make the request structures dynamically allocated for other
> >>reasons (which I cannot immediately remember) but it didn't happen. I
> >>guess we have some motivation now.
> >
> > Here is the list of slab caches which consumed more than 1 MB
> > in the process of inserting 4000 disks.
> >
> > #insmod scsi_debug.ko add_host=4 num_devs=1000
> >
> > deadline_drq before:1280 after:1025420 diff:1024140 size:64
> > incr:65544960 blkdev_requests before:1280 after:1025400 diff:1024120
> > size:156 incr:159762720
> >
> > * deadline_drq, blkdev_requests consumed almost 225 MB. We need to fix
> > this.
> >
> > inode_cache before:700 after:140770 diff:140070 size:364
> > incr:50985480 dentry_cache before:4977 after:145061 diff:140084
> > size:172 incr:24094448
> >
> > * inode cache increased by 50 MB, dentry cache 24 MB.
> > It looks like we cached 140,000 inodes. I wonder why ?
> >
> > <snip/>
>
> Badari,
> What number do you get from
> # cd /sys; du -a | wc -l
> when you have 4000 disks?
[root@elm3b78 sysfs]# du -a | wc -l
140735
Okay. That explains the inodes.
>
> Also scsi_debug should use only 8 MB (default) for
> simulated storage shared between all pseudo disks
> (i.e. not 8 MB per simulated host).
Hmm. When I did
insmod scsi_debug.o add_host=1 num_devs=4000
it used 8MB.
But when I did
insmod scsi_debug.o add_host=4 num_devs=1000
it used 32 MB. So I assumed it is per host.
Before:
size-8192 8 8 8192 8 8 2 : 8 4 : 9 12 10 2 0 0 37 : 3 12 7 0
After:
size-8192 4010 4010 8192 4010 4010 2 : 8 4 : 4010 4014 4012 2 0 0 37 : 4006 4014 4011 0
Thanks,
Badari
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 0:10 ` Andrew Morton
@ 2003-03-24 22:57 ` Badari Pulavarty
0 siblings, 0 replies; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-24 22:57 UTC (permalink / raw)
To: Andrew Morton; +Cc: dougg, linux-kernel, linux-scsi, axboe
On Monday 24 March 2003 04:10 pm, Andrew Morton wrote:
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > Here is the list of slab caches which consumed more than 1 MB
> > in the process of inserting 4000 disks.
>
> Thanks for doing this.
>
> > #insmod scsi_debug.ko add_host=4 num_devs=1000
> >
> > deadline_drq before:1280 after:1025420 diff:1024140 size:64
> > incr:65544960 blkdev_requests before:1280 after:1025400 diff:1024120
> > size:156 incr:159762720
> >
> > * deadline_drq, blkdev_requests consumed almost 225 MB. We need to fix
> > this.
>
> Yes, we do. But.
>
> > inode_cache before:700 after:140770 diff:140070 size:364
> > incr:50985480 dentry_cache before:4977 after:145061 diff:140084
> > size:172 incr:24094448
> >
> > * inode cache increased by 50 MB, dentry cache 24 MB.
> > It looks like we cached 140,000 inodes. I wonder why ?
>
> # find /sys/block/hda | wc -l
> 43
>
> Oh shit, we're toast.
>
> How many partitions did these "disks" have?
These are all scsi_debug disks. No partitions.
Yeah!! I know what you mean; with partitions we
are going to get 5 more inodes per partition.
[root@elm3b78 sdaaa]# find /sysfs/block/sdaa
/sysfs/block/sdaa
/sysfs/block/sdaa/iosched
/sysfs/block/sdaa/iosched/fifo_batch
/sysfs/block/sdaa/iosched/front_merges
/sysfs/block/sdaa/iosched/writes_starved
/sysfs/block/sdaa/iosched/write_expire
/sysfs/block/sdaa/iosched/read_expire
/sysfs/block/sdaa/device
/sysfs/block/sdaa/stat
/sysfs/block/sdaa/size
/sysfs/block/sdaa/range
/sysfs/block/sdaa/dev
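That listing is ~12 sysfs nodes per disk, and each node that du(1) has visited pins an inode and a dentry. A sketch of the cache cost, using the object sizes from the slab summary earlier in the thread (364-byte inodes, 172-byte dentries on this config):

```c
/* Rough low-memory cost of N sysfs nodes once they have been touched:
 * one inode_cache object plus one dentry_cache object per node. */
enum { INODE_CACHE_SZ = 364, DENTRY_CACHE_SZ = 172 };

static unsigned long sysfs_cache_bytes(unsigned long nodes)
{
	return nodes * (INODE_CACHE_SZ + DENTRY_CACHE_SZ);
}
```

The 140,070 new nodes work out to ~75 MB, matching the ~50 MB inode_cache plus ~24 MB dentry_cache growth reported above.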
Thanks,
Badari
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-24 21:32 ` Badari Pulavarty
2003-03-24 22:22 ` Douglas Gilbert
@ 2003-03-25 0:10 ` Andrew Morton
2003-03-24 22:57 ` Badari Pulavarty
1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2003-03-25 0:10 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: dougg, linux-kernel, linux-scsi, axboe
Badari Pulavarty <pbadari@us.ibm.com> wrote:
>
> Here is the list of slab caches which consumed more than 1 MB
> in the process of inserting 4000 disks.
Thanks for doing this.
> #insmod scsi_debug.ko add_host=4 num_devs=1000
>
> deadline_drq before:1280 after:1025420 diff:1024140 size:64 incr:65544960
> blkdev_requests before:1280 after:1025400 diff:1024120 size:156 incr:159762720
>
> * deadline_drq, blkdev_requests consumed almost 80 MB. We need to fix this.
Yes, we do. But.
> inode_cache before:700 after:140770 diff:140070 size:364 incr:50985480
> dentry_cache before:4977 after:145061 diff:140084 size:172 incr:24094448
>
> * inode cache increased by 50 MB, dentry cache 24 MB.
> It looks like we cached 140,000 inodes. I wonder why?
# find /sys/block/hda | wc -l
43
Oh shit, we're toast.
How many partitions did these "disks" have?
> size-2048 before:112 after:4102 diff:3990 size:2060 incr:8219400
> size-512 before:87 after:8085 diff:7998 size:524 incr:4190952
> size-192 before:907 after:16910 diff:16003 size:204 incr:3264612
> size-64 before:459 after:76500 diff:76041 size:76 incr:5779116
> size-32 before:523 after:24528 diff:24005 size:44 incr:1056220
>
> * 30MB for all other structures. I need to look closely on what these are..
Yes, that will take some work. What I would do is to change kmalloc:
+ int foo;
kmalloc(...)
{
+ if (foo)
+ dump_stack();
and then set `foo' in gdb across the registration of one disk.
Probably you can set foo by hand in scsi_register_device() or wherever it
happens.
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-22 12:05 ` Andrew Morton
2003-03-24 21:32 ` Badari Pulavarty
@ 2003-03-25 10:56 ` Jens Axboe
2003-03-25 11:23 ` Jens Axboe
2003-03-25 11:39 ` Nick Piggin
1 sibling, 2 replies; 22+ messages in thread
From: Jens Axboe @ 2003-03-25 10:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: dougg, pbadari, linux-kernel, linux-scsi
On Sat, Mar 22 2003, Andrew Morton wrote:
> Douglas Gilbert <dougg@torque.net> wrote:
> >
> > Andrew Morton wrote:
> > > Douglas Gilbert <dougg@torque.net> wrote:
> > >
> > >>>Slab: 464364 kB
> > >>
> > >
> > > It's all in slab.
> > >
> > >
> > >>I did notice a rather large growth of nodes
> > >>in sysfs. For 84 added scsi_debug pseudo disks the number
> > >>of sysfs nodes went from 686 to 3347.
> > >>
> > >>Does anybody know what is the per node memory cost of sysfs?
> > >
> > >
> > > Let's see all of /proc/slabinfo please.
> >
> > Andrew,
> > Attachments are /proc/slabinfo pre and post:
> > $ modprobe scsi_debug add_host=42 num_devs=2
> > which adds 84 pseudo disks.
> >
>
> OK, thanks. So with 84 disks you've lost five megabytes to blkdev_requests
> and deadline_drq objects. With 4000 disks, you're toast. That's enough
> request structures to put 200 gigabytes of memory under I/O ;)
>
> We need to make the request structures dynamically allocated for other
> reasons (which I cannot immediately remember) but it didn't happen. I guess
> we have some motivation now.
Here's a patch that makes the request allocation (and io scheduler
private data) dynamic, with upper and lower bounds of 4 and 256
respectively. The numbers are a bit random - the 4 will allow us to make
progress, but it might be a smidgen too low. Perhaps 8 would be good.
256 is twice as much as before, but that should be alright as long as
the io scheduler copes. BLKDEV_MAX_RQ and BLKDEV_MIN_RQ control these
two variables.
We lose the old batching functionality, for now. I can resurrect that
if needed. It's a rough fit with the mempool; it doesn't _quite_ fit our
needs here. I'll probably end up doing a specialised block pool scheme
for this.
Hasn't been tested all that much, it boots though :-)
drivers/block/deadline-iosched.c | 86 ++++++++---------
drivers/block/elevator.c | 18 +++
drivers/block/ll_rw_blk.c | 197 +++++++++++++--------------------------
include/linux/blkdev.h | 11 +-
include/linux/elevator.h | 7 +
5 files changed, 141 insertions(+), 178 deletions(-)
===== drivers/block/deadline-iosched.c 1.17 vs edited =====
--- 1.17/drivers/block/deadline-iosched.c Fri Mar 14 16:55:04 2003
+++ edited/drivers/block/deadline-iosched.c Tue Mar 25 11:05:37 2003
@@ -71,6 +71,8 @@
int fifo_batch;
int writes_starved;
int front_merges;
+
+ mempool_t *drq_pool;
};
/*
@@ -673,28 +675,11 @@
static void deadline_exit(request_queue_t *q, elevator_t *e)
{
struct deadline_data *dd = e->elevator_data;
- struct deadline_rq *drq;
- struct request *rq;
- int i;
BUG_ON(!list_empty(&dd->fifo_list[READ]));
BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
- for (i = READ; i <= WRITE; i++) {
- struct request_list *rl = &q->rq[i];
- struct list_head *entry;
-
- list_for_each(entry, &rl->free) {
- rq = list_entry_rq(entry);
-
- if ((drq = RQ_DATA(rq)) == NULL)
- continue;
-
- rq->elevator_private = NULL;
- kmem_cache_free(drq_pool, drq);
- }
- }
-
+ mempool_destroy(dd->drq_pool);
kfree(dd->hash);
kfree(dd);
}
@@ -706,9 +691,7 @@
static int deadline_init(request_queue_t *q, elevator_t *e)
{
struct deadline_data *dd;
- struct deadline_rq *drq;
- struct request *rq;
- int i, ret = 0;
+ int i;
if (!drq_pool)
return -ENOMEM;
@@ -724,6 +707,13 @@
return -ENOMEM;
}
+ dd->drq_pool = mempool_create(BLKDEV_MIN_RQ, mempool_alloc_slab, mempool_free_slab, drq_pool);
+ if (!dd->drq_pool) {
+ kfree(dd->hash);
+ kfree(dd);
+ return -ENOMEM;
+ }
+
for (i = 0; i < DL_HASH_ENTRIES; i++)
INIT_LIST_HEAD(&dd->hash[i]);
@@ -739,33 +729,41 @@
dd->front_merges = 1;
dd->fifo_batch = fifo_batch;
e->elevator_data = dd;
+ return 0;
+}
- for (i = READ; i <= WRITE; i++) {
- struct request_list *rl = &q->rq[i];
- struct list_head *entry;
-
- list_for_each(entry, &rl->free) {
- rq = list_entry_rq(entry);
-
- drq = kmem_cache_alloc(drq_pool, GFP_KERNEL);
- if (!drq) {
- ret = -ENOMEM;
- break;
- }
+static void deadline_put_request(request_queue_t *q, struct request *rq)
+{
+ struct deadline_data *dd = q->elevator.elevator_data;
+ struct deadline_rq *drq = RQ_DATA(rq);
- memset(drq, 0, sizeof(*drq));
- INIT_LIST_HEAD(&drq->fifo);
- INIT_LIST_HEAD(&drq->hash);
- RB_CLEAR(&drq->rb_node);
- drq->request = rq;
- rq->elevator_private = drq;
- }
+ if (drq) {
+ mempool_free(drq, dd->drq_pool);
+ rq->elevator_private = NULL;
}
+}
+
+static int
+deadline_set_request(request_queue_t *q, struct request *rq, int gfp_mask)
+{
+ struct deadline_data *dd = q->elevator.elevator_data;
+ struct deadline_rq *drq;
- if (ret)
- deadline_exit(q, e);
+ drq = mempool_alloc(dd->drq_pool, gfp_mask);
+ if (drq) {
+ RB_CLEAR(&drq->rb_node);
+ drq->request = rq;
- return ret;
+ INIT_LIST_HEAD(&drq->hash);
+ drq->hash_valid_count = 0;
+
+ INIT_LIST_HEAD(&drq->fifo);
+
+ rq->elevator_private = drq;
+ return 0;
+ }
+
+ return 1;
}
/*
@@ -916,6 +914,8 @@
.elevator_queue_empty_fn = deadline_queue_empty,
.elevator_former_req_fn = deadline_former_request,
.elevator_latter_req_fn = deadline_latter_request,
+ .elevator_set_req_fn = deadline_set_request,
+ .elevator_put_req_fn = deadline_put_request,
.elevator_init_fn = deadline_init,
.elevator_exit_fn = deadline_exit,
===== drivers/block/elevator.c 1.40 vs edited =====
--- 1.40/drivers/block/elevator.c Sun Feb 16 12:32:35 2003
+++ edited/drivers/block/elevator.c Tue Mar 25 11:04:01 2003
@@ -408,6 +408,24 @@
return NULL;
}
+int elv_set_request(request_queue_t *q, struct request *rq, int gfp_mask)
+{
+ elevator_t *e = &q->elevator;
+
+ if (e->elevator_set_req_fn)
+ return e->elevator_set_req_fn(q, rq, gfp_mask);
+
+ return 0;
+}
+
+void elv_put_request(request_queue_t *q, struct request *rq)
+{
+ elevator_t *e = &q->elevator;
+
+ if (e->elevator_put_req_fn)
+ e->elevator_put_req_fn(q, rq);
+}
+
int elv_register_queue(struct gendisk *disk)
{
request_queue_t *q = disk->queue;
===== drivers/block/ll_rw_blk.c 1.161 vs edited =====
--- 1.161/drivers/block/ll_rw_blk.c Mon Mar 24 02:56:03 2003
+++ edited/drivers/block/ll_rw_blk.c Tue Mar 25 11:53:37 2003
@@ -48,12 +48,6 @@
*/
static int queue_nr_requests;
-/*
- * How many free requests must be available before we wake a process which
- * is waiting for a request?
- */
-static int batch_requests;
-
unsigned long blk_max_low_pfn, blk_max_pfn;
int blk_nohighio = 0;
@@ -1118,26 +1112,6 @@
spin_unlock_irq(&blk_plug_lock);
}
-static int __blk_cleanup_queue(struct request_list *list)
-{
- struct list_head *head = &list->free;
- struct request *rq;
- int i = 0;
-
- while (!list_empty(head)) {
- rq = list_entry(head->next, struct request, queuelist);
- list_del_init(&rq->queuelist);
- kmem_cache_free(request_cachep, rq);
- i++;
- }
-
- if (i != list->count)
- printk("request list leak!\n");
-
- list->count = 0;
- return i;
-}
-
/**
* blk_cleanup_queue: - release a &request_queue_t when it is no longer needed
* @q: the request queue to be released
@@ -1154,18 +1128,14 @@
**/
void blk_cleanup_queue(request_queue_t * q)
{
- int count = (queue_nr_requests*2);
+ struct request_list *rl = &q->rq;
elevator_exit(q);
- count -= __blk_cleanup_queue(&q->rq[READ]);
- count -= __blk_cleanup_queue(&q->rq[WRITE]);
-
del_timer_sync(&q->unplug_timer);
flush_scheduled_work();
- if (count)
- printk("blk_cleanup_queue: leaked requests (%d)\n", count);
+ mempool_destroy(rl->rq_pool);
if (blk_queue_tagged(q))
blk_queue_free_tags(q);
@@ -1175,42 +1145,17 @@
static int blk_init_free_list(request_queue_t *q)
{
- struct request_list *rl;
- struct request *rq;
- int i;
+ struct request_list *rl = &q->rq;
- INIT_LIST_HEAD(&q->rq[READ].free);
- INIT_LIST_HEAD(&q->rq[WRITE].free);
- q->rq[READ].count = 0;
- q->rq[WRITE].count = 0;
+ rl->count[READ] = BLKDEV_MAX_RQ;
+ rl->count[WRITE] = BLKDEV_MAX_RQ;
- /*
- * Divide requests in half between read and write
- */
- rl = &q->rq[READ];
- for (i = 0; i < (queue_nr_requests*2); i++) {
- rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
- if (!rq)
- goto nomem;
-
- /*
- * half way through, switch to WRITE list
- */
- if (i == queue_nr_requests)
- rl = &q->rq[WRITE];
+ rl->rq_pool = mempool_create(BLKDEV_MIN_RQ, mempool_alloc_slab, mempool_free_slab, request_cachep);
- memset(rq, 0, sizeof(struct request));
- rq->rq_status = RQ_INACTIVE;
- list_add(&rq->queuelist, &rl->free);
- rl->count++;
- }
+ if (!rl->rq_pool)
+ return -ENOMEM;
- init_waitqueue_head(&q->rq[READ].wait);
- init_waitqueue_head(&q->rq[WRITE].wait);
return 0;
-nomem:
- blk_cleanup_queue(q);
- return 1;
}
static int __make_request(request_queue_t *, struct bio *);
@@ -1277,34 +1222,57 @@
return 0;
}
+static inline void blk_free_request(request_queue_t *q, struct request *rq)
+{
+ elv_put_request(q, rq);
+ mempool_free(rq, q->rq.rq_pool);
+}
+
+static inline struct request *blk_alloc_request(request_queue_t *q,int gfp_mask)
+{
+ struct request *rq = mempool_alloc(q->rq.rq_pool, gfp_mask);
+
+ if (rq) {
+ if (!elv_set_request(q, rq, gfp_mask))
+ return rq;
+
+ mempool_free(rq, q->rq.rq_pool);
+ }
+
+ return NULL;
+}
+
#define blkdev_free_rq(list) list_entry((list)->next, struct request, queuelist)
/*
* Get a free request. queue lock must be held and interrupts
* disabled on the way in.
*/
-static struct request *get_request(request_queue_t *q, int rw)
+static struct request *get_request(request_queue_t *q, int rw, int gfp_mask)
{
struct request *rq = NULL;
- struct request_list *rl = q->rq + rw;
+ struct request_list *rl = &q->rq;
- if (!list_empty(&rl->free)) {
- rq = blkdev_free_rq(&rl->free);
- list_del_init(&rq->queuelist);
- rq->ref_count = 1;
- rl->count--;
- if (rl->count < queue_congestion_on_threshold())
- set_queue_congested(q, rw);
+ if (!rl->count[rw])
+ return NULL;
+
+ rq = blk_alloc_request(q, gfp_mask);
+ if (rq) {
+ INIT_LIST_HEAD(&rq->queuelist);
rq->flags = 0;
- rq->rq_status = RQ_ACTIVE;
rq->errors = 0;
- rq->special = NULL;
- rq->buffer = NULL;
- rq->data = NULL;
- rq->sense = NULL;
- rq->waiting = NULL;
+ rq->rq_status = RQ_ACTIVE;
+ rl->count[rw]--;
+ if (rl->count[rw] < queue_congestion_on_threshold())
+ set_queue_congested(q, rw);
rq->bio = rq->biotail = NULL;
+ rq->buffer = NULL;
+ rq->ref_count = 1;
rq->q = q;
rq->rl = rl;
+ rq->special = NULL;
+ rq->data = NULL;
+ rq->waiting = NULL;
+ rq->sense = NULL;
}
return rq;
@@ -1316,7 +1284,7 @@
static struct request *get_request_wait(request_queue_t *q, int rw)
{
DEFINE_WAIT(wait);
- struct request_list *rl = &q->rq[rw];
+ struct request_list *rl = &q->rq;
struct request *rq;
spin_lock_prefetch(q->queue_lock);
@@ -1325,20 +1293,15 @@
do {
int block = 0;
- prepare_to_wait_exclusive(&rl->wait, &wait,
- TASK_UNINTERRUPTIBLE);
spin_lock_irq(q->queue_lock);
- if (!rl->count)
+ if (!rl->count[rw])
block = 1;
spin_unlock_irq(q->queue_lock);
if (block)
io_schedule();
- finish_wait(&rl->wait, &wait);
- spin_lock_irq(q->queue_lock);
- rq = get_request(q, rw);
- spin_unlock_irq(q->queue_lock);
+ rq = get_request(q, rw, GFP_NOIO);
} while (rq == NULL);
return rq;
}
@@ -1349,13 +1312,7 @@
BUG_ON(rw != READ && rw != WRITE);
- spin_lock_irq(q->queue_lock);
- rq = get_request(q, rw);
- spin_unlock_irq(q->queue_lock);
-
- if (!rq && (gfp_mask & __GFP_WAIT))
- rq = get_request_wait(q, rw);
-
+ rq = get_request(q, rw, gfp_mask);
if (rq) {
rq->flags = 0;
rq->buffer = NULL;
@@ -1374,7 +1331,7 @@
BUG_ON(rw != READ && rw != WRITE);
- rq = get_request(q, rw);
+ rq = get_request(q, rw, GFP_ATOMIC);
if (rq) {
rq->flags = 0;
@@ -1508,34 +1465,26 @@
if (unlikely(!q))
return;
- req->rq_status = RQ_INACTIVE;
- req->q = NULL;
- req->rl = NULL;
-
/*
* Request may not have originated from ll_rw_blk. if not,
* it didn't come out of our reserved rq pools
*/
if (rl) {
- int rw = 0;
+ int rw = rq_data_dir(req);
BUG_ON(!list_empty(&req->queuelist));
- list_add(&req->queuelist, &rl->free);
+ blk_free_request(q, req);
- if (rl == &q->rq[WRITE])
- rw = WRITE;
- else if (rl == &q->rq[READ])
- rw = READ;
- else
- BUG();
-
- rl->count++;
- if (rl->count >= queue_congestion_off_threshold())
+ rl->count[rw]++;
+ if (rl->count[rw] >= queue_congestion_off_threshold())
clear_queue_congested(q, rw);
- if (rl->count >= batch_requests && waitqueue_active(&rl->wait))
- wake_up(&rl->wait);
}
+
+ req->rq_status = RQ_INACTIVE;
+ req->q = NULL;
+ req->rl = NULL;
+ req->elevator_private = NULL;
}
void blk_put_request(struct request *req)
@@ -1772,7 +1721,7 @@
if (freereq) {
req = freereq;
freereq = NULL;
- } else if ((req = get_request(q, rw)) == NULL) {
+ } else if ((req = get_request(q, rw, GFP_ATOMIC)) == NULL) {
spin_unlock_irq(q->queue_lock);
/*
@@ -1815,8 +1764,8 @@
__blk_put_request(q, freereq);
if (blk_queue_plugged(q)) {
- int nr_queued = (queue_nr_requests - q->rq[0].count) +
- (queue_nr_requests - q->rq[1].count);
+ int nr_queued = (queue_nr_requests - q->rq.count[0]) +
+ (queue_nr_requests - q->rq.count[1]);
if (nr_queued == q->unplug_thresh)
__generic_unplug_device(q);
}
@@ -2183,7 +2132,6 @@
int __init blk_dev_init(void)
{
- int total_ram = nr_free_pages() << (PAGE_SHIFT - 10);
int i;
request_cachep = kmem_cache_create("blkdev_requests",
@@ -2191,24 +2139,11 @@
if (!request_cachep)
panic("Can't create request pool slab cache\n");
- /*
- * Free request slots per queue. One per quarter-megabyte.
- * We use this many requests for reads, and this many for writes.
- */
- queue_nr_requests = (total_ram >> 9) & ~7;
- if (queue_nr_requests < 16)
- queue_nr_requests = 16;
- if (queue_nr_requests > 128)
- queue_nr_requests = 128;
-
- batch_requests = queue_nr_requests / 8;
- if (batch_requests > 8)
- batch_requests = 8;
+ queue_nr_requests = BLKDEV_MAX_RQ;
printk("block request queues:\n");
- printk(" %d requests per read queue\n", queue_nr_requests);
- printk(" %d requests per write queue\n", queue_nr_requests);
- printk(" %d requests per batch\n", batch_requests);
+ printk(" %d/%d requests per read queue\n", BLKDEV_MIN_RQ, queue_nr_requests);
+ printk(" %d/%d requests per write queue\n", BLKDEV_MIN_RQ, queue_nr_requests);
printk(" enter congestion at %d\n", queue_congestion_on_threshold());
printk(" exit congestion at %d\n", queue_congestion_off_threshold());
===== include/linux/blkdev.h 1.98 vs edited =====
--- 1.98/include/linux/blkdev.h Tue Feb 18 11:29:00 2003
+++ edited/include/linux/blkdev.h Tue Mar 25 10:24:29 2003
@@ -10,6 +10,7 @@
#include <linux/pagemap.h>
#include <linux/backing-dev.h>
#include <linux/wait.h>
+#include <linux/mempool.h>
#include <asm/scatterlist.h>
@@ -18,10 +19,12 @@
struct elevator_s;
typedef struct elevator_s elevator_t;
+#define BLKDEV_MIN_RQ 4
+#define BLKDEV_MAX_RQ 256
+
struct request_list {
- unsigned int count;
- struct list_head free;
- wait_queue_head_t wait;
+ unsigned int count[2];
+ mempool_t *rq_pool;
};
/*
@@ -180,7 +183,7 @@
/*
* the queue request freelist, one for reads and one for writes
*/
- struct request_list rq[2];
+ struct request_list rq;
request_fn_proc *request_fn;
merge_request_fn *back_merge_fn;
===== include/linux/elevator.h 1.18 vs edited =====
--- 1.18/include/linux/elevator.h Sun Jan 12 15:10:40 2003
+++ edited/include/linux/elevator.h Tue Mar 25 11:03:42 2003
@@ -15,6 +15,8 @@
typedef void (elevator_remove_req_fn) (request_queue_t *, struct request *);
typedef struct request *(elevator_request_list_fn) (request_queue_t *, struct request *);
typedef struct list_head *(elevator_get_sort_head_fn) (request_queue_t *, struct request *);
+typedef int (elevator_set_req_fn) (request_queue_t *, struct request *, int);
+typedef void (elevator_put_req_fn) (request_queue_t *, struct request *);
typedef int (elevator_init_fn) (request_queue_t *, elevator_t *);
typedef void (elevator_exit_fn) (request_queue_t *, elevator_t *);
@@ -34,6 +36,9 @@
elevator_request_list_fn *elevator_former_req_fn;
elevator_request_list_fn *elevator_latter_req_fn;
+ elevator_set_req_fn *elevator_set_req_fn;
+ elevator_put_req_fn *elevator_put_req_fn;
+
elevator_init_fn *elevator_init_fn;
elevator_exit_fn *elevator_exit_fn;
@@ -58,6 +63,8 @@
extern struct request *elv_latter_request(request_queue_t *, struct request *);
extern int elv_register_queue(struct gendisk *);
extern void elv_unregister_queue(struct gendisk *);
+extern int elv_set_request(request_queue_t *, struct request *, int);
+extern void elv_put_request(request_queue_t *, struct request *);
#define __elv_add_request_pos(q, rq, pos) \
(q)->elevator.elevator_add_req_fn((q), (rq), (pos))
--
Jens Axboe
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 10:56 ` Jens Axboe
@ 2003-03-25 11:23 ` Jens Axboe
2003-03-25 11:37 ` Jens Axboe
2003-03-25 11:39 ` Nick Piggin
1 sibling, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2003-03-25 11:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: dougg, pbadari, linux-kernel, linux-scsi
On Tue, Mar 25 2003, Jens Axboe wrote:
> Here's a patch that makes the request allocation (and io scheduler
> private data) dynamic, with upper and lower bounds of 4 and 256
> respectively. The numbers are a bit random - the 4 will allow us to make
> progress, but it might be a smidgen too low. Perhaps 8 would be good.
> 256 is twice as much as before, but that should be alright as long as
> the io scheduler copes. BLKDEV_MAX_RQ and BLKDEV_MIN_RQ control these
> two variables.
>
> We lose the old batching functionality, for now. I can resurrect that
> if needed. It's a rough fit with the mempool; it doesn't _quite_ fit our
> needs here. I'll probably end up doing a specialised block pool scheme
> for this.
>
> Hasn't been tested all that much, it boots though :-)
Here's a version with better lock handling. We drop the queue_lock for
most parts of __make_request(), except the actual io scheduler calls.
Patch is against 2.5.66-BK as of today.
drivers/block/deadline-iosched.c | 86 ++++++-------
drivers/block/elevator.c | 18 ++
drivers/block/ll_rw_blk.c | 245 ++++++++++++---------------------------
include/linux/blkdev.h | 12 +
include/linux/elevator.h | 7 +
5 files changed, 154 insertions(+), 214 deletions(-)
===== drivers/block/deadline-iosched.c 1.17 vs edited =====
--- 1.17/drivers/block/deadline-iosched.c Fri Mar 14 16:55:04 2003
+++ edited/drivers/block/deadline-iosched.c Tue Mar 25 11:05:37 2003
@@ -71,6 +71,8 @@
int fifo_batch;
int writes_starved;
int front_merges;
+
+ mempool_t *drq_pool;
};
/*
@@ -673,28 +675,11 @@
static void deadline_exit(request_queue_t *q, elevator_t *e)
{
struct deadline_data *dd = e->elevator_data;
- struct deadline_rq *drq;
- struct request *rq;
- int i;
BUG_ON(!list_empty(&dd->fifo_list[READ]));
BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
- for (i = READ; i <= WRITE; i++) {
- struct request_list *rl = &q->rq[i];
- struct list_head *entry;
-
- list_for_each(entry, &rl->free) {
- rq = list_entry_rq(entry);
-
- if ((drq = RQ_DATA(rq)) == NULL)
- continue;
-
- rq->elevator_private = NULL;
- kmem_cache_free(drq_pool, drq);
- }
- }
-
+ mempool_destroy(dd->drq_pool);
kfree(dd->hash);
kfree(dd);
}
@@ -706,9 +691,7 @@
static int deadline_init(request_queue_t *q, elevator_t *e)
{
struct deadline_data *dd;
- struct deadline_rq *drq;
- struct request *rq;
- int i, ret = 0;
+ int i;
if (!drq_pool)
return -ENOMEM;
@@ -724,6 +707,13 @@
return -ENOMEM;
}
+ dd->drq_pool = mempool_create(BLKDEV_MIN_RQ, mempool_alloc_slab, mempool_free_slab, drq_pool);
+ if (!dd->drq_pool) {
+ kfree(dd->hash);
+ kfree(dd);
+ return -ENOMEM;
+ }
+
for (i = 0; i < DL_HASH_ENTRIES; i++)
INIT_LIST_HEAD(&dd->hash[i]);
@@ -739,33 +729,41 @@
dd->front_merges = 1;
dd->fifo_batch = fifo_batch;
e->elevator_data = dd;
+ return 0;
+}
- for (i = READ; i <= WRITE; i++) {
- struct request_list *rl = &q->rq[i];
- struct list_head *entry;
-
- list_for_each(entry, &rl->free) {
- rq = list_entry_rq(entry);
-
- drq = kmem_cache_alloc(drq_pool, GFP_KERNEL);
- if (!drq) {
- ret = -ENOMEM;
- break;
- }
+static void deadline_put_request(request_queue_t *q, struct request *rq)
+{
+ struct deadline_data *dd = q->elevator.elevator_data;
+ struct deadline_rq *drq = RQ_DATA(rq);
- memset(drq, 0, sizeof(*drq));
- INIT_LIST_HEAD(&drq->fifo);
- INIT_LIST_HEAD(&drq->hash);
- RB_CLEAR(&drq->rb_node);
- drq->request = rq;
- rq->elevator_private = drq;
- }
+ if (drq) {
+ mempool_free(drq, dd->drq_pool);
+ rq->elevator_private = NULL;
}
+}
+
+static int
+deadline_set_request(request_queue_t *q, struct request *rq, int gfp_mask)
+{
+ struct deadline_data *dd = q->elevator.elevator_data;
+ struct deadline_rq *drq;
- if (ret)
- deadline_exit(q, e);
+ drq = mempool_alloc(dd->drq_pool, gfp_mask);
+ if (drq) {
+ RB_CLEAR(&drq->rb_node);
+ drq->request = rq;
- return ret;
+ INIT_LIST_HEAD(&drq->hash);
+ drq->hash_valid_count = 0;
+
+ INIT_LIST_HEAD(&drq->fifo);
+
+ rq->elevator_private = drq;
+ return 0;
+ }
+
+ return 1;
}
/*
@@ -916,6 +914,8 @@
.elevator_queue_empty_fn = deadline_queue_empty,
.elevator_former_req_fn = deadline_former_request,
.elevator_latter_req_fn = deadline_latter_request,
+ .elevator_set_req_fn = deadline_set_request,
+ .elevator_put_req_fn = deadline_put_request,
.elevator_init_fn = deadline_init,
.elevator_exit_fn = deadline_exit,
===== drivers/block/elevator.c 1.40 vs edited =====
--- 1.40/drivers/block/elevator.c Sun Feb 16 12:32:35 2003
+++ edited/drivers/block/elevator.c Tue Mar 25 11:04:01 2003
@@ -408,6 +408,24 @@
return NULL;
}
+int elv_set_request(request_queue_t *q, struct request *rq, int gfp_mask)
+{
+ elevator_t *e = &q->elevator;
+
+ if (e->elevator_set_req_fn)
+ return e->elevator_set_req_fn(q, rq, gfp_mask);
+
+ return 0;
+}
+
+void elv_put_request(request_queue_t *q, struct request *rq)
+{
+ elevator_t *e = &q->elevator;
+
+ if (e->elevator_put_req_fn)
+ e->elevator_put_req_fn(q, rq);
+}
+
int elv_register_queue(struct gendisk *disk)
{
request_queue_t *q = disk->queue;
===== drivers/block/ll_rw_blk.c 1.161 vs edited =====
--- 1.161/drivers/block/ll_rw_blk.c Mon Mar 24 02:56:03 2003
+++ edited/drivers/block/ll_rw_blk.c Tue Mar 25 12:20:16 2003
@@ -48,12 +48,6 @@
*/
static int queue_nr_requests;
-/*
- * How many free requests must be available before we wake a process which
- * is waiting for a request?
- */
-static int batch_requests;
-
unsigned long blk_max_low_pfn, blk_max_pfn;
int blk_nohighio = 0;
@@ -1118,26 +1112,6 @@
spin_unlock_irq(&blk_plug_lock);
}
-static int __blk_cleanup_queue(struct request_list *list)
-{
- struct list_head *head = &list->free;
- struct request *rq;
- int i = 0;
-
- while (!list_empty(head)) {
- rq = list_entry(head->next, struct request, queuelist);
- list_del_init(&rq->queuelist);
- kmem_cache_free(request_cachep, rq);
- i++;
- }
-
- if (i != list->count)
- printk("request list leak!\n");
-
- list->count = 0;
- return i;
-}
-
/**
* blk_cleanup_queue: - release a &request_queue_t when it is no longer needed
* @q: the request queue to be released
@@ -1154,18 +1128,14 @@
**/
void blk_cleanup_queue(request_queue_t * q)
{
- int count = (queue_nr_requests*2);
+ struct request_list *rl = &q->rq;
elevator_exit(q);
- count -= __blk_cleanup_queue(&q->rq[READ]);
- count -= __blk_cleanup_queue(&q->rq[WRITE]);
-
del_timer_sync(&q->unplug_timer);
flush_scheduled_work();
- if (count)
- printk("blk_cleanup_queue: leaked requests (%d)\n", count);
+ mempool_destroy(rl->rq_pool);
if (blk_queue_tagged(q))
blk_queue_free_tags(q);
@@ -1175,42 +1145,17 @@
static int blk_init_free_list(request_queue_t *q)
{
- struct request_list *rl;
- struct request *rq;
- int i;
+ struct request_list *rl = &q->rq;
- INIT_LIST_HEAD(&q->rq[READ].free);
- INIT_LIST_HEAD(&q->rq[WRITE].free);
- q->rq[READ].count = 0;
- q->rq[WRITE].count = 0;
+ rl->count[READ] = BLKDEV_MAX_RQ;
+ rl->count[WRITE] = BLKDEV_MAX_RQ;
- /*
- * Divide requests in half between read and write
- */
- rl = &q->rq[READ];
- for (i = 0; i < (queue_nr_requests*2); i++) {
- rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
- if (!rq)
- goto nomem;
-
- /*
- * half way through, switch to WRITE list
- */
- if (i == queue_nr_requests)
- rl = &q->rq[WRITE];
+ rl->rq_pool = mempool_create(BLKDEV_MIN_RQ, mempool_alloc_slab, mempool_free_slab, request_cachep);
- memset(rq, 0, sizeof(struct request));
- rq->rq_status = RQ_INACTIVE;
- list_add(&rq->queuelist, &rl->free);
- rl->count++;
- }
+ if (!rl->rq_pool)
+ return -ENOMEM;
- init_waitqueue_head(&q->rq[READ].wait);
- init_waitqueue_head(&q->rq[WRITE].wait);
return 0;
-nomem:
- blk_cleanup_queue(q);
- return 1;
}
static int __make_request(request_queue_t *, struct bio *);
@@ -1277,34 +1222,62 @@
return 0;
}
+static inline void blk_free_request(request_queue_t *q, struct request *rq)
+{
+ elv_put_request(q, rq);
+ mempool_free(rq, q->rq.rq_pool);
+}
+
+static inline struct request *blk_alloc_request(request_queue_t *q,int gfp_mask)
+{
+ struct request *rq = mempool_alloc(q->rq.rq_pool, gfp_mask);
+
+ if (rq) {
+ if (!elv_set_request(q, rq, gfp_mask))
+ return rq;
+
+ mempool_free(rq, q->rq.rq_pool);
+ }
+
+ return NULL;
+}
+
#define blkdev_free_rq(list) list_entry((list)->next, struct request, queuelist)
/*
- * Get a free request. queue lock must be held and interrupts
- * disabled on the way in.
+ * Get a free request, queue_lock must not be held
*/
-static struct request *get_request(request_queue_t *q, int rw)
+static struct request *get_request(request_queue_t *q, int rw, int gfp_mask)
{
struct request *rq = NULL;
- struct request_list *rl = q->rq + rw;
+ struct request_list *rl = &q->rq;
- if (!list_empty(&rl->free)) {
- rq = blkdev_free_rq(&rl->free);
- list_del_init(&rq->queuelist);
- rq->ref_count = 1;
- rl->count--;
- if (rl->count < queue_congestion_on_threshold())
+ /*
+ * known racy, ok though
+ */
+ if (!rl->count[rw])
+ return NULL;
+
+ rq = blk_alloc_request(q, gfp_mask);
+ if (rq) {
+ spin_lock_irq(q->queue_lock);
+ rl->count[rw]--;
+ if (rl->count[rw] < queue_congestion_on_threshold())
set_queue_congested(q, rw);
+ spin_unlock_irq(q->queue_lock);
+
+ INIT_LIST_HEAD(&rq->queuelist);
rq->flags = 0;
- rq->rq_status = RQ_ACTIVE;
rq->errors = 0;
- rq->special = NULL;
- rq->buffer = NULL;
- rq->data = NULL;
- rq->sense = NULL;
- rq->waiting = NULL;
+ rq->rq_status = RQ_ACTIVE;
rq->bio = rq->biotail = NULL;
+ rq->buffer = NULL;
+ rq->ref_count = 1;
rq->q = q;
rq->rl = rl;
+ rq->special = NULL;
+ rq->data = NULL;
+ rq->waiting = NULL;
+ rq->sense = NULL;
}
return rq;
@@ -1316,7 +1289,7 @@
static struct request *get_request_wait(request_queue_t *q, int rw)
{
DEFINE_WAIT(wait);
- struct request_list *rl = &q->rq[rw];
+ struct request_list *rl = &q->rq;
struct request *rq;
spin_lock_prefetch(q->queue_lock);
@@ -1325,64 +1298,24 @@
do {
int block = 0;
- prepare_to_wait_exclusive(&rl->wait, &wait,
- TASK_UNINTERRUPTIBLE);
spin_lock_irq(q->queue_lock);
- if (!rl->count)
+ if (!rl->count[rw])
block = 1;
spin_unlock_irq(q->queue_lock);
if (block)
io_schedule();
- finish_wait(&rl->wait, &wait);
- spin_lock_irq(q->queue_lock);
- rq = get_request(q, rw);
- spin_unlock_irq(q->queue_lock);
+ rq = get_request(q, rw, GFP_NOIO);
} while (rq == NULL);
return rq;
}
struct request *blk_get_request(request_queue_t *q, int rw, int gfp_mask)
{
- struct request *rq;
-
- BUG_ON(rw != READ && rw != WRITE);
-
- spin_lock_irq(q->queue_lock);
- rq = get_request(q, rw);
- spin_unlock_irq(q->queue_lock);
-
- if (!rq && (gfp_mask & __GFP_WAIT))
- rq = get_request_wait(q, rw);
-
- if (rq) {
- rq->flags = 0;
- rq->buffer = NULL;
- rq->bio = rq->biotail = NULL;
- rq->waiting = NULL;
- }
- return rq;
-}
-
-/*
- * Non-locking blk_get_request variant, for special requests from drivers.
- */
-struct request *__blk_get_request(request_queue_t *q, int rw)
-{
- struct request *rq;
-
BUG_ON(rw != READ && rw != WRITE);
- rq = get_request(q, rw);
-
- if (rq) {
- rq->flags = 0;
- rq->buffer = NULL;
- rq->bio = rq->biotail = NULL;
- rq->waiting = NULL;
- }
- return rq;
+ return get_request(q, rw, gfp_mask);
}
/**
@@ -1499,6 +1432,9 @@
disk->stamp_idle = now;
}
+/*
+ * queue lock must be held
+ */
void __blk_put_request(request_queue_t *q, struct request *req)
{
struct request_list *rl = req->rl;
@@ -1508,34 +1444,26 @@
if (unlikely(!q))
return;
- req->rq_status = RQ_INACTIVE;
- req->q = NULL;
- req->rl = NULL;
-
/*
* Request may not have originated from ll_rw_blk. if not,
* it didn't come out of our reserved rq pools
*/
if (rl) {
- int rw = 0;
+ int rw = rq_data_dir(req);
BUG_ON(!list_empty(&req->queuelist));
- list_add(&req->queuelist, &rl->free);
+ blk_free_request(q, req);
- if (rl == &q->rq[WRITE])
- rw = WRITE;
- else if (rl == &q->rq[READ])
- rw = READ;
- else
- BUG();
-
- rl->count++;
- if (rl->count >= queue_congestion_off_threshold())
+ rl->count[rw]++;
+ if (rl->count[rw] >= queue_congestion_off_threshold())
clear_queue_congested(q, rw);
- if (rl->count >= batch_requests && waitqueue_active(&rl->wait))
- wake_up(&rl->wait);
}
+
+ req->rq_status = RQ_INACTIVE;
+ req->q = NULL;
+ req->rl = NULL;
+ req->elevator_private = NULL;
}
void blk_put_request(struct request *req)
@@ -1694,9 +1622,9 @@
barrier = test_bit(BIO_RW_BARRIER, &bio->bi_rw);
- spin_lock_irq(q->queue_lock);
again:
insert_here = NULL;
+ spin_lock_irq(q->queue_lock);
if (blk_queue_empty(q)) {
blk_plug_device(q);
@@ -1769,12 +1697,12 @@
* a free slot.
*/
get_rq:
+ spin_unlock_irq(q->queue_lock);
+
if (freereq) {
req = freereq;
freereq = NULL;
- } else if ((req = get_request(q, rw)) == NULL) {
- spin_unlock_irq(q->queue_lock);
-
+ } else if ((req = get_request(q, rw, GFP_NOIO)) == NULL) {
/*
* READA bit set
*/
@@ -1782,7 +1710,6 @@
goto end_io;
freereq = get_request_wait(q, rw);
- spin_lock_irq(q->queue_lock);
goto again;
}
@@ -1809,19 +1736,20 @@
req->bio = req->biotail = bio;
req->rq_disk = bio->bi_bdev->bd_disk;
req->start_time = jiffies;
+
+ spin_lock_irq(q->queue_lock);
add_request(q, req, insert_here);
out:
if (freereq)
__blk_put_request(q, freereq);
if (blk_queue_plugged(q)) {
- int nr_queued = (queue_nr_requests - q->rq[0].count) +
- (queue_nr_requests - q->rq[1].count);
+ int nr_queued = (queue_nr_requests - q->rq.count[0]) +
+ (queue_nr_requests - q->rq.count[1]);
if (nr_queued == q->unplug_thresh)
__generic_unplug_device(q);
}
spin_unlock_irq(q->queue_lock);
-
return 0;
end_io:
@@ -2183,7 +2111,6 @@
int __init blk_dev_init(void)
{
- int total_ram = nr_free_pages() << (PAGE_SHIFT - 10);
int i;
request_cachep = kmem_cache_create("blkdev_requests",
@@ -2191,24 +2118,11 @@
if (!request_cachep)
panic("Can't create request pool slab cache\n");
- /*
- * Free request slots per queue. One per quarter-megabyte.
- * We use this many requests for reads, and this many for writes.
- */
- queue_nr_requests = (total_ram >> 9) & ~7;
- if (queue_nr_requests < 16)
- queue_nr_requests = 16;
- if (queue_nr_requests > 128)
- queue_nr_requests = 128;
-
- batch_requests = queue_nr_requests / 8;
- if (batch_requests > 8)
- batch_requests = 8;
+ queue_nr_requests = BLKDEV_MAX_RQ;
printk("block request queues:\n");
- printk(" %d requests per read queue\n", queue_nr_requests);
- printk(" %d requests per write queue\n", queue_nr_requests);
- printk(" %d requests per batch\n", batch_requests);
+ printk(" %d/%d requests per read queue\n", BLKDEV_MIN_RQ, queue_nr_requests);
+ printk(" %d/%d requests per write queue\n", BLKDEV_MIN_RQ, queue_nr_requests);
printk(" enter congestion at %d\n", queue_congestion_on_threshold());
printk(" exit congestion at %d\n", queue_congestion_off_threshold());
@@ -2250,7 +2164,6 @@
EXPORT_SYMBOL(blk_phys_contig_segment);
EXPORT_SYMBOL(blk_hw_contig_segment);
EXPORT_SYMBOL(blk_get_request);
-EXPORT_SYMBOL(__blk_get_request);
EXPORT_SYMBOL(blk_put_request);
EXPORT_SYMBOL(blk_insert_request);
===== include/linux/blkdev.h 1.98 vs edited =====
--- 1.98/include/linux/blkdev.h Tue Feb 18 11:29:00 2003
+++ edited/include/linux/blkdev.h Tue Mar 25 12:12:20 2003
@@ -10,6 +10,7 @@
#include <linux/pagemap.h>
#include <linux/backing-dev.h>
#include <linux/wait.h>
+#include <linux/mempool.h>
#include <asm/scatterlist.h>
@@ -18,10 +19,12 @@
struct elevator_s;
typedef struct elevator_s elevator_t;
+#define BLKDEV_MIN_RQ 4
+#define BLKDEV_MAX_RQ 256
+
struct request_list {
- unsigned int count;
- struct list_head free;
- wait_queue_head_t wait;
+ int count[2];
+ mempool_t *rq_pool;
};
/*
@@ -180,7 +183,7 @@
/*
* the queue request freelist, one for reads and one for writes
*/
- struct request_list rq[2];
+ struct request_list rq;
request_fn_proc *request_fn;
merge_request_fn *back_merge_fn;
@@ -330,7 +333,6 @@
extern void blk_attempt_remerge(request_queue_t *, struct request *);
extern void __blk_attempt_remerge(request_queue_t *, struct request *);
extern struct request *blk_get_request(request_queue_t *, int, int);
-extern struct request *__blk_get_request(request_queue_t *, int);
extern void blk_put_request(struct request *);
extern void blk_insert_request(request_queue_t *, struct request *, int, void *);
extern void blk_plug_device(request_queue_t *);
===== include/linux/elevator.h 1.18 vs edited =====
--- 1.18/include/linux/elevator.h Sun Jan 12 15:10:40 2003
+++ edited/include/linux/elevator.h Tue Mar 25 11:03:42 2003
@@ -15,6 +15,8 @@
typedef void (elevator_remove_req_fn) (request_queue_t *, struct request *);
typedef struct request *(elevator_request_list_fn) (request_queue_t *, struct request *);
typedef struct list_head *(elevator_get_sort_head_fn) (request_queue_t *, struct request *);
+typedef int (elevator_set_req_fn) (request_queue_t *, struct request *, int);
+typedef void (elevator_put_req_fn) (request_queue_t *, struct request *);
typedef int (elevator_init_fn) (request_queue_t *, elevator_t *);
typedef void (elevator_exit_fn) (request_queue_t *, elevator_t *);
@@ -34,6 +36,9 @@
elevator_request_list_fn *elevator_former_req_fn;
elevator_request_list_fn *elevator_latter_req_fn;
+ elevator_set_req_fn *elevator_set_req_fn;
+ elevator_put_req_fn *elevator_put_req_fn;
+
elevator_init_fn *elevator_init_fn;
elevator_exit_fn *elevator_exit_fn;
@@ -58,6 +63,8 @@
extern struct request *elv_latter_request(request_queue_t *, struct request *);
extern int elv_register_queue(struct gendisk *);
extern void elv_unregister_queue(struct gendisk *);
+extern int elv_set_request(request_queue_t *, struct request *, int);
+extern void elv_put_request(request_queue_t *, struct request *);
#define __elv_add_request_pos(q, rq, pos) \
(q)->elevator.elevator_add_req_fn((q), (rq), (pos))
--
Jens Axboe
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 11:23 ` Jens Axboe
@ 2003-03-25 11:37 ` Jens Axboe
0 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2003-03-25 11:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: dougg, pbadari, linux-kernel, linux-scsi
On Tue, Mar 25 2003, Jens Axboe wrote:
> On Tue, Mar 25 2003, Jens Axboe wrote:
> > Here's a patch that makes the request allocation (and io scheduler
> > private data) dynamic, with upper and lower bounds of 4 and 256
> > respectively. The numbers are a bit random - the 4 will allow us to make
> > progress, but it might be a smidgen too low. Perhaps 8 would be good.
> > 256 is twice as much as before, but that should be alright as long as
> > the io scheduler copes. BLKDEV_MAX_RQ and BLKDEV_MIN_RQ control these
> > two variables.
> >
> > We lose the old batching functionality, for now. I can resurrect that
> > if needed. It's a rough fit with the mempool, it doesn't _quite_ fit our
> > needs here. I'll probably end up doing a specialised block pool scheme
> > for this.
> >
> > Hasn't been tested all that much, it boots though :-)
>
> Here's a version with better lock handling. We drop the queue_lock for
> most parts of __make_request(), except the actual io scheduler calls.
That was buggy, fixing... Back later.
--
Jens Axboe
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 10:56 ` Jens Axboe
2003-03-25 11:23 ` Jens Axboe
@ 2003-03-25 11:39 ` Nick Piggin
2003-03-25 12:01 ` Jens Axboe
1 sibling, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2003-03-25 11:39 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Morton, dougg, pbadari, linux-kernel, linux-scsi
Jens Axboe wrote:
>On Sat, Mar 22 2003, Andrew Morton wrote:
>
>>Douglas Gilbert <dougg@torque.net> wrote:
>>
>>>Andrew Morton wrote:
>>>
>>>>Douglas Gilbert <dougg@torque.net> wrote:
>>>>
>>>>
>>>>>>Slab: 464364 kB
>>>>>>
>>>>It's all in slab.
>>>>
>>>>
>>>>
>>>>>I did notice a rather large growth of nodes
>>>>>in sysfs. For 84 added scsi_debug pseudo disks the number
>>>>>of sysfs nodes went from 686 to 3347.
>>>>>
>>>>>Does anybody know what is the per node memory cost of sysfs?
>>>>>
>>>>
>>>>Let's see all of /proc/slabinfo please.
>>>>
>>>Andrew,
>>>Attachments are /proc/slabinfo pre and post:
>>> $ modprobe scsi_debug add_host=42 num_devs=2
>>>which adds 84 pseudo disks.
>>>
>>>
>>OK, thanks. So with 84 disks you've lost five megabytes to blkdev_requests
>>and deadline_drq objects. With 4000 disks, you're toast. That's enough
>>request structures to put 200 gigabytes of memory under I/O ;)
>>
>>We need to make the request structures dynamically allocated for other
>>reasons (which I cannot immediately remember) but it didn't happen. I guess
>>we have some motivation now.
>>
>
>Here's a patch that makes the request allocation (and io scheduler
>private data) dynamic, with upper and lower bounds of 4 and 256
>respectively. The numbers are a bit random - the 4 will allow us to make
>progress, but it might be a smidgen too low. Perhaps 8 would be good.
>256 is twice as much as before, but that should be alright as long as
>the io scheduler copes. BLKDEV_MAX_RQ and BLKDEV_MIN_RQ control these
>two variables.
>
>We lose the old batching functionality, for now. I can resurrect that
>if needed. It's a rough fit with the mempool, it doesn't _quite_ fit our
>needs here. I'll probably end up doing a specialised block pool scheme
>for this.
>
>Hasn't been tested all that much, it boots though :-)
>
Nice Jens. Very good in theory but I haven't looked at the
code too much yet.
Would it be possible to have all queues allocate out of
the one global pool of free requests. This way you could
have a big minimum (say 128) and a big maximum
(say min(Mbytes, spindles)).
This way memory usage is decoupled from the number of
queues, and busy spindles could make use of more
available free requests.
Oh and the max value can easily be runtime tunable, right?
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 11:39 ` Nick Piggin
@ 2003-03-25 12:01 ` Jens Axboe
2003-03-25 12:12 ` Nick Piggin
0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2003-03-25 12:01 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, dougg, pbadari, linux-kernel, linux-scsi
On Tue, Mar 25 2003, Nick Piggin wrote:
>
>
> Jens Axboe wrote:
>
> >On Sat, Mar 22 2003, Andrew Morton wrote:
> >
> >>Douglas Gilbert <dougg@torque.net> wrote:
> >>
> >>>Andrew Morton wrote:
> >>>
> >>>>Douglas Gilbert <dougg@torque.net> wrote:
> >>>>
> >>>>
> >>>>>>Slab: 464364 kB
> >>>>>>
> >>>>It's all in slab.
> >>>>
> >>>>
> >>>>
> >>>>>I did notice a rather large growth of nodes
> >>>>>in sysfs. For 84 added scsi_debug pseudo disks the number
> >>>>>of sysfs nodes went from 686 to 3347.
> >>>>>
> >>>>>Does anybody know what is the per node memory cost of sysfs?
> >>>>>
> >>>>
> >>>>Let's see all of /proc/slabinfo please.
> >>>>
> >>>Andrew,
> >>>Attachments are /proc/slabinfo pre and post:
> >>> $ modprobe scsi_debug add_host=42 num_devs=2
> >>>which adds 84 pseudo disks.
> >>>
> >>>
> >>OK, thanks. So with 84 disks you've lost five megabytes to
> >>blkdev_requests
> >>and deadline_drq objects. With 4000 disks, you're toast. That's enough
> >>request structures to put 200 gigabytes of memory under I/O ;)
> >>
> >>We need to make the request structures dynamically allocated for other
> >>reasons (which I cannot immediately remember) but it didn't happen. I
> >>guess
> >>we have some motivation now.
> >>
> >
> >Here's a patch that makes the request allocation (and io scheduler
> >private data) dynamic, with upper and lower bounds of 4 and 256
> >respectively. The numbers are a bit random - the 4 will allow us to make
> >progress, but it might be a smidgen too low. Perhaps 8 would be good.
> >256 is twice as much as before, but that should be alright as long as
> >the io scheduler copes. BLKDEV_MAX_RQ and BLKDEV_MIN_RQ control these
> >two variables.
> >
> >We lose the old batching functionality, for now. I can resurrect that
> >if needed. It's a rough fit with the mempool, it doesn't _quite_ fit our
> >needs here. I'll probably end up doing a specialised block pool scheme
> >for this.
> >
> >Hasn't been tested all that much, it boots though :-)
> >
> Nice Jens. Very good in theory but I haven't looked at the
> code too much yet.
>
> Would it be possible to have all queues allocate out of
> the one global pool of free requests. This way you could
> have a big minimum (say 128) and a big maximum
> (say min(Mbytes, spindles)).
Well not really, as far as I can see we _need_ a pool per queue. Imagine
a bio handed to raid, needs to be split to 6 different queues. But our
minimum is 4, deadlock possibility. It could probably be made to work,
however I greatly prefer a per-queue reserve.
> This way memory usage is decoupled from the number of
> queues, and busy spindles could make use of more
> available free requests.
>
> Oh and the max value can easily be runtime tunable, right?
Sure. However, they don't really mean _anything_. Max is just some
random number to prevent one queue going nuts, and could be completely
removed if the vm works perfectly. Beyond some limit there's little
benefit to doing that, though. But MAX could be runtime tunable. Min is
basically just to make sure we don't kill ourselves, I don't see any
point in making that runtime tunable. It's not really a tunable.
--
Jens Axboe
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 12:01 ` Jens Axboe
@ 2003-03-25 12:12 ` Nick Piggin
2003-03-25 12:35 ` Jens Axboe
0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2003-03-25 12:12 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Morton, dougg, pbadari, linux-kernel, linux-scsi
Jens Axboe wrote:
>On Tue, Mar 25 2003, Nick Piggin wrote:
>
>>
>>Nice Jens. Very good in theory but I haven't looked at the
>>code too much yet.
>>
>>Would it be possible to have all queues allocate out of
>>the one global pool of free requests. This way you could
>>have a big minimum (say 128) and a big maximum
>>(say min(Mbytes, spindles)).
>>
>
>Well not really, as far as I can see we _need_ a pool per queue. Imagine
>a bio handed to raid, needs to be split to 6 different queues. But our
>minimum is 4, deadlock possibility. It could probably be made to work,
>however I greatly prefer a per-queue reserve.
>
OK yeah you are right there. In light of your comment below
I'm happy with that. I was mostly worried about queues being
restricted to a small maximum.
>
>
>>This way memory usage is decoupled from the number of
>>queues, and busy spindles could make use of more
>>available free requests.
>>
>>Oh and the max value can easily be runtime tunable, right?
>>
>
>Sure. However, they don't really mean _anything_. Max is just some
>random number to prevent one queue going nuts, and could be completely
>removed if the vm works perfectly. Beyond some limit there's little
>benefit to doing that, though. But MAX could be runtime tunable. Min is
>basically just to make sure we don't kill ourselves, I don't see any
>point in making that runtime tunable. It's not really a tunable.
>
OK, that's good then. I would like to see max removed, however perhaps
the VM isn't up to that yet. I'll be testing this when your code
solidifies!
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 12:12 ` Nick Piggin
@ 2003-03-25 12:35 ` Jens Axboe
2003-03-27 0:29 ` Badari Pulavarty
0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2003-03-25 12:35 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, dougg, pbadari, linux-kernel, linux-scsi
On Tue, Mar 25 2003, Nick Piggin wrote:
> Jens Axboe wrote:
>
> >On Tue, Mar 25 2003, Nick Piggin wrote:
> >
> >>
> >>Nice Jens. Very good in theory but I haven't looked at the
> >>code too much yet.
> >>
> >>Would it be possible to have all queues allocate out of
> >>the one global pool of free requests. This way you could
> >>have a big minimum (say 128) and a big maximum
> >>(say min(Mbytes, spindles)).
> >>
> >
> >Well not really, as far as I can see we _need_ a pool per queue. Imagine
> >a bio handed to raid, needs to be split to 6 different queues. But our
> >minimum is 4, deadlock possibility. It could probably be made to work,
> >however I greatly prefer a per-queue reserve.
> >
> OK yeah you are right there. In light of your comment below
> I'm happy with that. I was mostly worried about queues being
> restricted to a small maximum.
Understandable, we'll make max tunable.
> >>This way memory usage is decoupled from the number of
> >>queues, and busy spindles could make use of more
> >>available free requests.
> >>
> >>Oh and the max value can easily be runtime tunable, right?
> >>
> >
> >Sure. However, they don't really mean _anything_. Max is just some
> >random number to prevent one queue going nuts, and could be completely
> >removed if the vm works perfectly. Beyond some limit there's little
> >benefit to doing that, though. But MAX could be runtime tunable. Min is
> >basically just to make sure we don't kill ourselves, I don't see any
> >point in making that runtime tunable. It's not really a tunable.
> >
> OK, that's good then. I would like to see max removed, however perhaps
> the VM isn't up to that yet. I'll be testing this when your code
> solidifies!
Only testing will tell, so yes you are very welcome to give it a shot.
Let me release a known working version first :)
--
Jens Axboe
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-25 12:35 ` Jens Axboe
@ 2003-03-27 0:29 ` Badari Pulavarty
2003-03-27 9:18 ` Jens Axboe
0 siblings, 1 reply; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-27 0:29 UTC (permalink / raw)
To: Jens Axboe, Nick Piggin; +Cc: Andrew Morton, dougg, linux-kernel, linux-scsi
On Tuesday 25 March 2003 04:35 am, Jens Axboe wrote:
> Only testing will tell, so yes you are very welcome to give it a shot.
> Let me release a known working version first :)
Jens,
I found what's using 32MB out of the 8192-byte slab.
size-8192 before:10 after:4012 diff:4002 size:8192 incr:32784384
It is deadline_init():
dd->hash = kmalloc(sizeof(struct list_head)*DL_HASH_ENTRIES,GFP_KERNEL);
It is creating an 8K hash table for each queue. Since we have 4000 queues,
it used 32MB. I wonder why the current code needs 1024 hash buckets
when the maximum number of requests is only 256. And also, since you are making
request allocation dynamic, can you change this too? Any issues here?
Thanks,
Badari
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-27 0:29 ` Badari Pulavarty
@ 2003-03-27 9:18 ` Jens Axboe
2003-03-28 17:04 ` Badari Pulavarty
0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2003-03-27 9:18 UTC (permalink / raw)
To: Badari Pulavarty
Cc: Nick Piggin, Andrew Morton, dougg, linux-kernel, linux-scsi
On Wed, Mar 26 2003, Badari Pulavarty wrote:
> On Tuesday 25 March 2003 04:35 am, Jens Axboe wrote:
>
> > Only testing will tell, so yes you are very welcome to give it a shot.
> > Let me release a known working version first :)
>
> Jens,
>
> I found what's using 32MB out of the 8192-byte slab.
>
> size-8192 before:10 after:4012 diff:4002 size:8192 incr:32784384
>
> It is deadline_init():
>
> dd->hash = kmalloc(sizeof(struct list_head)*DL_HASH_ENTRIES,GFP_KERNEL);
>
> It is creating an 8K hash table for each queue. Since we have 4000 queues,
Yes
> it used 32MB. I wonder why the current code needs 1024 hash buckets,
Hmm actually that's a leftover from when we played with bigger queue
sizes, I inadvertently forgot to change it back when pushing the rbtree
deadline update to Linus. It used to be 256. We can shrink this to 2^7
or 2^8 instead, which will then only eat 1-2K.
> when the maximum number of requests is only 256. And also, since you are making
> request allocation dynamic, can you change this too? Any issues here?
No real issues to shrinking it, bigger problem if we move to larger
queues. With the rq-dyn-alloc patch, we can make the max number of
requests ceiling a lot higher and then the hash needs to be bigger too.
But for now, 256 entry should be a good default and suffice for the
future, I'll push that change.
--
Jens Axboe
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-27 9:18 ` Jens Axboe
@ 2003-03-28 17:04 ` Badari Pulavarty
2003-03-28 18:41 ` Andries Brouwer
0 siblings, 1 reply; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-28 17:04 UTC (permalink / raw)
To: Jens Axboe; +Cc: Nick Piggin, Andrew Morton, dougg, linux-kernel, linux-scsi
Hi All,
I found the heavy users of the 2048-byte slab. 8MB doesn't seem like much, but it all
adds up.
size-2048 before:98 after:4095 diff:3997 size:2060 incr:8233820
The problem is with
sd.c: sd_attach()
alloc_disk(16)
alloc_disk() allocates "hd_struct" structures for 15 minors.
So it is an 84*15 = 1260 byte allocation. They all come from
2048-byte slabs. Since I have 4000 simulated disks, it uses up 8MB.
Proposed fixes:
1) Make the allocations come from its own slab, instead of
2048-byte slab. (~40% saving).
2) Instead of allocating hd_struct structures for all possible partitions,
why not allocate them dynamically, as we see a partition? This
way we could (in theory) support more than 16 partitions, if needed.
Are there any issues with doing (2)?
Thanks,
Badari
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-28 17:04 ` Badari Pulavarty
@ 2003-03-28 18:41 ` Andries Brouwer
2003-03-29 1:39 ` Badari Pulavarty
0 siblings, 1 reply; 22+ messages in thread
From: Andries Brouwer @ 2003-03-28 18:41 UTC (permalink / raw)
To: Badari Pulavarty
Cc: Jens Axboe, Nick Piggin, Andrew Morton, dougg, linux-kernel, linux-scsi
On Fri, Mar 28, 2003 at 09:04:41AM -0800, Badari Pulavarty wrote:
> 2) Instead of allocating hd_struct structure for all possible partitions,
> why not allocate them dynamically, as we see a partition? This
> way we could (in theory) support more than 16 partitions, if needed.
This is what I plan to do.
Of course you are welcome to do it first.
Andries
* Re: [patch for playing] 2.5.65 patch to support > 256 disks
2003-03-28 18:41 ` Andries Brouwer
@ 2003-03-29 1:39 ` Badari Pulavarty
0 siblings, 0 replies; 22+ messages in thread
From: Badari Pulavarty @ 2003-03-29 1:39 UTC (permalink / raw)
To: Andries Brouwer
Cc: Jens Axboe, Nick Piggin, Andrew Morton, dougg, linux-kernel, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 648 bytes --]
On Friday 28 March 2003 10:41 am, Andries Brouwer wrote:
> On Fri, Mar 28, 2003 at 09:04:41AM -0800, Badari Pulavarty wrote:
> > 2) Instead of allocating hd_struct structure for all possible partitions,
> > why not allocate them dynamically, as we see a partition? This
> > way we could (in theory) support more than 16 partitions, if needed.
>
> This is what I plan to do.
> Of course you are welcome to do it first.
>
> Andries
Okay!! Here is my patch to add hd_structs dynamically as we add partitions.
Machine boots fine. I was able to add/delete partitions.
It is not polished yet, but any comments?
Thanks,
Badari
[-- Attachment #2: dyn.part --]
[-- Type: text/x-diff, Size: 7327 bytes --]
--- linux/include/linux/genhd.h Fri Mar 28 14:49:59 2003
+++ linux.new/include/linux/genhd.h Fri Mar 28 18:17:55 2003
@@ -63,7 +63,7 @@ struct hd_struct {
devfs_handle_t de; /* primary (master) devfs entry */
struct kobject kobj;
unsigned reads, read_sectors, writes, write_sectors;
- int policy;
+ int policy, partno;
};
#define GENHD_FL_REMOVABLE 1
@@ -89,7 +89,7 @@ struct gendisk {
int minor_shift; /* number of times minor is shifted to
get real minor */
char disk_name[16]; /* name of major driver */
- struct hd_struct *part; /* [indexed by minor] */
+ struct hd_struct **part; /* [indexed by minor] */
struct block_device_operations *fops;
struct request_queue *queue;
void *private_data;
--- linux/drivers/block/genhd.c Fri Mar 28 14:30:57 2003
+++ linux.new/drivers/block/genhd.c Fri Mar 28 18:21:17 2003
@@ -365,11 +365,13 @@ static int show_partition(struct seq_fil
(unsigned long long)get_capacity(sgp) >> 1,
disk_name(sgp, 0, buf));
for (n = 0; n < sgp->minors - 1; n++) {
- if (sgp->part[n].nr_sects == 0)
+ if (!sgp->part[n])
+ continue;
+ if (sgp->part[n]->nr_sects == 0)
continue;
seq_printf(part, "%4d %4d %10llu %s\n",
sgp->major, n + 1 + sgp->first_minor,
- (unsigned long long)sgp->part[n].nr_sects >> 1 ,
+ (unsigned long long)sgp->part[n]->nr_sects >> 1 ,
disk_name(sgp, n + 1, buf));
}
@@ -531,8 +533,6 @@ static decl_subsys(block,&ktype_block);
struct gendisk *alloc_disk(int minors)
{
- int dbg = 0 ;
-
struct gendisk *disk = kmalloc(sizeof(struct gendisk), GFP_KERNEL);
if (disk) {
memset(disk, 0, sizeof(struct gendisk));
@@ -541,7 +541,7 @@ struct gendisk *alloc_disk(int minors)
return NULL;
}
if (minors > 1) {
- int size = (minors - 1) * sizeof(struct hd_struct);
+ int size = (minors - 1) * sizeof(struct hd_struct *);
disk->part = kmalloc(size, GFP_KERNEL);
if (!disk->part) {
kfree(disk);
@@ -593,8 +593,8 @@ void set_device_ro(struct block_device *
struct gendisk *disk = bdev->bd_disk;
if (bdev->bd_contains != bdev) {
int part = bdev->bd_dev - MKDEV(disk->major, disk->first_minor);
- struct hd_struct *p = &disk->part[part-1];
- p->policy = flag;
+ struct hd_struct *p = disk->part[part-1];
+ if (p) p->policy = flag;
} else
disk->policy = flag;
}
@@ -604,7 +604,7 @@ void set_disk_ro(struct gendisk *disk, i
int i;
disk->policy = flag;
for (i = 0; i < disk->minors - 1; i++)
- disk->part[i].policy = flag;
+ if (disk->part[i]) disk->part[i]->policy = flag;
}
int bdev_read_only(struct block_device *bdev)
@@ -615,8 +615,9 @@ int bdev_read_only(struct block_device *
disk = bdev->bd_disk;
if (bdev->bd_contains != bdev) {
int part = bdev->bd_dev - MKDEV(disk->major, disk->first_minor);
- struct hd_struct *p = &disk->part[part-1];
- return p->policy;
+ struct hd_struct *p = disk->part[part-1];
+ if (p) return p->policy;
+ return 0;
} else
return disk->policy;
}
--- linux/drivers/block/ioctl.c Fri Mar 28 15:01:48 2003
+++ linux.new/drivers/block/ioctl.c Fri Mar 28 18:16:55 2003
@@ -41,11 +41,11 @@ static int blkpg_ioctl(struct block_devi
return -EINVAL;
}
/* partition number in use? */
- if (disk->part[part - 1].nr_sects != 0)
+ if (disk->part[part - 1])
return -EBUSY;
/* overlap? */
for (i = 0; i < disk->minors - 1; i++) {
- struct hd_struct *s = &disk->part[i];
+ struct hd_struct *s = disk->part[i];
if (!(start+length <= s->start_sect ||
start >= s->start_sect + s->nr_sects))
return -EBUSY;
@@ -54,7 +54,9 @@ static int blkpg_ioctl(struct block_devi
add_partition(disk, part, start, length);
return 0;
case BLKPG_DEL_PARTITION:
- if (disk->part[part - 1].nr_sects == 0)
+ if (!disk->part[part-1])
+ return -ENXIO;
+ if (disk->part[part - 1]->nr_sects == 0)
return -ENXIO;
/* partition in use? Incomplete check for now. */
bdevp = bdget(MKDEV(disk->major, disk->first_minor) + part);
--- linux/drivers/block/ll_rw_blk.c Fri Mar 28 14:58:48 2003
+++ linux.new/drivers/block/ll_rw_blk.c Fri Mar 28 18:15:25 2003
@@ -1867,7 +1867,7 @@ static inline void blk_partition_remap(s
if (bdev == bdev->bd_contains)
return;
- p = &disk->part[bdev->bd_dev-MKDEV(disk->major,disk->first_minor)-1];
+ p = disk->part[bdev->bd_dev-MKDEV(disk->major,disk->first_minor)-1];
switch (bio->bi_rw) {
case READ:
p->read_sectors += bio_sectors(bio);
--- linux/fs/block_dev.c Fri Mar 28 17:36:28 2003
+++ linux.new/fs/block_dev.c Fri Mar 28 18:15:08 2003
@@ -559,10 +559,10 @@ static int do_open(struct block_device *
bdev->bd_contains = whole;
down(&whole->bd_sem);
whole->bd_part_count++;
- p = disk->part + part - 1;
+ p = disk->part[part - 1];
bdev->bd_inode->i_data.backing_dev_info =
whole->bd_inode->i_data.backing_dev_info;
- if (!(disk->flags & GENHD_FL_UP) || !p->nr_sects) {
+ if (!(disk->flags & GENHD_FL_UP) || !p || !p->nr_sects) {
whole->bd_part_count--;
up(&whole->bd_sem);
ret = -ENXIO;
--- linux/fs/partitions/check.c Fri Mar 28 14:32:29 2003
+++ linux.new/fs/partitions/check.c Fri Mar 28 18:22:56 2003
@@ -103,8 +103,8 @@ char *disk_name(struct gendisk *hd, int
}
sprintf(buf, "%s", hd->disk_name);
} else {
- if (hd->part[part-1].de) {
- pos = devfs_generate_path(hd->part[part-1].de, buf, 64);
+ if (hd->part[part-1]->de) {
+ pos = devfs_generate_path(hd->part[part-1]->de, buf, 64);
if (pos >= 0)
return buf + pos;
}
@@ -160,7 +160,7 @@ static void devfs_register_partition(str
{
#ifdef CONFIG_DEVFS_FS
devfs_handle_t dir;
- struct hd_struct *p = dev->part;
+ struct hd_struct *p = dev->parts;
char devname[16];
if (p[part-1].de)
@@ -203,7 +203,7 @@ static struct sysfs_ops part_sysfs_ops =
static ssize_t part_dev_read(struct hd_struct * p, char *page)
{
struct gendisk *disk = container_of(p->kobj.parent,struct gendisk,kobj);
- int part = p - disk->part + 1;
+ int part = p->partno;
dev_t base = MKDEV(disk->major, disk->first_minor);
return sprintf(page, "%04x\n", (unsigned)(base + part));
}
@@ -255,19 +255,30 @@ static struct kobj_type ktype_part = {
void delete_partition(struct gendisk *disk, int part)
{
- struct hd_struct *p = disk->part + part - 1;
+ struct hd_struct *p = disk->part[part-1];
+ if (!p)
+ return;
if (!p->nr_sects)
return;
+ printk("del_partition: disk:%x part:%d \n", disk, part);
p->start_sect = 0;
p->nr_sects = 0;
p->reads = p->writes = p->read_sectors = p->write_sectors = 0;
devfs_unregister(p->de);
kobject_unregister(&p->kobj);
+ disk->part[part-1] = NULL;
+ kfree(p);
}
void add_partition(struct gendisk *disk, int part, sector_t start, sector_t len)
{
- struct hd_struct *p = disk->part + part - 1;
+ struct hd_struct *p;
+ printk("add_partition: disk:%x part:%d start:%d len:%d\n", disk, part, (int)start, (int)len);
+
+ p = kmalloc(sizeof(struct hd_struct), GFP_KERNEL);
+ if (!p) return;
+
+ memset(p, 0, sizeof(struct hd_struct));
p->start_sect = start;
p->nr_sects = len;
@@ -276,6 +287,9 @@ void add_partition(struct gendisk *disk,
p->kobj.parent = &disk->kobj;
p->kobj.ktype = &ktype_part;
kobject_register(&p->kobj);
+
+ p->partno = part;
+ disk->part[part-1] = p;
}
static void disk_sysfs_symlinks(struct gendisk *disk)
Thread overview: 22+ messages
2003-03-21 18:56 [patch for playing] 2.5.65 patch to support > 256 disks Badari Pulavarty
2003-03-22 11:00 ` Douglas Gilbert
2003-03-22 11:04 ` Andrew Morton
2003-03-22 11:46 ` Douglas Gilbert
2003-03-22 12:05 ` Andrew Morton
2003-03-24 21:32 ` Badari Pulavarty
2003-03-24 22:22 ` Douglas Gilbert
2003-03-24 22:54 ` Badari Pulavarty
2003-03-25 0:10 ` Andrew Morton
2003-03-24 22:57 ` Badari Pulavarty
2003-03-25 10:56 ` Jens Axboe
2003-03-25 11:23 ` Jens Axboe
2003-03-25 11:37 ` Jens Axboe
2003-03-25 11:39 ` Nick Piggin
2003-03-25 12:01 ` Jens Axboe
2003-03-25 12:12 ` Nick Piggin
2003-03-25 12:35 ` Jens Axboe
2003-03-27 0:29 ` Badari Pulavarty
2003-03-27 9:18 ` Jens Axboe
2003-03-28 17:04 ` Badari Pulavarty
2003-03-28 18:41 ` Andries Brouwer
2003-03-29 1:39 ` Badari Pulavarty