linux-kernel.vger.kernel.org archive mirror
From: Jens Axboe <axboe@suse.de>
To: Nick Piggin <piggin@cyberone.com.au>
Cc: Andrew Morton <akpm@digeo.com>,
	dougg@torque.net, pbadari@us.ibm.com,
	linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [patch for playing] 2.5.65 patch to support > 256 disks
Date: Tue, 25 Mar 2003 13:01:21 +0100
Message-ID: <20030325120121.GV2371@suse.de>
In-Reply-To: <3E803FDF.1070401@cyberone.com.au>

On Tue, Mar 25 2003, Nick Piggin wrote:
> 
> 
> Jens Axboe wrote:
> 
> >On Sat, Mar 22 2003, Andrew Morton wrote:
> >
> >>Douglas Gilbert <dougg@torque.net> wrote:
> >>
> >>>Andrew Morton wrote:
> >>>
> >>>>Douglas Gilbert <dougg@torque.net> wrote:
> >>>>
> >>>>
> >>>>>>Slab:           464364 kB
> >>>>>>
> >>>>It's all in slab.
> >>>>
> >>>>
> >>>>
> >>>>>I did notice a rather large growth of nodes
> >>>>>in sysfs. For 84 added scsi_debug pseudo disks the number
> >>>>>of sysfs nodes went from 686 to 3347.
> >>>>>
> >>>>>Does anybody know what is the per node memory cost of sysfs?
> >>>>>
> >>>>
> >>>>Let's see all of /proc/slabinfo please.
> >>>>
> >>>Andrew,
> >>>Attachments are /proc/slabinfo pre and post:
> >>> $ modprobe scsi_debug add_host=42 num_devs=2
> >>>which adds 84 pseudo disks.
> >>>
> >>>
> >>OK, thanks.  So with 84 disks you've lost five megabytes to
> >>blkdev_requests and deadline_drq objects.  With 4000 disks, you're
> >>toast.  That's enough request structures to put 200 gigabytes of
> >>memory under I/O ;)
> >>
> >>We need to make the request structures dynamically allocated for other
> >>reasons (which I cannot immediately remember) but it didn't happen.  I
> >>guess we have some motivation now.
> >>
> >
> >Here's a patch that makes the request allocation (and io scheduler
> >private data) dynamic, with lower and upper bounds of 4 and 256
> >respectively. The numbers are a bit random - the 4 will allow us to make
> >progress, but it might be a smidgen too low. Perhaps 8 would be good.
> >256 is twice as much as before, but that should be alright as long as
> >the io scheduler copes. BLKDEV_MIN_RQ and BLKDEV_MAX_RQ control these
> >two bounds.
> >
> >We lose the old batching functionality, for now. I can resurrect that
> >if needed. The mempool is a rough fit, it doesn't _quite_ match our
> >needs here. I'll probably end up doing a specialised block pool scheme
> >for this.
> >
> >Hasn't been tested all that much, it boots though :-)
> >
> Nice Jens. Very good in theory but I haven't looked at the
> code too much yet.
> 
> Would it be possible to have all queues allocate out of
> the one global pool of free requests? This way you could
> have a big minimum (say 128) and a big maximum
> (say min(Mbytes, spindles)).

Well, not really. As far as I can see we _need_ a pool per queue. Imagine
a bio handed to raid that needs to be split across 6 different queues,
while our minimum is only 4: that's a possible deadlock. It could probably
be made to work, however I greatly prefer a per-queue reserve.
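
A toy model of the case above, as a sketch only. This is not code from
the patch: struct rq_pool, pool_get(), split_shared() and
split_per_queue() are made-up names, and the model assumes no request is
freed until every piece of the split bio has been handed a request. With
one shared reserve of 4, a 6-way split can only get 4 requests; with a
reserve of 4 per queue, every piece gets one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: a pool is a count of reserved requests. */
struct rq_pool {
	int free;	/* requests guaranteed to be available */
};

/* Take one request from the pool; false means the reserve is exhausted. */
static bool pool_get(struct rq_pool *p)
{
	if (p->free == 0)
		return false;
	p->free--;
	return true;
}

/*
 * Split one bio across nr_queues queues that all draw from one shared pool.
 * Model assumption: nothing is freed until the whole split is submitted.
 * Returns how many pieces obtained a request.
 */
static int split_shared(struct rq_pool *shared, int nr_queues)
{
	int got = 0;

	assert(nr_queues >= 0);
	for (int i = 0; i < nr_queues; i++)
		if (pool_get(shared))
			got++;
	return got;
}

/* Same split, but queue i draws only from its own reserve pools[i]. */
static int split_per_queue(struct rq_pool *pools, int nr_queues)
{
	int got = 0;

	assert(nr_queues >= 0);
	for (int i = 0; i < nr_queues; i++)
		if (pool_get(&pools[i]))
			got++;
	return got;
}
```

In the shared case the two pieces that got nothing would have to wait on
requests held by the same bio, which is where the deadlock possibility
comes from in this model.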

> This way memory usage is decoupled from the number of
> queues, and busy spindles could make use of more
> available free requests.
> 
> Oh and the max value can easily be runtime tunable, right?

Sure. However, they don't really mean _anything_. Max is just some
random number to prevent one queue from going nuts, and could be removed
completely if the vm worked perfectly. Beyond some limit there's little
benefit to allowing more requests, though. But MAX could be runtime
tunable. Min is basically just there to make sure we don't kill
ourselves, I don't see any point in making that runtime tunable. It's
not really a tunable.
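
Roughly what a tunable max with a fixed min could look like, as a sketch
only. BLKDEV_MIN_RQ and BLKDEV_MAX_RQ are the names from the patch, with
the values quoted above; struct rq_limits_sketch and set_max_rq() are
hypothetical and only show the idea that the max may move at runtime
while the min stays a compile-time floor.

```c
#include <assert.h>

#define BLKDEV_MIN_RQ	4	/* fixed floor, needed to make progress */
#define BLKDEV_MAX_RQ	256	/* default ceiling, a somewhat random number */

/* Hypothetical per-queue limits, for illustration only. */
struct rq_limits_sketch {
	int max_rq;	/* runtime-tunable upper bound on requests */
};

/*
 * Set a new max for one queue. The max is never allowed below the
 * fixed min, so the queue always keeps its reserve.
 * Returns the value actually stored.
 */
static int set_max_rq(struct rq_limits_sketch *q, int requested)
{
	assert(q);
	if (requested < BLKDEV_MIN_RQ)
		requested = BLKDEV_MIN_RQ;
	q->max_rq = requested;
	return q->max_rq;
}
```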

-- 
Jens Axboe

