From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S267300AbTGLHit (ORCPT ); Sat, 12 Jul 2003 03:38:49 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S267888AbTGLHit (ORCPT ); Sat, 12 Jul 2003 03:38:49 -0400 Received: from ns.virtualhost.dk ([195.184.98.160]:47000 "EHLO virtualhost.dk") by vger.kernel.org with ESMTP id S267300AbTGLHir (ORCPT ); Sat, 12 Jul 2003 03:38:47 -0400 Date: Sat, 12 Jul 2003 09:48:27 +0200 From: Jens Axboe To: Chris Mason Cc: Marcelo Tosatti , lkml , "Stephen C. Tweedie" , Alan Cox , Jeff Garzik , Andrew Morton , Andrea Arcangeli , Alexander Viro Subject: Re: RFC on io-stalls patch Message-ID: <20030712074827.GA31308@suse.de> References: <20030710135747.GT825@suse.de> <1057932804.13313.58.camel@tiny.suse.com> <20030712073710.GK843@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030712073710.GK843@suse.de> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 12 2003, Jens Axboe wrote: > On Fri, Jul 11 2003, Chris Mason wrote: > > On Thu, 2003-07-10 at 09:57, Jens Axboe wrote: > > > On Tue, Jul 08 2003, Marcelo Tosatti wrote: > > > > > > > > Hello people, > > > > > > > > To get better IO interactivity and to fix potential SMP IO hangs (due to > > > > missed wakeups) we, (Chris Mason integrated Andrea's work) added > > > > "io-stalls-10" patch in 2.4.22-pre3. > > > > > > > > The "low-latency" patch (which is part of io-stalls-10) seemed to be a > > > > good approach to increase IO fairness. Some people (Alan, AFAIK) are a bit > > > > concerned about that, though. > > > > > > > > Could you guys, Stephen, Andrew and maybe Viro (if interested :)) which > > > > havent been part of the discussions around the IO stalls issue take a look > > > > at the patch, please? > > > > > > > > It seems safe and a good approach to me, but might not be. Or have small > > > > "glitches". > > > > > > Well, I have one naive question. What prevents writes from eating the > > > entire request pool now? In the 2.2 and earlier days, we reserved the > > > last 3rd of the requests to writes. 2.4.1 and later used a split request > > > list to make that same guarentee. > > > > > > I only did a quick read of the patch so maybe I'm missing the new > > > mechanism for this. Are we simply relying on fair (FIFO) request > > > allocation and oversized queue to do its job alone? > > > > Seems that way. With the 2.4.21 code, a read might easily get a > > request, but then spend forever waiting for a huge queue of merged > > writes to get to disk. > > Correct > > > I believe the new way provides better overall read performance in the > > presence of lots of writes. > > I fail to see the logic in that. Reads are now treated fairly wrt > writes, but it would be really easy to let writes consume the entire > capacity of the queue (be it all the requests, or just going oversized). > > I think the oversized logic is flawed right now, and should only apply > to writes. Always let reads get through. And don't let writes consume > the last 1/8th of the requests, or something like that at least. I'll > try and do a patch for pre4. Something simple like this should really be added, imo. Untested. ===== drivers/block/ll_rw_blk.c 1.47 vs edited ===== --- 1.47/drivers/block/ll_rw_blk.c Fri Jul 11 10:30:54 2003 +++ edited/drivers/block/ll_rw_blk.c Sat Jul 12 09:47:32 2003 @@ -549,10 +549,18 @@ static struct request *get_request(request_queue_t *q, int rw) { struct request *rq = NULL; - struct request_list *rl; + struct request_list *rl = &q->rq; - rl = &q->rq; - if (!list_empty(&rl->free) && !blk_oversized_queue(q)) { + /* + * only apply the oversized queue logic to writes. and only let + * writes consume 7/8ths of the queue, always leave room for some + * reads + */ + if ((rw == WRITE) && + blk_oversized_queue(q) || rl->count < q->nr_requests / 8) + return NULL; + + if (!list_empty(&rl->free)) { rq = blkdev_free_rq(&rl->free); list_del(&rq->queue); rl->count--; -- Jens Axboe