All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <jens.axboe@oracle.com>
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: chris.mason@oracle.com, david@fromorbit.com, hch@infradead.org,
	akpm@linux-foundation.org, jack@suse.cz,
	yanmin_zhang@linux.intel.com, Jens Axboe <jens.axboe@oracle.com>
Subject: [PATCH 11/13] block: disallow merging of read-ahead bits into normal request
Date: Mon, 25 May 2009 09:31:04 +0200	[thread overview]
Message-ID: <1243236668-3398-22-git-send-email-jens.axboe@oracle.com> (raw)
In-Reply-To: <1243236668-3398-1-git-send-email-jens.axboe@oracle.com>

For SSD type devices, the request latency is really low. So for those
types of devices, we may not want to merge the read part of a request into
the read-ahead request that is also generates.

Add code to mpage.c to properly propagate read vs reada information to
the block layer and let the elevator core check and prevent such merges.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/elevator.c |    7 +++++++
 fs/mpage.c       |   30 ++++++++++++++++++++++++------
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 6261b24..17cfaa2 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -67,6 +67,13 @@ static int elv_iosched_allow_merge(struct request *rq, struct bio *bio)
 {
 	struct request_queue *q = rq->q;
 
+	/*
+	 * Disallow merge of a read-ahead bio into a normal request for SSD
+	 */
+	if (blk_queue_nonrot(q) &&
+	    bio_rw_ahead(bio) && !(rq->cmd_flags & REQ_FAILFAST_DEV))
+		return 0;
+
 	if (q->elv_ops.elevator_allow_merge_fn)
 		return elv_call_allow_merge_fn(q, rq, bio);
 
diff --git a/fs/mpage.c b/fs/mpage.c
index 680ba60..d02cf51 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -180,11 +180,18 @@ do_mpage_readpage(struct bio *bio, struct page *page, unsigned nr_pages,
 	unsigned page_block;
 	unsigned first_hole = blocks_per_page;
 	struct block_device *bdev = NULL;
-	int length;
+	int length, rw;
 	int fully_mapped = 1;
 	unsigned nblocks;
 	unsigned relative_block;
 
+	/*
+	 * If there's some read-ahead in this range, be sure to tell
+	 * the block layer about it. We start off as a READ, then switch
+	 * to READA if we spot the read-ahead marker on the page.
+	 */
+	rw = READ;
+
 	if (page_has_buffers(page))
 		goto confused;
 
@@ -289,7 +296,7 @@ do_mpage_readpage(struct bio *bio, struct page *page, unsigned nr_pages,
 	 * This page will go to BIO.  Do we need to send this BIO off first?
 	 */
 	if (bio && (*last_block_in_bio != blocks[0] - 1))
-		bio = mpage_bio_submit(READ, bio);
+		bio = mpage_bio_submit(rw, bio);
 
 alloc_new:
 	if (bio == NULL) {
@@ -301,8 +308,19 @@ alloc_new:
 	}
 
 	length = first_hole << blkbits;
-	if (bio_add_page(bio, page, length, 0) < length) {
-		bio = mpage_bio_submit(READ, bio);
+
+	/*
+	 * If this is an SSD, don't merge the read-ahead part of the IO
+	 * with the actual request. We want the interesting part to complete
+	 * as quickly as possible.
+	 */
+	if (blk_queue_nonrot(bdev_get_queue(bdev)) &&
+	    bio->bi_size && PageReadahead(page)) {
+		bio = mpage_bio_submit(rw, bio);
+		rw = READA;
+		goto alloc_new;
+	} else if (bio_add_page(bio, page, length, 0) < length) {
+		bio = mpage_bio_submit(rw, bio);
 		goto alloc_new;
 	}
 
@@ -310,7 +328,7 @@ alloc_new:
 	nblocks = map_bh->b_size >> blkbits;
 	if ((buffer_boundary(map_bh) && relative_block == nblocks) ||
 	    (first_hole != blocks_per_page))
-		bio = mpage_bio_submit(READ, bio);
+		bio = mpage_bio_submit(rw, bio);
 	else
 		*last_block_in_bio = blocks[blocks_per_page - 1];
 out:
@@ -318,7 +336,7 @@ out:
 
 confused:
 	if (bio)
-		bio = mpage_bio_submit(READ, bio);
+		bio = mpage_bio_submit(rw, bio);
 	if (!PageUptodate(page))
 	        block_read_full_page(page, get_block);
 	else
-- 
1.6.3.rc0.1.gf800


  parent reply	other threads:[~2009-05-25  7:35 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-25  7:30 [PATCH 0/12] Per-bdi writeback flusher threads #5 Jens Axboe
2009-05-25  7:30 ` [PATCH 01/13] libata: get rid of ATA_MAX_QUEUE loop in ata_qc_complete_multiple() Jens Axboe
2009-05-25  7:30 ` [PATCH 01/12] ntfs: remove old debug check for dirty data in ntfs_put_super() Jens Axboe
2009-05-25  7:30 ` [PATCH 02/13] block: add static rq allocation cache Jens Axboe
2009-05-25  7:30 ` [PATCH 02/12] btrfs: properly register fs backing device Jens Axboe
2009-05-25  7:30 ` [PATCH 03/13] scsi: unify allocation of scsi command and sense buffer Jens Axboe
2009-05-25  7:41   ` Christoph Hellwig
2009-05-25  7:46     ` Jens Axboe
2009-05-25  7:50       ` Christoph Hellwig
2009-05-25  7:54         ` Jens Axboe
2009-05-25 10:33         ` Boaz Harrosh
2009-05-25 10:42           ` Christoph Hellwig
2009-05-25 10:49             ` Jens Axboe
2009-05-26  4:36         ` FUJITA Tomonori
2009-05-26  5:08           ` FUJITA Tomonori
2009-05-25  8:15   ` Pekka Enberg
2009-05-25  8:15     ` Pekka Enberg
2009-05-25 11:32     ` Nick Piggin
2009-05-25  9:28   ` Boaz Harrosh
2009-05-26  1:45     ` Roland Dreier
2009-05-26  4:36       ` FUJITA Tomonori
2009-05-26  6:29         ` Jens Axboe
2009-05-26  7:25           ` FUJITA Tomonori
2009-05-26  7:32             ` Jens Axboe
2009-05-26  7:38               ` FUJITA Tomonori
2009-05-26 14:47                 ` James Bottomley
2009-05-26 15:13                   ` Matthew Wilcox
2009-05-26 15:31                   ` FUJITA Tomonori
2009-05-26 16:05                     ` Boaz Harrosh
2009-05-27  1:36                       ` FUJITA Tomonori
2009-05-27  7:54                         ` Boaz Harrosh
2009-05-27  8:26                           ` FUJITA Tomonori
2009-05-27  9:11                             ` Boaz Harrosh
2009-05-26 16:12                   ` Boaz Harrosh
2009-05-26 16:28                     ` Boaz Harrosh
2009-05-26  7:56               ` FUJITA Tomonori
2009-05-26  5:23     ` FUJITA Tomonori
2009-05-25  7:30 ` [PATCH 03/12] writeback: move dirty inodes from super_block to backing_dev_info Jens Axboe
2009-05-25  7:30 ` [PATCH 04/13] scsi: get rid of lock in __scsi_put_command() Jens Axboe
2009-05-25  7:30 ` [PATCH 04/12] writeback: switch to per-bdi threads for flushing data Jens Axboe
2009-05-25  7:30 ` [PATCH 05/13] aio: mostly crap Jens Axboe
2009-05-25  9:09   ` Jan Kara
2009-05-25  7:30 ` [PATCH 05/12] writeback: get rid of pdflush completely Jens Axboe
2009-05-25  7:30 ` [PATCH 06/13] block: move elevator ops into the queue Jens Axboe
2009-05-25  7:30 ` [PATCH 06/12] writeback: separate the flushing state/task from the bdi Jens Axboe
2009-05-25  7:30 ` [PATCH 07/13] block: avoid indirect calls to enter cfq io scheduler Jens Axboe
2009-05-26  9:02   ` Nikanth K
2009-05-26  9:02     ` Nikanth K
2009-05-25  7:30 ` [PATCH 07/12] writeback: support > 1 flusher thread per bdi Jens Axboe
2009-05-25  7:30 ` [PATCH 08/13] block: change the tag sync vs async restriction logic Jens Axboe
2009-05-25  7:30 ` [PATCH 08/12] writeback: include default_backing_dev_info in writeback Jens Axboe
2009-05-25  7:31 ` [PATCH 09/13] libata: switch to using block layer tagging support Jens Axboe
2009-05-25  7:31 ` [PATCH 09/12] writeback: allow sleepy exit of default writeback task Jens Axboe
2009-05-25  7:31 ` [PATCH 10/13] block: add function for waiting for a specific free tag Jens Axboe
2009-05-25  7:31 ` [PATCH 10/12] writeback: add some debug inode list counters to bdi stats Jens Axboe
2009-05-25  7:31 ` Jens Axboe [this message]
2009-05-25  7:31 ` [PATCH 11/12] writeback: add name to backing_dev_info Jens Axboe
2009-05-25  7:31 ` [PATCH 12/13] block: first cut at implementing a NAPI approach for block devices Jens Axboe
2009-05-25  7:31 ` [PATCH 12/12] writeback: check for registered bdi in flusher add and inode dirty Jens Axboe
2009-05-25  7:31 ` [PATCH 13/13] block: unlocked completion test patch Jens Axboe
2009-05-25  7:33 ` [PATCH 0/12] Per-bdi writeback flusher threads #5 Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1243236668-3398-22-git-send-email-jens.axboe@oracle.com \
    --to=jens.axboe@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=chris.mason@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yanmin_zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.