All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: axboe@kernel.dk, tytso@mit.edu, djwong@us.ibm.com,
	shli@kernel.org, neilb@suse.de, adilger.kernel@dilger.ca,
	jack@suse.cz, linux-kernel@vger.kernel.org, kmannth@us.ibm.com,
	cmm@us.ibm.com, linux-ext4@vger.kernel.org, rwheeler@redhat.com,
	hch@lst.de, josef@redhat.com, jmoyer@redhat.com
Subject: Re: [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests
Date: Tue, 25 Jan 2011 16:55:01 -0500	[thread overview]
Message-ID: <20110125215500.GA6836@redhat.com> (raw)
In-Reply-To: <20110125204158.GA3013@redhat.com>

On Tue, Jan 25 2011 at  3:41pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> Hi Tejun,
> 
> On Fri, Jan 21 2011 at 10:59am -0500,
> Tejun Heo <tj@kernel.org> wrote:
> > 
> > * As flush requests are never put on the IO scheduler, request fields
> >   used for flush share space with rq->rb_node.  rq->completion_data is
> >   moved out of the union.  This increases the request size by one
> >   pointer.
> > 
> >   As rq->elevator_private* are used only by the iosched too, it is
> >   possible to reduce the request size further.  However, to do that,
> >   we need to modify request allocation path such that iosched data is
> >   not allocated for flush requests.
> 
> I decided to take a crack at using rq->elevator_private* and came up
> with the following patch.
> 
> Unfortunately, in testing I found that flush requests that have data do
> in fact eventually get added to the queue as normal requests, via:
> 1) "data but flush is not necessary" case in blk_insert_flush
> 2) REQ_FSEQ_DATA case in blk_flush_complete_seq

Vivek helped me understand that adding the request to the queue doesn't
mean it goes to the elevator.  It is inserting the request directly to
the underlying queue.

That embarassing oversight aside, the flush request is still getting to
the elevator somehow -- even though elv_set_request() was clearly not
called.

It is an interesting duality that:
1) REQ_ELVPRIV is never set because priv=0 is passed to blk_alloc_request
2) yet when blk_free_request() checks rq->cmd_flags REQ_ELVPRIV is set;
   resulting in the call to elv_put_request()

> I know this because in my following get_request() change to _not_ call
> elv_set_request() for flush requests hit cfq_put_request()'s
> BUG_ON(!cfqq->allocated[rw]).

FYI, here is the backtrace:

PID: 0      TASK: ffff88007ccd6b30  CPU: 1   COMMAND: "swapper"
 #0 [ffff880002103930] show_trace_log_lvl at ffffffff8100f3ec
 #1 [ffff880002103980] delay_tsc at ffffffff8125e62a
 #2 [ffff8800021039b0] __const_udelay at ffffffff8125e5d6
 #3 [ffff8800021039c0] panic at ffffffff814c3604
 #4 [ffff880002103a40] oops_end at ffffffff814c7622
 #5 [ffff880002103a70] die at ffffffff8100f33b
 #6 [ffff880002103aa0] do_trap at ffffffff814c6ec4
 #7 [ffff880002103b00] do_invalid_op at ffffffff8100cee5
 #8 [ffff880002103ba0] invalid_op at ffffffff8100bf5b
    [exception RIP: cfq_put_request+128]
    RIP: ffffffff8124f000  RSP: ffff880002103c58  RFLAGS: 00010046
    RAX: 0000000000000019  RBX: ffff88007b5668a0  RCX: 0000000000002c7b
    RDX: 0000000000000000  RSI: ffff88007b5668a0  RDI: ffff88007b5668a0
    RBP: ffff880002103c68   R8: 0000000000000003   R9: 0000000000000001
    R10: 0000000000000003  R11: 0000000000000003  R12: ffff88007b566940
    R13: 00000000018c2441  R14: 0000000000000001  R15: 00000000000000a5
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff880002103c70] elv_put_request at ffffffff8123208e
#10 [ffff880002103c80] __blk_put_request at ffffffff8123ae23
#11 [ffff880002103cb0] blk_finish_request at ffffffff8123b049
#12 [ffff880002103d00] __blk_end_request_all at ffffffff8123b0fb
#13 [ffff880002103d20] blk_flush_complete_seq at ffffffff8123de4c
#14 [ffff880002103d50] flush_end_io at ffffffff8123e095
#15 [ffff880002103da0] blk_finish_request at ffffffff8123aedb
#16 [ffff880002103df0] __blk_end_request_all at ffffffff8123b0fb
#17 [ffff880002103e10] blk_done at ffffffffa002e085
#18 [ffff880002103e50] vring_interrupt at ffffffffa001f19c
#19 [ffff880002103e70] vp_vring_interrupt at ffffffffa00264bb
#20 [ffff880002103ec0] vp_interrupt at ffffffffa0026544
#21 [ffff880002103ee0] handle_IRQ_event at ffffffff810d10b0
#22 [ffff880002103f30] handle_fasteoi_irq at ffffffff810d38a9
#23 [ffff880002103f60] handle_irq at ffffffff8100dfb9
#24 [ffff880002103f80] do_IRQ at ffffffff814cb32c
--- <IRQ stack> ---
#25 [ffff88007ccf9e28] ret_from_intr at ffffffff8100bad3
    [exception RIP: native_safe_halt+11]
    RIP: ffffffff81033f0b  RSP: ffff88007ccf9ed8  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffff88007ccf9ed8  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: ffffffff81d0f1e8
    RBP: ffffffff8100bace   R8: 0000000000000000   R9: 0000000000000001
    R10: 0000000000000000  R11: 00000000fffba7c1  R12: 0000000000000001
    R13: ffff88007ccf9e68  R14: 0000000281075d93  R15: ffff88007ccf9e98
    ORIG_RAX: ffffffffffffffc4  CS: 0010  SS: 0018
#26 [ffff88007ccf9ee0] default_idle at ffffffff81013e0d
#27 [ffff88007ccf9f00] cpu_idle at ffffffff81009e96

  reply	other threads:[~2011-01-25 21:55 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-21 15:59 [PATCHSET] block: reimplement FLUSH/FUA to support merge Tejun Heo
2011-01-21 15:59 ` Tejun Heo
2011-01-21 15:59 ` [PATCH 1/3] block: add REQ_FLUSH_SEQ Tejun Heo
2011-01-21 15:59   ` Tejun Heo
2011-01-21 15:59 ` [PATCH 2/3] block: improve flush bio completion Tejun Heo
2011-01-21 15:59   ` Tejun Heo
2011-01-21 15:59 ` [PATCH 3/3] block: reimplement FLUSH/FUA to support merge Tejun Heo
2011-01-21 18:56   ` Vivek Goyal
2011-01-21 19:19     ` Vivek Goyal
2011-01-23 10:25     ` Tejun Heo
2011-01-23 10:29       ` Tejun Heo
2011-01-24 20:31       ` Darrick J. Wong
2011-01-25 10:21         ` Tejun Heo
2011-01-25 11:39           ` Jens Axboe
2011-03-23 23:37             ` Darrick J. Wong
2011-01-25 22:56           ` Darrick J. Wong
2011-01-22  0:49   ` Mike Snitzer
2011-01-23 10:31     ` Tejun Heo
2011-01-25 20:46       ` Vivek Goyal
2011-01-25 21:04         ` Mike Snitzer
2011-01-23 10:48   ` [PATCH UPDATED " Tejun Heo
2011-01-23 10:48     ` Tejun Heo
2011-01-25 20:41   ` [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests Mike Snitzer
2011-01-25 20:41     ` Mike Snitzer
2011-01-25 21:55     ` Mike Snitzer [this message]
2011-01-26  5:27       ` [RFC PATCH 4/3] block: skip elevator initialization for flush requests -- was never BUGGY relative to upstream Mike Snitzer
2011-01-26 10:03     ` [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests Tejun Heo
2011-01-26 10:05       ` Tejun Heo
2011-02-01 17:38       ` [RFC " Mike Snitzer
2011-02-01 18:52         ` Tejun Heo
2011-02-01 22:46           ` [PATCH v2 1/2] " Mike Snitzer
2011-02-02 21:51             ` Vivek Goyal
2011-02-02 22:06               ` Mike Snitzer
2011-02-02 22:55             ` [PATCH v3 1/2] block: skip elevator data " Mike Snitzer
2011-02-03  9:28               ` Tejun Heo
2011-02-03 14:48                 ` [PATCH v4 " Mike Snitzer
2011-02-03 13:24               ` [PATCH v3 " Jens Axboe
2011-02-03 13:38                 ` Tejun Heo
2011-02-04 15:04                   ` Vivek Goyal
2011-02-04 15:08                     ` Tejun Heo
2011-02-04 16:58                     ` [PATCH v5 " Mike Snitzer
2011-02-03 14:54                 ` [PATCH v3 " Mike Snitzer
2011-02-01 22:46           ` [PATCH v2 2/2] block: share request flush fields with elevator_private Mike Snitzer
2011-02-01 22:46             ` Mike Snitzer
2011-02-02 21:52             ` Vivek Goyal
2011-02-03  9:24             ` Tejun Heo
2011-02-11 10:08             ` Jens Axboe
2011-01-21 15:59 ` [PATCH 3/3] block: reimplement FLUSH/FUA to support merge Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110125215500.GA6836@redhat.com \
    --to=snitzer@redhat.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=axboe@kernel.dk \
    --cc=cmm@us.ibm.com \
    --cc=djwong@us.ibm.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jmoyer@redhat.com \
    --cc=josef@redhat.com \
    --cc=kmannth@us.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=rwheeler@redhat.com \
    --cc=shli@kernel.org \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.