From: "Elliott, Robert (Server Storage)" <Elliott@hp.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Jeff Moyer <jmoyer@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	"dgilbert@interlog.com" <dgilbert@interlog.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Bart Van Assche <bvanassche@fusionio.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: scsi-mq V2
Date: Sat, 12 Jul 2014 21:50:43 +0000
Message-ID: <94D0CD8314A33A4D9D801C0FE68B402958B9969F@G9W0745.americas.hpqcorp.net>
In-Reply-To: <20140711145524.GB12478@kvack.org>



> -----Original Message-----
> From: Benjamin LaHaise [mailto:bcrl@kvack.org]
> Sent: Friday, 11 July, 2014 9:55 AM
> To: Elliott, Robert (Server Storage)
> Cc: Christoph Hellwig; Jeff Moyer; Jens Axboe; dgilbert@interlog.com; James
> Bottomley; Bart Van Assche; linux-scsi@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: scsi-mq V2
...
> Can you try the below totally untested patch instead?  It looks like
> put_reqs_available() is not irq-safe.
> 

With that addition alone, fio still runs into the same problem.

I added the same fix to get_reqs_available(), which also accesses
kcpu->reqs_available, and the test has now run for 35 minutes with
no problem.
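
To illustrate why an unprotected update of a per-CPU counter can go
wrong, here is a minimal userspace model of the suspected race.  It is
a sketch, not kernel code: model_counter, irq_handler_put() and
racy_put() are hypothetical names, and the "interrupt" is simulated by
calling the handler between the load and the store that make up a
plain "+=".

```c
#include <stdbool.h>

/* Hypothetical stand-in for kcpu->reqs_available on one CPU. */
static unsigned model_counter;

/* Simulated completion interrupt that returns one request slot. */
static void irq_handler_put(void)
{
	model_counter += 1;
}

/*
 * Model of "counter += nr" split into its load and store steps.
 * If irq_fires is true, the simulated interrupt runs between the two
 * steps; this is the window that local_irq_save()/local_irq_restore()
 * closes in the kernel.
 */
static void racy_put(unsigned nr, bool irq_fires)
{
	unsigned tmp = model_counter;	/* load */

	if (irq_fires)
		irq_handler_put();	/* interrupt updates the counter */

	model_counter = tmp + nr;	/* store overwrites that update */
}
```

Starting from zero, racy_put(2, true) leaves the model counter at 2
rather than 3: the slot returned by the simulated interrupt is lost.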

Patch applied:

diff --git a/fs/aio.c b/fs/aio.c
index e59bba8..8e85e26 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -830,16 +830,20 @@ void exit_aio(struct mm_struct *mm)
 static void put_reqs_available(struct kioctx *ctx, unsigned nr)
 {
 	struct kioctx_cpu *kcpu;
+	unsigned long flags;
 
 	preempt_disable();
 	kcpu = this_cpu_ptr(ctx->cpu);
 
+	local_irq_save(flags);
 	kcpu->reqs_available += nr;
+
 	while (kcpu->reqs_available >= ctx->req_batch * 2) {
 		kcpu->reqs_available -= ctx->req_batch;
 		atomic_add(ctx->req_batch, &ctx->reqs_available);
 	}
 
+	local_irq_restore(flags);
 	preempt_enable();
 }
 
@@ -847,10 +851,12 @@ static bool get_reqs_available(struct kioctx *ctx)
 {
 	struct kioctx_cpu *kcpu;
 	bool ret = false;
+	unsigned long flags;
 
 	preempt_disable();
 	kcpu = this_cpu_ptr(ctx->cpu);
 
+	local_irq_save(flags);
 	if (!kcpu->reqs_available) {
 		int old, avail = atomic_read(&ctx->reqs_available);
 
@@ -869,6 +875,7 @@ static bool get_reqs_available(struct kioctx *ctx)
 	ret = true;
 	kcpu->reqs_available--;
 out:
+	local_irq_restore(flags);
 	preempt_enable();
 	return ret;
 }

--
I will see if that solves the problem with the scsi-mq-3 tree, or 
at least some of the bisect trees leading up to it.
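
For readers following along, the batching that these two functions
implement can be modelled in userspace roughly as below.  This is a
simplified single-CPU sketch under stated assumptions, not the code in
fs/aio.c: struct model_ctx and the model_* functions are hypothetical
names, C11 atomics stand in for the kernel's atomic_t, and the refill
loop in model_get_reqs() is my reading of the part of
get_reqs_available() that the hunks above do not show.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified model of struct kioctx with a single CPU. */
struct model_ctx {
	atomic_int reqs_available;	/* shared pool, like ctx->reqs_available */
	unsigned cpu_reqs_available;	/* like kcpu->reqs_available */
	unsigned req_batch;		/* like ctx->req_batch */
};

/* Return nr slots locally; spill whole batches back to the shared pool. */
static void model_put_reqs(struct model_ctx *ctx, unsigned nr)
{
	ctx->cpu_reqs_available += nr;

	while (ctx->cpu_reqs_available >= ctx->req_batch * 2) {
		ctx->cpu_reqs_available -= ctx->req_batch;
		atomic_fetch_add(&ctx->reqs_available, (int)ctx->req_batch);
	}
}

/* Take one slot; refill the local count by one batch when it is empty. */
static bool model_get_reqs(struct model_ctx *ctx)
{
	if (!ctx->cpu_reqs_available) {
		int avail = atomic_load(&ctx->reqs_available);

		do {
			if (avail < (int)ctx->req_batch)
				return false;	/* shared pool exhausted */
			/* on failure, avail is reloaded and rechecked */
		} while (!atomic_compare_exchange_weak(&ctx->reqs_available,
				&avail, avail - (int)ctx->req_batch));

		ctx->cpu_reqs_available += ctx->req_batch;
	}

	ctx->cpu_reqs_available--;
	return true;
}

/* Slots not currently handed out: shared pool plus local count. */
static int model_total(struct model_ctx *ctx)
{
	return atomic_load(&ctx->reqs_available) +
	       (int)ctx->cpu_reqs_available;
}
```

In this model every successful get lowers model_total() by one and
every put raises it by nr, as long as the local read-modify-write is
not interrupted.  An interrupted update breaks that accounting.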

A few other comments:

1. Those changes boost _raw_spin_lock_irqsave into first place
in perf top:

  6.59%  [kernel]                    [k] _raw_spin_lock_irqsave
  4.37%  [kernel]                    [k] put_compound_page
  2.87%  [scsi_debug]                [k] sdebug_q_cmd_hrt_complete
  2.74%  [kernel]                    [k] _raw_spin_lock
  2.73%  [kernel]                    [k] apic_timer_interrupt
  2.41%  [kernel]                    [k] do_blockdev_direct_IO
  2.24%  [kernel]                    [k] __get_page_tail
  1.97%  [kernel]                    [k] _raw_spin_unlock_irqrestore
  1.87%  [kernel]                    [k] scsi_queue_rq
  1.76%  [scsi_debug]                [k] schedule_resp

Maybe (later) kcpu->reqs_available should be converted to an atomic,
like ctx->reqs_available, to reduce that overhead?

2. After the f8567a3 patch, aio_complete has one early return that 
bypasses the call to put_reqs_available.  Is that OK, or does
that mean that sync iocbs will now eat up reqs_available?

        /*
         * Special case handling for sync iocbs:
         *  - events go directly into the iocb for fast handling
         *  - the sync task with the iocb in its stack holds the single iocb
         *    ref, no other paths have a way to get another ref
         *  - the sync task helpfully left a reference to itself in the iocb
         */
        if (is_sync_kiocb(iocb)) {
                iocb->ki_user_data = res;
                smp_wmb();
                iocb->ki_ctx = ERR_PTR(-EXDEV);
                wake_up_process(iocb->ki_obj.tsk);
                return;
        }


3. The f8567a3 patch renders this comment in aio.c out of date:
reqs_available is no longer incremented when the io_event is pulled
off the ringbuffer, but is now incremented when aio_complete is called.

        struct {
                /*
                 * This counts the number of available slots in the ringbuffer,
                 * so we avoid overflowing it: it's decremented (if positive)
                 * when allocating a kiocb and incremented when the resulting
                 * io_event is pulled off the ringbuffer.
                 *
                 * We batch accesses to it with a percpu version.
                 */
                atomic_t        reqs_available;
        } ____cacheline_aligned_in_smp;


---
Rob Elliott    HP Server Storage



