All of lore.kernel.org
 help / color / mirror / Atom feed
From: Douglas Gilbert <dgilbert@interlog.com>
To: "Elliott, Robert (Server Storage)" <Elliott@hp.com>,
	Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>
Cc: SCSI development list <linux-scsi@vger.kernel.org>
Subject: Re: lk 3.17-rc4 blk_mq large write problems
Date: Thu, 11 Sep 2014 01:37:03 -0400	[thread overview]
Message-ID: <541134FF.40807@interlog.com> (raw)
In-Reply-To: <94D0CD8314A33A4D9D801C0FE68B402958C72114@G9W0745.americas.hpqcorp.net>

On 14-09-10 11:48 PM, Elliott, Robert (Server Storage) wrote:
>
>
>> -----Original Message-----
>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
>> owner@vger.kernel.org] On Behalf Of Jens Axboe
>> Sent: Wednesday, September 10, 2014 9:00 PM
>> To: dgilbert@interlog.com; Christoph Hellwig
>> Cc: SCSI development list
>> Subject: Re: lk 3.17-rc4 blk_mq large write problems
>>
>> On 2014-09-10 18:58, Douglas Gilbert wrote:
>>> On 14-09-10 11:41 AM, Christoph Hellwig wrote:
>>>> While it might not help with a blown stack, can you give the patch
>> below
>>>> a try?  I tries to solve a problem where the timeout handler hits
>>>> before we've fully set up a command.  While I'd like to understand
>> the
>>>> root cause of why we're hitting it as well, I'd also really to fix
>> that
>>>> race. It would also be good to get a gdb listing of the exact area in
>>>> scsi_times_out listed in the oops.
>>>
>>> RIP: 0010:[<ffffffff8127cd2e>]  [<ffffffff8127cd2e>]
>>> scsi_times_out+0xe/0x2e0
>>>
>>> (gdb) disassemble scsi_times_out
>>> Dump of assembler code for function scsi_times_out:
>>>      0xffffffff8127d030 <+0>:    push   %rbp
>>>      0xffffffff8127d031 <+1>:    mov    $0x2007,%esi
>>>      0xffffffff8127d036 <+6>:    push   %rbx
>>>      0xffffffff8127d037 <+7>:    mov    0xf8(%rdi),%rbx
>>>      0xffffffff8127d03e <+14>:    mov    (%rbx),%rax
>>>      0xffffffff8127d041 <+17>:    mov    %rbx,%rdi
>>>      0xffffffff8127d044 <+20>:    mov    (%rax),%rbp
>>>      0xffffffff8127d047 <+23>:    callq  0xffffffff81277c70
>>> <scsi_log_completion>
>>>      0xffffffff8127d04c <+28>:    cmpl   $0xffffffff,0x154(%rbp)
>>>      0xffffffff8127d053 <+35>:    je     0xffffffff8127d05f
>>> <scsi_times_out+47>
>>> ...
>>>
>>> which seems to agree 'objdump -drS scsi_error.o':
>>>
>>> 00000000000028b0 <scsi_times_out>:
>>>       28b0:    55                       push   %rbp
>>>       28b1:    be 07 20 00 00           mov    $0x2007,%esi
>>>       28b6:    53                       push   %rbx
>>>       28b7:    48 8b 9f f8 00 00 00     mov    0xf8(%rdi),%rbx
>>>       28be:    48 8b 03                 mov    (%rbx),%rax
>>>       28c1:    48 89 df                 mov    %rbx,%rdi
>>>       28c4:    48 8b 28                 mov    (%rax),%rbp
>>>       28c7:    e8 00 00 00 00           callq  28cc
>> <scsi_times_out+0x1c>
>>>               28c8: R_X86_64_PC32    scsi_log_completion-0x4
>>>       28cc:    83 bd 54 01 00 00 ff     cmpl   $0xffffffff,0x154(%rbp)
>>
>> This would be more useful if you had DEBUGINFO enabled. At least it
>> would save use some time :-)
>>
>
> On my system, that function compiles to:
>
> enum blk_eh_timer_return scsi_times_out(struct request *req)
> {
>      2580:       55                      push   %rbp
>      2581:       48 89 e5                mov    %rsp,%rbp
>      2584:       41 55                   push   %r13
>      2586:       41 54                   push   %r12
>      2588:       53                      push   %rbx
>      2589:       48 83 ec 08             sub    $0x8,%rsp
>      258d:       e8 00 00 00 00          callq  2592 <scsi_times_out+0x12>
>                          258e: R_X86_64_PC32     mcount-0x4
>          struct scsi_cmnd *scmd = req->special;
>      2592:       48 8b 9f f8 00 00 00    mov    0xf8(%rdi),%rbx
>          enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
>          struct Scsi_Host *host = scmd->device->host;
>      2599:       48 8b 03                mov    (%rbx),%rax
>      259c:       4c 8b 20                mov    (%rax),%r12
>      259f:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>
> So, Doug's +0xe corresponds to +0x19 here.
>
> 258d-2591 and the extra code preceding them are for stack
> overflow checking, which I have enabled, and the location
> for the optional ftrace jump if tracing this function is
> turned on.
>
> 2592 is scmd = req->special.  %rdi is req, the first and
> only function argument (per the x86_64 abi).  0xf8(%rdi)
> is referencing req->special.  req must be OK, since that's
> not the instruction pointer address that failed.
>
> 2599 is dereferencing req->special (i.e., scmd) to set host.
> req->special must contain NULL, causing the exception
> at this IP.
>
> As a short-term workaround, it might be better to just let
> this function exit if scmd is NULL, assuming that it was
> already dealt with properly.  That would still be misled
> if the race condition, whatever it may be, allows scmd to
> be reused before this function runs; this function would
> then operate on a future command.

Switching back to a SATA SSD gives me the same
"scsi_times_out+0xe" NULL de-reference oops shown at the
start of this thread.

(gdb) list *(scsi_times_out+0xe)
0xffffffff812804fe is in scsi_times_out (drivers/scsi/scsi_error.c:277).
272	 */
273	enum blk_eh_timer_return scsi_times_out(struct request *req)
274	{
275		struct scsi_cmnd *scmd = req->special;
276		enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
277		struct Scsi_Host *host = scmd->device->host;
278	
279		trace_scsi_dispatch_cmd_timeout(scmd);
280		scsi_log_completion(scmd, TIMEOUT_ERROR);

As Rob suggested.



  reply	other threads:[~2014-09-11  5:37 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-10  3:55 lk 3.17-rc4 blk_mq large write problems Douglas Gilbert
2014-09-10 15:41 ` Christoph Hellwig
2014-09-10 16:47   ` Jens Axboe
2014-09-10 18:09     ` Christoph Hellwig
2014-09-10 18:26       ` Jens Axboe
2014-09-10 18:40         ` Christoph Hellwig
2014-09-10 18:42           ` Jens Axboe
2014-09-11  0:58   ` Douglas Gilbert
2014-09-11  2:00     ` Jens Axboe
2014-09-11  3:48       ` Elliott, Robert (Server Storage)
2014-09-11  5:37         ` Douglas Gilbert [this message]
2014-09-17 17:04           ` Christoph Hellwig
2014-09-22 23:14 ` Douglas Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=541134FF.40807@interlog.com \
    --to=dgilbert@interlog.com \
    --cc=Elliott@hp.com \
    --cc=axboe@kernel.dk \
    --cc=hch@infradead.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.