From: Sumit Saxena <sumit.saxena@broadcom.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: RE: Application stops due to ext4 filesystem IO error
Date: Tue, 13 Jun 2017 19:01:17 +0530	[thread overview]
Message-ID: <4e04ad5952109c9c319383e47c1df73c@mail.gmail.com> (raw)
In-Reply-To: bf603f1c2f3873f58717101abb3d9c83@mail.gmail.com

Gentle ping.

I have opened a kernel BZ for this. Here is the BZ link-
https://bugzilla.kernel.org/show_bug.cgi?id=196057

Thanks,
Sumit
>-----Original Message-----
>From: Sumit Saxena [mailto:sumit.saxena@broadcom.com]
>Sent: Tuesday, June 06, 2017 9:05 PM
>To: 'Jens Axboe'
>Cc: 'linux-block@vger.kernel.org'; 'linux-scsi@vger.kernel.org'
>Subject: RE: Application stops due to ext4 filesystem IO error
>
>Gentle ping..
>
>>-----Original Message-----
>>From: Sumit Saxena [mailto:sumit.saxena@broadcom.com]
>>Sent: Monday, June 05, 2017 12:59 PM
>>To: 'Jens Axboe'
>>Cc: 'linux-block@vger.kernel.org'; 'linux-scsi@vger.kernel.org'
>>Subject: Application stops due to ext4 filesystem IO error
>>
>>Jens,
>>
>>We are observing application stops while running ext4 filesystem IOs
>>with target resets in parallel.
>>We suspect this behavior can be attributed to the Linux block layer.
>>See below for details-
>>
>>Problem statement - "Application stops due to an IO error from file
>>system buffered IO. (Note - it is always a FS metadata read failure.)"
>>Issue is reproducible - "Yes. It is consistently reproducible."
>>Brief about the setup -
>>Latest 4.11 kernel. The issue hits irrespective of whether SCSI MQ is
>>enabled or disabled; use_blk_mq=Y and use_blk_mq=N show the same issue.
>>Four direct-attached SAS/SATA drives connected to a MegaRAID Invader
>>controller.
>>
>>Reproduction steps -
>>-Create an ext4 FS on 4 JBODs (non-RAID volumes) behind the MegaRAID SAS
>>controller.
>>-Start a data integrity test on all four ext4 mounted partitions. (The
>>tool should be configured to send buffered FS IO.)
>>-Send a target reset to each JBOD to simulate the error condition, with
>>some delay between resets to allow some IO to the device (sg_reset -d
>>/dev/sdX); see the sketch of such a reset loop right after this list.
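>>
>>For illustration, a minimal C sketch of such a reset loop, on the
>>assumption that "sg_reset -d" boils down to the SG_SCSI_RESET ioctl with
>>SG_SCSI_RESET_DEVICE; this is only a sketch of the reproduction step,
>>not the exact tool we ran-
>>--------
>>/* Hypothetical reset loop - roughly what "sg_reset -d /dev/sdX" does:
>> * issue a SCSI device reset through the sg SG_SCSI_RESET ioctl and
>> * sleep in between so some IO reaches the disk between resets.
>> */
>>#include <fcntl.h>
>>#include <scsi/sg.h>
>>#include <stdio.h>
>>#include <sys/ioctl.h>
>>#include <unistd.h>
>>
>>int main(int argc, char **argv)
>>{
>>	int fd, i;
>>
>>	if (argc < 2) {
>>		fprintf(stderr, "usage: %s /dev/sdX\n", argv[0]);
>>		return 1;
>>	}
>>	fd = open(argv[1], O_RDWR | O_NONBLOCK);
>>	if (fd < 0) {
>>		perror("open");
>>		return 1;
>>	}
>>	for (i = 0; i < 100; i++) {
>>		int op = SG_SCSI_RESET_DEVICE;	/* what sg_reset -d requests */
>>
>>		if (ioctl(fd, SG_SCSI_RESET, &op) < 0)
>>			perror("SG_SCSI_RESET");
>>		sleep(10);	/* allow some IO on the device between resets */
>>	}
>>	close(fd);
>>	return 0;
>>}
>>--------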
>>
>>End result -
>>The combination of target resets and FS IOs in parallel causes an
>>application halt with an ext4 filesystem IO error.
>>We are able to restart the application without cleaning and unmounting
>>the filesystem.
>>Below are the error logs at the time of the application stop-
>>
>>--------------------------
>>sd 0:0:53:0: target reset called for
>>scmd(ffff88003cf25148)
>>sd 0:0:53:0: attempting target reset!
>>scmd(ffff88003cf25148) tm_dev_handle 0xb
>>sd 0:0:53:0: [sde] tag#519 BRCM Debug: request->cmd_flags: 0x80700
>>bio->bi_flags: 0x2          bio->bi_opf: 0x3000 rq_flags 0x20e3
>>..
>>sd 0:0:53:0: [sde] tag#519 CDB: Read(10) 28 00 15 00 11 10 00 00 f8 00
>>EXT4-fs error (device sde): __ext4_get_inode_loc:4465: inode #11018287:
>>block 44040738: comm chaos: unable to read itable block
>>-----------------------
>>
>>We debugged further to understand what is happening above the LLD. See below-
>>
>>During a target reset, there may be IOs completed by the target with
>>CHECK CONDITION and the below sense information-
>>Sense Key : Aborted Command [current]
>>Add. Sense: No additional sense information
>>
>>Such aborted commands should be retried by the SML/block layer. The SML
>>does retry them, except for FS metadata reads.
>>From driver-level debug, we found that IOs with the REQ_FAILFAST_DEV bit
>>set in scmd->request->cmd_flags are not retried by the SML, and that is
>>also as expected.
>>
>>Below is the code in scsi_error.c (function scsi_noretry_cmd) which
>>causes IOs with REQ_FAILFAST_DEV enabled not to be retried but instead
>>completed back to the upper layer-
>>--------
>>/*
>>     * assume caller has checked sense and determined
>>     * the check condition was retryable.
>>     */
>>    if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
>>        scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)
>>        return 1;
>>    else
>>        return 0;
>>--------
>>
>>1. The IO which causes the application to stop has REQ_FAILFAST_DEV set
>>in "scmd->request->cmd_flags". We noticed that this bit is set for
>>filesystem readahead metadata IOs. To confirm this, we mounted with the
>>option inode_readahead_blks=0 to disable ext4's inode table readahead
>>algorithm and did not observe the issue. The issue does not hit with
>>direct IOs, only with cached/buffered IOs.
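>>
>>Our understanding of where that bit comes from (a trimmed sketch of
>>init_request_from_bio() in block/blk-core.c as we read it, so please
>>treat it as an illustration rather than an exact quote): a readahead bio
>>marks the whole request fail-fast-
>>--------
>>void init_request_from_bio(struct request *req, struct bio *bio)
>>{
>>	/* REQ_RAHEAD on the bio turns into the FAILFAST bits on the
>>	 * request, which is what we later see in
>>	 * scmd->request->cmd_flags.
>>	 */
>>	if (bio->bi_opf & REQ_RAHEAD)
>>		req->cmd_flags |= REQ_FAILFAST_MASK;
>>	...
>>}
>>--------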
>>
>>2. From driver-level debug prints, we also noticed that there are many
>>IO failures with REQ_FAILFAST_DEV which are handled gracefully by the
>>filesystem. The application-level failure happens only if the IO also
>>has RQF_MIXED_MERGE set. If IO merging is disabled through the sysfs
>>parameter for the SCSI device in question (nomerges set to 2), we do
>>not see the issue.
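>>
>>For context on RQF_MIXED_MERGE: our understanding (a trimmed sketch of
>>blk_rq_set_mixed_merge() in block/blk-merge.c, not an exact quote) is
>>that when requests with different fail-fast attributes are merged, the
>>request's FAILFAST bits are pushed down to every bio and the request is
>>flagged as a mixed merge-
>>--------
>>void blk_rq_set_mixed_merge(struct request *rq)
>>{
>>	unsigned int ff = rq->cmd_flags & REQ_FAILFAST_MASK;
>>	struct bio *bio;
>>
>>	if (rq->rq_flags & RQF_MIXED_MERGE)
>>		return;
>>
>>	/* distribute the request's fail-fast attribute to each bio */
>>	for (bio = rq->bio; bio; bio = bio->bi_next)
>>		bio->bi_opf |= ff;
>>	rq->rq_flags |= RQF_MIXED_MERGE;
>>}
>>--------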
>>
>>3. We added a few prints in the driver to dump "scmd->request->cmd_flags"
>>and "scmd->request->rq_flags" for IOs completed with CHECK CONDITION.
>>The culprit IOs have all of these bits set: REQ_FAILFAST_DEV and
>>REQ_RAHEAD in "scmd->request->cmd_flags", and RQF_MIXED_MERGE in
>>"scmd->request->rq_flags". It is not necessarily true that every IO
>>with these three bits set will cause the issue, but whenever the issue
>>hits, these three bits are set on the failing IO.
>>
>>
>>In summary,
>>the FS mechanism of using readahead for metadata works fine (in case of
>>IO failure) if there is no mixed merge at the block layer.
>>The same FS mechanism has a corner case which is not handled properly
>>(in case of IO failure) if the IO was mixed/merged at the block layer.
>>The megaraid_sas driver's behavior seems correct here. The aborted IO
>>goes to the SML with CHECK CONDITION sense, and the SML decides to fail
>>the IO fast, as it was requested to.
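>>
>>One more data point for the query below: our understanding is that on
>>completion the fail-fast attribute of a mixed-merged request is
>>re-derived from its first bio (a trimmed sketch of the relevant hunk in
>>blk_update_request() in block/blk-core.c; an illustration of the path
>>we suspect, not a verified root cause)-
>>--------
>>	/* mixed attributes always follow the first bio */
>>	if (req->rq_flags & RQF_MIXED_MERGE) {
>>		req->cmd_flags &= ~REQ_FAILFAST_MASK;
>>		req->cmd_flags |= req->bio->bi_opf & REQ_FAILFAST_MASK;
>>	}
>>--------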
>>
>>Query - Is this a block layer (page cache) issue? What would be the
>>ideal fix?
>>
>>Thanks,
>>Sumit
