* Application stops due to ext4 filesystem IO error
@ 2017-06-05  7:28 Sumit Saxena
From: Sumit Saxena @ 2017-06-05  7:28 UTC
  To: Jens Axboe; +Cc: linux-block, linux-scsi

Jens,

We are observing application stops while running ext4 filesystem IOs with
target resets in parallel.
We suspect this behavior can be attributed to the Linux block layer. See
below for details -

Problem statement - "Application stops due to an IO error from filesystem
buffered IO. (Note - it is always a FS metadata read failure.)"
Issue is reproducible - "Yes. It is consistently reproducible."
Brief about setup -
Latest 4.11 kernel. The issue hits irrespective of whether SCSI MQ is
enabled or disabled; use_blk_mq=Y and use_blk_mq=N show the same issue.
Four direct-attached SAS/SATA drives connected to a MegaRAID Invader
controller.

Reproduction steps (a minimal C sketch of the reset loop follows the list) -
- Create an ext4 FS on 4 JBODs (non-RAID volumes) behind the MegaRAID SAS
controller.
- Start a data integrity test on all four ext4-mounted partitions. (The
tool should be configured to send buffered FS IO.)
- Send a target reset on each JBOD to simulate the error condition
(sg_reset -d /dev/sdX), with some delay between resets to allow some IO to
the device.
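
The sketch below drives the reset step from C, using the SG_SCSI_RESET
ioctl that sg_reset wraps. The device path, iteration count, and delay are
placeholders for whatever the test plan uses -
--------
/* Sketch: periodically issue a SCSI device reset via the SG_SCSI_RESET
 * ioctl, roughly what "sg_reset -d /dev/sdX" does. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
	int i, fd = open("/dev/sdX", O_RDWR | O_NONBLOCK); /* placeholder */

	if (fd < 0) {
		perror("open");
		return 1;
	}
	for (i = 0; i < 10; i++) {
		int op = SG_SCSI_RESET_DEVICE;	/* sg_reset -d */

		if (ioctl(fd, SG_SCSI_RESET, &op) < 0)
			perror("SG_SCSI_RESET");
		sleep(5);	/* leave time for IO between resets */
	}
	close(fd);
	return 0;
}
--------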

End result -
The combination of target resets and FS IOs in parallel causes an
application halt with an ext4 filesystem IO error.
We are able to restart the application without cleaning and unmounting the
filesystem.
Below are the error logs at the time of the application stop -

--------------------------
sd 0:0:53:0: target reset called for
scmd(ffff88003cf25148)
sd 0:0:53:0: attempting target reset!
scmd(ffff88003cf25148) tm_dev_handle 0xb
sd 0:0:53:0: [sde] tag#519 BRCM Debug: request->cmd_flags: 0x80700
bio->bi_flags: 0x2          bio->bi_opf: 0x3000 rq_flags 0x20e3
..
sd 0:0:53:0: [sde] tag#519 CDB: Read(10) 28 00 15 00 11 10 00 00 f8 00
EXT4-fs error (device sde): __ext4_get_inode_loc:4465: inode #11018287:
block 44040738: comm chaos: unable to read itable block
-----------------------

We debugged further to understand what is happening above the LLD. See below -

During a target reset, there may be IOs completed by the target with CHECK
CONDITION and the below sense information -
Sense Key : Aborted Command [current]
Add. Sense: No additional sense information

Such aborted commands should be retried by the SML/block layer. This
happens in the SML except for FS metadata reads.
From driver-level debug, we found that IOs with the REQ_FAILFAST_DEV bit
set in scmd->request->cmd_flags are not retried by the SML, which is also
as expected.

Below is the code in scsi_error.c (function scsi_noretry_cmd) which causes
IOs with REQ_FAILFAST_DEV enabled to not be retried but completed back to
the upper layer -
--------
	/*
	 * assume caller has checked sense and determined
	 * the check condition was retryable.
	 */
	if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
	    scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)
		return 1;
	else
		return 0;
--------

1. The IO which causes the application to stop has REQ_FAILFAST_DEV
enabled in "scmd->request->cmd_flags". We noticed that this bit is set for
filesystem readahead metadata IOs. To confirm this, we mounted with the
option inode_readahead_blks=0 to disable ext4's inode table readahead
algorithm, and the issue was no longer observed. The issue does not hit
with DIRECT IOs, only with cached/buffered IOs.
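
For context on where that bit comes from: in 4.11 the block layer marks
the whole request failfast when it is built from a readahead bio. Below is
a paraphrased sketch of init_request_from_bio() in block/blk-core.c, not a
verbatim quote -
--------
void init_request_from_bio(struct request *req, struct bio *bio)
{
	/* A readahead bio makes the request it seeds failfast, which is
	 * why inode-table readahead reads carry REQ_FAILFAST_DEV. */
	if (bio->bi_opf & REQ_RAHEAD)
		req->cmd_flags |= REQ_FAILFAST_MASK;

	/* ...remaining request setup elided... */
}
--------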

2. From driver-level debug prints, we also noticed that there are many IO
failures with REQ_FAILFAST_DEV which are handled gracefully by the
filesystem. The application-level failure happens only if the IO has
RQF_MIXED_MERGE set. If IO merging is disabled through the sysfs parameter
for the SCSI device in question (nomerges set to 2), we do not see the
issue.
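
This matches the mixed-merge handling in the block layer: when bios with
different failfast attributes are merged into one request, the request is
marked RQF_MIXED_MERGE and its failfast bits are copied onto every bio it
contains. A paraphrased sketch of blk_rq_set_mixed_merge() from
block/blk-merge.c (not a verbatim quote) -
--------
void blk_rq_set_mixed_merge(struct request *rq)
{
	unsigned int ff = rq->cmd_flags & REQ_FAILFAST_MASK;
	struct bio *bio;

	if (rq->rq_flags & RQF_MIXED_MERGE)
		return;

	/* Distribute the request's failfast attributes to each bio, so a
	 * plain metadata read merged next to a readahead read becomes
	 * failfast as well -- the case observed above. */
	for (bio = rq->bio; bio; bio = bio->bi_next)
		bio->bi_opf |= ff;
	rq->rq_flags |= RQF_MIXED_MERGE;
}
--------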

3. We added a few prints in the driver to dump "scmd->request->cmd_flags"
and "scmd->request->rq_flags" for IOs completed with CHECK CONDITION. The
culprit IOs have all of these bits set: REQ_FAILFAST_DEV and REQ_RAHEAD in
"scmd->request->cmd_flags", and RQF_MIXED_MERGE in
"scmd->request->rq_flags". It is not necessarily true that every IO with
these three bits set will cause the issue, but whenever the issue hits,
these three bits are set on the failing IO.
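
(A print along these lines in the driver's completion path is enough to
capture the flags; this is a sketch, not the exact debug patch behind the
logs above) -
--------
/* Sketch: dump request flags when a command completes with CHECK
 * CONDITION; scmd_printk() logs against the command's device. */
scmd_printk(KERN_INFO, scmd, "cmd_flags: 0x%x rq_flags: 0x%x\n",
	    scmd->request->cmd_flags,
	    (unsigned int)scmd->request->rq_flags);
--------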


In summary -
The FS mechanism of using readahead for metadata works fine (in case of IO
failure) if there is no mixed merge at the block layer.
The FS mechanism of using readahead for metadata has some corner case
which is not handled properly (in case of IO failure) if there was a mixed
merge at the block layer.
The megaraid_sas driver's behavior seems correct here: the aborted IO goes
to the SML with CHECK CONDITION set, and the SML decided to fail the IO
fast, as requested.

Query - Is this a block layer (page cache) issue? What would be the ideal
fix?

Thanks,
Sumit

* RE: Application stops due to ext4 filesystem IO error
@ 2017-06-13 13:31 ` Sumit Saxena
From: Sumit Saxena @ 2017-06-13 13:31 UTC
  To: Jens Axboe; +Cc: linux-block, linux-scsi

Gentle ping.

I have opened a kernel BZ for this. Here is the link -
https://bugzilla.kernel.org/show_bug.cgi?id=196057

Thanks,
Sumit

* RE: Application stops due to ext4 filesystem IO error
@ 2017-06-06 15:34 ` Sumit Saxena
From: Sumit Saxena @ 2017-06-06 15:34 UTC
  To: Jens Axboe; +Cc: linux-block, linux-scsi

Gentle ping..

