linux-kernel.vger.kernel.org archive mirror
From: James Hilliard <james.hilliard1@gmail.com>
To: Konstantin Khorenko <khorenko@virtuozzo.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>,
	Sagar Biradar <sagar.biradar@microchip.com>,
	linux-scsi@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Subject: Re: [PATCH v3 0/1] aacraid: Host adapter Adaptec 6405 constantly resets under high io load
Date: Thu, 6 May 2021 16:22:30 -0600
Message-ID: <CADvTj4rVS-wJy1B=dgEO1AOADNYgL3XkZ01Aq=RTfPGEZC+VMA@mail.gmail.com>
In-Reply-To: <20190819163546.915-1-khorenko@virtuozzo.com>

On Mon, Aug 19, 2019 at 10:35 AM Konstantin Khorenko
<khorenko@virtuozzo.com> wrote:
>
> Problem description:
> ====================
> A node with Adaptec 6405 controller, latest BIOS V5.3-0[19204]
I'm hitting this on an Adaptec RAID 71605 as well, with BIOS V7.5.0[32118].
> A lot of disks attached to the controller.
> Simple test: running mkfs.ext4 on many disks on the same controller in
> parallel (mkfs is not important here, any serious io load triggers controller
> aborts)
In my case a ZFS resilver triggered this.
>
>
> Results:
> * no problems (controller resets) with kernels prior to
>   395e5df79a95 ("scsi: aacraid: Remove reference to Series-9")
>
> * latest ms kernel v5.2-rc6-15-g249155c20f9b - mkfs processes are in D state,
>   a lot of complaints in the logs like:
>
>   [  654.894633] aacraid: Host adapter abort request.
>   aacraid: Outstanding commands on (0,1,43,0):
>   [  699.441034] aacraid: Host adapter abort request.
>   aacraid: Outstanding commands on (0,1,40,0):
>   [  699.442950] aacraid: Host adapter reset request. SCSI hang ?
>   [  714.457428] aacraid: Host adapter reset request. SCSI hang ?
>   ...
>   [  759.514759] aacraid: Host adapter reset request. SCSI hang ?
>   [  759.514869] aacraid 0000:03:00.0: outstanding cmd: midlevel-0
>   [  759.514870] aacraid 0000:03:00.0: outstanding cmd: lowlevel-0
>   [  759.514872] aacraid 0000:03:00.0: outstanding cmd: error handler-498
>   [  759.514873] aacraid 0000:03:00.0: outstanding cmd: firmware-471
>   [  759.514875] aacraid 0000:03:00.0: outstanding cmd: kernel-60
>   [  759.514912] aacraid 0000:03:00.0: Controller reset type is 3
>   [  759.515013] aacraid 0000:03:00.0: Issuing IOP reset
>   [  850.296705] aacraid 0000:03:00.0: IOP reset succeeded
>
> Same complaints on Ubuntu kernel 4.15.0-50-generic:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1777586
It looks like it's popping up in Proxmox as well:
https://forum.proxmox.com/threads/aacraid-host-adapter-abort-request-errors.86903/

When I tested this patch it appeared to reduce the frequency of the
issue, although I did still hit an abort request:
aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,47,0):
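
To put a number on "reduce the frequency", one option is to count the
aacraid abort and reset messages in the kernel log before and after
applying the patch. The following is only a rough sketch: the function
name count_events and the event labels are made up for illustration, and
the match strings are taken from the log excerpts quoted in this thread
rather than from the driver source.

```python
import re
from collections import Counter

# Message fragments copied from the log excerpts in this thread.
# The dictionary keys are illustrative labels, not driver terminology.
PATTERNS = {
    "abort": re.compile(r"aacraid: Host adapter abort request"),
    "reset": re.compile(r"aacraid: Host adapter reset request"),
    "iop_reset_ok": re.compile(r"aacraid .*IOP reset succeeded"),
}


def count_events(lines):
    """Count aacraid abort/reset messages in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Example input: a few lines from the log quoted above.
sample = [
    "[  654.894633] aacraid: Host adapter abort request.",
    "aacraid: Outstanding commands on (0,1,43,0):",
    "[  699.441034] aacraid: Host adapter abort request.",
    "[  699.442950] aacraid: Host adapter reset request. SCSI hang ?",
    "[  850.296705] aacraid 0000:03:00.0: IOP reset succeeded",
]
for name, n in sorted(count_events(sample).items()):
    print(f"{name}: {n}")
```

Feeding it the saved dmesg output from comparable runs on the patched and
unpatched kernels would give counts that can be compared directly.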
>
>
>
> Controller:
> ===========
> 03:00.0 RAID bus controller: Adaptec Series 6 - 6G SAS/PCIe 2 (rev 01)
>          Subsystem: Adaptec Series 6 - ASR-6405 - 4 internal 6G SAS ports
>
> Test:
> =====
> # cat dev.list
> /dev/sdq1
> /dev/sde1
> /dev/sds1
> /dev/sdb1
> /dev/sdk1
> /dev/sdaj1
> /dev/sdaf1
> /dev/sdd1
> /dev/sdac1
> /dev/sdai1
> /dev/sdz1
> /dev/sdj1
> /dev/sdy1
> /dev/sdn1
> /dev/sdae1
> /dev/sdg1
> /dev/sdi1
> /dev/sdc1
> /dev/sdf1
> /dev/sdl1
> /dev/sda1
> /dev/sdab1
> /dev/sdr1
> /dev/sdo1
> /dev/sdah1
> /dev/sdm1
> /dev/sdt1
> /dev/sdp1
> /dev/sdad1
> /dev/sdh1
>
> ===========================================
> # cat run_mkfs.sh
> #!/bin/bash
>
> while read -r i; do
>    mkfs.ext4 "$i" -q -E lazy_itable_init=1 -O uninit_bg -m 0 &
> done
>
> =================================
> # cat dev.list | ./run_mkfs.sh
>
> The issue is 100% reproducible.
>
> I've bisected to the culprit patch, it's
> 395e5df79a95 ("scsi: aacraid: Remove reference to Series-9")
>
> It changes the arc ctrl checks for Series-6 controllers,
> and I've checked that restoring the original logic in the arc ctrl checks
> eliminates the controller hangs/resets.
>
> Konstantin Khorenko (1):
>   scsi: aacraid: resurrect correct arc ctrl checks for Series-6
>
> --
> v3 changes:
>  * introduced another wrapper to check for devices except for Series 6
>    controllers upon request from Sagar Biradar (Microchip)
>
>  * dropped mentions of private bug ids
>
>
>  drivers/scsi/aacraid/aacraid.h  | 11 +++++++++++
>  drivers/scsi/aacraid/comminit.c |  5 ++---
>  drivers/scsi/aacraid/linit.c    |  2 +-
>  3 files changed, 14 insertions(+), 4 deletions(-)
>
> --
> 2.15.1
>
>
