From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1761318Ab2DKVoc (ORCPT ); Wed, 11 Apr 2012 17:44:32 -0400
Received: from mail.tomasu.net ([64.85.170.232]:47283 "EHLO mail.tomasu.net" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1755972Ab2DKVob (ORCPT ); Wed, 11 Apr 2012 17:44:31 -0400
From: Thomas Fjellstrom
Reply-To: thomas@fjellstrom.ca
To: adam radford
Subject: Re: stuck in megaraid_sas.c megasas_adp_reset_gen2
Date: Wed, 11 Apr 2012 15:44:27 -0600
User-Agent: KMail/1.13.7 (Linux/3.2.0-1-amd64; KDE/4.7.4; x86_64; ; )
Cc: lkml , linux-scsi@vger.kernel.org
References: <201203211716.45418.thomas@fjellstrom.ca> <201204111417.28893.thomas@fjellstrom.ca>
In-Reply-To:
MIME-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201204111544.27866.thomas@fjellstrom.ca>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed Apr 11, 2012, adam radford wrote:
> On Wed, Apr 11, 2012 at 1:17 PM, Thomas Fjellstrom wrote:
> >> ADP_RESET_GEN2: HostDiag=a0
> >>
> >> followed by a bunch of:
> >>
> >> RESET_GEN2: retry=%x, hostdiag=a4
> >>
> >> Now I'm not sure the hostdiag should be different between the two. if
> >> this aN identifier is similar to the aN identifiers in the MegaCli
> >> tool, then it would mean its trying to reset a device that doesn't
> >> exist? I only have a single M1015 card installed.
>
> host diag register output a0 or a4 has absolutely nothing to do with
> MegaCli -aN command line argument for specifying adapter number.
>
> > I just got a second M1015 card in today and gave it a go. Similar issues,
> > different log messages. (hand typed from picture taken of screens)
> >
> > Lots of:
> >
> > megasas: Waiting for 1 commands to complete
>
> Can you try booting with kernel command line argument pcie_aspm=off

No problem. Things are quite similar.
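As an aside for anyone reproducing this test: one way to confirm that pcie_aspm=off actually reached the running kernel is to look at /proc/cmdline, the standard Linux file holding the boot command line. The sketch below is only an illustration. The helper names aspm_disabled and running_kernel_cmdline are made up for this example, and it only checks that the token is present, not how the kernel acted on it.

```python
from pathlib import Path


def aspm_disabled(cmdline: str) -> bool:
    """Return True if pcie_aspm=off appears as a whole token.

    Hypothetical helper: it only inspects the command-line string and
    says nothing about what the PCIe driver actually did with it.
    """
    return "pcie_aspm=off" in cmdline.split()


def running_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path(path).read_text().strip()


if __name__ == "__main__":
    # Reads the live system; only meaningful when run on the affected box.
    print("pcie_aspm=off present:", aspm_disabled(running_kernel_cmdline()))
```

Run on the machine after rebooting with the extra argument. If it reports False, the bootloader entry was probably not updated.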
Startup goes like:

scsi: waiting for bus probes to complete...
Refined TSC...
Switched to clocksource tsc
udevd[...]: timeout: killing '/sbin/modprobe -b ...'
(lots of these, so much that I hit scroll lock so I can see the kernel
messages as they come up)
scsi 0:0:0:0: megasas: RESET cmd=12 retries=0
megasas: [ 0] waiting for 1 commands to complete
(many more waiting messages)
Call Trace:
 [] ? async_synchronize_cookie_domain+0xb2/...c
 [] ? add_wait_queue+0x3c/0x3c
....
megasas: [55] waiting for 1 commands to complete
....
megasas: [175] waiting for 1 commands to complete
megasas: moving cmd[0]:ffff880234bcb940:0:ffff88002339beec0 the defer queue as internal
megaraid_sas: FW detected to be in faultstate, restarting it...
ADP_RESET_GEN2: HostDiag=a0
(10s wait)
megaraid_sas: FW restarted successfully,initializing next stage...
megaraid_sas: HBA recovery state machine,state 2 starting...
(30s wait)
megasas: Waiting for FW to come to ready state
megasas: FW now in ready state
megaraid_sas: command ffff880234bcb940, ffff8802339beec0:0detected to be pending while HBA reset
megasas: ffff880234bcb940 scsi cmd [12]detected on the internal queue, issue again.
megasas: reset successful
scsi: 0:0:0:0: megasas: RESET cmd=12 retries 0
megaraid_sas: no pending cmds after reset
megasas: reset successful
(20s wait)
(device offlined message here, missed it this time)
(detected all sata devices)

And it stalled there.

> -Adam

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca