From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
To: Sreekanth Reddy, "linux-scsi@vger.kernel.org", "linux-kernel@vger.kernel.org", "irqbalance@lists.infradead.org"
CC: Kashyap Desai, Sathya Prakash Veerichetty, Chaitra Basappa, Suganath Prabu Subramani
Subject: Re: Observing Softlockup's while running heavy IOs
Date: Thu, 18 Aug 2016 14:59:59 +0000
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="_002_SN1PR0201MB1870C7DFF1595905BDAE985F81150SN1PR0201MB1870_"
X-Mailing-List: linux-kernel@vger.kernel.org

--_002_SN1PR0201MB1870C7DFF1595905BDAE985F81150SN1PR0201MB1870_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

On 08/17/16 22:55, Sreekanth Reddy wrote:
> Observing softlockups while running heavy IOs on 8 SSD drives
> connected behind our LSI SAS 3004 HBA.

Hello Sreekanth,

This means that more than 23s was spent before the scheduler was
invoked, probably due to a loop.
Can you give the attached (untested)
patch a try to see whether it is the loop in __blk_mq_run_hw_queue()?

Thanks,

Bart.

--_002_SN1PR0201MB1870C7DFF1595905BDAE985F81150SN1PR0201MB1870_
Content-Type: text/x-patch; name="0001-block-Measure-__blk_mq_run_hw_queue-execution-time.patch"
Content-Description: 0001-block-Measure-__blk_mq_run_hw_queue-execution-time.patch
Content-Disposition: attachment;
	filename="0001-block-Measure-__blk_mq_run_hw_queue-execution-time.patch";
	size=1132; creation-date="Thu, 18 Aug 2016 14:59:58 GMT";
	modification-date="Thu, 18 Aug 2016 14:59:58 GMT"

>From 4da94f2ec37ee5d1b4a5f1ce2886bdafd5cd394c Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Thu, 18 Aug 2016 07:51:49 -0700
Subject: [PATCH] block: Measure __blk_mq_run_hw_queue() execution time

Note: the "max_elapsed" variable can be modified by multiple threads
concurrently.
---
 block/blk-mq.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e931a0e..6d0961c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -792,6 +792,9 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 	LIST_HEAD(driver_list);
 	struct list_head *dptr;
 	int queued;
+	static long max_elapsed = -1;
+	unsigned long start = jiffies;
+	long elapsed;
 
 	WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask));
 
@@ -889,6 +892,13 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 		 **/
 		blk_mq_run_hw_queue(hctx, true);
 	}
+
+	elapsed = jiffies - start;
+	if (elapsed > max_elapsed) {
+		max_elapsed = elapsed;
+		pr_info("%s() finished after %d ms\n", __func__,
+			jiffies_to_msecs(elapsed));
+	}
 }
 
 /*
-- 
2.9.2

--_002_SN1PR0201MB1870C7DFF1595905BDAE985F81150SN1PR0201MB1870_--