From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Elliott, Robert (Persistent Memory)"
To: Sreekanth Reddy, "linux-scsi@vger.kernel.org", "linux-kernel@vger.kernel.org", "irqbalance@lists.infradead.org"
CC: Kashyap Desai, Sathya Prakash Veerichetty, Chaitra Basappa, Suganath Prabu Subramani
Subject: RE: Observing Softlockup's while running heavy IOs
Date: Thu, 18 Aug 2016 21:08:18 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Sreekanth Reddy
> Sent: Thursday, August 18, 2016 12:56 AM
> Subject: Observing Softlockup's while running heavy IOs
>
> Problem statement:
> Observing softlockups while running heavy IOs on 8 SSD drives
> connected behind our LSI SAS 3004 HBA.
> ...
> Observing a loop in the IO path, i.e. only one CPU is busy with
> processing the interrupts and the other CPUs (in the affinity_hint
> mask) are busy with sending the IOs (these CPUs are not receiving
> any interrupts at all). For example, only CPU6 is busy with processing
> the interrupts from IRQ 219 and the remaining CPUs, i.e. CPU 7, 8, 9,
> 10 & 11, are just busy with pumping the IOs and they never processed
> any IO interrupts from IRQ 219. So we are observing softlockups due to
> the existence of this loop in the IO path.
>
> We might not observe these softlockups if irqbalancer had balanced
> the interrupts among the CPUs enabled in the particular irq's
> affinity_hint mask, so that all the CPUs are equally busy with sending
> IOs and processing the interrupts. I am not sure how irqbalancer
> balances the load among the CPUs, but here I see only one CPU from the
> irq's affinity_hint mask is busy with interrupts and the remaining
> CPUs won't receive any interrupts from this IRQ.
>
> Please help me with any suggestions/recommendations to solve/limit
> these kinds of softlockups. Also please let me know if I have missed
> any setting in irqbalance.
>

The CPUs need to be forced to self-throttle by processing interrupts
for their own submissions, which reduces the time they have available
to submit more IOs.

See https://lkml.org/lkml/2014/9/9/931 for discussion of this problem
when blk-mq was added.

---
Robert Elliott, HPE Persistent Memory
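
One way to confirm the imbalance described in the report (a single CPU
taking every interrupt for an IRQ while the other hinted CPUs only submit)
is to compare the per-CPU counts in /proc/interrupts with the IRQ's
affinity_hint mask. A minimal sketch in Python follows. The function names,
the sample text, its counts, and the "example-msix0" label are all made up
for illustration; on a real system the inputs would be read from
/proc/interrupts and /proc/irq/<N>/affinity_hint.

```python
# Hypothetical sample in /proc/interrupts format (counts are invented).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 219:          0          0     812345          0   PCI-MSI-edge   example-msix0
"""


def parse_cpu_mask(mask_text):
    """Convert a hex CPU mask (e.g. '00000fc0', or comma-separated
    32-bit groups) into a sorted list of CPU numbers."""
    value = int(mask_text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def irq_counts(interrupts_text, irq):
    """Return {cpu: interrupt_count} for one IRQ number."""
    lines = interrupts_text.strip("\n").splitlines()
    # The header row names the online CPUs: CPU0 CPU1 ...
    cpus = [int(name[3:]) for name in lines[0].split()]
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == str(irq):
            fields = rest.split()[:len(cpus)]
            return dict(zip(cpus, (int(f) for f in fields)))
    raise KeyError(irq)


def idle_hinted_cpus(counts, hinted):
    """CPUs that are in the hint mask but have handled no interrupts."""
    return [cpu for cpu in hinted if counts.get(cpu, 0) == 0]


counts = irq_counts(SAMPLE, 219)
print(idle_hinted_cpus(counts, parse_cpu_mask("c")))
```

A non-empty result for a busy IRQ means some hinted CPUs are submitting
IOs without ever handling the completions, which is the situation the
report describes for CPUs 7 through 11.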