From mboxrd@z Thu Jan 1 00:00:00 1970
From: Don Brace
To: Xunlei Pang, Joerg Roedel, David Woodhouse
CC: "iommu@lists.linux-foundation.org", "linux-kernel@vger.kernel.org", Myron Stowe, Joseph Szczypek, Baoquan He, Dave Young
Subject: RE: [PATCH v3] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped
Date: Tue, 6 Dec 2016 22:03:20 +0000
Message-ID: <4993A297653ECB4581FA5C3C31323D1941809326@avsrvexchmbx1.microsemi.net>
References: <1480939747-31916-1-git-send-email-xlpang@redhat.com>
In-Reply-To: <1480939747-31916-1-git-send-email-xlpang@redhat.com>
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Xunlei Pang [mailto:xlpang@redhat.com]
> Sent: Monday, December 05, 2016 6:09 AM
> To: Joerg Roedel; David Woodhouse
> Cc: iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org; Xunlei Pang; Myron Stowe; Joseph Szczypek; Don Brace; Baoquan He; Dave Young
> Subject: [PATCH v3] iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped
>
> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers
> under kdump, it can be steadily reproduced on several different machines,
> the dmesg log is like:
> HP HPSA Driver (v 3.4.16-0)
> hpsa 0000:02:00.0: using doorbell to reset controller
> hpsa 0000:02:00.0: board ready after hard reset.
> hpsa 0000:02:00.0: Waiting for controller to respond to no-op
> DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff]
> DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff]
> DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff]
> DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff]
> DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff]
> DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff]
> DMAR: DRHD: handling fault status reg 2
> DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set
> hpsa 0000:02:00.0: controller message 03:00 timed out
> hpsa 0000:02:00.0: no-op failed; re-trying
>
> After some debugging, we found that the fault addr is from DMA initiated at
> the driver probe stage after reset(not in-flight DMA), and the corresponding
> pte entry value is correct, the fault is likely due to the old iommu caches
> of the in-flight DMA before it.
>
> Thus we need to flush the old cache after context mapping is setup for the
> device, where the device is supposed to finish reset at its driver probe
> stage and no in-flight DMA exists hereafter.
>
> I'm not sure if the hardware is responsible for invalidating all the related
> caches allocated in the iommu hardware before, but seems not the case for hpsa,
> actually many device drivers have problems in properly resetting the hardware.
> Anyway flushing (again) by software in kdump kernel when the device gets context
> mapped which is a quite infrequent operation does little harm.
>
> With this patch, the problematic machine can survive the kdump tests.
>
> CC: Myron Stowe
> CC: Joseph Szczypek
> CC: Don Brace
> CC: Baoquan He
> CC: Dave Young
> Fixes: 091d42e43d21 ("iommu/vt-d: Copy translation tables from old kernel")
> Fixes: dbcd861f252d ("iommu/vt-d: Do not re-use domain-ids from the old kernel")
> Fixes: cf484d0e6939 ("iommu/vt-d: Mark copied context entries")
> Signed-off-by: Xunlei Pang
> ---
> v2->v3:
> Flush context cache only and add Fixes-tag, according to Joerg's comments.
>
>  drivers/iommu/intel-iommu.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3965e73..624eac9 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2024,6 +2024,25 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
>  	if (context_present(context))
>  		goto out_unlock;
>  
> +	/*
> +	 * For kdump cases, old valid entries may be cached due to the
> +	 * in-flight DMA and copied pgtable, but there is no unmapping
> +	 * behaviour for them, thus we need an explicit cache flush for
> +	 * the newly-mapped device. For kdump, at this point, the device
> +	 * is supposed to finish reset at its driver probe stage, so no
> +	 * in-flight DMA will exist, and we don't need to worry anymore
> +	 * hereafter.
> +	 */
> +	if (context_copied(context)) {
> +		u16 did_old = context_domain_id(context);
> +
> +		if (did_old >= 0 && did_old < cap_ndoms(iommu->cap))
> +			iommu->flush.flush_context(iommu, did_old,
> +						   (((u16)bus) << 8) | devfn,
> +						   DMA_CCMD_MASK_NOBIT,
> +						   DMA_CCMD_DEVICE_INVL);
> +	}
> +
>  	pgd = domain->pgd;
>  
>  	context_clear_entry(context);
> --
> 1.8.3.1

Tested-by: Don Brace

Thanks,
Don Brace
ESC - Smart Storage
Microsemi Corporation
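
For readers following the hunk above, here is a minimal user-space sketch of the two values the flush call is built from: the 16-bit source-id formed as (bus << 8) | devfn, and the bound on the old domain-id. It is an illustration only, not kernel code. The helper names (pci_devfn, pci_source_id, ndoms_from_nd, old_did_in_range) are hypothetical, and the domain count formula 2^(4 + 2*ND) is my reading of the VT-d capability register ND field, so treat it as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack a PCI slot (5 bits) and function (3 bits)
 * into the 8-bit devfn value, the same layout the kernel's PCI_DEVFN uses. */
static uint8_t pci_devfn(uint8_t slot, uint8_t func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* Hypothetical helper: the 16-bit source-id passed to the context-cache
 * flush in the patch, i.e. (((u16)bus) << 8) | devfn. */
static uint16_t pci_source_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/* Assumption: the number of supported domains is 2^(4 + 2*ND), where ND
 * is the low 3 bits of the capability register. */
static uint32_t ndoms_from_nd(uint8_t nd)
{
	return UINT32_C(1) << (4 + 2 * (nd & 0x7));
}

/* Hypothetical helper: true when the old domain-id is one the hardware
 * can hold. Only the upper bound is tested, because a lower bound of 0
 * is always satisfied by an unsigned 16-bit value. */
static bool old_did_in_range(uint16_t did_old, uint8_t nd)
{
	return did_old < ndoms_from_nd(nd);
}
```

With these helpers, the device in the dmesg log (0000:02:00.0) has bus 0x02 and devfn 0x00, so its source-id would be 0x0200. A device-selective flush with that source-id targets only the cached context entry for this one device, which matches the v3 note about flushing the context cache only.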