From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
To: "tj@kernel.org", "axboe@kernel.dk"
CC: "kernel-team@fb.com", "linux-kernel@vger.kernel.org", "peterz@infradead.org", "osandov@fb.com", "linux-block@vger.kernel.org", "oleg@redhat.com", "hch@lst.de"
Subject: Re: [PATCHSET v2] blk-mq: reimplement timeout handling
Date: Wed, 20 Dec 2017 23:41:02 +0000
Message-ID: <1513813261.2603.36.camel@wdc.com>
References: <20171212190134.535941-1-tj@kernel.org>
In-Reply-To: <20171212190134.535941-1-tj@kernel.org>
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0

On Tue, 2017-12-12 at 11:01 -0800, Tejun Heo wrote:
> Currently, blk-mq timeout path synchronizes against the usual
> issue/completion path using a complex scheme involving atomic
> bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
> rules.  Unfortunately, it contains quite a few holes.

Hello Tejun,

An attempt to run SCSI I/O with this patch series applied resulted in
the following:

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: scsi_times_out+0x1c/0x2d0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 437 Comm: kworker/1:1H Tainted: G        W        4.15.0-rc4-dbg+ #1
Hardware name: Dell Inc. PowerEdge R720/0VWT90, BIOS 2.5.4 01/22/2016
Workqueue: kblockd blk_mq_timeout_work
RIP: 0010:scsi_times_out+0x1c/0x2d0
RSP: 0018:ffffc90007ef3d58 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880878eab000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880878eab000
RBP: ffff880878eab1a0 R08: ffffffffffffffff R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004
R13: 0000000000000000 R14: ffff88085e4a5ce8 R15: ffff880878e9f848
FS:  0000000000000000(0000) GS:ffff88093f600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000001c0f002 CR4: 00000000000606e0
Call Trace:
 blk_mq_terminate_expired+0x36/0x70
 bt_iter+0x43/0x50
 blk_mq_queue_tag_busy_iter+0xee/0x200
 blk_mq_timeout_work+0x186/0x2e0
 process_one_work+0x221/0x6e0
 worker_thread+0x3a/0x390
 kthread+0x11c/0x140
 ret_from_fork+0x24/0x30
RIP: scsi_times_out+0x1c/0x2d0 RSP: ffffc90007ef3d58
CR2: 0000000000000000

(gdb) list *(scsi_times_out+0x1c)
0xffffffff8147adbc is in scsi_times_out (drivers/scsi/scsi_error.c:285).
280      */
281     enum blk_eh_timer_return scsi_times_out(struct request *req)
282     {
283             struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
284             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
285             struct Scsi_Host *host = scmd->device->host;
286
287             trace_scsi_dispatch_cmd_timeout(scmd);
288             scsi_log_completion(scmd, TIMEOUT_ERROR);
289

(gdb) disas /s scsi_times_out
[ ... ]
283             struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
284             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
285             struct Scsi_Host *host = scmd->device->host;
   0xffffffff8147adb2 <+18>:    mov    0x1d8(%rdi),%rax
   0xffffffff8147adb9 <+25>:    mov    %rdi,%rbx
   0xffffffff8147adbc <+28>:    mov    (%rax),%r13
   0xffffffff8147adbf <+31>:    nopl   0x0(%rax,%rax,1)

Bart.