From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dan Williams Subject: Re: [PATCH] libata: revert "libata: use blk taging" et al. Date: Thu, 12 Mar 2015 06:14:24 -0400 Message-ID: References: <5500863D.4070807@cybernetics.com> <5500B783.20106@fb.com> <5500BF6F.6080000@cybernetics.com> <20150312023148.GA1960870@devbig257.prn2.facebook.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Return-path: Received: from mail-we0-f180.google.com ([74.125.82.180]:45022 "EHLO mail-we0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751982AbbCLKO0 (ORCPT ); Thu, 12 Mar 2015 06:14:26 -0400 Received: by wesp10 with SMTP id p10so15177747wes.11 for ; Thu, 12 Mar 2015 03:14:24 -0700 (PDT) In-Reply-To: <20150312023148.GA1960870@devbig257.prn2.facebook.com> Sender: linux-ide-owner@vger.kernel.org List-Id: linux-ide@vger.kernel.org To: Shaohua Li Cc: Tony Battersby , Jens Axboe , Tejun Heo , IDE/ATA development list , Christoph Hellwig , "linux-kernel@vger.kernel.org" On Wed, Mar 11, 2015 at 10:31 PM, Shaohua Li wrote: > On Wed, Mar 11, 2015 at 06:19:27PM -0400, Tony Battersby wrote: >> On 03/11/2015 05:45 PM, Jens Axboe wrote: >> > On 03/11/2015 02:15 PM, Tony Battersby wrote: >> >> This reverts commits 12cb5ce101abfaf74421f8cc9f196e708209eb79 and >> >> 98bd4be1ba95f2fe7f543910792b7163a5de06eb. >> >> >> >> Commit 12cb5ce101ab ("libata: use blk taging") causes the following oops >> >> with scsi-mq enabled: >> >> >> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 >> >> IP: [] ata_qc_new_init+0x3e/0x120 >> >> PGD 32adf0067 PUD 32adf1067 PMD 0 >> >> Oops: 0002 [#1] SMP DEBUG_PAGEALLOC >> >> Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi igb >> >> i2c_algo_bit ptp pps_core pm80xx libsas scsi_transport_sas sg coretemp >> >> eeprom w83795 i2c_i801 >> >> CPU: 4 PID: 1450 Comm: cydiskbench Not tainted 4.0.0-rc3 #1 >> >> Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.1b 05/04/12 >> >> task: ffff8800ba86d500 ti: ffff88032a064000 task.ti: ffff88032a064000 >> >> RIP: 0010:[] [] ata_qc_new_init+0x3e/0x120 >> >> RSP: 0018:ffff88032a067858 EFLAGS: 00010046 >> >> RAX: 0000000000000000 RBX: ffff8800ba0d2230 RCX: 000000000000002a >> >> RDX: ffffffff80505ae0 RSI: 0000000000000020 RDI: ffff8800ba0d2230 >> >> RBP: ffff88032a067868 R08: 0000000000000201 R09: 0000000000000001 >> >> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ba0d0000 >> >> R13: ffff8800ba0d2230 R14: ffffffff80505ae0 R15: ffff8800ba0d0000 >> >> FS: 0000000041223950(0063) GS:ffff88033e480000(0000) knlGS:0000000000000000 >> >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> >> CR2: 0000000000000058 CR3: 000000032a0a3000 CR4: 00000000000006e0 >> >> Stack: >> >> ffff880329eee758 ffff880329eee758 ffff88032a0678a8 ffffffff80502dad >> >> ffff8800ba167978 ffff880329eee758 ffff88032bf9c520 ffff8800ba167978 >> >> ffff88032bf9c520 ffff88032bf9a290 ffff88032a0678b8 ffffffff80506909 >> >> Call Trace: >> >> [] ata_scsi_translate+0x3d/0x1b0 >> >> [] ata_sas_queuecmd+0x149/0x2a0 >> >> [] sas_queuecommand+0xa0/0x1f0 [libsas] >> >> [] scsi_dispatch_cmd+0xd4/0x1a0 >> >> [] scsi_queue_rq+0x66f/0x7f0 >> >> [] __blk_mq_run_hw_queue+0x208/0x3f0 >> >> [] blk_mq_run_hw_queue+0x88/0xc0 >> >> [] blk_mq_insert_request+0xc4/0x130 >> >> [] blk_execute_rq_nowait+0x73/0x160 >> >> [] sg_common_write+0x3da/0x720 [sg] >> >> [] ? might_fault+0x5e/0xb0 >> >> [] sg_new_write+0x250/0x360 [sg] >> >> [] ? __lock_acquire+0x50c/0xc10 >> >> [] ? lock_release_non_nested+0xa7/0x360 >> >> [] ? _raw_spin_unlock_irqrestore+0x3b/0x60 >> >> [] ? might_fault+0x5e/0xb0 >> >> [] ? might_fault+0x5e/0xb0 >> >> [] sg_write+0x13b/0x450 [sg] >> >> [] ? __lock_acquire+0x50c/0xc10 >> >> [] ? do_futex+0x109/0xbf0 >> >> [] ? might_fault+0x5e/0xb0 >> >> [] vfs_write+0xd1/0x1b0 >> >> [] SyS_write+0x54/0xc0 >> >> [] system_call_fastpath+0x12/0x17 >> >> Code: 24 20 04 0f 85 ec 00 00 00 49 83 3c 24 00 0f 84 cf 00 00 00 83 fe 1f >> >> 0f 87 dc 00 00 00 89 f0 48 69 c0 f0 00 00 00 49 8d 44 04 40 <89> 70 58 48 >> >> c7 40 10 00 00 00 00 4c 89 20 48 89 58 08 c7 40 64 >> >> RIP [] ata_qc_new_init+0x3e/0x120 >> >> RSP >> >> CR2: 0000000000000058 >> >> ---[ end trace 43f5eefb64627eff ]--- >> >> >> >> >> >> scsi-mq uses a host-wide tag map shared among all devices with some >> >> integer tag values >= ATA_MAX_QUEUE. These unexpectedly high tag values >> >> cause __ata_qc_from_tag() to return NULL, which is then dereferenced in >> >> ata_qc_new_init(), causing the oops above. >> > Wait, something is missing here. We should not be getting tag values >> > that are >= ATA_MAX_QUEUE. Instead of reverting this, we need to figure >> > out why this is happening, and fix it. That is correct way forward here. >> > What setup is this being reproduced on? >> > >> >> Hardware: >> PMC 8001 SAS HBA (PCI ID 117c:0042, PCI sub-ID 117c:0043) using pm80xx >> driver >> 4 SATA disks directly connected (no expander) >> >> Test procedure: >> Send disk read/write commands to SATA disks using the SCSI generic (sg) >> driver. >> >> Analysis: >> >> shost->can_queue is 508 >> >> With my patch applied to revert the problematic commits, I added the >> following code to ata_scsi_qc_new(): >> >> int tag = cmd->request->tag; >> static int max_tag; >> if (tag > max_tag) { >> max_tag = tag; >> printk(KERN_DEBUG "max tag %d\n", tag); >> } >> >> Testing one SATA disk at a time with scsi-mq enabled, I get a max tag of >> 64. Testing 4 disks at a time with scsi-mq enabled gives a max tag of >> 194. With scsi-mq disabled, I get a max tag of 30 no matter how many >> disks I test. > > Can you please try this debug patch: > > diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c > index 932d9cc..46f153f 100644 > --- a/drivers/scsi/libsas/sas_ata.c > +++ b/drivers/scsi/libsas/sas_ata.c > @@ -572,7 +572,6 @@ int sas_ata_init(struct domain_device *found_dev) > > ap->private_data = found_dev; > ap->cbl = ATA_CBL_SATA; > - ap->scsi_host = shost; > rc = ata_sas_port_init(ap); > if (rc) { > ata_sas_port_destroy(ap); We need a scsi_host for error recovery, see: ata_std_sched_eh()