target-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maurizio Lombardi <mlombard@redhat.com>
To: Mike Christie <michael.christie@oracle.com>, martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	bvanassche@acm.org
Subject: Re: [PATCH 2/2] target: iscsi: fix a race condition when aborting a task
Date: Tue, 27 Oct 2020 13:49:28 +0000	[thread overview]
Message-ID: <1852a8bd-3edc-5c49-fa51-9afe52f125a8@redhat.com> (raw)
In-Reply-To: <20daa17d-08e7-a412-4d33-bcf75587eca6@oracle.com>

Hello Mike,

Dne 22. 10. 20 v 4:42 Mike Christie napsal(a):
> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do:
> 
> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP.
> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do:
> 
> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd
> 
> The cmd is now freed.
> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released.
> 
> 

Thanks for the review!

There are definitely some problems with task aborts and commands' refcounting *
but this is a different bug than the one this patch is trying to solve (a race to list_del_init());
unless you are saying that abort tasks should never be executed when the connection 
is going down and we have to prevent such cases from happening at all.

* I did a test a few days ago: 
1) blocked the backstore device on the target
2) issued a write command on the initiator
3) waited for abort task and connection shutdown
4) switched the target backstore device to state "running"

I triggered the following warnings in target_handle_abort():

        } else {
                /*
                 * Allow the fabric driver to unmap any resources before
                 * releasing the descriptor via TFO->release_cmd().
                 */
                cmd->se_tfo->aborted_task(cmd);
                if (ack_kref)
                        WARN_ON_ONCE(target_put_sess_cmd(cmd) != 0); <-----------------------
                /*
                 * To do: establish a unit attention condition on the I_T
                 * nexus associated with cmd. See also the paragraph "Aborting
                 * commands" in SAM.
                 */
        }

        WARN_ON_ONCE(kref_read(&cmd->cmd_kref) = 0); <----------------------

        transport_lun_remove_cmd(cmd);

        transport_cmd_check_stop_to_fabric(cmd);
}

target_handle_abort() is executed against a command which has already been released.

[  240.883865] iSCSI/iqn.1994-05.com.redhat:9489e049fb5e: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[  353.848593] ABORT_TASK: Found referenced iSCSI task_tag: 101
[  386.521582] iSCSI Login timeout on Network Portal 0.0.0.0:3260
[  396.537133] WARNING: CPU: 1 PID: 2936 at drivers/target/target_core_transport.c:805 target_handle_abort+0xce/0x180 [target_core_mod]
[  396.537139] Modules linked in: iscsi_target_mod target_core_user uio target_core_pscsi target_core_file target_core_iblock target_core_mod uinput nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ppdev drm joydev pcspkr i2c_piix4 parport_pc parport xfs libcrc32c sr_mod cdrom sd_mod sg ata_generic ata_piix libata virtio_net net_failover failover serio_raw fuse [last unloaded: ip_tables]
[  396.537302] CPU: 1 PID: 2936 Comm: kworker/1:2 Kdump: loaded Not tainted 4.18.0-240.7.el8.x86_64 #1
[  396.537306] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[  396.537353] Workqueue: target_completion target_abort_work [target_core_mod]
[  396.537394] RIP: 0010:target_handle_abort+0xce/0x180 [target_core_mod]
[  396.537399] Code: 00 e8 e6 db 00 f1 41 81 e4 00 00 40 00 74 9d 48 8d bb 0c 01 00 00 f0 ff 8b 0c 01 00 00 0f 88 ed cd 00 00 75 87 e8 32 ed ff ff <0f> 0b e9 7b ff ff ff c6 03 40 0f 1f 44 00 00 0f 1f 44 00 00 48 8b
[  396.537402] RSP: 0018:ffffac1b818bbe80 EFLAGS: 00010286
[  396.537407] RAX: 0000312f0022d690 RBX: ffff9aec21595350 RCX: 0000000000000000
[  396.537410] RDX: 0000000000000000 RSI: 0000000000000282 RDI: 0000000000000282
[  396.537413] RBP: 0000000000000000 R08: ffff9aebf4d65a00 R09: ffff9aebf4c12d98
[  396.537416] R10: 0000000000000082 R11: 000000000000002e R12: 0000000000400000
[  396.537418] R13: 0000000000000000 R14: ffff9aec02cd9c00 R15: ffff9aec21595488
[  396.537423] FS:  0000000000000000(0000) GS:ffff9aec7fa80000(0000) knlGS:0000000000000000
[  396.537427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.537430] CR2: 000055bfa25e9110 CR3: 00000000614c6000 CR4: 00000000000006e0
[  396.537444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  396.537447] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  396.537449] Call Trace:
[  396.537504]  process_one_work+0x1a7/0x360
[  396.537525]  worker_thread+0x30/0x390
[  396.537532]  ? create_worker+0x1a0/0x1a0
[  396.537536]  kthread+0x112/0x130
[  396.537542]  ? kthread_flush_work_fn+0x10/0x10
[  396.537550]  ret_from_fork+0x35/0x40
[  396.537570] ---[ end trace d6b5bf9aea218e55 ]---
[  396.537734] WARNING: CPU: 1 PID: 2936 at drivers/target/target_core_transport.c:813 target_handle_abort+0x178/0x180 [target_core_mod]
[  396.537737] Modules linked in: iscsi_target_mod target_core_user uio target_core_pscsi target_core_file target_core_iblock target_core_mod uinput nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ppdev drm joydev pcspkr i2c_piix4 parport_pc parport xfs libcrc32c sr_mod cdrom sd_mod sg ata_generic ata_piix libata virtio_net net_failover failover serio_raw fuse [last unloaded: ip_tables]
[  396.537811] CPU: 1 PID: 2936 Comm: kworker/1:2 Kdump: loaded Tainted: G        W        --------- -  - 4.18.0-240.7.el8.x86_64 #1
[  396.537814] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[  396.537860] Workqueue: target_completion target_abort_work [target_core_mod]
[  396.537903] RIP: 0010:target_handle_abort+0x178/0x180 [target_core_mod]
[  396.537909] Code: 67 03 00 48 85 ed 74 1d 48 8b 45 00 48 8b 7d 08 48 83 c5 18 48 89 de e8 26 db 00 f1 48 8b 45 00 48 85 c0 75 e7 e9 6a ff ff ff <0f> 0b e9 df fe ff ff 90 0f 1f 44 00 00 55 48 8d af 08 01 00 00 53
[  396.537912] RSP: 0018:ffffac1b818bbe80 EFLAGS: 00010246
[  396.537916] RAX: 0000000000000000 RBX: ffff9aec21595350 RCX: 0000000000000000
[  396.537919] RDX: 0000000000000000 RSI: 0000000000000282 RDI: 0000000000000282
[  396.537922] RBP: 0000000000000000 R08: ffff9aebf4d65a00 R09: ffff9aebf4c12d98
[  396.537925] R10: 0000000000000082 R11: 000000000000002e R12: 0000000000400000
[  396.537928] R13: 0000000000000000 R14: ffff9aec02cd9c00 R15: ffff9aec21595488
[  396.537933] FS:  0000000000000000(0000) GS:ffff9aec7fa80000(0000) knlGS:0000000000000000
[  396.537936] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.537939] CR2: 000055bfa25e9110 CR3: 00000000614c6000 CR4: 00000000000006e0
[  396.537951] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  396.537954] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  396.537956] Call Trace:
[  396.537971]  process_one_work+0x1a7/0x360
[  396.537979]  worker_thread+0x30/0x390
[  396.537985]  ? create_worker+0x1a0/0x1a0
[  396.537989]  kthread+0x112/0x130
[  396.537994]  ? kthread_flush_work_fn+0x10/0x10
[  396.538009]  ret_from_fork+0x35/0x40
[  396.538016] ---[ end trace d6b5bf9aea218e56 ]---
[  396.538221] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 101

Maurizio

  reply	other threads:[~2020-10-27 13:49 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07 14:53 [PATCH 0/2] fix race conditions with task aborts Maurizio Lombardi
2020-10-07 14:53 ` [PATCH 1/2] target: iscsi: prevent a race condition in iscsit_unmap_cmd() Maurizio Lombardi
2020-10-08  2:15   ` Bart Van Assche
2020-10-08  9:42     ` Maurizio Lombardi
2020-10-07 14:53 ` [PATCH 2/2] target: iscsi: fix a race condition when aborting a task Maurizio Lombardi
2020-10-22  2:42   ` Mike Christie
2020-10-27 13:49     ` Maurizio Lombardi [this message]
2020-10-27 17:54       ` Mike Christie
2020-10-27 20:03         ` Michael Christie
2020-10-28 17:09           ` Maurizio Lombardi
2020-10-28 20:37             ` Mike Christie
2020-11-10 21:29               ` Maurizio Lombardi
2020-11-10 23:08                 ` Mike Christie
2020-11-11  2:16             ` Mike Christie
2020-11-11 14:58               ` Maurizio Lombardi
2020-11-11 15:37                 ` Michael Christie
2020-11-11 15:48                   ` Maurizio Lombardi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1852a8bd-3edc-5c49-fa51-9afe52f125a8@redhat.com \
    --to=mlombard@redhat.com \
    --cc=bvanassche@acm.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=michael.christie@oracle.com \
    --cc=target-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).