From: Johannes Thumshirn <jthumshirn@suse.de>
To: "Ewan D. Milne" <emilne@redhat.com>
Cc: linux-scsi@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH RESEND] scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state
Date: Wed, 28 Jun 2017 09:38:15 +0200
Message-ID: <20170628073815.GA4185@linux-x5ow.site>
In-Reply-To: <1498589758-31473-1-git-send-email-emilne@redhat.com>

On Tue, Jun 27, 2017 at 02:55:58PM -0400, Ewan Milne wrote:
> From: "Ewan D. Milne" <emilne@redhat.com>
> 
> The addition of the STARGET_REMOVE state had the side effect of
> introducing a race condition that can cause a crash.
> 
> scsi_target_reap_ref_release() checks the starget->state to
> see if it is still in STARGET_CREATED, and if so, skips calling
> transport_remove_device() and device_del(), because the starget->state
> is only set to STARGET_RUNNING after scsi_target_add() has called
> device_add() and transport_add_device().
> 
> However, if an rport loss occurs while a target is being scanned,
> it can happen that scsi_remove_target() will be called while the
> starget is still in the STARGET_CREATED state.  In this case, the
> starget->state will be set to STARGET_REMOVE, and as a result,
> scsi_target_reap_ref_release() will take the wrong path.  The end
> result is a panic:
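
For illustration, the decision described in the quoted text can be
modelled in user space roughly as follows. This is a minimal sketch, not
the kernel code: the state names come from this thread, while the
function names (model_remove_target, model_release_runs_teardown), the
have_created_remove flag, and the exact transition logic are assumptions
made for the example.

```c
#include <stdbool.h>

/*
 * Simplified model of the target states named in this thread.
 * The numeric values are illustrative only.
 */
enum starget_state {
	STARGET_CREATED = 1,
	STARGET_RUNNING,
	STARGET_REMOVE,
	STARGET_CREATED_REMOVE,
	STARGET_DEL,
};

/*
 * Hypothetical model of a removal request: returns the state the target
 * moves to. With have_created_remove == false, a target that was never
 * added is marked STARGET_REMOVE, which loses the fact that it was
 * still in STARGET_CREATED. With have_created_remove == true, that
 * fact is preserved in the new state.
 */
enum starget_state model_remove_target(enum starget_state cur,
				       bool have_created_remove)
{
	if (have_created_remove && cur == STARGET_CREATED)
		return STARGET_CREATED_REMOVE;
	return STARGET_REMOVE;
}

/*
 * Hypothetical model of the release path: teardown (the
 * transport_remove_device() and device_del() calls in the description)
 * is only valid for a target that was actually added, so it is skipped
 * for both "created" states.
 */
bool model_release_runs_teardown(enum starget_state s)
{
	return s != STARGET_CREATED && s != STARGET_CREATED_REMOVE;
}
```

In this model, removing a target in STARGET_CREATED without the extra
state yields STARGET_REMOVE, so the release path runs teardown on a
device that was never added. With the extra state, the same request
yields STARGET_CREATED_REMOVE and teardown is skipped.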

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

That said, we've been tampering with the target removal code for quite
some time now, and I have the gut feeling we still haven't fixed the
root cause.

I once tried building a regression test for this (with qemu hot-plugging
UAS devices), but that didn't get far. Maybe we should add a scsi_target
to scsi_debug and add some methods to remove it again, just to have a
sensible unit test for that code path.

Byte,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Thread overview: 7+ messages
2017-06-27 18:55 [PATCH RESEND] scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state Ewan D. Milne
2017-06-27 21:26 ` Laurence Oberman
2017-06-28  1:09 ` Martin K. Petersen
2017-06-28  7:38 ` Johannes Thumshirn [this message]
2017-06-28 14:23   ` Ewan D. Milne
2017-07-01 20:55 ` Martin K. Petersen
