From: <Kento.A.Kobayashi@sony.com>
To: <oneukum@suse.com>, <gregkh@linuxfoundation.org>,
	<stern@rowland.harvard.edu>
Cc: <usb-storage@lists.one-eyed-alien.net>, <Jacky.Cao@sony.com>,
	<linux-kernel@vger.kernel.org>, <linux-scsi@vger.kernel.org>,
	<linux-usb@vger.kernel.org>, <Kento.A.Kobayashi@sony.com>
Subject: RE: [PATCH] usb: uas: fix usb subsystem hang after power off hub port
Date: Tue, 2 Apr 2019 00:28:04 +0000
Message-ID: <AE5419EAB4965843B3C0C1FE29F1FFE58914EB@JPYOKXMS103.jp.sony.com>
In-Reply-To: <1553786139.14990.6.camel@suse.com>

Hi,

>> Hi,
>> 
>> > Sorry,
>> > 
>> > I thought this was clear. Your patch is making the assumption that the reset is triggered by the SCSI layer. You cannot make that assumption, as there is an ioctl for resetting a USB device.
>> > In case we are getting an error during the reset (our endpoints vanish), the device driver must report that to the USB layer, so the driver will always be disconnected.
>> > We cannot drop errors.
>> > 
>> > 	Regards
>> > 		Oliver
>> 
>> This patch modifies uas_post_reset to skip the rebind operation, to avoid the exception when -ENODEV happens. It does not drop the error.
>> If uas_post_reset gets -ENODEV, usb_reset_and_verify_device must also have failed.
>> So when we use ioctl(USBDEVFS_RESET) to reset the device and usb_reset_and_verify_device fails, the error is reported through the ioctl return value.
>
>OK, It is possible that I am stupid. We must rebind if uas_post_reset() fails. The driver will crash without the endpoints. Can you please explain again in greater detail, what you are trying to achieve?

The details of this patch follow.

Issue
- The USB subsystem hangs if the hub port that a UAS USB 3.0/3.1 device is connected to is powered off while the device is being accessed. The port is powered off by calling ioctl(USBDEVFS_CONTROL) to issue the hub class request CLEAR_FEATURE(PORT_POWER).
- The process that is accessing the device goes into the DEAD state and cannot be killed.
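
For reference, the power-off step can be sketched from user space roughly as below. This is only a sketch, not the exact test program: the helper names make_port_power_off() and hub_port_power_off(), the HUB_* macro names and the 1000 ms timeout are assumptions made for illustration. The request values (bmRequestType 0x23, CLEAR_FEATURE = 1, PORT_POWER = 8) are the standard hub class ones, and hub_fd is assumed to be an open descriptor for the hub's usbfs node.

```c
#include <linux/usbdevice_fs.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Standard hub class request values (macro names are local to this sketch). */
#define HUB_RT_PORT           0x23 /* host-to-device, class, recipient "other" (a port) */
#define HUB_REQ_CLEAR_FEATURE 0x01
#define HUB_PORT_FEAT_POWER   8

/* Build a CLEAR_FEATURE(PORT_POWER) request for the given 1-based port number. */
struct usbdevfs_ctrltransfer make_port_power_off(uint16_t port)
{
	struct usbdevfs_ctrltransfer ctrl = {
		.bRequestType = HUB_RT_PORT,
		.bRequest     = HUB_REQ_CLEAR_FEATURE,
		.wValue       = HUB_PORT_FEAT_POWER,
		.wIndex       = port,
		.wLength      = 0,    /* no data stage */
		.timeout      = 1000, /* milliseconds, arbitrary choice */
		.data         = NULL,
	};
	return ctrl;
}

/* Send the request to the hub; returns the ioctl() result (-1 on error). */
int hub_port_power_off(int hub_fd, uint16_t port)
{
	struct usbdevfs_ctrltransfer ctrl = make_port_power_off(port);

	return ioctl(hub_fd, USBDEVFS_CONTROL, &ctrl);
}
```

To reproduce, this would be called on the hub's device node while another process is doing I/O on the UAS device behind that port.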

Root Cause
- A block layer timeout occurs after the UAS USB device is powered off while it is being accessed, as in the reproduction step above. While the error handler processes the timeout, the SCSI host state becomes SHOST_CANCEL_RECOVERY, which causes I/O to hang, so the lock cannot be released. In the end, the USB subsystem hangs.
The call chain is as follows:
blk_mq_timeout_work
  …-> scsi_times_out  (… means some intermediate functions are omitted before this function.)
    …-> scsi_eh_scmd_add (scsi_host_set_state to SHOST_RECOVERY)
      …-> scsi_error_handler
        …-> uas_eh_device_reset_handler
            -> usb_lock_device_for_reset  <- takes the lock
              -> usb_reset_device
                …-> rebind = uas_post_reset (returns 1 because of ENODEV)
                …-> usb_unbind_and_rebind_marked_interfaces (rebind=1)
                   …-> uas_disconnect  (scsi_host_set_state to SHOST_CANCEL_RECOVERY)
                        …-> scsi_queue_rq
                             -> scsi_host_queue_ready (returns 0, which causes I/O to hang)
            -> usb_unlock_device          <- the lock cannot be released because usb_reset_device does not finish.
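
The last step of this chain can be illustrated with a small stand-alone model. It is not the kernel code: the model_* names are invented here, the states only mirror enum scsi_host_state, and the real scsi_host_queue_ready() checks more conditions than the host state. It only shows why a host in SHOST_CANCEL_RECOVERY does not accept new commands.

```c
#include <stdbool.h>

/* Mirrors the members of enum scsi_host_state (model names are hypothetical). */
enum model_host_state {
	MODEL_SHOST_CREATED = 1,
	MODEL_SHOST_RUNNING,
	MODEL_SHOST_CANCEL,
	MODEL_SHOST_DEL,
	MODEL_SHOST_RECOVERY,
	MODEL_SHOST_CANCEL_RECOVERY,
	MODEL_SHOST_DEL_RECOVERY,
};

/* True for every state in which error recovery is still considered active. */
bool model_host_in_recovery(enum model_host_state state)
{
	return state == MODEL_SHOST_RECOVERY ||
	       state == MODEL_SHOST_CANCEL_RECOVERY ||
	       state == MODEL_SHOST_DEL_RECOVERY;
}

/* Simplified gate: returns 0 (not ready) while the host is in recovery. */
int model_host_queue_ready(enum model_host_state state)
{
	return model_host_in_recovery(state) ? 0 : 1;
}
```

In the trace, uas_disconnect moves the host to SHOST_CANCEL_RECOVERY while the error handler is still inside usb_reset_device, so the gate stays closed for the I/O issued during disconnect.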


Countermeasure
- Make uas_post_reset not return 1 when uas_configure_endpoints returns ENODEV, because usb_unbind_and_rebind_marked_interfaces does not need to unbind and rebind in this situation.
With this change the call chain becomes:
blk_mq_timeout_work
  …-> scsi_times_out  (… means some intermediate functions are omitted before this function.)
    …-> scsi_eh_scmd_add (scsi_host_set_state to SHOST_RECOVERY)
      …-> scsi_error_handler
       …-> uas_eh_device_reset_handler (*1)
           -> usb_lock_device_for_reset  <- takes the lock
             -> usb_reset_device
               -> usb_reset_and_verify_device (returns ENODEV, and FAILED is reported to *1)
               -> uas_post_reset returns 0 on ENODEV => rebind=0
               -> usb_unbind_and_rebind_marked_interfaces (rebind=0)
           -> usb_unlock_device          <- releases the lock


We still get the error (-ENODEV) in uas_eh_device_reset_handler from usb_reset_and_verify_device, so the error is not dropped.
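
The intended decision logic can be summarized with the stand-alone model below. It is a sketch under assumptions, not the patch itself: all model_* names are invented for illustration, and the real functions do much more than compute these return values. It only shows that skipping the rebind on -ENODEV does not hide the reset failure from the error handler.

```c
#include <errno.h>

/* Outcome reported by the modelled device reset handler. */
enum model_eh_result { MODEL_EH_SUCCESS, MODEL_EH_FAILED };

/* Behaviour described in the Root Cause trace: any endpoint error requests a rebind. */
int model_post_reset_before(int configure_err)
{
	return configure_err ? 1 : 0;
}

/* Proposed behaviour: -ENODEV no longer requests a rebind, other errors still do. */
int model_post_reset_after(int configure_err)
{
	if (configure_err == -ENODEV)
		return 0;
	return configure_err ? 1 : 0;
}

/* The handler result depends on the reset result, not on the rebind flag. */
enum model_eh_result model_eh_device_reset(int reset_err)
{
	return reset_err ? MODEL_EH_FAILED : MODEL_EH_SUCCESS;
}
```

With configure_err = -ENODEV the rebind flag changes from 1 to 0, while model_eh_device_reset(-ENODEV) still reports a failure.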

Regards,
Kento Kobayashi
