* Question about iscsi session block
@ 2022-02-15 15:49 ` Zhengyuan Liu
From: Zhengyuan Liu @ 2022-02-15 15:49 UTC (permalink / raw)
  To: linux-scsi, open-iscsi, dm-devel; +Cc: lduncan, leech, bob.liu

Hi, all

We have a production server that uses multipath + iSCSI to attach storage
from a storage server. The server has two NICs; each NIC carries about 20
iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
total). The problem: once a NIC faults, it takes multipath too long
(nearly 80s) to switch to the remaining good NIC, because it first has to
block every iSCSI device behind the faulted NIC. The call chain is shown
below:

    void iscsi_block_session(struct iscsi_cls_session *session)
    {
        queue_work(iscsi_eh_timer_workq, &session->block_work);
    }

    __iscsi_block_session() -> scsi_target_block() -> target_block() ->
    device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
    blk_mq_quiesce_queue() -> synchronize_rcu()

All sessions and devices are processed sequentially, and we have traced
each synchronize_rcu() call to about 80ms, so the total cost is about 80s
(80ms * 20 * 50). That is longer than the application can tolerate and may
interrupt service.
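
To make the arithmetic concrete, the blocking work behaves roughly like the
pseudocode below. This is only a sketch of the structure implied by the call
chain above, not the actual kernel source; the loop bounds are just our
topology and the sdev array is illustrative:

    /* Pseudocode sketch: failover cost of one NIC under sequential
     * blocking. 20 sessions x 50 devices, one RCU grace period per
     * device; scsi_internal_device_block() ends in synchronize_rcu(),
     * which we measured at ~80ms. */
    for (s = 0; s < 20; s++) {              /* sessions on faulted NIC */
            for (d = 0; d < 50; d++) {      /* devices per session */
                    scsi_internal_device_block(sdev[s][d]);
            }
    }
    /* total: 20 * 50 * 80ms = 80s before multipath can fail over */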

So my question is: can we optimize this procedure to reduce the time spent
blocking all the iSCSI devices? I'm not sure whether raising the
max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
a good idea.
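
For reference, a workqueue's max_active is fixed when the queue is
allocated. Below is a minimal sketch of what raising it could look like,
assuming the queue is created with alloc_workqueue(); the queue name,
flags, and limit shown are illustrative, not the exact ones in
scsi_transport_iscsi.c:

    /* Sketch: allow several sessions' block_work items to run at once.
     * With WQ_SYSFS the limit can also be changed from userspace via
     * /sys/devices/virtual/workqueue/<name>/max_active. */
    iscsi_eh_timer_workq = alloc_workqueue("iscsi_eh",
                                           WQ_SYSFS | WQ_MEM_RECLAIM,
                                           4 /* max_active, default 1 */);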

Thanks in advance.


* Re: [dm-devel] Question about iscsi session block
@ 2022-02-15 16:25 ` Donald Williams
From: Donald Williams @ 2022-02-15 16:25 UTC (permalink / raw)
  To: open-iscsi; +Cc: dm-devel, leech, lduncan, linux-scsi, bob.liu



Hello,
   Something else to check is your MPIO configuration. I have seen this
same symptom when the Linux MPIO feature "queue_if_no_path" was enabled.

From the /etc/multipath.conf file, these are the lines showing it enabled:

    failback                immediate
    features                "1 queue_if_no_path"

Also, in the past some versions of the Linux multipathd would wait a very
long time before moving all I/O to the remaining path.
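
For comparison, here is a minimal /etc/multipath.conf sketch that disables
the queueing behavior. "no_path_retry fail" is one common choice, but check
your array vendor's recommended settings before changing this:

    defaults {
        # fail I/O when all paths are down instead of queueing forever
        no_path_retry    fail
    }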

 Regards,
Don


On Tue, Feb 15, 2022 at 10:49 AM Zhengyuan Liu <liuzhengyuang521@gmail.com>
wrote:

> Hi, all
>
> We have a production server that uses multipath + iSCSI to attach storage
> from a storage server. The server has two NICs; each NIC carries about 20
> iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
> are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
> total). The problem: once a NIC faults, it takes multipath too long
> (nearly 80s) to switch to the remaining good NIC, because it first has to
> block every iSCSI device behind the faulted NIC. The call chain is shown
> below:
>
>     void iscsi_block_session(struct iscsi_cls_session *session)
>     {
>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>     }
>
>     __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>     device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>     blk_mq_quiesce_queue() -> synchronize_rcu()
>
> All sessions and devices are processed sequentially, and we have traced
> each synchronize_rcu() call to about 80ms, so the total cost is about 80s
> (80ms * 20 * 50). That is longer than the application can tolerate and may
> interrupt service.
>
> So my question is: can we optimize this procedure to reduce the time spent
> blocking all the iSCSI devices? I'm not sure whether raising the
> max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
> a good idea.
>
> Thanks in advance.


* Re: Question about iscsi session block
@ 2022-02-15 16:31   ` Mike Christie
From: Mike Christie @ 2022-02-15 16:31 UTC (permalink / raw)
  To: Zhengyuan Liu, linux-scsi, open-iscsi, dm-devel; +Cc: lduncan, leech

On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> Hi, all
>
> We have a production server that uses multipath + iSCSI to attach storage
> from a storage server. The server has two NICs; each NIC carries about 20
> iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
> are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
> total). The problem: once a NIC faults, it takes multipath too long
> (nearly 80s) to switch to the remaining good NIC, because it first has to
> block every iSCSI device behind the faulted NIC. The call chain is shown
> below:
>
>     void iscsi_block_session(struct iscsi_cls_session *session)
>     {
>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>     }
>
>     __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>     device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>     blk_mq_quiesce_queue() -> synchronize_rcu()
>
> All sessions and devices are processed sequentially, and we have traced
> each synchronize_rcu() call to about 80ms, so the total cost is about 80s
> (80ms * 20 * 50). That is longer than the application can tolerate and may
> interrupt service.
>
> So my question is: can we optimize this procedure to reduce the time spent
> blocking all the iSCSI devices? I'm not sure whether raising the
> max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
> a good idea.

We need a patch so that the unblock call waits on/cancels/flushes the block
call; otherwise the two could end up running in parallel.
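
Roughly the shape such a fix could take (a sketch, not the actual patch):

    void iscsi_unblock_session(struct iscsi_cls_session *session)
    {
            /* Don't let unblock_work race with a block_work that is
             * still queued or running for the same session. */
            flush_work(&session->block_work);
            queue_work(iscsi_eh_timer_workq, &session->unblock_work);
            /* Make sure the unblock has finished before I/O restarts. */
            flush_work(&session->unblock_work);
    }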

I'll send a patchset later today so you can test it.


* Re: Question about iscsi session block
@ 2022-02-16  1:28     ` Zhengyuan Liu
From: Zhengyuan Liu @ 2022-02-16  1:28 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
<michael.christie@oracle.com> wrote:
>
> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> > Hi, all
> >
> > We have a production server that uses multipath + iSCSI to attach storage
> > from a storage server. The server has two NICs; each NIC carries about 20
> > iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
> > are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
> > total). The problem: once a NIC faults, it takes multipath too long
> > (nearly 80s) to switch to the remaining good NIC, because it first has to
> > block every iSCSI device behind the faulted NIC. The call chain is shown
> > below:
> >
> >     void iscsi_block_session(struct iscsi_cls_session *session)
> >     {
> >         queue_work(iscsi_eh_timer_workq, &session->block_work);
> >     }
> >
> >     __iscsi_block_session() -> scsi_target_block() -> target_block() ->
> >     device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
> >     blk_mq_quiesce_queue() -> synchronize_rcu()
> >
> > All sessions and devices are processed sequentially, and we have traced
> > each synchronize_rcu() call to about 80ms, so the total cost is about 80s
> > (80ms * 20 * 50). That is longer than the application can tolerate and may
> > interrupt service.
> >
> > So my question is: can we optimize this procedure to reduce the time spent
> > blocking all the iSCSI devices? I'm not sure whether raising the
> > max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
> > a good idea.
>
> We need a patch so that the unblock call waits on/cancels/flushes the
> block call; otherwise the two could end up running in parallel.
>
> I'll send a patchset later today so you can test it.

I'd be glad to test once you post the patchset.

Thank you, Mike.


* Re: Question about iscsi session block
@ 2022-02-16  2:19       ` michael.christie
From: michael.christie @ 2022-02-16  2:19 UTC (permalink / raw)
  To: Zhengyuan Liu; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
> <michael.christie@oracle.com> wrote:
>>
>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
>>> Hi, all
>>>
>>> We have a production server that uses multipath + iSCSI to attach storage
>>> from a storage server. The server has two NICs; each NIC carries about 20
>>> iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
>>> are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
>>> total). The problem: once a NIC faults, it takes multipath too long
>>> (nearly 80s) to switch to the remaining good NIC, because it first has to
>>> block every iSCSI device behind the faulted NIC. The call chain is shown
>>> below:
>>>
>>>     void iscsi_block_session(struct iscsi_cls_session *session)
>>>     {
>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>>>     }
>>>
>>>     __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>>>     device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>>>     blk_mq_quiesce_queue() -> synchronize_rcu()
>>>
>>> All sessions and devices are processed sequentially, and we have traced
>>> each synchronize_rcu() call to about 80ms, so the total cost is about 80s
>>> (80ms * 20 * 50). That is longer than the application can tolerate and may
>>> interrupt service.
>>>
>>> So my question is: can we optimize this procedure to reduce the time spent
>>> blocking all the iSCSI devices? I'm not sure whether raising the
>>> max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
>>> a good idea.
>>
>> We need a patch so that the unblock call waits on/cancels/flushes the
>> block call; otherwise the two could end up running in parallel.
>>
>> I'll send a patchset later today so you can test it.
>
> I'd be glad to test once you post the patchset.
>
> Thank you, Mike.

I forgot I did this recently :)

commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
Author: Mike Christie <michael.christie@oracle.com>
Date:   Tue May 25 13:18:09 2021 -0500

    scsi: iscsi: Flush block work before unblock
    
    We set the max_active iSCSI EH works to 1, so all work is going to execute
    in order by default. However, userspace can now override this in sysfs. If
    max_active > 1, we can end up with the block_work on CPU1 and
    iscsi_unblock_session running the unblock_work on CPU2 and the session and
    target/device state will end up out of sync with each other.
    
    This adds a flush of the block_work in iscsi_unblock_session.


It was merged in 5.14.


* Re: Question about iscsi session block
@ 2022-02-26 23:00         ` Mike Christie
From: Mike Christie @ 2022-02-26 23:00 UTC (permalink / raw)
  To: Zhengyuan Liu; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

On 2/15/22 8:19 PM, michael.christie@oracle.com wrote:
> On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
>> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
>> <michael.christie@oracle.com> wrote:
>>>
>>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
>>>> Hi, all
>>>>
>>>> We have a production server that uses multipath + iSCSI to attach storage
>>>> from a storage server. The server has two NICs; each NIC carries about 20
>>>> iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
>>>> are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
>>>> total). The problem: once a NIC faults, it takes multipath too long
>>>> (nearly 80s) to switch to the remaining good NIC, because it first has to
>>>> block every iSCSI device behind the faulted NIC. The call chain is shown
>>>> below:
>>>>
>>>>     void iscsi_block_session(struct iscsi_cls_session *session)
>>>>     {
>>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
>>>>     }
>>>>
>>>>     __iscsi_block_session() -> scsi_target_block() -> target_block() ->
>>>>     device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
>>>>     blk_mq_quiesce_queue() -> synchronize_rcu()
>>>>
>>>> All sessions and devices are processed sequentially, and we have traced
>>>> each synchronize_rcu() call to about 80ms, so the total cost is about 80s
>>>> (80ms * 20 * 50). That is longer than the application can tolerate and may
>>>> interrupt service.
>>>>
>>>> So my question is: can we optimize this procedure to reduce the time spent
>>>> blocking all the iSCSI devices? I'm not sure whether raising the
>>>> max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
>>>> a good idea.
>>>
>>> We need a patch so that the unblock call waits on/cancels/flushes the
>>> block call; otherwise the two could end up running in parallel.
>>>
>>> I'll send a patchset later today so you can test it.
>>
>> I'd be glad to test once you post the patchset.
>>
>> Thank you, Mike.
>
> I forgot I did this recently :)
>
> commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
> Author: Mike Christie <michael.christie@oracle.com>
> Date:   Tue May 25 13:18:09 2021 -0500
>
>     scsi: iscsi: Flush block work before unblock
>
>     We set the max_active iSCSI EH works to 1, so all work is going to execute
>     in order by default. However, userspace can now override this in sysfs. If
>     max_active > 1, we can end up with the block_work on CPU1 and
>     iscsi_unblock_session running the unblock_work on CPU2 and the session and
>     target/device state will end up out of sync with each other.
>
>     This adds a flush of the block_work in iscsi_unblock_session.
>
> It was merged in 5.14.

Hey, I found one more bug when max_active > 1. While fixing it, I decided
to just fix this properly so we can do session recoveries in parallel and
the user doesn't have to worry about setting max_active.
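
One way to get that parallelism while keeping each session's block/unblock
ordered is a per-session ordered workqueue instead of one shared
single-threaded queue. A sketch only; the field and queue naming here are
assumptions, not the actual patchset:

    /* Sketch: per-session ordered workqueue. Work items for one session
     * still run in order, but different sessions recover in parallel. */
    session->workq = alloc_ordered_workqueue("iscsi_eh_%d", WQ_MEM_RECLAIM,
                                             session->sid);
    queue_work(session->workq, &session->block_work);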

I'll send a patchset and cc you.


* Re: Question about iscsi session block
@ 2022-05-24  6:29           ` Zhengyuan Liu
From: Zhengyuan Liu @ 2022-05-24  6:29 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi, open-iscsi, dm-devel, lduncan, leech

Hi, Mike,

Sorry for the delayed reply; I had no environment to test your patchset
below until recently:

https://lore.kernel.org/all/20220226230435.38733-1-michael.christie@oracle.com/

After applying the series, the total time dropped from 80s to nearly 10s,
which is a great improvement.

Thanks again.

On Sun, Feb 27, 2022 at 7:00 AM Mike Christie
<michael.christie@oracle.com> wrote:
>
> On 2/15/22 8:19 PM, michael.christie@oracle.com wrote:
> > On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
> >> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
> >> <michael.christie@oracle.com> wrote:
> >>>
> >>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> >>>> Hi, all
> >>>>
> >>>> We have a production server that uses multipath + iSCSI to attach storage
> >>>> from a storage server. The server has two NICs; each NIC carries about 20
> >>>> iSCSI sessions, and each session exposes about 50 iSCSI devices (so there
> >>>> are roughly 2 * 20 * 50 = 2000 iSCSI block devices on the server in
> >>>> total). The problem: once a NIC faults, it takes multipath too long
> >>>> (nearly 80s) to switch to the remaining good NIC, because it first has to
> >>>> block every iSCSI device behind the faulted NIC. The call chain is shown
> >>>> below:
> >>>>
> >>>>     void iscsi_block_session(struct iscsi_cls_session *session)
> >>>>     {
> >>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
> >>>>     }
> >>>>
> >>>>     __iscsi_block_session() -> scsi_target_block() -> target_block() ->
> >>>>     device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
> >>>>     blk_mq_quiesce_queue() -> synchronize_rcu()
> >>>>
> >>>> All sessions and devices are processed sequentially, and we have traced
> >>>> each synchronize_rcu() call to about 80ms, so the total cost is about 80s
> >>>> (80ms * 20 * 50). That is longer than the application can tolerate and may
> >>>> interrupt service.
> >>>>
> >>>> So my question is: can we optimize this procedure to reduce the time spent
> >>>> blocking all the iSCSI devices? I'm not sure whether raising the
> >>>> max_active of the iscsi_eh_timer_workq workqueue to improve concurrency is
> >>>> a good idea.
> >>>
> >>> We need a patch so that the unblock call waits on/cancels/flushes the
> >>> block call; otherwise the two could end up running in parallel.
> >>>
> >>> I'll send a patchset later today so you can test it.
> >>
> >> I'd be glad to test once you post the patchset.
> >>
> >> Thank you, Mike.
> >
> > I forgot I did this recently :)
> >
> > commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
> > Author: Mike Christie <michael.christie@oracle.com>
> > Date:   Tue May 25 13:18:09 2021 -0500
> >
> >     scsi: iscsi: Flush block work before unblock
> >
> >     We set the max_active iSCSI EH works to 1, so all work is going to execute
> >     in order by default. However, userspace can now override this in sysfs. If
> >     max_active > 1, we can end up with the block_work on CPU1 and
> >     iscsi_unblock_session running the unblock_work on CPU2 and the session and
> >     target/device state will end up out of sync with each other.
> >
> >     This adds a flush of the block_work in iscsi_unblock_session.
> >
> > It was merged in 5.14.
>
> Hey, I found one more bug when max_active > 1. While fixing it, I decided
> to just fix this properly so we can do session recoveries in parallel and
> the user doesn't have to worry about setting max_active.
>
> I'll send a patchset and cc you.

Thread overview: 7 messages
2022-02-15 15:49 Question about iscsi session block  Zhengyuan Liu
2022-02-15 16:25 ` Donald Williams
2022-02-15 16:31 ` Mike Christie
2022-02-16  1:28   ` Zhengyuan Liu
2022-02-16  2:19     ` michael.christie
2022-02-26 23:00       ` Mike Christie
2022-05-24  6:29         ` Zhengyuan Liu