* [LSF/MM TOPIC] Reducing the SRP initiator failover time
@ 2013-02-01 13:43 Bart Van Assche
From: Bart Van Assche @ 2013-02-01 13:43 UTC (permalink / raw)
  To: lsf-pc, linux-scsi, linux-rdma, David Dillow

It is known that the upstream SRP initiator takes about two to three
minutes to fail over from a failed path to a working path. This is not
only longer than acceptable but also longer than other Linux SCSI
initiators (e.g. iSCSI and FC) need. Progress with improving SRP
initiator failover has been slow so far. This is because the discussion
about candidate patches took place at two different levels: not only the
patches themselves were discussed but also the approach that should be
followed. That last aspect is easier to discuss in a meeting than over a
mailing list. Hence the proposal to discuss SRP initiator failover
behavior during the LSF/MM summit. The topics that need further
discussion are:
* If a path fails, should the entire SCSI host be removed, or should the
   SCSI host be preserved and only the SCSI devices associated with that
   host be removed?
* Which software component should test the state of a path and
   reconnect to an SRP target once a path is restored? Should that be
   done by the user space process srp_daemon or by the SRP initiator
   kernel module?
* How should the SRP initiator behave after a path failure has been
   detected? Should the behavior be similar to that of the FC initiator
   with its fast_io_fail_tmo and dev_loss_tmo parameters?
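
For readers unfamiliar with the two FC parameters named in the last topic, the following is a minimal sketch of the timeline they imply, assuming the usual FC transport semantics: after a path failure I/O is first held back, then failed quickly once fast_io_fail_tmo expires so that a multipath layer can switch paths, and the remote port is given up once dev_loss_tmo expires. The state names, the function name and the timeout values are hypothetical and only serve the illustration; this is not the SRP initiator's implementation.

```python
from enum import Enum


class PathState(Enum):
    RUNNING = "running"      # no failure detected
    BLOCKED = "blocked"      # failure detected, I/O is held back
    FAIL_FAST = "fail-fast"  # fast_io_fail_tmo expired, I/O is failed quickly
    LOST = "lost"            # dev_loss_tmo expired, the port is given up


def path_state(elapsed, fast_io_fail_tmo, dev_loss_tmo):
    """Return the modelled state `elapsed` seconds after a path failure.

    `elapsed` is None while the path is healthy. Both timeouts are in
    seconds; the model assumes fast_io_fail_tmo <= dev_loss_tmo.
    """
    if elapsed is None:
        return PathState.RUNNING
    if elapsed >= dev_loss_tmo:
        return PathState.LOST
    if elapsed >= fast_io_fail_tmo:
        return PathState.FAIL_FAST
    return PathState.BLOCKED


if __name__ == "__main__":
    # Illustrative values only: 5 s until I/O fails fast, 60 s until loss.
    for t in (None, 3, 5, 59, 60):
        print(t, path_state(t, fast_io_fail_tmo=5, dev_loss_tmo=60).value)
```

With such a model the failover time seen by dm-multipath is bounded by fast_io_fail_tmo rather than by the SCSI error handler, which is the property the question above is about.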

Dave, if this topic gets accepted, I really hope you will be able to 
attend the LSF/MM summit.

Bart.


* Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time
@ 2013-02-04 12:13 Sebastian Riemer
From: Sebastian Riemer @ 2013-02-04 12:13 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma, David Dillow, Dongsu Park

Hi Bart,

thanks for taking this on! We're not the best mainline developers, so I
guess we won't be there. But we run big SRP setups, and our sysadmins
really don't like reconnecting SRP hosts manually and then laboriously
re-adding their devices to the related dm-multipath devices.

Think of more than 200 SRP devices per server (already filtered by
initiator groups). We also consider srptools unmaintained, unreliable
and slow. It is possible for srptools commands to never return.
Therefore, we send the SRP connection strings directly to the initiator
within our mapping jobs.
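
As an illustration of what sending a connection string directly to the initiator can look like, here is a minimal sketch. It assumes the ib_srp sysfs interface, in which a comma-separated key=value string is written to an add_target file under /sys/class/infiniband_srp/; the key names are the ones I recall from the ib_srp documentation and should be checked against the running kernel. The helper names and all values are hypothetical placeholders, and this is not the mapping job described above.

```python
from pathlib import Path


def srp_target_string(id_ext, ioc_guid, dgid, pkey, service_id):
    """Build a comma-separated SRP login string from its parts."""
    return (
        f"id_ext={id_ext},ioc_guid={ioc_guid},dgid={dgid},"
        f"pkey={pkey},service_id={service_id}"
    )


def add_target(add_target_file, target):
    """Write the login string to the given add_target file.

    On a real system the file would be something like
    /sys/class/infiniband_srp/srp-<hca>-<port>/add_target (assumption).
    """
    Path(add_target_file).write_text(target)


if __name__ == "__main__":
    # Placeholder identifiers, not taken from any real fabric.
    print(srp_target_string(
        id_ext="0002c90300a0b0c0",
        ioc_guid="0002c90300a0b0c0",
        dgid="fe800000000000000002c90300a0b0c1",
        pkey="ffff",
        service_id="0002c90300a0b0c0",
    ))
```

Writing the string from the job itself avoids depending on an external tool that may hang, at the cost of the job having to know the target identifiers up front.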

It would also be great not to end up with a reconnect mechanism that
behaves like a DDoS attack, as the one in open-iscsi does. Rebooting the
whole cluster to fix this isn't fun. It must be possible to configure
different reconnect intervals.
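
One common way to keep many initiators from reconnecting in lockstep is a capped exponential backoff with random jitter. The sketch below shows that general technique; it is not something proposed in this thread, and the function name, parameters and default values are hypothetical.

```python
import random


def reconnect_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return the delay in seconds before reconnect attempt `attempt`.

    The upper bound doubles with every failed attempt (starting at
    `base`) until it reaches `cap`; the actual delay is drawn uniformly
    from [0, upper bound] so that initiators spread out over time.
    """
    # Clamp the exponent so the power of two stays small.
    ceiling = min(cap, base * 2 ** min(attempt, 30))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    rng = random.Random()
    for attempt in range(8):
        print(attempt, round(reconnect_delay(attempt, rng=rng), 2))
```

Making `base` and `cap` configurable per host or per target would be one way to get the different reconnect intervals asked for above.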

By the way, with iSER we even had a case where the IPoIB part
reconnected but the RDMA part didn't. It was so broken at that point
that we could no longer disconnect or reconnect; the only option was a
hard reboot.

So now you know our point of view, and we are already developing it that
way for ourselves. I'm looking forward to the outcome of the discussion.
In its current state it's difficult to persuade our bosses to publish
what we have so far.

On 01.02.2013 14:43, Bart Van Assche wrote:
> It is known that the upstream SRP initiator takes about two to three
> minutes to fail over from a failed path to a working path. This is not
> only longer than acceptable but also longer than other Linux SCSI
> initiators (e.g. iSCSI and FC) need. Progress with improving SRP
> initiator failover has been slow so far. This is because the discussion
> about candidate patches took place at two different levels: not only the
> patches themselves were discussed but also the approach that should be
> followed. That last aspect is easier to discuss in a meeting than over a
> mailing list. Hence the proposal to discuss SRP initiator failover
> behavior during the LSF/MM summit. The topics that need further
> discussion are:
> * If a path fails, should the entire SCSI host be removed, or should the
>   SCSI host be preserved and only the SCSI devices associated with that
>   host be removed?

Preserve SCSI hosts and SCSI devices unless they are removed explicitly
by a disconnect request. Rescanning SCSI devices with "- - -", as
"iscsiadm -R" does for example, may reorder the device names (sda
becomes sdb, etc.).

> * Which software component should test the state of a path and
>   reconnect to an SRP target once a path is restored? Should that be
>   done by the user space process srp_daemon or by the SRP initiator
>   kernel module?

By the SRP kernel module. This is exactly the big advantage of SRP so
far: It is simple, it is RDMA and kernel only.

> * How should the SRP initiator behave after a path failure has been
>   detected? Should the behavior be similar to that of the FC initiator
>   with its fast_io_fail_tmo and dev_loss_tmo parameters?

Fine for us as long as such timeouts and the behavior itself are
configurable. For dm-multipath we need fast I/O failing, and we need the
SRP initiator to try to reconnect that path automatically.

Cheers,
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time
@ 2013-02-07 22:42 Vu Pham
From: Vu Pham @ 2013-02-07 22:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: lsf-pc, linux-scsi, linux-rdma, David Dillow, Oren Duer,
	Sagi Grimberg


Hello Bart,

Thank you for taking the initiative.
Mellanox thinks that this should be discussed. We'd be happy to attend.

We would also like to discuss:
* How, and how fast, does SRP detect a path failure other than through
an RC error?
* The role of srp_daemon: how often srp_daemon scans the fabric for
new/old targets, how to scale srp_daemon discovery, and traps.

-vu

* Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time
@ 2013-02-08  9:24 Sagi Grimberg
From: Sagi Grimberg @ 2013-02-08  9:24 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vu Pham, lsf-pc, linux-scsi, linux-rdma, David Dillow,
	Oren Duer

On 2/8/2013 12:42 AM, Vu Pham wrote:
> Hello Bart,
>
> Thank you for taking the initiative.
> Mellanox thinks that this should be discussed. We'd be happy to attend.
>
> We would also like to discuss:
> * How, and how fast, does SRP detect a path failure other than through
> an RC error?
> * The role of srp_daemon: how often srp_daemon scans the fabric for
> new/old targets, how to scale srp_daemon discovery, and traps.
>
> -vu
Hey Bart,

I agree with Vu that this issue should be discussed. We'd be happy to 
attend.

--
Sagi

* Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time
@ 2013-02-08 11:38 Sebastian Riemer
From: Sebastian Riemer @ 2013-02-08 11:38 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Bart Van Assche, Vu Pham, lsf-pc, linux-scsi, linux-rdma,
	David Dillow, Oren Duer

On 08.02.2013 10:24, Sagi Grimberg wrote:
> On 2/8/2013 12:42 AM, Vu Pham wrote:
>> Hello Bart,
>>
>> Thank you for taking the initiative.
>> Mellanox thinks that this should be discussed. We'd be happy to attend.
>>
>> We would also like to discuss:
>> * How, and how fast, does SRP detect a path failure other than through
>> an RC error?
>> * The role of srp_daemon: how often srp_daemon scans the fabric for
>> new/old targets, how to scale srp_daemon discovery, and traps.
>>
>> -vu
> Hey Bart,
> 
> I agree with Vu that this issue should be discussed. We'd be happy to
> attend.
> 
> -- 
> Sagi

Wow, thanks to Mellanox as well for spending resources on SRP! In June
last year we encountered a very different situation.

Cheers,
Sebastian and the ProfitBricks storage team
