* [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
@ 2023-11-27 21:54 Benjamin Marzinski
  2023-11-27 22:07 ` Roger Heflin
  2023-12-04 14:38 ` Martin Wilck
  0 siblings, 2 replies; 12+ messages in thread
From: Benjamin Marzinski @ 2023-11-27 21:54 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If a multipath device has no_path_retry set to a number and has lost all
paths, gone into recovery mode, and timed out, it will disable
queue_if_no_path. After that, if one of those failed paths is removed
and the device is reloaded, queue_if_no_path will be re-enabled.  When
set_no_path_retry() is then called to update the queueing state, it will
not disable queue_if_no_path, since the device is still in the recovery
state, so it believes no work needs to be done. The device will remain
in the recovery state, with retry_tick at 0, and queueing enabled,
even though there are no usable paths.

To fix this, in set_no_path_retry(), if no_path_retry is set to a number
and the device is queueing but it is in recovery mode and out of
retries with no usable paths, manually disable queue_if_no_path.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/structs_vec.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
index 0e8a46e7..3cb23c73 100644
--- a/libmultipath/structs_vec.c
+++ b/libmultipath/structs_vec.c
@@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
 			    !mpp->in_recovery)
 				dm_queue_if_no_path(mpp->alias, 1);
 			leave_recovery_mode(mpp);
-		} else if (pathcount(mpp, PATH_PENDING) == 0)
+		} else if (pathcount(mpp, PATH_PENDING) == 0) {
+			/*
+			 * If in_recovery is set, enter_recovery_mode does
+			 * nothing. If the device is already in recovery
+			 * mode and has already timed out, manually call
+			 * dm_queue_if_no_path to stop it from queueing.
+			 */
+			if ((!mpp->features || is_queueing) &&
+			    mpp->in_recovery && mpp->retry_tick == 0)
+				dm_queue_if_no_path(mpp->alias, 0);
 			enter_recovery_mode(mpp);
+		}
 		break;
 	}
 }
-- 
2.41.0



* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-11-27 21:54 [PATCH] multipathd: Make sure to disable queueing if recovery has failed Benjamin Marzinski
@ 2023-11-27 22:07 ` Roger Heflin
  2023-11-28 17:13   ` Benjamin Marzinski
  2023-12-04 14:38 ` Martin Wilck
  1 sibling, 1 reply; 12+ messages in thread
From: Roger Heflin @ 2023-11-27 22:07 UTC (permalink / raw)
  To: Benjamin Marzinski
  Cc: Christophe Varoqui, device-mapper development, Martin Wilck

How long does recovery take?  I am unclear on when the explicitly
set queue_if_no_path is being overridden, and why disabling it is
useful.

Typically we are setting queue_if_no_path to forever with the intent
that it will survive longer storage and/or disk issues without
returning an error to the application and/or corrupting fses if the
storage issue can be fixed.

We generally expect it to recover when the storage comes back, but
the storage could be experiencing significant issues for a significant
period of time (>10 minutes).  Since the storage has to be fixed to get
things working again, there is a lot of negative value in returning an
error, as that forces manual recovery steps (fsck, loss of data).

We also manually disable queueing if we need to remove the mpath
devices (paths are already gone as they were non-responsive for >24
hours and removed via tmo_timeout), and/or forcibly reboot the nodes
when we determine the storage is not coming back.


On Mon, Nov 27, 2023 at 3:54 PM Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> If a multipath device has no_path_retry set to a number and has lost all
> paths, gone into recovery mode, and timed out, it will disable
> queue_if_no_paths. After that, if one of those failed paths is removed,
> when the device is reloaded, queue_if_no_paths will be re-enabled.  When
> set_no_path_retry() is then called to update the queueing state, it will
> not disable queue_if_no_paths, since the device is still in the recovery
> state, so it believes no work needs to be done. The device will remain
> in the recovery state, with retry_ticks at 0, and queueing enabled,
> even though there are no usable paths.
>
> To fix this, in set_no_path_retry(), if no_path_retry is set to a number
> and the device is queueing but it is in recovery mode and out of
> retries with no usable paths, manually disable queue_if_no_path.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/structs_vec.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 0e8a46e7..3cb23c73 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
>                             !mpp->in_recovery)
>                                 dm_queue_if_no_path(mpp->alias, 1);
>                         leave_recovery_mode(mpp);
> -               } else if (pathcount(mpp, PATH_PENDING) == 0)
> +               } else if (pathcount(mpp, PATH_PENDING) == 0) {
> +                       /*
> +                        * If in_recovery is set, enter_recovery_mode does
> +                        * nothing. If the device is already in recovery
> +                        * mode and has already timed out, manually call
> +                        * dm_queue_if_no_path to stop it from queueing.
> +                        */
> +                       if ((!mpp->features || is_queueing) &&
> +                           mpp->in_recovery && mpp->retry_tick == 0)
> +                               dm_queue_if_no_path(mpp->alias, 0);
>                         enter_recovery_mode(mpp);
> +               }
>                 break;
>         }
>  }
> --
> 2.41.0
>
>


* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-11-27 22:07 ` Roger Heflin
@ 2023-11-28 17:13   ` Benjamin Marzinski
  2023-11-28 17:59     ` Roger Heflin
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Marzinski @ 2023-11-28 17:13 UTC (permalink / raw)
  To: Roger Heflin; +Cc: Christophe Varoqui, device-mapper development, Martin Wilck

On Mon, Nov 27, 2023 at 04:07:46PM -0600, Roger Heflin wrote:
> How long does recovery take?     I am unclear on when the explictly
> set queue_if_no_path is being overridden, and why disabling it is
> useful.

To be clear here. This fix shouldn't cause multipath to disable queueing
in cases where it wasn't configured to do so. Previously, we were not
always respecting no_path_retry. This fix makes sure that we are.

This change only applies to the case where no_path_retry is set to
a specific number of retries. When configuring multipath, if
no_path_retry is set, it always takes precedence over
"queue_if_no_path" set in the features line. So if a multipath device
isn't configured to queue for a limited number of retries and then
disable queueing, the fix changes nothing.

When no_path_retry is set to a number of retries, if the last path is
lost, the device goes into recovery mode. When a device goes into
recovery mode, multipathd sets a counter to the value of no_path_retry.
After every loop of the path checker through all the devices, that
counter ticks down. If the counter hits zero, multipathd disables
queueing on the device. When a path becomes usable, the device leaves
recovery mode, stopping the countdown or restoring queueing as
necessary.
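
The countdown described above can be reduced to a minimal standalone
sketch. The struct and function here are simplified, illustrative
stand-ins, not the actual libmultipath code:

```c
/* Hypothetical, simplified model of the recovery-mode countdown. */
struct mp_state {
	int in_recovery;	/* set when the last usable path was lost */
	int retry_tick;		/* checker loops left before queueing stops */
	int queueing;		/* 1 while queue_if_no_path is active */
};

/* Runs once per path-checker loop over all devices: tick the counter
 * down and disable queueing when it reaches zero. */
static void checker_tick(struct mp_state *mp)
{
	if (!mp->in_recovery || mp->retry_tick <= 0)
		return;
	if (--mp->retry_tick == 0)
		mp->queueing = 0;	/* retries exhausted */
}
```

Note that once retry_tick has reached zero, further checker loops do
nothing, which is exactly why nothing ever re-disables queueing after a
reload turns it back on.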

The problem that this patch fixes happens if the multipath device is
reloaded after that count has hit zero, and queueing has been disabled.
When reloading a device with no_path_retry set to anything other than
"fail", multipathd will automatically enable queueing.  After reloading
the device, multipathd checks the number of usable paths. If there are
no usable paths, it tries to enter recovery mode.  However, in this
case, the device already is in recovery mode, so nothing happens. And
since the countdown has already completed, nothing will happen to
disable queueing in the future. Now you are left with a device stuck
indefinitely in recovery mode with queueing enabled and no usable paths,
even though you configured it to only queue for a limited number of
retries. This is bad.

The solution the patch implements is to check if you are in this
situation:
1. The device has no_path_retry set to a number of retries.
2. The device has no usable paths
3. The device is already in recovery mode
4. The retry counter for the device has already reached 0
5. The device is currently queueing IO.

In this case, we clearly shouldn't be queueing, and calling
enter_recovery_mode() will do nothing. So, we just tell the device to
disable queueing.
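
The five conditions can be folded into a single predicate. This is a
hedged sketch, not the patch itself: the struct is a simplified stand-in
for struct multipath, and the field names are illustrative.

```c
/* Simplified stand-in for the relevant multipath device state. */
struct mp_check_state {
	int no_path_retry;	/* > 0: queue for a finite number of retries */
	int active_paths;	/* usable paths right now */
	int in_recovery;	/* countdown already started */
	int retry_tick;		/* 0 once the countdown has expired */
	int queueing;		/* queue_if_no_path currently enabled */
};

/* Returns 1 when queueing must be force-disabled: the device is still
 * queueing although its recovery countdown already ran out with no
 * usable paths left. */
static int must_disable_queueing(const struct mp_check_state *mp)
{
	return mp->no_path_retry > 0 && mp->active_paths == 0 &&
	       mp->in_recovery && mp->retry_tick == 0 && mp->queueing;
}
```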

The change I made is in set_no_path_retry() which does get called in
other cases besides after a device reload. However, if we are ever in
the situation outlined above, we should not be queueing, so it is
correct to disable it. The code I added in this patch is the mirror of
the code directly above it in set_no_path_retry(). Here is that code.

if (count_active_paths(mpp) > 0) {
        /*
         * If in_recovery is set, leave_recovery_mode() takes
         * care of dm_queue_if_no_path. Otherwise, do it here.
         */
        if ((!mpp->features || !is_queueing) &&
            !mpp->in_recovery)
                dm_queue_if_no_path(mpp->alias, 1);
        leave_recovery_mode(mpp);

In this case, we should be queueing, but we're not, and since we aren't
in recovery mode, calling leave_recovery_mode() doesn't do anything.
Instead we just directly enable queueing.
 
> Typically we are setting queue_if_no_path to forever with the intent
> that it will survive longer storage and/or disk issues without
> returning an error to the application and/or corrupting fses if the
> storage issue can be fixed.
> 
> We generally expect it to recover when the storage comes back, but
> that the storage could be experiencing significant issues for a
> significant period of time >10 minutes.   Since the storage has to be
> fixed to get things working again, there is a lot of negative value
> that requires manual recovery steps when an error gets returned (fsck,
> loss of data).
> 
> We also manually disable queueing if we need to remove the mpath
> devices (paths are already gone as they were non-responsive > 24 hours
> and removed via tmo_timeout), and/or forcible reboot the nodes when we
> determine storage is not coming back.

If you are setting no_path_retry to "queue" (the recommended way) or
adding "queue_if_no_path" to the features line while leaving
no_path_retry unset (the deprecated way; see the man page), this change
will not affect you. You will never get to this code in
set_no_path_retry().

Does that clear things up?

-Ben
 
> 
> On Mon, Nov 27, 2023 at 3:54 PM Benjamin Marzinski <bmarzins@redhat.com> wrote:
> >
> > If a multipath device has no_path_retry set to a number and has lost all
> > paths, gone into recovery mode, and timed out, it will disable
> > queue_if_no_paths. After that, if one of those failed paths is removed,
> > when the device is reloaded, queue_if_no_paths will be re-enabled.  When
> > set_no_path_retry() is then called to update the queueing state, it will
> > not disable queue_if_no_paths, since the device is still in the recovery
> > state, so it believes no work needs to be done. The device will remain
> > in the recovery state, with retry_ticks at 0, and queueing enabled,
> > even though there are no usable paths.
> >
> > To fix this, in set_no_path_retry(), if no_path_retry is set to a number
> > and the device is queueing but it is in recovery mode and out of
> > retries with no usable paths, manually disable queue_if_no_path.
> >
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/structs_vec.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> > index 0e8a46e7..3cb23c73 100644
> > --- a/libmultipath/structs_vec.c
> > +++ b/libmultipath/structs_vec.c
> > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
> >                             !mpp->in_recovery)
> >                                 dm_queue_if_no_path(mpp->alias, 1);
> >                         leave_recovery_mode(mpp);
> > -               } else if (pathcount(mpp, PATH_PENDING) == 0)
> > +               } else if (pathcount(mpp, PATH_PENDING) == 0) {
> > +                       /*
> > +                        * If in_recovery is set, enter_recovery_mode does
> > +                        * nothing. If the device is already in recovery
> > +                        * mode and has already timed out, manually call
> > +                        * dm_queue_if_no_path to stop it from queueing.
> > +                        */
> > +                       if ((!mpp->features || is_queueing) &&
> > +                           mpp->in_recovery && mpp->retry_tick == 0)
> > +                               dm_queue_if_no_path(mpp->alias, 0);
> >                         enter_recovery_mode(mpp);
> > +               }
> >                 break;
> >         }
> >  }
> > --
> > 2.41.0
> >
> >



* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-11-28 17:13   ` Benjamin Marzinski
@ 2023-11-28 17:59     ` Roger Heflin
  2023-11-28 18:18       ` Benjamin Marzinski
  0 siblings, 1 reply; 12+ messages in thread
From: Roger Heflin @ 2023-11-28 17:59 UTC (permalink / raw)
  To: Benjamin Marzinski
  Cc: Christophe Varoqui, device-mapper development, Martin Wilck

Thank you.  That makes it clear.  Yes, we are setting it to "queue",
so it won't change behavior with that setting.

It also makes it clear that if we instead set no_path_retry to, say,
100000, then once all paths fail and it goes into recovery mode,
queueing would be disabled.


On Tue, Nov 28, 2023 at 11:13 AM Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> On Mon, Nov 27, 2023 at 04:07:46PM -0600, Roger Heflin wrote:
> > How long does recovery take?     I am unclear on when the explictly
> > set queue_if_no_path is being overridden, and why disabling it is
> > useful.
>
> To be clear here. This fix shouldn't cause multipath to disable queueing
> in cases where it wasn't configured to do so. Previously, we were not
> always respecting no_path_retry. This fix makes sure that we are.
>
> This change only applies to the case where no_path_retry is set to
> a specific number of retries. When configuring multipath, if
> no_path_reties is set, it always takes precedence over
> "queue_if_no_path" set in the features line. So if a multipath device
> isn't configured to queue for a limited number of retries and then
> disable queueing, the fix changes nothing.
>
> When no_path_retry is set to a number of retries, if the last path is
> lost, the device goes into recovery mode. When a device goes into
> recovery mode, multipathd sets a counter to the value of no_path_retry.
> After every loop of the path checker through all the devices, that
> counter ticks down. If the counter hits zero, multipathd disables
> queueing on the device. When a path becomes usable, the device leaves
> recovery mode, stopping the countdown or restoring queueing as
> necessary.
>
> The problem that this patch fixes happens if the multipath device is
> reloaded after that count has hit zero, and queueing has been disabled.
> When reloading a device with no_path_retry set to anything other than
> "fail", multipathd will automatically enable queueing.  After reloading
> the device, multipathd checks the number of usable paths. If there are
> no usable paths, it tries to enter recovery mode.  However, in this
> case, the device already is in recovery mode, so nothing happens. And
> since the countdown has already completed, nothing will happen to
> disable queueing in the future. Now you are left with a device stuck
> indefinitely in recovery mode with queueing enabled and no usable paths,
> even though you configured it to only queue for a limited number of
> retries. This is bad.
>
> The solution the patch implements is to check if you are in this
> situation:
> 1. The device has no_path_retry set to a number of retries.
> 2. The device has no usable paths
> 3. The device is already in recovery mode
> 4. The retry counter for the device has already reached 0
> 5. The device is currently queueing IO.
>
> In this case, we clearly shouldn't be queueing, and calling
> enter_recovery_mode() will do nothing. So, we just tell the device to
> disable queueing.
>
> The change I made is in set_no_path_retry() which does get called in
> other cases besides after a device reload. However, if we are ever in
> the situation outlined above, we should not be queueing, so it is
> correct to disable it. The code I added in this patch is the mirror of
> the code directly above it in set_no_path_retry(). Here is that code.
>
> if (count_active_paths(mpp) > 0) {
>         /*
>          * If in_recovery is set, leave_recovery_mode() takes
>          * care of dm_queue_if_no_path. Otherwise, do it here.
>          */
>         if ((!mpp->features || !is_queueing) &&
>             !mpp->in_recovery)
>                 dm_queue_if_no_path(mpp->alias, 1);
>         leave_recovery_mode(mpp);
>
> It this case, we should be queueing, but we're not, and since we aren't
> in recovery mode, calling leave_recovery_mode() doesn't do anything.
> Instead we just directly enable queueing.
>
> > Typically we are setting queue_if_no_path to forever with the intent
> > that it will survive longer storage and/or disk issues without
> > returning an error to the application and/or corrupting fses if the
> > storage issue can be fixed.
> >
> > We generally expect it to recover when the storage comes back, but
> > that the storage could be experiencing significant issues for a
> > significant period of time >10 minutes.   Since the storage has to be
> > fixed to get things working again, there is a lot of negative value
> > that requires manual recovery steps when an error gets returned (fsck,
> > loss of data).
> >
> > We also manually disable queueing if we need to remove the mpath
> > devices (paths are already gone as they were non-responsive > 24 hours
> > and removed via tmo_timeout), and/or forcible reboot the nodes when we
> > determine storage is not coming back.
>
> If you are setting no_path_retry to "queue" (the recommended way) or
> adding "queue_if_no_path" to the features line while leaving
> no_path_retry unset (the deprecated way. See the man page) this change
> will not effect you. You will not ever get to this code in
> set_no_path_retry().
>
> Does that clear things up?
>
> -Ben
>
> >
> > On Mon, Nov 27, 2023 at 3:54 PM Benjamin Marzinski <bmarzins@redhat.com> wrote:
> > >
> > > If a multipath device has no_path_retry set to a number and has lost all
> > > paths, gone into recovery mode, and timed out, it will disable
> > > queue_if_no_paths. After that, if one of those failed paths is removed,
> > > when the device is reloaded, queue_if_no_paths will be re-enabled.  When
> > > set_no_path_retry() is then called to update the queueing state, it will
> > > not disable queue_if_no_paths, since the device is still in the recovery
> > > state, so it believes no work needs to be done. The device will remain
> > > in the recovery state, with retry_ticks at 0, and queueing enabled,
> > > even though there are no usable paths.
> > >
> > > To fix this, in set_no_path_retry(), if no_path_retry is set to a number
> > > and the device is queueing but it is in recovery mode and out of
> > > retries with no usable paths, manually disable queue_if_no_path.
> > >
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > >  libmultipath/structs_vec.c | 12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> > > index 0e8a46e7..3cb23c73 100644
> > > --- a/libmultipath/structs_vec.c
> > > +++ b/libmultipath/structs_vec.c
> > > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
> > >                             !mpp->in_recovery)
> > >                                 dm_queue_if_no_path(mpp->alias, 1);
> > >                         leave_recovery_mode(mpp);
> > > -               } else if (pathcount(mpp, PATH_PENDING) == 0)
> > > +               } else if (pathcount(mpp, PATH_PENDING) == 0) {
> > > +                       /*
> > > +                        * If in_recovery is set, enter_recovery_mode does
> > > +                        * nothing. If the device is already in recovery
> > > +                        * mode and has already timed out, manually call
> > > +                        * dm_queue_if_no_path to stop it from queueing.
> > > +                        */
> > > +                       if ((!mpp->features || is_queueing) &&
> > > +                           mpp->in_recovery && mpp->retry_tick == 0)
> > > +                               dm_queue_if_no_path(mpp->alias, 0);
> > >                         enter_recovery_mode(mpp);
> > > +               }
> > >                 break;
> > >         }
> > >  }
> > > --
> > > 2.41.0
> > >
> > >
>


* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-11-28 17:59     ` Roger Heflin
@ 2023-11-28 18:18       ` Benjamin Marzinski
  0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Marzinski @ 2023-11-28 18:18 UTC (permalink / raw)
  To: Roger Heflin; +Cc: Christophe Varoqui, device-mapper development, Martin Wilck

On Tue, Nov 28, 2023 at 11:59:55AM -0600, Roger Heflin wrote:
> Thank you.  That makes it clear.    Yes, we are setting it to "queue"
> so it won't change behavior with that setting.
> 
> It also makes it clear that if we say set it no_path_retry to say
> 100000 that once all paths fail and it goes into recovery mode then
> queuing would be disabled.

Well, it wouldn't disable queueing until it had been in recovery mode
for over 100000 path checker retries. At a retry every 5 seconds, that's
getting close to 6 days. Before those 100000 retries had passed, if the
device got reloaded, queueing would still not get disabled with my patch
applied, even if there were no usable paths. The change only applies to
devices in recovery mode that have run out of retries.
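
The arithmetic behind that "close to 6 days" figure, assuming a
5-second checker interval (the interval is configurable, so this is
illustrative):

```c
/* Seconds until queueing is disabled, given a finite no_path_retry
 * count and the checker interval in seconds. */
static long queue_timeout_s(long no_path_retry, long checker_interval_s)
{
	return no_path_retry * checker_interval_s;
}
/* queue_timeout_s(100000, 5) -> 500000 seconds, roughly 5.8 days */
```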

If you've noticed something that I've overlooked, and this change could
disable queuing when it shouldn't, please let me know.

-Ben

> 
> 
> On Tue, Nov 28, 2023 at 11:13 AM Benjamin Marzinski <bmarzins@redhat.com> wrote:
> >
> > On Mon, Nov 27, 2023 at 04:07:46PM -0600, Roger Heflin wrote:
> > > How long does recovery take?     I am unclear on when the explictly
> > > set queue_if_no_path is being overridden, and why disabling it is
> > > useful.
> >
> > To be clear here. This fix shouldn't cause multipath to disable queueing
> > in cases where it wasn't configured to do so. Previously, we were not
> > always respecting no_path_retry. This fix makes sure that we are.
> >
> > This change only applies to the case where no_path_retry is set to
> > a specific number of retries. When configuring multipath, if
> > no_path_reties is set, it always takes precedence over
> > "queue_if_no_path" set in the features line. So if a multipath device
> > isn't configured to queue for a limited number of retries and then
> > disable queueing, the fix changes nothing.
> >
> > When no_path_retry is set to a number of retries, if the last path is
> > lost, the device goes into recovery mode. When a device goes into
> > recovery mode, multipathd sets a counter to the value of no_path_retry.
> > After every loop of the path checker through all the devices, that
> > counter ticks down. If the counter hits zero, multipathd disables
> > queueing on the device. When a path becomes usable, the device leaves
> > recovery mode, stopping the countdown or restoring queueing as
> > necessary.
> >
> > The problem that this patch fixes happens if the multipath device is
> > reloaded after that count has hit zero, and queueing has been disabled.
> > When reloading a device with no_path_retry set to anything other than
> > "fail", multipathd will automatically enable queueing.  After reloading
> > the device, multipathd checks the number of usable paths. If there are
> > no usable paths, it tries to enter recovery mode.  However, in this
> > case, the device already is in recovery mode, so nothing happens. And
> > since the countdown has already completed, nothing will happen to
> > disable queueing in the future. Now you are left with a device stuck
> > indefinitely in recovery mode with queueing enabled and no usable paths,
> > even though you configured it to only queue for a limited number of
> > retries. This is bad.
> >
> > The solution the patch implements is to check if you are in this
> > situation:
> > 1. The device has no_path_retry set to a number of retries.
> > 2. The device has no usable paths
> > 3. The device is already in recovery mode
> > 4. The retry counter for the device has already reached 0
> > 5. The device is currently queueing IO.
> >
> > In this case, we clearly shouldn't be queueing, and calling
> > enter_recovery_mode() will do nothing. So, we just tell the device to
> > disable queueing.
> >
> > The change I made is in set_no_path_retry() which does get called in
> > other cases besides after a device reload. However, if we are ever in
> > the situation outlined above, we should not be queueing, so it is
> > correct to disable it. The code I added in this patch is the mirror of
> > the code directly above it in set_no_path_retry(). Here is that code.
> >
> > if (count_active_paths(mpp) > 0) {
> >         /*
> >          * If in_recovery is set, leave_recovery_mode() takes
> >          * care of dm_queue_if_no_path. Otherwise, do it here.
> >          */
> >         if ((!mpp->features || !is_queueing) &&
> >             !mpp->in_recovery)
> >                 dm_queue_if_no_path(mpp->alias, 1);
> >         leave_recovery_mode(mpp);
> >
> > It this case, we should be queueing, but we're not, and since we aren't
> > in recovery mode, calling leave_recovery_mode() doesn't do anything.
> > Instead we just directly enable queueing.
> >
> > > Typically we are setting queue_if_no_path to forever with the intent
> > > that it will survive longer storage and/or disk issues without
> > > returning an error to the application and/or corrupting fses if the
> > > storage issue can be fixed.
> > >
> > > We generally expect it to recover when the storage comes back, but
> > > that the storage could be experiencing significant issues for a
> > > significant period of time >10 minutes.   Since the storage has to be
> > > fixed to get things working again, there is a lot of negative value
> > > that requires manual recovery steps when an error gets returned (fsck,
> > > loss of data).
> > >
> > > We also manually disable queueing if we need to remove the mpath
> > > devices (paths are already gone as they were non-responsive > 24 hours
> > > and removed via tmo_timeout), and/or forcible reboot the nodes when we
> > > determine storage is not coming back.
> >
> > If you are setting no_path_retry to "queue" (the recommended way) or
> > adding "queue_if_no_path" to the features line while leaving
> > no_path_retry unset (the deprecated way. See the man page) this change
> > will not effect you. You will not ever get to this code in
> > set_no_path_retry().
> >
> > Does that clear things up?
> >
> > -Ben
> >
> > >
> > > On Mon, Nov 27, 2023 at 3:54 PM Benjamin Marzinski <bmarzins@redhat.com> wrote:
> > > >
> > > > If a multipath device has no_path_retry set to a number and has lost all
> > > > paths, gone into recovery mode, and timed out, it will disable
> > > > queue_if_no_paths. After that, if one of those failed paths is removed,
> > > > when the device is reloaded, queue_if_no_paths will be re-enabled.  When
> > > > set_no_path_retry() is then called to update the queueing state, it will
> > > > not disable queue_if_no_paths, since the device is still in the recovery
> > > > state, so it believes no work needs to be done. The device will remain
> > > > in the recovery state, with retry_ticks at 0, and queueing enabled,
> > > > even though there are no usable paths.
> > > >
> > > > To fix this, in set_no_path_retry(), if no_path_retry is set to a number
> > > > and the device is queueing but it is in recovery mode and out of
> > > > retries with no usable paths, manually disable queue_if_no_path.
> > > >
> > > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > > ---
> > > >  libmultipath/structs_vec.c | 12 +++++++++++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> > > > index 0e8a46e7..3cb23c73 100644
> > > > --- a/libmultipath/structs_vec.c
> > > > +++ b/libmultipath/structs_vec.c
> > > > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
> > > >                             !mpp->in_recovery)
> > > >                                 dm_queue_if_no_path(mpp->alias, 1);
> > > >                         leave_recovery_mode(mpp);
> > > > -               } else if (pathcount(mpp, PATH_PENDING) == 0)
> > > > +               } else if (pathcount(mpp, PATH_PENDING) == 0) {
> > > > +                       /*
> > > > +                        * If in_recovery is set, enter_recovery_mode does
> > > > +                        * nothing. If the device is already in recovery
> > > > +                        * mode and has already timed out, manually call
> > > > +                        * dm_queue_if_no_path to stop it from queueing.
> > > > +                        */
> > > > +                       if ((!mpp->features || is_queueing) &&
> > > > +                           mpp->in_recovery && mpp->retry_tick == 0)
> > > > +                               dm_queue_if_no_path(mpp->alias, 0);
> > > >                         enter_recovery_mode(mpp);
> > > > +               }
> > > >                 break;
> > > >         }
> > > >  }
> > > > --
> > > > 2.41.0
> > > >
> > > >
> >



* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-11-27 21:54 [PATCH] multipathd: Make sure to disable queueing if recovery has failed Benjamin Marzinski
  2023-11-27 22:07 ` Roger Heflin
@ 2023-12-04 14:38 ` Martin Wilck
  2023-12-04 20:00   ` Benjamin Marzinski
  1 sibling, 1 reply; 12+ messages in thread
From: Martin Wilck @ 2023-12-04 14:38 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> If a multipath device has no_path_retry set to a number and has lost all
> paths, gone into recovery mode, and timed out, it will disable
> queue_if_no_paths. After that, if one of those failed paths is removed,
> when the device is reloaded, queue_if_no_paths will be re-enabled.  When
> set_no_path_retry() is then called to update the queueing state, it will
> not disable queue_if_no_paths, since the device is still in the recovery
> state, so it believes no work needs to be done. The device will remain
> in the recovery state, with retry_ticks at 0, and queueing enabled,
> even though there are no usable paths.
> 
> To fix this, in set_no_path_retry(), if no_path_retry is set to a number
> and the device is queueing but it is in recovery mode and out of
> retries with no usable paths, manually disable queue_if_no_path.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Thanks, this makes sense. I am just wondering if this logic should
be moved to enter_recovery_mode() / leave_recovery_mode() instead.
That pair of functions is also called from
update_queue_mode{add,del}_path(), where the same reasoning would
apply, afaics. Am I overlooking something?

Regards
Martin

> ---
>  libmultipath/structs_vec.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 0e8a46e7..3cb23c73 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
>  			    !mpp->in_recovery)
>  				dm_queue_if_no_path(mpp->alias, 1);
>  			leave_recovery_mode(mpp);
> -		} else if (pathcount(mpp, PATH_PENDING) == 0)
> +		} else if (pathcount(mpp, PATH_PENDING) == 0) {
> +			/*
> +			 * If in_recovery is set, enter_recovery_mode does
> +			 * nothing. If the device is already in recovery
> +			 * mode and has already timed out, manually call
> +			 * dm_queue_if_no_path to stop it from queueing.
> +			 */
> +			if ((!mpp->features || is_queueing) &&
> +			    mpp->in_recovery && mpp->retry_tick == 0)
> +				dm_queue_if_no_path(mpp->alias, 0);
>  			enter_recovery_mode(mpp);
> +		}
>  		break;
>  	}
>  }


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-12-04 14:38 ` Martin Wilck
@ 2023-12-04 20:00   ` Benjamin Marzinski
  2023-12-04 20:43     ` Martin Wilck
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Marzinski @ 2023-12-04 20:00 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development

On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> > If a multipath device has no_path_retry set to a number and has lost
> > all
> > paths, gone into recovery mode, and timed out, it will disable
> > queue_if_no_paths. After that, if one of those failed paths is
> > removed,
> > when the device is reloaded, queue_if_no_paths will be re-enabled. 
> > When
> > set_no_path_retry() is then called to update the queueing state, it
> > will
> > not disable queue_if_no_paths, since the device is still in the
> > recovery
> > state, so it believes no work needs to be done. The device will
> > remain
> > in the recovery state, with retry_ticks at 0, and queueing enabled,
> > even though there are no usable paths.
> > 
> > To fix this, in set_no_path_retry(), if no_path_retry is set to a
> > number
> > and the device is queueing but it is in recovery mode and out of
> > retries with no usable paths, manually disable queue_if_no_path.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> 
> Thanks, this makes sense. I am just wondering if this logic should
> be moved to enter_recovery_mode() / leave_recovery_mode() instead.
> That pair of functions is also called from
> update_queue_mode{add,del}_path(), where the same reasoning would
> apply, afaics. Am I overlooking something?

In general, I think it's safe to only do this in set_no_path_retry().
The functions that call update_queue_mode{add,del}_path() are:
fail_path()
reinstate_path()
update_multipath()
io_err_stat_handle_pathfail()

fail_path() and reinstate_path() will always call the
update_queue_mode_* function after calling set_no_path_retry(), so the
device will already be in the correct state. update_multipath() usually
will as well (except when it's called by a command line handler), and
further, it will only call update_queue_mode_del_path() when it is
actually failing a path, so the multipath device shouldn't already be
in recovery_mode. The same goes for io_err_stat_handle_pathfail().

Also, we need to make sure that mpp->features is up to date before doing
these checks, so that we know what the current queueing state is.
io_err_stat_handle_pathfail() doesn't update mpp->features before
calling update_queue_mode_del_path(), so it could end up doing the wrong
thing.

And set_no_path_retry() will be called (after updating mpp->features)
the next time the path checker completes for a path in the device, so
devices won't remain in a bad state long, no matter what.

What looking at this did make me realize is that cli_restore_queueing()
and cli_restore_all_queueing() don't update the device strings before
calling set_no_path_retry(). Thinking about this, it seems like they
should call update_multipath(vecs, mpp->alias, 1) instead, to trigger
the set_no_path_retry() call after updating mpp->features.

I'll send another patch to fix cli_restore_queueing() and
cli_restore_all_queueing(). If you think it's necessary, I can also send
one to make io_err_stat_handle_pathfail() call update_multipath() and
then move the code to manually change the queueing state to
enter_recovery_mode() and leave_recovery_mode(), instead of doing it in
set_no_path_retry().

-Ben

> 
> Regards
> Martin
> 
> > ---
> >  libmultipath/structs_vec.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> > index 0e8a46e7..3cb23c73 100644
> > --- a/libmultipath/structs_vec.c
> > +++ b/libmultipath/structs_vec.c
> > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
> >  			    !mpp->in_recovery)
> >  				dm_queue_if_no_path(mpp->alias, 1);
> >  			leave_recovery_mode(mpp);
> > -		} else if (pathcount(mpp, PATH_PENDING) == 0)
> > +		} else if (pathcount(mpp, PATH_PENDING) == 0) {
> > +			/*
> > +			 * If in_recovery is set, enter_recovery_mode does
> > +			 * nothing. If the device is already in recovery
> > +			 * mode and has already timed out, manually call
> > +			 * dm_queue_if_no_path to stop it from queueing.
> > +			 */
> > +			if ((!mpp->features || is_queueing) &&
> > +			    mpp->in_recovery && mpp->retry_tick == 0)
> > +				dm_queue_if_no_path(mpp->alias, 0);
> >  			enter_recovery_mode(mpp);
> > +		}
> >  		break;
> >  	}
> >  }


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-12-04 20:00   ` Benjamin Marzinski
@ 2023-12-04 20:43     ` Martin Wilck
  2023-12-04 23:01       ` Benjamin Marzinski
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Wilck @ 2023-12-04 20:43 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, device-mapper development

On Mon, 2023-12-04 at 15:00 -0500, Benjamin Marzinski wrote:
> On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> > On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> > > If a multipath device has no_path_retry set to a number and has
> > > lost
> > > all
> > > paths, gone into recovery mode, and timed out, it will disable
> > > queue_if_no_paths. After that, if one of those failed paths is
> > > removed,
> > > when the device is reloaded, queue_if_no_paths will be re-enabled.
> > > When
> > > set_no_path_retry() is then called to update the queueing state,
> > > it
> > > will
> > > not disable queue_if_no_paths, since the device is still in the
> > > recovery
> > > state, so it believes no work needs to be done. The device will
> > > remain
> > > in the recovery state, with retry_ticks at 0, and queueing
> > > enabled,
> > > even though there are no usable paths.
> > > 
> > > To fix this, in set_no_path_retry(), if no_path_retry is set to a
> > > number
> > > and the device is queueing but it is in recovery mode and out of
> > > retries with no usable paths, manually disable queue_if_no_path.
> > > 
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > 
> > Thanks, this makes sense. I am just wondering if this logic should
> > be moved to enter_recovery_mode() / leave_recovery_mode() instead.
> > That pair of functions is also called from
> > update_queue_mode{add,del}_path(), where the same reasoning would
> > apply, afaics. Am I overlooking something?
> 
> In general, I think it's safe to only do this in set_no_path_retry().
> The functions that call update_queue_mode{add,del}_path() are:
> fail_path()
> reinstate_path()
> update_multipath()
> io_err_stat_handle_pathfail()
> 
> fail_path() and reinstate_path() will always call the
> update_queue_mode_* function after calling set_no_path_retry(), so
> the
> device will already be in the correct state. update_multipath()
> usually
> will as well (except when it's called by a command line handler), and
> further, it will only call update_queue_mode_del_path() when it is
> actually failing a path, so the multipath device shouldn't already be
> in recovery_mode. The same goes for io_err_stat_handle_pathfail().
> 
> Also we need to make sure that mpp->features is uptodate before doing
> these checks, so that we know what the current queueing state is.
> io_err_stat_handle_pathfail() doesn't update mpp->features before
> calling update_queue_mode_del_path(), so it could end up doing the
> wrong
> thing.
> 
> And set_no_path_retry() will be called (after updating mpp->features)
> the next time the path checker completes for a path in the device, so
> devices won't remain in a bad state long, no matter what.
> 
> What looking at this did make me realize is that
> cli_restore_queueing()
> and cli_restore_all_queueing() don't update the device strings before
> calling set_no_path_retry(). Thinking about this, it seems like they
> should call update_multipath(vecs, mpp->alias, 1) instead, to trigger
> the set_no_path_retry() call after updating mpp->features.
> 
> I'll send another patch to fix cli_restore_queueing() and
> cli_restore_all_queueing(). If you think it's necessary, I can also
> send
> one to make io_err_stat_handle_pathfail() call update_multipath() and
> then move the code to manually change the queueing state to
> enter_recovery_mode() and leave_recovery_mode(), instead of doing it
> in
> set_no_path_retry().
> 

Thanks for working out all the different cases. Personally, I would
like to have this logic in one place, if possible. I'll leave it to
your judgement, as you've taken a closer look than myself.

I'm not sure what you mean with "mpp->features is up to date"; are you
considering some entity applying changes to the mpp features without
involving multipathd, e.g. by using direct DM ioctls or dmsetup?

Regards
Martin

> -Ben
> 
> > 
> > Regards
> > Martin
> > 
> > > ---
> > >  libmultipath/structs_vec.c | 12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/libmultipath/structs_vec.c
> > > b/libmultipath/structs_vec.c
> > > index 0e8a46e7..3cb23c73 100644
> > > --- a/libmultipath/structs_vec.c
> > > +++ b/libmultipath/structs_vec.c
> > > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
> > >  			    !mpp->in_recovery)
> > >  				dm_queue_if_no_path(mpp->alias, 1);
> > >  			leave_recovery_mode(mpp);
> > > -		} else if (pathcount(mpp, PATH_PENDING) == 0)
> > > +		} else if (pathcount(mpp, PATH_PENDING) == 0) {
> > > +			/*
> > > +			 * If in_recovery is set, enter_recovery_mode does
> > > +			 * nothing. If the device is already in recovery
> > > +			 * mode and has already timed out, manually call
> > > +			 * dm_queue_if_no_path to stop it from queueing.
> > > +			 */
> > > +			if ((!mpp->features || is_queueing) &&
> > > +			    mpp->in_recovery && mpp->retry_tick == 0)
> > > +				dm_queue_if_no_path(mpp->alias, 0);
> > >  			enter_recovery_mode(mpp);
> > > +		}
> > >  		break;
> > >  	}
> > >  }
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-12-04 20:43     ` Martin Wilck
@ 2023-12-04 23:01       ` Benjamin Marzinski
  2023-12-05 14:57         ` Martin Wilck
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Marzinski @ 2023-12-04 23:01 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development

On Mon, Dec 04, 2023 at 09:43:58PM +0100, Martin Wilck wrote:
> On Mon, 2023-12-04 at 15:00 -0500, Benjamin Marzinski wrote:
> > On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> > > On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> > > > If a multipath device has no_path_retry set to a number and has
> > > > lost
> > > > all
> > > > paths, gone into recovery mode, and timed out, it will disable
> > > > queue_if_no_paths. After that, if one of those failed paths is
> > > > removed,
> > > > when the device is reloaded, queue_if_no_paths will be re-enabled.
> > > > When
> > > > set_no_path_retry() is then called to update the queueing state,
> > > > it
> > > > will
> > > > not disable queue_if_no_paths, since the device is still in the
> > > > recovery
> > > > state, so it believes no work needs to be done. The device will
> > > > remain
> > > > in the recovery state, with retry_ticks at 0, and queueing
> > > > enabled,
> > > > even though there are no usable paths.
> > > > 
> > > > To fix this, in set_no_path_retry(), if no_path_retry is set to a
> > > > number
> > > > and the device is queueing but it is in recovery mode and out of
> > > > retries with no usable paths, manually disable queue_if_no_path.
> > > > 
> > > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > 
> > > Thanks, this makes sense. I am just wondering if this logic should
> > > be moved to enter_recovery_mode() / leave_recovery_mode() instead.
> > > That pair of functions is also called from
> > > update_queue_mode{add,del}_path(), where the same reasoning would
> > > apply, afaics. Am I overlooking something?
> > 
> > In general, I think it's safe to only do this in set_no_path_retry().
> > The functions that call update_queue_mode{add,del}_path() are:
> > fail_path()
> > reinstate_path()
> > update_multipath()
> > io_err_stat_handle_pathfail()
> > 
> > fail_path() and reinstate_path() will always call the
> > update_queue_mode_* function after calling set_no_path_retry(), so
> > the
> > device will already be in the correct state. update_multipath()
> > usually
> > will as well (except when it's called by a command line handler), and
> > further, it will only call update_queue_mode_del_path() when it is
> > actually failing a path, so the multipath device shouldn't already be
> > in recovery_mode. The same goes for io_err_stat_handle_pathfail().
> > 
> > Also we need to make sure that mpp->features is uptodate before doing
> > these checks, so that we know what the current queueing state is.
> > io_err_stat_handle_pathfail() doesn't update mpp->features before
> > calling update_queue_mode_del_path(), so it could end up doing the
> > wrong
> > thing.
> > 
> > And set_no_path_retry() will be called (after updating mpp->features)
> > the next time the path checker completes for a path in the device, so
> > devices won't remain in a bad state long, no matter what.
> > 
> > What looking at this did make me realize is that
> > cli_restore_queueing()
> > and cli_restore_all_queueing() don't update the device strings before
> > calling set_no_path_retry(). Thinking about this, it seems like they
> > should call update_multipath(vecs, mpp->alias, 1) instead, to trigger
> > the set_no_path_retry() call after updating mpp->features.
> > 
> > I'll send another patch to fix cli_restore_queueing() and
> > cli_restore_all_queueing(). If you think it's necessary, I can also
> > send
> > one to make io_err_stat_handle_pathfail() call update_multipath() and
> > then move the code to manually change the queueing state to
> > enter_recovery_mode() and leave_recovery_mode(), instead of doing it
> > in
> > set_no_path_retry().
> > 
> 
> Thanks for working out all the different cases. Personally, I would
> like to have this logic in one place, if possible. I'll leave it to
> your judgement, as you've taken a closer look than myself.
> 
> I'm not sure what you mean with "mpp->features is up to date"; are you
> considering some entity applying changes to the mpp features without
> involving multipathd, e.g. by using direct DM ioctls or dmsetup?

Or right after multipathd calls dm_queue_if_no_path(). We don't
immediately update mpp->features to reflect the changed
queue_if_no_paths setting of the device. So there are windows after
someone, including multipathd, has changed the queue_if_no_paths setting
of the device, during which checking mpp->features won't give the
correct answer unless we first reread the device table. Obviously, if
something other than multipathd has changed this, then there's always
the possibility of a race, but we can make sure that we don't race with
ourselves.

For instance, the way things are right now, running:

# multipathd disablequeueing maps; multipathd restorequeueing maps

will occasionally leave the devices not queueing until the next time a
path in them is checked, and set_no_path_retry actually restores
queueing.  But I'm actually going to deal with this differently. I'll
send a patchset shortly.

-Ben

> 
> Regards
> Martin
> 
> > -Ben
> > 
> > > 
> > > Regards
> > > Martin
> > > 
> > > > ---
> > > >  libmultipath/structs_vec.c | 12 +++++++++++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/libmultipath/structs_vec.c
> > > > b/libmultipath/structs_vec.c
> > > > index 0e8a46e7..3cb23c73 100644
> > > > --- a/libmultipath/structs_vec.c
> > > > +++ b/libmultipath/structs_vec.c
> > > > @@ -627,8 +627,18 @@ void set_no_path_retry(struct multipath *mpp)
> > > >  			    !mpp->in_recovery)
> > > >  				dm_queue_if_no_path(mpp->alias, 1);
> > > >  			leave_recovery_mode(mpp);
> > > > -		} else if (pathcount(mpp, PATH_PENDING) == 0)
> > > > +		} else if (pathcount(mpp, PATH_PENDING) == 0) {
> > > > +			/*
> > > > +			 * If in_recovery is set, enter_recovery_mode does
> > > > +			 * nothing. If the device is already in recovery
> > > > +			 * mode and has already timed out, manually call
> > > > +			 * dm_queue_if_no_path to stop it from queueing.
> > > > +			 */
> > > > +			if ((!mpp->features || is_queueing) &&
> > > > +			    mpp->in_recovery && mpp->retry_tick == 0)
> > > > +				dm_queue_if_no_path(mpp->alias, 0);
> > > >  			enter_recovery_mode(mpp);
> > > > +		}
> > > >  		break;
> > > >  	}
> > > >  }
> > 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-12-04 23:01       ` Benjamin Marzinski
@ 2023-12-05 14:57         ` Martin Wilck
  2023-12-05 19:28           ` Benjamin Marzinski
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Wilck @ 2023-12-05 14:57 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, device-mapper development

On Mon, 2023-12-04 at 18:01 -0500, Benjamin Marzinski wrote:
> On Mon, Dec 04, 2023 at 09:43:58PM +0100, Martin Wilck wrote:
> > On Mon, 2023-12-04 at 15:00 -0500, Benjamin Marzinski wrote:
> > > On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> > > > On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> > > > > If a multipath device has no_path_retry set to a number and
> > > > > has
> > > > > lost
> > > > > all
> > > > > paths, gone into recovery mode, and timed out, it will
> > > > > disable
> > > > > queue_if_no_paths. After that, if one of those failed paths
> > > > > is
> > > > > removed,
> > > > > when the device is reloaded, queue_if_no_paths will be re-
> > > > > enabled. 
> > > > > When
> > > > > set_no_path_retry() is then called to update the queueing
> > > > > state,
> > > > > it
> > > > > will
> > > > > not disable queue_if_no_paths, since the device is still in
> > > > > the
> > > > > recovery
> > > > > state, so it believes no work needs to be done. The device
> > > > > will
> > > > > remain
> > > > > in the recovery state, with retry_ticks at 0, and queueing
> > > > > enabled,
> > > > > even though there are no usable paths.
> > > > > 
> > > > > To fix this, in set_no_path_retry(), if no_path_retry is set
> > > > > to a
> > > > > number
> > > > > and the device is queueing but it is in recovery mode and out
> > > > > of
> > > > > retries with no usable paths, manually disable
> > > > > queue_if_no_path.
> > > > > 
> > > > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > > 
> > > > Thanks, this makes sense. I am just wondering if this logic
> > > > should
> > > > be moved to enter_recovery_mode() / leave_recovery_mode()
> > > > instead.
> > > > That pair of functions is also called from
> > > > update_queue_mode{add,del}_path(), where the same reasoning
> > > > would
> > > > apply, afaics. Am I overlooking something?
> > > 
> > > In general, I think it's safe to only do this in
> > > set_no_path_retry().
> > > The functions that call update_queue_mode{add,del}_path() are:
> > > fail_path()
> > > reinstate_path()
> > > update_multipath()
> > > io_err_stat_handle_pathfail()
> > > 
> > > fail_path() and reinstate_path() will always call the
> > > update_queue_mode_* function after calling set_no_path_retry(),
> > > so
> > > the
> > > device will already be in the correct state. update_multipath()
> > > usually
> > > will as well (except when it's called by a command line handler),
> > > and
> > > further, it will only call update_queue_mode_del_path() when it
> > > is
> > > actually failing a path, so the multipath device shouldn't
> > > already be
> > > in recovery_mode. The same goes for
> > > io_err_stat_handle_pathfail().
> > > 
> > > Also we need to make sure that mpp->features is uptodate before
> > > doing
> > > these checks, so that we know what the current queueing state is.
> > > io_err_stat_handle_pathfail() doesn't update mpp->features before
> > > calling update_queue_mode_del_path(), so it could end up doing
> > > the
> > > wrong
> > > thing.
> > > 
> > > And set_no_path_retry() will be called (after updating mpp-
> > > >features)
> > > the next time the path checker completes for a path in the
> > > device, so
> > > devices won't remain in a bad state long, no matter what.
> > > 
> > > What looking at this did make me realize is that
> > > cli_restore_queueing()
> > > and cli_restore_all_queueing() don't update the device strings
> > > before
> > > calling set_no_path_retry(). Thinking about this, it seems like
> > > they
> > > should call update_multipath(vecs, mpp->alias, 1) instead, to
> > > trigger
> > > the set_no_path_retry() call after updating mpp->features.
> > > 
> > > I'll send another patch to fix cli_restore_queueing() and
> > > cli_restore_all_queueing(). If you think it's necessary, I can
> > > also
> > > send
> > > one to make io_err_stat_handle_pathfail() call update_multipath()
> > > and
> > > then move the code to manually change the queueing state to
> > > enter_recovery_mode() and leave_recovery_mode(), instead of doing
> > > it
> > > in
> > > set_no_path_retry().
> > > 
> > 
> > Thanks for working out all the different cases. Personally, I would
> > like to have this logic in one place, if possible. I'll leave it to
> > your judgement, as you've taken a closer look than myself.
> > 
> > I'm not sure what you mean with "mpp->features is up to date"; are
> > you
> > considering some entity applying changes to the mpp features
> > without
> > involving multipathd, e.g. by using direct DM ioctls or dmsetup?
> 
> Or right after multipathd calls dm_queue_if_no_path(). We don't
> immediately update mpp->features to reflect the changed
> queue_if_no_paths setting of the device. So there are windows after
> someone, including multipathd, has changed the queue_if_no_paths
> setting
> of the device, during which checking mpp->features won't give the
> correct answer unless we first reread the device table. Obviously, if
> something other than multipathd has changed this, then there's always
> the possiblity of a race, but we can make sure that we don't race
> with
> ourselves.
> 
> For instance, the way things are right now, running:
> 
> # multipathd disablequeueing maps; multipathd restorequeueing maps
> 
> will occasionally leave the devices not queueing until the next time
> a
> path in them is checked, and set_no_path_retry actually restores
> queueing.  But I'm actually going to deal with this differently. I'll
> send a patchset shortly.

You are right, I see. It's actually an inconsistency that I would like
to get rid of. I suggest that we ditch mpp->features entirely, and just
keep track of the map's features in an abstract representation using
booleans and ints. We should never apply kernel status changes that are
inconsistent with our own representation of the map's features, IMO.

Along a similar line of reasoning, let me emphasize again why I would
like to have the logic for enabling or disabling queueing in one place.
Your previous post nicely distinguished the various cases and call
chains which change the queueing and recovery mode, but it would be
better if that kind of assessment wasn't necessary, and instead it was
possible to inspect the code in just a block of (say) 50 code lines
that would decide whether or not a given map should be queueing and /
or in recovery mode.

It's an unfortunate property of the multipath-tools code base that
important logic is dispersed over multiple functions and source files,
making a correct assessment of the code flow difficult and cumbersome.
It can't be completely avoided given the complexity of the subject, but
it would be nice to try.

I'm looking forward to your new patch set ;-)

Thanks,
Martin






^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-12-05 14:57         ` Martin Wilck
@ 2023-12-05 19:28           ` Benjamin Marzinski
  2023-12-05 21:01             ` Martin Wilck
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Marzinski @ 2023-12-05 19:28 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development

On Tue, Dec 05, 2023 at 03:57:13PM +0100, Martin Wilck wrote:
> On Mon, 2023-12-04 at 18:01 -0500, Benjamin Marzinski wrote:
> > On Mon, Dec 04, 2023 at 09:43:58PM +0100, Martin Wilck wrote:
> > > On Mon, 2023-12-04 at 15:00 -0500, Benjamin Marzinski wrote:
> > > > On Mon, Dec 04, 2023 at 03:38:03PM +0100, Martin Wilck wrote:
> > > > > On Mon, 2023-11-27 at 16:54 -0500, Benjamin Marzinski wrote:
> > > > > > If a multipath device has no_path_retry set to a number and
> > > > > > has
> > > > > > lost
> > > > > > all
> > > > > > paths, gone into recovery mode, and timed out, it will
> > > > > > disable
> > > > > > queue_if_no_paths. After that, if one of those failed paths
> > > > > > is
> > > > > > removed,
> > > > > > when the device is reloaded, queue_if_no_paths will be re-enabled.
> > > > > > When
> > > > > > set_no_path_retry() is then called to update the queueing
> > > > > > state,
> > > > > > it
> > > > > > will
> > > > > > not disable queue_if_no_paths, since the device is still in
> > > > > > the
> > > > > > recovery
> > > > > > state, so it believes no work needs to be done. The device
> > > > > > will
> > > > > > remain
> > > > > > in the recovery state, with retry_ticks at 0, and queueing
> > > > > > enabled,
> > > > > > even though there are no usable paths.
> > > > > > 
> > > > > > To fix this, in set_no_path_retry(), if no_path_retry is set
> > > > > > to a
> > > > > > number
> > > > > > and the device is queueing but it is in recovery mode and out
> > > > > > of
> > > > > > retries with no usable paths, manually disable
> > > > > > queue_if_no_path.
> > > > > > 
> > > > > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > > > 
> > > > > Thanks, this makes sense. I am just wondering if this logic
> > > > > should
> > > > > be moved to enter_recovery_mode() / leave_recovery_mode()
> > > > > instead.
> > > > > That pair of functions is also called from
> > > > > update_queue_mode{add,del}_path(), where the same reasoning
> > > > > would
> > > > > apply, afaics. Am I overlooking something?
> > > > 
> > > > In general, I think it's safe to only do this in
> > > > set_no_path_retry().
> > > > The functions that call update_queue_mode{add,del}_path() are:
> > > > fail_path()
> > > > reinstate_path()
> > > > update_multipath()
> > > > io_err_stat_handle_pathfail()
> > > > 
> > > > fail_path() and reinstate_path() will always call the
> > > > update_queue_mode_* function after calling set_no_path_retry(),
> > > > so
> > > > the
> > > > device will already be in the correct state. update_multipath()
> > > > usually
> > > > will as well (except when it's called by a command line handler),
> > > > and
> > > > further, it will only call update_queue_mode_del_path() when it
> > > > is
> > > > actually failing a path, so the multipath device shouldn't
> > > > already be
> > > > in recovery_mode. The same goes for
> > > > io_err_stat_handle_pathfail().
> > > > 
> > > > Also we need to make sure that mpp->features is uptodate before
> > > > doing
> > > > these checks, so that we know what the current queueing state is.
> > > > io_err_stat_handle_pathfail() doesn't update mpp->features before
> > > > calling update_queue_mode_del_path(), so it could end up doing
> > > > the
> > > > wrong
> > > > thing.
> > > > 
> > > > And set_no_path_retry() will be called (after updating mpp-
> > > > >features)
> > > > the next time the path checker completes for a path in the
> > > > device, so
> > > > devices won't remain in a bad state long, no matter what.
> > > > 
> > > > What looking at this did make me realize is that
> > > > cli_restore_queueing()
> > > > and cli_restore_all_queueing() don't update the device strings
> > > > before
> > > > calling set_no_path_retry(). Thinking about this, it seems like
> > > > they
> > > > should call update_multipath(vecs, mpp->alias, 1) instead, to
> > > > trigger
> > > > the set_no_path_retry() call after updating mpp->features.
> > > > 
> > > > I'll send another patch to fix cli_restore_queueing() and
> > > > cli_restore_all_queueing(). If you think it's necessary, I can
> > > > also
> > > > send
> > > > one to make io_err_stat_handle_pathfail() call update_multipath()
> > > > and
> > > > then move the code to manually change the queueing state to
> > > > enter_recovery_mode() and leave_recovery_mode(), instead of doing
> > > > it
> > > > in
> > > > set_no_path_retry().
> > > > 
> > > 
> > > Thanks for working out all the different cases. Personally, I would
> > > like to have this logic in one place, if possible. I'll leave it to
> > > your judgement, as you've taken a closer look than myself.
> > > 
> > > I'm not sure what you mean with "mpp->features is up to date"; are
> > > you
> > > considering some entity applying changes to the mpp features
> > > without
> > > involving multipathd, e.g. by using direct DM ioctls or dmsetup?
> > 
> > Or right after multipathd calls dm_queue_if_no_path(). We don't
> > immediately update mpp->features to reflect the changed
> > queue_if_no_paths setting of the device. So there are windows after
> > someone, including multipathd, has changed the queue_if_no_paths
> > setting
> > of the device, during which checking mpp->features won't give the
> > correct answer unless we first reread the device table. Obviously, if
> > something other than multipathd has changed this, then there's always
> > the possiblity of a race, but we can make sure that we don't race
> > with
> > ourselves.
> > 
> > For instance, the way things are right now, running:
> > 
> > # multipathd disablequeueing maps; multipathd restorequeueing maps
> > 
> > will occasionally leave the devices not queueing until the next time
> > a
> > path in them is checked, and set_no_path_retry actually restores
> > queueing.  But I'm actually going to deal with this differently. I'll
> > send a patchset shortly.
> 
> You are right, I see. It's actually an inconsistency that I would like
> to get rid of. I suggest that we ditch mpp->features entirely, and just
> keep track of the map's features using an abstract representation using
> booleans and ints. We should never apply kernel status changes that are
> inconsistent with our own representation of the map's features, IMO.

I don't agree with ditching it entirely. The existence of mpp->features
allows people to use a kernel feature with a version of the userspace
tools that doesn't know about it. This means that if a new feature is
added to the kernel, the existing multipath userspace tools can be used
to test it.

We need to provide users the option of setting features, and for
backwards compatibility, we need to accept strings in the feature line
that we also have other options to set. And we still need to build up a
features line to pass into the kernel. So we will still need the code to
reconcile the features line with other options.  I'm not sure we would
gain much by converting it to a set of variables and then back again.

The one case where we really need to track our current value separate
from the features line is queue_if_no_path, and it does make sense to
track that as a variable, so we can easily update it when we make
changes to it. But even there, the current value is not necessarily the
value we want to put in the features line we pass to the kernel. If we
aren't queueing, and then add a working path, when we reload the device
we want to use the configured no_path_retry value, and not our current
queueing state.

One thing that would be nice would be tweaking the queue_if_no_path
value in the features string that we pass to the kernel based on the
current queueing state and the current path states. If we are reloading
a device, and we know that all the paths in that device are down, and we
currently aren't queueing, we shouldn't pass in a features string that
tells the kernel to enable queueing, which is what we currently do. The
patch I posted which started this conversation exists because we don't
do this, and we need to fix the queueing state afterwards.
 
> Along a similar line of reasoning, let me emphasize again why I would
> like to have the logic for enabling or disabling queueing in one place.
> Your previous post nicely distinguished the various cases and call
> chains which change the queueing and recovery mode, but it would be
> better if that kind of assessment wasn't necessary, and instead it was
> possible to inspect the code in just a block of (say) 50 code lines
> that would decide whether or not a given map should be queueing and /
> or in recovery mode.
> 
> It's an unfortunate property of the multipath-tools code base that
> important logic is dispersed over multiple functions and source files,
> making a correct assessment of the code flow difficult and cumbersome.
> It can't be completely avoided given the complexity of the subject, but
> it would be nice to try.
> 
> I'm looking forward to your new patch set ;-)

Unfortunately, my new patchset doesn't change the first patch. It just
builds on it to clean up how the interactive client commands call things,
to reduce the callers of some of our functions, and keep the client
commands from doing more work than they should. After the changes, the
only things calling set_no_path_retry() are functions that are updating
the multipath device due to external events, and the check_path code.

The update_queue_mode{add,del}_path() functions should be (and based on
my code examination, are) able to assume a device in the correct queueing
and recovery state, and just do the work of entering or leaving recovery
mode, based on a change in the number of active paths.

If we move the code that checks whether our queueing state is wrong out
of set_no_path_retry() into update_queue_mode{add,del}_path(), then we
add a requirement that callers of those functions refresh the current
queueing state. Otherwise we can't actually guarantee that they do the
right thing. With the current callers, this doesn't really matter,
because we don't need them to fix the queueing state. But why not keep
the code that fixes the queueing state in a function that's called when
it could be wrong, and that is also (after my new patchset) only called
after we have refreshed the state, so it will do the right thing?

So, I would prefer if we go with my first patch, as is.  I'm open to the
idea of moving this code into update_queue_mode{add,del}_path() as part
of a future patchset where we switch to tracking the current queueing
state outside of mpp->features, and updating it whenever we actually
enable or disable queueing. I can work on this if you want.

-Ben 

> Thanks,
> Martin
> 
> 
> 
> 



* Re: [PATCH] multipathd: Make sure to disable queueing if recovery has failed.
  2023-12-05 19:28           ` Benjamin Marzinski
@ 2023-12-05 21:01             ` Martin Wilck
  0 siblings, 0 replies; 12+ messages in thread
From: Martin Wilck @ 2023-12-05 21:01 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: Christophe Varoqui, device-mapper development

On Tue, 2023-12-05 at 14:28 -0500, Benjamin Marzinski wrote:
> On Tue, Dec 05, 2023 at 03:57:13PM +0100, Martin Wilck wrote:
> > 
> > You are right, I see. It's actually an inconsistency that I would
> > like
> > to get rid of. I suggest that we ditch mpp->features entirely, and
> > just
> > keep track of the map's features using an abstract representation
> > using
> > booleans and ints. We should never apply kernel status changes that
> > are
> > inconsistent with our own representation of the map's features,
> > IMO.
> 
> I don't agree with ditching it entirely. The existence of
> mpp->features
> allows people to use a kernel feature with a version of the userspace
> tools that doesn't know about it. This means that if a new
> feature is added to the kernel, the existing multipath userspace
> tools
> can be used to test it.
> 
> We need to provide users the option of setting features, and for
> backwards compatibility, we need to accept strings in the feature
> line
> that we also have other options to set. And we still need to build up
> a
> features line to pass into the kernel. So we will still need the code
> to
> reconcile the features line with other options.  I'm not sure we
> would
> gain much by converting it to a set of variables and then back again.

Fair enough, but consistency is very important. Let's work towards
never setting anything in the kernel that doesn't match our notion of
the desired state of a map.

> The one case where we really need to track our current value separate
> from the features line is queue_if_no_path, and it does make sense to
> track that as a variable, so we can easily update it when we make
> changes to it. But even there, the current value is not necessarily
> the
> value we want to put in the features line we pass to the kernel. If
> we
> aren't queueing, and then add a working path, when we reload the
> device
> we want to use the configured no_path_retry value, and not our
> current
> queueing state.
> 
> One thing that would be nice would be tweaking the queue_if_no_path
> value in the features string that we pass to the kernel based on the
> current queueing state and the current path states. If we are
> reloading
> a device, and we know that all the paths in that device are down, and
> we
> currently aren't queueing, we shouldn't pass in a features string
> that
> tells the kernel to enable queueing, which is what we currently do.

Agreed.

> The
> patch I posted which started this conversation exists because we
> don't
> do this, and we need to fix the queueing state afterwards.
>  
> > Along a similar line of reasoning, let me emphasize again why I
> > would
> > like to have the logic for enabling or disabling queueing in one
> > place.
> > Your previous post nicely distinguished the various cases and call
> > chains which change the queueing and recovery mode, but it would be
> > better if that kind of assessment wasn't necessary, and instead it
> > was
> > possible to inspect the code in just a block of (say) 50 code lines
> > that would decide whether or not a given map should be queueing and
> > /
> > or in recovery mode.
> > 
> > It's an unfortunate property of the multipath-tools code base that
> > important logic is dispersed over multiple functions and source
> > files,
> > making a correct assessment of the code flow difficult and
> > cumbersome.
> > It can't be completely avoided given the complexity of the subject,
> > but
> > it would be nice to try.
> > 
> > I'm looking forward to your new patch set ;-)
> 
> Unfortunately, my new patchset doesn't change the first patch. It
> just
> builds on it to clean how the interactive client commands call
> things, to
> reduce the callers of some of our functions, and keep the client
> commands from doing more work than they should be. After the changes,
> the only things calling set_no_path_retry() are functions that are
> updating the multipath device due to external events, and the
> check_path
> code.  
> 
> The update_queue_mode{add,del}_path() functions should be (and based
> on
> my code examination, are) able to assume a device in the correct
> queuing
> and recovery state, and just do the work of entering or leaving the
> recovery mode, based on a change in the number of active paths. 

OK. Perhaps not what I was hoping for, but better than what we have
now. Maybe you can add a few comment lines to explain the various
cases, lest your analysis be forgotten.


> If we move the code that checks whether our queueing state is wrong out
> of
> set_no_path_retry() into the update_queue_mode{add,del}_path(), then
> we
> add a requirement that callers of those functions refresh the current
> queueing state. Otherwise we can't actually guarantee that they do
> the
> right thing. With the current callers, this doesn't really matter,
> because we don't need them to fix the queueing state. But why not
> keep
> the code that fixes the queueing state in a function that's called
> when
> it could be wrong, and also is (after my new patchset) only called
> after
> we have refreshed the state so it will do the right thing.
> 
> So, I would prefer if we go with my first patch, as is.  I'm open to
> the
> idea of moving this code into update_queue_mode{add,del}_path() as
> part
> of a future patchset where we switch to tracking the current queueing
> state outside of mpp->features, and updating it whenever we actually
> enable or disable queueing. I can work on this if you want.

I'm fine with leaving your first patch as is, perhaps with some
comments. Anyway please resubmit it as part of your new set.

Regards
Martin



end of thread, other threads:[~2023-12-05 21:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
2023-11-27 21:54 [PATCH] multipathd: Make sure to disable queueing if recovery has failed Benjamin Marzinski
2023-11-27 22:07 ` Roger Heflin
2023-11-28 17:13   ` Benjamin Marzinski
2023-11-28 17:59     ` Roger Heflin
2023-11-28 18:18       ` Benjamin Marzinski
2023-12-04 14:38 ` Martin Wilck
2023-12-04 20:00   ` Benjamin Marzinski
2023-12-04 20:43     ` Martin Wilck
2023-12-04 23:01       ` Benjamin Marzinski
2023-12-05 14:57         ` Martin Wilck
2023-12-05 19:28           ` Benjamin Marzinski
2023-12-05 21:01             ` Martin Wilck
