All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
       [not found] ` <0fd8e4d1-a91e-2246-e286-0ba5b7c19128@huawei.com>
@ 2021-02-01 11:04   ` Martin Wilck
  0 siblings, 0 replies; 11+ messages in thread
From: Martin Wilck @ 2021-02-01 11:04 UTC (permalink / raw)
  To: lixiaokeng, Christophe Varoqui, Benjamin Marzinski,
	dm-devel mailing list
  Cc: linfeilong

On Sun, 2021-01-31 at 18:59 +0800, lixiaokeng wrote:
> Hi Martin,
>   The local disks will not be a multipath device. Is it?

It depends on your configuration.

Regards
Martin



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
       [not found] <d6652b08-2d6c-ac46-9d73-b252bc26f41a@huawei.com>
       [not found] ` <0fd8e4d1-a91e-2246-e286-0ba5b7c19128@huawei.com>
@ 2021-02-01 11:16 ` Martin Wilck
  2021-02-01 11:49   ` lixiaokeng
  1 sibling, 1 reply; 11+ messages in thread
From: Martin Wilck @ 2021-02-01 11:16 UTC (permalink / raw)
  To: lixiaokeng, Christophe Varoqui, Benjamin Marzinski,
	dm-devel mailing list
  Cc: linfeilong

Hello lixiaokeng,

On Sun, 2021-01-31 at 18:55 +0800, lixiaokeng wrote:
> When find_multipaths is no and add local path by multiapthd,
> the local path will be filtered by filter_property. The
> pp->mpp is not set in adopt_paths. Then the pp->mpp will be
> dereferenced in get_be6.
> 
> Here add check pp->mpp in ev_add_path.
> 
> Signed-off-by: Lixiaokeng <lixiaokeng@huawei.com>

I don't understand. All callers (uev_add_path(), cli_add_path(),
check_path()) call pathinfo() first, which would return
PATHINFO_SKIPPED if the path was blacklisted. How do you end up 
in ev_add_path() with a blacklisted path?

And how is it possible that add_map_with_path(vecs, pp, 1) returns non-
NULL and pp->mpp is NULL?

Please explain.

Martin



> ---
>  multipathd/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index a4abbb2..8ad8bea 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1010,7 +1010,7 @@ rescan:
>             return 0;
>         }
>         condlog(4,"%s: creating new map", pp->dev);
> -       if ((mpp = add_map_with_path(vecs, pp, 1))) {
> +       if ((mpp = add_map_with_path(vecs, pp, 1)) && pp->mpp) {
>             mpp->action = ACT_CREATE;
>             /*
>              * We don't depend on ACT_CREATE, as domap will





--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-01 11:16 ` Martin Wilck
@ 2021-02-01 11:49   ` lixiaokeng
  2021-02-01 12:13     ` Martin Wilck
  0 siblings, 1 reply; 11+ messages in thread
From: lixiaokeng @ 2021-02-01 11:49 UTC (permalink / raw)
  To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski,
	dm-devel mailing list
  Cc: linfeilong



On 2021/2/1 19:16, Martin Wilck wrote:
> I don't understand. All callers (uev_add_path(), cli_add_path(),
> check_path()) call pathinfo() first, which would return
> PATHINFO_SKIPPED if the path was blacklisted. How do you end up 
> in ev_add_path() with a blacklisted path?
> 
> And how is it possible that add_map_with_path(vecs, pp, 1) returns non-
> NULL and pp->mpp is NULL?

The sdb is a local disk, stack:

cli_add_path
   ->ev_add_path
      ->add_map_with_path
         ->adopt_paths
            ->pathinfo
               ->filter_property
               ->return PATHINFO_SKIPPED,
            ->pp->mpp is NULL and not be set
            ->return 0
      ->mpath_pr_event_handle
         ->get_be64 //pp->mpp is dereference

The multipath.conf show:
defaults {
        find_multipaths no
}

Here, my local disks can't pass filter_property. Why?

Regards,
Lixiaokeng


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-01 11:49   ` lixiaokeng
@ 2021-02-01 12:13     ` Martin Wilck
  2021-02-01 14:50       ` lixiaokeng
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Wilck @ 2021-02-01 12:13 UTC (permalink / raw)
  To: lixiaokeng, Christophe Varoqui, Benjamin Marzinski,
	dm-devel mailing list
  Cc: linfeilong

[-- Attachment #1: Type: text/plain, Size: 1729 bytes --]

On Mon, 2021-02-01 at 19:49 +0800, lixiaokeng wrote:
> 
> 
> On 2021/2/1 19:16, Martin Wilck wrote:
> > I don't understand. All callers (uev_add_path(), cli_add_path(),
> > check_path()) call pathinfo() first, which would return
> > PATHINFO_SKIPPED if the path was blacklisted. How do you end up 
> > in ev_add_path() with a blacklisted path?
> > 
> > And how is it possible that add_map_with_path(vecs, pp, 1) returns
> > non-
> > NULL and pp->mpp is NULL?
> 
> The sdb is a local disk, stack:
> 
> cli_add_path
>    ->ev_add_path
>       ->add_map_with_path
>          ->adopt_paths
>             ->pathinfo
>                ->filter_property
>                ->return PATHINFO_SKIPPED,
>             ->pp->mpp is NULL and not be set
>             ->return 0

This returns 0, but add_map_with_path() has this code to check whether
the path passed to it was actually added to the new map:

	if (adopt_paths(vecs->pathvec, mpp) ||
	    find_slot(vecs->pathvec, pp) == -1)
		goto out;  -> return NULL

So ev_add_path() should have seen a NULL return from
add_map_with_path(), should not have set start_waiter, and failed. 

Wait ... perhaps it could happen if after "goto rescan:" that
start_waiter was already set to one.

Can you try the attached patch?


>       ->mpath_pr_event_handle
>          ->get_be64 //pp->mpp is dereference
> 
> The multipath.conf show:
> defaults {
>         find_multipaths no
> }
> 
> Here, my local disks can't pass filter_property. Why?

This depends on the disk, and on your blacklisting settings. By
default, multipathd refuses paths that have neither
SCSI_IDENT_* nor ID_WWN_* properties set. See multipath.conf(5).

Regards
Martin


[-- Attachment #2: 0001-multipathd-ev_add_path-fail-if-add_map_with_path-fai.patch --]
[-- Type: text/x-patch, Size: 798 bytes --]

From f3f5440de274590d5d2439861aecfb70b1721988 Mon Sep 17 00:00:00 2001
From: Martin Wilck <mwilck@suse.com>
Date: Mon, 1 Feb 2021 13:10:46 +0100
Subject: [PATCH] multipathd: ev_add_path: fail if add_map_with_path() fails

If start_waiter was set before and the "rescan" label was used,
we may try to set up an empty/invalid map.
Always fail if add_map_with_path() isn't successful.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 6f851ae..134185f 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1028,7 +1028,7 @@ rescan:
 			 */
 			start_waiter = 1;
 		}
-		if (!start_waiter)
+		else
 			goto fail; /* leave path added to pathvec */
 	}
 
-- 
2.29.2


[-- Attachment #3: Type: text/plain, Size: 93 bytes --]

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-01 12:13     ` Martin Wilck
@ 2021-02-01 14:50       ` lixiaokeng
  2021-02-01 15:12         ` Martin Wilck
  0 siblings, 1 reply; 11+ messages in thread
From: lixiaokeng @ 2021-02-01 14:50 UTC (permalink / raw)
  To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski,
	dm-devel mailing list
  Cc: linfeilong


>>
>> cli_add_path
>>    ->ev_add_path
>>       ->add_map_with_path
>>          ->adopt_paths
>>             ->pathinfo
>>                ->filter_property
>>                ->return PATHINFO_SKIPPED,
>>             ->pp->mpp is NULL and not be set
>>             ->return 0
> 
> This returns 0, but add_map_with_path() has this code to check whether
> the path passed to it was actually added to the new map:
> 
> 	if (adopt_paths(vecs->pathvec, mpp) ||
> 	    find_slot(vecs->pathvec, pp) == -1)
> 		goto out;  -> return NULL
> 
> So ev_add_path() should have seen a NULL return from
> add_map_with_path(), should not have set start_waiter, and failed. 
> 

I'm sorry for a big mistake in my stack. As the code is optimized, pathinfo
return PATHINFO_SKIPPED after finish filter_property when I use gdb. It
happens acctualy in:
2141			if (pp->bus == SYSFS_BUS_SCSI &&
2142			    pp->sg_id.proto_id == SCSI_PROTOCOL_USB &&
2143			    !conf->allow_usb_devices) {
2144				condlog(3, "%s: skip USB device %s", pp->dev,
2145					pp->tgt_node_name);
2146				return PATHINFO_SKIPPED;
2147			}
2148		}

Stack:
cli_add_path
   ->ev_add_path
      ->add_map_with_path
         ->adopt_paths
            ->pathinfo
               ->pp->bus == SYSFS_BUS_SCSI
               ->return PATHINFO_SKIPPED,
            ->pp->mpp is NULL and not be set
            ->return 0
      ->mpath_pr_event_handle
         ->get_be64 //pp->mpp is dereference

If you think my patch is ok, I will resend it.

Regards
Lixiaokeng


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-01 14:50       ` lixiaokeng
@ 2021-02-01 15:12         ` Martin Wilck
  2021-02-02  5:26           ` Benjamin Marzinski
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Wilck @ 2021-02-01 15:12 UTC (permalink / raw)
  To: lixiaokeng, Christophe Varoqui, Benjamin Marzinski,
	dm-devel mailing list
  Cc: linfeilong

On Mon, 2021-02-01 at 22:50 +0800, lixiaokeng wrote:
> 
> > > 
> > > cli_add_path
> > >    ->ev_add_path
> > >       ->add_map_with_path
> > >          ->adopt_paths
> > >             ->pathinfo
> > >                ->filter_property
> > >                ->return PATHINFO_SKIPPED,
> > >             ->pp->mpp is NULL and not be set
> > >             ->return 0
> > 
> > This returns 0, but add_map_with_path() has this code to check
> > whether
> > the path passed to it was actually added to the new map:
> > 
> >         if (adopt_paths(vecs->pathvec, mpp) ||
> >             find_slot(vecs->pathvec, pp) == -1)
> >                 goto out;  -> return NULL
> > 
> > So ev_add_path() should have seen a NULL return from
> > add_map_with_path(), should not have set start_waiter, and failed. 
> > 
> 
> I'm sorry for a big mistake in my stack. As the code is optimized,
> pathinfo
> return PATHINFO_SKIPPED after finish filter_property when I use gdb.
> It
> happens acctualy in:
> 2141                    if (pp->bus == SYSFS_BUS_SCSI &&
> 2142                        pp->sg_id.proto_id == SCSI_PROTOCOL_USB
> &&
> 2143                        !conf->allow_usb_devices) {
> 2144                            condlog(3, "%s: skip USB device %s",
> pp->dev,
> 2145                                    pp->tgt_node_name);
> 2146                            return PATHINFO_SKIPPED;
> 2147                    }
> 2148            }
> 
> Stack:
> cli_add_path
>    ->ev_add_path
>       ->add_map_with_path
>          ->adopt_paths
>             ->pathinfo
>                ->pp->bus == SYSFS_BUS_SCSI
>                ->return PATHINFO_SKIPPED,
>             ->pp->mpp is NULL and not be set
>             ->return 0
>       ->mpath_pr_event_handle
>          ->get_be64 //pp->mpp is dereference
> 
> If you think my patch is ok, I will resend it.

The same argument I made above still holds. "pp" wouldn't have been
added to mpp, and add_map_with_path() would fail and return NULL.
Also, if pathinfo() returns PATHINFO_SKIPPED for this device,
how comes that cli_add_path() called ev_add_path() for it? It should
have returned "blacklisted" instead.

Your patch would only be effective if it was possible that
add_map_with_path(vecs, pp, 1) returned an mpp != NULL, and at the same
time pp->mpp was NULL; I still don't understand how that can come to
pass.

Have you tried my patch?

Regards,
Martin




--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-01 15:12         ` Martin Wilck
@ 2021-02-02  5:26           ` Benjamin Marzinski
  2021-02-02  7:18             ` Benjamin Marzinski
                               ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Benjamin Marzinski @ 2021-02-02  5:26 UTC (permalink / raw)
  To: Martin Wilck; +Cc: lixiaokeng, dm-devel mailing list, linfeilong

On Mon, Feb 01, 2021 at 04:12:34PM +0100, Martin Wilck wrote:
> On Mon, 2021-02-01 at 22:50 +0800, lixiaokeng wrote:
> > 
> > > > 
> > > > cli_add_path
> > > >    ->ev_add_path
> > > >       ->add_map_with_path
> > > >          ->adopt_paths
> > > >             ->pathinfo
> > > >                ->filter_property
> > > >                ->return PATHINFO_SKIPPED,
> > > >             ->pp->mpp is NULL and not be set
> > > >             ->return 0
> > > 
> > > This returns 0, but add_map_with_path() has this code to check
> > > whether
> > > the path passed to it was actually added to the new map:
> > > 
> > >         if (adopt_paths(vecs->pathvec, mpp) ||
> > >             find_slot(vecs->pathvec, pp) == -1)
> > >                 goto out;  -> return NULL
> > > 
> > > So ev_add_path() should have seen a NULL return from
> > > add_map_with_path(), should not have set start_waiter, and failed. 
> > > 
> > 
> > I'm sorry for a big mistake in my stack. As the code is optimized,
> > pathinfo
> > return PATHINFO_SKIPPED after finish filter_property when I use gdb.
> > It
> > happens acctualy in:
> > 2141                    if (pp->bus == SYSFS_BUS_SCSI &&
> > 2142                        pp->sg_id.proto_id == SCSI_PROTOCOL_USB
> > &&
> > 2143                        !conf->allow_usb_devices) {
> > 2144                            condlog(3, "%s: skip USB device %s",
> > pp->dev,
> > 2145                                    pp->tgt_node_name);
> > 2146                            return PATHINFO_SKIPPED;
> > 2147                    }
> > 2148            }
> > 
> > Stack:
> > cli_add_path
> >    ->ev_add_path
> >       ->add_map_with_path
> >          ->adopt_paths
> >             ->pathinfo
> >                ->pp->bus == SYSFS_BUS_SCSI
> >                ->return PATHINFO_SKIPPED,
> >             ->pp->mpp is NULL and not be set
> >             ->return 0
> >       ->mpath_pr_event_handle
> >          ->get_be64 //pp->mpp is dereference
> > 
> > If you think my patch is ok, I will resend it.
> 
> The same argument I made above still holds. "pp" wouldn't have been
> added to mpp, and add_map_with_path() would fail and return NULL.
> Also, if pathinfo() returns PATHINFO_SKIPPED for this device,
> how comes that cli_add_path() called ev_add_path() for it? It should
> have returned "blacklisted" instead.

So, I think the main issue here is that filter_property appears to be
broken.  It only filters if uid_attribute is set, but that will never be
set the first time it's called in pathinfo.  This means that it will
pass in the pathinfo call in cli_add_path, and the path will get stored
in the pathvec.

However, it will fail in the pathinfo call from adopt_paths, so the path
won't be added to the multipath device.  This means adopt paths doesn't
actually adopt any paths potentially, but that in itself doesn't cause
it to fail. This check

        if (adopt_paths(vecs->pathvec, mpp) ||
            find_slot(vecs->pathvec, pp) == -1)
                goto out;

passes, since we only check if the path is on the pathvec, not part of
the multipath device, and since filter_property let the path past the
first time, it is. So add_map_with_path() will create a multipath
device, but the path won't be added to it, and pp->mpp == NULL.

So, add_map_with_path() should probably check that we actually created a
map that included the path that got added. But more importantly,
filter_property shouldn't return different results the when it's called
the first time.  That would have avoid the entire situation.

-Ben


> Your patch would only be effective if it was possible that
> add_map_with_path(vecs, pp, 1) returned an mpp != NULL, and at the same
> time pp->mpp was NULL; I still don't understand how that can come to
> pass.
> 
> Have you tried my patch?
> 
> Regards,
> Martin
> 
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-02  5:26           ` Benjamin Marzinski
@ 2021-02-02  7:18             ` Benjamin Marzinski
  2021-02-02 18:00               ` Martin Wilck
  2021-02-02  8:13             ` lixiaokeng
  2021-02-02 11:04             ` Martin Wilck
  2 siblings, 1 reply; 11+ messages in thread
From: Benjamin Marzinski @ 2021-02-02  7:18 UTC (permalink / raw)
  To: Martin Wilck; +Cc: lixiaokeng, dm-devel mailing list, linfeilong

On Mon, Feb 01, 2021 at 11:26:36PM -0600, Benjamin Marzinski wrote:
> On Mon, Feb 01, 2021 at 04:12:34PM +0100, Martin Wilck wrote:
> > On Mon, 2021-02-01 at 22:50 +0800, lixiaokeng wrote:
> > > 
> > > > > 
> > > > > cli_add_path
> > > > >    ->ev_add_path
> > > > >       ->add_map_with_path
> > > > >          ->adopt_paths
> > > > >             ->pathinfo
> > > > >                ->filter_property
> > > > >                ->return PATHINFO_SKIPPED,
> > > > >             ->pp->mpp is NULL and not be set
> > > > >             ->return 0
> > > > 
> > > > This returns 0, but add_map_with_path() has this code to check
> > > > whether
> > > > the path passed to it was actually added to the new map:
> > > > 
> > > >         if (adopt_paths(vecs->pathvec, mpp) ||
> > > >             find_slot(vecs->pathvec, pp) == -1)
> > > >                 goto out;  -> return NULL
> > > > 
> > > > So ev_add_path() should have seen a NULL return from
> > > > add_map_with_path(), should not have set start_waiter, and failed. 
> > > > 
> > > 
> > > I'm sorry for a big mistake in my stack. As the code is optimized,
> > > pathinfo
> > > return PATHINFO_SKIPPED after finish filter_property when I use gdb.
> > > It
> > > happens acctualy in:
> > > 2141                    if (pp->bus == SYSFS_BUS_SCSI &&
> > > 2142                        pp->sg_id.proto_id == SCSI_PROTOCOL_USB
> > > &&
> > > 2143                        !conf->allow_usb_devices) {
> > > 2144                            condlog(3, "%s: skip USB device %s",
> > > pp->dev,
> > > 2145                                    pp->tgt_node_name);
> > > 2146                            return PATHINFO_SKIPPED;
> > > 2147                    }
> > > 2148            }
> > > 
> > > Stack:
> > > cli_add_path
> > >    ->ev_add_path
> > >       ->add_map_with_path
> > >          ->adopt_paths
> > >             ->pathinfo
> > >                ->pp->bus == SYSFS_BUS_SCSI
> > >                ->return PATHINFO_SKIPPED,
> > >             ->pp->mpp is NULL and not be set
> > >             ->return 0
> > >       ->mpath_pr_event_handle
> > >          ->get_be64 //pp->mpp is dereference
> > > 
> > > If you think my patch is ok, I will resend it.
> > 
> > The same argument I made above still holds. "pp" wouldn't have been
> > added to mpp, and add_map_with_path() would fail and return NULL.
> > Also, if pathinfo() returns PATHINFO_SKIPPED for this device,
> > how comes that cli_add_path() called ev_add_path() for it? It should
> > have returned "blacklisted" instead.
> 
> So, I think the main issue here is that filter_property appears to be
> broken.  It only filters if uid_attribute is set, but that will never be
> set the first time it's called in pathinfo.  This means that it will
> pass in the pathinfo call in cli_add_path, and the path will get stored
> in the pathvec.

Just to be a little more clear here, filter_property() will only return
MATCH_PROPERTY_BLIST_MISSING for missing udev properties, if the
uid_attribute is set and seen. We should probably make sure to set
uid_attribute before calling it.

-Ben

> 
> However, it will fail in the pathinfo call from adopt_paths, so the path
> won't be added to the multipath device.  This means adopt paths doesn't
> actually adopt any paths potentially, but that in itself doesn't cause
> it to fail. This check
> 
>         if (adopt_paths(vecs->pathvec, mpp) ||
>             find_slot(vecs->pathvec, pp) == -1)
>                 goto out;
> 
> passes, since we only check if the path is on the pathvec, not part of
> the multipath device, and since filter_property let the path past the
> first time, it is. So add_map_with_path() will create a multipath
> device, but the path won't be added to it, and pp->mpp == NULL.
> 
> So, add_map_with_path() should probably check that we actually created a
> map that included the path that got added. But more importantly,
> filter_property shouldn't return different results the when it's called
> the first time.  That would have avoid the entire situation.
> 
> -Ben
> 
> 
> > Your patch would only be effective if it was possible that
> > add_map_with_path(vecs, pp, 1) returned an mpp != NULL, and at the same
> > time pp->mpp was NULL; I still don't understand how that can come to
> > pass.
> > 
> > Have you tried my patch?
> > 
> > Regards,
> > Martin
> > 
> > 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-02  5:26           ` Benjamin Marzinski
  2021-02-02  7:18             ` Benjamin Marzinski
@ 2021-02-02  8:13             ` lixiaokeng
  2021-02-02 11:04             ` Martin Wilck
  2 siblings, 0 replies; 11+ messages in thread
From: lixiaokeng @ 2021-02-02  8:13 UTC (permalink / raw)
  To: Benjamin Marzinski, Martin Wilck; +Cc: linfeilong, dm-devel mailing list



On 2021/2/2 13:26, Benjamin Marzinski wrote:
> So, I think the main issue here is that filter_property appears to be
> broken.  It only filters if uid_attribute is set, but that will never be
> set the first time it's called in pathinfo.  This means that it will
> pass in the pathinfo call in cli_add_path, and the path will get stored
> in the pathvec.
> 

  Yes! The pathinfo returns PATHINFO_OK in store_pathinfo but returns
PATHINFO_SKIPPED in adopt_paths. I'm sorry for not mentioning it in the
previous e-mails. I just focus on where return PATHINFO_SKIPPED in second
pathinfo.

Gdb second pathinfo in adopt_paths:
(gdb)
2106			if (hidden && !strcmp(hidden, "1")) {
(gdb)
2103			const char *hidden =
(gdb)
2106			if (hidden && !strcmp(hidden, "1")) {
(gdb)
2110			if (is_claimed_by_foreign(pp->udev) ||
(gdb)
2111			    filter_property(conf, pp->udev, 4, pp->uid_attribute) > 0)
(gdb)
2110			if (is_claimed_by_foreign(pp->udev) ||
(gdb)
2146				return PATHINFO_SKIPPED;
(gdb)
2260	}
  I'm not sure filter_property makes pathinfo return PATHINFO_SKIPPED from gdb.
Ben’s analysis resolves my doubts.

> However, it will fail in the pathinfo call from adopt_paths, so the path
> won't be added to the multipath device.  This means adopt paths doesn't
> actually adopt any paths potentially, but that in itself doesn't cause
> it to fail. This check
> 
>         if (adopt_paths(vecs->pathvec, mpp) ||
>             find_slot(vecs->pathvec, pp) == -1)
>                 goto out;
> 
> passes, since we only check if the path is on the pathvec, not part of
> the multipath device, and since filter_property let the path past the
> first time, it is. So add_map_with_path() will create a multipath
> device, but the path won't be added to it, and pp->mpp == NULL.
> 
> So, add_map_with_path() should probably check that we actually created a
> map that included the path that got added. But more importantly,
> filter_property shouldn't return different results the when it's called
> the first time.  That would have avoid the entire situation.
	if (adopt_paths(vecs->pathvec, mpp) ||
	    find_slot(vecs->pathvec, pp) == -1 ||
	    !pp->mpp)
		goto out;

This is better than my first patch to avoid this problem. However, it
is beyond my ability to slove the problem of filter_property returning
different values.

Regards,
Lixiaokeng



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-02  5:26           ` Benjamin Marzinski
  2021-02-02  7:18             ` Benjamin Marzinski
  2021-02-02  8:13             ` lixiaokeng
@ 2021-02-02 11:04             ` Martin Wilck
  2 siblings, 0 replies; 11+ messages in thread
From: Martin Wilck @ 2021-02-02 11:04 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: lixiaokeng, dm-devel mailing list, linfeilong

On Mon, 2021-02-01 at 23:26 -0600, Benjamin Marzinski wrote:
> On Mon, Feb 01, 2021 at 04:12:34PM +0100, Martin Wilck wrote:
> > On Mon, 2021-02-01 at 22:50 +0800, lixiaokeng wrote:
> > > 
> > > > > 
> > > > > cli_add_path
> > > > >    ->ev_add_path
> > > > >       ->add_map_with_path
> > > > >          ->adopt_paths
> > > > >             ->pathinfo
> > > > >                ->filter_property
> > > > >                ->return PATHINFO_SKIPPED,
> > > > >             ->pp->mpp is NULL and not be set
> > > > >             ->return 0
> > > > 
> > > > This returns 0, but add_map_with_path() has this code to check
> > > > whether
> > > > the path passed to it was actually added to the new map:
> > > > 
> > > >         if (adopt_paths(vecs->pathvec, mpp) ||
> > > >             find_slot(vecs->pathvec, pp) == -1)
> > > >                 goto out;  -> return NULL
> > > > 
> > > > So ev_add_path() should have seen a NULL return from
> > > > add_map_with_path(), should not have set start_waiter, and
> > > > failed. 
> > > > 
> > > 
> > > I'm sorry for a big mistake in my stack. As the code is
> > > optimized,
> > > pathinfo
> > > return PATHINFO_SKIPPED after finish filter_property when I use
> > > gdb.
> > > It
> > > happens acctualy in:
> > > 2141                    if (pp->bus == SYSFS_BUS_SCSI &&
> > > 2142                        pp->sg_id.proto_id ==
> > > SCSI_PROTOCOL_USB
> > > &&
> > > 2143                        !conf->allow_usb_devices) {
> > > 2144                            condlog(3, "%s: skip USB device
> > > %s",
> > > pp->dev,
> > > 2145                                    pp->tgt_node_name);
> > > 2146                            return PATHINFO_SKIPPED;
> > > 2147                    }
> > > 2148            }
> > > 
> > > Stack:
> > > cli_add_path
> > >    ->ev_add_path
> > >       ->add_map_with_path
> > >          ->adopt_paths
> > >             ->pathinfo
> > >                ->pp->bus == SYSFS_BUS_SCSI
> > >                ->return PATHINFO_SKIPPED,
> > >             ->pp->mpp is NULL and not be set
> > >             ->return 0
> > >       ->mpath_pr_event_handle
> > >          ->get_be64 //pp->mpp is dereference
> > > 
> > > If you think my patch is ok, I will resend it.
> > 
> > The same argument I made above still holds. "pp" wouldn't have been
> > added to mpp, and add_map_with_path() would fail and return NULL.
> > Also, if pathinfo() returns PATHINFO_SKIPPED for this device,
> > how comes that cli_add_path() called ev_add_path() for it? It
> > should
> > have returned "blacklisted" instead.
> 
> So, I think the main issue here is that filter_property appears to be
> broken.  It only filters if uid_attribute is set, but that will never
> be
> set the first time it's called in pathinfo.  This means that it will
> pass in the pathinfo call in cli_add_path, and the path will get
> stored
> in the pathvec.

I'd not say filter_property() is broken; rather, its callers. 
See below.

> However, it will fail in the pathinfo call from adopt_paths, so the
> path
> won't be added to the multipath device.  This means adopt paths
> doesn't
> actually adopt any paths potentially, but that in itself doesn't
> cause
> it to fail. This check
> 
>         if (adopt_paths(vecs->pathvec, mpp) ||
>             find_slot(vecs->pathvec, pp) == -1)
>                 goto out;
> 
> passes, since we only check if the path is on the pathvec, not part
> of
> the multipath device, and since filter_property let the path past the
> first time, it is. So add_map_with_path() will create a multipath
> device, but the path won't be added to it, and pp->mpp == NULL.
> 
> So, add_map_with_path() should probably check that we actually
> created a
> map that included the path that got added. 

Exactly, this is a bug. Thanks for analyzing this. I'll send a patch
asap.

> But more importantly,
> filter_property shouldn't return different results the when it's
> called
> the first time.  That would have avoid the entire situation.

I'm not sure about this. filter_property() returning different results
depending on the initialization state of the path is irritating, sure.
Yet it makes certain sense that once the information about a path
object is more complete, the decision whether or not it should be
blacklisted might change. The only way to avoid that is to provide the
necessary information in the first place.

IMO the only reasonable way to amend this is to call select_getuid()
before filter_property(). I'll send a patch for that, too.
Determination of getuid/uid_attribute isn't expensive; the only reason
why we've been doing it "lazily" is legacy, AFAICS.

Note: We bump into a deep design issue with multipath-tools here, the
fact that path and multipath properties depend on each other in complex
ways, but these dependencies are hidden deeply in the code and need to
be explicitly memorized by us developers when we work on the code,
which is error-prone. Unfortunately, I see no easy way out of this.

Regards
Martin



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dm-devel] libmultipath: fix NULL dereference in get_be64
  2021-02-02  7:18             ` Benjamin Marzinski
@ 2021-02-02 18:00               ` Martin Wilck
  0 siblings, 0 replies; 11+ messages in thread
From: Martin Wilck @ 2021-02-02 18:00 UTC (permalink / raw)
  To: Benjamin Marzinski, lixiaokeng; +Cc: linfeilong, dm-devel mailing list

On Tue, 2021-02-02 at 01:18 -0600, Benjamin Marzinski wrote:
> On Mon, Feb 01, 2021 at 11:26:36PM -0600, Benjamin Marzinski wrote:
> > On Mon, Feb 01, 2021 at 04:12:34PM +0100, Martin Wilck wrote:
> > > 
> > > The same argument I made above still holds. "pp" wouldn't have
> > > been
> > > added to mpp, and add_map_with_path() would fail and return NULL.
> > > Also, if pathinfo() returns PATHINFO_SKIPPED for this device,
> > > how comes that cli_add_path() called ev_add_path() for it? It
> > > should
> > > have returned "blacklisted" instead.
> > 
> > So, I think the main issue here is that filter_property appears to
> > be
> > broken.  It only filters if uid_attribute is set, but that will
> > never be
> > set the first time it's called in pathinfo.  This means that it
> > will
> > pass in the pathinfo call in cli_add_path, and the path will get
> > stored
> > in the pathvec.
> 
> Just to be a little more clear here, filter_property() will only
> return
> MATCH_PROPERTY_BLIST_MISSING for missing udev properties, if the
> uid_attribute is set and seen. We should probably make sure to set
> uid_attribute before calling it.

I've just spent a few hours trying to figure this out. Unfortunately,
it isn't easy. pp->uid_attribute may be set from the hwtable, which
means that vendor/product must be known, which in turn means that
uid_attribute can't be set correctly before sysfs_pathinfo() is run.

Thus, in order to be consistent, we need to move the filter_property()
quite a bit further down in pathinfo(), after the call to
sysfs_pathinfo(). That can be done, I already have a (basically)
working code here (most of the work was fixing the unit tests).

*However*, that has a side effect. As you said correctly above, 
pp->uid_attribute currently is *never* set the first time the call
chain pathinfo()->filter_property() is run. After the proposed change,
it would *always* be set in this call chain, possibly leading to more
cases where pathinfo returns PATHINFO_SKIPPED.

AFAICS this would matter e.g. for "multipath -w". When we remove a
WWID, we must be sure get_refwwid() fills in the wwid, which won't
happen if PATHINFO_SKIPPED is returned. (This would only affect paths
that filter_property() would skip, so it's just a corner case, yet it
might confuse people in certain situations).

In order not to break such use cases, we need to make the
filter_property() test in pathinfo() dependent on DI_BLACKLIST. That
would make a lot of sense, but it'd cause another semantic change.
d51da42 ("libmultipath: move filter_property|devnode() from
path_discover() into pathinfo()") had deliberately not made this
change.

Well, I'll submit a patch set, in order to make you see how this would
end up looking. But I wouldn't propose to queue that up for mainline
just yet. I strongly hope that the fix for add_map_with_path() alone
will fix lixiaokeng's issue.

Cheers,
Martin



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-02-02 18:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <d6652b08-2d6c-ac46-9d73-b252bc26f41a@huawei.com>
     [not found] ` <0fd8e4d1-a91e-2246-e286-0ba5b7c19128@huawei.com>
2021-02-01 11:04   ` [dm-devel] libmultipath: fix NULL dereference in get_be64 Martin Wilck
2021-02-01 11:16 ` Martin Wilck
2021-02-01 11:49   ` lixiaokeng
2021-02-01 12:13     ` Martin Wilck
2021-02-01 14:50       ` lixiaokeng
2021-02-01 15:12         ` Martin Wilck
2021-02-02  5:26           ` Benjamin Marzinski
2021-02-02  7:18             ` Benjamin Marzinski
2021-02-02 18:00               ` Martin Wilck
2021-02-02  8:13             ` lixiaokeng
2021-02-02 11:04             ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.