Hello! We encountered quite a few sporadic crashes which, upon inspection,
turned out to be caused by the dev->aliases list being empty.

One example of a fix (not tested):

--- a/lib/label/hints.c
+++ b/lib/label/hints.c
@@ -471,7 +471,8 @@ int validate_hints(struct cmd_context *cmd, struct dm_list *hints)
 	if (!(iter = dev_iter_create(NULL, 0)))
 		return 0;
 	while ((dev = dev_iter_get(cmd, iter))) {
-		if (!(hint = _find_hint_name(hints, dev_name(dev))))
+		if (dm_list_empty(&dev->aliases) ||
+		    !(hint = _find_hint_name(hints, dev_name(dev))))
 			continue;
 
 		/* The cmd hasn't needed this hint's dev so it's not been scanned. */

What happened is that dev_name(dev) extracted an element of `dev->aliases`,
but `dev->aliases` was empty, so the extracted element was junk.

Although we encountered these crashes on the older 2.03.07 version, the patch
still applies to the latest master, so the bug appears to still be present,
which is odd. This makes me wonder: is this a known problem? Could I have
overlooked something, for example that dev->aliases should never be empty, so
that the fix merely works around another problem? Any thoughts?

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
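For readers following the thread, here is a minimal sketch of why taking the
"first element" of an empty circular intrusive list yields junk, and what a
guarded accessor could look like. Everything in it (struct list, struct
str_item, list_item, first_name, unknown_name) is a hypothetical stand-in
written for illustration. It is not the libdevmapper or LVM code, and it only
assumes that the real list has a similar circular layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly linked list; NOT the libdevmapper types. */
struct list {
	struct list *n, *p;
};

/* Hypothetical element type with the list node embedded in it. */
struct str_item {
	struct list list;
	const char *str;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
#define list_item(node, type) \
	((type *)((char *)(node) - offsetof(type, list)))

static const char unknown_name[] = "[unknown]";

static void list_init(struct list *head)
{
	head->n = head->p = head;
}

/* An empty circular list is one whose head points back at itself. */
static int list_empty(const struct list *head)
{
	return head->n == head;
}

/* Append elem at the tail of the list. */
static void list_add(struct list *head, struct list *elem)
{
	assert(head && elem);
	elem->n = head;
	elem->p = head->p;
	head->p->n = elem;
	head->p = elem;
}

/*
 * Guarded accessor. Without the list_empty() check, head->n would be the
 * head itself, list_item() would reinterpret the head as a str_item, and
 * reading ->str would pick up whatever memory happens to follow the head.
 */
static const char *first_name(const struct list *head)
{
	if (list_empty(head))
		return unknown_name;
	return list_item(head->n, struct str_item)->str;
}
```

With an empty list, list_item(head.n, struct str_item) compares equal to the
address of the head itself, which is why the unguarded read returns junk
rather than failing cleanly. Whether the check belongs in each caller or in
the accessor is a design choice; the sketch puts it in the accessor so that
every caller is covered.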
On Mon, Feb 21, 2022 at 04:45:51PM +0300, Konstantin Kharlamov wrote:
> What happened is that dev_name(dev) extracted an element of `dev->aliases`,
> but `dev->aliases` was empty, so the extracted element was junk.
>
> Although we encountered these crashes on the older 2.03.07 version, the patch
> still applies to the latest master, so the bug appears to still be present,
> which is odd. This makes me wonder: is this a known problem? Could I have
> overlooked something, for example that dev->aliases should never be empty, so
> that the fix merely works around another problem? Any thoughts?

It's familiar, but I thought it was fixed. I don't remember the details,
so we'll have to look at it again. It's related to a device being removed
from the system while the command is running, which we need to add tests
for.

Dave
On Mon, 2022-02-21 at 10:49 -0600, David Teigland wrote:
> It's familiar, but I thought it was fixed. I don't remember the details,
> so we'll have to look at it again. It's related to a device being removed
> from the system while the command is running, which we need to add tests
> for.

Okay, thank you. So I guess the conclusion is that, before anything else, we
need to reproduce it with the latest version.
To close the topic: there was a discussion off-list, and a number of fixes were
merged. We tested some of them as well (we backported them to an older LVM),
and it has been more than a month since then with no crashes. So the problem
is fixed.

On Mon, 2022-02-21 at 16:45 +0300, Konstantin Kharlamov wrote:
> Hello! We encountered quite a few sporadic crashes which, upon inspection,
> turned out to be caused by the dev->aliases list being empty.
>
> [...]