* sysfs: cannot create duplicate filename @ 2013-03-02 18:03 Andre Heider [not found] ` <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Andre Heider @ 2013-03-02 18:03 UTC (permalink / raw) To: linux-efi-u79uwXL29TY76Z2rM5mHXA Hi, After a BIOS update I get this in dmesg: [ 0.581554] EFI Variables Facility v0.08 2004-May-17 [ 0.584914] ------------[ cut here ]------------ [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x d4/0x100() [ 0.586381] Hardware name: To be filled by O.E.M. [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' [ 0.588694] Modules linked in: [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 [ 0.590280] Call Trace: [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- [ 0.607869] ------------[ cut here ]------------ [ 0.608589] WARNING: at /home/andre/linux/lib/kobject.c:196 kobject_add_internal+0x204/0x220() [ 0.609342] Hardware name: To be filled by O.E.M. [ 0.610093] kobject_add_internal failed for SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8 with -EEXIST, don't try to register things with the same name in the same directory. [ 0.611703] Modules linked in: [ 0.612497] Pid: 1, comm: swapper/0 Tainted: G W 3.8.0+ #7 [ 0.613312] Call Trace: [ 0.614112] [<ffffffff81347fb4>] ? kobject_add_internal+0x204/0x220 [ 0.614931] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 [ 0.615753] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 [ 0.616559] [<ffffffff81347fb4>] kobject_add_internal+0x204/0x220 [ 0.617362] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 [ 0.618152] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 [ 0.618936] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 [ 0.619732] [<ffffffff8158517e>] register_efivars+0xde/0x420 [ 0.620518] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 [ 0.621297] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 [ 0.622072] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 [ 0.622836] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 [ 0.623590] [<ffffffff81d05586>] ? loglevel+0x31/0x31 [ 0.624314] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.625010] [<ffffffff816a653e>] kernel_init+0xe/0xf0 [ 0.625703] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 [ 0.626393] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.627078] ---[ end trace 1609741ab737eb2a ]--- Sounds like yet another BIOS issue, and I guess the answer is "complain to the manufacturer"... But we know how well that works, so can this be a warning instead of a scary backtrace? :) (this is: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" Thanks, Andre ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-03-05 9:39 ` Lingzhu Xiang [not found] ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-05 9:39 UTC (permalink / raw) To: Andre Heider; +Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/03/2013 02:03 AM, Andre Heider wrote: > After a BIOS update I get this in dmesg: > > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 > [ 0.584914] ------------[ cut here ]------------ > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x > d4/0x100() > [ 0.586381] Hardware name: To be filled by O.E.M. > [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' > (snip) > > Sounds like yet another BIOS issue, and I guess the answer is > "complain to the manufacturer"... But we know how well that works, so > can this be a warning instead of a scary backtrace? :) > > (this is: Gigabyte Technology Co., Ltd. To be filled by > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631 I think this problem is getting more and more common. I had an IBM System x3100 M4 and x3850 X5 on which kernel would get stuck in infinite loop creating duplicate sysfs files because, for some reason, there are several duplicate boot entries in nvram getting GetNextVariableName into a circle of iteration (with period > 2). Granted, it's not kernel's fault for duplicate variables, but it'd be much harder for users to fix their broken firmware than kernel dealing with it by not dying on the never-ending GetNextVariableName. Lingzhu Xiang ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-06 13:19 ` Matt Fleming [not found] ` <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-08 15:11 ` Matt Fleming 1 sibling, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-06 13:19 UTC (permalink / raw) To: Lingzhu Xiang; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On Tue, 2013-03-05 at 17:39 +0800, Lingzhu Xiang wrote: > On 03/03/2013 02:03 AM, Andre Heider wrote: > > After a BIOS update I get this in dmesg: > > > > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 > > [ 0.584914] ------------[ cut here ]------------ > > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x > > d4/0x100() > > [ 0.586381] Hardware name: To be filled by O.E.M. > > [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl > > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' > > (snip) > > > > Sounds like yet another BIOS issue, and I guess the answer is > > "complain to the manufacturer"... But we know how well that works, so > > can this be a warning instead of a scary backtrace? :) > > > > (this is: Gigabyte Technology Co., Ltd. To be filled by > > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" > > Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631 > > I think this problem is getting more and more common. > > I had an IBM System x3100 M4 and x3850 X5 on which kernel would get stuck > in infinite loop creating duplicate sysfs files because, for some reason, > there are several duplicate boot entries in nvram getting GetNextVariableName > into a circle of iteration (with period > 2). > > Granted, it's not kernel's fault for duplicate variables, but it'd be much > harder for users to fix their broken firmware than kernel dealing with it > by not dying on the never-ending GetNextVariableName. It's not just about the manufacturers fixing their firmware and distributing a BIOS update. It's also the fact few users are likely to apply it before installing Linux on a new machine. Yeah, we need something in the kernel to handle this. Thanks for the report guys. -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2013-03-06 13:35 ` shea-yfkUTty7RcRWk0Htik3J/w [not found] ` <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: shea-yfkUTty7RcRWk0Htik3J/w @ 2013-03-06 13:35 UTC (permalink / raw) To: Matt Fleming Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA Hi Matt, On 2013-03-06 08:19, Matt Fleming wrote: > On Tue, 2013-03-05 at 17:39 +0800, Lingzhu Xiang wrote: >> On 03/03/2013 02:03 AM, Andre Heider wrote: >> > After a BIOS update I get this in dmesg: >> > >> > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 >> > [ 0.584914] ------------[ cut here ]------------ >> > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 >> sysfs_add_one+0x >> > d4/0x100() >> > [ 0.586381] Hardware name: To be filled by O.E.M. >> > [ 0.587123] sysfs: cannot create duplicate filename >> '/firmware/efi/vars/SbAsl >> > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' >> > (snip) >> > >> > Sounds like yet another BIOS issue, and I guess the answer is >> > "complain to the manufacturer"... But we know how well that works, >> so >> > can this be a warning instead of a scary backtrace? :) >> > >> > (this is: Gigabyte Technology Co., Ltd. To be filled by >> > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" >> >> Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631 >> >> I think this problem is getting more and more common. >> >> I had an IBM System x3100 M4 and x3850 X5 on which kernel would get >> stuck >> in infinite loop creating duplicate sysfs files because, for some >> reason, >> there are several duplicate boot entries in nvram getting >> GetNextVariableName >> into a circle of iteration (with period > 2). >> >> Granted, it's not kernel's fault for duplicate variables, but it'd >> be much >> harder for users to fix their broken firmware than kernel dealing >> with it >> by not dying on the never-ending GetNextVariableName. > > It's not just about the manufacturers fixing their firmware and > distributing a BIOS update. It's also the fact few users are likely > to > apply it before installing Linux on a new machine. > > Yeah, we need something in the kernel to handle this. > Unrelated to the initial issue, but with the growing number of fixes for buggy firmware out there, maybe we should consider an approach similar to CONFIG_PCI_QUIRKS and the separate drivers/pci/quirks.c, where all of the system specific EFI quirks are isolated and can be compiled out if you know your firmware is good? Also might be nice to have a separate config option specifically for all the Apple quirks, as their EFI seems to be the worst out there with respect to following standards. > > Thanks for the report guys. ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org> @ 2013-03-06 14:29 ` Matt Fleming 0 siblings, 0 replies; 38+ messages in thread From: Matt Fleming @ 2013-03-06 14:29 UTC (permalink / raw) To: shea-yfkUTty7RcRWk0Htik3J/w Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Matthew Garrett, H. Peter Anvin On Wed, 2013-03-06 at 08:35 -0500, shea-yfkUTty7RcRWk0Htik3J/w@public.gmane.org wrote: > Unrelated to the initial issue, but with the growing number of fixes > for buggy firmware out there, maybe we should consider an approach > similar to CONFIG_PCI_QUIRKS and the separate drivers/pci/quirks.c, > where all of the system specific EFI quirks are isolated and can be > compiled out if you know your firmware is good? Right, the firmware bug spectrum is indeed starting to grow rapidly. I was hoping we could avoid doing something like PCI quirks because that whole thing is a mess and I am concerned about an explosion of quirks that would need to be maintained, some being used in core architecture code. Upto now, we've been catering for the least common denominator, which has worked, but obviously isn't ideal. My plan has always been to try and hold off on introducing something like quirks until my hand is forced. But it is likely that will happen sooner rather than later. One problem we have though, is clearly identifying machines that suffer from these bugs because of the cross-pollination of firmware from Independent BIOS Vendors (IBVs) and OEMs. In theory, we could use the FirmwareRevision field in the EFI System Table, along with DMI strings and such. > Also might be nice to have a separate config option specifically for > all the Apple quirks, as their EFI seems to be the worst out there with > respect to following standards. AFAIK, it's that they're based on EFI 1.10, so pre-UEFI days. -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename [not found] ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-06 13:19 ` Matt Fleming @ 2013-03-08 15:11 ` Matt Fleming [not found] ` <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-08 15:11 UTC (permalink / raw) To: Lingzhu Xiang; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On Tue, 2013-03-05 at 17:39 +0800, Lingzhu Xiang wrote: > On 03/03/2013 02:03 AM, Andre Heider wrote: > > After a BIOS update I get this in dmesg: > > > > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 > > [ 0.584914] ------------[ cut here ]------------ > > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x > > d4/0x100() > > [ 0.586381] Hardware name: To be filled by O.E.M. > > [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl > > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' > > (snip) > > > > Sounds like yet another BIOS issue, and I guess the answer is > > "complain to the manufacturer"... But we know how well that works, so > > can this be a warning instead of a scary backtrace? :) > > > > (this is: Gigabyte Technology Co., Ltd. To be filled by > > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" > > Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631 > > I think this problem is getting more and more common. > > I had an IBM System x3100 M4 and x3850 X5 on which kernel would get stuck > in infinite loop creating duplicate sysfs files because, for some reason, > there are several duplicate boot entries in nvram getting GetNextVariableName > into a circle of iteration (with period > 2). > > Granted, it's not kernel's fault for duplicate variables, but it'd be much > harder for users to fix their broken firmware than kernel dealing with it > by not dying on the never-ending GetNextVariableName. Guys, could you test the following patch? --- >From bb94d021f900a990def5891a7cb480cc6d3e5247 Mon Sep 17 00:00:00 2001 From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> Date: Thu, 7 Mar 2013 11:59:14 +0000 Subject: [PATCH] efivars: Handle duplicate names from get_next_variable() Some firmware exhibits a bug where the same VariableName and VendorGuid values are returned on consecutive invocations of GetNextVariableName(). See, https://bugzilla.kernel.org/show_bug.cgi?id=47631 As a consequence of such a bug, Andre reports hitting the following WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" machine, [ 0.581554] EFI Variables Facility v0.08 2004-May-17 [ 0.584914] ------------[ cut here ]------------ [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100() [ 0.586381] Hardware name: To be filled by O.E.M. [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' [ 0.588694] Modules linked in: [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 [ 0.590280] Call Trace: [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- There's not much we can do to work around and keep traversing the variable list once we hit this firmware bug. Our only solution is to terminate the loop because, as Lingzhu reports, some machines get stuck when they encounter duplicate names, > I had an IBM System x3100 M4 and x3850 X5 on which kernel would > get stuck in infinite loop creating duplicate sysfs files because, > for some reason, there are several duplicate boot entries in nvram > getting GetNextVariableName into a circle of iteration (with > period > 2). Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> --- drivers/firmware/efivars.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index bea32d1..ea415df 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -1962,8 +1962,9 @@ int register_efivars(struct efivars *efivars, struct kobject *parent_kobj) { efi_status_t status = EFI_NOT_FOUND; - efi_guid_t vendor_guid; + efi_guid_t vendor_guid, prev_guid; efi_char16_t *variable_name; + efi_char16_t *prev_name = NULL; unsigned long variable_name_size = 1024; int error = 0; @@ -1973,6 +1974,13 @@ int register_efivars(struct efivars *efivars, return -ENOMEM; } + prev_name = kzalloc(variable_name_size, GFP_KERNEL); + if (!prev_name) { + printk(KERN_ERR "efivars: Memory allocation failed.\n"); + error = -ENOMEM; + goto out; + } + spin_lock_init(&efivars->lock); INIT_LIST_HEAD(&efivars->list); efivars->ops = ops; @@ -2005,6 +2013,21 @@ int register_efivars(struct efivars *efivars, &vendor_guid); switch (status) { case EFI_SUCCESS: + /* + * Some firmware implementations return the + * same variable name on consecutive calls to + * get_next_variable(). Terminate the loop + * immediately as there is no guarantee that + * we'll ever see a different variable name, + * and may end up looping here forever. + */ + if (!efi_guidcmp(prev_guid, vendor_guid) && + !memcmp(prev_name, variable_name, variable_name_size)) { + printk(KERN_WARNING "efivars: duplicate variable name.\n"); + status = EFI_NOT_FOUND; + break; + } + efivar_create_sysfs_entry(efivars, variable_name_size, variable_name, @@ -2018,6 +2041,9 @@ int register_efivars(struct efivars *efivars, status = EFI_NOT_FOUND; break; } + + memcpy(prev_name, variable_name, variable_name_size); + memcpy(&prev_guid, &vendor_guid, sizeof(efi_guid_t)); } while (status != EFI_NOT_FOUND); error = create_efivars_bin_attributes(efivars); @@ -2037,6 +2063,7 @@ int register_efivars(struct efivars *efivars, register_filesystem(&efivarfs_type); out: + kfree(prev_name); kfree(variable_name); return error; -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 38+ messages in thread
[parent not found: <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2013-03-08 18:17 ` Lingzhu Xiang [not found] ` <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-08 18:17 UTC (permalink / raw) To: Matt Fleming; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/08/2013 11:11 PM, Matt Fleming wrote: > + /* > + * Some firmware implementations return the > + * same variable name on consecutive calls to > + * get_next_variable(). Terminate the loop > + * immediately as there is no guarantee that > + * we'll ever see a different variable name, > + * and may end up looping here forever. > + */ > + if (!efi_guidcmp(prev_guid, vendor_guid) && > + !memcmp(prev_name, variable_name, variable_name_size)) { > + printk(KERN_WARNING "efivars: duplicate variable name.\n"); > + status = EFI_NOT_FOUND; > + break; > + } Comparing with the immediately previous one won't detect larger loop like how The IBM x3100 M4 in question was looping as below. (There is some log loss; probably too much data for serial.) [ 9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 9.856520] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7' [ 9.860840] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a' [ 9.865735] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9' [ 9.871157] sysfs: cannot create duplicate filename '/firmware/efi/vars/DIMMStatusVariable-c020489e-6db2-4fe2-a2b1-ca05cca1336a' [ 9.876473] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0007-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 11.541964] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 11.556008] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 11.560022] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 11.564583] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7' [ 11.568977] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a' [ 11.573929] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9' [ 13.278655] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 13.292703] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 13.296716] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 13.301283] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7' [ 13.305669] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a' [ 13.310628] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9' [ 13.316108] sysfs: cannot create duplicate filename '/firmware/efi/vars/DIMMStatusVariable-c020489e-6db2-4fe2-a2b1-ca05cca1336a' [ 15.004171] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 15.017737] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 15.031713] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 15.035730] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 15.040355] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7' [ 15.044743] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a' [ 15.049700] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9' [ 16.729287] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 16.742919] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 16.756550] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 16.770526] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 16.774532] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 16.779093] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7' [ 16.783542] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a' [ 18.467922] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 18.481615] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 18.495194] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 18.509242] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 18.513252] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 18.517806] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7' [ 20.193306] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 20.206649] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 20.220341] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 20.233916] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 20.247957] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 20.251970] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 21.918960] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7' [ 21.931944] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 21.945286] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 21.958984] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 21.972559] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 21.986608] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' [ 23.644961] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 23.657688] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7' [ 23.670724] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 23.684062] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 23.697692] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 23.711324] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7' [ 25.370482] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0006-d665f20a-f2b7-485e-a321-542a196f40c7' [ 25.382749] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 25.395419] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7' [ 25.408343] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 25.421625] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 25.435192] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 27.087790] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0006-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 27.099845] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0006-d665f20a-f2b7-485e-a321-542a196f40c7' [ 27.112118] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 27.124788] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7' [ 27.137702] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 27.150990] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7' [ 28.805316] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0005-d665f20a-f2b7-485e-a321-542a196f40c7' [ 28.816944] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0006-8be4df61-93ca-11d2-aa0d-00e098032b8c' [ 28.829010] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0006-d665f20a-f2b7-485e-a321-542a196f40c7' [ 28.841278] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c' ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-12 10:08 ` Matt Fleming [not found] ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-12 10:08 UTC (permalink / raw) To: Lingzhu Xiang; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On Sat, 2013-03-09 at 02:17 +0800, Lingzhu Xiang wrote: > Comparing with the immediately previous one won't detect larger loop > like how The IBM x3100 M4 in question was looping as below. > > (There is some log loss; probably too much data for serial.) > [ 9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' > [ 9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' Ah, thanks. I misunderstood the bug. How about something like this? We're just going to have to suck it up and iterate the entire list. This could be made more optimal, but this isn't a fast path anyway, so I'm not sure it's worth optimising immediately. --- >From c8b6c9c175f4459197e2a13f29a2e4125372d771 Mon Sep 17 00:00:00 2001 From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> Date: Thu, 7 Mar 2013 11:59:14 +0000 Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable() Some firmware exhibits a bug where the same VariableName and VendorGuid values are returned on multiple invocations of GetNextVariableName(). See, https://bugzilla.kernel.org/show_bug.cgi?id=47631 As a consequence of such a bug, Andre reports hitting the following WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" machine, [ 0.581554] EFI Variables Facility v0.08 2004-May-17 [ 0.584914] ------------[ cut here ]------------ [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100() [ 0.586381] Hardware name: To be filled by O.E.M. [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' [ 0.588694] Modules linked in: [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 [ 0.590280] Call Trace: [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- There's not much we can do to work around and keep traversing the variable list once we hit this firmware bug. Our only solution is to terminate the loop because, as Lingzhu reports, some machines get stuck when they encounter duplicate names, > I had an IBM System x3100 M4 and x3850 X5 on which kernel would > get stuck in infinite loop creating duplicate sysfs files because, > for some reason, there are several duplicate boot entries in nvram > getting GetNextVariableName into a circle of iteration (with > period > 2). Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> --- v2: iterate the entire list when checking for duplicates. drivers/firmware/efivars.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index bea32d1..d1656e3 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -2005,6 +2005,20 @@ int register_efivars(struct efivars *efivars, &vendor_guid); switch (status) { case EFI_SUCCESS: + /* + * Some firmware implementations return the + * same variable name on multiple calls to + * get_next_variable(). Terminate the loop + * immediately as there is no guarantee that + * we'll ever see a different variable name, + * and may end up looping here forever. + */ + if (variable_is_present(variable_name, &vendor_guid)) { + printk(KERN_WARNING "efivars: duplicate variable name.\n"); + status = EFI_NOT_FOUND; + break; + } + efivar_create_sysfs_entry(efivars, variable_name_size, variable_name, -- 1.7.11.7 -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply related [flat|nested] 38+ messages in thread
[parent not found: <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2013-03-12 10:45 ` Lingzhu Xiang [not found] ` <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-12 13:20 ` Andre Heider 1 sibling, 1 reply; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-12 10:45 UTC (permalink / raw) To: Matt Fleming; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi (Cc'ing Seiji Aguchi for the bug related to sysfs workqueue) On 03/12/2013 06:08 PM, Matt Fleming wrote: > On Sat, 2013-03-09 at 02:17 +0800, Lingzhu Xiang wrote: >> Comparing with the immediately previous one won't detect larger loop >> like how The IBM x3100 M4 in question was looping as below. >> >> (There is some log loss; probably too much data for serial.) >> [ 9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' >> [ 9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' > > Ah, thanks. I misunderstood the bug. > > How about something like this? We're just going to have to suck it up > and iterate the entire list. This could be made more optimal, but this > isn't a fast path anyway, so I'm not sure it's worth optimising > immediately. > > --- > > From c8b6c9c175f4459197e2a13f29a2e4125372d771 Mon Sep 17 00:00:00 2001 > From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > Date: Thu, 7 Mar 2013 11:59:14 +0000 > Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable() > > Some firmware exhibits a bug where the same VariableName and > VendorGuid values are returned on multiple invocations of > GetNextVariableName(). See, > > https://bugzilla.kernel.org/show_bug.cgi?id=47631 > > As a consequence of such a bug, Andre reports hitting the following > WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte > Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e > 11/21/2012)" machine, > > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 > [ 0.584914] ------------[ cut here ]------------ > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100() > [ 0.586381] Hardware name: To be filled by O.E.M. > [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' > [ 0.588694] Modules linked in: > [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 > [ 0.590280] Call Trace: > [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 > [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 > [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 > [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 > [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 > [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 > [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 > [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 > [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 > [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 > [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 > [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 > [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 > [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 > [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 > [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 > [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 > [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 > [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 > [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 > [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 > [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- > > There's not much we can do to work around and keep traversing the > variable list once we hit this firmware bug. Our only solution is to > terminate the loop because, as Lingzhu reports, some machines get > stuck when they encounter duplicate names, > > > I had an IBM System x3100 M4 and x3850 X5 on which kernel would > > get stuck in infinite loop creating duplicate sysfs files because, > > for some reason, there are several duplicate boot entries in nvram > > getting GetNextVariableName into a circle of iteration (with > > period > 2). > > Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> > Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > --- > > v2: iterate the entire list when checking for duplicates. > > drivers/firmware/efivars.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c > index bea32d1..d1656e3 100644 > --- a/drivers/firmware/efivars.c > +++ b/drivers/firmware/efivars.c > @@ -2005,6 +2005,20 @@ int register_efivars(struct efivars *efivars, > &vendor_guid); > switch (status) { > case EFI_SUCCESS: > + /* > + * Some firmware implementations return the > + * same variable name on multiple calls to > + * get_next_variable(). Terminate the loop > + * immediately as there is no guarantee that > + * we'll ever see a different variable name, > + * and may end up looping here forever. > + */ > + if (variable_is_present(variable_name, &vendor_guid)) { > + printk(KERN_WARNING "efivars: duplicate variable name.\n"); > + status = EFI_NOT_FOUND; > + break; > + } > + > efivar_create_sysfs_entry(efivars, > variable_name_size, > variable_name, > I think using what we have is good enough for this. I'll verify boot time tomorrow. status = EFI_NOT_FOUND here doesn't seem to do anything. By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue patch may also get stuck in loop in the same way. Lingzhu Xiang ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-12 16:35 ` Matt Fleming [not found] ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-12 16:35 UTC (permalink / raw) To: Lingzhu Xiang Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi On Tue, 2013-03-12 at 18:45 +0800, Lingzhu Xiang wrote: > (Cc'ing Seiji Aguchi for the bug related to sysfs workqueue) > > On 03/12/2013 06:08 PM, Matt Fleming wrote: > > On Sat, 2013-03-09 at 02:17 +0800, Lingzhu Xiang wrote: > >> Comparing with the immediately previous one won't detect larger loop > >> like how The IBM x3100 M4 in question was looping as below. > >> > >> (There is some log loss; probably too much data for serial.) > >> [ 9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa' > >> [ 9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c' > > > > Ah, thanks. I misunderstood the bug. > > > > How about something like this? We're just going to have to suck it up > > and iterate the entire list. This could be made more optimal, but this > > isn't a fast path anyway, so I'm not sure it's worth optimising > > immediately. > > > > --- > > > > From c8b6c9c175f4459197e2a13f29a2e4125372d771 Mon Sep 17 00:00:00 2001 > > From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > > Date: Thu, 7 Mar 2013 11:59:14 +0000 > > Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable() > > > > Some firmware exhibits a bug where the same VariableName and > > VendorGuid values are returned on multiple invocations of > > GetNextVariableName(). See, > > > > https://bugzilla.kernel.org/show_bug.cgi?id=47631 > > > > As a consequence of such a bug, Andre reports hitting the following > > WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte > > Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e > > 11/21/2012)" machine, > > > > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 > > [ 0.584914] ------------[ cut here ]------------ > > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100() > > [ 0.586381] Hardware name: To be filled by O.E.M. > > [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' > > [ 0.588694] Modules linked in: > > [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 > > [ 0.590280] Call Trace: > > [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 > > [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 > > [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 > > [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 > > [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 > > [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 > > [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 > > [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 > > [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 > > [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 > > [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 > > [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 > > [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 > > [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 > > [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 > > [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 > > [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 > > [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 > > [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 > > [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 > > [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 > > [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- > > > > There's not much we can do to work around and keep traversing the > > variable list once we hit this firmware bug. Our only solution is to > > terminate the loop because, as Lingzhu reports, some machines get > > stuck when they encounter duplicate names, > > > > > I had an IBM System x3100 M4 and x3850 X5 on which kernel would > > > get stuck in infinite loop creating duplicate sysfs files because, > > > for some reason, there are several duplicate boot entries in nvram > > > getting GetNextVariableName into a circle of iteration (with > > > period > 2). > > > > Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > > Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> > > Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > > --- > > > > v2: iterate the entire list when checking for duplicates. > > > > drivers/firmware/efivars.c | 14 ++++++++++++++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c > > index bea32d1..d1656e3 100644 > > --- a/drivers/firmware/efivars.c > > +++ b/drivers/firmware/efivars.c > > @@ -2005,6 +2005,20 @@ int register_efivars(struct efivars *efivars, > > &vendor_guid); > > switch (status) { > > case EFI_SUCCESS: > > + /* > > + * Some firmware implementations return the > > + * same variable name on multiple calls to > > + * get_next_variable(). Terminate the loop > > + * immediately as there is no guarantee that > > + * we'll ever see a different variable name, > > + * and may end up looping here forever. > > + */ > > + if (variable_is_present(variable_name, &vendor_guid)) { > > + printk(KERN_WARNING "efivars: duplicate variable name.\n"); > > + status = EFI_NOT_FOUND; > > + break; > > + } > > + > > efivar_create_sysfs_entry(efivars, > > variable_name_size, > > variable_name, > > > > I think using what we have is good enough for this. I'll verify > boot time tomorrow. That would be great, thank you. > status = EFI_NOT_FOUND here doesn't seem to do anything. It terminates the enclosing do-while loop. > By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue > patch may also get stuck in loop in the same way. You mean if GetNextVariableName() never terminates? Yeah, efivar_update_sysfs_entries() looks like it needs protecting from that too, good catch. -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* RE: sysfs: cannot create duplicate filename [not found] ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2013-03-12 17:51 ` Seiji Aguchi 2013-03-13 10:47 ` Lingzhu Xiang 1 sibling, 0 replies; 38+ messages in thread From: Seiji Aguchi @ 2013-03-12 17:51 UTC (permalink / raw) To: Matt Fleming, Lingzhu Xiang Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA > > By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue > > patch may also get stuck in loop in the same way. In my understanding, a problem reported initially is that WARN_ON() in create_sysfs_entries() hits because multiple sysfs entries of a same variable name are created. On the other hand, in efivar_update_sysfs_entries(), the WARN_ON never hits because If a variable is already present, create_sysfs_entries() is never called. > > You mean if GetNextVariableName() never terminates? Yeah, > efivar_update_sysfs_entries() looks like it needs protecting from that too, good catch. If there is a case where GetNextVariableName() never terminates, I agree that efivar_update_sysfs_entries() should be protected from that . But I'm not sure if the case is related to a bug in this thread.. Seiji ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename [not found] ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-12 17:51 ` Seiji Aguchi @ 2013-03-13 10:47 ` Lingzhu Xiang [not found] ` <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-13 10:47 UTC (permalink / raw) To: Matt Fleming; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi On 03/13/2013 12:35 AM, Matt Fleming wrote: >> I think using what we have is good enough for this. I'll verify >> boot time tomorrow. > > That would be great, thank you. On a Windows 8 logo desktop with 91 variables, the average time (30 runs each) of variable iteration goes from 0.009672 second to 0.010180 second (+0.000508 second, or +5%). Negligible. >> status = EFI_NOT_FOUND here doesn't seem to do anything. > > It terminates the enclosing do-while loop. Oops, my brain fart. >> By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue >> patch may also get stuck in loop in the same way. > > You mean if GetNextVariableName() never terminates? Yeah, > efivar_update_sysfs_entries() looks like it needs protecting from that > too, good catch. Yes that's what I meant. Lingzhu Xiang ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-14 16:33 ` Matt Fleming [not found] ` <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-14 16:33 UTC (permalink / raw) To: Lingzhu Xiang Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi On Wed, 2013-03-13 at 18:47 +0800, Lingzhu Xiang wrote: > On 03/13/2013 12:35 AM, Matt Fleming wrote: > >> I think using what we have is good enough for this. I'll verify > >> boot time tomorrow. > > > > That would be great, thank you. > > On a Windows 8 logo desktop with 91 variables, the average time (30 runs > each) of variable iteration goes from 0.009672 second to 0.010180 second > (+0.000508 second, or +5%). Negligible. Thanks for testing that. [...] > >> By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue > >> patch may also get stuck in loop in the same way. > > > > You mean if GetNextVariableName() never terminates? Yeah, > > efivar_update_sysfs_entries() looks like it needs protecting from that > > too, good catch. > > Yes that's what I meant. Hmm.. actually, I'm not at all sure how to fix that case, since the algorithm expects to find duplicates. Seiji, any thoughts here? For a recap of the discussion see, http://article.gmane.org/gmane.linux.kernel.efi/835 -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* RE: sysfs: cannot create duplicate filename [not found] ` <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2013-03-14 19:31 ` Seiji Aguchi [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Seiji Aguchi @ 2013-03-14 19:31 UTC (permalink / raw) To: Matt Fleming, Lingzhu Xiang Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA > Hmm.. actually, I'm not at all sure how to fix that case, since the algorithm expects to find duplicates. > > Seiji, any thoughts here? For a recap of the discussion see, > > http://article.gmane.org/gmane.linux.kernel.efi/835 Thank you for providing this. If GetNextVariableName() never terminates, the firmware is probably broken. In this case, efivars, including efi_pstore, can't guarantee to work reliably... >Granted, it's not kernel's fault for duplicate variables, but it'd be much harder for users >to fix their broken firmware than kernel dealing with it by not dying on the never-ending >GetNextVariableName. To fix the broken firmware, users need to erase the duplicated variable. Is it correct? If yes, how about printing the duplicated variable name as follows? Or add printk() persuade to upgrading/downgrading firmware? + printk(KERN_WARNING "efivars: duplicate variable name. %s\n", variable_name); As for update_sysfs_entries() , a candidate fix is introducing a global variable like is_firmware_broken. And if the variable is set, update_sysfs_entries() returns without calling GetNextVariableName(). As I mentioned above, there is no way to make efivars, including efi_pstore, work reliably on the broken firmware.. I admit that introducing a global variable is messy. But I can't find any other way to fix.. If I still misunderstand something, please let me know. Seiji ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> @ 2013-03-18 14:51 ` Matt Fleming [not found] ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-18 14:51 UTC (permalink / raw) To: Seiji Aguchi Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On Thu, 2013-03-14 at 19:31 +0000, Seiji Aguchi wrote: > To fix the broken firmware, users need to erase the duplicated variable. > Is it correct? I'm unsure whether that will fix the problem for all platforms. Lingzhu, do you have any idea? > If yes, how about printing the duplicated variable name as follows? > Or add printk() persuade to upgrading/downgrading firmware? > > + printk(KERN_WARNING "efivars: duplicate variable name. %s\n", variable_name); Yes, I think there's merit to printing that. Unfortunately, it's a utf16 string, and printk() doesn't know how to print it. > As for update_sysfs_entries() , a candidate fix is introducing a > global variable like is_firmware_broken. > And if the variable is set, update_sysfs_entries() returns without > calling GetNextVariableName(). > As I mentioned above, there is no way to make efivars, including > efi_pstore, work reliably on the broken firmware.. > > I admit that introducing a global variable is messy. But I can't find > any other way to fix.. Right, looks like this is the only way to go. How does everyone feel about this version? Lingzhu, if I could get a Tested-by for this version that would be great. This, plus some other patches, are sitting on the 'urgent' branch. --- >From 544a6c06479e4ea4e7acc7eea589a82f0ff9aa6d Mon Sep 17 00:00:00 2001 From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> Date: Thu, 7 Mar 2013 11:59:14 +0000 Subject: [PATCH] efivars: Handle duplicate names from get_next_variable() Some firmware exhibits a bug where the same VariableName and VendorGuid values are returned on multiple invocations of GetNextVariableName(). See, https://bugzilla.kernel.org/show_bug.cgi?id=47631 As a consequence of such a bug, Andre reports hitting the following WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)" machine, [ 0.581554] EFI Variables Facility v0.08 2004-May-17 [ 0.584914] ------------[ cut here ]------------ [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100() [ 0.586381] Hardware name: To be filled by O.E.M. [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' [ 0.588694] Modules linked in: [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 [ 0.590280] Call Trace: [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- There's not much we can do to work around and keep traversing the variable list once we hit this firmware bug. Our only solution is to terminate the loop because, as Lingzhu reports, some machines get stuck when they encounter duplicate names, > I had an IBM System x3100 M4 and x3850 X5 on which kernel would > get stuck in infinite loop creating duplicate sysfs files because, > for some reason, there are several duplicate boot entries in nvram > getting GetNextVariableName into a circle of iteration (with > period > 2). Also disable the workqueue, as efivar_update_sysfs_entries() uses GetNextVariableName() to figure out which variables have been created since the last iteration. That algorithm isn't going to work if GetNextVariableName() returns duplicates. Note that we don't disable EFI variable creation completely on the affected machines, it's just that any pstore dump-* files won't appear in sysfs until the next boot. Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Cc: Seiji Aguchi <seiji.aguchi-7rDLJAbr9SE@public.gmane.org> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> --- drivers/firmware/efivars.c | 48 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index ae26d5e..a1a3d01 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -170,6 +170,7 @@ efivar_create_sysfs_entry(struct efivars *efivars, static void efivar_update_sysfs_entries(struct work_struct *); static DECLARE_WORK(efivar_work, efivar_update_sysfs_entries); +static bool efivar_wq_enabled = true; /* Return the number of unicode characters in data */ static unsigned long @@ -1443,7 +1444,7 @@ static int efi_pstore_write(enum pstore_type_id type, spin_unlock_irqrestore(&efivars->lock, flags); - if (reason == KMSG_DUMP_OOPS) + if (reason == KMSG_DUMP_OOPS && efivar_wq_enabled) schedule_work(&efivar_work); *id = part; @@ -1974,6 +1975,35 @@ void unregister_efivars(struct efivars *efivars) } EXPORT_SYMBOL_GPL(unregister_efivars); +/* + * Print a warning when duplicate EFI variables are encountered and + * disable the sysfs workqueue since the firmware is buggy. + */ +static void dup_variable_bug(efi_char16_t *s16, efi_guid_t *vendor_guid, + unsigned long len16) +{ + size_t i, len8 = len16 / sizeof(efi_char16_t); + char *s8; + + /* + * Disable the workqueue since the algorithm it uses for + * detecting new variables won't work with this buggy + * implementation of GetNextVariableName(). + */ + efivar_wq_enabled = false; + + s8 = kzalloc(len8, GFP_KERNEL); + if (!s8) + return; + + for (i = 0; i < len8; i++) + s8[i] = s16[i]; + + printk(KERN_WARNING "efivars: duplicate variable: %s-%pUl\n", + s8, vendor_guid); + kfree(s8); +} + int register_efivars(struct efivars *efivars, const struct efivar_operations *ops, struct kobject *parent_kobj) @@ -2024,6 +2054,22 @@ int register_efivars(struct efivars *efivars, case EFI_SUCCESS: variable_name_size = var_name_strnsize(variable_name, variable_name_size); + + /* + * Some firmware implementations return the + * same variable name on multiple calls to + * get_next_variable(). Terminate the loop + * immediately as there is no guarantee that + * we'll ever see a different variable name, + * and may end up looping here forever. + */ + if (variable_is_present(variable_name, &vendor_guid)) { + dup_variable_bug(variable_name, &vendor_guid, + variable_name_size); + status = EFI_NOT_FOUND; + break; + } + efivar_create_sysfs_entry(efivars, variable_name_size, variable_name, -- 1.7.11.7 -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply related [flat|nested] 38+ messages in thread
[parent not found: <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2013-03-19 10:14 ` Lingzhu Xiang [not found] ` <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-21 7:22 ` Lingzhu Xiang 1 sibling, 1 reply; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-19 10:14 UTC (permalink / raw) To: Matt Fleming; +Cc: Seiji Aguchi, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/18/2013 10:51 PM, Matt Fleming wrote: > On Thu, 2013-03-14 at 19:31 +0000, Seiji Aguchi wrote: >> To fix the broken firmware, users need to erase the duplicated variable. >> Is it correct? > > I'm unsure whether that will fix the problem for all platforms. Lingzhu, > do you have any idea? I'm not very sure what is to "erase the duplicated variable". If the firmware is broken, the variable interfaces probably are also broken and can't be used to erase variable. Users need to reset nvram to a known clean state, but often the hardware doesn't provide such facility. > How does everyone feel about this version? Lingzhu, if I could get a > Tested-by for this version that would be great. This, plus some other > patches, are sitting on the 'urgent' branch. I will test this. The IBM x3100m4 is gone. It takes some time to figure out how to corrupt nvram in QEMU... ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* RE: sysfs: cannot create duplicate filename [not found] ` <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-19 14:25 ` Seiji Aguchi [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> 0 siblings, 1 reply; 38+ messages in thread From: Seiji Aguchi @ 2013-03-19 14:25 UTC (permalink / raw) To: Lingzhu Xiang, Matt Fleming Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA Lingzhu, I have another idea after reading you comment. > If the firmware is broken, the variable interfaces probably are also broken > and can't be used to erase variable. I see, you are concerned that a set_variable service may not work correctly. If so, the set_variable service should be disabled in addition to get_next_variable service. On the other hands, if Linux can boot up, a get_variable service probably works. Therefore, efivars should be initialized by enabling just get_variable service. > Users need to reset nvram to a known clean state, but often the hardware doesn't provide such > facility. What we can do is to notify that the firmware/hardware is broken with printk. Seiji ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> @ 2013-03-19 15:26 ` Matt Fleming [not found] ` <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org> 2013-03-19 16:00 ` Lingzhu Xiang 1 sibling, 1 reply; 38+ messages in thread From: Matt Fleming @ 2013-03-19 15:26 UTC (permalink / raw) To: Seiji Aguchi Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/19/2013 02:25 PM, Seiji Aguchi wrote: > I see, you are concerned that a set_variable service may not work correctly. > If so, the set_variable service should be disabled in addition to get_next_variable service. We've no reason to believe that there are broken SetVariable() implementations in the wild. If any ever crop up, we can deal with them as and when. -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>]
* RE: sysfs: cannot create duplicate filename [not found] ` <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org> @ 2013-03-19 15:46 ` Seiji Aguchi 0 siblings, 0 replies; 38+ messages in thread From: Seiji Aguchi @ 2013-03-19 15:46 UTC (permalink / raw) To: Matt Fleming Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA > We've no reason to believe that there are broken SetVariable() implementations in the wild. If any ever crop up, we can deal with > them as and when. OK. I will make a patch disabling just get_next_variable(). I'm thinking that it is reasonable to set ops->get_next_variable to NULL in register_efivars(), instead of introducing a new global variable.... Seiji > -----Original Message----- > From: Matt Fleming [mailto:matt@console-pimps.org] > Sent: Tuesday, March 19, 2013 11:27 AM > To: Seiji Aguchi > Cc: Lingzhu Xiang; Andre Heider; linux-efi@vger.kernel.org > Subject: Re: sysfs: cannot create duplicate filename > > On 03/19/2013 02:25 PM, Seiji Aguchi wrote: > > I see, you are concerned that a set_variable service may not work correctly. > > If so, the set_variable service should be disabled in addition to get_next_variable service. > > We've no reason to believe that there are broken SetVariable() implementations in the wild. If any ever crop up, we can deal with > them as and when. > > -- > Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> 2013-03-19 15:26 ` Matt Fleming @ 2013-03-19 16:00 ` Lingzhu Xiang 1 sibling, 0 replies; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-19 16:00 UTC (permalink / raw) To: Seiji Aguchi; +Cc: Matt Fleming, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/19/2013 10:25 PM, Seiji Aguchi wrote: >> If the firmware is broken, the variable interfaces probably are also broken >> and can't be used to erase variable. > > I see, you are concerned that a set_variable service may not work correctly. > If so, the set_variable service should be disabled in addition to get_next_variable service. Since you're suggesting fixing firmware by erasing, I'm wondering SetVariable may not actually remove duplicate variables since the way they exist is already an undefined behavior. And some variables can't be removed. GetNextVariable is clearly causing harm in this case. But the rest of efi runtime services should be left as they are. Lingzhu Xiang ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename [not found] ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-19 10:14 ` Lingzhu Xiang @ 2013-03-21 7:22 ` Lingzhu Xiang [not found] ` <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 1 reply; 38+ messages in thread From: Lingzhu Xiang @ 2013-03-21 7:22 UTC (permalink / raw) To: Matt Fleming; +Cc: Seiji Aguchi, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/18/2013 10:51 PM, Matt Fleming wrote: > How does everyone feel about this version? Lingzhu, if I could get a > Tested-by for this version that would be great. This, plus some other > patches, are sitting on the 'urgent' branch. > > --- > > From 544a6c06479e4ea4e7acc7eea589a82f0ff9aa6d Mon Sep 17 00:00:00 2001 > From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > Date: Thu, 7 Mar 2013 11:59:14 +0000 > Subject: [PATCH] efivars: Handle duplicate names from get_next_variable() > > Some firmware exhibits a bug where the same VariableName and > VendorGuid values are returned on multiple invocations of > GetNextVariableName(). See, > > https://bugzilla.kernel.org/show_bug.cgi?id=47631 > > As a consequence of such a bug, Andre reports hitting the following > WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte > Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e > 11/21/2012)" machine, > > [ 0.581554] EFI Variables Facility v0.08 2004-May-17 > [ 0.584914] ------------[ cut here ]------------ > [ 0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100() > [ 0.586381] Hardware name: To be filled by O.E.M. > [ 0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8' > [ 0.588694] Modules linked in: > [ 0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7 > [ 0.590280] Call Trace: > [ 0.591066] [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100 > [ 0.591861] [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0 > [ 0.592650] [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50 > [ 0.593429] [<ffffffff8134dd85>] ? strlcat+0x65/0x80 > [ 0.594203] [<ffffffff81208954>] sysfs_add_one+0xd4/0x100 > [ 0.594979] [<ffffffff81208b78>] create_dir+0x78/0xd0 > [ 0.595753] [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0 > [ 0.596532] [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220 > [ 0.597310] [<ffffffff81348307>] kobject_init_and_add+0x67/0x90 > [ 0.598083] [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0 > [ 0.598859] [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0 > [ 0.599631] [<ffffffff8158517e>] register_efivars+0xde/0x420 > [ 0.600395] [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5 > [ 0.601150] [<ffffffff81d4315f>] efivars_init+0xb8/0x104 > [ 0.601903] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180 > [ 0.602659] [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6 > [ 0.603418] [<ffffffff81d05586>] ? loglevel+0x31/0x31 > [ 0.604183] [<ffffffff816a6530>] ? rest_init+0x80/0x80 > [ 0.604936] [<ffffffff816a653e>] kernel_init+0xe/0xf0 > [ 0.605681] [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0 > [ 0.606414] [<ffffffff816a6530>] ? rest_init+0x80/0x80 > [ 0.607143] ---[ end trace 1609741ab737eb29 ]--- > > There's not much we can do to work around and keep traversing the > variable list once we hit this firmware bug. Our only solution is to > terminate the loop because, as Lingzhu reports, some machines get > stuck when they encounter duplicate names, > > > I had an IBM System x3100 M4 and x3850 X5 on which kernel would > > get stuck in infinite loop creating duplicate sysfs files because, > > for some reason, there are several duplicate boot entries in nvram > > getting GetNextVariableName into a circle of iteration (with > > period > 2). > > Also disable the workqueue, as efivar_update_sysfs_entries() uses > GetNextVariableName() to figure out which variables have been created > since the last iteration. That algorithm isn't going to work if > GetNextVariableName() returns duplicates. Note that we don't disable > EFI variable creation completely on the affected machines, it's just > that any pstore dump-* files won't appear in sysfs until the next > boot. > > Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > Cc: Seiji Aguchi <seiji.aguchi-7rDLJAbr9SE@public.gmane.org> > Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> > Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> Tested-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Reproduced with 3.9-rc3 in QEMU/OVMF. The kernel gets stuck in a loop. Verified with 3.9-rc3 with this patch. The kernel survives. efivarfs continues working, though some variables won't show up, as expected. [ 2.033855] efivars: duplicate variable: Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c This patch has no real impact for users with normal firmware as tested before on a Windows 8 logo desktop. For the record, here is how I create duplicate variables in QEMU/OVMF: OVMF EmuVariableFvbRuntimeDxe emulates nvram with a reserved area of memory (ReserveEmuVariableNvStore). But there seems to be some layers of cache above this, so directly writing a new duplicate variable here won't make a difference. I ended up writing an EFI app to scan the entire memory and replace all L"Boot0004" with L"Boot0001", which seemed to actually break things (e.g. dmpstore will get stuck). ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: sysfs: cannot create duplicate filename [not found] ` <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-21 7:44 ` Matt Fleming 0 siblings, 0 replies; 38+ messages in thread From: Matt Fleming @ 2013-03-21 7:44 UTC (permalink / raw) To: Lingzhu Xiang Cc: Seiji Aguchi, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA On 03/21/2013 07:22 AM, Lingzhu Xiang wrote: > Tested-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Great, thanks! > Reproduced with 3.9-rc3 in QEMU/OVMF. The kernel gets stuck in a loop. > > Verified with 3.9-rc3 with this patch. The kernel survives. efivarfs > continues working, though some variables won't show up, as expected. > > [ 2.033855] efivars: duplicate variable: Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c > > This patch has no real impact for users with normal firmware as tested > before on a Windows 8 logo desktop. > > For the record, here is how I create duplicate variables in QEMU/OVMF: > > OVMF EmuVariableFvbRuntimeDxe emulates nvram with a reserved area of > memory (ReserveEmuVariableNvStore). But there seems to be some layers > of cache above this, so directly writing a new duplicate variable here > won't make a difference. I ended up writing an EFI app to scan the > entire memory and replace all L"Boot0004" with L"Boot0001", which > seemed to actually break things (e.g. dmpstore will get stuck). Thanks for going to such lengths to test this. -- Matt Fleming, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename [not found] ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-12 10:45 ` Lingzhu Xiang @ 2013-03-12 13:20 ` Andre Heider 1 sibling, 0 replies; 38+ messages in thread From: Andre Heider @ 2013-03-12 13:20 UTC (permalink / raw) To: Matt Fleming; +Cc: Lingzhu Xiang, linux-efi-u79uwXL29TY76Z2rM5mHXA On Tue, Mar 12, 2013 at 11:08 AM, Matt Fleming <matt-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org> wrote: > From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > Date: Thu, 7 Mar 2013 11:59:14 +0000 > Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable() > > Some firmware exhibits a bug where the same VariableName and > VendorGuid values are returned on multiple invocations of > GetNextVariableName(). See, > > https://bugzilla.kernel.org/show_bug.cgi?id=47631 > > As a consequence of such a bug, Andre reports hitting the following > WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte > Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e > 11/21/2012)" machine, Thanks for working on this, Matt. Unfortunately I cannot test this patch anymore... This BIOS seems beyond buggy, and today I didn't get anything on screen, it didn't boot anything nor could I even enter it. The only thing to make this box usable again was to hit the CMOS reset button, and since then I do not get the WARN_ON() anymore. Seems like the duplicate entry was an artifact of a BIOS upgrade with old CMOS settings. Thanks, Andre ^ permalink raw reply [flat|nested] 38+ messages in thread
* sysfs: cannot create duplicate filename @ 2022-02-13 9:21 Robin Peiremans 2022-02-14 18:08 ` Jason Gunthorpe 0 siblings, 1 reply; 38+ messages in thread From: Robin Peiremans @ 2022-02-13 9:21 UTC (permalink / raw) To: linux-rdma Hi I ran into this error when upgrading to kernel 5.14 (it works pre-5.14). The driver gets loaded (verified via lsmod), but the ib utils don't show any ports. There's a bugreport on elrepo (https://elrepo.org/bugs/view.php?id=1176) that looks pretty much identical, but it looks stalled. I've bisected the kernel and commit 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit. Since I'm absolutely no dev, I'm hoping someone here can figure out what exactly is going wrong. Latest mainline kernel still has the same behavior. Relevant output and logs below. $ sudo lspci -s 04:00.0 -vv 04:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02) Subsystem: QLogic Corp. IBA7322 QDR InfiniBand HCA Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Interrupt: pin A routed to IRQ 39 IOMMU group: 22 Region 0: Memory at fbc00000 (64-bit, non-prefetchable) [size=4M] Expansion ROM at fc000000 [disabled] [size=128K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [70] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend- LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <4us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp- LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 5GT/s (ok), Width x4 (downgraded) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR- 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, AtomicOpsCtl: ReqEn- LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [b0] MSI-X: Enable- Count=32 Masked- Vector table: BAR=0 offset=00008000 PBA: BAR=0 offset=00009000 Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Kernel modules: ib_qib Relevant /var/log/messages section: Feb 13 09:43:29 localhost kernel: ib_qib 0000:04:00.0: qib0: Reserving QPNs from 0x656b78 to 0x656b78 for non-verbs use Feb 13 09:43:29 localhost kernel: sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.2/0000:01:00.0/0000:02:02.0/0000:04:00.0/infiniband/qib0/ports/1/linkcontrol' Feb 13 09:43:29 localhost kernel: CPU: 15 PID: 1118 Comm: modprobe Tainted: G OE 5.17.0-rc3-robin+ #20 Feb 13 09:43:29 localhost kernel: Hardware name: ASUS System Product Name/TUF GAMING X570-PRO (WI-FI), BIOS 3001 12/04/2020 Feb 13 09:43:29 localhost kernel: Call Trace: Feb 13 09:43:29 localhost kernel: <TASK> Feb 13 09:43:29 localhost kernel: dump_stack_lvl+0x48/0x5e Feb 13 09:43:29 localhost kernel: sysfs_warn_dup.cold+0x17/0x24 Feb 13 09:43:29 localhost kernel: internal_create_group+0x348/0x350 Feb 13 09:43:29 localhost kernel: internal_create_groups.part.0+0x3d/0xa0 Feb 13 09:43:29 localhost kernel: setup_port+0x370/0x680 [ib_core] Feb 13 09:43:29 localhost kernel: ? kobject_add+0x6e/0x90 Feb 13 09:43:29 localhost kernel: ib_setup_port_attrs+0x88/0x220 [ib_core] Feb 13 09:43:29 localhost kernel: ib_register_device+0x576/0x650 [ib_core] Feb 13 09:43:29 localhost kernel: ? __vmalloc_node+0x4a/0x70 Feb 13 09:43:29 localhost kernel: rvt_register_device+0x10c/0x270 [rdmavt] Feb 13 09:43:29 localhost kernel: qib_register_ib_device+0x5f8/0x7a0 [ib_qib] Feb 13 09:43:29 localhost kernel: qib_init_one+0x17f/0x470 [ib_qib] Feb 13 09:43:29 localhost kernel: local_pci_probe+0x45/0x80 Feb 13 09:43:29 localhost kernel: ? pci_match_device+0xd7/0x130 Feb 13 09:43:29 localhost kernel: pci_device_probe+0xaa/0x1a0 Feb 13 09:43:29 localhost kernel: really_probe+0x1f5/0x3d0 Feb 13 09:43:29 localhost kernel: __driver_probe_device+0xfe/0x180 Feb 13 09:43:29 localhost kernel: driver_probe_device+0x1e/0x90 Feb 13 09:43:29 localhost kernel: __driver_attach+0xc0/0x1c0 Feb 13 09:43:29 localhost kernel: ? __device_attach_driver+0xe0/0xe0 Feb 13 09:43:29 localhost kernel: ? __device_attach_driver+0xe0/0xe0 Feb 13 09:43:29 localhost kernel: bus_for_each_dev+0x64/0x90 Feb 13 09:43:29 localhost kernel: bus_add_driver+0x149/0x1e0 Feb 13 09:43:29 localhost kernel: driver_register+0x8f/0xe0 Feb 13 09:43:29 localhost kernel: ? qib_init_qibfs+0x11/0x11 [ib_qib] Feb 13 09:43:29 localhost kernel: qib_ib_init+0x3e/0xf62 [ib_qib] Feb 13 09:43:29 localhost kernel: do_one_initcall+0x44/0x200 Feb 13 09:43:29 localhost kernel: ? kmem_cache_alloc_trace+0x163/0x2c0 Feb 13 09:43:29 localhost kernel: do_init_module+0x4c/0x260 Feb 13 09:43:29 localhost kernel: __do_sys_finit_module+0x9b/0xf0 Feb 13 09:43:29 localhost kernel: do_syscall_64+0x3b/0x90 Feb 13 09:43:29 localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae Feb 13 09:43:29 localhost kernel: RIP: 0033:0x7fae66bf5ecd Feb 13 09:43:29 localhost kernel: Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b ef 0e 00 f7 d8 64 89 01 48 Feb 13 09:43:29 localhost kernel: RSP: 002b:00007ffdb7beb198 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 Feb 13 09:43:29 localhost kernel: RAX: ffffffffffffffda RBX: 00007fae677c8c80 RCX: 00007fae66bf5ecd Feb 13 09:43:29 localhost kernel: RDX: 0000000000000000 RSI: 00007fae67173a2a RDI: 0000000000000003 Feb 13 09:43:29 localhost kernel: RBP: 0000000000040004 R08: 0000000000000000 R09: 0000000000000000 Feb 13 09:43:29 localhost kernel: R10: 0000000000000003 R11: 0000000000000246 R12: 00007fae67173a2a Feb 13 09:43:29 localhost kernel: R13: 00007fae677c8ed0 R14: 00007fae677c8c80 R15: 00007fae677c9980 Feb 13 09:43:29 localhost kernel: </TASK> Feb 13 09:43:29 localhost kernel: infiniband qib0: Couldn't register device with driver model Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: qib0: Failed to register driver with ib core. Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: qib0: cannot register verbs: 17! Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling notifier on HCA 0 irq 140 Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling notifier on HCA 0 irq 141 Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling notifier on HCA 0 irq 142 Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling notifier on HCA 0 irq 144 Feb 13 09:43:33 localhost kernel: ib_qib: probe of 0000:04:00.0 failed with error -17 $ ibstatus Fatal error: device '*': sys files not found (/sys/class/infiniband/*/ports) $ git bisect bad 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 is the first bad commit commit 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 Author: Jason Gunthorpe <jgg@nvidia.com> Date: Fri Jun 11 19:00:30 2021 +0300 RDMA/qib: Use attributes for the port sysfs qib should not be creating a mess of kobjects to attach to the port kobject - this is all attributes. The proper API is to create an attribute_group list and create it against the port's kobject. Link: https://lore.kernel.org/r/911e0031e1ed495b0006e8a6efec7b67a702cd5e.1623427137.git.leonro@nvidia.com Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> drivers/infiniband/hw/qib/qib.h | 4 - drivers/infiniband/hw/qib/qib_sysfs.c | 606 ++++++++++++++-------------------- 2 files changed, 254 insertions(+), 356 deletions(-) Cheers Robin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2022-02-13 9:21 Robin Peiremans @ 2022-02-14 18:08 ` Jason Gunthorpe 2022-02-17 13:37 ` Dennis Dalessandro 0 siblings, 1 reply; 38+ messages in thread From: Jason Gunthorpe @ 2022-02-14 18:08 UTC (permalink / raw) To: Robin Peiremans, Mike Marciniszyn; +Cc: linux-rdma On Sun, Feb 13, 2022 at 10:21:49AM +0100, Robin Peiremans wrote: > Hi > > I ran into this error when upgrading to kernel 5.14 (it works > pre-5.14). The driver gets loaded (verified via lsmod), but the ib > utils don't show any ports. > > There's a bugreport on elrepo > (https://elrepo.org/bugs/view.php?id=1176) that looks pretty much > identical, but it looks stalled. > I've bisected the kernel and commit > 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit. > Since I'm absolutely no dev, I'm hoping someone here can figure out > what exactly is going wrong. Latest mainline kernel still has the same > behavior. Don't know, Mike tested this patch on qib, maybe he knows Jason ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2022-02-14 18:08 ` Jason Gunthorpe @ 2022-02-17 13:37 ` Dennis Dalessandro 2022-02-17 14:29 ` Marciniszyn, Mike 0 siblings, 1 reply; 38+ messages in thread From: Dennis Dalessandro @ 2022-02-17 13:37 UTC (permalink / raw) To: Jason Gunthorpe, Robin Peiremans, Mike Marciniszyn; +Cc: linux-rdma On 2/14/22 1:08 PM, Jason Gunthorpe wrote: > On Sun, Feb 13, 2022 at 10:21:49AM +0100, Robin Peiremans wrote: >> Hi >> >> I ran into this error when upgrading to kernel 5.14 (it works >> pre-5.14). The driver gets loaded (verified via lsmod), but the ib >> utils don't show any ports. >> >> There's a bugreport on elrepo >> (https://elrepo.org/bugs/view.php?id=1176) that looks pretty much >> identical, but it looks stalled. >> I've bisected the kernel and commit >> 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit. >> Since I'm absolutely no dev, I'm hoping someone here can figure out >> what exactly is going wrong. Latest mainline kernel still has the same >> behavior. > > Don't know, Mike tested this patch on qib, maybe he knows A patch to fix the issue should be posted soon. -Denny ^ permalink raw reply [flat|nested] 38+ messages in thread
* RE: sysfs: cannot create duplicate filename 2022-02-17 13:37 ` Dennis Dalessandro @ 2022-02-17 14:29 ` Marciniszyn, Mike 2022-02-17 15:35 ` Robin Peiremans 0 siblings, 1 reply; 38+ messages in thread From: Marciniszyn, Mike @ 2022-02-17 14:29 UTC (permalink / raw) To: Dalessandro, Dennis, Jason Gunthorpe, Robin Peiremans; +Cc: linux-rdma > > > > Don't know, Mike tested this patch on qib, maybe he knows > > A patch to fix the issue should be posted soon. > > -Denny See https://lore.kernel.org/linux-rdma/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u. Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2022-02-17 14:29 ` Marciniszyn, Mike @ 2022-02-17 15:35 ` Robin Peiremans 0 siblings, 0 replies; 38+ messages in thread From: Robin Peiremans @ 2022-02-17 15:35 UTC (permalink / raw) To: Marciniszyn, Mike; +Cc: Dalessandro, Dennis, Jason Gunthorpe, linux-rdma Tested on 5.17 and my ports are back. I can confirm the patch works. Thanks a lot for the quick fix! Robin On Thu, Feb 17, 2022 at 3:29 PM Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote: > > > > > > > Don't know, Mike tested this patch on qib, maybe he knows > > > > A patch to fix the issue should be posted soon. > > > > -Denny > > See https://lore.kernel.org/linux-rdma/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u. > > Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* sysfs: cannot create duplicate filename @ 2011-04-11 14:04 Sebastian Ott 2011-04-11 14:13 ` Greg KH 0 siblings, 1 reply; 38+ messages in thread From: Sebastian Ott @ 2011-04-11 14:04 UTC (permalink / raw) To: Greg Kroah-Hartman; +Cc: linux-kernel Hi, i've seen this warning which looks to be caused by a race between device_add and driver_register [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' [ 80.893611] ------------[ cut here ]------------ [ 80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455 [ 80.893617] Modules linked in: qeth ccwgroup [ 80.893623] Modules linked in: qeth ccwgroup [ 80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1 [ 80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980) [ 80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8) [ 80.893643] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 [ 80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001 [ 80.893651] 0000000000589daa 00000000005a1248 0000000000000000 0000000000000000 [ 80.893654] 0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8 [ 80.893657] 000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00 [ 80.893668] Krnl Code: 00000000002b675c: c0200020041d larl %r2,6b6f96 [ 80.893672] 00000000002b6762: c0e500169afb brasl %r14,589d58 [ 80.893676] 00000000002b6768: a7f40001 brc 15,2b676a [ 80.893680] >00000000002b676c: b904002a lgr %r2,%r10 [ 80.893684] 00000000002b6770: c0e5fffb6236 brasl %r14,222bdc [ 80.893687] 00000000002b6776: a7f4ffad brc 15,2b66d0 [ 80.893692] 00000000002b677a: e320c0480004 lg %r2,72(%r12) [ 80.893695] 00000000002b6780: a7f4ffec brc 15,2b6758 [ 80.893699] Call Trace: [ 80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8) [ 80.893705] [<00000000002b7046>] sysfs_do_create_link+0xda/0x268 [ 80.893708] [<0000000000409f26>] driver_sysfs_add+0x66/0xcc [ 80.893713] [<000000000040a0a2>] device_bind_driver+0x26/0x48 [ 80.893717] [<000000000040a110>] device_attach+0x4c/0xd4 [ 80.893720] [<0000000000409734>] bus_probe_device+0x4c/0x5c [ 80.893724] [<0000000000406c52>] device_add+0x61e/0x73c [ 80.893728] [<00000000004631a6>] ccw_device_todo+0x31a/0x380 [ 80.893733] [<000000000015f5b6>] process_one_work+0x1f6/0x4f0 [ 80.893739] [<0000000000163358>] worker_thread+0x17c/0x370 [ 80.893742] [<00000000001690ca>] kthread+0xa6/0xb0 [ 80.893747] [<000000000058f5f2>] kernel_thread_starter+0x6/0xc [ 80.893752] [<000000000058f5ec>] kernel_thread_starter+0x0/0xc [ 80.893756] 4 locks held by kworker/u:1/25: [ 80.893758] #0: (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 [ 80.893766] #1: ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 [ 80.893773] #2: (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4 [ 80.893780] #3: (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268 [ 80.893787] Last Breaking-Event-Address: [ 80.893790] [<00000000002b6768>] sysfs_add_one+0xcc/0xe8 [ 80.893795] ---[ end trace ded2f91fcf2c6165 ]--- * device_add attached the device to the bus /*break*/ * driver_register walks the list of devices and tries to bind unbound devices * /*continue*/ device_add calls device_attach which gets confused that the device is already bound to a driver to fix this we could: * hold the device lock in device_add (from bus_add_device to bus_probe_device) ..but this means we have to extract the lock from the inside of some of these functions and i'm not sure about holding the lock while the blocking_notifier_call_chain thing.. * add a bus mutex, preventing concurrent registrations of devices and drivers to a bus * change device_attach to detect an already bound device..something like the following: regards, Sebastian ----- Add a check to device_attach to identify an already bound device. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> --- drivers/base/dd.c | 5 +++++ 1 file changed, 5 insertions(+) --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -245,6 +245,10 @@ int device_attach(struct device *dev) device_lock(dev); if (dev->driver) { + if (klist_node_attached(&dev->p->knode_driver)) { + ret = 1; + goto out_unlock; + } ret = device_bind_driver(dev); if (ret == 0) ret = 1; @@ -257,6 +261,7 @@ int device_attach(struct device *dev) ret = bus_for_each_drv(dev->bus, NULL, dev, __device_attach); pm_runtime_put_sync(dev); } +out_unlock: device_unlock(dev); return ret; } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 14:04 Sebastian Ott @ 2011-04-11 14:13 ` Greg KH 2011-04-11 14:33 ` Sebastian Ott 0 siblings, 1 reply; 38+ messages in thread From: Greg KH @ 2011-04-11 14:13 UTC (permalink / raw) To: Sebastian Ott; +Cc: linux-kernel On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > Hi, > > i've seen this warning which looks to be caused by a race between device_add > and driver_register > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' Isn't the problem here the fact that you are creating 2 directories of the same name? > [ 80.893611] ------------[ cut here ]------------ > [ 80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455 > [ 80.893617] Modules linked in: qeth ccwgroup > [ 80.893623] Modules linked in: qeth ccwgroup > [ 80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1 > [ 80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980) > [ 80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8) > [ 80.893643] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 > [ 80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001 > [ 80.893651] 0000000000589daa 00000000005a1248 0000000000000000 0000000000000000 > [ 80.893654] 0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8 > [ 80.893657] 000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00 > [ 80.893668] Krnl Code: 00000000002b675c: c0200020041d larl %r2,6b6f96 > [ 80.893672] 00000000002b6762: c0e500169afb brasl %r14,589d58 > [ 80.893676] 00000000002b6768: a7f40001 brc 15,2b676a > [ 80.893680] >00000000002b676c: b904002a lgr %r2,%r10 > [ 80.893684] 00000000002b6770: c0e5fffb6236 brasl %r14,222bdc > [ 80.893687] 00000000002b6776: a7f4ffad brc 15,2b66d0 > [ 80.893692] 00000000002b677a: e320c0480004 lg %r2,72(%r12) > [ 80.893695] 00000000002b6780: a7f4ffec brc 15,2b6758 > [ 80.893699] Call Trace: > [ 80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8) > [ 80.893705] [<00000000002b7046>] sysfs_do_create_link+0xda/0x268 > [ 80.893708] [<0000000000409f26>] driver_sysfs_add+0x66/0xcc > [ 80.893713] [<000000000040a0a2>] device_bind_driver+0x26/0x48 > [ 80.893717] [<000000000040a110>] device_attach+0x4c/0xd4 > [ 80.893720] [<0000000000409734>] bus_probe_device+0x4c/0x5c > [ 80.893724] [<0000000000406c52>] device_add+0x61e/0x73c > [ 80.893728] [<00000000004631a6>] ccw_device_todo+0x31a/0x380 > [ 80.893733] [<000000000015f5b6>] process_one_work+0x1f6/0x4f0 > [ 80.893739] [<0000000000163358>] worker_thread+0x17c/0x370 > [ 80.893742] [<00000000001690ca>] kthread+0xa6/0xb0 > [ 80.893747] [<000000000058f5f2>] kernel_thread_starter+0x6/0xc > [ 80.893752] [<000000000058f5ec>] kernel_thread_starter+0x0/0xc > [ 80.893756] 4 locks held by kworker/u:1/25: > [ 80.893758] #0: (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > [ 80.893766] #1: ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > [ 80.893773] #2: (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4 > [ 80.893780] #3: (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268 > [ 80.893787] Last Breaking-Event-Address: > [ 80.893790] [<00000000002b6768>] sysfs_add_one+0xcc/0xe8 > [ 80.893795] ---[ end trace ded2f91fcf2c6165 ]--- > > > * device_add attached the device to the bus /*break*/ > * driver_register walks the list of devices and tries to bind > unbound devices > * /*continue*/ device_add calls device_attach which gets confused > that the device is already bound to a driver Why would your bus code ever allow this to happen? It's the caller's responsiblity to do things in the correct order, right? > to fix this we could: > * hold the device lock in device_add (from bus_add_device to > bus_probe_device) ..but this means we have to extract the lock > from the inside of some of these functions and i'm not sure about > holding the lock while the blocking_notifier_call_chain thing.. > > * add a bus mutex, preventing concurrent registrations of devices > and drivers to a bus > > * change device_attach to detect an already bound device..something > like the following: How about fix your caller code? thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 14:13 ` Greg KH @ 2011-04-11 14:33 ` Sebastian Ott 2011-04-11 14:49 ` Greg KH 0 siblings, 1 reply; 38+ messages in thread From: Sebastian Ott @ 2011-04-11 14:33 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On Mon, 11 Apr 2011, Greg KH wrote: > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > Hi, > > > > i've seen this warning which looks to be caused by a race between device_add > > and driver_register > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > Isn't the problem here the fact that you are creating 2 directories of > the same name? I'm sure this isn't the case here. The bus code just calls device_add and at the same time on a different thread a module is loaded which registers a driver at the bus. I was able to reproduce this with a module which creates a dummy bus and registers drivers and devices on this bus on 2 different workqueues. > > > > [ 80.893611] ------------[ cut here ]------------ > > [ 80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455 > > [ 80.893617] Modules linked in: qeth ccwgroup > > [ 80.893623] Modules linked in: qeth ccwgroup > > [ 80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1 > > [ 80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980) > > [ 80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8) > > [ 80.893643] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 > > [ 80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001 > > [ 80.893651] 0000000000589daa 00000000005a1248 0000000000000000 0000000000000000 > > [ 80.893654] 0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8 > > [ 80.893657] 000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00 > > [ 80.893668] Krnl Code: 00000000002b675c: c0200020041d larl %r2,6b6f96 > > [ 80.893672] 00000000002b6762: c0e500169afb brasl %r14,589d58 > > [ 80.893676] 00000000002b6768: a7f40001 brc 15,2b676a > > [ 80.893680] >00000000002b676c: b904002a lgr %r2,%r10 > > [ 80.893684] 00000000002b6770: c0e5fffb6236 brasl %r14,222bdc > > [ 80.893687] 00000000002b6776: a7f4ffad brc 15,2b66d0 > > [ 80.893692] 00000000002b677a: e320c0480004 lg %r2,72(%r12) > > [ 80.893695] 00000000002b6780: a7f4ffec brc 15,2b6758 > > [ 80.893699] Call Trace: > > [ 80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8) > > [ 80.893705] [<00000000002b7046>] sysfs_do_create_link+0xda/0x268 > > [ 80.893708] [<0000000000409f26>] driver_sysfs_add+0x66/0xcc > > [ 80.893713] [<000000000040a0a2>] device_bind_driver+0x26/0x48 > > [ 80.893717] [<000000000040a110>] device_attach+0x4c/0xd4 > > [ 80.893720] [<0000000000409734>] bus_probe_device+0x4c/0x5c > > [ 80.893724] [<0000000000406c52>] device_add+0x61e/0x73c > > [ 80.893728] [<00000000004631a6>] ccw_device_todo+0x31a/0x380 > > [ 80.893733] [<000000000015f5b6>] process_one_work+0x1f6/0x4f0 > > [ 80.893739] [<0000000000163358>] worker_thread+0x17c/0x370 > > [ 80.893742] [<00000000001690ca>] kthread+0xa6/0xb0 > > [ 80.893747] [<000000000058f5f2>] kernel_thread_starter+0x6/0xc > > [ 80.893752] [<000000000058f5ec>] kernel_thread_starter+0x0/0xc > > [ 80.893756] 4 locks held by kworker/u:1/25: > > [ 80.893758] #0: (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > > [ 80.893766] #1: ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > > [ 80.893773] #2: (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4 > > [ 80.893780] #3: (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268 > > [ 80.893787] Last Breaking-Event-Address: > > [ 80.893790] [<00000000002b6768>] sysfs_add_one+0xcc/0xe8 > > [ 80.893795] ---[ end trace ded2f91fcf2c6165 ]--- > > > > > > * device_add attached the device to the bus /*break*/ > > * driver_register walks the list of devices and tries to bind > > unbound devices > > * /*continue*/ device_add calls device_attach which gets confused > > that the device is already bound to a driver > > Why would your bus code ever allow this to happen? It's the caller's > responsiblity to do things in the correct order, right? I don't think the bus code which calls device_register can (or should) prevent drivers from beeing registered at this bus at the same time. > > > to fix this we could: > > * hold the device lock in device_add (from bus_add_device to > > bus_probe_device) ..but this means we have to extract the lock > > from the inside of some of these functions and i'm not sure about > > holding the lock while the blocking_notifier_call_chain thing.. > > > > * add a bus mutex, preventing concurrent registrations of devices > > and drivers to a bus > > > > * change device_attach to detect an already bound device..something > > like the following: > > How about fix your caller code? > > thanks, > > greg k-h > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 14:33 ` Sebastian Ott @ 2011-04-11 14:49 ` Greg KH 2011-04-11 15:05 ` Sebastian Ott 0 siblings, 1 reply; 38+ messages in thread From: Greg KH @ 2011-04-11 14:49 UTC (permalink / raw) To: Sebastian Ott; +Cc: linux-kernel On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > On Mon, 11 Apr 2011, Greg KH wrote: > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > Hi, > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > and driver_register > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > Isn't the problem here the fact that you are creating 2 directories of > > the same name? > I'm sure this isn't the case here. The bus code just calls device_add and > at the same time on a different thread a module is loaded which registers > a driver at the bus. > > I was able to reproduce this with a module which creates a dummy bus > and registers drivers and devices on this bus on 2 different workqueues. That makes sense, as no bus should be doing this on multiple "threads". What real-life bus does this today? > > > [ 80.893611] ------------[ cut here ]------------ > > > [ 80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455 > > > [ 80.893617] Modules linked in: qeth ccwgroup > > > [ 80.893623] Modules linked in: qeth ccwgroup > > > [ 80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1 > > > [ 80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980) > > > [ 80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8) > > > [ 80.893643] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 > > > [ 80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001 > > > [ 80.893651] 0000000000589daa 00000000005a1248 0000000000000000 0000000000000000 > > > [ 80.893654] 0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8 > > > [ 80.893657] 000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00 > > > [ 80.893668] Krnl Code: 00000000002b675c: c0200020041d larl %r2,6b6f96 > > > [ 80.893672] 00000000002b6762: c0e500169afb brasl %r14,589d58 > > > [ 80.893676] 00000000002b6768: a7f40001 brc 15,2b676a > > > [ 80.893680] >00000000002b676c: b904002a lgr %r2,%r10 > > > [ 80.893684] 00000000002b6770: c0e5fffb6236 brasl %r14,222bdc > > > [ 80.893687] 00000000002b6776: a7f4ffad brc 15,2b66d0 > > > [ 80.893692] 00000000002b677a: e320c0480004 lg %r2,72(%r12) > > > [ 80.893695] 00000000002b6780: a7f4ffec brc 15,2b6758 > > > [ 80.893699] Call Trace: > > > [ 80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8) > > > [ 80.893705] [<00000000002b7046>] sysfs_do_create_link+0xda/0x268 > > > [ 80.893708] [<0000000000409f26>] driver_sysfs_add+0x66/0xcc > > > [ 80.893713] [<000000000040a0a2>] device_bind_driver+0x26/0x48 > > > [ 80.893717] [<000000000040a110>] device_attach+0x4c/0xd4 > > > [ 80.893720] [<0000000000409734>] bus_probe_device+0x4c/0x5c > > > [ 80.893724] [<0000000000406c52>] device_add+0x61e/0x73c > > > [ 80.893728] [<00000000004631a6>] ccw_device_todo+0x31a/0x380 > > > [ 80.893733] [<000000000015f5b6>] process_one_work+0x1f6/0x4f0 > > > [ 80.893739] [<0000000000163358>] worker_thread+0x17c/0x370 > > > [ 80.893742] [<00000000001690ca>] kthread+0xa6/0xb0 > > > [ 80.893747] [<000000000058f5f2>] kernel_thread_starter+0x6/0xc > > > [ 80.893752] [<000000000058f5ec>] kernel_thread_starter+0x0/0xc > > > [ 80.893756] 4 locks held by kworker/u:1/25: > > > [ 80.893758] #0: (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > > > [ 80.893766] #1: ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > > > [ 80.893773] #2: (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4 > > > [ 80.893780] #3: (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268 > > > [ 80.893787] Last Breaking-Event-Address: > > > [ 80.893790] [<00000000002b6768>] sysfs_add_one+0xcc/0xe8 > > > [ 80.893795] ---[ end trace ded2f91fcf2c6165 ]--- > > > > > > > > > * device_add attached the device to the bus /*break*/ > > > * driver_register walks the list of devices and tries to bind > > > unbound devices > > > * /*continue*/ device_add calls device_attach which gets confused > > > that the device is already bound to a driver > > > > Why would your bus code ever allow this to happen? It's the caller's > > responsiblity to do things in the correct order, right? > I don't think the bus code which calls device_register can (or should) > prevent drivers from beeing registered at this bus at the same time. Why not? That's the way all kernel subsystems work today that I know of. Has this changed? thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 14:49 ` Greg KH @ 2011-04-11 15:05 ` Sebastian Ott 2011-04-11 15:19 ` Greg KH 0 siblings, 1 reply; 38+ messages in thread From: Sebastian Ott @ 2011-04-11 15:05 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On Mon, 11 Apr 2011, Greg KH wrote: > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > > Hi, > > > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > > and driver_register > > > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > > > Isn't the problem here the fact that you are creating 2 directories of > > > the same name? > > I'm sure this isn't the case here. The bus code just calls device_add and > > at the same time on a different thread a module is loaded which registers > > a driver at the bus. > > > > I was able to reproduce this with a module which creates a dummy bus > > and registers drivers and devices on this bus on 2 different workqueues. > > That makes sense, as no bus should be doing this on multiple "threads". > What real-life bus does this today? A bus that will recognize and register a lot of devices, after the first uevent is presented to userspace, a module will be loaded registering a driver from a different thread. I don't think thats uncommon. > > > > > [ 80.893611] ------------[ cut here ]------------ > > > > [ 80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455 > > > > [ 80.893617] Modules linked in: qeth ccwgroup > > > > [ 80.893623] Modules linked in: qeth ccwgroup > > > > [ 80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1 > > > > [ 80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980) > > > > [ 80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8) > > > > [ 80.893643] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3 > > > > [ 80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001 > > > > [ 80.893651] 0000000000589daa 00000000005a1248 0000000000000000 0000000000000000 > > > > [ 80.893654] 0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8 > > > > [ 80.893657] 000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00 > > > > [ 80.893668] Krnl Code: 00000000002b675c: c0200020041d larl %r2,6b6f96 > > > > [ 80.893672] 00000000002b6762: c0e500169afb brasl %r14,589d58 > > > > [ 80.893676] 00000000002b6768: a7f40001 brc 15,2b676a > > > > [ 80.893680] >00000000002b676c: b904002a lgr %r2,%r10 > > > > [ 80.893684] 00000000002b6770: c0e5fffb6236 brasl %r14,222bdc > > > > [ 80.893687] 00000000002b6776: a7f4ffad brc 15,2b66d0 > > > > [ 80.893692] 00000000002b677a: e320c0480004 lg %r2,72(%r12) > > > > [ 80.893695] 00000000002b6780: a7f4ffec brc 15,2b6758 > > > > [ 80.893699] Call Trace: > > > > [ 80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8) > > > > [ 80.893705] [<00000000002b7046>] sysfs_do_create_link+0xda/0x268 > > > > [ 80.893708] [<0000000000409f26>] driver_sysfs_add+0x66/0xcc > > > > [ 80.893713] [<000000000040a0a2>] device_bind_driver+0x26/0x48 > > > > [ 80.893717] [<000000000040a110>] device_attach+0x4c/0xd4 > > > > [ 80.893720] [<0000000000409734>] bus_probe_device+0x4c/0x5c > > > > [ 80.893724] [<0000000000406c52>] device_add+0x61e/0x73c > > > > [ 80.893728] [<00000000004631a6>] ccw_device_todo+0x31a/0x380 > > > > [ 80.893733] [<000000000015f5b6>] process_one_work+0x1f6/0x4f0 > > > > [ 80.893739] [<0000000000163358>] worker_thread+0x17c/0x370 > > > > [ 80.893742] [<00000000001690ca>] kthread+0xa6/0xb0 > > > > [ 80.893747] [<000000000058f5f2>] kernel_thread_starter+0x6/0xc > > > > [ 80.893752] [<000000000058f5ec>] kernel_thread_starter+0x0/0xc > > > > [ 80.893756] 4 locks held by kworker/u:1/25: > > > > [ 80.893758] #0: (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > > > > [ 80.893766] #1: ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0 > > > > [ 80.893773] #2: (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4 > > > > [ 80.893780] #3: (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268 > > > > [ 80.893787] Last Breaking-Event-Address: > > > > [ 80.893790] [<00000000002b6768>] sysfs_add_one+0xcc/0xe8 > > > > [ 80.893795] ---[ end trace ded2f91fcf2c6165 ]--- > > > > > > > > > > > > * device_add attached the device to the bus /*break*/ > > > > * driver_register walks the list of devices and tries to bind > > > > unbound devices > > > > * /*continue*/ device_add calls device_attach which gets confused > > > > that the device is already bound to a driver > > > > > > Why would your bus code ever allow this to happen? It's the caller's > > > responsiblity to do things in the correct order, right? > > I don't think the bus code which calls device_register can (or should) > > prevent drivers from beeing registered at this bus at the same time. > > Why not? That's the way all kernel subsystems work today that I know > of. Has this changed? What about an exported bus_type? At all time a driver for this bus can be registered, the bus code has no chance to prevent or serialize this. > > thanks, > > greg k-h > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 15:05 ` Sebastian Ott @ 2011-04-11 15:19 ` Greg KH 2011-04-11 17:50 ` Sebastian Ott 0 siblings, 1 reply; 38+ messages in thread From: Greg KH @ 2011-04-11 15:19 UTC (permalink / raw) To: Sebastian Ott; +Cc: linux-kernel On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote: > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > > > Hi, > > > > > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > > > and driver_register > > > > > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > > > > > Isn't the problem here the fact that you are creating 2 directories of > > > > the same name? > > > I'm sure this isn't the case here. The bus code just calls device_add and > > > at the same time on a different thread a module is loaded which registers > > > a driver at the bus. > > > > > > I was able to reproduce this with a module which creates a dummy bus > > > and registers drivers and devices on this bus on 2 different workqueues. > > > > That makes sense, as no bus should be doing this on multiple "threads". > > What real-life bus does this today? > A bus that will recognize and register a lot of devices, after the first > uevent is presented to userspace, a module will be loaded registering a > driver from a different thread. I don't think thats uncommon. But again, what kernel code today does this? I think they all have locks to keep this from happening, right? > > > > > * device_add attached the device to the bus /*break*/ > > > > > * driver_register walks the list of devices and tries to bind > > > > > unbound devices > > > > > * /*continue*/ device_add calls device_attach which gets confused > > > > > that the device is already bound to a driver > > > > > > > > Why would your bus code ever allow this to happen? It's the caller's > > > > responsiblity to do things in the correct order, right? > > > I don't think the bus code which calls device_register can (or should) > > > prevent drivers from beeing registered at this bus at the same time. > > > > Why not? That's the way all kernel subsystems work today that I know > > of. Has this changed? > What about an exported bus_type? At all time a driver for this bus can > be registered, the bus code has no chance to prevent or serialize this. No, the bus core is the one that should be binding the bus type to the driver and doing the registering. No individual driver should ever be messing with a bus_type at all. Now perhaps platform devices are, and if so, we might want to resolve this, but no "real" bus should ever be doing this. thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 15:19 ` Greg KH @ 2011-04-11 17:50 ` Sebastian Ott 2011-04-11 17:56 ` Greg KH 0 siblings, 1 reply; 38+ messages in thread From: Sebastian Ott @ 2011-04-11 17:50 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On Mon, 11 Apr 2011, Greg KH wrote: > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote: > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > > > > Hi, > > > > > > > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > > > > and driver_register > > > > > > > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > > > > > > > Isn't the problem here the fact that you are creating 2 directories of > > > > > the same name? > > > > I'm sure this isn't the case here. The bus code just calls device_add and > > > > at the same time on a different thread a module is loaded which registers > > > > a driver at the bus. > > > > > > > > I was able to reproduce this with a module which creates a dummy bus > > > > and registers drivers and devices on this bus on 2 different workqueues. > > > > > > That makes sense, as no bus should be doing this on multiple "threads". > > > What real-life bus does this today? > > A bus that will recognize and register a lot of devices, after the first > > uevent is presented to userspace, a module will be loaded registering a > > driver from a different thread. I don't think thats uncommon. > > But again, what kernel code today does this? I think they all have > locks to keep this from happening, right? I couldn't find a bus who protects device_register against driver_register and I don't think this is something which should be handled by every individual bus but from within the driver core. > > > > > > > * device_add attached the device to the bus /*break*/ > > > > > > * driver_register walks the list of devices and tries to bind > > > > > > unbound devices > > > > > > * /*continue*/ device_add calls device_attach which gets confused > > > > > > that the device is already bound to a driver > > > > > > > > > > Why would your bus code ever allow this to happen? It's the caller's > > > > > responsiblity to do things in the correct order, right? > > > > I don't think the bus code which calls device_register can (or should) > > > > prevent drivers from beeing registered at this bus at the same time. > > > > > > Why not? That's the way all kernel subsystems work today that I know > > > of. Has this changed? > > What about an exported bus_type? At all time a driver for this bus can > > be registered, the bus code has no chance to prevent or serialize this. > > No, the bus core is the one that should be binding the bus type to the > driver and doing the registering. No individual driver should ever be > messing with a bus_type at all. > > Now perhaps platform devices are, and if so, we might want to resolve > this, but no "real" bus should ever be doing this. > > thanks, > > greg k-h > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 17:50 ` Sebastian Ott @ 2011-04-11 17:56 ` Greg KH 2011-04-12 14:39 ` Sebastian Ott 0 siblings, 1 reply; 38+ messages in thread From: Greg KH @ 2011-04-11 17:56 UTC (permalink / raw) To: Sebastian Ott; +Cc: linux-kernel On Mon, Apr 11, 2011 at 07:50:34PM +0200, Sebastian Ott wrote: > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote: > > > > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > > > > > Hi, > > > > > > > > > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > > > > > and driver_register > > > > > > > > > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > > > > > > > > > Isn't the problem here the fact that you are creating 2 directories of > > > > > > the same name? > > > > > I'm sure this isn't the case here. The bus code just calls device_add and > > > > > at the same time on a different thread a module is loaded which registers > > > > > a driver at the bus. > > > > > > > > > > I was able to reproduce this with a module which creates a dummy bus > > > > > and registers drivers and devices on this bus on 2 different workqueues. > > > > > > > > That makes sense, as no bus should be doing this on multiple "threads". > > > > What real-life bus does this today? > > > A bus that will recognize and register a lot of devices, after the first > > > uevent is presented to userspace, a module will be loaded registering a > > > driver from a different thread. I don't think thats uncommon. > > > > But again, what kernel code today does this? I think they all have > > locks to keep this from happening, right? > I couldn't find a bus who protects device_register against driver_register > and I don't think this is something which should be handled by every > individual bus but from within the driver core. How did you cause the oops in this original message? What type of bus was it on? And did your patch solve the issue? thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-11 17:56 ` Greg KH @ 2011-04-12 14:39 ` Sebastian Ott 2011-04-12 16:02 ` Greg KH 0 siblings, 1 reply; 38+ messages in thread From: Sebastian Ott @ 2011-04-12 14:39 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On Mon, 11 Apr 2011, Greg KH wrote: > On Mon, Apr 11, 2011 at 07:50:34PM +0200, Sebastian Ott wrote: > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote: > > > > > > > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > > > > > > and driver_register > > > > > > > > > > > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > > > > > > > > > > > Isn't the problem here the fact that you are creating 2 directories of > > > > > > > the same name? > > > > > > I'm sure this isn't the case here. The bus code just calls device_add and > > > > > > at the same time on a different thread a module is loaded which registers > > > > > > a driver at the bus. > > > > > > > > > > > > I was able to reproduce this with a module which creates a dummy bus > > > > > > and registers drivers and devices on this bus on 2 different workqueues. > > > > > > > > > > That makes sense, as no bus should be doing this on multiple "threads". > > > > > What real-life bus does this today? > > > > A bus that will recognize and register a lot of devices, after the first > > > > uevent is presented to userspace, a module will be loaded registering a > > > > driver from a different thread. I don't think thats uncommon. > > > > > > But again, what kernel code today does this? I think they all have > > > locks to keep this from happening, right? > > I couldn't find a bus who protects device_register against driver_register > > and I don't think this is something which should be handled by every > > individual bus but from within the driver core. > > How did you cause the oops in this original message? What type of bus > was it on? And did your patch solve the issue? I've seen this warning on a system which had a lot of devices attached. Those devices also responded _very_ slow. So it took some time to register all these deivces (qeth) on the bus (ccw). During this time the qeth module got loaded and registered the qeth driver. Since i've never seen this warning and those devices normally don't take so much time to respond I wrote the beforementioned module to reproduce the race and make sure that this is no driver issue. With this module the warning could be triggered after a few cycles. With the patch applied, I did not see the warning again even after > 10.000 cycles. regards, Sebastian ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: sysfs: cannot create duplicate filename 2011-04-12 14:39 ` Sebastian Ott @ 2011-04-12 16:02 ` Greg KH 0 siblings, 0 replies; 38+ messages in thread From: Greg KH @ 2011-04-12 16:02 UTC (permalink / raw) To: Sebastian Ott; +Cc: linux-kernel On Tue, Apr 12, 2011 at 04:39:49PM +0200, Sebastian Ott wrote: > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > On Mon, Apr 11, 2011 at 07:50:34PM +0200, Sebastian Ott wrote: > > > > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote: > > > > > > > > > > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > > > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote: > > > > > > > On Mon, 11 Apr 2011, Greg KH wrote: > > > > > > > > > > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > i've seen this warning which looks to be caused by a race between device_add > > > > > > > > > and driver_register > > > > > > > > > > > > > > > > > > [ 80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d' > > > > > > > > > > > > > > > > Isn't the problem here the fact that you are creating 2 directories of > > > > > > > > the same name? > > > > > > > I'm sure this isn't the case here. The bus code just calls device_add and > > > > > > > at the same time on a different thread a module is loaded which registers > > > > > > > a driver at the bus. > > > > > > > > > > > > > > I was able to reproduce this with a module which creates a dummy bus > > > > > > > and registers drivers and devices on this bus on 2 different workqueues. > > > > > > > > > > > > That makes sense, as no bus should be doing this on multiple "threads". > > > > > > What real-life bus does this today? > > > > > A bus that will recognize and register a lot of devices, after the first > > > > > uevent is presented to userspace, a module will be loaded registering a > > > > > driver from a different thread. I don't think thats uncommon. > > > > > > > > But again, what kernel code today does this? I think they all have > > > > locks to keep this from happening, right? > > > I couldn't find a bus who protects device_register against driver_register > > > and I don't think this is something which should be handled by every > > > individual bus but from within the driver core. > > > > How did you cause the oops in this original message? What type of bus > > was it on? And did your patch solve the issue? > > I've seen this warning on a system which had a lot of devices attached. > Those devices also responded _very_ slow. So it took some time to > register all these deivces (qeth) on the bus (ccw). During this time > the qeth module got loaded and registered the qeth driver. > > Since i've never seen this warning and those devices normally don't > take so much time to respond I wrote the beforementioned module to > reproduce the race and make sure that this is no driver issue. > > With this module the warning could be triggered after a few cycles. > With the patch applied, I did not see the warning again even after > > 10.000 cycles. Ok, fair enough. Care to resend the patch again so I can review it once more? thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2022-02-17 15:35 UTC | newest] Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-03-02 18:03 sysfs: cannot create duplicate filename Andre Heider [not found] ` <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2013-03-05 9:39 ` Lingzhu Xiang [not found] ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-06 13:19 ` Matt Fleming [not found] ` <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-06 13:35 ` shea-yfkUTty7RcRWk0Htik3J/w [not found] ` <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org> 2013-03-06 14:29 ` Matt Fleming 2013-03-08 15:11 ` Matt Fleming [not found] ` <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-08 18:17 ` Lingzhu Xiang [not found] ` <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-12 10:08 ` Matt Fleming [not found] ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-12 10:45 ` Lingzhu Xiang [not found] ` <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-12 16:35 ` Matt Fleming [not found] ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-12 17:51 ` Seiji Aguchi 2013-03-13 10:47 ` Lingzhu Xiang [not found] ` <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-14 16:33 ` Matt Fleming [not found] ` <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-14 19:31 ` Seiji Aguchi [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> 2013-03-18 14:51 ` Matt Fleming [not found] ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 2013-03-19 10:14 ` Lingzhu Xiang [not found] ` <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-19 14:25 ` Seiji Aguchi [not found] ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org> 2013-03-19 15:26 ` Matt Fleming [not found] ` <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org> 2013-03-19 15:46 ` Seiji Aguchi 2013-03-19 16:00 ` Lingzhu Xiang 2013-03-21 7:22 ` Lingzhu Xiang [not found] ` <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2013-03-21 7:44 ` Matt Fleming 2013-03-12 13:20 ` Andre Heider -- strict thread matches above, loose matches on Subject: below -- 2022-02-13 9:21 Robin Peiremans 2022-02-14 18:08 ` Jason Gunthorpe 2022-02-17 13:37 ` Dennis Dalessandro 2022-02-17 14:29 ` Marciniszyn, Mike 2022-02-17 15:35 ` Robin Peiremans 2011-04-11 14:04 Sebastian Ott 2011-04-11 14:13 ` Greg KH 2011-04-11 14:33 ` Sebastian Ott 2011-04-11 14:49 ` Greg KH 2011-04-11 15:05 ` Sebastian Ott 2011-04-11 15:19 ` Greg KH 2011-04-11 17:50 ` Sebastian Ott 2011-04-11 17:56 ` Greg KH 2011-04-12 14:39 ` Sebastian Ott 2011-04-12 16:02 ` Greg KH
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.