All of lore.kernel.org
 help / color / mirror / Atom feed
* sysfs: cannot create duplicate filename
@ 2013-03-02 18:03 Andre Heider
       [not found] ` <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Andre Heider @ 2013-03-02 18:03 UTC (permalink / raw)
  To: linux-efi-u79uwXL29TY76Z2rM5mHXA

Hi,

After a BIOS update I get this in dmesg:

[    0.581554] EFI Variables Facility v0.08 2004-May-17
[    0.584914] ------------[ cut here ]------------
[    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x
d4/0x100()
[    0.586381] Hardware name: To be filled by O.E.M.
[    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl
BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
[    0.588694] Modules linked in:
[    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
[    0.590280] Call Trace:
[    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
[    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
[    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
[    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
[    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
[    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
[    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
[    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
[    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
[    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
[    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
[    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
[    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
[    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
[    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
[    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
[    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
[    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
[    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.607143] ---[ end trace 1609741ab737eb29 ]---
[    0.607869] ------------[ cut here ]------------
[    0.608589] WARNING: at /home/andre/linux/lib/kobject.c:196
kobject_add_internal+0x204/0x220()
[    0.609342] Hardware name: To be filled by O.E.M.
[    0.610093] kobject_add_internal failed for
SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8 with -EEXIST,
don't try to register things with the same name in the same directory.
[    0.611703] Modules linked in:
[    0.612497] Pid: 1, comm: swapper/0 Tainted: G        W    3.8.0+ #7
[    0.613312] Call Trace:
[    0.614112]  [<ffffffff81347fb4>] ? kobject_add_internal+0x204/0x220
[    0.614931]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
[    0.615753]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
[    0.616559]  [<ffffffff81347fb4>] kobject_add_internal+0x204/0x220
[    0.617362]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
[    0.618152]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
[    0.618936]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
[    0.619732]  [<ffffffff8158517e>] register_efivars+0xde/0x420
[    0.620518]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
[    0.621297]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
[    0.622072]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.622836]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
[    0.623590]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
[    0.624314]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.625010]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
[    0.625703]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
[    0.626393]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.627078] ---[ end trace 1609741ab737eb2a ]---

Sounds like yet another BIOS issue, and I guess the answer is
"complain to the manufacturer"... But we know how well that works, so
can this be a warning instead of a scary backtrace? :)

(this is: Gigabyte Technology Co., Ltd. To be filled by
O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)"

Thanks,
Andre

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found] ` <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-03-05  9:39   ` Lingzhu Xiang
       [not found]     ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-05  9:39 UTC (permalink / raw)
  To: Andre Heider; +Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/03/2013 02:03 AM, Andre Heider wrote:
> After a BIOS update I get this in dmesg:
> 
> [    0.581554] EFI Variables Facility v0.08 2004-May-17
> [    0.584914] ------------[ cut here ]------------
> [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x
> d4/0x100()
> [    0.586381] Hardware name: To be filled by O.E.M.
> [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl
> BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
> (snip)
> 
> Sounds like yet another BIOS issue, and I guess the answer is
> "complain to the manufacturer"... But we know how well that works, so
> can this be a warning instead of a scary backtrace? :)
> 
> (this is: Gigabyte Technology Co., Ltd. To be filled by
> O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)"

Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631

I think this problem is getting more and more common.

I had an IBM System x3100 M4 and x3850 X5 on which kernel would get stuck
in infinite loop creating duplicate sysfs files because, for some reason,
there are several duplicate boot entries in nvram getting GetNextVariableName
into a circle of iteration (with period > 2).

Granted, it's not kernel's fault for duplicate variables, but it'd be much
harder for users to fix their broken firmware than kernel dealing with it
by not dying on the never-ending GetNextVariableName.


Lingzhu Xiang

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]     ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-06 13:19       ` Matt Fleming
       [not found]         ` <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2013-03-08 15:11       ` Matt Fleming
  1 sibling, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-06 13:19 UTC (permalink / raw)
  To: Lingzhu Xiang; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On Tue, 2013-03-05 at 17:39 +0800, Lingzhu Xiang wrote:
> On 03/03/2013 02:03 AM, Andre Heider wrote:
> > After a BIOS update I get this in dmesg:
> > 
> > [    0.581554] EFI Variables Facility v0.08 2004-May-17
> > [    0.584914] ------------[ cut here ]------------
> > [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x
> > d4/0x100()
> > [    0.586381] Hardware name: To be filled by O.E.M.
> > [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl
> > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
> > (snip)
> > 
> > Sounds like yet another BIOS issue, and I guess the answer is
> > "complain to the manufacturer"... But we know how well that works, so
> > can this be a warning instead of a scary backtrace? :)
> > 
> > (this is: Gigabyte Technology Co., Ltd. To be filled by
> > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)"
> 
> Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631
> 
> I think this problem is getting more and more common.
> 
> I had an IBM System x3100 M4 and x3850 X5 on which kernel would get stuck
> in infinite loop creating duplicate sysfs files because, for some reason,
> there are several duplicate boot entries in nvram getting GetNextVariableName
> into a circle of iteration (with period > 2).
> 
> Granted, it's not kernel's fault for duplicate variables, but it'd be much
> harder for users to fix their broken firmware than kernel dealing with it
> by not dying on the never-ending GetNextVariableName.

It's not just about the manufacturers fixing their firmware and
distributing a BIOS update. It's also the fact few users are likely to
apply it before installing Linux on a new machine.

Yeah, we need something in the kernel to handle this.

Thanks for the report guys.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]         ` <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-03-06 13:35           ` shea-yfkUTty7RcRWk0Htik3J/w
       [not found]             ` <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: shea-yfkUTty7RcRWk0Htik3J/w @ 2013-03-06 13:35 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

Hi Matt,

On 2013-03-06 08:19, Matt Fleming wrote:
> On Tue, 2013-03-05 at 17:39 +0800, Lingzhu Xiang wrote:
>> On 03/03/2013 02:03 AM, Andre Heider wrote:
>> > After a BIOS update I get this in dmesg:
>> >
>> > [    0.581554] EFI Variables Facility v0.08 2004-May-17
>> > [    0.584914] ------------[ cut here ]------------
>> > [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 
>> sysfs_add_one+0x
>> > d4/0x100()
>> > [    0.586381] Hardware name: To be filled by O.E.M.
>> > [    0.587123] sysfs: cannot create duplicate filename 
>> '/firmware/efi/vars/SbAsl
>> > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
>> > (snip)
>> >
>> > Sounds like yet another BIOS issue, and I guess the answer is
>> > "complain to the manufacturer"... But we know how well that works, 
>> so
>> > can this be a warning instead of a scary backtrace? :)
>> >
>> > (this is: Gigabyte Technology Co., Ltd. To be filled by
>> > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)"
>>
>> Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631
>>
>> I think this problem is getting more and more common.
>>
>> I had an IBM System x3100 M4 and x3850 X5 on which kernel would get 
>> stuck
>> in infinite loop creating duplicate sysfs files because, for some 
>> reason,
>> there are several duplicate boot entries in nvram getting 
>> GetNextVariableName
>> into a circle of iteration (with period > 2).
>>
>> Granted, it's not kernel's fault for duplicate variables, but it'd 
>> be much
>> harder for users to fix their broken firmware than kernel dealing 
>> with it
>> by not dying on the never-ending GetNextVariableName.
>
> It's not just about the manufacturers fixing their firmware and
> distributing a BIOS update. It's also the fact few users are likely 
> to
> apply it before installing Linux on a new machine.
>
> Yeah, we need something in the kernel to handle this.
>

Unrelated to the initial issue, but with the growing number of fixes 
for buggy firmware out there, maybe we should consider an approach 
similar to CONFIG_PCI_QUIRKS and the separate drivers/pci/quirks.c, 
where all of the system specific EFI quirks are isolated and can be 
compiled out if you know your firmware is good?

Also might be nice to have a separate config option specifically for 
all the Apple quirks, as their EFI seems to be the worst out there with 
respect to following standards.

>
> Thanks for the report guys.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]             ` <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org>
@ 2013-03-06 14:29               ` Matt Fleming
  0 siblings, 0 replies; 38+ messages in thread
From: Matt Fleming @ 2013-03-06 14:29 UTC (permalink / raw)
  To: shea-yfkUTty7RcRWk0Htik3J/w
  Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA,
	Matthew Garrett, H. Peter Anvin

On Wed, 2013-03-06 at 08:35 -0500, shea-yfkUTty7RcRWk0Htik3J/w@public.gmane.org wrote:
> Unrelated to the initial issue, but with the growing number of fixes 
> for buggy firmware out there, maybe we should consider an approach 
> similar to CONFIG_PCI_QUIRKS and the separate drivers/pci/quirks.c, 
> where all of the system specific EFI quirks are isolated and can be 
> compiled out if you know your firmware is good?

Right, the firmware bug spectrum is indeed starting to grow rapidly. I
was hoping we could avoid doing something like PCI quirks because that
whole thing is a mess and I am concerned about an explosion of quirks
that would need to be maintained, some being used in core architecture
code. Upto now, we've been catering for the least common denominator,
which has worked, but obviously isn't ideal.

My plan has always been to try and hold off on introducing something
like quirks until my hand is forced. But it is likely that will happen
sooner rather than later. 

One problem we have though, is clearly identifying machines that suffer
from these bugs because of the cross-pollination of firmware from
Independent BIOS Vendors (IBVs) and OEMs. In theory, we could use the
FirmwareRevision field in the EFI System Table, along with DMI strings
and such.

> Also might be nice to have a separate config option specifically for 
> all the Apple quirks, as their EFI seems to be the worst out there with 
> respect to following standards.

AFAIK, it's that they're based on EFI 1.10, so pre-UEFI days.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]     ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2013-03-06 13:19       ` Matt Fleming
@ 2013-03-08 15:11       ` Matt Fleming
       [not found]         ` <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-08 15:11 UTC (permalink / raw)
  To: Lingzhu Xiang; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On Tue, 2013-03-05 at 17:39 +0800, Lingzhu Xiang wrote:
> On 03/03/2013 02:03 AM, Andre Heider wrote:
> > After a BIOS update I get this in dmesg:
> > 
> > [    0.581554] EFI Variables Facility v0.08 2004-May-17
> > [    0.584914] ------------[ cut here ]------------
> > [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0x
> > d4/0x100()
> > [    0.586381] Hardware name: To be filled by O.E.M.
> > [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAsl
> > BufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
> > (snip)
> > 
> > Sounds like yet another BIOS issue, and I guess the answer is
> > "complain to the manufacturer"... But we know how well that works, so
> > can this be a warning instead of a scary backtrace? :)
> > 
> > (this is: Gigabyte Technology Co., Ltd. To be filled by
> > O.E.M./Z77X-UD3H, BIOS F19e 11/21/2012)"
> 
> Just saw https://bugzilla.kernel.org/show_bug.cgi?id=47631
> 
> I think this problem is getting more and more common.
> 
> I had an IBM System x3100 M4 and x3850 X5 on which kernel would get stuck
> in infinite loop creating duplicate sysfs files because, for some reason,
> there are several duplicate boot entries in nvram getting GetNextVariableName
> into a circle of iteration (with period > 2).
> 
> Granted, it's not kernel's fault for duplicate variables, but it'd be much
> harder for users to fix their broken firmware than kernel dealing with it
> by not dying on the never-ending GetNextVariableName.

Guys, could you test the following patch?

---

>From bb94d021f900a990def5891a7cb480cc6d3e5247 Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Date: Thu, 7 Mar 2013 11:59:14 +0000
Subject: [PATCH] efivars: Handle duplicate names from get_next_variable()

Some firmware exhibits a bug where the same VariableName and
VendorGuid values are returned on consecutive invocations of
GetNextVariableName(). See,

    https://bugzilla.kernel.org/show_bug.cgi?id=47631

As a consequence of such a bug, Andre reports hitting the following
WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
11/21/2012)" machine,

[    0.581554] EFI Variables Facility v0.08 2004-May-17
[    0.584914] ------------[ cut here ]------------
[    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
[    0.586381] Hardware name: To be filled by O.E.M.
[    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
[    0.588694] Modules linked in:
[    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
[    0.590280] Call Trace:
[    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
[    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
[    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
[    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
[    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
[    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
[    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
[    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
[    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
[    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
[    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
[    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
[    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
[    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
[    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
[    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
[    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
[    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
[    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.607143] ---[ end trace 1609741ab737eb29 ]---

There's not much we can do to work around and keep traversing the
variable list once we hit this firmware bug. Our only solution is to
terminate the loop because, as Lingzhu reports, some machines get
stuck when they encounter duplicate names,

  > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
  > get stuck in infinite loop creating duplicate sysfs files because,
  > for some reason, there are several duplicate boot entries in nvram
  > getting GetNextVariableName into a circle of iteration (with
  > period > 2).

Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/firmware/efivars.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bea32d1..ea415df 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -1962,8 +1962,9 @@ int register_efivars(struct efivars *efivars,
 		     struct kobject *parent_kobj)
 {
 	efi_status_t status = EFI_NOT_FOUND;
-	efi_guid_t vendor_guid;
+	efi_guid_t vendor_guid, prev_guid;
 	efi_char16_t *variable_name;
+	efi_char16_t *prev_name = NULL;
 	unsigned long variable_name_size = 1024;
 	int error = 0;
 
@@ -1973,6 +1974,13 @@ int register_efivars(struct efivars *efivars,
 		return -ENOMEM;
 	}
 
+	prev_name = kzalloc(variable_name_size, GFP_KERNEL);
+	if (!prev_name) {
+		printk(KERN_ERR "efivars: Memory allocation failed.\n");
+		error = -ENOMEM;
+		goto out;
+	}
+
 	spin_lock_init(&efivars->lock);
 	INIT_LIST_HEAD(&efivars->list);
 	efivars->ops = ops;
@@ -2005,6 +2013,21 @@ int register_efivars(struct efivars *efivars,
 						&vendor_guid);
 		switch (status) {
 		case EFI_SUCCESS:
+			/*
+			 * Some firmware implementations return the
+			 * same variable name on consecutive calls to
+			 * get_next_variable(). Terminate the loop
+			 * immediately as there is no guarantee that
+			 * we'll ever see a different variable name,
+			 * and may end up looping here forever.
+			 */
+			if (!efi_guidcmp(prev_guid, vendor_guid) &&
+			     !memcmp(prev_name, variable_name, variable_name_size)) {
+				printk(KERN_WARNING "efivars: duplicate variable name.\n");
+				status = EFI_NOT_FOUND;
+				break;
+			}
+
 			efivar_create_sysfs_entry(efivars,
 						  variable_name_size,
 						  variable_name,
@@ -2018,6 +2041,9 @@ int register_efivars(struct efivars *efivars,
 			status = EFI_NOT_FOUND;
 			break;
 		}
+
+		memcpy(prev_name, variable_name, variable_name_size);
+		memcpy(&prev_guid, &vendor_guid, sizeof(efi_guid_t));
 	} while (status != EFI_NOT_FOUND);
 
 	error = create_efivars_bin_attributes(efivars);
@@ -2037,6 +2063,7 @@ int register_efivars(struct efivars *efivars,
 	register_filesystem(&efivarfs_type);
 
 out:
+	kfree(prev_name);
 	kfree(variable_name);
 
 	return error;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]         ` <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-03-08 18:17           ` Lingzhu Xiang
       [not found]             ` <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-08 18:17 UTC (permalink / raw)
  To: Matt Fleming; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/08/2013 11:11 PM, Matt Fleming wrote:
> +			/*
> +			 * Some firmware implementations return the
> +			 * same variable name on consecutive calls to
> +			 * get_next_variable(). Terminate the loop
> +			 * immediately as there is no guarantee that
> +			 * we'll ever see a different variable name,
> +			 * and may end up looping here forever.
> +			 */
> +			if (!efi_guidcmp(prev_guid, vendor_guid) &&
> +			     !memcmp(prev_name, variable_name, variable_name_size)) {
> +				printk(KERN_WARNING "efivars: duplicate variable name.\n");
> +				status = EFI_NOT_FOUND;
> +				break;
> +			}

Comparing with the immediately previous one won't detect larger loop
like how The IBM x3100 M4 in question was looping as below.

(There is some log loss; probably too much data for serial.)
[    9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[    9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[    9.856520] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7'
[    9.860840] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a'
[    9.865735] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9'
[    9.871157] sysfs: cannot create duplicate filename '/firmware/efi/vars/DIMMStatusVariable-c020489e-6db2-4fe2-a2b1-ca05cca1336a'
[    9.876473] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0007-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   11.541964] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   11.556008] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   11.560022] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   11.564583] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7'
[   11.568977] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a'
[   11.573929] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9'
[   13.278655] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   13.292703] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   13.296716] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   13.301283] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7'
[   13.305669] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a'
[   13.310628] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9'
[   13.316108] sysfs: cannot create duplicate filename '/firmware/efi/vars/DIMMStatusVariable-c020489e-6db2-4fe2-a2b1-ca05cca1336a'
[   15.004171] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   15.017737] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   15.031713] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   15.035730] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   15.040355] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7'
[   15.044743] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a'
[   15.049700] sysfs: cannot create duplicate filename '/firmware/efi/vars/NodalSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9'
[   16.729287] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   16.742919] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   16.756550] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   16.770526] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   16.774532] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   16.779093] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7'
[   16.783542] sysfs: cannot create duplicate filename '/firmware/efi/vars/AcpiGlobalVariable-c020489e-6db2-4ef2-9aa5-ca06fc11d36a'
[   18.467922] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   18.481615] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   18.495194] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   18.509242] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   18.513252] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   18.517806] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0003-d665f20a-f2b7-485e-a321-542a196f40c7'
[   20.193306] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   20.206649] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   20.220341] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   20.233916] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   20.247957] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   20.251970] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   21.918960] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7'
[   21.931944] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   21.945286] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   21.958984] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   21.972559] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   21.986608] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
[   23.644961] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   23.657688] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7'
[   23.670724] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   23.684062] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   23.697692] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   23.711324] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBootValidate-d665f20a-f2b7-485e-a321-542a196f40c7'
[   25.370482] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0006-d665f20a-f2b7-485e-a321-542a196f40c7'
[   25.382749] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   25.395419] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7'
[   25.408343] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   25.421625] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   25.435192] sysfs: cannot create duplicate filename '/firmware/efi/vars/BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   27.087790] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0006-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   27.099845] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0006-d665f20a-f2b7-485e-a321-542a196f40c7'
[   27.112118] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   27.124788] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0004-d665f20a-f2b7-485e-a321-542a196f40c7'
[   27.137702] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   27.150990] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0001-d665f20a-f2b7-485e-a321-542a196f40c7'
[   28.805316] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0005-d665f20a-f2b7-485e-a321-542a196f40c7'
[   28.816944] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0006-8be4df61-93ca-11d2-aa0d-00e098032b8c'
[   28.829010] sysfs: cannot create duplicate filename '/firmware/efi/vars/CFGBoot0006-d665f20a-f2b7-485e-a321-542a196f40c7'
[   28.841278] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0004-8be4df61-93ca-11d2-aa0d-00e098032b8c'

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]             ` <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-12 10:08               ` Matt Fleming
       [not found]                 ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-12 10:08 UTC (permalink / raw)
  To: Lingzhu Xiang; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On Sat, 2013-03-09 at 02:17 +0800, Lingzhu Xiang wrote:
> Comparing with the immediately previous one won't detect larger loop
> like how The IBM x3100 M4 in question was looping as below.
> 
> (There is some log loss; probably too much data for serial.)
> [    9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
> [    9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'

Ah, thanks. I misunderstood the bug.

How about something like this? We're just going to have to suck it up
and iterate the entire list. This could be made more optimal, but this
isn't a fast path anyway, so I'm not sure it's worth optimising
immediately.

---

>From c8b6c9c175f4459197e2a13f29a2e4125372d771 Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Date: Thu, 7 Mar 2013 11:59:14 +0000
Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable()

Some firmware exhibits a bug where the same VariableName and
VendorGuid values are returned on multiple invocations of
GetNextVariableName(). See,

    https://bugzilla.kernel.org/show_bug.cgi?id=47631

As a consequence of such a bug, Andre reports hitting the following
WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
11/21/2012)" machine,

[    0.581554] EFI Variables Facility v0.08 2004-May-17
[    0.584914] ------------[ cut here ]------------
[    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
[    0.586381] Hardware name: To be filled by O.E.M.
[    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
[    0.588694] Modules linked in:
[    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
[    0.590280] Call Trace:
[    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
[    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
[    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
[    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
[    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
[    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
[    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
[    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
[    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
[    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
[    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
[    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
[    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
[    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
[    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
[    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
[    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
[    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
[    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.607143] ---[ end trace 1609741ab737eb29 ]---

There's not much we can do to work around and keep traversing the
variable list once we hit this firmware bug. Our only solution is to
terminate the loop because, as Lingzhu reports, some machines get
stuck when they encounter duplicate names,

  > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
  > get stuck in infinite loop creating duplicate sysfs files because,
  > for some reason, there are several duplicate boot entries in nvram
  > getting GetNextVariableName into a circle of iteration (with
  > period > 2).

Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---

v2: iterate the entire list when checking for duplicates.

 drivers/firmware/efivars.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bea32d1..d1656e3 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -2005,6 +2005,20 @@ int register_efivars(struct efivars *efivars,
 						&vendor_guid);
 		switch (status) {
 		case EFI_SUCCESS:
+			/*
+			 * Some firmware implementations return the
+			 * same variable name on multiple calls to
+			 * get_next_variable(). Terminate the loop
+			 * immediately as there is no guarantee that
+			 * we'll ever see a different variable name,
+			 * and may end up looping here forever.
+			 */
+			if (variable_is_present(variable_name, &vendor_guid)) {
+				printk(KERN_WARNING "efivars: duplicate variable name.\n");
+				status = EFI_NOT_FOUND;
+				break;
+			}
+
 			efivar_create_sysfs_entry(efivars,
 						  variable_name_size,
 						  variable_name,
-- 
1.7.11.7


-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                 ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-03-12 10:45                   ` Lingzhu Xiang
       [not found]                     ` <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2013-03-12 13:20                   ` Andre Heider
  1 sibling, 1 reply; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-12 10:45 UTC (permalink / raw)
  To: Matt Fleming; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi

(Cc'ing Seiji Aguchi for the bug related to sysfs workqueue)

On 03/12/2013 06:08 PM, Matt Fleming wrote:
> On Sat, 2013-03-09 at 02:17 +0800, Lingzhu Xiang wrote:
>> Comparing with the immediately previous one won't detect larger loop
>> like how The IBM x3100 M4 in question was looping as below.
>>
>> (There is some log loss; probably too much data for serial.)
>> [    9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
>> [    9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
> 
> Ah, thanks. I misunderstood the bug.
> 
> How about something like this? We're just going to have to suck it up
> and iterate the entire list. This could be made more optimal, but this
> isn't a fast path anyway, so I'm not sure it's worth optimising
> immediately.
> 
> ---
> 
>  From c8b6c9c175f4459197e2a13f29a2e4125372d771 Mon Sep 17 00:00:00 2001
> From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Date: Thu, 7 Mar 2013 11:59:14 +0000
> Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable()
> 
> Some firmware exhibits a bug where the same VariableName and
> VendorGuid values are returned on multiple invocations of
> GetNextVariableName(). See,
> 
>      https://bugzilla.kernel.org/show_bug.cgi?id=47631
> 
> As a consequence of such a bug, Andre reports hitting the following
> WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
> Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
> 11/21/2012)" machine,
> 
> [    0.581554] EFI Variables Facility v0.08 2004-May-17
> [    0.584914] ------------[ cut here ]------------
> [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
> [    0.586381] Hardware name: To be filled by O.E.M.
> [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
> [    0.588694] Modules linked in:
> [    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
> [    0.590280] Call Trace:
> [    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
> [    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
> [    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
> [    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
> [    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
> [    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
> [    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
> [    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
> [    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
> [    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
> [    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
> [    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
> [    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
> [    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
> [    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
> [    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
> [    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
> [    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
> [    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
> [    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
> [    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
> [    0.607143] ---[ end trace 1609741ab737eb29 ]---
> 
> There's not much we can do to work around and keep traversing the
> variable list once we hit this firmware bug. Our only solution is to
> terminate the loop because, as Lingzhu reports, some machines get
> stuck when they encounter duplicate names,
> 
>    > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
>    > get stuck in infinite loop creating duplicate sysfs files because,
>    > for some reason, there are several duplicate boot entries in nvram
>    > getting GetNextVariableName into a circle of iteration (with
>    > period > 2).
> 
> Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
> 
> v2: iterate the entire list when checking for duplicates.
> 
>   drivers/firmware/efivars.c | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
> index bea32d1..d1656e3 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -2005,6 +2005,20 @@ int register_efivars(struct efivars *efivars,
>   						&vendor_guid);
>   		switch (status) {
>   		case EFI_SUCCESS:
> +			/*
> +			 * Some firmware implementations return the
> +			 * same variable name on multiple calls to
> +			 * get_next_variable(). Terminate the loop
> +			 * immediately as there is no guarantee that
> +			 * we'll ever see a different variable name,
> +			 * and may end up looping here forever.
> +			 */
> +			if (variable_is_present(variable_name, &vendor_guid)) {
> +				printk(KERN_WARNING "efivars: duplicate variable name.\n");
> +				status = EFI_NOT_FOUND;
> +				break;
> +			}
> +
>   			efivar_create_sysfs_entry(efivars,
>   						  variable_name_size,
>   						  variable_name,
> 

I think using what we have is good enough for this. I'll verify
boot time tomorrow.

status = EFI_NOT_FOUND here doesn't seem to do anything.

By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue
patch may also get stuck in loop in the same way. 


Lingzhu Xiang

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                 ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2013-03-12 10:45                   ` Lingzhu Xiang
@ 2013-03-12 13:20                   ` Andre Heider
  1 sibling, 0 replies; 38+ messages in thread
From: Andre Heider @ 2013-03-12 13:20 UTC (permalink / raw)
  To: Matt Fleming; +Cc: Lingzhu Xiang, linux-efi-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 12, 2013 at 11:08 AM, Matt Fleming <matt-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org> wrote:

> From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Date: Thu, 7 Mar 2013 11:59:14 +0000
> Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable()
>
> Some firmware exhibits a bug where the same VariableName and
> VendorGuid values are returned on multiple invocations of
> GetNextVariableName(). See,
>
>     https://bugzilla.kernel.org/show_bug.cgi?id=47631
>
> As a consequence of such a bug, Andre reports hitting the following
> WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
> Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
> 11/21/2012)" machine,

Thanks for working on this, Matt.

Unfortunately I cannot test this patch anymore... This BIOS seems
beyond buggy, and today I didn't get anything on screen, it didn't
boot anything nor could I even enter it.
The only thing to make this box usable again was to hit the CMOS reset
button, and since then I do not get the WARN_ON() anymore. Seems like
the duplicate entry was an artifact of a BIOS upgrade with old CMOS
settings.

Thanks,
Andre

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                     ` <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-12 16:35                       ` Matt Fleming
       [not found]                         ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-12 16:35 UTC (permalink / raw)
  To: Lingzhu Xiang
  Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi

On Tue, 2013-03-12 at 18:45 +0800, Lingzhu Xiang wrote:
> (Cc'ing Seiji Aguchi for the bug related to sysfs workqueue)
> 
> On 03/12/2013 06:08 PM, Matt Fleming wrote:
> > On Sat, 2013-03-09 at 02:17 +0800, Lingzhu Xiang wrote:
> >> Comparing with the immediately previous one won't detect larger loop
> >> like how The IBM x3100 M4 in question was looping as below.
> >>
> >> (There is some log loss; probably too much data for serial.)
> >> [    9.783119] sysfs: cannot create duplicate filename '/firmware/efi/vars/MemoryTypeInformation-4c19049f-4137-4dd3-9c10-8b97a83ffdfa'
> >> [    9.852017] sysfs: cannot create duplicate filename '/firmware/efi/vars/Boot0003-8be4df61-93ca-11d2-aa0d-00e098032b8c'
> > 
> > Ah, thanks. I misunderstood the bug.
> > 
> > How about something like this? We're just going to have to suck it up
> > and iterate the entire list. This could be made more optimal, but this
> > isn't a fast path anyway, so I'm not sure it's worth optimising
> > immediately.
> > 
> > ---
> > 
> >  From c8b6c9c175f4459197e2a13f29a2e4125372d771 Mon Sep 17 00:00:00 2001
> > From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Date: Thu, 7 Mar 2013 11:59:14 +0000
> > Subject: [PATCH v2] efivars: Handle duplicate names from get_next_variable()
> > 
> > Some firmware exhibits a bug where the same VariableName and
> > VendorGuid values are returned on multiple invocations of
> > GetNextVariableName(). See,
> > 
> >      https://bugzilla.kernel.org/show_bug.cgi?id=47631
> > 
> > As a consequence of such a bug, Andre reports hitting the following
> > WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
> > Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
> > 11/21/2012)" machine,
> > 
> > [    0.581554] EFI Variables Facility v0.08 2004-May-17
> > [    0.584914] ------------[ cut here ]------------
> > [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
> > [    0.586381] Hardware name: To be filled by O.E.M.
> > [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
> > [    0.588694] Modules linked in:
> > [    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
> > [    0.590280] Call Trace:
> > [    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
> > [    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
> > [    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
> > [    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
> > [    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
> > [    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
> > [    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
> > [    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
> > [    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
> > [    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
> > [    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
> > [    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
> > [    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
> > [    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
> > [    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
> > [    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
> > [    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
> > [    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
> > [    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
> > [    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
> > [    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
> > [    0.607143] ---[ end trace 1609741ab737eb29 ]---
> > 
> > There's not much we can do to work around and keep traversing the
> > variable list once we hit this firmware bug. Our only solution is to
> > terminate the loop because, as Lingzhu reports, some machines get
> > stuck when they encounter duplicate names,
> > 
> >    > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
> >    > get stuck in infinite loop creating duplicate sysfs files because,
> >    > for some reason, there are several duplicate boot entries in nvram
> >    > getting GetNextVariableName into a circle of iteration (with
> >    > period > 2).
> > 
> > Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> > Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > ---
> > 
> > v2: iterate the entire list when checking for duplicates.
> > 
> >   drivers/firmware/efivars.c | 14 ++++++++++++++
> >   1 file changed, 14 insertions(+)
> > 
> > diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
> > index bea32d1..d1656e3 100644
> > --- a/drivers/firmware/efivars.c
> > +++ b/drivers/firmware/efivars.c
> > @@ -2005,6 +2005,20 @@ int register_efivars(struct efivars *efivars,
> >   						&vendor_guid);
> >   		switch (status) {
> >   		case EFI_SUCCESS:
> > +			/*
> > +			 * Some firmware implementations return the
> > +			 * same variable name on multiple calls to
> > +			 * get_next_variable(). Terminate the loop
> > +			 * immediately as there is no guarantee that
> > +			 * we'll ever see a different variable name,
> > +			 * and may end up looping here forever.
> > +			 */
> > +			if (variable_is_present(variable_name, &vendor_guid)) {
> > +				printk(KERN_WARNING "efivars: duplicate variable name.\n");
> > +				status = EFI_NOT_FOUND;
> > +				break;
> > +			}
> > +
> >   			efivar_create_sysfs_entry(efivars,
> >   						  variable_name_size,
> >   						  variable_name,
> > 
> 
> I think using what we have is good enough for this. I'll verify
> boot time tomorrow.

That would be great, thank you.

> status = EFI_NOT_FOUND here doesn't seem to do anything.

It terminates the enclosing do-while loop.

> By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue
> patch may also get stuck in loop in the same way. 

You mean if GetNextVariableName() never terminates? Yeah,
efivar_update_sysfs_entries() looks like it needs protecting from that
too, good catch.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: sysfs: cannot create duplicate filename
       [not found]                         ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-03-12 17:51                           ` Seiji Aguchi
  2013-03-13 10:47                           ` Lingzhu Xiang
  1 sibling, 0 replies; 38+ messages in thread
From: Seiji Aguchi @ 2013-03-12 17:51 UTC (permalink / raw)
  To: Matt Fleming, Lingzhu Xiang
  Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

> > By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue
> > patch may also get stuck in loop in the same way.

In my understanding, a problem reported initially is that WARN_ON() in create_sysfs_entries() hits because
multiple sysfs entries of a same variable name are created.

On the other hand, in efivar_update_sysfs_entries(), the WARN_ON never hits because
If a variable is already present, create_sysfs_entries() is never called.

> 
> You mean if GetNextVariableName() never terminates? Yeah,
> efivar_update_sysfs_entries() looks like it needs protecting from that too, good catch.

If there is a case where GetNextVariableName() never terminates, I agree that efivar_update_sysfs_entries()
should be protected from that .
But I'm not sure if the case is related to a bug in this thread..

Seiji

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                         ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2013-03-12 17:51                           ` Seiji Aguchi
@ 2013-03-13 10:47                           ` Lingzhu Xiang
       [not found]                             ` <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-13 10:47 UTC (permalink / raw)
  To: Matt Fleming; +Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi

On 03/13/2013 12:35 AM, Matt Fleming wrote:
>> I think using what we have is good enough for this. I'll verify
>> boot time tomorrow.
>
> That would be great, thank you.

On a Windows 8 logo desktop with 91 variables, the average time (30 runs 
each) of variable iteration goes from 0.009672 second to 0.010180 second 
(+0.000508 second, or +5%). Negligible.

>> status = EFI_NOT_FOUND here doesn't seem to do anything.
>
> It terminates the enclosing do-while loop.

Oops, my brain fart.

>> By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue
>> patch may also get stuck in loop in the same way.
>
> You mean if GetNextVariableName() never terminates? Yeah,
> efivar_update_sysfs_entries() looks like it needs protecting from that
> too, good catch.

Yes that's what I meant.


Lingzhu Xiang

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                             ` <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-14 16:33                               ` Matt Fleming
       [not found]                                 ` <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-14 16:33 UTC (permalink / raw)
  To: Lingzhu Xiang
  Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA, Seiji Aguchi

On Wed, 2013-03-13 at 18:47 +0800, Lingzhu Xiang wrote:
> On 03/13/2013 12:35 AM, Matt Fleming wrote:
> >> I think using what we have is good enough for this. I'll verify
> >> boot time tomorrow.
> >
> > That would be great, thank you.
> 
> On a Windows 8 logo desktop with 91 variables, the average time (30 runs 
> each) of variable iteration goes from 0.009672 second to 0.010180 second 
> (+0.000508 second, or +5%). Negligible.

Thanks for testing that.

[...]

> >> By the way, efivar_update_sysfs_entries from Seiji's sysfs workqueue
> >> patch may also get stuck in loop in the same way.
> >
> > You mean if GetNextVariableName() never terminates? Yeah,
> > efivar_update_sysfs_entries() looks like it needs protecting from that
> > too, good catch.
> 
> Yes that's what I meant.

Hmm.. actually, I'm not at all sure how to fix that case, since the
algorithm expects to find duplicates.

Seiji, any thoughts here? For a recap of the discussion see,

	http://article.gmane.org/gmane.linux.kernel.efi/835

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: sysfs: cannot create duplicate filename
       [not found]                                 ` <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-03-14 19:31                                   ` Seiji Aguchi
       [not found]                                     ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Seiji Aguchi @ 2013-03-14 19:31 UTC (permalink / raw)
  To: Matt Fleming, Lingzhu Xiang
  Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

> Hmm.. actually, I'm not at all sure how to fix that case, since the algorithm expects to find duplicates.
> 
> Seiji, any thoughts here? For a recap of the discussion see,
> 
> 	http://article.gmane.org/gmane.linux.kernel.efi/835

Thank you for providing this.

If GetNextVariableName() never terminates, the firmware is probably broken.
In this case, efivars, including efi_pstore, can't guarantee to work reliably...

>Granted, it's not kernel's fault for duplicate variables, but it'd be much harder for users 
>to fix their broken firmware than kernel dealing with it by not dying on the never-ending 
>GetNextVariableName.

To fix the broken firmware, users need to erase the duplicated variable.
Is it correct?
If yes, how about printing the duplicated variable name as follows?
Or add printk() persuade to upgrading/downgrading firmware?

+ printk(KERN_WARNING "efivars: duplicate variable name. %s\n", variable_name);

As for update_sysfs_entries() , a candidate fix is introducing a global variable like is_firmware_broken.
And if the variable is set, update_sysfs_entries() returns without calling GetNextVariableName().
As I mentioned above, there is no way to make efivars, including efi_pstore, work reliably on the broken firmware..

I admit that introducing a global variable is messy. But I can't find any other way to fix..

If I still misunderstand something, please let me know.

Seiji




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                                     ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
@ 2013-03-18 14:51                                       ` Matt Fleming
       [not found]                                         ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-18 14:51 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On Thu, 2013-03-14 at 19:31 +0000, Seiji Aguchi wrote:
> To fix the broken firmware, users need to erase the duplicated variable.
> Is it correct?

I'm unsure whether that will fix the problem for all platforms. Lingzhu,
do you have any idea?

> If yes, how about printing the duplicated variable name as follows?
> Or add printk() persuade to upgrading/downgrading firmware?
> 
> + printk(KERN_WARNING "efivars: duplicate variable name. %s\n", variable_name);

Yes, I think there's merit to printing that. Unfortunately, it's a utf16
string, and printk() doesn't know how to print it.

> As for update_sysfs_entries() , a candidate fix is introducing a
> global variable like is_firmware_broken.
> And if the variable is set, update_sysfs_entries() returns without
> calling GetNextVariableName().
> As I mentioned above, there is no way to make efivars, including
> efi_pstore, work reliably on the broken firmware..
> 
> I admit that introducing a global variable is messy. But I can't find
> any other way to fix..

Right, looks like this is the only way to go.

How does everyone feel about this version? Lingzhu, if I could get a
Tested-by for this version that would be great. This, plus some other
patches, are sitting on the 'urgent' branch.

---

>From 544a6c06479e4ea4e7acc7eea589a82f0ff9aa6d Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Date: Thu, 7 Mar 2013 11:59:14 +0000
Subject: [PATCH] efivars: Handle duplicate names from get_next_variable()

Some firmware exhibits a bug where the same VariableName and
VendorGuid values are returned on multiple invocations of
GetNextVariableName(). See,

    https://bugzilla.kernel.org/show_bug.cgi?id=47631

As a consequence of such a bug, Andre reports hitting the following
WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
11/21/2012)" machine,

[    0.581554] EFI Variables Facility v0.08 2004-May-17
[    0.584914] ------------[ cut here ]------------
[    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
[    0.586381] Hardware name: To be filled by O.E.M.
[    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
[    0.588694] Modules linked in:
[    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
[    0.590280] Call Trace:
[    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
[    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
[    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
[    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
[    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
[    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
[    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
[    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
[    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
[    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
[    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
[    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
[    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
[    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
[    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
[    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
[    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
[    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
[    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.607143] ---[ end trace 1609741ab737eb29 ]---

There's not much we can do to work around and keep traversing the
variable list once we hit this firmware bug. Our only solution is to
terminate the loop because, as Lingzhu reports, some machines get
stuck when they encounter duplicate names,

  > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
  > get stuck in infinite loop creating duplicate sysfs files because,
  > for some reason, there are several duplicate boot entries in nvram
  > getting GetNextVariableName into a circle of iteration (with
  > period > 2).

Also disable the workqueue, as efivar_update_sysfs_entries() uses
GetNextVariableName() to figure out which variables have been created
since the last iteration. That algorithm isn't going to work if
GetNextVariableName() returns duplicates. Note that we don't disable
EFI variable creation completely on the affected machines, it's just
that any pstore dump-* files won't appear in sysfs until the next
boot.

Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Seiji Aguchi <seiji.aguchi-7rDLJAbr9SE@public.gmane.org>
Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/firmware/efivars.c | 48 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index ae26d5e..a1a3d01 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -170,6 +170,7 @@ efivar_create_sysfs_entry(struct efivars *efivars,
 
 static void efivar_update_sysfs_entries(struct work_struct *);
 static DECLARE_WORK(efivar_work, efivar_update_sysfs_entries);
+static bool efivar_wq_enabled = true;
 
 /* Return the number of unicode characters in data */
 static unsigned long
@@ -1443,7 +1444,7 @@ static int efi_pstore_write(enum pstore_type_id type,
 
 	spin_unlock_irqrestore(&efivars->lock, flags);
 
-	if (reason == KMSG_DUMP_OOPS)
+	if (reason == KMSG_DUMP_OOPS && efivar_wq_enabled)
 		schedule_work(&efivar_work);
 
 	*id = part;
@@ -1974,6 +1975,35 @@ void unregister_efivars(struct efivars *efivars)
 }
 EXPORT_SYMBOL_GPL(unregister_efivars);
 
+/*
+ * Print a warning when duplicate EFI variables are encountered and
+ * disable the sysfs workqueue since the firmware is buggy.
+ */
+static void dup_variable_bug(efi_char16_t *s16, efi_guid_t *vendor_guid,
+			     unsigned long len16)
+{
+	size_t i, len8 = len16 / sizeof(efi_char16_t);
+	char *s8;
+
+	/*
+	 * Disable the workqueue since the algorithm it uses for
+	 * detecting new variables won't work with this buggy
+	 * implementation of GetNextVariableName().
+	 */
+	efivar_wq_enabled = false;
+
+	s8 = kzalloc(len8, GFP_KERNEL);
+	if (!s8)
+		return;
+
+	for (i = 0; i < len8; i++)
+		s8[i] = s16[i];
+
+	printk(KERN_WARNING "efivars: duplicate variable: %s-%pUl\n",
+	       s8, vendor_guid);
+	kfree(s8);
+}
+
 int register_efivars(struct efivars *efivars,
 		     const struct efivar_operations *ops,
 		     struct kobject *parent_kobj)
@@ -2024,6 +2054,22 @@ int register_efivars(struct efivars *efivars,
 		case EFI_SUCCESS:
 			variable_name_size = var_name_strnsize(variable_name,
 							       variable_name_size);
+
+			/*
+			 * Some firmware implementations return the
+			 * same variable name on multiple calls to
+			 * get_next_variable(). Terminate the loop
+			 * immediately as there is no guarantee that
+			 * we'll ever see a different variable name,
+			 * and may end up looping here forever.
+			 */
+			if (variable_is_present(variable_name, &vendor_guid)) {
+				dup_variable_bug(variable_name, &vendor_guid,
+						 variable_name_size);
+				status = EFI_NOT_FOUND;
+				break;
+			}
+
 			efivar_create_sysfs_entry(efivars,
 						  variable_name_size,
 						  variable_name,
-- 
1.7.11.7



-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                                         ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-03-19 10:14                                           ` Lingzhu Xiang
       [not found]                                             ` <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2013-03-21  7:22                                           ` Lingzhu Xiang
  1 sibling, 1 reply; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-19 10:14 UTC (permalink / raw)
  To: Matt Fleming; +Cc: Seiji Aguchi, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/18/2013 10:51 PM, Matt Fleming wrote:
> On Thu, 2013-03-14 at 19:31 +0000, Seiji Aguchi wrote:
>> To fix the broken firmware, users need to erase the duplicated variable.
>> Is it correct?
>
> I'm unsure whether that will fix the problem for all platforms. Lingzhu,
> do you have any idea?

I'm not very sure what is to "erase the duplicated variable". If the 
firmware is broken, the variable interfaces probably are also broken and 
can't be used to erase variable. Users need to reset nvram to a known 
clean state, but often the hardware doesn't provide such facility.

> How does everyone feel about this version? Lingzhu, if I could get a
> Tested-by for this version that would be great. This, plus some other
> patches, are sitting on the 'urgent' branch.

I will test this. The IBM x3100m4 is gone. It takes some time to figure 
out how to corrupt nvram in QEMU...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: sysfs: cannot create duplicate filename
       [not found]                                             ` <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-19 14:25                                               ` Seiji Aguchi
       [not found]                                                 ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Seiji Aguchi @ 2013-03-19 14:25 UTC (permalink / raw)
  To: Lingzhu Xiang, Matt Fleming
  Cc: Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

Lingzhu,

I have another idea after reading you comment.
 
> If the firmware is broken, the variable interfaces probably are also broken
> and can't be used to erase variable. 

I see, you are concerned that a set_variable service may not work correctly.
If so,  the set_variable service should be disabled in addition to get_next_variable service.

On the other hands, if Linux can boot up, a get_variable service probably works.
Therefore, efivars should be initialized by enabling just get_variable service.

> Users need to reset nvram to a known clean state, but often the hardware doesn't provide such
> facility.

What we can do is to notify that  the firmware/hardware is broken with printk.

Seiji 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                                                 ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
@ 2013-03-19 15:26                                                   ` Matt Fleming
       [not found]                                                     ` <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
  2013-03-19 16:00                                                   ` Lingzhu Xiang
  1 sibling, 1 reply; 38+ messages in thread
From: Matt Fleming @ 2013-03-19 15:26 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/19/2013 02:25 PM, Seiji Aguchi wrote:
> I see, you are concerned that a set_variable service may not work correctly.
> If so,  the set_variable service should be disabled in addition to get_next_variable service.

We've no reason to believe that there are broken SetVariable()
implementations in the wild. If any ever crop up, we can deal with them
as and when.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: sysfs: cannot create duplicate filename
       [not found]                                                     ` <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
@ 2013-03-19 15:46                                                       ` Seiji Aguchi
  0 siblings, 0 replies; 38+ messages in thread
From: Seiji Aguchi @ 2013-03-19 15:46 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Lingzhu Xiang, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA


> We've no reason to believe that there are broken SetVariable() implementations in the wild. If any ever crop up, we can deal with
> them as and when.

OK. I will make a patch disabling just get_next_variable().
I'm thinking that it is reasonable to set ops->get_next_variable to NULL in register_efivars(),
 instead of introducing a new global variable....

Seiji


> -----Original Message-----
> From: Matt Fleming [mailto:matt@console-pimps.org]
> Sent: Tuesday, March 19, 2013 11:27 AM
> To: Seiji Aguchi
> Cc: Lingzhu Xiang; Andre Heider; linux-efi@vger.kernel.org
> Subject: Re: sysfs: cannot create duplicate filename
> 
> On 03/19/2013 02:25 PM, Seiji Aguchi wrote:
> > I see, you are concerned that a set_variable service may not work correctly.
> > If so,  the set_variable service should be disabled in addition to get_next_variable service.
> 
> We've no reason to believe that there are broken SetVariable() implementations in the wild. If any ever crop up, we can deal with
> them as and when.
> 
> --
> Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                                                 ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
  2013-03-19 15:26                                                   ` Matt Fleming
@ 2013-03-19 16:00                                                   ` Lingzhu Xiang
  1 sibling, 0 replies; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-19 16:00 UTC (permalink / raw)
  To: Seiji Aguchi; +Cc: Matt Fleming, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/19/2013 10:25 PM, Seiji Aguchi wrote:
>> If the firmware is broken, the variable interfaces probably are also broken
>> and can't be used to erase variable. 
> 
> I see, you are concerned that a set_variable service may not work correctly.
> If so,  the set_variable service should be disabled in addition to get_next_variable service.

Since you're suggesting fixing firmware by erasing, I'm wondering
SetVariable may not actually remove duplicate variables since the way
they exist is already an undefined behavior. And some variables can't
be removed.

GetNextVariable is clearly causing harm in this case. But the rest of efi
runtime services should be left as they are.


Lingzhu Xiang

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                                         ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2013-03-19 10:14                                           ` Lingzhu Xiang
@ 2013-03-21  7:22                                           ` Lingzhu Xiang
       [not found]                                             ` <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Lingzhu Xiang @ 2013-03-21  7:22 UTC (permalink / raw)
  To: Matt Fleming; +Cc: Seiji Aguchi, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/18/2013 10:51 PM, Matt Fleming wrote:
> How does everyone feel about this version? Lingzhu, if I could get a
> Tested-by for this version that would be great. This, plus some other
> patches, are sitting on the 'urgent' branch.
> 
> ---
> 
>  From 544a6c06479e4ea4e7acc7eea589a82f0ff9aa6d Mon Sep 17 00:00:00 2001
> From: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Date: Thu, 7 Mar 2013 11:59:14 +0000
> Subject: [PATCH] efivars: Handle duplicate names from get_next_variable()
> 
> Some firmware exhibits a bug where the same VariableName and
> VendorGuid values are returned on multiple invocations of
> GetNextVariableName(). See,
> 
>      https://bugzilla.kernel.org/show_bug.cgi?id=47631
> 
> As a consequence of such a bug, Andre reports hitting the following
> WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
> Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
> 11/21/2012)" machine,
> 
> [    0.581554] EFI Variables Facility v0.08 2004-May-17
> [    0.584914] ------------[ cut here ]------------
> [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
> [    0.586381] Hardware name: To be filled by O.E.M.
> [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
> [    0.588694] Modules linked in:
> [    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
> [    0.590280] Call Trace:
> [    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
> [    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
> [    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
> [    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
> [    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
> [    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
> [    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
> [    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
> [    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
> [    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
> [    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
> [    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
> [    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
> [    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
> [    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
> [    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
> [    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
> [    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
> [    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
> [    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
> [    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
> [    0.607143] ---[ end trace 1609741ab737eb29 ]---
> 
> There's not much we can do to work around and keep traversing the
> variable list once we hit this firmware bug. Our only solution is to
> terminate the loop because, as Lingzhu reports, some machines get
> stuck when they encounter duplicate names,
> 
>    > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
>    > get stuck in infinite loop creating duplicate sysfs files because,
>    > for some reason, there are several duplicate boot entries in nvram
>    > getting GetNextVariableName into a circle of iteration (with
>    > period > 2).
> 
> Also disable the workqueue, as efivar_update_sysfs_entries() uses
> GetNextVariableName() to figure out which variables have been created
> since the last iteration. That algorithm isn't going to work if
> GetNextVariableName() returns duplicates. Note that we don't disable
> EFI variable creation completely on the affected machines, it's just
> that any pstore dump-* files won't appear in sysfs until the next
> boot.
> 
> Reported-by: Andre Heider <a.heider-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reported-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Seiji Aguchi <seiji.aguchi-7rDLJAbr9SE@public.gmane.org>
> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> Signed-off-by: Matt Fleming <matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Tested-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Reproduced with 3.9-rc3 in QEMU/OVMF. The kernel gets stuck in a loop.

Verified with 3.9-rc3 with this patch. The kernel survives. efivarfs
continues working, though some variables won't show up, as expected.

[    2.033855] efivars: duplicate variable: Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c

This patch has no real impact for users with normal firmware as tested
before on a Windows 8 logo desktop.

For the record, here is how I create duplicate variables in QEMU/OVMF:

OVMF EmuVariableFvbRuntimeDxe emulates nvram with a reserved area of
memory (ReserveEmuVariableNvStore). But there seems to be some layers
of cache above this, so directly writing a new duplicate variable here
won't make a difference. I ended up writing an EFI app to scan the
entire memory and replace all L"Boot0004" with L"Boot0001", which
seemed to actually break things (e.g. dmpstore will get stuck).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
       [not found]                                             ` <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-21  7:44                                               ` Matt Fleming
  0 siblings, 0 replies; 38+ messages in thread
From: Matt Fleming @ 2013-03-21  7:44 UTC (permalink / raw)
  To: Lingzhu Xiang
  Cc: Seiji Aguchi, Andre Heider, linux-efi-u79uwXL29TY76Z2rM5mHXA

On 03/21/2013 07:22 AM, Lingzhu Xiang wrote:
> Tested-by: Lingzhu Xiang <lxiang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Great, thanks!

> Reproduced with 3.9-rc3 in QEMU/OVMF. The kernel gets stuck in a loop.
> 
> Verified with 3.9-rc3 with this patch. The kernel survives. efivarfs
> continues working, though some variables won't show up, as expected.
> 
> [    2.033855] efivars: duplicate variable: Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c
> 
> This patch has no real impact for users with normal firmware as tested
> before on a Windows 8 logo desktop.
> 
> For the record, here is how I create duplicate variables in QEMU/OVMF:
> 
> OVMF EmuVariableFvbRuntimeDxe emulates nvram with a reserved area of
> memory (ReserveEmuVariableNvStore). But there seems to be some layers
> of cache above this, so directly writing a new duplicate variable here
> won't make a difference. I ended up writing an EFI app to scan the
> entire memory and replace all L"Boot0004" with L"Boot0001", which
> seemed to actually break things (e.g. dmpstore will get stuck).

Thanks for going to such lengths to test this.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2022-02-17 14:29     ` Marciniszyn, Mike
@ 2022-02-17 15:35       ` Robin Peiremans
  0 siblings, 0 replies; 38+ messages in thread
From: Robin Peiremans @ 2022-02-17 15:35 UTC (permalink / raw)
  To: Marciniszyn, Mike; +Cc: Dalessandro, Dennis, Jason Gunthorpe, linux-rdma

Tested on 5.17 and my ports are back. I can confirm the patch works.

Thanks a lot for the quick fix!

Robin

On Thu, Feb 17, 2022 at 3:29 PM Marciniszyn, Mike
<mike.marciniszyn@cornelisnetworks.com> wrote:
>
> > >
> > > Don't know, Mike tested this patch on qib, maybe he knows
> >
> > A patch to fix the issue should be posted soon.
> >
> > -Denny
>
> See https://lore.kernel.org/linux-rdma/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u.
>
> Mike

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: sysfs: cannot create duplicate filename
  2022-02-17 13:37   ` Dennis Dalessandro
@ 2022-02-17 14:29     ` Marciniszyn, Mike
  2022-02-17 15:35       ` Robin Peiremans
  0 siblings, 1 reply; 38+ messages in thread
From: Marciniszyn, Mike @ 2022-02-17 14:29 UTC (permalink / raw)
  To: Dalessandro, Dennis, Jason Gunthorpe, Robin Peiremans; +Cc: linux-rdma

> >
> > Don't know, Mike tested this patch on qib, maybe he knows
> 
> A patch to fix the issue should be posted soon.
> 
> -Denny

See https://lore.kernel.org/linux-rdma/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u.

Mike

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2022-02-14 18:08 ` Jason Gunthorpe
@ 2022-02-17 13:37   ` Dennis Dalessandro
  2022-02-17 14:29     ` Marciniszyn, Mike
  0 siblings, 1 reply; 38+ messages in thread
From: Dennis Dalessandro @ 2022-02-17 13:37 UTC (permalink / raw)
  To: Jason Gunthorpe, Robin Peiremans, Mike Marciniszyn; +Cc: linux-rdma

On 2/14/22 1:08 PM, Jason Gunthorpe wrote:
> On Sun, Feb 13, 2022 at 10:21:49AM +0100, Robin Peiremans wrote:
>> Hi
>>
>> I ran into this error when upgrading to kernel 5.14 (it works
>> pre-5.14). The driver gets loaded (verified via lsmod), but the ib
>> utils don't show any ports.
>>
>> There's a bugreport on elrepo
>> (https://elrepo.org/bugs/view.php?id=1176) that looks pretty much
>> identical, but it looks stalled.
>> I've bisected the kernel and commit
>> 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit.
>> Since I'm absolutely no dev, I'm hoping someone here can figure out
>> what exactly is going wrong. Latest mainline kernel still has the same
>> behavior.
> 
> Don't know, Mike tested this patch on qib, maybe he knows

A patch to fix the issue should be posted soon.

-Denny

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2022-02-13  9:21 Robin Peiremans
@ 2022-02-14 18:08 ` Jason Gunthorpe
  2022-02-17 13:37   ` Dennis Dalessandro
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2022-02-14 18:08 UTC (permalink / raw)
  To: Robin Peiremans, Mike Marciniszyn; +Cc: linux-rdma

On Sun, Feb 13, 2022 at 10:21:49AM +0100, Robin Peiremans wrote:
> Hi
> 
> I ran into this error when upgrading to kernel 5.14 (it works
> pre-5.14). The driver gets loaded (verified via lsmod), but the ib
> utils don't show any ports.
> 
> There's a bugreport on elrepo
> (https://elrepo.org/bugs/view.php?id=1176) that looks pretty much
> identical, but it looks stalled.
> I've bisected the kernel and commit
> 4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit.
> Since I'm absolutely no dev, I'm hoping someone here can figure out
> what exactly is going wrong. Latest mainline kernel still has the same
> behavior.

Don't know, Mike tested this patch on qib, maybe he knows

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* sysfs: cannot create duplicate filename
@ 2022-02-13  9:21 Robin Peiremans
  2022-02-14 18:08 ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Peiremans @ 2022-02-13  9:21 UTC (permalink / raw)
  To: linux-rdma

Hi

I ran into this error when upgrading to kernel 5.14 (it works
pre-5.14). The driver gets loaded (verified via lsmod), but the ib
utils don't show any ports.

There's a bugreport on elrepo
(https://elrepo.org/bugs/view.php?id=1176) that looks pretty much
identical, but it looks stalled.
I've bisected the kernel and commit
4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit.
Since I'm absolutely no dev, I'm hoping someone here can figure out
what exactly is going wrong. Latest mainline kernel still has the same
behavior.

Relevant output and logs below.

$ sudo lspci -s 04:00.0 -vv
04:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
Subsystem: QLogic Corp. IBA7322 QDR InfiniBand HCA
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 39
IOMMU group: 22
Region 0: Memory at fbc00000 (64-bit, non-prefetchable) [size=4M]
Expansion ROM at fc000000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <4us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s (ok), Width x4 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-
EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [b0] MSI-X: Enable- Count=32 Masked-
Vector table: BAR=0 offset=00008000
PBA: BAR=0 offset=00009000
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Kernel modules: ib_qib

Relevant /var/log/messages section:
Feb 13 09:43:29 localhost kernel: ib_qib 0000:04:00.0: qib0: Reserving
QPNs from 0x656b78 to 0x656b78 for non-verbs use
Feb 13 09:43:29 localhost kernel: sysfs: cannot create duplicate
filename '/devices/pci0000:00/0000:00:01.2/0000:01:00.0/0000:02:02.0/0000:04:00.0/infiniband/qib0/ports/1/linkcontrol'
Feb 13 09:43:29 localhost kernel: CPU: 15 PID: 1118 Comm: modprobe
Tainted: G           OE     5.17.0-rc3-robin+ #20
Feb 13 09:43:29 localhost kernel: Hardware name: ASUS System Product
Name/TUF GAMING X570-PRO (WI-FI), BIOS 3001 12/04/2020
Feb 13 09:43:29 localhost kernel: Call Trace:
Feb 13 09:43:29 localhost kernel: <TASK>
Feb 13 09:43:29 localhost kernel: dump_stack_lvl+0x48/0x5e
Feb 13 09:43:29 localhost kernel: sysfs_warn_dup.cold+0x17/0x24
Feb 13 09:43:29 localhost kernel: internal_create_group+0x348/0x350
Feb 13 09:43:29 localhost kernel: internal_create_groups.part.0+0x3d/0xa0
Feb 13 09:43:29 localhost kernel: setup_port+0x370/0x680 [ib_core]
Feb 13 09:43:29 localhost kernel: ? kobject_add+0x6e/0x90
Feb 13 09:43:29 localhost kernel: ib_setup_port_attrs+0x88/0x220 [ib_core]
Feb 13 09:43:29 localhost kernel: ib_register_device+0x576/0x650 [ib_core]
Feb 13 09:43:29 localhost kernel: ? __vmalloc_node+0x4a/0x70
Feb 13 09:43:29 localhost kernel: rvt_register_device+0x10c/0x270 [rdmavt]
Feb 13 09:43:29 localhost kernel: qib_register_ib_device+0x5f8/0x7a0 [ib_qib]
Feb 13 09:43:29 localhost kernel: qib_init_one+0x17f/0x470 [ib_qib]
Feb 13 09:43:29 localhost kernel: local_pci_probe+0x45/0x80
Feb 13 09:43:29 localhost kernel: ? pci_match_device+0xd7/0x130
Feb 13 09:43:29 localhost kernel: pci_device_probe+0xaa/0x1a0
Feb 13 09:43:29 localhost kernel: really_probe+0x1f5/0x3d0
Feb 13 09:43:29 localhost kernel: __driver_probe_device+0xfe/0x180
Feb 13 09:43:29 localhost kernel: driver_probe_device+0x1e/0x90
Feb 13 09:43:29 localhost kernel: __driver_attach+0xc0/0x1c0
Feb 13 09:43:29 localhost kernel: ? __device_attach_driver+0xe0/0xe0
Feb 13 09:43:29 localhost kernel: ? __device_attach_driver+0xe0/0xe0
Feb 13 09:43:29 localhost kernel: bus_for_each_dev+0x64/0x90
Feb 13 09:43:29 localhost kernel: bus_add_driver+0x149/0x1e0
Feb 13 09:43:29 localhost kernel: driver_register+0x8f/0xe0
Feb 13 09:43:29 localhost kernel: ? qib_init_qibfs+0x11/0x11 [ib_qib]
Feb 13 09:43:29 localhost kernel: qib_ib_init+0x3e/0xf62 [ib_qib]
Feb 13 09:43:29 localhost kernel: do_one_initcall+0x44/0x200
Feb 13 09:43:29 localhost kernel: ? kmem_cache_alloc_trace+0x163/0x2c0
Feb 13 09:43:29 localhost kernel: do_init_module+0x4c/0x260
Feb 13 09:43:29 localhost kernel: __do_sys_finit_module+0x9b/0xf0
Feb 13 09:43:29 localhost kernel: do_syscall_64+0x3b/0x90
Feb 13 09:43:29 localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Feb 13 09:43:29 localhost kernel: RIP: 0033:0x7fae66bf5ecd
Feb 13 09:43:29 localhost kernel: Code: 5b 41 5c c3 66 0f 1f 84 00 00
00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d
89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b ef
0e 00 f7 d8 64 89 01 48
Feb 13 09:43:29 localhost kernel: RSP: 002b:00007ffdb7beb198 EFLAGS:
00000246 ORIG_RAX: 0000000000000139
Feb 13 09:43:29 localhost kernel: RAX: ffffffffffffffda RBX:
00007fae677c8c80 RCX: 00007fae66bf5ecd
Feb 13 09:43:29 localhost kernel: RDX: 0000000000000000 RSI:
00007fae67173a2a RDI: 0000000000000003
Feb 13 09:43:29 localhost kernel: RBP: 0000000000040004 R08:
0000000000000000 R09: 0000000000000000
Feb 13 09:43:29 localhost kernel: R10: 0000000000000003 R11:
0000000000000246 R12: 00007fae67173a2a
Feb 13 09:43:29 localhost kernel: R13: 00007fae677c8ed0 R14:
00007fae677c8c80 R15: 00007fae677c9980
Feb 13 09:43:29 localhost kernel: </TASK>
Feb 13 09:43:29 localhost kernel: infiniband qib0: Couldn't register
device with driver model
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: qib0: Failed to
register driver with ib core.
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: qib0: cannot
register verbs: 17!
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 140
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 141
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 142
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 144
Feb 13 09:43:33 localhost kernel: ib_qib: probe of 0000:04:00.0 failed
with error -17

$ ibstatus
Fatal error:  device '*': sys files not found (/sys/class/infiniband/*/ports)

$ git bisect bad
4a7aaf88c89f12f8048137e274ce0d40fe1056b2 is the first bad commit
commit 4a7aaf88c89f12f8048137e274ce0d40fe1056b2
Author: Jason Gunthorpe <jgg@nvidia.com>
Date:   Fri Jun 11 19:00:30 2021 +0300

    RDMA/qib: Use attributes for the port sysfs

    qib should not be creating a mess of kobjects to attach to the port
    kobject - this is all attributes. The proper API is to create an
    attribute_group list and create it against the port's kobject.

    Link: https://lore.kernel.org/r/911e0031e1ed495b0006e8a6efec7b67a702cd5e.1623427137.git.leonro@nvidia.com
    Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

 drivers/infiniband/hw/qib/qib.h       |   4 -
 drivers/infiniband/hw/qib/qib_sysfs.c | 606 ++++++++++++++--------------------
 2 files changed, 254 insertions(+), 356 deletions(-)


Cheers
Robin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-12 14:39               ` Sebastian Ott
@ 2011-04-12 16:02                 ` Greg KH
  0 siblings, 0 replies; 38+ messages in thread
From: Greg KH @ 2011-04-12 16:02 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-kernel

On Tue, Apr 12, 2011 at 04:39:49PM +0200, Sebastian Ott wrote:
> 
> 
> On Mon, 11 Apr 2011, Greg KH wrote:
> 
> > On Mon, Apr 11, 2011 at 07:50:34PM +0200, Sebastian Ott wrote:
> > > 
> > > 
> > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > 
> > > > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote:
> > > > > 
> > > > > 
> > > > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > > > 
> > > > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> > > > > > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > > > > > 
> > > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > i've seen this warning which looks to be caused by a race between device_add
> > > > > > > > > and driver_register
> > > > > > > > > 
> > > > > > > > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > > > > > > > 
> > > > > > > > Isn't the problem here the fact that you are creating 2 directories of
> > > > > > > > the same name?
> > > > > > > I'm sure this isn't the case here. The bus code just calls device_add and
> > > > > > > at the same time on a different thread a module is loaded which registers
> > > > > > > a driver at the bus.
> > > > > > > 
> > > > > > > I was able to reproduce this with a module which creates a dummy bus
> > > > > > > and registers drivers and devices on this bus on 2 different workqueues.
> > > > > > 
> > > > > > That makes sense, as no bus should be doing this on multiple "threads".
> > > > > > What real-life bus does this today?
> > > > > A bus that will recognize and register a lot of devices, after the first
> > > > > uevent is presented to userspace, a module will be loaded registering a
> > > > > driver from a different thread. I don't think thats uncommon.
> > > > 
> > > > But again, what kernel code today does this?  I think they all have
> > > > locks to keep this from happening, right?
> > > I couldn't find a bus who protects device_register against driver_register
> > > and I don't think this is something which should be handled by every
> > > individual bus but from within the driver core.
> > 
> > How did you cause the oops in this original message?  What type of bus
> > was it on?  And did your patch solve the issue?
> 
> I've seen this warning on a system which had a lot of devices attached.
> Those devices also responded _very_ slow. So it took some time to
> register all these deivces (qeth) on the bus (ccw). During this time
> the qeth module got loaded and registered the qeth driver.
> 
> Since i've never seen this warning and those devices normally don't
> take so much time to respond I wrote the beforementioned module to
> reproduce the race and make sure that this is no driver issue.
> 
> With this module the warning could be triggered after a few cycles.
> With the patch applied, I did not see the warning again even after
> > 10.000 cycles.

Ok, fair enough.  Care to resend the patch again so I can review it once
more?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 17:56             ` Greg KH
@ 2011-04-12 14:39               ` Sebastian Ott
  2011-04-12 16:02                 ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Sebastian Ott @ 2011-04-12 14:39 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel



On Mon, 11 Apr 2011, Greg KH wrote:

> On Mon, Apr 11, 2011 at 07:50:34PM +0200, Sebastian Ott wrote:
> > 
> > 
> > On Mon, 11 Apr 2011, Greg KH wrote:
> > 
> > > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote:
> > > > 
> > > > 
> > > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > > 
> > > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> > > > > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > > > > 
> > > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > i've seen this warning which looks to be caused by a race between device_add
> > > > > > > > and driver_register
> > > > > > > > 
> > > > > > > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > > > > > > 
> > > > > > > Isn't the problem here the fact that you are creating 2 directories of
> > > > > > > the same name?
> > > > > > I'm sure this isn't the case here. The bus code just calls device_add and
> > > > > > at the same time on a different thread a module is loaded which registers
> > > > > > a driver at the bus.
> > > > > > 
> > > > > > I was able to reproduce this with a module which creates a dummy bus
> > > > > > and registers drivers and devices on this bus on 2 different workqueues.
> > > > > 
> > > > > That makes sense, as no bus should be doing this on multiple "threads".
> > > > > What real-life bus does this today?
> > > > A bus that will recognize and register a lot of devices, after the first
> > > > uevent is presented to userspace, a module will be loaded registering a
> > > > driver from a different thread. I don't think thats uncommon.
> > > 
> > > But again, what kernel code today does this?  I think they all have
> > > locks to keep this from happening, right?
> > I couldn't find a bus who protects device_register against driver_register
> > and I don't think this is something which should be handled by every
> > individual bus but from within the driver core.
> 
> How did you cause the oops in this original message?  What type of bus
> was it on?  And did your patch solve the issue?

I've seen this warning on a system which had a lot of devices attached.
Those devices also responded _very_ slow. So it took some time to
register all these deivces (qeth) on the bus (ccw). During this time
the qeth module got loaded and registered the qeth driver.

Since i've never seen this warning and those devices normally don't
take so much time to respond I wrote the beforementioned module to
reproduce the race and make sure that this is no driver issue.

With this module the warning could be triggered after a few cycles.
With the patch applied, I did not see the warning again even after
> 10.000 cycles.

regards,
Sebastian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 17:50           ` Sebastian Ott
@ 2011-04-11 17:56             ` Greg KH
  2011-04-12 14:39               ` Sebastian Ott
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2011-04-11 17:56 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-kernel

On Mon, Apr 11, 2011 at 07:50:34PM +0200, Sebastian Ott wrote:
> 
> 
> On Mon, 11 Apr 2011, Greg KH wrote:
> 
> > On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote:
> > > 
> > > 
> > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > 
> > > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> > > > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > > > 
> > > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > i've seen this warning which looks to be caused by a race between device_add
> > > > > > > and driver_register
> > > > > > > 
> > > > > > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > > > > > 
> > > > > > Isn't the problem here the fact that you are creating 2 directories of
> > > > > > the same name?
> > > > > I'm sure this isn't the case here. The bus code just calls device_add and
> > > > > at the same time on a different thread a module is loaded which registers
> > > > > a driver at the bus.
> > > > > 
> > > > > I was able to reproduce this with a module which creates a dummy bus
> > > > > and registers drivers and devices on this bus on 2 different workqueues.
> > > > 
> > > > That makes sense, as no bus should be doing this on multiple "threads".
> > > > What real-life bus does this today?
> > > A bus that will recognize and register a lot of devices, after the first
> > > uevent is presented to userspace, a module will be loaded registering a
> > > driver from a different thread. I don't think thats uncommon.
> > 
> > But again, what kernel code today does this?  I think they all have
> > locks to keep this from happening, right?
> I couldn't find a bus who protects device_register against driver_register
> and I don't think this is something which should be handled by every
> individual bus but from within the driver core.

How did you cause the oops in this original message?  What type of bus
was it on?  And did your patch solve the issue?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 15:19         ` Greg KH
@ 2011-04-11 17:50           ` Sebastian Ott
  2011-04-11 17:56             ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Sebastian Ott @ 2011-04-11 17:50 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel



On Mon, 11 Apr 2011, Greg KH wrote:

> On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote:
> > 
> > 
> > On Mon, 11 Apr 2011, Greg KH wrote:
> > 
> > > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> > > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > > 
> > > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > i've seen this warning which looks to be caused by a race between device_add
> > > > > > and driver_register
> > > > > > 
> > > > > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > > > > 
> > > > > Isn't the problem here the fact that you are creating 2 directories of
> > > > > the same name?
> > > > I'm sure this isn't the case here. The bus code just calls device_add and
> > > > at the same time on a different thread a module is loaded which registers
> > > > a driver at the bus.
> > > > 
> > > > I was able to reproduce this with a module which creates a dummy bus
> > > > and registers drivers and devices on this bus on 2 different workqueues.
> > > 
> > > That makes sense, as no bus should be doing this on multiple "threads".
> > > What real-life bus does this today?
> > A bus that will recognize and register a lot of devices, after the first
> > uevent is presented to userspace, a module will be loaded registering a
> > driver from a different thread. I don't think thats uncommon.
> 
> But again, what kernel code today does this?  I think they all have
> locks to keep this from happening, right?
I couldn't find a bus who protects device_register against driver_register
and I don't think this is something which should be handled by every
individual bus but from within the driver core.

> 
> > > > > > * device_add attached the device to the bus /*break*/
> > > > > > * driver_register walks the list of devices and tries to bind
> > > > > >   unbound devices
> > > > > > * /*continue*/ device_add calls device_attach which gets confused
> > > > > >   that the device is already bound to a driver
> > > > > 
> > > > > Why would your bus code ever allow this to happen?  It's the caller's
> > > > > responsiblity to do things in the correct order, right?
> > > > I don't think the bus code which calls device_register can (or should)
> > > > prevent drivers from beeing registered at this bus at the same time.
> > > 
> > > Why not?  That's the way all kernel subsystems work today that I know
> > > of.  Has this changed?
> > What about an exported bus_type? At all time a driver for this bus can
> > be registered, the bus code has no chance to prevent or serialize this.
> 
> No, the bus core is the one that should be binding the bus type to the
> driver and doing the registering.  No individual driver should ever be
> messing with a bus_type at all.
> 
> Now perhaps platform devices are, and if so, we might want to resolve
> this, but no "real" bus should ever be doing this.
> 
> thanks,
> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 15:05       ` Sebastian Ott
@ 2011-04-11 15:19         ` Greg KH
  2011-04-11 17:50           ` Sebastian Ott
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2011-04-11 15:19 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-kernel

On Mon, Apr 11, 2011 at 05:05:08PM +0200, Sebastian Ott wrote:
> 
> 
> On Mon, 11 Apr 2011, Greg KH wrote:
> 
> > On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> > > On Mon, 11 Apr 2011, Greg KH wrote:
> > > 
> > > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > > > Hi,
> > > > > 
> > > > > i've seen this warning which looks to be caused by a race between device_add
> > > > > and driver_register
> > > > > 
> > > > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > > > 
> > > > Isn't the problem here the fact that you are creating 2 directories of
> > > > the same name?
> > > I'm sure this isn't the case here. The bus code just calls device_add and
> > > at the same time on a different thread a module is loaded which registers
> > > a driver at the bus.
> > > 
> > > I was able to reproduce this with a module which creates a dummy bus
> > > and registers drivers and devices on this bus on 2 different workqueues.
> > 
> > That makes sense, as no bus should be doing this on multiple "threads".
> > What real-life bus does this today?
> A bus that will recognize and register a lot of devices, after the first
> uevent is presented to userspace, a module will be loaded registering a
> driver from a different thread. I don't think thats uncommon.

But again, what kernel code today does this?  I think they all have
locks to keep this from happening, right?

> > > > > * device_add attached the device to the bus /*break*/
> > > > > * driver_register walks the list of devices and tries to bind
> > > > >   unbound devices
> > > > > * /*continue*/ device_add calls device_attach which gets confused
> > > > >   that the device is already bound to a driver
> > > > 
> > > > Why would your bus code ever allow this to happen?  It's the caller's
> > > > responsiblity to do things in the correct order, right?
> > > I don't think the bus code which calls device_register can (or should)
> > > prevent drivers from beeing registered at this bus at the same time.
> > 
> > Why not?  That's the way all kernel subsystems work today that I know
> > of.  Has this changed?
> What about an exported bus_type? At all time a driver for this bus can
> be registered, the bus code has no chance to prevent or serialize this.

No, the bus core is the one that should be binding the bus type to the
driver and doing the registering.  No individual driver should ever be
messing with a bus_type at all.

Now perhaps platform devices are, and if so, we might want to resolve
this, but no "real" bus should ever be doing this.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 14:49     ` Greg KH
@ 2011-04-11 15:05       ` Sebastian Ott
  2011-04-11 15:19         ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Sebastian Ott @ 2011-04-11 15:05 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel



On Mon, 11 Apr 2011, Greg KH wrote:

> On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> > On Mon, 11 Apr 2011, Greg KH wrote:
> > 
> > > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > > Hi,
> > > > 
> > > > i've seen this warning which looks to be caused by a race between device_add
> > > > and driver_register
> > > > 
> > > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > > 
> > > Isn't the problem here the fact that you are creating 2 directories of
> > > the same name?
> > I'm sure this isn't the case here. The bus code just calls device_add and
> > at the same time on a different thread a module is loaded which registers
> > a driver at the bus.
> > 
> > I was able to reproduce this with a module which creates a dummy bus
> > and registers drivers and devices on this bus on 2 different workqueues.
> 
> That makes sense, as no bus should be doing this on multiple "threads".
> What real-life bus does this today?
A bus that will recognize and register a lot of devices, after the first
uevent is presented to userspace, a module will be loaded registering a
driver from a different thread. I don't think thats uncommon.

> 
> > > > [   80.893611] ------------[ cut here ]------------
> > > > [   80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455
> > > > [   80.893617] Modules linked in: qeth ccwgroup
> > > > [   80.893623] Modules linked in: qeth ccwgroup
> > > > [   80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1
> > > > [   80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980)
> > > > [   80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8)
> > > > [   80.893643]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
> > > > [   80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001
> > > > [   80.893651]            0000000000589daa 00000000005a1248 0000000000000000 0000000000000000
> > > > [   80.893654]            0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8
> > > > [   80.893657]            000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00
> > > > [   80.893668] Krnl Code: 00000000002b675c: c0200020041d        larl    %r2,6b6f96
> > > > [   80.893672]            00000000002b6762: c0e500169afb        brasl   %r14,589d58
> > > > [   80.893676]            00000000002b6768: a7f40001            brc     15,2b676a
> > > > [   80.893680]           >00000000002b676c: b904002a            lgr     %r2,%r10
> > > > [   80.893684]            00000000002b6770: c0e5fffb6236        brasl   %r14,222bdc
> > > > [   80.893687]            00000000002b6776: a7f4ffad            brc     15,2b66d0
> > > > [   80.893692]            00000000002b677a: e320c0480004        lg      %r2,72(%r12)
> > > > [   80.893695]            00000000002b6780: a7f4ffec            brc     15,2b6758
> > > > [   80.893699] Call Trace:
> > > > [   80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8)
> > > > [   80.893705]  [<00000000002b7046>] sysfs_do_create_link+0xda/0x268
> > > > [   80.893708]  [<0000000000409f26>] driver_sysfs_add+0x66/0xcc
> > > > [   80.893713]  [<000000000040a0a2>] device_bind_driver+0x26/0x48
> > > > [   80.893717]  [<000000000040a110>] device_attach+0x4c/0xd4
> > > > [   80.893720]  [<0000000000409734>] bus_probe_device+0x4c/0x5c
> > > > [   80.893724]  [<0000000000406c52>] device_add+0x61e/0x73c
> > > > [   80.893728]  [<00000000004631a6>] ccw_device_todo+0x31a/0x380
> > > > [   80.893733]  [<000000000015f5b6>] process_one_work+0x1f6/0x4f0
> > > > [   80.893739]  [<0000000000163358>] worker_thread+0x17c/0x370
> > > > [   80.893742]  [<00000000001690ca>] kthread+0xa6/0xb0
> > > > [   80.893747]  [<000000000058f5f2>] kernel_thread_starter+0x6/0xc
> > > > [   80.893752]  [<000000000058f5ec>] kernel_thread_starter+0x0/0xc
> > > > [   80.893756] 4 locks held by kworker/u:1/25:
> > > > [   80.893758]  #0:  (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> > > > [   80.893766]  #1:  ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> > > > [   80.893773]  #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4
> > > > [   80.893780]  #3:  (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268
> > > > [   80.893787] Last Breaking-Event-Address:
> > > > [   80.893790]  [<00000000002b6768>] sysfs_add_one+0xcc/0xe8
> > > > [   80.893795] ---[ end trace ded2f91fcf2c6165 ]---
> > > > 
> > > > 
> > > > * device_add attached the device to the bus /*break*/
> > > > * driver_register walks the list of devices and tries to bind
> > > >   unbound devices
> > > > * /*continue*/ device_add calls device_attach which gets confused
> > > >   that the device is already bound to a driver
> > > 
> > > Why would your bus code ever allow this to happen?  It's the caller's
> > > responsiblity to do things in the correct order, right?
> > I don't think the bus code which calls device_register can (or should)
> > prevent drivers from beeing registered at this bus at the same time.
> 
> Why not?  That's the way all kernel subsystems work today that I know
> of.  Has this changed?
What about an exported bus_type? At all time a driver for this bus can
be registered, the bus code has no chance to prevent or serialize this.

> 
> thanks,
> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 14:33   ` Sebastian Ott
@ 2011-04-11 14:49     ` Greg KH
  2011-04-11 15:05       ` Sebastian Ott
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2011-04-11 14:49 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-kernel

On Mon, Apr 11, 2011 at 04:33:03PM +0200, Sebastian Ott wrote:
> On Mon, 11 Apr 2011, Greg KH wrote:
> 
> > On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > > Hi,
> > > 
> > > i've seen this warning which looks to be caused by a race between device_add
> > > and driver_register
> > > 
> > > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> > 
> > Isn't the problem here the fact that you are creating 2 directories of
> > the same name?
> I'm sure this isn't the case here. The bus code just calls device_add and
> at the same time on a different thread a module is loaded which registers
> a driver at the bus.
> 
> I was able to reproduce this with a module which creates a dummy bus
> and registers drivers and devices on this bus on 2 different workqueues.

That makes sense, as no bus should be doing this on multiple "threads".
What real-life bus does this today?

> > > [   80.893611] ------------[ cut here ]------------
> > > [   80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455
> > > [   80.893617] Modules linked in: qeth ccwgroup
> > > [   80.893623] Modules linked in: qeth ccwgroup
> > > [   80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1
> > > [   80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980)
> > > [   80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8)
> > > [   80.893643]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
> > > [   80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001
> > > [   80.893651]            0000000000589daa 00000000005a1248 0000000000000000 0000000000000000
> > > [   80.893654]            0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8
> > > [   80.893657]            000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00
> > > [   80.893668] Krnl Code: 00000000002b675c: c0200020041d        larl    %r2,6b6f96
> > > [   80.893672]            00000000002b6762: c0e500169afb        brasl   %r14,589d58
> > > [   80.893676]            00000000002b6768: a7f40001            brc     15,2b676a
> > > [   80.893680]           >00000000002b676c: b904002a            lgr     %r2,%r10
> > > [   80.893684]            00000000002b6770: c0e5fffb6236        brasl   %r14,222bdc
> > > [   80.893687]            00000000002b6776: a7f4ffad            brc     15,2b66d0
> > > [   80.893692]            00000000002b677a: e320c0480004        lg      %r2,72(%r12)
> > > [   80.893695]            00000000002b6780: a7f4ffec            brc     15,2b6758
> > > [   80.893699] Call Trace:
> > > [   80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8)
> > > [   80.893705]  [<00000000002b7046>] sysfs_do_create_link+0xda/0x268
> > > [   80.893708]  [<0000000000409f26>] driver_sysfs_add+0x66/0xcc
> > > [   80.893713]  [<000000000040a0a2>] device_bind_driver+0x26/0x48
> > > [   80.893717]  [<000000000040a110>] device_attach+0x4c/0xd4
> > > [   80.893720]  [<0000000000409734>] bus_probe_device+0x4c/0x5c
> > > [   80.893724]  [<0000000000406c52>] device_add+0x61e/0x73c
> > > [   80.893728]  [<00000000004631a6>] ccw_device_todo+0x31a/0x380
> > > [   80.893733]  [<000000000015f5b6>] process_one_work+0x1f6/0x4f0
> > > [   80.893739]  [<0000000000163358>] worker_thread+0x17c/0x370
> > > [   80.893742]  [<00000000001690ca>] kthread+0xa6/0xb0
> > > [   80.893747]  [<000000000058f5f2>] kernel_thread_starter+0x6/0xc
> > > [   80.893752]  [<000000000058f5ec>] kernel_thread_starter+0x0/0xc
> > > [   80.893756] 4 locks held by kworker/u:1/25:
> > > [   80.893758]  #0:  (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> > > [   80.893766]  #1:  ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> > > [   80.893773]  #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4
> > > [   80.893780]  #3:  (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268
> > > [   80.893787] Last Breaking-Event-Address:
> > > [   80.893790]  [<00000000002b6768>] sysfs_add_one+0xcc/0xe8
> > > [   80.893795] ---[ end trace ded2f91fcf2c6165 ]---
> > > 
> > > 
> > > * device_add attached the device to the bus /*break*/
> > > * driver_register walks the list of devices and tries to bind
> > >   unbound devices
> > > * /*continue*/ device_add calls device_attach which gets confused
> > >   that the device is already bound to a driver
> > 
> > Why would your bus code ever allow this to happen?  It's the caller's
> > responsiblity to do things in the correct order, right?
> I don't think the bus code which calls device_register can (or should)
> prevent drivers from beeing registered at this bus at the same time.

Why not?  That's the way all kernel subsystems work today that I know
of.  Has this changed?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 14:13 ` Greg KH
@ 2011-04-11 14:33   ` Sebastian Ott
  2011-04-11 14:49     ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Sebastian Ott @ 2011-04-11 14:33 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

On Mon, 11 Apr 2011, Greg KH wrote:

> On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> > Hi,
> > 
> > i've seen this warning which looks to be caused by a race between device_add
> > and driver_register
> > 
> > [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
> 
> Isn't the problem here the fact that you are creating 2 directories of
> the same name?
I'm sure this isn't the case here. The bus code just calls device_add and
at the same time on a different thread a module is loaded which registers
a driver at the bus.

I was able to reproduce this with a module which creates a dummy bus
and registers drivers and devices on this bus on 2 different workqueues.

> 
> 
> > [   80.893611] ------------[ cut here ]------------
> > [   80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455
> > [   80.893617] Modules linked in: qeth ccwgroup
> > [   80.893623] Modules linked in: qeth ccwgroup
> > [   80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1
> > [   80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980)
> > [   80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8)
> > [   80.893643]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
> > [   80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001
> > [   80.893651]            0000000000589daa 00000000005a1248 0000000000000000 0000000000000000
> > [   80.893654]            0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8
> > [   80.893657]            000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00
> > [   80.893668] Krnl Code: 00000000002b675c: c0200020041d        larl    %r2,6b6f96
> > [   80.893672]            00000000002b6762: c0e500169afb        brasl   %r14,589d58
> > [   80.893676]            00000000002b6768: a7f40001            brc     15,2b676a
> > [   80.893680]           >00000000002b676c: b904002a            lgr     %r2,%r10
> > [   80.893684]            00000000002b6770: c0e5fffb6236        brasl   %r14,222bdc
> > [   80.893687]            00000000002b6776: a7f4ffad            brc     15,2b66d0
> > [   80.893692]            00000000002b677a: e320c0480004        lg      %r2,72(%r12)
> > [   80.893695]            00000000002b6780: a7f4ffec            brc     15,2b6758
> > [   80.893699] Call Trace:
> > [   80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8)
> > [   80.893705]  [<00000000002b7046>] sysfs_do_create_link+0xda/0x268
> > [   80.893708]  [<0000000000409f26>] driver_sysfs_add+0x66/0xcc
> > [   80.893713]  [<000000000040a0a2>] device_bind_driver+0x26/0x48
> > [   80.893717]  [<000000000040a110>] device_attach+0x4c/0xd4
> > [   80.893720]  [<0000000000409734>] bus_probe_device+0x4c/0x5c
> > [   80.893724]  [<0000000000406c52>] device_add+0x61e/0x73c
> > [   80.893728]  [<00000000004631a6>] ccw_device_todo+0x31a/0x380
> > [   80.893733]  [<000000000015f5b6>] process_one_work+0x1f6/0x4f0
> > [   80.893739]  [<0000000000163358>] worker_thread+0x17c/0x370
> > [   80.893742]  [<00000000001690ca>] kthread+0xa6/0xb0
> > [   80.893747]  [<000000000058f5f2>] kernel_thread_starter+0x6/0xc
> > [   80.893752]  [<000000000058f5ec>] kernel_thread_starter+0x0/0xc
> > [   80.893756] 4 locks held by kworker/u:1/25:
> > [   80.893758]  #0:  (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> > [   80.893766]  #1:  ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> > [   80.893773]  #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4
> > [   80.893780]  #3:  (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268
> > [   80.893787] Last Breaking-Event-Address:
> > [   80.893790]  [<00000000002b6768>] sysfs_add_one+0xcc/0xe8
> > [   80.893795] ---[ end trace ded2f91fcf2c6165 ]---
> > 
> > 
> > * device_add attached the device to the bus /*break*/
> > * driver_register walks the list of devices and tries to bind
> >   unbound devices
> > * /*continue*/ device_add calls device_attach which gets confused
> >   that the device is already bound to a driver
> 
> Why would your bus code ever allow this to happen?  It's the caller's
> responsiblity to do things in the correct order, right?
I don't think the bus code which calls device_register can (or should)
prevent drivers from beeing registered at this bus at the same time.

> 
> > to fix this we could:
> > * hold the device lock in device_add (from bus_add_device to
> >   bus_probe_device)  ..but this means we have to extract the lock
> >   from the inside of some of these functions and i'm not sure about
> >   holding the lock while the blocking_notifier_call_chain thing..
> > 
> > * add a bus mutex, preventing concurrent registrations of devices
> >   and drivers to a bus
> > 
> > * change device_attach to detect an already bound device..something
> >   like the following:
> 
> How about fix your caller code?
> 
> thanks,
> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: sysfs: cannot create duplicate filename
  2011-04-11 14:04 Sebastian Ott
@ 2011-04-11 14:13 ` Greg KH
  2011-04-11 14:33   ` Sebastian Ott
  0 siblings, 1 reply; 38+ messages in thread
From: Greg KH @ 2011-04-11 14:13 UTC (permalink / raw)
  To: Sebastian Ott; +Cc: linux-kernel

On Mon, Apr 11, 2011 at 04:04:08PM +0200, Sebastian Ott wrote:
> Hi,
> 
> i've seen this warning which looks to be caused by a race between device_add
> and driver_register
> 
> [   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'

Isn't the problem here the fact that you are creating 2 directories of
the same name?


> [   80.893611] ------------[ cut here ]------------
> [   80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455
> [   80.893617] Modules linked in: qeth ccwgroup
> [   80.893623] Modules linked in: qeth ccwgroup
> [   80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1
> [   80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980)
> [   80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8)
> [   80.893643]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
> [   80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001
> [   80.893651]            0000000000589daa 00000000005a1248 0000000000000000 0000000000000000
> [   80.893654]            0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8
> [   80.893657]            000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00
> [   80.893668] Krnl Code: 00000000002b675c: c0200020041d        larl    %r2,6b6f96
> [   80.893672]            00000000002b6762: c0e500169afb        brasl   %r14,589d58
> [   80.893676]            00000000002b6768: a7f40001            brc     15,2b676a
> [   80.893680]           >00000000002b676c: b904002a            lgr     %r2,%r10
> [   80.893684]            00000000002b6770: c0e5fffb6236        brasl   %r14,222bdc
> [   80.893687]            00000000002b6776: a7f4ffad            brc     15,2b66d0
> [   80.893692]            00000000002b677a: e320c0480004        lg      %r2,72(%r12)
> [   80.893695]            00000000002b6780: a7f4ffec            brc     15,2b6758
> [   80.893699] Call Trace:
> [   80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8)
> [   80.893705]  [<00000000002b7046>] sysfs_do_create_link+0xda/0x268
> [   80.893708]  [<0000000000409f26>] driver_sysfs_add+0x66/0xcc
> [   80.893713]  [<000000000040a0a2>] device_bind_driver+0x26/0x48
> [   80.893717]  [<000000000040a110>] device_attach+0x4c/0xd4
> [   80.893720]  [<0000000000409734>] bus_probe_device+0x4c/0x5c
> [   80.893724]  [<0000000000406c52>] device_add+0x61e/0x73c
> [   80.893728]  [<00000000004631a6>] ccw_device_todo+0x31a/0x380
> [   80.893733]  [<000000000015f5b6>] process_one_work+0x1f6/0x4f0
> [   80.893739]  [<0000000000163358>] worker_thread+0x17c/0x370
> [   80.893742]  [<00000000001690ca>] kthread+0xa6/0xb0
> [   80.893747]  [<000000000058f5f2>] kernel_thread_starter+0x6/0xc
> [   80.893752]  [<000000000058f5ec>] kernel_thread_starter+0x0/0xc
> [   80.893756] 4 locks held by kworker/u:1/25:
> [   80.893758]  #0:  (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> [   80.893766]  #1:  ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
> [   80.893773]  #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4
> [   80.893780]  #3:  (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268
> [   80.893787] Last Breaking-Event-Address:
> [   80.893790]  [<00000000002b6768>] sysfs_add_one+0xcc/0xe8
> [   80.893795] ---[ end trace ded2f91fcf2c6165 ]---
> 
> 
> * device_add attached the device to the bus /*break*/
> * driver_register walks the list of devices and tries to bind
>   unbound devices
> * /*continue*/ device_add calls device_attach which gets confused
>   that the device is already bound to a driver

Why would your bus code ever allow this to happen?  It's the caller's
responsiblity to do things in the correct order, right?

> to fix this we could:
> * hold the device lock in device_add (from bus_add_device to
>   bus_probe_device)  ..but this means we have to extract the lock
>   from the inside of some of these functions and i'm not sure about
>   holding the lock while the blocking_notifier_call_chain thing..
> 
> * add a bus mutex, preventing concurrent registrations of devices
>   and drivers to a bus
> 
> * change device_attach to detect an already bound device..something
>   like the following:

How about fix your caller code?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread

* sysfs: cannot create duplicate filename
@ 2011-04-11 14:04 Sebastian Ott
  2011-04-11 14:13 ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread
From: Sebastian Ott @ 2011-04-11 14:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel

Hi,

i've seen this warning which looks to be caused by a race between device_add
and driver_register

[   80.893594] sysfs: cannot create duplicate filename '/bus/ccw/drivers/qeth/0.0.b57d'
[   80.893611] ------------[ cut here ]------------
[   80.893614] WARNING: at /home/autobuild/BUILD/linux-2.6.38.2-20110404/fs/sysfs/dir.c:455
[   80.893617] Modules linked in: qeth ccwgroup
[   80.893623] Modules linked in: qeth ccwgroup
[   80.893629] CPU: 1 Not tainted 2.6.38.2-48.x.20110404-s390xdefault #1
[   80.893632] Process kworker/u:1 (pid: 25, task: 000000007e6c5a40, ksp: 000000007e6cb980)
[   80.893635] Krnl PSW : 0704000180000000 00000000002b676c (sysfs_add_one+0xd0/0xe8)
[   80.893643]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
[   80.893647] Krnl GPRS: 00000000000000bd 0000000000000000 000000000000005e 0000000000000001
[   80.893651]            0000000000589daa 00000000005a1248 0000000000000000 0000000000000000
[   80.893654]            0000000000000000 0000000000000001 000000007865c000 000000007e6cbbd8
[   80.893657]            000000007c704398 00000000ffffffef 00000000002b6768 000000007e6cbb00
[   80.893668] Krnl Code: 00000000002b675c: c0200020041d        larl    %r2,6b6f96
[   80.893672]            00000000002b6762: c0e500169afb        brasl   %r14,589d58
[   80.893676]            00000000002b6768: a7f40001            brc     15,2b676a
[   80.893680]           >00000000002b676c: b904002a            lgr     %r2,%r10
[   80.893684]            00000000002b6770: c0e5fffb6236        brasl   %r14,222bdc
[   80.893687]            00000000002b6776: a7f4ffad            brc     15,2b66d0
[   80.893692]            00000000002b677a: e320c0480004        lg      %r2,72(%r12)
[   80.893695]            00000000002b6780: a7f4ffec            brc     15,2b6758
[   80.893699] Call Trace:
[   80.893701] ([<00000000002b6768>] sysfs_add_one+0xcc/0xe8)
[   80.893705]  [<00000000002b7046>] sysfs_do_create_link+0xda/0x268
[   80.893708]  [<0000000000409f26>] driver_sysfs_add+0x66/0xcc
[   80.893713]  [<000000000040a0a2>] device_bind_driver+0x26/0x48
[   80.893717]  [<000000000040a110>] device_attach+0x4c/0xd4
[   80.893720]  [<0000000000409734>] bus_probe_device+0x4c/0x5c
[   80.893724]  [<0000000000406c52>] device_add+0x61e/0x73c
[   80.893728]  [<00000000004631a6>] ccw_device_todo+0x31a/0x380
[   80.893733]  [<000000000015f5b6>] process_one_work+0x1f6/0x4f0
[   80.893739]  [<0000000000163358>] worker_thread+0x17c/0x370
[   80.893742]  [<00000000001690ca>] kthread+0xa6/0xb0
[   80.893747]  [<000000000058f5f2>] kernel_thread_starter+0x6/0xc
[   80.893752]  [<000000000058f5ec>] kernel_thread_starter+0x0/0xc
[   80.893756] 4 locks held by kworker/u:1/25:
[   80.893758]  #0:  (cio){++++.+}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
[   80.893766]  #1:  ((&cdev->private->todo_work)){+.+.+.}, at: [<000000000015f524>] process_one_work+0x164/0x4f0
[   80.893773]  #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<000000000040a0fc>] device_attach+0x38/0xd4
[   80.893780]  #3:  (sysfs_mutex){+.+.+.}, at: [<00000000002b7030>] sysfs_do_create_link+0xc4/0x268
[   80.893787] Last Breaking-Event-Address:
[   80.893790]  [<00000000002b6768>] sysfs_add_one+0xcc/0xe8
[   80.893795] ---[ end trace ded2f91fcf2c6165 ]---


* device_add attached the device to the bus /*break*/
* driver_register walks the list of devices and tries to bind
  unbound devices
* /*continue*/ device_add calls device_attach which gets confused
  that the device is already bound to a driver

to fix this we could:
* hold the device lock in device_add (from bus_add_device to
  bus_probe_device)  ..but this means we have to extract the lock
  from the inside of some of these functions and i'm not sure about
  holding the lock while the blocking_notifier_call_chain thing..

* add a bus mutex, preventing concurrent registrations of devices
  and drivers to a bus

* change device_attach to detect an already bound device..something
  like the following:

regards,
Sebastian
-----

Add a check to device_attach to identify an already bound device.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
---
 drivers/base/dd.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -245,6 +245,10 @@ int device_attach(struct device *dev)
 
 	device_lock(dev);
 	if (dev->driver) {
+		if (klist_node_attached(&dev->p->knode_driver)) {
+			ret = 1;
+			goto out_unlock;
+		}
 		ret = device_bind_driver(dev);
 		if (ret == 0)
 			ret = 1;
@@ -257,6 +261,7 @@ int device_attach(struct device *dev)
 		ret = bus_for_each_drv(dev->bus, NULL, dev, __device_attach);
 		pm_runtime_put_sync(dev);
 	}
+out_unlock:
 	device_unlock(dev);
 	return ret;
 }

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2022-02-17 15:35 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-02 18:03 sysfs: cannot create duplicate filename Andre Heider
     [not found] ` <CAHsu+b-cG+ED6TX5evRTBjR-LwHugW+8-9hnHXAz5DnAioJnUQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-03-05  9:39   ` Lingzhu Xiang
     [not found]     ` <5135BD66.1030005-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-06 13:19       ` Matt Fleming
     [not found]         ` <1362575941.15011.56.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-03-06 13:35           ` shea-yfkUTty7RcRWk0Htik3J/w
     [not found]             ` <bd0c3451a7342302623c09cac09bcbfc-yfkUTty7RcRWk0Htik3J/w@public.gmane.org>
2013-03-06 14:29               ` Matt Fleming
2013-03-08 15:11       ` Matt Fleming
     [not found]         ` <1362755479.15011.238.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-03-08 18:17           ` Lingzhu Xiang
     [not found]             ` <513A2B29.8090705-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-12 10:08               ` Matt Fleming
     [not found]                 ` <1363082900.15011.257.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-03-12 10:45                   ` Lingzhu Xiang
     [not found]                     ` <513F0734.80600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-12 16:35                       ` Matt Fleming
     [not found]                         ` <1363106125.15011.263.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-03-12 17:51                           ` Seiji Aguchi
2013-03-13 10:47                           ` Lingzhu Xiang
     [not found]                             ` <51405943.2000601-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-14 16:33                               ` Matt Fleming
     [not found]                                 ` <1363278817.15011.316.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-03-14 19:31                                   ` Seiji Aguchi
     [not found]                                     ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF68807-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
2013-03-18 14:51                                       ` Matt Fleming
     [not found]                                         ` <1363618261.14988.4.camel-ZqTwcBeJ+wsBof6jY8KHXm7IUlhRatedral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-03-19 10:14                                           ` Lingzhu Xiang
     [not found]                                             ` <51483A88.6060509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-19 14:25                                               ` Seiji Aguchi
     [not found]                                                 ` <A5ED84D3BB3A384992CBB9C77DEDA4D41AF6E1F9-ohthHghroY0jroPwUH3sq+6wyyQG6/Uh@public.gmane.org>
2013-03-19 15:26                                                   ` Matt Fleming
     [not found]                                                     ` <514883B7.1010602-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
2013-03-19 15:46                                                       ` Seiji Aguchi
2013-03-19 16:00                                                   ` Lingzhu Xiang
2013-03-21  7:22                                           ` Lingzhu Xiang
     [not found]                                             ` <514AB534.8080209-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-21  7:44                                               ` Matt Fleming
2013-03-12 13:20                   ` Andre Heider
  -- strict thread matches above, loose matches on Subject: below --
2022-02-13  9:21 Robin Peiremans
2022-02-14 18:08 ` Jason Gunthorpe
2022-02-17 13:37   ` Dennis Dalessandro
2022-02-17 14:29     ` Marciniszyn, Mike
2022-02-17 15:35       ` Robin Peiremans
2011-04-11 14:04 Sebastian Ott
2011-04-11 14:13 ` Greg KH
2011-04-11 14:33   ` Sebastian Ott
2011-04-11 14:49     ` Greg KH
2011-04-11 15:05       ` Sebastian Ott
2011-04-11 15:19         ` Greg KH
2011-04-11 17:50           ` Sebastian Ott
2011-04-11 17:56             ` Greg KH
2011-04-12 14:39               ` Sebastian Ott
2011-04-12 16:02                 ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.