linux-kernel.vger.kernel.org archive mirror
* [RFC] memory corruption caused by efi driver?
@ 2017-06-24  9:52 Yisheng Xie
  2017-06-24 11:12 ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Yisheng Xie @ 2017-06-24  9:52 UTC
  To: matt, ard.biesheuvel, gregkh
  Cc: linux-efi, linux-kernel, Hanjun Guo, Xishi Qiu

hi all,

I hit an Oops with linux-3.10. The RIP is sysfs_open_file+0x46/0x2b0 (I will add the full
crash log at the end of this mail).

Disassembling sysfs_open_file with crash shows that the fault happens when opening the file:
  /sys/firmware/efi/vars/dbDefault-8be4df61-93ca-11d2-aa0d-00e098032b8c/raw_var

I dumped the kobject and the efivar_entry, and both appear to be corrupted:
crash> struct kobject ffff880464552838
struct kobject {
  name = 0x35302d30312d3031 <Address 0x35302d30312d3031 out of bounds>,
  entry = {
    next = 0x9060d307472632e,
    prev = 0x1010df78648862a
  },
  parent = 0x102820300050b,
  kset = 0xf7cecc30ff420835,
  ktype = 0x2935586810ad0c76,
  sd = 0x4112ef7c27763246,
  kref = {
    refcount = {
      counter = 1243300391
    }
  },
  state_initialized = 0,
  state_in_sysfs = 1,
  state_add_uevent_sent = 0,
  state_remove_uevent_sent = 1,
  uevent_suppress = 0
}
crash> p &((struct efivar_entry *)0)->kobj
$1 = (struct kobject *) 0x838
crash> struct efivar_entry -x 0xffff880464552000
struct efivar_entry {
  var = {
    VariableName = {0x64, 0x62, 0x44, 0x65, 0x66, 0x61, 0x75, 0x6c, 0x74, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0...},
    VendorGuid = {
      b = "a\337\344\213ʓ\322\021\252\r\000\340\230\003+\214"
    },
    DataSize = 0xc47,
    Data = "\241Y\300\245䔧J\207\265\253\025\\+\360r@\006\000\000\000\000\000\000$\006\000\000\275\232\372wY\003\062M\275`(\364\347\217xK0\202\006\020\060\202\003\370\240\003\002\001\002\002\na\b\323\304\000\000\000\000\000\004\060\r\006\t*\206H\206\367\r\001\001\v\005\000\060\201\221\061\v0\t\006\003U\004\006\023\002US1\023\060\021\006\003U\004\b\023\nWashington1\020\060\016\006\003U\004\a\023\aRedmond1\036\060\034\006\003U\004\n\023\025Microsoft Corporation1;09\006\003U\004\003\023\062Microsoft Corporation Third Party Marketplace Root0\036\027\r110627212245Z\027\r2606272"...,
    Status = 0x7265632f696b702f,
    Attributes = 0x4d2f7374
  },
  list = {
    next = 0x4d72615069685472,
    prev = 0x30325f6f6f527261
  },
  kobj = {
    name = 0x35302d30312d3031 <Address 0x35302d30312d3031 out of bounds>,
    entry = {
      next = 0x9060d307472632e,
      prev = 0x1010df78648862a
    },
    parent = 0x102820300050b,
    kset = 0xf7cecc30ff420835,
    ktype = 0x2935586810ad0c76,
    sd = 0x4112ef7c27763246,
    kref = {
      refcount = {
        counter = 0x4a1b4227
      }
    },
    state_initialized = 0x0,
    state_in_sysfs = 0x1,
    state_add_uevent_sent = 0x0,
    state_remove_uevent_sent = 0x1,
    uevent_suppress = 0x0
  },
  scanning = 0x48,
  deleting = 0x59
}
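One possible reading of the dump above, offered as a sketch rather than a confirmed diagnosis: the fields that follow var.Data do not look like pointers, so it can help to print them as little-endian bytes. The snippet below does that for the values copied from the dump and also re-checks the kobj offset arithmetic. The 1024-byte size of Data is an assumption taken from the 3.10-era struct efi_variable definition, and the names as_text, DATA_FIELD_SIZE and KOBJ_OFFSET are made up for this illustration.

```python
# Values copied from the crash dump above: (value, size in bytes).
fields = {
    "var.Status":      (0x7265632f696b702f, 8),
    "var.Attributes":  (0x4d2f7374, 4),
    "list.next":       (0x4d72615069685472, 8),
    "list.prev":       (0x30325f6f6f527261, 8),
    "kobj.name":       (0x35302d30312d3031, 8),
    "kobj.entry.next": (0x09060d307472632e, 8),
}


def as_text(value, size):
    """Render an integer as little-endian bytes, with '.' for non-printables."""
    raw = value.to_bytes(size, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


decoded = {name: as_text(value, size) for name, (value, size) in fields.items()}
for name, text in decoded.items():
    print(f"{name:16} {text}")

# DataSize from the dump versus the size of the Data array.
DATA_SIZE = 0xc47        # var.DataSize in the dump
DATA_FIELD_SIZE = 1024   # ASSUMPTION: sizeof(var.Data) in 3.10
overflow = DATA_SIZE - DATA_FIELD_SIZE
print(f"DataSize exceeds the Data array by {overflow} bytes")

# container_of arithmetic: kobject address minus the offset printed by crash.
KOBJ_OFFSET = 0x838                # from: p &((struct efivar_entry *)0)->kobj
kobj_addr = 0xffff880464552838     # kobject examined with crash
entry_addr = kobj_addr - KOBJ_OFFSET
print(hex(entry_addr))
```

The decoded fragments read "/pki/cer", "ts/M", "rThiParM", "arRoo_20", "10-10-05" and ".crt0...", which looks like text from the certificate payload continuing past the end of Data. That would be consistent with DataSize (0xc47) being larger than the Data array, but it is only a hypothesis based on this one dump.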


Any ideas?

Any comments are appreciated!

Thanks
Yisheng Xie

Detailed log:
------
[12476.033560] general protection fault: 0000 [#1] SMP
[12476.039247] kbox catch die event.
[12476.058628] collected_len = 154965, LOG_BUF_LEN_LOCAL = 1048576
[12476.121740] kbox: notify die begin
[12476.125632] kbox: no notify die func register. no need to notify
[12476.132414] do nothing after die!
[12476.136184] Modules linked in: loop binfmt_misc kboxdriver(O) kbox(O) kernel_log_dev(OE) signo_catch(O) bsp_cpld_lpc(OVE) vfat fat intel_powerclamp coretemp intel_rapl crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg i2c_i801 pcspkr shpchp i2c_hid video wmi acpi_pad ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic igb crct10dif_pclmul crct10dif_common i2c_algo_bit ahci i2c_core libahci dca crc32c_intel libata ptp pps_core 8250_dw intel_lpss_module mfd_core [last unloaded: gen_timer]
[12476.191525] CPU: 3 PID: 11257 Comm: cat Tainted: G        WC OE  ----V-------   3.10.0-327.53.58.73.x86_64 #1
[12476.202708] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 05/05/2017
[12476.211528] task: ffff880315ea5080 ti: ffff88045e530000 task.ti: ffff88045e530000
[12476.219965] RIP: 0010:[<ffffffff812601a6>]  [<ffffffff812601a6>] sysfs_open_file+0x46/0x2b0
[12476.229452] RSP: 0018:ffff88045e533c78  EFLAGS: 00010202
[12476.235505] RAX: 2935586810ad0c76 RBX: ffff88043e693e00 RCX: ffff88046451b694
[12476.243560] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88046451b690
[12476.251647] RBP: ffff88045e533ca0 R08: 0000000000000000 R09: 0000000000000000
[12476.259700] R10: 0b90000000000000 R11: ffff880466920780 R12: ffff88042c0094d0
[12476.267752] R13: ffff88046451b690 R14: ffff88042c0094d0 R15: ffff880464552838
[12476.275806] FS:  00007f3e56a96740(0000) GS:ffff88047e4c0000(0000) knlGS:0000000000000000
[12476.285001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12476.291532] CR2: 00007f3e5659aa80 CR3: 000000043e7e8000 CR4: 00000000003407e0
[12476.299621] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12476.307672] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[12476.315725] Stack:
[12476.318052]  ffff88043e693e00 ffff88042c0094d0 ffff880036cff0c0 0000000000000000
[12476.326565]  ffff88043e693e10 ffff88045e533ce8 ffffffff811e15c7 ffff88042c0094d0
[12476.335079]  ffffffff81260160 ffff88045e533f28 0000000000008000 ffff88045e533df0
[12476.343599] Call Trace:
[12476.346443]  [<ffffffff811e15c7>] do_dentry_open+0x1a7/0x2e0
[12476.352887]  [<ffffffff81260160>] ? sysfs_schedule_callback+0x1c0/0x1c0
[12476.360429]  [<ffffffff811e17f9>] vfs_open+0x39/0x70
[12476.366105]  [<ffffffff811f2c3d>] do_last+0x1ed/0x12a0
[12476.373605]  [<ffffffff81300422>] ? radix_tree_lookup_slot+0x22/0x50
[12476.380851]  [<ffffffff811f3db2>] path_openat+0xc2/0x490
[12476.386906]  [<ffffffff811f557b>] do_filp_open+0x4b/0xb0
[12476.393769]  [<ffffffff81202177>] ? __alloc_fd+0xa7/0x130
[12476.399913]  [<ffffffff811e2cc3>] do_sys_open+0xf3/0x1f0
[12476.405972]  [<ffffffff811e2dde>] SyS_open+0x1e/0x20
[12476.411650]  [<ffffffff81650a49>] system_call_fastpath+0x16/0x1b
[12476.418472] Code: f3 4c 8b 68 78 49 8b 45 08 4c 89 ef 4c 8b 78 48 e8 20 09 00 00 48 85 c0 0f 84 47 02 00 00 49 8b 47 28 48 85 c0 0f 84 ba 01 00 00 <4c> 8b 60 08 4d 85 e4 0f 84 ad 01 00 00 8b 43 44 a8 02 74 2e 41
[12476.442610] RIP  [<ffffffff812601a6>] sysfs_open_file+0x46/0x2b0
[12476.449436]  RSP <ffff88045e533c78>
[12476.453750] ---[ end trace 3f2d7ee3bfcdead8 ]---
[12476.453752] Kernel panic - not syncing: Fatal exception
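A short sketch of how the registers in the log above tie back to the kobject dump. RAX at the fault (2935586810ad0c76) equals the ktype value in the dumped kobject, R15 is the kobject address, and the marked bytes in the Code line (4c 8b 60 08) decode to a load from RAX+8. The check below assumes 48-bit virtual addressing on x86-64; is_canonical is a hypothetical helper name.

```python
def is_canonical(addr):
    """True if bits 63..47 of a 64-bit address are all equal.

    ASSUMPTION: x86-64 with 48-bit virtual addresses (4-level paging).
    """
    top = addr >> 47          # the upper 17 bits
    return top == 0 or top == 0x1FFFF


rax = 0x2935586810ad0c76      # RAX at the fault, same value as kobj.ktype
r15 = 0xffff880464552838      # R15, the kobject address from the dump

for name, value in (("RAX", rax), ("R15", r15)):
    kind = "canonical" if is_canonical(value) else "non-canonical"
    print(f"{name} = {value:#018x} is {kind}")
```

A load through a non-canonical address raises a general protection fault rather than a page fault, which would match the "general protection fault: 0000" line at the top of the log.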


* Re: [RFC] memory corruption caused by efi driver?
  2017-06-24  9:52 [RFC] memory corruption caused by efi driver? Yisheng Xie
@ 2017-06-24 11:12 ` Greg KH
  2017-06-25 13:06   ` Xishi Qiu
  0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2017-06-24 11:12 UTC
  To: Yisheng Xie
  Cc: matt, ard.biesheuvel, linux-efi, linux-kernel, Hanjun Guo, Xishi Qiu

On Sat, Jun 24, 2017 at 05:52:23PM +0800, Yisheng Xie wrote:
> hi all,
> 
> I hit an Oops with linux-3.10. The RIP is sysfs_open_file+0x46/0x2b0 (I will add the full
> crash log at the end of this mail).

3.10 is _very_ old and obsolete, can you duplicate this on a modern
kernel, like 4.11?

thanks,

greg k-h


* Re: [RFC] memory corruption caused by efi driver?
  2017-06-24 11:12 ` Greg KH
@ 2017-06-25 13:06   ` Xishi Qiu
  2017-06-25 13:31     ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Xishi Qiu @ 2017-06-25 13:06 UTC
  To: Greg KH
  Cc: Yisheng Xie, matt, ard.biesheuvel, linux-efi, linux-kernel, Hanjun Guo

On 2017/6/24 19:12, Greg KH wrote:

> On Sat, Jun 24, 2017 at 05:52:23PM +0800, Yisheng Xie wrote:
>> hi all,
>>
>> I hit an Oops with linux-3.10. The RIP is sysfs_open_file+0x46/0x2b0 (I will add the full
>> crash log at the end of this mail).
> 
> 3.10 is _very_ old and obsolete, can you duplicate this on a modern
> kernel, like 4.11?
> 
> thanks,
> 
> greg k-h
> 
> .
> 

Hi, if I disable CONFIG_EFI_VARS, it seems OK now.

And I can't reproduce the problem on mainline (v4.12).

Here is my test: run some stress tests, then
cat /sys/firmware/efi/efivars/*
or
cat /sys/firmware/efi/vars/*/*

1) 3.10, get warning
CONFIG_EFI_VARS=y
CONFIG_EFIVAR_FS=y

2) 3.10, get warning
CONFIG_EFI_VARS=y
CONFIG_EFIVAR_FS=n

3) 3.10, ok
CONFIG_EFI_VARS=n
CONFIG_EFIVAR_FS=y

4) mainline, ok
CONFIG_EFI_VARS=y
CONFIG_EFIVAR_FS=y
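The read step of the test above (the two cat commands) could be scripted roughly as follows. This is only a sketch of the read loop, not the stress test itself, which is not described in this mail; read_all is a hypothetical helper name.

```python
import glob
import os


def read_all(pattern):
    """Open and read every regular file matching pattern.

    Returns a list of (path, errno) pairs for the files that failed.
    """
    failures = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue              # skip directories such as vars/<name>-<guid>
        try:
            with open(path, "rb") as f:
                f.read()
        except OSError as exc:
            failures.append((path, exc.errno))
    return failures
```

Calling read_all("/sys/firmware/efi/efivars/*") and read_all("/sys/firmware/efi/vars/*/*") in a loop while the stress load runs would mirror the two cat commands. Note that a kernel Oops like the one reported earlier would kill the reading process, so the returned list only captures ordinary I/O errors.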

log:
[78872.389117] WARNING: at fs/sysfs/file.c:343 sysfs_open_file+0x222/0x2b0()
[78872.389118] missing sysfs attribute operations for kobject: (null)
[78872.389177] Modules linked in: gen_timer(OVE) tun zram(C) ext4 jbd2 mbcache loop regmap_i2c binfmt_misc scsi_transport_iscsi cfg80211 ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables sg iTCO_wdt ipmi_devintf iTCO_vendor_support vfat fat intel_powerclamp coretemp kvm_intel kvm nfsd crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ipmi_ssif aesni_intel lrw gf128mul auth_rpcgss glue_helper ablk_helper i7core_edac nfs_acl cryptd lpc_ich pcspkr
[78872.389197]  ipmi_si i2c_i801 edac_core shpchp mfd_core lockd ipmi_msghandler acpi_cpufreq grace sunrpc uinput xfs libcrc32c sd_mod sr_mod crc_t10dif cdrom crct10dif_common ixgbe igb ahci mdio libahci ptp i2c_algo_bit pps_core libata i2c_core megaraid_sas dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: gen_timer]
[78872.389202] CPU: 52 PID: 28434 Comm: cat Tainted: G        WC OE  ----V-------   3.10.0-327.55.58.81.x86_64 #2
[78872.389204] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Tecal RH5885 V2/CH91RGPUC, BIOS RGPUC-BIOS-V058 06/23/2013
[78872.389207]  ffff88200a61fc10 00000000df10e27d ffff88200a61fbc8 ffffffff8163ed14
[78872.389208]  ffff88200a61fc00 ffffffff8107b300 00000000fffffff3 ffff88103f6473a0
[78872.389209]  ffff8880236cb700 ffff88103f6473a0 ffff8860281d8838 ffff88200a61fc68
[78872.389210] Call Trace:
[78872.389224]  [<ffffffff8163ed14>] dump_stack+0x19/0x1b
[78872.389233]  [<ffffffff8107b300>] warn_slowpath_common+0x70/0xb0
[78872.389234]  [<ffffffff8107b39c>] warn_slowpath_fmt+0x5c/0x80
[78872.389236]  [<ffffffff8125f1d2>] sysfs_open_file+0x222/0x2b0
[78872.389242]  [<ffffffff811e0167>] do_dentry_open+0x1a7/0x2e0
[78872.389244]  [<ffffffff8125efb0>] ? sysfs_schedule_callback+0x1c0/0x1c0
[78872.389245]  [<ffffffff811e0399>] vfs_open+0x39/0x70
[78872.389251]  [<ffffffff811f183d>] do_last+0x1ed/0x12a0
[78872.389259]  [<ffffffff811c4ffe>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
[78872.389261]  [<ffffffff811f29b2>] path_openat+0xc2/0x490
[78872.389267]  [<ffffffff8112786d>] ? call_rcu_sched+0x1d/0x20
[78872.389275]  [<ffffffff8118484d>] ? shmem_destroy_inode+0x2d/0x40
[78872.389281]  [<ffffffff811fe4c6>] ? evict+0x106/0x170
[78872.389283]  [<ffffffff811f417b>] do_filp_open+0x4b/0xb0
[78872.389286]  [<ffffffff81200d97>] ? __alloc_fd+0xa7/0x130
[78872.389290]  [<ffffffff811e1863>] do_sys_open+0xf3/0x1f0
[78872.389291]  [<ffffffff811e197e>] SyS_open+0x1e/0x20
[78872.389297]  [<ffffffff8164f109>] system_call_fastpath+0x16/0x1b
[78872.389298] ---[ end trace cbe34632be0fdedf ]---
[78872.390067] ------------[ cut here ]------------


* Re: [RFC] memory corruption caused by efi driver?
  2017-06-25 13:06   ` Xishi Qiu
@ 2017-06-25 13:31     ` Greg KH
  0 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2017-06-25 13:31 UTC
  To: Xishi Qiu
  Cc: Yisheng Xie, matt, ard.biesheuvel, linux-efi, linux-kernel, Hanjun Guo

On Sun, Jun 25, 2017 at 09:06:58PM +0800, Xishi Qiu wrote:
> On 2017/6/24 19:12, Greg KH wrote:
> 
> > On Sat, Jun 24, 2017 at 05:52:23PM +0800, Yisheng Xie wrote:
> >> hi all,
> >>
> >> I hit an Oops with linux-3.10. The RIP is sysfs_open_file+0x46/0x2b0 (I will add the full
> >> crash log at the end of this mail).
> > 
> > 3.10 is _very_ old and obsolete, can you duplicate this on a modern
> > kernel, like 4.11?
> > 
> > thanks,
> > 
> > greg k-h
> > 
> > .
> > 
> 
> Hi, if I disable CONFIG_EFI_VARS, it seems OK now.
> 
> And I can't reproduce the problem on mainline (v4.12).
> 
> Here is my test: run some stress tests, then
> cat /sys/firmware/efi/efivars/*
> or
> cat /sys/firmware/efi/vars/*/*
> 
> 1) 3.10, get warning
> CONFIG_EFI_VARS=y
> CONFIG_EFIVAR_FS=y
> 
> 2) 3.10, get warning
> CONFIG_EFI_VARS=y
> CONFIG_EFIVAR_FS=n
> 
> 3) 3.10, ok
> CONFIG_EFI_VARS=n
> CONFIG_EFIVAR_FS=y
> 
> 4) mainline, ok
> CONFIG_EFI_VARS=y
> CONFIG_EFIVAR_FS=y

Then use mainline :)
