* [BUG] kernel NULL pointer dereference observed during pmem btt switch test
       [not found] ` <622794958.9574724.1469674652262.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-07-28  3:20   ` Yi Zhang
  2016-07-28 15:50     ` Dan Williams
  0 siblings, 1 reply; 9+ messages in thread
From: Yi Zhang @ 2016-07-28  3:20 UTC (permalink / raw)
  To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w

Hello everyone,

Could you help check this issue? Thanks.

Steps I used:
1. Reserve four 8G regions of memory for pmem by adding the kernel parameters "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
2. Execute the script below:
#!/bin/bash
pmem_btt_switch() {
	sector_size_list="512 520 528 4096 4104 4160 4224"
	for sector_size in $sector_size_list; do
		ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
		ndctl create-namespace -f -e namespace${1}.0 --mode=raw
	done
}

for i in 0 1 2 3; do
	pmem_btt_switch $i &
done

KERNEL log:
[  243.404847] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  243.467271] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  243.513412] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  243.544728] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  243.545371] ------------[ cut here ]------------
[  243.545381] WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
[  243.545382] sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1'
[  243.545432] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw nd_pmem gf128mul glue_helper ablk_helper cryptd nd_btt hpilo iTCO_wdt iTCO_vendor_support sg hpwdt pcspkr ipmi_ssif ioatdma wmi pcc_cpufreq acpi_cpufreq acpi_power_meter lpc_ich ipmi_si ipmi_msghandler mfd_core shpchp dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel tg3 serio_raw hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
[  243.545435] CPU: 10 PID: 2078 Comm: ndctl Not tainted 4.7.0-rc7 #1
[  243.545436] Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
[  243.545439]  0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec
[  243.545441]  ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351
[  243.545442]  0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8
[  243.545443] Call Trace:
[  243.545452]  [<ffffffff8134caec>] dump_stack+0x63/0x87
[  243.545460]  [<ffffffff8108c351>] __warn+0xd1/0xf0
[  243.545463]  [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
[  243.545465]  [<ffffffff812a0d34>] sysfs_warn_dup+0x64/0x80
[  243.545466]  [<ffffffff812a0e1e>] sysfs_create_dir_ns+0x7e/0x90
[  243.545469]  [<ffffffff8134faaa>] kobject_add_internal+0xaa/0x320
[  243.545473]  [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
[  243.545475]  [<ffffffff8134ff55>] kobject_add+0x75/0xd0
[  243.545483]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
[  243.545489]  [<ffffffff8148b0a5>] device_add+0x125/0x610
[  243.545491]  [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
[  243.545492]  [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
[  243.545498]  [<ffffffff811b775c>] bdi_register+0x8c/0x180
[  243.545500]  [<ffffffff811b7877>] bdi_register_dev+0x27/0x30
[  243.545505]  [<ffffffff813317f5>] add_disk+0x175/0x4a0
[  243.545507]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
[  243.545513]  [<ffffffff814afb7f>] ? nvdimm_bus_unlock+0x1f/0x30
[  243.545518]  [<ffffffffa04e039f>] nd_pmem_probe+0x28f/0x360 [nd_pmem]
[  243.545521]  [<ffffffff814b0599>] nvdimm_bus_probe+0x69/0x120
[  243.545524]  [<ffffffff8148e779>] driver_probe_device+0x239/0x460
[  243.545526]  [<ffffffff8148c974>] bind_store+0xd4/0x110
[  243.545528]  [<ffffffff8148c054>] drv_attr_store+0x24/0x30
[  243.545529]  [<ffffffff812a042a>] sysfs_kf_write+0x3a/0x50
[  243.545531]  [<ffffffff8129fa3b>] kernfs_fop_write+0x11b/0x1a0
[  243.545536]  [<ffffffff8121d5e7>] __vfs_write+0x37/0x160
[  243.545544]  [<ffffffff812ceadd>] ? security_file_permission+0x3d/0xc0
[  243.545550]  [<ffffffff810d7e1f>] ? percpu_down_read+0x1f/0x50
[  243.545552]  [<ffffffff8121e8e2>] vfs_write+0xb2/0x1b0
[  243.545555]  [<ffffffff8121fd35>] SyS_write+0x55/0xc0
[  243.545560]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[  243.545563]  [<ffffffff816e85e1>] entry_SYSCALL64_slow_path+0x25/0x25
[  243.545579] ---[ end trace 6d3b90c425a39fda ]---
[  243.545580] ------------[ cut here ]------------
[  243.545583] WARNING: CPU: 10 PID: 2078 at lib/kobject.c:240 kobject_add_internal+0x262/0x320
[  243.545584] kobject_add_internal failed for 259:1 with -EEXIST, don't try to register things with the same name in the same directory.
[  243.545603] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw nd_pmem gf128mul glue_helper ablk_helper cryptd nd_btt hpilo iTCO_wdt iTCO_vendor_support sg hpwdt pcspkr ipmi_ssif ioatdma wmi pcc_cpufreq acpi_cpufreq acpi_power_meter lpc_ich ipmi_si ipmi_msghandler mfd_core shpchp dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel tg3 serio_raw hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
[  243.545605] CPU: 10 PID: 2078 Comm: ndctl Tainted: G        W       4.7.0-rc7 #1
[  243.545605] Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
[  243.545607]  0000000000000286 0000000002c04ad5 ffff88006f24f9c0 ffffffff8134caec
[  243.545608]  ffff88006f24fa10 0000000000000000 ffff88006f24fa00 ffffffff8108c351
[  243.545610]  000000f06f24fa28 ffff880035164010 ffff88006c7e3780 00000000ffffffef
[  243.545610] Call Trace:
[  243.545612]  [<ffffffff8134caec>] dump_stack+0x63/0x87
[  243.545614]  [<ffffffff8108c351>] __warn+0xd1/0xf0
[  243.545616]  [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
[  243.545618]  [<ffffffff812a0d3c>] ? sysfs_warn_dup+0x6c/0x80
[  243.545619]  [<ffffffff8134fc62>] kobject_add_internal+0x262/0x320
[  243.545621]  [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
[  243.545622]  [<ffffffff8134ff55>] kobject_add+0x75/0xd0
[  243.545625]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
[  243.545626]  [<ffffffff8148b0a5>] device_add+0x125/0x610
[  243.545628]  [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
[  243.545630]  [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
[  243.545632]  [<ffffffff811b775c>] bdi_register+0x8c/0x180
[  243.545634]  [<ffffffff811b7877>] bdi_register_dev+0x27/0x30
[  243.545636]  [<ffffffff813317f5>] add_disk+0x175/0x4a0
[  243.545638]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
[  243.545640]  [<ffffffff814afb7f>] ? nvdimm_bus_unlock+0x1f/0x30
[  243.545642]  [<ffffffffa04e039f>] nd_pmem_probe+0x28f/0x360 [nd_pmem]
[  243.545644]  [<ffffffff814b0599>] nvdimm_bus_probe+0x69/0x120
[  243.545646]  [<ffffffff8148e779>] driver_probe_device+0x239/0x460
[  243.545648]  [<ffffffff8148c974>] bind_store+0xd4/0x110
[  243.545649]  [<ffffffff8148c054>] drv_attr_store+0x24/0x30
[  243.545651]  [<ffffffff812a042a>] sysfs_kf_write+0x3a/0x50
[  243.545652]  [<ffffffff8129fa3b>] kernfs_fop_write+0x11b/0x1a0
[  243.545654]  [<ffffffff8121d5e7>] __vfs_write+0x37/0x160
[  243.545657]  [<ffffffff812ceadd>] ? security_file_permission+0x3d/0xc0
[  243.545659]  [<ffffffff810d7e1f>] ? percpu_down_read+0x1f/0x50
[  243.545661]  [<ffffffff8121e8e2>] vfs_write+0xb2/0x1b0
[  243.545663]  [<ffffffff8121fd35>] SyS_write+0x55/0xc0
[  243.545665]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[  243.545666]  [<ffffffff816e85e1>] entry_SYSCALL64_slow_path+0x25/0x25
[  243.545667] ---[ end trace 6d3b90c425a39fdb ]---
[  243.577109] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[  243.577117] IP: [<ffffffff812a1054>] sysfs_do_create_link_sd.isra.2+0x34/0xb0
[  243.577119] PGD 1057752067 PUD 105e37a067 PMD 0 
[  243.577121] Oops: 0000 [#1] SMP
[  243.577154] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw nd_pmem gf128mul glue_helper ablk_helper cryptd nd_btt hpilo iTCO_wdt iTCO_vendor_support sg hpwdt pcspkr ipmi_ssif ioatdma wmi pcc_cpufreq acpi_cpufreq acpi_power_meter lpc_ich ipmi_si ipmi_msghandler mfd_core shpchp dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel tg3 serio_raw hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
[  243.577157] CPU: 6 PID: 2078 Comm: ndctl Tainted: G        W       4.7.0-rc7 #1
[  243.577158] Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
[  243.577159] task: ffff8800340c8000 ti: ffff88006f24c000 task.ti: ffff88006f24c000
[  243.577162] RIP: 0010:[<ffffffff812a1054>]  [<ffffffff812a1054>] sysfs_do_create_link_sd.isra.2+0x34/0xb0
[  243.577163] RSP: 0018:ffff88006f24fc28  EFLAGS: 00010246
[  243.577164] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000001
[  243.577164] RDX: 0000000000000001 RSI: 0000000000000040 RDI: ffffffff822411f0
[  243.577165] RBP: ffff88006f24fc50 R08: ffff8800690f1711 R09: ffffffff8134e82e
[  243.577166] R10: ffff88007799b640 R11: ffffea0000d46000 R12: ffffffff81a3dc3c
[  243.577166] R13: ffff88105ae627f8 R14: 0000000000000001 R15: ffff880034a89040
[  243.577168] FS:  00007f685b5dc780(0000) GS:ffff880077980000(0000) knlGS:0000000000000000
[  243.577168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  243.577169] CR2: 0000000000000040 CR3: 000000105bb0b000 CR4: 00000000001406e0
[  243.577170] Stack:
[  243.577172]  ffff880070666000 ffff880070666080 ffff88006a0635d0 ffff88007066600c
[  243.577173]  ffff880034a89040 ffff88006f24fc60 ffffffff812a10f5 ffff88006f24fcc8
[  243.577175]  ffffffff8133188b ffff880070666000 1030000135282c00 ffff880070666000
[  243.577175] Call Trace:
[  243.577179]  [<ffffffff812a10f5>] sysfs_create_link+0x25/0x40
[  243.577184]  [<ffffffff8133188b>] add_disk+0x20b/0x4a0
[  243.577189]  [<ffffffffa04e039f>] nd_pmem_probe+0x28f/0x360 [nd_pmem]
[  243.577194]  [<ffffffff814b0599>] nvdimm_bus_probe+0x69/0x120
[  243.577198]  [<ffffffff8148e779>] driver_probe_device+0x239/0x460
[  243.577200]  [<ffffffff8148c974>] bind_store+0xd4/0x110
[  243.577202]  [<ffffffff8148c054>] drv_attr_store+0x24/0x30
[  243.577203]  [<ffffffff812a042a>] sysfs_kf_write+0x3a/0x50
[  243.577205]  [<ffffffff8129fa3b>] kernfs_fop_write+0x11b/0x1a0
[  243.577209]  [<ffffffff8121d5e7>] __vfs_write+0x37/0x160
[  243.577215]  [<ffffffff812ceadd>] ? security_file_permission+0x3d/0xc0
[  243.577220]  [<ffffffff810d7e1f>] ? percpu_down_read+0x1f/0x50
[  243.577222]  [<ffffffff8121e8e2>] vfs_write+0xb2/0x1b0
[  243.577224]  [<ffffffff8121fd35>] SyS_write+0x55/0xc0
[  243.577229]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[  243.577232]  [<ffffffff816e85e1>] entry_SYSCALL64_slow_path+0x25/0x25
[  243.577248] Code: 48 89 e5 41 57 41 56 41 55 41 54 49 89 d4 53 74 73 48 85 ff 49 89 fd 74 6b 48 89 f3 48 c7 c7 f0 11 24 82 41 89 ce e8 7c 72 44 00 <48> 8b 1b 48 85 db 74 08 48 89 df e8 ac c1 ff ff 48 c7 c7 f0 11 
[  243.577250] RIP  [<ffffffff812a1054>] sysfs_do_create_link_sd.isra.2+0x34/0xb0
[  243.577251]  RSP <ffff88006f24fc28>
[  243.577251] CR2: 0000000000000040
[  243.577285] ---[ end trace 6d3b90c425a39fdc ]---
[  243.578932] Kernel panic - not syncing: Fatal exception
[  243.597839] Kernel Offset: disabled
[  247.934728] ---[ end Kernel panic - not syncing: Fatal exception

Best Regards,
  Yi Zhang

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test
  2016-07-28  3:20   ` [BUG] kernel NULL pointer dereference observed during pmem btt switch test Yi Zhang
@ 2016-07-28 15:50     ` Dan Williams
       [not found]       ` <CAPcyv4g5PpShWfXSV+KPJYW7GFrejUNjk=C1-ak=88iX8XczGA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Dan Williams @ 2016-07-28 15:50 UTC (permalink / raw)
  To: Yi Zhang; +Cc: linux-nvdimm, linux-block

[ adding linux-block ]

On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan@redhat.com> wrote:
> Hello everyone,
>
> Could you help check this issue? Thanks.
>
> Steps I used:
> 1. Reserve four 8G regions of memory for pmem by adding the kernel parameters "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
> 2. Execute the script below:
> #!/bin/bash
> pmem_btt_switch() {
>         sector_size_list="512 520 528 4096 4104 4160 4224"
>         for sector_size in $sector_size_list; do
>                 ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
>                 ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>         done
> }
>
> for i in 0 1 2 3; do
>         pmem_btt_switch $i &
> done

Thanks for the report.  This looks like del_gendisk() frees the
previous usage of the devt before the bdi is unregistered.  This
appears to be a general problem with all block drivers, not just
libnvdimm, since blk_cleanup_queue() is typically called after
del_gendisk().  I.e. it will always be the case that the bdi
registered with the devt allocated at add_disk() will still be alive
when del_gendisk()->disk_release() frees the previous devt number.

I *think* the path forward is to allow the bdi to hold a reference
against the blk_alloc_devt() allocation until it is done with it.  Any
other ideas on fixing this object lifetime problem?

>
> KERNEL log:
> [  243.404847] nd_pmem namespace2.0: unable to guarantee persistence of writes
> [  243.467271] nd_pmem namespace3.0: unable to guarantee persistence of writes
> [  243.513412] nd_pmem namespace1.0: unable to guarantee persistence of writes
> [  243.544728] nd_pmem namespace0.0: unable to guarantee persistence of writes
> [  243.545371] ------------[ cut here ]------------
> [  243.545381] WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
> [  243.545382] sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1'
> [  243.545432] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw nd_pmem gf128mul glue_helper ablk_helper cryptd nd_btt hpilo iTCO_wdt iTCO_vendor_support sg hpwdt pcspkr ipmi_ssif ioatdma wmi pcc_cpufreq acpi_cpufreq acpi_power_meter lpc_ich ipmi_si ipmi_msghandler mfd_core shpchp dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel tg3 serio_raw hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
> [  243.545435] CPU: 10 PID: 2078 Comm: ndctl Not tainted 4.7.0-rc7 #1
> [  243.545436] Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
> [  243.545439]  0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec
> [  243.545441]  ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351
> [  243.545442]  0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8
> [  243.545443] Call Trace:
> [  243.545452]  [<ffffffff8134caec>] dump_stack+0x63/0x87
> [  243.545460]  [<ffffffff8108c351>] __warn+0xd1/0xf0
> [  243.545463]  [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
> [  243.545465]  [<ffffffff812a0d34>] sysfs_warn_dup+0x64/0x80
> [  243.545466]  [<ffffffff812a0e1e>] sysfs_create_dir_ns+0x7e/0x90
> [  243.545469]  [<ffffffff8134faaa>] kobject_add_internal+0xaa/0x320
> [  243.545473]  [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
> [  243.545475]  [<ffffffff8134ff55>] kobject_add+0x75/0xd0
> [  243.545483]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
> [  243.545489]  [<ffffffff8148b0a5>] device_add+0x125/0x610
> [  243.545491]  [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
> [  243.545492]  [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
> [  243.545498]  [<ffffffff811b775c>] bdi_register+0x8c/0x180
> [  243.545500]  [<ffffffff811b7877>] bdi_register_dev+0x27/0x30
> [  243.545505]  [<ffffffff813317f5>] add_disk+0x175/0x4a0
> [  243.545507]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
> [  243.545513]  [<ffffffff814afb7f>] ? nvdimm_bus_unlock+0x1f/0x30
> [  243.545518]  [<ffffffffa04e039f>] nd_pmem_probe+0x28f/0x360 [nd_pmem]
> [  243.545521]  [<ffffffff814b0599>] nvdimm_bus_probe+0x69/0x120
> [  243.545524]  [<ffffffff8148e779>] driver_probe_device+0x239/0x460
> [  243.545526]  [<ffffffff8148c974>] bind_store+0xd4/0x110
> [  243.545528]  [<ffffffff8148c054>] drv_attr_store+0x24/0x30
> [  243.545529]  [<ffffffff812a042a>] sysfs_kf_write+0x3a/0x50
> [  243.545531]  [<ffffffff8129fa3b>] kernfs_fop_write+0x11b/0x1a0
> [  243.545536]  [<ffffffff8121d5e7>] __vfs_write+0x37/0x160
> [  243.545544]  [<ffffffff812ceadd>] ? security_file_permission+0x3d/0xc0
> [  243.545550]  [<ffffffff810d7e1f>] ? percpu_down_read+0x1f/0x50
> [  243.545552]  [<ffffffff8121e8e2>] vfs_write+0xb2/0x1b0
> [  243.545555]  [<ffffffff8121fd35>] SyS_write+0x55/0xc0
> [  243.545560]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
> [  243.545563]  [<ffffffff816e85e1>] entry_SYSCALL64_slow_path+0x25/0x25
> [  243.545579] ---[ end trace 6d3b90c425a39fda ]---
> [  243.545580] ------------[ cut here ]------------
> [  243.545583] WARNING: CPU: 10 PID: 2078 at lib/kobject.c:240 kobject_add_internal+0x262/0x320
> [  243.545584] kobject_add_internal failed for 259:1 with -EEXIST, don't try to register things with the same name in the same directory.
> [  243.545603] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw nd_pmem gf128mul glue_helper ablk_helper cryptd nd_btt hpilo iTCO_wdt iTCO_vendor_support sg hpwdt pcspkr ipmi_ssif ioatdma wmi pcc_cpufreq acpi_cpufreq acpi_power_meter lpc_ich ipmi_si ipmi_msghandler mfd_core shpchp dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel tg3 serio_raw hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
> [  243.545605] CPU: 10 PID: 2078 Comm: ndctl Tainted: G        W       4.7.0-rc7 #1
> [  243.545605] Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
> [  243.545607]  0000000000000286 0000000002c04ad5 ffff88006f24f9c0 ffffffff8134caec
> [  243.545608]  ffff88006f24fa10 0000000000000000 ffff88006f24fa00 ffffffff8108c351
> [  243.545610]  000000f06f24fa28 ffff880035164010 ffff88006c7e3780 00000000ffffffef
> [  243.545610] Call Trace:
> [  243.545612]  [<ffffffff8134caec>] dump_stack+0x63/0x87
> [  243.545614]  [<ffffffff8108c351>] __warn+0xd1/0xf0
> [  243.545616]  [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
> [  243.545618]  [<ffffffff812a0d3c>] ? sysfs_warn_dup+0x6c/0x80
> [  243.545619]  [<ffffffff8134fc62>] kobject_add_internal+0x262/0x320
> [  243.545621]  [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
> [  243.545622]  [<ffffffff8134ff55>] kobject_add+0x75/0xd0
> [  243.545625]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
> [  243.545626]  [<ffffffff8148b0a5>] device_add+0x125/0x610
> [  243.545628]  [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
> [  243.545630]  [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
> [  243.545632]  [<ffffffff811b775c>] bdi_register+0x8c/0x180
> [  243.545634]  [<ffffffff811b7877>] bdi_register_dev+0x27/0x30
> [  243.545636]  [<ffffffff813317f5>] add_disk+0x175/0x4a0
> [  243.545638]  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
> [  243.545640]  [<ffffffff814afb7f>] ? nvdimm_bus_unlock+0x1f/0x30
> [  243.545642]  [<ffffffffa04e039f>] nd_pmem_probe+0x28f/0x360 [nd_pmem]
> [  243.545644]  [<ffffffff814b0599>] nvdimm_bus_probe+0x69/0x120
> [  243.545646]  [<ffffffff8148e779>] driver_probe_device+0x239/0x460
> [  243.545648]  [<ffffffff8148c974>] bind_store+0xd4/0x110
> [  243.545649]  [<ffffffff8148c054>] drv_attr_store+0x24/0x30
> [  243.545651]  [<ffffffff812a042a>] sysfs_kf_write+0x3a/0x50
> [  243.545652]  [<ffffffff8129fa3b>] kernfs_fop_write+0x11b/0x1a0
> [  243.545654]  [<ffffffff8121d5e7>] __vfs_write+0x37/0x160
> [  243.545657]  [<ffffffff812ceadd>] ? security_file_permission+0x3d/0xc0
> [  243.545659]  [<ffffffff810d7e1f>] ? percpu_down_read+0x1f/0x50
> [  243.545661]  [<ffffffff8121e8e2>] vfs_write+0xb2/0x1b0
> [  243.545663]  [<ffffffff8121fd35>] SyS_write+0x55/0xc0
> [  243.545665]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
> [  243.545666]  [<ffffffff816e85e1>] entry_SYSCALL64_slow_path+0x25/0x25
> [  243.545667] ---[ end trace 6d3b90c425a39fdb ]---
> [  243.577109] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> [  243.577117] IP: [<ffffffff812a1054>] sysfs_do_create_link_sd.isra.2+0x34/0xb0
> [  243.577119] PGD 1057752067 PUD 105e37a067 PMD 0
> [  243.577121] Oops: 0000 [#1] SMP
> [  243.577154] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw nd_pmem gf128mul glue_helper ablk_helper cryptd nd_btt hpilo iTCO_wdt iTCO_vendor_support sg hpwdt pcspkr ipmi_ssif ioatdma wmi pcc_cpufreq acpi_cpufreq acpi_power_meter lpc_ich ipmi_si ipmi_msghandler mfd_core shpchp dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel tg3 serio_raw hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
> [  243.577157] CPU: 6 PID: 2078 Comm: ndctl Tainted: G        W       4.7.0-rc7 #1
> [  243.577158] Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
> [  243.577159] task: ffff8800340c8000 ti: ffff88006f24c000 task.ti: ffff88006f24c000
> [  243.577162] RIP: 0010:[<ffffffff812a1054>]  [<ffffffff812a1054>] sysfs_do_create_link_sd.isra.2+0x34/0xb0
> [  243.577163] RSP: 0018:ffff88006f24fc28  EFLAGS: 00010246
> [  243.577164] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000001
> [  243.577164] RDX: 0000000000000001 RSI: 0000000000000040 RDI: ffffffff822411f0
> [  243.577165] RBP: ffff88006f24fc50 R08: ffff8800690f1711 R09: ffffffff8134e82e
> [  243.577166] R10: ffff88007799b640 R11: ffffea0000d46000 R12: ffffffff81a3dc3c
> [  243.577166] R13: ffff88105ae627f8 R14: 0000000000000001 R15: ffff880034a89040
> [  243.577168] FS:  00007f685b5dc780(0000) GS:ffff880077980000(0000) knlGS:0000000000000000
> [  243.577168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  243.577169] CR2: 0000000000000040 CR3: 000000105bb0b000 CR4: 00000000001406e0
> [  243.577170] Stack:
> [  243.577172]  ffff880070666000 ffff880070666080 ffff88006a0635d0 ffff88007066600c
> [  243.577173]  ffff880034a89040 ffff88006f24fc60 ffffffff812a10f5 ffff88006f24fcc8
> [  243.577175]  ffffffff8133188b ffff880070666000 1030000135282c00 ffff880070666000
> [  243.577175] Call Trace:
> [  243.577179]  [<ffffffff812a10f5>] sysfs_create_link+0x25/0x40
> [  243.577184]  [<ffffffff8133188b>] add_disk+0x20b/0x4a0
> [  243.577189]  [<ffffffffa04e039f>] nd_pmem_probe+0x28f/0x360 [nd_pmem]
> [  243.577194]  [<ffffffff814b0599>] nvdimm_bus_probe+0x69/0x120
> [  243.577198]  [<ffffffff8148e779>] driver_probe_device+0x239/0x460
> [  243.577200]  [<ffffffff8148c974>] bind_store+0xd4/0x110
> [  243.577202]  [<ffffffff8148c054>] drv_attr_store+0x24/0x30
> [  243.577203]  [<ffffffff812a042a>] sysfs_kf_write+0x3a/0x50
> [  243.577205]  [<ffffffff8129fa3b>] kernfs_fop_write+0x11b/0x1a0
> [  243.577209]  [<ffffffff8121d5e7>] __vfs_write+0x37/0x160
> [  243.577215]  [<ffffffff812ceadd>] ? security_file_permission+0x3d/0xc0
> [  243.577220]  [<ffffffff810d7e1f>] ? percpu_down_read+0x1f/0x50
> [  243.577222]  [<ffffffff8121e8e2>] vfs_write+0xb2/0x1b0
> [  243.577224]  [<ffffffff8121fd35>] SyS_write+0x55/0xc0
> [  243.577229]  [<ffffffff81003b12>] do_syscall_64+0x62/0x110
> [  243.577232]  [<ffffffff816e85e1>] entry_SYSCALL64_slow_path+0x25/0x25
> [  243.577248] Code: 48 89 e5 41 57 41 56 41 55 41 54 49 89 d4 53 74 73 48 85 ff 49 89 fd 74 6b 48 89 f3 48 c7 c7 f0 11 24 82 41 89 ce e8 7c 72 44 00 <48> 8b 1b 48 85 db 74 08 48 89 df e8 ac c1 ff ff 48 c7 c7 f0 11
> [  243.577250] RIP  [<ffffffff812a1054>] sysfs_do_create_link_sd.isra.2+0x34/0xb0
> [  243.577251]  RSP <ffff88006f24fc28>
> [  243.577251] CR2: 0000000000000040
> [  243.577285] ---[ end trace 6d3b90c425a39fdc ]---
> [  243.578932] Kernel panic - not syncing: Fatal exception
> [  243.597839] Kernel Offset: disabled
> [  247.934728] ---[ end Kernel panic - not syncing: Fatal exception
>
> Best Regards,
>   Yi Zhang
>
>


* Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test
@ 2016-07-30 15:52           ` Dan Williams
  0 siblings, 0 replies; 9+ messages in thread
From: Dan Williams @ 2016-07-30 15:52 UTC (permalink / raw)
  To: Yi Zhang; +Cc: linux-nvdimm, linux-block

[-- Attachment #1: Type: text/plain, Size: 1531 bytes --]

On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> [ adding linux-block ]
>
> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan@redhat.com> wrote:
>> Hello everyone,
>>
>> Could you help check this issue? Thanks.
>>
>> Steps I used:
>> 1. Reserve four 8G regions of memory for pmem by adding the kernel parameters "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>> 2. Execute the script below:
>> #!/bin/bash
>> pmem_btt_switch() {
>>         sector_size_list="512 520 528 4096 4104 4160 4224"
>>         for sector_size in $sector_size_list; do
>>                 ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
>>                 ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>         done
>> }
>>
>> for i in 0 1 2 3; do
>>         pmem_btt_switch $i &
>> done
>
> Thanks for the report.  This looks like del_gendisk() frees the
> previous usage of the devt before the bdi is unregistered.  This
> appears to be a general problem with all block drivers, not just
> libnvdimm, since blk_cleanup_queue() is typically called after
> del_gendisk().  I.e. it will always be the case that the bdi
> registered with the devt allocated at add_disk() will still be alive
> when del_gendisk()->disk_release() frees the previous devt number.
>
> I *think* the path forward is to allow the bdi to hold a reference
> against the blk_alloc_devt() allocation until it is done with it.  Any
> other ideas on fixing this object lifetime problem?

Does the attached patch solve this for you?

[-- Attachment #2: 0001-block-fix-bdi-vs-gendisk-lifetime-mismatch.patch --]
[-- Type: text/x-patch, Size: 4446 bytes --]

From 44bcbf8c531e9249d09e6bf502d3696668f3d22c Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams@intel.com>
Date: Sat, 30 Jul 2016 08:23:06 -0700
Subject: [PATCH] block: fix bdi vs gendisk lifetime mismatch

The bdi for a gendisk is named after the gendisk.  However, since the
gendisk is destroyed before the bdi, this leaves a window where a new
gendisk could dynamically reuse the same devt while a bdi with the same
name is still live.  Arrange for the bdi to hold a reference against
its "owner" disk device while it is registered.  Otherwise we can hit
sysfs duplicate name collisions like the following:

 WARNING: CPU: 10 PID: 2078 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
 sysfs: cannot create duplicate filename '/devices/virtual/bdi/259:1'

 Hardware name: HP ProLiant DL580 Gen8, BIOS P79 05/06/2015
  0000000000000286 0000000002c04ad5 ffff88006f24f970 ffffffff8134caec
  ffff88006f24f9c0 0000000000000000 ffff88006f24f9b0 ffffffff8108c351
  0000001f0000000c ffff88105d236000 ffff88105d1031e0 ffff8800357427f8
 Call Trace:
  [<ffffffff8134caec>] dump_stack+0x63/0x87
  [<ffffffff8108c351>] __warn+0xd1/0xf0
  [<ffffffff8108c3cf>] warn_slowpath_fmt+0x5f/0x80
  [<ffffffff812a0d34>] sysfs_warn_dup+0x64/0x80
  [<ffffffff812a0e1e>] sysfs_create_dir_ns+0x7e/0x90
  [<ffffffff8134faaa>] kobject_add_internal+0xaa/0x320
  [<ffffffff81358d4e>] ? vsnprintf+0x34e/0x4d0
  [<ffffffff8134ff55>] kobject_add+0x75/0xd0
  [<ffffffff816e66b2>] ? mutex_lock+0x12/0x2f
  [<ffffffff8148b0a5>] device_add+0x125/0x610
  [<ffffffff8148b788>] device_create_groups_vargs+0xd8/0x100
  [<ffffffff8148b7cc>] device_create_vargs+0x1c/0x20
  [<ffffffff811b775c>] bdi_register+0x8c/0x180
  [<ffffffff811b7877>] bdi_register_dev+0x27/0x30
  [<ffffffff813317f5>] add_disk+0x175/0x4a0

Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 block/genhd.c                    |  2 +-
 include/linux/backing-dev-defs.h |  1 +
 include/linux/backing-dev.h      |  1 +
 mm/backing-dev.c                 | 18 ++++++++++++++++++
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 3c9dede4e04f..f6f7ffcd4eab 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -614,7 +614,7 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 
 	/* Register BDI before referencing it from bdev */
 	bdi = &disk->queue->backing_dev_info;
-	bdi_register_dev(bdi, disk_devt(disk));
+	bdi_register_owner(bdi, disk_to_dev(disk));
 
 	blk_register_region(disk_devt(disk), disk->minors, NULL,
 			    exact_match, exact_lock, disk);
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 3f103076d0bf..c357f27d5483 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -163,6 +163,7 @@ struct backing_dev_info {
 	wait_queue_head_t wb_waitq;
 
 	struct device *dev;
+	struct device *owner;
 
 	struct timer_list laptop_mode_wb_timer;
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 491a91717788..43b93a947e61 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -24,6 +24,7 @@ __printf(3, 4)
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
 		const char *fmt, ...);
 int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev);
+int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner);
 void bdi_unregister(struct backing_dev_info *bdi);
 
 int __must_check bdi_setup_and_register(struct backing_dev_info *, char *);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index efe237742074..7b51cb7905be 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -825,6 +825,19 @@ int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev)
 }
 EXPORT_SYMBOL(bdi_register_dev);
 
+int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner)
+{
+	int rc;
+
+	rc = bdi_register(bdi, NULL, "%u:%u", MAJOR(owner->devt),
+			MINOR(owner->devt));
+	if (rc)
+		return rc;
+	bdi->owner = owner;
+	get_device(owner);
+	return 0;
+}
+EXPORT_SYMBOL(bdi_register_owner);
+
 /*
  * Remove bdi from bdi_list, and ensure that it is no longer visible
  */
@@ -849,6 +862,11 @@ void bdi_unregister(struct backing_dev_info *bdi)
 		device_unregister(bdi->dev);
 		bdi->dev = NULL;
 	}
+
+	if (bdi->owner) {
+		put_device(bdi->owner);
+		bdi->owner = NULL;
+	}
 }
 
 void bdi_exit(struct backing_dev_info *bdi)
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test
  2016-07-30 15:52           ` Dan Williams
  (?)
@ 2016-07-31 17:19           ` yizhan
       [not found]             ` <d83eeac2-49dc-4c45-9aea-7d68da4fbf7d-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  -1 siblings, 1 reply; 9+ messages in thread
From: yizhan @ 2016-07-31 17:19 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, linux-block

[-- Attachment #1: Type: text/plain, Size: 2177 bytes --]

On 07/30/2016 11:52 PM, Dan Williams wrote:
> On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>> [ adding linux-block ]
>>
>> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan@redhat.com> wrote:
>>> Hello everyone
>>>
>>> Could you help check this issue, thanks.
>>>
>>> Steps I used:
>>> 1. Reserve 4*8G of memory for pmem by add kernel parameter "memmap=8G!4G memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>>> 2. Execute below script
>>> #!/bin/bash
>>> pmem_btt_switch() {
>>>          sector_size_list="512 520 528 4096 4104 4160 4224"
>>>          for sector_size in $sector_size_list; do
>>>                  ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
>>>                  ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>>          done
>>> }
>>>
>>> for i in 0 1 2 3; do
>>>          pmem_btt_switch $i &
>>> done
>> Thanks for the report.  This looks like del_gendisk() frees the
>> previous usage of the devt before the bdi is unregistered.  This
>> appears to be a general problem with all block drivers, not just
>> libnvdimm, since blk_cleanup_queue() is typically called after
>> del_gendisk().  I.e. it will always be the case that the bdi
>> registered with the devt allocated at add_disk() will still be alive
>> when del_gendisk()->disk_release() frees the previous devt number.
>>
>> I *think* the path forward is to allow the bdi to hold a reference
>> against the blk_alloc_devt() allocation until it is done with it.  Any
>> other ideas on fixing this object lifetime problem?
> Does the attached patch solve this for you?
Hi Dan,
This patch works; the issue can no longer be reproduced after several 
test runs, thanks.

One more thing: while verifying the fix I saw the error messages below; 
could you check whether they are expected:
[  150.464620] Dev pmem1: unable to read RDB block 0
[  150.486897]  pmem1: unable to read partition table
[  150.486901] pmem1: partition table beyond EOD, truncated
[  151.133287] Buffer I/O error on dev pmem3, logical block 2, async 
page read
[  151.164620] Buffer I/O error on dev pmem3, logical block 2, async 
page read


Best Regards
Yi Zhang


[-- Attachment #2: dmesg.log --]
[-- Type: text/x-log, Size: 12403 bytes --]

[  147.315954] nd_pmem btt0.1: No existing arenas
[  147.315955] nd_pmem btt3.1: No existing arenas
[  147.383257] pmem3s: detected capacity change from 0 to 8523182080
[  147.388988] pmem0s: detected capacity change from 8589934592 to 8523182080
[  147.396444] nd_pmem btt2.1: No existing arenas
[  147.396445] nd_pmem btt1.1: No existing arenas
[  147.430891] pmem2s: detected capacity change from 0 to 8523182080
[  147.430900] pmem1s: detected capacity change from 0 to 8523182080
[  147.478081] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  147.521249] pmem2: detected capacity change from 0 to 8589934592
[  147.521487] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  147.554917] pmem3: detected capacity change from 0 to 8589934592
[  147.580871] pmem2: detected capacity change from 8589934592 to 0
[  147.602631] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  147.602635] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  147.603497] pmem0: detected capacity change from 0 to 8589934592
[  147.666241] pmem1: detected capacity change from 0 to 8589934592
[  147.672306] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  147.704536] pmem2: detected capacity change from 0 to 8589934592
[  147.734077] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  147.766320] pmem3: detected capacity change from 0 to 8589934592
[  147.772506] Dev pmem0: unable to read RDB block 0
[  147.793773]  pmem0: unable to read partition table
[  147.793780] pmem0: partition table beyond EOD, truncated
[  147.865824] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  147.898198] pmem0: detected capacity change from 0 to 8589934592
[  147.904714] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  147.936305] pmem1: detected capacity change from 0 to 8589934592
[  147.971271] nd_pmem btt2.0: No existing arenas
[  148.030303] pmem2s: detected capacity change from 0 to 7582678528
[  148.041378] nd_pmem btt3.0: No existing arenas
[  148.070593] pmem3s: detected capacity change from 0 to 7582678528
[  148.135148] nd_pmem btt0.0: No existing arenas
[  148.135152] nd_pmem btt1.0: No existing arenas
[  148.167364] pmem1s: detected capacity change from 0 to 7582678528
[  148.167374] pmem0s: detected capacity change from 0 to 7582678528
[  148.176990] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  148.212944] pmem2: detected capacity change from 0 to 8589934592
[  148.220145] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  148.254861] pmem3: detected capacity change from 0 to 8589934592
[  148.261628] Dev pmem2: unable to read RDB block 0
[  148.284163]  pmem2: unable to read partition table
[  148.284166] pmem2: partition table beyond EOD, truncated
[  148.351329] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  148.385262] pmem2: detected capacity change from 0 to 8589934592
[  148.390936] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  148.423437] pmem3: detected capacity change from 0 to 8589934592
[  148.460822] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  148.493065] pmem1: detected capacity change from 0 to 8589934592
[  148.539448] nd_pmem btt3.1: No existing arenas
[  148.611564] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  148.611634] nd_pmem btt2.1: No existing arenas
[  148.644103] pmem0: detected capacity change from 0 to 8589934592
[  148.677860] pmem2s: detected capacity change from 0 to 7582678528
[  148.687916] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  148.723037] pmem1: detected capacity change from 0 to 8589934592
[  148.730869] pmem0: detected capacity change from 8589934592 to 0
[  148.752875] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  148.785398] pmem0: detected capacity change from 0 to 8589934592
[  148.822240] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  148.854500] pmem3: detected capacity change from 0 to 8589934592
[  148.869679] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  148.911291] pmem2: detected capacity change from 0 to 8589934592
[  148.919034] nd_pmem btt0.1: No existing arenas
[  148.948702] pmem0s: detected capacity change from 0 to 7582678528
[  148.958829] nd_pmem btt1.1: No existing arenas
[  149.019775] pmem1s: detected capacity change from 0 to 7582678528
[  149.050136] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  149.082633] pmem3: detected capacity change from 0 to 8589934592
[  149.103815] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  149.136176] pmem2: detected capacity change from 0 to 8589934592
[  149.184799] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  149.218606] pmem0: detected capacity change from 0 to 8589934592
[  149.224931] nd_pmem btt3.0: No existing arenas
[  149.235318] pmem3s: detected capacity change from 8589934592 to 8580472832
[  149.241427] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  149.241488] nd_pmem btt2.0: No existing arenas
[  149.247017] pmem2s: detected capacity change from 0 to 8580472832
[  149.277048] pmem1: detected capacity change from 0 to 8589934592
[  149.307015] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  149.341856] pmem0: detected capacity change from 0 to 8589934592
[  149.372847] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  149.405191] pmem1: detected capacity change from 0 to 8589934592
[  149.439029] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  149.471169] pmem3: detected capacity change from 0 to 8589934592
[  149.502574] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  149.534860] pmem2: detected capacity change from 0 to 8589934592
[  149.542808] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  149.574676] pmem3: detected capacity change from 0 to 8589934592
[  149.608027] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  149.640207] pmem2: detected capacity change from 0 to 8589934592
[  149.702443] nd_pmem btt0.0: No existing arenas
[  149.702445] nd_pmem btt1.0: No existing arenas
[  149.715351] pmem0s: detected capacity change from 0 to 8580472832
[  149.721028] pmem1s: detected capacity change from 8589934592 to 8580472832
[  149.729324] nd_pmem btt2.1: No existing arenas
[  149.729329] nd_pmem btt3.1: No existing arenas
[  149.735004] pmem3s: detected capacity change from 0 to 8448573440
[  149.739576] pmem2s: detected capacity change from 0 to 8448573440
[  149.782024] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  149.829323] pmem3: detected capacity change from 0 to 8589934592
[  149.829562] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  149.862662] pmem1: detected capacity change from 0 to 8589934592
[  149.870267] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  149.902596] pmem2: detected capacity change from 0 to 8589934592
[  149.937580] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  149.970071] pmem0: detected capacity change from 0 to 8589934592
[  149.976675] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  150.008906] pmem3: detected capacity change from 0 to 8589934592
[  150.032619] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  150.064691] pmem1: detected capacity change from 0 to 8589934592
[  150.070176] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  150.102587] pmem2: detected capacity change from 0 to 8589934592
[  150.129946] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  150.162165] pmem0: detected capacity change from 0 to 8589934592
[  150.210180] nd_pmem btt2.0: No existing arenas
[  150.210182] nd_pmem btt3.0: No existing arenas
[  150.220950] pmem3s: detected capacity change from 0 to 8448573440
[  150.229106] pmem2s: detected capacity change from 8589934592 to 8448573440
[  150.237939] nd_pmem btt1.1: No existing arenas
[  150.247641] pmem1s: detected capacity change from 0 to 8448573440
[  150.258315] nd_pmem btt0.1: No existing arenas
[  150.264070] pmem0s: detected capacity change from 0 to 8448573440
[  150.319992] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  150.366356] pmem1: detected capacity change from 0 to 8589934592
[  150.366678] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  150.402532] pmem2: detected capacity change from 0 to 8589934592
[  150.407886] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  150.444465] pmem3: detected capacity change from 0 to 8589934592
[  150.464620] Dev pmem1: unable to read RDB block 0
[  150.486897]  pmem1: unable to read partition table
[  150.486901] pmem1: partition table beyond EOD, truncated
[  150.536626] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  150.568956] pmem0: detected capacity change from 0 to 8589934592
[  150.590809] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  150.622145] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  150.623098] pmem1: detected capacity change from 0 to 8589934592
[  150.654534] pmem3: detected capacity change from 0 to 8589934592
[  150.660707] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  150.692821] pmem2: detected capacity change from 0 to 8589934592
[  150.728067] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  150.760343] pmem0: detected capacity change from 0 to 8589934592
[  150.798537] nd_pmem btt1.0: No existing arenas
[  150.804651] pmem1s: detected capacity change from 0 to 8448573440
[  150.829937] nd_pmem btt3.1: No existing arenas
[  150.847261] pmem3s: detected capacity change from 8589934592 to 8320671744
[  150.858221] nd_pmem btt0.0: No existing arenas
[  150.867579] pmem0s: detected capacity change from 0 to 8448573440
[  150.875491] nd_pmem btt2.1: No existing arenas
[  150.881240] pmem2s: detected capacity change from 0 to 8320671744
[  150.925384] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  150.962120] pmem1: detected capacity change from 0 to 8589934592
[  150.985113] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  151.029504] pmem2: detected capacity change from 0 to 8589934592
[  151.029861] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  151.061681] pmem3: detected capacity change from 0 to 8589934592
[  151.093674] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  151.125684] pmem1: detected capacity change from 0 to 8589934592
[  151.133287] Buffer I/O error on dev pmem3, logical block 2, async page read
[  151.164620] Buffer I/O error on dev pmem3, logical block 2, async page read
[  151.243026] nd_pmem namespace3.0: unable to guarantee persistence of writes
[  151.275200] pmem3: detected capacity change from 0 to 8589934592
[  151.288175] nd_pmem namespace2.0: unable to guarantee persistence of writes
[  151.320536] pmem2: detected capacity change from 0 to 8589934592
[  151.347556] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  151.379724] pmem0: detected capacity change from 0 to 8589934592
[  151.407539] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  151.442647] pmem0: detected capacity change from 0 to 8589934592
[  151.450053] nd_pmem btt1.1: No existing arenas
[  151.456290] pmem1s: detected capacity change from 0 to 8320671744
[  151.495572] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  151.540294] pmem1: detected capacity change from 0 to 8589934592
[  151.546802] nd_pmem btt0.1: No existing arenas
[  151.552513] pmem0s: detected capacity change from 0 to 8320671744
[  151.563179] pmem1: detected capacity change from 8589934592 to 0
[  151.590517] nd_pmem namespace1.0: unable to guarantee persistence of writes
[  151.622554] pmem1: detected capacity change from 0 to 8589934592
[  151.659503] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  151.691927] pmem0: detected capacity change from 0 to 8589934592
[  151.720570] nd_pmem namespace0.0: unable to guarantee persistence of writes
[  151.752515] pmem0: detected capacity change from 0 to 8589934592


* Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test
  2016-07-31 17:19           ` yizhan
@ 2016-07-31 17:54                 ` Dan Williams
  0 siblings, 0 replies; 9+ messages in thread
From: Dan Williams @ 2016-07-31 17:54 UTC (permalink / raw)
  To: yizhan; +Cc: linux-block-u79uwXL29TY76Z2rM5mHXA, linux-nvdimm

On Sun, Jul 31, 2016 at 10:19 AM, yizhan <yizhan-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 07/30/2016 11:52 PM, Dan Williams wrote:
>>
>> On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>> wrote:
>>>
>>> [ adding linux-block ]
>>>
>>> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>>>>
>>>> Hello everyone
>>>>
>>>> Could you help check this issue, thanks.
>>>>
>>>> Steps I used:
>>>> 1. Reserve 4*8G of memory for pmem by add kernel parameter "memmap=8G!4G
>>>> memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>>>> 2. Execute below script
>>>> #!/bin/bash
>>>> pmem_btt_switch() {
>>>>          sector_size_list="512 520 528 4096 4104 4160 4224"
>>>>          for sector_size in $sector_size_list; do
>>>>                  ndctl create-namespace -f -e namespace${1}.0
>>>> --mode=sector -l $sector_size
>>>>                  ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>>>          done
>>>> }
>>>>
>>>> for i in 0 1 2 3; do
>>>>          pmem_btt_switch $i &
>>>> done
>>>
>>> Thanks for the report.  This looks like del_gendisk() frees the
>>> previous usage of the devt before the bdi is unregistered.  This
>>> appears to be a general problem with all block drivers, not just
>>> libnvdimm, since blk_cleanup_queue() is typically called after
>>> del_gendisk().  I.e. it will always be the case that the bdi
>>> registered with the devt allocated at add_disk() will still be alive
>>> when del_gendisk()->disk_release() frees the previous devt number.
>>>
>>> I *think* the path forward is to allow the bdi to hold a reference
>>> against the blk_alloc_devt() allocation until it is done with it.  Any
>>> other ideas on fixing this object lifetime problem?
>>
>> Does the attached patch solve this for you?
>
> Hi Dan
> This patch works and the issue cannot be reproduced after several times'
> test, thanks

Thank you!

> Another thing is during the bug verifying, I found below error message,
> could you check whether it is reasonable:
> [  150.464620] Dev pmem1: unable to read RDB block 0
> [  150.486897]  pmem1: unable to read partition table
> [  150.486901] pmem1: partition table beyond EOD, truncated
> [  151.133287] Buffer I/O error on dev pmem3, logical block 2, async page
> read
> [  151.164620] Buffer I/O error on dev pmem3, logical block 2, async page
> read
>

This test is racing block device registration versus teardown.  These
messages are expected and are likely coming from the block queue
percpu ref being marked dead while the partition scan runs.  When this
happens blk_queue_enter() in generic_make_request() returns errors for
every new I/O submission while blk_cleanup_queue() runs.


* Re: [BUG] kernel NULL pointer dereference observed during pmem btt switch test
  2016-07-31 17:54                 ` Dan Williams
@ 2016-08-01  5:30                     ` yizhan
  -1 siblings, 0 replies; 9+ messages in thread
From: yizhan @ 2016-08-01  5:30 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-block-u79uwXL29TY76Z2rM5mHXA, linux-nvdimm



On 08/01/2016 01:54 AM, Dan Williams wrote:
> On Sun, Jul 31, 2016 at 10:19 AM, yizhan <yizhan-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> On 07/30/2016 11:52 PM, Dan Williams wrote:
>>> On Thu, Jul 28, 2016 at 8:50 AM, Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>> wrote:
>>>> [ adding linux-block ]
>>>>
>>>> On Wed, Jul 27, 2016 at 8:20 PM, Yi Zhang <yizhan-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>>>>> Hello everyone
>>>>>
>>>>> Could you help check this issue, thanks.
>>>>>
>>>>> Steps I used:
>>>>> 1. Reserve 4*8G of memory for pmem by add kernel parameter "memmap=8G!4G
>>>>> memmap=8G!12G memmap=8G!20G memmap=8G!28G"
>>>>> 2. Execute below script
>>>>> #!/bin/bash
>>>>> pmem_btt_switch() {
>>>>>           sector_size_list="512 520 528 4096 4104 4160 4224"
>>>>>           for sector_size in $sector_size_list; do
>>>>>                   ndctl create-namespace -f -e namespace${1}.0
>>>>> --mode=sector -l $sector_size
>>>>>                   ndctl create-namespace -f -e namespace${1}.0 --mode=raw
>>>>>           done
>>>>> }
>>>>>
>>>>> for i in 0 1 2 3; do
>>>>>           pmem_btt_switch $i &
>>>>> done
>>>> Thanks for the report.  This looks like del_gendisk() frees the
>>>> previous usage of the devt before the bdi is unregistered.  This
>>>> appears to be a general problem with all block drivers, not just
>>>> libnvdimm, since blk_cleanup_queue() is typically called after
>>>> del_gendisk().  I.e. it will always be the case that the bdi
>>>> registered with the devt allocated at add_disk() will still be alive
>>>> when del_gendisk()->disk_release() frees the previous devt number.
>>>>
>>>> I *think* the path forward is to allow the bdi to hold a reference
>>>> against the blk_alloc_devt() allocation until it is done with it.  Any
>>>> other ideas on fixing this object lifetime problem?
>>> Does the attached patch solve this for you?
>> Hi Dan
>> This patch works and the issue cannot be reproduced after several times'
>> test, thanks
> Thank you!
>
>> Another thing is during the bug verifying, I found below error message,
>> could you check whether it is reasonable:
>> [  150.464620] Dev pmem1: unable to read RDB block 0
>> [  150.486897]  pmem1: unable to read partition table
>> [  150.486901] pmem1: partition table beyond EOD, truncated
>> [  151.133287] Buffer I/O error on dev pmem3, logical block 2, async page
>> read
>> [  151.164620] Buffer I/O error on dev pmem3, logical block 2, async page
>> read
>>
> This test is racing block device registration versus teardown.  These
> messages are expected and are likely coming from the block queue
> percpu ref being marked dead while the partition scan runs.  When this
> happens blk_queue_enter() in generic_make_request() returns errors for
> every new I/O submission while blk_cleanup_queue() runs.
OK, thanks for your explanation.

Best Regards
Yi Zhang



end of thread, other threads:[~2016-08-01  5:30 UTC | newest]

Thread overview: 9+ messages
     [not found] <622794958.9574724.1469674652262.JavaMail.zimbra@redhat.com>
     [not found] ` <622794958.9574724.1469674652262.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-07-28  3:20   ` [BUG] kernel NULL pointer dereference observed during pmem btt switch test Yi Zhang
2016-07-28 15:50     ` Dan Williams
     [not found]       ` <CAPcyv4g5PpShWfXSV+KPJYW7GFrejUNjk=C1-ak=88iX8XczGA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-07-30 15:52         ` Dan Williams
2016-07-30 15:52           ` Dan Williams
2016-07-31 17:19           ` yizhan
     [not found]             ` <d83eeac2-49dc-4c45-9aea-7d68da4fbf7d-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-07-31 17:54               ` Dan Williams
2016-07-31 17:54                 ` Dan Williams
     [not found]                 ` <CAPcyv4hxP8aDyzsoeG9XH5ygtNWypU8fUj+qCNKh2wWa+PJh6w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-08-01  5:30                   ` yizhan
2016-08-01  5:30                     ` yizhan
