* Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
@ 2021-01-17 18:58 Bradley Chapman
2021-01-18 4:36 ` Chaitanya Kulkarni
From: Bradley Chapman @ 2021-01-17 18:58 UTC (permalink / raw)
To: linux-nvme
All,
I recently plugged a 256GB SPCC NVMe 1.3 drive into the secondary M.2 slot
on my Asus X570-P motherboard, which runs a Ryzen 5 3600 CPU. After
partitioning and formatting the drive, it is detected as follows by the
5.9.15 and 5.10.6 kernels:
[ 1.653074] nvme nvme1: pci function 0000:04:00.0
[ 1.657181] nvme nvme1: missing or invalid SUBNQN field.
[ 1.662294] nvme nvme1: allocated 64 MiB host memory buffer.
[ 1.663105] nvme nvme1: 15/0/0 default/read/poll queues
[ 1.665815] nvme1n1: p1
However, any I/O to the drive (including mounting its filesystem) causes
the following errors to appear in dmesg. The errors occur with both the
5.9.15 and 5.10.6 kernels, and with X570-P BIOS versions 1406 and 3001. I
have modified the BIOS settings to specify that a Gen 3 device is plugged
into the M.2_2 slot instead of allowing the BIOS to auto-detect the drive.
[ 2745.659502] refcount_t: underflow; use-after-free.
[ 2745.659510] WARNING: CPU: 2 PID: 0 at lib/refcount.c:28
refcount_warn_saturate+0xab/0xf0
[ 2745.659510] Modules linked in: rfcomm(E) cmac(E) bnep(E)
binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) uas(E)
usb_storage(E) btusb(E) btrtl(E) crct10dif_pclmul(E) btbcm(E)
crc32_pclmul(E) btintel(E) ghash_clmulni_intel(E) bluetooth(E) rfkill(E)
aesni_intel(E) crypto_simd(E) cryptd(E) glue_helper(E) efi_pstore(E)
jitterentropy_rng(E) drbg(E) ccp(E) ansi_cprng(E) ecdh_generic(E) ecc(E)
acpi_cpufreq(E) nft_counter(E) efivarfs(E) crc32c_intel(E)
[ 2745.659527] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G E
5.10.6-BET #1
[ 2745.659528] Hardware name: System manufacturer System Product
Name/PRIME X570-P, BIOS 3001 12/04/2020
[ 2745.659529] RIP: 0010:refcount_warn_saturate+0xab/0xf0
[ 2745.659530] Code: 05 af d2 72 01 01 e8 7a 06 87 00 0f 0b c3 80 3d 9d
d2 72 01 00 75 90 48 c7 c7 78 60 44 af c6 05 8d d2 72 01 01 e8 5b 06 87
00 <0f> 0b c3 80 3d 7c d2 72 01 00 0f 85 6d ff ff ff 48 c7 c7 d0 60 44
[ 2745.659531] RSP: 0018:ffffaf1880298f30 EFLAGS: 00010086
[ 2745.659532] RAX: 0000000000000000 RBX: ffff9873cf3bc300 RCX:
0000000000000027
[ 2745.659533] RDX: 0000000000000027 RSI: ffff987acea92e80 RDI:
ffff987acea92e88
[ 2745.659533] RBP: ffff9873d0e661f0 R08: 0000000000000000 R09:
c0000000ffffdfff
[ 2745.659534] R10: ffffaf1880298d50 R11: ffffaf1880298d48 R12:
0000000000000001
[ 2745.659534] R13: ffff9873d0f98580 R14: ffff9873cdf8ac00 R15:
0000000000000000
[ 2745.659535] FS: 0000000000000000(0000) GS:ffff987acea80000(0000)
knlGS:0000000000000000
[ 2745.659536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2745.659536] CR2: 00005588763902c8 CR3: 0000000107248000 CR4:
0000000000350ee0
[ 2745.659537] Call Trace:
[ 2745.659538] <IRQ>
[ 2745.659541] nvme_irq+0x104/0x190
[ 2745.659543] __handle_irq_event_percpu+0x2e/0xd0
[ 2745.659545] handle_irq_event_percpu+0x33/0x80
[ 2745.659545] handle_irq_event+0x39/0x70
[ 2745.659547] handle_edge_irq+0x7c/0x1a0
[ 2745.659549] asm_call_irq_on_stack+0x12/0x20
[ 2745.659549] </IRQ>
[ 2745.659551] common_interrupt+0xd7/0x160
[ 2745.659552] asm_common_interrupt+0x1e/0x40
[ 2745.659554] RIP: 0010:cpuidle_enter_state+0xd2/0x2e0
[ 2745.659555] Code: e8 93 22 6a ff 31 ff 49 89 c5 e8 29 2c 6a ff 45 84
ff 74 12 9c 58 f6 c4 02 0f 85 c4 01 00 00 31 ff e8 a2 d8 6f ff fb 45 85
f6 <0f> 88 c9 00 00 00 49 63 ce be 68 00 00 00 4c 2b 2c 24 48 89 ca 48
[ 2745.659556] RSP: 0018:ffffaf188014fe80 EFLAGS: 00000202
[ 2745.659557] RAX: ffff987acea9ce00 RBX: 0000000000000002 RCX:
000000000000001f
[ 2745.659557] RDX: 0000027f460f1f90 RSI: 00000000239f5229 RDI:
0000000000000000
[ 2745.659558] RBP: ffff9873c1a4e800 R08: 0000000000000002 R09:
000000000001c600
[ 2745.659558] R10: 0000090da145abf0 R11: ffff987acea9be24 R12:
ffffffffaf6d38e0
[ 2745.659559] R13: 0000027f460f1f90 R14: 0000000000000002 R15:
0000000000000000
[ 2745.659561] cpuidle_enter+0x30/0x50
[ 2745.659562] do_idle+0x24f/0x290
[ 2745.659564] cpu_startup_entry+0x1b/0x20
[ 2745.659566] start_secondary+0x10b/0x150
[ 2745.659567] secondary_startup_64_no_verify+0xb0/0xbb
[ 2745.659569] ---[ end trace be84281f034198f3 ]---
[ 2776.138874] nvme nvme1: I/O 414 QID 3 timeout, aborting
[ 2776.138886] nvme nvme1: I/O 415 QID 3 timeout, aborting
[ 2776.138891] nvme nvme1: I/O 416 QID 3 timeout, aborting
[ 2776.138895] nvme nvme1: I/O 417 QID 3 timeout, aborting
[ 2776.138912] nvme nvme1: Abort status: 0x0
[ 2776.138921] nvme nvme1: I/O 428 QID 3 timeout, aborting
[ 2776.138922] nvme nvme1: Abort status: 0x0
[ 2776.138925] nvme nvme1: Abort status: 0x0
[ 2776.138974] nvme nvme1: Abort status: 0x0
[ 2776.138977] nvme nvme1: Abort status: 0x0
[ 2806.346792] nvme nvme1: I/O 414 QID 3 timeout, reset controller
[ 2806.363566] nvme nvme1: 15/0/0 default/read/poll queues
[ 2836.554298] nvme nvme1: I/O 415 QID 3 timeout, disable controller
[ 2836.672064] blk_update_request: I/O error, dev nvme1n1, sector 16350
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672072] blk_update_request: I/O error, dev nvme1n1, sector 16093
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672074] blk_update_request: I/O error, dev nvme1n1, sector 15836
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672076] blk_update_request: I/O error, dev nvme1n1, sector 15579
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672078] blk_update_request: I/O error, dev nvme1n1, sector 15322
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672080] blk_update_request: I/O error, dev nvme1n1, sector 15065
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672082] blk_update_request: I/O error, dev nvme1n1, sector 14808
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672083] blk_update_request: I/O error, dev nvme1n1, sector 14551
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672085] blk_update_request: I/O error, dev nvme1n1, sector 14294
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672087] blk_update_request: I/O error, dev nvme1n1, sector 14037
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672121] nvme nvme1: failed to mark controller live state
[ 2836.672123] nvme nvme1: Removing after probe failure status: -19
[ 2836.689016] Aborting journal on device dm-0-8.
[ 2836.689024] Buffer I/O error on dev dm-0, logical block 25198592,
lost sync page write
[ 2836.689027] JBD2: Error -5 detected when updating journal superblock
for dm-0-8.
[ 2836.723821] percpu ref (hd_struct_free) <= 0 (-28) after switching to
atomic
[ 2836.723828] WARNING: CPU: 8 PID: 0 at lib/percpu-refcount.c:196
percpu_ref_switch_to_atomic_rcu+0x139/0x140
[ 2836.723828] Modules linked in: rfcomm(E) cmac(E) bnep(E)
binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) uas(E)
usb_storage(E) btusb(E) btrtl(E) crct10dif_pclmul(E) btbcm(E)
crc32_pclmul(E) btintel(E) ghash_clmulni_intel(E) bluetooth(E) rfkill(E)
aesni_intel(E) crypto_simd(E) cryptd(E) glue_helper(E) efi_pstore(E)
jitterentropy_rng(E) drbg(E) ccp(E) ansi_cprng(E) ecdh_generic(E) ecc(E)
acpi_cpufreq(E) nft_counter(E) efivarfs(E) crc32c_intel(E)
[ 2836.723844] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W E
5.10.6-BET #1
[ 2836.723845] Hardware name: System manufacturer System Product
Name/PRIME X570-P, BIOS 3001 12/04/2020
[ 2836.723847] RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x139/0x140
[ 2836.723848] Code: 80 3d f9 f0 72 01 00 0f 85 52 ff ff ff 49 8b 54 24
e0 49 8b 74 24 e8 48 c7 c7 88 5f 44 af c6 05 db f0 72 01 01 e8 ad 24 87
00 <0f> 0b e9 2e ff ff ff 41 55 49 89 f5 41 54 55 48 89 fd 53 48 83 ec
[ 2836.723849] RSP: 0018:ffffaf18803a0f20 EFLAGS: 00010282
[ 2836.723850] RAX: 0000000000000000 RBX: 7fffffffffffffe3 RCX:
0000000000000027
[ 2836.723850] RDX: 0000000000000027 RSI: ffff987acec12e80 RDI:
ffff987acec12e88
[ 2836.723851] RBP: 0000369db0c0e3c8 R08: 0000000000000000 R09:
c0000000ffffdfff
[ 2836.723851] R10: ffffaf18803a0d40 R11: ffffaf18803a0d38 R12:
ffff9873c0bbbda0
[ 2836.723852] R13: ffffffffaf765f10 R14: 0000000000000202 R15:
ffffffffaf6060c0
[ 2836.723853] FS: 0000000000000000(0000) GS:ffff987acec00000(0000)
knlGS:0000000000000000
[ 2836.723853] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2836.723854] CR2: 0000558899b414c0 CR3: 0000000101cd2000 CR4:
0000000000350ee0
[ 2836.723854] Call Trace:
[ 2836.723855] <IRQ>
[ 2836.723859] rcu_core+0x196/0x420
[ 2836.723862] __do_softirq+0xc9/0x214
[ 2836.723863] asm_call_irq_on_stack+0x12/0x20
[ 2836.723864] </IRQ>
[ 2836.723866] do_softirq_own_stack+0x31/0x40
[ 2836.723867] irq_exit_rcu+0x9a/0xa0
[ 2836.723869] sysvec_apic_timer_interrupt+0x2c/0x80
[ 2836.723870] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2836.723872] RIP: 0010:cpuidle_enter_state+0xd2/0x2e0
[ 2836.723873] Code: e8 93 22 6a ff 31 ff 49 89 c5 e8 29 2c 6a ff 45 84
ff 74 12 9c 58 f6 c4 02 0f 85 c4 01 00 00 31 ff e8 a2 d8 6f ff fb 45 85
f6 <0f> 88 c9 00 00 00 49 63 ce be 68 00 00 00 4c 2b 2c 24 48 89 ca 48
[ 2836.723874] RSP: 0018:ffffaf188017fe80 EFLAGS: 00000202
[ 2836.723874] RAX: ffff987acec1ce00 RBX: 0000000000000002 RCX:
000000000000001f
[ 2836.723875] RDX: 0000029479ea3f98 RSI: 00000000239f5229 RDI:
0000000000000000
[ 2836.723875] RBP: ffff9873c1a4ec00 R08: 0000000000000002 R09:
000000000001c600
[ 2836.723876] R10: 00000959d0ea6498 R11: ffff987acec1be24 R12:
ffffffffaf6d38e0
[ 2836.723876] R13: 0000029479ea3f98 R14: 0000000000000002 R15:
0000000000000000
[ 2836.723878] cpuidle_enter+0x30/0x50
[ 2836.723880] do_idle+0x24f/0x290
[ 2836.723882] cpu_startup_entry+0x1b/0x20
[ 2836.723884] start_secondary+0x10b/0x150
[ 2836.723885] secondary_startup_64_no_verify+0xb0/0xbb
[ 2836.723887] ---[ end trace be84281f034198f4 ]---
After these errors are generated, the device becomes inaccessible and
unmounting its filesystem (which does not hang in D state) generates
additional errors:
[ 2868.181018] Buffer I/O error on dev dm-0, logical block 0, lost sync
page write
[ 2868.181022] EXT4-fs (dm-0): I/O error while writing superblock
After the filesystem is unmounted the device no longer appears in the
output of lsblk(8) and its device node(s) disappear after the kernel
removes the device. Prior to the I/O failures, the nvme error-log
command returns no error entries for any of the 64 log entries present.
nvme fw-log and nvme smart-log return the following output:
Firmware Log for device:nvme1
afi : 0x20
Smart Log for NVME device:nvme1 namespace-id:ffffffff
critical_warning : 0
temperature : 48 C
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 0%
data_units_read : 234
data_units_written : 2,149
host_read_commands : 4,202
host_write_commands : 421,917
controller_busy_time : 0
power_cycles : 7
power_on_hours : 11
unsafe_shutdowns : 0
media_errors : 0
num_err_log_entries : 0
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 48 C
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
I've checked the kernel change logs and I know that the refcount_t error
has been occurring in other kernel subsystems and was subsequently fixed
in recent kernel point releases, so I will be trying to reproduce this
error with the most recent 5.10 and 5.11-rc kernels.
Any suggestions on what else to try next?
Thanks!
Brad
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-17 18:58 Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free Bradley Chapman
@ 2021-01-18 4:36 ` Chaitanya Kulkarni
2021-01-18 18:33 ` Bradley Chapman
From: Chaitanya Kulkarni @ 2021-01-18 4:36 UTC (permalink / raw)
To: chapman6235, linux-nvme
On 1/17/21 11:05 AM, Bradley Chapman wrote:
> [ 2836.554298] nvme nvme1: I/O 415 QID 3 timeout, disable controller
> [ 2836.672064] blk_update_request: I/O error, dev nvme1n1, sector 16350
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672072] blk_update_request: I/O error, dev nvme1n1, sector 16093
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672074] blk_update_request: I/O error, dev nvme1n1, sector 15836
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672076] blk_update_request: I/O error, dev nvme1n1, sector 15579
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672078] blk_update_request: I/O error, dev nvme1n1, sector 15322
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672080] blk_update_request: I/O error, dev nvme1n1, sector 15065
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672082] blk_update_request: I/O error, dev nvme1n1, sector 14808
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672083] blk_update_request: I/O error, dev nvme1n1, sector 14551
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672085] blk_update_request: I/O error, dev nvme1n1, sector 14294
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672087] blk_update_request: I/O error, dev nvme1n1, sector 14037
> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
> [ 2836.672121] nvme nvme1: failed to mark controller live state
> [ 2836.672123] nvme nvme1: Removing after probe failure status: -19
> [ 2836.689016] Aborting journal on device dm-0-8.
> [ 2836.689024] Buffer I/O error on dev dm-0, logical block 25198592,
> lost sync page write
> [ 2836.689027] JBD2: Error -5 detected when updating journal superblock
> for dm-0-8.
Without knowing the filesystem format/mount commands, I can only suspect
that superblock zeroing is being issued as a write-zeroes request,
translated into REQ_OP_WRITE_ZEROES, which the controller is not able to
process, resulting in the error. This analysis may be wrong.
Can you please share the following details:
nvme id-ns /dev/nvme0n1 -H (we are interested in the oncs part here)
Also, for the above device, what is the value of the queue write-zeroes
limit in /sys/block/<nvmeXnY>/queue/write_zeroes_max_bytes?
You can also try blkdiscard -z -o 0 -l 1024 /dev/<nvmeXnY> to see if the
problem is with write zeroes.
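The three checks above can be run together as a short script. The device
node /dev/nvme1n1 is an assumption here (adjust to match the affected
drive), and note that the ONCS field is part of the controller data, so
id-ctrl reports it:

```shell
dev=/dev/nvme1n1   # assumed device node; adjust to the affected drive
# ONCS bit 3 reports whether the controller claims Write Zeroes support:
nvme id-ctrl "$dev" -H | grep -i "write zeroes"
# Block-layer limit; 0 means the kernel will not issue the command:
cat /sys/block/nvme1n1/queue/write_zeroes_max_bytes
# Destructive: zeroes the first 1024 bytes of the device via BLKZEROOUT.
blkdiscard -z -o 0 -l 1024 "$dev"
```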
Also, can you please try the latest nvme tree branch, nvme-5.11?
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-18 4:36 ` Chaitanya Kulkarni
@ 2021-01-18 18:33 ` Bradley Chapman
2021-01-20 3:08 ` Chaitanya Kulkarni
From: Bradley Chapman @ 2021-01-18 18:33 UTC (permalink / raw)
To: Chaitanya Kulkarni, linux-nvme
Good afternoon!
On 1/17/21 11:36 PM, Chaitanya Kulkarni wrote:
> On 1/17/21 11:05 AM, Bradley Chapman wrote:
>> [ 2836.554298] nvme nvme1: I/O 415 QID 3 timeout, disable controller
>> [ 2836.672064] blk_update_request: I/O error, dev nvme1n1, sector 16350
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672072] blk_update_request: I/O error, dev nvme1n1, sector 16093
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672074] blk_update_request: I/O error, dev nvme1n1, sector 15836
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672076] blk_update_request: I/O error, dev nvme1n1, sector 15579
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672078] blk_update_request: I/O error, dev nvme1n1, sector 15322
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672080] blk_update_request: I/O error, dev nvme1n1, sector 15065
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672082] blk_update_request: I/O error, dev nvme1n1, sector 14808
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672083] blk_update_request: I/O error, dev nvme1n1, sector 14551
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672085] blk_update_request: I/O error, dev nvme1n1, sector 14294
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672087] blk_update_request: I/O error, dev nvme1n1, sector 14037
>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>> [ 2836.672121] nvme nvme1: failed to mark controller live state
>> [ 2836.672123] nvme nvme1: Removing after probe failure status: -19
>> [ 2836.689016] Aborting journal on device dm-0-8.
>> [ 2836.689024] Buffer I/O error on dev dm-0, logical block 25198592,
>> lost sync page write
>> [ 2836.689027] JBD2: Error -5 detected when updating journal superblock
>> for dm-0-8.
> Without the knowledge of fs mount/format command I can only suspect that
> super
> block zeroing issued with write-zeroes request is translated into
> REQ_OP_WRITE_ZEROES which controller is not able to process resulting in
> the error. This analysis maybe wrong.
>
> Can you please share following details :-
>
> nvme id-ns /dev/nvme0n1 -H (we are interested in oncs part here)
I ran the requested command against /dev/nvme1n1 (since /dev/nvme0n1
works perfectly so far) and here is the result:
NVME Identify Namespace 1:
nsze : 0x1dcf32b0
ncap : 0x1dcf32b0
nuse : 0x1dcf32b0
nsfeat : 0
[2:2] : 0 Deallocated or Unwritten Logical Block error Not Supported
[1:1] : 0 Namespace uses AWUN, AWUPF, and ACWU
[0:0] : 0 Thin Provisioning Not Supported
nlbaf : 0
flbas : 0
[4:4] : 0 Metadata Transferred in Separate Contiguous Buffer
[3:0] : 0 Current LBA Format Selected
mc : 0
[1:1] : 0 Metadata Pointer Not Supported
[0:0] : 0 Metadata as Part of Extended Data LBA Not Supported
dpc : 0
[4:4] : 0 Protection Information Transferred as Last 8 Bytes of
Metadata Not Supported
[3:3] : 0 Protection Information Transferred as First 8 Bytes of
Metadata Not Supported
[2:2] : 0 Protection Information Type 3 Not Supported
[1:1] : 0 Protection Information Type 2 Not Supported
[0:0] : 0 Protection Information Type 1 Not Supported
dps : 0
[3:3] : 0 Protection Information is Transferred as Last 8 Bytes
of Metadata
[2:0] : 0 Protection Information Disabled
nmic : 0
[0:0] : 0 Namespace Multipath Not Capable
rescap : 0
[6:6] : 0 Exclusive Access - All Registrants Not Supported
[5:5] : 0 Write Exclusive - All Registrants Not Supported
[4:4] : 0 Exclusive Access - Registrants Only Not Supported
[3:3] : 0 Write Exclusive - Registrants Only Not Supported
[2:2] : 0 Exclusive Access Not Supported
[1:1] : 0 Write Exclusive Not Supported
[0:0] : 0 Persist Through Power Loss Not Supported
fpi : 0x80
[7:7] : 0x1 Format Progress Indicator Supported
[6:0] : 0 Format Progress Indicator (Remaining 0%)
dlfeat : 1
[4:4] : 0 Guard Field of Deallocated Logical Blocks is set to 0xFFFF
[3:3] : 0 Deallocate Bit in the Write Zeroes Command is Not Supported
[2:0] : 0x1 Bytes Read From a Deallocated Logical Block and its
Metadata are 0x00
nawun : 0
nawupf : 0
nacwu : 0
nabsn : 0
nabo : 0
nabspf : 0
noiob : 0
nvmcap : 0
nsattr : 0
nvmsetid: 0
anagrpid: 0
endgid : 0
nguid : 00000000000000000000000000000000
eui64 : 0000000000000000
LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes -
Relative Performance: 0 Best (in use)
>
> Also for above device what is the value for the queue block write-zeroes
>
> parameter that is present in the
> /sys/block/<nvmeXnY>/queue/write_zeroes_max_bytes ?
$ cat /sys/block/nvme1n1/queue/write_zeroes_max_bytes
131584
>
> You can also try blkdiscard -z 0 -l 1024 /dev/<nvmeXnY> to see if the
> problem is with
> write zeroes.
# blkdiscard -z -l 1024 /dev/nvme1n1
blkdiscard: /dev/nvme1n1: BLKZEROOUT ioctl failed: Device or resource busy
>
> Also can you please also try the latest nvme tree branch nvme-5.11 ?
>
Where do I get that code from? Is it already in the 5.11-rc tree or do I
need to look somewhere else? I checked https://github.com/linux-nvme but
I did not see it there.
Brad
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-18 18:33 ` Bradley Chapman
@ 2021-01-20 3:08 ` Chaitanya Kulkarni
2021-01-21 2:33 ` Bradley Chapman
From: Chaitanya Kulkarni @ 2021-01-20 3:08 UTC (permalink / raw)
To: chapman6235, linux-nvme
On 1/18/21 10:33 AM, Bradley Chapman wrote:
> Good afternoon!
>
> On 1/17/21 11:36 PM, Chaitanya Kulkarni wrote:
>> On 1/17/21 11:05 AM, Bradley Chapman wrote:
>>> [ 2836.554298] nvme nvme1: I/O 415 QID 3 timeout, disable controller
>>> [ 2836.672064] blk_update_request: I/O error, dev nvme1n1, sector 16350
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672072] blk_update_request: I/O error, dev nvme1n1, sector 16093
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672074] blk_update_request: I/O error, dev nvme1n1, sector 15836
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672076] blk_update_request: I/O error, dev nvme1n1, sector 15579
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672078] blk_update_request: I/O error, dev nvme1n1, sector 15322
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672080] blk_update_request: I/O error, dev nvme1n1, sector 15065
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672082] blk_update_request: I/O error, dev nvme1n1, sector 14808
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672083] blk_update_request: I/O error, dev nvme1n1, sector 14551
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672085] blk_update_request: I/O error, dev nvme1n1, sector 14294
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672087] blk_update_request: I/O error, dev nvme1n1, sector 14037
>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>> [ 2836.672121] nvme nvme1: failed to mark controller live state
>>> [ 2836.672123] nvme nvme1: Removing after probe failure status: -19
>>> [ 2836.689016] Aborting journal on device dm-0-8.
>>> [ 2836.689024] Buffer I/O error on dev dm-0, logical block 25198592,
>>> lost sync page write
>>> [ 2836.689027] JBD2: Error -5 detected when updating journal superblock
>>> for dm-0-8.
>> Without the knowledge of fs mount/format command I can only suspect that
>> super
>> block zeroing issued with write-zeroes request is translated into
>> REQ_OP_WRITE_ZEROES which controller is not able to process resulting in
>> the error. This analysis maybe wrong.
>>
>> Can you please share following details :-
>>
>> nvme id-ns /dev/nvme0n1 -H (we are interested in oncs part here)
> I ran the requested command against /dev/nvme1n1 (since /dev/nvme0n1
> works perfectly so far) and here is the result:
Sorry, my bad; it was supposed to be: nvme id-ctrl /dev/nvme0n1 -H
>> Also for above device what is the value for the queue block write-zeroes
>>
>> parameter that is present in the
>> /sys/block/<nvmeXnY>/queue/write_zeroes_max_bytes ?
> $ cat /sys/block/nvme1n1/queue/write_zeroes_max_bytes
> 131584
So write-zeroes is enabled in this setup.
>> You can also try blkdiscard -z 0 -l 1024 /dev/<nvmeXnY> to see if the
>> problem is with
>> write zeroes.
> # blkdiscard -z -l 1024 /dev/nvme1n1
> blkdiscard: /dev/nvme1n1: BLKZEROOUT ioctl failed: Device or resource busy
This is exactly what I thought. We need to add a quirk for this model so
that we don't advertise write-zeroes support, and instead let blk-lib
emulate write-zeroes.
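A sketch of the kind of quirk entry this implies in
drivers/nvme/host/pci.c. NVME_QUIRK_DISABLE_WRITE_ZEROES already exists in
the driver, but the PCI vendor/device IDs below are placeholders; the real
patch would have to use the IDs reported by lspci -nn for this drive:

```diff
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ static const struct pci_device_id nvme_id_table[] = {
+	{ PCI_DEVICE(0x1234, 0x5678),	/* placeholder IDs for the SPCC 256GB */
+		.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
```

With the quirk set, the driver leaves write_zeroes_max_bytes at 0, so the
block layer falls back to writing zero-filled pages itself instead of
sending Write Zeroes commands to the controller.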
>> Also can you please also try the latest nvme tree branch nvme-5.11 ?
>>
> Where do I get that code from? Is it already in the 5.11-rc tree or do I
> need to look somewhere else? I checked https://github.com/linux-nvme but
> I did not see it there.
Here is the link: git://git.infradead.org/nvme.git, branch 5.12.
> Brad
>
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-20 3:08 ` Chaitanya Kulkarni
@ 2021-01-21 2:33 ` Bradley Chapman
2021-01-21 12:45 ` Niklas Cassel
From: Bradley Chapman @ 2021-01-21 2:33 UTC (permalink / raw)
To: Chaitanya Kulkarni, linux-nvme
Good evening!
On 1/19/21 10:08 PM, Chaitanya Kulkarni wrote:
> On 1/18/21 10:33 AM, Bradley Chapman wrote:
>> Good afternoon!
>>
>> On 1/17/21 11:36 PM, Chaitanya Kulkarni wrote:
>>> On 1/17/21 11:05 AM, Bradley Chapman wrote:
>>>> [ 2836.554298] nvme nvme1: I/O 415 QID 3 timeout, disable controller
>>>> [ 2836.672064] blk_update_request: I/O error, dev nvme1n1, sector 16350
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672072] blk_update_request: I/O error, dev nvme1n1, sector 16093
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672074] blk_update_request: I/O error, dev nvme1n1, sector 15836
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672076] blk_update_request: I/O error, dev nvme1n1, sector 15579
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672078] blk_update_request: I/O error, dev nvme1n1, sector 15322
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672080] blk_update_request: I/O error, dev nvme1n1, sector 15065
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672082] blk_update_request: I/O error, dev nvme1n1, sector 14808
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672083] blk_update_request: I/O error, dev nvme1n1, sector 14551
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672085] blk_update_request: I/O error, dev nvme1n1, sector 14294
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672087] blk_update_request: I/O error, dev nvme1n1, sector 14037
>>>> op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>>> [ 2836.672121] nvme nvme1: failed to mark controller live state
>>>> [ 2836.672123] nvme nvme1: Removing after probe failure status: -19
>>>> [ 2836.689016] Aborting journal on device dm-0-8.
>>>> [ 2836.689024] Buffer I/O error on dev dm-0, logical block 25198592,
>>>> lost sync page write
>>>> [ 2836.689027] JBD2: Error -5 detected when updating journal superblock
>>>> for dm-0-8.
>>> Without the knowledge of fs mount/format command I can only suspect that
>>> super
>>> block zeroing issued with write-zeroes request is translated into
>>> REQ_OP_WRITE_ZEROES which controller is not able to process resulting in
>>> the error. This analysis maybe wrong.
>>>
>>> Can you please share following details :-
>>>
>>> nvme id-ns /dev/nvme0n1 -H (we are interested in oncs part here)
>> I ran the requested command against /dev/nvme1n1 (since /dev/nvme0n1
>> works perfectly so far) and here is the result:
> Sorry my bad it suppose to be nvme id-ctrl /dev/nvme0n1 -H
$ nvme id-ctrl /dev/nvme1n1 -H
NVME Identify Controller:
vid : 0x2263
ssvid : 0x1d97
sn : P2002287000000001296
mn : SPCC M.2 PCIe SSD
fr : V1.0
rab : 6
ieee : 000000
cmic : 0
[3:3] : 0 ANA not supported
[2:2] : 0 PCI
[1:1] : 0 Single Controller
[0:0] : 0 Single Port
mdts : 5
cntlid : 1
ver : 10300
rtd3r : 249f0
rtd3e : 13880
oaes : 0x200
[9:9] : 0x1 Firmware Activation Notices Supported
[8:8] : 0 Namespace Attribute Changed Event Not Supported
ctratt : 0
[5:5] : 0 Predictable Latency Mode Not Supported
[4:4] : 0 Endurance Groups Not Supported
[3:3] : 0 Read Recovery Levels Not Supported
[2:2] : 0 NVM Sets Not Supported
[1:1] : 0 Non-Operational Power State Permissive Not Supported
[0:0] : 0 128-bit Host Identifier Not Supported
rrls : 0
oacs : 0x7
[8:8] : 0 Doorbell Buffer Config Not Supported
[7:7] : 0 Virtualization Management Not Supported
[6:6] : 0 NVMe-MI Send and Receive Not Supported
[5:5] : 0 Directives Not Supported
[4:4] : 0 Device Self-test Not Supported
[3:3] : 0 NS Management and Attachment Not Supported
[2:2] : 0x1 FW Commit and Download Supported
[1:1] : 0x1 Format NVM Supported
[0:0] : 0x1 Security Send and Receive Supported
acl : 3
aerl : 3
frmw : 0x2
[4:4] : 0 Firmware Activate Without Reset Not Supported
[3:1] : 0x1 Number of Firmware Slots
[0:0] : 0 Firmware Slot 1 Read/Write
lpa : 0xa
[3:3] : 0x1 Telemetry host/controller initiated log page Supported
[2:2] : 0 Extended data for Get Log Page Not Supported
[1:1] : 0x1 Command Effects Log Page Supported
[0:0] : 0 SMART/Health Log Page per NS Not Supported
elpe : 63
npss : 0
avscc : 0x1
[0:0] : 0x1 Admin Vendor Specific Commands uses NVMe Format
apsta : 0
[0:0] : 0 Autonomous Power State Transitions Not Supported
wctemp : 354
cctemp : 363
mtfa : 0
hmpre : 16384
hmmin : 16384
tnvmcap : 0
unvmcap : 0
rpmbs : 0
[31:24]: 0 Access Size
[23:16]: 0 Total Size
[5:3] : 0 Authentication Method
[2:0] : 0 Number of RPMB Units
edstt : 5
dsto : 1
fwug : 0
kas : 0
hctma : 0
[0:0] : 0 Host Controlled Thermal Management Not Supported
mntmt : 0
mxtmt : 0
sanicap : 0
[2:2] : 0 Overwrite Sanitize Operation Not Supported
[1:1] : 0 Block Erase Sanitize Operation Not Supported
[0:0] : 0 Crypto Erase Sanitize Operation Not Supported
hmminds : 0
hmmaxd : 0
nsetidmax : 0
anatt : 0
anacap : 0
[7:7] : 0 Non-zero group ID Not Supported
[6:6] : 0 Group ID does not change
[4:4] : 0 ANA Change state Not Supported
[3:3] : 0 ANA Persistent Loss state Not Supported
[2:2] : 0 ANA Inaccessible state Not Supported
[1:1] : 0 ANA Non-optimized state Not Supported
[0:0] : 0 ANA Optimized state Not Supported
anagrpmax : 0
nanagrpid : 0
sqes : 0x66
[7:4] : 0x6 Max SQ Entry Size (64)
[3:0] : 0x6 Min SQ Entry Size (64)
cqes : 0x44
[7:4] : 0x4 Max CQ Entry Size (16)
[3:0] : 0x4 Min CQ Entry Size (16)
maxcmd : 0
nn : 1
oncs : 0x1d
[6:6] : 0 Timestamp Not Supported
[5:5] : 0 Reservations Not Supported
[4:4] : 0x1 Save and Select Supported
[3:3] : 0x1 Write Zeroes Supported
[2:2] : 0x1 Data Set Management Supported
[1:1] : 0 Write Uncorrectable Not Supported
[0:0] : 0x1 Compare Supported
fuses : 0
[0:0] : 0 Fused Compare and Write Not Supported
fna : 0x3
[2:2] : 0 Crypto Erase Not Supported as part of Secure Erase
[1:1] : 0x1 Crypto Erase Applies to All Namespace(s)
[0:0] : 0x1 Format Applies to All Namespace(s)
vwc : 0x5
[7:3] : 0x2 Reserved
[0:0] : 0x1 Volatile Write Cache Present
awun : 0
awupf : 0
nvscc : 0
[0:0] : 0 NVM Vendor Specific Commands uses Vendor Specific Format
nwpc : 0
[2:2] : 0 Permanent Write Protect Not Supported
[1:1] : 0 Write Protect Until Power Supply Not Supported
[0:0] : 0 No Write Protect and Write Protect Namespace Not Supported
acwu : 0
sgls : 0
[1:0] : 0 Scatter-Gather Lists Not Supported
mnan : 0
subnqn :
ioccsz : 0
iorcsz : 0
icdoff : 0
ctrattr : 0
[0:0] : 0 Dynamic Controller Model
msdbd : 0
ps 0 : mp:3.30W operational enlat:5 exlat:5 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
>>> Also for above device what is the value for the queue block write-zeroes
>>>
>>> parameter that is present in the
>>> /sys/block/<nvmeXnY>/queue/write_zeroes_max_bytes ?
>> $ cat /sys/block/nvme1n1/queue/write_zeroes_max_bytes
>> 131584
> So write-zeroes is configured from the setup.
>>> You can also try blkdiscard -z 0 -l 1024 /dev/<nvmeXnY> to see if the
>>> problem is with
>>> write zeroes.
>> # blkdiscard -z -l 1024 /dev/nvme1n1
>> blkdiscard: /dev/nvme1n1: BLKZEROOUT ioctl failed: Device or resource busy
> This is exactly what I suspected: we need to add a quirk for this model
> so that we don't advertise write-zeroes support and instead let blk-lib
> emulate write-zeroes.
I am ready to take patches for the NVMe driver to test this out - this
device is not a boot device and I have no data on it that needs to be
preserved.
>>> Also can you please also try the latest nvme tree branch nvme-5.11 ?
>>>
>> Where do I get that code from? Is it already in the 5.11-rc tree or do I
>> need to look somewhere else? I checked https://github.com/linux-nvme but
>> I did not see it there.
> Here is the link :-git://git.infradead.org/nvme.git
> Branch 5.12.
I tried fetching the entire repo but it was huge and would have taken a
long time, so I tried to fetch a single branch instead and got this result:
$ git clone --branch 5.12 --single-branch git://git.infradead.org/nvme.git
Cloning into 'nvme'...
warning: Could not find remote branch 5.12 to clone.
fatal: Remote branch 5.12 not found in upstream origin
I haven't compiled any out-of-tree kernel code in a very long time - how
easy is it to add this code to a kernel tree and compile it into the
kernel once I've figured out how to get it?
Brad
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-21 2:33 ` Bradley Chapman
@ 2021-01-21 12:45 ` Niklas Cassel
2021-01-22 2:32 ` Bradley Chapman
2021-01-22 2:54 ` Bradley Chapman
0 siblings, 2 replies; 17+ messages in thread
From: Niklas Cassel @ 2021-01-21 12:45 UTC (permalink / raw)
To: Bradley Chapman; +Cc: linux-nvme, Chaitanya Kulkarni
On Wed, Jan 20, 2021 at 09:33:08PM -0500, Bradley Chapman wrote:
> > > > Also can you please also try the latest nvme tree branch nvme-5.11 ?
> > > >
> > > Where do I get that code from? Is it already in the 5.11-rc tree or do I
> > > need to look somewhere else? I checked https://github.com/linux-nvme but
> > > I did not see it there.
> > Here is the link :-git://git.infradead.org/nvme.git
> > Branch 5.12.
>
> I tried fetching the entire repo but it was huge and would have taken a long
> time, so I tried to fetch a single branch instead and got this result:
>
> $ git clone --branch 5.12 --single-branch git://git.infradead.org/nvme.git
> Cloning into 'nvme'...
> warning: Could not find remote branch 5.12 to clone.
> fatal: Remote branch 5.12 not found in upstream origin
>
> I haven't compiled any out-of-tree kernel code in a very long time - how
> easy is it to add this code to a kernel tree and compile it into the kernel
> once I've figured out how to get it?
Hello there,
You can see the available branches by replacing git:// with https:// i.e.:
https://git.infradead.org/nvme.git
The branch is called nvme-5.12
It is not out-of-tree kernel code, it is a subsystem git tree,
so you build the kernel like usual.
If you already have a kernel git tree somewhere,
simply add an additional remote, and it should be quick:
$ git remote add nvme git://git.infradead.org/nvme.git && git fetch nvme
Kind regards,
Niklas
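The remote-add workflow described above can be exercised without network access against a local stand-in repository; the paths and branch layout below are purely illustrative, while the real commands use git://git.infradead.org/nvme.git and the nvme-5.12 branch:

```shell
# Create a throwaway "upstream" with an nvme-5.12 branch, then run the
# remote-add + fetch workflow against it from a second repository that
# stands in for an existing kernel tree.
set -e
tmp=$(mktemp -d)
git -C "$tmp" init -q upstream
git -C "$tmp/upstream" -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m 'base commit'
git -C "$tmp/upstream" branch nvme-5.12

git -C "$tmp" init -q kernel-tree          # stands in for your kernel tree
git -C "$tmp/kernel-tree" remote add nvme "$tmp/upstream"
git -C "$tmp/kernel-tree" fetch -q nvme
git -C "$tmp/kernel-tree" branch -r        # nvme/nvme-5.12 is now available
```

Because the fetch only transfers objects your existing tree lacks, adding the nvme remote to a kernel tree is much faster than a fresh clone.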
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-21 12:45 ` Niklas Cassel
@ 2021-01-22 2:32 ` Bradley Chapman
2021-01-22 2:54 ` Chaitanya Kulkarni
2021-01-22 2:54 ` Chaitanya Kulkarni
2021-01-22 2:54 ` Bradley Chapman
1 sibling, 2 replies; 17+ messages in thread
From: Bradley Chapman @ 2021-01-22 2:32 UTC (permalink / raw)
To: Niklas Cassel; +Cc: linux-nvme, Chaitanya Kulkarni
Good evening,
On 1/21/21 7:45 AM, Niklas Cassel wrote:
> On Wed, Jan 20, 2021 at 09:33:08PM -0500, Bradley Chapman wrote:
>>>>> Also can you please also try the latest nvme tree branch nvme-5.11 ?
>>>>>
>>>> Where do I get that code from? Is it already in the 5.11-rc tree or do I
>>>> need to look somewhere else? I checked https://github.com/linux-nvme but
>>>> I did not see it there.
>>> Here is the link :-git://git.infradead.org/nvme.git
>>> Branch 5.12.
>>
>> I tried fetching the entire repo but it was huge and would have taken a long
>> time, so I tried to fetch a single branch instead and got this result:
>>
>> $ git clone --branch 5.12 --single-branch git://git.infradead.org/nvme.git
>> Cloning into 'nvme'...
>> warning: Could not find remote branch 5.12 to clone.
>> fatal: Remote branch 5.12 not found in upstream origin
>>
>> I haven't compiled any out-of-tree kernel code in a very long time - how
>> easy is it to add this code to a kernel tree and compile it into the kernel
>> once I've figured out how to get it?
>
> Hello there,
>
> You can see the available branches by replacing git:// with https:// i.e.:
> https://git.infradead.org/nvme.git
>
> The branch is called nvme-5.12
>
> It is not out-of-tree kernel code, it is a subsystem git tree,
> so you build the kernel like usual.
>
> If you already have a kernel git tree somewhere,
> simply add an additional remote, and it should be quick:
>
> $ git remote add nvme git://git.infradead.org/nvme.git && git fetch nvme
Thanks for the pointer. I've downloaded the code and will add it to a
stable 5.10 tree and a 5.11-rc tree and see what happens.
>
>
> Kind regards,
> Niklas
>
Brad
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-22 2:32 ` Bradley Chapman
@ 2021-01-22 2:54 ` Chaitanya Kulkarni
2021-01-22 2:54 ` Chaitanya Kulkarni
1 sibling, 0 replies; 17+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-22 2:54 UTC (permalink / raw)
To: chapman6235; +Cc: linux-nvme
On 1/21/21 6:32 PM, Bradley Chapman wrote:
> Thanks for the pointer. I've downloaded the code and will add it to a
> stable 5.10 tree and a 5.11-rc tree and see what happens.
>
Please use the latest 5.12 branch and boot into that kernel.
If you can provide the device's vendor ID and device ID, I can cook up a
patch for you based on 5.12; I'll be waiting for your response.
These IDs can be read from sysfs:
cat /sys/bus/pci/devices/<your device id>/device
cat /sys/bus/pci/devices/<your device id>/vendor
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-21 12:45 ` Niklas Cassel
2021-01-22 2:32 ` Bradley Chapman
@ 2021-01-22 2:54 ` Bradley Chapman
2021-01-22 2:57 ` Chaitanya Kulkarni
1 sibling, 1 reply; 17+ messages in thread
From: Bradley Chapman @ 2021-01-22 2:54 UTC (permalink / raw)
To: Niklas Cassel; +Cc: linux-nvme, Chaitanya Kulkarni
Good evening!
On 1/21/21 7:45 AM, Niklas Cassel wrote:
> On Wed, Jan 20, 2021 at 09:33:08PM -0500, Bradley Chapman wrote:
>>>>> Also can you please also try the latest nvme tree branch nvme-5.11 ?
>>>>>
>>>> Where do I get that code from? Is it already in the 5.11-rc tree or do I
>>>> need to look somewhere else? I checked https://github.com/linux-nvme but
>>>> I did not see it there.
>>> Here is the link :-git://git.infradead.org/nvme.git
>>> Branch 5.12.
>>
>> I tried fetching the entire repo but it was huge and would have taken a long
>> time, so I tried to fetch a single branch instead and got this result:
>>
>> $ git clone --branch 5.12 --single-branch git://git.infradead.org/nvme.git
>> Cloning into 'nvme'...
>> warning: Could not find remote branch 5.12 to clone.
>> fatal: Remote branch 5.12 not found in upstream origin
>>
>> I haven't compiled any out-of-tree kernel code in a very long time - how
>> easy is it to add this code to a kernel tree and compile it into the kernel
>> once I've figured out how to get it?
>
> Hello there,
>
> You can see the available branches by replacing git:// with https:// i.e.:
> https://git.infradead.org/nvme.git
>
> The branch is called nvme-5.12
>
> It is not out-of-tree kernel code, it is a subsystem git tree,
> so you build the kernel like usual.
>
> If you already have a kernel git tree somewhere,
> simply add an additional remote, and it should be quick:
>
> $ git remote add nvme git://git.infradead.org/nvme.git && git fetch nvme
>
>
> Kind regards,
> Niklas
>
I compiled the kernel from the above git tree, rebooted and attempted to
mount the filesystem on the NVMe drive. This is what the kernel put into
the dmesg when I attempted to list the contents of the filesystem root,
create an inode for a zero-byte file and then unmount the filesystem.
Brad
<snip/>
[ 52.795975] refcount_t: underflow; use-after-free.
[ 52.795981] WARNING: CPU: 7 PID: 0 at lib/refcount.c:28
refcount_warn_saturate+0xab/0xf0
[ 52.795989] Modules linked in: rfcomm(E) cmac(E) bnep(E)
binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) btusb(E)
btrtl(E) btbcm(E) btintel(E) intel_rapl_common(E) iosf_mbi(E)
crct10dif_pclmul(E) crc32_pclmul(E) bluetooth(E) ghash_clmulni_intel(E)
rfkill(E) jitterentropy_rng(E) aesni_intel(E) crypto_simd(E)
efi_pstore(E) cryptd(E) glue_helper(E) drbg(E) ccp(E) ansi_cprng(E)
ecdh_generic(E) ecc(E) acpi_cpufreq(E) nft_counter(E) efivarfs(E)
crc32c_intel(E)
[ 52.796018] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G E
5.11.0-rc1-BET+ #1
[ 52.796021] Hardware name: System manufacturer System Product
Name/PRIME X570-P, BIOS 3001 12/04/2020
[ 52.796023] RIP: 0010:refcount_warn_saturate+0xab/0xf0
[ 52.796026] Code: 05 02 a0 72 01 01 e8 49 7d 8b 00 0f 0b c3 80 3d f0
9f 72 01 00 75 90 48 c7 c7 88 4c c7 8a c6 05 e0 9f 72 01 01 e8 2a 7d 8b
00 <0f> 0b c3 80 3d cf 9f 72 01 00 0f 85 6d ff ff ff 48 c7 c7 e0 4c c7
[ 52.796028] RSP: 0018:ffffa95b80374f28 EFLAGS: 00010082
[ 52.796031] RAX: 0000000000000000 RBX: ffff9ac74f014800 RCX:
0000000000000027
[ 52.796032] RDX: 0000000000000027 RSI: ffff9ace4ebd2ed0 RDI:
ffff9ace4ebd2ed8
[ 52.796034] RBP: ffff9ac753820080 R08: 0000000000000000 R09:
c0000000ffffdfff
[ 52.796035] R10: ffffa95b80374d48 R11: ffffa95b80374d40 R12:
0000000000000001
[ 52.796037] R13: ffff9ac7539e2100 R14: 0000000000000016 R15:
0000000000000000
[ 52.796038] FS: 0000000000000000(0000) GS:ffff9ace4ebc0000(0000)
knlGS:0000000000000000
[ 52.796040] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 52.796042] CR2: 00007f3eb6493000 CR3: 00000006afe12000 CR4:
0000000000350ee0
[ 52.796043] Call Trace:
[ 52.796045] <IRQ>
[ 52.796046] nvme_irq+0x10b/0x190
[ 52.796052] __handle_irq_event_percpu+0x2e/0xd0
[ 52.796056] handle_irq_event_percpu+0x33/0x80
[ 52.796058] handle_irq_event+0x39/0x70
[ 52.796060] handle_edge_irq+0x7c/0x1a0
[ 52.796064] asm_call_irq_on_stack+0x12/0x20
[ 52.796068] </IRQ>
[ 52.796069] common_interrupt+0xd7/0x160
[ 52.796073] asm_common_interrupt+0x1e/0x40
[ 52.796076] RIP: 0010:cpuidle_enter_state+0xd2/0x2e0
[ 52.796080] Code: e8 73 ca 65 ff 31 ff 49 89 c5 e8 09 d4 65 ff 45 84
ff 74 12 9c 58 f6 c4 02 0f 85 c4 01 00 00 31 ff e8 d2 8a 6b ff fb 45 85
f6 <0f> 88 c9 00 00 00 49 63 ce be 68 00 00 00 4c 2b 2c 24 48 89 ca 48
[ 52.796082] RSP: 0018:ffffa95b80177e80 EFLAGS: 00000202
[ 52.796084] RAX: ffff9ace4ebdce80 RBX: 0000000000000002 RCX:
000000000000001f
[ 52.796085] RDX: 0000000c4ae2908c RSI: 00000000239f5229 RDI:
0000000000000000
[ 52.796086] RBP: ffff9ac74e561400 R08: 0000000000000002 R09:
000000000001c680
[ 52.796088] R10: 0000003ae7504a4c R11: ffff9ace4ebdbe64 R12:
ffffffff8aed3d20
[ 52.796089] R13: 0000000c4ae2908c R14: 0000000000000002 R15:
0000000000000000
[ 52.796092] cpuidle_enter+0x30/0x50
[ 52.796095] do_idle+0x24f/0x290
[ 52.796098] cpu_startup_entry+0x1b/0x20
[ 52.796100] start_secondary+0x11b/0x160
[ 52.796103] secondary_startup_64_no_verify+0xb0/0xbb
[ 52.796107] ---[ end trace a0a237d707896b40 ]---
[ 82.811599] nvme nvme1: I/O 7 QID 8 timeout, aborting
[ 82.811613] nvme nvme1: I/O 8 QID 8 timeout, aborting
[ 82.811617] nvme nvme1: I/O 9 QID 8 timeout, aborting
[ 82.811622] nvme nvme1: I/O 10 QID 8 timeout, aborting
[ 82.811650] nvme nvme1: Abort status: 0x0
[ 82.811665] nvme nvme1: Abort status: 0x0
[ 82.811668] nvme nvme1: Abort status: 0x0
[ 82.811670] nvme nvme1: Abort status: 0x0
[ 113.019489] nvme nvme1: I/O 7 QID 8 timeout, reset controller
[ 113.037771] nvme nvme1: 15/0/0 default/read/poll queues
[ 143.228062] nvme nvme1: I/O 8 QID 8 timeout, disable controller
[ 143.346027] blk_update_request: I/O error, dev nvme1n1, sector 16350
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346039] blk_update_request: I/O error, dev nvme1n1, sector 16093
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346044] blk_update_request: I/O error, dev nvme1n1, sector 15836
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346047] blk_update_request: I/O error, dev nvme1n1, sector 15579
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346049] blk_update_request: I/O error, dev nvme1n1, sector 15322
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346052] blk_update_request: I/O error, dev nvme1n1, sector 15065
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346055] blk_update_request: I/O error, dev nvme1n1, sector 14808
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346057] blk_update_request: I/O error, dev nvme1n1, sector 14551
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346060] blk_update_request: I/O error, dev nvme1n1, sector 14294
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346063] blk_update_request: I/O error, dev nvme1n1, sector 14037
op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 143.346116] nvme nvme1: failed to mark controller live state
[ 143.346120] nvme nvme1: Removing after probe failure status: -19
[ 143.351776] nvme1n1: detected capacity change from 0 to 500118192
[ 143.351836] Aborting journal on device dm-0-8.
[ 143.351842] Buffer I/O error on dev dm-0, logical block 25198592,
lost sync page write
[ 143.351846] JBD2: Error -5 detected when updating journal superblock
for dm-0-8.
[ 181.098750] EXT4-fs error (device dm-0): ext4_read_inode_bitmap:203:
comm touch: Cannot read inode bitmap - block_group = 0, inode_bitmap = 1065
[ 181.098792] Buffer I/O error on dev dm-0, logical block 0, lost sync
page write
[ 181.098800] EXT4-fs (dm-0): I/O error while writing superblock
[ 181.098806] EXT4-fs error (device dm-0): ext4_journal_check_start:83:
comm touch: Detected aborted journal
[ 181.098811] Buffer I/O error on dev dm-0, logical block 0, lost sync
page write
[ 181.098817] EXT4-fs (dm-0): I/O error while writing superblock
[ 181.098819] EXT4-fs (dm-0): Remounting filesystem read-only
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-22 2:54 ` Bradley Chapman
@ 2021-01-22 2:57 ` Chaitanya Kulkarni
2021-01-22 3:16 ` Chaitanya Kulkarni
0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-22 2:57 UTC (permalink / raw)
To: chapman6235; +Cc: linux-nvme
Bradley,
On 1/21/21 6:54 PM, Bradley Chapman wrote:
> I compiled the kernel from the above git tree, rebooted and attempted to
> mount the filesystem on the NVMe drive. This is what the kernel put into
> the dmesg when I attempted to list the contents of the filesystem root,
> create an inode for a zero-byte file and then unmount the filesystem.
>
> Brad
Did you get a chance to see my response to your previous email ?
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-22 2:57 ` Chaitanya Kulkarni
@ 2021-01-22 3:16 ` Chaitanya Kulkarni
2021-01-23 0:54 ` Bradley Chapman
0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-22 3:16 UTC (permalink / raw)
To: chapman6235; +Cc: linux-nvme
On 1/21/21 6:57 PM, Chaitanya Kulkarni wrote:
> Bradley,
>
> On 1/21/21 6:54 PM, Bradley Chapman wrote:
>> I compiled the kernel from the above git tree, rebooted and attempted to
>> mount the filesystem on the NVMe drive. This is what the kernel put into
>> the dmesg when I attempted to list the contents of the filesystem root,
>> create an inode for a zero-byte file and then unmount the filesystem.
>>
>> Brad
> Did you get a chance to see my response to your previous email ?
>
You can try the following patch with some modifications :-
From e162a2e91e4895ceac6f80042a87c4ba6a4fbbf5 Mon Sep 17 00:00:00 2001
From: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Date: Thu, 21 Jan 2021 19:05:13 -0800
Subject: [PATCH] nvme-pci: add device quirk wip
This is a work-in-progress patch based on nvme-5.12,
HEAD: b116d37fc0f5 ("nvmet: add lba to sect conversion helpers")
Replace <YOUR DEVICE'S VENDOR ID> and <YOUR DEVICE's DEVICE ID> in the
patch below with the actual values from these sysfs entries before you
apply it :-
cat /sys/bus/pci/devices/<your device id>/device
cat /sys/bus/pci/devices/<your device id>/vendor
This patch is not tested at all.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
drivers/nvme/host/pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 25456d02eddb..c5b43bcf57b0 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3228,6 +3228,8 @@ static const struct pci_device_id nvme_id_table[] = {
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(<YOUR DEVICE's VENDOR ID>, <YOUR DEVICE's DEVICE ID>),
+ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
--
2.22.1
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-22 3:16 ` Chaitanya Kulkarni
@ 2021-01-23 0:54 ` Bradley Chapman
2021-01-25 8:16 ` Niklas Cassel
0 siblings, 1 reply; 17+ messages in thread
From: Bradley Chapman @ 2021-01-23 0:54 UTC (permalink / raw)
To: Chaitanya Kulkarni; +Cc: linux-nvme
Hello sir!
I didn't check my e-mail until this evening, so I saw all four of your
e-mails at once. I ran the commands you specified based on the following
information from dmesg and lspci:
dmesg:
[ 1.633908] nvme nvme1: pci function 0000:04:00.0
lspci:
04:00.0 Non-Volatile memory controller: Device 1d97:2263 (rev 03)
$ cat /sys/bus/pci/devices/0000\:04\:00.0/device
0x2263
$ cat /sys/bus/pci/devices/0000\:04\:00.0/vendor
0x1d97
On 1/21/21 10:16 PM, Chaitanya Kulkarni wrote:
> On 1/21/21 6:57 PM, Chaitanya Kulkarni wrote:
>> Bradley,
>>
>> On 1/21/21 6:54 PM, Bradley Chapman wrote:
>>> I compiled the kernel from the above git tree, rebooted and attempted to
>>> mount the filesystem on the NVMe drive. This is what the kernel put into
>>> the dmesg when I attempted to list the contents of the filesystem root,
>>> create an inode for a zero-byte file and then unmount the filesystem.
>>>
>>> Brad
>> Did you get a chance to see my response to your previous email ?
>>
> You can try following patch with some modification :-
>
> From e162a2e91e4895ceac6f80042a87c4ba6a4fbbf5 Mon Sep 17 00:00:00 2001
> From: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Date: Thu, 21 Jan 2021 19:05:13 -0800
> Subject: [PATCH] nvme-pci: add device quirk wip
>
> This is a work-in-progress patch based on nvme-5.12,
> HEAD: b116d37fc0f5 ("nvmet: add lba to sect conversion helpers")
>
> Replace <YOUR DEVICE'S VENDOR ID> and <YOUR DEVICE's DEVICE ID> in the
> patch below with the actual values from these sysfs entries before you
> apply it :-
>
> cat /sys/bus/pci/devices/<your device id>/device
> cat /sys/bus/pci/devices/<your device id>/vendor
>
> This patch is not tested at all.
>
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
> drivers/nvme/host/pci.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 25456d02eddb..c5b43bcf57b0 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3228,6 +3228,8 @@ static const struct pci_device_id nvme_id_table[] = {
> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> { PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> + { PCI_DEVICE(<YOUR DEVICE's VENDOR ID>, <YOUR DEVICE's DEVICE ID>),
> + .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
> .driver_data = NVME_QUIRK_SINGLE_VECTOR },
> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
>
With the following patch applied to the NVMe tree, my system hard-locked
and would not respond to Alt+SysRQ once I mounted the filesystem and
attempted a directory listing of the root of the filesystem.
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 25456d02eddb..7ba5e8e92e19 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3228,6 +3228,8 @@ static const struct pci_device_id nvme_id_table[] = {
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(0x1d97, 0x2263), /* SPCC */
+ .driver_data = NVME_QUIRK_SINGLE_VECTOR },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
I don't have a serial console, nor a serial port or other suitable
cabling to make one, so I have no console logs of what caused the hard
lockup, and the lack of response to Alt+SysRQ+S meant that I have no
written logs to share with you all. I'm a bit leery of hard-locking the
system multiple times to try to snipe the dmesg, since I don't want to
trash the other filesystems on this host. What else can I try before I
do that?
Brad
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-23 0:54 ` Bradley Chapman
@ 2021-01-25 8:16 ` Niklas Cassel
2021-01-25 8:34 ` Chaitanya Kulkarni
0 siblings, 1 reply; 17+ messages in thread
From: Niklas Cassel @ 2021-01-25 8:16 UTC (permalink / raw)
To: Bradley Chapman; +Cc: linux-nvme, Chaitanya Kulkarni
On Fri, Jan 22, 2021 at 07:54:26PM -0500, Bradley Chapman wrote:
> With the following patch applied to the NVMe tree, my system hard-locked and
> would not respond to Alt+SysRQ once I mounted the filesystem and attempted a
> directory listing of the root of the filesystem.
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 25456d02eddb..7ba5e8e92e19 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3228,6 +3228,8 @@ static const struct pci_device_id nvme_id_table[] = {
> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> { PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> + { PCI_DEVICE(0x1d97, 0x2263), /* SPCC */
> + .driver_data = NVME_QUIRK_SINGLE_VECTOR },
> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
> .driver_data = NVME_QUIRK_SINGLE_VECTOR },
> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
>
Hello Bradley,
Chaitanya asked you to test the NVME_QUIRK_DISABLE_WRITE_ZEROES quirk.
Your patch seems to instead use the NVME_QUIRK_SINGLE_VECTOR quirk.
Did you try the NVME_QUIRK_DISABLE_WRITE_ZEROES quirk?
Kind regards,
Niklas
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-25 8:16 ` Niklas Cassel
@ 2021-01-25 8:34 ` Chaitanya Kulkarni
2021-01-26 2:03 ` Bradley Chapman
0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-25 8:34 UTC (permalink / raw)
To: Niklas Cassel; +Cc: Bradley Chapman, linux-nvme
I already pointed that out to him offline on Friday, to reduce the mailing-list noise.
> On Jan 25, 2021, at 12:16 AM, Niklas Cassel <Niklas.Cassel@wdc.com> wrote:
>
> On Fri, Jan 22, 2021 at 07:54:26PM -0500, Bradley Chapman wrote:
>> With the following patch applied to the NVMe tree, my system hard-locked and
>> would not respond to Alt+SysRQ once I mounted the filesystem and attempted a
>> directory listing of the root of the filesystem.
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 25456d02eddb..7ba5e8e92e19 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -3228,6 +3228,8 @@ static const struct pci_device_id nvme_id_table[] = {
>> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
>> { PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
>> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
>> + { PCI_DEVICE(0x1d97, 0x2263), /* SPCC */
>> + .driver_data = NVME_QUIRK_SINGLE_VECTOR },
>> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
>> .driver_data = NVME_QUIRK_SINGLE_VECTOR },
>> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
>>
>
> Hello Bradley,
>
> Chaitanya asked you to test the NVME_QUIRK_DISABLE_WRITE_ZEROES quirk.
> Your patch seems to instead use the NVME_QUIRK_SINGLE_VECTOR quirk.
>
> Did you try the NVME_QUIRK_DISABLE_WRITE_ZEROES quirk?
>
>
> Kind regards,
> Niklas
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-25 8:34 ` Chaitanya Kulkarni
@ 2021-01-26 2:03 ` Bradley Chapman
2021-01-26 2:04 ` Chaitanya Kulkarni
0 siblings, 1 reply; 17+ messages in thread
From: Bradley Chapman @ 2021-01-26 2:03 UTC (permalink / raw)
To: Chaitanya Kulkarni, Niklas Cassel; +Cc: linux-nvme
Good evening!
On 1/25/21 3:34 AM, Chaitanya Kulkarni wrote:
> I have pointed that out on friday already offline to reduce the mailing list noise.
>
>> On Jan 25, 2021, at 12:16 AM, Niklas Cassel <Niklas.Cassel@wdc.com> wrote:
>>
>> On Fri, Jan 22, 2021 at 07:54:26PM -0500, Bradley Chapman wrote:
>>> With the following patch applied to the NVMe tree, my system hard-locked and
>>> would not respond to Alt+SysRQ once I mounted the filesystem and attempted a
>>> directory listing of the root of the filesystem.
>>>
>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>> index 25456d02eddb..7ba5e8e92e19 100644
>>> --- a/drivers/nvme/host/pci.c
>>> +++ b/drivers/nvme/host/pci.c
>>> @@ -3228,6 +3228,8 @@ static const struct pci_device_id nvme_id_table[] = {
>>> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
>>> { PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
>>> .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
>>> + { PCI_DEVICE(0x1d97, 0x2263), /* SPCC */
>>> + .driver_data = NVME_QUIRK_SINGLE_VECTOR },
>>> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
>>> .driver_data = NVME_QUIRK_SINGLE_VECTOR },
>>> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
>>>
>>
>> Hello Bradley,
>>
>> Chaitanya asked you to test the NVME_QUIRK_DISABLE_WRITE_ZEROES quirk.
>> Your patch seems to instead use the NVME_QUIRK_SINGLE_VECTOR quirk.
>>
>> Did you try the NVME_QUIRK_DISABLE_WRITE_ZEROES quirk?
>>
>>
>> Kind regards,
>> Niklas
As Chaitanya pointed out, I did in fact re-test with the correct patch
and everything worked flawlessly. I have sent the corrected patches to
Chaitanya directly.
Brad
* Re: Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free.
2021-01-26 2:03 ` Bradley Chapman
@ 2021-01-26 2:04 ` Chaitanya Kulkarni
0 siblings, 0 replies; 17+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-26 2:04 UTC (permalink / raw)
To: chapman6235, Niklas Cassel; +Cc: linux-nvme
On 1/25/21 18:03, Bradley Chapman wrote:
> As Chaitanya pointed out, I did in fact re-test with the correct patch
> and everything worked flawlessly. I have sent the corrected patches to
> Chaitanya directly.
>
> Brad
>
Thanks for confirming that; I'll send a patch with your Tested-by tag.
end of thread, other threads:[~2021-01-26 2:16 UTC | newest]
Thread overview: 17+ messages
2021-01-17 18:58 Problem with SPCC 256GB NVMe 1.3 drive - refcount_t: underflow; use-after-free Bradley Chapman
2021-01-18 4:36 ` Chaitanya Kulkarni
2021-01-18 18:33 ` Bradley Chapman
2021-01-20 3:08 ` Chaitanya Kulkarni
2021-01-21 2:33 ` Bradley Chapman
2021-01-21 12:45 ` Niklas Cassel
2021-01-22 2:32 ` Bradley Chapman
2021-01-22 2:54 ` Chaitanya Kulkarni
2021-01-22 2:54 ` Chaitanya Kulkarni
2021-01-22 2:54 ` Bradley Chapman
2021-01-22 2:57 ` Chaitanya Kulkarni
2021-01-22 3:16 ` Chaitanya Kulkarni
2021-01-23 0:54 ` Bradley Chapman
2021-01-25 8:16 ` Niklas Cassel
2021-01-25 8:34 ` Chaitanya Kulkarni
2021-01-26 2:03 ` Bradley Chapman
2021-01-26 2:04 ` Chaitanya Kulkarni