From mboxrd@z Thu Jan  1 00:00:00 1970
From: kbusch@kernel.org (Keith Busch)
Date: Fri, 15 Mar 2019 10:38:43 -0600
Subject: [PATCH v4 1/3] nvme: set 0 capacity if namespace block size exceeds PAGE_SIZE
In-Reply-To: <20190315162837.GA27308@lst.de>
References: <20190311220227.23656-1-sagi@grimberg.me>
 <20190311220227.23656-2-sagi@grimberg.me>
 <20190312143231.GA1149@lst.de>
 <8a80ce70-0b98-6c82-a47c-f312a41d2d2a@grimberg.me>
 <20190315162837.GA27308@lst.de>
Message-ID: <20190315163843.GA18289@localhost.localdomain>

On Fri, Mar 15, 2019 at 05:28:37PM +0100, Christoph Hellwig wrote:
> On Tue, Mar 12, 2019 at 02:15:26PM -0700, Sagi Grimberg wrote:
> >
> >> I like the idea behind this, but it looks rather convoluted.  I think
> >> for the unusable namespace case we should warn and have a common label
> >> that just sets the capacity, not touching anything else.
> >>
> >> Does something like this work for you?
> >
> > No, this is what I had done originally, but we need to always have the
> > queue set to a decent block size, otherwise blk_queue_stack_limits()
> > panics on div by 0..
>
> I actually tested it by manually hacking a 8k block size into nvmet
> and it works just fine for me.  Where do you see a division by
> zero with this patch exactly?

I'm not sure about a divide-by-zero, but I just hacked up qemu to report
an 8k block size and got this on boot. It happens because
alloc_page_buffers() won't allocate a buffer_head when the block size is
greater than the page size:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
#PF error: [normal kernel read fault]
PGD 0 P4D 0
Oops: 0000 [#1] SMP
CPU: 3 PID: 391 Comm: kworker/u18:1 Not tainted 5.0.0+ #42
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
Workqueue: nvme-wq nvme_scan_work [nvme_core]
RIP: 0010:create_empty_buffers+0x24/0x100
Code: eb cb 0f 1f 40 00 0f 1f 44 00 00 41 54 55 49 89 d4 53 ba 01 00 00 00 48 89 fb e8 87 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d
RSP: 0018:ffffbd9ec05cf880 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffffe0fec03a38c0 RCX: ffff99f4c751d000
RDX: 0000000000000000 RSI: ffff99f4c751d000 RDI: ffffe0fec03a38c0
RBP: 0000000000000000 R08: dead0000000000ff R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000200 R15: ffffe0fec03a38c0
FS:  0000000000000000(0000) GS:ffff99f4cfd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000000eb28000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 create_page_buffers+0x4d/0x60
 block_read_full_page+0x47/0x310
 ? __add_to_page_cache_locked+0x288/0x330
 ? check_disk_change+0x60/0x60
 ? count_shadow_nodes+0x130/0x130
 do_read_cache_page+0x31c/0x6b0
 ? blkdev_writepages+0x10/0x10
 read_dev_sector+0x28/0xc0
 read_lba+0x126/0x210
 ? kmem_cache_alloc_trace+0x19b/0x1b0
 efi_partition+0x137/0x780
 ? vsnprintf+0x2ae/0x4a0
 ? vsnprintf+0xec/0x4a0
 ? snprintf+0x45/0x70
 ? is_gpt_valid.part.6+0x400/0x400
 ? check_partition+0x137/0x240
 check_partition+0x137/0x240
 rescan_partitions+0xab/0x350
 __blkdev_get+0x342/0x560
 ? inode_insert5+0x11f/0x1e0
 blkdev_get+0x11f/0x310
 ? unlock_new_inode+0x44/0x60
 ? bdget+0xff/0x110
 __device_add_disk+0x426/0x470
 nvme_validate_ns+0x35e/0x7c0 [nvme_core]
 ? nvme_identify_ctrl.isra.56+0x7e/0xc0 [nvme_core]
 ? update_load_avg+0x89/0x550
 nvme_scan_work+0xe5/0x370 [nvme_core]
 ? __synchronize_srcu.part.18+0x91/0xc0
 ? try_to_wake_up+0x55/0x430
 process_one_work+0x1e9/0x3e0
 worker_thread+0x21a/0x3d0
 ? process_one_work+0x3e0/0x3e0
 kthread+0x111/0x130
 ? kthread_park+0x90/0x90
 ret_from_fork+0x1f/0x30
Modules linked in: nvme nvme_core serio_raw
CR2: 0000000000000008
---[ end trace b38bdf1b424f36e9 ]---
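
For reference, here is a stand-alone sketch of what I believe goes wrong
(only approximating the shape of the alloc_page_buffers() loop in
fs/buffer.c, not the actual kernel code): when the block size exceeds
PAGE_SIZE the loop body never executes, so no buffer_head is allocated
and the caller gets a NULL head back, which create_empty_buffers() then
dereferences, consistent with the 0x8 fault address above.

/*
 * Userspace sketch only; buffer_heads_per_page() is a made-up helper
 * for illustration, not a kernel API.
 * Build with: cc -o bh-sketch bh-sketch.c
 */
#include <stdio.h>

#define PAGE_SIZE 4096L

/* How many buffer_heads would the loop allocate for a single page? */
static int buffer_heads_per_page(long block_size)
{
	int nr = 0;
	long offset = PAGE_SIZE;

	/* With block_size > PAGE_SIZE the very first check goes negative. */
	while ((offset -= block_size) >= 0)
		nr++;

	return nr;	/* 0 means the head list stays NULL for the caller */
}

int main(void)
{
	printf("512B blocks: %d\n", buffer_heads_per_page(512));	/* 8 */
	printf("4k blocks:   %d\n", buffer_heads_per_page(4096));	/* 1 */
	printf("8k blocks:   %d\n", buffer_heads_per_page(8192));	/* 0 */
	return 0;
}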