From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: Keith Busch <kbusch@kernel.org>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH] nvme: Fix io_opt limit setting
Date: Thu, 14 May 2020 04:13:44 +0000
Message-ID: <BY5PR04MB6900B9C6ADCEBD122154A22EE7BC0@BY5PR04MB6900.namprd04.prod.outlook.com>
In-Reply-To: 20200514041215.GA1900@redsun51.ssa.fujisawa.hgst.com

On 2020/05/14 13:12, Keith Busch wrote:
> On Thu, May 14, 2020 at 03:47:56AM +0000, Damien Le Moal wrote:
>> On 2020/05/14 12:40, Keith Busch wrote:
>>> On Thu, May 14, 2020 at 10:54:52AM +0900, Damien Le Moal wrote:
>>>> Currently, a namespace io_opt queue limit is set by default to the
>>>> physical sector size of the namespace and to the write optimal
>>>> size (NOWS) when the namespace reports this value. This causes problems
>>>> with block limits stacking in blk_stack_limits() when a namespace block
>>>> device is combined with an HDD, which generally does not report any
>>>> optimal transfer size (io_opt limit is 0). The code:
>>>>
>>>> 	/* Optimal I/O a multiple of the physical block size? */
>>>> 	if (t->io_opt & (t->physical_block_size - 1)) {
>>>> 		t->io_opt = 0;
>>>> 		t->misaligned = 1;
>>>> 		ret = -1;
>>>> 	}
>>>>
>>>> causes blk_stack_limits() to return an error when the combined
>>>> devices have different but compatible physical sector sizes (e.g. 512B
>>>> sector SSD with 4KB sector disks).
>>>>
>>>> Fix this by not setting the optimal IO size limit if the namespace does
>>>> not report an optimal write size value.
>>>
>>> Won't this continue to break if a controller does report NOWS that's not
>>> a multiple of the physical block size of the device it's stacking with?
>>
>> When io_opt stacking is handled, the physical sector size for the stacked device
>> is already resolved to a common value. If the NOWS value cannot accommodate this
>> resolved physical sector size, this is an incompatible stacking, so failing is
>> OK in that case.
>
> I see, though it's not strictly incompatible as io_opt is merely a hint
> that could continue to work if the stacked limit was recalculated as:
>
> 	if (t->io_opt & (t->physical_block_size - 1))
> 		t->io_opt = lcm(t->io_opt, t->physical_block_size);
>
> Regardless, your patch does make sense, but it does have a merge
> conflict with nvme-5.8.

Ooops. I will rebase and resend. And maybe we should send your suggestion above
as a proper patch?

-- 
Damien Le Moal
Western Digital Research
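For reference, the two behaviours discussed above can be compared in a small
user-space sketch. This is not kernel code: struct limits_sketch, gcd_u(),
lcm_u() and the two check_io_opt_*() helpers are hypothetical names made up for
illustration, and only the two if-conditions are taken from the thread. It
assumes physical_block_size is a power of two, as the bitmask test requires.

```c
/* Hypothetical stand-in for the queue limit fields used in the thread. */
struct limits_sketch {
	unsigned int physical_block_size;	/* bytes, assumed power of two */
	unsigned int io_opt;			/* bytes, 0 means "not reported" */
	int misaligned;
};

/* Illustrative helpers; the kernel has its own gcd()/lcm(). */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
	while (b) {
		unsigned int r = a % b;

		a = b;
		b = r;
	}
	return a;
}

static unsigned int lcm_u(unsigned int a, unsigned int b)
{
	if (!a || !b)
		return 0;
	return a / gcd_u(a, b) * b;
}

/* Check quoted from the patch description: drop the hint and fail. */
static int check_io_opt_strict(struct limits_sketch *t)
{
	if (t->io_opt & (t->physical_block_size - 1)) {
		t->io_opt = 0;
		t->misaligned = 1;
		return -1;
	}
	return 0;
}

/* Recalculation suggested in the reply: round the hint up instead. */
static int check_io_opt_lcm(struct limits_sketch *t)
{
	if (t->io_opt & (t->physical_block_size - 1))
		t->io_opt = lcm_u(t->io_opt, t->physical_block_size);
	return 0;
}
```

With io_opt = 512 (the default for a 512B sector namespace) and a resolved
physical_block_size of 4096, the strict check fails and clears io_opt, while
the lcm variant keeps the stacking valid and raises io_opt to 4096.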