From: Gulam Mohamed <gulam.mohamed@oracle.com> To: Keith Busch <kbusch@kernel.org> Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>, "axboe@kernel.dk" <axboe@kernel.dk>, "philipp.reisner@linbit.com" <philipp.reisner@linbit.com>, "lars.ellenberg@linbit.com" <lars.ellenberg@linbit.com>, "christoph.boehmwalder@linbit.com" <christoph.boehmwalder@linbit.com>, "minchan@kernel.org" <minchan@kernel.org>, "ngupta@vflare.org" <ngupta@vflare.org>, "senozhatsky@chromium.org" <senozhatsky@chromium.org>, "colyli@suse.de" <colyli@suse.de>, "kent.overstreet@gmail.com" <kent.overstreet@gmail.com>, "agk@redhat.com" <agk@redhat.com>, "snitzer@kernel.org" <snitzer@kernel.org>, "dm-devel@redhat.com" <dm-devel@redhat.com>, "song@kernel.org" <song@kernel.org>, "dan.j.williams@intel.com" <dan.j.williams@intel.com>, "vishal.l.verma@intel.com" <vishal.l.verma@intel.com>, "dave.jiang@intel.com" <dave.jiang@intel.com>, "ira.weiny@intel.com" <ira.weiny@intel.com>, Junxiao Bi <junxiao.bi@oracle.com>, Martin Petersen <martin.petersen@oracle.com>, "kch@nvidia.com" <kch@nvidia.com>, "drbd-dev@lists.linbit.com" <drbd-dev@lists.linbit.com>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "linux-bcache@vger.kernel.org" <linux-bcache@vger.kernel.org>, "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>, "nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>, Konrad Wilk <konrad.wilk@oracle.com>, Joe Jin <joe.jin@oracle.com>, Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>, Shminderjit Singh <shminderjit.singh@oracle.com> Subject: RE: [PATCH for-6.2/block V3 2/2] block: Change the granularity of io ticks from ms to ns Date: Fri, 23 Dec 2022 14:47:06 +0000 [thread overview] Message-ID: <CO1PR10MB4563F566452B9D3035A82EE298E99@CO1PR10MB4563.namprd10.prod.outlook.com> (raw) In-Reply-To: <Y6M9rJbw3ZMvOeDr@kbusch-mbp.dhcp.thefacebook.com> Hi Keith, Thanks for reviewing this request. Can you please see my inline comments? Regards, Gulam Mohamed. -----Original Message----- From: Keith Busch <kbusch@kernel.org> Sent: Wednesday, December 21, 2022 10:39 PM To: Gulam Mohamed <gulam.mohamed@oracle.com> Cc: linux-block@vger.kernel.org; axboe@kernel.dk; philipp.reisner@linbit.com; lars.ellenberg@linbit.com; christoph.boehmwalder@linbit.com; minchan@kernel.org; ngupta@vflare.org; senozhatsky@chromium.org; colyli@suse.de; kent.overstreet@gmail.com; agk@redhat.com; snitzer@kernel.org; dm-devel@redhat.com; song@kernel.org; dan.j.williams@intel.com; vishal.l.verma@intel.com; dave.jiang@intel.com; ira.weiny@intel.com; Junxiao Bi <junxiao.bi@oracle.com>; Martin Petersen <martin.petersen@oracle.com>; kch@nvidia.com; drbd-dev@lists.linbit.com; linux-kernel@vger.kernel.org; linux-bcache@vger.kernel.org; linux-raid@vger.kernel.org; nvdimm@lists.linux.dev; Konrad Wilk <konrad.wilk@oracle.com>; Joe Jin <joe.jin@oracle.com>; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>; Shminderjit Singh <shminderjit.singh@oracle.com> Subject: Re: [PATCH for-6.2/block V3 2/2] block: Change the granularity of io ticks from ms to ns On Wed, Dec 21, 2022 at 04:05:06AM +0000, Gulam Mohamed wrote: > +u64 blk_get_iostat_ticks(struct request_queue *q) { > + return (blk_queue_precise_io_stat(q) ? ktime_get_ns() : > +jiffies); } EXPORT_SYMBOL_GPL(blk_get_iostat_ticks); > + > void update_io_ticks(struct block_device *part, u64 now, bool end) { > u64 stamp; > @@ -968,20 +980,26 @@ EXPORT_SYMBOL(bdev_start_io_acct); > u64 bio_start_io_acct(struct bio *bio) { > return bdev_start_io_acct(bio->bi_bdev, bio_sectors(bio), > - bio_op(bio), jiffies); > + bio_op(bio), > + blk_get_iostat_ticks(bio->bi_bdev->bd_queue)); > } > EXPORT_SYMBOL_GPL(bio_start_io_acct); > > void bdev_end_io_acct(struct block_device *bdev, enum req_op op, > u64 start_time) > { > + u64 now; > + u64 duration; > + struct request_queue *q = bdev_get_queue(bdev); > const int sgrp = op_stat_group(op); > - u64 now = READ_ONCE(jiffies); > - u64 duration = (unsigned long)now -(unsigned long) start_time; > + now = blk_get_iostat_ticks(q);; I don't think you can rely on the blk_queue_precise_io_stat() flag in the completion side. The user can toggle this with data in flight, which means the completion may use different tick units than the start. I think you'll need to add a flag to the request in the start accounting side to know which method to use for the completion. [GULAM]: As per my understanding, this may work for a single request_queue implemetation. But this request based accounting, as per my understanding, may be an issue with the Multi-QUEUE as there is a separate queue for each CPU and the time-stamp being recorded for the block device is a global one. Also, the issue you mentioned about the start and end accounting may update the ticks in different units for the inflight IOs, may be just for a while. So, even if it works for MQ, I am trying to understand how much is it feasible to do this request-based change for an issue which may be there for just a moment? So, can you please correct me if I am wrong and explore more on your suggestion so that I can understand properly? > @@ -951,6 +951,7 @@ ssize_t part_stat_show(struct device *dev, > struct request_queue *q = bdev_get_queue(bdev); > struct disk_stats stat; > unsigned int inflight; > + u64 stat_ioticks; > > if (queue_is_mq(q)) > inflight = blk_mq_in_flight(q, bdev); @@ -959,10 +960,13 @@ ssize_t > part_stat_show(struct device *dev, > > if (inflight) { > part_stat_lock(); > - update_io_ticks(bdev, jiffies, true); > + update_io_ticks(bdev, blk_get_iostat_ticks(q), true); > part_stat_unlock(); > } > part_stat_read_all(bdev, &stat); > + stat_ioticks = (blk_queue_precise_io_stat(q) ? > + div_u64(stat.io_ticks, NSEC_PER_MSEC) : > + jiffies_to_msecs(stat.io_ticks)); With the user able to change the precision at will, I think these io_ticks need to have a consistent unit size. Mixing jiffies and nsecs is going to create garbage stats output. Could existing io_ticks using jiffies be converted to jiffies_to_nsecs() so that you only have one unit to consider? [GULAM]: I am not sure if this will work as we just multiply with 1000000 to convert jiffies to nano-seconds.
WARNING: multiple messages have this Message-ID (diff)
From: Gulam Mohamed <gulam.mohamed@oracle.com> To: Keith Busch <kbusch@kernel.org> Cc: "nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>, Shminderjit Singh <shminderjit.singh@oracle.com>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "song@kernel.org" <song@kernel.org>, "dm-devel@redhat.com" <dm-devel@redhat.com>, "ira.weiny@intel.com" <ira.weiny@intel.com>, "agk@redhat.com" <agk@redhat.com>, "drbd-dev@lists.linbit.com" <drbd-dev@lists.linbit.com>, "dave.jiang@intel.com" <dave.jiang@intel.com>, "minchan@kernel.org" <minchan@kernel.org>, "vishal.l.verma@intel.com" <vishal.l.verma@intel.com>, Konrad Wilk <konrad.wilk@oracle.com>, Joe Jin <joe.jin@oracle.com>, "kent.overstreet@gmail.com" <kent.overstreet@gmail.com>, "ngupta@vflare.org" <ngupta@vflare.org>, "kch@nvidia.com" <kch@nvidia.com>, "senozhatsky@chromium.org" <senozhatsky@chromium.org>, "snitzer@kernel.org" <snitzer@kernel.org>, "colyli@suse.de" <colyli@suse.de>, "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>, "linux-bcache@vger.kernel.org" <linux-bcache@vger.kernel.org>, "dan.j.williams@intel.com" <dan.j.williams@intel.com>, "axboe@kernel.dk" <axboe@kernel.dk>, "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>, Martin Petersen <martin.petersen@oracle.com>, Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>, "philipp.reisner@linbit.com" <philipp.reisner@linbit.com>, Junxiao Bi <junxiao.bi@oracle.com>, "christoph.boehmwalder@linbit.com" <christoph.boehmwalder@linbit.com>, "lars.ellenberg@linbit.com" <lars.ellenberg@linbit.com> Subject: Re: [dm-devel] [PATCH for-6.2/block V3 2/2] block: Change the granularity of io ticks from ms to ns Date: Fri, 23 Dec 2022 14:47:06 +0000 [thread overview] Message-ID: <CO1PR10MB4563F566452B9D3035A82EE298E99@CO1PR10MB4563.namprd10.prod.outlook.com> (raw) In-Reply-To: <Y6M9rJbw3ZMvOeDr@kbusch-mbp.dhcp.thefacebook.com> Hi Keith, Thanks for reviewing this request. Can you please see my inline comments? Regards, Gulam Mohamed. -----Original Message----- From: Keith Busch <kbusch@kernel.org> Sent: Wednesday, December 21, 2022 10:39 PM To: Gulam Mohamed <gulam.mohamed@oracle.com> Cc: linux-block@vger.kernel.org; axboe@kernel.dk; philipp.reisner@linbit.com; lars.ellenberg@linbit.com; christoph.boehmwalder@linbit.com; minchan@kernel.org; ngupta@vflare.org; senozhatsky@chromium.org; colyli@suse.de; kent.overstreet@gmail.com; agk@redhat.com; snitzer@kernel.org; dm-devel@redhat.com; song@kernel.org; dan.j.williams@intel.com; vishal.l.verma@intel.com; dave.jiang@intel.com; ira.weiny@intel.com; Junxiao Bi <junxiao.bi@oracle.com>; Martin Petersen <martin.petersen@oracle.com>; kch@nvidia.com; drbd-dev@lists.linbit.com; linux-kernel@vger.kernel.org; linux-bcache@vger.kernel.org; linux-raid@vger.kernel.org; nvdimm@lists.linux.dev; Konrad Wilk <konrad.wilk@oracle.com>; Joe Jin <joe.jin@oracle.com>; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>; Shminderjit Singh <shminderjit.singh@oracle.com> Subject: Re: [PATCH for-6.2/block V3 2/2] block: Change the granularity of io ticks from ms to ns On Wed, Dec 21, 2022 at 04:05:06AM +0000, Gulam Mohamed wrote: > +u64 blk_get_iostat_ticks(struct request_queue *q) { > + return (blk_queue_precise_io_stat(q) ? ktime_get_ns() : > +jiffies); } EXPORT_SYMBOL_GPL(blk_get_iostat_ticks); > + > void update_io_ticks(struct block_device *part, u64 now, bool end) { > u64 stamp; > @@ -968,20 +980,26 @@ EXPORT_SYMBOL(bdev_start_io_acct); > u64 bio_start_io_acct(struct bio *bio) { > return bdev_start_io_acct(bio->bi_bdev, bio_sectors(bio), > - bio_op(bio), jiffies); > + bio_op(bio), > + blk_get_iostat_ticks(bio->bi_bdev->bd_queue)); > } > EXPORT_SYMBOL_GPL(bio_start_io_acct); > > void bdev_end_io_acct(struct block_device *bdev, enum req_op op, > u64 start_time) > { > + u64 now; > + u64 duration; > + struct request_queue *q = bdev_get_queue(bdev); > const int sgrp = op_stat_group(op); > - u64 now = READ_ONCE(jiffies); > - u64 duration = (unsigned long)now -(unsigned long) start_time; > + now = blk_get_iostat_ticks(q);; I don't think you can rely on the blk_queue_precise_io_stat() flag in the completion side. The user can toggle this with data in flight, which means the completion may use different tick units than the start. I think you'll need to add a flag to the request in the start accounting side to know which method to use for the completion. [GULAM]: As per my understanding, this may work for a single request_queue implemetation. But this request based accounting, as per my understanding, may be an issue with the Multi-QUEUE as there is a separate queue for each CPU and the time-stamp being recorded for the block device is a global one. Also, the issue you mentioned about the start and end accounting may update the ticks in different units for the inflight IOs, may be just for a while. So, even if it works for MQ, I am trying to understand how much is it feasible to do this request-based change for an issue which may be there for just a moment? So, can you please correct me if I am wrong and explore more on your suggestion so that I can understand properly? > @@ -951,6 +951,7 @@ ssize_t part_stat_show(struct device *dev, > struct request_queue *q = bdev_get_queue(bdev); > struct disk_stats stat; > unsigned int inflight; > + u64 stat_ioticks; > > if (queue_is_mq(q)) > inflight = blk_mq_in_flight(q, bdev); @@ -959,10 +960,13 @@ ssize_t > part_stat_show(struct device *dev, > > if (inflight) { > part_stat_lock(); > - update_io_ticks(bdev, jiffies, true); > + update_io_ticks(bdev, blk_get_iostat_ticks(q), true); > part_stat_unlock(); > } > part_stat_read_all(bdev, &stat); > + stat_ioticks = (blk_queue_precise_io_stat(q) ? > + div_u64(stat.io_ticks, NSEC_PER_MSEC) : > + jiffies_to_msecs(stat.io_ticks)); With the user able to change the precision at will, I think these io_ticks need to have a consistent unit size. Mixing jiffies and nsecs is going to create garbage stats output. Could existing io_ticks using jiffies be converted to jiffies_to_nsecs() so that you only have one unit to consider? [GULAM]: I am not sure if this will work as we just multiply with 1000000 to convert jiffies to nano-seconds. -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
next prev parent reply other threads:[~2022-12-23 14:47 UTC|newest] Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-12-21 4:05 [PATCH for-6.2/block V3 1/2] block: Data type conversion for IO accounting Gulam Mohamed 2022-12-21 4:05 ` [dm-devel] " Gulam Mohamed 2022-12-21 4:05 ` [PATCH for-6.2/block V3 2/2] block: Change the granularity of io ticks from ms to ns Gulam Mohamed 2022-12-21 4:05 ` [dm-devel] " Gulam Mohamed 2022-12-21 16:09 ` kernel test robot 2022-12-21 16:09 ` [dm-devel] " kernel test robot 2022-12-21 17:09 ` Keith Busch 2022-12-21 17:09 ` [dm-devel] " Keith Busch 2022-12-23 14:47 ` Gulam Mohamed [this message] 2022-12-23 14:47 ` Gulam Mohamed 2022-12-21 22:54 ` kernel test robot 2022-12-21 22:54 ` [dm-devel] " kernel test robot 2022-12-24 1:26 ` kernel test robot 2022-12-24 1:26 ` [dm-devel] " kernel test robot 2023-01-17 17:52 ` Mike Snitzer 2023-01-17 17:52 ` [dm-devel] " Mike Snitzer 2022-12-21 9:25 ` [PATCH for-6.2/block V3 1/2] block: Data type conversion for IO accounting kernel test robot 2022-12-21 9:25 ` [dm-devel] " kernel test robot 2022-12-21 16:29 ` kernel test robot 2022-12-21 16:29 ` [dm-devel] " kernel test robot 2022-12-22 15:36 ` kernel test robot 2022-12-22 15:36 ` [dm-devel] " kernel test robot 2022-12-25 11:40 ` Sagi Grimberg 2022-12-25 11:40 ` [dm-devel] " Sagi Grimberg
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CO1PR10MB4563F566452B9D3035A82EE298E99@CO1PR10MB4563.namprd10.prod.outlook.com \ --to=gulam.mohamed@oracle.com \ --cc=agk@redhat.com \ --cc=axboe@kernel.dk \ --cc=christoph.boehmwalder@linbit.com \ --cc=colyli@suse.de \ --cc=dan.j.williams@intel.com \ --cc=dave.jiang@intel.com \ --cc=dm-devel@redhat.com \ --cc=drbd-dev@lists.linbit.com \ --cc=ira.weiny@intel.com \ --cc=joe.jin@oracle.com \ --cc=junxiao.bi@oracle.com \ --cc=kbusch@kernel.org \ --cc=kch@nvidia.com \ --cc=kent.overstreet@gmail.com \ --cc=konrad.wilk@oracle.com \ --cc=lars.ellenberg@linbit.com \ --cc=linux-bcache@vger.kernel.org \ --cc=linux-block@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-raid@vger.kernel.org \ --cc=martin.petersen@oracle.com \ --cc=minchan@kernel.org \ --cc=ngupta@vflare.org \ --cc=nvdimm@lists.linux.dev \ --cc=philipp.reisner@linbit.com \ --cc=rajesh.sivaramasubramaniom@oracle.com \ --cc=senozhatsky@chromium.org \ --cc=shminderjit.singh@oracle.com \ --cc=snitzer@kernel.org \ --cc=song@kernel.org \ --cc=vishal.l.verma@intel.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.