* [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock()
@ 2020-05-04 13:29 Konstantin Khlebnikov
  2020-05-04 13:29 ` [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id() Konstantin Khlebnikov
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-04 13:29 UTC (permalink / raw)
  To: linux-kernel, linux-block, Jens Axboe

The RCU lock is required only in blk_account_io_start(), to look up the
partition. After that the request holds a reference to the related
hd_struct.

Replace get_cpu() with preempt_disable() - the returned CPU index is
unused.

The non-SMP case also needs preempt_disable(), otherwise the statistics
update could be non-atomic. Previously that was provided by
rcu_read_lock().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 block/blk-core.c          |    3 +++
 include/linux/part_stat.h |    7 +++----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7f11560bfddb..45ddf7238c06 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1362,6 +1362,7 @@ void blk_account_io_start(struct request *rq, bool new_io)
 		part = rq->part;
 		part_stat_inc(part, merges[rw]);
 	} else {
+		rcu_read_lock();
 		part = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));
 		if (!hd_struct_try_get(part)) {
 			/*
@@ -1375,6 +1376,8 @@ void blk_account_io_start(struct request *rq, bool new_io)
 			part = &rq->rq_disk->part0;
 			hd_struct_get(part);
 		}
+		rcu_read_unlock();
+
 		part_inc_in_flight(rq->q, part, rw);
 		rq->part = part;
 	}
diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
index ece607607a86..755a01f0fd61 100644
--- a/include/linux/part_stat.h
+++ b/include/linux/part_stat.h
@@ -16,9 +16,10 @@
  * part_stat_{add|set_all}() and {init|free}_part_stats are for
  * internal use only.
  */
+#define part_stat_lock()	preempt_disable()
+#define part_stat_unlock()	preempt_enable()
+
 #ifdef	CONFIG_SMP
-#define part_stat_lock()	({ rcu_read_lock(); get_cpu(); })
-#define part_stat_unlock()	do { put_cpu(); rcu_read_unlock(); } while (0)
 
 #define part_stat_get_cpu(part, field, cpu)				\
 	(per_cpu_ptr((part)->dkstats, (cpu))->field)
@@ -58,8 +59,6 @@ static inline void free_part_stats(struct hd_struct *part)
 }
 
 #else /* !CONFIG_SMP */
-#define part_stat_lock()	({ rcu_read_lock(); 0; })
-#define part_stat_unlock()	rcu_read_unlock()
 
 #define part_stat_get(part, field)		((part)->dkstats.field)
 #define part_stat_get_cpu(part, field, cpu)	part_stat_get(part, field)



* [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id()
  2020-05-04 13:29 [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock() Konstantin Khlebnikov
@ 2020-05-04 13:29 ` Konstantin Khlebnikov
  2020-05-04 14:03   ` Christoph Hellwig
  2020-05-04 13:30 ` [PATCH 3/4] block/part_stat: account merge of two requests Konstantin Khlebnikov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-04 13:29 UTC (permalink / raw)
  To: linux-kernel, linux-block, Jens Axboe

Most architectures have a fast path for accessing per-cpu data of the
current CPU. The preempt_disable() this requires is provided by
part_stat_lock().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 include/linux/part_stat.h |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
index 755a01f0fd61..a0ddeff3798e 100644
--- a/include/linux/part_stat.h
+++ b/include/linux/part_stat.h
@@ -36,6 +36,9 @@
 	res;								\
 })
 
+#define __part_stat_add(part, field, addnd)				\
+	__this_cpu_add((part)->dkstats->field, addnd)
+
 static inline void part_stat_set_all(struct hd_struct *part, int value)
 {
 	int i;
@@ -64,6 +67,9 @@ static inline void free_part_stats(struct hd_struct *part)
 #define part_stat_get_cpu(part, field, cpu)	part_stat_get(part, field)
 #define part_stat_read(part, field)		part_stat_get(part, field)
 
+#define __part_stat_add(part, field, addnd)				\
+	(part_stat_get(part, field) += (addnd))
+
 static inline void part_stat_set_all(struct hd_struct *part, int value)
 {
 	memset(&part->dkstats, value, sizeof(struct disk_stats));
@@ -85,9 +91,6 @@ static inline void free_part_stats(struct hd_struct *part)
 	 part_stat_read(part, field[STAT_WRITE]) +			\
 	 part_stat_read(part, field[STAT_DISCARD]))
 
-#define __part_stat_add(part, field, addnd)				\
-	(part_stat_get(part, field) += (addnd))
-
 #define part_stat_add(part, field, addnd)	do {			\
 	__part_stat_add((part), field, addnd);				\
 	if ((part)->partno)						\



* [PATCH 3/4] block/part_stat: account merge of two requests
  2020-05-04 13:29 [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock() Konstantin Khlebnikov
  2020-05-04 13:29 ` [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id() Konstantin Khlebnikov
@ 2020-05-04 13:30 ` Konstantin Khlebnikov
  2020-05-04 14:03   ` Christoph Hellwig
  2020-05-04 13:31 ` [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio() Konstantin Khlebnikov
  2020-05-04 14:00 ` [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock() Christoph Hellwig
  3 siblings, 1 reply; 10+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-04 13:30 UTC (permalink / raw)
  To: linux-kernel, linux-block, Jens Axboe

Also rename blk_account_io_merge() to blk_account_io_merge_request(), to
distinguish merging two requests from merging a request and a bio.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 block/blk-merge.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index a04e991b5ded..37bced39bae8 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -662,20 +662,23 @@ void blk_rq_set_mixed_merge(struct request *rq)
 	rq->rq_flags |= RQF_MIXED_MERGE;
 }
 
-static void blk_account_io_merge(struct request *req)
+static void blk_account_io_merge_request(struct request *req)
 {
 	if (blk_do_io_stat(req)) {
+		const int sgrp = op_stat_group(req_op(req));
 		struct hd_struct *part;
 
 		part_stat_lock();
 		part = req->part;
 
+		part_stat_inc(part, merges[sgrp]);
 		part_dec_in_flight(req->q, part, rq_data_dir(req));
 
 		hd_struct_put(part);
 		part_stat_unlock();
 	}
 }
+
 /*
  * Two cases of handling DISCARD merge:
  * If max_discard_segments > 1, the driver takes every bio
@@ -787,7 +790,7 @@ static struct request *attempt_merge(struct request_queue *q,
 	/*
 	 * 'next' is going away, so update stats accordingly
 	 */
-	blk_account_io_merge(next);
+	blk_account_io_merge_request(next);
 
 	/*
 	 * ownership of bio passed from next to req, return 'next' for



* [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio()
  2020-05-04 13:29 [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock() Konstantin Khlebnikov
  2020-05-04 13:29 ` [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id() Konstantin Khlebnikov
  2020-05-04 13:30 ` [PATCH 3/4] block/part_stat: account merge of two requests Konstantin Khlebnikov
@ 2020-05-04 13:31 ` Konstantin Khlebnikov
  2020-05-04 14:06   ` Christoph Hellwig
  2020-05-04 14:00 ` [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock() Christoph Hellwig
  3 siblings, 1 reply; 10+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-04 13:31 UTC (permalink / raw)
  To: linux-kernel, linux-block, Jens Axboe

Move the non-"new_io" branch of blk_account_io_start() into a separate
function. Fix merge accounting for discards (they were counted as write
merges).

Also, unlike blk_account_io_start(), blk_account_io_merge_bio() doesn't
call update_io_ticks(); there is no reason to do that for a merge.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 block/blk-core.c |   39 ++++++++++++++++++++++-----------------
 block/blk-exec.c |    2 +-
 block/blk-mq.c   |    2 +-
 block/blk.h      |    2 +-
 4 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 45ddf7238c06..18fb42eb2f18 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -622,6 +622,17 @@ void blk_put_request(struct request *req)
 }
 EXPORT_SYMBOL(blk_put_request);
 
+static void blk_account_io_merge_bio(struct request *req)
+{
+	if (blk_do_io_stat(req)) {
+		const int sgrp = op_stat_group(req_op(req));
+
+		part_stat_lock();
+		part_stat_inc(req->part, merges[sgrp]);
+		part_stat_unlock();
+	}
+}
+
 bool bio_attempt_back_merge(struct request *req, struct bio *bio,
 		unsigned int nr_segs)
 {
@@ -640,7 +651,7 @@ bool bio_attempt_back_merge(struct request *req, struct bio *bio,
 	req->biotail = bio;
 	req->__data_len += bio->bi_iter.bi_size;
 
-	blk_account_io_start(req, false);
+	blk_account_io_merge_bio(req);
 	return true;
 }
 
@@ -664,7 +675,7 @@ bool bio_attempt_front_merge(struct request *req, struct bio *bio,
 	req->__sector = bio->bi_iter.bi_sector;
 	req->__data_len += bio->bi_iter.bi_size;
 
-	blk_account_io_start(req, false);
+	blk_account_io_merge_bio(req);
 	return true;
 }
 
@@ -686,7 +697,7 @@ bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
 	req->__data_len += bio->bi_iter.bi_size;
 	req->nr_phys_segments = segments + 1;
 
-	blk_account_io_start(req, false);
+	blk_account_io_merge_bio(req);
 	return true;
 no_merge:
 	req_set_nomerge(q, req);
@@ -1258,7 +1269,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		return BLK_STS_IOERR;
 
 	if (blk_queue_io_stat(q))
-		blk_account_io_start(rq, true);
+		blk_account_io_start(rq);
 
 	/*
 	 * Since we have a scheduler attached on the top device,
@@ -1348,20 +1359,14 @@ void blk_account_io_done(struct request *req, u64 now)
 	}
 }
 
-void blk_account_io_start(struct request *rq, bool new_io)
+void blk_account_io_start(struct request *rq)
 {
 	struct hd_struct *part;
 	int rw = rq_data_dir(rq);
 
-	if (!blk_do_io_stat(rq))
-		return;
-
-	part_stat_lock();
+	if (blk_do_io_stat(rq)) {
+		part_stat_lock();
 
-	if (!new_io) {
-		part = rq->part;
-		part_stat_inc(part, merges[rw]);
-	} else {
 		rcu_read_lock();
 		part = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));
 		if (!hd_struct_try_get(part)) {
@@ -1378,13 +1383,13 @@ void blk_account_io_start(struct request *rq, bool new_io)
 		}
 		rcu_read_unlock();
 
-		part_inc_in_flight(rq->q, part, rw);
 		rq->part = part;
-	}
 
-	update_io_ticks(part, jiffies, false);
+		part_inc_in_flight(rq->q, part, rw);
+		update_io_ticks(part, jiffies, false);
 
-	part_stat_unlock();
+		part_stat_unlock();
+	}
 }
 
 /*
diff --git a/block/blk-exec.c b/block/blk-exec.c
index e20a852ae432..85324d53d072 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -55,7 +55,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	rq->rq_disk = bd_disk;
 	rq->end_io = done;
 
-	blk_account_io_start(rq, true);
+	blk_account_io_start(rq);
 
 	/*
 	 * don't check dying flag for MQ because the request won't
diff --git a/block/blk-mq.c b/block/blk-mq.c
index bcc3a2397d4a..049c4f9417c3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1794,7 +1794,7 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
 	rq->write_hint = bio->bi_write_hint;
 	blk_rq_bio_prep(rq, bio, nr_segs);
 
-	blk_account_io_start(rq, true);
+	blk_account_io_start(rq);
 }
 
 static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
diff --git a/block/blk.h b/block/blk.h
index 73bd3b1c6938..06cd57cc10fb 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -195,7 +195,7 @@ bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 		unsigned int nr_segs, struct request **same_queue_rq);
 
-void blk_account_io_start(struct request *req, bool new_io);
+void blk_account_io_start(struct request *req);
 void blk_account_io_completion(struct request *req, unsigned int bytes);
 void blk_account_io_done(struct request *req, u64 now);
 



* Re: [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock()
  2020-05-04 13:29 [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock() Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2020-05-04 13:31 ` [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio() Konstantin Khlebnikov
@ 2020-05-04 14:00 ` Christoph Hellwig
  3 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-05-04 14:00 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-kernel, linux-block, Jens Axboe

On Mon, May 04, 2020 at 04:29:29PM +0300, Konstantin Khlebnikov wrote:
> The RCU lock is required only in blk_account_io_start(), to look up the
> partition. After that the request holds a reference to the related
> hd_struct.
> 
> Replace get_cpu() with preempt_disable() - the returned CPU index is
> unused.
> 
> The non-SMP case also needs preempt_disable(), otherwise the statistics
> update could be non-atomic. Previously that was provided by
> rcu_read_lock().
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Looks good,

Signed-off-by: Christoph Hellwig <hch@lst.de>

although I wonder if we should just kill off part_stat_lock and
part_stat_unlock.


* Re: [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id()
  2020-05-04 13:29 ` [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id() Konstantin Khlebnikov
@ 2020-05-04 14:03   ` Christoph Hellwig
  2020-05-04 15:02     ` Konstantin Khlebnikov
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2020-05-04 14:03 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-kernel, linux-block, Jens Axboe

> +#define __part_stat_add(part, field, addnd)				\
> +	(part_stat_get(part, field) += (addnd))

Just open coding part_stat_get for the UP side seems a little
easier to read.

Otherwise this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 3/4] block/part_stat: account merge of two requests
  2020-05-04 13:30 ` [PATCH 3/4] block/part_stat: account merge of two requests Konstantin Khlebnikov
@ 2020-05-04 14:03   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-05-04 14:03 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-kernel, linux-block, Jens Axboe

On Mon, May 04, 2020 at 04:30:52PM +0300, Konstantin Khlebnikov wrote:
> Also rename blk_account_io_merge() to blk_account_io_merge_request(), to
> distinguish merging two requests from merging a request and a bio.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio()
  2020-05-04 13:31 ` [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio() Konstantin Khlebnikov
@ 2020-05-04 14:06   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-05-04 14:06 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-kernel, linux-block, Jens Axboe

On Mon, May 04, 2020 at 04:31:04PM +0300, Konstantin Khlebnikov wrote:
> Move the non-"new_io" branch of blk_account_io_start() into a separate
> function. Fix merge accounting for discards (they were counted as write
> merges).
> 
> Also, unlike blk_account_io_start(), blk_account_io_merge_bio() doesn't
> call update_io_ticks(); there is no reason to do that for a merge.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

Nitpick below:

> +void blk_account_io_start(struct request *rq)
>  {
>  	struct hd_struct *part;
>  	int rw = rq_data_dir(rq);
>  
> +	if (blk_do_io_stat(rq)) {

part and rw probably should move inside this branch.


* Re: [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id()
  2020-05-04 14:03   ` Christoph Hellwig
@ 2020-05-04 15:02     ` Konstantin Khlebnikov
  2020-05-04 15:10       ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Konstantin Khlebnikov @ 2020-05-04 15:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-block, Jens Axboe, Christoph Lameter,
	Dennis Zhou, Tejun Heo

On 04/05/2020 17.03, Christoph Hellwig wrote:
>> +#define __part_stat_add(part, field, addnd)				\
>> +	(part_stat_get(part, field) += (addnd))
> 
> Just open coding part_stat_get for the UP side would seems a little
> easier to read.

If we rewrite the field definition as

#ifdef	CONFIG_SMP
	struct disk_stats __percpu *dkstats;
#else
	struct disk_stats __percpu dkstats[1];
#endif

then all the per-cpu macros should work as-is for the UP case too.
Surprisingly, the arrow operator (struct->field) works for arrays too =)


Inlining the per-cpu structure should be good for tiny UP systems,
especially if it can be done with a few macros in only three places:
definition, allocation and freeing.


But sparse still complains.

./include/linux/part_stat.h:45:24: warning: incorrect type in initializer (different address spaces)
./include/linux/part_stat.h:45:24:    expected void const [noderef] <asn:3> *__vpp_verify
./include/linux/part_stat.h:45:24:    got struct disk_stats [noderef] *

Looks like it cannot assign a different address space to the field.


> 
> Otherwise this looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 


* Re: [PATCH 2/4] block/part_stat: use __this_cpu_add() instead of access by smp_processor_id()
  2020-05-04 15:02     ` Konstantin Khlebnikov
@ 2020-05-04 15:10       ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-05-04 15:10 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Christoph Hellwig, linux-kernel, linux-block, Jens Axboe,
	Christoph Lameter, Dennis Zhou, Tejun Heo

On Mon, May 04, 2020 at 06:02:28PM +0300, Konstantin Khlebnikov wrote:
> Then all per-cpu macro should work as is for UP case too.
> Surprisingly arrow operator (struct->filed) works for arrays too =)
> 
> 
> Inlining per-cpu structures should be good for tiny UP systems.
> Especially if this could be done by few macro only in three places:
> definition, allocating and freeing.

Or we could just use the percpu ops always, which is what most
users do.  I never really got the UP micro-optimization here.

