* [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask
[not found] <cover.1382950629.git.agordeev@redhat.com>
@ 2013-10-28 10:05 ` Alexander Gordeev
2013-10-28 21:20 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:05 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Function steal_tags() might miss a bit in cpus_have_tags due to
unsynchronized access from percpu_ida_free(). As result, function
percpu_ida_alloc() might enter unwakable sleep. This update adds
memory barriers to prevent the described scenario.
In fact, accesses to cpus_have_tags are fenced by prepare_to_wait()
and wake_up() calls at the moment and the aforementioned sequence
does not appear could hit. Nevertheless, explicit memory barriers
still seem justifiable.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index b0698ea..96cfacf 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -68,6 +68,11 @@ static inline void steal_tags(struct percpu_ida *pool,
unsigned cpus_have_tags, cpu = pool->cpu_last_stolen;
struct percpu_ida_cpu *remote;
+ /*
+ * Pairs with smp_wmb() in percpu_ida_free()
+ */
+ smp_rmb();
+
for (cpus_have_tags = cpumask_weight(&pool->cpus_have_tags);
cpus_have_tags * pool->percpu_max_size > pool->nr_tags / 2;
cpus_have_tags--) {
@@ -231,8 +236,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
spin_unlock(&tags->lock);
if (nr_free == 1) {
- cpumask_set_cpu(smp_processor_id(),
- &pool->cpus_have_tags);
+ cpumask_set_cpu(smp_processor_id(), &pool->cpus_have_tags);
+ /*
+ * Pairs with smp_rmb() in steal_tags()
+ */
+ smp_wmb();
wake_up(&pool->wait);
}
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts
[not found] <cover.1382950629.git.agordeev@redhat.com>
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
@ 2013-10-28 10:05 ` Alexander Gordeev
2013-10-28 21:21 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1 Alexander Gordeev
` (2 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:05 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Currently percpu_ida_free() waikes up waiters always with local
interrupts disabled and sometimes with pool->lock held. Yet, it
does not appear there is any reason why it could not be done out
of these atomic contexts.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 96cfacf..9dd8741 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -223,6 +223,7 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
struct percpu_ida_cpu *tags;
unsigned long flags;
unsigned nr_free;
+ bool wake_up = false;
BUG_ON(tag >= pool->nr_tags);
@@ -241,8 +242,8 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
* Pairs with smp_rmb() in steal_tags()
*/
smp_wmb();
- wake_up(&pool->wait);
- }
+ wake_up = true;
+ }
if (nr_free == pool->percpu_max_size) {
spin_lock(&pool->lock);
@@ -255,13 +256,15 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
move_tags(pool->freelist, &pool->nr_free,
tags->freelist, &tags->nr_free,
pool->percpu_batch_size);
-
- wake_up(&pool->wait);
+ wake_up = true;
}
spin_unlock(&pool->lock);
}
local_irq_restore(flags);
+
+ if (wake_up)
+ wake_up(&pool->wait);
}
EXPORT_SYMBOL_GPL(percpu_ida_free);
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1
[not found] <cover.1382950629.git.agordeev@redhat.com>
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
@ 2013-10-28 10:05 ` Alexander Gordeev
2013-10-28 10:06 ` [PATCH 4/5] percpu_ida: Sanity check initialization parameters Alexander Gordeev
2013-10-28 10:06 ` [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags Alexander Gordeev
4 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:05 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
In case percpu_max_size is 1 percpu_ida_free() sets bit in
the cpumask and immediately transfers the tag to the pool.
As result, the very next call to percpu_ida_alloc() on the
same CPU will have to pull a tag from the pool to the local
cache and so on. Hence, positive effects of local caching
become largely negated.
This update assumes stealing tags is faster than ping-ponging
between local caches and the pool and prevents returning tags
to the pool in case percpu_max_size is 1.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 9dd8741..4adc3e5 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -243,9 +243,8 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
*/
smp_wmb();
wake_up = true;
- }
-
- if (nr_free == pool->percpu_max_size) {
+ } else if ((nr_free == pool->percpu_max_size) &&
+ (pool->percpu_max_size > 1)) {
spin_lock(&pool->lock);
/*
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 4/5] percpu_ida: Sanity check initialization parameters
[not found] <cover.1382950629.git.agordeev@redhat.com>
` (2 preceding siblings ...)
2013-10-28 10:05 ` [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1 Alexander Gordeev
@ 2013-10-28 10:06 ` Alexander Gordeev
2013-10-28 10:06 ` [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags Alexander Gordeev
4 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:06 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 4adc3e5..1fc89f9 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -298,6 +298,11 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
{
unsigned i, cpu, order;
+ if (batch_size > max_size)
+ return -ERANGE;
+ if (!batch_size)
+ return -EINVAL;
+
memset(pool, 0, sizeof(*pool));
init_waitqueue_head(&pool->wait);
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags
[not found] <cover.1382950629.git.agordeev@redhat.com>
` (3 preceding siblings ...)
2013-10-28 10:06 ` [PATCH 4/5] percpu_ida: Sanity check initialization parameters Alexander Gordeev
@ 2013-10-28 10:06 ` Alexander Gordeev
4 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:06 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Currently a threshold for stealing tags from remote CPUs
is set to a half of the total number of tags. However,
in general case this threshold is a function not only of
the total number of tags and maximum number of tags per
CPU, but also of a usage pattern. Just let percpu_ida
users decide how big this threshold value should be.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
block/blk-mq-tag.c | 9 +++++----
include/linux/percpu_ida.h | 5 +++--
lib/percpu_ida.c | 7 +++++--
3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d64a02f..da0b3dd 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -145,10 +145,11 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
tags->nr_max_cache = nr_cache;
tags->nr_batch_move = max(1u, nr_cache / 2);
- ret = __percpu_ida_init(&tags->free_tags, tags->nr_tags -
- tags->nr_reserved_tags,
+ ret = __percpu_ida_init(&tags->free_tags,
+ tags->nr_tags - tags->nr_reserved_tags,
tags->nr_max_cache,
- tags->nr_batch_move);
+ tags->nr_batch_move,
+ (tags->nr_tags - tags->nr_reserved_tags) / 2);
if (ret)
goto err_free_tags;
@@ -158,7 +159,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
* no cached. It's fine reserved tags allocation is slow.
*/
ret = __percpu_ida_init(&tags->reserved_tags, reserved_tags,
- 1, 1);
+ 1, 1, 0);
if (ret)
goto err_reserved_tags;
}
diff --git a/include/linux/percpu_ida.h b/include/linux/percpu_ida.h
index 1900bd0..d874cca 100644
--- a/include/linux/percpu_ida.h
+++ b/include/linux/percpu_ida.h
@@ -18,6 +18,7 @@ struct percpu_ida {
unsigned nr_tags;
unsigned percpu_max_size;
unsigned percpu_batch_size;
+ unsigned max_cached;
struct percpu_ida_cpu __percpu *tag_cpu;
@@ -66,11 +67,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag);
void percpu_ida_destroy(struct percpu_ida *pool);
int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
- unsigned long max_size, unsigned long batch_size);
+ unsigned long max_size, unsigned long batch_size, unsigned max_cached);
static inline int percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags)
{
return __percpu_ida_init(pool, nr_tags, IDA_DEFAULT_PCPU_SIZE,
- IDA_DEFAULT_PCPU_BATCH_MOVE);
+ IDA_DEFAULT_PCPU_BATCH_MOVE, nr_tags / 2);
}
typedef int (*percpu_ida_cb)(unsigned, void *);
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 1fc89f9..241f8a3 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -74,7 +74,7 @@ static inline void steal_tags(struct percpu_ida *pool,
smp_rmb();
for (cpus_have_tags = cpumask_weight(&pool->cpus_have_tags);
- cpus_have_tags * pool->percpu_max_size > pool->nr_tags / 2;
+ cpus_have_tags * pool->percpu_max_size > pool->max_cached;
cpus_have_tags--) {
cpu = cpumask_next(cpu, &pool->cpus_have_tags);
@@ -294,7 +294,7 @@ EXPORT_SYMBOL_GPL(percpu_ida_destroy);
* performance, the workload should not span more cpus than nr_tags / 128.
*/
int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
- unsigned long max_size, unsigned long batch_size)
+ unsigned long max_size, unsigned long batch_size, unsigned max_cached)
{
unsigned i, cpu, order;
@@ -302,6 +302,8 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
return -ERANGE;
if (!batch_size)
return -EINVAL;
+ if (max_cached > nr_tags)
+ return -EINVAL;
memset(pool, 0, sizeof(*pool));
@@ -310,6 +312,7 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
pool->nr_tags = nr_tags;
pool->percpu_max_size = max_size;
pool->percpu_batch_size = batch_size;
+ pool->max_cached = max_cached;
/* Guard against overflow */
if (nr_tags > (unsigned) INT_MAX + 1) {
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
@ 2013-10-28 21:20 ` Kent Overstreet
0 siblings, 0 replies; 7+ messages in thread
From: Kent Overstreet @ 2013-10-28 21:20 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
On Mon, Oct 28, 2013 at 11:05:06AM +0100, Alexander Gordeev wrote:
> Function steal_tags() might miss a bit in cpus_have_tags due to
> unsynchronized access from percpu_ida_free(). As result, function
> percpu_ida_alloc() might enter unwakable sleep. This update adds
> memory barriers to prevent the described scenario.
>
> In fact, accesses to cpus_have_tags are fenced by prepare_to_wait()
> and wake_up() calls at the moment and the aforementioned sequence
> does not appear could hit. Nevertheless, explicit memory barriers
> still seem justifiable.
Good catch!
Acked-by: Kent Overstreet <kmo@daterainc.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
@ 2013-10-28 21:21 ` Kent Overstreet
0 siblings, 0 replies; 7+ messages in thread
From: Kent Overstreet @ 2013-10-28 21:21 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
On Mon, Oct 28, 2013 at 11:05:25AM +0100, Alexander Gordeev wrote:
> Currently percpu_ida_free() waikes up waiters always with local
> interrupts disabled and sometimes with pool->lock held. Yet, it
> does not appear there is any reason why it could not be done out
> of these atomic contexts.
This should be a noticable performance boost, nested irqsave/restore is painful.
Acked-by: Kent Overstreet <kmo@daterainc.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-10-28 21:21 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <cover.1382950629.git.agordeev@redhat.com>
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
2013-10-28 21:20 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
2013-10-28 21:21 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1 Alexander Gordeev
2013-10-28 10:06 ` [PATCH 4/5] percpu_ida: Sanity check initialization parameters Alexander Gordeev
2013-10-28 10:06 ` [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags Alexander Gordeev
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).