* [PATCH] block: don't release queue's sysfs lock during switching elevator
@ 2019-09-23 15:12 Ming Lei
2019-09-24 18:37 ` Bart Van Assche
2019-09-26 6:46 ` Jens Axboe
0 siblings, 2 replies; 5+ messages in thread
From: Ming Lei @ 2019-09-23 15:12 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Ming Lei, Christoph Hellwig, Hannes Reinecke,
Greg KH, Mike Snitzer, Bart Van Assche
cecf5d87ff20 ("block: split .sysfs_lock into two locks") starts to
release & acquire sysfs_lock before registering/un-registering elevator
queue during switching elevator for avoiding potential deadlock from
showing & storing 'queue/iosched' attributes and removing elevator's
kobject.
Turns out there isn't such deadlock because 'q->sysfs_lock' isn't
required in .show & .store of queue/iosched's attributes, and just
elevator's sysfs lock is acquired in elv_iosched_store() and
elv_iosched_show(). So it is safe to hold queue's sysfs lock when
registering/un-registering elevator queue.
The biggest issue is that commit cecf5d87ff20 assumes that concurrent
write on 'queue/scheduler' can't happen. However, this assumption isn't
true, because kernfs_fop_write() only guarantees that concurrent write
aren't called on the same open file, but the write could be from
different open on the file. So we can't release & re-acquire queue's
sysfs lock during switching elevator, otherwise use-after-free on
elevator could be triggered.
Fixes the issue by not releasing queue's sysfs lock during switching
elevator.
Fixes: cecf5d87ff20 ("block: split .sysfs_lock into two locks")
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-sysfs.c | 13 ++++---------
block/elevator.c | 31 +------------------------------
2 files changed, 5 insertions(+), 39 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b82736c781c5..962fc0c44381 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -989,13 +989,11 @@ int blk_register_queue(struct gendisk *disk)
blk_mq_debugfs_register(q);
}
- /*
- * The flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator
- * switch won't happen at all.
- */
+ mutex_lock(&q->sysfs_lock);
if (q->elevator) {
ret = elv_register_queue(q, false);
if (ret) {
+ mutex_unlock(&q->sysfs_lock);
mutex_unlock(&q->sysfs_dir_lock);
kobject_del(&q->kobj);
blk_trace_remove_sysfs(dev);
@@ -1005,7 +1003,6 @@ int blk_register_queue(struct gendisk *disk)
has_elevator = true;
}
- mutex_lock(&q->sysfs_lock);
blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);
wbt_enable_default(q);
blk_throtl_register_queue(q);
@@ -1062,12 +1059,10 @@ void blk_unregister_queue(struct gendisk *disk)
kobject_del(&q->kobj);
blk_trace_remove_sysfs(disk_to_dev(disk));
- /*
- * q->kobj has been removed, so it is safe to check if elevator
- * exists without holding q->sysfs_lock.
- */
+ mutex_lock(&q->sysfs_lock);
if (q->elevator)
elv_unregister_queue(q);
+ mutex_unlock(&q->sysfs_lock);
mutex_unlock(&q->sysfs_dir_lock);
kobject_put(&disk_to_dev(disk)->kobj);
diff --git a/block/elevator.c b/block/elevator.c
index bba10e83478a..5437059c9261 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -503,9 +503,7 @@ int elv_register_queue(struct request_queue *q, bool uevent)
if (uevent)
kobject_uevent(&e->kobj, KOBJ_ADD);
- mutex_lock(&q->sysfs_lock);
e->registered = 1;
- mutex_unlock(&q->sysfs_lock);
}
return error;
}
@@ -523,11 +521,9 @@ void elv_unregister_queue(struct request_queue *q)
kobject_uevent(&e->kobj, KOBJ_REMOVE);
kobject_del(&e->kobj);
- mutex_lock(&q->sysfs_lock);
e->registered = 0;
/* Re-enable throttling in case elevator disabled it */
wbt_enable_default(q);
- mutex_unlock(&q->sysfs_lock);
}
}
@@ -590,32 +586,11 @@ int elevator_switch_mq(struct request_queue *q,
lockdep_assert_held(&q->sysfs_lock);
if (q->elevator) {
- if (q->elevator->registered) {
- mutex_unlock(&q->sysfs_lock);
-
- /*
- * Concurrent elevator switch can't happen becasue
- * sysfs write is always exclusively on same file.
- *
- * Also the elevator queue won't be freed after
- * sysfs_lock is released becasue kobject_del() in
- * blk_unregister_queue() waits for completion of
- * .store & .show on its attributes.
- */
+ if (q->elevator->registered)
elv_unregister_queue(q);
- mutex_lock(&q->sysfs_lock);
- }
ioc_clear_queue(q);
elevator_exit(q, q->elevator);
-
- /*
- * sysfs_lock may be dropped, so re-check if queue is
- * unregistered. If yes, don't switch to new elevator
- * any more
- */
- if (!blk_queue_registered(q))
- return 0;
}
ret = blk_mq_init_sched(q, new_e);
@@ -623,11 +598,7 @@ int elevator_switch_mq(struct request_queue *q,
goto out;
if (new_e) {
- mutex_unlock(&q->sysfs_lock);
-
ret = elv_register_queue(q, true);
-
- mutex_lock(&q->sysfs_lock);
if (ret) {
elevator_exit(q, q->elevator);
goto out;
--
2.20.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] block: don't release queue's sysfs lock during switching elevator
2019-09-23 15:12 [PATCH] block: don't release queue's sysfs lock during switching elevator Ming Lei
@ 2019-09-24 18:37 ` Bart Van Assche
2019-09-25 0:13 ` Ming Lei
2019-09-26 6:46 ` Jens Axboe
1 sibling, 1 reply; 5+ messages in thread
From: Bart Van Assche @ 2019-09-24 18:37 UTC (permalink / raw)
To: Ming Lei, Jens Axboe
Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Greg KH, Mike Snitzer
On 9/23/19 8:12 AM, Ming Lei wrote:
> @@ -523,11 +521,9 @@ void elv_unregister_queue(struct request_queue *q)
> kobject_uevent(&e->kobj, KOBJ_REMOVE);
> kobject_del(&e->kobj);
>
> - mutex_lock(&q->sysfs_lock);
> e->registered = 0;
> /* Re-enable throttling in case elevator disabled it */
> wbt_enable_default(q);
> - mutex_unlock(&q->sysfs_lock);
> }
> }
Does this patch cause sysfs_lock to be held around
kobject_del(&e->kobj)? Since sysfs_lock is locked from inside
elv_attr_show() and elv_attr_store(), does this mean that this patch
reintroduces the lock inversion problem that was fixed recently?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: don't release queue's sysfs lock during switching elevator
2019-09-24 18:37 ` Bart Van Assche
@ 2019-09-25 0:13 ` Ming Lei
2019-09-25 23:40 ` Bart Van Assche
0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2019-09-25 0:13 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Hannes Reinecke,
Greg KH, Mike Snitzer
On Tue, Sep 24, 2019 at 11:37:09AM -0700, Bart Van Assche wrote:
> On 9/23/19 8:12 AM, Ming Lei wrote:
> > @@ -523,11 +521,9 @@ void elv_unregister_queue(struct request_queue *q)
> > kobject_uevent(&e->kobj, KOBJ_REMOVE);
> > kobject_del(&e->kobj);
> > - mutex_lock(&q->sysfs_lock);
> > e->registered = 0;
> > /* Re-enable throttling in case elevator disabled it */
> > wbt_enable_default(q);
> > - mutex_unlock(&q->sysfs_lock);
> > }
> > }
>
> Does this patch cause sysfs_lock to be held around kobject_del(&e->kobj)?
Yes.
> Since sysfs_lock is locked from inside elv_attr_show() and elv_attr_store(),
The request queue's sysfs_lock isn't required in elv_attr_show() and
elv_attr_store(), and only elevator's sysfs_lock is needed in the two
functions.
> does this mean that this patch reintroduces the lock inversion problem that
> was fixed recently?
No.
The lock inversion issue only existed on kobjects of q->kobj & q->mq_obj,
which was fixed already given the queue's sysfs_lock is required in
.show/.store callback of these two kobjects' attributes.
thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: don't release queue's sysfs lock during switching elevator
2019-09-25 0:13 ` Ming Lei
@ 2019-09-25 23:40 ` Bart Van Assche
0 siblings, 0 replies; 5+ messages in thread
From: Bart Van Assche @ 2019-09-25 23:40 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, Christoph Hellwig, Hannes Reinecke,
Greg KH, Mike Snitzer
On 9/24/19 5:13 PM, Ming Lei wrote:
> On Tue, Sep 24, 2019 at 11:37:09AM -0700, Bart Van Assche wrote:
>> On 9/23/19 8:12 AM, Ming Lei wrote:
>>> @@ -523,11 +521,9 @@ void elv_unregister_queue(struct request_queue *q)
>>> kobject_uevent(&e->kobj, KOBJ_REMOVE);
>>> kobject_del(&e->kobj);
>>> - mutex_lock(&q->sysfs_lock);
>>> e->registered = 0;
>>> /* Re-enable throttling in case elevator disabled it */
>>> wbt_enable_default(q);
>>> - mutex_unlock(&q->sysfs_lock);
>>> }
>>> }
>>
>> Does this patch cause sysfs_lock to be held around kobject_del(&e->kobj)?
>
> Yes.
>
>> Since sysfs_lock is locked from inside elv_attr_show() and elv_attr_store(),
>
> The request queue's sysfs_lock isn't required in elv_attr_show() and
> elv_attr_store(), and only elevator's sysfs_lock is needed in the two
> functions.
>
>> does this mean that this patch reintroduces the lock inversion problem that
>> was fixed recently?
>
> No.
>
> The lock inversion issue only existed on kobjects of q->kobj & q->mq_obj,
> which was fixed already given the queue's sysfs_lock is required in
> .show/.store callback of these two kobjects' attributes.
I had confused e->kobj and eq->kobj, hence my comments. After having
taken another look at your patch:
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: don't release queue's sysfs lock during switching elevator
2019-09-23 15:12 [PATCH] block: don't release queue's sysfs lock during switching elevator Ming Lei
2019-09-24 18:37 ` Bart Van Assche
@ 2019-09-26 6:46 ` Jens Axboe
1 sibling, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2019-09-26 6:46 UTC (permalink / raw)
To: Ming Lei
Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Greg KH,
Mike Snitzer, Bart Van Assche
On 9/23/19 5:12 PM, Ming Lei wrote:
> cecf5d87ff20 ("block: split .sysfs_lock into two locks") starts to
> release & acquire sysfs_lock before registering/un-registering elevator
> queue during switching elevator for avoiding potential deadlock from
> showing & storing 'queue/iosched' attributes and removing elevator's
> kobject.
>
> Turns out there isn't such deadlock because 'q->sysfs_lock' isn't
> required in .show & .store of queue/iosched's attributes, and just
> elevator's sysfs lock is acquired in elv_iosched_store() and
> elv_iosched_show(). So it is safe to hold queue's sysfs lock when
> registering/un-registering elevator queue.
>
> The biggest issue is that commit cecf5d87ff20 assumes that concurrent
> write on 'queue/scheduler' can't happen. However, this assumption isn't
> true, because kernfs_fop_write() only guarantees that concurrent write
> aren't called on the same open file, but the write could be from
> different open on the file. So we can't release & re-acquire queue's
> sysfs lock during switching elevator, otherwise use-after-free on
> elevator could be triggered.
>
> Fixes the issue by not releasing queue's sysfs lock during switching
> elevator.
Applied, thanks Ming.
--
Jens Axboe
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-09-26 6:46 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23 15:12 [PATCH] block: don't release queue's sysfs lock during switching elevator Ming Lei
2019-09-24 18:37 ` Bart Van Assche
2019-09-25 0:13 ` Ming Lei
2019-09-25 23:40 ` Bart Van Assche
2019-09-26 6:46 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).