All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] bsg: call idr_pre_get() outside the lock.
@ 2010-09-01 14:26 Naohiro Aota
  2010-09-02  4:13 ` FUJITA Tomonori
  2010-09-16 23:37 ` Andrew Morton
  0 siblings, 2 replies; 6+ messages in thread
From: Naohiro Aota @ 2010-09-01 14:26 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-kernel, Jens Axboe, Jiri Kosina, Greg Kroah-Hartman,
	Naohiro Aota, Daniel Mack, linux-scsi

The idr_pre_get() kernel-doc says "This function should be called
prior to locking and calling the idr_get_new* functions.", but the
idr_pre_get() calling in bsg_register_queue() is put inside
mutex_lock(). Let's fix it.

Signed-off-by: Naohiro Aota <naota@elisp.net>
---
 block/bsg.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index 82d5882..5fd8dd1 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
 	bcd = &q->bsg_dev;
 	memset(bcd, 0, sizeof(*bcd));
 
-	mutex_lock(&bsg_mutex);
-
 	ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
-	if (!ret) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	if (!ret)
+		return -ENOMEM;
+
+	mutex_lock(&bsg_mutex);
 
 	ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
 	if (ret < 0)
-- 
1.7.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] bsg: call idr_pre_get() outside the lock.
  2010-09-01 14:26 [PATCH] bsg: call idr_pre_get() outside the lock Naohiro Aota
@ 2010-09-02  4:13 ` FUJITA Tomonori
  2010-09-16 23:37 ` Andrew Morton
  1 sibling, 0 replies; 6+ messages in thread
From: FUJITA Tomonori @ 2010-09-02  4:13 UTC (permalink / raw)
  To: naota, JBottomley
  Cc: fujita.tomonori, linux-kernel, axboe, jkosina, gregkh, daniel,
	linux-scsi

On Wed, 01 Sep 2010 23:26:01 +0900
Naohiro Aota <naota@elisp.net> wrote:

> The idr_pre_get() kernel-doc says "This function should be called
> prior to locking and calling the idr_get_new* functions.", but the
> idr_pre_get() calling in bsg_register_queue() is put inside
> mutex_lock(). Let's fix it.
> 
> Signed-off-by: Naohiro Aota <naota@elisp.net>
> ---
>  block/bsg.c |   10 ++++------
>  1 files changed, 4 insertions(+), 6 deletions(-)

Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>

Thanks,

James, can you merge this via your tree?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bsg: call idr_pre_get() outside the lock.
  2010-09-01 14:26 [PATCH] bsg: call idr_pre_get() outside the lock Naohiro Aota
  2010-09-02  4:13 ` FUJITA Tomonori
@ 2010-09-16 23:37 ` Andrew Morton
  2010-09-17  2:55   ` James Bottomley
                     ` (2 more replies)
  1 sibling, 3 replies; 6+ messages in thread
From: Andrew Morton @ 2010-09-16 23:37 UTC (permalink / raw)
  To: Naohiro Aota
  Cc: FUJITA Tomonori, linux-kernel, Jens Axboe, Jiri Kosina,
	Greg Kroah-Hartman, Daniel Mack, linux-scsi

On Wed, 01 Sep 2010 23:26:01 +0900
Naohiro Aota <naota@elisp.net> wrote:

> The idr_pre_get() kernel-doc says "This function should be called
> prior to locking and calling the idr_get_new* functions.", but the
> idr_pre_get() calling in bsg_register_queue() is put inside
> mutex_lock(). Let's fix it.
> 

The idr_pre_get() kerneldoc is wrong.  Or at least, misleading.

The way this all works is that we precharge the tree via idr_pre_get()
and we do this outside locks so we can use GFP_KERNEL.  Then we take
the lock (a spinlock!) and then try to add an element to the tree,
which will consume objects from the pool which was prefilled by
idr_pre_get().

There's an obvious race here where someone else can get in and steal
objects from the prefilled pool.  We can fix that with a retry loop:


again:
	if (idr_pre_get(..., GFP_KERNEL) == NULL)
		return -ENOMEM;		/* We're really out-of-memory */
	spin_lock(lock);
	if (idr_get_new(...) == -EAGAIN) {
		spin_unlock(lock);
		goto again;		/* Someone stole our preallocation! */
	}
	...

this way we avoid the false -ENOMEM which the race would have caused. 
We only declare -ENOMEM when we're REALLY out of memory.


But none of this is needed when a sleeping lock is used (as long as the
sleeping lock isn't taken on the the VM pageout path, of course).  In
that case we can use the sleeping lock to prevent the above race.

> diff --git a/block/bsg.c b/block/bsg.c
> index 82d5882..5fd8dd1 100644
> --- a/block/bsg.c
> +++ b/block/bsg.c
> @@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
>  	bcd = &q->bsg_dev;
>  	memset(bcd, 0, sizeof(*bcd));
>  
> -	mutex_lock(&bsg_mutex);
> -
>  	ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
> -	if (!ret) {
> -		ret = -ENOMEM;
> -		goto unlock;
> -	}
> +	if (!ret)
> +		return -ENOMEM;
> +
> +	mutex_lock(&bsg_mutex);
>  
>  	ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
>  	if (ret < 0)

So the old code was OK.

The new code, however, is not OK because it is vulnerable to the above
race wherein another CPU or thread comes in and steals all of this
thread's preallocation.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bsg: call idr_pre_get() outside the lock.
  2010-09-16 23:37 ` Andrew Morton
@ 2010-09-17  2:55   ` James Bottomley
  2010-09-18  3:40   ` FUJITA Tomonori
  2010-09-19  3:11   ` Naohiro Aota
  2 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2010-09-17  2:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naohiro Aota, FUJITA Tomonori, linux-kernel, Jens Axboe,
	Jiri Kosina, Greg Kroah-Hartman, Daniel Mack, linux-scsi

On Thu, 2010-09-16 at 16:37 -0700, Andrew Morton wrote:
> On Wed, 01 Sep 2010 23:26:01 +0900
> Naohiro Aota <naota@elisp.net> wrote:
> 
> > The idr_pre_get() kernel-doc says "This function should be called
> > prior to locking and calling the idr_get_new* functions.", but the
> > idr_pre_get() calling in bsg_register_queue() is put inside
> > mutex_lock(). Let's fix it.
> > 
> 
> The idr_pre_get() kerneldoc is wrong.  Or at least, misleading.
> 
> The way this all works is that we precharge the tree via idr_pre_get()
> and we do this outside locks so we can use GFP_KERNEL.  Then we take
> the lock (a spinlock!) and then try to add an element to the tree,
> which will consume objects from the pool which was prefilled by
> idr_pre_get().
> 
> There's an obvious race here where someone else can get in and steal
> objects from the prefilled pool.  We can fix that with a retry loop:
> 
> 
> again:
> 	if (idr_pre_get(..., GFP_KERNEL) == NULL)
> 		return -ENOMEM;		/* We're really out-of-memory */
> 	spin_lock(lock);
> 	if (idr_get_new(...) == -EAGAIN) {
> 		spin_unlock(lock);
> 		goto again;		/* Someone stole our preallocation! */
> 	}
> 	...
> 
> this way we avoid the false -ENOMEM which the race would have caused. 
> We only declare -ENOMEM when we're REALLY out of memory.
> 
> 
> But none of this is needed when a sleeping lock is used (as long as the
> sleeping lock isn't taken on the the VM pageout path, of course).  In
> that case we can use the sleeping lock to prevent the above race.
> 
> > diff --git a/block/bsg.c b/block/bsg.c
> > index 82d5882..5fd8dd1 100644
> > --- a/block/bsg.c
> > +++ b/block/bsg.c
> > @@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
> >  	bcd = &q->bsg_dev;
> >  	memset(bcd, 0, sizeof(*bcd));
> >  
> > -	mutex_lock(&bsg_mutex);
> > -
> >  	ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
> > -	if (!ret) {
> > -		ret = -ENOMEM;
> > -		goto unlock;
> > -	}
> > +	if (!ret)
> > +		return -ENOMEM;
> > +
> > +	mutex_lock(&bsg_mutex);
> >  
> >  	ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
> >  	if (ret < 0)
> 
> So the old code was OK.
> 
> The new code, however, is not OK because it is vulnerable to the above
> race wherein another CPU or thread comes in and steals all of this
> thread's preallocation.

Hmm, you're right ... I've dropped the patch.

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bsg: call idr_pre_get() outside the lock.
  2010-09-16 23:37 ` Andrew Morton
  2010-09-17  2:55   ` James Bottomley
@ 2010-09-18  3:40   ` FUJITA Tomonori
  2010-09-19  3:11   ` Naohiro Aota
  2 siblings, 0 replies; 6+ messages in thread
From: FUJITA Tomonori @ 2010-09-18  3:40 UTC (permalink / raw)
  To: akpm
  Cc: naota, fujita.tomonori, linux-kernel, axboe, jkosina, gregkh,
	daniel, linux-scsi

On Thu, 16 Sep 2010 16:37:35 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 01 Sep 2010 23:26:01 +0900
> Naohiro Aota <naota@elisp.net> wrote:
> 
> > The idr_pre_get() kernel-doc says "This function should be called
> > prior to locking and calling the idr_get_new* functions.", but the
> > idr_pre_get() calling in bsg_register_queue() is put inside
> > mutex_lock(). Let's fix it.
> > 
> 
> The idr_pre_get() kerneldoc is wrong.  Or at least, misleading.

Ah, thanks a lot. As usual, seems that there are few reliable
documents in kernel.


> The way this all works is that we precharge the tree via idr_pre_get()
> and we do this outside locks so we can use GFP_KERNEL.  Then we take
> the lock (a spinlock!) and then try to add an element to the tree,
> which will consume objects from the pool which was prefilled by
> idr_pre_get().
> 
> There's an obvious race here where someone else can get in and steal
> objects from the prefilled pool.  We can fix that with a retry loop:
> 
> 
> again:
> 	if (idr_pre_get(..., GFP_KERNEL) == NULL)
> 		return -ENOMEM;		/* We're really out-of-memory */
> 	spin_lock(lock);
> 	if (idr_get_new(...) == -EAGAIN) {
> 		spin_unlock(lock);
> 		goto again;		/* Someone stole our preallocation! */
> 	}
> 	...
> 
> this way we avoid the false -ENOMEM which the race would have caused. 
> We only declare -ENOMEM when we're REALLY out of memory.

Looks like there are many misuses idr_pre_get and idr_get_new; some
even calls idr_get_new without any lock (e.g. blk_alloc_devt).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bsg: call idr_pre_get() outside the lock.
  2010-09-16 23:37 ` Andrew Morton
  2010-09-17  2:55   ` James Bottomley
  2010-09-18  3:40   ` FUJITA Tomonori
@ 2010-09-19  3:11   ` Naohiro Aota
  2 siblings, 0 replies; 6+ messages in thread
From: Naohiro Aota @ 2010-09-19  3:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: FUJITA Tomonori, linux-kernel, Jens Axboe, Jiri Kosina,
	Greg Kroah-Hartman, Daniel Mack, linux-scsi

Andrew Morton <akpm@linux-foundation.org> writes:

> On Wed, 01 Sep 2010 23:26:01 +0900
> Naohiro Aota <naota@elisp.net> wrote:
>
>> The idr_pre_get() kernel-doc says "This function should be called
>> prior to locking and calling the idr_get_new* functions.", but the
>> idr_pre_get() calling in bsg_register_queue() is put inside
>> mutex_lock(). Let's fix it.
>> 
>
> The idr_pre_get() kerneldoc is wrong.  Or at least, misleading.
>
> The way this all works is that we precharge the tree via idr_pre_get()
> and we do this outside locks so we can use GFP_KERNEL.  Then we take
> the lock (a spinlock!) and then try to add an element to the tree,
> which will consume objects from the pool which was prefilled by
> idr_pre_get().
>
> There's an obvious race here where someone else can get in and steal
> objects from the prefilled pool.  We can fix that with a retry loop:
>
>
> again:
> 	if (idr_pre_get(..., GFP_KERNEL) == NULL)
> 		return -ENOMEM;		/* We're really out-of-memory */
> 	spin_lock(lock);
> 	if (idr_get_new(...) == -EAGAIN) {
> 		spin_unlock(lock);
> 		goto again;		/* Someone stole our preallocation! */
> 	}
> 	...
>
> this way we avoid the false -ENOMEM which the race would have caused. 
> We only declare -ENOMEM when we're REALLY out of memory.
>
>
> But none of this is needed when a sleeping lock is used (as long as the
> sleeping lock isn't taken on the the VM pageout path, of course).  In
> that case we can use the sleeping lock to prevent the above race.
>
>> diff --git a/block/bsg.c b/block/bsg.c
>> index 82d5882..5fd8dd1 100644
>> --- a/block/bsg.c
>> +++ b/block/bsg.c
>> @@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
>>  	bcd = &q->bsg_dev;
>>  	memset(bcd, 0, sizeof(*bcd));
>>  
>> -	mutex_lock(&bsg_mutex);
>> -
>>  	ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
>> -	if (!ret) {
>> -		ret = -ENOMEM;
>> -		goto unlock;
>> -	}
>> +	if (!ret)
>> +		return -ENOMEM;
>> +
>> +	mutex_lock(&bsg_mutex);
>>  
>>  	ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
>>  	if (ret < 0)
>
> So the old code was OK.
>
> The new code, however, is not OK because it is vulnerable to the above
> race wherein another CPU or thread comes in and steals all of this
> thread's preallocation.

I see. Thank you for your comment.

> The idr_pre_get() kerneldoc is wrong.  Or at least, misleading.

Then we can fix the kerneldoc like this?

----
>From 707ec72b10950ae123f955308f4f1dc32d7a77be Mon Sep 17 00:00:00 2001
From: Naohiro Aota <naota@elisp.net>
Date: Sun, 19 Sep 2010 08:33:01 +0900
Subject: [PATCH] idr: Fix idr_pre_get() realated description about locking

Despite the idr_pre_get() kernel-doc, there is some case you can call
idr_pre_get() in lock regions. This patch add descriprion for such
cases.

See also: http://lkml.org/lkml/2010/9/16/462

Signed-off-by: Naohiro Aota <naota@elisp.net>
---
 lib/idr.c |   17 ++++++++---------
 1 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/idr.c b/lib/idr.c
index 5e0966b..0f97611 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -110,9 +110,10 @@ static void idr_mark_full(struct idr_layer **pa, int id)
  * @idp:	idr handle
  * @gfp_mask:	memory allocation flags
  *
- * This function should be called prior to locking and calling the
- * idr_get_new* functions. It preallocates enough memory to satisfy
- * the worst possible allocation.
+ * This function should be called prior to calling the idr_get_new*
+ * functions. It preallocates enough memory to satisfy the worst
+ * possible allocation. You can sleep in this function iff without
+ * holding spinlock.
  *
  * If the system is REALLY out of memory this function returns 0,
  * otherwise 1.
@@ -290,9 +291,8 @@ static int idr_get_new_above_int(struct idr *idp, void *ptr, int starting_id)
  * This is the allocate id function.  It should be called with any
  * required locks.
  *
- * If memory is required, it will return -EAGAIN, you should unlock
- * and go back to the idr_pre_get() call.  If the idr is full, it will
- * return -ENOSPC.
+ * If memory is required, it will return -EAGAIN, you should go back to
+ * the idr_pre_get() call. If the idr is full, it will return -ENOSPC.
  *
  * @id returns a value in the range @starting_id ... 0x7fffffff
  */
@@ -321,9 +321,8 @@ EXPORT_SYMBOL(idr_get_new_above);
  * This is the allocate id function.  It should be called with any
  * required locks.
  *
- * If memory is required, it will return -EAGAIN, you should unlock
- * and go back to the idr_pre_get() call.  If the idr is full, it will
- * return -ENOSPC.
+ * If memory is required, it will return -EAGAIN, you should go back to
+ * the idr_pre_get() call. If the idr is full, it will return -ENOSPC.
  *
  * @id returns a value in the range 0 ... 0x7fffffff
  */
-- 
1.7.2.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-09-19  3:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-01 14:26 [PATCH] bsg: call idr_pre_get() outside the lock Naohiro Aota
2010-09-02  4:13 ` FUJITA Tomonori
2010-09-16 23:37 ` Andrew Morton
2010-09-17  2:55   ` James Bottomley
2010-09-18  3:40   ` FUJITA Tomonori
2010-09-19  3:11   ` Naohiro Aota

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.