* [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
@ 2016-01-15  0:36 ` Junil Lee
  0 siblings, 0 replies; 19+ messages in thread
From: Junil Lee @ 2016-01-15  0:36 UTC (permalink / raw)
  To: minchan, ngupta
  Cc: sergey.senozhatsky.work, linux-mm, linux-kernel, Junil Lee

To prevent an unlock in the wrong place, tag the new obj in migrate_zspage()
so the handle stays locked until the proper unlock path.

The two functions synchronize through a tag that sets the last bit of the
obj. However, the handle is unlocked prematurely when the new obj is stored
into it before unpin_tag(), the proper unlock path, is called.

The problem, summarized as a call flow:

		CPU0						CPU1
migrate_zspage()
  find_alloced_obj()
    trypin_tag() -- obj |= HANDLE_PIN_BIT
  obj_malloc() -- pin bit of new obj not set		zs_free()
  record_obj() -- unlocks, breaking sync		  pin_tag() -- takes the lock
  unpin_tag() -- clears the bit zs_free() now holds

Signed-off-by: Junil Lee <junil0814.lee@lge.com>
---
 mm/zsmalloc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index e7414ce..bb459ef 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1635,6 +1635,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 		free_obj = obj_malloc(d_page, class, handle);
 		zs_object_copy(free_obj, used_obj, class);
 		index++;
+		free_obj |= BIT(HANDLE_PIN_BIT);
 		record_obj(handle, free_obj);
 		unpin_tag(handle);
 		obj_free(pool, class, used_obj);
-- 
2.6.2
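
For context: HANDLE_PIN_BIT is the least significant bit of the value stored
at a handle and is used as a per-handle bit lock. A minimal sketch of the
helpers the call flow above refers to (names as in mm/zsmalloc.c of that
era; the bodies here are simplified and may differ from the real ones):

static int trypin_tag(unsigned long handle)
{
	unsigned long *ptr = (unsigned long *)handle;

	/* Acquire: nonzero iff the pin bit was clear and is now set. */
	return !test_and_set_bit_lock(HANDLE_PIN_BIT, ptr);
}

static void pin_tag(unsigned long handle)
{
	/* Spin until the pin bit is acquired. */
	while (!trypin_tag(handle))
		;
}

static void unpin_tag(unsigned long handle)
{
	unsigned long *ptr = (unsigned long *)handle;

	/* Release: clear the pin bit with release semantics. */
	clear_bit_unlock(HANDLE_PIN_BIT, ptr);
}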

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  0:36 ` Junil Lee
@ 2016-01-15  2:35   ` Minchan Kim
  -1 siblings, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2016-01-15  2:35 UTC (permalink / raw)
  To: Junil Lee; +Cc: ngupta, sergey.senozhatsky.work, linux-mm, linux-kernel

Hi Junil,

On Fri, Jan 15, 2016 at 09:36:24AM +0900, Junil Lee wrote:
> To prevent an unlock in the wrong place, tag the new obj in migrate_zspage()
> so the handle stays locked until the proper unlock path.
> 
> The two functions synchronize through a tag that sets the last bit of the
> obj. However, the handle is unlocked prematurely when the new obj is stored
> into it before unpin_tag(), the proper unlock path, is called.
> 
> The problem, summarized as a call flow:
> 
> 		CPU0						CPU1
> migrate_zspage()
>   find_alloced_obj()
>     trypin_tag() -- obj |= HANDLE_PIN_BIT
>   obj_malloc() -- pin bit of new obj not set		zs_free()
>   record_obj() -- unlocks, breaking sync		  pin_tag() -- takes the lock
>   unpin_tag() -- clears the bit zs_free() now holds

It's a really good catch!
I think it should be stable material. For that, we should know what kind
of problem this patch fixes.

What problem did you see? I mean, please write down the oops you saw and
verify that the patch fixes your problem. :)

Minor nit below

> 
> Signed-off-by: Junil Lee <junil0814.lee@lge.com>
> ---
>  mm/zsmalloc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index e7414ce..bb459ef 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1635,6 +1635,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>  		free_obj = obj_malloc(d_page, class, handle);
>  		zs_object_copy(free_obj, used_obj, class);
>  		index++;
> +		free_obj |= BIT(HANDLE_PIN_BIT);
>  		record_obj(handle, free_obj);

I think record_obj should store free_obj to *handle with the least bit masked off.
IOW, how about this?

record_obj(handle, obj)
{
        *(unsigned long *)handle = obj & ~(1UL << HANDLE_PIN_BIT);
}

Thanks a lot!

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  2:35   ` Minchan Kim
@ 2016-01-15  3:27     ` Sergey Senozhatsky
  -1 siblings, 0 replies; 19+ messages in thread
From: Sergey Senozhatsky @ 2016-01-15  3:27 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Junil Lee, Andrew Morton, ngupta, sergey.senozhatsky.work,
	linux-mm, linux-kernel

Cc Andrew,

On (01/15/16 11:35), Minchan Kim wrote:
[..]
> > Signed-off-by: Junil Lee <junil0814.lee@lge.com>
> > ---
> >  mm/zsmalloc.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index e7414ce..bb459ef 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1635,6 +1635,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
> >  		free_obj = obj_malloc(d_page, class, handle);
> >  		zs_object_copy(free_obj, used_obj, class);
> >  		index++;
> > +		free_obj |= BIT(HANDLE_PIN_BIT);
> >  		record_obj(handle, free_obj);
> 
> I think record_obj should store free_obj to *handle with the least bit masked off.
> IOW, how about this?
> 
> record_obj(handle, obj)
> {
>         *(unsigned long *)handle = obj & ~(1UL << HANDLE_PIN_BIT);
> }

[just a wild idea]

or zs_free() can take spin_lock(&class->lock) earlier; it cannot free the
object until the class is locked anyway, and migration is happening with
the locked class. extending class->lock scope in zs_free() thus should
not affect the performance. so it'll be either zs_free() touching the
object or the migration, not both.

	-ss

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  3:27     ` Sergey Senozhatsky
@ 2016-01-15  3:30       ` Sergey Senozhatsky
  -1 siblings, 0 replies; 19+ messages in thread
From: Sergey Senozhatsky @ 2016-01-15  3:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Junil Lee, Andrew Morton, ngupta, linux-mm,
	linux-kernel, Sergey Senozhatsky

On (01/15/16 12:27), Sergey Senozhatsky wrote:
> > > @@ -1635,6 +1635,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
> > >  		free_obj = obj_malloc(d_page, class, handle);
> > >  		zs_object_copy(free_obj, used_obj, class);
> > >  		index++;
> > > +		free_obj |= BIT(HANDLE_PIN_BIT);
> > >  		record_obj(handle, free_obj);
> > 
> > I think record_obj should store free_obj to *handle with the least bit masked off.
> > IOW, how about this?
> > 
> > record_obj(handle, obj)
> > {
> >         *(unsigned long *)handle = obj & ~(1UL << HANDLE_PIN_BIT);
> > }
> 
> [just a wild idea]
> 
> or zs_free() can take spin_lock(&class->lock) earlier; it cannot free the
> object until the class is locked anyway, and migration is happening with
			  UNlocked

> the locked class. extending class->lock scope in zs_free() thus should
> not affect the performance. so it'll be either zs_free() touching the
> object or the migration, not both.

	-ss

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  3:27     ` Sergey Senozhatsky
@ 2016-01-15  4:49       ` Minchan Kim
  -1 siblings, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2016-01-15  4:49 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Junil Lee, Andrew Morton, ngupta, linux-mm, linux-kernel

On Fri, Jan 15, 2016 at 12:27:12PM +0900, Sergey Senozhatsky wrote:
> Cc Andrew,
> 
> On (01/15/16 11:35), Minchan Kim wrote:
> [..]
> > > Signed-off-by: Junil Lee <junil0814.lee@lge.com>
> > > ---
> > >  mm/zsmalloc.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > index e7414ce..bb459ef 100644
> > > --- a/mm/zsmalloc.c
> > > +++ b/mm/zsmalloc.c
> > > @@ -1635,6 +1635,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
> > >  		free_obj = obj_malloc(d_page, class, handle);
> > >  		zs_object_copy(free_obj, used_obj, class);
> > >  		index++;
> > > +		free_obj |= BIT(HANDLE_PIN_BIT);
> > >  		record_obj(handle, free_obj);
> > 
> > I think record_obj should store free_obj to *handle with the least bit masked off.
> > IOW, how about this?
> > 
> > record_obj(handle, obj)
> > {
> >         *(unsigned long *)handle = obj & ~(1UL << HANDLE_PIN_BIT);
> > }
> 
> [just a wild idea]
> 
> or zs_free() can take spin_lock(&class->lock) earlier; it cannot free the

Earlier? What do you mean? To get the right class, we need a stable
handle, so we can't take the class lock before the handle lock.
If I misunderstand, please elaborate a bit.


> object until the class is locked anyway, and migration is happening with
> the locked class. extending class->lock scope in zs_free() thus should
> not affect the performance. so it'll be either zs_free() touching the
> object or the migration, not both.
> 
> 	-ss
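
(For reference, a sketch of the zs_free() entry path that forces this lock
ordering; the function names follow mm/zsmalloc.c of the time, but the body
is an abbreviated reconstruction, not a verbatim quote:)

void zs_free(struct zs_pool *pool, unsigned long handle)
{
	struct page *first_page, *f_page;
	unsigned long obj, f_objidx;
	int class_idx;
	struct size_class *class;
	enum fullness_group fullness;

	if (unlikely(!handle))
		return;

	/* Handle lock first: obj must be stable before its class is known. */
	pin_tag(handle);
	obj = handle_to_obj(handle);
	obj_to_location(obj, &f_page, &f_objidx);
	first_page = get_first_page(f_page);

	get_zspage_mapping(first_page, &class_idx, &fullness);
	class = pool->size_class[class_idx];

	/* Only now is the class known, so class->lock has to come second. */
	spin_lock(&class->lock);
	/* ... free the object, unlock class, unpin and free the handle ... */
}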

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  2:35   ` Minchan Kim
@ 2016-01-15  5:05     ` Minchan Kim
  -1 siblings, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2016-01-15  5:05 UTC (permalink / raw)
  To: Junil Lee; +Cc: ngupta, sergey.senozhatsky.work, linux-mm, linux-kernel

On Fri, Jan 15, 2016 at 11:35:18AM +0900, Minchan Kim wrote:
> Hi Junil,
> 
> On Fri, Jan 15, 2016 at 09:36:24AM +0900, Junil Lee wrote:
> > To prevent an unlock in the wrong place, tag the new obj in migrate_zspage()
> > so the handle stays locked until the proper unlock path.
> > 
> > The two functions synchronize through a tag that sets the last bit of the
> > obj. However, the handle is unlocked prematurely when the new obj is stored
> > into it before unpin_tag(), the proper unlock path, is called.
> > 
> > The problem, summarized as a call flow:
> > 
> > 		CPU0						CPU1
> > migrate_zspage()
> >   find_alloced_obj()
> >     trypin_tag() -- obj |= HANDLE_PIN_BIT
> >   obj_malloc() -- pin bit of new obj not set		zs_free()
> >   record_obj() -- unlocks, breaking sync		  pin_tag() -- takes the lock
> >   unpin_tag() -- clears the bit zs_free() now holds
> 
> It's a really good catch!
> I think it should be stable material. For that, we should know what kind
> of problem this patch fixes.
> 
> What problem did you see? I mean, please write down the oops you saw and
> verify that the patch fixes your problem. :)
> 
> Minor nit below
> 
> > 
> > Signed-off-by: Junil Lee <junil0814.lee@lge.com>
> > ---
> >  mm/zsmalloc.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index e7414ce..bb459ef 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1635,6 +1635,7 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
> >  		free_obj = obj_malloc(d_page, class, handle);
> >  		zs_object_copy(free_obj, used_obj, class);
> >  		index++;
> > +		free_obj |= BIT(HANDLE_PIN_BIT);
> >  		record_obj(handle, free_obj);
> 
> I think record_obj should store free_obj to *handle with the least bit masked off.
> IOW, how about this?
> 
> record_obj(handle, obj)
> {
>         *(unsigned long *)handle = obj & ~(1UL << HANDLE_PIN_BIT);
> }
> 
> Thanks a lot!

Junil, as you pointed out in private mail, my code was broken.
I just wanted to make the code more robust, but it could add unnecessary
(if minor) overhead in the zsmalloc path, so let's go with your patch.
Please add a comment explaining why we need it.

Thanks!
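
(Why the masked store was broken, as far as the thread lets one reconstruct
it: masking at the store publishes the obj with the pin bit already clear,
i.e. the handle is unlocked by that very store, which is the same premature
unlock the patch is fixing. A sketch of the rejected shape; the comment is
an assumption, not something stated in the thread:)

static void record_obj(unsigned long handle, unsigned long obj)
{
	/*
	 * BROKEN: clearing the bit here unlocks the handle at the store,
	 * so a racing zs_free() can pin and free the handle before the
	 * caller reaches unpin_tag() -- the original race, reintroduced.
	 */
	*(unsigned long *)handle = obj & ~(1UL << HANDLE_PIN_BIT);
}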

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  4:49       ` Minchan Kim
@ 2016-01-15  5:07         ` Sergey Senozhatsky
  -1 siblings, 0 replies; 19+ messages in thread
From: Sergey Senozhatsky @ 2016-01-15  5:07 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Junil Lee, Andrew Morton, ngupta, linux-mm,
	linux-kernel

On (01/15/16 13:49), Minchan Kim wrote:
[..]
> > 
> > or zs_free() can take spin_lock(&class->lock) earlier; it cannot free the
> 
> Earlier? What do you mean? To get the right class, we need a stable
> handle, so we can't take the class lock before the handle lock.
> If I misunderstand, please elaborate a bit.

ohh... you're right. I didn't really check the code when I was writing
this. please forget what I said.

yeah, agreed, record_obj() had better do this.

	-ss

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  5:07         ` Sergey Senozhatsky
@ 2016-01-19 15:47         ` Russell Knize
  2016-01-20  7:00             ` Minchan Kim
  -1 siblings, 1 reply; 19+ messages in thread
From: Russell Knize @ 2016-01-19 15:47 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Junil Lee, Andrew Morton, ngupta, linux-mm, linux-kernel

Just wanted to ack this, as we have been seeing the same problem (weird
race conditions during compaction) and fixed it in the same way a few weeks
ago (resetting the pin bit before recording the obj).

Russ

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-19 15:47         ` Russell Knize
@ 2016-01-20  7:00             ` Minchan Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Minchan Kim @ 2016-01-20  7:00 UTC (permalink / raw)
  To: Russell Knize
  Cc: Sergey Senozhatsky, Junil Lee, Andrew Morton, ngupta, linux-mm,
	linux-kernel

Hello Russ,

On Tue, Jan 19, 2016 at 09:47:12AM -0600, Russell Knize wrote:
>    Just wanted to ack this, as we have been seeing the same problem (weird
>    race conditions during compaction) and fixed it in the same way a few
>    weeks ago (resetting the pin bit before recording the obj).
>    Russ

First of all, thanks for your comment.

The patch you tested has a problem, although it's really subtle (i.e.,
it doesn't do store tearing when I disassemble on ARM{32|64}), but it
could potentially be a problem for other architectures or future ARM.
For the right fix, I sent v5 - https://lkml.org/lkml/2016/1/18/263.
If you can confirm it fixes your problem, please add a Tested-by to the
thread. Testing is really valuable for stable material.

Thanks!
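
(For reference, the v5 linked above reportedly keeps the pin bit set in
free_obj and turns record_obj() into a single non-tearing store; a sketch of
that shape, not a verbatim quote of the patch:)

static void record_obj(unsigned long handle, unsigned long obj)
{
	/*
	 * The lsb of @obj is the handle lock and the other bits hold the
	 * object value, so the update must be a single store: if it tore,
	 * a racing pin_tag() could observe a half-written value.
	 */
	WRITE_ONCE(*(unsigned long *)handle, obj);
}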

* Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-20  7:00             ` Minchan Kim
@ 2016-01-20 15:21               ` Russell Knize
  -1 siblings, 0 replies; 19+ messages in thread
From: Russell Knize @ 2016-01-20 15:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Junil Lee, Andrew Morton, Nitin Gupta,
	linux-mm, linux-kernel

Yes, I saw your v5 and have already started testing it.  I suspect it
will be stable, as the key for us was to set that bit before the
store.  We were only seeing it on ARM32, but those platforms tend to
perform compaction far more often due to memory pressure.  We
don't see it at all anymore.

Honestly, at first I didn't think setting the bit would help that much,
as I assumed it was the barrier in clear_bit_unlock() that
mattered.  Then I saw the same sort of race happening in the page
migration stuff I've been working on.  I had done the same type of
"optimization" there and in fact did not call unpin_tag() at all after
updating the object handles with the bit dropped.

Russ
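
(Putting the pieces together, the migrate_zspage() sequence after the fix,
with the ordering that matters spelled out; the comments are an
interpretation, not part of the patch:)

		free_obj = obj_malloc(d_page, class, handle);
		zs_object_copy(free_obj, used_obj, class);
		index++;
		/*
		 * Publish the new obj with the pin bit still set, so a
		 * racing zs_free() keeps spinning in pin_tag() ...
		 */
		free_obj |= BIT(HANDLE_PIN_BIT);
		record_obj(handle, free_obj);
		/* ... until this release actually hands the lock over. */
		unpin_tag(handle);
		obj_free(pool, class, used_obj);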

On Wed, Jan 20, 2016 at 1:00 AM, Minchan Kim <minchan@kernel.org> wrote:
> Hello Russ,
>
> On Tue, Jan 19, 2016 at 09:47:12AM -0600, Russell Knize wrote:
>>    Just wanted to ack this, as we have been seeing the same problem (weird
>>    race conditions during compaction) and fixed it in the same way a few
>>    weeks ago (resetting the pin bit before recording the obj).
>>    Russ
>
> First of all, thanks for your comment.
>
> The patch you tested has a problem, although it's really subtle (i.e.,
> it doesn't do store tearing when I disassemble on ARM{32|64}), but it
> could potentially be a problem for other architectures or future ARM.
> For the right fix, I sent v5 - https://lkml.org/lkml/2016/1/18/263.
> If you can confirm it fixes your problem, please add a Tested-by to the
> thread. Testing is really valuable for stable material.
>
> Thanks!

Thread overview:
2016-01-15  0:36 [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition Junil Lee
2016-01-15  2:35 ` Minchan Kim
2016-01-15  3:27   ` Sergey Senozhatsky
2016-01-15  3:30     ` Sergey Senozhatsky
2016-01-15  4:49     ` Minchan Kim
2016-01-15  5:07       ` Sergey Senozhatsky
2016-01-19 15:47         ` Russell Knize
2016-01-20  7:00           ` Minchan Kim
2016-01-20 15:21             ` Russell Knize
2016-01-15  5:05   ` Minchan Kim
