linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
@ 2016-01-15  7:39 Junil Lee
  2016-01-15 14:34 ` Minchan Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Junil Lee @ 2016-01-15  7:39 UTC (permalink / raw)
  To: minchan, ngupta
  Cc: sergey.senozhatsky.work, akpm, linux-mm, linux-kernel, Junil Lee

To prevent unlocking in the wrong situation, tag the new obj in
migrate_zspage() so that the handle stays locked until the proper
unlock path.

The two functions synchronize through a tag that sets the last bit of
obj; however, the handle is unlocked prematurely when the new obj is
stored into it before unpin_tag(), the proper unlock path, is called.

The problem is summarized by the call flow below:

		CPU0								CPU1
migrate_zspage
find_alloced_obj()
	trypin_tag() -- obj |= HANDLE_PIN_BIT
obj_malloc() -- new obj is not set			zs_free
record_obj() -- unlock and break sync		pin_tag() -- get lock
unpin_tag()
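The race can be modeled outside the kernel. The sketch below is a simplified userspace rendition of the handle's pin-bit scheme (the names mirror zsmalloc's, but the bodies are stand-ins, not the kernel code); it shows how storing an untagged obj into the handle drops the lock as a side effect:

```c
/*
 * Userspace model of zsmalloc's handle pin bit (a sketch: the names
 * mirror the kernel's, but these are stand-in implementations).
 * Bit 0 of the value stored in the handle (HANDLE_PIN_BIT) doubles
 * as a lock bit, so a plain store of an untagged obj releases the
 * "lock" as a side effect -- the window zs_free() races into.
 */
#include <stdatomic.h>
#include <stdbool.h>

#define HANDLE_PIN_BIT 0UL

/* Single-attempt pin, like !test_and_set_bit_lock() in the kernel. */
static bool trypin_tag(atomic_ulong *handle)
{
	unsigned long old = atomic_load(handle);

	if (old & (1UL << HANDLE_PIN_BIT))
		return false;	/* already pinned */
	return atomic_compare_exchange_strong(handle, &old,
					      old | (1UL << HANDLE_PIN_BIT));
}

static void unpin_tag(atomic_ulong *handle)
{
	atomic_fetch_and(handle, ~(1UL << HANDLE_PIN_BIT));
}

/* The pre-patch migrate path: stores free_obj with the pin bit clear,
 * which unpins the handle prematurely. */
static void record_obj(atomic_ulong *handle, unsigned long free_obj)
{
	atomic_store(handle, free_obj);
}
```

In this model, a second pinner succeeds immediately after record_obj(), even though the migrate path has not yet called unpin_tag().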

Without this patch, the code crashes as below:
	Unable to handle kernel NULL pointer dereference at virtual address 00000000
	CPU: 0 PID: 19001 Comm: CookieMonsterCl Tainted:
	PC is at get_zspage_mapping+0x0/0x24
	LR is at obj_free.isra.22+0x64/0x128
	Call trace:
		[<ffffffc0001a3aa8>] get_zspage_mapping+0x0/0x24
		[<ffffffc0001a4918>] zs_free+0x88/0x114
		[<ffffffc00053ae54>] zram_free_page+0x64/0xcc
		[<ffffffc00053af4c>] zram_slot_free_notify+0x90/0x108
		[<ffffffc000196638>] swap_entry_free+0x278/0x294
		[<ffffffc000199008>] free_swap_and_cache+0x38/0x11c
		[<ffffffc0001837ac>] unmap_single_vma+0x480/0x5c8
		[<ffffffc000184350>] unmap_vmas+0x44/0x60
		[<ffffffc00018a53c>] exit_mmap+0x50/0x110
		[<ffffffc00009e408>] mmput+0x58/0xe0
		[<ffffffc0000a2854>] do_exit+0x320/0x8dc
		[<ffffffc0000a3cb4>] do_group_exit+0x44/0xa8
		[<ffffffc0000ae1bc>] get_signal+0x538/0x580
		[<ffffffc000087e44>] do_signal+0x98/0x4b8
		[<ffffffc00008843c>] do_notify_resume+0x14/0x5c

For testing, the obj value was printed after pin_tag() in zs_free().
Sometimes obj was an even number, which means synchronization was broken.

With the patch applied, the crash no longer occurs and obj is always an
odd number in the same situation.

Signed-off-by: Junil Lee <junil0814.lee@lge.com>
---
 mm/zsmalloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index e7414ce..a24ccb1 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1635,6 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 		free_obj = obj_malloc(d_page, class, handle);
 		zs_object_copy(free_obj, used_obj, class);
 		index++;
+		/* Must not unlock before unpin_tag() */
+		free_obj |= BIT(HANDLE_PIN_BIT);
 		record_obj(handle, free_obj);
 		unpin_tag(handle);
 		obj_free(pool, class, used_obj);
-- 
2.6.2


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15  7:39 [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition Junil Lee
@ 2016-01-15 14:34 ` Minchan Kim
  2016-01-15 15:49   ` Vlastimil Babka
  0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2016-01-15 14:34 UTC (permalink / raw)
  To: Junil Lee; +Cc: ngupta, sergey.senozhatsky.work, akpm, linux-mm, linux-kernel

On Fri, Jan 15, 2016 at 04:39:11PM +0900, Junil Lee wrote:
> To prevent unlock at the not correct situation, tagging the new obj to
> assure lock in migrate_zspage() before right unlock path.
> 
> Two functions are in race condition by tag which set 1 on last bit of
> obj, however unlock succrently when update new obj to handle before call
> unpin_tag() which is right unlock path.
> 
> summarize this problem by call flow as below:
> 
> 		CPU0								CPU1
> migrate_zspage
> find_alloced_obj()
> 	trypin_tag() -- obj |= HANDLE_PIN_BIT
> obj_malloc() -- new obj is not set			zs_free
> record_obj() -- unlock and break sync		pin_tag() -- get lock
> unpin_tag()
> 
> Before code make crash as below:
> 	Unable to handle kernel NULL pointer dereference at virtual address 00000000
> 	CPU: 0 PID: 19001 Comm: CookieMonsterCl Tainted:
> 	PC is at get_zspage_mapping+0x0/0x24
> 	LR is at obj_free.isra.22+0x64/0x128
> 	Call trace:
> 		[<ffffffc0001a3aa8>] get_zspage_mapping+0x0/0x24
> 		[<ffffffc0001a4918>] zs_free+0x88/0x114
> 		[<ffffffc00053ae54>] zram_free_page+0x64/0xcc
> 		[<ffffffc00053af4c>] zram_slot_free_notify+0x90/0x108
> 		[<ffffffc000196638>] swap_entry_free+0x278/0x294
> 		[<ffffffc000199008>] free_swap_and_cache+0x38/0x11c
> 		[<ffffffc0001837ac>] unmap_single_vma+0x480/0x5c8
> 		[<ffffffc000184350>] unmap_vmas+0x44/0x60
> 		[<ffffffc00018a53c>] exit_mmap+0x50/0x110
> 		[<ffffffc00009e408>] mmput+0x58/0xe0
> 		[<ffffffc0000a2854>] do_exit+0x320/0x8dc
> 		[<ffffffc0000a3cb4>] do_group_exit+0x44/0xa8
> 		[<ffffffc0000ae1bc>] get_signal+0x538/0x580
> 		[<ffffffc000087e44>] do_signal+0x98/0x4b8
> 		[<ffffffc00008843c>] do_notify_resume+0x14/0x5c
> 
> and for test, print obj value after pin_tag() in zs_free().
> Sometimes obj is even number means break synchronization.
> 
> After patched, crash is not occurred and obj is only odd number in same
> situation.

If you have verified that it solves your problem, we should mark this
patch for stable.

> 
> Signed-off-by: Junil Lee <junil0814.lee@lge.com>

Acked-by: Minchan Kim <minchan@kernel.org>

Below comment.

> ---
>  mm/zsmalloc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index e7414ce..a24ccb1 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1635,6 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>  		free_obj = obj_malloc(d_page, class, handle);
>  		zs_object_copy(free_obj, used_obj, class);
>  		index++;
> +		/* Must not unlock before unpin_tag() */

I want to make the comment clearer.

/*
 * record_obj() updates the handle's value to free_obj, which would
 * invalidate the lock bit (i.e., HANDLE_PIN_BIT) of the handle and
 * break synchronization based on pin_tag() (e.g., in zs_free()),
 * so let's keep the lock bit set.
 */

Thanks.

> +		free_obj |= BIT(HANDLE_PIN_BIT);
>  		record_obj(handle, free_obj);
>  		unpin_tag(handle);
>  		obj_free(pool, class, used_obj);
> -- 
> 2.6.2
> 

-- 
Kind regards,
Minchan Kim


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15 14:34 ` Minchan Kim
@ 2016-01-15 15:49   ` Vlastimil Babka
  2016-01-16  4:09     ` Sergey Senozhatsky
  0 siblings, 1 reply; 11+ messages in thread
From: Vlastimil Babka @ 2016-01-15 15:49 UTC (permalink / raw)
  To: Minchan Kim, Junil Lee
  Cc: ngupta, sergey.senozhatsky.work, akpm, linux-mm, linux-kernel

On 01/15/2016 03:34 PM, Minchan Kim wrote:
> On Fri, Jan 15, 2016 at 04:39:11PM +0900, Junil Lee wrote:
>>
>> Signed-off-by: Junil Lee <junil0814.lee@lge.com>
>
> Acked-by: Minchan Kim <minchan@kernel.org>
>
> Below comment.
>
>> ---
>>   mm/zsmalloc.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
>> index e7414ce..a24ccb1 100644
>> --- a/mm/zsmalloc.c
>> +++ b/mm/zsmalloc.c
>> @@ -1635,6 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>>   		free_obj = obj_malloc(d_page, class, handle);
>>   		zs_object_copy(free_obj, used_obj, class);
>>   		index++;
>> +		/* Must not unlock before unpin_tag() */
>
> I want to make comment more clear.
>
> /*
>   * record_obj updates handle's value to free_obj and it will invalidate
>   * lock bit(ie, HANDLE_PIN_BIT) of handle, which breaks synchronization
>   * using pin_tag(e,g, zs_free) so let's keep the lock bit.
>   */
>
> Thanks.

Could you please also help making the changelog more clear?

>
>> +		free_obj |= BIT(HANDLE_PIN_BIT);
>>   		record_obj(handle, free_obj);

I think record_obj() should use WRITE_ONCE() or something like that.
Otherwise the compiler is IMHO allowed to reorder this, i.e. first to 
assign free_obj to handle, and then add the PIN bit there.
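For reference, a minimal userspace rendition of such a macro (a simplification; the kernel's version in include/linux/compiler.h dispatches on the access size):

```c
/*
 * Userspace sketch of WRITE_ONCE/READ_ONCE (simplified: the kernel's
 * versions handle 1/2/4/8-byte accesses explicitly).  The volatile
 * cast forces a single, untorn access that the compiler may neither
 * split nor fuse with neighbouring plain accesses, and it may not
 * reorder the bit-set into a separate later store.
 */
#define WRITE_ONCE(x, val)  (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)        (*(volatile __typeof__(x) *)&(x))

/* record_obj() done as one non-torn store; the caller must already
 * have set the pin bit in obj. */
static void record_obj(unsigned long *handle, unsigned long obj)
{
	WRITE_ONCE(*handle, obj);
}
```

Note that the volatile store constrains the compiler only; it is not a CPU memory barrier.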

>>   		unpin_tag(handle);
>>   		obj_free(pool, class, used_obj);
>> --
>> 2.6.2
>>
>


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-15 15:49   ` Vlastimil Babka
@ 2016-01-16  4:09     ` Sergey Senozhatsky
  2016-01-16  7:44       ` Vlastimil Babka
  0 siblings, 1 reply; 11+ messages in thread
From: Sergey Senozhatsky @ 2016-01-16  4:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Minchan Kim, Junil Lee, ngupta, sergey.senozhatsky.work, akpm,
	linux-mm, linux-kernel, sergey.senozhatsky

On (01/15/16 16:49), Vlastimil Babka wrote:
[..]
> 
> Could you please also help making the changelog more clear?
> 
> >
> >>+		free_obj |= BIT(HANDLE_PIN_BIT);
> >>  		record_obj(handle, free_obj);
> 
> I think record_obj() should use WRITE_ONCE() or something like that.
> Otherwise the compiler is IMHO allowed to reorder this, i.e. first to assign
> free_obj to handle, and then add the PIN bit there.

good note.

... or do both things in record_obj() (per Minchan)

	record_obj(handle, obj)
	{
	        *(unsigned long *)handle = obj & ~(1<<HANDLE_PIN_BIT);
	}

	-ss


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-16  4:09     ` Sergey Senozhatsky
@ 2016-01-16  7:44       ` Vlastimil Babka
  2016-01-16  8:06         ` Sergey Senozhatsky
  0 siblings, 1 reply; 11+ messages in thread
From: Vlastimil Babka @ 2016-01-16  7:44 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Junil Lee, ngupta, sergey.senozhatsky.work, akpm,
	linux-mm, linux-kernel

On 16.1.2016 5:09, Sergey Senozhatsky wrote:
> On (01/15/16 16:49), Vlastimil Babka wrote:
> [..]
>>
>> Could you please also help making the changelog more clear?
>>
>>>
>>>> +		free_obj |= BIT(HANDLE_PIN_BIT);
>>>>  		record_obj(handle, free_obj);
>>
>> I think record_obj() should use WRITE_ONCE() or something like that.
>> Otherwise the compiler is IMHO allowed to reorder this, i.e. first to assign
>> free_obj to handle, and then add the PIN bit there.
> 
> good note.
> 
> ... or do both things in record_obj() (per Minchan)
> 
> 	record_obj(handle, obj)
> 	{
> 	        *(unsigned long)handle = obj & ~(1<<HANDLE_PIN_BIT);

Hmm but that's an unpin, not a pin? A mistake or I'm missing something?
Anyway the compiler can do the same thing here without a WRITE_ONCE().

> 	}
> 
> 	-ss
> 
> 


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-16  7:44       ` Vlastimil Babka
@ 2016-01-16  8:06         ` Sergey Senozhatsky
  2016-01-16  8:16           ` Vlastimil Babka
  0 siblings, 1 reply; 11+ messages in thread
From: Sergey Senozhatsky @ 2016-01-16  8:06 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sergey Senozhatsky, Minchan Kim, Junil Lee, ngupta,
	sergey.senozhatsky.work, akpm, linux-mm, linux-kernel

On (01/16/16 08:44), Vlastimil Babka wrote:
> On 16.1.2016 5:09, Sergey Senozhatsky wrote:
> > On (01/15/16 16:49), Vlastimil Babka wrote:
> > [..]
> >>
> >> Could you please also help making the changelog more clear?
> >>
> >>>
> >>>> +		free_obj |= BIT(HANDLE_PIN_BIT);
> >>>>  		record_obj(handle, free_obj);
> >>
> >> I think record_obj() should use WRITE_ONCE() or something like that.
> >> Otherwise the compiler is IMHO allowed to reorder this, i.e. first to assign
> >> free_obj to handle, and then add the PIN bit there.
> > 
> > good note.
> > 
> > ... or do both things in record_obj() (per Minchan)
> > 
> > 	record_obj(handle, obj)
> > 	{
> > 	        *(unsigned long)handle = obj & ~(1<<HANDLE_PIN_BIT);
> 
> Hmm but that's an unpin, not a pin? A mistake or I'm missing something?

I'm sure it's just a compose-in-mail-app typo.

	-ss

> Anyway the compiler can do the same thing here without a WRITE_ONCE().


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-16  8:06         ` Sergey Senozhatsky
@ 2016-01-16  8:16           ` Vlastimil Babka
  2016-01-16 10:05             ` Sergey Senozhatsky
  2016-01-18  6:32             ` Minchan Kim
  0 siblings, 2 replies; 11+ messages in thread
From: Vlastimil Babka @ 2016-01-16  8:16 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Minchan Kim, Junil Lee, ngupta, sergey.senozhatsky.work, akpm,
	linux-mm, linux-kernel

On 16.1.2016 9:06, Sergey Senozhatsky wrote:
> On (01/16/16 08:44), Vlastimil Babka wrote:
>> On 16.1.2016 5:09, Sergey Senozhatsky wrote:
>>> On (01/15/16 16:49), Vlastimil Babka wrote:
>>
>> Hmm but that's an unpin, not a pin? A mistake or I'm missing something?
> 
> I'm sure it's just a compose-in-mail-app typo.

BTW, couldn't the correct fix also just look like this?

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9f15bdd9163c..43f743175ede 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1635,8 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
                free_obj = obj_malloc(d_page, class, handle);
                zs_object_copy(free_obj, used_obj, class);
                index++;
+               /* This also effectively unpins the handle */
                record_obj(handle, free_obj);
-               unpin_tag(handle);
                obj_free(pool, class, used_obj);
        }

But I'd still recommend WRITE_ONCE in record_obj(). And I'm not even sure it's
safe on all architectures to do a simple overwrite of a word against somebody
else trying to lock a bit there?

> 	-ss
> 
>> Anyway the compiler can do the same thing here without a WRITE_ONCE().
> 
> 


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-16  8:16           ` Vlastimil Babka
@ 2016-01-16 10:05             ` Sergey Senozhatsky
  2016-01-18  1:02               ` Junil Lee
  2016-01-18  6:32             ` Minchan Kim
  1 sibling, 1 reply; 11+ messages in thread
From: Sergey Senozhatsky @ 2016-01-16 10:05 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sergey Senozhatsky, Minchan Kim, Junil Lee, ngupta,
	sergey.senozhatsky.work, akpm, linux-mm, linux-kernel

On (01/16/16 09:16), Vlastimil Babka wrote:
[..]
> BTW, couldn't the correct fix also just look like this?
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9f15bdd9163c..43f743175ede 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1635,8 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>                 free_obj = obj_malloc(d_page, class, handle);
>                 zs_object_copy(free_obj, used_obj, class);
>                 index++;
> +               /* This also effectively unpins the handle */
>                 record_obj(handle, free_obj);
> -               unpin_tag(handle);
>                 obj_free(pool, class, used_obj);
>         }

I think this will work.


> But I'd still recommend WRITE_ONCE in record_obj(). And I'm not even sure it's
> safe on all architectures to do a simple overwrite of a word against somebody
> else trying to lock a bit there?

hm... for example, generic bitops from include/asm-generic/bitops/atomic.h
use _atomic_spin_lock_irqsave()

 #define test_and_set_bit_lock(nr, addr)  test_and_set_bit(nr, addr)

 static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
 {
         unsigned long mask = BIT_MASK(nr);
         unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
         unsigned long old;
         unsigned long flags;

         _atomic_spin_lock_irqsave(p, flags);
         old = *p;
         *p = old | mask;
         _atomic_spin_unlock_irqrestore(p, flags);

         return (old & mask) != 0;
 }

so overwriting it from the outside world (w/o taking _atomic_spin_lock_irqsave(p))
can theoretically be tricky in some cases.

	-ss
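To make the concern concrete, the generic implementation quoted above can be sketched in userspace, with a pthread mutex standing in for _atomic_spin_lock_irqsave() (an assumed simplification):

```c
/*
 * Userspace model of the generic test_and_set_bit() quoted above,
 * with a pthread mutex standing in for _atomic_spin_lock_irqsave().
 * The read-modify-write is atomic only against other callers that
 * take the same lock; a plain word store that bypasses the lock --
 * which is what record_obj() does -- can land between the load and
 * the store below and be overwritten.
 */
#include <pthread.h>

#define BIT_MASK(nr)	(1UL << (nr))

static pthread_mutex_t bit_lock = PTHREAD_MUTEX_INITIALIZER;

static int test_and_set_bit(int nr, unsigned long *addr)
{
	unsigned long mask = BIT_MASK(nr);
	unsigned long old;

	pthread_mutex_lock(&bit_lock);
	old = *addr;		/* a lockless store can slip in here... */
	*addr = old | mask;	/* ...and be lost when this line runs */
	pthread_mutex_unlock(&bit_lock);

	return (old & mask) != 0;
}
```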


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-16 10:05             ` Sergey Senozhatsky
@ 2016-01-18  1:02               ` Junil Lee
  0 siblings, 0 replies; 11+ messages in thread
From: Junil Lee @ 2016-01-18  1:02 UTC (permalink / raw)
  To: Sergey Senozhatsky, Vlastimil Babka
  Cc: Minchan Kim, ngupta, sergey.senozhatsky.work, akpm, linux-mm,
	linux-kernel



On 2016-01-16 7:05 PM, Sergey Senozhatsky wrote:
> On (01/16/16 09:16), Vlastimil Babka wrote:
> [..]
> > BTW, couldn't the correct fix also just look like this?
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 9f15bdd9163c..43f743175ede 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1635,8 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
> >                 free_obj = obj_malloc(d_page, class, handle);
> >                 zs_object_copy(free_obj, used_obj, class);
> >                 index++;
> > +               /* This also effectively unpins the handle */
> >                 record_obj(handle, free_obj);
> > -               unpin_tag(handle);
> >                 obj_free(pool, class, used_obj);
> >         }
>
> I think this will work.
>
I agree.
And I tested the previous patch as sent; the problem has not occurred
in the last two days.

I will resend v3 as Babka suggested.

Thanks.
>
> > But I'd still recommend WRITE_ONCE in record_obj(). And I'm not even sure it's
> > safe on all architectures to do a simple overwrite of a word against somebody
> > else trying to lock a bit there?
>
> hm... for example, generic bitops from include/asm-generic/bitops/atomic.h
> use _atomic_spin_lock_irqsave()
>
>  #define test_and_set_bit_lock(nr, addr)  test_and_set_bit(nr, addr)
>
>  static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
>  {
>          unsigned long mask = BIT_MASK(nr);
>          unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
>          unsigned long old;
>          unsigned long flags;
>
>          _atomic_spin_lock_irqsave(p, flags);
>          old = *p;
>          *p = old | mask;
>          _atomic_spin_unlock_irqrestore(p, flags);
>
>          return (old & mask) != 0;
>  }
>
> so overwriting it from the outside world (w/o taking _atomic_spin_lock_irqsave(p))
> can theoretically be tricky in some cases.
>
> 	-ss


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-16  8:16           ` Vlastimil Babka
  2016-01-16 10:05             ` Sergey Senozhatsky
@ 2016-01-18  6:32             ` Minchan Kim
  2016-01-18  7:04               ` Sergey Senozhatsky
  1 sibling, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2016-01-18  6:32 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sergey Senozhatsky, Junil Lee, ngupta, sergey.senozhatsky.work,
	akpm, linux-mm, linux-kernel

Hello, Vlastimil

On Sat, Jan 16, 2016 at 09:16:41AM +0100, Vlastimil Babka wrote:
> On 16.1.2016 9:06, Sergey Senozhatsky wrote:
> > On (01/16/16 08:44), Vlastimil Babka wrote:
> >> On 16.1.2016 5:09, Sergey Senozhatsky wrote:
> >>> On (01/15/16 16:49), Vlastimil Babka wrote:
> >>
> >> Hmm but that's an unpin, not a pin? A mistake or I'm missing something?
> > 
> > I'm sure it's just a compose-in-mail-app typo.
> 
> BTW, couldn't the correct fix also just look like this?
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9f15bdd9163c..43f743175ede 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1635,8 +1635,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
>                 free_obj = obj_malloc(d_page, class, handle);
>                 zs_object_copy(free_obj, used_obj, class);
>                 index++;
> +               /* This also effectively unpins the handle */
>                 record_obj(handle, free_obj);
> -               unpin_tag(handle);
>                 obj_free(pool, class, used_obj);
>         }
> 
> But I'd still recommend WRITE_ONCE in record_obj(). And I'm not even sure it's

Thanks for the review. Yeah, we need WRITE_ONCE in record_obj(), but
your version will not work. IMHO, WRITE_ONCE can prevent store-tearing,
but it cannot prevent reordering. IOW, we need some barrier, such as
the one that unlock paths like clear_bit_unlock() include.
So we shouldn't omit unpin_tag() there.

> safe on all architectures to do a simple overwrite of a word against somebody
> else trying to lock a bit there?

Hmm, I think it shouldn't be a problem. It's a word-aligned, word-sized
store, so it should be atomic.

As another example, we have been using lock_page() for a bit of
page->flags while using the other bits there with __set_bit() (i.e.,
__SetPageXXX). I guess it's the same situation as ours, except that we
are spinning there. But it is worth double-checking, so we need help
from the locking folks.
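In C11-atomics terms, the barrier argument can be sketched as follows (a hypothetical userspace model, not the kernel code: the name record_obj_pinned and the memory_order annotations are illustrative). The new obj is stored with the pin bit kept, so the handle stays locked, and the later unpin is a store-release, the analogue of clear_bit_unlock():

```c
/*
 * Sketch of the ordering argument (hypothetical model, not kernel
 * code).  The new obj is stored with the pin bit still set, keeping
 * the handle locked; the unpin is a store-release, so earlier stores
 * (the object copy) cannot be reordered past the unlock.
 */
#include <stdatomic.h>

#define HANDLE_PIN_BIT 0UL

static void record_obj_pinned(atomic_ulong *handle, unsigned long free_obj)
{
	atomic_store_explicit(handle, free_obj | (1UL << HANDLE_PIN_BIT),
			      memory_order_relaxed);
}

static void unpin_tag(atomic_ulong *handle)
{
	/* release: orders all earlier stores before the bit clear */
	atomic_fetch_and_explicit(handle, ~(1UL << HANDLE_PIN_BIT),
				  memory_order_release);
}
```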

> 
> > 	-ss
> > 
> >> Anyway the compiler can do the same thing here without a WRITE_ONCE().
> > 
> > 
> 


* Re: [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition
  2016-01-18  6:32             ` Minchan Kim
@ 2016-01-18  7:04               ` Sergey Senozhatsky
  0 siblings, 0 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2016-01-18  7:04 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Vlastimil Babka, Sergey Senozhatsky, Junil Lee, ngupta,
	sergey.senozhatsky.work, akpm, linux-mm, linux-kernel

Hello,

oh, you replied in this thread.

On (01/18/16 15:32), Minchan Kim wrote:
> >                 free_obj = obj_malloc(d_page, class, handle);
> >                 zs_object_copy(free_obj, used_obj, class);
> >                 index++;
> > +               /* This also effectively unpins the handle */
> >                 record_obj(handle, free_obj);
> > -               unpin_tag(handle);
> >                 obj_free(pool, class, used_obj);
> >         }
> > 
> > But I'd still recommend WRITE_ONCE in record_obj(). And I'm not even sure it's
> 
> Thanks for the reivew. Yeah, we need WRITE_ONCE in record_obj but
> your version will not work. IMHO, WRITE_ONCE can prevent store-tearing
> but it couldn't prevent reordering. IOW, we need some barrier as unlock
> and clear_bit_unlock includes it.
> So, we shouldn't omit unpin_tag there.

but there is only one store operation after this patch.

static void record_obj(unsigned long handle, unsigned long obj)
{
	*(unsigned long *)handle = obj;
}

does the reordering problem still exist? zs_free() will see the
old pinned handle and spin until record_obj() from the migrate path.

	-ss

> > safe on all architectures to do a simple overwrite of a word against somebody
> > else trying to lock a bit there?
> 
> Hmm, I think it shouldn't be a problem. It's word-alinged, word-sized
> store so it should be atomic.
> 
> As other example, we have been used lock_page for a bit of page->flags
> and used other bits in there with __set_bit(ie, __SetPageXXX).
> I guess it's same situation with us just except we are spinning there.
> But it is worth to dobule check so need to help lock guys.


Thread overview: 11+ messages
2016-01-15  7:39 [PATCH v2] zsmalloc: fix migrate_zspage-zs_free race condition Junil Lee
2016-01-15 14:34 ` Minchan Kim
2016-01-15 15:49   ` Vlastimil Babka
2016-01-16  4:09     ` Sergey Senozhatsky
2016-01-16  7:44       ` Vlastimil Babka
2016-01-16  8:06         ` Sergey Senozhatsky
2016-01-16  8:16           ` Vlastimil Babka
2016-01-16 10:05             ` Sergey Senozhatsky
2016-01-18  1:02               ` Junil Lee
2016-01-18  6:32             ` Minchan Kim
2016-01-18  7:04               ` Sergey Senozhatsky
