* [PATCH] zram: fix possible race when checking idle_strm
@ 2015-08-07  8:03 Joonsoo Kim
  2015-08-07  9:14 ` Sergey Senozhatsky
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Joonsoo Kim @ 2015-08-07  8:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Nitin Gupta, Sergey Senozhatsky, linux-kernel, Joonsoo Kim

Currently, when we enter the wait state due to a lack of idle streams,
we check the idle_strm list without holding the lock, in the expansion of
the wait_event() macro. In this case, someone can see a stale value and the
process could fall into the wait state without any upcoming wakeup.
Although I didn't see any error related to this race, it should be fixed.
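
For reference, here is roughly what the wait_event() call below expands
to (a simplified sketch of include/linux/wait.h, not the literal macro;
wait_for_idle_strm() is just a name for the sketch). Note that every
evaluation of the condition is a plain, lockless read of idle_strm:

	static void wait_for_idle_strm(struct zcomp_strm_multi *zs)
	{
		DEFINE_WAIT(__wait);

		if (!list_empty(&zs->idle_strm))	/* lockless check */
			return;
		for (;;) {
			prepare_to_wait_event(&zs->strm_wait, &__wait,
					      TASK_UNINTERRUPTIBLE);
			if (!list_empty(&zs->idle_strm)) /* lockless re-check */
				break;
			schedule();
		}
		finish_wait(&zs->strm_wait, &__wait);
	}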

To fix it, we should check idle_strm while holding the lock; this
patch implements that.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 drivers/block/zram/zcomp.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 80a62e7..837e9c3 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -76,6 +76,17 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 	return zstrm;
 }
 
+static bool avail_idle_strm(struct zcomp_strm_multi *zs)
+{
+	int avail;
+
+	spin_lock(&zs->strm_lock);
+	avail = !list_empty(&zs->idle_strm);
+	spin_unlock(&zs->strm_lock);
+
+	return avail;
+}
+
 /*
  * get idle zcomp_strm or wait until other process release
  * (zcomp_strm_release()) one for us
@@ -97,7 +108,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
 		/* zstrm streams limit reached, wait for idle stream */
 		if (zs->avail_strm >= zs->max_strm) {
 			spin_unlock(&zs->strm_lock);
-			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
+			wait_event(zs->strm_wait, avail_idle_strm(zs));
 			continue;
 		}
 		/* allocate new zstrm stream */
@@ -109,7 +120,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
 			spin_lock(&zs->strm_lock);
 			zs->avail_strm--;
 			spin_unlock(&zs->strm_lock);
-			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
+			wait_event(zs->strm_wait, avail_idle_strm(zs));
 			continue;
 		}
 		break;
-- 
1.9.1



* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  8:03 [PATCH] zram: fix possible race when checking idle_strm Joonsoo Kim
@ 2015-08-07  9:14 ` Sergey Senozhatsky
  2015-08-07  9:19   ` Sergey Senozhatsky
  2015-08-07  9:58   ` Sergey Senozhatsky
  2015-08-07  9:37 ` Sergey Senozhatsky
  2015-08-07 14:49 ` Minchan Kim
  2 siblings, 2 replies; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07  9:14 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, Sergey Senozhatsky,
	linux-kernel, Joonsoo Kim

On (08/07/15 17:03), Joonsoo Kim wrote:
> Currently, when we enter the wait state due to a lack of idle streams,
> we check the idle_strm list without holding the lock, in the expansion of
> the wait_event() macro. In this case, someone can see a stale value and the
> process could fall into the wait state without any upcoming wakeup.

hm... I need to think about it more.

we do wake_up every time we put stream back to the list

zcomp_strm_multi_release():

        spin_lock(&zs->strm_lock);
        if (zs->avail_strm <= zs->max_strm) {
                list_add(&zstrm->list, &zs->idle_strm);
                spin_unlock(&zs->strm_lock);
                wake_up(&zs->strm_wait);
                return;
        }


but I can probably see what you mean... in some very extreme case,
though. I can't even formulate it... eh... we use a multi stream
backend with ->max_strm == 1 and there are two processes, one
just falsely passed the wait_event() `if (condition)' check, the
other one just put stream back to ->idle_strm and called wake_up(),
but the first process hasn't yet executed prepare_to_wait_event()
so it might miss a wakeup. and there should be no other process
doing read or write operation. otherwise, there will be wakeup
eventually.

is this the case you were thinking of?... then yes, this spinlock
may help.
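
to illustrate (a hypothetical timeline, assuming ->max_strm == 1; this
is just my reading of the code):

    CPU0: zcomp_strm_multi_find()       CPU1: zcomp_strm_multi_release()
    -------------------------------     --------------------------------
    wait_event():
      if (!list_empty(&zs->idle_strm))
        -> false, CPU0 not queued yet
                                        list_add(..., &zs->idle_strm);
                                        wake_up(&zs->strm_wait);
                                          -> wait queue is still empty
      prepare_to_wait_event(...);
      if (!list_empty(&zs->idle_strm))
        -> must observe the list_add
           here, or CPU0 sleeps forever
      schedule();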

hm... we also can simply forbid downgrading a multi stream backend
to a single stream (in terms of performance it is much slower
than a real single stream anyway). will this do the trick? if we
have more than 1 stream and idle list then there will be more that
one wakeup. and every woken up task will call wakeup once it put
stream.

I don't think I'll be able to check the email until this Sunday,
so here is my

Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

to this patch if Minchan decides that `forbid downgrading` is
hacky or doesn't work.

very nice finding, Joonsoo!


p.s. hm... we can take a look at 'forbid downgrading a multi
stream backend to a single-streamed one'.

	-ss


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  9:14 ` Sergey Senozhatsky
@ 2015-08-07  9:19   ` Sergey Senozhatsky
  2015-08-07  9:38     ` Sergey Senozhatsky
  2015-08-07  9:58   ` Sergey Senozhatsky
  1 sibling, 1 reply; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07  9:19 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Joonsoo Kim, Andrew Morton, Minchan Kim, Nitin Gupta,
	linux-kernel, Joonsoo Kim

On (08/07/15 18:14), Sergey Senozhatsky wrote:
[..]
> hm... we also can simply forbid downgrading a multi stream backend
> to a single stream (in terms of performance it is much slower
> than a real single stream anyway). will this do the trick? if we
> have more than 1 stream and idle list then there will be more that
> one wakeup. and every woken up task will call wakeup once it put
> stream.

this is unreadable, sorry.

should be:
"if we have more than 1 stream and _empty_ idle list then there
will be more than one wakeup. and every woken up task will call
wakeup once it puts stream back to the idle list."

	-ss


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  8:03 [PATCH] zram: fix possible race when checking idle_strm Joonsoo Kim
  2015-08-07  9:14 ` Sergey Senozhatsky
@ 2015-08-07  9:37 ` Sergey Senozhatsky
  2015-08-07 10:16   ` Sergey Senozhatsky
  2015-08-07 14:49 ` Minchan Kim
  2 siblings, 1 reply; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07  9:37 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, Sergey Senozhatsky,
	linux-kernel, Joonsoo Kim

On (08/07/15 17:03), Joonsoo Kim wrote:
> Currently, when we enter the wait state due to a lack of idle streams,
> we check the idle_strm list without holding the lock, in the expansion of
> the wait_event() macro. In this case, someone can see a stale value and the
> process could fall into the wait state without any upcoming wakeup.
> Although I didn't see any error related to this race, it should be fixed.
> 
> To fix it, we should check idle_strm while holding the lock; this
> patch implements that.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

> ---
>  drivers/block/zram/zcomp.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 80a62e7..837e9c3 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -76,6 +76,17 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
>  	return zstrm;
>  }
>  
> +static bool avail_idle_strm(struct zcomp_strm_multi *zs)
> +{
> +	int avail;
> +
> +	spin_lock(&zs->strm_lock);
> +	avail = !list_empty(&zs->idle_strm);
> +	spin_unlock(&zs->strm_lock);
> +
> +	return avail;
> +}
> +
>  /*
>   * get idle zcomp_strm or wait until other process release
>   * (zcomp_strm_release()) one for us
> @@ -97,7 +108,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
>  		/* zstrm streams limit reached, wait for idle stream */
>  		if (zs->avail_strm >= zs->max_strm) {
>  			spin_unlock(&zs->strm_lock);
> -			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> +			wait_event(zs->strm_wait, avail_idle_strm(zs));
>  			continue;
>  		}
>  		/* allocate new zstrm stream */
> @@ -109,7 +120,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
>  			spin_lock(&zs->strm_lock);
>  			zs->avail_strm--;
>  			spin_unlock(&zs->strm_lock);
> -			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> +			wait_event(zs->strm_wait, avail_idle_strm(zs));
>  			continue;
>  		}
>  		break;
> -- 


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  9:19   ` Sergey Senozhatsky
@ 2015-08-07  9:38     ` Sergey Senozhatsky
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07  9:38 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Joonsoo Kim, Andrew Morton, Minchan Kim, Nitin Gupta,
	linux-kernel, Joonsoo Kim

On (08/07/15 18:19), Sergey Senozhatsky wrote:
> [..]
> > hm... we also can simply forbid downgrading a multi stream backend
> > to a single stream (in terms of performance it is much slower
> > than a real single stream anyway). will this do the trick? if we
> > have more than 1 stream and idle list then there will be more that
> > one wakeup. and every woken up task will call wakeup once it put
> > stream.
> 
> this is unreadable, sorry.
> 
> should be:
> "if we have more than 1 stream and _empty_ idle list then there
> will be more than one wakeup. and every woken up task will call
> wakeup once it puts stream back to the idle list."

no, nevermind. I think it sucks.

	-ss


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  9:14 ` Sergey Senozhatsky
  2015-08-07  9:19   ` Sergey Senozhatsky
@ 2015-08-07  9:58   ` Sergey Senozhatsky
  2015-08-10  0:32     ` Joonsoo Kim
  1 sibling, 1 reply; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07  9:58 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Joonsoo Kim, Andrew Morton, Minchan Kim, Nitin Gupta,
	linux-kernel, Joonsoo Kim

On (08/07/15 18:14), Sergey Senozhatsky wrote:
> hm... I need to think about it more.
> 
> we do wake_up every time we put stream back to the list
> 
> zcomp_strm_multi_release():
> 
>         spin_lock(&zs->strm_lock);
>         if (zs->avail_strm <= zs->max_strm) {
>                 list_add(&zstrm->list, &zs->idle_strm);
>                 spin_unlock(&zs->strm_lock);
>                 wake_up(&zs->strm_wait);
>                 return;
>         }
> 
> 
> but I can probably see what you mean... in some very extreme case,
> though. I can't even formulate it... eh... we use a multi stream
> backend with ->max_strm == 1 and there are two processes, one
> just falsely passed the wait_event() `if (condition)' check, the
> other one just put stream back to ->idle_strm and called wake_up(),
> but the first process hasn't yet executed prepare_to_wait_event()
> so it might miss a wakeup. and there should be no other process
> doing read or write operation. otherwise, there will be wakeup
> eventually.
> 
> is this the case you were thinking of?... then yes, this spinlock
> may help.
> 

on the other hand... it's actually

	wait_event() is

	if (condition)
		break;
	prepare_to_wait_event(&wq, &__wait, state)
	if (condition)
		break;
	schedule();

if first condition check was false and we missed a wakeup call between
first condition and prepare_to_wait_event(), then second condition
check should do the trick I think (or you expect that second condition
check may be wrongly pre-fetched or something).
if the wakeup arrives after prepare_to_wait_event(), then we are fine by
definition.

so, I'm puzzled a bit. do we have a problem or are we ok?

	-ss


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  9:37 ` Sergey Senozhatsky
@ 2015-08-07 10:16   ` Sergey Senozhatsky
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-07 10:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joonsoo Kim, Sergey Senozhatsky, Minchan Kim, Nitin Gupta,
	linux-kernel, Joonsoo Kim

On (08/07/15 18:37), Sergey Senozhatsky wrote:
[..]
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Andrew, my apologies, can we hold on a bit with my Acked-by? I
need more time to think about the issue -- do we actually have
one or not.

It seems that, to be on the safe side, this check is better done
under the lock, but at the same time I sort of fail to convince
myself that doing this check outside the lock introduces any danger.
Need some rest. Maybe Joonsoo and Minchan can add some bits
(that I currently miss) or thoughts.

	-ss

> > ---
> >  drivers/block/zram/zcomp.c | 15 +++++++++++++--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> > index 80a62e7..837e9c3 100644
> > --- a/drivers/block/zram/zcomp.c
> > +++ b/drivers/block/zram/zcomp.c
> > @@ -76,6 +76,17 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
> >  	return zstrm;
> >  }
> >  
> > +static bool avail_idle_strm(struct zcomp_strm_multi *zs)
> > +{
> > +	int avail;
> > +
> > +	spin_lock(&zs->strm_lock);
> > +	avail = !list_empty(&zs->idle_strm);
> > +	spin_unlock(&zs->strm_lock);
> > +
> > +	return avail;
> > +}
> > +
> >  /*
> >   * get idle zcomp_strm or wait until other process release
> >   * (zcomp_strm_release()) one for us
> > @@ -97,7 +108,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
> >  		/* zstrm streams limit reached, wait for idle stream */
> >  		if (zs->avail_strm >= zs->max_strm) {
> >  			spin_unlock(&zs->strm_lock);
> > -			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> > +			wait_event(zs->strm_wait, avail_idle_strm(zs));
> >  			continue;
> >  		}
> >  		/* allocate new zstrm stream */
> > @@ -109,7 +120,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
> >  			spin_lock(&zs->strm_lock);
> >  			zs->avail_strm--;
> >  			spin_unlock(&zs->strm_lock);
> > -			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> > +			wait_event(zs->strm_wait, avail_idle_strm(zs));
> >  			continue;
> >  		}
> >  		break;
> > -- 
> 


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  8:03 [PATCH] zram: fix possible race when checking idle_strm Joonsoo Kim
  2015-08-07  9:14 ` Sergey Senozhatsky
  2015-08-07  9:37 ` Sergey Senozhatsky
@ 2015-08-07 14:49 ` Minchan Kim
  2015-08-10  0:35   ` Joonsoo Kim
  2 siblings, 1 reply; 15+ messages in thread
From: Minchan Kim @ 2015-08-07 14:49 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	Joonsoo Kim

Hi Joonsoo,

On Fri, Aug 07, 2015 at 05:03:29PM +0900, Joonsoo Kim wrote:
> Currently, when we enter the wait state due to a lack of idle streams,
> we check the idle_strm list without holding the lock, in the expansion of
> the wait_event() macro. In this case, someone can see a stale value and the
> process could fall into the wait state without any upcoming wakeup.
> Although I didn't see any error related to this race, it should be fixed.

A long time ago, I wondered about the lost wake-up problem and found an
article:

http://www.linuxjournal.com/article/8144

Since then, I have thought such an issue shouldn't happen unless
something else is wrong, and I believe this is the same issue.

Could you point out the exact code sequence of the problem you mentioned?

Thanks.

> 
> To fix it, we should check idle_strm while holding the lock; this
> patch implements that.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  drivers/block/zram/zcomp.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 80a62e7..837e9c3 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -76,6 +76,17 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
>  	return zstrm;
>  }
>  
> +static bool avail_idle_strm(struct zcomp_strm_multi *zs)
> +{
> +	int avail;
> +
> +	spin_lock(&zs->strm_lock);
> +	avail = !list_empty(&zs->idle_strm);
> +	spin_unlock(&zs->strm_lock);
> +
> +	return avail;
> +}
> +
>  /*
>   * get idle zcomp_strm or wait until other process release
>   * (zcomp_strm_release()) one for us
> @@ -97,7 +108,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
>  		/* zstrm streams limit reached, wait for idle stream */
>  		if (zs->avail_strm >= zs->max_strm) {
>  			spin_unlock(&zs->strm_lock);
> -			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> +			wait_event(zs->strm_wait, avail_idle_strm(zs));
>  			continue;
>  		}
>  		/* allocate new zstrm stream */
> @@ -109,7 +120,7 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
>  			spin_lock(&zs->strm_lock);
>  			zs->avail_strm--;
>  			spin_unlock(&zs->strm_lock);
> -			wait_event(zs->strm_wait, !list_empty(&zs->idle_strm));
> +			wait_event(zs->strm_wait, avail_idle_strm(zs));
>  			continue;
>  		}
>  		break;
> -- 
> 1.9.1
> 

-- 
Kind regards,
Minchan Kim


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07  9:58   ` Sergey Senozhatsky
@ 2015-08-10  0:32     ` Joonsoo Kim
  2015-08-10  2:16       ` Sergey Senozhatsky
  2015-08-10 23:26       ` Minchan Kim
  0 siblings, 2 replies; 15+ messages in thread
From: Joonsoo Kim @ 2015-08-10  0:32 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Andrew Morton, Minchan Kim, Nitin Gupta, linux-kernel

On Fri, Aug 07, 2015 at 06:58:16PM +0900, Sergey Senozhatsky wrote:
> On (08/07/15 18:14), Sergey Senozhatsky wrote:
> > hm... I need to think about it more.
> > 
> > we do wake_up every time we put stream back to the list
> > 
> > zcomp_strm_multi_release():
> > 
> >         spin_lock(&zs->strm_lock);
> >         if (zs->avail_strm <= zs->max_strm) {
> >                 list_add(&zstrm->list, &zs->idle_strm);
> >                 spin_unlock(&zs->strm_lock);
> >                 wake_up(&zs->strm_wait);
> >                 return;
> >         }
> > 
> > 
> > but I can probably see what you mean... in some very extreme case,
> > though. I can't even formulate it... eh... we use a multi stream
> > backend with ->max_strm == 1 and there are two processes, one
> > just falsely passed the wait_event() `if (condition)' check, the
> > other one just put stream back to ->idle_strm and called wake_up(),
> > but the first process hasn't yet executed prepare_to_wait_event()
> > so it might miss a wakeup. and there should be no other process
> > doing read or write operation. otherwise, there will be wakeup
> > eventually.
> > 
> > is this the case you were thinking of?... then yes, this spinlock
> > may help.
> > 
> 
> on the other hand... it's actually
> 
> 	wait_event() is
> 
> 	if (condition)
> 		break;
> 	prepare_to_wait_event(&wq, &__wait, state)
> 	if (condition)
> 		break;
> 	schedule();
> 
> if first condition check was false and we missed a wakeup call between
> first condition and prepare_to_wait_event(), then second condition
> check should do the trick I think (or you expect that second condition
> check may be wrongly pre-fetched or something).

Hello, Sergey.

This is what I thought.
I expected that the second condition check can be false if the compiler
reuses the result of the first check as an optimization. I guess that
there is nothing preventing this kind of optimization.

So, the following is the problem sequence I thought of.
T1 means thread 1; T2 means another thread, thread 2.

<T1-1> check if idle_strm is empty or not while holding the lock
<T1-2> It is empty so do spin_unlock and run the wait_event macro
<T1-3> check if idle_strm is empty or not
<T1-4> It is still empty

<T2-1> do strm release
<T2-2> call wake_up

<T1-5> add T1 to wait queue
<T1-6> check if idle_strm is empty or not
<T1-7> compiler reuses <T1-4>'s result or the CPU just fetches a cached
result, so T1 starts waiting

In this case, T1 can sleep permanently. To prevent compiler
optimization or fetching a cached value, we need a lock here.
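
To show the kind of compiler behaviour I'm worried about, here is a
minimal userspace sketch (my own toy example, not the zram code):

	static int flag;	/* set by another thread */

	void wait_plain(void)
	{
		while (!flag)	/* plain loads: the compiler may in principle
				 * fuse them and spin on a stale register copy */
			;
	}

	void wait_fresh(void)
	{
		while (!*(volatile int *)&flag)	/* roughly what READ_ONCE()
						 * does: forces a fresh load */
			;
	}

Taking the spinlock has a similar effect, since spin_lock() also acts
as a compiler barrier.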

Thanks.


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-07 14:49 ` Minchan Kim
@ 2015-08-10  0:35   ` Joonsoo Kim
  0 siblings, 0 replies; 15+ messages in thread
From: Joonsoo Kim @ 2015-08-10  0:35 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, linux-kernel

On Fri, Aug 07, 2015 at 11:49:04PM +0900, Minchan Kim wrote:
> Hi Joonsoo,
> 
> On Fri, Aug 07, 2015 at 05:03:29PM +0900, Joonsoo Kim wrote:
> > Currently, when we enter the wait state due to a lack of idle streams,
> > we check the idle_strm list without holding the lock, in the expansion of
> > the wait_event() macro. In this case, someone can see a stale value and the
> > process could fall into the wait state without any upcoming wakeup.
> > Although I didn't see any error related to this race, it should be fixed.
> 
> A long time ago, I wondered about the lost wake-up problem and found an
> article:
> 
> http://www.linuxjournal.com/article/8144
> 
> Since then, I have thought such an issue shouldn't happen unless
> something else is wrong, and I believe this is the same issue.
> 
> Could you point out the exact code sequence of the problem you mentioned?

Hello,

I described the exact code sequence in my reply to Sergey's mail.
It should fully describe the problem.

Thanks.


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-10  0:32     ` Joonsoo Kim
@ 2015-08-10  2:16       ` Sergey Senozhatsky
  2015-08-11  8:26         ` Joonsoo Kim
  2015-08-10 23:26       ` Minchan Kim
  1 sibling, 1 reply; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-10  2:16 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Sergey Senozhatsky, Andrew Morton, Minchan Kim, Nitin Gupta,
	linux-kernel

Hello Joonsoo,

On (08/10/15 09:32), Joonsoo Kim wrote:
> > on the other hand... it's actually
> > 
> > 	wait_event() is
> > 
> > 	if (condition)
> > 		break;
> > 	prepare_to_wait_event(&wq, &__wait, state)
> > 	if (condition)
> > 		break;
> > 	schedule();
> > 
> > if first condition check was false and we missed a wakeup call between
> > first condition and prepare_to_wait_event(), then second condition
> > check should do the trick I think (or you expect that second condition
> > check may be wrongly pre-fetched or something).
> 
...
> I expected that the second condition check can be false if the compiler
> reuses the result of the first check as an optimization. I guess that
> there is nothing preventing this kind of optimization.

hm... so we have outer and inner checks (out of the loop and inside the loop).
can the compiler decide that the outer and inner checks are equivalent here?

  #define wait_event(wq, condition)                                       \
  do {                                                                    \
          might_sleep();                                                  \
          if (condition)                                                  \
                  break;                                                  \
      ....
                for (;;) {                                                \
                  long __int = prepare_to_wait_event(&wq, &__wait, state);\
                                                                          \
                  if (condition)                                          \
                          break;                                          \
                                                                          \
                  if (___wait_is_interruptible(state) && __int) {         \
                          __ret = __int;                                  \
                          if (exclusive) {                                \
                                  abort_exclusive_wait(&wq, &__wait,      \
                                                       state, NULL);      \
                                  goto __out;                             \
                          }                                               \
                          break;                                          \
                  }                                                       \
                                                                          \
                  cmd;                                                    \
                }                                                         \
      ....
  } while (0)


I probably don't have enough knowledge about compilers, but I think it
must keep both checks. But I may be wrong.
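
fwiw, the inner check sits after a call to prepare_to_wait_event(),
which is an out-of-line function; unless the compiler can prove the
call doesn't touch zs->idle_strm, it has to re-load it afterwards.
a toy example of what I mean (my assumption, not kernel code):

    extern void opaque(void);	/* out-of-line, like prepare_to_wait_event() */

    int read_twice(int *p)
    {
    	int a = *p;
    	opaque();	/* may modify *p for all the compiler knows, */
    	int b = *p;	/* so this load cannot be fused with the first */
    	return a + b;
    }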

just out of curiosity, a quick grep

wait_event(zatm_vcc->tx_wait, !skb_peek(&zatm_vcc->tx_queue))
wait_event(pmu->recv.wait, (pmu->recv.process == 0))
wait_event(ep->com.waitq, ep->com.rpl_done)
wait_event(cs->waitqueue, !cs->waiting)
wait_event(resync_wait, (mddev->sync_thread == NULL &&...
wait_event(mddev->sb_wait, mddev->flags == 0 ||...

and so on.

	-ss

> So, the following is the problem sequence I thought of.
> T1 means thread 1; T2 means another thread, thread 2.
> 
> <T1-1> check if idle_strm is empty or not while holding the lock
> <T1-2> It is empty so do spin_unlock and run the wait_event macro
> <T1-3> check if idle_strm is empty or not
> <T1-4> It is still empty
> 
> <T2-1> do strm release
> <T2-2> call wake_up
> 
> <T1-5> add T1 to wait queue
> <T1-6> check if idle_strm is empty or not
> <T1-7> compiler reuses <T1-4>'s result or the CPU just fetches a cached
> result, so T1 starts waiting
> 
> In this case, T1 can sleep permanently. To prevent compiler
> optimization or fetching a cached value, we need a lock here.

	-ss


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-10  0:32     ` Joonsoo Kim
  2015-08-10  2:16       ` Sergey Senozhatsky
@ 2015-08-10 23:26       ` Minchan Kim
  2015-08-11  8:25         ` Joonsoo Kim
  1 sibling, 1 reply; 15+ messages in thread
From: Minchan Kim @ 2015-08-10 23:26 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Sergey Senozhatsky, Andrew Morton, Nitin Gupta, linux-kernel,
	paulmck, peterz

Hi Joonsoo,

On Mon, Aug 10, 2015 at 09:32:30AM +0900, Joonsoo Kim wrote:
> On Fri, Aug 07, 2015 at 06:58:16PM +0900, Sergey Senozhatsky wrote:
> > On (08/07/15 18:14), Sergey Senozhatsky wrote:
> > > hm... I need to think about it more.
> > > 
> > > we do wake_up every time we put stream back to the list
> > > 
> > > zcomp_strm_multi_release():
> > > 
> > >         spin_lock(&zs->strm_lock);
> > >         if (zs->avail_strm <= zs->max_strm) {
> > >                 list_add(&zstrm->list, &zs->idle_strm);
> > >                 spin_unlock(&zs->strm_lock);
> > >                 wake_up(&zs->strm_wait);
> > >                 return;
> > >         }
> > > 
> > > 
> > > but I can probably see what you mean... in some very extreme case,
> > > though. I can't even formulate it... eh... we use a multi stream
> > > backend with ->max_strm == 1 and there are two processes, one
> > > just falsely passed the wait_event() `if (condition)' check, the
> > > other one just put stream back to ->idle_strm and called wake_up(),
> > > but the first process hasn't yet executed prepare_to_wait_event()
> > > so it might miss a wakeup. and there should be no other process
> > > doing read or write operation. otherwise, there will be wakeup
> > > eventually.
> > > 
> > > is this the case you were thinking of?... then yes, this spinlock
> > > may help.
> > > 
> > 
> > on the other hand... it's actually
> > 
> > 	wait_event() is
> > 
> > 	if (condition)
> > 		break;
> > 	prepare_to_wait_event(&wq, &__wait, state)
> > 	if (condition)
> > 		break;
> > 	schedule();
> > 
> > if first condition check was false and we missed a wakeup call between
> > first condition and prepare_to_wait_event(), then second condition
> > check should do the trick I think (or you expect that second condition
> > check may be wrongly pre-fetched or something).
> 
> Hello, Sergey.
> 
> This is what I thought.
> I expected that the second condition check can be false if the compiler
> reuses the result of the first check as an optimization. I guess that
> there is nothing preventing this kind of optimization.
> 
> So, the following is the problem sequence I thought of.
> T1 means thread 1; T2 means another thread, thread 2.
> 
> <T1-1> check if idle_strm is empty or not while holding the lock
> <T1-2> It is empty so do spin_unlock and run the wait_event macro
> <T1-3> check if idle_strm is empty or not
> <T1-4> It is still empty
> 
> <T2-1> do strm release
> <T2-2> call wake_up
> 
> <T1-5> add T1 to wait queue
> <T1-6> check if idle_strm is empty or not
> <T1-7> compiler reuses <T1-4>'s result or the CPU just fetches a cached
> result, so T1 starts waiting
> 
> In this case, T1 can sleep permanently. To prevent compiler
> optimization or fetching a cached value, we need a lock here.

When I read Documentation/memory-barriers.txt, it shouldn't happen.

"All memory barriers except the data dependency barriers imply a compiler
barrier. Data dependencies do not impose any additional compiler ordering."

"SLEEP AND WAKE-UP FUNCTIONS
---------------------------

Sleeping and waking on an event flagged in global data ...
...
...
...

A general memory barrier is interpolated automatically by set_current_state()
after it has altered the task state:"

So I think your T1-7 assumption is not true.
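
Roughly, the re-check in wait_event() is separated from the queueing by
a full barrier. A simplified sketch of prepare_to_wait_event() (not the
exact kernel code, which also handles pending signals):

	long prepare_to_wait_event(wait_queue_head_t *q,
				   wait_queue_t *wait, int state)
	{
		unsigned long flags;

		spin_lock_irqsave(&q->lock, flags);
		if (list_empty(&wait->task_list))
			__add_wait_queue(q, wait);
		/*
		 * set_current_state() implies a full memory barrier
		 * (smp_mb()) after altering ->state, so the condition
		 * re-check in wait_event() cannot use a value cached
		 * from before the task was queued. Being an out-of-line
		 * call, this function is a compiler barrier for the
		 * caller as well.
		 */
		set_current_state(state);
		spin_unlock_irqrestore(&q->lock, flags);
		return 0;
	}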

As well, there are many examples under drivers/ that use global data
as an event flag without locking or atomics.

Just in case, Ccing Paul and Peter.


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-11  8:25         ` Joonsoo Kim
@ 2015-08-11  8:25           ` Sergey Senozhatsky
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Senozhatsky @ 2015-08-11  8:25 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Nitin Gupta,
	linux-kernel, paulmck, peterz

On (08/11/15 17:25), Joonsoo Kim wrote:
[..]
> > "SLEEP AND WAKE-UP FUNCTIONS
> > ---------------------------
> > 
> > Sleeping and waking on an event flagged in global data ...
> > ...
> > ...
> > ...
> > 
> > A general memory barrier is interpolated automatically by set_current_state()
> > after it has altered the task state:"
> > 
> > So I think your T1-7 assumption is not true.
> > 
> > As well, there are many examples under drivers/ that use global data
> > as an event flag without locking or atomics.
> > 
> 
> Okay. Now I'm convinced that the race is not possible. I will drop this
> patch.

yep, Minchan found it first. thanks guys.

	-ss


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-10 23:26       ` Minchan Kim
@ 2015-08-11  8:25         ` Joonsoo Kim
  2015-08-11  8:25           ` Sergey Senozhatsky
  0 siblings, 1 reply; 15+ messages in thread
From: Joonsoo Kim @ 2015-08-11  8:25 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, Nitin Gupta, linux-kernel,
	paulmck, peterz

On Tue, Aug 11, 2015 at 08:26:33AM +0900, Minchan Kim wrote:
> Hi Joonsoo,
> 
> On Mon, Aug 10, 2015 at 09:32:30AM +0900, Joonsoo Kim wrote:
> > On Fri, Aug 07, 2015 at 06:58:16PM +0900, Sergey Senozhatsky wrote:
> > > On (08/07/15 18:14), Sergey Senozhatsky wrote:
> > > > hm... I need to think about it more.
> > > > 
> > > > we do wake_up every time we put stream back to the list
> > > > 
> > > > zcomp_strm_multi_release():
> > > > 
> > > >         spin_lock(&zs->strm_lock);
> > > >         if (zs->avail_strm <= zs->max_strm) {
> > > >                 list_add(&zstrm->list, &zs->idle_strm);
> > > >                 spin_unlock(&zs->strm_lock);
> > > >                 wake_up(&zs->strm_wait);
> > > >                 return;
> > > >         }
> > > > 
> > > > 
> > > > but I can probably see what you mean... in some very extreme case,
> > > > though. I can't even formulate it... eh... we use a multi stream
> > > > backend with ->max_strm == 1 and there are two processes, one
> > > > just falsely passed the wait_event() `if (condition)' check, the
> > > > other one just put stream back to ->idle_strm and called wake_up(),
> > > > but the first process hasn't yet executed prepare_to_wait_event()
> > > > so it might miss a wakeup. and there should be no other process
> > > > doing read or write operation. otherwise, there will be wakeup
> > > > eventually.
> > > > 
> > > > is this the case you were thinking of?... then yes, this spinlock
> > > > may help.
> > > > 
> > > 
> > > on the other hand... it's actually
> > > 
> > > 	wait_event() is
> > > 
> > > 	if (condition)
> > > 		break;
> > > 	prepare_to_wait_event(&wq, &__wait, state)
> > > 	if (condition)
> > > 		break;
> > > 	schedule();
> > > 
> > > if first condition check was false and we missed a wakeup call between
> > > first condition and prepare_to_wait_event(), then second condition
> > > check should do the trick I think (or you expect that second condition
> > > check may be wrongly pre-fetched or something).
> > 
> > Hello, Sergey.
> > 
> > This is what I thought.
> > I expected that the second condition check can be false if the compiler
> > reuses the result of the first check as an optimization. I guess that
> > there is nothing preventing this kind of optimization.
> > 
> > So, the following is the problem sequence I thought of.
> > T1 means thread 1; T2 means another thread, thread 2.
> > 
> > <T1-1> check if idle_strm is empty or not while holding the lock
> > <T1-2> It is empty so do spin_unlock and run the wait_event macro
> > <T1-3> check if idle_strm is empty or not
> > <T1-4> It is still empty
> > 
> > <T2-1> do strm release
> > <T2-2> call wake_up
> > 
> > <T1-5> add T1 to wait queue
> > <T1-6> check if idle_strm is empty or not
> > <T1-7> compiler reuses <T1-4>'s result or the CPU just fetches a cached
> > result, so T1 starts waiting
> > 
> > In this case, T1 can sleep permanently. To prevent compiler
> > optimization or fetching a cached value, we need a lock here.
> 
> When I read Documentation/memory-barriers.txt, it shouldn't happen.
> 
> "All memory barriers except the data dependency barriers imply a compiler
> barrier. Data dependencies do not impose any additional compiler ordering."
> 
> "SLEEP AND WAKE-UP FUNCTIONS
> ---------------------------
> 
> Sleeping and waking on an event flagged in global data ...
> ...
> ...
> ...
> 
> A general memory barrier is interpolated automatically by set_current_state()
> after it has altered the task state:"
> 
> So I think your T1-7 assumption is not true.
> 
> As well, there are many examples under drivers/ that use global data
> as an event flag without locking or atomics.
> 

Okay. Now I'm convinced that the race is not possible. I will drop this
patch.

Thanks.


* Re: [PATCH] zram: fix possible race when checking idle_strm
  2015-08-10  2:16       ` Sergey Senozhatsky
@ 2015-08-11  8:26         ` Joonsoo Kim
  0 siblings, 0 replies; 15+ messages in thread
From: Joonsoo Kim @ 2015-08-11  8:26 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Andrew Morton, Minchan Kim, Nitin Gupta, linux-kernel

On Mon, Aug 10, 2015 at 11:16:59AM +0900, Sergey Senozhatsky wrote:
> Hello Joonsoo,
> 
> On (08/10/15 09:32), Joonsoo Kim wrote:
> > > on the other hand... it's actually
> > > 
> > > 	wait_event() is
> > > 
> > > 	if (condition)
> > > 		break;
> > > 	prepare_to_wait_event(&wq, &__wait, state)
> > > 	if (condition)
> > > 		break;
> > > 	schedule();
> > > 
> > > if first condition check was false and we missed a wakeup call between
> > > first condition and prepare_to_wait_event(), then second condition
> > > check should do the trick I think (or you expect that second condition
> > > check may be wrongly pre-fetched or something).
> > 
> ...
> > I expected that second condition can be false if compiler reuse result
> > of first check for optimization. I guess that there is no prevention
> > for this kind of optimization.
> 
> hm... so we have outer and inner checks (out of the loop and inside the loop).
> can the compiler decide that the outer and inner checks are equivalent here?
> 
>   #define wait_event(wq, condition)                                       \
>   do {                                                                    \
>           might_sleep();                                                  \
>           if (condition)                                                  \
>                   break;                                                  \
>       ....
>                 for (;;) {                                                \
>                   long __int = prepare_to_wait_event(&wq, &__wait, state);\
>                                                                           \
>                   if (condition)                                          \
>                           break;                                          \
>                                                                           \
>                   if (___wait_is_interruptible(state) && __int) {         \
>                           __ret = __int;                                  \
>                           if (exclusive) {                                \
>                                   abort_exclusive_wait(&wq, &__wait,      \
>                                                        state, NULL);      \
>                                   goto __out;                             \
>                           }                                               \
>                           break;                                          \
>                   }                                                       \
>                                                                           \
>                   cmd;                                                    \
>                 }                                                         \
>       ....
>   } while (0)
> 
> 
> I probably don't have enough knowledge about compilers, but I think it
> must keep both checks. But I may be wrong.
> 
> just out of curiosity, a quick grep
> 
> wait_event(zatm_vcc->tx_wait, !skb_peek(&zatm_vcc->tx_queue))
> wait_event(pmu->recv.wait, (pmu->recv.process == 0))
> wait_event(ep->com.waitq, ep->com.rpl_done)
> wait_event(cs->waitqueue, !cs->waiting)
> wait_event(resync_wait, (mddev->sync_thread == NULL &&...
> wait_event(mddev->sb_wait, mddev->flags == 0 ||...
> 
> and so on.

From Minchan's comment, I think that the race is not possible.
I will drop the patch.

Thanks.

