* Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
@ 2013-09-11  2:36 ` Chao Yu
  0 siblings, 0 replies; 6+ messages in thread
From: Chao Yu @ 2013-09-11  2:36 UTC (permalink / raw)
  To: ???; +Cc: 谭姝, linux-fsdevel, linux-kernel, linux-f2fs-devel

Hi Kim,

I ran some tests, as you suggested, using a random number instead of the
spin_lock. The test model is as follows: eight threads race to grab one of
eight locks one thousand times, and I used four methods to generate the
lock number:

1. atomic_add_return(1, &sbi->next_lock_num) % NR_GLOBAL_LOCKS;
2. spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
3. ktime_get().tv64 % NR_GLOBAL_LOCKS;
4. get_random_bytes(&next_lock, sizeof(unsigned int));

The results indicate:
max count of consecutive collisions: 4 > 3 > 2 = 1
max-min count of times each lock is grabbed: 4 > 3 > 2 = 1
elapsed time to generate the number: 3 > 2 > 4 > 1

So I think it is better to use atomic_add_return in a round-robin fashion,
since it costs the least time and reduces collisions.
What's your opinion?
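
The balance metric above ("max-min count") can be illustrated outside the
kernel with a minimal single-threaded sketch. This is not the test that was
run for this thread: it measures only how evenly each method spreads its
picks, not collisions or elapsed time. The function names are hypothetical,
and xorshift32 is an assumed stand-in for get_random_bytes().

```c
#include <assert.h>
#include <stdint.h>

#define NR_GLOBAL_LOCKS 8

/* Max-min difference of the per-lock pick counts. */
static unsigned int spread(const unsigned int hits[NR_GLOBAL_LOCKS])
{
	unsigned int min = hits[0], max = hits[0];

	for (int i = 1; i < NR_GLOBAL_LOCKS; i++) {
		if (hits[i] < min)
			min = hits[i];
		if (hits[i] > max)
			max = hits[i];
	}
	return max - min;
}

/* Round-robin selection, as in methods 1 and 2: counter++ % NR_GLOBAL_LOCKS. */
unsigned int round_robin_spread(unsigned int picks)
{
	unsigned int hits[NR_GLOBAL_LOCKS] = { 0 };
	unsigned int counter = 0;

	for (unsigned int n = 0; n < picks; n++)
		hits[counter++ % NR_GLOBAL_LOCKS]++;
	return spread(hits);
}

/*
 * Pseudo-random selection. xorshift32 is an assumed stand-in for
 * get_random_bytes(); the seed must be non-zero.
 */
unsigned int xorshift_spread(unsigned int picks, uint32_t seed)
{
	unsigned int hits[NR_GLOBAL_LOCKS] = { 0 };
	uint32_t x = seed;

	assert(seed != 0);
	for (unsigned int n = 0; n < picks; n++) {
		x ^= x << 13;
		x ^= x >> 17;
		x ^= x << 5;
		hits[x % NR_GLOBAL_LOCKS]++;
	}
	return spread(hits);
}
```

Without contention, round-robin spreads its picks as evenly as possible: the
spread is 0 when the number of picks is a multiple of NR_GLOBAL_LOCKS and 1
otherwise. A random source gives no such bound.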

thanks

------- Original Message -------
Sender : ???<jaegeuk.kim@samsung.com> S5(??)/??/?????????(???)/????
Date : September 10, 2013 09:52 (GMT+09:00)
Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

Hi,

First, thank you for the report, and please follow the email writing
rules. :)

Anyway, I agree with the issue below.
One thing I can think of is that we don't need the spin_lock, since we
don't care about the exact lock number; we just need a number that is
unlikely to collide.

So, how about removing the spin_lock?
And how about using a random number?
Thanks,

2013-09-06 (?), 09:48 +0000, Chao Yu:
> Hi Kim:
>
> I think there is a performance problem: when all sbi->fs_lock are held,
> all other threads may get the same next_lock value from
> sbi->next_lock_num in function mutex_lock_op, and wait to get the same
> lock at position fs_lock[next_lock]. This unbalances the fs_lock usage.
> It may cost performance when we run a multithreaded test.
>
> Here is the patch to fix this problem:
>
> Signed-off-by: Yu Chao <chao2.yu@samsung.com>
>
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> old mode 100644
> new mode 100755
> index 467d42d..983bb45
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
>         struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS operations */
>         struct mutex node_write;                /* locking node writes */
>         struct mutex writepages;                /* mutex for writepages() */
> +       spinlock_t spin_lock;                   /* lock for next_lock_num */
>         unsigned char next_lock_num;            /* round-robin global locks */
>         int por_doing;                          /* recovery is doing or not */
>         int on_build_free_nids;                 /* build_free_nids is doing */
> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct f2fs_sb_info *sbi)
>  
>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
>  {
> -       unsigned char next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> +       unsigned char next_lock;
>         int i = 0;
>  
>         for (; i < NR_GLOBAL_LOCKS; i++)
>                 if (mutex_trylock(&sbi->fs_lock[i]))
>                         return i;
>  
> -       mutex_lock(&sbi->fs_lock[next_lock]);
> +       spin_lock(&sbi->spin_lock);
> +       next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
>         sbi->next_lock_num++;
> +       spin_unlock(&sbi->spin_lock);
> +
> +       mutex_lock(&sbi->fs_lock[next_lock]);
>         return next_lock;
>  }
>  
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> old mode 100644
> new mode 100755
> index 75c7dc3..4f27596
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>         mutex_init(&sbi->cp_mutex);
>         for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>                 mutex_init(&sbi->fs_lock[i]);
> +       spin_lock_init(&sbi->spin_lock);
>         mutex_init(&sbi->node_write);
>         sbi->por_doing = 0;
>         spin_lock_init(&sbi->stat_lock);

-- 
Jaegeuk Kim
Samsung

* Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
  2013-09-11  2:36 ` Chao Yu
@ 2013-09-11 13:14   ` Kim Jaegeuk
  -1 siblings, 0 replies; 6+ messages in thread
From: Kim Jaegeuk @ 2013-09-11 13:14 UTC (permalink / raw)
  To: chao2.yu
  Cc: ???, 谭姝, linux-fsdevel, linux-kernel, linux-f2fs-devel

Hi,

2013/9/11 Chao Yu <chao2.yu@samsung.com>
>
> Hi Kim,
>
> I ran some tests, as you suggested, using a random number instead of the
> spin_lock. The test model is as follows: eight threads race to grab one of
> eight locks one thousand times, and I used four methods to generate the
> lock number:
>
> 1. atomic_add_return(1, &sbi->next_lock_num) % NR_GLOBAL_LOCKS;
> 2. spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
> 3. ktime_get().tv64 % NR_GLOBAL_LOCKS;
> 4. get_random_bytes(&next_lock, sizeof(unsigned int));
>
> The results indicate:
> max count of consecutive collisions: 4 > 3 > 2 = 1
> max-min count of times each lock is grabbed: 4 > 3 > 2 = 1
> elapsed time to generate the number: 3 > 2 > 4 > 1
>
> So I think it is better to use atomic_add_return in a round-robin fashion,
> since it costs the least time and reduces collisions.
> What's your opinion?

Could you test with sbi->next_lock_num++ only instead of using
atomic_add_return?
IMO, this is just an integer value, and I still don't think it needs to
be protected by any kind of lock.
Thanks,
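
To make the trade-off concrete, here is a small userspace sketch using C11
atomics. It is an illustration under stated assumptions, not kernel code:
the function names are hypothetical, and the unlocked `++` is modelled as a
separate relaxed load and store so that the example stays free of undefined
behaviour. Note that C11 atomic_fetch_add returns the old value, whereas the
kernel's atomic_add_return() returns the new one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_GLOBAL_LOCKS 8

/*
 * Atomic variant: one indivisible read-modify-write, so two callers
 * never receive the same ticket.
 */
unsigned int next_lock_atomic(atomic_uint *counter)
{
	assert(counter != NULL);
	return atomic_fetch_add_explicit(counter, 1, memory_order_relaxed)
		% NR_GLOBAL_LOCKS;
}

/*
 * Model of a plain, unlocked ++: the load and the store are separate
 * steps, so racing callers may read the same value and one increment
 * may be lost. The result is still a valid index into fs_lock[].
 */
unsigned int next_lock_lossy(atomic_uint *counter)
{
	unsigned int cur;

	assert(counter != NULL);
	cur = atomic_load_explicit(counter, memory_order_relaxed);
	atomic_store_explicit(counter, cur + 1, memory_order_relaxed);
	return cur % NR_GLOBAL_LOCKS;
}
```

In the lossy variant the worst outcome of a race is that two waiters queue
on the same mutex once, which costs some balance but not correctness. That
is the sense in which the exact counter value does not matter here.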

* Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
  2013-09-11 13:14   ` Kim Jaegeuk
  (?)
@ 2013-09-12  2:02   ` 俞超
  -1 siblings, 0 replies; 6+ messages in thread
From: 俞超 @ 2013-09-12  2:02 UTC (permalink / raw)
  To: 'Kim Jaegeuk'
  Cc: '???', '谭姝',
	linux-fsdevel, linux-kernel, linux-f2fs-devel

Hi Kim

> -----Original Message-----
> From: Kim Jaegeuk [mailto:jaegeuk.kim@gmail.com]
> Sent: Wednesday, September 11, 2013 9:15 PM
> To: chao2.yu@samsung.com
> Cc: ???; 谭姝; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
> 
> Hi,
> 
> 2013/9/11 Chao Yu <chao2.yu@samsung.com>
> >
> > Hi Kim,
> >
> > I ran some tests, as you suggested, using a random number instead of
> > the spin_lock. The test model is as follows: eight threads race to grab
> > one of eight locks one thousand times, and I used four methods to
> > generate the lock number:
> >
> > 1. atomic_add_return(1, &sbi->next_lock_num) % NR_GLOBAL_LOCKS;
> > 2. spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
> > 3. ktime_get().tv64 % NR_GLOBAL_LOCKS;
> > 4. get_random_bytes(&next_lock, sizeof(unsigned int));
> >
> > The results indicate:
> > max count of consecutive collisions: 4 > 3 > 2 = 1
> > max-min count of times each lock is grabbed: 4 > 3 > 2 = 1
> > elapsed time to generate the number: 3 > 2 > 4 > 1
> >
> > So I think it is better to use atomic_add_return in a round-robin
> > fashion, since it costs the least time and reduces collisions.
> > What's your opinion?
>
> Could you test with sbi->next_lock_num++ only instead of using
> atomic_add_return?
> IMO, this is just an integer value, and I still don't think it needs to
> be protected by any kind of lock.
> Thanks,

Thanks for the advice. I have tested sbi->next_lock_num++.
Its time cost is slightly lower than that of the atomic version when
running 8 threads for 1000000 iterations.
Its balance and collision behavior were also better than I expected.

Can we modify the patch as follows?

root@virtaulmachine:/home/yuchao/git/linux-next/fs/f2fs# git diff --stat
 fs/f2fs/f2fs.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
root@virtaulmachine:/home/yuchao/git/linux-next/fs/f2fs# git diff
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 608f0df..7fd99d8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -544,15 +544,15 @@ static inline void mutex_unlock_all(struct f2fs_sb_info *sbi)
 
 static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 {
-       unsigned char next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
+       unsigned char next_lock;
        int i = 0;
 
        for (; i < NR_GLOBAL_LOCKS; i++)
                if (mutex_trylock(&sbi->fs_lock[i]))
                        return i;
 
+       next_lock = sbi->next_lock_num++ % NR_GLOBAL_LOCKS;
        mutex_lock(&sbi->fs_lock[next_lock]);
-       sbi->next_lock_num++;
        return next_lock;
 }
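
The control flow of the revised mutex_lock_op() can be tried in userspace
with pthreads. The following is a rough sketch under stated assumptions, not
the f2fs code: `struct sb_info`, `sb_info_init()`, `lock_op()` and
`unlock_op()` are hypothetical names. The counter uses a relaxed C11 atomic
because an unsynchronized `++` is a data race in portable C, whereas the
patch above uses a plain increment. Also note that pthread_mutex_trylock()
returns 0 on success, the opposite of the kernel's mutex_trylock().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_GLOBAL_LOCKS 8

struct sb_info {
	pthread_mutex_t fs_lock[NR_GLOBAL_LOCKS];
	atomic_uint next_lock_num;	/* round-robin fallback counter */
};

void sb_info_init(struct sb_info *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		pthread_mutex_init(&sbi->fs_lock[i], NULL);
	atomic_init(&sbi->next_lock_num, 0);
}

/* Returns the index of the lock that was taken. */
int lock_op(struct sb_info *sbi)
{
	unsigned int next_lock;

	/* Fast path: take the first lock that is currently free. */
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		if (pthread_mutex_trylock(&sbi->fs_lock[i]) == 0)
			return i;

	/* Slow path: every lock is busy, so wait on the next one in turn. */
	next_lock = atomic_fetch_add_explicit(&sbi->next_lock_num, 1,
					      memory_order_relaxed)
		% NR_GLOBAL_LOCKS;
	pthread_mutex_lock(&sbi->fs_lock[next_lock]);
	return (int)next_lock;
}

void unlock_op(struct sb_info *sbi, int ilock)
{
	assert(ilock >= 0 && ilock < NR_GLOBAL_LOCKS);
	pthread_mutex_unlock(&sbi->fs_lock[ilock]);
}
```

The point of advancing the counter before blocking is that each waiter on
the slow path picks a different mutex, instead of all waiters reading the
same stale value and piling up on one lock.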

* Re: Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance
       [not found] <04.C0.13361.61DDA225@epcpsbge5.samsung.com>
@ 2013-09-10  0:59 ` Jaegeuk Kim
  0 siblings, 0 replies; 6+ messages in thread
From: Jaegeuk Kim @ 2013-09-10  0:59 UTC (permalink / raw)
  To: chao2.yu
  Cc: Russ Knize, linux-fsdevel, 谭姝,
	linux-kernel, linux-f2fs-devel

Hi,

2013-09-07 (Sat), 08:00 +0000, Chao Yu:
> Hi Knize,
> 
>     Thanks for your reply. I agree that the name "spin_lock" is not
> meaningful; it would be better to rename this spinlock to
> "round_robin_lock".
>
>     This patch only resolves the unbalanced fs_lock usage; it cannot
> fix the deadlock issue. Can we fix the deadlock with this method:
>
> - vfs_create()
>  - f2fs_create() - takes an fs_lock and saves the current thread info
>    into thread_info[NR_GLOBAL_LOCKS]
>   - f2fs_add_link()
>    - __f2fs_add_link()
>     - init_inode_metadata()
>      - f2fs_init_security()
>       - security_inode_init_security()
>        - f2fs_initxattrs()
>         - f2fs_setxattr() - takes an fs_lock only if there is no current
>           thread info in thread_info
>
> This ensures that one thread holds at most one fs_lock, which avoids
> the deadlock. Can we use this solution?

It could be.
But I think we can avoid grabbing the fs_lock at the f2fs_initxattrs()
level, since this case only happens when f2fs_initxattrs() is called.
Let's think about it in more detail.
Thanks,
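
One possible shape of that idea, sketched in userspace C with pthreads: the
xattr work is split into a helper that assumes the lock is already held and
a wrapper that takes the lock itself, and the create path calls the helper
directly. This is only an illustration under stated assumptions. All names
are hypothetical, the fs_lock array is collapsed to a single mutex, and an
error-checking mutex is used so that the nested acquisition fails with
EDEADLK instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct fs_info {
	pthread_mutex_t fs_lock;
	int xattrs_set;
};

int fs_info_init(struct fs_info *fs)
{
	pthread_mutexattr_t attr;
	int rc;

	fs->xattrs_set = 0;
	rc = pthread_mutexattr_init(&attr);
	if (rc)
		return -rc;
	/* Error-checking type: relocking by the owner returns EDEADLK. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	rc = pthread_mutex_init(&fs->fs_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return -rc;
}

/* Does the xattr work; the caller must already hold fs_lock. */
static void setxattr_nolock(struct fs_info *fs)
{
	assert(fs != NULL);
	fs->xattrs_set++;
}

/* Entry point for callers that do not hold fs_lock yet. */
int setxattr_locked(struct fs_info *fs)
{
	int rc = pthread_mutex_lock(&fs->fs_lock);

	if (rc)
		return -rc;	/* -EDEADLK if this thread already holds it */
	setxattr_nolock(fs);
	pthread_mutex_unlock(&fs->fs_lock);
	return 0;
}

/* Problem shape: create holds fs_lock, then the xattr path locks again. */
int create_nested(struct fs_info *fs)
{
	int rc = pthread_mutex_lock(&fs->fs_lock);

	if (rc)
		return -rc;
	rc = setxattr_locked(fs);
	pthread_mutex_unlock(&fs->fs_lock);
	return rc;
}

/* Suggested shape: the create path reuses the lock it already holds. */
int create_fixed(struct fs_info *fs)
{
	int rc = pthread_mutex_lock(&fs->fs_lock);

	if (rc)
		return -rc;
	setxattr_nolock(fs);
	pthread_mutex_unlock(&fs->fs_lock);
	return 0;
}
```

Compared with tracking per-thread lock ownership, this keeps the rule "one
thread holds at most one fs_lock" in the call structure itself, at the cost
of maintaining two entry points.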

>
> thanks again!
>
> ------- Original Message -------
> Sender : Russ Knize<Russ.Knize@motorola.com>
> Date : September 07, 2013 04:25 (GMT+09:00)
> Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance
>
> I encountered this same issue recently and solved it in much the same
> way.  Can we rename "spin_lock" to something more meaningful?
>
> This race actually exposed a potential deadlock between f2fs_create()
> and f2fs_initxattrs():
>
> - vfs_create()
>  - f2fs_create() - takes an fs_lock
>   - f2fs_add_link()
>    - __f2fs_add_link()
>     - init_inode_metadata()
>      - f2fs_init_security()
>       - security_inode_init_security()
>        - f2fs_initxattrs()
>         - f2fs_setxattr() - also takes an fs_lock
>
> If another CPU happens to have the same lock that f2fs_setxattr() was
> trying to take because of the race around next_lock_num, we can get
> into a deadlock situation if the two threads are also contending over
> another resource (like bdi).
>
> Another scenario is if the above happens while another thread is in
> the middle of grabbing all of the locks via mutex_lock_all().
> f2fs_create() is holding a lock that mutex_lock_all() is waiting for
> and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
> for.
>
> Russ

-- 
Jaegeuk Kim
Samsung

