linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
@ 2017-06-24 16:25 Jaegeuk Kim
  2017-06-24 16:25 ` [PATCH 2/2] f2fs: report # of free inodes more precisely Jaegeuk Kim
  2017-06-26  8:15 ` [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Chao Yu
  0 siblings, 2 replies; 16+ messages in thread
From: Jaegeuk Kim @ 2017-06-24 16:25 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

- punch_hole
 - fill_zero
  - f2fs_lock_op
  - get_new_data_page
   - lock_page

- f2fs_write_data_pages
 - lock_page
 - do_write_data_page
  - f2fs_lock_op

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/data.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7d3af48d34a9..9141bd19a902 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
 		}
 	}
 
-	if (fio->need_lock == LOCK_REQ)
-		f2fs_lock_op(fio->sbi);
+	/* Deadlock due to between page->lock and f2fs_lock_op */
+	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
+		return -EAGAIN;
 
 	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
 	if (err)
-- 
2.13.0.rc1.294.g07d810a77f-goog

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] f2fs: report # of free inodes more precisely
  2017-06-24 16:25 [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Jaegeuk Kim
@ 2017-06-24 16:25 ` Jaegeuk Kim
  2017-06-26 10:52   ` [f2fs-dev] " Chao Yu
  2017-06-26  8:15 ` [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Chao Yu
  1 sibling, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-06-24 16:25 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

If the partition is small, we don't need to report total # of inodes including
hidden free nodes.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/super.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8e39b850bfc0..3da6fb276f8b 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -680,6 +680,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	struct f2fs_sb_info *sbi = F2FS_SB(sb);
 	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
 	block_t total_count, user_block_count, start_count, ovp_count;
+	u64 avail_node_count;
 
 	total_count = le64_to_cpu(sbi->raw_super->block_count);
 	user_block_count = sbi->user_block_count;
@@ -692,9 +693,16 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
 	buf->f_bavail = user_block_count - valid_user_blocks(sbi);
 
-	buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
-	buf->f_ffree = min(buf->f_files - valid_node_count(sbi),
-							buf->f_bavail);
+	avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
+
+	if (avail_node_count > user_block_count) {
+		buf->f_files = user_block_count;
+		buf->f_ffree = buf->f_bavail;
+	} else {
+		buf->f_files = avail_node_count;
+		buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
+					buf->f_bavail);
+	}
 
 	buf->f_namelen = F2FS_NAME_LEN;
 	buf->f_fsid.val[0] = (u32)id;
-- 
2.13.0.rc1.294.g07d810a77f-goog

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-06-24 16:25 [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Jaegeuk Kim
  2017-06-24 16:25 ` [PATCH 2/2] f2fs: report # of free inodes more precisely Jaegeuk Kim
@ 2017-06-26  8:15 ` Chao Yu
  2017-06-26 14:54   ` Jaegeuk Kim
  1 sibling, 1 reply; 16+ messages in thread
From: Chao Yu @ 2017-06-26  8:15 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-fsdevel, linux-f2fs-devel

Hi Jaegeuk,

On 2017/6/25 0:25, Jaegeuk Kim wrote:
> - punch_hole
>  - fill_zero
>   - f2fs_lock_op
>   - get_new_data_page
>    - lock_page
> 
> - f2fs_write_data_pages
>  - lock_page
>  - do_write_data_page
>   - f2fs_lock_op

Good catch!

With this implementation, page writeback can fail due to concurrent checkpoint,
this will make fsync/atomic_commit which trigger synchronous write failed randomly.

How about unifying the lock order in punch_hole as one in writepages for regular
inode? We can add one more parameter in get_new_data_page to indicate whether
callee needs to lock cp_rwsem.

Thanks,

> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/data.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 7d3af48d34a9..9141bd19a902 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
>  		}
>  	}
>  
> -	if (fio->need_lock == LOCK_REQ)
> -		f2fs_lock_op(fio->sbi);
> +	/* Deadlock due to between page->lock and f2fs_lock_op */
> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
> +		return -EAGAIN;
>  
>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
>  	if (err)
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: report # of free inodes more precisely
  2017-06-24 16:25 ` [PATCH 2/2] f2fs: report # of free inodes more precisely Jaegeuk Kim
@ 2017-06-26 10:52   ` Chao Yu
  2017-06-26 14:58     ` Jaegeuk Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Chao Yu @ 2017-06-26 10:52 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-fsdevel, linux-f2fs-devel

Hi Jaegeuk,

On 2017/6/25 0:25, Jaegeuk Kim wrote:
> If the partition is small, we don't need to report total # of inodes including
> hidden free nodes.
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/super.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 8e39b850bfc0..3da6fb276f8b 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -680,6 +680,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>  	struct f2fs_sb_info *sbi = F2FS_SB(sb);
>  	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
>  	block_t total_count, user_block_count, start_count, ovp_count;
> +	u64 avail_node_count;
>  
>  	total_count = le64_to_cpu(sbi->raw_super->block_count);
>  	user_block_count = sbi->user_block_count;
> @@ -692,9 +693,16 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>  	buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>  	buf->f_bavail = user_block_count - valid_user_blocks(sbi);
>  
> -	buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
> -	buf->f_ffree = min(buf->f_files - valid_node_count(sbi),
> -							buf->f_bavail);
> +	avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
> +
> +	if (avail_node_count > user_block_count) {
> +		buf->f_files = user_block_count;
> +		buf->f_ffree = buf->f_bavail;

f_ffree is limited both by remained free nid count and free block count, so it
needs to change like this?

if (avail_node_count > user_block_count)
	avail_node_count = user_block_count;

buf->f_files = avail_node_count;
buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
					buf->f_bavail);

Thanks,

> +	} else {
> +		buf->f_files = avail_node_count;
> +		buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
> +					buf->f_bavail);
> +	}
>  
>  	buf->f_namelen = F2FS_NAME_LEN;
>  	buf->f_fsid.val[0] = (u32)id;
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-06-26  8:15 ` [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Chao Yu
@ 2017-06-26 14:54   ` Jaegeuk Kim
  2017-06-26 15:42     ` Chao Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-06-26 14:54 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-fsdevel, linux-f2fs-devel

Hi Chao,

On 06/26, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2017/6/25 0:25, Jaegeuk Kim wrote:
> > - punch_hole
> >  - fill_zero
> >   - f2fs_lock_op
> >   - get_new_data_page
> >    - lock_page
> > 
> > - f2fs_write_data_pages
> >  - lock_page
> >  - do_write_data_page
> >   - f2fs_lock_op
> 
> Good catch!
> 
> With this implementation, page writeback can fail due to concurrent checkpoint,
> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
> 
> How about unifying the lock order in punch_hole as one in writepages for regular
> inode? We can add one more parameter in get_new_data_page to indicate whether
> callee needs to lock cp_rwsem.

Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
flushing data in f2fs_sync_file, once it gets -EAGAIN.

Any thoughts?

> 
> Thanks,
> 
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/data.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 7d3af48d34a9..9141bd19a902 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
> >  		}
> >  	}
> >  
> > -	if (fio->need_lock == LOCK_REQ)
> > -		f2fs_lock_op(fio->sbi);
> > +	/* Deadlock due to between page->lock and f2fs_lock_op */
> > +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
> > +		return -EAGAIN;
> >  
> >  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
> >  	if (err)
> > 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: report # of free inodes more precisely
  2017-06-26 10:52   ` [f2fs-dev] " Chao Yu
@ 2017-06-26 14:58     ` Jaegeuk Kim
  2017-06-28 13:01       ` Chao Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-06-26 14:58 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-fsdevel, linux-f2fs-devel

On 06/26, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2017/6/25 0:25, Jaegeuk Kim wrote:
> > If the partition is small, we don't need to report total # of inodes including
> > hidden free nodes.
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/super.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index 8e39b850bfc0..3da6fb276f8b 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -680,6 +680,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
> >  	struct f2fs_sb_info *sbi = F2FS_SB(sb);
> >  	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
> >  	block_t total_count, user_block_count, start_count, ovp_count;
> > +	u64 avail_node_count;
> >  
> >  	total_count = le64_to_cpu(sbi->raw_super->block_count);
> >  	user_block_count = sbi->user_block_count;
> > @@ -692,9 +693,16 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
> >  	buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
> >  	buf->f_bavail = user_block_count - valid_user_blocks(sbi);
> >  
> > -	buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
> > -	buf->f_ffree = min(buf->f_files - valid_node_count(sbi),
> > -							buf->f_bavail);
> > +	avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
> > +
> > +	if (avail_node_count > user_block_count) {
> > +		buf->f_files = user_block_count;
> > +		buf->f_ffree = buf->f_bavail;
> 
> f_ffree is limited both by remained free nid count and free block count, so it
> needs to change like this?

I thought both of them are same, since node block will consume user block. So,
we don't need to do min() again.

> 
> if (avail_node_count > user_block_count)
> 	avail_node_count = user_block_count;
> 
> buf->f_files = avail_node_count;
> buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
> 					buf->f_bavail);
> 
> Thanks,
> 
> > +	} else {
> > +		buf->f_files = avail_node_count;
> > +		buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
> > +					buf->f_bavail);
> > +	}
> >  
> >  	buf->f_namelen = F2FS_NAME_LEN;
> >  	buf->f_fsid.val[0] = (u32)id;
> > 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-06-26 14:54   ` Jaegeuk Kim
@ 2017-06-26 15:42     ` Chao Yu
  2017-07-01  7:28       ` Jaegeuk Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Chao Yu @ 2017-06-26 15:42 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-fsdevel, linux-kernel, linux-f2fs-devel

Hi Jaegeuk,

On 2017/6/26 22:54, Jaegeuk Kim wrote:
> Hi Chao,
> 
> On 06/26, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
>>> - punch_hole
>>>  - fill_zero
>>>   - f2fs_lock_op
>>>   - get_new_data_page
>>>    - lock_page
>>>
>>> - f2fs_write_data_pages
>>>  - lock_page
>>>  - do_write_data_page
>>>   - f2fs_lock_op
>>
>> Good catch!
>>
>> With this implementation, page writeback can fail due to concurrent checkpoint,
>> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
>>
>> How about unifying the lock order in punch_hole as one in writepages for regular
>> inode? We can add one more parameter in get_new_data_page to indicate whether
>> callee needs to lock cp_rwsem.
> 
> Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
> not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
> flushing data in f2fs_sync_file, once it gets -EAGAIN.
> 
> Any thoughts?

What about adding inode_lock in f2fs_sync_file to exclude other
foreground operation which have reversed lock order? Atomic_commit is OK
since it has inode_lock in its path.

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/data.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 7d3af48d34a9..9141bd19a902 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
>>>  		}
>>>  	}
>>>  
>>> -	if (fio->need_lock == LOCK_REQ)
>>> -		f2fs_lock_op(fio->sbi);
>>> +	/* Deadlock due to between page->lock and f2fs_lock_op */
>>> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
>>> +		return -EAGAIN;
>>>  
>>>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
>>>  	if (err)
>>>
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: report # of free inodes more precisely
  2017-06-26 14:58     ` Jaegeuk Kim
@ 2017-06-28 13:01       ` Chao Yu
  2017-07-01  7:27         ` Jaegeuk Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Chao Yu @ 2017-06-28 13:01 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-fsdevel, linux-kernel, linux-f2fs-devel

On 2017/6/26 22:58, Jaegeuk Kim wrote:
> On 06/26, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
>>> If the partition is small, we don't need to report total # of inodes including
>>> hidden free nodes.
>>>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/super.c | 14 +++++++++++---
>>>  1 file changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>> index 8e39b850bfc0..3da6fb276f8b 100644
>>> --- a/fs/f2fs/super.c
>>> +++ b/fs/f2fs/super.c
>>> @@ -680,6 +680,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>  	struct f2fs_sb_info *sbi = F2FS_SB(sb);
>>>  	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
>>>  	block_t total_count, user_block_count, start_count, ovp_count;
>>> +	u64 avail_node_count;
>>>  
>>>  	total_count = le64_to_cpu(sbi->raw_super->block_count);
>>>  	user_block_count = sbi->user_block_count;
>>> @@ -692,9 +693,16 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>  	buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>>>  	buf->f_bavail = user_block_count - valid_user_blocks(sbi);
>>>  
>>> -	buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>> -	buf->f_ffree = min(buf->f_files - valid_node_count(sbi),
>>> -							buf->f_bavail);
>>> +	avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>> +
>>> +	if (avail_node_count > user_block_count) {
>>> +		buf->f_files = user_block_count;
>>> +		buf->f_ffree = buf->f_bavail;
>>
>> f_ffree is limited both by remained free nid count and free block count, so it
>> needs to change like this?
> 
> I thought both of them are same, since node block will consume user block. So,
> we don't need to do min() again.

avail_node_count comes from total free nid counts which is limited with
nid_bitmap size, buf->f_bavail comes from total user block count which
can both cosumed by node and data. So the value of them may not be the same.

Thanks,

> 
>>
>> if (avail_node_count > user_block_count)
>> 	avail_node_count = user_block_count;
>>
>> buf->f_files = avail_node_count;
>> buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
>> 					buf->f_bavail);
>>
>> Thanks,
>>
>>> +	} else {
>>> +		buf->f_files = avail_node_count;
>>> +		buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
>>> +					buf->f_bavail);
>>> +	}
>>>  
>>>  	buf->f_namelen = F2FS_NAME_LEN;
>>>  	buf->f_fsid.val[0] = (u32)id;
>>>
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: report # of free inodes more precisely
  2017-06-28 13:01       ` Chao Yu
@ 2017-07-01  7:27         ` Jaegeuk Kim
  2017-07-01  8:28           ` Chao Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-07-01  7:27 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 06/28, Chao Yu wrote:
> On 2017/6/26 22:58, Jaegeuk Kim wrote:
> > On 06/26, Chao Yu wrote:
> >> Hi Jaegeuk,
> >>
> >> On 2017/6/25 0:25, Jaegeuk Kim wrote:
> >>> If the partition is small, we don't need to report total # of inodes including
> >>> hidden free nodes.
> >>>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/super.c | 14 +++++++++++---
> >>>  1 file changed, 11 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>> index 8e39b850bfc0..3da6fb276f8b 100644
> >>> --- a/fs/f2fs/super.c
> >>> +++ b/fs/f2fs/super.c
> >>> @@ -680,6 +680,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
> >>>  	struct f2fs_sb_info *sbi = F2FS_SB(sb);
> >>>  	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
> >>>  	block_t total_count, user_block_count, start_count, ovp_count;
> >>> +	u64 avail_node_count;
> >>>  
> >>>  	total_count = le64_to_cpu(sbi->raw_super->block_count);
> >>>  	user_block_count = sbi->user_block_count;
> >>> @@ -692,9 +693,16 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
> >>>  	buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
> >>>  	buf->f_bavail = user_block_count - valid_user_blocks(sbi);
> >>>  
> >>> -	buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
> >>> -	buf->f_ffree = min(buf->f_files - valid_node_count(sbi),
> >>> -							buf->f_bavail);
> >>> +	avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
> >>> +
> >>> +	if (avail_node_count > user_block_count) {
> >>> +		buf->f_files = user_block_count;
> >>> +		buf->f_ffree = buf->f_bavail;
> >>
> >> f_ffree is limited both by remained free nid count and free block count, so it
> >> needs to change like this?
> > 
> > I thought both of them are same, since node block will consume user block. So,
> > we don't need to do min() again.
> 
> avail_node_count comes from total free nid counts which is limited with
> nid_bitmap size, buf->f_bavail comes from total user block count which
> can both cosumed by node and data. So the value of them may not be the same.

What I mean was, if avail_node_count is larger than user_block_count, we can
see buf->f_bavail is always smaller than avali_node_count - valid_node_count,
since node blocks concumes blocks as well.

> 
> Thanks,
> 
> > 
> >>
> >> if (avail_node_count > user_block_count)
> >> 	avail_node_count = user_block_count;
> >>
> >> buf->f_files = avail_node_count;
> >> buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
> >> 					buf->f_bavail);
> >>
> >> Thanks,
> >>
> >>> +	} else {
> >>> +		buf->f_files = avail_node_count;
> >>> +		buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
> >>> +					buf->f_bavail);
> >>> +	}
> >>>  
> >>>  	buf->f_namelen = F2FS_NAME_LEN;
> >>>  	buf->f_fsid.val[0] = (u32)id;
> >>>
> > 
> > ------------------------------------------------------------------------------
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-06-26 15:42     ` Chao Yu
@ 2017-07-01  7:28       ` Jaegeuk Kim
  2017-07-01  8:41         ` Chao Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-07-01  7:28 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 06/26, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2017/6/26 22:54, Jaegeuk Kim wrote:
> > Hi Chao,
> > 
> > On 06/26, Chao Yu wrote:
> >> Hi Jaegeuk,
> >>
> >> On 2017/6/25 0:25, Jaegeuk Kim wrote:
> >>> - punch_hole
> >>>  - fill_zero
> >>>   - f2fs_lock_op
> >>>   - get_new_data_page
> >>>    - lock_page
> >>>
> >>> - f2fs_write_data_pages
> >>>  - lock_page
> >>>  - do_write_data_page
> >>>   - f2fs_lock_op
> >>
> >> Good catch!
> >>
> >> With this implementation, page writeback can fail due to concurrent checkpoint,
> >> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
> >>
> >> How about unifying the lock order in punch_hole as one in writepages for regular
> >> inode? We can add one more parameter in get_new_data_page to indicate whether
> >> callee needs to lock cp_rwsem.
> > 
> > Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
> > not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
> > flushing data in f2fs_sync_file, once it gets -EAGAIN.
> > 
> > Any thoughts?
> 
> What about adding inode_lock in f2fs_sync_file to exclude other
> foreground operation which have reversed lock order? Atomic_commit is OK
> since it has inode_lock in its path.

I have concerned about performance regression, if we do that.

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/data.c | 5 +++--
> >>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>> index 7d3af48d34a9..9141bd19a902 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
> >>>  		}
> >>>  	}
> >>>  
> >>> -	if (fio->need_lock == LOCK_REQ)
> >>> -		f2fs_lock_op(fio->sbi);
> >>> +	/* Deadlock due to between page->lock and f2fs_lock_op */
> >>> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
> >>> +		return -EAGAIN;
> >>>  
> >>>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
> >>>  	if (err)
> >>>
> > 
> > ------------------------------------------------------------------------------
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: report # of free inodes more precisely
  2017-07-01  7:27         ` Jaegeuk Kim
@ 2017-07-01  8:28           ` Chao Yu
  0 siblings, 0 replies; 16+ messages in thread
From: Chao Yu @ 2017-07-01  8:28 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 2017/7/1 15:27, Jaegeuk Kim wrote:
> On 06/28, Chao Yu wrote:
>> On 2017/6/26 22:58, Jaegeuk Kim wrote:
>>> On 06/26, Chao Yu wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
>>>>> If the partition is small, we don't need to report total # of inodes including
>>>>> hidden free nodes.
>>>>>
>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>> ---
>>>>>  fs/f2fs/super.c | 14 +++++++++++---
>>>>>  1 file changed, 11 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>> index 8e39b850bfc0..3da6fb276f8b 100644
>>>>> --- a/fs/f2fs/super.c
>>>>> +++ b/fs/f2fs/super.c
>>>>> @@ -680,6 +680,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>>  	struct f2fs_sb_info *sbi = F2FS_SB(sb);
>>>>>  	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
>>>>>  	block_t total_count, user_block_count, start_count, ovp_count;
>>>>> +	u64 avail_node_count;
>>>>>  
>>>>>  	total_count = le64_to_cpu(sbi->raw_super->block_count);
>>>>>  	user_block_count = sbi->user_block_count;
>>>>> @@ -692,9 +693,16 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>>  	buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>>>>>  	buf->f_bavail = user_block_count - valid_user_blocks(sbi);
>>>>>  
>>>>> -	buf->f_files = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>>>> -	buf->f_ffree = min(buf->f_files - valid_node_count(sbi),
>>>>> -							buf->f_bavail);
>>>>> +	avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>>>> +
>>>>> +	if (avail_node_count > user_block_count) {
>>>>> +		buf->f_files = user_block_count;
>>>>> +		buf->f_ffree = buf->f_bavail;
>>>>
>>>> f_ffree is limited both by remained free nid count and free block count, so it
>>>> needs to change like this?
>>>
>>> I thought both of them are same, since node block will consume user block. So,
>>> we don't need to do min() again.
>>
>> avail_node_count comes from total free nid counts which is limited with
>> nid_bitmap size, buf->f_bavail comes from total user block count which
>> can both cosumed by node and data. So the value of them may not be the same.
> 
> What I mean was, if avail_node_count is larger than user_block_count, we can
> see buf->f_bavail is always smaller than avali_node_count - valid_node_count,
> since node blocks concumes blocks as well.

Got you. :)

Reviewed-by: Chao Yu <yuchao0@huawei.com>

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> if (avail_node_count > user_block_count)
>>>> 	avail_node_count = user_block_count;
>>>>
>>>> buf->f_files = avail_node_count;
>>>> buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
>>>> 					buf->f_bavail);
>>>>
>>>> Thanks,
>>>>
>>>>> +	} else {
>>>>> +		buf->f_files = avail_node_count;
>>>>> +		buf->f_ffree = min(avail_node_count - valid_node_count(sbi),
>>>>> +					buf->f_bavail);
>>>>> +	}
>>>>>  
>>>>>  	buf->f_namelen = F2FS_NAME_LEN;
>>>>>  	buf->f_fsid.val[0] = (u32)id;
>>>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-07-01  7:28       ` Jaegeuk Kim
@ 2017-07-01  8:41         ` Chao Yu
  2017-07-01 14:27           ` Jaegeuk Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Chao Yu @ 2017-07-01  8:41 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 2017/7/1 15:28, Jaegeuk Kim wrote:
> On 06/26, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2017/6/26 22:54, Jaegeuk Kim wrote:
>>> Hi Chao,
>>>
>>> On 06/26, Chao Yu wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
>>>>> - punch_hole
>>>>>  - fill_zero
>>>>>   - f2fs_lock_op
>>>>>   - get_new_data_page
>>>>>    - lock_page
>>>>>
>>>>> - f2fs_write_data_pages
>>>>>  - lock_page
>>>>>  - do_write_data_page
>>>>>   - f2fs_lock_op
>>>>
>>>> Good catch!
>>>>
>>>> With this implementation, page writeback can fail due to concurrent checkpoint,
>>>> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
>>>>
>>>> How about unifying the lock order in punch_hole as one in writepages for regular
>>>> inode? We can add one more parameter in get_new_data_page to indicate whether
>>>> callee needs to lock cp_rwsem.
>>>
>>> Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
>>> not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
>>> flushing data in f2fs_sync_file, once it gets -EAGAIN.
>>>
>>> Any thoughts?
>>
>> What about adding inode_lock in f2fs_sync_file to exclude other
>> foreground operation which have reversed lock order? Atomic_commit is OK
>> since it has inode_lock in its path.
> 
> I have concerned about performance regression, if we do that.

I think fsync vs write or fsync vs fsync scenarios are unusual, so is
there any usecase?

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>> ---
>>>>>  fs/f2fs/data.c | 5 +++--
>>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>> index 7d3af48d34a9..9141bd19a902 100644
>>>>> --- a/fs/f2fs/data.c
>>>>> +++ b/fs/f2fs/data.c
>>>>> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
>>>>>  		}
>>>>>  	}
>>>>>  
>>>>> -	if (fio->need_lock == LOCK_REQ)
>>>>> -		f2fs_lock_op(fio->sbi);
>>>>> +	/* Deadlock due to between page->lock and f2fs_lock_op */
>>>>> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
>>>>> +		return -EAGAIN;
>>>>>  
>>>>>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
>>>>>  	if (err)
>>>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-07-01  8:41         ` Chao Yu
@ 2017-07-01 14:27           ` Jaegeuk Kim
  2017-07-05  2:58             ` Chao Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-07-01 14:27 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 07/01, Chao Yu wrote:
> On 2017/7/1 15:28, Jaegeuk Kim wrote:
> > On 06/26, Chao Yu wrote:
> >> Hi Jaegeuk,
> >>
> >> On 2017/6/26 22:54, Jaegeuk Kim wrote:
> >>> Hi Chao,
> >>>
> >>> On 06/26, Chao Yu wrote:
> >>>> Hi Jaegeuk,
> >>>>
> >>>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
> >>>>> - punch_hole
> >>>>>  - fill_zero
> >>>>>   - f2fs_lock_op
> >>>>>   - get_new_data_page
> >>>>>    - lock_page
> >>>>>
> >>>>> - f2fs_write_data_pages
> >>>>>  - lock_page
> >>>>>  - do_write_data_page
> >>>>>   - f2fs_lock_op
> >>>>
> >>>> Good catch!
> >>>>
> >>>> With this implementation, page writeback can fail due to concurrent checkpoint,
> >>>> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
> >>>>
> >>>> How about unifying the lock order in punch_hole as one in writepages for regular
> >>>> inode? We can add one more parameter in get_new_data_page to indicate whether
> >>>> callee needs to lock cp_rwsem.
> >>>
> >>> Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
> >>> not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
> >>> flushing data in f2fs_sync_file, once it gets -EAGAIN.
> >>>
> >>> Any thoughts?
> >>
> >> What about adding inode_lock in f2fs_sync_file to exclude other
> >> foreground operation which have reversed lock order? Atomic_commit is OK
> >> since it has inode_lock in its path.
> > 
> > I have concerned about performance regression, if we do that.
> 
> I think fsync vs write or fsync vs fsync scenarios are unusual, so is
> there any usecase?

Well, that'd be common to call multiple fsync calls at the same time.
Like dbench or tiotest?

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>> ---
> >>>>>  fs/f2fs/data.c | 5 +++--
> >>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>>>> index 7d3af48d34a9..9141bd19a902 100644
> >>>>> --- a/fs/f2fs/data.c
> >>>>> +++ b/fs/f2fs/data.c
> >>>>> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
> >>>>>  		}
> >>>>>  	}
> >>>>>  
> >>>>> -	if (fio->need_lock == LOCK_REQ)
> >>>>> -		f2fs_lock_op(fio->sbi);
> >>>>> +	/* Deadlock due to between page->lock and f2fs_lock_op */
> >>>>> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
> >>>>> +		return -EAGAIN;
> >>>>>  
> >>>>>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
> >>>>>  	if (err)
> >>>>>
> >>>
> >>> ------------------------------------------------------------------------------
> >>> Check out the vibrant tech community on one of the world's most
> >>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>> _______________________________________________
> >>> Linux-f2fs-devel mailing list
> >>> Linux-f2fs-devel@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-07-01 14:27           ` Jaegeuk Kim
@ 2017-07-05  2:58             ` Chao Yu
  2017-07-05  3:28               ` Jaegeuk Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Chao Yu @ 2017-07-05  2:58 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 2017/7/1 22:27, Jaegeuk Kim wrote:
> On 07/01, Chao Yu wrote:
>> On 2017/7/1 15:28, Jaegeuk Kim wrote:
>>> On 06/26, Chao Yu wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> On 2017/6/26 22:54, Jaegeuk Kim wrote:
>>>>> Hi Chao,
>>>>>
>>>>> On 06/26, Chao Yu wrote:
>>>>>> Hi Jaegeuk,
>>>>>>
>>>>>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
>>>>>>> - punch_hole
>>>>>>>  - fill_zero
>>>>>>>   - f2fs_lock_op
>>>>>>>   - get_new_data_page
>>>>>>>    - lock_page
>>>>>>>
>>>>>>> - f2fs_write_data_pages
>>>>>>>  - lock_page
>>>>>>>  - do_write_data_page
>>>>>>>   - f2fs_lock_op
>>>>>>
>>>>>> Good catch!
>>>>>>
>>>>>> With this implementation, page writeback can fail due to concurrent checkpoint,
>>>>>> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
>>>>>>
>>>>>> How about unifying the lock order in punch_hole as one in writepages for regular
>>>>>> inode? We can add one more parameter in get_new_data_page to indicate whether
>>>>>> callee needs to lock cp_rwsem.
>>>>>
>>>>> Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
>>>>> not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
>>>>> flushing data in f2fs_sync_file, once it gets -EAGAIN.
>>>>>
>>>>> Any thoughts?
>>>>
>>>> What about adding inode_lock in f2fs_sync_file to exclude other
>>>> foreground operation which have reversed lock order? Atomic_commit is OK
>>>> since it has inode_lock in its path.
>>>
>>> I have concerned about performance regression, if we do that.
>>
>> I think fsync vs write or fsync vs fsync scenarios are unusual, so is
>> there any usecase?
> 
> Well, that'd be common to call multiple fsync calls at the same time.
> Like dbench or tiotest?

Do you have test numbers of dbench/tiotest with inode:lock in fsync?

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>> ---
>>>>>>>  fs/f2fs/data.c | 5 +++--
>>>>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>>>> index 7d3af48d34a9..9141bd19a902 100644
>>>>>>> --- a/fs/f2fs/data.c
>>>>>>> +++ b/fs/f2fs/data.c
>>>>>>> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
>>>>>>>  		}
>>>>>>>  	}
>>>>>>>  
>>>>>>> -	if (fio->need_lock == LOCK_REQ)
>>>>>>> -		f2fs_lock_op(fio->sbi);
>>>>>>> +	/* Deadlock due to between page->lock and f2fs_lock_op */
>>>>>>> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
>>>>>>> +		return -EAGAIN;
>>>>>>>  
>>>>>>>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
>>>>>>>  	if (err)
>>>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Check out the vibrant tech community on one of the world's most
>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>> _______________________________________________
>>>>> Linux-f2fs-devel mailing list
>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-07-05  2:58             ` Chao Yu
@ 2017-07-05  3:28               ` Jaegeuk Kim
  2017-07-07 19:13                 ` [f2fs-dev] [PATCH 1/2 v2] " Jaegeuk Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Jaegeuk Kim @ 2017-07-05  3:28 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-fsdevel, linux-kernel, linux-f2fs-devel

On 07/05, Chao Yu wrote:
> On 2017/7/1 22:27, Jaegeuk Kim wrote:
> > On 07/01, Chao Yu wrote:
> >> On 2017/7/1 15:28, Jaegeuk Kim wrote:
> >>> On 06/26, Chao Yu wrote:
> >>>> Hi Jaegeuk,
> >>>>
> >>>> On 2017/6/26 22:54, Jaegeuk Kim wrote:
> >>>>> Hi Chao,
> >>>>>
> >>>>> On 06/26, Chao Yu wrote:
> >>>>>> Hi Jaegeuk,
> >>>>>>
> >>>>>> On 2017/6/25 0:25, Jaegeuk Kim wrote:
> >>>>>>> - punch_hole
> >>>>>>>  - fill_zero
> >>>>>>>   - f2fs_lock_op
> >>>>>>>   - get_new_data_page
> >>>>>>>    - lock_page
> >>>>>>>
> >>>>>>> - f2fs_write_data_pages
> >>>>>>>  - lock_page
> >>>>>>>  - do_write_data_page
> >>>>>>>   - f2fs_lock_op
> >>>>>>
> >>>>>> Good catch!
> >>>>>>
> >>>>>> With this implementation, page writeback can fail due to concurrent checkpoint,
> >>>>>> this will make fsync/atomic_commit which trigger synchronous write failed randomly.
> >>>>>>
> >>>>>> How about unifying the lock order in punch_hole as one in writepages for regular
> >>>>>> inode? We can add one more parameter in get_new_data_page to indicate whether
> >>>>>> callee needs to lock cp_rwsem.
> >>>>>
> >>>>> Currently, there would be some places to keep cp_rwsem -> page.lock, which seems
> >>>>> not simple to change the lock order with page.lock -> cp_rwsem. IMO, we can retry
> >>>>> flushing data in f2fs_sync_file, once it gets -EAGAIN.
> >>>>>
> >>>>> Any thoughts?
> >>>>
> >>>> What about adding inode_lock in f2fs_sync_file to exclude other
> >>>> foreground operation which have reversed lock order? Atomic_commit is OK
> >>>> since it has inode_lock in its path.
> >>>
> >>> I have concerned about performance regression, if we do that.
> >>
> >> I think fsync vs write or fsync vs fsync scenarios are unusual, so is
> >> there any usecase?
> > 
> > Well, that'd be common to call multiple fsync calls at the same time.
> > Like dbench or tiotest?
> 
> Do you have test numbers of dbench/tiotest with inode:lock in fsync?

No, do we need?

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>> ---
> >>>>>>>  fs/f2fs/data.c | 5 +++--
> >>>>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>>>>>> index 7d3af48d34a9..9141bd19a902 100644
> >>>>>>> --- a/fs/f2fs/data.c
> >>>>>>> +++ b/fs/f2fs/data.c
> >>>>>>> @@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
> >>>>>>>  		}
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> -	if (fio->need_lock == LOCK_REQ)
> >>>>>>> -		f2fs_lock_op(fio->sbi);
> >>>>>>> +	/* Deadlock due to between page->lock and f2fs_lock_op */
> >>>>>>> +	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
> >>>>>>> +		return -EAGAIN;
> >>>>>>>  
> >>>>>>>  	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
> >>>>>>>  	if (err)
> >>>>>>>
> >>>>>
> >>>>> ------------------------------------------------------------------------------
> >>>>> Check out the vibrant tech community on one of the world's most
> >>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>>>> _______________________________________________
> >>>>> Linux-f2fs-devel mailing list
> >>>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2 v2] f2fs: avoid deadlock caused by lock order of page and lock_op
  2017-07-05  3:28               ` Jaegeuk Kim
@ 2017-07-07 19:13                 ` Jaegeuk Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Jaegeuk Kim @ 2017-07-07 19:13 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-fsdevel, linux-kernel, linux-f2fs-devel

- punch_hole
 - fill_zero
  - f2fs_lock_op
  - get_new_data_page
   - lock_page

- f2fs_write_data_pages
 - lock_page
 - do_write_data_page
  - f2fs_lock_op

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---

 Change log from v1:
 - hide EAGAIN to users

 fs/f2fs/data.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 72fc866cad19..7dd5fb647d43 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1404,8 +1404,9 @@ int do_write_data_page(struct f2fs_io_info *fio)
 		}
 	}
 
-	if (fio->need_lock == LOCK_REQ)
-		f2fs_lock_op(fio->sbi);
+	/* Deadlock due to between page->lock and f2fs_lock_op */
+	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
+		return -EAGAIN;
 
 	err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
 	if (err)
@@ -1667,7 +1668,7 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
 			}
 
 			done_index = page->index;
-
+retry_write:
 			lock_page(page);
 
 			if (unlikely(page->mapping != mapping)) {
@@ -1703,6 +1704,15 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
 					unlock_page(page);
 					ret = 0;
 					continue;
+				} else if (ret == -EAGAIN) {
+					ret = 0;
+					if (wbc->sync_mode == WB_SYNC_ALL) {
+						cond_resched();
+						congestion_wait(BLK_RW_ASYNC,
+									HZ/50);
+						goto retry_write;
+					}
+					continue;
 				}
 				done_index = page->index + 1;
 				done = 1;
-- 
2.13.0.rc1.294.g07d810a77f-goog

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-07-07 19:14 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-24 16:25 [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Jaegeuk Kim
2017-06-24 16:25 ` [PATCH 2/2] f2fs: report # of free inodes more precisely Jaegeuk Kim
2017-06-26 10:52   ` [f2fs-dev] " Chao Yu
2017-06-26 14:58     ` Jaegeuk Kim
2017-06-28 13:01       ` Chao Yu
2017-07-01  7:27         ` Jaegeuk Kim
2017-07-01  8:28           ` Chao Yu
2017-06-26  8:15 ` [f2fs-dev] [PATCH 1/2] f2fs: avoid deadlock caused by lock order of page and lock_op Chao Yu
2017-06-26 14:54   ` Jaegeuk Kim
2017-06-26 15:42     ` Chao Yu
2017-07-01  7:28       ` Jaegeuk Kim
2017-07-01  8:41         ` Chao Yu
2017-07-01 14:27           ` Jaegeuk Kim
2017-07-05  2:58             ` Chao Yu
2017-07-05  3:28               ` Jaegeuk Kim
2017-07-07 19:13                 ` [f2fs-dev] [PATCH 1/2 v2] " Jaegeuk Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).