linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/5] f2fs: fix race in between GC and atomic open
@ 2018-04-18  9:45 Chao Yu
  2018-04-18  9:45 ` [PATCH 2/5] f2fs: fix return value in f2fs_ioc_commit_atomic_write Chao Yu
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Chao Yu @ 2018-04-18  9:45 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao, Chao Yu

Thread					GC thread
- f2fs_ioc_start_atomic_write
 - get_dirty_pages
 - filemap_write_and_wait_range
					- f2fs_gc
					 - do_garbage_collect
					  - gc_data_segment
					   - move_data_page
					    - f2fs_is_atomic_file
					    - set_page_dirty
 - set_inode_flag(, FI_ATOMIC_FILE)

Dirty data page can still be generated by GC in race condition as
above call stack.

This patch adds fi->dio_rwsem[WRITE] in f2fs_ioc_start_atomic_write
to avoid such race.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 78b3a58cfe21..408471bf4799 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1677,6 +1677,8 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 
 	inode_lock(inode);
 
+	down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
+
 	if (f2fs_is_atomic_file(inode))
 		goto out;
 
@@ -1702,6 +1704,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 	stat_inc_atomic_write(inode);
 	stat_update_max_atomic_write(inode);
 out:
+	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
 	inode_unlock(inode);
 	mnt_drop_write_file(filp);
 	return ret;
-- 
2.15.0.55.gc2ece9dc4de6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] f2fs: fix return value in f2fs_ioc_commit_atomic_write
  2018-04-18  9:45 [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
@ 2018-04-18  9:45 ` Chao Yu
  2018-04-18  9:45 ` [PATCH 3/5] f2fs: avoid stucking GC due to atomic write Chao Yu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Chao Yu @ 2018-04-18  9:45 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao, Chao Yu

In f2fs_ioc_commit_atomic_write, if file is volatile, return -EINVAL to
indicate that commit failure.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/file.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 408471bf4799..7c90ded5a431 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1726,8 +1726,10 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 
 	down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
 
-	if (f2fs_is_volatile_file(inode))
+	if (f2fs_is_volatile_file(inode)) {
+		ret = -EINVAL;
 		goto err_out;
+	}
 
 	if (f2fs_is_atomic_file(inode)) {
 		ret = commit_inmem_pages(inode);
-- 
2.15.0.55.gc2ece9dc4de6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] f2fs: avoid stucking GC due to atomic write
  2018-04-18  9:45 [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
  2018-04-18  9:45 ` [PATCH 2/5] f2fs: fix return value in f2fs_ioc_commit_atomic_write Chao Yu
@ 2018-04-18  9:45 ` Chao Yu
  2018-04-20  3:12   ` Jaegeuk Kim
  2018-04-18  9:45 ` [PATCH 4/5] f2fs: show GC failure info in debugfs Chao Yu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Chao Yu @ 2018-04-18  9:45 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao, Chao Yu

f2fs doesn't allow abuse on atomic write class interface, so except
limiting in-mem pages' total memory usage capacity, we need to limit
atomic-write usage as well when filesystem is seriously fragmented,
otherwise we may run into infinite loop during foreground GC because
target blocks in victim segment are belong to atomic opened file for
long time.

Now, we will detect failure due to atomic write in foreground GC, if
the count exceeds threshold, we will drop all atomic written data in
cache, by this, I expect it can keep our system running safely to
prevent Dos attack.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/f2fs.h    |  1 +
 fs/f2fs/file.c    |  5 +++++
 fs/f2fs/gc.c      | 27 +++++++++++++++++++++++----
 fs/f2fs/gc.h      |  3 +++
 fs/f2fs/segment.c |  1 +
 fs/f2fs/segment.h |  2 ++
 6 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c1c3a1d11186..3453288d6a71 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2249,6 +2249,7 @@ enum {
 	FI_EXTRA_ATTR,		/* indicate file has extra attribute */
 	FI_PROJ_INHERIT,	/* indicate file inherits projectid */
 	FI_PIN_FILE,		/* indicate file should not be gced */
+	FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been dropped */
 };
 
 static inline void __mark_inode_dirty_flag(struct inode *inode,
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 7c90ded5a431..cddd9aee1bb2 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 skip_flush:
 	set_inode_flag(inode, FI_HOT_DATA);
 	set_inode_flag(inode, FI_ATOMIC_FILE);
+	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
 	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
 
 	F2FS_I(inode)->inmem_task = current;
@@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 	}
 err_out:
+	if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
+		clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
+		ret = -EINVAL;
+	}
 	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
 	inode_unlock(inode);
 	mnt_drop_write_file(filp);
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index bfb7a4a3a929..495876ca62b6 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
 	gc_th->gc_urgent = 0;
 	gc_th->gc_wake= 0;
 
+	gc_th->atomic_file = 0;
+
 	sbi->gc_thread = gc_th;
 	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
 	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
@@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
  * This can be used to move blocks, aka LBAs, directly on disk.
  */
 static void move_data_block(struct inode *inode, block_t bidx,
-					unsigned int segno, int off)
+				int gc_type, unsigned int segno, int off)
 {
 	struct f2fs_io_info fio = {
 		.sbi = F2FS_I_SB(inode),
@@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, block_t bidx,
 	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 		goto out;
 
-	if (f2fs_is_atomic_file(inode))
+	if (f2fs_is_atomic_file(inode)) {
+		F2FS_I_SB(inode)->gc_thread->atomic_file++;
 		goto out;
+	}
 
 	if (f2fs_is_pinned_file(inode)) {
 		f2fs_pin_file_control(inode, true);
@@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
 	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 		goto out;
 
-	if (f2fs_is_atomic_file(inode))
+	if (f2fs_is_atomic_file(inode)) {
+		F2FS_I_SB(inode)->gc_thread->atomic_file++;
 		goto out;
+	}
 	if (f2fs_is_pinned_file(inode)) {
 		if (gc_type == FG_GC)
 			f2fs_pin_file_control(inode, true);
@@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 			start_bidx = start_bidx_of_node(nofs, inode)
 								+ ofs_in_node;
 			if (f2fs_encrypted_file(inode))
-				move_data_block(inode, start_bidx, segno, off);
+				move_data_block(inode, start_bidx, gc_type,
+								segno, off);
 			else
 				move_data_page(inode, start_bidx, gc_type,
 								segno, off);
@@ -1017,6 +1024,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 		.ilist = LIST_HEAD_INIT(gc_list.ilist),
 		.iroot = RADIX_TREE_INIT(GFP_NOFS),
 	};
+	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
+	unsigned int skipped_round = 0, round = 0;
 
 	trace_f2fs_gc_begin(sbi->sb, sync, background,
 				get_pages(sbi, F2FS_DIRTY_NODES),
@@ -1068,11 +1077,21 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 		sec_freed++;
 	total_freed += seg_freed;
 
+	if (gc_type == FG_GC) {
+		if (sbi->gc_thread->atomic_file > last_atomic_file)
+			skipped_round++;
+		last_atomic_file = sbi->gc_thread->atomic_file;
+		round++;
+	}
+
 	if (gc_type == FG_GC)
 		sbi->cur_victim_sec = NULL_SEGNO;
 
 	if (!sync) {
 		if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
+			if (skipped_round > MAX_SKIP_ATOMIC_COUNT &&
+				skipped_round * 2 >= round)
+				drop_inmem_pages_all(sbi);
 			segno = NULL_SEGNO;
 			goto gc_more;
 		}
diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
index b0045d4c8d1e..bc1d21d46ae7 100644
--- a/fs/f2fs/gc.h
+++ b/fs/f2fs/gc.h
@@ -39,6 +39,9 @@ struct f2fs_gc_kthread {
 	unsigned int gc_idle;
 	unsigned int gc_urgent;
 	unsigned int gc_wake;
+
+	/* for stuck statistic */
+	unsigned int atomic_file;
 };
 
 struct gc_inode_list {
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index d809f731bfd1..831cefa088bc 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -287,6 +287,7 @@ void drop_inmem_pages_all(struct f2fs_sb_info *sbi)
 	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
 
 	if (inode) {
+		set_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
 		drop_inmem_pages(inode);
 		iput(inode);
 	}
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 492ad0c86fa9..7702b054689c 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -215,6 +215,8 @@ struct segment_allocation {
 #define IS_DUMMY_WRITTEN_PAGE(page)			\
 		(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
 
+#define MAX_SKIP_ATOMIC_COUNT			16
+
 struct inmem_pages {
 	struct list_head list;
 	struct page *page;
-- 
2.15.0.55.gc2ece9dc4de6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] f2fs: show GC failure info in debugfs
  2018-04-18  9:45 [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
  2018-04-18  9:45 ` [PATCH 2/5] f2fs: fix return value in f2fs_ioc_commit_atomic_write Chao Yu
  2018-04-18  9:45 ` [PATCH 3/5] f2fs: avoid stucking GC due to atomic write Chao Yu
@ 2018-04-18  9:45 ` Chao Yu
  2018-04-20  3:15   ` Jaegeuk Kim
  2018-04-18  9:45 ` [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer Chao Yu
  2018-04-18  9:53 ` [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
  4 siblings, 1 reply; 15+ messages in thread
From: Chao Yu @ 2018-04-18  9:45 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao, Chao Yu

This patch adds to show GC failure information in debugfs, now it just
shows count of failure caused by atomic write.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/debug.c |  5 +++++
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/gc.c    | 13 +++++++------
 fs/f2fs/gc.h    |  2 +-
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index a66107b5cfff..715beb85e9db 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -104,6 +104,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
 	si->avail_nids = NM_I(sbi)->available_nids;
 	si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
 	si->bg_gc = sbi->bg_gc;
+	si->bg_atomic = sbi->gc_thread->atomic_file[BG_GC];
+	si->fg_atomic = sbi->gc_thread->atomic_file[FG_GC];
 	si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
 		* 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
 		/ 2;
@@ -342,6 +344,9 @@ static int stat_show(struct seq_file *s, void *v)
 				si->bg_data_blks);
 		seq_printf(s, "  - node blocks : %d (%d)\n", si->node_blks,
 				si->bg_node_blks);
+		seq_printf(s, "Failure : atomic write %d (%d)\n",
+				si->bg_atomic + si->fg_atomic,
+				si->bg_atomic);
 		seq_puts(s, "\nExtent Cache:\n");
 		seq_printf(s, "  - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
 				si->hit_largest, si->hit_cached,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 3453288d6a71..567c6bb57ae3 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3003,6 +3003,7 @@ struct f2fs_stat_info {
 	int bg_node_segs, bg_data_segs;
 	int tot_blks, data_blks, node_blks;
 	int bg_data_blks, bg_node_blks;
+	unsigned int bg_atomic, fg_atomic;
 	int curseg[NR_CURSEG_TYPE];
 	int cursec[NR_CURSEG_TYPE];
 	int curzone[NR_CURSEG_TYPE];
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 495876ca62b6..da89ca16a55d 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -135,7 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
 	gc_th->gc_urgent = 0;
 	gc_th->gc_wake= 0;
 
-	gc_th->atomic_file = 0;
+	gc_th->atomic_file[BG_GC] = 0;
+	gc_th->atomic_file[FG_GC] = 0;
 
 	sbi->gc_thread = gc_th;
 	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
@@ -633,7 +634,7 @@ static void move_data_block(struct inode *inode, block_t bidx,
 		goto out;
 
 	if (f2fs_is_atomic_file(inode)) {
-		F2FS_I_SB(inode)->gc_thread->atomic_file++;
+		F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
 		goto out;
 	}
 
@@ -742,7 +743,7 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
 		goto out;
 
 	if (f2fs_is_atomic_file(inode)) {
-		F2FS_I_SB(inode)->gc_thread->atomic_file++;
+		F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
 		goto out;
 	}
 	if (f2fs_is_pinned_file(inode)) {
@@ -1024,7 +1025,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 		.ilist = LIST_HEAD_INIT(gc_list.ilist),
 		.iroot = RADIX_TREE_INIT(GFP_NOFS),
 	};
-	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
+	unsigned int last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
 	unsigned int skipped_round = 0, round = 0;
 
 	trace_f2fs_gc_begin(sbi->sb, sync, background,
@@ -1078,9 +1079,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 	total_freed += seg_freed;
 
 	if (gc_type == FG_GC) {
-		if (sbi->gc_thread->atomic_file > last_atomic_file)
+		if (sbi->gc_thread->atomic_file[FG_GC] > last_atomic_file)
 			skipped_round++;
-		last_atomic_file = sbi->gc_thread->atomic_file;
+		last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
 		round++;
 	}
 
diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
index bc1d21d46ae7..a6cffe6b249b 100644
--- a/fs/f2fs/gc.h
+++ b/fs/f2fs/gc.h
@@ -41,7 +41,7 @@ struct f2fs_gc_kthread {
 	unsigned int gc_wake;
 
 	/* for stuck statistic */
-	unsigned int atomic_file;
+	unsigned int atomic_file[2];
 };
 
 struct gc_inode_list {
-- 
2.15.0.55.gc2ece9dc4de6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer
  2018-04-18  9:45 [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
                   ` (2 preceding siblings ...)
  2018-04-18  9:45 ` [PATCH 4/5] f2fs: show GC failure info in debugfs Chao Yu
@ 2018-04-18  9:45 ` Chao Yu
  2018-04-20  3:19   ` Jaegeuk Kim
  2018-04-18  9:53 ` [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
  4 siblings, 1 reply; 15+ messages in thread
From: Chao Yu @ 2018-04-18  9:45 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao, Chao Yu

Thread A			Thread B		Thread C
- f2fs_remount
 - stop_gc_thread
				- f2fs_sbi_store
							- issue_discard_thread
   sbi->gc_thread = NULL;
				  sbi->gc_thread->gc_wake = 1
							  access sbi->gc_thread->gc_urgent

Previously, we allocate memory for sbi->gc_thread based on background
gc thread mount option, the memory can be released if we turn off
that mount option, but still there are several places access gc_thread
pointer without considering race condition, result in NULL point
dereference.

In order to fix this issue, keep gc_thread structure valid in sbi all
the time instead of alloc/free it dynamically.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/debug.c   |  3 +--
 fs/f2fs/f2fs.h    |  7 +++++++
 fs/f2fs/gc.c      | 58 +++++++++++++++++++++++++++++++++----------------------
 fs/f2fs/segment.c |  4 ++--
 fs/f2fs/super.c   | 13 +++++++++++--
 fs/f2fs/sysfs.c   |  8 ++++----
 6 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 715beb85e9db..7bb036a3bb81 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
 	si->cache_mem = 0;
 
 	/* build gc */
-	if (sbi->gc_thread)
-		si->cache_mem += sizeof(struct f2fs_gc_kthread);
+	si->cache_mem += sizeof(struct f2fs_gc_kthread);
 
 	/* build merge flush thread */
 	if (SM_I(sbi)->fcc_info)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 567c6bb57ae3..c553f63199e8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct f2fs_sb_info *sbi)
 	return (struct sit_info *)(SM_I(sbi)->sit_info);
 }
 
+static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
+{
+	return (struct f2fs_gc_kthread *)(sbi->gc_thread);
+}
+
 static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
 {
 	return (struct free_segmap_info *)(SM_I(sbi)->free_info);
@@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
 /*
  * gc.c
  */
+int init_gc_context(struct f2fs_sb_info *sbi);
+void destroy_gc_context(struct f2fs_sb_info * sbi);
 int start_gc_thread(struct f2fs_sb_info *sbi);
 void stop_gc_thread(struct f2fs_sb_info *sbi);
 block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index da89ca16a55d..7d310e454b77 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -26,8 +26,8 @@
 static int gc_thread_func(void *data)
 {
 	struct f2fs_sb_info *sbi = data;
-	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
-	wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
+	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
+	wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
 	unsigned int wait_ms;
 
 	wait_ms = gc_th->min_sleep_time;
@@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
 	return 0;
 }
 
-int start_gc_thread(struct f2fs_sb_info *sbi)
+int init_gc_context(struct f2fs_sb_info *sbi)
 {
 	struct f2fs_gc_kthread *gc_th;
-	dev_t dev = sbi->sb->s_bdev->bd_dev;
-	int err = 0;
 
 	gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
-	if (!gc_th) {
-		err = -ENOMEM;
-		goto out;
-	}
+	if (!gc_th)
+		return -ENOMEM;
+
+	gc_th->f2fs_gc_task = NULL;
 
 	gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
 	gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
@@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
 	gc_th->atomic_file[FG_GC] = 0;
 
 	sbi->gc_thread = gc_th;
-	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
-	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
+
+	return 0;
+}
+
+void destroy_gc_context(struct f2fs_sb_info *sbi)
+{
+	kfree(GC_I(sbi));
+	sbi->gc_thread = NULL;
+}
+
+int start_gc_thread(struct f2fs_sb_info *sbi)
+{
+	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
+	dev_t dev = sbi->sb->s_bdev->bd_dev;
+	int err = 0;
+
+	init_waitqueue_head(&gc_th->gc_wait_queue_head);
+	gc_th->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
 			"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
 	if (IS_ERR(gc_th->f2fs_gc_task)) {
 		err = PTR_ERR(gc_th->f2fs_gc_task);
-		kfree(gc_th);
-		sbi->gc_thread = NULL;
+		gc_th->f2fs_gc_task = NULL;
 	}
-out:
+
 	return err;
 }
 
 void stop_gc_thread(struct f2fs_sb_info *sbi)
 {
-	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
-	if (!gc_th)
-		return;
-	kthread_stop(gc_th->f2fs_gc_task);
-	kfree(gc_th);
-	sbi->gc_thread = NULL;
+	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
+
+	if (gc_th->f2fs_gc_task) {
+		kthread_stop(gc_th->f2fs_gc_task);
+		gc_th->f2fs_gc_task = NULL;
+	}
 }
 
 static int select_gc_type(struct f2fs_gc_kthread *gc_th, int gc_type)
@@ -190,15 +203,14 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
 		p->max_search = dirty_i->nr_dirty[type];
 		p->ofs_unit = 1;
 	} else {
-		p->gc_mode = select_gc_type(sbi->gc_thread, gc_type);
+		p->gc_mode = select_gc_type(GC_I(sbi), gc_type);
 		p->dirty_segmap = dirty_i->dirty_segmap[DIRTY];
 		p->max_search = dirty_i->nr_dirty[DIRTY];
 		p->ofs_unit = sbi->segs_per_sec;
 	}
 
 	/* we need to check every dirty segments in the FG_GC case */
-	if (gc_type != FG_GC &&
-			(sbi->gc_thread && !sbi->gc_thread->gc_urgent) &&
+	if (gc_type != FG_GC && !GC_I(sbi)->gc_urgent &&
 			p->max_search > sbi->max_victim_search)
 		p->max_search = sbi->max_victim_search;
 
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 831cefa088bc..c869ce54be4d 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -177,7 +177,7 @@ bool need_SSR(struct f2fs_sb_info *sbi)
 
 	if (test_opt(sbi, LFS))
 		return false;
-	if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
+	if (GC_I(sbi)->gc_urgent)
 		return true;
 
 	return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +
@@ -1422,7 +1422,7 @@ static int issue_discard_thread(void *data)
 		if (dcc->discard_wake)
 			dcc->discard_wake = 0;
 
-		if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
+		if (GC_I(sbi)->gc_urgent)
 			init_discard_policy(&dpolicy, DPOLICY_FORCE, 1);
 
 		sb_start_intwrite(sbi->sb);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index d01c11f5e9c1..f3a5f463496f 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1012,6 +1012,8 @@ static void f2fs_put_super(struct super_block *sb)
 		write_checkpoint(sbi, &cpc);
 	}
 
+	destroy_gc_context(sbi);
+
 	/* write_checkpoint can update stat informaion */
 	f2fs_destroy_stats(sbi);
 
@@ -1044,6 +1046,7 @@ static void f2fs_put_super(struct super_block *sb)
 	kfree(sbi->raw_super);
 
 	destroy_device_list(sbi);
+	destroy_gc_context(sbi);
 	mempool_destroy(sbi->write_io_dummy);
 #ifdef CONFIG_QUOTA
 	for (i = 0; i < MAXQUOTAS; i++)
@@ -1476,11 +1479,11 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 	 * option. Also sync the filesystem.
 	 */
 	if ((*flags & SB_RDONLY) || !test_opt(sbi, BG_GC)) {
-		if (sbi->gc_thread) {
+		if (GC_I(sbi)->f2fs_gc_task) {
 			stop_gc_thread(sbi);
 			need_restart_gc = true;
 		}
-	} else if (!sbi->gc_thread) {
+	} else if (!GC_I(sbi)->f2fs_gc_task) {
 		err = start_gc_thread(sbi);
 		if (err)
 			goto restore_opts;
@@ -2771,6 +2774,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		goto free_meta_inode;
 	}
 
+	err = init_gc_context(sbi);
+	if (err)
+		goto free_checkpoint;
+
 	/* Initialize device list */
 	err = f2fs_scan_devices(sbi);
 	if (err) {
@@ -2981,6 +2988,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 	destroy_segment_manager(sbi);
 free_devices:
 	destroy_device_list(sbi);
+	destroy_gc_context(sbi);
+free_checkpoint:
 	kfree(sbi->ckpt);
 free_meta_inode:
 	make_bad_inode(sbi->meta_inode);
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 2c53de9251be..fb3cd477d985 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -46,7 +46,7 @@ struct f2fs_attr {
 static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type)
 {
 	if (struct_type == GC_THREAD)
-		return (unsigned char *)sbi->gc_thread;
+		return (unsigned char *)GC_I(sbi);
 	else if (struct_type == SM_INFO)
 		return (unsigned char *)SM_I(sbi);
 	else if (struct_type == DCC_INFO)
@@ -252,9 +252,9 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
 
 	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
 		f2fs_reset_iostat(sbi);
-	if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) {
-		sbi->gc_thread->gc_wake = 1;
-		wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head);
+	if (!strcmp(a->attr.name, "gc_urgent") && t == 1) {
+		GC_I(sbi)->gc_wake = 1;
+		wake_up_interruptible_all(&GC_I(sbi)->gc_wait_queue_head);
 		wake_up_discard_thread(sbi, true);
 	}
 
-- 
2.15.0.55.gc2ece9dc4de6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] f2fs: fix race in between GC and atomic open
  2018-04-18  9:45 [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
                   ` (3 preceding siblings ...)
  2018-04-18  9:45 ` [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer Chao Yu
@ 2018-04-18  9:53 ` Chao Yu
  4 siblings, 0 replies; 15+ messages in thread
From: Chao Yu @ 2018-04-18  9:53 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao

Hi all,

Please ignore this patch, I just sent this before, sorry.

Thanks,

On 2018/4/18 17:45, Chao Yu wrote:
> Thread					GC thread
> - f2fs_ioc_start_atomic_write
>  - get_dirty_pages
>  - filemap_write_and_wait_range
> 					- f2fs_gc
> 					 - do_garbage_collect
> 					  - gc_data_segment
> 					   - move_data_page
> 					    - f2fs_is_atomic_file
> 					    - set_page_dirty
>  - set_inode_flag(, FI_ATOMIC_FILE)
> 
> Dirty data page can still be generated by GC in race condition as
> above call stack.
> 
> This patch adds fi->dio_rwsem[WRITE] in f2fs_ioc_start_atomic_write
> to avoid such race.
> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  fs/f2fs/file.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 78b3a58cfe21..408471bf4799 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1677,6 +1677,8 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>  
>  	inode_lock(inode);
>  
> +	down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
> +
>  	if (f2fs_is_atomic_file(inode))
>  		goto out;
>  
> @@ -1702,6 +1704,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>  	stat_inc_atomic_write(inode);
>  	stat_update_max_atomic_write(inode);
>  out:
> +	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>  	inode_unlock(inode);
>  	mnt_drop_write_file(filp);
>  	return ret;
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] f2fs: avoid stucking GC due to atomic write
  2018-04-18  9:45 ` [PATCH 3/5] f2fs: avoid stucking GC due to atomic write Chao Yu
@ 2018-04-20  3:12   ` Jaegeuk Kim
  2018-04-20  3:22     ` Chao Yu
  0 siblings, 1 reply; 15+ messages in thread
From: Jaegeuk Kim @ 2018-04-20  3:12 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, linux-kernel, chao

On 04/18, Chao Yu wrote:
> f2fs doesn't allow abuse on atomic write class interface, so except
> limiting in-mem pages' total memory usage capacity, we need to limit
> atomic-write usage as well when filesystem is seriously fragmented,
> otherwise we may run into infinite loop during foreground GC because
> target blocks in victim segment are belong to atomic opened file for
> long time.
> 
> Now, we will detect failure due to atomic write in foreground GC, if
> the count exceeds threshold, we will drop all atomic written data in
> cache, by this, I expect it can keep our system running safely to
> prevent Dos attack.
> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  fs/f2fs/f2fs.h    |  1 +
>  fs/f2fs/file.c    |  5 +++++
>  fs/f2fs/gc.c      | 27 +++++++++++++++++++++++----
>  fs/f2fs/gc.h      |  3 +++
>  fs/f2fs/segment.c |  1 +
>  fs/f2fs/segment.h |  2 ++
>  6 files changed, 35 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index c1c3a1d11186..3453288d6a71 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -2249,6 +2249,7 @@ enum {
>  	FI_EXTRA_ATTR,		/* indicate file has extra attribute */
>  	FI_PROJ_INHERIT,	/* indicate file inherits projectid */
>  	FI_PIN_FILE,		/* indicate file should not be gced */
> +	FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been dropped */
>  };
>  
>  static inline void __mark_inode_dirty_flag(struct inode *inode,
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 7c90ded5a431..cddd9aee1bb2 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>  skip_flush:
>  	set_inode_flag(inode, FI_HOT_DATA);
>  	set_inode_flag(inode, FI_ATOMIC_FILE);
> +	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>  	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
>  
>  	F2FS_I(inode)->inmem_task = current;
> @@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>  	}
>  err_out:
> +	if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
> +		clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> +		ret = -EINVAL;
> +	}
>  	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>  	inode_unlock(inode);
>  	mnt_drop_write_file(filp);
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index bfb7a4a3a929..495876ca62b6 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>  	gc_th->gc_urgent = 0;
>  	gc_th->gc_wake= 0;
>  
> +	gc_th->atomic_file = 0;
> +
>  	sbi->gc_thread = gc_th;
>  	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>  	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> @@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>   * This can be used to move blocks, aka LBAs, directly on disk.
>   */
>  static void move_data_block(struct inode *inode, block_t bidx,
> -					unsigned int segno, int off)
> +				int gc_type, unsigned int segno, int off)
>  {
>  	struct f2fs_io_info fio = {
>  		.sbi = F2FS_I_SB(inode),
> @@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, block_t bidx,
>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>  		goto out;
>  
> -	if (f2fs_is_atomic_file(inode))
> +	if (f2fs_is_atomic_file(inode)) {
> +		F2FS_I_SB(inode)->gc_thread->atomic_file++;
>  		goto out;
> +	}
>  
>  	if (f2fs_is_pinned_file(inode)) {
>  		f2fs_pin_file_control(inode, true);
> @@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>  		goto out;
>  
> -	if (f2fs_is_atomic_file(inode))
> +	if (f2fs_is_atomic_file(inode)) {
> +		F2FS_I_SB(inode)->gc_thread->atomic_file++;
>  		goto out;
> +	}
>  	if (f2fs_is_pinned_file(inode)) {
>  		if (gc_type == FG_GC)
>  			f2fs_pin_file_control(inode, true);
> @@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>  			start_bidx = start_bidx_of_node(nofs, inode)
>  								+ ofs_in_node;
>  			if (f2fs_encrypted_file(inode))
> -				move_data_block(inode, start_bidx, segno, off);
> +				move_data_block(inode, start_bidx, gc_type,
> +								segno, off);
>  			else
>  				move_data_page(inode, start_bidx, gc_type,
>  								segno, off);
> @@ -1017,6 +1024,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  		.ilist = LIST_HEAD_INIT(gc_list.ilist),
>  		.iroot = RADIX_TREE_INIT(GFP_NOFS),
>  	};
> +	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
> +	unsigned int skipped_round = 0, round = 0;
>  
>  	trace_f2fs_gc_begin(sbi->sb, sync, background,
>  				get_pages(sbi, F2FS_DIRTY_NODES),
> @@ -1068,11 +1077,21 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  		sec_freed++;
>  	total_freed += seg_freed;
>  
> +	if (gc_type == FG_GC) {
> +		if (sbi->gc_thread->atomic_file > last_atomic_file)
> +			skipped_round++;
> +		last_atomic_file = sbi->gc_thread->atomic_file;

How about resetting ->atomic_file to zero?

> +		round++;
> +	}
> +
>  	if (gc_type == FG_GC)
>  		sbi->cur_victim_sec = NULL_SEGNO;
>  
>  	if (!sync) {
>  		if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> +			if (skipped_round > MAX_SKIP_ATOMIC_COUNT &&
> +				skipped_round * 2 >= round)
> +				drop_inmem_pages_all(sbi);
>  			segno = NULL_SEGNO;
>  			goto gc_more;
>  		}
> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
> index b0045d4c8d1e..bc1d21d46ae7 100644
> --- a/fs/f2fs/gc.h
> +++ b/fs/f2fs/gc.h
> @@ -39,6 +39,9 @@ struct f2fs_gc_kthread {
>  	unsigned int gc_idle;
>  	unsigned int gc_urgent;
>  	unsigned int gc_wake;
> +
> +	/* for stuck statistic */
> +	unsigned int atomic_file;
>  };
>  
>  struct gc_inode_list {
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index d809f731bfd1..831cefa088bc 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -287,6 +287,7 @@ void drop_inmem_pages_all(struct f2fs_sb_info *sbi)
>  	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>  
>  	if (inode) {
> +		set_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>  		drop_inmem_pages(inode);
>  		iput(inode);
>  	}
> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> index 492ad0c86fa9..7702b054689c 100644
> --- a/fs/f2fs/segment.h
> +++ b/fs/f2fs/segment.h
> @@ -215,6 +215,8 @@ struct segment_allocation {
>  #define IS_DUMMY_WRITTEN_PAGE(page)			\
>  		(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
>  
> +#define MAX_SKIP_ATOMIC_COUNT			16
> +
>  struct inmem_pages {
>  	struct list_head list;
>  	struct page *page;
> -- 
> 2.15.0.55.gc2ece9dc4de6

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] f2fs: show GC failure info in debugfs
  2018-04-18  9:45 ` [PATCH 4/5] f2fs: show GC failure info in debugfs Chao Yu
@ 2018-04-20  3:15   ` Jaegeuk Kim
  2018-04-20  3:24     ` Chao Yu
  0 siblings, 1 reply; 15+ messages in thread
From: Jaegeuk Kim @ 2018-04-20  3:15 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, linux-kernel, chao

On 04/18, Chao Yu wrote:
> This patch adds to show GC failure information in debugfs, now it just
> shows count of failure caused by atomic write.
> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  fs/f2fs/debug.c |  5 +++++
>  fs/f2fs/f2fs.h  |  1 +
>  fs/f2fs/gc.c    | 13 +++++++------
>  fs/f2fs/gc.h    |  2 +-
>  4 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> index a66107b5cfff..715beb85e9db 100644
> --- a/fs/f2fs/debug.c
> +++ b/fs/f2fs/debug.c
> @@ -104,6 +104,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
>  	si->avail_nids = NM_I(sbi)->available_nids;
>  	si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
>  	si->bg_gc = sbi->bg_gc;
> +	si->bg_atomic = sbi->gc_thread->atomic_file[BG_GC];
> +	si->fg_atomic = sbi->gc_thread->atomic_file[FG_GC];

Need to change the naming like skipped_atomic_files?

>  	si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
>  		* 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
>  		/ 2;
> @@ -342,6 +344,9 @@ static int stat_show(struct seq_file *s, void *v)
>  				si->bg_data_blks);
>  		seq_printf(s, "  - node blocks : %d (%d)\n", si->node_blks,
>  				si->bg_node_blks);
> +		seq_printf(s, "Failure : atomic write %d (%d)\n",

It's not failure.

> +				si->bg_atomic + si->fg_atomic,
> +				si->bg_atomic);
>  		seq_puts(s, "\nExtent Cache:\n");
>  		seq_printf(s, "  - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
>  				si->hit_largest, si->hit_cached,
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 3453288d6a71..567c6bb57ae3 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -3003,6 +3003,7 @@ struct f2fs_stat_info {
>  	int bg_node_segs, bg_data_segs;
>  	int tot_blks, data_blks, node_blks;
>  	int bg_data_blks, bg_node_blks;
> +	unsigned int bg_atomic, fg_atomic;
>  	int curseg[NR_CURSEG_TYPE];
>  	int cursec[NR_CURSEG_TYPE];
>  	int curzone[NR_CURSEG_TYPE];
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 495876ca62b6..da89ca16a55d 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -135,7 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>  	gc_th->gc_urgent = 0;
>  	gc_th->gc_wake= 0;
>  
> -	gc_th->atomic_file = 0;
> +	gc_th->atomic_file[BG_GC] = 0;
> +	gc_th->atomic_file[FG_GC] = 0;

Need to merge the previous patch with this.

>  
>  	sbi->gc_thread = gc_th;
>  	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> @@ -633,7 +634,7 @@ static void move_data_block(struct inode *inode, block_t bidx,
>  		goto out;
>  
>  	if (f2fs_is_atomic_file(inode)) {
> -		F2FS_I_SB(inode)->gc_thread->atomic_file++;
> +		F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>  		goto out;
>  	}
>  
> @@ -742,7 +743,7 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
>  		goto out;
>  
>  	if (f2fs_is_atomic_file(inode)) {
> -		F2FS_I_SB(inode)->gc_thread->atomic_file++;
> +		F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>  		goto out;
>  	}
>  	if (f2fs_is_pinned_file(inode)) {
> @@ -1024,7 +1025,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  		.ilist = LIST_HEAD_INIT(gc_list.ilist),
>  		.iroot = RADIX_TREE_INIT(GFP_NOFS),
>  	};
> -	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
> +	unsigned int last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>  	unsigned int skipped_round = 0, round = 0;
>  
>  	trace_f2fs_gc_begin(sbi->sb, sync, background,
> @@ -1078,9 +1079,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  	total_freed += seg_freed;
>  
>  	if (gc_type == FG_GC) {
> -		if (sbi->gc_thread->atomic_file > last_atomic_file)
> +		if (sbi->gc_thread->atomic_file[FG_GC] > last_atomic_file)
>  			skipped_round++;
> -		last_atomic_file = sbi->gc_thread->atomic_file;
> +		last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>  		round++;
>  	}
>  
> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
> index bc1d21d46ae7..a6cffe6b249b 100644
> --- a/fs/f2fs/gc.h
> +++ b/fs/f2fs/gc.h
> @@ -41,7 +41,7 @@ struct f2fs_gc_kthread {
>  	unsigned int gc_wake;
>  
>  	/* for stuck statistic */
> -	unsigned int atomic_file;
> +	unsigned int atomic_file[2];
>  };
>  
>  struct gc_inode_list {
> -- 
> 2.15.0.55.gc2ece9dc4de6

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer
  2018-04-18  9:45 ` [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer Chao Yu
@ 2018-04-20  3:19   ` Jaegeuk Kim
  2018-04-20  3:28     ` Chao Yu
  0 siblings, 1 reply; 15+ messages in thread
From: Jaegeuk Kim @ 2018-04-20  3:19 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, linux-kernel, chao

On 04/18, Chao Yu wrote:
> Thread A			Thread B		Thread C
> - f2fs_remount
>  - stop_gc_thread
> 				- f2fs_sbi_store
> 							- issue_discard_thread
>    sbi->gc_thread = NULL;
> 				  sbi->gc_thread->gc_wake = 1
> 							  access sbi->gc_thread->gc_urgent

Do we simply need a lock for this?

> 
> Previously, we allocate memory for sbi->gc_thread based on background
> gc thread mount option, the memory can be released if we turn off
> that mount option, but still there are several places access gc_thread
> pointer without considering race condition, result in NULL point
> dereference.
> 
> In order to fix this issue, keep gc_thread structure valid in sbi all
> the time instead of alloc/free it dynamically.
> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  fs/f2fs/debug.c   |  3 +--
>  fs/f2fs/f2fs.h    |  7 +++++++
>  fs/f2fs/gc.c      | 58 +++++++++++++++++++++++++++++++++----------------------
>  fs/f2fs/segment.c |  4 ++--
>  fs/f2fs/super.c   | 13 +++++++++++--
>  fs/f2fs/sysfs.c   |  8 ++++----
>  6 files changed, 60 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> index 715beb85e9db..7bb036a3bb81 100644
> --- a/fs/f2fs/debug.c
> +++ b/fs/f2fs/debug.c
> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
>  	si->cache_mem = 0;
>  
>  	/* build gc */
> -	if (sbi->gc_thread)
> -		si->cache_mem += sizeof(struct f2fs_gc_kthread);
> +	si->cache_mem += sizeof(struct f2fs_gc_kthread);
>  
>  	/* build merge flush thread */
>  	if (SM_I(sbi)->fcc_info)
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 567c6bb57ae3..c553f63199e8 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct f2fs_sb_info *sbi)
>  	return (struct sit_info *)(SM_I(sbi)->sit_info);
>  }
>  
> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
> +{
> +	return (struct f2fs_gc_kthread *)(sbi->gc_thread);
> +}
> +
>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
>  {
>  	return (struct free_segmap_info *)(SM_I(sbi)->free_info);
> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
>  /*
>   * gc.c
>   */
> +int init_gc_context(struct f2fs_sb_info *sbi);
> +void destroy_gc_context(struct f2fs_sb_info * sbi);
>  int start_gc_thread(struct f2fs_sb_info *sbi);
>  void stop_gc_thread(struct f2fs_sb_info *sbi);
>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index da89ca16a55d..7d310e454b77 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -26,8 +26,8 @@
>  static int gc_thread_func(void *data)
>  {
>  	struct f2fs_sb_info *sbi = data;
> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
> -	wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> +	wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
>  	unsigned int wait_ms;
>  
>  	wait_ms = gc_th->min_sleep_time;
> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
>  	return 0;
>  }
>  
> -int start_gc_thread(struct f2fs_sb_info *sbi)
> +int init_gc_context(struct f2fs_sb_info *sbi)
>  {
>  	struct f2fs_gc_kthread *gc_th;
> -	dev_t dev = sbi->sb->s_bdev->bd_dev;
> -	int err = 0;
>  
>  	gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
> -	if (!gc_th) {
> -		err = -ENOMEM;
> -		goto out;
> -	}
> +	if (!gc_th)
> +		return -ENOMEM;
> +
> +	gc_th->f2fs_gc_task = NULL;
>  
>  	gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
>  	gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>  	gc_th->atomic_file[FG_GC] = 0;
>  
>  	sbi->gc_thread = gc_th;
> -	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> -	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> +
> +	return 0;
> +}
> +
> +void destroy_gc_context(struct f2fs_sb_info *sbi)
> +{
> +	kfree(GC_I(sbi));
> +	sbi->gc_thread = NULL;
> +}
> +
> +int start_gc_thread(struct f2fs_sb_info *sbi)
> +{
> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> +	dev_t dev = sbi->sb->s_bdev->bd_dev;
> +	int err = 0;
> +
> +	init_waitqueue_head(&gc_th->gc_wait_queue_head);
> +	gc_th->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>  			"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
>  	if (IS_ERR(gc_th->f2fs_gc_task)) {
>  		err = PTR_ERR(gc_th->f2fs_gc_task);
> -		kfree(gc_th);
> -		sbi->gc_thread = NULL;
> +		gc_th->f2fs_gc_task = NULL;
>  	}
> -out:
> +
>  	return err;
>  }
>  
>  void stop_gc_thread(struct f2fs_sb_info *sbi)
>  {
> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
> -	if (!gc_th)
> -		return;
> -	kthread_stop(gc_th->f2fs_gc_task);
> -	kfree(gc_th);
> -	sbi->gc_thread = NULL;
> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> +
> +	if (gc_th->f2fs_gc_task) {
> +		kthread_stop(gc_th->f2fs_gc_task);
> +		gc_th->f2fs_gc_task = NULL;
> +	}
>  }
>  
>  static int select_gc_type(struct f2fs_gc_kthread *gc_th, int gc_type)
> @@ -190,15 +203,14 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
>  		p->max_search = dirty_i->nr_dirty[type];
>  		p->ofs_unit = 1;
>  	} else {
> -		p->gc_mode = select_gc_type(sbi->gc_thread, gc_type);
> +		p->gc_mode = select_gc_type(GC_I(sbi), gc_type);
>  		p->dirty_segmap = dirty_i->dirty_segmap[DIRTY];
>  		p->max_search = dirty_i->nr_dirty[DIRTY];
>  		p->ofs_unit = sbi->segs_per_sec;
>  	}
>  
>  	/* we need to check every dirty segments in the FG_GC case */
> -	if (gc_type != FG_GC &&
> -			(sbi->gc_thread && !sbi->gc_thread->gc_urgent) &&
> +	if (gc_type != FG_GC && !GC_I(sbi)->gc_urgent &&
>  			p->max_search > sbi->max_victim_search)
>  		p->max_search = sbi->max_victim_search;
>  
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 831cefa088bc..c869ce54be4d 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -177,7 +177,7 @@ bool need_SSR(struct f2fs_sb_info *sbi)
>  
>  	if (test_opt(sbi, LFS))
>  		return false;
> -	if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
> +	if (GC_I(sbi)->gc_urgent)
>  		return true;
>  
>  	return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +
> @@ -1422,7 +1422,7 @@ static int issue_discard_thread(void *data)
>  		if (dcc->discard_wake)
>  			dcc->discard_wake = 0;
>  
> -		if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
> +		if (GC_I(sbi)->gc_urgent)
>  			init_discard_policy(&dpolicy, DPOLICY_FORCE, 1);
>  
>  		sb_start_intwrite(sbi->sb);
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index d01c11f5e9c1..f3a5f463496f 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1012,6 +1012,8 @@ static void f2fs_put_super(struct super_block *sb)
>  		write_checkpoint(sbi, &cpc);
>  	}
>  
> +	destroy_gc_context(sbi);
> +
>  	/* write_checkpoint can update stat informaion */
>  	f2fs_destroy_stats(sbi);
>  
> @@ -1044,6 +1046,7 @@ static void f2fs_put_super(struct super_block *sb)
>  	kfree(sbi->raw_super);
>  
>  	destroy_device_list(sbi);
> +	destroy_gc_context(sbi);
>  	mempool_destroy(sbi->write_io_dummy);
>  #ifdef CONFIG_QUOTA
>  	for (i = 0; i < MAXQUOTAS; i++)
> @@ -1476,11 +1479,11 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>  	 * option. Also sync the filesystem.
>  	 */
>  	if ((*flags & SB_RDONLY) || !test_opt(sbi, BG_GC)) {
> -		if (sbi->gc_thread) {
> +		if (GC_I(sbi)->f2fs_gc_task) {
>  			stop_gc_thread(sbi);
>  			need_restart_gc = true;
>  		}
> -	} else if (!sbi->gc_thread) {
> +	} else if (!GC_I(sbi)->f2fs_gc_task) {
>  		err = start_gc_thread(sbi);
>  		if (err)
>  			goto restore_opts;
> @@ -2771,6 +2774,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  		goto free_meta_inode;
>  	}
>  
> +	err = init_gc_context(sbi);
> +	if (err)
> +		goto free_checkpoint;
> +
>  	/* Initialize device list */
>  	err = f2fs_scan_devices(sbi);
>  	if (err) {
> @@ -2981,6 +2988,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  	destroy_segment_manager(sbi);
>  free_devices:
>  	destroy_device_list(sbi);
> +	destroy_gc_context(sbi);
> +free_checkpoint:
>  	kfree(sbi->ckpt);
>  free_meta_inode:
>  	make_bad_inode(sbi->meta_inode);
> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
> index 2c53de9251be..fb3cd477d985 100644
> --- a/fs/f2fs/sysfs.c
> +++ b/fs/f2fs/sysfs.c
> @@ -46,7 +46,7 @@ struct f2fs_attr {
>  static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type)
>  {
>  	if (struct_type == GC_THREAD)
> -		return (unsigned char *)sbi->gc_thread;
> +		return (unsigned char *)GC_I(sbi);
>  	else if (struct_type == SM_INFO)
>  		return (unsigned char *)SM_I(sbi);
>  	else if (struct_type == DCC_INFO)
> @@ -252,9 +252,9 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>  
>  	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
>  		f2fs_reset_iostat(sbi);
> -	if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) {
> -		sbi->gc_thread->gc_wake = 1;
> -		wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head);
> +	if (!strcmp(a->attr.name, "gc_urgent") && t == 1) {
> +		GC_I(sbi)->gc_wake = 1;
> +		wake_up_interruptible_all(&GC_I(sbi)->gc_wait_queue_head);
>  		wake_up_discard_thread(sbi, true);
>  	}
>  
> -- 
> 2.15.0.55.gc2ece9dc4de6

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] f2fs: avoid stucking GC due to atomic write
  2018-04-20  3:12   ` Jaegeuk Kim
@ 2018-04-20  3:22     ` Chao Yu
  2018-04-20  3:27       ` Jaegeuk Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Chao Yu @ 2018-04-20  3:22 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel, linux-kernel, chao

On 2018/4/20 11:12, Jaegeuk Kim wrote:
> On 04/18, Chao Yu wrote:
>> f2fs doesn't allow abuse on atomic write class interface, so except
>> limiting in-mem pages' total memory usage capacity, we need to limit
>> atomic-write usage as well when filesystem is seriously fragmented,
>> otherwise we may run into infinite loop during foreground GC because
>> target blocks in victim segment are belong to atomic opened file for
>> long time.
>>
>> Now, we will detect failure due to atomic write in foreground GC, if
>> the count exceeds threshold, we will drop all atomic written data in
>> cache, by this, I expect it can keep our system running safely to
>> prevent Dos attack.
>>
>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>> ---
>>  fs/f2fs/f2fs.h    |  1 +
>>  fs/f2fs/file.c    |  5 +++++
>>  fs/f2fs/gc.c      | 27 +++++++++++++++++++++++----
>>  fs/f2fs/gc.h      |  3 +++
>>  fs/f2fs/segment.c |  1 +
>>  fs/f2fs/segment.h |  2 ++
>>  6 files changed, 35 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index c1c3a1d11186..3453288d6a71 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -2249,6 +2249,7 @@ enum {
>>  	FI_EXTRA_ATTR,		/* indicate file has extra attribute */
>>  	FI_PROJ_INHERIT,	/* indicate file inherits projectid */
>>  	FI_PIN_FILE,		/* indicate file should not be gced */
>> +	FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been dropped */
>>  };
>>  
>>  static inline void __mark_inode_dirty_flag(struct inode *inode,
>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>> index 7c90ded5a431..cddd9aee1bb2 100644
>> --- a/fs/f2fs/file.c
>> +++ b/fs/f2fs/file.c
>> @@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>>  skip_flush:
>>  	set_inode_flag(inode, FI_HOT_DATA);
>>  	set_inode_flag(inode, FI_ATOMIC_FILE);
>> +	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>>  	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
>>  
>>  	F2FS_I(inode)->inmem_task = current;
>> @@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>  	}
>>  err_out:
>> +	if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
>> +		clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>> +		ret = -EINVAL;
>> +	}
>>  	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>>  	inode_unlock(inode);
>>  	mnt_drop_write_file(filp);
>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>> index bfb7a4a3a929..495876ca62b6 100644
>> --- a/fs/f2fs/gc.c
>> +++ b/fs/f2fs/gc.c
>> @@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>  	gc_th->gc_urgent = 0;
>>  	gc_th->gc_wake= 0;
>>  
>> +	gc_th->atomic_file = 0;
>> +
>>  	sbi->gc_thread = gc_th;
>>  	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>>  	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>> @@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>>   * This can be used to move blocks, aka LBAs, directly on disk.
>>   */
>>  static void move_data_block(struct inode *inode, block_t bidx,
>> -					unsigned int segno, int off)
>> +				int gc_type, unsigned int segno, int off)
>>  {
>>  	struct f2fs_io_info fio = {
>>  		.sbi = F2FS_I_SB(inode),
>> @@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, block_t bidx,
>>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>  		goto out;
>>  
>> -	if (f2fs_is_atomic_file(inode))
>> +	if (f2fs_is_atomic_file(inode)) {
>> +		F2FS_I_SB(inode)->gc_thread->atomic_file++;
>>  		goto out;
>> +	}
>>  
>>  	if (f2fs_is_pinned_file(inode)) {
>>  		f2fs_pin_file_control(inode, true);
>> @@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
>>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>  		goto out;
>>  
>> -	if (f2fs_is_atomic_file(inode))
>> +	if (f2fs_is_atomic_file(inode)) {
>> +		F2FS_I_SB(inode)->gc_thread->atomic_file++;
>>  		goto out;
>> +	}
>>  	if (f2fs_is_pinned_file(inode)) {
>>  		if (gc_type == FG_GC)
>>  			f2fs_pin_file_control(inode, true);
>> @@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
>>  			start_bidx = start_bidx_of_node(nofs, inode)
>>  								+ ofs_in_node;
>>  			if (f2fs_encrypted_file(inode))
>> -				move_data_block(inode, start_bidx, segno, off);
>> +				move_data_block(inode, start_bidx, gc_type,
>> +								segno, off);
>>  			else
>>  				move_data_page(inode, start_bidx, gc_type,
>>  								segno, off);
>> @@ -1017,6 +1024,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>  		.ilist = LIST_HEAD_INIT(gc_list.ilist),
>>  		.iroot = RADIX_TREE_INIT(GFP_NOFS),
>>  	};
>> +	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
>> +	unsigned int skipped_round = 0, round = 0;
>>  
>>  	trace_f2fs_gc_begin(sbi->sb, sync, background,
>>  				get_pages(sbi, F2FS_DIRTY_NODES),
>> @@ -1068,11 +1077,21 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>  		sec_freed++;
>>  	total_freed += seg_freed;
>>  
>> +	if (gc_type == FG_GC) {
>> +		if (sbi->gc_thread->atomic_file > last_atomic_file)
>> +			skipped_round++;
>> +		last_atomic_file = sbi->gc_thread->atomic_file;
> 
> How about resetting ->atomic_file to zero?

Since we will show this stat number through sysfs entry, it'd better to keep its
data?

Thanks,

> 
>> +		round++;
>> +	}
>> +
>>  	if (gc_type == FG_GC)
>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>  
>>  	if (!sync) {
>>  		if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
>> +			if (skipped_round > MAX_SKIP_ATOMIC_COUNT &&
>> +				skipped_round * 2 >= round)
>> +				drop_inmem_pages_all(sbi);
>>  			segno = NULL_SEGNO;
>>  			goto gc_more;
>>  		}
>> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
>> index b0045d4c8d1e..bc1d21d46ae7 100644
>> --- a/fs/f2fs/gc.h
>> +++ b/fs/f2fs/gc.h
>> @@ -39,6 +39,9 @@ struct f2fs_gc_kthread {
>>  	unsigned int gc_idle;
>>  	unsigned int gc_urgent;
>>  	unsigned int gc_wake;
>> +
>> +	/* for stuck statistic */
>> +	unsigned int atomic_file;
>>  };
>>  
>>  struct gc_inode_list {
>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>> index d809f731bfd1..831cefa088bc 100644
>> --- a/fs/f2fs/segment.c
>> +++ b/fs/f2fs/segment.c
>> @@ -287,6 +287,7 @@ void drop_inmem_pages_all(struct f2fs_sb_info *sbi)
>>  	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>>  
>>  	if (inode) {
>> +		set_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>>  		drop_inmem_pages(inode);
>>  		iput(inode);
>>  	}
>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
>> index 492ad0c86fa9..7702b054689c 100644
>> --- a/fs/f2fs/segment.h
>> +++ b/fs/f2fs/segment.h
>> @@ -215,6 +215,8 @@ struct segment_allocation {
>>  #define IS_DUMMY_WRITTEN_PAGE(page)			\
>>  		(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
>>  
>> +#define MAX_SKIP_ATOMIC_COUNT			16
>> +
>>  struct inmem_pages {
>>  	struct list_head list;
>>  	struct page *page;
>> -- 
>> 2.15.0.55.gc2ece9dc4de6
> 
> .
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] f2fs: show GC failure info in debugfs
  2018-04-20  3:15   ` Jaegeuk Kim
@ 2018-04-20  3:24     ` Chao Yu
  0 siblings, 0 replies; 15+ messages in thread
From: Chao Yu @ 2018-04-20  3:24 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel, linux-kernel, chao

On 2018/4/20 11:15, Jaegeuk Kim wrote:
> On 04/18, Chao Yu wrote:
>> This patch adds to show GC failure information in debugfs, now it just
>> shows count of failure caused by atomic write.
>>
>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>> ---
>>  fs/f2fs/debug.c |  5 +++++
>>  fs/f2fs/f2fs.h  |  1 +
>>  fs/f2fs/gc.c    | 13 +++++++------
>>  fs/f2fs/gc.h    |  2 +-
>>  4 files changed, 14 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>> index a66107b5cfff..715beb85e9db 100644
>> --- a/fs/f2fs/debug.c
>> +++ b/fs/f2fs/debug.c
>> @@ -104,6 +104,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
>>  	si->avail_nids = NM_I(sbi)->available_nids;
>>  	si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
>>  	si->bg_gc = sbi->bg_gc;
>> +	si->bg_atomic = sbi->gc_thread->atomic_file[BG_GC];
>> +	si->fg_atomic = sbi->gc_thread->atomic_file[FG_GC];
> 
> Need to change the naming like skipped_atomic_files?

OK

> 
>>  	si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
>>  		* 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
>>  		/ 2;
>> @@ -342,6 +344,9 @@ static int stat_show(struct seq_file *s, void *v)
>>  				si->bg_data_blks);
>>  		seq_printf(s, "  - node blocks : %d (%d)\n", si->node_blks,
>>  				si->bg_node_blks);
>> +		seq_printf(s, "Failure : atomic write %d (%d)\n",
> 
> It's not failure.

Alright... just skip..

> 
>> +				si->bg_atomic + si->fg_atomic,
>> +				si->bg_atomic);
>>  		seq_puts(s, "\nExtent Cache:\n");
>>  		seq_printf(s, "  - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
>>  				si->hit_largest, si->hit_cached,
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 3453288d6a71..567c6bb57ae3 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -3003,6 +3003,7 @@ struct f2fs_stat_info {
>>  	int bg_node_segs, bg_data_segs;
>>  	int tot_blks, data_blks, node_blks;
>>  	int bg_data_blks, bg_node_blks;
>> +	unsigned int bg_atomic, fg_atomic;
>>  	int curseg[NR_CURSEG_TYPE];
>>  	int cursec[NR_CURSEG_TYPE];
>>  	int curzone[NR_CURSEG_TYPE];
>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>> index 495876ca62b6..da89ca16a55d 100644
>> --- a/fs/f2fs/gc.c
>> +++ b/fs/f2fs/gc.c
>> @@ -135,7 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>  	gc_th->gc_urgent = 0;
>>  	gc_th->gc_wake= 0;
>>  
>> -	gc_th->atomic_file = 0;
>> +	gc_th->atomic_file[BG_GC] = 0;
>> +	gc_th->atomic_file[FG_GC] = 0;
> 
> Need to merge the previous patch with this.

Let me merge them. :)

Thanks,

> 
>>  
>>  	sbi->gc_thread = gc_th;
>>  	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>> @@ -633,7 +634,7 @@ static void move_data_block(struct inode *inode, block_t bidx,
>>  		goto out;
>>  
>>  	if (f2fs_is_atomic_file(inode)) {
>> -		F2FS_I_SB(inode)->gc_thread->atomic_file++;
>> +		F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>>  		goto out;
>>  	}
>>  
>> @@ -742,7 +743,7 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
>>  		goto out;
>>  
>>  	if (f2fs_is_atomic_file(inode)) {
>> -		F2FS_I_SB(inode)->gc_thread->atomic_file++;
>> +		F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>>  		goto out;
>>  	}
>>  	if (f2fs_is_pinned_file(inode)) {
>> @@ -1024,7 +1025,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>  		.ilist = LIST_HEAD_INIT(gc_list.ilist),
>>  		.iroot = RADIX_TREE_INIT(GFP_NOFS),
>>  	};
>> -	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
>> +	unsigned int last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>>  	unsigned int skipped_round = 0, round = 0;
>>  
>>  	trace_f2fs_gc_begin(sbi->sb, sync, background,
>> @@ -1078,9 +1079,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>  	total_freed += seg_freed;
>>  
>>  	if (gc_type == FG_GC) {
>> -		if (sbi->gc_thread->atomic_file > last_atomic_file)
>> +		if (sbi->gc_thread->atomic_file[FG_GC] > last_atomic_file)
>>  			skipped_round++;
>> -		last_atomic_file = sbi->gc_thread->atomic_file;
>> +		last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>>  		round++;
>>  	}
>>  
>> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
>> index bc1d21d46ae7..a6cffe6b249b 100644
>> --- a/fs/f2fs/gc.h
>> +++ b/fs/f2fs/gc.h
>> @@ -41,7 +41,7 @@ struct f2fs_gc_kthread {
>>  	unsigned int gc_wake;
>>  
>>  	/* for stuck statistic */
>> -	unsigned int atomic_file;
>> +	unsigned int atomic_file[2];
>>  };
>>  
>>  struct gc_inode_list {
>> -- 
>> 2.15.0.55.gc2ece9dc4de6
> 
> .
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] f2fs: avoid stucking GC due to atomic write
  2018-04-20  3:22     ` Chao Yu
@ 2018-04-20  3:27       ` Jaegeuk Kim
  0 siblings, 0 replies; 15+ messages in thread
From: Jaegeuk Kim @ 2018-04-20  3:27 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, linux-kernel, chao

On 04/20, Chao Yu wrote:
> On 2018/4/20 11:12, Jaegeuk Kim wrote:
> > On 04/18, Chao Yu wrote:
> >> f2fs doesn't allow abuse on atomic write class interface, so except
> >> limiting in-mem pages' total memory usage capacity, we need to limit
> >> atomic-write usage as well when filesystem is seriously fragmented,
> >> otherwise we may run into infinite loop during foreground GC because
> >> target blocks in victim segment are belong to atomic opened file for
> >> long time.
> >>
> >> Now, we will detect failure due to atomic write in foreground GC, if
> >> the count exceeds threshold, we will drop all atomic written data in
> >> cache, by this, I expect it can keep our system running safely to
> >> prevent Dos attack.
> >>
> >> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >> ---
> >>  fs/f2fs/f2fs.h    |  1 +
> >>  fs/f2fs/file.c    |  5 +++++
> >>  fs/f2fs/gc.c      | 27 +++++++++++++++++++++++----
> >>  fs/f2fs/gc.h      |  3 +++
> >>  fs/f2fs/segment.c |  1 +
> >>  fs/f2fs/segment.h |  2 ++
> >>  6 files changed, 35 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index c1c3a1d11186..3453288d6a71 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -2249,6 +2249,7 @@ enum {
> >>  	FI_EXTRA_ATTR,		/* indicate file has extra attribute */
> >>  	FI_PROJ_INHERIT,	/* indicate file inherits projectid */
> >>  	FI_PIN_FILE,		/* indicate file should not be gced */
> >> +	FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been dropped */
> >>  };
> >>  
> >>  static inline void __mark_inode_dirty_flag(struct inode *inode,
> >> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >> index 7c90ded5a431..cddd9aee1bb2 100644
> >> --- a/fs/f2fs/file.c
> >> +++ b/fs/f2fs/file.c
> >> @@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
> >>  skip_flush:
> >>  	set_inode_flag(inode, FI_HOT_DATA);
> >>  	set_inode_flag(inode, FI_ATOMIC_FILE);
> >> +	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >>  	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
> >>  
> >>  	F2FS_I(inode)->inmem_task = current;
> >> @@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >>  	}
> >>  err_out:
> >> +	if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
> >> +		clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >> +		ret = -EINVAL;
> >> +	}
> >>  	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
> >>  	inode_unlock(inode);
> >>  	mnt_drop_write_file(filp);
> >> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >> index bfb7a4a3a929..495876ca62b6 100644
> >> --- a/fs/f2fs/gc.c
> >> +++ b/fs/f2fs/gc.c
> >> @@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
> >>  	gc_th->gc_urgent = 0;
> >>  	gc_th->gc_wake= 0;
> >>  
> >> +	gc_th->atomic_file = 0;
> >> +
> >>  	sbi->gc_thread = gc_th;
> >>  	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> >>  	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> >> @@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
> >>   * This can be used to move blocks, aka LBAs, directly on disk.
> >>   */
> >>  static void move_data_block(struct inode *inode, block_t bidx,
> >> -					unsigned int segno, int off)
> >> +				int gc_type, unsigned int segno, int off)
> >>  {
> >>  	struct f2fs_io_info fio = {
> >>  		.sbi = F2FS_I_SB(inode),
> >> @@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, block_t bidx,
> >>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
> >>  		goto out;
> >>  
> >> -	if (f2fs_is_atomic_file(inode))
> >> +	if (f2fs_is_atomic_file(inode)) {
> >> +		F2FS_I_SB(inode)->gc_thread->atomic_file++;
> >>  		goto out;
> >> +	}
> >>  
> >>  	if (f2fs_is_pinned_file(inode)) {
> >>  		f2fs_pin_file_control(inode, true);
> >> @@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
> >>  	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
> >>  		goto out;
> >>  
> >> -	if (f2fs_is_atomic_file(inode))
> >> +	if (f2fs_is_atomic_file(inode)) {
> >> +		F2FS_I_SB(inode)->gc_thread->atomic_file++;
> >>  		goto out;
> >> +	}
> >>  	if (f2fs_is_pinned_file(inode)) {
> >>  		if (gc_type == FG_GC)
> >>  			f2fs_pin_file_control(inode, true);
> >> @@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
> >>  			start_bidx = start_bidx_of_node(nofs, inode)
> >>  								+ ofs_in_node;
> >>  			if (f2fs_encrypted_file(inode))
> >> -				move_data_block(inode, start_bidx, segno, off);
> >> +				move_data_block(inode, start_bidx, gc_type,
> >> +								segno, off);
> >>  			else
> >>  				move_data_page(inode, start_bidx, gc_type,
> >>  								segno, off);
> >> @@ -1017,6 +1024,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>  		.ilist = LIST_HEAD_INIT(gc_list.ilist),
> >>  		.iroot = RADIX_TREE_INIT(GFP_NOFS),
> >>  	};
> >> +	unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
> >> +	unsigned int skipped_round = 0, round = 0;
> >>  
> >>  	trace_f2fs_gc_begin(sbi->sb, sync, background,
> >>  				get_pages(sbi, F2FS_DIRTY_NODES),
> >> @@ -1068,11 +1077,21 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>  		sec_freed++;
> >>  	total_freed += seg_freed;
> >>  
> >> +	if (gc_type == FG_GC) {
> >> +		if (sbi->gc_thread->atomic_file > last_atomic_file)
> >> +			skipped_round++;
> >> +		last_atomic_file = sbi->gc_thread->atomic_file;
> > 
> > How about resetting ->atomic_file to zero?
> 
> Since we will show this stat number through sysfs entry, it'd better to keep its
> data?

Alright, but we need to manage overflow?

> 
> Thanks,
> 
> > 
> >> +		round++;
> >> +	}
> >> +
> >>  	if (gc_type == FG_GC)
> >>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>  
> >>  	if (!sync) {
> >>  		if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> >> +			if (skipped_round > MAX_SKIP_ATOMIC_COUNT &&
> >> +				skipped_round * 2 >= round)
> >> +				drop_inmem_pages_all(sbi);
> >>  			segno = NULL_SEGNO;
> >>  			goto gc_more;
> >>  		}
> >> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
> >> index b0045d4c8d1e..bc1d21d46ae7 100644
> >> --- a/fs/f2fs/gc.h
> >> +++ b/fs/f2fs/gc.h
> >> @@ -39,6 +39,9 @@ struct f2fs_gc_kthread {
> >>  	unsigned int gc_idle;
> >>  	unsigned int gc_urgent;
> >>  	unsigned int gc_wake;
> >> +
> >> +	/* for stuck statistic */
> >> +	unsigned int atomic_file;
> >>  };
> >>  
> >>  struct gc_inode_list {
> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >> index d809f731bfd1..831cefa088bc 100644
> >> --- a/fs/f2fs/segment.c
> >> +++ b/fs/f2fs/segment.c
> >> @@ -287,6 +287,7 @@ void drop_inmem_pages_all(struct f2fs_sb_info *sbi)
> >>  	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> >>  
> >>  	if (inode) {
> >> +		set_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >>  		drop_inmem_pages(inode);
> >>  		iput(inode);
> >>  	}
> >> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> >> index 492ad0c86fa9..7702b054689c 100644
> >> --- a/fs/f2fs/segment.h
> >> +++ b/fs/f2fs/segment.h
> >> @@ -215,6 +215,8 @@ struct segment_allocation {
> >>  #define IS_DUMMY_WRITTEN_PAGE(page)			\
> >>  		(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
> >>  
> >> +#define MAX_SKIP_ATOMIC_COUNT			16
> >> +
> >>  struct inmem_pages {
> >>  	struct list_head list;
> >>  	struct page *page;
> >> -- 
> >> 2.15.0.55.gc2ece9dc4de6
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer
  2018-04-20  3:19   ` Jaegeuk Kim
@ 2018-04-20  3:28     ` Chao Yu
  2018-04-20  3:54       ` Jaegeuk Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Chao Yu @ 2018-04-20  3:28 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel, linux-kernel, chao

On 2018/4/20 11:19, Jaegeuk Kim wrote:
> On 04/18, Chao Yu wrote:
>> Thread A			Thread B		Thread C
>> - f2fs_remount
>>  - stop_gc_thread
>> 				- f2fs_sbi_store
>> 							- issue_discard_thread
>>    sbi->gc_thread = NULL;
>> 				  sbi->gc_thread->gc_wake = 1
>> 							  access sbi->gc_thread->gc_urgent
> 
> Do we simply need a lock for this?

Code will be more complicated for handling existed and new coming fields with
the sbi->gc_thread pointer, and causing unneeded lock overhead, right?

So let's just allocate memory during fill_super?

Thanks,

> 
>>
>> Previously, we allocate memory for sbi->gc_thread based on background
>> gc thread mount option, the memory can be released if we turn off
>> that mount option, but still there are several places access gc_thread
>> pointer without considering race condition, result in NULL point
>> dereference.
>>
>> In order to fix this issue, keep gc_thread structure valid in sbi all
>> the time instead of alloc/free it dynamically.
>>
>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>> ---
>>  fs/f2fs/debug.c   |  3 +--
>>  fs/f2fs/f2fs.h    |  7 +++++++
>>  fs/f2fs/gc.c      | 58 +++++++++++++++++++++++++++++++++----------------------
>>  fs/f2fs/segment.c |  4 ++--
>>  fs/f2fs/super.c   | 13 +++++++++++--
>>  fs/f2fs/sysfs.c   |  8 ++++----
>>  6 files changed, 60 insertions(+), 33 deletions(-)
>>
>> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>> index 715beb85e9db..7bb036a3bb81 100644
>> --- a/fs/f2fs/debug.c
>> +++ b/fs/f2fs/debug.c
>> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
>>  	si->cache_mem = 0;
>>  
>>  	/* build gc */
>> -	if (sbi->gc_thread)
>> -		si->cache_mem += sizeof(struct f2fs_gc_kthread);
>> +	si->cache_mem += sizeof(struct f2fs_gc_kthread);
>>  
>>  	/* build merge flush thread */
>>  	if (SM_I(sbi)->fcc_info)
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 567c6bb57ae3..c553f63199e8 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct f2fs_sb_info *sbi)
>>  	return (struct sit_info *)(SM_I(sbi)->sit_info);
>>  }
>>  
>> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
>> +{
>> +	return (struct f2fs_gc_kthread *)(sbi->gc_thread);
>> +}
>> +
>>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
>>  {
>>  	return (struct free_segmap_info *)(SM_I(sbi)->free_info);
>> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
>>  /*
>>   * gc.c
>>   */
>> +int init_gc_context(struct f2fs_sb_info *sbi);
>> +void destroy_gc_context(struct f2fs_sb_info * sbi);
>>  int start_gc_thread(struct f2fs_sb_info *sbi);
>>  void stop_gc_thread(struct f2fs_sb_info *sbi);
>>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>> index da89ca16a55d..7d310e454b77 100644
>> --- a/fs/f2fs/gc.c
>> +++ b/fs/f2fs/gc.c
>> @@ -26,8 +26,8 @@
>>  static int gc_thread_func(void *data)
>>  {
>>  	struct f2fs_sb_info *sbi = data;
>> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
>> -	wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
>> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>> +	wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
>>  	unsigned int wait_ms;
>>  
>>  	wait_ms = gc_th->min_sleep_time;
>> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
>>  	return 0;
>>  }
>>  
>> -int start_gc_thread(struct f2fs_sb_info *sbi)
>> +int init_gc_context(struct f2fs_sb_info *sbi)
>>  {
>>  	struct f2fs_gc_kthread *gc_th;
>> -	dev_t dev = sbi->sb->s_bdev->bd_dev;
>> -	int err = 0;
>>  
>>  	gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
>> -	if (!gc_th) {
>> -		err = -ENOMEM;
>> -		goto out;
>> -	}
>> +	if (!gc_th)
>> +		return -ENOMEM;
>> +
>> +	gc_th->f2fs_gc_task = NULL;
>>  
>>  	gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
>>  	gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
>> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>  	gc_th->atomic_file[FG_GC] = 0;
>>  
>>  	sbi->gc_thread = gc_th;
>> -	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>> -	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>> +
>> +	return 0;
>> +}
>> +
>> +void destroy_gc_context(struct f2fs_sb_info *sbi)
>> +{
>> +	kfree(GC_I(sbi));
>> +	sbi->gc_thread = NULL;
>> +}
>> +
>> +int start_gc_thread(struct f2fs_sb_info *sbi)
>> +{
>> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>> +	dev_t dev = sbi->sb->s_bdev->bd_dev;
>> +	int err = 0;
>> +
>> +	init_waitqueue_head(&gc_th->gc_wait_queue_head);
>> +	gc_th->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>>  			"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
>>  	if (IS_ERR(gc_th->f2fs_gc_task)) {
>>  		err = PTR_ERR(gc_th->f2fs_gc_task);
>> -		kfree(gc_th);
>> -		sbi->gc_thread = NULL;
>> +		gc_th->f2fs_gc_task = NULL;
>>  	}
>> -out:
>> +
>>  	return err;
>>  }
>>  
>>  void stop_gc_thread(struct f2fs_sb_info *sbi)
>>  {
>> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
>> -	if (!gc_th)
>> -		return;
>> -	kthread_stop(gc_th->f2fs_gc_task);
>> -	kfree(gc_th);
>> -	sbi->gc_thread = NULL;
>> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>> +
>> +	if (gc_th->f2fs_gc_task) {
>> +		kthread_stop(gc_th->f2fs_gc_task);
>> +		gc_th->f2fs_gc_task = NULL;
>> +	}
>>  }
>>  
>>  static int select_gc_type(struct f2fs_gc_kthread *gc_th, int gc_type)
>> @@ -190,15 +203,14 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
>>  		p->max_search = dirty_i->nr_dirty[type];
>>  		p->ofs_unit = 1;
>>  	} else {
>> -		p->gc_mode = select_gc_type(sbi->gc_thread, gc_type);
>> +		p->gc_mode = select_gc_type(GC_I(sbi), gc_type);
>>  		p->dirty_segmap = dirty_i->dirty_segmap[DIRTY];
>>  		p->max_search = dirty_i->nr_dirty[DIRTY];
>>  		p->ofs_unit = sbi->segs_per_sec;
>>  	}
>>  
>>  	/* we need to check every dirty segments in the FG_GC case */
>> -	if (gc_type != FG_GC &&
>> -			(sbi->gc_thread && !sbi->gc_thread->gc_urgent) &&
>> +	if (gc_type != FG_GC && !GC_I(sbi)->gc_urgent &&
>>  			p->max_search > sbi->max_victim_search)
>>  		p->max_search = sbi->max_victim_search;
>>  
>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>> index 831cefa088bc..c869ce54be4d 100644
>> --- a/fs/f2fs/segment.c
>> +++ b/fs/f2fs/segment.c
>> @@ -177,7 +177,7 @@ bool need_SSR(struct f2fs_sb_info *sbi)
>>  
>>  	if (test_opt(sbi, LFS))
>>  		return false;
>> -	if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
>> +	if (GC_I(sbi)->gc_urgent)
>>  		return true;
>>  
>>  	return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +
>> @@ -1422,7 +1422,7 @@ static int issue_discard_thread(void *data)
>>  		if (dcc->discard_wake)
>>  			dcc->discard_wake = 0;
>>  
>> -		if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
>> +		if (GC_I(sbi)->gc_urgent)
>>  			init_discard_policy(&dpolicy, DPOLICY_FORCE, 1);
>>  
>>  		sb_start_intwrite(sbi->sb);
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index d01c11f5e9c1..f3a5f463496f 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -1012,6 +1012,8 @@ static void f2fs_put_super(struct super_block *sb)
>>  		write_checkpoint(sbi, &cpc);
>>  	}
>>  
>> +	destroy_gc_context(sbi);
>> +
>>  	/* write_checkpoint can update stat informaion */
>>  	f2fs_destroy_stats(sbi);
>>  
>> @@ -1044,6 +1046,7 @@ static void f2fs_put_super(struct super_block *sb)
>>  	kfree(sbi->raw_super);
>>  
>>  	destroy_device_list(sbi);
>> +	destroy_gc_context(sbi);
>>  	mempool_destroy(sbi->write_io_dummy);
>>  #ifdef CONFIG_QUOTA
>>  	for (i = 0; i < MAXQUOTAS; i++)
>> @@ -1476,11 +1479,11 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>  	 * option. Also sync the filesystem.
>>  	 */
>>  	if ((*flags & SB_RDONLY) || !test_opt(sbi, BG_GC)) {
>> -		if (sbi->gc_thread) {
>> +		if (GC_I(sbi)->f2fs_gc_task) {
>>  			stop_gc_thread(sbi);
>>  			need_restart_gc = true;
>>  		}
>> -	} else if (!sbi->gc_thread) {
>> +	} else if (!GC_I(sbi)->f2fs_gc_task) {
>>  		err = start_gc_thread(sbi);
>>  		if (err)
>>  			goto restore_opts;
>> @@ -2771,6 +2774,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>  		goto free_meta_inode;
>>  	}
>>  
>> +	err = init_gc_context(sbi);
>> +	if (err)
>> +		goto free_checkpoint;
>> +
>>  	/* Initialize device list */
>>  	err = f2fs_scan_devices(sbi);
>>  	if (err) {
>> @@ -2981,6 +2988,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>  	destroy_segment_manager(sbi);
>>  free_devices:
>>  	destroy_device_list(sbi);
>> +	destroy_gc_context(sbi);
>> +free_checkpoint:
>>  	kfree(sbi->ckpt);
>>  free_meta_inode:
>>  	make_bad_inode(sbi->meta_inode);
>> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
>> index 2c53de9251be..fb3cd477d985 100644
>> --- a/fs/f2fs/sysfs.c
>> +++ b/fs/f2fs/sysfs.c
>> @@ -46,7 +46,7 @@ struct f2fs_attr {
>>  static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type)
>>  {
>>  	if (struct_type == GC_THREAD)
>> -		return (unsigned char *)sbi->gc_thread;
>> +		return (unsigned char *)GC_I(sbi);
>>  	else if (struct_type == SM_INFO)
>>  		return (unsigned char *)SM_I(sbi);
>>  	else if (struct_type == DCC_INFO)
>> @@ -252,9 +252,9 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>>  
>>  	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
>>  		f2fs_reset_iostat(sbi);
>> -	if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) {
>> -		sbi->gc_thread->gc_wake = 1;
>> -		wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head);
>> +	if (!strcmp(a->attr.name, "gc_urgent") && t == 1) {
>> +		GC_I(sbi)->gc_wake = 1;
>> +		wake_up_interruptible_all(&GC_I(sbi)->gc_wait_queue_head);
>>  		wake_up_discard_thread(sbi, true);
>>  	}
>>  
>> -- 
>> 2.15.0.55.gc2ece9dc4de6
> 
> .
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer
  2018-04-20  3:28     ` Chao Yu
@ 2018-04-20  3:54       ` Jaegeuk Kim
  2018-04-20  7:25         ` Chao Yu
  0 siblings, 1 reply; 15+ messages in thread
From: Jaegeuk Kim @ 2018-04-20  3:54 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, linux-kernel, chao

On 04/20, Chao Yu wrote:
> On 2018/4/20 11:19, Jaegeuk Kim wrote:
> > On 04/18, Chao Yu wrote:
> >> Thread A			Thread B		Thread C
> >> - f2fs_remount
> >>  - stop_gc_thread
> >> 				- f2fs_sbi_store
> >> 							- issue_discard_thread
> >>    sbi->gc_thread = NULL;
> >> 				  sbi->gc_thread->gc_wake = 1
> >> 							  access sbi->gc_thread->gc_urgent
> > 
> > Do we simply need a lock for this?
> 
> Code will be more complicated for handling existed and new coming fields with
> the sbi->gc_thread pointer, and causing unneeded lock overhead, right?
> 
> So let's just allocate memory during fill_super?

No, the case is when stopping the thread. We can keep the gc_thread and indicate
its state as "disabled". Then, we need to handle other paths with the state?

> 
> Thanks,
> 
> > 
> >>
> >> Previously, we allocate memory for sbi->gc_thread based on background
> >> gc thread mount option, the memory can be released if we turn off
> >> that mount option, but still there are several places access gc_thread
> >> pointer without considering race condition, result in NULL point
> >> dereference.
> >>
> >> In order to fix this issue, keep gc_thread structure valid in sbi all
> >> the time instead of alloc/free it dynamically.
> >>
> >> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >> ---
> >>  fs/f2fs/debug.c   |  3 +--
> >>  fs/f2fs/f2fs.h    |  7 +++++++
> >>  fs/f2fs/gc.c      | 58 +++++++++++++++++++++++++++++++++----------------------
> >>  fs/f2fs/segment.c |  4 ++--
> >>  fs/f2fs/super.c   | 13 +++++++++++--
> >>  fs/f2fs/sysfs.c   |  8 ++++----
> >>  6 files changed, 60 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> >> index 715beb85e9db..7bb036a3bb81 100644
> >> --- a/fs/f2fs/debug.c
> >> +++ b/fs/f2fs/debug.c
> >> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
> >>  	si->cache_mem = 0;
> >>  
> >>  	/* build gc */
> >> -	if (sbi->gc_thread)
> >> -		si->cache_mem += sizeof(struct f2fs_gc_kthread);
> >> +	si->cache_mem += sizeof(struct f2fs_gc_kthread);
> >>  
> >>  	/* build merge flush thread */
> >>  	if (SM_I(sbi)->fcc_info)
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index 567c6bb57ae3..c553f63199e8 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct f2fs_sb_info *sbi)
> >>  	return (struct sit_info *)(SM_I(sbi)->sit_info);
> >>  }
> >>  
> >> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
> >> +{
> >> +	return (struct f2fs_gc_kthread *)(sbi->gc_thread);
> >> +}
> >> +
> >>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
> >>  {
> >>  	return (struct free_segmap_info *)(SM_I(sbi)->free_info);
> >> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
> >>  /*
> >>   * gc.c
> >>   */
> >> +int init_gc_context(struct f2fs_sb_info *sbi);
> >> +void destroy_gc_context(struct f2fs_sb_info * sbi);
> >>  int start_gc_thread(struct f2fs_sb_info *sbi);
> >>  void stop_gc_thread(struct f2fs_sb_info *sbi);
> >>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
> >> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >> index da89ca16a55d..7d310e454b77 100644
> >> --- a/fs/f2fs/gc.c
> >> +++ b/fs/f2fs/gc.c
> >> @@ -26,8 +26,8 @@
> >>  static int gc_thread_func(void *data)
> >>  {
> >>  	struct f2fs_sb_info *sbi = data;
> >> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
> >> -	wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
> >> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> >> +	wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
> >>  	unsigned int wait_ms;
> >>  
> >>  	wait_ms = gc_th->min_sleep_time;
> >> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
> >>  	return 0;
> >>  }
> >>  
> >> -int start_gc_thread(struct f2fs_sb_info *sbi)
> >> +int init_gc_context(struct f2fs_sb_info *sbi)
> >>  {
> >>  	struct f2fs_gc_kthread *gc_th;
> >> -	dev_t dev = sbi->sb->s_bdev->bd_dev;
> >> -	int err = 0;
> >>  
> >>  	gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
> >> -	if (!gc_th) {
> >> -		err = -ENOMEM;
> >> -		goto out;
> >> -	}
> >> +	if (!gc_th)
> >> +		return -ENOMEM;
> >> +
> >> +	gc_th->f2fs_gc_task = NULL;
> >>  
> >>  	gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
> >>  	gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
> >> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
> >>  	gc_th->atomic_file[FG_GC] = 0;
> >>  
> >>  	sbi->gc_thread = gc_th;
> >> -	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> >> -	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +void destroy_gc_context(struct f2fs_sb_info *sbi)
> >> +{
> >> +	kfree(GC_I(sbi));
> >> +	sbi->gc_thread = NULL;
> >> +}
> >> +
> >> +int start_gc_thread(struct f2fs_sb_info *sbi)
> >> +{
> >> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> >> +	dev_t dev = sbi->sb->s_bdev->bd_dev;
> >> +	int err = 0;
> >> +
> >> +	init_waitqueue_head(&gc_th->gc_wait_queue_head);
> >> +	gc_th->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> >>  			"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
> >>  	if (IS_ERR(gc_th->f2fs_gc_task)) {
> >>  		err = PTR_ERR(gc_th->f2fs_gc_task);
> >> -		kfree(gc_th);
> >> -		sbi->gc_thread = NULL;
> >> +		gc_th->f2fs_gc_task = NULL;
> >>  	}
> >> -out:
> >> +
> >>  	return err;
> >>  }
> >>  
> >>  void stop_gc_thread(struct f2fs_sb_info *sbi)
> >>  {
> >> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
> >> -	if (!gc_th)
> >> -		return;
> >> -	kthread_stop(gc_th->f2fs_gc_task);
> >> -	kfree(gc_th);
> >> -	sbi->gc_thread = NULL;
> >> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> >> +
> >> +	if (gc_th->f2fs_gc_task) {
> >> +		kthread_stop(gc_th->f2fs_gc_task);
> >> +		gc_th->f2fs_gc_task = NULL;
> >> +	}
> >>  }
> >>  
> >>  static int select_gc_type(struct f2fs_gc_kthread *gc_th, int gc_type)
> >> @@ -190,15 +203,14 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
> >>  		p->max_search = dirty_i->nr_dirty[type];
> >>  		p->ofs_unit = 1;
> >>  	} else {
> >> -		p->gc_mode = select_gc_type(sbi->gc_thread, gc_type);
> >> +		p->gc_mode = select_gc_type(GC_I(sbi), gc_type);
> >>  		p->dirty_segmap = dirty_i->dirty_segmap[DIRTY];
> >>  		p->max_search = dirty_i->nr_dirty[DIRTY];
> >>  		p->ofs_unit = sbi->segs_per_sec;
> >>  	}
> >>  
> >>  	/* we need to check every dirty segments in the FG_GC case */
> >> -	if (gc_type != FG_GC &&
> >> -			(sbi->gc_thread && !sbi->gc_thread->gc_urgent) &&
> >> +	if (gc_type != FG_GC && !GC_I(sbi)->gc_urgent &&
> >>  			p->max_search > sbi->max_victim_search)
> >>  		p->max_search = sbi->max_victim_search;
> >>  
> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >> index 831cefa088bc..c869ce54be4d 100644
> >> --- a/fs/f2fs/segment.c
> >> +++ b/fs/f2fs/segment.c
> >> @@ -177,7 +177,7 @@ bool need_SSR(struct f2fs_sb_info *sbi)
> >>  
> >>  	if (test_opt(sbi, LFS))
> >>  		return false;
> >> -	if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
> >> +	if (GC_I(sbi)->gc_urgent)
> >>  		return true;
> >>  
> >>  	return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +
> >> @@ -1422,7 +1422,7 @@ static int issue_discard_thread(void *data)
> >>  		if (dcc->discard_wake)
> >>  			dcc->discard_wake = 0;
> >>  
> >> -		if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
> >> +		if (GC_I(sbi)->gc_urgent)
> >>  			init_discard_policy(&dpolicy, DPOLICY_FORCE, 1);
> >>  
> >>  		sb_start_intwrite(sbi->sb);
> >> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >> index d01c11f5e9c1..f3a5f463496f 100644
> >> --- a/fs/f2fs/super.c
> >> +++ b/fs/f2fs/super.c
> >> @@ -1012,6 +1012,8 @@ static void f2fs_put_super(struct super_block *sb)
> >>  		write_checkpoint(sbi, &cpc);
> >>  	}
> >>  
> >> +	destroy_gc_context(sbi);
> >> +
> >>  	/* write_checkpoint can update stat informaion */
> >>  	f2fs_destroy_stats(sbi);
> >>  
> >> @@ -1044,6 +1046,7 @@ static void f2fs_put_super(struct super_block *sb)
> >>  	kfree(sbi->raw_super);
> >>  
> >>  	destroy_device_list(sbi);
> >> +	destroy_gc_context(sbi);
> >>  	mempool_destroy(sbi->write_io_dummy);
> >>  #ifdef CONFIG_QUOTA
> >>  	for (i = 0; i < MAXQUOTAS; i++)
> >> @@ -1476,11 +1479,11 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >>  	 * option. Also sync the filesystem.
> >>  	 */
> >>  	if ((*flags & SB_RDONLY) || !test_opt(sbi, BG_GC)) {
> >> -		if (sbi->gc_thread) {
> >> +		if (GC_I(sbi)->f2fs_gc_task) {
> >>  			stop_gc_thread(sbi);
> >>  			need_restart_gc = true;
> >>  		}
> >> -	} else if (!sbi->gc_thread) {
> >> +	} else if (!GC_I(sbi)->f2fs_gc_task) {
> >>  		err = start_gc_thread(sbi);
> >>  		if (err)
> >>  			goto restore_opts;
> >> @@ -2771,6 +2774,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>  		goto free_meta_inode;
> >>  	}
> >>  
> >> +	err = init_gc_context(sbi);
> >> +	if (err)
> >> +		goto free_checkpoint;
> >> +
> >>  	/* Initialize device list */
> >>  	err = f2fs_scan_devices(sbi);
> >>  	if (err) {
> >> @@ -2981,6 +2988,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>  	destroy_segment_manager(sbi);
> >>  free_devices:
> >>  	destroy_device_list(sbi);
> >> +	destroy_gc_context(sbi);
> >> +free_checkpoint:
> >>  	kfree(sbi->ckpt);
> >>  free_meta_inode:
> >>  	make_bad_inode(sbi->meta_inode);
> >> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
> >> index 2c53de9251be..fb3cd477d985 100644
> >> --- a/fs/f2fs/sysfs.c
> >> +++ b/fs/f2fs/sysfs.c
> >> @@ -46,7 +46,7 @@ struct f2fs_attr {
> >>  static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type)
> >>  {
> >>  	if (struct_type == GC_THREAD)
> >> -		return (unsigned char *)sbi->gc_thread;
> >> +		return (unsigned char *)GC_I(sbi);
> >>  	else if (struct_type == SM_INFO)
> >>  		return (unsigned char *)SM_I(sbi);
> >>  	else if (struct_type == DCC_INFO)
> >> @@ -252,9 +252,9 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
> >>  
> >>  	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
> >>  		f2fs_reset_iostat(sbi);
> >> -	if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) {
> >> -		sbi->gc_thread->gc_wake = 1;
> >> -		wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head);
> >> +	if (!strcmp(a->attr.name, "gc_urgent") && t == 1) {
> >> +		GC_I(sbi)->gc_wake = 1;
> >> +		wake_up_interruptible_all(&GC_I(sbi)->gc_wait_queue_head);
> >>  		wake_up_discard_thread(sbi, true);
> >>  	}
> >>  
> >> -- 
> >> 2.15.0.55.gc2ece9dc4de6
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer
  2018-04-20  3:54       ` Jaegeuk Kim
@ 2018-04-20  7:25         ` Chao Yu
  0 siblings, 0 replies; 15+ messages in thread
From: Chao Yu @ 2018-04-20  7:25 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel, linux-kernel, chao

On 2018/4/20 11:54, Jaegeuk Kim wrote:
> On 04/20, Chao Yu wrote:
>> On 2018/4/20 11:19, Jaegeuk Kim wrote:
>>> On 04/18, Chao Yu wrote:
>>>> Thread A			Thread B		Thread C
>>>> - f2fs_remount
>>>>  - stop_gc_thread
>>>> 				- f2fs_sbi_store
>>>> 							- issue_discard_thread
>>>>    sbi->gc_thread = NULL;
>>>> 				  sbi->gc_thread->gc_wake = 1
>>>> 							  access sbi->gc_thread->gc_urgent
>>>
>>> Do we simply need a lock for this?
>>
>> Code will be more complicated for handling existed and new coming fields with
>> the sbi->gc_thread pointer, and causing unneeded lock overhead, right?
>>
>> So let's just allocate memory during fill_super?
> 
> No, the case is when stopping the thread. We can keep the gc_thread and indicate
> its state as "disabled". Then, we need to handle other paths with the state?

After this patch, we use f2fs_gc_kthread.f2fs_gc_task to indicate whether GC
thread is existed, so you mean if we do that change, we also need to add a lock,
and access/update other fields in f2fs_gc_kthread after we check
f2fs_gc_kthread.f2fs_gc_task with a lock, right?

Like:

f2fs_sbi_store:

	if (!strcmp(a->attr.name, "gc_urgent") && t == 1) {
		gc_context_lock()
		if (GC_I(sbi)->f2fs_gc_task) {
			GC_I(sbi)->gc_wake = 1;
			...
		}
		gc_context_unlock()

Do you mean that?

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Previously, we allocate memory for sbi->gc_thread based on background
>>>> gc thread mount option, the memory can be released if we turn off
>>>> that mount option, but still there are several places access gc_thread
>>>> pointer without considering race condition, result in NULL point
>>>> dereference.
>>>>
>>>> In order to fix this issue, keep gc_thread structure valid in sbi all
>>>> the time instead of alloc/free it dynamically.
>>>>
>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>> ---
>>>>  fs/f2fs/debug.c   |  3 +--
>>>>  fs/f2fs/f2fs.h    |  7 +++++++
>>>>  fs/f2fs/gc.c      | 58 +++++++++++++++++++++++++++++++++----------------------
>>>>  fs/f2fs/segment.c |  4 ++--
>>>>  fs/f2fs/super.c   | 13 +++++++++++--
>>>>  fs/f2fs/sysfs.c   |  8 ++++----
>>>>  6 files changed, 60 insertions(+), 33 deletions(-)
>>>>
>>>> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>>>> index 715beb85e9db..7bb036a3bb81 100644
>>>> --- a/fs/f2fs/debug.c
>>>> +++ b/fs/f2fs/debug.c
>>>> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
>>>>  	si->cache_mem = 0;
>>>>  
>>>>  	/* build gc */
>>>> -	if (sbi->gc_thread)
>>>> -		si->cache_mem += sizeof(struct f2fs_gc_kthread);
>>>> +	si->cache_mem += sizeof(struct f2fs_gc_kthread);
>>>>  
>>>>  	/* build merge flush thread */
>>>>  	if (SM_I(sbi)->fcc_info)
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 567c6bb57ae3..c553f63199e8 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct f2fs_sb_info *sbi)
>>>>  	return (struct sit_info *)(SM_I(sbi)->sit_info);
>>>>  }
>>>>  
>>>> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
>>>> +{
>>>> +	return (struct f2fs_gc_kthread *)(sbi->gc_thread);
>>>> +}
>>>> +
>>>>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
>>>>  {
>>>>  	return (struct free_segmap_info *)(SM_I(sbi)->free_info);
>>>> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
>>>>  /*
>>>>   * gc.c
>>>>   */
>>>> +int init_gc_context(struct f2fs_sb_info *sbi);
>>>> +void destroy_gc_context(struct f2fs_sb_info * sbi);
>>>>  int start_gc_thread(struct f2fs_sb_info *sbi);
>>>>  void stop_gc_thread(struct f2fs_sb_info *sbi);
>>>>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>> index da89ca16a55d..7d310e454b77 100644
>>>> --- a/fs/f2fs/gc.c
>>>> +++ b/fs/f2fs/gc.c
>>>> @@ -26,8 +26,8 @@
>>>>  static int gc_thread_func(void *data)
>>>>  {
>>>>  	struct f2fs_sb_info *sbi = data;
>>>> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
>>>> -	wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
>>>> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>>>> +	wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
>>>>  	unsigned int wait_ms;
>>>>  
>>>>  	wait_ms = gc_th->min_sleep_time;
>>>> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
>>>>  	return 0;
>>>>  }
>>>>  
>>>> -int start_gc_thread(struct f2fs_sb_info *sbi)
>>>> +int init_gc_context(struct f2fs_sb_info *sbi)
>>>>  {
>>>>  	struct f2fs_gc_kthread *gc_th;
>>>> -	dev_t dev = sbi->sb->s_bdev->bd_dev;
>>>> -	int err = 0;
>>>>  
>>>>  	gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
>>>> -	if (!gc_th) {
>>>> -		err = -ENOMEM;
>>>> -		goto out;
>>>> -	}
>>>> +	if (!gc_th)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	gc_th->f2fs_gc_task = NULL;
>>>>  
>>>>  	gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
>>>>  	gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
>>>> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>>>  	gc_th->atomic_file[FG_GC] = 0;
>>>>  
>>>>  	sbi->gc_thread = gc_th;
>>>> -	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>>>> -	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +void destroy_gc_context(struct f2fs_sb_info *sbi)
>>>> +{
>>>> +	kfree(GC_I(sbi));
>>>> +	sbi->gc_thread = NULL;
>>>> +}
>>>> +
>>>> +int start_gc_thread(struct f2fs_sb_info *sbi)
>>>> +{
>>>> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>>>> +	dev_t dev = sbi->sb->s_bdev->bd_dev;
>>>> +	int err = 0;
>>>> +
>>>> +	init_waitqueue_head(&gc_th->gc_wait_queue_head);
>>>> +	gc_th->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>>>>  			"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
>>>>  	if (IS_ERR(gc_th->f2fs_gc_task)) {
>>>>  		err = PTR_ERR(gc_th->f2fs_gc_task);
>>>> -		kfree(gc_th);
>>>> -		sbi->gc_thread = NULL;
>>>> +		gc_th->f2fs_gc_task = NULL;
>>>>  	}
>>>> -out:
>>>> +
>>>>  	return err;
>>>>  }
>>>>  
>>>>  void stop_gc_thread(struct f2fs_sb_info *sbi)
>>>>  {
>>>> -	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
>>>> -	if (!gc_th)
>>>> -		return;
>>>> -	kthread_stop(gc_th->f2fs_gc_task);
>>>> -	kfree(gc_th);
>>>> -	sbi->gc_thread = NULL;
>>>> +	struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>>>> +
>>>> +	if (gc_th->f2fs_gc_task) {
>>>> +		kthread_stop(gc_th->f2fs_gc_task);
>>>> +		gc_th->f2fs_gc_task = NULL;
>>>> +	}
>>>>  }
>>>>  
>>>>  static int select_gc_type(struct f2fs_gc_kthread *gc_th, int gc_type)
>>>> @@ -190,15 +203,14 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
>>>>  		p->max_search = dirty_i->nr_dirty[type];
>>>>  		p->ofs_unit = 1;
>>>>  	} else {
>>>> -		p->gc_mode = select_gc_type(sbi->gc_thread, gc_type);
>>>> +		p->gc_mode = select_gc_type(GC_I(sbi), gc_type);
>>>>  		p->dirty_segmap = dirty_i->dirty_segmap[DIRTY];
>>>>  		p->max_search = dirty_i->nr_dirty[DIRTY];
>>>>  		p->ofs_unit = sbi->segs_per_sec;
>>>>  	}
>>>>  
>>>>  	/* we need to check every dirty segments in the FG_GC case */
>>>> -	if (gc_type != FG_GC &&
>>>> -			(sbi->gc_thread && !sbi->gc_thread->gc_urgent) &&
>>>> +	if (gc_type != FG_GC && !GC_I(sbi)->gc_urgent &&
>>>>  			p->max_search > sbi->max_victim_search)
>>>>  		p->max_search = sbi->max_victim_search;
>>>>  
>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>>> index 831cefa088bc..c869ce54be4d 100644
>>>> --- a/fs/f2fs/segment.c
>>>> +++ b/fs/f2fs/segment.c
>>>> @@ -177,7 +177,7 @@ bool need_SSR(struct f2fs_sb_info *sbi)
>>>>  
>>>>  	if (test_opt(sbi, LFS))
>>>>  		return false;
>>>> -	if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
>>>> +	if (GC_I(sbi)->gc_urgent)
>>>>  		return true;
>>>>  
>>>>  	return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +
>>>> @@ -1422,7 +1422,7 @@ static int issue_discard_thread(void *data)
>>>>  		if (dcc->discard_wake)
>>>>  			dcc->discard_wake = 0;
>>>>  
>>>> -		if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
>>>> +		if (GC_I(sbi)->gc_urgent)
>>>>  			init_discard_policy(&dpolicy, DPOLICY_FORCE, 1);
>>>>  
>>>>  		sb_start_intwrite(sbi->sb);
>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>> index d01c11f5e9c1..f3a5f463496f 100644
>>>> --- a/fs/f2fs/super.c
>>>> +++ b/fs/f2fs/super.c
>>>> @@ -1012,6 +1012,8 @@ static void f2fs_put_super(struct super_block *sb)
>>>>  		write_checkpoint(sbi, &cpc);
>>>>  	}
>>>>  
>>>> +	destroy_gc_context(sbi);
>>>> +
>>>>  	/* write_checkpoint can update stat informaion */
>>>>  	f2fs_destroy_stats(sbi);
>>>>  
>>>> @@ -1044,6 +1046,7 @@ static void f2fs_put_super(struct super_block *sb)
>>>>  	kfree(sbi->raw_super);
>>>>  
>>>>  	destroy_device_list(sbi);
>>>> +	destroy_gc_context(sbi);
>>>>  	mempool_destroy(sbi->write_io_dummy);
>>>>  #ifdef CONFIG_QUOTA
>>>>  	for (i = 0; i < MAXQUOTAS; i++)
>>>> @@ -1476,11 +1479,11 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>>  	 * option. Also sync the filesystem.
>>>>  	 */
>>>>  	if ((*flags & SB_RDONLY) || !test_opt(sbi, BG_GC)) {
>>>> -		if (sbi->gc_thread) {
>>>> +		if (GC_I(sbi)->f2fs_gc_task) {
>>>>  			stop_gc_thread(sbi);
>>>>  			need_restart_gc = true;
>>>>  		}
>>>> -	} else if (!sbi->gc_thread) {
>>>> +	} else if (!GC_I(sbi)->f2fs_gc_task) {
>>>>  		err = start_gc_thread(sbi);
>>>>  		if (err)
>>>>  			goto restore_opts;
>>>> @@ -2771,6 +2774,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>  		goto free_meta_inode;
>>>>  	}
>>>>  
>>>> +	err = init_gc_context(sbi);
>>>> +	if (err)
>>>> +		goto free_checkpoint;
>>>> +
>>>>  	/* Initialize device list */
>>>>  	err = f2fs_scan_devices(sbi);
>>>>  	if (err) {
>>>> @@ -2981,6 +2988,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>  	destroy_segment_manager(sbi);
>>>>  free_devices:
>>>>  	destroy_device_list(sbi);
>>>> +	destroy_gc_context(sbi);
>>>> +free_checkpoint:
>>>>  	kfree(sbi->ckpt);
>>>>  free_meta_inode:
>>>>  	make_bad_inode(sbi->meta_inode);
>>>> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
>>>> index 2c53de9251be..fb3cd477d985 100644
>>>> --- a/fs/f2fs/sysfs.c
>>>> +++ b/fs/f2fs/sysfs.c
>>>> @@ -46,7 +46,7 @@ struct f2fs_attr {
>>>>  static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type)
>>>>  {
>>>>  	if (struct_type == GC_THREAD)
>>>> -		return (unsigned char *)sbi->gc_thread;
>>>> +		return (unsigned char *)GC_I(sbi);
>>>>  	else if (struct_type == SM_INFO)
>>>>  		return (unsigned char *)SM_I(sbi);
>>>>  	else if (struct_type == DCC_INFO)
>>>> @@ -252,9 +252,9 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>>>>  
>>>>  	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
>>>>  		f2fs_reset_iostat(sbi);
>>>> -	if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) {
>>>> -		sbi->gc_thread->gc_wake = 1;
>>>> -		wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head);
>>>> +	if (!strcmp(a->attr.name, "gc_urgent") && t == 1) {
>>>> +		GC_I(sbi)->gc_wake = 1;
>>>> +		wake_up_interruptible_all(&GC_I(sbi)->gc_wait_queue_head);
>>>>  		wake_up_discard_thread(sbi, true);
>>>>  	}
>>>>  
>>>> -- 
>>>> 2.15.0.55.gc2ece9dc4de6
>>>
>>> .
>>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-04-20  7:26 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-18  9:45 [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu
2018-04-18  9:45 ` [PATCH 2/5] f2fs: fix return value in f2fs_ioc_commit_atomic_write Chao Yu
2018-04-18  9:45 ` [PATCH 3/5] f2fs: avoid stucking GC due to atomic write Chao Yu
2018-04-20  3:12   ` Jaegeuk Kim
2018-04-20  3:22     ` Chao Yu
2018-04-20  3:27       ` Jaegeuk Kim
2018-04-18  9:45 ` [PATCH 4/5] f2fs: show GC failure info in debugfs Chao Yu
2018-04-20  3:15   ` Jaegeuk Kim
2018-04-20  3:24     ` Chao Yu
2018-04-18  9:45 ` [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer Chao Yu
2018-04-20  3:19   ` Jaegeuk Kim
2018-04-20  3:28     ` Chao Yu
2018-04-20  3:54       ` Jaegeuk Kim
2018-04-20  7:25         ` Chao Yu
2018-04-18  9:53 ` [PATCH 1/5] f2fs: fix race in between GC and atomic open Chao Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).