All of lore.kernel.org
 help / color / mirror / Atom feed
* [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes
@ 2019-07-16 21:00 Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 1/6] fuse: Continue to wait if inline reclaim returns null Vivek Goyal
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

Hi Liu Bo,

I am proposing following patches and fixes. I have modified your patch
a bit to use refcount_t. 

I have done few other cleanup and fixes as well.

If things look good, I will squash these patches into exisitng patches.

Thanks
Vivek

Vivek Goyal (6):
  fuse: Continue to wait if inline reclaim returns null
  fuse: Requeue work if number of free ranges is not above threshold
  fuse: Do cond_resched() if we can't get inode lock
  fuse: Get rid of inode lock in range reclaim path
  fuse: Get rid of -EAGAIN logic in dmap free worker thread
  fuse: Dax read can now do range reclaim inline

 fs/fuse/file.c   | 161 ++++++++++++++++++++++++++---------------------
 fs/fuse/fuse_i.h |   3 +
 fs/fuse/inode.c  |   1 +
 3 files changed, 93 insertions(+), 72 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Virtio-fs] [PATCH 1/6] fuse: Continue to wait if inline reclaim returns null
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
@ 2019-07-16 21:00 ` Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 2/6] fuse: Requeue work if number of free ranges is not above threshold Vivek Goyal
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

It is possible that inline reclaim returns null. Continue to wait for free
range in that case.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index d9bada37d8ef..310fb22b31bd 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -4074,8 +4074,11 @@ static struct fuse_dax_mapping *alloc_dax_mapping_reclaim(struct fuse_conn *fc,
 		if (dmap)
 			return dmap;
 
-		if (fi->nr_dmaps)
-			return inode_reclaim_first_dmap(fc, inode);
+		if (fi->nr_dmaps) {
+			dmap = inode_reclaim_first_dmap(fc, inode);
+			if (dmap)
+				return dmap;
+		}
 		/*
 		 * There are no mappings which can be reclaimed.
 		 * Wait for one.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Virtio-fs] [PATCH 2/6] fuse: Requeue work if number of free ranges is not above threshold
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 1/6] fuse: Continue to wait if inline reclaim returns null Vivek Goyal
@ 2019-07-16 21:00 ` Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 3/6] fuse: Do cond_resched() if we can't get inode lock Vivek Goyal
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

If number of free ranges is not above threshold, requeue work (from
worker context itself). This will make sure worker continues to run
till we hit threshold.

Otherwise, we could be below threshold and still worker exits until
and some memory allocation kicks the worker again.

As a side affect, it also solves one race where where with nr_dmaps=1,
it is possible that worker starts but exits as there are no busy ranges,
and soon after busy range count goes up (dmap has been taken off free
list already). In that case, worker never runs again, sleepers continue
to sleep and caller who added dmap to busy list has exited. Busy range
remains in busy list of the inode and application hangs.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 310fb22b31bd..6e351fed6d65 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -179,9 +179,30 @@ static void fuse_link_write_file(struct file *file)
 	spin_unlock(&fi->lock);
 }
 
-static struct fuse_dax_mapping *alloc_dax_mapping(struct fuse_conn *fc)
+static void
+__kick_dmap_free_worker(struct fuse_conn *fc, unsigned long delay_ms)
 {
 	unsigned long free_threshold;
+
+	/* If number of free ranges are below threshold, start reclaim */
+	free_threshold = max((fc->nr_ranges * FUSE_DAX_RECLAIM_THRESHOLD)/100,
+				(unsigned long)1);
+	if (fc->nr_free_ranges < free_threshold) {
+		pr_debug("fuse: Kicking dax memory reclaim worker. nr_free_ranges=0x%ld nr_total_ranges=%ld\n", fc->nr_free_ranges, fc->nr_ranges);
+		queue_delayed_work(system_long_wq, &fc->dax_free_work,
+				   msecs_to_jiffies(delay_ms));
+	}
+}
+
+static void kick_dmap_free_worker(struct fuse_conn *fc, unsigned long delay_ms)
+{
+	spin_lock(&fc->lock);
+	__kick_dmap_free_worker(fc, delay_ms);
+	spin_unlock(&fc->lock);
+}
+
+static struct fuse_dax_mapping *alloc_dax_mapping(struct fuse_conn *fc)
+{
 	struct fuse_dax_mapping *dmap = NULL;
 
 	spin_lock(&fc->lock);
@@ -202,13 +223,7 @@ static struct fuse_dax_mapping *alloc_dax_mapping(struct fuse_conn *fc)
 	spin_unlock(&fc->lock);
 
 out_kick:
-	/* If number of free ranges are below threshold, start reclaim */
-	free_threshold = max((fc->nr_ranges * FUSE_DAX_RECLAIM_THRESHOLD)/100,
-				(unsigned long)1);
-	if (fc->nr_free_ranges < free_threshold) {
-		pr_debug("fuse: Kicking dax memory reclaim worker. nr_free_ranges=0x%ld nr_total_ranges=%ld\n", fc->nr_free_ranges, fc->nr_ranges);
-		queue_delayed_work(system_long_wq, &fc->dax_free_work, 0);
-	}
+	kick_dmap_free_worker(fc, 0);
 	return dmap;
 }
 
@@ -4170,6 +4185,11 @@ static int try_to_free_dmap_chunks(struct fuse_conn *fc,
 		dmap = NULL;
 		spin_lock(&fc->lock);
 
+		if (!fc->nr_busy_ranges) {
+			spin_unlock(&fc->lock);
+			return 0;
+		}
+
 		list_for_each_entry_safe(pos, temp, &fc->busy_ranges,
 						busy_list) {
 			inode = igrab(pos->inode);
@@ -4228,4 +4248,7 @@ void fuse_dax_free_mem_worker(struct work_struct *work)
 		pr_debug("fuse: try_to_free_dmap_chunks() failed with err=%d\n",
 			 ret);
 	}
+
+	/* If number of free ranges are still below threhold, requeue */
+	kick_dmap_free_worker(fc, 1);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Virtio-fs] [PATCH 3/6] fuse: Do cond_resched() if we can't get inode lock
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 1/6] fuse: Continue to wait if inline reclaim returns null Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 2/6] fuse: Requeue work if number of free ranges is not above threshold Vivek Goyal
@ 2019-07-16 21:00 ` Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 4/6] fuse: Get rid of inode lock in range reclaim path Vivek Goyal
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

cond_resched() might be better as opposed

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 6e351fed6d65..fc40e0f44578 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -4167,7 +4167,7 @@ static int try_to_free_dmap_chunks(struct fuse_conn *fc,
 				   unsigned long nr_to_free)
 {
 	struct fuse_dax_mapping *dmap, *pos, *temp;
-	int ret, nr_freed = 0, nr_eagain = 0;
+	int ret, nr_freed = 0;
 	u64 dmap_start = 0, window_offset = 0;
 	struct inode *inode = NULL;
 
@@ -4176,12 +4176,6 @@ static int try_to_free_dmap_chunks(struct fuse_conn *fc,
 		if (nr_freed >= nr_to_free)
 			break;
 
-		if (nr_eagain > 20) {
-			queue_delayed_work(system_long_wq, &fc->dax_free_work,
-						msecs_to_jiffies(10));
-			return 0;
-		}
-
 		dmap = NULL;
 		spin_lock(&fc->lock);
 
@@ -4225,7 +4219,7 @@ static int try_to_free_dmap_chunks(struct fuse_conn *fc,
 
 		/* Could not get inode lock. Try next element */
 		if (ret == -EAGAIN) {
-			nr_eagain++;
+			cond_resched();
 			continue;
 		}
 		nr_freed++;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Virtio-fs] [PATCH 4/6] fuse: Get rid of inode lock in range reclaim path
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
                   ` (2 preceding siblings ...)
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 3/6] fuse: Do cond_resched() if we can't get inode lock Vivek Goyal
@ 2019-07-16 21:00 ` Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 5/6] fuse: Get rid of -EAGAIN logic in dmap free worker thread Vivek Goyal
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

With free fuse dax mapping reducing, read performance is impacted
significantly because reads need to wait for a free fuse dax mapping.

Although reads will trigger reclaim work to try to reclaim fuse dax
mapping, reclaim code can barely make any progress if most of fuse dax
mappings are used by the file we're reading since inode lock is required
by reclaim code.

However, we don't have to take inode lock for reclaiming if dax mapping
has its own reference count, reference counting is to tell reclaim code to
skip those in use dax mappings, such that we can avoid the risk of
accidentally reclaiming a dax mapping that other readers are using.

On the other hand, holding ->i_dmap_sem during reclaim can be used to
prevent the follwing reads to get a dax mapping under reclaim.

Another reason is that reads/writes only use fuse dax mapping within
dax_iomap_rw(), so we can do such a trick, while mmap/faulting is a
different story and we have to take ->i_mmap_sem prior to reclaiming a dax
mapping in order to avoid the race.

This adds reference count for fuse dax mapping and removes the acquisition
of inode lock during reclaim.


RESULTS:

virtiofsd -cache_size=2G

vanilla kernel: IOPS=378
patched kernel: IOPS=4508


*********************************
$ cat fio-rand-read.job
; fio-rand-read.job for fiotest

[global]
name=fio-rand-read
filename=fio_file
rw=randread
bs=4K
direct=1
numjobs=1
time_based=1
runtime=120
directory=/mnt/test/
fsync=1
group_reporting=1

[file1]
size=5G
# use sync/libaio
ioengine=sync
iodepth=1


Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c   | 52 +++++++++++++++++++++++++++++++++++-------------
 fs/fuse/fuse_i.h |  3 +++
 fs/fuse/inode.c  |  1 +
 3 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index fc40e0f44578..bf9903a858db 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1885,6 +1885,18 @@ static void fuse_fill_iomap(struct inode *inode, loff_t pos, loff_t length,
 		if (flags & IOMAP_FAULT)
 			iomap->length = ALIGN(len, PAGE_SIZE);
 		iomap->type = IOMAP_MAPPED;
+
+		/*
+		 * increace refcnt so that reclaim code knows this dmap is in
+		 * use. This assumes i_dmap_sem mutex is held either
+		 * shared/exclusive.
+		 */
+		refcount_inc(&dmap->refcnt);
+
+		/* iomap->private should be NULL */
+		WARN_ON_ONCE(iomap->private);
+		iomap->private = dmap;
+
 		pr_debug("%s: returns iomap: addr 0x%llx offset 0x%llx"
 				" length 0x%llx\n", __func__, iomap->addr,
 				iomap->offset, iomap->length);
@@ -2014,6 +2026,16 @@ static int fuse_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 			  ssize_t written, unsigned flags,
 			  struct iomap *iomap)
 {
+	struct fuse_dax_mapping *dmap = iomap->private;
+
+	if (dmap) {
+		if (refcount_dec_and_test(&dmap->refcnt)) {
+			/* refcount should not hit 0. This object only goes
+			 * away when fuse connection goes away */
+			WARN_ON_ONCE(1);
+		}
+	}
+
 	/* DAX writes beyond end-of-file aren't handled using iomap, so the
 	 * file size is unchanged and there is nothing to do here.
 	 */
@@ -4009,6 +4031,10 @@ static int reclaim_one_dmap_locked(struct fuse_conn *fc, struct inode *inode,
 	int ret;
 	struct fuse_inode *fi = get_fuse_inode(inode);
 
+	/*
+	 * igrab() was done to make sure inode won't go under us, and this
+	 * further avoids the race with evict().
+	 */
 	ret = dmap_writeback_invalidate(inode, dmap);
 
 	/* TODO: What to do if above fails? For now,
@@ -4113,16 +4139,18 @@ static int lookup_and_reclaim_dmap_locked(struct fuse_conn *fc,
 	struct fuse_inode *fi = get_fuse_inode(inode);
 	struct fuse_dax_mapping *dmap;
 
-	WARN_ON(!inode_is_locked(inode));
-
 	/* Find fuse dax mapping at file offset inode. */
 	dmap = fuse_dax_interval_tree_iter_first(&fi->dmap_tree, dmap_start,
-							dmap_start);
+						 dmap_start);
 
 	/* Range already got cleaned up by somebody else */
 	if (!dmap)
 		return 0;
 
+	/* still in use. */
+	if (refcount_read(&dmap->refcnt) > 1)
+		return 0;
+
 	ret = reclaim_one_dmap_locked(fc, inode, dmap);
 	if (ret < 0)
 		return ret;
@@ -4137,10 +4165,9 @@ static int lookup_and_reclaim_dmap_locked(struct fuse_conn *fc,
 /*
  * Free a range of memory.
  * Locking.
- * 1. Take inode->i_rwsem to prever further read/write.
- * 2. Take fuse_inode->i_mmap_sem to block dax faults.
- * 3. Take fuse_inode->i_dmap_sem to protect interval tree. It might not
- *    be strictly necessary as lock 1 and 2 seem sufficient.
+ * 1. Take fuse_inode->i_mmap_sem to block dax faults.
+ * 2. Take fuse_inode->i_dmap_sem to protect interval tree and also to make
+ *    sure read/write can not reuse a dmap which we might be freeing.
  */
 static int lookup_and_reclaim_dmap(struct fuse_conn *fc, struct inode *inode,
 			    u64 dmap_start)
@@ -4148,18 +4175,11 @@ static int lookup_and_reclaim_dmap(struct fuse_conn *fc, struct inode *inode,
 	int ret;
 	struct fuse_inode *fi = get_fuse_inode(inode);
 
-	/*
-	 * If process is blocked waiting for memory while holding inode
-	 * lock, we will deadlock. So continue to free next range.
-	 */
-	if (!inode_trylock(inode))
-		return -EAGAIN;
 	down_write(&fi->i_mmap_sem);
 	down_write(&fi->i_dmap_sem);
 	ret = lookup_and_reclaim_dmap_locked(fc, inode, dmap_start);
 	up_write(&fi->i_dmap_sem);
 	up_write(&fi->i_mmap_sem);
-	inode_unlock(inode);
 	return ret;
 }
 
@@ -4186,6 +4206,10 @@ static int try_to_free_dmap_chunks(struct fuse_conn *fc,
 
 		list_for_each_entry_safe(pos, temp, &fc->busy_ranges,
 						busy_list) {
+			/* skip this range if it's in use. */
+			if (refcount_read(&pos->refcnt) > 1)
+				continue;
+
 			inode = igrab(pos->inode);
 			/*
 			 * This inode is going away. That will free
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index edcc4a3e119b..e1e58fbdb603 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -134,6 +134,9 @@ struct fuse_dax_mapping {
 
        /** Length of mapping, in bytes */
        loff_t length;
+
+	/* reference count when the mapping is used by dax iomap. */
+	refcount_t refcnt;
 };
 
 /** FUSE inode */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index ffa00caeea01..2d2748e47787 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -666,6 +666,7 @@ static int fuse_dax_mem_range_init(struct fuse_conn *fc,
 		range->window_offset = i * FUSE_DAX_MEM_RANGE_SZ;
 		range->length = FUSE_DAX_MEM_RANGE_SZ;
 		INIT_LIST_HEAD(&range->busy_list);
+		refcount_set(&range->refcnt, 1);
 		list_add_tail(&range->list, &mem_ranges);
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Virtio-fs] [PATCH 5/6] fuse: Get rid of -EAGAIN logic in dmap free worker thread
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
                   ` (3 preceding siblings ...)
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 4/6] fuse: Get rid of inode lock in range reclaim path Vivek Goyal
@ 2019-07-16 21:00 ` Vivek Goyal
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 6/6] fuse: Dax read can now do range reclaim inline Vivek Goyal
  2019-07-16 21:58 ` [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Liu Bo
  6 siblings, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

Now we are not doing inode trylock anymore. That means we will not get
-EAGAIN at high frequency. So get rid of this logic. Its not needed anymore.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index bf9903a858db..42b250d23888 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -4235,17 +4235,11 @@ static int try_to_free_dmap_chunks(struct fuse_conn *fc,
 
 		ret = lookup_and_reclaim_dmap(fc, inode, dmap_start);
 		iput(inode);
-		if (ret && ret != -EAGAIN) {
+		if (ret) {
 			printk("%s(window_offset=0x%llx) failed. err=%d\n",
 				__func__, window_offset, ret);
 			return ret;
 		}
-
-		/* Could not get inode lock. Try next element */
-		if (ret == -EAGAIN) {
-			cond_resched();
-			continue;
-		}
 		nr_freed++;
 	}
 	return 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Virtio-fs] [PATCH 6/6] fuse: Dax read can now do range reclaim inline
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
                   ` (4 preceding siblings ...)
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 5/6] fuse: Get rid of -EAGAIN logic in dmap free worker thread Vivek Goyal
@ 2019-07-16 21:00 ` Vivek Goyal
  2019-07-16 21:58 ` [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Liu Bo
  6 siblings, 0 replies; 8+ messages in thread
From: Vivek Goyal @ 2019-07-16 21:00 UTC (permalink / raw)
  To: virtio-fs

Now we don't take inode lock in range reclaim path. That means read path
which is holding shared inode lock can do inline dax range reclaim. Get
rid of logic where we dropped inode lock and waited for a range to become
free before retrying.

In read path inode lock is held shared. So make sure that dmap reference
count is 1 before we try to do reclaim it inline.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c | 47 +++++++++++++----------------------------------
 1 file changed, 13 insertions(+), 34 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 42b250d23888..6613d66de724 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1959,12 +1959,8 @@ static int fuse_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
 		if (pos >= i_size_read(inode))
 			goto iomap_hole;
 
-		/* Can't do reclaim in fault path yet due to lock ordering.
-		 * Read path takes shared inode lock and that's not sufficient
-		 * for inline range reclaim. Caller needs to drop lock, wait
-		 * and retry.
-		 */
-		if (flags & IOMAP_FAULT || !(flags & IOMAP_WRITE)) {
+		/* Can't do reclaim in fault path yet due to lock ordering. */
+		if (flags & IOMAP_FAULT) {
 			alloc_dmap = alloc_dax_mapping(fc);
 			if (!alloc_dmap)
 				return -ENOSPC;
@@ -2050,18 +2046,7 @@ static const struct iomap_ops fuse_iomap_ops = {
 static ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
-	struct fuse_conn *fc = get_fuse_conn(inode);
 	ssize_t ret;
-	bool retry = false;
-
-retry:
-	if (retry && !(fc->nr_free_ranges > 0)) {
-		ret = -EINTR;
-		if (wait_event_killable_exclusive(fc->dax_range_waitq,
-						  (fc->nr_free_ranges > 0))) {
-			goto out;
-		}
-	}
 
 	if (iocb->ki_flags & IOCB_NOWAIT) {
 		if (!inode_trylock_shared(inode))
@@ -2073,19 +2058,7 @@ static ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	ret = dax_iomap_rw(iocb, to, &fuse_iomap_ops);
 	inode_unlock_shared(inode);
 
-	/* If a dax range could not be allocated and it can't be reclaimed
-	 * inline, then drop inode lock and retry. Range reclaim logic
-	 * requires exclusive access to inode lock.
-	 *
-	 * TODO: What if -ENOSPC needs to be returned to user space. Fix it.
-	 */
-	if (ret == -ENOSPC) {
-		retry = true;
-		goto retry;
-	}
 	/* TODO file_accessed(iocb->f_filp) */
-
-out:
 	return ret;
 }
 
@@ -4056,16 +4029,24 @@ static int reclaim_one_dmap_locked(struct fuse_conn *fc, struct inode *inode,
 	return 0;
 }
 
+#define fuse_dax_interval_tree_foreach(dmap, root, start, last)           \
+	for (dmap = fuse_dax_interval_tree_iter_first(root, start, last); \
+	     dmap; dmap = fuse_dax_interval_tree_iter_next(dmap, start, last))
+
 /* First first mapping in the tree and free it. */
 static struct fuse_dax_mapping *
 inode_reclaim_first_dmap_locked(struct fuse_conn *fc, struct inode *inode)
 {
 	struct fuse_inode *fi = get_fuse_inode(inode);
-	struct fuse_dax_mapping *dmap;
+	struct fuse_dax_mapping *dmap = NULL;
 	int ret;
 
-	/* Find fuse dax mapping at file offset inode. */
-	dmap = fuse_dax_interval_tree_iter_first(&fi->dmap_tree, 0, -1);
+	fuse_dax_interval_tree_foreach(dmap, &fi->dmap_tree, 0, -1) {
+		if (refcount_read(&dmap->refcnt) > 1)
+			continue;
+		break;
+	}
+
 	if (!dmap)
 		return NULL;
 
@@ -4087,8 +4068,6 @@ inode_reclaim_first_dmap_locked(struct fuse_conn *fc, struct inode *inode)
 /*
  * First first mapping in the tree and free it and return it. Do not add
  * it back to free pool.
- *
- * This is called with inode lock held.
  */
 static struct fuse_dax_mapping *inode_reclaim_first_dmap(struct fuse_conn *fc,
 							 struct inode *inode)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes
  2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
                   ` (5 preceding siblings ...)
  2019-07-16 21:00 ` [Virtio-fs] [PATCH 6/6] fuse: Dax read can now do range reclaim inline Vivek Goyal
@ 2019-07-16 21:58 ` Liu Bo
  6 siblings, 0 replies; 8+ messages in thread
From: Liu Bo @ 2019-07-16 21:58 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs

On Tue, Jul 16, 2019 at 05:00:45PM -0400, Vivek Goyal wrote:
> Hi Liu Bo,
> 
> I am proposing following patches and fixes. I have modified your patch
> a bit to use refcount_t. 
> 
> I have done few other cleanup and fixes as well.
> 
> If things look good, I will squash these patches into exisitng patches.
>

Thanks for sending out the set, will get it into our testing.

thanks,
-liubo

> Thanks
> Vivek
> 
> Vivek Goyal (6):
>   fuse: Continue to wait if inline reclaim returns null
>   fuse: Requeue work if number of free ranges is not above threshold
>   fuse: Do cond_resched() if we can't get inode lock
>   fuse: Get rid of inode lock in range reclaim path
>   fuse: Get rid of -EAGAIN logic in dmap free worker thread
>   fuse: Dax read can now do range reclaim inline
> 
>  fs/fuse/file.c   | 161 ++++++++++++++++++++++++++---------------------
>  fs/fuse/fuse_i.h |   3 +
>  fs/fuse/inode.c  |   1 +
>  3 files changed, 93 insertions(+), 72 deletions(-)
> 
> -- 
> 2.20.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-07-16 21:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-16 21:00 [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Vivek Goyal
2019-07-16 21:00 ` [Virtio-fs] [PATCH 1/6] fuse: Continue to wait if inline reclaim returns null Vivek Goyal
2019-07-16 21:00 ` [Virtio-fs] [PATCH 2/6] fuse: Requeue work if number of free ranges is not above threshold Vivek Goyal
2019-07-16 21:00 ` [Virtio-fs] [PATCH 3/6] fuse: Do cond_resched() if we can't get inode lock Vivek Goyal
2019-07-16 21:00 ` [Virtio-fs] [PATCH 4/6] fuse: Get rid of inode lock in range reclaim path Vivek Goyal
2019-07-16 21:00 ` [Virtio-fs] [PATCH 5/6] fuse: Get rid of -EAGAIN logic in dmap free worker thread Vivek Goyal
2019-07-16 21:00 ` [Virtio-fs] [PATCH 6/6] fuse: Dax read can now do range reclaim inline Vivek Goyal
2019-07-16 21:58 ` [Virtio-fs] [PATCH 0/6] virtio-fs: Get rid of inode lock and misc fixes Liu Bo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.