stgt.vger.kernel.org archive mirror
* [PATCH 0/6] sheepdog driver cleanup
@ 2016-09-06  7:37 Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 1/6] sheepdog: prevent double locking during inode reload Hitoshi Mitake
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake

This patchset does some cleaning of the sheepdog driver related to the
iSCSI multipath functionality.

Hitoshi Mitake (6):
  sheepdog: prevent double locking during inode reload
  sheepdog: serialize overwrapping request
  sheepdog: pass a correct flag to reload_inode()
  sheepdog: handle a case of snapshot -> failover
  sheepdog: don't let ai have min and max dirty data indexes
  sheepdog: handle an inconsistent state of metadata

 usr/bs_sheepdog.c | 182 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 153 insertions(+), 29 deletions(-)

-- 
2.7.4

* [PATCH 1/6] sheepdog: prevent double locking during inode reload
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
@ 2016-09-06  7:37 ` Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 2/6] sheepdog: serialize overwrapping request Hitoshi Mitake
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake, Teruaki Ishizaki, Takashi Menjo

Commit 0c531dfe18f9b2af made the lock protecting the in-memory inode
object finer-grained and improved performance, but it also introduced
the possibility of double locking a single VDI.

The problem can arise in this sequence:
1. both thread A and thread B issue write requests to a VDI
2. the user makes a snapshot of the VDI
3. the VDI is now a readonly snapshot, so sheep returns SD_RES_READONLY
4. both A and B call find_vdi_name() to obtain the new ID of the
   working VDI, which results in double locking

This patch resolves the problem. Each thread keeps a version number of
the inode object in its TLS, and the access info struct holds one as
well. When a thread needs to reload the inode object, it first compares
its number with the one in the access info. If they match, no other
thread has reloaded the object, so the thread reloads it and increments
both numbers. If they differ, another thread has already reloaded the
object, so the thread only copies the new number and doesn't call
find_vdi_name(), which avoids the double locking.
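
The version check can be sketched in isolation as follows. This is only
an illustration under stated assumptions, not the driver code: the
struct shared_state type, its reload_count field and the reload_once()
helper are hypothetical names, and the caller's private version is
passed by pointer instead of living in a __thread variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared access info; not the driver's struct. */
struct shared_state {
	pthread_mutex_t version_mutex;
	uint64_t version;	/* bumped on every real reload */
	unsigned reload_count;	/* illustration only: number of real reloads */
};

/*
 * Reload only if nobody else did so since this caller last looked.
 * *seen is the caller's private copy of the version (thread-local in
 * the patch). Returns 1 if this call reloaded, 0 if it was skipped.
 */
static int reload_once(struct shared_state *s, uint64_t *seen)
{
	int did_reload = 0;

	assert(s != NULL && seen != NULL);

	pthread_mutex_lock(&s->version_mutex);
	if (*seen != s->version) {
		/* another caller already reloaded: just catch up */
		*seen = s->version;
	} else {
		s->reload_count++;	/* the expensive reload would go here */
		(*seen)++;
		s->version = *seen;
		did_reload = 1;
	}
	pthread_mutex_unlock(&s->version_mutex);

	return did_reload;
}
```

With two callers that both start at version 0, the first call reloads
and the second one only catches up, so the reload runs once.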

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Tested-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 usr/bs_sheepdog.c | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index cfaf48b..21d7dac 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -291,6 +291,9 @@ struct sheepdog_access_info {
 
 	struct sheepdog_inode inode;
 	pthread_rwlock_t inode_lock;
+
+	pthread_mutex_t inode_version_mutex;
+	uint64_t inode_version;
 };
 
 static inline int is_data_obj_writeable(struct sheepdog_inode *inode,
@@ -656,37 +659,58 @@ static int read_object(struct sheepdog_access_info *ai, char *buf, uint64_t oid,
 
 static int reload_inode(struct sheepdog_access_info *ai, int is_snapshot)
 {
-	int ret, need_reload = 0;
+	int ret = 0, need_reload = 0;
 	char tag[SD_MAX_VDI_TAG_LEN];
 	uint32_t vid;
 
+	static __thread uint64_t inode_version;
+
+	pthread_mutex_lock(&ai->inode_version_mutex);
+
+	if (inode_version != ai->inode_version) {
+		/* some other threads reloaded inode */
+		inode_version = ai->inode_version;
+		goto ret;
+	}
+
 	if (is_snapshot) {
 		memset(tag, 0, sizeof(tag));
 
 		ret = find_vdi_name(ai, ai->inode.name, CURRENT_VDI_ID, tag,
 				    &vid, 0);
-		if (ret)
-			return -1;
+		if (ret) {
+			ret = -1;
+			goto ret;
+		}
 
 		ret = read_object(ai, (char *)&ai->inode, vid_to_vdi_oid(vid),
 				  ai->inode.nr_copies,
 				  offsetof(struct sheepdog_inode, data_vdi_id),
 				  0, &need_reload);
-		if (ret)
-			return -1;
+		if (ret) {
+			ret = -1;
+			goto ret;
+		}
 	} else {
 		ret = read_object(ai, (char *)&ai->inode,
 				  vid_to_vdi_oid(ai->inode.vdi_id),
 				  ai->inode.nr_copies, SD_INODE_SIZE, 0,
 				  &need_reload);
-		if (ret)
-			return -1;
+		if (ret) {
+			ret = -1;
+			goto ret;
+		}
 	}
 
 	ai->min_dirty_data_idx = UINT32_MAX;
 	ai->max_dirty_data_idx = 0;
 
-	return 0;
+	inode_version++;
+	ai->inode_version = inode_version;
+
+ret:
+	pthread_mutex_unlock(&ai->inode_version_mutex);
+	return ret;
 }
 
 static int read_write_object(struct sheepdog_access_info *ai, char *buf,
@@ -1426,6 +1450,7 @@ static tgtadm_err bs_sheepdog_init(struct scsi_lu *lu, char *bsopts)
 	INIT_LIST_HEAD(&ai->fd_list_head);
 	pthread_rwlock_init(&ai->fd_list_lock, NULL);
 	pthread_rwlock_init(&ai->inode_lock, NULL);
+	pthread_mutex_init(&ai->inode_version_mutex, NULL);
 
 	return bs_thread_open(info, bs_sheepdog_request, nr_iothreads);
 }
-- 
2.7.4

* [PATCH 2/6] sheepdog: serialize overwrapping request
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 1/6] sheepdog: prevent double locking during inode reload Hitoshi Mitake
@ 2016-09-06  7:37 ` Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 3/6] sheepdog: pass a correct flag to reload_inode() Hitoshi Mitake
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake, Teruaki Ishizaki, Takashi Menjo

iSCSI initiators can issue read and write requests that target
overlapping areas. In such a case, the current sheepdog driver does not
serialize these requests, so the semantics of the virtual drive are
broken. This patch implements a mechanism for serializing such
requests.
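
The overlap test at the heart of the serialization can be sketched on
its own. This is a minimal illustration, not the patch itself: struct
idx_range, extent_to_range() and ranges_overlap() are hypothetical
names, and the sketch computes an exact inclusive upper index, whereas
the patch rounds the upper index up and is therefore slightly more
conservative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range of object indexes touched by one request (hypothetical). */
struct idx_range {
	uint64_t min_idx, max_idx;
};

/* Map a byte extent onto the object indexes it covers; length must be > 0. */
static struct idx_range extent_to_range(uint64_t offset, uint64_t length,
					uint32_t object_size)
{
	struct idx_range r;

	assert(object_size > 0 && length > 0);
	r.min_idx = offset / object_size;
	r.max_idx = (offset + length - 1) / object_size;
	return r;
}

/* Two inclusive ranges overlap unless one ends before the other begins. */
static bool ranges_overlap(const struct idx_range *a,
			   const struct idx_range *b)
{
	return !(a->max_idx < b->min_idx || b->max_idx < a->min_idx);
}
```

A request would wait while ranges_overlap() is true for any in-flight
range and register its own range once none overlaps.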

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Tested-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 usr/bs_sheepdog.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index 21d7dac..be6d321 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -294,6 +294,10 @@ struct sheepdog_access_info {
 
 	pthread_mutex_t inode_version_mutex;
 	uint64_t inode_version;
+
+	struct list_head inflight_list_head;
+	pthread_mutex_t inflight_list_mutex;
+	pthread_cond_t inflight_list_cond;
 };
 
 static inline int is_data_obj_writeable(struct sheepdog_inode *inode,
@@ -1252,6 +1256,11 @@ trans_to_expect_nothing:
 		goto out;
 
 	ret = 0;
+
+	INIT_LIST_HEAD(&ai->inflight_list_head);
+	pthread_mutex_init(&ai->inflight_list_mutex, NULL);
+	pthread_cond_init(&ai->inflight_list_cond, NULL);
+
 out:
 	strcpy(filename, orig_filename);
 	free(orig_filename);
@@ -1349,6 +1358,41 @@ out:
 	return ret;
 }
 
+struct inflight_thread {
+	unsigned long min_idx, max_idx;
+	struct list_head list;
+};
+
+static void inflight_block(struct sheepdog_access_info *ai,
+			   struct inflight_thread *myself)
+{
+	struct inflight_thread *inflight;
+
+	pthread_mutex_lock(&ai->inflight_list_mutex);
+
+retry:
+	list_for_each_entry(inflight, &ai->inflight_list_head, list) {
+		if (!(myself->max_idx < inflight->min_idx ||
+		      inflight->max_idx < myself->min_idx)) {
+			pthread_cond_wait(&ai->inflight_list_cond,
+					  &ai->inflight_list_mutex);
+			goto retry;
+		}
+	}
+
+	list_add_tail(&myself->list, &ai->inflight_list_head);
+	pthread_mutex_unlock(&ai->inflight_list_mutex);
+}
+static void inflight_release(struct sheepdog_access_info *ai,
+			     struct inflight_thread *myself)
+{
+	pthread_mutex_lock(&ai->inflight_list_mutex);
+	list_del(&myself->list);
+	pthread_mutex_unlock(&ai->inflight_list_mutex);
+
+	pthread_cond_signal(&ai->inflight_list_cond);
+}
+
 static void bs_sheepdog_request(struct scsi_cmd *cmd)
 {
 	int ret = 0;
@@ -1360,6 +1404,13 @@ static void bs_sheepdog_request(struct scsi_cmd *cmd)
 	struct sheepdog_access_info *ai =
 		(struct sheepdog_access_info *)(info + 1);
 
+	uint32_t object_size = (UINT32_C(1) << ai->inode.block_size_shift);
+	struct inflight_thread myself;
+	int inflight = 0;
+
+	memset(&myself, 0, sizeof(myself));
+	INIT_LIST_HEAD(&myself.list);
+
 	switch (cmd->scb[0]) {
 	case SYNCHRONIZE_CACHE:
 	case SYNCHRONIZE_CACHE_16:
@@ -1383,6 +1434,13 @@ static void bs_sheepdog_request(struct scsi_cmd *cmd)
 		}
 
 		length = scsi_get_out_length(cmd);
+
+		myself.min_idx = cmd->offset / object_size;
+		myself.max_idx = (cmd->offset + length + (object_size - 1))
+			/ object_size;
+		inflight_block(ai, &myself);
+		inflight = 1;
+
 		ret = sd_io(ai, 1, scsi_get_out_buffer(cmd),
 			    length, cmd->offset);
 
@@ -1394,6 +1452,13 @@ static void bs_sheepdog_request(struct scsi_cmd *cmd)
 	case READ_12:
 	case READ_16:
 		length = scsi_get_in_length(cmd);
+
+		myself.min_idx = cmd->offset / object_size;
+		myself.max_idx = (cmd->offset + length + (object_size - 1))
+			/ object_size;
+		inflight_block(ai, &myself);
+		inflight = 1;
+
 		ret = sd_io(ai, 0, scsi_get_in_buffer(cmd),
 			    length, cmd->offset);
 		if (ret)
@@ -1413,6 +1478,9 @@ static void bs_sheepdog_request(struct scsi_cmd *cmd)
 			cmd, cmd->scb[0], ret, length, cmd->offset);
 		sense_data_build(cmd, key, asc);
 	}
+
+	if (inflight)
+		inflight_release(ai, &myself);
 }
 
 static int bs_sheepdog_open(struct scsi_lu *lu, char *path,
-- 
2.7.4

* [PATCH 3/6] sheepdog: pass a correct flag to reload_inode()
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 1/6] sheepdog: prevent double locking during inode reload Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 2/6] sheepdog: serialize overwrapping request Hitoshi Mitake
@ 2016-09-06  7:37 ` Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 4/6] sheepdog: handle a case of snapshot -> failover Hitoshi Mitake
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake, Teruaki Ishizaki, Takashi Menjo

The current usage of reload_inode() is wrong because the read path
always calls it with is_snapshot == 0, even when the VDI has become
readonly. This patch fixes the problem.

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Tested-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 usr/bs_sheepdog.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index be6d321..ecb5033 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -905,6 +905,7 @@ static int sd_io(struct sheepdog_access_info *ai, int write, char *buf, int len,
 	int need_update_inode = 0, need_reload_inode;
 	int nr_copies = ai->inode.nr_copies;
 	int need_write_lock, check_idx;
+	int read_reload_snap = 0;
 
 	goto do_req;
 
@@ -912,7 +913,7 @@ reload_in_read_path:
 	pthread_rwlock_unlock(&ai->inode_lock); /* unlock current read lock */
 
 	pthread_rwlock_wrlock(&ai->inode_lock);
-	ret = reload_inode(ai, 0);
+	ret = reload_inode(ai, read_reload_snap);
 	if (ret) {
 		eprintf("failed to reload in read path\n");
 		goto out;
@@ -1008,6 +1009,8 @@ retry:
 					dprintf("reload in read path for not"\
 						" written area\n");
 					size = orig_size;
+					read_reload_snap =
+						need_reload_inode == 1;
 					goto reload_in_read_path;
 				}
 			}
@@ -1019,6 +1022,7 @@ retry:
 			if (need_reload_inode) {
 				dprintf("reload in ordinal read path\n");
 				size = orig_size;
+				read_reload_snap = need_reload_inode == 1;
 				goto reload_in_read_path;
 			}
 		}
-- 
2.7.4

* [PATCH 4/6] sheepdog: handle a case of snapshot -> failover
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
                   ` (2 preceding siblings ...)
  2016-09-06  7:37 ` [PATCH 3/6] sheepdog: pass a correct flag to reload_inode() Hitoshi Mitake
@ 2016-09-06  7:37 ` Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 5/6] sheepdog: don't let ai have min and max dirty data indexes Hitoshi Mitake
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake, Teruaki Ishizaki, Takashi Menjo

If the refreshed inode is a snapshot, we can reload the entire inode
and switch to the working VDI immediately.

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Tested-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 usr/bs_sheepdog.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index ecb5033..8f228b5 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -704,6 +704,34 @@ static int reload_inode(struct sheepdog_access_info *ai, int is_snapshot)
 			ret = -1;
 			goto ret;
 		}
+
+		if (!!ai->inode.snap_ctime) {
+			/*
+			 * This is a case like below:
+			 * take snapshot -> write something -> failover
+			 *
+			 * Because invalidated inode is readonly and latest
+			 * working VDI can have COWed objects, we need to
+			 * resolve VID and reload its entire inode object.
+			 */
+			memset(tag, 0, sizeof(tag));
+
+			ret = find_vdi_name(ai, ai->inode.name, CURRENT_VDI_ID,
+					    tag, &vid, 0);
+			if (ret) {
+				ret = -1;
+				goto ret;
+			}
+
+			ret = read_object(ai, (char *)&ai->inode,
+					  vid_to_vdi_oid(vid),
+					  ai->inode.nr_copies, SD_INODE_SIZE, 0,
+					  &need_reload);
+			if (ret) {
+				ret = -1;
+				goto ret;
+			}
+		}
 	}
 
 	ai->min_dirty_data_idx = UINT32_MAX;
-- 
2.7.4

* [PATCH 5/6] sheepdog: don't let ai have min and max dirty data indexes
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
                   ` (3 preceding siblings ...)
  2016-09-06  7:37 ` [PATCH 4/6] sheepdog: handle a case of snapshot -> failover Hitoshi Mitake
@ 2016-09-06  7:37 ` Hitoshi Mitake
  2016-09-06  7:37 ` [PATCH 6/6] sheepdog: handle an inconsistent state of metadata Hitoshi Mitake
  2016-09-07  0:24 ` [PATCH 0/6] sheepdog driver cleanup FUJITA Tomonori
  6 siblings, 0 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake, Teruaki Ishizaki, Takashi Menjo

Like the qemu driver [1], this patch decouples the min and max dirty
data indexes from the access info so that parallel requests cannot
overwrite each other's values.

[1] https://github.com/codyprime/qemu-kvm-jtc/commit/498f21405a286f718a0767c791b7d2db19f4e5bd
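
The idea of keeping the dirty window per request can be sketched as a
small value type that each request owns. This is an illustration under
assumptions, not the patch: struct dirty_range and its helpers are
hypothetical names (the patch simply uses two local uint32_t variables
in sd_io()), but the empty-window convention of min > max follows the
"max < min" check in update_inode().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-request dirty window; empty while min_idx > max_idx (hypothetical). */
struct dirty_range {
	uint32_t min_idx, max_idx;
};

/* Start with an empty window, as the patch does with UINT32_MAX and 0. */
static void dirty_range_init(struct dirty_range *d)
{
	assert(d != NULL);
	d->min_idx = UINT32_MAX;
	d->max_idx = 0;
}

/* Widen the window so that it includes idx. */
static void dirty_range_mark(struct dirty_range *d, uint32_t idx)
{
	assert(d != NULL);
	if (idx < d->min_idx)
		d->min_idx = idx;
	if (idx > d->max_idx)
		d->max_idx = idx;
}

/* An empty window means there is nothing to write back. */
static int dirty_range_is_empty(const struct dirty_range *d)
{
	assert(d != NULL);
	return d->max_idx < d->min_idx;
}
```

Because every request owns its own window, a concurrent request cannot
reset or widen the range that another request is about to write back.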

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Tested-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 usr/bs_sheepdog.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index 8f228b5..12bcfd9 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -286,9 +286,6 @@ struct sheepdog_access_info {
 	struct list_head fd_list_head;
 	pthread_rwlock_t fd_list_lock;
 
-	uint32_t min_dirty_data_idx;
-	uint32_t max_dirty_data_idx;
-
 	struct sheepdog_inode inode;
 	pthread_rwlock_t inode_lock;
 
@@ -734,9 +731,6 @@ static int reload_inode(struct sheepdog_access_info *ai, int is_snapshot)
 		}
 	}
 
-	ai->min_dirty_data_idx = UINT32_MAX;
-	ai->max_dirty_data_idx = 0;
-
 	inode_version++;
 	ai->inode_version = inode_version;
 
@@ -854,14 +848,11 @@ static int sd_sync(struct sheepdog_access_info *ai)
 	}
 }
 
-static int update_inode(struct sheepdog_access_info *ai)
+static int update_inode(struct sheepdog_access_info *ai, uint32_t min, uint32_t max)
 {
 	int ret = 0, need_reload_inode = 0;
 	uint64_t oid = vid_to_vdi_oid(ai->inode.vdi_id);
-	uint32_t min, max, offset, data_len;
-
-	min = ai->min_dirty_data_idx;
-	max = ai->max_dirty_data_idx;
+	uint32_t offset, data_len;
 
 	if (max < min)
 		goto end;
@@ -890,8 +881,6 @@ update:
 	}
 
 end:
-	ai->min_dirty_data_idx = UINT32_MAX;
-	ai->max_dirty_data_idx = 0;
 
 	return ret;
 }
@@ -934,6 +923,7 @@ static int sd_io(struct sheepdog_access_info *ai, int write, char *buf, int len,
 	int nr_copies = ai->inode.nr_copies;
 	int need_write_lock, check_idx;
 	int read_reload_snap = 0;
+	uint32_t min_dirty_data_idx = UINT32_MAX, max_dirty_data_idx = 0;
 
 	goto do_req;
 
@@ -1015,12 +1005,12 @@ retry:
 				}
 
 				if (create) {
-					ai->min_dirty_data_idx =
+					min_dirty_data_idx =
 						min_t(uint32_t, idx,
-						      ai->min_dirty_data_idx);
-					ai->max_dirty_data_idx =
+						      min_dirty_data_idx);
+					max_dirty_data_idx =
 						max_t(uint32_t, idx,
-						      ai->max_dirty_data_idx);
+						      max_dirty_data_idx);
 					ai->inode.data_vdi_id[idx] = vid;
 
 					need_update_inode = 1;
@@ -1066,7 +1056,7 @@ done:
 	}
 
 	if (need_update_inode)
-		ret = update_inode(ai);
+		ret = update_inode(ai, min_dirty_data_idx, max_dirty_data_idx);
 
 out:
 	pthread_rwlock_unlock(&ai->inode_lock);
@@ -1279,9 +1269,6 @@ trans_to_expect_nothing:
 	if (ret)
 		goto out;
 
-	ai->min_dirty_data_idx = UINT32_MAX;
-	ai->max_dirty_data_idx = 0;
-
 	ret = read_object(ai, (char *)&ai->inode, vid_to_vdi_oid(vid),
 			  0, SD_INODE_SIZE, 0, &need_reload);
 	if (ret)
-- 
2.7.4

* [PATCH 6/6] sheepdog: handle an inconsistent state of metadata
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
                   ` (4 preceding siblings ...)
  2016-09-06  7:37 ` [PATCH 5/6] sheepdog: don't let ai have min and max dirty data indexes Hitoshi Mitake
@ 2016-09-06  7:37 ` Hitoshi Mitake
  2016-09-07  0:24 ` [PATCH 0/6] sheepdog driver cleanup FUJITA Tomonori
  6 siblings, 0 replies; 8+ messages in thread
From: Hitoshi Mitake @ 2016-09-06  7:37 UTC (permalink / raw)
  To: stgt; +Cc: Hitoshi Mitake, Teruaki Ishizaki, Takashi Menjo

This commit lets the sheepdog driver handle the case where no VDI is
found during inode reloading. This can happen because sheepdog does not
provide metadata transactions.
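
The retry decision can be sketched as a small predicate. This is a
hedged illustration rather than the patch: SKETCH_VDI_BIT, is_vdi_obj()
and should_retry_no_obj() are hypothetical names, and the attempt limit
is an assumption added for the sketch, since the patch itself retries
without a bound. Only the bit position (63) and the "read of an inode
object" condition come from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 is the bit the patch tests on the oid; the macro name is made up. */
#define SKETCH_VDI_BIT (UINT64_C(1) << 63)

/* True if the object ID has the bit that the patch checks for inode objects. */
static bool is_vdi_obj(uint64_t oid)
{
	return (oid & SKETCH_VDI_BIT) != 0;
}

/*
 * Decide whether a "no object" result is worth retrying: only reads of
 * inode objects qualify, and (an assumption of this sketch) only while
 * fewer than max_attempts tries have been made.
 */
static bool should_retry_no_obj(bool is_write, uint64_t oid,
				unsigned attempts, unsigned max_attempts)
{
	assert(max_attempts > 0);
	return !is_write && is_vdi_obj(oid) && attempts < max_attempts;
}
```

A bounded variant like this trades the patch's simplicity for a
guaranteed exit if the new inode object never appears.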

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Tested-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
---
 usr/bs_sheepdog.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/usr/bs_sheepdog.c b/usr/bs_sheepdog.c
index 12bcfd9..fc91ad1 100644
--- a/usr/bs_sheepdog.c
+++ b/usr/bs_sheepdog.c
@@ -750,6 +750,7 @@ static int read_write_object(struct sheepdog_access_info *ai, char *buf,
 	unsigned int wlen, rlen;
 	int ret;
 
+retry:
 	memset(&hdr, 0, sizeof(hdr));
 
 	hdr.proto_ver = SD_PROTO_VER;
@@ -791,6 +792,17 @@ static int read_write_object(struct sheepdog_access_info *ai, char *buf,
 	case SD_RES_READONLY:
 		*need_reload = 1;
 		return 0;
+	case SD_RES_NO_OBJ:
+		if (!write && oid & (UINT64_C(1) << 63))
+			/*
+			 * sheepdog doesn't provide a mechanism of metadata
+			 * transaction, so tgt can see an inconsistent state
+			 * like this (old working VDI became snapshot already
+			 * but an inode object of new working VDI isn't
+			 * created yet).
+			 */
+			goto retry;
+		return -1;
 	default:
 		eprintf("%s (oid: %" PRIx64 ", old_oid: %" PRIx64 ")\n",
 			sd_strerror(rsp->result), oid, old_oid);
-- 
2.7.4

* Re: [PATCH 0/6] sheepdog driver cleanup
  2016-09-06  7:37 [PATCH 0/6] sheepdog driver cleanup Hitoshi Mitake
                   ` (5 preceding siblings ...)
  2016-09-06  7:37 ` [PATCH 6/6] sheepdog: handle an inconsistent state of metadata Hitoshi Mitake
@ 2016-09-07  0:24 ` FUJITA Tomonori
  6 siblings, 0 replies; 8+ messages in thread
From: FUJITA Tomonori @ 2016-09-07  0:24 UTC (permalink / raw)
  To: mitake.hitoshi; +Cc: stgt

On Tue,  6 Sep 2016 16:37:23 +0900
Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp> wrote:

> This patchset does some cleaning of the sheepdog driver related to the
> iSCSI multipath functionality.
> 
> Hitoshi Mitake (6):
>   sheepdog: prevent double locking during inode reload
>   sheepdog: serialize overwrapping request
>   sheepdog: pass a correct flag to reload_inode()
>   sheepdog: handle a case of snapshot -> failover
>   sheepdog: don't let ai have min and max dirty data indexes
>   sheepdog: handle an inconsistent state of metadata
> 
>  usr/bs_sheepdog.c | 182 +++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 153 insertions(+), 29 deletions(-)

Applied, thanks!
