All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 0/5] Dynamic growing data area support
@ 2017-03-08  8:45 lixiubo
  2017-03-08  8:45 ` [PATCHv2 1/5] target/user: Add dynamic growing data area feature support lixiubo
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: lixiubo @ 2017-03-08  8:45 UTC (permalink / raw)
  To: agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, Xiubo Li

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

These are based on Mike's leatest tcmu patches.

Changed for V2:
- The [PATCHv2 1/5] just fixes some small spelling and other mistakes.
  And as the initial patch, here sets cmd area to 8M and data area to
  1G(1M fixed and 1023M growing)

- The [PATCHv2 2/5] is a new one, adding global data block pool support.
  The max total size of the pool is 2G and all the targets will get
  growing blocks from here.
  Test this using multi-targets at the same time.

- The [PATCHv2 3/5] changed nothing, respin it to avoid the conflict.

- The [PATCHv2 4/5] and [PATCHv2 5/5] are new ones.


Xiubo Li (5):
  target/user: Add dynamic growing data area feature support
  target/user: Add global data block pool support
  target/user: Fix possible overwrite of t_data_sg's last iov[]
  target/user: Fix wrongly calculating of the base_command_size
  target/user: Clean up tcmu_queue_cmd_ring

 drivers/target/target_core_user.c | 573 +++++++++++++++++++++++++++++++-------
 1 file changed, 478 insertions(+), 95 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCHv2 1/5] target/user: Add dynamic growing data area feature support
  2017-03-08  8:45 [PATCHv2 0/5] Dynamic growing data area support lixiubo
@ 2017-03-08  8:45 ` lixiubo
  2017-03-08  8:45 ` [PATCHv2 2/5] target/user: Add global data block pool support lixiubo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: lixiubo @ 2017-03-08  8:45 UTC (permalink / raw)
  To: agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, Xiubo Li

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

Currently for the TCMU, the ring buffer size is fixed to 64K cmd
area + 1M data area, and this will be bottlenecks for high iops.

The struct tcmu_cmd_entry {} size is fixed about 112 bytes with
iovec[N] & N <= 4, and the size of struct iovec is about 16 bytes.

If N == 0, the ratio will be sizeof(cmd entry) : sizeof(datas) ==
112Bytes : (N * 4096)Bytes = 28 : 0, no data area is need.

If 0 < N <=4, the ratio will be sizeof(cmd entry) : sizeof(datas)
== 112Bytes : (N * 4096)Bytes = 28 : (N * 1024), so the max will
be 28 : 1024.

If N > 4, the sizeof(cmd entry) will be [(N - 4) *16 + 112] bytes,
and its corresponding data size will be [N * 4096], so the ratio
of sizeof(cmd entry) : sizeof(datas) == [(N - 4) * 16 + 112)Bytes
: (N * 4096)Bytes == 4/1024 - 12/(N * 1024), so the max is about
4 : 1024.

When N is bigger, the ratio will be smaller.

As the initial patch, we will set the cmd area size to 2M, and
the cmd area size to 32M. The TCMU will dynamically grows the data
area from 0 to max 32M size as needed.

The cmd area memory will be allocated through vmalloc(), and the
data area's blocks will be allocated individually later when needed.

The allocated data area block memory will be managed via radix tree.
For now the bitmap still be the most efficient way to search and
manage the block index, this could be update later.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
---
 drivers/target/target_core_user.c | 286 ++++++++++++++++++++++++++++----------
 1 file changed, 215 insertions(+), 71 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 2b4e7e4..e904bc0 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -2,6 +2,7 @@
  * Copyright (C) 2013 Shaohua Li <shli@kernel.org>
  * Copyright (C) 2014 Red Hat, Inc.
  * Copyright (C) 2015 Arrikto, Inc.
+ * Copyright (C) 2017 Chinamobile, Inc.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -25,6 +26,7 @@
 #include <linux/parser.h>
 #include <linux/vmalloc.h>
 #include <linux/uio_driver.h>
+#include <linux/radix-tree.h>
 #include <linux/stringify.h>
 #include <linux/bitops.h>
 #include <linux/highmem.h>
@@ -62,15 +64,17 @@
  * this may have a 'UAM' comment.
  */
 
-
 #define TCMU_TIME_OUT (30 * MSEC_PER_SEC)
 
-#define DATA_BLOCK_BITS 256
-#define DATA_BLOCK_SIZE 4096
+/* For cmd area, the size is fixed 2M */
+#define CMDR_SIZE (2 * 1024 * 1024)
 
-#define CMDR_SIZE (16 * 4096)
+/* For data area, the size is fixed 32M */
+#define DATA_BLOCK_BITS (8 * 1024)
+#define DATA_BLOCK_SIZE 4096
 #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
 
+/* The ring buffer size is 34M */
 #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
 
 static struct device *tcmu_root_device;
@@ -102,12 +106,14 @@ struct tcmu_dev {
 	size_t data_off;
 	size_t data_size;
 
-	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
-
 	wait_queue_head_t wait_cmdr;
 	/* TODO should this be a mutex? */
 	spinlock_t cmdr_lock;
 
+	uint32_t dbi_cur;
+	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
+	struct radix_tree_root data_blocks;
+
 	struct idr commands;
 	spinlock_t commands_lock;
 
@@ -126,7 +132,9 @@ struct tcmu_cmd {
 
 	/* Can't use se_cmd when cleaning up expired cmds, because if
 	   cmd has been completed then accessing se_cmd is off limits */
-	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
+	uint32_t dbi_len;
+	uint32_t dbi_cur;
+	uint32_t *dbi;
 };
 
 static struct kmem_cache *tcmu_cmd_cache;
@@ -152,10 +160,80 @@ enum tcmu_multicast_groups {
 	.netnsok = true,
 };
 
+static int tcmu_db_get_empty_block(struct tcmu_dev *udev, void **addr)
+{
+	void *p;
+	uint32_t dbi;
+	int ret;
+
+	dbi = find_first_zero_bit(udev->data_bitmap, DATA_BLOCK_BITS);
+	if (dbi > udev->dbi_cur)
+		udev->dbi_cur = dbi;
+
+	set_bit(dbi, udev->data_bitmap);
+
+	p = radix_tree_lookup(&udev->data_blocks, dbi);
+	if (!p) {
+		p = kzalloc(DATA_BLOCK_SIZE, GFP_ATOMIC);
+		if (!p) {
+			clear_bit(dbi, udev->data_bitmap);
+			return -ENOMEM;
+		}
+
+		ret = radix_tree_insert(&udev->data_blocks, dbi, p);
+		if (ret) {
+			kfree(p);
+			clear_bit(dbi, udev->data_bitmap);
+			return ret;
+		}
+	}
+
+	*addr = p;
+	return dbi;
+}
+
+static void *tcmu_db_get_block_addr(struct tcmu_dev *udev, uint32_t dbi)
+{
+	return radix_tree_lookup(&udev->data_blocks, dbi);
+}
+
+#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
+#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
+#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
+
+static void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd)
+{
+	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
+	uint32_t bi;
+
+	for (bi = 0; bi < tcmu_cmd->dbi_len; bi++)
+		clear_bit(tcmu_cmd->dbi[bi], udev->data_bitmap);
+}
+
+static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
+{
+	kfree(tcmu_cmd->dbi);
+	kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
+}
+
+static inline uint32_t tcmu_cmd_get_dbi_len(struct se_cmd *se_cmd)
+{
+	size_t data_length = se_cmd->data_length;
+	uint32_t dbi_len;
+
+	if (se_cmd->se_cmd_flags & SCF_BIDI)
+		data_length += se_cmd->t_bidi_data_sg->length;
+
+	dbi_len = (data_length + DATA_BLOCK_SIZE - 1) / DATA_BLOCK_SIZE;
+
+	return dbi_len;
+}
+
 static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 {
 	struct se_device *se_dev = se_cmd->se_dev;
 	struct tcmu_dev *udev = TCMU_DEV(se_dev);
+	uint32_t dbi_len = tcmu_cmd_get_dbi_len(se_cmd);
 	struct tcmu_cmd *tcmu_cmd;
 	int cmd_id;
 
@@ -166,6 +244,14 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 	tcmu_cmd->se_cmd = se_cmd;
 	tcmu_cmd->tcmu_dev = udev;
 
+	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
+	tcmu_cmd->dbi_len = dbi_len;
+	tcmu_cmd->dbi = kcalloc(dbi_len, sizeof(uint32_t), GFP_KERNEL);
+	if (!tcmu_cmd->dbi) {
+		kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
+		return NULL;
+	}
+
 	idr_preload(GFP_KERNEL);
 	spin_lock_irq(&udev->commands_lock);
 	cmd_id = idr_alloc(&udev->commands, tcmu_cmd, 0,
@@ -174,7 +260,7 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 	idr_preload_end();
 
 	if (cmd_id < 0) {
-		kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
+		tcmu_free_cmd(tcmu_cmd);
 		return NULL;
 	}
 	tcmu_cmd->cmd_id = cmd_id;
@@ -236,10 +322,10 @@ static inline void new_iov(struct iovec **iov, int *iov_cnt,
 #define UPDATE_HEAD(head, used, size) smp_store_release(&head, ((head % size) + used) % size)
 
 /* offset is relative to mb_addr */
-static inline size_t get_block_offset(struct tcmu_dev *dev,
-		int block, int remaining)
+static inline size_t get_block_offset_user(struct tcmu_dev *dev,
+		int dbi, int remaining)
 {
-	return dev->data_off + block * DATA_BLOCK_SIZE +
+	return dev->data_off + dbi * DATA_BLOCK_SIZE +
 		DATA_BLOCK_SIZE - remaining;
 }
 
@@ -248,14 +334,15 @@ static inline size_t iov_tail(struct tcmu_dev *udev, struct iovec *iov)
 	return (size_t)iov->iov_base + iov->iov_len;
 }
 
-static void alloc_and_scatter_data_area(struct tcmu_dev *udev,
-	struct scatterlist *data_sg, unsigned int data_nents,
-	struct iovec **iov, int *iov_cnt, bool copy_data)
+static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
+	struct tcmu_cmd *tcmu_cmd, struct scatterlist *data_sg,
+	unsigned int data_nents, struct iovec **iov,
+	int *iov_cnt, bool copy_data)
 {
-	int i, block;
+	int i, dbi;
 	int block_remaining = 0;
-	void *from, *to;
-	size_t copy_bytes, to_offset;
+	void *from, *to = NULL;
+	size_t copy_bytes, to_offset, offset;
 	struct scatterlist *sg;
 
 	for_each_sg(data_sg, sg, data_nents, i) {
@@ -263,22 +350,26 @@ static void alloc_and_scatter_data_area(struct tcmu_dev *udev,
 		from = kmap_atomic(sg_page(sg)) + sg->offset;
 		while (sg_remaining > 0) {
 			if (block_remaining == 0) {
-				block = find_first_zero_bit(udev->data_bitmap,
-						DATA_BLOCK_BITS);
 				block_remaining = DATA_BLOCK_SIZE;
-				set_bit(block, udev->data_bitmap);
+				dbi = tcmu_db_get_empty_block(udev, &to);
+				if (dbi < 0)
+					return dbi;
+				tcmu_cmd_set_dbi(tcmu_cmd, dbi);
 			}
+
 			copy_bytes = min_t(size_t, sg_remaining,
 					block_remaining);
-			to_offset = get_block_offset(udev, block,
+			to_offset = get_block_offset_user(udev, dbi,
 					block_remaining);
-			to = (void *)udev->mb_addr + to_offset;
+			offset = DATA_BLOCK_SIZE - block_remaining;
+			to = (void *)(unsigned long)to + offset;
+
 			if (*iov_cnt != 0 &&
 			    to_offset == iov_tail(udev, *iov)) {
 				(*iov)->iov_len += copy_bytes;
 			} else {
 				new_iov(iov, iov_cnt, udev);
-				(*iov)->iov_base = (void __user *) to_offset;
+				(*iov)->iov_base = (void __user *)to_offset;
 				(*iov)->iov_len = copy_bytes;
 			}
 			if (copy_data) {
@@ -291,21 +382,17 @@ static void alloc_and_scatter_data_area(struct tcmu_dev *udev,
 		}
 		kunmap_atomic(from - sg->offset);
 	}
-}
 
-static void free_data_area(struct tcmu_dev *udev, struct tcmu_cmd *cmd)
-{
-	bitmap_xor(udev->data_bitmap, udev->data_bitmap, cmd->data_bitmap,
-		   DATA_BLOCK_BITS);
+	return 0;
 }
 
-static void gather_data_area(struct tcmu_dev *udev, unsigned long *cmd_bitmap,
+static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
 		struct scatterlist *data_sg, unsigned int data_nents)
 {
-	int i, block;
+	int i, dbi;
 	int block_remaining = 0;
 	void *from, *to;
-	size_t copy_bytes, from_offset;
+	size_t copy_bytes, offset;
 	struct scatterlist *sg;
 
 	for_each_sg(data_sg, sg, data_nents, i) {
@@ -313,16 +400,14 @@ static void gather_data_area(struct tcmu_dev *udev, unsigned long *cmd_bitmap,
 		to = kmap_atomic(sg_page(sg)) + sg->offset;
 		while (sg_remaining > 0) {
 			if (block_remaining == 0) {
-				block = find_first_bit(cmd_bitmap,
-						DATA_BLOCK_BITS);
 				block_remaining = DATA_BLOCK_SIZE;
-				clear_bit(block, cmd_bitmap);
+				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
+				from = tcmu_db_get_block_addr(udev, dbi);
 			}
 			copy_bytes = min_t(size_t, sg_remaining,
 					block_remaining);
-			from_offset = get_block_offset(udev, block,
-					block_remaining);
-			from = (void *) udev->mb_addr + from_offset;
+			offset = DATA_BLOCK_SIZE - block_remaining;
+			from = (void *)(unsigned long)from + offset;
 			tcmu_flush_dcache_range(from, copy_bytes);
 			memcpy(to + sg->length - sg_remaining, from,
 					copy_bytes);
@@ -391,12 +476,11 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 	struct tcmu_mailbox *mb;
 	struct tcmu_cmd_entry *entry;
 	struct iovec *iov;
-	int iov_cnt;
+	int iov_cnt, ret;
 	uint32_t cmd_head;
 	uint64_t cdb_off;
 	bool copy_to_data_area;
 	size_t data_length;
-	DECLARE_BITMAP(old_bitmap, DATA_BLOCK_BITS);
 
 	if (test_bit(TCMU_DEV_BIT_BROKEN, &udev->flags))
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
@@ -482,28 +566,31 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 	entry->hdr.kflags = 0;
 	entry->hdr.uflags = 0;
 
-	bitmap_copy(old_bitmap, udev->data_bitmap, DATA_BLOCK_BITS);
-
 	/* Handle allocating space from the data area */
 	iov = &entry->req.iov[0];
 	iov_cnt = 0;
 	copy_to_data_area = (se_cmd->data_direction == DMA_TO_DEVICE
 		|| se_cmd->se_cmd_flags & SCF_BIDI);
-	alloc_and_scatter_data_area(udev, se_cmd->t_data_sg,
+	ret = alloc_and_scatter_data_area(udev, tcmu_cmd, se_cmd->t_data_sg,
 		se_cmd->t_data_nents, &iov, &iov_cnt, copy_to_data_area);
+	if (ret) {
+		pr_err("tcmu: alloc and scatter data failed\n");
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+	}
 	entry->req.iov_cnt = iov_cnt;
 	entry->req.iov_dif_cnt = 0;
 
 	/* Handle BIDI commands */
 	iov_cnt = 0;
-	alloc_and_scatter_data_area(udev, se_cmd->t_bidi_data_sg,
-		se_cmd->t_bidi_data_nents, &iov, &iov_cnt, false);
+	ret = alloc_and_scatter_data_area(udev, tcmu_cmd,
+		se_cmd->t_bidi_data_sg, se_cmd->t_bidi_data_nents,
+		&iov, &iov_cnt, false);
+	if (ret) {
+		pr_err("tcmu: alloc and scatter bidi data failed\n");
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+	}
 	entry->req.iov_bidi_cnt = iov_cnt;
 
-	/* cmd's data_bitmap is what changed in process */
-	bitmap_xor(tcmu_cmd->data_bitmap, old_bitmap, udev->data_bitmap,
-			DATA_BLOCK_BITS);
-
 	/* All offsets relative to mb_addr, not start of entry! */
 	cdb_off = CMDR_OFF + cmd_head + base_command_size;
 	memcpy((void *) mb + cdb_off, se_cmd->t_task_cdb, scsi_command_size(se_cmd->t_task_cdb));
@@ -540,7 +627,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 		idr_remove(&udev->commands, tcmu_cmd->cmd_id);
 		spin_unlock_irq(&udev->commands_lock);
 
-		kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
+		tcmu_free_cmd(tcmu_cmd);
 	}
 
 	return ret;
@@ -551,41 +638,34 @@ static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *
 	struct se_cmd *se_cmd = cmd->se_cmd;
 	struct tcmu_dev *udev = cmd->tcmu_dev;
 
+	tcmu_cmd_reset_dbi_cur(cmd);
+
 	if (entry->hdr.uflags & TCMU_UFLAG_UNKNOWN_OP) {
-		free_data_area(udev, cmd);
 		pr_warn("TCMU: Userspace set UNKNOWN_OP flag on se_cmd %p\n",
 			cmd->se_cmd);
 		entry->rsp.scsi_status = SAM_STAT_CHECK_CONDITION;
 	} else if (entry->rsp.scsi_status == SAM_STAT_CHECK_CONDITION) {
 		memcpy(se_cmd->sense_buffer, entry->rsp.sense_buffer,
 			       se_cmd->scsi_sense_length);
-		free_data_area(udev, cmd);
 	} else if (se_cmd->se_cmd_flags & SCF_BIDI) {
-		DECLARE_BITMAP(bitmap, DATA_BLOCK_BITS);
-
 		/* Get Data-In buffer before clean up */
-		bitmap_copy(bitmap, cmd->data_bitmap, DATA_BLOCK_BITS);
-		gather_data_area(udev, bitmap,
+		gather_data_area(udev, cmd,
 			se_cmd->t_bidi_data_sg, se_cmd->t_bidi_data_nents);
-		free_data_area(udev, cmd);
 	} else if (se_cmd->data_direction == DMA_FROM_DEVICE) {
-		DECLARE_BITMAP(bitmap, DATA_BLOCK_BITS);
-
-		bitmap_copy(bitmap, cmd->data_bitmap, DATA_BLOCK_BITS);
-		gather_data_area(udev, bitmap,
+		gather_data_area(udev, cmd,
 			se_cmd->t_data_sg, se_cmd->t_data_nents);
-		free_data_area(udev, cmd);
 	} else if (se_cmd->data_direction == DMA_TO_DEVICE) {
-		free_data_area(udev, cmd);
+		/* TODO: */
 	} else if (se_cmd->data_direction != DMA_NONE) {
 		pr_warn("TCMU: data direction was %d!\n",
 			se_cmd->data_direction);
 	}
 
 	target_complete_cmd(cmd->se_cmd, entry->rsp.scsi_status);
-	cmd->se_cmd = NULL;
 
-	kmem_cache_free(tcmu_cmd_cache, cmd);
+	cmd->se_cmd = NULL;
+	tcmu_cmd_free_data(cmd);
+	tcmu_free_cmd(cmd);
 }
 
 static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
@@ -701,6 +781,54 @@ static int tcmu_irqcontrol(struct uio_info *info, s32 irq_on)
 	return 0;
 }
 
+static void tcmu_db_release(struct tcmu_dev *udev, bool release_pending)
+{
+	uint32_t dbi, end;
+	void *addr;
+
+	spin_lock_irq(&udev->cmdr_lock);
+
+	end = udev->dbi_cur + 1;
+
+	/* try to release all unused blocks */
+	dbi = find_first_zero_bit(udev->data_bitmap, end);
+	if (dbi >= end) {
+		spin_unlock_irq(&udev->cmdr_lock);
+		return;
+	}
+	do {
+		addr = radix_tree_delete(&udev->data_blocks, dbi);
+		kfree(addr);
+
+		dbi = find_next_zero_bit(udev->data_bitmap, end, dbi + 1);
+	} while (dbi < end);
+
+	if (!release_pending)
+		return;
+
+	/* try to release all pending blocks */
+	dbi = find_first_bit(udev->data_bitmap, end);
+	if (dbi >= end) {
+		spin_unlock_irq(&udev->cmdr_lock);
+		return;
+	}
+	do {
+		addr = radix_tree_delete(&udev->data_blocks, dbi);
+		kfree(addr);
+
+		dbi = find_next_bit(udev->data_bitmap, end, dbi + 1);
+	} while (dbi < end);
+
+	spin_unlock_irq(&udev->cmdr_lock);
+}
+
+static void tcmu_vma_close(struct vm_area_struct *vma)
+{
+	struct tcmu_dev *udev = vma->vm_private_data;
+
+	tcmu_db_release(udev, false);
+}
+
 /*
  * mmap code from uio.c. Copied here because we want to hook mmap()
  * and this stuff must come along.
@@ -736,17 +864,28 @@ static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	 */
 	offset = (vmf->pgoff - mi) << PAGE_SHIFT;
 
-	addr = (void *)(unsigned long)info->mem[mi].addr + offset;
-	if (info->mem[mi].memtype == UIO_MEM_LOGICAL)
-		page = virt_to_page(addr);
-	else
+	if (offset < udev->data_off) {
+		/* For the vmalloc()ed cmd area pages */
+		addr = (void *)(unsigned long)info->mem[mi].addr + offset;
 		page = vmalloc_to_page(addr);
+	} else {
+		/* For the dynamically growing data area pages */
+		uint32_t dbi;
+
+		dbi = (offset - udev->data_off) / DATA_BLOCK_SIZE;
+		addr = tcmu_db_get_block_addr(udev, dbi);
+		if (!addr)
+			return VM_FAULT_NOPAGE;
+		page = virt_to_page(addr);
+	}
+
 	get_page(page);
 	vmf->page = page;
 	return 0;
 }
 
 static const struct vm_operations_struct tcmu_vm_ops = {
+	.close = tcmu_vma_close,
 	.fault = tcmu_vma_fault,
 };
 
@@ -854,7 +993,7 @@ static int tcmu_configure_device(struct se_device *dev)
 
 	info->name = str;
 
-	udev->mb_addr = vzalloc(TCMU_RING_SIZE);
+	udev->mb_addr = vzalloc(CMDR_SIZE);
 	if (!udev->mb_addr) {
 		ret = -ENOMEM;
 		goto err_vzalloc;
@@ -863,8 +1002,9 @@ static int tcmu_configure_device(struct se_device *dev)
 	/* mailbox fits in first part of CMDR space */
 	udev->cmdr_size = CMDR_SIZE - CMDR_OFF;
 	udev->data_off = CMDR_SIZE;
-	udev->data_size = TCMU_RING_SIZE - CMDR_SIZE;
+	udev->data_size = DATA_SIZE;
 
+	/* Initialise the mailbox of the ring buffer */
 	mb = udev->mb_addr;
 	mb->version = TCMU_MAILBOX_VERSION;
 	mb->flags = TCMU_MAILBOX_FLAG_CAP_OOOC;
@@ -875,12 +1015,14 @@ static int tcmu_configure_device(struct se_device *dev)
 	WARN_ON(udev->data_size % PAGE_SIZE);
 	WARN_ON(udev->data_size % DATA_BLOCK_SIZE);
 
+	INIT_RADIX_TREE(&udev->data_blocks, GFP_ATOMIC);
+
 	info->version = __stringify(TCMU_MAILBOX_VERSION);
 
 	info->mem[0].name = "tcm-user command & data buffer";
 	info->mem[0].addr = (phys_addr_t)(uintptr_t)udev->mb_addr;
 	info->mem[0].size = TCMU_RING_SIZE;
-	info->mem[0].memtype = UIO_MEM_VIRTUAL;
+	info->mem[0].memtype = UIO_MEM_NONE;
 
 	info->irqcontrol = tcmu_irqcontrol;
 	info->irq = UIO_IRQ_CUSTOM;
@@ -938,6 +1080,8 @@ static void tcmu_free_device(struct se_device *dev)
 	idr_destroy(&udev->commands);
 	spin_unlock_irq(&udev->commands_lock);
 
+	tcmu_db_release(udev, true);
+
 	/* Device was configured */
 	if (udev->uio_info.uio_dev) {
 		tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE, udev->uio_info.name,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 2/5] target/user: Add global data block pool support
  2017-03-08  8:45 [PATCHv2 0/5] Dynamic growing data area support lixiubo
  2017-03-08  8:45 ` [PATCHv2 1/5] target/user: Add dynamic growing data area feature support lixiubo
@ 2017-03-08  8:45 ` lixiubo
  2017-03-08 20:20   ` Andy Grover
  2017-03-08  8:45 ` [PATCHv2 3/5] target/user: Fix possible overwrite of t_data_sg's last iov[] lixiubo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: lixiubo @ 2017-03-08  8:45 UTC (permalink / raw)
  To: agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, Xiubo Li

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

For each target there will be one ring, when the target number
grows larger and larger, it could eventually runs out of the
system memories.

In this patch for each target ring, the cmd area size will be
limited to 8M and the data area size will be limited to 1G. And
the data area will be divided into two parts: the fixed and
growing.

For the fixed part, it will be 1M size and pre-allocated together
with the cmd area. This could speed up the low iops case, and
also could make sure that each ring will have at least 1M private
data size when there has too many targets, which could get their
data blocks as quick as possible.

For the growing part, it will get the blocks from the global data
block pool. And this part will be used for high iops case.

The global data block pool is a cache, and the total size will be
limited to max 2G(grows from 0 to 2G as needed). And it will cache
the freed data blocks by a list, All targets will get from/release
to the free blocks here.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
---
 drivers/target/target_core_user.c | 352 +++++++++++++++++++++++++++++++-------
 1 file changed, 290 insertions(+), 62 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index e904bc0..cd9bc4a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -66,17 +66,31 @@
 
 #define TCMU_TIME_OUT (30 * MSEC_PER_SEC)
 
-/* For cmd area, the size is fixed 2M */
-#define CMDR_SIZE (2 * 1024 * 1024)
+/* For cmd area, the size is 8M */
+#define CMDR_SIZE (8 * 1024 * 1024)
 
-/* For data area, the size is fixed 32M */
-#define DATA_BLOCK_BITS (8 * 1024)
+/*
+ * For data area, the total size is 1G: 1M fixed
+ * size area, which will always at the beginning,
+ * and 1023M grow size followed,
+ */
 #define DATA_BLOCK_SIZE 4096
+
+#define DATA_BLOCK_FIX_BITS 256
+#define DATA_FIX_SIZE (DATA_BLOCK_FIX_BITS * DATA_BLOCK_SIZE)
+
+#define DATA_BLOCK_GROW_BITS (256 * 1023)
+#define DATA_GROW_SIZE (DATA_BLOCK_GROW_BITS * DATA_BLOCK_SIZE)
+
+#define DATA_BLOCK_BITS (DATA_BLOCK_FIX_BITS + DATA_BLOCK_GROW_BITS)
 #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
 
-/* The ring buffer size is 34M */
+/* The ring buffer size is 8M + 1G */
 #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
 
+/* Default maximum of the global data blocks(2G) */
+#define TCMU_GLOBAL_MAX_BLOCKS (512 * 1024)
+
 static struct device *tcmu_root_device;
 
 struct tcmu_hba {
@@ -137,6 +151,18 @@ struct tcmu_cmd {
 	uint32_t *dbi;
 };
 
+struct tcmu_block {
+	struct list_head node;
+
+	void *addr;
+	uint32_t user;
+	bool using;
+};
+
+static unsigned long global_db_count;
+static LIST_HEAD(free_data_blockss);
+static spinlock_t g_lock;
+static struct kmem_cache *tcmu_block_cache;
 static struct kmem_cache *tcmu_cmd_cache;
 
 /* multicast group */
@@ -160,54 +186,178 @@ enum tcmu_multicast_groups {
 	.netnsok = true,
 };
 
-static int tcmu_db_get_empty_block(struct tcmu_dev *udev, void **addr)
+#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
+#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
+#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
+
+static inline struct tcmu_block *try_to_get_cached_free_block(void)
 {
-	void *p;
-	uint32_t dbi;
-	int ret;
+	struct tcmu_block *p;
+
+	if (list_empty(&free_data_blockss))
+		return NULL;
+
+	p = list_first_entry(&free_data_blockss, struct tcmu_block, node);
+	list_del(&p->node);
+
+	return p;
+}
+
+static inline bool get_empty_fixed_block(struct tcmu_dev *udev,
+					 struct tcmu_cmd *tcmu_cmd)
+{
+	uint32_t end = DATA_BLOCK_FIX_BITS;
+	int dbi = -1;
+
+	dbi = find_first_zero_bit(udev->data_bitmap, end);
+	if (dbi >= end)
+		return false;
 
-	dbi = find_first_zero_bit(udev->data_bitmap, DATA_BLOCK_BITS);
 	if (dbi > udev->dbi_cur)
 		udev->dbi_cur = dbi;
 
 	set_bit(dbi, udev->data_bitmap);
+	tcmu_cmd_set_dbi(tcmu_cmd, dbi);
+	return true;
+}
+
+static inline bool get_empty_growing_block(struct tcmu_dev *udev,
+					   struct tcmu_cmd *tcmu_cmd)
+{
+	struct tcmu_block *p;
+	uint32_t end = DATA_BLOCK_BITS;
+	int ret, dbi = DATA_BLOCK_FIX_BITS - 1;
+	bool new = false;
+
+retry:
+	dbi = find_next_zero_bit(udev->data_bitmap, end, dbi + 1);
+	if (dbi >= end)
+		return false;
 
+	spin_lock_irq(&g_lock);
 	p = radix_tree_lookup(&udev->data_blocks, dbi);
 	if (!p) {
-		p = kzalloc(DATA_BLOCK_SIZE, GFP_ATOMIC);
+		/* try to get the page from free list first */
+		p = try_to_get_cached_free_block();
 		if (!p) {
-			clear_bit(dbi, udev->data_bitmap);
-			return -ENOMEM;
+			/* try to get new page from the mm */
+			if (global_db_count >= TCMU_GLOBAL_MAX_BLOCKS) {
+				spin_unlock_irq(&g_lock);
+				return false;
+			}
+
+			p = kmem_cache_zalloc(tcmu_block_cache, GFP_ATOMIC);
+			if (!p) {
+				spin_unlock_irq(&g_lock);
+				return false;
+			}
+
+			p->addr = kzalloc(DATA_BLOCK_SIZE, GFP_ATOMIC);
+			if (!p->addr) {
+				spin_unlock_irq(&g_lock);
+				kmem_cache_free(tcmu_block_cache, p);
+				return false;
+			}
+
+			new = true;
+			global_db_count++;
 		}
 
 		ret = radix_tree_insert(&udev->data_blocks, dbi, p);
 		if (ret) {
-			kfree(p);
-			clear_bit(dbi, udev->data_bitmap);
-			return ret;
+			if (new) {
+				kfree(p->addr);
+				kmem_cache_free(tcmu_block_cache, p);
+				global_db_count--;
+			}
+			spin_unlock_irq(&g_lock);
+			return false;
 		}
+
+		p->user++;
+	} else if (p->using) {
+		spin_unlock_irq(&g_lock);
+		goto retry;
+	} else {
+		list_del(&p->node);
 	}
+	p->using = true;
+	spin_unlock_irq(&g_lock);
 
-	*addr = p;
-	return dbi;
+	if (dbi > udev->dbi_cur)
+		udev->dbi_cur = dbi;
+
+	set_bit(dbi, udev->data_bitmap);
+	tcmu_cmd_set_dbi(tcmu_cmd, dbi);
+
+	return true;
 }
 
-static void *tcmu_db_get_block_addr(struct tcmu_dev *udev, uint32_t dbi)
+static bool tcmu_db_get_empty_blocks(struct tcmu_dev *udev,
+				     struct tcmu_cmd *tcmu_cmd)
 {
-	return radix_tree_lookup(&udev->data_blocks, dbi);
+	bool fix = true;
+	int i;
+
+	for (i = 0; i < tcmu_cmd->dbi_len; i++) {
+		/* Try to get free blocks from the fixed area */
+		if (fix) {
+			if (get_empty_fixed_block(udev, tcmu_cmd))
+				continue;
+			fix = false;
+		}
+
+		/* Then try to get free blocks from the grow area */
+		if (!get_empty_growing_block(udev, tcmu_cmd))
+			return false;
+	}
+
+	return true;
 }
 
-#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
-#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
-#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
+static void *tcmu_db_get_block_addr(struct tcmu_dev *udev, uint32_t dbi)
+{
+	unsigned long offset = udev->data_off + dbi * DATA_BLOCK_SIZE;
+	struct tcmu_block *p;
 
-static void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd)
+	if (dbi < DATA_BLOCK_FIX_BITS)
+		return (void *)((unsigned long)udev->mb_addr + offset);
+
+	p = radix_tree_lookup(&udev->data_blocks, dbi);
+	if (!p)
+		return NULL;
+
+	return p->addr;
+}
+
+static void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd,
+			       uint32_t len, bool del)
 {
 	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
-	uint32_t bi;
+	struct tcmu_block *p;
+	uint32_t i, dbi;
+
+	for (i = 0; i < len; i++) {
+		dbi = tcmu_cmd->dbi[i];
+		clear_bit(dbi, udev->data_bitmap);
+
+		if (dbi < DATA_BLOCK_FIX_BITS)
+			continue;
 
-	for (bi = 0; bi < tcmu_cmd->dbi_len; bi++)
-		clear_bit(tcmu_cmd->dbi[bi], udev->data_bitmap);
+		spin_lock_irq(&g_lock);
+		p = radix_tree_lookup(&udev->data_blocks, dbi);
+		BUG_ON(!p);
+
+		if (p->user == 1 && del) {
+			p = radix_tree_delete(&udev->data_blocks, dbi);
+			kfree(p->addr);
+			kmem_cache_free(tcmu_block_cache, p);
+		} else {
+			p->using = false;
+			list_add(&p->node, &free_data_blockss);
+		}
+		spin_unlock_irq(&g_lock);
+	}
 }
 
 static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
@@ -351,10 +501,8 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
 		while (sg_remaining > 0) {
 			if (block_remaining == 0) {
 				block_remaining = DATA_BLOCK_SIZE;
-				dbi = tcmu_db_get_empty_block(udev, &to);
-				if (dbi < 0)
-					return dbi;
-				tcmu_cmd_set_dbi(tcmu_cmd, dbi);
+				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
+				to = tcmu_db_get_block_addr(udev, dbi);
 			}
 
 			copy_bytes = min_t(size_t, sg_remaining,
@@ -362,7 +510,7 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
 			to_offset = get_block_offset_user(udev, dbi,
 					block_remaining);
 			offset = DATA_BLOCK_SIZE - block_remaining;
-			to = (void *)(unsigned long)to + offset;
+			to = (void *)((unsigned long)to + offset);
 
 			if (*iov_cnt != 0 &&
 			    to_offset == iov_tail(udev, *iov)) {
@@ -407,7 +555,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
 			copy_bytes = min_t(size_t, sg_remaining,
 					block_remaining);
 			offset = DATA_BLOCK_SIZE - block_remaining;
-			from = (void *)(unsigned long)from + offset;
+			from = (void *)((unsigned long)from + offset);
 			tcmu_flush_dcache_range(from, copy_bytes);
 			memcpy(to + sg->length - sg_remaining, from,
 					copy_bytes);
@@ -431,7 +579,8 @@ static inline size_t spc_bitmap_free(unsigned long *bitmap)
  *
  * Called with ring lock held.
  */
-static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t data_needed)
+static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
+		size_t cmd_size, size_t data_needed)
 {
 	struct tcmu_mailbox *mb = udev->mb_addr;
 	size_t space, cmd_needed;
@@ -457,6 +606,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 		return false;
 	}
 
+	/* try to check and get the data blocks as needed */
 	space = spc_bitmap_free(udev->data_bitmap);
 	if (space < data_needed) {
 		pr_debug("no data space: only %zu available, but ask for %zu\n",
@@ -464,6 +614,12 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 		return false;
 	}
 
+	if (!tcmu_db_get_empty_blocks(udev, cmd)) {
+		pr_debug("no data space: ask for %zu\n", data_needed);
+		tcmu_cmd_free_data(cmd, cmd->dbi_cur, true);
+		return false;
+	}
+
 	return true;
 }
 
@@ -519,7 +675,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 		return TCM_INVALID_CDB_FIELD;
 	}
 
-	while (!is_ring_space_avail(udev, command_size, data_length)) {
+	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
 		int ret;
 		DEFINE_WAIT(__wait);
 
@@ -567,6 +723,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
 	entry->hdr.uflags = 0;
 
 	/* Handle allocating space from the data area */
+	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
 	iov = &entry->req.iov[0];
 	iov_cnt = 0;
 	copy_to_data_area = (se_cmd->data_direction == DMA_TO_DEVICE
@@ -664,7 +821,7 @@ static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *
 	target_complete_cmd(cmd->se_cmd, entry->rsp.scsi_status);
 
 	cmd->se_cmd = NULL;
-	tcmu_cmd_free_data(cmd);
+	tcmu_cmd_free_data(cmd, cmd->dbi_len, false);
 	tcmu_free_cmd(cmd);
 }
 
@@ -783,41 +940,98 @@ static int tcmu_irqcontrol(struct uio_info *info, s32 irq_on)
 
 static void tcmu_db_release(struct tcmu_dev *udev, bool release_pending)
 {
-	uint32_t dbi, end;
-	void *addr;
+	int dbi = -1, end;
+	struct tcmu_block *p;
 
 	spin_lock_irq(&udev->cmdr_lock);
-
 	end = udev->dbi_cur + 1;
 
-	/* try to release all unused blocks */
-	dbi = find_first_zero_bit(udev->data_bitmap, end);
-	if (dbi >= end) {
-		spin_unlock_irq(&udev->cmdr_lock);
-		return;
-	}
+	/* try to release all unused but has mapped blocks */
 	do {
-		addr = radix_tree_delete(&udev->data_blocks, dbi);
-		kfree(addr);
-
 		dbi = find_next_zero_bit(udev->data_bitmap, end, dbi + 1);
-	} while (dbi < end);
+		if (dbi >= end)
+			break;
 
-	if (!release_pending)
-		return;
+		if (dbi < DATA_BLOCK_FIX_BITS)
+			continue;
+		/*
+		 * When the bit is cleared and p != NULL, meaning that
+		 * this tcmu block had already freed-after-use.
+		 *
+		 * If p->user == 0, meaning that the current ring buffer
+		 * is the last or the only user of the tcmu block, and
+		 * it must already in the free list, so it could be
+		 * remove from the list and then released its memories.
+		 *
+		 * If p->user != 0, meaning that the current tcmu block is
+		 * still referenced by other ring buffers, so just ignore
+		 * it without doing anyting.
+		 */
+		spin_lock_irq(&g_lock);
+		p = radix_tree_delete(&udev->data_blocks, dbi);
+		if (p) {
+			p->user--;
+			if (p->user == 0) {
+				list_del(&p->node);
+				kfree(p->addr);
+				kmem_cache_free(tcmu_block_cache, p);
+				global_db_count--;
+			}
+		}
+		spin_unlock_irq(&g_lock);
+	} while (1);
 
-	/* try to release all pending blocks */
-	dbi = find_first_bit(udev->data_bitmap, end);
-	if (dbi >= end) {
+	if (!release_pending) {
 		spin_unlock_irq(&udev->cmdr_lock);
 		return;
 	}
-	do {
-		addr = radix_tree_delete(&udev->data_blocks, dbi);
-		kfree(addr);
 
+	/* try to release all pending blocks */
+	dbi = -1;
+	do {
 		dbi = find_next_bit(udev->data_bitmap, end, dbi + 1);
-	} while (dbi < end);
+		if (dbi >= end)
+			break;
+
+		clear_bit(dbi, udev->data_bitmap);
+
+		if (dbi < DATA_BLOCK_FIX_BITS)
+			continue;
+
+		/*
+		 * When the bit is set and p != NULL, meaning that this
+		 * tcmu block is still being used here.
+		 *
+		 * If p->user == 0, meaning that the current ring buffer
+		 * is the last or the only user of this tcmu block, and
+		 * it won't in the free list, so could just release its
+		 * memories.
+		 *
+		 * If the p->user != 0, we should insert it to the free
+		 * list.
+		 *
+		 * p == NULL means that the current ring buffer is broken.
+		 */
+		spin_lock_irq(&g_lock);
+		p = radix_tree_delete(&udev->data_blocks, dbi);
+		if (p) {
+			p->user--;
+			p->using = false;
+			if (p->user == 0) {
+				kfree(p->addr);
+				kmem_cache_free(tcmu_block_cache, p);
+				global_db_count--;
+			} else {
+				list_add(&p->node, &free_data_blockss);
+			}
+		} else {
+			pr_err("block page not found, ring is broken\n");
+			set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags);
+			spin_unlock_irq(&g_lock);
+			break;
+		}
+		spin_unlock_irq(&g_lock);
+	} while (1);
 
 	spin_unlock_irq(&udev->cmdr_lock);
 }
@@ -851,7 +1065,7 @@ static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	struct tcmu_dev *udev = vma->vm_private_data;
 	struct uio_info *info = &udev->uio_info;
 	struct page *page;
-	unsigned long offset;
+	unsigned long offset, vm_offset = udev->data_off + DATA_FIX_SIZE;
 	void *addr;
 
 	int mi = tcmu_find_mem_index(vma);
@@ -864,9 +1078,9 @@ static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	 */
 	offset = (vmf->pgoff - mi) << PAGE_SHIFT;
 
-	if (offset < udev->data_off) {
+	if (offset < vm_offset) {
 		/* For the vmalloc()ed cmd area pages */
-		addr = (void *)(unsigned long)info->mem[mi].addr + offset;
+		addr = (void *)((unsigned long)info->mem[mi].addr + offset);
 		page = vmalloc_to_page(addr);
 	} else {
 		/* For the dynamically growing data area pages */
@@ -993,7 +1207,7 @@ static int tcmu_configure_device(struct se_device *dev)
 
 	info->name = str;
 
-	udev->mb_addr = vzalloc(CMDR_SIZE);
+	udev->mb_addr = vzalloc(CMDR_SIZE + DATA_FIX_SIZE);
 	if (!udev->mb_addr) {
 		ret = -ENOMEM;
 		goto err_vzalloc;
@@ -1241,6 +1455,9 @@ static int __init tcmu_module_init(void)
 
 	BUILD_BUG_ON((sizeof(struct tcmu_cmd_entry) % TCMU_OP_ALIGN_SIZE) != 0);
 
+	spin_lock_init(&g_lock);
+	global_db_count = 0;
+
 	tcmu_cmd_cache = kmem_cache_create("tcmu_cmd_cache",
 				sizeof(struct tcmu_cmd),
 				__alignof__(struct tcmu_cmd),
@@ -1248,6 +1465,15 @@ static int __init tcmu_module_init(void)
 	if (!tcmu_cmd_cache)
 		return -ENOMEM;
 
+	tcmu_block_cache = kmem_cache_create("tcmu_block_cache",
+				sizeof(struct tcmu_block),
+				__alignof__(struct tcmu_block),
+				0, NULL);
+	if (!tcmu_block_cache) {
+		kmem_cache_destroy(tcmu_cmd_cache);
+		return -ENOMEM;
+	}
+
 	tcmu_root_device = root_device_register("tcm_user");
 	if (IS_ERR(tcmu_root_device)) {
 		ret = PTR_ERR(tcmu_root_device);
@@ -1270,6 +1496,7 @@ static int __init tcmu_module_init(void)
 out_unreg_device:
 	root_device_unregister(tcmu_root_device);
 out_free_cache:
+	kmem_cache_destroy(tcmu_block_cache);
 	kmem_cache_destroy(tcmu_cmd_cache);
 
 	return ret;
@@ -1280,6 +1507,7 @@ static void __exit tcmu_module_exit(void)
 	target_backend_unregister(&tcmu_ops);
 	genl_unregister_family(&tcmu_genl_family);
 	root_device_unregister(tcmu_root_device);
+	kmem_cache_destroy(tcmu_block_cache);
 	kmem_cache_destroy(tcmu_cmd_cache);
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 3/5] target/user: Fix possible overwrite of t_data_sg's last iov[]
  2017-03-08  8:45 [PATCHv2 0/5] Dynamic growing data area support lixiubo
  2017-03-08  8:45 ` [PATCHv2 1/5] target/user: Add dynamic growing data area feature support lixiubo
  2017-03-08  8:45 ` [PATCHv2 2/5] target/user: Add global data block pool support lixiubo
@ 2017-03-08  8:45 ` lixiubo
  2017-03-16 18:23   ` Bryant G. Ly
  2017-03-08  8:45 ` [PATCHv2 4/5] target/user: Fix wrongly calculating of the base_command_size lixiubo
  2017-03-08  8:45 ` [PATCHv2 5/5] target/user: Clean up tcmu_queue_cmd_ring lixiubo
  4 siblings, 1 reply; 15+ messages in thread
From: lixiubo @ 2017-03-08  8:45 UTC (permalink / raw)
  To: agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, Xiubo Li

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

If there has BIDI data, its first iov[] will overwrite the last
iov[] for se_cmd->t_data_sg.

To fix this, we can just increase the iov pointer, but this may
introuduce a new memory leakage bug: If the se_cmd->data_length
and se_cmd->t_bidi_data_sg->length are all not aligned up to the
DATA_BLOCK_SIZE, the actual length needed maybe larger than just
sum of them.

So, this could be avoided by rounding all the data lengthes up
to DATA_BLOCK_SIZE.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
---
 drivers/target/target_core_user.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index cd9bc4a..99cd239 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -368,13 +368,14 @@ static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
 
 static inline uint32_t tcmu_cmd_get_dbi_len(struct se_cmd *se_cmd)
 {
-	size_t data_length = se_cmd->data_length;
+	size_t data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
 	uint32_t dbi_len;
 
 	if (se_cmd->se_cmd_flags & SCF_BIDI)
-		data_length += se_cmd->t_bidi_data_sg->length;
+		data_length += round_up(se_cmd->t_bidi_data_sg->length,
+					DATA_BLOCK_SIZE);
 
-	dbi_len = (data_length + DATA_BLOCK_SIZE - 1) / DATA_BLOCK_SIZE;
+	dbi_len = data_length / DATA_BLOCK_SIZE;
 
 	return dbi_len;
 }
@@ -661,10 +662,11 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 
 	mb = udev->mb_addr;
 	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
-	data_length = se_cmd->data_length;
+	data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
 	if (se_cmd->se_cmd_flags & SCF_BIDI) {
 		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
-		data_length += se_cmd->t_bidi_data_sg->length;
+		data_length += round_up(se_cmd->t_bidi_data_sg->length,
+				DATA_BLOCK_SIZE);
 	}
 	if ((command_size > (udev->cmdr_size / 2)) ||
 	    data_length > udev->data_size) {
@@ -738,15 +740,19 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	entry->req.iov_dif_cnt = 0;
 
 	/* Handle BIDI commands */
-	iov_cnt = 0;
-	ret = alloc_and_scatter_data_area(udev, tcmu_cmd,
-		se_cmd->t_bidi_data_sg, se_cmd->t_bidi_data_nents,
-		&iov, &iov_cnt, false);
-	if (ret) {
-		pr_err("tcmu: alloc and scatter bidi data failed\n");
-		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+	if (se_cmd->se_cmd_flags & SCF_BIDI) {
+		iov_cnt = 0;
+		iov++;
+		ret = alloc_and_scatter_data_area(udev, tcmu_cmd,
+					se_cmd->t_bidi_data_sg,
+					se_cmd->t_bidi_data_nents,
+					&iov, &iov_cnt, false);
+		if (ret) {
+			pr_err("tcmu: alloc and scatter bidi data failed\n");
+			return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+		}
+		entry->req.iov_bidi_cnt = iov_cnt;
 	}
-	entry->req.iov_bidi_cnt = iov_cnt;
 
 	/* All offsets relative to mb_addr, not start of entry! */
 	cdb_off = CMDR_OFF + cmd_head + base_command_size;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 4/5] target/user: Fix wrongly calculating of the base_command_size
  2017-03-08  8:45 [PATCHv2 0/5] Dynamic growing data area support lixiubo
                   ` (2 preceding siblings ...)
  2017-03-08  8:45 ` [PATCHv2 3/5] target/user: Fix possible overwrite of t_data_sg's last iov[] lixiubo
@ 2017-03-08  8:45 ` lixiubo
  2017-03-17  5:45   ` Xiubo Li
  2017-03-08  8:45 ` [PATCHv2 5/5] target/user: Clean up tcmu_queue_cmd_ring lixiubo
  4 siblings, 1 reply; 15+ messages in thread
From: lixiubo @ 2017-03-08  8:45 UTC (permalink / raw)
  To: agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, Xiubo Li

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

The t_data_nents and t_bidi_data_nents are all the numbers of the
segments, and we couldn't be sure the size of the data area block
will equal to size of the segment.

Use the actually block number needed intead of the sum of segments.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
---
 drivers/target/target_core_user.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 99cd239..117be07 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -650,8 +650,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	 * expensive to tell how many regions are freed in the bitmap
 	*/
 	base_command_size = max(offsetof(struct tcmu_cmd_entry,
-				req.iov[se_cmd->t_bidi_data_nents +
-					se_cmd->t_data_nents]),
+				req.iov[tcmu_cmd->dbi_len]),
 				sizeof(struct tcmu_cmd_entry));
 	command_size = base_command_size
 		+ round_up(scsi_command_size(se_cmd->t_task_cdb), TCMU_OP_ALIGN_SIZE);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 5/5] target/user: Clean up tcmu_queue_cmd_ring
  2017-03-08  8:45 [PATCHv2 0/5] Dynamic growing data area support lixiubo
                   ` (3 preceding siblings ...)
  2017-03-08  8:45 ` [PATCHv2 4/5] target/user: Fix wrongly calculating of the base_command_size lixiubo
@ 2017-03-08  8:45 ` lixiubo
  2017-03-16 20:50   ` Bryant G. Ly
  4 siblings, 1 reply; 15+ messages in thread
From: lixiubo @ 2017-03-08  8:45 UTC (permalink / raw)
  To: agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, Xiubo Li

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

Add two helpers to simplify the code, and move some code out of
the cmdr spin lock to reduce the size of critical region.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
---
 drivers/target/target_core_user.c | 54 ++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 117be07..5205d2f 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -366,18 +366,22 @@ static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
 	kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
 }
 
-static inline uint32_t tcmu_cmd_get_dbi_len(struct se_cmd *se_cmd)
+static inline uint32_t tcmu_cmd_get_data_length(struct se_cmd *se_cmd)
 {
 	size_t data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
-	uint32_t dbi_len;
 
 	if (se_cmd->se_cmd_flags & SCF_BIDI)
 		data_length += round_up(se_cmd->t_bidi_data_sg->length,
 					DATA_BLOCK_SIZE);
 
-	dbi_len = data_length / DATA_BLOCK_SIZE;
+	return data_length;
+}
+
+static inline uint32_t tcmu_cmd_get_dbi_len(struct se_cmd *se_cmd)
+{
+	size_t data_length = tcmu_cmd_get_data_length(se_cmd);
 
-	return dbi_len;
+	return data_length / DATA_BLOCK_SIZE;
 }
 
 static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
@@ -624,13 +628,27 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	return true;
 }
 
+static inline void tcmu_cmd_get_cmd_size(struct se_cmd *se_cmd,
+			size_t *base_size, size_t *cmd_size)
+{
+	struct se_cmd *se_cmd = tcmu_cmd->se_cmd;
+
+	*base_size = max(offsetof(struct tcmu_cmd_entry,
+				 req.iov[tcmu_cmd->dbi_len]),
+				 sizeof(struct tcmu_cmd_entry));
+
+	*cmd_size = round_up(scsi_command_size(se_cmd->t_task_cdb),
+			TCMU_OP_ALIGN_SIZE) + *base_size;
+	WARN_ON(*cmd_size & (TCMU_OP_ALIGN_SIZE-1));
+}
+
 static sense_reason_t
 tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
 {
 	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
 	struct se_cmd *se_cmd = tcmu_cmd->se_cmd;
 	size_t base_command_size, command_size;
-	struct tcmu_mailbox *mb;
+	struct tcmu_mailbox *mb = udev->mb_addr;
 	struct tcmu_cmd_entry *entry;
 	struct iovec *iov;
 	int iov_cnt, ret;
@@ -642,6 +660,8 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	if (test_bit(TCMU_DEV_BIT_BROKEN, &udev->flags))
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 
+	if (se_cmd->se_cmd_flags & SCF_BIDI)
+		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
 	/*
 	 * Must be a certain minimum size for response sense info, but
 	 * also may be larger if the iov array is large.
@@ -649,33 +669,19 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	 * We prepare way too many iovs for potential uses here, because it's
 	 * expensive to tell how many regions are freed in the bitmap
 	*/
-	base_command_size = max(offsetof(struct tcmu_cmd_entry,
-				req.iov[tcmu_cmd->dbi_len]),
-				sizeof(struct tcmu_cmd_entry));
-	command_size = base_command_size
-		+ round_up(scsi_command_size(se_cmd->t_task_cdb), TCMU_OP_ALIGN_SIZE);
-
-	WARN_ON(command_size & (TCMU_OP_ALIGN_SIZE-1));
-
-	spin_lock_irq(&udev->cmdr_lock);
-
-	mb = udev->mb_addr;
-	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
-	data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
-	if (se_cmd->se_cmd_flags & SCF_BIDI) {
-		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
-		data_length += round_up(se_cmd->t_bidi_data_sg->length,
-				DATA_BLOCK_SIZE);
-	}
+	tcmu_cmd_get_cmd_size(tcmu_cmd, &base_command_size, &command_size);
+	data_length = tcmu_cmd_get_data_length(se_cmd);
 	if ((command_size > (udev->cmdr_size / 2)) ||
 	    data_length > udev->data_size) {
 		pr_warn("TCMU: Request of size %zu/%zu is too big for %u/%zu "
 			"cmd ring/data area\n", command_size, data_length,
 			udev->cmdr_size, udev->data_size);
-		spin_unlock_irq(&udev->cmdr_lock);
 		return TCM_INVALID_CDB_FIELD;
 	}
 
+	spin_lock_irq(&udev->cmdr_lock);
+
+	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
 	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
 		int ret;
 		DEFINE_WAIT(__wait);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/5] target/user: Add global data block pool support
  2017-03-08  8:45 ` [PATCHv2 2/5] target/user: Add global data block pool support lixiubo
@ 2017-03-08 20:20   ` Andy Grover
  2017-03-16  9:39     ` Xiubo Li
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Grover @ 2017-03-08 20:20 UTC (permalink / raw)
  To: lixiubo, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, linux-mm

On 03/08/2017 12:45 AM, lixiubo@cmss.chinamobile.com wrote:
> From: Xiubo Li <lixiubo@cmss.chinamobile.com>
>
> For each target there will be one ring, when the target number
> grows larger and larger, it could eventually runs out of the
> system memories.
>
> In this patch for each target ring, the cmd area size will be
> limited to 8M and the data area size will be limited to 1G. And
> the data area will be divided into two parts: the fixed and
> growing.
>
> For the fixed part, it will be 1M size and pre-allocated together
> with the cmd area. This could speed up the low iops case, and
> also could make sure that each ring will have at least 1M private
> data size when there has too many targets, which could get their
> data blocks as quick as possible.
>
> For the growing part, it will get the blocks from the global data
> block pool. And this part will be used for high iops case.
>
> The global data block pool is a cache, and the total size will be
> limited to max 2G(grows from 0 to 2G as needed). And it will cache
> the freed data blocks by a list, All targets will get from/release
> to the free blocks here.

Hi Xiubo,

I will leave the detailed patch critique to others but this does seem to 
achieve the goals of 1) larger TCMU data buffers to prevent bottlenecks 
and 2) Allocating memory in a way that avoids using up all system memory 
in corner cases.

The one thing I'm still unsure about is what we need to do to maintain 
the data area's virtual mapping properly. Nobody on linux-mm answered my 
email a few days ago on the right way to do this, alas. But, userspace 
accessing the data area is going to cause tcmu_vma_fault() to be called, 
and it seems to me like we must proactively do something -- some kind of 
unmap call -- before we can reuse that memory for another, possibly 
completely unrelated, backstore's data area. This could allow one 
backstore handler to read or write another's data.

Regards -- Andy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/5] target/user: Add global data block pool support
  2017-03-08 20:20   ` Andy Grover
@ 2017-03-16  9:39     ` Xiubo Li
  2017-03-17  8:04         ` Xiubo Li
  0 siblings, 1 reply; 15+ messages in thread
From: Xiubo Li @ 2017-03-16  9:39 UTC (permalink / raw)
  To: Andy Grover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, linux-mm

[-- Attachment #1: Type: text/plain, Size: 21963 bytes --]


> Hi Xiubo,
>
> I will leave the detailed patch critique to others but this does seem 
> to achieve the goals of 1) larger TCMU data buffers to prevent 
> bottlenecks and 2) Allocating memory in a way that avoids using up all 
> system memory in corner cases.
>
> The one thing I'm still unsure about is what we need to do to maintain 
> the data area's virtual mapping properly. Nobody on linux-mm answered 
> my email a few days ago on the right way to do this, alas. But, 
> userspace accessing the data area is going to cause tcmu_vma_fault() 
> to be called, and it seems to me like we must proactively do something 
> -- some kind of unmap call -- before we can reuse that memory for 
> another, possibly completely unrelated, backstore's data area. This 
> could allow one backstore handler to read or write another's data.
>

Hi Andy, Mike

These days what I have gotten is that the unmap_mapping_range() could be 
used.
At the same time I have deep into the mm code and fixed the double usage of
the data blocks and possible page fault call trace bugs mentioned above.

Following is the V3 patch. I have test this using 4 targets & fio for 
about 2 days, so
far so good.

I'm still testing this using more complex test case.

Thanks

From: Xiubo Li<lixiubo@cmss.chinamobile.com>

For each target there will be one ring, when the target number
grows larger and larger, it could eventually runs out of the
system memories.

In this patch for each target ring, for the cmd area the size
will be limited to 8MB and for the data area the size will be
limited to 256K * PAGE_SIZE.

For all the targets' data areas, they will get empty blocks
from the "global data block pool", which has limited to 512K *
PAGE_SIZE for now.

When the "global data block pool" has been used up, then any
target could trigger the unmapping thread routine to shrink the
targets' rings. And for the idle targets the unmapping routine
will reserve 256 blocks at least.

When user space has touched the data blocks out of the iov[N],
the tcmu_page_fault() will return one zeroed blocks.

Signed-off-by: Xiubo Li<lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu<hujianfei@cmss.chinamobile.com>
---
  drivers/target/target_core_user.c | 433 ++++++++++++++++++++++++++++++--------
  1 file changed, 349 insertions(+), 84 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index e904bc0..bbc52074 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -30,6 +30,8 @@
  #include <linux/stringify.h>
  #include <linux/bitops.h>
  #include <linux/highmem.h>
+#include <linux/mutex.h>
+#include <linux/kthread.h>
  #include <net/genetlink.h>
  #include <scsi/scsi_common.h>
  #include <scsi/scsi_proto.h>
@@ -66,17 +68,24 @@
  
  #define TCMU_TIME_OUT (30 * MSEC_PER_SEC)
  
-/* For cmd area, the size is fixed 2M */
-#define CMDR_SIZE (2 * 1024 * 1024)
+/* For cmd area, the size is fixed 8MB */
+#define CMDR_SIZE (8 * 1024 * 1024)
  
-/* For data area, the size is fixed 32M */
-#define DATA_BLOCK_BITS (8 * 1024)
-#define DATA_BLOCK_SIZE 4096
+/*
+ * For data area, the block size is PAGE_SIZE and
+ * the total size is 256K * PAGE_SIZE.
+ */
+#define DATA_BLOCK_SIZE PAGE_SIZE
+#define DATA_BLOCK_BITS (256 * 1024)
  #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
+#define DATA_BLOCK_RES_BITS 256
  
-/* The ring buffer size is 34M */
+/* The total size of the ring is 8M + 256K * PAGE_SIZE */
  #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
  
+/* Default maximum of the global data blocks(512K * PAGE_SIZE) */
+#define TCMU_GLOBAL_MAX_BLOCKS (512 * 1024)
+
  static struct device *tcmu_root_device;
  
  struct tcmu_hba {
@@ -86,6 +95,8 @@ struct tcmu_hba {
  #define TCMU_CONFIG_LEN 256
  
  struct tcmu_dev {
+	struct list_head node;
+
  	struct se_device se_dev;
  
  	char *name;
@@ -97,6 +108,15 @@ struct tcmu_dev {
  
  	struct uio_info uio_info;
  
+	struct inode *inode;
+
+	bool unmapping;
+	bool waiting_global;
+	uint32_t dbi_cur;
+	uint32_t dbi_thresh;
+	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
+	struct radix_tree_root data_blocks;
+
  	struct tcmu_mailbox *mb_addr;
  	size_t dev_size;
  	u32 cmdr_size;
@@ -110,10 +130,6 @@ struct tcmu_dev {
  	/* TODO should this be a mutex? */
  	spinlock_t cmdr_lock;
  
-	uint32_t dbi_cur;
-	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
-	struct radix_tree_root data_blocks;
-
  	struct idr commands;
  	spinlock_t commands_lock;
  
@@ -137,6 +153,11 @@ struct tcmu_cmd {
  	uint32_t *dbi;
  };
  
+static wait_queue_head_t g_wait;
+static DEFINE_MUTEX(g_mutex);
+static LIST_HEAD(root_udev);
+static spinlock_t g_lock;
+static unsigned long global_db_count;
  static struct kmem_cache *tcmu_cmd_cache;
  
  /* multicast group */
@@ -160,54 +181,89 @@ enum tcmu_multicast_groups {
  	.netnsok = true,
  };
  
-static int tcmu_db_get_empty_block(struct tcmu_dev *udev, void **addr)
+#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
+#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
+#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
+
+static inline void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd, uint32_t len)
  {
-	void *p;
-	uint32_t dbi;
-	int ret;
+	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
+	uint32_t i;
  
-	dbi = find_first_zero_bit(udev->data_bitmap, DATA_BLOCK_BITS);
-	if (dbi > udev->dbi_cur)
-		udev->dbi_cur = dbi;
+	for (i = 0; i < len; i++)
+		clear_bit(tcmu_cmd->dbi[i], udev->data_bitmap);
+}
  
-	set_bit(dbi, udev->data_bitmap);
+static inline bool get_empty_growing_block(struct tcmu_dev *udev,
+					   struct tcmu_cmd *tcmu_cmd)
+{
+	struct page *page;
+	int ret, dbi;
  
-	p = radix_tree_lookup(&udev->data_blocks, dbi);
-	if (!p) {
-		p = kzalloc(DATA_BLOCK_SIZE, GFP_ATOMIC);
-		if (!p) {
-			clear_bit(dbi, udev->data_bitmap);
-			return -ENOMEM;
+	dbi = find_first_zero_bit(udev->data_bitmap, udev->dbi_thresh);
+	if (dbi == udev->dbi_thresh)
+		return false;
+
+	page = radix_tree_lookup(&udev->data_blocks, dbi);
+	if (!page) {
+		/* try to get new page from the mm */
+		spin_lock_irq(&g_lock);
+		if (global_db_count >= TCMU_GLOBAL_MAX_BLOCKS) {
+			spin_unlock_irq(&g_lock);
+			wake_up(&g_wait);
+			return false;
+		}
+		global_db_count++;
+		spin_unlock_irq(&g_lock);
+
+		page = alloc_page(GFP_ATOMIC);
+		if (!page) {
+			return false;
  		}
  
-		ret = radix_tree_insert(&udev->data_blocks, dbi, p);
+		ret = radix_tree_insert(&udev->data_blocks, dbi, page);
  		if (ret) {
-			kfree(p);
-			clear_bit(dbi, udev->data_bitmap);
-			return ret;
+			__free_page(page);
+			return false;
  		}
  	}
  
-	*addr = p;
-	return dbi;
+	if (dbi > udev->dbi_cur)
+		udev->dbi_cur = dbi;
+
+	set_bit(dbi, udev->data_bitmap);
+	tcmu_cmd_set_dbi(tcmu_cmd, dbi);
+
+	return true;
  }
  
-static void *tcmu_db_get_block_addr(struct tcmu_dev *udev, uint32_t dbi)
+static bool tcmu_db_get_empty_blocks(struct tcmu_dev *udev,
+				     struct tcmu_cmd *tcmu_cmd)
  {
-	return radix_tree_lookup(&udev->data_blocks, dbi);
-}
+	int i;
  
-#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
-#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
-#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
+	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
+	for (i = 0; i < tcmu_cmd->dbi_len; i++) {
+		if (!get_empty_growing_block(udev, tcmu_cmd))
+			goto err;
+	}
+	return true;
  
-static void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd)
+err:
+	tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cur);
+	udev->waiting_global = true;
+	return false;
+}
+
+static struct page *tcmu_db_get_block_page(struct tcmu_dev *udev, uint32_t dbi)
  {
-	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
-	uint32_t bi;
+	struct page *page;
  
-	for (bi = 0; bi < tcmu_cmd->dbi_len; bi++)
-		clear_bit(tcmu_cmd->dbi[bi], udev->data_bitmap);
+	page = radix_tree_lookup(&udev->data_blocks, dbi);
+	if (!page)
+		return NULL;
+
+	return page;
  }
  
  static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
@@ -344,17 +400,20 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
  	void *from, *to = NULL;
  	size_t copy_bytes, to_offset, offset;
  	struct scatterlist *sg;
+	struct page *page;
  
  	for_each_sg(data_sg, sg, data_nents, i) {
  		int sg_remaining = sg->length;
  		from = kmap_atomic(sg_page(sg)) + sg->offset;
  		while (sg_remaining > 0) {
  			if (block_remaining == 0) {
+				if (to)
+					kunmap_atomic(to);
+
  				block_remaining = DATA_BLOCK_SIZE;
-				dbi = tcmu_db_get_empty_block(udev, &to);
-				if (dbi < 0)
-					return dbi;
-				tcmu_cmd_set_dbi(tcmu_cmd, dbi);
+				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
+				page = tcmu_db_get_block_page(udev, dbi);
+				to = kmap_atomic(page);
  			}
  
  			copy_bytes = min_t(size_t, sg_remaining,
@@ -362,7 +421,7 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
  			to_offset = get_block_offset_user(udev, dbi,
  					block_remaining);
  			offset = DATA_BLOCK_SIZE - block_remaining;
-			to = (void *)(unsigned long)to + offset;
+			to = (void *)((unsigned long)to + offset);
  
  			if (*iov_cnt != 0 &&
  			    to_offset == iov_tail(udev, *iov)) {
@@ -382,6 +441,8 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
  		}
  		kunmap_atomic(from - sg->offset);
  	}
+	if (to)
+		kunmap_atomic(to);
  
  	return 0;
  }
@@ -391,23 +452,28 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
  {
  	int i, dbi;
  	int block_remaining = 0;
-	void *from, *to;
+	void *from = NULL, *to;
  	size_t copy_bytes, offset;
  	struct scatterlist *sg;
+	struct page *page;
  
  	for_each_sg(data_sg, sg, data_nents, i) {
  		int sg_remaining = sg->length;
  		to = kmap_atomic(sg_page(sg)) + sg->offset;
  		while (sg_remaining > 0) {
  			if (block_remaining == 0) {
+				if (from)
+					kunmap_atomic(from);
+
  				block_remaining = DATA_BLOCK_SIZE;
  				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
-				from = tcmu_db_get_block_addr(udev, dbi);
+				page = tcmu_db_get_block_page(udev, dbi);
+				from = kmap_atomic(page);
  			}
  			copy_bytes = min_t(size_t, sg_remaining,
  					block_remaining);
  			offset = DATA_BLOCK_SIZE - block_remaining;
-			from = (void *)(unsigned long)from + offset;
+			from = (void *)((unsigned long)from + offset);
  			tcmu_flush_dcache_range(from, copy_bytes);
  			memcpy(to + sg->length - sg_remaining, from,
  					copy_bytes);
@@ -417,12 +483,13 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
  		}
  		kunmap_atomic(to - sg->offset);
  	}
+	if (from)
+		kunmap_atomic(from);
  }
  
-static inline size_t spc_bitmap_free(unsigned long *bitmap)
+static inline size_t spc_bitmap_free(unsigned long *bitmap, uint32_t thresh)
  {
-	return DATA_BLOCK_SIZE * (DATA_BLOCK_BITS -
-			bitmap_weight(bitmap, DATA_BLOCK_BITS));
+	return DATA_BLOCK_SIZE * (thresh - bitmap_weight(bitmap, thresh));
  }
  
  /*
@@ -431,12 +498,14 @@ static inline size_t spc_bitmap_free(unsigned long *bitmap)
   *
   * Called with ring lock held.
   */
-static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t data_needed)
+static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
+		size_t cmd_size, size_t data_needed)
  {
  	struct tcmu_mailbox *mb = udev->mb_addr;
  	size_t space, cmd_needed;
  	u32 cmd_head;
  
+	udev->waiting_global = false;
  	tcmu_flush_dcache_range(mb, sizeof(*mb));
  
  	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
@@ -457,10 +526,24 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
  		return false;
  	}
  
-	space = spc_bitmap_free(udev->data_bitmap);
+	/* try to check and get the data blocks as needed */
+	space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
  	if (space < data_needed) {
-		pr_debug("no data space: only %zu available, but ask for %zu\n",
-				space, data_needed);
+		if (udev->unmapping) {
+			pr_debug("no data space: only %zu available, but ask for %zu\n",
+					space, data_needed);
+			return false;
+		} else {
+			udev->dbi_thresh += udev->dbi_thresh / 2;
+			udev->dbi_thresh = min((int)udev->dbi_thresh, DATA_BLOCK_BITS);
+			space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
+			if (space < data_needed)
+				return false;
+		}
+	}
+
+	if (!tcmu_db_get_empty_blocks(udev, cmd)) {
+		pr_debug("no data space: ask for %zu\n", data_needed);
  		return false;
  	}
  
@@ -519,7 +602,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
  		return TCM_INVALID_CDB_FIELD;
  	}
  
-	while (!is_ring_space_avail(udev, command_size, data_length)) {
+	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
  		int ret;
  		DEFINE_WAIT(__wait);
  
@@ -567,6 +650,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
  	entry->hdr.uflags = 0;
  
  	/* Handle allocating space from the data area */
+	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
  	iov = &entry->req.iov[0];
  	iov_cnt = 0;
  	copy_to_data_area = (se_cmd->data_direction == DMA_TO_DEVICE
@@ -664,7 +748,7 @@ static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *
  	target_complete_cmd(cmd->se_cmd, entry->rsp.scsi_status);
  
  	cmd->se_cmd = NULL;
-	tcmu_cmd_free_data(cmd);
+	tcmu_cmd_free_data(cmd, cmd->dbi_len);
  	tcmu_free_cmd(cmd);
  }
  
@@ -783,41 +867,80 @@ static int tcmu_irqcontrol(struct uio_info *info, s32 irq_on)
  
  static void tcmu_db_release(struct tcmu_dev *udev, bool release_pending)
  {
-	uint32_t dbi, end;
-	void *addr;
+	int dbi = -1, end;
+	struct page *page;
  
  	spin_lock_irq(&udev->cmdr_lock);
-
  	end = udev->dbi_cur + 1;
  
-	/* try to release all unused blocks */
-	dbi = find_first_zero_bit(udev->data_bitmap, end);
-	if (dbi >= end) {
-		spin_unlock_irq(&udev->cmdr_lock);
-		return;
-	}
+	/* try to release all unused but has mapped blocks */
  	do {
-		addr = radix_tree_delete(&udev->data_blocks, dbi);
-		kfree(addr);
-
  		dbi = find_next_zero_bit(udev->data_bitmap, end, dbi + 1);
-	} while (dbi < end);
+		if (dbi == end)
+			break;
  
-	if (!release_pending)
-		return;
+		/*
+		 * When the bit is cleared and p != NULL, meaning that
+		 * this tcmu block had already freed-after-use.
+		 *
+		 * If p->user == 0, meaning that the current ring buffer
+		 * is the last or the only user of the tcmu block, and
+		 * it must already in the free list, so it could be
+		 * remove from the list and then released its memories.
+		 *
+		 * If p->user != 0, meaning that the current tcmu block is
+		 * still referenced by other ring buffers, so just ignore
+		 * it without doing anyting.
+		 */
+		page = radix_tree_delete(&udev->data_blocks, dbi);
+		if (page) {
+				__free_page(page);
+				spin_lock_irq(&g_lock);
+				global_db_count--;
+				spin_unlock_irq(&g_lock);
+		}
+	} while (1);
  
-	/* try to release all pending blocks */
-	dbi = find_first_bit(udev->data_bitmap, end);
-	if (dbi >= end) {
+	if (!release_pending) {
  		spin_unlock_irq(&udev->cmdr_lock);
  		return;
  	}
-	do {
-		addr = radix_tree_delete(&udev->data_blocks, dbi);
-		kfree(addr);
  
+	/* try to release all pending blocks */
+	dbi = -1;
+	do {
  		dbi = find_next_bit(udev->data_bitmap, end, dbi + 1);
-	} while (dbi < end);
+		if (dbi == end)
+			break;
+
+		clear_bit(dbi, udev->data_bitmap);
+
+		/*
+		 * When the bit is set and p != NULL, meaning that this
+		 * tcmu block is still being used here.
+		 *
+		 * If p->user == 0, meaning that the current ring buffer
+		 * is the last or the only user of this tcmu block, and
+		 * it won't in the free list, so could just release its
+		 * memories.
+		 *
+		 * If the p->user != 0, we should insert it to the free
+		 * list.
+		 *
+		 * p == NULL means that the current ring buffer is broken.
+		 */
+		page = radix_tree_delete(&udev->data_blocks, dbi);
+		if (page) {
+				__free_page(page);
+				spin_lock_irq(&g_lock);
+				global_db_count--;
+				spin_unlock_irq(&g_lock);
+		} else {
+			pr_err("block page not found, ring is broken\n");
+			set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags);
+			break;
+		}
+	} while (1);
  
  	spin_unlock_irq(&udev->cmdr_lock);
  }
@@ -846,6 +969,43 @@ static int tcmu_find_mem_index(struct vm_area_struct *vma)
  	return -1;
  }
  
+/*
+ * Normally it shouldn't be here. This is just for avoid
+ * the page fault call trace, and will return zeroed page.
+ */
+static struct page *tcmu_try_to_alloc_new_page(struct tcmu_dev *udev, uint32_t dbi)
+{
+	struct page *page;
+	int ret;
+
+	if (dbi >= udev->dbi_thresh) {
+		udev->dbi_thresh = dbi;
+		udev->dbi_cur = dbi;
+	}
+
+	page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+	if (!page) {
+		return NULL;
+	}
+
+	ret = radix_tree_insert(&udev->data_blocks, dbi, page);
+	if (ret) {
+		__free_page(page);
+		return NULL;
+	}
+
+	/*
+	 * Since this case is rare in page fault routine, here we
+	 * will allow the global_db_count >= TCMU_GLOBAL_MAX_BLOCKS
+	 * to reduce possible page fault call trace.
+	 */
+	spin_lock_irq(&g_lock);
+	global_db_count++;
+	spin_unlock_irq(&g_lock);
+
+	return page;
+}
+
  static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
  {
  	struct tcmu_dev *udev = vma->vm_private_data;
@@ -869,14 +1029,17 @@ static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
  		addr = (void *)(unsigned long)info->mem[mi].addr + offset;
  		page = vmalloc_to_page(addr);
  	} else {
-		/* For the dynamically growing data area pages */
  		uint32_t dbi;
  
+		/* For the dynamically growing data area pages */
  		dbi = (offset - udev->data_off) / DATA_BLOCK_SIZE;
-		addr = tcmu_db_get_block_addr(udev, dbi);
-		if (!addr)
+		spin_lock_irq(&udev->cmdr_lock);
+		page = tcmu_db_get_block_page(udev, dbi);
+		if (!page)
+			page = tcmu_try_to_alloc_new_page(udev, dbi);
+		spin_unlock_irq(&udev->cmdr_lock);
+		if (!page)
  			return VM_FAULT_NOPAGE;
-		page = virt_to_page(addr);
  	}
  
  	get_page(page);
@@ -913,6 +1076,8 @@ static int tcmu_open(struct uio_info *info, struct inode *inode)
  	if (test_and_set_bit(TCMU_DEV_BIT_OPEN, &udev->flags))
  		return -EBUSY;
  
+	udev->inode = inode;
+
  	pr_debug("open\n");
  
  	return 0;
@@ -1003,6 +1168,8 @@ static int tcmu_configure_device(struct se_device *dev)
  	udev->cmdr_size = CMDR_SIZE - CMDR_OFF;
  	udev->data_off = CMDR_SIZE;
  	udev->data_size = DATA_SIZE;
+	udev->dbi_thresh = DATA_BLOCK_BITS;
+	udev->unmapping = false;
  
  	/* Initialise the mailbox of the ring buffer */
  	mb = udev->mb_addr;
@@ -1048,6 +1215,10 @@ static int tcmu_configure_device(struct se_device *dev)
  	if (ret)
  		goto err_netlink;
  
+	mutex_lock(&g_mutex);
+	list_add(&udev->node, &root_udev);
+	mutex_unlock(&g_mutex);
+
  	return 0;
  
  err_netlink:
@@ -1072,6 +1243,10 @@ static void tcmu_free_device(struct se_device *dev)
  {
  	struct tcmu_dev *udev = TCMU_DEV(dev);
  
+	mutex_lock(&g_mutex);
+	list_del(&udev->node);
+	mutex_unlock(&g_mutex);
+
  	vfree(udev->mb_addr);
  
  	/* Upper layer should drain all requests before calling this */
@@ -1235,12 +1410,90 @@ static sector_t tcmu_get_blocks(struct se_device *dev)
  	.tb_dev_attrib_attrs	= passthrough_attrib_attrs,
  };
  
+static struct task_struct *unmap_thread;
+
+/*
+ * The unmapping thread routine.
+ */
+static int unmap_thread_fn(void *data)
+{
+	struct tcmu_dev *udev;
+	loff_t offset;
+	uint32_t start, end, dbi;
+	struct page *page;
+	bool unmapped;
+	int i;
+
+	while (1) {
+		DEFINE_WAIT(__wait);
+
+		prepare_to_wait(&g_wait, &__wait, TASK_INTERRUPTIBLE);
+		schedule();
+		finish_wait(&g_wait, &__wait);
+
+		unmapped = false;
+		mutex_lock(&g_mutex);
+		list_for_each_entry(udev, &root_udev, node) {
+			spin_lock_irq(&udev->cmdr_lock);
+			end = udev->dbi_cur + 1;
+			dbi = find_last_bit(udev->data_bitmap, end);
+			if (dbi == end) {
+				/*
+				 * Reserved for DATA_BLOCK_RES_BITS
+				 * blocks for idle udev
+				 */
+				dbi = DATA_BLOCK_RES_BITS - 1;
+				udev->dbi_cur = 0;
+			} else {
+				udev->dbi_cur = dbi;
+			}
+
+			udev->dbi_thresh = start = dbi + 1;
+			if (start >= end) {
+				spin_unlock_irq(&udev->cmdr_lock);
+				continue;
+			}
+			udev->unmapping = true;
+			spin_unlock_irq(&udev->cmdr_lock);
+
+			/* Here will truncate the ring from offset */
+			offset = udev->data_off + start * DATA_BLOCK_SIZE;
+			unmap_mapping_range(udev->inode->i_mapping, offset, 0, 1);
+			unmapped = true;
+
+			spin_lock_irq(&udev->cmdr_lock);
+			for (i = start; i < end; i++) {
+				page = radix_tree_delete(&udev->data_blocks, i);
+				if (page) {
+					__free_page(page);
+					spin_lock_irq(&g_lock);
+					global_db_count--;
+					spin_unlock_irq(&g_lock);
+				}
+			}
+			udev->unmapping = false;
+			spin_unlock_irq(&udev->cmdr_lock);
+		}
+
+		if (unmapped) {
+			list_for_each_entry(udev, &root_udev, node)
+				if (udev->waiting_global)
+					wake_up(&udev->wait_cmdr);
+		}
+		mutex_unlock(&g_mutex);
+	}
+
+	return 0;
+}
+
  static int __init tcmu_module_init(void)
  {
  	int ret;
  
  	BUILD_BUG_ON((sizeof(struct tcmu_cmd_entry) % TCMU_OP_ALIGN_SIZE) != 0);
  
+	spin_lock_init(&g_lock);
+
  	tcmu_cmd_cache = kmem_cache_create("tcmu_cmd_cache",
  				sizeof(struct tcmu_cmd),
  				__alignof__(struct tcmu_cmd),
@@ -1263,8 +1516,17 @@ static int __init tcmu_module_init(void)
  	if (ret)
  		goto out_unreg_genl;
  
+	init_waitqueue_head(&g_wait);
+	unmap_thread = kthread_run(unmap_thread_fn, NULL, "tcmu_unmap");
+	if (IS_ERR(unmap_thread)) {
+		unmap_thread = NULL;
+		goto out_unreg_transport;
+	}
+
  	return 0;
  
+out_unreg_transport:
+	target_backend_unregister(&tcmu_ops);
  out_unreg_genl:
  	genl_unregister_family(&tcmu_genl_family);
  out_unreg_device:
@@ -1277,6 +1539,9 @@ static int __init tcmu_module_init(void)
  
  static void __exit tcmu_module_exit(void)
  {
+	if (unmap_thread)
+		kthread_stop(unmap_thread);
+
  	target_backend_unregister(&tcmu_ops);
  	genl_unregister_family(&tcmu_genl_family);
  	root_device_unregister(tcmu_root_device);
-- 1.8.3.1



[-- Attachment #2: Type: text/html, Size: 23303 bytes --]

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 3/5] target/user: Fix possible overwrite of t_data_sg's last iov[]
  2017-03-08  8:45 ` [PATCHv2 3/5] target/user: Fix possible overwrite of t_data_sg's last iov[] lixiubo
@ 2017-03-16 18:23   ` Bryant G. Ly
  0 siblings, 0 replies; 15+ messages in thread
From: Bryant G. Ly @ 2017-03-16 18:23 UTC (permalink / raw)
  To: lixiubo, agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix



On 3/8/17 2:45 AM, lixiubo@cmss.chinamobile.com wrote:
> From: Xiubo Li <lixiubo@cmss.chinamobile.com>
>
> If there has BIDI data, its first iov[] will overwrite the last
> iov[] for se_cmd->t_data_sg.
>
> To fix this, we can just increase the iov pointer, but this may
> introuduce a new memory leakage bug: If the se_cmd->data_length
> and se_cmd->t_bidi_data_sg->length are all not aligned up to the
> DATA_BLOCK_SIZE, the actual length needed maybe larger than just
> sum of them.
>
> So, this could be avoided by rounding all the data lengthes up
> to DATA_BLOCK_SIZE.
>
> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
> ---
>   drivers/target/target_core_user.c | 32 +++++++++++++++++++-------------
>   1 file changed, 19 insertions(+), 13 deletions(-)

I have seen this in my environment:
(gdb) print *((tcmulib_cmd->iovec)+0)
$7 = {iov_base = 0x3fff7c3d0000, iov_len = 8192}
(gdb) print *((tcmulib_cmd->iovec)+1)
$3 = {iov_base = 0x3fff7c3da000, iov_len = 4096}
(gdb) print *((tcmulib_cmd->iovec)+2)
$4 = {iov_base = 0x3fff7c3dc000, iov_len = 16384}
(gdb) print *((tcmulib_cmd->iovec)+3)
$5 = {iov_base = 0x3fff7c3f7000, iov_len = 12288}
(gdb) print *((tcmulib_cmd->iovec)+4)
$6 = {iov_base = 0x1306e853c0028, iov_len = 128}  <--- bad pointer and length

So this fix would be great!

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 5/5] target/user: Clean up tcmu_queue_cmd_ring
  2017-03-08  8:45 ` [PATCHv2 5/5] target/user: Clean up tcmu_queue_cmd_ring lixiubo
@ 2017-03-16 20:50   ` Bryant G. Ly
  0 siblings, 0 replies; 15+ messages in thread
From: Bryant G. Ly @ 2017-03-16 20:50 UTC (permalink / raw)
  To: lixiubo, agrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix


> From: Xiubo Li <lixiubo@cmss.chinamobile.com>
>
> Add two helpers to simplify the code, and move some code out of
> the cmdr spin lock to reduce the size of critical region.
>
> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
> ---
>   drivers/target/target_core_user.c | 54 ++++++++++++++++++++++-----------------
>   1 file changed, 30 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index 117be07..5205d2f 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c

<SNIP>

>
> +static inline void tcmu_cmd_get_cmd_size(struct se_cmd *se_cmd,
> +			size_t *base_size, size_t *cmd_size)

Should this be tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd instead? Look at bottom.

> +{
> +	struct se_cmd *se_cmd = tcmu_cmd->se_cmd;
> +
> +	*base_size = max(offsetof(struct tcmu_cmd_entry,
> +				 req.iov[tcmu_cmd->dbi_len]),
> +				 sizeof(struct tcmu_cmd_entry));
> +
> +	*cmd_size = round_up(scsi_command_size(se_cmd->t_task_cdb),
> +			TCMU_OP_ALIGN_SIZE) + *base_size;
> +	WARN_ON(*cmd_size & (TCMU_OP_ALIGN_SIZE-1));
> +}
> +
>   static sense_reason_t
>   tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
>   {
>   	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
>   	struct se_cmd *se_cmd = tcmu_cmd->se_cmd;
>   	size_t base_command_size, command_size;
> -	struct tcmu_mailbox *mb;
> +	struct tcmu_mailbox *mb = udev->mb_addr;
>   	struct tcmu_cmd_entry *entry;
>   	struct iovec *iov;
>   	int iov_cnt, ret;
> @@ -642,6 +660,8 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
>   	if (test_bit(TCMU_DEV_BIT_BROKEN, &udev->flags))
>   		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
>
> +	if (se_cmd->se_cmd_flags & SCF_BIDI)
> +		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
>   	/*
>   	 * Must be a certain minimum size for response sense info, but
>   	 * also may be larger if the iov array is large.
> @@ -649,33 +669,19 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
>   	 * We prepare way too many iovs for potential uses here, because it's
>   	 * expensive to tell how many regions are freed in the bitmap
>   	*/
> -	base_command_size = max(offsetof(struct tcmu_cmd_entry,
> -				req.iov[tcmu_cmd->dbi_len]),
> -				sizeof(struct tcmu_cmd_entry));
> -	command_size = base_command_size
> -		+ round_up(scsi_command_size(se_cmd->t_task_cdb), TCMU_OP_ALIGN_SIZE);
> -
> -	WARN_ON(command_size & (TCMU_OP_ALIGN_SIZE-1));
> -
> -	spin_lock_irq(&udev->cmdr_lock);
> -
> -	mb = udev->mb_addr;
> -	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
> -	data_length = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);
> -	if (se_cmd->se_cmd_flags & SCF_BIDI) {
> -		BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
> -		data_length += round_up(se_cmd->t_bidi_data_sg->length,
> -				DATA_BLOCK_SIZE);
> -	}
> +	tcmu_cmd_get_cmd_size(tcmu_cmd, &base_command_size, &command_size);

<SNIP>

So Based upon your definition of tcmu_cmd_get_cmd_size, why did you pass in tcmu_cmd here? It wont compile in its current state.

-Bryant

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 4/5] target/user: Fix wrongly calculating of the base_command_size
  2017-03-08  8:45 ` [PATCHv2 4/5] target/user: Fix wrongly calculating of the base_command_size lixiubo
@ 2017-03-17  5:45   ` Xiubo Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiubo Li @ 2017-03-17  5:45 UTC (permalink / raw)
  To: agrover, nab, mchristi; +Cc: shli, sheng, linux-scsi, target-devel, namei.unix

On 2017年03月08日 16:45, lixiubo@cmss.chinamobile.com wrote:
> From: Xiubo Li <lixiubo@cmss.chinamobile.com>
>
> The t_data_nents and t_bidi_data_nents are all the numbers of the
> segments, and we couldn't be sure the size of the data area block
> will equal to size of the segment.
>
> Use the actually block number needed intead of the sum of segments.
>
> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
> ---
>   drivers/target/target_core_user.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index 99cd239..117be07 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -650,8 +650,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
>   	 * expensive to tell how many regions are freed in the bitmap
>   	*/
>   	base_command_size = max(offsetof(struct tcmu_cmd_entry,
> -				req.iov[se_cmd->t_bidi_data_nents +
> -					se_cmd->t_data_nents]),
> +				req.iov[tcmu_cmd->dbi_len]),
For the old code:

If the segment size is larger than the DATA_BLOCK_SIZE, and at the same 
time all the
data blocks won't be continues between each other(that to say each block 
will use
one iov: iov_cnt == block cnt > nents), the entry's cdb data will 
overlap with entry's
iov[] data.

Thanks,

BRs
Xiubo


>   				sizeof(struct tcmu_cmd_entry));
>   	command_size = base_command_size
>   		+ round_up(scsi_command_size(se_cmd->t_task_cdb), TCMU_OP_ALIGN_SIZE);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/5] target/user: Add global data block pool support
  2017-03-16  9:39     ` Xiubo Li
@ 2017-03-17  8:04         ` Xiubo Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiubo Li @ 2017-03-17  8:04 UTC (permalink / raw)
  To: Andy Grover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, linux-mm

[-- Attachment #1: Type: text/plain, Size: 23131 bytes --]

[...]
> These days what I have gotten is that the unmap_mapping_range() could 
> be used.
> At the same time I have deep into the mm code and fixed the double 
> usage of
> the data blocks and possible page fault call trace bugs mentioned above.
>
> Following is the V3 patch. I have test this using 4 targets & fio for 
> about 2 days, so
> far so good.
>
> I'm still testing this using more complex test case.
>
I have test it the whole day today:
- using 4 targets
- setting TCMU_GLOBAL_MAX_BLOCKS = [512 1K 1M 1G 2G]
- each target here needs more than 450 blocks when running
- fio: -iodepth [1 2 4 8 16] -thread -rw=[read write] -bs=[1K 2K 3K 5K 
7K 16K 64K 1M] -size=20G -numjobs=10 -runtime=1000  ...


The result:
- the system mm, no memory leakage happen.
- the same blocks will page fault again after the unmap.
- try to touch the blocks out of the iov[N], no page fault call trace 
output.
- works well all the day.

Thanks,

BRs
Xiubo

[...]
> From: Xiubo Li<lixiubo@cmss.chinamobile.com>
>
> For each target there will be one ring, when the target number
> grows larger and larger, it could eventually runs out of the
> system memories.
>
> In this patch for each target ring, for the cmd area the size
> will be limited to 8MB and for the data area the size will be
> limited to 256K * PAGE_SIZE.
>
> For all the targets' data areas, they will get empty blocks
> from the "global data block pool", which has limited to 512K *
> PAGE_SIZE for now.
>
> When the "global data block pool" has been used up, then any
> target could trigger the unmapping thread routine to shrink the
> targets' rings. And for the idle targets the unmapping routine
> will reserve 256 blocks at least.
>
> When user space has touched the data blocks out of the iov[N],
> the tcmu_page_fault() will return one zeroed blocks.
>
> Signed-off-by: Xiubo Li<lixiubo@cmss.chinamobile.com>
> Signed-off-by: Jianfei Hu<hujianfei@cmss.chinamobile.com>
> ---
>   drivers/target/target_core_user.c | 433 ++++++++++++++++++++++++++++++--------
>   1 file changed, 349 insertions(+), 84 deletions(-)
>
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index e904bc0..bbc52074 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -30,6 +30,8 @@
>   #include <linux/stringify.h>
>   #include <linux/bitops.h>
>   #include <linux/highmem.h>
> +#include <linux/mutex.h>
> +#include <linux/kthread.h>
>   #include <net/genetlink.h>
>   #include <scsi/scsi_common.h>
>   #include <scsi/scsi_proto.h>
> @@ -66,17 +68,24 @@
>   
>   #define TCMU_TIME_OUT (30 * MSEC_PER_SEC)
>   
> -/* For cmd area, the size is fixed 2M */
> -#define CMDR_SIZE (2 * 1024 * 1024)
> +/* For cmd area, the size is fixed 8MB */
> +#define CMDR_SIZE (8 * 1024 * 1024)
>   
> -/* For data area, the size is fixed 32M */
> -#define DATA_BLOCK_BITS (8 * 1024)
> -#define DATA_BLOCK_SIZE 4096
> +/*
> + * For data area, the block size is PAGE_SIZE and
> + * the total size is 256K * PAGE_SIZE.
> + */
> +#define DATA_BLOCK_SIZE PAGE_SIZE
> +#define DATA_BLOCK_BITS (256 * 1024)
>   #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
> +#define DATA_BLOCK_RES_BITS 256
>   
> -/* The ring buffer size is 34M */
> +/* The total size of the ring is 8M + 256K * PAGE_SIZE */
>   #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
>   
> +/* Default maximum of the global data blocks(512K * PAGE_SIZE) */
> +#define TCMU_GLOBAL_MAX_BLOCKS (512 * 1024)
> +
>   static struct device *tcmu_root_device;
>   
>   struct tcmu_hba {
> @@ -86,6 +95,8 @@ struct tcmu_hba {
>   #define TCMU_CONFIG_LEN 256
>   
>   struct tcmu_dev {
> +	struct list_head node;
> +
>   	struct se_device se_dev;
>   
>   	char *name;
> @@ -97,6 +108,15 @@ struct tcmu_dev {
>   
>   	struct uio_info uio_info;
>   
> +	struct inode *inode;
> +
> +	bool unmapping;
> +	bool waiting_global;
> +	uint32_t dbi_cur;
> +	uint32_t dbi_thresh;
> +	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
> +	struct radix_tree_root data_blocks;
> +
>   	struct tcmu_mailbox *mb_addr;
>   	size_t dev_size;
>   	u32 cmdr_size;
> @@ -110,10 +130,6 @@ struct tcmu_dev {
>   	/* TODO should this be a mutex? */
>   	spinlock_t cmdr_lock;
>   
> -	uint32_t dbi_cur;
> -	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
> -	struct radix_tree_root data_blocks;
> -
>   	struct idr commands;
>   	spinlock_t commands_lock;
>   
> @@ -137,6 +153,11 @@ struct tcmu_cmd {
>   	uint32_t *dbi;
>   };
>   
> +static wait_queue_head_t g_wait;
> +static DEFINE_MUTEX(g_mutex);
> +static LIST_HEAD(root_udev);
> +static spinlock_t g_lock;
> +static unsigned long global_db_count;
>   static struct kmem_cache *tcmu_cmd_cache;
>   
>   /* multicast group */
> @@ -160,54 +181,89 @@ enum tcmu_multicast_groups {
>   	.netnsok = true,
>   };
>   
> -static int tcmu_db_get_empty_block(struct tcmu_dev *udev, void **addr)
> +#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
> +#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
> +#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
> +
> +static inline void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd, uint32_t len)
>   {
> -	void *p;
> -	uint32_t dbi;
> -	int ret;
> +	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
> +	uint32_t i;
>   
> -	dbi = find_first_zero_bit(udev->data_bitmap, DATA_BLOCK_BITS);
> -	if (dbi > udev->dbi_cur)
> -		udev->dbi_cur = dbi;
> +	for (i = 0; i < len; i++)
> +		clear_bit(tcmu_cmd->dbi[i], udev->data_bitmap);
> +}
>   
> -	set_bit(dbi, udev->data_bitmap);
> +static inline bool get_empty_growing_block(struct tcmu_dev *udev,
> +					   struct tcmu_cmd *tcmu_cmd)
> +{
> +	struct page *page;
> +	int ret, dbi;
>   
> -	p = radix_tree_lookup(&udev->data_blocks, dbi);
> -	if (!p) {
> -		p = kzalloc(DATA_BLOCK_SIZE, GFP_ATOMIC);
> -		if (!p) {
> -			clear_bit(dbi, udev->data_bitmap);
> -			return -ENOMEM;
> +	dbi = find_first_zero_bit(udev->data_bitmap, udev->dbi_thresh);
> +	if (dbi == udev->dbi_thresh)
> +		return false;
> +
> +	page = radix_tree_lookup(&udev->data_blocks, dbi);
> +	if (!page) {
> +		/* try to get new page from the mm */
> +		spin_lock_irq(&g_lock);
> +		if (global_db_count >= TCMU_GLOBAL_MAX_BLOCKS) {
> +			spin_unlock_irq(&g_lock);
> +			wake_up(&g_wait);
> +			return false;
> +		}
> +		global_db_count++;
> +		spin_unlock_irq(&g_lock);
> +
> +		page = alloc_page(GFP_ATOMIC);
> +		if (!page) {
> +			return false;
>   		}
>   
> -		ret = radix_tree_insert(&udev->data_blocks, dbi, p);
> +		ret = radix_tree_insert(&udev->data_blocks, dbi, page);
>   		if (ret) {
> -			kfree(p);
> -			clear_bit(dbi, udev->data_bitmap);
> -			return ret;
> +			__free_page(page);
> +			return false;
>   		}
>   	}
>   
> -	*addr = p;
> -	return dbi;
> +	if (dbi > udev->dbi_cur)
> +		udev->dbi_cur = dbi;
> +
> +	set_bit(dbi, udev->data_bitmap);
> +	tcmu_cmd_set_dbi(tcmu_cmd, dbi);
> +
> +	return true;
>   }
>   
> -static void *tcmu_db_get_block_addr(struct tcmu_dev *udev, uint32_t dbi)
> +static bool tcmu_db_get_empty_blocks(struct tcmu_dev *udev,
> +				     struct tcmu_cmd *tcmu_cmd)
>   {
> -	return radix_tree_lookup(&udev->data_blocks, dbi);
> -}
> +	int i;
>   
> -#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
> -#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
> -#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
> +	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
> +	for (i = 0; i < tcmu_cmd->dbi_len; i++) {
> +		if (!get_empty_growing_block(udev, tcmu_cmd))
> +			goto err;
> +	}
> +	return true;
>   
> -static void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd)
> +err:
> +	tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cur);
> +	udev->waiting_global = true;
> +	return false;
> +}
> +
> +static struct page *tcmu_db_get_block_page(struct tcmu_dev *udev, uint32_t dbi)
>   {
> -	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
> -	uint32_t bi;
> +	struct page *page;
>   
> -	for (bi = 0; bi < tcmu_cmd->dbi_len; bi++)
> -		clear_bit(tcmu_cmd->dbi[bi], udev->data_bitmap);
> +	page = radix_tree_lookup(&udev->data_blocks, dbi);
> +	if (!page)
> +		return NULL;
> +
> +	return page;
>   }
>   
>   static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
> @@ -344,17 +400,20 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
>   	void *from, *to = NULL;
>   	size_t copy_bytes, to_offset, offset;
>   	struct scatterlist *sg;
> +	struct page *page;
>   
>   	for_each_sg(data_sg, sg, data_nents, i) {
>   		int sg_remaining = sg->length;
>   		from = kmap_atomic(sg_page(sg)) + sg->offset;
>   		while (sg_remaining > 0) {
>   			if (block_remaining == 0) {
> +				if (to)
> +					kunmap_atomic(to);
> +
>   				block_remaining = DATA_BLOCK_SIZE;
> -				dbi = tcmu_db_get_empty_block(udev, &to);
> -				if (dbi < 0)
> -					return dbi;
> -				tcmu_cmd_set_dbi(tcmu_cmd, dbi);
> +				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
> +				page = tcmu_db_get_block_page(udev, dbi);
> +				to = kmap_atomic(page);
>   			}
>   
>   			copy_bytes = min_t(size_t, sg_remaining,
> @@ -362,7 +421,7 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
>   			to_offset = get_block_offset_user(udev, dbi,
>   					block_remaining);
>   			offset = DATA_BLOCK_SIZE - block_remaining;
> -			to = (void *)(unsigned long)to + offset;
> +			to = (void *)((unsigned long)to + offset);
>   
>   			if (*iov_cnt != 0 &&
>   			    to_offset == iov_tail(udev, *iov)) {
> @@ -382,6 +441,8 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
>   		}
>   		kunmap_atomic(from - sg->offset);
>   	}
> +	if (to)
> +		kunmap_atomic(to);
>   
>   	return 0;
>   }
> @@ -391,23 +452,28 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
>   {
>   	int i, dbi;
>   	int block_remaining = 0;
> -	void *from, *to;
> +	void *from = NULL, *to;
>   	size_t copy_bytes, offset;
>   	struct scatterlist *sg;
> +	struct page *page;
>   
>   	for_each_sg(data_sg, sg, data_nents, i) {
>   		int sg_remaining = sg->length;
>   		to = kmap_atomic(sg_page(sg)) + sg->offset;
>   		while (sg_remaining > 0) {
>   			if (block_remaining == 0) {
> +				if (from)
> +					kunmap_atomic(from);
> +
>   				block_remaining = DATA_BLOCK_SIZE;
>   				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
> -				from = tcmu_db_get_block_addr(udev, dbi);
> +				page = tcmu_db_get_block_page(udev, dbi);
> +				from = kmap_atomic(page);
>   			}
>   			copy_bytes = min_t(size_t, sg_remaining,
>   					block_remaining);
>   			offset = DATA_BLOCK_SIZE - block_remaining;
> -			from = (void *)(unsigned long)from + offset;
> +			from = (void *)((unsigned long)from + offset);
>   			tcmu_flush_dcache_range(from, copy_bytes);
>   			memcpy(to + sg->length - sg_remaining, from,
>   					copy_bytes);
> @@ -417,12 +483,13 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
>   		}
>   		kunmap_atomic(to - sg->offset);
>   	}
> +	if (from)
> +		kunmap_atomic(from);
>   }
>   
> -static inline size_t spc_bitmap_free(unsigned long *bitmap)
> +static inline size_t spc_bitmap_free(unsigned long *bitmap, uint32_t thresh)
>   {
> -	return DATA_BLOCK_SIZE * (DATA_BLOCK_BITS -
> -			bitmap_weight(bitmap, DATA_BLOCK_BITS));
> +	return DATA_BLOCK_SIZE * (thresh - bitmap_weight(bitmap, thresh));
>   }
>   
>   /*
> @@ -431,12 +498,14 @@ static inline size_t spc_bitmap_free(unsigned long *bitmap)
>    *
>    * Called with ring lock held.
>    */
> -static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t data_needed)
> +static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
> +		size_t cmd_size, size_t data_needed)
>   {
>   	struct tcmu_mailbox *mb = udev->mb_addr;
>   	size_t space, cmd_needed;
>   	u32 cmd_head;
>   
> +	udev->waiting_global = false;
>   	tcmu_flush_dcache_range(mb, sizeof(*mb));
>   
>   	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
> @@ -457,10 +526,24 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>   		return false;
>   	}
>   
> -	space = spc_bitmap_free(udev->data_bitmap);
> +	/* try to check and get the data blocks as needed */
> +	space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
>   	if (space < data_needed) {
> -		pr_debug("no data space: only %zu available, but ask for %zu\n",
> -				space, data_needed);
> +		if (udev->unmapping) {
> +			pr_debug("no data space: only %zu available, but ask for %zu\n",
> +					space, data_needed);
> +			return false;
> +		} else {
> +			udev->dbi_thresh += udev->dbi_thresh / 2;
> +			udev->dbi_thresh = min((int)udev->dbi_thresh, DATA_BLOCK_BITS);
> +			space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
> +			if (space < data_needed)
> +				return false;
> +		}
> +	}
> +
> +	if (!tcmu_db_get_empty_blocks(udev, cmd)) {
> +		pr_debug("no data space: ask for %zu\n", data_needed);
>   		return false;
>   	}
>   
> @@ -519,7 +602,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>   		return TCM_INVALID_CDB_FIELD;
>   	}
>   
> -	while (!is_ring_space_avail(udev, command_size, data_length)) {
> +	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
>   		int ret;
>   		DEFINE_WAIT(__wait);
>   
> @@ -567,6 +650,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>   	entry->hdr.uflags = 0;
>   
>   	/* Handle allocating space from the data area */
> +	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
>   	iov = &entry->req.iov[0];
>   	iov_cnt = 0;
>   	copy_to_data_area = (se_cmd->data_direction == DMA_TO_DEVICE
> @@ -664,7 +748,7 @@ static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *
>   	target_complete_cmd(cmd->se_cmd, entry->rsp.scsi_status);
>   
>   	cmd->se_cmd = NULL;
> -	tcmu_cmd_free_data(cmd);
> +	tcmu_cmd_free_data(cmd, cmd->dbi_len);
>   	tcmu_free_cmd(cmd);
>   }
>   
> @@ -783,41 +867,80 @@ static int tcmu_irqcontrol(struct uio_info *info, s32 irq_on)
>   
>   static void tcmu_db_release(struct tcmu_dev *udev, bool release_pending)
>   {
> -	uint32_t dbi, end;
> -	void *addr;
> +	int dbi = -1, end;
> +	struct page *page;
>   
>   	spin_lock_irq(&udev->cmdr_lock);
> -
>   	end = udev->dbi_cur + 1;
>   
> -	/* try to release all unused blocks */
> -	dbi = find_first_zero_bit(udev->data_bitmap, end);
> -	if (dbi >= end) {
> -		spin_unlock_irq(&udev->cmdr_lock);
> -		return;
> -	}
> +	/* try to release all unused but has mapped blocks */
>   	do {
> -		addr = radix_tree_delete(&udev->data_blocks, dbi);
> -		kfree(addr);
> -
>   		dbi = find_next_zero_bit(udev->data_bitmap, end, dbi + 1);
> -	} while (dbi < end);
> +		if (dbi == end)
> +			break;
>   
> -	if (!release_pending)
> -		return;
> +		/*
> +		 * When the bit is cleared and p != NULL, meaning that
> +		 * this tcmu block had already freed-after-use.
> +		 *
> +		 * If p->user == 0, meaning that the current ring buffer
> +		 * is the last or the only user of the tcmu block, and
> +		 * it must already in the free list, so it could be
> +		 * remove from the list and then released its memories.
> +		 *
> +		 * If p->user != 0, meaning that the current tcmu block is
> +		 * still referenced by other ring buffers, so just ignore
> +		 * it without doing anyting.
> +		 */
> +		page = radix_tree_delete(&udev->data_blocks, dbi);
> +		if (page) {
> +				__free_page(page);
> +				spin_lock_irq(&g_lock);
> +				global_db_count--;
> +				spin_unlock_irq(&g_lock);
> +		}
> +	} while (1);
>   
> -	/* try to release all pending blocks */
> -	dbi = find_first_bit(udev->data_bitmap, end);
> -	if (dbi >= end) {
> +	if (!release_pending) {
>   		spin_unlock_irq(&udev->cmdr_lock);
>   		return;
>   	}
> -	do {
> -		addr = radix_tree_delete(&udev->data_blocks, dbi);
> -		kfree(addr);
>   
> +	/* try to release all pending blocks */
> +	dbi = -1;
> +	do {
>   		dbi = find_next_bit(udev->data_bitmap, end, dbi + 1);
> -	} while (dbi < end);
> +		if (dbi == end)
> +			break;
> +
> +		clear_bit(dbi, udev->data_bitmap);
> +
> +		/*
> +		 * When the bit is set and p != NULL, meaning that this
> +		 * tcmu block is still being used here.
> +		 *
> +		 * If p->user == 0, meaning that the current ring buffer
> +		 * is the last or the only user of this tcmu block, and
> +		 * it won't in the free list, so could just release its
> +		 * memories.
> +		 *
> +		 * If the p->user != 0, we should insert it to the free
> +		 * list.
> +		 *
> +		 * p == NULL means that the current ring buffer is broken.
> +		 */
> +		page = radix_tree_delete(&udev->data_blocks, dbi);
> +		if (page) {
> +				__free_page(page);
> +				spin_lock_irq(&g_lock);
> +				global_db_count--;
> +				spin_unlock_irq(&g_lock);
> +		} else {
> +			pr_err("block page not found, ring is broken\n");
> +			set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags);
> +			break;
> +		}
> +	} while (1);
>   
>   	spin_unlock_irq(&udev->cmdr_lock);
>   }
> @@ -846,6 +969,43 @@ static int tcmu_find_mem_index(struct vm_area_struct *vma)
>   	return -1;
>   }
>   
> +/*
> + * Normally it shouldn't be here. This is just for avoid
> + * the page fault call trace, and will return zeroed page.
> + */
> +static struct page *tcmu_try_to_alloc_new_page(struct tcmu_dev *udev, uint32_t dbi)
> +{
> +	struct page *page;
> +	int ret;
> +
> +	if (dbi >= udev->dbi_thresh) {
> +		udev->dbi_thresh = dbi;
> +		udev->dbi_cur = dbi;
> +	}
> +
> +	page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> +	if (!page) {
> +		return NULL;
> +	}
> +
> +	ret = radix_tree_insert(&udev->data_blocks, dbi, page);
> +	if (ret) {
> +		__free_page(page);
> +		return NULL;
> +	}
> +
> +	/*
> +	 * Since this case is rare in page fault routine, here we
> +	 * will allow the global_db_count >= TCMU_GLOBAL_MAX_BLOCKS
> +	 * to reduce possible page fault call trace.
> +	 */
> +	spin_lock_irq(&g_lock);
> +	global_db_count++;
> +	spin_unlock_irq(&g_lock);
> +
> +	return page;
> +}
> +
>   static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>   {
>   	struct tcmu_dev *udev = vma->vm_private_data;
> @@ -869,14 +1029,17 @@ static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>   		addr = (void *)(unsigned long)info->mem[mi].addr + offset;
>   		page = vmalloc_to_page(addr);
>   	} else {
> -		/* For the dynamically growing data area pages */
>   		uint32_t dbi;
>   
> +		/* For the dynamically growing data area pages */
>   		dbi = (offset - udev->data_off) / DATA_BLOCK_SIZE;
> -		addr = tcmu_db_get_block_addr(udev, dbi);
> -		if (!addr)
> +		spin_lock_irq(&udev->cmdr_lock);
> +		page = tcmu_db_get_block_page(udev, dbi);
> +		if (!page)
> +			page = tcmu_try_to_alloc_new_page(udev, dbi);
> +		spin_unlock_irq(&udev->cmdr_lock);
> +		if (!page)
>   			return VM_FAULT_NOPAGE;
> -		page = virt_to_page(addr);
>   	}
>   
>   	get_page(page);
> @@ -913,6 +1076,8 @@ static int tcmu_open(struct uio_info *info, struct inode *inode)
>   	if (test_and_set_bit(TCMU_DEV_BIT_OPEN, &udev->flags))
>   		return -EBUSY;
>   
> +	udev->inode = inode;
> +
>   	pr_debug("open\n");
>   
>   	return 0;
> @@ -1003,6 +1168,8 @@ static int tcmu_configure_device(struct se_device *dev)
>   	udev->cmdr_size = CMDR_SIZE - CMDR_OFF;
>   	udev->data_off = CMDR_SIZE;
>   	udev->data_size = DATA_SIZE;
> +	udev->dbi_thresh = DATA_BLOCK_BITS;
> +	udev->unmapping = false;
>   
>   	/* Initialise the mailbox of the ring buffer */
>   	mb = udev->mb_addr;
> @@ -1048,6 +1215,10 @@ static int tcmu_configure_device(struct se_device *dev)
>   	if (ret)
>   		goto err_netlink;
>   
> +	mutex_lock(&g_mutex);
> +	list_add(&udev->node, &root_udev);
> +	mutex_unlock(&g_mutex);
> +
>   	return 0;
>   
>   err_netlink:
> @@ -1072,6 +1243,10 @@ static void tcmu_free_device(struct se_device *dev)
>   {
>   	struct tcmu_dev *udev = TCMU_DEV(dev);
>   
> +	mutex_lock(&g_mutex);
> +	list_del(&udev->node);
> +	mutex_unlock(&g_mutex);
> +
>   	vfree(udev->mb_addr);
>   
>   	/* Upper layer should drain all requests before calling this */
> @@ -1235,12 +1410,90 @@ static sector_t tcmu_get_blocks(struct se_device *dev)
>   	.tb_dev_attrib_attrs	= passthrough_attrib_attrs,
>   };
>   
> +static struct task_struct *unmap_thread;
> +
> +/*
> + * The unmapping thread routine.
> + */
> +static int unmap_thread_fn(void *data)
> +{
> +	struct tcmu_dev *udev;
> +	loff_t offset;
> +	uint32_t start, end, dbi;
> +	struct page *page;
> +	bool unmapped;
> +	int i;
> +
> +	while (1) {
> +		DEFINE_WAIT(__wait);
> +
> +		prepare_to_wait(&g_wait, &__wait, TASK_INTERRUPTIBLE);
> +		schedule();
> +		finish_wait(&g_wait, &__wait);
> +
> +		unmapped = false;
> +		mutex_lock(&g_mutex);
> +		list_for_each_entry(udev, &root_udev, node) {
> +			spin_lock_irq(&udev->cmdr_lock);
> +			end = udev->dbi_cur + 1;
> +			dbi = find_last_bit(udev->data_bitmap, end);
> +			if (dbi == end) {
> +				/*
> +				 * Reserved for DATA_BLOCK_RES_BITS
> +				 * blocks for idle udev
> +				 */
> +				dbi = DATA_BLOCK_RES_BITS - 1;
> +				udev->dbi_cur = 0;
> +			} else {
> +				udev->dbi_cur = dbi;
> +			}
> +
> +			udev->dbi_thresh = start = dbi + 1;
> +			if (start >= end) {
> +				spin_unlock_irq(&udev->cmdr_lock);
> +				continue;
> +			}
> +			udev->unmapping = true;
> +			spin_unlock_irq(&udev->cmdr_lock);
> +
> +			/* Here will truncate the ring from offset */
> +			offset = udev->data_off + start * DATA_BLOCK_SIZE;
> +			unmap_mapping_range(udev->inode->i_mapping, offset, 0, 1);
> +			unmapped = true;
> +
> +			spin_lock_irq(&udev->cmdr_lock);
> +			for (i = start; i < end; i++) {
> +				page = radix_tree_delete(&udev->data_blocks, i);
> +				if (page) {
> +					__free_page(page);
> +					spin_lock_irq(&g_lock);
> +					global_db_count--;
> +					spin_unlock_irq(&g_lock);
> +				}
> +			}
> +			udev->unmapping = false;
> +			spin_unlock_irq(&udev->cmdr_lock);
> +		}
> +
> +		if (unmapped) {
> +			list_for_each_entry(udev, &root_udev, node)
> +				if (udev->waiting_global)
> +					wake_up(&udev->wait_cmdr);
> +		}
> +		mutex_unlock(&g_mutex);
> +	}
> +
> +	return 0;
> +}
> +
>   static int __init tcmu_module_init(void)
>   {
>   	int ret;
>   
>   	BUILD_BUG_ON((sizeof(struct tcmu_cmd_entry) % TCMU_OP_ALIGN_SIZE) != 0);
>   
> +	spin_lock_init(&g_lock);
> +
>   	tcmu_cmd_cache = kmem_cache_create("tcmu_cmd_cache",
>   				sizeof(struct tcmu_cmd),
>   				__alignof__(struct tcmu_cmd),
> @@ -1263,8 +1516,17 @@ static int __init tcmu_module_init(void)
>   	if (ret)
>   		goto out_unreg_genl;
>   
> +	init_waitqueue_head(&g_wait);
> +	unmap_thread = kthread_run(unmap_thread_fn, NULL, "tcmu_unmap");
> +	if (IS_ERR(unmap_thread)) {
> +		unmap_thread = NULL;
> +		goto out_unreg_transport;
> +	}
> +
>   	return 0;
>   
> +out_unreg_transport:
> +	target_backend_unregister(&tcmu_ops);
>   out_unreg_genl:
>   	genl_unregister_family(&tcmu_genl_family);
>   out_unreg_device:
> @@ -1277,6 +1539,9 @@ static int __init tcmu_module_init(void)
>   
>   static void __exit tcmu_module_exit(void)
>   {
> +	if (unmap_thread)
> +		kthread_stop(unmap_thread);
> +
>   	target_backend_unregister(&tcmu_ops);
>   	genl_unregister_family(&tcmu_genl_family);
>   	root_device_unregister(tcmu_root_device);
> -- 1.8.3.1
>


[-- Attachment #2: Type: text/html, Size: 23248 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/5] target/user: Add global data block pool support
@ 2017-03-17  8:04         ` Xiubo Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiubo Li @ 2017-03-17  8:04 UTC (permalink / raw)
  To: Andy Grover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, linux-mm

[-- Attachment #1: Type: text/plain, Size: 23131 bytes --]

[...]
> These days what I have gotten is that the unmap_mapping_range() could 
> be used.
> At the same time I have deep into the mm code and fixed the double 
> usage of
> the data blocks and possible page fault call trace bugs mentioned above.
>
> Following is the V3 patch. I have test this using 4 targets & fio for 
> about 2 days, so
> far so good.
>
> I'm still testing this using more complex test case.
>
I have test it the whole day today:
- using 4 targets
- setting TCMU_GLOBAL_MAX_BLOCKS = [512 1K 1M 1G 2G]
- each target here needs more than 450 blocks when running
- fio: -iodepth [1 2 4 8 16] -thread -rw=[read write] -bs=[1K 2K 3K 5K 
7K 16K 64K 1M] -size=20G -numjobs=10 -runtime=1000  ...


The result:
- the system mm, no memory leakage happen.
- the same blocks will page fault again after the unmap.
- try to touch the blocks out of the iov[N], no page fault call trace 
output.
- works well all the day.

Thanks,

BRs
Xiubo

[...]
> From: Xiubo Li<lixiubo@cmss.chinamobile.com>
>
> For each target there will be one ring, when the target number
> grows larger and larger, it could eventually runs out of the
> system memories.
>
> In this patch for each target ring, for the cmd area the size
> will be limited to 8MB and for the data area the size will be
> limited to 256K * PAGE_SIZE.
>
> For all the targets' data areas, they will get empty blocks
> from the "global data block pool", which has limited to 512K *
> PAGE_SIZE for now.
>
> When the "global data block pool" has been used up, then any
> target could trigger the unmapping thread routine to shrink the
> targets' rings. And for the idle targets the unmapping routine
> will reserve 256 blocks at least.
>
> When user space has touched the data blocks out of the iov[N],
> the tcmu_page_fault() will return one zeroed blocks.
>
> Signed-off-by: Xiubo Li<lixiubo@cmss.chinamobile.com>
> Signed-off-by: Jianfei Hu<hujianfei@cmss.chinamobile.com>
> ---
>   drivers/target/target_core_user.c | 433 ++++++++++++++++++++++++++++++--------
>   1 file changed, 349 insertions(+), 84 deletions(-)
>
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index e904bc0..bbc52074 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -30,6 +30,8 @@
>   #include <linux/stringify.h>
>   #include <linux/bitops.h>
>   #include <linux/highmem.h>
> +#include <linux/mutex.h>
> +#include <linux/kthread.h>
>   #include <net/genetlink.h>
>   #include <scsi/scsi_common.h>
>   #include <scsi/scsi_proto.h>
> @@ -66,17 +68,24 @@
>   
>   #define TCMU_TIME_OUT (30 * MSEC_PER_SEC)
>   
> -/* For cmd area, the size is fixed 2M */
> -#define CMDR_SIZE (2 * 1024 * 1024)
> +/* For cmd area, the size is fixed 8MB */
> +#define CMDR_SIZE (8 * 1024 * 1024)
>   
> -/* For data area, the size is fixed 32M */
> -#define DATA_BLOCK_BITS (8 * 1024)
> -#define DATA_BLOCK_SIZE 4096
> +/*
> + * For data area, the block size is PAGE_SIZE and
> + * the total size is 256K * PAGE_SIZE.
> + */
> +#define DATA_BLOCK_SIZE PAGE_SIZE
> +#define DATA_BLOCK_BITS (256 * 1024)
>   #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
> +#define DATA_BLOCK_RES_BITS 256
>   
> -/* The ring buffer size is 34M */
> +/* The total size of the ring is 8M + 256K * PAGE_SIZE */
>   #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
>   
> +/* Default maximum of the global data blocks(512K * PAGE_SIZE) */
> +#define TCMU_GLOBAL_MAX_BLOCKS (512 * 1024)
> +
>   static struct device *tcmu_root_device;
>   
>   struct tcmu_hba {
> @@ -86,6 +95,8 @@ struct tcmu_hba {
>   #define TCMU_CONFIG_LEN 256
>   
>   struct tcmu_dev {
> +	struct list_head node;
> +
>   	struct se_device se_dev;
>   
>   	char *name;
> @@ -97,6 +108,15 @@ struct tcmu_dev {
>   
>   	struct uio_info uio_info;
>   
> +	struct inode *inode;
> +
> +	bool unmapping;
> +	bool waiting_global;
> +	uint32_t dbi_cur;
> +	uint32_t dbi_thresh;
> +	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
> +	struct radix_tree_root data_blocks;
> +
>   	struct tcmu_mailbox *mb_addr;
>   	size_t dev_size;
>   	u32 cmdr_size;
> @@ -110,10 +130,6 @@ struct tcmu_dev {
>   	/* TODO should this be a mutex? */
>   	spinlock_t cmdr_lock;
>   
> -	uint32_t dbi_cur;
> -	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
> -	struct radix_tree_root data_blocks;
> -
>   	struct idr commands;
>   	spinlock_t commands_lock;
>   
> @@ -137,6 +153,11 @@ struct tcmu_cmd {
>   	uint32_t *dbi;
>   };
>   
> +static wait_queue_head_t g_wait;
> +static DEFINE_MUTEX(g_mutex);
> +static LIST_HEAD(root_udev);
> +static spinlock_t g_lock;
> +static unsigned long global_db_count;
>   static struct kmem_cache *tcmu_cmd_cache;
>   
>   /* multicast group */
> @@ -160,54 +181,89 @@ enum tcmu_multicast_groups {
>   	.netnsok = true,
>   };
>   
> -static int tcmu_db_get_empty_block(struct tcmu_dev *udev, void **addr)
> +#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
> +#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
> +#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
> +
> +static inline void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd, uint32_t len)
>   {
> -	void *p;
> -	uint32_t dbi;
> -	int ret;
> +	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
> +	uint32_t i;
>   
> -	dbi = find_first_zero_bit(udev->data_bitmap, DATA_BLOCK_BITS);
> -	if (dbi > udev->dbi_cur)
> -		udev->dbi_cur = dbi;
> +	for (i = 0; i < len; i++)
> +		clear_bit(tcmu_cmd->dbi[i], udev->data_bitmap);
> +}
>   
> -	set_bit(dbi, udev->data_bitmap);
> +static inline bool get_empty_growing_block(struct tcmu_dev *udev,
> +					   struct tcmu_cmd *tcmu_cmd)
> +{
> +	struct page *page;
> +	int ret, dbi;
>   
> -	p = radix_tree_lookup(&udev->data_blocks, dbi);
> -	if (!p) {
> -		p = kzalloc(DATA_BLOCK_SIZE, GFP_ATOMIC);
> -		if (!p) {
> -			clear_bit(dbi, udev->data_bitmap);
> -			return -ENOMEM;
> +	dbi = find_first_zero_bit(udev->data_bitmap, udev->dbi_thresh);
> +	if (dbi == udev->dbi_thresh)
> +		return false;
> +
> +	page = radix_tree_lookup(&udev->data_blocks, dbi);
> +	if (!page) {
> +		/* try to get new page from the mm */
> +		spin_lock_irq(&g_lock);
> +		if (global_db_count >= TCMU_GLOBAL_MAX_BLOCKS) {
> +			spin_unlock_irq(&g_lock);
> +			wake_up(&g_wait);
> +			return false;
> +		}
> +		global_db_count++;
> +		spin_unlock_irq(&g_lock);
> +
> +		page = alloc_page(GFP_ATOMIC);
> +		if (!page) {
> +			return false;
>   		}
>   
> -		ret = radix_tree_insert(&udev->data_blocks, dbi, p);
> +		ret = radix_tree_insert(&udev->data_blocks, dbi, page);
>   		if (ret) {
> -			kfree(p);
> -			clear_bit(dbi, udev->data_bitmap);
> -			return ret;
> +			__free_page(page);
> +			return false;
>   		}
>   	}
>   
> -	*addr = p;
> -	return dbi;
> +	if (dbi > udev->dbi_cur)
> +		udev->dbi_cur = dbi;
> +
> +	set_bit(dbi, udev->data_bitmap);
> +	tcmu_cmd_set_dbi(tcmu_cmd, dbi);
> +
> +	return true;
>   }
>   
> -static void *tcmu_db_get_block_addr(struct tcmu_dev *udev, uint32_t dbi)
> +static bool tcmu_db_get_empty_blocks(struct tcmu_dev *udev,
> +				     struct tcmu_cmd *tcmu_cmd)
>   {
> -	return radix_tree_lookup(&udev->data_blocks, dbi);
> -}
> +	int i;
>   
> -#define tcmu_cmd_reset_dbi_cur(cmd) ((cmd)->dbi_cur = 0)
> -#define tcmu_cmd_set_dbi(cmd, index) ((cmd)->dbi[(cmd)->dbi_cur++] = (index))
> -#define tcmu_cmd_get_dbi(cmd) ((cmd)->dbi[(cmd)->dbi_cur++])
> +	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
> +	for (i = 0; i < tcmu_cmd->dbi_len; i++) {
> +		if (!get_empty_growing_block(udev, tcmu_cmd))
> +			goto err;
> +	}
> +	return true;
>   
> -static void tcmu_cmd_free_data(struct tcmu_cmd *tcmu_cmd)
> +err:
> +	tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cur);
> +	udev->waiting_global = true;
> +	return false;
> +}
> +
> +static struct page *tcmu_db_get_block_page(struct tcmu_dev *udev, uint32_t dbi)
>   {
> -	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
> -	uint32_t bi;
> +	struct page *page;
>   
> -	for (bi = 0; bi < tcmu_cmd->dbi_len; bi++)
> -		clear_bit(tcmu_cmd->dbi[bi], udev->data_bitmap);
> +	page = radix_tree_lookup(&udev->data_blocks, dbi);
> +	if (!page)
> +		return NULL;
> +
> +	return page;
>   }
>   
>   static inline void tcmu_free_cmd(struct tcmu_cmd *tcmu_cmd)
> @@ -344,17 +400,20 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
>   	void *from, *to = NULL;
>   	size_t copy_bytes, to_offset, offset;
>   	struct scatterlist *sg;
> +	struct page *page;
>   
>   	for_each_sg(data_sg, sg, data_nents, i) {
>   		int sg_remaining = sg->length;
>   		from = kmap_atomic(sg_page(sg)) + sg->offset;
>   		while (sg_remaining > 0) {
>   			if (block_remaining == 0) {
> +				if (to)
> +					kunmap_atomic(to);
> +
>   				block_remaining = DATA_BLOCK_SIZE;
> -				dbi = tcmu_db_get_empty_block(udev, &to);
> -				if (dbi < 0)
> -					return dbi;
> -				tcmu_cmd_set_dbi(tcmu_cmd, dbi);
> +				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
> +				page = tcmu_db_get_block_page(udev, dbi);
> +				to = kmap_atomic(page);
>   			}
>   
>   			copy_bytes = min_t(size_t, sg_remaining,
> @@ -362,7 +421,7 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
>   			to_offset = get_block_offset_user(udev, dbi,
>   					block_remaining);
>   			offset = DATA_BLOCK_SIZE - block_remaining;
> -			to = (void *)(unsigned long)to + offset;
> +			to = (void *)((unsigned long)to + offset);
>   
>   			if (*iov_cnt != 0 &&
>   			    to_offset == iov_tail(udev, *iov)) {
> @@ -382,6 +441,8 @@ static int alloc_and_scatter_data_area(struct tcmu_dev *udev,
>   		}
>   		kunmap_atomic(from - sg->offset);
>   	}
> +	if (to)
> +		kunmap_atomic(to);
>   
>   	return 0;
>   }
> @@ -391,23 +452,28 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
>   {
>   	int i, dbi;
>   	int block_remaining = 0;
> -	void *from, *to;
> +	void *from = NULL, *to;
>   	size_t copy_bytes, offset;
>   	struct scatterlist *sg;
> +	struct page *page;
>   
>   	for_each_sg(data_sg, sg, data_nents, i) {
>   		int sg_remaining = sg->length;
>   		to = kmap_atomic(sg_page(sg)) + sg->offset;
>   		while (sg_remaining > 0) {
>   			if (block_remaining == 0) {
> +				if (from)
> +					kunmap_atomic(from);
> +
>   				block_remaining = DATA_BLOCK_SIZE;
>   				dbi = tcmu_cmd_get_dbi(tcmu_cmd);
> -				from = tcmu_db_get_block_addr(udev, dbi);
> +				page = tcmu_db_get_block_page(udev, dbi);
> +				from = kmap_atomic(page);
>   			}
>   			copy_bytes = min_t(size_t, sg_remaining,
>   					block_remaining);
>   			offset = DATA_BLOCK_SIZE - block_remaining;
> -			from = (void *)(unsigned long)from + offset;
> +			from = (void *)((unsigned long)from + offset);
>   			tcmu_flush_dcache_range(from, copy_bytes);
>   			memcpy(to + sg->length - sg_remaining, from,
>   					copy_bytes);
> @@ -417,12 +483,13 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *tcmu_cmd,
>   		}
>   		kunmap_atomic(to - sg->offset);
>   	}
> +	if (from)
> +		kunmap_atomic(from);
>   }
>   
> -static inline size_t spc_bitmap_free(unsigned long *bitmap)
> +static inline size_t spc_bitmap_free(unsigned long *bitmap, uint32_t thresh)
>   {
> -	return DATA_BLOCK_SIZE * (DATA_BLOCK_BITS -
> -			bitmap_weight(bitmap, DATA_BLOCK_BITS));
> +	return DATA_BLOCK_SIZE * (thresh - bitmap_weight(bitmap, thresh));
>   }
>   
>   /*
> @@ -431,12 +498,14 @@ static inline size_t spc_bitmap_free(unsigned long *bitmap)
>    *
>    * Called with ring lock held.
>    */
> -static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t data_needed)
> +static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
> +		size_t cmd_size, size_t data_needed)
>   {
>   	struct tcmu_mailbox *mb = udev->mb_addr;
>   	size_t space, cmd_needed;
>   	u32 cmd_head;
>   
> +	udev->waiting_global = false;
>   	tcmu_flush_dcache_range(mb, sizeof(*mb));
>   
>   	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
> @@ -457,10 +526,24 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>   		return false;
>   	}
>   
> -	space = spc_bitmap_free(udev->data_bitmap);
> +	/* try to check and get the data blocks as needed */
> +	space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
>   	if (space < data_needed) {
> -		pr_debug("no data space: only %zu available, but ask for %zu\n",
> -				space, data_needed);
> +		if (udev->unmapping) {
> +			pr_debug("no data space: only %zu available, but ask for %zu\n",
> +					space, data_needed);
> +			return false;
> +		} else {
> +			udev->dbi_thresh += udev->dbi_thresh / 2;
> +			udev->dbi_thresh = min((int)udev->dbi_thresh, DATA_BLOCK_BITS);
> +			space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
> +			if (space < data_needed)
> +				return false;
> +		}
> +	}
> +
> +	if (!tcmu_db_get_empty_blocks(udev, cmd)) {
> +		pr_debug("no data space: ask for %zu\n", data_needed);
>   		return false;
>   	}
>   
> @@ -519,7 +602,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>   		return TCM_INVALID_CDB_FIELD;
>   	}
>   
> -	while (!is_ring_space_avail(udev, command_size, data_length)) {
> +	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
>   		int ret;
>   		DEFINE_WAIT(__wait);
>   
> @@ -567,6 +650,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, size_t cmd_size, size_t d
>   	entry->hdr.uflags = 0;
>   
>   	/* Handle allocating space from the data area */
> +	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
>   	iov = &entry->req.iov[0];
>   	iov_cnt = 0;
>   	copy_to_data_area = (se_cmd->data_direction == DMA_TO_DEVICE
> @@ -664,7 +748,7 @@ static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *
>   	target_complete_cmd(cmd->se_cmd, entry->rsp.scsi_status);
>   
>   	cmd->se_cmd = NULL;
> -	tcmu_cmd_free_data(cmd);
> +	tcmu_cmd_free_data(cmd, cmd->dbi_len);
>   	tcmu_free_cmd(cmd);
>   }
>   
> @@ -783,41 +867,80 @@ static int tcmu_irqcontrol(struct uio_info *info, s32 irq_on)
>   
>   static void tcmu_db_release(struct tcmu_dev *udev, bool release_pending)
>   {
> -	uint32_t dbi, end;
> -	void *addr;
> +	int dbi = -1, end;
> +	struct page *page;
>   
>   	spin_lock_irq(&udev->cmdr_lock);
> -
>   	end = udev->dbi_cur + 1;
>   
> -	/* try to release all unused blocks */
> -	dbi = find_first_zero_bit(udev->data_bitmap, end);
> -	if (dbi >= end) {
> -		spin_unlock_irq(&udev->cmdr_lock);
> -		return;
> -	}
> +	/* try to release all unused but has mapped blocks */
>   	do {
> -		addr = radix_tree_delete(&udev->data_blocks, dbi);
> -		kfree(addr);
> -
>   		dbi = find_next_zero_bit(udev->data_bitmap, end, dbi + 1);
> -	} while (dbi < end);
> +		if (dbi == end)
> +			break;
>   
> -	if (!release_pending)
> -		return;
> +		/*
> +		 * When the bit is cleared and p != NULL, meaning that
> +		 * this tcmu block had already freed-after-use.
> +		 *
> +		 * If p->user == 0, meaning that the current ring buffer
> +		 * is the last or the only user of the tcmu block, and
> +		 * it must already in the free list, so it could be
> +		 * remove from the list and then released its memories.
> +		 *
> +		 * If p->user != 0, meaning that the current tcmu block is
> +		 * still referenced by other ring buffers, so just ignore
> +		 * it without doing anyting.
> +		 */
> +		page = radix_tree_delete(&udev->data_blocks, dbi);
> +		if (page) {
> +				__free_page(page);
> +				spin_lock_irq(&g_lock);
> +				global_db_count--;
> +				spin_unlock_irq(&g_lock);
> +		}
> +	} while (1);
>   
> -	/* try to release all pending blocks */
> -	dbi = find_first_bit(udev->data_bitmap, end);
> -	if (dbi >= end) {
> +	if (!release_pending) {
>   		spin_unlock_irq(&udev->cmdr_lock);
>   		return;
>   	}
> -	do {
> -		addr = radix_tree_delete(&udev->data_blocks, dbi);
> -		kfree(addr);
>   
> +	/* try to release all pending blocks */
> +	dbi = -1;
> +	do {
>   		dbi = find_next_bit(udev->data_bitmap, end, dbi + 1);
> -	} while (dbi < end);
> +		if (dbi == end)
> +			break;
> +
> +		clear_bit(dbi, udev->data_bitmap);
> +
> +		/*
> +		 * When the bit is set and p != NULL, meaning that this
> +		 * tcmu block is still being used here.
> +		 *
> +		 * If p->user == 0, meaning that the current ring buffer
> +		 * is the last or the only user of this tcmu block, and
> +		 * it won't in the free list, so could just release its
> +		 * memories.
> +		 *
> +		 * If the p->user != 0, we should insert it to the free
> +		 * list.
> +		 *
> +		 * p == NULL means that the current ring buffer is broken.
> +		 */
> +		page = radix_tree_delete(&udev->data_blocks, dbi);
> +		if (page) {
> +				__free_page(page);
> +				spin_lock_irq(&g_lock);
> +				global_db_count--;
> +				spin_unlock_irq(&g_lock);
> +		} else {
> +			pr_err("block page not found, ring is broken\n");
> +			set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags);
> +			break;
> +		}
> +	} while (1);
>   
>   	spin_unlock_irq(&udev->cmdr_lock);
>   }
> @@ -846,6 +969,43 @@ static int tcmu_find_mem_index(struct vm_area_struct *vma)
>   	return -1;
>   }
>   
> +/*
> + * Normally it shouldn't be here. This is just for avoid
> + * the page fault call trace, and will return zeroed page.
> + */
> +static struct page *tcmu_try_to_alloc_new_page(struct tcmu_dev *udev, uint32_t dbi)
> +{
> +	struct page *page;
> +	int ret;
> +
> +	if (dbi >= udev->dbi_thresh) {
> +		udev->dbi_thresh = dbi;
> +		udev->dbi_cur = dbi;
> +	}
> +
> +	page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
> +	if (!page) {
> +		return NULL;
> +	}
> +
> +	ret = radix_tree_insert(&udev->data_blocks, dbi, page);
> +	if (ret) {
> +		__free_page(page);
> +		return NULL;
> +	}
> +
> +	/*
> +	 * Since this case is rare in page fault routine, here we
> +	 * will allow the global_db_count >= TCMU_GLOBAL_MAX_BLOCKS
> +	 * to reduce possible page fault call trace.
> +	 */
> +	spin_lock_irq(&g_lock);
> +	global_db_count++;
> +	spin_unlock_irq(&g_lock);
> +
> +	return page;
> +}
> +
>   static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>   {
>   	struct tcmu_dev *udev = vma->vm_private_data;
> @@ -869,14 +1029,17 @@ static int tcmu_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>   		addr = (void *)(unsigned long)info->mem[mi].addr + offset;
>   		page = vmalloc_to_page(addr);
>   	} else {
> -		/* For the dynamically growing data area pages */
>   		uint32_t dbi;
>   
> +		/* For the dynamically growing data area pages */
>   		dbi = (offset - udev->data_off) / DATA_BLOCK_SIZE;
> -		addr = tcmu_db_get_block_addr(udev, dbi);
> -		if (!addr)
> +		spin_lock_irq(&udev->cmdr_lock);
> +		page = tcmu_db_get_block_page(udev, dbi);
> +		if (!page)
> +			page = tcmu_try_to_alloc_new_page(udev, dbi);
> +		spin_unlock_irq(&udev->cmdr_lock);
> +		if (!page)
>   			return VM_FAULT_NOPAGE;
> -		page = virt_to_page(addr);
>   	}
>   
>   	get_page(page);
> @@ -913,6 +1076,8 @@ static int tcmu_open(struct uio_info *info, struct inode *inode)
>   	if (test_and_set_bit(TCMU_DEV_BIT_OPEN, &udev->flags))
>   		return -EBUSY;
>   
> +	udev->inode = inode;
> +
>   	pr_debug("open\n");
>   
>   	return 0;
> @@ -1003,6 +1168,8 @@ static int tcmu_configure_device(struct se_device *dev)
>   	udev->cmdr_size = CMDR_SIZE - CMDR_OFF;
>   	udev->data_off = CMDR_SIZE;
>   	udev->data_size = DATA_SIZE;
> +	udev->dbi_thresh = DATA_BLOCK_BITS;
> +	udev->unmapping = false;
>   
>   	/* Initialise the mailbox of the ring buffer */
>   	mb = udev->mb_addr;
> @@ -1048,6 +1215,10 @@ static int tcmu_configure_device(struct se_device *dev)
>   	if (ret)
>   		goto err_netlink;
>   
> +	mutex_lock(&g_mutex);
> +	list_add(&udev->node, &root_udev);
> +	mutex_unlock(&g_mutex);
> +
>   	return 0;
>   
>   err_netlink:
> @@ -1072,6 +1243,10 @@ static void tcmu_free_device(struct se_device *dev)
>   {
>   	struct tcmu_dev *udev = TCMU_DEV(dev);
>   
> +	mutex_lock(&g_mutex);
> +	list_del(&udev->node);
> +	mutex_unlock(&g_mutex);
> +
>   	vfree(udev->mb_addr);
>   
>   	/* Upper layer should drain all requests before calling this */
> @@ -1235,12 +1410,90 @@ static sector_t tcmu_get_blocks(struct se_device *dev)
>   	.tb_dev_attrib_attrs	= passthrough_attrib_attrs,
>   };
>   
> +static struct task_struct *unmap_thread;
> +
> +/*
> + * The unmapping thread routine.
> + */
> +static int unmap_thread_fn(void *data)
> +{
> +	struct tcmu_dev *udev;
> +	loff_t offset;
> +	uint32_t start, end, dbi;
> +	struct page *page;
> +	bool unmapped;
> +	int i;
> +
> +	while (1) {
> +		DEFINE_WAIT(__wait);
> +
> +		prepare_to_wait(&g_wait, &__wait, TASK_INTERRUPTIBLE);
> +		schedule();
> +		finish_wait(&g_wait, &__wait);
> +
> +		unmapped = false;
> +		mutex_lock(&g_mutex);
> +		list_for_each_entry(udev, &root_udev, node) {
> +			spin_lock_irq(&udev->cmdr_lock);
> +			end = udev->dbi_cur + 1;
> +			dbi = find_last_bit(udev->data_bitmap, end);
> +			if (dbi == end) {
> +				/*
> +				 * Reserved for DATA_BLOCK_RES_BITS
> +				 * blocks for idle udev
> +				 */
> +				dbi = DATA_BLOCK_RES_BITS - 1;
> +				udev->dbi_cur = 0;
> +			} else {
> +				udev->dbi_cur = dbi;
> +			}
> +
> +			udev->dbi_thresh = start = dbi + 1;
> +			if (start >= end) {
> +				spin_unlock_irq(&udev->cmdr_lock);
> +				continue;
> +			}
> +			udev->unmapping = true;
> +			spin_unlock_irq(&udev->cmdr_lock);
> +
> +			/* Here will truncate the ring from offset */
> +			offset = udev->data_off + start * DATA_BLOCK_SIZE;
> +			unmap_mapping_range(udev->inode->i_mapping, offset, 0, 1);
> +			unmapped = true;
> +
> +			spin_lock_irq(&udev->cmdr_lock);
> +			for (i = start; i < end; i++) {
> +				page = radix_tree_delete(&udev->data_blocks, i);
> +				if (page) {
> +					__free_page(page);
> +					spin_lock_irq(&g_lock);
> +					global_db_count--;
> +					spin_unlock_irq(&g_lock);
> +				}
> +			}
> +			udev->unmapping = false;
> +			spin_unlock_irq(&udev->cmdr_lock);
> +		}
> +
> +		if (unmapped) {
> +			list_for_each_entry(udev, &root_udev, node)
> +				if (udev->waiting_global)
> +					wake_up(&udev->wait_cmdr);
> +		}
> +		mutex_unlock(&g_mutex);
> +	}
> +
> +	return 0;
> +}
> +
>   static int __init tcmu_module_init(void)
>   {
>   	int ret;
>   
>   	BUILD_BUG_ON((sizeof(struct tcmu_cmd_entry) % TCMU_OP_ALIGN_SIZE) != 0);
>   
> +	spin_lock_init(&g_lock);
> +
>   	tcmu_cmd_cache = kmem_cache_create("tcmu_cmd_cache",
>   				sizeof(struct tcmu_cmd),
>   				__alignof__(struct tcmu_cmd),
> @@ -1263,8 +1516,17 @@ static int __init tcmu_module_init(void)
>   	if (ret)
>   		goto out_unreg_genl;
>   
> +	init_waitqueue_head(&g_wait);
> +	unmap_thread = kthread_run(unmap_thread_fn, NULL, "tcmu_unmap");
> +	if (IS_ERR(unmap_thread)) {
> +		unmap_thread = NULL;
> +		goto out_unreg_transport;
> +	}
> +
>   	return 0;
>   
> +out_unreg_transport:
> +	target_backend_unregister(&tcmu_ops);
>   out_unreg_genl:
>   	genl_unregister_family(&tcmu_genl_family);
>   out_unreg_device:
> @@ -1277,6 +1539,9 @@ static int __init tcmu_module_init(void)
>   
>   static void __exit tcmu_module_exit(void)
>   {
> +	if (unmap_thread)
> +		kthread_stop(unmap_thread);
> +
>   	target_backend_unregister(&tcmu_ops);
>   	genl_unregister_family(&tcmu_genl_family);
>   	root_device_unregister(tcmu_root_device);
> -- 1.8.3.1
>


[-- Attachment #2: Type: text/html, Size: 23248 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/5] target/user: Add global data block pool support
  2017-03-17  8:04         ` Xiubo Li
  (?)
@ 2017-03-17 17:11         ` Andy Grover
  2017-03-17 22:06           ` 李秀波
  -1 siblings, 1 reply; 15+ messages in thread
From: Andy Grover @ 2017-03-17 17:11 UTC (permalink / raw)
  To: Xiubo Li, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, linux-mm

On 03/17/2017 01:04 AM, Xiubo Li wrote:
> [...]
>> These days what I have gotten is that the unmap_mapping_range() could
>> be used.
>> At the same time I have deep into the mm code and fixed the double
>> usage of
>> the data blocks and possible page fault call trace bugs mentioned above.
>>
>> Following is the V3 patch. I have test this using 4 targets & fio for
>> about 2 days, so
>> far so good.
>>
>> I'm still testing this using more complex test case.
>>
> I have test it the whole day today:
> - using 4 targets
> - setting TCMU_GLOBAL_MAX_BLOCKS = [512 1K 1M 1G 2G]
> - each target here needs more than 450 blocks when running
> - fio: -iodepth [1 2 4 8 16] -thread -rw=[read write] -bs=[1K 2K 3K 5K
> 7K 16K 64K 1M] -size=20G -numjobs=10 -runtime=1000  ...

Hi Xiubo,

V3 is sounding very good. I look forward to reviewing it after it is posted.

Thanks -- Regards -- Andy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re:  [PATCHv2 2/5] target/user: Add global data block pool support
  2017-03-17 17:11         ` Andy Grover
@ 2017-03-17 22:06           ` 李秀波
  0 siblings, 0 replies; 15+ messages in thread
From: 李秀波 @ 2017-03-17 22:06 UTC (permalink / raw)
  To: AndyGrover, nab, mchristi
  Cc: shli, sheng, linux-scsi, target-devel, namei.unix, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1082 bytes --]



AndyGrover <agrover@redhat.com>写到:
>On 03/17/2017 01:04 AM, Xiubo Li wrote:
>> [...]
>>> These days what I have gotten is that the unmap_mapping_range()
>could
>>> be used.
>>> At the same time I have deep into the mm code and fixed the double
>>> usage of
>>> the data blocks and possible page fault call trace bugs mentioned
>above.
>>>
>>> Following is the V3 patch. I have test this using 4 targets & fio
>for
>>> about 2 days, so
>>> far so good.
>>>
>>> I'm still testing this using more complex test case.
>>>
>> I have test it the whole day today:
>> - using 4 targets
>> - setting TCMU_GLOBAL_MAX_BLOCKS = [512 1K 1M 1G 2G]
>> - each target here needs more than 450 blocks when running
>> - fio: -iodepth [1 2 4 8 16] -thread -rw=[read write] -bs=[1K 2K 3K
>5K
>> 7K 16K 64K 1M] -size=20G -numjobs=10 -runtime=1000  ...
>
>Hi Xiubo,
>
>V3 is sounding very good. I look forward to reviewing it after it is
>posted.
>

Yes, I will post it later after more test and checking.

Thanks,

BRs
Xiubo


>Thanks -- Regards -- Andy

[-- Attachment #2: Type: text/html, Size: 1168 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-03-17 22:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-08  8:45 [PATCHv2 0/5] Dynamic growing data area support lixiubo
2017-03-08  8:45 ` [PATCHv2 1/5] target/user: Add dynamic growing data area feature support lixiubo
2017-03-08  8:45 ` [PATCHv2 2/5] target/user: Add global data block pool support lixiubo
2017-03-08 20:20   ` Andy Grover
2017-03-16  9:39     ` Xiubo Li
2017-03-17  8:04       ` Xiubo Li
2017-03-17  8:04         ` Xiubo Li
2017-03-17 17:11         ` Andy Grover
2017-03-17 22:06           ` 李秀波
2017-03-08  8:45 ` [PATCHv2 3/5] target/user: Fix possible overwrite of t_data_sg's last iov[] lixiubo
2017-03-16 18:23   ` Bryant G. Ly
2017-03-08  8:45 ` [PATCHv2 4/5] target/user: Fix wrongly calculating of the base_command_size lixiubo
2017-03-17  5:45   ` Xiubo Li
2017-03-08  8:45 ` [PATCHv2 5/5] target/user: Clean up tcmu_queue_cmd_ring lixiubo
2017-03-16 20:50   ` Bryant G. Ly

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.